%0 Journal Article %J Ethics and Information Technology %D 2023 %T Generative AI models should include detection mechanisms as a condition for public release %A Knott, Alistair %A Pedreschi, Dino %A Chatila, Raja %A Chakraborti, Tapabrata %A Leavy, Susan %A Baeza-Yates, Ricardo %A Eyers, David %A Trotman, Andrew %A Teal, Paul D. %A Biecek, Przemyslaw %A Russell, Stuart %A Bengio, Yoshua %X The new wave of ‘foundation models’—general-purpose generative AI models, for production of text (e.g., ChatGPT) or images (e.g., MidJourney)—represent a dramatic advance in the state of the art for AI. But their use also introduces a range of new risks, which has prompted an ongoing conversation about possible regulatory mechanisms. Here we propose a specific principle that should be incorporated into legislation: that any organization developing a foundation model intended for public use must demonstrate a reliable detection mechanism for the content it generates, as a condition of its public release. The detection mechanism should be made publicly available in a tool that allows users to query, for an arbitrary item of content, whether the item was generated (wholly or partly) by the model. In this paper, we argue that this requirement is technically feasible and would play an important role in reducing certain risks from new AI models in many domains. We also outline a number of options for the tool’s design, and summarize a number of points where further input from policymakers and researchers would be required. %B Ethics and Information Technology %V 25 %8 Jan-12-2023 %G eng %U https://link.springer.com/article/10.1007/s10676-023-09728-4?utm_source=rct_congratemailt&utm_medium=email&utm_campaign=oa_20231028&utm_content=10.1007/s10676-023-09728-4 %!
Ethics Inf Technol %R 10.1007/s10676-023-09728-4 %0 Journal Article %J Scientific Data %D 2022 %T The long-tail effect of the COVID-19 lockdown on Italians’ quality of life, sleep and physical activity %A Michela Natilli %A Alessio Rossi %A Trecroci, Athos %A Cavaggioni, Luca %A Merati, Giampiero %A Formenti, Damiano %X From March 2020 to May 2021, several lockdown periods caused by the COVID-19 pandemic have limited people’s usual activities and mobility in Italy, as well as around the world. These unprecedented confinement measures dramatically modified citizens’ daily lifestyles and behaviours. However, with the advent of summer 2021 and thanks to the vaccination campaign that significantly prevents serious illness and death, and reduces the risk of contagion, all the Italian regions finally returned to regular behaviours and routines. Anyhow, it is unclear if there is a long-tail effect on people’s quality of life, sleep- and physical activity-related behaviours. Thanks to the dataset described in this paper, it will be possible to obtain accurate insights of the changes induced by the lockdown period in the Italians’ health that will permit to provide practical suggestions at local, regional, and state institutions and companies to improve infrastructures and services that could be beneficial to Italians’ well being. 
%B Scientific Data %V 9 %P 1–10 %G eng %U https://www.nature.com/articles/s41597-022-01376-5 %0 Conference Paper %B HHAI 2022: Augmenting Human Intellect - Proceedings of the First International Conference on Hybrid Human-Artificial Intelligence, Amsterdam, The Netherlands, 13-17 June 2022 %D 2022 %T Monitoring Fairness in HOLDA %A Michele Fontana %A Francesca Naretto %A Anna Monreale %A Fosca Giannotti %E Stefan Schlobach %E María Pérez-Ortiz %E Myrthe Tielman %B HHAI 2022: Augmenting Human Intellect - Proceedings of the First International Conference on Hybrid Human-Artificial Intelligence, Amsterdam, The Netherlands, 13-17 June 2022 %I IOS Press %G eng %U https://doi.org/10.3233/FAIA220205 %R 10.3233/FAIA220205 %0 Conference Proceedings %B 30th Italian Symposium on Advanced Database Systems (SEBD – Sistemi Evoluti per Basi di Dati) %D 2022 %T SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics. %A Trasarti, Roberto %A Grossi, Valerio %A Michela Natilli %A Rapisarda, Beatrice %X SoBigData RI has the ambition to support the rising demand for cross-disciplinary research and innovation on the multiple aspects of social complexity from combined data and model-driven perspectives and the increasing importance of ethics and data scientists’ responsibility as pillars of trustworthy use of Big Data and analytical technology. Digital traces of human activities offer a considerable opportunity to scrutinize the ground truth of individual and collective behaviour at an unprecedented detail and on a global scale. This increasing wealth of data is a chance to understand social complexity, provided we can rely on social mining, i.e., adequate means for accessing big social data and models for extracting knowledge from them. 
SoBigData RI, with its tools and services, empowers researchers and innovators through a platform for the design and execution of large-scale social mining experiments, open to users with diverse backgrounds, accessible on the cloud (aligned with EOSC), and also exploiting supercomputing facilities. Pushing the FAIR (Findable, Accessible, Interoperable) and FACT (Fair, Accountable, Confidential, and Transparent) principles will render social mining experiments more efficiently designed, adjusted, and repeatable by domain experts that are not data scientists. SoBigData RI moves forward from the simple awareness of ethical and legal challenges in social mining to the development of concrete tools that operationalize ethics with value-sensitive design, incorporating values and norms for privacy protection, fairness, transparency, and pluralism. SoBigData RI is the result of two H2020 grants (g.a. n.654024 and 871042), and it is part of the ESFRI 2021 Roadmap. %B 30th Italian Symposium on Advanced Database Systems (SEBD – Sistemi Evoluti per Basi di Dati) %C Tirrenia, Pisa %G eng %0 Journal Article %J Data Mining and Knowledge Discovery %D 2022 %T Stable and actionable explanations of black-box models through factual and counterfactual rules %A Guidotti, Riccardo %A Monreale, Anna %A Ruggieri, Salvatore %A Naretto, Francesca %A Turini, Franco %A Pedreschi, Dino %A Giannotti, Fosca %B Data Mining and Knowledge Discovery %G eng %0 Journal Article %D 2021 %T Give more data, awareness and control to individual citizens, and they will help COVID-19 containment %A Mirco Nanni %A Andrienko, Gennady %A Barabasi, Albert-Laszlo %A Boldrini, Chiara %A Bonchi, Francesco %A Cattuto, Ciro %A Chiaromonte, Francesca %A Comandé, Giovanni %A Conti, Marco %A Coté, Mark %A Dignum, Frank %A Dignum, Virginia %A Domingo-Ferrer, Josep %A Ferragina, Paolo %A Fosca Giannotti %A Riccardo Guidotti %A Helbing, Dirk %A Kaski, Kimmo %A Kertész, János %A Lehmann, Sune %A Lepri, Bruno %A Lukowicz, 
Paul %A Matwin, Stan %A Jiménez, David Megías %A Anna Monreale %A Morik, Katharina %A Oliver, Nuria %A Passarella, Andrea %A Passerini, Andrea %A Dino Pedreschi %A Pentland, Alex %A Pianesi, Fabio %A Francesca Pratesi %A S Rinzivillo %A Salvatore Ruggieri %A Siebes, Arno %A Torra, Vicenc %A Roberto Trasarti %A Hoven, Jeroen van den %A Vespignani, Alessandro %X The rapid dynamics of COVID-19 calls for quick and effective tracking of virus transmission chains and early detection of outbreaks, especially in the “phase 2” of the pandemic, when lockdown and other restriction measures are progressively withdrawn, in order to avoid or minimize contagion resurgence. For this purpose, contact-tracing apps are being proposed for large scale adoption by many countries. A centralized approach, where data sensed by the app are all sent to a nation-wide server, raises concerns about citizens’ privacy and needlessly strong digital surveillance, thus alerting us to the need to minimize personal data collection and avoiding location tracking. We advocate the conceptual advantage of a decentralized approach, where both contact and location data are collected exclusively in individual citizens’ “personal data stores”, to be shared separately and selectively (e.g., with a backend system, but possibly also with other citizens), voluntarily, only when the citizen has tested positive for COVID-19, and with a privacy preserving level of granularity. This approach better protects the personal sphere of citizens and affords multiple benefits: it allows for detailed information gathering for infected people in a privacy-preserving fashion; and, in turn this enables both contact tracing, and, the early detection of outbreak hotspots on more finely-granulated geographic scale. The decentralized approach is also scalable to large populations, in that only the data of positive patients need be handled at a central level. Our recommendation is two-fold. 
First to extend existing decentralized architectures with a light touch, in order to manage the collection of location data locally on the device, and allow the user to share spatio-temporal aggregates—if and when they want and for specific aims—with health authorities, for instance. Second, we favour a longer-term pursuit of realizing a Personal Data Store vision, giving users the opportunity to contribute to collective good in the measure they want, enhancing self-awareness, and cultivating collective efforts for rebuilding society. %8 2021/02/02 %@ 1572-8439 %G eng %U https://link.springer.com/article/10.1007/s10676-020-09572-w %! Ethics and Information Technology %R https://doi.org/10.1007/s10676-020-09572-w %0 Journal Article %D 2021 %T GLocalX - From Local to Global Explanations of Black Box AI Models %A Mattia Setzu %A Riccardo Guidotti %A Anna Monreale %A Franco Turini %A Dino Pedreschi %A Fosca Giannotti %X Artificial Intelligence (AI) has come to prominence as one of the major components of our society, with applications in most aspects of our lives. In this field, complex and highly nonlinear machine learning models such as ensemble models, deep neural networks, and Support Vector Machines have consistently shown remarkable accuracy in solving complex tasks. Although accurate, AI models often are “black boxes” which we are not able to understand. Relying on these models has a multifaceted impact and raises significant concerns about their transparency. Applications in sensitive and critical domains are a strong motivational factor in trying to understand the behavior of black boxes. We propose to address this issue by providing an interpretable layer on top of black box models by aggregating “local” explanations. We present GLocalX, a “local-first” model agnostic explanation method. Starting from local explanations expressed in form of local decision rules, GLocalX iteratively generalizes them into global explanations by hierarchically aggregating them. 
Our goal is to learn accurate yet simple interpretable models to emulate the given black box, and, if possible, replace it entirely. We validate GLocalX in a set of experiments in standard and constrained settings with limited or no access to either data or local explanations. Experiments show that GLocalX is able to accurately emulate several models with simple and small models, reaching state-of-the-art performance against natively global solutions. Our findings show how it is often possible to achieve a high level of both accuracy and comprehensibility of classification models, even in complex domains with high-dimensional data, without necessarily trading one property for the other. This is a key requirement for a trustworthy AI, necessary for adoption in high-stakes decision making applications. %V 294 %P 103457 %8 2021/05/01/ %@ 0004-3702 %G eng %U https://www.sciencedirect.com/science/article/pii/S0004370221000084 %! Artificial Intelligence %R https://doi.org/10.1016/j.artint.2021.103457 %0 Conference Paper %B Discovery Science - 24th International Conference, DS 2021, Halifax, NS, Canada, October 11-13, 2021, Proceedings %D 2021 %T Privacy Risk Assessment of Individual Psychometric Profiles %A Giacomo Mariani %A Anna Monreale %A Francesca Naretto %E Carlos Soares %E Luís Torgo %B Discovery Science - 24th International Conference, DS 2021, Halifax, NS, Canada, October 11-13, 2021, Proceedings %I Springer %G eng %U https://doi.org/10.1007/978-3-030-88942-5_32 %R 10.1007/978-3-030-88942-5_32 %0 Journal Article %J Health Policy %D 2021 %T Understanding eating choices among university students: A study using data from cafeteria cashiers’ transactions %A Lorenzoni, Valentina %A Triulzi, Isotta %A Martinucci, Irene %A Toncelli, Letizia %A Michela Natilli %A Barale, Roberto %A Turchetti, Giuseppe %B Health Policy %V 125 %P 665–673 %G eng %0 Journal Article %J Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery %D 2020 %T Bias in data-driven 
artificial intelligence systems—An introductory survey %A Ntoutsi, Eirini %A Fafalios, Pavlos %A Gadiraju, Ujwal %A Iosifidis, Vasileios %A Nejdl, Wolfgang %A Vidal, Maria-Esther %A Salvatore Ruggieri %A Franco Turini %A Papadopoulos, Symeon %A Krasanakis, Emmanouil %A others %X Artificial Intelligence (AI)‐based systems are widely employed nowadays to make decisions that have far‐reaching impact on individuals and society. Their decisions might affect everyone, everywhere, and anytime, entailing concerns about potential human rights issues. Therefore, it is necessary to move beyond traditional AI algorithms optimized for predictive performance and embed ethical and legal principles in their design, training, and deployment to ensure social good while still benefiting from the huge potential of the AI technology. The goal of this survey is to provide a broad multidisciplinary overview of the area of bias in AI systems, focusing on technical challenges and solutions as well as to suggest new research directions towards approaches well‐grounded in a legal frame. In this survey, we focus on data‐driven AI, as a large part of AI is powered nowadays by (big) data and powerful machine learning algorithms. If otherwise not specified, we use the general term bias to describe problems related to the gathering or processing of data that might result in prejudiced decisions on the bases of demographic features such as race, sex, and so forth. 
%B Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery %V 10 %P e1356 %G eng %U https://onlinelibrary.wiley.com/doi/full/10.1002/widm.1356 %R https://doi.org/10.1002/widm.1356 %0 Conference Paper %B Discovery Science %D 2020 %T Explaining Sentiment Classification with Synthetic Exemplars and Counter-Exemplars %A Lampridis, Orestis %A Riccardo Guidotti %A Salvatore Ruggieri %E Appice, Annalisa %E Tsoumakas, Grigorios %E Manolopoulos, Yannis %E Matwin, Stan %X We present xspells, a model-agnostic local approach for explaining the decisions of a black box model for sentiment classification of short texts. The explanations provided consist of a set of exemplar sentences and a set of counter-exemplar sentences. The former are examples classified by the black box with the same label as the text to explain. The latter are examples classified with a different label (a form of counter-factuals). Both are close in meaning to the text to explain, and both are meaningful sentences – albeit they are synthetically generated. xspells generates neighbors of the text to explain in a latent space using Variational Autoencoders for encoding text and decoding latent instances. A decision tree is learned from randomly generated neighbors, and used to drive the selection of the exemplars and counter-exemplars. We report experiments on two datasets showing that xspells outperforms the well-known lime method in terms of quality of explanations, fidelity, and usefulness, and that is comparable to it in terms of stability. 
%B Discovery Science %I Springer International Publishing %C Cham %8 2020// %@ 978-3-030-61527-7 %G eng %U https://link.springer.com/chapter/10.1007/978-3-030-61527-7_24 %R https://doi.org/10.1007/978-3-030-61527-7_24 %0 Conference Paper %B Machine Learning and Knowledge Discovery in Databases %D 2020 %T Global Explanations with Local Scoring %A Mattia Setzu %A Riccardo Guidotti %A Anna Monreale %A Franco Turini %E Cellier, Peggy %E Driessens, Kurt %X Artificial Intelligence systems often adopt machine learning models encoding complex algorithms with potentially unknown behavior. As the application of these “black box” models grows, it is our responsibility to understand their inner working and formulate them in human-understandable explanations. To this end, we propose a rule-based model-agnostic explanation method that follows a local-to-global schema: it generalizes a global explanation summarizing the decision logic of a black box starting from the local explanations of single predicted instances. We define a scoring system based on a rule relevance score to extract global explanations from a set of local explanations in the form of decision rules. Experiments on several datasets and black boxes show the stability, and low complexity of the global explanations provided by the proposed solution in comparison with baselines and state-of-the-art global explainers. 
%B Machine Learning and Knowledge Discovery in Databases %I Springer International Publishing %C Cham %8 2020// %@ 978-3-030-43823-4 %G eng %U https://link.springer.com/chapter/10.1007%2F978-3-030-43823-4_14 %R https://doi.org/10.1007/978-3-030-43823-4_14 %0 Generic %D 2020 %T Mobile phone data analytics against the COVID-19 epidemics in Italy: flow diversity and local job markets during the national lockdown %A Pietro Bonato %A Paolo Cintia %A Francesco Fabbri %A Daniele Fadda %A Fosca Giannotti %A Pier Luigi Lopalco %A Sara Mazzilli %A Mirco Nanni %A Luca Pappalardo %A Dino Pedreschi %A Francesco Penone %A S Rinzivillo %A Giulio Rossetti %A Marcello Savarese %A Lara Tavoschi %X Understanding collective mobility patterns is crucial to plan the restart of production and economic activities, which are currently put in stand-by to fight the diffusion of the epidemics. In this report, we use mobile phone data to infer the movements of people between Italian provinces and municipalities, and we analyze the incoming, outgoing and internal mobility flows before and during the national lockdown (March 9th, 2020) and after the closure of non-necessary productive and economic activities (March 23rd, 2020). The population flows across provinces and municipalities enable the modelling of a risk index tailored for the mobility of each municipality or province. Such an index would be a useful indicator to drive counter-measures in reaction to a sudden reactivation of the epidemics. Mobile phone data, even when aggregated to preserve the privacy of individuals, are a useful data source to track the evolution in time of human mobility, hence allowing for monitoring the effectiveness of control measures such as physical distancing. We address the following analytical questions: How does the mobility structure of a territory change? Do incoming and outgoing flows become more predictable during the lockdown, and what are the differences between weekdays and weekends?
Can we detect proper local job markets based on human mobility flows, to eventually shape the borders of a local outbreak? %G eng %U https://arxiv.org/abs/2004.11278 %R https://dx.doi.org/10.32079/ISTI-TR-2020/005 %0 Conference Paper %B International Conference on Complex Networks and Their Applications %D 2020 %T Opinion Dynamic Modeling of Fake News Perception %A Toccaceli, Cecilia %A Letizia Milli %A Giulio Rossetti %X Fake news diffusion represents one of the most pressing issues of our online society. In recent years, fake news has been analyzed from several points of view, primarily to improve our ability to separate them from the legit ones as well as identify their sources. Among such vast literature, a rarely discussed theme is likely to play uttermost importance in our understanding of such a controversial phenomenon: the analysis of fake news’ perception. In this work, we approach such a problem by proposing a family of opinion dynamic models tailored to study how specific social interaction patterns concur to the acceptance, or refusal, of fake news by a population of interacting individuals. To discuss the peculiarities of the proposed models, we tested them on several synthetic network topologies, thus underlying when/how they affect the stable states reached by the performed simulations. %B International Conference on Complex Networks and Their Applications %I Springer %G eng %U https://link.springer.com/chapter/10.1007/978-3-030-65347-7_31 %R https://doi.org/10.1007/978-3-030-65347-7_31 %0 Conference Paper %B Discovery Science %D 2020 %T Predicting and Explaining Privacy Risk Exposure in Mobility Data %A Francesca Naretto %A Roberto Pellungrini %A Anna Monreale %A Nardini, Franco Maria %A Musolesi, Mirco %E Appice, Annalisa %E Tsoumakas, Grigorios %E Manolopoulos, Yannis %E Matwin, Stan %X Mobility data is a proxy of different social dynamics and its analysis enables a wide range of user services. 
Unfortunately, mobility data are very sensitive because the sharing of people’s whereabouts may raise serious privacy concerns. Existing frameworks for privacy risk assessment provide tools to identify and measure privacy risks, but they often (i) have high computational complexity; and (ii) are not able to provide users with a justification of the reported risks. In this paper, we propose expert, a new framework for the prediction and explanation of privacy risk on mobility data. We empirically evaluate privacy risk on real data, simulating a privacy attack with a state-of-the-art privacy risk assessment framework. We then extract individual mobility profiles from the data for predicting their risk. We compare the performance of several machine learning algorithms in order to identify the best approach for our task. Finally, we show how it is possible to explain privacy risk prediction on real data, using two algorithms: Shap, a feature importance-based method and Lore, a rule-based method. Overall, expert is able to provide a user with the privacy risk and an explanation of the risk itself. The experiments show excellent performance for the prediction task.
%B Discovery Science %I Springer International Publishing %C Cham %8 2020// %@ 978-3-030-61527-7 %G eng %U https://link.springer.com/chapter/10.1007/978-3-030-61527-7_27 %R https://doi.org/10.1007/978-3-030-61527-7_27 %0 Journal Article %J arXiv preprint arXiv:2006.03141 %D 2020 %T The relationship between human mobility and viral transmissibility during the COVID-19 epidemics in Italy %A Paolo Cintia %A Daniele Fadda %A Fosca Giannotti %A Luca Pappalardo %A Giulio Rossetti %A Dino Pedreschi %A S Rinzivillo %A Bonato, Pietro %A Fabbri, Francesco %A Penone, Francesco %A Savarese, Marcello %A Checchi, Daniele %A Chiaromonte, Francesca %A Vineis, Paolo %A Guzzetta, Giorgio %A Riccardo, Flavia %A Marziano, Valentina %A Poletti, Piero %A Trentini, Filippo %A Bella, Antonio %A Andrianou, Xanthi %A Del Manso, Martina %A Fabiani, Massimo %A Bellino, Stefania %A Boros, Stefano %A Mateo Urdiales, Alberto %A Vescio, Maria Fenicia %A Brusaferro, Silvio %A Rezza, Giovanni %A Pezzotti, Patrizio %A Ajelli, Marco %A Merler, Stefano %X We describe in this report our studies to understand the relationship between human mobility and the spreading of COVID-19, as an aid to manage the restart of the social and economic activities after the lockdown and monitor the epidemics in the coming weeks and months. We compare the evolution (from January to May 2020) of the daily mobility flows in Italy, measured by means of nation-wide mobile phone data, and the evolution of transmissibility, measured by the net reproduction number, i.e., the mean number of secondary infections generated by one primary infector in the presence of control interventions and human behavioural adaptations. We find a striking relationship between the negative variation of mobility flows and the net reproduction number, in all Italian regions, between March 11th and March 18th, when the country entered the lockdown.
This observation allows us to quantify the time needed to "switch off" the country mobility (one week) and the time required to bring the net reproduction number below 1 (one week). A reasonably simple regression model provides evidence that the net reproduction number is correlated with a region's incoming, outgoing and internal mobility. We also find a strong relationship between the number of days above the epidemic threshold before the mobility flows reduce significantly as an effect of lockdowns, and the total number of confirmed SARS-CoV-2 infections per 100k inhabitants, thus indirectly showing the effectiveness of the lockdown and the other non-pharmaceutical interventions in the containment of the contagion. Our study demonstrates the value of "big" mobility data to the monitoring of key epidemic indicators to inform choices as the epidemics unfolds in the coming months. %B arXiv preprint arXiv:2006.03141 %G eng %U https://arxiv.org/abs/2006.03141 %0 Journal Article %J Frontiers in Psychology %D 2019 %T Do “girls just wanna have fun”? Participation trends and motivational profiles of women in Norway’s ultimate mass participation ski event %A Calogiuri, Giovanna %A Johansen, Patrick Foss %A Alessio Rossi %A Thurston, Miranda %X Mass participation sporting events (MPSEs) are viewed as encouraging regular exercise in the population, but concerns have been expressed about the extent to which they are inclusive for women. This study focuses on an iconic cross-country skiing MPSE in Norway, the Birkebeiner race (BR), which includes different variants (main, Friday, half-distance, and women-only races). 
In order to shed light on women’s participation in this specific MPSE, as well as add to the understanding of women’s MPSEs participation in general, this study was set up to: (i) analyze trends in women’s participation, (ii) examine the characteristics, and (iii) identify key factors characterizing the motivational profile of women in different BR races, with emphasis on the full-distance vs. the women-only races. Entries in the different races throughout the period 1996–2018 were analyzed using an autoregressive model. Information on women’s sociodemographic characteristics, sport and exercise participation, and a range of psychological variables (motives, perceptions, overall satisfaction, and future participation intention) were extracted from a market survey and analyzed using a machine learning (ML) approach (n = 1,149). Additionally, qualitative information generated through open-ended questions was analyzed thematically (n = 116). The relative prevalence of women in the main BR was generally low (< 20%). While the other variants contributed to boosting women’s participation in the overall event, a future increment of women in the main BR was predicted, with women’s ratings possibly matching the men’s by the year 2034. Across all races, most of the women were physically active, of medium-high income, and living in the most urbanized region of Norway. Satisfaction and future participation intention were relatively high, especially among the participants in the women-only races. “Exercise goal” was the predominant participation motive. The participants in women-only races assigned greater importance to social aspects, and perceived the race as a tradition, whereas those in the full-distance races were younger and gave more importance to performance aspects. These findings corroborate known trends and challenges in MPSE participation, but also contribute to greater understanding in this under-researched field. 
Further research is needed in order to gain more knowledge on how to foster women’s participation in MPSEs. %B Frontiers in Psychology %V 10 %P 2548 %G eng %U https://www.frontiersin.org/articles/10.3389/fpsyg.2019.02548/full %R 10.3389/fpsyg.2019.02548 %0 Journal Article %J IEEE Intelligent Systems %D 2019 %T Factual and Counterfactual Explanations for Black Box Decision Making %A Riccardo Guidotti %A Anna Monreale %A Fosca Giannotti %A Dino Pedreschi %A Salvatore Ruggieri %A Franco Turini %X The rise of sophisticated machine learning models has brought accurate but obscure decision systems, which hide their logic, thus undermining transparency, trust, and the adoption of artificial intelligence (AI) in socially sensitive and safety-critical contexts. We introduce a local rule-based explanation method, providing faithful explanations of the decision made by a black box classifier on a specific instance. The proposed method first learns an interpretable, local classifier on a synthetic neighborhood of the instance under investigation, generated by a genetic algorithm. Then, it derives from the interpretable classifier an explanation consisting of a decision rule, explaining the factual reasons of the decision, and a set of counterfactuals, suggesting the changes in the instance features that would lead to a different outcome. Experimental results show that the proposed method outperforms existing approaches in terms of the quality of the explanations and of the accuracy in mimicking the black box. 
%B IEEE Intelligent Systems %G eng %U https://ieeexplore.ieee.org/abstract/document/8920138 %R 10.1109/MIS.2019.2957223 %0 Conference Paper %B Proceedings of the AAAI Conference on Artificial Intelligence %D 2019 %T Meaningful explanations of Black Box AI decision systems %A Dino Pedreschi %A Fosca Giannotti %A Riccardo Guidotti %A Anna Monreale %A Salvatore Ruggieri %A Franco Turini %X Black box AI systems for automated decision making, often based on machine learning over (big) data, map a user’s features into a class or a score without exposing the reasons why. This is problematic not only for lack of transparency, but also for possible biases inherited by the algorithms from human prejudices and collection artifacts hidden in the training data, which may lead to unfair or wrong decisions. We focus on the urgent open challenge of how to construct meaningful explanations of opaque AI/ML systems, introducing the local-to-global framework for black box explanation, articulated along three lines: (i) the language for expressing explanations in terms of logic rules, with statistical and causal interpretation; (ii) the inference of local explanations for revealing the decision rationale for a specific case, by auditing the black box in the vicinity of the target instance; (iii) the bottom-up generalization of many local explanations into simple global ones, with algorithms that optimize for quality and comprehensibility. We argue that the local-first approach opens the door to a wide variety of alternative solutions along different dimensions: a variety of data sources (relational, text, images, etc.), a variety of learning problems (multi-label classification, regression, scoring, ranking), a variety of languages for expressing meaningful explanations, a variety of means to audit a black box.
%B Proceedings of the AAAI Conference on Artificial Intelligence %G eng %U https://aaai.org/ojs/index.php/AAAI/article/view/5050 %R 10.1609/aaai.v33i01.33019780 %0 Journal Article %J ERCIM News %D 2019 %T Transparency in Algorithmic Decision Making %A Andreas Rauber %A Roberto Trasarti %A Fosca Giannotti %B ERCIM News %G eng %U https://ercim-news.ercim.eu/en116/special/transparency-in-algorithmic-decision-making-introduction-to-the-special-theme %0 Journal Article %J BMC gastroenterology %D 2018 %T Gastroesophageal reflux symptoms among Italian university students: epidemiology and dietary correlates using automatically recorded transactions %A Martinucci, Irene %A Michela Natilli %A Lorenzoni, Valentina %A Luca Pappalardo %A Anna Monreale %A Turchetti, Giuseppe %A Dino Pedreschi %A Marchi, Santino %A Barale, Roberto %A de Bortoli, Nicola %X Background: Gastroesophageal reflux disease (GERD) is one of the most common gastrointestinal disorders worldwide, with relevant impact on the quality of life and health care costs. The aim of our study is to assess the prevalence of GERD based on self-reported symptoms among university students in central Italy. The secondary aim is to evaluate lifestyle correlates, particularly eating habits, in GERD students using automatically recorded transactions through cashiers at university canteen. Methods: A web-survey was created and launched through an app, ad-hoc developed for an interactive exchange of information with students, including anthropometric data and lifestyle habits. Moreover, the web-survey allowed users a self-diagnosis of GERD through a simple questionnaire. As regards eating habits, detailed collection of meals consumed, including number and type of dishes, were automatically recorded through cashiers at the university canteen equipped with an automatic registration system. Results: We collected 3012 questionnaires.
A total of 792 students (26.2% of the respondents) reported typical GERD symptoms occurring at least weekly. Female sex was more prevalent than male sex. In the set of students with GERD, the percentage of smokers was higher, and our results showed that when BMI tends to higher values the percentage of students with GERD tends to increase. When evaluating correlates with diet, we found, among all users, a lower frequency of legumes choice in GERD students and, among frequent users, a lower frequency of choice of pasta and rice in GERD students. Discussion: The results of our study are in line with the values reported in the literature. Nowadays, GERD is a common problem in our communities, and can potentially lead to serious medical complications; the economic burden involved in the diagnostic and therapeutic management of the disease has a relevant impact on healthcare costs. Conclusions: To our knowledge, this is the first study evaluating the prevalence of typical GERD–related symptoms in a young population of University students in Italy. Considering the young age of enrolled subjects, our prevalence rate, relatively high compared to the usual estimates, could represent a further negative factor for the future economic sustainability of the healthcare system. Keywords: Gastroesophageal reflux disease, GERD, Heartburn, Regurgitation, Diet, Prevalence, University students %B BMC gastroenterology %V 18 %P 116 %G eng %U https://bmcgastroenterol.biomedcentral.com/articles/10.1186/s12876-018-0832-9 %R 10.1186/s12876-018-0832-9 %0 Book Section %B A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years %D 2018 %T How Data Mining and Machine Learning Evolved from Relational Data Base to Data Science %A Amato, G. %A Candela, L. %A Castelli, D. %A Esuli, A. %A Falchi, F. %A Gennaro, C. %A Fosca Giannotti %A Anna Monreale %A Mirco Nanni %A Pagano, P. %A Luca Pappalardo %A Dino Pedreschi %A Francesca Pratesi %A Rabitti, F. 
%A S Rinzivillo %A Giulio Rossetti %A Salvatore Ruggieri %A Sebastiani, F. %A Tesconi, M. %E Flesca, Sergio %E Greco, Sergio %E Masciari, Elio %E Saccà, Domenico %X During the last 35 years, data management principles such as physical and logical independence, declarative querying and cost-based optimization have led to profound pervasiveness of relational databases in any kind of organization. More importantly, these technical advances have enabled the first round of business intelligence applications and laid the foundation for managing and analyzing Big Data today. %B A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years %I Springer International Publishing %C Cham %P 287–306 %@ 978-3-319-61893-7 %G eng %U https://link.springer.com/chapter/10.1007%2F978-3-319-61893-7_17 %R 10.1007/978-3-319-61893-7_17 %0 Report %D 2018 %T Local Rule-Based Explanations of Black Box Decision Systems %A Riccardo Guidotti %A Anna Monreale %A Salvatore Ruggieri %A Dino Pedreschi %A Franco Turini %A Fosca Giannotti %B arXiv preprint arXiv:1805.10820 %G eng %0 Report %D 2018 %T Open the Black Box Data-Driven Explanation of Black Box Decision Systems %A Dino Pedreschi %A Fosca Giannotti %A Riccardo Guidotti %A Anna Monreale %A Luca Pappalardo %A Salvatore Ruggieri %A Franco Turini %B arXiv preprint arXiv:1806.09936 %G eng %0 Journal Article %J Transactions on Data Privacy %D 2018 %T PRUDEnce: a system for assessing privacy risk vs utility in data sharing ecosystems %A Francesca Pratesi %A Anna Monreale %A Roberto Trasarti %A Fosca Giannotti %A Dino Pedreschi %A Yanagihara, Tadashi %X Data describing human activities are an important source of knowledge useful for understanding individual and collective behavior and for developing a wide range of user services. Unfortunately, this kind of data is sensitive, because people’s whereabouts may allow re-identification of individuals in a de-identified database. 
Therefore, Data Providers, before sharing those data, must apply any sort of anonymization to lower the privacy risks, but they must be aware and capable of controlling also the data quality, since these two factors are often a trade-off. In this paper we propose PRUDEnce (Privacy Risk versus Utility in Data sharing Ecosystems), a system enabling a privacy-aware ecosystem for sharing personal data. It is based on a methodology for assessing both the empirical (not theoretical) privacy risk associated to users represented in the data, and the data quality guaranteed only with users not at risk. Our proposal is able to support the Data Provider in the exploration of a repertoire of possible data transformations with the aim of selecting one specific transformation that yields an adequate trade-off between data quality and privacy risk. We study the practical effectiveness of our proposal over three data formats underlying many services, defined on real mobility data, i.e., presence data, trajectory data and road segment data. %B Transactions on Data Privacy %V 11 %8 08/2018 %G eng %U http://www.tdp.cat/issues16/tdp.a284a17.pdf %0 Conference Paper %B Companion of the The Web Conference 2018 on The Web Conference 2018 %D 2018 %T SoBigData: Social Mining & Big Data Ecosystem %A Fosca Giannotti %A Roberto Trasarti %A Bontcheva, Kalina %A Valerio Grossi %X One of the most pressing and fascinating challenges scientists face today, is understanding the complexity of our globally interconnected society. The big data arising from the digital breadcrumbs of human activities has the potential of providing a powerful social microscope, which can help us understand many complex and hidden socio-economic phenomena. Such challenge requires high-level analytics, modeling and reasoning across all the social dimensions above. 
There is a need to harness these opportunities for scientific advancement and for the social good, compared to the currently prevalent exploitation of big data for commercial purposes or, worse, social control and surveillance. The main obstacle to this accomplishment, besides the scarcity of data scientists, is the lack of a large-scale open ecosystem where big data and social mining research can be carried out. The SoBigData Research Infrastructure (RI) provides an integrated ecosystem for ethic-sensitive scientific discoveries and advanced applications of social data mining on the various dimensions of social life as recorded by "big data". The research community uses the SoBigData facilities as a "secure digital wind-tunnel" for large-scale social data analysis and simulation experiments. SoBigData promotes repeatable and open science and supports data science research projects by providing: i) an ever-growing, distributed data ecosystem for procurement, access and curation and management of big social data, to underpin social data mining research within an ethic-sensitive context; ii) an ever-growing, distributed platform of interoperable, social data mining methods and associated skills: tools, methodologies and services for mining, analysing, and visualising complex and massive datasets, harnessing the techno-legal barriers to the ethically safe deployment of big data for social mining; iii) an ecosystem where protection of personal information and the respect for fundamental human rights can coexist with a safe use of the same information for scientific purposes of broad and central societal interest. SoBigData has a dedicated ethical and legal board, which is implementing a legal and ethical framework. 
%B Companion of the The Web Conference 2018 on The Web Conference 2018 %I International World Wide Web Conferences Steering Committee %G eng %U http://www.sobigdata.eu/sites/default/files/www%202018.pdf %0 Journal Article %J ACM computing surveys (CSUR) %D 2018 %T A survey of methods for explaining black box models %A Riccardo Guidotti %A Anna Monreale %A Salvatore Ruggieri %A Franco Turini %A Fosca Giannotti %A Dino Pedreschi %X In recent years, many accurate decision support systems have been constructed as black boxes, that is as systems that hide their internal logic to the user. This lack of explanation constitutes both a practical and an ethical issue. The literature reports many approaches aimed at overcoming this crucial weakness, sometimes at the cost of sacrificing accuracy for interpretability. The applications in which black box decision systems can be used are various, and each approach is typically developed to provide a solution for a specific problem and, as a consequence, it explicitly or implicitly delineates its own definition of interpretability and explanation. The aim of this article is to provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box system. Given a problem definition, a black box type, and a desired explanation, this survey should help the researcher to find the proposals more useful for his own work. The proposed classification of approaches to open black box models should also be useful for putting the many research open questions in perspective. 
%B ACM computing surveys (CSUR) %V 51 %P 93 %G eng %U https://dl.acm.org/doi/abs/10.1145/3236009 %R 10.1145/3236009 %0 Journal Article %J Royal Society open science %D 2017 %T Assessing the use of mobile phone data to describe recurrent mobility patterns in spatial epidemic models %A Cecilia Panigutti %A Tizzoni, Michele %A Bajardi, Paolo %A Smoreda, Zbigniew %A Colizza, Vittoria %X The recent availability of large-scale call detail record data has substantially improved our ability of quantifying human travel patterns with broad applications in epidemiology. Notwithstanding a number of successful case studies, previous works have shown that using different mobility data sources, such as mobile phone data or census surveys, to parametrize infectious disease models can generate divergent outcomes. Thus, it remains unclear to what extent epidemic modelling results may vary when using different proxies for human movements. Here, we systematically compare 658 000 simulated outbreaks generated with a spatially structured epidemic model based on two different human mobility networks: a commuting network of France extracted from mobile phone data and another extracted from a census survey. We compare epidemic patterns originating from all the 329 possible outbreak seed locations and identify the structural network properties of the seeding nodes that best predict spatial and temporal epidemic patterns to be alike. We find that similarity of simulated epidemics is significantly correlated to connectivity, traffic and population size of the seeding nodes, suggesting that the adequacy of mobile phone data for infectious disease models becomes higher when epidemics spread between highly connected and heavily populated locations, such as large urban areas. 
%B Royal Society open science %V 4 %P 160950 %G eng %0 Journal Article %J Information %D 2017 %T Discovering and Understanding City Events with Big Data: The Case of Rome %A Barbara Furletti %A Roberto Trasarti %A Paolo Cintia %A Lorenzo Gabrielli %X The increasing availability of large amounts of data and digital footprints has given rise to ambitious research challenges in many fields, ranging from medical research and the financial and commercial world to people and environmental monitoring. Whereas traditional data sources and censuses fail to capture actual and up-to-date behaviors, Big Data integrate the missing knowledge, providing useful and hidden information to analysts and decision makers. With this paper, we focus on the identification of city events by analyzing mobile phone data (Call Detail Record), and we study and evaluate the impact of these events over the typical city dynamics. We present an analytical process able to discover, understand and characterize city events from Call Detail Record, designing a distributed computation to implement Sociometer, a profiling tool to categorize phone users. The methodology provides a useful tool for city mobility managers to manage events and take future decisions on specific classes of users, i.e., residents, commuters and tourists. %B Information %V 8 %P 74 %8 06/2017 %G eng %U https://doi.org/10.3390/info8030074 %R 10.3390/info8030074 %0 Journal Article %J Education and Information Technologies %D 2017 %T An empirical verification of a-priori learning models on mailing archives in the context of online learning activities of participants in free\libre open source software (FLOSS) communities %A Mukala, Patrick %A Cerone, Antonio %A Franco Turini %X Free\Libre Open Source Software (FLOSS) environments are increasingly dubbed as learning environments where practical software engineering skills can be acquired. 
Numerous studies have extensively investigated how knowledge is acquired in these environments through a collaborative learning model that defines a learning process. Such a learning process, identified either as a result of surveys or by means of questionnaires, can be depicted through a series of graphical representations indicating the steps FLOSS community members go through as they acquire and exchange skills. These representations are referred to as a-priori learning models. They are Petri net-like workflow nets (WF-net) that provide a visual representation of the learning process as it is expected to occur. These models are representations of a learning framework or paradigm in FLOSS communities. As such, the credibility of any model is estimated through a process of model verification and validation. Therefore, in this paper, we analyze these models in comparison with the real behavior captured in FLOSS repositories by means of conformance verification in process mining. The purpose of our study is twofold. Firstly, the results of our analysis provide insights on the possible discrepancies that are observed between the initial theoretical representations of learning processes and the real behavior captured in FLOSS event logs, constructed from mailing archives. Secondly, this comparison helps foster the understanding of how learning actually takes place in FLOSS environments based on empirical evidence directly from the data. %B Education and Information Technologies %V 22 %P 3207–3229 %G eng %U https://link.springer.com/article/10.1007/s10639-017-9573-6 %R 10.1007/s10639-017-9573-6 %0 Journal Article %J D-Lib Magazine %D 2017 %T HyWare: a HYbrid Workflow lAnguage for Research E-infrastructures %A Leonardo Candela %A Paolo Manghi %A Fosca Giannotti %A Valerio Grossi %A Roberto Trasarti %X Research e-infrastructures are "systems of systems", patchworks of tools, services and data sources, evolving over time to address the needs of the scientific process. 
Accordingly, in such environments, researchers implement their scientific processes by means of workflows made of a variety of actions, including for example usage of web services, download and execution of shared software libraries or tools, or local and manual manipulation of data. Although scientists may benefit from sharing their scientific process, the heterogeneity underpinning e-infrastructures hinders their ability to represent, share and eventually reproduce such workflows. This work presents HyWare, a language for representing scientific processes in highly-heterogeneous e-infrastructures in terms of so-called hybrid workflows. HyWare lies in between "business process modeling languages", which offer a formal and high-level description of a reasoning, protocol, or procedure, and "workflow execution languages", which enable the fully automated execution of a sequence of computational steps via dedicated engines. %B D-Lib Magazine %V 23 %G eng %U http://dx.doi.org/10.1045/january2017-candela %R 10.1045/january2017-candela %0 Book Section %B Participatory Sensing, Opinions and Collective Awareness %D 2017 %T Large Scale Engagement Through Web-Gaming and Social Computations %A Vito D P Servedio %A Saverio Caminiti %A Pietro Gravino %A Vittorio Loreto %A Alina Sirbu %A Francesca Tria %X In the last few years the Web has progressively acquired the status of an infrastructure for social computation that allows researchers to coordinate the cognitive abilities of human agents, so as to steer the collective user activity towards predefined goals. This general trend is also triggering the adoption of web-games as an alternative laboratory to run experiments in the social sciences and whenever the contribution of human beings can be effectively used for research purposes. Web-games introduce a playful aspect in scientific experiments with the result of increasing participation of people and of keeping their attention steady in time. 
The aim of this chapter is to suggest a general purpose web-based platform scheme for web-gaming and social computation. This platform will simplify the realization of web-games and will act as a repository of different scientific experiments, thus realizing a sort of showcase that stimulates users’ curiosity and helps researchers in recruiting volunteers. A platform built by following these criteria has been developed within the EveryAware project, the Experimental Tribe (XTribe) platform, which is operational and ready to be used. Finally, a sample web-game hosted by the XTribe platform will be presented with the aim of reporting the results, in terms of participation and motivation, of two different player recruiting strategies. %B Participatory Sensing, Opinions and Collective Awareness %I Springer %P 237–254 %G eng %U http://link.springer.com/chapter/10.1007/978-3-319-25658-0_12 %R 10.1007/978-3-319-25658-0_12 %0 Conference Paper %B Personal Analytics and Privacy. An Individual and Collective Perspective - First International Workshop, {PAP} 2017, Held in Conjunction with {ECML} {PKDD} 2017, Skopje, Macedonia, September 18, 2017, Revised Selected Papers %D 2017 %T Movement Behaviour Recognition for Water Activities %A Mirco Nanni %A Roberto Trasarti %A Fosca Giannotti %B Personal Analytics and Privacy. An Individual and Collective Perspective - First International Workshop, {PAP} 2017, Held in Conjunction with {ECML} {PKDD} 2017, Skopje, Macedonia, September 18, 2017, Revised Selected Papers %G eng %U https://doi.org/10.1007/978-3-319-71970-2_7 %R 10.1007/978-3-319-71970-2_7 %0 Journal Article %J Information Systems %D 2017 %T MyWay: Location prediction via mobility profiling %A Roberto Trasarti %A Riccardo Guidotti %A Anna Monreale %A Fosca Giannotti %X Forecasting the future positions of mobile users is a valuable task allowing us to operate efficiently a myriad of different applications which need this type of information. 
We propose MyWay, a prediction system which exploits the individual systematic behaviors modeled by mobility profiles to predict human movements. MyWay provides three strategies: the individual strategy uses only the user’s individual mobility profile, the collective strategy takes advantage of all users’ individual systematic behaviors, and the hybrid strategy is a combination of the previous two. A key point is that MyWay only requires the sharing of individual mobility profiles, a concise representation of the user’s movements, instead of raw trajectory data revealing the detailed movement of the users. We evaluate the prediction performances of our proposal by a deep experimentation on large real-world data. The results highlight that the synergy between the individual and collective knowledge is the key for a better prediction and allows the system to outperform the state-of-the-art methods. %B Information Systems %V 64 %P 350–367 %8 03/2017 %G eng %0 Book Section %B Participatory Sensing, Opinions and Collective Awareness %D 2017 %T Opinion dynamics: models, extensions and external effects %A Alina Sirbu %A Vittorio Loreto %A Vito D P Servedio %A Francesca Tria %X Recently, social phenomena have received a lot of attention not only from social scientists, but also from physicists, mathematicians and computer scientists, in the emerging interdisciplinary field of complex system science. Opinion dynamics is one of the processes studied, since opinions are the drivers of human behaviour, and play a crucial role in many global challenges that our complex world and societies are facing: global financial crises, global pandemics, growth of cities, urbanisation and migration patterns, and last but not least important, climate change and environmental sustainability and protection. 
Opinion formation is a complex process affected by the interplay of different elements, including the individual predisposition, the influence of positive and negative peer interaction (social networks playing a crucial role in this respect), the information each individual is exposed to, and many others. Several models inspired by those in use in physics have been developed to encompass many of these elements, and to allow for the identification of the mechanisms involved in the opinion formation process and the understanding of their role, with the practical aim of simulating opinion formation and spreading under various conditions. These modelling schemes range from binary simple models such as the voter model, to multi-dimensional continuous approaches. Here, we provide a review of recent methods, focusing on models employing both peer interaction and external information, and emphasising the role that less studied mechanisms, such as disagreement, have in driving the opinion dynamics. Due to the important role that external information (mainly in the form of mass media broadcast) can have in enhancing awareness of social issues, a special emphasis will be devoted to studying the different forms it can take, investigating their effectiveness in driving the opinion formation at the population level. The review shows that, although a large number of approaches exist, some mechanisms such as the effect of multiple external information sources could largely benefit from further studies. Additionally, model validation with real data, which are starting to become available, is still largely lacking and should in our opinion be the main ambition of future investigations. 
%B Participatory Sensing, Opinions and Collective Awareness %I Springer %P 363–401 %G eng %U http://link.springer.com/chapter/10.1007/978-3-319-25658-0_17 %R 10.1007/978-3-319-25658-0_17 %0 Journal Article %J Data Mining and Knowledge Discovery %D 2017 %T Survey on using constraints in data mining %A Valerio Grossi %A Andrea Romei %A Franco Turini %X This paper provides an overview of the current state-of-the-art on using constraints in knowledge discovery and data mining. The use of constraints in a data mining task requires specific definition and satisfaction tools during knowledge extraction. This survey proposes three groups of studies based on classification, clustering and pattern mining, whether the constraints are on the data, the models or the measures, respectively. We consider the distinctions between hard and soft constraint satisfaction, and between the knowledge extraction phases where constraints are considered. In addition to discussing how constraints can be used in data mining, we show how constraint-based languages can be used throughout the data mining process. 
%B Data Mining and Knowledge Discovery %V 31 %P 424–464 %G eng %R 10.1007/s10618-016-0480-z %0 Conference Paper %B 4th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2017) %D 2017 %T There's A Path For Everyone: A Data-Driven Personal Model Reproducing Mobility Agendas %A Riccardo Guidotti %A Roberto Trasarti %A Mirco Nanni %A Fosca Giannotti %A Dino Pedreschi %B 4th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2017) %I IEEE %C Tokyo %G eng %0 Conference Paper %B Data Mining Workshops (ICDMW), 2016 IEEE 16th International Conference on %D 2016 %T Classification Rule Mining Supported by Ontology for Discrimination Discovery %A Luong, Binh Thanh %A Salvatore Ruggieri %A Franco Turini %X Discrimination discovery from data consists of designing data mining methods for the actual discovery of discriminatory situations and practices hidden in a large amount of historical decision records. Approaches based on classification rule mining consider items at a flat concept level, with no exploitation of background knowledge on the hierarchical and inter-relational structure of domains. On the other hand, ontologies are a widespread and ever increasing means for expressing such a knowledge. In this paper, we propose a framework for discrimination discovery from ontologies, where contexts of prima-facie evidence of discrimination are summarized in the form of generalized classification rules at different levels of abstraction. Throughout the paper, we adopt a motivating and intriguing case study based on discriminatory tariffs applied by the U. S. Harmonized Tariff Schedules on imported goods. 
%B Data Mining Workshops (ICDMW), 2016 IEEE 16th International Conference on %I IEEE %G eng %R 10.1109/ICDMW.2016.0128 %0 Book Section %B Data Mining and Constraint Programming %D 2016 %T Data Mining and Constraints: An Overview %A Valerio Grossi %A Dino Pedreschi %A Franco Turini %X This paper provides an overview of the current state-of-the-art on using constraints in knowledge discovery and data mining. The use of constraints requires mechanisms for defining and evaluating them during the knowledge extraction process. We give a structured account of three main groups of constraints based on the specific context in which they are defined and used. The aim is to provide a complete view on constraints as a building block of data mining methods. %B Data Mining and Constraint Programming %I Springer International Publishing %P 25–48 %G eng %R 10.1007/978-3-319-50137-6_2 %0 Journal Article %J Journal ACM Transactions on Intelligent Systems and Technology (TIST) %D 2016 %T Driving Profiles Computation and Monitoring for Car Insurance CRM %A Mirco Nanni %A Roberto Trasarti %A Anna Monreale %A Valerio Grossi %A Dino Pedreschi %X Customer segmentation is one of the most traditional and valued tasks in customer relationship management (CRM). In this article, we explore the problem in the context of the car insurance industry, where the mobility behavior of customers plays a key role: Different mobility needs, driving habits, and skills imply also different requirements (level of coverage provided by the insurance) and risks (of accidents). In the present work, we describe a methodology to extract several indicators describing the driving profile of customers, and we provide a clustering-oriented instantiation of the segmentation problem based on such indicators. Then, we consider the availability of a continuous flow of fresh mobility data sent by the circulating vehicles, aiming at keeping our segments constantly up to date. 
We tackle a major scalability issue that emerges in this context when the number of customers is large, namely the communication bottleneck, by proposing and implementing a sophisticated distributed monitoring solution that reduces communications between vehicles and company servers to the essential. We validate the framework on a large database of real mobility data coming from GPS devices on private cars. Finally, we analyze the privacy risks that the proposed approach might involve for the users, providing and evaluating a countermeasure based on data perturbation. %B Journal ACM Transactions on Intelligent Systems and Technology (TIST) %V 8 %P 14:1–14:26 %G eng %U http://doi.acm.org/10.1145/2912148 %R 10.1145/2912148 %0 Conference Paper %B Joint European Conference on Machine Learning and Knowledge Discovery in Databases %D 2016 %T A KDD process for discrimination discovery %A Salvatore Ruggieri %A Franco Turini %X The acceptance of analytical methods for discrimination discovery by practitioners and legal scholars can be only achieved if the data mining and machine learning communities will be able to provide case studies, methodological refinements, and the consolidation of a KDD process. We summarize here an approach along these directions. %B Joint European Conference on Machine Learning and Knowledge Discovery in Databases %I Springer International Publishing %G eng %R 10.1007/978-3-319-46131-1_28 %0 Book %D 2016 %T Realising the European open science cloud %A Ayris, Paul %A Berthou, Jean-Yves %A Bruce, Rachel %A Lindstaedt, Stefanie %A Anna Monreale %A Mons, Barend %A Murayama, Yasuhiro %A Södergård, Caj %A Tochtermann, Klaus %A Wilkinson, Ross %X The European Open Science Cloud (EOSC) aims to accelerate and support the current transition to more effective Open Science and Open Innovation in the Digital Single Market. 
It should enable trusted access to services, systems and the re-use of shared scientific data across disciplinary, social and geographical borders. This report approaches the EOSC as a federated environment for scientific data sharing and re-use, based on existing and emerging elements in the Member States, with light-weight international guidance and governance, and a large degree of freedom regarding practical implementation. %@ 978-92-79-61762-1 %G eng %U http://dx.doi.org/10.2777/940154 %R 10.2777/940154 %0 Journal Article %J Computer Communications %D 2016 %T Special Issue on Mobile Traffic Analytics %A Marco Fiore %A Zubair Shafiq %A Zbigniew Smoreda %A Razvan Stanica %A Roberto Trasarti %X This Special Issue of Computer Communications is dedicated to mobile traffic data analysis. This is an emerging field of research that stems from the increasing pervasiveness in our lives of always-connected mobile devices. These devices continuously collect, generate, receive or communicate data; in doing so, they leave trails of digital crumbs that can be followed, recorded and analysed in many and varied ways, and for a number of different purposes. From a data collection perspective, applications running on smartphones allow tracking user activities with extreme accuracy, in terms of mobility, context, and service usage. Yet, having individuals informedly install and run software that monitors their actions is not obvious; finding adequate incentives is equivalently complex. The other option is gathering mobile traffic data in the mobile network. This is an increasingly common practice for telecommunication operators: the collection of minimum information required for billing is giving way to in-depth inspection and recording of mobile service usages in space and time, and of traffic flows at the network edge and core. In this case, data access remains the major impediment, due to privacy and industrial secrecy reasons. 
Despite the issues inherent to the data collection, the richness of knowledge that can be extracted from the aforementioned sources is such that actors in both academia and industry are putting significant effort into gathering, analysing and possibly making available mobile traffic data. Indeed, mobile traffic data typically contain information on large populations of individuals (from thousands to millions of users) with high spatio-temporal granularity. The combination of accuracy and coverage is unprecedented, and it has proven key in validating theories and scaling up experimental studies in a number of research fields across many disciplines, including physics, sociology, epidemiology, transportation systems, and, of course, mobile networking. As a result, we witness today a rapid growth of the literature that proposes or exploits mobile traffic analytics. Included in this Special Issue are eight papers that cover a significant portion of the different research topics in this area, ranging from data collection to the characterization of land use and mobile service consumption, from the inference and prediction of user mobility to the detection of malicious traffic. These papers were selected from 30 high-quality submissions after at least two rounds of reviews by experts and guest editors. The original submissions were received from five continents and a variety of countries, including Austria, Argentina, Belgium, Brazil, Chile, China, France, Germany, Italy, South Korea, Luxembourg, Pakistan, Saudi Arabia, Spain, Sweden, Tunisia, Turkey, USA. The accepted papers reflect this geographical heterogeneity, and are authored by researchers based in Europe, North and South America. 
%B Computer Communications %V 95 %P 1–2 %G eng %U http://dx.doi.org/10.1016/j.comcom.2016.10.009 %R 10.1016/j.comcom.2016.10.009 %0 Conference Paper %B Proceedings of the 1st International Conference on Complex Information Systems %D 2016 %T Unveiling Political Opinion Structures with a Web-experiment %A Pietro Gravino %A Saverio Caminiti %A Alina Sirbu %A Francesca Tria %A Vito D P Servedio %A Vittorio Loreto %X The dynamics of political votes has been widely studied, both for its practical interest and as a paradigm of the dynamics of mass opinions and collective phenomena, where theoretical predictions can be easily tested. However, the vote outcome is often influenced by many factors beyond the bare opinion on the candidate, and in most cases it is bound to a single preference. The voter perception of the political space is still to be elucidated. We here propose a web experiment (laPENSOcosì) where we explicitly investigate participants’ opinions on political entities (parties, coalitions, individual candidates) of the Italian political scene. As a main result, we show that the political perception follows a Weber-Fechner-like law, i.e., when ranking political entities according to the user expressed preferences, the perceived distance of the user from a given entity scales as the logarithm of this rank. %B Proceedings of the 1st International Conference on Complex Information Systems %@ 978-989-758-181-6 %G eng %U http://www.scitepress.org/DigitalLibrary/Link.aspx?doi=10.5220/0005906300390047 %R 10.5220/0005906300390047 %0 Conference Paper %B IEEE Big Data %D 2015 %T City users’ classification with mobile phone data %A Lorenzo Gabrielli %A Barbara Furletti %A Roberto Trasarti %A Fosca Giannotti %A Dino Pedreschi %X Nowadays mobile phone data are an actual proxy for studying the users’ social life and urban dynamics. 
In this paper we present the Sociometer, an analytical framework aimed at classifying mobile phone users into behavioral categories by means of their call habits. The analytical process starts from spatio-temporal profiles, learns the different behaviors, and returns annotated profiles. After the description of the methodology and its evaluation, we present an application of the Sociometer for studying city users of one small and one big city, evaluating the impact of big events in these cities. %B IEEE Big Data %C Santa Clara (CA) - USA %8 11/2015 %G eng %0 Conference Paper %B Software Engineering and Formal Methods - {SEFM} 2015 Collocated Workshops: ATSE, HOFM, MoKMaSD, and VERY*SCART, York, UK, September 7-8, 2015, Revised Selected Papers %D 2015 %T Clustering Formulation Using Constraint Optimization %A Valerio Grossi %A Anna Monreale %A Mirco Nanni %A Dino Pedreschi %A Franco Turini %X The problem of clustering a set of data is a textbook machine learning problem, but at the same time, at heart, a typical optimization problem. Given an objective function, such as minimizing the intra-cluster distances or maximizing the inter-cluster distances, the task is to find an assignment of data points to clusters that achieves this objective. In this paper, we present a constraint programming model for a centroid based clustering and one for a density based clustering. In particular, as a key contribution, we show how the expressivity introduced by the formulation of the problem by constraint programming makes the standard problem easy to be extended with other constraints that permit to generate interesting variants of the problem. We show this important aspect in two different ways: first, we show how the formulation of the density-based clustering by constraint programming makes it very similar to the label propagation problem and then, we propose a variant of the standard label propagation approach. 
%B Software Engineering and Formal Methods - {SEFM} 2015 Collocated Workshops: ATSE, HOFM, MoKMaSD, and VERY*SCART, York, UK, September 7-8, 2015, Revised Selected Papers %I Springer Berlin Heidelberg %G eng %U http://dx.doi.org/10.1007/978-3-662-49224-6_9 %R 10.1007/978-3-662-49224-6_9 %0 Conference Paper %B 2015 {IEEE} 18th International Conference on Intelligent Transportation Systems %D 2015 %T ComeWithMe: An Activity-Oriented Carpooling Approach %A Vinicius Monteiro de Lira %A Valéria Cesário Times %A Chiara Renso %A S Rinzivillo %X The interest in carpooling is increasing due to the need to reduce traffic and noise pollution. Most of the available approaches and systems are route oriented, where driver and passengers are matched when the destination location is the same. ComeWithMe offers a new perspective: the destination is the intended activity instead of a location. This novel matching method is aimed to boost the possibilities of rides if passenger reaches a different location maintaining the activity. We conducted experiments using a real data set of trajectories and our results showed that the proposed matching algorithm improved the traditional carpooling approach in more than 80%. %B 2015 {IEEE} 18th International Conference on Intelligent Transportation Systems %I Institute of Electrical {&} Electronics Engineers ({IEEE}) %8 09/2015 %G eng %U http://dx.doi.org/10.1109/itsc.2015.414 %R 10.1109/itsc.2015.414 %0 Conference Paper %B NetMob %D 2015 %T Detecting and understanding big events in big cities %A Barbara Furletti %A Lorenzo Gabrielli %A Roberto Trasarti %A Zbigniew Smoreda %A Maarten Vanhoof %A Cezary Ziemlicki %X Recent studies have shown the great potential of big data such as mobile phone location data to model human behavior. Big data allow to analyze people presence in a territory in a fast and effective way with respect to the classical surveys (diaries or questionnaires). 
One of the drawbacks of these collection systems is incompleteness of the users' traces; people are localized only when they are using their phones. In this work we define a data mining method for identifying people presence and understanding the impact of big events in big cities. We exploit the ability of the Sociometer for classifying mobile phone users in mobility categories through their presence profile. The experiment in cooperation with Orange Telecom has been conducted in Paris during the event Fête de la Musique using a privacy preserving protocol. %B NetMob %C Boston %8 04/2015 %G eng %U http://www.netmob.org/assets/img/netmob15_book_of_abstracts_posters.pdf %0 Generic %D 2015 %T An exploration of learning processes as process maps in FLOSS repositories %A Mukala, Patrick %A Cerone, Antonio %A Franco Turini %X Evidence suggests that Free/Libre Open Source Software (FLOSS) environments provide unlimited learning opportunities. Community members engage in a number of activities both during their interaction with their peers and while making use of the tools available in these environments. A number of studies document the existence of learning processes in FLOSS through the analysis of surveys and questionnaires filled by FLOSS project participants. At the same time, the interest in understanding the dynamics of the FLOSS phenomenon, its popularity and success resulted in the development of tools and techniques for extracting and analyzing data from different FLOSS data sources. This new field is called Mining Software Repositories (MSR). In spite of these efforts, there is limited work aiming to provide empirical evidence of learning processes directly from FLOSS repositories. In this paper, we seek to trigger such an initiative by proposing an approach based on Process Mining to trace learning behaviors from FLOSS participants’ trails of activities, as recorded in FLOSS repositories, and visualize them as process maps. 
Process maps provide a pictorial representation of real behavior as it is recorded in FLOSS data. Our aim is to provide critical evidence that boosts the understanding of learning behavior in FLOSS communities by analyzing the relevant repositories. In order to accomplish this, we propose an effective approach that comprises first the mining of FLOSS repositories in order to generate Event logs, and then the generation of process maps, equipped with relevant statistical data interpreting and indicating the value of process discovery from these repositories. %G eng %U http://eprints.adm.unipi.it/id/eprint/2344 %0 Conference Paper %B Data Science and Advanced Analytics (DSAA), 2015. 36678 2015. IEEE International Conference on %D 2015 %T The layered structure of company share networks %A Andrea Romei %A Salvatore Ruggieri %A Franco Turini %X We present a framework for the analysis of corporate governance problems using network science and graph algorithms on ownership networks. In such networks, nodes model companies/shareholders and edges model shares owned. Inspired by the widespread pyramidal organization of corporate groups of companies, we model ownership networks as layered graphs, and exploit the layered structure to design feasible and efficient solutions to three key problems of corporate governance. The first one is the long-standing problem of computing direct and indirect ownership (integrated ownership problem). The other two problems are introduced here: computing direct and indirect dividends (dividend problem), and computing the group of companies controlled by a parent shareholder (corporate group problem). We conduct an extensive empirical analysis of the Italian ownership network, which, with its 3.9M nodes, is 30× the largest network studied so far. %B Data Science and Advanced Analytics (DSAA), 2015. 36678 2015. 
IEEE International Conference on %I IEEE %G eng %R 10.1109/DSAA.2015.7344809 %0 Conference Paper %B Conference on e-Business, e-Services and e-Society %D 2015 %T Mining learning processes from FLOSS mailing archives %A Mukala, Patrick %A Cerone, Antonio %A Franco Turini %X Evidence suggests that Free/Libre Open Source Software (FLOSS) environments provide unlimited learning opportunities. Community members engage in a number of activities both during their interaction with their peers and while making use of these environments. As FLOSS repositories store data about participants’ interaction and activities, we analyze participants’ interaction and knowledge exchange in emails to trace learning activities that occur in distinct phases of the learning process. We make use of semantic search in SQL to retrieve data and build corresponding event logs which are then fed to a process mining tool in order to produce visual workflow nets. We view these nets as representative of the traces of learning activities in FLOSS as well as their relevant flow of occurrence. Additional statistical details are provided to contextualize and describe these models. %B Conference on e-Business, e-Services and e-Society %I Springer, Cham %G eng %R 10.1007/978-3-319-25013-7_23 %0 Journal Article %J PLoS One %D 2015 %T Participatory Patterns in an International Air Quality Monitoring Initiative. %A Alina Sirbu %A Becker, Martin %A Saverio Caminiti %A De Baets, Bernard %A Elen, Bart %A Francis, Louise %A Pietro Gravino %A Hotho, Andreas %A Ingarra, Stefano %A Vittorio Loreto %A Molino, Andrea %A Mueller, Juergen %A Peters, Jan %A Ricchiuti, Ferdinando %A Saracino, Fabio %A Vito D P Servedio %A Stumme, Gerd %A Theunis, Jan %A Francesca Tria %A Van den Bossche, Joris %X

The issue of sustainability is at the top of the political and societal agenda, being considered of extreme importance and urgency. Human individual action impacts the environment both locally (e.g., local air/water quality, noise disturbance) and globally (e.g., climate change, resource use). Urban environments represent a crucial example, with an increasing realization that the most effective way of producing a change is involving the citizens themselves in monitoring campaigns (a citizen science bottom-up approach). This is possible by developing novel technologies and IT infrastructures enabling large citizen participation. Here, in the wider framework of one of the first such projects, we show results from an international competition where citizens were involved in mobile air pollution monitoring using low cost sensing devices, combined with a web-based game to monitor perceived levels of pollution. Measures of shift in perceptions over the course of the campaign are provided, together with insights into participatory patterns emerging from this study. Interesting effects related to inertia and to direct involvement in measurement activities rather than indirect information exposure are also highlighted, indicating that direct involvement can enhance learning and environmental awareness. In the future, this could result in better adoption of policies towards decreasing pollution.

%B PLoS One %V 10 %P e0136763 %8 2015 %G eng %R 10.1371/journal.pone.0136763 %0 Journal Article %J Journal of Trust Management %D 2015 %T A risk model for privacy in trajectory data %A Anirban Basu %A Anna Monreale %A Roberto Trasarti %A Juan Camilo Corena %A Fosca Giannotti %A Dino Pedreschi %A Shinsaku Kiyomoto %A Yutaka Miyake %A Tadashi Yanagihara %X Time sequence data relating to users, such as medical histories and mobility data, are good candidates for data mining, but often contain highly sensitive information. Different methods in privacy-preserving data publishing are utilised to release such private data so that individual records in the released data cannot be re-linked to specific users with a high degree of certainty. These methods provide theoretical worst-case privacy risks as measures of the privacy protection that they offer. However, often with many real-world data the worst-case scenario is too pessimistic and does not provide a realistic view of the privacy risks: the real probability of re-identification is often much lower than the theoretical worst-case risk. In this paper, we propose a novel empirical risk model for privacy which, in relation to the cost of privacy attacks, demonstrates better the practical risks associated with a privacy preserving data release. We show detailed evaluation of the proposed risk model by using k-anonymised real-world mobility data and then, we show how the empirical evaluation of the privacy risk has a different trend in synthetic data describing random movements. 
%B Journal of Trust Management %V 2 %P 9 %G eng %R 10.1186/s40493-015-0020-6 %0 Conference Paper %B Proceedings of the 23rd {SIGSPATIAL} International Conference on Advances in Geographic Information Systems, Bellevue, WA, USA, November 3-6, 2015 %D 2015 %T {TOSCA:} two-steps clustering algorithm for personal locations detection %A Riccardo Guidotti %A Roberto Trasarti %A Mirco Nanni %B Proceedings of the 23rd {SIGSPATIAL} International Conference on Advances in Geographic Information Systems, Bellevue, WA, USA, November 3-6, 2015 %G eng %U http://doi.acm.org/10.1145/2820783.2820818 %R 10.1145/2820783.2820818 %0 Conference Paper %B Proceedings of the 4th {ACM} {SIGSPATIAL} International Workshop on Mobile Geographic Information Systems, MobiGIS 2015, Bellevue, WA, USA, November 3-6, 2015 %D 2015 %T Towards user-centric data management: individual mobility analytics for collective services %A Riccardo Guidotti %A Roberto Trasarti %A Mirco Nanni %A Fosca Giannotti %B Proceedings of the 4th {ACM} {SIGSPATIAL} International Workshop on Mobile Geographic Information Systems, MobiGIS 2015, Bellevue, WA, USA, November 3-6, 2015 %G eng %U http://doi.acm.org/10.1145/2834126.2834132 %R 10.1145/2834126.2834132 %0 Conference Paper %B International Conference on Software Engineering and Formal Methods %D 2014 %T An abstract state machine (ASM) representation of learning process in FLOSS communities %A Mukala, Patrick %A Cerone, Antonio %A Franco Turini %X Free/Libre Open Source Software (FLOSS) communities as collaborative environments enable the occurrence of learning between participants in these groups. With the increasing interest research on understanding the mechanisms and processes through which learning occurs in FLOSS, there is an imperative to describe these processes. One successful way of doing this is through specification methods. 
In this paper, we describe the adoption of Abstract States Machines (ASMs) as a specification methodology for the description of learning processes in FLOSS. The goal of this endeavor is to represent the many possible steps and/or activities FLOSS participants go through during interactions that can be categorized as learning processes. Through ASMs, we express learning phases as states while activities that take place before moving from one state to another are expressed as transitions. %B International Conference on Software Engineering and Formal Methods %I Springer, Cham %G eng %R 10.1007/978-3-319-15201-1_15 %0 Conference Paper %B EDBT/ICDT 2014 Workshops - Mining Urban Data (MUD) %D 2014 %T Big data analytics for smart mobility: a case study %A Barbara Furletti %A Roberto Trasarti %A Lorenzo Gabrielli %A Mirco Nanni %A Dino Pedreschi %B EDBT/ICDT 2014 Workshops - Mining Urban Data (MUD) %C Athens, Greece %8 03/2014 %U http://ceur-ws.org/Vol-1133/paper-57.pdf %M ISSN - 1613-0073 %0 Journal Article %J Concurrency and Computation: Practice and Experience %D 2014 %T Decision tree building on multi-core using FastFlow %A Aldinucci, Marco %A Salvatore Ruggieri %A Torquati, Massimo %X The whole computer hardware industry embraced the multi-core. The extreme optimisation of sequential algorithms is then no longer sufficient to squeeze the real machine power, which can be only exploited via thread-level parallelism. Decision tree algorithms exhibit natural concurrency that makes them suitable to be parallelised. This paper presents an in-depth study of the parallelisation of an implementation of the C4.5 algorithm for multi-core architectures. We characterise elapsed time lower bounds for the forms of parallelisations adopted and achieve close to optimal performance. Our implementation is based on the FastFlow parallel programming environment, and it requires minimal changes to the original sequential code. Copyright © 2013 John Wiley & Sons, Ltd. 
%B Concurrency and Computation: Practice and Experience %V 26 %P 800–820 %G eng %R 10.1002/cpe.3063 %0 Journal Article %J Telecommunications Policy %D 2014 %T Discovering urban and country dynamics from mobile phone data with spatial correlation patterns %A Roberto Trasarti %A Ana-Maria Olteanu-Raimond %A Mirco Nanni %A Thomas Couronné %A Barbara Furletti %A Fosca Giannotti %A Zbigniew Smoreda %A Cezary Ziemlicki %K Urban dynamics %X Mobile communication technologies pervade our society and existing wireless networks are able to sense the movement of people, generating large volumes of data related to human activities, such as mobile phone call records. At the present, this kind of data is collected and stored by telecom operators infrastructures mainly for billing reasons, yet it represents a major source of information in the study of human mobility. In this paper, we propose an analytical process aimed at extracting interconnections between different areas of the city that emerge from highly correlated temporal variations of population local densities. To accomplish this objective, we propose a process based on two analytical tools: (i) a method to estimate the presence of people in different geographical areas; and (ii) a method to extract time- and space-constrained sequential patterns capable to capture correlations among geographical areas in terms of significant co-variations of the estimated presence. The methods are presented and combined in order to deal with two real scenarios of different spatial scale: the Paris Region and the whole France. 
%B Telecommunications Policy %P - %U http://www.sciencedirect.com/science/article/pii/S0308596113002012 %R 10.1016/j.telpol.2013.12.002 %0 Conference Paper %B 18th International Database Engineering {&} Applications Symposium, {IDEAS} 2014, Porto, Portugal, July 7-9, 2014 %D 2014 %T Investigating semantic regularity of human mobility lifestyle %A Vinicius Monteiro de Lira %A S Rinzivillo %A Chiara Renso %A Valéria Cesário Times %A Patrícia C. A. R. Tedesco %B 18th International Database Engineering {&} Applications Symposium, {IDEAS} 2014, Porto, Portugal, July 7-9, 2014 %I ACM %C Porto, Portugal %P 314–317 %U http://doi.acm.org/10.1145/2628194.2628226 %R 10.1145/2628194.2628226 %0 Conference Paper %B Web Engineering, 14th International Conference, {ICWE} 2014, Toulouse, France, July 1-4, 2014. Proceedings %D 2014 %T {MAPMOLTY:} {A} Web Tool for Discovering Place Loyalty Based on Mobile Crowdsource Data %A Vinicius Monteiro de Lira %A S Rinzivillo %A Valéria Cesário Times %A Chiara Renso %B Web Engineering, 14th International Conference, {ICWE} 2014, Toulouse, France, July 1-4, 2014. Proceedings %P 528–531 %U http://dx.doi.org/10.1007/978-3-319-08245-5_43 %R 10.1007/978-3-319-08245-5_43 %0 Book Section %B Data Science and Simulation in Transportation Research %D 2014 %T Mobility Profiling %A Mirco Nanni %A Roberto Trasarti %A Paolo Cintia %A Barbara Furletti %A Chiara Renso %A Lorenzo Gabrielli %A S Rinzivillo %A Fosca Giannotti %X The ability to understand the dynamics of human mobility is crucial for tasks like urban planning and transportation management. 
The recent rapidly growing availability of large spatio-temporal datasets gives us the possibility to develop sophisticated and accurate analysis methods and algorithms that can enable us to explore several relevant mobility phenomena: the distinct access paths to a territory, the groups of persons that move together in space and time, the regions of a territory that contains a high density of traffic demand, etc. All these paradigmatic perspectives focus on a collective view of the mobility where the interesting phenomenon is the result of the contribution of several moving objects. In this chapter, the authors explore a different approach to the topic and focus on the analysis and understanding of relevant individual mobility habits in order to assign a profile to an individual on the basis of his/her mobility. This process adds a semantic level to the raw mobility data, enabling further analyses that require a deeper understanding of the data itself. The studies described in this chapter are based on two large datasets of spatio-temporal data, originated, respectively, from GPS-equipped devices and from a mobile phone network. %B Data Science and Simulation in Transportation Research %I IGI Global %P 1-29 %& 1 %R 10.4018/978-1-4666-4920-0.ch001 %0 Conference Paper %B International Conference on Software Engineering and Formal Methods %D 2014 %T Ontolifloss: Ontology for learning processes in FLOSS communities %A Mukala, Patrick %A Cerone, Antonio %A Franco Turini %X Free/Libre Open Source Software (FLOSS) communities are considered an example of commons-based peer-production models where groups of participants work together to achieve projects of common purpose. In these settings, many occurring activities can be documented and have established them as learning environments. As knowledge exchange is proved to occur in FLOSS, the dynamic and free nature of participation poses a great challenge in understanding activities pertaining to Learning Processes. 
In this paper we raise this question and propose an ontology (called OntoLiFLOSS) in order to define terms and concepts that can explain learning activities taking place in these communities. The objective of this endeavor is to define in the simplest possible way a common definition of concepts and activities that can guide the identification of learning processes taking place among FLOSS members in any of the standard repositories such as mailing list, SVN, bug trackers and even discussion forums. %B International Conference on Software Engineering and Formal Methods %I Springer, Cham %G eng %R 10.1007/978-3-319-15201-1_11 %0 Conference Paper %B Trust Management {VIII} - 8th {IFIP} {WG} 11.11 International Conference, {IFIPTM} 2014, Singapore, July 7-10, 2014. Proceedings %D 2014 %T A Privacy Risk Model for Trajectory Data %A Anirban Basu %A Anna Monreale %A Juan Camilo Corena %A Fosca Giannotti %A Dino Pedreschi %A Shinsaku Kiyomoto %A Yutaka Miyake %A Tadashi Yanagihara %A Roberto Trasarti %X Time sequence data relating to users, such as medical histories and mobility data, are good candidates for data mining, but often contain highly sensitive information. Different methods in privacy-preserving data publishing are utilised to release such private data so that individual records in the released data cannot be re-linked to specific users with a high degree of certainty. These methods provide theoretical worst-case privacy risks as measures of the privacy protection that they offer. However, often with many real-world data the worst-case scenario is too pessimistic and does not provide a realistic view of the privacy risks: the real probability of re-identification is often much lower than the theoretical worst-case risk. In this paper we propose a novel empirical risk model for privacy which, in relation to the cost of privacy attacks, demonstrates better the practical risks associated with a privacy preserving data release. 
We show detailed evaluation of the proposed risk model by using k-anonymised real-world mobility data. %B Trust Management {VIII} - 8th {IFIP} {WG} 11.11 International Conference, {IFIPTM} 2014, Singapore, July 7-10, 2014. Proceedings %P 125–140 %U http://dx.doi.org/10.1007/978-3-662-43813-8_9 %R 10.1007/978-3-662-43813-8_9 %0 Conference Paper %B International Conference on Software Engineering and Formal Methods %D 2014 %T Process mining event logs from FLOSS data: state of the art and perspectives %A Mukala, Patrick %A Cerone, Antonio %A Franco Turini %X Free/Libre Open Source Software (FLOSS) is a phenomenon that has undoubtedly triggered extensive research endeavors. At the heart of these initiatives is the ability to mine data from FLOSS repositories with the hope of revealing empirical evidence to answer existing questions on the FLOSS development process. In spite of the success produced with existing mining techniques, emerging questions about FLOSS data require alternative and more appropriate ways to explore and analyse such data. In this paper, we explore a different perspective called process mining. Process mining has been proved to be successful in terms of tracing and reconstructing process models from data logs (event logs). The chief objective of our analysis is threefold. We aim to achieve: (1) conformance to predefined models; (2) discovery of new model patterns; and, finally, (3) extension to predefined models. %B International Conference on Software Engineering and Formal Methods %I Springer, Cham %G eng %R 10.1007/978-3-319-15201-1_12 %0 Journal Article %J PLoS One %D 2013 %T Awareness and learning in participatory noise sensing. %A Becker, Martin %A Saverio Caminiti %A Fiorella, Donato %A Francis, Louise %A Pietro Gravino %A Haklay, Mordechai Muki %A Hotho, Andreas %A Vittorio Loreto %A Mueller, Juergen %A Ricchiuti, Ferdinando %A Vito D P Servedio %A Alina Sirbu %A Francesca Tria %X

The development of ICT infrastructures has facilitated the emergence of new paradigms for looking at society and the environment over the last few years. Participatory environmental sensing, i.e. directly involving citizens in environmental monitoring, is one example, which is hoped to encourage learning and enhance awareness of environmental issues. In this paper, an analysis of the behaviour of individuals involved in noise sensing is presented. Citizens have been involved in noise measuring activities through the WideNoise smartphone application. This application has been designed to record both objective (noise samples) and subjective (opinions, feelings) data. The application has been open to be used freely by anyone and has been widely employed worldwide. In addition, several test cases have been organised in European countries. Based on the information submitted by users, an analysis of emerging awareness and learning is performed. The data show that changes in the way the environment is perceived after repeated usage of the application do appear. Specifically, users learn how to recognise different noise levels they are exposed to. Additionally, the subjective data collected indicate an increased user involvement in time and a categorisation effect between pleasant and less pleasant environments.

%B PLoS One %V 8 %P e81638 %8 2013 %G eng %R 10.1371/journal.pone.0081638 %0 Conference Paper %B Entity Relationship Conference - ER 2013 %D 2013 %T Baquara: A Holistic Ontological Framework for Movement Analysis with Linked Data %A Renato Fileto %A Marcelo Krger %A Nikos Pelekis %A Yannis Theodoridis %A Chiara Renso %B Entity Relationship Conference - ER 2013 %C Hong Kong %0 Journal Article %J Advances in Complex Systems %D 2013 %T Cohesion, consensus and extreme information in opinion dynamics %A Alina Sirbu %A Vittorio Loreto %A Vito D P Servedio %A Francesca Tria %B Advances in Complex Systems %V 16 %P 1350035 %G eng %U http://www.worldscientific.com/doi/abs/10.1142/S0219525913500355 %R 10.1142/S0219525913500355 %0 Book Section %B Discrimination and privacy in the information society %D 2013 %T The discovery of discrimination %A Dino Pedreschi %A Salvatore Ruggieri %A Franco Turini %B Discrimination and privacy in the information society %I Springer %P 91–108 %G eng %0 Journal Article %J Expert Systems with Applications %D 2013 %T Discrimination discovery in scientific project evaluation: A case study %A Andrea Romei %A Salvatore Ruggieri %A Franco Turini %B Expert Systems with Applications %V 40 %P 6064–6079 %G eng %0 Conference Paper %B 2013 {IEEE} 14th International Conference on Mobile Data Management, Milan, Italy, June 3-6, 2013 - Volume 2 %D 2013 %T A Gravity Model for Speed Estimation over Road Network %A Paolo Cintia %A Roberto Trasarti %A José Antônio Fernandes de Macêdo %A Livia Almada %A Camila Fereira %B 2013 {IEEE} 14th International Conference on Mobile Data Management, Milan, Italy, June 3-6, 2013 - Volume 2 %G eng %U http://dx.doi.org/10.1109/MDM.2013.83 %R 10.1109/MDM.2013.83 %0 Journal Article %J Knowl. Inf. Syst. %D 2013 %T How you move reveals who you are: understanding human behavior by analyzing trajectory data %A Chiara Renso %A Miriam Baglioni %A José Antônio Fernandes de Macêdo %A Roberto Trasarti %A Monica Wachowicz %B Knowl. Inf. 
Syst. %V 37 %P 331–362 %G eng %U http://dx.doi.org/10.1007/s10115-012-0511-z %R 10.1007/s10115-012-0511-z %0 Conference Paper %B SecoGIS 2013 - International Workshop on Semantic Aspects of GIS, Joint to ER conference 2013 %D 2013 %T Mob-Warehouse: A semantic approach for mobility analysis with a Trajectory Data Warehouse %A Ricardo Wagner %A José Antônio Fernandes de Macêdo %A Alessandra Raffaetà %A Chiara Renso %A Alessandro Roncato %A Roberto Trasarti %B SecoGIS 2013 - International Workshop on Semantic Aspects of GIS, Joint to ER conference 2013 %C Hong Kong %0 Conference Paper %B In D4D Challenge @ 3rd Conf. on the Analysis of Mobile Phone datasets (NetMob 2013) %D 2013 %T MP4-A Project: Mobility Planning For Africa %A Mirco Nanni %A Roberto Trasarti %A Barbara Furletti %A Lorenzo Gabrielli %A Peter Van Der Mede %A Joost De Bruijn %A Erik de Romph %A Gerard Bruil %X This project aims to create a tool that uses mobile phone transaction (trajectory) data that will be able to address transportation related challenges, thus allowing promotion and facilitation of sustainable urban mobility planning in Third World countries. The proposed tool is a transport demand model for Ivory Coast, with emphasis on its major urbanization Abidjan. The consortium will bring together available data from the internet, and integrate these with the mobility data obtained from the mobile phones in order to build the best possible transport model. A transport model allows an understanding of current and future infrastructure requirements in Ivory Coast. As such, this project will provide the first proof of concept. In this context, long-term analysis of individual call traces will be performed to reconstruct systematic movements, and to infer an origin-destination matrix. A similar process will be performed using the locations of caller and recipient of phone calls, enabling the comparison of socio-economic ties vs. mobility. 
The emerging links between different areas will be used to build an effective map to optimize regional border definitions and road infrastructure from a mobility perspective. Finally, we will try to build specialized origin-destination matrices for specific categories of population. Such categories will be inferred from data through analysis of calling behaviours, and will also be used to characterize the population of different cities. The project also includes a study of data compliance with distributions of standard measures observed in literature, including distribution of calls, call durations and call network features. %B In D4D Challenge @ 3rd Conf. on the Analysis of Mobile Phone datasets (NetMob 2013) %C Cambridge, USA %G eng %U http://perso.uclouvain.be/vincent.blondel/netmob/2013/D4D-book.pdf %0 Journal Article %J Journal of Statistical Physics %D 2013 %T Opinion dynamics with disagreement and modulated information %A Alina Sirbu %A Vittorio Loreto %A Vito D P Servedio %A Francesca Tria %B Journal of Statistical Physics %P 1–20 %G eng %U http://link.springer.com/article/10.1007/s10955-013-0724-x %R 10.1007/s10955-013-0724-x %0 Journal Article %J Spatio-Temporal Databases: Flexible Querying and Reasoning %D 2013 %T Spatio-Temporal Data %A Mirco Nanni %A Alessandra Raffaetà %A Chiara Renso %A Franco Turini %B Spatio-Temporal Databases: Flexible Querying and Reasoning %P 75 %G eng %0 Journal Article %J SIGMOD Record %D 2013 %T Towards mega-modeling: a walk through data analysis experiences %A Stefano Ceri %A Themis Palpanas %A Emanuele Della Valle %A Dino Pedreschi %A Johann-Christoph Freytag %A Roberto Trasarti %B SIGMOD Record %V 42 %P 19–27 %G eng %U http://doi.acm.org/10.1145/2536669.2536673 %R 10.1145/2536669.2536673 %0 Conference Paper %B Citizen in Sensor Networks - Second International Workshop, CitiSens 2013, Barcelona, Spain, September 19, 2013, Revised Selected Papers %D 2013 %T Transportation Planning Based on GSM Traces: A Case Study on 
Ivory Coast %A Mirco Nanni %A Roberto Trasarti %A Barbara Furletti %A Lorenzo Gabrielli %A Peter Van Der Mede %A Joost De Bruijn %A Erik de Romph %A Gerard Bruil %B Citizen in Sensor Networks - Second International Workshop, CitiSens 2013, Barcelona, Spain, September 19, 2013, Revised Selected Papers %G eng %U http://dx.doi.org/10.1007/978-3-319-04178-0_2 %R 10.1007/978-3-319-04178-0_2 %0 Conference Paper %B Cloud and Green Computing (CGC), 2013 Third International Conference on %D 2013 %T XTribe: a web-based social computation platform %A Saverio Caminiti %A Cicali, Claudio %A Pietro Gravino %A Vittorio Loreto %A Vito D P Servedio %A Alina Sirbu %A Francesca Tria %B Cloud and Green Computing (CGC), 2013 Third International Conference on %I IEEE %G eng %U http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6686061&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D6686061 %R 10.1109/CGC.2013.69 %0 Conference Paper %B Proceedings of the 3rd International Conference on Ambient Systems, Networks and Technologies (ANT 2012), the 9th International Conference on Mobile Web Information Systems (MobiWIS-2012), Niagara Falls, Ontario, Canada, August 27-29, 2012 %D 2012 %T An Agent-Based Model to Evaluate Carpooling at Large Manufacturing Plants %A Tom Bellemans %A Sebastian Bothe %A Sungjin Cho %A Fosca Giannotti %A Davy Janssens %A Luk Knapen %A Christine Körner %A Michael May %A Mirco Nanni %A Dino Pedreschi %A Hendrik Stange %A Roberto Trasarti %A Ansar-Ul-Haque Yasar %A Geert Wets %B Proceedings of the 3rd International Conference on Ambient Systems, Networks and Technologies (ANT 2012), the 9th International Conference on Mobile Web Information Systems (MobiWIS-2012), Niagara Falls, Ontario, Canada, August 27-29, 2012 %G eng %U http://dx.doi.org/10.1016/j.procs.2012.08.001 %R 10.1016/j.procs.2012.08.001 %0 Report %D 2012 %T Analisi di Mobilita' con dati eterogenei %A Barbara Furletti %A Roberto Trasarti %A Lorenzo Gabrielli %A S Rinzivillo %A 
Luca Pappalardo %A Fosca Giannotti %I ISTI - CNR %C Pisa %0 Conference Proceedings %B MDM 2012 %D 2012 %T ComeTogether: Discovering Communities of Places in Mobility Data %A Igo Brilhante %A Michele Berlingerio %A Roberto Trasarti %A Chiara Renso %A José Antônio Fernandes de Macêdo %A Marco A. Casanova %B MDM 2012 %P 268-273 %8 2012 %0 Conference Paper %B Twentieth Italian Symposium on Advanced Database Systems, SEBD 2012, Venice, Italy, June 24-27, 2012, Proceedings %D 2012 %T Individual Mobility Profiles: Methods and Application on Vehicle Sharing %A Roberto Trasarti %A Fabio Pinelli %A Mirco Nanni %A Fosca Giannotti %B Twentieth Italian Symposium on Advanced Database Systems, SEBD 2012, Venice, Italy, June 24-27, 2012, Proceedings %G eng %U http://sebd2012.dei.unipd.it/documents/188475/32d00b8a-8ead-4d97-923f-bd2f2cf6ddcb %0 Journal Article %J Intelligent Data Analysis %D 2012 %T Knowledge Discovery in Ontologies %A Barbara Furletti %A Franco Turini %B Intelligent Data Analysis %V 16 %U http://iospress.metapress.com/content/765h53w41286p578/fulltext.pdf %& 513 %R 10.3233/IDA-2012-0536 %0 Conference Paper %B Conceptual Modeling - 31st International Conference ER 2012, Florence, Italy, October 15-18, 2012. Proceedings %D 2012 %T Mega-modeling for Big Data Analytics %A Stefano Ceri %A Emanuele Della Valle %A Dino Pedreschi %A Roberto Trasarti %B Conceptual Modeling - 31st International Conference ER 2012, Florence, Italy, October 15-18, 2012. Proceedings %G eng %U http://dx.doi.org/10.1007/978-3-642-34002-4_1 %R 10.1007/978-3-642-34002-4_1 %0 Book Section %B Software and Data Technologies %D 2012 %T What else can be extracted from ontologies? 
Influence Rules %A Franco Turini %A Barbara Furletti %B Software and Data Technologies %S Communications in Computer and Information Science %I Springer %0 Journal Article %J Transactions on Data Privacy %D 2011 %T C-safety: a framework for the anonymization of semantic trajectories %A Anna Monreale %A Roberto Trasarti %A Dino Pedreschi %A Chiara Renso %A Vania Bogorny %X The increasing abundance of data about the trajectories of personal movement is opening new opportunities for analyzing and mining human mobility. However, new risks emerge since it opens new ways of intruding into personal privacy. Representing the personal movements as sequences of places visited by a person during her/his movements - semantic trajectory - poses great privacy threats. In this paper we propose a privacy model defining the attack model of semantic trajectory linking and a privacy notion, called c-safety based on a generalization of visited places based on a taxonomy. This method provides an upper bound to the probability of inferring that a given person, observed in a sequence of non-sensitive places, has also visited any sensitive location. Coherently with the privacy model, we propose an algorithm for transforming any dataset of semantic trajectories into a c-safe one. We report a study on two real-life GPS trajectory datasets to show how our algorithm preserves interesting quality/utility measures of the original trajectories, when mining semantic trajectories sequential pattern mining results. We also empirically measure how the probability that the attacker’s inference succeeds is much lower than the theoretical upper bound established. %B Transactions on Data Privacy %V 4 %P 73-101 %U http://dl.acm.org/citation.cfm?id=2019319&CFID=803961971&CFTOKEN=35994039 %0 Book %D 2011 %T Dinamiche di impoverimento. 
Meccanismi, traiettorie ed effetti in un contesto locale %A Tomei, Gabriele %A Michela Natilli %I Carocci Editore %G eng %0 Conference Paper %B International Conference on Software and Data Technologies (ICSOFT) %D 2011 %T Mining Influence Rules out of Ontologies %A Barbara Furletti %A Franco Turini %B International Conference on Software and Data Technologies (ICSOFT) %C Seville, Spain %8 2011 %0 Conference Paper %B KDD %D 2011 %T Mining mobility user profiles for car pooling %A Roberto Trasarti %A Fabio Pinelli %A Mirco Nanni %A Fosca Giannotti %B KDD %P 1190-1198 %0 Journal Article %J IJDWM %D 2011 %T A Query Language for Mobility Data Mining %A Roberto Trasarti %A Fosca Giannotti %A Mirco Nanni %A Dino Pedreschi %A Chiara Renso %B IJDWM %V 7 %P 24-45 %0 Journal Article %D 2011 %T Stiramenti identitari. Strategie di integrazione degli stranieri nella provincia di Massa Carrara tra appartenenza etnica ed esperienza transnazionale %A Tomei, Gabriele %A Paletti, F %A Michela Natilli %G eng %0 Conference Paper %B ECML/PKDD (3) %D 2011 %T Traffic Jams Detection Using Flock Mining %A Rebecca Ong %A Fabio Pinelli %A Roberto Trasarti %A Mirco Nanni %A Chiara Renso %A S Rinzivillo %A Fosca Giannotti %B ECML/PKDD (3) %P 650-653 %0 Journal Article %J VLDB J. %D 2011 %T Unveiling the complexity of human mobility by querying and mining massive trajectory data %A Fosca Giannotti %A Mirco Nanni %A Dino Pedreschi %A Fabio Pinelli %A Chiara Renso %A S Rinzivillo %A Roberto Trasarti %B VLDB J. 
%V 20 %P 695-719 %0 Conference Paper %B EDBT %D 2010 %T Advanced knowledge discovery on movement data with the GeoPKDD system %A Mirco Nanni %A Roberto Trasarti %A Chiara Renso %A Fosca Giannotti %A Dino Pedreschi %B EDBT %P 693-696 %0 Conference Paper %B ECML/PKDD (3) %D 2010 %T Exploring Real Mobility Data with M-Atlas %A Roberto Trasarti %A S Rinzivillo %A Fabio Pinelli %A Mirco Nanni %A Anna Monreale %A Chiara Renso %A Dino Pedreschi %A Fosca Giannotti %X Research on moving-object data analysis has been recently fostered by the widespread diffusion of new techniques and systems for monitoring, collecting and storing location aware data, generated by a wealth of technological infrastructures, such as GPS positioning and wireless networks. These have made available massive repositories of spatio-temporal data recording human mobile activities, that call for suitable analytical methods, capable of enabling the development of innovative, location-aware applications. 
%B ECML/PKDD (3) %P 624-627 %R 10.1007/978-3-642-15939-8_48 %0 Journal Article %J Quality Technology & Quantitative Management %D 2010 %T Improving the Business Plan Evaluation Process: the Role of Intangibles %A Barbara Furletti %A Franco Turini %A Andrea Bellandi %A Miriam Baglioni %A Chiara Pratesi %B Quality Technology & Quantitative Management %V 7 %8 2010 %U http://web.it.nctu.edu.tw/~qtqm/upcomingpapers/2010V7N1/2010V7N1_F3.pdf %& 35 %0 Conference Paper %B SEBD %D 2010 %T Location Prediction through Trajectory Pattern Mining (Extended Abstract) %A Anna Monreale %A Fabio Pinelli %A Roberto Trasarti %A Fosca Giannotti %B SEBD %P 134-141 %0 Conference Paper %B Computational Transportation Science %D 2010 %T Mobility data mining: discovering movement patterns from trajectory data %A Fosca Giannotti %A Mirco Nanni %A Dino Pedreschi %A Fabio Pinelli %A Chiara Renso %A S Rinzivillo %A Roberto Trasarti %B Computational Transportation Science %P 7-10 %0 Conference Paper %B SPRINGL %D 2010 %T Preserving privacy in semantic-rich trajectories of human mobility %A Anna Monreale %A Roberto Trasarti %A Chiara Renso %A Dino Pedreschi %A Vania Bogorny %X The increasing abundance of data about the trajectories of personal movement is opening up new opportunities for analyzing and mining human mobility, but new risks emerge since it opens new ways of intruding into personal privacy. Representing the personal movements as sequences of places visited by a person during her/his movements - semantic trajectory - poses even greater privacy threats w.r.t. raw geometric location data. In this paper we propose a privacy model defining the attack model of semantic trajectory linking, together with a privacy notion, called c-safety. This method provides an upper bound to the probability of inferring that a given person, observed in a sequence of nonsensitive places, has also stopped in any sensitive location. 
Coherently with the privacy model, we propose an algorithm for transforming any dataset of semantic trajectories into a c-safe one. We report a study on a real-life GPS trajectory dataset to show how our algorithm preserves interesting quality/utility measures of the original trajectories, such as sequential pattern mining results. %B SPRINGL %P 47-54 %R 10.1145/1868470.1868481 %0 Conference Paper %B SEBD %D 2010 %T Querying and mining trajectories with gaps: a multi-path reconstruction approach (Extended Abstract) %A Mirco Nanni %A Roberto Trasarti %B SEBD %P 126-133 %0 Journal Article %J Inf. Syst. %D 2009 %T A constraint-based querying system for exploratory pattern discovery %A Francesco Bonchi %A Fosca Giannotti %A Claudio Lucchese %A Salvatore Orlando %A Raffaele Perego %A Roberto Trasarti %B Inf. Syst. 
%V 34 %P 3-27 %0 Conference Paper %B DEXA Workshops %D 2009 %T DAMSEL: A System for Progressive Querying and Reasoning on Movement Data %A Roberto Trasarti %A Miriam Baglioni %A Chiara Renso %B DEXA Workshops %P 452-456 %0 Conference Paper %B EDBT %D 2009 %T Geographic privacy-aware knowledge discovery and delivery %A Fosca Giannotti %A Dino Pedreschi %A Yannis Theodoridis %B EDBT %P 1157-1158 %0 Conference Paper %B The European Future Technologies Conference (FET 2009) %D 2009 %T GeoPKDD – Geographic Privacy-aware Knowledge Discovery %A Fosca Giannotti %A Mirco Nanni %A Dino Pedreschi %A Chiara Renso %A S Rinzivillo %A Roberto Trasarti %B The European Future Technologies Conference (FET 2009) %0 Conference Paper %B ICAIL %D 2009 %T Integrating induction and deduction for finding evidence of discrimination %A Dino Pedreschi %A Salvatore Ruggieri %A Franco Turini %B ICAIL %P 157-166 %0 Conference Paper %B ICDM Workshops %D 2009 %T K-BestMatch Reconstruction and Comparison of Trajectory Data %A Mirco Nanni %A Roberto Trasarti %B ICDM Workshops %P 610-615 %0 Conference Paper %B SDM %D 2009 %T Measuring Discrimination in Socially-Sensitive Decision Records %A Dino Pedreschi %A Salvatore Ruggieri %A Franco Turini %B SDM %P 581-592 %0 Book Section %B Biomedical Data and Applications %D 2009 %T Mining Clinical, Immunological, and Genetic Data of Solid Organ Transplantation %A Michele Berlingerio %A Francesco Bonchi %A Michele Curcio %A Fosca Giannotti %A Franco Turini %B Biomedical Data and Applications %P 211-236 %0 Conference Paper %B CSE (4) %D 2009 %T Mining Mobility Behavior from Trajectory Data %A Fosca Giannotti %A Mirco Nanni %A Dino Pedreschi %A Chiara Renso %A Roberto Trasarti %B CSE (4) %P 948-951 %0 Conference Paper %B SEBD %D 2009 %T A new technique for sequential pattern mining under 
regular expressions %A Roberto Trasarti %A Francesco Bonchi %A Bart Goethals %B SEBD %P 325-332 %0 Conference Paper %B Global Recession: Regional Impacts on Housing, Jobs, Health and Wellbeing %D 2009 %T Poverty as a Social Condition: a Case Study on a Small Municipality in Tuscany %A Tomei, Gabriele %A Michela Natilli %B Global Recession: Regional Impacts on Housing, Jobs, Health and Wellbeing %I SEAFORD %G eng %0 Conference Paper %B AGILE Conf. %D 2009 %T Towards Semantic Interpretation of Movement Behavior %A Miriam Baglioni %A José Antônio Fernandes de Macêdo %A Chiara Renso %A Roberto Trasarti %A Monica Wachowicz %B AGILE Conf. %P 271-288 %0 Conference Proceedings %B 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining %D 2009 %T WhereNext: a Location Predictor on Trajectory Pattern Mining %A Anna Monreale %A Fabio Pinelli %A Roberto Trasarti %A Fosca Giannotti %X The pervasiveness of mobile devices and location based services is leading to an increasing volume of mobility data. This side effect provides the opportunity for innovative methods that analyse the behaviors of movements. In this paper we propose WhereNext, which is a method aimed at predicting with a certain level of accuracy the next location of a moving object. The prediction uses previously extracted movement patterns named Trajectory Patterns, which are a concise representation of behaviors of moving objects as sequences of regions frequently visited with a typical travel time. A decision tree, named T-pattern Tree, is built and evaluated with a formal training and test process. 
The tree is learned from the Trajectory Patterns that hold a certain area and it may be used as a predictor of the next location of a new trajectory finding the best matching path in the tree. Three different best matching methods to classify a new moving object are proposed and their impact on the quality of prediction is studied extensively. Using Trajectory Patterns as predictive rules has the following implications: (I) the learning depends on the movement of all available objects in a certain area instead of on the individual history of an object; (II) the prediction tree intrinsically contains the spatio-temporal properties that have emerged from the data and this allows us to define matching methods that strictly depend on the properties of such movements. In addition, we propose a set of other measures, that evaluate a priori the predictive power of a set of Trajectory Patterns. These measures were tuned on a real life case study. Finally, an exhaustive set of experiments and results on the real dataset are presented. %B 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining %R 10.1145/1557019.1557091 %0 Journal Article %J GeoInformatica %D 2008 %T An Application of Advanced Spatio-Temporal Formalisms to Behavioural Ecology %A Alessandra Raffaetà %A T. Ceccarelli %A D. Centeno %A Fosca Giannotti %A A. Massolo %A Christine Parent %A Chiara Renso %A Stefano Spaccapietra %A Franco Turini %B GeoInformatica %V 12 %P 37-72 %G eng %0 Journal Article %D 2008 %T An Application of Advanced Spatio-Temporal Formalisms to Behavioural Ecology %A T. Ceccarelli %A D. Centeno %A Fosca Giannotti %A A. Massolo %A Christine Parent %A Alessandra Raffaetà %A Chiara Renso %A Stefano Spaccapietra %A Franco Turini %G eng %0 Conference Paper %B SEBD %D 2008 %T DAEDALUS: A knowledge discovery analysis framework for movement data %A Riccardo Ortale %A E Ritacco %A N. 
Pelekis %A Roberto Trasarti %A Gianni Costa %A Fosca Giannotti %A Giuseppe Manco %A Chiara Renso %A Yannis Theodoridis %B SEBD %P 191-198 %G eng %0 Conference Paper %B GIS %D 2008 %T The DAEDALUS framework: progressive querying and mining of movement data %A Riccardo Ortale %A E Ritacco %A Nikos Pelekis %A Roberto Trasarti %A Gianni Costa %A Fosca Giannotti %A Giuseppe Manco %A Chiara Renso %A Yannis Theodoridis %B GIS %P 52 %0 Conference Paper %B KDD %D 2008 %T Discrimination-aware data mining %A Dino Pedreschi %A Salvatore Ruggieri %A Franco Turini %B KDD %P 560-568 %0 Book Section %B Mobility, Data Mining and Privacy %D 2008 %T Knowledge Discovery from Geographical Data %A S Rinzivillo %A Franco Turini %A Vania Bogorny %A Christine Körner %A Bart Kuijpers %A Michael May %B Mobility, Data Mining and Privacy %P 243-265 %0 Conference Proceedings %B First International Workshop on Computational Transportation Science %D 2008 %T Location prediction within the mobility data analysis environment Daedalus %A Fabio Pinelli %A Anna Monreale %A Roberto Trasarti %A Fosca Giannotti %X In this paper we propose a method to predict the next location of a moving object based on two recent results in GeoPKDD project: DAEDALUS, a mobility data analysis environment and Trajectory Pattern, a sequential pattern mining algorithm with temporal annotation integrated in DAEDALUS. The first one is a DMQL environment for moving objects, where both data and patterns can be represented. The second one extracts movement patterns as sequences of movements between locations with typical travel times. This paper proposes a prediction method which uses the local models extracted by Trajectory Pattern to build a global model called Prediction Tree. The future location of a moving object is predicted visiting the tree and calculating the best matching function. 
The integration within the DAEDALUS system supports an interactive construction of the predictor on the top of a set of spatio-temporal patterns. Other proposals in the literature base the definition of prediction methods for future location of a moving object on previously extracted frequent patterns. They use the recent history of movements of the object itself and often use time only to order the events. Our work uses the movements of all moving objects in a certain area to learn a classifier built on the mined trajectory patterns, which are intrinsically equipped with temporal information. %B First International Workshop on Computational Transportation Science %C Dublin, Ireland %R 10.4108/ICST.MOBIQUITOUS2008.3894 %0 Conference Paper %B PinKDD %D 2008 %T Mobility, Data Mining and Privacy the Experience of the GeoPKDD Project %A Fosca Giannotti %A Dino Pedreschi %A Franco Turini %B PinKDD %P 25-32 %0 Conference Paper %B EDOC %D 2008 %T Ontology-Based Business Plan Classification %A Miriam Baglioni %A Andrea Bellandi %A Barbara Furletti %A Laura Spinsanti %A Franco Turini %B EDOC %P 365-371 %0 Conference Paper %B Enterprise Distributed Object Computing Conference (EDOC) %D 2008 %T Ontology-Based Business Plan Classification %A Franco Turini %A Barbara Furletti %A Miriam Baglioni %A Laura Spinsanti %A Andrea Bellandi %B Enterprise Distributed Object Computing Conference (EDOC) %8 2008 %@ 978-0-7695-3373-5 %U http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4634789 %R http://dx.doi.org/10.1109/EDOC.2008.30 %0 Book Section %B Mobility, Data Mining and Privacy %D 2008 %T Privacy Protection: Regulations and Technologies, Opportunities and Threats %A Dino Pedreschi %A Francesco Bonchi %A Franco Turini %A Vassilios S. 
Verykios %A Maurizio Atzori %A Bradley Malin %A Bart Moelans %A Yücel Saygin %B Mobility, Data Mining and Privacy %P 101-119 %0 Journal Article %J Journal of Intelligent Information Systems %D 2007 %T Knowledge discovery from spatial transactions %A S Rinzivillo %A Franco Turini %B Journal of Intelligent Information Systems %V 28 %P 1-22 %0 Conference Paper %B BIBM %D 2007 %T Mining Clinical Data with a Temporal Dimension: A Case Study %A Michele Berlingerio %A Francesco Bonchi %A Fosca Giannotti %A Franco Turini %B BIBM %P 429-436 %G eng %0 Conference Paper %B ICDM Workshops %D 2007 %T Time-Annotated Sequences for Medical Data Mining %A Michele Berlingerio %A Francesco Bonchi %A Fosca Giannotti %A Franco Turini %B ICDM Workshops %P 133-138 %G eng %0 Conference Paper %B ICDE %D 2006 %T ConQueSt: a Constraint-based Querying System for Exploratory Pattern Discovery %A Francesco Bonchi %A Fosca Giannotti %A Claudio Lucchese %A Salvatore Orlando %A Raffaele Perego %A Roberto Trasarti %B ICDE %P 159 %G eng %0 Conference Paper %B Reasoning, Action and Interaction in AI Theories and Systems %D 2006 %T Examples of Integration of Induction and Deduction in Knowledge Discovery %A Franco Turini %A Miriam Baglioni %A Barbara Furletti %A S Rinzivillo %B Reasoning, Action and Interaction in AI Theories and Systems %P 307-326 %0 Book Section %B Reasoning, Action and Interaction in AI Theories and Systems %D 2006 %T Examples of Integration of Induction and Deduction in Knowledge Discovery %A Franco Turini %A Miriam Baglioni %A Barbara Furletti %A S Rinzivillo %B Reasoning, Action and Interaction in AI Theories and Systems %S LNAI %V 4155 %P 307-326 %U http://www.springerlink.com/content/m400v4507476n18g/fulltext.pdf %R 10.1007/11829263_17 %0 Conference Paper %B SEBD %D 2006 %T On Interactive Pattern Mining from Relational Databases %A Claudio Lucchese %A Francesco Bonchi %A Fosca Giannotti %A Salvatore Orlando %A Raffaele Perego %A Roberto Trasarti %B SEBD %P 329-338 %G eng %0 
Conference Paper %B KDID %D 2006 %T On Interactive Pattern Mining from Relational Databases %A Francesco Bonchi %A Fosca Giannotti %A Claudio Lucchese %A Salvatore Orlando %A Raffaele Perego %A Roberto Trasarti %B KDID %P 42-62 %G eng %0 Conference Paper %B IADIS International Conference Applied Computing 2006 %D 2006 %T A Tool for Economic Plans analysis based on expert knowledge and data mining techniques %A Miriam Baglioni %A Barbara Furletti %A Franco Turini %B IADIS International Conference Applied Computing 2006 %8 2006 %@ 972-8924-09-7 %U http://www.iadisportal.org/digital-library/mdownload/a-tool-for-economic-plans-analysis-based-on-expert-knowledge-and-data-mining-techniques %0 Conference Paper %B ACM Symposium on Applied Computing %D 2005 %T DrC4.5: Improving C4.5 by means of Prior Knowledge %A Miriam Baglioni %A Barbara Furletti %A Franco Turini %B ACM Symposium on Applied Computing %I ACM %C Santa Fe, New Mexico, USA %@ 1-58113-964-0 %U http://dl.acm.org/ft_gateway.cfm?id=1066787&ftid=311609&dwn=1&CFID=96873366&CFTOKEN=59233511 %R http://dx.doi.org/10.1145/1066677.1066787 %0 Conference Paper %B ACM GIS %D 2005 %T Extracting spatial association rules from spatial transactions %A S Rinzivillo %A Franco Turini %B ACM GIS %P 79-86 %0 Conference Paper %B PKDD %D 2004 %T Classification in Geographical Information Systems %A S Rinzivillo %A Franco Turini %B PKDD %P 374-385 %0 Conference Paper %B INAP/WLP %D 2004 %T Deductive and Inductive Reasoning on Spatio-Temporal Data %A Mirco Nanni %A Alessandra Raffaetà %A Chiara Renso %A Franco Turini %B INAP/WLP %P 98-115 %G eng %0 Conference Paper %B SEBD %D 2004 %T Deductive and Inductive Reasoning on Trajectories %A Mirco Nanni %A Alessandra Raffaetà %A Chiara Renso %A Franco Turini %B SEBD %P 98-105 %G eng %0 Journal Article %D 2004 %T Integrating Knowledge Representation and Reasoning in Geographical %A Paolo Mancarella %A Alessandra Raffaetà %A Chiara Renso %A Franco Turini %G eng %0 Journal Article %J 
International Journal of Geographical Information Science %D 2004 %T Integrating knowledge representation and reasoning in Geographical Information Systems %A Paolo Mancarella %A Alessandra Raffaetà %A Chiara Renso %A Franco Turini %B International Journal of Geographical Information Science %V 18 %P 417-447 %G eng %0 Journal Article %D 2004 %T A Declarative Framework for Reasoning on Spatio-temporal Data %A Mirco Nanni %A Alessandra Raffaetà %A Chiara Renso %A Franco Turini %G eng %0 Journal Article %J IEEE Trans. Knowl. Data Eng. %D 2004 %T Specifying Mining Algorithms with Iterative User-Defined Aggregates %A Fosca Giannotti %A Giuseppe Manco %A Franco Turini %B IEEE Trans. Knowl. Data Eng. %V 16 %P 1232-1246 %G eng %0 Conference Paper %B Database Support for Data Mining Applications %D 2004 %T Towards a Logic Query Language for Data Mining %A Fosca Giannotti %A Giuseppe Manco %A Franco Turini %B Database Support for Data Mining Applications %P 76-94 %G eng %0 Conference Paper %B AI*IA %D 2003 %T Qualitative Spatial Reasoning in a Logical Framework %A Alessandra Raffaetà %A Chiara Renso %A Franco Turini %B AI*IA %P 78-90 %G eng %0 Conference Paper %B ACM-GIS %D 2002 %T Enhancing GISs for spatio-temporal reasoning %A Alessandra Raffaetà %A Franco Turini %A Chiara Renso %B ACM-GIS %P 42-48 %G eng %0 Conference Paper %B SEBD %D 2002 %T Qualitative Reasoning in a Spatio-Temporal Language %A Alessandra Raffaetà %A Chiara Renso %A Franco Turini %B SEBD %P 105-118 %G eng %0 Conference Paper %B SEBD %D 2001 %T Complex Reasoning on Geographical Data %A Fosca Giannotti %A Alessandra Raffaetà %A Chiara Renso %A Franco Turini %B SEBD %P 331-338 %G eng %0 Conference Paper %B PKDD %D 2001 %T Specifying Mining Algorithms with Iterative User-Defined Aggregates: A Case Study %A Fosca 
Giannotti %A Giuseppe Manco %A Franco Turini %B PKDD %P 128-139 %G eng %0 Journal Article %J Journal of Logic Programming %D 2000 %T Using Medlan to Integrate Geographical Data %A Domenico Aquilino %A Patrizia Asirelli %A A Formuso %A Chiara Renso %A Franco Turini %B Journal of Logic Programming %P 3–14 %G eng %0 Journal Article %J J. Log. Program. %D 2000 %T Using MedLan to Integrate Geographical Data %A Domenico Aquilino %A Patrizia Asirelli %A A Formuso %A Chiara Renso %A Franco Turini %B J. Log. Program. %V 43 %P 3-14 %G eng %0 Conference Paper %B DEXA Workshop %D 1999 %T Beyond Current Technology: The Perspective of Three EC GIS Projects %A Fosca Giannotti %A Robert Jeansoulin %A Yannis Theodoridis %B DEXA Workshop %P 510 %G eng %0 Journal Article %J Computer Languages %D 1999 %T Dynamic Composition of Parameterised Logic Modules %A Antonio Brogi %A Chiara Renso %A Franco Turini %B Computer Languages %P 211–242 %G eng %0 Journal Article %J Comput. Lang. %D 1999 %T Dynamic composition of parameterised logic modules %A Antonio Brogi %A Chiara Renso %A Franco Turini %B Comput. Lang. 
%V 25 %P 211-242 %G eng %0 Conference Paper %B 1999 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery %D 1999 %T Experiences with a Logic-based knowledge discovery Support Environment %A Fosca Giannotti %A Giuseppe Manco %A Dino Pedreschi %A Franco Turini %B 1999 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery %G eng %0 Conference Paper %B AI*IA %D 1999 %T Experiences with a Logic-Based Knowledge Discovery Support Environment %A Fosca Giannotti %A Giuseppe Manco %A Dino Pedreschi %A Franco Turini %B AI*IA %P 202-213 %G eng %0 Conference Paper %B SEBD %D 1999 %T Integration of Deduction and Induction for Mining Supermarket Sales Data %A Fosca Giannotti %A Giuseppe Manco %A Mirco Nanni %A Dino Pedreschi %A Franco Turini %B SEBD %P 117-131 %G eng %0 Conference Paper %B IICIS %D 1998 %T The Constraint Operator of MedLan: Its Efficient Implementation and Use %A Patrizia Asirelli %A Chiara Renso %A Franco Turini %B IICIS %P 41-55 %G eng %0 Journal Article %J Annals of Mathematics and Artificial Intelligence %D 1997 %T Applying Restriction Constraint to Deductive Databases %A Domenico Aquilino %A Patrizia Asirelli %A Chiara Renso %A Franco Turini %B Annals of Mathematics and Artificial Intelligence %P 3–25 %G eng %0 Journal Article %J Ann. Math. Artif. Intell. %D 1997 %T Applying Restriction Constraints to Deductive Databases %A Domenico Aquilino %A Patrizia Asirelli %A Chiara Renso %A Franco Turini %B Ann. Math. Artif. Intell. 
%V 19 %P 3-25 %G eng %0 Conference Paper %B Logic in Databases %D 1996 %T Language Extensions for Semantic Integration of Deductive Databases %A Patrizia Asirelli %A Chiara Renso %A Franco Turini %B Logic in Databases %P 415-434 %G eng %0 Book Section %D 1996 %T Towards Declarative GIS Analysis %A Domenico Aquilino %A Chiara Renso %A Franco Turini %P 99–105 %G eng %0 Conference Paper %B ACM-GIS %D 1996 %T Towards Declarative GIS Analysis %A Domenico Aquilino %A Chiara Renso %A Franco Turini %B ACM-GIS %P 98-104 %G eng %0 Conference Paper %B FAPR %D 1996 %T Using Temporary Integrity Constraints to Optimize Databases %A Danilo Montesi %A Chiara Renso %A Franco Turini %B FAPR %P 430-435 %G eng %0 Journal Article %D 1995 %T An Operator for Composing Deductive Databases with Theories of Constraints %A Domenico Aquilino %A Patrizia Asirelli %A Chiara Renso %A Franco Turini %P 57–70 %G eng %0 Conference Paper %B LPNMR %D 1995 %T An Operator for Composing Deductive Databases with Theories of Constraints %A Domenico Aquilino %A Patrizia Asirelli %A Chiara Renso %A Franco Turini %B LPNMR %P 57-70 %G eng %0 Conference Paper %B GULP-PRODE (2) %D 1994 %T Amalgamating Language and Meta-language for Composing Logic Programs %A Antonio Brogi %A Chiara Renso %A Franco Turini %B GULP-PRODE (2) %P 408-422 %G eng %0 Journal Article %D 1994 %T Implementations of Program Composition Operations %A Antonio Brogi %A A. Chiarelli %A Paolo Mancarella %A V. Mazzotta %A Dino Pedreschi %A Chiara Renso %A Franco Turini %P 292–307 %G eng %0 Conference Paper %B PLILP %D 1994 %T Implementations of Program Composition Operations %A Antonio Brogi %A A. Chiarelli %A Paolo Mancarella %A V. 
Mazzotta %A Dino Pedreschi %A Chiara Renso %A Franco Turini %B PLILP %P 292-307 %G eng %0 Journal Article %J ACM Trans. Program. Lang. Syst. %D 1994 %T Modular Logic Programming %A Antonio Brogi %A Paolo Mancarella %A Dino Pedreschi %A Franco Turini %B ACM Trans. Program. Lang. Syst. %V 16 %P 1361-1398 %G eng %0 Conference Paper %B META %D 1992 %T Meta for Modularising Logic Programming %A Antonio Brogi %A Paolo Mancarella %A Dino Pedreschi %A Franco Turini %B META %P 105-119 %G eng %0 Book %B Types in Logic Programming %D 1992 %T The Type System of LML %A Bruno Bertolino %A Luigi Meo %A Dino Pedreschi %A Franco Turini %B Types in Logic Programming %P 313-332 %G eng %0 Conference Paper %B ICLP Workshop on Construction of Logic Programs %D 1991 %T Theory Construction in Computational Logic %A Antonio Brogi %A Paolo Mancarella %A Dino Pedreschi %A Franco Turini %B ICLP Workshop on Construction of Logic Programs %P 241-250 %G eng %0 Conference Paper %B NACLP %D 1990 %T Algebraic Properties of a Class of Logic Programs %A Paolo Mancarella %A Dino Pedreschi %A Marina Rondinelli %A Marco Tagliatti %B NACLP %P 23-39 %G eng %0 Conference Paper %B PLILP %D 1990 %T Logic Programming within a Functional Framework %A Antonio Brogi %A Paolo Mancarella %A Dino Pedreschi %A Franco Turini %B PLILP %P 372-386 %G eng %0 Journal Article %J J. Log. Program. %D 1990 %T A Transformational Approach to Negation in Logic Programming %A Roberto Barbuti %A Paolo Mancarella %A Dino Pedreschi %A Franco Turini %B J. Log. Program. 
%V 8 %P 201-228 %G eng %0 Conference Paper %B ECAI %D 1990 %T Universal Quantification by Case Analysis %A Antonio Brogi %A Paolo Mancarella %A Dino Pedreschi %A Franco Turini %B ECAI %P 111-116 %G eng %0 Conference Paper %B FGCS %D 1988 %T A Progress Report on the LML Project %A Bruno Bertolino %A Paolo Mancarella %A Luigi Meo %A Luca Nini %A Dino Pedreschi %A Franco Turini %B FGCS %P 675-684 %G eng %0 Conference Paper %B TAPSOFT, Vol.2 %D 1987 %T Intensional Negation of Logic Programs: Examples and Implementation Techniques %A Roberto Barbuti %A Paolo Mancarella %A Dino Pedreschi %A Franco Turini %B TAPSOFT, Vol.2 %P 96-110 %G eng %0 Journal Article %J Sci. Comput. Program. %D 1987 %T Symbolic Evaluation with Structural Recursive Symbolic Constants %A Fosca Giannotti %A Attilio Matteucci %A Dino Pedreschi %A Franco Turini %B Sci. Comput. Program. %V 9 %P 161-177 %G eng %0 Journal Article %J IEEE Trans. Software Eng. %D 1985 %T Symbolic Semantics and Program Reduction %A Vincenzo Ambriola %A Fosca Giannotti %A Dino Pedreschi %A Franco Turini %B IEEE Trans. Software Eng. %V 11 %P 784-794 %G eng