TY  - CONF
T1  - Exploiting Vehicular Data for Exposure-Aware Pedestrian Routing
T2  - 2025 26th IEEE International Conference on Mobile Data Management (MDM)
Y1  - 2025
A1  - Aliyev, Gurban
A1  - Nanni, Mirco
AB  - Vehicular traffic is one of the major sources of air pollution in urban settings, making it essential to clearly understand how much and where vehicle emissions impact residents. Recent approaches manage to yield pollution maps at the microscopic level by processing GPS trajectories of vehicles. That is achieved by applying mathematical models to estimate instantaneous emissions from GPS data, extending estimates to areas without data through missing data imputation, and further considering air dispersion factors. In this work, we leverage such inferred knowledge to implement an emission-aware pedestrian routing strategy and to study its impact on the reduction of exposure to vehicular pollutants and walking time. The study is realized through simulations of large masses of pedestrians over a medium-sized city in Italy, analyzing the interplay between the two factors - exposure versus walking time - in terms of time efficiency of paths and changes over existing habits both at a global and at an individual level. Experiments suggest that exposure-aware routing can yield a significant margin of improvement in health over most paths with minor effects on mobility, making it feasible and effective.
JF  - 2025 26th IEEE International Conference on Mobile Data Management (MDM)
ER  - 

TY  - CONF
T1  - From GPS Traces to Individual Emission Exposure: A Data-Driven Four-Step Process
T2  - Intelligent Transport Systems
Y1  - 2025
A1  - Aliyev, Gurban
A1  - Nanni, Mirco
JF  - Intelligent Transport Systems
PB  - Springer Nature Switzerland
CY  - Cham
SN  - 978-3-031-86370-7
ER  - 

TY  - CONF
T1  - Optimization of Exposure-Aware Routing for Vehicles and Pedestrians
T2  - Proceedings of the 19th International Symposium on Spatial and Temporal Data (SSTD '25)
Y1  - 2025
A1  - Aliyev, Gurban
AB  - Vehicular traffic is a significant source of air pollution in cities, exposing city residents and tourists to harmful emissions. To tackle this issue, our PhD research investigates how exposure to vehicular emissions in urban areas can be estimated and how this exposure can be mitigated through green routing. The work builds on three research questions: how to estimate car emissions and their dispersion; how to reduce pedestrian exposure through rerouting with a reasonable trade-off; and how to jointly simulate pedestrians and vehicles that dynamically adapt to each other. In line with these questions, we developed a pipeline to estimate vehicular emissions and dispersion, simulated exposure-aware pedestrian routing, and demonstrated its effectiveness in reducing exposure with minimal travel-time cost. Our most recent contribution is a joint simulation of pedestrians and vehicles, tested on real-world data, which achieved over 95% reduction in emissions along pedestrian-dense roads. We are planning to extend the framework by considering static resident populations in the co-evolutionary simulation. Additionally, a systematic comparison of the framework’s performance across diverse urban environments is necessary to assess its generalizability. Furthermore, we aim to investigate how variations in vehicle speed influence emission levels and, consequently, affect the outcomes of our exposure-aware routing simulations. Finally, we plan to analyze how emission-aware routing impacts fairness in exposure distribution across both non-moving residents and moving pedestrians.
JF  - Proceedings of the 19th International Symposium on Spatial and Temporal Data (SSTD '25)
PB  - Association for Computing Machinery
CY  - Osaka, Japan
UR  - https://doi.org/10.1145/3748777.3748808
ER  - 

TY  - JOUR
T1  - Vehicle-Pedestrian Optimization Framework for Exposure-Aware Routing
JF  - Mobile Networks and Applications
Y1  - 2025
A1  - Aliyev, Gurban
A1  - Nanni, Mirco
AB  - Vehicular traffic is a major source of air pollution in urban areas, exposing pedestrians and residents to harmful emissions. Recent works have proposed exposure-aware pedestrian routing strategies based on static emission maps. In this study, we extend this approach to a dynamic, multi-agent simulation framework involving both cars and pedestrians. Starting from the initial fastest-path routing, we simulate the co-evolution of vehicular emissions and pedestrian exposure over multiple steps, where pedestrian flows dynamically influence car emissions, and vice versa. Two routing strategies are explored: global weighting, where a shared trade-off between travel time and exposure is selected, and local weighting, where each trip independently chooses its optimal trade-off. Experiments on real-world urban data of a medium-sized city in Italy show that both strategies achieve significant reductions in pedestrian exposure, but differ in their impact on vehicle emissions and travel times. Global weighting yields more coordinated adaptation but at a higher systemic cost, while local weighting achieves more balanced outcomes with lower disruption. These results provide insights into designing urban routing policies that jointly optimize mobility efficiency and environmental sustainability.
UR  - https://link.springer.com/article/10.1007/s11036-025-02459-4
ER  - 

TY  - CONF
T1  - Enhancing Privacy and Utility in Federated Learning: A Hybrid P2P and Server-Based Approach with Differential Privacy Protection
T2  - Proceedings of the 21st International Conference on Security and Cryptography - SECRYPT
Y1  - 2024
A1  - Luca Corbucci
A1  - Anna Monreale
A1  - Roberto Pellungrini
JF  - Proceedings of the 21st International Conference on Security and Cryptography - SECRYPT
PB  - INSTICC
SN  - 978-989-758-709-2
ER  - 

TY  - CONF
T1  - An Overview of Recent Approaches to Enable Diversity in Large Language Models through Aligning with Human Perspectives
T2  - Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives) @ LREC-COLING 2024
Y1  - 2024
A1  - Muscato, Benedetta
A1  - Mala, Chandana Sree
A1  - Marchiori Manerba, Marta
A1  - Gezici, Gizem
A1  - Giannotti, Fosca
ED  - Abercrombie, Gavin
ED  - Basile, Valerio
ED  - Bernadi, Davide
ED  - Dudy, Shiran
ED  - Frenda, Simona
ED  - Havens, Lucy
ED  - Tonelli, Sara
AB  - The varied backgrounds and experiences of human annotators inject different opinions and potential biases into the data, inevitably leading to disagreements. Yet, traditional aggregation methods fail to capture individual judgments since they rely on the notion of a single ground truth. Our aim is to review prior contributions to pinpoint the shortcomings that might cause stereotypical content generation. As a preliminary study, our purpose is to investigate state-of-the-art approaches, primarily focusing on the following two research directions. First, we investigate how adding subjectivity aspects to LLMs might guarantee diversity. We then look into the alignment between humans and LLMs and discuss how to measure it. Considering existing gaps, our review explores possible methods to mitigate the perpetuation of biases targeting specific communities. However, we recognize the potential risk of disseminating sensitive information due to the utilization of socio-demographic data in the training process. These considerations underscore the inclusion of diverse perspectives while taking into account the critical importance of implementing robust safeguards to protect individuals' privacy and prevent the inadvertent propagation of sensitive information.
JF  - Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives) @ LREC-COLING 2024
PB  - ELRA and ICCL
CY  - Torino, Italia
UR  - https://aclanthology.org/2024.nlperspectives-1.5/
ER  - 

TY  - CONF
T1  - PUFFLE: Balancing Privacy, Utility, and Fairness in Federated Learning
Y1  - 2024
A1  - Luca Corbucci
A1  - Mikko A Heikkila
A1  - David Solans Noguero
A1  - Anna Monreale
A1  - Nicolas Kourtellis
UR  - https://arxiv.org/abs/2407.15224
ER  - 

TY  - CONF
T1  - Agnostic Label-Only Membership Inference Attack
T2  - 17th International Conference on Network and System Security
Y1  - 2023
A1  - Anna Monreale
A1  - Francesca Naretto
A1  - Simone Rizzo
JF  - 17th International Conference on Network and System Security
PB  - Springer
ER  - 

TY  - CONF
T1  - Analysis, Prediction and Mitigation of Exposure to Vehicular Air Pollution Based on Multi-Source Urban Data
T2  - SEBD 2023: 31st Symposium on Advanced Database Systems, Galzignano Terme (PD), Italy, July 2–5, 2023
Y1  - 2023
A1  - Aliyev, Gurban
AB  - An increasing amount of vehicular emissions in urban air pollution creates a health risk for urban residents. Meanwhile, calculation and analysis of vehicular pollution using GPS trajectories and microscopic models is getting more popular as this method proves to be more useful and reliable in comparison to other methods. However, GPS-trajectory-based estimations suffer from the lack of GPS data and absence of validation/calibration of estimated emission amounts. Another problem is in the assessment of pollution levels using GPS trajectories as previous studies only consider changes in total vehicular emissions and ignore air quality guideline levels. In this paper, the methodology and preliminary results of experiments conducted for imputation of missing emission data are reported. An existing graph convolutional network model which is designed to predict traffic flows is adopted to estimate vehicular emissions in Pisa. This approach is based on the assumption that the same model can predict traffic emissions, as traffic flow and the resulting emissions are correlated. At the end of the paper, there is a discussion of future research directions planned to be taken during my PhD period to address issues in the estimation, analysis and mitigation of exposure to vehicular emissions in cities.
JF  - SEBD 2023: 31st Symposium on Advanced Database Systems, Galzignano Terme (PD), Italy, July 2–5, 2023
UR  - https://ceur-ws.org/Vol-3478/paper13.pdf
ER  - 

TY  - JOUR
T1  - Attributed Stream Hypergraphs: temporal modeling of node-attributed high-order interactions
JF  - Applied Network Science
Y1  - 2023
A1  - Failla, Andrea
A1  - Citraro, Salvatore
A1  - Rossetti, Giulio
VL  - 8
ER  - 

TY  - CONF
T1  - AUC-based Selective Classification
T2  - International Conference on Artificial Intelligence and Statistics, 25-27 April 2023, Palau de Congressos, Valencia, Spain
Y1  - 2023
A1  - Andrea Pugnana
A1  - Salvatore Ruggieri
AB  - Selective classification (or classification with a reject option) pairs a classifier with a selection function to determine whether or not a prediction should be accepted. This framework trades off coverage (probability of accepting a prediction) with predictive performance, typically measured by distributive loss functions. In many application scenarios, such as credit scoring, performance is instead measured by ranking metrics, such as the Area Under the ROC Curve (AUC). We propose a model-agnostic approach to associate a selection function to a given probabilistic binary classifier. The approach is specifically targeted at optimizing the AUC. We provide both theoretical justifications and a novel algorithm, called AUCROSS, to achieve such a goal. Experiments show that our method succeeds in trading-off coverage for AUC, improving over existing selective classification methods targeted at optimizing accuracy.
JF  - International Conference on Artificial Intelligence and Statistics, 25-27 April 2023, Palau de Congressos, Valencia, Spain
PB  - PMLR
UR  - https://proceedings.mlr.press/v206/pugnana23a.html
ER  - 

TY  - CONF
T1  - Evaluating the Privacy Exposure of Interpretable Global and Local Explainers
T2  - Submitted at Journal of Artificial Intelligence and Law
Y1  - 2023
A1  - Francesca Naretto
A1  - Anna Monreale
A1  - Fosca Giannotti
JF  - Submitted at Journal of Artificial Intelligence and Law
ER  - 

TY  - CONF
T1  - EXPHLOT: EXplainable Privacy assessment for Human LOcation Trajectories
T2  - Discovery Science
Y1  - 2023
A1  - Francesca Naretto
A1  - Roberto Pellungrini
A1  - Daniele Fadda
A1  - Salvo Rinzivillo
JF  - Discovery Science
ER  - 

TY  - CONF
T1  - Explain and Interpret Few-Shot Learning
T2  - Joint Proceedings of the xAI-2023 Late-breaking Work, Demos and Doctoral Consortium co-located with the 1st World Conference on eXplainable Artificial Intelligence (xAI-2023), Lisbon, Portugal, July 26-28, 2023
Y1  - 2023
A1  - Andrea Fedele
AB  - Recent advancements in Artificial Intelligence have been fueled by vast datasets, powerful computing resources, and sophisticated algorithms. However, traditional Machine Learning models face limitations in handling scarce data. Few-Shot Learning (FSL) offers a promising solution by training models on a small number of examples per class. This manuscript introduces FXI-FSL, a framework for eXplainability and Interpretability in FSL, which aims to develop post-hoc explainability algorithms and interpretable-by-design alternatives. A noteworthy contribution is the SIamese Network EXplainer (SINEX), a post-hoc approach shedding light on Siamese Network behavior. The proposed framework seeks to unveil the rationale behind FSL models, instilling trust in their real-world applications. Moreover, it emerges as a safeguard for developers, facilitating model fine-tuning prior to deployment, and as a guide for end users navigating the decisions of these models.
JF  - Joint Proceedings of the xAI-2023 Late-breaking Work, Demos and Doctoral Consortium co-located with the 1st World Conference on eXplainable Artificial Intelligence (xAI-2023), Lisbon, Portugal, July 26-28, 2023
PB  - CEUR-WS.org
UR  - https://ceur-ws.org/Vol-3554/paper38.pdf
ER  - 

TY  - CONF
T1  - Explaining Black-Boxes in Federated Learning
T2  - Explainable Artificial Intelligence
Y1  - 2023
A1  - Corbucci, Luca
A1  - Guidotti, Riccardo
A1  - Monreale, Anna
AB  - Federated Learning has witnessed increasing popularity in the past few years for its ability to train Machine Learning models in critical contexts, using private data without moving them. Most of the work in the literature proposes algorithms and architectures for training neural networks, which achieve high performance in different prediction tasks and are easy to learn with a cooperative mechanism, but whose predictive reasoning is obscure. Therefore, in this paper, we propose a variant of SHAP, one of the most widely used explanation methods, tailored to Horizontal server-based Federated Learning. The basic idea is to explain an instance's prediction performed by the trained Machine Learning model as an aggregation of the explanations provided by the clients participating in the cooperation. We empirically test our proposal on two different tabular datasets, and we observe interesting and encouraging preliminary results.
JF  - Explainable Artificial Intelligence
PB  - Springer Nature Switzerland
CY  - Cham
SN  - 978-3-031-44067-0
ER  - 

TY  - JOUR
T1  - Fair Federated Learning methodology based on Multi-Objective Optimization
JF  - Submitted at JAIR
Y1  - 2023
A1  - Fontana, Michele
A1  - Naretto, Francesca
A1  - Monreale, Anna
A1  - Nanni, Mirco
A1  - Giannotti, Fosca
ER  - 

TY  - JOUR
T1  - Generative AI models should include detection mechanisms as a condition for public release
JF  - Ethics and Information Technology
Y1  - 2023
A1  - Knott, Alistair
A1  - Pedreschi, Dino
A1  - Chatila, Raja
A1  - Chakraborti, Tapabrata
A1  - Leavy, Susan
A1  - Baeza-Yates, Ricardo
A1  - Eyers, David
A1  - Trotman, Andrew
A1  - Teal, Paul D.
A1  - Biecek, Przemyslaw
A1  - Russell, Stuart
A1  - Bengio, Yoshua
AB  - The new wave of ‘foundation models’—general-purpose generative AI models, for production of text (e.g., ChatGPT) or images (e.g., MidJourney)—represent a dramatic advance in the state of the art for AI. But their use also introduces a range of new risks, which has prompted an ongoing conversation about possible regulatory mechanisms. Here we propose a specific principle that should be incorporated into legislation: that any organization developing a foundation model intended for public use must demonstrate a reliable detection mechanism for the content it generates, as a condition of its public release. The detection mechanism should be made publicly available in a tool that allows users to query, for an arbitrary item of content, whether the item was generated (wholly or partly) by the model. In this paper, we argue that this requirement is technically feasible and would play an important role in reducing certain risks from new AI models in many domains. We also outline a number of options for the tool’s design, and summarize a number of points where further input from policymakers and researchers would be required.
VL  - 25
UR  - https://link.springer.com/article/10.1007/s10676-023-09728-4
JO  - Ethics Inf Technol
ER  - 

TY  - JOUR
T1  - Mobility Constraints in Segregation Models
JF  - Scientific Reports
Y1  - 2023
A1  - Daniele Gambetta
A1  - Giovanni Mauro
A1  - Luca Pappalardo
AB  - Since the development of the original Schelling model of urban segregation, several enhancements have been proposed, but none have considered the impact of mobility constraints on model dynamics. Recent studies have shown that human mobility follows specific patterns, such as a preference for short distances and dense locations. This paper proposes a segregation model incorporating mobility constraints to make agents select their location based on distance and location relevance. Our findings indicate that the mobility-constrained model produces lower segregation levels but takes longer to converge than the original Schelling model. We identified a few persistently unhappy agents from the minority group who cause this prolonged convergence time and lower segregation level as they move around the grid centre. Our study presents a more realistic representation of how agents move in urban areas and provides a novel and insightful approach to analyzing the impact of mobility constraints on segregation models. We highlight the significance of incorporating mobility constraints when policymakers design interventions to address urban segregation.
VL  - 13
ER  - 

TY  - CONF
T1  - A Model-Agnostic Heuristics for Selective Classification
T2  - Thirty-Seventh AAAI Conference on Artificial Intelligence, AAAI 2023, 2023, Washington, DC, USA, February 7-14, 2023
Y1  - 2023
A1  - Andrea Pugnana
A1  - Salvatore Ruggieri
AB  - Selective classification (also known as classification with reject option) conservatively extends a classifier with a selection function to determine whether or not a prediction should be accepted (i.e., trusted, used, deployed). This is a highly relevant issue in socially sensitive tasks, such as credit scoring. State-of-the-art approaches rely on Deep Neural Networks (DNNs) that train at the same time both the classifier and the selection function. These approaches are model-specific and computationally expensive. We propose a model-agnostic approach, as it can work with any base probabilistic binary classification algorithm, and it can be scalable to large tabular datasets if the base classifier is so. The proposed algorithm, called SCROSS, exploits a cross-fitting strategy and theoretical results for quantile estimation to build the selection function. Experiments on real-world data show that SCROSS improves over existing methods.
JF  - Thirty-Seventh AAAI Conference on Artificial Intelligence, AAAI 2023, 2023, Washington, DC, USA, February 7-14, 2023
PB  - AAAI Press
UR  - https://doi.org/10.1609/aaai.v37i8.26133
ER  - 

TY  - CONF
T1  - Topics in Selective Classification
T2  - Thirty-Seventh AAAI Conference on Artificial Intelligence, AAAI 2023
Y1  - 2023
A1  - Andrea Pugnana
AB  - In recent decades, advancements in information technology allowed Artificial Intelligence (AI) systems to predict future outcomes with unprecedented success. This brought the widespread deployment of these methods in many fields, intending to support decision-making. A pressing question is how to make AI systems robust to common challenges in real-life scenarios and trustworthy. In my work, I plan to explore ways to enhance the trustworthiness of AI through the selective classification framework. In this setting, the AI system can refrain from predicting whenever it is not confident enough, allowing it to trade off coverage, i.e. the percentage of instances that receive a prediction, for performance.
JF  - Thirty-Seventh AAAI Conference on Artificial Intelligence, AAAI 2023
PB  - AAAI Press
UR  - https://doi.org/10.1609/aaai.v37i13.26925
ER  - 

TY  - CONF
T1  - Attribute-aware Community Events in Feature-rich Dynamic Networks
T2  - Complex Networks and Their Applications XI: Proceedings of The Eleventh International Conference on Complex Networks and Their Applications: COMPLEX NETWORKS 2022—Book of Abstracts
Y1  - 2022
A1  - Failla, Andrea
A1  - Mazzoni, Federico
A1  - Citraro, Salvatore
JF  - Complex Networks and Their Applications XI: Proceedings of The Eleventh International Conference on Complex Networks and Their Applications: COMPLEX NETWORKS 2022—Book of Abstracts
PB  - Springer
ER  - 

TY  - CONF
T1  - Attributed stream-hypernetwork analysis: homophilic behaviors in pairwise and group political discussions on reddit
T2  - International Conference on Complex Networks and Their Applications
Y1  - 2022
A1  - Failla, Andrea
A1  - Citraro, Salvatore
A1  - Rossetti, Giulio
JF  - International Conference on Complex Networks and Their Applications
PB  - Springer
ER  - 

TY  - CONF
T1  - Connected Vehicle Simulation Framework for Parking Occupancy Prediction (Demo Paper)
T2  - Proceedings of the 30th International Conference on Advances in Geographic Information Systems
Y1  - 2022
A1  - Resce, Pierpaolo
A1  - Vorwerk, Lukas
A1  - Han, Zhiwei
A1  - Cornacchia, Giuliano
A1  - Alamdari, Omid Isfahani
A1  - Nanni, Mirco
A1  - Pappalardo, Luca
A1  - Weimer, Daniel
A1  - Liu, Yuanting
AB  - This paper demonstrates a simulation framework that collects data about connected vehicles' locations and surroundings in a realistic traffic scenario. Our focus lies on the capability to detect parking spots and their occupancy status. We use this data to train machine learning models that predict parking occupancy levels of specific areas in the city center of San Francisco. By comparing their performance to a given ground truth, our results show that it is possible to use simulated connected vehicle data as a base for prototyping meaningful AI-based applications.
JF  - Proceedings of the 30th International Conference on Advances in Geographic Information Systems
PB  - Association for Computing Machinery
CY  - New York, NY, USA
SN  - 9781450395298
UR  - https://doi.org/10.1145/3557915.3560995
ER  - 

TY  - JOUR
T1  - Explaining Black Box with visual exploration of Latent Space
JF  - EuroVis–Short Papers
Y1  - 2022
A1  - Bodria, Francesco
A1  - Rinzivillo, Salvatore
A1  - Fadda, Daniele
A1  - Guidotti, Riccardo
A1  - Giannotti, Fosca
A1  - Pedreschi, Dino
AB  - Autoencoders are a powerful yet opaque feature reduction technique, on top of which we propose a novel way for the joint visual exploration of both latent and real space. By interactively exploiting the mapping between latent and real features, it is possible to unveil the meaning of latent features while providing deeper insight into the original variables. To achieve this goal, we exploit and re-adapt existing approaches from eXplainable Artificial Intelligence (XAI) to understand the relationships between the input and latent features. The uncovered relationships between input features and latent ones allow the user to understand the data structure concerning external variables such as the predictions of a classification model. We developed an interactive framework that visually explores the latent space and allows the user to understand the relationships of the input features with model prediction.
UR  - https://diglib.eg.org/xmlui/bitstream/handle/10.2312/evs20221098/085-089.pdf?sequence=1
ER  - 

TY  - CONF
T1  - Explaining Siamese Networks in Few-Shot Learning for Audio Data
T2  - Discovery Science - 25th International Conference, DS 2022, Montpellier, France, October 10-12, 2022, Proceedings
Y1  - 2022
A1  - Andrea Fedele
A1  - Riccardo Guidotti
A1  - Dino Pedreschi
AB  - Machine learning models are not able to generalize correctly when queried on samples belonging to class distributions that were never seen during training. This is a critical issue, since real world applications might need to quickly adapt without the necessity of re-training. To overcome these limitations, few-shot learning frameworks have been proposed and their applicability has been studied widely for computer vision tasks. Siamese Networks learn pairs similarity in form of a metric that can be easily extended on new unseen classes. Unfortunately, the downside of such systems is the lack of explainability. We propose a method to explain the outcomes of Siamese Networks in the context of few-shot learning for audio data. This objective is pursued through a local perturbation-based approach that evaluates segments-weighted-average contributions to the final outcome considering the interplay between different areas of the audio spectrogram. Qualitative and quantitative results demonstrate that our method is able to show common intra-class characteristics and erroneous reliance on silent sections.
JF  - Discovery Science - 25th International Conference, DS 2022, Montpellier, France, October 10-12, 2022, Proceedings
PB  - Springer
UR  - https://doi.org/10.1007/978-3-031-18840-4_36
ER  - 

TY  - CONF
T1  - From Mean-Field to Complex Topologies: Network Effects on the Algorithmic Bias Model
T2  - Complex Networks & Their Applications X
Y1  - 2022
A1  - Valentina Pansanella
A1  - Giulio Rossetti
A1  - Letizia Milli
JF  - Complex Networks & Their Applications X
ER  - 

TY  - JOUR
T1  - Generating mobility networks with generative adversarial networks
JF  - EPJ data science
Y1  - 2022
A1  - Mauro, Giovanni
A1  - Luca, Massimiliano
A1  - Longa, Antonio
A1  - Lepri, Bruno
A1  - Pappalardo, Luca
AB  - The increasingly crucial role of human displacements in complex societal phenomena, such as traffic congestion, segregation, and the diffusion of epidemics, is attracting the interest of scientists from several disciplines. In this article, we address mobility network generation, i.e., generating a city’s entire mobility network, a weighted directed graph in which nodes are geographic locations and weighted edges represent people’s movements between those locations, thus describing the entire mobility set flows within a city. Our solution is MoGAN, a model based on Generative Adversarial Networks (GANs) to generate realistic mobility networks. We conduct extensive experiments on public datasets of bike and taxi rides to show that MoGAN outperforms the classical Gravity and Radiation models regarding the realism of the generated networks. Our model can be used for data augmentation and performing simulations and what-if analysis.
VL  - 11
ER  - 

TY  - GEN
T1  - GET-Viz: a library for automatic generation of visual dashboard for geographical time series
T2  - 8th International Conference on Computational Social Science (IC2S2)
Y1  - 2022
A1  - Fadda, Daniele
A1  - Natilli, Michela
A1  - Rinzivillo, S
JF  - 8th International Conference on Computational Social Science (IC2S2)
CY  - Chicago, USA
ER  - 

TY  - CONF
T1  - How Routing Strategies Impact Urban Emissions
T2  - Proceedings of the 30th International Conference on Advances in Geographic Information Systems
Y1  - 2022
A1  - Cornacchia, Giuliano
A1  - Böhm, Matteo
A1  - Mauro, Giovanni
A1  - Nanni, Mirco
A1  - Pedreschi, Dino
A1  - Pappalardo, Luca
AB  - Navigation apps use routing algorithms to suggest the best path to reach a user's desired destination. Although undoubtedly useful, navigation apps' impact on the urban environment (e.g., CO2 emissions and pollution) is still largely unclear. In this work, we design a simulation framework to assess the impact of routing algorithms on carbon dioxide emissions within an urban environment. Using APIs from TomTom and OpenStreetMap, we find that settings in which either all vehicles or none of them follow a navigation app's suggestion lead to the worst impact in terms of CO2 emissions. In contrast, when just a portion (around half) of vehicles follow these suggestions, and some degree of randomness is added to the remaining vehicles' paths, we observe a reduction in the overall CO2 emissions over the road network. Our work is a first step towards designing next-generation routing principles that may increase urban well-being while satisfying individual needs.
JF  - Proceedings of the 30th International Conference on Advances in Geographic Information Systems
PB  - Association for Computing Machinery
CY  - New York, NY, USA
SN  - 9781450395298
UR  - https://doi.org/10.1145/3557915.3560977
ER  - 

TY  - JOUR
T1  - The long-tail effect of the COVID-19 lockdown on Italians’ quality of life, sleep and physical activity
JF  - Scientific Data
Y1  - 2022
A1  - Natilli, Michela
A1  - Rossi, Alessio
A1  - Trecroci, Athos
A1  - Cavaggioni, Luca
A1  - Merati, Giampiero
A1  - Formenti, Damiano
AB  - From March 2020 to May 2021, several lockdown periods caused by the COVID-19 pandemic have limited people’s usual activities and mobility in Italy, as well as around the world. These unprecedented confinement measures dramatically modified citizens’ daily lifestyles and behaviours. However, with the advent of summer 2021 and thanks to the vaccination campaign that significantly prevents serious illness and death, and reduces the risk of contagion, all the Italian regions finally returned to regular behaviours and routines. Still, it is unclear whether there is a long-tail effect on people’s quality of life, sleep- and physical activity-related behaviours. Thanks to the dataset described in this paper, it will be possible to obtain accurate insights into the changes induced by the lockdown period in Italians’ health, which will make it possible to provide practical suggestions to local, regional, and state institutions and companies to improve infrastructures and services that could be beneficial to Italians’ well-being.
VL  - 9
UR  - https://www.nature.com/articles/s41597-022-01376-5
ER  - 

TY  - JOUR
T1  - Methods and tools for causal discovery and causal inference
JF  - WIREs Data Mining Knowl. Discov.
Y1  - 2022
A1  - Ana Rita Nogueira
A1  - Andrea Pugnana
A1  - Salvatore Ruggieri
A1  - Dino Pedreschi
A1  - João Gama
AB  - Causality is a complex concept, which roots its developments across several fields, such as statistics, economics, epidemiology, computer science, and philosophy. In recent years, the study of causal relationships has become a crucial part of the Artificial Intelligence community, as causality can be a key tool for overcoming some limitations of correlation-based Machine Learning systems. Causality research can generally be divided into two main branches, that is, causal discovery and causal inference. The former focuses on obtaining causal knowledge directly from observational data. The latter aims to estimate the impact deriving from a change of a certain variable over an outcome of interest. This article aims at covering several methodologies that have been developed for both tasks. This survey does not only focus on theoretical aspects, but also provides a practical toolkit for interested researchers and practitioners, including software, datasets, and running examples.
VL  - 12
ER  - 

TY  - CONF
T1  - Monitoring Fairness in HOLDA
T2  - HHAI 2022: Augmenting Human Intellect - Proceedings of the First International Conference on Hybrid Human-Artificial Intelligence, Amsterdam, The Netherlands, 13-17 June 2022
Y1  - 2022
A1  - Michele Fontana
A1  - Francesca Naretto
A1  - Anna Monreale
A1  - Fosca Giannotti
ED  - Stefan Schlobach
ED  - María Pérez-Ortiz
ED  - Myrthe Tielman
JF  - HHAI 2022: Augmenting Human Intellect - Proceedings of the First International Conference on Hybrid Human-Artificial Intelligence, Amsterdam, The Netherlands, 13-17 June 2022
PB  - IOS Press
UR  - https://doi.org/10.3233/FAIA220205
ER  - 

TY  - GEN
T1  - Semantic Enrichment of XAI Explanations for Healthcare
T2  - 24th International Conference on Artificial Intelligence
Y1  - 2022
A1  - Luca Corbucci
A1  - Anna Monreale
A1  - Cecilia Panigutti
A1  - Michela Natilli
A1  - Smiraglio, Simona
A1  - Dino Pedreschi
AB  - Explaining black-box models' decisions is crucial to increase doctors' trust in AI-based clinical decision support systems. However, current eXplainable Artificial Intelligence (XAI) techniques usually provide explanations that are not easily understandable by experts outside of AI. Furthermore, most of them produce explanations that consider only the input features of the algorithm. However, broader information about the clinical context of a patient is usually available even if not processed by the AI-based clinical decision support system for its decision. Enriching the explanations with relevant clinical information concerning the health status of a patient would increase the ability of human experts to assess the reliability of the AI decision. Therefore, in this paper we present a methodology that aims to enable clinical reasoning by semantically enriching AI explanations. Starting from a medical AI explanation based only on the input features provided to the algorithm, our methodology leverages medical ontologies and NLP embedding techniques to link relevant information present in the patient's clinical notes to the original explanation. We validate our methodology with two experiments involving a human expert. Our results highlight promising performance in correctly identifying relevant information about the diseases of the patients, in particular about the associated morphology. This suggests that the presented methodology could be a first step toward developing a natural language explanation of AI decision support systems.
JF  - 24th International Conference on Artificial Intelligence
ER  - 

TY  - GEN
T1  - SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics.
T2  - 30th Italian Symposium on Advanced Database Systems (SEBD – Sistemi Evoluti per Basi di Dati)
Y1  - 2022
A1  - Trasarti, Roberto
A1  - Grossi, Valerio
A1  - Michela Natilli
A1  - Rapisarda, Beatrice
AB  - SoBigData RI has the ambition to support the rising demand for cross-disciplinary research and innovation on the multiple aspects of social complexity from combined data and model-driven perspectives and the increasing importance of ethics and data scientists’ responsibility as pillars of trustworthy use of Big Data and analytical technology. Digital traces of human activities offer a considerable opportunity to scrutinize the ground truth of individual and collective behaviour at an unprecedented detail and on a global scale. This increasing wealth of data is a chance to understand social complexity, provided we can rely on social mining, i.e., adequate means for accessing big social data and models for extracting knowledge from them. SoBigData RI, with its tools and services, empowers researchers and innovators through a platform for the design and execution of large-scale social mining experiments, open to users with diverse backgrounds, accessible on the cloud (aligned with EOSC), and also exploiting supercomputing facilities. Pushing the FAIR (Findable, Accessible, Interoperable, Reusable) and FACT (Fair, Accountable, Confidential, and Transparent) principles will render social mining experiments more efficiently designed, adjusted, and repeatable by domain experts that are not data scientists. SoBigData RI moves forward from the simple awareness of ethical and legal challenges in social mining to the development of concrete tools that operationalize ethics with value-sensitive design, incorporating values and norms for privacy protection, fairness, transparency, and pluralism. SoBigData RI is the result of two H2020 grants (g.a. n.654024 and 871042), and it is part of the ESFRI 2021 Roadmap.
JF  - 30th Italian Symposium on Advanced Database Systems (SEBD – Sistemi Evoluti per Basi di Dati)
CY  - Tirrenia, Pisa
ER  - 

TY  - JOUR
T1  - Stable and actionable explanations of black-box models through factual and counterfactual rules
JF  - Data Mining and Knowledge Discovery
Y1  - 2022
A1  - Guidotti, Riccardo
A1  - Monreale, Anna
A1  - Ruggieri, Salvatore
A1  - Naretto, Francesca
A1  - Turini, Franco
A1  - Pedreschi, Dino
A1  - Giannotti, Fosca
ER  - 

TY  - JOUR
T1  - Benchmarking and Survey of Explanation Methods for Black Box Models
JF  - CoRR
Y1  - 2021
A1  - Francesco Bodria
A1  - Fosca Giannotti
A1  - Riccardo Guidotti
A1  - Francesca Naretto
A1  - Dino Pedreschi
A1  - S Rinzivillo
VL  - abs/2102.13076
UR  - https://arxiv.org/abs/2102.13076
ER  - 

TY  - JOUR
T1  - Cognitive network science quantifies feelings expressed in suicide letters and Reddit mental health communities
JF  - arXiv preprint arXiv:2110.15269
Y1  - 2021
A1  - Joseph, Simmi Marina
A1  - Salvatore Citraro
A1  - Morini, Virginia
A1  - Giulio Rossetti
A1  - Stella, Massimo
ER  - 

TY  - JOUR
T1  - Conformity: a Path-Aware Homophily measure for Node-Attributed Networks
JF  - IEEE Intelligent Systems
Y1  - 2021
A1  - Giulio Rossetti
A1  - Salvatore Citraro
A1  - Letizia Milli
AB  - Unveiling the homophilic/heterophilic behaviors that characterize the wiring patterns of complex networks is an important task in social network analysis, often approached by studying the assortative mixing of node attributes. Recent works underlined that a global measure to quantify node homophily necessarily provides a partial, often deceiving, picture of the reality. Moving from such literature, in this work, we propose a novel measure, namely Conformity, designed to overcome this limitation by providing a node-centric quantification of assortative mixing patterns. Differently from the measures proposed so far, Conformity is designed to be path-aware, thus allowing for a more detailed evaluation of the impact that nodes at different degrees of separation have on the homophilic embeddedness of a target. Experimental analysis on synthetic and real data allowed us to observe that Conformity can unveil valuable insights from node-attributed graphs.
SN  - 1941-1294
UR  - https://ieeexplore.ieee.org/document/9321348
JO  - IEEE Intelligent Systems
ER  - 

TY  - JOUR
T1  - Estimating the Total Volume of Queries to a Search Engine
JF  - IEEE Transactions on Knowledge and Data Engineering
Y1  - 2021
A1  - F. Lillo
A1  - Salvatore Ruggieri
AB  - We study the problem of estimating the total number of searches (volume) of queries in a specific domain, which were submitted to a search engine in a given time period. Our statistical model assumes that the distribution of searches follows a Zipf's law, and that the observed sample volumes are biased according to three possible scenarios. These assumptions are consistent with empirical data, with keyword research practices, and with approximate algorithms used to take counts of query frequencies. A few estimators of the parameters of the distribution are devised and tested, based on the nature of the empirical/simulated data. We apply the methods on the domain of recipes and cooking queries searched in Italian in 2017. The observed volumes of sample queries are collected from Google Trends (continuous data) and SearchVolume (binned data). The estimated total number of queries and total volume are computed for the two cases, and the results are compared and discussed.
UR  - https://ieeexplore.ieee.org/abstract/document/9336245
ER  - 

TY  - CONF
T1  - Explainable for Trustworthy AI
T2  - Human-Centered Artificial Intelligence - Advanced Lectures, 18th European Advanced Course on AI, ACAI 2021, Berlin, Germany, October 11-15, 2021, extended and improved lecture notes
Y1  - 2021
A1  - Fosca Giannotti
A1  - Francesca Naretto
A1  - Francesco Bodria
JF  - Human-Centered Artificial Intelligence - Advanced Lectures, 18th European Advanced Course on AI, ACAI 2021, Berlin, Germany, October 11-15, 2021, extended and improved lecture notes
PB  - Springer
UR  - https://doi.org/10.1007/978-3-031-24349-3_10
ER  - 

TY  - JOUR
T1  - Explaining the difference between men’s and women’s football
JF  - PLOS ONE
Y1  - 2021
A1  - Luca Pappalardo
A1  - Alessio Rossi
A1  - Michela Natilli
A1  - Paolo Cintia
ED  - Constantinou, Anthony C.
AB  - Women’s football is gaining supporters and practitioners worldwide, raising questions about what the differences are with men’s football. While the two sports are often compared based on the players’ physical attributes, we analyze the spatio-temporal events during matches in the last World Cups to compare male and female teams based on their technical performance. We train an artificial intelligence model to recognize if a team is male or female based on variables that describe a match’s playing intensity, accuracy, and performance quality. Our model accurately distinguishes between men’s and women’s football, revealing crucial technical differences, which we investigate through the extraction of explanations from the classifier’s decisions. The differences between men’s and women’s football are rooted in play accuracy, the recovery time of ball possession, and the players’ performance quality. Our methodology may help journalists and fans understand what makes women’s football a distinct sport and coaches design tactics tailored to female teams.
VL  - 16
UR  - https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0255407
JO  - PLoS ONE
ER  - 

TY  - JOUR
T1  - Give more data, awareness and control to individual citizens, and they will help COVID-19 containment
Y1  - 2021
A1  - Mirco Nanni
A1  - Andrienko, Gennady
A1  - Barabasi, Albert-Laszlo
A1  - Boldrini, Chiara
A1  - Bonchi, Francesco
A1  - Cattuto, Ciro
A1  - Chiaromonte, Francesca
A1  - Comandé, Giovanni
A1  - Conti, Marco
A1  - Coté, Mark
A1  - Dignum, Frank
A1  - Dignum, Virginia
A1  - Domingo-Ferrer, Josep
A1  - Ferragina, Paolo
A1  - Fosca Giannotti
A1  - Riccardo Guidotti
A1  - Helbing, Dirk
A1  - Kaski, Kimmo
A1  - Kertész, János
A1  - Lehmann, Sune
A1  - Lepri, Bruno
A1  - Lukowicz, Paul
A1  - Matwin, Stan
A1  - Jiménez, David Megías
A1  - Anna Monreale
A1  - Morik, Katharina
A1  - Oliver, Nuria
A1  - Passarella, Andrea
A1  - Passerini, Andrea
A1  - Dino Pedreschi
A1  - Pentland, Alex
A1  - Pianesi, Fabio
A1  - Francesca Pratesi
A1  - S Rinzivillo
A1  - Salvatore Ruggieri
A1  - Siebes, Arno
A1  - Torra, Vicenc
A1  - Roberto Trasarti
A1  - Hoven, Jeroen van den
A1  - Vespignani, Alessandro
AB  - The rapid dynamics of COVID-19 calls for quick and effective tracking of virus transmission chains and early detection of outbreaks, especially in the “phase 2” of the pandemic, when lockdown and other restriction measures are progressively withdrawn, in order to avoid or minimize contagion resurgence. For this purpose, contact-tracing apps are being proposed for large scale adoption by many countries. A centralized approach, where data sensed by the app are all sent to a nation-wide server, raises concerns about citizens’ privacy and needlessly strong digital surveillance, thus alerting us to the need to minimize personal data collection and avoiding location tracking. We advocate the conceptual advantage of a decentralized approach, where both contact and location data are collected exclusively in individual citizens’ “personal data stores”, to be shared separately and selectively (e.g., with a backend system, but possibly also with other citizens), voluntarily, only when the citizen has tested positive for COVID-19, and with a privacy preserving level of granularity. This approach better protects the personal sphere of citizens and affords multiple benefits: it allows for detailed information gathering for infected people in a privacy-preserving fashion; and, in turn this enables both contact tracing, and, the early detection of outbreak hotspots on more finely-granulated geographic scale. The decentralized approach is also scalable to large populations, in that only the data of positive patients need be handled at a central level. Our recommendation is two-fold. First to extend existing decentralized architectures with a light touch, in order to manage the collection of location data locally on the device, and allow the user to share spatio-temporal aggregates—if and when they want and for specific aims—with health authorities, for instance. Second, we favour a longer-term pursuit of realizing a Personal Data Store vision, giving users the opportunity to contribute to collective good in the measure they want, enhancing self-awareness, and cultivating collective efforts for rebuilding society.
SN  - 1572-8439
UR  - https://link.springer.com/article/10.1007/s10676-020-09572-w
JO  - Ethics and Information Technology
ER  - 

TY  - JOUR
T1  - GLocalX - From Local to Global Explanations of Black Box AI Models
Y1  - 2021
A1  - Mattia Setzu
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - Franco Turini
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Artificial Intelligence (AI) has come to prominence as one of the major components of our society, with applications in most aspects of our lives. In this field, complex and highly nonlinear machine learning models such as ensemble models, deep neural networks, and Support Vector Machines have consistently shown remarkable accuracy in solving complex tasks. Although accurate, AI models often are “black boxes” which we are not able to understand. Relying on these models has a multifaceted impact and raises significant concerns about their transparency. Applications in sensitive and critical domains are a strong motivational factor in trying to understand the behavior of black boxes. We propose to address this issue by providing an interpretable layer on top of black box models by aggregating “local” explanations. We present GLocalX, a “local-first” model agnostic explanation method. Starting from local explanations expressed in form of local decision rules, GLocalX iteratively generalizes them into global explanations by hierarchically aggregating them. Our goal is to learn accurate yet simple interpretable models to emulate the given black box, and, if possible, replace it entirely. We validate GLocalX in a set of experiments in standard and constrained settings with limited or no access to either data or local explanations. Experiments show that GLocalX is able to accurately emulate several models with simple and small models, reaching state-of-the-art performance against natively global solutions. Our findings show how it is often possible to achieve a high level of both accuracy and comprehensibility of classification models, even in complex domains with high-dimensional data, without necessarily trading one property for the other. This is a key requirement for a trustworthy AI, necessary for adoption in high-stakes decision making applications.
VL  - 294
SN  - 0004-3702
UR  - https://www.sciencedirect.com/science/article/pii/S0004370221000084
JO  - Artificial Intelligence
ER  - 

TY  - JOUR
T1  - Introduction to the special issue on social mining and big data ecosystem for open, responsible data science
Y1  - 2021
A1  - Luca Pappalardo
A1  - Grossi, Valerio
A1  - Dino Pedreschi
SN  - 2364-4168
UR  - https://doi.org/10.1007/s41060-021-00253-5
JO  - International Journal of Data Science and Analytics
ER  - 

TY  - JOUR
T1  - A Mechanistic Data-Driven Approach to Synthesize Human Mobility Considering the Spatial, Temporal, and Social Dimensions Together
JF  - ISPRS International Journal of Geo-Information
Y1  - 2021
A1  - Cornacchia, Giuliano
A1  - Luca Pappalardo
AB  - Modelling human mobility is crucial in several areas, from urban planning to epidemic modelling, traffic forecasting, and what-if analysis. Existing generative models focus mainly on reproducing the spatial and temporal dimensions of human mobility, while the social aspect, though it influences human movements significantly, is often neglected. Those models that capture some social perspectives of human mobility utilize trivial and unrealistic spatial and temporal mechanisms. In this paper, we propose the Spatial, Temporal and Social Exploration and Preferential Return model (STS-EPR), which embeds mechanisms to capture the spatial, temporal, and social aspects together. We compare the trajectories produced by STS-EPR with respect to real-world trajectories and synthetic trajectories generated by two state-of-the-art generative models on a set of standard mobility measures. Our experiments conducted on an open dataset show that STS-EPR, overall, outperforms existing spatial-temporal or social models demonstrating the importance of modelling adequately the sociality to capture precisely all the other dimensions of human mobility. We further investigate the impact of the tile shape of the spatial tessellation on the performance of our model. STS-EPR, which is open-source and tested on open data, represents a step towards the design of a mechanistic data-driven model that captures all the aspects of human mobility comprehensively.
VL  - 10
UR  - https://www.mdpi.com/2220-9964/10/9/599
ER  - 

TY  - CONF
T1  - A new approach for cross-silo federated learning and its privacy risks
T2  - 2021 18th International Conference on Privacy, Security and Trust (PST)
Y1  - 2021
A1  - Fontana, Michele
A1  - Naretto, Francesca
A1  - Monreale, Anna
JF  - 2021 18th International Conference on Privacy, Security and Trust (PST)
ER  - 

TY  - CONF
T1  - A new approach for cross-silo federated learning and its privacy risks
T2  - 18th International Conference on Privacy, Security and Trust, PST 2021, Auckland, New Zealand, December 13-15, 2021
Y1  - 2021
A1  - Michele Fontana
A1  - Francesca Naretto
A1  - Anna Monreale
JF  - 18th International Conference on Privacy, Security and Trust, PST 2021, Auckland, New Zealand, December 13-15, 2021
PB  - IEEE
UR  - https://doi.org/10.1109/PST52912.2021.9647753
ER  - 

TY  - CONF
T1  - Privacy Risk Assessment of Individual Psychometric Profiles
T2  - Discovery Science - 24th International Conference, DS 2021, Halifax, NS, Canada, October 11-13, 2021, Proceedings
Y1  - 2021
A1  - Giacomo Mariani
A1  - Anna Monreale
A1  - Francesca Naretto
ED  - Carlos Soares
ED  - Luís Torgo
JF  - Discovery Science - 24th International Conference, DS 2021, Halifax, NS, Canada, October 11-13, 2021, Proceedings
PB  - Springer
UR  - https://doi.org/10.1007/978-3-030-88942-5_32
ER  - 

TY  - JOUR
T1  - STS-EPR: Modelling individual mobility considering the spatial, temporal, and social dimensions together
Y1  - 2021
A1  - Cornacchia, Giuliano
A1  - Luca Pappalardo
ER  - 

TY  - JOUR
T1  - Toward a Standard Approach for Echo Chamber Detection: Reddit Case Study
JF  - Applied Sciences
Y1  - 2021
A1  - Morini, Virginia
A1  - Pollacci, Laura
A1  - Giulio Rossetti
VL  - 11
ER  - 

TY  - JOUR
T1  - Understanding eating choices among university students: A study using data from cafeteria cashiers’ transactions
JF  - Health Policy
Y1  - 2021
A1  - Lorenzoni, Valentina
A1  - Triulzi, Isotta
A1  - Martinucci, Irene
A1  - Toncelli, Letizia
A1  - Michela Natilli
A1  - Barale, Roberto
A1  - Turchetti, Giuseppe
VL  - 125
ER  - 

TY  - CONF
T1  - Analysis and Visualization of Performance Indicators in University Admission Tests
T2  - Formal Methods. FM 2019 International Workshops
Y1  - 2020
A1  - Michela Natilli
A1  - Daniele Fadda
A1  - S Rinzivillo
A1  - Dino Pedreschi
A1  - Licari, Federica
ED  - Sekerinski, Emil
ED  - Moreira, Nelma
ED  - Oliveira, José N.
ED  - Ratiu, Daniel
ED  - Riccardo Guidotti
ED  - Farrell, Marie
ED  - Luckcuck, Matt
ED  - Marmsoler, Diego
ED  - Campos, José
ED  - Astarte, Troy
ED  - Gonnord, Laure
ED  - Cerone, Antonio
ED  - Couto, Luis
ED  - Dongol, Brijesh
ED  - Kutrib, Martin
ED  - Monteiro, Pedro
ED  - Delmas, David
AB  - This paper presents an analytical platform for performance evaluation and anomaly detection of tests for admission to public universities in Italy. Each test is personalized for each student and is composed of a series of questions, classified on different domains (e.g. maths, science, logic, etc.). Since each test is unique in its composition, it is crucial to guarantee a similar level of difficulty for all the tests in a session. For this reason, a domain expert assigns a level of difficulty to each question. Thus, the overall difficulty of a test depends on the correct classification of each item. We propose two approaches to detect outliers: a visualization-based approach using dynamic filters and responsive visual widgets, and a data mining approach to evaluate the performance of the different questions over five years. We used clustering to group the questions according to a set of performance indicators to provide a data-driven labeling of the level of difficulty. The measured level is compared with the level assigned a priori by experts. The misclassifications are then highlighted to the expert, who will be able to refine the question or the classification. Sequential pattern mining is used to check if biases are present in the composition of the tests and their performance. This analysis is meant to exclude overlaps or direct dependencies among questions. By analyzing co-occurrences we are able to state that the composition of each test is fair and uniform for all the students, even across several sessions. The analytical results are presented to the expert through a visual web application that loads the analytical data and indicators and composes an interactive dashboard. The user may explore the patterns and models extracted by filtering and changing thresholds and analytical parameters.
JF  - Formal Methods. FM 2019 International Workshops
PB  - Springer International Publishing
CY  - Cham
SN  - 978-3-030-54994-7
UR  - https://link.springer.com/chapter/10.1007/978-3-030-54994-7_14
ER  - 

TY  - JOUR
T1  - ANGEL: efficient, and effective, node-centric community discovery in static and dynamic networks
JF  - Applied Network Science
Y1  - 2020
A1  - Giulio Rossetti
AB  - Community discovery is one of the most challenging tasks in social network analysis. During the last decades, several algorithms have been proposed with the aim of identifying communities in complex networks, each one searching for mesoscale topologies having different and peculiar characteristics. Among such vast literature, an interesting family of Community Discovery algorithms, designed for the analysis of social network data, is represented by overlapping, node-centric approaches. In this work, following such line of research, we propose Angel, an algorithm that aims to lower the computational complexity of previous solutions while ensuring the identification of high-quality overlapping partitions. We compare Angel, both on synthetic and real-world datasets, against state of the art community discovery algorithms designed for the same community definition. Our experiments underline the effectiveness and efficiency of the proposed methodology, confirmed by its ability to constantly outperform the identified competitors.
VL  - 5
UR  - https://link.springer.com/article/10.1007/s41109-020-00270-6
ER  - 

TY  - RPRT
T1  - Artificial Intelligence (AI): new developments and innovations applied to e-commerce
Y1  - 2020
A1  - Dino Pedreschi
A1  - Ioanna Miliou
AB  - This in-depth analysis discusses the opportunities and challenges brought by the recent and the foreseeable developments of Artificial Intelligence into online platforms and marketplaces. The paper advocates the importance of supporting trustworthy, explainable AI (in order to fight discrimination and manipulation, and empower citizens), and societal-aware AI (in order to fight polarisation, monopolistic concentration and excessive inequality, and pursue diversity and openness). This document was provided by the Policy Department for Economic, Scientific and Quality of Life Policies at the request of the committee on the Internal Market and Consumer Protection (IMCO).
PB  - European Parliament's committee on the Internal Market and Consumer Protection
UR  - https://www.europarl.europa.eu/thinktank/en/document.html?reference=IPOL_IDA(2020)648791
ER  - 

TY  - JOUR
T1  - Authenticated Outlier Mining for Outsourced Databases
JF  - IEEE Transactions on Dependable and Secure Computing
Y1  - 2020
A1  - Dong, Boxiang
A1  - Wang, Hui
A1  - Anna Monreale
A1  - Dino Pedreschi
A1  - Fosca Giannotti
A1  - Guo, Wenge
AB  - The Data-Mining-as-a-Service (DMaS) paradigm is becoming the focus of research, as it allows the data owner (client) who lacks expertise and/or computational resources to outsource their data and mining needs to a third-party service provider (server). Outsourcing, however, raises some issues about result integrity: how could the client verify the mining results returned by the server are both sound and complete? In this paper, we focus on outlier mining, an important mining task. Previous verification techniques use an authenticated data structure (ADS) for correctness authentication, which may incur much space and communication cost. In this paper, we propose a novel solution that returns a probabilistic result integrity guarantee with much cheaper verification cost. The key idea is to insert a set of artificial records (ARs) into the dataset, from which it constructs a set of artificial outliers (AOs) and artificial non-outliers (ANOs). The AOs and ANOs are used by the client to detect any incomplete and/or incorrect mining results with a probabilistic guarantee. The main challenge that we address is how to construct ARs so that they do not change the (non-)outlierness of original records, while guaranteeing that the client can identify ANOs and AOs without executing mining. Furthermore, we build a strategic game and show that a Nash equilibrium exists only when the server returns correct outliers. Our implementation and experiments demonstrate that our verification solution is efficient and lightweight.
VL  - 17
UR  - https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=8858
UR  - https://ieeexplore.ieee.org/document/8048342/
UR  - http://xplorestaging.ieee.org/ielx7/8858/9034462/08048342.pdf?arnumber=8048342
UR  - https://ieeexplore.ieee.org/ielam/8858/9034462/8048342-aam.pdf
JO  - IEEE Trans. Dependable and Secure Comput.
ER  - 

TY  - JOUR
T1  - Bias in data-driven artificial intelligence systems—An introductory survey
JF  - Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Y1  - 2020
A1  - Ntoutsi, Eirini
A1  - Fafalios, Pavlos
A1  - Gadiraju, Ujwal
A1  - Iosifidis, Vasileios
A1  - Nejdl, Wolfgang
A1  - Vidal, Maria-Esther
A1  - Salvatore Ruggieri
A1  - Franco Turini
A1  - Papadopoulos, Symeon
A1  - Krasanakis, Emmanouil
A1  - others
AB  - Artificial Intelligence (AI)‐based systems are widely employed nowadays to make decisions that have far‐reaching impact on individuals and society. Their decisions might affect everyone, everywhere, and anytime, entailing concerns about potential human rights issues. Therefore, it is necessary to move beyond traditional AI algorithms optimized for predictive performance and embed ethical and legal principles in their design, training, and deployment to ensure social good while still benefiting from the huge potential of the AI technology. The goal of this survey is to provide a broad multidisciplinary overview of the area of bias in AI systems, focusing on technical challenges and solutions as well as to suggest new research directions towards approaches well‐grounded in a legal frame. In this survey, we focus on data‐driven AI, as a large part of AI is powered nowadays by (big) data and powerful machine learning algorithms. If otherwise not specified, we use the general term bias to describe problems related to the gathering or processing of data that might result in prejudiced decisions on the basis of demographic features such as race, sex, and so forth.
VL  - 10
UR  - https://onlinelibrary.wiley.com/doi/full/10.1002/widm.1356
ER  - 

TY  - CONF
T1  - Black Box Explanation by Learning Image Exemplars in the Latent Feature Space
T2  - Machine Learning and Knowledge Discovery in Databases
Y1  - 2020
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - Matwin, Stan
A1  - Dino Pedreschi
ED  - Brefeld, Ulf
ED  - Fromont, Elisa
ED  - Hotho, Andreas
ED  - Knobbe, Arno
ED  - Maathuis, Marloes
ED  - Robardet, Céline
AB  - We present an approach to explain the decisions of black box models for image classification. While using the black box to label images, our explanation method exploits the latent feature space learned through an adversarial autoencoder. The proposed method first generates exemplar images in the latent feature space and learns a decision tree classifier. Then, it selects and decodes exemplars respecting local decision rules. Finally, it visualizes them in a manner that shows to the user how the exemplars can be modified to either stay within their class, or to become counter-factuals by “morphing” into another class. Since we focus on black box decision systems for image classification, the explanation obtained from the exemplars also provides a saliency map highlighting the areas of the image that contribute to its classification, and areas of the image that push it into another class. We present the results of an experimental evaluation on three datasets and two black box models. Besides providing the most useful and interpretable explanations, we show that the proposed method outperforms existing explainers in terms of fidelity, relevance, coherence, and stability.
JF  - Machine Learning and Knowledge Discovery in Databases
PB  - Springer International Publishing
CY  - Cham
SN  - 978-3-030-46150-8
UR  - https://link.springer.com/chapter/10.1007/978-3-030-46150-8_12
ER  - 

TY  - CONF
T1  - Capturing Political Polarization of Reddit Submissions in the Trump Era
T2  - SEBD
Y1  - 2020
A1  - Giulio Rossetti
A1  - Morini, Virginia
A1  - Pollacci, Laura
JF  - SEBD
ER  - 

TY  - JOUR
T1  - Causal inference for social discrimination reasoning
Y1  - 2020
A1  - Qureshi, Bilal
A1  - Kamiran, Faisal
A1  - Karim, Asim
A1  - Salvatore Ruggieri
A1  - Dino Pedreschi
AB  - The discovery of discriminatory bias in human or automated decision making is a task of increasing importance and difficulty, exacerbated by the pervasive use of machine learning and data mining. Currently, discrimination discovery largely relies upon correlation analysis of decision records, disregarding the impact of confounding biases. We present a method for causal discrimination discovery based on propensity score analysis, a statistical tool for filtering out the effect of confounding variables. We introduce causal measures of discrimination which quantify the effect of group membership on the decisions, and highlight causal discrimination/favoritism patterns by learning regression trees over the novel measures. We validate our approach on two real world datasets. Our proposed framework for causal discrimination has the potential to enhance the transparency of machine learning with tools for detecting discriminatory bias both in the training data and in the learning algorithms.
VL  - 54
SN  - 1573-7675
UR  - https://link.springer.com/article/10.1007/s10844-019-00580-x
JO  - Journal of Intelligent Information Systems
ER  - 

TY  - JOUR
T1  - Conformity: A Path-Aware Homophily Measure for Node-Attributed Networks
JF  - arXiv preprint arXiv:2012.05195
Y1  - 2020
A1  - Giulio Rossetti
A1  - Salvatore Citraro
A1  - Letizia Milli
ER  - 

TY  - CONF
T1  - Digital Footprints of International Migration on Twitter
T2  - International Symposium on Intelligent Data Analysis
Y1  - 2020
A1  - Jisu Kim
A1  - Alina Sirbu
A1  - Fosca Giannotti
A1  - Lorenzo Gabrielli
AB  - Studying migration using traditional data has some limitations. To date, there have been several studies proposing innovative methodologies to measure migration stocks and flows from social big data. Nevertheless, a uniform definition of a migrant is difficult to find as it varies from one work to another depending on the purpose of the study and nature of the dataset used. In this work, a generic methodology is developed to identify migrants within the Twitter population. This describes a migrant as a person who has the current residence different from the nationality. The residence is defined as the location where a user spends most of his/her time in a certain year. The nationality is inferred from linguistic and social connections to a migrant’s country of origin. This methodology is validated first with an internal gold standard dataset and second with two official statistics, and shows strong performance scores and correlation coefficients. Our method has the advantage that it can identify both immigrants and emigrants, regardless of the origin/destination countries. The new methodology can be used to study various aspects of migration, including opinions, integration, attachment, stocks and flows, motivations for migration, etc. Here, we exemplify how trending topics across and throughout different migrant communities can be observed.
JF  - International Symposium on Intelligent Data Analysis
PB  - Springer
UR  - https://link.springer.com/chapter/10.1007/978-3-030-44584-3_22
ER  - 

TY  - CONF
T1  - Doctor XAI: an ontology-based approach to black-box sequential data classification explanations
T2  - Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency
Y1  - 2020
A1  - Cecilia Panigutti
A1  - Perotti, Alan
A1  - Dino Pedreschi
AB  - Several recent advancements in Machine Learning involve black-box models: algorithms that do not provide human-understandable explanations in support of their decisions. This limitation hampers the fairness, accountability and transparency of these models; the field of eXplainable Artificial Intelligence (XAI) tries to solve this problem providing human-understandable explanations for black-box models. However, healthcare datasets (and the related learning tasks) often present peculiar features, such as sequential data, multi-label predictions, and links to structured background knowledge. In this paper, we introduce Doctor XAI, a model-agnostic explainability technique able to deal with multi-labeled, sequential, ontology-linked data. We focus on explaining Doctor AI, a multilabel classifier which takes as input the clinical history of a patient in order to predict the next visit. Furthermore, we show how exploiting the temporal dimension in the data and the domain knowledge encoded in the medical ontology improves the quality of the mined explanations.
JF  - Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency
UR  - https://dl.acm.org/doi/pdf/10.1145/3351095.3372855?download=true
ER  - 

TY  - JOUR
T1  - Error Estimation of Ultra-Short Heart Rate Variability Parameters: Effect of Missing Data Caused by Motion Artifacts
JF  - Sensors
Y1  - 2020
A1  - Alessio Rossi
A1  - Dino Pedreschi
A1  - Clifton, David A.
A1  - Morelli, Davide
AB  - Application of ultra-short Heart Rate Variability (HRV) is desirable in order to increase the applicability of HRV features to wrist-worn wearable devices equipped with heart rate sensors that are nowadays becoming more and more popular in people’s daily life. This study is focused in particular on the two most used HRV parameters, i.e., the standard deviation of inter-beat intervals (SDNN) and the root Mean Squared error of successive inter-beat intervals differences (rMSSD). The huge problem of extracting these HRV parameters from wrist-worn devices is that their data are affected by the motion artifacts. For this reason, estimating the error caused by this huge quantity of missing values is fundamental to obtain reliable HRV parameters from these devices. To this aim, we simulate missing values induced by motion artifacts (from 0 to 70%) in an ultra-short time window (i.e., from 4 min to 30 s) by the random walk Gilbert burst model in 22 young healthy subjects. In addition, 30 s and 2 min ultra-short time windows are required to estimate rMSSD and SDNN, respectively. Moreover, due to the fact that ultra-short time window does not permit assessing very low frequencies, and the SDNN is highly affected by these frequencies, the bias for estimating SDNN continues to increase as the time window length decreases. On the contrary, a small error is detected in rMSSD up to 30 s due to the fact that it is highly affected by high frequencies which are possible to be evaluated even if the time window length decreases. Finally, the missing values have a small effect on rMSSD and SDNN estimation. As a matter of fact, the HRV parameter errors increase slightly as the percentage of missing values increases.
VL  - 20
UR  - https://www.mdpi.com/1424-8220/20/24/7122
ER  - 

TY  - CONF
T1  - Estimating countries’ peace index through the lens of the world news as monitored by GDELT
T2  - 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA)
Y1  - 2020
A1  - V. Voukelatou
A1  - Luca Pappalardo
A1  - Lorenzo Gabrielli
A1  - Fosca Giannotti
AB  - Peacefulness is a principal dimension of well-being, and its measurement has lately drawn the attention of researchers and policy-makers. During the last years, novel digital data streams have drastically changed research in this field. In the current study, we exploit information extracted from Global Data on Events, Location, and Tone (GDELT) digital news database, to capture peacefulness through the Global Peace Index (GPI). Applying machine learning techniques, we demonstrate that news media attention, sentiment, and social stability from GDELT can be used as proxies for measuring GPI at a monthly level. Additionally, through the variable importance analysis, we show that each country's socio-economic, political, and military profile emerges. This could bring added value to researchers interested in "Data Science for Social Good", to policy-makers, and peacekeeping organizations since they could monitor peacefulness almost real-time, and therefore facilitate timely and more efficient policy-making.
JF  - 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA)
UR  - https://ieeexplore.ieee.org/abstract/document/9260052
ER  - 

TY  - JOUR
T1  - An ethico-legal framework for social data science
Y1  - 2020
A1  - Forgó, Nikolaus
A1  - Hänold, Stefanie
A1  - van den Hoven, Jeroen
A1  - Krügel, Tina
A1  - Lishchuk, Iryna
A1  - Mahieu, René
A1  - Anna Monreale
A1  - Dino Pedreschi
A1  - Francesca Pratesi
A1  - van Putten, David
AB  - This paper presents a framework for research infrastructures enabling ethically sensitive and legally compliant data science in Europe. Our goal is to describe how to design and implement an open platform for big data social science, including, in particular, personal data. To this end, we discuss a number of infrastructural, organizational and methodological principles to be developed for a concrete implementation. These include not only systematically tools and methodologies that effectively enable both the empirical evaluation of the privacy risk and data transformations by using privacy-preserving approaches, but also the development of training materials (a massive open online course) and organizational instruments based on legal and ethical principles. This paper provides, by way of example, the implementation that was adopted within the context of the SoBigData Research Infrastructure.
SN  - 2364-4168
UR  - https://link.springer.com/article/10.1007/s41060-020-00211-7
JO  - International Journal of Data Science and Analytics
ER  - 

TY  - JOUR
T1  - Evaluating community detection algorithms for progressively evolving graphs
JF  - arXiv preprint arXiv:2007.08635
Y1  - 2020
A1  - Cazabet, Rémy
A1  - Boudebza, Souaad
A1  - Giulio Rossetti
ER  - 

TY  - CONF
T1  - Explainability Methods for Natural Language Processing: Applications to Sentiment Analysis
T2  - SEBD
Y1  - 2020
A1  - Francesco Bodria
A1  - Panisson, André
A1  - Perotti, Alan
A1  - Piaggesi, Simone
JF  - SEBD
ER  - 

TY  - CONF
T1  - Explaining Sentiment Classification with Synthetic Exemplars and Counter-Exemplars
T2  - Discovery Science
Y1  - 2020
A1  - Lampridis, Orestis
A1  - Riccardo Guidotti
A1  - Salvatore Ruggieri
ED  - Appice, Annalisa
ED  - Tsoumakas, Grigorios
ED  - Manolopoulos, Yannis
ED  - Matwin, Stan
AB  - We present xspells, a model-agnostic local approach for explaining the decisions of a black box model for sentiment classification of short texts. The explanations provided consist of a set of exemplar sentences and a set of counter-exemplar sentences. The former are examples classified by the black box with the same label as the text to explain. The latter are examples classified with a different label (a form of counter-factuals). Both are close in meaning to the text to explain, and both are meaningful sentences – albeit they are synthetically generated. xspells generates neighbors of the text to explain in a latent space using Variational Autoencoders for encoding text and decoding latent instances. A decision tree is learned from randomly generated neighbors, and used to drive the selection of the exemplars and counter-exemplars. We report experiments on two datasets showing that xspells outperforms the well-known lime method in terms of quality of explanations, fidelity, and usefulness, and that is comparable to it in terms of stability.
JF  - Discovery Science
PB  - Springer International Publishing
CY  - Cham
SN  - 978-3-030-61527-7
UR  - https://link.springer.com/chapter/10.1007/978-3-030-61527-7_24
ER  - 

TY  - CONF
T1  - Global Explanations with Local Scoring
T2  - Machine Learning and Knowledge Discovery in Databases
Y1  - 2020
A1  - Mattia Setzu
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - Franco Turini
ED  - Cellier, Peggy
ED  - Driessens, Kurt
AB  - Artificial Intelligence systems often adopt machine learning models encoding complex algorithms with potentially unknown behavior. As the application of these “black box” models grows, it is our responsibility to understand their inner working and formulate them in human-understandable explanations. To this end, we propose a rule-based model-agnostic explanation method that follows a local-to-global schema: it generalizes a global explanation summarizing the decision logic of a black box starting from the local explanations of single predicted instances. We define a scoring system based on a rule relevance score to extract global explanations from a set of local explanations in the form of decision rules. Experiments on several datasets and black boxes show the stability, and low complexity of the global explanations provided by the proposed solution in comparison with baselines and state-of-the-art global explainers.
JF  - Machine Learning and Knowledge Discovery in Databases
PB  - Springer International Publishing
CY  - Cham
SN  - 978-3-030-43823-4
UR  - https://link.springer.com/chapter/10.1007%2F978-3-030-43823-4_14
ER  - 

TY  - JOUR
T1  - Human migration: the big data perspective
JF  - International Journal of Data Science and Analytics
Y1  - 2020
A1  - Alina Sirbu
A1  - Andrienko, Gennady
A1  - Andrienko, Natalia
A1  - Boldrini, Chiara
A1  - Conti, Marco
A1  - Fosca Giannotti
A1  - Riccardo Guidotti
A1  - Bertoli, Simone
A1  - Jisu Kim
A1  - Muntean, Cristina Ioana
A1  - Luca Pappalardo
A1  - Passarella, Andrea
A1  - Dino Pedreschi
A1  - Pollacci, Laura
A1  - Francesca Pratesi
A1  - Sharma, Rajesh
AB  - How can big data help to understand the migration phenomenon? In this paper, we try to answer this question through an analysis of various phases of migration, comparing traditional and novel data sources and models at each phase. We concentrate on three phases of migration, at each phase describing the state of the art and recent developments and ideas. The first phase includes the journey, and we study migration flows and stocks, providing examples where big data can have an impact. The second phase discusses the stay, i.e. migrant integration in the destination country. We explore various data sets and models that can be used to quantify and understand migrant integration, with the final aim of providing the basis for the construction of a novel multi-level integration index. The last phase is related to the effects of migration on the source countries and the return of migrants.
SN  - 2364-4168
UR  - https://link.springer.com/article/10.1007%2Fs41060-020-00213-5
JO  - International Journal of Data Science and Analytics
ER  - 

TY  - JOUR
T1  - Identifying and exploiting homogeneous communities in labeled networks
JF  - Applied Network Science
Y1  - 2020
A1  - Salvatore Citraro
A1  - Giulio Rossetti
AB  - Attribute-aware community discovery aims to find well-connected communities that are also homogeneous w.r.t. the labels carried by the nodes. In this work, we address such a challenging task presenting EVA, an algorithmic approach designed to maximize a quality function tailoring both structural and homophilic clustering criteria. We evaluate EVA on several real-world labeled networks carrying both nominal and ordinal information, and we compare our approach to other classic and attribute-aware algorithms. Our results suggest that EVA is the only method, among the compared ones, able to discover homogeneous clusters without considerably degrading partition modularity. We also investigate two well-defined applicative scenarios to characterize better EVA: (i) the clustering of a mental lexicon, i.e., a linguistic network modeling human semantic memory, and (ii) the node label prediction task, namely the problem of inferring the missing label of a node.
VL  - 5
UR  - https://appliednetsci.springeropen.com/articles/10.1007/s41109-020-00302-1
ER  - 

TY  - CONF
T1  - “Know Thyself” How Personal Music Tastes Shape the Last.Fm Online Social Network
T2  - Formal Methods. FM 2019 International Workshops
Y1  - 2020
A1  - Riccardo Guidotti
A1  - Giulio Rossetti
ED  - Sekerinski, Emil
ED  - Moreira, Nelma
ED  - Oliveira, José N.
ED  - Ratiu, Daniel
ED  - Riccardo Guidotti
ED  - Farrell, Marie
ED  - Luckcuck, Matt
ED  - Marmsoler, Diego
ED  - Campos, José
ED  - Astarte, Troy
ED  - Gonnord, Laure
ED  - Cerone, Antonio
ED  - Couto, Luis
ED  - Dongol, Brijesh
ED  - Kutrib, Martin
ED  - Monteiro, Pedro
ED  - Delmas, David
AB  - As Nietzsche once wrote “Without music, life would be a mistake” (Twilight of the Idols, 1889.). The music we listen to reflects our personality, our way to approach life. In order to enforce self-awareness, we devised a Personal Listening Data Model that allows for capturing individual music preferences and patterns of music consumption. We applied our model to 30k users of Last.Fm for which we collected both friendship ties and multiple listening. Starting from such rich data we performed an analysis whose final aim was twofold: (i) capture, and characterize, the individual dimension of music consumption in order to identify clusters of like-minded Last.Fm users; (ii) analyze if, and how, such clusters relate to the social structure expressed by the users in the service. Do there exist individuals having similar Personal Listening Data Models? If so, are they directly connected in the social graph or belong to the same community?
JF  - Formal Methods. FM 2019 International Workshops
PB  - Springer International Publishing
CY  - Cham
SN  - 978-3-030-54994-7
UR  - https://link.springer.com/chapter/10.1007/978-3-030-54994-7_11
ER  - 

TY  - ABST
T1  - Mobile phone data analytics against the COVID-19 epidemics in Italy: flow diversity and local job markets during the national lockdown
Y1  - 2020
A1  - Pietro Bonato
A1  - Paolo Cintia
A1  - Francesco Fabbri
A1  - Daniele Fadda
A1  - Fosca Giannotti
A1  - Pier Luigi Lopalco
A1  - Sara Mazzilli
A1  - Mirco Nanni
A1  - Luca Pappalardo
A1  - Dino Pedreschi
A1  - Francesco Penone
A1  - S Rinzivillo
A1  - Giulio Rossetti
A1  - Marcello Savarese
A1  - Lara Tavoschi
AB  - Understanding collective mobility patterns is crucial to plan the restart of production and economic activities, which are currently put in stand-by to fight the diffusion of the epidemics. In this report, we use mobile phone data to infer the movements of people between Italian provinces and municipalities, and we analyze the incoming, outcoming and internal mobility flows before and during the national lockdown (March 9th, 2020) and after the closure of non-necessary productive and economic activities (March 23rd, 2020). The population flow across provinces and municipalities enables the modelling of a risk index tailored for the mobility of each municipality or province. Such an index would be a useful indicator to drive counter-measures in reaction to a sudden reactivation of the epidemics. Mobile phone data, even when aggregated to preserve the privacy of individuals, are a useful data source to track the evolution in time of human mobility, hence allowing for monitoring the effectiveness of control measures such as physical distancing. We address the following analytical questions: How does the mobility structure of a territory change? Do incoming and outcoming flows become more predictable during the lockdown, and what are the differences between weekdays and weekends? Can we detect proper local job markets based on human mobility flows, to eventually shape the borders of a local outbreak?
UR  - https://arxiv.org/abs/2004.11278
ER  - 

TY  - JOUR
T1  - Modeling Adversarial Behavior Against Mobility Data Privacy
JF  - IEEE Transactions on Intelligent Transportation Systems
Y1  - 2020
A1  - Roberto Pellungrini
A1  - Luca Pappalardo
A1  - F. Simini
A1  - Anna Monreale
AB  - Privacy risk assessment is a crucial issue in any privacy-aware analysis process. Traditional frameworks for privacy risk assessment systematically generate the assumed knowledge for a potential adversary, evaluating the risk without realistically modelling the collection of the background knowledge used by the adversary when performing the attack. In this work, we propose Simulated Privacy Annealing (SPA), a new adversarial behavior model for privacy risk assessment in mobility data. We model the behavior of an adversary as a mobility trajectory and introduce an optimization approach to find the most effective adversary trajectory in terms of privacy risk produced for the individuals represented in a mobility data set. We use simulated annealing to optimize the movement of the adversary and simulate a possible attack on mobility data. We finally test the effectiveness of our approach on real human mobility data, showing that it can simulate the knowledge gathering process for an adversary in a more realistic way.
SN  - 1558-0016
UR  - https://ieeexplore.ieee.org/abstract/document/9199893
JO  - IEEE Transactions on Intelligent Transportation Systems
ER  - 

TY  - JOUR
T1  - Modelling Human Mobility considering Spatial, Temporal and Social Dimensions
JF  - arXiv preprint arXiv:2007.02371
Y1  - 2020
A1  - Cornacchia, Giuliano
A1  - Giulio Rossetti
A1  - Luca Pappalardo
ER  - 

TY  - CONF
T1  - Opinion Dynamic Modeling of Fake News Perception
T2  - International Conference on Complex Networks and Their Applications
Y1  - 2020
A1  - Toccaceli, Cecilia
A1  - Letizia Milli
A1  - Giulio Rossetti
AB  - Fake news diffusion represents one of the most pressing issues of our online society. In recent years, fake news has been analyzed from several points of view, primarily to improve our ability to separate them from the legit ones as well as identify their sources. Among such vast literature, a rarely discussed theme is likely to play uttermost importance in our understanding of such a controversial phenomenon: the analysis of fake news’ perception. In this work, we approach such a problem by proposing a family of opinion dynamic models tailored to study how specific social interaction patterns concur to the acceptance, or refusal, of fake news by a population of interacting individuals. To discuss the peculiarities of the proposed models, we tested them on several synthetic network topologies, thus underlying when/how they affect the stable states reached by the performed simulations.
JF  - International Conference on Complex Networks and Their Applications
PB  - Springer
UR  - https://link.springer.com/chapter/10.1007/978-3-030-65347-7_31
ER  - 

TY  - CONF
T1  - Predicting and Explaining Privacy Risk Exposure in Mobility Data
T2  - Discovery Science
Y1  - 2020
A1  - Francesca Naretto
A1  - Roberto Pellungrini
A1  - Anna Monreale
A1  - Nardini, Franco Maria
A1  - Musolesi, Mirco
ED  - Appice, Annalisa
ED  - Tsoumakas, Grigorios
ED  - Manolopoulos, Yannis
ED  - Matwin, Stan
AB  - Mobility data is a proxy of different social dynamics and its analysis enables a wide range of user services. Unfortunately, mobility data are very sensitive because the sharing of people’s whereabouts may arise serious privacy concerns. Existing frameworks for privacy risk assessment provide tools to identify and measure privacy risks, but they often (i) have high computational complexity; and (ii) are not able to provide users with a justification of the reported risks. In this paper, we propose expert, a new framework for the prediction and explanation of privacy risk on mobility data. We empirically evaluate privacy risk on real data, simulating a privacy attack with a state-of-the-art privacy risk assessment framework. We then extract individual mobility profiles from the data for predicting their risk. We compare the performance of several machine learning algorithms in order to identify the best approach for our task. Finally, we show how it is possible to explain privacy risk prediction on real data, using two algorithms: Shap, a feature importance-based method and Lore, a rule-based method. Overall, expert is able to provide a user with the privacy risk and an explanation of the risk itself. The experiments show excellent performance for the prediction task.
JF  - Discovery Science
PB  - Springer International Publishing
CY  - Cham
SN  - 978-3-030-61527-7
UR  - https://link.springer.com/chapter/10.1007/978-3-030-61527-7_27
ER  - 

TY  - CONF
T1  - Prediction and Explanation of Privacy Risk on Mobility Data with Neural Networks
T2  - ECML PKDD 2020 Workshops
Y1  - 2020
A1  - Francesca Naretto
A1  - Roberto Pellungrini
A1  - Nardini, Franco Maria
A1  - Fosca Giannotti
ED  - Koprinska, Irena
ED  - Kamp, Michael
ED  - Appice, Annalisa
ED  - Loglisci, Corrado
ED  - Antonie, Luiza
ED  - Zimmermann, Albrecht
ED  - Riccardo Guidotti
ED  - Özgöbek, Özlem
ED  - Ribeiro, Rita P.
ED  - Gavaldà, Ricard
ED  - Gama, João
ED  - Adilova, Linara
ED  - Krishnamurthy, Yamuna
ED  - Ferreira, Pedro M.
ED  - Malerba, Donato
ED  - Medeiros, Ibéria
ED  - Ceci, Michelangelo
ED  - Manco, Giuseppe
ED  - Masciari, Elio
ED  - Ras, Zbigniew W.
ED  - Christen, Peter
ED  - Ntoutsi, Eirini
ED  - Schubert, Erich
ED  - Zimek, Arthur
ED  - Anna Monreale
ED  - Biecek, Przemyslaw
ED  - S Rinzivillo
ED  - Kille, Benjamin
ED  - Lommatzsch, Andreas
ED  - Gulla, Jon Atle
AB  - The analysis of privacy risk for mobility data is a fundamental part of any privacy-aware process based on such data. Mobility data are highly sensitive. Therefore, the correct identification of the privacy risk before releasing the data to the public is of utmost importance. However, existing privacy risk assessment frameworks have high computational complexity. To tackle these issues, some recent work proposed a solution based on classification approaches to predict privacy risk using mobility features extracted from the data. In this paper, we propose an improvement of this approach by applying long short-term memory (LSTM) neural networks to predict the privacy risk directly from original mobility data. We empirically evaluate privacy risk on real data by applying our LSTM-based approach. Results show that our proposed method based on a LSTM network is effective in predicting the privacy risk with results in terms of F1 of up to 0.91. Moreover, to explain the predictions of our model, we employ a state-of-the-art explanation algorithm, Shap. We explore the resulting explanation, showing how it is possible to provide effective predictions while explaining them to the end-user.
JF  - ECML PKDD 2020 Workshops
PB  - Springer International Publishing
CY  - Cham
SN  - 978-3-030-65965-3
UR  - https://link.springer.com/chapter/10.1007/978-3-030-65965-3_34
ER  - 

TY  - JOUR
T1  - PRIMULE: Privacy risk mitigation for user profiles
Y1  - 2020
A1  - Francesca Pratesi
A1  - Lorenzo Gabrielli
A1  - Paolo Cintia
A1  - Anna Monreale
A1  - Fosca Giannotti
AB  - The availability of mobile phone data has encouraged the development of different data-driven tools, supporting social science studies and providing new data sources to the standard official statistics. However, this particular kind of data are subject to privacy concerns because they can enable the inference of personal and private information. In this paper, we address the privacy issues related to the sharing of user profiles, derived from mobile phone data, by proposing PRIMULE, a privacy risk mitigation strategy. Such a method relies on PRUDEnce (Pratesi et al., 2018), a privacy risk assessment framework that provides a methodology for systematically identifying risky-users in a set of data. An extensive experimentation on real-world data shows the effectiveness of PRIMULE strategy in terms of both quality of mobile user profiles and utility of these profiles for analytical services such as the Sociometer (Furletti et al., 2013), a data mining tool for city users classification.
VL  - 125
SN  - 0169-023X
UR  - https://www.sciencedirect.com/science/article/pii/S0169023X18305342
JO  - Data & Knowledge Engineering
ER  - 

TY  - JOUR
T1  - The relationship between human mobility and viral transmissibility during the COVID-19 epidemics in Italy
JF  - arXiv preprint arXiv:2006.03141
Y1  - 2020
A1  - Paolo Cintia
A1  - Daniele Fadda
A1  - Fosca Giannotti
A1  - Luca Pappalardo
A1  - Giulio Rossetti
A1  - Dino Pedreschi
A1  - S Rinzivillo
A1  - Bonato, Pietro
A1  - Fabbri, Francesco
A1  - Penone, Francesco
A1  - Savarese, Marcello
A1  - Checchi, Daniele
A1  - Chiaromonte, Francesca
A1  - Vineis, Paolo
A1  - Guzzetta, Giorgio
A1  - Riccardo, Flavia
A1  - Marziano, Valentina
A1  - Poletti, Piero
A1  - Trentini, Filippo
A1  - Bella, Antonio
A1  - Andrianou, Xanthi
A1  - Del Manso, Martina
A1  - Fabiani, Massimo
A1  - Bellino, Stefania
A1  - Boros, Stefano
A1  - Mateo Urdiales, Alberto
A1  - Vescio, Maria Fenicia
A1  - Brusaferro, Silvio
A1  - Rezza, Giovanni
A1  - Pezzotti, Patrizio
A1  - Ajelli, Marco
A1  - Merler, Stefano
AB  - We describe in this report our studies to understand the relationship between human mobility and the spreading of COVID-19, as an aid to manage the restart of the social and economic activities after the lockdown and monitor the epidemics in the coming weeks and months. We compare the evolution (from January to May 2020) of the daily mobility flows in Italy, measured by means of nation-wide mobile phone data, and the evolution of transmissibility, measured by the net reproduction number, i.e., the mean number of secondary infections generated by one primary infector in the presence of control interventions and human behavioural adaptations. We find a striking relationship between the negative variation of mobility flows and the net reproduction number, in all Italian regions, between March 11th and March 18th, when the country entered the lockdown. This observation allows us to quantify the time needed to "switch off" the country mobility (one week) and the time required to bring the net reproduction number below 1 (one week). A reasonably simple regression model provides evidence that the net reproduction number is correlated with a region's incoming, outgoing and internal mobility. We also find a strong relationship between the number of days above the epidemic threshold before the mobility flows reduce significantly as an effect of lockdowns, and the total number of confirmed SARS-CoV-2 infections per 100k inhabitants, thus indirectly showing the effectiveness of the lockdown and the other non-pharmaceutical interventions in the containment of the contagion. Our study demonstrates the value of "big" mobility data to the monitoring of key epidemic indicators to inform choices as the epidemics unfolds in the coming months.
UR  - https://arxiv.org/abs/2006.03141
ER  - 

TY  - JOUR
T1  - (So) Big Data and the transformation of the city
JF  - International Journal of Data Science and Analytics
Y1  - 2020
A1  - Andrienko, Gennady
A1  - Andrienko, Natalia
A1  - Boldrini, Chiara
A1  - Caldarelli, Guido
A1  - Paolo Cintia
A1  - Cresci, Stefano
A1  - Facchini, Angelo
A1  - Fosca Giannotti
A1  - Gionis, Aristides
A1  - Riccardo Guidotti
A1  - others
AB  - The exponential increase in the availability of large-scale mobility data has fueled the vision of smart cities that will transform our lives. The truth is that we have just scratched the surface of the research challenges that should be tackled in order to make this vision a reality. Consequently, there is an increasing interest among different research communities (ranging from civil engineering to computer science) and industrial stakeholders in building knowledge discovery pipelines over such data sources. At the same time, this widespread data availability also raises privacy issues that must be considered by both industrial and academic stakeholders. In this paper, we provide a wide perspective on the role that big data have in reshaping cities. The paper covers the main aspects of urban data analytics, focusing on privacy issues, algorithms, applications and services, and georeferenced data from social media. In discussing these aspects, we leverage, as concrete examples and case studies of urban data science tools, the results obtained in the “City of Citizens” thematic area of the Horizon 2020 SoBigData initiative, which includes a virtual research environment with mobility datasets and urban analytics methods developed by several institutions around Europe. We conclude the paper outlining the main research challenges that urban data science has yet to address in order to help make the smart city vision a reality.
UR  - https://link.springer.com/article/10.1007/s41060-020-00207-3
ER  - 

TY  - JOUR
T1  - UTLDR: an agent-based framework for modeling infectious diseases and public interventions
JF  - arXiv preprint arXiv:2011.05606
Y1  - 2020
A1  - Giulio Rossetti
A1  - Letizia Milli
A1  - Salvatore Citraro
A1  - Morini, Virginia
ER  - 

TY  - JOUR
T1  - The AI black box Explanation Problem
JF  - ERCIM NEWS
Y1  - 2019
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - Dino Pedreschi
ER  - 

TY  - JOUR
T1  - Algorithmic bias amplifies opinion fragmentation and polarization: A bounded confidence model
JF  - PloS one
Y1  - 2019
A1  - Alina Sirbu
A1  - Dino Pedreschi
A1  - Fosca Giannotti
A1  - Kertész, János
AB  - The flow of information reaching us via the online media platforms is optimized not by the information content or relevance but by popularity and proximity to the target. This is typically performed in order to maximise platform usage. As a side effect, this introduces an algorithmic bias that is believed to enhance fragmentation and polarization of the societal debate. To study this phenomenon, we modify the well-known continuous opinion dynamics model of bounded confidence in order to account for the algorithmic bias and investigate its consequences. In the simplest version of the original model the pairs of discussion participants are chosen at random and their opinions get closer to each other if they are within a fixed tolerance level. We modify the selection rule of the discussion partners: there is an enhanced probability to choose individuals whose opinions are already close to each other, thus mimicking the behavior of online media which suggest interaction with similar peers. As a result we observe: a) an increased tendency towards opinion fragmentation, which emerges also in conditions where the original model would predict consensus, b) increased polarisation of opinions and c) a dramatic slowing down of the speed at which the convergence at the asymptotic state is reached, which makes the system highly unstable. Fragmentation and polarization are augmented by a fragmented initial population.
VL  - 14
UR  - https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0213246
ER  - 

TY  - JOUR
T1  - Analysis of the Impact of Interpolation Methods of Missing RR-intervals Caused by Motion Artifacts on HRV Features Estimations
JF  - Sensors
Y1  - 2019
A1  - Morelli, Davide
A1  - Alessio Rossi
A1  - Cairo, Massimo
A1  - Clifton, David A
AB  - Wearable physiological monitors have become increasingly popular, often worn during people’s daily life, collecting data 24 hours a day, 7 days a week. In the last decade, these devices have attracted the attention of the scientific community as they allow us to automatically extract information about user physiology (e.g., heart rate, sleep quality and physical activity) enabling inference on their health. However, the biggest issue about the data recorded by wearable devices is the missing values due to motion and mechanical artifacts induced by external stimuli during data acquisition. This missing data could negatively affect the assessment of heart rate (HR) response and estimation of heart rate variability (HRV), that could in turn provide misleading insights concerning the health status of the individual. In this study, we focus on healthy subjects with normal heart activity and investigate the effects of missing variation of the timing between beats (RR-intervals) caused by motion artifacts on HRV features estimation by randomly introducing missing values within five-minute time windows of RR-intervals obtained from the nsr2db PhysioNet dataset by using Gilbert burst method. We then evaluate several strategies for estimating HRV in the presence of missing values by interpolating periods of missing values, covering the range of techniques often deployed in the literature, via linear, quadratic, cubic, and cubic spline functions. We thereby compare the HRV features obtained by handling missing data in RR-interval time series against HRV features obtained from the same data without missing values. Finally, we assess the difference between the use of interpolation methods on time (i.e., the timestamp when the heartbeats happen) and on duration (i.e., the duration of the heartbeats), in order to identify the best methodology to handle the missing RR-intervals. The main novel finding of this study is that the interpolation of missing data on time produces more reliable HRV estimations when compared to interpolation on duration. Hence, we can conclude that interpolation on duration modifies the power spectrum of the RR signal, negatively affecting the estimation of the HRV features as the amount of missing values increases. We can conclude that interpolation in time is the optimal method among those considered for handling data with large amounts of missing values, such as data from wearable sensors.
VL  - 19
UR  - https://www.mdpi.com/1424-8220/19/14/3163
ER  - 

TY  - JOUR
T1  - Causal inference for social discrimination reasoning
JF  - Journal of Intelligent Information Systems
Y1  - 2019
A1  - Qureshi, Bilal
A1  - Kamiran, Faisal
A1  - Karim, Asim
A1  - Salvatore Ruggieri
A1  - Dino Pedreschi
AB  - The discovery of discriminatory bias in human or automated decision making is a task of increasing importance and difficulty, exacerbated by the pervasive use of machine learning and data mining. Currently, discrimination discovery largely relies upon correlation analysis of decisions records, disregarding the impact of confounding biases. We present a method for causal discrimination discovery based on propensity score analysis, a statistical tool for filtering out the effect of confounding variables. We introduce causal measures of discrimination which quantify the effect of group membership on the decisions, and highlight causal discrimination/favoritism patterns by learning regression trees over the novel measures. We validate our approach on two real world datasets. Our proposed framework for causal discrimination has the potential to enhance the transparency of machine learning with tools for detecting discriminatory bias both in the training data and in the learning algorithms.
UR  - https://link.springer.com/article/10.1007/s10844-019-00580-x
ER  - 

TY  - JOUR
T1  - CDLIB: a python library to extract, compare and evaluate communities from complex networks
JF  - Applied Network Science
Y1  - 2019
A1  - Giulio Rossetti
A1  - Letizia Milli
A1  - Cazabet, Rémy
AB  - Community Discovery is among the most studied problems in complex network analysis. During the last decade, many algorithms have been proposed to address such task; however, only a few of them have been integrated into a common framework, making it hard to use and compare different solutions. To support developers, researchers and practitioners, in this paper we introduce a python library - namely CDlib - designed to serve this need. The aim of CDlib is to allow easy and standardized access to a wide variety of network clustering algorithms, to evaluate and compare the results they provide, and to visualize them. It notably provides the largest available collection of community detection implementations, with a total of 39 algorithms.
VL  - 4
SN  - 2364-8228
UR  - https://link.springer.com/article/10.1007/s41109-019-0165-9
JO  - Applied Network Science
ER  - 

TY  - CHAP
T1  - Challenges in community discovery on temporal networks
T2  - Temporal Network Theory
Y1  - 2019
A1  - Cazabet, Rémy
A1  - Giulio Rossetti
AB  - Community discovery is one of the most studied problems in network science. In recent years, many works have focused on discovering communities in temporal networks, thus identifying dynamic communities. Interestingly, dynamic communities are not mere sequences of static ones; new challenges arise from their dynamic nature. Despite the large number of algorithms introduced in the literature, some of these challenges have been overlooked or little studied until recently. In this chapter, we will discuss some of these challenges and recent propositions to tackle them. We will, among other topics, discuss community events in gradually evolving networks, the notion of identity through change and the ship of Theseus paradox, dynamic communities in different types of networks including link streams, the smoothness of dynamic communities, and the different types of complexity of algorithms for their discovery. We will also list available tools and libraries adapted to work with this problem.
JF  - Temporal Network Theory
PB  - Springer
UR  - https://link.springer.com/chapter/10.1007/978-3-030-23495-9_10
ER  - 

TY  - CONF
T1  - Community-Aware Content Diffusion: Embeddednes and Permeability
T2  - International Conference on Complex Networks and Their Applications
Y1  - 2019
A1  - Letizia Milli
A1  - Giulio Rossetti
JF  - International Conference on Complex Networks and Their Applications
PB  - Springer
ER  - 

TY  - CONF
T1  - A complex network approach to semantic spaces: How meaning organizes itself
T2  - SEBD
Y1  - 2019
A1  - Salvatore Citraro
A1  - Giulio Rossetti
JF  - SEBD
ER  - 

TY  - JOUR
T1  - Defining Geographic Markets from Probabilistic Clusters: A Machine Learning Algorithm Applied to Supermarket Scanner Data
JF  - Available at SSRN 3452058
Y1  - 2019
A1  - Bruestle, Stephen
A1  - Luca Pappalardo
A1  - Riccardo Guidotti
ER  - 

TY  - JOUR
T1  - Do “girls just wanna have fun”? Participation trends and motivational profiles of women in Norway’s ultimate mass participation ski event
JF  - Frontiers in Psychology
Y1  - 2019
A1  - Calogiuri, Giovanna
A1  - Johansen, Patrick Foss
A1  - Alessio Rossi
A1  - Thurston, Miranda
AB  - Mass participation sporting events (MPSEs) are viewed as encouraging regular exercise in the population, but concerns have been expressed about the extent to which they are inclusive for women. This study focuses on an iconic cross-country skiing MPSE in Norway, the Birkebeiner race (BR), which includes different variants (main, Friday, half-distance, and women-only races). In order to shed light on women’s participation in this specific MPSE, as well as add to the understanding of women’s MPSEs participation in general, this study was set up to: (i) analyze trends in women’s participation, (ii) examine the characteristics, and (iii) identify key factors characterizing the motivational profile of women in different BR races, with emphasis on the full-distance vs. the women-only races. Entries in the different races throughout the period 1996–2018 were analyzed using an autoregressive model. Information on women’s sociodemographic characteristics, sport and exercise participation, and a range of psychological variables (motives, perceptions, overall satisfaction, and future participation intention) were extracted from a market survey and analyzed using a machine learning (ML) approach (n = 1,149). Additionally, qualitative information generated through open-ended questions was analyzed thematically (n = 116). The relative prevalence of women in the main BR was generally low (< 20%). While the other variants contributed to boosting women’s participation in the overall event, a future increment of women in the main BR was predicted, with women’s ratings possibly matching the men’s by the year 2034. Across all races, most of the women were physically active, of medium-high income, and living in the most urbanized region of Norway. Satisfaction and future participation intention were relatively high, especially among the participants in the women-only races. “Exercise goal” was the predominant participation motive. The participants in women-only races assigned greater importance to social aspects, and perceived the race as a tradition, whereas those in the full-distance races were younger and gave more importance to performance aspects. These findings corroborate known trends and challenges in MPSE participation, but also contribute to greater understanding in this under-researched field. Further research is needed in order to gain more knowledge on how to foster women’s participation in MPSEs.
VL  - 10
UR  - https://www.frontiersin.org/articles/10.3389/fpsyg.2019.02548/full
ER  - 

TY  - JOUR
T1  - DynComm R Package–Dynamic Community Detection for Evolving Networks
JF  - arXiv preprint arXiv:1905.01498
Y1  - 2019
A1  - Sarmento, Rui Portocarrero
A1  - Lemos, Luís
A1  - Cordeiro, Mário
A1  - Giulio Rossetti
A1  - Cardoso, Douglas
ER  - 

TY  - CONF
T1  - Eva: Attribute-Aware Network Segmentation
T2  - International Conference on Complex Networks and Their Applications
Y1  - 2019
A1  - Salvatore Citraro
A1  - Giulio Rossetti
AB  - Identifying topologically well-defined communities that are also homogeneous w.r.t. attributes carried by the nodes that compose them is a challenging social network analysis task. We address such a problem by introducing Eva, a bottom-up low complexity algorithm designed to identify network hidden mesoscale topologies by optimizing structural and attribute-homophilic clustering criteria. We evaluate the proposed approach on heterogeneous real-world labeled network datasets, such as co-citation, linguistic, and social networks, and compare it with state-of-art community discovery competitors. Experimental results underline that Eva ensures that network nodes are grouped into communities according to their attribute similarity without considerably degrading partition modularity, both in single and multi node-attribute scenarios.
JF  - International Conference on Complex Networks and Their Applications
PB  - Springer
ER  - 

TY  - CONF
T1  - Exorcising the Demon: Angel, Efficient Node-Centric Community Discovery
T2  - International Conference on Complex Networks and Their Applications
Y1  - 2019
A1  - Giulio Rossetti
JF  - International Conference on Complex Networks and Their Applications
PB  - Springer
ER  - 

TY  - CONF
T1  - Explaining multi-label black-box classifiers for health applications
T2  - International Workshop on Health Intelligence
Y1  - 2019
A1  - Cecilia Panigutti
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - Dino Pedreschi
AB  - Today the state-of-the-art performance in classification is achieved by the so-called “black boxes”, i.e. decision-making systems whose internal logic is obscure. Such models could revolutionize the health-care system, however their deployment in real-world diagnosis decision support systems is subject to several risks and limitations due to the lack of transparency. The typical classification problem in health-care requires a multi-label approach since the possible labels are not mutually exclusive, e.g. diagnoses. We propose MARLENA, a model-agnostic method which explains multi-label black box decisions. MARLENA explains an individual decision in three steps. First, it generates a synthetic neighborhood around the instance to be explained using a strategy suitable for multi-label decisions. It then learns a decision tree on such neighborhood and finally derives from it a decision rule that explains the black box decision. Our experiments show that MARLENA performs well in terms of mimicking the black box behavior while gaining at the same time a notable amount of interpretability through compact decision rules, i.e. rules with limited length.
JF  - International Workshop on Health Intelligence
PB  - Springer
UR  - https://link.springer.com/chapter/10.1007/978-3-030-24409-5_9
ER  - 

TY  - JOUR
T1  - Factual and Counterfactual Explanations for Black Box Decision Making
JF  - IEEE Intelligent Systems
Y1  - 2019
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
A1  - Franco Turini
AB  - The rise of sophisticated machine learning models has brought accurate but obscure decision systems, which hide their logic, thus undermining transparency, trust, and the adoption of artificial intelligence (AI) in socially sensitive and safety-critical contexts. We introduce a local rule-based explanation method, providing faithful explanations of the decision made by a black box classifier on a specific instance. The proposed method first learns an interpretable, local classifier on a synthetic neighborhood of the instance under investigation, generated by a genetic algorithm. Then, it derives from the interpretable classifier an explanation consisting of a decision rule, explaining the factual reasons of the decision, and a set of counterfactuals, suggesting the changes in the instance features that would lead to a different outcome. Experimental results show that the proposed method outperforms existing approaches in terms of the quality of the explanations and of the accuracy in mimicking the black box.
UR  - https://ieeexplore.ieee.org/abstract/document/8920138
ER  - 

TY  - CONF
T1  - Human Mobility from theory to practice: Data, Models and Applications
T2  - Companion of The 2019 World Wide Web Conference, WWW 2019, San Francisco, CA, USA, May 13-17, 2019.
Y1  - 2019
A1  - Luca Pappalardo
A1  - Gianni Barlacchi
A1  - Roberto Pellungrini
A1  - Filippo Simini
AB  - The inclusion of tracking technologies in personal devices opened the doors to the analysis of large sets of mobility data like GPS traces and call detail records. This tutorial presents an overview of both modeling principles of human mobility and machine learning models applicable to specific problems. We review the state of the art of five main aspects in human mobility: (1) human mobility data landscape; (2) key measures of individual and collective mobility; (3) generative models at the level of individual, population and mixture of the two; (4) next location prediction algorithms; (5) applications for social good. For each aspect, we show experiments and simulations using the Python library “scikit-mobility” developed by the presenters of the tutorial.
JF  - Companion of The 2019 World Wide Web Conference, WWW 2019, San Francisco, CA, USA, May 13-17, 2019.
UR  - https://doi.org/10.1145/3308560.3320099
ER  - 

TY  - CONF
T1  - Investigating Neighborhood Generation Methods for Explanations of Obscure Image Classifiers
T2  - Pacific-Asia Conference on Knowledge Discovery and Data Mining
Y1  - 2019
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - Cariaggi, Leonardo
AB  - Given the wide use of machine learning approaches based on opaque prediction models, understanding the reasons behind decisions of black box decision systems is nowadays a crucial topic. We address the problem of providing meaningful explanations in the widely-applied image classification tasks. In particular, we explore the impact of changing the neighborhood generation function for a local interpretable model-agnostic explanator by proposing four different variants. All the proposed methods are based on a grid-based segmentation of the images, but each of them proposes a different strategy for generating the neighborhood of the image for which an explanation is required. A deep experimentation shows both improvements and weakness of each proposed approach.
JF  - Pacific-Asia Conference on Knowledge Discovery and Data Mining
PB  - Springer
UR  - https://link.springer.com/chapter/10.1007/978-3-030-16148-4_5
ER  - 

TY  - CONF
T1  - “Know Thyself” How Personal Music Tastes Shape the Last.Fm Online Social Network
T2  - International Symposium on Formal Methods
Y1  - 2019
A1  - Riccardo Guidotti
A1  - Giulio Rossetti
JF  - International Symposium on Formal Methods
PB  - Springer
ER  - 

TY  - CONF
T1  - Meaningful explanations of Black Box AI decision systems
T2  - Proceedings of the AAAI Conference on Artificial Intelligence
Y1  - 2019
A1  - Dino Pedreschi
A1  - Fosca Giannotti
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - Salvatore Ruggieri
A1  - Franco Turini
AB  - Black box AI systems for automated decision making, often based on machine learning over (big) data, map a user’s features into a class or a score without exposing the reasons why. This is problematic not only for lack of transparency, but also for possible biases inherited by the algorithms from human prejudices and collection artifacts hidden in the training data, which may lead to unfair or wrong decisions. We focus on the urgent open challenge of how to construct meaningful explanations of opaque AI/ML systems, introducing the local-to-global framework for black box explanation, articulated along three lines: (i) the language for expressing explanations in terms of logic rules, with statistical and causal interpretation; (ii) the inference of local explanations for revealing the decision rationale for a specific case, by auditing the black box in the vicinity of the target instance; (iii) the bottom-up generalization of many local explanations into simple global ones, with algorithms that optimize for quality and comprehensibility. We argue that the local-first approach opens the door to a wide variety of alternative solutions along different dimensions: a variety of data sources (relational, text, images, etc.), a variety of learning problems (multi-label classification, regression, scoring, ranking), a variety of languages for expressing meaningful explanations, a variety of means to audit a black box.
JF  - Proceedings of the AAAI Conference on Artificial Intelligence
UR  - https://aaai.org/ojs/index.php/AAAI/article/view/5050
ER  - 

TY  - JOUR
T1  - PlayeRank: data-driven performance evaluation and player ranking in soccer via a machine learning approach
JF  - ACM Transactions on Intelligent Systems and Technology (TIST)
Y1  - 2019
A1  - Luca Pappalardo
A1  - Paolo Cintia
A1  - Ferragina, Paolo
A1  - Massucco, Emanuele
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - The problem of evaluating the performance of soccer players is attracting the interest of many companies and the scientific community, thanks to the availability of massive data capturing all the events generated during a match (e.g., tackles, passes, shots, etc.). Unfortunately, there is no consolidated and widely accepted metric for measuring performance quality in all of its facets. In this article, we design and implement PlayeRank, a data-driven framework that offers a principled multi-dimensional and role-aware evaluation of the performance of soccer players. We build our framework by deploying a massive dataset of soccer-logs consisting of millions of match events pertaining to four seasons of 18 prominent soccer competitions. By comparing PlayeRank to known algorithms for performance evaluation in soccer, and by exploiting a dataset of players’ evaluations made by professional soccer scouts, we show that PlayeRank significantly outperforms the competitors. We also explore the ratings produced by PlayeRank and discover interesting patterns about the nature of excellent performances and what distinguishes the top players from the others. At the end, we explore some applications of PlayeRank—i.e. searching players and player versatility—showing its flexibility and efficiency, which make it worth using in the design of a scalable platform for soccer analytics.
VL  - 10
UR  - https://dl.acm.org/doi/abs/10.1145/3343172
ER  - 

TY  - CONF
T1  - Privacy Risk for Individual Basket Patterns
T2  - ECML PKDD 2018 Workshops
Y1  - 2019
A1  - Roberto Pellungrini
A1  - Anna Monreale
A1  - Riccardo Guidotti
ED  - Alzate, Carlos
ED  - Anna Monreale
ED  - Bioglio, Livio
ED  - Bitetta, Valerio
ED  - Bordino, Ilaria
ED  - Caldarelli, Guido
ED  - Ferretti, Andrea
ED  - Riccardo Guidotti
ED  - Gullo, Francesco
ED  - Pascolutti, Stefano
ED  - Pensa, Ruggero G.
ED  - Robardet, Céline
ED  - Squartini, Tiziano
AB  - Retail data are of fundamental importance for businesses and enterprises that want to understand the purchasing behaviour of their customers. Such data is also useful to develop analytical services and for marketing purposes, often based on individual purchasing patterns. However, retail data and extracted models may also provide very sensitive information to possible malicious third parties. Therefore, in this paper we propose a methodology for empirically assessing privacy risk in the releasing of individual purchasing data. The experiments on real-world retail data show that although individual patterns describe a summary of the customer activity, they may be successfully used for customer re-identification.
JF  - ECML PKDD 2018 Workshops
PB  - Springer International Publishing
CY  - Cham
SN  - 978-3-030-13463-1
UR  - https://link.springer.com/chapter/10.1007/978-3-030-13463-1_11
ER  - 

TY  - JOUR
T1  - A public data set of spatio-temporal match events in soccer competitions
JF  - Scientific data
Y1  - 2019
A1  - Luca Pappalardo
A1  - Paolo Cintia
A1  - Alessio Rossi
A1  - Massucco, Emanuele
A1  - Ferragina, Paolo
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Soccer analytics is attracting increasing interest in academia and industry, thanks to the availability of sensing technologies that provide high-fidelity data streams for every match. Unfortunately, these detailed data are owned by specialized companies and hence are rarely publicly available for scientific research. To fill this gap, this paper describes the largest open collection of soccer-logs ever released, containing all the spatio-temporal events (passes, shots, fouls, etc.) that occurred during each match for an entire season of seven prominent soccer competitions. Each match event contains information about its position, time, outcome, player and characteristics. The nature of team sports like soccer, halfway between the abstraction of a game and the reality of complex social systems, combined with the unique size and composition of this dataset, provide an ideal ground for tackling a wide range of data science problems, including the measurement and evaluation of performance, both at individual and at collective level, and the determinants of success and failure.
VL  - 6
UR  - https://www.nature.com/articles/s41597-019-0247-7
ER  - 

TY  - JOUR
T1  - Public opinion and Algorithmic bias
JF  - ERCIM News
Y1  - 2019
A1  - Alina Sirbu
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - Kertész, János
UR  - https://ercim-news.ercim.eu/en116/special/public-opinion-and-algorithmic-bias
ER  - 

TY  - JOUR
T1  - Relationship between External and Internal Workloads in Elite Soccer Players: Comparison between Rate of Perceived Exertion and Training Load
JF  - Applied Sciences
Y1  - 2019
A1  - Alessio Rossi
A1  - Perri, Enrico
A1  - Luca Pappalardo
A1  - Paolo Cintia
A1  - Iaia, F Marcello
AB  - The use of machine learning (ML) in soccer allows for the management of a large amount of data deriving from the monitoring of sessions and matches. Although the rate of perceived exertion (RPE), training load (S-RPE), and global positioning system (GPS) are standard methodologies used in team sports to assess the internal and external workload, how the external workload affects RPE and S-RPE remains unclear. This study explores the relationship between both RPE and S-RPE and the training workload through ML. Data were recorded from 22 elite soccer players, in 160 training sessions and 35 matches during the 2015/2016 season, by using GPS tracking technology. A feature selection process was applied to understand which workload features influence RPE and S-RPE the most. Our results show that the training workloads performed in the previous week have a strong effect on perceived exertion and training load. On the other hand, the analysis of our predictions shows higher accuracy for medium RPE and S-RPE values compared with the extremes. These results provide further evidence of the usefulness of ML as a support to athletic trainers and coaches in understanding the relationship between training load and individual-response in team sports.
VL  - 9
UR  - https://www.mdpi.com/2076-3417/9/23/5174/htm
ER  - 

TY  - GEN
T1  - SAI a Sensible Artificial Intelligence that plays Go
Y1  - 2019
A1  - F Morandin
A1  - G Amato
A1  - R Gini
A1  - C Metta
A1  - M Parton
A1  - G.C. Pascutto
ER  - 

TY  - BOOK
T1  - Sarò Franco - Vita di Franco Turini, executive chef dell’Università di Pisa
Y1  - 2019
A1  - Marco Malvaldi
AB  - Who is Franco Turini? As many know, one of the pioneers of Italian computer science. But that is not the question that interests us. The question this short essay sets out to answer is a far more important one: who would Franco Turini have wanted to be? In this piece, Turini's life and career are retraced in the light of his true, only and irredeemable passion: cooking. In a novelistic plot, full of twists and absolutely false and tendentious, Franco Turini's contribution to computer science and artificial intelligence unfolds, inextricably intertwined with his passion for the stove, through the many brilliant insights that struck him while he was cooking.
PB  - Pisa University Press
CY  - Pisa, Italy
UR  - https://store.streetlib.com/it/marco-malvaldi/saro-franco-9788833392523/
ER  - 

TY  - CONF
T1  - On The Stability of Interpretable Models
T2  - 2019 International Joint Conference on Neural Networks (IJCNN)
Y1  - 2019
A1  - Riccardo Guidotti
A1  - Salvatore Ruggieri
AB  - Interpretable classification models are built with the purpose of providing a comprehensible description of the decision logic to an external oversight agent. When considered in isolation, a decision tree, a set of classification rules, or a linear model, are widely recognized as human-interpretable. However, such models are generated as part of a larger analytical process. Bias in data collection and preparation, or in model's construction may severely affect the accountability of the design process. We conduct an experimental study of the stability of interpretable models with respect to feature selection, instance selection, and model selection. Our conclusions should raise awareness and attention of the scientific community on the need of a stability impact assessment of interpretable models.
JF  - 2019 International Joint Conference on Neural Networks (IJCNN)
PB  - IEEE
UR  - https://ieeexplore.ieee.org/abstract/document/8852158
ER  - 

TY  - JOUR
T1  - Towards the dynamic community discovery in decentralized online social networks
JF  - Journal of Grid Computing
Y1  - 2019
A1  - Guidi, Barbara
A1  - Michienzi, Andrea
A1  - Giulio Rossetti
VL  - 17
ER  - 

TY  - JOUR
T1  - Transparency in Algorithmic Decision Making
JF  - ERCIM News
Y1  - 2019
A1  - Andreas Rauber
A1  - Roberto Trasarti
A1  - Fosca Giannotti
UR  - https://ercim-news.ercim.eu/en116/special/transparency-in-algorithmic-decision-making-introduction-to-the-special-theme
ER  - 

TY  - JOUR
T1  - A Visual Analytics Platform to Measure Performance on University Entrance Tests (Discussion Paper)
Y1  - 2019
A1  - Boncoraglio, Daniele
A1  - Deri, Francesca
A1  - Distefano, Francesco
A1  - Daniele Fadda
A1  - Filippi, Giorgio
A1  - Forte, Giuseppe
A1  - Licari, Federica
A1  - Michela Natilli
A1  - Dino Pedreschi
A1  - S Rinzivillo
ER  - 

TY  - JOUR
T1  - Active and passive diffusion processes in complex networks
JF  - Applied network science
Y1  - 2018
A1  - Letizia Milli
A1  - Giulio Rossetti
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Ideas, information, viruses: all of them, with their mechanisms, spread over the complex social tissues described by our interpersonal relations. Usually, general mathematical models are used to simulate and understand the unfolding of such complex phenomena; these models act agnostically from the object of which they simulate the diffusion, thus considering spreading of virus, ideas and innovations alike. Indeed, such degree of abstraction makes it easier to define a standard set of tools that can be applied to heterogeneous contexts; however, it can also lead to biased, incorrect, simulation outcomes. In this work we introduce the concepts of active and passive diffusion to discriminate the degree to which individuals' choices affect the overall spreading of content over a social graph. Moving from the analysis of a well-known passive diffusion schema, the Threshold model (that can be used to model peer-pressure related processes), we introduce two novel approaches whose aim is to provide active and mixed schemas applicable in the context of innovations/ideas diffusion simulation. Our analysis, performed both in synthetic and real-world data, underlines that the adoption of exclusively passive/active models leads to conflicting results, thus highlighting the need of mixed approaches to capture the real complexity of the simulated system better.
VL  - 3
UR  - https://link.springer.com/article/10.1007/s41109-018-0100-5
ER  - 

TY  - CONF
T1  - Analyzing Privacy Risk in Human Mobility Data
T2  - Software Technologies: Applications and Foundations - STAF 2018 Collocated Workshops, Toulouse, France, June 25-29, 2018, Revised Selected Papers
Y1  - 2018
A1  - Roberto Pellungrini
A1  - Luca Pappalardo
A1  - Francesca Pratesi
A1  - Anna Monreale
AB  - Mobility data are of fundamental importance for understanding the patterns of human movements, developing analytical services and modeling human dynamics. Unfortunately, mobility data also contain individual sensitive information, making it necessary an accurate privacy risk assessment for the individuals involved. In this paper, we propose a methodology for assessing privacy risk in human mobility data. Given a set of individual and collective mobility features, we define the minimum data format necessary for the computation of each feature and we define a set of possible attacks on these data formats. We perform experiments computing the empirical risk in a real-world mobility dataset, and show how the distributions of the considered mobility features are affected by the removal of individuals with different levels of privacy risk.
JF  - Software Technologies: Applications and Foundations - STAF 2018 Collocated Workshops, Toulouse, France, June 25-29, 2018, Revised Selected Papers
UR  - https://doi.org/10.1007/978-3-030-04771-9_10
ER  - 

TY  - JOUR
T1  - Assessing the Stability of Interpretable Models
JF  - arXiv preprint arXiv:1810.09352
Y1  - 2018
A1  - Riccardo Guidotti
A1  - Salvatore Ruggieri
AB  - Interpretable classification models are built with the purpose of providing a comprehensible description of the decision logic to an external oversight agent. When considered in isolation, a decision tree, a set of classification rules, or a linear model, are widely recognized as human-interpretable. However, such models are generated as part of a larger analytical process, which, in particular, comprises data collection and filtering. Selection bias in data collection or in data pre-processing may affect the model learned. Although model induction algorithms are designed to learn to generalize, they pursue optimization of predictive accuracy. It remains unclear how interpretability is instead impacted. We conduct an experimental analysis to investigate whether interpretable models are able to cope with data selection bias as far as interpretability is concerned.
ER  - 

TY  - JOUR
T1  - Community Discovery in Dynamic Networks: a Survey
JF  - Journal ACM Computing Surveys
Y1  - 2018
A1  - Giulio Rossetti
A1  - Cazabet, Rémy
AB  - Networks built to model real world phenomena are characterised by some properties that have attracted the attention of the scientific community: (i) they are organised according to community structure and (ii) their structure evolves with time. Many researchers have worked on methods that can efficiently unveil substructures in complex networks, giving birth to the field of community discovery. A novel and challenging problem started capturing researcher interest recently: the identification of evolving communities. To model the evolution of a system, dynamic networks can be used: nodes and edges are mutable and their presence, or absence, deeply impacts the community structure that composes them. The aim of this survey is to present the distinctive features and challenges of dynamic community discovery, and propose a classification of published approaches. As a "user manual", this work organizes state of art methodologies into a taxonomy, based on their rationale, and their specific instantiation. Given a desired definition of network dynamics, community characteristics and analytical needs, this survey will support researchers to identify the set of approaches that best fit their needs. The proposed classification could also help researchers to choose in which direction should future research be oriented.
VL  - 51
UR  - https://dl.acm.org/citation.cfm?id=3172867
ER  - 

TY  - CONF
T1  - Diffusive Phenomena in Dynamic Networks: a data-driven study
T2  - International Conference on Complex Networks CompleNet
Y1  - 2018
A1  - Letizia Milli
A1  - Giulio Rossetti
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Every day, ideas, information as well as viruses spread over complex social tissues described by our interpersonal relations. So far, the network contexts upon which diffusive phenomena unfold have usually been considered static, composed by a fixed set of nodes and edges. Recent studies describe social networks as rapidly changing topologies. In this work – following a data-driven approach – we compare the behaviors of classical spreading models when used to analyze a given social network whose topological dynamics are observed at different temporal-granularities. Our goal is to shed some light on the impacts that the adoption of a static topology has on spreading simulations as well as to provide an alternative formulation of two classical diffusion models.
JF  - International Conference on Complex Networks CompleNet
PB  - Springer
CY  - Boston March 5-8 2018
UR  - https://link.springer.com/chapter/10.1007/978-3-319-73198-8_13
ER  - 

TY  - CONF
T1  - Discovering Mobility Functional Areas: A Mobility Data Analysis Approach
T2  - International Workshop on Complex Networks
Y1  - 2018
A1  - Lorenzo Gabrielli
A1  - Daniele Fadda
A1  - Giulio Rossetti
A1  - Mirco Nanni
A1  - Piccinini, Leonardo
A1  - Dino Pedreschi
A1  - Fosca Giannotti
A1  - Patrizia Lattarulo
AB  - How do we measure the borders of urban areas and therefore decide which are the functional units of the territory? Nowadays, we typically do that just looking at census data, while in this work we aim to identify functional areas for mobility in a completely data-driven way. Our solution makes use of human mobility data (vehicle trajectories) and consists in an agglomerative process which gradually groups together those municipalities that maximize internal vehicular traffic while minimizing external one. The approach is tested against a dataset of trips involving individuals of an Italian Region, obtaining a new territorial division which allows us to identify mobility attractors. Leveraging such partitioning and external knowledge, we show that our method outperforms the state-of-the-art algorithms. Indeed, the outcome of our approach is of great value to public administrations for creating synergies within the aggregations of the territories obtained.
JF  - International Workshop on Complex Networks
PB  - Springer
UR  - https://link.springer.com/chapter/10.1007/978-3-319-73198-8_27
ER  - 

TY  - JOUR
T1  - Discovering temporal regularities in retail customers’ shopping behavior
JF  - EPJ Data Science
Y1  - 2018
A1  - Riccardo Guidotti
A1  - Lorenzo Gabrielli
A1  - Anna Monreale
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - In this paper we investigate the regularities characterizing the temporal purchasing behavior of the customers of a retail market chain. Most of the literature studying purchasing behavior focuses on what customers buy while giving little importance to the temporal dimension. As a consequence, the state of the art does not allow capturing which are the temporal purchasing patterns of each customer. These patterns should describe the customer’s temporal habits highlighting when she typically makes a purchase in correlation with information about the amount of expenditure, number of purchased items and other similar aggregates. This knowledge could be exploited for different scopes: set temporal discounts for making the purchases of customers more regular with respect to time, set personalized discounts in the day and time window preferred by the customer, provide recommendations for shopping time schedule, etc. To this aim, we introduce a framework for extracting from personal retail data a temporal purchasing profile able to summarize whether and when a customer makes her distinctive purchases. The individual profile describes a set of regular and characterizing shopping behavioral patterns, and the sequences in which these patterns take place. We show how to compare different customers by providing a collective perspective to their individual profiles, and how to group the customers with respect to these comparable profiles. By analyzing real datasets containing millions of shopping sessions we found that there is a limited number of patterns summarizing the temporal purchasing behavior of all the customers, and that they are sequentially followed in a finite number of ways. Moreover, we recognized regular customers characterized by a small number of temporal purchasing behaviors, and changing customers characterized by various types of temporal purchasing behaviors. Finally, we discuss how the profiles can be exploited both by customers to enable personalized services, and by the retail market chain for providing tailored discounts based on temporal purchasing regularity.
VL  - 7
UR  - https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-018-0133-0
ER  - 

TY  - JOUR
T1  - Effective injury forecasting in soccer with GPS training data and machine learning
JF  - PLOS ONE
Y1  - 2018
A1  - Alessio Rossi
A1  - Luca Pappalardo
A1  - Paolo Cintia
A1  - Iaia, F Marcello
A1  - Fernàndez, Javier
A1  - Medina, Daniel
AB  - Injuries have a great impact on professional soccer, due to their large influence on team performance and the considerable costs of rehabilitation for players. Existing studies in the literature provide just a preliminary understanding of which factors mostly affect injury risk, while an evaluation of the potential of statistical models in forecasting injuries is still missing. In this paper, we propose a multi-dimensional approach to injury forecasting in professional soccer that is based on GPS measurements and machine learning. By using GPS tracking technology, we collect data describing the training workload of players in a professional soccer club during a season. We then construct an injury forecaster and show that it is both accurate and interpretable by providing a set of case studies of interest to soccer practitioners. Our approach opens a novel perspective on injury prevention, providing a set of simple and practical rules for evaluating and interpreting the complex relations between injury risk and training performance in professional soccer.
VL  - 13
UR  - https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0201264
ER  - 

TY  - CONF
T1  - Explaining successful docker images using pattern mining analysis
T2  - Federation of International Conferences on Software Technologies: Applications and Foundations
Y1  - 2018
A1  - Riccardo Guidotti
A1  - Soldani, Jacopo
A1  - Neri, Davide
A1  - Antonio Brogi
AB  - Docker is on the rise in today’s enterprise IT. It permits shipping applications inside portable containers, which run from so-called Docker images. Docker images are distributed in public registries, which also monitor their popularity. The popularity of an image directly impacts on its usage, and hence on the potential revenues of its developers. In this paper, we present a frequent pattern mining-based approach for understanding how to improve an image to increase its popularity. The results in this work can provide valuable insights to Docker image providers, helping them to design more competitive software products.
JF  - Federation of International Conferences on Software Technologies: Applications and Foundations
PB  - Springer, Cham
UR  - https://link.springer.com/chapter/10.1007/978-3-030-04771-9_9
ER  - 

TY  - CONF
T1  - Exploring Students Eating Habits Through Individual Profiling and Clustering Analysis
T2  - ECML PKDD 2018 Workshops
Y1  - 2018
A1  - Michela Natilli
A1  - Anna Monreale
A1  - Riccardo Guidotti
A1  - Luca Pappalardo
JF  - ECML PKDD 2018 Workshops
PB  - Springer
ER  - 

TY  - CONF
T1  - The Fractal Dimension of Music: Geography, Popularity and Sentiment Analysis
T2  - International Conference on Smart Objects and Technologies for Social Good
Y1  - 2018
A1  - Pollacci, Laura
A1  - Riccardo Guidotti
A1  - Giulio Rossetti
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - Nowadays there is a growing standardization of musical contents. Our finding comes out from a cross-service multi-level dataset analysis where we study how geography affects the music production. The investigation presented in this paper highlights the existence of a “fractal” musical structure that relates the technical characteristics of the music produced at regional, national and world level. Moreover, a similar structure emerges also when we analyze the musicians’ popularity and the polarity of their songs defined as the mood that they are able to convey. Furthermore, the clusters identified are markedly distinct one from another with respect to popularity and sentiment.
JF  - International Conference on Smart Objects and Technologies for Social Good
PB  - Springer
UR  - https://link.springer.com/chapter/10.1007/978-3-319-76111-4_19
ER  - 

TY  - JOUR
T1  - Gastroesophageal reflux symptoms among Italian university students: epidemiology and dietary correlates using automatically recorded transactions
JF  - BMC gastroenterology
Y1  - 2018
A1  - Martinucci, Irene
A1  - Michela Natilli
A1  - Lorenzoni, Valentina
A1  - Luca Pappalardo
A1  - Anna Monreale
A1  - Turchetti, Giuseppe
A1  - Dino Pedreschi
A1  - Marchi, Santino
A1  - Barale, Roberto
A1  - de Bortoli, Nicola
AB  - Background: Gastroesophageal reflux disease (GERD) is one of the most common gastrointestinal disorders worldwide, with relevant impact on the quality of life and health care costs. The aim of our study is to assess the prevalence of GERD based on self-reported symptoms among university students in central Italy. The secondary aim is to evaluate lifestyle correlates, particularly eating habits, in GERD students using automatically recorded transactions through cashiers at university canteen. Methods: A web-survey was created and launched through an app, ad-hoc developed for an interactive exchange of information with students, including anthropometric data and lifestyle habits. Moreover, the web-survey allowed users a self-diagnosis of GERD through a simple questionnaire. As regards eating habits, detailed collection of meals consumed, including number and type of dishes, were automatically recorded through cashiers at the university canteen equipped with an automatic registration system. Results: We collected 3012 questionnaires. A total of 792 students (26.2% of the respondents) reported typical GERD symptoms occurring at least weekly. Female sex was more prevalent than male sex. In the set of students with GERD, the percentage of smokers was higher, and our results showed that when BMI tends to higher values the percentage of students with GERD tends to increase. When evaluating correlates with diet, we found, among all users, a lower frequency of legumes choice in GERD students and, among frequent users, a lower frequency of choice of pasta and rice in GERD students. Discussion: The results of our study are in line with the values reported in the literature. Nowadays, GERD is a common problem in our communities, and can potentially lead to serious medical complications; the economic burden involved in the diagnostic and therapeutic management of the disease has a relevant impact on healthcare costs. Conclusions: To our knowledge, this is the first study evaluating the prevalence of typical GERD-related symptoms in a young population of University students in Italy. Considering the young age of enrolled subjects, our prevalence rate, relatively high compared to the usual estimates, could represent a further negative factor for the future economic sustainability of the healthcare system. Keywords: Gastroesophageal reflux disease, GERD, Heartburn, Regurgitation, Diet, Prevalence, University students
VL  - 18
UR  - https://bmcgastroenterol.biomedcentral.com/articles/10.1186/s12876-018-0832-9
ER  - 

TY  - JOUR
T1  - Gravity and scaling laws of city to city migration
JF  - PLOS ONE
Y1  - 2018
A1  - Prieto Curiel, Rafael
A1  - Luca Pappalardo
A1  - Lorenzo Gabrielli
A1  - Bishop, Steven Richard
AB  - Models of human migration provide powerful tools to forecast the flow of migrants, measure the impact of a policy, determine the cost of physical and political frictions and more. Here, we analyse the migration of individuals from and to cities in the US, finding that city to city migration follows scaling laws, so that the city size is a significant factor in determining whether, or not, an individual decides to migrate and the city size of both the origin and destination play key roles in the selection of the destination. We observe that individuals from small cities tend to migrate more frequently, tending to move to similar-sized cities, whereas individuals from large cities do not migrate so often, but when they do, they tend to move to other large cities. Building upon these findings we develop a scaling model which describes internal migration as a two-step decision process, demonstrating that it can partially explain migration fluxes based solely on city size. We then consider the impact of distance and construct a gravity-scaling model by combining the observed scaling patterns with the gravity law of migration. Results show that the scaling laws are a significant feature of human migration and that the inclusion of scaling can overcome the limits of the gravity and the radiation models of human migration.
VL  - 13
UR  - https://doi.org/10.1371/journal.pone.0199892
ER  - 

TY  - CONF
T1  - Helping your docker images to spread based on explainable models
T2  - Joint European Conference on Machine Learning and Knowledge Discovery in Databases
Y1  - 2018
A1  - Riccardo Guidotti
A1  - Soldani, Jacopo
A1  - Neri, Davide
A1  - Brogi, Antonio
A1  - Dino Pedreschi
AB  - Docker is on the rise in today’s enterprise IT. It permits shipping applications inside portable containers, which run from so-called Docker images. Docker images are distributed in public registries, which also monitor their popularity. The popularity of an image impacts on its actual usage, and hence on the potential revenues for its developers. In this paper, we present a solution based on interpretable decision tree and regression trees for estimating the popularity of a given Docker image, and for understanding how to improve an image to increase its popularity. The results presented in this work can provide valuable insights to Docker developers, helping them in spreading their images. Code related to this paper is available at: https://github.com/di-unipi-socc/DockerImageMiner.
JF  - Joint European Conference on Machine Learning and Knowledge Discovery in Databases
PB  - Springer
UR  - https://link.springer.com/chapter/10.1007/978-3-030-10997-4_13
ER  - 

TY  - CHAP
T1  - How Data Mining and Machine Learning Evolved from Relational Data Base to Data Science
T2  - A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years
Y1  - 2018
A1  - Amato, G.
A1  - Candela, L.
A1  - Castelli, D.
A1  - Esuli, A.
A1  - Falchi, F.
A1  - Gennaro, C.
A1  - Fosca Giannotti
A1  - Anna Monreale
A1  - Mirco Nanni
A1  - Pagano, P.
A1  - Luca Pappalardo
A1  - Dino Pedreschi
A1  - Francesca Pratesi
A1  - Rabitti, F.
A1  - S Rinzivillo
A1  - Giulio Rossetti
A1  - Salvatore Ruggieri
A1  - Sebastiani, F.
A1  - Tesconi, M.
ED  - Flesca, Sergio
ED  - Greco, Sergio
ED  - Masciari, Elio
ED  - Saccà, Domenico
AB  - During the last 35 years, data management principles such as physical and logical independence, declarative querying and cost-based optimization have led to profound pervasiveness of relational databases in any kind of organization. More importantly, these technical advances have enabled the first round of business intelligence applications and laid the foundation for managing and analyzing Big Data today.
JF  - A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years
PB  - Springer International Publishing
CY  - Cham
SN  - 978-3-319-61893-7
UR  - https://link.springer.com/chapter/10.1007%2F978-3-319-61893-7_17
ER  - 

TY  - JOUR
T1  - The italian music superdiversity
JF  - Multimedia Tools and Applications
Y1  - 2018
A1  - Pollacci, Laura
A1  - Riccardo Guidotti
A1  - Giulio Rossetti
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - Globalization can lead to a growing standardization of musical contents. Using a cross-service multi-level dataset we investigate the actual Italian music scene. The investigation highlights the musical Italian superdiversity both individually analyzing the geographical and lexical dimensions and combining them. Using different kinds of features over the geographical dimension leads to two similar, comparable and coherent results, confirming the strong and essential correlation between melodies and lyrics. The profiles identified are markedly distinct one from another with respect to sentiment, lexicon, and melodic features. Through a novel application of a sentiment spreading algorithm and songs’ melodic features, we are able to highlight discriminant characteristics that violate the standard regional political boundaries, reconfiguring them following the actual musical communicative practices.
UR  - https://link.springer.com/article/10.1007/s11042-018-6511-6
ER  - 

TY  - CONF
T1  - Learning Data Mining
T2  - 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)
Y1  - 2018
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - S Rinzivillo
AB  - In the last decade the usage and study of data mining and machine learning algorithms have received an increasing attention from several and heterogeneous fields of research. Learning how and why a certain algorithm returns a particular result, and understanding which are the main problems connected to its execution is a hot topic in the education of data mining methods. In order to support data mining beginners, students, teachers, and researchers we introduce a novel didactic environment. The Didactic Data Mining Environment (DDME) allows to execute a data mining algorithm on a dataset and to observe the algorithm behavior step by step to learn how and why a certain result is returned. DDME can be practically exploited by teachers and students for having a more interactive learning of data mining. Indeed, on top of the core didactic library, we designed a visual platform that allows online execution of experiments and the visualization of the algorithm steps. The visual platform abstracts the coding activity and makes available the execution of algorithms to non-technicians.
JF  - 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)
UR  - https://ieeexplore.ieee.org/document/8631453
ER  - 

TY  - RPRT
T1  - Local Rule-Based Explanations of Black Box Decision Systems
Y1  - 2018
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - Salvatore Ruggieri
A1  - Dino Pedreschi
A1  - Franco Turini
A1  - Fosca Giannotti
JF  - arXiv preprint arXiv:1805.10820
ER  - 

TY  - JOUR
T1  - NDlib: a python library to model and analyze diffusion processes over complex networks
JF  - International Journal of Data Science and Analytics
Y1  - 2018
A1  - Giulio Rossetti
A1  - Letizia Milli
A1  - S Rinzivillo
A1  - Alina Sirbu
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Nowadays the analysis of dynamics of and on networks represents a hot topic in the social network analysis playground. To support students, teachers, developers and researchers, in this work we introduce a novel framework, namely NDlib, an environment designed to describe diffusion simulations. NDlib is designed to be a multi-level ecosystem that can be fruitfully used by different user segments. For this reason, upon NDlib, we designed a simulation server that allows remote execution of experiments as well as an online visualization tool that abstracts its programmatic interface and makes available the simulation platform to non-technicians.
VL  - 5
UR  - https://link.springer.com/article/10.1007/s41060-017-0086-6
ER  - 

TY  - RPRT
T1  - Open the Black Box Data-Driven Explanation of Black Box Decision Systems
Y1  - 2018
A1  - Dino Pedreschi
A1  - Fosca Giannotti
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - Luca Pappalardo
A1  - Salvatore Ruggieri
A1  - Franco Turini
JF  - arXiv preprint arXiv:1806.09936
ER  - 

TY  - JOUR
T1  - Personalized Market Basket Prediction with Temporal Annotated Recurring Sequences
JF  - IEEE Transactions on Knowledge and Data Engineering
Y1  - 2018
A1  - Riccardo Guidotti
A1  - Giulio Rossetti
A1  - Luca Pappalardo
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - Nowadays, a hot challenge for supermarket chains is to offer personalized services to their customers. Market basket prediction, i.e., supplying the customer a shopping list for the next purchase according to her current needs, is one of these services. Current approaches are not capable of capturing at the same time the different factors influencing the customer's decision process: co-occurrence, sequentiality, periodicity and recurrency of the purchased items. To this aim, we define a pattern Temporal Annotated Recurring Sequence (TARS) able to capture simultaneously and adaptively all these factors. We define the method to extract TARS and develop a predictor for next basket named TBP (TARS Based Predictor) that, on top of TARS, is able to understand the level of the customer's stocks and recommend the set of most necessary items. By adopting the TBP the supermarket chains could crop tailored suggestions for each individual customer which in turn could effectively speed up their shopping sessions. A deep experimentation shows that TARS are able to explain the customer purchase behavior, and that TBP outperforms the state-of-the-art competitors.
UR  - https://ieeexplore.ieee.org/abstract/document/8477157
ER  - 

TY  - JOUR
T1  - PRUDEnce: a system for assessing privacy risk vs utility in data sharing ecosystems
JF  - Transactions on Data Privacy
Y1  - 2018
A1  - Francesca Pratesi
A1  - Anna Monreale
A1  - Roberto Trasarti
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - Yanagihara, Tadashi
AB  - Data describing human activities are an important source of knowledge useful for understanding individual and collective behavior and for developing a wide range of user services. Unfortunately, this kind of data is sensitive, because people’s whereabouts may allow re-identification of individuals in a de-identified database. Therefore, Data Providers, before sharing those data, must apply any sort of anonymization to lower the privacy risks, but they must be aware and capable of controlling also the data quality, since these two factors are often a trade-off. In this paper we propose PRUDEnce (Privacy Risk versus Utility in Data sharing Ecosystems), a system enabling a privacy-aware ecosystem for sharing personal data. It is based on a methodology for assessing both the empirical (not theoretical) privacy risk associated to users represented in the data, and the data quality guaranteed only with users not at risk. Our proposal is able to support the Data Provider in the exploration of a repertoire of possible data transformations with the aim of selecting one specific transformation that yields an adequate trade-off between data quality and privacy risk. We study the practical effectiveness of our proposal over three data formats underlying many services, defined on real mobility data, i.e., presence data, trajectory data and road segment data.
VL  - 11
UR  - http://www.tdp.cat/issues16/tdp.a284a17.pdf
ER  - 

TY  - CONF
T1  - SoBigData: Social Mining & Big Data Ecosystem
T2  - Companion of the The Web Conference 2018 on The Web Conference 2018
Y1  - 2018
A1  - Fosca Giannotti
A1  - Roberto Trasarti
A1  - Bontcheva, Kalina
A1  - Valerio Grossi
AB  - One of the most pressing and fascinating challenges scientists face today, is understanding the complexity of our globally interconnected society. The big data arising from the digital breadcrumbs of human activities has the potential of providing a powerful social microscope, which can help us understand many complex and hidden socio-economic phenomena. Such challenge requires high-level analytics, modeling and reasoning across all the social dimensions above. There is a need to harness these opportunities for scientific advancement and for the social good, compared to the currently prevalent exploitation of big data for commercial purposes or, worse, social control and surveillance. The main obstacle to this accomplishment, besides the scarcity of data scientists, is the lack of a large-scale open ecosystem where big data and social mining research can be carried out. The SoBigData Research Infrastructure (RI) provides an integrated ecosystem for ethic-sensitive scientific discoveries and advanced applications of social data mining on the various dimensions of social life as recorded by "big data". The research community uses the SoBigData facilities as a "secure digital wind-tunnel" for large-scale social data analysis and simulation experiments. SoBigData promotes repeatable and open science and supports data science research projects by providing: i) an ever-growing, distributed data ecosystem for procurement, access and curation and management of big social data, to underpin social data mining research within an ethic-sensitive context; ii) an ever-growing, distributed platform of interoperable, social data mining methods and associated skills: tools, methodologies and services for mining, analysing, and visualising complex and massive datasets, harnessing the techno-legal barriers to the ethically safe deployment of big data for social mining; iii) an ecosystem where protection of personal information and the respect for fundamental human rights can coexist with a safe use of the same information for scientific purposes of broad and central societal interest. SoBigData has a dedicated ethical and legal board, which is implementing a legal and ethical framework.
JF  - Companion of the The Web Conference 2018 on The Web Conference 2018
PB  - International World Wide Web Conferences Steering Committee
UR  - http://www.sobigdata.eu/sites/default/files/www%202018.pdf
ER  - 

TY  - JOUR
T1  - A survey of methods for explaining black box models
JF  - ACM computing surveys (CSUR)
Y1  - 2018
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - Salvatore Ruggieri
A1  - Franco Turini
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - In recent years, many accurate decision support systems have been constructed as black boxes, that is as systems that hide their internal logic to the user. This lack of explanation constitutes both a practical and an ethical issue. The literature reports many approaches aimed at overcoming this crucial weakness, sometimes at the cost of sacrificing accuracy for interpretability. The applications in which black box decision systems can be used are various, and each approach is typically developed to provide a solution for a specific problem and, as a consequence, it explicitly or implicitly delineates its own definition of interpretability and explanation. The aim of this article is to provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box system. Given a problem definition, a black box type, and a desired explanation, this survey should help the researcher to find the proposals more useful for his own work. The proposed classification of approaches to open black box models should also be useful for putting the many research open questions in perspective.
VL  - 51
UR  - https://dl.acm.org/doi/abs/10.1145/3236009
ER  - 

TY  - CONF
T1  - Weak nodes detection in urban transport systems: Planning for resilience in Singapore
T2  - 2018 IEEE 5th international conference on data science and advanced analytics (DSAA)
Y1  - 2018
A1  - Ferretti, Michele
A1  - Barlacchi, Gianni
A1  - Luca Pappalardo
A1  - Lucchini, Lorenzo
A1  - Lepri, Bruno
AB  - The availability of massive data-sets describing human mobility offers the possibility to design simulation tools to monitor and improve the resilience of transport systems in response to traumatic events such as natural and man-made disasters (e.g., floods, terrorist attacks, etc.). In this perspective, we propose ACHILLES, an application to model people's movements in a given transport mode through a multiplex network representation based on mobility data. ACHILLES is a web-based application which provides an easy-to-use interface to explore the mobility fluxes and the connectivity of every urban zone in a city, as well as to visualize changes in the transport system resulting from the addition or removal of transport modes, urban zones, and single stops. Notably, our application allows the user to assess the overall resilience of the transport network by identifying its weakest node, i.e. Urban Achilles Heel, with reference to the ancient Greek mythology. To demonstrate the impact of ACHILLES for humanitarian aid we consider its application to a real-world scenario by exploring human mobility in Singapore in response to flood prevention.
JF  - 2018 IEEE 5th international conference on data science and advanced analytics (DSAA)
PB  - IEEE
UR  - https://ieeexplore.ieee.org/abstract/document/8631413/authors#authors
ER  - 

TY  - CHAP
T1  - Applications for Environmental Sensing in EveryAware
T2  - Participatory Sensing, Opinions and Collective Awareness
Y1  - 2017
A1  - Atzmueller, Martin
A1  - Becker, Martin
A1  - Molino, Andrea
A1  - Mueller, Juergen
A1  - Peters, Jan
A1  - Alina Sirbu
AB  - This chapter provides a technical description of the EveryAware applications for air quality and noise monitoring. Specifically, we introduce AirProbe, for measuring air quality, and WideNoise Plus for estimating environmental noise. We also include an overview of hardware components and smartphone-based measurement technology, and we present the corresponding web backend, e.g., providing for real-time tracking, data storage, analysis and visualizations.
JF  - Participatory Sensing, Opinions and Collective Awareness
PB  - Springer
UR  - http://link.springer.com/chapter/10.1007/978-3-319-25658-0_7
ER  - 

TY  - CONF
T1  - Assessing Privacy Risk in Retail Data
T2  - Personal Analytics and Privacy. An Individual and Collective Perspective - First International Workshop, PAP 2017, Held in Conjunction with ECML PKDD 2017, Skopje, Macedonia, September 18, 2017, Revised Selected Papers
Y1  - 2017
A1  - Roberto Pellungrini
A1  - Francesca Pratesi
A1  - Luca Pappalardo
AB  - Retail data are one of the most requested commodities by commercial companies. Unfortunately, from this data it is possible to retrieve highly sensitive information about individuals. Thus, there exists the need for accurate individual privacy risk evaluation. In this paper, we propose a methodology for assessing privacy risk in retail data. We define the data formats for representing retail data, the privacy framework for calculating privacy risk and some possible privacy attacks for this kind of data. We perform experiments in a real-world retail dataset, and show the distribution of privacy risk for the various attacks.
JF  - Personal Analytics and Privacy. An Individual and Collective Perspective - First International Workshop, PAP 2017, Held in Conjunction with ECML PKDD 2017, Skopje, Macedonia, September 18, 2017, Revised Selected Papers
UR  - https://doi.org/10.1007/978-3-319-71970-2_3
ER  - 

TY  - JOUR
T1  - Assessing the use of mobile phone data to describe recurrent mobility patterns in spatial epidemic models
JF  - Royal Society open science
Y1  - 2017
A1  - Cecilia Panigutti
A1  - Tizzoni, Michele
A1  - Bajardi, Paolo
A1  - Smoreda, Zbigniew
A1  - Colizza, Vittoria
AB  - The recent availability of large-scale call detail record data has substantially improved our ability of quantifying human travel patterns with broad applications in epidemiology. Notwithstanding a number of successful case studies, previous works have shown that using different mobility data sources, such as mobile phone data or census surveys, to parametrize infectious disease models can generate divergent outcomes. Thus, it remains unclear to what extent epidemic modelling results may vary when using different proxies for human movements. Here, we systematically compare 658 000 simulated outbreaks generated with a spatially structured epidemic model based on two different human mobility networks: a commuting network of France extracted from mobile phone data and another extracted from a census survey. We compare epidemic patterns originating from all the 329 possible outbreak seed locations and identify the structural network properties of the seeding nodes that best predict spatial and temporal epidemic patterns to be alike. We find that similarity of simulated epidemics is significantly correlated to connectivity, traffic and population size of the seeding nodes, suggesting that the adequacy of mobile phone data for infectious disease models becomes higher when epidemics spread between highly connected and heavily populated locations, such as large urban areas.
VL  - 4
ER  - 

TY  - JOUR
T1  - Authenticated Outlier Mining for Outsourced Databases
JF  - IEEE Transactions on Dependable and Secure Computing
Y1  - 2017
A1  - Dong, Boxiang
A1  - Hui Wendy Wang
A1  - Anna Monreale
A1  - Dino Pedreschi
A1  - Fosca Giannotti
A1  - W Guo
AB  - The Data-Mining-as-a-Service (DMaS) paradigm is becoming the focus of research, as it allows the data owner (client) who lacks expertise and/or computational resources to outsource their data and mining needs to a third-party service provider (server). Outsourcing, however, raises some issues about result integrity: how could the client verify the mining results returned by the server are both sound and complete? In this paper, we focus on outlier mining, an important mining task. Previous verification techniques use an authenticated data structure (ADS) for correctness authentication, which may incur much space and communication cost. In this paper, we propose a novel solution that returns a probabilistic result integrity guarantee with much cheaper verification cost. The key idea is to insert a set of artificial records (ARs) into the dataset, from which it constructs a set of artificial outliers (AOs) and artificial non-outliers (ANOs). The AOs and ANOs are used by the client to detect any incomplete and/or incorrect mining results with a probabilistic guarantee. The main challenge that we address is how to construct ARs so that they do not change the (non-)outlierness of original records, while guaranteeing that the client can identify ANOs and AOs without executing mining. Furthermore, we build a strategic game and show that a Nash equilibrium exists only when the server returns correct outliers. Our implementation and experiments demonstrate that our verification solution is efficient and lightweight.
UR  - https://ieeexplore.ieee.org/document/8048342/
ER  - 

TY  - CONF
T1  - Clustering Individual Transactional Data for Masses of Users
T2  - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Y1  - 2017
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - Mirco Nanni
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - Mining a large number of datasets recording human activities for making sense of individual data is the key enabler of a new wave of personalized knowledge-based services. In this paper we focus on the problem of clustering individual transactional data for a large mass of users. Transactional data is a very pervasive kind of information that is collected by several services, often involving huge pools of users. We propose txmeans, a parameter-free clustering algorithm able to efficiently partition transactional data in a completely automatic way. Txmeans is designed for the case where clustering must be applied on a massive number of different datasets, for instance when a large set of users need to be analyzed individually and each of them has generated a long history of transactions. A deep experimentation on both real and synthetic datasets shows the practical effectiveness of txmeans for the mass clustering of different personal datasets, and suggests that txmeans outperforms existing methods in terms of quality and efficiency. Finally, we present a personal cart assistant application based on txmeans.
JF  - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PB  - ACM
ER  - 

TY  - JOUR
T1  - A Data Mining Approach to Assess Privacy Risk in Human Mobility Data
JF  - ACM Trans. Intell. Syst. Technol.
Y1  - 2017
A1  - Roberto Pellungrini
A1  - Luca Pappalardo
A1  - Francesca Pratesi
A1  - Anna Monreale
AB  - Human mobility data are an important proxy to understand human mobility dynamics, develop analytical services, and design mathematical models for simulation and what-if analysis. Unfortunately mobility data are very sensitive since they may enable the re-identification of individuals in a database. Existing frameworks for privacy risk assessment provide data providers with tools to control and mitigate privacy risks, but they suffer two main shortcomings: (i) they have a high computational complexity; (ii) the privacy risk must be recomputed every time new data records become available and for every selection of individuals, geographic areas, or time windows. In this article, we propose a fast and flexible approach to estimate privacy risk in human mobility data. The idea is to train classifiers to capture the relation between individual mobility patterns and the level of privacy risk of individuals. We show the effectiveness of our approach by an extensive experiment on real-world GPS data in two urban areas and investigate the relations between human mobility patterns and the privacy risk of individuals.
VL  - 9
UR  - http://doi.acm.org/10.1145/3106774
ER  - 

TY  - UNPB
T1  - Data Science a Game-changer for Science and Innovation
Y1  - 2017
A1  - Fabio Beltram
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - Digital technology is ubiquitous and very much part of public and private organizations and of individuals’ lives. People and things are becoming increasingly interconnected. Smartphones, smart buildings, smart factories, smart cities, autonomous vehicles and other smart environments and devices are filled with digital sensors, all of them creating an abundance of data. Governance and health care collect, generate and use data in an unprecedented quantity. New high-throughput scientific instruments and methods, like telescopes, satellites, accelerators, supercomputers, sensor networks and gene sequencing methods as well as large scale simulations generate massive amounts of data. Often referred to as data deluge, or Big Data, massive datasets revolutionize the way research is carried out, resulting in the emergence of a new, fourth paradigm of science based on data-intensive computing and data driven discovery. Accordingly, the path to the solution of the problem of sustainable development will lead through Big Data, as maintaining the whole complexity of our modern society, including communication and traffic services, manufacturing, trade and commerce, financial services, health security, science, education and policy making requires this novel approach. The new availability of huge amounts of data, along with advanced tools of exploratory data analysis, data mining/machine learning, and data visualization, and scalable infrastructures, has produced a spectacular change in the scientific method: all this is Data Science. This paper describes the main issues around Data Science as it will play out in the coming years in science and society. It focuses on the scientific, technical and ethical challenges (A), on its role for disruptive innovation for science, industry, policy and people (B), on its scientific, technological and educational challenges (C) and finally, on the quantitative expectations of its economic impact (D). In our work we could count on many reports and studies on the subject, particularly on the BDVA and ERCIM reports.
PB  - G7 Academy
ER  - 

TY  - JOUR
T1  - Data-driven generation of spatio-temporal routines in human mobility
JF  - Data Mining and Knowledge Discovery
Y1  - 2017
A1  - Luca Pappalardo
A1  - Filippo Simini
AB  - The generation of realistic spatio-temporal trajectories of human mobility is of fundamental importance in a wide range of applications, such as the development of protocols for mobile ad-hoc networks or what-if analysis in urban ecosystems. Current generative algorithms fail in accurately reproducing the individuals' recurrent schedules and at the same time in accounting for the possibility that individuals may break the routine during periods of variable duration. In this article we present Ditras (DIary-based TRAjectory Simulator), a framework to simulate the spatio-temporal patterns of human mobility. Ditras operates in two steps: the generation of a mobility diary and the translation of the mobility diary into a mobility trajectory. We propose a data-driven algorithm which constructs a diary generator from real data, capturing the tendency of individuals to follow or break their routine. We also propose a trajectory generator based on the concept of preferential exploration and preferential return. We instantiate Ditras with the proposed diary and trajectory generators and compare the resulting algorithm with real data and synthetic data produced by other generative algorithms, built by instantiating Ditras with several combinations of diary and trajectory generators. We show that the proposed algorithm reproduces the statistical properties of real trajectories in the most accurate way, making a step forward in the understanding of the origin of the spatio-temporal patterns of human mobility.
UR  - https://doi.org/10.1007/s10618-017-0548-4
ER  - 

TY  - JOUR
T1  - Discovering and Understanding City Events with Big Data: The Case of Rome
JF  - Information
Y1  - 2017
A1  - Barbara Furletti
A1  - Roberto Trasarti
A1  - Paolo Cintia
A1  - Lorenzo Gabrielli
AB  - The increasing availability of large amounts of data and digital footprints has given rise to ambitious research challenges in many fields, which span from medical research, financial and commercial world, to people and environmental monitoring. Whereas traditional data sources and census fail in capturing actual and up-to-date behaviors, Big Data integrate the missing knowledge providing useful and hidden information to analysts and decision makers. With this paper, we focus on the identification of city events by analyzing mobile phone data (Call Detail Record), and we study and evaluate the impact of these events over the typical city dynamics. We present an analytical process able to discover, understand and characterize city events from Call Detail Record, designing a distributed computation to implement Sociometer, that is a profiling tool to categorize phone users. The methodology provides a useful tool for city mobility managers to manage the events and make future decisions on specific classes of users, i.e., residents, commuters and tourists.
VL  - 8
UR  - https://doi.org/10.3390/info8030074
ER  - 

TY  - CONF
T1  - Dynamic community analysis in decentralized online social networks
T2  - European Conference on Parallel Processing
Y1  - 2017
A1  - Guidi, Barbara
A1  - Michienzi, Andrea
A1  - Giulio Rossetti
JF  - European Conference on Parallel Processing
PB  - Springer
ER  - 

TY  - JOUR
T1  - Efficiently Clustering Very Large Attributed Graphs
JF  - arXiv preprint arXiv:1703.08590
Y1  - 2017
A1  - Alessandro Baroni
A1  - Conte, Alessio
A1  - Patrignani, Maurizio
A1  - Salvatore Ruggieri
AB  - Attributed graphs model real networks by enriching their nodes with attributes accounting for properties. Several techniques have been proposed for partitioning these graphs into clusters that are homogeneous with respect to both semantic attributes and to the structure of the graph. However, time and space complexities of state of the art algorithms limit their scalability to medium-sized graphs. We propose SToC (for Semantic-Topological Clustering), a fast and scalable algorithm for partitioning large attributed graphs. The approach is robust, being compatible both with categorical and with quantitative attributes, and it is tailorable, allowing the user to weight the semantic and topological components. Further, the approach does not require the user to guess in advance the number of clusters. SToC relies on well known approximation techniques such as bottom-k sketches, traditional graph-theoretic concepts, and a new perspective on the composition of heterogeneous distance measures. Experimental results demonstrate its ability to efficiently compute high-quality partitions of large scale attributed graphs.
ER  - 

TY  - JOUR
T1  - An empirical verification of a-priori learning models on mailing archives in the context of online learning activities of participants in free\libre open source software (FLOSS) communities
JF  - Education and Information Technologies
Y1  - 2017
A1  - Mukala, Patrick
A1  - Cerone, Antonio
A1  - Franco Turini
AB  - Free\Libre Open Source Software (FLOSS) environments are increasingly dubbed as learning environments where practical software engineering skills can be acquired. Numerous studies have extensively investigated how knowledge is acquired in these environments through a collaborative learning model that defines a learning process. Such a learning process, identified either as a result of surveys or by means of questionnaires, can be depicted through a series of graphical representations indicating the steps FLOSS community members go through as they acquire and exchange skills. These representations are referred to as a-priori learning models. They are Petri net-like workflow nets (WF-net) that provide a visual representation of the learning process as it is expected to occur. These models are representations of a learning framework or paradigm in FLOSS communities. As such, the credibility of any models is estimated through a process of model verification and validation. Therefore in this paper, we analyze these models in comparison with the real behavior captured in FLOSS repositories by means of conformance verification in process mining. The purpose of our study is twofold. Firstly, the results of our analysis provide insights on the possible discrepancies that are observed between the initial theoretical representations of learning processes and the real behavior captured in FLOSS event logs, constructed from mailing archives. Secondly, this comparison helps foster the understanding of how learning actually takes place in FLOSS environments based on empirical evidence directly from the data.
VL  - 22
UR  - https://link.springer.com/article/10.1007/s10639-017-9573-6
ER  - 

TY  - CONF
T1  - Enumerating Distinct Decision Trees
T2  - International Conference on Machine Learning
Y1  - 2017
A1  - Salvatore Ruggieri
AB  - The search space for the feature selection problem in decision tree learning is the lattice of subsets of the available features. We provide an exact enumeration procedure of the subsets that lead to all and only the distinct decision trees. The procedure can be adopted to prune the search space of complete and heuristics search methods in wrapper models for feature selection. Based on this, we design a computational optimization of the sequential backward elimination heuristics with a performance improvement of up to 100X.
JF  - International Conference on Machine Learning
UR  - http://proceedings.mlr.press/v70/ruggieri17a.html
ER  - 

TY  - CONF
T1  - On the Equivalence Between Community Discovery and Clustering
T2  - International Conference on Smart Objects and Technologies for Social Good
Y1  - 2017
A1  - Riccardo Guidotti
A1  - Michele Coscia
JF  - International Conference on Smart Objects and Technologies for Social Good
PB  - Springer, Cham
ER  - 

TY  - CHAP
T1  - Experimental Assessment of the Emergence of Awareness and Its Influence on Behavioral Changes: The Everyaware Lesson
T2  - Participatory Sensing, Opinions and Collective Awareness
Y1  - 2017
A1  - Pietro Gravino
A1  - Alina Sirbu
A1  - Becker, Martin
A1  - Vito D P Servedio
A1  - Vittorio Loreto
AB  - The emergence of awareness is deeply connected to the process of learning. In fact, by learning that high sound levels may harm one’s health, that noise levels that we estimate as innocuous may be dangerous, that there exists an alternative path we can walk to go to work and minimize our exposure to air pollution, etc., citizens will be able to understand the environment around them and act consequently to go toward a more sustainable world.
JF  - Participatory Sensing, Opinions and Collective Awareness
PB  - Springer
UR  - http://link.springer.com/chapter/10.1007/978-3-319-25658-0_16
ER  - 

TY  - ABST
T1  - Fast Estimation of Privacy Risk in Human Mobility Data
Y1  - 2017
A1  - Roberto Pellungrini
A1  - Luca Pappalardo
A1  - Francesca Pratesi
A1  - Anna Monreale
AB  - Mobility data are an important proxy to understand the patterns of human movements, develop analytical services and design models for simulation and prediction of human dynamics. Unfortunately mobility data are also very sensitive, since they may contain personal information about the individuals involved. Existing frameworks for privacy risk assessment enable the data providers to quantify and mitigate privacy risks, but they suffer two main limitations: (i) they have a high computational complexity; (ii) the privacy risk must be re-computed for each new set of individuals, geographic areas or time windows. In this paper we explore a fast and flexible solution to estimate privacy risk in human mobility data, using predictive models to capture the relation between an individual’s mobility patterns and her privacy risk. We show the effectiveness of our approach by experimentation on a real-world GPS dataset and provide a comparison with traditional methods.
SN  - 978-3-319-66283-1
ER  - 

TY  - JOUR
T1  - Forecasting success via early adoptions analysis: A data-driven study
JF  - PloS one
Y1  - 2017
A1  - Giulio Rossetti
A1  - Letizia Milli
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - Innovations are continuously launched over markets, such as new products over the retail market or new artists over the music scene. Some innovations become a success; others don’t. Forecasting which innovations will succeed at the beginning of their lifecycle is hard. In this paper, we provide a data-driven, large-scale account of the existence of a special niche among early adopters, individuals that consistently tend to adopt successful innovations before they reach success: we will call them Hit-Savvy. Hit-Savvy can be discovered in very different markets and retain over time their ability to anticipate the success of innovations. As our second contribution, we devise a predictive analytical process, exploiting Hit-Savvy as signals, which achieves high accuracy in the early-stage prediction of successful innovations, far beyond the reach of state-of-the-art time series forecasting models. Indeed, our findings and predictive model can be fruitfully used to support marketing strategies and product placement.
VL  - 12
ER  - 

TY  - CONF
T1  - The Fractal Dimension of Music: Geography, Popularity and Sentiment Analysis
T2  - International Conference on Smart Objects and Technologies for Social Good
Y1  - 2017
A1  - Pollacci, Laura
A1  - Riccardo Guidotti
A1  - Giulio Rossetti
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - Nowadays there is a growing standardization of musical contents. Our finding comes out from a cross-service multi-level dataset analysis where we study how geography affects the music production. The investigation presented in this paper highlights the existence of a “fractal” musical structure that relates the technical characteristics of the music produced at regional, national and world level. Moreover, a similar structure emerges also when we analyze the musicians’ popularity and the polarity of their songs defined as the mood that they are able to convey. Furthermore, the clusters identified are markedly distinct one from another with respect to popularity and sentiment.
JF  - International Conference on Smart Objects and Technologies for Social Good
PB  - Springer, Cham
UR  - https://link.springer.com/chapter/10.1007/978-3-319-76111-4_19
ER  - 

TY  - JOUR
T1  - HyWare: a HYbrid Workflow lAnguage for Research E-infrastructures
JF  - D-Lib Magazine
Y1  - 2017
A1  - Leonardo Candela
A1  - Paolo Manghi
A1  - Fosca Giannotti
A1  - Valerio Grossi
A1  - Roberto Trasarti
AB  - Research e-infrastructures are "systems of systems", patchworks of tools, services and data sources, evolving over time to address the needs of the scientific process. Accordingly, in such environments, researchers implement their scientific processes by means of workflows made of a variety of actions, including for example usage of web services, download and execution of shared software libraries or tools, or local and manual manipulation of data. Although scientists may benefit from sharing their scientific process, the heterogeneity underpinning e-infrastructures hinders their ability to represent, share and eventually reproduce such workflows. This work presents HyWare, a language for representing scientific processes in highly-heterogeneous e-infrastructures in terms of so-called hybrid workflows. HyWare lies in between "business process modeling languages", which offer a formal and high-level description of a reasoning, protocol, or procedure, and "workflow execution languages", which enable the fully automated execution of a sequence of computational steps via dedicated engines.
VL  - 23
UR  - http://dx.doi.org/10.1045/january2017-candela
ER  - 

TY  - JOUR
T1  - ICON Loop Carpooling Show Case
JF  - Data Mining and Constraint Programming: Foundations of a Cross-Disciplinary Approach
Y1  - 2017
A1  - Mirco Nanni
A1  - Lars Kotthoff
A1  - Riccardo Guidotti
A1  - Barry O'Sullivan
A1  - Dino Pedreschi
AB  - In this chapter we describe a proactive carpooling service that combines induction and optimization mechanisms to maximize the impact of carpooling within a community. The approach autonomously infers the mobility demand of the users through the analysis of their mobility traces (i.e. Data Mining of GPS trajectories) and builds the network of all possible ride sharing opportunities among the users. Then, the maximal set of carpooling matches that satisfy some standard requirements (maximal capacity of vehicles, etc.) is computed through Constraint Programming models, and the resulting matches are proactively proposed to the users. Finally, in order to maximize the expected impact of the service, the probability that each carpooling match is accepted by the users involved is inferred through Machine Learning mechanisms and put in the CP model. The whole process is reiterated at regular intervals, thus forming an instance of the general ICON loop.
VL  - 10101
UR  - https://link.springer.com/content/pdf/10.1007/978-3-319-50137-6.pdf#page=314
ER  - 

TY  - JOUR
T1  - The Inductive Constraint Programming Loop
JF  - Data Mining and Constraint Programming: Foundations of a Cross-Disciplinary Approach
Y1  - 2017
A1  - Mirco Nanni
A1  - Siegfried Nijssen
A1  - Barry O'Sullivan
A1  - Paparrizou, Anastasia
A1  - Dino Pedreschi
A1  - Simonis, Helmut
AB  - Constraint programming is used for a variety of real-world optimization problems, such as planning, scheduling and resource allocation problems. At the same time, one continuously gathers vast amounts of data about these problems. Current constraint programming software does not exploit such data to update schedules, resources and plans. We propose a new framework, that we call the Inductive Constraint Programming (ICON) loop. In this approach data is gathered and analyzed systematically in order to dynamically revise and adapt constraints and optimization criteria. Inductive Constraint Programming aims at bridging the gap between the areas of data mining and machine learning on the one hand, and constraint programming on the other.
VL  - 10101
UR  - https://link.springer.com/content/pdf/10.1007/978-3-319-50137-6.pdf#page=307
ER  - 

TY  - JOUR
T1  - The Inductive Constraint Programming Loop
JF  - IEEE Intelligent Systems
Y1  - 2017
A1  - Bessiere, Christian
A1  - De Raedt, Luc
A1  - Tias Guns
A1  - Lars Kotthoff
A1  - Mirco Nanni
A1  - Siegfried Nijssen
A1  - Barry O'Sullivan
A1  - Paparrizou, Anastasia
A1  - Dino Pedreschi
A1  - Simonis, Helmut
AB  - Constraint programming is used for a variety of real-world optimization problems, such as planning, scheduling and resource allocation problems. At the same time, one continuously gathers vast amounts of data about these problems. Current constraint programming software does not exploit such data to update schedules, resources and plans. We propose a new framework, which we call the inductive constraint programming loop. In this approach data is gathered and analyzed systematically in order to dynamically revise and adapt constraints and optimization criteria. Inductive Constraint Programming aims at bridging the gap between the areas of data mining and machine learning on the one hand, and constraint programming on the other.
ER  - 

TY  - CONF
T1  - Information diffusion in complex networks: The active/passive conundrum
T2  - International Workshop on Complex Networks and their Applications
Y1  - 2017
A1  - Letizia Milli
A1  - Giulio Rossetti
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Ideas, information, viruses: all of them, with their mechanisms, can spread over the complex social tissues described by our interpersonal relations. Classical spreading models are agnostic to the object whose diffusion they simulate, thus considering spreading of virus, ideas and innovations alike. Indeed, such simplification makes it easier to define a standard set of tools that can be applied to heterogeneous contexts; however, it can also lead to biased, partial, simulation outcomes. In this work we discuss the concepts of active and passive diffusion: moving from analysis of a well-known passive model, the Threshold one, we introduce two novel approaches whose aim is to provide active and mixed schemas applicable in the context of innovations/ideas diffusion simulation. Our data-driven analysis shows how, in such context, the adoption of exclusively passive/active models leads to conflicting results, thus highlighting the need for mixed approaches.
JF  - International Workshop on Complex Networks and their Applications
PB  - Springer
UR  - https://link.springer.com/chapter/10.1007/978-3-319-72150-7_25
ER  - 

TY  - CHAP
T1  - Large Scale Engagement Through Web-Gaming and Social Computations
T2  - Participatory Sensing, Opinions and Collective Awareness
Y1  - 2017
A1  - Vito D P Servedio
A1  - Saverio Caminiti
A1  - Pietro Gravino
A1  - Vittorio Loreto
A1  - Alina Sirbu
A1  - Francesca Tria
AB  - In the last few years the Web has progressively acquired the status of an infrastructure for social computation that allows researchers to coordinate the cognitive abilities of human agents, so as to steer the collective user activity towards predefined goals. This general trend is also triggering the adoption of web-games as an alternative laboratory to run experiments in the social sciences and whenever the contribution of human beings can be effectively used for research purposes. Web-games introduce a playful aspect in scientific experiments with the result of increasing participation of people and of keeping their attention steady in time. The aim of this chapter is to suggest a general purpose web-based platform scheme for web-gaming and social computation. This platform will simplify the realization of web-games and will act as a repository of different scientific experiments, thus realizing a sort of showcase that stimulates users’ curiosity and helps researchers in recruiting volunteers. A platform built by following these criteria has been developed within the EveryAware project, the Experimental Tribe (XTribe) platform, which is operational and ready to be used. Finally, a sample web-game hosted by the XTribe platform will be presented with the aim of reporting the results, in terms of participation and motivation, of two different player recruiting strategies.
JF  - Participatory Sensing, Opinions and Collective Awareness
PB  - Springer
UR  - http://link.springer.com/chapter/10.1007/978-3-319-25658-0_12
ER  - 

TY  - CONF
T1  - Market Basket Prediction using User-Centric Temporal Annotated Recurring Sequences
T2  - 2017 IEEE International Conference on Data Mining (ICDM)
Y1  - 2017
A1  - Riccardo Guidotti
A1  - Giulio Rossetti
A1  - Luca Pappalardo
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - Nowadays, a hot challenge for supermarket chains is to offer personalized services to their customers. Market basket prediction, i.e., supplying the customer a shopping list for the next purchase according to her current needs, is one of these services. Current approaches are not capable of capturing at the same time the different factors influencing the customer’s decision process: co-occurrence, sequentiality, periodicity and recurrency of the purchased items. To this aim, we define a pattern named Temporal Annotated Recurring Sequence (TARS). We define the method to extract TARS and develop a predictor for next basket named TBP (TARS Based Predictor) that, on top of TARS, is able to understand the level of the customer’s stocks and recommend the set of most necessary items. A deep experimentation shows that TARS can explain the customers’ purchase behavior, and that TBP outperforms the state-of-the-art competitors.
JF  - 2017 IEEE International Conference on Data Mining (ICDM)
PB  - IEEE
ER  - 

TY  - CONF
T1  - Movement Behaviour Recognition for Water Activities
T2  - Personal Analytics and Privacy. An Individual and Collective Perspective - First International Workshop, PAP 2017, Held in Conjunction with ECML PKDD 2017, Skopje, Macedonia, September 18, 2017, Revised Selected Papers
Y1  - 2017
A1  - Mirco Nanni
A1  - Roberto Trasarti
A1  - Fosca Giannotti
JF  - Personal Analytics and Privacy. An Individual and Collective Perspective - First International Workshop, PAP 2017, Held in Conjunction with ECML PKDD 2017, Skopje, Macedonia, September 18, 2017, Revised Selected Papers
UR  - https://doi.org/10.1007/978-3-319-71970-2_7
ER  - 

TY  - JOUR
T1  - MyWay: Location prediction via mobility profiling
JF  - Information Systems
Y1  - 2017
A1  - Roberto Trasarti
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - Fosca Giannotti
AB  - Forecasting the future positions of mobile users is a valuable task allowing us to operate efficiently a myriad of different applications which need this type of information. We propose MyWay, a prediction system which exploits the individual systematic behaviors modeled by mobility profiles to predict human movements. MyWay provides three strategies: the individual strategy uses only the user's individual mobility profile, the collective strategy takes advantage of all users' individual systematic behaviors, and the hybrid strategy combines the previous two. A key point is that MyWay only requires the sharing of individual mobility profiles, a concise representation of the user's movements, instead of raw trajectory data revealing the detailed movement of the users. We evaluate the prediction performance of our proposal by a deep experimentation on large real-world data. The results highlight that the synergy between the individual and collective knowledge is the key for a better prediction and allows the system to outperform the state-of-the-art methods.
VL  - 64
ER  - 

TY  - JOUR
T1  - NDlib: a python library to model and analyze diffusion processes over complex networks
JF  - International Journal of Data Science and Analytics
Y1  - 2017
A1  - Giulio Rossetti
A1  - Letizia Milli
A1  - S Rinzivillo
A1  - Alina Sirbu
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Nowadays the analysis of dynamics of and on networks represents a hot topic in the social network analysis playground. To support students, teachers, developers and researchers, in this work we introduce a novel framework, namely NDlib, an environment designed to describe diffusion simulations. NDlib is designed to be a multi-level ecosystem that can be fruitfully used by different user segments. For this reason, upon NDlib, we designed a simulation server that allows remote execution of experiments as well as an online visualization tool that abstracts its programmatic interface and makes available the simulation platform to non-technicians.
ER  - 

TY  - CONF
T1  - NDlib: Studying Network Diffusion Dynamics
T2  - IEEE International Conference on Data Science and Advanced Analytics, DSAA
Y1  - 2017
A1  - Giulio Rossetti
A1  - Letizia Milli
A1  - S Rinzivillo
A1  - Alina Sirbu
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Nowadays the analysis of diffusive phenomena occurring on top of complex networks represents a hot topic in the Social Network Analysis playground. In order to support students, teachers, developers and researchers, in this work we introduce a novel simulation framework, NDlib. NDlib is designed to be a multi-level ecosystem that can be fruitfully used by different user segments. Upon the diffusion library, we designed a simulation server that allows remote execution of experiments and an online visualization tool that abstracts the programmatic interface and makes available the simulation platform to non-technicians.
JF  - IEEE International Conference on Data Science and Advanced Analytics, DSAA
CY  - Tokyo
UR  - https://ieeexplore.ieee.org/abstract/document/8259774
ER  - 

TY  - JOUR
T1  - Never drive alone: Boosting carpooling with network analysis
JF  - Information Systems
Y1  - 2017
A1  - Riccardo Guidotti
A1  - Mirco Nanni
A1  - S Rinzivillo
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Carpooling, i.e., the act where two or more travelers share the same car for a common trip, is one of the possibilities brought forward to reduce traffic and its externalities, but experience shows that it is difficult to boost the adoption of carpooling to significant levels. In our study, we analyze the potential impact of carpooling as a collective phenomenon emerging from people's mobility, by network analytics. Based on big mobility data from travelers in a given territory, we construct the network of potential carpooling, where nodes correspond to the users and links to possible shared trips, and analyze the structural and topological properties of this network, such as network communities and node ranking, with the purpose of highlighting the subpopulations with higher chances to create a carpooling community, and the propensity of users to be either drivers or passengers in a shared car. Our study is anchored to reality thanks to a large mobility dataset, consisting of the complete one-month-long GPS trajectories of approx. 10% of circulating cars in Tuscany. We also analyze the aggregated outcome of carpooling by means of empirical simulations, showing how an assignment policy exploiting the network analytic concepts of communities and node rankings minimizes the number of single occupancy vehicles observed after carpooling.
VL  - 64
ER  - 

TY  - JOUR
T1  - Next Basket Prediction using Recurring Sequential Patterns
JF  - arXiv preprint arXiv:1702.07158
Y1  - 2017
A1  - Riccardo Guidotti
A1  - Giulio Rossetti
A1  - Luca Pappalardo
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - Nowadays, a hot challenge for supermarket chains is to offer personalized services for their customers. Next basket prediction, i.e., supplying the customer a shopping list for the next purchase according to her current needs, is one of these services. Current approaches are not capable of capturing at the same time the different factors influencing the customer's decision process: co-occurrence, sequentiality, periodicity and recurrency of the purchased items. To this aim, we define a pattern, Temporal Annotated Recurring Sequence (TARS), able to capture simultaneously and adaptively all these factors. We define the method to extract TARS and develop a predictor for next basket named TBP (TARS Based Predictor) that, on top of TARS, is able to understand the level of the customer's stocks and recommend the set of most necessary items. By adopting the TBP the supermarket chains could crop tailored suggestions for each individual customer which in turn could effectively speed up their shopping sessions. A deep experimentation shows that TARS are able to explain the customer purchase behavior, and that TBP outperforms the state-of-the-art competitors.
UR  - https://arxiv.org/abs/1702.07158
ER  - 

TY  - JOUR
T1  - Node-centric Community Discovery: From static to dynamic social network analysis
JF  - Online Social Networks and Media
Y1  - 2017
A1  - Giulio Rossetti
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Nowadays, online social networks represent privileged playgrounds that enable researchers to study, characterize and understand complex human behaviors. Social Network Analysis, commonly known as SNA, is the multidisciplinary field of research under which researchers of different backgrounds perform their studies: one of the hottest topics in such diversified context is indeed Community Discovery. Clustering individuals, whose relations are described by a networked structure, into homogeneous communities is a complex task required by several analytical processes. Moreover, due to the user-centric and dynamic nature of online social services, during the last decades, particular emphasis was dedicated to the definition of node-centric, overlapping and evolutive Community Discovery methodologies. In this paper we provide a comprehensive and concise review of the main results, both algorithmic and analytical, we obtained in this field. Moreover, to better underline the rationale behind our research activity on Community Discovery, in this work we provide a synthetic review of the relevant literature, discussing not only methodological results but also analytical ones.
VL  - 3
UR  - https://www.sciencedirect.com/science/article/abs/pii/S2468696417301052
ER  - 

TY  - CHAP
T1  - Opinion dynamics: models, extensions and external effects
T2  - Participatory Sensing, Opinions and Collective Awareness
Y1  - 2017
A1  - Alina Sirbu
A1  - Vittorio Loreto
A1  - Vito D P Servedio
A1  - Francesca Tria
AB  - Recently, social phenomena have received a lot of attention not only from social scientists, but also from physicists, mathematicians and computer scientists, in the emerging interdisciplinary field of complex system science. Opinion dynamics is one of the processes studied, since opinions are the drivers of human behaviour, and play a crucial role in many global challenges that our complex world and societies are facing: global financial crises, global pandemics, growth of cities, urbanisation and migration patterns, and last but not least important, climate change and environmental sustainability and protection. Opinion formation is a complex process affected by the interplay of different elements, including the individual predisposition, the influence of positive and negative peer interaction (social networks playing a crucial role in this respect), the information each individual is exposed to, and many others. Several models inspired from those in use in physics have been developed to encompass many of these elements, and to allow for the identification of the mechanisms involved in the opinion formation process and the understanding of their role, with the practical aim of simulating opinion formation and spreading under various conditions. These modelling schemes range from binary simple models such as the voter model, to multi-dimensional continuous approaches. Here, we provide a review of recent methods, focusing on models employing both peer interaction and external information, and emphasising the role that less studied mechanisms, such as disagreement, have in driving the opinion dynamics. Due to the important role that external information (mainly in the form of mass media broadcast) can have in enhancing awareness of social issues, a special emphasis will be devoted to studying the different forms it can take, investigating their effectiveness in driving the opinion formation at the population level. The review shows that, although a large number of approaches exist, some mechanisms such as the effect of multiple external information sources could largely benefit from further studies. Additionally, model validation with real data, which are starting to become available, is still largely lacking and should in our opinion be the main ambition of future investigations.
JF  - Participatory Sensing, Opinions and Collective Awareness
PB  - Springer
UR  - http://link.springer.com/chapter/10.1007/978-3-319-25658-0_17
ER  - 

TY  - CONF
T1  - Privacy Preserving Multidimensional Profiling
T2  - International Conference on Smart Objects and Technologies for Social Good
Y1  - 2017
A1  - Francesca Pratesi
A1  - Anna Monreale
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - Recently, big data has become central in the analysis of human behavior and the development of innovative services. In particular, a new class of services is emerging, taking advantage of different sources of data, in order to consider the multiple aspects of human beings. Unfortunately, these data can lead to re-identification problems and other privacy leaks, as diffusely reported in both scientific literature and media. The risk is even more pressing if multiple sources of data are linked together since a potential adversary could know information related to each dataset. For this reason, it is necessary to evaluate accurately and mitigate the individual privacy risk before releasing personal data. In this paper, we propose a methodology for the first task, i.e., assessing privacy risk, in a multidimensional scenario, defining some possible privacy attacks and simulating them using real-world datasets.
JF  - International Conference on Smart Objects and Technologies for Social Good
PB  - Springer
UR  - https://link.springer.com/chapter/10.1007/978-3-319-76111-4_15
ER  - 

TY  - JOUR
T1  - Quantifying the relation between performance and success in soccer
JF  - Advances in Complex Systems
Y1  - 2017
A1  - Luca Pappalardo
A1  - Paolo Cintia
AB  - The availability of massive data about sports activities offers nowadays the opportunity to quantify the relation between performance and success. In this study, we analyze more than 6000 games and 10 million events in six European leagues and investigate this relation in soccer competitions. We discover that a team’s position in a competition’s final ranking is significantly related to its typical performance, as described by a set of technical features extracted from the soccer data. Moreover, we find that, while victory and defeats can be explained by the team’s performance during a game, it is difficult to detect draws by using a machine learning approach. We then simulate the outcomes of an entire season of each league only relying on technical data and exploiting a machine learning model trained on data from past seasons. The simulation produces a team ranking which is similar to the actual ranking, suggesting that a complex systems’ view on soccer has the potential of revealing hidden patterns regarding the relation between performance and success.
UR  - http://www.worldscientific.com/doi/abs/10.1142/S021952591750014X
ER  - 

TY  - CONF
T1  - Recognizing Residents and Tourists with Retail Data Using Shopping Profiles
T2  - International Conference on Smart Objects and Technologies for Social Good
Y1  - 2017
A1  - Riccardo Guidotti
A1  - Lorenzo Gabrielli
AB  - The huge quantity of personal data stored by service providers registering customers’ daily life enables the analysis of individual fingerprints characterizing the customers’ behavioral profiles. We propose a framework for recognizing residents, tourists and occasional shoppers among the customers of a retail market chain. We employ our recognition framework on a real massive dataset containing the shopping transactions of more than one million customers, and we identify representative temporal shopping profiles for residents, tourists and occasional customers. Our experiments show that even though residents are about 33% of the customers they are responsible for more than 90% of the expenditure. We statistically validate the number of residents and tourists with national official statistics, enabling in this way the adoption of our recognition framework for the development of novel services and analysis.
JF  - International Conference on Smart Objects and Technologies for Social Good
PB  - Springer
UR  - https://link.springer.com/chapter/10.1007/978-3-319-76111-4_35
ER  - 

TY  - JOUR
T1  - Scalable and flexible clustering solutions for mobile phone-based population indicators
JF  - International Journal of Data Science and Analytics
Y1  - 2017
A1  - Alessandro Lulli
A1  - Lorenzo Gabrielli
A1  - Patrizio Dazzi
A1  - Matteo Dell'Amico
A1  - Pietro Michiardi
A1  - Mirco Nanni
A1  - Laura Ricci
VL  - 4
UR  - https://doi.org/10.1007/s41060-017-0065-y
ER  - 

TY  - JOUR
T1  - Segregation discovery in a social network of companies
JF  - Journal of Intelligent Information Systems
Y1  - 2017
A1  - Alessandro Baroni
A1  - Salvatore Ruggieri
AB  - We introduce a framework for the data-driven analysis of social segregation of minority groups, and challenge it on a complex scenario. The framework builds on quantitative measures of segregation, called segregation indexes, proposed in the social science literature. The segregation discovery problem is introduced, which consists of searching sub-groups of population and minorities for which a segregation index is above a minimum threshold. A search algorithm is devised that solves the segregation problem by computing a multi-dimensional data cube that can be explored by the analyst. The machinery underlying the search algorithm relies on frequent itemset mining concepts and tools. The framework is challenged on a case study in the context of company networks. We analyse segregation on the grounds of sex and age for directors in the boards of the Italian companies. The network includes 2.15M companies and 3.63M directors.
UR  - https://doi.org/10.1007/s10844-017-0485-0
ER  - 

TY  - CONF
T1  - Sentiment Spreading: An Epidemic Model for Lexicon-Based Sentiment Analysis on Twitter
T2  - Conference of the Italian Association for Artificial Intelligence
Y1  - 2017
A1  - Pollacci, Laura
A1  - Alina Sirbu
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - Claudio Lucchese
A1  - Muntean, Cristina Ioana
AB  - While sentiment analysis has received significant attention in the last years, problems still exist when tools need to be applied to microblogging content. This is because, typically, the text to be analysed consists of very short messages lacking in structure and semantic context. At the same time, the amount of text produced by online platforms is enormous. So, one needs simple, fast and effective methods in order to be able to efficiently study sentiment in these data. Lexicon-based methods, which use a predefined dictionary of terms tagged with sentiment valences to evaluate sentiment in longer sentences, can be a valid approach. Here we present a method based on epidemic spreading to automatically extend the dictionary used in lexicon-based sentiment analysis, starting from a reduced dictionary and large amounts of Twitter data. The resulting dictionary is shown to contain valences that correlate well with human-annotated sentiment, and to produce tweet sentiment classifications comparable to the original dictionary, with the advantage of being able to tag more tweets than the original. The method is easily extensible to various languages and applicable to large amounts of data.
JF  - Conference of the Italian Association for Artificial Intelligence
PB  - Springer
UR  - https://link.springer.com/chapter/10.1007/978-3-319-70169-1_9
ER  - 

TY  - JOUR
T1  - Survey on using constraints in data mining
JF  - Data Mining and Knowledge Discovery
Y1  - 2017
A1  - Valerio Grossi
A1  - Andrea Romei
A1  - Franco Turini
AB  - This paper provides an overview of the current state-of-the-art on using constraints in knowledge discovery and data mining. The use of constraints in a data mining task requires specific definition and satisfaction tools during knowledge extraction. This survey proposes three groups of studies based on classification, clustering and pattern mining, whether the constraints are on the data, the models or the measures, respectively. We consider the distinctions between hard and soft constraint satisfaction, and between the knowledge extraction phases where constraints are considered. In addition to discussing how constraints can be used in data mining, we show how constraint-based languages can be used throughout the data mining process.
VL  - 31
ER  - 

TY  - CONF
T1  - There's A Path For Everyone: A Data-Driven Personal Model Reproducing Mobility Agendas
T2  - 4th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2017)
Y1  - 2017
A1  - Riccardo Guidotti
A1  - Roberto Trasarti
A1  - Mirco Nanni
A1  - Fosca Giannotti
A1  - Dino Pedreschi
JF  - 4th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2017)
PB  - IEEE
CY  - Tokyo
ER  - 

TY  - JOUR
T1  - Tiles: an online algorithm for community discovery in dynamic social networks
JF  - Machine Learning
Y1  - 2017
A1  - Giulio Rossetti
A1  - Luca Pappalardo
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Community discovery has emerged during the last decade as one of the most challenging problems in social network analysis. Many algorithms have been proposed to find communities on static networks, i.e. networks which do not change in time. However, social networks are dynamic realities (e.g. call graphs, online social networks): in such scenarios static community discovery fails to identify a partition of the graph that is semantically consistent with the temporal information expressed by the data. In this work we propose Tiles, an algorithm that extracts overlapping communities and tracks their evolution in time following an online iterative procedure. Our algorithm operates following a domino effect strategy, dynamically recomputing nodes' community memberships whenever a new interaction takes place. We compare Tiles with state-of-the-art community detection algorithms on both synthetic and real world networks having annotated community structure: our experiments show that the proposed approach is able to guarantee lower execution times and better correspondence with the ground truth communities than its competitors. Moreover, we illustrate the specifics of the proposed approach by discussing the properties of the communities it is able to identify.
VL  - 106
UR  - https://link.springer.com/article/10.1007/s10994-016-5582-8
ER  - 

TY  - ABST
T1  - Advances in Network Science: 12th International Conference and School, NetSci-X 2016, Wroclaw, Poland, January 11-13, 2016, Proceedings
Y1  - 2016
A1  - Wierzbicki, Adam
A1  - Brandes, Ulrik
A1  - Schweitzer, Frank
A1  - Dino Pedreschi
AB  - This book constitutes the refereed proceedings of the 12th International Conference and School of Network Science, NetSci-X 2016, held in Wroclaw, Poland, in January 2016. The 12 full and 6 short papers were carefully reviewed and selected from 59 submissions. The papers deal with the study of network models in domains ranging from biology and physics to computer science, from financial markets to cultural integration, and from social media to infectious diseases.
ER  - 

TY  - JOUR
T1  - An analytical framework to nowcast well-being using mobile phone data
JF  - International Journal of Data Science and Analytics
Y1  - 2016
A1  - Luca Pappalardo
A1  - Maarten Vanhoof
A1  - Lorenzo Gabrielli
A1  - Zbigniew Smoreda
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - An intriguing open question is whether measurements derived from Big Data recording human activities can yield high-fidelity proxies of socio-economic development and well-being. Can we monitor and predict the socio-economic development of a territory just by observing the behavior of its inhabitants through the lens of Big Data? In this paper, we design a data-driven analytical framework that uses mobility measures and social measures extracted from mobile phone data to estimate indicators for socio-economic development and well-being. We discover that the diversity of mobility, defined in terms of entropy of the individual users’ trajectories, exhibits (i) significant correlation with two different socio-economic indicators and (ii) the highest importance in predictive models built to predict the socio-economic indicators. Our analytical framework opens an interesting perspective to study human behavior through the lens of Big Data by means of new statistical indicators that quantify and possibly “nowcast” the well-being and the socio-economic development of a territory.
VL  - 2
ER  - 

TY  - Generic
T1  - “Are we playing like Music-Stars?” Placing Emerging Artists on the Italian Music Scene
T2  - 9th International Workshop on Machine Learning and Music, ECML-PKDD
Y1  - 2016
A1  - Pollacci, Laura
A1  - Riccardo Guidotti
A1  - Giulio Rossetti
AB  - The Italian emerging bands chase success on the footprint of popular artists by playing rhythmic danceable and happy songs. Our finding comes out from a deep study of the Italian music scene and how the new generation of musicians relates to the tradition of their country. By analyzing Spotify data we investigated the peculiarity of regional music and we placed emerging bands within the musical movements defined by already successful artists. The approach proposed and the results obtained are a first attempt to outline some rules suggesting how to reach success in the Italian music scene.
JF  - 9th International Workshop on Machine Learning and Music, ECML-PKDD
CY  - Riva del Garda
ER  - 

TY  - CONF
T1  - Audio Ergo Sum
T2  - Federation of International Conferences on Software Technologies: Applications and Foundations
Y1  - 2016
A1  - Riccardo Guidotti
A1  - Giulio Rossetti
A1  - Dino Pedreschi
AB  - Nobody can state “Rock is my favorite genre” or “David Bowie is my favorite artist”. We defined a Personal Listening Data Model able to capture musical preferences through indicators and patterns, and we discovered that we are all characterized by a limited set of musical preferences, but not by a unique predilection. The empowered capacity of mobile devices and their growing adoption in our everyday life is generating an enormous increment in the production of personal data such as calls, positioning, online purchases and even music listening. Musical listening is a type of data that has started receiving more attention from the scientific community as a consequence of the increasing availability of rich and punctual online data sources. Starting from the listening of 30k Last.Fm users, we show how the employment of the Personal Listening Data Models can provide higher levels of self-awareness. In addition, the proposed model will enable the development of a wide range of analyses and musical services both at personal and at collective level.
JF  - Federation of International Conferences on Software Technologies: Applications and Foundations
PB  - Springer
ER  - 

TY  - CONF
T1  - Big Data and Public Administration: a case study for Tuscany Airports
T2  - SEBD - Italian Symposium on Advanced Database Systems
Y1  - 2016
A1  - Barbara Furletti
A1  - Daniele Fadda
A1  - Leonardo Piccini
A1  - Mirco Nanni
A1  - Patrizia Lattarulo
AB  - In the last decade, the fast development of Information and Communication Technologies led to the wide diffusion of sensors able to track various aspects of human activity, as well as the storage and computational capabilities needed to record and analyze them. The so-called Big Data promise to improve the effectiveness of businesses, the quality of urban life, as well as many other fields, including the functioning of public administrations. Yet, translating the wealth of potential information hidden in Big Data to consumable intelligence seems to be still a difficult task, with a limited basis of success stories. This paper reports a project activity centered on a public administration - IRPET, the Regional Institute for Economic Planning of Tuscany (Italy). The paper deals, among other topics, with human mobility and public transportation at a regional scale, summarizing the open questions posed by the Public Administration (PA), the envisioned role that Big Data might have in answering them, the actual challenges that emerged in trying to implement them, and finally the results we obtained, the limitations that emerged and the lessons learned.
JF  - SEBD - Italian Symposium on Advanced Database Systems
PB  - Matematicamente.it
CY  - Ugento, Lecce (Italy)
SN  - 9788896354889
UR  - http://sebd2016.unisalento.it/grid/SEBD2016-proceedings.pdf
ER  - 

TY  - JOUR
T1  - Big Data Research in Italy: A Perspective
JF  - Engineering
Y1  - 2016
A1  - Sonia Bergamaschi
A1  - Emanuele Carlini
A1  - Michelangelo Ceci
A1  - Barbara Furletti
A1  - Fosca Giannotti
A1  - Donato Malerba
A1  - Mario Mezzanzanica
A1  - Anna Monreale
A1  - Gabriella Pasi
A1  - Dino Pedreschi
A1  - Raffaele Perego
A1  - Salvatore Ruggieri
AB  - The aim of this article is to synthetically describe the research projects that a selection of Italian universities is undertaking in the context of big data. Far from being exhaustive, this article has the objective of offering a sample of distinct applications that address the issue of managing huge amounts of data in Italy, collected in relation to diverse domains.
VL  - 2
UR  - http://engineering.org.cn/EN/abstract/article_12288.shtml
ER  - 

TY  - JOUR
T1  - Causal Discrimination Discovery Through Propensity Score Analysis
JF  - arXiv preprint arXiv:1608.03735
Y1  - 2016
A1  - Qureshi, Bilal
A1  - Kamiran, Faisal
A1  - Karim, Asim
A1  - Salvatore Ruggieri
AB  - Social discrimination is considered illegal and unethical in the modern world. Such discrimination is often implicit in observed decisions' datasets, and anti-discrimination organizations seek to discover cases of discrimination and to understand the reasons behind them. Previous work in this direction adopted simple observational data analysis; however, this can produce biased results due to the effect of confounding variables. In this paper, we propose a causal discrimination discovery and understanding approach based on propensity score analysis. The propensity score is an effective statistical tool for filtering out the effect of confounding variables. We employ propensity score weighting to balance the distribution of individuals from protected and unprotected groups w.r.t. the confounding variables. For each individual in the dataset, we quantify its causal discrimination or favoritism with a neighborhood-based measure calculated on the balanced distributions. Subsequently, the causal discrimination/favoritism patterns are understood by learning a regression tree. Our approach avoids common pitfalls in observational data analysis and makes its results legally admissible. We demonstrate the results of our approach on two discrimination datasets.
UR  - https://arxiv.org/abs/1608.03735
ER  - 

TY  - CONF
T1  - Classification Rule Mining Supported by Ontology for Discrimination Discovery
T2  - Data Mining Workshops (ICDMW), 2016 IEEE 16th International Conference on
Y1  - 2016
A1  - Luong, Binh Thanh
A1  - Salvatore Ruggieri
A1  - Franco Turini
AB  - Discrimination discovery from data consists of designing data mining methods for the actual discovery of discriminatory situations and practices hidden in a large amount of historical decision records. Approaches based on classification rule mining consider items at a flat concept level, with no exploitation of background knowledge on the hierarchical and inter-relational structure of domains. On the other hand, ontologies are a widespread and ever increasing means for expressing such knowledge. In this paper, we propose a framework for discrimination discovery from ontologies, where contexts of prima-facie evidence of discrimination are summarized in the form of generalized classification rules at different levels of abstraction. Throughout the paper, we adopt a motivating and intriguing case study based on discriminatory tariffs applied by the U.S. Harmonized Tariff Schedules on imported goods.
JF  - Data Mining Workshops (ICDMW), 2016 IEEE 16th International Conference on
PB  - IEEE
ER  - 

TY  - ABST
T1  - Data Mining and Constraint Programming - Foundations of a Cross-Disciplinary Approach
Y1  - 2016
A1  - Bessiere, Christian
A1  - De Raedt, Luc
A1  - Lars Kotthoff
A1  - Siegfried Nijssen
A1  - Barry O'Sullivan
A1  - Dino Pedreschi
AB  - A successful integration of constraint programming and data mining has the potential to lead to a new ICT paradigm with far reaching implications. It could change the face of data mining and machine learning, as well as constraint programming technology. It would not only allow one to use data mining techniques in constraint programming to identify and update constraints and optimization criteria, but also to employ constraints and criteria in data mining and machine learning in order to discover models compatible with prior knowledge. This book reports on some key results obtained on this integrated and cross-disciplinary approach within the European FP7 FET Open project no. 284715 on “Inductive Constraint Programming” and a number of associated workshops and Dagstuhl seminars. The book is structured in five parts: background; learning to model; learning to solve; constraint programming for data mining; and showcases.
ER  - 

TY  - CHAP
T1  - Data Mining and Constraints: An Overview
T2  - Data Mining and Constraint Programming
Y1  - 2016
A1  - Valerio Grossi
A1  - Dino Pedreschi
A1  - Franco Turini
AB  - This paper provides an overview of the current state-of-the-art on using constraints in knowledge discovery and data mining. The use of constraints requires mechanisms for defining and evaluating them during the knowledge extraction process. We give a structured account of three main groups of constraints based on the specific context in which they are defined and used. The aim is to provide a complete view on constraints as a building block of data mining methods.
JF  - Data Mining and Constraint Programming
PB  - Springer International Publishing
ER  - 

TY  - JOUR
T1  - Driving Profiles Computation and Monitoring for Car Insurance CRM
JF  - ACM Transactions on Intelligent Systems and Technology (TIST)
Y1  - 2016
A1  - Mirco Nanni
A1  - Roberto Trasarti
A1  - Anna Monreale
A1  - Valerio Grossi
A1  - Dino Pedreschi
AB  - Customer segmentation is one of the most traditional and valued tasks in customer relationship management (CRM). In this article, we explore the problem in the context of the car insurance industry, where the mobility behavior of customers plays a key role: Different mobility needs, driving habits, and skills imply also different requirements (level of coverage provided by the insurance) and risks (of accidents). In the present work, we describe a methodology to extract several indicators describing the driving profile of customers, and we provide a clustering-oriented instantiation of the segmentation problem based on such indicators. Then, we consider the availability of a continuous flow of fresh mobility data sent by the circulating vehicles, aiming at keeping our segments constantly up to date. We tackle a major scalability issue that emerges in this context when the number of customers is large (namely, the communication bottleneck) by proposing and implementing a sophisticated distributed monitoring solution that reduces communications between vehicles and company servers to the essential. We validate the framework on a large database of real mobility data coming from GPS devices on private cars. Finally, we analyze the privacy risks that the proposed approach might involve for the users, providing and evaluating a countermeasure based on data perturbation.
VL  - 8
UR  - http://doi.acm.org/10.1145/2912148
ER  - 

TY  - CHAP
T1  - Going Beyond GDP to Nowcast Well-Being Using Retail Market Data
T2  - Advances in Network Science
Y1  - 2016
A1  - Riccardo Guidotti
A1  - Michele Coscia
A1  - Dino Pedreschi
A1  - Diego Pennacchioli
AB  - One of the most used measures of the economic health of a nation is the Gross Domestic Product (GDP): the market value of all officially recognized final goods and services produced within a country in a given period of time. GDP, prosperity and well-being of the citizens of a country have been shown to be highly correlated. However, GDP is an imperfect measure in many respects. GDP usually takes a lot of time to be estimated and arguably the well-being of the people is not quantifiable simply by the market value of the products available to them. In this paper we use a quantification of the average sophistication of satisfied needs of a population as an alternative to GDP. We show that this quantification can be calculated more easily than GDP and it is a very promising predictor of the GDP value, anticipating its estimation by six months. The measure is arguably a more multifaceted evaluation of the well-being of the population, as it tells us more about how people are satisfying their needs. Our study is based on a large dataset of retail micro transactions happening across the Italian territory.
JF  - Advances in Network Science
PB  - Springer International Publishing
ER  - 

TY  - JOUR
T1  - Homophilic network decomposition: a community-centric analysis of online social services
JF  - Social Network Analysis and Mining
Y1  - 2016
A1  - Giulio Rossetti
A1  - Luca Pappalardo
A1  - Riivo Kikas
A1  - Dino Pedreschi
A1  - Fosca Giannotti
A1  - Marlon Dumas
AB  - In this paper we formulate the homophilic network decomposition problem: Is it possible to identify a network partition whose structure is able to characterize the degree of homophily of its nodes? The aim of our work is to understand the relations between the homophily of individuals and the topological features expressed by specific network substructures. We apply several community detection algorithms on three large-scale online social networks—Skype, LastFM and Google+—and advocate the need of identifying the right algorithm for each specific network in order to extract a homophilic network decomposition. Our results show clear relations between the topological features of communities and the degree of homophily of their nodes in three online social scenarios: product engagement in the Skype network, number of listened songs on LastFM and homogeneous level of education among users of Google+.
VL  - 6
ER  - 

TY  - CONF
T1  - A KDD process for discrimination discovery
T2  - Joint European Conference on Machine Learning and Knowledge Discovery in Databases
Y1  - 2016
A1  - Salvatore Ruggieri
A1  - Franco Turini
AB  - The acceptance of analytical methods for discrimination discovery by practitioners and legal scholars can be only achieved if the data mining and machine learning communities will be able to provide case studies, methodological refinements, and the consolidation of a KDD process. We summarize here an approach along these directions.
JF  - Joint European Conference on Machine Learning and Knowledge Discovery in Databases
PB  - Springer International Publishing
ER  - 

TY  - CONF
T1  - A novel approach to evaluate community detection algorithms on ground truth
T2  - 7th Workshop on Complex Networks
Y1  - 2016
A1  - Giulio Rossetti
A1  - Luca Pappalardo
A1  - S Rinzivillo
AB  - Evaluating a community detection algorithm is a complex task due to the lack of a shared and universally accepted definition of community. In the literature, one of the most common ways to assess the performance of a community detection algorithm is to compare its output with given ground truth communities by using computationally expensive metrics (e.g., Normalized Mutual Information). In this paper we propose a novel approach aimed at evaluating the adherence of a community partition to the ground truth: our methodology provides more information than the state-of-the-art ones and is fast to compute on large-scale networks. We evaluate its correctness by applying it to six popular community detection algorithms on four large-scale network datasets. Experimental results show how our approach makes it easy to evaluate the obtained communities on the ground truth and to characterize the quality of community detection algorithms.
JF  - 7th Workshop on Complex Networks
PB  - Springer-Verlag
CY  - Dijon, France
UR  - http://www.giuliorossetti.net/about/wp-content/uploads/2015/12/Complenet16.pdf
ER  - 

TY  - CHAP
T1  - Partition-Based Clustering Using Constraint Optimization
T2  - Data Mining and Constraint Programming - Foundations of a Cross-Disciplinary Approach
Y1  - 2016
A1  - Valerio Grossi
A1  - Tias Guns
A1  - Anna Monreale
A1  - Mirco Nanni
A1  - Siegfried Nijssen
AB  - Partition-based clustering is the task of partitioning a dataset in a number of groups of examples, such that examples in each group are similar to each other. Many criteria for what constitutes a good clustering have been identified in the literature; furthermore, the use of additional constraints to find more useful clusterings has been proposed. In this chapter, it will be shown that most of these clustering tasks can be formalized using optimization criteria and constraints. We demonstrate how a range of clustering tasks can be modelled in generic constraint programming languages with these constraints and optimization criteria. Using the constraint-based modeling approach we also relate the DBSCAN method for density-based clustering to the label propagation technique for community discovery.
JF  - Data Mining and Constraint Programming - Foundations of a Cross-Disciplinary Approach
PB  - Springer International Publishing
UR  - http://dx.doi.org/10.1007/978-3-319-50137-6_11
ER  - 

TY  - GEN
T1  - Power Consumption Modeling and Prediction in a Hybrid CPU-GPU-MIC Supercomputer
T2  - 22nd International European Conference on Parallel and Distributed Computing, Euro-Par 2016
Y1  - 2016
A1  - Alina Sirbu
A1  - Ozalp Babaoglu
AB  - Power consumption is a major obstacle for High Performance Computing (HPC) systems in their quest towards the holy grail of ExaFLOP performance. Significant advances in power efficiency have to be made before this goal can be attained and accurate modeling is an essential step towards power efficiency by optimizing system operating parameters to match dynamic energy needs. In this paper we present a study of power consumption by jobs in Eurora, a hybrid CPU-GPU-MIC system installed at the largest Italian data center. Using data from a dedicated monitoring framework, we build a data-driven model of power consumption for each user in the system and use it to predict the power requirements of future jobs. We are able to achieve good prediction results for over 80 % of the users in the system. For the remaining users, we identify possible reasons why prediction performance is not as good. Possible applications for our predictive modeling results include scheduling optimization, power-aware billing and system-scale power modeling. All the scripts used for the study have been made available on GitHub.
JF  - 22nd International European Conference on Parallel and Distributed Computing, Euro-Par 2016
PB  - Springer LNCS
CY  - Grenoble, France
VL  - LNCS 9833
UR  - http://arxiv.org/abs/1601.05961
ER  - 

TY  - CONF
T1  - Predicting System-level Power for a Hybrid Supercomputer
T2  - 2016 International Conference on High Performance Computing & Simulation (HPCS)
Y1  - 2016
A1  - Alina Sirbu
A1  - Ozalp Babaoglu
AB  - For current High Performance Computing systems to scale towards the holy grail of ExaFLOP performance, their power consumption has to be reduced by at least one order of magnitude. This goal can be achieved only through a combination of hardware and software advances. Being able to model and accurately predict the power consumption of large computational systems is necessary for software-level innovations such as proactive and power-aware scheduling, resource allocation and fault tolerance techniques. In this paper we present a 2-layer model of power consumption for a hybrid supercomputer (which held the top spot of the Green500 list on July 2013) that combines CPU, GPU and MIC technologies to achieve higher energy efficiency. Our model takes as input workload information - the number and location of resources that are used by each job at a certain time - and calculates the resulting system-level power consumption. When jobs are submitted to the system, the workload configuration can be foreseen based on the scheduler policies, and our model can then be applied to predict the ensuing system-level power consumption. Additionally, alternative workload configurations can be evaluated from a power perspective and more efficient ones can be selected. Applications of the model include not only power-aware scheduling but also prediction of anomalous behavior.
JF  - 2016 International Conference on High Performance Computing & Simulation (HPCS)
PB  - IEEE
CY  - Innsbruck, Austria
UR  - http://ieeexplore.ieee.org/document/7568420/
ER  - 

TY  - CONF
T1  - Privacy-Preserving Outsourcing of Data Mining
T2  - 40th IEEE Annual Computer Software and Applications Conference, COMPSAC Workshops 2016, Atlanta, GA, USA, June 10-14, 2016
Y1  - 2016
A1  - Anna Monreale
A1  - Hui Wendy Wang
AB  - Data mining is gaining momentum in society due to the ever increasing availability of large amounts of data, easily gathered by a variety of collection technologies and stored via computer systems. Due to the limited computational resources of data owners and the developments in cloud computing, there has been considerable recent interest in the paradigm of data mining-as-a-service (DMaaS). In this paradigm, a company (data owner) lacking in expertise or computational resources outsources its mining needs to a third party service provider (server). Given the fact that the server may not be fully trusted, one of the main concerns of the DMaaS paradigm is the protection of data privacy. In this paper, we provide an overview of a variety of techniques and approaches that address the privacy issues of the DMaaS paradigm.
JF  - 40th IEEE Annual Computer Software and Applications Conference, COMPSAC Workshops 2016, Atlanta, GA, USA, June 10-14, 2016
PB  - IEEE Computer Society
CY  - Atlanta, GA, USA
UR  - http://dx.doi.org/10.1109/COMPSAC.2016.169
ER  - 

TY  - CONF
T1  - Privacy-Preserving Outsourcing of Pattern Mining of Event-Log Data - A Use-Case from Process Industry
T2  - Cloud Computing Technology and Science (CloudCom), 2016 IEEE International Conference on
Y1  - 2016
A1  - Marrella, Alessandro
A1  - Anna Monreale
A1  - Kloepper, Benjamin
A1  - Krueger, Martin W
AB  - With the advent of cloud computing and its model for IT services based on the Internet and big data centers, the interest of industries in the XaaS ("Anything as a Service") paradigm is increasing. Business intelligence and knowledge discovery services are typical services that companies tend to externalize on the cloud, due to their data intensive nature and the complexity of the algorithms. What is appealing for a company is to rely on external expertise and infrastructure to compute the analytical results and models which are required by the business analysts for understanding the business phenomena under observation. Although it is advantageous for achieving sophisticated analysis, there exist several serious privacy issues in this paradigm. In this paper we investigate through an industrial use-case the application of a framework for privacy-preserving outsourcing of pattern mining on event-log data. Moreover, we present and discuss some ideas about possible extensions.
JF  - Cloud Computing Technology and Science (CloudCom), 2016 IEEE International Conference on
PB  - IEEE
ER  - 

TY  - BOOK
T1  - Realising the European open science cloud
Y1  - 2016
A1  - Ayris, Paul
A1  - Berthou, Jean-Yves
A1  - Bruce, Rachel
A1  - Lindstaedt, Stefanie
A1  - Anna Monreale
A1  - Mons, Barend
A1  - Murayama, Yasuhiro
A1  - Södergård, Caj
A1  - Tochtermann, Klaus
A1  - Wilkinson, Ross
AB  - The European Open Science Cloud (EOSC) aims to accelerate and support the current transition to more effective Open Science and Open Innovation in the Digital Single Market. It should enable trusted access to services, systems and the re-use of shared scientific data across disciplinary, social and geographical borders. This report approaches the EOSC as a federated environment for scientific data sharing and re-use, based on existing and emerging elements in the Member States, with light-weight international guidance and governance, and a large degree of freedom regarding practical implementation.
SN  - 978-92-79-61762-1
UR  - http://dx.doi.org/10.2777/940154
ER  - 

TY  - CONF
T1  - SPARQL Queries over Source Code
T2  - 2016 IEEE Tenth International Conference on Semantic Computing (ICSC)
Y1  - 2016
A1  - Mattia Setzu
A1  - Atzori, Maurizio
JF  - 2016 IEEE Tenth International Conference on Semantic Computing (ICSC)
PB  - IEEE
ER  - 

TY  - JOUR
T1  - Special Issue on Mobile Traffic Analytics
JF  - Computer Communications
Y1  - 2016
A1  - Marco Fiore
A1  - Zubair Shafiq
A1  - Zbigniew Smoreda
A1  - Razvan Stanica
A1  - Roberto Trasarti
AB  - This Special Issue of Computer Communications is dedicated to mobile traffic data analysis. This is an emerging field of research that stems from the increasing pervasiveness in our lives of always-connected mobile devices. These devices continuously collect, generate, receive or communicate data; in doing so, they leave trails of digital crumbs that can be followed, recorded and analysed in many and varied ways, and for a number of different purposes. From a data collection perspective, applications running on smartphones allow tracking user activities with extreme accuracy, in terms of mobility, context, and service usage. Yet, having individuals informedly install and run software that monitors their actions is not obvious; finding adequate incentives is equivalently complex. The other option is gathering mobile traffic data in the mobile network. This is an increasingly common practice for telecommunication operators: the collection of minimum information required for billing is giving way to in-depth inspection and recording of mobile service usages in space and time, and of traffic flows at the network edge and core. In this case, data access remains the major impediment, due to privacy and industrial secrecy reasons. Despite the issues inherent to the data collection, the richness of knowledge that can be extracted from the aforementioned sources is such that actors in both academia and industry are putting significant effort in gathering, analysing and possibly making available mobile traffic data. Indeed, mobile traffic data typically contain information on large populations of individuals (from thousands to millions of users) with high spatio-temporal granularity. The combination of accuracy and coverage is unprecedented, and it has proven key in validating theories and scaling up experimental studies in a number of research fields across many disciplines, including physics, sociology, epidemiology, transportation systems, and, of course, mobile networking. As a result, we witness today a rapid growth of the literature that proposes or exploits mobile traffic analytics. Included in this Special Issue are eight papers that cover a significant portion of the different research topics in this area, ranging from data collection to the characterization of land use and mobile service consumption, from the inference and prediction of user mobility to the detection of malicious traffic. These papers were selected from 30 high-quality submissions after at least two rounds of reviews by experts and guest editors. The original submissions were received from five continents and a variety of countries, including Austria, Argentina, Belgium, Brazil, Chile, China, France, Germany, Italy, South Korea, Luxembourg, Pakistan, Saudi Arabia, Spain, Sweden, Tunisia, Turkey, USA. The accepted papers reflect this geographical heterogeneity, and are authored by researchers based in Europe, North and South America.
VL  - 95
UR  - http://dx.doi.org/10.1016/j.comcom.2016.10.009
ER  - 

TY  - JOUR
T1  - A supervised approach for intra-/inter-community interaction prediction in dynamic social networks
JF  - Social Network Analysis and Mining
Y1  - 2016
A1  - Giulio Rossetti
A1  - Riccardo Guidotti
A1  - Ioanna Miliou
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Due to the growing availability of Internet services in the last decade, interactions between people have become easier and easier to establish. For example, we can have an intercontinental job interview, or we can send real-time multimedia content to any of our friends who owns a smartphone. All these kinds of human activities generate digital footprints that describe complex, rapidly evolving network structures. In such a dynamic scenario, one of the most challenging tasks involves the prediction of future interactions between pairs of actors (e.g., users in online social networks, researchers in collaboration networks). In this paper, we approach this problem by leveraging network dynamics: to this end, we propose a supervised learning approach which exploits features computed by time-aware forecasts of topological measures calculated between node pairs. Moreover, since real social networks are generally composed of weakly connected modules, we instantiate the interaction prediction problem in two disjoint application scenarios: intra-community and inter-community link prediction. Experimental results on real time-stamped networks show that our approach is able to reach high accuracy. Furthermore, we analyze the performance of our methodology when varying the types of features, community discovery algorithms and forecast methods.
VL  - 6
UR  - http://dx.doi.org/10.1007/s13278-016-0397-y
ER  - 

TY  - JOUR
T1  - Towards operator-less data centers through data-driven, predictive, proactive autonomics
JF  - Cluster Computing
Y1  - 2016
A1  - Alina Sirbu
A1  - Ozalp Babaoglu
AB  - Continued reliance on human operators for managing data centers is a major impediment for them from ever reaching extreme dimensions. Large computer systems in general, and data centers in particular, will ultimately be managed using predictive computational and executable models obtained through data-science tools, and at that point, the intervention of humans will be limited to setting high-level goals and policies rather than performing low-level operations. Data-driven autonomics, where management and control are based on holistic predictive models that are built and updated using live data, opens one possible path towards limiting the role of operators in data centers. In this paper, we present a data-science study of a public Google dataset collected in a 12K-node cluster with the goal of building and evaluating predictive models for node failures. Our results support the practicality of a data-driven approach by showing the effectiveness of predictive models based on data found in typical data center logs. We use BigQuery, the big data SQL platform from the Google Cloud suite, to process massive amounts of data and generate a rich feature set characterizing node state over time. We describe how an ensemble classifier can be built out of many Random Forest classifiers each trained on these features, to predict if nodes will fail in a future 24-h window. Our evaluation reveals that if we limit false positive rates to 5 %, we can achieve true positive rates between 27 and 88 % with precision varying between 50 and 72 %. This level of performance allows us to recover a large fraction of jobs’ executions (by redirecting them to other nodes when a failure of the present node is predicted) that would otherwise have been wasted due to failures. We discuss the feasibility of including our predictive model as the central component of a data-driven autonomic manager and operating it on-line with live data streams (rather than off-line on data logs). All of the scripts used for BigQuery and classification analyses are publicly available on GitHub.
UR  - http://link.springer.com/article/10.1007/s10586-016-0564-y
ER  - 

TY  - CHAP
T1  - Understanding human mobility with big data
T2  - Solving Large Scale Learning Tasks. Challenges and Algorithms
Y1  - 2016
A1  - Fosca Giannotti
A1  - Lorenzo Gabrielli
A1  - Dino Pedreschi
A1  - S Rinzivillo
AB  - The paper illustrates basic methods of mobility data mining, designed to extract from big mobility data the patterns of collective movement behavior (i.e., to discover the subgroups of travelers characterized by a common purpose) and the profiles of individual movement activity (i.e., to characterize the routine mobility of each traveler). We illustrate a number of concrete case studies where mobility data mining is put to work to create powerful analytical services for policy makers, businesses, public administrations, and individual citizens.
JF  - Solving Large Scale Learning Tasks. Challenges and Algorithms
PB  - Springer International Publishing
ER  - 

TY  - JOUR
T1  - Unveiling mobility complexity through complex network analysis
JF  - Social Network Analysis and Mining
Y1  - 2016
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - S Rinzivillo
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - The availability of massive digital traces of individuals is offering a series of novel insights on the understanding of patterns characterizing human mobility. Many studies try to semantically enrich mobility data with annotations about human activities. However, these approaches either focus on places with high frequencies (e.g., home and work), or rely on background knowledge (e.g., publicly available points of interest). In this paper, we depart from the concept of frequency and we focus on a high level representation of mobility using network analytics. The visits of each driver to each systematic destination are modeled as links in a bipartite network where a set of nodes represents drivers and the other set represents places. We extract such network from two real datasets of human mobility based, respectively, on GPS and GSM data. We introduce the concept of mobility complexity of drivers and places as a ranking analysis over the nodes of these networks. In addition, by means of community discovery analysis, we differentiate subgroups of drivers and places according both to their homogeneity and to their mobility complexity.
VL  - 6
ER  - 

TY  - CONF
T1  - Unveiling Political Opinion Structures with a Web-experiment
T2  - Proceedings of the 1st International Conference on Complex Information Systems
Y1  - 2016
A1  - Pietro Gravino
A1  - Saverio Caminiti
A1  - Alina Sirbu
A1  - Francesca Tria
A1  - Vito D P Servedio
A1  - Vittorio Loreto
AB  - The dynamics of political votes has been widely studied, both for its practical interest and as a paradigm of the dynamics of mass opinions and collective phenomena, where theoretical predictions can be easily tested. However, the vote outcome is often influenced by many factors beyond the bare opinion on the candidate, and in most cases it is bound to a single preference. The voter perception of the political space is still to be elucidated. We here propose a web experiment (laPENSOcosì) where we explicitly investigate participants’ opinions on political entities (parties, coalitions, individual candidates) of the Italian political scene. As a main result, we show that the political perception follows a Weber-Fechner-like law, i.e., when ranking political entities according to the user expressed preferences, the perceived distance of the user from a given entity scales as the logarithm of this rank.
JF  - Proceedings of the 1st International Conference on Complex Information Systems
SN  - 978-989-758-181-6
UR  - http://www.scitepress.org/DigitalLibrary/Link.aspx?doi=10.5220/0005906300390047
ER  - 

TY  - CHAP
T1  - Where Is My Next Friend? Recommending Enjoyable Profiles in Location Based Services
T2  - Complex Networks VII
Y1  - 2016
A1  - Riccardo Guidotti
A1  - Michele Berlingerio
AB  - How many of your friends, with whom you enjoy spending some time, live close by? How many people are at your reach, with whom you could have a nice conversation? We introduce a measure of enjoyability that may be the basis for a new class of location-based services aimed at maximizing the likelihood that two persons, or a group of people, would enjoy spending time together. Our enjoyability takes into account both topic similarity between two users and the users’ tendency to connect to people with similar or dissimilar interest. We computed the enjoyability on two datasets of geo-located tweets, and we reasoned on the applicability of the obtained results for producing friend recommendations. We aim at suggesting couples of users which are not friends yet, but which are frequently co-located and maximize our enjoyability measure. By taking into account the spatial dimension, we show how 50 % of users may find at least one enjoyable person within 10 km of their two most visited locations. Our results are encouraging, and open the way for a new class of recommender systems based on enjoyability.
JF  - Complex Networks VII
PB  - Springer International Publishing
ER  - 

TY  - CONF
T1  - Behavioral Entropy and Profitability in Retail
T2  - IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA'2015)
Y1  - 2015
A1  - Riccardo Guidotti
A1  - Michele Coscia
A1  - Dino Pedreschi
A1  - Diego Pennacchioli
AB  - Human behavior is predictable in principle: people are systematic in their everyday choices. This predictability can be used to plan events and infrastructure, both for the public good and for private gains. In this paper we investigate the largely unexplored relationship between the systematic behavior of a customer and its profitability for a retail company. We estimate a customer’s behavioral entropy over two dimensions: the basket entropy is the variety of what customers buy, and the spatio-temporal entropy is the spatial and temporal variety of their shopping sessions. To estimate the basket and the spatiotemporal entropy we use data mining and information theoretic techniques. We find that predictable systematic customers are more profitable for a supermarket: their average per capita expenditures are higher than non systematic customers and they visit the shops more often. However, this higher individual profitability is masked by its overall level. The highly systematic customers are a minority of the customer set. As a consequence, the total amount of revenues they generate is small. We suggest that favoring a systematic behavior in their customers might be a good strategy for supermarkets to increase revenue. These results are based on data coming from a large Italian supermarket chain, including more than 50 thousand customers visiting 23 shops to purchase more than 80 thousand distinct products.
JF  - IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA'2015)
PB  - IEEE
CY  - Paris
ER  - 

TY  - JOUR
T1  - A Big Data Analyzer for Large Trace Logs
JF  - Computing
Y1  - 2015
A1  - Balliu, Alkida
A1  - Olivetti, Dennis
A1  - Ozalp Babaoglu
A1  - Marzolla, Moreno
A1  - Alina Sirbu
UR  - http://link.springer.com/article/10.1007/s00607-015-0480-7
ER  - 

TY  - CONF
T1  - City users’ classification with mobile phone data
T2  - IEEE Big Data
Y1  - 2015
A1  - Lorenzo Gabrielli
A1  - Barbara Furletti
A1  - Roberto Trasarti
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - Nowadays mobile phone data are an actual proxy for studying the users’ social life and urban dynamics. In this paper we present the Sociometer, an analytical framework aimed at classifying mobile phone users into behavioral categories by means of their call habits. The analytical process starts from spatio-temporal profiles, learns the different behaviors, and returns annotated profiles. After the description of the methodology and its evaluation, we present an application of the Sociometer for studying city users of one small and one big city, evaluating the impact of big events in these cities.
JF  - IEEE Big Data
CY  - Santa Clara (CA) - USA
ER  - 

TY  - CONF
T1  - Clustering Formulation Using Constraint Optimization
T2  - Software Engineering and Formal Methods - SEFM 2015 Collocated Workshops: ATSE, HOFM, MoKMaSD, and VERY*SCART, York, UK, September 7-8, 2015, Revised Selected Papers
Y1  - 2015
A1  - Valerio Grossi
A1  - Anna Monreale
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - Franco Turini
AB  - The problem of clustering a set of data is a textbook machine learning problem, but at the same time, at heart, a typical optimization problem. Given an objective function, such as minimizing the intra-cluster distances or maximizing the inter-cluster distances, the task is to find an assignment of data points to clusters that achieves this objective. In this paper, we present a constraint programming model for a centroid based clustering and one for a density based clustering. In particular, as a key contribution, we show how the expressivity introduced by the formulation of the problem by constraint programming makes the standard problem easy to be extended with other constraints that permit to generate interesting variants of the problem. We show this important aspect in two different ways: first, we show how the formulation of the density-based clustering by constraint programming makes it very similar to the label propagation problem and then, we propose a variant of the standard label propagation approach.
JF  - Software Engineering and Formal Methods - SEFM 2015 Collocated Workshops: ATSE, HOFM, MoKMaSD, and VERY*SCART, York, UK, September 7-8, 2015, Revised Selected Papers
PB  - Springer Berlin Heidelberg
UR  - http://dx.doi.org/10.1007/978-3-662-49224-6_9
ER  - 

TY  - CONF
T1  - ComeWithMe: An Activity-Oriented Carpooling Approach
T2  - 2015 IEEE 18th International Conference on Intelligent Transportation Systems
Y1  - 2015
A1  - Vinicius Monteiro de Lira
A1  - Valéria Cesário Times
A1  - Chiara Renso
A1  - S Rinzivillo
AB  - The interest in carpooling is increasing due to the need to reduce traffic and noise pollution. Most of the available approaches and systems are route oriented, where driver and passengers are matched when the destination location is the same. ComeWithMe offers a new perspective: the destination is the intended activity instead of a location. This novel matching method aims to boost the possibilities of rides when a passenger can reach a different location while maintaining the activity. We conducted experiments using a real data set of trajectories and our results showed that the proposed matching algorithm improved on the traditional carpooling approach by more than 80%.
JF  - 2015 IEEE 18th International Conference on Intelligent Transportation Systems
PB  - Institute of Electrical and Electronics Engineers (IEEE)
UR  - http://dx.doi.org/10.1109/itsc.2015.414
ER  - 

TY  - CONF
T1  - Community-centric analysis of user engagement in Skype social network
T2  - International conference on Advances in Social Network Analysis and Mining
Y1  - 2015
A1  - Giulio Rossetti
A1  - Luca Pappalardo
A1  - Riivo Kikas
A1  - Dino Pedreschi
A1  - Fosca Giannotti
A1  - Marlon Dumas
JF  - International conference on Advances in Social Network Analysis and Mining
PB  - IEEE
CY  - Paris, France
SN  - 978-1-4503-3854-7
UR  - http://dl.acm.org/citation.cfm?doid=2808797.2809384
ER  - 

TY  - JOUR
T1  - Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks
JF  - Microarrays
Y1  - 2015
A1  - Alina Sirbu
A1  - Martin Crane
A1  - Heather J Ruskin
VL  - 4
UR  - http://www.mdpi.com/2076-3905/4/2/255
ER  - 

TY  - CONF
T1  - Detecting and understanding big events in big cities
T2  - NetMob
Y1  - 2015
A1  - Barbara Furletti
A1  - Lorenzo Gabrielli
A1  - Roberto Trasarti
A1  - Zbigniew Smoreda
A1  - Maarten Vanhoof
A1  - Cezary Ziemlicki
AB  - Recent studies have shown the great potential of big data such as mobile phone location data to model human behavior. Big data allow people's presence in a territory to be analyzed quickly and effectively compared with classical surveys (diaries or questionnaires). One of the drawbacks of these collection systems is the incompleteness of the users' traces; people are localized only when they are using their phones. In this work we define a data mining method for identifying people's presence and understanding the impact of big events in big cities. We exploit the ability of the Sociometer to classify mobile phone users into mobility categories through their presence profile. The experiment, in cooperation with Orange Telecom, was conducted in Paris during the event Fête de la Musique using a privacy-preserving protocol.
JF  - NetMob
CY  - Boston
UR  - http://www.netmob.org/assets/img/netmob15_book_of_abstracts_posters.pdf
ER  - 

TY  - JOUR
T1  - Discrimination- and privacy-aware patterns
JF  - Data Min. Knowl. Discov.
Y1  - 2015
A1  - Sara Hajian
A1  - Josep Domingo-Ferrer
A1  - Anna Monreale
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Data mining is gaining societal momentum due to the ever increasing availability of large amounts of human data, easily collected by a variety of sensing technologies. We are therefore faced with unprecedented opportunities and risks: a deeper understanding of human behavior and how our society works is darkened by a greater chance of privacy intrusion and unfair discrimination based on the extracted patterns and profiles. Consider the case when a set of patterns extracted from the personal data of a population of individual persons is released for a subsequent use into a decision making process, such as, e.g., granting or denying credit. First, the set of patterns may reveal sensitive information about individual persons in the training population and, second, decision rules based on such patterns may lead to unfair discrimination, depending on what is represented in the training cases. Although methods independently addressing privacy or discrimination in data mining have been proposed in the literature, in this context we argue that privacy and discrimination risks should be tackled together, and we present a methodology for doing so while publishing frequent pattern mining results. We describe a set of pattern sanitization methods, one for each discrimination measure used in the legal literature, to achieve a fair publishing of frequent patterns in combination with two possible privacy transformations: one based on k-anonymity and one based on differential privacy. Our proposed pattern sanitization methods based on k-anonymity yield both privacy- and discrimination-protected patterns, while introducing reasonable (controlled) pattern distortion. Moreover, they obtain a better trade-off between protection and data quality than the sanitization methods based on differential privacy. Finally, the effectiveness of our proposals is assessed by extensive experiments.
VL  - 29
UR  - http://dx.doi.org/10.1007/s10618-014-0393-7
ER  - 

TY  - JOUR
T1  - Egalitarianism in the rank aggregation problem: a new dimension for democracy
JF  - Quality & Quantity
Y1  - 2015
A1  - Contucci, Pierluigi
A1  - Panizzi, Emanuele
A1  - Ricci-Tersenghi, Federico
A1  - Alina Sirbu
UR  - http://link.springer.com/article/10.1007/s11135-015-0197-x
ER  - 

TY  - ABST
T1  - An exploration of learning processes as process maps in FLOSS repositories
Y1  - 2015
A1  - Mukala, Patrick
A1  - Cerone, Antonio
A1  - Franco Turini
AB  - Evidence suggests that Free/Libre Open Source Software (FLOSS) environments provide unlimited learning opportunities. Community members engage in a number of activities both during their interaction with their peers and while making use of the tools available in these environments. A number of studies document the existence of learning processes in FLOSS through the analysis of surveys and questionnaires filled by FLOSS project participants. At the same time, the interest in understanding the dynamics of the FLOSS phenomenon, its popularity and success resulted in the development of tools and techniques for extracting and analyzing data from different FLOSS data sources. This new field is called Mining Software Repositories (MSR). In spite of these efforts, there is limited work aiming to provide empirical evidence of learning processes directly from FLOSS repositories. In this paper, we seek to trigger such an initiative by proposing an approach based on Process Mining to trace learning behaviors from FLOSS participants’ trails of activities, as recorded in FLOSS repositories, and visualize them as process maps. Process maps provide a pictorial representation of real behavior as it is recorded in FLOSS data. Our aim is to provide critical evidence that boosts the understanding of learning behavior in FLOSS communities by analyzing the relevant repositories. In order to accomplish this, we propose an effective approach that comprises first the mining of FLOSS repositories in order to generate Event logs, and then the generation of process maps, equipped with relevant statistical data interpreting and indicating the value of process discovery from these repositories.
UR  - http://eprints.adm.unipi.it/id/eprint/2344
ER  - 

TY  - CONF
T1  - Find Your Way Back: Mobility Profile Mining with Constraints
T2  - Principles and Practice of Constraint Programming
Y1  - 2015
A1  - Lars Kotthoff
A1  - Mirco Nanni
A1  - Riccardo Guidotti
A1  - Barry O'Sullivan
AB  - Mobility profile mining is a data mining task that can be formulated as clustering over movement trajectory data. The main challenge is to separate the signal from the noise, i.e. one-off trips. We show that standard data mining approaches suffer the important drawback that they cannot take the symmetry of non-noise trajectories into account. That is, if a trajectory has a symmetric equivalent that covers the same trip in the reverse direction, it should become more likely that neither of them is labelled as noise. We present a constraint model that takes this knowledge into account to produce better clusters. We show the efficacy of our approach on real-world data that was previously processed using standard data mining techniques.
JF  - Principles and Practice of Constraint Programming
PB  - Springer International Publishing
CY  - Cork
ER  - 

TY  - Generic
T1  - The harsh rule of the goals: data-driven performance indicators for football teams
T2  - IEEE International Conference on Data Science and Advanced Analytics
Y1  - 2015
A1  - Paolo Cintia
A1  - Luca Pappalardo
A1  - Dino Pedreschi
A1  - Fosca Giannotti
A1  - Marco Malvaldi
AB  - Sports analytics in general, and football (soccer in USA) analytics in particular, have evolved in recent years in an amazing way, thanks to automated or semi-automated sensing technologies that provide high-fidelity data streams extracted from every game. In this paper we propose a data-driven approach and show that there is a large potential to boost the understanding of football team performance. From observational data of football games we extract a set of pass-based performance indicators and summarize them in the H indicator. We observe a strong correlation among the proposed indicator and the success of a team, and therefore perform a simulation on the four major European championships (78 teams, almost 1500 games). The outcome of each game in the championship was replaced by a synthetic outcome (win, loss or draw) based on the performance indicators computed for each team. We found that the final rankings in the simulated championships are very close to the actual rankings in the real championships, and show that teams with high ranking error show extreme values of a defense/attack efficiency measure, the Pezzali score. Our results are surprising given the simplicity of the proposed indicators, suggesting that a complex systems’ view on football data has the potential of revealing hidden patterns and behavior of superior quality.
JF  - IEEE International Conference on Data Science and Advanced Analytics
UR  - https://www.researchgate.net/profile/Luca_Pappalardo/publication/281318318_The_harsh_rule_of_the_goals_data-driven_performance_indicators_for_football_teams/links/561668e308ae37cfe4090a5d.pdf
ER  - 

TY  - CONF
T1  - A Holistic Approach to Log Data Analysis in High-Performance Computing Systems: The Case of IBM Blue Gene/Q
T2  - Euro-Par 2015: Parallel Processing Workshops, LNCS 9523
Y1  - 2015
A1  - Alina Sirbu
A1  - Ozalp Babaoglu
JF  - Euro-Par 2015: Parallel Processing Workshops, LNCS 9523
PB  - Springer
UR  - http://link.springer.com/chapter/10.1007%2F978-3-319-27308-2_51
ER  - 

TY  - CONF
T1  - Interaction Prediction in Dynamic Networks exploiting Community Discovery
T2  - International conference on Advances in Social Network Analysis and Mining, ASONAM 2015
Y1  - 2015
A1  - Giulio Rossetti
A1  - Riccardo Guidotti
A1  - Diego Pennacchioli
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Due to the growing availability of online social services, interactions between people have become increasingly easy to establish and track. Online social human activities generate digital footprints that describe complex, rapidly evolving, dynamic networks. In such a scenario, one of the most challenging tasks to address involves the prediction of future interactions between pairs of actors. In this study, we want to leverage network dynamics and community structure to predict which future interactions are more likely to appear. To this end, we propose a supervised learning approach which exploits features computed by time-aware forecasts of topological measures calculated between pairs of nodes belonging to the same community. Our experiments on real dynamic networks show that the designed analytical process is able to achieve interesting results.
JF  - International conference on Advances in Social Network Analysis and Mining, ASONAM 2015
PB  - IEEE
CY  - Paris, France
SN  - 978-1-4503-3854-7
UR  - http://dl.acm.org/citation.cfm?doid=2808797.2809401
ER  - 

TY  - JOUR
T1  - Introduction to the special issue on Artificial Intelligence for Society and Economy
JF  - Intelligenza Artificiale
Y1  - 2015
A1  - Salvatore Ruggieri
VL  - 9
ER  - 

TY  - Generic
T1  - ItEM: A Vector Space Model to Bootstrap an Italian Emotive Lexicon
T2  - Second Italian Conference on Computational Linguistics CLiC-it 2015
Y1  - 2015
A1  - Lucia Passaro
A1  - Pollacci, Laura
A1  - Lenci, Alessandro
AB  - In recent years computational linguistics has seen a rising interest in subjectivity, opinions, feelings and emotions. Even though great attention has been given to polarity recognition, the research in emotion detection has had to rely on small emotion resources. In this paper, we present a methodology to build emotive lexicons by jointly exploiting vector space models and human annotation, and we provide the first results of the evaluation with a crowdsourcing experiment.
JF  - Second Italian Conference on Computational Linguistics CLiC-it 2015
VL  - II
SN  - 978-88-99200-62-6
ER  - 

TY  - CONF
T1  - The layered structure of company share networks
T2  - Data Science and Advanced Analytics (DSAA), 2015 IEEE International Conference on
Y1  - 2015
A1  - Andrea Romei
A1  - Salvatore Ruggieri
A1  - Franco Turini
AB  - We present a framework for the analysis of corporate governance problems using network science and graph algorithms on ownership networks. In such networks, nodes model companies/shareholders and edges model shares owned. Inspired by the widespread pyramidal organization of corporate groups of companies, we model ownership networks as layered graphs, and exploit the layered structure to design feasible and efficient solutions to three key problems of corporate governance. The first one is the long-standing problem of computing direct and indirect ownership (integrated ownership problem). The other two problems are introduced here: computing direct and indirect dividends (dividend problem), and computing the group of companies controlled by a parent shareholder (corporate group problem). We conduct an extensive empirical analysis of the Italian ownership network, which, with its 3.9M nodes, is 30× the largest network studied so far.
JF  - Data Science and Advanced Analytics (DSAA), 2015 IEEE International Conference on
PB  - IEEE
ER  - 

TY  - CONF
T1  - Managing travels with PETRA: The Rome use case
T2  - 2015 31st IEEE International Conference on Data Engineering Workshops (ICDEW)
Y1  - 2015
A1  - Botea, Adi
A1  - Braghin, Stefano
A1  - Lopes, Nuno
A1  - Riccardo Guidotti
A1  - Francesca Pratesi
AB  - The aim of the PETRA project is to provide the basis for a city-wide transportation system that supports policies catering for both individual preferences of users and city-wide travel patterns. The PETRA platform will be initially deployed in the partner city of Rome, and later in Venice, and Tel-Aviv.
JF  - 2015 31st IEEE International Conference on Data Engineering Workshops (ICDEW)
PB  - IEEE
ER  - 

TY  - CONF
T1  - Mining learning processes from FLOSS mailing archives
T2  - Conference on e-Business, e-Services and e-Society
Y1  - 2015
A1  - Mukala, Patrick
A1  - Cerone, Antonio
A1  - Franco Turini
AB  - Evidence suggests that Free/Libre Open Source Software (FLOSS) environments provide unlimited learning opportunities. Community members engage in a number of activities both during their interaction with their peers and while making use of these environments. As FLOSS repositories store data about participants’ interaction and activities, we analyze participants’ interaction and knowledge exchange in emails to trace learning activities that occur in distinct phases of the learning process. We make use of semantic search in SQL to retrieve data and build corresponding event logs which are then fed to a process mining tool in order to produce visual workflow nets. We view these nets as representative of the traces of learning activities in FLOSS as well as their relevant flow of occurrence. Additional statistical details are provided to contextualize and describe these models.
JF  - Conference on e-Business, e-Services and e-Society
PB  - Springer, Cham
ER  - 

TY  - CONF
T1  - Mobility Mining for Journey Planning in Rome
T2  - Machine Learning and Knowledge Discovery in Databases
Y1  - 2015
A1  - Michele Berlingerio
A1  - Bicer, Veli
A1  - Botea, Adi
A1  - Braghin, Stefano
A1  - Lopes, Nuno
A1  - Riccardo Guidotti
A1  - Francesca Pratesi
AB  - We present recent results on integrating private car GPS routines obtained by a Data Mining module into the PETRA (PErsonal TRansport Advisor) platform. The routines are used as additional “bus lines”, available to provide a ride to travelers. We present the effects of querying the planner with and without the routines, which show how Data Mining may help Smarter Cities applications.
JF  - Machine Learning and Knowledge Discovery in Databases
PB  - Springer International Publishing
ER  - 

TY  - JOUR
T1  - Participatory Patterns in an International Air Quality Monitoring Initiative
JF  - PLoS One
Y1  - 2015
A1  - Alina Sirbu
A1  - Becker, Martin
A1  - Saverio Caminiti
A1  - De Baets, Bernard
A1  - Elen, Bart
A1  - Francis, Louise
A1  - Pietro Gravino
A1  - Hotho, Andreas
A1  - Ingarra, Stefano
A1  - Vittorio Loreto
A1  - Molino, Andrea
A1  - Mueller, Juergen
A1  - Peters, Jan
A1  - Ricchiuti, Ferdinando
A1  - Saracino, Fabio
A1  - Vito D P Servedio
A1  - Stumme, Gerd
A1  - Theunis, Jan
A1  - Francesca Tria
A1  - Van den Bossche, Joris
AB  - The issue of sustainability is at the top of the political and societal agenda, being considered of extreme importance and urgency. Human individual action impacts the environment both locally (e.g., local air/water quality, noise disturbance) and globally (e.g., climate change, resource use). Urban environments represent a crucial example, with an increasing realization that the most effective way of producing a change is involving the citizens themselves in monitoring campaigns (a citizen science bottom-up approach). This is possible by developing novel technologies and IT infrastructures enabling large citizen participation. Here, in the wider framework of one of the first such projects, we show results from an international competition where citizens were involved in mobile air pollution monitoring using low cost sensing devices, combined with a web-based game to monitor perceived levels of pollution. Measures of shift in perceptions over the course of the campaign are provided, together with insights into participatory patterns emerging from this study. Interesting effects related to inertia and to direct involvement in measurement activities rather than indirect information exposure are also highlighted, indicating that direct involvement can enhance learning and environmental awareness. In the future, this could result in better adoption of policies towards decreasing pollution.
VL  - 10
ER  - 

TY  - JOUR
T1  - Product assortment and customer mobility
JF  - EPJ Data Science
Y1  - 2015
A1  - Michele Coscia
A1  - Diego Pennacchioli
A1  - Fosca Giannotti
AB  - Customer mobility is dependent on the sophistication of their needs: sophisticated customers need to travel more to fulfill their needs. In this paper, we provide more detailed evidence of this phenomenon, providing an empirical validation of the Central Place Theory. For each customer, we detect what is her favorite shop, where she purchases most products. We can study the relationship between the favorite shop and the closest one, by recording the influence of the shop’s size and the customer’s sophistication in the discordance cases, i.e. the cases in which the favorite shop is not the closest one. We show that larger shops are able to retain most of their closest customers and they are able to catch large portions of customers from smaller shops around them. We connect this observation with the shop’s larger sophistication, and not with its other characteristics, as the phenomenon is especially noticeable when customers want to satisfy their sophisticated needs. This is a confirmation of the recent extensions of the Central Place Theory, where the original assumptions of homogeneity in customer purchase power and needs are challenged. Different types of shops have also different survival logics. The largest shops get closed if they are unable to catch customers from the smaller shops, while medium size shops get closed if they cannot retain their closest customers. All analyses are performed on a large real-world dataset recording all purchases from millions of customers across the west coast of Italy.
VL  - 4
UR  - http://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-015-0051-3
ER  - 

TY  - CONF
T1  - Quantification in Social Networks
T2  - International Conference on Data Science and Advanced Analytics (IEEE DSAA'2015)
Y1  - 2015
A1  - Letizia Milli
A1  - Anna Monreale
A1  - Giulio Rossetti
A1  - Dino Pedreschi
A1  - Fosca Giannotti
A1  - Fabrizio Sebastiani
AB  - In many real-world applications there is a need to monitor the distribution of a population across different classes, and to track changes in this distribution over time. As an example, an important task is to monitor the percentage of unemployed adults in a given region. When the membership of an individual in a class cannot be established deterministically, a typical solution is the classification task. However, in the above applications the final goal is not determining which class the individuals belong to, but estimating the prevalence of each class in the unlabeled data. This task is called quantification. Most of the work in the literature addressed the quantification problem considering data presented in conventional attribute format. With the ever-growing availability of web and social media, we have a flourishing of network data representing a new important source of information, and by using network quantification techniques we could quantify collective behavior, i.e., the number of users that are involved in certain types of activities, preferences, or behaviors. In this paper we exploit the homophily effect observed in many social networks in order to construct a quantifier for networked data. Our experiments show the effectiveness of the proposed approaches and the comparison with the existing state-of-the-art quantification methods shows that they are more accurate.
JF  - International Conference on Data Science and Advanced Analytics (IEEE DSAA'2015)
PB  - IEEE
CY  - Paris, France
UR  - http://www.giuliorossetti.net/about/wp-content/uploads/2015/12/main_DSAA.pdf
ER  - 

TY  - JOUR
T1  - Returners and explorers dichotomy in human mobility
JF  - Nat Commun
Y1  - 2015
A1  - Luca Pappalardo
A1  - Filippo Simini
A1  - S Rinzivillo
A1  - Dino Pedreschi
A1  - Fosca Giannotti
A1  - Barabasi, Albert-Laszlo
AB  - The availability of massive digital traces of human whereabouts has offered a series of novel insights on the quantitative patterns characterizing human mobility. In particular, numerous recent studies have led to an unexpected consensus: the considerable variability in the characteristic travelled distance of individuals coexists with a high degree of predictability of their future locations. Here we shed light on this surprising coexistence by systematically investigating the impact of recurrent mobility on the characteristic distance travelled by individuals. Using both mobile phone and GPS data, we discover the existence of two distinct classes of individuals: returners and explorers. As existing models of human mobility cannot explain the existence of these two classes, we develop more realistic models able to capture the empirical findings. Finally, we show that returners and explorers play a distinct quantifiable role in spreading phenomena and that a correlation exists between their mobility patterns and social interactions.
VL  - 6
UR  - http://dx.doi.org/10.1038/ncomms9166
ER  - 

TY  - JOUR
T1  - A risk model for privacy in trajectory data
JF  - Journal of Trust Management
Y1  - 2015
A1  - Anirban Basu
A1  - Anna Monreale
A1  - Roberto Trasarti
A1  - Juan Camilo Corena
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - Shinsaku Kiyomoto
A1  - Yutaka Miyake
A1  - Tadashi Yanagihara
AB  - Time sequence data relating to users, such as medical histories and mobility data, are good candidates for data mining, but often contain highly sensitive information. Different methods in privacy-preserving data publishing are utilised to release such private data so that individual records in the released data cannot be re-linked to specific users with a high degree of certainty. These methods provide theoretical worst-case privacy risks as measures of the privacy protection that they offer. However, often with many real-world data the worst-case scenario is too pessimistic and does not provide a realistic view of the privacy risks: the real probability of re-identification is often much lower than the theoretical worst-case risk. In this paper, we propose a novel empirical risk model for privacy which, in relation to the cost of privacy attacks, demonstrates better the practical risks associated with a privacy preserving data release. We show detailed evaluation of the proposed risk model by using k-anonymised real-world mobility data and then, we show how the empirical evaluation of the privacy risk has a different trend in synthetic data describing random movements.
VL  - 2
ER  - 

TY  - CONF
T1  - Segregation Discovery in a Social Network of Companies
T2  - International Symposium on Intelligent Data Analysis
Y1  - 2015
A1  - Alessandro Baroni
A1  - Salvatore Ruggieri
AB  - We introduce a framework for a data-driven analysis of segregation of minority groups in social networks, and challenge it on a complex scenario. The framework builds on quantitative measures of segregation, called segregation indexes, proposed in the social science literature. The segregation discovery problem consists of searching sub-graphs and sub-groups for which a reference segregation index is above a minimum threshold. A search algorithm is devised that solves the segregation problem. The framework is challenged on the analysis of segregation of social groups in the boards of directors of the real and large network of Italian companies connected through shared directors.
JF  - International Symposium on Intelligent Data Analysis
PB  - Springer, Cham
ER  - 

TY  - JOUR
T1  - Small Area Model-Based Estimators Using Big Data Sources
JF  - Journal of Official Statistics
Y1  - 2015
A1  - Stefano Marchetti
A1  - Caterina Giusti
A1  - Monica Pratesi
A1  - Nicola Salvati
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - S Rinzivillo
A1  - Luca Pappalardo
A1  - Lorenzo Gabrielli
VL  - 31
ER  - 

TY  - CONF
T1  - Social or green? A data-driven approach for more enjoyable carpooling
T2  - Intelligent Transportation Systems (ITSC), 2015 IEEE 18th International Conference on
Y1  - 2015
A1  - Riccardo Guidotti
A1  - Sassi, Andrea
A1  - Michele Berlingerio
A1  - Pascale, Alessandra
A1  - Ghaddar, Bissan
JF  - Intelligent Transportation Systems (ITSC), 2015 IEEE 18th International Conference on
PB  - IEEE
ER  - 

TY  - CONF
T1  - TOSCA: two-steps clustering algorithm for personal locations detection
T2  - Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems, Bellevue, WA, USA, November 3-6, 2015
Y1  - 2015
A1  - Riccardo Guidotti
A1  - Roberto Trasarti
A1  - Mirco Nanni
JF  - Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems, Bellevue, WA, USA, November 3-6, 2015
UR  - http://doi.acm.org/10.1145/2820783.2820818
ER  - 

TY  - CHAP
T1  - Towards a Boosted Route Planner Using Individual Mobility Models
T2  - Software Engineering and Formal Methods
Y1  - 2015
A1  - Riccardo Guidotti
A1  - Paolo Cintia
JF  - Software Engineering and Formal Methods
PB  - Springer Berlin Heidelberg
ER  - 

TY  - CONF
T1  - Towards Data-Driven Autonomics in Data Centers
T2  - IEEE International Conference on Cloud and Autonomic Computing
Y1  - 2015
A1  - Alina Sirbu
A1  - Ozalp Babaoglu
JF  - IEEE International Conference on Cloud and Autonomic Computing
PB  - IEEE
UR  - http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7312140&filter%3DAND%28p_IS_Number%3A7312127%29
ER  - 

TY  - CONF
T1  - Towards user-centric data management: individual mobility analytics for collective services
T2  - Proceedings of the 4th ACM SIGSPATIAL International Workshop on Mobile Geographic Information Systems, MobiGIS 2015, Bellevue, WA, USA, November 3-6, 2015
Y1  - 2015
A1  - Riccardo Guidotti
A1  - Roberto Trasarti
A1  - Mirco Nanni
A1  - Fosca Giannotti
JF  - Proceedings of the 4th ACM SIGSPATIAL International Workshop on Mobile Geographic Information Systems, MobiGIS 2015, Bellevue, WA, USA, November 3-6, 2015
UR  - http://doi.acm.org/10.1145/2834126.2834132
ER  - 

TY  - CHAP
T1  - Use of Mobile Phone Data to Estimate Visitors Mobility Flows
T2  - Software Engineering and Formal Methods
Y1  - 2015
A1  - Lorenzo Gabrielli
A1  - Barbara Furletti
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - S Rinzivillo
AB  - Big Data originating from the digital breadcrumbs of human activities, sensed as a by-product of the technologies that we use for our daily activities, allows us to observe the individual and collective behavior of people at an unprecedented level of detail. Many dimensions of our social life have big data “proxies”, such as the mobile calls data for mobility. In this paper we investigate to what extent data coming from mobile operators could be a support in producing reliable and timely estimates of intra-city mobility flows. The idea is to define an estimation method based on calling data to characterize the mobility habits of visitors at the level of a single municipality.
JF  - Software Engineering and Formal Methods
PB  - Springer International Publishing
VL  - 8938
UR  - http://link.springer.com/chapter/10.1007%2F978-3-319-15201-1_14
ER  - 

TY  - CONF
T1  - An abstract state machine (ASM) representation of learning process in FLOSS communities
T2  - International Conference on Software Engineering and Formal Methods
Y1  - 2014
A1  - Mukala, Patrick
A1  - Cerone, Antonio
A1  - Franco Turini
AB  - Free/Libre Open Source Software (FLOSS) communities as collaborative environments enable the occurrence of learning between participants in these groups. With the increasing research interest in understanding the mechanisms and processes through which learning occurs in FLOSS, there is an imperative to describe these processes. One successful way of doing this is through specification methods. In this paper, we describe the adoption of Abstract State Machines (ASMs) as a specification methodology for the description of learning processes in FLOSS. The goal of this endeavor is to represent the many possible steps and/or activities FLOSS participants go through during interactions that can be categorized as learning processes. Through ASMs, we express learning phases as states while activities that take place before moving from one state to another are expressed as transitions.
JF  - International Conference on Software Engineering and Formal Methods
PB  - Springer, Cham
ER  - 

TY  - JOUR
T1  - Anonymity preserving sequential pattern mining
JF  - Artif. Intell. Law
Y1  - 2014
A1  - Anna Monreale
A1  - Dino Pedreschi
A1  - Ruggero G. Pensa
A1  - Fabio Pinelli
AB  - The increasing availability of personal data of a sequential nature, such as time-stamped transaction or location data, enables increasingly sophisticated sequential pattern mining techniques. However, privacy is at risk if it is possible to reconstruct the identity of individuals from sequential data. Therefore, it is important to develop privacy-preserving techniques that support publishing of really anonymous data, without altering the analysis results significantly. In this paper we propose to apply the Privacy-by-design paradigm for designing a technological framework to counter the threats of undesirable, unlawful effects of privacy violation on sequence data, without obstructing the knowledge discovery opportunities of data mining technologies. First, we introduce a k-anonymity framework for sequence data, by defining the sequence linking attack model and its associated countermeasure, a k-anonymity notion for sequence datasets, which provides a formal protection against the attack. Second, we instantiate this framework and provide a specific method for constructing the k-anonymous version of a sequence dataset, which preserves the results of sequential pattern mining, together with several basic statistics and other analytical properties of the original data, including the clustering structure. A comprehensive experimental study on realistic datasets of process-logs, web-logs and GPS tracks is carried out, which empirically shows how, in our proposed method, the protection of privacy meets analytical utility.
VL  - 22
UR  - http://dx.doi.org/10.1007/s10506-014-9154-6
ER  - 

TY  - CONF
T1  - Anti-discrimination analysis using privacy attack strategies
T2  - Joint European Conference on Machine Learning and Knowledge Discovery in Databases
Y1  - 2014
A1  - Salvatore Ruggieri
A1  - Sara Hajian
A1  - Kamiran, Faisal
A1  - Zhang, Xiangliang
AB  - Social discrimination discovery from data is an important task to identify illegal and unethical discriminatory patterns towards protected-by-law groups, e.g., ethnic minorities. We deploy privacy attack strategies as tools for discrimination discovery under hard assumptions which have rarely been tackled in the literature: indirect discrimination discovery, privacy-aware discrimination discovery, and discrimination data recovery. The intuition comes from the intriguing parallel between the role of the anti-discrimination authority in the three scenarios above and the role of an attacker in private data publishing. We design strategies and algorithms inspired by or based on Fréchet bounds attacks, attribute inference attacks, and minimality attacks for the purpose of unveiling hidden discriminatory practices. Experimental results show that they can be effective tools in the hands of anti-discrimination authorities.
JF  - Joint European Conference on Machine Learning and Knowledge Discovery in Databases
PB  - Springer, Berlin, Heidelberg
ER  - 

TY  - CONF
T1  - BiDAl: Big Data Analyzer for Cluster Traces
T2  - Informatika (BigSys workshop)
Y1  - 2014
A1  - Balliu, Alkida
A1  - Olivetti, Dennis
A1  - Ozalp Babaoglu
A1  - Marzolla, Moreno
A1  - Alina Sirbu
JF  - Informatika (BigSys workshop)
PB  - GI-Edition Lecture Notes in Informatics
UR  - http://arxiv.org/abs/1410.1309
ER  - 

TY  - CONF
T1  - Big data analytics for smart mobility: a case study
T2  - EDBT/ICDT 2014 Workshops - Mining Urban Data (MUD)
Y1  - 2014
A1  - Barbara Furletti
A1  - Roberto Trasarti
A1  - Lorenzo Gabrielli
A1  - Mirco Nanni
A1  - Dino Pedreschi
JF  - EDBT/ICDT 2014 Workshops - Mining Urban Data (MUD)
CY  - Athens, Greece
UR  - http://ceur-ws.org/Vol-1133/paper-57.pdf
ER  - 

TY  - CONF
T1  - CF-inspired Privacy-Preserving Prediction of Next Location in the Cloud
T2  - Cloud Computing Technology and Science (CloudCom), 2014 IEEE 6th International Conference on
Y1  - 2014
A1  - Anirban Basu
A1  - Juan Camilo Corena
A1  - Anna Monreale
A1  - Dino Pedreschi
A1  - Fosca Giannotti
A1  - Shinsaku Kiyomoto
A1  - Vaidya, Jaideep
A1  - Yutaka Miyake
AB  - Mobility data gathered from location sensors such as Global Positioning System (GPS) enabled phones and vehicles is valuable for spatio-temporal data mining for various location-based services (LBS). Such data is often considered sensitive and there exist many mechanisms for privacy preserving analyses of the data. Through various anonymisation mechanisms, it can be ensured with a high probability that a particular individual cannot be identified when mobility data is outsourced to third parties for analysis. However, challenges remain with the privacy of the queries on outsourced analysis results, especially when the queries are sent directly to third parties by end-users. Drawing inspiration from our earlier work in privacy preserving collaborative filtering (CF) and next location prediction, in this exploratory work, we propose a novel representation of trajectory data in the CF domain and experiment with a privacy preserving Slope One CF predictor. We present evaluations for the accuracy and the computational performance of our proposal using anonymised data gathered from real traffic data in the Italian cities of Pisa and Milan. One use-case is a third-party location-prediction-as-a-service deployed on a public cloud, which can respond to privacy-preserving queries while enabling data owners to build a rich predictor on the cloud.
JF  - Cloud Computing Technology and Science (CloudCom), 2014 IEEE 6th International Conference on
PB  - IEEE
UR  - http://dx.doi.org/10.1109/CloudCom.2014.114
ER  - 

TY  - Generic
T1  - The CoLing Lab system for Sentiment Polarity Classification of tweets
T2  - First Italian Conference on Computational Linguistics CLiC-it 2014 & Fourth International Workshop EVALITA 2014
Y1  - 2014
A1  - Lucia Passaro
A1  - Lebani, Gianluca E
A1  - Pollacci, Laura
A1  - Chersoni, Emmanuele
A1  - Lenci, Alessandro
JF  - First Italian Conference on Computational Linguistics CLiC-it 2014 & Fourth International Workshop EVALITA 2014
VL  - II
ER  - 

TY  - JOUR
T1  - On the complexity of quantified linear systems
JF  - Theoretical Computer Science
Y1  - 2014
A1  - Salvatore Ruggieri
A1  - Eirinakis, Pavlos
A1  - Subramani, K
A1  - Wojciechowski, Piotr
AB  - In this paper, we explore the computational complexity of the conjunctive fragment of the first-order theory of linear arithmetic. Quantified propositional formulas of linear inequalities with (k−1) quantifier alternations are log-space complete in $\Sigma_k^P$ or $\Pi_k^P$ depending on the initial quantifier. We show that when we restrict ourselves to quantified conjunctions of linear inequalities, i.e., quantified linear systems, the complexity classes collapse to polynomial time. In other words, the presence of universal quantifiers does not alter the complexity of the linear programming problem, which is known to be in P. Our result reinforces the importance of sentence formats from the perspective of computational complexity.
VL  - 518
ER  - 

TY  - JOUR
T1  - Decision tree building on multi-core using FastFlow
JF  - Concurrency and Computation: Practice and Experience
Y1  - 2014
A1  - Aldinucci, Marco
A1  - Salvatore Ruggieri
A1  - Torquati, Massimo
AB  - The whole computer hardware industry embraced the multi-core. The extreme optimisation of sequential algorithms is then no longer sufficient to squeeze the real machine power, which can be only exploited via thread-level parallelism. Decision tree algorithms exhibit natural concurrency that makes them suitable to be parallelised. This paper presents an in-depth study of the parallelisation of an implementation of the C4.5 algorithm for multi-core architectures. We characterise elapsed time lower bounds for the forms of parallelisations adopted and achieve close to optimal performance. Our implementation is based on the FastFlow parallel programming environment, and it requires minimal changes to the original sequential code.
VL  - 26
ER  - 

TY  - JOUR
T1  - Discovering urban and country dynamics from mobile phone data with spatial correlation patterns
JF  - Telecommunications Policy
Y1  - 2014
A1  - Roberto Trasarti
A1  - Ana-Maria Olteanu-Raimond
A1  - Mirco Nanni
A1  - Thomas Couronné
A1  - Barbara Furletti
A1  - Fosca Giannotti
A1  - Zbigniew Smoreda
A1  - Cezary Ziemlicki
KW  - Urban dynamics
AB  - Mobile communication technologies pervade our society and existing wireless networks are able to sense the movement of people, generating large volumes of data related to human activities, such as mobile phone call records. At present, this kind of data is collected and stored by telecom operators' infrastructures mainly for billing reasons, yet it represents a major source of information in the study of human mobility. In this paper, we propose an analytical process aimed at extracting interconnections between different areas of the city that emerge from highly correlated temporal variations of population local densities. To accomplish this objective, we propose a process based on two analytical tools: (i) a method to estimate the presence of people in different geographical areas; and (ii) a method to extract time- and space-constrained sequential patterns capable of capturing correlations among geographical areas in terms of significant co-variations of the estimated presence. The methods are presented and combined in order to deal with two real scenarios of different spatial scale: the Paris Region and the whole of France.
UR  - http://www.sciencedirect.com/science/article/pii/S0308596113002012
ER  - 

TY  - CHAP
T1  - EGIA–Evolutionary Optimisation of Gene Regulatory Networks, an Integrative Approach
T2  - Complex Networks V
Y1  - 2014
A1  - Alina Sirbu
A1  - Martin Crane
A1  - Heather J Ruskin
JF  - Complex Networks V
PB  - Springer International Publishing
UR  - http://link.springer.com/chapter/10.1007/978-3-319-05401-8_21
ER  - 

TY  - CONF
T1  - Fair pattern discovery
T2  - Symposium on Applied Computing, SAC 2014, Gyeongju, Republic of Korea - March 24 - 28, 2014
Y1  - 2014
A1  - Sara Hajian
A1  - Anna Monreale
A1  - Dino Pedreschi
A1  - Josep Domingo-Ferrer
A1  - Fosca Giannotti
AB  - Data mining is gaining societal momentum due to the ever increasing availability of large amounts of human data, easily collected by a variety of sensing technologies. We are witnessing unprecedented opportunities of understanding human and society behavior that unfortunately are darkened by several risks for human rights: one of these is the unfair discrimination based on the extracted patterns and profiles. Consider the case when a set of patterns extracted from the personal data of a population of individual persons is released for subsequent use in a decision making process, such as, e.g., granting or denying credit. Decision rules based on such patterns may lead to unfair discrimination, depending on what is represented in the training cases. In this context, we address the discrimination risks resulting from publishing frequent patterns. We present a set of pattern sanitization methods, one for each discrimination measure used in the legal literature, for fair (discrimination-protected) publishing of frequent pattern mining results. Our proposed pattern sanitization methods yield discrimination-protected patterns, while introducing reasonable (controlled) pattern distortion. Finally, the effectiveness of our proposals is assessed by extensive experiments.
JF  - Symposium on Applied Computing, SAC 2014, Gyeongju, Republic of Korea - March 24 - 28, 2014
UR  - http://doi.acm.org/10.1145/2554850.2555043
ER  - 

TY  - JOUR
T1  - Introduction to special issue on computational methods for enforcing privacy and fairness in the knowledge society
JF  - Artificial Intelligence and Law
Y1  - 2014
A1  - Sergio Mascetti
A1  - Ricci, Annarita
A1  - Salvatore Ruggieri
VL  - 22
ER  - 

TY  - CONF
T1  - Investigating semantic regularity of human mobility lifestyle
T2  - 18th International Database Engineering & Applications Symposium, IDEAS 2014, Porto, Portugal, July 7-9, 2014
Y1  - 2014
A1  - Vinicius Monteiro de Lira
A1  - S Rinzivillo
A1  - Chiara Renso
A1  - Valéria Cesário Times
A1  - Patrícia C. A. R. Tedesco
JF  - 18th International Database Engineering & Applications Symposium, IDEAS 2014, Porto, Portugal, July 7-9, 2014
PB  - ACM
CY  - Porto, Portugal
UR  - http://doi.acm.org/10.1145/2628194.2628226
ER  - 

TY  - CONF
T1  - MAPMOLTY: A Web Tool for Discovering Place Loyalty Based on Mobile Crowdsource Data
T2  - Web Engineering, 14th International Conference, ICWE 2014, Toulouse, France, July 1-4, 2014. Proceedings
Y1  - 2014
A1  - Vinicius Monteiro de Lira
A1  - S Rinzivillo
A1  - Valéria Cesário Times
A1  - Chiara Renso
JF  - Web Engineering, 14th International Conference, ICWE 2014, Toulouse, France, July 1-4, 2014. Proceedings
UR  - http://dx.doi.org/10.1007/978-3-319-08245-5_43
ER  - 

TY  - CONF
T1  - Mining efficient training patterns of non-professional cyclists
T2  - 22nd Italian Symposium on Advanced Database Systems, SEBD 2014, Sorrento Coast, Italy, June 16-18, 2014.
Y1  - 2014
A1  - Paolo Cintia
A1  - Luca Pappalardo
A1  - Dino Pedreschi
JF  - 22nd Italian Symposium on Advanced Database Systems, SEBD 2014, Sorrento Coast, Italy, June 16-18, 2014.
ER  - 

TY  - CHAP
T1  - Mobility Profiling
T2  - Data Science and Simulation in Transportation Research
Y1  - 2014
A1  - Mirco Nanni
A1  - Roberto Trasarti
A1  - Paolo Cintia
A1  - Barbara Furletti
A1  - Chiara Renso
A1  - Lorenzo Gabrielli
A1  - S Rinzivillo
A1  - Fosca Giannotti
AB  - The ability to understand the dynamics of human mobility is crucial for tasks like urban planning and transportation management. The recent rapidly growing availability of large spatio-temporal datasets gives us the possibility to develop sophisticated and accurate analysis methods and algorithms that can enable us to explore several relevant mobility phenomena: the distinct access paths to a territory, the groups of persons that move together in space and time, the regions of a territory that contain a high density of traffic demand, etc. All these paradigmatic perspectives focus on a collective view of the mobility where the interesting phenomenon is the result of the contribution of several moving objects. In this chapter, the authors explore a different approach to the topic and focus on the analysis and understanding of relevant individual mobility habits in order to assign a profile to an individual on the basis of his/her mobility. This process adds a semantic level to the raw mobility data, enabling further analyses that require a deeper understanding of the data itself. The studies described in this chapter are based on two large datasets of spatio-temporal data, originated, respectively, from GPS-equipped devices and from a mobile phone network.
JF  - Data Science and Simulation in Transportation Research
PB  - IGI Global
ER  - 

TY  - JOUR
T1  - A multidisciplinary survey on discrimination analysis
JF  - The Knowledge Engineering Review
Y1  - 2014
A1  - Andrea Romei
A1  - Salvatore Ruggieri
AB  - The collection and analysis of observational and experimental data represent the main tools for assessing the presence, the extent, the nature, and the trend of discrimination phenomena. Data analysis techniques have been proposed in the last 50 years in the economic, legal, statistical, and, recently, in the data mining literature. This is not surprising, since discrimination analysis is a multidisciplinary problem, involving sociological causes, legal argumentations, economic models, statistical techniques, and computational issues. The objective of this survey is to provide a guidance and a glue for researchers and anti-discrimination data analysts on concepts, problems, application areas, datasets, methods, and approaches from a multidisciplinary perspective. We organize the approaches according to their method of data collection as observational, quasi-experimental, and experimental studies. A fourth line of recently blooming research on knowledge discovery based methods is also covered. Observational methods are further categorized on the basis of their application context: labor economics, social profiling, consumer markets, and others.
VL  - 29
ER  - 

TY  - CONF
T1  - Ontolifloss: Ontology for learning processes in FLOSS communities
T2  - International Conference on Software Engineering and Formal Methods
Y1  - 2014
A1  - Mukala, Patrick
A1  - Cerone, Antonio
A1  - Franco Turini
AB  - Free/Libre Open Source Software (FLOSS) communities are considered an example of commons-based peer-production models where groups of participants work together to achieve projects of common purpose. In these settings, many of the activities that occur can be documented, and these activities have established the communities as learning environments. As knowledge exchange is proved to occur in FLOSS, the dynamic and free nature of participation poses a great challenge in understanding activities pertaining to Learning Processes. In this paper we raise this question and propose an ontology (called OntoLiFLOSS) in order to define terms and concepts that can explain learning activities taking place in these communities. The objective of this endeavor is to define in the simplest possible way a common definition of concepts and activities that can guide the identification of learning processes taking place among FLOSS members in any of the standard repositories such as mailing lists, SVN, bug trackers and even discussion forums.
JF  - International Conference on Software Engineering and Formal Methods
PB  - Springer, Cham
ER  - 

TY  - CONF
T1  - Overlap versus partition: marketing classification and customer profiling in complex networks of products
T2  - Data engineering workshops (ICDEW), 2014 IEEE 30th international conference on
Y1  - 2014
A1  - Diego Pennacchioli
A1  - Michele Coscia
A1  - Dino Pedreschi
AB  - In recent years we witnessed the explosion in the availability of data regarding human and customer behavior in the market. This data richness era has fostered the development of useful applications in understanding how markets and the minds of the customers work. In this paper we focus on the analysis of complex networks based on customer behavior. Complex network analysis has provided a new and wide toolbox for the classic data mining task of clustering. With community discovery, i.e. the detection of functional modules in complex networks, we are now able to group together customers and products using a variety of different criteria. The aim of this paper is to explore this new analytic degree of freedom. We are interested in providing a case study uncovering the meaning of different community discovery algorithms on a network of products connected together because they are co-purchased by the same customers. We focus our interest on the different interpretations of a partition approach, where each product belongs to a single community, against an overlapping approach, where each product can belong to multiple communities. We found that the former is useful to improve the marketing classification of products, while the latter is able to create a collection of different customer profiles.
JF  - Data engineering workshops (ICDEW), 2014 IEEE 30th international conference on
PB  - IEEE
UR  - http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6818312
ER  - 

TY  - CONF
T1  - The patterns of musical influence on the Last.Fm social network
T2  - 22nd Italian Symposium on Advanced Database Systems, SEBD 2014, Sorrento Coast, Italy, June 16-18, 2014.
Y1  - 2014
A1  - Diego Pennacchioli
A1  - Giulio Rossetti
A1  - Luca Pappalardo
A1  - Dino Pedreschi
A1  - Fosca Giannotti
A1  - Michele Coscia
JF  - 22nd Italian Symposium on Advanced Database Systems, SEBD 2014, Sorrento Coast, Italy, June 16-18, 2014.
ER  - 

TY  - CONF
T1  - A Privacy Risk Model for Trajectory Data
T2  - Trust Management VIII - 8th IFIP WG 11.11 International Conference, IFIPTM 2014, Singapore, July 7-10, 2014. Proceedings
Y1  - 2014
A1  - Anirban Basu
A1  - Anna Monreale
A1  - Juan Camilo Corena
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - Shinsaku Kiyomoto
A1  - Yutaka Miyake
A1  - Tadashi Yanagihara
A1  - Roberto Trasarti
AB  - Time sequence data relating to users, such as medical histories and mobility data, are good candidates for data mining, but often contain highly sensitive information. Different methods in privacy-preserving data publishing are utilised to release such private data so that individual records in the released data cannot be re-linked to specific users with a high degree of certainty. These methods provide theoretical worst-case privacy risks as measures of the privacy protection that they offer. However, often with many real-world data the worst-case scenario is too pessimistic and does not provide a realistic view of the privacy risks: the real probability of re-identification is often much lower than the theoretical worst-case risk. In this paper we propose a novel empirical risk model for privacy which, in relation to the cost of privacy attacks, demonstrates better the practical risks associated with a privacy preserving data release. We show detailed evaluation of the proposed risk model by using k-anonymised real-world mobility data.
JF  - Trust Management VIII - 8th IFIP WG 11.11 International Conference, IFIPTM 2014, Singapore, July 7-10, 2014. Proceedings
UR  - http://dx.doi.org/10.1007/978-3-662-43813-8_9
ER  - 

TY  - JOUR
T1  - Privacy-by-Design in Big Data Analytics and Social Mining
JF  - EPJ Data Science
Y1  - 2014
A1  - Anna Monreale
A1  - S Rinzivillo
A1  - Francesca Pratesi
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - Privacy is an ever-growing concern in our society and is becoming a fundamental aspect to take into account when one wants to use, publish and analyze data involving human personal sensitive information. Unfortunately, it is increasingly hard to transform the data in a way that it protects sensitive information: we live in the era of big data characterized by unprecedented opportunities to sense, store and analyze social data describing human activities in great detail and resolution. As a result, privacy preservation simply cannot be accomplished by de-identification alone. In this paper, we propose the privacy-by-design paradigm to develop technological frameworks for countering the threats of undesirable, unlawful effects of privacy violation, without obstructing the knowledge discovery opportunities of social mining and big data analytical technologies. Our main idea is to inscribe privacy protection into the knowledge discovery technology by design, so that the analysis incorporates the relevant privacy requirements from the start.
VL  - 10
N1  - 2014:10
ER  - 

TY  - CONF
T1  - Process mining event logs from FLOSS data: state of the art and perspectives
T2  - International Conference on Software Engineering and Formal Methods
Y1  - 2014
A1  - Mukala, Patrick
A1  - Cerone, Antonio
A1  - Franco Turini
AB  - Free/Libre Open Source Software (FLOSS) is a phenomenon that has undoubtedly triggered extensive research endeavors. At the heart of these initiatives is the ability to mine data from FLOSS repositories with the hope of revealing empirical evidence to answer existing questions on the FLOSS development process. In spite of the success produced with existing mining techniques, emerging questions about FLOSS data require alternative and more appropriate ways to explore and analyse such data. In this paper, we explore a different perspective called process mining. Process mining has been proved to be successful in terms of tracing and reconstructing process models from data logs (event logs). The chief objective of our analysis is threefold. We aim to achieve: (1) conformance to predefined models; (2) discovery of new model patterns; and, finally, (3) extension to predefined models.
JF  - International Conference on Software Engineering and Formal Methods
PB  - Springer, Cham
ER  - 

TY  - CONF
T1  - The purpose of motion: Learning activities from Individual Mobility Networks
T2  - International Conference on Data Science and Advanced Analytics, DSAA 2014, Shanghai, China, October 30 - November 1, 2014
Y1  - 2014
A1  - S Rinzivillo
A1  - Lorenzo Gabrielli
A1  - Mirco Nanni
A1  - Luca Pappalardo
A1  - Dino Pedreschi
A1  - Fosca Giannotti
JF  - International Conference on Data Science and Advanced Analytics, DSAA 2014, Shanghai, China, October 30 - November 1, 2014
UR  - http://dx.doi.org/10.1109/DSAA.2014.7058090
ER  - 

TY  - JOUR
T1  - On quantified linear implications
JF  - Annals of Mathematics and Artificial Intelligence
Y1  - 2014
A1  - Eirinakis, Pavlos
A1  - Salvatore Ruggieri
A1  - Subramani, K
A1  - Wojciechowski, Piotr
AB  - A Quantified Linear Implication (QLI) is an inclusion query over two polyhedral sets, with a quantifier string that specifies which variables are existentially quantified and which are universally quantified. Equivalently, it can be viewed as a quantified implication of two systems of linear inequalities. In this paper, we provide a 2-person game semantics for the QLI problem, which allows us to explore the computational complexities of several of its classes. More specifically, we prove that the decision problem for QLIs with an arbitrary number of quantifier alternations is PSPACE-hard. Furthermore, we explore the computational complexities of several classes of 0, 1, and 2-quantifier alternation QLIs. We observed that some classes are decidable in polynomial time, some are NP-complete, some are coNP-hard and some are $\Pi_2^P$-hard. We also establish the hardness of QLIs with 2 or more quantifier alternations with respect to the first quantifier in the quantifier string and the number of quantifier alternations. All the proofs that we provide for polynomially solvable problems are constructive, i.e., polynomial-time decision algorithms are devised that utilize well-known procedures. QLIs can be utilized as powerful modelling tools for real-life applications. Such applications include reactive systems, real-time schedulers, and static program analyzers.
VL  - 71
ER  - 

TY  - JOUR
T1  - The retail market as a complex system
JF  - EPJ Data Science
Y1  - 2014
A1  - Diego Pennacchioli
A1  - Michele Coscia
A1  - S Rinzivillo
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - The aim of this paper is to introduce the complex system perspective into retail market analysis. Currently, to understand the retail market means to search for local patterns at the micro level, involving the segmentation, separation and profiling of diverse groups of consumers. In other contexts, however, markets are modelled as complex systems. Such a strategy is able to uncover emerging regularities and patterns that make markets more predictable, e.g. making it possible to predict how much a country’s GDP will grow. Rather than isolating actors in homogeneous groups, this strategy requires considering the system as a whole, as the emerging pattern can be detected only as a result of the interaction between its self-organizing parts. This assumption holds also in the retail market: each customer can be seen as an independent unit maximizing its own utility function. As a consequence, the global behaviour of the retail market naturally emerges, enabling a novel description of its properties, complementary to the local pattern approach. Such a task demands a data-driven empirical framework. In this paper, we analyse a unique transaction database, recording the micro-purchases of a million customers observed for several years in the stores of a national supermarket chain. We show the emergence of the fundamental pattern of this complex system, connecting the products’ volumes of sales with the customers’ volumes of purchases. This pattern has a number of applications. We provide three of them. By enabling us to evaluate the sophistication of needs that a customer has and a product satisfies, this pattern has been applied to the task of uncovering the hierarchy of needs of the customers, providing a hint about what is the next product a customer could be interested in buying and predicting in which shop she is likely to go to buy it.
VL  - 3
UR  - http://link.springer.com/article/10.1140/epjds/s13688-014-0033-x
ER  - 

TY  - CHAP
T1  - Retrieving Points of Interest from Human Systematic Movements
T2  - Software Engineering and Formal Methods
Y1  - 2014
A1  - Riccardo Guidotti
A1  - Anna Monreale
A1  - S Rinzivillo
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Human mobility analysis is emerging as an increasingly fundamental task for deeply understanding human behavior. In the last decade this kind of study has become feasible thanks to the massive increase in availability of mobility data. A crucial point, for many mobility applications and analyses, is to extract interesting locations for people. In this paper, we propose a novel methodology to efficiently retrieve significant places of interest from movement data. Using car drivers’ systematic movements we mine everyday interesting locations, that is, places around which people’s lives gravitate. The outcomes provide empirical evidence that these places capture nearly the whole mobility even though they are generated only from abstractions of systematic movements.
JF  - Software Engineering and Formal Methods
PB  - Springer International Publishing
ER  - 

TY  - JOUR
T1  - Uncovering Hierarchical and Overlapping Communities with a Local-First Approach
JF  - TKDD
Y1  - 2014
A1  - Michele Coscia
A1  - Giulio Rossetti
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - Community discovery in complex networks is the task of organizing a network’s structure by grouping together nodes related to each other. Traditional approaches are based on the assumption that there is a global-level organization in the network. However, in many scenarios, each node is the bearer of complex information and cannot be classified in disjoint clusters. The top-down global view of the partition approach is not designed for this. Here, we represent this complex information as multiple latent labels, and we postulate that edges in the networks are created among nodes carrying similar labels. The latent labels are the communities a node belongs to and we discover them with a simple local-first approach to community discovery. This is achieved by democratically letting each node vote for the communities it sees surrounding it in its limited view of the global system, its ego neighborhood, using a label propagation algorithm, assuming that each node is aware of the label it shares with each of its connections. The local communities are merged hierarchically, unveiling the modular organization of the network at the global level and identifying overlapping groups and groups of groups. We tested this intuition against the state-of-the-art overlapping community discovery and found that our new method advances in the chosen scenarios in the quality of the obtained communities. We perform a test on benchmark and on real-world networks, evaluating the quality of the community coverage by using the extracted communities to predict the metadata attached to the nodes, which we consider external information about the latent labels. We also provide an explanation about why real-world networks contain overlapping communities and how our logic is able to capture them. Finally, we show how our method is deterministic, is incremental, and has a limited time complexity, so that it can be used on real-world scale networks.
VL  - 9
UR  - http://doi.acm.org/10.1145/2629511
ER  - 

TY  - CONF
T1  - Use of mobile phone data to estimate mobility flows. Measuring urban population and inter-city mobility using big data in an integrated approach
T2  - 47th SIS Scientific Meeting of the Italian Statistical Society
Y1  - 2014
A1  - Barbara Furletti
A1  - Lorenzo Gabrielli
A1  - Fosca Giannotti
A1  - Letizia Milli
A1  - Mirco Nanni
A1  - Dino Pedreschi
AB  - Big Data, originating from the digital breadcrumbs of human activities and sensed as a by-product of the technologies that we use for our daily activities, let us observe the individual and collective behavior of people at an unprecedented level of detail. Many dimensions of our social life have big data “proxies”, such as mobile call data for mobility. In this paper we investigate to what extent such “big data”, integrated with administrative data, can support the production of reliable and timely estimates of inter-city mobility. The study was jointly developed by Istat, CNR, and the University of Pisa within the scope of the “Commissione di studio avente il compito di orientare le scelte dell'Istat sul tema dei Big Data”. An ongoing project at Istat called “Persons and Places”, based on an integration of administrative data sources, has produced a first release of an origin-destination matrix at the municipality level, assuming that the places of residence and of work (or study) are the terminal points of usual individual mobility for work or study. The coincidence between the city of residence and that of work (or study) is considered a proxy for the absence of inter-city mobility for a person (whom we define as a static resident). The opposite case is considered a proxy for the presence of mobility (the person is a dynamic resident: commuter or embedded). As administrative data do not contain information on the frequency of mobility, the idea is to specify an estimation method, using calling data as support, to define for each municipality the stock of standing residents, embedded city users and daily city users (commuters).
JF  - 47th SIS Scientific Meeting of the Italian Statistical Society
CY  - Cagliari
SN  - 978-88-8467-874-4
UR  - http://www.sis2014.it/proceedings/allpapers/3026.pdf
ER  - 

TY  - CONF
T1  - Use of mobile phone data to estimate visitors mobility flows
T2  - Proceedings of MoKMaSD
Y1  - 2014
A1  - Lorenzo Gabrielli
A1  - Barbara Furletti
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - S Rinzivillo
AB  - Big Data originating from the digital breadcrumbs of human activities, sensed as a by-product of the technologies that we use for our daily activities, allows us to observe the individual and collective behavior of people at an unprecedented level of detail. Many dimensions of our social life have big data “proxies”, such as mobile call data for mobility. In this paper we investigate to what extent data coming from mobile operators could support the production of reliable and timely estimates of intra-city mobility flows. The idea is to define an estimation method based on calling data to characterize the mobility habits of visitors at the level of a single municipality.
JF  - Proceedings of MoKMaSD
UR  - http://www.di.unipi.it/mokmasd/symposium-2014/preproceedings/GabrielliEtAl-mokmasd2014.pdf
ER  - 

TY  - JOUR
T1  - Using t-closeness anonymity to control for non-discrimination
JF  - Trans. Data Privacy
Y1  - 2014
A1  - Salvatore Ruggieri
AB  - We investigate the relation between t-closeness, a well-known model of data anonymization against attribute disclosure, and α-protection, a model of the social discrimination hidden in data. We show that t-closeness implies bdf(t)-protection, for a bound function bdf() depending on the discrimination measure f() at hand. This allows us to adapt inference control methods, such as the Mondrian multidimensional generalization technique and the Sabre bucketization and redistribution framework, to the purpose of non-discrimination data protection. The parallel between the two analytical models raises intriguing issues on the interplay between data anonymization and non-discrimination research in data protection.
VL  - 7
UR  - http://dl.acm.org/citation.cfm?id=2870623
ER  - 

TY  - Generic
T1  - Analysis of GSM Calls Data for Understanding User Mobility Behavior
T2  - IEEE Big Data
Y1  - 2013
A1  - Barbara Furletti
A1  - Lorenzo Gabrielli
A1  - Chiara Renso
A1  - S Rinzivillo
JF  - IEEE Big Data
CY  - Santa Clara, California
ER  - 

TY  - JOUR
T1  - Assessing the Attractiveness of Places with Movement Data
JF  - Journal of Information and Data Management
Y1  - 2013
A1  - André Salvaro Furtado
A1  - Renato Fileto
A1  - Chiara Renso
VL  - 4
ER  - 

TY  - CONF
T1  - Average Speed Estimation For Road Networks Based On GPS Raw Trajectories
T2  - ICEIS Conference
Y1  - 2013
A1  - Ivanildo Barbosa
A1  - Marco A. Casanova
A1  - Chiara Renso
A1  - José Antônio Fernandes de Macêdo
JF  - ICEIS Conference
ER  - 

TY  - JOUR
T1  - Awareness and learning in participatory noise sensing
JF  - PLoS One
Y1  - 2013
A1  - Becker, Martin
A1  - Saverio Caminiti
A1  - Fiorella, Donato
A1  - Francis, Louise
A1  - Pietro Gravino
A1  - Haklay, Mordechai Muki
A1  - Hotho, Andreas
A1  - Vittorio Loreto
A1  - Mueller, Juergen
A1  - Ricchiuti, Ferdinando
A1  - Vito D P Servedio
A1  - Alina Sirbu
A1  - Francesca Tria
AB  - The development of ICT infrastructures has facilitated the emergence of new paradigms for looking at society and the environment over the last few years. Participatory environmental sensing, i.e. directly involving citizens in environmental monitoring, is one example, which is hoped to encourage learning and enhance awareness of environmental issues. In this paper, an analysis of the behaviour of individuals involved in noise sensing is presented. Citizens have been involved in noise measuring activities through the WideNoise smartphone application. This application has been designed to record both objective (noise samples) and subjective (opinions, feelings) data. The application has been open to be used freely by anyone and has been widely employed worldwide. In addition, several test cases have been organised in European countries. Based on the information submitted by users, an analysis of emerging awareness and learning is performed. The data show that changes in the way the environment is perceived after repeated usage of the application do appear. Specifically, users learn how to recognise different noise levels they are exposed to. Additionally, the subjective data collected indicate an increased user involvement in time and a categorisation effect between pleasant and less pleasant environments.
VL  - 8
ER  - 

TY  - CONF
T1  - Baquara: A Holistic Ontological Framework for Movement Analysis with Linked Data
T2  - Entity Relationship Conference - ER 2013
Y1  - 2013
A1  - Renato Fileto
A1  - Marcelo Krüger
A1  - Nikos Pelekis
A1  - Yannis Theodoridis
A1  - Chiara Renso
JF  - Entity Relationship Conference - ER 2013
CY  - Hong Kong
ER  - 

TY  - JOUR
T1  - Cohesion, consensus and extreme information in opinion dynamics
JF  - Advances in Complex Systems
Y1  - 2013
A1  - Alina Sirbu
A1  - Vittorio Loreto
A1  - Vito D P Servedio
A1  - Francesca Tria
VL  - 16
UR  - http://www.worldscientific.com/doi/abs/10.1142/S0219525913500355
ER  - 

TY  - CONF
T1  - Comparing General Mobility and Mobility by Car
T2  - Computational Intelligence and 11th Brazilian Congress on Computational Intelligence (BRICS-CCI CBIC), 2013 BRICS Congress on
Y1  - 2013
A1  - Luca Pappalardo
A1  - Filippo Simini
A1  - S Rinzivillo
A1  - Dino Pedreschi
A1  - Fosca Giannotti
JF  - Computational Intelligence and 11th Brazilian Congress on Computational Intelligence (BRICS-CCI CBIC), 2013 BRICS Congress on
ER  - 

TY  - JOUR
T1  - CONSTAnT - A Conceptual Data Model for Semantic Trajectories of Moving Objects
JF  - Transactions in GIS
Y1  - 2013
A1  - Vania Bogorny
A1  - Chiara Renso
A1  - Artur Ribeiro de Aquino
A1  - Fernando de Lucca Siqueira
A1  - Luis Otavio Alvares
ER  - 

TY  - CONF
T1  - Data Anonymity Meets Non-discrimination
T2  - Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on
Y1  - 2013
A1  - Salvatore Ruggieri
JF  - Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on
PB  - IEEE
ER  - 

TY  - CHAP
T1  - The discovery of discrimination
T2  - Discrimination and privacy in the information society
Y1  - 2013
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
A1  - Franco Turini
JF  - Discrimination and privacy in the information society
PB  - Springer
ER  - 

TY  - JOUR
T1  - Discrimination discovery in scientific project evaluation: A case study
JF  - Expert Systems with Applications
Y1  - 2013
A1  - Andrea Romei
A1  - Salvatore Ruggieri
A1  - Franco Turini
VL  - 40
ER  - 

TY  - CONF
T1  - Efficient GPU-based skyline computation
T2  - DAMON@SIGMOD 2013
Y1  - 2013
A1  - Kenneth S. Boeg
A1  - Ira Assent
A1  - Matteo Magnani
JF  - DAMON@SIGMOD 2013
ER  - 

TY  - CONF
T1  - "Engine Matters": A First Large Scale Data Driven Study on Cyclists' Performance
T2  - 13th IEEE International Conference on Data Mining Workshops, ICDM Workshops, TX, USA, December 7-10, 2013
Y1  - 2013
A1  - Paolo Cintia
A1  - Luca Pappalardo
A1  - Dino Pedreschi
JF  - 13th IEEE International Conference on Data Mining Workshops, ICDM Workshops, TX, USA, December 7-10, 2013
UR  - http://dx.doi.org/10.1109/ICDMW.2013.41
ER  - 

TY  - JOUR
T1  - Evolving networks: Eras and turning points
JF  - Intell. Data Anal.
Y1  - 2013
A1  - Michele Berlingerio
A1  - Michele Coscia
A1  - Fosca Giannotti
A1  - Anna Monreale
A1  - Dino Pedreschi
AB  - Within the large body of research in complex network analysis, an important topic is the temporal evolution of networks. Existing approaches aim at analyzing the evolution on the global and the local scale, extracting properties of either the entire network or local patterns. In this paper, we focus on detecting clusters of temporal snapshots of a network, to be interpreted as eras of evolution. To this aim, we introduce a novel hierarchical clustering methodology, based on a dissimilarity measure (derived from the Jaccard coefficient) between two temporal snapshots of the network, able to detect the turning points at the beginning of the eras. We devise a framework to discover and browse the eras, either in top-down or a bottom-up fashion, supporting the exploration of the evolution at any level of temporal resolution. We show how our approach applies to real networks and null models, by detecting eras in an evolving co-authorship graph extracted from a bibliographic dataset, a collaboration graph extracted from a cinema database, and a network extracted from a database of terrorist attacks; we illustrate how the discovered temporal clustering highlights the crucial moments when the networks witnessed profound changes in their structure. Our approach is finally boosted by introducing a meaningful labeling of the obtained clusters, such as the characterizing topics of each discovered era, thus adding a semantic dimension to our analysis.
VL  - 17
UR  - http://dx.doi.org/10.3233/IDA-120566
ER  - 

TY  - Generic
T1  - Explaining the Product Range Effect in Purchase Data
T2  - IEEE Big Data
Y1  - 2013
A1  - Diego Pennacchioli
A1  - Michele Coscia
A1  - S Rinzivillo
A1  - Dino Pedreschi
A1  - Fosca Giannotti
JF  - IEEE Big Data
ER  - 

TY  - CONF
T1  - A Gravity Model for Speed Estimation over Road Network
T2  - 2013 IEEE 14th International Conference on Mobile Data Management, Milan, Italy, June 3-6, 2013 - Volume 2
Y1  - 2013
A1  - Paolo Cintia
A1  - Roberto Trasarti
A1  - José Antônio Fernandes de Macêdo
A1  - Livia Almada
A1  - Camila Fereira
JF  - 2013 IEEE 14th International Conference on Mobile Data Management, Milan, Italy, June 3-6, 2013 - Volume 2
UR  - http://dx.doi.org/10.1109/MDM.2013.83
ER  - 

TY  - JOUR
T1  - How you move reveals who you are: understanding human behavior by analyzing trajectory data
JF  - Knowl. Inf. Syst.
Y1  - 2013
A1  - Chiara Renso
A1  - Miriam Baglioni
A1  - José Antônio Fernandes de Macêdo
A1  - Roberto Trasarti
A1  - Monica Wachowicz
VL  - 37
UR  - http://dx.doi.org/10.1007/s10115-012-0511-z
ER  - 

TY  - CONF
T1  - Inferring human activities from GPS tracks
T2  - UrbComp Workshop at KDD 2013
Y1  - 2013
A1  - Paolo Cintia
A1  - Barbara Furletti
A1  - Chiara Renso
JF  - UrbComp Workshop at KDD 2013
CY  - Chicago USA
ER  - 

TY  - CONF
T1  - Learning from polyhedral sets
T2  - Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Y1  - 2013
A1  - Salvatore Ruggieri
JF  - Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
PB  - AAAI Press
ER  - 

TY  - CONF
T1  - Measuring tie strength in multidimensional networks
T2  - SEBD 2013
Y1  - 2013
A1  - Giulio Rossetti
A1  - Luca Pappalardo
A1  - Dino Pedreschi
JF  - SEBD 2013
ER  - 

TY  - THES
T1  - Mobility Ranking - Human Mobility Analysis Using Ranking Measures
Y1  - 2013
A1  - Riccardo Guidotti
ER  - 

TY  - CONF
T1  - Mob-Warehouse: A semantic approach for mobility analysis with a Trajectory Data Warehouse
T2  - SecoGIS 2013 - International Workshop on Semantic Aspects of GIS, Joint to ER conference 2013
Y1  - 2013
A1  - Ricardo Wagner
A1  - José Antônio Fernandes de Macêdo
A1  - Alessandra Raffaetà
A1  - Chiara Renso
A1  - Alessandro Roncato
A1  - Roberto Trasarti
JF  - SecoGIS 2013 - International Workshop on Semantic Aspects of GIS, Joint to ER conference 2013
CY  - Hong Kong
ER  - 

TY  - CONF
T1  - MP4-A Project: Mobility Planning For Africa
T2  - D4D Challenge @ 3rd Conf. on the Analysis of Mobile Phone datasets (NetMob 2013)
Y1  - 2013
A1  - Mirco Nanni
A1  - Roberto Trasarti
A1  - Barbara Furletti
A1  - Lorenzo Gabrielli
A1  - Peter Van Der Mede
A1  - Joost De Bruijn
A1  - Erik de Romph
A1  - Gerard Bruil
AB  - This project aims to create a tool that uses mobile phone transaction (trajectory) data that will be able to address transportation related challenges, thus allowing promotion and facilitation of sustainable urban mobility planning in Third World countries. The proposed tool is a transport demand model for Ivory Coast, with emphasis on its major urbanization Abidjan. The consortium will bring together available data from the internet, and integrate these with the mobility data obtained from the mobile phones in order to build the best possible transport model. A transport model allows an understanding of current and future infrastructure requirements in Ivory Coast. As such, this project will provide the first proof of concept. In this context, long-term analysis of individual call traces will be performed to reconstruct systematic movements, and to infer an origin-destination matrix. A similar process will be performed using the locations of caller and recipient of phone calls, enabling the comparison of socio-economic ties vs. mobility. The emerging links between different areas will be used to build an effective map to optimize regional border definitions and road infrastructure from a mobility perspective. Finally, we will try to build specialized origin-destination matrices for specific categories of population. Such categories will be inferred from data through analysis of calling behaviours, and will also be used to characterize the population of different cities. The project also includes a study of data compliance with distributions of standard measures observed in literature, including distribution of calls, call durations and call network features.
JF  - D4D Challenge @ 3rd Conf. on the Analysis of Mobile Phone datasets (NetMob 2013)
CY  - Cambridge, USA
UR  - http://perso.uclouvain.be/vincent.blondel/netmob/2013/D4D-book.pdf
ER  - 

TY  - CONF
T1  - On multidimensional network measures
T2  - SEBD 2013
Y1  - 2013
A1  - Matteo Magnani
A1  - Anna Monreale
A1  - Giulio Rossetti
A1  - Fosca Giannotti
AB  - Networks, i.e., sets of interconnected entities, are ubiquitous, spanning disciplines as diverse as sociology, biology and computer science. The recent availability of large amounts of network data has thus provided a unique opportunity to develop models and analysis tools applicable to a wide range of scenarios. However, real-world phenomena are often more complex than existing graph data models. One relevant example concerns the numerous types of social relationships (or edges) that can be present between individuals in a social network. In this short paper we present a unified model and a set of measures recently developed to represent and analyze network data with multiple types of edges.
JF  - SEBD 2013
UR  - https://www.researchgate.net/publication/256194479_On_multidimensional_network_measures
ER  - 

TY  - JOUR
T1  - Opinion dynamics with disagreement and modulated information
JF  - Journal of Statistical Physics
Y1  - 2013
A1  - Alina Sirbu
A1  - Vittorio Loreto
A1  - Vito D P Servedio
A1  - Francesca Tria
UR  - http://link.springer.com/article/10.1007/s10955-013-0724-x
ER  - 

TY  - CONF
T1  - Pisa Tourism fluxes Observatory: deriving mobility indicators from GSM call habits
T2  - NetMob Conference 2013
Y1  - 2013
A1  - Barbara Furletti
A1  - Lorenzo Gabrielli
A1  - Chiara Renso
A1  - S Rinzivillo
JF  - NetMob Conference 2013
ER  - 

TY  - CONF
T1  - Privacy-Aware Distributed Mobility Data Analytics
T2  - SEBD
Y1  - 2013
A1  - Francesca Pratesi
A1  - Anna Monreale
A1  - Hui Wendy Wang
A1  - S Rinzivillo
A1  - Dino Pedreschi
A1  - Gennady Andrienko
A1  - Natalia Andrienko
AB  - We propose an approach to preserve privacy in an analytical processing within a distributed setting, and tackle the problem of obtaining aggregated information about vehicle traffic in a city from movement data collected by individual vehicles and shipped to a central server. Movement data are sensitive because they may describe typical movement behaviors and therefore be used for re-identification of individuals in a database. We provide a privacy-preserving framework for movement data aggregation based on trajectory generalization in a distributed environment. The proposed solution, based on the differential privacy model and on sketching techniques for efficient data compression, provides a formal data protection safeguard. Using real-life data, we demonstrate the effectiveness of our approach also in terms of data utility preserved by the data transformation.
JF  - SEBD
CY  - Roccella Jonica
ER  - 

TY  - CHAP
T1  - Privacy-Preserving Distributed Movement Data Aggregation
T2  - Geographic Information Science at the Heart of Europe
Y1  - 2013
A1  - Anna Monreale
A1  - Hui Wendy Wang
A1  - Francesca Pratesi
A1  - S Rinzivillo
A1  - Dino Pedreschi
A1  - Gennady Andrienko
A1  - Natalia Andrienko
ED  - Vandenbroucke, Danny
ED  - Bucher, Bénédicte
ED  - Crompvoets, Joep
AB  - We propose a novel approach to privacy-preserving analytical processing within a distributed setting, and tackle the problem of obtaining aggregated information about vehicle traffic in a city from movement data collected by individual vehicles and shipped to a central server. Movement data are sensitive because people’s whereabouts have the potential to reveal intimate personal traits, such as religious or sexual preferences, and may allow re-identification of individuals in a database. We provide a privacy-preserving framework for movement data aggregation based on trajectory generalization in a distributed environment. The proposed solution, based on the differential privacy model and on sketching techniques for efficient data compression, provides a formal data protection safeguard. Using real-life data, we demonstrate the effectiveness of our approach also in terms of data utility preserved by the data transformation.
JF  - Geographic Information Science at the Heart of Europe
T3  - Lecture Notes in Geoinformation and Cartography
PB  - Springer International Publishing
SN  - 978-3-319-00614-7
UR  - http://dx.doi.org/10.1007/978-3-319-00615-4_13
ER  - 

TY  - JOUR
T1  - Privacy-Preserving Mining of Association Rules From Outsourced Transaction Databases
JF  - IEEE Systems Journal
Y1  - 2013
A1  - Fosca Giannotti
A1  - L.V.S. Lakshmanan
A1  - Anna Monreale
A1  - Dino Pedreschi
A1  - Hui Wendy Wang
AB  - Spurred by developments such as cloud computing, there has been considerable recent interest in the paradigm of data mining-as-a-service. A company (data owner) lacking in expertise or computational resources can outsource its mining needs to a third party service provider (server). However, both the items and the association rules of the outsourced database are considered private property of the corporation (data owner). To protect corporate privacy, the data owner transforms its data and ships it to the server, sends mining queries to the server, and recovers the true patterns from the extracted patterns received from the server. In this paper, we study the problem of outsourcing the association rule mining task within a corporate privacy-preserving framework. We propose an attack model based on background knowledge and devise a scheme for privacy preserving outsourced mining. Our scheme ensures that each transformed item is indistinguishable with respect to the attacker's background knowledge, from at least k-1 other transformed items. Our comprehensive experiments on a very large and real transaction database demonstrate that our techniques are effective, scalable, and protect privacy.
ER  - 

TY  - CONF
T1  - A Proactive Application to Monitor Truck Fleets
T2  - Mobile Data Management Conference, 2013
Y1  - 2013
A1  - Fabio Da Costa Albuquerque
A1  - Marco A. Casanova
A1  - Marcelo Tilio M. de Carvalho
A1  - José Antônio Fernandes de Macêdo
A1  - Chiara Renso
JF  - Mobile Data Management Conference, 2013
ER  - 

TY  - CONF
T1  - Quantification Trees
T2  - 2013 IEEE 13th International Conference on Data Mining, Dallas, TX, USA, December 7-10, 2013
Y1  - 2013
A1  - Letizia Milli
A1  - Anna Monreale
A1  - Giulio Rossetti
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - Fabrizio Sebastiani
AB  - In many applications there is a need to monitor how a population is distributed across different classes, and to track the changes in this distribution that derive from varying circumstances; an example of such an application is monitoring the percentage (or "prevalence") of unemployed people in a given region, or in a given age range, or at different time periods. When the membership of an individual in a class cannot be established deterministically, this monitoring activity requires classification. However, in the above applications the final goal is not determining which class each individual belongs to, but simply estimating the prevalence of each class in the unlabeled data. This task is called quantification. In a supervised learning framework we may estimate the distribution across the classes in a test set from a training set of labeled individuals. However, this may be suboptimal, since the distribution in the test set may be substantially different from that in the training set (a phenomenon called distribution drift). So far, quantification has mostly been addressed by learning a classifier optimized for individual classification and later adjusting the distribution it computes to compensate for its tendency to either under- or over-estimate the prevalence of the class. In this paper we propose instead to use a type of decision tree (quantification trees) optimized not for individual classification, but directly for quantification. Our experiments show that quantification trees are more accurate than existing state-of-the-art quantification methods, while retaining at the same time the simplicity and understandability of the decision tree framework.
JF  - 2013 IEEE 13th International Conference on Data Mining, Dallas, TX, USA, December 7-10, 2013
UR  - http://dx.doi.org/10.1109/ICDM.2013.122
ER  - 

TY  - JOUR
T1  - Scalable Analysis of Movement Data for Extracting and Exploring Significant Places
JF  - IEEE Transactions on Visualization and Computer Graphics
Y1  - 2013
A1  - Gennady Andrienko
A1  - Natalia Andrienko
A1  - C. Hunter
A1  - S Rinzivillo
A1  - Stefan Wrobel
VL  - 19
ER  - 

TY  - JOUR
T1  - Semantic Trajectories Modeling and Analysis
JF  - ACM Computing Surveys
Y1  - 2013
A1  - Christine Parent
A1  - Stefano Spaccapietra
A1  - Chiara Renso
A1  - Gennady Andrienko
A1  - Natalia Andrienko
A1  - Vania Bogorny
A1  - Damiani, M L
A1  - Gkoulalas-Divanis, A
A1  - José Antônio Fernandes de Macêdo
A1  - Nikos Pelekis
VL  - 45
ER  - 

TY  - JOUR
T1  - Spatial and Temporal Evaluation of Network-based Analysis of Human Mobility
JF  - Social Network Analysis and Mining
Y1  - 2013
A1  - Michele Coscia
A1  - S Rinzivillo
A1  - Fosca Giannotti
A1  - Dino Pedreschi
VL  - to appear
ER  - 

TY  - CHAP
T1  - Spatio and Spatio-temporal Reasoning and Decision Support Tools
T2  - Entry at Encyclopedia of Social Network Analysis and Mining
Y1  - 2013
A1  - Monica Wachowicz
A1  - Chiara Renso
JF  - Entry at Encyclopedia of Social Network Analysis and Mining
ER  - 

TY  - CONF
T1  - Spatio temporal keyword-queries in Social Networks
Y1  - 2013
A1  - Vittoria Cozza
A1  - Antonio Messina
A1  - Danilo Montesi
A1  - Luca Arietta
A1  - Matteo Magnani
ER  - 

TY  - JOUR
T1  - Spatio-Temporal Data
JF  - Spatio-Temporal Databases: Flexible Querying and Reasoning
Y1  - 2013
A1  - Mirco Nanni
A1  - Alessandra Raffaetà
A1  - Chiara Renso
A1  - Franco Turini
ER  - 

TY  - CONF
T1  - A Study on Parameter Estimation for a Mining Flock Algorithm
T2  - Mining Complex Patterns Workshop, ECML PKDD 2013
Y1  - 2013
A1  - Rebecca Ong
A1  - Mirco Nanni
A1  - Chiara Renso
A1  - Monica Wachowicz
A1  - Dino Pedreschi
JF  - Mining Complex Patterns Workshop, ECML PKDD 2013
ER  - 

TY  - CONF
T1  - Tailoring Moving Patterns to Contexts
T2  - AGILE Conference
Y1  - 2013
A1  - Monica Wachowicz
A1  - Rebecca Ong
A1  - Chiara Renso
JF  - AGILE Conference
CY  - Leuven, Belgium
ER  - 

TY  - CONF
T1  - The Three Dimensions of Social Prominence
T2  - Social Informatics - 5th International Conference, SocInfo 2013, Kyoto, Japan, November 25-27, 2013, Proceedings
Y1  - 2013
A1  - Diego Pennacchioli
A1  - Giulio Rossetti
A1  - Luca Pappalardo
A1  - Dino Pedreschi
A1  - Fosca Giannotti
A1  - Michele Coscia
JF  - Social Informatics - 5th International Conference, SocInfo 2013, Kyoto, Japan, November 25-27, 2013, Proceedings
UR  - http://dx.doi.org/10.1007/978-3-319-03260-3_28
ER  - 

TY  - JOUR
T1  - Towards mega-modeling: a walk through data analysis experiences
JF  - SIGMOD Record
Y1  - 2013
A1  - Stefano Ceri
A1  - Themis Palpanas
A1  - Emanuele Della Valle
A1  - Dino Pedreschi
A1  - Johann-Christoph Freytag
A1  - Roberto Trasarti
VL  - 42
UR  - http://doi.acm.org/10.1145/2536669.2536673
ER  - 

TY  - CONF
T1  - Transportation Planning Based on GSM Traces: A Case Study on Ivory Coast
T2  - Citizen in Sensor Networks - Second International Workshop, CitiSens 2013, Barcelona, Spain, September 19, 2013, Revised Selected Papers
Y1  - 2013
A1  - Mirco Nanni
A1  - Roberto Trasarti
A1  - Barbara Furletti
A1  - Lorenzo Gabrielli
A1  - Peter Van Der Mede
A1  - Joost De Bruijn
A1  - Erik de Romph
A1  - Gerard Bruil
JF  - Citizen in Sensor Networks - Second International Workshop, CitiSens 2013, Barcelona, Spain, September 19, 2013, Revised Selected Papers
UR  - http://dx.doi.org/10.1007/978-3-319-04178-0_2
ER  - 

TY  - JOUR
T1  - Understanding the patterns of car travel
JF  - The European Physical Journal Special Topics
Y1  - 2013
A1  - Luca Pappalardo
A1  - S Rinzivillo
A1  - Qu, Zehui
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Are the patterns of car travel different from those of general human mobility? Based on a unique dataset consisting of the GPS trajectories of 10 million travels accomplished by 150,000 cars in Italy, we investigate how known mobility models apply to car travels, and illustrate novel analytical findings. We also assess to what extent the sample in our dataset is representative of the overall car mobility, and discover how to build an extremely accurate model that, given our GPS data, estimates the real traffic values as measured by road sensors.
VL  - 215
UR  - http://dx.doi.org/10.1140/epjst/e2013-01715-5
ER  - 

TY  - CONF
T1  - Where Have You Been Today? Annotating Trajectories with DayTag
T2  - International Conference on Spatial and Spatio-temporal Databases (SSTD)
Y1  - 2013
A1  - S Rinzivillo
A1  - Fernando de Lucca Siqueira
A1  - Lorenzo Gabrielli
A1  - Chiara Renso
A1  - Vania Bogorny
JF  - International Conference on Spatial and Spatio-temporal Databases (SSTD)
ER  - 

TY  - CONF
T1  - Where Shall We Go Today? Planning Touristic Tours with TripBuilder
T2  - International Conference CIKM 2013
Y1  - 2013
A1  - Igo Brilhante
A1  - Franco Maria Nardini
A1  - Raffaele Perego
A1  - Chiara Renso
A1  - José Antônio Fernandes de Macêdo
JF  - International Conference CIKM 2013
CY  - San Francisco, USA
ER  - 

TY  - CONF
T1  - XTribe: a web-based social computation platform
T2  - Cloud and Green Computing (CGC), 2013 Third International Conference on
Y1  - 2013
A1  - Saverio Caminiti
A1  - Cicali, Claudio
A1  - Pietro Gravino
A1  - Vittorio Loreto
A1  - Vito D P Servedio
A1  - Alina Sirbu
A1  - Francesca Tria
JF  - Cloud and Green Computing (CGC), 2013 Third International Conference on
PB  - IEEE
UR  - http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6686061
ER  - 

TY  - CONF
T1  - “You Know Because I Know”: a Multidimensional Network Approach to Human Resources Problem
T2  - ASONAM 2013
Y1  - 2013
A1  - Michele Coscia
A1  - Giulio Rossetti
A1  - Diego Pennacchioli
A1  - Damiano Ceccarelli
A1  - Fosca Giannotti
JF  - ASONAM 2013
ER  - 

TY  - CONF
T1  - An Agent-Based Model to Evaluate Carpooling at Large Manufacturing Plants
T2  - Proceedings of the 3rd International Conference on Ambient Systems, Networks and Technologies (ANT 2012), the 9th International Conference on Mobile Web Information Systems (MobiWIS-2012), Niagara Falls, Ontario, Canada, August 27-29, 2012
Y1  - 2012
A1  - Tom Bellemans
A1  - Sebastian Bothe
A1  - Sungjin Cho
A1  - Fosca Giannotti
A1  - Davy Janssens
A1  - Luk Knapen
A1  - Christine Körner
A1  - Michael May
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - Hendrik Stange
A1  - Roberto Trasarti
A1  - Ansar-Ul-Haque Yasar
A1  - Geert Wets
JF  - Proceedings of the 3rd International Conference on Ambient Systems, Networks and Technologies (ANT 2012), the 9th International Conference on Mobile Web Information Systems (MobiWIS-2012), Niagara Falls, Ontario, Canada, August 27-29, 2012
UR  - http://dx.doi.org/10.1016/j.procs.2012.08.001
ER  - 

TY  - RPRT
T1  - Analisi di Mobilita' con dati eterogenei
Y1  - 2012
A1  - Barbara Furletti
A1  - Roberto Trasarti
A1  - Lorenzo Gabrielli
A1  - S Rinzivillo
A1  - Luca Pappalardo
A1  - Fosca Giannotti
PB  - ISTI - CNR
CY  - Pisa
ER  - 

TY  - CONF
T1  - Anonymity: a Comparison between the Legal and Computer Science Perspectives.
T2  - The 5th International Conference on Computers, Privacy, and Data Protection: “European Data Protection: Coming of Age”
Y1  - 2012
A1  - S Mascetti
A1  - Anna Monreale
A1  - A Ricci
A1  - A. Gerino
AB  - Privacy preservation has emerged as a major challenge in ICT. One possible solution for enforcing privacy is to guarantee anonymity. Indeed, according to international regulations, no restriction is applied to the handling of anonymous data. Consequently, in the past years the notion of anonymity has been extensively studied by two different communities: Law researchers and professionals that propose definitions of privacy regulations, and Computer Scientists attempting to provide technical solutions for enforcing the legal requirements. In this contribution we address the problem with an interdisciplinary approach, in the aim to encourage the reciprocal understanding and collaboration between researchers in the two areas. To achieve this, we compare the different notions of anonymity provided in the European data protection Law with the formal models proposed in Computer Science. This analysis allows us to identify the main similarities and differences between the two points of view, hence highlighting the need for a joint research effort.
JF  - The 5th International Conference on Computers, Privacy, and Data Protection: “European Data Protection: Coming of Age”
ER  - 

TY  - CONF
T1  - AUDIO: An Integrity Auditing Framework of Outlier-Mining-as-a-Service Systems.
T2  - Machine Learning and Knowledge Discovery in Databases, European Conference, ECML PKDD 2012
Y1  - 2012
A1  - R. Liu
A1  - Hui Wendy Wang
A1  - Anna Monreale
A1  - Dino Pedreschi
A1  - Fosca Giannotti
A1  - W Guo
AB  - Spurred by developments such as cloud computing, there has been considerable recent interest in the data-mining-as-a-service paradigm. Users lacking in expertise or computational resources can outsource their data and mining needs to a third-party service provider (server). Outsourcing, however, raises issues about result integrity: how can the data owner verify that the mining results returned by the server are correct? In this paper, we present AUDIO, an integrity auditing framework for the specific task of distance-based outlier mining outsourcing. It provides efficient and practical verification approaches to check both completeness and correctness of the mining results. The key idea of our approach is to insert a small amount of artificial tuples into the outsourced data; the artificial tuples will produce artificial outliers and non-outliers that do not exist in the original dataset. The server’s answer is verified by analyzing the presence of artificial outliers/non-outliers, obtaining a probabilistic guarantee of correctness and completeness of the mining result. Our empirical results show the effectiveness and efficiency of our method.
JF  - Machine Learning and Knowledge Discovery in Databases, European Conference, ECML PKDD 2012
ER  - 

TY  - CONF
T1  - Classifying Trust/Distrust Relationships in Online Social Networks
T2  - 2012 International Conference on Privacy, Security, Risk and Trust, PASSAT 2012, and 2012 International Conference on Social Computing, SocialCom 2012, Amsterdam, Netherlands, September 3-5, 2012
Y1  - 2012
A1  - Giacomo Bachi
A1  - Michele Coscia
A1  - Anna Monreale
A1  - Fosca Giannotti
AB  - Online social networks are increasingly being used as places where communities gather to exchange information, form opinions, and collaborate in response to events. An aspect of this information exchange is how to determine if a source of social information can be trusted or not. Data mining literature addresses this problem. However, it usually employs social balance theories, by looking at small structures in complex networks known as triangles. This has proven effective in some cases, but it underperforms in the lack of context information about the relation and in more complex interactive structures. In this paper we address the problem of creating a framework for the trust inference, able to infer the trust/distrust relationships in those relational environments that cannot be described by using the classical social balance theory. We do so by decomposing a trust network in its ego network components and mining on this ego network set the trust relationships, extending a well-known graph mining algorithm. We test our framework on three public datasets describing trust relationships in the real world (from the social media Epinions, Slashdot and Wikipedia) and comparing our results with the trust inference state of the art, showing better performances where the social balance theory fails.
JF  - 2012 International Conference on Privacy, Security, Risk and Trust, PASSAT 2012, and 2012 International Conference on Social Computing, SocialCom 2012, Amsterdam, Netherlands, September 3-5, 2012
UR  - http://dx.doi.org/10.1109/SocialCom-PASSAT.2012.115
ER  - 

TY  - GEN
T1  - ComeTogether: Discovering Communities of Places in Mobility Data
T2  - MDM 2012
Y1  - 2012
A1  - Igo Brilhante
A1  - Michele Berlingerio
A1  - Roberto Trasarti
A1  - Chiara Renso
A1  - José Antônio Fernandes de Macêdo
A1  - Marco A. Casanova
JF  - MDM 2012
ER  - 

TY  - JOUR
T1  - Data Science for Simulating the Era of Electric Vehicles
JF  - KI - Künstliche Intelligenz
Y1  - 2012
A1  - Davy Janssens
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - S Rinzivillo
ER  - 

TY  - CONF
T1  - DEMON: a local-first discovery method for overlapping communities
T2  - The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '12, Beijing, China, August 12-16, 2012
Y1  - 2012
A1  - Michele Coscia
A1  - Giulio Rossetti
A1  - Fosca Giannotti
A1  - Dino Pedreschi
JF  - The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '12, Beijing, China, August 12-16, 2012
UR  - http://doi.acm.org/10.1145/2339530.2339630
ER  - 

TY  - JOUR
T1  - Discovering the Geographical Borders of Human Mobility
JF  - KI - Künstliche Intelligenz
Y1  - 2012
A1  - S Rinzivillo
A1  - Simone Mainardi
A1  - Fabio Pezzoni
A1  - Michele Coscia
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - The availability of massive network and mobility data from diverse domains has fostered the analysis of human behavior and interactions. Broad, extensive, and multidisciplinary research has been devoted to the extraction of non-trivial knowledge from this novel form of data. We propose a general method to determine the influence of social and mobility behavior over a specific geographical area in order to evaluate to what extent the current administrative borders represent the real basin of human movement. We build a network representation of human movement starting with vehicle GPS tracks and extract relevant clusters, which are then mapped back onto the territory, finding a good match with the existing administrative borders. The novelty of our approach is the focus on a detailed spatial resolution: we map emerging borders in terms of individual municipalities, rather than macro regional or national areas. We present a series of experiments to illustrate and evaluate the effectiveness of our approach.
UR  - https://link.springer.com/article/10.1007%2Fs13218-012-0181-8
ER  - 

TY  - CONF
T1  - "How Well Do We Know Each Other?" Detecting Tie Strength in Multidimensional Social Networks
T2  - International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012, Istanbul, Turkey, 26-29 August 2012
Y1  - 2012
A1  - Luca Pappalardo
A1  - Giulio Rossetti
A1  - Dino Pedreschi
JF  - International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012, Istanbul, Turkey, 26-29 August 2012
UR  - http://doi.ieeecomputersociety.org/10.1109/ASONAM.2012.180
ER  - 

TY  - CONF
T1  - Identifying users profiles from mobile calls habits
T2  - ACM SIGKDD International Workshop on Urban Computing
Y1  - 2012
A1  - Barbara Furletti
A1  - Lorenzo Gabrielli
A1  - Chiara Renso
A1  - S Rinzivillo
AB  - The huge quantity of positioning data registered by our mobile phones stimulates several research questions, mainly originating from the combination of this huge quantity of data with the extreme heterogeneity of the tracked users and the low granularity of the data. We propose a methodology to partition the users tracked by GSM phone calls into profiles such as residents, commuters, people in transit, and tourists. The methodology analyses the phone calls with a combination of top-down and bottom-up techniques, where the top-down phase is based on a sequence of queries that identify some behaviors. The bottom-up phase is a machine learning step that finds groups of similar call behavior, thus refining the previous step. The integration of the two steps results in the partitioning of mobile traces into these four user categories, which can be analyzed in more depth, for example to understand the tourist movements in the city or the traffic effects of commuters. An experiment on the identification of user profiles on a real dataset collecting call records from one month in the city of Pisa illustrates the methodology.
JF  - ACM SIGKDD International Workshop on Urban Computing
PB  - ACM, New York, NY, USA
CY  - Beijing, China
SN  - 978-1-4503-1542-5
UR  - http://delivery.acm.org/10.1145/2350000/2346500/p17-furletti.pdf?ip=146.48.83.121&acc=ACTIVE%20SERVICE&CFID=166768290&CFTOKEN=58719386&__acm__=1357648050_e23771c2f6bd8feb96bd66b39294175d
ER  - 

TY  - CONF
T1  - Individual Mobility Profiles: Methods and Application on Vehicle Sharing
T2  - Twentieth Italian Symposium on Advanced Database Systems, SEBD 2012, Venice, Italy, June 24-27, 2012, Proceedings
Y1  - 2012
A1  - Roberto Trasarti
A1  - Fabio Pinelli
A1  - Mirco Nanni
A1  - Fosca Giannotti
JF  - Twentieth Italian Symposium on Advanced Database Systems, SEBD 2012, Venice, Italy, June 24-27, 2012, Proceedings
UR  - http://sebd2012.dei.unipd.it/documents/188475/32d00b8a-8ead-4d97-923f-bd2f2cf6ddcb
ER  - 

TY  - CONF
T1  - Injecting Discrimination and Privacy Awareness Into Pattern Discovery
T2  - 12th IEEE International Conference on Data Mining Workshops, ICDM Workshops, Brussels, Belgium, December 10, 2012
Y1  - 2012
A1  - Sara Hajian
A1  - Anna Monreale
A1  - Dino Pedreschi
A1  - Josep Domingo-Ferrer
A1  - Fosca Giannotti
AB  - Data mining is gaining societal momentum due to the ever increasing availability of large amounts of human data, easily collected by a variety of sensing technologies. Data mining comes with unprecedented opportunities and risks: a deeper understanding of human behavior and how our society works is darkened by a greater chance of privacy intrusion and unfair discrimination based on the extracted patterns and profiles. Although methods independently addressing privacy or discrimination in data mining have been proposed in the literature, in this context we argue that privacy and discrimination risks should be tackled together, and we present a methodology for doing so while publishing frequent pattern mining results. We describe a combined pattern sanitization framework that yields both privacy and discrimination-protected patterns, while introducing reasonable (controlled) pattern distortion.
JF  - 12th IEEE International Conference on Data Mining Workshops, ICDM Workshops, Brussels, Belgium, December 10, 2012
UR  - http://dx.doi.org/10.1109/ICDMW.2012.51
ER  - 

TY  - JOUR
T1  - Integrating heterogeneous gene expression data for gene regulatory network modelling.
JF  - Theory Biosci
Y1  - 2012
A1  - Alina Sirbu
A1  - Heather J Ruskin
A1  - Martin Crane
AB  - Gene regulatory networks (GRNs) are complex biological systems that have a large impact on protein levels, so that discovering network interactions is a major objective of systems biology. Quantitative GRN models have been inferred, to date, from time series measurements of gene expression, but at small scale, and with limited application to real data. Time series experiments are typically short (number of time points of the order of ten), whereas regulatory networks can be very large (containing hundreds of genes). This creates an under-determination problem, which negatively influences the results of any inferential algorithm. Presented here is an integrative approach to model inference, which has not been previously discussed to the authors' knowledge. Multiple heterogeneous expression time series are used to infer the same model, and results are shown to be more robust to noise and parameter perturbation. Additionally, a wavelet analysis shows that these models display limited noise over-fitting within the individual datasets.
VL  - 131
ER  - 

TY  - JOUR
T1  - Knowledge Discovery in Ontologies
JF  - Intelligent Data Analysis
Y1  - 2012
A1  - Barbara Furletti
A1  - Franco Turini
VL  - 16
UR  - http://iospress.metapress.com/content/765h53w41286p578/fulltext.pdf
ER  - 

TY  - CONF
T1  - M-Attract: Assessing Places Attractiveness by using Moving Objects Trajectories Data
T2  - GEOINFO 2012 Brazilian Conference on Geographical Information Systems
Y1  - 2012
A1  - André Salvaro Furtado
A1  - Renato Fileto
A1  - Chiara Renso
JF  - GEOINFO 2012 Brazilian Conference on Geographical Information Systems
ER  - 

TY  - CONF
T1  - Mega-modeling for Big Data Analytics
T2  - Conceptual Modeling - 31st International Conference ER 2012, Florence, Italy, October 15-18, 2012. Proceedings
Y1  - 2012
A1  - Stefano Ceri
A1  - Emanuele Della Valle
A1  - Dino Pedreschi
A1  - Roberto Trasarti
JF  - Conceptual Modeling - 31st International Conference ER 2012, Florence, Italy, October 15-18, 2012. Proceedings
UR  - http://dx.doi.org/10.1007/978-3-642-34002-4_1
ER  - 

TY  - JOUR
T1  - Multidimensional networks: foundations of structural analysis
JF  - World Wide Web
Y1  - 2012
A1  - Michele Berlingerio
A1  - Michele Coscia
A1  - Fosca Giannotti
A1  - Anna Monreale
A1  - Dino Pedreschi
AB  - Complex networks have been receiving increasing attention by the scientific community, thanks also to the increasing availability of real-world network data. So far, network analysis has focused on the characterization and measurement of local and global properties of graphs, such as diameter, degree distribution, centrality, and so on. In the last years, the multidimensional nature of many real world networks has been pointed out, i.e. many networks containing multiple connections between any pair of nodes have been analyzed. Despite the importance of analyzing this kind of networks was recognized by previous works, a complete framework for multidimensional network analysis is still missing. Such a framework would enable the analysts to study different phenomena, that can be either the generalization to the multidimensional setting of what happens in monodimensional networks, or a new class of phenomena induced by the additional degree of complexity that multidimensionality provides in real networks. The aim of this paper is then to give the basis for multidimensional network analysis: we present a solid repertoire of basic concepts and analytical measures, which take into account the general structure of multidimensional networks. We tested our framework on different real world multidimensional networks, showing the validity and the meaningfulness of the measures introduced, that are able to extract important and non-random information about complex phenomena in such networks.
VL  - 15
UR  - http://www.springerlink.com/content/f774289854430410/abstract/
ER  - 

TY  - GEN
T1  - Optimal Spatial Resolution for the Analysis of Human Mobility
T2  - IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
Y1  - 2012
A1  - Michele Coscia
A1  - S Rinzivillo
A1  - Dino Pedreschi
A1  - Fosca Giannotti
JF  - IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
CY  - Istanbul, Turkey
ER  - 

TY  - JOUR
T1  - RNA-Seq vs dual- and single-channel microarray data: sensitivity analysis for differential expression and clustering.
JF  - PLoS One
Y1  - 2012
A1  - Alina Sirbu
A1  - Kerr, Gráinne
A1  - Martin Crane
A1  - Heather J Ruskin
AB  - With the fast development of high-throughput sequencing technologies, a new generation of genome-wide gene expression measurements is under way. This is based on mRNA sequencing (RNA-seq), which complements the already mature technology of microarrays, and is expected to overcome some of the latter's disadvantages. These RNA-seq data pose new challenges, however, as strengths and weaknesses have yet to be fully identified. Ideally, Next (or Second) Generation Sequencing measures can be integrated for more comprehensive gene expression investigation to facilitate analysis of whole regulatory networks. At present, however, the nature of these data is not very well understood. In this paper we study three alternative gene expression time series datasets for the Drosophila melanogaster embryo development, in order to compare three measurement techniques: RNA-seq, single-channel and dual-channel microarrays. The aim is to study the state of the art for the three technologies, with a view of assessing overlapping features, data compatibility and integration potential, in the context of time series measurements. This involves using established tools for each of the three different technologies, and technical and biological replicates (for RNA-seq and microarrays, respectively), due to the limited availability of biological RNA-seq replicates for time series data. The approach consists of a sensitivity analysis for differential expression and clustering. In general, the RNA-seq dataset displayed highest sensitivity to differential expression. The single-channel data performed similarly for the differentially expressed genes common to gene sets considered. Cluster analysis was used to identify different features of the gene space for the three datasets, with higher similarities found for the RNA-seq and single-channel microarray dataset.
VL  - 7
ER  - 

TY  - JOUR
T1  - Smart cities of the future
JF  - European Physical Journal-Special Topics
Y1  - 2012
A1  - Batty, Michael
A1  - Axhausen, Kay W
A1  - Fosca Giannotti
A1  - Pozdnoukhov, Alexei
A1  - Bazzani, Armando
A1  - Monica Wachowicz
A1  - Ouzounis, Georgios
A1  - Portugali, Yuval
AB  - Here we sketch the rudiments of what constitutes a smart city which we define as a city in which ICT is merged with traditional infrastructures, coordinated and integrated using new digital technologies. We first sketch our vision defining seven goals which concern: developing a new understanding of urban problems; effective and feasible ways to coordinate urban technologies; models and methods for using urban data across spatial and temporal scales; developing new technologies for communication and dissemination; developing new forms of urban governance and organisation; defining critical problems relating to cities, transport, and energy; and identifying risk, uncertainty, and hazards in the smart city. To this, we add six research challenges: to relate the infrastructure of smart cities to their operational functioning and planning through management, control and optimisation; to explore the notion of the city as a laboratory for innovation; to provide portfolios of urban simulation which inform future designs; to develop technologies that ensure equity, fairness and realise a better quality of city life; to develop technologies that ensure informed participation and create shared knowledge for democratic city governance; and to ensure greater and more effective mobility and access to opportunities for urban populations. We begin by defining the state of the art, explaining the science of smart cities. We define six scenarios based on new cities badging themselves as smart, older cities regenerating themselves as smart, the development of science parks, tech cities, and technopoles focused on high technologies, the development of urban services using contemporary ICT, the use of ICT to develop new urban intelligence functions, and the development of online and mobile forms of participation. Seven project areas are then proposed: Integrated Databases for the Smart City, Sensing, Networking and the Impact of New Social Media, Modelling Network Performance, Mobility and Travel Behaviour, Modelling Urban Land Use, Transport and Economic Interactions, Modelling Urban Transactional Activities in Labour and Housing Markets, Decision Support as Urban Intelligence, Participatory Governance and Planning Structures for the Smart City. Finally we anticipate the paradigm shifts that will occur in this research and define a series of key demonstrators which we believe are important to progressing a science of smart cities.
VL  - 214
ER  - 

TY  - CHAP
T1  - What else can be extracted from ontologies? Influence Rules
T2  - Software and Data Technologies
Y1  - 2012
A1  - Franco Turini
A1  - Barbara Furletti
JF  - Software and Data Technologies
T3  - Communications in Computer and Information Science
PB  - Springer
ER  - 

TY  - JOUR
T1  - Wine and Food Tourism First European Conference
JF  - Edizioni ETS Pisa
Y1  - 2012
A1  - Romano, Maria Francesca
A1  - Michela Natilli
ER  - 

TY  - JOUR
T1  - Wisdom of crowds for robust gene network inference.
JF  - Nat Methods
Y1  - 2012
A1  - Daniel Marbach
A1  - J.C. Costello
A1  - Robert Küffner
A1  - N.M. Vega
A1  - R.J. Prill
A1  - D.M. Camacho
A1  - K.R. Allison
A1  - Manolis Kellis
A1  - J.J. Collins
A1  - Gustavo Stolovitzky
AB  - Reconstructing gene regulatory networks from high-throughput data is a long-standing challenge. Through the Dialogue on Reverse Engineering Assessment and Methods (DREAM) project, we performed a comprehensive blind assessment of over 30 network inference methods on Escherichia coli, Staphylococcus aureus, Saccharomyces cerevisiae and in silico microarray data. We characterize the performance, data requirements and inherent biases of different inference approaches, and we provide guidelines for algorithm application and development. We observed that no single inference method performs optimally across all data sets. In contrast, integration of predictions from multiple inference methods shows robust and high performance across diverse data sets. We thereby constructed high-confidence networks for E. coli and S. aureus, each comprising ~1,700 transcriptional interactions at a precision of ~50%. We experimentally tested 53 previously unobserved regulatory interactions in E. coli, of which 23 (43%) were supported. Our results establish community-based methods as a powerful and robust tool for the inference of transcriptional gene regulatory networks.
VL  - 9
ER  - 

TY  - JOUR
T1  - Wisdom of crowds for robust gene network inference
JF  - Nature Methods
Y1  - 2012
A1  - Daniel Marbach
A1  - J.C. Costello
A1  - Robert Küffner
A1  - N.M. Vega
A1  - R.J. Prill
A1  - D.M. Camacho
A1  - K.R. Allison
A1  - Manolis Kellis
A1  - J.J. Collins
A1  - Aderhold, A.
A1  - Gustavo Stolovitzky
A1  - Bonneau, R.
A1  - Chen, Y.
A1  - Cordero, F.
A1  - Martin Crane
A1  - Dondelinger, F.
A1  - Drton, M.
A1  - Esposito, R.
A1  - Foygel, R.
A1  - De La Fuente, A.
A1  - Gertheiss, J.
A1  - Geurts, P.
A1  - Greenfield, A.
A1  - Grzegorczyk, M.
A1  - Haury, A.-C.
A1  - Holmes, B.
A1  - Hothorn, T.
A1  - Husmeier, D.
A1  - Huynh-Thu, V.A.
A1  - Irrthum, A.
A1  - Karlebach, G.
A1  - Lebre, S.
A1  - De Leo, V.
A1  - Madar, A.
A1  - Mani, S.
A1  - Mordelet, F.
A1  - Ostrer, H.
A1  - Ouyang, Z.
A1  - Pandya, R.
A1  - Petri, T.
A1  - Pinna, A.
A1  - Poultney, C.S.
A1  - Rezny, S.
A1  - Heather J Ruskin
A1  - Saeys, Y.
A1  - Shamir, R.
A1  - Alina Sirbu
A1  - Song, M.
A1  - Soranzo, N.
A1  - Statnikov, A.
A1  - N.M. Vega
A1  - Vera-Licona, P.
A1  - Vert, J.-P.
A1  - Visconti, A.
A1  - Haizhou Wang
A1  - Wehenkel, L.
A1  - Windhager, L.
A1  - Zhang, Y.
A1  - Zimmer, R.
VL  - 9
UR  - http://www.scopus.com/inward/record.url?eid=2-s2.0-84870305264&partnerID=40&md5=04a686572bdefff60157bf68c95df7ea
ER  - 

TY  - JOUR
T1  - A classification for community discovery methods in complex networks
JF  - Statistical Analysis and Data Mining
Y1  - 2011
A1  - Michele Coscia
A1  - Fosca Giannotti
A1  - Dino Pedreschi
VL  - 4
ER  - 

TY  - JOUR
T1  - C-safety: a framework for the anonymization of semantic trajectories
JF  - Transactions on Data Privacy
Y1  - 2011
A1  - Anna Monreale
A1  - Roberto Trasarti
A1  - Dino Pedreschi
A1  - Chiara Renso
A1  - Vania Bogorny
AB  - The increasing abundance of data about the trajectories of personal movement is opening new opportunities for analyzing and mining human mobility. However, new risks emerge since it opens new ways of intruding into personal privacy. Representing the personal movements as sequences of places visited by a person during her/his movements - semantic trajectory - poses great privacy threats. In this paper we propose a privacy model defining the attack model of semantic trajectory linking and a privacy notion, called c-safety based on a generalization of visited places based on a taxonomy. This method provides an upper bound to the probability of inferring that a given person, observed in a sequence of non-sensitive places, has also visited any sensitive location. Coherently with the privacy model, we propose an algorithm for transforming any dataset of semantic trajectories into a c-safe one. We report a study on two real-life GPS trajectory datasets to show how our algorithm preserves interesting quality/utility measures of the original trajectories, when mining semantic trajectories sequential pattern mining results. We also empirically measure how the probability that the attacker’s inference succeeds is much lower than the theoretical upper bound established.
VL  - 4
UR  - http://dl.acm.org/citation.cfm?id=2019319
ER  - 

TY  - BOOK
T1  - Dinamiche di impoverimento. Meccanismi, traiettorie ed effetti in un contesto locale
Y1  - 2011
A1  - Tomei, Gabriele
A1  - Michela Natilli
PB  - Carocci Editore
ER  - 

TY  - CONF
T1  - Finding and Characterizing Communities in Multidimensional Networks
T2  - ASONAM
Y1  - 2011
A1  - Michele Berlingerio
A1  - Michele Coscia
A1  - Fosca Giannotti
JF  - ASONAM
ER  - 

TY  - CONF
T1  - Finding redundant and complementary communities in multidimensional networks
T2  - CIKM
Y1  - 2011
A1  - Michele Berlingerio
A1  - Michele Coscia
A1  - Fosca Giannotti
JF  - CIKM
ER  - 

TY  - CONF
T1  - Foundations of Multidimensional Network Analysis
T2  - ASONAM
Y1  - 2011
A1  - Michele Berlingerio
A1  - Michele Coscia
A1  - Fosca Giannotti
A1  - Anna Monreale
A1  - Dino Pedreschi
AB  - Complex networks have been receiving increasing attention by the scientific community, thanks also to the increasing availability of real-world network data. In the last years, the multidimensional nature of many real world networks has been pointed out, i.e. many networks containing multiple connections between any pair of nodes have been analyzed. Despite the importance of analyzing this kind of networks was recognized by previous works, a complete framework for multidimensional network analysis is still missing. Such a framework would enable the analysts to study different phenomena, that can be either the generalization to the multidimensional setting of what happens in monodimensional networks, or a new class of phenomena induced by the additional degree of complexity that multidimensionality provides in real networks. The aim of this paper is then to give the basis for multidimensional network analysis: we develop a solid repertoire of basic concepts and analytical measures, which takes into account the general structure of multidimensional networks. We tested our framework on a real world multidimensional network, showing the validity and the meaningfulness of the measures introduced, that are able to extract important, nonrandom, information about complex phenomena.
JF  - ASONAM
ER  - 

TY  - GEN
T1  - From Movement Tracks through Events to Places: Extracting and Characterizing Significant Places from Mobility Data
T2  - IEEE Conference on Visual Analytics Science and Technology
Y1  - 2011
A1  - Gennady Andrienko
A1  - Natalia Andrienko
A1  - Christophe Hurter
A1  - S Rinzivillo
A1  - Stefan Wrobel
JF  - IEEE Conference on Visual Analytics Science and Technology
ER  - 

TY  - CONF
T1  - The impact of wine and food tourism in Italy: an analysis of official statistical data at province level
T2  - First European Conference on Wine and Food Tourism
Y1  - 2011
A1  - Michela Natilli
A1  - Romano, Maria Francesca
JF  - First European Conference on Wine and Food Tourism
ER  - 

TY  - CONF
T1  - The language of tourists in a wine and food blog
T2  - First European Conference on Wine and Food Tourism
Y1  - 2011
A1  - Pavone, Pasquale
A1  - Michela Natilli
A1  - Romano, Maria Francesca
JF  - First European Conference on Wine and Food Tourism
ER  - 

TY  - CONF
T1  - Link Prediction su Reti Multidimensionali
T2  - Sistemi Evoluti per Basi di Dati - {SEBD} 2011, Proceedings of the Nineteenth Italian Symposium on Advanced Database Systems, Maratea, Italy, June 26-29, 2011
Y1  - 2011
A1  - Giulio Rossetti
A1  - Michele Berlingerio
A1  - Fosca Giannotti
JF  - Sistemi Evoluti per Basi di Dati - {SEBD} 2011, Proceedings of the Nineteenth Italian Symposium on Advanced Database Systems, Maratea, Italy, June 26-29, 2011
ER  - 

TY  - JOUR
T1  - Measuring the effectiveness of homeopathic care through objective and shared indicators
JF  - Homeopathy
Y1  - 2011
A1  - Leone, Laura
A1  - Marchitiello, Maria
A1  - Michela Natilli
A1  - Romano, Maria Francesca
VL  - 100
ER  - 

TY  - CONF
T1  - Mining Influence Rules out of Ontologies
T2  - International Conference on Software and Data Technologies (ICSOFT)
Y1  - 2011
A1  - Barbara Furletti
A1  - Franco Turini
JF  - International Conference on Software and Data Technologies (ICSOFT)
CY  - Seville, Spain
ER  - 

TY  - CONF
T1  - Mining mobility user profiles for car pooling
T2  - KDD
Y1  - 2011
A1  - Roberto Trasarti
A1  - Fabio Pinelli
A1  - Mirco Nanni
A1  - Fosca Giannotti
JF  - KDD
ER  - 

TY  - CONF
T1  - Privacy-preserving data mining from outsourced databases.
T2  - the 3rd International Conference on Computers, Privacy, and Data Protection: An element of choice
Y1  - 2011
A1  - Fosca Giannotti
A1  - L.V.S. Lakshmanan
A1  - Anna Monreale
A1  - Dino Pedreschi
A1  - Hui Wendy Wang
AB  - Spurred by developments such as cloud computing, there has been considerable recent interest in the paradigm of data mining-as-service: a company (data owner) lacking in expertise or computational resources can outsource its mining needs to a third party service provider (server). However, both the outsourced database and the knowledge extracted from it by data mining are considered private property of the data owner. To protect corporate privacy, the data owner transforms its data and ships it to the server, sends mining queries to the server, and recovers the true patterns from the extracted patterns received from the server. In this paper, we study the problem of outsourcing a data mining task within a corporate privacy-preserving framework. We propose a scheme for privacy-preserving outsourced mining which offers a formal protection against information disclosure, and show that the data owner can recover the correct data mining results efficiently.
JF  - the 3rd International Conference on Computers, Privacy, and Data Protection: An element of choice
ER  - 

TY  - JOUR
T1  - The pursuit of hubbiness: Analysis of hubs in large multidimensional networks
JF  - J. Comput. Science
Y1  - 2011
A1  - Michele Berlingerio
A1  - Michele Coscia
A1  - Fosca Giannotti
A1  - Anna Monreale
A1  - Dino Pedreschi
AB  - Hubs are highly connected nodes within a network. In complex network analysis, hubs have been widely studied, and are at the basis of many tasks, such as web search and epidemic outbreak detection. In reality, networks are often multidimensional, i.e., there can exist multiple connections between any pair of nodes. In this setting, the concept of hub depends on the multiple dimensions of the network, whose interplay becomes crucial for the connectedness of a node. In this paper, we characterize multidimensional hubs. We consider the multidimensional generalization of the degree and introduce a new class of measures, that we call Dimension Relevance, aimed at analyzing the importance of different dimensions for the hubbiness of a node. We assess the meaningfulness of our measures by comparing them on real networks and null models, then we study the interplay among dimensions and their effect on node connectivity. Our findings show that: (i) multidimensional hubs do exist and their characterization yields interesting insights and (ii) it is possible to detect the most influential dimensions that cause the different hub behaviors. We demonstrate the usefulness of multidimensional analysis in three real world domains: detection of ambiguous query terms in a word–word query log network, outlier detection in a social network, and temporal analysis of behaviors in a co-authorship network.
VL  - 2
ER  - 

TY  - JOUR
T1  - A Query Language for Mobility Data Mining
JF  - IJDWM
Y1  - 2011
A1  - Roberto Trasarti
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - Chiara Renso
VL  - 7
ER  - 

TY  - CONF
T1  - Scalable Link Prediction on Multidimensional Networks
T2  - ICDM Workshops
Y1  - 2011
A1  - Giulio Rossetti
A1  - Michele Berlingerio
A1  - Fosca Giannotti
JF  - ICDM Workshops
CY  - Vancouver
ER  - 

TY  - CHAP
T1  - Stages of Gene Regulatory Network Inference: the Evolutionary Algorithm Role
T2  - Evolutionary Algorithms
Y1  - 2011
A1  - Alina Sirbu
A1  - Heather J Ruskin
A1  - Martin Crane
JF  - Evolutionary Algorithms
PB  - InTech
UR  - http://www.intechopen.com/articles/show/title/stages-of-gene-regulatory-network-inference-the-evolutionary-algorithm-role
ER  - 

TY  - JOUR
T1  - Stiramenti identitari. Strategie di integrazione degli stranieri nella provincia di Massa Carrara tra appartenenza etnica ed esperienza transnazionale
Y1  - 2011
A1  - Tomei, Gabriele
A1  - Paletti, F
A1  - Michela Natilli
ER  - 

TY  - CONF
T1  - Traffic Jams Detection Using Flock Mining
T2  - ECML/PKDD (3)
Y1  - 2011
A1  - Rebecca Ong
A1  - Fabio Pinelli
A1  - Roberto Trasarti
A1  - Mirco Nanni
A1  - Chiara Renso
A1  - S Rinzivillo
A1  - Fosca Giannotti
JF  - ECML/PKDD (3)
ER  - 

TY  - JOUR
T1  - Unveiling the complexity of human mobility by querying and mining massive trajectory data
JF  - VLDB J.
Y1  - 2011
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - Fabio Pinelli
A1  - Chiara Renso
A1  - S Rinzivillo
A1  - Roberto Trasarti
VL  - 20
ER  - 

TY  - CONF
T1  - Who/Where Are My New Customers?
T2  - ISMIS Industrial Session
Y1  - 2011
A1  - S Rinzivillo
A1  - Salvatore Ruggieri
JF  - ISMIS Industrial Session
ER  - 

TY  - CONF
T1  - Advanced knowledge discovery on movement data with the GeoPKDD system
T2  - EDBT
Y1  - 2010
A1  - Mirco Nanni
A1  - Roberto Trasarti
A1  - Chiara Renso
A1  - Fosca Giannotti
A1  - Dino Pedreschi
JF  - EDBT
ER  - 

TY  - CONF
T1  - As Time Goes by: Discovering Eras in Evolving Social Networks
T2  - PAKDD (1)
Y1  - 2010
A1  - Michele Berlingerio
A1  - Michele Coscia
A1  - Fosca Giannotti
A1  - Anna Monreale
A1  - Dino Pedreschi
AB  - Within the large body of research in complex network analysis, an important topic is the temporal evolution of networks. Existing approaches aim at analyzing the evolution on the global and the local scale, extracting properties of either the entire network or local patterns. In this paper, we focus instead on detecting clusters of temporal snapshots of a network, to be interpreted as eras of evolution. To this aim, we introduce a novel hierarchical clustering methodology, based on a dissimilarity measure (derived from the Jaccard coefficient) between two temporal snapshots of the network. We devise a framework to discover and browse the eras, either in top-down or a bottom-up fashion, supporting the exploration of the evolution at any level of temporal resolution. We show how our approach applies to real networks, by detecting eras in an evolving co-authorship graph extracted from a bibliographic dataset; we illustrate how the discovered temporal clustering highlights the crucial moments when the network had profound changes in its structure. Our approach is finally boosted by introducing a meaningful labeling of the obtained clusters, such as the characterizing topics of each discovered era, thus adding a semantic dimension to our analysis.
JF  - PAKDD (1)
ER  - 

TY  - JOUR
T1  - Comparison of evolutionary algorithms in gene regulatory network model inference.
JF  - BMC Bioinformatics
Y1  - 2010
A1  - Alina Sirbu
A1  - Heather J Ruskin
A1  - Martin Crane
AB  - BACKGROUND: The evolution of high throughput technologies that measure gene expression levels has created a data base for inferring GRNs (a process also known as reverse engineering of GRNs). However, the nature of these data has made this process very difficult. At the moment, several methods of discovering qualitative causal relationships between genes with high accuracy from microarray data exist, but large scale quantitative analysis on real biological datasets cannot be performed, to date, as existing approaches are not suitable for real microarray data which are noisy and insufficient. RESULTS: This paper performs an analysis of several existing evolutionary algorithms for quantitative gene regulatory network modelling. The aim is to present the techniques used and offer a comprehensive comparison of approaches, under a common framework. Algorithms are applied to both synthetic and real gene expression data from DNA microarrays, and ability to reproduce biological behaviour, scalability and robustness to noise are assessed and compared. CONCLUSIONS: Presented is a comparison framework for assessment of evolutionary algorithms, used to infer gene regulatory networks. Promising methods are identified and a platform for development of appropriate model formalisms is established.
VL  - 11
ER  - 

TY  - JOUR
T1  - Cross-platform microarray data normalisation for regulatory network inference.
JF  - PLoS One
Y1  - 2010
A1  - Alina Sirbu
A1  - Heather J Ruskin
A1  - Martin Crane
AB  - BACKGROUND: Inferring Gene Regulatory Networks (GRNs) from time course microarray data suffers from the dimensionality problem created by the short length of available time series compared to the large number of genes in the network. To overcome this, data integration from diverse sources is mandatory. Microarray data from different sources and platforms are publicly available, but integration is not straightforward, due to platform and experimental differences. METHODS: We analyse here different normalisation approaches for microarray data integration, in the context of reverse engineering of GRN quantitative models. We introduce two preprocessing approaches based on existing normalisation techniques and provide a comprehensive comparison of normalised datasets. CONCLUSIONS: Results identify a method based on a combination of Loess normalisation and iterative K-means as best for time series normalisation for this problem.
VL  - 5
ER  - 

TY  - CONF
T1  - Discovering Eras in Evolving Social Networks (Extended Abstract)
T2  - SEBD
Y1  - 2010
A1  - Michele Berlingerio
A1  - Michele Coscia
A1  - Fosca Giannotti
A1  - Anna Monreale
A1  - Dino Pedreschi
JF  - SEBD
ER  - 

TY  - CONF
T1  - Exploring Real Mobility Data with M-Atlas
T2  - ECML/PKDD (3)
Y1  - 2010
A1  - Roberto Trasarti
A1  - S Rinzivillo
A1  - Fabio Pinelli
A1  - Mirco Nanni
A1  - Anna Monreale
A1  - Chiara Renso
A1  - Dino Pedreschi
A1  - Fosca Giannotti
AB  - Research on moving-object data analysis has been recently fostered by the widespread diffusion of new techniques and systems for monitoring, collecting and storing location aware data, generated by a wealth of technological infrastructures, such as GPS positioning and wireless networks. These have made available massive repositories of spatio-temporal data recording human mobile activities, that call for suitable analytical methods, capable of enabling the development of innovative, location-aware applications.
JF  - ECML/PKDD (3)
ER  - 

TY  - Generic
T1  - A Generalisation-based Approach to Anonymising Movement Data
T2  - 13th AGILE conference on Geographic Information Science
Y1  - 2010
A1  - Gennady Andrienko
A1  - Natalia Andrienko
A1  - Fosca Giannotti
A1  - Anna Monreale
A1  - Dino Pedreschi
A1  - S Rinzivillo
AB  - The possibility to collect, store, disseminate, and analyze data about movements of people raises very serious privacy concerns, given the sensitivity of the information about personal positions. In particular, sensitive information about individuals can be uncovered with the use of data mining and visual analytics methods. In this paper we present a method for the generalization of trajectory data that can be adopted as the first step of a process to obtain k-anonymity in spatio-temporal datasets. We ran a preliminary set of experiments on a real-world trajectory dataset, demonstrating that this method of generalization of trajectories preserves the clustering analysis results.
JF  - 13th AGILE conference on Geographic Information Science
UR  - http://agile2010.dsi.uminho.pt/pen/ShortPapers_PDF%5C122_DOC.pdf
ER  - 

TY  - JOUR
T1  - Improving the Business Plan Evaluation Process: the Role of Intangibles
JF  - Quality Technology & Quantitative Management
Y1  - 2010
A1  - Barbara Furletti
A1  - Franco Turini
A1  - Andrea Bellandi
A1  - Miriam Baglioni
A1  - Chiara Pratesi
VL  - 7
UR  - http://web.it.nctu.edu.tw/~qtqm/upcomingpapers/2010V7N1/2010V7N1_F3.pdf
ER  - 

TY  - CONF
T1  - Location Prediction through Trajectory Pattern Mining (Extended Abstract)
T2  - SEBD
Y1  - 2010
A1  - Anna Monreale
A1  - Fabio Pinelli
A1  - Roberto Trasarti
A1  - Fosca Giannotti
JF  - SEBD
ER  - 

TY  - CONF
T1  - Mobility data mining: discovering movement patterns from trajectory data
T2  - Computational Transportation Science
Y1  - 2010
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - Fabio Pinelli
A1  - Chiara Renso
A1  - S Rinzivillo
A1  - Roberto Trasarti
JF  - Computational Transportation Science
ER  - 

TY  - JOUR
T1  - Movement Data Anonymity through Generalization
JF  - Transactions on Data Privacy
Y1  - 2010
A1  - Anna Monreale
A1  - Gennady Andrienko
A1  - Natalia Andrienko
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - S Rinzivillo
A1  - Stefan Wrobel
AB  - Wireless networks and mobile devices, such as mobile phones and GPS receivers, sense and track the movements of people and vehicles, producing society-wide mobility databases. This is a challenging scenario for data analysis and mining. On the one hand, exciting opportunities arise out of discovering new knowledge about human mobile behavior, and thus fuel intelligent info-mobility applications. On the other hand, new privacy concerns arise when mobility data are published. The risk is particularly high for GPS trajectories, which represent movement of a very high precision and spatio-temporal resolution: the de-identification of such trajectories (i.e., forgetting the ID of their associated owners) is only a weak protection, as generally it is possible to re-identify a person by observing her routine movements. In this paper we propose a method for achieving true anonymity in a dataset of published trajectories, by defining a transformation of the original GPS trajectories based on spatial generalization and k-anonymity. The proposed method offers a formal data protection safeguard, quantified as a theoretical upper bound to the probability of re-identification. We conduct a thorough study on a real-life GPS trajectory dataset, and provide strong empirical evidence that the proposed anonymity techniques achieve the conflicting goals of data utility and data privacy. In practice, the achieved anonymity protection is much stronger than the theoretical worst case, while the quality of the cluster analysis on the trajectory data is preserved.
VL  - 3
UR  - http://www.tdp.cat/issues/abs.a045a10.php
ER  - 

TY  - CONF
T1  - Preserving privacy in semantic-rich trajectories of human mobility
T2  - SPRINGL
Y1  - 2010
A1  - Anna Monreale
A1  - Roberto Trasarti
A1  - Chiara Renso
A1  - Dino Pedreschi
A1  - Vania Bogorny
AB  - The increasing abundance of data about the trajectories of personal movement is opening up new opportunities for analyzing and mining human mobility, but new risks emerge since it opens new ways of intruding into personal privacy. Representing the personal movements as sequences of places visited by a person during her/his movements - semantic trajectory - poses even greater privacy threats w.r.t. raw geometric location data. In this paper we propose a privacy model defining the attack model of semantic trajectory linking, together with a privacy notion, called c-safety. This method provides an upper bound to the probability of inferring that a given person, observed in a sequence of nonsensitive places, has also stopped in any sensitive location. Coherently with the privacy model, we propose an algorithm for transforming any dataset of semantic trajectories into a c-safe one. We report a study on a real-life GPS trajectory dataset to show how our algorithm preserves interesting quality/utility measures of the original trajectories, such as sequential pattern mining results.
JF  - SPRINGL
ER  - 

TY  - CONF
T1  - Querying and mining trajectories with gaps: a multi-path reconstruction approach (Extended Abstract)
T2  - SEBD
Y1  - 2010
A1  - Mirco Nanni
A1  - Roberto Trasarti
JF  - SEBD
ER  - 

TY  - JOUR
T1  - Regulatory network modelling: Correlation for structure and parameter optimisation
JF  - Proceedings of The IASTED Technology Conferences (International Conference on Computational Bioscience), Cambridge, Massachusetts
Y1  - 2010
A1  - Alina Sirbu
A1  - Heather J Ruskin
A1  - Martin Crane
UR  - http://www.actapress.com/Abstract.aspx?paperId=41573
ER  - 

TY  - CHAP
T1  - Spatio-temporal clustering
T2  - Data Mining and Knowledge Discovery Handbook
Y1  - 2010
A1  - Slava Kisilevich
A1  - Florian Mansmann
A1  - Mirco Nanni
A1  - S Rinzivillo
JF  - Data Mining and Knowledge Discovery Handbook
ER  - 

TY  - CONF
T1  - Towards Discovery of Eras in Social Networks
T2  - M3SN 2010 Workshop, in conjunction with ICDE2010
Y1  - 2010
A1  - Michele Berlingerio
A1  - Michele Coscia
A1  - Fosca Giannotti
A1  - Anna Monreale
A1  - Dino Pedreschi
AB  - In the last decades, much research has been devoted to topics related to Social Network Analysis. One important direction in this area is to analyze the temporal evolution of a network. So far, previous approaches analyzed this setting at both the global and the local level. In this paper, we focus on finding a way to detect temporal eras in an evolving network. We pose the basis for a general framework that aims at helping the analyst in browsing the temporal clusters both in a top-down and bottom-up way, exploring the network at any level of temporal detail. We show the effectiveness of our approach on real data, by applying our proposed methodology to a co-authorship network extracted from a bibliographic dataset. Our first results are encouraging, and open the way for the definition and implementation of a general framework for discovering eras in evolving social networks.
JF  - M3SN 2010 Workshop, in conjunction with ICDE2010
ER  - 

TY  - Generic
T1  - Anonymous Sequences from Trajectory Data
T2  - 17th Italian Symposium on Advanced Database Systems
Y1  - 2009
A1  - Ruggero G. Pensa
A1  - Anna Monreale
A1  - Fabio Pinelli
A1  - Dino Pedreschi
JF  - 17th Italian Symposium on Advanced Database Systems
CY  - Camogli, Italy
ER  - 

TY  - JOUR
T1  - A constraint-based querying system for exploratory pattern discovery
JF  - Inf. Syst.
Y1  - 2009
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Claudio Lucchese
A1  - Salvatore Orlando
A1  - Raffaele Perego
A1  - Roberto Trasarti
VL  - 34
ER  - 

TY  - CONF
T1  - DAMSEL: A System for Progressive Querying and Reasoning on Movement Data
T2  - DEXA Workshops
Y1  - 2009
A1  - Roberto Trasarti
A1  - Miriam Baglioni
A1  - Chiara Renso
JF  - DEXA Workshops
ER  - 

TY  - JOUR
T1  - Developing a Spatial Knowledge Representation for Pedestrian Interactions
Y1  - 2009
A1  - Daniel Ornellana
A1  - Chiara Renso
N1  - MOVEMENT-AWARE APPLICATIONS FOR SUSTAINABLE MOBILITY: TECHNOLOGIES AND APPROACHES, Monica Wachowicz Editor, IGI Publisher, To appear
ER  - 

TY  - CONF
T1  - Geographic privacy-aware knowledge discovery and delivery
T2  - EDBT
Y1  - 2009
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - Yannis Theodoridis
JF  - EDBT
ER  - 

TY  - CONF
T1  - GeoPKDD – Geographic Privacy-aware Knowledge Discovery
T2  - The European Future Technologies Conference (FET 2009)
Y1  - 2009
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - Chiara Renso
A1  - S Rinzivillo
A1  - Roberto Trasarti
JF  - The European Future Technologies Conference (FET 2009)
ER  - 

TY  - CONF
T1  - Integrating induction and deduction for finding evidence of discrimination
T2  - ICAIL
Y1  - 2009
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
A1  - Franco Turini
JF  - ICAIL
ER  - 

TY  - CONF
T1  - K-BestMatch Reconstruction and Comparison of Trajectory Data
T2  - ICDM Workshops
Y1  - 2009
A1  - Mirco Nanni
A1  - Roberto Trasarti
JF  - ICDM Workshops
ER  - 

TY  - CONF
T1  - Measuring Discrimination in Socially-Sensitive Decision Records
T2  - SDM
Y1  - 2009
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
A1  - Franco Turini
JF  - SDM
ER  - 

TY  - CHAP
T1  - Mining Clinical, Immunological, and Genetic Data of Solid Organ Transplantation
T2  - Biomedical Data and Applications
Y1  - 2009
A1  - Michele Berlingerio
A1  - Francesco Bonchi
A1  - Michele Curcio
A1  - Fosca Giannotti
A1  - Franco Turini
JF  - Biomedical Data and Applications
ER  - 

TY  - CONF
T1  - Mining Graph Evolution Rules
T2  - ECML/PKDD 2009
Y1  - 2009
A1  - Michele Berlingerio
A1  - Francesco Bonchi
A1  - Björn Bringmann
A1  - Aristides Gionis
JF  - ECML/PKDD 2009
CY  - Bled, Slovenia
ER  - 

TY  - CONF
T1  - Mining Mobility Behavior from Trajectory Data
T2  - CSE (4)
Y1  - 2009
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - Chiara Renso
A1  - Roberto Trasarti
JF  - CSE (4)
ER  - 

TY  - CONF
T1  - Mining the Information Propagation in a Network
T2  - SEBD
Y1  - 2009
A1  - Michele Berlingerio
A1  - Michele Coscia
A1  - Fosca Giannotti
JF  - SEBD
ER  - 

TY  - CONF
T1  - Mining the Temporal Dimension of the Information Propagation
T2  - IDA
Y1  - 2009
A1  - Michele Berlingerio
A1  - Michele Coscia
A1  - Fosca Giannotti
JF  - IDA
ER  - 

TY  - CONF
T1  - Movement data anonymity through generalization
T2  - Proceedings of the 2nd SIGSPATIAL ACM GIS 2009 International Workshop on Security and Privacy in GIS and LBS
Y1  - 2009
A1  - Gennady Andrienko
A1  - Natalia Andrienko
A1  - Fosca Giannotti
A1  - Anna Monreale
A1  - Dino Pedreschi
AB  - In recent years, spatio-temporal and moving objects databases have gained considerable interest, due to the diffusion of mobile devices (e.g., mobile phones, RFID devices and GPS devices) and of new applications, where the discovery of consumable, concise, and applicable knowledge is the key step. Clearly, in these applications privacy is a concern, since models extracted from this kind of data can reveal the behavior of groups of individuals, thus compromising their privacy. Movement data present a new challenge for the privacy-preserving data mining community because of their spatial and temporal characteristics. In this position paper we briefly present an approach for the generalization of movement data that can be adopted for obtaining k-anonymity in spatio-temporal datasets; specifically, it can be used to realize a framework for publishing of spatio-temporal data while preserving privacy. We ran a preliminary set of experiments on a real-world trajectory dataset, demonstrating that this method of generalization of trajectories preserves the clustering analysis results.
JF  - Proceedings of the 2nd SIGSPATIAL ACM GIS 2009 International Workshop on Security and Privacy in GIS and LBS
PB  - ACM
ER  - 

TY  - CONF
T1  - A new technique for sequential pattern mining under regular expressions
T2  - SEBD
Y1  - 2009
A1  - Roberto Trasarti
A1  - Francesco Bonchi
A1  - Bart Goethals
JF  - SEBD
ER  - 

TY  - THES
T1  - Ontology Driven Knowledge Discovery
T2  - IMT - Lucca
Y1  - 2009
A1  - Barbara Furletti
JF  - IMT - Lucca
PB  - IMT - Lucca
CY  - Lucca - Italy
ER  - 

TY  - CONF
T1  - Poverty as a Social Condition: a Case Study on a Small Municipality in Tuscany
T2  - Global Recession: Regional Impacts on Housing, Jobs, Health and Wellbeing
Y1  - 2009
A1  - Tomei, Gabriele
A1  - Michela Natilli
JF  - Global Recession: Regional Impacts on Housing, Jobs, Health and Wellbeing
PB  - SEAFORD
ER  - 

TY  - JOUR
T1  - The Role of a Multi-tier Ontological Framework in Reasoning to Discover Meaningful Patterns of Sustainable Mobility
Y1  - 2009
A1  - Monica Wachowicz
A1  - José Antônio Fernandes de Macêdo
A1  - Chiara Renso
A1  - Arend Ligtenberg
N1  - Geographic Data Mining and Knowledge Discovery, 2nd Edition, to appear
ER  - 

TY  - CONF
T1  - Social Network Analysis as Knowledge Discovery Process: A Case Study on Digital Bibliography
T2  - ASONAM
Y1  - 2009
A1  - Michele Coscia
A1  - Fosca Giannotti
A1  - Ruggero G. Pensa
JF  - ASONAM
ER  - 

TY  - CONF
T1  - Temporal mining for interactive workflow data analysis
T2  - KDD
Y1  - 2009
A1  - Michele Berlingerio
A1  - Fabio Pinelli
A1  - Mirco Nanni
A1  - Fosca Giannotti
JF  - KDD
ER  - 

TY  - CONF
T1  - Towards Semantic Interpretation of Movement Behavior
T2  - AGILE Conf.
Y1  - 2009
A1  - Miriam Baglioni
A1  - José Antônio Fernandes de Macêdo
A1  - Chiara Renso
A1  - Roberto Trasarti
A1  - Monica Wachowicz
JF  - AGILE Conf.
ER  - 

TY  - CONF
T1  - Trajectory pattern analysis for urban traffic
T2  - Second International Workshop on Computational Transportation Science
Y1  - 2009
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - Fabio Pinelli
JF  - Second International Workshop on Computational Transportation Science
PB  - ACM
CY  - SEATTLE, USA
ER  - 

TY  - CONF
T1  - A Visual Analytics Toolkit for Cluster-Based Classification of Mobility Data
T2  - SSTD
Y1  - 2009
A1  - Gennady Andrienko
A1  - Natalia Andrienko
A1  - S Rinzivillo
A1  - Mirco Nanni
A1  - Dino Pedreschi
JF  - SSTD
ER  - 

TY  - CONF
T1  - Visual Cluster Analysis of Large Collections of Trajectories
T2  - IEEE Visual Analytics Science and Technology (VAST 2009)
Y1  - 2009
A1  - Gennady Andrienko
A1  - Natalia Andrienko
A1  - S Rinzivillo
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - Fosca Giannotti
JF  - IEEE Visual Analytics Science and Technology (VAST 2009)
PB  - IEEE Computer Society Press
ER  - 

TY  - Generic
T1  - WhereNext: a Location Predictor on Trajectory Pattern Mining
T2  - 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Y1  - 2009
A1  - Anna Monreale
A1  - Fabio Pinelli
A1  - Roberto Trasarti
A1  - Fosca Giannotti
AB  - The pervasiveness of mobile devices and location based services is leading to an increasing volume of mobility data. This side effect provides the opportunity for innovative methods that analyse the behaviors of movements. In this paper we propose WhereNext, which is a method aimed at predicting with a certain level of accuracy the next location of a moving object. The prediction uses previously extracted movement patterns named Trajectory Patterns, which are a concise representation of behaviors of moving objects as sequences of regions frequently visited with a typical travel time. A decision tree, named T-pattern Tree, is built and evaluated with a formal training and test process. The tree is learned from the Trajectory Patterns that hold a certain area and it may be used as a predictor of the next location of a new trajectory finding the best matching path in the tree. Three different best matching methods to classify a new moving object are proposed and their impact on the quality of prediction is studied extensively. Using Trajectory Patterns as predictive rules has the following implications: (I) the learning depends on the movement of all available objects in a certain area instead of on the individual history of an object; (II) the prediction tree intrinsically contains the spatio-temporal properties that have emerged from the data and this allows us to define matching methods that strictly depend on the properties of such movements. In addition, we propose a set of other measures, that evaluate a priori the predictive power of a set of Trajectory Patterns. These measures were tuned on a real life case study. Finally, an exhaustive set of experiments and results on the real dataset are presented.
JF  - 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
ER  - 

TY  - JOUR
T1  - Wine tourism in Italy: New profiles, styles of consumption, ways of touring
JF  - Turizam: međunarodni znanstveno-stručni časopis
Y1  - 2009
A1  - Romano, Maria Francesca
A1  - Michela Natilli
VL  - 57
ER  - 

TY  - JOUR
T1  - Anonymity preserving pattern discovery
JF  - VLDB J.
Y1  - 2008
A1  - Maurizio Atzori
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Dino Pedreschi
VL  - 17
ER  - 

TY  - JOUR
T1  - An Application of Advanced Spatio-Temporal Formalisms to Behavioural Ecology
Y1  - 2008
A1  - T. Ceccarelli
A1  - D. Centeno
A1  - Fosca Giannotti
A1  - A. Massolo
A1  - Christine Parent
A1  - Alessandra Raffaetà
A1  - Chiara Renso
A1  - Stefano Spaccapietra
A1  - Franco Turini
N1  - GeoInformatica, Volume 12, Number 1, March
ER  - 

TY  - JOUR
T1  - An Application of Advanced Spatio-Temporal Formalisms to Behavioural Ecology
JF  - GeoInformatica
Y1  - 2008
A1  - Alessandra Raffaetà
A1  - T. Ceccarelli
A1  - D. Centeno
A1  - Fosca Giannotti
A1  - A. Massolo
A1  - Christine Parent
A1  - Chiara Renso
A1  - Stefano Spaccapietra
A1  - Franco Turini
VL  - 12
ER  - 

TY  - CONF
T1  - A Case Study in Sequential Pattern Mining for IT-Operational Risk
T2  - ECML/PKDD (1)
Y1  - 2008
A1  - Valerio Grossi
A1  - Andrea Romei
A1  - Salvatore Ruggieri
JF  - ECML/PKDD (1)
ER  - 

TY  - CHAP
T1  - Characterising the Next Generation of Mobile Applications Through a Privacy-Aware Geographic Knowledge Discovery Process
T2  - Mobility, Data Mining and Privacy
Y1  - 2008
A1  - Monica Wachowicz
A1  - Arend Ligtenberg
A1  - Chiara Renso
A1  - Seda F. Gürses
JF  - Mobility, Data Mining and Privacy
PB  - a Knowledge Discovery vision
CY  - Mobility, Privacy, and Geography
ER  - 

TY  - CONF
T1  - Clustering of German municipalities based on mobility characteristics: an overview of results
T2  - GIS
Y1  - 2008
A1  - Andrea Zanda
A1  - Christine Körner
A1  - Fosca Giannotti
A1  - Daniel Schulz
A1  - Michael May
JF  - GIS
ER  - 

TY  - CONF
T1  - DAEDALUS: A knowledge discovery analysis framework for movement data
T2  - SEBD
Y1  - 2008
A1  - Riccardo Ortale
A1  - E Ritacco
A1  - N. Pelekis
A1  - Roberto Trasarti
A1  - Gianni Costa
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Chiara Renso
A1  - Yannis Theodoridis
JF  - SEBD
ER  - 

TY  - CONF
T1  - The DAEDALUS framework: progressive querying and mining of movement data
T2  - GIS
Y1  - 2008
A1  - Riccardo Ortale
A1  - E Ritacco
A1  - Nikos Pelekis
A1  - Roberto Trasarti
A1  - Gianni Costa
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Chiara Renso
A1  - Yannis Theodoridis
JF  - GIS
ER  - 

TY  - CHAP
T1  - Discovering Strategic Behaviour in Multi-Agent Scenarios by Ontology-Driven Mining
T2  - Advances in Robotics, Automation and Control
Y1  - 2008
A1  - Davide Bacciu
A1  - Andrea Bellandi
A1  - Barbara Furletti
A1  - Valerio Grossi
A1  - Andrea Romei
JF  - Advances in Robotics, Automation and Control
SN  - 978-953-7619-16-9
UR  - http://www.intechopen.com/books/advances_in_robotics_automation_and_control/discovering_strategic_behaviors_in_multi-agent_scenarios_by_ontology-driven_mining
ER  - 

TY  - CONF
T1  - Discrimination-aware data mining
T2  - KDD
Y1  - 2008
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
A1  - Franco Turini
JF  - KDD
ER  - 

TY  - CONF
T1  - AN EXTENSIBLE AND INTERACTIVE SOFTWARE AGENT FOR MOBILE DEVICES BASED ON GPS DATA
T2  - IADIS International Conference Applied Computing
Y1  - 2008
A1  - Barbara Furletti
A1  - Francesco Fornasari
A1  - Claudio Montanari
JF  - IADIS International Conference Applied Computing
SN  - 978-972-8924-56-0
UR  - http://www.iadisportal.org/digital-library/mdownload/an-extensible-and-interactive-software-agent-for-mobile-devices-based-on-gps-data
ER  - 

TY  - CHAP
T1  - Knowledge Discovery from Geographical Data
T2  - Mobility, Data Mining and Privacy
Y1  - 2008
A1  - S Rinzivillo
A1  - Franco Turini
A1  - Vania Bogorny
A1  - Christine Körner
A1  - Bart Kuijpers
A1  - Michael May
JF  - Mobility, Data Mining and Privacy
ER  - 

TY  - Generic
T1  - Location prediction within the mobility data analysis environment Daedalus
T2  - First International Workshop on Computational Transportation Science
Y1  - 2008
A1  - Fabio Pinelli
A1  - Anna Monreale
A1  - Roberto Trasarti
A1  - Fosca Giannotti
AB  - In this paper we propose a method to predict the next location of a moving object based on two recent results of the GeoPKDD project: DAEDALUS, a mobility data analysis environment, and Trajectory Pattern, a sequential pattern mining algorithm with temporal annotation integrated in DAEDALUS. The first one is a DMQL environment for moving objects, where both data and patterns can be represented. The second one extracts movement patterns as sequences of movements between locations with typical travel times. This paper proposes a prediction method which uses the local models extracted by Trajectory Pattern to build a global model called Prediction Tree. The future location of a moving object is predicted by visiting the tree and calculating the best matching function. The integration within the DAEDALUS system supports an interactive construction of the predictor on top of a set of spatio-temporal patterns. Other proposals in the literature base the definition of prediction methods for the future location of a moving object on previously extracted frequent patterns. They use the recent history of movements of the object itself and often use time only to order the events. Our work uses the movements of all moving objects in a certain area to learn a classifier built on the mined trajectory patterns, which are intrinsically equipped with temporal information.
JF  - First International Workshop on Computational Transportation Science
CY  - Dublin, Ireland
ER  - 

TY  - CHAP
T1  - Mobility, Data Mining and Privacy: A Vision of Convergence
T2  - Mobility, Data Mining and Privacy
Y1  - 2008
A1  - Fosca Giannotti
A1  - Dino Pedreschi
JF  - Mobility, Data Mining and Privacy
ER  - 

TY  - BOOK
T1  - Mobility, Data Mining and Privacy - Geographic Knowledge Discovery
T2  - Mobility, Data Mining and Privacy
Y1  - 2008
A1  - Fosca Giannotti
A1  - Dino Pedreschi
ED  - Fosca Giannotti
ED  - Dino Pedreschi
JF  - Mobility, Data Mining and Privacy
PB  - Springer
SN  - 978-3-540-75176-2
ER  - 

TY  - CONF
T1  - Mobility, Data Mining and Privacy the Experience of the GeoPKDD Project
T2  - PinKDD
Y1  - 2008
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - Franco Turini
JF  - PinKDD
ER  - 

TY  - CONF
T1  - Ontological Support for Association Rule Mining
T2  - IASTED International Conference on Artificial Intelligence and Applications (AIA)
Y1  - 2008
A1  - Barbara Furletti
A1  - Andrea Bellandi
A1  - Valerio Grossi
A1  - Andrea Romei
JF  - IASTED International Conference on Artificial Intelligence and Applications (AIA)
CY  - Innsbruck, Austria
ER  - 

TY  - CONF
T1  - An Ontology-Based Approach for the Semantic Modelling and Reasoning on Trajectories
T2  - ER Workshops
Y1  - 2008
A1  - Miriam Baglioni
A1  - José Antônio Fernandes de Macêdo
A1  - Chiara Renso
A1  - Monica Wachowicz
JF  - ER Workshops
ER  - 

TY  - CONF
T1  - Ontology-Based Business Plan Classification
T2  - Enterprise Distributed Object Computing Conference (EDOC)
Y1  - 2008
A1  - Franco Turini
A1  - Barbara Furletti
A1  - Miriam Baglioni
A1  - Laura Spinsanti
A1  - Andrea Bellandi
JF  - Enterprise Distributed Object Computing Conference (EDOC)
SN  - 978-0-7695-3373-5
UR  - http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4634789
ER  - 

TY  - CONF
T1  - Ontology-Based Business Plan Classification
T2  - EDOC
Y1  - 2008
A1  - Miriam Baglioni
A1  - Andrea Bellandi
A1  - Barbara Furletti
A1  - Laura Spinsanti
A1  - Franco Turini
JF  - EDOC
ER  - 

TY  - JOUR
T1  - Ontology-driven Querying of Geographical Databases
Y1  - 2008
A1  - Miriam Baglioni
A1  - E. Giovannetti
A1  - Maria V Masserotti
A1  - Chiara Renso
A1  - Laura Spinsanti
N1  - Transactions in GIS, Volume 12, Issue s1, December
ER  - 

TY  - CONF
T1  - Pattern-Preserving k-Anonymization of Sequences and its Application to Mobility Data Mining
T2  - PiLBA
Y1  - 2008
A1  - Ruggero G. Pensa
A1  - Anna Monreale
A1  - Fabio Pinelli
A1  - Dino Pedreschi
AB  - Sequential pattern mining is a major research field in knowledge discovery and data mining. Thanks to the increasing availability of transaction data, it is now possible to provide new and improved services based on users’ and customers’ behavior. However, this puts the citizen’s privacy at risk. Thus, it is important to develop new privacy-preserving data mining techniques that do not alter the analysis results significantly. In this paper we propose a new approach for anonymizing sequential data by hiding infrequent, and thus potentially sensible, subsequences. Our approach guarantees that the disclosed data are k-anonymous and preserve the quality of extracted patterns. An application to a real-world moving object database is presented, which shows the effectiveness of our approach also in complex contexts.
JF  - PiLBA
UR  - https://air.unimi.it/retrieve/handle/2434/52786/106397/ProceedingsPiLBA08.pdf#page=44
ER  - 

TY  - CHAP
T1  - Privacy Protection: Regulations and Technologies, Opportunities and Threats
T2  - Mobility, Data Mining and Privacy
Y1  - 2008
A1  - Dino Pedreschi
A1  - Francesco Bonchi
A1  - Franco Turini
A1  - Vassilios S. Verykios
A1  - Maurizio Atzori
A1  - Bradley Malin
A1  - Bart Moelans
A1  - Yücel Saygin
JF  - Mobility, Data Mining and Privacy
ER  - 

TY  - CHAP
T1  - Querying and Reasoning for Spatiotemporal Data Mining
T2  - Mobility, Data Mining and Privacy
Y1  - 2008
A1  - Giuseppe Manco
A1  - Miriam Baglioni
A1  - Fosca Giannotti
A1  - Bart Kuijpers
A1  - Alessandra Raffaetà
A1  - Chiara Renso
JF  - Mobility, Data Mining and Privacy
ER  - 

TY  - CHAP
T1  - Querying and Reasoning for Spatio-Temporal Data Mining
Y1  - 2008
A1  - Giuseppe Manco
A1  - Miriam Baglioni
A1  - Fosca Giannotti
A1  - Bart Kuijpers
A1  - Alessandra Raffaetà
A1  - Chiara Renso
PB  - a Knowledge Discovery vision
CY  - Mobility, Privacy, and Geography
N1  - A Springer LNCS Monograph, Fosca Giannotti and Dino Pedreschi, Editors, January
ER  - 

TY  - CHAP
T1  - Spatiotemporal Data Mining
T2  - Mobility, Data Mining and Privacy
Y1  - 2008
A1  - Mirco Nanni
A1  - Bart Kuijpers
A1  - Christine Körner
A1  - Michael May
A1  - Dino Pedreschi
JF  - Mobility, Data Mining and Privacy
ER  - 

TY  - CONF
T1  - Temporal analysis of process logs: a case study
T2  - SEBD
Y1  - 2008
A1  - Michele Berlingerio
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - Fabio Pinelli
JF  - SEBD
ER  - 

TY  - CONF
T1  - Typing Linear Constraints for Moding CLP(R) Programs
T2  - SAS
Y1  - 2008
A1  - Salvatore Ruggieri
A1  - Frédéric Mesnard
JF  - SAS
ER  - 

TY  - JOUR
T1  - Visually driven analysis of movement data by progressive clustering
JF  - Information Visualization
Y1  - 2008
A1  - S Rinzivillo
A1  - Dino Pedreschi
A1  - Mirco Nanni
A1  - Fosca Giannotti
A1  - Natalia Andrienko
A1  - Gennady Andrienko
PB  - Palgrave Macmillan Ltd
VL  - 7
ER  - 

TY  - CHAP
T1  - Wireless Network Data Sources: Tracking and Synthesizing Trajectories
T2  - Mobility, Data Mining and Privacy
Y1  - 2008
A1  - Chiara Renso
A1  - Simone Puntoni
A1  - E. Frentzos
A1  - Andrea Mazzoni
A1  - Bart Moelans
A1  - Nikos Pelekis
A1  - F. Pini
JF  - Mobility, Data Mining and Privacy
ER  - 

TY  - CONF
T1  - Building Geospatial Ontologies from Geographical Databases
T2  - GeoS
Y1  - 2007
A1  - Miriam Baglioni
A1  - Maria V Masserotti
A1  - Chiara Renso
A1  - Laura Spinsanti
JF  - GeoS
ER  - 

TY  - CONF
T1  - Hiding Sensitive Trajectory Patterns
T2  - ICDM Workshops
Y1  - 2007
A1  - Osman Abul
A1  - Maurizio Atzori
A1  - Francesco Bonchi
A1  - Fosca Giannotti
JF  - ICDM Workshops
ER  - 

TY  - CONF
T1  - Hiding Sequences
T2  - ICDE Workshops
Y1  - 2007
A1  - Osman Abul
A1  - Maurizio Atzori
A1  - Francesco Bonchi
A1  - Fosca Giannotti
JF  - ICDE Workshops
ER  - 

TY  - CONF
T1  - Hiding Sequences
T2  - SEBD
Y1  - 2007
A1  - Osman Abul
A1  - Maurizio Atzori
A1  - Francesco Bonchi
A1  - Fosca Giannotti
JF  - SEBD
ER  - 

TY  - JOUR
T1  - Knowledge discovery from spatial transactions
JF  - Journal of Intelligent Information Systems
Y1  - 2007
A1  - S Rinzivillo
A1  - Franco Turini
VL  - 28
ER  - 

TY  - CONF
T1  - Mining Clinical Data with a Temporal Dimension: A Case Study
T2  - BIBM
Y1  - 2007
A1  - Michele Berlingerio
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Franco Turini
JF  - BIBM
ER  - 

TY  - CONF
T1  - Ontology-Driven Association Rule Extraction: A Case Study
T2  - International Workshop on Contexts and Ontologies: Representation and Reasoning
Y1  - 2007
A1  - Barbara Furletti
A1  - Andrea Bellandi
A1  - Valerio Grossi
A1  - Andrea Romei
JF  - International Workshop on Contexts and Ontologies: Representation and Reasoning
CY  - Roskilde, Denmark
UR  - http://ceur-ws.org/Vol-298/paper1.pdf
ER  - 

TY  - CONF
T1  - Privacy-Aware Knowledge Discovery from Location Data
T2  - MDM
Y1  - 2007
A1  - Maurizio Atzori
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - Osman Abul
JF  - MDM
ER  - 

TY  - CONF
T1  - PUSHING CONSTRAINTS IN ASSOCIATION RULE MINING: AN ONTOLOGY-BASED APPROACH
T2  - IADIS International Conference WWW/Internet 2007
Y1  - 2007
A1  - Barbara Furletti
A1  - Andrea Bellandi
A1  - Andrea Romei
A1  - Valerio Grossi
JF  - IADIS International Conference WWW/Internet 2007
SN  - 978-972-8924-44-7
UR  - http://www.iadisportal.org/digital-library/mdownload/pushing-constraints-in-association-rule-mining-an-ontology-based-approach
ER  - 

TY  - CONF
T1  - Time-Annotated Sequences for Medical Data Mining
T2  - ICDM Workshops
Y1  - 2007
A1  - Michele Berlingerio
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Franco Turini
JF  - ICDM Workshops
ER  - 

TY  - CONF
T1  - Towards Constraint-Based Subgraph Mining
T2  - SEBD
Y1  - 2007
A1  - Michele Berlingerio
A1  - Francesco Bonchi
A1  - Fosca Giannotti
JF  - SEBD
ER  - 

TY  - CONF
T1  - Trajectory pattern mining
T2  - KDD
Y1  - 2007
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - Fabio Pinelli
A1  - Dino Pedreschi
JF  - KDD
ER  - 

TY  - CONF
T1  - ConQueSt: a Constraint-based Querying System for Exploratory Pattern Discovery
T2  - ICDE
Y1  - 2006
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Claudio Lucchese
A1  - Salvatore Orlando
A1  - Raffaele Perego
A1  - Roberto Trasarti
JF  - ICDE
ER  - 

TY  - CONF
T1  - Efficient Mining of Temporally Annotated Sequences
T2  - SDM
Y1  - 2006
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - Dino Pedreschi
JF  - SDM
ER  - 

TY  - CHAP
T1  - Examples of Integration of Induction and Deduction in Knowledge Discovery
T2  - Reasoning, Action and Interaction in AI Theories and Systems
Y1  - 2006
A1  - Franco Turini
A1  - Miriam Baglioni
A1  - Barbara Furletti
A1  - S Rinzivillo
JF  - Reasoning, Action and Interaction in AI Theories and Systems
T3  - LNAI
VL  - 4155
UR  - http://www.springerlink.com/content/m400v4507476n18g/fulltext.pdf
ER  - 

TY  - CONF
T1  - Examples of Integration of Induction and Deduction in Knowledge Discovery
T2  - Reasoning, Action and Interaction in AI Theories and Systems
Y1  - 2006
A1  - Franco Turini
A1  - Miriam Baglioni
A1  - Barbara Furletti
A1  - S Rinzivillo
JF  - Reasoning, Action and Interaction in AI Theories and Systems
ER  - 

TY  - CONF
T1  - On Interactive Pattern Mining from Relational Databases
T2  - KDID
Y1  - 2006
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Claudio Lucchese
A1  - Salvatore Orlando
A1  - Raffaele Perego
A1  - Roberto Trasarti
JF  - KDID
ER  - 

TY  - CONF
T1  - On Interactive Pattern Mining from Relational Databases
T2  - SEBD
Y1  - 2006
A1  - Claudio Lucchese
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Salvatore Orlando
A1  - Raffaele Perego
A1  - Roberto Trasarti
JF  - SEBD
ER  - 

TY  - CHAP
T1  - Maximum Entropy Reasoning for GIS
Y1  - 2006
A1  - H. Hosni
A1  - Maria V Masserotti
A1  - Chiara Renso
N1  - Flexible Databases supporting Imprecision and Uncertainty, Physica Verlag, June
ER  - 

TY  - CONF
T1  - Mining HLA Patterns Explaining Liver Diseases
T2  - CBMS
Y1  - 2006
A1  - Michele Berlingerio
A1  - Francesco Bonchi
A1  - Silvia Chelazzi
A1  - Michele Curcio
A1  - Fosca Giannotti
A1  - Fabrizio Scatena
JF  - CBMS
ER  - 

TY  - CONF
T1  - Mining sequences with temporal annotations
T2  - SAC
Y1  - 2006
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - Fabio Pinelli
JF  - SAC
ER  - 

TY  - JOUR
T1  - Time-focused clustering of trajectories of moving objects
JF  - J. Intell. Inf. Syst.
Y1  - 2006
A1  - Mirco Nanni
A1  - Dino Pedreschi
VL  - 27
ER  - 

TY  - CONF
T1  - A Tool for Economic Plans analysis based on expert knowledge and data mining techniques
T2  - IADIS International Conference Applied Computing 2006
Y1  - 2006
A1  - Miriam Baglioni
A1  - Barbara Furletti
A1  - Franco Turini
JF  - IADIS International Conference Applied Computing 2006
SN  - 972-8924-09-7
UR  - http://www.iadisportal.org/digital-library/mdownload/a-tool-for-economic-plans-analysis-based-on-expert-knowledge-and-data-mining-techniques
ER  - 

TY  - CONF
T1  - Towards low-perturbation anonymity preserving pattern discovery
T2  - SAC
Y1  - 2006
A1  - Maurizio Atzori
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Dino Pedreschi
JF  - SAC
ER  - 

TY  - JOUR
T1  - Anonymity and data mining
JF  - Comput. Syst. Sci. Eng.
Y1  - 2005
A1  - Maurizio Atzori
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Dino Pedreschi
VL  - 20
ER  - 

TY  - CONF
T1  - Blocking Anonymity Threats Raised by Frequent Itemset Mining
T2  - ICDM
Y1  - 2005
A1  - Maurizio Atzori
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Dino Pedreschi
JF  - ICDM
ER  - 

TY  - CONF
T1  - Comparative indicators of regional poverty and deprivation: Poland versus EU-15 Member States
T2  - conference Comparative Economic Analysis of Households’ Behaviour (CEAHB): Old and New EU Members, Warsaw University
Y1  - 2005
A1  - Betti, Gianni
A1  - Mulas, Anna
A1  - Michela Natilli
A1  - Neri, Laura
A1  - Verma, Vijay
JF  - conference Comparative Economic Analysis of Households’ Behaviour (CEAHB): Old and New EU Members, Warsaw University
ER  - 

TY  - CONF
T1  - DrC4.5: Improving C4.5 by means of Prior Knowledge
T2  - ACM Symposium on Applied Computing
Y1  - 2005
A1  - Miriam Baglioni
A1  - Barbara Furletti
A1  - Franco Turini
JF  - ACM Symposium on Applied Computing
PB  - ACM
CY  - Santa Fe, New Mexico, USA
SN  - 1-58113-964-0
UR  - http://dl.acm.org/ft_gateway.cfm?id=1066787&ftid=311609&dwn=1&CFID=96873366&CFTOKEN=59233511
ER  - 

TY  - JOUR
T1  - Efficient breadth-first mining of frequent pattern with monotone constraints
JF  - Knowl. Inf. Syst.
Y1  - 2005
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Alessio Mazzanti
A1  - Dino Pedreschi
VL  - 8
ER  - 

TY  - JOUR
T1  - Exante: A Preprocessing Method for Frequent-Pattern Mining
JF  - IEEE Intelligent Systems
Y1  - 2005
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Alessio Mazzanti
A1  - Dino Pedreschi
VL  - 20
ER  - 

TY  - CONF
T1  - Extracting spatial association rules from spatial transactions
T2  - ACM GIS
Y1  - 2005
A1  - S Rinzivillo
A1  - Franco Turini
JF  - ACM GIS
ER  - 

TY  - ABST
T1  - Indicators of social exclusion and poverty in Europe’s regions
Y1  - 2005
A1  - Verma, Vijay
A1  - Betti, Gianni
A1  - Michela Natilli
A1  - Lemmi, Achille
ER  - 

TY  - CONF
T1  - Synthetic generation of cellular network positioning data
T2  - GIS
Y1  - 2005
A1  - Fosca Giannotti
A1  - Andrea Mazzoni
A1  - Simone Puntoni
A1  - Chiara Renso
JF  - GIS
ER  - 

TY  - JOUR
T1  - Bounded Nondeterminism of Logic Programs
JF  - Ann. Math. Artif. Intell.
Y1  - 2004
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
VL  - 42
ER  - 

TY  - CONF
T1  - Characterisations of Termination in Logic Programming
T2  - Program Development in Computational Logic
Y1  - 2004
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
A1  - Jan-Georg Smaus
JF  - Program Development in Computational Logic
ER  - 

TY  - CONF
T1  - Classification in Geographical Information Systems
T2  - PKDD
Y1  - 2004
A1  - S Rinzivillo
A1  - Franco Turini
JF  - PKDD
ER  - 

TY  - CONF
T1  - Deductive and Inductive Reasoning on Spatio-Temporal Data
T2  - INAP/WLP
Y1  - 2004
A1  - Mirco Nanni
A1  - Alessandra Raffaetà
A1  - Chiara Renso
A1  - Franco Turini
JF  - INAP/WLP
ER  - 

TY  - CONF
T1  - Deductive and Inductive Reasoning on Trajectories
T2  - SEBD
Y1  - 2004
A1  - Mirco Nanni
A1  - Alessandra Raffaetà
A1  - Chiara Renso
A1  - Franco Turini
JF  - SEBD
ER  - 

TY  - CONF
T1  - Discovery of ads web hosts through traffic data analysis
T2  - DMKD
Y1  - 2004
A1  - V. Bacarella
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - Dino Pedreschi
JF  - DMKD
ER  - 

TY  - CONF
T1  - Frequent Pattern Queries for Flexible Knowledge Discovery
T2  - SEBD
Y1  - 2004
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Dino Pedreschi
JF  - SEBD
ER  - 

TY  - JOUR
T1  - Integrating knowledge representation and reasoning in Geographical Information Systems
JF  - International Journal of Geographical Information Science
Y1  - 2004
A1  - Paolo Mancarella
A1  - Alessandra Raffaetà
A1  - Chiara Renso
A1  - Franco Turini
VL  - 18
ER  - 

TY  - JOUR
T1  - Integrating Knowledge Representation and Reasoning in Geographical Information Systems
Y1  - 2004
A1  - Paolo Mancarella
A1  - Alessandra Raffaetà
A1  - Chiara Renso
A1  - Franco Turini
N1  - International Journal of GIS, Vol 18 (4), June
ER  - 

TY  - CONF
T1  - IT4PS: information technology for problem solving
T2  - ITiCSE
Y1  - 2004
A1  - C. Alfonsi
A1  - Nello Scarabottolo
A1  - Dino Pedreschi
A1  - Maria Simi
JF  - ITiCSE
ER  - 

TY  - Generic
T1  - Knowledge Discovery in Databases: PKDD 2004, 8th European Conference on Principles and Practice of Knowledge Discovery in Databases, Pisa, Italy, September 20-24, 2004, Proceedings
T2  - Lecture Notes in Computer Science
Y1  - 2004
A1  - Jean-François Boulicaut
A1  - Floriana Esposito
A1  - Fosca Giannotti
A1  - Dino Pedreschi
JF  - Lecture Notes in Computer Science
PB  - Springer
VL  - 3202
SN  - 3-540-23108-0
ER  - 

TY  - Generic
T1  - Machine Learning: ECML 2004, 15th European Conference on Machine Learning, Pisa, Italy, September 20-24, 2004, Proceedings
T2  - Lecture Notes in Computer Science
Y1  - 2004
A1  - Jean-François Boulicaut
A1  - Floriana Esposito
A1  - Fosca Giannotti
A1  - Dino Pedreschi
JF  - Lecture Notes in Computer Science
PB  - Springer
VL  - 3201
SN  - 3-540-23105-6
ER  - 

TY  - JOUR
T1  - A Declarative Framework for Reasoning on Spatio-temporal Data
Y1  - 2004
A1  - Mirco Nanni
A1  - Alessandra Raffaetà
A1  - Chiara Renso
A1  - Franco Turini
N1  - Book chapter in Spatio-temporal databases, flexible querying and reasoning, R. de Caluwe, G. de Trè, G. Bordogna editors, Physica Verlag
ER  - 

TY  - CONF
T1  - Pushing Constraints to Detect Local Patterns
T2  - Local Pattern Detection
Y1  - 2004
A1  - Francesco Bonchi
A1  - Fosca Giannotti
JF  - Local Pattern Detection
ER  - 

TY  - CONF
T1  - A Relational Query Primitive for Constraint-Based Pattern Mining
T2  - Constraint-Based Mining and Inductive Databases
Y1  - 2004
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Dino Pedreschi
JF  - Constraint-Based Mining and Inductive Databases
ER  - 

TY  - JOUR
T1  - Specifying Mining Algorithms with Iterative User-Defined Aggregates
JF  - IEEE Trans. Knowl. Data Eng.
Y1  - 2004
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Franco Turini
VL  - 16
ER  - 

TY  - CONF
T1  - Towards a Logic Query Language for Data Mining
T2  - Database Support for Data Mining Applications
Y1  - 2004
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Franco Turini
JF  - Database Support for Data Mining Applications
ER  - 

TY  - CONF
T1  - Adaptive Constraint Pushing in Frequent Pattern Mining
T2  - PKDD
Y1  - 2003
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Alessio Mazzanti
A1  - Dino Pedreschi
JF  - PKDD
ER  - 

TY  - CONF
T1  - ExAMiner: Optimized Level-wise Frequent Pattern Mining with Monotone Constraint
T2  - ICDM
Y1  - 2003
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Alessio Mazzanti
A1  - Dino Pedreschi
JF  - ICDM
ER  - 

TY  - CONF
T1  - ExAnte: Anticipated Data Reduction in Constrained Pattern Mining
T2  - PKDD
Y1  - 2003
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Alessio Mazzanti
A1  - Dino Pedreschi
JF  - PKDD
ER  - 

TY  - JOUR
T1  - On logic programs that always succeed
JF  - Sci. Comput. Program.
Y1  - 2003
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
VL  - 48
ER  - 

TY  - CONF
T1  - Logical Languages for Data Mining
T2  - Logics for Emerging Applications of Databases
Y1  - 2003
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Jef Wijsen
JF  - Logics for Emerging Applications of Databases
ER  - 

TY  - CONF
T1  - Personal income in the gross and net forms: applications of the Siena Micro-Simulation Model (SM2)
T2  - conference of the Società Italiana di Economia, Demografia e Statistica (SIEDS), Campobasso
Y1  - 2003
A1  - Verma, V
A1  - Betti, G
A1  - Ballini, F
A1  - Michela Natilli
A1  - Galgani, S
JF  - conference of the Società Italiana di Economia, Demografia e Statistica (SIEDS), Campobasso
ER  - 

TY  - CONF
T1  - Pre-processing for Constrained Pattern Mining
T2  - SEBD
Y1  - 2003
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Alessio Mazzanti
A1  - Dino Pedreschi
JF  - SEBD
ER  - 

TY  - CONF
T1  - Qualitative Spatial Reasoning in a Logical Framework
T2  - AI*IA
Y1  - 2003
A1  - Alessandra Raffaetà
A1  - Chiara Renso
A1  - Franco Turini
JF  - AI*IA
ER  - 

TY  - CONF
T1  - Using Spin to Generate Tests from ASM Specifications
T2  - Abstract State Machines
Y1  - 2003
A1  - Angelo Gargantini
A1  - Elvinia Riccobene
A1  - S Rinzivillo
JF  - Abstract State Machines
ER  - 

TY  - CONF
T1  - WebCat: Automatic Categorization of Web Search Results
T2  - SEBD
Y1  - 2003
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - F. Samaritani
JF  - SEBD
ER  - 

TY  - CONF
T1  - Characterizing Web User Accesses: A Transactional Approach to Web Log Clustering
T2  - ITCC
Y1  - 2002
A1  - Fosca Giannotti
A1  - Cristian Gozzi
A1  - Giuseppe Manco
JF  - ITCC
ER  - 

TY  - JOUR
T1  - Classes of terminating logic programs
JF  - TPLP
Y1  - 2002
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
A1  - Jan-Georg Smaus
VL  - 2
ER  - 

TY  - CONF
T1  - Clustering Transactional Data
T2  - PKDD
Y1  - 2002
A1  - Fosca Giannotti
A1  - Cristian Gozzi
A1  - Giuseppe Manco
JF  - PKDD
ER  - 

TY  - CONF
T1  - The Declarative Side of Magic
T2  - Computational Logic: Logic Programming and Beyond
Y1  - 2002
A1  - Paolo Mascellani
A1  - Dino Pedreschi
JF  - Computational Logic: Logic Programming and Beyond
ER  - 

TY  - CONF
T1  - Enhancing GISs for spatio-temporal reasoning
T2  - ACM-GIS
Y1  - 2002
A1  - Alessandra Raffaetà
A1  - Franco Turini
A1  - Chiara Renso
JF  - ACM-GIS
ER  - 

TY  - CONF
T1  - Invited talk: Logical Data Mining Query Languages
T2  - KDID
Y1  - 2002
A1  - Fosca Giannotti
JF  - KDID
ER  - 

TY  - CONF
T1  - LDL-Mine: Integrating Data Mining with Intelligent Query Answering
T2  - JELIA
Y1  - 2002
A1  - Fosca Giannotti
A1  - Giuseppe Manco
JF  - JELIA
ER  - 

TY  - CONF
T1  - Negation as Failure through Abduction: Reasoning about Termination
T2  - Computational Logic: Logic Programming and Beyond
Y1  - 2002
A1  - Paolo Mancarella
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
JF  - Computational Logic: Logic Programming and Beyond
ER  - 

TY  - CONF
T1  - Qualitative Reasoning in a Spatio-Temporal Language
T2  - SEBD
Y1  - 2002
A1  - Alessandra Raffaetà
A1  - Chiara Renso
A1  - Franco Turini
JF  - SEBD
ER  - 

TY  - JOUR
T1  - Classes of Terminating Logic Programs
JF  - CoRR
Y1  - 2001
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
A1  - Jan-Georg Smaus
VL  - cs.LO/0106
ER  - 

TY  - CONF
T1  - Clustering Transactional Data
T2  - SEBD
Y1  - 2001
A1  - Fosca Giannotti
A1  - Cristian Gozzi
A1  - Giuseppe Manco
JF  - SEBD
ER  - 

TY  - CONF
T1  - Complex Reasoning on Geographical Data
T2  - SEBD
Y1  - 2001
A1  - Fosca Giannotti
A1  - Alessandra Raffaetà
A1  - Chiara Renso
A1  - Franco Turini
JF  - SEBD
ER  - 

TY  - CONF
T1  - Data Mining for Intelligent Web Caching
T2  - ITCC
Y1  - 2001
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Chiara Renso
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
JF  - ITCC
ER  - 

TY  - JOUR
T1  - Nondeterministic, Nonmonotonic Logic Databases
JF  - IEEE Trans. Knowl. Data Eng.
Y1  - 2001
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Mirco Nanni
A1  - Dino Pedreschi
VL  - 13
ER  - 

TY  - JOUR
T1  - Semantics and Expressive Power of Nondeterministic Constructs in Deductive Databases
JF  - J. Comput. Syst. Sci.
Y1  - 2001
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - Carlo Zaniolo
VL  - 62
ER  - 

TY  - CONF
T1  - Specifying Mining Algorithms with Iterative User-Defined Aggregates: A Case Study
T2  - PKDD
Y1  - 2001
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Franco Turini
JF  - PKDD
ER  - 

TY  - JOUR
T1  - Web log data warehousing and mining for intelligent web caching
JF  - Data Knowl. Eng.
Y1  - 2001
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Cristian Gozzi
A1  - Giuseppe Manco
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - Chiara Renso
A1  - Salvatore Ruggieri
VL  - 39
ER  - 

TY  - JOUR
T1  - Web Log Data Warehousing and Mining for Intelligent Web Caching
JF  - Data and Knowledge Engineering
Y1  - 2001
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Cristian Gozzi
A1  - Giuseppe Manco
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - Chiara Renso
A1  - Salvatore Ruggieri
N1  - 39:165, November
ER  - 

TY  - CONF
T1  - Declarative Knowledge Extraction with Interactive User-Defined Aggregates
T2  - FQAS
Y1  - 2000
A1  - Fosca Giannotti
A1  - Giuseppe Manco
JF  - FQAS
ER  - 

TY  - JOUR
T1  - Foundations of distributed interaction systems
JF  - Ann. Math. Artif. Intell.
Y1  - 2000
A1  - Marat Fayzullin
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - V. S. Subrahmanian
VL  - 28
ER  - 

TY  - CONF
T1  - Logic-Based Knowledge Discovery in Databases
T2  - EJC
Y1  - 2000
A1  - Fosca Giannotti
A1  - Mirco Nanni
A1  - Dino Pedreschi
JF  - EJC
ER  - 

TY  - CONF
T1  - Making Knowledge Extraction and Reasoning Closer
T2  - PAKDD
Y1  - 2000
A1  - Fosca Giannotti
A1  - Giuseppe Manco
JF  - PAKDD
ER  - 

TY  - CONF
T1  - Temporal Reasoning in Geographical Information Systems
T2  - DEXA Workshop
Y1  - 2000
A1  - Alessandra Raffaetà
A1  - Chiara Renso
JF  - DEXA Workshop
ER  - 

TY  - JOUR
T1  - Using MedLan to Integrate Geographical Data
JF  - J. Log. Program.
Y1  - 2000
A1  - Domenico Aquilino
A1  - Patrizia Asirelli
A1  - A Formuso
A1  - Chiara Renso
A1  - Franco Turini
VL  - 43
ER  - 

TY  - JOUR
T1  - Using Medlan to Integrate Geographical Data
JF  - Journal of Logic Programming
Y1  - 2000
A1  - Domenico Aquilino
A1  - Patrizia Asirelli
A1  - A Formuso
A1  - Chiara Renso
A1  - Franco Turini
N1  - 43(1):.
ER  - 

TY  - CONF
T1  - On Verification in Logic Database Languages
T2  - Computational Logic
Y1  - 2000
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Dino Pedreschi
JF  - Computational Logic
ER  - 

TY  - CONF
T1  - Beyond Current Technology: The Perspective of Three EC GIS Projects
T2  - DEXA Workshop
Y1  - 1999
A1  - Fosca Giannotti
A1  - Robert Jeansoulin
A1  - Yannis Theodoridis
JF  - DEXA Workshop
ER  - 

TY  - CONF
T1  - Bounded Nondeterminism of Logic Programs
T2  - ICLP
Y1  - 1999
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
JF  - ICLP
ER  - 

TY  - CONF
T1  - A Classification-Based Methodology for Planning Audit Strategies in Fraud Detection
T2  - KDD
Y1  - 1999
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Gianni Mainetto
A1  - Dino Pedreschi
JF  - KDD
ER  - 

TY  - JOUR
T1  - Dynamic composition of parameterised logic modules
JF  - Comput. Lang.
Y1  - 1999
A1  - Antonio Brogi
A1  - Chiara Renso
A1  - Franco Turini
VL  - 25
ER  - 

TY  - JOUR
T1  - Dynamic Composition of Parameterised Logic Modules
JF  - Computer Languages
Y1  - 1999
A1  - Antonio Brogi
A1  - Chiara Renso
A1  - Franco Turini
N1  - 25(4):.
ER  - 

TY  - CONF
T1  - Experiences with a Logic-Based Knowledge Discovery Support Environment
T2  - AI*IA
Y1  - 1999
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Dino Pedreschi
A1  - Franco Turini
JF  - AI*IA
ER  - 

TY  - CONF
T1  - Experiences with a Logic-Based Knowledge Discovery Support Environment
T2  - 1999 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery
Y1  - 1999
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Dino Pedreschi
A1  - Franco Turini
JF  - 1999 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery
ER  - 

TY  - CONF
T1  - Integration of Deduction and Induction for Mining Supermarket Sales Data
T2  - SEBD
Y1  - 1999
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - Franco Turini
JF  - SEBD
ER  - 

TY  - JOUR
T1  - On Logic Programs That Do Not Fail
JF  - Electr. Notes Theor. Comput. Sci.
Y1  - 1999
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
VL  - 30
ER  - 

TY  - CONF
T1  - Querying Inductive Databases via Logic-Based User-Defined Aggregates
T2  - APPIA-GULP-PRODE
Y1  - 1999
A1  - Fosca Giannotti
A1  - Giuseppe Manco
JF  - APPIA-GULP-PRODE
ER  - 

TY  - CONF
T1  - Querying Inductive Databases via Logic-Based User-Defined Aggregates
T2  - PKDD
Y1  - 1999
A1  - Fosca Giannotti
A1  - Giuseppe Manco
JF  - PKDD
ER  - 

TY  - CONF
T1  - Una Metodologia Basata sulla Classificazione per la Pianificazione degli Accertamenti nel Rilevamento di Frodi
T2  - SEBD
Y1  - 1999
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Gianni Mainetto
A1  - Dino Pedreschi
JF  - SEBD
ER  - 

TY  - CONF
T1  - Using Data Mining Techniques in Fiscal Fraud Detection
T2  - DaWaK
Y1  - 1999
A1  - Francesco Bonchi
A1  - Fosca Giannotti
A1  - Gianni Mainetto
A1  - Dino Pedreschi
JF  - DaWaK
ER  - 

TY  - JOUR
T1  - Verification of Logic Programs
JF  - J. Log. Program.
Y1  - 1999
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
VL  - 39
ER  - 

TY  - CONF
T1  - The Constraint Operator of MedLan: Its Efficient Implementation and Use
T2  - IICIS
Y1  - 1998
A1  - Patrizia Asirelli
A1  - Chiara Renso
A1  - Franco Turini
JF  - IICIS
ER  - 

TY  - JOUR
T1  - Datalog with Non-Deterministic Choice Computes NDB-PTIME
JF  - J. Log. Program.
Y1  - 1998
A1  - Fosca Giannotti
A1  - Dino Pedreschi
VL  - 35
ER  - 

TY  - CONF
T1  - On the Effective Semantics of Nondeterministic, Nonmonotonic, Temporal Logic Databases
T2  - CSL
Y1  - 1998
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Mirco Nanni
A1  - Dino Pedreschi
JF  - CSL
ER  - 

TY  - BOOK
T1  - Mechanisms for Semantic Integration of Deductive Databases
Y1  - 1998
A1  - Chiara Renso
N1  - PhD thesis, Dipartimento di Informatica, University of Pisa.
ER  - 

TY  - JOUR
T1  - A Mediator Approach for Representing Knowledge
JF  - Intelligent Multimedia Presentation Systems. Human Computer Interaction Letters, 1 (1): 32-38, April 1998.
Y1  - 1998
A1  - Chiara Renso
A1  - Salvatore Ruggieri
ER  - 

TY  - CONF
T1  - Query Answering in Nondeterministic, Nonmonotonic Logic Databases
T2  - FQAS
Y1  - 1998
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Mirco Nanni
A1  - Dino Pedreschi
JF  - FQAS
ER  - 

TY  - JOUR
T1  - Weakest Preconditions for Pure Prolog Programs
JF  - Inf. Process. Lett.
Y1  - 1998
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
VL  - 67
ER  - 

TY  - JOUR
T1  - Applying Restriction Constraints to Deductive Databases
JF  - Annals of Mathematics and Artificial Intelligence
Y1  - 1997
A1  - Domenico Aquilino
A1  - Patrizia Asirelli
A1  - Chiara Renso
A1  - Franco Turini
N1  - 1997
ER  - 

TY  - JOUR
T1  - Applying Restriction Constraints to Deductive Databases
JF  - Ann. Math. Artif. Intell.
Y1  - 1997
A1  - Domenico Aquilino
A1  - Patrizia Asirelli
A1  - Chiara Renso
A1  - Franco Turini
VL  - 19
ER  - 

TY  - CONF
T1  - Datalog++: A Basis for Active Object-Oriented Databases
T2  - DOOD
Y1  - 1997
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Mirco Nanni
A1  - Dino Pedreschi
JF  - DOOD
ER  - 

TY  - CONF
T1  - Datalog++: A Basis for Active Object-Oriented Databases
T2  - SEBD
Y1  - 1997
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Mirco Nanni
A1  - Dino Pedreschi
JF  - SEBD
ER  - 

TY  - CONF
T1  - A Deductive Data Model for Representing and Querying Semistructured Data
T2  - APPIA-GULP-PRODE
Y1  - 1997
A1  - Fosca Giannotti
A1  - Giuseppe Manco
A1  - Dino Pedreschi
JF  - APPIA-GULP-PRODE
ER  - 

TY  - JOUR
T1  - Non-determinism in Deductive Databases - Preface
JF  - Ann. Math. Artif. Intell.
Y1  - 1997
A1  - Dino Pedreschi
A1  - V. S. Subrahmanian
VL  - 19
ER  - 

TY  - JOUR
T1  - Programming with Non-Determinism in Deductive Databases
JF  - Ann. Math. Artif. Intell.
Y1  - 1997
A1  - Fosca Giannotti
A1  - Sergio Greco
A1  - Domenico Saccà
A1  - Carlo Zaniolo
VL  - 19
ER  - 

TY  - CONF
T1  - Static Analysis of Transactions for Conservative Multigranularity Locking
T2  - DBPL
Y1  - 1997
A1  - Giuseppe Amato
A1  - Fosca Giannotti
A1  - Gianni Mainetto
JF  - DBPL
ER  - 

TY  - JOUR
T1  - Verification of Meta-Interpreters
JF  - J. Log. Comput.
Y1  - 1997
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
VL  - 7
ER  - 

TY  - JOUR
T1  - A Closer Look at Declarative Interpretations
JF  - J. Log. Program.
Y1  - 1996
A1  - Krzysztof R. Apt
A1  - Maurizio Gabbrielli
A1  - Dino Pedreschi
VL  - 28
ER  - 

TY  - CONF
T1  - Language Extensions for Semantic Integration of Deductive Databases
T2  - Logic in Databases
Y1  - 1996
A1  - Patrizia Asirelli
A1  - Chiara Renso
A1  - Franco Turini
JF  - Logic in Databases
ER  - 

TY  - Generic
T1  - Logic in Databases, International Workshop LID'96, San Miniato, Italy, July 1-2, 1996, Proceedings
T2  - Lecture Notes in Computer Science
Y1  - 1996
A1  - Dino Pedreschi
A1  - Carlo Zaniolo
JF  - Lecture Notes in Computer Science
PB  - Springer
VL  - 1154
SN  - 3-540-61814-7
ER  - 

TY  - CONF
T1  - Ragionamento spazio-temporale con LDLT: primi esperimenti verso un sistema deduttivo per applicazioni geografiche
T2  - SEBD
Y1  - 1996
A1  - Marilisa E. Carboni
A1  - Annalisa Di Deo
A1  - Fosca Giannotti
A1  - Maria V Masserotti
JF  - SEBD
ER  - 

TY  - CONF
T1  - Spatio-Temporal Reasoning with LDLT: First Steps Towards a Deductive System for Geographical Applications
T2  - DDLP
Y1  - 1996
A1  - Marilisa E. Carboni
A1  - Annalisa Di Deo
A1  - Fosca Giannotti
A1  - Maria V Masserotti
JF  - DDLP
ER  - 

TY  - CHAP
T1  - Towards Declarative GIS Analysis
Y1  - 1996
A1  - Domenico Aquilino
A1  - Chiara Renso
A1  - Franco Turini
N1  - Proceedings of the Fourth ACM Workshop on Advances in Geographic Information Systems, pages.
ER  - 

TY  - CONF
T1  - Towards Declarative GIS Analysis
T2  - ACM-GIS
Y1  - 1996
A1  - Domenico Aquilino
A1  - Chiara Renso
A1  - Franco Turini
JF  - ACM-GIS
ER  - 

TY  - CONF
T1  - Using Temporary Integrity Constraints to Optimize Databases
T2  - FAPR
Y1  - 1996
A1  - Danilo Montesi
A1  - Chiara Renso
A1  - Franco Turini
JF  - FAPR
ER  - 

TY  - CONF
T1  - A Case Study in Logic Program Verification: the Vanilla Metainterpreter
T2  - GULP-PRODE
Y1  - 1995
A1  - Dino Pedreschi
A1  - Salvatore Ruggieri
JF  - GULP-PRODE
ER  - 

TY  - CONF
T1  - Declarative Reconstruction of Updates in Logic Databases: a Compilative Approach
T2  - GULP-PRODE
Y1  - 1995
A1  - Marilisa E. Carboni
A1  - V. Foddai
A1  - Fosca Giannotti
A1  - Dino Pedreschi
JF  - GULP-PRODE
ER  - 

TY  - CONF
T1  - Declarative Reconstruction of Updates in Logic Databases: A Compilative Approach
T2  - SEBD
Y1  - 1995
A1  - Marilisa E. Carboni
A1  - Fosca Giannotti
A1  - V. Foddai
A1  - Dino Pedreschi
JF  - SEBD
ER  - 

TY  - CONF
T1  - An Operator for Composing Deductive Databases with Theories of Constraints
T2  - LPNMR
Y1  - 1995
A1  - Domenico Aquilino
A1  - Patrizia Asirelli
A1  - Chiara Renso
A1  - Franco Turini
JF  - LPNMR
ER  - 

TY  - JOUR
T1  - An Operator for Composing Deductive Databases with Theories of Constraints
Y1  - 1995
A1  - Domenico Aquilino
A1  - Patrizia Asirelli
A1  - Chiara Renso
A1  - Franco Turini
N1  - Logic Programming and Nonmonotonic Reasoning, Third International Conference Lecture Notes in Computer Science vol 928,
ER  - 

TY  - CONF
T1  - An abstract interpreter for the specification language LOTOS
T2  - FORTE
Y1  - 1994
A1  - Franco Fiore
A1  - Fosca Giannotti
JF  - FORTE
ER  - 

TY  - CONF
T1  - Amalgamating Language and Meta-language for Composing Logic Programs
T2  - GULP-PRODE (2)
Y1  - 1994
A1  - Antonio Brogi
A1  - Chiara Renso
A1  - Franco Turini
JF  - GULP-PRODE (2)
ER  - 

TY  - CONF
T1  - Conservative Multigranularity Locking for an Object-Oriented Persistent Language via Abstract Interpretation
T2  - SEBD
Y1  - 1994
A1  - Giuseppe Amato
A1  - Fosca Giannotti
A1  - Gianni Mainetto
JF  - SEBD
ER  - 

TY  - CONF
T1  - Expressive Power of Non-Deterministic Operators for Logic-based Languages
T2  - Workshop on Deductive Databases and Logic Programming
Y1  - 1994
A1  - Luca Corciulo
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - Carlo Zaniolo
JF  - Workshop on Deductive Databases and Logic Programming
ER  - 

TY  - JOUR
T1  - Gate Splitting in LOTOS Specifications Using Abstract Interpretation
JF  - Sci. Comput. Program.
Y1  - 1994
A1  - Fosca Giannotti
A1  - Diego Latella
VL  - 23
ER  - 

TY  - JOUR
T1  - Implementations of Program Composition Operations
Y1  - 1994
A1  - Antonio Brogi
A1  - A. Chiarelli
A1  - Paolo Mancarella
A1  - V. Mazzotta
A1  - Dino Pedreschi
A1  - Chiara Renso
A1  - Franco Turini
N1  - Programming Language Implementation and Logic Programming Lecture Notes in Computer Science, volume 844,
ER  - 

TY  - CONF
T1  - Implementations of Program Composition Operations
T2  - PLILP
Y1  - 1994
A1  - Antonio Brogi
A1  - A. Chiarelli
A1  - Paolo Mancarella
A1  - V. Mazzotta
A1  - Dino Pedreschi
A1  - Chiara Renso
A1  - Franco Turini
JF  - PLILP
ER  - 

TY  - JOUR
T1  - Modular Logic Programming
JF  - ACM Trans. Program. Lang. Syst.
Y1  - 1994
A1  - Antonio Brogi
A1  - Paolo Mancarella
A1  - Dino Pedreschi
A1  - Franco Turini
VL  - 16
ER  - 

TY  - CONF
T1  - A Proof Method for Runtime Properties of Prolog Programs
T2  - ICLP
Y1  - 1994
A1  - Dino Pedreschi
JF  - ICLP
ER  - 

TY  - CONF
T1  - Proving termination of Prolog programs
T2  - GULP-PRODE (1)
Y1  - 1994
A1  - Paolo Mascellani
A1  - Dino Pedreschi
JF  - GULP-PRODE (1)
ER  - 

TY  - CONF
T1  - Data Sharing Analysis for a Database Programming Language via Abstract Interpretation
T2  - VLDB
Y1  - 1993
A1  - Giuseppe Amato
A1  - Fosca Giannotti
A1  - Gianni Mainetto
JF  - VLDB
ER  - 

TY  - CONF
T1  - Datalog with Non-Deterministic Choice Computes NDB-PTIME
T2  - DOOD
Y1  - 1993
A1  - Luca Corciulo
A1  - Fosca Giannotti
A1  - Dino Pedreschi
JF  - DOOD
ER  - 

TY  - CONF
T1  - Gate Splitting in LOTOS Specifications Using Abstract Interpretation
T2  - TAPSOFT
Y1  - 1993
A1  - Fosca Giannotti
A1  - Diego Latella
JF  - TAPSOFT
ER  - 

TY  - JOUR
T1  - Reasoning about Termination of Pure Prolog Programs
JF  - Inf. Comput.
Y1  - 1993
A1  - Krzysztof R. Apt
A1  - Dino Pedreschi
VL  - 106
ER  - 

TY  - CONF
T1  - Static Analysis of Transactions: an Experiment of Abstract Interpretation Usage
T2  - FMLDO
Y1  - 1993
A1  - Giuseppe Amato
A1  - Fosca Giannotti
A1  - Gianni Mainetto
JF  - FMLDO
ER  - 

TY  - CONF
T1  - A WAM Estesa per la Composizione di Programmi Logici
T2  - GULP
Y1  - 1993
A1  - A. Chiarelli
A1  - V. Mazzotta
A1  - Chiara Renso
JF  - GULP
ER  - 

TY  - CONF
T1  - Analysis of Concurrent Transactions in a Functional Database Programming Language
T2  - WSA
Y1  - 1992
A1  - Giuseppe Amato
A1  - Fosca Giannotti
A1  - Gianni Mainetto
JF  - WSA
ER  - 

TY  - CONF
T1  - Meta for Modularising Logic Programming
T2  - META
Y1  - 1992
A1  - Antonio Brogi
A1  - Paolo Mancarella
A1  - Dino Pedreschi
A1  - Franco Turini
JF  - META
ER  - 

TY  - BOOK
T1  - The Type System of LML
T2  - Types in Logic Programming
Y1  - 1992
A1  - Bruno Bertolino
A1  - Luigi Meo
A1  - Dino Pedreschi
A1  - Franco Turini
JF  - Types in Logic Programming
ER  - 

TY  - CONF
T1  - Using Abstract Interpretation for Gate Splitting in LOTOS Specifications
T2  - WSA
Y1  - 1992
A1  - Fosca Giannotti
A1  - Diego Latella
JF  - WSA
ER  - 

TY  - CONF
T1  - Non-Determinism in Deductive Databases
T2  - DOOD
Y1  - 1991
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - Domenico Saccà
A1  - Carlo Zaniolo
JF  - DOOD
ER  - 

TY  - CONF
T1  - Proving Termination of General Prolog Programs
T2  - TACS
Y1  - 1991
A1  - Krzysztof R. Apt
A1  - Dino Pedreschi
JF  - TACS
ER  - 

TY  - CONF
T1  - A Technique for Recursive Invariance Detection and Selective Program Specialization
T2  - PLILP
Y1  - 1991
A1  - Fosca Giannotti
A1  - Manuel V. Hermenegildo
JF  - PLILP
ER  - 

TY  - CONF
T1  - Theory Construction in Computational Logic
T2  - ICLP Workshop on Construction of Logic Programs
Y1  - 1991
A1  - Antonio Brogi
A1  - Paolo Mancarella
A1  - Dino Pedreschi
A1  - Franco Turini
JF  - ICLP Workshop on Construction of Logic Programs
ER  - 

TY  - CONF
T1  - Algebraic Properties of a Class of Logic Programs
T2  - NACLP
Y1  - 1990
A1  - Paolo Mancarella
A1  - Dino Pedreschi
A1  - Marina Rondinelli
A1  - Marco Tagliatti
JF  - NACLP
ER  - 

TY  - CONF
T1  - Declarative Semantics for Pruning Operators in Logic Programming
T2  - LPNMR
Y1  - 1990
A1  - Fosca Giannotti
A1  - Dino Pedreschi
JF  - LPNMR
ER  - 

TY  - CONF
T1  - Logic Programming within a Functional Framework
T2  - PLILP
Y1  - 1990
A1  - Antonio Brogi
A1  - Paolo Mancarella
A1  - Dino Pedreschi
A1  - Franco Turini
JF  - PLILP
ER  - 

TY  - CONF
T1  - RASP: A Resource Allocator for Software Projects
T2  - IEA/AIE (Vol. 2)
Y1  - 1990
A1  - C. Bertazzoni
A1  - Fosca Giannotti
JF  - IEA/AIE (Vol. 2)
ER  - 

TY  - JOUR
T1  - A Transformational Approach to Negation in Logic Programming
JF  - J. Log. Program.
Y1  - 1990
A1  - Roberto Barbuti
A1  - Paolo Mancarella
A1  - Dino Pedreschi
A1  - Franco Turini
VL  - 8
ER  - 

TY  - CONF
T1  - Universal Quantification by Case Analysis
T2  - ECAI
Y1  - 1990
A1  - Antonio Brogi
A1  - Paolo Mancarella
A1  - Dino Pedreschi
A1  - Franco Turini
JF  - ECAI
ER  - 

TY  - CONF
T1  - An Algebra of Logic Programs
T2  - ICLP/SLP
Y1  - 1988
A1  - Paolo Mancarella
A1  - Dino Pedreschi
JF  - ICLP/SLP
ER  - 

TY  - JOUR
T1  - Complete Logic Programs with Domain-Closure Axiom
JF  - J. Log. Program.
Y1  - 1988
A1  - Paolo Mancarella
A1  - Simone Martini
A1  - Dino Pedreschi
VL  - 5
ER  - 

TY  - CONF
T1  - A Progress Report on the LML Project
T2  - FGCS
Y1  - 1988
A1  - Bruno Bertolino
A1  - Paolo Mancarella
A1  - Luigi Meo
A1  - Luca Nini
A1  - Dino Pedreschi
A1  - Franco Turini
JF  - FGCS
ER  - 

TY  - CONF
T1  - Intensional Negation of Logic Programs: Examples and Implementation Techniques
T2  - TAPSOFT, Vol.2
Y1  - 1987
A1  - Roberto Barbuti
A1  - Paolo Mancarella
A1  - Dino Pedreschi
A1  - Franco Turini
JF  - TAPSOFT, Vol.2
ER  - 

TY  - JOUR
T1  - Symbolic Evaluation with Structural Recursive Symbolic Constants
JF  - Sci. Comput. Program.
Y1  - 1987
A1  - Fosca Giannotti
A1  - Attilio Matteucci
A1  - Dino Pedreschi
A1  - Franco Turini
VL  - 9
ER  - 

TY  - JOUR
T1  - Symbolic Semantics and Program Reduction
JF  - IEEE Trans. Software Eng.
Y1  - 1985
A1  - Vincenzo Ambriola
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - Franco Turini
VL  - 11
ER  - 

TY  - CONF
T1  - The Type System of Galileo
T2  - Data Types and Persistence (Appin), Informal Proceedings
Y1  - 1985
A1  - Antonio Albano
A1  - Fosca Giannotti
A1  - Renzo Orsini
A1  - Dino Pedreschi
JF  - Data Types and Persistence (Appin), Informal Proceedings
ER  - 

TY  - CONF
T1  - The Type System of Galileo
T2  - Data Types and Persistence (Appin)
Y1  - 1985
A1  - Antonio Albano
A1  - Fosca Giannotti
A1  - Renzo Orsini
A1  - Dino Pedreschi
JF  - Data Types and Persistence (Appin)
ER  -