%0 Journal Article
%J Scientific Reports
%D 2023
%T Mobility Constraints in Segregation Models
%A Daniele Gambetta
%A Giovanni Mauro
%A Luca Pappalardo
%X Since the development of the original Schelling model of urban segregation, several enhancements have been proposed, but none have considered the impact of mobility constraints on model dynamics. Recent studies have shown that human mobility follows specific patterns, such as a preference for short distances and dense locations. This paper proposes a segregation model incorporating mobility constraints to make agents select their location based on distance and location relevance. Our findings indicate that the mobility-constrained model produces lower segregation levels but takes longer to converge than the original Schelling model. We identified a few persistently unhappy agents from the minority group who cause this prolonged convergence time and lower segregation level as they move around the grid centre. Our study presents a more realistic representation of how agents move in urban areas and provides a novel and insightful approach to analyzing the impact of mobility constraints on segregation models. We highlight the significance of incorporating mobility constraints when policymakers design interventions to address urban segregation.
%B Scientific Reports
%V 13
%P 12087
%G eng
%R 10.1038/s41598-023-38519-6

%0 Conference Paper
%B Proceedings of the 30th International Conference on Advances in Geographic Information Systems
%D 2022
%T Connected Vehicle Simulation Framework for Parking Occupancy Prediction (Demo Paper)
%A Resce, Pierpaolo
%A Vorwerk, Lukas
%A Han, Zhiwei
%A Cornacchia, Giuliano
%A Alamdari, Omid Isfahani
%A Mirco Nanni
%A Luca Pappalardo
%A Weimer, Daniel
%A Liu, Yuanting
%X This paper demonstrates a simulation framework that collects data about connected vehicles' locations and surroundings in a realistic traffic scenario. Our focus lies on the capability to detect parking spots and their occupancy status. We use this data to train machine learning models that predict parking occupancy levels of specific areas in the city center of San Francisco. By comparing their performance to a given ground truth, our results show that it is possible to use simulated connected vehicle data as a base for prototyping meaningful AI-based applications.
%B Proceedings of the 30th International Conference on Advances in Geographic Information Systems
%I Association for Computing Machinery
%C New York, NY, USA
%@ 9781450395298
%G eng
%U https://doi.org/10.1145/3557915.3560995
%R 10.1145/3557915.3560995

%0 Journal Article
%J EPJ Data Science
%D 2022
%T Generating mobility networks with generative adversarial networks
%A Giovanni Mauro
%A Luca, Massimiliano
%A Longa, Antonio
%A Lepri, Bruno
%A Luca Pappalardo
%X The increasingly crucial role of human displacements in complex societal phenomena, such as traffic congestion, segregation, and the diffusion of epidemics, is attracting the interest of scientists from several disciplines. In this article, we address mobility network generation, i.e., generating a city’s entire mobility network, a weighted directed graph in which nodes are geographic locations and weighted edges represent people’s movements between those locations, thus describing the entire mobility set flows within a city. Our solution is MoGAN, a model based on Generative Adversarial Networks (GANs) to generate realistic mobility networks. We conduct extensive experiments on public datasets of bike and taxi rides to show that MoGAN outperforms the classical Gravity and Radiation models regarding the realism of the generated networks. Our model can be used for data augmentation and performing simulations and what-if analysis.
%B EPJ Data Science
%V 11
%P 58
%G eng
%R 10.1140/epjds/s13688-022-00372-4

%0 Conference Paper
%B Proceedings of the 30th International Conference on Advances in Geographic Information Systems
%D 2022
%T How Routing Strategies Impact Urban Emissions
%A Cornacchia, Giuliano
%A Böhm, Matteo
%A Giovanni Mauro
%A Mirco Nanni
%A Dino Pedreschi
%A Luca Pappalardo
%X Navigation apps use routing algorithms to suggest the best path to reach a user's desired destination. Although undoubtedly useful, navigation apps' impact on the urban environment (e.g., CO2 emissions and pollution) is still largely unclear. In this work, we design a simulation framework to assess the impact of routing algorithms on carbon dioxide emissions within an urban environment. Using APIs from TomTom and OpenStreetMap, we find that settings in which either all vehicles or none of them follow a navigation app's suggestion lead to the worst impact in terms of CO2 emissions. In contrast, when just a portion (around half) of vehicles follow these suggestions, and some degree of randomness is added to the remaining vehicles' paths, we observe a reduction in the overall CO2 emissions over the road network. Our work is a first step towards designing next-generation routing principles that may increase urban well-being while satisfying individual needs.
%B Proceedings of the 30th International Conference on Advances in Geographic Information Systems
%I Association for Computing Machinery
%C New York, NY, USA
%@ 9781450395298
%G eng
%U https://doi.org/10.1145/3557915.3560977
%R 10.1145/3557915.3560977

%0 Journal Article
%J PLOS ONE
%D 2021
%T Explaining the difference between men’s and women’s football
%A Luca Pappalardo
%A Alessio Rossi
%A Michela Natilli
%A Paolo Cintia
%E Constantinou, Anthony C.
%X Women’s football is gaining supporters and practitioners worldwide, raising questions about what the differences are with men’s football. While the two sports are often compared based on the players’ physical attributes, we analyze the spatio-temporal events during matches in the last World Cups to compare male and female teams based on their technical performance. We train an artificial intelligence model to recognize if a team is male or female based on variables that describe a match’s playing intensity, accuracy, and performance quality. Our model accurately distinguishes between men’s and women’s football, revealing crucial technical differences, which we investigate through the extraction of explanations from the classifier’s decisions. The differences between men’s and women’s football are rooted in play accuracy, the recovery time of ball possession, and the players’ performance quality. Our methodology may help journalists and fans understand what makes women’s football a distinct sport and coaches design tactics tailored to female teams.
%B PLOS ONE
%V 16
%P e0255407
%8 Apr-08-2021
%G eng
%U https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0255407
%! PLoS ONE
%R 10.1371/journal.pone.0255407

%0 Journal Article
%D 2021
%T Introduction to the special issue on social mining and big data ecosystem for open, responsible data science
%A Luca Pappalardo
%A Grossi, Valerio
%A Dino Pedreschi
%8 2021/03/05
%@ 2364-4168
%G eng
%U https://doi.org/10.1007/s41060-021-00253-5
%! International Journal of Data Science and Analytics
%R 10.1007/s41060-021-00253-5

%0 Journal Article
%J ISPRS International Journal of Geo-Information
%D 2021
%T A Mechanistic Data-Driven Approach to Synthesize Human Mobility Considering the Spatial, Temporal, and Social Dimensions Together
%A Cornacchia, Giuliano
%A Luca Pappalardo
%X Modelling human mobility is crucial in several areas, from urban planning to epidemic modelling, traffic forecasting, and what-if analysis. Existing generative models focus mainly on reproducing the spatial and temporal dimensions of human mobility, while the social aspect, though it influences human movements significantly, is often neglected. Those models that capture some social perspectives of human mobility utilize trivial and unrealistic spatial and temporal mechanisms. In this paper, we propose the Spatial, Temporal and Social Exploration and Preferential Return model (STS-EPR), which embeds mechanisms to capture the spatial, temporal, and social aspects together. We compare the trajectories produced by STS-EPR with respect to real-world trajectories and synthetic trajectories generated by two state-of-the-art generative models on a set of standard mobility measures. Our experiments conducted on an open dataset show that STS-EPR, overall, outperforms existing spatial-temporal or social models demonstrating the importance of modelling adequately the sociality to capture precisely all the other dimensions of human mobility. We further investigate the impact of the tile shape of the spatial tessellation on the performance of our model. STS-EPR, which is open-source and tested on open data, represents a step towards the design of a mechanistic data-driven model that captures all the aspects of human mobility comprehensively.
%B ISPRS International Journal of Geo-Information
%V 10
%P 599
%G eng
%U https://www.mdpi.com/2220-9964/10/9/599
%R 10.3390/ijgi10090599

%0 Journal Article
%D 2021
%T STS-EPR: Modelling individual mobility considering the spatial, temporal, and social dimensions together
%A Cornacchia, Giuliano
%A Luca Pappalardo
%8 05
%G eng
%R 10.1016/j.procs.2021.03.035

%0 Conference Paper
%B 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA)
%D 2020
%T Estimating countries’ peace index through the lens of the world news as monitored by GDELT
%A V. Voukelatou
%A Luca Pappalardo
%A Lorenzo Gabrielli
%A Fosca Giannotti
%X Peacefulness is a principal dimension of well-being, and its measurement has lately drawn the attention of researchers and policy-makers. During the last years, novel digital data streams have drastically changed research in this field. In the current study, we exploit information extracted from Global Data on Events, Location, and Tone (GDELT) digital news database, to capture peacefulness through the Global Peace Index (GPI). Applying machine learning techniques, we demonstrate that news media attention, sentiment, and social stability from GDELT can be used as proxies for measuring GPI at a monthly level. Additionally, through the variable importance analysis, we show that each country's socio-economic, political, and military profile emerges. This could bring added value to researchers interested in "Data Science for Social Good", to policy-makers, and peacekeeping organizations since they could monitor peacefulness almost real-time, and therefore facilitate timely and more efficient policy-making.
%B 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA)
%8 2020
%G eng
%U https://ieeexplore.ieee.org/abstract/document/9260052
%R 10.1109/DSAA49011.2020.00034

%0 Journal Article
%J International Journal of Data Science and Analytics
%D 2020
%T Human migration: the big data perspective
%A Alina Sirbu
%A Andrienko, Gennady
%A Andrienko, Natalia
%A Boldrini, Chiara
%A Conti, Marco
%A Fosca Giannotti
%A Riccardo Guidotti
%A Bertoli, Simone
%A Jisu Kim
%A Muntean, Cristina Ioana
%A Luca Pappalardo
%A Passarella, Andrea
%A Dino Pedreschi
%A Pollacci, Laura
%A Francesca Pratesi
%A Sharma, Rajesh
%X How can big data help to understand the migration phenomenon? In this paper, we try to answer this question through an analysis of various phases of migration, comparing traditional and novel data sources and models at each phase. We concentrate on three phases of migration, at each phase describing the state of the art and recent developments and ideas. The first phase includes the journey, and we study migration flows and stocks, providing examples where big data can have an impact. The second phase discusses the stay, i.e. migrant integration in the destination country. We explore various data sets and models that can be used to quantify and understand migrant integration, with the final aim of providing the basis for the construction of a novel multi-level integration index. The last phase is related to the effects of migration on the source countries and the return of migrants.
%B International Journal of Data Science and Analytics
%P 1–20
%8 2020/03/23
%@ 2364-4168
%G eng
%U https://link.springer.com/article/10.1007%2Fs41060-020-00213-5
%! International Journal of Data Science and Analytics
%R 10.1007/s41060-020-00213-5

%0 Generic
%D 2020
%T Mobile phone data analytics against the COVID-19 epidemics in Italy: flow diversity and local job markets during the national lockdown
%A Pietro Bonato
%A Paolo Cintia
%A Francesco Fabbri
%A Daniele Fadda
%A Fosca Giannotti
%A Pier Luigi Lopalco
%A Sara Mazzilli
%A Mirco Nanni
%A Luca Pappalardo
%A Dino Pedreschi
%A Francesco Penone
%A S Rinzivillo
%A Giulio Rossetti
%A Marcello Savarese
%A Lara Tavoschi
%X Understanding collective mobility patterns is crucial to plan the restart of production and economic activities, which are currently put in stand-by to fight the diffusion of the epidemics. In this report, we use mobile phone data to infer the movements of people between Italian provinces and municipalities, and we analyze the incoming, outcoming and internal mobility flows before and during the national lockdown (March 9th, 2020) and after the closure of non-necessary productive and economic activities (March 23rd, 2020). The population flows across provinces and municipalities enable the modelling of a risk index tailored for the mobility of each municipality or province. Such an index would be a useful indicator to drive counter-measures in reaction to a sudden reactivation of the epidemics. Mobile phone data, even when aggregated to preserve the privacy of individuals, are a useful data source to track the evolution in time of human mobility, hence allowing for monitoring the effectiveness of control measures such as physical distancing. We address the following analytical questions: How does the mobility structure of a territory change? Do incoming and outcoming flows become more predictable during the lockdown, and what are the differences between weekdays and weekends? Can we detect proper local job markets based on human mobility flows, to eventually shape the borders of a local outbreak?
%G eng
%U https://arxiv.org/abs/2004.11278
%R 10.32079/ISTI-TR-2020/005

%0 Journal Article
%J IEEE Transactions on Intelligent Transportation Systems
%D 2020
%T Modeling Adversarial Behavior Against Mobility Data Privacy
%A Roberto Pellungrini
%A Luca Pappalardo
%A F. Simini
%A Anna Monreale
%X Privacy risk assessment is a crucial issue in any privacy-aware analysis process. Traditional frameworks for privacy risk assessment systematically generate the assumed knowledge for a potential adversary, evaluating the risk without realistically modelling the collection of the background knowledge used by the adversary when performing the attack. In this work, we propose Simulated Privacy Annealing (SPA), a new adversarial behavior model for privacy risk assessment in mobility data. We model the behavior of an adversary as a mobility trajectory and introduce an optimization approach to find the most effective adversary trajectory in terms of privacy risk produced for the individuals represented in a mobility data set. We use simulated annealing to optimize the movement of the adversary and simulate a possible attack on mobility data. We finally test the effectiveness of our approach on real human mobility data, showing that it can simulate the knowledge gathering process for an adversary in a more realistic way.
%B IEEE Transactions on Intelligent Transportation Systems
%P 1 - 14
%8 2020
%@ 1558-0016
%G eng
%U https://ieeexplore.ieee.org/abstract/document/9199893
%! IEEE Transactions on Intelligent Transportation Systems
%R 10.1109/TITS.2020.3021911

%0 Journal Article
%J arXiv preprint arXiv:2007.02371
%D 2020
%T Modelling Human Mobility considering Spatial, Temporal and Social Dimensions
%A Cornacchia, Giuliano
%A Giulio Rossetti
%A Luca Pappalardo
%B arXiv preprint arXiv:2007.02371
%G eng

%0 Journal Article
%J arXiv preprint arXiv:2006.03141
%D 2020
%T The relationship between human mobility and viral transmissibility during the COVID-19 epidemics in Italy
%A Paolo Cintia
%A Daniele Fadda
%A Fosca Giannotti
%A Luca Pappalardo
%A Giulio Rossetti
%A Dino Pedreschi
%A S Rinzivillo
%A Bonato, Pietro
%A Fabbri, Francesco
%A Penone, Francesco
%A Savarese, Marcello
%A Checchi, Daniele
%A Chiaromonte, Francesca
%A Vineis, Paolo
%A Guzzetta, Giorgio
%A Riccardo, Flavia
%A Marziano, Valentina
%A Poletti, Piero
%A Trentini, Filippo
%A Bella, Antonio
%A Andrianou, Xanthi
%A Del Manso, Martina
%A Fabiani, Massimo
%A Bellino, Stefania
%A Boros, Stefano
%A Mateo Urdiales, Alberto
%A Vescio, Maria Fenicia
%A Brusaferro, Silvio
%A Rezza, Giovanni
%A Pezzotti, Patrizio
%A Ajelli, Marco
%A Merler, Stefano
%X We describe in this report our studies to understand the relationship between human mobility and the spreading of COVID-19, as an aid to manage the restart of the social and economic activities after the lockdown and monitor the epidemics in the coming weeks and months. We compare the evolution (from January to May 2020) of the daily mobility flows in Italy, measured by means of nation-wide mobile phone data, and the evolution of transmissibility, measured by the net reproduction number, i.e., the mean number of secondary infections generated by one primary infector in the presence of control interventions and human behavioural adaptations. We find a striking relationship between the negative variation of mobility flows and the net reproduction number, in all Italian regions, between March 11th and March 18th, when the country entered the lockdown. This observation allows us to quantify the time needed to "switch off" the country mobility (one week) and the time required to bring the net reproduction number below 1 (one week). A reasonably simple regression model provides evidence that the net reproduction number is correlated with a region's incoming, outgoing and internal mobility. We also find a strong relationship between the number of days above the epidemic threshold before the mobility flows reduce significantly as an effect of lockdowns, and the total number of confirmed SARS-CoV-2 infections per 100k inhabitants, thus indirectly showing the effectiveness of the lockdown and the other non-pharmaceutical interventions in the containment of the contagion. Our study demonstrates the value of "big" mobility data to the monitoring of key epidemic indicators to inform choices as the epidemics unfolds in the coming months.
%B arXiv preprint arXiv:2006.03141
%G eng
%U https://arxiv.org/abs/2006.03141

%0 Journal Article
%J Available at SSRN 3452058
%D 2019
%T Defining Geographic Markets from Probabilistic Clusters: A Machine Learning Algorithm Applied to Supermarket Scanner Data
%A Bruestle, Stephen
%A Luca Pappalardo
%A Riccardo Guidotti
%B Available at SSRN 3452058
%G eng

%0 Conference Paper
%B Companion of The 2019 World Wide Web Conference, WWW 2019, San Francisco, CA, USA, May 13-17, 2019.
%D 2019
%T Human Mobility from theory to practice: Data, Models and Applications
%A Luca Pappalardo
%A Gianni Barlacchi
%A Roberto Pellungrini
%A Filippo Simini
%X The inclusion of tracking technologies in personal devices opened the doors to the analysis of large sets of mobility data like GPS traces and call detail records. This tutorial presents an overview of both modeling principles of human mobility and machine learning models applicable to specific problems. We review the state of the art of five main aspects in human mobility: (1) human mobility data landscape; (2) key measures of individual and collective mobility; (3) generative models at the level of individual, population and mixture of the two; (4) next location prediction algorithms; (5) applications for social good. For each aspect, we show experiments and simulations using the Python library “scikit-mobility” developed by the presenters of the tutorial.
%B Companion of The 2019 World Wide Web Conference, WWW 2019, San Francisco, CA, USA, May 13-17, 2019.
%G eng
%U https://doi.org/10.1145/3308560.3320099
%R 10.1145/3308560.3320099

%0 Journal Article
%J ACM Transactions on Intelligent Systems and Technology (TIST)
%D 2019
%T PlayeRank: data-driven performance evaluation and player ranking in soccer via a machine learning approach
%A Luca Pappalardo
%A Paolo Cintia
%A Ferragina, Paolo
%A Massucco, Emanuele
%A Dino Pedreschi
%A Fosca Giannotti
%X The problem of evaluating the performance of soccer players is attracting the interest of many companies and the scientific community, thanks to the availability of massive data capturing all the events generated during a match (e.g., tackles, passes, shots, etc.). Unfortunately, there is no consolidated and widely accepted metric for measuring performance quality in all of its facets. In this article, we design and implement PlayeRank, a data-driven framework that offers a principled multi-dimensional and role-aware evaluation of the performance of soccer players. We build our framework by deploying a massive dataset of soccer-logs and consisting of millions of match events pertaining to four seasons of 18 prominent soccer competitions. By comparing PlayeRank to known algorithms for performance evaluation in soccer, and by exploiting a dataset of players’ evaluations made by professional soccer scouts, we show that PlayeRank significantly outperforms the competitors. We also explore the ratings produced by PlayeRank and discover interesting patterns about the nature of excellent performances and what distinguishes the top players from the others. At the end, we explore some applications of PlayeRank—i.e. searching players and player versatility—showing its flexibility and efficiency, which makes it worth to be used in the design of a scalable platform for soccer analytics.
%B ACM Transactions on Intelligent Systems and Technology (TIST)
%V 10
%P 1–27
%G eng
%U https://dl.acm.org/doi/abs/10.1145/3343172
%R 10.1145/3343172

%0 Journal Article
%J Scientific Data
%D 2019
%T A public data set of spatio-temporal match events in soccer competitions
%A Luca Pappalardo
%A Paolo Cintia
%A Alessio Rossi
%A Massucco, Emanuele
%A Ferragina, Paolo
%A Dino Pedreschi
%A Fosca Giannotti
%X Soccer analytics is attracting increasing interest in academia and industry, thanks to the availability of sensing technologies that provide high-fidelity data streams for every match. Unfortunately, these detailed data are owned by specialized companies and hence are rarely publicly available for scientific research. To fill this gap, this paper describes the largest open collection of soccer-logs ever released, containing all the spatio-temporal events (passes, shots, fouls, etc.) that occurred during each match for an entire season of seven prominent soccer competitions. Each match event contains information about its position, time, outcome, player and characteristics. The nature of team sports like soccer, halfway between the abstraction of a game and the reality of complex social systems, combined with the unique size and composition of this dataset, provide an ideal ground for tackling a wide range of data science problems, including the measurement and evaluation of performance, both at individual and at collective level, and the determinants of success and failure.
%B Scientific Data
%V 6
%P 1–15
%G eng
%U https://www.nature.com/articles/s41597-019-0247-7
%R 10.1038/s41597-019-0247-7

%0 Journal Article
%J Applied Sciences
%D 2019
%T Relationship between External and Internal Workloads in Elite Soccer Players: Comparison between Rate of Perceived Exertion and Training Load
%A Alessio Rossi
%A Perri, Enrico
%A Luca Pappalardo
%A Paolo Cintia
%A Iaia, F Marcello
%X The use of machine learning (ML) in soccer allows for the management of a large amount of data deriving from the monitoring of sessions and matches. Although the rate of perceived exertion (RPE), training load (S-RPE), and global position system (GPS) are standard methodologies used in team sports to assess the internal and external workload, how the external workload affects RPE and S-RPE remains unclear. This study explores the relationship between both RPE and S-RPE and the training workload through ML. Data were recorded from 22 elite soccer players, in 160 training sessions and 35 matches during the 2015/2016 season, by using GPS tracking technology. A feature selection process was applied to understand which workload features influence RPE and S-RPE the most. Our results show that the training workloads performed in the previous week have a strong effect on perceived exertion and training load. On the other hand, the analysis of our predictions shows higher accuracy for medium RPE and S-RPE values compared with the extremes. These results provide further evidence of the usefulness of ML as a support to athletic trainers and coaches in understanding the relationship between training load and individual-response in team sports.
%B Applied Sciences
%V 9
%P 5174
%G eng
%U https://www.mdpi.com/2076-3417/9/23/5174/htm
%R 10.3390/app9235174

%0 Conference Paper
%B Software Technologies: Applications and Foundations - STAF 2018 Collocated Workshops, Toulouse, France, June 25-29, 2018, Revised Selected Papers
%D 2018
%T Analyzing Privacy Risk in Human Mobility Data
%A Roberto Pellungrini
%A Luca Pappalardo
%A Francesca Pratesi
%A Anna Monreale
%X Mobility data are of fundamental importance for understanding the patterns of human movements, developing analytical services and modeling human dynamics. Unfortunately, mobility data also contain individual sensitive information, making an accurate privacy risk assessment necessary for the individuals involved. In this paper, we propose a methodology for assessing privacy risk in human mobility data. Given a set of individual and collective mobility features, we define the minimum data format necessary for the computation of each feature and we define a set of possible attacks on these data formats. We perform experiments computing the empirical risk in a real-world mobility dataset, and show how the distributions of the considered mobility features are affected by the removal of individuals with different levels of privacy risk.
%B Software Technologies: Applications and Foundations - STAF 2018 Collocated Workshops, Toulouse, France, June 25-29, 2018, Revised Selected Papers
%G eng
%U https://doi.org/10.1007/978-3-030-04771-9_10
%R 10.1007/978-3-030-04771-9_10

%0 Journal Article
%J PLOS ONE
%D 2018
%T Effective injury forecasting in soccer with GPS training data and machine learning
%A Alessio Rossi
%A Luca Pappalardo
%A Paolo Cintia
%A Iaia, F Marcello
%A Fernàndez, Javier
%A Medina, Daniel
%X Injuries have a great impact on professional soccer, due to their large influence on team performance and the considerable costs of rehabilitation for players. Existing studies in the literature provide just a preliminary understanding of which factors mostly affect injury risk, while an evaluation of the potential of statistical models in forecasting injuries is still missing. In this paper, we propose a multi-dimensional approach to injury forecasting in professional soccer that is based on GPS measurements and machine learning. By using GPS tracking technology, we collect data describing the training workload of players in a professional soccer club during a season. We then construct an injury forecaster and show that it is both accurate and interpretable by providing a set of case studies of interest to soccer practitioners. Our approach opens a novel perspective on injury prevention, providing a set of simple and practical rules for evaluating and interpreting the complex relations between injury risk and training performance in professional soccer.
%B PLOS ONE
%V 13
%P e0201264
%G eng
%U https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0201264
%R 10.1371/journal.pone.0201264

%0 Conference Paper
%B ECML PKDD 2018 Workshops
%D 2018
%T Exploring Students Eating Habits Through Individual Profiling and Clustering Analysis
%A Michela Natilli
%A Anna Monreale
%A Riccardo Guidotti
%A Luca Pappalardo
%B ECML PKDD 2018 Workshops
%I Springer
%G eng

%0 Journal Article
%J BMC Gastroenterology
%D 2018
%T Gastroesophageal reflux symptoms among Italian university students: epidemiology and dietary correlates using automatically recorded transactions
%A Martinucci, Irene
%A Michela Natilli
%A Lorenzoni, Valentina
%A Luca Pappalardo
%A Anna Monreale
%A Turchetti, Giuseppe
%A Dino Pedreschi
%A Marchi, Santino
%A Barale, Roberto
%A de Bortoli, Nicola
%X Background: Gastroesophageal reflux disease (GERD) is one of the most common gastrointestinal disorders worldwide, with relevant impact on the quality of life and health care costs. The aim of our study is to assess the prevalence of GERD based on self-reported symptoms among university students in central Italy. The secondary aim is to evaluate lifestyle correlates, particularly eating habits, in GERD students using automatically recorded transactions through cashiers at university canteen. Methods: A web-survey was created and launched through an app, ad-hoc developed for an interactive exchange of information with students, including anthropometric data and lifestyle habits. Moreover, the web-survey allowed users a self-diagnosis of GERD through a simple questionnaire. As regard eating habits, detailed collection of meals consumed, including number and type of dishes, were automatically recorded through cashiers at the university canteen equipped with an automatic registration system. Results: We collected 3012 questionnaires. A total of 792 students (26.2% of the respondents) reported typical GERD symptoms occurring at least weekly. Female sex was more prevalent than male sex. In the set of students with GERD, the percentage of smokers was higher, and our results showed that when BMI tends to higher values the percentage of students with GERD tends to increase. When evaluating correlates with diet, we found, among all users, a lower frequency of legumes choice in GERD students and, among frequent users, a lower frequency of choice of pasta and rice in GERD students. Discussion: The results of our study are in line with the values reported in the literature. Nowadays, GERD is a common problem in our communities, and can potentially lead to serious medical complications; the economic burden involved in the diagnostic and therapeutic management of the disease has a relevant impact on healthcare costs. Conclusions: To our knowledge, this is the first study evaluating the prevalence of typical GERD-related symptoms in a young population of University students in Italy. Considering the young age of enrolled subjects, our prevalence rate, relatively high compared to the usual estimates, could represent a further negative factor for the future economic sustainability of the healthcare system. Keywords: Gastroesophageal reflux disease, GERD, Heartburn, Regurgitation, Diet, Prevalence, University students
%B BMC Gastroenterology
%V 18
%P 116
%G eng
%U https://bmcgastroenterol.biomedcentral.com/articles/10.1186/s12876-018-0832-9
%R 10.1186/s12876-018-0832-9

%0 Journal Article
%J PLOS ONE
%D 2018
%T Gravity and scaling laws of city to city migration
%A Prieto Curiel, Rafael
%A Luca Pappalardo
%A Lorenzo Gabrielli
%A Bishop, Steven Richard
%X Models of human migration provide powerful tools to forecast the flow of migrants, measure the impact of a policy, determine the cost of physical and political frictions and more. Here, we analyse the migration of individuals from and to cities in the US, finding that city to city migration follows scaling laws, so that the city size is a significant factor in determining whether, or not, an individual decides to migrate and the city size of both the origin and destination play key roles in the selection of the destination. We observe that individuals from small cities tend to migrate more frequently, tending to move to similar-sized cities, whereas individuals from large cities do not migrate so often, but when they do, they tend to move to other large cities. Building upon these findings we develop a scaling model which describes internal migration as a two-step decision process, demonstrating that it can partially explain migration fluxes based solely on city size. We then consider the impact of distance and construct a gravity-scaling model by combining the observed scaling patterns with the gravity law of migration. Results show that the scaling laws are a significant feature of human migration and that the inclusion of scaling can overcome the limits of the gravity and the radiation models of human migration.
%B PLOS ONE
%V 13
%P 1-19
%8 07
%G eng
%U https://doi.org/10.1371/journal.pone.0199892
%R 10.1371/journal.pone.0199892

%0 Book Section
%B A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years
%D 2018
%T How Data Mining and Machine Learning Evolved from Relational Data Base to Data Science
%A Amato, G.
%A Candela, L.
%A Castelli, D.
%A Esuli, A.
%A Falchi, F.
%A Gennaro, C.
%A Fosca Giannotti
%A Anna Monreale
%A Mirco Nanni
%A Pagano, P.
%A Luca Pappalardo
%A Dino Pedreschi
%A Francesca Pratesi
%A Rabitti, F.
%A S Rinzivillo
%A Giulio Rossetti
%A Salvatore Ruggieri
%A Sebastiani, F.
%A Tesconi, M.
%E Flesca, Sergio
%E Greco, Sergio
%E Masciari, Elio
%E Saccà, Domenico
%X During the last 35 years, data management principles such as physical and logical independence, declarative querying and cost-based optimization have led to profound pervasiveness of relational databases in any kind of organization. More importantly, these technical advances have enabled the first round of business intelligence applications and laid the foundation for managing and analyzing Big Data today.
%B A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years
%I Springer International Publishing
%C Cham
%P 287–306
%@ 978-3-319-61893-7
%G eng
%U https://link.springer.com/chapter/10.1007%2F978-3-319-61893-7_17
%R 10.1007/978-3-319-61893-7_17

%0 Report
%D 2018
%T Open the Black Box Data-Driven Explanation of Black Box Decision Systems
%A Dino Pedreschi
%A Fosca Giannotti
%A Riccardo Guidotti
%A Anna Monreale
%A Luca Pappalardo
%A Salvatore Ruggieri
%A Franco Turini
%B arXiv preprint arXiv:1806.09936
%G eng

%0 Journal Article
%J IEEE Transactions on Knowledge and Data Engineering
%D 2018
%T Personalized Market Basket Prediction with Temporal Annotated Recurring Sequences
%A Riccardo Guidotti
%A Giulio Rossetti
%A Luca Pappalardo
%A Fosca Giannotti
%A Dino Pedreschi
%X Nowadays, a hot challenge for supermarket chains is to offer personalized services to their customers. Market basket prediction, i.e., supplying the customer a shopping list for the next purchase according to her current needs, is one of these services. Current approaches are not capable of capturing at the same time the different factors influencing the customer's decision process: co-occurrence, sequentiality, periodicity and recurrency of the purchased items. To this aim, we define a pattern Temporal Annotated Recurring Sequence (TARS) able to capture simultaneously and adaptively all these factors. We define the method to extract TARS and develop a predictor for next basket named TBP (TARS Based Predictor) that, on top of TARS, is able to understand the level of the customer's stocks and recommend the set of most necessary items. By adopting the TBP the supermarket chains could crop tailored suggestions for each individual customer which in turn could effectively speed up their shopping sessions. A deep experimentation shows that TARS are able to explain the customer purchase behavior, and that TBP outperforms the state-of-the-art competitors.
%B IEEE Transactions on Knowledge and Data Engineering
%G eng
%U https://ieeexplore.ieee.org/abstract/document/8477157
%R 10.1109/TKDE.2018.2872587

%0 Conference Paper
%B 2018 IEEE 5th international conference on data science and advanced analytics (DSAA)
%D 2018
%T Weak nodes detection in urban transport systems: Planning for resilience in Singapore
%A Ferretti, Michele
%A Barlacchi, Gianni
%A Luca Pappalardo
%A Lucchini, Lorenzo
%A Lepri, Bruno
%X The availability of massive data-sets describing human mobility offers the possibility to design simulation tools to monitor and improve the resilience of transport systems in response to traumatic events such as natural and man-made disasters (e.g., floods, terrorist attacks, etc.). In this perspective, we propose ACHILLES, an application to model people's movements in a given transport mode through a multiplex network representation based on mobility data. ACHILLES is a web-based application which provides an easy-to-use interface to explore the mobility fluxes and the connectivity of every urban zone in a city, as well as to visualize changes in the transport system resulting from the addition or removal of transport modes, urban zones, and single stops. Notably, our application allows the user to assess the overall resilience of the transport network by identifying its weakest node, i.e. Urban Achilles Heel, with reference to the ancient Greek mythology. To demonstrate the impact of ACHILLES for humanitarian aid we consider its application to a real-world scenario by exploring human mobility in Singapore in response to flood prevention.
%B 2018 IEEE 5th international conference on data science and advanced analytics (DSAA)
%I IEEE
%G eng
%U https://ieeexplore.ieee.org/abstract/document/8631413/authors#authors
%R 10.1109/DSAA.2018.00061

%0 Conference Paper
%B Personal Analytics and Privacy. An Individual and Collective Perspective - First International Workshop, PAP 2017, Held in Conjunction with ECML PKDD 2017, Skopje, Macedonia, September 18, 2017, Revised Selected Papers
%D 2017
%T Assessing Privacy Risk in Retail Data
%A Roberto Pellungrini
%A Francesca Pratesi
%A Luca Pappalardo
%X Retail data are one of the most requested commodities by commercial companies. Unfortunately, from this data it is possible to retrieve highly sensitive information about individuals. Thus, there exists the need for accurate individual privacy risk evaluation. In this paper, we propose a methodology for assessing privacy risk in retail data. We define the data formats for representing retail data, the privacy framework for calculating privacy risk and some possible privacy attacks for this kind of data. We perform experiments in a real-world retail dataset, and show the distribution of privacy risk for the various attacks.
%B Personal Analytics and Privacy. An Individual and Collective Perspective - First International Workshop, PAP 2017, Held in Conjunction with ECML PKDD 2017, Skopje, Macedonia, September 18, 2017, Revised Selected Papers
%G eng
%U https://doi.org/10.1007/978-3-319-71970-2_3
%R 10.1007/978-3-319-71970-2_3

%0 Journal Article
%J ACM Trans. Intell. Syst. Technol.
%D 2017
%T A Data Mining Approach to Assess Privacy Risk in Human Mobility Data
%A Roberto Pellungrini
%A Luca Pappalardo
%A Francesca Pratesi
%A Anna Monreale
%X Human mobility data are an important proxy to understand human mobility dynamics, develop analytical services, and design mathematical models for simulation and what-if analysis. Unfortunately mobility data are very sensitive since they may enable the re-identification of individuals in a database. Existing frameworks for privacy risk assessment provide data providers with tools to control and mitigate privacy risks, but they suffer two main shortcomings: (i) they have a high computational complexity; (ii) the privacy risk must be recomputed every time new data records become available and for every selection of individuals, geographic areas, or time windows. In this article, we propose a fast and flexible approach to estimate privacy risk in human mobility data. The idea is to train classifiers to capture the relation between individual mobility patterns and the level of privacy risk of individuals. We show the effectiveness of our approach by an extensive experiment on real-world GPS data in two urban areas and investigate the relations between human mobility patterns and the privacy risk of individuals.
%B ACM Trans. Intell. Syst. Technol.
%V 9
%P 31:1–31:27
%G eng
%U http://doi.acm.org/10.1145/3106774
%R 10.1145/3106774

%0 Journal Article
%J Data Mining and Knowledge Discovery
%D 2017
%T Data-driven generation of spatio-temporal routines in human mobility
%A Luca Pappalardo
%A Filippo Simini
%X The generation of realistic spatio-temporal trajectories of human mobility is of fundamental importance in a wide range of applications, such as the development of protocols for mobile ad-hoc networks or what-if analysis in urban ecosystems. Current generative algorithms fail in accurately reproducing the individuals' recurrent schedules and at the same time in accounting for the possibility that individuals may break the routine during periods of variable duration. In this article we present Ditras (DIary-based TRAjectory Simulator), a framework to simulate the spatio-temporal patterns of human mobility. Ditras operates in two steps: the generation of a mobility diary and the translation of the mobility diary into a mobility trajectory. We propose a data-driven algorithm which constructs a diary generator from real data, capturing the tendency of individuals to follow or break their routine. We also propose a trajectory generator based on the concept of preferential exploration and preferential return. We instantiate Ditras with the proposed diary and trajectory generators and compare the resulting algorithm with real data and synthetic data produced by other generative algorithms, built by instantiating Ditras with several combinations of diary and trajectory generators. We show that the proposed algorithm reproduces the statistical properties of real trajectories in the most accurate way, making a step forward in the understanding of the origin of the spatio-temporal patterns of human mobility.
%B Data Mining and Knowledge Discovery
%8 Dec
%G eng
%U https://doi.org/10.1007/s10618-017-0548-4
%R 10.1007/s10618-017-0548-4

%0 Generic
%D 2017
%T Fast Estimation of Privacy Risk in Human Mobility Data
%A Roberto Pellungrini
%A Luca Pappalardo
%A Francesca Pratesi
%A Anna Monreale
%X Mobility data are an important proxy to understand the patterns of human movements, develop analytical services and design models for simulation and prediction of human dynamics. Unfortunately mobility data are also very sensitive, since they may contain personal information about the individuals involved. Existing frameworks for privacy risk assessment enable the data providers to quantify and mitigate privacy risks, but they suffer two main limitations: (i) they have a high computational complexity; (ii) the privacy risk must be re-computed for each new set of individuals, geographic areas or time windows. In this paper we explore a fast and flexible solution to estimate privacy risk in human mobility data, using predictive models to capture the relation between an individual’s mobility patterns and her privacy risk. We show the effectiveness of our approach by experimentation on a real-world GPS dataset and provide a comparison with traditional methods.
%@ 978-3-319-66283-1
%G eng
%R 10.1007/978-3-319-66284-8_35

%0 Conference Paper
%B 2017 IEEE International Conference on Data Mining (ICDM)
%D 2017
%T Market Basket Prediction using User-Centric Temporal Annotated Recurring Sequences
%A Riccardo Guidotti
%A Giulio Rossetti
%A Luca Pappalardo
%A Fosca Giannotti
%A Dino Pedreschi
%X Nowadays, a hot challenge for supermarket chains is to offer personalized services to their customers. Market basket prediction, i.e., supplying the customer a shopping list for the next purchase according to her current needs, is one of these services. Current approaches are not capable of capturing at the same time the different factors influencing the customer’s decision process: co-occurrence, sequentiality, periodicity and recurrency of the purchased items. To this aim, we define a pattern named Temporal Annotated Recurring Sequence (TARS). We define the method to extract TARS and develop a predictor for next basket named TBP (TARS Based Predictor) that, on top of TARS, is able to understand the level of the customer’s stocks and recommend the set of most necessary items. A deep experimentation shows that TARS can explain the customers’ purchase behavior, and that TBP outperforms the state-of-the-art competitors.
%B 2017 IEEE International Conference on Data Mining (ICDM)
%I IEEE
%G eng

%0 Journal Article
%J arXiv preprint arXiv:1702.07158
%D 2017
%T Next Basket Prediction using Recurring Sequential Patterns
%A Riccardo Guidotti
%A Giulio Rossetti
%A Luca Pappalardo
%A Fosca Giannotti
%A Dino Pedreschi
%X Nowadays, a hot challenge for supermarket chains is to offer personalized services for their customers. Next basket prediction, i.e., supplying the customer a shopping list for the next purchase according to her current needs, is one of these services. Current approaches are not capable of capturing at the same time the different factors influencing the customer's decision process: co-occurrence, sequentiality, periodicity and recurrency of the purchased items. To this aim, we define a pattern Temporal Annotated Recurring Sequence (TARS) able to capture simultaneously and adaptively all these factors. We define the method to extract TARS and develop a predictor for next basket named TBP (TARS Based Predictor) that, on top of TARS, is able to understand the level of the customer's stocks and recommend the set of most necessary items. By adopting the TBP the supermarket chains could crop tailored suggestions for each individual customer which in turn could effectively speed up their shopping sessions. A deep experimentation shows that TARS are able to explain the customer purchase behavior, and that TBP outperforms the state-of-the-art competitors.
%B arXiv preprint arXiv:1702.07158
%G eng
%U https://arxiv.org/abs/1702.07158

%0 Journal Article
%J Advances in Complex Systems
%D 2017
%T Quantifying the relation between performance and success in soccer
%A Luca Pappalardo
%A Paolo Cintia
%X The availability of massive data about sports activities offers nowadays the opportunity to quantify the relation between performance and success. In this study, we analyze more than 6000 games and 10 million events in six European leagues and investigate this relation in soccer competitions. We discover that a team’s position in a competition’s final ranking is significantly related to its typical performance, as described by a set of technical features extracted from the soccer data. Moreover, we find that, while victory and defeats can be explained by the team’s performance during a game, it is difficult to detect draws by using a machine learning approach. We then simulate the outcomes of an entire season of each league only relying on technical data and exploiting a machine learning model trained on data from past seasons. The simulation produces a team ranking which is similar to the actual ranking, suggesting that a complex systems’ view on soccer has the potential of revealing hidden patterns regarding the relation between performance and success.
%B Advances in Complex Systems
%P 1750014
%G eng
%U http://www.worldscientific.com/doi/abs/10.1142/S021952591750014X
%R 10.1142/S021952591750014X

%0 Journal Article
%J Machine Learning
%D 2017
%T Tiles: an online algorithm for community discovery in dynamic social networks
%A Giulio Rossetti
%A Luca Pappalardo
%A Dino Pedreschi
%A Fosca Giannotti
%X Community discovery has emerged during the last decade as one of the most challenging problems in social network analysis. Many algorithms have been proposed to find communities on static networks, i.e. networks which do not change in time. However, social networks are dynamic realities (e.g. call graphs, online social networks): in such scenarios static community discovery fails to identify a partition of the graph that is semantically consistent with the temporal information expressed by the data. In this work we propose Tiles, an algorithm that extracts overlapping communities and tracks their evolution in time following an online iterative procedure. Our algorithm operates following a domino effect strategy, dynamically recomputing nodes' community memberships whenever a new interaction takes place. We compare Tiles with state-of-the-art community detection algorithms on both synthetic and real world networks having annotated community structure: our experiments show that the proposed approach is able to guarantee lower execution times and better correspondence with the ground truth communities than its competitors. Moreover, we illustrate the specifics of the proposed approach by discussing the properties of the communities it is able to identify.
%B Machine Learning
%V 106
%P 1213–1241
%G eng
%U https://link.springer.com/article/10.1007/s10994-016-5582-8
%R 10.1007/s10994-016-5582-8

%0 Journal Article
%J International Journal of Data Science and Analytics
%D 2016
%T An analytical framework to nowcast well-being using mobile phone data
%A Luca Pappalardo
%A Maarten Vanhoof
%A Lorenzo Gabrielli
%A Zbigniew Smoreda
%A Dino Pedreschi
%A Fosca Giannotti
%X An intriguing open question is whether measurements derived from Big Data recording human activities can yield high-fidelity proxies of socio-economic development and well-being. Can we monitor and predict the socio-economic development of a territory just by observing the behavior of its inhabitants through the lens of Big Data? In this paper, we design a data-driven analytical framework that uses mobility measures and social measures extracted from mobile phone data to estimate indicators for socio-economic development and well-being. We discover that the diversity of mobility, defined in terms of entropy of the individual users’ trajectories, exhibits (i) significant correlation with two different socio-economic indicators and (ii) the highest importance in predictive models built to predict the socio-economic indicators. Our analytical framework opens an interesting perspective to study human behavior through the lens of Big Data by means of new statistical indicators that quantify and possibly “nowcast” the well-being and the socio-economic development of a territory.
%B International Journal of Data Science and Analytics
%V 2
%P 75–92
%G eng
%R 10.1007/s41060-016-0013-2

%0 Journal Article
%J Social Network Analysis and Mining
%D 2016
%T Homophilic network decomposition: a community-centric analysis of online social services
%A Giulio Rossetti
%A Luca Pappalardo
%A Riivo Kikas
%A Dino Pedreschi
%A Fosca Giannotti
%A Marlon Dumas
%X In this paper we formulate the homophilic network decomposition problem: Is it possible to identify a network partition whose structure is able to characterize the degree of homophily of its nodes? The aim of our work is to understand the relations between the homophily of individuals and the topological features expressed by specific network substructures. We apply several community detection algorithms on three large-scale online social networks—Skype, LastFM and Google+—and advocate the need of identifying the right algorithm for each specific network in order to extract a homophilic network decomposition. Our results show clear relations between the topological features of communities and the degree of homophily of their nodes in three online social scenarios: product engagement in the Skype network, number of listened songs on LastFM and homogeneous level of education among users of Google+.
%B Social Network Analysis and Mining
%V 6
%P 103
%G eng
%R 10.1007/s1327

%0 Conference Paper
%B 7th Workshop on Complex Networks
%D 2016
%T A novel approach to evaluate community detection algorithms on ground truth
%A Giulio Rossetti
%A Luca Pappalardo
%A S Rinzivillo
%X Evaluating a community detection algorithm is a complex task due to the lack of a shared and universally accepted definition of community. In literature, one of the most common ways to assess the performance of a community detection algorithm is to compare its output with given ground truth communities by using computationally expensive metrics (e.g., Normalized Mutual Information). In this paper we propose a novel approach aimed at evaluating the adherence of a community partition to the ground truth: our methodology provides more information than the state-of-the-art ones and is fast to compute on large-scale networks. We evaluate its correctness by applying it to six popular community detection algorithms on four large-scale network datasets. Experimental results show how our approach makes it easy to evaluate the obtained communities on the ground truth and to characterize the quality of community detection algorithms.
%B 7th Workshop on Complex Networks
%I Springer-Verlag
%C Dijon, France
%G eng
%U http://www.giuliorossetti.net/about/wp-content/uploads/2015/12/Complenet16.pdf
%R 10.1007/978-3-319-30569-1_10

%0 Conference Paper
%B International conference on Advances in Social Network Analysis and Mining
%D 2015
%T Community-centric analysis of user engagement in Skype social network
%A Giulio Rossetti
%A Luca Pappalardo
%A Riivo Kikas
%A Dino Pedreschi
%A Fosca Giannotti
%A Marlon Dumas
%B International conference on Advances in Social Network Analysis and Mining
%I IEEE
%C Paris, France
%@ 978-1-4503-3854-7
%G eng
%U http://dl.acm.org/citation.cfm?doid=2808797.2809384
%R 10.1145/2808797.2809384

%0 Conference Proceedings
%B IEEE International Conference on Data Science and Advanced Analytics
%D 2015
%T The harsh rule of the goals: data-driven performance indicators for football teams
%A Paolo Cintia
%A Luca Pappalardo
%A Dino Pedreschi
%A Fosca Giannotti
%A Marco Malvaldi
%X Sports analytics in general, and football (soccer in the USA) analytics in particular, have evolved in recent years in an amazing way, thanks to automated or semi-automated sensing technologies that provide high-fidelity data streams extracted from every game. In this paper we propose a data-driven approach and show that there is a large potential to boost the understanding of football team performance. From observational data of football games we extract a set of pass-based performance indicators and summarize them in the H indicator. We observe a strong correlation between the proposed indicator and the success of a team, and therefore perform a simulation on the four major European championships (78 teams, almost 1500 games). The outcome of each game in the championship was replaced by a synthetic outcome (win, loss or draw) based on the performance indicators computed for each team. We found that the final rankings in the simulated championships are very close to the actual rankings in the real championships, and show that teams with high ranking error show extreme values of a defense/attack efficiency measure, the Pezzali score. Our results are surprising given the simplicity of the proposed indicators, suggesting that a complex systems’ view on football data has the potential of revealing hidden patterns and behavior of superior quality.
%B IEEE International Conference on Data Science and Advanced Analytics
%G eng
%U https://www.researchgate.net/profile/Luca_Pappalardo/publication/281318318_The_harsh_rule_of_the_goals_data-driven_performance_indicators_for_football_teams/links/561668e308ae37cfe4090a5d.pdf

%0 Journal Article
%J Nat Commun
%D 2015
%T Returners and explorers dichotomy in human mobility
%A Luca Pappalardo
%A Filippo Simini
%A S Rinzivillo
%A Dino Pedreschi
%A Fosca Giannotti
%A Barabasi, Albert-Laszlo
%X The availability of massive digital traces of human whereabouts has offered a series of novel insights on the quantitative patterns characterizing human mobility. In particular, numerous recent studies have led to an unexpected consensus: the considerable variability in the characteristic travelled distance of individuals coexists with a high degree of predictability of their future locations. Here we shed light on this surprising coexistence by systematically investigating the impact of recurrent mobility on the characteristic distance travelled by individuals. Using both mobile phone and GPS data, we discover the existence of two distinct classes of individuals: returners and explorers. As existing models of human mobility cannot explain the existence of these two classes, we develop more realistic models able to capture the empirical findings. Finally, we show that returners and explorers play a distinct quantifiable role in spreading phenomena and that a correlation exists between their mobility patterns and social interactions.
%B Nat Commun
%V 6
%8 09
%G eng
%U http://dx.doi.org/10.1038/ncomms9166

%0 Journal Article
%J Journal of Official Statistics
%D 2015
%T Small Area Model-Based Estimators Using Big Data Sources
%A Stefano Marchetti
%A Caterina Giusti
%A Monica Pratesi
%A Nicola Salvati
%A Fosca Giannotti
%A Dino Pedreschi
%A S Rinzivillo
%A Luca Pappalardo
%A Lorenzo Gabrielli
%B Journal of Official Statistics
%V 31
%P 263–281
%G eng

%0 Conference Paper
%B 22nd Italian Symposium on Advanced Database Systems, SEBD 2014, Sorrento Coast, Italy, June 16-18, 2014.
%D 2014
%T Mining efficient training patterns of non-professional cyclists
%A Paolo Cintia
%A Luca Pappalardo
%A Dino Pedreschi
%B 22nd Italian Symposium on Advanced Database Systems, SEBD 2014, Sorrento Coast, Italy, June 16-18, 2014.
%G eng

%0 Conference Paper
%B 22nd Italian Symposium on Advanced Database Systems, SEBD 2014, Sorrento Coast, Italy, June 16-18, 2014.
%D 2014
%T The patterns of musical influence on the Last.Fm social network
%A Diego Pennacchioli
%A Giulio Rossetti
%A Luca Pappalardo
%A Dino Pedreschi
%A Fosca Giannotti
%A Michele Coscia
%B 22nd Italian Symposium on Advanced Database Systems, SEBD 2014, Sorrento Coast, Italy, June 16-18, 2014.
%G eng

%0 Conference Paper
%B International Conference on Data Science and Advanced Analytics, DSAA 2014, Shanghai, China, October 30 - November 1, 2014
%D 2014
%T The purpose of motion: Learning activities from Individual Mobility Networks
%A S Rinzivillo
%A Lorenzo Gabrielli
%A Mirco Nanni
%A Luca Pappalardo
%A Dino Pedreschi
%A Fosca Giannotti
%B International Conference on Data Science and Advanced Analytics, DSAA 2014, Shanghai, China, October 30 - November 1, 2014
%G eng
%U http://dx.doi.org/10.1109/DSAA.2014.7058090
%R 10.1109/DSAA.2014.7058090

%0 Conference Paper
%B Computational Intelligence and 11th Brazilian Congress on Computational Intelligence (BRICS-CCI CBIC), 2013 BRICS Congress on
%D 2013
%T Comparing General Mobility and Mobility by Car
%A Luca Pappalardo
%A Filippo Simini
%A S Rinzivillo
%A Dino Pedreschi
%A Fosca Giannotti
%B Computational Intelligence and 11th Brazilian Congress on Computational Intelligence (BRICS-CCI CBIC), 2013 BRICS Congress on
%8 Sept
%G eng
%R 10.1109/BRICS-CCI-CBIC.2013.116

%0 Conference Paper
%B 13th IEEE International Conference on Data Mining Workshops, ICDM Workshops, TX, USA, December 7-10, 2013
%D 2013
%T "Engine Matters": A First Large Scale Data Driven Study on Cyclists' Performance
%A Paolo Cintia
%A Luca Pappalardo
%A Dino Pedreschi
%B 13th IEEE International Conference on Data Mining Workshops, ICDM Workshops, TX, USA, December 7-10, 2013
%G eng
%U http://dx.doi.org/10.1109/ICDMW.2013.41
%R 10.1109/ICDMW.2013.41

%0 Conference Paper
%B SEDB 2013
%D 2013
%T Measuring tie strength in multidimensional networks
%A Giulio Rossetti
%A Luca Pappalardo
%A Dino Pedreschi
%B SEDB 2013
%8 2013

%0 Conference Paper
%B Social Informatics - 5th International Conference, SocInfo 2013, Kyoto, Japan, November 25-27, 2013, Proceedings
%D 2013
%T The Three Dimensions of Social Prominence
%A Diego Pennacchioli
%A Giulio Rossetti
%A Luca Pappalardo
%A Dino Pedreschi
%A Fosca Giannotti
%A Michele Coscia
%B Social Informatics - 5th International Conference, SocInfo 2013, Kyoto, Japan, November 25-27, 2013, Proceedings
%G eng
%U http://dx.doi.org/10.1007/978-3-319-03260-3_28
%R 10.1007/978-3-319-03260-3_28

%0 Journal Article
%J The European Physical Journal Special Topics
%D 2013
%T Understanding the patterns of car travel
%A Luca Pappalardo
%A S Rinzivillo
%A Qu, Zehui
%A Dino Pedreschi
%A Fosca Giannotti
%X Are the patterns of car travel different from those of general human mobility? Based on a unique dataset consisting of the GPS trajectories of 10 million travels accomplished by 150,000 cars in Italy, we investigate how known mobility models apply to car travels, and illustrate novel analytical findings. We also assess to what extent the sample in our dataset is representative of the overall car mobility, and discover how to build an extremely accurate model that, given our GPS data, estimates the real traffic values as measured by road sensors.
%B The European Physical Journal Special Topics
%V 215
%P 61–73
%G eng
%U http://dx.doi.org/10.1140/epjst/e2013-01715-5
%R 10.1140/epjst/e2013-01715-5

%0 Report
%D 2012
%T Analisi di Mobilita' con dati eterogenei
%A Barbara Furletti
%A Roberto Trasarti
%A Lorenzo Gabrielli
%A S Rinzivillo
%A Luca Pappalardo
%A Fosca Giannotti
%I ISTI - CNR
%C Pisa

%0 Conference Paper
%B International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012, Istanbul, Turkey, 26-29 August 2012
%D 2012
%T "How Well Do We Know Each Other?" Detecting Tie Strength in Multidimensional Social Networks
%A Luca Pappalardo
%A Giulio Rossetti
%A Dino Pedreschi
%B International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012, Istanbul, Turkey, 26-29 August 2012
%G eng
%U http://doi.ieeecomputersociety.org/10.1109/ASONAM.2012.180
%R 10.1109/ASONAM.2012.180