TY  - JOUR
T1  - Give more data, awareness and control to individual citizens, and they will help COVID-19 containment
Y1  - 2021
A1  - Mirco Nanni
A1  - Andrienko, Gennady
A1  - Barabasi, Albert-Laszlo
A1  - Boldrini, Chiara
A1  - Bonchi, Francesco
A1  - Cattuto, Ciro
A1  - Chiaromonte, Francesca
A1  - Comandé, Giovanni
A1  - Conti, Marco
A1  - Coté, Mark
A1  - Dignum, Frank
A1  - Dignum, Virginia
A1  - Domingo-Ferrer, Josep
A1  - Ferragina, Paolo
A1  - Fosca Giannotti
A1  - Riccardo Guidotti
A1  - Helbing, Dirk
A1  - Kaski, Kimmo
A1  - Kertész, János
A1  - Lehmann, Sune
A1  - Lepri, Bruno
A1  - Lukowicz, Paul
A1  - Matwin, Stan
A1  - Jiménez, David Megías
A1  - Anna Monreale
A1  - Morik, Katharina
A1  - Oliver, Nuria
A1  - Passarella, Andrea
A1  - Passerini, Andrea
A1  - Dino Pedreschi
A1  - Pentland, Alex
A1  - Pianesi, Fabio
A1  - Francesca Pratesi
A1  - S Rinzivillo
A1  - Salvatore Ruggieri
A1  - Siebes, Arno
A1  - Torra, Vicenc
A1  - Roberto Trasarti
A1  - Hoven, Jeroen van den
A1  - Vespignani, Alessandro
AB  - The rapid dynamics of COVID-19 calls for quick and effective tracking of virus transmission chains and early detection of outbreaks, especially in the “phase 2” of the pandemic, when lockdown and other restriction measures are progressively withdrawn, in order to avoid or minimize contagion resurgence. For this purpose, contact-tracing apps are being proposed for large scale adoption by many countries. A centralized approach, where data sensed by the app are all sent to a nation-wide server, raises concerns about citizens’ privacy and needlessly strong digital surveillance, thus alerting us to the need to minimize personal data collection and avoiding location tracking. We advocate the conceptual advantage of a decentralized approach, where both contact and location data are collected exclusively in individual citizens’ “personal data stores”, to be shared separately and selectively (e.g., with a backend system, but possibly also with other citizens), voluntarily, only when the citizen has tested positive for COVID-19, and with a privacy preserving level of granularity. This approach better protects the personal sphere of citizens and affords multiple benefits: it allows for detailed information gathering for infected people in a privacy-preserving fashion; and, in turn this enables both contact tracing, and, the early detection of outbreak hotspots on more finely-granulated geographic scale. The decentralized approach is also scalable to large populations, in that only the data of positive patients need be handled at a central level. Our recommendation is two-fold. First to extend existing decentralized architectures with a light touch, in order to manage the collection of location data locally on the device, and allow the user to share spatio-temporal aggregates—if and when they want and for specific aims—with health authorities, for instance. Second, we favour a longer-term pursuit of realizing a Personal Data Store vision, giving users the opportunity to contribute to collective good in the measure they want, enhancing self-awareness, and cultivating collective efforts for rebuilding society.
SN  - 1572-8439
UR  - https://link.springer.com/article/10.1007/s10676-020-09572-w
JO  - Ethics and Information Technology
ER  - 

TY  - JOUR
T1  - An ethico-legal framework for social data science
Y1  - 2020
A1  - Forgó, Nikolaus
A1  - Hänold, Stefanie
A1  - van den Hoven, Jeroen
A1  - Krügel, Tina
A1  - Lishchuk, Iryna
A1  - Mahieu, René
A1  - Anna Monreale
A1  - Dino Pedreschi
A1  - Francesca Pratesi
A1  - van Putten, David
AB  - This paper presents a framework for research infrastructures enabling ethically sensitive and legally compliant data science in Europe. Our goal is to describe how to design and implement an open platform for big data social science, including, in particular, personal data. To this end, we discuss a number of infrastructural, organizational and methodological principles to be developed for a concrete implementation. These include not only systematically tools and methodologies that effectively enable both the empirical evaluation of the privacy risk and data transformations by using privacy-preserving approaches, but also the development of training materials (a massive open online course) and organizational instruments based on legal and ethical principles. This paper provides, by way of example, the implementation that was adopted within the context of the SoBigData Research Infrastructure.
SN  - 2364-4168
UR  - https://link.springer.com/article/10.1007/s41060-020-00211-7
JO  - International Journal of Data Science and Analytics
ER  - 

TY  - JOUR
T1  - Human migration: the big data perspective
JF  - International Journal of Data Science and Analytics
Y1  - 2020
A1  - Alina Sirbu
A1  - Andrienko, Gennady
A1  - Andrienko, Natalia
A1  - Boldrini, Chiara
A1  - Conti, Marco
A1  - Fosca Giannotti
A1  - Riccardo Guidotti
A1  - Bertoli, Simone
A1  - Jisu Kim
A1  - Muntean, Cristina Ioana
A1  - Luca Pappalardo
A1  - Passarella, Andrea
A1  - Dino Pedreschi
A1  - Pollacci, Laura
A1  - Francesca Pratesi
A1  - Sharma, Rajesh
AB  - How can big data help to understand the migration phenomenon? In this paper, we try to answer this question through an analysis of various phases of migration, comparing traditional and novel data sources and models at each phase. We concentrate on three phases of migration, at each phase describing the state of the art and recent developments and ideas. The first phase includes the journey, and we study migration flows and stocks, providing examples where big data can have an impact. The second phase discusses the stay, i.e. migrant integration in the destination country. We explore various data sets and models that can be used to quantify and understand migrant integration, with the final aim of providing the basis for the construction of a novel multi-level integration index. The last phase is related to the effects of migration on the source countries and the return of migrants.
SN  - 2364-4168
UR  - https://link.springer.com/article/10.1007%2Fs41060-020-00213-5
JO  - International Journal of Data Science and Analytics
ER  - 

TY  - JOUR
T1  - PRIMULE: Privacy risk mitigation for user profiles
Y1  - 2020
A1  - Francesca Pratesi
A1  - Lorenzo Gabrielli
A1  - Paolo Cintia
A1  - Anna Monreale
A1  - Fosca Giannotti
AB  - The availability of mobile phone data has encouraged the development of different data-driven tools, supporting social science studies and providing new data sources to the standard official statistics. However, this particular kind of data are subject to privacy concerns because they can enable the inference of personal and private information. In this paper, we address the privacy issues related to the sharing of user profiles, derived from mobile phone data, by proposing PRIMULE, a privacy risk mitigation strategy. Such a method relies on PRUDEnce (Pratesi et al., 2018), a privacy risk assessment framework that provides a methodology for systematically identifying risky-users in a set of data. An extensive experimentation on real-world data shows the effectiveness of PRIMULE strategy in terms of both quality of mobile user profiles and utility of these profiles for analytical services such as the Sociometer (Furletti et al., 2013), a data mining tool for city users classification.
VL  - 125
SN  - 0169-023X
UR  - https://www.sciencedirect.com/science/article/pii/S0169023X18305342
JO  - Data & Knowledge Engineering
ER  - 

TY  - CONF
T1  - Analyzing Privacy Risk in Human Mobility Data
T2  - Software Technologies: Applications and Foundations - STAF 2018 Collocated Workshops, Toulouse, France, June 25-29, 2018, Revised Selected Papers
Y1  - 2018
A1  - Roberto Pellungrini
A1  - Luca Pappalardo
A1  - Francesca Pratesi
A1  - Anna Monreale
AB  - Mobility data are of fundamental importance for understanding the patterns of human movements, developing analytical services and modeling human dynamics. Unfortunately, mobility data also contain individual sensitive information, making it necessary an accurate privacy risk assessment for the individuals involved. In this paper, we propose a methodology for assessing privacy risk in human mobility data. Given a set of individual and collective mobility features, we define the minimum data format necessary for the computation of each feature and we define a set of possible attacks on these data formats. We perform experiments computing the empirical risk in a real-world mobility dataset, and show how the distributions of the considered mobility features are affected by the removal of individuals with different levels of privacy risk.
JF  - Software Technologies: Applications and Foundations - STAF 2018 Collocated Workshops, Toulouse, France, June 25-29, 2018, Revised Selected Papers
UR  - https://doi.org/10.1007/978-3-030-04771-9_10
ER  - 

TY  - CHAP
T1  - How Data Mining and Machine Learning Evolved from Relational Data Base to Data Science
T2  - A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years
Y1  - 2018
A1  - Amato, G.
A1  - Candela, L.
A1  - Castelli, D.
A1  - Esuli, A.
A1  - Falchi, F.
A1  - Gennaro, C.
A1  - Fosca Giannotti
A1  - Anna Monreale
A1  - Mirco Nanni
A1  - Pagano, P.
A1  - Luca Pappalardo
A1  - Dino Pedreschi
A1  - Francesca Pratesi
A1  - Rabitti, F.
A1  - S Rinzivillo
A1  - Giulio Rossetti
A1  - Salvatore Ruggieri
A1  - Sebastiani, F.
A1  - Tesconi, M.
ED  - Flesca, Sergio
ED  - Greco, Sergio
ED  - Masciari, Elio
ED  - Saccà, Domenico
AB  - During the last 35 years, data management principles such as physical and logical independence, declarative querying and cost-based optimization have led to profound pervasiveness of relational databases in any kind of organization. More importantly, these technical advances have enabled the first round of business intelligence applications and laid the foundation for managing and analyzing Big Data today.
JF  - A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years
PB  - Springer International Publishing
CY  - Cham
SN  - 978-3-319-61893-7
UR  - https://link.springer.com/chapter/10.1007%2F978-3-319-61893-7_17
ER  - 

TY  - JOUR
T1  - PRUDEnce: a system for assessing privacy risk vs utility in data sharing ecosystems
JF  - Transactions on Data Privacy
Y1  - 2018
A1  - Francesca Pratesi
A1  - Anna Monreale
A1  - Roberto Trasarti
A1  - Fosca Giannotti
A1  - Dino Pedreschi
A1  - Yanagihara, Tadashi
AB  - Data describing human activities are an important source of knowledge useful for understanding individual and collective behavior and for developing a wide range of user services. Unfortunately, this kind of data is sensitive, because people’s whereabouts may allow re-identification of individuals in a de-identified database. Therefore, Data Providers, before sharing those data, must apply any sort of anonymization to lower the privacy risks, but they must be aware and capable of controlling also the data quality, since these two factors are often a trade-off. In this paper we propose PRUDEnce (Privacy Risk versus Utility in Data sharing Ecosystems), a system enabling a privacy-aware ecosystem for sharing personal data. It is based on a methodology for assessing both the empirical (not theoretical) privacy risk associated to users represented in the data, and the data quality guaranteed only with users not at risk. Our proposal is able to support the Data Provider in the exploration of a repertoire of possible data transformations with the aim of selecting one specific transformation that yields an adequate trade-off between data quality and privacy risk. We study the practical effectiveness of our proposal over three data formats underlying many services, defined on real mobility data, i.e., presence data, trajectory data and road segment data.
VL  - 11
UR  - http://www.tdp.cat/issues16/tdp.a284a17.pdf
ER  - 

TY  - CONF
T1  - Assessing Privacy Risk in Retail Data
T2  - Personal Analytics and Privacy. An Individual and Collective Perspective - First International Workshop, PAP 2017, Held in Conjunction with ECML PKDD 2017, Skopje, Macedonia, September 18, 2017, Revised Selected Papers
Y1  - 2017
A1  - Roberto Pellungrini
A1  - Francesca Pratesi
A1  - Luca Pappalardo
AB  - Retail data are one of the most requested commodities by commercial companies. Unfortunately, from this data it is possible to retrieve highly sensitive information about individuals. Thus, there exists the need for accurate individual privacy risk evaluation. In this paper, we propose a methodology for assessing privacy risk in retail data. We define the data formats for representing retail data, the privacy framework for calculating privacy risk and some possible privacy attacks for this kind of data. We perform experiments in a real-world retail dataset, and show the distribution of privacy risk for the various attacks.
JF  - Personal Analytics and Privacy. An Individual and Collective Perspective - First International Workshop, PAP 2017, Held in Conjunction with ECML PKDD 2017, Skopje, Macedonia, September 18, 2017, Revised Selected Papers
UR  - https://doi.org/10.1007/978-3-319-71970-2_3
ER  - 

TY  - JOUR
T1  - A Data Mining Approach to Assess Privacy Risk in Human Mobility Data
JF  - ACM Trans. Intell. Syst. Technol.
Y1  - 2017
A1  - Roberto Pellungrini
A1  - Luca Pappalardo
A1  - Francesca Pratesi
A1  - Anna Monreale
AB  - Human mobility data are an important proxy to understand human mobility dynamics, develop analytical services, and design mathematical models for simulation and what-if analysis. Unfortunately mobility data are very sensitive since they may enable the re-identification of individuals in a database. Existing frameworks for privacy risk assessment provide data providers with tools to control and mitigate privacy risks, but they suffer two main shortcomings: (i) they have a high computational complexity; (ii) the privacy risk must be recomputed every time new data records become available and for every selection of individuals, geographic areas, or time windows. In this article, we propose a fast and flexible approach to estimate privacy risk in human mobility data. The idea is to train classifiers to capture the relation between individual mobility patterns and the level of privacy risk of individuals. We show the effectiveness of our approach by an extensive experiment on real-world GPS data in two urban areas and investigate the relations between human mobility patterns and the privacy risk of individuals.
VL  - 9
UR  - http://doi.acm.org/10.1145/3106774
ER  - 

TY  - ABST
T1  - Fast Estimation of Privacy Risk in Human Mobility Data
Y1  - 2017
A1  - Roberto Pellungrini
A1  - Luca Pappalardo
A1  - Francesca Pratesi
A1  - Anna Monreale
AB  - Mobility data are an important proxy to understand the patterns of human movements, develop analytical services and design models for simulation and prediction of human dynamics. Unfortunately mobility data are also very sensitive, since they may contain personal information about the individuals involved. Existing frameworks for privacy risk assessment enable the data providers to quantify and mitigate privacy risks, but they suffer two main limitations: (i) they have a high computational complexity; (ii) the privacy risk must be re-computed for each new set of individuals, geographic areas or time windows. In this paper we explore a fast and flexible solution to estimate privacy risk in human mobility data, using predictive models to capture the relation between an individual’s mobility patterns and her privacy risk. We show the effectiveness of our approach by experimentation on a real-world GPS dataset and provide a comparison with traditional methods.
SN  - 978-3-319-66283-1
ER  - 

TY  - CONF
T1  - Privacy Preserving Multidimensional Profiling
T2  - International Conference on Smart Objects and Technologies for Social Good
Y1  - 2017
A1  - Francesca Pratesi
A1  - Anna Monreale
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - Recently, big data had become central in the analysis of human behavior and the development of innovative services. In particular, a new class of services is emerging, taking advantage of different sources of data, in order to consider the multiple aspects of human beings. Unfortunately, these data can lead to re-identification problems and other privacy leaks, as diffusely reported in both scientific literature and media. The risk is even more pressing if multiple sources of data are linked together since a potential adversary could know information related to each dataset. For this reason, it is necessary to evaluate accurately and mitigate the individual privacy risk before releasing personal data. In this paper, we propose a methodology for the first task, i.e., assessing privacy risk, in a multidimensional scenario, defining some possible privacy attacks and simulating them using real-world datasets.
JF  - International Conference on Smart Objects and Technologies for Social Good
PB  - Springer
UR  - https://link.springer.com/chapter/10.1007/978-3-319-76111-4_15
ER  - 

TY  - CONF
T1  - Managing travels with PETRA: The Rome use case
T2  - 2015 31st IEEE International Conference on Data Engineering Workshops (ICDEW)
Y1  - 2015
A1  - Botea, Adi
A1  - Braghin, Stefano
A1  - Lopes, Nuno
A1  - Riccardo Guidotti
A1  - Francesca Pratesi
AB  - The aim of the PETRA project is to provide the basis for a city-wide transportation system that supports policies catering for both individual preferences of users and city-wide travel patterns. The PETRA platform will be initially deployed in the partner city of Rome, and later in Venice, and Tel-Aviv.
JF  - 2015 31st IEEE International Conference on Data Engineering Workshops (ICDEW)
PB  - IEEE
ER  - 

TY  - CONF
T1  - Mobility Mining for Journey Planning in Rome
T2  - Machine Learning and Knowledge Discovery in Databases
Y1  - 2015
A1  - Michele Berlingerio
A1  - Bicer, Veli
A1  - Botea, Adi
A1  - Braghin, Stefano
A1  - Lopes, Nuno
A1  - Riccardo Guidotti
A1  - Francesca Pratesi
AB  - We present recent results on integrating private car GPS routines obtained by a Data Mining module. into the PETRA (PErsonal TRansport Advisor) platform. The routines are used as additional “bus lines”, available to provide a ride to travelers. We present the effects of querying the planner with and without the routines, which show how Data Mining may help Smarter Cities applications.
JF  - Machine Learning and Knowledge Discovery in Databases
PB  - Springer International Publishing
ER  - 

TY  - JOUR
T1  - Privacy-by-Design in Big Data Analytics and Social Mining
JF  - EPJ Data Science
Y1  - 2014
A1  - Anna Monreale
A1  - S Rinzivillo
A1  - Francesca Pratesi
A1  - Fosca Giannotti
A1  - Dino Pedreschi
AB  - Privacy is ever-growing concern in our society and is becoming a fundamental aspect to take into account when one wants to use, publish and analyze data involving human personal sensitive information. Unfortunately, it is increasingly hard to transform the data in a way that it protects sensitive information: we live in the era of big data characterized by unprecedented opportunities to sense, store and analyze social data describing human activities in great detail and resolution. As a result, privacy preservation simply cannot be accomplished by de-identification alone. In this paper, we propose the privacy-by-design paradigm to develop technological frameworks for countering the threats of undesirable, unlawful effects of privacy violation, without obstructing the knowledge discovery opportunities of social mining and big data analytical technologies. Our main idea is to inscribe privacy protection into the knowledge discovery technology by design, so that the analysis incorporates the relevant privacy requirements from the start.
VL  - 10
N1  - 2014:10
ER  - 

TY  - CONF
T1  - Privacy-Aware Distributed Mobility Data Analytics
T2  - SEBD
Y1  - 2013
A1  - Francesca Pratesi
A1  - Anna Monreale
A1  - Hui Wendy Wang
A1  - S Rinzivillo
A1  - Dino Pedreschi
A1  - Gennady Andrienko
A1  - Natalia Andrienko
AB  - We propose an approach to preserve privacy in an analytical processing within a distributed setting, and tackle the problem of obtaining aggregated information about vehicle traffic in a city from movement data collected by individual vehicles and shipped to a central server. Movement data are sensitive because they may describe typical movement behaviors and therefore be used for re-identification of individuals in a database. We provide a privacy-preserving framework for movement data aggregation based on trajectory generalization in a distributed environment. The proposed solution, based on the differential privacy model and on sketching techniques for efficient data compression, provides a formal data protection safeguard. Using real-life data, we demonstrate the effectiveness of our approach also in terms of data utility preserved by the data transformation.
JF  - SEBD
CY  - Roccella Jonica
ER  - 

TY  - CHAP
T1  - Privacy-Preserving Distributed Movement Data Aggregation
T2  - Geographic Information Science at the Heart of Europe
Y1  - 2013
A1  - Anna Monreale
A1  - Hui Wendy Wang
A1  - Francesca Pratesi
A1  - S Rinzivillo
A1  - Dino Pedreschi
A1  - Gennady Andrienko
A1  - Natalia Andrienko
ED  - Vandenbroucke, Danny
ED  - Bucher, Bénédicte
ED  - Crompvoets, Joep
AB  - We propose a novel approach to privacy-preserving analytical processing within a distributed setting, and tackle the problem of obtaining aggregated information about vehicle traffic in a city from movement data collected by individual vehicles and shipped to a central server. Movement data are sensitive because people’s whereabouts have the potential to reveal intimate personal traits, such as religious or sexual preferences, and may allow re-identification of individuals in a database. We provide a privacy-preserving framework for movement data aggregation based on trajectory generalization in a distributed environment. The proposed solution, based on the differential privacy model and on sketching techniques for efficient data compression, provides a formal data protection safeguard. Using real-life data, we demonstrate the effectiveness of our approach also in terms of data utility preserved by the data transformation.
JF  - Geographic Information Science at the Heart of Europe
T3  - Lecture Notes in Geoinformation and Cartography
PB  - Springer International Publishing
SN  - 978-3-319-00614-7
UR  - http://dx.doi.org/10.1007/978-3-319-00615-4_13
ER  -