00476nas a2200145 4500008004100000245006400041210006400105260001300169100002100182700001900203700002300222700001900245700001700264856004900281 2021 eng d00aPrivacy Risk Assessment of Individual Psychometric Profiles0 aPrivacy Risk Assessment of Individual Psychometric Profiles bSpringer1 aMariani, Giacomo1 aMonreale, Anna1 aNaretto, Francesca1 aSoares, Carlos1 aTorgo, Luís uhttps://doi.org/10.1007/978-3-030-88942-5_3202034nas a2200217 4500008004100000020002200041245006900063210006900132260005200201520129400253100002301547700002501570700001901595700002701614700002001641700002101661700002501682700002501707700001701732856006701749 2020 eng d a978-3-030-61527-700aPredicting and Explaining Privacy Risk Exposure in Mobility Data0 aPredicting and Explaining Privacy Risk Exposure in Mobility Data aChambSpringer International Publishingc2020//3 aMobility data is a proxy of different social dynamics and its analysis enables a wide range of user services. Unfortunately, mobility data are very sensitive because the sharing of people’s whereabouts may raise serious privacy concerns. Existing frameworks for privacy risk assessment provide tools to identify and measure privacy risks, but they often (i) have high computational complexity; and (ii) are not able to provide users with a justification of the reported risks. In this paper, we propose expert, a new framework for the prediction and explanation of privacy risk on mobility data. We empirically evaluate privacy risk on real data, simulating a privacy attack with a state-of-the-art privacy risk assessment framework. We then extract individual mobility profiles from the data for predicting their risk. We compare the performance of several machine learning algorithms in order to identify the best approach for our task. Finally, we show how it is possible to explain privacy risk prediction on real data, using two algorithms: Shap, a feature importance-based method and Lore, a rule-based method. 
Overall, expert is able to provide a user with the privacy risk and an explanation of the risk itself. The experiments show excellent performance for the prediction task.1 aNaretto, Francesca1 aPellungrini, Roberto1 aMonreale, Anna1 aNardini, Franco, Maria1 aMusolesi, Mirco1 aAppice, Annalisa1 aTsoumakas, Grigorios1 aManolopoulos, Yannis1 aMatwin, Stan uhttps://link.springer.com/chapter/10.1007/978-3-030-61527-7_2702783nas a2200517 4500008004100000020002200041245008500063210006900148260005200217520120400269100002301473700002501496700002701521700002101548700002101569700001801590700002101608700002201629700001901651700002501670700002301695700002201718700002201740700002101762700001601783700002001799700002601819700002401845700002001869700002201889700002301911700002001934700001901954700002201973700002001995700002002015700002002035700001802055700001902073700002302092700001802115700002002133700002402153700002102177856006702198 2020 eng d a978-3-030-65965-300aPrediction and Explanation of Privacy Risk on Mobility Data with Neural Networks0 aPrediction and Explanation of Privacy Risk on Mobility Data with aChambSpringer International Publishingc2020//3 aThe analysis of privacy risk for mobility data is a fundamental part of any privacy-aware process based on such data. Mobility data are highly sensitive. Therefore, the correct identification of the privacy risk before releasing the data to the public is of utmost importance. However, existing privacy risk assessment frameworks have high computational complexity. To tackle these issues, some recent work proposed a solution based on classification approaches to predict privacy risk using mobility features extracted from the data. In this paper, we propose an improvement of this approach by applying long short-term memory (LSTM) neural networks to predict the privacy risk directly from original mobility data. We empirically evaluate privacy risk on real data by applying our LSTM-based approach. 
Results show that our proposed method based on an LSTM network is effective in predicting the privacy risk with results in terms of F1 of up to 0.91. Moreover, to explain the predictions of our model, we employ a state-of-the-art explanation algorithm, Shap. We explore the resulting explanation, showing how it is possible to provide effective predictions while explaining them to the end-user.1 aNaretto, Francesca1 aPellungrini, Roberto1 aNardini, Franco, Maria1 aGiannotti, Fosca1 aKoprinska, Irena1 aKamp, Michael1 aAppice, Annalisa1 aLoglisci, Corrado1 aAntonie, Luiza1 aZimmermann, Albrecht1 aGuidotti, Riccardo1 aÖzgöbek, Özlem1 aRibeiro, Rita, P.1 aGavaldà, Ricard1 aGama, João1 aAdilova, Linara1 aKrishnamurthy, Yamuna1 aFerreira, Pedro, M.1 aMalerba, Donato1 aMedeiros, Ibéria1 aCeci, Michelangelo1 aManco, Giuseppe1 aMasciari, Elio1 aRas, Zbigniew, W.1 aChristen, Peter1 aNtoutsi, Eirini1 aSchubert, Erich1 aZimek, Arthur1 aMonreale, Anna1 aBiecek, Przemyslaw1 aRinzivillo, S1 aKille, Benjamin1 aLommatzsch, Andreas1 aGulla, Jon, Atle uhttps://link.springer.com/chapter/10.1007/978-3-030-65965-3_3401574nas a2200193 4500008004100000020001400041245005500055210005400110260001600164300001100180490000800191520100500199100002301204700002301227700001801250700001901268700002101287856007201308 2020 eng d a0169-023X00aPRIMULE: Privacy risk mitigation for user profiles0 aPRIMULE Privacy risk mitigation for user profiles c2020/01/01/ a1017860 v1253 aThe availability of mobile phone data has encouraged the development of different data-driven tools, supporting social science studies and providing new data sources to the standard official statistics. However, this particular kind of data is subject to privacy concerns because it can enable the inference of personal and private information. In this paper, we address the privacy issues related to the sharing of user profiles, derived from mobile phone data, by proposing PRIMULE, a privacy risk mitigation strategy. 
Such a method relies on PRUDEnce (Pratesi et al., 2018), a privacy risk assessment framework that provides a methodology for systematically identifying risky-users in a set of data. An extensive experimentation on real-world data shows the effectiveness of PRIMULE strategy in terms of both quality of mobile user profiles and utility of these profiles for analytical services such as the Sociometer (Furletti et al., 2013), a data mining tool for city users classification.1 aPratesi, Francesca1 aGabrielli, Lorenzo1 aCintia, Paolo1 aMonreale, Anna1 aGiannotti, Fosca uhttps://www.sciencedirect.com/science/article/pii/S0169023X1830534201999nas a2200181 4500008004100000245011100041210006900152300001100221490000700232520140700239100002101646700001801667700002101685700002301706700002001729700002101749856004701770 2019 eng d00aPlayeRank: data-driven performance evaluation and player ranking in soccer via a machine learning approach0 aPlayeRank datadriven performance evaluation and player ranking i a1–270 v103 aThe problem of evaluating the performance of soccer players is attracting the interest of many companies and the scientific community, thanks to the availability of massive data capturing all the events generated during a match (e.g., tackles, passes, shots, etc.). Unfortunately, there is no consolidated and widely accepted metric for measuring performance quality in all of its facets. In this article, we design and implement PlayeRank, a data-driven framework that offers a principled multi-dimensional and role-aware evaluation of the performance of soccer players. We build our framework by deploying a massive dataset of soccer-logs and consisting of millions of match events pertaining to four seasons of 18 prominent soccer competitions. By comparing PlayeRank to known algorithms for performance evaluation in soccer, and by exploiting a dataset of players’ evaluations made by professional soccer scouts, we show that PlayeRank significantly outperforms the competitors. 
We also explore the ratings produced by PlayeRank and discover interesting patterns about the nature of excellent performances and what distinguishes the top players from the others. At the end, we explore some applications of PlayeRank—i.e. searching players and player versatility—showing its flexibility and efficiency, which makes it worth to be used in the design of a scalable platform for soccer analytics.1 aPappalardo, Luca1 aCintia, Paolo1 aFerragina, Paolo1 aMassucco, Emanuele1 aPedreschi, Dino1 aGiannotti, Fosca uhttps://dl.acm.org/doi/abs/10.1145/334317201650nas a2200301 4500008004100000020002200041245004800063210004800111260005200159520072600211100002500937700001900962700002300981700001901004700001901023700001901042700002101061700002001082700002201102700002101124700002301145700002101168700002401189700002301213700002201236700002301258856006701281 2019 eng d a978-3-030-13463-100aPrivacy Risk for Individual Basket Patterns0 aPrivacy Risk for Individual Basket Patterns aChambSpringer International Publishingc2019//3 aRetail data are of fundamental importance for businesses and enterprises that want to understand the purchasing behaviour of their customers. Such data is also useful to develop analytical services and for marketing purposes, often based on individual purchasing patterns. However, retail data and extracted models may also provide very sensitive information to possible malicious third parties. Therefore, in this paper we propose a methodology for empirically assessing privacy risk in the releasing of individual purchasing data. 
The experiments on real-world retail data show that, although individual patterns describe a summary of the customer activity, they may be successfully used for customer re-identification.1 aPellungrini, Roberto1 aMonreale, Anna1 aGuidotti, Riccardo1 aAlzate, Carlos1 aMonreale, Anna1 aBioglio, Livio1 aBitetta, Valerio1 aBordino, Ilaria1 aCaldarelli, Guido1 aFerretti, Andrea1 aGuidotti, Riccardo1 aGullo, Francesco1 aPascolutti, Stefano1 aPensa, Ruggero, G.1 aRobardet, Céline1 aSquartini, Tiziano uhttps://link.springer.com/chapter/10.1007/978-3-030-13463-1_1101689nas a2200193 4500008004100000245007700041210006900118300001100187490000600198520109400204100002101298700001801319700001901337700002301356700002101379700002001400700002101420856005401441 2019 eng d00aA public data set of spatio-temporal match events in soccer competitions0 apublic data set of spatiotemporal match events in soccer competi a1–150 v63 aSoccer analytics is attracting increasing interest in academia and industry, thanks to the availability of sensing technologies that provide high-fidelity data streams for every match. Unfortunately, these detailed data are owned by specialized companies and hence are rarely publicly available for scientific research. To fill this gap, this paper describes the largest open collection of soccer-logs ever released, containing all the spatio-temporal events (passes, shots, fouls, etc.) that occurred during each match for an entire season of seven prominent soccer competitions. Each match event contains information about its position, time, outcome, player and characteristics. 
The nature of team sports like soccer, halfway between the abstraction of a game and the reality of complex social systems, combined with the unique size and composition of this dataset, provide an ideal ground for tackling a wide range of data science problems, including the measurement and evaluation of performance, both at individual and at collective level, and the determinants of success and failure.1 aPappalardo, Luca1 aCintia, Paolo1 aRossi, Alessio1 aMassucco, Emanuele1 aFerragina, Paolo1 aPedreschi, Dino1 aGiannotti, Fosca uhttps://www.nature.com/articles/s41597-019-0247-700404nas a2200121 4500008004100000245004000041210004000081100001700121700002100138700002000159700002100179856008200200 2019 eng d00aPublic opinion and Algorithmic bias0 aPublic opinion and Algorithmic bias1 aSirbu, Alina1 aGiannotti, Fosca1 aPedreschi, Dino1 aKertész, János uhttps://ercim-news.ercim.eu/en116/special/public-opinion-and-algorithmic-bias01676nas a2200145 4500008004100000245008600041210006900127520117000196100002301366700002101389700002101410700002101431700002001452856005801472 2018 eng d00aPersonalized Market Basket Prediction with Temporal Annotated Recurring Sequences0 aPersonalized Market Basket Prediction with Temporal Annotated Re3 aNowadays, a hot challenge for supermarket chains is to offer personalized services to their customers. Market basket prediction, i.e., supplying the customer with a shopping list for the next purchase according to her current needs, is one of these services. Current approaches are not capable of capturing at the same time the different factors influencing the customer's decision process: co-occurrence, sequentiality, periodicity and recurrency of the purchased items. To this aim, we define a pattern Temporal Annotated Recurring Sequence (TARS) able to capture simultaneously and adaptively all these factors. 
We define the method to extract TARS and develop a predictor for next basket named TBP (TARS Based Predictor) that, on top of TARS, is able to understand the level of the customer's stocks and recommend the set of most necessary items. By adopting the TBP the supermarket chains could crop tailored suggestions for each individual customer which in turn could effectively speed up their shopping sessions. A deep experimentation shows that TARS are able to explain the customer purchase behavior, and that TBP outperforms the state-of-the-art competitors.1 aGuidotti, Riccardo1 aRossetti, Giulio1 aPappalardo, Luca1 aGiannotti, Fosca1 aPedreschi, Dino uhttps://ieeexplore.ieee.org/abstract/document/847715701950nas a2200181 4500008004100000245008800041210006900129260001200198490000700210520137400217100002301591700001901614700002201633700002101655700002001676700002401696856004801720 2018 eng d00aPRUDEnce: a system for assessing privacy risk vs utility in data sharing ecosystems0 aPRUDEnce a system for assessing privacy risk vs utility in data c08/20180 v113 aData describing human activities are an important source of knowledge useful for understanding individual and collective behavior and for developing a wide range of user services. Unfortunately, this kind of data is sensitive, because people’s whereabouts may allow re-identification of individuals in a de-identified database. Therefore, Data Providers, before sharing those data, must apply any sort of anonymization to lower the privacy risks, but they must be aware and capable of controlling also the data quality, since these two factors are often a trade-off. In this paper we propose PRUDEnce (Privacy Risk versus Utility in Data sharing Ecosystems), a system enabling a privacy-aware ecosystem for sharing personal data. It is based on a methodology for assessing both the empirical (not theoretical) privacy risk associated to users represented in the data, and the data quality guaranteed only with users not at risk. 
Our proposal is able to support the Data Provider in the exploration of a repertoire of possible data transformations with the aim of selecting one specific transformation that yields an adequate trade-off between data quality and privacy risk. We study the practical effectiveness of our proposal over three data formats underlying many services, defined on real mobility data, i.e., presence data, trajectory data and road segment data.1 aPratesi, Francesca1 aMonreale, Anna1 aTrasarti, Roberto1 aGiannotti, Fosca1 aPedreschi, Dino1 aYanagihara, Tadashi uhttp://www.tdp.cat/issues16/tdp.a284a17.pdf01373nas a2200145 4500008004100000245005000041210005000091260001300141520092300154100002301077700001901100700002101119700002001140856006701160 2017 eng d00aPrivacy Preserving Multidimensional Profiling0 aPrivacy Preserving Multidimensional Profiling bSpringer3 aRecently, big data has become central in the analysis of human behavior and the development of innovative services. In particular, a new class of services is emerging, taking advantage of different sources of data, in order to consider the multiple aspects of human beings. Unfortunately, these data can lead to re-identification problems and other privacy leaks, as widely reported in both scientific literature and media. The risk is even more pressing if multiple sources of data are linked together since a potential adversary could know information related to each dataset. For this reason, it is necessary to evaluate accurately and mitigate the individual privacy risk before releasing personal data. 
In this paper, we propose a methodology for the first task, i.e., assessing privacy risk, in a multidimensional scenario, defining some possible privacy attacks and simulating them using real-world datasets.1 aPratesi, Francesca1 aMonreale, Anna1 aGiannotti, Fosca1 aPedreschi, Dino uhttps://link.springer.com/chapter/10.1007/978-3-319-76111-4_1501341nas a2200169 4500008004100000245006100041210006000102260003800162300001400200520081200214100002001026700001501046700001901061700001701080700002301097856005101120 2016 eng d00aPartition-Based Clustering Using Constraint Optimization0 aPartitionBased Clustering Using Constraint Optimization bSpringer International Publishing a282–2993 aPartition-based clustering is the task of partitioning a dataset in a number of groups of examples, such that examples in each group are similar to each other. Many criteria for what constitutes a good clustering have been identified in the literature; furthermore, the use of additional constraints to find more useful clusterings has been proposed. In this chapter, it will be shown that most of these clustering tasks can be formalized using optimization criteria and constraints. We demonstrate how a range of clustering tasks can be modelled in generic constraint programming languages with these constraints and optimization criteria. 
Using the constraint-based modeling approach we also relate the DBSCAN method for density-based clustering to the label propagation technique for community discovery.1 aGrossi, Valerio1 aGuns, Tias1 aMonreale, Anna1 aNanni, Mirco1 aNijssen, Siegfried uhttp://dx.doi.org/10.1007/978-3-319-50137-6_1101568nas a2200133 4500008004100000245008400041210006900125260003600194490001400230520111700244100001701361700002001378856003601398 2016 eng d00aPower Consumption Modeling and Prediction in a Hybrid CPU-GPU-MIC Supercomputer0 aPower Consumption Modeling and Prediction in a Hybrid CPUGPUMIC aGrenoble, FrancebSpringer LNCS0 vLNCS 98333 aPower consumption is a major obstacle for High Performance Computing (HPC) systems in their quest towards the holy grail of ExaFLOP performance. Significant advances in power efficiency have to be made before this goal can be attained and accurate modeling is an essential step towards power efficiency by optimizing system operating parameters to match dynamic energy needs. In this paper we present a study of power consumption by jobs in Eurora, a hybrid CPU-GPU-MIC system installed at the largest Italian data center. Using data from a dedicated monitoring framework, we build a data-driven model of power consumption for each user in the system and use it to predict the power requirements of future jobs. We are able to achieve good prediction results for over 80 % of the users in the system. For the remaining users, we identify possible reasons why prediction performance is not as good. Possible applications for our predictive modeling results include scheduling optimization, power-aware billing and system-scale power modeling. 
All the scripts used for the study have been made available on GitHub.1 aSirbu, Alina1 aBabaoglu, Ozalp uhttp://arxiv.org/abs/1601.0596101787nas a2200121 4500008004100000245006100041210006000102260003800162520137900200100001701579700002001596856004901616 2016 eng d00aPredicting System-level Power for a Hybrid Supercomputer0 aPredicting Systemlevel Power for a Hybrid Supercomputer aInnsbruck, AustriabIEEEc07/20163 aFor current High Performance Computing systems to scale towards the holy grail of ExaFLOP performance, their power consumption has to be reduced by at least one order of magnitude. This goal can be achieved only through a combination of hardware and software advances. Being able to model and accurately predict the power consumption of large computational systems is necessary for software-level innovations such as proactive and power-aware scheduling, resource allocation and fault tolerance techniques. In this paper we present a 2-layer model of power consumption for a hybrid supercomputer (which held the top spot of the Green500 list on July 2013) that combines CPU, GPU and MIC technologies to achieve higher energy efficiency. Our model takes as input workload information - the number and location of resources that are used by each job at a certain time - and calculates the resulting system-level power consumption. When jobs are submitted to the system, the workload configuration can be foreseen based on the scheduler policies, and our model can then be applied to predict the ensuing system-level power consumption. Additionally, alternative workload configurations can be evaluated from a power perspective and more efficient ones can be selected. 
Applications of the model include not only power-aware scheduling but also prediction of anomalous behavior.1 aSirbu, Alina1 aBabaoglu, Ozalp uhttp://ieeexplore.ieee.org/document/7568420/01227nas a2200121 4500008004100000245005000041210004900091260004500140520083300185100001901018700002101037856004701058 2016 eng d00aPrivacy-Preserving Outsourcing of Data Mining0 aPrivacyPreserving Outsourcing of Data Mining a Atlanta, GA, USAbIEEE Computer Society3 aData mining is gaining momentum in society due to the ever increasing availability of large amounts of data, easily gathered by a variety of collection technologies and stored via computer systems. Due to the limited computational resources of data owners and the developments in cloud computing, there has been considerable recent interest in the paradigm of data mining-as-a-service (DMaaS). In this paradigm, a company (data owner) lacking in expertise or computational resources outsources its mining needs to a third party service provider (server). Given the fact that the server may not be fully trusted, one of the main concerns of the DMaaS paradigm is the protection of data privacy. In this paper, we provide an overview of a variety of techniques and approaches that address the privacy issues of the DMaaS paradigm.1 aMonreale, Anna1 aWang, Hui, Wendy uhttp://dx.doi.org/10.1109/COMPSAC.2016.16901567nas a2200145 4500008004100000245010400041210006900145260000900214520098000223100002501203700001901228700002301247700002301270856012801293 2016 eng d00aPrivacy-Preserving Outsourcing of Pattern Mining of Event-Log Data-A Use-Case from Process Industry0 aPrivacyPreserving Outsourcing of Pattern Mining of EventLog Data bIEEE3 aWith the advent of cloud computing and its model for IT services based on the Internet and big data centers, the interest of industries into XaaS ("Anything as a Service") paradigm is increasing. 
Business intelligence and knowledge discovery services are typical services that companies tend to externalize on the cloud, due to their data intensive nature and the complexity of the algorithms. What is appealing for a company is to rely on external expertise and infrastructure to compute the analytical results and models which are required by the business analysts for understanding the business phenomena under observation. Although it is advantageous to achieve sophisticated analysis, there exist several serious privacy issues in this paradigm. In this paper we investigate through an industrial use-case the application of a framework for privacy-preserving outsourcing of pattern mining on event-log data. Moreover, we present and discuss some ideas about possible extensions.1 aMarrella, Alessandro1 aMonreale, Anna1 aKloepper, Benjamin1 aKrueger, Martin, W uhttps://kdd.isti.cnr.it/publications/privacy-preserving-outsourcing-pattern-mining-event-log-data-use-case-process-industry02544nas a2200373 4500008004100000022001400041245008200055210006900137260000900206300001300215490000700228520142000235100001701655700001901672700002201691700002201713700001501735700002001750700002001770700001901790700002101809700002101830700001901851700002101870700001601891700002601907700002001933700002401953700001701977700001701994700002002011700002702031856011202058 2015 eng d a1932-620300aParticipatory Patterns in an International Air Quality Monitoring Initiative.0 aParticipatory Patterns in an International Air Quality Monitorin c2015 ae01367630 v103 a
The issue of sustainability is at the top of the political and societal agenda, being considered of extreme importance and urgency. Human individual action impacts the environment both locally (e.g., local air/water quality, noise disturbance) and globally (e.g., climate change, resource use). Urban environments represent a crucial example, with an increasing realization that the most effective way of producing a change is involving the citizens themselves in monitoring campaigns (a citizen science bottom-up approach). This is possible by developing novel technologies and IT infrastructures enabling large citizen participation. Here, in the wider framework of one of the first such projects, we show results from an international competition where citizens were involved in mobile air pollution monitoring using low cost sensing devices, combined with a web-based game to monitor perceived levels of pollution. Measures of shift in perceptions over the course of the campaign are provided, together with insights into participatory patterns emerging from this study. Interesting effects related to inertia and to direct involvement in measurement activities rather than indirect information exposure are also highlighted, indicating that direct involvement can enhance learning and environmental awareness. In the future, this could result in better adoption of policies towards decreasing pollution.
1 aSirbu, Alina1 aBecker, Martin1 aCaminiti, Saverio1 aDe Baets, Bernard1 aElen, Bart1 aFrancis, Louise1 aGravino, Pietro1 aHotho, Andreas1 aIngarra, Stefano1 aLoreto, Vittorio1 aMolino, Andrea1 aMueller, Juergen1 aPeters, Jan1 aRicchiuti, Ferdinando1 aSaracino, Fabio1 aServedio, Vito, D P1 aStumme, Gerd1 aTheunis, Jan1 aTria, Francesca1 aVan den Bossche, Joris uhttps://kdd.isti.cnr.it/publications/participatory-patterns-international-air-quality-monitoring-initiative02010nas a2200157 4500008004100000245004500041210004500086260001200131300001100143490000600154520154300160100002001703700002401723700002101747856008401768 2015 eng d00aProduct assortment and customer mobility0 aProduct assortment and customer mobility c10-2015 a1–180 v43 aCustomers mobility is dependent on the sophistication of their needs: sophisticated customers need to travel more to fulfill their needs. In this paper, we provide more detailed evidence of this phenomenon, providing an empirical validation of the Central Place Theory. For each customer, we detect what is her favorite shop, where she purchases most products. We can study the relationship between the favorite shop and the closest one, by recording the influence of the shop’s size and the customer’s sophistication in the discordance cases, i.e. the cases in which the favorite shop is not the closest one. We show that larger shops are able to retain most of their closest customers and they are able to catch large portions of customers from smaller shops around them. We connect this observation with the shop’s larger sophistication, and not with its other characteristics, as the phenomenon is especially noticeable when customers want to satisfy their sophisticated needs. This is a confirmation of the recent extensions of the Central Place Theory, where the original assumptions of homogeneity in customer purchase power and needs are challenged. Different types of shops have also different survival logics. 
The largest shops get closed if they are unable to catch customers from the smaller shops, while medium size shops get closed if they cannot retain their closest customers. All analyses are performed on a large real-world dataset recording all purchases from millions of customers across the west coast of Italy.1 aCoscia, Michele1 aPennacchioli, Diego1 aGiannotti, Fosca uhttp://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-015-0051-300482nas a2200145 4500008004100000245006800041210006300109100002400172700002100196700002100217700002000238700002100258700002000279856003700299 2014 eng d00aThe patterns of musical influence on the Last.Fm social network0 apatterns of musical influence on the LastFm social network1 aPennacchioli, Diego1 aRossetti, Giulio1 aPappalardo, Luca1 aPedreschi, Dino1 aGiannotti, Fosca1 aCoscia, Michele uhttps://kdd.isti.cnr.it/node/62301645nas a2200205 4500008003900000245004500039210004300084300001400127520105800141100001801199700001901217700002401236700002101260700002001281700002301301700001901324700002401343700002201367856005001389 2014 d00aA Privacy Risk Model for Trajectory Data0 aPrivacy Risk Model for Trajectory Data a125–1403 aTime sequence data relating to users, such as medical histories and mobility data, are good candidates for data mining, but often contain highly sensitive information. Different methods in privacy-preserving data publishing are utilised to release such private data so that individual records in the released data cannot be re-linked to specific users with a high degree of certainty. These methods provide theoretical worst-case privacy risks as measures of the privacy protection that they offer. However, often with many real-world data the worst-case scenario is too pessimistic and does not provide a realistic view of the privacy risks: the real probability of re-identification is often much lower than the theoretical worst-case risk. 
In this paper we propose a novel empirical risk model for privacy which, in relation to the cost of privacy attacks, better demonstrates the practical risks associated with a privacy preserving data release. We present a detailed evaluation of the proposed risk model by using k-anonymised real-world mobility data.1 aBasu, Anirban1 aMonreale, Anna1 aCorena, Juan Camilo1 aGiannotti, Fosca1 aPedreschi, Dino1 aKiyomoto, Shinsaku1 aMiyake, Yutaka1 aYanagihara, Tadashi1 aTrasarti, Roberto uhttp://dx.doi.org/10.1007/978-3-662-43813-8_901574nas a2200157 4500008003900000245006200039210006000101490000700161520105400168100001901222700001801241700002301259700002101282700002001303856009301323 2014 d00aPrivacy-by-Design in Big Data Analytics and Social Mining0 aPrivacybyDesign in Big Data Analytics and Social Mining0 v103 aPrivacy is an ever-growing concern in our society and is becoming a fundamental aspect to take into account when one wants to use, publish and analyze data involving human personal sensitive information. Unfortunately, it is increasingly hard to transform the data in a way that it protects sensitive information: we live in the era of big data characterized by unprecedented opportunities to sense, store and analyze social data describing human activities in great detail and resolution. As a result, privacy preservation simply cannot be accomplished by de-identification alone. In this paper, we propose the privacy-by-design paradigm to develop technological frameworks for countering the threats of undesirable, unlawful effects of privacy violation, without obstructing the knowledge discovery opportunities of social mining and big data analytical technologies. 
Our main idea is to inscribe privacy protection into the knowledge discovery technology by design, so that the analysis incorporates the relevant privacy requirements from the start.1 aMonreale, Anna1 aRinzivillo, S1 aPratesi, Francesca1 aGiannotti, Fosca1 aPedreschi, Dino uhttps://kdd.isti.cnr.it/publications/privacy-design-big-data-analytics-and-social-mining01411nas a2200133 4500008004100000245008100041210006900122260001900191520090300210100002001113700002001133700001901153856010501172 2014 eng d00aProcess mining event logs from FLOSS data: state of the art and perspectives0 aProcess mining event logs from FLOSS data state of the art and p bSpringer, Cham3 aFree/Libre Open Source Software (FLOSS) is a phenomenon that has undoubtedly triggered extensive research endeavors. At the heart of these initiatives is the ability to mine data from FLOSS repositories with the hope of revealing empirical evidence to answer existing questions on the FLOSS development process. In spite of the success produced with existing mining techniques, emerging questions about FLOSS data require alternative and more appropriate ways to explore and analyse such data. In this paper, we explore a different perspective called process mining. Process mining has been proved to be successful in terms of tracing and reconstructing process models from data logs (event logs). The chief objective of our analysis is threefold. We aim to achieve: (1) conformance to predefined models; (2) discovery of new model patterns; and, finally, (3) extension to predefined models. 
1 aMukala, Patrick1 aCerone, Antonio1 aTurini, Franco uhttps://kdd.isti.cnr.it/publications/process-mining-event-logs-floss-data-state-art-and-perspectives00505nas a2200145 4500008004100000245008100041210006900122100001800191700002300209700001700232700002100249700002000270700002100290856004800311 2014 eng d00aThe purpose of motion: Learning activities from Individual Mobility Networks0 apurpose of motion Learning activities from Individual Mobility N1 aRinzivillo, S1 aGabrielli, Lorenzo1 aNanni, Mirco1 aPappalardo, Luca1 aPedreschi, Dino1 aGiannotti, Fosca uhttp://dx.doi.org/10.1109/DSAA.2014.705809000516nas a2200121 4500008003900000245008700039210006900126100002200195700002300217700001800240700001800258856011800276 2013 d00aPisa Tourism fluxes Observatory: deriving mobility indicators from GSM call habits0 aPisa Tourism fluxes Observatory deriving mobility indicators fro1 aFurletti, Barbara1 aGabrielli, Lorenzo1 aRenso, Chiara1 aRinzivillo, S uhttps://kdd.isti.cnr.it/publications/pisa-tourism-fluxes-observatory-deriving-mobility-indicators-gsm-call-habits01416nas a2200181 4500008004100000245005400041210005300095260002000148520088200168100002301050700001901073700002101092700001801113700002001131700002301151700002301174856003701197 2013 eng d00aPrivacy-Aware Distributed Mobility Data Analytics0 aPrivacyAware Distributed Mobility Data Analytics aRoccella Jonica3 aWe propose an approach to preserve privacy in an analytical processing within a distributed setting, and tackle the problem of obtaining aggregated information about vehicle traffic in a city from movement data collected by individual vehicles and shipped to a central server. Movement data are sensitive because they may describe typical movement behaviors and therefore be used for re-identification of individuals in a database. We provide a privacy-preserving framework for movement data aggregation based on trajectory generalization in a distributed environment. 
The proposed solution, based on the differential privacy model and on sketching techniques for efficient data compression, provides a formal data protection safeguard. Using real-life data, we demonstrate the effectiveness of our approach also in terms of data utility preserved by the data transformation. 1 aPratesi, Francesca1 aMonreale, Anna1 aWang, Hui, Wendy1 aRinzivillo, S1 aPedreschi, Dino1 aAndrienko, Gennady1 aAndrienko, Natalia uhttps://kdd.isti.cnr.it/node/61501685nas a2200241 4500008003900000020002200039245006100061210006000122260003800182300001200220520094300232100001901175700002101194700002301215700001801238700002001256700002301276700002301299700002501322700002401347700002101371856005101392 2013 d a978-3-319-00614-700aPrivacy-Preserving Distributed Movement Data Aggregation0 aPrivacyPreserving Distributed Movement Data Aggregation bSpringer International Publishing a225-2453 aWe propose a novel approach to privacy-preserving analytical processing within a distributed setting, and tackle the problem of obtaining aggregated information about vehicle traffic in a city from movement data collected by individual vehicles and shipped to a central server. Movement data are sensitive because people’s whereabouts have the potential to reveal intimate personal traits, such as religious or sexual preferences, and may allow re-identification of individuals in a database. We provide a privacy-preserving framework for movement data aggregation based on trajectory generalization in a distributed environment. The proposed solution, based on the differential privacy model and on sketching techniques for efficient data compression, provides a formal data protection safeguard. 
Using real-life data, we demonstrate the effectiveness of our approach also in terms of data utility preserved by the data transformation.1 aMonreale, Anna1 aWang, Hui, Wendy1 aPratesi, Francesca1 aRinzivillo, S1 aPedreschi, Dino1 aAndrienko, Gennady1 aAndrienko, Natalia1 aVandenbroucke, Danny1 aBucher, Bénédicte1 aCrompvoets, Joep uhttp://dx.doi.org/10.1007/978-3-319-00615-4_1301777nas a2200145 4500008003900000245008900039210006900128520121300197100002101410700002201431700001901453700002001472700002101492856011801513 2013 d00aPrivacy-Preserving Mining of Association Rules From Outsourced Transaction Databases0 aPrivacyPreserving Mining of Association Rules From Outsourced Tr3 aSpurred by developments such as cloud computing, there has been considerable recent interest in the paradigm of data mining-as-a-service. A company (data owner) lacking in expertise or computational resources can outsource its mining needs to a third party service provider (server). However, both the items and the association rules of the outsourced database are considered private property of the corporation (data owner). To protect corporate privacy, the data owner transforms its data and ships it to the server, sends mining queries to the server, and recovers the true patterns from the extracted patterns received from the server. In this paper, we study the problem of outsourcing the association rule mining task within a corporate privacy-preserving framework. We propose an attack model based on background knowledge and devise a scheme for privacy preserving outsourced mining. Our scheme ensures that each transformed item is indistinguishable with respect to the attacker's background knowledge, from at least k-1 other transformed items. 
Our comprehensive experiments on a very large and real transaction database demonstrate that our techniques are effective, scalable, and protect privacy.1 aGiannotti, Fosca1 aLakshmanan, L V S1 aMonreale, Anna1 aPedreschi, Dino1 aWang, Hui, Wendy uhttps://kdd.isti.cnr.it/publications/privacy-preserving-mining-association-rules-outsourced-transaction-databases00505nas a2200133 4500008003900000245005400039210005100093100003200144700002300176700003100199700003800230700001800268856008500286 2013 d00aA Proactive Ap- plication to Monitor Truck Fleets0 aProactive Ap plication to Monitor Truck Fleets1 aAlbuquerque, Fabio Da Costa1 aCasanova, Marco, A1 aCarvalho, Marcelo Tilio, M1 aMacêdo, José Antônio Fernandes1 aRenso, Chiara uhttps://kdd.isti.cnr.it/publications/proactive-ap-plication-monitor-truck-fleets01493nas a2200157 4500008003900000245006200039210006000101260000900161520096900170100002101139700002201160700001901182700002001201700002101221856009301242 2011 d00aPrivacy-preserving data mining from outsourced databases.0 aPrivacypreserving data mining from outsourced databases c20113 aSpurred by developments such as cloud computing, there has been considerable recent interest in the paradigm of data mining-as-service: a company (data owner) lacking in expertise or computational resources can outsource its mining needs to a third party service provider (server). However, both the outsourced database and the knowledge extract from it by data mining are considered private property of the data owner. To protect corporate privacy, the data owner transforms its data and ships it to the server, sends mining queries to the server, and recovers the true patterns from the extracted patterns received from the server. In this paper, we study the problem of outsourcing a data mining task within a corporate privacy-preserving framework. 
We propose a scheme for privacy-preserving outsourced mining which offers a formal protection against information disclosure, and show that the data owner can recover the correct data mining results efficiently.1 aGiannotti, Fosca1 aLakshmanan, L V S1 aMonreale, Anna1 aPedreschi, Dino1 aWang, Hui, Wendy uhttps://kdd.isti.cnr.it/publications/privacy-preserving-data-mining-outsourced-databases01999nas a2200169 4500008003900000245008200039210006900121300001200190490000600202520141100208100002501619700002001644700002101664700001901685700002001704856010501724 2011 d00aThe pursuit of hubbiness: Analysis of hubs in large multidimensional networks0 apursuit of hubbiness Analysis of hubs in large multidimensional a223-2370 v23 aHubs are highly connected nodes within a network. In complex network analysis, hubs have been widely studied, and are at the basis of many tasks, such as web search and epidemic outbreak detection. In reality, networks are often multidimensional, i.e., there can exist multiple connections between any pair of nodes. In this setting, the concept of hub depends on the multiple dimensions of the network, whose interplay becomes crucial for the connectedness of a node. In this paper, we characterize multidimensional hubs. We consider the multidimensional generalization of the degree and introduce a new class of measures, that we call Dimension Relevance, aimed at analyzing the importance of different dimensions for the hubbiness of a node. We assess the meaningfulness of our measures by comparing them on real networks and null models, then we study the interplay among dimensions and their effect on node connectivity. Our findings show that: (i) multidimensional hubs do exist and their characterization yields interesting insights and (ii) it is possible to detect the most influential dimensions that cause the different hub behaviors. 
We demonstrate the usefulness of multidimensional analysis in three real world domains: detection of ambiguous query terms in a word–word query log network, outlier detection in a social network, and temporal analysis of behaviors in a co-authorship network.1 aBerlingerio, Michele1 aCoscia, Michele1 aGiannotti, Fosca1 aMonreale, Anna1 aPedreschi, Dino uhttps://kdd.isti.cnr.it/publications/pursuit-hubbiness-analysis-hubs-large-multidimensional-networks01643nas a2200157 4500008003900000245007100039210006900110300001000179520109600189100001901285700002201304700001801326700002001344700001901364856010201383 2010 d00aPreserving privacy in semantic-rich trajectories of human mobility0 aPreserving privacy in semanticrich trajectories of human mobilit a47-543 aThe increasing abundance of data about the trajectories of personal movement is opening up new opportunities for analyzing and mining human mobility, but new risks emerge since it opens new ways of intruding into personal privacy. Representing the personal movements as sequences of places visited by a person during her/his movements - semantic trajectory - poses even greater privacy threats w.r.t. raw geometric location data. In this paper we propose a privacy model defining the attack model of semantic trajectory linking, together with a privacy notion, called c-safety. This method provides an upper bound to the probability of inferring that a given person, observed in a sequence of nonsensitive places, has also stopped in any sensitive location. Coherently with the privacy model, we propose an algorithm for transforming any dataset of semantic trajectories into a c-safe one. 
We report a study on a real-life GPS trajectory dataset to show how our algorithm preserves interesting quality/utility measures of the original trajectories, such as sequential pattern mining results.1 aMonreale, Anna1 aTrasarti, Roberto1 aRenso, Chiara1 aPedreschi, Dino1 aBogorny, Vania uhttps://kdd.isti.cnr.it/publications/preserving-privacy-semantic-rich-trajectories-human-mobility00460nas a2200109 4500008004100000245008300041210006900124260001200193100002000205700002100225856010400246 2009 eng d00aPoverty as a Social Condition: a Case Study on a Small Municipality in Tuscany0 aPoverty as a Social Condition a Case Study on a Small Municipali bSEAFORD1 aTomei, Gabriele1 aNatilli, Michela uhttps://kdd.isti.cnr.it/publications/poverty-social-condition-case-study-small-municipality-tuscany01351nas a2200133 4500008003900000245009600039210006900135520084300204100002201047700001901069700001901088700002001107856009001127 2008 d00aPattern-Preserving k-Anonymization of Sequences and its Application to Mobility Data Mining0 aPatternPreserving kAnonymization of Sequences and its Applicatio3 aSequential pattern mining is a major research field in knowledge discovery and data mining. Thanks to the increasing availability of transaction data, it is now possible to provide new and improved services based on users’ and customers’ behavior. However, this puts the citizen’s privacy at risk. Thus, it is important to develop new privacy-preserving data mining techniques that do not alter the analysis results significantly. In this paper we propose a new approach for anonymizing sequential data by hiding infrequent, and thus potentially sensible, subsequences. Our approach guarantees that the disclosed data are k-anonymous and preserve the quality of extracted patterns. 
An application to a real-world moving object database is presented, which shows the effectiveness of our approach also in complex contexts.1 aPensa, Ruggero, G1 aMonreale, Anna1 aPinelli, Fabio1 aPedreschi, Dino uhttps://air.unimi.it/retrieve/handle/2434/52786/106397/ProceedingsPiLBA08.pdf#page=4400657nas a2200181 4500008003900000245008000039210006900119300001200188100002000200700002200220700001900242700002700261700002100288700001900309700001800328700001900346856011000365 2008 d00aPrivacy Protection: Regulations and Technologies, Opportunities and Threats0 aPrivacy Protection Regulations and Technologies Opportunities an a101-1191 aPedreschi, Dino1 aBonchi, Francesco1 aTurini, Franco1 aVerykios, Vassilios, S1 aAtzori, Maurizio1 aMalin, Bradley1 aMoelans, Bart1 aSaygin, Yücel uhttps://kdd.isti.cnr.it/content/privacy-protection-regulations-and-technologies-opportunities-and-threats00496nas a2200145 4500008004100000245005700041210005600098300001200154100002100166700002200187700002100209700002000230700001600250856008400266 2007 eng d00aPrivacy-Aware Knowledge Discovery from Location Data0 aPrivacyAware Knowledge Discovery from Location Data a283-2871 aAtzori, Maurizio1 aBonchi, Francesco1 aGiannotti, Fosca1 aPedreschi, Dino1 aAbul, Osman uhttps://kdd.isti.cnr.it/content/privacy-aware-knowledge-discovery-location-data00577nas a2200145 4500008003900000020002200039245008000061210006900141260000900210100002200219700002100241700001800262700002000280856013100300 2007 d a978-972-8924-44-700aPUSHING CONSTRAINTS IN ASSOCIATION RULE MINING: AN ONTOLOGY-BASED APPROACH 0 aPUSHING CONSTRAINTS IN ASSOCIATION RULE MINING AN ONTOLOGYBASED c20071 aFurletti, Barbara1 aBellandi, Andrea1 aRomei, Andrea1 aGrossi, Valerio uhttp://www.iadisportal.org/digital-library/mdownload/pushing-constraints-in-association-rule-mining-an-ontology-based-approach00379nas a2200109 4500008004100000245004900041210004900090300000900139100002200148700002100170856007800191 2004 eng d00aPushing Constraints to 
Detect Local Patterns0 aPushing Constraints to Detect Local Patterns a1-191 aBonchi, Francesco1 aGiannotti, Fosca uhttps://kdd.isti.cnr.it/content/pushing-constraints-detect-local-patterns00547nas a2200133 4500008004100000245010300041210006900144100001300213700001300226700001500239700002100254700001500275856012300290 2003 eng d00aPersonal income in the gross and net forms: applications of the Siena Micro-Simulation Model (SM2)0 aPersonal income in the gross and net forms applications of the S1 aVerma, V1 aBetti, G1 aBallini, F1 aNatilli, Michela1 aGalgani, S uhttps://kdd.isti.cnr.it/publications/personal-income-gross-and-net-forms-applications-siena-micro-simulation-model-sm200449nas a2200133 4500008004100000245005000041210004900091300001200140100002200152700002100174700002200195700002000217856007800237 2003 eng d00aPre-processing for Constrained Pattern Mining0 aPreprocessing for Constrained Pattern Mining a519-5301 aBonchi, Francesco1 aGiannotti, Fosca1 aMazzanti, Alessio1 aPedreschi, Dino uhttps://kdd.isti.cnr.it/content/pre-processing-constrained-pattern-mining00487nas a2200145 4500008004100000245006000041210005900101300001100160490000700171100002100178700001800199700002100217700001900238856008400257 1997 eng d00aProgramming with Non-Determinism in Deductive Databases0 aProgramming with NonDeterminism in Deductive Databases a97-1250 v191 aGiannotti, Fosca1 aGreco, Sergio1 aSaccà, Domenico1 aZaniolo, Carlo uhttps://kdd.isti.cnr.it/content/programming-non-determinism-deductive-databases00375nas a2200097 4500008004100000245006100041210005900102300001200161100002000173856008400193 1994 eng d00aA Proof Method for Runtime Properties of Prolog Programs0 aProof Method for Runtime Properties of Prolog Programs a584-5981 aPedreschi, Dino uhttps://kdd.isti.cnr.it/content/proof-method-runtime-properties-prolog-programs00361nas a2200109 4500008004100000245004300041210004300084300001000127100002200137700002000159856007200179 1994 eng d00aProving termination of Prolog 
programs0 aProving termination of Prolog programs a46-611 aMascellani, Paolo1 aPedreschi, Dino uhttps://kdd.isti.cnr.it/content/proving-termination-prolog-programs00387nas a2200109 4500008004100000245005100041210005100092300001200143100002200155700002000177856008000197 1991 eng d00aProving Termination of General Prolog Programs0 aProving Termination of General Prolog Programs a265-2891 aApt, Krzysztof, R1 aPedreschi, Dino uhttps://kdd.isti.cnr.it/content/proving-termination-general-prolog-programs00467nas a2200157 4500008004100000245004100041210003900082300001200121100002100133700002200154700001500176700001500191700002000206700001900226856006400245 1988 eng d00aA Progress Report on the LML Project0 aProgress Report on the LML Project a675-6841 aBertolino, Bruno1 aMancarella, Paolo1 aMeo, Luigi1 aNini, Luca1 aPedreschi, Dino1 aTurini, Franco uhttps://kdd.isti.cnr.it/content/progress-report-lml-project