00468nas a2200133   4500008004100000245007500041210006900116100001900185700002300204700002700227700001900254700002400273856003700297       2024                        eng d00aPUFFLE: Balancing Privacy, Utility, and Fairness in Federated Learning0 aPUFFLE Balancing Privacy Utility and Fairness in Federated Learn1 aCorbucci, Luca1 aHeikkila, Mikko, A1 aNoguero, David, Solans1 aMonreale, Anna1 aKourtellis, Nicolas  uhttps://arxiv.org/abs/2407.1522400476nas a2200145   4500008004100000245006400041210006400105260001300169100002100182700001900203700002300222700001900245700001700264856004900281       2021                        eng d00aPrivacy Risk Assessment of Individual Psychometric Profiles0 aPrivacy Risk Assessment of Individual Psychometric Profiles  bSpringer1 aMariani, Giacomo1 aMonreale, Anna1 aNaretto, Francesca1 aSoares, Carlos1 aTorgo, Luís  uhttps://doi.org/10.1007/978-3-030-88942-5_3202034nas a2200217   4500008004100000020002200041245006900063210006900132260005200201520129400253100002301547700002501570700001901595700002701614700002001641700002101661700002501682700002501707700001701732856006701749       2020                        eng d  a978-3-030-61527-700aPredicting and Explaining Privacy Risk Exposure in Mobility Data0 aPredicting and Explaining Privacy Risk Exposure in Mobility Data  aChambSpringer International Publishingc2020//3 aMobility data is a proxy of different social dynamics and its analysis enables a wide range of user services. Unfortunately, mobility data are very sensitive because the sharing of people’s whereabouts may arise serious privacy concerns. Existing frameworks for privacy risk assessment provide tools to identify and measure privacy risks, but they often (i) have high computational complexity; and (ii) are not able to provide users with a justification of the reported risks. In this paper, we propose expert, a new framework for the prediction and explanation of privacy risk on mobility data. We empirically evaluate privacy risk on real data, simulating a privacy attack with a state-of-the-art privacy risk assessment framework. We then extract individual mobility profiles from the data for predicting their risk. We compare the performance of several machine learning algorithms in order to identify the best approach for our task. Finally, we show how it is possible to explain privacy risk prediction on real data, using two algorithms: Shap, a feature importance-based method and Lore, a rule-based method. Overall, expert is able to provide a user with the privacy risk and an explanation of the risk itself. The experiments show excellent performance for the prediction task.1 aNaretto, Francesca1 aPellungrini, Roberto1 aMonreale, Anna1 aNardini, Franco, Maria1 aMusolesi, Mirco1 aAppice, Annalisa1 aTsoumakas, Grigorios1 aManolopoulos, Yannis1 aMatwin, Stan  uhttps://link.springer.com/chapter/10.1007/978-3-030-61527-7_2702783nas a2200517   4500008004100000020002200041245008500063210006900148260005200217520120400269100002301473700002501496700002701521700002101548700002101569700001801590700002101608700002201629700001901651700002501670700002301695700002201718700002201740700002101762700001601783700002001799700002601819700002401845700002001869700002201889700002301911700002001934700001901954700002201973700002001995700002002015700002002035700001802055700001902073700002302092700001802115700002002133700002402153700002102177856006702198       2020                        eng d  a978-3-030-65965-300aPrediction and Explanation of Privacy Risk on Mobility Data with Neural Networks0 aPrediction and Explanation of Privacy Risk on Mobility Data with  aChambSpringer International Publishingc2020//3 aThe analysis of privacy risk for mobility data is a fundamental part of any privacy-aware process based on such data. Mobility data are highly sensitive. Therefore, the correct identification of the privacy risk before releasing the data to the public is of utmost importance. However, existing privacy risk assessment frameworks have high computational complexity. To tackle these issues, some recent work proposed a solution based on classification approaches to predict privacy risk using mobility features extracted from the data. In this paper, we propose an improvement of this approach by applying long short-term memory (LSTM) neural networks to predict the privacy risk directly from original mobility data. We empirically evaluate privacy risk on real data by applying our LSTM-based approach. Results show that our proposed method based on a LSTM network is effective in predicting the privacy risk with results in terms of F1 of up to 0.91. Moreover, to explain the predictions of our model, we employ a state-of-the-art explanation algorithm, Shap. We explore the resulting explanation, showing how it is possible to provide effective predictions while explaining them to the end-user.1 aNaretto, Francesca1 aPellungrini, Roberto1 aNardini, Franco, Maria1 aGiannotti, Fosca1 aKoprinska, Irena1 aKamp, Michael1 aAppice, Annalisa1 aLoglisci, Corrado1 aAntonie, Luiza1 aZimmermann, Albrecht1 aGuidotti, Riccardo1 aÖzgöbek, Özlem1 aRibeiro, Rita, P.1 aGavaldà, Ricard1 aGama, João1 aAdilova, Linara1 aKrishnamurthy, Yamuna1 aFerreira, Pedro, M.1 aMalerba, Donato1 aMedeiros, Ibéria1 aCeci, Michelangelo1 aManco, Giuseppe1 aMasciari, Elio1 aRas, Zbigniew, W.1 aChristen, Peter1 aNtoutsi, Eirini1 aSchubert, Erich1 aZimek, Arthur1 aMonreale, Anna1 aBiecek, Przemyslaw1 aRinzivillo, S1 aKille, Benjamin1 aLommatzsch, Andreas1 aGulla, Jon, Atle  uhttps://link.springer.com/chapter/10.1007/978-3-030-65965-3_3401574nas a2200193   4500008004100000020001400041245005500055210005400110260001600164300001100180490000800191520100500199100002301204700002301227700001801250700001901268700002101287856007201308       2020                        eng d  a0169-023X00aPRIMULE: Privacy risk mitigation for user profiles0 aPRIMULE Privacy risk mitigation for user profiles  c2020/01/01/  a1017860 v1253 aThe availability of mobile phone data has encouraged the development of different data-driven tools, supporting social science studies and providing new data sources to the standard official statistics. However, this particular kind of data are subject to privacy concerns because they can enable the inference of personal and private information. In this paper, we address the privacy issues related to the sharing of user profiles, derived from mobile phone data, by proposing PRIMULE, a privacy risk mitigation strategy. Such a method relies on PRUDEnce (Pratesi et al., 2018), a privacy risk assessment framework that provides a methodology for systematically identifying risky-users in a set of data. An extensive experimentation on real-world data shows the effectiveness of PRIMULE strategy in terms of both quality of mobile user profiles and utility of these profiles for analytical services such as the Sociometer (Furletti et al., 2013), a data mining tool for city users classification.1 aPratesi, Francesca1 aGabrielli, Lorenzo1 aCintia, Paolo1 aMonreale, Anna1 aGiannotti, Fosca  uhttps://www.sciencedirect.com/science/article/pii/S0169023X1830534201999nas a2200181   4500008004100000245011100041210006900152300001100221490000700232520140700239100002101646700001801667700002101685700002301706700002001729700002101749856004701770       2019                        eng d00aPlayeRank: data-driven performance evaluation and player ranking in soccer via a machine learning approach0 aPlayeRank datadriven performance evaluation and player ranking i  a1–270 v103 aThe problem of evaluating the performance of soccer players is attracting the interest of many companies and the scientific community, thanks to the availability of massive data capturing all the events generated during a match (e.g., tackles, passes, shots, etc.). Unfortunately, there is no consolidated and widely accepted metric for measuring performance quality in all of its facets. In this article, we design and implement PlayeRank, a data-driven framework that offers a principled multi-dimensional and role-aware evaluation of the performance of soccer players. We build our framework by deploying a massive dataset of soccer-logs and consisting of millions of match events pertaining to four seasons of 18 prominent soccer competitions. By comparing PlayeRank to known algorithms for performance evaluation in soccer, and by exploiting a dataset of players’ evaluations made by professional soccer scouts, we show that PlayeRank significantly outperforms the competitors. We also explore the ratings produced by PlayeRank and discover interesting patterns about the nature of excellent performances and what distinguishes the top players from the others. At the end, we explore some applications of PlayeRank—i.e. searching players and player versatility—showing its flexibility and efficiency, which makes it worth to be used in the design of a scalable platform for soccer analytics.1 aPappalardo, Luca1 aCintia, Paolo1 aFerragina, Paolo1 aMassucco, Emanuele1 aPedreschi, Dino1 aGiannotti, Fosca  uhttps://dl.acm.org/doi/abs/10.1145/334317201650nas a2200301   4500008004100000020002200041245004800063210004800111260005200159520072600211100002500937700001900962700002300981700001901004700001901023700001901042700002101061700002001082700002201102700002101124700002301145700002101168700002401189700002301213700002201236700002301258856006701281       2019                        eng d  a978-3-030-13463-100aPrivacy Risk for Individual Basket Patterns0 aPrivacy Risk for Individual Basket Patterns  aChambSpringer International Publishingc2019//3 aRetail data are of fundamental importance for businesses and enterprises that want to understand the purchasing behaviour of their customers. Such data is also useful to develop analytical services and for marketing purposes, often based on individual purchasing patterns. However, retail data and extracted models may also provide very sensitive information to possible malicious third parties. Therefore, in this paper we propose a methodology for empirically assessing privacy risk in the releasing of individual purchasing data. The experiments on real-world retail data show that although individual patterns describe a summary of the customer activity, they may be successful used for the customer re-identifiation.1 aPellungrini, Roberto1 aMonreale, Anna1 aGuidotti, Riccardo1 aAlzate, Carlos1 aMonreale, Anna1 aBioglio, Livio1 aBitetta, Valerio1 aBordino, Ilaria1 aCaldarelli, Guido1 aFerretti, Andrea1 aGuidotti, Riccardo1 aGullo, Francesco1 aPascolutti, Stefano1 aPensa, Ruggero, G.1 aRobardet, Céline1 aSquartini, Tiziano  uhttps://link.springer.com/chapter/10.1007/978-3-030-13463-1_1101689nas a2200193   4500008004100000245007700041210006900118300001100187490000600198520109400204100002101298700001801319700001901337700002301356700002101379700002001400700002101420856005401441       2019                        eng d00aA public data set of spatio-temporal match events in soccer competitions0 apublic data set of spatiotemporal match events in soccer competi  a1–150 v63 aSoccer analytics is attracting increasing interest in academia and industry, thanks to the availability of sensing technologies that provide high-fidelity data streams for every match. Unfortunately, these detailed data are owned by specialized companies and hence are rarely publicly available for scientific research. To fill this gap, this paper describes the largest open collection of soccer-logs ever released, containing all the spatio-temporal events (passes, shots, fouls, etc.) that occured during each match for an entire season of seven prominent soccer competitions. Each match event contains information about its position, time, outcome, player and characteristics. The nature of team sports like soccer, halfway between the abstraction of a game and the reality of complex social systems, combined with the unique size and composition of this dataset, provide an ideal ground for tackling a wide range of data science problems, including the measurement and evaluation of performance, both at individual and at collective level, and the determinants of success and failure.1 aPappalardo, Luca1 aCintia, Paolo1 aRossi, Alessio1 aMassucco, Emanuele1 aFerragina, Paolo1 aPedreschi, Dino1 aGiannotti, Fosca  uhttps://www.nature.com/articles/s41597-019-0247-700404nas a2200121   4500008004100000245004000041210004000081100001700121700002100138700002000159700002100179856008200200       2019                        eng d00aPublic opinion and Algorithmic bias0 aPublic opinion and Algorithmic bias1 aSirbu, Alina1 aGiannotti, Fosca1 aPedreschi, Dino1 aKertész, János  uhttps://ercim-news.ercim.eu/en116/special/public-opinion-and-algorithmic-bias01676nas a2200145   4500008004100000245008600041210006900127520117000196100002301366700002101389700002101410700002101431700002001452856005801472       2018                        eng d00aPersonalized Market Basket Prediction with Temporal Annotated Recurring Sequences0 aPersonalized Market Basket Prediction with Temporal Annotated Re3 aNowadays, a hot challenge for supermarket chains is to offer personalized services to their customers. Market basket prediction, i.e., supplying the customer a shopping list for the next purchase according to her current needs, is one of these services. Current approaches are not capable of capturing at the same time the different factors influencing the customer's decision process: co-occurrence, sequentuality, periodicity and recurrency of the purchased items. To this aim, we define a pattern Temporal Annotated Recurring Sequence (TARS) able to capture simultaneously and adaptively all these factors. We define the method to extract TARS and develop a predictor for next basket named TBP (TARS Based Predictor) that, on top of TARS, is able to understand the level of the customer's stocks and recommend the set of most necessary items. By adopting the TBP the supermarket chains could crop tailored suggestions for each individual customer which in turn could effectively speed up their shopping sessions. A deep experimentation shows that TARS are able to explain the customer purchase behavior, and that TBP outperforms the state-of-the-art competitors.1 aGuidotti, Riccardo1 aRossetti, Giulio1 aPappalardo, Luca1 aGiannotti, Fosca1 aPedreschi, Dino  uhttps://ieeexplore.ieee.org/abstract/document/847715701950nas a2200181   4500008004100000245008800041210006900129260001200198490000700210520137400217100002301591700001901614700002201633700002101655700002001676700002401696856004801720       2018                        eng d00aPRUDEnce: a system for assessing privacy risk vs utility in data sharing ecosystems0 aPRUDEnce a system for assessing privacy risk vs utility in data   c08/20180 v113 aData describing human activities are an important source of knowledge useful for understanding individual and collective behavior and for developing a wide range of user services. Unfortunately, this kind of data is sensitive, because people’s whereabouts may allow re-identification of individuals in a de-identified database. Therefore, Data Providers, before sharing those data, must apply any sort of anonymization to lower the privacy risks, but they must be aware and capable of controlling also the data quality, since these two factors are often a trade-off. In this paper we propose PRUDEnce (Privacy Risk versus Utility in Data sharing Ecosystems), a system enabling a privacy-aware ecosystem for sharing personal data. It is based on a methodology for assessing both the empirical (not theoretical) privacy risk associated to users represented in the data, and the data quality guaranteed only with users not at risk. Our proposal is able to support the Data Provider in the exploration of a repertoire of possible data transformations with the aim of selecting one specific transformation that yields an adequate trade-off between data quality and privacy risk. We study the practical effectiveness of our proposal over three data formats underlying many services, defined on real mobility data, i.e., presence data, trajectory data and road segment data.1 aPratesi, Francesca1 aMonreale, Anna1 aTrasarti, Roberto1 aGiannotti, Fosca1 aPedreschi, Dino1 aYanagihara, Tadashi  uhttp://www.tdp.cat/issues16/tdp.a284a17.pdf01373nas a2200145   4500008004100000245005000041210005000091260001300141520092300154100002301077700001901100700002101119700002001140856006701160       2017                        eng d00aPrivacy Preserving Multidimensional Profiling0 aPrivacy Preserving Multidimensional Profiling  bSpringer3 aRecently, big data had become central in the analysis of human behavior and the development of innovative services. In particular, a new class of services is emerging, taking advantage of different sources of data, in order to consider the multiple aspects of human beings. Unfortunately, these data can lead to re-identification problems and other privacy leaks, as diffusely reported in both scientific literature and media. The risk is even more pressing if multiple sources of data are linked together since a potential adversary could know information related to each dataset. For this reason, it is necessary to evaluate accurately and mitigate the individual privacy risk before releasing personal data. In this paper, we propose a methodology for the first task, i.e., assessing privacy risk, in a multidimensional scenario, defining some possible privacy attacks and simulating them using real-world datasets.1 aPratesi, Francesca1 aMonreale, Anna1 aGiannotti, Fosca1 aPedreschi, Dino  uhttps://link.springer.com/chapter/10.1007/978-3-319-76111-4_1501341nas a2200169   4500008004100000245006100041210006000102260003800162300001400200520081200214100002001026700001501046700001901061700001701080700002301097856005101120       2016                        eng d00aPartition-Based Clustering Using Constraint Optimization0 aPartitionBased Clustering Using Constraint Optimization  bSpringer International Publishing  a282–2993 aPartition-based clustering is the task of partitioning a dataset in a number of groups of examples, such that examples in each group are similar to each other. Many criteria for what constitutes a good clustering have been identified in the literature; furthermore, the use of additional constraints to find more useful clusterings has been proposed. In this chapter, it will be shown that most of these clustering tasks can be formalized using optimization criteria and constraints. We demonstrate how a range of clustering tasks can be modelled in generic constraint programming languages with these constraints and optimization criteria. Using the constraint-based modeling approach we also relate the DBSCAN method for density-based clustering to the label propagation technique for community discovery.1 aGrossi, Valerio1 aGuns, Tias1 aMonreale, Anna1 aNanni, Mirco1 aNijssen, Siegfried  uhttp://dx.doi.org/10.1007/978-3-319-50137-6_1101568nas a2200133   4500008004100000245008400041210006900125260003600194490001400230520111700244100001701361700002001378856003601398       2016                        eng d00aPower Consumption Modeling and Prediction in a Hybrid CPU-GPU-MIC Supercomputer0 aPower Consumption Modeling and Prediction in a Hybrid CPUGPUMIC   aGrenoble, FrancebSpringer LNCS0 vLNCS 98333 aPower consumption is a major obstacle for High Performance Computing (HPC) systems in their quest towards the holy grail of ExaFLOP performance. Significant advances in power efficiency have to be made before this goal can be attained and accurate modeling is an essential step towards power efficiency by optimizing system operating parameters to match dynamic energy needs. In this paper we present a study of power consumption by jobs in Eurora, a hybrid CPU-GPU-MIC system installed at the largest Italian data center. Using data from a dedicated monitoring framework, we build a data-driven model of power consumption for each user in the system and use it to predict the power requirements of future jobs. We are able to achieve good prediction results for over 80 % of the users in the system. For the remaining users, we identify possible reasons why prediction performance is not as good. Possible applications for our predictive modeling results include scheduling optimization, power-aware billing and system-scale power modeling. All the scripts used for the study have been made available on GitHub.1 aSirbu, Alina1 aBabaoglu, Ozalp  uhttp://arxiv.org/abs/1601.0596101787nas a2200121   4500008004100000245006100041210006000102260003800162520137900200100001701579700002001596856004901616       2016                        eng d00aPredicting System-level Power for a Hybrid Supercomputer0 aPredicting Systemlevel Power for a Hybrid Supercomputer  aInnsbruck, AustriabIEEEc07/20163 aFor current High Performance Computing systems to scale towards the holy grail of ExaFLOP performance, their power consumption has to be reduced by at least one order of magnitude. This goal can be achieved only through a combination of hardware and software advances. Being able to model and accurately predict the power consumption of large computational systems is necessary for software-level innovations such as proactive and power-aware scheduling, resource allocation and fault tolerance techniques. In this paper we present a 2-layer model of power consumption for a hybrid supercomputer (which held the top spot of the Green500 list on July 2013) that combines CPU, GPU and MIC technologies to achieve higher energy efficiency. Our model takes as input workload information - the number and location of resources that are used by each job at a certain time - and calculates the resulting system-level power consumption. When jobs are submitted to the system, the workload configuration can be foreseen based on the scheduler policies, and our model can then be applied to predict the ensuing system-level power consumption. Additionally, alternative workload configurations can be evaluated from a power perspective and more efficient ones can be selected. Applications of the model include not only power-aware scheduling but also prediction of anomalous behavior.1 aSirbu, Alina1 aBabaoglu, Ozalp  uhttp://ieeexplore.ieee.org/document/7568420/01227nas a2200121   4500008004100000245005000041210004900091260004500140520083300185100001901018700002101037856004701058       2016                        eng d00aPrivacy-Preserving Outsourcing of Data Mining0 aPrivacyPreserving Outsourcing of Data Mining  a Atlanta, GA, USAbIEEE Computer Society3 aData mining is gaining momentum in society due to the ever increasing availability of large amounts of data, easily gathered by a variety of collection technologies and stored via computer systems. Due to the limited computational resources of data owners and the developments in cloud computing, there has been considerable recent interest in the paradigm of data mining-as-a-service (DMaaS). In this paradigm, a company (data owner) lacking in expertise or computational resources outsources its mining needs to a third party service provider (server). Given the fact that the server may not be fully trusted, one of the main concerns of the DMaaS paradigm is the protection of data privacy. In this paper, we provide an overview of a variety of techniques and approaches that address the privacy issues of the DMaaS paradigm.1 aMonreale, Anna1 aWang, Hui, Wendy  uhttp://dx.doi.org/10.1109/COMPSAC.2016.16901567nas a2200145   4500008004100000245010400041210006900145260000900214520098000223100002501203700001901228700002301247700002301270856012801293       2016                        eng d00aPrivacy-Preserving Outsourcing of Pattern Mining of Event-Log Data-A Use-Case from Process Industry0 aPrivacyPreserving Outsourcing of Pattern Mining of EventLog Data  bIEEE3 aWith the advent of cloud computing and its model for IT services based on the Internet and big data centers, the interest of industries into XaaS ("Anything as a Service") paradigm is increasing. Business intelligence and knowledge discovery services are typical services that companies tend to externalize on the cloud, due to their data intensive nature and the algorithms complexity. What is appealing for a company is to rely on external expertise and infrastructure to compute the analytical results and models which are required by the business analysts for understanding the business phenomena under observation. Although it is advantageous to achieve sophisticated analysis there exist several serious privacy issues in this paradigm. In this paper we investigate through an industrial use-case the application of a framework for privacypreserving outsourcing of pattern mining on event-log data. Moreover, we present and discuss some ideas about possible extensions.1 aMarrella, Alessandro1 aMonreale, Anna1 aKloepper, Benjamin1 aKrueger, Martin, W  uhttps://kdd.isti.cnr.it/publications/privacy-preserving-outsourcing-pattern-mining-event-log-data-use-case-process-industry02544nas a2200373   4500008004100000022001400041245008200055210006900137260000900206300001300215490000700228520142000235100001701655700001901672700002201691700002201713700001501735700002001750700002001770700001901790700002101809700002101830700001901851700002101870700001601891700002601907700002001933700002401953700001701977700001701994700002002011700002702031856011202058       2015                        eng d  a1932-620300aParticipatory Patterns in an International Air Quality Monitoring Initiative.0 aParticipatory Patterns in an International Air Quality Monitorin  c2015  ae01367630 v103 a<p>The issue of sustainability is at the top of the political and societal agenda, being considered of extreme importance and urgency. Human individual action impacts the environment both locally (e.g., local air/water quality, noise disturbance) and globally (e.g., climate change, resource use). Urban environments represent a crucial example, with an increasing realization that the most effective way of producing a change is involving the citizens themselves in monitoring campaigns (a citizen science bottom-up approach). This is possible by developing novel technologies and IT infrastructures enabling large citizen participation. Here, in the wider framework of one of the first such projects, we show results from an international competition where citizens were involved in mobile air pollution monitoring using low cost sensing devices, combined with a web-based game to monitor perceived levels of pollution. Measures of shift in perceptions over the course of the campaign are provided, together with insights into participatory patterns emerging from this study. Interesting effects related to inertia and to direct involvement in measurement activities rather than indirect information exposure are also highlighted, indicating that direct involvement can enhance learning and environmental awareness. In the future, this could result in better adoption of policies towards decreasing pollution.</p>1 aSirbu, Alina1 aBecker, Martin1 aCaminiti, Saverio1 aDe Baets, Bernard1 aElen, Bart1 aFrancis, Louise1 aGravino, Pietro1 aHotho, Andreas1 aIngarra, Stefano1 aLoreto, Vittorio1 aMolino, Andrea1 aMueller, Juergen1 aPeters, Jan1 aRicchiuti, Ferdinando1 aSaracino, Fabio1 aServedio, Vito, D P1 aStumme, Gerd1 aTheunis, Jan1 aTria, Francesca1 aVan den Bossche, Joris  uhttps://kdd.isti.cnr.it/publications/participatory-patterns-international-air-quality-monitoring-initiative02010nas a2200157   4500008004100000245004500041210004500086260001200131300001100143490000600154520154300160100002001703700002401723700002101747856008401768       2015                        eng d00aProduct assortment and customer mobility0 aProduct assortment and customer mobility  c10-2015  a1–180 v43 aCustomers mobility is dependent on the sophistication of their needs: sophisticated customers need to travel more to fulfill their needs. In this paper, we provide more detailed evidence of this phenomenon, providing an empirical validation of the Central Place Theory. For each customer, we detect what is her favorite shop, where she purchases most products. We can study the relationship between the favorite shop and the closest one, by recording the influence of the shop’s size and the customer’s sophistication in the discordance cases, i.e. the cases in which the favorite shop is not the closest one. We show that larger shops are able to retain most of their closest customers and they are able to catch large portions of customers from smaller shops around them. We connect this observation with the shop’s larger sophistication, and not with its other characteristics, as the phenomenon is especially noticeable when customers want to satisfy their sophisticated needs. This is a confirmation of the recent extensions of the Central Place Theory, where the original assumptions of homogeneity in customer purchase power and needs are challenged. Different types of shops have also different survival logics. The largest shops get closed if they are unable to catch customers from the smaller shops, while medium size shops get closed if they cannot retain their closest customers. All analysis are performed on a large real-world dataset recording all purchases from millions of customers across the west coast of Italy.1 aCoscia, Michele1 aPennacchioli, Diego1 aGiannotti, Fosca  uhttp://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-015-0051-300482nas a2200145   4500008004100000245006800041210006300109100002400172700002100196700002100217700002000238700002100258700002000279856003700299       2014                        eng d00aThe patterns of musical influence on the Last.Fm social network0 apatterns of musical influence on the LastFm social network1 aPennacchioli, Diego1 aRossetti, Giulio1 aPappalardo, Luca1 aPedreschi, Dino1 aGiannotti, Fosca1 aCoscia, Michele  uhttps://kdd.isti.cnr.it/node/62301645nas a2200205   4500008003900000245004500039210004300084300001400127520105800141100001801199700001901217700002401236700002101260700002001281700002301301700001901324700002401343700002201367856005001389       2014                          d00aA Privacy Risk Model for Trajectory Data0 aPrivacy Risk Model for Trajectory Data  a125–1403 aTime sequence data relating to users, such as medical histories and mobility data, are good candidates for data mining, but often contain highly sensitive information. Different methods in privacy-preserving data publishing are utilised to release such private data so that individual records in the released data cannot be re-linked to specific users with a high degree of certainty. These methods provide theoretical worst-case privacy risks as measures of the privacy protection that they offer. However, often with many real-world data the worst-case scenario is too pessimistic and does not provide a realistic view of the privacy risks: the real probability of re-identification is often much lower than the theoretical worst-case risk. In this paper we propose a novel empirical risk model for privacy which, in relation to the cost of privacy attacks, demonstrates better the practical risks associated with a privacy preserving data release. We show detailed evaluation of the proposed risk model by using k-anonymised real-world mobility data.1 aBasu, Anirban1 aMonreale, Anna1 aCorena, Juan Camilo1 aGiannotti, Fosca1 aPedreschi, Dino1 aKiyomoto, Shinsaku1 aMiyake, Yutaka1 aYanagihara, Tadashi1 aTrasarti, Roberto  uhttp://dx.doi.org/10.1007/978-3-662-43813-8_901574nas a2200157   4500008003900000245006200039210006000101490000700161520105400168100001901222700001801241700002301259700002101282700002001303856009301323       2014                          d00aPrivacy-by-Design in Big Data Analytics and Social Mining0 aPrivacybyDesign in Big Data Analytics and Social Mining0 v103 aPrivacy is ever-growing concern in our society and is becoming a fundamental aspect to take into account when one wants to use, publish and analyze data involving human personal sensitive information. Unfortunately, it is increasingly hard to transform the data in a way that it protects sensitive information: we live in the era of big data characterized by unprecedented opportunities to sense, store and analyze social data describing human activities in great detail and resolution. As a result, privacy preservation simply cannot be accomplished by de-identification alone. In this paper, we propose the privacy-by-design paradigm to develop technological frameworks for countering the threats of undesirable, unlawful effects of privacy violation, without obstructing the knowledge discovery opportunities of social mining and big data analytical technologies. Our main idea is to inscribe privacy protection into the knowledge discovery technology by design, so that the analysis incorporates the relevant privacy requirements from the start.1 aMonreale, Anna1 aRinzivillo, S1 aPratesi, Francesca1 aGiannotti, Fosca1 aPedreschi, Dino  uhttps://kdd.isti.cnr.it/publications/privacy-design-big-data-analytics-and-social-mining01411nas a2200133   4500008004100000245008100041210006900122260001900191520090300210100002001113700002001133700001901153856010501172       2014                        eng d00aProcess mining event logs from FLOSS data: state of the art and perspectives0 aProcess mining event logs from FLOSS data state of the art and p  bSpringer, Cham3 aFree/Libre Open Source Software (FLOSS) is a phenomenon that has undoubtedly triggered extensive research endeavors. At the heart of these initiatives is the ability to mine data from FLOSS repositories with the hope of revealing empirical evidence to answer existing questions on the FLOSS development process. In spite of the success produced with existing mining techniques, emerging questions about FLOSS data require alternative and more appropriate ways to explore and analyse such data.

In this paper, we explore a different perspective called process mining. Process mining has been proved to be successful in terms of tracing and reconstructing process models from data logs (event logs). The chief objective of our analysis is threefold. We aim to achieve: (1) conformance to predefined models; (2) discovery of new model patterns; and, finally, (3) extension to predefined models.

1 aMukala, Patrick1 aCerone, Antonio1 aTurini, Franco  uhttps://kdd.isti.cnr.it/publications/process-mining-event-logs-floss-data-state-art-and-perspectives00505nas a2200145   4500008004100000245008100041210006900122100001800191700002300209700001700232700002100249700002000270700002100290856004800311       2014                        eng d00aThe purpose of motion: Learning activities from Individual Mobility Networks0 apurpose of motion Learning activities from Individual Mobility N1 aRinzivillo, S1 aGabrielli, Lorenzo1 aNanni, Mirco1 aPappalardo, Luca1 aPedreschi, Dino1 aGiannotti, Fosca  uhttp://dx.doi.org/10.1109/DSAA.2014.705809000516nas a2200121   4500008003900000245008700039210006900126100002200195700002300217700001800240700001800258856011800276       2013                          d00aPisa Tourism fluxes Observatory: deriving mobility indicators from GSM call habits0 aPisa Tourism fluxes Observatory deriving mobility indicators fro1 aFurletti, Barbara1 aGabrielli, Lorenzo1 aRenso, Chiara1 aRinzivillo, S  uhttps://kdd.isti.cnr.it/publications/pisa-tourism-fluxes-observatory-deriving-mobility-indicators-gsm-call-habits01416nas a2200181   4500008004100000245005400041210005300095260002000148520088200168100002301050700001901073700002101092700001801113700002001131700002301151700002301174856003701197       2013                        eng d00aPrivacy-Aware Distributed Mobility Data Analytics0 aPrivacyAware Distributed Mobility Data Analytics  aRoccella Jonica3 aWe propose an approach to preserve privacy in an analytical processing within a distributed setting, and tackle the problem of obtaining aggregated information about vehicle traffic in a city from movement data collected by individual vehicles and shipped to a central server. Movement data are sensitive because they may describe typical movement behaviors and therefore be used for re-identification of individuals in a database. We provide a privacy-preserving framework for movement data aggregation based on trajectory generalization in a distributed environment. The proposed solution, based on the differential privacy model and on sketching techniques for efficient data compression, provides a formal data protection safeguard. Using real-life data, we demonstrate the effectiveness of our approach also in terms of data utility preserved by the data transformation.
1 aPratesi, Francesca1 aMonreale, Anna1 aWang, Hui, Wendy1 aRinzivillo, S1 aPedreschi, Dino1 aAndrienko, Gennady1 aAndrienko, Natalia  uhttps://kdd.isti.cnr.it/node/61501685nas a2200241   4500008003900000020002200039245006100061210006000122260003800182300001200220520094300232100001901175700002101194700002301215700001801238700002001256700002301276700002301299700002501322700002401347700002101371856005101392       2013                          d  a978-3-319-00614-700aPrivacy-Preserving Distributed Movement Data Aggregation0 aPrivacyPreserving Distributed Movement Data Aggregation  bSpringer International Publishing  a225-2453 aWe propose a novel approach to privacy-preserving analytical processing within a distributed setting, and tackle the problem of obtaining aggregated information about vehicle traffic in a city from movement data collected by individual vehicles and shipped to a central server. Movement data are sensitive because people’s whereabouts have the potential to reveal intimate personal traits, such as religious or sexual preferences, and may allow re-identification of individuals in a database. We provide a privacy-preserving framework for movement data aggregation based on trajectory generalization in a distributed environment. The proposed solution, based on the differential privacy model and on sketching techniques for efficient data compression, provides a formal data protection safeguard. Using real-life data, we demonstrate the effectiveness of our approach also in terms of data utility preserved by the data transformation.1 aMonreale, Anna1 aWang, Hui, Wendy1 aPratesi, Francesca1 aRinzivillo, S1 aPedreschi, Dino1 aAndrienko, Gennady1 aAndrienko, Natalia1 aVandenbroucke, Danny1 aBucher, Bénédicte1 aCrompvoets, Joep  uhttp://dx.doi.org/10.1007/978-3-319-00615-4_1301777nas a2200145   4500008003900000245008900039210006900128520121300197100002101410700002201431700001901453700002001472700002101492856011801513       2013                          d00aPrivacy-Preserving Mining of Association Rules From Outsourced Transaction Databases0 aPrivacyPreserving Mining of Association Rules From Outsourced Tr3 aSpurred by developments such as cloud computing, there has been considerable recent interest in the paradigm of data mining-as-a-service. A company (data owner) lacking in expertise or computational resources can outsource its mining needs to a third party service provider (server). However, both the items and the association rules of the outsourced database are considered private property of the corporation (data owner). To protect corporate privacy, the data owner transforms its data and ships it to the server, sends mining queries to the server, and recovers the true patterns from the extracted patterns received from the server. In this paper, we study the problem of outsourcing the association rule mining task within a corporate privacy-preserving framework. We propose an attack model based on background knowledge and devise a scheme for privacy preserving outsourced mining. Our scheme ensures that each transformed item is indistinguishable with respect to the attacker's background knowledge, from at least k-1 other transformed items. Our comprehensive experiments on a very large and real transaction database demonstrate that our techniques are effective, scalable, and protect privacy.1 aGiannotti, Fosca1 aLakshmanan, L V S1 aMonreale, Anna1 aPedreschi, Dino1 aWang, Hui, Wendy  uhttps://kdd.isti.cnr.it/publications/privacy-preserving-mining-association-rules-outsourced-transaction-databases00505nas a2200133   4500008003900000245005400039210005100093100003200144700002300176700003100199700003800230700001800268856008500286       2013                          d00aA Proactive Ap- plication to Monitor Truck Fleets0 aProactive Ap plication to Monitor Truck Fleets1 aAlbuquerque, Fabio Da Costa1 aCasanova, Marco, A1 aCarvalho, Marcelo Tilio, M1 aMacêdo, José Antônio Fernandes1 aRenso, Chiara  uhttps://kdd.isti.cnr.it/publications/proactive-ap-plication-monitor-truck-fleets01493nas a2200157   4500008003900000245006200039210006000101260000900161520096900170100002101139700002201160700001901182700002001201700002101221856009301242       2011                          d00aPrivacy-preserving data mining from outsourced databases.0 aPrivacypreserving data mining from outsourced databases  c20113 aSpurred by developments such as cloud computing, there has been considerable recent interest in the paradigm of data mining-as-service: a company (data owner) lacking in expertise or computational resources can outsource its mining needs to a third party service provider (server). However, both the outsourced database and the knowledge extract from it by data mining are considered private property of the data owner. To protect corporate privacy, the data owner transforms its data and ships it to the server, sends mining queries to the server, and recovers the true patterns from the extracted patterns received from the server. In this paper, we study the problem of outsourcing a data mining task within a corporate privacy-preserving framework. We propose a scheme for privacy-preserving outsourced mining which offers a formal protection against information disclosure, and show that the data owner can recover the correct data mining results efficiently.1 aGiannotti, Fosca1 aLakshmanan, L V S1 aMonreale, Anna1 aPedreschi, Dino1 aWang, Hui, Wendy  uhttps://kdd.isti.cnr.it/publications/privacy-preserving-data-mining-outsourced-databases01999nas a2200169   4500008003900000245008200039210006900121300001200190490000600202520141100208100002501619700002001644700002101664700001901685700002001704856010501724       2011                          d00aThe pursuit of hubbiness: Analysis of hubs in large multidimensional networks0 apursuit of hubbiness Analysis of hubs in large multidimensional   a223-2370 v23 aHubs are highly connected nodes within a network. In complex network analysis, hubs have been widely studied, and are at the basis of many tasks, such as web search and epidemic outbreak detection. In reality, networks are often multidimensional, i.e., there can exist multiple connections between any pair of nodes. In this setting, the concept of hub depends on the multiple dimensions of the network, whose interplay becomes crucial for the connectedness of a node. In this paper, we characterize multidimensional hubs. We consider the multidimensional generalization of the degree and introduce a new class of measures, that we call Dimension Relevance, aimed at analyzing the importance of different dimensions for the hubbiness of a node. We assess the meaningfulness of our measures by comparing them on real networks and null models, then we study the interplay among dimensions and their effect on node connectivity. Our findings show that: (i) multidimensional hubs do exist and their characterization yields interesting insights and (ii) it is possible to detect the most influential dimensions that cause the different hub behaviors. We demonstrate the usefulness of multidimensional analysis in three real world domains: detection of ambiguous query terms in a word–word query log network, outlier detection in a social network, and temporal analysis of behaviors in a co-authorship network.1 aBerlingerio, Michele1 aCoscia, Michele1 aGiannotti, Fosca1 aMonreale, Anna1 aPedreschi, Dino  uhttps://kdd.isti.cnr.it/publications/pursuit-hubbiness-analysis-hubs-large-multidimensional-networks01643nas a2200157   4500008003900000245007100039210006900110300001000179520109600189100001901285700002201304700001801326700002001344700001901364856010201383       2010                          d00aPreserving privacy in semantic-rich trajectories of human mobility0 aPreserving privacy in semanticrich trajectories of human mobilit  a47-543 aThe increasing abundance of data about the trajectories of personal movement is opening up new opportunities for analyzing and mining human mobility, but new risks emerge since it opens new ways of intruding into personal privacy. Representing the personal movements as sequences of places visited by a person during her/his movements - semantic trajectory - poses even greater privacy threats w.r.t. raw geometric location data. In this paper we propose a privacy model defining the attack model of semantic trajectory linking, together with a privacy notion, called c-safety. This method provides an upper bound to the probability of inferring that a given person, observed in a sequence of nonsensitive places, has also stopped in any sensitive location. Coherently with the privacy model, we propose an algorithm for transforming any dataset of semantic trajectories into a c-safe one. We report a study on a real-life GPS trajectory dataset to show how our algorithm preserves interesting quality/utility measures of the original trajectories, such as sequential pattern mining results.1 aMonreale, Anna1 aTrasarti, Roberto1 aRenso, Chiara1 aPedreschi, Dino1 aBogorny, Vania  uhttps://kdd.isti.cnr.it/publications/preserving-privacy-semantic-rich-trajectories-human-mobility00460nas a2200109   4500008004100000245008300041210006900124260001200193100002000205700002100225856010400246       2009                        eng d00aPoverty as a Social Condition: a Case Study on a Small Municipality in Tuscany0 aPoverty as a Social Condition a Case Study on a Small Municipali  bSEAFORD1 aTomei, Gabriele1 aNatilli, Michela  uhttps://kdd.isti.cnr.it/publications/poverty-social-condition-case-study-small-municipality-tuscany01351nas a2200133   4500008003900000245009600039210006900135520084300204100002201047700001901069700001901088700002001107856009001127       2008                          d00aPattern-Preserving k-Anonymization of Sequences and its Application to Mobility Data Mining0 aPatternPreserving kAnonymization of Sequences and its Applicatio3 aSequential pattern mining is a major research field in knowledge
discovery and data mining. Thanks to the increasing availability of
transaction data, it is now possible to provide new and improved services
based on users’ and customers’ behavior. However, this puts the citizen’s
privacy at risk. Thus, it is important to develop new privacy-preserving
data mining techniques that do not alter the analysis results significantly.
In this paper we propose a new approach for anonymizing sequential
data by hiding infrequent, and thus potentially sensible, subsequences.
Our approach guarantees that the disclosed data are k-anonymous and
preserve the quality of extracted patterns. An application to a real-world
moving object database is presented, which shows the effectiveness of our
approach also in complex contexts.1 aPensa, Ruggero, G1 aMonreale, Anna1 aPinelli, Fabio1 aPedreschi, Dino  uhttps://air.unimi.it/retrieve/handle/2434/52786/106397/ProceedingsPiLBA08.pdf#page=4400657nas a2200181   4500008003900000245008000039210006900119300001200188100002000200700002200220700001900242700002700261700002100288700001900309700001800328700001900346856011000365       2008                          d00aPrivacy Protection: Regulations and Technologies, Opportunities and Threats0 aPrivacy Protection Regulations and Technologies Opportunities an  a101-1191 aPedreschi, Dino1 aBonchi, Francesco1 aTurini, Franco1 aVerykios, Vassilios, S1 aAtzori, Maurizio1 aMalin, Bradley1 aMoelans, Bart1 aSaygin, Yücel  uhttps://kdd.isti.cnr.it/content/privacy-protection-regulations-and-technologies-opportunities-and-threats00496nas a2200145   4500008004100000245005700041210005600098300001200154100002100166700002200187700002100209700002000230700001600250856008400266       2007                        eng d00aPrivacy-Aware Knowledge Discovery from Location Data0 aPrivacyAware Knowledge Discovery from Location Data  a283-2871 aAtzori, Maurizio1 aBonchi, Francesco1 aGiannotti, Fosca1 aPedreschi, Dino1 aAbul, Osman  uhttps://kdd.isti.cnr.it/content/privacy-aware-knowledge-discovery-location-data00577nas a2200145   4500008003900000020002200039245008000061210006900141260000900210100002200219700002100241700001800262700002000280856013100300       2007                          d  a978-972-8924-44-700aPUSHING CONSTRAINTS IN ASSOCIATION RULE MINING: AN ONTOLOGY-BASED APPROACH 0 aPUSHING CONSTRAINTS IN ASSOCIATION RULE MINING AN ONTOLOGYBASED   c20071 aFurletti, Barbara1 aBellandi, Andrea1 aRomei, Andrea1 aGrossi, Valerio  uhttp://www.iadisportal.org/digital-library/mdownload/pushing-constraints-in-association-rule-mining-an-ontology-based-approach00379nas a2200109   4500008004100000245004900041210004900090300000900139100002200148700002100170856007800191       2004                        eng d00aPushing Constraints to Detect Local Patterns0 aPushing Constraints to Detect Local Patterns  a1-191 aBonchi, Francesco1 aGiannotti, Fosca  uhttps://kdd.isti.cnr.it/content/pushing-constraints-detect-local-patterns00547nas a2200133   4500008004100000245010300041210006900144100001300213700001300226700001500239700002100254700001500275856012300290       2003                        eng d00aPersonal income in the gross and net forms: applications of the Siena Micro-Simulation Model (SM2)0 aPersonal income in the gross and net forms applications of the S1 aVerma, V1 aBetti, G1 aBallini, F1 aNatilli, Michela1 aGalgani, S  uhttps://kdd.isti.cnr.it/publications/personal-income-gross-and-net-forms-applications-siena-micro-simulation-model-sm200449nas a2200133   4500008004100000245005000041210004900091300001200140100002200152700002100174700002200195700002000217856007800237       2003                        eng d00aPre-processing for Constrained Pattern Mining0 aPreprocessing for Constrained Pattern Mining  a519-5301 aBonchi, Francesco1 aGiannotti, Fosca1 aMazzanti, Alessio1 aPedreschi, Dino  uhttps://kdd.isti.cnr.it/content/pre-processing-constrained-pattern-mining00487nas a2200145   4500008004100000245006000041210005900101300001100160490000700171100002100178700001800199700002100217700001900238856008400257       1997                        eng d00aProgramming with Non-Determinism in Deductive Databases0 aProgramming with NonDeterminism in Deductive Databases  a97-1250 v191 aGiannotti, Fosca1 aGreco, Sergio1 aSaccà, Domenico1 aZaniolo, Carlo  uhttps://kdd.isti.cnr.it/content/programming-non-determinism-deductive-databases00375nas a2200097   4500008004100000245006100041210005900102300001200161100002000173856008400193       1994                        eng d00aA Proof Method for Runtime Properties of Prolog Programs0 aProof Method for Runtime Properties of Prolog Programs  a584-5981 aPedreschi, Dino  uhttps://kdd.isti.cnr.it/content/proof-method-runtime-properties-prolog-programs00361nas a2200109   4500008004100000245004300041210004300084300001000127100002200137700002000159856007200179       1994                        eng d00aProving termination of Prolog programs0 aProving termination of Prolog programs  a46-611 aMascellani, Paolo1 aPedreschi, Dino  uhttps://kdd.isti.cnr.it/content/proving-termination-prolog-programs00387nas a2200109   4500008004100000245005100041210005100092300001200143100002200155700002000177856008000197       1991                        eng d00aProving Termination of General Prolog Programs0 aProving Termination of General Prolog Programs  a265-2891 aApt, Krzysztof, R1 aPedreschi, Dino  uhttps://kdd.isti.cnr.it/content/proving-termination-general-prolog-programs00467nas a2200157   4500008004100000245004100041210003900082300001200121100002100133700002200154700001500176700001500191700002000206700001900226856006400245       1988                        eng d00aA Progress Report on the LML Project0 aProgress Report on the LML Project  a675-6841 aBertolino, Bruno1 aMancarella, Paolo1 aMeo, Luigi1 aNini, Luca1 aPedreschi, Dino1 aTurini, Franco  uhttps://kdd.isti.cnr.it/content/progress-report-lml-project