%0 Journal Article
%J IEEE Transactions on Dependable and Secure Computing
%D 2017
%T Authenticated Outlier Mining for Outsourced Databases
%A Dong, Boxiang
%A Hui Wendy Wang
%A Anna Monreale
%A Dino Pedreschi
%A Fosca Giannotti
%A W Guo
%X The Data-Mining-as-a-Service (DMaS) paradigm is becoming the focus of research, as it allows the data owner (client) who lacks expertise and/or computational resources to outsource their data and mining needs to a third-party service provider (server). Outsourcing, however, raises some issues about result integrity: how could the client verify the mining results returned by the server are both sound and complete? In this paper, we focus on outlier mining, an important mining task. Previous verification techniques use an authenticated data structure (ADS) for correctness authentication, which may incur much space and communication cost. In this paper, we propose a novel solution that returns a probabilistic result integrity guarantee with much cheaper verification cost. The key idea is to insert a set of artificial records (ARs) into the dataset, from which it constructs a set of artificial outliers (AOs) and artificial non-outliers (ANOs). The AOs and ANOs are used by the client to detect any incomplete and/or incorrect mining results with a probabilistic guarantee. The main challenge that we address is how to construct ARs so that they do not change the (non-)outlierness of original records, while guaranteeing that the client can identify ANOs and AOs without executing mining. Furthermore, we build a strategic game and show that a Nash equilibrium exists only when the server returns correct outliers. Our implementation and experiments demonstrate that our verification solution is efficient and lightweight.
%B IEEE Transactions on Dependable and Secure Computing
%G eng
%U https://ieeexplore.ieee.org/document/8048342/
%R 10.1109/TDSC.2017.2754493

%0 Conference Paper
%B 40th IEEE Annual Computer Software and Applications Conference, COMPSAC Workshops 2016, Atlanta, GA, USA, June 10-14, 2016
%D 2016
%T Privacy-Preserving Outsourcing of Data Mining
%A Anna Monreale
%A Hui Wendy Wang
%X Data mining is gaining momentum in society due to the ever increasing availability of large amounts of data, easily gathered by a variety of collection technologies and stored via computer systems. Due to the limited computational resources of data owners and the developments in cloud computing, there has been considerable recent interest in the paradigm of data mining-as-a-service (DMaaS). In this paradigm, a company (data owner) lacking in expertise or computational resources outsources its mining needs to a third party service provider (server). Given the fact that the server may not be fully trusted, one of the main concerns of the DMaaS paradigm is the protection of data privacy. In this paper, we provide an overview of a variety of techniques and approaches that address the privacy issues of the DMaaS paradigm.
%B 40th IEEE Annual Computer Software and Applications Conference, COMPSAC Workshops 2016, Atlanta, GA, USA, June 10-14, 2016
%I IEEE Computer Society
%C Atlanta, GA, USA
%G eng
%U http://dx.doi.org/10.1109/COMPSAC.2016.169
%R 10.1109/COMPSAC.2016.169

%0 Conference Paper
%B SEBD
%D 2013
%T Privacy-Aware Distributed Mobility Data Analytics
%A Francesca Pratesi
%A Anna Monreale
%A Hui Wendy Wang
%A S Rinzivillo
%A Dino Pedreschi
%A Gennady Andrienko
%A Natalia Andrienko
%X We propose an approach to preserve privacy in an analytical processing within a distributed setting, and tackle the problem of obtaining aggregated information about vehicle traffic in a city from movement data collected by individual vehicles and shipped to a central server.
Movement data are sensitive because they may describe typical movement behaviors and therefore be used for re-identification of individuals in a database. We provide a privacy-preserving framework for movement data aggregation based on trajectory generalization in a distributed environment. The proposed solution, based on the differential privacy model and on sketching techniques for efficient data compression, provides a formal data protection safeguard. Using real-life data, we demonstrate the effectiveness of our approach also in terms of data utility preserved by the data transformation.
%B SEBD
%C Roccella Jonica
%G eng

%0 Book Section
%B Geographic Information Science at the Heart of Europe
%D 2013
%T Privacy-Preserving Distributed Movement Data Aggregation
%A Anna Monreale
%A Hui Wendy Wang
%A Francesca Pratesi
%A S Rinzivillo
%A Dino Pedreschi
%A Gennady Andrienko
%A Natalia Andrienko
%E Vandenbroucke, Danny
%E Bucher, Bénédicte
%E Crompvoets, Joep
%X We propose a novel approach to privacy-preserving analytical processing within a distributed setting, and tackle the problem of obtaining aggregated information about vehicle traffic in a city from movement data collected by individual vehicles and shipped to a central server. Movement data are sensitive because people’s whereabouts have the potential to reveal intimate personal traits, such as religious or sexual preferences, and may allow re-identification of individuals in a database. We provide a privacy-preserving framework for movement data aggregation based on trajectory generalization in a distributed environment. The proposed solution, based on the differential privacy model and on sketching techniques for efficient data compression, provides a formal data protection safeguard. Using real-life data, we demonstrate the effectiveness of our approach also in terms of data utility preserved by the data transformation.
%B Geographic Information Science at the Heart of Europe
%S Lecture Notes in Geoinformation and Cartography
%I Springer International Publishing
%P 225-245
%@ 978-3-319-00614-7
%U http://dx.doi.org/10.1007/978-3-319-00615-4_13
%R 10.1007/978-3-319-00615-4_13

%0 Journal Article
%J IEEE Systems Journal
%D 2013
%T Privacy-Preserving Mining of Association Rules From Outsourced Transaction Databases
%A Fosca Giannotti
%A L.V.S. Lakshmanan
%A Anna Monreale
%A Dino Pedreschi
%A Hui Wendy Wang
%X Spurred by developments such as cloud computing, there has been considerable recent interest in the paradigm of data mining-as-a-service. A company (data owner) lacking in expertise or computational resources can outsource its mining needs to a third party service provider (server). However, both the items and the association rules of the outsourced database are considered private property of the corporation (data owner). To protect corporate privacy, the data owner transforms its data and ships it to the server, sends mining queries to the server, and recovers the true patterns from the extracted patterns received from the server. In this paper, we study the problem of outsourcing the association rule mining task within a corporate privacy-preserving framework. We propose an attack model based on background knowledge and devise a scheme for privacy preserving outsourced mining. Our scheme ensures that each transformed item is indistinguishable with respect to the attacker's background knowledge, from at least k-1 other transformed items. Our comprehensive experiments on a very large and real transaction database demonstrate that our techniques are effective, scalable, and protect privacy.
%B IEEE Systems Journal
%R 10.1109/JSYST.2012.2221854

%0 Conference Paper
%B Machine Learning and Knowledge Discovery in Databases, European Conference, ECML PKDD 2012
%D 2012
%T AUDIO: An Integrity Auditing Framework of Outlier-Mining-as-a-Service Systems.
%A R. Liu
%A Hui Wendy Wang
%A Anna Monreale
%A Dino Pedreschi
%A Fosca Giannotti
%A W Guo
%X Spurred by developments such as cloud computing, there has been considerable recent interest in the data-mining-as-a-service paradigm. Users lacking in expertise or computational resources can outsource their data and mining needs to a third-party service provider (server). Outsourcing, however, raises issues about result integrity: how can the data owner verify that the mining results returned by the server are correct? In this paper, we present AUDIO, an integrity auditing framework for the specific task of distance-based outlier mining outsourcing. It provides efficient and practical verification approaches to check both completeness and correctness of the mining results. The key idea of our approach is to insert a small amount of artificial tuples into the outsourced data; the artificial tuples will produce artificial outliers and non-outliers that do not exist in the original dataset. The server’s answer is verified by analyzing the presence of artificial outliers/non-outliers, obtaining a probabilistic guarantee of correctness and completeness of the mining result. Our empirical results show the effectiveness and efficiency of our method.
%B Machine Learning and Knowledge Discovery in Databases, European Conference, ECML PKDD 2012
%8 2012
%R 10.1007/978-3-642-33486-3_1

%0 Conference Paper
%B the 3rd International Conference on Computers, Privacy, and Data Protection: An element of choice
%D 2011
%T Privacy-preserving data mining from outsourced databases.
%A Fosca Giannotti
%A L.V.S. Lakshmanan
%A Anna Monreale
%A Dino Pedreschi
%A Hui Wendy Wang
%X Spurred by developments such as cloud computing, there has been considerable recent interest in the paradigm of data mining-as-service: a company (data owner) lacking in expertise or computational resources can outsource its mining needs to a third party service provider (server).
However, both the outsourced database and the knowledge extracted from it by data mining are considered private property of the data owner. To protect corporate privacy, the data owner transforms its data and ships it to the server, sends mining queries to the server, and recovers the true patterns from the extracted patterns received from the server. In this paper, we study the problem of outsourcing a data mining task within a corporate privacy-preserving framework. We propose a scheme for privacy-preserving outsourced mining which offers a formal protection against information disclosure, and show that the data owner can recover the correct data mining results efficiently.
%B the 3rd International Conference on Computers, Privacy, and Data Protection: An element of choice
%8 2011
%R 10.1007/978-94-007-0641-5_19