TY  - CONF
T1  - SoBigData: Social Mining & Big Data Ecosystem
T2  - Companion of the The Web Conference 2018 on The Web Conference 2018
Y1  - 2018
A1  - Fosca Giannotti
A1  - Roberto Trasarti
A1  - Bontcheva, Kalina
A1  - Valerio Grossi
AB  - One of the most pressing and fascinating challenges scientists face today, is understanding the complexity of our globally interconnected society. The big data arising from the digital breadcrumbs of human activities has the potential of providing a powerful social microscope, which can help us understand many complex and hidden socio-economic phenomena. Such challenge requires high-level analytics, modeling and reasoning across all the social dimensions above. There is a need to harness these opportunities for scientific advancement and for the social good, compared to the currently prevalent exploitation of big data for commercial purposes or, worse, social control and surveillance. The main obstacle to this accomplishment, besides the scarcity of data scientists, is the lack of a large-scale open ecosystem where big data and social mining research can be carried out. The SoBigData Research Infrastructure (RI) provides an integrated ecosystem for ethic-sensitive scientific discoveries and advanced applications of social data mining on the various dimensions of social life as recorded by "big data". The research community uses the SoBigData facilities as a "secure digital wind-tunnel" for large-scale social data analysis and simulation experiments. SoBigData promotes repeatable and open science and supports data science research projects by providing: i) an ever-growing, distributed data ecosystem for procurement, access and curation and management of big social data, to underpin social data mining research within an ethic-sensitive context; ii) an ever-growing, distributed platform of interoperable, social data mining methods and associated skills: tools, methodologies and services for mining, analysing, and visualising complex and massive datasets, harnessing the techno-legal barriers to the ethically safe deployment of big data for social mining; iii) an ecosystem where protection of personal information and the respect for fundamental human rights can coexist with a safe use of the same information for scientific purposes of broad and central societal interest. SoBigData has a dedicated ethical and legal board, which is implementing a legal and ethical framework.
JF  - Companion of the The Web Conference 2018 on The Web Conference 2018
PB  - International World Wide Web Conferences Steering Committee
UR  - http://www.sobigdata.eu/sites/default/files/www%202018.pdf
ER  - 

TY  - JOUR
T1  - HyWare: a HYbrid Workflow lAnguage for Research E-infrastructures
JF  - D-Lib Magazine
Y1  - 2017
A1  - Leonardo Candela
A1  - Paolo Manghi
A1  - Fosca Giannotti
A1  - Valerio Grossi
A1  - Roberto Trasarti
AB  - Research e-infrastructures are "systems of systems", patchworks of tools, services and data sources, evolving over time to address the needs of the scientific process. Accordingly, in such environments, researchers implement their scientific processes by means of workflows made of a variety of actions, including for example usage of web services, download and execution of shared software libraries or tools, or local and manual manipulation of data. Although scientists may benefit from sharing their scientific process, the heterogeneity underpinning e-infrastructures hinders their ability to represent, share and eventually reproduce such workflows. This work presents HyWare, a language for representing scientific process in highly-heterogeneous e-infrastructures in terms of so-called hybrid workflows. HyWare lays in between "business process modeling languages", which offer a formal and high-level description of a reasoning, protocol, or procedure, and "workflow execution languages", which enable the fully automated execution of a sequence of computational steps via dedicated engines.
VL  - 23
UR  - http://dx.doi.org/10.1045/january2017-candela
ER  - 

TY  - JOUR
T1  - Survey on using constraints in data mining
JF  - Data Mining and Knowledge Discovery
Y1  - 2017
A1  - Valerio Grossi
A1  - Andrea Romei
A1  - Franco Turini
AB  - This paper provides an overview of the current state-of-the-art on using constraints in knowledge discovery and data mining. The use of constraints in a data mining task requires specific definition and satisfaction tools during knowledge extraction. This survey proposes three groups of studies based on classification, clustering and pattern mining, whether the constraints are on the data, the models or the measures, respectively. We consider the distinctions between hard and soft constraint satisfaction, and between the knowledge extraction phases where constraints are considered. In addition to discussing how constraints can be used in data mining, we show how constraint-based languages can be used throughout the data mining process.
VL  - 31
ER  - 

TY  - CHAP
T1  - Data Mining and Constraints: An Overview
T2  - Data Mining and Constraint Programming
Y1  - 2016
A1  - Valerio Grossi
A1  - Dino Pedreschi
A1  - Franco Turini
AB  - This paper provides an overview of the current state-of-the-art on using constraints in knowledge discovery and data mining. The use of constraints requires mechanisms for defining and evaluating them during the knowledge extraction process. We give a structured account of three main groups of constraints based on the specific context in which they are defined and used. The aim is to provide a complete view on constraints as a building block of data mining methods.
JF  - Data Mining and Constraint Programming
PB  - Springer International Publishing
ER  - 

TY  - JOUR
T1  - Driving Profiles Computation and Monitoring for Car Insurance CRM
JF  - Journal ACM Transactions on Intelligent Systems and Technology (TIST)
Y1  - 2016
A1  - Mirco Nanni
A1  - Roberto Trasarti
A1  - Anna Monreale
A1  - Valerio Grossi
A1  - Dino Pedreschi
AB  - Customer segmentation is one of the most traditional and valued tasks in customer relationship management (CRM). In this article, we explore the problem in the context of the car insurance industry, where the mobility behavior of customers plays a key role: Different mobility needs, driving habits, and skills imply also different requirements (level of coverage provided by the insurance) and risks (of accidents). In the present work, we describe a methodology to extract several indicators describing the driving profile of customers, and we provide a clustering-oriented instantiation of the segmentation problem based on such indicators. Then, we consider the availability of a continuous flow of fresh mobility data sent by the circulating vehicles, aiming at keeping our segments constantly up to date. We tackle a major scalability issue that emerges in this context when the number of customers is large-namely, the communication bottleneck-by proposing and implementing a sophisticated distributed monitoring solution that reduces communications between vehicles and company servers to the essential. We validate the framework on a large database of real mobility data coming from GPS devices on private cars. Finally, we analyze the privacy risks that the proposed approach might involve for the users, providing and evaluating a countermeasure based on data perturbation.
VL  - 8
UR  - http://doi.acm.org/10.1145/2912148
ER  - 

TY  - CHAP
T1  - Partition-Based Clustering Using Constraint Optimization
T2  - Data Mining and Constraint Programming - Foundations of a Cross-Disciplinary Approach
Y1  - 2016
A1  - Valerio Grossi
A1  - Tias Guns
A1  - Anna Monreale
A1  - Mirco Nanni
A1  - Siegfried Nijssen
AB  - Partition-based clustering is the task of partitioning a dataset in a number of groups of examples, such that examples in each group are similar to each other. Many criteria for what constitutes a good clustering have been identified in the literature; furthermore, the use of additional constraints to find more useful clusterings has been proposed. In this chapter, it will be shown that most of these clustering tasks can be formalized using optimization criteria and constraints. We demonstrate how a range of clustering tasks can be modelled in generic constraint programming languages with these constraints and optimization criteria. Using the constraint-based modeling approach we also relate the DBSCAN method for density-based clustering to the label propagation technique for community discovery.
JF  - Data Mining and Constraint Programming - Foundations of a Cross-Disciplinary Approach
PB  - Springer International Publishing
UR  - http://dx.doi.org/10.1007/978-3-319-50137-6_11
ER  - 

TY  - CONF
T1  - Clustering Formulation Using Constraint Optimization
T2  - Software Engineering and Formal Methods - {SEFM} 2015 Collocated Workshops: ATSE, HOFM, MoKMaSD, and VERY*SCART, York, UK, September 7-8, 2015, Revised Selected Papers
Y1  - 2015
A1  - Valerio Grossi
A1  - Anna Monreale
A1  - Mirco Nanni
A1  - Dino Pedreschi
A1  - Franco Turini
AB  - The problem of clustering a set of data is a textbook machine learning problem, but at the same time, at heart, a typical optimization problem. Given an objective function, such as minimizing the intra-cluster distances or maximizing the inter-cluster distances, the task is to find an assignment of data points to clusters that achieves this objective. In this paper, we present a constraint programming model for a centroid based clustering and one for a density based clustering. In particular, as a key contribution, we show how the expressivity introduced by the formulation of the problem by constraint programming makes the standard problem easy to be extended with other constraints that permit to generate interesting variants of the problem. We show this important aspect in two different ways: first, we show how the formulation of the density-based clustering by constraint programming makes it very similar to the label propagation problem and then, we propose a variant of the standard label propagation approach.
JF  - Software Engineering and Formal Methods - {SEFM} 2015 Collocated Workshops: ATSE, HOFM, MoKMaSD, and VERY*SCART, York, UK, September 7-8, 2015, Revised Selected Papers
PB  - Springer Berlin Heidelberg
UR  - http://dx.doi.org/10.1007/978-3-662-49224-6_9
ER  - 

TY  - CONF
T1  - A Case Study in Sequential Pattern Mining for IT-Operational Risk
T2  - ECML/PKDD (1)
Y1  - 2008
A1  - Valerio Grossi
A1  - Andrea Romei
A1  - Salvatore Ruggieri
JF  - ECML/PKDD (1)
ER  - 

TY  - CHAP
T1  - Discovering Strategic Behaviour in Multi- Agent Scenarios by Ontology-Driven Mining
T2  - Advances in Robotics, Automation and Control
Y1  - 2008
A1  - Davide Bacciu
A1  - Andrea Bellandi
A1  - Barbara Furletti
A1  - Valerio Grossi
A1  - Andrea Romei
JF  - Advances in Robotics, Automation and Control
SN  - 978-953-7619-16-9
UR  - http://www.intechopen.com/books/advances_in_robotics_automation_and_control/discovering_strategic_behaviors_in_multi-agent_scenarios_by_ontology-driven_mining
ER  - 

TY  - CONF
T1  - Ontological Support for Association Rule Mining
T2  - IASTED International Conference on Artificial Intelligence and Applications (AIA)
Y1  - 2008
A1  - Barbara Furletti
A1  - Andrea Bellandi
A1  - Valerio Grossi
A1  - Andrea Romei
JF  - IASTED International Conference on Artificial Intelligence and Applications (AIA)
CY  - Innsbruck, Austria
ER  - 

TY  - CONF
T1  - Ontology-Driven Association Rule Extraction: A Case Study
T2  - International Workshop on Contexts and Ontologies: Representation and Reasoning
Y1  - 2007
A1  - Barbara Furletti
A1  - Andrea Bellandi
A1  - Valerio Grossi
A1  - Andrea Romei
JF  - International Workshop on Contexts and Ontologies: Representation and Reasoning
CY  - Roskilde, Denmark
UR  - http://ceur-ws.org/Vol-298/paper1.pdf
ER  - 

TY  - CONF
T1  - PUSHING CONSTRAINTS IN ASSOCIATION RULE MINING: AN ONTOLOGY-BASED APPROACH
T2  - IADIS International Conference WWW/Internet 2007
Y1  - 2007
A1  - Barbara Furletti
A1  - Andrea Bellandi
A1  - Andrea Romei
A1  - Valerio Grossi
JF  - IADIS International Conference WWW/Internet 2007
SN  - 978-972-8924-44-7
UR  - http://www.iadisportal.org/digital-library/mdownload/pushing-constraints-in-association-rule-mining-an-ontology-based-approach
ER  -