%0 Conference Paper %B International Symposium on Intelligent Data Analysis %D 2020 %T Digital Footprints of International Migration on Twitter %A Jisu Kim %A Alina Sirbu %A Fosca Giannotti %A Lorenzo Gabrielli %X Studying migration using traditional data has some limitations. To date, there have been several studies proposing innovative methodologies to measure migration stocks and flows from social big data. Nevertheless, a uniform definition of a migrant is difficult to find as it varies from one work to another depending on the purpose of the study and nature of the dataset used. In this work, a generic methodology is developed to identify migrants within the Twitter population. This describes a migrant as a person who has the current residence different from the nationality. The residence is defined as the location where a user spends most of his/her time in a certain year. The nationality is inferred from linguistic and social connections to a migrant’s country of origin. This methodology is validated first with an internal gold standard dataset and second with two official statistics, and shows strong performance scores and correlation coefficients. Our method has the advantage that it can identify both immigrants and emigrants, regardless of the origin/destination countries. The new methodology can be used to study various aspects of migration, including opinions, integration, attachment, stocks and flows, motivations for migration, etc. Here, we exemplify how trending topics across and throughout different migrant communities can be observed. 
%B International Symposium on Intelligent Data Analysis %I Springer %G eng %U https://link.springer.com/chapter/10.1007/978-3-030-44584-3_22 %R https://doi.org/10.1007/978-3-030-44584-3_22 %0 Journal Article %J International Journal of Data Science and Analytics %D 2020 %T Human migration: the big data perspective %A Alina Sirbu %A Andrienko, Gennady %A Andrienko, Natalia %A Boldrini, Chiara %A Conti, Marco %A Fosca Giannotti %A Riccardo Guidotti %A Bertoli, Simone %A Jisu Kim %A Muntean, Cristina Ioana %A Luca Pappalardo %A Passarella, Andrea %A Dino Pedreschi %A Pollacci, Laura %A Francesca Pratesi %A Sharma, Rajesh %X How can big data help to understand the migration phenomenon? In this paper, we try to answer this question through an analysis of various phases of migration, comparing traditional and novel data sources and models at each phase. We concentrate on three phases of migration, at each phase describing the state of the art and recent developments and ideas. The first phase includes the journey, and we study migration flows and stocks, providing examples where big data can have an impact. The second phase discusses the stay, i.e. migrant integration in the destination country. We explore various data sets and models that can be used to quantify and understand migrant integration, with the final aim of providing the basis for the construction of a novel multi-level integration index. The last phase is related to the effects of migration on the source countries and the return of migrants. %B International Journal of Data Science and Analytics %P 1–20 %8 2020/03/23 %@ 2364-4168 %G eng %U https://link.springer.com/article/10.1007%2Fs41060-020-00213-5 %! 
International Journal of Data Science and Analytics %R https://doi.org/10.1007/s41060-020-00213-5 %0 Journal Article %J PloS one %D 2019 %T Algorithmic bias amplifies opinion fragmentation and polarization: A bounded confidence model %A Alina Sirbu %A Dino Pedreschi %A Fosca Giannotti %A Kertész, János %X The flow of information reaching us via the online media platforms is optimized not by the information content or relevance but by popularity and proximity to the target. This is typically performed in order to maximise platform usage. As a side effect, this introduces an algorithmic bias that is believed to enhance fragmentation and polarization of the societal debate. To study this phenomenon, we modify the well-known continuous opinion dynamics model of bounded confidence in order to account for the algorithmic bias and investigate its consequences. In the simplest version of the original model the pairs of discussion participants are chosen at random and their opinions get closer to each other if they are within a fixed tolerance level. We modify the selection rule of the discussion partners: there is an enhanced probability to choose individuals whose opinions are already close to each other, thus mimicking the behavior of online media which suggest interaction with similar peers. As a result we observe: a) an increased tendency towards opinion fragmentation, which emerges also in conditions where the original model would predict consensus, b) increased polarisation of opinions and c) a dramatic slowing down of the speed at which the convergence at the asymptotic state is reached, which makes the system highly unstable. Fragmentation and polarization are augmented by a fragmented initial population. 
%B PloS one %V 14 %P e0213246 %G eng %U https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0213246 %R 10.1371/journal.pone.0213246 %0 Journal Article %J ERCIM News %D 2019 %T Public opinion and Algorithmic bias %A Alina Sirbu %A Fosca Giannotti %A Dino Pedreschi %A Kertész, János %B ERCIM News %G eng %U https://ercim-news.ercim.eu/en116/special/public-opinion-and-algorithmic-bias %0 Journal Article %J International Journal of Data Science and Analytics %D 2018 %T NDlib: a python library to model and analyze diffusion processes over complex networks %A Giulio Rossetti %A Letizia Milli %A S Rinzivillo %A Alina Sirbu %A Dino Pedreschi %A Fosca Giannotti %X Nowadays the analysis of dynamics of and on networks represents a hot topic in the social network analysis playground. To support students, teachers, developers and researchers, in this work we introduce a novel framework, namely NDlib, an environment designed to describe diffusion simulations. NDlib is designed to be a multi-level ecosystem that can be fruitfully used by different user segments. For this reason, upon NDlib, we designed a simulation server that allows remote execution of experiments as well as an online visualization tool that abstracts its programmatic interface and makes available the simulation platform to non-technicians. %B International Journal of Data Science and Analytics %V 5 %P 61–79 %G eng %U https://link.springer.com/article/10.1007/s41060-017-0086-6 %R 10.1007/s41060-017-0086-6 %0 Book Section %B Participatory Sensing, Opinions and Collective Awareness %D 2017 %T Applications for Environmental Sensing in EveryAware %A Atzmueller, Martin %A Becker, Martin %A Molino, Andrea %A Mueller, Juergen %A Peters, Jan %A Alina Sirbu %X This chapter provides a technical description of the EveryAware applications for air quality and noise monitoring. Specifically, we introduce AirProbe, for measuring air quality, and WideNoise Plus for estimating environmental noise. 
We also include an overview of hardware components and smartphone-based measurement technology, and we present the corresponding web backend, which provides, e.g., real-time tracking, data storage, analysis and visualizations. %B Participatory Sensing, Opinions and Collective Awareness %I Springer %P 135–155 %G eng %U http://link.springer.com/chapter/10.1007/978-3-319-25658-0_7 %R 10.1007/978-3-319-25658-0_7 %0 Book Section %B Participatory Sensing, Opinions and Collective Awareness %D 2017 %T Experimental Assessment of the Emergence of Awareness and Its Influence on Behavioral Changes: The Everyaware Lesson %A Pietro Gravino %A Alina Sirbu %A Becker, Martin %A Vito D P Servedio %A Vittorio Loreto %X The emergence of awareness is deeply connected to the process of learning. In fact, by learning that high sound levels may harm one’s health, that noise levels that we estimate as innocuous may be dangerous, that there exists an alternative path we can walk to go to work and minimize our exposure to air pollution, etc., citizens will be able to understand the environment around them and act accordingly to move toward a more sustainable world. %B Participatory Sensing, Opinions and Collective Awareness %I Springer %P 337–362 %G eng %U http://link.springer.com/chapter/10.1007/978-3-319-25658-0_16 %R 10.1007/978-3-319-25658-0_16 %0 Book Section %B Participatory Sensing, Opinions and Collective Awareness %D 2017 %T Large Scale Engagement Through Web-Gaming and Social Computations %A Vito D P Servedio %A Saverio Caminiti %A Pietro Gravino %A Vittorio Loreto %A Alina Sirbu %A Francesca Tria %X In the last few years the Web has progressively acquired the status of an infrastructure for social computation that allows researchers to coordinate the cognitive abilities of human agents, so as to steer the collective user activity towards predefined goals. 
This general trend is also triggering the adoption of web-games as an alternative laboratory to run experiments in the social sciences and whenever the contribution of human beings can be effectively used for research purposes. Web-games introduce a playful aspect in scientific experiments with the result of increasing participation of people and of keeping their attention steady in time. The aim of this chapter is to suggest a general purpose web-based platform scheme for web-gaming and social computation. This platform will simplify the realization of web-games and will act as a repository of different scientific experiments, thus realizing a sort of showcase that stimulates users’ curiosity and helps researchers in recruiting volunteers. A platform built by following these criteria has been developed within the EveryAware project, the Experimental Tribe (XTribe) platform, which is operational and ready to be used. Finally, a sample web-game hosted by the XTribe platform will be presented with the aim of reporting the results, in terms of participation and motivation, of two different player recruiting strategies. %B Participatory Sensing, Opinions and Collective Awareness %I Springer %P 237–254 %G eng %U http://link.springer.com/chapter/10.1007/978-3-319-25658-0_12 %R 10.1007/978-3-319-25658-0_12 %0 Journal Article %J International Journal of Data Science and Analytics %D 2017 %T NDlib: a python library to model and analyze diffusion processes over complex networks %A Giulio Rossetti %A Letizia Milli %A S Rinzivillo %A Alina Sirbu %A Dino Pedreschi %A Fosca Giannotti %X Nowadays the analysis of dynamics of and on networks represents a hot topic in the social network analysis playground. To support students, teachers, developers and researchers, in this work we introduce a novel framework, namely NDlib, an environment designed to describe diffusion simulations. NDlib is designed to be a multi-level ecosystem that can be fruitfully used by different user segments. 
For this reason, upon NDlib, we designed a simulation server that allows remote execution of experiments as well as an online visualization tool that abstracts its programmatic interface and makes available the simulation platform to non-technicians. %B International Journal of Data Science and Analytics %P 1–19 %G eng %0 Conference Paper %B IEEE International Conference on Data Science and Advanced Analytics, DSA %D 2017 %T NDlib: Studying Network Diffusion Dynamics %A Giulio Rossetti %A Letizia Milli %A S Rinzivillo %A Alina Sirbu %A Dino Pedreschi %A Fosca Giannotti %X Nowadays the analysis of diffusive phenomena occurring on top of complex networks represents a hot topic in the Social Network Analysis playground. In order to support students, teachers, developers and researchers, in this work we introduce a novel simulation framework, NDlib. NDlib is designed to be a multi-level ecosystem that can be fruitfully used by different user segments. Upon the diffusion library, we designed a simulation server that allows remote execution of experiments and an online visualization tool that abstracts the programmatic interface and makes available the simulation platform to non-technicians. %B IEEE International Conference on Data Science and Advanced Analytics, DSA %C Tokyo %G eng %U https://ieeexplore.ieee.org/abstract/document/8259774 %R https://doi.org/10.1109/DSAA.2017.6 %0 Book Section %B Participatory Sensing, Opinions and Collective Awareness %D 2017 %T Opinion dynamics: models, extensions and external effects %A Alina Sirbu %A Vittorio Loreto %A Vito D P Servedio %A Francesca Tria %X Recently, social phenomena have received a lot of attention not only from social scientists, but also from physicists, mathematicians and computer scientists, in the emerging interdisciplinary field of complex system science. 
Opinion dynamics is one of the processes studied, since opinions are the drivers of human behaviour, and play a crucial role in many global challenges that our complex world and societies are facing: global financial crises, global pandemics, growth of cities, urbanisation and migration patterns, and last but not least important, climate change and environmental sustainability and protection. Opinion formation is a complex process affected by the interplay of different elements, including the individual predisposition, the influence of positive and negative peer interaction (social networks playing a crucial role in this respect), the information each individual is exposed to, and many others. Several models inspired from those in use in physics have been developed to encompass many of these elements, and to allow for the identification of the mechanisms involved in the opinion formation process and the understanding of their role, with the practical aim of simulating opinion formation and spreading under various conditions. These modelling schemes range from binary simple models such as the voter model, to multi-dimensional continuous approaches. Here, we provide a review of recent methods, focusing on models employing both peer interaction and external information, and emphasising the role that less studied mechanisms, such as disagreement, has in driving the opinion dynamics. Due to the important role that external information (mainly in the form of mass media broadcast) can have in enhancing awareness of social issues, a special emphasis will be devoted to study different forms it can take, investigating their effectiveness in driving the opinion formation at the population level. The review shows that, although a large number of approaches exist, some mechanisms such as the effect of multiple external information sources could largely benefit from further studies. 
Additionally, model validation with real data, which are starting to become available, is still largely lacking and should in our opinion be the main ambition of future investigations. %B Participatory Sensing, Opinions and Collective Awareness %I Springer %P 363–401 %G eng %U http://link.springer.com/chapter/10.1007/978-3-319-25658-0_17 %R 10.1007/978-3-319-25658-0_17 %0 Conference Paper %B Conference of the Italian Association for Artificial Intelligence %D 2017 %T Sentiment Spreading: An Epidemic Model for Lexicon-Based Sentiment Analysis on Twitter %A Pollacci, Laura %A Alina Sirbu %A Fosca Giannotti %A Dino Pedreschi %A Claudio Lucchese %A Muntean, Cristina Ioana %X While sentiment analysis has received significant attention in recent years, problems still exist when tools need to be applied to microblogging content. This is because, typically, the text to be analysed consists of very short messages lacking in structure and semantic context. At the same time, the amount of text produced by online platforms is enormous. So, one needs simple, fast and effective methods in order to be able to efficiently study sentiment in these data. Lexicon-based methods, which use a predefined dictionary of terms tagged with sentiment valences to evaluate sentiment in longer sentences, can be a valid approach. Here we present a method based on epidemic spreading to automatically extend the dictionary used in lexicon-based sentiment analysis, starting from a reduced dictionary and large amounts of Twitter data. The resulting dictionary is shown to contain valences that correlate well with human-annotated sentiment, and to produce tweet sentiment classifications comparable to the original dictionary, with the advantage of being able to tag more tweets than the original. The method is easily extensible to various languages and applicable to large amounts of data. 
%B Conference of the Italian Association for Artificial Intelligence %I Springer %G eng %U https://link.springer.com/chapter/10.1007/978-3-319-70169-1_9 %R 10.1007/978-3-319-70169-1_9 %0 Conference Proceedings %B 22nd International European Conference on Parallel and Distributed Computing, Euro-Par 2016 %D 2016 %T Power Consumption Modeling and Prediction in a Hybrid CPU-GPU-MIC Supercomputer %A Alina Sirbu %A Ozalp Babaoglu %X Power consumption is a major obstacle for High Performance Computing (HPC) systems in their quest towards the holy grail of ExaFLOP performance. Significant advances in power efficiency have to be made before this goal can be attained and accurate modeling is an essential step towards power efficiency by optimizing system operating parameters to match dynamic energy needs. In this paper we present a study of power consumption by jobs in Eurora, a hybrid CPU-GPU-MIC system installed at the largest Italian data center. Using data from a dedicated monitoring framework, we build a data-driven model of power consumption for each user in the system and use it to predict the power requirements of future jobs. We are able to achieve good prediction results for over 80 % of the users in the system. For the remaining users, we identify possible reasons why prediction performance is not as good. Possible applications for our predictive modeling results include scheduling optimization, power-aware billing and system-scale power modeling. All the scripts used for the study have been made available on GitHub. 
%B 22nd International European Conference on Parallel and Distributed Computing, Euro-Par 2016 %I Springer LNCS %C Grenoble, France %V LNCS 9833 %G eng %U http://arxiv.org/abs/1601.05961 %R 10.1007/978-3-319-43659-3_9 %0 Conference Paper %B 2016 International Conference on High Performance Computing Simulation (HPCS) %D 2016 %T Predicting System-level Power for a Hybrid Supercomputer %A Alina Sirbu %A Ozalp Babaoglu %X For current High Performance Computing systems to scale towards the holy grail of ExaFLOP performance, their power consumption has to be reduced by at least one order of magnitude. This goal can be achieved only through a combination of hardware and software advances. Being able to model and accurately predict the power consumption of large computational systems is necessary for software-level innovations such as proactive and power-aware scheduling, resource allocation and fault tolerance techniques. In this paper we present a 2-layer model of power consumption for a hybrid supercomputer (which held the top spot of the Green500 list in July 2013) that combines CPU, GPU and MIC technologies to achieve higher energy efficiency. Our model takes as input workload information - the number and location of resources that are used by each job at a certain time - and calculates the resulting system-level power consumption. When jobs are submitted to the system, the workload configuration can be foreseen based on the scheduler policies, and our model can then be applied to predict the ensuing system-level power consumption. Additionally, alternative workload configurations can be evaluated from a power perspective and more efficient ones can be selected. Applications of the model include not only power-aware scheduling but also prediction of anomalous behavior. 
%B 2016 International Conference on High Performance Computing Simulation (HPCS) %I IEEE %C Innsbruck, Austria %8 07/2016 %G eng %U http://ieeexplore.ieee.org/document/7568420/ %R 10.1109/HPCSim.2016.7568420 %0 Journal Article %J Cluster Computing %D 2016 %T Towards operator-less data centers through data-driven, predictive, proactive autonomics %A Alina Sirbu %A Ozalp Babaoglu %X Continued reliance on human operators for managing data centers is a major impediment for them from ever reaching extreme dimensions. Large computer systems in general, and data centers in particular, will ultimately be managed using predictive computational and executable models obtained through data-science tools, and at that point, the intervention of humans will be limited to setting high-level goals and policies rather than performing low-level operations. Data-driven autonomics, where management and control are based on holistic predictive models that are built and updated using live data, opens one possible path towards limiting the role of operators in data centers. In this paper, we present a data-science study of a public Google dataset collected in a 12K-node cluster with the goal of building and evaluating predictive models for node failures. Our results support the practicality of a data-driven approach by showing the effectiveness of predictive models based on data found in typical data center logs. We use BigQuery, the big data SQL platform from the Google Cloud suite, to process massive amounts of data and generate a rich feature set characterizing node state over time. We describe how an ensemble classifier can be built out of many Random Forest classifiers each trained on these features, to predict if nodes will fail in a future 24-h window. Our evaluation reveals that if we limit false positive rates to 5 %, we can achieve true positive rates between 27 and 88 % with precision varying between 50 and 72 %. 
This level of performance allows us to recover a large fraction of jobs’ executions (by redirecting them to other nodes when a failure of the present node is predicted) that would otherwise have been wasted due to failures. We discuss the feasibility of including our predictive model as the central component of a data-driven autonomic manager and operating it on-line with live data streams (rather than off-line on data logs). All of the scripts used for BigQuery and classification analyses are publicly available on GitHub. %B Cluster Computing %P 1–14 %8 04/2016 %G eng %U http://link.springer.com/article/10.1007/s10586-016-0564-y %R DOI:10.1007/s10586-016-0564-y %0 Conference Paper %B Proceedings of the 1st International Conference on Complex Information Systems %D 2016 %T Unveiling Political Opinion Structures with a Web-experiment %A Pietro Gravino %A Saverio Caminiti %A Alina Sirbu %A Francesca Tria %A Vito D P Servedio %A Vittorio Loreto %X The dynamics of political votes has been widely studied, both for its practical interest and as a paradigm of the dynamics of mass opinions and collective phenomena, where theoretical predictions can be easily tested. However, the vote outcome is often influenced by many factors beyond the bare opinion on the candidate, and in most cases it is bound to a single preference. The voter perception of the political space is still to be elucidated. We here propose a web experiment (laPENSOcosì) where we explicitly investigate participants’ opinions on political entities (parties, coalitions, individual candidates) of the Italian political scene. As a main result, we show that the political perception follows a Weber-Fechner-like law, i.e., when ranking political entities according to the user-expressed preferences, the perceived distance of the user from a given entity scales as the logarithm of this rank. 
%B Proceedings of the 1st International Conference on Complex Information Systems %@ 978-989-758-181-6 %G eng %U http://www.scitepress.org/DigitalLibrary/Link.aspx?doi=10.5220/0005906300390047 %R 10.5220/0005906300390047 %0 Journal Article %J Computing %D 2015 %T A Big Data Analyzer for Large Trace Logs %A Balliu, Alkida %A Olivetti, Dennis %A Ozalp Babaoglu %A Marzolla, Moreno %A Alina Sirbu %B Computing %G eng %U http://link.springer.com/article/10.1007/s00607-015-0480-7 %R 10.1007/s00607-015-0480-7 %0 Journal Article %J Microarrays %D 2015 %T Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks %A Alina Sirbu %A Martin Crane %A Heather J Ruskin %B Microarrays %V 4 %P 255–269 %G eng %U http://www.mdpi.com/2076-3905/4/2/255 %R 10.3390/microarrays4020255 %0 Journal Article %J Quality & Quantity %D 2015 %T Egalitarianism in the rank aggregation problem: a new dimension for democracy %A Contucci, Pierluigi %A Panizzi, Emanuele %A Ricci-Tersenghi, Federico %A Alina Sirbu %B Quality & Quantity %P 1–16 %G eng %U http://link.springer.com/article/10.1007/s11135-015-0197-x %R 10.1007/s11135-015-0197-x %0 Conference Paper %B Euro-Par 2015: parallel Processing Workshops, LNCS 9523 %D 2015 %T A Holistic Approach to Log Data Analysis in High-Performance Computing Systems: The Case of IBM Blue Gene/Q %A Alina Sirbu %A Ozalp Babaoglu %B Euro-Par 2015: parallel Processing Workshops, LNCS 9523 %I Springer %G eng %U http://link.springer.com/chapter/10.1007%2F978-3-319-27308-2_51 %R 10.1007/978-3-319-27308-2_51 %0 Journal Article %J PLoS One %D 2015 %T Participatory Patterns in an International Air Quality Monitoring Initiative. 
%A Alina Sirbu %A Becker, Martin %A Saverio Caminiti %A De Baets, Bernard %A Elen, Bart %A Francis, Louise %A Pietro Gravino %A Hotho, Andreas %A Ingarra, Stefano %A Vittorio Loreto %A Molino, Andrea %A Mueller, Juergen %A Peters, Jan %A Ricchiuti, Ferdinando %A Saracino, Fabio %A Vito D P Servedio %A Stumme, Gerd %A Theunis, Jan %A Francesca Tria %A Van den Bossche, Joris %X

The issue of sustainability is at the top of the political and societal agenda, being considered of extreme importance and urgency. Human individual action impacts the environment both locally (e.g., local air/water quality, noise disturbance) and globally (e.g., climate change, resource use). Urban environments represent a crucial example, with an increasing realization that the most effective way of producing a change is involving the citizens themselves in monitoring campaigns (a citizen science bottom-up approach). This is possible by developing novel technologies and IT infrastructures enabling large citizen participation. Here, in the wider framework of one of the first such projects, we show results from an international competition where citizens were involved in mobile air pollution monitoring using low cost sensing devices, combined with a web-based game to monitor perceived levels of pollution. Measures of shift in perceptions over the course of the campaign are provided, together with insights into participatory patterns emerging from this study. Interesting effects related to inertia and to direct involvement in measurement activities rather than indirect information exposure are also highlighted, indicating that direct involvement can enhance learning and environmental awareness. In the future, this could result in better adoption of policies towards decreasing pollution.

%B PLoS One %V 10 %P e0136763 %8 2015 %G eng %R 10.1371/journal.pone.0136763 %0 Conference Paper %B IEEE International Conference on Cloud and Autonomic Computing %D 2015 %T Towards Data-Driven Autonomics in Data Centers %A Alina Sirbu %A Ozalp Babaoglu %B IEEE International Conference on Cloud and Autonomic Computing %I IEEE %G eng %U http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7312140&filter%3DAND%28p_IS_Number%3A7312127%29 %R DOI:10.1109/ICCAC.2015.19 %0 Conference Paper %B Informatika (BigSys workshop) %D 2014 %T BiDAl: Big Data Analyzer for Cluster Traces %A Balliu, Alkida %A Olivetti, Dennis %A Ozalp Babaoglu %A Marzolla, Moreno %A Alina Sirbu %B Informatika (BigSys workshop) %I GI-Edition Lecture Notes in Informatics %G eng %U http://arxiv.org/abs/1410.1309 %0 Book Section %B Complex Networks V %D 2014 %T EGIA–Evolutionary Optimisation of Gene Regulatory Networks, an Integrative Approach %A Alina Sirbu %A Martin Crane %A Heather J Ruskin %B Complex Networks V %I Springer International Publishing %P 217–229 %G eng %U http://link.springer.com/chapter/10.1007/978-3-319-05401-8_21 %R 10.1007/978-3-319-05401-8_21 %0 Journal Article %J PLoS One %D 2013 %T Awareness and learning in participatory noise sensing. %A Becker, Martin %A Saverio Caminiti %A Fiorella, Donato %A Francis, Louise %A Pietro Gravino %A Haklay, Mordechai Muki %A Hotho, Andreas %A Vittorio Loreto %A Mueller, Juergen %A Ricchiuti, Ferdinando %A Vito D P Servedio %A Alina Sirbu %A Francesca Tria %X

The development of ICT infrastructures has facilitated the emergence of new paradigms for looking at society and the environment over the last few years. Participatory environmental sensing, i.e. directly involving citizens in environmental monitoring, is one example, which is hoped to encourage learning and enhance awareness of environmental issues. In this paper, an analysis of the behaviour of individuals involved in noise sensing is presented. Citizens have been involved in noise measuring activities through the WideNoise smartphone application. This application has been designed to record both objective (noise samples) and subjective (opinions, feelings) data. The application has been open to be used freely by anyone and has been widely employed worldwide. In addition, several test cases have been organised in European countries. Based on the information submitted by users, an analysis of emerging awareness and learning is performed. The data show that changes in the way the environment is perceived after repeated usage of the application do appear. Specifically, users learn how to recognise different noise levels they are exposed to. Additionally, the subjective data collected indicate an increased user involvement in time and a categorisation effect between pleasant and less pleasant environments.

%B PLoS One %V 8 %P e81638 %8 2013 %G eng %R 10.1371/journal.pone.0081638 %0 Journal Article %J Advances in Complex Systems %D 2013 %T Cohesion, consensus and extreme information in opinion dynamics %A Alina Sirbu %A Vittorio Loreto %A Vito D P Servedio %A Francesca Tria %B Advances in Complex Systems %V 16 %P 1350035 %G eng %U http://www.worldscientific.com/doi/abs/10.1142/S0219525913500355 %R 10.1142/S0219525913500355 %0 Journal Article %J Journal of Statistical Physics %D 2013 %T Opinion dynamics with disagreement and modulated information %A Alina Sirbu %A Vittorio Loreto %A Vito D P Servedio %A Francesca Tria %B Journal of Statistical Physics %P 1–20 %G eng %U http://link.springer.com/article/10.1007/s10955-013-0724-x %R 10.1007/s10955-013-0724-x %0 Conference Paper %B Cloud and Green Computing (CGC), 2013 Third International Conference on %D 2013 %T XTribe: a web-based social computation platform %A Saverio Caminiti %A Cicali, Claudio %A Pietro Gravino %A Vittorio Loreto %A Vito D P Servedio %A Alina Sirbu %A Francesca Tria %B Cloud and Green Computing (CGC), 2013 Third International Conference on %I IEEE %G eng %U http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6686061&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D6686061 %R 10.1109/CGC.2013.69 %0 Journal Article %J Theory Biosci %D 2012 %T Integrating heterogeneous gene expression data for gene regulatory network modelling. %A Alina Sirbu %A Heather J Ruskin %A Martin Crane %X

Gene regulatory networks (GRNs) are complex biological systems that have a large impact on protein levels, so that discovering network interactions is a major objective of systems biology. Quantitative GRN models have been inferred, to date, from time series measurements of gene expression, but at small scale, and with limited application to real data. Time series experiments are typically short (number of time points of the order of ten), whereas regulatory networks can be very large (containing hundreds of genes). This creates an under-determination problem, which negatively influences the results of any inferential algorithm. Presented here is an integrative approach to model inference, which has not been previously discussed to the authors' knowledge. Multiple heterogeneous expression time series are used to infer the same model, and results are shown to be more robust to noise and parameter perturbation. Additionally, a wavelet analysis shows that these models display limited noise over-fitting within the individual datasets.

%B Theory Biosci %V 131 %P 95-102 %8 2012 Jun %G eng %R 10.1007/s12064-011-0133-0 %0 Journal Article %J PLoS One %D 2012 %T RNA-Seq vs dual- and single-channel microarray data: sensitivity analysis for differential expression and clustering. %A Alina Sirbu %A Kerr, Gráinne %A Martin Crane %A Heather J Ruskin %X

With the fast development of high-throughput sequencing technologies, a new generation of genome-wide gene expression measurements is under way. This is based on mRNA sequencing (RNA-seq), which complements the already mature technology of microarrays, and is expected to overcome some of the latter's disadvantages. These RNA-seq data pose new challenges, however, as strengths and weaknesses have yet to be fully identified. Ideally, Next (or Second) Generation Sequencing measures can be integrated for more comprehensive gene expression investigation to facilitate analysis of whole regulatory networks. At present, however, the nature of these data is not very well understood. In this paper we study three alternative gene expression time series datasets for the Drosophila melanogaster embryo development, in order to compare three measurement techniques: RNA-seq, single-channel and dual-channel microarrays. The aim is to study the state of the art for the three technologies, with a view of assessing overlapping features, data compatibility and integration potential, in the context of time series measurements. This involves using established tools for each of the three different technologies, and technical and biological replicates (for RNA-seq and microarrays, respectively), due to the limited availability of biological RNA-seq replicates for time series data. The approach consists of a sensitivity analysis for differential expression and clustering. In general, the RNA-seq dataset displayed highest sensitivity to differential expression. The single-channel data performed similarly for the differentially expressed genes common to gene sets considered. Cluster analysis was used to identify different features of the gene space for the three datasets, with higher similarities found for the RNA-seq and single-channel microarray dataset.

%B PLoS One %V 7 %P e50986 %8 2012 %G eng %R 10.1371/journal.pone.0050986 %0 Journal Article %J Nature Methods %D 2012 %T Wisdom of crowds for robust gene network inference %A Daniel Marbach %A J.C. Costello %A Robert Küffner %A N.M. Vega %A R.J. Prill %A D.M. Camacho %A K.R. Allison %A Manolis Kellis %A J.J. Collins %A Aderhold, A. %A Gustavo Stolovitzky %A Bonneau, R. %A Chen, Y. %A Cordero, F. %A Martin Crane %A Dondelinger, F. %A Drton, M. %A Esposito, R. %A Foygel, R. %A De La Fuente, A. %A Gertheiss, J. %A Geurts, P. %A Greenfield, A. %A Grzegorczyk, M. %A Haury, A.-C. %A Holmes, B. %A Hothorn, T. %A Husmeier, D. %A Huynh-Thu, V.A. %A Irrthum, A. %A Karlebach, G. %A Lebre, S. %A De Leo, V. %A Madar, A. %A Mani, S. %A Mordelet, F. %A Ostrer, H. %A Ouyang, Z. %A Pandya, R. %A Petri, T. %A Pinna, A. %A Poultney, C.S. %A Rezny, S. %A Heather J Ruskin %A Saeys, Y. %A Shamir, R. %A Alina Sirbu %A Song, M. %A Soranzo, N. %A Statnikov, A. %A N.M. Vega %A Vera-Licona, P. %A Vert, J.-P. %A Visconti, A. %A Haizhou Wang %A Wehenkel, L. %A Windhager, L. %A Zhang, Y. %A Zimmer, R. %B Nature Methods %V 9 %P 796-804 %G eng %U http://www.scopus.com/inward/record.url?eid=2-s2.0-84870305264&partnerID=40&md5=04a686572bdefff60157bf68c95df7ea %R 10.1038/nmeth.2016 %0 Book Section %B Evolutionary Algorithms %D 2011 %T Stages of Gene Regulatory Network Inference: the Evolutionary Algorithm Role %A Alina Sirbu %A Heather J Ruskin %A Martin Crane %B Evolutionary Algorithms %I InTech %G eng %U http://www.intechopen.com/articles/show/title/stages-of-gene-regulatory-network-inference-the-evolutionary-algorithm-role %R 10.5772/15182 %0 Journal Article %J BMC Bioinformatics %D 2010 %T Comparison of evolutionary algorithms in gene regulatory network model inference. %A Alina Sirbu %A Heather J Ruskin %A Martin Crane %X

BACKGROUND: The evolution of high throughput technologies that measure gene expression levels has created a data base for inferring GRNs (a process also known as reverse engineering of GRNs). However, the nature of these data has made this process very difficult. At the moment, several methods of discovering qualitative causal relationships between genes with high accuracy from microarray data exist, but large scale quantitative analysis on real biological datasets cannot be performed, to date, as existing approaches are not suitable for real microarray data which are noisy and insufficient.

RESULTS: This paper performs an analysis of several existing evolutionary algorithms for quantitative gene regulatory network modelling. The aim is to present the techniques used and offer a comprehensive comparison of approaches, under a common framework. Algorithms are applied to both synthetic and real gene expression data from DNA microarrays, and ability to reproduce biological behaviour, scalability and robustness to noise are assessed and compared.

CONCLUSIONS: Presented is a comparison framework for assessment of evolutionary algorithms, used to infer gene regulatory networks. Promising methods are identified and a platform for development of appropriate model formalisms is established.

%B BMC Bioinformatics %V 11 %P 59 %8 2010 %G eng %R 10.1186/1471-2105-11-59 %0 Journal Article %J PLoS One %D 2010 %T Cross-platform microarray data normalisation for regulatory network inference. %A Alina Sirbu %A Heather J Ruskin %A Martin Crane %X

BACKGROUND: Inferring Gene Regulatory Networks (GRNs) from time course microarray data suffers from the dimensionality problem created by the short length of available time series compared to the large number of genes in the network. To overcome this, data integration from diverse sources is mandatory. Microarray data from different sources and platforms are publicly available, but integration is not straightforward, due to platform and experimental differences.

METHODS: We analyse here different normalisation approaches for microarray data integration, in the context of reverse engineering of GRN quantitative models. We introduce two preprocessing approaches based on existing normalisation techniques and provide a comprehensive comparison of normalised datasets.

CONCLUSIONS: Results identify a method based on a combination of Loess normalisation and iterative K-means as best for time series normalisation for this problem.

%B PLoS One %V 5 %P e13822 %8 2010 %G eng %R 10.1371/journal.pone.0013822 %0 Journal Article %J Proceedings of The IASTED Technology Conferences (International Conference on Computational Bioscience), Cambridge, Massachusetts %D 2010 %T Regulatory network modelling: Correlation for structure and parameter optimisation %A Alina Sirbu %A Heather J Ruskin %A Martin Crane %B Proceedings of The IASTED Technology Conferences (International Conference on Computational Bioscience), Cambridge, Massachusetts %P 3473–3481 %G eng %U http://www.actapress.com/Abstract.aspx?paperId=41573 %R 10.2316/P.2010.728-020