01834nas a2200145 4500008004100000245006100041210006100102260001300163520137000176100001401546700001701560700002101577700002301598856006701621 2020 eng d00aDigital Footprints of International Migration on Twitter0 aDigital Footprints of International Migration on Twitter bSpringer3 aStudying migration using traditional data has some limitations. To date, there have been several studies proposing innovative methodologies to measure migration stocks and flows from social big data. Nevertheless, a uniform definition of a migrant is difficult to find as it varies from one work to another depending on the purpose of the study and nature of the dataset used. In this work, a generic methodology is developed to identify migrants within the Twitter population. This describes a migrant as a person who has the current residence different from the nationality. The residence is defined as the location where a user spends most of his/her time in a certain year. The nationality is inferred from linguistic and social connections to a migrant’s country of origin. This methodology is validated first with an internal gold standard dataset and second with two official statistics, and shows strong performance scores and correlation coefficients. Our method has the advantage that it can identify both immigrants and emigrants, regardless of the origin/destination countries. The new methodology can be used to study various aspects of migration, including opinions, integration, attachment, stocks and flows, motivations for migration, etc. Here, we exemplify how trending topics across and throughout different migrant communities can be observed.1 aKim, Jisu1 aSirbu, Alina1 aGiannotti, Fosca1 aGabrielli, Lorenzo uhttps://link.springer.com/chapter/10.1007/978-3-030-44584-3_2201787nas a2200313 4500008004100000020001400041245004600055210004500101260001500146300001100161520090000172100001701072700002301089700002301112700002101135700001701156700002101173700002301194700002001217700001401237700002901251700002101280700002301301700002001324700002001344700002301364700001901387856006701406 2020 eng d a2364-416800aHuman migration: the big data perspective0 aHuman migration the big data perspective c2020/03/23 a1–203 aHow can big data help to understand the migration phenomenon? In this paper, we try to answer this question through an analysis of various phases of migration, comparing traditional and novel data sources and models at each phase. We concentrate on three phases of migration, at each phase describing the state of the art and recent developments and ideas. The first phase includes the journey, and we study migration flows and stocks, providing examples where big data can have an impact. The second phase discusses the stay, i.e. migrant integration in the destination country. We explore various data sets and models that can be used to quantify and understand migrant integration, with the final aim of providing the basis for the construction of a novel multi-level integration index. The last phase is related to the effects of migration on the source countries and the return of migrants.1 aSirbu, Alina1 aAndrienko, Gennady1 aAndrienko, Natalia1 aBoldrini, Chiara1 aConti, Marco1 aGiannotti, Fosca1 aGuidotti, Riccardo1 aBertoli, Simone1 aKim, Jisu1 aMuntean, Cristina, Ioana1 aPappalardo, Luca1 aPassarella, Andrea1 aPedreschi, Dino1 aPollacci, Laura1 aPratesi, Francesca1 aSharma, Rajesh uhttps://link.springer.com/article/10.1007%2Fs41060-020-00213-501975nas a2200157 4500008004100000245009800041210006900139300001300208490000700221520143200228100001701660700002001677700002101697700002101718856007801739 2019 eng d00aAlgorithmic bias amplifies opinion fragmentation and polarization: A bounded confidence model0 aAlgorithmic bias amplifies opinion fragmentation and polarizatio ae02132460 v143 aThe flow of information reaching us via the online media platforms is optimized not by the information content or relevance but by popularity and proximity to the target. This is typically performed in order to maximise platform usage. As a side effect, this introduces an algorithmic bias that is believed to enhance fragmentation and polarization of the societal debate. To study this phenomenon, we modify the well-known continuous opinion dynamics model of bounded confidence in order to account for the algorithmic bias and investigate its consequences. In the simplest version of the original model the pairs of discussion participants are chosen at random and their opinions get closer to each other if they are within a fixed tolerance level. We modify the selection rule of the discussion partners: there is an enhanced probability to choose individuals whose opinions are already close to each other, thus mimicking the behavior of online media which suggest interaction with similar peers. As a result we observe: a) an increased tendency towards opinion fragmentation, which emerges also in conditions where the original model would predict consensus, b) increased polarisation of opinions and c) a dramatic slowing down of the speed at which the convergence at the asymptotic state is reached, which makes the system highly unstable. Fragmentation and polarization are augmented by a fragmented initial population.1 aSirbu, Alina1 aPedreschi, Dino1 aGiannotti, Fosca1 aKertész, János uhttps://journals.plos.org/plosone/article?id=10.1371/journal.pone.021324600404nas a2200121 4500008004100000245004000041210004000081100001700121700002100138700002000159700002100179856008200200 2019 eng d00aPublic opinion and Algorithmic bias0 aPublic opinion and Algorithmic bias1 aSirbu, Alina1 aGiannotti, Fosca1 aPedreschi, Dino1 aKertész, János uhttps://ercim-news.ercim.eu/en116/special/public-opinion-and-algorithmic-bias01236nas a2200181 4500008004100000245009100041210006900132300001200201490000600213520065500219100002100874700001900895700001800914700001700932700002000949700002100969856006400990 2018 eng d00aNDlib: a python library to model and analyze diffusion processes over complex networks0 aNDlib a python library to model and analyze diffusion processes a61–790 v53 aNowadays the analysis of dynamics of and on networks represents a hot topic in the social network analysis playground. To support students, teachers, developers and researchers, in this work we introduce a novel framework, namely NDlib, an environment designed to describe diffusion simulations. NDlib is designed to be a multi-level ecosystem that can be fruitfully used by different user segments. For this reason, upon NDlib, we designed a simulation server that allows remote execution of experiments as well as an online visualization tool that abstracts its programmatic interface and makes available the simulation platform to non-technicians.1 aRossetti, Giulio1 aMilli, Letizia1 aRinzivillo, S1 aSirbu, Alina1 aPedreschi, Dino1 aGiannotti, Fosca uhttps://link.springer.com/article/10.1007/s41060-017-0086-601007nas a2200181 4500008004100000245005700041210005700098260001300155300001400168520046300182100002300645700001900668700001900687700002100706700001600727700001700743856006500760 2017 eng d00aApplications for Environmental Sensing in EveryAware0 aApplications for Environmental Sensing in EveryAware bSpringer a135–1553 aThis chapter provides a technical description of the EveryAware applications for air quality and noise monitoring. Specifically, we introduce AirProbe, for measuring air quality, and WideNoise Plus for estimating environmental noise. We also include an overview on hardware components and smartphone-based measurement technology, and we present the according web backend, e.g., providing for real-time tracking, data storage, analysis and visualizations. 1 aAtzmueller, Martin1 aBecker, Martin1 aMolino, Andrea1 aMueller, Juergen1 aPeters, Jan1 aSirbu, Alina uhttp://link.springer.com/chapter/10.1007/978-3-319-25658-0_701046nas a2200169 4500008004100000245012100041210006900162260001300231300001400244520045100258100002000709700001700729700001900746700002400765700002100789856006600810 2017 eng d00aExperimental Assessment of the Emergence of Awareness and Its Influence on Behavioral Changes: The Everyaware Lesson0 aExperimental Assessment of the Emergence of Awareness and Its In bSpringer a337–3623 aThe emergence of awareness is deeply connected to the process of learning. In fact, by learning that high sound levels may harm one’s health, that noise levels that we estimate as innocuous may be dangerous, that there exist an alternative path we can walk to go to work and minimize our exposure to air pollution, etc., citizens will be able to understand the environment around them and act consequently to go toward a more sustainable world.1 aGravino, Pietro1 aSirbu, Alina1 aBecker, Martin1 aServedio, Vito, D P1 aLoreto, Vittorio uhttp://link.springer.com/chapter/10.1007/978-3-319-25658-0_1601977nas a2200181 4500008004100000245007000041210006900111260001300180300001400193520139800207100002401605700002201629700002001651700002101671700001701692700002001709856006601729 2017 eng d00aLarge Scale Engagement Through Web-Gaming and Social Computations0 aLarge Scale Engagement Through WebGaming and Social Computations bSpringer a237–2543 aIn the last few years the Web has progressively acquired the status of an infrastructure for social computation that allows researchers to coordinate the cognitive abilities of human agents, so to steer the collective user activity towards predefined goals. This general trend is also triggering the adoption of web-games as an alternative laboratory to run experiments in the social sciences and whenever the contribution of human beings can be effectively used for research purposes. Web-games introduce a playful aspect in scientific experiments with the result of increasing participation of people and of keeping their attention steady in time. The aim of this chapter is to suggest a general purpose web-based platform scheme for web-gaming and social computation. This platform will simplify the realization of web-games and will act as a repository of different scientific experiments, thus realizing a sort of showcase that stimulates users’ curiosity and helps researchers in recruiting volunteers. A platform built by following these criteria has been developed within the EveryAware project, the Experimental Tribe (XTribe) platform, which is operational and ready to be used. Finally, a sample web-game hosted by the XTribe platform will be presented with the aim of reporting the results, in terms of participation and motivation, of two different player recruiting strategies.1 aServedio, Vito, D P1 aCaminiti, Saverio1 aGravino, Pietro1 aLoreto, Vittorio1 aSirbu, Alina1 aTria, Francesca uhttp://link.springer.com/chapter/10.1007/978-3-319-25658-0_1201274nas a2200169 4500008004100000245009100041210006900132300001100201520065400212100002100866700001900887700001800906700001700924700002000941700002100961856012200982 2017 eng d00aNDlib: a python library to model and analyze diffusion processes over complex networks0 aNDlib a python library to model and analyze diffusion processes a1–193 aNowadays the analysis of dynamics of and on networks represents a hot topic in the social network analysis playground.To support students, teachers, developers and researchers, in this work we introduce a novel framework, namely NDlib, an environment designed to describe diffusion simulations. NDlib is designed to be a multi-level ecosystem that can be fruitfully used by different user segments. For this reason, upon NDlib, we designed a simulation server that allows remote execution of experiments as well as an online visualization tool that abstracts its programmatic interface and makes available the simulation platform to non-technicians.1 aRossetti, Giulio1 aMilli, Letizia1 aRinzivillo, S1 aSirbu, Alina1 aPedreschi, Dino1 aGiannotti, Fosca uhttps://kdd.isti.cnr.it/publications/ndlib-python-library-model-and-analyze-diffusion-processes-over-complex-networks01119nas a2200169 4500008004100000245004700041210004600088260001000134520063100144100002100775700001900796700001800815700001700833700002000850700002100870856005800891 2017 eng d00aNDlib: Studying Network Diffusion Dynamics0 aNDlib Studying Network Diffusion Dynamics aTokyo3 aNowadays the analysis of diffusive phenomena occurring on top of complex networks represents a hot topic in the Social Network Analysis playground. In order to support students, teachers, developers and researchers in this work we introduce a novel simulation framework, ND LIB . ND LIB is designed to be a multi-level ecosystem that can be fruitfully used by different user segments. Upon the diffusion library, we designed a simulation server that allows remote execution of experiments and an online visualization tool that abstract the programmatic interface and makes available the simulation platform to non-technicians.1 aRossetti, Giulio1 aMilli, Letizia1 aRinzivillo, S1 aSirbu, Alina1 aPedreschi, Dino1 aGiannotti, Fosca uhttps://ieeexplore.ieee.org/abstract/document/825977402813nas a2200157 4500008004100000245006200041210006000103260001300163300001400176520231700190100001702507700002102524700002402545700002002569856006602589 2017 eng d00aOpinion dynamics: models, extensions and external effects0 aOpinion dynamics models extensions and external effects bSpringer a363–4013 aRecently, social phenomena have received a lot of attention not only from social scientists, but also from physicists, mathematicians and computer scientists, in the emerging interdisciplinary field of complex system science. Opinion dynamics is one of the processes studied, since opinions are the drivers of human behaviour, and play a crucial role in many global challenges that our complex world and societies are facing: global financial crises, global pandemics, growth of cities, urbanisation and migration patterns, and last but not least important, climate change and environmental sustainability and protection. Opinion formation is a complex process affected by the interplay of different elements, including the individual predisposition, the influence of positive and negative peer interaction (social networks playing a crucial role in this respect), the information each individual is exposed to, and many others. Several models inspired from those in use in physics have been developed to encompass many of these elements, and to allow for the identification of the mechanisms involved in the opinion formation process and the understanding of their role, with the practical aim of simulating opinion formation and spreading under various conditions. These modelling schemes range from binary simple models such as the voter model, to multi-dimensional continuous approaches. Here, we provide a review of recent methods, focusing on models employing both peer interaction and external information, and emphasising the role that less studied mechanisms, such as disagreement, has in driving the opinion dynamics. Due to the important role that external information (mainly in the form of mass media broadcast) can have in enhancing awareness of social issues, a special emphasis will be devoted to study different forms it can take, investigating their effectiveness in driving the opinion formation at the population level. The review shows that, although a large number of approaches exist, some mechanisms such as the effect of multiple external information sources could largely benefit from further studies. Additionally, model validation with real data, which are starting to become available, is still largely lacking and should in our opinion be the main ambition of future investigations.1 aSirbu, Alina1 aLoreto, Vittorio1 aServedio, Vito, D P1 aTria, Francesca uhttp://link.springer.com/chapter/10.1007/978-3-319-25658-0_1701783nas a2200169 4500008004100000245009100041210006900132260001300201520120400214100002001418700001701438700002101455700002001476700002201496700002901518856006601547 2017 eng d00aSentiment Spreading: An Epidemic Model for Lexicon-Based Sentiment Analysis on Twitter0 aSentiment Spreading An Epidemic Model for LexiconBased Sentiment bSpringer3 aWhile sentiment analysis has received significant attention in the last years, problems still exist when tools need to be applied to microblogging content. This because, typically, the text to be analysed consists of very short messages lacking in structure and semantic context. At the same time, the amount of text produced by online platforms is enormous. So, one needs simple, fast and effective methods in order to be able to efficiently study sentiment in these data. Lexicon-based methods, which use a predefined dictionary of terms tagged with sentiment valences to evaluate sentiment in longer sentences, can be a valid approach. Here we present a method based on epidemic spreading to automatically extend the dictionary used in lexicon-based sentiment analysis, starting from a reduced dictionary and large amounts of Twitter data. The resulting dictionary is shown to contain valences that correlate well with human-annotated sentiment, and to produce tweet sentiment classifications comparable to the original dictionary, with the advantage of being able to tag more tweets than the original. The method is easily extensible to various languages and applicable to large amounts of data.1 aPollacci, Laura1 aSirbu, Alina1 aGiannotti, Fosca1 aPedreschi, Dino1 aLucchese, Claudio1 aMuntean, Cristina, Ioana uhttps://link.springer.com/chapter/10.1007/978-3-319-70169-1_901568nas a2200133 4500008004100000245008400041210006900125260003600194490001400230520111700244100001701361700002001378856003601398 2016 eng d00aPower Consumption Modeling and Prediction in a Hybrid CPU-GPU-MIC Supercomputer0 aPower Consumption Modeling and Prediction in a Hybrid CPUGPUMIC aGrenoble, FrancebSpringer LNCS0 vLNCS 98333 aPower consumption is a major obstacle for High Performance Computing (HPC) systems in their quest towards the holy grail of ExaFLOP performance. Significant advances in power efficiency have to be made before this goal can be attained and accurate modeling is an essential step towards power efficiency by optimizing system operating parameters to match dynamic energy needs. In this paper we present a study of power consumption by jobs in Eurora, a hybrid CPU-GPU-MIC system installed at the largest Italian data center. Using data from a dedicated monitoring framework, we build a data-driven model of power consumption for each user in the system and use it to predict the power requirements of future jobs. We are able to achieve good prediction results for over 80 % of the users in the system. For the remaining users, we identify possible reasons why prediction performance is not as good. Possible applications for our predictive modeling results include scheduling optimization, power-aware billing and system-scale power modeling. All the scripts used for the study have been made available on GitHub.1 aSirbu, Alina1 aBabaoglu, Ozalp uhttp://arxiv.org/abs/1601.0596101787nas a2200121 4500008004100000245006100041210006000102260003800162520137900200100001701579700002001596856004901616 2016 eng d00aPredicting System-level Power for a Hybrid Supercomputer0 aPredicting Systemlevel Power for a Hybrid Supercomputer aInnsbruck, AustriabIEEEc07/20163 aFor current High Performance Computing systems to scale towards the holy grail of ExaFLOP performance, their power consumption has to be reduced by at least one order of magnitude. This goal can be achieved only through a combination of hardware and software advances. Being able to model and accurately predict the power consumption of large computational systems is necessary for software-level innovations such as proactive and power-aware scheduling, resource allocation and fault tolerance techniques. In this paper we present a 2-layer model of power consumption for a hybrid supercomputer (which held the top spot of the Green500 list on July 2013) that combines CPU, GPU and MIC technologies to achieve higher energy efficiency. Our model takes as input workload information - the number and location of resources that are used by each job at a certain time - and calculates the resulting system-level power consumption. When jobs are submitted to the system, the workload configuration can be foreseen based on the scheduler policies, and our model can then be applied to predict the ensuing system-level power consumption. Additionally, alternative workload configurations can be evaluated from a power perspective and more efficient ones can be selected. Applications of the model include not only power-aware scheduling but also prediction of anomalous behavior.1 aSirbu, Alina1 aBabaoglu, Ozalp uhttp://ieeexplore.ieee.org/document/7568420/02543nas a2200133 4500008004100000245009300041210006900134260001200203300001100215520208300226100001702309700002002326856006302346 2016 eng d00aTowards operator-less data centers through data-driven, predictive, proactive autonomics0 aTowards operatorless data centers through datadriven predictive c04/2016 a1–143 aContinued reliance on human operators for managing data centers is a major impediment for them from ever reaching extreme dimensions. Large computer systems in general, and data centers in particular, will ultimately be managed using predictive computational and executable models obtained through data-science tools, and at that point, the intervention of humans will be limited to setting high-level goals and policies rather than performing low-level operations. Data-driven autonomics, where management and control are based on holistic predictive models that are built and updated using live data, opens one possible path towards limiting the role of operators in data centers. In this paper, we present a data-science study of a public Google dataset collected in a 12K-node cluster with the goal of building and evaluating predictive models for node failures. Our results support the practicality of a data-driven approach by showing the effectiveness of predictive models based on data found in typical data center logs. We use BigQuery, the big data SQL platform from the Google Cloud suite, to process massive amounts of data and generate a rich feature set characterizing node state over time. We describe how an ensemble classifier can be built out of many Random Forest classifiers each trained on these features, to predict if nodes will fail in a future 24-h window. Our evaluation reveals that if we limit false positive rates to 5 %, we can achieve true positive rates between 27 and 88 % with precision varying between 50 and 72 %. This level of performance allows us to recover large fraction of jobs’ executions (by redirecting them to other nodes when a failure of the present node is predicted) that would otherwise have been wasted due to failures. We discuss the feasibility of including our predictive model as the central component of a data-driven autonomic manager and operating it on-line with live data streams (rather than off-line on data logs). All of the scripts used for BigQuery and classification analyses are publicly available on GitHub.1 aSirbu, Alina1 aBabaoglu, Ozalp uhttp://link.springer.com/article/10.1007/s10586-016-0564-y01492nas a2200169 4500008004100000020002200041245006500063210006400128520092200192100002001114700002201134700001701156700002001173700002401193700002101217856008401238 2016 eng d a978-989-758-181-600aUnveiling Political Opinion Structures with a Web-experiment0 aUnveiling Political Opinion Structures with a Webexperiment3 aThe dynamics of political votes has been widely studied, both for its practical interest and as a paradigm of the dynamics of mass opinions and collective phenomena, where theoretical predictions can be easily tested. However, the vote outcome is often influenced by many factors beyond the bare opinion on the candidate, and in most cases it is bound to a single preference. The voter perception of the political space is still to be elucidated. We here propose a web experiment (laPENSOcos`ı) where we explicitly investigate participants’ opinions on political entities (parties, coalitions, individual candidates) of the Italian political scene. As a main result, we show that the political perception follows a Weber-Fechner-like law, i.e., when ranking political entities according to the user expressed preferences, the perceived distance of the user from a given entity scales as the logarithm of this rank.1 aGravino, Pietro1 aCaminiti, Saverio1 aSirbu, Alina1 aTria, Francesca1 aServedio, Vito, D P1 aLoreto, Vittorio uhttp://www.scitepress.org/DigitalLibrary/Link.aspx?doi=10.5220/000590630039004700424nas a2200133 4500008004100000245004500041210004300086100001900129700002100148700002000169700002100189700001700210856006300227 2015 eng d00aA Big Data Analyzer for Large Trace Logs0 aBig Data Analyzer for Large Trace Logs1 aBalliu, Alkida1 aOlivetti, Dennis1 aBabaoglu, Ozalp1 aMarzolla, Moreno1 aSirbu, Alina uhttp://link.springer.com/article/10.1007/s00607-015-0480-700450nas a2200133 4500008004100000245008600041210006900127300001400196490000600210100001700216700001800233700002300251856004200274 2015 eng d00aData Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks0 aData Integration for Microarrays Enhanced Inference for Gene Reg a255–2690 v41 aSirbu, Alina1 aCrane, Martin1 aRuskin, Heather, J uhttp://www.mdpi.com/2076-3905/4/2/25500493nas a2200133 4500008004100000245008200041210006900123300001100192100002400203700002200227700003000249700001700279856006300296 2015 eng d00aEgalitarianism in the rank aggregation problem: a new dimension for democracy0 aEgalitarianism in the rank aggregation problem a new dimension f a1–161 aContucci, Pierluigi1 aPanizzi, Emanuele1 aRicci-Tersenghi, Federico1 aSirbu, Alina uhttp://link.springer.com/article/10.1007/s11135-015-0197-x00450nas a2200109 4500008004100000245011200041210006900153260001300222100001700235700002000252856006800272 2015 eng d00aA Holistic Approach to Log Data Analysis in High-Performance Computing Systems: The Case of IBM Blue Gene/Q0 aHolistic Approach to Log Data Analysis in HighPerformance Comput bSpringer1 aSirbu, Alina1 aBabaoglu, Ozalp uhttp://link.springer.com/chapter/10.1007%2F978-3-319-27308-2_5102544nas a2200373 4500008004100000022001400041245008200055210006900137260000900206300001300215490000700228520142000235100001701655700001901672700002201691700002201713700001501735700002001750700002001770700001901790700002101809700002101830700001901851700002101870700001601891700002601907700002001933700002401953700001701977700001701994700002002011700002702031856011202058 2015 eng d a1932-620300aParticipatory Patterns in an International Air Quality Monitoring Initiative.0 aParticipatory Patterns in an International Air Quality Monitorin c2015 ae01367630 v103 a

The issue of sustainability is at the top of the political and societal agenda, being considered of extreme importance and urgency. Human individual action impacts the environment both locally (e.g., local air/water quality, noise disturbance) and globally (e.g., climate change, resource use). Urban environments represent a crucial example, with an increasing realization that the most effective way of producing a change is involving the citizens themselves in monitoring campaigns (a citizen science bottom-up approach). This is possible by developing novel technologies and IT infrastructures enabling large citizen participation. Here, in the wider framework of one of the first such projects, we show results from an international competition where citizens were involved in mobile air pollution monitoring using low cost sensing devices, combined with a web-based game to monitor perceived levels of pollution. Measures of shift in perceptions over the course of the campaign are provided, together with insights into participatory patterns emerging from this study. Interesting effects related to inertia and to direct involvement in measurement activities rather than indirect information exposure are also highlighted, indicating that direct involvement can enhance learning and environmental awareness. In the future, this could result in better adoption of policies towards decreasing pollution.

1 aSirbu, Alina1 aBecker, Martin1 aCaminiti, Saverio1 aDe Baets, Bernard1 aElen, Bart1 aFrancis, Louise1 aGravino, Pietro1 aHotho, Andreas1 aIngarra, Stefano1 aLoreto, Vittorio1 aMolino, Andrea1 aMueller, Juergen1 aPeters, Jan1 aRicchiuti, Ferdinando1 aSaracino, Fabio1 aServedio, Vito, D P1 aStumme, Gerd1 aTheunis, Jan1 aTria, Francesca1 aVan den Bossche, Joris uhttps://kdd.isti.cnr.it/publications/participatory-patterns-international-air-quality-monitoring-initiative00409nas a2200109 4500008004100000245005100041210005000092260000900142100001700151700002000168856011100188 2015 eng d00aTowards Data-Driven Autonomics in Data Centers0 aTowards DataDriven Autonomics in Data Centers bIEEE1 aSirbu, Alina1 aBabaoglu, Ozalp uhttp://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7312140&filter%3DAND%28p_IS_Number%3A7312127%2900459nas a2200145 4500008004100000245004800041210004700089260004400136100001900180700002100199700002000220700002100240700001700261856003500278 2014 eng d00aBiDAl: Big Data Analyzer for Cluster Traces0 aBiDAl Big Data Analyzer for Cluster Traces bGI-Edition Lecture Notes in Informatics1 aBalliu, Alkida1 aOlivetti, Dennis1 aBabaoglu, Ozalp1 aMarzolla, Moreno1 aSirbu, Alina uhttp://arxiv.org/abs/1410.130900512nas a2200133 4500008004100000245009000041210007100131260003800202300001400240100001700254700001800271700002300289856006600312 2014 eng d00aEGIA–Evolutionary Optimisation of Gene Regulatory Networks, an Integrative Approach0 aEGIA–Evolutionary Optimisation of Gene Regulatory Networks an In bSpringer International Publishing a217–2291 aSirbu, Alina1 aCrane, Martin1 aRuskin, Heather, J uhttp://link.springer.com/chapter/10.1007/978-3-319-05401-8_2102195nas a2200289 4500008004100000022001400041245005900055210005800114260000900172300001100181490000600192520133700198100001901535700002201554700002101576700002001597700002001617700002801637700001901665700002101684700002101705700002601726700002401752700001701776700002001793856009201813 2013 eng d a1932-620300aAwareness and learning in participatory noise sensing.0 aAwareness and learning in participatory noise sensing c2013 ae816380 v83 a

The development of ICT infrastructures has facilitated the emergence of new paradigms for looking at society and the environment over the last few years. Participatory environmental sensing, i.e. directly involving citizens in environmental monitoring, is one example, which is hoped to encourage learning and enhance awareness of environmental issues. In this paper, an analysis of the behaviour of individuals involved in noise sensing is presented. Citizens have been involved in noise measuring activities through the WideNoise smartphone application. This application has been designed to record both objective (noise samples) and subjective (opinions, feelings) data. The application has been open to be used freely by anyone and has been widely employed worldwide. In addition, several test cases have been organised in European countries. Based on the information submitted by users, an analysis of emerging awareness and learning is performed. The data show that changes in the way the environment is perceived after repeated usage of the application do appear. Specifically, users learn how to recognise different noise levels they are exposed to. Additionally, the subjective data collected indicate an increased user involvement in time and a categorisation effect between pleasant and less pleasant environments.

1 aBecker, Martin1 aCaminiti, Saverio1 aFiorella, Donato1 aFrancis, Louise1 aGravino, Pietro1 aHaklay, Mordechai, Muki1 aHotho, Andreas1 aLoreto, Vittorio1 aMueller, Juergen1 aRicchiuti, Ferdinando1 aServedio, Vito, D P1 aSirbu, Alina1 aTria, Francesca uhttps://kdd.isti.cnr.it/publications/awareness-and-learning-participatory-noise-sensing00492nas a2200145 4500008004100000245006800041210006700109300001200176490000700188100001700195700002100212700002400233700002000257856006900277 2013 eng d00aCohesion, consensus and extreme information in opinion dynamics0 aCohesion consensus and extreme information in opinion dynamics a13500350 v161 aSirbu, Alina1 aLoreto, Vittorio1 aServedio, Vito, D P1 aTria, Francesca uhttp://www.worldscientific.com/doi/abs/10.1142/S021952591350035500461nas a2200133 4500008004100000245006500041210006500106300001100171100001700182700002100199700002400220700002000244856006300264 2013 eng d00aOpinion dynamics with disagreement and modulated information0 aOpinion dynamics with disagreement and modulated information a1–201 aSirbu, Alina1 aLoreto, Vittorio1 aServedio, Vito, D P1 aTria, Francesca uhttp://link.springer.com/article/10.1007/s10955-013-0724-x00611nas a2200169 4500008004100000245005200041210005000093260000900143100002200152700002000174700002000194700002100214700002400235700001700259700002000276856014500296 2013 eng d00aXTribe: a web-based social computation platform0 aXTribe a webbased social computation platform bIEEE1 aCaminiti, Saverio1 aCicali, Claudio1 aGravino, Pietro1 aLoreto, Vittorio1 aServedio, Vito, D P1 aSirbu, Alina1 aTria, Francesca uhttp://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6686061&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D668606101654nas a2200169 4500008004100000022001400041245009000055210006900145260001300214300001100227490000800238520105800246100001701304700002301321700001801344856012201362 2012 eng d a1611-753000aIntegrating heterogeneous gene expression data for gene regulatory network modelling.0 aIntegrating heterogeneous gene expression data for gene regulato c2012 Jun a95-1020 v1313 a

Gene regulatory networks (GRNs) are complex biological systems that have a large impact on protein levels, so that discovering network interactions is a major objective of systems biology. Quantitative GRN models have been inferred, to date, from time series measurements of gene expression, but at small scale, and with limited application to real data. Time series experiments are typically short (number of time points of the order of ten), whereas regulatory networks can be very large (containing hundreds of genes). This creates an under-determination problem, which negatively influences the results of any inferential algorithm. Presented here is an integrative approach to model inference, which has not been previously discussed to the authors' knowledge. Multiple heterogeneous expression time series are used to infer the same model, and results are shown to be more robust to noise and parameter perturbation. Additionally, a wavelet analysis shows that these models display limited noise over-fitting within the individual datasets.

1 aSirbu, Alina1 aRuskin, Heather, J1 aCrane, Martin uhttps://kdd.isti.cnr.it/publications/integrating-heterogeneous-gene-expression-data-gene-regulatory-network-modelling02529nas a2200181 4500008004100000022001400041245012200055210006900177260000900246300001100255490000600266520187200272100001702144700001902161700001802180700002302198856012602221 2012 eng d a1932-620300aRNA-Seq vs dual- and single-channel microarray data: sensitivity analysis for differential expression and clustering.0 aRNASeq vs dual and singlechannel microarray data sensitivity ana c2012 ae509860 v73 a

With the fast development of high-throughput sequencing technologies, a new generation of genome-wide gene expression measurements is under way. This is based on mRNA sequencing (RNA-seq), which complements the already mature technology of microarrays, and is expected to overcome some of the latter's disadvantages. These RNA-seq data pose new challenges, however, as strengths and weaknesses have yet to be fully identified. Ideally, Next (or Second) Generation Sequencing measures can be integrated for more comprehensive gene expression investigation to facilitate analysis of whole regulatory networks. At present, however, the nature of these data is not very well understood. In this paper we study three alternative gene expression time series datasets for the Drosophila melanogaster embryo development, in order to compare three measurement techniques: RNA-seq, single-channel and dual-channel microarrays. The aim is to study the state of the art for the three technologies, with a view of assessing overlapping features, data compatibility and integration potential, in the context of time series measurements. This involves using established tools for each of the three different technologies, and technical and biological replicates (for RNA-seq and microarrays, respectively), due to the limited availability of biological RNA-seq replicates for time series data. The approach consists of a sensitivity analysis for differential expression and clustering. In general, the RNA-seq dataset displayed highest sensitivity to differential expression. The single-channel data performed similarly for the differentially expressed genes common to gene sets considered. Cluster analysis was used to identify different features of the gene space for the three datasets, with higher similarities found for the RNA-seq and single-channel microarray dataset.

1 aSirbu, Alina1 aKerr, Gráinne1 aCrane, Martin1 aRuskin, Heather, J uhttps://kdd.isti.cnr.it/publications/rna-seq-vs-dual-and-single-channel-microarray-data-sensitivity-analysis-differential02085nas a2200805 4500008004100000245005500041210005500096300001200151490000600163100002000169700001900189700002100208700001500229700001600244700001800260700001800278700002000296700001800316700001700334700002500351700001600376700001300392700001600405700001800421700002000439700001400459700001700473700001500490700002100505700001800526700001500544700001900559700002000578700001700598700001500615700001600630700001700646700002000663700001600683700001800699700001400717700001500731700001400746700001300760700001700773700001500790700001500805700001500820700001400835700001400849700001900863700001400882700002300896700001400919700001500933700001700948700001300965700001600978700001800994700001501012700002001027700001601047700001701063700001801080700001701098700001801115700001401133700001501147856011701162 2012 eng d00aWisdom of crowds for robust gene network inference0 aWisdom of crowds for robust gene network inference a796-8040 v91 aMarbach, Daniel1 aCostello, J.C.1 aKüffner, Robert1 aVega, N.M.1 aPrill, R.J.1 aCamacho, D.M.1 aAllison, K.R.1 aKellis, Manolis1 aCollins, J.J.1 aAderhold, A.1 aStolovitzky, Gustavo1 aBonneau, R.1 aChen, Y.1 aCordero, F.1 aCrane, Martin1 aDondelinger, F.1 aDrton, M.1 aEsposito, R.1 aFoygel, R.1 aDe La Fuente, A.1 aGertheiss, J.1 aGeurts, P.1 aGreenfield, A.1 aGrzegorczyk, M.1 aHaury, A.-C.1 aHolmes, B.1 aHothorn, T.1 aHusmeier, D.1 aHuynh-Thu, V.A.1 aIrrthum, A.1 aKarlebach, G.1 aLebre, S.1 aDe Leo, V.1 aMadar, A.1 aMani, S.1 aMordelet, F.1 aOstrer, H.1 aOuyang, Z.1 aPandya, R.1 aPetri, T.1 aPinna, A.1 aPoultney, C.S.1 aRezny, S.1 aRuskin, Heather, J1 aSaeys, Y.1 aShamir, R.1 aSirbu, Alina1 aSong, M.1 aSoranzo, N.1 aStatnikov, A.1 aVega, N.M.1 aVera-Licona, P.1 aVert, J.-P.1 aVisconti, A.1 aWang, Haizhou1 aWehenkel, L.1 aWindhager, L.1 aZhang, Y.1 aZimmer, R. uhttp://www.scopus.com/inward/record.url?eid=2-s2.0-84870305264&partnerID=40&md5=04a686572bdefff60157bf68c95df7ea00508nas a2200121 4500008004100000245008100041210006900122260001100191100001700202700002300219700001800242856012600260 2011 eng d00aStages of Gene Regulatory Network Inference: the Evolutionary Algorithm Role0 aStages of Gene Regulatory Network Inference the Evolutionary Alg bInTech1 aSirbu, Alina1 aRuskin, Heather, J1 aCrane, Martin uhttp://www.intechopen.com/articles/show/title/stages-of-gene-regulatory-network-inference-the-evolutionary-algorithm-role01924nas a2200169 4500008004100000022001400041245008600055210006900141260000900210300000700219490000700226520134700233100001701580700002301597700001801620856011601638 2010 eng d a1471-210500aComparison of evolutionary algorithms in gene regulatory network model inference.0 aComparison of evolutionary algorithms in gene regulatory network c2010 a590 v113 a

BACKGROUND: The evolution of high throughput technologies that measure gene expression levels has created a data base for inferring GRNs (a process also known as reverse engineering of GRNs). However, the nature of these data has made this process very difficult. At the moment, several methods of discovering qualitative causal relationships between genes with high accuracy from microarray data exist, but large scale quantitative analysis on real biological datasets cannot be performed, to date, as existing approaches are not suitable for real microarray data which are noisy and insufficient.

RESULTS: This paper performs an analysis of several existing evolutionary algorithms for quantitative gene regulatory network modelling. The aim is to present the techniques used and offer a comprehensive comparison of approaches, under a common framework. Algorithms are applied to both synthetic and real gene expression data from DNA microarrays, and ability to reproduce biological behaviour, scalability and robustness to noise are assessed and compared.

CONCLUSIONS: Presented is a comparison framework for assessment of evolutionary algorithms, used to infer gene regulatory networks. Promising methods are identified and a platform for development of appropriate model formalisms is established.

1 aSirbu, Alina1 aRuskin, Heather, J1 aCrane, Martin uhttps://kdd.isti.cnr.it/publications/comparison-evolutionary-algorithms-gene-regulatory-network-model-inference01557nas a2200169 4500008004100000022001400041245008300055210006900138260000900207300001100216490000600227520098100233100001701214700002301231700001801254856011501272 2010 eng d a1932-620300aCross-platform microarray data normalisation for regulatory network inference.0 aCrossplatform microarray data normalisation for regulatory netwo c2010 ae138220 v53 a

BACKGROUND: Inferring Gene Regulatory Networks (GRNs) from time course microarray data suffers from the dimensionality problem created by the short length of available time series compared to the large number of genes in the network. To overcome this, data integration from diverse sources is mandatory. Microarray data from different sources and platforms are publicly available, but integration is not straightforward, due to platform and experimental differences.

METHODS: We analyse here different normalisation approaches for microarray data integration, in the context of reverse engineering of GRN quantitative models. We introduce two preprocessing approaches based on existing normalisation techniques and provide a comprehensive comparison of normalised datasets.

CONCLUSIONS: Results identify a method based on a combination of Loess normalisation and iterative K-means as best for time series normalisation for this problem.

1 aSirbu, Alina1 aRuskin, Heather, J1 aCrane, Martin uhttps://kdd.isti.cnr.it/publications/cross-platform-microarray-data-normalisation-regulatory-network-inference00450nas a2200121 4500008004100000245008700041210006900128300001600197100001700213700002300230700001800253856005700271 2010 eng d00aRegulatory network modelling: Correlation for structure and parameter optimisation0 aRegulatory network modelling Correlation for structure and param a3473–34811 aSirbu, Alina1 aRuskin, Heather, J1 aCrane, Martin uhttp://www.actapress.com/Abstract.aspx?paperId=41573