Publications catalog - journals

Open Access Title

EPJ Data Science

Abstract/Description (provided by the publisher, in English)
The 21st century is currently witnessing the establishment of data-driven science as a complementary approach to the traditional hypothesis-driven method. This (r)evolution accompanying the paradigm shift from reductionism to complex systems sciences has already largely transformed the natural sciences and is about to bring the same changes to the techno-socio-economic sciences, viewed broadly.
Keywords (provided by the publisher)

data analysis; data mining; data enrichment

Availability
Detected institution | Period | Browse | Download | Request
Not required | Jan. 2012 to Nov. 2024 | Directory of Open Access Journals | open access
Not required | May 2012 to Nov. 2024 | SpringerLink | open access

Information

Resource type:

journals

Electronic ISSN

2193-1127

Publisher

Springer Nature

Publication languages

  • English

Country of publication

United Kingdom

Publication date

CC license information

https://creativecommons.org/licenses/by/4.0/

Table of contents

Multifaceted online coordinated behavior in the 2020 US presidential election

Serena Tardelli; Leonardo Nizzoli; Marco Avvenuti; Stefano Cresci; Maurizio Tesconi

Abstract: Organized attempts to manipulate public opinion during election run-ups have dominated online debates in the last few years. Such attempts require numerous accounts to act in coordination to exert influence. Yet, the ways in which coordinated behavior surfaces during major online political debates are still largely unclear. This study sheds light on coordinated behaviors that took place on Twitter (now X) during the 2020 US Presidential Election. Utilizing state-of-the-art network science methods, we detect and characterize the coordinated communities that participated in the debate. Our approach goes beyond previous analyses by proposing a multifaceted characterization of the coordinated communities that allows obtaining nuanced results. In particular, we uncover three main categories of coordinated users: (i) moderate groups genuinely interested in the electoral debate, (ii) conspiratorial groups that spread false information and divisive narratives, and (iii) foreign influence networks that either sought to tamper with the debate or that exploited it to publicize their own agendas. We also reveal a large use of automation by far-right foreign influence and conspiratorial communities. Conversely, left-leaning supporters were overall less coordinated and engaged primarily in harmless, factual communication. Our results also show that Twitter was effective at thwarting the activity of some coordinated groups, while it failed on some other equally suspicious ones. Overall, this study advances the understanding of online human interactions and contributes new knowledge to mitigate cyber social threats.

Pp. Not available

Early career wins and tournament prestige characterize tennis players’ trajectories

Chiara Zappalà; Sandro Sousa; Tiago Cunha; Alessandro Pluchino; Andrea Rapisarda; Roberta Sinatra

Abstract: Success in sports is a complex phenomenon that has only garnered limited research attention. In particular, we lack a deep scientific understanding of success in sports like tennis and the factors that contribute to it. Here, we study the unfolding of tennis players’ careers to understand the role of early career stages and the impact of specific tournaments on players’ trajectories. We employ a comprehensive approach combining network science and analysis of the Association of Tennis Professionals (ATP) tournament data and introduce a novel method to quantify tournament prestige based on the eigenvector centrality of the co-attendance network of tournaments. Focusing on the interplay between participation in central tournaments and players’ performance, we find that the level of the tournament where players achieve their first win is associated with becoming a top player. This work sheds light on the critical role of the initial stages in the progression of players’ careers, offering valuable insights into the dynamics of success in tennis.

Pp. Not available
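
The abstract above names eigenvector centrality on a tournament co-attendance network as its prestige measure. The following is only a minimal sketch of that general idea, not the authors' implementation: the player and tournament names are hypothetical, the edge weight (number of players who attended both tournaments) is an assumption, and the centrality is computed with a plain power iteration.

```python
from itertools import combinations


def co_attendance_weights(attendance):
    """Count, for each pair of tournaments, how many players attended both."""
    weights = {}
    for tournaments in attendance.values():
        for a, b in combinations(sorted(set(tournaments)), 2):
            weights[(a, b)] = weights.get((a, b), 0) + 1
    return weights


def eigenvector_centrality(weights, iterations=200, tol=1e-10):
    """Power iteration on the symmetric weighted adjacency matrix A.

    Each step multiplies by (A + I) rather than A; both share the same
    leading eigenvector, and the shift avoids oscillation on bipartite graphs.
    """
    nodes = sorted({n for pair in weights for n in pair})
    neighbors = {n: {} for n in nodes}
    for (a, b), w in weights.items():
        neighbors[a][b] = w
        neighbors[b][a] = w
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new = {
            n: score[n] + sum(w * score[m] for m, w in neighbors[n].items())
            for n in nodes
        }
        norm = sum(v * v for v in new.values()) ** 0.5
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - score[n]) for n in nodes) < tol:
            return new
        score = new
    return score


# Hypothetical attendance records: player -> tournaments entered.
attendance = {
    "player_a": ["Open X", "Open Y", "Open Z"],
    "player_b": ["Open X", "Open Y"],
    "player_c": ["Open X", "Open Z"],
    "player_d": ["Open Y", "Open W"],
}

weights = co_attendance_weights(attendance)
prestige = eigenvector_centrality(weights)
most_central = max(prestige, key=prestige.get)
least_central = min(prestige, key=prestige.get)
```

In this toy network a tournament scores highly when it shares many players with other high-scoring tournaments, which is the intuition behind using the leading eigenvector as a prestige score.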

Segmentation using large language models: A new typology of American neighborhoods

Alex D. Singleton; Seth Spielman

Abstract: In the United States, recent changes to the National Statistical System have amplified the geographic-demographic resolution trade-off. That is, when working with demographic and economic data from the American Community Survey, as one zooms in geographically one loses resolution demographically due to very large margins of error. In this paper, we present a solution to this problem in the form of an AI-based, open and reproducible geodemographic classification system for the United States using small area estimates from the American Community Survey (ACS). We apply a partitioning clustering algorithm to a range of socio-economic, demographic, and built environment variables. Our approach utilizes an open source software pipeline that ensures adaptability to future data updates. A key innovation is the integration of GPT4, a state-of-the-art large language model, to generate intuitive cluster descriptions and names. This represents a novel application of natural language processing in geodemographic research and showcases the potential for human-AI collaboration within the geospatial domain.

Pp. Not available

Who makes open source code? The hybridisation of commercial and open source practices

Peter Mehler; Eva Iris Otto; Anna Sapienza

Abstract: While Free and Open Source (F/OSS) coding has traditionally been described as a separate commons linked to values of openness and sharing, recent research suggests an increasing integration of private corporations into F/OSS practices, blurring the boundaries between F/OSS and commodified coding. However, there is a dearth of empirical, and especially quantitative, studies exploring this phenomenon. To address this gap, we model the power dynamics and infrastructural aspects of software production within GitHub, a central hub for F/OSS development, using a large-scale, directed network. Using various network statistics, we detect the ecosystem’s most impactful actors and find a nuanced picture of the influence of individuals, open source organizations, and private corporations in F/OSS practices. We find that the majority of public repositories on GitHub depend on a small core of specialized repositories and users. In accordance with expectations, individuals and open source organizations are more prevalent in this core of elite GitHub users; however, we also find a significant number of private organizations with an indirect, yet consistent influence within GitHub. In addition, we find that directly influential individuals tend to facilitate sponsorship methods more often than indirectly or non-influential individuals. Our research highlights a hybridization of F/OSS and sheds light on the complex interplay between influence, power, and code production in the multi-language dependency ecosystem of GitHub.

Pp. Not available

Online advertisement in a pink-colored market

Amir Mehrjoo; Rubén Cuevas; Ángel Cuevas

Abstract: It is surprising that women are often charged more for products and services marketed explicitly to them. This phenomenon, known as the pink tax, is a major issue that questions women’s buying power. Nevertheless, it is not just limited to physical products – even online advertising can be subject to this type of gender-price discrimination. That is where our research comes in. We have developed a new methodology to measure what we call the digital marketing pink tax – the additional expense of delivering advertisements to female audiences. Analyzing data from Facebook advertising platforms across 187 countries and 40 territories shows this issue is systematic. Particularly, the digital marketing pink tax is prevalent in 79% of audiences across the world and 98% of audiences in highly developed countries. Therefore, advertisers incur a median cost of 30% more to display advertisements to women than men. In contrast, advertisers have to pay less digital marketing pink tax in less-developed countries (5%). Our research indicates that countries in the Middle East and Africa with a low Human Development Index (HDI) do not experience this phenomenon. Our comprehensive investigation of 24 industries reveals that advertisers must pay up to 64% of the digital marketing pink tax to target women in some industries. Our findings also suggest a connection between the digital marketing pink tax and the consumer pink tax – the extra charge placed on products marketed to women. Overall, our research sheds light on an important issue affecting women worldwide, raising awareness of the digital marketing pink tax and advocating for better regulation.

Pp. Not available
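
The abstract above reports a prevalence and a median extra cost but does not give the formula behind them. As an illustration only, the sketch below assumes the digital marketing pink tax for one audience is the relative difference between the cost of reaching women and the cost of reaching men; that definition, the function name, and all cost figures are hypothetical and are not taken from the paper.

```python
from statistics import median


def pink_tax(cost_women, cost_men):
    """Relative extra cost of reaching women versus men for one audience.

    Assumed definition: (cost_women - cost_men) / cost_men.
    A positive value means advertising to women costs more.
    """
    return (cost_women - cost_men) / cost_men


# Hypothetical (cost for women, cost for men) pairs, one per audience.
audiences = [(1.30, 1.00), (2.40, 2.00), (0.90, 1.00), (3.00, 2.00)]

taxes = [pink_tax(women, men) for women, men in audiences]

# Share of audiences in which reaching women is more expensive.
prevalence = sum(t > 0 for t in taxes) / len(taxes)

# Median relative extra cost across all audiences.
median_tax = median(taxes)
```

Under this assumed definition, prevalence corresponds to the share of audiences with a positive value, and the median summarizes the typical extra cost without being dominated by a few extreme audiences.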

First-mover advantage in music

Oleg Sobchuk; Mason Youngblood; Olivier Morin

Abstract: Why do some songs and musicians become successful while others do not? We show that one of the reasons may be the “first-mover advantage”: artists that stand at the foundation of new music genres tend to be more successful than those who join these genres later on. To test this hypothesis, we have analyzed a massive dataset of over 920,000 songs, including 110 music genres: 10 chosen intentionally and preregistered, and 100 chosen randomly. For this, we collected the data from two music services: Spotify, which provides detailed information about songs’ success (the precise number of times each song was listened to), and Every Noise at Once, which provides detailed genre tags for musicians. Of the 110 genres, 91 show the first-mover advantage, clearly suggesting that it is an important mechanism in music success and evolution.

Pp. Not available

Novel embeddings improve the prediction of risk perception

Zak Hussain; Rui Mata; Dirk U. Wulff

Abstract: We assess whether the classic psychometric paradigm of risk perception can be improved or supplanted by novel approaches relying on language embeddings. To this end, we introduce the Basel Risk Norms, a large data set covering 1004 distinct sources of risk (e.g., vaccination, nuclear energy, artificial intelligence) and compare the psychometric paradigm against novel text and free-association embeddings in predicting risk perception. We find that an ensemble model combining text and free association rivals the predictive accuracy of the psychometric paradigm, captures additional affect and frequency-related dimensions of risk perception not accounted for by the classic approach, and has a greater range of applicability to real-world text data, such as news headlines. Overall, our results establish the ensemble of text and free-association embeddings as a promising new tool for researchers and policymakers to track real-world risk perception.

Pp. Not available

Detecting coordinated and bot-like behavior in Twitter: the Jürgen Conings case

Bart De Clerck; Juan Carlos Fernandez Toledano; Filip Van Utterbeeck; Luis E. C. Rocha

Abstract: Social media platforms can play a pivotal role in shaping public opinion during times of crisis and controversy. The COVID-19 pandemic resulted in a large amount of dubious information being shared online. In Belgium, a crisis emerged during the pandemic when a soldier (Jürgen Conings) went missing with stolen weaponry after threatening politicians and virologists. This case created further division and polarization in online discussions. In this paper, we develop a methodology to study the potential of coordinated spread of incorrect information online. We combine network science and content analysis to infer and study the social network of users discussing the case, the news websites shared by those users, and their narratives. Additionally, we examined indications of bots or coordinated behavior among the users. Our findings reveal the presence of distinct communities within the discourse. Major news outlets, conspiracy theory websites, and anti-vax platforms were identified as the primary sources of (dis)information sharing. We also detected potential coordinated behavior and bot activity, indicating possible attempts to manipulate the discourse. We used the rapid semantic similarity network for the analysis of text, but our approach can be extended to the analysis of images, videos, and other types of content. These results provide insights into the role of social media in shaping public opinion during times of crisis and underscore the need for improved strategies to detect and mitigate disinformation campaigns and online discourse manipulation. Our research can aid intelligence community members in identifying and disrupting networks that spread extremist ideologies and false information, thereby promoting a more informed and resilient society.

Pp. Not available

Quantifying polarization in online political discourse

Pau Muñoz; Alejandro Bellogín; Raúl Barba-Rojas; Fernando Díez

Abstract: In an era of increasing political polarization, its analysis becomes crucial for the understanding of democratic dynamics. This paper presents a comprehensive study on measuring political polarization on X (Twitter) during election cycles in Spain, from 2011 to 2019. A wide comparative analysis is performed on algorithms used to identify and measure polarization or controversy on microblogging platforms. This analysis is specifically tailored towards publications made by official political party accounts during pre-campaign, campaign, election day, and the week post-election. Guided by the findings of this comparative evaluation, we propose a novel algorithm better suited to capture polarization in the context of political events, which is validated with real data. As a consequence, our research contributes a significant advancement in the field of political science, social network analysis, and overall computational social science, by providing a realistic method to capture polarization from online political discourse.

Pp. Not available