This article investigates the information landscape shaped by curation algorithms that seek to maximize user engagement. Leveraging unique behavioral data, we trained machine learning models to predict user engagement with tweets. Our study reveals how the pursuit of engagement maximization skews content visibility, favoring posts similar to previously engaged content while downplaying alternative perspectives. The empirical grounding of our work provides a basis for evidence-based policies aimed at fostering responsible social media platforms.
Data Availability Statement
The data used to train the predictive models, and the models themselves, contain non-public information and are not publicly available, in order to protect individuals’ privacy in accordance with Horus’s privacy policy. The data used in the simulations was acquired via the Twitter API and cannot be made publicly accessible due to Twitter’s developer policy.
The author warmly thanks Pedro Ramaciotti Morales and David Chavalarias for their valuable insights and careful proofreading. The author also sincerely thanks Mazyiar Panahi, head of the Politoscope and Multivac platforms, for enabling the collection of the large-scale retweet network data that was instrumental in this research. Finally, the author acknowledges the Jean-Pierre Aguilar fellowship from the CFM Foundation for Research, as well as the support and resources provided by the Complex Systems Institute of Paris Île-de-France and the Region Île-de-France.
