DOI: 10.1145/3313831.3376524
research-article

Explain like I am a Scientist: The Linguistic Barriers of Entry to r/science

Published: 23 April 2020

    Abstract

    As an online community for discussing research findings, r/science has the potential to contribute to science outreach and communication with a broad audience. Yet previous work suggests that most of the active contributors on r/science are science-educated people rather than a lay general public. One potential reason is that r/science contributors might use a different, more specialized language than used in other subreddits. To investigate this possibility, we analyzed the language used in more than 68 million posts and comments from 12 subreddits from 2018. We show that r/science uses a specialized language that is distinct from other subreddits. Transient (newer) authors of posts and comments on r/science use less specialized language than more frequent authors, and those that leave the community use less specialized language than those that stay, even when comparing their first comments. These findings suggest that the specialized language used in r/science has a gatekeeping effect, preventing participation by people whose language does not align with that used in r/science. By characterizing r/science's specialized language, we contribute guidelines and tools for increasing the number of contributors in r/science.

    Supplementary Material

    PDF File (paper397aux.pdf)
    List of top words most common and least common among r/science contributors for posts and comments.

      Published In

      CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems
      April 2020
      10688 pages
      ISBN:9781450367080
      DOI:10.1145/3313831
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. reddit
      2. science communication
      3. social computing

      Qualifiers

      • Research-article

      Conference

      CHI '20

      Acceptance Rates

      Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

      Article Metrics

      • Downloads (last 12 months): 79
      • Downloads (last 6 weeks): 5

      Cited By

      • (2024) Engage Wider Audience or Facilitate Quality Answers? A Mixed-methods Analysis of Questioning Strategies for Research Sensemaking on a Community Q&A Site. Proceedings of the ACM on Human-Computer Interaction 8:CSCW1 (1-31). DOI: 10.1145/3637327. Online publication date: 26-Apr-2024
      • (2024) Understanding the Unintended Effects of Human-Machine Moderation in Addressing Harassment within Online Communities. Journal of Management Information Systems 41:2 (341-366). DOI: 10.1080/07421222.2024.2340831. Online publication date: 24-Jun-2024
      • (2023) Just another clickbait title: A corpus-driven investigation of negative attitudes toward science on Reddit. Public Understanding of Science 32:5 (580-595). DOI: 10.1177/09636625221146453. Online publication date: 12-Jan-2023
      • (2023) Exploring the Effects of Event-induced Sudden Influx of Newcomers to Online Pop Music Fandom Communities: Content, Interaction, and Engagement. Proceedings of the ACM on Human-Computer Interaction 7:CSCW2 (1-24). DOI: 10.1145/3610063. Online publication date: 4-Oct-2023
      • (2023) Large-Scale Anonymized Text-based Disability Discourse Dataset. Proceedings of the 25th International ACM SIGACCESS Conference on Computers and Accessibility (1-5). DOI: 10.1145/3597638.3614476. Online publication date: 22-Oct-2023
      • (2023) Understanding the Use of e-Prints on Reddit and 4chan’s Politically Incorrect Board. Proceedings of the 15th ACM Web Science Conference 2023 (117-127). DOI: 10.1145/3578503.3583627. Online publication date: 30-Apr-2023
      • (2023) Understanding Communication Strategies and Viewer Engagement with Science Knowledge Videos on Bilibili. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (1-18). DOI: 10.1145/3544548.3581476. Online publication date: 19-Apr-2023
      • (2023) How Language Formality in Security and Privacy Interfaces Impacts Intended Compliance. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (1-12). DOI: 10.1145/3544548.3581275. Online publication date: 19-Apr-2023
      • (2022) Examining science communication on Reddit: From an “Assembled” to a “Disassembling” approach. Public Understanding of Science 31:4 (473-488). DOI: 10.1177/09636625211057231. Online publication date: 13-Jan-2022
      • (2022) An HCI Research Agenda for Online Science Communication. Proceedings of the ACM on Human-Computer Interaction 6:CSCW2 (1-22). DOI: 10.1145/3555591. Online publication date: 11-Nov-2022
