
Imaginary People Representing Real Numbers: Generating Personas from Online Social Media Data

Published: 01 November 2018

Abstract

We develop a methodology to automate the creation of imaginary people, referred to as personas, by processing complex behavioral and demographic data of social media audiences. Using a popular social media account with more than 30 million interactions by viewers from 198 countries engaging with more than 4,200 online videos produced by a global media corporation, we demonstrate that our methodology makes several novel contributions, including: (a) identifying distinct user behavioral segments based on user content consumption patterns; (b) identifying impactful demographic groupings; and (c) creating rich persona descriptions by automatically adding pertinent attributes, such as names, photos, and personal characteristics. We validate our approach by implementing the methodology in a working system; we then evaluate it quantitatively by examining the accuracy of predicting the content preferences of personas, the stability of the personas over time, and the generalizability of the method by applying it to two other datasets. Research findings show that the approach can develop rich personas representing the behavior and demographics of real audiences using privacy-preserving aggregated online social media data from major online platforms. The results have implications for media companies and other organizations distributing content via online platforms.
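As a rough illustration of steps (a) and (b), the sketch below factorizes a small matrix of view counts (demographic groups by videos) into behavioral segments and then picks the demographic group that loads most heavily on each segment. The abstract does not say which algorithm is used; non-negative matrix factorization with multiplicative updates is an assumption made here for illustration, and the group labels, view counts, and function names (`nmf`, `top_group_per_segment`) are all hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error between v and the product w * h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Approximate v (n x m) as w (n x k) times h (k x m), all non-negative.

    Uses multiplicative updates; eps guards against division by zero.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n)]
    h = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        wt = transpose(w)
        num = matmul(wt, v)              # k x m
        den = matmul(matmul(wt, w), h)   # k x m
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        ht = transpose(h)
        num = matmul(v, ht)              # n x k
        den = matmul(w, matmul(h, ht))   # n x k
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h


def top_group_per_segment(w, groups):
    """For each segment (column of w), return the group with the largest weight."""
    k = len(w[0])
    return [groups[max(range(len(w)), key=lambda i: w[i][a])] for a in range(k)]


# Hypothetical aggregated data: rows are demographic groups, columns are videos.
groups = ["F 25-34 US", "M 25-34 US", "F 45-54 UK", "M 45-54 UK"]
views = [
    [90, 80, 5, 0],
    [85, 95, 0, 10],
    [5, 0, 70, 60],
    [0, 10, 65, 75],
]

w, h = nmf(views, k=2)
for segment, group in enumerate(top_group_per_segment(w, groups)):
    print(f"segment {segment}: representative group = {group}")
```

In this sketch each column of `w` plays the role of a behavioral segment, and the representative group is the skeleton to which persona attributes such as a name and photo would be attached in step (c).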


    Published In

    ACM Transactions on the Web  Volume 12, Issue 4
    November 2018
    215 pages
    ISSN:1559-1131
    EISSN:1559-114X
    DOI:10.1145/3281744
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 01 November 2018
    Accepted: 01 August 2018
    Revised: 01 May 2018
    Received: 01 August 2017
    Published in TWEB Volume 12, Issue 4

    Author Tags

    1. Persona
    2. user analytics

    Qualifiers

    • Research-article
    • Research
    • Refereed
