
How users search and what they search for in the medical domain: Understanding laypeople and experts through query logs

Published: 01 April 2016

Abstract

The internet is an important source of medical knowledge for everyone, from laypeople to medical professionals. We investigate how these two user groups, at opposite extremes of expertise, have distinct needs and exhibit significantly different search behaviour. We use query logs to study various aspects of both kinds of users. The logs from America Online, Health on the Net, Turning Research Into Practice and American Roentgen Ray Society (ARRS) GoldMiner were divided into three sets: (1) laypeople, (2) medical professionals (such as physicians or nurses) searching for health content and (3) users not seeking health advice. Several analyses focus on discovering how users search and what they are most interested in. One outcome of our analysis is a classifier that infers user expertise. We show its results and analyse the feature set used to infer expertise. We conclude that medical experts are more persistent, interacting more with the search engine. Our study also reveals that, contrary to what is stated in much of the literature, the main focus of users, both laypeople and professionals, is on disease rather than symptoms. The results of this article, especially the classifier, could be used to detect specific user groups and then adapt search results to the user group.
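As a rough illustration of the kind of log-derived behaviour the abstract refers to (persistence, interaction with the search engine), the following is a minimal sketch and not the authors' method. It assumes a log of `(user_id, timestamp, query)` rows, a hypothetical 30-minute inactivity timeout for session boundaries, and two illustrative features; the feature set and classifier actually used in the article are not reproduced here.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a new session starts after 30 minutes of inactivity.
SESSION_GAP = timedelta(minutes=30)


def split_sessions(log):
    """Group (user_id, timestamp, query) rows into per-user sessions."""
    by_user = defaultdict(list)
    for user_id, timestamp, query in log:
        by_user[user_id].append((timestamp, query))

    sessions = defaultdict(list)
    for user_id, rows in by_user.items():
        rows.sort(key=lambda row: row[0])  # chronological order per user
        previous = None
        for timestamp, query in rows:
            if previous is None or timestamp - previous > SESSION_GAP:
                sessions[user_id].append([])  # start a new session
            sessions[user_id][-1].append(query)
            previous = timestamp
    return dict(sessions)


def persistence_features(user_sessions):
    """Compute illustrative (hypothetical) per-user behaviour features."""
    n_queries = sum(len(session) for session in user_sessions)
    n_terms = sum(len(q.split()) for session in user_sessions for q in session)
    return {
        "sessions": len(user_sessions),
        "queries_per_session": n_queries / len(user_sessions),
        "terms_per_query": n_terms / n_queries,
    }


# Toy log, invented for illustration only.
EXAMPLE_LOG = [
    ("u1", datetime(2016, 4, 1, 9, 0), "headache"),
    ("u1", datetime(2016, 4, 1, 9, 5), "headache causes"),
    ("u1", datetime(2016, 4, 1, 15, 0), "migraine treatment"),
    ("u2", datetime(2016, 4, 1, 10, 0), "pulmonary embolism ct findings"),
]

if __name__ == "__main__":
    for user, user_sessions in split_sessions(EXAMPLE_LOG).items():
        print(user, persistence_features(user_sessions))
```

Features like these could serve as inputs to any standard classifier; which features actually separate laypeople from experts is a question the article itself addresses.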


Published In

Information Retrieval, Volume 19, Issue 1-2
Apr 2016
224 pages

Publisher

Kluwer Academic Publishers
United States

Publication History

Published: 01 April 2016
Accepted: 19 September 2015
Received: 31 December 2014

Author Tags

1. Query log analysis
2. Health search
3. User behavior

Qualifiers

• Research-article

Cited By

• (2023) The Archive Query Log: Mining Millions of Search Result Pages of Hundreds of Search Engines from 25 Years of Web Archives. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2848-2860. DOI: 10.1145/3539618.3591890. Online publication date: 19-Jul-2023
• (2021) Semantic Information Retrieval on Medical Texts. ACM Computing Surveys 54:7, pp. 1-38. DOI: 10.1145/3462476. Online publication date: 17-Sep-2021
• (2020) Query or Document Translation for Academic Search – What’s the Real Difference? Experimental IR Meets Multilinguality, Multimodality, and Interaction, pp. 28-42. DOI: 10.1007/978-3-030-58219-7_3. Online publication date: 22-Sep-2020
• (2019) The role of domain knowledge in document selection from search results. Journal of the Association for Information Science and Technology 70:11, pp. 1236-1247. DOI: 10.1002/asi.24199. Online publication date: 6-Oct-2019
• (2018) SIGIR 2018 Tutorial on Health Search (HS2018). The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 1391-1394. DOI: 10.1145/3209978.3210188. Online publication date: 27-Jun-2018
• (2017) Inferring Individual Attributes from Search Engine Queries and Auxiliary Information. Proceedings of the 26th International Conference on World Wide Web, pp. 293-301. DOI: 10.1145/3038912.3052629. Online publication date: 3-Apr-2017
• (2017) The role of domain knowledge in cognitive modeling of information search. Information Retrieval 20:5, pp. 456-479. DOI: 10.1007/s10791-017-9308-8. Online publication date: 1-Oct-2017
• (2016) Beyond Topical Relevance. Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, p. 1167. DOI: 10.1145/2911451.2911480. Online publication date: 7-Jul-2016
