Search Engine Similarity Analysis: A Combined Content and Rankings Approach

Dritsa, Konstantina; Sotiropoulos, Thodoris; Skarpetis, Haris; Louridas, Panos

doi:10.1007/978-3-030-62008-0_2

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12343)

Included in the following conference series:

International Conference on Web Information Systems Engineering

Abstract

How different are search engines? The search engine wars are a favorite topic of online analysts, as two of the biggest companies in the world, Google and Microsoft, battle for dominance of the web search space. Differences in search engine popularity can be explained by their effectiveness or by other factors, such as familiarity with the engine that first became popular, peer imitation, or force of habit. In this work we present a thorough analysis of the affinity of the two major search engines, Google and Bing, along with DuckDuckGo, which goes to great lengths to emphasize its privacy-friendly credentials. To do so, we collected search results using a comprehensive set of 300 unique queries for two time periods, in 2016 and 2019, and developed a new similarity metric that leverages both the content and the ranking of search responses. We evaluated the characteristics of the metric against other metrics and approaches that have been proposed in the literature, and used it to investigate (1) the similarities of search engine results, (2) the evolution of their affinity over time, (3) which aspects of the results influence similarity, and (4) how the metric differs across different kinds of search services. We found that Google stands apart, but Bing and DuckDuckGo are largely indistinguishable from each other.
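The abstract does not define the metric, so the following is only a toy illustration of what "leveraging both the content and the ranking" of results could look like. It is not the metric proposed in the paper. The function names (tokens, jaccard, directed, toy_similarity), the token-overlap content model, and the 1/(rank) weighting are all assumptions made for this sketch.

```python
def tokens(text):
    """Lowercased word set of a result snippet (a deliberately crude content model)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard overlap of two token sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def directed(src, dst):
    """Score how well each result in src is matched somewhere in dst.

    A match is rewarded for content overlap and penalised for rank
    displacement; results near the top of src carry more weight.
    """
    if not src:
        return 1.0 if not dst else 0.0
    total = 0.0
    weight_sum = 0.0
    for i, s in enumerate(src):
        weight = 1.0 / (i + 1)  # hypothetical top-heavy weighting
        best = max(
            (jaccard(tokens(s), tokens(d)) / (1 + abs(i - j))
             for j, d in enumerate(dst)),
            default=0.0,
        )
        total += weight * best
        weight_sum += weight
    return total / weight_sum


def toy_similarity(results_a, results_b):
    """Symmetric score in [0, 1] for two ranked lists of snippet strings."""
    return 0.5 * (directed(results_a, results_b) + directed(results_b, results_a))


if __name__ == "__main__":
    engine_a = ["python tutorial for beginners", "official python documentation"]
    engine_b = ["official python documentation", "python tutorial for beginners"]
    print(toy_similarity(engine_a, engine_a))
    print(toy_similarity(engine_a, engine_b))
```

In this sketch, two identical lists score 1, lists with no shared words score 0, and lists that return the same content in a different order fall in between, which is the qualitative behaviour one would want from a combined content-and-rank measure.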

K. Dritsa and T. Sotiropoulos—These authors contributed equally to this work.

Notes

1. All data, results, and source code used in our experiments are available through https://doi.org/10.5281/zenodo.3980817.
2. https://www.google.com/trends/topcharts.
3. https://azure.microsoft.com/en-us/services/cognitive-services/bing-web-search-api/.
4. https://developers.google.com/custom-search/.

Acknowledgments

This work was supported by the European Union’s Horizon 2020 research and innovation program “FASTEN” under grant agreement No. 825328.

Author information

Authors and Affiliations

Athens University of Economics and Business, Athens, Greece
Konstantina Dritsa, Thodoris Sotiropoulos, Haris Skarpetis & Panos Louridas

Corresponding authors

Correspondence to Konstantina Dritsa, Thodoris Sotiropoulos, Haris Skarpetis or Panos Louridas.

Editor information

Editors and Affiliations

VU Amsterdam, Amsterdam, The Netherlands
Zhisheng Huang
VU Amsterdam, Amsterdam, The Netherlands
Wouter Beek
Victoria University, Melbourne, VIC, Australia
Hua Wang
Swinburne University of Technology, Hawthorn, VIC, Australia
Rui Zhou
Victoria University, Melbourne, VIC, Australia
Yanchun Zhang

About this paper

Cite this paper

Dritsa, K., Sotiropoulos, T., Skarpetis, H., Louridas, P. (2020). Search Engine Similarity Analysis: A Combined Content and Rankings Approach. In: Huang, Z., Beek, W., Wang, H., Zhou, R., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2020. WISE 2020. Lecture Notes in Computer Science, vol 12343. Springer, Cham. https://doi.org/10.1007/978-3-030-62008-0_2

DOI: https://doi.org/10.1007/978-3-030-62008-0_2
Published: 21 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62007-3
Online ISBN: 978-3-030-62008-0
eBook Packages: Computer Science, Computer Science (R0)
