DOI: 10.1145/3131365.3131406
research-article

On the structure and characteristics of user agent string

Published: 01 November 2017

Abstract

User agent (UA) strings transmitted during HTTP transactions convey client system configuration details to ensure that content returned by a server is appropriate for the requesting host. As such, analysis of UA strings and their structure offers a unique perspective on active client systems in the Internet and, when tracked longitudinally, on the nature of system and configuration dynamics. In this paper, we describe our study of UA string characteristics. Our work is based on analyzing a unique corpus of over 1B UA strings collected over a period of 2 years by comScore. We begin by analyzing the general characteristics of UA strings, focusing on the most prevalent strings and dynamic behaviors. We identify the top 10 most popular User Agents, which account for 26% of total daily volume. These strings describe the expected instances of popular platforms such as Microsoft, Apple and Google. We then report on the characteristics of low-volume UA strings, which have important implications for unique device identification. We show that this class of user agent generates the overwhelming majority of traffic, with between 2M and 10M instances observed each day. We show that the distribution of UA strings has temporal dependence and that the measured distribution depends on the type of content served. Finally, we report on two large-scale UA anomalies characterized by web browsers sending false and misleading UAs in their web requests.
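
As a rough illustration of the kind of volume statistic the abstract reports (the share of daily traffic covered by the top 10 UA strings, and the set of low-volume strings), the following minimal sketch counts UA strings from one day of requests. It is not the authors' method: the function names, the `max_count` threshold, and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def top_k_share(ua_strings, k=10):
    """Return the k most frequent UA strings and the fraction of all
    requests they account for (hypothetical helper, not from the paper)."""
    counts = Counter(ua_strings)
    total = sum(counts.values())
    if total == 0:
        return [], 0.0
    top = counts.most_common(k)
    return top, sum(n for _, n in top) / total


def low_volume_strings(ua_strings, max_count=1):
    """Return UA strings seen at most max_count times, sorted.
    The threshold is an assumption; the abstract does not define one."""
    counts = Counter(ua_strings)
    return sorted(ua for ua, n in counts.items() if n <= max_count)


# Hypothetical one-day sample: one UA string per request.
sample = ["UA-A"] * 5 + ["UA-B"] * 3 + ["UA-C", "UA-D"]
top, share = top_k_share(sample, k=2)
print(top, share)
print(low_volume_strings(sample))
```

On a real log, `ua_strings` would be an iterable over the `User-Agent` header of each request for a given day; repeating the computation per day would give the kind of longitudinal view the abstract describes.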

Published In

IMC '17: Proceedings of the 2017 Internet Measurement Conference
November 2017
509 pages
ISBN:9781450351188
DOI:10.1145/3131365
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

  • USENIX Assoc

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. character entropy matrix
  2. internet measurement
  3. user agent strings

Qualifiers

  • Research-article

Conference

IMC '17
IMC '17: Internet Measurement Conference
November 1 - 3, 2017
London, United Kingdom

Acceptance Rates

Overall Acceptance Rate 277 of 1,083 submissions, 26%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 27
  • Downloads (Last 6 weeks): 2
Reflects downloads up to 17 Oct 2024

Cited By

  • (2024) Uncovering the Hidden Data Costs of Mobile YouTube Video Ads. Proceedings of the ACM Web Conference 2024, 1138-1148. DOI: 10.1145/3589334.3645496. Online publication date: 13-May-2024
  • (2023) Unveiling the Impact of User-Agent Reduction and Client Hints: A Measurement Study. Proceedings of the 22nd Workshop on Privacy in the Electronic Society, 91-106. DOI: 10.1145/3603216.3624965. Online publication date: 26-Nov-2023
  • (2022) Case Study Application of Business Intelligence in Digital Advertising. International Journal of E-Business Research 18:1, 1-16. DOI: 10.4018/IJEBR.293294. Online publication date: 5-Jan-2022
  • (2021) Cognitive test scores vary with choice of personal digital device. Behavior Research Methods 53:6, 2544-2557. DOI: 10.3758/s13428-021-01597-3. Online publication date: 5-May-2021
  • (2021) Tracing your roots. Proceedings of the 21st ACM Internet Measurement Conference, 179-194. DOI: 10.1145/3487552.3487813. Online publication date: 2-Nov-2021
  • (2021) Déjà Vu? Client-Side Fingerprinting and Version Detection of Web Application Software. 2021 IEEE 46th Conference on Local Computer Networks (LCN), 81-89. DOI: 10.1109/LCN52139.2021.9524885. Online publication date: 4-Oct-2021
  • (2021) CrawlBot: A Domain-Specific Pseudonymous Crawler. Cybersecurity in Emerging Digital Era, 89-101. DOI: 10.1007/978-3-030-84842-2_7. Online publication date: 14-Aug-2021
  • (2020) Voyager: Tracking with a Click. Procedia Computer Science 176, 98-107. DOI: 10.1016/j.procs.2020.08.011. Online publication date: 2020
  • (2019) Characterizing JSON Traffic Patterns on a CDN. Proceedings of the Internet Measurement Conference (IMC '19), 195-201. DOI: 10.1145/3355369.3355594. Online publication date: 2019
  • (2019) An Analysis of Anomalous User Agent Strings in Network Traffic. 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 1771-1778. DOI: 10.1109/HPCC/SmartCity/DSS.2019.00243. Online publication date: Aug-2019
