DOI: 10.1145/3487552.3487844
research-article
Public Access

Towards identifying networks with internet clients using public data

Published: 02 November 2021

Abstract

Does an outage impact any users? Can a geolocation database known to be good at locating users and bad at infrastructure be trusted for a particular prefix? Is a content-heavy network likely to peer with a particular network? For these questions and many more, knowing which prefixes contain Internet users aids in interpreting Internet analysis. However, existing datasets of Internet activity are out of date, unvalidated, based on privileged data, or too coarse. As a step towards identifying which IP prefixes contain users, we present multiple novel techniques to identify which IP prefixes host web clients without relying on privileged data. Our techniques identify client activity in ASes responsible for 98.8% of Microsoft CDN traffic and in prefixes responsible for 95.2% of Microsoft CDN traffic. Less than 1% of prefixes identified by our technique as active do not contact Microsoft at all. We present measurements of Internet usage worldwide and sketch future directions for extending the techniques to measure relative activity levels across prefixes.
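The two headline figures in the abstract are different kinds of ratio: coverage is weighted by traffic volume, while the "less than 1%" figure counts prefixes. The sketch below illustrates that distinction on toy data. It is not the authors' pipeline; the function names, the traffic volumes, and the use of documentation prefixes are all assumptions made for illustration.

```python
from ipaddress import ip_network


def traffic_coverage(identified, traffic_by_prefix):
    """Share of total traffic volume that comes from prefixes flagged as active."""
    total = sum(traffic_by_prefix.values())
    if total == 0:
        return 0.0
    covered = sum(vol for pfx, vol in traffic_by_prefix.items() if pfx in identified)
    return covered / total


def silent_fraction(identified, traffic_by_prefix):
    """Share of flagged prefixes with no observed traffic at all (a prefix count, not a volume)."""
    if not identified:
        return 0.0
    silent = sum(1 for pfx in identified if traffic_by_prefix.get(pfx, 0) == 0)
    return silent / len(identified)


# Hypothetical ground truth: traffic volume seen per prefix (arbitrary units).
traffic = {
    ip_network("192.0.2.0/24"): 700,
    ip_network("198.51.100.0/24"): 250,
    ip_network("203.0.113.0/24"): 50,
}

# Hypothetical output of a client-detection technique.
identified = {
    ip_network("192.0.2.0/24"),
    ip_network("198.51.100.0/24"),
    ip_network("2001:db8::/32"),  # flagged, but never seen in the ground truth
}

coverage = traffic_coverage(identified, traffic)
silent = silent_fraction(identified, traffic)
print(f"traffic covered: {coverage:.1%}")
print(f"flagged but silent: {silent:.1%}")
```

In this toy case the technique misses one of three ground-truth prefixes yet still covers most of the traffic, because the missed prefix carries little volume. That is why a traffic-weighted figure and a prefix-count figure can diverge.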

    Published In

    IMC '21: Proceedings of the 21st ACM Internet Measurement Conference
    November 2021
    768 pages
    ISBN:9781450391290
    DOI:10.1145/3487552
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    In-Cooperation

    • USENIX Assoc

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. internet measurement
    2. network mapping
    3. replicable

    Qualifiers

    • Research-article

    Conference

    IMC '21
    IMC '21: ACM Internet Measurement Conference
    November 2 - 4, 2021
    Virtual Event

    Acceptance Rates

    Overall Acceptance Rate 277 of 1,083 submissions, 26%
