DOI: 10.1145/2398776.2398783
research-article

RasterZip: compressing network monitoring data with support for partial decompression

Published: 14 November 2012

Abstract

Network traffic archival solutions are fundamental for a number of emerging applications that require (a) efficient storage of high-speed streams of traffic records and (b) support for interactive exploration of massive datasets. Compression is a fundamental building block of any traffic archival solution. However, present solutions are tied to general-purpose compressors, which do not exploit patterns in network traffic data and must decompress a large amount of data that a highly selective query does not need. In this work we introduce RasterZip, a novel domain-specific compressor designed for network traffic monitoring data. RasterZip uses an optimized lossless encoding that exploits patterns in traffic data, such as the tendency of IP addresses to share a common prefix. RasterZip also introduces a novel decompression scheme that accelerates highly selective queries targeting a small portion of the dataset. With our solution we can achieve high-speed on-the-fly compression of more than half a million traffic records per second. We compare RasterZip with the fastest Lempel-Ziv-based compressor and show that our solution improves on the state of the art in both compression ratio and query response time without introducing a penalty in any other performance metric.
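
To make the common-prefix observation concrete, the following is a minimal sketch of front coding applied to a sequence of IPv4 addresses: each address is stored as the number of leading bytes it shares with the previous address, followed by the remaining bytes. This is an illustrative technique only and is not taken from the paper; the function names `encode_prefix_shared` and `decode_prefix_shared` and the one-byte length field are assumptions made for the example.

```python
import ipaddress

ZERO = b"\x00\x00\x00\x00"  # assumed starting point for the first address


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Count the leading bytes that two packed addresses have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def encode_prefix_shared(addresses):
    """Front-code IPv4 addresses: one length byte, then the non-shared suffix."""
    out = bytearray()
    prev = ZERO
    for text in addresses:
        cur = ipaddress.IPv4Address(text).packed  # 4 bytes, network order
        shared = common_prefix_len(prev, cur)
        out.append(shared)        # how many bytes to reuse from the previous address
        out += cur[shared:]       # only the bytes that differ
        prev = cur
    return bytes(out)


def decode_prefix_shared(blob):
    """Invert encode_prefix_shared (lossless round trip)."""
    result = []
    prev = ZERO
    i = 0
    while i < len(blob):
        shared = blob[i]
        i += 1
        suffix_len = 4 - shared
        cur = prev[:shared] + blob[i:i + suffix_len]
        i += suffix_len
        result.append(str(ipaddress.IPv4Address(cur)))
        prev = cur
    return result


if __name__ == "__main__":
    sample = ["10.0.0.1", "10.0.0.7", "10.0.3.9", "192.168.1.1"]
    packed = encode_prefix_shared(sample)
    print(len(packed), "bytes instead of", 4 * len(sample))
    print(decode_prefix_shared(packed))
```

The saving depends entirely on how much consecutive addresses overlap; on addresses with no shared prefix this scheme costs one extra byte per record, which is why a practical compressor would pair it with further encoding stages.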

Supplementary Material

PDF File (42.pdf)
Summary Review Documentation for "RasterZip: Compressing Network Monitoring Data with Support for Partial Decompression", Authors: F. Fusco, M. Vlachos and X. Dimitropoulos

    Published In

    IMC '12: Proceedings of the 2012 Internet Measurement Conference
    November 2012
    572 pages
    ISBN:9781450317054
    DOI:10.1145/2398776

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. data compression
    2. netflow
    3. network monitoring
    4. network traffic archives

    Qualifiers

    • Research-article

    Conference

    IMC '12: Internet Measurement Conference
    November 14 - 16, 2012
    Boston, Massachusetts, USA

    Acceptance Rates

    Overall Acceptance Rate 277 of 1,083 submissions, 26%
