DOI: 10.1109/IDEAS.2005.21
Article

Differencing Data Streams

Published: 25 July 2005
    Abstract

    We present external-memory algorithms for differencing large hierarchical datasets. Our methods are especially suited to streaming data with bounded differences. For input sizes m and n and maximum output (difference) size e, the I/O, RAM, and CPU costs of our algorithm rdiff are, respectively, m + n, 4e + 8, and O(MN). That is, given 4e + 8 blocks of RAM, our algorithm performs no I/O operations other than those required to read both inputs. We also present a variant of the algorithm that uses only four blocks of RAM, with I/O cost 8me + 18m + n + 6e + 5 and CPU cost O(MN).
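
    To make the trade-off between the two variants concrete, the cost expressions quoted in the abstract can be evaluated for sample inputs. The following is only a minimal sketch: the function and class names are hypothetical, it assumes m, n, and e are all measured in blocks (the abstract states this only for RAM), and it does not model the CPU bound O(MN) because the abstract does not define M and N.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    """I/O operations and RAM blocks for one run (units assumed to be blocks)."""
    io_ops: int
    ram_blocks: int


def rdiff_cost(m: int, n: int, e: int) -> Cost:
    """Costs stated in the abstract for rdiff: I/O = m + n, RAM = 4e + 8."""
    return Cost(io_ops=m + n, ram_blocks=4 * e + 8)


def low_memory_variant_cost(m: int, n: int, e: int) -> Cost:
    """Costs stated for the four-block variant: I/O = 8me + 18m + n + 6e + 5."""
    return Cost(io_ops=8 * m * e + 18 * m + n + 6 * e + 5, ram_blocks=4)


if __name__ == "__main__":
    # Hypothetical sizes chosen purely for illustration.
    m, n, e = 100, 120, 5
    print("rdiff:  ", rdiff_cost(m, n, e))
    print("variant:", low_memory_variant_cost(m, n, e))
```

    Under these illustrative sizes, the variant trades a fixed four blocks of RAM for an I/O count that grows with the product of m and e, while rdiff reads each input exactly once.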

    Cited By

    • (2020) Organizing and compressing collections of files using differences. Proceedings of the 24th Symposium on International Database Engineering & Applications, pp. 1-10. DOI: 10.1145/3410566.3410584. Online publication date: 12-Aug-2020
    • (2007) Efficient and effective explanation of change in hierarchical summaries. Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 6-15. DOI: 10.1145/1281192.1281197. Online publication date: 12-Aug-2007

    Published In

    IDEAS '05: Proceedings of the 9th International Database Engineering & Application Symposium
    July 2005
    420 pages
    ISBN: 0769524044

    Publisher

    IEEE Computer Society

    United States
