research-article

Interactive outlier exploration in big data streams

Published: 01 August 2014

Abstract

We demonstrate VSOutlier, a system for interactive exploration of outliers in big data streams. VSOutlier not only supports a rich variety of outlier types, backed by innovative and efficient outlier detection strategies, but also provides a rich set of interactive interfaces for exploring outliers in real time. Using a stock transactions dataset from the US stock market and a moving objects dataset from MITRE, we demonstrate that VSOutlier enables analysts to identify, understand, and respond to phenomena of interest more efficiently and in near real time, even when applied to high-volume streams.


Published In

Proceedings of the VLDB Endowment, Volume 7, Issue 13
August 2014
466 pages
ISSN: 2150-8097

Publisher

VLDB Endowment
