Managing and Mining Uncertain Data
February 2009
Publisher:
  • Springer Publishing Company, Incorporated
ISBN: 978-0-387-09689-6
Published: 11 February 2009
Pages: 494
Abstract

Managing and Mining Uncertain Data, a survey with chapters by a variety of well-known researchers in data mining, presents the most recent models, algorithms, and applications for uncertain data in a structured and concise way. The book is organized to be accessible to applications-driven practitioners solving real problems. Given the lack of structurally organized information on this topic, it also provides insights that are not easily accessible elsewhere. Managing and Mining Uncertain Data is designed for a professional audience of researchers and practitioners in industry. It is also suitable as a reference book for advanced-level students in computer science and engineering, as well as for members of the ACM, IEEE, SIAM, INFORMS, and AAAI societies.

Cited By

  1. Ke X, Khan A, Al Hasan M and Rezvansangsari R (2022). Reliability Maximization in Uncertain Graphs, IEEE Transactions on Knowledge and Data Engineering, 34:2, (894-913), Online publication date: 1-Feb-2022.
  2. Saha A, Brokkelkamp R, Velaj Y, Khan A and Bonchi F (2021). Shortest paths and centrality in uncertain networks, Proceedings of the VLDB Endowment, 14:7, (1188-1201), Online publication date: 1-Mar-2021.
  3. Peng L Research on Data Uncertainty and Lineage Through Trio Proceedings of the 1st World Symposium on Software Engineering, (73-77)
  4. Wu Y, Lin X, Yang Y and He L (2019). Cleaning uncertain graphs via noisy crowdsourcing, World Wide Web, 22:4, (1523-1553), Online publication date: 1-Jul-2019.
  5. Ke X, Khan A and Quan L (2019). An in-depth comparison of s-t reliability algorithms over uncertain graphs, Proceedings of the VLDB Endowment, 12:8, (864-876), Online publication date: 1-Apr-2019.
  6. Dallachiesa M, Aggarwal C and Palpanas T (2019). Improving Classification Quality in Uncertain Graphs, Journal of Data and Information Quality, 11:1, (1-20), Online publication date: 18-Jan-2019.
  7. Cheema M (2018). Indoor location-based services, SIGSPATIAL Special, 10:2, (10-17), Online publication date: 13-Nov-2018.
  8. Khan A, Bonchi F, Gullo F and Nufer A (2018). Conditional Reliability in Uncertain Graphs, IEEE Transactions on Knowledge and Data Engineering, 30:11, (2078-2092), Online publication date: 1-Nov-2018.
  9. Agarwal P, Kumar N, Sintos S and Suri S (2018). Range-max queries on uncertain data, Journal of Computer and System Sciences, 94:C, (118-134), Online publication date: 1-Jun-2018.
  10. Ceccarello M, Fantozzi C, Pietracaprina A, Pucci G and Vandin F (2017). Clustering uncertain graphs, Proceedings of the VLDB Endowment, 11:4, (472-484), Online publication date: 1-Dec-2017.
  12. Agarwal P, Efrat A, Sankararaman S and Zhang W (2017). Nearest-Neighbor Searching Under Uncertainty I, Discrete & Computational Geometry, 58:3, (705-745), Online publication date: 1-Oct-2017.
  13. Zhang X, Liu H and Zhang X (2017). Novel density-based and hierarchical density-based clustering algorithms for uncertain data, Neural Networks, 93:C, (240-255), Online publication date: 1-Sep-2017.
  14. Gullo F, Ponti G, Tagarelli A and Greco S (2017). An information-theoretic approach to hierarchical clustering of uncertain data, Information Sciences: an International Journal, 402:C, (199-215), Online publication date: 1-Sep-2017.
  15. Lin X, Peng Y, Choi B and Xu J (2017). Human-Powered Data Cleaning for Probabilistic Reachability Queries on Uncertain Graphs, IEEE Transactions on Knowledge and Data Engineering, 29:7, (1452-1465), Online publication date: 1-Jul-2017.
  16. Wu Y, Agarwal P, Li C, Yang J and Yu C (2017). Computational Fact Checking through Query Perturbations, ACM Transactions on Database Systems, 42:1, (1-41), Online publication date: 31-Mar-2017.
  17. Zhu R, Zou Z and Li J (2017). Towards efficient top-k reliability search on uncertain graphs, Knowledge and Information Systems, 50:3, (723-750), Online publication date: 1-Mar-2017.
  18. Agarwal P, Aronov B, Har-Peled S, Phillips J, Yi K and Zhang W (2016). Nearest-Neighbor Searching Under Uncertainty II, ACM Transactions on Algorithms, 13:1, (1-25), Online publication date: 21-Dec-2016.
  19. Agarwal P, Kumar N, Sintos S and Suri S Range-Max Queries on Uncertain Data Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, (465-476)
  20. Samet A and Dao T (2016). Bounded Support and Confidence over Evidential Databases, Procedia Computer Science, 80:C, (1822-1833), Online publication date: 1-Jun-2016.
  21. Olteanu D and Schaik S (2016). ENFrame, ACM Transactions on Database Systems, 41:1, (1-44), Online publication date: 7-Apr-2016.
  22. Cao K, Wang G, Han D, Bai M and Li S (2016). An algorithm for classification over uncertain data based on extreme learning machine, Neurocomputing, 174:PA, (194-202), Online publication date: 22-Jan-2016.
  23. Orang M and Shiri N Improving performance of similarity measures for uncertain time series using preprocessing techniques Proceedings of the 27th International Conference on Scientific and Statistical Database Management, (1-12)
  24. Bhattacharya A and Awate S Probabilistic aggregate skyline join queries Proceedings of the 27th International Conference on Scientific and Statistical Database Management, (1-12)
  25. Yuan Y, Wang G, Chen L and Wang H (2015). Graph similarity search on large uncertain graph databases, The VLDB Journal — The International Journal on Very Large Data Bases, 24:2, (271-296), Online publication date: 1-Apr-2015.
  26. Samet A, Lefèvre É and Ben Yahia S Evidential Database Proceedings of the Third International Conference on Belief Functions: Theory and Applications - Volume 8764, (105-114)
  27. Dallachiesa M, Palpanas T and Ilyas I (2014). Top-k nearest neighbor search in uncertain data series, Proceedings of the VLDB Endowment, 8:1, (13-24), Online publication date: 1-Sep-2014.
  28. Zhang X, Liu H, Zhang X and Liu X Novel density-based clustering algorithms for uncertain data Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, (2191-2197)
  29. Dallachiesa M, Aggarwal C and Palpanas T Node classification in uncertain graphs Proceedings of the 26th International Conference on Scientific and Statistical Database Management, (1-4)
  30. Wu Y, Agarwal P, Li C, Yang J and Yu C (2014). Toward computational fact-checking, Proceedings of the VLDB Endowment, 7:7, (589-600), Online publication date: 1-Mar-2014.
  31. Song C and Ge T Discovering and managing quantitative association rules Proceedings of the 22nd ACM international conference on Information & Knowledge Management, (2429-2434)
  32. de Carvalho J and Ruiz D Discovering frequent itemsets on uncertain data Proceedings of the 9th international conference on Machine Learning and Data Mining in Pattern Recognition, (390-404)
  33. Agarwal P, Aronov B, Har-Peled S, Phillips J, Yi K and Zhang W Nearest neighbor searching under uncertainty II Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGAI symposium on Principles of database systems, (115-126)
  34. Angiulli F and Fassetti F (2013). Nearest Neighbor-Based Classification of Uncertain Data, ACM Transactions on Knowledge Discovery from Data, 7:1, (1-35), Online publication date: 1-Mar-2013.
  35. Liu B, Xiao Y, Cao L, Hao Z and Deng F (2013). SVDD-based outlier detection on uncertain data, Knowledge and Information Systems, 34:3, (597-618), Online publication date: 1-Mar-2013.
  36. Olteanu D and van Schaik S DAGger Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, (1504-1507)
  37. Dallachiesa M, Nushi B, Mirylenka K and Palpanas T (2012). Uncertain time-series similarity, Proceedings of the VLDB Endowment, 5:11, (1662-1673), Online publication date: 1-Jul-2012.
  38. Tong Y, Chen L, Cheng Y and Yu P (2012). Mining frequent itemsets over uncertain databases, Proceedings of the VLDB Endowment, 5:11, (1650-1661), Online publication date: 1-Jul-2012.
  39. Matsumoto T and Hung E Accelerating outlier detection with uncertain data using graphics processors Proceedings of the 16th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II, (169-180)
  40. Agarwal P, Efrat A, Sankararaman S and Zhang W Nearest-neighbor searching under uncertainty Proceedings of the 31st ACM SIGMOD-SIGACT-SIGAI symposium on Principles of Database Systems, (225-236)
  41. Yuan Y, Wang G, Chen L and Wang H (2012). Efficient subgraph similarity search on large probabilistic graph databases, Proceedings of the VLDB Endowment, 5:9, (800-811), Online publication date: 1-May-2012.
  42. Gullo F and Tagarelli A (2012). Uncertain centroid based partitional clustering of uncertain data, Proceedings of the VLDB Endowment, 5:7, (610-621), Online publication date: 1-Mar-2012.
  43. Xu W, Qin Z, Hu H and Zhao N Mining uncertain data streams using clustering feature decision trees Proceedings of the 7th international conference on Advanced Data Mining and Applications - Volume Part II, (195-208)
  44. Wang Y, Tang C, Wang T, Yang D and Zhu J Efficient subject-oriented evaluating and mining methods for data with schema uncertainty Proceedings of the 7th international conference on Advanced Data Mining and Applications - Volume Part I, (325-338)
  45. Dallachiesa M, Nushi B, Mirylenka K and Palpanas T Similarity matching for uncertain time series Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Querying and Mining Uncertain Spatio-Temporal Data, (8-15)
  46. Jin C, Zhang Y and Zhou A Getting critical categories of a data set Proceedings of the 12th international conference on Web-age information management, (169-180)
  47. Jin R, Liu L and Aggarwal C Discovering highly reliable subgraphs in uncertain graphs Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, (992-1000)
  48. Yuan Y, Wang G, Wang H and Chen L (2011). Efficient subgraph search over large uncertain graphs, Proceedings of the VLDB Endowment, 4:11, (876-886), Online publication date: 1-Aug-2011.
  49. Muzammal M Mining sequential patterns from probabilistic databases by pattern-growth Proceedings of the 28th British national conference on Advances in databases, (118-127)
  50. Jin R, Liu L, Ding B and Wang H (2011). Distance-constraint reachability computation in uncertain graphs, Proceedings of the VLDB Endowment, 4:9, (551-562), Online publication date: 1-Jun-2011.
  51. Muzammal M and Raman R Mining sequential patterns from probabilistic databases Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part II, (210-221)
  52. Jin C, Gao M and Zhou A Handling ER-topk query on uncertain streams Proceedings of the 16th international conference on Database systems for advanced applications - Volume Part I, (326-340)
  53. Zhao Y, Aggarwal C and Yu P On wavelet decomposition of uncertain time series data sets Proceedings of the 19th ACM international conference on Information and knowledge management, (129-138)
  54. Magnani M and Montesi D Uncertainty in decision tree classifiers Proceedings of the 4th international conference on Scalable uncertainty management, (250-263)
  55. Sen P, Deshpande A and Getoor L (2010). Read-once functions and query evaluation in probabilistic databases, Proceedings of the VLDB Endowment, 3:1-2, (1068-1079), Online publication date: 1-Sep-2010.
  56. Muzammal M and Raman R Uncertainty in sequential pattern mining Proceedings of the 27th British national conference on Data Security and Security Data, (147-150)
  57. Ge J, Xia Y and Nadungodage C UNN Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part I, (449-460)
  58. Magnani M and Montesi D US-SQL Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, (1195-1198)
  59. Deshpande A Increasing representational power and scaling reasoning in probabilistic databases Proceedings of the 13th International Conference on Database Theory, (1-1)
  60. Cormode G, Deligiannakis A, Garofalakis M and McGregor A (2009). Probabilistic histograms for probabilistic data, Proceedings of the VLDB Endowment, 2:1, (526-537), Online publication date: 1-Aug-2009.
  61. Aggarwal C, Li Y, Wang J and Wang J Frequent pattern mining with uncertain data Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, (29-38)
Contributors
  • IBM Thomas J. Watson Research Center

Reviews

John A. Fulcher

Uncertainty in datasets can arise from the data collection method (for example, the imprecise nature of input sensors), from cumulative noise introduced by forecasting, prediction, or interpolation techniques, or even from noise that is deliberately added (for example, as a precursor to privacy-preserving data mining). Whatever the cause, it poses significant challenges for database administrators, data analysts, data miners, and knowledge engineers alike.

The three broad areas covered in Aggarwal's book are modeling and system design, management, and mining of uncertain data. A total of 27 authors have contributed to the 16 chapters contained in this edited volume. In chapter 1, Aggarwal provides brief synopses of the 15 chapters that follow. Chapters 2 and 3 cover uncertain data representation and modeling. Chapter 4 is devoted to probabilistic graph modeling, such as Bayesian and Markov networks. Two real-world systems, Trio and MayBMS, are described in chapters 5 and 6. Data integration is the focus of chapter 7. The resolution of aggregate queries using probabilistic data stream techniques, such as "sketches," is the focus of chapter 8. Chapter 9 discusses the join operation, which needs to be redefined for uncertain data because of its inherently probabilistic nature. Indexing uncertain data is covered in chapters 10 and 11, with the latter focusing on spatiotemporal data. Probabilistic Extensible Markup Language (XML) is the topic of chapter 12. Mining of uncertain data is the focus of the next three chapters: clustering in chapter 13, general transformations in chapter 14, and frequent pattern mining in chapter 15. The book concludes by describing how some of the techniques introduced in the preceding chapters can be applied to biomedical imaging, as a representative application domain.

Aggarwal's book is a timely publication, in that it provides a good summary of the current state of the art in uncertain data modeling, management, and mining. It should be of interest to researchers and graduate students involved in the area, as well as novices who wish to become acquainted with the topic.

Online Computing Reviews Service
