Principles of Data Mining

February 2013

Author: Max Bramer

Publisher: Springer Publishing Company, Incorporated

ISBN: 978-1-4471-4883-8

Published: 21 February 2013

Pages: 454

Abstract

Data Mining, the automatic extraction of implicit and potentially useful information from data, is increasingly used in commercial, scientific and other application areas. Principles of Data Mining explains and explores the principal techniques of Data Mining: for classification, association rule mining and clustering. Each topic is clearly explained and illustrated by detailed worked examples, with a focus on algorithms rather than mathematical formalism. It is written for readers without a strong background in mathematics or statistics, and any formulae used are explained in detail. This second edition has been expanded to include additional chapters on using frequent pattern trees for Association Rule Mining, comparing classifiers, ensemble classification and dealing with very large volumes of data. Principles of Data Mining aims to help general readers develop the necessary understanding of what is inside the 'black box' so they can use commercial data mining packages discriminatingly, as well as enabling advanced readers or academic researchers to understand or contribute to future technical advances in the field. Suitable as a textbook to support courses at undergraduate or postgraduate levels in a wide range of subjects including Computer Science, Business Studies, Marketing, Artificial Intelligence, Bioinformatics and Forensic Science.

Contributors

Max Arthur Bramer
University of Portsmouth
- Publication Years: 1977 - 2022
- Publication counts: 60
- Citation count: 61
- Available for Download: 2
- Downloads (cumulative): 3,084
- Downloads (12 months): 378
- Downloads (6 weeks): 48
- Average Downloads per Article: 1,542
- Average Citation per Article: 1

Index Terms

Principles of Data Mining
1. General and reference
  1. Document types
    1. Reference works
2. Information systems
  1. Information systems applications
    1. Data mining

Reviews

Reviewer: Alexis Leon

Data mining is one of the most popular and effective tools for knowledge discovery. It involves the analysis and summary of data from different perspectives and the automatic extraction of useful information. Data mining reveals trends, patterns, and other information hidden within huge volumes of data. Today, it is used in commercial, medical, scientific, geographical, meteorological, and other areas that generate large volumes of information that require automatic processing methods to be of real use.

This book introduces the concept of data mining and explains the various techniques involved. The author starts with an introduction to data mining and its importance, and succinctly explains the fundamental concepts of data mining and principal techniques for classification, association rule mining, and clustering.

Classification is a data mining technique that assigns items in a collection to target categories or classes. The book introduces the various classification techniques (naive Bayes, nearest neighbor, decision trees), and explains the top-down induction of decision trees (TDIDT) algorithm and the various criteria for attribute selection (entropy, Gini index of diversity, chi-square statistic, gain ratio). This is followed by discussions about related topics, including classifier predictive accuracy estimation, classifier performance measurement, classifier comparison, conversion of continuous attributes to categorical ones (discretization), overfitting reduction of decision trees, modular rules for classification, dealing with large volumes of data, and ensemble classification (use of a set of classifiers instead of a single one to classify unseen data).

Association rules are if/then statements that help uncover relationships between data items that seem to be unrelated in an information repository. The book covers the basic concepts of association rule mining, along with the various algorithms and criteria for selecting the best algorithms.
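To make the if/then idea concrete, the following is a minimal sketch of how a rule such as "if bread then milk" is commonly scored using support and confidence. It is an illustration under stated assumptions, not code taken from the book: the transaction data and the function names `support` and `confidence` are hypothetical.

```python
from typing import FrozenSet, List

# Hypothetical market-basket transactions, invented purely for illustration.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk"}),
]


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    transactions: List[FrozenSet[str]],
) -> float:
    """Support of (antecedent and consequent together) divided by support of the antecedent."""
    base = support(antecedent, transactions)
    if base == 0.0:
        # The antecedent never occurs, so the rule cannot be evaluated; report 0.0.
        return 0.0
    return support(antecedent | consequent, transactions) / base


if __name__ == "__main__":
    bread = frozenset({"bread"})
    milk = frozenset({"milk"})
    print("support(bread) =", support(bread, TRANSACTIONS))
    print("confidence(bread -> milk) =", confidence(bread, milk, TRANSACTIONS))
```

In this toy data the rule "if butter then bread" holds in every transaction that contains butter, whereas "if bread then milk" holds in only two of the three transactions that contain bread, so the former would typically be ranked as the stronger rule.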
There is also a comprehensive discussion of association rule mining topics, such as market basket analysis and the Apriori and frequent pattern growth (FP-growth) algorithms.

The author presents a detailed exploration of the two most popular data clustering methods, k-means clustering and hierarchical clustering, followed by a discussion of text mining, a type of classification where the objects are text documents. Other chapters examine the bag-of-words representation for document classification, automatic classification of web pages (hypertext categorization), and the difference between hypertext and standard text classification.

Each topic discussion begins with the basics, and the book assumes that the reader has no prior knowledge of data mining. All explanations are clear and supported with detailed illustrations, examples, and solved problems. The focus on algorithms helps those who do not have a strong mathematical background to better understand the concepts, and the learning process is enhanced with self-assessment exercises and a list of references at the end of each chapter.

The book has five appendices that add value. The first explains the mathematical notation and techniques used in the book and would especially help those with limited mathematical exposure. The second gives basic information about the different datasets used in the book. The third lists sources for further reading. The fourth is a comprehensive glossary of data mining terms and mathematical notation, and the last provides solutions to the self-assessment exercises.

This book is written primarily as a text for a course on data mining. The rich pedagogical features, including illustrations, examples, solved problems, exercises and solutions, a glossary, and references, make it an ideal choice for that purpose. It would be very useful for any reader who wants to gain a good understanding of data mining concepts and techniques.

Recommendations

Mining uncertain data

As an important data mining and knowledge discovery task, association rule mining searches for implicit, previously unknown, and potentially useful pieces of information—in the form of rules revealing associative relationships—that are embedded in the ...
Data Mining: Foundations and Practice
Mining Text Data
