Minnesota Multiphasic Personality Inventory-2 (MMPI-2) protocols were examined in an attempt to develop a model able to identify chemically dependent patients likely not to complete treatment. MMPI-2 profiles of 173 patients (142 male) were analyzed using profile code types and a multivariate analysis of variance (MANOVA). A chi-square test showed that patients classified as neurotic were more likely to fail treatment. The MANOVA indicated that elevated T-scores on Scales 7 and 8 (p < .05) were related to noncompletion. Comparison with similar studies indicates that a predictive model based on a single objective measure may not be sufficient to determine outcome.
Several experimental studies have tested the relative merits of various supervised machine learning models. Comparisons have been made along dimensions that include model complexity, prediction accuracy, training set size, and training time. Only limited work has been done to study the effect of training set exemplar typicality on model performance. We present experimental results obtained in testing C4.5, SX-WEB, a backpropagation neural network, and linear discriminant analysis using a real-valued and a mixed form of a medical data set. We generated training sets of highly typical, widely varied, and atypical exemplars for both data sets. We tested the classification accuracy of each model using the generated training sets. Test set accuracy levels ranged between 76% and 86% when each model was trained with typical or varied training sets. The accuracy levels for C4.5, the backpropagation neural net, and discriminant analysis dropped significantly when atypical training sets were used. In...
(Each Chapter concludes with a Chapter Summary, Key Terms, and Exercises.) Preface.
I. DATA MINING FUNDAMENTALS. 1. Data Mining: A First View. Data Mining: A Definition. What Can Computers Learn? Is Data Mining Appropriate for my Problem? Expert Systems or Data Mining? A Simple Data Mining Process Model. Why not Simple Search? Data Mining Applications. 2. Data Mining: A Closer Look. Data Mining Strategies. Supervised Data Mining Techniques. Association Rules. Clustering Techniques. Evaluating Performance. 3. Basic Data Mining Techniques. Decision Trees. Generating Association Rules. The K-Means Algorithm. Genetic Learning. Choosing a Data Mining Technique. 4. An Excel-Based Data Mining Tool. The iData Analyzer. ESX: A Multipurpose Tool for Data Mining. iDAV Format for Data Mining. A Five-Step Approach for Unsupervised Clustering. A Six-Step Approach for Supervised Learning. Techniques for Generating Rules. Instance Typicality. Special Considerations and Features.
II. TOOLS FOR KNOWLEDGE DISCOVERY. 5. Knowledge Discovery in Databases. A KDD Process Model. Step 1: Goal Identification. Step 2: Creating a Target Data Set. Step 3: Data Preprocessing. Step 4: Data Transformation. Step 5: Data Mining. Step 6: Interpretation and Evaluation. Step 7: Taking Action. The CRISP-DM Process Model. Experimenting with ESX. 6. The Data Warehouse. Operational Databases. Data Warehouse Design. On-line Analytical Processing (OLAP). Excel Pivot Tables for Data Analysis. 7. Formal Evaluation Techniques. What Should be Evaluated? Tools for Evaluation. Computing Test Set Confidence Intervals. Comparing Supervised Learner Models. Attribute Evaluation. Unsupervised Evaluation Techniques. Evaluating Supervised Models with Numeric Output.
III. ADVANCED DATA MINING TECHNIQUES. 8. Neural Networks. Feed-Forward Neural Networks. Neural Network Training: A Conceptual View. Neural Network Explanation. General Considerations. Neural Network Learning: A Detailed View. 9. Building Neural Networks with iDA. A Four-Step Approach for Backpropagation Learning. A Four-Step Approach for Neural Network Clustering. ESX for Neural Network Cluster Analysis. 10. Statistical Techniques. Linear Regression Analysis. Logistic Regression. Bayes Classifier. Clustering Algorithms. Heuristics or Statistics? 11. Specialized Techniques. Time-Series Analysis. Mining the Web. Mining Textual Data. Improving Performance.
IV. INTELLIGENT SYSTEMS. 12. Rule-Based Systems. Exploring Artificial Intelligence. Problem Solving as a State Space Search. Expert Systems. Structuring a Rule-Based System. 13. Managing Uncertainty in Rule-Based Systems. Uncertainty: Sources and Solutions. Fuzzy Rule-Based Systems. A Probability-Based Approach to Uncertainty. 14. Intelligent Agents. Characteristics of Intelligent Agents. Types of Agents. Integrating Data Mining, Expert Systems, and Intelligent Agents.
Appendix. Appendix A: Software Installation. Appendix B: Datasets for Data Mining. Appendix C: Decision Tree Attribute Selection. Appendix D: Statistics for Performance Evaluation. Appendix E: Excel 97 Pivot Tables. Bibliography.
Concept learning continues to be a topic of interest within the fields of computer science and instructional design. The research presented here has implications within both of these fields. The incremental concept formation approach is presented as a viable choice for a general model of concept learning. This approach creates clusterings of hierarchically organized concept categories when presented with previously unclassified instances. Learning is incremental and unsupervised. This research extends previous models of incremental concept formation by presenting an exemplar-based and a probability-based concept learning model. Each model can perform in domains containing nominal, real-valued and mixed data types and can limit the attributes used for classification to those deemed most predictive of class membership. Furthermore, the exemplar-based model uses a global approach to track and correct concept drift. The incremental concept formation approach also has important applications in educational environments. Specifically, when this approach is combined with an algorithm that creates rational sets of matched example/non-example pairs of the concepts to be learned, an environment appropriate for discovery learning is created. An algorithm that creates these rational sets of matched pairs from the concepts contained within a concept taxonomy is introduced. The models presented here are tested extensively in an effort to show their ability to perform well in several situations.
The goal is to supply the participant with the tools to teach a course or unit about data mining and knowledge discovery. A basic understanding of the benefits and limitations of data mining as a problem-solving strategy will be offered. Several data mining techniques will be discussed. Prior knowledge about data mining and the knowledge discovery process is not necessary.
A MAJORITY RULES APPROACH TO DATA MINING. Richard J. Roiger, Cyrus Azarbod. Department of Computer Science, Mankato State University, Mankato, MN 56002-8400 ... banks in both samples. The two years included the year of bank failure and the previous year. ...
Physics and Astronomy, Minnesota State University, 141 Trafton Science Center North, Mankato, MN 56001 ; jon.hakkila=mnsu.msus.edu DAVID J. HAGLIN Department of Computer and Information Sciences, Minnesota State University, 141 Trafton Science Center North, ...
Gamma-Ray Burst (GRB) prompt emission contains information that can be used to infer the structure of the relativistic outflow. Spectral lags, the Internal Luminosity Function (ILF), and Color-Color Diagrams are attributes that provide diagnostics with which jet structure can be studied. These attributes help delineate properties of internal shocks originating in the large Lorentz factor, tightly-beamed central core of the jet
We present a database of spectral lags and internal luminosity function (ILF) measurements for gamma-ray bursts (GRBs) in the BATSE catalog. Measurements were made using 64 ms count rate data and are defined for various combinations of the four broadband BATSE energy channels. We discuss the processes used for measuring lags and ILF characteristics. We discuss the statistical and systematic
We study the angular anisotropies of faint intermediate and extremely short BATSE bursts in CGRO spacecraft coordinates. Our goal is to determine whether biases in detector efficiency can account for the observed anisotropies. We conclude that the faint intermediate burst anisotropy is not statistically meaningful, and that some of the very shortest bursts might be transient events other than gamma-ray bursts.
We present a database of spectral lags and internal luminosity function (ILF) measurements for gamma-ray bursts (GRBs) in the BATSE catalog. Measurements were made using 64 ms count rate data and are defined for various combinations of the four broadband BATSE energy channels. (3 data files).
[AIP Conference Proceedings 662, 147 (2003)]. Jon Hakkila, Timothy W. Giblin, Thomas M. Freismuth, Kevin C. Young, Amanda J. Sprague, Andrew D. Stallworth, David J. Haglin, Richard J. Roiger, William S. Paciesas. Abstract. ...
Title: A Gamma Ray Burst Database of BATSE Spectral Lag and Internal Luminosity Function Values Authors: Hakkila J., Giblin TW, Young KC, Fuller SP, Peters CD, Nolan C., Sonnett SM, Haglin DJ, Roiger RJ Table: Internal Luminosity Function Measurements of BATSE GRBs ...
Despite being the most energetic phenomenon in the known universe, the astrophysics of gamma-ray bursts (GRBs) has still proven difficult to understand. It has only been within the past five years that the GRB distance scale has been firmly established, on the basis of a few dozen bursts with x-ray, optical, and radio afterglows. The afterglows indicate source redshifts of z=1 to z=5, total energy outputs of roughly 10^52 ergs, and energy confined to the far x-ray to near gamma-ray regime of the electromagnetic spectrum. The multi-wavelength afterglow observations have thus far provided more insight on the nature of the GRB mechanism than the GRB observations; far more papers have been written about the few observed gamma-ray burst afterglows in the past few years than about the thousands of detected gamma-ray bursts. One reason the GRB central engine is still so poorly understood is that GRBs have complex, overlapping characteristics that do not appear to be produced by one ho...