William Hsu

    The goal of this paper is to develop a methodology and model to classify and characterize the arousal state of participants in a built environment. Demonstrating this showcases the potential of developing an intelligent system capable of both classifying and predicting biometric arousal state. This classification process is traditionally performed by human experts. Our approach can be leveraged to take advantage of the diversity of real-time sensor data to inform the development of smart(er) environments to improve human health.
    Bayesian network (BN) inference has long been seen as a very important and hard problem in AI. Both exact and approximate BN inference are NP-hard [Co90, Sh94]. To date, researchers have developed many different kinds of exact and approximate BN inference algorithms. Each has different properties and works better for different classes of inference problems. Given a BN inference problem instance, it is usually hard but important to decide in advance which algorithm among a set of choices is the most appropriate. This is known as the algorithm selection problem [Ri76]. The goal of this research is to design and implement a meta-level reasoning system that acts as a “BN inference expert”: it quickly selects the most appropriate algorithm for any given Bayesian network inference problem and then predicts that algorithm's run-time performance.
    This paper describes an approach that uses wearables to demonstrate the viability of measuring physiometric arousal indicators, such as heart rate, to assess how urban built environments can induce arousal in a subject. In addition, a machine learning methodology is developed to classify sensor inputs, using annotated arousal output as the target. The results are then used as a foundation for designing and implementing an affective intelligent systems framework for arousal state detection via supervised learning and classification.
    Pollinators are undergoing a global decline. Although vital to pollinator conservation and ecological research, species-level identification is expensive, time-consuming, and requires specialized taxonomic training. However, deep learning and computer vision are providing ways to open this methodological bottleneck through automated identification from images. Focusing on bumble bees, we compare four convolutional neural network classification models to evaluate prediction speed, accuracy, and the potential of this technology for automated bee identification. We gathered over 89,000 images of bumble bees, representing 36 species in North America, to train the ResNet, Wide ResNet, InceptionV3, and MnasNet models. Among these models, InceptionV3 presented a good balance of accuracy (91.6%) and average speed (3.34 ms). Species-level error rates were generally smaller for species represented by more training images. However, error rates also depended on the level of morphological variab...
    This chapter surveys recent and continuing trends in software tools for preparation of open courseware, in particular audiovisual lecture materials, documentaries and tutorials, and derivative materials. It begins by presenting a catalog of tools ranging from open source wikis and custom content management systems to desktop video production. Next, it reviews techniques for preparation of lecture materials consisting of five specific learning technologies: animation of concepts and problem solutions; explanation of code; video walkthroughs of system documentation; software demonstrations; and creation of materials for instructor preparation and technology transfer. Accompanying the description of each technology and the review of its state of practice is a discussion of the goals and assessment criteria for deployed courseware that uses those tools and techniques. Holistic uses of these technologies are then analyzed via case studies in three domains: artificial intelligence, comput...
    This chapter presents applications of machine learning to predicting protein-protein interactions (PPI) in Saccharomyces cerevisiae. Several supervised inductive learning methods have been developed that treat this task as a classification problem over candidate links in a PPI network – a graph whose nodes represent proteins and whose arcs represent interactions. Most such methods use features extracted from protein sequences (e.g., amino acid composition) or features associated directly with protein sequences (e.g., GO annotation). Others use relational and structural features extracted from the PPI network, along with the features related to the protein sequence. Topological features of nodes and node pairs can be extracted directly from the underlying graph. This chapter presents two approaches from the literature (Qi et al., 2006; Licamele & Getoor, 2006) that construct features on the basis of background knowledge, an approach that extracts purely topological graph features (Paradesi et...
    Pitch detection and instrument identification can be achieved with relatively high accuracy when considering monophonic signals in music; however, accurately classifying polyphonic signals in music remains an unsolved research problem. Pitch and instrument classification is a subset of Music Information Retrieval (MIR) and automatic music transcription, both having numerous research and real-world applications. Several areas of research are covered in this chapter, including the fast Fourier transform, onset detection, convolution, and filtering. Polyphonic signals with many different voices and frequencies can be exceptionally complex. This chapter presents a new model for representing the spectral structure of polyphonic signals: Uniform MAx Gaussian Envelope (UMAGE). The new spectral envelope precisely approximates the distribution of frequency parts in the spectrum while still being resilient to oscillating rapidly and is able to generalize well without losing the representation...
    We present an adaptation of the standard genetic program (GP) to hierarchically decomposable, multi-agent learning problems. To break down a problem that requires cooperation of multiple agents, we use the team objective function to derive a simpler, intermediate objective function for pairs of cooperating agents. We apply GP to optimize first for the intermediate, then for the team objective function,
    In this paper we study the performance of probabilistic networks in the context of protein sequence analysis in molecular biology. Specifically, we report the results of our initial experiments applying this framework to the problem of protein secondary structure prediction. One of the main advantages of the probabilistic approach we describe here is our ability to perform detailed experiments where we can experiment with different models. We can easily perform local substitutions (mutations) and measure (probabilistically) their effect on the global structure. Window-based methods do not support such experimentation as readily. Our method is efficient both during training and during prediction, which is important in order to be able to perform many experiments with different networks. We believe that probabilistic methods are comparable to other methods in prediction quality. In addition, the predictions generated by our methods have precise quantitative semantics which is not shar...
    We present an application of inductive concept learning and interactive visualization techniques to a large-scale commercial data mining project. This paper focuses on design and configuration of high-level optimization systems (wrappers) for relevance determination and constructive induction, and on integrating these wrappers with elicited knowledge on attribute relevance and synthesis. In particular, we discuss decision support issues for the application
    We present an approach to inductive concept learning using multiple models for time series. Our objective is to improve the efficiency and accuracy of concept learning by decomposing learning tasks that admit multiple types of learning architectures and mixture estimation methods. The decomposition method adapts attribute subset selection and constructive induction (cluster definition) to define new subproblems. To these problem
    Probabilistic Prediction of Protein Secondary Structure Using Causal Networks. Arthur L. Delcher, Simon Kasif, Harry R. Goldberg, William H. Hsu. In this paper, we present a probabilistic approach to analysis and prediction of protein structure. ...
    Over the last 20 years or so, Bayesian networks (BNs) [Pe88, Ne90, RN95, CDLS99] have become the key method for representation and reasoning under uncertainty in AI. BNs not only provide a natural and compact way to encode exponentially sized joint probability distributions, ...
    This paper proposes and surveys genetic implementations of algorithms for selection and partitioning of attributes in large-scale concept learning problems. Algorithms of this type apply relevance determination criteria to attributes from those specified for the original ...
    In this paper, we address the problem of graph feature extraction and selection for link analysis in weblogs and similar social networks. First, we present an approach based on collaborative recommendation using the link structure of a social network and content-based recommendation using mutual declared interests. Next, we describe the application of this approach to a small representative subset of a large real-world social network: the user/community network of the blog service LiveJournal. We then discuss the ground features available in LiveJournal's public user information pages and describe some graph algorithms for analysis of the social network along with a feature set for classifying users as friends or non-friends. These are used to identify candidates, provide ground truth for recommendations, and construct features for learning the concept of an existing link. Finally, we evaluate the performance of classification learning algorithms and committee machines relative ...
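    The kind of graph feature mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' feature set: the function name `pair_features`, the specific features (common neighbors, Jaccard overlap, degrees), the toy user names, and the simplification to an undirected friendship graph are all assumptions made for illustration.

```python
from itertools import combinations


def pair_features(graph, u, v):
    """Topological features for a candidate user pair (u, v).

    `graph` maps each user to the set of users linked to them. It is
    treated as undirected here, which is an assumption of this sketch.
    """
    # Neighbors of each endpoint, excluding the other endpoint itself.
    neighbors_u = graph[u] - {v}
    neighbors_v = graph[v] - {u}
    common = neighbors_u & neighbors_v
    union = neighbors_u | neighbors_v
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        "degree_u": len(graph[u]),
        "degree_v": len(graph[v]),
        # Ground-truth label: does the link already exist?
        "linked": v in graph[u],
    }


# Hypothetical toy network; the user names are invented.
graph = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob"},
    "dan": {"bob"},
}

# One feature row per unordered pair of users.
features = {
    pair: pair_features(graph, *pair)
    for pair in combinations(sorted(graph), 2)
}
```

    Each row of `features` could then be passed to any standard classifier, with `linked` as the target and the remaining entries as inputs.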
    Topic models are probabilistic models for discovering topical themes in collections of documents. These models provide us with the means of organizing what would otherwise be unstructured collections. The first wave of topic models developed was able to discover the prevailing topics in a big collection of documents spanning a period of time. These time-invariant models were not capable of modeling 1) the time-varying number of topics they discover and 2) the time-changing structure of these topics. Few models were developed to address these two deficiencies. The online hierarchical Dirichlet process models the documents with a time-varying number of topics, and the continuous-time dynamic topic model evolves topic structure in continuous time. In this chapter, the authors present the continuous-time infinite dynamic topic model that combines the advantages of these two models. It is a probabilistic topic model that changes the number of topics and topic structure over continuous time.