Teaching Documents by Su WANG
The EM algorithm \citep{Dempster:77} applies widely in unsupervised learning, in particular to clustering models, e.g. K-means \citep{Kanungo:02} and Bernoulli mixture models \citep{Juan:04}.
Many practitioners, however, treat the algorithm as a pure black box in application.
In this guide, I derive the EM algorithm in complete detail and demonstrate its use in clustering models, to give machine learning practitioners a deeper understanding of the algorithm.
Readers are assumed to be familiar with basic clustering techniques and to have a general knowledge of probability and statistics.
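As a taste of what the guide derives, here is a minimal NumPy sketch of EM for a mixture of multivariate Bernoullis (the function name and smoothing constants are my own illustrative choices, not taken from the guide itself):

```python
import numpy as np

def em_bernoulli_mixture(X, K, n_iters=50, seed=0):
    """EM for a mixture of K multivariate Bernoullis.

    X: (N, D) binary matrix. Returns mixing weights pi (K,)
    and per-component Bernoulli parameters mu (K, D).
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    pi = np.full(K, 1.0 / K)                    # mixing weights
    mu = rng.uniform(0.25, 0.75, size=(K, D))   # Bernoulli parameters
    for _ in range(n_iters):
        # E-step: responsibilities r[n, k] proportional to
        # pi_k * p(x_n | mu_k), computed in log space for stability.
        log_p = X @ np.log(mu).T + (1 - X) @ np.log(1 - mu).T
        log_r = np.log(pi) + log_p
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate pi and mu from the soft counts
        # (tiny constants keep parameters strictly inside (0, 1)).
        Nk = r.sum(axis=0)
        pi = (Nk + 1e-9) / (N + K * 1e-9)
        mu = (r.T @ X + 1e-9) / (Nk[:, None] + 2e-9)
    return pi, mu
```

Run on well-separated binary data, the recovered parameter vectors split into a high-probability and a low-probability component.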
In this tutorial, I present an intuitive introduction to the Generative Adversarial Network (GAN), invented by Ian Goodfellow of Google Brain. I give an overview of the general idea of the model and describe the algorithm for training it as per the original work. I further briefly introduce the application of GANs in Natural Language Processing to show their flexibility and strong potential as a neural network architecture. Alongside the discussion, I also present simple TensorFlow code for the original GAN and an important variant, the Wasserstein GAN, to help the reader get a quick start in practical applications.
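For reference, the training objectives mentioned above can be stated compactly. The original GAN plays the minimax game

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] +
  \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big],
```

while the Wasserstein GAN instead optimizes, over 1-Lipschitz critics $D$,

```latex
\min_G \max_{\|D\|_L \le 1} \;
  \mathbb{E}_{x \sim p_{\text{data}}}\big[D(x)\big] -
  \mathbb{E}_{z \sim p_z}\big[D(G(z))\big].
```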
[NOTE: The draft is imbalanced in its depth of coverage across methods. The slides from Dataday 2017 provide a short complement. If you are interested in particular topics, or have critiques or comments, please feel free to email me at shrekwang@utexas.edu]
A huge literature has been dedicated to NER. Here I give a bird's-eye overview of selected methods (supervised, semi-supervised, unsupervised, and deep learning based), describing main ideas and sketching algorithms, with a focus on unsupervised NER, which I consider, from my work experience as a data scientist, the most useful in practice.
In addition, I propose an end-to-end unsupervised architecture that leverages the combined force of traditional feature engineering and deep learning.
(Tutorial & Code Updated Regularly on suwangcompling.com)
Introductory Machine Learning draft, based on Andrew Ng's materials, revamped into a much shorter and more concise version to save time. I also dive deep into the details of the mechanical steps in implementing the learning algorithms, so that first-timers may better understand the nuts and bolts.
What's covered is coded in Python (not included in the tutorial), available on my GitHub/website: https://github.com/suwangcompling/ml.
Content:
- Supervised Learning: Ch1 Linear Regression; Ch2 Logistic Regression; Ch3 Neural Network; Ch5 Support Vector Machine (SVM)
- Unsupervised Learning: Ch6 Clustering - K-means & Principal Component Analysis (PCA); Ch7 Anomaly/Outlier Detection
- Model Evaluation: Ch4 Learning Algorithm Evaluation
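To illustrate the implementation-level detail the draft aims for, here is a minimal NumPy sketch of K-means (the clustering algorithm covered in Ch6; the function name and defaults are my own illustrative choices):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain K-means: alternate assignment and centroid update."""
    rng = np.random.default_rng(seed)
    # initialize centroids as k distinct data points
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # assignment step: each point goes to its nearest centroid
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        # update step: each centroid moves to the mean of its points
        # (an empty cluster keeps its old centroid)
        new = np.array([X[labels == j].mean(axis=0)
                        if np.any(labels == j) else centroids[j]
                        for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels
```

On two well-separated blobs, the returned labels recover the blob membership.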
Based on Professor Trevor Hastie's material.
Going into details of the most fundamental models in statistical/machine learning: Linear/Logistic Regression; Tree-based Models; SVM; PCA; K-means; Hierarchical Clustering
This short article presents a succinct coverage of the essentials of Bayesian methods, provides details on motivations and derivations, and aims to give beginners a good grasp of the main methods employed in Bayesian statistics.
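A one-line example of the kind of derivation such an article covers: with a conjugate Beta prior on a coin's bias, the posterior after observing flips is again a Beta, so the update is just addition of counts (a minimal sketch; the helper name is illustrative):

```python
def beta_binomial_posterior(alpha, beta, heads, tails):
    """Beta(alpha, beta) prior + binomial likelihood
    -> Beta(alpha + heads, beta + tails) posterior."""
    a_post = alpha + heads
    b_post = beta + tails
    posterior_mean = a_post / (a_post + b_post)
    return a_post, b_post, posterior_mean
```

With a uniform Beta(1, 1) prior and 7 heads out of 10 flips, the posterior is Beta(8, 4) with mean 2/3.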
Talks by Su WANG
A survey of key methods in Named Entity Recognition. High-level description of the mechanisms of the models (intended as a stepping stone toward understanding state-of-the-art research in the domain).
Conference Presentations by Su WANG
Distributional data tells us that a man can swallow candy, but not that a man can swallow a paintball, since this is never attested. However, both are physically plausible events. This paper introduces the task of semantic plausibility: recognizing plausible but possibly novel events. We present a new crowdsourced dataset of semantic plausibility judgments of single events such as "man swallow paintball". Simple models based on distributional representations perform poorly on this task, despite doing well on selectional preference, but injecting manually elicited knowledge about entity properties provides a substantial performance boost. Our error analysis shows that our new dataset is a great testbed for semantic plausibility models: more sophisticated knowledge representation and propagation could address many of the remaining errors.
We test whether distributional models can do one-shot learning of definitional properties from text only. Using Bayesian models, we find that first learning overarching structure in the known data, i.e. regularities in textual contexts and in properties, helps one-shot learning, and that individual context items can be highly informative. Our experiments show that our model can learn properties from a single exposure when given an informative utterance.
We explore techniques to maximize the effectiveness of discourse information in the task of authorship attribution. We present a novel method to embed discourse features in a Convolutional Neural Network text classifier, which achieves a state-of-the-art result by a significant margin. We empirically investigate several featurization methods to understand the conditions under which discourse features contribute non-trivial performance gains, and analyze discourse embeddings.
I present an in-detail introduction to Topic Models (TM), a family of probabilistic models for (mainly) document modeling. I introduce and motivate the model, and illustrate its applications in Natural Language Processing (NLP), with a particular focus on a thorough description and derivation of the common inference algorithms proposed for TMs. I also compare the algorithms, overview various evaluation methods, and provide practical suggestions. Finally, I look at a few popular extensions of TM before concluding.
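As a companion to the inference discussion, here is a compact collapsed Gibbs sampler for LDA in NumPy, one of the common inference algorithms for TMs (variable names, hyperparameter defaults, and iteration count are my own illustrative choices):

```python
import numpy as np

def lda_gibbs(docs, n_topics, n_words, n_iters=200,
              alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA.

    docs: list of lists of word ids in [0, n_words).
    Returns doc-topic counts ndk and topic-word counts nkw.
    """
    rng = np.random.default_rng(seed)
    n_docs = len(docs)
    ndk = np.zeros((n_docs, n_topics))   # doc-topic counts
    nkw = np.zeros((n_topics, n_words))  # topic-word counts
    nk = np.zeros(n_topics)              # topic totals
    # random initial topic assignment for every token
    z = [rng.integers(n_topics, size=len(d)) for d in docs]
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                # remove current assignment
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # full conditional p(z_i = k | everything else)
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) \
                    / (nk + n_words * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = k                # resample and restore counts
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    return ndk, nkw
```

The count matrices can then be normalized (with the smoothing hyperparameters) into document-topic and topic-word distributions.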