Books by Andrew McCallum
Papers by Andrew McCallum
Abstract As an alternative to communication via messages or files, shared memory has the potential to be simpler, faster, and less wasteful of space. Unfortunately, the mechanisms available for sharing in most multi-user operating systems are difficult to use. As a result, shared memory tends to appear primarily in self-contained parallel applications, where library or compiler support can take care of the messy details.
Abstract Recently, many advanced machine learning approaches have been proposed for coreference resolution; however, all of the discriminatively-trained models reason over mentions rather than entities. That is, they do not explicitly contain variables indicating the “canonical” values for each attribute of an entity (e.g., name, venue, title, etc.). This canonicalization step is typically implemented as a post-processing routine to coreference resolution prior to adding the extracted entity to a database.
There has been growing interest in using joint inference across multiple subtasks as a mechanism for avoiding the cascading accumulation of errors in traditional pipelines. Several recent papers demonstrate joint inference between the segmentation of entity mentions and their de-duplication; however, they have various weaknesses: inference information flows only in one direction, the number of uncertain hypotheses is severely limited, or the subtasks are only loosely coupled.
Several recent works on relation extraction have been applying the distant supervision paradigm: instead of relying on annotated text to learn how to predict relations, they employ existing knowledge bases (KBs) as a source of supervision. Crucially, these approaches are trained based on the assumption that each sentence which mentions the two related entities is an expression of the given relation.
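The distant-supervision labeling step described above can be sketched in a few lines. This is a minimal illustration, assuming a toy KB of (entity, relation, entity) triples and naive substring matching for entity mentions; a real system would use proper entity linking, and the point of the abstract is precisely that the labeling assumption below is often wrong.

```python
def distant_label(sentences, kb):
    """Label each sentence with every KB relation whose entity pair it mentions."""
    labeled = []
    for sent in sentences:
        for e1, rel, e2 in kb:
            # Core assumption of distant supervision: any sentence mentioning
            # both entities expresses the relation between them.
            if e1 in sent and e2 in sent:
                labeled.append((sent, e1, rel, e2))
    return labeled

kb = [("Barack Obama", "born_in", "Honolulu")]
sents = ["Barack Obama was born in Honolulu.",
         "Barack Obama visited Honolulu last year."]
# Both sentences get labeled born_in, though only the first expresses it --
# exactly the noisy-supervision problem the work addresses.
pairs = distant_label(sents, kb)
```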
Abstract The proliferation of hypertext and the popularity of Kleinberg's HITS algorithm have brought about an increased interest in link analysis. While HITS and its older relatives from bibliometrics provide a method for finding authoritative sources on a particular topic, they do not allow individual users to inject their own opinions on what sources are authoritative. This paper presents a technique for learning a user's internal model of authority.
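For context, the HITS iteration the abstract refers to can be sketched as follows. This is a minimal version of Kleinberg's mutual-reinforcement update on a toy adjacency list, not the paper's personalized-authority learning method; the function name and graph representation are illustrative choices.

```python
def hits(links, iters=50):
    """Hub/authority scores. links: dict mapping node -> list of nodes it points to."""
    nodes = set(links) | {v for vs in links.values() for v in vs}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iters):
        # A page's authority is the sum of the hub scores of pages linking to it.
        auth = {n: sum(hub[u] for u, vs in links.items() if n in vs) for n in nodes}
        # A page's hub score is the sum of the authority scores of pages it links to.
        hub = {n: sum(auth[v] for v in links.get(n, [])) for n in nodes}
        # Normalize so the scores converge instead of growing without bound.
        na = sum(a * a for a in auth.values()) ** 0.5 or 1.0
        nh = sum(h * h for h in hub.values()) ** 0.5 or 1.0
        auth = {n: a / na for n, a in auth.items()}
        hub = {n: h / nh for n, h in hub.items()}
    return hub, auth
```

On a graph where pages `a` and `b` both link to `c`, the iteration makes `c` the top authority and `a`, `b` the top hubs.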
Abstract We present an objective function for learning with unlabeled data that utilizes auxiliary expectation constraints. We optimize this objective function using a procedure that alternates between information and moment projections.
Abstract Incorporating probabilities into the semantics of incomplete databases has posed many challenges, forcing systems to sacrifice modeling power, scalability, or treatment of relational algebra operators. We propose an alternative approach where the underlying relational database always represents a single world, and an external factor graph encodes a distribution over possible worlds; Markov chain Monte Carlo (MCMC) inference is then used to recover this uncertainty to a desired level of fidelity.
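The single-world-plus-MCMC idea can be illustrated on a tiny scale. Below is a sketch of Metropolis sampling over "possible worlds" of two binary variables scored by hypothetical factors; the database/factor-graph coupling in the actual system is far richer, and the factor values here are invented for illustration.

```python
import random

def factor_score(world):
    """Unnormalized score of a world under two hypothetical factors:
    prefer x = 1, and prefer agreement x == y."""
    x, y = world
    return (2.0 if x else 1.0) * (3.0 if x == y else 1.0)

def mcmc(steps=10000, seed=0):
    rng = random.Random(seed)
    world = (0, 0)          # the single materialized world, mutated by proposals
    counts = {}
    for _ in range(steps):
        # Propose flipping one variable of the current world.
        i = rng.randrange(2)
        prop = tuple(v ^ 1 if j == i else v for j, v in enumerate(world))
        # Metropolis acceptance uses only the ratio of unnormalized scores,
        # so the partition function is never needed.
        if rng.random() < factor_score(prop) / factor_score(world):
            world = prop
        counts[world] = counts.get(world, 0) + 1
    return counts
```

The visit counts approximate the distribution over worlds: `(1, 1)` has the highest score (6) and so is visited most often.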
Abstract Topic models are a useful tool for analyzing large text collections, but have previously been applied in only monolingual, or at most bilingual, contexts. Meanwhile, massive collections of interlinked documents in dozens of languages, such as Wikipedia, are now widely available, calling for tools that can characterize content in many languages. We introduce a polylingual topic model that discovers topics aligned across multiple languages.
As the truncation level T is increased, the optimized Dirichlet hyperparameters result in dramatically improved consistency in topic usage. The authors claim that hyperparameter-optimized LDA is more robust and data-driven than standard LDA, while incurring lower model complexity and computational cost than nonparametric models.
Abstract This paper presents an active learning method that directly optimizes expected future error. This is in contrast to many other popular techniques that instead aim to reduce version space size. These methods are popular because, for many learning models, closed-form calculation of the expected future error is intractable. Our approach is made feasible by taking a Monte Carlo approach to estimating the expected reduction in error due to the labeling of a query.
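The expected-error-reduction loop can be sketched concretely. The following is a toy version on 1-D data with a hypothetical nearest-centroid classifier as a stand-in for the paper's probabilistic learners: for each candidate query, average the pool error that would result from each possible label, weighted by the current model's label probabilities, and query the point minimizing that expectation.

```python
import math

def probs(x, labeled):
    """P(y|x) from distance to class centroids (a hypothetical stand-in model)."""
    cents = {}
    for xi, yi in labeled:
        cents.setdefault(yi, []).append(xi)
    scores = {y: math.exp(-abs(x - sum(v) / len(v))) for y, v in cents.items()}
    z = sum(scores.values())
    return {y: s / z for y, s in scores.items()}

def expected_error(labeled, pool):
    """Estimated error = mean (1 - max_y P(y|x)) over the unlabeled pool."""
    return sum(1 - max(probs(x, labeled).values()) for x in pool) / len(pool)

def select_query(labeled, pool):
    """Pick the pool point whose (hypothetical) labeling minimizes expected
    future error, averaging over labels weighted by the current P(y|x)."""
    best, best_err = None, float("inf")
    for x in pool:
        p = probs(x, labeled)
        err = sum(p[y] * expected_error(labeled + [(x, y)],
                                        [q for q in pool if q != x])
                  for y in p)
        if err < best_err:
            best, best_err = x, err
    return best
```

With labeled points at 0 (class a) and 10 (class b), this selects the ambiguous midpoint rather than points the model already classifies confidently.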
Abstract: The World Wide Web is a vast source of information accessible to computers, but understandable only to humans. The goal of the research described here is to automatically create a computer-understandable knowledge base whose content mirrors that of the World Wide Web. Such a knowledge base would enable much more effective retrieval of Web information, and promote new uses of the Web to support knowledge-based inference and problem solving.
Abstract In order for relation extraction systems to obtain human-level performance, they must be able to incorporate relational patterns inherent in the data (for example, that one's sister is likely one's mother's daughter, or that children are likely to attend the same college as their parents). Hand-coding such knowledge can be time-consuming and inadequate. Additionally, there may exist many interesting, unknown relational patterns that both improve extraction performance and provide insight into text.
Abstract Topic models provide a powerful tool for analyzing large text collections by representing high-dimensional data in a low-dimensional subspace. Fitting a topic model given a set of training documents requires approximate inference techniques that are computationally expensive. With today's large-scale, constantly expanding document collections, it is useful to be able to infer topic distributions for new documents without retraining the model.
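One common way to infer a new document's topic distribution without retraining is to "fold in" the document by Gibbs-sampling its token-topic assignments while holding the learned topic-word probabilities fixed. The sketch below assumes such a fixed `phi` (one word-probability dict per topic) is already available; it is an illustration of the folding-in idea, not the specific inference method evaluated in the paper.

```python
import random

def infer_theta(doc, phi, n_topics, alpha=0.5, iters=200, seed=0):
    """Estimate the doc-topic distribution for one new document,
    with topic-word probabilities phi held fixed."""
    rng = random.Random(seed)
    z = [rng.randrange(n_topics) for _ in doc]   # topic assignment per token
    counts = [0] * n_topics
    for t in z:
        counts[t] += 1
    for _ in range(iters):
        for i, w in enumerate(doc):
            counts[z[i]] -= 1
            # P(topic k | word w) proportional to (doc-topic count + alpha) * phi[k][w].
            weights = [(counts[k] + alpha) * phi[k].get(w, 1e-9)
                       for k in range(n_topics)]
            r, acc, k = rng.random() * sum(weights), 0.0, 0
            for k, wt in enumerate(weights):
                acc += wt
                if r <= acc:
                    break
            z[i] = k
            counts[k] += 1
    total = len(doc) + n_topics * alpha
    return [(c + alpha) / total for c in counts]
```

Given topics for pets vs. finance, a document of pet words ends up with nearly all its mass on the pet topic.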
Information retrieval (IR) research has reached a point where it is appropriate to assess progress and to define a research agenda for the next five to ten years. This report summarizes a discussion of IR research challenges that took place at a recent workshop. The attendees of the workshop considered information retrieval research in a range of areas chosen to give broad coverage of topic areas that engage information retrieval researchers.
Abstract Discriminatively trained undirected graphical models have had wide empirical success, and there has been increasing interest in toolkits that ease their application to complex relational data. The power in relational models is in their repeated structure and tied parameters; at issue is how to define these structures in a powerful and flexible way.
Abstract Methods that learn from prior information about input features, such as generalized expectation (GE), have been used to train accurate models with very little effort. In this paper, we propose an active learning approach in which the machine solicits "labels" on features rather than instances. In both simulated and real user experiments on two sequence labeling tasks, we show that our active learning method outperforms passive learning with features as well as traditional active learning with instances.
Abstract Cross-document coreference, the task of grouping all the mentions of each entity in a document collection, arises in information extraction and automated knowledge base construction. For large collections, it is clearly impractical to consider all possible groupings of mentions into distinct entities.