Matt Lease
The University of Texas at Austin, School of Information, Faculty Member
ABSTRACT This introduction to the special issue summarizes and contextualizes six novel research contributions at the intersection of information retrieval (IR) and crowdsourcing (also overlapping crowdsourcing’s closely-related sibling, human computation). Several of the papers included in this special issue represent deeper investigations into research topics for which earlier stages of the authors’ research were disseminated at crowdsourcing workshops at SIGIR and WSDM conferences, as well as at the NIST TREC conference. Since the first proposed use of crowdsourcing for IR in 2008, interest in this area has quickly accelerated and led to three workshops, an ongoing NIST TREC track, and a great variety of published papers, talks, and tutorials. We briefly summarize the area in order to help situate the contributions appearing in this special issue. We also discuss some broader current trends and issues in crowdsourcing which bear upon its use in IR and other fields.
... Yakushiji, A., Tateisi, Y., Miyao, Y., Tsujii, J.: Event extraction from biomedical papers using a full parser. ... Daraselia, N., Yuryev, A., Egorov, S., Novichkova, S., Nikitin, A., Mazo, I.: Extracting human protein interactions from MEDLINE using a full-sentence parser. ...
Abstract. A well-known challenge of information retrieval is how to infer a user's underlying information need when the input query consists of only a few keywords. Question Answering (QA) systems face an equally important but opposite challenge: given a verbose question, how can the ...
ABSTRACT As mobile devices continue to proliferate and become more tightly integrated with our daily activities, a number of libraries have begun deploying customized mobile Web portals and applications to promote accessibility for patrons. Despite rapid growth of these ...
ABSTRACT We study how to best use crowdsourced relevance judgments for learning to rank [1, 7]. We integrate two lines of prior work: unreliable crowd-based binary annotation for binary classification [5, 3] and aggregating graded relevance judgments from reliable experts ...
... Horton, JJ and Chilton, LB (2010). The labor economics of paid crowdsourcing. Proceedings of the 11th ACM conference on Electronic commerce, 209-218. Howe, J. (2006) The Rise of Crowdsourcing, Wired, 14(6), URL (accessed 12 May; 2011): ...
... Crowdsourcing 101: Putting the WSDM of Crowds to Work for You. Wisdom of Crowds (WoC) Requires Diversity ... cf. NIPS'10 Workshop Computational Social Science & the Wisdom of Crowds ...
ABSTRACT While Amazon's Mechanical Turk (AMT) online workforce has been characterized by many people as being anonymous, we expose an aspect of AMT's system design that can be exploited to reveal a surprising amount of information about many AMT Workers, which may include personally identifying information (PII). This risk of PII exposure may surprise many Workers and Requesters today, as well as impact current institutional review board (IRB) oversight of human subjects research involving AMT Workers as participants. We assess the potential multi-faceted impact of such PII exposure for each stakeholder group: Workers, Requesters, and AMT itself. We discuss potential remedies each group may explore, as well as the responsibility of each group with regard to privacy protection. This discussion leads us to further situate issues of crowd worker privacy amidst broader ethical, economic, and regulatory issues, and we conclude by offering a set of recommendations to each stakeholder group.
We describe our submission to the Image Relevance Assessment Task (IRAT) at the 2012 Text REtrieval Conference (TREC) Crowdsourcing Track. Four aspects distinguish our approach: 1) an interface for cohesive, efficient topic-based relevance judging and reporting judgment confidence; 2) a variant of Welinder and Perona’s method for online crowdsourcing [17] (inferring quality of the judgments and judges during data collection in order to dynamically optimize data collection); 3) a completely unsupervised approach using no labeled data for either training or tuning; and 4) automatic generation of individualized error reports for each crowd worker, supporting transparent assessment and education of workers. Our system was built start-to-finish in two weeks, and we collected approximately 44,000 labels for about $40 US.
The first computers were people. Today, Internet-based access to 24/7 online human crowds has led to a renaissance of research in human computation and the advent of crowdsourcing. These new opportunities have brought a disruptive shift to research and practice for how we build intelligent systems today. Not only can labeled data for training and evaluation be collected faster, cheaper,
In the language model (LM) paradigm for information retrieval (IR), a document's relevance is estimated as the probability of observing the query string as a random sample from the document's underlying LM [12]. The standard unigram LM approach has been shown to have a ...
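As a rough illustration of the query-likelihood idea described in this abstract, the sketch below scores a document by the log-probability of generating the query from the document's unigram LM. The function name, the toy documents, and the use of Dirichlet smoothing with `mu=2000` are assumptions made for the example; the abstract does not say which smoothing, if any, the paper uses.

```python
import math
from collections import Counter


def query_log_likelihood(query, document, collection, mu=2000.0):
    """Log P(query | document LM) for a unigram model.

    Dirichlet smoothing against the collection is an assumption of this
    sketch; it keeps unseen query terms from zeroing out the score.
    """
    doc_counts = Counter(document)
    coll_counts = Counter(collection)
    doc_len = len(document)
    coll_len = len(collection)

    score = 0.0
    for term in query:
        p_coll = coll_counts[term] / coll_len
        if p_coll == 0.0:
            continue  # term never occurs in the collection; skip it
        p_doc = (doc_counts[term] + mu * p_coll) / (doc_len + mu)
        score += math.log(p_doc)
    return score


# Toy data (hypothetical): two tiny documents forming the whole collection.
doc_a = "crowd workers label images".split()
doc_b = "language models rank documents".split()
collection = doc_a + doc_b
query = ["language", "models"]

score_a = query_log_likelihood(query, doc_a, collection)
score_b = query_log_likelihood(query, doc_b, collection)
print("doc_b ranks higher:", score_b > score_a)
```

Documents are ranked by this score, so the document whose LM makes the query more probable is placed first.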
We present a new supervised method for estimating term-based retrieval models and apply it to weight expansion terms from relevance feedback. While previous work on supervised feedback [Cao et al., 2008] demonstrated significantly improved retrieval accuracy over standard ...
ABSTRACT Crowdsourcing [5] methods are quickly changing the landscape for the quantity, quality, and type of labeled data available to supervised learning. While such data can now be obtained more quickly and cheaply than ever before, the generated labels also ...