-
Alexa, Let's Work Together: Introducing the First Alexa Prize TaskBot Challenge on Conversational Task Assistance
Authors:
Anna Gottardi,
Osman Ipek,
Giuseppe Castellucci,
Shui Hu,
Lavina Vaz,
Yao Lu,
Anju Khatri,
Anjali Chadha,
Desheng Zhang,
Sattvik Sahai,
Prerna Dwivedi,
Hangjie Shi,
Lucy Hu,
Andy Huang,
Luke Dai,
Bofei Yang,
Varun Somani,
Pankaj Rajan,
Ron Rezac,
Michael Johnston,
Savanna Stiff,
Leslie Ball,
David Carmel,
Yang Liu,
Dilek Hakkani-Tur
, et al. (5 additional authors not shown)
Abstract:
Since its inception in 2016, the Alexa Prize program has enabled hundreds of university students to explore and compete to develop conversational agents through the SocialBot Grand Challenge. The goal of the challenge is to build agents capable of conversing coherently and engagingly with humans on popular topics for 20 minutes, while achieving an average rating of at least 4.0/5.0. However, as co…
▽ More
Since its inception in 2016, the Alexa Prize program has enabled hundreds of university students to explore and compete to develop conversational agents through the SocialBot Grand Challenge. The goal of the challenge is to build agents capable of conversing coherently and engagingly with humans on popular topics for 20 minutes, while achieving an average rating of at least 4.0/5.0. However, as conversational agents attempt to assist users with increasingly complex tasks, new conversational AI techniques and evaluation platforms are needed. The Alexa Prize TaskBot challenge, established in 2021, builds on the success of the SocialBot challenge by introducing the requirements of interactively assisting humans with real-world Cooking and Do-It-Yourself tasks, while making use of both voice and visual modalities. This challenge requires the TaskBots to identify and understand the user's need, identify and integrate task and domain knowledge into the interaction, and develop new ways of engaging the user without distracting them from the task at hand, among other challenges. This paper provides an overview of the TaskBot challenge, describes the infrastructure support provided to the teams with the CoBot Toolkit, and summarizes the approaches the participating teams took to overcome the research challenges. Finally, it analyzes the performance of the competing TaskBots during the first year of the competition.
△ Less
Submitted 13 September, 2022;
originally announced September 2022.
-
Answering Product-Questions by Utilizing Questions from Other Contextually Similar Products
Authors:
Ohad Rozen,
David Carmel,
Avihai Mejer,
Vitaly Mirkis,
Yftah Ziser
Abstract:
Predicting the answer to a product-related question is an emerging field of research that recently attracted a lot of attention. Answering subjective and opinion-based questions is most challenging due to the dependency on customer-generated content. Previous works mostly focused on review-aware answer prediction; however, these approaches fail for new or unpopular products, having no (or only a f…
▽ More
Predicting the answer to a product-related question is an emerging field of research that recently attracted a lot of attention. Answering subjective and opinion-based questions is most challenging due to the dependency on customer-generated content. Previous works mostly focused on review-aware answer prediction; however, these approaches fail for new or unpopular products, having no (or only a few) reviews at hand. In this work, we propose a novel and complementary approach for predicting the answer for such questions, based on the answers for similar questions asked on similar products. We measure the contextual similarity between products based on the answers they provide for the same question. A mixture-of-expert framework is used to predict the answer by aggregating the answers from contextually similar products. Empirical results demonstrate that our model outperforms strong baselines on some segments of questions, namely those that have roughly ten or more similar resolved questions in the corpus. We additionally publish two large-scale datasets used in this work, one is of similar product question pairs, and the second is of product question-answer pairs.
△ Less
Submitted 19 May, 2021;
originally announced May 2021.
-
Fishing in the Stream: Similarity Search over Endless Data
Authors:
Naama Kraus,
David Carmel,
Idit Keidar
Abstract:
Similarity search is the task of retrieving data items that are similar to a given query. In this paper, we introduce the time-sensitive notion of similarity search over endless data-streams (SSDS), which takes into account data quality and temporal characteristics in addition to similarity. SSDS is challenging as it needs to process unbounded data, while computation resources are bounded. We prop…
▽ More
Similarity search is the task of retrieving data items that are similar to a given query. In this paper, we introduce the time-sensitive notion of similarity search over endless data-streams (SSDS), which takes into account data quality and temporal characteristics in addition to similarity. SSDS is challenging as it needs to process unbounded data, while computation resources are bounded. We propose Stream-LSH, a randomized SSDS algorithm that bounds the index size by retaining items according to their freshness, quality, and dynamic popularity attributes. We analytically show that Stream-LSH increases the probability to find similar items compared to alternative approaches using the same space capacity. We further conduct an empirical study using real world stream datasets, which confirms our theoretical results.
△ Less
Submitted 7 August, 2017;
originally announced August 2017.
-
Tail-Tolerant Distributed Search
Authors:
Naama Kraus,
David Carmel,
Idit Keidar
Abstract:
Today's search engines process billions of online user queries a day over huge collections of data. In order to scale, they distribute query processing among many nodes, where each node holds and searches over a subset of the index called shard. Responses from some nodes occasionally fail to arrive within a reasonable time-interval due to various reasons, such as high server load and network conge…
▽ More
Today's search engines process billions of online user queries a day over huge collections of data. In order to scale, they distribute query processing among many nodes, where each node holds and searches over a subset of the index called shard. Responses from some nodes occasionally fail to arrive within a reasonable time-interval due to various reasons, such as high server load and network congestion. Search engines typically need to respond in a timely manner, and therefore skip such tail latency responses, which causes degradation in search quality. In this paper, we tackle response misses due to high tail latencies with the goal of maximizing search quality.
Search providers today use redundancy in the form of Replication for mitigating response misses, by constructing multiple copies of each shard and searching all replicas. This approach is not ideal, as it wastes resources on searching duplicate data. We propose two strategies to reduce this waste. First, we propose rSmartRed, an optimal shard selection scheme for replicated indexes. Second, when feasible, we propose to replace Replication with Repartition, which constructs independent index instances instead of exact copies. We analytically prove that rSmartRed's selection is optimal for Replication, and that Repartition achieves better search quality compared to Replication. We confirm our results with an empirical study using two real-world datasets, showing that rSmartRed improves recall compared to currently used approaches. We additionally show that Repartition improves over Replication in typical scenarios.
△ Less
Submitted 24 July, 2017;
originally announced July 2017.
-
NearBucket-LSH: Efficient Similarity Search in P2P Networks
Authors:
Naama Kraus,
David Carmel,
Idit Keidar,
Meni Orenbach
Abstract:
We present NearBucket-LSH, an effective algorithm for similarity search in large-scale distributed online social networks organized as peer-to-peer overlays. As communication is a dominant consideration in distributed systems, we focus on minimizing the network cost while guaranteeing good search quality. Our algorithm is based on Locality Sensitive Hashing (LSH), which limits the search to collec…
▽ More
We present NearBucket-LSH, an effective algorithm for similarity search in large-scale distributed online social networks organized as peer-to-peer overlays. As communication is a dominant consideration in distributed systems, we focus on minimizing the network cost while guaranteeing good search quality. Our algorithm is based on Locality Sensitive Hashing (LSH), which limits the search to collections of objects, called buckets, that have a high probability to be similar to the query. More specifically, NearBucket-LSH employs an LSH extension that searches in near buckets, and improves search quality but also significantly increases the network cost. We decrease the network cost by considering the internals of both LSH and the P2P overlay, and harnessing their properties to our needs. We show that our NearBucket-LSH increases search quality for a given network cost compared to previous art. In many cases, the search quality increases by more than 50%.
△ Less
Submitted 23 November, 2015;
originally announced November 2015.