-
On Gradient Boosted Decision Trees and Neural Rankers: A Case-Study on Short-Video Recommendations at ShareChat
Authors:
Olivier Jeunen,
Hitesh Sagtani,
Himanshu Doi,
Rasul Karimov,
Neeti Pokharna,
Danish Kalim,
Aleksei Ustimenko,
Christopher Green,
Wenzhe Shi,
Rishabh Mehrotra
Abstract:
Practitioners who wish to build real-world applications that rely on ranking models need to decide which modelling paradigm to follow. This is not an easy choice to make, as the research literature on this topic has been shifting in recent years. In particular, whilst Gradient Boosted Decision Trees (GBDTs) have reigned supreme for more than a decade, the flexibility of neural networks has allowed them to catch up, and recent works report accuracy metrics that are on par. Nevertheless, practical systems require considerations beyond mere accuracy metrics to decide on a modelling approach.
This work describes our experiences in balancing some of the trade-offs that arise, presenting a case study on a short-video recommendation application. We highlight that (1) neural networks' ability to handle large training data sizes and user- and item-embeddings allows for more accurate models than GBDTs in this setting, and (2) because GBDTs are less reliant on specialised hardware, they can provide an equally accurate model at a lower cost. We believe these findings are of relevance to researchers in both academia and industry, and hope they can inspire practitioners who need to make similar modelling choices in the future.
Submitted 4 December, 2023;
originally announced December 2023.
-
Ad-load Balancing via Off-policy Learning in a Content Marketplace
Authors:
Hitesh Sagtani,
Madan Jhawar,
Rishabh Mehrotra,
Olivier Jeunen
Abstract:
Ad-load balancing is a critical challenge in online advertising systems, particularly in the context of social media platforms, where the goal is to maximize user engagement and revenue while maintaining a satisfactory user experience. This requires the optimization of conflicting objectives, such as user satisfaction and ads revenue. Traditional approaches to ad-load balancing rely on static allocation policies, which fail to adapt to changing user preferences and contextual factors. In this paper, we present an approach that leverages off-policy learning and evaluation from logged bandit feedback. We start by presenting a motivating analysis of the ad-load balancing problem, highlighting the conflicting objectives between user satisfaction and ads revenue. We emphasize the nuances that arise due to user heterogeneity and the dependence on the user's position within a session. Based on this analysis, we define the problem as determining the optimal ad-load for a particular feed fetch. To tackle this problem, we propose an off-policy learning framework that leverages unbiased estimators such as Inverse Propensity Scoring (IPS) and Doubly Robust (DR) to learn and estimate the policy values using offline collected stochastic data. We present insights from online A/B experiments deployed at scale across over 80 million users generating over 200 million sessions, where we find statistically significant improvements in both user satisfaction metrics and ads revenue for the platform.
Submitted 19 December, 2023; v1 submitted 19 September, 2023;
originally announced September 2023.
-
Task Preferences across Languages on Community Question Answering Platforms
Authors:
Sebastin Santy,
Prasanta Bhattacharya,
Rishabh Mehrotra
Abstract:
With the steady emergence of community question answering (CQA) platforms like Quora, StackExchange, and WikiHow, users now have unprecedented access to information on various kinds of queries and tasks. Moreover, the rapid proliferation and localization of these platforms spanning geographic and linguistic boundaries offer a unique opportunity to study the task requirements and preferences of users in different socio-linguistic groups. In this study, we implement an entity-embedding model trained on a large longitudinal dataset of multi-lingual and task-oriented question-answer pairs to uncover and quantify the (i) prevalence and distribution of various online tasks across linguistic communities, and (ii) emerging and receding trends in task popularity over time in these communities. Our results show that there exists substantial variance in task preference as well as popularity trends across linguistic communities on the platform. Findings from this study will help Q&A platforms better curate and personalize content for non-English users, while also offering valuable insights to businesses looking to target non-English speaking communities online.
Submitted 18 December, 2022;
originally announced December 2022.
-
Disentangling Causal Effects from Sets of Interventions in the Presence of Unobserved Confounders
Authors:
Olivier Jeunen,
Ciarán M. Gilligan-Lee,
Rishabh Mehrotra,
Mounia Lalmas
Abstract:
The ability to answer causal questions is crucial in many domains, as causal inference allows one to understand the impact of interventions. In many applications, only a single intervention is possible at a given time. However, in some important areas, multiple interventions are concurrently applied. Disentangling the effects of single interventions from jointly applied interventions is a challenging task -- especially as simultaneously applied interventions can interact. This problem is made harder still by unobserved confounders, which influence both treatments and outcome. We address this challenge by aiming to learn the effect of a single intervention from both observational data and sets of interventions. We prove that this is not generally possible, but provide identification proofs demonstrating that it can be achieved under non-linear continuous structural causal models with additive, multivariate Gaussian noise -- even when unobserved confounders are present. Importantly, we show how to incorporate observed covariates and learn heterogeneous treatment effects. Based on the identifiability proofs, we provide an algorithm that learns the causal model parameters by pooling data from different regimes and jointly maximizing the combined likelihood. The effectiveness of our method is empirically demonstrated on both synthetic and real-world data.
Submitted 11 October, 2022;
originally announced October 2022.
-
Mostra: A Flexible Balancing Framework to Trade-off User, Artist and Platform Objectives for Music Sequencing
Authors:
Emanuele Bugliarello,
Rishabh Mehrotra,
James Kirk,
Mounia Lalmas
Abstract:
We consider the task of sequencing tracks on music streaming platforms where the goal is to maximise not only user satisfaction, but also artist- and platform-centric objectives, needed to ensure long-term health and sustainability of the platform. Grounding the work across four objectives: Sat, Discovery, Exposure and Boost, we highlight the need and the potential to trade-off performance across these objectives, and propose Mostra, a Set Transformer-based encoder-decoder architecture equipped with submodular multi-objective beam search decoding. The proposed model affords system designers the power to balance multiple goals, and dynamically control the impact on one objective to satisfy other objectives. Through extensive experiments on data from a large-scale music streaming platform, we present insights on the trade-offs that exist across different objectives, and demonstrate that the proposed framework leads to a superior, just-in-time balancing across the various metrics of interest.
Submitted 21 April, 2022;
originally announced April 2022.
-
A real-time spatiotemporal AI model analyzes skill in open surgical videos
Authors:
Emmett D. Goodman,
Krishna K. Patel,
Yilun Zhang,
William Locke,
Chris J. Kennedy,
Rohan Mehrotra,
Stephen Ren,
Melody Y. Guan,
Maren Downing,
Hao Wei Chen,
Jevin Z. Clark,
Gabriel A. Brat,
Serena Yeung
Abstract:
Open procedures represent the dominant form of surgery worldwide. Artificial intelligence (AI) has the potential to optimize surgical practice and improve patient outcomes, but efforts have focused primarily on minimally invasive techniques. Our work overcomes existing data limitations for training AI models by curating, from YouTube, the largest dataset of open surgical videos to date: 1997 videos from 23 surgical procedures uploaded from 50 countries. Using this dataset, we developed a multi-task AI model capable of real-time understanding of surgical behaviors, hands, and tools - the building blocks of procedural flow and surgeon skill. We show that our model generalizes across diverse surgery types and environments. Illustrating this generalizability, we directly applied our YouTube-trained model to analyze open surgeries prospectively collected at an academic medical center and identified kinematic descriptors of surgical skill related to efficiency of hand motion. Our Annotated Videos of Open Surgery (AVOS) dataset and trained model will be made available for further development of surgical AI.
Submitted 14 December, 2021;
originally announced December 2021.
-
Counterfactual Evaluation of Slate Recommendations with Sequential Reward Interactions
Authors:
James McInerney,
Brian Brost,
Praveen Chandar,
Rishabh Mehrotra,
Ben Carterette
Abstract:
Users of music streaming, video streaming, news recommendation, and e-commerce services often engage with content in a sequential manner. Providing and evaluating good sequences of recommendations is therefore a central problem for these services. Prior reweighting-based counterfactual evaluation methods either suffer from high variance or make strong independence assumptions about rewards. We propose a new counterfactual estimator that allows for sequential interactions in the rewards with lower variance in an asymptotically unbiased manner. Our method uses graphical assumptions about the causal relationships of the slate to reweight the rewards in the logging policy in a way that approximates the expected sum of rewards under the target policy. Extensive experiments in simulation and on a live recommender system show that our approach outperforms existing methods in terms of bias and data efficiency for the sequential track recommendations problem.
Submitted 23 August, 2020; v1 submitted 25 July, 2020;
originally announced July 2020.
-
3D Channel Modeling and Characterization for Hypersurface Empowered Indoor Environment at 60 GHz Millimeter-Wave Band
Authors:
Rashi Mehrotra,
Rafay Iqbal Ansari,
Alexandros Pitilakis,
Shuai Nie,
Christos Liaskos,
Nikolaos V. Kantartzis,
Andreas Pitsillides
Abstract:
This paper proposes a three-dimensional (3D) communication channel model for an indoor environment considering the effect of the Hypersurface. The Hypersurface is a software-controlled intelligent metasurface, which can be used to manipulate electromagnetic waves, for example for non-specular reflection and full absorption. Thus it can control the impinging rays from a transmitter towards a receiver location in both LOS and NLOS paths, e.g. to combat distance and improve wireless connectivity. We focus on the 60 GHz mmWave frequency band due to its increasing significance in 5G/6G networks and evaluate the effect of the Hypersurface in an indoor environment in terms of attenuation coefficients related to the Hypersurface reflection and absorption functionalities, using CST simulation, a 3D electromagnetic simulator of high-frequency components. To highlight the benefits of Hypersurface-coated walls versus plain walls, we use the derived Hypersurface 3D channel model and a custom 3D ray-tracing simulator for plain walls, considering a typical indoor scenario for different Tx-Rx locations and separation distances.
Submitted 28 June, 2019;
originally announced July 2019.
-
The Music Streaming Sessions Dataset
Authors:
Brian Brost,
Rishabh Mehrotra,
Tristan Jehan
Abstract:
At the core of many important machine learning problems faced by online streaming services is a need to model how users interact with the content they are served. Unfortunately, there are no public datasets currently available that enable researchers to explore this topic. In order to spur that research, we release the Music Streaming Sessions Dataset (MSSD), which consists of 160 million listening sessions and associated user actions. Furthermore, we provide audio features and metadata for the approximately 3.7 million unique tracks referred to in the logs. This is the largest collection of such track metadata currently available to the public. This dataset enables research on important problems including how to model user listening and interaction behaviour in streaming, as well as Music Information Retrieval (MIR), and session-based sequential recommendations. Additionally, a subset of sessions were collected using a uniformly random recommendation setting, enabling their use for counterfactual evaluation of such sequential recommendations. Finally, we provide an analysis of user behavior and suggest further research problems which can be addressed using the dataset.
Submitted 14 October, 2020; v1 submitted 31 December, 2018;
originally announced January 2019.
-
Towards Task Understanding in Visual Settings
Authors:
Sebastin Santy,
Wazeer Zulfikar,
Rishabh Mehrotra,
Emine Yilmaz
Abstract:
We consider the problem of understanding real-world tasks depicted in visual images. While most existing image captioning methods excel in producing natural language descriptions of visual scenes involving human tasks, there is often the need for an understanding of the exact task being undertaken rather than a literal description of the scene. We leverage insights from real-world task understanding systems, and propose a framework composed of convolutional neural networks and an external hierarchical task ontology to produce task descriptions from input images. Detailed experiments highlight the efficacy of the extracted descriptions, which could potentially find their way into many applications, including image alt text generation.
Submitted 28 November, 2018;
originally announced November 2018.
-
Characterizing and Predicting Supply-side Engagement on Crowd-contributed Video Sharing Platforms
Authors:
Rishabh Mehrotra,
Prasanta Bhattacharya
Abstract:
Video sharing and entertainment websites have rapidly grown in popularity and now constitute some of the most visited websites on the Internet. Despite the active user engagement on these online video-sharing platforms, most recent research on online media platforms has restricted itself to networking-based social media sites, like Facebook or Twitter. We depart from previous studies in the online media space that have focused exclusively on demand-side user engagement, by modeling the supply-side of the crowd-contributed videos on this platform. The current study is among the first to perform a large-scale empirical study using longitudinal video upload data from a large online video platform. The modeling and subsequent prediction of video uploads is made complicated by the heterogeneity of video types (e.g. popular vs. niche video genres), and the inherent time trend effects associated with media uploads. We identify distinct genre-clusters from our dataset and employ a self-exciting Hawkes point-process model on each of these clusters to fully specify and estimate the video upload process. Additionally, we go beyond prediction to disentangle potential factors that govern user engagement and determine the video upload rates, which improves our analysis with additional explanatory power. Our findings show that using a relatively parsimonious point-process model, we are able to achieve higher model fit, and predict video uploads to the platform with a higher accuracy than competing models. The findings from this study can benefit platform owners in better understanding how their supply-side users engage with their site over time. We also offer a robust method for performing media upload prediction that is likely to be generalizable across media platforms which demonstrate similar temporal and genre-level heterogeneity.
Submitted 10 June, 2017;
originally announced June 2017.
-
Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian Nonparametric Approach
Authors:
Rishabh Mehrotra,
Emine Yilmaz
Abstract:
A significant number of search queries originate from some real-world information need or task. In order to improve the search experience of the end users, it is important to have accurate representations of tasks. As a result, a significant amount of research has been devoted to extracting proper representations of tasks in order to enable search systems to help users complete their tasks, as well as providing the end user with better query suggestions, better recommendations, satisfaction prediction, and improved personalization in terms of tasks. Most existing task extraction methodologies focus on representing tasks as flat structures. However, tasks often tend to have multiple subtasks associated with them and a more naturalistic representation of tasks would be in terms of a hierarchy, where each task can be composed of multiple (sub)tasks. To this end, we propose an efficient Bayesian nonparametric model for extracting hierarchies of such tasks & subtasks. We evaluate our method based on real-world query log data through both quantitative and crowdsourced experiments and highlight the importance of considering task/subtask hierarchies.
Submitted 6 June, 2017; v1 submitted 5 June, 2017;
originally announced June 2017.
-
Auditing Search Engines for Differential Satisfaction Across Demographics
Authors:
Rishabh Mehrotra,
Ashton Anderson,
Fernando Diaz,
Amit Sharma,
Hanna Wallach,
Emine Yilmaz
Abstract:
Many online services, such as search engines, social media platforms, and digital marketplaces, are advertised as being available to any user, regardless of their age, gender, or other demographic factors. However, there are growing concerns that these services may systematically underserve some groups of users. In this paper, we present a framework for internally auditing such services for differences in user satisfaction across demographic groups, using search engines as a case study. We first explain the pitfalls of naïvely comparing the behavioral metrics that are commonly used to evaluate search engines. We then propose three methods for measuring latent differences in user satisfaction from observed differences in evaluation metrics. To develop these methods, we drew on ideas from the causal inference literature and the multilevel modeling literature. Our framework is broadly applicable to other online services, and provides general insight into interpreting their evaluation metrics.
Submitted 24 May, 2017;
originally announced May 2017.
-
Frames: A Corpus for Adding Memory to Goal-Oriented Dialogue Systems
Authors:
Layla El Asri,
Hannes Schulz,
Shikhar Sharma,
Jeremie Zumer,
Justin Harris,
Emery Fine,
Rahul Mehrotra,
Kaheer Suleman
Abstract:
This paper presents the Frames dataset (Frames is available at http://datasets.maluuba.com/Frames), a corpus of 1369 human-human dialogues with an average of 15 turns per dialogue. We developed this dataset to study the role of memory in goal-oriented dialogue systems. Based on Frames, we introduce a task called frame tracking, which extends state tracking to a setting where several states are tracked simultaneously. We propose a baseline model for this task. We show that Frames can also be used to study memory in dialogue management and information presentation through natural language generation.
Submitted 13 April, 2017; v1 submitted 31 March, 2017;
originally announced April 2017.
-
Atomistic approach to alloy scattering in $Si_{1-x}Ge_{x}$
Authors:
Saumitra R Mehrotra,
Abhijeet Paul,
Gerhard Klimeck
Abstract:
SiGe alloy scattering is of significant importance with the introduction of strained layers and SiGe channels into CMOS technology. However, alloy scattering has till now been treated in an empirical fashion with a fitting parameter. We present a theoretical model within the atomistic tight-binding representation for treating alloy scattering in SiGe. This approach puts the scattering model on a solid atomistic footing with physical insights. The approach is shown to inherently capture the bulk alloy scattering potential parameters for both n-type and p-type carriers and matches experimental mobility data.
Submitted 1 March, 2011; v1 submitted 23 February, 2011;
originally announced February 2011.
-
Interface Trap Density Metrology of state-of-the-art undoped Si n-FinFETs
Authors:
Giuseppe Carlo Tettamanzi,
Abhijeet Paul,
Sunhee Lee,
Saumitra R. Mehrotra,
Nadine Collaert,
Serge Biesemans,
Gerhard Klimeck,
Sven Rogge
Abstract:
The presence of interface states at the MOS interface is a well-known cause of device degradation. This is particularly true for ultra-scaled FinFET geometries where the presence of a few traps can strongly influence device behavior. Typical methods for interface trap density (Dit) measurements are not performed on ultimate devices, but on custom designed structures. We present the first set of methods that allow direct estimation of Dit in state-of-the-art FinFETs, addressing a critical industry need.
Submitted 11 November, 2010;
originally announced November 2010.
-
Diversity Begets Stability in an Evolving Network
Authors:
Ravi Mehrotra,
Vikram Soni,
Sanjay Jain
Abstract:
Complex evolving systems such as the biosphere, ecosystems and societies exhibit sudden collapses, for reasons that are only partially understood. Here we study this phenomenon using a mathematical model of a system that evolves under Darwinian selection and exhibits the spontaneous growth, stasis and collapse of its structure. We find that the typical lifetime of the system increases sharply with the diversity of its components or species. We also find that the prime reason for crashes is a naturally occurring internal fragility of the system. This fragility is captured in the network organizational character and is related to a reduced multiplicity of pathways between its components. This work suggests new parameters for understanding the robustness of evolving molecular networks, ecosystems, societies, and markets.
Submitted 8 May, 2007;
originally announced May 2007.
-
Fast Algorithms For Josephson Junction Arrays: Bus-bars and Defects
Authors:
Sujay Datta,
Shantilal Das,
Deshdeep Sahdev,
Ravi Mehrotra,
Subodh R. Shenoy
Abstract:
We critically review the fast algorithms for the numerical study of two-dimensional Josephson junction arrays and develop the analogy of such systems with electrostatics. We extend these procedures to arrays with bus-bars and defects in the form of missing bonds. The role of boundaries and of the gauge choice in determining the Green's function of the system is clarified. The extension of the Green's function approach to other situations is also discussed.
Submitted 24 November, 1995; v1 submitted 23 November, 1995;
originally announced November 1995.