
Showing 1–50 of 66 results for author: Joachims, T

Searching in archive cs. Search in all archives.
  1. arXiv:2407.11199  [pdf, other]

    cs.CY

    Algorithms for College Admissions Decision Support: Impacts of Policy Change and Inherent Variability

    Authors: Jinsook Lee, Emma Harvey, Joyce Zhou, Nikhil Garg, Thorsten Joachims, Rene F. Kizilcec

    Abstract: Each year, selective American colleges sort through tens of thousands of applications to identify a first-year class that displays both academic merit and diversity. In the 2023-2024 admissions cycle, these colleges faced unprecedented challenges. First, the number of applications has been steadily growing. Second, test-optional policies that have remained in place since the COVID-19 pandemic limi…

    Submitted 24 June, 2024; originally announced July 2024.

    Comments: 25 pages, 8 figures

  2. arXiv:2404.16767  [pdf, other]

    cs.LG cs.CL cs.CV

    REBEL: Reinforcement Learning via Regressing Relative Rewards

    Authors: Zhaolin Gao, Jonathan D. Chang, Wenhao Zhan, Owen Oertell, Gokul Swamy, Kianté Brantley, Thorsten Joachims, J. Andrew Bagnell, Jason D. Lee, Wen Sun

    Abstract: While originally developed for continuous control problems, Proximal Policy Optimization (PPO) has emerged as the work-horse of a variety of reinforcement learning (RL) applications, including the fine-tuning of generative models. Unfortunately, PPO requires multiple heuristics to enable stable convergence (e.g. value networks, clipping), and is notorious for its sensitivity to the precise impleme…

    Submitted 29 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: New experimental results on general chat

  3. arXiv:2402.15623  [pdf, other]

    cs.CL cs.HC cs.IR cs.LG

    Language-Based User Profiles for Recommendation

    Authors: Joyce Zhou, Yijia Dai, Thorsten Joachims

    Abstract: Most conventional recommendation methods (e.g., matrix factorization) represent user profiles as high-dimensional vectors. Unfortunately, these vectors lack interpretability and steerability, and often perform poorly in cold-start settings. To address these shortcomings, we explore the use of user profiles that are represented as human-readable text. We propose the Language-based Factorization Mod…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: 8 pages (4 in appendix), 22 tables/figures (16 in appendix). Accepted to LLM-IGS@WSDM2024 workshop, now sharing this slightly updated revision version with workshop

  4. arXiv:2402.10886  [pdf, other]

    cs.CL

    Reviewer2: Optimizing Review Generation Through Prompt Generation

    Authors: Zhaolin Gao, Kianté Brantley, Thorsten Joachims

    Abstract: Recent developments in LLMs offer new opportunities for assisting authors in improving their work. In this paper, we envision a use case where authors can receive LLM-generated reviews that uncover weak points in the current draft. While initial methods for automated review generation already exist, these methods tend to produce reviews that lack detail, and they do not cover the range of opinions…

    Submitted 16 February, 2024; originally announced February 2024.

  5. arXiv:2402.06151  [pdf, other]

    stat.ML cs.LG

    POTEC: Off-Policy Learning for Large Action Spaces via Two-Stage Policy Decomposition

    Authors: Yuta Saito, Jihan Yao, Thorsten Joachims

    Abstract: We study off-policy learning (OPL) of contextual bandit policies in large discrete action spaces where existing methods -- most of which rely crucially on reward-regression models or importance-weighted policy gradients -- fail due to excessive bias or variance. To overcome these issues in OPL, we propose a novel two-stage algorithm, called Policy Optimization via Two-Stage Policy Decomposition (P…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2305.08062

  6. arXiv:2311.01828  [pdf, other]

    cs.IR

    Unbiased Offline Evaluation for Learning to Rank with Business Rules

    Authors: Matej Jakimov, Alexander Buchholz, Yannik Stein, Thorsten Joachims

    Abstract: For industrial learning-to-rank (LTR) systems, it is common that the output of a ranking model is modified, either as a result of post-processing logic that enforces business requirements, or as a result of unforeseen design flaws or bugs present in real-world production systems. This poses a challenge for deploying off-policy learning and evaluation methods, as these often rely on the assumption…

    Submitted 3 November, 2023; originally announced November 2023.

  7. arXiv:2310.17870  [pdf, other]

    cs.IR cs.AI cs.LG

    Ranking with Slot Constraints

    Authors: Wentao Guo, Andrew Wang, Bradon Thymes, Thorsten Joachims

    Abstract: We introduce the problem of ranking with slot constraints, which can be used to model a wide range of application problems -- from college admission with limited slots for different majors, to composing a stratified cohort of eligible participants in a medical trial. We show that the conventional Probability Ranking Principle (PRP) can be highly sub-optimal for slot-constrained ranking problems, a…

    Submitted 26 October, 2023; originally announced October 2023.

  8. arXiv:2309.08817  [pdf, other]

    cs.AI cs.HC

    GPT as a Baseline for Recommendation Explanation Texts

    Authors: Joyce Zhou, Thorsten Joachims

    Abstract: In this work, we establish a baseline potential for how modern model-generated text explanations of movie recommendations may help users, and explore which components of these text explanations users like or dislike, especially in contrast to existing human movie reviews. We found that participants gave no significantly different rankings between movies, nor did they give significant…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: 8 pages, 4 tables/figures. Accepted in current form to IntRS@RecSys2023 workshop. Intending on making noticeable in-place revisions on ArXiv for future submission, including potential title change

  9. arXiv:2309.01610  [pdf, other]

    cs.LG cs.CY cs.IR

    Fairness in Ranking under Disparate Uncertainty

    Authors: Richa Rastogi, Thorsten Joachims

    Abstract: Ranking is a ubiquitous method for focusing the attention of human evaluators on a manageable subset of options. Its use as part of human decision-making processes ranges from surfacing potentially relevant products on an e-commerce site to prioritizing college applications for human review. While ranking can make human evaluation more effective by focusing attention on the most promising options,…

    Submitted 11 July, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

    Comments: A version of this paper was accepted as Spotlight (Oral) at UAI workshop on Epistemic AI, 2023

  10. arXiv:2307.04923  [pdf, other]

    cs.IR

    Ranking with Long-Term Constraints

    Authors: Kianté Brantley, Zhichong Fang, Sarah Dean, Thorsten Joachims

    Abstract: The feedback that users provide through their choices (e.g., clicks, purchases) is one of the most common types of data readily available for training search and recommendation algorithms. However, myopically training systems based on choice data may only improve short-term engagement, but not the long-term sustainability of the platform and the long-term benefits to its users, content providers,…

    Submitted 7 January, 2024; v1 submitted 10 July, 2023; originally announced July 2023.

  11. arXiv:2306.17575  [pdf, other]

    cs.CL cs.AI cs.LG

    Augmenting Holistic Review in University Admission using Natural Language Processing for Essays and Recommendation Letters

    Authors: Jinsook Lee, Bradon Thymes, Joyce Zhou, Thorsten Joachims, Rene F. Kizilcec

    Abstract: University admission at many highly selective institutions uses a holistic review process, where all aspects of the application, including protected attributes (e.g., race, gender), grades, essays, and recommendation letters are considered, to compose an excellent and diverse class. In this study, we empirically evaluate how influential protected attributes are for predicting admission decisions u…

    Submitted 30 June, 2023; originally announced June 2023.

    Comments: EDI in EdTech R&D (Equity, Diversity, and Inclusion in Educational Technology Research and Development) workshop at AIED Tokyo 2023 Conference

  12. arXiv:2305.08062  [pdf, other]

    stat.ML cs.AI cs.LG

    Off-Policy Evaluation for Large Action Spaces via Conjunct Effect Modeling

    Authors: Yuta Saito, Qingyang Ren, Thorsten Joachims

    Abstract: We study off-policy evaluation (OPE) of contextual bandit policies for large discrete action spaces where conventional importance-weighting approaches suffer from excessive variance. To circumvent this variance issue, we propose a new estimator, called OffCEM, that is based on the conjunct effect model (CEM), a novel decomposition of the causal effect into a cluster effect and a residual effect. O…

    Submitted 2 June, 2023; v1 submitted 14 May, 2023; originally announced May 2023.

    Comments: accepted at ICML2023. arXiv admin note: text overlap with arXiv:2202.06317

  13. arXiv:2302.03610  [pdf, other]

    cs.CY

    Evaluating a Learned Admission-Prediction Model as a Replacement for Standardized Tests in College Admissions

    Authors: Hansol Lee, René F. Kizilcec, Thorsten Joachims

    Abstract: A growing number of college applications has presented an annual challenge for college admissions in the United States. Admission offices have historically relied on standardized test scores to organize large applicant pools into viable subsets for review. However, this approach may be subject to bias in test scores and selection bias in test-taking with recent trends toward test-optional admissio…

    Submitted 23 May, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: In Proceedings of the ACM Conference on Learning at Scale (L@S) 2023

  14. arXiv:2210.09512  [pdf, other]

    cs.LG cs.IR stat.CO

    Off-policy evaluation for learning-to-rank via interpolating the item-position model and the position-based model

    Authors: Alexander Buchholz, Ben London, Giuseppe di Benedetto, Thorsten Joachims

    Abstract: A critical need for industrial recommender systems is the ability to evaluate recommendation policies offline, before deploying them to production. Unfortunately, widely used off-policy evaluation methods either make strong assumptions about how users behave that can lead to excessive bias, or they make fewer assumptions and suffer from large variance. We tackle this problem by developing a new es…

    Submitted 15 October, 2022; originally announced October 2022.

    Comments: Presented at CONSEQUENCES workshop (Recsys '22) https://sites.google.com/view/consequences2022/contributions

  15. arXiv:2208.01148  [pdf, other]

    cs.LG cs.IR stat.ML

    Boosted Off-Policy Learning

    Authors: Ben London, Levi Lu, Ted Sandler, Thorsten Joachims

    Abstract: We propose the first boosting algorithm for off-policy learning from logged bandit feedback. Unlike existing boosting methods for supervised learning, our algorithm directly optimizes an estimate of the policy's expected reward. We analyze this algorithm and prove that the excess empirical risk decreases (possibly exponentially fast) with each round of boosting, provided a "weak" learning condit…

    Submitted 2 May, 2023; v1 submitted 1 August, 2022; originally announced August 2022.

    Comments: Final version as appeared in AISTATS 2023

  16. arXiv:2206.07247  [pdf, other]

    cs.IR cs.AI cs.LG

    Fair Ranking as Fair Division: Impact-Based Individual Fairness in Ranking

    Authors: Yuta Saito, Thorsten Joachims

    Abstract: Rankings have become the primary interface in two-sided online markets. Many have noted that the rankings not only affect the satisfaction of the users (e.g., customers, listeners, employers, travelers), but that the position in the ranking allocates exposure -- and thus economic opportunity -- to the ranked items (e.g., articles, products, songs, job seekers, restaurants, hotels). This has raised…

    Submitted 30 August, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: accepted at KDD2022, a few minor updates from the camera ready version

  17. arXiv:2205.15436  [pdf, other]

    cs.IR cs.LG

    Uncertainty Quantification for Fairness in Two-Stage Recommender Systems

    Authors: Lequn Wang, Thorsten Joachims

    Abstract: Many large-scale recommender systems consist of two stages. The first stage efficiently screens the complete pool of items for a small subset of promising candidates, from which the second-stage model curates the final recommendations. In this paper, we investigate how to ensure group fairness to the items in this two-stage architecture. In particular, we find that existing first-stage recommender…

    Submitted 24 February, 2023; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: ACM Conference on Web Search and Data Mining (WSDM), 2023

  18. arXiv:2204.10936  [pdf, other]

    cs.IR cs.LG stat.ML

    Counterfactual Learning To Rank for Utility-Maximizing Query Autocompletion

    Authors: Adam Block, Rahul Kidambi, Daniel N. Hill, Thorsten Joachims, Inderjit S. Dhillon

    Abstract: Conventional methods for query autocompletion aim to predict which completed query a user will select from a list. A shortcoming of this approach is that users often do not know which query will provide the best retrieval performance on the current information retrieval system, meaning that any query autocompletion methods trained to mimic user behavior can lead to suboptimal query suggestions. To…

    Submitted 22 April, 2022; originally announced April 2022.

  19. arXiv:2202.06317  [pdf, other]

    cs.LG cs.AI stat.ML

    Off-Policy Evaluation for Large Action Spaces via Embeddings

    Authors: Yuta Saito, Thorsten Joachims

    Abstract: Off-policy evaluation (OPE) in contextual bandits has seen rapid adoption in real-world systems, since it enables offline evaluation of new policies using only historic log data. Unfortunately, when the number of actions is large, existing OPE estimators -- most of which are based on inverse propensity score weighting -- degrade severely and can suffer from extreme bias and variance. This foils th…

    Submitted 15 June, 2022; v1 submitted 13 February, 2022; originally announced February 2022.

    Comments: accepted at ICML2022

  20. arXiv:2202.01721  [pdf, other]

    cs.LG cs.IR

    Variance-Optimal Augmentation Logging for Counterfactual Evaluation in Contextual Bandits

    Authors: Aaron David Tucker, Thorsten Joachims

    Abstract: Methods for offline A/B testing and counterfactual learning are seeing rapid adoption in search and recommender systems, since they allow efficient reuse of existing log data. However, there are fundamental limits to using existing log data alone, since the counterfactual estimators that are commonly used in these methods can have large bias and large variance when the logging policy is very diffe…

    Submitted 3 February, 2022; originally announced February 2022.

    Comments: 19 pages, 4 figures, in submission

  21. arXiv:2202.01147  [pdf, other]

    cs.LG cs.CY stat.ML

    Improving Screening Processes via Calibrated Subset Selection

    Authors: Lequn Wang, Thorsten Joachims, Manuel Gomez Rodriguez

    Abstract: Many selection processes such as finding patients qualifying for a medical trial or retrieval pipelines in search engines consist of multiple stages, where an initial screening stage focuses the resources on shortlisting the most promising candidates. In this paper, we investigate what guarantees a screening classifier can provide, independently of whether it is constructed manually or trained. We…

    Submitted 12 June, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

    Comments: International Conference on Machine Learning (ICML) 2022

  22. arXiv:2107.06720  [pdf, other]

    cs.LG cs.CY cs.IR

    Fairness in Ranking under Uncertainty

    Authors: Ashudeep Singh, David Kempe, Thorsten Joachims

    Abstract: Fairness has emerged as an important consideration in algorithmic decision-making. Unfairness occurs when an agent with higher merit obtains a worse outcome than an agent with lower merit. Our central point is that a primary cause of unfairness is uncertainty. A principal or algorithm making decisions never has access to the agents' true merit, and instead uses proxy features that only imperfectly…

    Submitted 10 November, 2021; v1 submitted 14 July, 2021; originally announced July 2021.

    Comments: Full version of the paper published at Neural Information Processing Systems (NeurIPS) 2021

  23. arXiv:2106.01941  [pdf, other]

    cs.IR

    Optimizing Rankings for Recommendation in Matching Markets

    Authors: Yi Su, Magd Bayoumi, Thorsten Joachims

    Abstract: Based on the success of recommender systems in e-commerce, there is growing interest in their use in matching markets (e.g., labor). While this holds potential for improving market fluidity and fairness, we show in this paper that naively applying existing recommender systems to matching markets is sub-optimal. Considering the standard process where candidates apply and then get evaluated by emplo…

    Submitted 3 June, 2021; originally announced June 2021.

  24. arXiv:2103.02735  [pdf, other]

    cs.LG cs.IR

    Fairness of Exposure in Stochastic Bandits

    Authors: Lequn Wang, Yiwei Bai, Wen Sun, Thorsten Joachims

    Abstract: Contextual bandit algorithms have become widely used for recommendation in online systems (e.g. marketplaces, music streaming, news), where they now wield substantial influence on which items get exposed to the users. This raises questions of fairness to the items -- and to the sellers, artists, and writers that benefit from this exposure. We argue that the conventional bandit formulation can lead…

    Submitted 12 September, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

    Comments: 29 pages, 10 figures, ICML 2021

  25. User Fairness, Item Fairness, and Diversity for Rankings in Two-Sided Markets

    Authors: Lequn Wang, Thorsten Joachims

    Abstract: Ranking items by their probability of relevance has long been the goal of conventional ranking systems. While this maximizes traditional criteria of ranking performance, there is a growing understanding that it is an oversimplification in online platforms that serve not only a diverse user population, but also the producers of the items. In particular, ranking algorithms are expected to be fair in…

    Submitted 12 September, 2021; v1 submitted 3 October, 2020; originally announced October 2020.

    Comments: 19 pages, 3 figures, ICTIR 2021

    MSC Class: H.3.3

  26. arXiv:2008.04938  [pdf, ps, other]

    cs.LG cs.SD eess.AS

    Content-based Music Similarity with Triplet Networks

    Authors: Joseph Cleveland, Derek Cheng, Michael Zhou, Thorsten Joachims, Douglas Turnbull

    Abstract: We explore the feasibility of using triplet neural networks to embed songs based on content-based music similarity. Our network is trained using triplets of songs such that two songs by the same artist are embedded closer to one another than to a third song by a different artist. We compare two models that are trained using different ways of picking this third song: at random vs. based on shared g…

    Submitted 6 December, 2022; v1 submitted 11 August, 2020; originally announced August 2020.

  27. arXiv:2006.09438  [pdf, other]

    cs.LG cs.IR stat.ML

    Off-policy Bandits with Deficient Support

    Authors: Noveen Sachdeva, Yi Su, Thorsten Joachims

    Abstract: Learning effective contextual-bandit policies from past actions of a deployed system is highly desirable in many settings (e.g. voice assistants, recommendation, search), since it enables the reuse of large amounts of log data. State-of-the-art methods for such off-policy learning, however, are based on inverse propensity score (IPS) weighting. A key theoretical requirement of IPS weighting is tha…

    Submitted 16 June, 2020; originally announced June 2020.

    Comments: 11 pages, 6 figures. Accepted for publication at KDD '20 (Research track)

  28. arXiv:2005.14713  [pdf, other]

    cs.IR cs.CY stat.ML

    Controlling Fairness and Bias in Dynamic Learning-to-Rank

    Authors: Marco Morik, Ashudeep Singh, Jessica Hong, Thorsten Joachims

    Abstract: Rankings are the primary interface through which many online platforms match users to items (e.g. news, products, music, video). In these two-sided markets, not only the users draw utility from the rankings, but the rankings also determine the utility (e.g. exposure, revenue) for the item providers (e.g. publishers, sellers, artists, studios). It has already been noted that myopically optimizing u…

    Submitted 29 May, 2020; originally announced May 2020.

    Comments: First two authors contributed equally. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval 2020

  29. arXiv:2005.05951  [pdf, other]

    cs.LG cs.AI stat.ML

    MOReL: Model-Based Offline Reinforcement Learning

    Authors: Rahul Kidambi, Aravind Rajeswaran, Praneeth Netrapalli, Thorsten Joachims

    Abstract: In offline reinforcement learning (RL), the goal is to learn a highly rewarding policy based solely on a dataset of historical interactions with the environment. The ability to train RL policies offline can greatly expand the applicability of RL, its data efficiency, and its experimental velocity. Prior work in offline RL has been confined almost exclusively to model-free RL approaches. In this wo…

    Submitted 1 March, 2021; v1 submitted 12 May, 2020; originally announced May 2020.

    Comments: First two authors contributed equally. Published at NeurIPS 2020. After publication at NeurIPS 2020, (1) D4RL benchmark results have been added; (2) hyper-parameter ablation studies have been added; (3) scope of Lemma 3 has been extended

  30. arXiv:1911.08054  [pdf, other]

    cs.LG cs.CY cs.IR stat.ML

    Policy-Gradient Training of Fair and Unbiased Ranking Functions

    Authors: Himank Yadav, Zhengxiao Du, Thorsten Joachims

    Abstract: While implicit feedback (e.g., clicks, dwell times, etc.) is an abundant and attractive source of data for learning to rank, it can produce unfair ranking policies for both exogenous and endogenous reasons. Exogenous reasons typically manifest themselves as biases in the training data, which then get reflected in the learned ranking policy and often lead to rich-get-richer dynamics. Moreover, even…

    Submitted 9 May, 2021; v1 submitted 18 November, 2019; originally announced November 2019.

  31. arXiv:1902.04056  [pdf, other]

    cs.LG cs.CY cs.IR stat.ML

    Policy Learning for Fairness in Ranking

    Authors: Ashudeep Singh, Thorsten Joachims

    Abstract: Conventional Learning-to-Rank (LTR) methods optimize the utility of the rankings to the users, but they are oblivious to their impact on the ranked items. However, there has been a growing understanding that the latter is important to consider for a wide range of ranking applications (e.g. online marketplaces, job placement, admissions). To address this need, we propose a general LTR framework tha…

    Submitted 26 June, 2019; v1 submitted 11 February, 2019; originally announced February 2019.

  32. Estimating Position Bias without Intrusive Interventions

    Authors: Aman Agarwal, Ivan Zaitsev, Xuanhui Wang, Cheng Li, Marc Najork, Thorsten Joachims

    Abstract: Presentation bias is one of the key challenges when learning from implicit feedback in search engines, as it confounds the relevance signal. While it was recently shown how counterfactual learning-to-rank (LTR) approaches (Joachims et al., 2017) can provably overcome presentation bias when observation propensities are known, it remains to show how to effectively estimate these propensities. In th…

    Submitted 12 December, 2018; originally announced December 2018.

  33. arXiv:1811.02672  [pdf, other]

    cs.LG stat.ML

    CAB: Continuous Adaptive Blending Estimator for Policy Evaluation and Learning

    Authors: Yi Su, Lequn Wang, Michele Santacatterina, Thorsten Joachims

    Abstract: The ability to perform offline A/B-testing and off-policy learning using logged contextual bandit feedback is highly desirable in a broad range of applications, including recommender systems, search engines, ad placement, and personalized health care. Both offline A/B-testing and off-policy learning require a counterfactual estimator that evaluates how some new policy would have performed, if it h…

    Submitted 28 August, 2019; v1 submitted 6 November, 2018; originally announced November 2018.

  34. Intervention Harvesting for Context-Dependent Examination-Bias Estimation

    Authors: Zhichong Fang, Aman Agarwal, Thorsten Joachims

    Abstract: Accurate estimates of examination bias are crucial for unbiased learning-to-rank from implicit feedback in search engines and recommender systems, since they enable the use of Inverse Propensity Score (IPS) weighting techniques to address selection biases and missing data. Unfortunately, existing examination-bias estimators are limited to the Position-Based Model (PBM), where the examination bias…

    Submitted 24 May, 2019; v1 submitted 5 November, 2018; originally announced November 2018.

  35. arXiv:1806.03555  [pdf, other]

    cs.LG cs.IR stat.ML

    Consistent Position Bias Estimation without Online Interventions for Learning-to-Rank

    Authors: Aman Agarwal, Ivan Zaitsev, Thorsten Joachims

    Abstract: Presentation bias is one of the key challenges when learning from implicit feedback in search engines, as it confounds the relevance signal with uninformative signals due to position in the ranking, saliency, and other presentation factors. While it was recently shown how counterfactual learning-to-rank (LTR) approaches (Joachims et al., 2017) can provably overcome presentation bias if observatio…

    Submitted 9 June, 2018; originally announced June 2018.

  36. arXiv:1805.00065  [pdf, other]

    cs.IR cs.LG

    A General Framework for Counterfactual Learning-to-Rank

    Authors: Aman Agarwal, Kenta Takatsu, Ivan Zaitsev, Thorsten Joachims

    Abstract: Implicit feedback (e.g., click, dwell time) is an attractive source of training data for Learning-to-Rank, but its naive use leads to learning results that are distorted by presentation bias. For the special case of optimizing average rank for linear ranking functions, however, the recently developed SVM-PropRank method has shown that counterfactual inference techniques can be used to provably ove…

    Submitted 27 August, 2019; v1 submitted 30 April, 2018; originally announced May 2018.

  37. arXiv:1802.07578  [pdf, other]

    cs.HC cs.IR

    Improving Recommender Systems Beyond the Algorithm

    Authors: Tobias Schnabel, Paul N. Bennett, Thorsten Joachims

    Abstract: Recommender systems rely heavily on the predictive accuracy of the learning algorithm. Most work on improving accuracy has focused on the learning algorithm itself. We argue that this algorithmic focus is myopic. In particular, since learning algorithms generally improve with more and better data, we propose shaping the feedback generation process as an alternate and complementary route to improvi…

    Submitted 21 February, 2018; originally announced February 2018.

  38. Fairness of Exposure in Rankings

    Authors: Ashudeep Singh, Thorsten Joachims

    Abstract: Rankings are ubiquitous in the online world today. As we have transitioned from finding books in libraries to ranking products, jobs, job applicants, opinions and potential romantic partners, there is a substantial precedent that ranking systems have a responsibility not only to their users but also to the items being ranked. To address these often conflicting responsibilities, we propose a concep…

    Submitted 17 October, 2018; v1 submitted 20 February, 2018; originally announced February 2018.

    Comments: In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, UK, 2018

  39. arXiv:1706.08184  [pdf, other]

    cs.SI cs.IR

    A preference elicitation interface for collecting dense recommender datasets with rich user information

    Authors: Pantelis P. Analytis, Tobias Schnabel, Stefan Herzog, Daniel Barkoczi, Thorsten Joachims

    Abstract: We present an interface that can be leveraged to quickly and effortlessly elicit people's preferences for visual stimuli, such as photographs, visual art and screensavers, along with rich side-information about its users. We plan to employ the new interface to collect dense recommender datasets that will complement existing sparse industry-scale datasets. The new interface and the collected datase…

    Submitted 26 June, 2017; v1 submitted 25 June, 2017; originally announced June 2017.

    Comments: 2 pages

  40. arXiv:1704.01213  [pdf, other]

    cs.IR

    Ranking with social cues: Integrating online review scores and popularity information

    Authors: Pantelis P. Analytis, Alexia Delfino, Juliane Kämmer, Mehdi Moussaïd, Thorsten Joachims

    Abstract: Online marketplaces, search engines, and databases employ aggregated social information to rank their content for users. Two ranking heuristics commonly implemented to order the available options are the average review score and item popularity, that is, the number of users who have experienced an item. These rules, although easy to implement, only partly reflect actual user preferences, as people…

    Submitted 23 June, 2017; v1 submitted 4 April, 2017; originally announced April 2017.

    Comments: 4 pages, 3 figures, ICWSM

  41. Effective Evaluation using Logged Bandit Feedback from Multiple Loggers

    Authors: Aman Agarwal, Soumya Basu, Tobias Schnabel, Thorsten Joachims

    Abstract: Accurately evaluating new policies (e.g. ad-placement models, ranking functions, recommendation functions) is one of the key prerequisites for improving interactive systems. While the conventional approach to evaluation relies on online A/B tests, recent work has shown that counterfactual estimators can provide an inexpensive and fast alternative, since they can be applied offline using log data t…

    Submitted 26 June, 2017; v1 submitted 17 March, 2017; originally announced March 2017.

    Comments: KDD 2018

  42. arXiv:1612.00367  [pdf, other]

    cs.LG cs.AI stat.ML

    Large-scale Validation of Counterfactual Learning Methods: A Test-Bed

    Authors: Damien Lefortier, Adith Swaminathan, Xiaotao Gu, Thorsten Joachims, Maarten de Rijke

    Abstract: The ability to perform effective off-policy learning would revolutionize the process of building better interactive systems, such as search engines and recommendation systems for e-commerce, computational advertising and news. Recent approaches for off-policy evaluation and learning in these settings appear promising. With this paper, we provide real-world data and a standardized test-bed to syste…

    Submitted 25 June, 2017; v1 submitted 1 December, 2016; originally announced December 2016.

    Comments: 10 pages, What If workshop NIPS 2016

  43. arXiv:1608.04468  [pdf, other]

    cs.IR cs.LG

    Unbiased Learning-to-Rank with Biased Feedback

    Authors: Thorsten Joachims, Adith Swaminathan, Tobias Schnabel

    Abstract: Implicit feedback (e.g., clicks, dwell times, etc.) is an abundant source of data in human-interactive systems. While implicit feedback has many advantages (e.g., it is inexpensive to collect, user centric, and timely), its inherent biases are a key obstacle to its effective use. For example, position bias in search rankings strongly influences how many clicks a result receives, so that directly u…

    Submitted 15 August, 2016; originally announced August 2016.

  44. arXiv:1604.07209  [pdf, ps, other]

    cs.IR cs.LG

    Unbiased Comparative Evaluation of Ranking Functions

    Authors: Tobias Schnabel, Adith Swaminathan, Peter Frazier, Thorsten Joachims

    Abstract: Eliciting relevance judgments for ranking evaluation is labor-intensive and costly, motivating careful selection of which documents to judge. Unlike traditional approaches that make this selection deterministically, probabilistic sampling has shown intriguing promise since it enables the design of estimators that are provably unbiased even when reusing data with missing judgments. In this paper, w…

    Submitted 25 April, 2016; originally announced April 2016.

    Comments: Under review; 10 pages

  45. Unbounded Human Learning: Optimal Scheduling for Spaced Repetition

    Authors: Siddharth Reddy, Igor Labutov, Siddhartha Banerjee, Thorsten Joachims

    Abstract: In the study of human learning, there is broad evidence that our ability to retain information improves with repeated exposure and decays with delay since last exposure. This plays a crucial role in the design of educational software, leading to a trade-off between teaching new material and reviewing what has already been taught. A common way to balance this trade-off is spaced repetition, which u…

    Submitted 7 June, 2016; v1 submitted 22 February, 2016; originally announced February 2016.

    Comments: Accepted to the ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2016

  46. arXiv:1602.07029  [pdf, other]

    cs.LG cs.AI cs.CY

    Latent Skill Embedding for Personalized Lesson Sequence Recommendation

    Authors: Siddharth Reddy, Igor Labutov, Thorsten Joachims

    Abstract: Students in online courses generate large amounts of data that can be used to personalize the learning process and improve quality of education. In this paper, we present the Latent Skill Embedding (LSE), a probabilistic model of students and educational content that can be used to recommend personalized sequences of lessons with the goal of helping students prepare for specific assessments. Akin…

    Submitted 22 February, 2016; originally announced February 2016.

    Comments: Under review by the ACM SIGKDD Conference on Knowledge Discovery and Data Mining

  47. arXiv:1602.05352  [pdf, other]

    cs.LG cs.AI cs.IR

    Recommendations as Treatments: Debiasing Learning and Evaluation

    Authors: Tobias Schnabel, Adith Swaminathan, Ashudeep Singh, Navin Chandak, Thorsten Joachims

    Abstract: Most data for evaluating and training recommender systems is subject to selection biases, either through self-selection by the users or through the actions of the recommendation system itself. In this paper, we provide a principled approach to handling selection biases, adapting models and estimation techniques from causal inference. The approach leads to unbiased performance estimators despite bi…

    Submitted 26 May, 2016; v1 submitted 17 February, 2016; originally announced February 2016.

    Comments: 10 pages in ICML 2016

  48. arXiv:1601.00741  [pdf, other]

    cs.RO cs.AI cs.LG

    Learning Preferences for Manipulation Tasks from Online Coactive Feedback

    Authors: Ashesh Jain, Shikhar Sharma, Thorsten Joachims, Ashutosh Saxena

    Abstract: We consider the problem of learning preferences over trajectories for mobile manipulators such as personal robots and assembly line robots. The preferences we learn are more intricate than simple geometric constraints on trajectories; they are rather governed by the surrounding context of various objects and human interactions in the environment. We propose a coactive online learning framework for…

    Submitted 5 January, 2016; originally announced January 2016.

    Comments: IJRR accepted (Learning preferences over trajectories from coactive feedback)

  49. arXiv:1510.07545  [pdf, other]

    cs.HC cs.IR cs.LG

    Using Shortlists to Support Decision Making and Improve Recommender System Performance

    Authors: Tobias Schnabel, Paul N. Bennett, Susan T. Dumais, Thorsten Joachims

    Abstract: In this paper, we study shortlists as an interface component for recommender systems with the dual goal of supporting the user's decision process, as well as improving implicit feedback elicitation for increased recommendation quality. A shortlist is a temporary list of candidates that the user is currently considering, e.g., a list of a few movies the user is currently considering for viewing. Fr…

    Submitted 8 February, 2016; v1 submitted 26 October, 2015; originally announced October 2015.

    Comments: 11 pages in WWW 2016

  50. arXiv:1502.02362  [pdf, ps, other]

    cs.LG stat.ML

    Counterfactual Risk Minimization: Learning from Logged Bandit Feedback

    Authors: Adith Swaminathan, Thorsten Joachims

    Abstract: We develop a learning principle and an efficient algorithm for batch learning from logged bandit feedback. This learning setting is ubiquitous in online systems (e.g., ad placement, web search, recommendation), where an algorithm makes a prediction (e.g., ad ranking) for a given input (e.g., query) and observes bandit feedback (e.g., user clicks on presented ads). We first address the counterfactu…

    Submitted 20 May, 2015; v1 submitted 9 February, 2015; originally announced February 2015.

    Comments: 10 pages