Showing 1–22 of 22 results for author: Vashishth, S

  1. arXiv:2404.04530  [pdf, other]

    cs.CL

    A Morphology-Based Investigation of Positional Encodings

    Authors: Poulami Ghosh, Shikhar Vashishth, Raj Dabre, Pushpak Bhattacharyya

    Abstract: Contemporary deep learning models effectively handle languages with diverse morphology despite not being directly integrated into them. Morphology and word order are closely linked, with the latter incorporated into transformer-based models through positional encodings. This prompts a fundamental inquiry: Is there a correlation between the morphological complexity of a language and the utilization…

    Submitted 30 May, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

    Comments: Work in Progress

  2. arXiv:2401.02412  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    LLM Augmented LLMs: Expanding Capabilities through Composition

    Authors: Rachit Bansal, Bidisha Samanta, Siddharth Dalmia, Nitish Gupta, Shikhar Vashishth, Sriram Ganapathy, Abhishek Bapna, Prateek Jain, Partha Talukdar

    Abstract: Foundational models with billions of parameters which have been trained on large corpora of data have demonstrated non-trivial skills in a variety of domains. However, due to their monolithic structure, it is challenging and expensive to augment them or impart new skills. On the other hand, due to their adaptation abilities, several new instances of these models are being trained towards new domai…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: 17 pages, 2 figures, 8 tables

  3. arXiv:2311.00913  [pdf, other]

    cs.CL

    Self-Influence Guided Data Reweighting for Language Model Pre-training

    Authors: Megh Thakkar, Tolga Bolukbasi, Sriram Ganapathy, Shikhar Vashishth, Sarath Chandar, Partha Talukdar

    Abstract: Language Models (LMs) pre-trained with self-supervision on large text corpora have become the default starting point for developing models for various NLP tasks. Once the pre-training corpus has been assembled, all data samples in the corpus are treated with equal importance during LM pre-training. However, due to varying levels of relevance and quality of data, equal importance to all the data sa…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: Accepted to EMNLP 2023

  4. arXiv:2309.10567  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Multimodal Modeling For Spoken Language Identification

    Authors: Shikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, Vera Axelrod, Siddharth Dalmia, Wei Han, Yu Zhang, Daan van Esch, Sandy Ritchie, Partha Talukdar, Jason Riesa

    Abstract: Spoken language identification refers to the task of automatically predicting the spoken language in a given utterance. Conventionally, it is modeled as a speech-based language identification task. Prior techniques have been constrained to a single modality; however in the case of video data there is a wealth of other metadata that may be beneficial for this task. In this work, we propose MuSeLI,…

    Submitted 19 September, 2023; originally announced September 2023.

  5. arXiv:2307.10982  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    MASR: Multi-label Aware Speech Representation

    Authors: Anjali Raj, Shikhar Bharadwaj, Sriram Ganapathy, Min Ma, Shikhar Vashishth

    Abstract: In the recent years, speech representation learning is constructed primarily as a self-supervised learning (SSL) task, using the raw audio signal alone, while ignoring the side-information that is often available for a given speech recording. In this paper, we propose MASR, a Multi-label Aware Speech Representation learning framework, which addresses the aforementioned limitations. MASR enables th…

    Submitted 25 September, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: Accepted at ASRU 2023

  6. arXiv:2306.04374  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Label Aware Speech Representation Learning For Language Identification

    Authors: Shikhar Vashishth, Shikhar Bharadwaj, Sriram Ganapathy, Ankur Bapna, Min Ma, Wei Han, Vera Axelrod, Partha Talukdar

    Abstract: Speech representation learning approaches for non-semantic tasks such as language recognition have either explored supervised embedding extraction methods using a classifier model or self-supervised representation learning approaches using raw data. In this paper, we propose a novel framework of combining self-supervised representation learning with the language label information for the pre-train…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: Accepted at Interspeech 2023

  7. arXiv:2112.07887  [pdf, other]

    cs.CL

    Knowledge-Rich Self-Supervision for Biomedical Entity Linking

    Authors: Sheng Zhang, Hao Cheng, Shikhar Vashishth, Cliff Wong, Jinfeng Xiao, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon

    Abstract: Entity linking faces significant challenges such as prolific variations and prevalent ambiguities, especially in high-value domains with myriad entities. Standard classification approaches suffer from the annotation bottleneck and cannot effectively handle unseen entities. Zero-shot entity linking has emerged as a promising direction for generalizing to new entities, but it still requires example…

    Submitted 23 May, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

  8. arXiv:2106.06555  [pdf, other]

    cs.LG

    Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network

    Authors: Justin Lovelace, Denis Newman-Griffis, Shikhar Vashishth, Jill Fain Lehman, Carolyn Penstein Rosé

    Abstract: Knowledge Graph (KG) completion research usually focuses on densely connected benchmark datasets that are not representative of real KGs. We curate two KG datasets that include biomedical and encyclopedic knowledge and use an existing commonsense KG dataset to explore KG completion in the more realistic setting where dense connectivity is not guaranteed. We develop a deep convolutional network tha…

    Submitted 11 June, 2021; originally announced June 2021.

    Comments: The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021)

  9. arXiv:2106.00920  [pdf, other]

    cs.CL cs.AI cs.LG

    DialoGraph: Incorporating Interpretable Strategy-Graph Networks into Negotiation Dialogues

    Authors: Rishabh Joshi, Vidhisha Balachandran, Shikhar Vashishth, Alan Black, Yulia Tsvetkov

    Abstract: To successfully negotiate a deal, it is not enough to communicate fluently: pragmatic planning of persuasive negotiation strategies is essential. While modern dialogue agents excel at generating fluent sentences, they still lack pragmatic grounding and cannot reason strategically. We present DialoGraph, a negotiation system that incorporates pragmatic strategies in a negotiation dialogue using gra…

    Submitted 1 June, 2021; originally announced June 2021.

    Comments: Accepted at ICLR 2021; https://openreview.net/forum?id=kDnal_bbb-E

  10. arXiv:2010.02246  [pdf, other]

    cs.CL cs.LG

    MedFilter: Improving Extraction of Task-relevant Utterances from Doctor-Patient Conversations through Integration of Discourse Structure and Ontological Knowledge

    Authors: Sopan Khosla, Shikhar Vashishth, Jill Fain Lehman, Carolyn Rose

    Abstract: Information extraction from conversational data is particularly challenging because the task-centric nature of conversation allows for effective communication of implicit information by humans, but is challenging for machines. The challenges may differ between utterances depending on the role of the speaker within the conversation, especially when relevant expertise is distributed asymmetrically a…

    Submitted 21 June, 2022; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: Accepted as Long Paper to EMNLP 2020

  11. Improving Broad-Coverage Medical Entity Linking with Semantic Type Prediction and Large-Scale Datasets

    Authors: Shikhar Vashishth, Denis Newman-Griffis, Rishabh Joshi, Ritam Dutt, Carolyn Rose

    Abstract: Medical entity linking is the task of identifying and standardizing medical concepts referred to in an unstructured text. Most of the existing methods adopt a three-step approach of (1) detecting mentions, (2) generating a list of candidate concepts, and finally (3) picking the best concept among them. In this paper, we probe into alleviating the problem of overgeneration of candidate concepts in…

    Submitted 22 August, 2021; v1 submitted 1 May, 2020; originally announced May 2020.

    Comments: 44 pages

    Journal ref: Journal of Biomedical Informatics 2021

  12. arXiv:1911.03903  [pdf, other]

    cs.CL

    A Re-evaluation of Knowledge Graph Completion Methods

    Authors: Zhiqing Sun, Shikhar Vashishth, Soumya Sanyal, Partha Talukdar, Yiming Yang

    Abstract: Knowledge Graph Completion (KGC) aims at automatically predicting missing links for large-scale knowledge graphs. A vast number of state-of-the-art KGC techniques have got published at top conferences in several research fields, including data mining, machine learning, and natural language processing. However, we notice that several recent papers report very high performance, which largely outperf…

    Submitted 8 July, 2020; v1 submitted 10 November, 2019; originally announced November 2019.

    Comments: Accepted at ACL 2020

  13. arXiv:1911.03082  [pdf, other]

    cs.LG stat.ML

    Composition-based Multi-Relational Graph Convolutional Networks

    Authors: Shikhar Vashishth, Soumya Sanyal, Vikram Nitin, Partha Talukdar

    Abstract: Graph Convolutional Networks (GCNs) have recently been shown to be quite successful in modeling graph-structured data. However, the primary focus has been on handling simple undirected graphs. Multi-relational graphs are a more general and prevalent form of graphs where each edge has a label and direction associated with it. Most of the existing approaches to handle such graphs suffer from over-pa…

    Submitted 18 January, 2020; v1 submitted 8 November, 2019; originally announced November 2019.

    Comments: In Proceedings of ICLR 2020

  14. arXiv:1911.03042  [pdf, other]

    cs.CL

    Neural Graph Embedding Methods for Natural Language Processing

    Authors: Shikhar Vashishth

    Abstract: Knowledge graphs are structured representations of facts in a graph, where nodes represent entities and edges represent relationships between them. Recent research has resulted in the development of several large KGs. However, all of them tend to be sparse with very few facts per entity. In the first part of the thesis, we propose two solutions to alleviate this problem: (1) KG Canonicalization, i…

    Submitted 7 April, 2020; v1 submitted 7 November, 2019; originally announced November 2019.

    Comments: 168 pages, PhD thesis (2019)

  15. arXiv:1911.00219  [pdf, other]

    cs.LG stat.ML

    InteractE: Improving Convolution-based Knowledge Graph Embeddings by Increasing Feature Interactions

    Authors: Shikhar Vashishth, Soumya Sanyal, Vikram Nitin, Nilesh Agrawal, Partha Talukdar

    Abstract: Most existing knowledge graphs suffer from incompleteness, which can be alleviated by inferring missing links based on known facts. One popular way to accomplish this is to generate low-dimensional embeddings of entities and relations, and use these to make inferences. ConvE, a recently proposed approach, applies convolutional filters on 2D reshapings of entity and relation embeddings in order to…

    Submitted 24 September, 2020; v1 submitted 1 November, 2019; originally announced November 2019.

    Comments: Accepted at AAAI 2020

  16. arXiv:1909.11218  [pdf, other]

    cs.CL cs.LG

    Attention Interpretability Across NLP Tasks

    Authors: Shikhar Vashishth, Shyam Upadhyay, Gaurav Singh Tomar, Manaal Faruqui

    Abstract: The attention layer in a neural network model provides insights into the model's reasoning behind its prediction, which are usually criticized for being opaque. Recently, seemingly contradictory viewpoints have emerged about the interpretability of attention weights (Jain & Wallace, 2019; Vig & Belinkov, 2019). Amid such confusion arises the need to understand attention mechanism more systematical…

    Submitted 24 September, 2019; originally announced September 2019.

    Report number: 2019

  17. arXiv:1902.00175  [pdf, other]

    cs.CL cs.AI cs.LG

    Dating Documents using Graph Convolution Networks

    Authors: Shikhar Vashishth, Shib Sankar Dasgupta, Swayambhu Nath Ray, Partha Talukdar

    Abstract: Document date is essential for many important tasks, such as document retrieval, summarization, event detection, etc. While existing approaches for these tasks assume accurate knowledge of the document date, this is not always available, especially for arbitrary documents from the Web. Document Dating is a challenging problem which requires inference over the temporal structure of the document. Pr…

    Submitted 31 January, 2019; originally announced February 2019.

    Comments: Accepted at ACL 2018

    Journal ref: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics 2018

  18. CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information

    Authors: Shikhar Vashishth, Prince Jain, Partha Talukdar

    Abstract: Open Information Extraction (OpenIE) methods extract (noun phrase, relation phrase, noun phrase) triples from text, resulting in the construction of large Open Knowledge Bases (Open KBs). The noun phrases (NPs) and relation phrases in such Open KBs are not canonicalized, leading to the storage of redundant and ambiguous facts. Recent research has posed canonicalization of Open KBs as clustering ov…

    Submitted 31 January, 2019; originally announced February 2019.

    Comments: Accepted at WWW 2018

    Journal ref: International World Wide Web Conferences Steering Committee 2018

  19. arXiv:1901.08255  [pdf, other]

    cs.LG cs.SI stat.ML

    Confidence-based Graph Convolutional Networks for Semi-Supervised Learning

    Authors: Shikhar Vashishth, Prateek Yadav, Manik Bhandari, Partha Talukdar

    Abstract: Predicting properties of nodes in a graph is an important problem with applications in a variety of domains. Graph-based Semi-Supervised Learning (SSL) methods aim to address this problem by labeling a small subset of the nodes as seeds and then utilizing the graph structure to predict label scores for the rest of the nodes in the graph. Recently, Graph Convolutional Networks (GCNs) have achieved…

    Submitted 11 February, 2019; v1 submitted 24 January, 2019; originally announced January 2019.

    Comments: Accepted at AISTATS 2019

  20. arXiv:1812.04361  [pdf, other]

    cs.CL

    RESIDE: Improving Distantly-Supervised Neural Relation Extraction using Side Information

    Authors: Shikhar Vashishth, Rishabh Joshi, Sai Suman Prayaga, Chiranjib Bhattacharyya, Partha Talukdar

    Abstract: Distantly-supervised Relation Extraction (RE) methods train an extractor by automatically aligning relation instances in a Knowledge Base (KB) with unstructured text. In addition to relation instances, KBs often contain other relevant side information, such as aliases of relations (e.g., founded and co-founded are aliases for the relation founderOfCompany). RE models usually ignore such readily av…

    Submitted 11 February, 2019; v1 submitted 11 December, 2018; originally announced December 2018.

    Comments: 10 pages, 6 figures, EMNLP 2018

    Journal ref: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

  21. arXiv:1809.04283  [pdf, other]

    cs.CL cs.LG

    Incorporating Syntactic and Semantic Information in Word Embeddings using Graph Convolutional Networks

    Authors: Shikhar Vashishth, Manik Bhandari, Prateek Yadav, Piyush Rai, Chiranjib Bhattacharyya, Partha Talukdar

    Abstract: Word embeddings have been widely adopted across several NLP applications. Most existing word embedding methods utilize sequential context of a word to learn its embedding. While there have been some attempts at utilizing syntactic context of a word, such methods result in an explosion of the vocabulary size. In this paper, we overcome this problem by proposing SynGCN, a flexible Graph Convolution…

    Submitted 20 July, 2019; v1 submitted 12 September, 2018; originally announced September 2018.

    Comments: 11 pages, 2 figures

    Journal ref: 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019)

  22. arXiv:1805.11365  [pdf, other]

    cs.LG stat.ML

    Lovasz Convolutional Networks

    Authors: Prateek Yadav, Madhav Nimishakavi, Naganand Yadati, Shikhar Vashishth, Arun Rajkumar, Partha Talukdar

    Abstract: Semi-supervised learning on graph structured data has received significant attention with the recent introduction of Graph Convolution Networks (GCN). While traditional methods have focused on optimizing a loss augmented with Laplacian regularization framework, GCNs perform an implicit Laplacian type regularization to capture local graph structure. In this work, we propose Lovasz Convolutional Net…

    Submitted 3 January, 2019; v1 submitted 29 May, 2018; originally announced May 2018.

    Comments: Accepted at AISTATS 2019