Search | arXiv e-print repository

Nemotron-4 340B Technical Report

Authors: Nvidia: Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek, et al. (58 additional authors not shown)

Abstract: We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows distribution, modification, and use of the models and its outputs. These models perform competitively to open access models on a wide range of evaluation benchmarks, and were sized to fit on a single DGX H100 with 8 GPUs when deployed in FP8 precision. We believe that the community can benefit from these models in various research studies and commercial applications, especially for generating synthetic data to train smaller language models. Notably, over 98% of data used in our model alignment process is synthetically generated, showcasing the effectiveness of these models in generating synthetic data. To further support open research and facilitate model development, we are also open-sourcing the synthetic data generation pipeline used in our model alignment process.

Submitted 6 August, 2024; v1 submitted 17 June, 2024; originally announced June 2024.
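Note: the sizing claim in the Nemotron-4 abstract above can be checked with back-of-the-envelope arithmetic. The sketch below is not from the report; it assumes 80 GB of memory per H100 GPU and counts model weights only (no KV cache, activations, or runtime overhead), so it is a lower bound on real memory use.

```python
# Rough weight-memory estimate for a 340B-parameter model.
# Assumptions (not stated in the abstract): 80 GB per GPU, weights only.

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the memory needed to hold the weights, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9

PARAMS = 340e9          # parameter count from the model name
GPU_MEMORY_GB = 80      # assumed per-GPU memory
NUM_GPUS = 8            # GPUs in a single node, per the abstract

node_memory_gb = GPU_MEMORY_GB * NUM_GPUS
fp8_gb = weight_memory_gb(PARAMS, 1)    # FP8: 1 byte per parameter
bf16_gb = weight_memory_gb(PARAMS, 2)   # BF16: 2 bytes per parameter

print(f"node memory: {node_memory_gb} GB")
print(f"FP8 weights: {fp8_gb:.0f} GB, fits: {fp8_gb < node_memory_gb}")
print(f"BF16 weights: {bf16_gb:.0f} GB, fits: {bf16_gb < node_memory_gb}")
```

Under these assumptions the FP8 weights leave headroom on one node, while 16-bit weights alone would exceed it, which is consistent with the abstract's emphasis on FP8 deployment.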

arXiv:2405.07447 [pdf]

From traces to measures: Large language models as a tool for psychological measurement from text

Authors: Joseph J. P. Simons, Wong Liang Ze, Prasanta Bhattacharya, Brandon Siyuan Loh, Wei Gao

Abstract: Digital trace data provide potentially valuable resources for understanding human behaviour, but their value has been limited by issues of unclear measurement. The growth of large language models provides an opportunity to address this limitation in the case of text data. Specifically, recognizing cases where their responses are a form of psychological measurement (the use of observable indicators to assess an underlying construct) allows existing measures and accuracy assessment frameworks from psychology to be re-purposed to use with large language models. Based on this, we offer four methodological recommendations for using these models to quantify text features: (1) identify the target of measurement, (2) use multiple prompts, (3) assess internal consistency, and (4) treat evaluation metrics (such as human annotations) as expected correlates rather than direct ground-truth measures. Additionally, we provide a workflow for implementing this approach.

Submitted 12 May, 2024; originally announced May 2024.

Comments: 12 pages, 2 figures, 1 table

arXiv:2404.12674 [pdf, other]

Towards Universal Performance Modeling for Machine Learning Training on Multi-GPU Platforms

Authors: Zhongyi Lin, Ning Sun, Pallab Bhattacharya, Xizhou Feng, Louis Feng, John D. Owens

Abstract: Characterizing and predicting the training performance of modern machine learning (ML) workloads on compute systems with compute and communication spread between CPUs, GPUs, and network devices is not only the key to optimization and planning but also a complex goal to achieve. The primary challenges include the complexity of synchronization and load balancing between CPUs and GPUs, the variance in input data distribution, and the use of different communication devices and topologies (e.g., NVLink, PCIe, network cards) that connect multiple compute devices, coupled with the desire for flexible training configurations. Built on top of our prior work for single-GPU platforms, we address these challenges and enable multi-GPU performance modeling by incorporating (1) data-distribution-aware performance models for embedding table lookup, and (2) data movement prediction of communication collectives, into our upgraded performance modeling pipeline equipped with inter- and intra-rank synchronization for ML workloads trained on multi-GPU platforms. Beyond accurately predicting the per-iteration training time of DLRM models with random configurations with a geomean error of 5.21% on two multi-GPU platforms, our prediction pipeline generalizes well to other types of ML workloads, such as Transformer-based NLP models with a geomean error of 3.00%. Moreover, even without actually running ML workloads like DLRMs on the hardware, it is capable of generating insights such as quickly selecting the fastest embedding table sharding configuration (with a success rate of 85%).

Submitted 27 April, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

Comments: 12 pages, 11 figures, 4 tables
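Note: the performance-modeling abstract above reports "geomean error". The sketch below shows one common way a geometric mean of per-configuration percentage errors is computed. The error values are hypothetical and the aggregation details are an assumption; the abstract does not say exactly how the authors compute the figure.

```python
import math

def geomean(values):
    """Geometric mean of strictly positive values, computed in log space."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geomean needs a non-empty list of positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical absolute prediction errors (in percent) for three configurations.
errors_pct = [2.0, 4.0, 8.0]

geomean_error = geomean(errors_pct)
mean_error = sum(errors_pct) / len(errors_pct)

print(f"geomean error: {geomean_error:.2f}%")
print(f"arithmetic mean error: {mean_error:.2f}%")
```

The geometric mean is never larger than the arithmetic mean, so it is less sensitive to a few configurations with unusually large errors.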

arXiv:2310.16673 [pdf, other]

Exploring Large Language Models for Code Explanation

Authors: Paheli Bhattacharya, Manojit Chakraborty, Kartheek N S N Palepu, Vikas Pandey, Ishan Dindorkar, Rakesh Rajpurohit, Rishabh Gupta

Abstract: Automating code documentation through explanatory text can prove highly beneficial in code understanding. Large Language Models (LLMs) have made remarkable strides in Natural Language Processing, especially within software engineering tasks such as code generation and code summarization. This study specifically delves into the task of generating natural-language summaries for code snippets, using various LLMs. The findings indicate that Code LLMs outperform their generic counterparts, and zero-shot methods yield superior results when dealing with datasets with dissimilar distributions between training and testing sets.

Submitted 25 October, 2023; originally announced October 2023.

Comments: Accepted at the Forum for Information Retrieval Evaluation 2023 (IRSE Track)

ACM Class: D.2.3; I.7

arXiv:2310.09848 [pdf]

Enhancing Stance Classification with Quantified Moral Foundations

Authors: Hong Zhang, Prasanta Bhattacharya, Wei Gao, Liang Ze Wong, Brandon Siyuan Loh, Joseph J. P. Simons, Jisun An

Abstract: This study enhances stance detection on social media by incorporating deeper psychological attributes, specifically individuals' moral foundations. These theoretically-derived dimensions aim to provide a comprehensive profile of an individual's moral concerns which, in recent work, has been linked to behaviour in a range of domains, including society, politics, health, and the environment. In this paper, we investigate how moral foundation dimensions can contribute to predicting an individual's stance on a given target. Specifically we incorporate moral foundation features extracted from text, along with message semantic features, to classify stances at both message- and user-levels across a range of targets and models. Our preliminary results suggest that encoding moral foundations can enhance the performance of stance detection tasks and help illuminate the associations between specific moral foundations and online stances on target topics. The results highlight the importance of considering deeper psychological attributes in stance analysis and underscore the role of moral foundations in guiding online social behavior.

Submitted 15 October, 2023; originally announced October 2023.

Comments: 11 pages, 5 figures

arXiv:2306.14142 [pdf, other]

Estimating Policy Effects in a Social Network with Independent Set Sampling

Authors: Eugene Ang, Prasanta Bhattacharya, Andrew Lim

Abstract: Evaluating the impact of policy interventions on respondents who are embedded in a social network is often challenging due to the presence of network interference within the treatment groups, as well as between treatment and non-treatment groups throughout the network. In this paper, we propose a modeling strategy that combines existing work on stochastic actor-oriented models (SAOM) with a novel network sampling method based on the identification of independent sets. By assigning respondents from an independent set to the treatment, we are able to block any spillover of the treatment and network influence, thereby allowing us to isolate the direct effect of the treatment from the indirect network-induced effects, in the immediate term. As a result, our method allows for the estimation of both the direct as well as the net effect of a chosen policy intervention, in the presence of network effects in the population. We perform a comparative simulation analysis to show that our proposed sampling technique leads to distinct direct and net effects of the policy, as well as significant network effects driven by policy-linked homophily. This study highlights the importance of network sampling techniques in improving policy evaluation studies and has the potential to help researchers and policymakers with better planning, designing, and anticipating policy responses in a networked society.

Submitted 25 February, 2024; v1 submitted 25 June, 2023; originally announced June 2023.
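Note: the abstract above relies on independent sets, that is, sets of nodes in which no two members are directly connected. The sketch below is a generic greedy heuristic for picking a maximal independent set, not the authors' sampling method; the toy network and the function name `greedy_independent_set` are hypothetical.

```python
import random

def greedy_independent_set(adjacency, seed=0):
    """Pick a maximal independent set by visiting nodes in random order.

    adjacency maps each node to the set of its neighbours (undirected graph).
    """
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    rng.shuffle(nodes)
    chosen, blocked = set(), set()
    for node in nodes:
        if node in blocked:
            continue  # a neighbour was already chosen, or the node itself was
        chosen.add(node)
        blocked.add(node)
        blocked.update(adjacency[node])  # neighbours can no longer be chosen
    return chosen

# Hypothetical five-person network.
network = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

treated = greedy_independent_set(network)
print(sorted(treated))
```

Because no two selected respondents share an edge, a treatment assigned to this set cannot pass directly from one treated respondent to another, which is the property the abstract uses to separate direct effects from network-induced ones.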

arXiv:2301.02959 [pdf, other]

FlexShard: Flexible Sharding for Industry-Scale Sequence Recommendation Models

Authors: Geet Sethi, Pallab Bhattacharya, Dhruv Choudhary, Carole-Jean Wu, Christos Kozyrakis

Abstract: Sequence-based deep learning recommendation models (DLRMs) are an emerging class of DLRMs showing great improvements over their prior sum-pooling based counterparts at capturing users' long term interests. These improvements come at immense system cost however, with sequence-based DLRMs requiring substantial amounts of data to be dynamically materialized and communicated by each accelerator during a single iteration. To address this rapidly growing bottleneck, we present FlexShard, a new tiered sequence embedding table sharding algorithm which operates at a per-row granularity by exploiting the insight that not every row is equal. Through precise replication of embedding rows based on their underlying probability distribution, along with the introduction of a new sharding strategy adapted to the heterogeneous, skewed performance of real-world cluster network topologies, FlexShard is able to significantly reduce communication demand while using no additional memory compared to the prior state-of-the-art. When evaluated on production-scale sequence DLRMs, FlexShard was able to reduce overall global all-to-all communication traffic by over 85%, resulting in end-to-end training communication latency improvements of almost 6x over the prior state-of-the-art approach.

Submitted 7 January, 2023; originally announced January 2023.

arXiv:2212.13897 [pdf, other]

What You Like: Generating Explainable Topical Recommendations for Twitter Using Social Annotations

Authors: Parantapa Bhattacharya, Saptarshi Ghosh, Muhammad Bilal Zafar, Soumya K. Ghosh, Niloy Ganguly

Abstract: With over 500 million tweets posted per day, in Twitter, it is difficult for Twitter users to discover interesting content from the deluge of uninteresting posts. In this work, we present a novel, explainable, topical recommendation system, that utilizes social annotations, to help Twitter users discover tweets, on topics of their interest. A major challenge in using traditional rating dependent recommendation systems, like collaborative filtering and content based systems, in high volume social networks is that, due to attention scarcity most items do not get any ratings. Additionally, the fact that most Twitter users are passive consumers, with 44% users never tweeting, makes it very difficult to use user ratings for generating recommendations. Further, a key challenge in developing recommendation systems is that in many cases users reject relevant recommendations if they are totally unfamiliar with the recommended item. Providing a suitable explanation, for why the item is recommended, significantly improves the acceptability of recommendation. By virtue of being a topical recommendation system our method is able to present simple topical explanations for the generated recommendations. Comparisons with state-of-the-art matrix factorization based collaborative filtering, content based and social recommendations demonstrate the efficacy of the proposed approach.

Submitted 23 December, 2022; originally announced December 2022.

arXiv:2212.12594 [pdf, other]

Analyzing Regrettable Communications on Twitter: Characterizing Deleted Tweets and Their Authors

Authors: Parantapa Bhattacharya, Saptarshi Ghosh, Niloy Ganguly

Abstract: Over 500 million tweets are posted in Twitter each day, out of which about 11% tweets are deleted by the users posting them. This phenomenon of widespread deletion of tweets leads to a number of questions: what kind of content posted by users makes them want to delete them later? Are users of certain predispositions more likely to post regrettable tweets, deleting them later? In this paper we provide a detailed characterization of tweets posted and then later deleted by their authors. We collected tweets from over 200 thousand Twitter users during a period of four weeks. Our characterization shows significant personality differences between users who delete their tweets and those who do not. We find that users who delete their tweets are more likely to be extroverted and neurotic while being less conscientious. Also, we find that deleted tweets while containing less information and being less conversational, contain significant indications of regrettable content. Since users of online communication do not have instant social cues (like listener's body language) to gauge the impact of their words, they are often delayed in employing repair strategies. Finally, we build a classifier which takes textual, contextual, as well as user features to predict if a tweet will be deleted or not. The classifier achieves an F1-score of 0.78 and the precision increases when we consider response features of the tweets.

Submitted 23 December, 2022; originally announced December 2022.

arXiv:2212.09045 [pdf, other]

Task Preferences across Languages on Community Question Answering Platforms

Authors: Sebastin Santy, Prasanta Bhattacharya, Rishabh Mehrotra

Abstract: With the steady emergence of community question answering (CQA) platforms like Quora, StackExchange, and WikiHow, users now have unprecedented access to information on various kinds of queries and tasks. Moreover, the rapid proliferation and localization of these platforms spanning geographic and linguistic boundaries offer a unique opportunity to study the task requirements and preferences of users in different socio-linguistic groups. In this study, we implement an entity-embedding model trained on a large longitudinal dataset of multi-lingual and task-oriented question-answer pairs to uncover and quantify the (i) prevalence and distribution of various online tasks across linguistic communities, and (ii) emerging and receding trends in task popularity over time in these communities. Our results show that there exists substantial variance in task preference as well as popularity trends across linguistic communities on the platform. Findings from this study will help Q&A platforms better curate and personalize content for non-English users, while also offering valuable insights to businesses looking to target non-English speaking communities online.

Submitted 18 December, 2022; originally announced December 2022.

Comments: 7 pages, 4 figures

arXiv:2211.01338 [pdf, other]

Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages

Authors: Anusha Prakash, Arun Kumar, Ashish Seth, Bhagyashree Mukherjee, Ishika Gupta, Jom Kuriakose, Jordan Fernandes, K V Vikram, Mano Ranjith Kumar M, Metilda Sagaya Mary, Mohammad Wajahat, Mohana N, Mudit Batra, Navina K, Nihal John George, Nithya Ravi, Pruthwik Mishra, Sudhanshu Srivastava, Vasista Sai Lodagala, Vandan Mujadia, Kada Sai Venkata Vineeth, Vrunda Sukhadia, Dipti Sharma, Hema Murthy, Pushpak Bhattacharya, et al. (2 additional authors not shown)

Abstract: Cross-lingual dubbing of lecture videos requires the transcription of the original audio, correction and removal of disfluencies, domain term discovery, text-to-text translation into the target language, chunking of text using target language rhythm, text-to-speech synthesis followed by isochronous lipsyncing to the original video. This task becomes challenging when the source and target languages belong to different language families, resulting in differences in generated audio duration. This is further compounded by the original speaker's rhythm, especially for extempore speech. This paper describes the challenges in regenerating English lecture videos in Indian languages semi-automatically. A prototype is developed for dubbing lectures into 9 Indian languages. A mean-opinion-score (MOS) is obtained for two languages, Hindi and Tamil, on two different courses. The output video is compared with the original video in terms of MOS (1-5) and lip synchronisation with scores of 4.09 and 3.74, respectively. The human effort also reduces by 75%.

Submitted 1 November, 2022; originally announced November 2022.

arXiv:2210.09421 [pdf, other]

Deepfake Text Detection: Limitations and Opportunities

Authors: Jiameng Pu, Zain Sarwar, Sifat Muhammad Abdullah, Abdullah Rehman, Yoonjin Kim, Parantapa Bhattacharya, Mobin Javed, Bimal Viswanath

Abstract: Recent advances in generative models for language have enabled the creation of convincing synthetic text or deepfake text. Prior work has demonstrated the potential for misuse of deepfake text to mislead content consumers. Therefore, deepfake text detection, the task of discriminating between human and machine-generated text, is becoming increasingly critical. Several defenses have been proposed for deepfake text detection. However, we lack a thorough understanding of their real-world applicability. In this paper, we collect deepfake text from 4 online services powered by Transformer-based tools to evaluate the generalization ability of the defenses on content in the wild. We develop several low-cost adversarial attacks, and investigate the robustness of existing defenses against an adaptive attacker. We find that many defenses show significant degradation in performance under our evaluation scenarios compared to their original claimed performance. Our evaluation shows that tapping into the semantic information in the text content is a promising approach for improving the robustness and generalization performance of deepfake text detection schemes.

Submitted 17 October, 2022; originally announced October 2022.

Comments: Accepted to IEEE S&P 2023; First two authors contributed equally to this work; 18 pages, 7 figures

arXiv:2210.07544 [pdf, other]

Legal Case Document Summarization: Extractive and Abstractive Methods and their Evaluation

Authors: Abhay Shukla, Paheli Bhattacharya, Soham Poddar, Rajdeep Mukherjee, Kripabandhu Ghosh, Pawan Goyal, Saptarshi Ghosh

Abstract: Summarization of legal case judgement documents is a challenging problem in Legal NLP. However, not much analyses exist on how different families of summarization models (e.g., extractive vs. abstractive) perform when applied to legal case documents. This question is particularly important since many recent transformer-based abstractive summarization models have restrictions on the number of input tokens, and legal documents are known to be very long. Also, it is an open question on how best to evaluate legal case document summarization systems. In this paper, we carry out extensive experiments with several extractive and abstractive summarization methods (both supervised and unsupervised) over three legal summarization datasets that we have developed. Our analyses, that includes evaluation by law practitioners, lead to several interesting insights on legal summarization in specific and long document summarization in general.

Submitted 14 October, 2022; originally announced October 2022.

Comments: Accepted at The 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (AACL-IJCNLP), 2022

arXiv:2209.12474 [pdf, other]

Legal Case Document Similarity: You Need Both Network and Text

Authors: Paheli Bhattacharya, Kripabandhu Ghosh, Arindam Pal, Saptarshi Ghosh

Abstract: Estimating the similarity between two legal case documents is an important and challenging problem, having various downstream applications such as prior-case retrieval and citation recommendation. There are two broad approaches for the task -- citation network-based and text-based. Prior citation network-based approaches consider citations only to prior-cases (also called precedents) (PCNet). This approach misses important signals inherent in Statutes (written laws of a jurisdiction). In this work, we propose Hier-SPCNet that augments PCNet with a heterogeneous network of Statutes. We incorporate domain knowledge for legal document similarity into Hier-SPCNet, thereby obtaining state-of-the-art results for network-based legal document similarity. Both textual and network similarity provide important signals for legal case similarity; but till now, only trivial attempts have been made to unify the two signals. In this work, we apply several methods for combining textual and network information for estimating legal case similarity. We perform extensive experiments over legal case documents from the Indian judiciary, where the gold standard similarity between document-pairs is judged by law experts from two reputed Law institutes in India. Our experiments establish that our proposed network-based methods significantly improve the correlation with domain experts' opinion when compared to the existing methods for network-based legal document similarity. Our best-performing combination method (that combines network-based and text-based similarity) improves the correlation with domain experts' opinion by 11.8% over the best text-based method and 20.6% over the best network-based method. We also establish that our best-performing method can be used to recommend / retrieve citable and similar cases for a source (query) case, which are well appreciated by legal experts.

Submitted 26 September, 2022; originally announced September 2022.

Comments: This work has been published in Information Processing and Management, Elsevier, vol. 59, issue 6, November 2022

arXiv:2206.02878 [pdf, other]

doi 10.1145/3582016.3582063

TPP: Transparent Page Placement for CXL-Enabled Tiered-Memory

Authors: Hasan Al Maruf, Hao Wang, Abhishek Dhanotia, Johannes Weiner, Niket Agarwal, Pallab Bhattacharya, Chris Petersen, Mosharaf Chowdhury, Shobhit Kanaujia, Prakash Chauhan

Abstract: The increasing demand for memory in hyperscale applications has led to memory becoming a large portion of the overall datacenter spend. The emergence of coherent interfaces like CXL enables main memory expansion and offers an efficient solution to this problem. In such systems, the main memory can constitute different memory technologies with varied characteristics. In this paper, we characterize memory usage patterns of a wide range of datacenter applications across the server fleet of Meta. We, therefore, demonstrate the opportunities to offload colder pages to slower memory tiers for these applications. Without efficient memory management, however, such systems can significantly degrade performance. We propose a novel OS-level application-transparent page placement mechanism (TPP) for CXL-enabled memory. TPP employs a lightweight mechanism to identify and place hot/cold pages to appropriate memory tiers. It enables a proactive page demotion from local memory to CXL-Memory. This technique ensures a memory headroom for new page allocations that are often related to request processing and tend to be short-lived and hot. At the same time, TPP can promptly promote performance-critical hot pages trapped in the slow CXL-Memory to the fast local memory, while minimizing both sampling overhead and unnecessary migrations. TPP works transparently without any application-specific knowledge and can be deployed globally as a kernel release. We evaluate TPP in the production server fleet with early samples of new x86 CPUs with CXL 1.1 support. TPP makes a tiered memory system performant as an ideal baseline (<1% gap) that has all the memory in the local tier. It is 18% better than today's Linux, and 5-17% better than existing solutions including NUMA Balancing and AutoTiering. Most of the TPP patches have been merged in the Linux v5.18 release.

Submitted 28 May, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

arXiv:2204.08601 [pdf, other]

A Tour of Visualization Techniques for Computer Vision Datasets

Authors: Bilal Alsallakh, Pamela Bhattacharya, Vanessa Feng, Narine Kokhlikyan, Orion Reblitz-Richardson, Rahul Rajan, David Yan

Abstract: We survey a number of data visualization techniques for analyzing Computer Vision (CV) datasets. These techniques help us understand properties and latent patterns in such data, by applying dataset-level analysis. We present various examples of how such analysis helps predict the potential impact of the dataset properties on CV models and informs appropriate mitigation of their shortcomings. Finally, we explore avenues for further visualization techniques of different modalities of CV datasets as well as ones that are tailored to support specific CV tasks and analysis needs.

Submitted 18 April, 2022; originally announced April 2022.

arXiv:2106.15876 [pdf, other]

Incorporating Domain Knowledge for Extractive Summarization of Legal Case Documents

Authors: Paheli Bhattacharya, Soham Poddar, Koustav Rudra, Kripabandhu Ghosh, Saptarshi Ghosh

Abstract: Automatic summarization of legal case documents is an important and practical challenge. Apart from many domain-independent text summarization algorithms that can be used for this purpose, several algorithms have been developed specifically for summarizing legal case documents. However, most of the existing algorithms do not systematically incorporate domain knowledge that specifies what information should ideally be present in a legal case document summary. To address this gap, we propose an unsupervised summarization algorithm DELSumm which is designed to systematically incorporate guidelines from legal experts into an optimization setup. We conduct detailed experiments over case documents from the Indian Supreme Court. The experiments show that our proposed unsupervised method outperforms several strong baselines in terms of ROUGE scores, including both general summarization algorithms and legal-specific ones. In fact, though our proposed algorithm is unsupervised, it outperforms several supervised summarization models that are trained over thousands of document-summary pairs.

Submitted 30 June, 2021; originally announced June 2021.

Comments: Accepted at the 18th International Conference on Artificial Intelligence and Law (ICAIL) 2021

arXiv:2106.06292 [pdf, other]

A Discussion on Building Practical NLP Leaderboards: The Case of Machine Translation

Authors: Sebastin Santy, Prasanta Bhattacharya

Abstract: Recent advances in AI and ML applications have benefited from rapid progress in NLP research. Leaderboards have emerged as a popular mechanism to track and accelerate progress in NLP through competitive model development. While this has increased interest and participation, the over-reliance on single, accuracy-based metrics has shifted focus from other important metrics that might be equally pertinent to consider in real-world contexts. In this paper, we offer a preliminary discussion of the risks associated with focusing exclusively on accuracy metrics and draw on recent discussions to highlight prescriptive suggestions on how to develop more practical and effective leaderboards that can better reflect the real-world utility of models.

Submitted 30 December, 2022; v1 submitted 11 June, 2021; originally announced June 2021.

Comments: pre-print

arXiv:2104.05158 [pdf, other]

Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models

Authors: Dheevatsa Mudigere, Yuchen Hao, Jianyu Huang, Zhihao Jia, Andrew Tulloch, Srinivas Sridharan, Xing Liu, Mustafa Ozdal, Jade Nie, Jongsoo Park, Liang Luo, Jie Amy Yang, Leon Gao, Dmytro Ivchenko, Aarti Basant, Yuxi Hu, Jiyan Yang, Ehsan K. Ardestani, Xiaodong Wang, Rakesh Komuravelli, Ching-Hsiang Chu, Serhat Yilmaz, Huayu Li, Jiyuan Qian, Zhuobo Feng, et al. (28 additional authors not shown)

Abstract: Deep learning recommendation models (DLRMs) are used across many business-critical services at Facebook and are the single largest AI application in terms of infrastructure demand in its data-centers. In this paper we discuss the SW/HW co-designed solution for high-performance distributed training of large-scale DLRMs. We introduce a high-performance scalable software stack based on PyTorch and pair it with the new evolution of Zion platform, namely ZionEX. We demonstrate the capability to train very large DLRMs with up to 12 Trillion parameters and show that we can attain 40X speedup in terms of time to solution over previous systems. We achieve this by (i) designing the ZionEX platform with dedicated scale-out network, provisioned with high bandwidth, optimal topology and efficient transport; (ii) implementing an optimized PyTorch-based training stack supporting both model and data parallelism; (iii) developing sharding algorithms capable of hierarchical partitioning of the embedding tables along row, column dimensions and load balancing them across multiple workers; (iv) adding high-performance core operators while retaining flexibility to support optimizers with fully deterministic updates; (v) leveraging reduced precision communications, multi-level memory hierarchy (HBM+DDR+SSD) and pipelining. Furthermore, we develop and briefly comment on distributed data ingestion and other supporting services that are required for the robust and efficient end-to-end training in production environments.

Submitted 26 February, 2023; v1 submitted 11 April, 2021; originally announced April 2021.

arXiv:2104.02107 [pdf, other]

doi 10.1109/EuroSP48549.2020.00017

Jekyll: Attacking Medical Image Diagnostics using Deep Generative Models

Authors: Neal Mangaokar, Jiameng Pu, Parantapa Bhattacharya, Chandan K. Reddy, Bimal Viswanath

Abstract: Advances in deep neural networks (DNNs) have shown tremendous promise in the medical domain. However, the deep learning tools that are helping the domain can also be used against it. Given the prevalence of fraud in the healthcare domain, it is important to consider the adversarial use of DNNs in manipulating sensitive data that is crucial to patient healthcare. In this work, we present the design and implementation of a DNN-based image translation attack on biomedical imagery. More specifically, we propose Jekyll, a neural style transfer framework that takes as input a biomedical image of a patient and translates it to a new image that indicates an attacker-chosen disease condition. The potential for fraudulent claims based on such generated 'fake' medical images is significant, and we demonstrate successful attacks on both X-rays and retinal fundus image modalities. We show that these attacks manage to mislead both medical professionals and algorithmic detection schemes. Lastly, we also investigate defensive measures based on machine learning to detect images generated by Jekyll.

Submitted 5 April, 2021; originally announced April 2021.

Comments: Published in proceedings of the 5th European Symposium on Security and Privacy (EuroS&P '20)

arXiv:2103.04263 [pdf, other]

Deepfake Videos in the Wild: Analysis and Detection

Authors: Jiameng Pu, Neal Mangaokar, Lauren Kelly, Parantapa Bhattacharya, Kavya Sundaram, Mobin Javed, Bolun Wang, Bimal Viswanath

Abstract: AI-manipulated videos, commonly known as deepfakes, are an emerging problem. Recently, researchers in academia and industry have contributed several (self-created) benchmark deepfake datasets, and deepfake detection algorithms. However, little effort has gone towards understanding deepfake videos in the wild, leading to a limited understanding of the real-world applicability of research contributions in this space. Even if detection schemes are shown to perform well on existing datasets, it is unclear how well the methods generalize to real-world deepfakes. To bridge this gap in knowledge, we make the following contributions: First, we collect and present the largest dataset of deepfake videos in the wild, containing 1,869 videos from YouTube and Bilibili, and extract over 4.8M frames of content. Second, we present a comprehensive analysis of the growth patterns, popularity, creators, manipulation strategies, and production methods of deepfake content in the real world. Third, we systematically evaluate existing defenses using our new dataset, and observe that they are not ready for deployment in the real world. Fourth, we explore the potential for transfer learning schemes and competition-winning techniques to improve defenses.

Submitted 10 March, 2021; v1 submitted 6 March, 2021; originally announced March 2021.

Comments: Accepted to The Web Conference 2021; First two authors contributed equally to this work; 12 pages, 6 tables

arXiv:2012.02594 [pdf, other]

To Schedule or not to Schedule: Extracting Task Specific Temporal Entities and Associated Negation Constraints

Authors: Barun Patra, Chala Fufa, Pamela Bhattacharya, Charles Lee

Abstract: State of the art research for date-time entity extraction from text is task agnostic. Consequently, while the methods proposed in literature perform well for generic date-time extraction from texts, they don't fare as well on task specific date-time entity extraction where only a subset of the date-time entities present in the text are pertinent to solving the task. Furthermore, some tasks require identifying negation constraints associated with the date-time entities to correctly reason over time. We showcase a novel model for extracting task-specific date-time entities along with their negation constraints. We show the efficacy of our method on the task of date-time understanding in the context of scheduling meetings for an email-based digital AI scheduling assistant. Our method achieves an absolute gain of 19% f-score points compared to baseline methods in detecting the date-time entities relevant to scheduling meetings and a 4% improvement over baseline methods for detecting negation constraints over date-time entities.

Submitted 15 November, 2020; originally announced December 2020.

Comments: Proceedings of EMNLP 2020

arXiv:2007.03225 [pdf, other]

Hier-SPCNet: A Legal Statute Hierarchy-based Heterogeneous Network for Computing Legal Case Document Similarity

Authors: Paheli Bhattacharya, Kripabandhu Ghosh, Arindam Pal, Saptarshi Ghosh

Abstract: Computing similarity between two legal case documents is an important and challenging task in Legal IR, for which text-based and network-based measures have been proposed in literature. All prior network-based similarity methods considered a precedent citation network among case documents only (PCNet). However, this approach misses an important source of legal knowledge -- the hierarchy of legal statutes that are applicable in a given legal jurisdiction (e.g., country). We propose to augment the PCNet with the hierarchy of legal statutes, to form a heterogeneous network Hier-SPCNet, having citation links between case documents and statutes, as well as citation and hierarchy links among the statutes. Experiments over a set of Indian Supreme Court case documents show that our proposed heterogeneous network enables significantly better document similarity estimation, as compared to existing approaches using PCNet. We also show that the proposed network-based method can complement text-based measures for better estimation of legal document similarity.

Submitted 7 July, 2020; originally announced July 2020.

Comments: Accepted at the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020 (Short Paper)

arXiv:2004.13274 [pdf]

Exploring the contextual factors affecting multimodal emotion recognition in videos

Authors: Prasanta Bhattacharya, Raj Kumar Gupta, Yinping Yang

Abstract: Emotional expressions form a key part of user behavior on today's digital platforms. While multimodal emotion recognition techniques are gaining research attention, there is a lack of deeper understanding on how visual and non-visual features can be used to better recognize emotions in certain contexts, but not others. This study analyzes the interplay between the effects of multimodal emotion features derived from facial expressions, tone and text in conjunction with two key contextual factors: i) gender of the speaker, and ii) duration of the emotional episode. Using a large public dataset of 2,176 manually annotated YouTube videos, we found that while multimodal features consistently outperformed bimodal and unimodal features, their performance varied significantly across different emotions, gender and duration contexts. Multimodal features performed particularly better for male speakers in recognizing most emotions. Furthermore, multimodal features performed particularly better for shorter than for longer videos in recognizing neutral and happiness, but not sadness and anger. These findings offer new insights towards the development of more context-aware emotion recognition and empathetic systems.

Submitted 30 June, 2021; v1 submitted 28 April, 2020; originally announced April 2020.

Comments: Accepted version at IEEE Transactions on Affective Computing

arXiv:2004.12307 [pdf, other]

Methods for Computing Legal Document Similarity: A Comparative Study

Authors: Paheli Bhattacharya, Kripabandhu Ghosh, Arindam Pal, Saptarshi Ghosh

Abstract: Computing similarity between two legal documents is an important and challenging task in the domain of Legal Information Retrieval. Finding similar legal documents has many applications in downstream tasks, including prior-case retrieval, recommendation of legal articles, and so on. Prior works have proposed two broad ways of measuring similarity between legal documents - analyzing the precedent citation network, and measuring similarity based on textual content similarity measures. But there has not been a comprehensive comparison of these existing methods on a common platform. In this paper, we perform the first systematic analysis of the existing methods. In addition, we explore two promising new similarity computation methods - one text-based and the other based on network embeddings, which have not been considered till now.

Submitted 26 April, 2020; originally announced April 2020.

Comments: This paper was published at the LDA 2019 workshop in the JURIX 2019 conference

arXiv:2003.04988 [pdf, other]

ScopeIt: Scoping Task Relevant Sentences in Documents

Authors: Vishwas Suryanarayanan, Barun Patra, Pamela Bhattacharya, Chala Fufa, Charles Lee

Abstract: Intelligent assistants like Cortana, Siri, Alexa, and Google Assistant are trained to parse information when the conversation is synchronous and short; however, for email-based conversational agents, the communication is asynchronous, and often contains information irrelevant to the assistant. This makes it harder for the system to accurately detect intents, extract entities relevant to those intents and thereby perform the desired action. We present a neural model for scoping relevant information for the agent from a large query. We show that when used as a preprocessing step, the model improves performance of both intent detection and entity extraction tasks. We demonstrate the model's impact on Scheduler (Cortana is the persona of the agent, while Scheduler is the name of the service. We use them interchangeably in the context of this paper.) - a virtual conversational meeting scheduling assistant that interacts asynchronously with users through email. The model helps the entity extraction and intent detection tasks requisite by Scheduler achieve an average gain of 35% in precision without any drop in recall. Additionally, we demonstrate that the same approach can be used for component level analysis in large documents, such as signature block identification.

Submitted 15 November, 2020; v1 submitted 22 February, 2020; originally announced March 2020.

Comments: Accepted in COLING 2020

ACM Class: I.2.7; I.7.5

arXiv:1911.05405 [pdf, ps, other]

Identification of Rhetorical Roles of Sentences in Indian Legal Judgments

Authors: Paheli Bhattacharya, Shounak Paul, Kripabandhu Ghosh, Saptarshi Ghosh, Adam Wyner

Abstract: Automatically understanding the rhetorical roles of sentences in a legal case judgement is an important problem to solve, since it can help in several downstream tasks like summarization of legal judgments, legal search, and so on. The task is challenging since legal case documents are usually not well-structured, and these rhetorical roles may be subjective (as evident from variation of opinions between legal experts). In this paper, we address this task for judgments from the Supreme Court of India. We label sentences in 50 documents using multiple human annotators, and perform an extensive analysis of the human-assigned labels. We also attempt automatic identification of the rhetorical roles of sentences. While prior approaches towards this task used Conditional Random Fields over manually handcrafted features, we explore the use of deep neural models which do not require hand-crafting of features. Experiments show that neural models perform much better in this task than baseline methods which use handcrafted features.

Submitted 13 November, 2019; originally announced November 2019.

Comments: Accepted at the 32nd International Conference on Legal Knowledge and Information Systems (JURIX) 2019

arXiv:1901.08001 [pdf]

doi 10.1017/S1431927619000254

Removing Stripes, Scratches, and Curtaining with Non-Recoverable Compressed Sensing

Authors: Jonathan Schwartz, Yi Jiang, Yongjie Wang, Anthony Aiello, Pallab Bhattacharya, Hui Yuan, Zetian Mi, Nabil Bassim, Robert Hovden

Abstract: Highly-directional image artifacts such as ion mill curtaining, mechanical scratches, or image striping from beam instability degrade the interpretability of micrographs. These unwanted, aperiodic features extend the image along a primary direction and occupy a small wedge of information in Fourier space. Deleting this wedge of data replaces stripes, scratches, or curtaining with more complex streaking and blurring artifacts, known within the tomography community as missing wedge artifacts. Here, we overcome this problem by recovering the missing region using total variation minimization, which leverages image sparsity based reconstruction techniques, colloquially referred to as compressed sensing, to reliably restore images corrupted by stripe-like features. Our approach removes beam instability, ion mill curtaining, mechanical scratches, or any stripe features and remains robust at low signal-to-noise. The success of this approach is achieved by exploiting compressed sensing's inability to recover directional structures that are highly localized and missing in Fourier space.

Submitted 23 January, 2019; originally announced January 2019.

Comments: 15 pages, 5 figures

arXiv:1712.05492 [pdf, other]

Constant Approximation Algorithms for Guarding Simple Polygons using Vertex Guards

Authors: Pritam Bhattacharya, Subir Kumar Ghosh, Sudebkumar Pal

Abstract: The art gallery problem enquires about the least number of guards sufficient to ensure that an art gallery, represented by a simple polygon $P$, is fully guarded. Most standard versions of this problem are known to be NP-hard. In 1987, Ghosh provided a deterministic $\mathcal{O}(\log n)$-approximation algorithm for the case of vertex guards and edge guards in simple polygons. In the same paper, Ghosh also conjectured the existence of constant ratio approximation algorithms for these problems. We present here three polynomial-time algorithms with a constant approximation ratio for guarding an $n$-sided simple polygon $P$ using vertex guards. (i) The first algorithm, that has an approximation ratio of 18, guards all vertices of $P$ in $\mathcal{O}(n^4)$ time. (ii) The second algorithm, that has the same approximation ratio of 18, guards the entire boundary of $P$ in $\mathcal{O}(n^5)$ time. (iii) The third algorithm, that has an approximation ratio of 27, guards all interior and boundary points of $P$ in $\mathcal{O}(n^5)$ time. Further, these algorithms can be modified to obtain similar approximation ratios while using edge guards. The significance of our results lies in the fact that these results settle the conjecture by Ghosh regarding the existence of constant-factor approximation algorithms for this problem, which has been open since 1987 despite several attempts by researchers. Our approximation algorithms exploit several deep visibility structures of simple polygons which are interesting in their own right.

Submitted 11 April, 2018; v1 submitted 14 December, 2017; originally announced December 2017.

Comments: 39 pages, 31 figures

arXiv:1712.00988 [pdf, ps, other]

End-to-End Relation Extraction using Markov Logic Networks

Authors: Sachin Pawar, Pushpak Bhattacharya, Girish K. Palshikar

Abstract: The task of end-to-end relation extraction consists of two sub-tasks: i) identifying entity mentions along with their types and ii) recognizing semantic relations among the entity mention pairs. It has been shown that for better performance, it is necessary to address these two sub-tasks jointly. We propose an approach for simultaneous extraction of entity mentions and relations in a sentence, by using inference in Markov Logic Networks (MLN). We learn three different classifiers: i) local entity classifier, ii) local relation classifier and iii) "pipeline" relation classifier which uses predictions of the local entity classifier. Predictions of these classifiers may be inconsistent with each other. We represent these predictions along with some domain knowledge using weighted first-order logic rules in an MLN and perform joint inference over the MLN to obtain a global output with minimum inconsistencies. Experiments on the ACE (Automatic Content Extraction) 2004 dataset demonstrate that our approach of joint extraction using MLNs outperforms the baselines of individual classifiers. Our end-to-end relation extraction performance is better than 2 out of 3 previous results reported on the ACE 2004 dataset.

Submitted 4 December, 2017; originally announced December 2017.

arXiv:1712.00640 [pdf, other]

Learning Sparse Adversarial Dictionaries For Multi-Class Audio Classification

Authors: Vaisakh Shaj, Puranjoy Bhattacharya

Abstract: Audio events are quite often overlapping in nature, and more prone to noise than visual signals. There has been increasing evidence for the superior performance of representations learned using sparse dictionaries for applications like audio denoising and speech enhancement. This paper concentrates on modifying the traditional reconstructive dictionary learning algorithms, by incorporating a discriminative term into the objective function in order to learn class-specific adversarial dictionaries that are good at representing samples of their own class at the same time poor at representing samples belonging to any other class. We quantitatively demonstrate the effectiveness of our learned dictionaries as a stand-alone solution for both binary as well as multi-class audio classification problems.

Submitted 2 December, 2017; originally announced December 2017.

Comments: Accepted in Asian Conference of Pattern Recognition (ACPR-2017)

arXiv:1706.03249 [pdf, other]

Characterizing and Predicting Supply-side Engagement on Crowd-contributed Video Sharing Platforms

Authors: Rishabh Mehrotra, Prasanta Bhattacharya

Abstract: Video sharing and entertainment websites have rapidly grown in popularity and now constitute some of the most visited websites on the Internet. Despite the active user engagement on these online video-sharing platforms, most recent research on online media platforms has restricted itself to networking-based social media sites, like Facebook or Twitter. We depart from previous studies in the online media space that have focused exclusively on demand-side user engagement, by modeling the supply-side of the crowd-contributed videos on this platform. The current study is among the first to perform a large-scale empirical study using longitudinal video upload data from a large online video platform. The modeling and subsequent prediction of video uploads is made complicated by the heterogeneity of video types (e.g. popular vs. niche video genres), and the inherent time trend effects associated with media uploads. We identify distinct genre-clusters from our dataset and employ a self-exciting Hawkes point-process model on each of these clusters to fully specify and estimate the video upload process. Additionally, we go beyond prediction to disentangle potential factors that govern user engagement and determine the video upload rates, which improves our analysis with additional explanatory power. Our findings show that using a relatively parsimonious point-process model, we are able to achieve higher model fit, and predict video uploads to the platform with a higher accuracy than competing models.
The findings from this study can benefit platform owners in better understanding how their supply-side users engage with their site over time. We also offer a robust method for performing media upload prediction that is likely to be generalizable across media platforms which demonstrate similar temporal and genre-level heterogeneity.

Submitted 10 June, 2017; originally announced June 2017.

Comments: 8 pages, ICTIR 2017

arXiv:1608.01561 [pdf, ps, other]

Using Word Embeddings for Query Translation for Hindi to English Cross Language Information Retrieval

Authors: Paheli Bhattacharya, Pawan Goyal, Sudeshna Sarkar

Abstract: Cross-Language Information Retrieval (CLIR) has become an important problem to solve in the recent years due to the growth of content in multiple languages in the Web. One of the standard methods is to use query translation from source to target language. In this paper, we propose an approach based on word embeddings, a method that captures contextual clues for a particular word in the source language and gives those words as translations that occur in a similar context in the target language. Once we obtain the word embeddings of the source and target language pairs, we learn a projection from source to target word embeddings, making use of a dictionary with word translation pairs. We then propose various methods of query translation and aggregation. The advantage of this approach is that it does not require the corpora to be aligned (which is difficult to obtain for resource-scarce languages), a dictionary with word translation pairs is enough to train the word vectors for translation. We experiment with Forum for Information Retrieval and Evaluation (FIRE) 2008 and 2012 datasets for Hindi to English CLIR. The proposed word embedding based approach outperforms the basic dictionary based approach by 70% and when the word embeddings are combined with the dictionary, the hybrid approach beats the baseline dictionary based method by 77%. It outperforms the English monolingual baseline by 15%, when combined with the translations obtained from Google Translate and Dictionary.

Submitted 4 August, 2016; originally announced August 2016.

Comments: 17th International Conference on Intelligent Text Processing and Computational Linguistics

arXiv:1512.06469 [pdf]

A Co-evolution Model of Network Structure and User Behavior in Online Social Networks: The Case of Network-Driven Content Generation

Authors: Prasanta Bhattacharya, Tuan Q. Phan, Xue Bai, Edoardo Airoldi

Abstract: With the rapid growth of online social network sites (SNS), it has become imperative for platform owners and online marketers to investigate what drives content production on these platforms. However, previous research has found it difficult to statistically model these factors from observational data due to the inability to separately assess the effects of network formation and network influence. In this paper, we adopt and enhance an actor-oriented continuous-time model to jointly estimate the co-evolution of the users' social network structure and their content production behavior using a Markov Chain Monte Carlo (MCMC)-based simulation approach. Specifically, we offer a method to analyze non-stationary and continuous behavior with network effects in the presence of observable and unobservable covariates, similar to what is observed in social media ecosystems. Leveraging a unique dataset from a large social network site, we apply our model to data on university students across six months to find that: 1) users tend to connect with others that have similar posting behavior, 2) however, after doing so, users tend to diverge in posting behavior, and 3) peer influences are sensitive to the strength of the posting behavior. Further, our method provides researchers and practitioners with a statistically rigorous approach to analyze network effects in observational data. These results provide insights and recommendations for SNS platforms to sustain an active and viable community.

Submitted 26 November, 2018; v1 submitted 20 December, 2015; originally announced December 2015.

arXiv:1409.4621 [pdf, other]

Approximability of Guarding Weak Visibility Polygons

Authors: Pritam Bhattacharya, Subir Kumar Ghosh, Bodhayan Roy

Abstract: The art gallery problem enquires about the least number of guards that are sufficient to ensure that an art gallery, represented by a polygon $P$, is fully guarded. In 1998, the problems of finding the minimum number of point guards, vertex guards, and edge guards required to guard $P$ were shown to be APX-hard by Eidenbenz, Widmayer and Stamm. In 1987, Ghosh presented approximation algorithms for vertex guards and edge guards that achieved a ratio of $\mathcal{O}(\log n)$, which was improved up to $\mathcal{O}(\log\log OPT)$ by King and Kirkpatrick in 2011. It has been conjectured that constant-factor approximation algorithms exist for these problems. We settle the conjecture for the special class of polygons that are weakly visible from an edge and contain no holes by presenting a 6-approximation algorithm for finding the minimum number of vertex guards that runs in $\mathcal{O}(n^2)$ time. On the other hand, for weak visibility polygons with holes, we present a reduction from the Set Cover problem to show that there cannot exist a polynomial time algorithm for the vertex guard problem with an approximation ratio better than $((1 - \epsilon)/12)\ln n$ for any $\epsilon > 0$, unless NP=P. We also show that, for the special class of polygons without holes that are orthogonal as well as weakly visible from an edge, the approximation ratio can be improved to 3. Finally, we consider the Point Guard problem and show that it is NP-hard in the case of polygons weakly visible from an edge.

Submitted 30 April, 2016; v1 submitted 16 September, 2014; originally announced September 2014.

Comments: 23 pages, 21 figures, 30 citations

arXiv:1407.8476 [pdf]

A comparative study between seasonal wind speed by Fourier and Wavelet analysis

Authors: Sabyasachi Mukhopadhyay, Debadatta Dash, Asish Mitra, Paritosh Bhattacharya

Abstract: Wind energy is a useful renewable energy resource. Wind speed plays a vital role in calculating the wind energy of a given location, so it is necessary to know the characteristics of wind speed data. In this paper, Fourier and wavelet transforms are applied to study the wind speed data. We compare winter and summer wind speeds using various discrete wavelets, namely Haar and Daubechies-4 (Db-4). The periodicity of wind speed is also checked using a Continuous Wavelet Transform such as Morlet (MCWT). Thereafter, a comparative study is done to detect the periodicity of both summer and winter. Wavelet coherence between these two data sets is then checked to extract the phase coherency information.

Submitted 6 August, 2014; v1 submitted 31 July, 2014; originally announced July 2014.

arXiv:1401.2230 [pdf]

doi 10.5121/ijwmn.2013.5610

An ANN Based Call Handoff Management Scheme for Mobile Cellular Network

Authors: P. P. Bhattacharya, Ananya Sarkar, Indranil Sarkar, Subhajit Chatterjee

Abstract: Handoff decisions are usually signal strength based because of simplicity and effectiveness. Apart from the conventional techniques, such as threshold and hysteresis based schemes, many artificial intelligence techniques such as Fuzzy Logic and Artificial Neural Networks (ANN) have recently also been used for making handoff decisions. In this paper, an Artificial Neural Network based handoff algorithm is proposed and its performance is studied. We have used an ANN here for making fast and accurate handoff decisions. In our proposed handoff algorithm, a Backpropagation Neural Network model is used. The advantages of the backpropagation method are its simplicity and reasonable speed. The algorithm is designed, tested and found to give optimum results.

Submitted 10 January, 2014; originally announced January 2014.

Comments: 11 pages. arXiv admin note: text overlap with arXiv:1004.1794 by other authors

arXiv:1310.1590 [pdf, ps, other]

Evolution of the Modern Phase of Written Bangla: A Statistical Study

Authors: Paheli Bhattacharya, Arnab Bhattacharya

Abstract: Active languages such as Bangla (or Bengali) evolve over time due to a variety of social, cultural, economic, and political issues. In this paper, we analyze the change in the written form of the modern phase of Bangla quantitatively in terms of character-level, syllable-level, morpheme-level and word-level features. We collect three different types of corpora---classical, newspapers and blogs---and test whether the differences in their features are statistically significant. Results suggest that there are significant changes in the length of a word when measured in terms of characters, but there is not much difference in usage of different characters, syllables and morphemes in a word or of different words in a sentence. To the best of our knowledge, this is the first work on Bangla of this kind.

Submitted 6 October, 2013; originally announced October 2013.

Comments: LCC 2013

ACM Class: I.2.7

arXiv:1309.3513 [pdf]

Application of Vertex coloring in a particular triangular closed path structure and in Krafts inequality

Authors: Sabyasachi Mukhopadhyay, Paritosh Bhattacharya, B. B. Ghosh

Abstract: A good deal of research on coloring the vertices of graphs has been done and published over the years. Inspired by that work, we study the vertex coloring of graphs for a particular triangular closed path structure obtained from the front view of a pyramidal structure. From this we observe a repetitive nature of vertex coloring for odd and even numbers of horizontal lines within the triangular structure. We then apply this repetitive nature of vertex coloring to a binary tree, in connection with Kraft's inequality. Our work mainly deals with vertex coloring of a particular triangular closed path and the repetition of the vertex coloring nature in the case of Kraft's inequality in the field of Information Theory and Coding.

Submitted 22 August, 2013; originally announced September 2013.

arXiv:1210.2940 [pdf]

A review on routing protocols for application in wireless sensor networks

Authors: Neha Rathi, Jyoti Saraswat, Partha Pratim Bhattacharya

Abstract: Wireless sensor networks (WSNs) are severely constrained by storage capacity, energy and computing power. It is therefore essential to design effective and energy-aware protocols in order to enhance the network lifetime. In this paper, we review routing protocols in WSNs, which are classified as data-centric, hierarchical and location-based depending on the network structure. Some of the multipath routing protocols that are widely used in WSNs to improve network performance are also discussed. Advantages and disadvantages of each routing algorithm are discussed thereafter. Furthermore, this paper compares and summarizes the performance of the routing protocols.

Submitted 10 October, 2012; originally announced October 2012.

Comments: 20 pages, 16 figures, 2 tables

arXiv:1205.2269 [pdf]

Performance improvement in OFDM system by PAPR reduction

Authors: Suverna Sengar, Partha Pratim Bhattacharya

Abstract: Orthogonal Frequency Division Multiplexing (OFDM) is an efficient method of data transmission for high speed communication systems. However, the main drawback of an OFDM system is the high Peak to Average Power Ratio (PAPR) of the transmitted signals. An OFDM signal consists of a large number of independent subcarriers, as a result of which its amplitude can have high peak values. Coding, phase rotation and clipping are among the many PAPR reduction schemes that have been proposed to overcome this problem. Here two different PAPR reduction methods, partial transmit sequence (PTS) and selective mapping (SLM), are used to reduce PAPR. Significant reduction in PAPR has been achieved using these techniques. The performances of the two methods are then compared.

Submitted 9 May, 2012; originally announced May 2012.

Comments: 13 pages, 8 figures, 1 Table, Signal & Image Processing : An International Journal (SIPIJ) Vol.3, No.2, April 2012

arXiv:1201.1964 [pdf]

A Survey on Dynamic Spectrum Access Techniques for Cognitive Radio

Authors: Anita Garhwal, Partha Pratim Bhattacharya

Abstract: Cognitive radio (CR) is a new paradigm that utilizes the available spectrum band. The key characteristic of a CR system is that it senses the electromagnetic environment in order to adapt its operation and dynamically vary its radio operating parameters. The technique of dynamically accessing the unused spectrum band is known as Dynamic Spectrum Access (DSA). The dynamic spectrum access technology helps to minimize unused spectrum bands. In this paper, the main functions of CR, i.e. spectrum sensing, spectrum management, spectrum mobility and spectrum sharing, are discussed. Then DSA models are discussed along with different methods of DSA such as Command and Control, Exclusive-Use, Shared Use of Primary Licensed User and Commons method. A game-theoretic approach using the Bertrand game model, a Markovian Queuing Model for spectrum allocation in centralized architecture and a Fuzzy logic based method are also discussed and results are shown.

Submitted 9 January, 2012; originally announced January 2012.

Comments: arXiv admin note: text overlap with http://www.ijetch.org/papers/206-Z058.pdf by other authors

arXiv:1112.2248 [pdf]

A Survey on Cooperative Diversity and Its Applications in Various Wireless Networks

Authors: Gurpreet Kaur, Partha Pratim Bhattacharya

Abstract: Cooperative diversity is a technique in which various radio terminals relay signals for each other. Cooperative diversity results when cooperative communications is used primarily to leverage the spatial diversity available among distributed radios. In this paper, different cooperative diversity schemes and their applications in various wireless networks are discussed, along with the impact of cooperative diversity on the energy consumption and lifetime of sensor networks and the impact of cooperation in cognitive radio. User scheduling and radio resource allocation techniques, developed in order to efficiently integrate various cooperative diversity schemes for the emerging IEEE 802.16j based systems, are also discussed.

Submitted 9 December, 2011; originally announced December 2011.

Comments: 20 pages

Journal ref: International Journal of Computer Science & Engineering Survey (IJCSES) Vol.2, No.4, November 2011

arXiv:1109.0257 [pdf]

Smart Radio Spectrum Management for Cognitive Radio

Authors: Partha Pratim Bhattacharya, Ronak Khandelwal, Rishita Gera, Anjali Agarwal

Abstract: Today's wireless networks are characterized by a fixed spectrum assignment policy. The limited available spectrum and the inefficiency in the spectrum usage necessitate a new communication paradigm to exploit the existing wireless spectrum opportunistically. Cognitive radio is a paradigm for wireless communication in which either a network or a wireless node changes its transmission or reception parameters to communicate efficiently while avoiding interference with licensed or unlicensed users. In this work, a fuzzy logic based system for spectrum management is proposed where the radio can share unused spectrum depending on parameters such as distance, signal strength, node velocity and availability of unused spectrum. The system is simulated and is found to give satisfactory results.

Submitted 5 August, 2011; originally announced September 2011.

Comments: 13 pages, 11 figures

Journal ref: International Journal of Parallel and Distributed Systems, Vol. 2, No. 4, July 2011

Showing 1–44 of 44 results for author: Bhattacharya, P