-
Information Security and Privacy in the Digital World: Some Selected Topics
Authors:
Jaydip Sen,
Joceli Mayer,
Subhasis Dasgupta,
Subrata Nandi,
Srinivasan Krishnaswamy,
Pinaki Mitra,
Mahendra Pratap Singh,
Naga Prasanthi Kundeti,
Chandra Sekhara Rao MVP,
Sudha Sree Chekuri,
Seshu Babu Pallapothu,
Preethi Nanjundan,
Jossy P. George,
Abdelhadi El Allahi,
Ilham Morino,
Salma AIT Oussous,
Siham Beloualid,
Ahmed Tamtaoui,
Abderrahim Bajit
Abstract:
In the era of generative artificial intelligence and the Internet of Things, while there is explosive growth in the volume of data and the associated need for processing, analysis, and storage, several new challenges are faced in identifying spurious and fake information and protecting the privacy of sensitive data. This has led to an increasing demand for more robust and resilient schemes for authentication, integrity protection, encryption, non-repudiation, and privacy-preservation of data. The chapters in this book present some of the state-of-the-art research works in the field of cryptography and security in computing and communications.
Submitted 29 March, 2024;
originally announced April 2024.
-
Prompt Perturbation Consistency Learning for Robust Language Models
Authors:
Yao Qiang,
Subhrangshu Nandi,
Ninareh Mehrabi,
Greg Ver Steeg,
Anoop Kumar,
Anna Rumshisky,
Aram Galstyan
Abstract:
Large language models (LLMs) have demonstrated impressive performance on a number of natural language processing tasks, such as question answering and text summarization. However, their performance on sequence labeling tasks such as intent classification and slot filling (IC-SF), which is a central component in personal assistant systems, lags significantly behind discriminative models. Furthermore, there is a lack of substantive research on the robustness of LLMs to various perturbations in the input prompts. The contributions of this paper are three-fold. First, we show that fine-tuning sufficiently large LLMs can produce IC-SF performance comparable to discriminative models. Next, we systematically analyze the performance deterioration of those fine-tuned models due to three distinct yet relevant types of input perturbations - oronyms, synonyms, and paraphrasing. Finally, we propose an efficient mitigation approach, Prompt Perturbation Consistency Learning (PPCL), which works by regularizing the divergence between losses from clean and perturbed samples. Our experiments demonstrate that PPCL can recover on average 59% and 69% of the performance drop for IC and SF tasks, respectively. Furthermore, PPCL beats the data augmentation approach while using ten times fewer augmented data samples.
Submitted 24 February, 2024;
originally announced February 2024.
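The consistency regularization described in the PPCL abstract above can be illustrated with a small sketch. It assumes a Jensen-Shannon divergence between the model's output distributions for a clean prompt and its perturbed counterpart, added to the two cross-entropy terms. The abstract does not specify the divergence or its weighting, so `js_divergence`, `consistency_loss`, and `lam` are hypothetical names and choices, not the authors' implementation.

```python
import math

def cross_entropy(probs, label):
    # Negative log-likelihood of the gold label under a probability vector.
    return -math.log(probs[label])

def kl(p, q):
    # KL(p || q); terms with p_i == 0 contribute nothing.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    # Symmetric Jensen-Shannon divergence (an assumed choice of divergence).
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def consistency_loss(p_clean, p_pert, label, lam=1.0):
    # Task loss on both inputs plus a penalty when the two outputs disagree.
    task = cross_entropy(p_clean, label) + cross_entropy(p_pert, label)
    return task + lam * js_divergence(p_clean, p_pert)

# Same gold-label probability in both cases; only the agreement differs.
agree = consistency_loss([0.7, 0.2, 0.1], [0.7, 0.2, 0.1], label=0)
disagree = consistency_loss([0.7, 0.2, 0.1], [0.7, 0.1, 0.2], label=0)
```

In this toy setup the penalty term vanishes when the clean and perturbed distributions coincide, so `disagree` exceeds `agree` only through the divergence term.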
-
Countably Colorful Hyperplane Transversal
Authors:
Sutanoya Chakraborty,
Arijit Ghosh,
Soumi Nandi
Abstract:
Let $\left\{ \mathcal{F}_{n}\right\}_{n \in \mathbb{N}}$ be an infinite sequence of families of compact connected sets in $\mathbb{R}^{d}$. An infinite sequence of compact connected sets $\left\{ B_{n} \right\}_{n\in \mathbb{N}}$ is called a heterochromatic sequence from $\left\{ \mathcal{F}_{n}\right\}_{n \in \mathbb{N}}$ if there exists an infinite sequence $\left\{ i_{n} \right\}_{n\in \mathbb{N}}$ of natural numbers satisfying the following two properties: (a) $\{i_{n}\}_{n\in \mathbb{N}}$ is a monotonically increasing sequence, and (b) for all $n \in \mathbb{N}$, we have $B_{n} \in \mathcal{F}_{i_n}$. We show that if every heterochromatic sequence from $\left\{ \mathcal{F}_{n}\right\}_{n \in \mathbb{N}}$ contains $d+1$ sets that can be pierced by a single hyperplane, then there exists a finite collection $\mathcal{H}$ of hyperplanes from $\mathbb{R}^{d}$ that pierces all but finitely many families from $\left\{ \mathcal{F}_{n}\right\}_{n \in \mathbb{N}}$. As a direct consequence of our result, we get that if every countable subcollection from an infinite family $\mathcal{F}$ of compact connected sets in $\mathbb{R}^{d}$ contains $d+1$ sets that can be pierced by a single hyperplane, then $\mathcal{F}$ can be pierced by finitely many hyperplanes. To establish the optimality of our result we show that, for all $d \in \mathbb{N}$, there exists an infinite sequence $\left\{ \mathcal{F}_{n}\right\}_{n \in \mathbb{N}}$ of families of compact connected sets satisfying the following two conditions: (1) for all $n \in \mathbb{N}$, $\mathcal{F}_{n}$ is not pierceable by finitely many hyperplanes, and (2) for any $m \in \mathbb{N}$ and every sequence $\left\{B_n\right\}_{n=m}^{\infty}$ of compact connected sets in $\mathbb{R}^d$, where $B_i\in\mathcal{F}_i$ for all $i \geq m$, there exists a hyperplane in $\mathbb{R}^d$ that pierces at least $d+1$ sets in the sequence.
Submitted 15 February, 2024;
originally announced February 2024.
-
Part-of-Speech Tagger for Bodo Language using Deep Learning approach
Authors:
Dhrubajyoti Pathak,
Sanjib Narzary,
Sukumar Nandi,
Bidisha Som
Abstract:
Language processing systems such as part-of-speech tagging, named entity recognition, machine translation, speech recognition, and language modeling (LM) are well studied in high-resource languages. Nevertheless, research on these systems for several low-resource languages, including Bodo, Mizo, Nagamese, and others, is either yet to commence or is in its nascent stages. Language models play a vital role in the downstream tasks of modern NLP. Extensive studies have been carried out on LMs for high-resource languages. Nevertheless, languages such as Bodo, Rabha, and Mising continue to lack coverage. In this study, we first present BodoBERT, a language model for the Bodo language. To the best of our knowledge, this work is the first such effort to develop a language model for Bodo. Secondly, we present an ensemble DL-based POS tagging model for Bodo. The POS tagging model is based on combinations of BiLSTM with CRF and stacked embedding of BodoBERT with BytePairEmbeddings. We cover several language models in the experiment to see how well they work in POS tagging tasks. The best-performing model achieves an F1 score of 0.8041. A comparative experiment was also conducted on Assamese POS taggers, considering that the language is spoken in the same region as Bodo.
Submitted 6 January, 2024;
originally announced January 2024.
-
Spreadsheet-based Configuration of Families of Real-Time Specifications
Authors:
José Proença,
David Pereira,
Giann Spilere Nandi,
Sina Borrami,
Jonas Melchert
Abstract:
Model checking real-time systems is complex, and requires a careful trade-off between including enough detail to be useful and not too much detail to avoid state explosion. This work exploits variability of the formal model being analysed and the requirements being checked, to facilitate the model-checking of variations of real-time specifications. This work results from the collaboration between academics and Alstom, a railway company with a concrete use-case, in the context of the VALU3S European project. The configuration of the variability of the formal specifications is described in MS Excel spreadsheets with a particular structure, making it easy for developers to use as well. These spreadsheets are processed automatically by our prototype tool that generates instances and runs the model checker. We propose the extension of our previous work by exploiting analysis over valid combinations of features, while preserving the simplicity of a spreadsheet-based interface with the model checker.
Submitted 31 October, 2023;
originally announced October 2023.
-
A multiple k-means cluster ensemble framework for clustering citation trajectories
Authors:
Joyita Chakraborty,
Dinesh K. Pradhan,
Subrata Nandi
Abstract:
Citation maturity time varies for different articles. However, the impact of all articles is measured in a fixed window. Clustering their citation trajectories helps understand the knowledge diffusion process and reveals that not all articles gain immediate success after publication. Moreover, clustering trajectories is necessary for paper impact recommendation algorithms. It is a challenging problem because citation time series exhibit significant variability due to non-linear and non-stationary characteristics. Prior works propose a set of arbitrary thresholds and a fixed rule-based approach. All methods are primarily parameter-dependent. Consequently, this leads to inconsistencies in defining similar trajectories and ambiguities regarding their specific number. Most studies only capture extreme trajectories. Thus, a generalised clustering framework is required. This paper proposes a feature-based multiple k-means cluster ensemble framework. 1,95,783 and 41,732 well-cited articles from the Microsoft Academic Graph data are considered for clustering short-term (10-year) and long-term (30-year) trajectories, respectively. The framework has linear run time. Four distinct trajectories are obtained: Early Rise Rapid Decline (2.2%), Early Rise Slow Decline (45%), Delayed Rise No Decline (53%), and Delayed Rise Slow Decline (0.8%). Individual trajectory differences for two different spans are studied. Most papers exhibit Early Rise Slow Decline and Delayed Rise No Decline patterns. The growth and decay times, cumulative citation distribution, and peak characteristics of individual trajectories are redefined empirically. A detailed comparative study reveals that our proposed methodology can detect all distinct trajectory classes.
Submitted 10 September, 2023;
originally announced September 2023.
-
A balanced Memristor-CMOS ternary logic family and its application
Authors:
Xiao-Yuan Wang,
Jia-Wei Zhou,
Chuan-Tao Dong,
Xin-Hui Chen,
Sanjoy Kumar Nandi,
Robert G. Elliman,
Sung-Mo Kang,
Herbert Ho-Ching Iu
Abstract:
The design of balanced ternary digital logic circuits based on memristors and conventional CMOS devices is proposed. First, balanced ternary minimum gate TMIN, maximum gate TMAX and ternary inverters are systematically designed and verified by simulation, and then logic circuits such as ternary encoders, decoders and multiplexers are designed on this basis. Two different schemes are then used to realize the design of functional combinational logic circuits such as a balanced ternary half adder, multiplier, and numerical comparator. Finally, we report a series of comparisons and analyses of the two design schemes, which provide a reference for subsequent research and development of three-valued logic circuits.
Submitted 4 September, 2023;
originally announced September 2023.
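The gates named in the abstract above can be sketched at the logic level (not the circuit level) as follows. This assumes the usual balanced ternary encoding with trits -1, 0, and +1, where TMIN and TMAX are the minimum and maximum of their inputs and the standard inverter negates its input. The function names are hypothetical, and the sketch says nothing about the memristor-CMOS realization proposed in the paper.

```python
TRITS = (-1, 0, 1)

def tmin(a, b):
    # Ternary minimum gate (the balanced ternary analogue of AND).
    return min(a, b)

def tmax(a, b):
    # Ternary maximum gate (the balanced ternary analogue of OR).
    return max(a, b)

def tinv(a):
    # Standard ternary inverter: swaps -1 and +1, leaves 0 unchanged.
    return -a

def half_adder(a, b):
    # Returns (sum_trit, carry_trit) with a + b == sum_trit + 3 * carry_trit.
    total = a + b                # ranges over -2..2
    carry = (total + 1) // 3     # -1 for -2, +1 for 2, otherwise 0
    return total - 3 * carry, carry

# Truth table of the half adder over all nine input pairs.
table = {(a, b): half_adder(a, b) for a in TRITS for b in TRITS}
```

For example, adding +1 and +1 gives a sum trit of -1 with a carry of +1, since 2 = -1 + 3.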
-
Acoustic-to-articulatory inversion for dysarthric speech: Are pre-trained self-supervised representations favorable?
Authors:
Sarthak Kumar Maharana,
Krishna Kamal Adidam,
Shoumik Nandi,
Ajitesh Srivastava
Abstract:
Acoustic-to-articulatory inversion (AAI) involves mapping from the acoustic to the articulatory space. Signal-processing features such as MFCCs have been widely used for the AAI task. For subjects with dysarthric speech, AAI is challenging because of imprecise and indistinct pronunciation. In this work, we perform AAI for dysarthric speech using representations from pre-trained self-supervised learning (SSL) models. We demonstrate the impact of different pre-trained features on this challenging AAI task under low-resource conditions. In addition, we condition x-vectors to the extracted SSL features to train a BLSTM network. In the seen case, we experiment with three AAI training schemes (subject-specific, pooled, and fine-tuned). The results, consistent across training schemes, reveal that DeCoAR, in the fine-tuned scheme, achieves a relative improvement of the Pearson Correlation Coefficient (CC) by ~1.81% and ~4.56% for healthy controls and patients, respectively, over MFCCs. We observe similar average trends for different SSL features in the unseen case. Overall, SSL networks like wav2vec, APC, and DeCoAR, trained with feature reconstruction or future timestep prediction tasks, perform well in predicting dysarthric articulatory trajectories.
Submitted 9 February, 2024; v1 submitted 3 September, 2023;
originally announced September 2023.
-
Dimension Independent Helly Theorem for Lines and Flats
Authors:
Sutanoya Chakraborty,
Arijit Ghosh,
Soumi Nandi
Abstract:
We give a generalization of dimension independent Helly Theorem of Adiprasito, Bárány, Mustafa, and Terpai (Discrete & Computational Geometry 2022) to higher dimensional transversal. We also prove some impossibility results that establish the tightness of our extension.
Submitted 21 August, 2023;
originally announced August 2023.
-
Stabbing boxes with finitely many axis-parallel lines and flats
Authors:
Sutanoya Chakraborty,
Arijit Ghosh,
Soumi Nandi
Abstract:
We give necessary and sufficient condition for an infinite collection of axis-parallel boxes in $\mathbb{R}^{d}$ to be pierceable by finitely many axis-parallel $k$-flats, where $0 \leq k < d$. We also consider colorful generalizations of the above result and establish their feasibility. The problem considered in this paper is an infinite variant of the Hadwiger-Debrunner $(p,q)$-problem.
Submitted 21 August, 2023;
originally announced August 2023.
-
TrainFors: A Large Benchmark Training Dataset for Image Manipulation Detection and Localization
Authors:
Soumyaroop Nandi,
Prem Natarajan,
Wael Abd-Almageed
Abstract:
The evaluation datasets and metrics for image manipulation detection and localization (IMDL) research have been standardized. However, the training dataset for such a task is still nonstandard. Previous researchers have used unconventional and deviating datasets to train neural networks for detecting image forgeries and localizing pixel maps of manipulated regions. For a fair comparison, the training set, test set, and evaluation metrics should be consistent. Hence, comparing the existing methods may not seem fair as the results depend heavily on the training datasets as well as the model architecture. Moreover, none of the previous works release the synthetic training dataset used for the IMDL task. We propose a standardized benchmark training dataset for image splicing, copy-move forgery, removal forgery, and image enhancement forgery. Furthermore, we identify the problems with the existing IMDL datasets and propose the required modifications. We also train the state-of-the-art IMDL methods on our proposed TrainFors dataset for a fair evaluation and report the actual performance of these methods under similar conditions.
Submitted 9 August, 2023;
originally announced August 2023.
-
A Privacy-Preserving Blockchain-based E-voting System
Authors:
Arnab Mukherjee,
Souvik Majumdar,
Anup Kumar Kolya,
Saborni Nandi
Abstract:
Within a modern democratic nation, elections play a significant role in the nation's functioning. However, with the existing infrastructure for conducting elections using Electronic Voting Systems (EVMs), many loopholes exist, which illegitimate entities might leverage to cast false votes or even tamper with the EVMs after the voting session is complete. The need of the hour is to introduce a robust, auditable, transparent, and tamper-proof e-voting system, enabling a more reliable and fair election process. To address such concerns, we propose a novel solution for blockchain-based e-voting, focusing on the security and privacy aspects of the e-voting process. We consider the security risks and loopholes and aim to preserve the anonymity of the voters while ensuring that illegitimate votes are properly handled. Additionally, we develop a prototype as a proof of concept using the Ethereum blockchain platform. Finally, we perform experiments to demonstrate the performance of the system.
Submitted 17 July, 2023;
originally announced July 2023.
-
A Scheme to resist Fast Correlation Attack for Word Oriented LFSR based Stream Cipher
Authors:
Subrata Nandi,
Srinivasan Krishnaswamy,
Pinaki Mitra
Abstract:
In LFSR-based stream ciphers, the knowledge of the feedback equation of the LFSR plays a critical role in most attacks. In word-based stream ciphers such as those in the SNOW series, even if the feedback configuration is hidden, knowing the characteristic polynomial of the state transition matrix of the LFSR enables the attacker to create a feedback equation over $GF(2)$. This, in turn, can be used to launch fast correlation attacks. In this work, we propose a method for hiding both the feedback equation of a word-based LFSR and the characteristic polynomial of the state transition matrix. Here, we employ a $z$-primitive $σ$-LFSR whose characteristic polynomial is randomly sampled from the distribution of primitive polynomials over $GF(2)$ of the appropriate degree. We propose an algorithm for locating $z$-primitive $σ$-LFSR configurations of a given degree. Further, an invertible matrix is generated from the key. This is then employed to generate a public parameter which is used to retrieve the feedback configuration using the key. If the key size is $n$ bits, the process of retrieving the feedback equation from the public parameter has an average time complexity of $\mathcal{O}(2^{n-1})$. The proposed method has been tested on SNOW 2.0 and SNOW 3G for resistance to fast correlation attacks. We have demonstrated that the security of SNOW 2.0 and SNOW 3G increases from 128 bits to 256 bits.
Submitted 5 July, 2023;
originally announced July 2023.
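To make the notion of a feedback equation over $GF(2)$ in the abstract above concrete, here is a toy bit-level LFSR. It is only an illustration under simplifying assumptions: a 4-bit register with the recurrence $s_{t+4} = s_{t+1} + s_t$, whose characteristic polynomial $x^4 + x + 1$ is primitive. It is not the word-oriented $σ$-LFSR construction of the paper, and `lfsr_stream` is a hypothetical helper name.

```python
def lfsr_stream(state, taps, n):
    # state: initial register bits [s_0, ..., s_{k-1}]
    # taps:  register positions XORed together to form the feedback bit
    # n:     number of output bits to produce
    state = list(state)
    out = []
    for _ in range(n):
        out.append(state[0])          # output the oldest bit
        feedback = 0
        for t in taps:
            feedback ^= state[t]      # addition over GF(2) is XOR
        state = state[1:] + [feedback]
    return out

# Recurrence s_{t+4} = s_t XOR s_{t+1}, i.e. polynomial x^4 + x + 1.
bits = lfsr_stream([1, 0, 0, 0], taps=(0, 1), n=30)
```

Because the polynomial is primitive, any nonzero initial state cycles through all 15 nonzero register states before repeating. An attacker who knows this recurrence can relate output bits linearly, which is why hiding the feedback equation matters in the setting described above.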
-
On $(n,m)$-chromatic numbers of graphs having bounded sparsity parameters
Authors:
Sandip Das,
Abhiruk Lahiri,
Soumen Nandi,
Sagnik Sen,
S Taruni
Abstract:
An $(n,m)$-graph is characterised by having $n$ types of arcs and $m$ types of edges. A homomorphism of an $(n,m)$-graph $G$ to an $(n,m)$-graph $H$ is a vertex mapping that preserves adjacency, direction, and type. The $(n,m)$-chromatic number of $G$, denoted by $χ_{n,m}(G)$, is the minimum value of $|V(H)|$ such that there exists a homomorphism of $G$ to $H$. The theory of homomorphisms of $(n,m)$-graphs has connections with graph theoretic concepts like harmonious coloring, nowhere-zero flows; with other mathematical topics like binary predicate logic, Coxeter groups; and has application to the Query Evaluation Problem (QEP) in graph databases.
In this article, we show that the arboricity of $G$ is bounded by a function of $χ_{n,m}(G)$ but not the other way around. Additionally, we show that the acyclic chromatic number of $G$ is bounded by a function of $χ_{n,m}(G)$, a result already known in the reverse direction. Furthermore, we prove that the $(n,m)$-chromatic number for the family of graphs with a maximum average degree less than $2+ \frac{2}{4(2n+m)-1}$, including the subfamily of planar graphs with girth at least $8(2n+m)$, equals $2(2n+m)+1$. This improves upon previous findings, which proved the $(n,m)$-chromatic number for planar graphs with girth at least $10(2n+m)-4$ is $2(2n+m)+1$.
It is established that the $(n,m)$-chromatic number for the family $\mathcal{T}_2$ of partial $2$-trees is both bounded below and above by quadratic functions of $(2n+m)$, with the lower bound being tight when $(2n+m)=2$. We prove $14 \leq χ_{(0,3)}(\mathcal{T}_2) \leq 15$ and $14 \leq χ_{(1,1)}(\mathcal{T}_2) \leq 21$ which improves both known lower bounds and the former upper bound. Moreover, for the latter upper bound, to the best of our knowledge we provide the first theoretical proof.
Submitted 4 March, 2024; v1 submitted 13 June, 2023;
originally announced June 2023.
-
On coloring parameters of triangle-free planar $(n,m)$-graphs
Authors:
Soumen Nandi,
Sagnik Sen,
S Taruni
Abstract:
An $(n,m)$-graph is a graph with $n$ types of arcs and $m$ types of edges. A homomorphism of an $(n,m)$-graph $G$ to another $(n,m)$-graph $H$ is a vertex mapping that preserves the adjacencies along with their types and directions. The order of a smallest (with respect to the number of vertices) such $H$ is the $(n,m)$-chromatic number of $G$. Moreover, an $(n,m)$-relative clique $R$ of an $(n,m)$-graph $G$ is a vertex subset of $G$ for which no two distinct vertices of $R$ get identified under any homomorphism of $G$. The $(n,m)$-relative clique number of $G$, denoted by $ω_{r(n,m)}(G)$, is the maximum $|R|$ such that $R$ is an $(n,m)$-relative clique of $G$. In practice, $(n,m)$-relative cliques are often used for establishing lower bounds of $(n,m)$-chromatic number of graph families.
Generalizing an open problem posed by Sopena [Discrete Mathematics 2016] in his latest survey on oriented coloring, Chakroborty, Das, Nandi, Roy and Sen [Discrete Applied Mathematics 2022] conjectured that $ω_{r(n,m)}(G) \leq 2 (2n+m)^2 + 2$ for any triangle-free planar $(n,m)$-graph $G$ and that this bound is tight for all $(n,m) \neq (0,1)$. In this article, we positively settle this conjecture by improving the previous upper bound of $ω_{r(n,m)}(G) \leq 14 (2n+m)^2 + 2$ to $ω_{r(n,m)}(G) \leq 2 (2n+m)^2 + 2$, and by finding examples of triangle-free planar graphs that achieve this bound. As a consequence of the tightness proof, we also establish a new lower bound of $2 (2n+m)^2 + 2$ for the $(n,m)$-chromatic number for the family of triangle-free planar graphs.
Submitted 15 October, 2023; v1 submitted 13 June, 2023;
originally announced June 2023.
-
Measuring and Mitigating Local Instability in Deep Neural Networks
Authors:
Arghya Datta,
Subhrangshu Nandi,
Jingcheng Xu,
Greg Ver Steeg,
He Xie,
Anoop Kumar,
Aram Galstyan
Abstract:
Deep Neural Networks (DNNs) are becoming integral components of real-world services relied upon by millions of users. Unfortunately, architects of these systems can find it difficult to ensure reliable performance as irrelevant details like random initialization can unexpectedly change the outputs of a trained system with potentially disastrous consequences. We formulate the model stability problem by studying how the predictions of a model change, even when it is retrained on the same data, as a consequence of stochasticity in the training process. For Natural Language Understanding (NLU) tasks, we find instability in predictions for a significant fraction of queries. We formulate principled metrics, like per-sample ``label entropy'' across training runs or within a single training run, to quantify this phenomenon. Intriguingly, we find that unstable predictions do not appear at random, but rather appear to be clustered in data-specific ways. We study data-agnostic regularization methods to improve stability and propose new data-centric methods that exploit our local stability estimates. We find that our localized data-specific mitigation strategy dramatically outperforms data-agnostic methods, and comes within 90% of the gold standard, achieved by ensembling, at a fraction of the computational cost.
Submitted 18 May, 2023; v1 submitted 17 May, 2023;
originally announced May 2023.
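The per-sample label entropy mentioned in the abstract above can be sketched as the Shannon entropy of the labels that independently trained runs assign to one input. The abstract does not give the exact formula, so this is a plausible reading rather than the authors' definition; `label_entropy` and the example labels are hypothetical.

```python
import math
from collections import Counter

def label_entropy(predictions):
    # predictions: the label predicted for ONE sample by each training run.
    # Returns Shannon entropy in nats; 0 means every run agreed.
    counts = Counter(predictions)
    n = len(predictions)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# Five retrained models agree on the first query and split on the second.
stable = label_entropy(["play_music"] * 5)
unstable = label_entropy(["play_music", "play_music", "set_alarm",
                          "set_alarm", "get_weather"])
```

Samples with high entropy are those whose predicted label depends on training stochasticity, which is the local instability the abstract sets out to measure and mitigate.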
-
Certified Adversarial Robustness Within Multiple Perturbation Bounds
Authors:
Soumalya Nandi,
Sravanti Addepalli,
Harsh Rangwani,
R. Venkatesh Babu
Abstract:
Randomized smoothing (RS) is a well known certified defense against adversarial attacks, which creates a smoothed classifier by predicting the most likely class under random noise perturbations of inputs during inference. While initial work focused on robustness to $\ell_2$ norm perturbations using noise sampled from a Gaussian distribution, subsequent works have shown that different noise distributions can result in robustness to other $\ell_p$ norm bounds as well. In general, a specific noise distribution is optimal for defending against a given $\ell_p$ norm based attack. In this work, we aim to improve the certified adversarial robustness against multiple perturbation bounds simultaneously. Towards this, we firstly present a novel \textit{certification scheme}, that effectively combines the certificates obtained using different noise distributions to obtain optimal results against multiple perturbation bounds. We further propose a novel \textit{training noise distribution} along with a \textit{regularized training scheme} to improve the certification within both $\ell_1$ and $\ell_2$ perturbation norms simultaneously. Contrary to prior works, we compare the certified robustness of different training algorithms across the same natural (clean) accuracy, rather than across fixed noise levels used for training and certification. We also empirically invalidate the argument that training and certifying the classifier with the same amount of noise gives the best results. The proposed approach achieves improvements on the ACR (Average Certified Radius) metric across both $\ell_1$ and $\ell_2$ perturbation bounds.
Submitted 20 April, 2023;
originally announced April 2023.
-
On locating and neighbor-locating colorings of sparse graphs
Authors:
Dipayan Chakraborty,
Florent Foucaud,
Soumen Nandi,
Sagnik Sen,
D K Supraja
Abstract:
A proper $k$-coloring of a graph $G$ is a \emph{neighbor-locating $k$-coloring} if for each pair of vertices in the same color class, the two sets of colors found in their respective neighborhoods are different. The \textit{neighbor-locating chromatic number} $χ_{NL}(G)$ is the minimum $k$ for which $G$ admits a neighbor-locating $k$-coloring. A proper $k$-vertex-coloring of a graph $G$ is a \emph{locating $k$-coloring} if for each pair of vertices $x$ and $y$ in the same color-class, there exists a color class $S_i$ such that $d(x,S_i)\neq d(y,S_i)$. The locating chromatic number $χ_{L}(G)$ is the minimum $k$ for which $G$ admits a locating $k$-coloring. Our main results concern the largest possible order of a sparse graph of given neighbor-locating chromatic number. More precisely, we prove that if $G$ has order $n$, neighbor-locating chromatic number $k$ and average degree at most $2a$, where $2a\le k-1$ is a positive integer, then $n$ is upper-bounded by $\mathcal{O}(a^2(k^{2a+1}))$. We also design a family of graphs of bounded maximum degree whose order is close to reaching this upper bound. Our upper bound generalizes two previous bounds from the literature, which were obtained for graphs of bounded maximum degree and graphs of bounded cycle rank, respectively. Also, we prove that determining whether $χ_L(G)\le k$ and $χ_{NL}(G)\le k$ are NP-complete for sparse graphs: more precisely, for graphs with average degree at most 7, maximum average degree at most 20 and that are $4$-partite. We also study the possible relation between the ordinary chromatic number, the locating chromatic number and the neighbor-locating chromatic number of a graph.
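The definition of a neighbor-locating coloring given in this abstract can be checked mechanically on small graphs. The sketch below is one way to do that, assuming the graph is given as an adjacency dictionary; the function name `is_neighbor_locating` and the example path are illustrative choices, not taken from the paper.

```python
from itertools import combinations


def is_neighbor_locating(adj, color):
    """Check whether `color` is a neighbor-locating coloring of the graph `adj`.

    adj maps each vertex to the set of its neighbors; color maps each vertex
    to its color.
    """
    # The coloring must be proper: adjacent vertices get different colors.
    for u, nbrs in adj.items():
        if any(color[u] == color[v] for v in nbrs):
            return False
    # Two vertices with the same color must see different sets of colors
    # in their neighborhoods.
    for u, v in combinations(adj, 2):
        if color[u] == color[v]:
            if {color[w] for w in adj[u]} == {color[w] for w in adj[v]}:
                return False
    return True


if __name__ == "__main__":
    # Path on four vertices: 0 - 1 - 2 - 3.
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
    print(is_neighbor_locating(path, {0: 1, 1: 2, 2: 3, 3: 1}))
    print(is_neighbor_locating(path, {0: 1, 1: 2, 2: 1, 3: 2}))
```

In the second coloring, vertices 0 and 2 share a color and both see only color 2 in their neighborhoods, so the coloring is proper but not neighbor-locating.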
Submitted 1 August, 2024; v1 submitted 31 January, 2023;
originally announced January 2023.
-
Heterochromatic Geometric Transversals of Convex sets
Authors:
Sutanoya Chakraborty,
Arijit Ghosh,
Soumi Nandi
Abstract:
An infinite sequence of sets $\left\{B_{n}\right\}_{n\in\mathbb{N}}$ is said to be a heterochromatic sequence from an infinite sequence of families $\left\{ \mathcal{F}_{n} \right\}_{n \in \mathbb{N}}$, if there exists a strictly increasing sequence of natural numbers $\left\{ i_{n}\right\}_{n \in \mathbb{N}}$ such that for all $n \in \mathbb{N}$ we have $B_{n} \in \mathcal{F}_{i_{n}}$. In this paper, we have proved that if for each $n\in\mathbb{N}$, $\mathcal{F}_n$ is a family of {\em nicely shaped} convex sets in $\mathbb{R}^d$ such that each heterochromatic sequence $\left\{B_{n}\right\}_{n\in\mathbb{N}}$ from $\left\{ \mathcal{F}_{n} \right\}_{n \in \mathbb{N}}$ contains at least $k+2$ sets that can be pierced by a single $k$-flat ($k$-dimensional affine space) then all but finitely many families in $\left\{\mathcal{F}_{n}\right\}_{n\in \mathbb{N}}$ can be pierced by finitely many $k$-flats. This result can be considered as a {\em countably colorful} generalization of the $(\aleph_0, k+2)$-theorem proved by Keller and Perles (Symposium on Computational Geometry 2022). We have also established the tightness of our result by proving a number of no-go theorems.
Submitted 23 May, 2024; v1 submitted 28 December, 2022;
originally announced December 2022.
-
AsPOS: Assamese Part of Speech Tagger using Deep Learning Approach
Authors:
Dhrubajyoti Pathak,
Sukumar Nandi,
Priyankoo Sarmah
Abstract:
Part of Speech (POS) tagging is crucial to Natural Language Processing (NLP). It is a well-studied topic in several resource-rich languages. However, the development of computational linguistic resources is still in its infancy despite the existence of numerous languages that are historically and literarily rich. Assamese, an Indian scheduled language, spoken by more than 25 million people, falls under this category. In this paper, we present a Deep Learning (DL)-based POS tagger for Assamese. The development process is divided into two phases. In the first phase, several pre-trained word embeddings are employed to train several tagging models. This allows us to evaluate the performance of the word embeddings in the POS tagging task. The top-performing model from the first phase is employed to annotate another set of new sentences. In the second phase, the model is trained further using the fresh dataset. Finally, we attain an F1 score of 86.52%. The model may serve as a baseline for further study on DL-based Assamese POS tagging.
Submitted 14 December, 2022;
originally announced December 2022.
-
The oriented relative clique number of triangle-free planar graphs is 10
Authors:
Soura Sena Das,
Soumen Nandi,
Sagnik Sen
Abstract:
In relation to oriented coloring and chromatic number, the parameter oriented relative clique number of an oriented graph $\overrightarrow{G}$, denoted by $ω_{ro}(\overrightarrow{G})$, is the main focus of this work. We solve an open problem mentioned in the recent survey on oriented coloring by Sopena (Discrete Mathematics 2016), and positively settle a conjecture due to Sen (PhD thesis 2014), by proving that the maximum value of $ω_{ro}(\overrightarrow{G})$ is $10$ when $\overrightarrow{G}$ is a triangle-free planar graph.
Submitted 22 August, 2022;
originally announced August 2022.
-
Colorful Helly Theorem for Piercing Boxes with Multiple Points
Authors:
Sourav Chakraborty,
Arijit Ghosh,
Soumi Nandi
Abstract:
Let $H_c=H_c(d,n)$ denote the smallest positive integer such that if we have a collection of families $\mathcal{F}_{1}, \dots, \mathcal{F}_{H_{c}}$ of axis-parallel boxes in $\mathbb{R}^{d}$ with the property that every colorful $H_{c}$-tuple from the above families can be pierced by $n$ points, then there exists an $i\in \{ 1, \dots, H_{c}\}$, and for all $k\in \{ 1, \dots, H_{c}\} \setminus\{i\}$ there exists $F_k\in\mathcal{F}_k$, such that the following extended family $\mathcal{F}_i\cup\left\{F_k\;|\;k\in \{1, \dots, H_{c}\}\;\mbox{and} \;k\neq i\right\}$ can also be pierced by $n$ points. In this paper, we give a complete characterization of $H_{c}(d,n)$ for all values of $d$ and $n$. Our result is a colorful generalization of piercing axis-parallel boxes with multiple points by Danzer and Grünbaum (Combinatorica 1982).
Submitted 5 June, 2023; v1 submitted 28 July, 2022;
originally announced July 2022.
-
Almost covering all the layers of hypercube with multiplicities
Authors:
Arijit Ghosh,
Chandrima Kayal,
Soumi Nandi
Abstract:
Given a hypercube $\mathcal{Q}^{n} := \{0,1\}^{n}$ in $\mathbb{R}^{n}$ and $k \in \{0, \dots, n\}$, the $k$-th layer $\mathcal{Q}^{n}_{k}$ of $\mathcal{Q}^{n}$ denotes the set of all points in $\mathcal{Q}^{n}$ whose coordinates contain exactly $k$ many ones. For a fixed $t \in \mathbb{N}$ and $k \in \{0, \dots, n\}$, let $P \in \mathbb{R}\left[x_{1}, \dots, x_{n}\right]$ be a polynomial that has zeros of multiplicity at least $t$ at all points of $\mathcal{Q}^{n} \setminus \mathcal{Q}^{n}_{k}$, and $P$ has zeros of multiplicity exactly $t-1$ at all points of $\mathcal{Q}^{n}_{k}$. In this short note, we show that $$\deg(P) \geq \max\left\{ k, n-k\right\}+2t-2.$$ Matching the above lower bound, we give an explicit construction of a family of hyperplanes $H_{1}, \dots, H_{m}$ in $\mathbb{R}^{n}$, where $m = \max\left\{ k, n-k\right\}+2t-2$, such that every point of $\mathcal{Q}^{n}_{k}$ will be covered exactly $t-1$ times, and every other point of $\mathcal{Q}^{n}$ will be covered at least $t$ times. Note that putting $k = 0$ and $t=1$, we recover the much celebrated covering result of Alon and Füredi (European Journal of Combinatorics, 1993). Using the above family of hyperplanes we disprove a conjecture of Venkitesh (The Electronic Journal of Combinatorics, 2022) on exactly covering symmetric subsets of hypercube $\mathcal{Q}^{n}$ with hyperplanes. To prove the above results we have introduced a new measure of complexity of a subset of the hypercube called index complexity which we believe will be of independent interest.
We also study a new interesting variant of the restricted sumset problem motivated by the ideas behind the proof of the above result.
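For the special case $k = 0$, $t = 1$ mentioned in this abstract, the bound gives $m = n$ hyperplanes. One standard family that meets it is $x_{1} + \dots + x_{n} = j$ for $j = 1, \dots, n$; this choice is an illustrative assumption here and is not claimed to be the construction used in the paper. The sketch below counts, for every point of the hypercube, how many of these hyperplanes contain it.

```python
from itertools import product


def cover_counts(n):
    """Count how many hyperplanes x_1 + ... + x_n = j (j = 1..n) contain each point of {0,1}^n."""
    counts = {}
    for point in product((0, 1), repeat=n):
        weight = sum(point)  # number of ones in the point
        counts[point] = sum(1 for j in range(1, n + 1) if weight == j)
    return counts


if __name__ == "__main__":
    counts = cover_counts(4)
    origin = (0, 0, 0, 0)
    print(counts[origin])
    print(min(c for p, c in counts.items() if p != origin))
```

The origin (the layer $k = 0$) is covered $t - 1 = 0$ times, while every other point lies on exactly one of the $n$ hyperplanes, so it is covered at least $t = 1$ times.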
Submitted 16 February, 2023; v1 submitted 27 July, 2022;
originally announced July 2022.
-
MONet: Multi-scale Overlap Network for Duplication Detection in Biomedical Images
Authors:
Ekraam Sabir,
Soumyaroop Nandi,
Wael AbdAlmageed,
Prem Natarajan
Abstract:
Manipulation of biomedical images to misrepresent experimental results has plagued the biomedical community for a while. Recent interest in the problem led to the curation of a dataset and associated tasks to promote the development of biomedical forensic methods. Of these, the largest manipulation detection task focuses on the detection of duplicated regions between images. Traditional computer-vision based forensic models trained on natural images are not designed to overcome the challenges presented by biomedical images. We propose a multi-scale overlap detection model to detect duplicated image regions. Our model is structured to find duplication hierarchically, so as to reduce the number of patch operations. It achieves state-of-the-art performance overall and on multiple biomedical image categories.
Submitted 19 July, 2022;
originally announced July 2022.
-
AsNER -- Annotated Dataset and Baseline for Assamese Named Entity recognition
Authors:
Dhrubajyoti Pathak,
Sukumar Nandi,
Priyankoo Sarmah
Abstract:
We present AsNER, a named entity annotation dataset for the low-resource Assamese language, together with a baseline Assamese NER model. The dataset contains about 99k tokens of text drawn from speeches of the Prime Minister of India and from Assamese plays. It also contains person names, location names and addresses. The proposed NER dataset is likely to be a significant resource for deep neural based Assamese language processing. We benchmark the dataset by training NER models and evaluating using state-of-the-art architectures for supervised named entity recognition (NER) such as Fasttext, BERT, XLM-R, FLAIR, MuRIL etc. We implement several baseline approaches with the state-of-the-art sequence tagging Bi-LSTM-CRF architecture. The best baseline achieves an F1-score of 80.69% when using MuRIL as a word embedding method. The annotated dataset and the top performing model are made publicly available.
Submitted 7 July, 2022;
originally announced July 2022.
-
AQuaMoHo: Localized Low-Cost Outdoor Air Quality Sensing over a Thermo-Hygrometer
Authors:
Prithviraj Pramanik,
Prasenjit Karmakar,
Praveen Kumar Sharma,
Soumyajit Chatterjee,
Abhijit Roy,
Santanu Mandal,
Subrata Nandi,
Sandip Chakraborty,
Mousumi Saha,
Sujoy Saha
Abstract:
Efficient air quality sensing serves as one of the essential services provided in any recent smart city. Mostly facilitated by sparsely deployed Air Quality Monitoring Stations (AQMSs) that are difficult to install and maintain, the overall spatial variation heavily impacts air quality monitoring for locations far enough from these pre-deployed public infrastructures. To mitigate this, we in this paper propose a framework named AQuaMoHo that can annotate data obtained from a low-cost thermo-hygrometer (as the sole physical sensing device) with the AQI labels, with the help of additional publicly crawled Spatio-temporal information of that locality. At its core, AQuaMoHo exploits the temporal patterns from a set of readily available spatial features using an LSTM-based model and further enhances the overall quality of the annotation using temporal attention. From a thorough study of two different cities, we observe that AQuaMoHo can significantly help annotate the air quality data on a personal scale.
Submitted 17 November, 2022; v1 submitted 25 April, 2022;
originally announced April 2022.
-
Blockchain Meets AI for Resilient and Intelligent Internet of Vehicles
Authors:
Pranav Kumar Singh,
Sukumar Nandi,
Sunit K. Nandi,
Uttam Ghosh,
Danda B. Rawat
Abstract:
The Internet of Vehicles (IoV) is flourishing and offers various applications relating to road safety, traffic and fuel efficiency, and infotainment. Dealing with security and privacy threats and managing the trust (detecting malicious and misbehaving peers) in IoV remains the most significant concern. Artificial Intelligence is one of the most revolutionizing technologies, and the predictive power of its machine learning models can help detect intrusions and misbehaviors. Similarly, empowering the state-of-the-art IoV security framework with blockchain can make it secure and resilient. This article discusses joint AI and blockchain for security, privacy and trust-related risks in IoV. This paper also presents problems, challenges, requirements and solutions using ML and blockchain to address aforementioned issues in IoV.
Submitted 28 December, 2021;
originally announced December 2021.
-
BioFors: A Large Biomedical Image Forensics Dataset
Authors:
Ekraam Sabir,
Soumyaroop Nandi,
Wael AbdAlmageed,
Prem Natarajan
Abstract:
Research in media forensics has gained traction to combat the spread of misinformation. However, most of this research has been directed towards content generated on social media. Biomedical image forensics is a related problem, where manipulation or misuse of images reported in biomedical research documents is of serious concern. The problem has failed to gain momentum beyond an academic discussion due to an absence of benchmark datasets and standardized tasks. In this paper we present BioFors -- the first dataset for benchmarking common biomedical image manipulations. BioFors comprises 47,805 images extracted from 1,031 open-source research papers. Images in BioFors are divided into four categories -- Microscopy, Blot/Gel, FACS and Macroscopy. We also propose three tasks for forensic analysis -- external duplication detection, internal duplication detection and cut/sharp-transition detection. We benchmark BioFors on all tasks with suitable state-of-the-art algorithms. Our results and analysis show that existing algorithms developed on common computer vision datasets are not robust when applied to biomedical images, validating that more research is required to address the unique challenges of biomedical image forensics.
Submitted 29 August, 2021;
originally announced August 2021.
-
SIGN: Spatial-information Incorporated Generative Network for Generalized Zero-shot Semantic Segmentation
Authors:
Jiaxin Cheng,
Soumyaroop Nandi,
Prem Natarajan,
Wael Abd-Almageed
Abstract:
Unlike conventional zero-shot classification, zero-shot semantic segmentation predicts a class label at the pixel level instead of the image level. When solving zero-shot semantic segmentation problems, the need for pixel-level prediction with surrounding context motivates us to incorporate spatial information using positional encoding. We improve standard positional encoding by introducing the concept of Relative Positional Encoding, which integrates spatial information at the feature level and can handle arbitrary image sizes. Furthermore, while self-training is widely used in zero-shot semantic segmentation to generate pseudo-labels, we propose a new knowledge-distillation-inspired self-training strategy, namely Annealed Self-Training, which can automatically assign different importance to pseudo-labels to improve performance. We systematically study the proposed Relative Positional Encoding and Annealed Self-Training in a comprehensive experimental evaluation, and our empirical results confirm the effectiveness of our method on three benchmark datasets.
Submitted 27 August, 2021;
originally announced August 2021.
-
Exploiting Multi-modal Contextual Sensing for City-bus's Stay Location Characterization: Towards Sub-60 Seconds Accurate Arrival Time Prediction
Authors:
Ratna Mandal,
Prasenjit Karmakar,
Soumyajit Chatterjee,
Debaleen Das Spandan,
Shouvit Pradhan,
Sujoy Saha,
Sandip Chakraborty,
Subrata Nandi
Abstract:
Intelligent city transportation systems are one of the core infrastructures of a smart city. The true ingenuity of such an infrastructure lies in providing commuters with real-time information about citywide transports like public buses, allowing them to pre-plan their travel. However, providing prior information for transportation systems like public buses in real-time is inherently challenging because of the diverse nature of the different stay-locations at which a public bus stops. Although straightforward factors such as stay duration, extracted from unimodal sources like GPS, look erratic at these locations, a thorough analysis of public bus GPS trails for 720km of bus travels in the city of Durgapur, a semi-urban city in India, reveals that several other fine-grained contextual features can characterize these locations accurately. Accordingly, we develop BuStop, a system for extracting and characterizing the stay locations from multi-modal sensing using commuters' smartphones. Using this multi-modal information, BuStop extracts a set of granular contextual features that allow the system to differentiate among the different stay-location types. A thorough analysis of BuStop using the collected dataset indicates that the system works with high accuracy in identifying different stay locations like regular bus stops, random ad-hoc stops, stops due to traffic congestion, stops at traffic signals, and stops at sharp turns. Additionally, we also develop a proof-of-concept setup on top of BuStop to analyze the potential of the framework in predicting expected arrival time, a critical piece of information required to pre-plan travel, at any given bus stop. Subsequent analysis of the PoC framework, through simulation over the test dataset, shows that characterizing the stay-locations indeed helps make more accurate arrival time predictions with deviations less than 60s from the ground-truth arrival time.
Submitted 24 May, 2021;
originally announced May 2021.
-
On clique numbers of colored mixed graphs
Authors:
Dipayan Chakraborty,
Sandip Das,
Soumen Nandi,
Debdeep Roy,
Sagnik Sen
Abstract:
An (m,n)-colored mixed graph, or simply, an (m,n)-graph is a graph having m different types of arcs and n different types of edges. A homomorphism of an (m,n)-graph G to another (m,n)-graph H is a vertex mapping that preserves adjacency, the type thereto and the direction. A subset R of the set of vertices of G that always maps distinct vertices in itself to distinct image vertices under any homomorphism is called an (m,n)-relative clique of G. The maximum cardinality of an (m,n)-relative clique of a graph is called the (m,n)-relative clique number of the graph. In this article, we explore the (m,n)-relative clique numbers for various families of graphs.
Submitted 6 January, 2022; v1 submitted 15 April, 2021;
originally announced April 2021.
-
Privacy Enhanced DigiLocker using Ciphertext-Policy Attribute-Based Encryption
Authors:
Puneet Bakshi,
Sukumar Nandi
Abstract:
Recently, the Government of India has taken several initiatives to make India digitally strong, such as providing each resident a unique digital identity, referred to as Aadhaar, and providing several online e-Governance services based on Aadhaar, such as DigiLocker. DigiLocker is an online service which provides a shareable private storage space on public cloud to its subscribers. Although DigiLocker ensures traditional security such as data integrity and secure data access, the privacy of e-documents is yet to be addressed. Ciphertext-Policy Attribute-Based Encryption (CP-ABE) can improve data privacy, but the right implementation of it has always been a challenge. This paper presents a scheme to implement privacy enhanced DigiLocker using CP-ABE.
Submitted 17 December, 2020;
originally announced December 2020.
-
On the signed chromatic number of some classes of graphs
Authors:
Julien Bensmail,
Sandip Das,
Soumen Nandi,
Théo Pierron,
Sagnik Sen,
Eric Sopena
Abstract:
A signed graph $(G, σ)$ is a graph $G$ along with a function $σ: E(G) \to \{+,-\}$. A closed walk of a signed graph is positive (resp., negative) if it has an even (resp., odd) number of negative edges, counting repetitions. A homomorphism of a (simple) signed graph to another signed graph is a vertex-mapping that preserves adjacencies and signs of closed walks. The signed chromatic number of a signed graph $(G, σ)$ is the minimum number of vertices $|V(H)|$ of a signed graph $(H, π)$ to which $(G, σ)$ admits a homomorphism. Homomorphisms of signed graphs have been attracting growing attention in the last decades, especially due to their strong connections to the theories of graph coloring and graph minors. These homomorphisms have been particularly studied through the scope of the signed chromatic number. In this work, we provide new results and bounds on the signed chromatic number of several families of signed graphs (planar graphs, triangle-free planar graphs, $K_n$-minor-free graphs, and bounded-degree graphs).
Submitted 25 September, 2020;
originally announced September 2020.
-
V-CARE: A Blockchain Based Framework for Secure Vehicle Health Record System
Authors:
Pranav Kumar Singh,
Roshan Singh,
Sukumar Nandi
Abstract:
One of the biggest challenges associated with connected and autonomous vehicles (CAVs) is to maintain and make use of vehicle health records (VHR). VHR can help different entities offer various services in a proactive, transparent, secure, reliable and efficient manner. The state-of-the-art solutions for maintaining the VHR are centralized in nature, mainly owned by manufacturers and authorized in-vehicle device developers. Owners, drivers, and other key service providers have limited access to and control over the VHR. We need to change the strategy from single or limited party access to multi-party access to VHR in a secure manner so that all stakeholders of the intelligent transportation system (ITS) can benefit from this. Any unauthorized attempt to alter the data should also be prevented. Blockchain is one such potential candidate, which can facilitate the sharing of such data among different participating organizations and individuals. For example, owners, manufacturers, trusted third parties, road authorities, insurance companies, charging stations, and car selling ventures can access VHR stored on the blockchain in a permissioned and secure manner, and with a higher level of confidence. In this paper, a blockchain-based decentralized secure system for V-CARE is proposed to manage records in an interoperable framework that leads to improved ITS services in terms of safety, availability, reliability, efficiency, and maintenance. Insurance based on pay-how-you-drive (PHYD), and sale and purchase of used vehicles can also be made more transparent and reliable without compromising the confidentiality and security of sensitive data.
Submitted 13 July, 2020;
originally announced July 2020.
-
PowerPlanningDL: Reliability-Aware Framework for On-Chip Power Grid Design using Deep Learning
Authors:
Sukanta Dey,
Sukumar Nandi,
Gaurav Trivedi
Abstract:
With the increase in the complexity of chip designs, VLSI physical design, which is an iterative design process, has become a time-consuming task. Power planning is the part of floorplanning in VLSI physical design where power grid networks are designed in order to provide adequate power to all the underlying functional blocks. Power planning also requires multiple iterative steps to create the power grid network while satisfying the allowed worst-case IR drop and Electromigration (EM) margin. For the first time, this paper introduces a deep learning (DL)-based framework to approximately predict the initial design of the power grid network, considering different reliability constraints. The proposed framework reduces many iterative design steps and speeds up the total design cycle. A neural network-based multi-target regression technique is used to create the DL model. Feature extraction is done, and the training dataset is generated from the floorplans of some of the power grid designs extracted from the IBM processor. The DL model is trained using the generated dataset. The proposed DL-based framework is validated using a new set of power grid specifications (obtained by perturbing the designs used in the training phase). The results show that the predicted power grid design is close to the original design with minimal prediction error (~2%). The proposed DL-based approach also improves the design cycle time with a speedup of ~6X for standard power grid benchmarks.
Submitted 24 July, 2020; v1 submitted 4 May, 2020;
originally announced May 2020.
-
The application of $σ$-LFSR in Key-Dependent Feedback Configuration for Word-Oriented Stream Ciphers
Authors:
Subrata Nandi,
Srinivasan Krishnaswamy,
Behrouz Zolfaghari,
Pinaki Mitra
Abstract:
In this paper, we propose and evaluate a method for generating key-dependent feedback configurations (KDFC) for $σ$-LFSRs. $σ$-LFSRs with such configurations can be applied to any stream cipher that uses a word-based LFSR. Here, a configuration generation algorithm uses the secret key (K) and the initialization vector (IV) to generate a feedback configuration. We have mathematically analysed the feedback configurations generated by this method. As a test case, we have applied this method to SNOW 2.0 and have studied its impact on resistance to various attacks. Further, we have also tested the generated keystream for randomness and have briefly described its implementation and the challenges involved.
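To make the notion of a feedback configuration concrete, the sketch below uses a plain 4-bit binary LFSR, not a word-oriented $σ$-LFSR, and it does not reproduce the paper's configuration generation algorithm. The function name `lfsr_stream` and the two tap sets are illustrative assumptions; the tap sets correspond to the primitive polynomials $x^4 + x + 1$ and $x^4 + x^3 + 1$. It only shows that the same initial state produces different maximal-period sequences under different feedback configurations.

```python
def lfsr_stream(state, taps, n):
    """Return n output bits of a Fibonacci LFSR over GF(2).

    state: list of bits [s0, ..., s_{k-1}] (hypothetical initial fill)
    taps:  indices into the state whose XOR forms the next bit
    """
    state = list(state)
    out = []
    for _ in range(n):
        out.append(state[0])            # emit the oldest bit
        new_bit = 0
        for t in taps:                  # feedback = XOR of the tapped bits
            new_bit ^= state[t]
        state = state[1:] + [new_bit]   # shift and append the feedback bit
    return out


# Same initial state, two different feedback configurations (assumed for illustration).
config_a = (0, 1)   # s[n+4] = s[n] + s[n+1], i.e. x^4 + x + 1
config_b = (0, 3)   # s[n+4] = s[n] + s[n+3], i.e. x^4 + x^3 + 1

stream_a = lfsr_stream([1, 0, 0, 0], config_a, 30)
stream_b = lfsr_stream([1, 0, 0, 0], config_b, 30)
```

Both sequences repeat with period 15, the maximum for a 4-bit register, yet they differ from each other, so knowing the state alone does not determine the output without the configuration.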
Submitted 2 March, 2021; v1 submitted 20 March, 2020;
originally announced March 2020.
-
Pushable chromatic number of graphs with degree constraints
Authors:
Julien Bensmail,
Sandip Das,
Soumen Nandi,
Théo Pierron,
Soumyajit Paul,
Sagnik Sen,
Eric Sopena
Abstract:
Pushable homomorphisms and the pushable chromatic number $χ_p$ of oriented graphs were introduced by Klostermeyer and MacGillivray in 2004. They notably observed that, for any oriented graph $\overrightarrow{G}$, we have $χ_p(\overrightarrow{G}) \leq χ_o(\overrightarrow{G}) \leq 2 χ_p(\overrightarrow{G})$, where $χ_o(\overrightarrow{G})$ denotes the oriented chromatic number of $\overrightarrow{G}$. These stand as the first general bounds on $χ_p$. This parameter was further studied in later works. This work is dedicated to the pushable chromatic number of oriented graphs fulfilling particular degree conditions. For all $Δ \geq 29$, we first prove that the maximum value of the pushable chromatic number of an oriented graph with maximum degree $Δ$ lies between $2^{\frac{Δ}{2}-1}$ and $(Δ-3) \cdot (Δ-1) \cdot 2^{Δ-1} + 2$, which implies an improved bound on the oriented chromatic number of the same family of graphs. For subcubic oriented graphs, that is, when $Δ \leq 3$, we then prove that the maximum value of the pushable chromatic number is $6$ or $7$. We also prove that the maximum value of the pushable chromatic number of oriented graphs with maximum average degree less than $3$ lies between $5$ and $6$. The former upper bound of $7$ also holds as an upper bound on the pushable chromatic number of planar oriented graphs with girth at least $6$.
Submitted 22 November, 2019;
originally announced November 2019.
-
Origin of current-controlled negative differential resistance modes and the emergence of composite characteristics with high complexity
Authors:
Shuai Li,
Xinjun Liu,
Sanjoy Kumar Nandi,
Shimul Kanti Nath,
Robert G. Elliman
Abstract:
Current-controlled negative differential resistance has significant potential as a fundamental building block in brain-inspired neuromorphic computing. However, achieving desired negative differential resistance characteristics, which is crucial for practical implementation, remains challenging due to little consensus on the underlying mechanism and unclear design criteria. Here, we report a material-independent model of current-controlled negative differential resistance to explain a broad range of characteristics, including the origin of the discontinuous snap-back response observed in many transition metal oxides. This is achieved by explicitly accounting for a non-uniform current distribution in the oxide film and its impact on the effective circuit of the device, rather than a material-specific phase transition. The predictions of the model are then compared with experimental observations to show that the continuous S-type and discontinuous snap-back characteristics serve as fundamental building blocks for composite behaviour with higher complexity. Finally, we demonstrate the potential of our approach for predicting and engineering unconventional compound behaviour with novel functionality for emerging electronic and neuromorphic computing applications.
Submitted 19 June, 2019;
originally announced July 2019.
-
Erratum for "On oriented cliques with respect to push operation"
Authors:
Julien Bensmail,
Soumen Nandi,
Sagnik Sen
Abstract:
An error was spotted in the statement of Theorem 1.3 of our published article titled "On oriented cliques with respect to push operation" (Discrete Applied Mathematics 2017). The theorem provided an exhaustive list of 16 minimal (up to spanning subgraph inclusion) underlying planar push cliques. The error was that one of the 16 graphs in this list was missing an arc. We correct the error and restate the corrected statement in this article. We also point out that the error was caused by a mistake in a particular lemma, and we present the corrected proof of that lemma. Moreover, a few counts were wrongly reported due to the above-mentioned error, so we update our reported counts in this article.
Submitted 12 October, 2018;
originally announced October 2018.
-
On relative clique number of colored mixed graphs
Authors:
Sandip Das,
Soumen Nandi,
Debdeep Roy,
Sagnik Sen
Abstract:
An $(m, n)$-colored mixed graph is a graph having arcs of $m$ different colors and edges of $n$ different colors. A graph homomorphism $f$ of an $(m, n)$-colored mixed graph $G$ to an $(m, n)$-colored mixed graph $H$ is a vertex mapping such that if $uv$ is an arc (edge) of color $c$ in $G$, then $f(u)f(v)$ is also an arc (edge) of color $c$ in $H$. The $(m, n)$-colored mixed chromatic number of an $(m, n)$-colored mixed graph $G$, introduced by Nešetřil and Raspaud [J. Combin. Theory Ser. B 2000], is the order (number of vertices) of the smallest homomorphic image of $G$. Later, Bensmail, Duffy and Sen [Graphs Combin. 2017] introduced another parameter related to the $(m, n)$-colored mixed chromatic number, namely, the $(m, n)$-relative clique number, defined as the maximum cardinality of a vertex subset whose vertices must have pairwise distinct images under any colored homomorphism.
In this article, we study the $(m, n)$-relative clique number for the families of subcubic graphs, graphs with maximum degree $Δ$, planar graphs and triangle-free planar graphs, and provide new improved bounds in each case. In particular, for subcubic graphs we provide the exact value of the parameter.
Submitted 12 October, 2018;
originally announced October 2018.
-
On Good and Bad Intentions behind Anomalous Citation Patterns among Journals in Computer Sciences
Authors:
Joyita Chakraborty,
Dinesh Pradhan,
Hridoy Sankar Dutta,
Subrata Nandi,
Tanmoy Chakraborty
Abstract:
Scientific journals are an important choice of publication venue for most authors. Publishing in prestigious journals plays a decisive role for authors in hiring and promotions. In the last decade, citation pressure on all scientific entities has become more intense than ever before. Unethical publication practices have started to manipulate widely used performance metrics such as the "impact factor" for journals and citation-based indices for authors. This threatens the integrity of scientific quality and takes away deserved credit from legitimate authors and their authentic publications.
In this paper we extract all possible anomalous citation patterns between journals from a Computer Science bibliographic dataset which contains more than 2,500 journals. Apart from excessive self-citations, we mostly focus on finding several patterns between two or more journals, such as bi-directional mutual citations, chains, triangles, meshes and cartel relationships. On a macroscopic scale, the motivation is to understand the nature of these patterns by modeling how journals mutually interact through citations. On a microscopic level, we differentiate between possible intentions (good or bad) behind such patterns. We examine whether such patterns prevail for a long period or only during a specific time span. For abnormal citation behavior, we study over time the nature of sudden inflation in the impact factor of journals, which may occur due to the addition of irrelevant and superfluous citations in such closed-pattern interactions. We also study possible influences such as an abrupt increase in paper count due to the presence of self-referential papers or duplicate manuscripts, author self-citation, the author co-authorship network, the author-editor network, publication houses, etc. The study as a whole questions the reliability of existing bibliometrics and points to an urgent need to curtail their usage or redefine them.
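Two of the patterns named above, bi-directional mutual citations and triangles, can be illustrated on a toy directed citation graph. This is a minimal sketch under assumed names (`mutual_pairs`, `citation_triangles`, and the journal labels are hypothetical); it is not the extraction procedure used in the paper, and it ignores citation counts and time windows.

```python
from itertools import permutations


def mutual_pairs(cites):
    """Return unordered journal pairs that cite each other.

    cites: dict mapping a journal to the set of journals it cites.
    """
    return {
        frozenset((a, b))
        for a, targets in cites.items()
        for b in targets
        if a != b and a in cites.get(b, set())
    }


def citation_triangles(cites):
    """Return journal triples forming a directed cycle a -> b -> c -> a."""
    journals = set(cites) | {b for targets in cites.values() for b in targets}
    found = set()
    for a, b, c in permutations(sorted(journals), 3):
        if b in cites.get(a, set()) and c in cites.get(b, set()) and a in cites.get(c, set()):
            found.add(frozenset((a, b, c)))
    return found


# Toy data (hypothetical): A and B cite each other; B, C, D form a cycle.
toy = {"A": {"B"}, "B": {"A", "C"}, "C": {"D"}, "D": {"B"}}
pairs = mutual_pairs(toy)
triangles = citation_triangles(toy)
```

The brute-force triple enumeration is cubic in the number of journals, which is acceptable for a toy example but would need an adjacency-based search on a dataset of thousands of journals.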
Submitted 27 July, 2018;
originally announced July 2018.
-
An Efficient Secure Distributed Cloud Storage for Append-only Data
Authors:
Binanda Sengupta,
Nishant Nikam,
Sushmita Ruj,
Srinivasan Narayanamurthy,
Siddhartha Nandi
Abstract:
Cloud computing enables users (clients) to outsource large volumes of their data to cloud servers. Secure distributed cloud storage schemes ensure that multiple servers store these data in a reliable and untampered fashion. We propose an idea to construct such a scheme for static data by encoding data blocks (using error-correcting codes) and then attaching authentication information (tags) to these encoded blocks. We identify some challenges in extending this idea to accommodate append-only data. Then, we propose our secure distributed cloud storage scheme for append-only data that addresses the challenges efficiently. The main advantage of our scheme is that it enables the servers to update the parity blocks themselves. Moreover, the client need not download any data (or parity) block to update the tags of the modified parity blocks residing on the servers. Finally, we analyze the security and performance of our scheme.
Submitted 3 June, 2018; v1 submitted 8 May, 2018;
originally announced May 2018.
-
$C^3$-index: A PageRank based multi-faceted metric for authors' performance measurement
Authors:
Dinesh Pradhan,
Partha Sarathi Paul,
Umesh Maheswari,
Subrata Nandi,
Tanmoy Chakraborty
Abstract:
Ranking scientific authors is an important but challenging task, mostly due to the dynamic nature of evolving scientific publications. The basic indicators of an author's productivity and impact are still the number of publications and the citation count (leading to popular metrics such as h-index, g-index, etc.). H-index and its popular variants are mostly effective in ranking highly-cited authors, and thus fail to resolve ties while ranking medium-cited and low-cited authors, who are the majority. Therefore, these metrics are ineffective at predicting the ability of promising young researchers at the beginning of their careers. In this paper, we propose $C^3$-index, which combines the effect of citations and collaborations of an author in a systematic way using a weighted multi-layered network to rank authors. We conduct our experiments on a massive publication dataset of Computer Science and show that (i) $C^3$-index is consistent over time, which is one of the fundamental characteristics of a ranking metric, (ii) $C^3$-index is as efficient as h-index and its variants in ranking highly-cited authors, (iii) $C^3$-index can act as a conflict resolution metric to break ties in the ranking of medium-cited and low-cited authors, and (iv) $C^3$-index can also be used to predict future achievers at an early stage of their careers.
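The tie problem mentioned above is easy to reproduce with the standard h-index definition (the largest h such that an author has h papers with at least h citations each). The sketch below computes only the h-index; it does not implement $C^3$-index, and the citation lists are made-up examples.

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    h = 0
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Two hypothetical low-cited authors with different citation profiles.
author_x = [3, 3, 3, 1, 0]     # 10 citations in total
author_y = [40, 6, 3, 2, 1]    # 52 citations in total

tie = h_index(author_x) == h_index(author_y)
```

Both authors end up with the same h-index despite very different totals, which is the kind of tie an additional signal such as collaboration structure would have to break.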
Submitted 22 October, 2016;
originally announced October 2016.
-
$C^3$-index: Revisiting Authors' Performance Measure
Authors:
Dinesh Pradhan,
Partha Sarathi Paul,
Umesh Maheswari,
Subrata Nandi,
Tanmoy Chakraborty
Abstract:
Author performance indices (such as h-index and its variants) fail to resolve ties while ranking authors with low index values (the majority), which include young researchers. In this work we leverage the citations as well as the collaboration profile of an author in a novel way using a weighted multi-layered network, and propose a variant of the PageRank algorithm to obtain a new author performance measure, $C^3$-index. Experiments on a massive publication dataset reveal several interesting characteristics of our metric: (i) $C^3$-index is consistent over time, (ii) $C^3$-index has high potential to break ties among low-ranked authors, and (iii) $C^3$-index can effectively be used to predict future achievers at an early stage of their careers.
Submitted 8 April, 2016;
originally announced April 2016.
-
On the Discovery of Success Trajectories of Authors
Authors:
Dinesh Pradhan,
Tanmoy Chakraborty,
Saswata Pandit,
Subrata Nandi
Abstract:
Understanding the qualitative patterns of the research endeavors of scientific authors, in terms of publication count and impact (citations), is important in order to quantify success trajectories. Here, we examine the career profiles of authors in the computer science and physics domains and discover at least six different success trajectories in terms of normalized citation count on a longitudinal scale. Initial observations of individual trajectories lead us to characterize the authors in each category. We further leverage this trajectory information to build a two-stage stratification model to predict the future success of an author at the early stage of her career. Our model outperforms the baseline with an average improvement of 15.68% for both datasets.
Submitted 4 February, 2016;
originally announced February 2016.
-
Approximation algorithms for the two-center problem of convex polygon
Authors:
Sanjib Sadhu,
Sasanka Roy,
Soumen Nandi,
Anil Maheswari,
Subhas C. Nandy
Abstract:
Given a convex polygon $P$ with $n$ vertices, the two-center problem is to find two congruent closed disks of minimum radius such that they completely cover $P$. We propose an algorithm for this problem in the streaming setup, where the input stream is the vertices of the polygon in clockwise order. It produces a radius $r$ satisfying $r \leq 2r_{opt}$ using $O(1)$ space, where $r_{opt}$ is the optimum radius. Next, we show that in the non-streaming setup, we can improve the approximation guarantee to $r \leq 1.84 r_{opt}$, while keeping the time complexity of the algorithm at $O(n)$ and using $O(1)$ extra space in addition to the space required for storing the input.
Submitted 8 December, 2015;
originally announced December 2015.
-
On oriented cliques with respect to push operation
Authors:
Julien Bensmail,
Soumen Nandi,
Sagnik Sen
Abstract:
To push a vertex $v$ of a directed graph $\overrightarrow{G}$ is to change the orientations of all the arcs incident with $v$. An oriented graph is a directed graph without any cycle of length at most 2. An oriented clique is an oriented graph whose non-adjacent vertices are connected by a directed 2-path. A push clique is an oriented clique that remains an oriented clique even if one pushes any set of its vertices. We show that it is NP-complete to decide whether an undirected graph is the underlying graph of a push clique. We also prove that a planar push clique can have at most 8 vertices. We also provide an exhaustive list of minimal (with respect to spanning subgraph inclusion) planar push cliques.
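The definitions above translate directly into a brute-force check on small graphs. The following is a minimal sketch with assumed function names (`push`, `is_oriented_clique`, `is_push_clique`); it enumerates all vertex subsets, so it is exponential and only meant to illustrate the definitions, not any algorithm from the paper.

```python
from itertools import combinations


def push(arcs, pushed):
    """Reverse every arc with exactly one endpoint in `pushed`.

    An arc between two pushed vertices is reversed twice, so it is unchanged.
    """
    pushed = set(pushed)
    return {(v, u) if (u in pushed) != (v in pushed) else (u, v) for u, v in arcs}


def is_oriented_clique(vertices, arcs):
    """Check that all non-adjacent pairs are joined by a directed 2-path."""
    vertices = list(vertices)
    arcs = set(arcs)
    for u, v in combinations(vertices, 2):
        if (u, v) in arcs or (v, u) in arcs:
            continue  # adjacent pairs need no 2-path
        if not any(
            ((u, w) in arcs and (w, v) in arcs) or ((v, w) in arcs and (w, u) in arcs)
            for w in vertices
        ):
            return False
    return True


def is_push_clique(vertices, arcs):
    """Check that the graph stays an oriented clique after pushing any vertex set."""
    vertices = list(vertices)
    return all(
        is_oriented_clique(vertices, push(arcs, subset))
        for k in range(len(vertices) + 1)
        for subset in combinations(vertices, k)
    )


directed_c5 = {(i, (i + 1) % 5) for i in range(5)}   # 0 -> 1 -> 2 -> 3 -> 4 -> 0
directed_c3 = {(0, 1), (1, 2), (2, 0)}
```

The directed 5-cycle is an oriented clique, but pushing vertex 0 leaves vertices 0 and 2 without a directed 2-path, so it is not a push clique. The directed triangle has no non-adjacent pairs and therefore remains an oriented clique under every push.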
Submitted 27 November, 2015;
originally announced November 2015.
-
On chromatic number of colored mixed graphs
Authors:
Sandip Das,
Soumen Nandi,
Sagnik Sen
Abstract:
An $(m,n)$-colored mixed graph $G$ is a graph with its arcs having one of the $m$ different colors and edges having one of the $n$ different colors. A homomorphism $f$ of an $(m,n)$-colored mixed graph $G$ to an $(m,n)$-colored mixed graph $H$ is a vertex mapping such that if $uv$ is an arc (edge) of color $c$ in $G$, then $f(u)f(v)$ is an arc (edge) of color $c$ in $H$. The $(m,n)$-colored mixed chromatic number $χ_{(m,n)}(G)$ of an $(m,n)$-colored mixed graph $G$ is the order (number of vertices) of the smallest homomorphic image of $G$. This notion was introduced by Nešetřil and Raspaud (2000, J. Combin. Theory, Ser. B 80, 147-155). They showed that $χ_{(m,n)}(G) \leq k(2m+n)^{k-1}$ where $G$ is a $k$-acyclic colorable graph. We proved the tightness of this bound. We also showed that the acyclic chromatic number of a graph is bounded by $k^2 + k^{2 + \lceil \log_{(2m+n)} \log_{(2m+n)} k \rceil}$ if its $(m,n)$-colored mixed chromatic number is at most $k$.
Furthermore, using the probabilistic method, we showed that a graph with maximum degree $Δ$ has $(m,n)$-colored mixed chromatic number at most $2(Δ-1)^{2m+n} (2m+n)^{Δ-1}$. In particular, the last result directly improves the upper bound $2Δ^2 2^Δ$ on the oriented chromatic number of graphs with maximum degree $Δ$, obtained by Kostochka, Sopena and Zhu (1997, J. Graph Theory 24, 331-340), to $2(Δ-1)^2 2^{Δ-1}$. We also show that there exists a graph with maximum degree $Δ$ and $(m,n)$-colored mixed chromatic number at least $(2m+n)^{Δ/2}$.
Submitted 28 August, 2015;
originally announced August 2015.
-
Evaluation of Codes with Inherent Double Replication for Hadoop
Authors:
M. Nikhil Krishnan,
N. Prakash,
V. Lalitha,
Birenjith Sasidharan,
P. Vijay Kumar,
Srinivasan Narayanamurthy,
Ranjit Kumar,
Siddhartha Nandi
Abstract:
In this paper, we evaluate the efficacy, in a Hadoop setting, of two coding schemes, both possessing an inherent double replication of data. The two coding schemes belong to the classes of regenerating and locally regenerating codes respectively, and these two classes are representative of recent advances made in designing codes for the efficient storage of data in a distributed setting. In comparison with triple replication, double replication permits a significant reduction in storage overhead, while delivering good MapReduce performance under moderate workloads. The two coding solutions under evaluation here add only moderately to the storage overhead of double replication, while simultaneously offering reliability levels similar to those of triple replication.
One might expect from the property of inherent data duplication that the performance of these codes in executing a MapReduce job would be comparable to that of double replication. However, a second feature of this class of code comes into play here, namely that under both coding schemes analyzed here, multiple blocks from the same coded stripe are required to be stored on the same node. This concentration of data belonging to a single stripe negatively impacts MapReduce execution times. However, much of this effect can be undone by simply adding a larger number of processors per node. Further improvements are possible if one tailors the Map task scheduler to the codes under consideration. We present both experimental and simulation results that validate these observations.
Submitted 26 June, 2014;
originally announced June 2014.
-
An Active Host-Based Intrusion Detection System for ARP-Related Attacks and its Verification
Authors:
Ferdous A Barbhuiya,
Santosh Biswas,
Sukumar Nandi
Abstract:
Spoofing with a falsified IP-MAC pair is the first step in most LAN-based attacks. The Address Resolution Protocol (ARP) is stateless, which is the main reason spoofing is possible. Several network-level and host-level mechanisms have been proposed to detect and mitigate ARP spoofing, but each has its own drawbacks. In this paper we propose a host-based intrusion detection system for LAN attacks, which works without any extra constraints such as static IP-MAC bindings or modifications to ARP. The proposed scheme is verified under all possible attack scenarios. The scheme is successfully validated in a test bed with various attack scenarios, and the results show the effectiveness of the proposed technique.
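Why statelessness enables spoofing can be shown with a toy model of an ARP cache. The sketch below is a simplified simulation with hypothetical class names and addresses; the second class is a naive passive conflict check, not the active detection scheme proposed in the paper, and it would also flag legitimate changes such as a replaced network card.

```python
class StatelessArpCache:
    """Toy cache that accepts any ARP reply, even an unsolicited one."""

    def __init__(self):
        self.table = {}

    def on_reply(self, ip, mac):
        # No record of outstanding requests is kept, so the newest claim wins.
        self.table[ip] = mac


class MonitoredArpCache(StatelessArpCache):
    """Toy cache that refuses to overwrite a binding and records the conflict."""

    def __init__(self):
        super().__init__()
        self.alerts = []

    def on_reply(self, ip, mac):
        known = self.table.get(ip)
        if known is not None and known != mac:
            self.alerts.append((ip, known, mac))  # suspicious rebinding
            return                                # keep the existing binding
        super().on_reply(ip, mac)


# Hypothetical addresses: the gateway's real MAC and an attacker's MAC.
GATEWAY_IP = "192.168.0.1"
REAL_MAC = "aa:aa:aa:aa:aa:aa"
FAKE_MAC = "bb:bb:bb:bb:bb:bb"

plain = StatelessArpCache()
plain.on_reply(GATEWAY_IP, REAL_MAC)
plain.on_reply(GATEWAY_IP, FAKE_MAC)      # forged reply silently replaces the entry

monitored = MonitoredArpCache()
monitored.on_reply(GATEWAY_IP, REAL_MAC)
monitored.on_reply(GATEWAY_IP, FAKE_MAC)  # forged reply is flagged instead
```

A passive check like this cannot tell which of the two claims is genuine, which is one reason to verify a suspicious binding actively before trusting it.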
Submitted 6 June, 2013;
originally announced June 2013.