Saikat Chakraborty

Ahsanullah University of Science and Technology, Computer Science and Engineering, Faculty Member

Bangladesh University of Engineering and Technology, Department of Computer Science and Engineering, Alumnus

Papers by Saikat Chakraborty

On Multi-Modal Learning of Editing Source Code

2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE)

CODIT: Code Editing with Tree-Based Neural Machine Translation

The way developers edit day-to-day code tends to be repetitive, often using existing code elements. Many researchers have tried to automate repetitive code changes by learning from specific change templates, which are applied to a limited scope. The advancement of Neural Machine Translation (NMT) and the availability of vast open-source evolutionary data open up the possibility of automatically learning those templates from the wild. However, unlike natural languages, for which NMT techniques were originally devised, source code and its changes have certain properties. For instance, compared to natural language, source code vocabulary can be significantly larger. Further, good changes in code do not break its syntactic structure. Thus, deploying state-of-the-art NMT models without adapting the methods to the source code domain yields sub-optimal results. To this end, we propose a novel tree-based NMT system to model source code changes and learn code change patterns from the wild. We ...

Retrieval Augmented Code Generation and Summarization

Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Contrastive Learning for Source Code with Structural and Functional Properties

Pre-trained transformer models have recently shown promise for understanding source code. Most existing works expect to understand code from the textual features and limited structural knowledge of code. However, the program functionalities sometimes cannot be fully revealed by the code sequence, even with structure information. Programs can contain very different tokens and structures while sharing the same functionality, but changing only one or a few code tokens can introduce unexpected or malicious program behaviors while preserving the syntax and most tokens. In this work, we present BOOST, a novel self-supervised model to focus pre-training based on the characteristics of source code. We first employ automated, structure-guided code transformation algorithms that generate (i) functionally equivalent code that looks drastically different from the original one, and (ii) textually and syntactically very similar code that is functionally distinct from the original. We train...
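
The two kinds of transformation described in this abstract can be illustrated with a minimal sketch. It is not BOOST's actual algorithm, which the abstract does not specify; it assumes Python as the target language and uses the standard `ast` module. The classes `RenameVariables` and `FlipFirstComparison`, the helpers `transform` and `run`, and the toy function `count_below` are all hypothetical names introduced here for illustration. Identifier renaming stands in for a functionally equivalent variant, and a one-token operator change stands in for a textually similar but functionally distinct variant.

```python
import ast


class RenameVariables(ast.NodeTransformer):
    """Rename identifiers via a mapping; behaviour is unchanged (illustrative only)."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):
        node.arg = self.mapping.get(node.arg, node.arg)
        return node


class FlipFirstComparison(ast.NodeTransformer):
    """Turn the first `<` into `<=`: a one-token edit that changes behaviour."""

    def __init__(self):
        self.done = False

    def visit_Compare(self, node):
        self.generic_visit(node)
        if not self.done and isinstance(node.ops[0], ast.Lt):
            node.ops[0] = ast.LtE()
            self.done = True
        return node


# Hypothetical toy program used as the "original" code.
SOURCE = """
def count_below(values, limit):
    total = 0
    for v in values:
        if v < limit:
            total += 1
    return total
"""


def transform(source, transformer):
    """Parse the source, apply a transformer, and return the new source text."""
    tree = transformer.visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


def run(source, func_name, *args):
    """Execute the source in a fresh namespace and call one of its functions."""
    namespace = {}
    exec(source, namespace)
    return namespace[func_name](*args)


# (i) different tokens, same functionality
equivalent = transform(
    SOURCE,
    RenameVariables({"values": "xs", "limit": "k", "total": "n", "v": "x"}),
)
# (ii) nearly identical tokens, different functionality
distinct = transform(SOURCE, FlipFirstComparison())

if __name__ == "__main__":
    for label, code in [("original", SOURCE), ("equivalent", equivalent), ("distinct", distinct)]:
        print(label, run(code, "count_below", [1, 2, 3], 2))
```

Renaming is a much milder change than the "drastically different" variants the abstract mentions; it is used here only because its equivalence is easy to check by running both versions on the same input.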

Deep Learning based Vulnerability Detection: Are We There Yet?

IEEE Transactions on Software Engineering, 2021

A Transformer-based Approach for Source Code Summarization

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

FP-ANK: An improvised intrusion detection system with hybridization of neural network and K-means clustering over feature selection by PCA

2015 18th International Conference on Computer and Information Technology (ICCIT), 2015

An Intrusion Detection System (IDS) works predominantly to detect malicious attacks. Many researchers have proposed IDSs that combine clustering with an Artificial Neural Network (ANN) to achieve the best accuracy. Clustering- and ANN-based models give a better precision rate and better accuracy where attack records are few. Nevertheless, not all features of a dataset are relevant for classifying different attacks, so feature selection can improve the stability and accuracy of an IDS. This paper proposes that an IDS built on the most effective features selected by Principal Component Analysis (PCA) can reduce the computational complexity of the system. PCA is combined with the K-means clustering technique, which clusters the specific groups of attacks, and with an Artificial Neural Network that is trained on the formulation of different base models to produce the final output. The resulting model is named FP-ANK. Experimental results are reported on the NSL-KDD dataset, where the accuracy rate is compared with that of other models to validate the proposed system.
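
The first two stages named in this abstract, PCA followed by K-means, can be sketched as below. This is only an illustration under stated assumptions, not the FP-ANK implementation: it projects each record onto the single leading principal component (found by power iteration), clusters the projected values with a one-dimensional K-means, and omits the neural network stage entirely. The helper names (`center`, `top_component`, `project`, `kmeans_1d`) and the six toy records are hypothetical; they are not NSL-KDD data.

```python
import math


def center(rows):
    """Subtract the per-feature mean from every record."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def top_component(centered, iterations=100):
    """Leading eigenvector of the sample covariance matrix via power iteration."""
    n, d = len(centered), len(centered[0])
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    vec = [1.0] * d
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec


def project(centered, component):
    """Reduce each record to its coordinate along the given component."""
    return [sum(a * b for a, b in zip(r, component)) for r in centered]


def kmeans_1d(points, k=2, iterations=20):
    """K-means on scalars; centroids start evenly spaced between min and max."""
    lo, hi = min(points), max(points)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: abs(p - centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels


# Hypothetical toy records with three features; the third carries little variance.
records = [
    [1.0, 2.0, 0.5], [1.2, 1.9, 0.4], [0.9, 2.1, 0.6],
    [8.0, 9.0, 0.5], [8.2, 9.1, 0.4], [7.9, 8.8, 0.6],
]

centered = center(records)
component = top_component(centered)
reduced = project(centered, component)
labels = kmeans_1d(reduced, k=2)

if __name__ == "__main__":
    print(labels)
```

In a fuller pipeline the cluster assignments, or the reduced features within each cluster, would feed a classifier; that stage is left out here because the abstract does not describe the network's architecture or training setup.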