SP-Summarizing Source Code With Transferred API Knowledge

The document presents a novel approach called TL-CodeSum for summarizing Java methods using transferred API knowledge, which enhances code comprehension and search. The method addresses limitations of previous code summarization techniques that rely heavily on the availability of similar code snippets and poorly named identifiers. Experimental results demonstrate that TL-CodeSum significantly outperforms existing state-of-the-art methods in generating accurate code summaries.


Singapore Management University

Institutional Knowledge at Singapore Management University

Research Collection School Of Computing and Information Systems

7-2018

Summarizing source code with transferred API knowledge


Xing HU

Ge LI

Xin XIA

David LO
Singapore Management University, davidlo@smu.edu.sg

Shuai LU


Follow this and additional works at: https://ink.library.smu.edu.sg/sis_research

Part of the Software Engineering Commons

Citation
HU, Xing; LI, Ge; XIA, Xin; LO, David; LU, Shuai; and JIN, Zhi. Summarizing source code with transferred API knowledge. (2018). Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI 2018), Stockholm, Sweden, 2018 July 13-19. 2269-2275.
Available at: https://ink.library.smu.edu.sg/sis_research/4295

This Conference Proceeding Article is brought to you for free and open access by the School of Computing and
Information Systems at Institutional Knowledge at Singapore Management University. It has been accepted for
inclusion in Research Collection School Of Computing and Information Systems by an authorized administrator of
Institutional Knowledge at Singapore Management University. For more information, please email
cherylds@smu.edu.sg.
Author
Xing HU, Ge LI, Xin XIA, David LO, Shuai LU, and Zhi JIN

This conference proceeding article is available at Institutional Knowledge at Singapore Management University:
https://ink.library.smu.edu.sg/sis_research/4295
Summarizing Source Code with Transferred API Knowledge
Xing Hu1,2 , Ge Li1,2∗ , Xin Xia3 , David Lo4 , Shuai Lu1,2 , Zhi Jin1,2∗
1 Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education
2 Institute of Software, EECS, Peking University, Beijing, China
3 Faculty of Information Technology, Monash University, Australia
4 School of Information Systems, Singapore Management University, Singapore
{huxing0101, lige, shuai.l, zhijin}@pku.edu.cn, xin.xia@monash.edu, davidlo@smu.edu.sg

Abstract

Code summarization, aiming to generate succinct natural language descriptions of source code, is extremely useful for code search and code comprehension. It has played an important role in software maintenance and evolution. Previous approaches generate summaries by retrieving summaries from similar code snippets. However, these approaches heavily rely on whether similar code snippets can be retrieved and how similar the snippets are, and they fail to capture the API knowledge in the source code, which carries vital information about the functionality of the source code. In this paper, we propose a novel approach, named TL-CodeSum, which successfully uses API knowledge learned in a different but related task for code summarization. Experiments on large-scale real-world industry Java projects indicate that our approach is effective and outperforms the state-of-the-art in code summarization.

1 Introduction

As a critical task in software maintenance and evolution, code summarization aims to generate functional natural language descriptions for a piece of source code (e.g., a method). Good summaries improve program comprehension and help code search [Haiduc et al., 2010]. The code comment is one of the most common summaries used during software development. Unfortunately, the lack of high-quality code comments is a common problem in the software industry. Good comments are often absent, unmatched, or outdated during evolution. Additionally, writing comments during development is time-consuming for developers. To address these issues, some studies have tried to generate summaries for source code automatically [Haiduc et al., 2010; Moreno et al., 2013; Iyer et al., 2016; Hu et al., 2018]. Generating code summaries automatically can save developers' time in writing comments and help program comprehension and code search.

Previous works have exploited Information Retrieval (IR) approaches and learning-based approaches to generate summaries. Some IR approaches search comments from similar code snippets as summaries [Haiduc et al., 2010; Eddy et al., 2013], while other approaches extract keywords from the given code snippets as summaries [Moreno et al., 2013]. However, these IR-based approaches have two main limitations. First, they fail to extract accurate keywords when the identifiers and methods are poorly named. Second, they cannot output accurate summaries if no similar code snippet exists.

Recently, some studies have adopted deep learning approaches to generate summaries by building probabilistic models of source code [Iyer et al., 2016; Allamanis et al., 2016; Hu et al., 2018]. [Hu et al., 2018] combine the neural machine translation model and the structural information within Java methods to generate summaries automatically. [Allamanis et al., 2016] propose a convolutional model to generate name-like summaries, and their approach can only produce summaries with an average of 3 words. [Iyer et al., 2016] present an attention-based Recurrent Neural Network (RNN) named CODE-NN to generate summaries for C# and SQL code snippets collected from Stack Overflow. Their experimental results have proved the effectiveness of deep learning approaches for code summarization. Although deep learning techniques are a successful first step toward automatic code summary generation, their performance is limited since they treat source code as plain text. There is much latent knowledge in source code, e.g., identifier naming conventions and Application Programming Interface (API) usage patterns.

Intuitively, the functionality of a code snippet is related to its API sequence. Developers often invoke a specific API sequence to implement a new feature. Compared to source code with different coding conventions, API sequences tend to be regular. For example, we usually use the following API sequence of the Java Development Kit (JDK): FileReader.new, BufferedReader.new, BufferedReader.read, and BufferedReader.close to implement the function "Read a file". We conjecture that knowledge discovered in API sequences can assist the generation of code summaries. Inspired by transfer learning [Pan and Yang, 2010], the code summarization task can be fine-tuned by using the API knowledge learned in a different but related task. To verify our conjecture, we conduct an experiment on generating summaries for Java methods, which are functional units of the Java programming language.

In this paper, we propose a novel approach called TL-CodeSum, which generates summaries for Java methods with

∗ Corresponding Authors
(Figure 1 architecture diagram: data processing extracts pairs of API sequences and summaries from the 2009-14 corpus and pairs of code, API sequence, and summary from the 2015-16 corpus; the API summarization model's encoder is transferred into the code summarization model, which is then used for online summary generation.)

Figure 1: The overall architecture of TL-CodeSum

the assistance of transferred API knowledge from another task of API sequence summarization. We conduct the code summarization task on Java projects created from 2015 to 2016 in GitHub. The API sequence summarization task aims to build the mappings between API knowledge and the corresponding natural language descriptions. The corpus for API sequence summarization consists of ⟨API sequence, summary⟩ pairs extracted from large-scale Java projects created from 2009 to 2014 in GitHub. The experimental results demonstrate that TL-CodeSum significantly outperforms the state-of-the-art on code summarization.

The contributions of our work are as follows:

• We propose a novel approach named TL-CodeSum that summarizes Java methods with the assistance of the learned API knowledge.

• We design a framework to learn API knowledge from the API summarization task and use it to assist the code summarization task.

2 Related Work

As an integral part of software development, code summaries describe the functionalities of source code. IR approaches [Haiduc et al., 2010; Wong et al., 2015] and learning-based approaches [Iyer et al., 2016; Allamanis et al., 2016] have been exploited for automatic code summarization. IR approaches are widely used in code summarization. They usually synthesize summaries by retrieving keywords from source code or searching comments from similar code snippets. [Haiduc et al., 2010] applied two IR techniques, the Vector Space Model (VSM) and Latent Semantic Indexing (LSI), to generate term-based summaries for Java classes and methods. [Wong et al., 2015] applied code clone detection techniques to find similar code snippets and extract the comments from them. The effectiveness of IR approaches heavily depends on whether similar code snippets exist and how similar they are. When extracting keywords from the given code snippets, they fail to generate accurate summaries if the source code contains poorly named identifiers or method names.

Recently, inspired by the work of [Hindle et al., 2012], an increasing number of software tasks, e.g., fault detection [Ray et al., 2016], code completion [Nguyen et al., 2013], and code summarization [Iyer et al., 2016], build language models for source code. These language models range from n-gram models [Nguyen et al., 2013; Allamanis et al., 2014] and bimodal models [Allamanis et al., 2015b] to RNNs [Iyer et al., 2016; Gu et al., 2016]. Generating summaries from source code aims to bridge the gap between programming language and natural language. [Raychev et al., 2015] aimed to predict names and types of variables, whereas [Allamanis et al., 2015a; 2016] suggested names for variables, methods, and classes. [Hu et al., 2018] exploited the neural machine translation model for code summarization with the assistance of structural information. [Allamanis et al., 2016] applied a neural convolutional attentional model to summarize Java code into short, name-like summaries (3 words on average). [Iyer et al., 2016] presented an attention-based RNN network to generate summaries describing the functionalities of C# code snippets and SQL queries. These works have proved the effectiveness of building probabilistic models for code summarization. In this paper, we consider exploiting the latent API knowledge in source code to assist code summarization. Inspired by transfer learning, which has achieved success in training models with learned knowledge [Pan and Yang, 2010], the API knowledge used for code summarization is learned from a different but related task.

3 Approach

In this section, we present our proposed approach TL-CodeSum, which decodes summaries from source code with transferred API knowledge. As shown in Figure 1, the approach mainly consists of three parts: data processing, model training, and online code summary generation. The model implements two tasks, the API sequence summarization task and the code summarization task. The API sequence summarization task aims to build the mappings between API knowledge and functionality descriptions. The learned API knowledge is then applied to the code summarization task to assist summary generation. The details of the two tasks are introduced in the following sections.

3.1 API Sequence Summarization Task

API sequence summarization aims to build the mappings between API knowledge and natural language descriptions. To implement a certain functionality, for example, reading a file, developers often invoke the corresponding API sequences. In this paper, we exploit this API knowledge to assist code summarization.

The knowledge is learned from the API summarization task, which generates summaries for API sequences. The task adopts a basic Sequence-to-Sequence (Seq2Seq) model, which has achieved success in Machine Translation (MT) [Sutskever et al., 2014], Text Summarization [Rush et
(Figure 2 model diagrams: (a) the API sequence encoder and decoder; (b) the code summarization model, with a code encoder, the transferred API encoder, and a decoder attending over both.)

Figure 2: The model of TL-CodeSum. (a) API Sequence Summarization. (b) Code Summarization with Transferred API Knowledge. (The red part is learned from the API sequence summarization task.)
al., 2015], etc. As shown in Figure 2(a), it mainly contains two parts, an API sequence encoder and a decoder. Let $A' = \{A'^{(i)}\}$ denote a set of API sequences, where $A'^{(i)} = [a'_1, ..., a'_m]$ denotes the sequence of API invocations in a Java method. For each $A'^{(i)} \in A'$, there is a corresponding natural language description $D'^{(i)} = [d'_1, ..., d'_n]$. The goal of API sequence summarization is to align $A'$ and $D'$, namely, $A' \to D'$.

The API encoder uses an RNN to read the API sequence $A'^{(i)} = [a'_1, ..., a'_m]$ one-by-one. The API sequence is embedded into a vector that represents the API knowledge. The API knowledge is then used to generate the target summary by the decoder. To better capture the latent alignment relations between API sequences and summaries, we adopt the classic attention mechanism [Bahdanau et al., 2014]. The hidden state of the encoder is updated according to the current API and the previous hidden state,

    $h'_t = f(a'_t, h'_{t-1})$    (1)

where $f$ is a non-linear function that maps a word of the source language into a hidden state $h'_t$ at time $t$ by considering the previous hidden state $h'_{t-1}$. In this paper, we use a Gated Recurrent Unit (GRU) as $f$. The decoder is another RNN, trained to predict the conditional probability of the next word $d'_{t'}$ given the context vector $C'$ and the previously predicted words $d'_1, ..., d'_{t'-1}$ as

    $p(d'_{t'} | d'_1, ..., d'_{t'-1}, A') = g(d'_{t'-1}, s'_{t'}, C'_{t'})$    (2)

where $g$ is a non-linear function that outputs the probability of $d'_{t'}$, and $s'_{t'}$ is an RNN hidden state for time step $t'$, computed by

    $s'_{t'} = f(s'_{t'-1}, d'_{t'-1}, C'_{t'})$    (3)

The context vector $C'_i$ is computed as a weighted sum of the hidden states of the encoder $h'_1, ..., h'_m$,

    $C'_i = \sum_{j=1}^{m} \alpha'_{ij} h'_j$    (4)

where

    $\alpha'_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{m} \exp(e_{ik})}$    (5)

and

    $e_{ij} = a(s'_{i-1}, h'_j)$    (6)

is an alignment model which scores how well the inputs around position $j$ and the output at position $i$ match. Both the encoder and decoder RNNs are implemented as GRUs [Cho et al., 2014], one of the most widely used RNN variants.

3.2 Code Summarization Task

The code summarization model is a variant of the basic Seq2Seq model. Instead of using only a code encoder and a decoder, TL-CodeSum adds another API encoder, which is transferred from the API summarization model. Let $C = \{C^{(i)}\}$, $A = \{A^{(i)}\}$, and $D = \{D^{(i)}\}$ denote the source code, API sequences, and corresponding summaries of Java methods respectively. The goal of code summarization is to generate summaries from source code with the assistance of the API knowledge learned from API sequence summarization, namely, $C, A \to D$.

As shown in Figure 2(b), the API sequences within Java methods are encoded by the transferred API encoder, which is marked red in the API summarization task. The code encoder and API encoder learn the semantic information of the given code snippet $C = [c_1, ..., c_l]$ and API sequence $A = [a_1, ..., a_m]$ respectively. To better integrate the two sources of information, the decoder combines the attention information collected from both encoders. The context vector is computed as their sum,

    $C_i = \sum_{j=1}^{l} \alpha_{ij} h_j + \sum_{j=1}^{m} \alpha'_{ij} h'_j$    (7)

where $\alpha$ and $\alpha'$ are the attention distributions over source code and API sequence respectively. The decoding procedure is similar to the API summarization task, adopting a GRU to predict word-by-word.

4 Experiments

4.1 Dataset Details

There are two datasets used in our work, one for API sequence summarization and the other for code summarization, as shown in the data processing stage in Figure 1.
Table 1: Statistics for code snippets in our dataset

Datasets  #Projects  #Files      #Lines        #Items
15-16     9,732      1,051,647   158,571,730   69,708
09-14     13,154     2,938,929   496,215,929   340,922

Table 2: Statistics for API sequence, code, and comment lengths

API sequence lengths:  Avg 4.39,  Mode 1,  Median 2;   <5: 79.99%,   <10: 91.38%,  <20: 97.18%
Comment lengths:       Avg 8.86,  Mode 8,  Median 13;  <20: 75.50%,  <30: 86.79%,  <50: 95.45%
Code lengths:          Avg 99.94, Mode 16, Median 65;  <100: 68.63%, <150: 82.06%, <200: 89.00%

Table 3: Precision, Recall, and F-score for our approach compared with baselines

Approaches               Precision  Recall  F-score
CODE-NN                  26.21      14.17   18.40
API-Only                 30.72      21.14   25.05
Code-Only                38.89      28.81   33.10
API+Code                 41.06      30.34   34.90
TL-CodeSum (fixed)       42.20      34.38   37.89
TL-CodeSum (fine-tuned)  40.78      35.41   37.91

Table 4: BLEU and METEOR for our approach compared with baselines

Approaches               BLEU-4  METEOR
CODE-NN                  25.3    6.92
API-Only                 26.45   10.71
Code-Only                35.50   14.78
API+Code                 37.28   15.88
TL-CodeSum (fixed)       36.42   18.07
TL-CodeSum (fine-tuned)  41.98   18.81
The two datasets are both collected from GitHub. The API sequence summarization dataset contains Java projects from 2009 to 2014 and is used to learn API knowledge. The Java projects used in the code summarization task were created from 2015 to 2016. The API knowledge learned from the former dataset is applied to train the code summarization task on the latter dataset. To ensure the quality of the projects, we select projects that have at least 20 stars as the preliminary dataset.

The API sequences are extracted by the approach proposed by [Gu et al., 2016]. We use Eclipse's JDT compiler1 to parse source code into ASTs. We then extract the Java methods, the API sequences within these methods, and the corresponding Javadoc comments, which are the standard comments for Java methods. These comments, which describe the functionalities of Java methods, are taken as code summaries. The source code is tokenized before being fed into the network. To decrease the noise introduced into the learning process, we take only the first sentence of each comment, since it typically describes the functionality of the Java method according to the Javadoc guidance2. However, not every comment is useful, so some heuristic rules are required to filter the data. Methods with empty or one-word descriptions are filtered out. Setter, getter, constructor, test, and override methods, whose comments are easy to predict, are also excluded.

At last, we get 340,922 pairs of ⟨API sequence, summary⟩ for API knowledge learning in the API sequence summarization task and 69,708 pairs of ⟨API sequence, code, summary⟩ for the code summarization task.3 We split each dataset into training, validation, and testing sets in the proportion 8:1:1 after shuffling the pairs. We train all models on the training set and compute accuracy scores on the test set. The average lengths of Java methods, API sequences, and comments are 99.94, 4.39, and 8.86 respectively. The detailed information of the datasets is shown in Table 1 and Table 2.

1 http://www.eclipse.org/jdt/
2 http://www.oracle.com/technetwork/articles/java/index-137868.html
3 The data and code are available at https://github.com/xing-hu/TL-CodeSum

4.2 Experiment Settings

We set the dimensionality of the GRU hidden states, token embeddings, and summary embeddings to 128. The model is trained using mini-batch stochastic gradient descent (SGD) with a batch size of 32. The maximum lengths of source code and API sequences are 300 and 20. For decoding, we set the beam size to 5 and the maximum summary length to 30 words. Sequences that exceed the maximum lengths are excluded from training. The vocabulary sizes of the code, API, and summary are 50,000, 33,082, and 26,971 respectively. We use TensorFlow to train our models on GPUs.

5 Experimental Results

5.1 Accuracy in Summary Generation

Metric: In this paper, we use IR metrics and Machine Translation (MT) metrics to evaluate our method. For IR metrics, we report the precision, recall, and F-score of our method. Based on the number of mapped unigrams found between the two strings (m), the total number of unigrams in the translation (t), and the total number of unigrams in the reference (r), we calculate unigram precision P = m/t and unigram recall R = m/r. Precision is the fraction of generated summary tokens that are relevant, while recall is the fraction of relevant tokens that are generated. F-score is the quality compromise between precision and recall.

We use two MT metrics, the BLEU-4 score [Papineni et al., 2002] and METEOR [Denkowski and Lavie, 2014], which are also used in CODE-NN, to measure the accuracy of generated source code summaries. BLEU score is a widely used accuracy measure for machine translation. It computes the n-gram precision of a candidate sequence against the reference. METEOR is recall-oriented and evaluates translation hypotheses by aligning them to reference translations and calculating sentence-level similarity scores.

Table 5: Examples of generated summaries given Java methods and API sequences.

Example 1
Java method:
    protected void sprint(double doubleField){
        sprint(String.valueOf(doubleField));
    }
API sequence:  String.valueOf
Human-written: Pretty printing accumulator function for doubles
TL-CodeSum:    pretty printing accumulator function for longs

Example 2
Java method:
    public void removeMouseListener(GlobalMouseListener listener){
        listeners.remove(listener);
    }
API sequence:  List.remove
Human-written: Removes a global mouse listener
TL-CodeSum:    removes an existing message listener.

Example 3
Java method:
    private static boolean instanceOfAny(Object o, Collection<Class> classes){
        for(Class c: classes){
            if (c.isInstance(o))
                return true;
        }
        return false;
    }
API sequence:  Collection.isEmpty → Collection.add → Class.isInstance
Human-written: returns true if the Object 'o' is an instance of any class in the Collection
TL-CodeSum:    returns true if the object is registered in classes, or false otherwise.

Figure 3: A 2D projection of API embeddings using t-SNE

Baseline: We compare TL-CodeSum with CODE-NN [Iyer et al., 2016], a state-of-the-art code summarization approach. CODE-NN proposed an end-to-end generation system to produce summaries given code snippets. Compared to TL-CodeSum, CODE-NN generates each word by a global attention model which computes a weighted sum of the embeddings of code tokens instead of the hidden states of RNNs. We also evaluate the accuracy of summaries generated from API sequences alone and from code alone using the basic Seq2Seq model (API-Only and Code-Only). To evaluate the influence of the transferred API knowledge, we conduct an experiment that uses two encoders to encode API sequences and source code respectively without transferred API knowledge (API+Code). Additionally, we compare two approaches to exploiting API knowledge: fine-tuning the whole network (fine-tuned TL-CodeSum) and training the network with fixed API knowledge (fixed TL-CodeSum).

Results: Table 3 shows the results on IR metrics for the different approaches. Precision denotes the ratio of matching words in the generated comments. The results show that using an RNN to encode the source code (Code-Only) or API sequences (API-Only) outperforms using the embeddings of tokens directly (CODE-NN). RNNs are good at learning the semantics of input sequences, and the code information is much more helpful for summary generation. When combining source code and API information, the precision is much higher than CODE-NN and the two basic Seq2Seq models (i.e., Code-Only and API-Only). These improvements demonstrate the importance of API information for generating comments. Furthermore, transferring the API knowledge from the API sequence summarization task directly improves the precision and recall. The precision decreases when fine-tuning the whole network, while the recall increases. In terms of F-score, our proposed model with fine-tuning shows a slight improvement over our model with fixed parameters. TL-CodeSum generates more words that overlap between automatically generated summaries and human-written summaries. Overall, TL-CodeSum surpasses the other approaches in generating informative summaries.

We also evaluate the gap between automatically generated summaries and human-written summaries on MT metrics. Table 4 shows the METEOR scores and sentence-level BLEU-4 scores of the different approaches to generating comments for Java methods. As the results indicate, TL-CodeSum clearly outperforms the state-of-the-art method CODE-NN on Java method summarization. The BLEU-4 score and METEOR of CODE-NN and API-Only reflect that summarizing from API sequences with a Seq2Seq model has ability similar to CODE-NN, although API sequences carry much less semantic content than the source code. It mainly learns the relationship between API knowledge and the functionalities of Java methods. Integrating the learned API knowledge and source code greatly improves the BLEU score and METEOR. Through the evaluation, we have verified the effectiveness of API usage patterns for code summarization. TL-CodeSum generates not only more informative but also more expressive comments than the state-of-the-art baselines. Compared to the model without API
(Figure 4(a) example: a Java method with API sequence DataOutputStream.writeByte → DataOutputStream.writeShort → DataOutputStream.writeShort; human-written comment: "Write the constant to the output stream"; automatically generated comment: "Write the constant to the output stream". Panels (b) and (c) show attention-weight heatmaps over the API sequence and the source code tokens.)

Figure 4: Heatmap of attention weights for API sequence and source code snippets. The model learns to align key summary words with the corresponding tokens in API sequences and source code.
sequences, the BLEU-4 score of TL-CodeSum increases to 41.98%.

5.2 Quality Analysis

API Embedding Quality. The API usage pattern is an important part of code summarization. Different coding conventions of different developers increase the difficulty of semantic learning. API usage patterns are relatively regular, hence integrating API knowledge helps learn the functionalities of source code. The quality of the learned API embeddings is crucial for our proposed method to work well. Figure 3 shows a 2-D projection of the embeddings of APIs. For ease of demonstration, we select the APIs related to "String" and "Math", which are circled in Figure 3. As shown in the graph, TL-CodeSum can successfully embed APIs implementing similar functionalities.

Complementarity of API and Code. TL-CodeSum generates summaries according to the semantics of the source code and the transferred API knowledge. Figure 4 shows the attention weights for the API sequence and code tokens within a Java method while generating the corresponding summary. We give the details of the Java method, the API sequence within it, the human-written comment, and the comment automatically generated by TL-CodeSum in Figure 4(a). The generated tokens have different relationships to API-sequence and code tokens. From the figure, we find that the words "write" and "stream" are more relevant to the API "DataOutputStream.writeByte", while the word "constant" is more relevant to the variable "tab", whose type is "ConstantPool". TL-CodeSum aligns different words with specific API or code tokens.

Comparison between Human-Written and TL-CodeSum Generated Summaries. Table 5 shows three examples of generated summaries. Most generated summaries are clear, coherent, and informative regardless of the lengths of the Java methods. The main differences between the generated and human-written summaries are as follows:

1. Word replacement: Some words are replaced by their synonyms, antonyms, or words in the same domain. In the first example, the word "doubles" is replaced by "longs", which comes from the same domain (the data types of the Java language).

2. More general: TL-CodeSum learns functionalities over a large-scale dataset. The generated summaries may present a more general meaning and give the abstract semantics of the given Java methods, as in the second example.

3. Missed identifiers: Identifiers are defined by different developers, and those used in different methods may differ from one another. Learning identifiers is a challenging problem [Hellendoorn and Devanbu, 2017]. TL-CodeSum sometimes misses identifiers or replaces them with "UNK". As the third example shows, the identifiers "o" and "Collection" are missing from the generated summary.

6 Conclusion

In this paper, we propose a novel deep model called TL-CodeSum to generate summaries by capturing semantics from source code with the assistance of API knowledge. The API knowledge is transferred into TL-CodeSum from the API sequence summarization task. Experimental results on Java methods indicate that integrating API sequences is beneficial and effective. TL-CodeSum significantly outperforms the state-of-the-art methods for code summarization. In the future, we will combine richer program structural and sequential information derived from program analysis tools for code summarization.

Acknowledgments

This research is partially supported by the National Basic Research Program of China (the 973 Program) under Grant No. 2015CB352201, and the National Natural Science Foundation of China under Grant No. 61620106007.
References

[Allamanis et al., 2014] Miltiadis Allamanis, Earl T Barr, Christian Bird, and Charles Sutton. Learning natural coding conventions. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, pages 281-293. ACM, 2014.

[Allamanis et al., 2015a] Miltiadis Allamanis, Earl T Barr, Christian Bird, and Charles Sutton. Suggesting accurate method and class names. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering, pages 38-49. ACM, 2015.

[Allamanis et al., 2015b] Miltos Allamanis, Daniel Tarlow, Andrew Gordon, and Yi Wei. Bimodal modelling of source code and natural language. In Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pages 2123-2132, 2015.

[Allamanis et al., 2016] Miltiadis Allamanis, Hao Peng, and Charles Sutton. A convolutional attention network for extreme summarization of source code. In International Conference on Machine Learning, pages 2091-2100, 2016.

[Bahdanau et al., 2014] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural machine translation by jointly learning to align and translate. Computer Science, 2014.

[Cho et al., 2014] Kyunghyun Cho, Bart Van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning phrase representations using rnn encoder-decoder for statistical

[Hindle et al., 2012] Abram Hindle, Earl T Barr, Zhendong Su, Mark Gabel, and Premkumar Devanbu. On the naturalness of software. In Software Engineering (ICSE), 2012 34th International Conference on, pages 837-847. IEEE, 2012.

[Hu et al., 2018] Xing Hu, Ge Li, Xin Xia, David Lo, and Zhi Jin. Deep code comment generation. In Proceedings of the 2018 26th IEEE/ACM International Conference on Program Comprehension. ACM, 2018.

[Iyer et al., 2016] Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, and Luke Zettlemoyer. Summarizing source code using a neural attention model. In ACL (1), 2016.

[Moreno et al., 2013] Laura Moreno, Jairo Aponte, Giriprasad Sridhara, Andrian Marcus, Lori Pollock, and K Vijay-Shanker. Automatic generation of natural language summaries for java classes. In Program Comprehension (ICPC), 2013 IEEE 21st International Conference on, pages 23-32. IEEE, 2013.

[Nguyen et al., 2013] Tung Thanh Nguyen, Anh Tuan Nguyen, Hoan Anh Nguyen, and Tien N Nguyen. A statistical semantic language model for source code. In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, pages 532-542. ACM, 2013.

[Pan and Yang, 2010] Sinno Jialin Pan and Qiang Yang. A survey on transfer learning. IEEE Transactions on knowledge and data engineering, 22(10):1345-1359, 2010.

[Papineni et al., 2002] Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. Bleu: a method for automatic evaluation of machine translation. In Proceedings of
machine translation. Computer Science, 2014. the 40th annual meeting on association for computational
[Denkowski and Lavie, 2014] Michael Denkowski and Alon linguistics, pages 311–318. Association for Computational
Linguistics, 2002.
Lavie. Meteor universal: Language specific translation e-
valuation for any target language. In Proceedings of the [Ray et al., 2016] Baishakhi Ray, Vincent Hellendoorn, Sa-
EACL 2014 Workshop on Statistical Machine Translation, heel Godhane, Zhaopeng Tu, Alberto Bacchelli, and
2014. Premkumar Devanbu. On the naturalness of buggy code.
[Eddy et al., 2013] Brian P Eddy, Jeffrey A Robinson, In Proceedings of the 38th International Conference on
Software Engineering, pages 428–439. ACM, 2016.
Nicholas A Kraft, and Jeffrey C Carver. Evaluating source
code summarization techniques: Replication and expan- [Raychev et al., 2015] Veselin Raychev, Martin Vechev, and
sion. In Program Comprehension (ICPC), 2013 IEEE 21st Andreas Krause. Predicting program properties from big
International Conference on, pages 13–22. IEEE, 2013. code. In ACM SIGPLAN Notices, volume 50, pages 111–
124. ACM, 2015.
[Gu et al., 2016] Xiaodong Gu, Hongyu Zhang, Dongmei
Zhang, and Sunghun Kim. Deep api learning. In Proceed- [Rush et al., 2015] Alexander M Rush, Sumit Chopra, and
ings of the 2016 24th ACM SIGSOFT International Sym- Jason Weston. A neural attention model for ab-
posium on Foundations of Software Engineering, pages stractive sentence summarization. arXiv preprint arX-
631–642. ACM, 2016. iv:1509.00685, 2015.
[Haiduc et al., 2010] Sonia Haiduc, Jairo Aponte, Laura [Sutskever et al., 2014] Ilya Sutskever, Oriol Vinyals, and
Moreno, and Andrian Marcus. On the use of automat- Quoc V Le. Sequence to sequence learning with neural
ed text summarization techniques for summarizing source networks. In Advances in neural information processing
code. In Reverse Engineering (WCRE), 2010 17th Work- systems, pages 3104–3112, 2014.
ing Conference on, pages 35–44. IEEE, 2010. [Wong et al., 2015] Edmund Wong, Taiyue Liu, and Lin Tan.
[Hellendoorn and Devanbu, 2017] Vincent J Hellendoorn Clocom: Mining existing source code for automatic com-
and Premkumar Devanbu. Are deep neural networks the ment generation. In Software Analysis, Evolution and
best choice for modeling source code? In Proceedings of Reengineering (SANER), 2015 IEEE 22nd International
the 2017 11th Joint Meeting on Foundations of Software Conference on, pages 380–389. IEEE, 2015.
Engineering, pages 763–773. ACM, 2017.

You might also like