Effectiveness of Kotlin vs. Java in Android App Development Tasks
Journal Pre-proof
PII: S0950-5849(20)30143-9
DOI: https://doi.org/10.1016/j.infsof.2020.106374
Reference: INFSOF 106374
Please cite this article as: Luca Ardito, Riccardo Coppola, Giovanni Malnati, Marco Torchiano,
Effectiveness of Kotlin vs. Java in Android App Development Tasks, Information and Software Technology
(2020), doi: https://doi.org/10.1016/j.infsof.2020.106374
Abstract
Context: Kotlin is a new programming language representing an alternative
to Java; both target the same JVM and can safely coexist in the same
application. Kotlin is advertised as capable of solving several known limitations
of Java. Recent surveys show that Kotlin has achieved considerable diffusion among
Java developers.
Goal: We planned to empirically assess a few typical promises of Kotlin
with respect to known limitations of Java, in terms of development effectiveness,
maintainability, and ease of development.
Method: Our experiment involved 27 teams of 4 people each that completed
a set of maintenance tasks (both defect correction and feature addition)
on Android apps written in either Java or Kotlin. In addition to
the number of fixed defects, effort, and code size, we collected, through a
questionnaire, the participants' perceptions about the avoidance of known
pitfalls.
Results: We did not observe any significant difference in terms of
maintainability between the two languages. We found a significant difference
regarding the amount of code written, which constitutes evidence of better
conciseness of Kotlin. Concerning ease of development, the frequency of
NullPointerExceptions reported by the subjects was significantly lower when
developing in Kotlin. On the other hand, no significant difference was found
in the occurrence of other common Java pitfalls. Finally, the IDE support
was deemed better for Java than Kotlin.
Conclusions: Some of the promises of Kotlin to be a "better Java" have
been confirmed by our empirical assessment. Evidence suggests that the
1. Introduction
Kotlin is a modern programming language that appeared in 2011 and
represents an alternative to Java, with which it can seamlessly coexist. Much
evidence in the literature indicates that Kotlin is
gaining traction among Android software developers. In a previous study,
we mined all Android apps hosted on the F-Droid platform and updated
after October 2017: we found that nearly one-fifth of them featured Kotlin
code, with two-thirds of those projects featuring more Kotlin than Java code [1].
Similar trends have been reported by Oliveira et al. regarding the number of
StackOverflow questions about Kotlin programming for Android and GitHub
repositories with Kotlin [2].
One of the main design guidelines that led to the development of the
Kotlin language is better handling of null values. In the Java language,
without specific checks, the handling of null values can lead
to NullPointerExceptions (NPEs). Several studies in the literature report the
prominent role of NullPointerExceptions among the reasons for Android
applications to crash. Coelho et al. report that nearly 30% of all stack traces
collected upon Android app crashes contained NPEs as their root causes [3].
The authors also underline the difficulty of protecting the code against those
exceptions, especially when the app does not have access to third-party source
code. NPEs can also happen, as Payet and Spoto report, in the link
between the XML layouts and explicit application code casts [4]. Such a link
is obtained through the very commonly used setContentView and findViewById
methods. These method calls are crucial and frequent operations,
executed every time the components of the application screen are
instantiated. The effects of those issues are amplified by misuses of the exception
handling mechanisms provided by Java, which are frequently documented
among Android developers [5].
Readability and conciseness are considered key features of the Kotlin
language, especially as regards the declaration of objects and classes with
numerous attributes [6].
The novelty of the Kotlin language, and the ease of adapting existing
(and possibly long-running) Java projects to it, suggest the need for an
evaluation of the benefits that such a transition offers developers.
Many advantages are reported by works in the specialized literature, but to
the best of our knowledge, their empirical assessment is still missing. With
this work, we aimed at assessing some assumed advantages of Kotlin with
respect to Java in the context of Android development and maintenance.
To do so, we conducted a controlled study with undergraduate students, a
sample that can represent average Kotlin developers, given the limited experience
that developers currently have with the language.
In light of the ever-increasing adoption of Kotlin for Android development,
this empirical assessment aims to provide practical evidence that could help
in a transition from Java to Kotlin. In particular, we focused on possible
effects on maintainability, conciseness, and avoidance of a few common pitfalls.
The remainder of the paper is organized as follows: Section 2 provides
some background on Kotlin programming, its characteristics, and the recent
trends of its diffusion, together with a brief review of related work in the
literature; Section 3 describes the goal, procedure, participants, and material of
the experiment; Section 4 discusses the threats to the validity of this study;
Section 5 reports the results of the experiment, which are discussed in Section 6;
finally, Section 7 concludes the paper.
2. Background
Kotlin first appeared in 2011, but its first stable release was distributed
in February 2016. In May 2017, Kotlin became a first-class language on
Android, and support has been provided by the Android Studio IDE since release
3.0 of October 2017. The popularity of Kotlin has increased rapidly since then.
The State of Developer Ecosystem in 2018 shows that Kotlin is mainly used
for mobile and server applications, mainly targeting Oreo and Nougat on
Android and JDK 8 on servers. According to statistics provided by JetBrains,
only around 40% of Kotlin developers have used the language for more
than one year¹.
¹ https://www.jetbrains.com/research/devecosystem-2019/ (last visited January 2020)
Kotlin is a statically typed programming language that runs on the Java
Virtual Machine (JVM) and fully interoperates with Java: it is possible to
mix Kotlin and Java code in the same application, and to call Kotlin code from
Java code and vice versa [7]. The two languages share several
commonalities [2], and the official documentation of Kotlin itself reports its main
characteristics by means of comparisons with Java.
Kotlin takes a pragmatic approach: for instance, it does not re-implement the
entire Java collections framework, which keeps it compatible with the JDK collection
interfaces without breaking any existing project implementations. For
example, Kotlin still supports Java 6 bytecode because almost half of the Android
devices still run on it. It is possible to start using Kotlin for small parts of a
large project, including a few UI components and simple business logic. The
possible coexistence of Kotlin and Java can be deemed one of the
main factors fueling the transition to Kotlin among Android developers.
As a first example of features that are not supported by Java, Kotlin
allows functions, in addition to classes, to be first-level constructs. In
Kotlin, everything is an object, even numeric values that in Java are treated
as primitive types. Kotlin provides the ability to extend a class with new
features without having to inherit from the class or use a design pattern
such as Decorator [8], through special declarations called Extension Functions
and Extension Properties.
On the other hand, Kotlin lacks some characteristics of the
Java language, such as checked exceptions, static members, non-private fields,
and the ternary operator.
A complete description of the features of Kotlin is beyond the scope of
this paper². The primary objective of our work has instead been to verify
some of the peculiarities of Kotlin, mostly regarding the avoidance of common
Java development pitfalls [9]:
² A large set of open resources about the Kotlin language is available online at
https://kotlinlang.org/docs/reference/
alternative to returning null, but not as a general-purpose solution to
the nullability problem [11]. Kotlin provides a way to declare nullable
variables explicitly (?) and a safe-call operator (?.) that can be used
in conjunction with the Elvis operator (?:) to avoid most NPEs.
Figure 1 reports side-by-side examples of equivalent Kotlin and Java
code. We can observe how Kotlin allows declaring a nullable variable
(by default, variables are non-nullable) and using the safe-call and Elvis
operators to achieve safer and more compact code.
• Mandatory Casts: Java often requires several explicit casts to let the
compiler cope with type conversions. This makes code longer and harder
to read; in addition, a wrong cast could be accepted by the compiler
and result in a run-time exception. Kotlin introduced smart casts and
a safe (nullable) cast operator (as?).
Figure 2 reports an example of a safe cast in Kotlin and Java. Safe casts
eliminate the possibility of triggering a ClassCastException
at run-time. As is evident from the comparison, the safe cast in Java
requires a more verbose syntax, which we rendered with
the ternary operator, than that needed by Kotlin. Such
higher verbosity can deter developers from extensively
using the practice of safe casting, hence increasing the likelihood
of generating ClassCastExceptions.
Figure 2: Mandatory casts examples in Kotlin vs. Java
• Long argument lists: the invocation of Java methods uses a strict
positional argument mapping. Therefore, methods may require passing
many arguments even if they assume default or null values; writing
overloaded methods might help in such cases, but it may require
significant effort without covering all cases. Kotlin addresses
this issue by defining default values for arguments and allowing, in
addition to positional arguments, passing arguments by name. Other
recent languages have adopted similar solutions, e.g., default values for
arguments are provided by Python.
• Data Classes: often, a program requires the creation of classes whose
primary purpose is to hold data. The amount of boilerplate code
required by Java to implement these classes can be substantial. The
additional code can often be mechanically derived from the data: such
automatic derivation is done by libraries that are not part of the
standard Java library, e.g., Project Lombok³. Kotlin introduced
Data Classes, which the compiler processes to generate all the
required boilerplate code automatically. In our prior investigations of
Kotlin, we found that the LoC savings for a data class
with few fields can be up to 90% w.r.t. the Java equivalent. An
example of a Kotlin class and its Java equivalent is reported in Figure 3.
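The last two pitfalls can also be illustrated outside Kotlin. The following minimal sketch uses Python, which the list above cites for its default argument values, as an analogue; it is not Kotlin syntax, and the Book class and the describe function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Book:
    """Hypothetical data holder: __init__, __repr__ and __eq__ are generated."""
    title: str
    author: str = "unknown"       # default value, callers may omit it
    rating: Optional[int] = None  # absence of a value is declared explicitly


def describe(book: Book, prefix: str = "", uppercase: bool = False) -> str:
    """Format a book; optional parameters have defaults and can be passed by name."""
    text = f"{prefix}{book.title} by {book.author}"
    return text.upper() if uppercase else text
```

A call such as describe(Book("Dune"), uppercase=True) skips the prefix argument entirely, which is the situation that a strictly positional mapping would handle with overloads or placeholder values.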
The main contribution of our work is a comparison between Java and
Kotlin in the context of Android mobile applications, specifically when
performing maintenance tasks on apps written in either language. We
performed this comparison with undergraduate students attending the course of
Mobile Application Development, taking inspiration from the work by Kosar et
al. [12] for setting up the experiment.
³ https://projectlombok.org (last visited March 2019)
Figure 3: Data class example in Kotlin vs. Java
than Java. With this paper, however, we did not focus on code smells but
on maintenance aspects of code development.
Banerjee et al. [20] compared the usage of Java and
Kotlin for developing Android applications. They conclude that the usage of
Kotlin makes the development of Android applications easier while reducing
the number of errors and bugs in the code. The principal limitation of the work
by Banerjee et al. lies in the fact that their claims are based only on
coding tasks executed by the authors (thus, significant researcher bias can
be introduced), and no empirical evidence is provided to support them. The
results of the present manuscript are in line with those authors' findings but,
to the best of our knowledge, we provide the first empirical assessment of
the claimed advantages of the Kotlin language when compared to Java.
Table 1: GQM Template for the study
attend practical labs where they are required to work together in groups to
develop code for a course running project. The experiment took place
during two such labs and involved working on both a small application and the
course running project.
This section follows the reporting guidelines proposed by Jedlitschka et
al. [25] and the APA Manual [26] to organize the discussion of the
experimental design. More specifically, the following subsections provide details
about the high-level goal of the experiment, the participants that were
involved, the overall experimental design, and the individual research questions
that we formulated. For each research question, we report the materials, the
procedure, and the metrics that were used to answer it.
of the groups that could be introduced by allowing the students to compose
the groups as they desired. All the groups were formed by four students
enrolled in the course and attending the Computer Engineering MSc degree
at Politecnico di Torino.
The sample of the experiment is clearly a convenience sample that might
be representative of small teams of novice developers.
Following recommended good practices [28], the subjects were rewarded
with points for participating in the experiment. Based on the correctness of
their answers, each subject earned up to a 10% bonus on their assignment
grade for the course.
Table 2: Questionnaire Structure
Figure 4: BPMN diagram of the experimental procedure
RQ1: Does the use of Kotlin vs. Java affect the maintainability of Android
projects?
Table 4: Characteristics of the applications.
                 Java               Kotlin
App              Classes   LOCs     Classes   LOCs
Booksearch       5         471      5         450
SoundRecorder    13        1525     12        1340
⁴ Available at: https://github.com/shrikant0013/android-booksearch
⁵ Available at: https://openlibrary.org/dev/docs/api/search
⁶ Available at: https://github.com/dkim0419/SoundRecorder
⁷ We report as a digital appendix the use case narratives of the applications:
The first section of the questionnaire administered to the participating
groups concerned the maintainability concepts measured to answer RQ1.
Hu0 There is no difference between the understanding level achieved when
using Java or Kotlin.
Hl0 There is no difference between the capability of locating a defect when
using Java or Kotlin.
Ht0 There is no difference between the reported time required to correct a
defect when using Java or Kotlin.
The variables considered in our analysis correspond to the answers
collected through the questions in group 1 of the questionnaire.
In addition, we defined three derived measures that were automatically
computed from the answers to the questionnaire:
Purpose understanding is defined starting from item ii.2. The item asks
for the purpose of a specific class in the application. One of the five options is
correct; the others are wrong. Purpose understanding is a dichotomous
variable whose levels can be either correct or wrong. More specifically,
the RecordingItem class is used in the SoundRecorder app to manage
data about recordings; hence the Purpose understanding measure was
correct for all the experimental subjects that selected the fourth answer
to question ii.2.
Location accuracy is defined starting from item ii.6. The item asks the
respondents to identify the classes where the defects are located.
Two out of five classes are expected as a correct answer. We adopt
an information retrieval approach and compute the accuracy of the
answer.
In particular, Defect Location Accuracy (LA) is a ratio measure defined
as:
$$LA = \frac{TP + TN}{TP + TN + FP + FN}$$
where TP are the true positives, TN are the true negatives, FP are
the false positives, and FN are the false negatives.
More specifically, the two defects of the SoundRecorder app were
injected in the RecordingItem and FileViewerAdapter classes. Hence,
the maximum score for the Defect Location Accuracy was obtained if
and only if the respondents checked only these two classes in question
ii.6.
Fix effort is defined starting from item ii.5, which in the questionnaire
collects the time employed by each group to fix the defects, and item ii.6,
which reports the defects supposedly identified by the groups.
Fix Effort is defined as the ratio between the time estimated by the group
for fixing the defects and the number of answers checked for question ii.6;
hence, it serves as a self-estimate of the average effort (in minutes) to
fix one defect.
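As an illustration of the two ratio measures, the sketch below computes Location accuracy and Fix effort for a single questionnaire answer. It is not the script used in the study: the list of five answer options is an assumption (the text names only some of the classes), and Fix effort is read as minutes per defect, as the closing clause of its definition suggests.

```python
# Classes injected with defects in SoundRecorder, as stated in the text.
DEFECTIVE = {"RecordingItem", "FileViewerAdapter"}

# Five answer options for item ii.6. The full list is an assumption made
# for illustration; only some of these names appear in the paper.
OPTIONS = ["RecordingService", "RecordingItem", "DBHelper",
           "FileViewerAdapter", "MainActivity"]


def location_accuracy(checked: set) -> float:
    """LA = (TP + TN) / (TP + TN + FP + FN) over the answer options."""
    tp = sum(1 for c in OPTIONS if c in checked and c in DEFECTIVE)
    tn = sum(1 for c in OPTIONS if c not in checked and c not in DEFECTIVE)
    fp = sum(1 for c in OPTIONS if c in checked and c not in DEFECTIVE)
    fn = sum(1 for c in OPTIONS if c not in checked and c in DEFECTIVE)
    return (tp + tn) / (tp + tn + fp + fn)


def fix_effort(minutes: float, checked: set) -> float:
    """Self-estimated minutes per defect: reported time over checked answers."""
    if not checked:
        raise ValueError("no class was checked, effort per defect is undefined")
    return minutes / len(checked)
```

Under these assumed options, a group that checks only RecordingService obtains LA = 2/5 (two true negatives out of five options), while a group that checks exactly the two defective classes obtains the maximum LA = 1.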
3.4.4. Analysis method
Concerning the first research question (RQ1), we analyze three aspects:
• Understanding: we analyze the Purpose understanding variable, and
we compare the odds of a correct answer when the program is written
in Java vs. Kotlin. To this end, in order to assess Hu0, we apply a
Fisher test for 2 × 2 contingency tables, which tests the null hypothesis
that the odds ratio is equal to one.
• Defect location: we analyze the variable Location accuracy to assess
Hl0; in particular, we apply a Mann-Whitney test to check the null
hypothesis that there is no difference between the medians.
• Time to fix a defect: we analyze the variable Fix effort to assess Ht0
by using a Mann-Whitney test.
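The Fisher test named in the first item can be sketched with the standard library alone. The function below is a minimal two-sided implementation written for illustration, not the statistical package used in the study; the input counts are the Purpose understanding frequencies reported in the results of this paper (Java: 9 correct, 4 wrong; Kotlin: 11 correct, 3 wrong).

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Rows: Java, Kotlin. Columns: correct, wrong.
p_value = fisher_exact_two_sided(9, 4, 11, 3)
```

For these counts the sketch yields a p-value of about 0.68, far above the usual 0.05 threshold, so under this computation Hu0 would not be rejected.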
RQ2: Does the use of Kotlin vs. Java make the code more concise?
We consider conciseness at the macroscopic level, which means less code,
both in terms of the number of classes and LoCs.
3.6. Experimental Tasks
The students were asked on a weekly basis to perform development tasks
on the course-running Android project.
To evaluate Kotlin vs. Java usage in the maintenance
tasks of Android applications, we designed two features to be implemented
by the students.
The specific features were defined by one of the authors of the paper and
were designed to be related to the category of the application under
development, compatible with the subjects' expertise with Android, and feasible
in the time frame allocated to the experiment.
The features that were defined for the participants were: (1) implement a
user chat with notifications, using Firebase (referred to as CHAT); (2) implement
a way to express user ratings for the exchanged books, using a five-star
scheme (referred to as RATINGS).
Hc0 There is no difference between the measured number of classes written
to implement a new feature when using Java or Kotlin.
Hl0 There is no difference between the measured lines of code written to
implement a feature when using Java or Kotlin.
Both of the above measures were used to assess Hc0 and Hl0, respectively, by
applying a non-parametric Mann-Whitney test.
3.7. RQ3: Coding Pitfalls
One important principle in the design of Kotlin was to avoid several
common pitfalls of the Java programming language.
In this work, we decided to investigate four of the common pitfalls, i.e.,
Nullability, Mandatory Casts, Long argument lists, and Data Classes, which are
described in Section 2.
To evaluate the occurrence of known issues in Kotlin vs. Java
programming in the context of Android development, we defined our third and final
research question:
RQ3: Does the use of Kotlin vs. Java effectively avoid the occurrence of
common pitfalls?
3.7.1. Materials
Section (ii) of the questionnaire administered to the participating
groups (see Table 2) concerned their experience with the occurrence of the
investigated common pitfalls. The reported occurrence of the pitfalls was
used to answer RQ3.
We decided to use the answers to those questions as proxies of the actual
occurrence of the pitfalls. This choice is due to the limited observability of
the teams while performing their development tasks. In fact, the participants
wrote the code on their own machines; therefore, it was not feasible to install
a monitoring plug-in, as would have been possible had they worked on lab devices.
3.7.3. Analysis Method
To analyze the perceived pitfalls (RQ3), we resort to the responses to
items ii.1 to ii.5 of the questionnaire. For each variable, we compute the
effect size using the Cliff's Delta statistic, and we use the corresponding confidence
interval to decide about hypothesis rejection.
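The decision rule described above can be sketched as follows. The text does not state how the confidence interval was obtained, so the percentile bootstrap used here is an assumption, and the ratings are hypothetical values on a 1 to 5 scale, not data from the study.

```python
import random


def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all cross-group pairs."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def bootstrap_ci(xs, ys, level=0.95, reps=2000, seed=0):
    """Percentile bootstrap interval for Cliff's delta (an assumed CI method)."""
    rng = random.Random(seed)
    deltas = sorted(
        cliffs_delta(rng.choices(xs, k=len(xs)), rng.choices(ys, k=len(ys)))
        for _ in range(reps)
    )
    lo_idx = int((1 - level) / 2 * reps)
    hi_idx = int((1 + level) / 2 * reps) - 1
    return deltas[lo_idx], deltas[hi_idx]


# Hypothetical 1-5 frequency ratings for one pitfall (not data from the study).
java_ratings = [4, 5, 3, 4, 4, 5, 3, 4, 2, 4, 5, 3, 4]
kotlin_ratings = [2, 1, 2, 3, 1, 2, 2, 3, 1, 2, 4, 2, 1, 2]

delta = cliffs_delta(java_ratings, kotlin_ratings)
low, high = bootstrap_ci(java_ratings, kotlin_ratings)
reject_null = not (low <= 0 <= high)  # reject only if the CI excludes zero
```

The rule mirrors the one applied in the results: a null hypothesis is rejected only when the interval for the effect size does not contain zero.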
the results of the experiment [32]. All these guidelines were followed in
conducting the work documented in this paper. In addition, Carver et al. [32]
also provide a checklist to explain when the various activities should occur
(i.e., before starting the study, as soon as the study begins, during the study,
or after the study is completed). The requirements and checklist provide a
useful guide for judging how well a study is integrated into the university
course and for judging the reliability of the results. We used that checklist
to verify the research and pedagogical goals in this study.
Construct validity threats concern the relationship between theory and
observation. It is not assured that the Purpose Understanding, Location
Accuracy, and Fix Effort metrics defined in this paper are the best possible
proxies for providing answers to the Research Questions identified for this
study. We measure conciseness in terms of code size, though shorter code
could, at least in principle, bear a higher cognitive load, thus reversing the
benefits stemming from more concise code. Considering the specific features
introduced in Kotlin, we do not believe this is the case, though there is no
empirical evidence supporting such a belief. As explained in Section 3.7.1,
we decided to measure the occurrence of pitfalls by means of proxies. In
practice, we inferred the actual occurrence on the basis of the reported pitfall
manifestation, recorded through the questionnaire. While this choice was
dictated by practical feasibility reasons, we have no reason to believe any
misreporting took place. Moreover, we argue that the pitfalls do not represent a
problem per se but rather inasmuch as they affect the development activities
of the developer; thus, in this specific case, the reported experience of such
pitfalls is probably closer to the original construct than a mechanical count
of pitfall frequency.
As far as IDE support is concerned, we have to note that, in the presence of
participants familiar with Java but not at all with Kotlin, the perceived
support could be more related to familiarity than to actual IDE support.
Finally, researcher bias is another possible threat to the validity of this
study, since the authors were involved in the creation of the starting Kotlin
versions of the two considered apps and in the bug injection phase. However,
the authors have no reason to favor any language, nor are they inclined
to demonstrate any specific result.
[Figure 5 plot: panels Java and Kotlin; axes: Java professionals in group vs. other language professionals in group]
Figure 5: Groups with members having professional experience with Java or other languages
[Figure 6 plot: panels Java and Kotlin; axes: average group experience in Java development (< 1 year, 1 to 3 years, > 3 years) vs. highest Java skill in group (Advanced, Intermediate, Novice)]
5. Results
In this section, we report the results measured for the three Research
Questions of the experiment. We also provide details about the population
that participated in the experiment.
Table 5: Null hypotheses for RQ1
answers to those questions for both typologies of groups. Three groups
included participants that had professional Java development experience, and
overall, 6 out of 27 groups included members with experience as
professional software developers in any language. No participant had any previous
experience in Kotlin.
The skill level of the population was measured through the answers to
questions 5 and 8 of the Context section of the questionnaire. Figure 6
summarizes the answers to those questions for both typologies of groups. The
experience with Java development was mostly between one and three years,
although three groups had no member with more than one year of experience
in Java, and four groups included a member with more than three years of
experience.
In the whole population, four groups had members with advanced Java
skills (i.e., they had developed at least one project of more than 50 classes); in eight
groups, the most experienced member considered him/herself a novice
(i.e., they had developed a few projects featuring up to 20 classes); in the
remaining fifteen groups, the most experienced member considered him/herself
an intermediate (i.e., at least one medium-sized project of 20 to 50 classes).
Regarding the years of experience with Java, four groups had an average
experience of more than three years, and three groups had an average
experience of less than one year; the remaining groups had an average experience
with Java between one and three years.
[Figure plot: frequency of correct and wrong Purpose understanding answers by language]
Language   Correct     Wrong
Java       9 (69%)     4 (31%)
Kotlin     11 (79%)    3 (21%)
Table 5 reports the null hypotheses formulated to answer RQ1 and the
related decisions.
Class              Java   Kotlin
RecordingService   77%    64%
RecordingItem      8%     0%
DBHelper           8%     0%
Figure 8: Frequency of the answers to the defect location question selected by the respondents
[Figure plot: distribution by language (Kotlin, Java) of the average time to fix a defect]
Table 6 reports the null hypotheses formulated to answer RQ2 and the
related decisions.
In Table 7, we report the number of classes and the amount of code that were
added by the respondents to implement the required features. The table
reports the raw count of classes and code and the percentage over the total
amount of code of the application. We also report in the last column the
number of data classes that were developed. The code was automatically
measured by a script that leveraged the open-source cloc tool⁸. Blank and
comment lines were not included in the computation.
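The counting rule (blank and comment lines excluded) can be approximated with a few lines of code. The sketch below is a simplified stand-in written for illustration, not cloc itself nor the script used in the study: it handles // and /* */ comments as found in Java and Kotlin sources, and it assumes that comment markers do not occur inside string literals and that block comments are not nested.

```python
def count_code_lines(source: str) -> int:
    """Count lines containing code, skipping blanks and // or /* */ comments."""
    count = 0
    in_block = False  # True while inside a /* ... */ comment
    for raw in source.splitlines():
        line = raw.strip()
        code = ""
        i = 0
        while i < len(line):
            if in_block:
                end = line.find("*/", i)
                if end == -1:
                    break  # the block comment continues on the next line
                in_block = False
                i = end + 2
            elif line.startswith("//", i):
                break  # the rest of the line is a comment
            elif line.startswith("/*", i):
                in_block = True
                i += 2
            else:
                code += line[i]
                i += 1
        if code.strip():
            count += 1
    return count


# Hypothetical Kotlin snippet used only to exercise the counter.
SAMPLE = """\
// header comment
package demo

/* block
   comment */
class Point(val x: Int, val y: Int)  // trailing comment
fun main() {
    println(Point(1, 2))
}
"""

sample_locs = count_code_lines(SAMPLE)
```

Lines that mix code and a trailing comment are counted as code, while lines that contain only comments or whitespace are skipped.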
From the table, it can be seen that the number of classes and the amount
of code developed varied widely between groups. Groups
that worked with Kotlin to implement the new features produced a number
of classes that ranged from 3 to 12 (246 to 1568 lines of code). Four different
groups developed data classes. Groups that worked with Java produced a
number of classes that ranged from 0 to 26 (583 to 4745 lines of code).
⁸ https://github.com/AlDanial/cloc
Table 7: Absolute and relative added classes and LOCs for the development of the required
features
Language   Group   Added classes (%)   Added LOCs (%)   Data classes
Kotlin     1       7 (17.1%)           483 (8.3%)       0
           3       8 (12.3%)           337 (5.2%)       0
           5       12 (24.0%)          1568 (18.9%)     4
           7       8 (12.7%)           745 (8.1%)       0
           9       5 (4.4%)            515 (4.0%)       0
           11      8 (13.6%)           978 (12.9%)      1
           13      3 (8.8%)            322 (3.2%)       1
           15      9 (20.4%)           664 (10.2%)      0
           17      9 (17.0%)           972 (13.0%)      0
           19      8 (7.1%)            460 (4.0%)       1
           21      5 (9.1%)            380 (5.0%)       0
           23      6 (12.8%)           835 (14.4%)      0
           25      3 (5.1%)            246 (4.3%)       0
           27      9 (14.1%)           744 (7.5%)       0
Java       2       17 (24.3%)          4099 (37.3%)     0
           4       4 (11.8%)           679 (15.6%)      0
           6       17 (37.8%)          1526 (31.6%)     0
           8       22 (31.4%)          2208 (24.9%)     0
           10      26 (32.9%)          4745 (45.0%)     0
           12      17 (30.0%)          2688 (32.0%)     0
           14      0 (0.0%)            583 (7.3%)       0
           16      19 (16.4%)          3810 (31.9%)     0
           18      3 (8.6%)            775 (18.1%)      0
           20      13 (17.6%)          742 (14.9%)      0
           22      6 (15.4%)           1755 (29.6%)     0
           24      9 (15.0%)           1424 (14.7%)     0
           26      2 (6.1%)            647 (15.4%)      0
[Figure 11 plot: distribution by language (Kotlin, Java) of the number of added classes]
The distribution of the number of new classes is reported in Figure 11.
We observe a lower number of classes developed by the participating groups
using Kotlin. The mean number of classes developed with Kotlin is 7, while
it is 12 for Java; the medians are 8 and 13, respectively.
The hypothesis Hc0 was tested using a Mann-Whitney test that returned
a p-value of 0.2138. Therefore, we cannot reject the null hypothesis.
The effect size can be considered small, as Cliff's Delta is 0.29; the 95%
CI for the effect size is (-0.24; 0.68), which includes 0. Therefore, the difference
cannot be considered statistically significant.
The same kind of analysis can be applied to the number of Lines-Of-Code
(LOCs) written in order to implement the new feature. The distribution of
the LOCs by language is reported in Figure 12. The median number of LOCs
for Java is 1526, while for Kotlin it is 589.5.
The hypothesis Hl0 was tested using a Mann-Whitney test that returned
a p-value of 0.003. Therefore, we can reject the null hypothesis.
The effect size can be considered large: Cliff's Delta is 0.65, with the
corresponding 95% CI being (0.24; 0.86); since the CI does not include 0, the
difference can be considered significant.
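The effect sizes above can be recomputed from the per-group values of Table 7. The sketch below is written for illustration and is not the analysis script of the study; it derives the Mann-Whitney U statistic (ties counted as one half) and obtains Cliff's Delta from it through the identity delta = 2U/(nm) - 1. The p-values and confidence intervals are not recomputed.

```python
from statistics import median

# Added classes and LOCs per group, transcribed from Table 7.
kotlin_classes = [7, 8, 12, 8, 5, 8, 3, 9, 9, 8, 5, 6, 3, 9]
java_classes = [17, 4, 17, 22, 26, 17, 0, 19, 3, 13, 6, 9, 2]
kotlin_locs = [483, 337, 1568, 745, 515, 978, 322, 664, 972, 460, 380, 835, 246, 744]
java_locs = [4099, 679, 1526, 2208, 4745, 2688, 583, 3810, 775, 742, 1755, 1424, 647]


def mann_whitney_u(xs, ys):
    """U statistic for xs: pairs with x > y, counting ties as one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)


def cliffs_delta(xs, ys):
    """Cliff's delta derived from U: delta = 2U / (n * m) - 1."""
    return 2 * mann_whitney_u(xs, ys) / (len(xs) * len(ys)) - 1


delta_classes = cliffs_delta(java_classes, kotlin_classes)
delta_locs = cliffs_delta(java_locs, kotlin_locs)
median_locs = (median(java_locs), median(kotlin_locs))
```

With Java as the first group, the two deltas round to 0.29 for classes and 0.65 for LOCs, and the medians of the LOC samples are 1526 and 589.5, matching the values reported above.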
As far as the LOCs are concerned, we can therefore reject the null
hypothesis Hl0 and conclude that there is a significant difference in terms of
the number of LOCs written when developing the same feature with Kotlin
or Java. It is also worth underlining that only four groups used data classes
in Kotlin. Moreover, one of those groups was the one that wrote the most code to
implement the new feature (1568 LOCs). The low number of developed data
classes suggests that the Kotlin language is more concise than Java, even
without taking into consideration the data class construct.
[Figure 13 plot: rows include "Issues with long args lists" and "Frequency of NullPointerExceptions"; effect size thresholds marked S, M, L; direction label "Larger for Kotlin"]
Figure 13: Effect size with confidence interval of language for different aspects.
6. Discussion
The overall goal of our comparative investigation of Java vs. Kotlin
was to assess the consequences of a possible transition from one
programming language to the other. The switch could occur at different
levels: from a single project to a unit, up to a whole company.
We know that, by design, Kotlin is fully compatible at the bytecode level
with Java; therefore, a smooth, progressive transition between the two
languages is technically possible. Our research questions addressed
the development part of the transition, which entails:
6.2. Conciseness (RQ2)
One of the design goals of Kotlin, like other recent-generation languages,
is an increased expressive power that enables writing code in a more
concise way. This feature is extremely important because it can improve
both productivity (by reducing the sheer number of keystrokes required) and
the understandability of the code. The second research question (RQ2) in
our study addressed this aspect. In this respect, we observed a significant
effect of using Kotlin on the amount of code written to develop new
features. More specifically, we found a large and statistically significant
difference in terms of lines of code developed (see Figure 12): Java
development required writing roughly three times as many lines as Kotlin.
Concerning the number of classes created in the development of new
features, while the average is 67% higher for Java, the difference is not
statistically significant.
We found evidence that the usage of Kotlin led to writing more concise
code than Java (RQ2).
While we believe that, in general, the adoption of Kotlin can lead to more
concise code, the extent of code reduction depends on the specific type of
application and environment. As we mentioned in Section 4, our experiment
was able to provide evidence limited to the Android environment and
specifically concerning two small applications.
An important aspect that deserves further investigation is the capability
of the modern constructs present in Kotlin (and, we argue, in many modern
programming languages) to actually enable more concise code in many
different settings.
Also related to the previous research question, we wonder whether the
conciseness stemming from more expressive constructs also translates into
a higher understandability of the code.
occasionally occurred too frequently, while for teams using Kotlin, they
happened mostly rarely. Although not statistically significant, we also
observed a smaller reduction in issues with long argument lists and in the
effort of writing casts. Finally, no evidence was found of any reduction in
the effort devoted to writing data classes.
Our respondents also reported slightly better IDE support for Java than
for Kotlin. Even though the difference is not statistically significant, it
is surprising, considering that the producer of the IDE is the same company
that designed the Kotlin language. The much longer experience available for
Java development probably allowed for better support.
We found evidence that Kotlin was able to reduce the frequency of
NullPointerExceptions, while no evidence was found concerning effects on
the other investigated pitfalls (RQ3).
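To make the pitfalls discussed above concrete, the following sketch shows the Kotlin constructs usually associated with three of them: nullable types for NullPointerExceptions, smart casts for explicit casts, and default and named arguments for long argument lists. All function names are hypothetical, and the sketch only illustrates the language features; it does not reproduce code from the experiment.

```kotlin
// Hypothetical helper names throughout; illustrative only.

// Nullable type: dereferencing `title` directly would not compile,
// so the null case has to be handled explicitly.
fun titleLength(title: String?): Int = title?.length ?: 0

// Smart cast: after the `is` check, `value` is treated as a String
// without an explicit cast.
fun describe(value: Any): String =
    if (value is String) "text of ${value.length} chars" else "other: $value"

// Default and named arguments: callers set only what they need,
// instead of passing a long positional argument list.
fun createNote(
    title: String,
    body: String = "",
    pinned: Boolean = false,
    color: Int = 0
): String = "$title|$body|$pinned|$color"

fun main() {
    println(titleLength(null))
    println(describe("abc"))
    println(createNote(title = "Shopping", pinned = true))
}
```

These features reduce the opportunities for the corresponding mistakes but do not rule them out entirely; for instance, values coming from Java code or the non-null assertion operator can still lead to a NullPointerException at run time.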
The extent to which these findings can be generalized to experienced
programmers is unknown. We can speculate that NPEs would represent a lesser
problem for experienced developers, though both writing casts and dealing
with long argument lists can be expected to be issues less dependent on the
developers’ experience. The essentially inconclusive result concerning data
classes might be due to the architecture of the application and the
characteristics of the required new features, which did not require the use
of any data class (see Table 7).
Concerning the IDE support, we have to keep in mind that no student had
any experience with Kotlin development before the experiment. This bias
could have affected the participants’ perception. Therefore, we may have
measured familiarity with the language rather than the actual support
provided by the IDE.
The limited evidence and partially counter-intuitive results concerning
the coding pitfalls deserve further investigation. Research should be aimed
at understanding whether and under which circumstances the adoption of
Kotlin allows avoiding the pitfalls.
more concise code is likely to be written, and that the Kotlin language is
able to reduce the occurrence of one of the four pitfalls we investigated.
• Researchers interested in Kotlin: we highlighted a few interesting
aspects; for some of them we could not get any conclusive result, e.g.,
the effect on writing casts, long argument lists, and data classes. Such
aspects are good candidates for further studies, as are confirmatory
replications of our findings.
• Tool builders: we found some hints that IDE support is perceived to be
better for Java than for Kotlin; further studies could confirm this and
provide directions for improving the IDE support.
7. Conclusion
Kotlin is a modern programming language that represents a relevant
alternative to Java in several development domains. In particular, it has
been adopted as an official development language for the Android OS. In
this work, we focused on the main promises of this new language. In
particular, we investigated how Kotlin can improve the maintainability of
code, make code more compact, and avoid common pitfalls. For this purpose,
we carried out an experiment in the context of a Mobile Application
Development course in an MSc degree. The experiment compared the Kotlin
programming language to its ancestor, Java.
With our experiment, we found that the usage of Kotlin apparently does not
affect maintainability with respect to Java when working on two small
applications. At the same time, we found evidence that the adoption of
Kotlin leads to more compact code when the subjects of the experiment were
asked to develop new features for an ongoing software project.
The adoption of Kotlin makes a few common Java annoyances less frequent,
thus making the development safer. We registered evidence of a reduction in
the frequency of NullPointerExceptions. We also observed fewer issues with
long argument lists and reduced effort when dealing with casts, although no
definitive evidence could be found in this respect.
These findings represent a first empirical assessment of the advantages of
Kotlin with respect to Java, as reported by many works in the related
literature. The findings showed that most of the promises made for the
Kotlin language are reflected in the code produced and in the developers’
perception.
The study has a few limitations, mainly due to the academic setting: the
software artifacts were small and the developers were students with limited
experience; therefore, the number of bugs and tasks that were studied was
limited. The study may not be representative of bigger, real-world projects
that require many development tasks and may expose many types of defects
and issues. It is important to collect more evidence for different and
possibly larger applications, and outside the Android ecosystem.
As future work, we hence plan to investigate the advantages brought by
Kotlin in other domains, e.g., server-side development. Also, we aim at
finding whether other expected Kotlin benefits hold.
Author Biography
vironments. He is a co-author of seven patents. Since 1999, he has
cooperated with Istituto Superiore ”Mario Boella”, participating in a
shared laboratory for the development of mobile services and applications.
He has supervised the research activities of several graduate and PhD
students at Politecnico di Torino. He has been the advisor of four PhD
students in Computer and Control Engineering and of more than 40 master’s
students.
Appendix
7.1. Population details
Professional experience in Java and other languages in the two
experimental groups.
Java Kotlin
Other language
professionals
in group 2 1 1
1 1 1 1
None 10 1 11
None 1 None 1
Java professionals in group
RecordingItem 8% 0%
DBHelper 8% 0%
7.3. Detailed answers for perceptions
7.3.1. IDE support effectiveness
IDE support effectiveness    Java    Kotlin
Very much 7 4
Much 2 1
Enough 3
Little 3
Very little 1 1
[Bar chart: frequency of each answer, by language]
Frequently 5
Occasionally 6 3
Rarely 1 7
Never 3
[Bar chart: frequency of each answer, by language]
7.3.3. Frequency of long argument list issues
Long argument list occurrence    Java    Kotlin
Very frequently
Frequently
Occasionally                     5       4
Rarely                           5       5
Never                            3       5
[Bar chart: frequency of each answer, by language]
Proportional 7 7
Lower 2 3
Much lower 2 2
[Bar chart: frequency of each answer, by language]
7.3.5. Effort to write casts
Cast writing effort    Java    Kotlin
Much higher 1
Higher 1
Proportional 8 8
Lower 1 2
Much lower 2 4
[Bar chart: frequency of each answer, by language]
Credit Author Statement
Luca Ardito: conceptualization, methodology, paper writing.
Riccardo Coppola: data curation, paper writing.
Marco Torchiano: statistical analysis, data visualization, investigation,
paper writing.
Giovanni Malnati: supervision, experiment execution.