-
A Large Scale Survey of Motivation in Software Development and Analysis of its Validity
Authors:
Idan Amit,
Dror G. Feitelson
Abstract:
Context: Motivation is known to improve performance. In software development in particular, there has been considerable interest in the motivation of contributors to open source. Objective: We identify 11 motivators from the literature (enjoying programming, ownership of code, learning, self use, etc.), and evaluate their relative effect on motivation. Since motivation is an internal subjective feeling, we also analyze the validity of the answers. Method: We conducted a survey with 66 questions on motivation which was completed by 521 developers. Most of the questions used an 11 point scale. We evaluated the validity of the answers by comparing related questions, comparing to actual behavior on GitHub, and comparing with answers from the same developer in a follow-up survey. Results: Validity problems include moderate correlations between answers to related questions, as well as self promotion and mistakes in the answers. Despite these problems, predictive analysis, investigating how diverse motivators influence the probability of high motivation, provided valuable insights. The correlations between the different motivators are low, implying their independence. High values in all 11 motivators predict increased probability of high motivation. In addition, improvement analysis shows that an increase in most motivators predicts an increase in general motivation.
Submitted 12 April, 2024;
originally announced April 2024.
-
End to End Software Engineering Research
Authors:
Idan Amit
Abstract:
End to end learning is machine learning starting from raw data and predicting a desired concept, with all steps done automatically. In the software engineering context, we see it as starting from the source code and predicting process metrics. This framework can be used for predicting defects, code quality, productivity and more. End-to-end learning improves over feature-based machine learning by not requiring domain experts and being able to extract new knowledge. We describe a dataset of 5M files from 15k projects constructed for this goal. The dataset is constructed in a way that enables not only predicting concepts but also investigating their causes.
Submitted 22 December, 2021;
originally announced December 2021.
-
ComSum: Commit Messages Summarization and Meaning Preservation
Authors:
Leshem Choshen,
Idan Amit
Abstract:
We present ComSum, a data set of 7 million commit messages for text summarization. When documenting commits, software code changes, both a message and its summary are posted. We gather and filter those to curate a data set of developers' work summarization. Along with its growing size, practicality and challenging language domain, the data set benefits from the living field of empirical software engineering. As commits follow a typology, we propose to evaluate outputs not only by ROUGE, but also by their meaning preservation.
Submitted 23 August, 2021;
originally announced August 2021.
-
Follow Your Nose -- Which Code Smells are Worth Chasing?
Authors:
Idan Amit,
Nili Ben Ezra,
Dror G. Feitelson
Abstract:
The common use case of code smells assumes causality: Identify a smell, remove it, and by doing so improve the code. We empirically investigate their fitness to this use. We present a list of properties that code smells should have if they indeed cause lower quality. We evaluated the smells in 31,687 Java files from 677 GitHub repositories, all the repositories with 200+ commits in 2019. We measured the influence of smells on four metrics for quality, productivity, and bug detection efficiency. Out of 151 code smells computed by the CheckStyle smell detector, less than 20% were found to be potentially causal, and only a handful are rather robust. The strongest smells deal with simplicity, defensive programming, and abstraction. Files without the potentially causal smells are 50% more likely to be of high quality. Unfortunately, most smells are not removed, and developers tend to remove the easy ones and not the effective ones.
Submitted 15 January, 2024; v1 submitted 2 March, 2021;
originally announced March 2021.
-
A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits
Authors:
Steffen Herbold,
Alexander Trautsch,
Benjamin Ledel,
Alireza Aghamohammadi,
Taher Ahmed Ghaleb,
Kuljit Kaur Chahal,
Tim Bossenmaier,
Bhaveet Nagaria,
Philip Makedonski,
Matin Nili Ahmadabadi,
Kristof Szabados,
Helge Spieker,
Matej Madeja,
Nathaniel Hoy,
Valentina Lenarduzzi,
Shangwen Wang,
Gema Rodríguez-Pérez,
Ricardo Colomo-Palacios,
Roberto Verdecchia,
Paramvir Singh,
Yihao Qin,
Debasish Chakroborti,
Willard Davis,
Vijay Walunj,
Hongjun Wu
, et al. (23 additional authors not shown)
Abstract:
Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs.
Objective: We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits.
Methods: We use a crowd sourcing approach for manual labeling to validate which changes contribute to bug fixes for each line in bug fixing commits. Each line is labeled by four participants. If at least three participants agree on the same label, we have consensus.
Results: We estimate that between 17% and 32% of all changes in bug fixing commits modify the source code to fix the underlying problem. However, when we only consider changes to the production code files this ratio increases to 66% to 87%. We find that about 11% of lines are hard to label leading to active disagreements between participants. Due to confirmed tangling and the uncertainty in our data, we estimate that 3% to 47% of data is noisy without manual untangling, depending on the use case.
Conclusion: Tangled commits have a high prevalence in bug fixes and can lead to a large amount of noise in the data. Prior research indicates that this noise may alter results. As researchers, we should be skeptics and assume that unvalidated data is likely very noisy, until proven otherwise.
Submitted 13 October, 2021; v1 submitted 12 November, 2020;
originally announced November 2020.
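The consensus rule stated in the Methods of the entry above (four labels per line, consensus when at least three agree) can be illustrated with a minimal Python sketch. The function name `consensus_label` and the label strings are hypothetical, chosen only for illustration; they are not taken from the paper or its tooling.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_label(labels: Sequence[str], min_agreement: int = 3) -> Optional[str]:
    """Return the label shared by at least `min_agreement` raters, else None.

    Assumption: each changed line carries one label per participant
    (four in the study described above).
    """
    if not labels:
        return None
    # most_common(1) yields the single most frequent label and its count.
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agreement else None


# Hypothetical label names, for illustration only.
agreed = consensus_label(["bugfix", "bugfix", "bugfix", "test"])
split = consensus_label(["bugfix", "bugfix", "test", "test"])
```

Here `agreed` is `"bugfix"` (three of four raters match), while `split` is `None` because no label reaches three votes; lines of that kind would count as lacking consensus.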
-
The Corrective Commit Probability Code Quality Metric
Authors:
Idan Amit,
Dror G. Feitelson
Abstract:
We present a code quality metric, Corrective Commit Probability (CCP), measuring the probability that a commit reflects corrective maintenance. We show that this metric agrees with developers' concept of quality, and that it is informative and stable. Corrective commits are identified by applying a linguistic model to the commit messages. We compute the CCP of all large active GitHub projects (7,557 projects with at least 200 commits in 2019). This leads to the creation of a quality scale, suggesting that the bottom 10% of quality projects spend at least 6 times more effort on fixing bugs than the top 10%. Analysis of project attributes shows that lower CCP (higher quality) is associated with smaller files, lower coupling, use of languages like JavaScript and C# as opposed to PHP and C++, fewer developers, lower developer churn, better onboarding, and better productivity. Among other things these results support the "Quality is Free" claim, and suggest that achieving higher quality need not require higher expenses.
Submitted 21 July, 2020;
originally announced July 2020.
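As a rough illustration of the idea in the entry above (classify commit messages as corrective, then take the fraction of corrective commits), here is a minimal Python sketch. It rests on loud assumptions: the regular expression `CORRECTIVE_PATTERN` is a hypothetical stand-in for the paper's linguistic model, and the raw hit rate computed here is not adjusted for classifier errors, so it should not be read as the authors' CCP estimator.

```python
import re
from typing import Iterable

# Hypothetical keyword pattern standing in for a real linguistic model.
CORRECTIVE_PATTERN = re.compile(r"\b(fix(es|ed)?|bugs?|defects?)\b", re.IGNORECASE)


def is_corrective(message: str) -> bool:
    """Naive check: does the commit message mention a fix-like keyword?"""
    return CORRECTIVE_PATTERN.search(message) is not None


def corrective_hit_rate(messages: Iterable[str]) -> float:
    """Fraction of commit messages flagged as corrective (uncorrected estimate)."""
    messages = list(messages)
    if not messages:
        raise ValueError("at least one commit message is required")
    return sum(is_corrective(m) for m in messages) / len(messages)


# Made-up commit messages, for illustration only.
sample = [
    "Fix null pointer in parser",
    "Add CSV export",
    "Update README",
    "fixed bug #12",
]
rate = corrective_hit_rate(sample)  # 2 of 4 messages match, so 0.5
```

A lower rate would correspond to a smaller share of effort spent on corrective work, which is the direction the abstract associates with higher quality.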
-
Machine Learning in Cyber-Security - Problems, Challenges and Data Sets
Authors:
Idan Amit,
John Matherly,
William Hewlett,
Zhi Xu,
Yinnon Meshi,
Yigal Weinberger
Abstract:
We present cyber-security problems of high importance. We show that in order to solve these cyber-security problems, one must cope with certain machine learning challenges. We provide novel data sets representing the problems in order to enable the academic community to investigate the problems and suggest methods to cope with the challenges. We also present a method to generate labels via pivoting, providing a solution to common problems of lack of labels in cyber-security.
Submitted 22 April, 2019; v1 submitted 19 December, 2018;
originally announced December 2018.
-
Laser writable high-K dielectric for van der Waals nano-electronics
Authors:
N. Peimyoo,
M. D. Barnes,
J. D. Mehew,
A. De Sanctis,
I. Amit,
J. Escolar,
K. Anastasiou,
A. P. Rooney,
S. J. Haigh,
S. Russo,
M. F. Craciun,
F. Withers
Abstract:
Like silicon-based semiconductor devices, van der Waals heterostructures will require integration with high-k oxides. This is needed to achieve suitable voltage scaling and improved performance, as well as to allow for added functionalities. Unfortunately, commonly used high-k oxide deposition methods are not directly compatible with 2D materials. Here we demonstrate a method to embed a multi-functional few nm thick high-k oxide within van der Waals devices without degrading the properties of the neighbouring 2D materials. This is achieved by in-situ laser oxidation of embedded few layer HfS2 crystals. The resultant oxide is found to be in the amorphous phase with a dielectric constant of k~15 and break-down electric fields in the range of 0.5-0.6 V/nm. This transformation allows for the creation of a variety of fundamental nano-electronic and opto-electronic devices, including flexible Schottky barrier field effect transistors, dual gated graphene transistors, and vertical light emitting and detecting tunnelling transistors. Furthermore, upon dielectric break-down, electrically conductive filaments are formed. This filamentation process can be used to electrically contact encapsulated conductive materials. Careful control of the filamentation process also allows for reversible switching between two resistance states. This allows for the creation of resistive switching random access memories (ReRAMs). We believe that this method of embedding a high-k oxide within complex van der Waals heterostructures could play an important role in future flexible multi-functional van der Waals devices.
Submitted 12 November, 2018;
originally announced November 2018.
-
The Human Cell Atlas White Paper
Authors:
Aviv Regev,
Sarah Teichmann,
Orit Rozenblatt-Rosen,
Michael Stubbington,
Kristin Ardlie,
Ido Amit,
Paola Arlotta,
Gary Bader,
Christophe Benoist,
Moshe Biton,
Bernd Bodenmiller,
Benoit Bruneau,
Peter Campbell,
Mary Carmichael,
Piero Carninci,
Leslie Castelo-Soccio,
Menna Clatworthy,
Hans Clevers,
Christian Conrad,
Roland Eils,
Jeremy Freeman,
Lars Fugger,
Berthold Goettgens,
Daniel Graham,
Anna Greka
, et al. (56 additional authors not shown)
Abstract:
The Human Cell Atlas (HCA) will be made up of comprehensive reference maps of all human cells - the fundamental units of life - as a basis for understanding fundamental human biological processes and diagnosing, monitoring, and treating disease. It will help scientists understand how genetic variants impact disease risk, define drug toxicities, discover better therapies, and advance regenerative medicine. A resource of such ambition and scale should be built in stages, increasing in size, breadth, and resolution as technologies develop and understanding deepens. We will therefore pursue Phase 1 as a suite of flagship projects in key tissues, systems, and organs. We will bring together experts in biology, medicine, genomics, technology development and computation (including data analysis, software engineering, and visualization). We will also need standardized experimental and computational methods that will allow us to compare diverse cell and tissue types - and samples across human communities - in consistent ways, ensuring that the resulting resource is truly global.
This document, the first version of the HCA White Paper, was written by experts in the field with feedback and suggestions from the HCA community, gathered during recent international meetings. The White Paper, released at the close of this yearlong planning process, will be a living document that evolves as the HCA community provides additional feedback, as technological and computational advances are made, and as lessons are learned during the construction of the atlas.
Submitted 11 October, 2018;
originally announced October 2018.
-
Sub 20 meV Schottky barriers in metal/MoTe2 junctions
Authors:
Nicola J. Townsend,
Iddo Amit,
Monica F. Craciun,
Saverio Russo
Abstract:
The newly emerging class of atomically-thin materials has shown a high potential for the realisation of novel electronic and optoelectronic components. Amongst this family, semiconducting transition metal dichalcogenides (TMDCs) are of particular interest. While their band gaps are compatible with those of conventional solid state devices, they present a wide range of exciting new properties that are bound to become a crucial ingredient in the future of electronics. To utilise these properties for the prospect of electronics in general, and long-wavelength-based photodetectors in particular, the Schottky barriers formed upon contact with a metal and the contact resistance that arises at these interfaces have to be measured and controlled. We present experimental evidence for the formation of Schottky barriers as low as 10 meV between MoTe2 and metal electrodes. By varying the electrode work functions, we demonstrate that Fermi level pinning due to metal induced gap states at the interfaces occurs at 0.14 eV above the valence band maximum. In this configuration, thermionic emission is observed for the first time at temperatures between 40 K and 75 K. Finally, we discuss the ability to tune the barrier height using a gate electrode.
Submitted 12 March, 2018;
originally announced March 2018.
-
Strain-engineered inverse charge-funnelling in layered semiconductors
Authors:
Adolfo De Sanctis,
Iddo Amit,
Steven P. Hepplestone,
Monica F. Craciun,
Saverio Russo
Abstract:
The control of charges in a circuit due to an external electric field is ubiquitous to the exchange, storage and manipulation of information in a wide range of applications, from electronic circuits to synapses in neural cells. Conversely, the ability to grow clean interfaces between materials has been a stepping stone for engineering built-in electric fields largely exploited in modern photovoltaics and opto-electronics. The emergence of atomically thin semiconductors is now enabling new ways to attain electric fields and unveil novel charge transport mechanisms. Here, we report the first direct electrical observation of the inverse charge-funnel effect enabled by deterministic and spatially resolved strain-induced electric fields in a thin sheet of HfS2. We demonstrate that charges driven by these spatially varying electric fields in the channel of a phototransistor lead to a 350% enhancement in the responsivity. These findings could enable the informed design of highly efficient photovoltaic cells.
Submitted 29 January, 2018;
originally announced January 2018.
-
High-Mobility and High-Optical Quality Atomically Thin WS2
Authors:
Francesco Reale,
Pawel Palczynski,
Iddo Amit,
Gareth F. Jones,
Jake D. Mehew,
Agnes Bacon,
Na Ni,
Peter C. Sherrell,
Stefano Agnoli,
Monica F. Craciun,
Saverio Russo,
Cecilia Mattevi
Abstract:
The rise of atomically thin materials has the potential to enable a paradigm shift in modern technologies by introducing multi-functional materials in the semiconductor industry. To date, the growth of high quality atomically thin semiconductors (e.g. WS2) is one of the most pressing challenges to unleash the potential of these materials, and the growth of mono- or bi-layers with high crystal quality is yet to see its full realization. Here, we show that the novel use of molecular precursors in the controlled synthesis of mono- and bi-layer WS2 leads to superior material quality compared to the widely used topotactic transformation of WO3-based precursors. Record high room temperature charge carrier mobility up to 52 cm2/Vs and ultra-sharp photoluminescence linewidth of just 36 meV over submillimeter areas demonstrate that the quality of this material also surpasses that of naturally occurring materials. By exploiting surface diffusion kinetics of W and S species adsorbed onto a substrate, a deterministic layer thickness control has also been achieved, promoting the design of scalable synthesis routes.
Submitted 24 July, 2017;
originally announced July 2017.
-
Role of Charge Traps in the Performance of Atomically-Thin Transistors
Authors:
Iddo Amit,
Tobias J. Octon,
Nicola J. Townsend,
Francesco Reale,
C. David Wright,
Cecilia Mattevi,
Monica F. Craciun,
Saverio Russo
Abstract:
Transient currents in an atomically thin MoTe$_2$ field-effect transistor are measured during cycles of pulses through the gate electrode. The transients are analyzed in light of a newly proposed model for charge trapping dynamics that renders a time-dependent change in threshold voltage the dominant effect on the channel hysteretic behavior over emission currents from the charge traps. The proposed model is expected to be instrumental in understanding the fundamental physics that governs the performance of atomically thin FETs and is applicable to the entire class of atomically thin-based devices. Hence, the model is vital to the intelligent design of fast and highly efficient opto-electronic devices.
Submitted 16 March, 2017;
originally announced March 2017.
-
Multiple State EFN Transistors
Authors:
Gideon Segev,
Iddo Amit,
Andrey Godkin,
Alex Henning,
Yossi Rosenwaks
Abstract:
Electrostatically Formed Nanowire (EFN) based transistors have been suggested in the past as gas sensing devices. These transistors are multiple gate transistors in which the source to drain conduction path is determined by the bias applied to the back gate and two junction gates. If a specific bias is applied to the side gates, the conduction band electrons between them are confined to a well-defined area forming a narrow channel, the Electrostatically Formed Nanowire. Recent work has shown that by applying non-symmetric bias on the side gates, the lateral position of the EFN can be controlled. We propose a novel Multiple State EFN Transistor (MSET) that utilizes this degree of freedom for the implementation of complete multiplexer functionality in a single transistor-like device. The multiplexer functionality allows a very simple implementation of binary and multiple valued logic functions.
Submitted 19 March, 2015; v1 submitted 25 February, 2015;
originally announced February 2015.