
This article has been accepted for publication in a future issue of this journal, but has not been

fully edited. Content may change prior to final publication.

TRANSACTIONS ON SOFTWARE ENGINEERING 1

DECOR: A Method for the Specification and Detection of Code and Design Smells
Naouel Moha, Yann-Gaël Guéhéneuc, Laurence Duchien, and Anne-Françoise Le Meur

Abstract: Code and design smells are poor solutions to recurring implementation and design problems. They may hinder the evolution of a system by making it hard for software engineers to carry out changes. We propose three contributions to the research field related to code and design smells: (1) DECOR, a method that embodies and defines all the steps necessary for the specification and detection of code and design smells; (2) DETEX, a detection technique that instantiates this method; and (3) an empirical validation in terms of precision and recall of DETEX. The originality of DETEX stems from the ability for software engineers to specify smells at a high level of abstraction using a consistent vocabulary and domain-specific language for automatically generating detection algorithms. Using DETEX, we specify four well-known design smells: the antipatterns Blob, Functional Decomposition, Spaghetti Code, and Swiss Army Knife, and their 15 underlying code smells, and we automatically generate their detection algorithms. We apply and validate the detection algorithms in terms of precision and recall on XERCES v2.7.0, and discuss the precision of these algorithms on 11 open-source systems.

Index Terms: Antipatterns, design smells, code smells, specification, meta-modelling, detection, JAVA.

I. INTRODUCTION

Software systems need to evolve continually to cope with ever-changing requirements and environments. However, in contrast to design patterns [1], code and design smells (“poor” solutions to recurring implementation and design problems) may hinder their evolution by making it hard for software engineers to carry out changes.

Code and design smells include low-level or local problems such as code smells [2], which are usually symptoms of more global design smells such as antipatterns [3]. Code smells are indicators or symptoms of the possible presence of design smells. Fowler [2] presented 22 code smells, structures in the source code that suggest the possibility of refactorings. Duplicated code, long methods, large classes, and long parameter lists are just a few symptoms of design smells and opportunities for refactorings.

One example of a design smell is the Spaghetti Code antipattern (see footnote 1), which is characteristic of procedural thinking in object-oriented programming. Spaghetti Code is revealed by classes without structure that declare long methods without parameters. The names of the classes and methods may suggest procedural programming. Spaghetti Code does not exploit object-orientation mechanisms such as polymorphism and inheritance and prevents their use.

We use the term “smells” to denote both code and design smells. This use does not exclude that, in a particular context, a smell can be the best way to actually design or implement a system. For example, parsers generated automatically by parser generators are often Spaghetti Code, i.e., very large classes with very long methods. Yet, although such classes “smell”, software engineers must manually evaluate their possible negative impact according to the context.

The detection of smells can substantially reduce the cost of subsequent activities in the development and maintenance phases [4]. However, detection in large systems is a very time- and resource-consuming and error-prone activity [5] because smells cut across classes and methods and their descriptions leave much room for interpretation.

Several approaches, as detailed in Section II, have been proposed to specify and detect smells. However, they have three limitations. First, the authors do not explain the analysis leading to the specifications of smells and the underlying detection framework. Second, the translation of the specifications into detection algorithms is often black-box, which prevents replication. Finally, the authors do not present the results of their detection on a representative set of smells and systems to allow comparison among approaches. So far, reported results concern proprietary systems and a reduced number of smells.

We present three contributions to overcome these limitations. First, we propose DECOR (DEtection & CORrection; see footnote 2), a method that describes all the steps necessary for the specification and detection of code and design smells. This method embodies in a coherent whole all the steps defined by previous work and, thus, provides a means to compare existing techniques and to suggest future work.

Second, we revisit in the context of the DECOR method our detection technique [6], [7], now called DETEX (DETection EXpert). DETEX allows software engineers to specify smells at a high level of abstraction using a unified vocabulary and domain-specific language, obtained from an in-depth domain analysis, and to automatically generate detection algorithms. Thus, DECOR represents a concrete and generic method for the detection of smells with respect to previous work and DETEX is an instantiation or a concrete implementation of this method in the form of a detection technique.

N. Moha, post-doctoral fellow in INRIA – Triskell Team, IRISA - Université de Rennes 1, France. E-mail: moha@irisa.fr
Y.-G. Guéhéneuc, Associate Professor, in Département de génie informatique et génie logiciel, École Polytechnique de Montréal, Québec, Canada. E-mail: yann-gael.gueheneuc@polymtl.ca
L. Duchien, Professor, and A.-F. Le Meur, Associate Professor, in LIFL, INRIA Lille-Nord Europe – ADAM Team, Université de Lille, France. E-mail: {Laurence.Duchien, Anne-Francoise.Le-Meur}@lifl.fr
Footnote 1: This smell, like those presented later on, is really in between implementation and design.
Footnote 2: Correction is future work.

Digital Object Identifier 10.1109/TSE.2009.50. 0098-5589/09/$26.00 © 2009 IEEE.
Authorized licensed use limited to: ECOLE POLYTECHNIQUE DE MONTREAL. Downloaded on November 6, 2009 at 15:12 from IEEE Xplore. Restrictions apply.
Third, we validate DETEX using precision and recall on the open-source system XERCES and precision on 11 other systems. We thus show indirectly the usefulness of DECOR. This extensive validation is the first report in the literature of both precision and recall with open-source software systems.

These three contributions take up and expand our previous work on code and design smells [6], [7] to form a consistent whole that provides all the necessary details to understand, use, replicate, and pursue our work. Therefore, we take up the domain analysis, language, underlying detection framework, and results of the recall on XERCES.

The paper is organized as follows. Section II surveys related work. Section III describes the DECOR method and introduces its instantiation DETEX. Section IV details each step of the implementation of DETEX illustrated on the Spaghetti Code as a running example. Section V describes the validation of DETEX with the specification and detection of three additional design smells: Blob, Functional Decomposition, and Swiss Army Knife, on 11 object-oriented systems: ARGOUML, AZUREUS, GANTTPROJECT, LOG4J, LUCENE, NUTCH, PMD, QUICKUML, two versions of XERCES, and ECLIPSE. Section VI concludes and presents future work.

II. RELATED WORK

Many works exist on the identification of problems in software testing [8], databases ([9], [10]), and networks [11]. We survey here those works directly related to the detection of smells by presenting their existing descriptions, detection techniques, and related tools. Related work on design pattern identification (e.g., [12]) is beyond the scope of this paper.

A. Descriptions of Smells

Several books have been written on smells. Webster [13] wrote the first book on smells in the context of object-oriented programming, including conceptual, political, coding, and quality-assurance pitfalls. Riel [14] defined 61 heuristics characterising good object-oriented programming that enable engineers to assess the quality of their systems manually and provide a basis for improving design and implementation. Beck in Fowler’s book [2] compiled 22 code smells that are low-level design problems in source code, suggesting that engineers should apply refactorings. Code smells are described in an informal style and associated with a manual detection process. Mäntylä [15] and Wake [16] proposed classifications for code smells. Brown et al. [3] focused on the design and implementation of object-oriented systems and described 40 antipatterns textually, i.e., general design smells including the well-known Blob and Spaghetti Code.

These books provide in-depth views on heuristics, code smells, and antipatterns aimed at a wide academic audience. However, manually inspecting code for smells based only on text-based descriptions is a time-consuming and error-prone activity. Thus, some researchers have proposed smell detection approaches.

B. Detection Techniques

Travassos et al. [5] introduced a process based on manual inspections and reading techniques to identify smells. No attempt was made to automate this process, and thus, it does not scale to large systems easily. Also, the process only covers the manual detection of smells, not their specification.

Marinescu [17] presented a metric-based approach to detect code smells with detection strategies, implemented in the IPLASMA tool. The strategies capture deviations from good design principles and consist of combining metrics with set operators and comparing their values against absolute and relative thresholds.

Munro [18] noticed the limitations of text-based descriptions and proposed a template to describe code smells systematically. This template is similar to the one used for design patterns [1]. It consists of three main parts: a code smell name, a text-based description of its characteristics, and heuristics for its detection. It is a step towards more precise specifications of code smells. Munro also proposed metric-based heuristics to detect code smells, which are similar to Marinescu’s detection strategies. He also performed an empirical study to justify the choice of metrics and thresholds for detecting smells.

Alikacem and Sahraoui [19] proposed a language to detect violations of quality principles and smells in object-oriented systems. This language allows the specification of rules using metrics, inheritance or association relationships among classes, according to the engineers’ expectations. It also allows using fuzzy logic to express the thresholds of rule conditions. The rules are executed by an inference engine.

Some approaches for complex software analysis use visualisation techniques [20], [21]. Such semi-automatic approaches are interesting compromises between fully automatic detection techniques that can be efficient but lose track of context, and manual inspection that is slow and inaccurate [22]. However, they require human expertise and are thus still time-consuming. Other approaches perform fully automatic detection of smells and use visualisation techniques to present the detection results [23], [24].

Other related approaches include architectural consistency checkers, which have been integrated in style-oriented architectural development environments [25], [26], [27]. For example, active agents acting as critics [27] can check properties of architectural descriptions, identify potential syntactic and semantic errors, and report them to the designer.

All these approaches have contributed significantly to the automatic detection of smells. However, none presents a complete method including a specification language, an explicit detection platform, a detailed processing, and a validation of the detection technique.

C. Tools

In addition to detection techniques, several tools have been developed to find smells and implementation problems and/or syntax errors.

Annotation checkers such as ASPECT [28], LCLINT [29], or EXTENDED STATIC CHECKER [30] use program verification techniques to identify code smells. These tools require the engineers’ assistance to add annotations in the code that can be used to verify the correctness of the system.

SMALLLINT [31] analyses Smalltalk code to detect bugs, possible programming errors, or unused code. FINDBUGS [32]
detects correctness- and performance-related bugs in JAVA systems. SABER [33] detects latent coding errors in J2EE-based applications. ANALYST4J [34] allows the identification of antipatterns and code smells in JAVA systems using metrics. PMD [35], CHECKSTYLE [36], and FXCOP [37] check coding styles. PMD [35] and HAMMURAPI [38] also allow developers to write detection rules using JAVA or XPATH. However, the addition of new rules is intended for engineers familiar with JAVA and XPATH, which could limit access to a wider audience. With SEMMLECODE [39], engineers can execute queries against source code, using a declarative query language called .QL, to detect code smells.

CROCOPAT [40] provides means to manipulate relations of any arity with a simple and expressive query and manipulation language. This tool allows many structural analyses in models of object-oriented systems including design patterns identification and detection of problems in code (for example, cycles, clones, and dead code).

Model checkers such as BLAST [41] and MOPS [42] also relate to code problems by checking for violations of temporal safety properties in C systems using model checking techniques.

Most of these tools detect predefined smells at the implementation level such as bugs or coding errors. Some of them, such as PMD [35] and HAMMURAPI [38], allow engineers to specify new detection rules for smells using languages such as JAVA or XPATH.

III. DECOR AND ITS INSTANTIATION, DETEX

Although previous works offer ways to specify and detect code and design smells, each has its particular advantages and focuses on a subset of all the steps necessary to define a detection technique systematically. The processes used and choices made to specify and implement the smell detection algorithms are often not explicit: they are often driven by the services of the underlying detection framework rather than by an exhaustive study of the smell descriptions.

Therefore, as a first contribution, we propose DECOR, a method that subsumes all the steps necessary to define a detection technique. The method defines explicitly each step to build a detection technique. All steps of DECOR are partially instantiated by the previous approaches. Thus, the method encompasses previous work in a coherent whole. Figure 1 (a) shows the five steps of the method. The following items summarise its steps:
• Step 1. Description Analysis: Key concepts are identified in the text-based descriptions of smells in the literature. They form a unified vocabulary of reusable concepts to describe smells.
• Step 2. Specification: The concepts, which constitute a vocabulary, are combined to specify smells systematically and consistently.
• Step 3. Processing: The specifications are translated into algorithms that can be directly applied for the detection.
• Step 4. Detection: The detection is performed on systems using the specifications previously processed and returns the list of code constituents (e.g., classes, methods) suspected of having smells.
• Step 5. Validation: The suspected code constituents are manually validated to verify that they have real smells.

The first step of the method is generic and must be based on a representative set of smells. Steps 2 and 3 must be followed when specifying a new smell. The last two steps 4 and 5 are repeatable and must be applied on each system. Feedback loops exist among the steps when the validation of the output of a step suggests changing the output of its precursor. During the iterative validation, we proceed as follows: in Step 1, we may expand the vocabulary of smells; in Step 2, we may extend the specification language; in Step 3, we may refine and reprocess the specifications to reduce the number of erroneous detection results. The engineers choose the stopping criteria depending on their needs and the outputs of the detection. Steps 1, 2 and 5 remain manual by nature.

Figure 1 (a) contrasts the DECOR method with previous work. Some previous work [2], [3], [13], [14] provided text-based descriptions of smells but none performed a complete analysis of these descriptions. Munro [18] improved the descriptions by proposing a template including heuristics for their detection. However, he did not propose any automatic process for their detection. Marinescu [17] proposed a detection technique based on high-level specifications. However, he did not make explicit the processing of these specifications, which appears as a black box. Alikacem and Sahraoui [19] also proposed high-level specifications but did not provide any validation of their approach. Tools focused on implementation problems and could provide hints on smells and thus implement parts of the detection. Although these tools provide languages for specifying new smells, these specifications are intended for developers and, thus, are not high-level specifications. Only Marinescu [17] and Munro [18] provide some results of their detection but only on a few smells and proprietary systems.

As our second contribution, we now revisit our previous detection technique [6], [7] within the context of DECOR. Figure 1 (b) presents an overview of the four steps of DETEX, which are instances of the steps of DECOR. It also emphasises the steps, inputs, and outputs specific to DETEX. The following items summarise the steps in DETEX:
• Step 1. Domain Analysis: This first step consists of performing a thorough analysis of the domain related to smells to identify key concepts in their text-based descriptions. In addition to a unified vocabulary of reusable concepts, a taxonomy and classification of smells are defined using the key concepts. The taxonomy highlights and charts the similarities and differences among smells and their key concepts.
• Step 2. Specification: The specification is performed using a domain-specific language (DSL) in the form of rule cards using the previous vocabulary and taxonomy. A rule card is a set of rules. A rule describes the properties that a class must have to be considered a smell. The DSL allows defining properties for the detection of smells, specifying the structural relationships among these properties and characterising properties according to their lexicon (i.e., names), structure (e.g., classes using global variables), and internal attributes using metrics.


[Figure 1. Panel (a), DECOR Method: five steps (1 Description Analysis, 2 Specification, 3 Processing, 4 Detection, 5 Validation), grouped under the labels "On all smells", "On each smell", and "On each system". The artefacts exchanged between steps are: text-based descriptions of smells, vocabulary, specifications, operational specifications, suspicious code constituents, and code constituents having smells, with the source code of the system as an additional input. Related work is positioned along the steps: Brown et al. (1998), Fowler (1999), Riel (1996), Webster (1995), Marinescu (2004), Munro (2005), Alikacem et al. (2006), Travassos (1999, manual detection), and the tools SmallLint, PMD, and CROCOPAT. Panel (b), DETEX Technique: four steps (1 Domain Analysis, 2 Specification, 3 Algorithm Generation, 4 Detection). The artefacts exchanged are: text-based descriptions of smells, vocabulary and taxonomy, rule cards, detection algorithms, and suspicious classes, with a model of the system as an additional input.]

Fig. 1. (a) DECOR Method Compared to Related Work. (Boxes are steps and arrows connect the inputs and outputs of each step. Gray boxes represent fully-automated steps.)
(b) DETEX Detection Technique. (The steps, inputs, and outputs in bold, italics, and underlined are specific to DETEX compared with DECOR.)

• Step 3. Algorithm Generation: Detection algorithms are automatically generated from models of the rule cards. These models are obtained by reifying the rules using a dedicated meta-model and parser. A framework supports the automatic generation of the detection algorithms.
• Step 4. Detection: Detection algorithms are applied automatically on models of systems obtained from original designs produced during forward engineering or through reverse engineering of the source code.

DETEX is original because the detection algorithms are not ad hoc but generated using a DSL obtained from an in-depth domain analysis of smells. A DSL benefits the domain experts, engineers, and quality experts because they can specify and modify manually the detection rules using high-level abstractions pertaining to their domain, taking into account the context of the analysed systems. The context corresponds to all information related to the characteristics of the systems including types (prototype, system in development or maintenance, embedded system, etc.), design choices (related to design heuristics and principles), and coding standards.

IV. DETEX IN DETAILS

The following sections describe the four steps of DETEX using a common pattern: input, output, description, and implementation. Each step is illustrated by a running example using the Spaghetti Code and followed by a discussion.

A. Step 1: Domain Analysis

The first step of DETEX is inspired by the activities suggested for domain analysis [43], which “is a process by which information used in developing software systems is identified, captured, and organised with the purpose of making it reusable when creating new systems”. In the context of smells, information relates to smells, software systems are detection algorithms, and the information on smells must be reusable when specifying new smells. Domain analysis ensures that the language for specifying smells is built upon consistent high-level abstractions and is flexible and expressive. This step is crucial to DETEX because its output serves as input for all the following steps. In particular, the identified key concepts will be specified as properties and values in the next two steps.

1) Process:

Input: Text-based descriptions of design and code smells in the literature, such as [2], [3], [13], [14].

Output: A textual list of the key concepts used in the literature to describe smells, which forms a vocabulary for smells. Also, a classification of code and design smells and a taxonomy in the form of a map highlighting similarities, differences, and relationships among smells.

Description: This first step deals with identifying, defining, and organising key concepts used to describe smells, including metric-based heuristics as well as structural and lexical data [7]. The key concepts refer to keywords or specific concepts of object-oriented programming used to describe smells in the literature ([2], [3], [13], [14]). They form a vocabulary of reusable concepts to specify smells.

The domain analysis requires a thorough search of the literature for key concepts in the smell descriptions. We perform the analysis in an iterative way: for each description of a smell, we extract all key concepts, compare them with already-found concepts, and add them to the domain avoiding synonyms and homonyms. A synonym is the same concept with
two different names, and homonyms are two different concepts with the same name. Thus, we obtain a compilation of concepts that form a concise and unified vocabulary.

We define and classify manually smells using the key concepts. Smells sharing the same concepts belong to the same category. The classification limits possible misinterpretation, avoiding synonyms and homonyms at any level of granularity. We sort the concepts according to the types of properties on which they apply: measurable, lexical, or structural.

Measurable properties are concepts expressed with measures of internal attributes of constituents of systems (classes, methods, fields, relationships, and so on). Lexical properties relate to the vocabulary used to name constituents. They characterise constituents with specific names defined in lists of keywords or in a thesaurus. Structural properties and relationships define the structures of constituents (for example, fields corresponding to global variables) and their relationships (for example, an association relationship between classes).

Figures 2 and 3 show the classifications of the four antipatterns of interest in this paper, described in Table I, and their code smells. These classifications organise and structure smells consistently at the different levels of granularity.

We then use the vocabulary to manually organise all smells with respect to one another and build a taxonomy that puts all smells on a single map and highlights their relationships. The map organises and combines smells, such as antipatterns and code smells, and other related key concepts using set operators such as intersection and union.

Implementation: This step is intrinsically manual. It requires the engineers’ expertise and can seldom be supported by tools.

The Blob (also called God class [14]) corresponds to a large controller class that depends on data stored in surrounding data classes. A large class declares many fields and methods with a low cohesion. A controller class monopolises most of the processing done by a system, takes most of the decisions, and closely directs the processing of other classes [44]. Controller classes can be identified using suspicious names such as Process, Control, Manage, System, and so on. A data class contains only data and performs no processing on these data. It is composed of highly cohesive fields and accessors.

The Functional Decomposition antipattern may occur if experienced procedural developers with little knowledge of object-orientation implement an object-oriented system. Brown describes this antipattern as “a ‘main’ routine that calls numerous subroutines”. The Functional Decomposition design smell consists of a main class, i.e., a class with a procedural name, such as Compute or Display, in which inheritance and polymorphism are scarcely used, that is associated with small classes, which declare many private fields and implement only a few methods.

The Spaghetti Code is an antipattern that is characteristic of procedural thinking in object-oriented programming. Spaghetti Code is revealed by classes with no structure, declaring long methods with no parameters, and utilising global variables. Names of classes and methods may suggest procedural programming. Spaghetti Code does not exploit and prevents the use of object-orientation mechanisms, polymorphism and inheritance.

The Swiss Army Knife refers to a tool fulfilling a wide range of needs. The Swiss Army Knife design smell is a complex class that offers a high number of services, for example, a complex class implementing a high number of interfaces. A Swiss Army Knife is different from a Blob, because it exposes a high complexity to address all foreseeable needs of a part of a system, whereas the Blob is a singleton monopolising all processing and data of a system. Thus, several Swiss Army Knives may exist in a system, for example utility classes.

TABLE I. List of Design Smells (The key concepts are in bold and italics.).

[Figure 2: classification tree of code smells]
Code Smell
  Inter-Class
    Structural: Message Chain, Shotgun Surgery, Duplicated Code (bis)
    Lexical: Comments (bis)
    Measurable: Duplicated Code (bis)
  Intra-Class
    Structural: DataClass, No Polymorphism, Global Variable
    Lexical: Comments (bis), Controller Class, Procedural Class
    Measurable: Long Method, Large Class, No Inheritance, Low Cohesion, Divergent Change, Long Parameter List

Fig. 2. Classification of some Code Smells. (Fowler’s smells are in gray.)

2) Running Example:

Analysis of the Spaghetti Code: We summarise the text-based description of the Spaghetti Code [3, page 119] in Table I along with those of the Blob [page 73], Functional Decomposition [page 97], and Swiss Army Knife [page 197]. In the description of the Spaghetti Code, we identify the key concepts (in italic in the table) of classes with long methods with no parameter, with procedural names, declaring global variables, and not using inheritance and polymorphism.
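
The iterative analysis described under Process (extract the key concepts of each description, compare them with the concepts already found, and avoid synonyms and homonyms) can be illustrated with a small sketch. It is not part of DETEX, where this step is manual: the `Vocabulary` class, its method names, and the alias "method without parameters" are hypothetical and chosen only for illustration. The key concepts and their kinds (measurable, lexical, structural) are those of this running example; the three concepts listed for the Functional Decomposition are read from its description in Table I.

```python
from dataclasses import dataclass, field


@dataclass
class Vocabulary:
    """Unified vocabulary built iteratively from smell descriptions (illustrative only)."""
    # canonical concept name -> kind of property ("measurable", "lexical", "structural")
    concepts: dict = field(default_factory=dict)
    # alternative name -> canonical concept name (synonyms collapse onto one entry)
    synonyms: dict = field(default_factory=dict)
    # smell name -> set of canonical concepts found in its description
    smells: dict = field(default_factory=dict)

    def canonical(self, name):
        """Normalise a name and map a known synonym to its canonical concept."""
        key = name.strip().lower()
        return self.synonyms.get(key, key)

    def add_synonym(self, alias, concept):
        """Declare that two names denote the same concept."""
        self.synonyms[alias.strip().lower()] = self.canonical(concept)

    def add_description(self, smell, key_concepts):
        """Record the key concepts extracted by hand from one smell description."""
        found = set()
        for name, kind in key_concepts:
            concept = self.canonical(name)
            known = self.concepts.get(concept)
            if known is not None and known != kind:
                # One name used for two different concepts: a homonym to resolve by hand.
                raise ValueError(f"homonym: {concept!r} is both {known} and {kind}")
            self.concepts[concept] = kind
            found.add(concept)
        self.smells[smell] = found
        return found


vocabulary = Vocabulary()
# Hypothetical alias, to show how a synonym is folded into one concept.
vocabulary.add_synonym("method without parameters", "no parameter")

spaghetti = vocabulary.add_description("Spaghetti Code", [
    ("long method", "measurable"),
    ("method without parameters", "measurable"),
    ("no inheritance", "measurable"),
    ("procedural name", "lexical"),
    ("global variable", "structural"),
    ("no polymorphism", "structural"),
])
functional = vocabulary.add_description("Functional Decomposition", [
    ("procedural name", "lexical"),
    ("no inheritance", "measurable"),
    ("no polymorphism", "structural"),
])

# Concepts shared by both descriptions, obtained with a set intersection.
shared = spaghetti & functional
print(sorted(shared))
```

Keeping one canonical entry per concept is what lets two smells be compared with plain set operators, which is the role the text assigns to the unified vocabulary.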
We obtain the following classification for the Spaghetti Code: its measurable properties include the concepts of long methods, methods with no parameter, inheritance; its lexical properties include the concepts of procedural names; its structural properties include the concepts of global variables, and polymorphism. The Spaghetti Code does not involve structural relationships among constituents. Such relationships appear in Blob and Functional Decomposition, for example through the key concepts depends on data and associated with small classes. Measurable properties are characterised by values specified using keywords such as high, low, few, and many, for example in the textual descriptions of the Blob, Functional Decomposition, and Swiss Army Knife, but not explicitly in the Spaghetti Code. The properties can be combined using set operators such as intersection and union. For example, all properties must be present to characterise a class as Spaghetti Code. More details on the properties and their possible values for the key concepts are given in Section IV-B where we present the DSL built from the domain analysis, its grammar, and an exhaustive list of the properties and values.

Classification of Code Smells: Beck [2] provided a catalog of code smells but did not define any categories of or relationships among the smells. This lack of structuring hinders their identification, comparison, and, consequently, detection.

Efforts have been made to classify these symptoms. Mäntylä [15] proposed seven categories, such as object-orientation abusers or bloaters, including long methods, large classes, or long parameter lists. Wake [16] distinguished code smells that occur in or among classes. He further distinguished measurable smells, smells related to code duplication, smells due to conditional logic, and others. These two classifications are based on the nature of the smells. We are also interested
in their properties, structure, and lexicon, as well as their coverage (intra- and inter-classes [45]) because these reflect better the spread of the smells.

Figure 2 shows the classification of some code smells. Following Wake, we distinguish code smells occurring in and among classes. We further divide the two subcategories into structural, lexical, and measurable code smells. This division helps in identifying appropriate detection techniques. For example, the detection of a structural smell may essentially be based on static analyses; the detection of a lexical smell may rely on natural language processing; the detection of a measurable smell may use metrics. Our classification is generic and classifies code smells in more than one category (e.g., Duplicated Code).

Classification of Antipatterns: An antipattern [3] is a literary form describing a bad solution to a recurring design problem that has a negative impact on the quality of a system design. Contrary to design patterns, antipatterns describe what not to do. There exist general antipatterns [3] and antipatterns specific to concurrent processes [46], J2EE [47], [48], performance [49], XML [48], and other sub-fields of software engineering.

Brown et al. [3] classified antipatterns in three main categories: development, architecture, and project management. We focus on the antipatterns related to development and architecture because they represent poor design choices. Moreover, their correction may enhance the quality of systems and their detection is possible semi-automatically.

Figure 3 summarises the classification of the antipatterns. We use the previous classification of code smells to classify antipatterns according to their associated code smells. In particular, we distinguish between intra-class smells (smells in a class) and inter-class smells (smells spreading over more than one class). This distinction highlights the extent of the code inspection required to detect a smell. For example, we classify the Spaghetti Code antipattern as an intra-class design smell belonging to the structural, lexical, and measurable sub-categories because its code smells include long methods (measurable code smell), global variables (structural code smell), procedural names (lexical code smell), and absence of inheritance (another measurable code smell).

[Figure 3: antipatterns are divided into inter-class and intra-class categories, each subdivided into structural, lexical, and measurable; the antipatterns shown are Blob / God Class, Functional Decomposition, Spaghetti Code, and Swiss Army Knife.]

Fig. 3. Classification of Antipatterns.

Taxonomy of Design Smells: Figure 4 summarises the classifications as a taxonomy in the form of a map. It is similar to Gamma et al.’s Pattern Map [1, inside back cover]. We only show the four design smells, including the Spaghetti Code, used in this paper for the sake of clarity.

This taxonomy describes the structural relationships among code and design smells, and their measurable, structural, and lexical properties (ovals in white). It also describes the structural relationships (edges) among design smells (hexagons) and some code smells (ovals in gray). It gives an overview of all key concepts that characterise a design smell. It also makes explicit the relationships among code and design smells.

Figure 4 presents the taxonomy that shows the relationships among design and code smells. This map is useful to prevent misinterpretation by clarifying and classifying smells based on their key concepts. Indeed, several sources of information may result in conflicting smell descriptions and the domain experts’ judgement is required to resolve such conflicts. Lanza et al. [23] introduced the notion of correlation webs to also show the relationships among code smells. We introduce an additional level of granularity by adding antipatterns and include more information related to their properties.

3) Discussion:
The distinction between structural and measurable smells does not exclude the fact that the structure of a system is measurable. However, structural properties sometimes express better constraints among classes than metrics. While metrics report numbers, we may want to express the presence of a particular relation between two classes to describe a smell more precisely. In the example of the Spaghetti Code, we use a structural property to characterise polymorphism and a measurable property for inheritance. However, we could use a measurable property to characterise polymorphism and a structural property for inheritance. Such choices are left to domain experts who can choose the property that best fits their understanding of the smells in the context in which they want to detect them. With respect to the lexical properties, we use a list of keywords to identify specific names but, in future work, we plan to use WORDNET, a lexical database of English, to deal with synonyms and widen the list of keywords.

The domain analysis is iterative because the addition of a new smell description may require the extraction of a new key concept, its comparison with existing concepts, and its classification. In our domain analysis, we study 29 smells including 8 antipatterns and 21 code smells. These 29 smells are representative of the whole set of smells described in the literature and include about 60 key concepts. These key concepts are at different levels of abstraction (structural relationships, properties, and values) and of different types (measurable, lexical, structural). They form a consistent vocabulary of reusable concepts to specify smells. In this step, we named the key concepts related to the Blob, Functional Decomposition, Spaghetti Code, and Swiss Army Knife. We will further detail these concepts in the next two steps.

Thus, our domain analysis is complete enough to describe a whole range of smells and can be extended, if required, during another iteration of the domain analysis. We have described without difficulty some new smells that were not used for the domain analysis. However, this domain analysis does not allow the description of smells related to the behavior of systems. Current research work [50] will allow us to describe, specify, and detect this new category of smells.

[Figure 4: taxonomy map. Antipatterns shown: Blob, Functional Decomposition, Spaghetti Code, and Swiss Army Knife. Code smells shown include Large Class, Low Cohesion, Controller Class, Controller Method, Data Class, Long Method, No Parameter, No Inheritance, No Polymorphism, Procedural Names, Use Global Variable, Multiple Interface, Class One Method, and Field Private. Properties shown are METRIC, LEXIC, and STRUC entries such as METRIC LOC_METHOD VERY_HIGH, METRIC DIT 1, and STRUC USE_GLOBAL_VARIABLE. Edges are labelled union, inter, and associated from ONE / associated to MANY.]

Fig. 4. Taxonomy of Smells. (Hexagons are antipatterns, gray ovals are code smells, and white ovals are properties.)

B. Step 2: Specification

1) Process:
Input: A vocabulary and taxonomy of smells.

Output: Specifications detailing the rules to apply on a model of a system to detect the specified smells.

Description: We formalise the concepts and properties required to specify detection rules at a high level of abstraction using a DSL. The DSL allows the specification of smells in a declarative way as compositions of rules in rule cards. Using the smell vocabulary and taxonomy, we map rules with code smells and rule cards with design smells (i.e., antipatterns). Each antipattern in the taxonomy corresponds to a rule card. Each code smell associated in the taxonomy with an antipattern is described as a rule. The properties in the taxonomy are directly used to express the rules. We make the choice of associating code smells with rules and antipatterns with rule cards for the sake of simplicity but without loss of generality for DETEX.

Implementation: Engineers manually define the specifications for the detection of smells using the taxonomy and vocabulary and, if needed, the context of the analysed systems.

As highlighted in the taxonomy, smells relate to the structure of classes (fields, methods) as well as to the structure of systems (classes and groups of related classes). For uniformity, we consider that smells characterise classes. Thus, a rule detecting long methods reports the classes defining these methods. A rule detecting the misuse of an association relationship returns the class at the source of the relationship. (It is also possible to obtain the class target of the relationship.) Thus, rules have a consistent granularity and their results can be combined using set operators. We chose the class as the level of granularity for the sake of simplicity and without loss of generality.

We define the DSL with a Backus Normal Form (BNF) grammar, shown in Figure 5. A rule card is identified by the keyword RULE_CARD, followed by a name and a set of rules specifying the design smell (line 1). A rule describes a list of properties, such as metrics (lines 8–11), relationships with other rules, such as associations (lines 14–16), and/or combination with other rules, based on available operators such as intersection or union (line 4). Properties can be of three different kinds: measurable, structural, or lexical, and define pairs of identifier–value (lines 5–7).

 1 rule_card ::= RULE_CARD:rule_cardName { (rule)+ };
 2 rule ::= RULE:ruleName { content_rule };
 3 content_rule ::= operator ruleName (ruleName)+ | property | relationship
 4 operator ::= INTER | UNION | DIFF | INCL | NEG
 5 property ::= METRIC id_metric value_metric fuzziness
 6            | LEXIC id_lexic ((lexic_value,)+ )
 7            | STRUCT id_struct
 8 id_metric ::= DIT | NINTERF | NMNOPARAM | LCOM | LOC_CLASS
               | LOC_METHOD | NAD | NMD | NACC | NPRIVFIELD
 9             | id_metric + id_metric
10             | id_metric - id_metric
11 value_metric ::= VERY_HIGH | HIGH | MEDIUM | LOW | VERY_LOW
                  | NUMBER
12 id_lexic ::= CLASS_NAME | INTERFACE_NAME | METHOD_NAME
              | FIELD_NAME | PARAMETER_NAME
13 id_struct ::= USE_GLOBAL_VARIABLE | NO_POLYMORPHISM
               | IS_DATACLASS | ABSTRACT_CLASS
               | ACCESSOR_METHOD | STATIC_METHOD
               | FUNCTION_CLASS | FUNCTION_METHOD
               | PROTECTED_METHOD | OVERRIDDEN_METHOD
               | INHERITED_METHOD | INHERITED_VARIABLE
14 relationship ::= rel_name FROM ruleName card TO ruleName card
15 rel_name ::= ASSOC | AGGREG | COMPOS
16 card ::= ONE | MANY | ONE_OR_MANY | OPTIONNALY_ONE
17 rule_cardName, ruleName, lexic_value ∈ string
18 fuzziness ∈ double

Fig. 5. BNF Grammar of Smell Rule Cards.

Measurable Properties: A measurable property defines a numerical or an ordinal value for a specific metric (lines 8–11). Ordinal values are defined with a five-point Likert scale: very high, high, medium, low, very low. Numerical values are used to define thresholds, whereas ordinal values are used to define
values relative to all the classes of the system under analysis. We define ordinal values with the box-plot statistical technique [51] to relate ordinal values with concrete metric values while avoiding setting artificial thresholds. Metric values can be added or subtracted. The degree of fuzziness defines the acceptable margin around the numerical value or around the threshold relative to the ordinal value (line 5). Although other tools implement the box-plot, such as IPLASMA [52], DETEX enhances this technique with fuzzy logic and thus alleviates the problem related to the definition of thresholds.

A set of metrics was identified during the domain analysis, including the Chidamber and Kemerer metric suite [53], such as depth of inheritance DIT, lines of code in a class LOC_CLASS, lines of code in a method LOC_METHOD, number of attributes in a class NAD, number of methods NMD, lack of cohesion in methods LCOM, number of accessors NACC, number of private fields NPRIVFIELD, number of interfaces NINTERF, or number of methods with no parameters NMNOPARAM. The choice of the metrics is based on the taxonomy of the smells, which highlights the measurable properties needed to detect a given smell. This set of metrics is not restricted and can be easily extended with other metrics.

Lexical Properties: A lexical property relates to the vocabulary used to name a class, interface, method, field, or parameter (line 12). It characterises constituents with specific names defined in a list of keywords (line 6).

Structural Properties: A structural property relates to the structure of a constituent (class, interface, method, field, parameter, and so on) (lines 7, 13). For example, property USE_GLOBAL_VARIABLE checks that a class uses global variables while NO_POLYMORPHISM checks that a class that should use polymorphism does not. The BNF grammar specifies only a subset of possible structural properties; others can be added as new domain analyses are performed.

Set Operators: Properties can be combined using multiple set operators including intersection, union, difference, inclusion, and negation (line 4). (The negation represents the non-inclusion of one set in another.)

Structural Relationships: System classes and interfaces characterised by the previous properties may also be linked with one another with different types of relationships including: association, aggregation, and composition [54] (lines 14–16). Cardinalities define the minimum and maximum numbers of instances of each class participating in a relationship.

2) Running Example:
Figure 6 shows the rule card of the Spaghetti Code, which characterises classes as Spaghetti Code using the intersection of six rules (line 2). A class is Spaghetti Code if it declares methods with a very high number of lines of code (measurable property, line 3), with no parameter (measurable property, line 4); if it does not use inheritance (measurable property, line 5), and polymorphism (structural property, line 6), and has a name that recalls procedural names (lexical property, line 7), while declaring/using global variables (structural property, line 8).

1 RULE_CARD:SpaghettiCode {
2   RULE:SpaghettiCode
      { INTER LongMethod NoParameter NoInheritance
        NoPolymorphism ProceduralName UseGlobalVariable };
3   RULE:LongMethod { METRIC LOC_METHOD VERY_HIGH 10.0 };
4   RULE:NoParameter { METRIC NMNOPARAM VERY_HIGH 5.0 };
5   RULE:NoInheritance { METRIC DIT 1 0.0 };
6   RULE:NoPolymorphism { STRUCT NO_POLYMORPHISM };
7   RULE:ProceduralName { LEXIC CLASS_NAME
      (Make, Create, Exec...) };
8   RULE:UseGlobalVariable { STRUCT USE_GLOBAL_VARIABLE };
9 };

Fig. 6. Rule Card of the Spaghetti Code.

The Spaghetti Code does not include structural relationships because it is an intra-class defect. An example of such a relationship exists in the Blob where a large controller class must be associated with several data classes to be considered a Blob. Such a rule can be written as follows:

RULE:Blob { ASSOC FROM ControllerClass ONE TO DataClass MANY};

3) Discussion:
The domain analysis performed ensures that the specifications are built upon consistent high-level abstractions and capture domain expertise in contrast with general purpose languages, which are designed to be universal [55]. The DSL offers greater flexibility than ad-hoc detection algorithms. In particular, we made no reference at this point to the concrete implementation of the detection of the properties and structural relationships. Thus, it is easier for domain experts to understand the specifications because they are expressed using smell-related abstractions and they focus on what to detect instead of how to detect it, as in logic meta-programming [56]. Also, experts can easily modify the specifications at a high level of abstraction without knowledge of the underlying detection framework, either by adding new rules or by modifying existing ones. They could for example use rule cards to specify smells dependent on industrial or technological contexts. For example, in small applications, they could consider as smells classes with a high DIT but not in large systems. In a management application, they could also consider different keywords as indicating controller classes.

The DSL is concise and expressive and provides a reasoning framework to specify meaningful rules. Moreover, we wanted to avoid an imperative language where, for example, we would use a rule like method[1].parameters.size = 0 to obtain classes with methods with no parameters. Indeed, using the DSL should not require computer skills or knowledge about the underlying framework or meta-model, to be accessible to most experts. In our experiments, graduate students wrote specifications in less than 15 minutes, depending on their familiarity with the smells, with no knowledge of the underlying framework. We provide some rule cards at [57].

Since the method is iterative, if a key concept is missed, we can add it to the DSL later. The method as well as the language are flexible. The flexibility of the rule cards depends on the expressiveness of the language and available key concepts, which has been tested on a representative set of smells, eight antipatterns and 21 code smells.
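To make the measurable properties of this step more concrete, the following is a minimal sketch, not the DETEX implementation, of how an ordinal value such as VERY_HIGH could be related to concrete metric values with a box-plot. It assumes that VERY_HIGH corresponds to values above the upper fence Q3 + 1.5 × IQR and that the fuzziness is an absolute margin subtracted from that fence. Both choices, the quantile convention, the sample values, and all class and method names are assumptions made for illustration.

```java
import java.util.Arrays;

// Illustrative sketch only: not the BoxPlot class of the paper's framework.
final class BoxPlotSketch {

    // Quantile by linear interpolation between order statistics
    // (one common convention; assumed here, not taken from the paper).
    static double quantile(double[] sorted, double p) {
        double position = p * (sorted.length - 1);
        int lower = (int) Math.floor(position);
        int upper = (int) Math.ceil(position);
        return sorted[lower] + (position - lower) * (sorted[upper] - sorted[lower]);
    }

    // Upper fence of the box-plot: Q3 + 1.5 * (Q3 - Q1).
    static double upperFence(double[] metricValues) {
        double[] sorted = metricValues.clone();
        Arrays.sort(sorted);
        double q1 = quantile(sorted, 0.25);
        double q3 = quantile(sorted, 0.75);
        return q3 + 1.5 * (q3 - q1);
    }

    // Assumption: the fuzziness is an absolute margin below the fence, so
    // values slightly under the threshold are still reported.
    static boolean isVeryHigh(double value, double[] metricValues, double fuzziness) {
        return value > upperFence(metricValues) - fuzziness;
    }

    public static void main(String[] args) {
        // Hypothetical metric values, one per class of a system.
        double[] loc = {10, 10, 11, 11, 12, 12, 13, 14, 200};
        System.out.println("upper fence = " + upperFence(loc));
        System.out.println("200 very high? " + isVeryHigh(200, loc, 0.0));
        System.out.println("14 very high with margin 3? " + isVeryHigh(14, loc, 3.0));
    }
}
```

Because the fence is computed from the values of the system under analysis, the same rule adapts to systems of different sizes, which is the motivation the text gives for preferring ordinal values over fixed thresholds.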

C. Step 3: Generation of the Algorithms
We briefly present here the generation step of algorithms for the sake of completeness; details are available in [7].

1) Process:
Input: Rule cards of smells.

Output: Detection algorithms for the smells.

Description: We reify the smell specifications to allow algorithms to access and manipulate programmatically the resulting models. Reification is an important mechanism to manipulate concepts programmatically [58]. From the DSL, we build a meta-model, SMELLDL (Smell Definition Language), and a parser to model rule cards and manipulate these SMELLDL models programmatically. Then, we automatically generate algorithms using templates. The detection algorithms are based both on the models of the smells and on models of systems. The generated detection algorithms are correct by construction of our specifications using a DSL [59].

Implementation: The reification is automatic using the parser with the SMELLDL meta-model. The generation is also automatic and relies on our SMELLFW (Smell FrameWork) framework, which provides services common to all detection algorithms. These services implement operations on the relationships, operators, properties, and ordinal values. The framework also provides services to build, access, and analyse system models. Thus, we can compute metrics, analyse structural relationships, perform lexical and structural analyses on classes, and apply the rules. The set of services and the overall design of the framework have been directed by the key concepts from the domain analysis and the DSL.

Meta-model of Rule Cards: Figure 7 is an excerpt of the SMELLDL meta-model, which defines constituents to represent rule cards, rules, set operators, relationships among rules, and properties. A rule card is specified concretely as an instance of class RuleCard. An instance of RuleCard is composed of objects of type IRule, which describes rules that can be either simple or composite. A composite rule, CompositeRule, is composed of other rules, using the Composite design pattern [1]. Rules are combined using set operators defined in class Operators. Structural relationships are enforced using methods in class Relationships. The meta-model also implements the Visitor design pattern. A parser analyses the rule cards and produces an instance of class RuleCard. The parser is built using JFLEX and JAVACUP and the BNF grammar shown in Figure 5.

[Figure 7: class diagram showing RuleCard, interface IRule, Rule, CompositeRule, Relationships (+associationOneToMany()), Operators (+intersection()), and Properties with Structural, Lexical, and Measurable.]

Fig. 7. Meta-model SMELLDL.

Framework for Detection: The SMELLFW framework is built upon the PADL meta-model (Pattern and Abstract-level Description Language) [12] and on the POM framework (Primitives, Operators, Metrics) for metric computation [60]. PADL is a language-independent meta-model to represent object-oriented systems [61], including binary class relationships [54] and accessors. PADL offers a set of constituents (classes, interfaces, methods, fields, relationships...) to build models of systems. It also provides methods to manipulate these models and generate other models, using the Visitor design pattern. We choose PADL because it has six years of active development and is maintained in-house. We could have used another meta-model such as FAMOOS [62] or GXL [63], or a source model extractor, such as LSME [64].

Figure 8 sketches the architecture of the SMELLFW framework, which consists of two main packages, sad.kernel and sad.util. Package sad.kernel contains core classes and interfaces. Class SAD represents smells and is so far specialised in two subclasses, AntiPattern and CodeSmell. This hierarchy is consistent with our taxonomy of smells. A smell aggregates entities, interface IEntity from padl.kernel. For example, a smell is a set of classes with particular characteristics. Interfaces IAntiPatternDetection and ICodeSmellDetection define the services that detection algorithms must provide. Package sad.util declares utility classes that allow the manipulation of some key concepts of the rule cards.

[Figure 8: package diagram showing packages padl.kernel, sad.kernel, sad.util, and pom.metrics. Elements shown include SAD, ICodeSmellDetection, IAntiPatternDetection, IVisitor, IModel, IEntity, IClass, IInterface, IElement, IMethod, IField, IAssociation, IComposition, IAggregation, Relationships, BoxPlot, Operators, and the metrics LOC_METHOD, LCOM, LOC_CLASS, NAD, NMD, and DIT.]

Fig. 8. Architecture of the SMELLFW Framework.

Set Operators. Class Operators in package sad.util defines the methods required to perform intersection, union, difference, inclusion, and negation between code smells. These operators work on the sets of classes that are potential code smells. They return new sets containing only the appropriate classes. For example, the code below performs an intersection on the set of classes that contain methods without parameter and those with long methods:

    final Set setOfLongMethodsWithNoParameter =
        CodeSmellOperators.intersection(
            setOfLongMethods,
            setOfMethodsWithNoParameter);
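The way such set operators combine the results of individual rules can be sketched in a self-contained form. The sketch below is illustrative only: it is not the Operators class of SMELLFW, it represents classes by their names instead of PADL entities, and the class names in the example are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative sketch only: not the set operators shipped with the framework.
final class SetOperatorsSketch {

    // Classes reported by both rules (operator INTER in the DSL).
    static <T> Set<T> intersection(Set<T> a, Set<T> b) {
        Set<T> result = new LinkedHashSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Classes reported by at least one rule (operator UNION).
    static <T> Set<T> union(Set<T> a, Set<T> b) {
        Set<T> result = new LinkedHashSet<>(a);
        result.addAll(b);
        return result;
    }

    // Classes reported by the first rule but not the second (operator DIFF).
    static <T> Set<T> difference(Set<T> a, Set<T> b) {
        Set<T> result = new LinkedHashSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical results of two rules, as sets of class names.
        Set<String> longMethods =
            new LinkedHashSet<>(Arrays.asList("Parser", "Scanner", "Report"));
        Set<String> noParameter =
            new LinkedHashSet<>(Arrays.asList("Scanner", "Report", "Util"));
        System.out.println(intersection(longMethods, noParameter));
    }
}
```

Each operator returns a new set and leaves its arguments untouched, which matches the description above of operators that return new sets containing only the appropriate classes.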

Measurable Properties. Properties based on metrics are computed using POM, which provides 44 metrics, such as lines of code in a class LOC_CLASS, number of declared methods NMD, or lack of cohesion in methods LCOM, and is easily extensible. Using POM, SMELLFW can compute any metric on a set of classes. For example, in the code below, the metric LOC_CLASS is computed on each class of a system:

    final IClass aClass = iteratorOnClasses.next();
    final double aClassLOC =
        Metrics.compute("LOC_CLASS", aClass);

Class BoxPlot in package sad.util offers methods to compute and access the quartiles and outliers of a set of metric values, as illustrated in the following code excerpt:

    double fuzziness = 0.1;
    final BoxPlot boxPlot =
        new BoxPlot(LOCofSetOfClasses, fuzziness);
    final Map setOfOutliers = boxPlot.getHighOutliers();

Lexical Property. The verification of lexical properties stems from PADL, which checks the names of classes, methods, and fields against names defined in the rule cards. The following code checks, for each class of a system, if its name contains one of the strings specified in a predefined list:

    String[] CTRL_NAMES =
        new String[] { "Calculate", "Display", ..., "Make" };

    final IClass aClass = iteratorOnClasses.next();
    for (int i = 0; i < CTRL_NAMES.length; i++) {
        if (aClass.getName().contains(CTRL_NAMES[i])) {
            // do something
        }
    }

Structural Properties. Any structural property can be verified using PADL, which provides all the constituents and methods to assess structural properties. For example, the method isAbstract() returns true if a class is abstract:

    final IClass aClass = iteratorOnClasses.next();
    boolean isClassAbstract = aClass.isAbstract();

Structural Relationships. PADL also provides constituents describing binary class relationships. We can enforce the existence of certain relationships among classes being potentially a smell, e.g., an association between a main class and its data classes as illustrated by the following code excerpt:

    final Set setOfCandidateBlobs =
        Relations.associationOneToMany(setOfMainClasses,
            setOfDataClasses);

Algorithm Generation: An instance of class RuleCard is the entry point to a model of a rule card. The generation of the detection algorithms is implemented as a visitor on models of rule cards that generates the appropriate source code, based on templates and the services provided by SMELLFW, as illustrated in the following running example. Templates are excerpts of JAVA source code with well-defined tags to be replaced by concrete code. More details on the templates and generation algorithm can be found in [7].

2) Running Example:
The following code excerpt presents the visit method that generates the detection rule associated with a measurable property. When a model of the rule is visited, tag <CODESMELL> is replaced by the name of the rule, tag <METRIC> by the name of the metric, tag <FUZZINESS> by the associated value of the fuzziness in the rule, and tag <ORDINAL_VALUES> by the method associated with the ordinal value.

    public void visit(IMetric aMetric) {
        replaceTAG("<CODESMELL>", aRule.getName());
        replaceTAG("<METRIC>", aMetric.getName());
        replaceTAG("<FUZZINESS>", aMetric.getFuzziness());
        replaceTAG("<ORDINAL_VALUE>", aMetric.getOrdinalValue());
    }
    private String getOrdinalValue(int value) {
        String method = null;
        switch (value) {
            case VERY_HIGH : method = "getHighOutliers";
                             break;
            case HIGH :      method = "getHighValues";
                             break;
            case MEDIUM :    method = "getNormalValues";
                             break;
            default :        method = "getNormalValues";
                             break;
        }
        return method;
    }

The detection algorithm for a design defect is declared as implementing interface IAntiPatternDetection. The algorithm aggregates the detection algorithms of several code smells, implementing interface ICodeSmellDetection. The results of the detections of code smells are combined using set operators to obtain suspicious classes for the antipattern. Excerpts of the generated Spaghetti Code detection algorithm can be found in [7] and on the companion Web site [57].

3) Discussion:
The SMELLDL meta-model and the SMELLFW framework, along with the PADL meta-model and the POM framework, provide the concrete mechanisms to generate and apply detection algorithms. However, using DECOR we could design another language and build another meta-model with the same capabilities. Detection algorithms could be generated against other frameworks. In particular, we could reuse some of the tools presented in the related work in Section II-C.

The addition of another property in the DSL requires the implementation of the analysis within SMELLFW. We experimented informally with the addition of new properties and it took from 15 minutes to one day to add a new property, depending on the complexity of the analysis. This operation is necessary only once per new property.

SMELLDL models must be instantiated for each smell but the SMELLDL meta-model and the SMELLFW framework are generic and do not need to be redefined. Models of systems are built before applying the detection algorithms, while metric values are computed on the fly and as needed.

D. Step 4: Detection

1) Process:
Input: Smell detection algorithms and the model of a system in which to detect the smells.

Output: Suspicious classes whose properties and relationships 1) The DSL allows the specification of many different
conform to the smells specifications. smells. This assumption supports the applicability of
D ETEX on four design smells, composed of 15 code
Description: We automatically apply the detection algorithms smells, and the consistency of the specifications.
on models of systems to detect suspicious classes. Detection 2) The generated detection algorithms have a recall of
algorithms may be applied in isolation or in batch. 100%, i.e., all known design smells are detected, and
Implementation: Calling the generated detection algorithms a precision greater than 50%, i.e., the detection algo-
rithms are better than random chance. Given the trade-
is straightforward, using the services provided by S MELL FW.
off between precision and recall, we assume that 50%
The model of a system could be obtained using reverse en-
gineering by instantiating the constituents of PADL, sketched precision is significant enough with respect to 100%
in Figure 8, or from design documents. recall. This assumption supports the precision of the rule
cards and the adequacy of the algorithm generation and
2) Running Example: of the S MELL FW framework.
Following our running example of the Spaghetti Code and 3) The complexity of the generated algorithms is rea-
X ERCES v2.7.0, we first obtain a model of X ERCES, based sonable, i.e., computation times are in the order of
on the constituents of PADL. We then apply the detection one minute. This assumption supports the precision of
algorithm of the Spaghetti Code on this model to detect and the generated algorithms and the performance of the
report suspicious classes, using the code exemplified below. services of the S MELL FW framework.
In XERCES v2.7.0, we found 76 suspicious Spaghetti Code classes among the 513 classes of the system.

    // Instantiate the generated Spaghetti Code detection on the model of the system
    IAntiPatternDetection antiPatternDetection =
        new SpaghettiCodeDetection(model);
    // Run the detection on the model
    antiPatternDetection.performDetection();
    ...
    // Report the suspicious classes
    outputFile.println(
        antiPatternDetection.getSetOfAntiPatterns());

3) Discussion:
Models on which the detection algorithms are applied can be obtained from original designs produced during forward engineering or from reverse engineering, because industrial designs are seldom available freely. Also, design documents, like documentation in general, are often out-of-date. In many systems with poor documentation, the source code is the only reliable source of information [65] that is precise and up-to-date. Thus, because the efficiency of the detection depends on the model of the system, we chose to work with reverse-engineered data, which provides richer data than usual class diagrams, for example method invocations. DETEX would also apply to class diagrams, yet certain rules would no longer be valid. Thus, we did not analyse class diagrams directly and leave such a study as future work.

V. VALIDATION
Previous detection approaches have been validated on few smells and proprietary systems. Thus, as our third contribution, in addition to the DECOR method and DETEX detection technique, we validate DETEX. The aim of this validation is to study both the application of the four steps of DETEX and the results of their application using four design smells, their 15 code smells, and 11 open-source systems. The validation is performed by independent engineers who assess whether suspicious classes are smells, depending on the contexts of the systems. We put aside domain analysis and smell specification because these steps are manual and their iterative processes would be lengthy to describe.

A. Assumptions of the Validation
We want to validate the three following assumptions:

B. Subjects of the Validation
We use DETEX to describe four well-known but different antipatterns from Brown [3]: Blob, Functional Decomposition, Spaghetti Code, and Swiss Army Knife. Table I summarises these smells, which include in their specifications 15 different code smells, some of which are described in Fowler [2]. We automatically generate associated detection algorithms.

C. Process of the Validation
We validate the results of the detection algorithms by analysing the suspicious classes manually to (1) validate suspicious classes as true positives in the context of the systems and (2) identify false negatives, i.e., smells not reported by our algorithms. Thus, we recast our work in the domain of information retrieval to use the measures of precision and recall [66]. Precision assesses the number of true smells identified among the detected smells, while recall assesses the number of detected smells among the existing smells:

\[ \text{precision} = \frac{|\{\text{existing smells}\} \cap \{\text{detected smells}\}|}{|\{\text{detected smells}\}|} \]

\[ \text{recall} = \frac{|\{\text{existing smells}\} \cap \{\text{detected smells}\}|}{|\{\text{existing smells}\}|} \]

We asked independent engineers to compute the recall of the generated algorithms. Validation is performed manually because only engineers can assess whether a suspicious class is indeed a smell or a false positive, depending on the smell descriptions and the systems' contexts and characteristics. This step is time consuming if the smell specifications are not restrictive enough and the number of suspected classes is large.

D. Objects of the Validation
We perform the validation using the reverse-engineered models of ten open-source JAVA systems: ARGOUML, AZUREUS, GANTTPROJECT, LOG4J, LUCENE, NUTCH, PMD, QUICKUML, and two versions of XERCES. In contrast to
previous work, we use freely available systems to ease comparisons and replications. We provide some information on these systems in Table II. We also apply the algorithms on ECLIPSE but only discuss their results.

Name          Version   Lines of Code  Number of Classes  Number of Interfaces  Description
ARGOUML       0.19.8    113,017        1,230              67                    An extensive UML modelling tool
AZUREUS       2.3.0.6   191,963        1,449              546                   A peer-to-peer client implementing the BitTorrent protocol
GANTTPROJECT  1.10.2    21,267         188                41                    A project-management tool to plan projects with Gantt charts
LOG4J         1.2.1     10,224         189                14                    A logging JAVA package
LUCENE        1.4       10,614         154                14                    A full-featured text-search JAVA engine
NUTCH         0.7.1     19,123         207                40                    An open-source web search engine, based on LUCENE
PMD           1.8       41,554         423                23                    A JAVA source code analyser for identifying low-level problems
QUICKUML      2001      9,210          142                13                    A simple UML class and sequence diagrams modelling tool
XERCES        1.0.1     27,903         189                107                   A framework for building XML parsers in JAVA
XERCES        2.7.0     71,217         513                162                   Release of March 2006 of the XERCES XML parser

TABLE II. List of Systems.

E. Results of the Validation
We report results in three steps. First, we report the precisions and recalls of the detection algorithms for XERCES v2.7.0 for the four design smells using data obtained independently. These data constitute the first available report on the precision and recall of a detection technique. Then, we report the precisions and computation times of the detection algorithms on the ten reverse-engineered open-source systems to show the scalability of DETEX. We illustrate these results by concrete examples. Finally, we also apply our detection algorithms on ECLIPSE v3.1.2, demonstrating their scalability and highlighting the problem of balance among numbers of suspicious classes, precisions, and system context.

1) Precision and Recall on XERCES:
We asked three master's students and two independent engineers to manually analyse XERCES v2.7.0 using only Brown's and Fowler's books as references. They used an integrated development environment, ECLIPSE, to visualise the source code and studied each class separately. When in doubt, they referred to the books and decided by consensus using a majority vote whether a class was actually a design smell. They performed a thorough study of XERCES and produced an XML file containing suspicious classes for the four design smells. A few design smells may have been missed by mistake due to the nature of the task. As future work, we will ask other engineers to perform this same task again to confirm the findings, and to perform it on other systems to increase our database.
Table III presents the precision and recall of the detection of the four design smells in XERCES v2.7.0. We perform all computations on an Intel Dual Core at 1.67GHz with 1Gb of RAM. Computation times do not include building the system model but include computing metrics and checking structural relationships and lexical and structural properties.

Smells   Numbers of True Positives   Numbers of Detected Smells   Precision   Recall   Time
Blob     39/513 (7.6%)               44/513 (8.6%)                88.6%       100%     2.45s
F.D.     15/513 (3.0%)               29/513 (5.6%)                51.7%       100%     0.16s
S.C.     46/513 (9.0%)               76/513 (15%)                 60.5%       100%     0.22s
S.A.K.   23/513 (4.5%)               56/513 (11%)                 41.1%       100%     0.05s
Overall                                                           60.5%       100%     0.72s

TABLE III. Precision and Recall in XERCES v2.7.0, which contains 513 classes. (F.D. = Functional Decomposition, S.C. = Spaghetti Code, and S.A.K. = Swiss Army Knife).

The recalls of our detection algorithms are 100% for each design smell. We specified the detection rules to obtain a perfect recall and assess its impact on precision. Precision is between 41.1% and close to 90% (with an overall precision of 60.5%), providing between 5.6% and 15% of the total number of classes, which is reasonable to analyse manually, compared with analysing the entire system of 513 classes. These results also provide a basis for comparison with other approaches.

2) Running Example:
We found 76 suspicious classes for the detection of the Spaghetti Code design smell in XERCES v2.7.0. Out of these 76 suspicious classes, 46 are indeed Spaghetti Code previously identified in XERCES manually by engineers independent of the authors, which leads to a precision of 60.5% and a recall of 100% (see third line in Table III).
The result file contains all suspicious classes, including class org.apache.xerces.xinclude.XIncludeHandler declaring 112 methods. Among these 112 methods, method handleIncludeElement(XMLAttributes) is typical of Spaghetti Code, because it does not use inheritance and polymorphism but uses global variables excessively. Moreover, this method weighs 759 LOC, while the upper method length computed using the box-plot is 254.5 LOC. The result file is illustrated below:

    1.Name = SpaghettiCode
    1.Class = org.apache.xerces.xinclude.XIncludeHandler
    1.NoInheritance.DIT-0 = 1.0
    1.LongMethod.Name = handleIncludeElement(XMLAttributes)
    1.LongMethod.LOC_METHOD = 759.0
    1.LongMethod.LOC_METHOD_Max = 254.5
    1.GlobalVariable-0 = SYMBOL_TABLE
    1.GlobalVariable-1 = ERROR_REPORTER
    1.GlobalVariable-2 = ENTITY_RESOLVER
    1.GlobalVariable-3 = BUFFER_SIZE
    1.GlobalVariable-4 = PARSER_SETTINGS

    2.Name = SpaghettiCode
    2.Class = org.apache.xerces.impl.xpath.regex.RegularExpression
    2.NoInheritance.DIT-0 = 1.0
    2.LongMethod.Name = matchCharArray(Context,Op,int,int,int)
    2.LongMethod.LOC_METHOD = 1246.0
    2.LongMethod.LOC_METHOD_Max = 254.5
    2.GlobalVariable-0 = WT_OTHER
    2.GlobalVariable-1 = WT_IGNORE
    2.GlobalVariable-2 = EXTENDED_COMMENT
    2.GlobalVariable-3 = CARRIAGE_RETURN
    2.GlobalVariable-4 = IGNORE_CASE
    ...

Another example is class org.apache.xerces.impl.xpath.regex.RegularExpression declaring method matchCharArray(Context,Op,int,int,int) with a size of 1,246 LOC. Looking at the code, we see that this method contains a switch statement and duplicated code for 20 different operators (such as =, <, >, [a-z] ...) while class org.apache.xerces.impl.xpath.regex.Op actually has subclasses for most of these operators. This method could have been implemented in a more object-oriented style by dispatching the matching operator to Op subclasses to split the large method into smaller ones in the subclasses. However, such a design would introduce polymorphic calls into the method traversing all characters of an array. Therefore, XERCES designers may not have opted for such a design, to optimize performance at the cost of maintainability.
The 46 Spaghetti Code represent true positives and include "bad" Spaghetti Code such as method handleIncludeElement but also "good" Spaghetti Code such as method matchCharArray. The "good" smells were not rejected because they could represent weak spots in terms of quality and maintenance. Other examples of typical Spaghetti Code detected and checked as true positives are classes generated automatically by parser generators. The 30 other suspicious classes were rejected by the independent engineers and are false positives. Even if these classes verified the characteristics of Spaghetti Code, most of them were easy to understand, and thus, were considered false positives. Thus, it would be necessary to add other rules or modify the existing ones to narrow the set of candidate classes, for example, by detecting nested if statements and loops, characterising complex code.

3) Results on Other Systems:
Figure 9 provides for the nine other systems plus XERCES v2.7.0 the numbers of suspicious classes in the first line of each row; the numbers of true design smells in the second line; the precisions in the third; and the computation times in the fourth. We only report precisions: recalls on other systems than XERCES are future work due to the required time-consuming manual analyses. We have also performed all computations on an Intel Dual Core at 1.67GHz with 1Gb of RAM.

4) Illustrations of the Results:
We briefly present examples of the four design smells. In XERCES, method handleIncludeElement(XMLAttributes) of the org.apache.xerces.xinclude.XIncludeHandler class is a typical example of Spaghetti Code. A good example of Blob is class com.aelitis.azureus.core.dht.control.impl.DHTControlImpl in AZUREUS. This class declares 54 fields and 80 methods for 2,965 lines of code. An interesting example of Functional Decomposition is class org.argouml.uml.cognitive.critics.Init in ARGOUML, in particular because the name of the class includes a suspicious term, init, that suggests functional programming. Class org.apache.xerces.impl.dtd.DTDGrammar is a striking example of Swiss Army Knife in XERCES, implementing four different sets of services with 71 fields and 93 methods for 1,146 lines of code.

5) Results on ECLIPSE for the Scalability:
We also apply our detection algorithms on ECLIPSE to demonstrate their scalability. ECLIPSE v3.1.2 weighs 2,538,774 lines of code for 9,099 classes and 1,850 interfaces. It is one order of magnitude larger than the largest of the open-source systems, AZUREUS. The detection of the four design smells in ECLIPSE requires more time and produces more results. We detect 848, 608, 436, and 520 suspicious classes for the Blob, Functional Decomposition, Spaghetti Code, and Swiss Army Knife design smells, respectively. The detections take about 1h20m for each smell, with another hour to build the model. The use of the detection algorithms on ECLIPSE shows the scalability of our implementation. It also highlights the balance between numbers of suspicious classes and precisions. Indeed, if the choice is to maximise recall, the number of suspicious classes may be high, even more so in large systems, and thus precision will be low. Conversely, if the choice is to minimise the number of suspicious classes, precision will be high but recall may be low. In addition, it shows the importance of specifying smells in the context of the system in which they are detected. Indeed, the large number of suspicious classes for Blob in ECLIPSE, about 1/10th of the overall number of classes, may come from design and implementation choices and constraints within the ECLIPSE community and thus, the smell specifications should be adapted to consider these choices. With our method and detection technique, engineers can easily re-specify smells to fit their context and environment and get greater precision.

F. Discussion of the Results
We verify each of the three assumptions using the results of the validation of DETEX.
1) The DSL allows the specification of many different smells. We described four different design smells of inter- and intra-class categories and of the structural, lexical, and measurable categories, as shown in Figure 3. These four smells are characterised by 15 code smells also belonging to 6 different categories, shown in Figure 2. Thus, we showed that we can describe many different smells, which supports the efficiency of our detection technique and the generality of its DSL.
2) The generated detection algorithms have a recall of 100% and a precision greater than 50%. Table III shows that the precision and recall for XERCES v2.7.0 fulfill our assumptions with a precision of 60.5% and a recall of 100%. Figure 9 presents the precisions for the other nine systems, which almost all comply with our assumption, with a precision greater than 50% (except for two systems), thus validating the usefulness of our detection technique.
3) The complexity of the generated algorithms is reasonable, i.e., computation times are in the order of one minute. Computation times are in general less than a few seconds (except for ECLIPSE which took about 1 hour) because the complexity of the detection algorithms depends only on the number of classes in a system, n, and on the number of properties to verify on each class: (c + op) × O(n), where c is the number of properties and op the number of operators.
The computation times of the design smells vary with the smells and the systems. During validation, we noticed that building the models of the systems took up most of the computation times, while the detection algorithms have
Smell   Line                ARGOUML        AZUREUS        GANTTPROJECT   LOG4J          LUCENE         NUTCH          PMD            QUICKUML       XERCES v1.0.1  XERCES v2.7.0
Blob    Suspicious classes  29 (2.4%)      41 (2.8%)      10 (5.3%)      3 (1.6%)       3 (1.9%)       6 (2.9%)       4 (0.9%)       0 (0%)         10 (5.3%)      44 (8.6%)
        True smells         25 (2.0%)      38 (2.6%)      9 (4.8%)       3 (1.6%)       2 (1.3%)       4 (1.9%)       4 (0.9%)       0 (0%)         10 (5.3%)      39 (7.6%)
        Precision           86.2%          92.7%          90.0%          100%           66.7%          66.7%          100%           100%           100%           88.6%
        Time                3.0s           6.4s           2.4s           1.3s           1.8s           3.6s           3.9s           0.4s           2.7s           2.4s
F.D.    Suspicious classes  37 (3.0%)      44 (3.0%)      15 (8.0%)      11 (5.8%)      1 (0.6%)       15 (7.2%)      13 (3.1%)      10 (7.0%)      4 (2.1%)       29 (5.6%)
        True smells         22 (1.8%)      17 (1.2%)      4 (2.1%)       6 (3.2%)       0 (0%)         3 (1.4%)       4 (0.9%)       3 (2.1%)       4 (2.1%)       15 (2.9%)
        Precision           59.5%          38.6%          26.7%          54.5%          0%             20.0%          30.8%          30.0%          100%           51.7%
        Time                0.4s           0.5s           0.8s           0.05s          0.03s          0.05s          0.06s          0.02s          0.03s          0.16s
S.C.    Suspicious classes  44 (3.6%)      153 (15.6%)    14 (7.4%)      3 (1.6%)       8 (5.2%)       26 (12.6%)     9 (2.1%)       5 (3.5%)       25 (13.2%)     76 (14.8%)
        True smells         38 (3.1%)      125 (8.6%)     10 (5.3%)      2 (1.1%)       6 (3.9%)       22 (10.6%)     5 (1.2%)       0 (0%)         23 (12.2%)     46 (9.0%)
        Precision           86.4%          81.7%          71.4%          66.7%          75.0%          84.6%          55.6%          0%             92.0%          60.5%
        Time                0.3s           2.9s           0.2s           0.08s          0.09s          0.1s           0.06s          0.03s          0.11s          0.2s
S.A.K.  Suspicious classes  108 (8.8%)     145 (10.0%)    8 (4.2%)       51 (27.0%)     9 (5.8%)       33 (15.9%)     13 (3.1%)      6 (4.2%)       12 (6.3%)      56 (10.9%)
        True smells         18 (1.5%)      33 (2.3%)      3 (1.6%)       33 (17.5%)     1 (0.6%)       13 (6.3%)      6 (1.4%)       1 (0.7%)       5 (2.6%)       23 (4.5%)
        Precision           16.6%          22.7%          37.5%          64.7%          11.1%          39.4%          46.1%          16.7%          41.7%          41.1%
        Time                0.3s           0.13s          0.05s          0.02s          0.02s          0.02s          0.02s          0.02s          0.03s          0.05s
Average precision           62.2%          58.9%          56.4%          71.5%          38.2%          52.7%          58.1%          36.7%          83.4%          60.5%

Fig. 9. Results of Applying the Detection Algorithms. (In each row, the first line is the number of suspicious classes, the second line is the number of classes being design smells, the third line is the precision, and the fourth line shows the computation time. Numbers in parentheses are the percentages of the classes being reported. The last row corresponds to the average precision per system. F.D. = Functional Decomposition, S.C. = Spaghetti Code, and S.A.K. = Swiss Army Knife.)
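
As a worked check of the precisions and of the average precision reported in Fig. 9, the following minimal Java sketch recomputes the XERCES v2.7.0 column from the counts of suspicious classes and true smells given in the figure. The class and method names (PrecisionSketch, precision, recall, average) are illustrative assumptions and are not part of DETEX or SMELLFW. Reporting 100% when no class is detected is also an assumption, inferred from the QUICKUML entry for Blob (0 suspicious classes, 100% precision).

```java
import java.util.Locale;

/** Hypothetical helper, not part of DETEX: recomputes figures reported in Fig. 9. */
public class PrecisionSketch {

    /** Precision in percent: true smells among the detected (suspicious) classes. */
    public static double precision(int trueSmells, int detected) {
        // Assumption: with no detected class there is no false positive, so report 100%.
        return detected == 0 ? 100.0 : 100.0 * trueSmells / detected;
    }

    /** Recall in percent: detected true smells among the existing smells. */
    public static double recall(int detectedTrueSmells, int existingSmells) {
        // Assumption: with no existing smell nothing can be missed, so report 100%.
        return existingSmells == 0 ? 100.0 : 100.0 * detectedTrueSmells / existingSmells;
    }

    /** Arithmetic mean, used for the average precision per system. */
    public static double average(double... values) {
        if (values.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        // XERCES v2.7.0 column of Fig. 9.
        String[] smells = {"Blob", "F.D.", "S.C.", "S.A.K."};
        int[] trueSmells = {39, 15, 46, 23};
        int[] suspicious = {44, 29, 76, 56};

        double[] precisions = new double[smells.length];
        for (int i = 0; i < smells.length; i++) {
            precisions[i] = precision(trueSmells[i], suspicious[i]);
            System.out.println(String.format(Locale.ROOT,
                    "%-6s precision = %.1f%%", smells[i], precisions[i]));
        }
        System.out.println(String.format(Locale.ROOT,
                "Average precision = %.1f%%", average(precisions)));
    }
}
```

Run as is, the sketch prints 88.6%, 51.7%, 60.5%, and 41.1% for the four smells and an average of 60.5%, which matches the last column of Fig. 9. The recall method is included for completeness: with the 46 Spaghetti Code classes identified manually in XERCES v2.7.0 all reported, recall(46, 46) gives 100%.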

short execution times, which explains the minor differences between each system, in the same line in Figure 9, and the differences between each design smell, in different columns. The computation times for PADL models are not surprising because the models contain extensive data, including binary class relationships [54] and accessors.
The precisions also vary in relation to the design smells and the systems, as shown in Figure 9. First, the systems have been developed in different contexts and may have unequal quality. Systems such as AZUREUS or XERCES may be of lesser quality than LUCENE or QUICKUML, thus leading to greater numbers of suspicious classes that are actually smells. However, the low number of smells detected in LUCENE and QUICKUML leads to a low precision. For example, only one Functional Decomposition was detected in LUCENE, but it was a false positive, thus leading to a precision of 0% and an average precision of 38.2%. Second, the smell specifications can be over- or under-constraining. For example, the rule cards of the Blob and Spaghetti Code specify the smells strictly using metrics and structural relationships, leading to a low number of suspicious classes and high precisions. The rule cards of the Functional Decomposition and Swiss Army Knife specify these smells loosely using lexical data, leading to lower precisions. Thus, the specifications must be neither too loose, so as not to detect too many suspicious classes, nor too restrictive, so as not to miss smells. With DETEX, engineers can refine the specifications systematically, according to the detected suspicious classes and their knowledge of the systems. The choice of metrics and thresholds is left to the domain experts to take into account the context and characteristics of the analysed systems.
The number of false positives appears quite high; however, we obtained many false positives because our objective was 100% recall for all systems. Using DETEX and its DSL, the rules can be refined systematically and easily to fit the specific contexts of the analysed systems and thus to increase precisions if desired, possibly at the expense of recall. Thus, the number of false positives will be low and engineers will not spend time checking a vast amount of false results. As future work, we propose to sort the results in critical order, i.e., according to the classes that are the most likely to be smells, to help engineers in assessing the results. The numbers of suspicious classes obtained are usually orders of magnitude lower than the overall number of classes in a system; thus, the detection technique indeed eases engineers' code inspection.
We also indirectly validated the usefulness of DECOR by validating DETEX. Indeed, DECOR is the method of which one instantiation is DETEX. Therefore, the validation of DETEX showed that the DECOR method provides the necessary steps from which to derive a valid detection technique. As a metaphor, we could assimilate DECOR to a class and DETEX to one of its instances that has been successfully tested, thus showing the soundness of its class.

G. Threats to Validity
Internal Validity: The obtained results depend on the services provided by the SMELLFW framework. Our current implementation allows the detection of classes that strictly conform to the rule cards and we only handle a degree of fuzziness in measurable properties. This choice of implementation does not limit DETEX intrinsically because it could accommodate other implementations of its underlying detection framework. The results also depend on the specifications of the design smells. Thus, we used for the experiments a representative set of smells so as not to influence the results.
External Validity: One threat to the validity of the validation is the exclusive use of open-source JAVA systems. The open-source development process may bias the number of design smells, especially in the case of mature systems such as PMD v1.8 or XERCES v2.7.0. Also, using JAVA may impact design and implementation choices and thus the presence of smells. However, we applied our algorithms on systems of various
sizes and qualities to preclude the possibility for all systems to be either well or badly implemented. Moreover, we performed a validation on open-source systems to allow comparisons and replications. We are in contact with software companies to replicate this validation on their proprietary systems.
Construct Validity: The subjective nature of identifying or specifying smells and assessing suspicious classes as smells is a threat to construct validity. Indeed, our understanding of smells may differ from that of other engineers. We lessen this threat by specifying smells based on general literature and drawing inspiration from previous work. We also asked the engineers in charge of computing precision and recall to do so. Moreover, we contacted developers involved in each of the analysed systems to validate our results and to improve our smell specifications. So far, we have received few answers but enthusiastic interest. Engineers analysed independently our results for LOG4J, LUCENE, PMD, and QUICKUML, and confirmed the results in Figure 9. We thank M. Adamovic, C. Alphonce, D. Cutting, T. Copeland, P. Gardner, E. Ross, and Y. Shapira for their kind help. We are in the process of increasing the size of our library of smells thanks to their support. We believe it is important to report the detection results to the communities developing the systems.
Repeatability/Reliability Validity: The results of the validation are repeatable and reliable because we use open-source programs that can be freely downloaded from the Internet. Also, our implementation is available upon request while all its results are on the companion Web site [57].

VI. CONCLUSION AND FUTURE WORK
The detection of smells is important to improve the quality of software systems, to facilitate their evolution, and thus to reduce the overall cost of their development and maintenance. We proposed the following improvements to previous work. First, we introduced DECOR, a method that embodies all the steps necessary to define detection techniques. Second, we cast our detection technique, now called DETEX, in the context of the DECOR method. DETEX now plays the role of reference instantiation of our method. It is supported by a DSL for specifying smells using high-level abstractions, taking into account the context of the analysed systems, and resulting from a thorough domain analysis of the text-based descriptions of the smells. Third, we applied DETEX on four design smells and their 15 underlying code smells and discussed its usefulness, precision, and recall. This is the first such extensive validation of a smell detection technique.
Our detection technique and the inputs, outputs, processes, and implementations defined in each step can be generalised to other smells. Also, it can be implemented using other techniques as long as they provide relevant data for the considered steps. We have not compared our implementation with other approaches but will do so in future work.
Future work includes using the WORDNET dictionary; using existing tools to improve the implementation of our method; improving the quality and performance of the source code of the generated detection algorithms; computing the recall on other systems; applying our detection technique to other kinds of smells; and comparing our method quantitatively with previous work. With respect to the last item, we are currently conducting a study on smell detection tools, including RevJava, FindBugs, PMD, Hammurapi, and Lint4j, to compare our detection technique against existing tools. A first comparison is available in the related work.

Acknowledgments: We are grateful to G. Antoniol, K. Mens, and D. Thomas for their comments on earlier versions of this paper. We thank M. Amine El Haimer and N. Tajeddine for applying the method and detection technique on several smells. We thank D. Huynh and P. Leduc for their help with the implementation of parts of the SMELLFW framework. Finally, we express our gratitude to the developers who confirmed our findings in the open-source systems.
Y.-G. Guéhéneuc was partially supported by an NSERC Discovery Grant. N. Moha was supported by the Université de Montréal and the FQRNT (Fonds québécois de la recherche sur la nature et les technologies), a funding agency of the Gouvernement du Québec.

REFERENCES
[1] E. Gamma, R. Helm, R. Johnson, and J. Vlissides, Design Patterns – Elements of Reusable Object-Oriented Software, 1st ed. Addison-Wesley, 1994.
[2] M. Fowler, Refactoring – Improving the Design of Existing Code, 1st ed. Addison-Wesley, June 1999.
[3] W. J. Brown, R. C. Malveau, W. H. Brown, H. W. McCormick III, and T. J. Mowbray, Anti Patterns: Refactoring Software, Architectures, and Projects in Crisis, 1st ed. John Wiley and Sons, March 1998.
[4] R. S. Pressman, Software Engineering – A Practitioner's Approach, 5th ed. McGraw-Hill Higher Education, November 2001.
[5] G. Travassos, F. Shull, M. Fredericks, and V. R. Basili, "Detecting defects in object-oriented designs: using reading techniques to increase software quality," in Proceedings of the 14th Conference on Object-Oriented Programming, Systems, Languages, and Applications. ACM Press, 1999, pp. 47–56.
[6] N. Moha, Y.-G. Guéhéneuc, and P. Leduc, "Automatic generation of detection algorithms for design defects," in Proceedings of the 21st Conference on Automated Software Engineering, S. Uchitel and S. Easterbrook, Eds. IEEE Computer Society Press, September 2006, pp. 297–300, short paper.
[7] N. Moha, Y.-G. Guéhéneuc, A.-F. L. Meur, and L. Duchien, "A domain analysis to specify design defects and generate detection algorithms," in Proceedings of the 11th international conference on Fundamental Approaches to Software Engineering, J. Fiadeiro and P. Inverardi, Eds. Springer-Verlag, March-April 2008.
[8] B. V. Rompaey, B. D. Bois, S. Demeyer, and M. Rieger, "On the detection of test smells: A metrics-based approach for general fixture and eager test," IEEE Transactions on Software Engineering, vol. 33, no. 12, pp. 800–817, 2007.
[9] G. Bruno, P. Garza, E. Quintarelli, and R. Rossato, "Anomaly detection in xml databases by means of association rules," in DEXA '07: Proceedings of the 18th International Conference on Database and Expert Systems Applications. Washington, DC, USA: IEEE Computer Society, 2007, pp. 387–391.
[10] S. Jorwekar, A. Fekete, K. Ramamritham, and S. Sudarshan, "Automating the detection of snapshot isolation anomalies," in VLDB '07: Proceedings of the 33rd international conference on Very large data bases. VLDB Endowment, 2007, pp. 1263–1274.
[11] A. Patcha and J.-M. Park, "An overview of anomaly detection techniques: Existing solutions and latest technological trends," Comput. Netw., vol. 51, no. 12, pp. 3448–3470, 2007.
[12] Y.-G. Guéhéneuc and G. Antoniol, "DeMIMA: A multi-layered framework for design pattern identification," Transactions on Software Engineering, vol. 34, no. 5, pp. 667–684, September 2008.
[13] B. F. Webster, Pitfalls of Object Oriented Development, 1st ed. M & T Books, February 1995.
[14] A. J. Riel, Object-Oriented Design Heuristics. Addison-Wesley, 1996. [43] R. Prieto-Dı́az, “Domain analysis: An introduction,” Software Engineer-
[15] M. Mantyla, “Bad smells in software - a taxonomy and an empirical ing Notes, vol. 15, no. 2, pp. 47–54, April 1990.
study.” Ph.D. dissertation, Helsinki University of Technology, 2003. [44] R. Wirfs-Brock and A. McKean, Object Design: Roles, Responsibilities
[16] W. C. Wake, Refactoring Workbook. Boston, MA, USA: Addison- and Collaborations. Addison-Wesley Professional, 2002.
Wesley Longman Publishing Co., Inc., 2003. [45] Y.-G. Guéhéneuc and H. Albin-Amiot, “Using design patterns and
[17] R. Marinescu, “Detection strategies: Metrics-based rules for detecting constraints to automate the detection and correction of inter-class design
design flaws,” in Proceedings of the 20th International Conference on defects,” in Proceedings of the 39th Conference on the Technology of
Software Maintenance. IEEE Computer Society Press, 2004, pp. 350– Object-Oriented Languages and Systems, Q. Li, R. Riehle, G. Pour, and
359. B. Meyer, Eds. IEEE Computer Society Press, July 2001, pp. 296–305.
[18] M. J. Munro, “Product metrics for automatic identification of “bad [46] S. Boroday, A. Petrenko, J. Singh, and H. Hallal, “Dynamic analysis of
smell” design problems in java source-code,” in Proceedings of the 11th Java applications for multithreaded antipatterns,” in Proceedings of the
International Software Metrics Symposium, F. Lanubile and C. Seaman, 3rd International Workshop On Dynamic Analysis. New York, NY,
Eds. IEEE Computer Society Press, September 2005. USA: ACM Press, 2005, pp. 1–7.
[19] E. H. Alikacem and H. Sahraoui, “Generic metric extraction framework,” [47] B. Dudney, S. Asbury, J. Krozak, and K. Wittkopf, J2EE AntiPatterns.
in Proceedings of the 16th International Workshop on Software Mea- Wiley, 2003.
surement and Metrik Kongress (IWSM/MetriKon), 2006, pp. 383–390. [48] B. A. Tate and B. R. Flowers, Bitter Java. Manning Publications, 2002.
[20] K. Dhambri, H. Sahraoui, and P. Poulin, “Visual detection of de- [49] C. U. Smith and L. G. Williams, Performance Solutions: A Practical
sign anomalies.” in Proceedings of the 12th European Conference on Guide to Creating Responsive, Scalable Software. Boston, MA, USA:
Software Maintenance and Reengineering, Tampere, Finland. IEEE Addison-Wesley Professional, 2002.
Computer Society, April 2008, pp. 279–283. [50] Janice Ka-Yee Ng and Y.-G. Guéhéneuc, “Identification of behavioral
[21] F. Simon, F. Steinbrückner, and C. Lewerentz, “Metrics based refac- and creational design patterns through dynamic analysis,” in Proceedings
toring,” in Proceedings of the Fifth European Conference on Software of the 3rd International Workshop on Program Comprehension through
Maintenance and Reengineering (CSMR’01). Washington, DC, USA: Dynamic Analysis, A. Zaidman, A. Hamou-Lhadj, and O. Greevy, Eds.
IEEE Computer Society, 2001, p. 30. Delft University of Technology, October 2007, pp. 34–42, tUD-SERG-
[22] G. Langelier, H. A. Sahraoui, and P. Poulin, “Visualization-based 2007-022.
analysis of quality for large-scale software systems,” in Proceedings of [51] J. M. Chambers, W. S. Clevelmd, B. Kleiner, and P. A. Tukey, Graphical
the 20th International Conference on Automated Software Engineering, methods for data analysis. Wadsworth International, 1983.
T. Ellman and A. Zisma, Eds. ACM Press, November 2005. [52] R. Marinescu, “Measurement and quality in object-oriented design,”
[23] M. Lanza and R. Marinescu, Object-Oriented Metrics in Practice. Ph.D. dissertation, Politehnica University of Timisoara, June 2002.
Springer-Verlag, 2006. [53] S. R. Chidamber and C. F. Kemerer, “A metrics suite for object oriented
[24] E. van Emden and L. Moonen, “Java quality assurance by detecting design,” IEEE Transactions on Software Engineering, vol. 20, no. 6, pp.
code smells,” in Proceedings of the 9th Working Conference on Reverse 476–493, 1994.
Engineering (WCRE’02). IEEE Computer Society Press, Oct. 2002. [54] Y.-G. Guéhéneuc and H. Albin-Amiot, “Recovering binary class rela-
[25] D. Garlan, R. Allen, and J. Ockerbloom, “Architectural mismatch: Why tionships: Putting icing on the UML cake,” in Proceedings of the 19th
reuse is so hard,” IEEE Software, vol. 12, no. 6, pp. 17–26, 1995. Conference on Object-Oriented Programming, Systems, Languages, and
[26] R. Allen and D. Garlan, “A formal basis for architectural connection,” Applications, D. C. Schmidt, Ed. ACM Press, October 2004, pp. 301–
ACM Transactions on Software Engineering and Methodology, vol. 6, 314.
no. 3, pp. 213–249, 1997. [55] C. Consel and R. Marlet, “Architecturing software using: A methodology
for language development,” Lecture Notes in Computer Science, vol.
[27] E. M. Dashofy, A. van der Hoek, and R. N. Taylor, “A comprehen-
1490, pp. 170–194, September 1998.
sive approach for the development of modular software architecture
[56] R. Wuyts, “Declarative reasoning about the structure of object-oriented
description languages,” ACM Transactions on Software Engineering and
systems,” in Proceedings of the 26th Conference on the Technology of
Methodology, vol. 14, no. 2, pp. 199–245, 2005.
Object-Oriented Languages and Systems, J. Gil, Ed. IEEE Computer
[28] D. Jackson, “Aspect: detecting bugs with abstract dependences,” ACM
Society Press, August 1998, pp. 112–124.
Transactions on Software Engineering and Methodology, vol. 4, no. 2,
[57] DECOR, September 2006, http://ptidej.dyndns.org/research/ptidej/
pp. 109–145, 1995.
decor/.
[29] D. Evans, “Static detection of dynamic memory errors.” in Proceedings
[58] G. Kiczales, J. des Rivières, and D. G. Bobrow, The Art of the
of the Conference on Programming Language Design and Implementa-
Metaobject Protocol, 1st ed. MIT Press, July 1991.
tion. New York, NY, USA: ACM Press, 1996, pp. 44–53.
[59] M. Mernik, J. Heering, and A. M. Sloane, “When and how to develop
[30] D. L. Detlefs, “An overview of the extended static checking system,” in domain-specific languages,” ACM Computing Surveys, vol. 37, no. 4,
Proceedings of the First Formal Methods in Software Practice Workshop pp. 316–344, December 2005.
(1996), 1996. [60] Y.-G. Guéhéneuc, H. Sahraoui, and Farouk Zaidi, “Fingerprinting design
[31] J. Brant, “Smalllint,” April 1997, http://st-www.cs.uiuc.edu/users/brant/ patterns,” in Proceedings of the 11th Working Conference on Reverse
Refactory/Lint.html. Engineering, E. Stroulia and A. de Lucia, Eds. IEEE Computer Society
[32] D. Hovemeyer and W. Pugh, “Finding bugs is easy,” SIGPLAN Not., Press, November 2004, pp. 172–181.
vol. 39, no. 12, pp. 92–106, 2004. [61] H. Albin-Amiot, P. Cointe, and Y.-G. Guéhéneuc, “Un méta-modèle
[33] D. Reimer, E. Schonberg, K. Srinivas, H. Srinivasan, B. Alpern, R. D. pour coupler application et détection des design patterns,” in Actes du
Johnson, A. Kershenbaum, and L. Koved, “Saber: smart analysis based 8e colloque Langages et Modèles à Objets, ser. RSTI – L’objet, M. Dao
error reduction,” in ISSTA ’04: Proceedings of the 2004 ACM SIGSOFT and M. Huchard, Eds., vol. 8, numéro 1-2/2002. Hermès Science
International Symposium on Software Testing and Analysis. New York, Publications, janvier 2002, pp. 41–58.
NY, USA: ACM Press, 2004, pp. 243–251. [62] S. Demeyer, S. Tichelaar, and S. Ducasse, “FAMIX 2.1 – the FAMOOS
[34] Analyst4j, February 2008, http://www.codeswat.com/. information exchange model,” University of Bern, Tech. Rep., 2001.
[35] PMD, June 2002, http://pmd.sourceforge.net/. [63] A. Winter, B. Kullbach, and V. Riediger, “An overview of the gxl graph
[36] CheckStyle, 2004, http://checkstyle.sourceforge.net. exchange language.” in Software Visualization, ser. Lecture Notes in
[37] FXCop, June 2006, http://www.binarycoder.net/fxcop/index.html. Computer Science, S. Diehl, Ed., vol. 2269. Springer, 2002, pp. 324–
[38] Hammurapi, October 2007, http://www.hammurapi.biz/. 336.
[39] SemmleCode, October 2007, http://semmle.com/. [64] G. C. Murphy and D. Notkin, “Lightweight lexical source model
[40] D. Beyer, A. Noack, and C. Lewerentz, “Efficient relational calculation extraction,” ACM Trans. Softw. Eng. Methodol., vol. 5, no. 3, pp. 262–
for software analysis,” Transactions on Software Engineering, vol. 31, 292, 1996.
no. 2, pp. 137–149, February 2005. [65] H. A. Muller, J. H. Jahnke, D. B. Smith, M.-A. D. Storey, S. R. Tilley,
[41] D. Beyer, T. A. Henzinger, R. Jhala, and R. Majumdar, “The software and K. Wong, “Reverse engineering: a roadmap,” in ICSE — Future of
model checker blast: Applications to software engineering.” Int. Journal SE Track, 2000, pp. 47–60.
on Software Tools for Technology Transfer, vol. 9, pp. 505–525, 2007, [66] W. B. Frakes and R. A. Baeza-Yates, Information Retrieval: Data
invited to special issue of selected papers from FASE 2005. Structures and Algorithms. Prentice-Hall, 1992.
[42] H. Chen and D. Wagner, “Mops: an infrastructure for examining security
properties of software.” in Proceedings of the 9th ACM Conference on
Computer and Communications Security (CCS), 2002, pp. 235–244.


Naouel Moha received the Master’s degree in computer science from Joseph Fourier University, Grenoble, in 2002. She also received the Ph.D. degree, in 2008, from the University of Montreal (under Professor Yann-Gaël Guéhéneuc’s supervision) and the University of Lille (under the supervision of Professor Laurence Duchien and Anne-Françoise Le Meur). The primary focus of her Ph.D. thesis was to define an approach that allows the automatic detection and correction of design smells, which are poor design choices, in object-oriented programs. She is currently a postdoctoral researcher in the INRIA team-project Triskell. Her research interests include software quality and evolution, in particular refactoring and the identification of patterns.

Anne-Françoise Le Meur received her Master of Science in Computer Science from the Oregon Graduate Institute in Portland, Oregon, USA, in 1999 and her Ph.D. in Computer Science from the University of Rennes 1 in 2002. After a one-year postdoc at DIKU, University of Copenhagen, Denmark, she obtained in 2004 an associate professor position at the University of Lille 1 and joined the INRIA team-project ADAM. She has worked on program specialization and on the design and development of domain-specific languages. Her current work focuses mainly on the application of programming-language techniques to the problem of software component-based architecture conception and evolution.

Yann-Gaël Guéhéneuc is an associate professor at the Department of Computing and Software Engineering of École Polytechnique de Montréal, where he leads the Ptidej team on evaluating and enhancing the quality of object-oriented programs by promoting the use of patterns at the language, design, or architectural levels. In 2009, he was awarded the NSERC Research Chair Tier II on Software Patterns and Patterns of Software. He received a Ph.D. in software engineering from the University of Nantes, France (under Professor Pierre Cointe’s supervision) in 2003 and an Engineering Diploma from École des Mines of Nantes in 1998. His Ph.D. thesis was funded by Object Technology International, Inc. (now IBM OTI Labs.), where he worked in 1999 and 2000. His research interests are program understanding and program quality during development and maintenance, in particular through the use and the identification of recurring patterns. He was the first to use explanation-based constraint programming in the context of software engineering to identify occurrences of patterns. He is also interested in empirical software engineering; he uses eye-trackers to understand and to develop theories about program comprehension. He has published many papers in international conferences and journals.

Laurence Duchien obtained her Ph.D. degree from University Paris 6, LIP6 laboratory, in 1988, where she worked on protocols for distributed applications. She then joined the Computer Science Department at CNAM (Conservatoire National des Arts et Métiers) (http://www.cnam.fr), Paris, France, as an associate professor in September 1990. She also received a Research Direction Habilitation in Computer Science from Joseph Fourier University, Grenoble, France, in 1999. She has been a full professor in the Computer Science Department at the University of Lille, France, since 2001, and she is the head of the INRIA-USTL-CNRS team-project ADAM (Adaptive Distributed Applications and Middleware) (http://adam.lille.inria.fr). Her current research interests include development techniques for component-based and service-oriented distributed applications in ambient computing. She works on the different steps of the development life cycle, such as architecture modeling, model composition and transformation, and, finally, software evolution.
