Topic G - Program Refactoring and Comprehension

Uploaded by

Saahil Karnik
Copyright
© All Rights Reserved

Topic G

Program Refactoring and Comprehension


What is Refactoring?

“Refactoring is the process of changing a software system in such a way that it does not alter the external behaviour of the code yet improves its internal structure.”

Omar Badreddin 2
What is Refactoring? (cont.)

Each refactoring is a series of steps

Each step is simple and small
• Minimises introduction of new bugs

Unit tests should be used to verify that changes have no effect on behaviour

Behavioural changes to a program are not refactorings

Why Refactor?
It improves design so that maintenance is easier
• Makes source code and models easier to understand
  - Models are easier to understand by designers
  - Code is more usable by programmers
• Reduced chaos in long-lived systems
• Faster and easier maintenance in the long term

Thinking about specific ways to refactor helps us formalize and name best practices

Opportunities for Refactoring
During initial design and programming

When adding functionality

When fixing a bug

During a code review
• E.g. in pair programming

Specific Refactorings to be Discussed in the Following Slides
Many refactorings make use of other simpler refactorings

Some are opposites (inverses)
• Use the one that makes your code easier to understand

Some can be applied straightforwardly whereas others require deeper analysis

Using a tool to help in refactoring is useful
• Many require finding code to change elsewhere, e.g. code that has to change to be consistent with the refactoring.
• Sophisticated and expensive commercial tools are available.

Refactoring: Extract Method
Probably the most used refactoring
• You may have done this many times yourself

Identify code that should be put into a separate method that you will then call
• Code to reuse
  - Immediately, or potentially
• Code that is making a method too long
• Code that is difficult to understand
  - The new method’s name helps explain the function

You may also apply a Move Method refactoring to move the method to a different class

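A minimal Java sketch of Extract Method. The Invoice class and its method names are hypothetical, chosen only for illustration. The totalling loop is pulled out of the summary method into a separately named method, which leaves the behaviour unchanged.

```java
import java.util.List;

// Hypothetical example class; the names are illustrative, not from the slides.
class Invoice {
    private final List<Double> amounts;

    Invoice(List<Double> amounts) {
        this.amounts = amounts;
    }

    // Before: the totalling loop sits inline and obscures the intent.
    String summaryBefore() {
        double total = 0.0;
        for (double amount : amounts) {
            total += amount;
        }
        return "Total: " + total;
    }

    // After: the loop is extracted and the call reads like a sentence.
    String summaryAfter() {
        return "Total: " + total();
    }

    // Extracted method: its name explains the code and it can now be reused.
    double total() {
        double total = 0.0;
        for (double amount : amounts) {
            total += amount;
        }
        return total;
    }
}
```

Both summary methods return the same string for the same data, which is the property a unit test would check before and after the refactoring.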
Inverse Refactoring: Inline method

Much less common than Extract Method, but still useful sometimes
• For efficiency
• If the inlined method is only called once, and overall complexity would be less after the inlining

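As a rough illustration of Inline Method, the sketch below assumes a hypothetical Driver class with a one-line helper that has a single caller. Inlining the helper removes one level of indirection.

```java
// Hypothetical example class; the names are illustrative, not from the slides.
class Driver {
    private final int lateDeliveries;

    Driver(int lateDeliveries) {
        this.lateDeliveries = lateDeliveries;
    }

    // Before: the rating delegates to a trivial helper used nowhere else.
    int ratingBefore() {
        return moreThanFiveLateDeliveries() ? 2 : 1;
    }

    private boolean moreThanFiveLateDeliveries() {
        return lateDeliveries > 5;
    }

    // After: the helper body is inlined at its only call site.
    int ratingAfter() {
        return lateDeliveries > 5 ? 2 : 1;
    }
}
```

Whether the inlined form is clearer is a judgment call; if the helper name carried useful meaning, keeping it may be the better choice.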
Refactoring: Extract Class
Performed when object-oriented analysis and the resulting UML class diagram suggest two classes where you have only one.
• E.g. Extract a ‘common superclass’
• Can result from too many Extract Method refactorings

Split out a related subset of data and methods into a new class

May need to rename the old class if its duties have changed

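One possible shape for Extract Class, sketched in Java under the assumption of a hypothetical Person class that used to hold telephone fields directly. The telephone data and its formatting move into a new TelephoneNumber class, and Person delegates to it.

```java
// Hypothetical example classes; the names are illustrative, not from the slides.

// Extracted class: owns the telephone data and the method that uses it.
class TelephoneNumber {
    private final String areaCode;
    private final String number;

    TelephoneNumber(String areaCode, String number) {
        this.areaCode = areaCode;
        this.number = number;
    }

    String formatted() {
        return "(" + areaCode + ") " + number;
    }
}

// Original class: keeps its own duties and delegates the rest.
class Person {
    private final String name;
    private final TelephoneNumber officePhone;

    Person(String name, TelephoneNumber officePhone) {
        this.name = name;
        this.officePhone = officePhone;
    }

    String getName() {
        return name;
    }

    String officePhoneFormatted() {
        return officePhone.formatted();
    }
}
```

Callers of Person see the same method as before, so the external behaviour is preserved while the telephone logic gains a home of its own.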
Inverse Refactoring: Inline Class
The opposite of Extract Class

Applied when a class is doing too little
• Normally only when there is a 1-1 association

Move fields and methods into a class that uses them

Refactoring: Inline Temp

Replace a local variable with a simple expression
• Can make other refactorings easier
• Can improve efficiency slightly

Only do it if the variable is not being used in many places

Inverse: replace an expression with a local variable (temp)
• A local variable may be doing no harm, may make the code easier to understand, or may be there to avoid breaking the Law of Demeter
• In which case, leave it there and don’t inline it

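A small Java sketch of Inline Temp, assuming a hypothetical PricedOrder class. The temp is used exactly once, so replacing it with the expression that initialises it is safe here.

```java
// Hypothetical example class; the names are illustrative, not from the slides.
class PricedOrder {
    private final double basePrice;

    PricedOrder(double basePrice) {
        this.basePrice = basePrice;
    }

    double basePrice() {
        return basePrice;
    }

    // Before: a temp holds the result of a simple call and is read once.
    boolean isExpensiveBefore() {
        double price = basePrice();
        return price > 1000;
    }

    // After: the temp is replaced by the expression it was assigned from.
    boolean isExpensiveAfter() {
        return basePrice() > 1000;
    }
}
```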
Refactoring: Rename Method
Reasons for doing this
• The old name was not clear
• The task of a method has changed

Seems trivial, but is very important
• Bad naming defeats the whole purpose of encapsulation
• Code is confusing; in the worst case it can be completely misinterpreted

Refactoring: Replace Magic Number with Symbolic Constant
Encapsulates a standard piece of programming advice

Applies to strings etc., as well as numbers

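To make the advice concrete, here is a short Java sketch. The Physics class, the method names and the value 9.81 are assumptions made for the example. The bare literal is given a name that states what it means.

```java
// Hypothetical example class; the names and the constant are illustrative.
class Physics {
    // Symbolic constant replacing the bare literal (m/s^2, value assumed for the example).
    static final double GRAVITATIONAL_ACCELERATION = 9.81;

    // Before: the reader has to guess what 9.81 stands for.
    static double potentialEnergyBefore(double massKg, double heightM) {
        return massKg * 9.81 * heightM;
    }

    // After: the constant's name documents the value, and it is defined in one place.
    static double potentialEnergyAfter(double massKg, double heightM) {
        return massKg * GRAVITATIONAL_ACCELERATION * heightM;
    }
}
```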
Refactoring: Replace Error Code with Exception
Code can be written assuming it works
• It can therefore be much easier to read

Error handling code is separated out

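A sketch of this refactoring in Java, assuming a hypothetical Account class and an error-code convention of -1 for failure. Both the class and the convention are invented for illustration.

```java
// Hypothetical example class; the names and the -1 convention are illustrative.
class Account {
    private int balance;

    Account(int balance) {
        this.balance = balance;
    }

    int getBalance() {
        return balance;
    }

    // Before: failure is reported as -1, and every caller must remember to check.
    int withdrawWithErrorCode(int amount) {
        if (amount > balance) {
            return -1;
        }
        balance -= amount;
        return 0;
    }

    // After: failure raises an exception, so the normal path reads as if it always works.
    void withdraw(int amount) {
        if (amount > balance) {
            throw new IllegalArgumentException("Insufficient funds");
        }
        balance -= amount;
    }
}
```

A caller of the second version can put the error handling in a single catch block, separate from the main flow.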
Refactoring: Encapsulate Field
Make a field (instance variable) private
• Add a getter, and a setter if necessary

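In Java, Encapsulate Field might look like the sketch below. The Employee class is a hypothetical example, and the comment marks where the field was assumed to be public before the refactoring.

```java
// Hypothetical example class; the names are illustrative, not from the slides.
class Employee {
    // Was assumed to be: public String name;
    private String name;

    Employee(String name) {
        this.name = name;
    }

    // Getter: the only way to read the field from outside the class.
    String getName() {
        return name;
    }

    // Setter: added only because callers need to change the name.
    void setName(String name) {
        this.name = name;
    }
}
```

With access funnelled through the accessor methods, later changes such as validation or a different internal representation stay local to the class.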
Refactoring: Remove Control Flag

Remove a local variable that serves as a control flag to break out of loops

Use break, continue, or an early exit (e.g. return) instead

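A minimal Java sketch of Remove Control Flag, using a hypothetical Search class. In the first version the `found` flag exists only to stop the loop. The second version uses `break` to say the same thing directly.

```java
// Hypothetical example class; the names are illustrative, not from the slides.
class Search {
    // Before: the boolean flag steers the loop condition.
    static int indexOfBefore(int[] values, int target) {
        int index = -1;
        boolean found = false;
        for (int i = 0; i < values.length && !found; i++) {
            if (values[i] == target) {
                index = i;
                found = true;
            }
        }
        return index;
    }

    // After: break leaves the loop as soon as the target is found.
    static int indexOfAfter(int[] values, int target) {
        int index = -1;
        for (int i = 0; i < values.length; i++) {
            if (values[i] == target) {
                index = i;
                break;
            }
        }
        return index;
    }
}
```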
Refactoring: Split Complex Condition
if ((a || b) && (c || d)) {

}

Becomes

if (a || b) {
    if (c || d) {

    }
}

The inverse of this can sometimes also be useful

Refactoring: Name The Condition
if ((a || b) && (c || d)) {

}

Becomes

meaningfulVariable = a || b;
anotherVariable = c || d;
if (meaningfulVariable && anotherVariable) {

}

General Considerations about Refactoring 1
Do not apply refactorings without thought for the consequences
• Multi-threaded applications may experience unexpected effects
• Changing interfaces may affect a published API
• Some code should not be touched

General Considerations about Refactoring 2
Perform good upfront design
• but you can design for ease of refactoring

Don’t make optimisation decisions before finding bottlenecks
• Refactor first, optimise after

Program Analysis and Comprehension
Program Analysis

Extracting information, in order to present abstractions of, or answer questions about, a software system

Static Analysis
• Examines the source code

Dynamic Analysis
• Examines the system as it is executing

What are we looking for when performing program analysis?
Depends on our goals and the system
• In almost any language, we can find out information about variable usage
• In an OO environment, we can find out which classes use other classes, what the inheritance structure is, etc.
• We can also find blocks of code that can never be executed when running the program (dead code)
• Typically, the information extracted is in terms of entities and relationships (Class Diagram) or behaviour (State Machine).

Static Analysis

Involves parsing the source code

Usually creates an Abstract Syntax Tree (a topic of another course)

Borrows heavily from compiler technology
• but stops before code generation

Requires a grammar for the programming language

Can be very difficult to get right

Static analysis in IDEs

High-level languages lend themselves better to static analysis needs
• Rational Software Modeler does this with UML and Java

Unfortunately, most legacy systems are not written in either of these languages

Dynamic Analysis

Provides information about the run-time behaviour of software systems, e.g.
• Component interactions
• Event traces
• Concurrent behaviour
• Code coverage
• Memory management

Can be done with a debugger

Instrumentation

Augments the subject program with code that
• transmits events to a monitoring application
• or writes relevant information to an output file

A profiler tool can be used to examine the output file and extract relevant facts from it

Instrumentation affects the execution speed and storage space requirements of the system

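The idea can be sketched in Java as follows. The Trace and Calculator classes are hypothetical, and an in-memory list stands in for the output file or monitoring application mentioned above. The instrumented method returns the same result as the original and records entry and exit events as a side effect.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event sink; a real setup would write to a file or a monitoring application.
class Trace {
    static final List<String> EVENTS = new ArrayList<String>();

    static void record(String event) {
        EVENTS.add(event);
    }
}

// Hypothetical subject program.
class Calculator {
    // Original method.
    static int square(int x) {
        return x * x;
    }

    // Instrumented version: same computation, plus entry and exit events.
    static int squareInstrumented(int x) {
        Trace.record("enter square(" + x + ")");
        int result = x * x;
        Trace.record("exit square -> " + result);
        return result;
    }
}
```

The extra calls and the growing event list illustrate why instrumentation costs both execution time and storage.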
Instrumentation process

[Figure: Source code and an annotation script feed the Annotator, which produces the annotated program; the Compiler turns it into the instrumented executable]

Non-instrumented approach

One can also use debugger log files to obtain dynamic information
• Disadvantage: Limited amount of information provided
• Advantages: Less intrusive, more accurate performance measurements

Summary: Static vs. Dynamic Analysis

Static Analysis                        | Dynamic Analysis
---------------------------------------+--------------------------------------------
Reasons over all possible behaviours   | Observes a small number of behaviours
(general results)                      | (specific results)
Conservative and sound                 | Precise and fast
Challenge: choose good abstractions    | Challenge: select representative test cases

Program Transformation

The act of changing one program into another
• from a source language to a target language

This is possible because of a program’s well-defined structure
• But for validity, we have to be aware of the semantics of each structure

Used in many areas of software engineering:
• Compiler construction
• Software visualization
• Documentation generation
• Automatic software renovation

Program transformation application examples
Converting to a new language dialect

Migrating from a procedural language to an object-oriented one, e.g. C to C++

Requirement upgrading
• e.g. using 4 digits for years instead of 2 (Y2K)

Structural improvements
• e.g. changing GOTOs to control structures

Simple program transformation

Modify all arithmetic expressions to reduce the number of parentheses using the formula: (a+b)*c = a*c + b*c

x := (2+5)*3
becomes
x := 2*3 + 5*3

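The rewrite above can be sketched as a transformation over a tiny expression tree. Everything below (the Expr interface, the Num, Add and Mul nodes, and the Distribute class) is a hypothetical illustration in Java and not a real tool's API. It handles only the (a+b)*c shape from the formula, not c*(a+b).

```java
// Minimal expression tree; all names are illustrative assumptions.
interface Expr {
    int eval();
    String show();
}

class Num implements Expr {
    final int value;
    Num(int value) { this.value = value; }
    public int eval() { return value; }
    public String show() { return Integer.toString(value); }
}

class Add implements Expr {
    final Expr left, right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    public int eval() { return left.eval() + right.eval(); }
    public String show() { return left.show() + " + " + right.show(); }
}

class Mul implements Expr {
    final Expr left, right;
    Mul(Expr left, Expr right) { this.left = left; this.right = right; }
    public int eval() { return left.eval() * right.eval(); }
    public String show() { return wrap(left) + "*" + wrap(right); }

    // A sum nested inside a product needs parentheses when printed.
    private static String wrap(Expr e) {
        return (e instanceof Add) ? "(" + e.show() + ")" : e.show();
    }
}

class Distribute {
    // Rewrites every (a+b)*c into a*c + b*c, working bottom-up.
    static Expr apply(Expr e) {
        if (e instanceof Add) {
            Add add = (Add) e;
            return new Add(apply(add.left), apply(add.right));
        }
        if (e instanceof Mul) {
            Mul mul = (Mul) e;
            Expr left = apply(mul.left);
            Expr right = apply(mul.right);
            if (left instanceof Add) {
                Add sum = (Add) left;
                return new Add(apply(new Mul(sum.left, right)),
                               apply(new Mul(sum.right, right)));
            }
            return new Mul(left, right);
        }
        return e; // numbers are left unchanged
    }
}
```

Applied to the tree for (2+5)*3, the rewrite yields a tree that prints as 2*3 + 5*3 and still evaluates to 21, so the transformation preserves the semantics of the expression.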
Two types of transformations

Translation
• Source and target language are different
• Semantics remain the same

Rephrasing
• Source and target language are the same
• Goal is to improve some aspect of the program such as its understandability or performance
• Semantics might change

Transformation tools

There are many transformation tools

Program-Transformation.org lists 90 of them
• http://www.program-transformation.org/

Program Comprehension

Program comprehension:
• The discipline concerned with studying the way software engineers understand programs

Objective of those studying program comprehension:
• design tools that will facilitate the understanding of large programs

SEG4110 - Topic R - Software Maintenance

Program Comprehension Strategies 1

The bottom-up model:
• Comprehension starts with the source code and abstracts from it to reach an overall comprehension of the system
• Steps:
  - Read the source code
  - Mentally group together low-level programming details (chunks) to build higher-level abstractions
  - Repeat until a high-level understanding of the program is formed

Program Comprehension Strategies 2

The top-down model:
• Comprehension starts with a general idea, or hypothesis, about how the system works
  - Often obtained from a very quick look at what components exist
• Steps:
  - First formulate hypotheses about the system functionality
  - Verify whether these hypotheses are valid or not
  - Create other hypotheses, forming a hierarchy of hypotheses
  - Continue until the low-level hypotheses are matched to the source code and proven to be valid or not

Program Comprehension Strategies 3

The Integrated Model:
• Combines the top-down and bottom-up approaches
• Evidence shows that maintainers tend to switch among the different comprehension strategies depending on
  - The code under investigation
  - Their expertise with the system
  - Presence of models (system and other types of models)

Partial Comprehension

It is usually not necessary to understand the whole system if only part of it needs to be maintained
• But a high fraction of bugs arise from not understanding enough!

Most software maintenance tasks can be met by answering seven basic questions:
• How does control flow reach a particular location?
• Where is a particular subroutine or procedure invoked?
• What are the arguments and results of a function?
• Where is a particular variable set, used or queried?
• Where is a particular variable declared?
• What are the input and output of a particular module?

Reverse Engineering

• “The process of analyzing a subject system
  - to identify the system’s components and their inter-relationships
  - and to create representations of the system, in another form, at a higher level of abstraction”
Chikofsky and Cross

Two main levels of reverse engineering
Binary reverse engineering
• Take a binary executable
  - Recover source code you can then modify
• Useful for companies that have lost their source code
• Used extensively by hackers
• Can be used legally, e.g. to enable your system to interface to an existing system
• Illegal in some contexts
Source code reverse engineering
• Take source code
  - Recover high level design information
• By far the most widely performed type of reverse engineering
• Binary reverse engineers generally do this too

Reverse Engineering Objectives 1

Cope with complexity:
• Have a better understanding of voluminous and complex systems
• Extract relevant information and leave out low-level details

Generate alternative views:
• Enable the designers to analyze the system from different angles

Reverse Engineering Objectives 2

Recover lost information:
• Changes made to the system are often undocumented
  - This enlarges the gap between the design and the implementation
• Reverse engineering techniques retrieve the lost information

Reverse Engineering Objectives 3

Detect side effects:
• Detect problems due to the effect a change may have on the system before it results in failure

Synthesize higher-level abstractions

Reverse Engineering Objectives 4

Facilitate reuse
• Detect candidate system components that can be reused