Software Engineering Module 4
Coding translates the result of the design phase into code in a given programming language, which can be executed by a computer and which performs the computation specified by the design. For a given design, the aim is to implement the design in the best possible manner. The coding phase affects both testing and maintenance. During implementation, it should be kept in mind that programs should not be constructed so that they are easy to write, but so that they are easy to read and understand. Quick fixes that make a piece of code easy to modify in the moment tend to produce code that is more difficult to understand. Ease of understanding and modification should be the basic goals of the programming activity. Simplicity and clarity are desirable.

Programming Practice:
The primary goal of the coding phase is to translate the given design into source code, so that the code is simple, easy to test, and easy to understand and modify. Good programming is a skill that can only be acquired by practice. Much can be learned from the experience of others, and some general rules and guidelines can be laid down for the programmer. Good programming is a practice independent of the target programming language. Some concepts related to coding are discussed below.

1.) Top Down and Bottom Up
Given the hierarchy of modules produced by design, in what order should the modules be built: starting from the top level or starting from the bottom level? In a top-down implementation, the implementation starts from the top of the hierarchy and proceeds to the lower levels. First the main module is implemented, then its subordinates, then their subordinates, and so on. In a bottom-up implementation, the process is the reverse. The development starts with implementing the modules at the bottom of the hierarchy and proceeds through the higher levels until it reaches the top. Top-down and bottom-up implementation should not be confused with top-down and bottom-up design.
Here the design is being implemented, and if the design is fairly detailed and complete, its implementation can proceed in either a top-down or a bottom-up manner. Which of the two is used mostly affects testing. The main reason is that we want to build the system incrementally, that is, in parts, even though the design of the entire system has been done. Since large systems must be built in parts that are tested separately, the issue of top-down versus bottom-up arises. The real consequence of the order in which modules are coded shows up in testing. If all the modules are to be developed and then put together to form the system for testing purposes, it is immaterial which module is coded first. When modules have to be tested separately, top-down and bottom-up implementation lead to top-down and bottom-up approaches to testing. These two approaches have different consequences. When we proceed top-down, to test a set of modules at the top of the hierarchy, stubs have to be written for the lower-level modules that the modules under test invoke. In bottom-up, all modules lower in the hierarchy have already been developed, and driver modules are needed to invoke the modules under test.

When the design is not detailed enough, some of the design decisions have to be made during development. For example, building a prototype is a top-down development. In a layered architecture, each layer provides some services to the layer above, which uses these services to implement the services it provides in turn. For such an architecture it is generally best for the implementation to proceed in a bottom-up manner. In practice, in large systems, a combination of the two approaches is used during coding. The top modules of the system generally contain the overall view of the system and may even contain the user interfaces. On the other hand, it is important that the bottom-level modules work correctly before they are used by other modules.

2.) Structured Programming
The basic objective of the coding activity is to produce programs that are easy to understand. It has been argued that structured programming helps develop programs that are easier to understand. Structured programming is often regarded as goto-less programming. A program has a static structure as well as a dynamic structure. The static structure is the structure of the text of the program, which is usually just a linear organization of the statements of the program. The dynamic structure is the sequence of statements executed during the execution of the program.
The correctness of a program means that, when the program executes, it produces the desired behavior. Much of the activity of program understanding is to understand the dynamic behavior of the program from its text. It is clearly easier to understand the dynamic behavior if its structure resembles the static structure. The closer the correspondence between the execution and the text structure, the easier the program is to understand. The goal of structured programming is to ensure that the static structure and the dynamic structure are the same: to develop programs whose control flow during execution is linearized and follows the linear organization of the program text.
No meaningful program can be written as a sequence of simple statements without any branching or repetition. So how is linearizing achieved?
-By using structured constructs.
-We use structured statements in structured programming.
A structured statement has a single entry and a single exit, i.e. the execution starts from one defined point and ends at one defined point. A program is thus a sequence of structured statements, and the sequence of execution of these statements will be the same as their sequence in the program text. The most commonly used statements are:
Selection:
    if B then S1 else S2
    if B then S1
Iteration:
    while B do S
    repeat S until B
Sequencing:
    S1; S2; S3;
Structured programming helps write programs clearly. It also helps in the formal verification of programs. In a linearized control flow, if we understand the behavior of each of the basic constructs properly, the behavior of the program can be considered as a composition of the behaviors of the different statements. Although efforts should be made to avoid statements that violate single entry and single exit, if the use of such a statement is the simplest way to organize the program, then from the point of view of readability the construct should be used.

3.) Information Hiding
A software solution to a problem always contains data structures that are meant to represent information in the problem domain. With the problem information represented internally as data structures, the required functionality of the problem domain, which is expressed in terms of information in that domain, can be implemented as software operations on the data structures. Any information in the problem domain typically has a small number of defined operations performed on it. When the information is represented internally as data structures, the same principle should be applied, and only some defined operations should be performed on the data structures. This is the principle of information hiding.
The information captured in the data structures should be hidden from the rest of the system, and only access functions on the data structures that represent the operations performed on the information should be visible. For each operation on the information an access function should be provided. The rest of the modules should only use these access functions to access and manipulate the data structures.
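As an illustration of access functions, here is a minimal Python sketch. The `Stack` class and the `reverse` client are hypothetical names chosen for this example; they are not part of the original notes. Note that Python's leading underscore marks the data as private by convention only, so this shows the discipline rather than a language-enforced guarantee.

```python
class Stack:
    """A stack whose representation is reached only through access functions."""

    def __init__(self):
        # Hidden representation (hypothetical choice: a Python list).
        # Client modules are expected not to touch this attribute directly.
        self._items = []

    def push(self, value):
        """Access function: add a value on top of the stack."""
        self._items.append(value)

    def pop(self):
        """Access function: remove and return the top value."""
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        """Access function: report whether the stack holds no values."""
        return not self._items


def reverse(values):
    """Client module: uses only push, pop and is_empty, never _items."""
    stack = Stack()
    for value in values:
        stack.push(value)
    result = []
    while not stack.is_empty():
        result.append(stack.pop())
    return result
```

If the list inside `Stack` were later replaced by, say, a linked structure, only the three access functions would need to change; `reverse` would be unaffected.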
All modules, other than the access functions, access the data structure through the access functions. Information hiding can reduce the coupling between modules and make the system more maintainable. With information hiding, the impact on the modules using the data needs to be evaluated only when the data structure or access functions are changed. Also, when a data structure is changed, the effect of the change is generally limited to the access functions if information hiding is used. Otherwise all modules using the data structure may have to be changed. By using information hiding, we separate the concern of managing data from the concern of using data to produce some desired results.

Another form of information hiding is to let a module see only those data items needed by it. Each module is thus given access to data items on a need-to-know basis. Most languages do not support this. However, the information hiding principle discussed earlier is supported by many modern programming languages in the form of data abstraction. With support for data abstraction, a package or module is defined that encapsulates the data. Some operations are defined by the module on the encapsulated data. The advantage of this form of data abstraction is that the data is entirely under the control of the module in which it is encapsulated. Other modules cannot access or modify the data; the operations that can access and modify it are also a part of this module.

4.) Programming Style
For writing good code we need to apply some general rules:

1. Names
Selecting module and variable names is often not considered important by novice programmers. Most variables represent some entity in the problem domain, and the modules represent some process. Variable names should be closely related to the entity they represent, and module names should reflect their activity. It is bad practice to use cryptic names or to use the same name for multiple purposes.

2. Control Constructs
Use single-entry, single-exit constructs. Use a few standard control constructs.

3. Gotos
Use gotos sparingly in the program. Gotos should be used only when the alternative to using them is more complex. If a goto is used, forward transfers (jumps to a later statement) are more acceptable than backward jumps. Use of gotos for exiting loops or invoking error handlers is more acceptable.

4. Information Hiding
Information hiding should be supported where possible.

5. User Defined Types
When facilities like enumerated data types are available, they should be exploited where applicable. For example,
    Type days = {Mon, ..., Sun};
They make programs much clearer.

6. Nesting
If nesting becomes too deep, programs become harder to understand. It is often difficult to determine the if statement with which a particular else clause is associated. Where possible, avoid deep nesting, even if it means a little inefficiency. For example,
    if C1 then S1
    else if C2 then S2
    else if C3 then S3
    else if C4 then S4
If the conditions are mutually exclusive, this structure can be converted into the following structure:
    if C1 then S1;
    if C2 then S2;
    if C3 then S3;
    if C4 then S4;
With mutually exclusive conditions, this sequence of statements produces the same result as the earlier one, but it is much easier to understand. (If the conditions can overlap, the two forms are not equivalent, because the second form may execute more than one of the statements.) The price is a little inefficiency: the later conditions will be evaluated even after a condition has evaluated to true, while in the previous case condition evaluation stops as soon as one evaluates to true.

7. Module Size
Large modules often will not be functionally cohesive, and too-small modules may incur overhead. There can be no hard-and-fast rule about module sizes; the guiding principle should be cohesion and coupling.

8. Module Interface
A module with a complex interface might not be functionally cohesive and might be implementing multiple functions. Any module whose interface has more than five parameters should be broken into multiple modules with simpler interfaces if possible.

9. Program Layout
How the program is organized and presented can have a great effect on its readability. Proper indentation, blank spaces, and parentheses should be used to enhance the readability of programs. Automated tools are available to pretty-print a program.

10. Side Effects
When a module is invoked, it sometimes has the side effect of modifying the program state beyond the modification of the parameters listed in the module interface definition, for example, by modifying global variables. Such side effects should be avoided where possible, and if a module has side effects, they should be properly documented.

11. Robustness
A program is robust if it does something planned even for exceptional conditions. A program might encounter exceptional conditions in such forms as incorrect input, an incorrect value of some variable, and overflow. The program should not just crash; it should produce some meaningful message and exit gracefully.

5.) Internal Documentation
In the coding phase, the output document is the code itself. Some amount of internal documentation is useful for enhancing the understandability of programs. It is done using comments. Comments are textual statements that are meant for program readers and are not executed. Comments, if properly written and kept consistent with the code, can be invaluable during maintenance. The purpose of comments is not to explain in English the logic of the program; the program itself is the best documentation for the details of the logic. The comments should explain what the code is doing, not how it is doing it. This means a comment is not needed for every line of code. Comments should be provided for blocks of code, particularly those parts of the code that are hard to follow.
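The following Python sketch shows one way block comments can say what a block does rather than restating each line. The function `average_positive` and the sensor scenario mentioned in its comments are hypothetical, invented only for this illustration.

```python
def average_positive(values):
    # Keep only real readings. (Assumed scenario: a sensor reports 0 or a
    # negative number for a missing sample, and those must not skew the mean.)
    positives = [v for v in values if v > 0]

    # With no real readings there is no mean to report, so signal that
    # explicitly instead of letting a division by zero happen.
    if not positives:
        return None

    return sum(positives) / len(positives)
```

Each comment covers a block and states its purpose; none of them narrates the mechanics of a single line, which the code already makes clear.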
In most cases, only comments for the modules need to be provided. Providing comments for modules is most useful, as modules form the unit of testing, compiling, verification and modification. The comments for a module are often called the prologue of the module. It is desirable that the prologue contain the following information:
1. Module functionality, or what the module is doing.
2. Parameters and their purpose.
3. Assumptions about the inputs, if any.
4. Global variables accessed and/or modified in the module.
An explanation of the parameters, stating how global data is affected, and the side effects of a module can be quite useful during maintenance. In addition, other information can be included, depending on the local coding standards. Examples are the name of the author, the date of compilation, and the date of last modification. If a module is modified, then the prologue should also be modified, if necessary.

Verification
Verification of the output of the coding phase is intended for detecting errors introduced during this phase. Its purpose is to show that the code is consistent with the design it is supposed to implement. Program verification methods fall into two categories:
o Static
o Dynamic
In dynamic methods, the program is executed on some test data and the outputs of the program are examined to determine if there are any errors present. This follows the traditional pattern of testing. Static techniques do not involve actual program execution on actual numeric data, though they may involve some form of conceptual execution. The program is not compiled and then executed as in testing. Common forms are code reading, program verification, code reviews or walkthroughs, and symbolic execution. Unlike dynamic techniques, where only the presence of an error is detected, here errors are detected directly. The types of errors detected by the two categories of verification techniques are different. The types of errors detected by static techniques are often not found by testing, or it may be more cost-effective to detect these errors by static methods. Consequently, testing and static methods are complementary in nature, and both should be used for reliable software.

1. Code Reading
Code reading is the reading of code by the programmer to detect any discrepancies between the design specifications and the actual implementation.
It involves determining the abstraction of a module and then comparing it with its specification. It is the reverse of the design process.
Code reading is best done by reading the code inside-out, starting with the innermost structure of the module. First determine its abstract behavior and specify the abstraction. Then the higher-level structure is considered, with the inner structure replaced by its abstraction. This process is continued until we reach the module or program being read. At that point the abstract behavior of the program or module will be known, and it can be compared to the specifications to determine any discrepancies. Code reading can detect errors not revealed by testing. Code reading is also called desk review.

2. Static Analysis
Analysis of programs by methodically analyzing the program text is called static analysis. It is done using software tools. The program is not executed; instead, the program text is the input to the tools. Static analysis detects errors or potential errors, or generates information about the structure of the program that can be useful for documentation or for understanding programs. Different types of static analysis tools can be designed to perform different types of analysis. It is a very cost-effective technique. Static analysis can detect errors, provide insight into the structure of the program, and reveal violations of local standards. Extensive static analysis can reduce the effort needed for testing.

Data flow analysis is a form of static analysis. It concentrates on the uses of data by programs and detects some data flow anomalies (i.e. suspicious uses of data in a program). Data flow anomalies are technically not errors. They are caused by carelessness in typing or by errors in coding. The presence of data flow anomalies implies poor coding, and they should be properly addressed. An example of a data flow anomaly is the live variable problem, in which a variable is assigned some value but then the variable is not used in any later computation. Another example is having two assignments to a variable without using the value of the variable between the two assignments. For example,
    x = a;
    .
    .    (x does not appear on any right-hand side)
    .
    x = b;
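The two anomalies just described can be found mechanically for straight-line code. The sketch below is a simplified illustration, not a real data flow analyzer: it assumes a hypothetical representation in which each statement is a pair (assigned variable, list of variables read), and it ignores branches and loops.

```python
def find_anomalies(statements):
    """statements: list of (target, used_names) pairs in program order.

    Returns (overwritten, never_used): indices of assignments whose value
    is reassigned before any use, and indices of assignments whose value
    is never used by a later statement.
    """
    pending = {}        # variable -> index of its latest assignment not yet read
    overwritten = []
    for index, (target, used_names) in enumerate(statements):
        for name in used_names:
            pending.pop(name, None)          # the value was read, so it was useful
        if target in pending:
            overwritten.append(pending[target])  # reassigned without being read
        pending[target] = index
    never_used = sorted(pending.values())
    return overwritten, never_used


# Hypothetical straight-line program mirroring the example in the text.
PROGRAM = [
    ("x", ["a"]),       # 0: x = a
    ("y", ["c"]),       # 1: y = c      (x is not read here)
    ("x", ["b"]),       # 2: x = b      (statement 0 is overwritten unused)
    ("z", ["x", "y"]),  # 3: z = x + y  (z itself is never read afterwards)
]
```

Here `find_anomalies(PROGRAM)` reports statement 0 as a double assignment without an intervening use, and statement 3 as an assignment whose value is not used later.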
Data flow analysis can also provide valuable information for the documentation of programs. For example, it can provide information about which variables are modified in the caller program on invoking a procedure, and which variables have their values used in the called procedure. This analysis can identify aliasing, which occurs when different variables represent the same data object. This information can be useful during maintenance to ensure that there are no undesirable side effects of some modification to a procedure. Other examples of data flow anomalies are unreachable code, unused variables and unreferenced labels.

Data flow analysis is usually performed by representing a program as a graph, sometimes called the flow graph. The nodes in a flow graph represent statements of the program, while the edges represent control paths from one statement to another. To reduce the processing time of the analysis algorithms, the search of a flow graph has to be carefully organized. Another way to reduce the time is to reduce the size of the flow graph. The most common transformation is to have each node represent a block of code that will always be executed together, rather than a single statement. Another graph that is often used is the call graph, in which the nodes represent modules and an edge from one node n to another node m represents the fact that the execution of the module represented by n directly invokes the module m.

Other Uses of Static Analysis
An error often made, especially when different teams are developing different parts of the software, is a mismatched parameter list, where the actual and formal parameters differ in number or type. If the programs are separately developed and compiled, this error will not be detected by the compiler. A static analyzer with access to the different parts of the program can detect it. Static analysis can also detect calls to nonexistent program modules, as well as some infinite loops and illegal recursions.
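A check for mismatched parameter lists and calls to nonexistent modules can be sketched with Python's standard `ast` module. This is an illustrative assumption about how such a tool might work, not a description of any particular analyzer: the function `check_calls` and the `SAMPLE` program are hypothetical, only positional arguments are counted, and default values, keyword arguments, methods and built-in functions are deliberately ignored.

```python
import ast


def check_calls(source):
    """Return sorted (line, function name, problem) triples for calls that
    pass the wrong number of positional arguments or name an undefined function."""
    tree = ast.parse(source)
    # Number of formal parameters for every function defined in the source.
    arity = {
        node.name: len(node.args.args)
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }
    problems = []
    for node in ast.walk(tree):
        # Only plain calls such as f(...) are examined, not obj.method(...).
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            name = node.func.id
            if name not in arity:
                problems.append((node.lineno, name, "undefined"))
            elif len(node.args) != arity[name]:
                problems.append((node.lineno, name, "argument count"))
    return sorted(problems)


# Hypothetical program text under analysis; it is parsed, never executed.
SAMPLE = """
def area(width, height):
    return width * height

def report():
    return area(3) + volume(1, 2, 3)
"""
```

Applied to `SAMPLE`, the check flags the call `area(3)` (one actual parameter against two formal ones) and the call to `volume`, which is not defined anywhere in the text.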
There are different kinds of documents that static analyzers can produce, which can be useful for maintenance or for increased understanding of the program. The first is the cross-reference of where different variables and constants are used. Often, looking at the cross-reference can help one detect subtle errors, like many constants defined to represent the same entity. For example, the value of pi could be defined as a constant in different routines with slightly different values. A report with cross-references can be useful for detecting such errors. To reduce the size of such reports, it may be more useful to limit them to the use of constants and global variables.
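A cross-reference restricted to constants can be sketched as follows, again using Python's standard `ast` module. The functions and the `SAMPLE` text are hypothetical, and "constant" is taken here to mean a name assigned a numeric literal, which is an assumption made for this sketch.

```python
import ast
from collections import defaultdict


def constant_cross_reference(source):
    """Map each name assigned a numeric literal to the sorted list of
    (line, value) pairs where it is defined, anywhere in the source."""
    table = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Assign)
                and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)
                and isinstance(node.value, ast.Constant)
                and isinstance(node.value.value, (int, float))):
            table[node.targets[0].id].append((node.lineno, node.value.value))
    return {name: sorted(defs) for name, defs in table.items()}


def inconsistent_constants(source):
    """Names that are defined with more than one distinct literal value."""
    xref = constant_cross_reference(source)
    return sorted(name for name, defs in xref.items()
                  if len({value for _, value in defs}) > 1)


# Hypothetical program text: pi is defined twice with different values.
SAMPLE = """
def circle_area(r):
    PI = 3.14
    return PI * r * r

def circumference(r):
    PI = 3.1416
    return 2 * PI * r
"""
```

For `SAMPLE`, the cross-reference lists both definitions of `PI` with their line numbers, and `inconsistent_constants` singles it out because the two values differ.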
Information about the frequency of use of different constructs of the programming language can also be obtained by static analysis. Static analysis can produce the structure chart of the program, which can be compared with the structure chart from system design to see the differences. There are also coding restrictions that different organizations impose. Such restrictions cannot be checked by compilers, but static analysis can be used to enforce these standards.
3. Symbolic Execution
Here the program is not executed with actual data. The program is symbolically executed with symbolic data. The inputs are not numbers but symbols representing input data, which can take different values. Execution of the program proceeds like normal execution, except that it deals with values that are not numbers but formulas consisting of symbolic input values. The outputs are symbolic formulas of the input values. These formulas can be checked to see if the program will behave as expected. Symbolic execution is also known as symbolic evaluation or symbolic testing.

Although the concept is simple, performing symbolic execution of even modest-size programs is very difficult. The problems basically come from the conditional execution of statements in programs. As a condition over symbolic expressions cannot usually be evaluated to true or false without substituting actual values for the symbols, a case-by-case analysis becomes necessary, and all possible cases of a condition have to be considered. In programs with loops, this results in a large number of cases.

To introduce the basic concepts of symbolic execution, let us first consider a simple program without any conditional statements.
    1. function product(x, y, z: integer): integer;
    2. var tmp1, tmp2: integer;
    3. begin
    4.   tmp1 := x*y;
    5.   tmp2 := y*z;
    6.   product := tmp1*tmp2/y;
    7. end;
Symbolic execution of product (values of variables after each statement):

    After statement   x    y    z    tmp1    tmp2    product
    1                 xi   yi   zi   ?       ?       ?
    4                 xi   yi   zi   xi*yi   ?       ?
    5                 xi   yi   zi   xi*yi   yi*zi   ?
    6                 xi   yi   zi   xi*yi   yi*zi   (xi*yi)*(yi*zi)/yi
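The symbolic result for product can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: `Sym` is a hypothetical helper class written for this example (not a library API), it supports only `*` and `/`, and it records fully parenthesized formulas instead of simplifying them.

```python
class Sym:
    """A symbolic value that records the formula used to compute it."""

    def __init__(self, text):
        self.text = text

    def __mul__(self, other):
        return Sym(f"({self.text}*{other.text})")

    def __truediv__(self, other):
        return Sym(f"({self.text}/{other.text})")

    def __repr__(self):
        return self.text


def product(x, y, z):
    """Python rendering of the product function from the listing above."""
    tmp1 = x * y
    tmp2 = y * z
    return tmp1 * tmp2 / y


# Execute with symbols xi, yi, zi in place of numbers.
result = product(Sym("xi"), Sym("yi"), Sym("zi"))
```

Because `product` has no conditionals, a single symbolic run covers every input. The resulting formula, `(((xi*yi)*(yi*zi))/yi)`, is the last row of the table with every operation parenthesized.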
With one path and an acceptable symbolic result, we can claim that the program is correct.

Path Conditions
The path condition at a statement gives the conditions the inputs must satisfy for that statement to be executed. A path condition is a Boolean expression over the symbolic inputs that never contains any program variables. It will be represented by pc. Each symbolic execution begins with pc = true. As conditions are encountered, the path condition takes different values for the different cases, which correspond to different paths in the program. For example, symbolic execution of an if statement of the form
    if C then S1 else S2
requires two cases to be considered, corresponding to the two possible paths: one where C evaluates to true and S1 is executed, and the other where C evaluates to false and S2 is executed. For the first case we set the path condition to
    pc <- pc and C
which is the path condition for the statements in S1. For the second case we set
    pc <- pc and not C
which is the path condition for the statements in S2. On encountering the if statement, symbolic execution is said to fork into two executions: one following the then part, the other the else part. Both parts are executed independently, with their respective path conditions. However, if at any if statement we can show that pc implies C or that pc implies not C, we do not need to follow both paths; only the relevant path needs to be executed. Such an if statement is a nonforking conditional statement.

Let us consider an example involving if statements.
    1. function max(x, y, z: integer): integer;
    2. begin
    3.   if x<=y then
    4.     max := y
    5.   else
    6.     max := x;
    7.   if max<z then
    8.     max := z;
    9. end;
The trace of the symbolic execution of this program is shown below:
    Step                        pc                          max
    On entry                    true                        ?
    Case (x > y):
      first if, else part       (xi > yi)                   ?
      max := x                  (xi > yi)                   xi
      Case (max < z):           (xi > yi) and (xi < zi)     zi    Return this value of max
      Case (max >= z):          (xi > yi) and (xi >= zi)    xi    Return this value of max
    Case (x <= y):              Similar to the above.

We can check the paths of execution using this trace.

Loops and Symbolic Execution Trees
The different paths can be represented by an execution tree. A node represents the execution of a statement, and an arc represents the transition from one statement to the next. For each if statement where both paths are followed, there are two arcs from the node for that if, one labeled T and the other F, for the then and else paths. At each branching, the path condition is also often shown in the tree.
Some important points in symbolic execution:
- Each leaf represents a path that will be followed for some input values.
- Path conditions associated with two different leaves are distinct.
- If the symbolic output at each leaf is correct, the program is correct.
- Even for modest-size programs, the tree can be infinite due to the presence of loops. Therefore symbolic execution cannot, in general, be used for proving correctness.
- A tool that performs symbolic execution may not terminate. For this reason, tools are built in which only some paths are symbolically executed.
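The leaves of the execution tree for the max example can be written out and checked in Python. This sketch is hand-unrolled for that one program rather than a general symbolic executor, and the names `symbolic_max` and `check_leaves` are hypothetical. The check tries a small range of concrete inputs, so it is supporting evidence for the listed properties, not a proof.

```python
import itertools


def symbolic_max():
    """Enumerate the paths of max(x, y, z), forking at each if statement.
    Returns a list of (path_condition, symbolic_value_of_max) pairs."""
    leaves = []
    # First if (x <= y): fork into the then case and the else case.
    for first_cond, max_value in (("xi<=yi", "yi"), ("xi>yi", "xi")):
        # Second if (max < z): fork again, extending the path condition.
        leaves.append((f"({first_cond}) and ({max_value}<zi)", "zi"))
        leaves.append((f"({first_cond}) and ({max_value}>=zi)", max_value))
    return leaves


def check_leaves(leaves, values=range(-2, 3)):
    """For every concrete input in a small range, exactly one path condition
    must hold, and that leaf's formula must give the true maximum."""
    for xi, yi, zi in itertools.product(values, repeat=3):
        env = {"xi": xi, "yi": yi, "zi": zi}
        taken = [value for pc, value in leaves if eval(pc, {}, env)]
        if len(taken) != 1 or eval(taken[0], {}, env) != max(xi, yi, zi):
            return False
    return True
```

`symbolic_max()` yields four leaves, one per path, and `check_leaves` confirms on the sampled inputs that the path conditions do not overlap and that each leaf returns the correct value.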
4. Proving Correctness
Refer to the assignment.

5. Code Inspections/Reviews
Code inspection helps in detecting defects in the code. It reduces the effort during testing. It is held after the code has been successfully compiled and other forms of static tools have been applied, but before any testing has been performed. The entry criteria for a code review are that the code must compile successfully and must have been passed by the other static analysis tools. The documentation distributed to the review team members includes the code to be reviewed and the design document. The review team for code reviews should include the programmer, the designer, and the tester. The review starts with the preparation for the review and ends with a list of action items.
The aim of the review is to detect defects in the code. One obvious coding defect is that the code fails to implement the design. In addition, the input-output format assumed by a module may be inconsistent with the format specified in the design. Other code defects can be divided into two classes:
o Logic and control, e.g. an infinite loop
o Data operations and computation, e.g. incorrect access of array components
There are also quality issues which the review addresses:
-inefficiency
-violation of local standards

6. Unit Testing
Unit testing is a dynamic verification method. The program is compiled and executed. It is the most widely used method, which is why the coding phase is also called the coding and unit testing phase. Unit testing involves executing the code with some test cases and then evaluating the results. Here the modules or units are tested, and not the entire software system. Other levels of testing are used to test the system. Unit testing is done by the programmers themselves. After coding, the programmer tests the module with some test data. The tested module is then handed over for integration and further testing. Selecting test cases and deciding how much testing is enough are two important aspects of testing.
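A unit test for a single module might look like the sketch below, which uses Python's standard `unittest` module. The function `max3` is a hypothetical Python rendering of the max function from the symbolic execution example, and the test cases are illustrative choices, one per path through the function plus a tie.

```python
import unittest


def max3(x, y, z):
    """Unit under test: Python rendering of the max function from the notes."""
    result = y if x <= y else x
    if result < z:
        result = z
    return result


class Max3Test(unittest.TestCase):
    """Test cases chosen so that each path through max3 is executed."""

    def test_first_is_largest(self):
        self.assertEqual(max3(3, 1, 2), 3)

    def test_second_is_largest(self):
        self.assertEqual(max3(1, 3, 2), 3)

    def test_third_is_largest_after_then_branch(self):
        self.assertEqual(max3(1, 2, 3), 3)

    def test_third_is_largest_after_else_branch(self):
        self.assertEqual(max3(2, 1, 3), 3)

    def test_all_equal(self):
        self.assertEqual(max3(2, 2, 2), 2)
```

Saved in a file, these tests can be run with `python -m unittest`. Choosing one case per path is only one possible selection strategy; how many cases are enough remains a judgment call, as noted above.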