Macro Processors: Basic Function Machine-Independent Features Design Options Implementation Examples
Macro Instructions

A macro instruction (macro) is a notational convenience that lets the programmer write a shorthand version of a program.
- It represents a commonly used group of statements in the source program.
- The macro processor replaces it with the corresponding group of source language statements. This operation is called expanding the macro.

For example, suppose it is necessary to save the contents of all registers before calling a subroutine. This requires a sequence of instructions. We can define and use a macro, SAVEREGS, to represent this sequence of instructions.

Macro Processors

A macro processor essentially substitutes one group of characters or lines for another.
- Normally, it performs no analysis of the text it handles.
- It is not concerned with the meaning of the statements involved during macro expansion.
- Therefore, the design of a macro processor is generally machine independent.

Macro processors are used in
- assembly language
- high-level programming languages, e.g., C or C++
- OS command languages
- general purpose

Basic Functions
- Macro Definition
- Macro Invocation
- Macro Expansion
- One-Pass Algorithm
- Data Structure

Macro Definition

Two new assembler directives are used in macro definition:
- MACRO: identifies the beginning of a macro definition
- MEND: identifies the end of a macro definition

Prototype (pattern) for the macro; each parameter begins with &:

    label    op       operands
    name     MACRO    parameters
             :
             body
             :
             MEND

Body: the statements that will be generated as the expansion of the macro.

Example of Macro Definition
Macro Invocation

A macro invocation statement (a macro call) gives the name of the macro instruction being invoked and the arguments to be used in expanding the macro.

The processes of macro invocation and subroutine call are quite different.
- Statements of the macro body are expanded each time the macro is invoked.
- Statements of the subroutine appear only once, regardless of how many times the subroutine is called.
Macro Expansion

Each macro invocation statement is expanded into the statements that form the body of the macro.
- Arguments from the macro invocation are substituted for the parameters in the macro prototype.
- The arguments and parameters are associated with one another according to their positions: the first argument in the macro invocation corresponds to the first parameter in the macro prototype, and so on.

In the expanded program:
- Comment lines within the macro body have been deleted, but comments on individual statements have been retained.
- The macro invocation statement itself has been included as a comment line.
- The label on the macro invocation statement, CLOOP, has been retained as a label on the first statement generated in the macro expansion. This allows the programmer to use a macro instruction in exactly the same way as an assembler language mnemonic.

Example of Macro Expansion
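The positional association described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `expand_positional` is hypothetical, the three body statements are an abbreviated stand-in for a real macro body, and labels and comments are ignored.

```python
def expand_positional(params, body, args):
    """Replace each &PARAM in the body with the argument in the same position."""
    mapping = dict(zip(params, args))
    # Substitute longer names first so one parameter name that is a prefix
    # of another cannot clobber part of the longer name.
    ordered = sorted(mapping, key=len, reverse=True)
    expanded = []
    for line in body:
        for name in ordered:
            line = line.replace(name, mapping[name])
        expanded.append(line)
    return expanded


# Abbreviated, illustrative body (not the full RDBUFF definition).
params = ["&INDEV", "&BUFADR", "&RECLTH"]
body = ["TD =X'&INDEV'", "STCH &BUFADR,X", "STX &RECLTH"]
result = expand_positional(params, body, ["F1", "BUFFER", "LENGTH"])
```

Here the first argument, F1, replaces the first parameter, &INDEV, and so on down the list.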
No Label in the Body of Macro

Problem: if the same macro is expanded multiple times at different places in the program, any labels in the macro body are duplicated, and the assembler treats duplicate labels as errors.

Solutions:
- Simply do not use labels in the body of the macro.
- Explicitly use PC-relative addressing instead. For example, in the RDBUFF and WRBUFF macros:
      JEQ  *+11
      JLT  *-14
  This is inconvenient and error-prone. Is there a better solution?

Two-Pass Macro Processor
- Pass 1: process all macro definitions
- Pass 2: expand all macro invocation statements

Problem: this kind of macro processor cannot handle recursive macro definition, that is, a macro body that contains definitions of other macros, because all macros would have to be defined during the first pass before any macro invocations were expanded.

Example of Recursive Macro Definition
MACROS (for SIC) contains the definitions of RDBUFF and WRBUFF written in SIC instructions. MACROX (for SIC/XE) contains the definitions of RDBUFF and WRBUFF written in SIC/XE instructions.
- A program that is to be run on a SIC system could invoke MACROS, whereas a program to be run on SIC/XE could invoke MACROX.
- Defining MACROS or MACROX does not define RDBUFF and WRBUFF. These definitions are processed only when an invocation of MACROS or MACROX is expanded.

Example of Recursive Macro Definition
One-Pass Macro Processor

A one-pass macro processor that alternates between macro definition and macro expansion in a recursive way is able to handle recursive macro definition. Because of the one-pass structure, the definition of a macro must appear in the source program before any statements that invoke that macro.

Data Structures

DEFTAB (definition table)
- Stores the macro definition, including the macro prototype and the macro body.
- Comment lines are omitted.
- References to the macro instruction parameters are converted to a positional notation for efficiency in substituting arguments.

NAMTAB (name table)
- Stores macro names.
- Serves as an index to DEFTAB, with pointers to the beginning and the end of each macro definition.

ARGTAB (argument table)
- Stores the arguments of a macro invocation according to their positions in the argument list.
- As the macro is expanded, arguments from ARGTAB are substituted for the corresponding parameters in the macro body.

Data Structures
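The three tables can be sketched as ordinary Python containers. This is a minimal illustration, not a definitive layout: the `?n` form chosen for the positional notation, the function names `store_definition` and `expand_call`, and the abbreviated WRBUFF body are all assumptions made for the example.

```python
import re

DEFTAB = []   # prototype and body lines of every macro, in order of definition
NAMTAB = {}   # macro name -> (index of first DEFTAB line, index of last line)


def store_definition(name, params, body):
    """Enter a macro into DEFTAB with parameters rewritten as ?1, ?2, ..."""
    start = len(DEFTAB)
    DEFTAB.append(name + " MACRO " + ",".join(params))
    # Longest names first, so a short name cannot match inside a longer one.
    numbered = sorted(enumerate(params, 1), key=lambda t: -len(t[1]))
    for line in body:
        if line.startswith("."):      # comment lines are not stored
            continue
        for pos, param in numbered:
            line = line.replace(param, "?%d" % pos)
        DEFTAB.append(line)
    DEFTAB.append("MEND")
    NAMTAB[name] = (start, len(DEFTAB) - 1)


def expand_call(name, args):
    """Fill ARGTAB from the invocation and substitute by position."""
    ARGTAB = list(args)
    start, end = NAMTAB[name]
    return [re.sub(r"\?(\d+)", lambda m: ARGTAB[int(m.group(1)) - 1], line)
            for line in DEFTAB[start + 1:end]]


store_definition("WRBUFF", ["&OUTDEV", "&BUFADR", "&RECLTH"],
                 [". write one record", "LDT &RECLTH", "LDCH &BUFADR,X",
                  "WD =X'&OUTDEV'"])
expanded = expand_call("WRBUFF", ["05", "BUFFER", "LENGTH"])
```

Because the body is stored with `?n` markers, expansion never has to search for parameter names; it only indexes into ARGTAB.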
Algorithm

MAIN procedure
- Iterates GETLINE and PROCESSLINE.

PROCESSLINE procedure
- Calls DEFINE for a macro definition, calls EXPAND for a macro invocation, and otherwise outputs the source line.

DEFINE procedure
- Makes the appropriate entries in DEFTAB and NAMTAB.

EXPAND procedure
- Sets up the argument values in ARGTAB.
- Expands a macro invocation statement by iterating GETLINE and PROCESSLINE, as in the MAIN procedure.

GETLINE procedure
- Gets the next line to be processed, either from the input file or, while a macro is being expanded, from DEFTAB.

Handling Recursive Macro Definition

In the DEFINE procedure, when a macro definition is being entered into DEFTAB, the normal approach is to continue until an MEND directive is reached. This would not work for recursive macro definition, because the first MEND encountered in the inner macro would terminate the whole macro definition process.

To solve this problem, a counter LEVEL is used to keep track of the level of macro definitions.
- Increase LEVEL by 1 each time a MACRO directive is read.
- Decrease LEVEL by 1 each time an MEND directive is read.
- An MEND terminates the whole macro definition process only when LEVEL reaches 0.

This process is very much like matching left and right parentheses when scanning an arithmetic expression.

Algorithm
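The procedures and the LEVEL counter can be put together in a compact Python sketch. It simplifies heavily and should be read as an illustration only: DEFTAB and NAMTAB are merged into one dictionary, invocation statements are assumed to carry no labels, parameters are substituted by name rather than by position, and the function name `run` is hypothetical.

```python
def run(source):
    """One-pass sketch: define macros, expand invocations, copy other lines."""
    deftab = {}               # macro name -> (parameter list, body lines)
    output = []
    pending = list(source)    # GETLINE takes the next line from the front

    while pending:
        fields = pending.pop(0).split()
        if not fields:
            continue
        if len(fields) >= 2 and fields[1] == "MACRO":        # DEFINE
            params = fields[2].split(",") if len(fields) > 2 else []
            body, level = [], 1
            while level > 0:
                line = pending.pop(0)
                inner = line.split()
                if len(inner) >= 2 and inner[1] == "MACRO":
                    level += 1                # nested definition starts
                elif inner and inner[0] == "MEND":
                    level -= 1                # only LEVEL 0 ends the outer macro
                if level > 0:
                    body.append(line)
            deftab[fields[0]] = (params, body)
        elif fields[0] in deftab:                            # EXPAND
            params, body = deftab[fields[0]]
            args = fields[1].split(",") if len(fields) > 1 else []
            expanded = []
            for line in body:
                for param, arg in zip(params, args):
                    line = line.replace(param, arg)
                expanded.append(line)
            pending[0:0] = expanded   # expanded lines are processed next
        else:
            output.append(" ".join(fields))
    return output


program = [
    "MACROS MACRO",
    "RDBUFF MACRO &DEV",
    "TD =X'&DEV'",
    "MEND",
    "MEND",
    "MACROS",
    "RDBUFF F1",
    "END",
]
listing = run(program)
```

Pushing the expanded lines back onto the front of the input is what lets the inner RDBUFF definition be processed when MACROS is invoked, and not before.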
Use a special concatenation operator, ->, to mark the end of the parameter name, e.g., X&ID->1.

Example of Concatenation
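A short Python sketch shows why the operator is needed. It assumes, for illustration only, that parameter names consist of an uppercase letter followed by uppercase letters or digits; the function name `substitute` and the two sample parameters are hypothetical.

```python
import re


def substitute(line, args):
    """Replace &NAME with its value; a trailing -> only ends the name."""
    pattern = r"&([A-Z][A-Z0-9]*)(?:->)?"
    return re.sub(pattern, lambda m: args[m.group(1)], line)


args = {"ID": "A", "ID1": "B"}
with_operator = substitute("LDA X&ID->1", args)      # &ID followed by the character 1
without_operator = substitute("LDA X&ID1", args)     # read as the parameter &ID1
```

Without the operator the scanner has no way to tell that the 1 is not part of the parameter name.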
Generation of Unique Labels

Labels in the macro body cause a duplicate-label problem if the macro is invoked and expanded multiple times. Using relative addressing at the source statement level is very inconvenient, error-prone, and difficult to read.

It is highly desirable to
- let the programmer use labels in the macro body; labels used within the macro body begin with $;
- let the macro processor generate unique labels for each macro invocation and expansion.

During macro expansion, the $ is replaced with $xx, where xx is a two-character alphanumeric counter of the number of macro instructions expanded: xx = AA, AB, AC, ...

Labels Defined in Macro Body
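The label scheme can be sketched as follows. The ordering of the counter alphabet (letters A to Z, then digits 0 to 9) is an assumption; the text above only fixes the first values AA, AB, AC. The function names and the sample body are hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase + string.digits   # assumed ordering


def label_suffix(count):
    """Two-character counter: 0 -> AA, 1 -> AB, 2 -> AC, ..."""
    return ALPHABET[count // len(ALPHABET)] + ALPHABET[count % len(ALPHABET)]


def expand_with_labels(body, expansion_count):
    """Replace every $ in the body with $xx for this expansion."""
    tag = "$" + label_suffix(expansion_count)
    return [line.replace("$", tag) for line in body]


body = ["$LOOP TD =X'F1'", "JEQ $LOOP"]
first = expand_with_labels(body, 0)
second = expand_with_labels(body, 1)
```

Each expansion gets its own suffix, so the same macro can be invoked repeatedly without producing duplicate labels.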
Conditional Macro Expansion

Arguments in a macro invocation can be used to
- substitute for the parameters in the macro body without changing the sequence of statements expanded;
- modify the sequence of statements for conditional macro expansion (or conditional assembly, when related to an assembler). This capability adds greatly to the power and flexibility of a macro language.

Macro-time conditional structures
- IF-ELSE-ENDIF
- WHILE-ENDW

Example of Conditional Macro Expansion

Two additional parameters are used in the example of conditional macro expansion:
- &EOR: specifies a hexadecimal character code that marks the end of a record
- &MAXLTH: specifies the maximum length of a record

A macro-time variable (set symbol)
- can be used to store working values during the macro expansion, to store the evaluation result of a Boolean expression, and to control the macro-time conditional structures;
- begins with & and is not a macro instruction parameter;
- is initialized to a value of 0;
- is set by a macro processor directive, SET.
Implementation of Conditional Macro Expansion (IF-ELSE-ENDIF Structure)

A symbol table
- This table contains the values of all macro-time variables used.
- Entries in this table are made or modified when SET statements are processed.
- This table is used to look up the current value of a macro-time variable whenever it is required.

When an IF statement is encountered during the expansion of a macro, the specified Boolean expression is evaluated.
- TRUE: the macro processor continues to process lines from DEFTAB until it encounters the next ELSE or ENDIF statement. If an ELSE is encountered, it then skips to the ENDIF.
- FALSE: the macro processor skips ahead in DEFTAB until it finds the next ELSE or ENDIF statement.

Conditional Macro Expansion vs. Conditional Jump Instructions

The testing of the Boolean expression in IF statements occurs at the time macros are expanded.
- By the time the program is assembled, all such decisions have been made.
- There is only one sequence of source statements during program execution.

In contrast, the COMPR instruction tests data values during program execution. The sequence of statements executed during program execution may differ from run to run.
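The IF-ELSE-ENDIF handling can be sketched in Python. The sketch assumes a restricted statement syntax chosen for illustration: conditions have the form `(&VAR EQ value)` or `(&VAR NE value)`, SET statements have the form `&VAR SET value`, values are compared as strings, and IF blocks do not nest. The function names are hypothetical.

```python
def evaluate(condition, symtab):
    """Evaluate '(&VAR EQ value)' or '(&VAR NE value)' from the symbol table."""
    var, op, value = condition.strip("()").split()
    current = symtab.get(var, "0")      # macro-time variables start at 0
    return current == value if op == "EQ" else current != value


def expand_conditional(body, symtab):
    """Choose which body lines to generate; decisions are made at expansion time."""
    out, skipping = [], False
    for line in body:
        fields = line.split(None, 1)
        if fields[0] == "IF":
            skipping = not evaluate(fields[1], symtab)
        elif fields[0] == "ELSE":
            skipping = not skipping     # switch to the other branch
        elif fields[0] == "ENDIF":
            skipping = False
        elif skipping:
            continue                    # line belongs to the branch not taken
        elif len(fields) > 1 and fields[1].startswith("SET "):
            symtab[fields[0]] = fields[1].split()[1]
        else:
            out.append(line)
    return out


body = ["&EORCK SET 1", "IF (&EORCK EQ 1)", "LDCH =X'04'", "ELSE",
        "CLEAR A", "ENDIF", "TD INPUT"]
taken = expand_conditional(body, {})
not_taken = expand_conditional(body[1:], {})   # &EORCK keeps its initial value 0
```

Only one branch survives into the generated program; nothing of the IF itself remains for the assembler to see.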
WHILE-ENDW Structure
Implementation of Conditional Macro Expansion (WHILE-ENDW Structure)

When a WHILE statement is encountered during the expansion of a macro, the specified Boolean expression is evaluated.
- TRUE: the macro processor continues to process lines from DEFTAB until it encounters the next ENDW statement. When ENDW is encountered, the macro processor returns to the preceding WHILE, re-evaluates the Boolean expression, and takes action again.
- FALSE: the macro processor skips ahead in DEFTAB until it finds the next ENDW statement and then resumes normal macro expansion.
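The WHILE-ENDW loop can be sketched the same way. The statement forms used here are simplified assumptions for the example: `WHILE &VAR LT n` for the test, and `&VAR SET n` or `&VAR SET &OTHER+n` for assignment, with integer values and no nested loops. The function name and the step limit are hypothetical.

```python
def expand_while(body, symtab, max_steps=10_000):
    """Expand a body containing non-nested WHILE ... ENDW blocks."""
    out, i, loop_start = [], 0, None
    for _ in range(max_steps):          # guard against a loop that never ends
        if i >= len(body):
            return out
        fields = body[i].split()
        if fields[0] == "WHILE":
            if symtab.get(fields[1], 0) < int(fields[3]):
                loop_start = i          # TRUE: remember where to come back to
                i += 1
            else:                       # FALSE: resume just past the ENDW
                i = body.index("ENDW", i) + 1
        elif fields[0] == "ENDW":
            i = loop_start              # return to the preceding WHILE
        elif len(fields) == 3 and fields[1] == "SET":
            if "+" in fields[2]:
                var, step = fields[2].split("+")
                symtab[fields[0]] = symtab.get(var, 0) + int(step)
            else:
                symtab[fields[0]] = int(fields[2])
            i += 1
        else:
            out.append(body[i])
            i += 1
    raise RuntimeError("macro-time loop did not terminate")


body = ["WHILE &CTR LT 3", "RESW 1", "&CTR SET &CTR+1", "ENDW", "RSUB"]
generated = expand_while(body, {})
```

The loop runs entirely inside the macro processor, so the assembler sees three RESW statements and no trace of the WHILE.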
Keyword Macro Parameters

Positional parameters
- Parameters and arguments are associated according to their positions in the macro prototype and invocation.
- If an argument is to be omitted, a null argument should be used to maintain the proper order in the macro invocation. For example: GENER ,,DIRECT,,,,,,3.
- This is not suitable if a macro has a large number of parameters and only a few of these are given values in a typical invocation.

Keyword parameters
- Each argument value is written with a keyword that names the corresponding parameter.
- Arguments may appear in any order.
- Null arguments no longer need to be used. For example: GENER TYPE=DIRECT,CHANNEL=3.
- This is easier to read and much less error-prone than the positional method.

Example of Keyword Parameters

Default values of parameters
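Binding keyword arguments, with defaults taken from the prototype, can be sketched as below. The parameter names TYPE and CHANNEL come from the GENER example above; the MODE parameter, all default values, and the function name `bind_keyword_args` are hypothetical, and argument values containing commas are not handled.

```python
def bind_keyword_args(prototype, invocation):
    """Build the argument table from KEY=value pairs, filling in defaults."""
    table = {}
    for item in prototype.split(","):         # e.g. "&CHANNEL=1"
        name, _, default = item.partition("=")
        table[name] = default
    for item in invocation.split(","):        # e.g. "CHANNEL=3"
        if not item:
            continue
        key, _, value = item.partition("=")
        if "&" + key not in table:
            raise ValueError("unknown keyword parameter: " + key)
        table["&" + key] = value
    return table


bound = bind_keyword_args("&TYPE=,&CHANNEL=1,&MODE=SEQ",
                          "CHANNEL=3,TYPE=DIRECT")
```

The arguments are given in a different order from the prototype, and MODE is simply left out and keeps its default.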
RDCHAR: reads one character from a specified device into register A. It should be defined beforehand (i.e., before RDBUFF).

Implementation of Recursive Macro Expansion

The previous macro processor design cannot handle this kind of recursive macro invocation and expansion, e.g., RDBUFF BUFFER,LENGTH,F1.

Reasons:
- The procedure EXPAND would be called recursively, so the invocation arguments in ARGTAB would be overwritten.
- The Boolean variable EXPANDING would be set to FALSE when the inner macro expansion is finished; that is, the macro processor would forget that it had been in the middle of expanding an outer macro.
- A similar problem would occur with PROCESSLINE, since this procedure too would be called recursively.
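The overwriting of ARGTAB goes away if every expansion keeps its own argument table. The Python sketch below relies on the language's recursive calls to do this; it is an illustration only. The two macro bodies are abbreviated stand-ins rather than the full definitions, `MACRO_TABLE` and `expand` are hypothetical names, and the depth limit is an arbitrary safeguard.

```python
# Abbreviated, illustrative bodies: RDBUFF invokes RDCHAR from inside its body.
MACRO_TABLE = {
    "RDCHAR": (["&IN"], ["TD =X'&IN'", "RD =X'&IN'"]),
    "RDBUFF": (["&BUFADR", "&RECLTH", "&INDEV"],
               ["CLEAR X", "RDCHAR &INDEV", "STCH &BUFADR,X", "STX &RECLTH"]),
}


def expand(name, args, depth=0):
    """Expand a macro; nested invocations are expanded by a recursive call."""
    if depth > 20:
        raise RecursionError("macro expansion nested too deeply")
    params, body = MACRO_TABLE[name]
    argtab = dict(zip(params, args))    # local to this call, never overwritten
    out = []
    for line in body:
        for param, arg in argtab.items():
            line = line.replace(param, arg)
        fields = line.split()
        if fields[0] in MACRO_TABLE:    # invocation inside the macro body
            inner_args = fields[1].split(",") if len(fields) > 1 else []
            out.extend(expand(fields[0], inner_args, depth + 1))
        else:
            out.append(line)
    return out


lines = expand("RDBUFF", ["BUFFER", "LENGTH", "F1"])
```

When the inner call for RDCHAR returns, the outer call still has its own `argtab`, so BUFFER and LENGTH are substituted correctly in the remaining lines.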
Solutions:
- Write the macro processor in a programming language that allows recursive calls, so that local variables are retained.
- Use a stack to take care of pushing and popping local variables and return addresses.

Another problem: can a macro invoke itself recursively?

General-Purpose Macro Processors

Goal: macro processors that do not depend on any particular programming language, but can be used with a variety of different languages.

Pros
- Programmers do not need to learn many macro languages.
- Although its development costs are somewhat greater than those for a language-specific macro processor, this expense does not need to be repeated for each language, which saves substantial overall cost.

Cons
- A large number of details must be dealt with in a real programming language:
  - situations in which normal macro parameter substitution should not occur, e.g., comments
  - facilities for grouping together terms, expressions, or statements
  - tokens, e.g., identifiers, constants, operators, keywords
  - syntax

Macro Processing within Language Translators

Macro processors can be
- Preprocessors
  - Process macro definitions.
  - Expand macro invocations.
  - Produce an expanded version of the source program, which is then used as input to an assembler or compiler.
- Line-by-line macro processors, used as a sort of input routine for the assembler or compiler
  - Read the source program.
  - Process macro definitions and expand macro invocations.
  - Pass output lines to the assembler or compiler.
- Integrated macro processors

Line-by-Line Macro Processor

Benefits
- It avoids making an extra pass over the source program.
- Data structures required by the macro processor and the language translator can be combined (e.g., OPTAB and NAMTAB).
- Utility subroutines can be used by both the macro processor and the language translator:
  - scanning input lines
  - searching tables
  - data format conversion
- It is easier to give diagnostic messages related to the source statements.
Integrated Macro Processor An integrated macro processor can potentially make use of any information about the source program that is extracted by the language translator.
As an example in FORTRAN:
- DO 100 I = 1,20 is a DO statement, in which DO is a keyword, 100 is a statement number, and I is a variable name.
- DO 100 I = 1 is an assignment statement, in which DO100I is a variable (blanks are not significant in FORTRAN).

An integrated macro processor can support macro instructions that depend upon the context in which they occur.

Drawbacks of Line-by-Line or Integrated Macro Processor
- They must be specially designed and written to work with a particular implementation of an assembler or compiler.
- The cost of macro processor development is added to the cost of the language translator, which results in more expensive software.
- The assembler or compiler will be considerably larger and more complex.
printf(ABSDIFF(3,8)
Conditional compilation statements

Example 1:
    #ifndef BUFFER_SIZE
    #define BUFFER_SIZE 1024
    #endif

Example 2:
    #define DEBUG 1
    :
    #if DEBUG == 1
    printf()  /* debugging output */
    #endif

Miscellaneous functions of the preprocessor of ANSI C
- Trigraph sequences are replaced by their single-character equivalents, e.g., ??< is replaced by {.
- Any source line that ends with a backslash, \, and a newline is spliced together with the following line.
- Any source files included in response to an #include directive are processed.
- Escape sequences are converted, e.g., \n, \0.
- Adjacent string literals are concatenated, e.g., "hello," " world" becomes "hello, world".