The Standard C Library
THE STANDARD C LIBRARY shows you how to use all of the library functions mandated by
the ANSI and ISO Standards for the programming language C. To help you understand how
to use the library, this book also shows you how to implement it. You see approximately
9,000 lines of tested, working code that is highly portable across diverse computer
architectures.
THE STANDARD C LIBRARY explains how the library was meant to be used and how it can
be used. It places particular emphasis on features added to C as part of the C Standard.
These features include support for multiple locales (cultural conventions) and very large
character sets (such as Kanji).
The code presented in this book has been tested with C compilers from Borland, Saber,
Project GNU, Sun, UNIX, and VAX ULTRIX. It has passed the widely used Plum Hall
Validation Suite tests for library functions. It has also survived an assortment of
public-domain programs designed to stress C implementations and illuminate their darker corners.
The mathematical functions are particularly well-engineered and tested.
Finally, THE STANDARD C LIBRARY shows you many principles of library design in
general. You learn how to design and implement libraries that are highly cohesive and
reusable.
P.J. Plauger is one of the original users of the C programming language. He chaired the
Library Subcommittee of X3J11, the ANSI-authorized committee that developed the C
Standard. He continues as Secretary to X3J11 and Convenor of WG14, the ISO-authorized
committee developing further enhancements to the C Standard. Dr. Plauger is co-author
(with Brian Kernighan) of several highly acclaimed books, including SOFTWARE TOOLS,
SOFTWARE TOOLS IN PASCAL, and THE ELEMENTS OF PROGRAMMING STYLE. With
Jim Brodie, Chair of X3J11, he co-authored STANDARD C, a complete reference to the C
Programming Language.
PRENTICE HALL P T R
Englewood Cliffs, NJ 07632
THE STANDARD C LIBRARY
P.J. Plauger
Prentice Hall P T R
Englewood Cliffs, New Jersey 07632
Plauger, P. J.
The Standard C library / P. J. Plauger.
p. cm.
Includes bibliographical references and index.
ISBN 0-13-838012-0 (casebound). -- ISBN 0-13-131509-9 (paperbound)
1. C (Computer program language) I. Title.
QA76.73.C15P563 1991
005.13'3--dc20 91-31884
CIP
© 1992 by P. J. Plauger
for Tana
PERMISSIONS
Excerpts from the ISO C Standard, ISO/IEC 9899:1990, reprinted by permission of the
International Standards Organization, Geneva. The complete Standard, and the other ISO
standards referred to in this book, may be purchased from the ISO member bodies or directly
from:
ISO Central Secretariat
Case postale 56
1211 Geneva 20
SWITZERLAND
Excerpts from William J. Cody, Jr. and William Waite, Software Manual for the Elementary
Functions, © 1980, pp. 44, 69, 162, 183, 196, 206, 226, and 246 reprinted by permission of
Prentice-Hall, Englewood Cliffs, New Jersey.
Excerpts from P.J. Plauger and Jim Brodie, Standard C, reprinted by permission of the
authors.
Excerpts from P.J. Plauger, Standard C, monthly column in The C Users Journal, reprinted by
permission of the author.
TRADEMARKS
Compaq SLT/386-20s is a trademark of Compaq Computer Corporation.
Corel Draw is a trademark of Corel Systems.
IBM PC and System/370 are trademarks of IBM Corporation.
Macintosh is a trademark of Apple Computer.
MS-DOS, and Windows are trademarks of Microsoft Corporation.
Multics is a trademark of Honeywell Bull.
PDP-11, RSX-11M, ULTRIX, and VAX are trademarks of Digital Equipment Corporation.
Turbo C++ is a trademark of Borland, International.
UNIX is a trademark of AT&T Bell Laboratories.
Ventura Publisher is a trademark of Ventura Software Inc.
TYPOGRAPHY
This book was typeset in Palatino, Avant Garde, and Courier bold by the author using a
Compaq SLT/386-20s computer running Ventura Publisher 3.0 and Corel Draw 2.0 under
Microsoft Windows 3.0.
Contents
Preface
    The Code
    Acknowledgments
Chapter 0: Introduction
    Background
    What the C Standard Says
    Using the Library
    Implementing the Library
    Testing the Library
    References
    Exercises
Chapter 1: <assert.h>
    Background
    What the C Standard Says
    Using <assert.h>
    Implementing <assert.h>
    Testing <assert.h>
    References
    Exercises
Chapter 2: <ctype.h>
    Background
    What the C Standard Says
    Using <ctype.h>
    Implementing <ctype.h>
    Testing <ctype.h>
    References
    Exercises
Chapter 3: <errno.h>
    Background
    What the C Standard Says
    Using <errno.h>
    Implementing <errno.h>
    Testing <errno.h>
    References
    Exercises
Chapter 4: <float.h>
    Background
    What the C Standard Says
    Using <float.h>
    Implementing <float.h>
    Testing <float.h>
    References
    Exercises
Chapter 5: <limits.h>
    Background
    What the C Standard Says
    Using <limits.h>
    Implementing <limits.h>
    Testing <limits.h>
    References
    Exercises
Chapter 6: <locale.h>
    Background
    What the C Standard Says
    Using <locale.h>
    Implementing <locale.h>
    Testing <locale.h>
    References
    Exercises
Chapter 7: <math.h>
    Background
    What the C Standard Says
    Using <math.h>
    Implementing <math.h>
    Testing <math.h>
    References
    Exercises
Chapter 8: <setjmp.h>
    Background
    What the C Standard Says
    Using <setjmp.h>
    Implementing <setjmp.h>
    Testing <setjmp.h>
    References
    Exercises
Chapter 9: <signal.h>
    Background
    What the C Standard Says
    Using <signal.h>
    Implementing <signal.h>
    Testing <signal.h>
    References
    Exercises
Chapter 10: <stdarg.h>
    Background
    What the C Standard Says
    Using <stdarg.h>
    Implementing <stdarg.h>
    Testing <stdarg.h>
    References
    Exercises
Chapter 11: <stddef.h>
    Background
    What the C Standard Says
    Using <stddef.h>
    Implementing <stddef.h>
    Testing <stddef.h>
    References
    Exercises
Chapter 12: <stdio.h>
    Background
    What the C Standard Says
    Using <stdio.h>
    Implementing <stdio.h>
    Testing <stdio.h>
    References
    Exercises
Chapter 13: <stdlib.h>
    Background
    What the C Standard Says
    Using <stdlib.h>
    Implementing <stdlib.h>
    Testing <stdlib.h>
    References
    Exercises
Chapter 14: <string.h>
    Background
    What the C Standard Says
    Using <string.h>
    Implementing <string.h>
    Testing <string.h>
    References
    Exercises
Chapter 15: <time.h>
    Background
    What the C Standard Says
    Using <time.h>
    Implementing <time.h>
    Testing <time.h>
    References
    Exercises
Appendix A: Interfaces
Appendix B: Names
Appendix C: Terms
Index
Preface
This book shows you how to use all the library functions mandated by
the ANSI and ISO Standards for the programming language C. I have
chosen to focus on the library exclusively, since many other books describe
the language proper. The book also shows you how to implement the
library. I present about 9,000 lines of tested, working code. I believe that
seeing a realistic implementation of the Standard C library can help you
better understand how to use it.
As much as possible, the code for the library is written in Standard C.
The primary design goal is to make the code as readable and as exemplary
as possible. A secondary goal is to make the code highly portable across
diverse computer architectures. Still another goal is to present code that
makes sensible tradeoffs between accuracy, performance, and size.
Teaching you how to write C is not a goal of this book. I assume you
know enough about C to read straightforward code. Where the code
presented is not so straightforward, I explain the trickery involved.
The Standard C library is fairly ambitious. It provides considerable
power in many different environments. It promises well-defined name
spaces for both user and implementor. It imposes fairly strict requirements
on the robustness and precision of its mathematical functions. And it
pioneers in supporting code that adapts to varied cultures, including those
with very large character sets.
To benefit from these ambitions, a user should be aware of numerous
subtleties. To satisfy these ambitions, an implementor must provide for
them. These subtleties are not always addressed in the C Standard proper.
It is not the primary purpose of a standard to educate implementors. Nor
are many of these subtleties well explained in the Rationale that accompanies the ANSI C Standard. A Rationale must serve several masters, only
one of whom is the inquisitive implementor.
The pioneering features I mentioned above are not found in traditional
implementations of C. An implementation can now support multiple locales. Each locale captures numerous conventions peculiar to a country,
language, or profession. A C program can alter and query locales to adapt
dynamically to a broad range of cultures. An implementation can also now
support very large character sets, such as the Kanji characters used in Japan.
A C program can manipulate such character sets either as multibyte characters or as wide characters. It can also translate between these two forms. That
simplifies, and standardizes, the writing of programs for this rapidly growing marketplace.
Little or no prior art exists for these new features. Hence, even the most
experienced C programmers need guidance in using locales, multibyte
characters, and wide characters. Particular attention is given here to these
topics.
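As a rough illustration of the facilities just described, the sketch below queries the current locale and translates a multibyte string to wide characters. The function names current_locale_name, current_decimal_point, and to_wide are hypothetical wrappers invented for this example; only setlocale, localeconv, and mbstowcs come from the Standard C library.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Query, without changing, the name of the current locale.
   (hypothetical wrapper around setlocale) */
const char *current_locale_name(void)
{
    return setlocale(LC_ALL, NULL);
}

/* First character of the decimal-point string for the current locale.
   (hypothetical wrapper around localeconv) */
char current_decimal_point(void)
{
    return localeconv()->decimal_point[0];
}

/* Translate a multibyte string to wide characters. Returns the number
   of wide characters stored, not counting any terminating null, or
   (size_t)-1 if src contains an invalid multibyte sequence. */
size_t to_wide(wchar_t *dst, const char *src, size_t max)
{
    assert(dst != NULL && src != NULL);
    return mbstowcs(dst, src, max);
}
```

A program that never calls setlocale runs in the "C" locale, where the decimal point is a period and each character of the basic character set occupies one byte. Calling setlocale(LC_ALL, "") before these functions would let them reflect the user's environment instead.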
This book explains, for users and implementors alike, how the library
was meant to be used and how it can be used. By providing a working
implementation of all the functions in the Standard C library, the book
shows by example how to deal with their subtleties. Where no implementation is clearly the best, it also discusses alternatives and tradeoffs.
An example of the subtleties involved is the function getchar. The
header <stdio.h> can, in principle, mask its declaration with the macro:
#define getchar() fgetc(stdin)    /* NOT WISE! */
It must not do so, however. A valid (if useless) C program is:
#include <stdio.h>
#undef fgetc
int main(void) {
    int fgetc = getchar();
    return (0);
}
It is a far cry from the obvious (and more readable) form first presented
above. Chapter 12: <stdio.h> helps explain why.
Still another purpose of this book is to teach programmers how to design
and implement libraries in general. By its very nature, the library provided
with a programming language is a mixed bag. An implementor needs a
broad spectrum of skills to deal with the varied contents of the bag. It is not
enough to be a competent numerical analyst, or to be skilled in manipulating character strings efficiently, or to be knowledgeable in the ways of
operating system interfacing. Writing a library demands all these skills and
more.
Good books have been written on how to write mathematical functions.
Other books present specialized libraries for a variety of purposes. They
show you how to use the library presented. Some may even justify many
The Code
The code presented in this book has been tested with C compilers from
Borland, Project GNU, and VAX ULTRIX. It has passed the widely used
Plum Hall Validation Suite tests for library functions. It has also survived
an assortment of public-domain programs designed to stress C implementations and illuminate their darker corners. While I have taken pains to
minimize errors, I cannot guarantee that none remain. Please note the
disclaimer on the copyright page.
Please note also that the code in this book is protected by copyright. It
has not been placed in the public domain. Nor is it shareware. It is not
protected by a "copyleft" agreement, like code distributed by the Free
Software Foundation (Project GNU). I retain all rights.
You are welcome to transcribe the code to machine-readable form for
your personal use. You can purchase the code in machine-readable form from
The C Users Group in Lawrence, Kansas. In either case, what you do with
the code is limited by the "fair use" provisions of copyright law. Fair use
does not permit you to distribute copies of the code, either hard copy or
machine-readable, either free or for a fee.
Having said that, I do permit one important usage that goes well beyond
fair use. You can compile portions of the library and link the resultant
binary object modules with your own code to form an executable file. I
hereby permit you to distribute unlimited copies of such an executable file.
I ask no royalty on any such copies. I do, however, require that you
document the presence of the library, whatever amount you use, either
modified or unmodified. Please include somewhere in the executable file
the following sequence of characters: Portions of this work are derived
from The Standard C Library, copyright (c) 1992 by P.J. Plauger,
published by Prentice-Hall, and are used with permission. The same
message should appear prominently, and in an appropriate place, on any
documentation that you distribute with the executable image. If you omit
either message, you infringe the copyright.
You can also obtain permission to do more. You can distribute the entire
library in the form of binary object modules. You can even distribute copies
of the source files from this book, either modified or unmodified. You can,
in short, incorporate the library into a product that lets people use it to make
executable programs. To do so, however, requires a license. You pay a fee
for the license. Contact Plum Hall Inc. in Kamuela, Hawaii for licensing
terms and for on-going support of the library.
Despite the mercenary tone of these paragraphs, my primary goal is not
to flog a commercial product. I believe strongly in the C Standard, having
worked very hard to help bring it about. Much of my effort went into
developing the specification for the Standard C library. I want to prove that
we have constructed a good language standard. I wrote this implementation, and this book, to demonstrate that simple but important fact.
Acknowledgments
Compass, Inc. of Wakefield, Massachusetts believed in this project long
before it was completed. They are my first customer for the library code.
They helped test, debug, and improve the library extensively in the process
of accepting it for use with their Intel 860 compiler. Ian Wells, in particular,
bore the brunt of my delays and botches with good-natured professionalism. Don Anderson contributed many a midnight e-mail message toward
making this library hang together properly. For their faith and patience, I
heartily thank everyone I have worked with at Compass.
Paul Becker, my Publisher at Prentice-Hall, also believed in this project.
His gentle but persistent goading was instrumental in bringing this book
to completion. The (anonymous) reviewers he employed helped me
sharpen my focus and tone down some of the more extreme prose. Paul's
professionalism reminded me why Prentice-Hall has been such a major
force in technical publishing for so long.
Moving to Australia for a year part way through this project presented
a bouquet of impediments. My good friend and business colleague John
O'Brien of Whitesmiths, Australia, was always there to help. For turning
thorns into roses, he has been nonpareil. His assistance has surpassed the
bounds even of friendship.
Andrew Binnie, Publishing Manager at Prentice Hall Australia kindly
provided the laser printer I needed to finish this book. He was quick to help
in many ways. The University of New South Wales Computer Science
Department graciously gave me the time and space I needed, even though
they had other plans for both.
Tom Plum has forced many of us to think deeply about fundamental
aspects of C. I have enjoyed numerous fruitful discussions with him on the
topics covered here. Dave Prosser has also freely shared his deep insights
into the workings of C. As editor of both the ANSI and ISO C Standards,
Dave provided the machine-readable text excerpted extensively in this
book. Advanced Data Controls Corp. of Tokyo, Japan pioneered Kanji
support in C. Takashi Kawahara and Hiroshi Fukutomi, both principals in
that company, have been very helpful in educating me on the technical
needs of Japanese programmers.
Much of the material presented here first appeared in monthly installments in The C Users Journal. Robert Ward has been a particularly easy
publisher to work with. I appreciate his flexibility in letting me recycle
material from his magazine. Jim Brodie has been equally generous in
permitting me to use material from our book Standard C.
Reading technical manuscripts is never an easy task. Both John O'Brien
and Tom Plum reviewed portions of this book and provided helpful feedback. Those who caught (some of the numerous) errors in the first printing
include Nelson H.F. Beebe, Peter Chubb, Stephen D. Clamage, Steven
Pemberton, Tom Plum, and Ian Lance Taylor.
Finally, I happily acknowledge the contributions made by my family. My
son, Geoffrey, helped with the layout and typographic design of this book.
My wife, Tana, provided much-needed moral and logistical support over
many long months. They, more than anybody, kept this project fun for me.
P.J. Plauger
Bondi, New South Wales
Chapter 0: Introduction
Background
A library is a collection of program components that can be reused in
many programs. Most programming languages include some form of
library. The programming language C is no exception. It began accreting
useful functions right from the start. These functions help you classify
characters, manipulate character strings, read input, and write output, to
name just a few categories of services.
You must declare a typical function before you use it in a program. The
easiest way to do so is to incorporate into the program a header that declares
all the library functions in a given category. A header can also define any
associated type definitions and macros. A header is as much a part of the
library as the functions themselves. Most often, a header is a text file just
like the ones you write to make a program.
You use the #include directive in a C source file to make a header part
of the translation unit. For example, the header <stdio.h> declares functions
that perform input and output. A program that prints a simple message
with the function printf consists of the single C source file:
/* a simple test program */
#include <stdio.h>

int main(void)
{   /* say hello */
    printf("Hello\n");
    return (0);
}
You can construct your own libraries. A typical C compiler has a librarian,
a program that assembles a library from the object modules you specify.
The linker knows to select from any library only the object modules used
by the program. The C library is not a special case.
You can write part or all of a library in C. The translation unit you write
to make a library object module is not that unusual:
A library object module should contain no definition of the function main
with external linkage. A programmer is unlikely to reuse code that insists
on taking control at program startup.
The object module should contain only functions that are easy to declare
and use. Provide a header that declares the functions and defines any
associated types and macros.
Most important, a library object module should be usable in a variety of
contexts. Writing code that is highly reusable is a skill you develop only
with practice and by studying successful libraries.
After you have read this book, you should be comfortable designing,
writing, and constructing specialized libraries in C.
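The guidelines above can be sketched in a single listing. Everything here is an assumption made for illustration: the file names tally.h and tally.c, the type Tally, and the three tally_ functions are hypothetical and are not part of the library presented in this book. The translation unit defines no main, exposes only functions that are easy to declare, and gathers the type and prototypes where a header would hold them.

```c
#include <assert.h>
#include <stddef.h>

/* ---- what would live in a hypothetical header "tally.h" ---- */
typedef struct {
    long count;     /* number of values added so far */
    long sum;       /* running total of those values */
} Tally;

void tally_init(Tally *t);
void tally_add(Tally *t, long value);
long tally_mean(const Tally *t);

/* ---- what would live in the library source "tally.c" ---- */

/* Reset a tally to the empty state. */
void tally_init(Tally *t)
{
    assert(t != NULL);
    t->count = 0;
    t->sum = 0;
}

/* Record one more value. */
void tally_add(Tally *t, long value)
{
    assert(t != NULL);
    ++t->count;
    t->sum += value;
}

/* Integer mean of the values added, or 0 if none were added. */
long tally_mean(const Tally *t)
{
    assert(t != NULL);
    return t->count == 0 ? 0 : t->sum / t->count;
}
```

A program that uses the module supplies its own main, declares a Tally, and calls tally_init before anything else. Because the module keeps no hidden state, the same object code can serve any number of tallies in any number of programs.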
The C library itself is typically written in C. That is often not the case
with other programming languages. Earlier languages had libraries written
in assembly language. Different computer architectures have different
assembly languages. To move the library to another computer architecture,
you had to rewrite it completely. C lets you write powerful and efficient
code that is also highly portable. You can move portable code simply by
translating it with a different C translator.
Here, for example, is the library function strlen, declared in <string.h>.
The function returns the length of a null-terminated string. Its pointer
argument points to the first element of the string:
/* strlen function */
#include <string.h>

size_t (strlen)(const char *s)
{   /* find length of s[] */
    const char *sc;

    for (sc = s; *sc != '\0'; ++sc)
        ;
    return (sc - s);
}
strlen is a small function, one fairly easy to write. It is also fairly easy to
write incorrectly in many small ways. strlen is widely used. You might
want to provide a special version tuned to a given computer architecture.
But you don't have to. This version is correct, portable, and reasonably
efficient.
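One way to guard against the small mistakes mentioned above is to run any candidate version against a handful of edge cases. The sketch below is only that, a sketch: check_strlen is a hypothetical helper, and the cases it tries are an assumption about what is worth checking, not an exhaustive test.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if len() gives the expected length for a few edge cases,
   0 otherwise. (hypothetical helper, not part of the library) */
int check_strlen(size_t (*len)(const char *))
{
    /* counting must stop at the first null character */
    static const char with_null[] = "ab\0cd";
    char big[1000];

    assert(len != NULL);
    memset(big, 'x', sizeof big - 1);   /* 999 non-null characters */
    big[sizeof big - 1] = '\0';

    return len("") == 0
        && len("a") == 1
        && len(with_null) == 2
        && len(big) == sizeof big - 1;
}
```

Passing the name strlen as an argument designates the true library function, whether or not <string.h> also defines a macro of that name, so check_strlen(strlen) exercises the function itself.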
Other contemporary languages cannot be used to write significant
portions of their own libraries. You cannot, for example, write the Pascal
library function writeln in portable Pascal. By contrast, you can write the
You will find the C Standard hard to read from time to time. Remember
that it is cast intentionally in a kind of legalese. A standard must be precise
and accurate first. Readability comes a distant second. The document is not
intended to be tutorial. X3J11 also produced a Rationale to accompany the
C Standard. If you are curious about why X3J11 made certain decisions, go
read that document. It might help. I emphasize, however, that the Rationale
is also not a tutorial on the C language.
Here are two quotes from the ISO C Standard. The first quote introduces
the Library section of the C Standard. It provides a few definitions and lays
down several important ground rules that affect the library as a whole.
7. Library
7.1 Introduction
7.1.1 Definitions of terms
A string is a contiguous sequence of characters terminated by and including the first null
character. A "pointer to" a string is a pointer to its initial (lowest addressed) character. The "length"
of a string is the number of characters preceding the null character and its "value" is the sequence
of the values of the contained characters, in order.
A letter is a printing character in the execution character set corresponding to any of the 52
required lowercase and uppercase letters in the source character set, listed in 5.2.1.
The decimal-point character is the character used by functions that convert floating-point
numbers to or from character sequences to denote the beginning of the fractional part of such
character sequences.88 It is represented in the text and examples by a period, but may be changed
by the setlocale function.
Forward references: character handling (7.3), the setlocale function (7.4.1.1).
7.1.2 Standard headers
Each library function is declared in a header,89 whose contents are made available by the
#include preprocessing directive. The header declares a set of related functions, plus any
necessary types and additional macros needed to facilitate their use.
The standard headers are
<assert.h>  <ctype.h>  <errno.h>  <float.h>  <limits.h>
<locale.h>  <math.h>  <setjmp.h>  <signal.h>  <stdarg.h>
<stddef.h>  <stdio.h>  <stdlib.h>  <string.h>  <time.h>
If a file with the same name as one of the above < and > delimited sequences, not provided as
part of the implementation, is placed in any of the standard places for a source file to be included,
the behavior is undefined.
Headers may be included in any order; each may be included more than once in a given scope,
with no effect different from being included only once, except that the effect of including
<assert.h> depends on the definition of NDEBUG. If used, a header shall be included outside
of any external declaration or definition, and it shall first be included before the first reference to
any of the functions or objects it declares, or to any of the types or macros it defines. However,
if the identifier is declared or defined in more than one header, the second and subsequent
associated headers may be included after the initial reference to the identifier. The program shall
not have any macros with names lexically identical to keywords currently defined prior to the
inclusion.
Forward references: diagnostics (7.2).
7.1.3 Reserved identifiers
Each header declares or defines all identifiers listed in its associated subclause, and optionally
declares or defines identifiers listed in its associated future library directions subclause and
identifiers which are always reserved either for any use or for use as file scope identifiers.
All identifiers that begin with an underscore and either an uppercase letter or another underscore
are always reserved for any use.
All identifiers that begin with an underscore are always reserved for use as identifiers with file
scope in both the ordinary identifier and tag name spaces.
Each macro name listed in any of the following subclauses (including the future library
directions) is reserved for any use if any of its associated headers is included.
All identifiers with external linkage in any of the following subclauses (including the future
library directions) are always reserved for use as identifiers with external linkage.90
Each identifier with file scope listed in any of the following subclauses (including the future
library directions) is reserved for use as an identifier with file scope in the same name space if
any of its associated headers is included.
No other identifiers are reserved. If the program declares or defines an identifier with the same
name as an identifier reserved in that context (other than as allowed by 7.1.7), the behavior is
undefined.91
Footnotes
88. The functions that make use of the decimal-point character are localeconv, fprintf,
fscanf, printf, scanf, sprintf, sscanf, vfprintf, vprintf, vsprintf,
atof, and strtod.
89. A header is not necessarily a source file, nor are the < and > delimited sequences in header
names necessarily valid source file names.
90. The list of reserved identifiers with external linkage includes errno, setjmp, and
va_end.
91. Since macro names are replaced whenever found, independent of scope and name space,
macro names matching any of the reserved identifier names must not be defined if an
associated header, if any, is included.
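The definition of a string in 7.1.1 has two consequences that are easy to overlook: the terminating null is part of the string but not of its length, and the value of a string ends at the first null. The sketch below checks both. The names string_extent and same_string_value are hypothetical helpers written for this illustration; only strlen and strcmp are library functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of characters a string occupies, counting the terminating
   null that the definition says the string includes. */
size_t string_extent(const char *s)
{
    assert(s != NULL);
    return strlen(s) + 1;
}

/* Nonzero if two strings have the same value. Anything stored after
   the first null character is not part of either string. */
int same_string_value(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}
```

For the literal "abc", string_extent yields 4 while strlen yields 3, which matches sizeof "abc". An array initialized with "ab\0cd" holds six characters, yet as a string its value is just "ab".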
The second quote describes ways to make use of the functions within
the Standard C library.
7.1.7 Use of library functions
Each of the following statements applies unless explicitly stated otherwise in the detailed
descriptions that follow. If an argument to a function has an invalid value (such as a value outside
the domain of the function, or a pointer outside the address space of the program, or a null pointer),
the behavior is undefined. If a function argument is described as being an array, the pointer actually
passed to the function shall have a value such that all address computations and accesses to objects
(that would be valid if the pointer did point to the first element of such an array) are in fact valid.
Any function declared in a header may be additionally implemented as a macro defined in the
header, so a library function should not be declared explicitly if its header is included. Any macro
definition of a function can be suppressed locally by enclosing the name of the function in
parentheses, because the name is then not followed by the left parenthesis that indicates expansion
of a macro function name. For the same syntactic reason, it is permitted to take the address of a
library function even if it is also defined as a macro.95 The use of #undef to remove any macro
definition will also ensure that an actual function is referred to. Any invocation of a library function
that is implemented as a macro shall expand to code that evaluates each of its arguments exactly
once, fully protected by parentheses where necessary, so it is generally safe to use arbitrary
expressions as arguments. Likewise, those function-like macros described in the following
subclauses may be invoked in an expression anywhere a function with a compatible return type
could be called.96 All object-like macros listed as expanding to integral constant expressions shall
additionally be suitable for use in #if preprocessing directives.
Provided that a library function can be declared without reference to any type defined in a
header, it is also permissible to declare the function, either explicitly or implicitly, and use it
without including its associated header. If a function that accepts a variable argument list is not
declared (explicitly or by including its associated header), the behavior is undefined.
Example
The function atoi may be used in any of several ways:
by use of its associated header (possibly generating a macro expansion)
    #include <stdlib.h>
    const char *str;
    /*...*/
    i = atoi(str);
by use of its associated header (assuredly generating a true function reference)
    #include <stdlib.h>
    #undef atoi
    const char *str;
    /*...*/
    i = atoi(str);
or
    #include <stdlib.h>
    const char *str;
    /*...*/
    i = (atoi)(str);
by explicit declaration
    extern int atoi(const char *);
    const char *str;
    /*...*/
    i = atoi(str);
by implicit declaration
    const char *str;
    /*...*/
    i = atoi(str);
Footnotes
95. This means that an implementation must provide an actual function for each library function,
even if it also provides a macro for that function.
96. Because external identifiers and some macro names beginning with an underscore are
reserved, implementations may provide special semantics for such names. For example, the
identifier _BUILTIN_abs could be used to indicate generation of in-line code for the abs
function. Thus, the appropriate header could specify
    #define abs(x) _BUILTIN_abs(x)
In this manner, a user desiring to guarantee that a given library function such as abs will
be a genuine function may write
    #undef abs
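The ways of reaching atoi through its header that 7.1.7 describes can be placed side by side in one translation unit. This is a minimal sketch: the four parse_ functions are hypothetical names chosen for the illustration, and whether the first one expands a macro depends entirely on the implementation's <stdlib.h>.

```c
#include <assert.h>
#include <stdlib.h>

/* Ordinary call: may expand a macro, if <stdlib.h> provides one. */
int parse_plain(const char *s)
{
    assert(s != NULL);
    return atoi(s);
}

/* Parentheses around the name suppress any function-like macro,
   because the name is no longer followed by a left parenthesis. */
int parse_parenthesized(const char *s)
{
    assert(s != NULL);
    return (atoi)(s);
}

/* Taking the address always designates the actual function. */
int parse_via_pointer(const char *s)
{
    int (*convert)(const char *) = atoi;

    assert(s != NULL);
    return convert(s);
}

#undef atoi     /* from here on, atoi can only name the function */

/* Call after #undef: the prototype from <stdlib.h> is still visible. */
int parse_after_undef(const char *s)
{
    assert(s != NULL);
    return atoi(s);
}
```

All four functions return the same value for the same argument. The difference lies only in whether the translator is free to substitute in-line code for the call.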
Note how I have marked distinctly each quote from the ISO C Standard.
The type face differs from the running text of the book and is smaller. A
bold rule runs down the left side. (The notes to the left of the rule are mine.)
Each quote contains at least one numbered head, to make its location within
the C Standard unambiguous. I gather any footnotes and present them at
the end of the quote.
I typeset the quotes from the ISO C Standard from the same machine-readable
text used to produce the C Standard itself. Line and page breaks
differ, of course. Be warned, however, that I edited the text extensively in
altering the typesetting markup. I may have introduced errors not caught
in later proofreading. The final authority on C is, as always, the printed C
Standard you obtain from ISO or ANSI.
It is a good practice to use a different form of the #include directive for
your own header files. Delimit the name with double quotes instead of
angle brackets. Use the angle brackets only with the standard headers. For
example, you might write at the top of a C source file:
My practice is to list the standard headers first. If you follow the advice I
gave above, however, that practice is not mandatory. I follow it simply to
minimize the arbitrary.
The Standard C library has fairly clean name spaces. The library defines
a couple hundred external names. Beyond that, it reserves certain classes
of names for use by the implementors. All other names belong to the users
of the language. Figure 0.1 shows the name spaces that exist in a C program.
It is taken from Plauger and Brodie, Standard C. The figure shows that you
can define an open-ended set of name spaces:
Two new name spaces are created for each block (enclosed in braces
within a function). One contains all names declared as type definitions,
functions, data objects, and enumeration constants. The other contains
all enumeration, structure, and union tags.
A new name space is created for each structure or union you define. It
contains the names of all the members.
A new name space is created for each function prototype you declare. It
contains the names of all the parameters.
A new name space is created for each function you define. It contains
the names of all the labels.
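The separate name spaces listed above can be seen at work in a short sketch. The identifier item and the function name_space_demo are hypothetical names chosen only for illustration. Within one function, item serves as a structure tag, a member, a data object, and a goto label without conflict:

```c
/* One identifier used in four different name spaces.
 * The names `item` and `name_space_demo` are illustrative only;
 * they do not come from the library. */
struct item {               /* name space of tags */
    int item;               /* name space of the members of struct item */
};

int name_space_demo(void)
{
    struct item item;       /* ordinary identifier (a data object) */

    item.item = 42;
    goto item;              /* name space of this function's labels */
item:
    return item.item;
}
```

That the translator accepts this code does not make it good style. It only shows that each use of the name lands in a different name space.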
You can use a name only one way within a given name space. If the
translator recognizes a name as belonging to a given name space, it may
fail to see another use of the name in a different name space. In the figure,
a name space box masks any name space box to its right. Thus, a macro can
mask a keyword. And either of these can mask any other use of a name.
(That makes it impossible for you to define a data object whose name is
while, for example.)
Figure 0.1: Name Spaces
[Figure not reproduced. Surviving labels: macros; keywords; at file level and again for the innermost block, one box for type definitions, functions, data objects, and enumeration constants, and one box for enumeration, structure, and union tags; goto labels.]
In practice, you should treat all keywords and library names as reserved
in all name spaces. That minimizes confusion both for you and future
readers of your code. Rely on the separate name spaces to save you only
when you forget about a rarely used name in the library. If you must do
something rash, like defining a macro that masks a keyword, do it carefully
and document the practice clearly. You must also avoid using certain classes
of names when you write programs. They are reserved for use by the
implementors. Don't use:
names of functions and data objects with external linkage that begin with
an underscore, such as _abc or _DEF
names of macros that begin with an underscore followed by a second
underscore or an uppercase letter, such as __abc or _DEF.
Remember that a macro name can mask a name in any other name space.
The second class of names is effectively reserved in all name spaces.
External names may or may not map all letters to a single case. The code
presented here works correctly either way.
It is unlikely that your implementation violates any of these assumptions. If it does, the implementation can probably be made to cooperate by
some ruse. Most C vendors write their libraries in C and use their own
translators. They need this behavior too.
coding style
The code in this book obeys a number of style rules. Most of the rules
make sense for any project. A few are peculiar.
Each visible function in the library occupies a separate C source file. The
file name is the function name, chopped to eight characters if necessary,
followed by .c. Thus, the function strlen is in the file strlen.c. That
makes for some rather small files in a few cases. It also simplifies finding
functions. Appendix B: Names shows each visible name defined in the
library, giving the page number where you can find the file that defines
the name.
Each secret name begins with an underscore followed by an uppercase
letter, as in _Getint. Appendix B: Names also lists each secret name that
has external linkage or is defined in a standard header.
Secret functions and data objects in the library typically occupy C source
files whose names begin with x, as in xgetint.c. Such a file can contain
more than one function or data object. The file name typically derives
from the name of one of the contained functions or data objects.
Code layout is reasonably uniform. I usually declare data objects within
functions at the innermost possible nesting level. I indent religiously to
show the nesting of control structures. I also follow each left brace ({)
inside a function with a one-line comment.
The code contains no register declarations. They are hard to place
wisely and they clutter the code. Besides, modern compilers should
allocate registers much better than a programmer can.
In the definition of a visible library function, the function name is
surrounded by parentheses. (Look back at the definition of strlen on
page 2.) Any such function can have its declaration masked by a macro
definition in its corresponding header. The parentheses prevent the
translator from recognizing the macro and expanding it.
This book displays each C source file as a figure with a box around it.
The figure caption gives the name of the file. Larger files appear on two
facing pages - the figure caption on each page warns you that the code
on that page represents only part of a C source file.
Each figure displays C source code with horizontal tab stops set every
four columns. Displayed code differs from the actual C source file in two
ways - comments to the right of code are right justified on the line, and
a box character (□) marks the end of the last line of code in each C source
file.
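The parenthesized-name rule in the list above can be sketched as follows. The names my_len and my_len_calls are hypothetical. The macro stands in for one that a header might define to mask the function, and the counter exists only so the example can tell which form ran:

```c
#include <stddef.h>

/* Hypothetical counter: incremented only by the real function. */
static int my_len_calls = 0;

/* Declare the real function before any macro masks its name. */
size_t my_len(const char *s);

/* A function-like macro that masks the function, as a header might.
 * This toy version is valid only for string literals and arrays. */
#define my_len(s) (sizeof (s) - 1)

/* The parentheses around the name keep the translator from seeing a
 * macro invocation, so this line defines the real function. */
size_t (my_len)(const char *s)
{
    const char *p = s;

    ++my_len_calls;
    while (*p != '\0')
        ++p;
    return (size_t)(p - s);
}
```

Writing my_len("abc") expands the macro and calls nothing. Writing (my_len)("abcd") bypasses the macro and calls the function, just as the parenthesized definition does.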
The resulting code is quite dense at times. For a typical coding project, I
would add white-space to make it at least twenty per cent larger. I compressed it to keep this book from getting even thicker.
The code also contains a number of files that should properly be merged.
Placing all visible functions in separate files sometimes results in ridiculously small object modules, as I indicated above. I also introduced several
extra C source files just to keep all files under two book pages in length.
That was not my only reason for making files smaller, however. I first wrote
each C source file to its natural length, however large. Every compiler I used
failed to translate at least one of the larger files. The extra modules may
sometimes be unappealing from the standpoint of good design, but they
help both readability and portability in the real world.
implementing headers
Fifteen of the source files in this implementation are the standard headers.
I listed several properties of standard headers earlier - idempotence,
mutual independence, and declaration equivalence. Each of the properties
has an impact on how you implement the standard headers.
idempotence
Idempotence is easy to manage. You use a macro guard for most of the
standard headers. For example, you can protect <stdio.h> by conditionally
including its contents at most one time:
#ifndef _STDIO_H
#define _STDIO_H
..... /* body of <stdio.h> */
#endif
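To see a macro guard at work, the sketch below repeats a guarded header body twice in one translation unit, which is what including the header twice amounts to. The guard name MYPOINT_H, the type struct mypoint, and the function mypoint_sum are hypothetical. Without the guard, the second copy would redefine the structure, and the translator would have to diagnose it:

```c
/* First inclusion: MYPOINT_H is not yet defined, so the body is seen. */
#ifndef MYPOINT_H
#define MYPOINT_H
struct mypoint { int x, y; };
#endif

/* Second inclusion: MYPOINT_H is now defined, so the body is skipped
 * and struct mypoint is not defined a second time. */
#ifndef MYPOINT_H
#define MYPOINT_H
struct mypoint { int x, y; };
#endif

/* The single surviving definition is usable as normal. */
int mypoint_sum(struct mypoint p)
{
    return p.x + p.y;
}
```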
The macro NULL is another example. You can usually write this macro
wherever you want a null pointer to a data object -a pointer value that
designates no data object. One way to define this macro is:
#define NULL (void *) 0
testing all paths
Testing can be a never-ending proposition. Only the most trivial functions can be tested exhaustively. Even these can never be tested for all
possible interactions with nontrivial programs that use them. You would
have to test all possible input values, or at least exercise all possible paths
through the code. If your goal is to prove conclusively that a function
contains no bugs, you will often fall far short of your goal.
A less ambitious goal is to write tests that exercise every part of the
executable code. That is a far cry from testing every possible path through
the code. It is good enough, however, to build a high level of confidence
that the code is essentially correct. To write such tests, you must know:
what the code is supposed to do (the specification)
how it does it (the code itself)
You must then contrive tests that test each detail of the specification. (I
intentionally leave vague what a "detail" might be.) In principle, those tests
should visit every cranny of the code. Every piece of code should help
implement some part of the specification. In practice, you must always add
tests you don't anticipate when you first analyze the specification.
The result is a complex piece of code closely tied to the code you intend
to test. The test program can be as complex as the program to be tested, or
more so. That can double the quantity of code you must maintain in future.
A change to either piece often necessitates a change to the other. You use
each piece of code to debug the other. Only when the two play in harmony
can you say that testing is complete - at least for the time being. The payoff
for all this extra investment is a significant improvement in code reliability.
Another form of testing is validation. Here, your goal is to demonstrate
how well the code meets its specification. You pointedly ignore any implementation details. A vendor may know implementation details that are not
easily visible to the customer. It is in the vendor's best interest to test the
internal structure of the code as well as its external characteristics. A
customer, however, should be concerned primarily with validating that a
product meets its specification, particularly when comparing two or more
competing products.
Still another form of testing is for performance. To many people performance means speed, pure and simple. But other factors can matter as much
or more - such as memory and disk requirements, both temporary and
permanent, or predictable worst-case timings. Good performance tests:
measure parameters that are relevant to the way the code is likely to be
used
can be carried out by independent agents
have reproducible results
have reasonable criteria for "good enough"
have believable criteria for "better than average" and "excellent"
A few of the larger headers require two or more test programs, as in
tstdio1.c and tstdio2.c. Note that each of these files defines its own main.
You link each with the Standard C library to produce a separate test
program. Do not add any of these files to the Standard C library. I chose t
as the leading character even though a few predefined names begin with
that letter. It forms a simple mnemonic, and the file names do not collide
with any in the library proper.
References
ANSI Standard X3.159-1989 (New York: American National Standards Institute, 1989). This is the original C Standard, developed by the ANSI-authorized committee X3J11. The Rationale that accompanies the C Standard
explains many of the decisions that went into it.
ISO/IEC Standard 9899:1990 (Geneva: International Standards Organization, 1990). Aside from formatting details and section numbering, the ISO
C Standard is identical to the ANSI C Standard. The quotes in this book are
from the ISO C Standard.
B.W. Kernighan and Dennis M. Ritchie, The C Programming Language, Second
Edition (Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1989). The first edition
of The C Programming Language served for years as the de facto standard for
the C language. It also provides a very good tutorial overview of C. The
second edition, upgraded to reflect the ANSI C Standard, is also a good
tutorial.
P.J. Plauger and Jim Brodie, Standard C (Redmond, Wa.: Microsoft Press,
1989). This book provides a complete but succinct reference to the entire C
Standard. It covers both the language and the library.
Thomas Plum, C Programming Guidelines (Cardiff, N.J.: Plum Hall, Inc.,
1989). Here is an excellent style guide for writing C programs. It also
contains a good discussion of first-order correctness testing, on pp. 194-199.
Exercises
Exercise 0.1 Which of the following are good reasons for including a function in a
library?
The function is widely used.
Performance of the function can be improved dramatically by generating inline code.
The function is easy to write and can be written several different ways.
The function is hard to write correctly.
Writing the function poses several interesting challenges.
The function proved very useful in a past application.
The function performs a number of services that are loosely related.
Exercise 0.4 Which of the above expressions do not behave the same as the function
call?
Exercise 0.5 Which of the above expressions can be repaired by altering the macro
definition? Which cannot?
Exercise 0.6 If any standard header can include any other, what style must you adopt
to avoid problems?
Exercise 0.7 [Harder] If a standard header can define arbitrary names, what must a
programmer do to ensure that a large program runs correctly when moved
from another implementation?
Exercise 0.8 [Very hard] Describe an implementation that tolerates keywords being
masked by macros when you include standard headers.
Exercise 0.9 [Very hard] Describe an implementation that tolerates standard headers
being included inside function definitions, or at any arbitrary place within
a source file.
Chapter 1: <assert.h>
Background
The sole purpose of the header <assert.h> is to provide a definition of
the macro assert. You use the macro to enforce assertions at critical places
within your program. Should an assertion prove to be untrue, you want
the program to write a suitably revealing message to the standard error
stream and terminate execution abnormally. (Chapter 12: <stdio.h> describes how you write to a stream.) Thus, you might write:
#include <assert.h>
.....
Any code you write following the assertion can be simpler. It need not
check whether the index idx is in range. The assertion sees to that. And
should this "impossible" situation arise while you are debugging the
program, you get a handy diagnostic. The program does not stumble on to
generate spurious problems at a later date.
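A minimal sketch of this kind of range assertion follows. The array table and the function lookup are hypothetical names, introduced only to give the index idx something to select:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table; the names here are for illustration only. */
static const int table[] = {10, 20, 30, 40};

int lookup(size_t idx)
{
    /* Enforce the assumption while debugging. */
    assert(idx < sizeof table / sizeof table[0]);

    /* No further range check is needed past this point. */
    return table[idx];
}
```

If a caller ever passes an index of 4 or more, the active form of assert reports the failed expression and terminates the program at once.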
Please note that this is not the best way to write production code. It is ill
advised for a program in the field to terminate abnormally. No matter how
revealing the accompanying message may be to you the programmer, it is
assuredly cryptic to the user. Some form of error recovery is almost always
preferred. Any diagnostics should be in terms that the user can understand.
What you want is some way to introduce assertions that are enforced
only while you're debugging. That lets you document the assertions you
need from the start, then helps you catch the worst logic errors early on.
Later, you might add code to recover from errors that truly can occur during
execution. You want to leave the assertions in as documentation, but you
want them to generate no code.
macro NDEBUG
<assert.h> gives you just this behavior. You can define the macro NDEBUG
at some point in your program to alter the way assert expands. If NDEBUG
is not defined at the point where you include <assert.h>, the header defines
the active form of the macro assert. It expands to an expression that tests
the assertion and writes an error message if the assertion is false. The
program then terminates. If NDEBUG is defined, however, the header defines
the passive form of the macro that does nothing.
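The difference between the two forms can be observed directly. In the sketch below, the names evaluations, checked, passive_demo, and active_demo are hypothetical. The counter records whether the argument of assert was evaluated at all:

```c
static int evaluations = 0;     /* counts how often the test runs */

static int checked(int ok)
{
    ++evaluations;
    return ok;
}

#define NDEBUG                  /* select the passive form */
#include <assert.h>

void passive_demo(void)
{
    assert(checked(0));         /* generates no code: checked never runs */
}

#undef NDEBUG                   /* select the active form again */
#include <assert.h>

void active_demo(void)
{
    assert(checked(1));         /* evaluated; true, so no diagnostic */
}
```

Because the passive form never evaluates its argument, an assertion should not contain side effects that the rest of the program depends on.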
which is not defined by <assert.h>. If NDEBUG is defined as a macro name at the point in
the source file where <assert.h> is included, the assert macro is defined simply as
#define assert(ignore) ((void)0)
The assert macro shall be implemented as a macro, not as an actual function. If the macro
definition is suppressed in order to access an actual function, the behavior is undefined.
assert
Description
The assert macro puts diagnostics into programs. When it is executed, if expression is
false (that is, compares equal to 0), the assert macro writes information about the particular
call that failed (including the text of the argument, the name of the source file, and the source line
number - the latter are respectively the values of the predefined macros __FILE__ and
__LINE__) on the standard error file in an implementation-defined format.97 It then calls the
abort function.
Returns
The assert macro returns no value.
Forward references: the abort function (7.10.4.1).
Footnotes
97.
Using <assert.h>
I gave an example of using the assert macro at the beginning of this
chapter. Whether active or passive, assert behaves essentially like a function that takes a single int argument and returns a void result. The argument
to the macro is nominally an expression of type int. The macro writes a
message and terminates execution if the value of the expression is zero.
predicates
In practice, the argument you write is a predicate - an expression that is
either true (nonzero) or false (zero). You write predicates in for, if, and while
statements to determine the flow of control through the program. An
assertion is simply a compact way of writing:
if (!okay)
abort();
benign redefinition
Note that you can safely define the macro NDEBUG even if it is already
defined. It is a benign redefinition, as I described on page 12. Benign
Implementing <assert.h>
This header requires very little code, but it must be carefully crafted. To
respond properly to NDEBUG, the header must have the general structure:
#undef assert /* remove existing definition */
#ifdef NDEBUG
#define assert(test) ((void)0) /* passive form */
#else
#define assert(test) ..... /* active form */
#endif
Figure 1.1: assert.h
/* assert.h standard header */
#undef assert /* remove existing definition */
#ifdef NDEBUG
#define assert(test) ((void)0)
#else /* NDEBUG not defined */
void _Assert(char *);
/* macros */
#define _STR(x) _VAL(x)
#define _VAL(x) #x
#define assert(test) ((test) ? (void)0 \
    : _Assert(__FILE__ ":" _STR(__LINE__) " " #test))
#endif
Figure 1.2: xassert.c
/* _Assert function */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
void _Assert(char *mesg)
Figure 1.1 shows the file assert.h. This implementation of the macro
assert performs the test inline. That way an optimizing translator can often
eliminate all code for an assertion that is obviously true. The macro composes the diagnostic information into a single string argument of the form
xyz:nnn expression (to use the notation of the C Standard). The string-creation operator #x encodes much of the information. Then string-literal
concatenation merges the pieces. It is a bit more compact than the form that
the C standard suggests, with the words file and line in it.
One nuisance is that the builtin macro __LINE__ does not expand to a
string literal. It becomes a decimal constant. To convert it to proper form
requires an additional layer of processing. That is performed by adding to
the header the two secret macros _STR and _VAL. One macro replaces
__LINE__ with its decimal constant expansion. The second converts the
decimal constant to a string literal. Omit either _STR or _VAL and you end
up with the string literal "__LINE__" instead of what you want.
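The need for two layers can be checked with a short sketch. The macros MY_STR and MY_VAL below are hypothetical stand-ins modeled on the two secret macros. User code should not define names that begin with an underscore and an uppercase letter:

```c
#include <string.h>

/* Hypothetical names modeled on the two secret macros in the text. */
#define MY_VAL(x) #x            /* convert the argument to a string literal */
#define MY_STR(x) MY_VAL(x)     /* expand the argument first, then convert */

/* One level only: the # operator suppresses expansion of its operand. */
static const char one_level[] = MY_VAL(__LINE__);

/* Two levels: __LINE__ becomes a decimal constant before # sees it. */
static const char two_level[] = MY_STR(__LINE__);

int one_level_is_name(void)
{
    return strcmp(one_level, "__LINE__") == 0;
}

int two_level_is_digits(void)
{
    return two_level[0] != '\0'
        && strspn(two_level, "0123456789") == strlen(two_level);
}
```

Both functions return nonzero. The single-layer form yields the macro's name, while the two-layer form yields the digits of the line number.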
Figure 1.2 shows the file xassert.c. It defines the secret library function
_Assert that the macro calls. A smart version of the function _Assert can
parse the diagnostic message and supply the missing bits if it chooses. The
version shown here does not, since the precise format of the message is
implementation-defined.
The function _Assert uses two other library functions. It writes strings
to the standard error stream by calling fputs, declared in <stdio.h>. It
terminates execution abnormally by calling abort, declared in <stdlib.h>.
The description of each of these headers occurs much later. If you have a
general knowledge of C, such forward references should present few
problems. But if you need to learn more about what they do at this point,
you'll have to skip down quite a number of pages.
A good tutorial presentation minimizes the use of forward references.
Unfortunately, the Standard C library is highly interconnected. Nearly
every part is written in terms of the others and can be described only in
terms of the others. When I must refer ahead, I describe the new material
in general terms, as I have done for fputs and abort. That should minimize
some page flipping for those new to Standard C, but probably not all.
Testing <assert.h>
Figure 1.3 shows the file tassert.c. This test program exercises the
assert macro four different ways - in its passive and active forms, with
the test condition met and not met. Only the active form with the test not
met should abort. Correct execution should display something like:
--
and terminate normally. Note, however, that the program writes text to
both the standard error and standard output streams. Text lines can appear
in a different order on some implementations. (See Chapter 12: <stdio.h>
for a discussion of streams.)
The test fails if any of the earlier three invocations of assert cause
execution to terminate, or if the program exits normally and reports the
status EXIT_FAILURE (a nonzero value defined in <stdlib.h>).
tassert.c is a fairly sophisticated test program. Two of the functions it
uses are brothers to ones you have already met. The program writes strings
to the standard output stream by calling puts, declared in <stdio.h>. It
terminates execution normally by calling exit, declared in <stdlib.h>.
The program is more ambitious than that, however. It calls the function
signal, declared in <signal.h>, to regain control after _Assert calls abort.
It even uses the assert macro to verify that signal returns successful status.
Imagine using the very machinery you are testing to implement part of the
test harness! That's hardly the way to go about debugging new code.
program stubs
In fact, it was not the way I debugged this code. My first version of
tassert.c simply aborted on the fourth test of the assert macro. I confess
that it took several tries even to get that far. Both fputs and signal sit atop
a lot of machinery, not all of which was debugged when I began testing
<assert.h>. I had to introduce program stubs (much simpler versions) for
most of this code at one time or another. The needs of debugging can be
quite different from the needs of simple confidence testing.
When one of these tests fails, you may have to alter it - or call on the
services of an interactive debugger - to identify the exact failure. That is
one of the design compromises I made to keep the tests succinct.
References
Two good books that preach programming by assertion are:
O.J. Dahl, E.W. Dijkstra, and C.A.R. Hoare, Structured Programming (New
York: Academic Press, 1972).
E.W. Dijkstra, A Discipline of Programming (Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1973).
Both are still topical, despite their age.
Figure 1.3: tassert.c
/* test assert macro */
#define NDEBUG
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
/* static data */
static int val = 0;
static void field_abort(int sig)
{ /* handle SIGABRT */
if (val == 1)
{ /* expected result */
}
else
{ /* unexpected result */
}
}
static void dummy()
{ /* test dummy assert macro */
int i = 0;
assert(i == 0);
assert(i == 1);
}
#undef NDEBUG
#include <assert.h>
int main()
Exercises
Exercise 1.1 Write a version of assert.h, using the version of xassert.c in Figure 1.2,
that exactly matches the format shown in the C Standard.
Exercise 1.2 Write a version of xassert.c, using the version of assert.h in Figure 1.1,
that exactly matches the format shown in the C Standard.
Exercise 1.3 What are the relative merits of the approaches in the previous two exercises?
Exercise 1.4 Write a version of assert.h and xassert.c that prints all assertions. Why
would you want to use this version?
Exercise 1.5 [Harder] Write a handler for the signal SIGABRT that writes the prompt:
Continue (y/n) ?
to the standard error stream and reads the response from the standard
input stream. If the response is yes (in either uppercase or lowercase), the
handler should reestablish itself and return control to the abort function.
Chapter 9: <signal.h> describes signals. Chapter 13: <stdlib.h> describes
the abort function.
Why would you want this capability?
Exercise 1.6 [Harder] Write a handler for the signal SIGABRT that executes a longjmp to
a setjmp at the top of main. Chapter 8: <setjmp.h> describes the longjmp and
setjmp functions.
Why would you want this capability? Describe a safe discipline for initializing static storage in a program that uses this capability.
Exercise 1.7 [Very hard] Some C translators provide a source-level interactive debugger.
Such debuggers often let you set conditional breakpoints at various points
within the executing program. Locate such a C translator and explore what
is necessary to get <assert.h> to work with the debugger. Your goals are,
in order of increasing difficulty:
Have control revert to the debugger whenever an assertion fails. Execution should continue with the statement following the offending assert
macro invocation.
Have assert generate no inline code. It should pass instructions to the
source-level debugger instead.
Generate code at the same level of optimization whether or not assert
macros appear, in either passive or active form.
Have the modified assert accept test expressions of arbitrary complexity.
Why would you want each of these capabilities?
if ('A' <= c && c <= 'Z' || 'a' <= c && c <= 'z')
.....
which gives a correct result when the execution character set is ASCII. (The
letters stand for "American Standard Code for Information Interchange."
It is a widely used set of character codes, but hardly universal. This idiom
does not work correctly for other popular character sets, such as IBM's
EBCDIC.)
To identify a digit, we wrote:
if ('0' <= c && c <= '9')
Pretty soon, our programs became thick with tests like this. Worse, some
became thick with tests almost like this. You can write the same idiom
several different ways. That slows comprehension and increases the chance
for errors.
if (isalpha(c))
.....
if (isdigit(c))
.....
if (isspace(c))
.....
It wasn't long before a dozen-odd functions like these came into being.
They soon found their way into the growing library of C support functions.
More and more programs began to use them instead of reinventing their
own idioms. The character-classification functions were so useful, they
seemed almost too good to be true.
They were. A typical text-processing program might average three calls
on these functions for every character from the input stream. The overhead
of calling so many functions often dominates the execution time of the
program. That led some programmers to avoid using these standard character-classification functions. It led others to develop a set of macros to take
their place.
surprises with macros
C programmers tend to like macros. They let you write code that is as
readable as calling functions but is much more efficient. You just have to
be alert to a few surprises:
The macro may expand into much more code than a function call, even
if it happens to execute faster than the function call. If your program
expands the macro in many places, it can grow surprisingly larger.
The macro may expand to a subexpression that doesn't bind as tight as
a function call. This is unacceptable, and always has been. A liberal use
of parentheses in the macro definition can eliminate such nonsense.
The macro may expand one of its arguments to code that is executed
more than once, or not at all. A macro argument with side effects will
cause surprises. While some C programmers consider such surprises
acceptable, modern practice avoids them. Only two Standard C library
functions, getc and putc, both declared in <stdio.h>, can have macro
versions with such unsafe behavior.
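Two of the surprises listed above are easy to reproduce. The macros SQUARE_LOOSE, SQUARE, and MY_MAX below are hypothetical examples written for this sketch. None of them is part of the library:

```c
/* Binds too loosely: neither the argument nor the result is parenthesized. */
#define SQUARE_LOOSE(x) x * x

/* Parenthesized: binds as tightly as a function call would. */
#define SQUARE(x) ((x) * (x))

/* Safe precedence, but one of the arguments is evaluated twice. */
#define MY_MAX(a, b) ((a) > (b) ? (a) : (b))

int loose_result(void)
{
    return SQUARE_LOOSE(1 + 2);     /* expands to 1 + 2 * 1 + 2 */
}

int tight_result(void)
{
    return SQUARE(1 + 2);           /* expands to ((1 + 2) * (1 + 2)) */
}

int increments_after_max(void)
{
    int i = 1;

    (void)MY_MAX(i++, 0);           /* i++ appears twice in the expansion */
    return i;                       /* i was incremented twice */
}
```

Parentheses repair the binding problem completely. They cannot repair the repeated evaluation, which is why a macro argument with side effects remains a hazard.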
translation tables
So the challenge in those early days was to produce a set of macros to
replace the character-classification functions. Because they were used a lot,
they had to expand to compact code. They also had to be reasonably safe
to use. What evolved was a set of macros that used one or more translation
tables. Each macro took the form:
#define _XXXMASK 0x...
#define isXXX(c) (_Ctyptab[c] & _XXXMASK)
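A minimal sketch of the table-driven pattern follows. Every name in it (MY_DIGIT, my_table, my_isdigit, and so on) is hypothetical. Offsetting the index by one so that EOF has its own entry is an assumption of this sketch, not something the text above specifies, and the sketch refuses to compile where EOF is not -1:

```c
#include <limits.h>
#include <stdio.h>      /* for EOF */

#if EOF != -1
#error "this sketch assumes EOF is -1"
#endif

/* One bit per character class. */
#define MY_DIGIT 0x01
#define MY_LOWER 0x02
#define MY_UPPER 0x04

/* One entry for EOF plus one for each unsigned char value. */
static unsigned char my_table[UCHAR_MAX + 2];

/* Each macro is one table lookup and one mask; c is evaluated once. */
#define my_isdigit(c) (my_table[(c) + 1] & MY_DIGIT)
#define my_isalpha(c) (my_table[(c) + 1] & (MY_LOWER | MY_UPPER))

/* Set the given class bit for every character in chars. */
static void my_mark(const char *chars, unsigned char mask)
{
    for (; *chars != '\0'; ++chars)
        my_table[(unsigned char)*chars + 1] |= mask;
}

/* Fill the table at run time; a real library would initialize it statically. */
void my_table_init(void)
{
    my_mark("0123456789", MY_DIGIT);
    my_mark("abcdefghijklmnopqrstuvwxyz", MY_LOWER);
    my_mark("ABCDEFGHIJKLMNOPQRSTUVWXYZ", MY_UPPER);
}
```

Because each macro mentions its argument exactly once, it is as safe to use as a function call. The price is that an argument outside the range of the table indexes storage that does not belong to it.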
Description
isalpha
The isalnum function tests for any character for which isalpha or isdigit is true.
7.3.1.2 The isalpha function
Synopsis
#include <ctype.h>
int isalpha(int c);
Description
The isalpha function tests for any character for which isupper or islower is true, or
any character that is one of an implementation-defined set of characters for which none of
iscntrl, isdigit, ispunct, or isspace is true. In the "C" locale, isalpha returns
true only for the characters for which isupper or islower is true.
iscntrl
Description
isdigit
Synopsis
#include <ctype.h>
int isdigit(int c);
Description
The isdigit function tests for any decimal-digit character (as defined in 5.2.1).
isgraph
7.3.1.5 The isgraph function
Synopsis
#include <ctype.h>
int isgraph(int c);
Description
The isgraph function tests for any printing character except space (' ').
islower
Description
The islower function tests for any character that is a lowercase letter or is one of an
implementation-defined set of characters for which none of iscntrl, isdigit, ispunct,
or isspace is true. In the "C" locale, islower returns true only for the characters defined
as lowercase letters (as defined in 5.2.1).
isprint
Description
The isprint function tests for any printing character including space (' ').
ispunct
7.3.1.8 The ispunct function
Synopsis
#include <ctype.h>
int ispunct(int c);
Description
The ispunct function tests for any printing character that is neither space (' ') nor a
character for which isalnum is true.
Description
The isspace function tests for any character that is a standard white-space character or is
one of an implementation-defined set of characters for which isalnum is false. The standard
white-space characters are the following: space (' '), form feed ('\f'), newline ('\n'),
carriage return ('\r'), horizontal tab ('\t'), and vertical tab ('\v'). In the "C" locale,
isspace returns true only for the standard white-space characters.
Description
The isupper function tests for any character that is an uppercase letter or is one of an
implementation-defined set of characters for which none of iscntrl, isdigit, ispunct,
or isspace is true. In the "C" locale, isupper returns true only for the characters defined
as uppercase letters (as defined in 5.2.1).
isxdigit
Description
The isxdigit function tests for any hexadecimal-digit character (as defined in 6.1.3.2).
tolower
Description
The tolower function converts an uppercase letter to the corresponding lowercase letter.
Returns
If the argument is a character for which isupper is true and there is a corresponding character
for which islower is true, the tolower function returns the corresponding character;
otherwise, the argument is returned unchanged.
toupper
Description
The toupper function converts a lowercase letter to the corresponding uppercase letter.
Returns
If the argument is a character for which islower is true and there is a corresponding character
for which isupper is true, the toupper function returns the corresponding character;
otherwise, the argument is returned unchanged.
Footnotes
Using <ctype.h>
Use the functions declared in <ctype.h> to test or alter characters that
you read in with fgetc, getc, or getchar, all declared in <stdio.h>. If you store
such a value before you test it, declare the data object to have type int. If
you store in any character type instead, you lose information. You may
mistake an end-of-file indication for a valid character. Or you may convert
a valid character code to a negative value, which is unacceptable.
If you generate an argument any other way, be careful. The functions
work properly only for the value EOF, defined in <stdio.h>, and values that
type unsigned char can represent. The characters in the basic C character set
have positive values when represented as type char. Others may not.
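When the characters come from a string rather than from a stream, one common way to honor this restriction is to convert each element through unsigned char before testing it. The sketch below assumes a hypothetical helper named count_digits:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the decimal digits in a string. */
size_t count_digits(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; ++s) {
        /* Convert through unsigned char so that a char with a
         * negative value is never passed to isdigit. */
        if (isdigit((unsigned char)*s))
            ++n;
    }
    return n;
}
```

On an implementation where plain char is signed, the cast is what keeps a character code above 127 within the range the functions accept.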
Classifying characters is not as easy as it first appears. First you have to
understand the classes. Then you have to understand where all the common characters lie within the class system. You have to know where the
implementation has tucked the less common characters. You need some
understanding of how everything changes when you move to an implementation with a different character set. Finally, you need to be aware of
how the classes can change when the program changes its locale.
Figure 2.1: Character Classes
[Diagram: the classes isxdigit, isalnum, isgraph, isprint, isalpha, ispunct, isspace, iscntrl, isupper, and islower, with plus signs under some of the function names.]
the rounded rectangles are all the members of the basic C character set.
These are the characters you use to represent an arbitrary C source file. The
C Standard requires that every execution character set contain all these
characters. Every execution character set must also contain the null character, whose code is zero.
A single plus sign under a function name indicates that the function can
represent additional characters in locales other than the "C" locale. A double
plus sign indicates that the function can represent additional characters
even in the "C" locale.
An execution character set can contain members that fall in none of these
classes. The same character must not, however, be added at more than one
place in the diagram. If it is a lowercase letter, it is also in several other
classes by inheritance. But a character must not be considered both punctuation and control, for example.
As you can see from the diagram, nearly all the functions can change
behavior in a program that alters its locale. Only isdigit and isxdigit
remain unchanged. If your code intends to process the local language, this
is good news. The locale will alter islower, for example, to detect any
additional lowercase letters.
If your code endeavors to be locale independent, however, you must
program more carefully. Supplement any tests you make with the
character-classification functions to weed out any extra characters that sneak in.
Or get all your locale-independent testing out of the way before the
program changes out of the "C" locale.
If neither of these options is viable, you may have to revert part or all of
the locale for a region of code. See page 88.
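Reverting part of the locale for a region of code might look like the sketch below. The function name isalpha_in_c_locale is hypothetical; the calls to setlocale with the LC_CTYPE category are standard. The current locale name is copied before the switch, because a later call to setlocale may overwrite the string that an earlier call returned:

```c
#include <ctype.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Test c with isalpha under the "C" locale, then restore whatever
   LC_CTYPE category was in effect. Returns nonzero for a "C" letter. */
int isalpha_in_c_locale(int c)
{
    const char *cur = setlocale(LC_CTYPE, NULL);
    char *saved = NULL;
    int result;

    if (cur != NULL) {          /* copy: a later call may overwrite it */
        saved = malloc(strlen(cur) + 1);
        if (saved != NULL)
            strcpy(saved, cur);
    }
    setlocale(LC_CTYPE, "C");
    result = isalpha(c) != 0;
    if (saved != NULL) {        /* put the caller's category back */
        setlocale(LC_CTYPE, saved);
        free(saved);
    }
    return result;
}
```

Switching locales on every call is expensive. In practice you would bracket a whole region of locale-independent code this way, not a single test.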
The important message is that Standard C introduces a new era. You can
now write code more easily for cultures around the world, which is good.
But you must now write code with more forethought. If it can end up in an
international application, it may someday process characters undreamed
of by early C programmers. Trust the character-classification functions to
contain the problem, to help you with it, and to delineate what can change.
I conclude this section with a remark or two about each of the functions
declared in <ctype.h>.
isalnum - "Alnum" is short for "alphanumeric," the fancy term for
letters and digits. A common practice where a program looks for names is
to require that each name begin with a letter, but permit a mixture of letters
or digits to follow. You often use this function to test for the trailing
characters in a name.
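A minimal sketch of that practice follows. The function name is_name is hypothetical, and the rule it enforces is only the one just described: a letter first, then any mixture of letters and digits.

```c
#include <ctype.h>

/* Return nonzero if s is a nonempty name: a letter followed by any
   mixture of letters and digits. */
int is_name(const char *s)
{
    if (!isalpha((unsigned char)*s))
        return 0;
    for (++s; *s != '\0'; ++s)
        if (!isalnum((unsigned char)*s))
            return 0;
    return 1;
}
```

Because both tests follow the current locale, the set of acceptable names can grow when the program changes locale.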
isalpha - "Alpha" is short for "alphabetic," a common term for letters
of either case. You use this function to test for letters in the local alphabet.
For the "C" locale, the local alphabet always consists of the familiar 26
English letters, in each of two cases.
Knowing that you can depend on this idiom simplifies and speeds code
that performs numeric conversions.
isgraph - You use isgraph to identify characters that display when
printed. This function shifts behavior when you change locale.
islower - What constitutes a lowercase letter can vary considerably
among locales. Use this function to make sure that you recognize all of
them. Don't assume that every lowercase letter has a corresponding uppercase letter, or conversely. Don't even assume that every letter is either
lowercase or uppercase.
isprint - This function recognizes all characters that occupy one print
position when written to a printer.
ispunct - Remember that punctuation is an open-ended set of characters,
even in the "C" locale. As the description in the C Standard implies,
you are better off thinking of punctuation as graphic characters other than
alphanumeric.
isspace - This is an important function. Several library functions use
isspace to determine which characters to treat as whitespace. In the "C"
locale, you use this function to identify any of the characters that alter the
print position, when written to a display device, without displaying a
graphic. You should assume that isspace is the best test for such whitespace in any locale.
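As a small illustration, here is one way to skip leading whitespace with isspace. The name skip_space is hypothetical. The loop stops at the terminating null character because isspace does not accept it:

```c
#include <ctype.h>

/* Return a pointer to the first non-whitespace character in s. */
const char *skip_space(const char *s)
{
    while (isspace((unsigned char)*s))
        ++s;
    return s;
}
```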
isupper - The same remarks apply as for islower above, only in reverse.
isxdigit - Like isdigit, this function does not change with locale. You
use it for the specific purpose of identifying the digits in a hexadecimal
number. Note, however, that you cannot assume letter codes are adjacent,
the same way digit codes are. To convert a hexadecimal number in any
locale, write:
    #include <ctype.h>
    #include <string.h>
    ....
    static const char xd[] =
        {"0123456789abcdefABCDEF"};
    static const char xv[] =
        {0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
        10, 11, 12, 13, 14, 15,
        10, 11, 12, 13, 14, 15};
    for (value = 0; isxdigit(*s); ++s)
        value = (value << 4) + xv[strchr(xd, *s) - xd];
Note that this code does not check for overflow. That requires additional
complexity.
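One possible shape for that additional complexity is sketched below. The function name hex_to_ulong and its return convention are assumptions made for this illustration. The check compares the running value against ULONG_MAX >> 4 before each shift, so no bits are ever lost:

```c
#include <ctype.h>
#include <limits.h>
#include <string.h>

/* Convert the leading hexadecimal digits of s to an unsigned long.
   Returns 1 on success, 0 if the value would overflow. */
int hex_to_ulong(const char *s, unsigned long *out)
{
    static const char xd[] = "0123456789abcdefABCDEF";
    static const char xv[] = {
        0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
        10, 11, 12, 13, 14, 15,
        10, 11, 12, 13, 14, 15};
    unsigned long value = 0;

    for (; isxdigit((unsigned char)*s); ++s) {
        if (value > (ULONG_MAX >> 4))
            return 0;           /* the next shift would lose bits */
        value = (value << 4) + xv[strchr(xd, *s) - xd];
    }
    *out = value;
    return 1;
}
```

A string with no leading hexadecimal digits converts to zero and counts as success here. A caller that cares must check for that case separately.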
tolower - Use this function to force any uppercase letters to lowercase.
It deals with such exotica as lowercase letters that have no corresponding
uppercase letter and letters that have no case. Don't assume that you can
convert an uppercase letter to its corresponding lowercase letter simply by
adding or subtracting a constant value. That happens to be true for ASCII
and EBCDIC, two popular character sets, but it is not required by the C
Standard.
toupper - Use this function to force any lowercase letters to uppercase.
The same remarks apply as for tolower above, only in reverse.
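A short sketch of the recommended style follows, with the hypothetical name str_lower. It lets tolower decide what the lowercase form is and does no arithmetic on character codes:

```c
#include <ctype.h>

/* Convert a string to lowercase in place and return it. */
char *str_lower(char *s)
{
    char *p;

    for (p = s; *p != '\0'; ++p)
        *p = (char)tolower((unsigned char)*p);
    return s;
}
```

Characters that have no lowercase form, such as digits and punctuation, pass through unchanged.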
with nine, ten, 16, and even 32 bits used to represent character types. A
translation table that must represent all the values in a 16-bit character is
probably too unwieldy. It would contain 65,537 elements.
Figure 2.1 shows eight distinct classes. That suggests that a translation
table can be an array of unsigned char. But the figure also shows (with pluses)
six places where an implementor can add characters to the classes. That
suggests that the table must be an array of short. You can merge most of these
additions with existing classes. Still, two sets of additions remain, outside
the "C" locale at least:
The function isalpha can recognize characters that are recognized by
neither islower nor isupper.
The function isspace can recognize characters that are recognized by
neither iscntrl nor isprint.
You must either rule out locales with funny letters and spaces, or you must
make each element of the translation table big enough to hold ten classification bits. If any chance exists that you may want to support locales with
such alphabetic or space characters, declare the translation table to have
type array of short. If you are willing to rule out such latitude, however, you
can save space by declaring the translation table to have type array of
unsigned char. Since this implementation aims at maximum portability, it
takes the former course.
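The table-driven approach can be sketched in miniature as follows. Everything here is hypothetical: the bit names MY_DI, MY_LO, and MY_UP, the table my_tab, and the functions. The sketch assumes that EOF is -1 and that characters occupy eight bits. It fills the table at run time only to stay short, where a real library would supply a constant initializer:

```c
#define MY_DI 0x01      /* hypothetical bit: decimal digit */
#define MY_LO 0x02      /* hypothetical bit: lowercase letter */
#define MY_UP 0x04      /* hypothetical bit: uppercase letter */

static short my_tab[257];                   /* element 0 is for EOF */
static const short *my_ctype = &my_tab[1];  /* so my_ctype[-1] is valid */

/* Set one classification bit for every character in s. */
static void mark(const char *s, short bit)
{
    for (; *s != '\0'; ++s)
        my_tab[(unsigned char)*s + 1] |= bit;
}

/* Fill the table; call once before using the tests below. */
void my_ctype_init(void)
{
    mark("0123456789", MY_DI);
    mark("abcdefghijklmnopqrstuvwxyz", MY_LO);
    mark("ABCDEFGHIJKLMNOPQRSTUVWXYZ", MY_UP);
}

/* Each test is one table lookup and one mask. */
int my_isalnum(int c)
{
    return my_ctype[c] & (MY_DI | MY_LO | MY_UP);
}

int my_isupper(int c)
{
    return my_ctype[c] & MY_UP;
}
```

Adding a class costs one more bit per element, which is why the number of distinct classes determines the element type.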
One subtle point should not get bypassed. I have consistently said that
an eight-bit translation table should have elements of type unsigned char.
Not all implementations represent integers in two's complement. In other
representations, converting a negative signed representation to an unsigned one can alter low-order bits. Performing a bitwise and between a
signed value and an unsigned mask can thus cause surprises.
So far, I have assumed that characters are represented in eight bits (or
not much more). I have also assumed that a program can afford to include
a translation table of 514 bytes (or not much more). To show some real code,
I must make at least three more assumptions.
Assumption #1: The case mapping functions tolower and toupper differ
from the other functions in this group. They don't simply classify their
argument, but return a character that may differ from the argument character. I assume that they should be implemented with mapping tables
similar to the translation table shared by all the other functions.
Assumption #2: The execution character set is ASCII, which is widely
used among modern computers. ISO 646, the international variant, has the
same code values and much the same glyphs, or visible forms of the
characters. Some of the punctuation in ASCII can be replaced with alternate
glyphs in ISO 646, however. That is how Europeans can introduce accented
characters, such as A and e, without going beyond seven-bit codes.
This implementation is compatible with any variant of ISO 646 that
redefines no punctuation characters as letters. It is easily changed to match
other ISO 646 variants, however. You can also accommodate other character
sets just as easily. IBM's EBCDIC also requires a simple change of table
entries. Just be sure that your table entries agree with the character constants (such as 'a') produced by your C translator!
Assumption #3: The library can use writable static storage for pointers to
its tables. That supports only the simple case where the translator includes
code from the Standard C library as needed. Once included in the
program, library code behaves just like code supplied by the programmer.
An implementation that can run multiple programs, however, often benefits from having shared libraries. All the code for the Standard C library
occupies a single place in computer memory. A C program linked to run in
this environment transfers control to functions in the shared library, rather
than including its own private copy of the library code. The obvious
benefits are that each program is smaller and can link faster.
A not-so-obvious drawback appears when one or more functions need
to maintain a writable static data object that is private to the library. You
can't share the same data object between different programs, or between
different threads of control within the same program. You need to allocate
a unique version of each writable static data object for each program or
thread and initialize it to its required starting value.
Sadly, no common method exists for performing this feat. Operating
systems and linkers use ad hoc machinery to make shared libraries work
at all. Some simply disallow writable statics. Others require you to invoke
special machinery to set up and access writable statics. You must write your
code in a special way.
The character classification functions need writable static storage if they
are to adapt to changing locales. One approach is to rewrite the tables when
the locale changes. A better way is to alter pointers to point to different
(read-only) tables. That speeds changing locales. It also minimizes the
amount of writable storage that might need special handling.
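The pointer-switching idea can be sketched with two tiny tables. The names tab_c, tab_other, cur_tab, select_table, and classify are all hypothetical, and the tables are cut down to four codes each to keep the example short. The only writable static is the pointer:

```c
/* Two hypothetical read-only tables, cut down to four codes each. */
static const short tab_c[4]     = {0, 1, 0, 0};
static const short tab_other[4] = {0, 1, 1, 0};

/* The only writable static: a pointer to the current table. */
static const short *cur_tab = tab_c;

/* Switch tables; which is nonzero for the alternate locale. */
void select_table(int which)
{
    cur_tab = which ? tab_other : tab_c;
}

/* Classify a code in the range 0 to 3 using the current table. */
int classify(int c)
{
    return cur_tab[c];
}
```

Changing locale now costs one pointer store, however large the tables are.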
This presentation largely ignores the potential problems associated with
writable static storage in the library. I minimize the use of writable statics
as much as possible. I also try to call attention in the code to any writable
static data object that must be introduced. But I use no special notation for
accessing such storage.
Figure 2.2 shows the file ctype.h. The code for the functions declared in
<ctype.h> is built around three translation tables. Three writable pointers
at all times point to the tables corresponding to the current locale. Note that
every function has a corresponding macro. I used fairly cryptic names for
the macros that define the classification bits. That helps save space for the
presentation. It also speeds the processing of standard headers in many
implementations.
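How a function and a macro of the same name can coexist is sketched below. The name my_isdigit is hypothetical and is not taken from the header. A function-like macro expands only when its name is followed by a left parenthesis, so writing the name inside parentheses reaches the real function:

```c
/* Hypothetical header part: declare the function, then mask it with
   a macro that evaluates its argument exactly once. */
int my_isdigit(int c);
#define my_isdigit(c) ((unsigned)(c) - '0' < 10u)

/* Library source part: the parentheses around the name keep the macro
   from expanding, so this defines the actual function. */
int (my_isdigit)(int c)
{
    return (unsigned)c - '0' < 10u;
}
```

A program can call my_isdigit(x) to get the macro, call (my_isdigit)(x) to get the function, or take the address of my_isdigit to pass to another function.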
The code for the functions looks much like the macros. Figure 2.3
(isalnum.c) through Figure 2.15 (toupper.c) show the code for these
functions.
Figure 2.2: ctype.h
Figure 2.16 shows the file xtolower.c. It defines the initial value of the
pointer _Tolower, and the ASCII version of the translation table that
accompanies tolower. Similarly, Figure 2.17 shows the file xtoupper.c. It defines
Figure 2.3: isalnum.c
    /* isalnum function */
    #include <ctype.h>

    int (isalnum)(int c)
        {   /* test for alphanumeric character */
        return (_Ctype[c] & (_DI|_LO|_UP|_XA));
        }

Figure 2.4: isalpha.c
    /* isalpha function */
    #include <ctype.h>

    int (isalpha)(int c)
        {   /* test for alphabetic character */
        return (_Ctype[c] & (_LO|_UP|_XA));
        }

Figure 2.5: iscntrl.c
    /* iscntrl function */
    #include <ctype.h>

    int (iscntrl)(int c)
        {   /* test for control character */
        return (_Ctype[c] & (_BB|_CN));
        }

Figure 2.6: isdigit.c
    /* isdigit function */
    #include <ctype.h>

    int (isdigit)(int c)
        {   /* test for digit */
        return (_Ctype[c] & _DI);
        }

Figure 2.7: isgraph.c
    /* isgraph function */
    #include <ctype.h>

    int (isgraph)(int c)
        {   /* test for graphic character */
        return (_Ctype[c] & (_DI|_LO|_PU|_UP|_XA));
        }

Figure 2.8: islower.c
    /* islower function */
    #include <ctype.h>

    int (islower)(int c)
        {   /* test for lowercase character */
        return (_Ctype[c] & _LO);
        }

Figure 2.10: ispunct.c
    /* ispunct function */
    #include <ctype.h>

    int (ispunct)(int c)
        {   /* test for punctuation character */
        return (_Ctype[c] & _PU);
        }

Figure 2.11: isspace.c
    /* isspace function */
    #include <ctype.h>

    int (isspace)(int c)
        {   /* test for spacing character */
        return (_Ctype[c] & (_CN|_SP|_XS));
        }

Figure 2.12: isupper.c
    /* isupper function */
    #include <ctype.h>

    int (isupper)(int c)
        {   /* test for uppercase character */
        return (_Ctype[c] & _UP);
        }

Figure 2.13: isxdigit.c
    /* isxdigit function */
    #include <ctype.h>

    int (isxdigit)(int c)
        {   /* test for hexadecimal digit */
        return (_Ctype[c] & _XD);
        }

Figure 2.14: tolower.c
    /* tolower function */
    #include <ctype.h>

    int (tolower)(int c)
        {   /* convert to lowercase character */
        return (_Tolower[c]);
        }
Figure 2.16: xtolower.c
    /* _Tolower conversion table, ASCII version */
    #include <ctype.h>
    #include <stdio.h>

            /* static data */
    static const short tolow_tab[257] = {EOF,
     0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
     0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
     0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
     0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f,
     0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27,
     0x28, 0x29, 0x2a, 0x2b, 0x2c, 0x2d, 0x2e, 0x2f,
     0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
     0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f,
     0x40, 'a', 'b', 'c', 'd', 'e', 'f', 'g',
     'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o',
     'p', 'q', 'r', 's', 't', 'u', 'v', 'w',
     'x', 'y', 'z', 0x5b, 0x5c, 0x5d, 0x5e, 0x5f,
     0x60, 'a', 'b', 'c', 'd', 'e', 'f', 'g',
     'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o',
     'p', 'q', 'r', 's', 't', 'u', 'v', 'w',
     'x', 'y', 'z', 0x7b, 0x7c, 0x7d, 0x7e, 0x7f,
     0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
     0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
     0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
     0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f,
     0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
     0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf,
     0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
     0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf,
     0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7,
     0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
     0xd0, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7,
     0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xdf,
     0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7,
     0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
     0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7,
     0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff};

    const short *_Tolower = &tolow_tab[1];
the initial value of the pointer _Toupper, and the ASCII version of the
translation table that accompanies toupper.
Note the use of the #error directive. It ensures that the code translates
successfully only if its assumptions are correct. The macro UCHAR_MAX,
defined in <limits.h>, gives the highest value that can be represented by
type unsigned char.
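A guard of the kind just described might look like the following sketch. The exact condition and the message text are assumptions, chosen to match a table of 257 elements whose first element serves EOF. The function table_index is hypothetical and exists only to show what the guard protects:

```c
#include <limits.h>     /* UCHAR_MAX */
#include <stdio.h>      /* EOF */

/* Refuse to translate unless the table layout assumptions hold. */
#if EOF != -1 || UCHAR_MAX != 255
#error table layout assumes EOF == -1 and UCHAR_MAX == 255
#endif

/* With the guard satisfied, index c + 1 is valid for EOF and for
   every value that unsigned char can represent. */
int table_index(int c)
{
    return c + 1;
}
```

A failure at translation time is far easier to diagnose than a table that is silently indexed out of bounds.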
Figure 2.17: xtoupper.c
    /* _Toupper conversion table, ASCII version */
    #include <ctype.h>
    #include <stdio.h>

            /* static data */
    static const short toup_tab[257] = {EOF,
     0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
     0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
     0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
     0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f,
     0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27,
     0x28, 0x29, 0x2a, 0x2b, 0x2c, 0x2d, 0x2e, 0x2f,
     0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
     0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f,
     0x40, 'A', 'B', 'C', 'D', 'E', 'F', 'G',
     'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O',
     'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
     'X', 'Y', 'Z', 0x5b, 0x5c, 0x5d, 0x5e, 0x5f,
     0x60, 'A', 'B', 'C', 'D', 'E', 'F', 'G',
     'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O',
     'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
     'X', 'Y', 'Z', 0x7b, 0x7c, 0x7d, 0x7e, 0x7f,
     0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
     0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
     0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97,
     0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f,
     0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7,
     0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf,
     0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7,
     0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf,
     0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7,
     0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf,
     0xd0, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7,
     0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xdf,
     0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7,
     0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef,
     0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7,
     0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff};

    const short *_Toupper = &toup_tab[1];
Figure 2.18 shows the file xctype.c. All the character-classification functions share a common translation table, pointed at by _Ctype. This file
defines both the table and the pointer.
Figure 2.18: xctype.c
            /* macros */
    #define XDI (_DI|_XD)
    #define XLO (_LO|_XD)
    #define XUP (_UP|_XD)
the use of parentheses around the function names in the second set of tests.
That is the same trick I use to define each of the visible functions in the
library. The parentheses prevent any macro with arguments from masking
the declaration of the actual function earlier in the header. If the execution
character set is ASCII, the program produces the output:
    ispunct: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
    isdigit: 0123456789
    islower: abcdefghijklmnopqrstuvwxyz
    isupper: ABCDEFGHIJKLMNOPQRSTUVWXYZ
    isalpha: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
    isalnum: 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrs
    tuvwxyz
    SUCCESS testing <ctype.h>
Note that the line showing the characters matched by isalnum is folded
here. This book page is not wide enough to display the entire line. The line
will not fold on a typical computer display, which has wider lines.
References
Considerable interest has arisen lately in character sets. International
commerce demands better support for a richer set of characters than that
traditionally used to represent English (and C) on computers. Various
vendors have given meaning to all 256 codes that can be represented in the
standard eight-bit byte. Nevertheless, the stalwarts are still the sets of 128
or fewer characters that can be encoded in seven bits. Two standards cover
a vast number of implementations:
ANSI Standard X3.4-1968 (New York: American National Standards Institute, 1989). This defines the ASCII character set, a set of seven-bit codes
widely used to represent characters in modern computers.
ISO Standard 646:1983 (Geneva: International Standards Organization,
1983). This is the international standard for seven-bit character codes.
Exercises
Exercise 2.1 List all the character classification functions that return a nonzero value for
each of the characters in the string:
"Hello, world! \n"
Exercise 2.2 Modify the functions declared in <ctype.h> to work properly with arbitrary
argument values. Treat an argument value that is out of range the same
way you treat the value EOF. Describe at least two ways to report an error
for an argument value out of range.
Exercise 2.3 A name in C begins with a letter. Any number of additional letters, digits,
or underscore characters follow. Write the function size_t idlen(const
char *s) that returns the number of characters that constitute the identifier
beginning at s. If no identifier begins at s, the function returns zero.
Figure 2.19: tctype.c
    /* test ctype functions and macros */
    #include <assert.h>
    #include <ctype.h>
    #include <limits.h>
    #include <stdio.h>

    static void prclass(const char *name, int (*fn)(int))
        {   /* display a printable character class */
        int c;

        fputs(name, stdout);
        fputs(": ", stdout);
        for (c = EOF; c <= UCHAR_MAX; ++c)
            if ((*fn)(c))
                fputc(c, stdout);
        fputs("\n", stdout);
        }

    int main()
        {   /* test both macros and functions */
        char *s;
        int c;

            /* display printable classes */
        prclass("ispunct", &ispunct);
        prclass("isdigit", &isdigit);
        prclass("islower", &islower);
        prclass("isupper", &isupper);
        prclass("isalpha", &isalpha);
        prclass("isalnum", &isalnum);
            /* test macros for required characters */
        for (s = "0123456789"; *s; ++s)
            assert(isdigit(*s) && isxdigit(*s));
        for (s = "abcdefABCDEF"; *s; ++s)
            assert(isxdigit(*s));
        for (s = "abcdefghijklmnopqrstuvwxyz"; *s; ++s)
            assert(islower(*s));
        for (s = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"; *s; ++s)
            assert(isupper(*s));
        for (s = "!\"#%&'();<=>?[\\]*+,-./:^_{|}~"; *s; ++s)
            assert(ispunct(*s));
        for (s = "\f\n\r\t\v"; *s; ++s)
            assert(isspace(*s) && iscntrl(*s));
        assert(isspace(' ') && isprint(' '));
        assert(iscntrl('\a') && iscntrl('\b'));
            /* test macros for all valid codes */
        for (c = EOF; c <= UCHAR_MAX; ++c)
            {   /* test for proper class membership */
            if (isdigit(c))
                assert(isalnum(c));
            if (isupper(c))
                assert(isalpha(c));
            if (islower(c))
                assert(isalpha(c));
            if (isalpha(c))
                assert(isalnum(c) && !isdigit(c));
            if (isalnum(c))
                assert(isgraph(c) && !ispunct(c));
            if (ispunct(c))
                assert(isgraph(c));
            if (isgraph(c))
                assert(isprint(c));
            if (isspace(c))
                assert(c == ' ' || !isprint(c));
            if (iscntrl(c))
                assert(!isalnum(c));
            }
        puts("SUCCESS testing <ctype.h>");
        return (0);
        }
Exercise 2.4 Write the function size_t detab(char *dest, const char *src) that
copies the null-terminated string beginning at src to dest, with each
horizontal tab replaced by one to four spaces. Assume tab stops every four
columns. A printing character occupies one column. The only other characters that affect the print position are backspace, carriage return, and
newline. Return the length of the new string at dest.
Exercise 2.5 Do you have to modify the function idlen (from Exercise 2.3) to work
properly if the locale changes from "C"? If so, show the modified version.
If not, explain why not.
Exercise 2.6 Do you have to modify the function detab (from Exercise 2.4) to work
properly if the locale changes from "C"? If so, show the modified version.
If not, explain why not.
Exercise 2.7 [Harder] You want to implement a library that can be shared. Describe how
you would alter the code in this chapter for each of the following mechanisms:
The translator can be instructed to place all writable static storage in the
library in a section that is copied into each process that uses the library.
You can add fields to a structure called _Lib_stat, declared in <libstat.h>. You can add initializers to the definition of the structure in the
file libstat.c.
You can add fields to a structure called _Lib_stat, as before. You access
the structure only through a pointer to the structure called _p, also
declared in <libstat.h>.
You can add fields to a structure called _Lib_stat, as before. You access
the structure only through a pointer to the structure returned by a call
of the form _FP(). The function _FP is declared in <libstat.h>.
Exercise 2.8 [Harder] A multithread environment supports one or more threads of control that share the same static storage. Dynamic storage
(with storage class auto or register) evolves separately for each thread.
You want to implement a library that appears atomic to the threads: no
function changes behavior, or misbehaves, because another thread changes
the state of library static storage. You make each access to library static
storage safe by surrounding it with synchronization code, as in:
    _lock();
    p = _Ctype;
    _unlock();
Show how to change the code in this chapter to make it safe for multithread
operation. What does that do to performance? How can you improve
performance and still keep the code safe for multithread operation?
Exercise 2.9 [Very hard] Modify the macros defined in <ctype.h> to work properly with
arbitrary argument values. Treat an argument value that is out of range the
same way you treat the value EOF.
Chapter 3: <errno.h>
Background
If I had to identify one part of the C Standard that is uniformly disliked,
I would not have to look far. Nobody likes errno or the machinery that it
implies. I can't recall anybody defending this approach to error reporting,
not in two dozen or more meetings of X3J11, the committee that developed
the C Standard. Several alternatives were proposed over the years. At least
one faction favored simply discarding errno. Yet it endures.
The C Standard has even added to the existing machinery. The header
<errno.h> is an invention of the committee. We wanted to have every
function and data object in the library declared in some standard header.
We gave errno its own standard header mostly to ghettoize it. We even
added some words in the hope of clarifying a notoriously murky corner of
the C language.
A continuing topic among groups working to extend and improve C is
how to tame errno. Or how to get rid of it. The fact that no clear answer has
emerged to date should tell you something. There are no easy answers
when it comes to reporting and handling errors.
history
C was born under UNIX. That operating system set new standards for
clarity and simplicity. The interface between user program and operating
system kernel is particularly clean. You specify a system call number and
a handful of operands. The 40-odd system calls of early UNIX have more
than doubled in number over the years. But that is still on the sparse side
compared to systems of comparable power. Operands to UNIX system calls
are almost always scalars - integers or pointers. They are equally spare.
Each implementation of UNIX adopts a simple method for indicating
erroneous system calls. Writing in assembly language, you typically test
the carry indicator in the condition code. If the carry indicator is clear, the
system call was successful. Any answers you requested are returned in
machine registers or in a structure within your program. (You specify the
address of the structure as one of the arguments to the system call.) If the
carry indicator is set, however, the system call was in error. One of the
machine registers contains a small positive number to indicate the nature
of the error.
That scheme is great for assembly language. It is less great for programs
you write in C. You can write a library of C-callable functions, one for each
distinct system call. You'd like each function return value to be the answer
you request when making that particular system call. You can do so, but
that makes it difficult to report errors in a way that is easy to test. Alternatively, you can have each function return as its value a success or failure
indication. Do that and you have no easy way to get at the answer you want
from a successful system call.
One trick that mostly works is to do a bit of both. For a typical system
call, you can define an error return value that is distinguishable from any
valid answer. A null pointer is an obvious case in point. The value -1 can
also be set aside in many cases, with no serious conflict with valid answers.
Each UNIX system call usually has such a return value to indicate that some
form of error has occurred.
What the C-callable functions do not do is report exactly which error
occurred. That strains the trick a bit too much. All you can tell from the
return value is whether an error occurred. You have to look elsewhere to
get details.
The "elsewhere" that early UNIX programmers adopted was a data
object with external linkage. Any system call that fails stores the error code
from the kernel in an int variable called errno. It then returns -1, or some
other appropriately silly value, to indicate the error. Most of the time, the
program doesn't care about details. An error is an error is an error. But in
those few cases where the program does care, it knows how to get additional information. It looks in errno to see the last error code stored there.
Naturally, you'd better look before it's too late. Make another system call
that fails and the error code gets overwritten. You must also look at errno
only after a system call that fails. A successful call doesn't clear the value
stored there. It's not a great piece of machinery, but it does work.
The first problem with errno is that it was too handy. People started
finding additional uses for it. It grew from a dirty little trick for augmenting
UNIX system calls to a C institution. And that's when it got overworked.
System calls aren't the only rich source of errors. Another well-explored
vein is the portion of the library that computes the common math functions.
(See Chapter 7: <math.h>.)
Some functions yield values too large to represent for certain arguments
(such as exp(1000.0)). Some yield values too small to represent for certain
arguments (such as exp(-1000.0)). Some are simply undefined for certain
argument values (such as sqrt(-1.0)). Some are defined, but of suspect
worth for certain argument values (such as sin(1e30)).
You could introduce one or more error codes for each function that can
run into trouble. Following the naming convention for UNIX error codes,
you could report ESQRT for the square root of a negative number. But that
is both open-ended and messy.
EDOM
ERANGE
which expand to integral constant expressions with distinct nonzero values, suitable for use in
#if preprocessing directives; and
errno
which expands to a modifiable lvalue[92] that has type int, the value of which is set to a positive
error number by several library functions. It is unspecified whether errno is a macro or an
identifier declared with external linkage. If a macro definition is suppressed in order to access an
actual object, or a program defines an identifier with the name errno, the behavior is undefined.
The value of errno is zero at program startup, but is never set to zero by any library function.[93]
The value of errno may be set to nonzero by a library function call whether or not there is an
error, provided the use of errno is not documented in the description of the function in this
International Standard.
Additional macro definitions, beginning with E and a digit or E and an uppercase letter,[94] may
also be specified by the implementation.
Footnotes
92. The macro errno need not be the identifier of an object. It might expand to a modifiable
lvalue resulting from a function call (for example, *errno () ).
93. Thus, a program that uses errno for error checking should set it to zero before a library
function call, then inspect it before a subsequent library function call. Of course, a library
function can save the value of errno on entry and then set it to zero, as long as the original
value is restored if errno's value is still zero just before the return.
94.
Using <errno.h>
The C Standard leaves much unsaid about the errors that can be reported. It says even less about the values of any error codes or the macro
names you use to determine those values. That's because usage varies so
widely among implementations. Even different versions of UNIX define
different sets of error codes.
If you are writing code for a specific system, you may have to learn its
peculiar set of error codes. List the header <errno.h> if you can. All error
codes should be defined there as macros with names beginning with E.
Read any documentation you can find that details error codes. Then be
prepared to experiment. Documentation is notoriously spotty and inaccurate in this area.
If you are writing portable code, avoid any assumptions about extra error
codes. You can count on only the properties of errno specified throughout
the C Standard. I listed them on page 49. Rarely do you have to know
explicit error codes, however. Footnote 93 of the C Standard (shown above)
tells you the safest coding style for using errno. Set it to zero right before a
library function call, then test it for any nonzero value before the next library
call:
#include <errno.h>
#include <math.h>
.....
errno = 0;
y = sqrt(x);
if (errno != 0)
    printf("invalid x: %e\n", x);
Never assume that a library function will leave errno unaffected, no matter
how simple the function. It's rather a noisy channel.
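The same clear-call-test pattern works for any function whose use of errno the C Standard documents. Here is a minimal sketch built around strtol, which reports an out-of-range conversion by storing ERANGE. The helper name parse_long and its return convention are hypothetical, chosen only for illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert decimal text to long. Returns 0 on
   success and -1 on failure, so the caller never inspects errno. */
int parse_long(const char *text, long *result)
{
    char *end;
    long value;

    errno = 0;                      /* clear any stale error code */
    value = strtol(text, &end, 10);
    if (errno != 0)                 /* test before the next library call */
        return -1;                  /* for example, ERANGE on overflow */
    if (end == text)
        return -1;                  /* no digits were converted */
    *result = value;
    return 0;
}
```

Keeping the store to errno, the call, and the test adjacent leaves no room for an intervening library call to disturb the value.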
Implementing <errno.h>
On the surface, the C Standard demands little of an implementation in
this area. You can write the file errno.h simply as:
#ifndef _ERRNO
#define _ERRNO
#define EDOM 1
#define ERANGE 2
extern int errno;
#endif
In some library file, you must add a definition for the data object:
int errno = 0;
Your only other obligation is to store values such as EDOM and ERANGE in
errno at the appropriate places within the library functions. What could be
simpler?
Here is a case where the overt implementation is the easiest part of the
job. errno causes trouble in two subtler ways: sometimes its specification
is too vague and sometimes it is too explicit. To see why takes some
explaining.
The vagueness comes from the historical use of errno to register system-call
errors. That practice has been implicitly endorsed by the C Standard.
Any library function can store nonzero values in errno. The stores can occur
because the function makes one or more system calls that fail. Or they can
occur because some function in the library chooses to use this reporting
channel.
All you can count on is the behavior explicitly called out in the C
Standard. Call sqrt(-1.0) and you can be sure that errno contains the value
EDOM. Call fabs(x) and all bets are off, believe it or not. No library function
will store a zero in errno. Anything else is fair game.
The overspecification mostly affects the math functions. By spelling out
when errno must be set, the C Standard interferes with important optimizations. In particular, the C Standard makes it hard for compilers to use the
newest floating-point coprocessors to advantage.
Chapter 3
Chips like the Intel 80x87 family and the Motorola MC68881 have some
pretty fancy instructions. Some can compute part or all of a math function
with inline code. A smart compiler can dramatically speed up calculations
by using these instructions. If nothing else, the compiler can avoid the
function-call and function-return overhead for a math function.
The problem comes when a mathematical exception occurs. These math
coprocessors run autonomously, and they want to keep moving. They want
to record an error by carrying along a special code, called NaN (for "Not a
Number") or Inf (for "infinity"). Later operations preserve these special
codes. You can test at the end of a computation whether anything went
wrong along the way.
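A minimal sketch of that style of checking follows. It assumes IEEE 754 arithmetic with traps disabled, where 0.0/0.0 yields a NaN that every later addition preserves. The function names are hypothetical.

```c
#include <stddef.h>

/* Sum num[i] / den[i] without testing any individual quotient.
   Under IEEE 754 arithmetic, 0.0 / 0.0 produces a NaN, and adding
   anything to a NaN leaves a NaN. */
double sum_of_ratios(const double *num, const double *den, size_t n)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; ++i)
        sum += num[i] / den[i];
    return sum;
}

/* One test at the end of the whole computation. A NaN is the only
   value that compares unequal to itself. */
int went_wrong(double result)
{
    return result != result;
}
```

The loop never stops to ask whether a step failed, which is exactly what lets a pipelined coprocessor keep moving.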
At best, these coprocessors record an error in their own condition code.
The main processor has to copy the coprocessor condition code into its own
to test whether an error occurred. That stops a pipelined coprocessor in full
career. If a C program must set errno on every math exception, it can run
a math coprocessor at only a fraction of its potential speed.
Footnote 92 of the C Standard suggests one trick that can help. The C
Standard does not require that errno be an actual data object. It is defined
as a macro that expands to a modifiable lvalue, an arbitrary expression that
you can use on the left side of an assigning operator (such as =) to designate
a data object. That gives the implementor considerable latitude. In particular, the errno macro can expand to an expression such as *_Efun(). Every
time the program wants to check for errors, it calls a function to tell the
program where to look.
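The mechanism can be sketched in a few lines. The names error_slot, error_location, and my_errno below are hypothetical. A real library would use reserved identifiers and might return a different address for each thread or each client of a shared library.

```c
/* Storage that only the locating function knows about. */
static int error_slot = 0;

/* Tell the caller where the current error code lives. */
int *error_location(void)
{
    return &error_slot;
}

/* The macro expands to a modifiable lvalue, so it can be assigned
   to and read from exactly like a plain int object. */
#define my_errno (*error_location())
```

A program can write my_errno = 0 before a call and test my_errno afterward, never knowing that each use calls a function.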
That has two implications. First, the implementation can be lazy about
recording errors. It can wait until someone tries to peek at errno before it
stores the latest error code. That might give the implementation sufficient
latitude to leave math coprocessors alone most of the time. (The translator
may be hard pressed to exploit this opportunity however.)
The second implication is that errno can move about. The function can
return a different address every time it is called. That can be a tremendous
help in implementing shared libraries. Static storage is a real nuisance in a
shared library, as I discussed on page 36. Static storage that the user
program can alter at will is even worse. errno is the only such creature in
the Standard C library.
Even as a macro, errno is still an annoying piece of machinery. Any
program can contain the sequence:
y = sqrt(x);
if (errno == EDOM)
    .....
The need to support such error tests severely constrains what an implementation can do with sqrt and its ilk. Since any library function can alter
errno, programmers are also ill served. Here we have a mechanism that
can be hard on both the implementor and the user.
Figure 3.1 shows the code for errno.h. It is not as simple as I suggested
earlier. That's because I decided to make it parametric. The simpler form
must be tailored for each operating system that hosts the Library. Other
library functions or the operating system itself may have preconceived
notions about the values of error codes. You must change this header to
match, or endure surprising irregularities.
Most of the code that uses <errno.h> cares about the values of one or
two error codes. As I mentioned on page 50, these values change across
operating systems. One or two library functions need to know the valid
range of error codes. This range also varies across operating systems.
I began moving this library to an assortment of environments shortly
after I first wrote it. I found it annoying that perhaps a dozen files had to
change, each in only small ways. I was quickly overwhelmed maintaining
several versions of this double handful of files.
That prompted me to introduce what you might call an "internal standard header." Several of the standard headers include the header <yvals.h>.
(The angle brackets tell the translator to look for this header wherever the
other standard headers are stored. That may cause problems on some
systems.) I concentrate in this file many of the changes you must make to
move this library about.
The header <errno.h> defines its macros in terms of other macros
defined in <yvals.h>. This two-step process is necessary because other
headers include <yvals.h>. The macro ERANGE must be defined in your
program only when you include <errno.h>.
Note also that the macro guard for <yvals.h> is in the header that
includes it, not in <yvals.h> itself. That is a small optimization. Since
several standard headers include this header, it is likely to be requested
several times in a translation unit. The macro guard skips the #include
directive once <yvals.h> becomes part of the translation unit. The header
is not read repeatedly.
#include <errno.h>
#undef errno
int errno = 0;
Figure 3.3: terrno.c (test errno macro)
Testing <errno.h>
Figure 3.3 shows the test program terrno.c. It doesn't do much. The C
Standard says little about the properties of <errno.h>. Primarily, terrno.c
ensures that a program can store values in errno and retrieve them.
As a courtesy, the test program also displays how the standard error
codes appear when output. The function perror, declared in <stdio.h>,
writes a line of text to the standard error stream. The function determines
the last part of that text line from the contents of errno. If all goes well,
running the executable version of terrno.c displays the output:
No error reported as: no error
Range error reported as: range error
Domain error reported as: domain error
SUCCESS testing <errno.h>
Again, I must warn that this output comes from both the standard error
and the standard output streams. The possibility is remote in this case, but
some implementations may rearrange the lines.
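The two checks described above can be sketched as a pair of small functions. This is not the listing from Figure 3.3. The function names are hypothetical, and the text that perror writes is implementation-defined.

```c
#include <errno.h>
#include <stdio.h>

/* Store a code in errno and read it straight back. */
int errno_round_trip(int code)
{
    errno = code;
    return errno;
}

/* Show how the implementation describes a code. perror writes the
   label, a colon, and a message chosen from the contents of errno
   to the standard error stream. */
void show_error(const char *label, int code)
{
    errno = code;
    perror(label);
}
```

A test driver might call show_error("Domain error reported as", EDOM) and then confirm that errno_round_trip returns whatever it was given.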
References
David Stevenson, "A Proposed Standard for Binary Floating-point Arithmetic," Computer, 14:3 (1981), pp. 51-62. This and subsequent articles in the
same issue (pp. 63-87) of Computer explain many aspects of the IEEE 754
Floating-point Standard.
Marc J. Rochkind, Advanced UNIX Programming (Englewood Cliffs, N.J.:
Prentice Hall, Inc., 1985). Rochkind describes the UNIX system calls, where
errno and its error codes originated.
Exercises
Exercise 3.1 List the error codes defined for the C translator you use. Can you describe
in one sentence what each error code indicates?
Exercise 3.2 For the error codes defined for the C translator you use, contrive tests that
cause each of the errors to occur.
Exercise 3.3 Under what circumstances might you care exactly which error code was
last reported?
Exercise 3.4 Alter the test program terrno.c to call perror for all valid error codes. The
value of the macro _NERR, defined in <errno.h>, is one greater than the
largest valid error code.
Exercise 3.5 Assume you have the function int _Getfcc(void) that returns 0, EDOM, or
ERANGE to reflect the last floating-point error (if any) since the previous call
to the function. Write a version of <errno.h> that uses this function to collect
floating-point errors only when the program uses the value stored in errno.
Exercise 3.6 [Harder] Write a version of <errno.h> that queues values stored in errno
and returns them in order when the program uses the value stored in errno.
When is it safe to remove a value from the queue?
Exercise 3.7 [Very hard] Eliminate the need for errno in the Standard C library. Consider
every function that can store values in errno. Ensure that each has a way
to specify several different error return values.
Chapter 4
slightly different. That makes writing portable code much more challenging. You need to write math functions and conversion algorithms to retain
varying ranges of values and varying amounts of precision.
Machines that provide floating point as an option combine the worst of
both worlds, at least to compiler implementors. The implementors must
provide software support for those machines that lack the option. They
must make use of the machine instructions when the option is present. And
they must deal with confused customers who inadvertently link two
flavors of code, or the wrong version of the library. Rarely can the hardware
and software versions of floating-point support agree on where to hold
intermediate results.
From a linguistic standpoint, however, most of these issues are irrelevant. The main problem the drafters of the C Standard had to deal with was
excess variety. It is a longstanding tradition in C to take what the machine
gives you. A right-shift operator does whatever the underlying hardware
does most rapidly. So, too, does a floating-point add operator. Neither
result may please a mathematician.
With floating-point arithmetic, you have the obvious issues of overflow
and underflow. A result may be too large to represent on one machine, but
not on another. The resulting overflow may cause a trap, may generate a
special code value, or may produce garbage that is easily mistaken for a
valid result. A result may be too small to represent on one machine but not
on another. The resulting underflow may cause a trap or may be quietly
replaced with an exact zero. Such a zero fixup is often a good idea, but not
always. Novices tend to write code that is susceptible to overflow and
underflow. The broad range of values supported by floating point lures the
innocent into a careless disregard. Your first lesson is to estimate magnitudes and avoid silly swings in value.
You also have the more subtle issue of significance loss. Floating point
arithmetic lets you represent a tremendously broad range of values, but at
a cost. A value can be represented only to a fixed precision. Multiply two
values that are exact and you can keep only half the significance you might
like. Subtract two values that are very close together and you can lose most
or all of the significance you were carrying around.
Workaday programmers most often run afoul of unexpected significance loss. That formula that looks so elegant in a textbook is an ill-behaved
pig when reduced to code. It is hard to see the danger in those alternating
signs in adjacent terms of a series, until you get burned, that is, and learn
to do the subtractions on paper instead of at run time.
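A small sketch shows what doing the subtraction on paper buys you. The two functions below compute the same quantity, 1/x - 1/(x+1). The second uses the algebraically equal form 1/(x(x+1)), so no nearly equal values are ever subtracted at run time. The function names are hypothetical.

```c
/* Textbook form: subtracts two nearly equal quotients, so most of
   the significant digits cancel when x is large. */
double gap_naive(double x)
{
    return 1.0 / x - 1.0 / (x + 1.0);
}

/* Same quantity after doing the subtraction on paper:
   1/x - 1/(x+1) = 1/(x*(x+1)). No cancellation occurs. */
double gap_stable(double x)
{
    return 1.0 / (x * (x + 1.0));
}
```

For x near 1e15 in IEEE 754 double arithmetic, the two quotients in gap_naive agree in almost every bit, so their difference retains very little significance. gap_stable stays accurate to nearly full precision.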
Overflow, underflow, and significance loss are intrinsic to floating-point
arithmetic. They are hard enough to deal with on a given computer architecture. Writing code that can move across computer architectures is harder.
Writing a standard that tells you how to write portable code is harder still.
But another problem makes the matter even worse.
s    sign (±1)
b    base or radix of exponent representation (an integer > 1)
Of the values in the <float.h> header, FLT_RADIX shall be a constant expression suitable
for use in #if preprocessing directives; all other values need not be constant expressions. All
except FLT_RADIX and FLT_ROUNDS have separate names for all three floating-point types.
The floating-point model representation is provided for all values except FLT_ROUNDS.
The rounding mode for floating-point addition is characterized by the value of FLT_ROUNDS:
    -1    indeterminable
     0    toward zero
     1    to nearest
     2    toward positive infinity
     3    toward negative infinity
radix of exponent representation, b
    FLT_RADIX          2
number of base-FLT_RADIX digits in the floating-point significand, p
    FLT_MANT_DIG
    DBL_MANT_DIG
    LDBL_MANT_DIG
number of decimal digits, q, such that any floating-point number with q decimal digits can be
rounded into a floating-point number with p radix b digits and back again without change to
the q decimal digits, $\lfloor (p-1) \times \log_{10} b \rfloor$ + (1 if b is a power of 10, 0 otherwise)
    FLT_DIG            6
    DBL_DIG            10
    LDBL_DIG           10
minimum negative integer such that FLT_RADIX raised to that power minus 1 is a normalized
floating-point number, $e_{\min}$
    FLT_MIN_EXP
    DBL_MIN_EXP
    LDBL_MIN_EXP
minimum negative integer such that 10 raised to that power is in the range of normalized
floating-point numbers, $\lceil \log_{10} b^{e_{\min}-1} \rceil$
    FLT_MIN_10_EXP     -37
    DBL_MIN_10_EXP     -37
    LDBL_MIN_10_EXP    -37
maximum integer such that FLT_RADIX raised to that power minus 1 is a representable finite
floating-point number, $e_{\max}$
    FLT_MAX_EXP
    DBL_MAX_EXP
    LDBL_MAX_EXP
maximum integer such that 10 raised to that power is in the range of representable finite
floating-point numbers, $\lfloor \log_{10}((1 - b^{-p}) \times b^{e_{\max}}) \rfloor$
    FLT_MAX_10_EXP     +37
    DBL_MAX_10_EXP     +37
    LDBL_MAX_10_EXP    +37
maximum representable finite floating-point number, $(1 - b^{-p}) \times b^{e_{\max}}$
    FLT_MAX            1E+37
    DBL_MAX            1E+37
    LDBL_MAX           1E+37
the difference between 1 and the least value greater than 1 that is representable in the given
floating-point type, $b^{1-p}$
    FLT_EPSILON        1E-5
    DBL_EPSILON        1E-9
    LDBL_EPSILON       1E-9
minimum normalized positive floating-point number, $b^{e_{\min}-1}$
    FLT_MIN            1E-37
    DBL_MIN            1E-37
    LDBL_MIN           1E-37
Examples
The following describes an artificial floating-point representation that meets the minimum
requirements of this International Standard, and the appropriate values in a <float.h> header
for type float:
$x = s \times 16^{e} \times \sum_{k=1}^{6} f_k \times 16^{-k}, \quad -31 \le e \le +32$
    FLT_RADIX          16
    FLT_MANT_DIG       6
    FLT_EPSILON        9.53674316E-07F
    FLT_DIG            6
    FLT_MIN_EXP        -31
    FLT_MIN            2.93873588E-39F
    FLT_MIN_10_EXP     -38
    FLT_MAX_EXP        +32
    FLT_MAX            3.40282347E+38F
    FLT_MAX_10_EXP     +38
The following describes floating-point representations that also meet the requirements for
single-precision and double-precision normalized numbers in ANSI/IEEE 754-1985,11 and the
appropriate values in a <float.h> header for types float and double:
$x_f = s \times 2^{e} \times \sum_{k=1}^{24} f_k \times 2^{-k}, \quad -125 \le e \le +128$
$x_d = s \times 2^{e} \times \sum_{k=1}^{53} f_k \times 2^{-k}, \quad -1021 \le e \le +1024$
    FLT_RADIX          2
    FLT_MANT_DIG       24
    FLT_EPSILON        1.19209290E-07F
    FLT_DIG            6
    FLT_MIN_EXP        -125
    FLT_MIN            1.17549435E-38F
    FLT_MIN_10_EXP     -37
    FLT_MAX_EXP        +128
    FLT_MAX            3.40282347E+38F
    FLT_MAX_10_EXP     +38
    DBL_MANT_DIG       53
    DBL_EPSILON        2.2204460492503131E-16
    DBL_DIG            15
    DBL_MIN_EXP        -1021
    DBL_MIN            2.2250738585072014E-308
    DBL_MIN_10_EXP     -307
    DBL_MAX_EXP        +1024
    DBL_MAX            1.7976931348623157E+308
    DBL_MAX_10_EXP     +308
identical.
11. The floating-point model in that standard sums powers of b from zero, so the values of the
exponent limits are one less than shown here.
Using <float.h>
Only the most sophisticated of numerical programs care about most of
the macros defined in <float.h> or can adapt to changes among floating-point representations. I have found good use for these parameters on just
a few occasions. You will find only a few places in this library that make
good use of them. That's a bit misleading, however. In some places, I use
the underlying macros from which the <float.h> macros derive. (See the
discussion of how to implement <float.h> starting on page 64.) In other
places, the code contains implicit assumptions about the range or maximum size of certain floating-point parameters. That limits its portability.
You can use these macros to detect problems before they bite. Remember
that the three pitfalls of floating-point arithmetic are overflow, underflow,
and significance loss. Here are ways you can use the macros defined in
<float.h> to perform double arithmetic more safely. The same discussion
applies, naturally, to float and long double as well.
To avoid overflow, make sure that no value ever exceeds DBL_MAX in
magnitude. Of course, it does you no good to test the final result, as in:
if (DBL_MAX < fabs(y))
    .....    /* SILLY TEST */
(The functions in this and the following examples are the common math
functions declared in <math.h>.)
By the time you make the test, it's too late. If the value you intended to
store in y is too large to represent, y may contain a special code, the value
of DBL_MAX, or garbage, depending on the kind of floating-point arithmetic the implementation provides. Or execution may terminate during the
calculation of the value. In no case will the above test likely yield a useful
result. A more sensible test might be:
if (x < log(DBL_MAX))
    y = exp(x);
else
    .....    /* HANDLE OVERFLOW */
You can avoid computing log(DBL_MAX) by using:
if (x <= DBL_MAX_10_EXP)
    y = pow(10, x);
else
    .....    /* HANDLE OVERFLOW */
This test is more stringent than necessary if FLT_RADIX is not equal to 10.
(Modern computers usually have FLT_RADIX equal to 2 or, in rare cases, 16.)
If you are in the business of writing functions that accept all possible inputs,
that can make a difference. Otherwise, this test is close enough.
The function ldexp makes it easy to scale a floating-point number by a
power of 2. In the common case where FLT_RADIX equals 2, that can be an
efficient operation. For an integer exponent n, you can make the simple test:
if (n < DBL_MAX_EXP)
    y = ldexp(1.0, n);
else
    .....    /* HANDLE OVERFLOW */
You are most likely to use this last test when writing additional functions
for a math library.
To avoid underflow, make sure that no value ever goes below DBL_MIN
in magnitude. The result is usually not quite so disastrous as overflow, but
it can still cause trouble. IEEE 754 floating-point arithmetic provides gradual
underflow. That mitigates some of the worst effects of underflow. Nearly all
floating-point implementations substitute the value zero for a value too
small to represent. You get in trouble only if you divide by a value that has
suffered underflow. Unexpectedly, your program encounters a zero divide,
with all the attendant confusion. You can make the test:
if (fabs(y) < DBL_MIN)
    .....    /* UNDERFLOW HAS OCCURRED */

if (log(DBL_MIN) <= x)
    y = exp(x);
else
    .....    /* HANDLE UNDERFLOW */

if (DBL_MIN_10_EXP <= x)
    y = pow(10, x);
else
    .....    /* HANDLE UNDERFLOW */

if (DBL_MIN_EXP < n)
    y = ldexp(1.0, n);
else
    .....    /* HANDLE UNDERFLOW */
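The overflow and underflow tests for ldexp can be combined in one helper. This is a minimal sketch that assumes FLT_RADIX equals 2, so that DBL_MIN equals 2 raised to the power DBL_MIN_EXP - 1. The name checked_pow2 is hypothetical.

```c
#include <float.h>
#include <math.h>

/* Store 2^n in *result and return 1 when the value is a normalized
   finite double. Return 0, leaving *result alone, when 2^n would
   overflow or fall below DBL_MIN. Assumes FLT_RADIX == 2. */
int checked_pow2(int n, double *result)
{
    if (n >= DBL_MAX_EXP)           /* 2^n would exceed DBL_MAX */
        return 0;
    if (n < DBL_MIN_EXP - 1)        /* 2^n would be less than DBL_MIN */
        return 0;
    *result = ldexp(1.0, n);
    return 1;
}
```

Both tests happen before the call, so the function never produces an infinity, a trap, or a silently flushed zero.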
Significance loss occurs when you subtract two values that are nearly
equal. Nothing can save you from such a fate except careful analysis of the
problem before you write code. You can, however, protect against a subtler
form of significance loss: adding a small magnitude to a large one. A
floating-point representation can maintain only a finite precision. Important contributions from the smaller number can get lost in the addition.
You can get in trouble, for example, when performing a quadrature, a
sum of discrete values that approximates a continuous integration. One
form of quadrature is computing the area under a curve by summing a
sequence of rectangles that just fit under the curve. Clearly, the narrower
the rectangles, the closer the sequence approximates the area of the curve.
Unfortunately, that is true only in theory. Add a sufficiently small rectangular area to a running sum and part or all of the contribution gets lost. You
can test, for example, whether adding x to y captures at least three decimal
digits of significance from x (assuming both are positive) by writing:
if (x < y * DBL_EPSILON * 1.0E+03)
    .....    /* HANDLE SIGNIFICANCE LOSS */
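The same test generalizes to any number of decimal digits. The sketch below wraps it in a function. The name addend_loses_digits and its digits parameter are hypothetical, and both operands are assumed positive, as in the text.

```c
#include <float.h>

/* Return nonzero when adding x to y would keep fewer than `digits`
   decimal digits of x. Both x and y are assumed positive. */
int addend_loses_digits(double x, double y, int digits)
{
    double scale = 1.0;

    while (digits-- > 0)
        scale *= 10.0;              /* scale = 10^digits */
    return x < y * DBL_EPSILON * scale;
}
```

A quadrature loop could call this before each addition and switch to a finer-grained partial sum whenever it reports a loss.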
The two macros you are least likely to use are FLT_RADIX and FLT_ROUNDS.
Don't be surprised, in fact, if you never have occasion to use any of the
macros defined in <float.h>.
Implementing <float.h>
In principle, this header consists of nothing but a bunch of macro
definitions. For a given implementation, you merely determine the values
of the parameters and plug them in. You can even use a freeware program
called enquire to generate <float.h> automatically.
A common implementation these days is based on the IEEE 754 Standard
for floating-point arithmetic. You will find IEEE 754 floating-point arithmetic in the Intel 80x87 and the Motorola MC680X0 coprocessors, to name just
two very popular lines. It is a complex standard, but only its grosser
properties affect <float.h>. Type long double can have an 80-bit representation in the IEEE 754 Standard, but it often has the same representation
as double. For this common case, you might consider copying the values out
of the example in the C Standard. (See page 61.)
You may find a few problems, however. Not all translators are equally
good at converting floating-point constants. Some may curdle the least
significant bit or two. That could cause overflow or underflow in the case
of some extreme values such as DBL_MAX and DBL_MIN. Or it could ruin the
critical behavior of other values such as DBL_EPSILON.
At the very least, you should check the bit patterns produced by the
floating-point values. You can do that by stuffing the value into a union one
way, then extracting it another way, as in:
union {
    double _D;
    unsigned short _Us[4];
    } dmax = {DBL_MAX};
Here, I assume that unsigned short occupies 16 bits and double is the IEEE
754 64-bit representation. Some computers store the most-significant word
at dmax._Us[0], others at dmax._Us[3]. You have to check what your implementation does. Whatever the case, the most significant word should have
the value 0x7FEF and all the other words should equal 0xFFFF.
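On a current compiler you can make the same check without caring where the most-significant word is stored. The sketch below copies the value into a 64-bit integer with memcpy. It assumes that double is the IEEE 754 64-bit format and shares its byte order with uint64_t, which holds on mainstream machines but is not guaranteed by the C Standard. The function names are hypothetical.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Return the 64-bit image of a double. Assumes an 8-byte IEEE 754
   double stored in the same byte order as uint64_t. */
uint64_t double_bits(double value)
{
    uint64_t bits;

    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Most-significant 16-bit word of the image, wherever the
   implementation happens to store it in memory. */
unsigned top_word(double value)
{
    return (unsigned)(double_bits(value) >> 48);
}
```

With these assumptions, top_word(DBL_MAX) should be 0x7FEF and the remaining 48 bits of double_bits(DBL_MAX) should all be ones.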
A safer approach is to do it the other way around. Initialize the union as
a sequence of bit patterns, then define the macro to access the union through
its floating-point member. Since you can initialize only the first member of
a union, you must reverse the member declarations from the example
above. With this approach, you place the following in <float.h>:
typedef union {
    unsigned short _Us[4];
    double _D;
    } _Dtype;
extern _Dtype _Dmax, _Dmin, _Deps;
#define DBL_MAX _Dmax._D
In a library source file you provide a definition for _Dmax and friends. For
the 80x86 family, which stores the least-significant word first, you write:
#include <float.h>
_Dtype _Dmax = {{0xffff, 0xffff, 0xffff, 0x7fef}};
The code is now less readable, but it is more robust. Figure 4.1 shows the
resulting version of float.h. Each macro refers to a field from one of three
data objects of type _Dvals: _Dbl, _Flt, and _Ldbl. A separate file called
xfloat.c defines the data objects.
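The initialize-as-bits idea can be sketched compactly with a 64-bit integer in place of four 16-bit words, which sidesteps the question of word order. All names below are hypothetical. The sketch assumes an IEEE 754 64-bit double that shares its byte order with uint64_t.

```c
#include <float.h>
#include <stdint.h>

/* The integer member comes first because an initializer sets only
   the first member of a union. */
typedef union {
    uint64_t bits;
    double value;
} double_image;

static const double_image my_dmax = { UINT64_C(0x7FEFFFFFFFFFFFFF) };
static const double_image my_dmin = { UINT64_C(0x0010000000000000) };

/* Access each image through its floating-point member. */
#define MY_DBL_MAX (my_dmax.value)
#define MY_DBL_MIN (my_dmin.value)

/* Compare against the values the implementation itself supplies. */
int images_match(void)
{
    return MY_DBL_MAX == DBL_MAX && MY_DBL_MIN == DBL_MIN;
}
```

No floating-point constant is ever converted by the translator, so a sloppy decimal-to-binary conversion cannot curdle the low-order bits.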
In writing the corresponding data objects, I encountered another annoying problem. You need different versions of these initializers for different
floating-point formats. Even if you stay within the IEEE 754 Standard you
must specify the order of bytes stored in a data object and whether long
double occupies 64 or 80 bits. Other formats with FLT_RADIX equal to 2 differ
only in niggling ways.
It was time to parametrize the code once again. On page 53, I introduced
the internal header <yvals.h>. That's where I put any parameters that vary
among translators. Error codes are one set of such parameters. The properties of floating-point representations constitute another. You can include
<yvals.h> in any library source file that must change in small ways across
implementations of C.
Figure 4.1: float.h
/* float.h standard header -- IEEE 754 version */
#ifndef _FLOAT
#define _FLOAT
#ifndef _YVALS
#include <yvals.h>
#endif
        /* type definitions */
typedef struct {
    int _Ddig, _Dmdig, _Dmax10e, _Dmaxe, _Dmin10e, _Dmine;
    union {
        unsigned short _Us[5];
        float _F;
        double _D;
        long double _Ld;
        } _Deps, _Dmax, _Dmin;
    } _Dvals;
        /* declarations */
extern _Dvals _Dbl, _Flt, _Ldbl;
        /* double properties */
#define DBL_DIG          _Dbl._Ddig
#define DBL_EPSILON      _Dbl._Deps._D
#define DBL_MANT_DIG     _Dbl._Dmdig
#define DBL_MAX          _Dbl._Dmax._D
#define DBL_MAX_10_EXP   _Dbl._Dmax10e
#define DBL_MAX_EXP      _Dbl._Dmaxe
#define DBL_MIN          _Dbl._Dmin._D
#define DBL_MIN_10_EXP   _Dbl._Dmin10e
#define DBL_MIN_EXP      _Dbl._Dmine
        /* float properties */
#define FLT_DIG          _Flt._Ddig
#define FLT_EPSILON      _Flt._Deps._F
#define FLT_MANT_DIG     _Flt._Dmdig
#define FLT_MAX          _Flt._Dmax._F
#define FLT_MAX_10_EXP   _Flt._Dmax10e
#define FLT_MAX_EXP      _Flt._Dmaxe
#define FLT_MIN          _Flt._Dmin._F
#define FLT_MIN_10_EXP   _Flt._Dmin10e
#define FLT_MIN_EXP      _Flt._Dmine
        /* common properties */
#define FLT_RADIX        2
#define FLT_ROUNDS       _FRND
        /* long double properties */
#define LDBL_DIG         _Ldbl._Ddig
#define LDBL_EPSILON     _Ldbl._Deps._Ld
#define LDBL_MANT_DIG    _Ldbl._Dmdig
#define LDBL_MAX         _Ldbl._Dmax._Ld
#define LDBL_MAX_10_EXP  _Ldbl._Dmax10e
#define LDBL_MAX_EXP     _Ldbl._Dmaxe
#define LDBL_MIN         _Ldbl._Dmin._Ld
#define LDBL_MIN_10_EXP  _Ldbl._Dmin10e
#define LDBL_MIN_EXP     _Ldbl._Dmine
#endif
_D0, _DOFF, _FOFF, _LOFF, _DBIAS, _FBIAS, _LBIAS
_DLONG is nonzero if long double has the IEEE 754 80-bit format.
_FRND is the value of the macro FLT_ROUNDS.
Figure 4.3 shows the code for xfloat.c. It is written in terms of these
parameters.
The code also contains a number of implicit assumptions:
FLT_RADIX has the value 2.
Type float has a 32-bit representation and exactly overlaps an array of 2
unsigned shorts, while type double has a 64-bit representation and exactly
overlaps an array of 4 unsigned shorts.
Type long double has the IEEE 754 80-bit representation only if _DLONG is
nonzero. Otherwise, it has the same representation as double.
The characteristic is never larger than 14 bits.
The fraction value in a float or double includes a hidden bit. This is the 1.
prepended to the FFF... above.
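Several of these assumptions can be checked mechanically before you trust the code on a new machine. The following is a minimal sketch using only <float.h> and <limits.h>. The function name is hypothetical, and it covers only the assumptions about float and double, not long double.

```c
#include <float.h>
#include <limits.h>

/* Return 1 if the representation assumptions about float and
   double hold on this implementation, 0 otherwise. */
int layout_assumptions_hold(void)
{
    return FLT_RADIX == 2
        && CHAR_BIT * sizeof(unsigned short) == 16
        && sizeof(float) == 2 * sizeof(unsigned short)
        && sizeof(double) == 4 * sizeof(unsigned short)
        && FLT_MANT_DIG == 24       /* 23 stored bits plus hidden bit */
        && DBL_MANT_DIG == 53;      /* 52 stored bits plus hidden bit */
}
```

A result of 0 does not say which assumption failed, so a port to an unusual machine would want one test per condition.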
As an example, here are the pertinent values for the Intel 80x87 coprocessors, assuming that double and long double have different representations:
Figure 4.2: x._Us[_D0] has the format SCCCCCCCCCCCFFFF; the other words hold FFFF...FFFF.
Figure 4.3: xfloat.c
/* values used by <float.h> macros -- IEEE 754 version */
#include <float.h>

        /* macros */
#define DFRAC   (49+_DOFF)
#define DMAXE   ((1U<<(15-_DOFF))-1)
#define FFRAC   (17+_FOFF)
#define FMAXE   ((1U<<(15-_FOFF))-1)
#define LFRAC   (49+_LOFF)
#define LMAXE   0x7fff
#define LOG2    0.30103
#if _D0 != 0    /* low to high words */
#define DINIT(w0, wx)       wx, wx, wx, w0
#define FINIT(w0, wx)       wx, w0
#define LINIT(w0, w1, wx)   wx, wx, wx, w1, w0
#else           /* high to low words */
#define DINIT(w0, wx)       w0, wx, wx, wx
#define FINIT(w0, wx)       w0, wx
#define LINIT(w0, w1, wx)   w0, w1, wx, wx, wx
#endif
        /* static data */
_Dvals _Dbl = {
    (int)((DFRAC-1)*LOG2),              /* DBL_DIG */
    (int)DFRAC,                         /* DBL_MANT_DIG */
    (int)((DMAXE-_DBIAS-1)*LOG2),       /* DBL_MAX_10_EXP */
    (int)(DMAXE-_DBIAS-1),              /* DBL_MAX_EXP */
    (int)(-_DBIAS*LOG2),                /* DBL_MIN_10_EXP */
    (int)(1-_DBIAS),                    /* DBL_MIN_EXP */
    {DINIT(_DBIAS-DFRAC+2<<_DOFF, 0)},  /* DBL_EPSILON */
    {DINIT((DMAXE<<_DOFF)-1, ~0)},      /* DBL_MAX */
    {DINIT(1<<_DOFF, 0)},               /* DBL_MIN */
    };
_Dvals _Flt = {
    (int)((FFRAC-1)*LOG2),              /* FLT_DIG */
    (int)FFRAC,                         /* FLT_MANT_DIG */
    (int)((FMAXE-_FBIAS-1)*LOG2),       /* FLT_MAX_10_EXP */
    (int)(FMAXE-_FBIAS-1),              /* FLT_MAX_EXP */
    (int)(-_FBIAS*LOG2),                /* FLT_MIN_10_EXP */
    (int)(1-_FBIAS),                    /* FLT_MIN_EXP */
    {FINIT(_FBIAS-FFRAC+2<<_FOFF, 0)},  /* FLT_EPSILON */
    {FINIT((FMAXE<<_FOFF)-1, ~0)},      /* FLT_MAX */
    {FINIT(1<<_FOFF, 0)},               /* FLT_MIN */
    };
#if _DLONG
_Dvals _Ldbl = {
    (int)((LFRAC-1)*LOG2),              /* LDBL_DIG */
    (int)LFRAC,                         /* LDBL_MANT_DIG */
    (int)((LMAXE-_LBIAS-1)*LOG2),       /* LDBL_MAX_10_EXP */
    (int)(LMAXE-_LBIAS-1),              /* LDBL_MAX_EXP */
    (int)(-_LBIAS*LOG2),                /* LDBL_MIN_10_EXP */
    (int)(1-_LBIAS),                    /* LDBL_MIN_EXP */
    {LINIT(_LBIAS-LFRAC+2, 0x8000, 0)}, /* LDBL_EPSILON */
    {LINIT(LMAXE-1, ~0, ~0)},           /* LDBL_MAX */
    {LINIT(1, 0x8000, 0)},              /* LDBL_MIN */
    };
#else
_Dvals _Ldbl = {
    (int)(DFRAC*LOG2),                  /* LDBL_DIG */
    (int)DFRAC,                         /* LDBL_MANT_DIG */
    (int)((DMAXE-_DBIAS-1)*LOG2),       /* LDBL_MAX_10_EXP */
    (int)(DMAXE-_DBIAS-1),              /* LDBL_MAX_EXP */
    (int)(-_DBIAS*LOG2),                /* LDBL_MIN_10_EXP */
    (int)(1-_DBIAS),                    /* LDBL_MIN_EXP */
    {DINIT(_DBIAS-DFRAC+2<<_DOFF, 0)},  /* LDBL_EPSILON */
    {DINIT((DMAXE<<_DOFF)-1, ~0)},      /* LDBL_MAX */
    {DINIT(1<<_DOFF, 0)},               /* LDBL_MIN */
    };
#endif
Testing <float.h>
Figure 4.4 shows the test program tfloat.c. It begins by printing the
values of the macros defined in <float.h> in a form that people can better
understand. It then checks that the macros meet the minimum requirements
spelled out in the C Standard.
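The second half of that job, checking against the minimum requirements, can be sketched without any printing at all. The limits used below are the ones quoted from the C Standard earlier in this chapter (1E+37, 1E-37, 1E-5, 1E-9, and +37) plus the DBL_DIG bound of 10 that the test program asserts. The function name is hypothetical, and this is not the listing from Figure 4.4.

```c
#include <float.h>

/* Return 1 if the <float.h> macros satisfy the minimum magnitudes
   required by the C Standard, 0 otherwise. */
int float_h_meets_minimums(void)
{
    return FLT_RADIX >= 2
        && DBL_DIG >= 10 && FLT_DIG <= DBL_DIG && DBL_DIG <= LDBL_DIG
        && FLT_EPSILON <= 1e-5
        && DBL_EPSILON <= 1e-9 && LDBL_EPSILON <= 1e-9
        && FLT_MAX >= 1e37 && DBL_MAX >= 1e37 && LDBL_MAX >= 1e37
        && FLT_MIN <= 1e-37 && DBL_MIN <= 1e-37 && LDBL_MIN <= 1e-37
        && FLT_MAX_10_EXP >= 37
        && DBL_MAX_10_EXP >= 37 && LDBL_MAX_10_EXP >= 37;
}
```

Because it uses only comparisons, the check needs nothing from the math library and can run very early in a port.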
Here is the output for the Intel 80x87 coprocessor, on an implementation
that supports all three sizes of IEEE 754 operands:
FLT_RADIX = 2

DBL_DIG =           15   DBL_MANT_DIG =     53
DBL_MAX_10_EXP =   308   DBL_MAX_EXP =    1024
DBL_MIN_10_EXP =  -307   DBL_MIN_EXP =   -1021
    DBL_EPSILON = 2.220446e-16
        DBL_MAX = 1.797693e+308
        DBL_MIN = 2.225074e-308

FLT_DIG =            6   FLT_MANT_DIG =     24
FLT_MAX_10_EXP =    38   FLT_MAX_EXP =     128
FLT_MIN_10_EXP =   -37   FLT_MIN_EXP =    -125
    FLT_EPSILON = 1.192093e-07
        FLT_MAX = 3.402823e+38
        FLT_MIN = 1.175494e-38

LDBL_DIG =           19 LDBL_MANT_DIG =     64
LDBL_MAX_10_EXP =  4932 LDBL_MAX_EXP =   16384
LDBL_MIN_10_EXP = -4931 LDBL_MIN_EXP =  -16381
    LDBL_EPSILON = 1.084202e-19
        LDBL_MAX = 1.189731e+4932
        LDBL_MIN = 3.362103e-4932
SUCCESS testing <float.h>
Figure 4.4: tfloat.c, Part 1
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdio.h>

int main()
    {   /* test basic properties of float.h macros */
    double radlog;
    int digs;
    static int radix = FLT_RADIX;

    printf("FLT_RADIX = %i\n\n", FLT_RADIX);
    printf("DBL_DIG =        %5i   DBL_MANT_DIG = %6i\n",
        DBL_DIG, DBL_MANT_DIG);
    printf("DBL_MAX_10_EXP = %5i   DBL_MAX_EXP =  %6i\n",
        DBL_MAX_10_EXP, DBL_MAX_EXP);
    printf("DBL_MIN_10_EXP = %5i   DBL_MIN_EXP =  %6i\n",
        DBL_MIN_10_EXP, DBL_MIN_EXP);
    printf("    DBL_EPSILON = %le\n", DBL_EPSILON);
    printf("        DBL_MAX = %le\n", DBL_MAX);
    printf("        DBL_MIN = %le\n\n", DBL_MIN);
    printf("FLT_DIG =        %5i   FLT_MANT_DIG = %6i\n",
        FLT_DIG, FLT_MANT_DIG);
    printf("FLT_MAX_10_EXP = %5i   FLT_MAX_EXP =  %6i\n",
        FLT_MAX_10_EXP, FLT_MAX_EXP);
    printf("FLT_MIN_10_EXP = %5i   FLT_MIN_EXP =  %6i\n",
        FLT_MIN_10_EXP, FLT_MIN_EXP);
    printf("    FLT_EPSILON = %e\n", FLT_EPSILON);
    printf("        FLT_MAX = %e\n", FLT_MAX);
    printf("        FLT_MIN = %e\n\n", FLT_MIN);
    printf("LDBL_DIG =        %5i LDBL_MANT_DIG = %6i\n",
        LDBL_DIG, LDBL_MANT_DIG);
    printf("LDBL_MAX_10_EXP = %5i LDBL_MAX_EXP =  %6i\n",
        LDBL_MAX_10_EXP, LDBL_MAX_EXP);
    printf("LDBL_MIN_10_EXP = %5i LDBL_MIN_EXP =  %6i\n",
        LDBL_MIN_10_EXP, LDBL_MIN_EXP);
    printf("    LDBL_EPSILON = %Le\n", LDBL_EPSILON);
    printf("        LDBL_MAX = %Le\n", LDBL_MAX);
    printf("        LDBL_MIN = %Le\n", LDBL_MIN);
    radlog = log10(radix);
        /* test double properties */
    assert(10 <= DBL_DIG && FLT_DIG <= DBL_DIG);
    assert(DBL_EPSILON <= 1e-9);
    digs = (DBL_MANT_DIG - 1) * radlog;
    assert(digs <= DBL_DIG && DBL_DIG <= digs + 1);
    assert(1e37 <= DBL_MAX);
    assert(37 <= DBL_MAX_10_EXP);
#if FLT_RADIX == 2
    assert(ldexp(1.0, DBL_MAX_EXP - 1) < DBL_MAX);
    assert(ldexp(1.0, DBL_MIN_EXP - 1) == DBL_MIN);
#endif
Continuing tfloat.c, Part 2
References
ANSI/IEEE Standard 754-1985 (Piscataway, N.J.: Institute of Electrical and
Electronics Engineers, Inc., 1985). This is the floating-point standard widely
used in modern microprocessors.
Jack J. Dongarra and Eric Grosse, "Distribution of Mathematical Software via Electronic Mail," Communications of the ACM, 30 (1987), pp. 403-407. This article describes how you can obtain various test programs via
electronic mail. Two programs you can obtain via electronic mail beat
particularly hard on floating-point arithmetic:
The program enquire tests the properties of the floating-point arithmetic
that accompanies a C implementation. It prints its findings in the form
of a usable float.h file. Written by Steven Pemberton of CWI, Amsterdam, enquire is available through the Internet address steven@cwi.nl.
Pat Sterbenz, Floating-Point Computation (Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1973). This book is old and currently out of print. Nevertheless, it is hard to find a better discussion of the basic issues.
Exercises
Exercise 4.1 Determine the parameters that characterize floating-point arithmetic for
the C translator you use. Do they conform to the IEEE 754 Standard?
Exercise 4.2 Can you alter <yvals.h> to adapt <float.h> and xfloat.c for the C translator you use? If so, do so. If not, what else must you alter?
Exercise 4.3 Consider the following code sequence:
double d = 1.0;
float a[N];
for (i = 0; i < N; ++i)
    d *= a[i];
In IEEE 754 floating-point arithmetic, how large can N be before you have
to worry about overflow in the computation of d?
In IEEE 754 floating-point arithmetic, how large can N be before you have
to worry about overflow in the computation of la?
Exercise 4.5 Why is the header <yvals.h> included directly in <float.h> (as opposed to
including it only in xfloat.c)? Alter the code in this chapter to eliminate
the need.
Exercise 4.6 You are given the function int _Getrnd(void) that returns the current
floating-point rounding status. Alter the macro FLT_ROUNDS to return the
current status.
Exercise 4.7 [Harder] Write a C program that determines the values of the macros
defined in <float.h> solely by performing arithmetic. Assume that you
don't know the underlying floating-point representation.
Exercise 4.8 [Very hard] Alter the program from the previous exercise to work safely
even on an implementation that aborts execution on floating-point overflow. Assume that the program cannot regain control once overflow occurs.
Chapter 5: <limits.h>
SCHAR_MIN
SCHAR_MAX
UCHAR_MAX
CHAR_MIN
CHAR_MAX
MB_LEN_MAX
SHRT_MIN
SHRT_MAX
USHRT_MAX
INT_MIN
INT_MAX
UINT_MAX
LONG_MIN
LONG_MAX
ULONG_MAX
9. See 6.1.2.5.
Using <limits.h>
You can use <limits.h> one of two ways. The simpler way assures that
you do not produce a silly program. Let's say, for example, that you want
to represent some signed data that ranges in value between VAL_MIN and
VAL_MAX. You can keep the program from translating incorrectly by writing:
#include <assert.h>
#include <limits.h>
#if VAL_MIN < INT_MIN || INT_MAX < VAL_MAX
#error values out of range
#endif
You can then safely store the data in data objects declared with type int.
A more elaborate way to use <limits.h> is to control the choice of types
in a program. You can alter the example above to read:
#include <assert.h>
#include <limits.h>
#if VAL_MIN < INT_MIN || INT_MAX < VAL_MAX
typedef long Val_t;
#else
typedef int Val_t;
#endif
You then declare all data objects that must hold this range of values as
having type Val_t. The program chooses the more efficient type.
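As a minimal sketch of the second approach, the fragment below fills in a concrete range. The values of VAL_MIN and VAL_MAX and the helper functions val_in_range and val_clamp are assumptions made up for illustration; they do not come from the book's code.

```c
#include <limits.h>

/* Hypothetical application range, chosen only for illustration. */
#define VAL_MIN (-100000L)
#define VAL_MAX 100000L

#if VAL_MIN < INT_MIN || INT_MAX < VAL_MAX
typedef long Val_t;      /* int is too narrow for the range */
#else
typedef int Val_t;       /* int is wide enough */
#endif

/* Return nonzero if v lies in the supported range. */
int val_in_range(long v)
{
    return VAL_MIN <= v && v <= VAL_MAX;
}

/* Clamp an arbitrary long into the range and store it as a Val_t. */
Val_t val_clamp(long v)
{
    if (v < VAL_MIN)
        v = VAL_MIN;
    else if (VAL_MAX < v)
        v = VAL_MAX;
    return (Val_t)v;
}
```

On a translator with 16-bit ints the typedef silently becomes long; on one with 32-bit ints it stays int. The rest of the program refers only to Val_t and need not care which was chosen.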
The presence of <limits.h> is also designed to discourage an old programming trick that is extremely nonportable. Some programs attempted
to test the properties of the execution environment by writing #if directives:
#if (-1 + 0x0) >> 1 > 0x7fff
/* must have ints greater than 16 bits */
Figure 5.1: limits.h
heavily with cross compilers know well that the translation environment
can differ markedly from the execution environment. For tricks like this
one to work, the C Standard would have to require that the translator mimic
the execution environment very closely. And translator families with a
common front end would have to adapt translation-time arithmetic to suit
each environment.
X3J11 discussed such requirements at length. In the end, we decided that
the preprocessor was not the creature to burden with such stringent requirements. The translator must closely model the execution environment
in many ways, to be sure. It must compute constant expressions - to
Implementing <limits.h>
The only code you have to provide for this header is the header itself.
All the macros defined in <limits.h> are testable within #if directives and
are unlikely to change during execution. (The same is not true of most of
the macros defined in <float.h>.)
Most modern computers have 8-bit chars, 2-byte shorts, and 4-byte longs.
There are several common variations on this principal theme:
An int is either 2 or 4 bytes.
A char has the same range of values as either signed char or unsigned char.
Signed values are encoded most frequently in two's complement, which has only one form of zero but one negative value that has no corresponding positive value. Less common are one's complement and signed magnitude. Both have two forms of zero but no extra negative value.
The number of bytes for a single multibyte character can be any value greater than zero.
I found it convenient, therefore, to write a version of <limits.h> that
expands to any of these common choices. Figure 5.1 shows the file limits.h.
It includes the configuration file <yvals.h>, which I introduced on page 53.
That file also provides parameters for the header <float.h>, described on
page 65. Among other things, <yvals.h> defines the macros:
_ILONG - nonzero if an int has 4 bytes
_CSIGN - nonzero if a char is signed
_C2 - 1 if the encoding is two's complement, else 0
_MBMAX - the worst-case length of a single multibyte character
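A header can turn parameters like these into definitions along the following lines. This is only a sketch, not the text of Figure 5.1. The names MY_ILONG, MY_C2, MY_INT_MAX, MY_INT_MIN, and limits_agree are hypothetical, and the parameter values assume a target with 4-byte, two's-complement ints.

```c
#include <limits.h>

/* Hypothetical stand-ins for the <yvals.h> parameters named in the text.
   The values are assumptions for a 4-byte, two's-complement int. */
#define MY_ILONG 1      /* nonzero if an int has 4 bytes */
#define MY_C2    1      /* 1 if two's complement, else 0 */

#if MY_ILONG
#define MY_INT_MAX 2147483647
#else
#define MY_INT_MAX 32767
#endif
/* Built from int operands, so the result keeps type int. */
#define MY_INT_MIN (-MY_INT_MAX - MY_C2)

/* Report whether the sketch agrees with the translator's own header. */
int limits_agree(void)
{
    return MY_INT_MAX == INT_MAX && MY_INT_MIN == INT_MIN;
}
```

Writing the minimum as an expression in terms of the maximum avoids spelling out a constant whose magnitude does not fit in an int.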
The use of the macro _C2 obscures an important subtlety. On a two's-complement machine, you cannot simply write the obvious value for
INT_MIN. On a 16-bit machine, for example, the sequence of characters
-32768 parses as two tokens, a minus sign and the integer constant with
value 32,768. The latter has type long because it is too large to represent as
type int. Negating this value doesn't change its type. The C Standard
requires, however, that INT_MIN have type int. Otherwise, you can be
astonished by the behavior of a statement as innocent looking as:
Figure 5.2: tlimits.c, Part 1
/* test limits macros */
#include <limits.h>
#include <stdio.h>

int main()
    {   /* test basic properties of limits.h macros */
    printf("CHAR_BIT = %2i   MB_LEN_MAX = %2i\n\n",
        CHAR_BIT, MB_LEN_MAX);
    printf(" CHAR_MAX = %10i   CHAR_MIN = %10i\n",
        CHAR_MAX, CHAR_MIN);
    printf("SCHAR_MAX = %10i  SCHAR_MIN = %10i\n",
        SCHAR_MAX, SCHAR_MIN);
    printf("UCHAR_MAX = %10u\n\n", UCHAR_MAX);
    printf(" SHRT_MAX = %10i   SHRT_MIN = %10i\n",
        SHRT_MAX, SHRT_MIN);
    printf("USHRT_MAX = %10u\n\n", USHRT_MAX);
    printf("  INT_MAX = %10i    INT_MIN = %10i\n",
        INT_MAX, INT_MIN);
    printf(" UINT_MAX = %10u\n\n", UINT_MAX);
    printf(" LONG_MAX = %10li   LONG_MIN = %10li\n",
        LONG_MAX, LONG_MIN);
    printf("ULONG_MAX = %10lu\n", ULONG_MAX);
#if CHAR_BIT < 8 || CHAR_MAX < 127 || 0 < CHAR_MIN \
    || CHAR_MAX != SCHAR_MAX && CHAR_MAX != UCHAR_MAX
#error bad char
#endif
#if INT_MAX < 32767 || -32767 < INT_MIN || INT_MAX < SHRT_MAX
#error bad int properties
#endif
#if LONG_MAX < 2147483647 || -2147483647 < LONG_MIN \
    || LONG_MAX < INT_MAX
#error bad long properties
#endif
#if MB_LEN_MAX < 1
#error bad MB_LEN_MAX
#endif
#if SCHAR_MAX < 127 || -127 < SCHAR_MIN
#error bad signed char properties
#endif
Continuing tlimits.c, Part 2
#endif
Testing <limits.h>
Figure 5.2 shows the test program tlimits.c. It provides a brief sanity
check you can run on <limits.h>. It is by no means exhaustive, but it does
tell you whether the header is basically sane. It also provides a readable
summary of the values of the macros defined in <limits.h>.
Note that all the action occurs at translation time. That's because all the
macros must be usable within #if directives. If this test compiles, it will
surely run, print its summary and success message, then exit with successful status.
Here is the output for a PC-compatible implementation that represents
char the same as signed char:
CHAR_BIT =      MB_LEN_MAX =

 CHAR_MAX =        127   CHAR_MIN =       -128
SCHAR_MAX =        127  SCHAR_MIN =       -128
UCHAR_MAX =        255

 SHRT_MAX =      32767   SHRT_MIN =     -32768
USHRT_MAX =      65535

  INT_MAX =      32767    INT_MIN =     -32768
 UINT_MAX =      65535

 LONG_MAX = 2147483647   LONG_MIN = -2147483648
ULONG_MAX = 4294967295
SUCCESS testing <limits.h>
References
The program
enquire,
limits.h.
Exercises
Exercise 5.1 Determine the parameters that characterize integer arithmetic for the C
translator you use.
Exercise 5.2 Adapt <limits.h> for the C translator you use.
Exercise 5.3 Consider the following code sequence:
int in = 1.0;
short a[N];
for (i = 0; i < N; ++i)
    in *= a[i];
For the C translator you use, how large can N be before you have to worry
about overflow in the computation of in? How large can N be in a program
intended to run with an arbitrary C translator?
For the C translator you use, how large can N be before you have to worry
about overflow in the computation of lo? How large can N be in a program
intended to run with an arbitrary C translator?
Chapter 6: <locale.h>
Background
The header <locale.h> is an invention of X3J11, the committee that
developed the C Standard. You will find little that resembles locales in
earlier implementations of C. That stands at odds with the committee's
stated purpose, to "codify existing practice." Nevertheless, those of us
active within X3J11 at that time felt we were acting out of the best of motives
- self defense.
This particular header popped up about five years after work began on
the C Standard. At that time, many of us felt that the Standard was
essentially complete. We were simply putting a few finishing touches on a
product in which we had invested five years of our lives. Resistance was
mounting to change of any sort.
About then, we learned that a number of Europeans were unhappy with
certain parts of the C Standard being developed by X3J11. It was simply too
American in several critical ways. They despaired of trying to educate
insular Yankees about the needs of the world marketplace. Rather, they
were content to wait and fight their battles on a more congenial field. The
Europeans took it for granted that an ISO standard for C must differ from
the ANSI C Standard.
Many of us disagreed with that position. We felt it imperative that
whatever standard ANSI developed had to be acceptable to the international community. We had seen the effects in the past of computer language
standards that differed around the world. Our five years of effort would
be in vain, we felt, if the final word on C came from a separate committee
second guessing all our decisions.
So we asked the Europeans to show us their shopping list of changes.
Most of the items on the list dealt with ways to adapt C programs to
different cultures. That is a much more obvious problem in a land of many
languages and nations such as Europe. Americans enjoy the luxury of a
single (widely used if not official) language and a fairly simple alphabet.
AT&T Bell Laboratories went so far as to host a special meeting to deal
with various issues of internationalization. (This is a big word that people
are uttering more and more often. It seems to have no acceptable synonym
that is any shorter. The informal solution is to introduce the barbarism I18N,
pronounced "EYE eighteen EN." The 18 stands for the number of letters
omitted.) Out of that meeting came the proposal for adding locale support
to Standard C. The machinery eventually adopted is remarkably close to
the original proposal.
Adding locales to C had the desired effect. Many of the objections to
ANSI C as an international standard were derailed. It cost X3J11 an extra
year, by my estimation, to hammer out locales. And we probably spent yet
another year dealing with residual issues from the international community. (WG14, the ISO C standard committee, is still working on additions to
the existing C Standard.) Nevertheless, we succeeded in producing a standard for C that is currently identical at both ANSI and ISO levels.
Writing adaptive code is not entirely new. An early form sprang up about
fifteen years ago in the UNIX operating system. Folks got the idea of adding
environment variables to the system call that launches new processes. (That
service is called exec, or some variant thereof, in UNIX land.) Environment
variables are an open-ended set of names, each of which identifies a
null-terminated string that represents its value. You can add, alter, or delete
environment variables in a process. Should that process launch another
process, the environment variables are automatically copied into the image
of the new process.
The new process can simply ignore environment variables. It loses a few
dozen, or a few hundred, bytes of storage that it might otherwise enjoy. Or
it can look for certain environment variables and study their current values.
A common variable is TZ, which provides information to the library date
functions about the current time zone. If the value of TZ is, say, EST05EDT,
the time functions know to label local standard time as EST and local
Daylight Savings Time as EDT. The local (standard) time zone is 5 hours
earlier than UTC, known in the past as Greenwich Mean Time.
Environment variables have many uses. They are a great way to smuggle
file names into an application program. It is almost always a bad idea to
wire file names directly into a program. Prompting the user for file names
is mostly a good idea, except for "secret" files about which the user should
not have to be informed. Asking for such a file name on the command line
that starts the program is somewhat better, but it can be a nuisance. It is a
particular nuisance if several programs in a suite need access to the same
file name. That's why it is often much nicer to set an environment variable
to the file name once and for all in a script that starts a session. The file name
is captured in one place, but is made available to a whole suite of programs.
Microsoft's MS-DOS supports environment variables too - one of many
good ideas borrowed from UNIX. Several commercial software packages
use environment variables to advantage. A common use is to locate special
directories that contain support files or that are well suited for hosting
temporary files. But they have many other uses as well.
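In Standard C, a program reads an environment variable with getenv, declared in <stdlib.h>. The sketch below shows one way to look up a directory name with a fallback. The function name env_or_default and any variable names you pass to it are made up for illustration.

```c
#include <stdlib.h>

/* Return the value of the environment variable `name`, or `fallback`
   if the variable is not set. The returned string must not be modified. */
const char *env_or_default(const char *name, const char *fallback)
{
    const char *val = getenv(name);

    return val != NULL ? val : fallback;
}
```

A program that wants a scratch directory might call env_or_default with a variable name agreed upon by the whole suite and a default that is reasonable for the host system.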
The Standard C library includes the function getenv, declared in
<stdlib.h>. Call getenv with the name of an environment variable and it
struct lconv
which contains members related to the formatting of numeric values. The structure shall contain
at least the following members, in any order. The semantics of the members and their normal
ranges is explained in 7.4.2.1. In the "C" locale, the members shall have the values specified in
the comments.
char *decimal_point;       /* "." */
char *thousands_sep;       /* "" */
char *grouping;            /* "" */
char *int_curr_symbol;     /* "" */
char *currency_symbol;     /* "" */
char *mon_decimal_point;   /* "" */
char *mon_thousands_sep;   /* "" */
char *mon_grouping;        /* "" */
char *positive_sign;       /* "" */
char *negative_sign;       /* "" */
char int_frac_digits;      /* CHAR_MAX */
char frac_digits;          /* CHAR_MAX */
char p_cs_precedes;        /* CHAR_MAX */
char p_sep_by_space;       /* CHAR_MAX */
char n_cs_precedes;        /* CHAR_MAX */
char n_sep_by_space;       /* CHAR_MAX */
char p_sign_posn;          /* CHAR_MAX */
char n_sign_posn;          /* CHAR_MAX */
NULL
LC_ALL
LC_COLLATE
LC_CTYPE
LC_MONETARY
LC_NUMERIC
LC_TIME
which expand to integral constant expressions with distinct values, suitable for use as the first
argument to the setlocale function. Additional macro definitions, beginning with the characters LC_ and an uppercase letter (footnote 100), may also be specified by the implementation.
Description
The setlocale function selects the appropriate portion of the program's locale as specified
by the category and locale arguments. The setlocale function may be used to change
or query the program's entire current locale or portions thereof. The value LC_ALL for category names the program's entire locale; the other values for category name only a portion
of the program's locale. Category LC_COLLATE affects the behavior of the strcoll and
strxfrm functions. Category LC_CTYPE affects the behavior of the character handling
functions (footnote 101) and the multibyte functions. Category LC_MONETARY affects the monetary formatting information returned by the localeconv function. Category LC_NUMERIC affects the
decimal-point character for the formatted input/output functions and the string conversion
functions, as well as the nonmonetary formatting information returned by the localeconv
function. Category LC_TIME affects the behavior of the strftime function.
A value of "C" for locale specifies the minimal environment for C translation; a value of
"" for locale specifies the implementation-defined native environment. Other implementation-defined strings may be passed as the second argument to setlocale.
At program startup, the equivalent of
setlocale(LC_ALL, "C");
is executed.
A null pointer for locale causes the setlocale function to return a pointer to the string
associated with the category for the program's current locale; the program's locale is not
changed (footnote 102).
The pointer to string returned by the setlocale function is such that a subsequent call with
that string value and its associated category will restore that part of the program's locale. The
string pointed to shall not be modified by the program, but may be overwritten by a subsequent
call to the setlocale function.
Forward references: formatted input/output functions (7.9.6), the multibyte character functions
(7.10.7), the multibyte string functions (7.10.8), string conversion functions (7.10.1), the strcoll function (7.11.4.3), the strftime function (7.12.3.5), the strxfrm function (7.11.4.5).
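The query form described above, a null pointer for locale, can be exercised with a few lines. This is a minimal sketch; the function name in_c_locale is made up for illustration, and the comparison assumes that the implementation names the "C" locale with the string "C" when queried, which is common but not spelled out in the excerpt.

```c
#include <locale.h>
#include <string.h>

/* Return nonzero if the whole program locale is currently "C". */
int in_c_locale(void)
{
    const char *name = setlocale(LC_ALL, NULL);   /* query only, no change */

    return name != NULL && strcmp(name, "C") == 0;
}
```

Because the query does not change the locale, a program can call this at any point without side effects. The returned string may be overwritten by a later call to setlocale, so copy it if you need to keep it.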
localeconv
Description
The localeconv function sets the components of an object with type struct lconv with
values appropriate for the formatting of numeric quantities (monetary and otherwise) according
to the rules of the current locale.
The members of the structure with type char * are pointers to strings, any of which (except
decimal_point) can point to "", to indicate that the value is not available in the current locale
or is of zero length. The members with type char are nonnegative numbers, any of which can
be CHAR_MAX to indicate that the value is not available in the current locale. The members include
the following:
char *decimal_point
The decimal-point character used to format nonmonetary quantities.
char *grouping
A string whose elements indicate the size of each group of digits in formatted nonmonetary
quantities.
char *mon_decimal_point
The decimal-point used to format monetary quantities.
char *negative_sign
The string used to indicate a negative-valued formatted monetary quantity.
char p_cs_precedes
Set to 1 or 0 if the currency_symbol respectively precedes or succeeds the value for
a nonnegative formatted monetary quantity.
char n_cs_precedes
Set to 1 or 0 if the currency_symbol respectively precedes or succeeds the value for
a negative formatted monetary quantity.
char p_sign_posn
Set to a value indicating the positioning of the positive_sign for a nonnegative
formatted monetary quantity.
char n_sign_posn
Set to a value indicating the positioning of the negative_sign for a negative formatted
monetary quantity.
The elements of grouping and mon_grouping are interpreted according to the following:
CHAR_MAX  No further grouping is to be performed.
0         The previous element is to be repeatedly used for the remainder of the digits.
other     The integer value is the number of digits that comprise the current group. The next element
          is examined to determine the size of the next group of digits before the current group.
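The grouping rules are easier to follow in code. The sketch below applies a grouping string to a string of digits, working from the right as the rules require. The function name group_digits and its interface are assumptions for illustration; it is not part of the Standard C library, and it assumes digit strings of at most 60 characters.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical helper: copy the digit string `digits` into `out`,
   inserting `sep` between groups as directed by `grouping` (encoded like
   the grouping member of struct lconv). `out` must hold at least
   2 * strlen(digits) + 1 characters. Assumes at most 60 digits.
   Returns `out`. */
char *group_digits(char *out, const char *digits,
                   const char *grouping, char sep)
{
    char tmp[128];                  /* result is built right to left */
    size_t n = strlen(digits);
    size_t t = sizeof tmp;
    const char *g = grouping;
    int size = *g;                  /* size of the current group */
    int count = 0;                  /* digits placed in the current group */

    tmp[--t] = '\0';
    while (n > 0 && t > 1) {
        if (size > 0 && size != CHAR_MAX && count == size) {
            tmp[--t] = sep;         /* current group is full */
            count = 0;
            if (g[1] != '\0')       /* a 0 element repeats the previous one */
                size = *++g;
        }
        tmp[--t] = digits[--n];
        ++count;
    }
    return strcpy(out, &tmp[t]);
}
```

With grouping "\3" and separator ',', the digits 1234567 come out as 1,234,567. With the empty string that the "C" locale supplies, no separators are inserted at all.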
The value of p_sign_posn and n_sign_posn is interpreted according to the following:
0
1
2
3
The implementation shall behave as if no library function calls the localeconv function.
Returns
The localeconv function returns a pointer to the filled-in object. The structure pointed to
by the return value shall not be modified by the program, but may be overwritten by a subsequent
call to the localeconv function. In addition, calls to the setlocale function with categories
LC_ALL, LC_MONETARY, or LC_NUMERIC may overwrite the contents of the structure.
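A short sketch shows how a program might inspect the structure that localeconv returns. It checks a few members against the values the Standard lists for the "C" locale. The function name lconv_is_c_like is made up for illustration, and only a handful of members are examined.

```c
#include <limits.h>
#include <locale.h>
#include <string.h>

/* Return nonzero if the current locale reports the "C"-locale values
   for the members checked here. */
int lconv_is_c_like(void)
{
    const struct lconv *p = localeconv();

    return strcmp(p->decimal_point, ".") == 0
        && strcmp(p->thousands_sep, "") == 0
        && strcmp(p->currency_symbol, "") == 0
        && p->frac_digits == CHAR_MAX;
}
```

The pointer is used immediately and not stored, since a later call to localeconv or setlocale may overwrite the structure.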
Example
The following table illustrates the rules which may well be used by four countries to format
monetary quantities.
Country      Positive format   Negative format   International format
Italy        L.1.234           -L.1.234          ITL.1.234
Netherlands  F 1.234,56        F -1.234,56       NLG 1.234,56
Norway       kr1.234,56        kr1.234,56-       NOK 1.234,56
Switzerland  SFrs.1,234.56     SFrs.1,234.56C    CHF 1,234.56
For these four countries, the respective values for the monetary members of the structure
returned by localeconv are:
                    Italy    Netherlands  Norway   Switzerland
int_curr_symbol     "ITL."   "NLG "       "NOK "   "CHF "
currency_symbol     "L."     "F"          "kr"     "SFrs."
mon_decimal_point   ""       ","          ","      "."
mon_thousands_sep   "."      "."          "."      ","
mon_grouping        "\3"     "\3"         "\3"     "\3"
positive_sign       ""       ""           ""       ""
negative_sign       "-"      "-"          "-"      "C"
int_frac_digits     0        2            2        2
frac_digits         0        2            2        2
p_cs_precedes       1        1            1        1
p_sep_by_space      0        1            0        0
n_cs_precedes       1        1            1        1
n_sep_by_space      0        1            0        0
p_sign_posn         1        1            1        1
n_sign_posn         1        4            2        2
Footnotes
100. See "future library directions" (7.13.3).
101. The only functions in 7.3 whose behavior is not affected by the current locale are isdigit
and isxdigit.
102. The implementation must arrange to encode in a string the various categories due to a
heterogeneous locale when category has the value LC_ALL.
Using <locale.h>
Much of the information provided in a locale is purely informative. C
has never treated monetary values as a special data type, so the rest of the
Standard C library is unaffected by a change in the category LC_MONETARY.
On the other hand, some changes in locale very definitely affect how certain
library functions behave. If a culture uses a comma for a decimal point, then
the scan functions should accept commas and the print functions should
produce commas in the proper places. That is indeed what happens. Here
are all the places where library behavior changes with locale:
The functions strcoll and strxfrm, declared in <string.h>, can change how they collate when category LC_COLLATE changes.
The functions declared in <ctype.h>, the print and scan functions, declared in <stdio.h>, and the numeric conversion functions, declared in <stdlib.h>, can change how they test and alter certain characters when category LC_CTYPE changes.
The multibyte functions, declared in <stdlib.h>, and the print and scan functions, declared in <stdio.h>, can change how they parse and translate multibyte strings when category LC_CTYPE changes.
The print and scan functions, declared in <stdio.h>, and atof and strtod, declared in <stdlib.h>, can change what they use for the decimal point character when category LC_NUMERIC changes.
The strftime function, declared in <time.h>, can change how it converts times to character strings when category LC_TIME changes.
The localeconv function, declared in <locale.h>, can change what it returns when categories LC_MONETARY or LC_NUMERIC change.
If you are half as nervous as I am, this litany of changes should scare
you. How do you write portable code if large chunks of the Standard C
library can change behavior underfoot? Can you ship code to Germany and
know what isalpha will do when it runs there? If you mix your code with
functions from another source, how much trouble can they cause? Each
time your functions get control, you may be running in a different locale.
How do you code under those conditions?
X3J11 anguished about such issues when we spelled out the behavior of
locales. We recognized that many people don't want to be bothered with
this machinery at all. Those folks should suffer little from the addition of
locales. Still others have only modest goals. They want to trade in the
Americanisms wired into older C for conventions more in tune with their
culture. Still others are ambitious. They want to write code that can be sold
unchanged, in object-module or executable form, in numerous markets.
That code must be very sophisticated about changing locales.
The simplest way to use locales is to ignore them. Every Standard C
program starts up in the "C" locale. In this locale, the traditional library
functions behave pretty much as they always have. islower returns a
nonzero value only for the 26 lowercase letters of the English alphabet, for
example. The decimal point is a dot. If your program never calls setlocale,
none of this behavior can change.
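You can check the claim about islower directly. The following sketch counts how many character codes islower accepts in whatever locale is current. The function name count_lowercase is made up for illustration.

```c
#include <ctype.h>
#include <limits.h>

/* Count the character codes that islower accepts in the current locale. */
int count_lowercase(void)
{
    int c, n = 0;

    for (c = 0; c <= UCHAR_MAX; ++c)
        if (islower(c))
            ++n;
    return n;
}
```

In a program that never calls setlocale, the count should be 26. After a change of category LC_CTYPE, the same function may report a different number.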
The next simplest way to use locales is to change once, just after program
startup, and leave it at that. The C Standard requires no other locale names
besides "C". But it does define a native locale designated by the empty string
"". If your program executes:
setlocale(LC_ALL, "");
.....
Now you can use the functions declared in <ctype.h> with assurance that
you are working in the "C" locale. When you're done, revert the locale by
writing:
Note that the code stumbles bravely onward if the heap is exhausted and
malloc fails. It simply avoids using any null pointers unwisely. You can omit
the business about allocating space and copying the locale string returned
by setlocale only if you are sure that no other calls to that function can
intervene between the two shown above.
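The pattern just described (query the current locale, copy its name, switch to "C", then restore) can be sketched as follows. This is a reconstruction of the idea under stated assumptions, not the original listing. The function name count_alpha_c and the choice of category LC_CTYPE are made up for illustration.

```c
#include <ctype.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count alphabetic characters by "C"-locale rules,
   whatever LC_CTYPE locale is current on entry, then restore it. */
size_t count_alpha_c(const char *s)
{
    const char *cur = setlocale(LC_CTYPE, NULL);    /* query only */
    char *saved = NULL;
    size_t n = 0;

    /* Copy the name: a later setlocale call may overwrite the string. */
    if (cur != NULL && (saved = malloc(strlen(cur) + 1)) != NULL)
        strcpy(saved, cur);
    setlocale(LC_CTYPE, "C");
    for (; *s != '\0'; ++s)
        if (isalpha((unsigned char)*s))
            ++n;
    if (saved != NULL) {            /* skip the restore if malloc failed */
        setlocale(LC_CTYPE, saved);
        free(saved);
    }
    return n;
}
```

If the allocation fails, the function still returns a correct count but leaves the category set to "C". That is the price of stumbling onward rather than reporting the failure.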
Two locale categories tell you how to format values to match local
conventions:
Category LC_MONETARY suggests how to format monetary amounts, both by local custom and in accordance with international standards (ISO 4217).
Category LC_NUMERIC dictates the decimal point character used by the Standard C library and suggests how to format non-monetary amounts.
Here, for example, are various ways you can format the monetary
amount $-3.00 by local custom, depending upon the values stored in three
members of struct lconv:
n_sep_by_space: 0
That's a lot of complexity to keep track of. Conceivably, you can make
use of this information throughout an application, but probably not. The
individual pieces are at a low level of detail. What you really want is some
way to format numeric data that applies all of the relevant information in
one place. Unfortunately, the C Standard does not define such a function.
I decided to define the missing function. After several false starts, I
ended up with the declaration:
char *_Fmtval(char *buf, double val, int frac_digits);
You provide the character buffer buf to hold the formatted value. (The
modern trend is to specify a maximum length for any such buffer. I found
the function quite complicated enough without such checking, desirable
as it may be.) As a convenience, the function returns the value of buf, which
then holds the formatted value as a null-terminated string.
You also specify val, the value to be formatted, as a double. That provides
for a fraction part and at least 16 decimal digits of precision. For a nonmonetary value, frac_digits specifies the number of fraction digits to
include in the formatted value. The members of struct lconv offer no
guidance on this parameter.
Here's where the design gets clever (perhaps too clever). The locale
information suggests four distinct formats for a value:
an international monetary amount
a local monetary amount
a non-monetary amount with no decimal point or fraction
a non-monetary amount with decimal point and fraction
Only in the fourth case do you need to provide a (non-negative) value for
the number of fraction digits. That means you can set aside distinct negative values for the argument frac_digits to signal these other cases.
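One way to picture the encoding is a small classifier that maps the single int argument onto the four formats. The sketch assumes the two codes that this chapter's function reserves for monetary amounts, -2 and -1, and treats CHAR_MAX like any other "no fraction" request. The enumeration and the function name classify_frac_digits are made up for illustration.

```c
#include <limits.h>

/* Codes reserved for the monetary formats (assumed values -2 and -1). */
#define FN_INT_CUR (-2)
#define FN_LCL_CUR (-1)

/* Hypothetical names for the four formats, for illustration only. */
enum fmt_kind { FMT_INTL_MONEY, FMT_LOCAL_MONEY, FMT_INTEGER, FMT_FRACTION };

/* Decide which of the four formats a frac_digits argument requests. */
enum fmt_kind classify_frac_digits(int frac_digits)
{
    if (frac_digits == FN_INT_CUR)
        return FMT_INTL_MONEY;
    if (frac_digits == FN_LCL_CUR)
        return FMT_LOCAL_MONEY;
    if (frac_digits < 0 || frac_digits == CHAR_MAX)
        return FMT_INTEGER;         /* no decimal point or fraction */
    return FMT_FRACTION;            /* frac_digits fraction digits */
}
```

Folding the format selector into the digit count keeps the argument list short, at the cost of making a few values of the argument mean something other than a count.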
Figure 6.1 shows the file xfmtval.c, which defines the function _Fmtval.
It distinguishes the four formats by examining the value of frac_digits:
A value of -2 (the macro FN_INT_CUR) tells the function to format an international monetary amount.
A value of -1 (the macro FN_LCL_CUR) tells the function to format a local monetary amount.
Any other value tells the function to format a non-monetary amount.
The number of fraction digits, however determined, must be a nonnegative value other than CHAR_MAX, defined in <limits.h>, for the function to include a decimal point and fraction. So if you call _Fmtval with the value CHAR_MAX, or with any negative value other than -1 or -2, you tell it to format a non-monetary amount with no decimal point or fraction.
By elimination, any non-negative value other than CHAR_MAX tells the function to format a non-monetary amount with a decimal point and fraction. The value specifies the number of fraction digits.
Put these lines at the top of your program, or in a separate header file that
you include in your program. Now you are in a position to call the function
in various ways. For example, the code:
.....
char buf[100];

printf("You ordered %s sheets,",
    _Fmtval(buf, (double)nitems, FV_INTEGER));
printf(" each %s square cm.\n",
    _Fmtval(buf, size, 3));
printf("Please remit %s to our New York office,\n",
    _Fmtval(buf, cost, FV_INT_CUR));
printf("(that's %s.)\n",
    _Fmtval(buf, cost, FV_LCL_CUR));
Figure 6.1: xfmtval.c
#include <limits.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

        /* macros */
#define FN_INT_CUR  -2
#define FN_LCL_CUR  -1

char *_Fmtval(char *buf, double d, int fdarg)
    {   /* format number by locale-specific rules */
    char *cur_sym, dec_pt, *grps, grp_sep, *sign;
    const char *fmt;
    int fd, neg;
    struct lconv *p = localeconv();

    if (0 <= d)
        neg = 0;
    else
        d = -d, neg = 1;
    if (fdarg == FN_INT_CUR)
        {   /* get international currency parameters */
        cur_sym = p->int_curr_symbol;
        dec_pt = p->mon_decimal_point[0];
        fmt = "$-V";
        fd = p->int_frac_digits;
        grps = p->mon_grouping;
        grp_sep = p->mon_thousands_sep[0];
        sign = neg ? p->negative_sign : p->positive_sign;
        }
    else if (fdarg == FN_LCL_CUR)
        {   /* get local currency parameters */
        static const char *ftab[2][2][5] = {
            {{"(V$)", "-V$", "V$-", "V-$", "V$-"},
            {"($V)", "-$V", "$V-", "-$V", "$-V"}},
            {{"(V $)", "-V $", "V $-", "V- $", "V $-"},
            {"($ V)", "-$ V", "$ V-", "-$ V", "$ -V"}}};

        cur_sym = p->currency_symbol;
        dec_pt = p->mon_decimal_point[0];
        if (neg)
            fmt = ftab[p->n_sep_by_space == 1]
                [p->n_cs_precedes == 1][p->n_sign_posn < 0
                || 4 < p->n_sign_posn ? 0 : p->n_sign_posn];
        else
            fmt = ftab[p->p_sep_by_space == 1]
                [p->p_cs_precedes == 1][p->p_sign_posn < 0
                || 4 < p->p_sign_posn ? 0 : p->p_sign_posn];
        fd = p->frac_digits;
        grps = p->mon_grouping;
        grp_sep = p->mon_thousands_sep[0];
        sign = neg ? p->negative_sign : p->positive_sign;
        }
    else
        {   /* get numeric parameters (cur_sym not used) */
        dec_pt = p->decimal_point[0];
        fmt = "-V";
        fd = fdarg;
        grps = p->grouping;
        grp_sep = p->thousands_sep[0];
        sign = neg ? "-" : "";
        }
    .....
    return (buf);
    }
Implementing <locale.h>
This chapter contains a considerable amount of code. Unlike earlier
chapters, the code draws heavily on all parts of the Standard C library. You
got a taste of that variety with the function _Fmtval in the previous section.
It made use of string manipulation functions declared in <string.h> and
an output formatting function declared in <stdio.h>. You will see code
from those headers and others in what follows. I won't try to describe each
new function, just the more exotic usages (such as the sprintf format
"%#.*f"). If you see a function that you don't recognize, just look it up in a
later chapter.
One assist I can provide is a road map. Figure 6.2 shows the call tree for
functions and data objects defined in this chapter with external linkage. I
enclose entries for data objects in brackets. Following each external name
is the name of the C source file that defines it and the page number where
you can find the file. Beneath each function name and indented one tab
stop further to the right are any names that the function refers to. (I omit
this subtree on any later references to the same function name.)
For example, the function setlocale is defined in the C source file
setlocal.c. That function calls itself and refers to the data object _Clocale
defined in the same C source file. It also calls the functions _Defloc, _Getloc,
and _Setloc.
If you find yourself getting lost in the explanations that follow, refer back
to this call tree from time to time. You will find it helpful in understanding
the overall structure of the functions in <locale.h>.
Figure 6.2: Call Tree for <locale.h>

localeconv      localeco.c
setlocale       setlocal.c
setlocale       setlocal.c
[_Clocale]      setlocal.c
_Defloc         xdefloc.c
_Getloc         xgetloc.c
_Freeloc        xfreeloc.c
[_Loctab]       xloctab.c
_Makeloc        xmakeloc.c
_Locvar         xlocterm.c
_Locterm        xlocterm.c
_Skip           xgetloc.c
_Readloc        xreadloc.c
[_Loctab]       xloctab.c
_Skip           xgetloc.c
_Readloc        xreadloc.c
_Setloc         xsetloc.c
[_Costate]      xstate.c
[_Mbcurmax]     xstate.c
[_Mbstate]      xstate.c
[_Wcstate]      xstate.c
Note that I did not include the function _Fmtval in this call tree. That's
because it is not required by the C Standard. The C Standard permits
additional functions, by the way. They can certainly have funny names like
_Fmtval. They can even have nicer names such as fmtval. I chose a name
reserved to implementors only as a matter of style for this presentation.
What an implementation cannot do with such a function is:
- include a declaration for fmtval in a standard header, such as <locale.h>
- include a definition for a macro name such as FN_INT_CUR in a standard
  header
- have any of the Standard C library functions call fmtval
Any of these practices pollutes the name space reserved for users.
Consider what happens to an added library function that honors these
restrictions. A program that declares and calls our hypothetical fmtval will
cause the linker to include the function when it scans the Standard C library
for unsatisfied references to external names. A program that defines its own
version of fmtval will not cause the linker to include the function when it
scans the Standard C library. Since no other library functions depend on
the presence of this version of fmtval, no harm can occur. The user-supplied
version effectively "knocks out" the added library function. Any function
that can be knocked out this way can be safely added to the Standard C
library.
That's enough about _Fmtval, by any name. The remainder of this
chapter deals with implementing the services required by the C Standard
for the header <locale.h>.
The easiest part of implementing <locale.h> is the function localeconv.
All it must do is return a pointer to a structure describing (parts of) the
current locale. That structure has type struct lconv. It is defined in
<locale.h>. Figure 6.3 shows the file locale.h and Figure 6.4 shows the file
localeco.c. (The latter name is chopped to eight letters because of file
naming restrictions on various systems, as I explained on page 7.) Packed
in with localeconv is the static data object of type struct lconv whose
address the function returns. Note that the function localeconv has a
masking macro defined in <locale.h>.
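As a usage sketch (not part of the library code in this chapter), here is how a program might consult the structure that localeconv returns. The helper name print_lconv is hypothetical. The check against CHAR_MAX follows the convention that a char member holding CHAR_MAX carries no meaningful value, which is what a program should expect to see in the "C" locale.

```c
#include <limits.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: report two members of the current locale. */
static void print_lconv(FILE *out)
{
    const struct lconv *lc = localeconv();

    fprintf(out, "decimal_point = \"%s\"\n", lc->decimal_point);
    if (lc->frac_digits == CHAR_MAX)
        fprintf(out, "frac_digits   = (not specified)\n");
    else
        fprintf(out, "frac_digits   = %d\n", lc->frac_digits);
}
```

Calling print_lconv(stdout) before any call to setlocale reports the "C" locale values.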
macro _NULL
I chose once again to parametrize the header <locale.h> by including
the internal header <yvals.h>. (See the original discussion of this header
on page 53.) That permits an implementation to provide a definition of the
macro _NULL, and hence of NULL, tailored to each implementation. (See the
discussion of NULL in Chapter 11: <stddef.h>.) For now, I simply observe
that a suitable definition of _NULL, in many cases, is:
	#define _NULL	(void *)0
implementing setlocale
Figure 6.3: locale.h

/* locale.h standard header */
#ifndef _LOCALE
#define _LOCALE
#ifndef _YVALS
#include <yvals.h>
#endif
		/* macros */
#define NULL	_NULL
		/* locale codes */
#define LC_ALL		0
#define LC_COLLATE	1
#define LC_CTYPE	2
#define LC_MONETARY	3
#define LC_NUMERIC	4
#define LC_TIME		5
	/* ADD YOURS HERE */
#define _NCAT		6
		/* type definitions */
struct lconv {
		/* controlled by LC_MONETARY */
	char *currency_symbol;
	char *int_curr_symbol;
	char *mon_decimal_point;
	char *mon_grouping;
	char *mon_thousands_sep;
	char *negative_sign;
	char *positive_sign;
	char frac_digits;
	char int_frac_digits;
	char n_cs_precedes;
	char n_sep_by_space;
	char n_sign_posn;
	char p_cs_precedes;
	char p_sep_by_space;
	char p_sign_posn;
		/* controlled by LC_NUMERIC */
	char *decimal_point;
	char *grouping;
	char *thousands_sep;
	};
		/* declarations */
struct lconv *localeconv(void);
char *setlocale(int, const char *);
extern struct lconv _Locale;
		/* macro overrides */
#define localeconv()	(&_Locale)
#endif
Figure 6.4: localeco.c

/* localeconv function */
#include <limits.h>
#include <locale.h>

		/* static data */
static char null[] = "";
struct lconv _Locale = {
		/* LC_MONETARY */
	null,		/* currency_symbol */
	null,		/* int_curr_symbol */
	null,		/* mon_decimal_point */
	null,		/* mon_grouping */
	null,		/* mon_thousands_sep */
	null,		/* negative_sign */
	null,		/* positive_sign */
	CHAR_MAX,	/* frac_digits */
	CHAR_MAX,	/* int_frac_digits */
	CHAR_MAX,	/* n_cs_precedes */
	CHAR_MAX,	/* n_sep_by_space */
	CHAR_MAX,	/* n_sign_posn */
	CHAR_MAX,	/* p_cs_precedes */
	CHAR_MAX,	/* p_sep_by_space */
	CHAR_MAX,	/* p_sign_posn */
		/* LC_NUMERIC */
	".",		/* decimal_point */
	null,		/* grouping */
	null};		/* thousands_sep */

struct lconv *(localeconv)(void)
	{	/* get pointer to current locale */
	return (&_Locale);
	}
mixed locales
The last task is one of the hardest. That's because you can construct a
mixed locale, one containing categories from various locales. For example,
	.....
	char *s1, *s2;
The first call switches to the native locale, some locale preferred by the
local operating environment. The second call reverts one category to the
"C" locale. You must make a copy of the string pointed to by s1 because
intervening calls to setlocale might alter it. If you later make the call:
	setlocale(LC_ALL, s2);
locale names
setlocale must contrive a name that it can later use to reconstruct an
arbitrary mix of categories. The C Standard doesn't say how to do this, or
what the name looks like. It only says that an implementation must do it.
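From the caller's side, the requirement amounts to this: whatever string setlocale(LC_ALL, NULL) returns must be acceptable later as the second argument to setlocale. A minimal sketch of saving and restoring a locale that way follows. The helper names save_locale and restore_locale are hypothetical, and the copy is made because a later call to setlocale may overwrite the returned string.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'ed copy of the current locale name, or NULL. */
static char *save_locale(void)
{
    const char *cur = setlocale(LC_ALL, NULL);
    char *copy;

    if (cur == NULL)
        return NULL;
    copy = malloc(strlen(cur) + 1);
    if (copy != NULL)
        strcpy(copy, cur);
    return copy;
}

/* Restore a locale saved by save_locale and release the copy.
 * Returns nonzero if the locale was restored. */
static int restore_locale(char *saved)
{
    int ok = saved != NULL && setlocale(LC_ALL, saved) != NULL;

    free(saved);
    return ok;
}
```

A program would call save_locale before changing categories and restore_locale when it wants the earlier mix back, without ever needing to know what the name looks like.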
The scheme I settled on was to paste qualifiers on a locale name if it
contains mixed categories. Say, for example, that the base locale is "USA",
which gives you American date formats and so on. An application adapts
the category LC_MONETARY to the locale "acct", which has the special
conventions of accounting. The name of this mixed locale is
"USA;monetary:acct".
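The naming scheme can be sketched in a few lines. This is an illustration of the format just described, not the code that setlocale uses. The names NCAT, cat_tag, and mixed_name are hypothetical, and the sketch assumes the category codes 1 through 5 from the locale.h shown in Figure 6.3.

```c
#include <stdio.h>
#include <string.h>

#define NCAT 6  /* LC_ALL plus five categories (assumed numbering) */

/* Category qualifiers indexed by category code; slot 0 is unused. */
static const char *const cat_tag[NCAT] = {
    NULL, "collate:", "ctype:", "monetary:", "numeric:", "time:"
};

/*
 * Build a name such as "USA;monetary:acct" in buf.
 * names[0] is the base locale; names[i] is the locale for category i.
 * Returns 0 on success, -1 if buf is too small.
 */
static int mixed_name(char *buf, size_t size, const char *const names[NCAT])
{
    size_t used = strlen(names[0]);
    int i;

    if (size <= used)
        return -1;
    strcpy(buf, names[0]);
    for (i = 1; i < NCAT; ++i) {
        size_t need;

        if (strcmp(names[i], names[0]) == 0)
            continue;           /* same as base: no qualifier needed */
        need = 1 + strlen(cat_tag[i]) + strlen(names[i]);
        if (size - used <= need)
            return -1;
        sprintf(buf + used, ";%s%s", cat_tag[i], names[i]);
        used += need;
    }
    return 0;
}
```

A locale whose categories all match the base keeps its plain name, so unmixed locales are unaffected by the scheme.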
		/* controlled by LC_CTYPE */
	const short *_Ctype;
	const short *_Tolower;
	const short *_Toupper;
	unsigned char _Mbcurmax;
	_Statab _Mbstate;
	_Statab _Wcstate;
		/* controlled by LC_MONETARY and LC_NUMERIC */
	struct lconv _Lc;
		/* controlled by LC_TIME */
	_Tinfo _Times;
	} _Linfo;
Only one instance of this structure exists initially, the data object
_Clocale defined in setlocal.c. _Clocale has a nonzero initializer only for
the member _Name, which points at the string "C", the name of the locale.
(The initializer presumes that the name comes first in the structure.) The
first call to setlocale copies all locale-specific information into this data
object before the locale changes. A later call that reverts to the "C" locale
can then simply copy out the pertinent information.
If _Getloc decides to read in a new locale (as described later in this
chapter), the function allocates storage for a new instance of _Linfo and
copies _Clocale into it. _Getloc then reads in any changes to the locale. If
all changes are valid, the function adds the new locale to the linked list of
alternate locales beginning with _Clocale._Next. A list member whose
member _Next holds a null pointer terminates the list. (Note that _Linfo
appears in this declaration both as a type name and a structure tag. Only a
structure with a tag name can contain a member that points at another
instance of the same structure.)
type _Statab
The structure _Linfo contains several members of type _Statab. Several
functions in this implementation of the Standard C library use state tables
to define their behavior. That provides the maximum in flexibility with
moderate performance. It also lets you specify the behavior of these
functions in a locale using notation very similar to that for the <ctype.h>
translation tables. Here are the affected functions:
- strcoll and strxfrm, declared in <string.h>, map a character string to
  another character string, to define a collating sequence.
- mbtowc and mbstowcs, declared in <stdlib.h>, map a multibyte string to
  a wide-character string.
- wctomb and wcstombs, declared in <stdlib.h>, map a wide-character
  string to a multibyte string.
header "xstate.h"
I describe the behavior of each of these functions in later chapters. For
now, I observe simply that the internal header "xstate.h" defines the type
_Statab along with several useful macros. It also declares the various data
objects of type _Statab. The internal header "xlocale.h" includes
"xstate.h" to obtain the information needed to manipulate state tables
when locales change. Figure 6.5 shows the header file xstate.h.
Figure 6.5: xstate.h

/* xstate.h internal header */
		/* macros for finite state machines */
#define ST_CH		0x00ff
#define ST_STATE	0x0f00
#define ST_STOFF	8
#define ST_FOLD		0x8000
#define ST_INPUT	0x4000
#define ST_OUTPUT	0x2000
#define ST_ROTATE	0x1000
#define _NSTATE		16
		/* type definitions */
typedef struct {
	const unsigned short *_Tab[_NSTATE];
	} _Statab;
		/* declarations */
extern _Statab _Costate, _Mbstate, _Wcstate;
type _Tinfo
Similarly, the functions declared in <time.h> have locale-specific
behavior. The structure type _Tinfo contains several members that point to
null-terminated strings. These strings control how the time functions
format and translate dates and times.
header "xtinfo.h"
The internal header "xtinfo.h" defines the type _Tinfo. It also declares
the data object _Times, of type _Tinfo, that holds the current information
on times. The internal header "xlocale.h" also includes "xtinfo.h" to
obtain the information needed to manipulate time information when
locales change. Figure 6.6 shows the header file xtinfo.h.
Now you can appreciate what goes on in setlocale. Figure 6.7 shows
the file setlocal.c. Much of its logic is concerned with parsing a name to
determine which locale to use for each category. Another big chunk of logic
builds a name that setlocale can later digest. Everything else is small
potatoes by comparison.
setlocale contains the code that copies information into the "C" locale
on the first attempt to change a locale. I adopted that ruse to avoid a nasty
snowball effect. It's easy enough to pile all the various locale-specific tables
into one structure. Do so, however, and you get the whole snowball
regardless of how little of it you use. I felt it was better to have setlocale
do a bit more work to avoid this problem. You don't want to drag in ten
kilobytes of code when you use only the function isspace.

Figure 6.6: xtinfo.h

/* xtinfo.h internal header */
		/* type definitions */
typedef struct {
	const char *_Ampm;
	const char *_Days;
	const char *_Formats;
	const char *_Isdst;
	const char *_Months;
	const char *_Tzone;
	} _Tinfo;
		/* declarations */
extern _Tinfo _Times;
function _Getloc
The function _Getloc determines whether a locale corresponding to a
given category exists in memory. If it does not, _Getloc looks for it by
reading a locale file. I describe reading this file in detail below. Figure 6.8
shows the file xgetloc.c, which defines this function.
function _Skip
The C source file xgetloc.c also defines the function _Skip. Several
functions that read the locale file call _Skip to skip past a character (other
than the null character) and any white-space that follows. Here,
white-space consists of spaces and horizontal tabs. Using _Skip religiously
enforces a uniform definition for white-space in locale files. It also
simplifies much of the code that follows.
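The behavior is easy to try in isolation. The sketch below restates the one-line logic under a different, hypothetical name (skip) so that it compiles on its own; the const-qualified signature is an assumption.

```c
#include <string.h>

/* Skip one character, then any spaces and horizontal tabs that follow.
 * At the terminating null character, stay put. */
static const char *skip(const char *s)
{
    return *s == '\0' ? s : s + 1 + strspn(s + 1, " \t");
}
```

Given the string "A \t B", the call skip(s) yields a pointer to "B". Given an empty string, it returns its argument unchanged, so a loop that calls it repeatedly cannot run off the end of a line.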
function _Defloc
Figure 6.9 shows the source file xdefloc.c. It defines the function _Defloc
that determines the name of the native locale. To determine that name,
I chose to use the environment variable "LOCALE". That's akin to using the
environment variable "TZ" to determine what time zone you're in. _Defloc
inspects the environment variable "LOCALE" at most once during program
execution.
function _Setloc
Figure 6.10 shows the file xsetloc.c. It defines the function _Setloc,
which actually copies new information out to the various bits of static data
affected by changes in the locale. (Note that it also performs a modicum of
checking for the more critical values.) A call to setlocale thus drags in all
this stuff. I don't know how to avoid this particular snowball. At least you
can avoid it if you leave locales alone.
state tables
To complete the record, I show here the initial state tables, since both
setlocale and _Setloc manipulate them. (The time information _Times
lives in the file asctime.c, shown on page 437.) Figure 6.11 shows the file
xstate.c. Don't try to understand it in any detail. For now, I tell you only
that the single state table shown is common to all functions that use state
tables. It is cleverly contrived to produce useful, if simple, results for all
these functions. It also makes a good starting point for state tables that you
may choose to define in a locale file.
What I have presented so far is all the basic machinery you need to
support locales. It is enough to let you build additional locales directly into
the library. Just add static declarations of type struct lconv and initialize
them as you see fit. Be sure to change _Clocale._Next to point at the list
you add.
locale files
The real fun of locales, however, is the prospect of defining an
open-ended set. To do that, you need to be able to specify a locale without
altering C code. That takes all the remaining machinery indicated in
Figure 6.2 that I have yet to describe. Before I describe that machinery, I
must describe locale files.
Figure 6.7: setlocal.c, Part 1

/* setlocale function */
#if _NCAT != 6
#error WRONG NUMBER OF CATEGORIES
#endif

		/* static data */
_Linfo _Clocale = {"C"};
static char *curname = "C";
static char namalloc = 0;	/* curname allocated */
static const char * const nmcats[_NCAT] = {
	NULL, "collate:", "ctype:", "monetary:",
	"numeric:", "time:"};
static _Linfo *pcats[_NCAT] = {
	&_Clocale, &_Clocale, &_Clocale, &_Clocale,
	&_Clocale, &_Clocale};

char *(setlocale)(int cat, const char *lname)
	{	/* set new locale */
	size_t i;

	if (cat < 0 || _NCAT <= cat)
		return (NULL);	/* bad category */
	if (lname == NULL)
		return (curname);
	if (lname[0] == '\0')
		lname = _Defloc();
	if (_Clocale._Costate._Tab[0] == NULL)
		{	/* fill in "C" locale */
		_Clocale._Costate = _Costate;
		_Clocale._Ctype = _Ctype;
		_Clocale._Tolower = _Tolower;
		_Clocale._Toupper = _Toupper;
		_Clocale._Mbcurmax = _Mbcurmax;
		_Clocale._Mbstate = _Mbstate;
		_Clocale._Wcstate = _Wcstate;
		_Clocale._Lc = _Locale;
		_Clocale._Times = _Times;
		}
		{	/* set categories */
		_Linfo *p;
		int changed = 0;

		if (cat != LC_ALL)

Continuing setlocal.c, Part 2

	.....
	return (curname);
	}
Figure 6.8: xgetloc.c, Part 1

const char *_Skip(const char *s)
	{	/* skip next char plus white-space */
	return (*s == '\0' ? s : s + 1 + strspn(s + 1, " \t"));
	}

_Linfo *_Getloc(const char *nmcat, const char *lname)
	{	/* get locale pointer, given category and name */
	const char *ns, *s;
	size_t nl;
	_Linfo *p;

		{	/* find category component of name */
		size_t n;

		for (ns = NULL, s = lname; ; s += n + 1)
			{	/* look for exact match or LC_ALL */
			if (s[n = strcspn(s, ":;")] == '\0' || s[n] == ';')
				{	/* memorize first LC_ALL */
				if (ns == NULL)
					ns = s, nl = n;
				if (s[n] == '\0')
					break;
				}
			else if (memcmp(nmcat, s, ++n) == 0)
				{	/* found exact category match */
				ns = s + n, nl = strcspn(ns, ";");
				break;
				}
			else if (s[n += strcspn(s + n, ";")] == '\0')
				break;
			}
		if (ns == NULL)
			return (NULL);	/* invalid name */
		}
	for (p = &_Clocale; p; p = p->_Next)
		if (memcmp(p->_Name, ns, nl) == 0
			&& p->_Name[nl] == '\0')
			return (p);
		{	/* look for locale in file */
		char buf[MAXLIN], *s1;
		FILE *lf;
		_Locitem *q;
		static char *locfile;	/* locale file name */

		if (locfile)
			;
		else if ((s = getenv("LOCFILE")) == NULL
			|| (locfile = malloc(strlen(s) + 1)) == NULL)
			return (NULL);

Continuing xgetloc.c, Part 2

		else
			strcpy(locfile, s);
		if ((lf = fopen(locfile, "r")) == NULL)
			return (NULL);
		while ((q = _Readloc(lf, buf, &s)) != NULL)
			if (q->_Code == L_NAME
				&& memcmp(ns, s, nl) == 0
				&& *_Skip(s + nl - 1) == '\0')
				break;
		if (q == NULL)
			p = NULL;
		else if ((p = malloc(sizeof (_Linfo))) == NULL)
			;
		else if ((s1 = malloc(nl + 1)) == NULL)
			free(p), p = NULL;
		else
			{	/* build locale */
			*p = _Clocale;
			p->_Name = memcpy(s1, ns, nl);
			s1[nl] = '\0';
			if (_Makeloc(lf, buf, p))
				p->_Next = _Clocale._Next,
					_Clocale._Next = p;
			else
				{	/* parsing error reading locale file */
				fputs(buf, stderr);
				fputs("\n-- invalid locale file line\n", stderr);
				_Freeloc(p);
				free(p), p = NULL;
				}
			}
		fclose(lf);
		return (p);
		}
	}
Figure 6.9: xdefloc.c

/* _Defloc function */
#include <stdlib.h>
#include <string.h>
#include "xlocale.h"

const char *_Defloc(void)
	{	/* find name of default locale */
	char *s;
	static char *defname = NULL;

	if (defname)
		;
	else if ((s = getenv("LOCALE")) != NULL
		&& (defname = malloc(strlen(s) + 1)) != NULL)
		strcpy(defname, s);
	else
		defname = "C";
	return (defname);
	}
Figure 6.10: xsetloc.c

/* _Setloc function */
#include <ctype.h>
#include <limits.h>
#include "xlocale.h"

_Linfo *_Setloc(int cat, _Linfo *p)
	{	/* set category for locale */
	switch (cat)
		{	/* set a category */
	case LC_COLLATE:
		_Costate = p->_Costate;
		break;
	case LC_CTYPE:
		_Ctype = p->_Ctype;
		_Tolower = p->_Tolower;
		_Toupper = p->_Toupper;
		_Mbcurmax = p->_Mbcurmax <= MB_LEN_MAX
			? p->_Mbcurmax : MB_LEN_MAX;
		_Mbstate = p->_Mbstate;
		_Wcstate = p->_Wcstate;
		break;
	case LC_MONETARY:
		_Locale.currency_symbol = p->_Lc.currency_symbol;
		_Locale.int_curr_symbol = p->_Lc.int_curr_symbol;
		_Locale.mon_decimal_point = p->_Lc.mon_decimal_point;
		_Locale.mon_grouping = p->_Lc.mon_grouping;
		_Locale.mon_thousands_sep = p->_Lc.mon_thousands_sep;
		_Locale.negative_sign = p->_Lc.negative_sign;
		_Locale.positive_sign = p->_Lc.positive_sign;
		_Locale.frac_digits = p->_Lc.frac_digits;
		_Locale.int_frac_digits = p->_Lc.int_frac_digits;
		_Locale.n_cs_precedes = p->_Lc.n_cs_precedes;
		_Locale.n_sep_by_space = p->_Lc.n_sep_by_space;
		_Locale.n_sign_posn = p->_Lc.n_sign_posn;
		_Locale.p_cs_precedes = p->_Lc.p_cs_precedes;
		_Locale.p_sep_by_space = p->_Lc.p_sep_by_space;
		_Locale.p_sign_posn = p->_Lc.p_sign_posn;
		break;
	case LC_NUMERIC:
		_Locale.decimal_point = p->_Lc.decimal_point[0] != '\0'
			? p->_Lc.decimal_point : ".";
		_Locale.grouping = p->_Lc.grouping;
		_Locale.thousands_sep = p->_Lc.thousands_sep;
		break;
	case LC_TIME:
		_Times = p->_Times;
		break;
		}
	return (p);
	}
Figure 6.11: xstate.c

		/* macros */
#define X	(ST_FOLD|ST_OUTPUT|ST_INPUT)
		/* static data */
static const unsigned short tab0[257] = {0,	/* alloc flag */
	X|0x00, X|0x01, X|0x02, X|0x03, X|0x04, X|0x05, X|0x06, X|0x07,
	X|0x08, X|0x09, X|0x0a, X|0x0b, X|0x0c, X|0x0d, X|0x0e, X|0x0f,
	X|0x10, X|0x11, X|0x12, X|0x13, X|0x14, X|0x15, X|0x16, X|0x17,
	X|0x18, X|0x19, X|0x1a, X|0x1b, X|0x1c, X|0x1d, X|0x1e, X|0x1f,
	X|0x20, X|0x21, X|0x22, X|0x23, X|0x24, X|0x25, X|0x26, X|0x27,
	X|0x28, X|0x29, X|0x2a, X|0x2b, X|0x2c, X|0x2d, X|0x2e, X|0x2f,
	X|0x30, X|0x31, X|0x32, X|0x33, X|0x34, X|0x35, X|0x36, X|0x37,
	X|0x38, X|0x39, X|0x3a, X|0x3b, X|0x3c, X|0x3d, X|0x3e, X|0x3f,
	X|0x40, X|0x41, X|0x42, X|0x43, X|0x44, X|0x45, X|0x46, X|0x47,
	X|0x48, X|0x49, X|0x4a, X|0x4b, X|0x4c, X|0x4d, X|0x4e, X|0x4f,
	X|0x50, X|0x51, X|0x52, X|0x53, X|0x54, X|0x55, X|0x56, X|0x57,
	X|0x58, X|0x59, X|0x5a, X|0x5b, X|0x5c, X|0x5d, X|0x5e, X|0x5f,
	X|0x60, X|0x61, X|0x62, X|0x63, X|0x64, X|0x65, X|0x66, X|0x67,
	X|0x68, X|0x69, X|0x6a, X|0x6b, X|0x6c, X|0x6d, X|0x6e, X|0x6f,
	X|0x70, X|0x71, X|0x72, X|0x73, X|0x74, X|0x75, X|0x76, X|0x77,
	X|0x78, X|0x79, X|0x7a, X|0x7b, X|0x7c, X|0x7d, X|0x7e, X|0x7f,
	X|0x80, X|0x81, X|0x82, X|0x83, X|0x84, X|0x85, X|0x86, X|0x87,
	X|0x88, X|0x89, X|0x8a, X|0x8b, X|0x8c, X|0x8d, X|0x8e, X|0x8f,
	X|0x90, X|0x91, X|0x92, X|0x93, X|0x94, X|0x95, X|0x96, X|0x97,
	X|0x98, X|0x99, X|0x9a, X|0x9b, X|0x9c, X|0x9d, X|0x9e, X|0x9f,
	X|0xa0, X|0xa1, X|0xa2, X|0xa3, X|0xa4, X|0xa5, X|0xa6, X|0xa7,
	X|0xa8, X|0xa9, X|0xaa, X|0xab, X|0xac, X|0xad, X|0xae, X|0xaf,
	X|0xb0, X|0xb1, X|0xb2, X|0xb3, X|0xb4, X|0xb5, X|0xb6, X|0xb7,
	X|0xb8, X|0xb9, X|0xba, X|0xbb, X|0xbc, X|0xbd, X|0xbe, X|0xbf,
	X|0xc0, X|0xc1, X|0xc2, X|0xc3, X|0xc4, X|0xc5, X|0xc6, X|0xc7,
	X|0xc8, X|0xc9, X|0xca, X|0xcb, X|0xcc, X|0xcd, X|0xce, X|0xcf,
	X|0xd0, X|0xd1, X|0xd2, X|0xd3, X|0xd4, X|0xd5, X|0xd6, X|0xd7,
	X|0xd8, X|0xd9, X|0xda, X|0xdb, X|0xdc, X|0xdd, X|0xde, X|0xdf,
	X|0xe0, X|0xe1, X|0xe2, X|0xe3, X|0xe4, X|0xe5, X|0xe6, X|0xe7,
	X|0xe8, X|0xe9, X|0xea, X|0xeb, X|0xec, X|0xed, X|0xee, X|0xef,
	X|0xf0, X|0xf1, X|0xf2, X|0xf3, X|0xf4, X|0xf5, X|0xf6, X|0xf7,
	X|0xf8, X|0xf9, X|0xfa, X|0xfb, X|0xfc, X|0xfd, X|0xfe, X|0xff};

char _Mbcurmax = 1;
_Statab _Costate = {&tab0[1]};
_Statab _Mbstate = {&tab0[1]};
_Statab _Wcstate = {&tab0[1]};
A locale should be easy to define. All sorts of people might have occasion
to define part or all of a locale. Different groups may want to:
- print dates and times in the local language, using the local conventions
- change the decimal point character used for reading, converting, and
  writing floating-point values
- specify the local currency format and symbols
- specify peculiar collating sequences
- add letters, punctuation, or control characters to the character classes
  defined by the functions declared in <ctype.h>
- alter the encodings of multibyte characters and wide characters
I list these changes roughly in order of increasing sophistication. Almost
anybody might want to change month and weekday names to a different
language. A few might undertake to define a special collating sequence.
Only the bravest would consider changing to a new multibyte-character
encoding. (It might not agree with the string literals and character constants
produced by the translator, for one thing.) Nevertheless, none of these
operations should require a change in the Standard C library to pull off.
The goal, therefore, is to contrive a way that ordinary citizens can define
a new locale and introduce it to a C program at runtime. The program must,
of course, be one that calls setlocale under some circumstances. And the
program must make use of the information altered by such a call. Given
those obvious prerequisites, the Standard C library should assist program
and user in agreeing on locale specifications.
The approach I take is to introduce two environment variables and a file
format. The environment variables are:
- "LOCALE" (described on page 101), which specifies the name of the native
  locale that is selected on a call such as setlocale(LC_ALL, "")
- "LOCFILE", which specifies the name of the locale file to use if setlocale
  encounters a locale name that is not already represented in memory
The file format specifies how you prepare the text file so that it defines all
the additional locales you want to add.
A program called xxx might, for example, begin by executing the call
setlocale(LC_ALL, "") as above. Under MS-DOS, you can invoke it from
a batch file that looks like:
	set LOCFILE=c:\locales\locs.loc
	set LOCALE=USA
	xxx
That causes the program xxx to read the file c:\locales\locs.loc in
search of a locale named "USA". Assuming the program can find that locale
and successfully read it in, the program xxx then executes with its behavior
adapted to the "USA" locale. Change "USA" to another locale name in the
batch script and the program searches out a different locale in the same
file. Or you can change the file name specified by "LOCFILE" and always ask
for the generic "native" locale. Both are sensible ways to tailor the native
locale.
A more sophisticated program might use more than just the native locale.
It could determine categories and the names of locales in various ways, then
oblige setlocale to chase them down in the locale file. Conceivably, it could
even rewrite the contents of the locale file while it is running, to build new
locales on the fly. In any of these cases, you certainly want to defer binding
locales to programs as late as possible.
locale file formats
A locale consists of an assortment of data types. Some are numeric
values, some are strings, and some are tables of varying formats. Each entity
in a locale needs a distinct name. You use these names when you write the
locale file to specify which entities you wish to redefine. For the members
of struct lconv, I use the member name as the entity name within the locale
file. In other cases, I had to invent entity names.
A locale file is organized into a sequence of text lines. You begin the
definition of the "USA" locale, for example, with the line:
	LOCALE USA
Each line that follows begins with a keyword from a predefined list. Use
NOTE to begin a comment and SET to assign a value to an uppercase letter,
as in:
NOTE
SET
USD
'I
'I
The quotes around a string value are optional. You need them only if you
want to include a space as part of the string. You can write a fairly ornate
expression wherever a numeric value is required. I describe expressions in
detail on page 113.
The initial values in each new locale match those in the "C" locale. That
typically saves a lot of typing. All you really have to specify is what you
want changed from the "C" locale. Write more only if you want more
thorough documentation of a locale.
numeric values
You need to specify numeric values for some members of struct lconv.
These include the category LC_MONETARY information:
	frac_digits
	int_frac_digits
	n_cs_precedes
	n_sep_by_space
	n_sign_posn
	p_cs_precedes
	p_sep_by_space
	p_sign_posn
Each of these occupies a char member. A value of CHAR_MAX, defined in
<limits.h>, indicates that no meaningful value is provided.
The value of the macro MB_CUR_MAX, defined in <stdlib.h>, can change
with the category LC_CTYPE. I adopted the entity name:
	mb_cur_max
for the char data object that holds the value of this macro.
string values
You need to specify strings for some members of struct lconv. These
include the category LC_MONETARY information:
	currency_symbol
	int_curr_symbol
	mon_decimal_point
	mon_thousands_sep
	negative_sign
	positive_sign

	grouping	(LC_NUMERIC)
	mon_grouping	(LC_MONETARY)
The value of each character specifies how many characters to group as you
move to the left away from the decimal point. A value of zero terminates
the string and causes the last grouping value to be repeated indefinitely. A
value of CHAR_MAX terminates the string and specifies no additional
grouping. To group digits by two and then by five, for example, you want
to create the array {2, 5, CHAR_MAX}. In the locale file, however, you write:
mon-grouping
25"
111
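To make the grouping rules concrete, here is a minimal sketch of a routine that applies a grouping array to a string of digits. It illustrates the rules stated above and is not the formatting code used by _Fmtval. The name group_digits and the fixed 128-byte work buffer are assumptions made for the example.

```c
#include <limits.h>
#include <string.h>

/*
 * Copy digits to out, inserting sep between groups as directed by grp.
 * Each element of grp is a group length, counted leftward from the
 * decimal point. A zero element repeats the previous length forever;
 * CHAR_MAX means no further grouping.
 * Returns 0 on success, -1 if a buffer is too small.
 */
static int group_digits(char *out, size_t size, const char *digits,
    const char *grp, char sep)
{
    char tmp[128];              /* built right to left */
    size_t n = strlen(digits), k = sizeof tmp;
    int count = 0;              /* digits placed in the current group */
    int glen = *grp;            /* current group length */

    tmp[--k] = '\0';
    while (n > 0) {
        if (glen > 0 && glen != CHAR_MAX && count == glen) {
            if (k == 0)
                return -1;
            tmp[--k] = sep;
            count = 0;
            if (grp[1] != '\0') /* zero: keep repeating this length */
                glen = *++grp;
        }
        if (k == 0)
            return -1;
        tmp[--k] = digits[--n];
        ++count;
    }
    if (sizeof tmp - k > size)
        return -1;
    memcpy(out, tmp + k, sizeof tmp - k);
    return 0;
}
```

With the array {2, 5, CHAR_MAX} and a comma separator, the digits 123456789 become 12,34567,89. With the single value 3 followed by the terminating zero, 1234567 becomes 1,234,567.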
Here are the category LC_TIME entity names with some reasonable string
values for an English-speaking country. They mostly speak for themselves:
	am_pm		:AM:PM
	days		:Sun:Sunday:Mon:Monday:Tue:Tuesday\
			Wed:Wednesday:Thu:Thursday:Fri:Friday:Sat:Saturday
	dst_rules	:032402:102702
	time_formats	"|%b %D %H:%M:%S %Y|%b %D %Y|%H:%M:%S"
	months		:Jan:January:Feb:February:Mar:March\
			Apr:April:May:May:Jun:June\
			Jul:July:Aug:August:Sep:September\
			Oct:October:Nov:November:Dec:December
	time_zone	:EST:EDT:+0300
Note that you can continue a line by ending it with a backslash. Including
all continuations, a line can have up to 255 characters.
The string time_formats specifies the formats used by strftime to
generate locale-specific date and time (%c), date (%x), and time (%X). I
discuss these formats further in Chapter 15: <time.h>.
"TIMEZONE" and "TZ"
The third field of time_zone counts minutes from UTC (Greenwich Mean
Time), not hours. That allows for the various time zones around the world
that are not an integral number of hours away from UTC. If this string is
empty, the time functions look for a replacement string in the environment
variable "TIMEZONE". (You can append a similar replacement for dst_rules.)
If that variable is also absent, the functions then look for the widely-used
environment variable "TZ". That string takes the form EST05EDT, where the
number in the middle counts hours West of UTC.
Daylight Savings Time
The string dst_rules is even more ornate. It takes one of two general
forms:
	(YYYY)MMDDHH+W
	(YYYY)MMDDHH-W
Here, YYYY in parentheses is the year, MM is the month number, DD is the day
of the month, W is the number of days past Sunday, and HH is the hour
number in a 24-hour day. +W advances to the next such day of the week on
or after the date MMDD in the year in question. -W backs up to the next
previous such day of the week before the specified date. You can omit the
fields that specify year, hour, and day of the week.
The fairly simple example above calls for Daylight Savings Time to begin
on 24 March (MMDD = 0324) at 0200 (HH = 02) and to end on 27 October at the
same time. To switch on the last Sundays in March and October each year
since 1990, write :(1990)040102-0:110102-0. (Years before 1990 don't
correct for Daylight Savings Time, by this set of rules.)
If you live below the Equator, the year begins in Daylight Savings Time.
You can capture that nicety by adding a third reversal field, as in
:0101:030202:100202. You can also write an arbitrary number of year rules
going back in time. Qualify the first rule of each set with a starting year
(YYYY) for the rule to take effect. You can capture the entire history of law
governing Daylight Savings Time in a given state or country, if you choose.
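The +W and -W adjustments are easy to get wrong by a week, so a small sketch may help. It resolves one rule to a calendar date by letting mktime compute the weekday and normalize the adjusted day. The function name resolve_rule and its parameters are hypothetical; this is an illustration of the rule as described above, not the code the time functions use.

```c
#include <string.h>
#include <time.h>

/*
 * Resolve a rule date for the given year.
 * sign > 0 models "+W": the first weekday w on or after mon/mday.
 * sign < 0 models "-W": the last weekday w strictly before mon/mday.
 * w counts days past Sunday (0 through 6).
 * Stores the resulting month (1-12) and day; returns nonzero on success.
 */
static int resolve_rule(int year, int mon, int mday, int sign, int w,
    int *out_mon, int *out_mday)
{
    struct tm t;
    int delta;

    memset(&t, 0, sizeof t);
    t.tm_year = year - 1900;
    t.tm_mon = mon - 1;
    t.tm_mday = mday;
    t.tm_hour = 12;             /* midday avoids clock-change edge cases */
    t.tm_isdst = -1;
    if (mktime(&t) == (time_t)-1)
        return 0;
    if (sign > 0)
        delta = (w - t.tm_wday + 7) % 7;            /* 0 to 6 days ahead */
    else
        delta = -(((t.tm_wday - w + 6) % 7) + 1);   /* 1 to 7 days back */
    t.tm_mday += delta;         /* may leave the month; mktime fixes it */
    t.tm_isdst = -1;
    if (mktime(&t) == (time_t)-1)
        return 0;
    *out_mon = t.tm_mon + 1;
    *out_mday = t.tm_mday;
    return 1;
}
```

For 1990, the rule 0401 with -0 resolves to 25 March, the last Sunday in March, because 1 April 1990 was itself a Sunday and -W never selects the specified date.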
The functions declared in <ctype.h> all are organized around translation
tables. (See Chapter 2: <ctype.h>.) Each is an array of 257 shorts that accepts
subscripts in the interval [-1, 255]. In the locale file, you cannot alter the
contents of element -1, which translates the value of the macro EOF, defined
in <stdio.h>. The entity names for these tables are:
	ctype
	tolower
	toupper
$@
$$
+ 'a' -
'A'
'A'
The special term $@ is the value of the index for each element in the
subrange. (Read the term as "where it's at.") The special term $$ is the value
of the previous contents of the table element. (Read the term as "what its
value is.") Note that you can write a simple (single-character) character
constant to specify its code value, and that you can add and subtract a
sequence of terms. The first two lines are, of course, optional. You inherit
them from the "C" locale.
state tables
Several pairs of functions in this implementation use state tables to
define their behavior, as I discussed on page 99. You can specify up to 16
state tables for each of the three entity names:
	collate
	mbtowc
	wctomb
The first line gives the macro MB_CUR_MAX, defined in <stdlib.h>, the value
1. No multibyte sequence requires more than one character. The second line
defines all elements of state table zero for mbtowc and mbstowcs. It tells the
functions to:
- fold the translation value into the accumulated value ($F)
  with the input code mapped to itself ($@)
- consume the input ($I)
- write the accumulated value as the output ($O)
The successor state is state zero ($0). Translation ends, in this case, when a
zero input code produces a zero wide character.
<locale.h>
expressions
That's the list of entities you can specify in a locale. Now you can
understand why certain funny terms can appear in expressions. An expression itself is simply a sequence of terms that get added together. The last
example above shows that you can add terms simply by writing them one
after the other. The plus signs are accepted in front of terms purely as a
courtesy so that expressions read better.
terms
You can write lots of different terms:
Decimal, octal, and hexadecimal numbers follow the usual rules of C
constants. The sequences 10, 012, and 0xA all represent the decimal value
ten.
A plus sign before a term leaves its value unchanged. A minus sign
negates the term.
Single quotes around a character yield the value of the character, just as
for a character constant in a C source file. (No escape sequences, such as
'\012', are permitted, however.)
An uppercase letter has the value last assigned by a SET. All such
variables are set to zero at program startup.
$x terms
In addition to these terms, a dollar sign is the first character of a
two-character name that has a special meaning, as outlined below. Here are
the special terms signalled by a leading dollar sign:
$$ - the current contents of a table element.
$@ - the index of a table element. $$ and $@, if present, must precede
any other terms in an expression.
$^ - the value of the macro CHAR_MAX.
$# - the value of the macro UCHAR_MAX.
[$a $b $f $n $r $t $v] - the values of the character escape sequences,
in order, ['\a' '\b' '\f' '\n' '\r' '\t' '\v'].
[$A $C $D $H $L $M $P $S $U $W] - the character-classification bits used
in the table ctype. These specify, in order: extra alphabetics, extra control
characters, digits, hexadecimal digits, lowercase letters, motion-control
characters, punctuation, space characters, uppercase letters, and extra
white-space characters. (See the file ctype.h on page 37 for definitions
of the corresponding macros.)
[$0 $1 $2 $3 $4 $5 $6 $7] - the successor states 0 through 7 in a
state-table element. (No symbols are provided for successor states 8
through 15. Write $7+$1 for state 8, and so forth.)
[$F $I $O $R] - the command bits used in a state-table element. These
specify, in order: fold translated value into the accumulated value,
consume input, produce output, and reverse bytes in the accumulated value.
(See the file xstate.h in Figure 6.5 for definitions of the corresponding
macros.)
With these special terms, you can write expressions in locale files that don't
depend on implementation-specific code values.
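The rule that an expression is a sequence of terms added together can be sketched as follows. This is a simplified illustration, not the library's evaluator: the names skip_space, parse_term, and eval_expr are hypothetical, and the sketch handles only signs, numbers, and single-quoted characters. It omits uppercase variables and the dollar-sign terms.

```c
#include <ctype.h>
#include <stdlib.h>

/* Skip blanks and tabs. */
const char *skip_space(const char *s)
{
    while (*s == ' ' || *s == '\t')
        ++s;
    return s;
}

/* Parse one term: optional signs, then a C-style number or a
 * single-quoted character. Returns 1 and advances *ps on success.
 * On failure, returns 0 and leaves *ps unchanged. */
int parse_term(const char **ps, long *ans)
{
    const char *s = skip_space(*ps);
    int negate = 0;

    for (; *s == '+' || *s == '-'; s = skip_space(s + 1))
        if (*s == '-')
            negate = !negate;
    if (isdigit((unsigned char)s[0])) {
        char *end;

        *ans = strtol(s, &end, 0);   /* base 0 accepts 10, 012, 0xA */
        s = end;
    } else if (s[0] == '\'' && s[1] != '\0' && s[2] == '\'') {
        *ans = (unsigned char)s[1];
        s += 3;
    } else
        return 0;
    if (negate)
        *ans = -*ans;
    *ps = skip_space(s);
    return 1;
}

/* Add terms together until no more can be parsed. Returns 1 only if
 * the whole string was consumed. */
int eval_expr(const char *s, long *sum)
{
    long val;

    if (!parse_term(&s, sum))
        return 0;
    while (parse_term(&s, &val))
        *sum += val;
    return *s == '\0';
}
```

Because strtol is called with a base of zero, the three spellings of ten mentioned above all yield the same value, and writing them one after the other adds them.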
[Listing: locale-file entries for the "USA" locale. Only fragments survive, among them the keywords mon_thousands_sep, negative_sign, and positive_sign and the strings "USD ", ".", "3", "-", and "+".]
The last line delimits the end of the locale. You need such a line only at the
end of the last locale in the locale file (but it is always permissible). To
improve checking, the functions that read the locale file report an error if
end-of-file occurs part way through a locale specification.
function _Getloc revisited
Now you are in a position to understand the remaining functions that
implement <locale.h>. Recall that _Getloc (Figure 6.8) first attempts to find
a locale in memory. If that fails, it then attempts to open the locale file and
scan it for the start of the desired locale. It looks only at lines in the locale
file that begin with the keyword LOCALE. _Getloc calls _Readloc to read each
line and identify its keyword.
Should _Getloc find such a line with the desired name following the
keyword, the function allocates storage for the new locale. It copies the
contents of _Clocale, then changes to the new name. The function _Makeloc
reads the remainder of the information for the locale and alters its storage
accordingly. If _Makeloc reports success, _Getloc adds the new locale to the
list beginning at _Clocale._Next. If _Makeloc reports failure, _Getloc writes
an error message to the standard error stream, discards any allocated
storage, and reports that it could not find the locale. Part of the error
message is the locale-file line that caused the offense.
As a rule, it is bad practice for library functions to write such error
messages. They preempt the programmer's right to decide how best to
recover from an error. I found in this case, however, that the messages are
invaluable. A malformed locale specification is hard to debug if setlocale
reads only part of it or quietly refuses to accept it at all. The library is already
indulging in a complex operation that involves opening and reading a file,
/* _Readloc function */
#include <stdio.h>
#include <string.h>
#include "xlocale.h"

        /* static data */
static const char kc[] =    /* keyword chars */
    "_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

_Locitem *_Readloc(FILE *lf, char *buf, const char **ps)
    {   /* get a line from locale file */
    for (; ; )
        {   /* loop until EOF or full line */
        size_t n;

        for (buf[0] = ' ', n = 1; ; n -= 2)
            if (fgets(buf + n, MAXLIN - n, lf) == NULL
                || buf[(n += strlen(buf + n)) - 1] != '\n')
                return (NULL);  /* EOF or line too long */
            else if (n <= 1 || buf[n - 2] != '\\')
                break;  /* continue only if ends in \ */
        buf[n - 1] = '\0';  /* overwrite newline */
            {   /* look for keyword on line */
            const char *s = _Skip(buf);
            _Locitem *q;

            if (0 < (n = strspn(s, kc)))
                for (q = _Loctab; q->_Name; ++q)
                    if (strncmp(q->_Name, s, n) == 0
                        && strlen(q->_Name) == n)
                        {   /* found a match */
                        *ps = _Skip(s + n - 1);
                        return (q);
                        }
            return (NULL);
            }
        }
    }
of the longest sequence of characters beginning at s all of which are in the
string kc. I chose not to use the character-classification functions from
<ctype.h>, such as isalpha, because they can vary among locales.
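Measuring a keyword against an explicit character set, rather than with isalpha, can be sketched like this. The names keyword_chars, keyword_length, and keyword_matches are hypothetical, and the exact set of keyword characters is an assumption made for the illustration.

```c
#include <string.h>

/* Characters permitted in a keyword, spelled out so that the result
 * does not depend on the current locale. (Assumed set.) */
static const char keyword_chars[] =
    "_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

/* Return the length of the keyword that begins at s (0 if none). */
size_t keyword_length(const char *s)
{
    return strspn(s, keyword_chars);
}

/* Return nonzero if the keyword at the start of line is exactly name. */
int keyword_matches(const char *line, const char *name)
{
    size_t n = keyword_length(line);

    return n == strlen(name) && strncmp(line, name, n) == 0;
}
```

The length comparison matters. Without it, a line that begins with SETTING would be accepted as the keyword SET.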
type _Locitem
_Readloc stores at *ps a pointer to the first character on the line following
the keyword and any white-space. The function also returns a pointer to the
table entry containing information on the keyword that it recognizes. The
header "xlocale.h" defines the types _Lcode and _Locitem as:
enum _Lcode {
    L_GSTRING, L_NAME, L_NOTE, L_SET,
    L_STATE, L_STRING, L_TABLE, L_VALUE
    };
typedef struct {
    const char *_Name;
    size_t _Offset;
    enum _Lcode _Code;
    } _Locitem;
(The scalar type size_t is the integer type of the result of operator sizeof.
Several standard headers define this type. I discuss it at length in Chapter
11: <stddef.h>.) The member _Name points at the name of the keyword.
_Offset holds the offset into the structure _Linfo of the member
corresponding to the keyword (if any). And _Code holds one of the enumerated
values that characterize each instance of _Locitem.
data object _Loctab
_Readloc scans the data object _Loctab, an array of _Locitem, to find the
entry that matches the keyword on each line from the locale file. Figure 6.13
shows the file xloctab.c, which defines _Loctab. It uses the macro offsetof,
defined in <stddef.h>, to determine the offsets into the structure _Linfo. I
use the macro OFF here to shorten the lines in this C source file.
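A keyword table built with offsetof can be sketched as follows. The structure and the names info, item, items, find_item, and string_member are hypothetical stand-ins chosen for this illustration. They are not the declarations from xlocale.h or xloctab.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the locale information structure. */
struct info {
    const char *decimal_point;
    const char *thousands_sep;
};

/* One table entry: a keyword plus the offset of the member it names. */
struct item {
    const char *name;
    size_t offset;
};

#define OFF(member) offsetof(struct info, member)

static const struct item items[] = {
    {"decimal_point", OFF(decimal_point)},
    {"thousands_sep", OFF(thousands_sep)},
    {NULL, 0}   /* a null name ends the table */
};

/* Find the entry for a keyword, or NULL if there is none. */
const struct item *find_item(const char *name)
{
    const struct item *q;

    for (q = items; q->name != NULL; ++q)
        if (strcmp(q->name, name) == 0)
            return q;
    return NULL;
}

/* Address the string member that an entry describes. */
const char **string_member(struct info *p, const struct item *q)
{
    return (const char **)((char *)p + q->offset);
}
```

Storing offsets lets one loop service every keyword. The code that alters a member never needs to know the member's name.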
function _Freeloc
One other function uses _Loctab. Figure 6.14 shows the file xfreeloc.c.
It defines the function _Freeloc. If _Makeloc encounters an invalid line
while reading the locale file, it reports failure back to _Getloc. That function
calls _Freeloc to free any storage allocated for the new locale (including its
name), then frees the _Linfo data object allocated for the new locale. (It
would probably be acceptable to abandon such storage - requesting a
flawed locale should be a rare event - but it is tidier to reclaim heap space
that is no longer needed.) _Freeloc scans _Loctab for any elements that
correspond to members you can alter in _Linfo by writing lines in the locale
file. For each such element of _Loctab, _Freeloc determines whether any
storage was allocated for the new locale. To do so takes a bit of work.
Remember that each new locale begins life as a carbon copy of the "C"
locale. _Makeloc allocates a new table or string only when a locale-file line
calls for a change. Request such a change and _Makeloc compares the
relevant pointer member of the new _Linfo data object against _Clocale. If
the pointers are the same, _Makeloc knows to allocate a fresh version.
Changes apply to the new version, leaving data for the "C" locale alone. If
the pointers differ, _Makeloc assumes that it has already allocated a fresh
version for this new locale. Changes accumulate in the new version.
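The pointer comparison described above amounts to a simple copy-on-write scheme. The following sketch shows the idea for a single string member. The names info, base_locale, set_decimal_point, and free_info are hypothetical, and the structure is reduced far below what the library's _Linfo holds.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much reduced locale record with one string member. */
struct info {
    const char *decimal_point;
};

/* The built-in locale. Its strings must never be freed or altered. */
const struct info base_locale = {"."};

/* Replace a string member. Old storage is released only if this locale
 * already owns a private copy, that is, if its pointer differs from the
 * built-in one. Returns 1 on success, 0 if allocation fails. */
int set_decimal_point(struct info *p, const char *text)
{
    char *copy = (char *)malloc(strlen(text) + 1);

    if (copy == NULL)
        return 0;
    strcpy(copy, text);
    if (p->decimal_point != base_locale.decimal_point)
        free((void *)p->decimal_point);
    p->decimal_point = copy;
    return 1;
}

/* Release whatever the new locale allocated for itself. */
void free_info(struct info *p)
{
    if (p->decimal_point != base_locale.decimal_point)
        free((void *)p->decimal_point);
    p->decimal_point = base_locale.decimal_point;
}
```

The same comparison answers both questions: whether a change needs fresh storage, and whether storage needs to be freed when a flawed locale is discarded.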
Figure 6.14: xfreeloc.c
/* _Freeloc function */
#include "xlocale.h"

void _Freeloc(_Linfo *p)
    {   /* free all storage */
    _Locitem *q;

    for (q = _Loctab; q->_Name; ++q)
        switch (q->_Code)
            {   /* free all pointers */
        case L_STATE:
            {   /* free all state entries */
            int i;
            unsigned short **pt
                = &ADDR(p, q, unsigned short *);

            for (i = _NSTATE; 0 <= --i; ++pt)
                if (*pt && (*pt)[-1] != 0)
                    free(*pt - 1);  /* storage begins one element before the table */
            }
            break;
        case L_TABLE:
            if (NEWADDR(p, q, short *))
                free(ADDR(p, q, short *) - 1);
            break;
        case L_GSTRING:
        case L_NAME:
        case L_STRING:
            if (NEWADDR(p, q, char *))
                free(ADDR(p, q, char *));
            }
    }
function _Makeloc
function _Locterm
function _Locvar
xmakeloc.c (Parts 1 and 2)
/* _Makeloc function */
#include <string.h>
#include "xlocale.h"

static const char *getval(const char *s, unsigned short *ans)
    {   /* accumulate terms */
    unsigned short val;

    if (!_Locterm(&s, ans))
        return (NULL);
    while (_Locterm(&s, &val))
        *ans += val;
    return (s);
    }

int _Makeloc(FILE *lf, char *buf, _Linfo *p)
    {   /* construct locale from text file */
    char *s1;
    const char *s;
    _Locitem *q;
    unsigned short val;
    static const char gmap[] = "0123456789;^";

    while ((q = _Readloc(lf, buf, &s)) != NULL)
        switch (q->_Code)
            {   /* process a line */
        case L_GSTRING: /* alter a grouping string */
        case L_STRING:  /* alter a normal string */
            if (NEWADDR(p, q, char *))
                free(ADDR(p, q, char *));
            if (s[0] == '"'
                && (s1 = strrchr(s + 1, '"')) != NULL
                && *_Skip(s1) == '\0')
                *s1 = '\0', ++s;
            if ((s1 = (char *)malloc(strlen(s) + 1)) == NULL)
                return (0);
            ADDR(p, q, char *) = strcpy(s1, s);
            if (q->_Code == L_GSTRING)
                for (; *s1; ++s1)
                    if ((s = strchr(gmap, *s1)) != NULL)
                        *s1 = *s == '^' ? CHAR_MAX : s - gmap;
            break;
        case L_TABLE:   /* alter a translation table */
        case L_STATE:   /* alter a state table */
            {   /* process tab[#,lo:hi] $x expr */
            int inc = 0;
            unsigned short hi, lo, stno, *usp, **uspp;

            if (*s != '['
                || (s = getval(_Skip(s), &stno)) == NULL)
                return (0);
            if (*s != ',')
                lo = stno, stno = 0;
            else if (q->_Code != L_STATE || _NSTATE <= stno
                || (s = getval(_Skip(s), &lo)) == NULL)
                return (0);
            lo = (unsigned char)lo;
            if (*s != ':')
                hi = lo;
            else if ((s = getval(_Skip(s), &hi)) == NULL)
                return (0);
            else
                hi = (unsigned char)hi;
            if (*s != ']')
                return (0);
            for (s = _Skip(s); s[0] == '$'; s = _Skip(s + 1))
                if (s[1] == '@' && (inc & 1) == 0)
                    inc |= 1;
                else if (s[1] == '$' && (inc & 2) == 0)
                    inc |= 2;
                else
                    break;
            if ((s = getval(s, &val)) == NULL || *s != '\0')
                return (0);
            uspp = &ADDR(p, q, unsigned short *) + stno;
            if (q->_Code == L_TABLE)
                usp = NEWADDR(p, q, short *) ? *uspp : NULL;
            else
                usp = (*uspp)[-1] ? *uspp : NULL;
            if (usp == NULL)
                {   /* setup a new table */
                if ((usp = (unsigned short *)malloc(TABSIZ))
                    == NULL)
                    return (0);
                usp[0] = EOF;   /* allocation flag or EOF */
                memcpy(++usp, ADDR(p, q, short *),
                    TABSIZ - sizeof (short));
                *uspp = usp;
                }
            for (; lo <= hi; ++lo)
                usp[lo] = val + (inc & 1 ? lo : 0)
                    + (inc & 2 ? usp[lo] : 0);
            }
            break;
        case L_VALUE:   /* alter a numeric value */
            if ((s = getval(s, &val)) == NULL || *s != '\0')
                return (0);
            ADDR(p, q, char) = val;
            break;
        case L_SET: /* assign to uppercase variable */
            if (*(s1 = (char *)_Skip(s)) == '\0'
                || (s1 = (char *)getval(s1, &val)) == NULL
                || *s1 != '\0' || _Locvar(*s, val) == 0)
                return (0);
            break;
        case L_NAME:    /* end happily with next LOCALE */
            return (1);
            }
    return (0); /* fail on EOF or unknown keyword */
    }
/* _Locterm and _Locvar functions */
#include <ctype.h>
#include <limits.h>
#include <string.h>
#include "xlocale.h"

        /* static data */
static const char dollars[] = { /* PLUS $@ and $$ */
    "^abfnrtv"  /* character codes */
    "01234567"  /* state values */
    "ACDHLMPSUW"    /* ctype codes */
    "#FIOR"};   /* state commands */
static const unsigned short dolvals[] = {
    CHAR_MAX, '\a', '\b', '\f', '\n', '\r', '\t', '\v',
    0x000, 0x100, 0x200, 0x300, 0x400, 0x500, 0x600, 0x700,
    _XA, _BB, _DI, _XD, _LO, _CN, _PU, _SP, _UP, _XS,
    UCHAR_MAX, ST_FOLD, ST_INPUT, ST_OUTPUT, ST_ROTATE};
static const char uppers[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
static short vars[sizeof (uppers) - 1] = {0};

int _Locvar(char ch, short val)
    {   /* set a $ variable */
    const char *s = strchr(uppers, ch);

    if (s == NULL)
        return (0);
    vars[s - uppers] = val;
    return (1);
    }

int _Locterm(const char **ps, unsigned short *ans)
    {   /* evaluate a term on a locale file line */
    const char *s = *ps;
    const char *s1;
    int mi;

    for (mi = 0; *s == '+' || *s == '-'; s = _Skip(s))
        mi = *s == '-' ? !mi : mi;
    if (isdigit(s[0]))
        *ans = strtol(s, (char **)&s, 0);
    else if (s[0] == '\'' && s[1] != '\0'
        && s[2] == '\'')
        *ans = ((unsigned char *)s)[1], s += 3;
    else if (s[0] && (s1 = strchr(uppers, s[0])) != NULL)
        *ans = vars[s1 - uppers], ++s;
    else if (s[0] == '$' && s[1]
        && (s1 = strchr(dollars, s[1])) != NULL)
        *ans = dolvals[s1 - dollars], s += 2;
    else
        return (0);
    if (mi)
        *ans = -*ans;
    *ps = _Skip(s - 1);
    return (1);
    }
header
"xlocale.h"
Testing <locale.h>
Figure 6.18 shows the test program tlocale.c. It focuses primarily on
the portable behavior you can expect from the functions in <locale.h>. As
a consequence, it doesn't test much of the code presented in this chapter.
To do that, you need to switch to a new locale, such as "USA" presented
earlier. Then you can print the results of the extra function _Fmtval to verify
that the behavior changes as expected.
You can use tlocale.c to test any implementation of Standard C. It
ensures that the "C" locale meets the requirements of the C Standard, both
before and after various changes of locale. It also verifies that you can
establish mixed locales, at least involving the "C" and native locales. It
endeavors to determine whether these two locales differ. You get one of
two messages. For this implementation, the expected output is:
Native locale same as "C" locale
SUCCESS testing <locale.h>
References
ISO Standard 4217:1987 (Geneva: International Standards Organization,
1987). This Standard specifies the three-letter codes for the currencies of
various nations.
Exercises
Exercise 6.1 Write locales that express the monetary conventions for Italy, the Netherlands, Norway, and Switzerland. Use the information from the example
in Section 7.4.2.1 of the C Standard (see page 86).
Exercise 6.2 Write a locale that expresses the character-classification conventions for the
French language. Add the lowercase letters [a ii c e e is 6 a] and their
corresponding uppercase letters [A A A c t t 6 61 to the translation
tables ctype, tolower, and toupper. How do you determine the code values
for these letters under your implementation?
Figure 6.17: xlocale.h
/* xlocale.h internal header */
#include <limits.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include "xstate.h"
#include "xtinfo.h"
Figure 6.18: tlocale.c
/* test locales */
#include <assert.h>
#include <limits.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

int main()
    {   /* test basic properties of locales */
    static int cats[] = {LC_ALL, LC_COLLATE, LC_CTYPE,
        LC_MONETARY, LC_NUMERIC, LC_TIME};
    struct lconv *p = NULL;
    char buf[32], *s;

    assert((p = localeconv()) != NULL);
    testclocale(p);
    assert((s = setlocale(LC_ALL, NULL)) != NULL);
    assert(strlen(s) < sizeof (buf));   /* OK if longer */
    strcpy(buf, s); /* but not safe for this program */
    assert(setlocale(LC_ALL, "") != NULL);
    assert(localeconv() != NULL);
    assert((s = setlocale(LC_MONETARY, "C")) != NULL);
    puts(strcmp(s, "C") ? "Native locale differs from \"C\""
        : "Native locale same as \"C\"");
    assert(setlocale(LC_NUMERIC, "C") != NULL);
    assert((p = localeconv()) != NULL);
    testclocale(p);
    assert(setlocale(LC_ALL, buf) != NULL);
    assert((p = localeconv()) != NULL);
    testclocale(p);
    puts("SUCCESS testing <locale.h>");
    return (0);
    }
Exercise 6.3 Alter the test program tctype.c (shown on page 44) so that it first switches
to the locale in the previous exercise. Does it display what you expect when
you run it?
Exercise 6.4 Write a locale that expresses the monetary and numeric conventions for the
French language. At the very least, you need to alter:
mon_decimal_point
mon_grouping
mon_thousands_sep
negative_sign
decimal_point
grouping
thousands_sep
positive_sign
Test your new locale. (Hint: You may want to commandeer test programs
in this and later chapters as a starting point.)
Exercise 6.5 [Harder] Tables of values with many fraction digits often group digits by
fives going to the right from the decimal point. An example is:
+1.00000 00000 00
-0.16666 66666 67
+0.00833 33333 33
-0.00019 84126 98
Exercise 6.6 [Harder] You want a program to be able to construct its own locale.
Rewriting the locale file is unacceptable. What function(s) would you add
to <locale.h> to permit a program to name, construct, and add new locales
on the fly? Write the user documentation that a programmer would need
to add locales.
Exercise 6.7 [Very hard] Implement the capabilities you described in the previous
exercise.
Chapter 7
Each function should accept all argument values in its domain (the
argument values for which it is mathematically defined). It should
report a domain error for all other arguments. In this case, the function
returns a special code that represents NaN for not-a-number.
Each function should produce a finite result if its value has a finite
representation. It should report a range error for all values too large or
too small to represent. If the value is too large in magnitude, the function
returns a special code +Inf that represents plus infinity, or the negative
of that code -Inf that represents minus infinity, as appropriate. If the
value is too small in magnitude, the function returns zero.
Each function should produce the most sensible result for the argument
values NaN, +Inf, and -Inf. On an implementation that supports multiple NaN codes, such as IEEE 754, the functions preserve particular NaN
codes wherever possible. If a function has a single argument and the
value of that argument is a NaN, for example, the function returns the
value of the argument.
Each function should endeavor to produce a result whose precision is
within two bits of the best-available approximation to any representable
result.
No function should ever generate an overflow, underflow, or zero
divide, regardless of its argument values and regardless of the result.
No function requires a floating-point representation other than double to
perform intermediate calculations.
I believe I have achieved these goals, as best as I can tell from the testing
these functions have undergone to date.
non-goals
I should also point out a number of goals I chose not to achieve:
The library doesn't try to distinguish +0 from -0. IEEE 754 worries quite
a bit about this distinction. All the architectures I mentioned above can
represent both flavors of zero. But I have trouble accepting (or even
understanding) the rationale for this extra complexity. I can sympathize
with recent critiques of the IEEE 754 Standard that challenge that rationale.
Most of all, I found the functions quite hard enough to write without
fretting about the sign of nothing.
The library does nothing with various flavors of NaNs. IEEE 754 arithmetic,
for example, distinguishes quiet NaNs from signalling NaNs. The
latter should generate a signal or raise an exception. This implementation
essentially treats all NaNs as quiet NaNs.
I provide low-level primitives only for the IEEE 754 representation. They
happen to work rather well with the DEC VAX floating-point representation
as well, but the fit isn't perfect. The VAX hardware doesn't
recognize as special the code values for things like +Inf and -Inf. Such
codes can disappear in expressions that perform arithmetic with them.
The primitives must be altered to support System/370 floating-point.
I have not checked the functions on System/370. The "wobbling precision"
on that architecture requires special handling. Mostly, I have tried
to provide such special handling, but it may not be thorough enough.
Many functions are probably suboptimal for machines that retain much
fewer than 53 bits of precision in type double. The C Standard permits a
double to retain as few as ten decimal digits of precision - about 31 bits.
For such machines, you should reconsider the approximations chosen
in various math functions.
Functions that use approximations will almost certainly fail for machines
that retain more than 56 bits of precision. For such machines, you
must reconsider the approximations chosen.
Floating-point representations with bases other than 2 or 16 are poorly
supported by this implementation of the math library. An implementation
with base-10 floating-point arithmetic, for example, would call for
significant redesign.
Even with these constraints, you should find that this implementation of
the math library is useful in a broad variety of environments.
Computing math functions safely and accurately requires a peculiar
style of programming:
finite precision
The finite precision of floating-point representation is both a blessing
and a curse. It lets you choose approximations of limited accuracy. But
it offers only limited accuracy for intermediate calculations that may
need more.
finite range
The finite range of floating-point representation is also both a blessing
and a curse. It lets you choose safe data types to represent arbitrary
exponents. But it can surprise you with overflow or underflow in
intermediate calculations.
You learn to dismantle floating-point values by performing various
seminumerical operations on them. The separate pieces are fractions with a
narrow range of values, integer exponents, and sign bits. You can work on
these pieces with greater speed, accuracy, and safety. Then you paste the
final result together using other seminumerical operations.
Cody and Waite
An excellent book on writing math libraries is William J. Cody Jr. and
William Waite, Software Manual for the Elementary Functions. Many of the
functions in this chapter make use of algorithms and techniques described
by Cody and Waite. Quite a few use the actual approximations derived by
Cody and Waite especially for their book. I confess that on a few occasions
I thought I could eliminate some of the fussier steps they recommend. All
too often I was proved wrong. I happily build on the work of these careful
pioneers.
elefunt tests
As a final note, the acid test for many of the functions declared in
<math.h> was the public-domain elefunt (for "elementary function") tests.
These derive from the carefully wrought tests in Cody and Waite.
HUGE_VAL
domain error
For all functions, a domain error occurs if an input argument is outside the domain over which
the mathematical function is defined. The description of each function lists any required domain
errors; an implementation may define additional domain errors, provided that such errors are
consistent with the mathematical definition of the function.[105] On a domain error, the function
returns an implementation-defined value; the value of the macro EDOM is stored in errno.
range error
Similarly, a range error occurs if the result of the function cannot be represented as a double
value. If the result overflows (the magnitude of the result is so large that it cannot be represented
in an object of the specified type), the function returns the value of the macro HUGE_VAL, with
the same sign (except for the tan function) as the correct value of the function; the value of the
macro ERANGE is stored in errno. If the result underflows (the magnitude of the result is so
small that it cannot be represented in an object of the specified type), the function returns zero;
whether the integer expression errno acquires the value of the macro ERANGE is implementation-defined.
7.5.2 Trigonometric functions
acos
Description
The acos function computes the principal value of the arc cosine of x. A domain error occurs
for arguments not in the range [-1, +1].
Returns
The acos function returns the arc cosine in the range [0, π] radians.
asin
Description
The asin function computes the principal value of the arc sine of x. A domain error occurs
for arguments not in the range [-1, +1].
Returns
The asin function returns the arc sine in the range [-π/2, +π/2] radians.
atan
Description
The atan function computes the principal value of the arc tangent of x.
Returns
The atan function returns the arc tangent in the range [-π/2, +π/2] radians.
atan2
Description
The atan2 function computes the principal value of the arc tangent of y/x, using the signs
of both arguments to determine the quadrant of the return value. A domain error may occur if both
arguments are zero.
Returns
The atan2 function returns the arc tangent of y/x, in the range [-π, +π] radians.
cos
Description
The cos function computes the cosine of x (measured in radians).
Returns
The cos function returns the cosine value.
sin
Description
The sin function computes the sine of x (measured in radians).
Returns
The sin function returns the sine value.
tan
Description
The tan function returns the tangent of x (measured in radians).
Returns
The tan function returns the tangent value.
cosh
Description
The cosh function computes the hyperbolic cosine of x. A range error occurs if the magnitude
of x is too large.
Returns
The cosh function returns the hyperbolic cosine value.
sinh
Description
The sinh function computes the hyperbolic sine of x. A range error occurs if the magnitude
of x is too large.
Returns
The sinh function returns the hyperbolic sine value.
tanh
Description
The tanh function computes the hyperbolic tangent of x.
Returns
The tanh function returns the hyperbolic tangent value.
exp
Description
The exp function computes the exponential function of x. A range error occurs if the magnitude
of x is too large.
Returns
The exp function returns the exponential value.
frexp
Description
The frexp function breaks a floating-point number into a normalized fraction and an integral
power of 2. It stores the integer in the int object pointed to by exp.
Returns
The frexp function returns the value x, such that x is a double with magnitude in the
interval [1/2, 1) or zero, and value equals x times 2 raised to the power *exp. If value is
zero, both parts of the result are zero.
ldexp
7.5.4.3 The ldexp function
Synopsis
#include <math.h>
double ldexp(double x, int exp);
Description
The ldexp function multiplies a floating-point number by an integral power of 2. A range
error may occur.
Returns
The ldexp function returns the value of x times 2 raised to the power exp.
log
Description
The log function computes the natural logarithm of x. A domain error occurs if the argument
is negative. A range error may occur if the argument is zero.
Returns
The log function returns the natural logarithm.
log10
Description
The log10 function computes the base-ten logarithm of x. A domain error occurs if the
argument is negative. A range error may occur if the argument is zero.
Returns
The log10 function returns the base-ten logarithm.
modf
Description
The modf function breaks the argument value into integer and fraction parts, each of which
has the same sign as the argument. It stores the integer part as a double in the object pointed to
by iptr.
Returns
The modf function returns the signed fractional part of value.
pow
Description
The pow function computes x raised to the power y. A domain error occurs if x is negative
and y is not an integral value. A domain error occurs if the result cannot be represented when x
is zero and y is less than or equal to zero. A range error may occur.
Returns
The pow function returns the value of x raised to the power y.
sqrt
Description
The sqrt function computes the nonnegative square root of x. A domain error occurs if the
argument is negative.
Returns
The sqrt function returns the value of the square root.
ceil
Description
The ceil function computes the smallest integral value not less than x.
Returns
The ceil function returns the smallest integral value not less than x, expressed as a double.
fabs
Description
The fabs function computes the absolute value of a floating-point number x.
Returns
The fabs function returns the absolute value of x.
floor
Description
The floor function computes the largest integral value not greater than x.
Returns
The floor function returns the largest integral value not greater than x, expressed as a double.
fmod
Description
The fmod function computes the floating-point remainder of x/y.
Returns
The fmod function returns the value x - i * y, for some integer i such that, if y is nonzero,
the result has the same sign as x and magnitude less than the magnitude of y. If y is zero, whether
a domain error occurs or the fmod function returns zero is implementation-defined.
Footnotes
103. See "future library directions" (7.13.4).
104. HUGE_VAL can be positive infinity in an implementation that supports infinities.
105. In an implementation that supports infinities, this allows infinity as an argument to be a
domain error if the mathematical domain of the function does not include infinity.
Using <math.h>
I have to assume that you have a good notion of what you intend to do
with most functions declared in <math.h>. Few people are struck with a
sudden urge to compute a cosine. I confine my remarks, therefore, to the
usual comments on individual functions:
HUGE_VAL - This macro traditionally expands to a double constant that is
supposed to be ridiculously large. Often, it equals the expansion of DBL_MAX,
defined in <float.h>. On machines that lack a special code for infinity (Inf),
returning such a large value is considered the best way to warn that a range
error has occurred. Be warned, however, that HUGE_VAL may very well equal
Inf. It is probably safe to compare the return value of a math function against
HUGE_VAL or -HUGE_VAL. (It is probably better to test whether errno has been
set to ERANGE. Both of these macros are defined in <errno.h>.) Don't use
HUGE_VAL any other way.
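Both checks can be combined in a small wrapper. The sketch below uses ldexp because it is one of the functions for which a range error may occur. The name scale_checked is hypothetical, and whether errno is set on underflow is implementation-defined, so the sketch treats any report of ERANGE as a failure.

```c
#include <errno.h>
#include <math.h>

/* Scale x by 2 raised to the power n. Returns 1 and stores the result
 * on success. Returns 0 if the library reports a range error, either
 * through errno or by returning HUGE_VAL or -HUGE_VAL. */
int scale_checked(double x, int n, double *result)
{
    double y;

    errno = 0;   /* library functions never clear errno themselves */
    y = ldexp(x, n);
    if (errno == ERANGE || y == HUGE_VAL || y == -HUGE_VAL)
        return 0;
    *result = y;
    return 1;
}
```

Note that errno must be cleared before the call. Otherwise a stale ERANGE left by some earlier function would be mistaken for a new error.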
acos - The functions acos and asin are often computed by a common
function. Each effectively computes one of the acute angles in a right
triangle, given the length of one of the sides and the hypotenuse. Be wary,
therefore, of arguments to acos that are ratios, particularly if one of the
terms looks like sqrt(1.0 - x * x). You may very well want to call asin,
atan, or even better, atan2.
asin - See acos above.
atan - The functions atan and atan2 are often computed by a common
function. The latter is much more general, however. Use it in preference to
atan, particularly if the argument is a ratio. Also see acos above.
atan2 - This function effectively computes the angle that a radius vector
makes with the X-axis, given the coordinates of a point in the X-Y plane. It
is by far the most general of the four functions acos, asin, atan, and atan2.
Use it in preference to the others.
ceil - The functions ceil, floor, and modf let you manipulate the
fraction part of a floating-point value in various ways. Using them is much
safer than converting to an integer type because they can manipulate
arbitrary floating-point values without causing overflow. Note that ceil
rounds to the right along the X-axis, while floor rounds to the left. To round
an arbitrary floating-point value x to the nearest integer, write:
x < 0.0 ? ceil(x - 0.5) : floor(x + 0.5)
cos - The
0.5 * (exp(x) + exp(-x))
or any of its optimized forms. Unlike this expression, cosh should generate
a more accurate result, and cover the full range of x for which the function
value is representable.
exp - If the argument to exp has the form y * i o g ( x ) , replace the
expression with pow (x, y) . The latter should be more precise.
fabs - This function should be reasonably fast. It should also work
properly for the arguments Inf and -Inf, if the implementation supports
those special codes.
floor - See c e i l above.
fmod- This function determines the floating-point analog to a rernainder in integer division. You can sometimes use it to advantage in reducing
an argument to a subrange within a repeated interval. As such, fmod is
better and safer than subtracting a multiple of the interval directly. Other
techniques described later in this chapter often do a better job of argument
reduction, however.
frexp - Use this function to partition a floating-point value when you
can usefully work on its fraction and exponent parts separately. The companion function is often ldexp below.
ldexp -Use this function to recombine the fraction and exponent parts
of a floating-point value after you have worked on them separately. The
companion function is often frexp above.
log- iog(x) is the natural logarithm, often written loge(x)or ln(x).You
can, of course, obtain the logarithm of x to any base b by multiplying the
value of this function by the conversion factor logb(e) (or 1/loge(b)).
log10 - loglo (x) is often computed from log (x). If you find yourself
multiplying the result of loglo by a conversion factor, consider calling log
instead.
mdf - Use this function to partition a floating-point value when you
can usefully work on its integer and fraction parts separately.
pow -This is often the most elaborate of all the functions declared in
<math.h>.Agood implementation will generate better results for pow (x, y)
than the apparent equivalent exp (y log (x) ). It may take longer, however.
Replace pow (e, y) with exp (y) where e is the base of natural logarithms.
Replace pow(x, o . 5 ) with sqrt (x). And replace pow (x, 2 . 0 ) with x x.
sin - See cos above.
sinh - Use this function instead of the apparent identity:
sinh(x) = 0.5 * (exp(x) - exp(-x))
or any of its optimized forms. Unlike this expression, sinh should generate
a more accurate result, particularly for small arguments. The function also
covers the full range of x for which the function value is representable.
sqrt - This function is generally much faster than the apparent
equivalent pow(x, 0.5).
tan - This function effectively reduces its argument to a range of π
radians, centered about the X-axis. Omit adding to the argument any
multiple of 2*π. The function will probably do a better job than you of
eliminating multiples of 2*π. Note, however, that each multiple of 2*π in
the argument reduces the useful precision of the result of tan by almost
three bits. For large enough arguments, the result of the function can be
meaningless even though the function reports no error.
tanh - Use this function instead of the apparent identity:
tanh(x) = (exp(2.0 * x) - 1.0) / (exp(2.0 * x) + 1.0)
or any of its optimized forms. Unlike this expression, tanh should generate
a more accurate result, particularly for small arguments. The function also
covers the full range of x for which the function value is representable.
Figure 7.1: math.h
Figure 7.2: xvalues.c
/* -- IEEE 754 version */
#include "xmath.h"

        /* macros */
#define NBITS   (48+_DOFF)
#if _D0
#define INIT(w0)    0, 0, 0, w0
#else
#define INIT(w0)    w0, 0, 0, 0
#endif
        /* static data */
_Dconst _Hugeval = {{INIT(_DMAX<<_DOFF)}};
_Dconst _Inf = {{INIT(_DMAX<<_DOFF)}};
_Dconst _Nan = {{INIT(_DNAN)}};
_Dconst _Rteps = {{INIT((_DBIAS-NBITS/2)<<_DOFF)}};
_Dconst _Xbig = {{INIT((_DBIAS+NBITS/2)<<_DOFF)}};
Figure 7.2 shows the file xvalues.c that defines this handful of values.
It includes a definition for _Inf that matches _Hugeval. I provide both in
case you choose to alter the definition of HUGE_VAL. The file also defines:
_Nan, the code for a generated NaN that functions return when no
operand is also a NaN
_Rteps, the square root of DBL_EPSILON (approximately), used by some
functions to choose between different approximations
_Xbig, the inverse of _Rteps._D, used by some functions to choose
between different approximations
The need for the last two values will become clearer when you see how
functions use them.
The file xvalues.c is essentially unreadable. It is parametrized much like
the file xfloat.c, shown on page 68. Both files make use of system-dependent
parameters defined in the internal header <yvals.h>.
xvalues.c does not directly include <yvals.h>. Instead, it includes the
internal header "xmath.h" that includes <yvals.h> in turn. All the files that
implement <math.h> include "xmath.h". Since that file contains an assortment
of distractions, I show it in pieces as the need arises. You will find a
complete listing of "xmath.h" in Figure 7.38. Here are the macros defined
in "xmath.h" that are relevant to xvalues.c:
#define _DFRAC  ((1<<_DOFF)-1)
#define _DMASK  (0x7fff&~_DFRAC)
#define _DMAX   ((1<<(15-_DOFF))-1)
#define _DNAN   (0x8000|_DMAX<<_DOFF|1<<(_DOFF-1))
If you can sort through this nonsense, you will observe that:
the code for Inf has the largest-possible characteristic (_DMAX) with all
fraction bits zero
the code for generated NaN has the largest-possible characteristic with
the most-significant fraction bit set
Figure 7.3: fabs.c
/* fabs function */
#include "xmath.h"

double (fabs)(double x)
    {   /* compute fabs */
    switch (_Dtest(&x))
        {   /* test for special codes */
    case NAN:
        errno = EDOM;
        return (x);
    case INF:
        errno = ERANGE;
        return (_Inf._D);
    case 0:
        return (0.0);
    default:    /* finite */
        return (x < 0.0 ? -x : x);
        }
    }
In general, a NaN has at least one nonzero fraction bit. I chose this particular
code for generated NaN to match the behavior of the Intel 80x87 math
coprocessor.
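To make these encodings concrete, here is a minimal sketch that classifies a value by looking at its bits. It is not the book's code. It assumes a 64-bit IEEE 754 double, and the names classify_double, double_from_bits, and the CAT_ values are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical category codes, for illustration only. */
enum { CAT_ZERO, CAT_FINITE, CAT_INF, CAT_NAN };

/* Build a double from a raw 64-bit pattern (assumes IEEE 754 binary64). */
double double_from_bits(uint64_t bits)
{
    double d;

    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Classify x by its characteristic (11 bits) and fraction (52 bits). */
int classify_double(double x)
{
    uint64_t bits, frac;
    unsigned xchar;

    memcpy(&bits, &x, sizeof bits);
    xchar = (unsigned)((bits >> 52) & 0x7FF);
    frac = bits & 0xFFFFFFFFFFFFFULL;

    if (xchar == 0x7FF)             /* largest-possible characteristic */
        return frac != 0 ? CAT_NAN : CAT_INF;
    if (xchar != 0 || frac != 0)    /* normalized or gradual underflow */
        return CAT_FINITE;
    return CAT_ZERO;
}
```

The pattern 0x7FF8000000000000 has the largest characteristic with only the most-significant fraction bit set, which matches the description of the generated NaN given above.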
The presence of all these codes makes even the simplest functions
nontrivial. For example, Figure 7.3 shows the file fabs.c. In a simpler
world, you could reduce it to the last return statement:
return (x < 0.0 ? -x : x);
Here, however, we want to handle NaN, -Inf, and +Inf properly along with
zero and finite values of the argument x. That takes a lot more testing.
Figure 7.4: xdtest.c
/* _Dtest function -- IEEE 754 version */
#include "xmath.h"

short _Dtest(double *px)
    {   /* categorize *px */
        /* NaN or INF */
        /* finite */
        /* zero */
    }
Figure 7.4 shows the file xdtest.c. It defines the function _Dtest that
categorizes a double value. The internal header "xmath.h" defines the various
offsets and category values that _Dtest uses. The macro definitions of
interest here are:
#if _D0==3
#define _D1 2
#define _D2 1
#define _D3 0
#else
#define _D1 1   /* big-endian order */
#define _D2 2
#define _D3 3
#endif
#include "xmath.h"

double (ceil)(double x)
    {   /* compute ceil(x) */
    return (_Dint(&x, 0) < 0 && 0.0 < x ? x + 1.0 : x);
    }
Figure 7.6:
/* floor function */
double (floor)(double x)
    {   /* compute floor(x) */
    return (_Dint(&x, 0) < 0 && x < 0.0 ? x - 1.0 : x);
    }
Figure 7.8: modf.c
/* modf function */
#include "xmath.h"

double (modf)(double x, double *pint)
    {   /* compute modf(x, &intpart) */
    *pint = x;
    switch (_Dint(pint, 0))
        {   /* test for special codes */
    case NAN:
        return (x);
    case INF:
    case 0:
        return (0.0);
    default:    /* finite */
        return (x - *pint);
        }
    }
/* frexp function */
#include "xmath.h"

double (frexp)(double x, int *pexp)
    {   /* compute frexp(x, &i) */
    short binexp;

    switch (_Dunscale(&binexp, &x))
        {   /* test for special codes */
    case NAN:
    case INF:
        errno = EDOM;
        *pexp = 0;
        return (x);
    case 0:
        *pexp = 0;
        return (0.0);
    default:    /* finite */
        *pexp = binexp;
        return (x);
        }
    }
Figure 7.10: ldexp.c
double (ldexp)(double x, int xexp)
    {
    switch (_Dtest(&x))
        {
    case NAN:
        errno = EDOM;
        break;
    case INF:
        errno = ERANGE;
        break;
    case 0:
        break;
    default:    /* finite */
        if (0 <= _Dscale(&x, xexp))
            errno = ERANGE;
        }
    return (x);
    }
Figure 7.11: xdunscal.c
/* _Dunscale function -- IEEE 754 version */
#include "xmath.h"

short _Dunscale(short *pex, double *px)
    {   /* separate *px to 1/2 <= |frac| < 1 and 2^*pex */
    unsigned short *ps = (unsigned short *)px;
    short xchar = (ps[_D0] & _DMASK) >> _DOFF;
Figure 7.10 shows the file ldexp.c. The function ldexp faces problems
similar to frexp, only in reverse. Once it dispatches any special codes, it
still has a nontrivial task to perform. It too calls on a low-level function.
Let's look at the two low-level functions.
Figure 7.11 shows the file xdunscal.c. It defines the function _Dunscale,
which combines the actions of _Dtest and frexp in a form that is handier
for several other math functions. By calling _Dunscale, the function frexp
is left with little to do.
_Dunscale itself has a fairly easy job except when presented with a
gradual underflow. A normalized value has a nonzero characteristic and
an implicit fraction bit to the left of the most-significant fraction bit that is
represented. Gradual underflow is signaled by a zero characteristic and a
nonzero fraction with no implicit leading bit. Both these forms must be
converted to a normalized fraction in the range [0.5, 1.0), accompanied by
the appropriate binary exponent. The function _Dnorm, described below,
handles this messy job.
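The split into a fraction and a binary exponent can be sketched as follows. This is not the book's _Dunscale: it assumes a 64-bit IEEE 754 double, ignores zero, Inf, and NaN, and normalizes a gradual underflow by multiplying by 2^54 instead of shifting words as _Dnorm does. The name unscale_sketch is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Split finite nonzero x into frac * 2^(*pexp) with 0.5 <= |frac| < 1.
   Sketch only: assumes IEEE 754 binary64; zero, Inf and NaN not handled. */
double unscale_sketch(double x, int *pexp)
{
    uint64_t bits;
    int adjust = 0;
    int xchar;

    assert(sizeof x == sizeof bits);
    memcpy(&bits, &x, sizeof bits);
    if (((bits >> 52) & 0x7FF) == 0) {
        /* Gradual underflow: no implicit leading bit. Multiplying by
           2^54 is exact and yields a normalized value. */
        x *= 18014398509481984.0;   /* 2^54 */
        adjust = -54;
        memcpy(&bits, &x, sizeof bits);
    }
    xchar = (int)((bits >> 52) & 0x7FF);
    *pexp = xchar - 1022 + adjust;  /* bias chosen so frac is in [0.5, 1) */
    /* Replace the characteristic with the one that encodes 2^-1. */
    bits = (bits & ~(0x7FFULL << 52)) | (1022ULL << 52);
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For example, 8.0 splits into the fraction 0.5 and the exponent 4, and the smallest gradual-underflow value splits into 0.5 and -1073.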
Figure 7.12 shows the file xdscale.c that defines the function _Dscale.
It too frets about special codes, because of the other ways that it can be
called. Adding the short value xexp to the exponent of a finite *px can cause
overflow, gradual underflow, or underflow. You even have to worry about
integer overflow in forming the new exponent. That's why the function
first computes the sum in a long.
Most of the complexity of the function _Dscale lies in forming a gradual
underflow. The operation is essentially the reverse of _Dnorm.
Figure 7.13 shows the file xdnorm.c that defines the function _Dnorm. It
normalizes the fraction part of a gradual underflow and adjusts the
characteristic accordingly. To improve performance, the function shifts the
fraction left 16 bits at a time whenever possible. That's why it must be
prepared to shift right as well as left one bit at a time. It may overshoot and
be obliged to back up.
Figure 7.14 shows the file fmod.c. The function fmod is the last of the
seminumerical functions declared in <math.h>. It is also the most complex.
In principle, it subtracts the magnitude of y from the magnitude of x
repeatedly until the remainder is smaller than the magnitude of y. In
practice, that could take an astronomical amount of time, even if it could
be done with any reasonable precision.
What fmod does instead is scale y by the largest possible power of two
before each subtraction. That can still require dozens of iterations, but the
result is reasonably precise. Note the way fmod uses _Dscale and _Dunscale
to manipulate exponents. It uses _Dunscale to extract the exponents of x
and y to perform a quick but coarse comparison of their magnitudes. If fmod
determines that a subtraction might be possible, it uses _Dscale to scale x
to approximately the right size.
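The scale-then-subtract idea can be sketched without the internal helpers. The version below is an illustration only: it finds the power of two by repeated doubling instead of calling _Dunscale and _Dscale, and it assumes finite arguments with a nonzero y. The name fmod_sketch is hypothetical.

```c
#include <assert.h>

/* Remainder of x / y for finite x and finite nonzero y (sketch only;
   the special codes that the real function must handle are ignored). */
double fmod_sketch(double x, double y)
{
    int neg = x < 0.0;

    assert(y != 0.0);
    if (x < 0.0)
        x = -x;
    if (y < 0.0)
        y = -y;
    while (y <= x) {
        double t = y;

        /* Scale |y| by the largest power of two that still fits in x.
           Doubling is exact, so no precision is lost here. */
        while (t + t <= x)
            t += t;
        x -= t;     /* exact, because t <= x < 2*t */
    }
    return neg ? -x : x;
}
```

Each pass of the outer loop removes at least the leading bit of the quotient, so the number of subtractions is bounded by the difference between the two exponents rather than by the size of the quotient.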
Figure 7.12:
xdcsale. c
Part 1
'* D s c a l e function
!in&de " x m a t h .h"
--
I E E E 7 5 4 version
*/
( x c h a r = DMAX)
return ( p s [ - ~ ~ ] 6
I I ps[-DPI I I pe[-D31
else i f ( 0 < x c h a r )
/*
if
? NAN : I N F ) ;
/*
else i f ( ( x c h a r = - D n o r m ( p s ) ) = 0 )
return ( 0 ) ;
lexp = ( 1 o n g ) x e x p + xchar;
i f (-DMAX o l e x p )
/*
*px = pa[-DO]
6 -DSIGN
return ( I N F ) ;
NaN or I N F
1 1 ps [-Dl]
? --1nf.-D
finite
/*
zero
o v e r f l o w , r e t u r n +/-INF
: -1nf.-D;
else i f ( 0 < l e x p )
/*
ps [-DO] = ps [-DO]
return ( F I N I T E ) ;
6 --DMASK
f i n i t e r e s u l t , repack
1
else
/*
1
unsigned short sign = ps [-DO]
d e n o r m a l i z e d , scale
6 -DSIGN;
1
if
= - x e x p ) != 0 )
1
pa[-D3]
= pa[-D3]
>> xexp
I pa[-DP] << 16
xexp;
pa[-DS] = pa[-DP]
>> xexp
I pa[-Dl] << 16 - xexp;
ps [-Dl] = ps [-Dl] >> xexp
I ps [-DO] << 16 - xexp;
ps [-DO] >% xexp;
( (xexp
1
1
/*
scale by b i t s
if (0 <= xexp
I I ps[-D21
Continuing
xdscale.c
(ps[-DO] I I ps[-Dl]
I I ps[-D31))
66
Part 2
/*
&normalized
*/
ps[-DO] I= sign;
return (FINITE);
1
else
I
/* underflow, return +/-0 * /
ps[-DO] = sign, ps[-Dl] = 0;
ps [-D2] = 0, ps [-D3] = 0;
return (0);
1
1
1
Figure 7.13:
xdnorm.c
--
'*
Dnorm function
linclude "xmath. h"
*/
short xchar;
unsigned short sign = ps [-DO]
-DSI(;N;
xchar = 0;
if ((PSI-DO] 6= -DFRAC) != 0 I I ps[-Dl]
I I ps[-D21 I I Ps[-D31)
/* nonzero, scale *,
/* shift left by 16
I
Ps [-DO1 = ps [-Dl], Ps [-Dl1 = Ps [-D21;
ps [-D2 ] = ps [-D31, ps [-D31 = 0;
*,
I
ps [-DO]
ps[-Dl]
ps[-D2]
ps [-D3]
/* shift left by 1 *,
= p~ [-DO] << i I ps [-DII >> 15;
= ps[-Dl] << 1 I ps[-D2] >> 15;
= ps[-D2] << 1 I ps[-D3] >> 15:
<<= 1;
for (;
1
ps [-DO]
6=
-DFRnC;
1
PSI-DO1 I= sign;
return (xchar);
1
Figure 7.14:
f mod. c
'*
h o d function */
linclude "xmath . h"
buble (fmod) (double x, double y)
/*
compute fmod(x, y) */
1
e l s e i f ( e r r x = 0 I I e r r y == INF)
r e t u r n (x);
/* fmod (0, nonzero) o r fmod ( f i n i t e , INF) */
else
I
/* fmod(finite, f i n i t e ) */
double t;
s h o r t n, neg, ychar;
(y < 0.0)
y = -y;
i f (x < 0.0)
x = -x, neg = 1;
else
neg = 0;
f o r (t = y, -Dunscale (bychar, b t ) , n = 0; ; )
I
/* s u b t r a c t IyI u n t i l Ixl<lyI * I
s h o r t xchar;
if
t = x;
i f (n < 0 I I -Dunscale(bxchar, b t ) == 0
I I (n = xchar - ychar) < 0)
r e t u r n (neg ? -x : x) ;
f o r (; 0 <= n; --n)
I
/* t r y t o s u b t r a c t 1 yl*2"n */
t = y, -Dscale (st, n);
i f (t <= x)
I
x -= t;
break;
1
1
1
1
1
Now let's look at the trigonometric functions. Figure 7.15 shows the file
xsin.c that defines the function _Sin. It computes sin(x) if qoff is zero and
cos(x) if qoff is one. Using such a "quadrant offset" for cosine avoids the
loss of precision that occurs in adding π/2 to the argument instead. I
developed the polynomial approximations from truncated Taylor series by
"economizing" them using Chebychev polynomials. (If you don't know
what that means, don't worry.)
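As an illustration of how such a polynomial is used, the sketch below evaluates the sine approximation by Horner's rule. The coefficients are copied from the s table in the listing of xsin.c. The helper names poly_eval and sin_reduced are hypothetical, and the argument is assumed to be already reduced to [-π/4, π/4].

```c
#include <assert.h>

/* Horner evaluation; tab[0] is the highest-order coefficient. */
double poly_eval(double x, const double *tab, int n)
{
    double y = tab[0];
    int i;

    assert(0 <= n);
    for (i = 1; i <= n; ++i)
        y = y * x + tab[i];
    return y;
}

/* Sine coefficients copied from the s[] table in xsin.c. */
static const double s_tab[8] = {
    -0.000000000000764723,
     0.000000000160592578,
    -0.000000025052108383,
     0.000002755731921890,
    -0.000198412698412699,
     0.008333333333333372,
    -0.166666666666666667,
     1.0};

/* sin(g) for |g| <= pi/4, computed as g * P(g*g). */
double sin_reduced(double g)
{
    return g * poly_eval(g * g, s_tab, 7);
}
```

Because the polynomial is in g*g and is multiplied by g at the end, the result is an odd function of g, as the sine must be.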
Reducing the argument to the range [-π/4, π/4] must be done carefully.
It is easy enough to determine how many times π/2 should be subtracted
from the argument. That determines quad, the quadrant (centered on one
of the four axes) in which the angle lies. You need the low-order two bits
of quad + qoff to determine whether to compute the cosine or sine and
whether to negate the result. Note the way the signed quadrant is converted
to an unsigned value so that negative arguments get treated consistently on
all computer architectures.
What you'd like to do at this point is compute quad*π/2 to arbitrary
precision. You want to subtract this value from the argument and still have
full double precision after the most-significant bits cancel. Given the wide
range that floating-point values can assume, that's a tall order. It's also a
bit silly. As I discussed on page 135, the circular functions become
progressively grainier the larger the magnitude of the argument. Beyond some
magnitude, all values are indistinguishable from exact multiples of π/2.
Some people argue that this is an error condition, but the C Standard
doesn't say so. The circular functions must return some sensible value, and
report no error, for all finite argument values.
I chose to split the difference. Adapting the approach used by Cody and
Waite in several places, I represent π/2 to "one-and-a-half" times double
precision. The header "xmath.h" defines the macro HUGE_RAD as:
#define HUGE_RAD    3.14e30
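The two-part representation of π/2 and the unsigned quadrant conversion described above can be sketched as follows. The constants c1, c2, and twobypi are the ones that appear in the listing of xsin.c. The function name reduce_sketch and its argument limit are hypothetical, and the sketch skips the HUGE_RAD test entirely.

```c
#include <assert.h>

/* Constants as they appear in the xsin.c listing: c1 holds the leading
   bits of pi/2 exactly, c2 holds the remainder. */
static const double c1 = 3294198.0 / 2097152.0;
static const double c2 = 3.139164786504813217e-7;
static const double twobypi = 0.63661977236758134308;

/* Reduce x to [-pi/4, pi/4]; store the low two quadrant bits in *pquad.
   Sketch only: assumes |x| is moderate (far below HUGE_RAD). */
double reduce_sketch(double x, unsigned *pquad)
{
    double g = x * twobypi;
    long quad;

    assert(-1e6 < x && x < 1e6);
    quad = (long)(0 < g ? g + 0.5 : g - 0.5);   /* round to nearest */
    /* Converting to unsigned before masking gives the same low bits for
       negative quadrants on every architecture. */
    *pquad = (unsigned)((unsigned long)quad & 0x3);
    g = (double)quad;
    return (x - g * c1) - g * c2;
}
```

Since c1 has only 22 significant bits, the product g * c1 is exact for any quadrant count of moderate size, so the first subtraction cancels the leading bits without rounding error. The small correction g * c2 then supplies the extra half of the precision.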
Figure 7.15: xsin.c
/* _Sin function */
#include "xmath.h"

/* coefficients */
static const double c[8] = {
    -0.000000000011470879,
     0.000000002087712071,
    -0.000000275573192202,
     0.000024801587292937,
    -0.001388888888888893,
     0.041666666666667325,
    -0.500000000000000000,
     1.0};
static const double s[8] = {
    -0.000000000000764723,
     0.000000000160592578,
    -0.000000025052108383,
     0.000002755731921890,
    -0.000198412698412699,
     0.008333333333333372,
    -0.166666666666666667,
     1.0};
static const double c1 = {3294198.0 / 2097152.0};
static const double c2 = {3.139164786504813217e-7};
static const double twobypi = {0.63661977236758134308};
static const double twopi = {6.28318530717958647693};

double _Sin(double x, unsigned int qoff)
    {   /* compute sin(x) or cos(x) */
    switch (_Dtest(&x))
        {
    case NAN:
        errno = EDOM;
        return (x);
    case 0:
        return (qoff ? 1.0 : 0.0);
    case INF:
        errno = EDOM;
        return (_Nan._D);
    default:    /* finite */
        {   /* compute sin/cos */
        double g;
        long quad;

        if (x < -HUGE_RAD || HUGE_RAD < x)
            {   /* x huge, sauve qui peut */
            g = x / twopi;
            _Dint(&g, 0);
            x -= g * twopi;
            }
        g = x * twobypi;
        quad = (long)(0 < g ? g + 0.5 : g - 0.5);
        qoff += (unsigned long)quad & 0x3;
        g = (double)quad;
        g = (x - g * c1) - g * c2;
        if ((g < 0.0 ? -g : g) < _Rteps._D)
            {   /* sin(tiny)==tiny, cos(tiny)==1 */
            if (qoff & 0x1)
                g = 1.0;    /* cos(tiny) */
            }
        else if (qoff & 0x1)
            g = _Poly(g * g, c, 7);
        else
            g *= _Poly(g * g, s, 7);
        return (qoff & 0x2 ? -g : g);
        }
        }
    }
Figure 7.17: cos.c
/* cos function */
#include <math.h>

double (cos)(double x)
    {   /* compute cos */
    return (_Sin(x, 1));
    }
Figure 7.18: sin.c
/* sin function */
#include <math.h>

double (sin)(double x)
    {   /* compute sin */
    return (_Sin(x, 0));
    }
Figure 7.19: tan.c
/* tan function */
#include "xmath.h"

static const double p[3] = {
    -0.17861707342254426711e-4,
     0.34248878235890589960e-2,
    -0.13338350006421960681e+0};
static const double q[4] = {
     0.49819433993786512270e-6,
    -0.31181531907010027307e-3,
     0.25663832289440112864e-1,
    -0.46671683339755294240e+0};
static const double c1 = {3294198.0 / 2097152.0};
static const double c2 = {3.139164786504813217e-7};
static const double twobypi = {0.63661977236758134308};
static const double twopi = {6.28318530717958647693};

double tan(double x)
    {   /* compute tan(x) */
    double g, gd;
    long quad;

    switch (_Dtest(&x))
        {
    case NAN:
        errno = EDOM;
        return (x);
    case INF:
        errno = EDOM;
        return (_Nan._D);
    case 0:
        return (0.0);
    default:    /* finite */
        if (x < -HUGE_RAD || HUGE_RAD < x)
            {   /* x huge, sauve qui peut */
            g = x / twopi;
            _Dint(&g, 0);
            x -= g * twopi;
            }
        g = x * twobypi;
        quad = (long)(0 < g ? g + 0.5 : g - 0.5);
        g = (double)quad;
        g = (x - g * c1) - g * c2;
        gd = 1.0;
        if (_Rteps._D < (g < 0.0 ? -g : g))
            {   /* g*g worth computing */
            double y = g * g;

            gd += (((q[0] * y + q[1]) * y + q[2]) * y + q[3]) * y;
            g += ((p[0] * y + p[1]) * y + p[2]) * y * g;
            }
        return ((unsigned int)quad & 0x1 ? -gd / g : g / gd);
        }
    }
Figure 7.20:
xasin. c
Part 1
/* -Asin f u n c t i o n */
kinclude "xmath.h"
/* c o e f f i c i e n t s , a f t e r Cody & Waite, Chapter 10
s t a t i c c o n s t double p [5] = {
-0.6967457344735064641le+O,
O.l0152522233806463645e+2,
*/
-0.39688862997504877339e+2,
0.57208227877891731407e+2,
-0.27368494524164255994e+2);
s t a t i c c o n s t double q [ 6 ] = {
0.10000000000000000000e+l,
-0.2382385915367023883Oe+2,
0.15095270841030604719e+3,
-0.38186303361750149284e+3,
0.41714430248260412556e+3,
-O.l6421096714498560795e+3);
/*
compute a s i n ( x ) o r a c o s (x)
*,
double g, y;
c o n s t s h o r t e r r x = -Dtest(&x);
i f (0
< errx)
/*
I
e r r n o = EDOM;
r e t u r n ( e r r x = EIAN ? x : -Nan.-D)
INF, NaN * r
1
y = -x, i d x I = 2;
else
y = x;
i f ( y < -Rteps .-D)
else i f (y < 0 - 5 )
{
g = y * y;
Y += Y * g
/*
1
else i f (y < 1.O)
I
/* f i n d 2*asin ( s q r t ( (1-x) /2) ) * i
i d x I = 4;
y) / 2.0;
/* NOT * 0.5! */
g = (1.0
Y = sq* (g) ;
y += y;
Y += Y * g * -Poly (g, p. 4) / -Poly (g, q, 5) ;
1
else i f (y = 1.0)
i d x I = 4, y = 0.0;
else
I
e r r n o = EDOM;
r e t u r n (-Nan .-D) ;
/*
1.0
*/
155
Continuing
xasin c
Part 2
1
switch (idx)
f
d e f a u l t:
case 0:
case 5:
return
case 1:
case 4:
return
case 2:
return
case 3:
return
case 6:
return
case 7 :
return
/*
/* f l i p and f o l d
shouldn't happen
/* asin, [O, 1/2)
/* acos, (1/2, 11
*/
*/
*/
*/
/*
/*
*/
*/
(y);
( (piby4
y)
piby4) ;
/*
asin, [-1/2,
0)
*/
/*
acos,
[-1/2, 0)
*/
/*
*/
/*
acos,
*/
(-y);
( (piby4
( (-piby4
( (piby2
y)
y)
y)
piby4) ;
piby4);
[-I,
-1/2)
piby2);
1
1
acos c
#include Unath.h>
1 double
(acos) (double x)
I
return (-Asin (x, 1)) ;
#include Unath.h>
double (asin) (double x)
/*
compute asin(x)
*/
As you can see, the function atan offers only a subset of the possibilities
inherent in atan2. That's because atan(y) is equivalent to atan2(y, 1.0).
By the way, the header "xmath.h" defines the macro DSIGN as:
#define DSIGN(x)    (((unsigned short *)&(x))[_D0] & _DSIGN)
It lets you inspect the sign bit of a special code, such as Inf, that may not
test well in a normal expression. I use DSIGN to test the sign bit whenever
such a special code can occur.
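A portable way to look at the sign bit, without the word-offset macros, is sketched below. It assumes a 64-bit IEEE 754 double. The name sign_bit is hypothetical, and negative zero is used here only as a convenient value whose sign an ordinary comparison cannot see.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if the sign bit of x is set (assumes IEEE 754 binary64). */
int sign_bit(double x)
{
    uint64_t bits;

    assert(sizeof x == sizeof bits);
    memcpy(&bits, &x, sizeof bits);
    return (int)(bits >> 63);
}
```

With this helper, sign_bit(-0.0) is 1 even though the expression -0.0 < 0.0 is false, which is the kind of case where a normal comparison does not test well.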
Figure 7.23: atan.c
/* atan function */
#include "xmath.h"

double (atan)(double x)
    {   /* compute atan(x) */
    unsigned short hex;
    static const double piby2 = {1.57079632679489661923};

    switch (_Dtest(&x))
        {   /* test for special codes */
    case NAN:
        errno = EDOM;
        return (x);
    case INF:
        return (DSIGN(x) ? -piby2 : piby2);
    case 0:
        return (0.0);
    default:    /* finite */
        if (x < 0.0)
            x = -x, hex = 0x8;
        else
            hex = 0x0;
        if (1.0 < x)
            x = 1.0 / x, hex ^= 0x2;
        return (_Atan(x, hex));
        }
    }
atan2 first checks its arguments for a variety of special codes. It accepts
any pair that define a direction for a radius vector drawn from the origin.
(The treatment of atan2(0, 0) is controversial. I chose to return zero, based
on the advice of experts.) The function then determines the two arguments
to _Atan. z is the tangent argument reduced to the interval [0, 1]. hex divides
the circle into sixteen equal slices:
If hex & 0x8, negate the final result.
If hex & 0x4, add the arctangent of z to π/2.
If hex & 0x2, subtract the arctangent of z from π/2.
If hex & 0x1, add π/6 to the arctangent of z.
Only _Atan sets the least-significant bit, to indicate that z was initially
greater than 2 - 3^(1/2) (about 0.268). It replaces z with:
(z * sqrt(3) - 1) / (sqrt(3) + z)
All of these machinations derive from various trigonometric identities
exploited to reduce the range required for approximation.
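The replacement for z follows from the subtraction formula for the tangent, with tan(π/6) = 1/sqrt(3). A short derivation, offered as a check and not as part of the original text:

```latex
\tan\!\left(\arctan z - \frac{\pi}{6}\right)
  = \frac{z - \tan(\pi/6)}{1 + z\,\tan(\pi/6)}
  = \frac{z - 1/\sqrt{3}}{1 + z/\sqrt{3}}
  = \frac{z\sqrt{3} - 1}{\sqrt{3} + z},
\qquad\text{so}\qquad
\arctan z = \frac{\pi}{6} + \arctan\frac{z\sqrt{3} - 1}{\sqrt{3} + z}.
```

Since 2 - sqrt(3) = tan(π/12), any z in the interval (2 - sqrt(3), 1] maps to a value whose magnitude is at most 2 - sqrt(3), which is why the approximation only has to cover that small range.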
Figure 7.25 shows the file xatan.c that defines the function _Atan. It
assumes that it is called only by atan or atan2. Hence, it checks only
whether its argument x needs to be reduced below 2 - 3^(1/2). If the magnitude
of the reduced argument is less than _Rteps._D, that serves as the
approximation to the arctangent. Otherwise, the function computes a ratio of
polynomials taken from Cody and Waite. The function adds an element
from the table a to take care of all the adding and subtracting of constants
described above.
Figure 7.24: atan2.c
/* atan2 function */
#include "xmath.h"

double (atan2)(double y, double x)
    {   /* compute atan(y/x) */
    double z;
    const short errx = _Dtest(&x);
    const short erry = _Dtest(&y);
    unsigned short hex;

    if (errx <= 0 && erry <= 0)
        {   /* x & y both finite or 0 */
        if (y < 0.0)
            y = -y, hex = 0x8;
        else
            hex = 0x0;
        if (x < 0.0)
            x = -x, hex ^= 0x6;
        if (x < y)
            z = x / y, hex ^= 0x2;
        else if (0.0 < x)
            z = y / x;
        else
            return (0.0);
        }
    else if (errx == NAN || erry == NAN)
        {   /* return one of the NaNs */
        errno = EDOM;
        return (errx == NAN ? x : y);
        }
    else
        {   /* at least one INF */
        z = errx == erry ? 1.0 : 0.0;
        hex = DSIGN(y) ? 0x8 : 0x0;
        if (DSIGN(x))
            hex ^= 0x6;
        if (erry == INF)
            hex ^= 0x2;
        }
    return (_Atan(z, hex));
    }
The final group of functions are those that compute exponentials,
logarithms, and special powers. Figure 7.26 shows the file sqrt.c. The function
sqrt computes the square root of its argument x, or x^(1/2). It partitions a
positive, finite x, using _Dunscale, into an exponent e and a fraction f. The
argument value is f*2^e, where f is in the interval [0.5, 1.0). The square root
is then f^(1/2)*2^(e/2).
The function first computes a quadratic least-squares fit to f^(1/2). It then
applies Newton's Method - divide and average - three times to obtain
the needed precision. Note how the function combines the last two
iterations of the algorithm to improve performance slightly.
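The fit and the three Newton steps can be sketched for the fraction alone. The coefficients below are read from the listing of sqrt.c, which is partly garbled in this copy, so treat them as a reconstruction. The name sqrt_fraction is hypothetical, and the exponent handling done with _Dunscale and _Dscale is left out.

```c
#include <assert.h>

/* Square root of f for 0.5 <= f < 1, following the steps described above. */
double sqrt_fraction(double f)
{
    double y;

    assert(0.5 <= f && f < 1.0);
    y = (-0.1984742 * f + 0.8804894) * f + 0.3176687;  /* quadratic fit */
    y = 0.5 * (y + f / y);      /* first Newton step: divide and average */
    y += f / y;                 /* second step, left as twice the root */
    return 0.25 * y + f / y;    /* third step absorbs the factor of two */
}
```

The last two lines show the combined iterations: the second step skips its multiplication by 0.5, and the third step compensates by using 0.25 * y in place of 0.5 * (y / 2). That saves one multiplication.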
Figure 7.25:
xatan c
I* -Atan f u n c t i o n */
!include "wrath-h"
I* c o e f f i c i e n t s , a f t e r Cody 6 W a i t e , Chapter 11
s t a t i c c o n s t double a [ 8 ] =
*/
0.0,
0.52359877559829887308,
1.57079632679489661923,
1.04719755119659774615,
1.57079632679489661923,
2.09439510239319549231,
3.14159265358979323846,
2.61799387799149436538);
static c o n s t double p[4] = {
-0.83758299368150059274e+O,
-0.84946240351320683534e+l,
-0.20505855195861651981e+2,
-O.l3688768894191926929e+2);
static c o n s t double q[5] = {
0.10000000000000000000e+l,
0.15024001160028576121e+2,
0.59578436142597344465e+2,
0.86157349597130242515e+2,
0.41066306682575781263e+2);
s t a t i c c o n s t double f o l d = {0.26794919243112270647);
static c o n s t double s q r t 3 = {1.73205080756887729353);
s t a t i c c o n s t double s q r t 3 m l = {0.73205080756887729353);
h u b l e -Atan(double x, unsigned s h o r t i d x )
f
/* compute a t a n (x), 0 <= x <= 1 . 0
i f ( f o l d < x)
1
/* 2 - s q r t (3) < x
x = ( ( ( s q r t 3 m l * x - 0.5) - 0.5) + x ) / ( s q r t 3
x);
i d x I= 0x1;
*/
*/
1
i f (X < --Rteps.-D
I I -Rteps.-D
f
c o n s t double g = x
x += x
< x)
/*
*/
x;
* g / -Poly(g, q. 4)
(((pro1 * g + p [ l l ) * g
+ pW1) * g + p[31);
1
i f ( i d x 6 0x2)
X = -x;
x += a [ i d x 6 071;
r e t u r n ( i d x 6 0x8 ? -x : x ) ;
<math.h>
Figure 7.26: sqrt.c
/* sqrt function */
#include <limits.h>
#include "xmath.h"

double (sqrt)(double x)
    {   /* compute sqrt(x) */
    short xexp;

    switch (_Dunscale(&xexp, &x))
        {   /* test for special codes */
    case NAN:
        errno = EDOM;
        return (x);
    case INF:
        if (DSIGN(x))
            {
            errno = EDOM;
            return (_Nan._D);
            }
        else
            {
            errno = ERANGE;
            return (_Inf._D);
            }
    case 0:
        return (0.0);
    default:    /* finite */
        if (x < 0.0)
            {   /* sqrt undefined for reals */
            errno = EDOM;
            return (_Nan._D);
            }
        {
        double y;
        static const double sqrt2 = {1.41421356237309505};

        y = (-0.1984742 * x + 0.8804894) * x + 0.3176687;
        y = 0.5 * (y + x / y);
        y += x / y;
        x = 0.25 * y + x / y;
        if ((unsigned int)xexp & 1)
            x *= sqrt2, --xexp;
        _Dscale(&x, xexp / 2);
        return (x);
        }
        }
    }
Figure 7.27:
xexp.c
I* -Exp function */
Kinclude "wrath. h"
I* c o e f f i c i e n t s , a f t e r Cody 6 Waite, Chapter 6
s t a t i c w n s t double p[3] = {
*/
0.31555192765684646356e-4,
0.75753180159422776666e-2,
0.25000000000000000000e+0);
s t a t i c w n s t double q[4] = {
0.75104028399870046114e-6,
0.63121894374398503557e-3,
0.56817302698551221787e-1,
0.50000000000000000000e+0);
*px, s h o r t e o f f )
/* compute e A (*px) *2"eoff, x f i n i t e
i n t neg;
(*px < 0)
*px = -*px, neg = 1 ;
else
neg = 0;
i f (hugexp < *px)
if
/*
-Inf .-D;
c e r t a i n underflow o r overflow
INF) ;
1
else
/*
double g = *px
nvln2;
s h o r t xexp = ( s h o r t ) (g + 0.5) ;
g = (double) xexp;
g = (*px - g * c l ) - g * c2;
i f (--Rteps.-D
< g 66 g < -Rteps.-D)
*px = 1.0;
else
f
const double y = g
g
*=
(p[OI
/*
g;
Y + ~ [ l l *) y + p[21;
* P X = 0.5 + g /
+ q[31
g);
++xexp;
( ( W O I * Y + q [ l l ) * Y + q[21) * Y
1
i f (neg)
*px = 1 . 0 / *px, xexp = -xexp;
r e t u r n (-Dscale (px, eof f + xexp) ) ;
1
1
Figure 7.27 shows the file xexp.c that defines the function _Exp. Several
functions need to compute the exponential of a finite argument, or e^x. A
number of these actually need to compute e^x/2. In this case, the argument
eoff is -1. Overflow occurs only if e^x/2 overflows.
The header "xmath.h" defines the macro HUGE_EXP as the carefully
contrived value:
#define HUGE_EXP    (int)(_DMAX * 900L / 1000)
This value is large enough to cause certain overflow on all known
floating-point representations. It is also small enough not to cause integer
overflow in the computations that follow. Thus, HUGE_EXP offers a coarse
filter for truly silly arguments to _Exp.
The trick here is to divide x by ln(2) and raise 2 to that power. You can
pick off the integer part and compute 2^g, for g in the interval [-0.5, 0.5]. You
add in the integer part (plus eoff) at the end with _Dscale. That function
also handles any overflow or underflow safely.
Reducing the argument this way has many of the same problems as
reducing the arguments to _Sin and tan, described earlier. The one advantage
here is that you can choose extended-precision constants c1 and c2 to
represent 1/ln(2) adequately for all reasonable argument values.
As usual, the reduced argument is compared against _Rteps._D to avoid
underflow and unnecessary computation. The ratio of polynomials is taken
from Cody and Waite. The approximation actually computes 2^g/2, thus the
correction to xexp.
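The whole scheme can be sketched in a few lines. The polynomial coefficients are copied from the listing of xexp.c. The two-part constant (named ln2_hi and ln2_lo here) is an assumption on my part: its values are not legible in the listing, so I use a split of ln(2) in the style of Cody and Waite. The function name exp_sketch is hypothetical, and a plain loop stands in for _Dscale.

```c
#include <assert.h>

/* Rational approximation coefficients copied from the xexp.c listing
   (after Cody and Waite); highest-order coefficient first. */
static const double p[3] = {
    0.31555192765684646356e-4,
    0.75753180159422776666e-2,
    0.25000000000000000000e+0};
static const double q[4] = {
    0.75104028399870046114e-6,
    0.63121894374398503557e-3,
    0.56817302698551221787e-1,
    0.50000000000000000000e+0};

/* Assumed two-part split of ln(2), supplied for illustration only. */
static const double ln2_hi = 22713.0 / 32768.0;
static const double ln2_lo = 1.428606820309417232e-6;
static const double invln2 = 1.4426950408889634074;

/* e^x for moderate x (sketch only: no special codes, no overflow checks). */
double exp_sketch(double x)
{
    double g = x * invln2;
    double y, gp, r;
    int n, i;

    assert(-700.0 < x && x < 700.0);
    n = (int)(0 < g ? g + 0.5 : g - 0.5);   /* nearest integer to x/ln(2) */
    g = (x - n * ln2_hi) - n * ln2_lo;      /* reduced argument */
    y = g * g;
    gp = g * ((p[0] * y + p[1]) * y + p[2]);
    r = 0.5 + gp / ((((q[0] * y + q[1]) * y + q[2]) * y + q[3]) - gp);
    ++n;                                    /* r approximates e^g / 2 */
    for (i = n; 0 < i; --i)
        r *= 2.0;                           /* stand-in for _Dscale */
    for (i = n; i < 0; ++i)
        r *= 0.5;
    return r;
}
```

The increment of n plays the same role as the correction to xexp mentioned above: the rational form yields half the exponential, so one extra power of two is folded into the final scaling.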
Figure 7.28 shows the file exp.c. The function exp tests its argument for
special codes before calling _Exp with a finite argument. It then tests the
return value for a zero or Inf result, to report a range error.
Figure 7.29 shows the file cosh.c. The function cosh also has little else to
do besides test its arguments for special codes and call _Exp. That's because
the value of the function depends on exp(x)/2 whichever way it's computed:
If x < _Xbig._D then the value is (exp(x) + exp(-x))/2. The actual form
eliminates the second function call and some arithmetic.
Otherwise, the value is exp(x)/2, obtained directly from _Exp.
cosh must also report a range error if _Exp(x, -1) overflows.
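The form that eliminates the second function call can be checked with one line of algebra. Writing w for the value e^x/2 that _Exp delivers when eoff is -1 (the symbol w is mine, not the book's):

```latex
\cosh x = \frac{e^{x} + e^{-x}}{2}
        = \frac{e^{x}}{2} + \frac{1}{4}\cdot\frac{2}{e^{x}}
        = w + \frac{0.25}{w},
\qquad w = \frac{e^{x}}{2}.
```

This matches the statement x += 0.25 / x in the listing of cosh.c, where x already holds w. Once w reaches _Xbig._D, the second term is too small to affect the sum and can be skipped.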
Figure 7.30 shows the file sinh.c. The function sinh is also best
computed in terms of _Exp over much of its range. But it is an odd function,
unlike cosh. When the magnitude of its argument x is less than 1.0, the
conventional definition (exp(x) - exp(-x))/2 loses precision. Over this
interval, it is better to approximate the function with a ratio of polynomials,
again courtesy of Cody and Waite. As usual, if the magnitude of x is less
than _Rteps._D, the argument itself is an adequate approximation to the
value of the function.
Figure 7.28: exp.c
/* exp function */
#include "xmath.h"

double (exp)(double x)
    {
    switch (_Dtest(&x))
        {   /* test for special codes */
    case NAN:
        errno = EDOM;
        return (x);
    case INF:
        errno = ERANGE;
        return (DSIGN(x) ? 0.0 : _Inf._D);
    case 0:
        return (1.0);
    default:    /* finite */
        if (0 <= _Exp(&x, 0))
            errno = ERANGE;
        return (x);
        }
    }
Figure 7.29: cosh.c
/* cosh function */
#include "xmath.h"

double (cosh)(double x)
    {   /* compute cosh(x) */
    switch (_Dtest(&x))
        {   /* test for special codes */
    case NAN:
        errno = EDOM;
        return (x);
    case INF:
        errno = ERANGE;
        return (_Inf._D);
    case 0:
        return (1.0);
    default:    /* finite */
        if (x < 0.0)
            x = -x;
        if (0 <= _Exp(&x, -1))
            errno = ERANGE; /* x large */
        else if (x < _Xbig._D)
            x += 0.25 / x;
        return (x);
        }
    }
.th.h>
Figure 7.30: sinh.c
/* sinh function */
#include "xmath.h"

/* coefficients */
static const double p[] = {
    -0.78966127417357099479e+0,
    -0.16375798202630751372e+3,
    -0.11563521196851768270e+5,
    -0.35181283430177117881e+6};
static const double q[] = {
    1.0,
    -0.27773523119650701667e+3,
    0.36162723109421836460e+5,
    -0.21108770058106271242e+7};

double (sinh)(double x)
    {   /* compute sinh(x) */
    switch (_Dtest(&x))
        {   /* test for special codes */
    case NAN:
        errno = EDOM;
        return (x);
    case INF:
        errno = ERANGE;
        return (DSIGN(x) ? -_Inf._D : _Inf._D);
    case 0:
        return (0.0);
    default:    /* finite */
         {  /* compute sinh(finite) */
        short neg;

        if (x < 0.0)
            x = -x, neg = 1;
        else
            neg = 0;
        if (x < _Rteps._D)
            ;   /* x tiny */
        else if (x < 1.0)
            {   /* |x| < 1 */
            const double y = x * x;

            x += x * y
                * (((p[0] * y + p[1]) * y + p[2]) * y + p[3])
                / (((q[0] * y + q[1]) * y + q[2]) * y + q[3]);
            }
        else if (0 <= _Exp(&x, -1))
            errno = ERANGE;     /* x large */
        else if (x < _Xbig._D)
            x -= 0.25 / x;
        return (neg ? -x : x);
         }
        }
    }
Figure 7.31 shows the file tanh.c. The function tanh is similar in many
ways to sinh. One difference is that it cannot overflow. The function
approaches ±1.0 as the magnitude of the argument x increases. (The function
could compare x to _Xbig._D, as do cosh and sinh. The overflow code
returned by _Exp serves as adequate notice, however.) The other difference is
where the function chooses to change to a ratio-of-polynomials approximation.
The one used here, again from Cody and Waite, is accurate for
magnitudes of x less than ln(3)/2 (about 0.549).
Figure 7.32 shows the file log.c. It computes log(x) by calling _Log(x, 0).
Naturally, the header <math.h> provides a masking macro for this
function. This may seem silly, but it is the safe way to provide a masking
macro for log10 (described below) as well.
Figure 7.33 shows the file xlog.c that defines the function _Log. It
computes the natural logarithm using tricks reminiscent of those used in
_Exp, only in reverse. The idea is to pick off the binary exponent e using
_Dunscale, leaving the fraction f. The argument value is f * 2^e, where f is in
the interval [0.5, 1.0). You can compute the base-2 logarithm of these
components as log2(f) + e. You get the final result by multiplying this sum
by ln(2).
That approach requires a few refinements. The approximation from
Cody and Waite wants f in the interval [0.5^(1/2), 2.0^(1/2)]. If f (actually x) is too
small, you have to double it and correct e (xexp). You also have to introduce
the new variable z = (f-1)/(f+1). It is better to combine both operations and
eliminate some steps that can cost precision. The approximation is yet
another ratio of polynomials. Note that it actually computes the natural
logarithm, so it is only necessary to scale xexp before forming the sum.
You have to form the sum carefully, at least for logarithms near zero. This
is the other face of the argument reduction problem in _Exp. Both functions
use the same extended-precision representation of ln(2). Here, the smaller
part is combined before the larger, to involve as many low-order bits of the
conversion constant as possible in the final result.
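The careful sum can be isolated in a few lines. The constants c1 and c2 below are the ones that appear in xlog.c; the function name add_scaled_ln2 is hypothetical and exists only for this sketch.

```c
/* Hypothetical helper: return z + xexp * ln(2), with ln(2) split into a
 * short exact part c1 and a small correction c2 (values from xlog.c). */
double add_scaled_ln2(double z, int xexp)
{
    static const double c1 = 22713.0 / 32768.0;         /* 15 significant bits */
    static const double c2 = 1.428606820309417232e-6;   /* ln(2) - c1 */
    const double xn = (double)xexp;

    /* smaller part first, then the larger part */
    return (xn * c2 + z) + xn * c1;
}
```

Because c1 has so few significant bits, the product xn * c1 is exact for any double exponent, so the only rounding occurs in the small term and in the final addition.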
Figure 7.34 shows the file log10.c. It computes the base-10 logarithm by
calling _Log and multiplying the result by log10(e). The multiplication takes
place within _Log only for a finite result.
Figure 7.35 shows the file pow.c. The function pow, which raises x to the
y power, is easily the most complex of all the math functions. It must deal
with a broad assortment of special cases. It must also endeavor to develop
a precise result for a broad range of argument values.
By now you should be aware of the dangers in computing exp(y *
log(x)). Put simply, the logarithm displaces fraction bits to represent the
exponent of x as an integer part. Multiplying by y can make matters even
worse. The exponential turns integer bits back into exponent bits, but the
damage is already done. Unless you can perform the intermediate calculations
to extended precision, you have to lose bits along the way. This
Figure 7.31: tanh.c
/* tanh function */
#include "xmath.h"

/* coefficients */
static const double p[] = {
    -0.96437492777225469787e+0,
    -0.99225929672236083313e+2,
    -0.16134119023996228053e+4};
.....

double (tanh)(double x)
    {   /* compute tanh(x) */
    switch (_Dtest(&x))
        {   /* test for special codes */
    case NAN:
        errno = EDOM;
        return (x);
    case INF:
        return (DSIGN(x) ? -1.0 : 1.0);
    case 0:
        return (0.0);
    default:    /* finite */
         {  /* compute tanh(finite) */
        short neg;

        if (x < 0.0)
            x = -x, neg = 1;
        else
            neg = 0;
        if (x < _Rteps._D)
            ;   /* x tiny */
        else if (x < ln3by2)
            {
            const double g = x * x;

            x += x * g * ((p[0] * g + p[1]) * g + p[2])
                / (((q[0] * g + q[1]) * g + q[2]) * g + q[3]);
            }
        else if (_Exp(&x, 0) < 0)
            x = 1.0 - 2.0 / (x * x + 1.0);
        else
            x = 1.0;    /* x large */
        return (neg ? -x : x);
         }
        }
    }
Figure 7.32: log.c
/* log function */
#include <math.h>

double (log)(double x)
    {   /* compute ln(x) */
    return (_Log(x, 0));
    }
Figure 7.33: xlog.c Part 1
/* _Log function */
#include "xmath.h"

/* coefficients */
static const double p[] = {
    -0.78956112887491257267e+0,
    0.16383943563021534222e+2,
    -0.64124943423745581147e+2};
.....
static const double c1 = {22713.0 / 32768.0};
static const double c2 = {1.428606820309417232e-6};
static const double loge = 0.43429448190325182765;
static const double rthalf = {0.70710678118654752440};

double _Log(double x, int decflag)
    {   /* compute ln(x) */
    short xexp;

    switch (_Dunscale(&xexp, &x))
        {   /* test for special codes */
    case NAN:
        errno = EDOM;
        return (x);
    case INF:
        if (DSIGN(x))
            {
            errno = EDOM;
            return (_Nan._D);
            }
        else
            {   /* INF */
            errno = ERANGE;
            return (_Inf._D);
            }
    case 0:
        errno = ERANGE;
        return (-_Inf._D);
    default:    /* finite */
        if (x < 0.0)
            {   /* ln(negative) undefined */
            errno = EDOM;
            return (_Nan._D);
            }
        else
Continuing xlog.c Part 2
            {
            double z = x - 0.5;
            double w;

            if (rthalf < x)
                z = (z - 0.5) / (x * 0.5 + 0.5);
            else
                {   /* x <= sqrt(1/2) */
                --xexp;
                z /= (z * 0.5 + 0.5);
                }
            w = z * z;
            z += z * w * ((p[0] * w + p[1]) * w + p[2])
                / (((w + q[0]) * w + q[1]) * w + q[2]);
            if (xexp != 0)
                {   /* form z += ln2 * xexp safely */
                const double xn = (double)xexp;

                z = (xn * c2 + z) + xn * c1;
                }
            return (decflag ? loge * z : z);
            }
        }
    }
Figure 7.34: log10.c
/* log10 function */
#include <math.h>

double (log10)(double x)
    {   /* compute log10(x) */
    return (_Log(x, 1));
    }
Figure 7.35: pow.c Part 1
/* pow function */
#include "xmath.h"

double (pow)(double x, double y)
    {   /* compute x^y */
    double yi = y;
    double yx, z;
    short n, xexp, zexp;
    short neg = 0;
    short errx = _Dunscale(&xexp, &x);
    const short erry = _Dint(&yi, 0);
    static const short shuge = {HUGE_EXP};
    static const double dhuge = {(double)HUGE_EXP};
    static const double ln2 = {0.69314718055994530942};
    static const double rthalf = {0.70710678118654752440};

    if (0 <= errx || 0 < erry)
        {
        .....
        else    /* 0^finite */
            errx = y < 0.0 ? INF : 0;
        if (errx == 0)
            return (0.0);
        else if (errx == INF)
            {   /* return -INF or INF */
            errno = ERANGE;
            return (neg ? -_Inf._D : _Inf._D);
            }
        else
            {   /* return NaN */
            errno = EDOM;
            return (z);
            }
        }
    if (y == 0.0)
        return (1.0);
    if (0.0 < x)
        neg = 0;
Continuing pow.c Part 2
    else if (erry < 0)
        {
        errno = EDOM;
        return (_Nan._D);
        }
    else
        x = -x, neg = _Dint(&yi, -1) < 0;
    if (x < rthalf)
        x *= 2.0, --xexp;   /* sqrt(.5) <= x <= sqrt(2) */
    n = 0, yx = 0.0;
    if (y <= -dhuge)
        zexp = xexp < 0 ? shuge : xexp == 0 ? 0 : -shuge;
    else if (dhuge <= y)
        zexp = xexp < 0 ? -shuge : xexp == 0 ? 0 : shuge;
    else
        {   /* y*log2(x) may be reasonable */
        double dexp = (double)xexp;
        long zl = (long)(yx = y * dexp);

        if (zl != 0)
            {   /* form yx = y*xexp-zl carefully */
            yx = y, _Dint(&yx, 16);
            yx = (yx * dexp - (double)zl) + (y - yx) * dexp;
            }
        yx *= ln2;
        zexp = zl <= -shuge ? -shuge : zl < shuge ? zl : shuge;
        if ((n = (short)y) < -SAFE_EXP || SAFE_EXP < n)
            n = 0;
        }
    z = 1.0;
    if (x != 1.0)
        {   /* z *= xfrac^n */
        if ((yi = y - (double)n) != 0.0)
            yx += log(x) * yi;
        if (n < 0)
            n = -n;
        for (yi = x; ; yi *= yi)
            {   /* scale by x^2^n */
            if (n & 1)
                z *= yi;
            if ((n >>= 1) == 0)
                break;
            }
        if (y < 0.0)
            z = 1.0 / z;
        }
    if (yx != 0.0)      /* z *= 2^yx */
        z = _Exp(&yx, 0) < 0 ? z * yx : yx;
    if (0 <= _Dscale(&z, zexp))     /* z *= 2^zexp */
        errno = ERANGE;     /* underflow or overflow */
    return (neg ? -z : z);
    }
The second half of the function computes x^y for finite values of x and y.
It begins by rewriting x as f * 2^e, where f is in the interval [0.5^(1/2), 2.0^(1/2)]. If
N is the magnitude of the largest representable double exponent, you know
that you can raise f to this power with no fear of overflow. The magnitude
of the resulting exponent cannot exceed N/2. The header "xmath.h" defines
the macro SAFE_EXP as:
#define SAFE_EXP (_DMAX>>1)
I grouped the middle two terms with malice aforethought. That reduces
the problem to forming the product of three terms:
• f^n is a loop that multiplies f by itself |n| times. If n is negative, the result
is divided into one. So long as |n| is less than SAFE_EXP, the result cannot
overflow or underflow, for the reasons given above.
• (f^(y-n) * 2^g) can be evaluated as the exponential of (y-n)*ln(f) + g*ln(2).
Both terms in the sum are typically small, so no serious loss of precision
should result in the addition or the exponentiation. An exception is
when |n| would exceed SAFE_EXP. In this case, the function sets n (also
known as n in the code) to zero and throws precision to the winds. The
sum cannot overflow, no matter how big y (yi) happens to be. If the
exponential doesn't overflow, then the final result is probably dominated
by this term anyway.
• 2^n is a simple call to _Dscale.
Much of the complexity of this computation lies in avoiding overflows
and underflows. The remainder lies in safely partitioning e*y into the sum
of n and g. Note the use of _Dint yet another way here. It lets you preserve
an extra 16 bits of precision in y, using yx to extend its precision. That offsets
the loss of up to that much precision during the partitioning. The largest
floating-point exponents supported by this implementation are assumed
to have no more than 14 magnitude bits. The partitioning should thus be
safe over the entire range of representable values.
For completeness, I show two functions that are not used by the other
functions declared in <math.h>. Functions declared in the other standard
headers need them, but these two functions need "xmath.h". It seemed
wisest to park the two functions here.
Figure 7.37 shows the file xdtento.c that defines the function _Dtento.
It multiplies the double value x by ten raised to the power n. It is careful to
avoid floating-point overflow or underflow in the process. Note the use of
_Dunscale and _Dscale in the internal function dmul. Any potential overflow
or underflow occurs in _Dscale, which handles it safely. Function
_Dtento assumes that the argument x is zero or finite.
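The core idea, building a power of ten from a short table of powers 10^(2^i), can be sketched without the overflow machinery. This is an illustration only: tento_sketch is a hypothetical name, the sketch assumes an IEEE 754 8-byte double and a result that stays in range, and it divides by the factor when n is negative rather than multiplying by a reciprocal.

```c
#include <stddef.h>

/* Hypothetical sketch: compute x * 10^n from a table of 10^(2^i).
 * Assumes |n| <= 308 and that neither the factor nor the result overflows. */
double tento_sketch(double x, int n)
{
    static const double pows[] = {
        1e1, 1e2, 1e4, 1e8, 1e16, 1e32, 1e64, 1e128, 1e256};
    unsigned int nu = n < 0 ? 0u - (unsigned int)n : (unsigned int)n;
    double factor = 1.0;
    size_t i;

    /* each set bit i of |n| contributes a factor of 10^(2^i) */
    for (i = 0; nu != 0 && i < sizeof pows / sizeof pows[0]; nu >>= 1, ++i)
        if (nu & 1)
            factor *= pows[i];
    return n < 0 ? x / factor : x * factor;
}
```

The table keeps the number of multiplications logarithmic in |n|. What the sketch leaves out is exactly what makes the real function careful: separating the exponent before multiplying so that any overflow or underflow is detected in one place.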
Figure 7.36 shows the file xldunsca.c. It defines the function
_Ldunscale that does the same job for long double arguments that _Dunscale does for
double arguments. In fact, if those two floating-point types have the same
representation, it does exactly the same job. Only if _DLONG is nonzero does
_Ldunscale handle the 10-byte IEEE 754 extended-precision format.
Figure 7.38 shows the file xmath.h. By now, you should have been
introduced to all its mysteries. I show it in its entirety here also for
completeness.
Figure 7.40 shows the file tmath2.c. It tests all the trigonometric functions
at angles that are various multiples of π/4. These are often critical angles
for detecting loss of precision or errors in determining the sign of the result.
If all tests pass, the program displays the message:
SUCCESS testing <math.h>, part 2
Figure 7.36:
x1dunsca.c
Part 1
* Ldunscale function
include "xmath.h"
--
i f -DLONG
define -LMASK
define -LMAX
define -LSIGN
i f -D0=3
define -LO
define -L1
define -L2
define -L3
define -L4
:else
:define -LO
:define -L1
:define -L2
define -L3
:define -L4
:endif
*/
/*
/*
l i t t l e - e n d i a n order */
/*
big-endian order */
1
f o r (; pa[-Ll]
/*
f
pa[-Ll]
ps [ 4 2 ]
ps [-L3]
ps [ 4 4 ]
= ps[-Ll]
= ps [-LP]
= ps [-L3]
<<= 1;
s h i f t l e f t by 1 * I
1
return (xchar);
1
~ h o r t-Ldunscale (short *pex, long double *px)
f
/* separate *px t o lfracl < 1/2 and 2A*pex * I
unsigned short *ps = (unsigned short *)px;
short xchar = ps [-LO] & -LMASK;
i f (xchar = -LMAX)
f
*pex = 0;
return (ps[-Ll] & Ox7fff I I ps [-L2]
I I ps[-L31 I I ps[-L41 ? NAN : INF);
/*
NaN o r INF
*I
Continuing
x1dunsca.c
/*
Part 2
*pex = 0;
r e t u r n (0);
zero *I
1
else
/*
xchar += dnorm(ps) ;
p s [-LO] = p s [-LO] & -LSIGN
*pax = xchar - IBIAS;
return
f i n i t e , reduce t o [1/2, 1) *I
I -IBIAS;
(FINITEIT
1
1
slse
/* long double same a s double * /
t o r t -Ldunscale ( s h o r t *pex, long double *px)
f
/* s e p a r a t e *px t o l f r a c l < 1/2 and 2A*pex * /
unsigned s h o r t *ps = (unsigned s h o r t *)px;
s h o r t xchar = @s[-DO] & -DMASK) >> -DOFF;
/*
f i n i t e , reduce t o [1/2, 1) */
I -DBIAS << -DOFF;
1
else
/*
zero */
*pex = 0;
return (0);
1
1
mdif
Figure 7.41 shows the file tmath3.c. It tests all the exponential, logarithmic,
and special power functions for a few obvious values. Note that
one or two of the tests are obliged to produce an exact result. If all tests
pass, the program displays the message:
SUCCESS testing <math.h>, part 3
I can report, rather sheepishly, that these simple tests caught numerous
errors. Some arose, naturally enough, while I was first writing and debugging
the math functions. The more embarrassing errors appeared while I
was introducing various "improvements." I learned to rerun them religiously
after any changes.
Figure 7.37: xdtento.c Part 1
/* _Dtento function -- IEEE 754 version */
#include <errno.h>
#include <float.h>
#include "xmath.h"

/* macros */
#define NPOWS (sizeof pows / sizeof pows[0] - 1)

/* static data */
static const double pows[] = {
    1e1, 1e2, 1e4, 1e8, 1e16, 1e32,
#if 0x100 < _DBIAS
    1e64, 1e128, 1e256,
#endif
    };
static const size_t npows = {NPOWS};

static short dmul(double *px, double y)
    {
    short xexp;

    _Dunscale(&xexp, px);
    *px *= y;
    return (_Dscale(px, xexp));
    }

double _Dtento(double x, short n)
    {   /* compute x * 10**n */
    double factor;
    short errx;
    size_t i;

    if (n == 0 || x == 0.0)
        return (x);
    factor = 1.0;
    if (n < 0)
        {   /* scale down */
        unsigned int nu = -(unsigned int)n;

        for (i = 0; 0 < nu && i < npows; nu >>= 1, ++i)
            if (nu & 1)
                factor *= pows[i];
        errx = dmul(&x, 1.0 / factor);
        if (errx < 0 && 0 < nu)
            for (factor = 1.0 / pows[npows]; 0 < nu; --nu)
                if (0 <= (errx = dmul(&x, factor)))
                    break;
        }
    else if (0 < n)
        {   /* scale up */
        for (i = 0; 0 < n && i < npows; n >>= 1, ++i)
            if (n & 1)
                factor *= pows[i];
Continuing xdtento.c Part 2
        .....
        }
    if (0 <= errx)
        errno = ERANGE;
    return (x);
    }
.....
#else
#define _D1 1   /* big-endian order */
#define _D2 2
#define _D3 3
#endif
/* return values for _D functions */
#define FINITE -1
#define INF 1
#define NAN 2
/* declarations */
double _Atan(double, unsigned short);
short _Dint(double *, short);
short _Dnorm(unsigned short *);
short _Dscale(double *, short);
double _Dtento(double, short);
short _Dtest(double *);
short _Dunscale(short *, double *);
short _Exp(double *, short);
short _Ldunscale(short *, long double *);
double _Poly(double, const double *, int);
extern _Dconst _Inf, _Nan, _Rteps, _Xbig;
Continuing
tmathl.c
Part 2
References
William J. Cody, Jr. and William Waite, Software Manual for the Elementary
Functions (Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1980). This is an
excellent reference on writing reliable and accurate math functions. It is the
source of approximations for many of the functions in this chapter.
John F. Hart, E.W. Cheney, Charles L. Lawson, Hans J. Maehly, Charles
K. Mesztenyi, John R. Rice, Henry G. Thacher, Jr., and Christoph Witzgall,
Computer Approximations (Malabar, Florida: Robert E. Krieger Publishing
Company, 1978). This book contains several chapters on the art and science
of numerical approximation, but its great strength lies in its extensive tables
of coefficients. You can probably find an approximation with just the
precision you need for any of the common math functions.
elefunt is a collection of transportable FORTRAN programs for testing
the elementary function programs provided with FORTRAN compilers.
They are fanatically thorough. The programs are written in FORTRAN by
William J. Cody and are described in detail in Cody and Waite. Mail to the
Internet address netlib@research.att.com the request:
send index from elefunt
Exercises
Exercise 7.1 Determine the floating-point representation for your C translator. Can you
alter the parameters in <yvals.h> to accommodate it? If so, do so. Otherwise, alter the primitives to suit.
Exercise 7.2 Write the function double hypot(double, double) that computes the
square root of the sum of the squares of its arguments. (This yields the
hypotenuse of a right triangle whose sides are the two arguments.) Test it
with the expressions:
hypot(0.7 * DBL_MAX, 0.7 * DBL_MAX);
hypot(DBL_MAX, 1.0);
hypot(1.0, DBL_MAX);
hypot(3.0, 4.0);
Figure 7.40: tmath2.c Part 1
/* ..... -- part 2 */
.....
/* static data */
static double eps;
.....
int main()
.....
Continuing tmath2.c Part 2
    .....
    assert(approx(sin(-2.0 * piby4), -1.0));
    assert(approx(sin(-piby4), -rthalf));
    assert(approx(sin(0.0), 0.0));
    assert(approx(sin(piby4), rthalf));
    assert(approx(sin(2.0 * piby4), 1.0));
    assert(approx(sin(3.0 * piby4), rthalf));
    assert(approx(sin(4.0 * piby4), 0.0));
    assert(approx(tan(-3.0 * piby4), 1.0));
    assert(approx(tan(-piby4), -1.0));
    assert(approx(tan(0.0), 0.0));
    assert(approx(tan(piby4), 1.0));
    assert(approx(tan(3.0 * piby4), -1.0));
    puts("SUCCESS testing <math.h>, part 2");
    return (0);
    }
Exercise 7.4 Write functions that perform complex arithmetic. Each complex value x +
i*y is represented by the pair (x, y). Provide at least the operations compare,
subtract, add, divide, multiply, magnitude, and phase. Also provide functions
that convert between existing floating-point types and complex. Can
you use any existing functions to advantage? What other functions are
desirable?
Exercise 7.5 Alter the primitives in <math.h> to eliminate the special codes for NaN, Inf,
and -Inf. Replace primitives with macros in "xmath.h" wherever possible.
What does this do to the sizes of functions in the Standard C library? What
does it do to execution times?
Exercise 7.6 [Harder] Write versions of all the math functions that accept float arguments
and produce float results. Append an f to each existing function name to
obtain the new function name. How can you test these functions?
Exercise 7.7 [Harder] Write versions of all the math functions that accept long double
arguments and produce long double results. Append an l to each existing
function name to obtain the new function name. How can you test these
functions?
Exercise 7.8 [Harder] Write versions of all the math functions that accept complex
arguments and produce complex results. Prepend a c to each existing
function name to obtain the new function name. How can you test these
functions?
Exercise 7.9 [Very hard] Measure a large corpus of code to determine if any of the math
functions are worth coding inline. Modify a C compiler to do so. Measure
the result.
Figure 7.41: tmath3.c
/* ..... -- part 3 */
.....
static double eps;

static int approx(double d1, double d2)
    {   /* test for approximate equality */
    return ((d2 ? fabs((d2 - d1) / d2) : fabs(d1)) < eps);
    }

int main()
    .....
The major effect of this restriction is that you cannot hide function names
inside a hierarchy. All the functions that you declare within a given translation unit are visible to each other. That is not a major drawback - you
can limit visibility by grouping functions within separate C source files that
belong to different translation units.
C does, however, suffer in another way because of this design decision.
It provides no easy way to transfer control out of a function except by
returning to the expression that called the function. For the vast majority
of function calls, that is a desirable limitation. You want the discipline of
nested function calls and returns to help you understand flow of control
through a program. Nevertheless, on some occasions that discipline is too
restrictive. The program is sometimes easier to write, and to understand,
if you can jump out of one or more function invocations at a single stroke.
You want to bypass the normal function returns and transfer control to
somewhere in an earlier function invocation. That's often the best way to
handle a serious error.
You can do this sort of thing in Pascal. A nested function can contain a
goto statement that transfers control to a label outside that function. (A void
function in C is called a procedure in Pascal. I use "function" here to refer
to Pascal procedures as well.) The label can be in any of the functions
containing the nested function definition, as in:
function x: integer; {a Pascal goto example}
    label 99;
    function y(val: integer): integer;
        begin
        if val < 0 then
            goto 99;
        .....
You must declare the labels in a Pascal function before you declare any
nested functions so the translator can recognize a nonlocal goto.
A goto within the same function can often simply transfer control to the
statement with the proper label. A nonlocal goto has more work to do. It
must terminate execution of the active function invocation. That involves
freeing any dynamically allocated storage and restoring the previous calling
environment. Pascal even closes any files associated with any file
variables freed this way. The function that called the function containing
the goto statement is once again the active function. If the label named in
the goto statement is not in the now-active function, the process repeats.
Eventually, the proper function is once again active and control transfers
to the statement with the proper label. The expression that invoked the
function containing the goto never completes execution.
Pascal uses the nesting of functions to impose some discipline on the
nonlocal goto statements you can write. The language won't let you transfer
control into a function that is not active. You have no way of writing a
transfer of control to an unknown function. Here is one of the ways that
Pascal is arguably better than C.
The older language PL/I has a different solution to the problem. That
language lets you declare label variables. You can assign a label to such a
variable in one context, then use that variable as the target of a goto
statement in another context. What gets stored in the label variable is
whatever information the program needs to perform a nonlocal goto. (The
goto need not be nonlocal - it can transfer control to a label within the
current invocation of the current function.)
The PL/I approach is rather less structured than the one used by Pascal.
You can write a goto statement that names an uninitialized label variable.
Or the label assigned to the variable may be out of date - it may designate
the invocation of a function that has terminated. In either case, the effect
can be disastrous. Unless the implementation can validate the contents of
a label variable before it transfers control, it will make a wild jump. Such
errors are hard to debug.
C implements nonlocal transfers of control by using library functions.
The header <setjmp.h> provides the necessary machinery:
• the type jmp_buf, which you can think of as a label data-object type
• the function longjmp, which performs the nonlocal transfer of control
• the macro setjmp, which stores information on the current calling context
in a data object of type jmp_buf and which marks where you want control
to pass on a corresponding longjmp call
In this regard, the C mechanism is even more primitive than the unstructured
goto of PL/I. All you can do is memorize a place that flow of control
has reached earlier in the execution of the program. You can return to that
place by executing a call to longjmp using the proper jmp_buf data object.
If the data object is uninitialized or out of date, you invite disaster.
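A minimal sketch of the mechanism may help. The names env, depth_reached, descend, and run_once are hypothetical, chosen only for this illustration. A nested call abandons every pending return at a single stroke and control reappears at the setjmp:

```c
#include <setjmp.h>

static jmp_buf env;         /* hypothetical "label" data object */
static int depth_reached;   /* static storage, so its value survives the jump */

static void descend(int depth)
{
    depth_reached = depth;
    if (depth == 3)
        longjmp(env, 7);    /* abandon all pending calls to descend */
    descend(depth + 1);
}

int run_once(void)
{
    switch (setjmp(env)) {  /* entire controlling expression of a switch */
    case 0:                 /* direct invocation: start the real work */
        descend(0);
        return 0;
    case 7:                 /* arrived here by way of longjmp(env, 7) */
        return depth_reached;
    default:
        return -1;
    }
}
```

The jump is safe here only because run_once, the function that executed setjmp, is still active when longjmp is called.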
longjmp and setjmp are delicate functions. They do violence to the flow
of control and to the management of dynamic storage. Both of those arenas
are the province of a portion of the translator that is extremely complex and
hard to write. That part must generate code that is both correct and
optimized for space and speed. Optimizations often involve subtle changes
in flow of control or the use of dynamic storage. Yet the code generator
often works in ignorance of the properties and actions of longjmp and
setjmp.
The problem arises because the code generator can elect to store some of
these data objects in registers. This set of registers is often indistinguishable
from the set that can hold temporary intermediate values in an expression
evaluation. Hence, setjmp is obliged to save all such registers and restore
them to an earlier state on a longjmp call. That means that certain dynamic
data objects revert to an earlier state on a subsequent return from setjmp.
Any changes in their stored values between returns from setjmp get lost.
Such behavior would be an annoying anomaly if it were predictable. The
problem is that it is not predictable. You have no way of knowing which
parameters and auto data objects end up in registers. Even data objects you
declare as register are uncertain. A translator has no obligation to store
any such data objects in registers. Hence, any number of data objects
declared in a function have uncertain values if the function executes setjmp
and a longjmp call transfers control back to the function. This is hardly a
tidy state of affairs.
X3J11 addressed the problem by adding a minor kludge to the language.
Declare a dynamic data object to have a volatile type and the translator
knows to be more cautious. Such a data object will never be stored in a place
that is altered by longjmp. This usage admittedly stretches the semantics of
volatile, but it does provide a useful service.
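Here is a small sketch of that usage. The names env, bail, and count_attempts are hypothetical. The counter is modified between the setjmp and each longjmp, so it is declared volatile to keep its value determinate after control returns:

```c
#include <setjmp.h>

static jmp_buf env;             /* hypothetical file-scope jump buffer */

static void bail(void)
{
    longjmp(env, 1);            /* return to the setjmp in the caller */
}

int count_attempts(int limit)
{
    volatile int attempts = 0;  /* volatile: survives each longjmp intact */

    (void)setjmp(env);          /* expression statement, cast to void */
    /* execution resumes here after every longjmp */
    if (attempts < limit) {
        ++attempts;
        bail();
    }
    return attempts;
}
```

Without the volatile qualifier, the value of attempts after a longjmp would be indeterminate, because it changes between the setjmp invocation and the longjmp call.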
jmp_buf
which is an array type suitable for holding the information needed to restore a calling environment.
It is unspecified whether setjmp is a macro or an identifier declared with external linkage.
If a macro definition is suppressed in order to access an actual function, or a program defines an
external identifier with the name setjmp, the behavior is undefined.
Description
The setjmp macro saves its calling environment in its jmp_buf argument for later use by
the longjmp function.
Returns
If the return is from a direct invocation, the setjmp macro returns the value zero. If the return
is from a call to the longjmp function, the setjmp macro returns a nonzero value.
Environmental constraint
An invocation of the setjmp macro shall appear only in one of the following contexts:
• the entire controlling expression of a selection or iteration statement;
• one operand of a relational or equality operator with the other operand an integral constant
expression, with the resulting expression being the entire controlling expression of a selection
or iteration statement;
• the operand of a unary ! operator with the resulting expression being the entire controlling
expression of a selection or iteration statement; or
• the entire expression of an expression statement (possibly cast to void).
Description
The longjmp function restores the environment saved by the most recent invocation of the
setjmp macro in the same invocation of the program, with the corresponding jmp_buf
argument. If there has been no such invocation, or if the function containing the invocation of the
setjmp macro has terminated execution¹⁰⁷ in the interim, the behavior is undefined.
All accessible objects have values as of the time longjmp was called, except that the values
of objects of automatic storage duration that are local to the function containing the invocation of
the corresponding setjmp macro that do not have volatile-qualified type and have been changed
between the setjmp invocation and longjmp call are indeterminate.
As it bypasses the usual function call and return mechanisms, the longjmp function shall
execute correctly in contexts of interrupts, signals and any of their associated functions. However,
if the longjmp function is invoked from a nested signal handler (that is, from a function invoked
as a result of a signal raised during the handling of another signal), the behavior is undefined.
Returns
After longjmp is completed, program execution continues as if the corresponding invocation
of the setjmp macro had just returned the value specified by val. The longjmp function
cannot cause the setjmp macro to return the value 0; if val is 0, the setjmp macro returns
the value 1.
106. These functions are useful for dealing with unusual conditions encountered in a low-level
function of a program.
107. For example, by executing a return statement or because another longjmp call has
caused a transfer to a setjmp invocation in a function earlier in the set of nested calls.
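The rule about a zero val is easy to check with a short sketch. The function name resumed_value is hypothetical. It reports which value setjmp appears to return after longjmp is called with a given val:

```c
#include <setjmp.h>

/* Hypothetical probe: which value does setjmp yield for longjmp(env, val)?
 * The parameter val is not modified after setjmp, so it stays determinate. */
int resumed_value(int val)
{
    jmp_buf env;

    switch (setjmp(env)) {
    case 0:                     /* direct invocation */
        longjmp(env, val);      /* does not return */
    case 1:
        return 1;
    case 2:
        return 2;
    default:
        return -1;
    }
}
```

Calling resumed_value(0) yields 1, not 0, so a program can always distinguish a direct return from setjmp from a return by way of longjmp.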
Report an error and terminate process at any point by executing the call
longjmp (2).
You can also add additional case labels to handle other argument values
that longjmp can expect.
Here is what the top-level function might look like:
static jmp_buf jmpbuf;
.....
I assume here that all references to jmpbuf are within this translation unit.
If not, you must declare jmpbuf with external linkage. (Drop the storage
class keyword static.) Alternatively, you must pass a pointer to jmpbuf to
those functions that must access it.
Note in this regard that jmp_buf is an array type. If you write the
argument jmpbuf,
the translator alters it to a pointer to the first element of
the array. That's what setjmp and longjmp expect. So even though jmpbuf
appears to be passed by value, it is actually passed by reference. That's how
setjmp can store the calling environment in jmpbuf.
For consistency, you should declare each parameter as jmp_buf buf and
write the corresponding argument as jmpbuf. Don't declare the parameter
as jmp_buf *pbuf or write the argument as &jmpbuf. The latter form is
clearer but at odds with the long-standing conventions for calling setjmp
and longjmp.
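The conventions above can be sketched as follows. This is not the author's top-level function; the names process and top_level, and the meaning assigned to the codes 1 (restart) and 2 (terminate), are assumptions made for the illustration. Note that the parameter is declared jmp_buf buf and the argument is written jmpbuf:

```c
#include <setjmp.h>

/* Hypothetical worker: reports trouble by jumping instead of returning. */
static void process(jmp_buf buf, int fail_code)     /* buf is really a pointer */
{
    if (fail_code != 0)
        longjmp(buf, fail_code);
}

/* Hypothetical top-level driver. Returns the number of restarts,
 * or -1 if processing was terminated. */
int top_level(int first_failure)
{
    jmp_buf jmpbuf;
    volatile int restarts = 0;          /* modified between setjmp and longjmp */
    volatile int code = first_failure;

    switch (setjmp(jmpbuf)) {           /* switch on alternate returns */
    case 0:                             /* first time through */
        break;
    case 1:                             /* restart requested */
        ++restarts;
        code = 0;                       /* pretend the retry succeeds */
        break;
    default:                            /* 2 or anything else: terminate */
        return -1;
    }
    process(jmpbuf, code);              /* argument written as jmpbuf, not &jmpbuf */
    return restarts;
}
```

Keeping the setjmp in a small function with only volatile locals limits how much the translator can surprise you.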
If you choose an alternate form for using setjmp, execute the macro in
the smallest possible function you can write. If the translator does not treat
setjmp specially, it has less opportunity to surprise you. If it is aware that
setjmp is troublesome, it has less code to deoptimize for safety.
Additional caveats apply if you call longjmp from within a signal handler.
Chapter 9: <signal.h> discusses the issues in greater detail.
Note that <setjmp.h> defines the macro setjmp in terms of yet another
macro (or function) named _Setjmp. The internal header <yvals.h> once
again provides the required information. You can define _Setjmp as a macro
that calls an existing function with a different name. Or you can declare
_Setjmp as a function that you write in assembly language. What you cannot
do is provide a function that calls another function. (Think about it.) That's
why I provided an extraordinary degree of flexibility in how you define the
macro setjmp. As an example, consider the Borland Turbo C++ compiler
for PC-compatibles. The internal header <yvals.h> might contain:
#define _NSETJMP 10
int _Setjmp(int *);

#ifndef _SETJMP
#define _SETJMP
#ifndef _YVALS
#include <yvals.h>
#endif
/* macros */
#define setjmp(env) _Setjmp(env)
/* type definitions */
typedef int jmp_buf[_NSETJMP];
/* declarations */
void longjmp(jmp_buf, int);
#endif
Chapter 8
Part of the calling environment is the saved frame pointer from the calling
function. You can locate the saved frame pointer at a fixed offset from a
single declared dynamic data object.
If the calling environment is in the right place and the frame pointer is
set properly, the function can return to the caller that provided that
calling environment.
Some of these assumptions are true of many implementations of C. Some,
however, are only rarely true. These functions happen to (barely) work for
the VAX computer architecture. To give some hint as to what is going on, I
wrote them in terms of several parameters. For the VAX, the header
<yvals.h> would contain the macro definitions:
#define _JBFP	1	/* int offset of frame pointer */
#define _JBMOV	60	/* number of bytes in calling context */
#define _JBOFF		/* byte offset of calling context */
#define _NSETJMP	17	/* number of ints in jmp_buf */
Figure 8.2 shows the file setjmp.c. It defines a grubby version of setjmp.
The function assumes that it can copy a contiguous region of the stack to
the jmp_buf data object and save an adequate amount of the calling environment. It declares a number of register data objects in the hope that it
will force the saving of all important registers with the calling context. It
makes a sham of calling dummy to outsmart some optimizers who may
conclude that the registers are never used.
Figure 8.2: setjmp.c
/* setjmp function */
...
static int getfp(void)
	{
	int arg;
	...
	}

int setjmp(jmp_buf env)
	{
	...
	if (a)	/* try to outsmart optimizer */
		dummy(a, b, c, d, e, f, g, h, i, j);
	env[1] = getfp();
	memcpy((char *)&env[2], (char *)env[1] + _JBOFF, _JBMOV);
	return (0);
	}
Figure 8.3: longjmp.c
/* longjmp function */
#include <setjmp.h>
#include <string.h>

static void dummy(int a, int b, int c, int d, int e,
	int f, int g, int h, int i, int j)
	{	/* threaten to use arguments */
	}

static void setfp(int fp)
	{	/* set frame pointer of caller */
	int arg;
	...
	}

static int dojmp(jmp_buf env)
	{	/* do the actual dirty business */
	memcpy((char *)env[1] + _JBOFF, (char *)&env[2], _JBMOV);
	setfp(env[1]);
	return (env[0]);
	}

void longjmp(jmp_buf env, int val)
	{
	...
	if (a)	/* try to outsmart optimizer */
		dummy(a, b, c, d, e, f, g, h, i, j);
	env[0] = val ? val : 1;
	dojmp(env);
	}
Figure 8.3 shows the file longjmp.c. It defines an even grubbier version
of longjmp. The function copies the saved calling context back onto the
stack. It allocates registers the same as setjmp and calls yet another function
in the hope that this wild copy won't overlap anything in active use on the
stack. It then jiggers the frame pointer in the hope that it will thus return
control to the function that called setjmp instead of its true caller.
If all goes well (and there are many reasons why it shouldn't), execution
resumes where setjmp was first called. The value returned by setjmp on
this occasion is the one provided as an argument to longjmp. Wow.
A complete implementation of these two functions must be much tidier.
It may, for example, also have to worry about (among other things):
the status of a floating-point coprocessor
whether any signal handlers are active (See Chapter 9: <signal.h>.)
You will find that proper versions of these functions are typically just as
tricky, only much more reliable.
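Whatever the implementation, the value that setjmp reports on the second return follows the rule visible in the expression val ? val : 1 above: a request for zero is delivered as one. A minimal sketch that checks this with the library you already have follows; the function name seen_after is hypothetical.

```c
#include <setjmp.h>

static jmp_buf env;

/* Hypothetical probe: jump with val, then report what setjmp delivered.
   Returns 1 or -7 when one of those values came back, 0 for any other. */
int seen_after(int val)
{
	switch (setjmp(env)) {	/* setjmp used as the whole controlling expression */
	case 0:
		longjmp(env, val);	/* does not return; control resumes at the switch */
	case 1:
		return 1;
	case -7:
		return -7;
	default:
		return 0;
	}
}
```

Calling seen_after(0) and seen_after(1) should both yield 1, while seen_after(-7) yields -7.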
Figure 8.4: tsetjmp.c
/* test setjmp functions */
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>

		/* static data */
static int ctr;
static jmp_buf b0;

static void jmpto(int n)
	{	/* jump on static buffer */
	longjmp(b0, n);
	}

static char *stackptr(void)
	{	/* test for stack creep */
	char ch;

	return (&ch);
	}

static int tryit(void)
	{	/* exercise jumps */
	jmp_buf b1;
	char *sp = stackptr();

	ctr = 0;
	switch (setjmp(b0))
		{
	case 0:
		assert(sp == stackptr());
		assert(ctr == 0);
		++ctr;
		jmpto(0);	/* should return 1 */
		break;
	case 1:
		assert(sp == stackptr());
		assert(ctr == 1);
		++ctr;
		jmpto(2);
		break;
	case 2:
		assert(sp == stackptr());
		assert(ctr == 2);
		++ctr;
		switch (setjmp(b1))
			{	/* test nesting */
		case 0:
			assert(sp == stackptr());
			assert(ctr == 3);
			++ctr;
			longjmp(b1, -7);
			break;
		case -7:
			assert(sp == stackptr());
			assert(ctr == 4);
			++ctr;
			jmpto(3);
		case 5:
			return (13);
		default:
			return (0);
			}
	case 3:
		longjmp(b1, 5);
		break;
		}
	return (-1);
	}

int main()
	...
sizeof (jmp_buf) = 20
SUCCESS testing <setjmp.h>
References
ISO/IEC Standard 7185:1990 (Geneva: International Standards Organization, 1990). This defines the programming language Pascal, which permits
a nonlocal goto to a containing function.
ISO/IEC Standard 6160:1979 (Geneva: International Standards Organization, 1979). This defines the programming language PL/I, which permits a
nonlocal goto using a label variable.
Exercises
Exercise 8.1 How is the type jmp_buf defined for the C translator that you use? Can you
represent it safely as an array of int? If so, how many elements must the
array have?
Exercise 8.2 Write versions of setjmp and longjmp for the C translator that
you use.
Exercise 8.3 Modify the functions you wrote in the previous exercise to check for
obvious usage errors:
Store a checksum or other signature in each jmp_buf data object and
check it before you trust the remaining contents.
Verify that the call stack is at least as deep as when the contents were
stored in the jmp_buf data object.
What other checks can you envision?
Exercise 8.4 [Harder] An exception handler is a code sequence that gets control when an
exception is reported, or raised. You register the handler along with the code
value for an exception in a given context. Any handler already registered
for the same exception code value is masked. (In other words, registrations
stack.) You unregister the handler when the context terminates. That exposes any earlier handlers. A handler can register a willingness to handle
any condition. It can also reraise an exception - pass it up the line to
handlers registered earlier. If no handler is registered for a given code value,
the program terminates abnormally, preferably with a nasty message.
Design functions when and raise to implement exception handling. when
lets you register and unregister handlers. raise lets you report exceptions.
Why would you want such a capability?
Exercise 8.5 [Harder] Implement the functions you designed for the previous exercise.
Exercise 8.6 [Very hard] Define semantics for setjmp and longjmp that eliminate the
problems described earlier in this chapter. You want to be able to call setjmp
from an arbitrary expression. You want all (surviving) data objects to
remain unaffected by a longjmp call. Modify a Standard C translator accordingly.
Chapter 9: <signal.h>
Background
One problem is the Standard C library itself. If called with valid arguments, no library function should ever generate a synchronous signal. But
an asynchronous signal can occur while the library is executing. The signal
may suspend program execution part way through a print operation, for
example. Should the signal handler print a message, an output stream can
end up in a confused state. There is no way to determine from within a
signal handler whether a library function is in an unsafe state.
Another problem concerns data objects that you declare to have volatile
types. That warns the translator that surprising agents can access the data
object, so it is careful how it generates accesses to such a data object. In
particular, it knows not to perform optimizations that move the accesses to
volatile data objects beyond certain sequence points. A signal handler is, of
course, a surprising agent. Thus, you should declare any data object you
access within a signal handler to have a volatile type. That helps, provided
the signal is synchronous and occurs between two sequence points where
the data object is not accessed. For an asynchronous signal, however, no
amount of protection suffices. Signals are not confined to suspending
program execution only at sequence points.
The C Standard offers a partial solution to the problem of writing reliable
signal handlers. The header <signal.h> defines the type sig_atomic_t. It
is an integer type that the program accesses atomically. A signal should
never suspend program execution part way through the access of a data
object declared with this type. A signal handler can share with the rest of
the program only data objects declared to have type volatile sig_atomic_t.
As a means of communicating information, signals leave much to be
desired. The semantics spelled out for signals in the C Standard is based
heavily on their behavior under the early UNIX operating system. That
system had serious lapses in the way it managed signals:
Multiple signals could get lost. The system did not queue signals, but
remembered only the last one reported. If a second signal occurred
before a handler processed the first, a signal could go unnoticed.
A program could terminate even when it endeavors to process all
signals. When control first passes to a signal handler, handling for that
signal reverts to default behavior. The signal handler must call signal
to reestablish itself as the handler for the signal. Should that signal occur
between entry to the handler and the call to signal, the default handler
gets control and terminates the program.
No mechanism exists for specifically terminating the handling of a
signal. In other operating systems, the program enters a special state.
Processing of subsequent signals blocks until the signal handler reports
completion. On such systems, other functions may have to assist in
processing signals properly. These can include abort and exit, declared
in <stdlib.h>, and longjmp, declared in <setjmp.h>.
Moreover, signals arise from an odd assortment of causes on any computer. The ones named in the C Standard are a subset of those supported
by UNIX. These in turn derive from the interrupts and traps defined for
the PDP-11. Mapping the sources of signals for a given computer onto those
defined for C is often arbitrary. Mapping the semantics of signal handling
for a given operating system can be even more creative.
The C Standard had to weaken the already weak semantics of UNIX
signals to accommodate an assortment of operating systems:
A given signal may never occur unless you report it with raise.
A given signal may be ignored unless you call signal to turn it on.
There's not much left.
Thus, no portable use for the functions declared in <signal.h> can be
defined with complete safety. You could, in principle, specify a handler for
a signal that only raise reports. It's hard to imagine a situation where that
works better than instead using setjmp and longjmp, declared in <setjmp.h>. Besides, you cannot ensure that a given signal is never reported on
an arbitrary implementation of C. Any time your program handles signals,
accept the fact that you limit its portability.
The header <signal.h> declares a type and two functions and defines several macros, for
handling various signals (conditions that may be reported during program execution).
The type defined is
sig_atomic_t
which is the integral type of an object that can be accessed as an atomic entity, even in the presence
of asynchronous interrupts.
The macros defined are
SIG_DFL
SIG_ERR
SIG_IGN
which expand to constant expressions with distinct values that have type compatible with the
second argument to and the return value of the signal function, and whose value compares
unequal to the address of any declarable function; and the following, each of which expands to a
positive integral constant expression that is the signal number corresponding to the specified
condition:
SIGABRT abnormal termination, such as is initiated by the abort function
SIGFPE an erroneous arithmetic operation, such as zero divide or an operation resulting in
overflow
SIGILL detection of an invalid function image, such as an illegal instruction
SIGINT receipt of an interactive attention signal
Description
The signal function chooses one of three ways in which receipt of the signal number sig
is to be subsequently handled. If the value of func is SIG_DFL, default handling for that signal
will occur. If the value of func is SIG_IGN, the signal will be ignored. Otherwise, func shall
point to a function to be called when that signal occurs. Such a function is called a signal handler.
When a signal occurs, if func points to a function, first the equivalent of signal(sig,
SIG_DFL); is executed or an implementation-defined blocking of the signal is performed. (If
the value of sig is SIGILL, whether the reset to SIG_DFL occurs is implementation-defined.)
Next the equivalent of (*func)(sig); is executed. The function func may terminate by
executing a return statement or by calling the abort, exit, or longjmp function. If func
executes a return statement and the value of sig was SIGFPE or any other implementation-defined value corresponding to a computational exception, the behavior is undefined. Otherwise,
the program will resume execution at the point it was interrupted.
If the signal occurs other than as the result of calling the abort or raise function, the
behavior is undefined if the signal handler calls any function in the standard library other than the
signal function itself (with a first argument of the signal number corresponding to the signal
that caused the invocation of the handler) or refers to any object with static storage duration other
than by assigning a value to a static storage duration variable of type volatile
sig_atomic_t. Furthermore, if such a call to the signal function results in a SIG_ERR
return, the value of errno is indeterminate. [109]
At program startup, the equivalent of
signal(sig, SIG_IGN);
may be executed for some signals selected in an implementation-defined manner; the equivalent
of
signal(sig, SIG_DFL);
is executed for all other signals defined by the implementation.
Description
The raise function sends the signal sig to the executing program.
Returns
The raise function returns zero if successful, nonzero if unsuccessful.
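The descriptions of signal and raise can be exercised with a short self-check. This is a sketch only; the names on_term and check_signal_basics are hypothetical, and it relies on the C Standard's rule that a successful call to signal returns the previous setting for that signal.

```c
#include <signal.h>

static volatile sig_atomic_t seen = 0;

static void on_term(int sig)
{
	(void)sig;
	seen = 1;
}

/* Returns 1 if every step behaves as described, 0 otherwise. */
int check_signal_basics(void)
{
	void (*old)(int);

	if (signal(SIGTERM, SIG_IGN) == SIG_ERR)
		return 0;
	if (raise(SIGTERM) != 0)	/* ignored: the program simply continues */
		return 0;
	old = signal(SIGTERM, &on_term);	/* yields the previous setting */
	if (old != SIG_IGN)
		return 0;
	if (raise(SIGTERM) != 0 || seen != 1)	/* handler ran before raise returned */
		return 0;
	signal(SIGTERM, SIG_DFL);	/* leave default handling in place */
	return 1;
}
```

The choice of SIGTERM is arbitrary; any signal that no other agency reports during the check would serve.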
Footnotes
108. See "future library directions" (7.13.5). The names of the signal numbers reflect the
following terms (respectively): abort, floating-point exception, illegal instruction, interrupt,
segmentation violation, and termination.
109. If any signal is generated by an asynchronous signal handler, the behavior is undefined.
Using <signal.h>
Signal handling is essentially nonportable. Use the functions declared
in <signal.h> only when you must specify the handling of signals for a
known set of operating systems. Don't try too hard to generalize the code.
If default handling for a signal is acceptable, then by all means choose
that option. Adding your own signal handler decreases portability and
raises the odds that the program will mishandle the signal. If you must
provide a handler for a signal, categorize it as follows:
a handler for a signal that must not return, such as SIGFPE reporting an
arithmetic exception or SIGABRT reporting a fatal error
a handler for a signal that must return, such as SIGINT reporting an
attention interrupt that may have interrupted a library operation
As a rule, the second category contains asynchronous signals not intended
to cause immediate program termination. Rarely will you find a signal that
does not fit clearly in one of these categories.
A signal handler that must not return ends in a call to abort, exit, or
longjmp. Do not, of course, end a handler for SIGABRT with a call to abort.
The handler should not reestablish itself by calling signal. Leave that to
some other agency, if the program does not terminate. If the signal is
asynchronous, be wary of performing any input or output. You may have
interrupted the library part way through such an operation.
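A handler of the first kind, ending in longjmp, might be sketched like this. The names field_fpe_jump, recover, and guarded_raise are hypothetical, and raise stands in for a genuine arithmetic trap so that the behavior stays defined.

```c
#include <setjmp.h>
#include <signal.h>

static jmp_buf recover;

static void field_fpe_jump(int sig)
{	/* must not return: jump back to a safe point instead */
	(void)sig;
	longjmp(recover, 1);
}

/* Returns 1 if control came back through the handler's longjmp,
   0 if the handler returned, -1 if the handler could not be set. */
int guarded_raise(void)
{
	if (signal(SIGFPE, &field_fpe_jump) == SIG_ERR)
		return -1;
	if (setjmp(recover) == 0) {
		raise(SIGFPE);	/* stand-in for a real arithmetic exception */
		return 0;
	}
	signal(SIGFPE, SIG_DFL);	/* some other agency resets the handling */
	return 1;
}
```

Note that the handler does not reestablish itself. On some systems the signal may also remain blocked after a jump out of its handler, which is one more reason to treat such code as system specific.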
A signal handler that must return ends in a return statement. If it is to
reestablish itself, it should do so immediately on entry. If the signal is
asynchronous, store a nonzero value in a volatile data object of type
sig_atomic_t. Do nothing else that has side effects visible to the executing
program, such as input or output and accessing other data objects.
A sample asynchronous signal handler might look like:
#include <signal.h>

static sig_atomic_t intflag = 0;
Note that two small windows exist where these signals can go astray:
Within field_int before the call to signal, an occurrence of SIGINT can
terminate the program.
Between the testing and clearing of intflag, an occurrence of SIGINT can
be lost.
Those are inherent limitations of signals.
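A handler of the returning kind, together with code that tests and clears the flag, might be sketched as follows. The names field_int and intflag follow the text; arm_interrupt and interrupted are hypothetical helpers added only for illustration.

```c
#include <signal.h>

static volatile sig_atomic_t intflag = 0;

static void field_int(int sig)
{	/* handle SIGINT */
	signal(SIGINT, &field_int);	/* reestablish immediately on entry */
	(void)sig;
	intflag = 1;	/* the only side effect visible to the program */
}

/* Hypothetical helper: install the handler; returns nonzero on success. */
int arm_interrupt(void)
{
	return signal(SIGINT, &field_int) != SIG_ERR;
}

/* Hypothetical helper: report and clear a pending interrupt. */
int interrupted(void)
{
	if (!intflag)
		return 0;
	intflag = 0;	/* a SIGINT arriving between test and clear is lost */
	return 1;
}
```

The program polls interrupted at points where it is prepared to respond, which defers the handling of the signal to a safe place.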
Here is a brief characterization of the signals defined for all implementations
of Standard C. Note that a given implementation may well define
more. Display the contents of <signal.h> for other defined macro names
that begin with SIG. These should expand to (small) positive integers that
represent additional signals.
SIGABRT - This signal occurs when the program is terminating unsuccessfully,
as by an explicit call to abort, declared in <stdlib.h>. Do not
ignore this signal. If you provide a handler, do as little as possible. End the
handler with a return statement or a call to exit, declared in <stdlib.h>.
SIGFPE - The name originally meant "floating-point exception." The C
Standard generalizes this signal to cover any arithmetic exception such as
overflow, underflow, or zero divide. Implementations vary considerably
on what exceptions they report, if any. Rarely does an implementation
report integer overflow. Ignoring this signal may be rash. A handler must
not return.
SIGINT - This is the conventional way of reporting an asynchronous
interactive attention signal. Most systems provide some keystroke combination
that you can type to generate such a signal. Examples are ctl-C, DEL,
and ATTN. It offers a convenient way to terminate a tiresome loop early.
But be aware that an asynchronous signal can catch the program part way
through an operation that should be atomic. If the handler does not return
control, the program may subsequently misbehave. You can safely ignore
this signal.
SIGSEGV - The name originally meant "segmentation violation," because
the PDP-11 managed memory as a set of segments. The C Standard
generalizes this signal to cover any exception raised by an invalid storage
access. The program has attempted to access storage outside any of the
functions or data objects defined by C, as with an ill-formed function
designator or lvalue. Or the program has attempted to store a value in a
data object with a const type. In any event, the program cannot safely
continue execution. Do not ignore this signal or return from its handler.
SIGTERM - This signal is traditionally sent from the operating system or
from another program executing asynchronously with yours. Treat it as a
polite but firm request to terminate execution. It is an asynchronous signal,
so it may occur at an inopportune point in your program. You may want
to defer it, using the techniques described above. You can ignore this signal
safely, although it may be bad manners to do so.
Implementing <signal.h>
Figure 9.1 shows the file signal.h. The header <signal.h> I present here
is minimal. A UNIX system, for example, defines dozens of signals. Many
systems endeavor to look as much as possible like UNIX in this regard.
They too define all these signals even if they do not generate many of them.
Notwithstanding this concerted group behavior, the choice of signals and
their codes both vary considerably. I have endeavored here to choose codes
that are most widely used.
As usual, I make use of the internal header <yvals.h> to provide parameters
that can vary among systems. The code for SIGABRT is one. The highest
valid signal code is another. Some functions in this implementation use the
macro _NSIG to determine the lowest positive number that is not a valid
signal code. Thus, the header <yvals.h> defines two macros of interest here.
For a typical UNIX system, the definitions are:
#define _SIGABRT	6
#define _SIGMAX	32
/* signal function -- UNIX version */
#include <signal.h>

_Sigfun *_Signal(int, _Sigfun *);

_Sigfun *(signal)(int sig, _Sigfun *fun)
	{	/* call the system service */
	return (_Signal(sig, fun));
	}
Figure 9.1: signal.h
/* signal.h standard header */
#ifndef _SIGNAL
#define _SIGNAL
#ifndef _YVALS
#include <yvals.h>
#endif
		/* type definitions */
typedef int sig_atomic_t;
typedef void _Sigfun(int);
		/* signal codes */
#define SIGABRT	_SIGABRT
#define SIGINT	2
#define SIGILL	4
#define SIGFPE	8
#define SIGSEGV	11
#define SIGTERM	15
#define _NSIG	_SIGMAX
		/* signal return values */
#define SIG_DFL	(_Sigfun *)0
#define SIG_ERR	(_Sigfun *)-1
#define SIG_IGN	(_Sigfun *)1
		/* declarations */
int raise(int);
_Sigfun *signal(int, _Sigfun *);
#endif
its earliest use for sending only the signal SIGKILL.) To identify itself, raise
also needs the system service getpid. Assuming suitable secret names for
these two system services, such as _Kill and _Getpid, you can write raise
as:
/* raise function -- UNIX version */
#include <signal.h>
the array is initialized to a null pointer. That happens to match SIG_DFL, the
value that signal uses to indicate default handling.
raise first determines that the signal code is valid. If so, the function
takes the action specified by the corresponding element of _Sigtable.
Default handling is to write a one-line message to the standard error stream
and terminate with unsuccessful status. It names the signals that it knows
about and prints the code value for all others. You can add names for
additional signals if you want more revealing error messages.
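Printing the code value without calling printf means converting it to decimal by hand, filling a small buffer from the right. A minimal sketch of that technique follows; the function name decimal_tail is hypothetical, and unlike the library code it takes an unsigned value and guards against running off the front of the buffer.

```c
#include <stddef.h>

/* Hypothetical helper: write the decimal digits of n at the end of
   buf[0..size-1] and return a pointer to the first digit. size must
   be at least 2. High-order digits are dropped if they do not fit. */
char *decimal_tail(char *buf, size_t size, unsigned int n)
{
	char *p = &buf[size - 1];

	*p = '\0';	/* terminator goes in the last element */
	do
		*--p = (char)(n % 10 + '0');	/* low-order digit first */
	while ((n /= 10) != 0 && p != buf);
	return p;
}
```

Working backward avoids having to reverse the digits, which is why the result begins somewhere inside the buffer rather than at its start.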
Figure 9.3 shows the file signal.c. It defines the function signal that
serves as a companion to raise above. All it does is validate its arguments
and replace the appropriate entry in _Sigtable with a valid function
pointer. (The pointer is assumed valid if it doesn't match SIG_ERR. That's a
fairly weak check.)
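The table-driven scheme shared by these two functions can be sketched independently of the library. Everything named my_ or MY_ below is hypothetical, as is the value 32 for the table size; default handling is reduced to a return code so the sketch can run without terminating the program.

```c
typedef void my_sigfun(int);

#define MY_NSIG	32	/* assumed: lowest number that is not a valid code */
#define MY_DFL	((my_sigfun *)0)
#define MY_ERR	((my_sigfun *)-1)
#define MY_IGN	((my_sigfun *)1)

static my_sigfun *my_table[MY_NSIG];	/* all null, which matches MY_DFL */
static int my_hits;

static void count_hit(int sig)
{	/* sample handler: accumulate the codes reported */
	my_hits += sig;
}

/* Validate the arguments, swap the table entry, return the old one. */
my_sigfun *my_signal(int sig, my_sigfun *fun)
{
	my_sigfun *old;

	if (sig <= 0 || MY_NSIG <= sig || fun == MY_ERR)
		return MY_ERR;
	old = my_table[sig];
	my_table[sig] = fun;
	return old;
}

/* Returns -1 for a bad code, 0 if handled or ignored, and 1 where
   the real function would print a message and terminate. */
int my_raise(int sig)
{
	my_sigfun *s;

	if (sig <= 0 || MY_NSIG <= sig)
		return -1;
	s = my_table[sig];
	if (s == MY_IGN)
		return 0;
	if (s == MY_DFL)
		return 1;
	my_table[sig] = MY_DFL;	/* revert, then call the handler */
	(*s)(sig);
	return 0;
}
```

Reverting the entry before calling the handler reproduces the weak semantics described earlier: a handler that wants to stay installed must call my_signal again.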
Note the declaration for _Sigtable in this file. My usual practice is to
place such a declaration in a header file that is included by all C source files
that need it. In this case that would be the header <signal.h>, but only if
some masking macro referred to it. More likely, it would be some internal
header with a name such as "xsignal.h". I couldn't bring myself to create
yet another header for a single declaration, however. Any style must have
its practical exceptions.
You can add to signal any system-specific code needed to get control
when "hardware signals" occur. These are signals reported by the operating system or the computer itself. Be careful here. Many systems will
transfer control to an address you specify, but not following the C function
call and return discipline. You may have to provide a bit of assembly
language for each signal you handle this way.
Tell the operating system (or the computer) to transfer control to the
assembly-language signal handler. Have that handler save any necessary
context and call the C function you specify with the proper protocol. It can
determine the address from a static data object that you know how to access
both from C and from assembly language. If the C function returns, the
assembly-language signal handler reverses the process to return control to
the interrupted program.
Some operating systems require that you report when a signal handler
completes. For a signal handler that returns, this is relatively easy. The
assembly-language signal handler can do what is necessary on the way out
the door. But remember that a signal handler can also terminate by calling
abort or exit, declared in <stdlib.h>, or by calling longjmp, declared in
<setjmp.h>. You may have to work over all of these functions to do a proper
job.
Figure 9.2: raise.c
/* raise function -- simple version */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

		/* static data */
_Sigfun *_Sigtable[_NSIG] = {0};	/* handler table */

int (raise)(int sig)
	{	/* raise a signal */
	_Sigfun *s;

	if (sig <= 0 || _NSIG <= sig)
		return (-1);	/* bad signal */
	if ((s = _Sigtable[sig]) != SIG_IGN && s != SIG_DFL)
		{	/* revert and call handler */
		_Sigtable[sig] = SIG_DFL;
		(*s)(sig);
		}
	else if (s == SIG_DFL)
		{	/* default handling */
		char ac[10], *p;

		switch (sig)
			{	/* print known signals by name */
		case SIGABRT:
			p = "abort";
			break;
		case SIGFPE:
			p = "arithmetic error";
			break;
		case SIGILL:
			p = "invalid executable code";
			break;
		case SIGINT:
			p = "interruption";
			break;
		case SIGSEGV:
			p = "invalid storage access";
			break;
		case SIGTERM:
			p = "termination request";
			break;
		default:
			*(p = &ac[(sizeof ac) - 1]) = '\0';
			do *--p = sig % 10 + '0';
				while ((sig /= 10) != 0);
			fputs("signal #", stderr);
			}
		fputs(p, stderr);
		fputs(" -- terminating\n", stderr);
		exit(EXIT_FAILURE);
		}
	return (0);
	}
Figure 9.3: signal.c
/* signal function -- simple version */
#include <signal.h>

		/* external declarations */
extern _Sigfun *_Sigtable[_NSIG];

_Sigfun *(signal)(int sig, _Sigfun *fun)
	{
	_Sigfun *s;
	...
	}
Testing <signal.h>
Figure 9.4 shows the file tsignal.c. It doesn't do much, because signals
have so few portable properties. About all it does is test the basic workings
of signal and raise using SIGFPE. The code assumes that no other agency
will report this signal while the program executes. That's a fairly safe
assumption, but not one guaranteed by the C Standard. The test program
also ensures that the various macros are defined, as is the type
sig_atomic_t. It makes no attempt to verify any associated semantics,
however.
As a courtesy, the program displays the size in bytes of sig_atomic_t. If
all goes well, the program displays something like:
sizeof (sig_atomic_t) = 2
SUCCESS testing <signal.h>
References
PDP-11/70 Processor Handbook (Maynard, Mass.: Digital Equipment Corporation, 1976). The PDP-11 traps and interrupts inspired the signals
originally defined for UNIX. You can better understand the naming and
semantics of UNIX signals by going back to this source.
Exercises
Exercise 9.1 List the signal codes defined for the C translator you use. Can you describe
in one sentence what each signal indicates?
Exercise 9.2 For the signal codes defined for the C translator you use, contrive tests that
cause each of the signals to occur.
Exercise 9.3 Under what circumstances might you care whether any signals went
unreported?
Figure 9.4: tsignal.c
/* test signal functions */
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

		/* static data */
static int sigs[] = {
	SIGABRT, SIGFPE, SIGILL, SIGINT, SIGSEGV, SIGTERM};
	... SIG_ERR, SIG_IGN};

static void field_fpe(int sig)
	{	/* handle SIGFPE */
	assert(sig == SIGFPE);
	puts("SUCCESS testing <signal.h>");
	exit(EXIT_SUCCESS);
	}

int main()
	{	/* test basic workings of signal functions */
	printf("sizeof (sig_atomic_t) = %u\n",
		sizeof (sig_atomic_t));
	assert(signal(SIGFPE, &field_fpe) == SIG_DFL);
	assert(signal(SIGFPE, &field_fpe) == &field_fpe);
	raise(SIGFPE);
	puts("FAILURE testing <signal.h>");
	return (EXIT_FAILURE);
	}
Exercise 9.4 Alter signal and raise to work properly with the C translator you use.
Handle as many hardware signals as possible.
Exercise 9.5 Write a handler for SIGABRT that displays a trace back - a list of the functions
that are active, in the reverse order that they were called. Why would you
want this capability?
Exercise 9.6 [Harder] Identify the critical regions in the Standard C library that should
not be interrupted by a signal. Arrange to have signal handling deferred
until the end of any such critical region if the signal is reported while the
region is active. Why would you want this capability?
Exercise 9.7 [Very hard] Implement new semantics for signals that ensure that:
no signals get duplicated or lost
signals are handled in order of reporting
a program can be sure to handle all signals reported after some point
critical regions can be protected against interruption
a signal handler can communicate safely with other parts of the program
Chapter 10: <stdarg.h>
va_list
which is a type suitable for holding information needed by the macros va_start, va_arg,
and va_end. If access to the varying arguments is desired, the called function shall declare an
object (referred to as ap in this subclause) having type va_list. The object ap may be passed
as an argument to another function; if that function invokes the va_arg macro with parameter
ap, the value of ap in the calling function is indeterminate and shall be passed to the va_end
macro prior to any further reference to ap.
The va_start and va_arg macros described in this subclause shall be implemented as
macros, not as actual functions. It is unspecified whether va_end is a macro or an identifier
declared with external linkage. If a macro definition is suppressed in order to access an actual
function, or a program defines an external identifier with the name va_end, the behavior is
undefined. The va_start and va_end macros shall be invoked in the function accepting a
varying number of arguments, if access to the varying arguments is desired.
7.8.1.1 The va_start macro
Synopsis
#include <stdarg.h>
void va_start(va_list ap, parmN);
Description
The va_start macro shall be invoked before any access to the unnamed arguments.
The va_start macro initializes ap for subsequent use by va_arg and va_end.
The parameter parmN is the identifier of the rightmost parameter in the variable parameter list
in the function definition (the one just before the , ...). If the parameter parmN is declared with
the register storage class, with a function or array type, or with a type that is not compatible
with the type that results after application of the default argument promotions, the behavior is
undefined.
Returns
The va_start macro returns no value.
va_arg
Description
The va_arg macro expands to an expression that has the type and value of the next argument
in the call. The parameter ap shall be the same as the va_list ap initialized by va_start.
Each invocation of va_arg modifies ap so that the values of successive arguments are returned
in turn. The parameter type is a type name specified such that the type of a pointer to an object
that has the specified type can be obtained simply by postfixing a * to type. If there is no actual
next argument, or if type is not compatible with the type of the actual next argument (as promoted
according to the default argument promotions), the behavior is undefined.
Returns
The first invocation of the va_arg macro after that of the va_start macro returns the
value of the argument after that specified by parmN. Successive invocations return the values of
the remaining arguments in succession.
va_end
Description
The va_end macro facilitates a normal return from the function whose variable argument list
was referred to by the expansion of va_start that initialized the va_list ap. The va_end
macro may modify ap so that it is no longer usable (without an intervening invocation of
va_start). If there is no corresponding invocation of the va_start macro, or if the va_end
macro is not invoked before the return, the behavior is undefined.
Returns
The va_end macro returns no value.
Example
The function f1 gathers into an array a list of arguments that are pointers to strings (but not
more than MAXARGS arguments), then passes the array as a single argument to function f2. The
number of pointers is specified by the first argument to f1.
#include <stdarg.h>
#define MAXARGS	31

void f1(int n_ptrs, ...)
{
	va_list ap;
	char *array[MAXARGS];
	int ptr_no = 0;

	if (n_ptrs > MAXARGS)
		n_ptrs = MAXARGS;
	va_start(ap, n_ptrs);
	while (ptr_no < n_ptrs)
		array[ptr_no++] = va_arg(ap, char *);
	va_end(ap);
	f2(n_ptrs, array);
}
Each call to f1 shall have visible the definition of the function or a declaration such as
void f1(int, ...);
va-fputs,
a test", NULL);
In this example, both functions should produce the same output to the
stream stdout.
You can write va_fputs as:
#include <stdarg.h>
#include <stdio.h>
...
*))
*/
!= NULL)
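A minimal sketch of such a function, for orientation only. It assumes that va_fputs takes the output stream as its one named argument and that a null pointer of type char * ends the list of strings; that signature is an assumption made for this illustration.

```c
#include <stdarg.h>
#include <stdio.h>

/* Assumed signature: the stream is the only named argument, and a
   null pointer of type char * marks the end of the string list. */
void va_fputs(FILE *str, ...)
{
    va_list ap;
    char *s;

    va_start(ap, str);                        /* begin after the last named argument */
    while ((s = va_arg(ap, char *)) != NULL)  /* fetch strings until the terminator */
        fputs(s, str);
    va_end(ap);                               /* always pair with va_start */
}
```

A call might look like va_fputs(stdout, "this", " is", " a test", (char *)NULL). The cast on the terminator matters, because nothing tells the translator what type the extra arguments should have.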
You can follow this pattern to process a wide range of variable argument
lists. You can even process the variable argument list in a separate function.
Be sure to execute va_start before you call the function. Then execute
va_end when the function returns.
If you want to rescan a variable argument list you have to be a bit more
careful. Execute va_start to initiate each rescan, of course. Execute va_end
before the function returns, and only if you execute va_start at least once.
I recommend an even safer discipline: execute va_start and va_end
within the same loop. That way, you are more certain to execute va_end only
when you should.
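The rescanning discipline can be sketched as follows. The function count_of_max is hypothetical, invented here for illustration; it scans its int arguments twice and keeps each va_start and va_end pair inside the same loop body.

```c
#include <stdarg.h>

/* Hypothetical example: report how many of the n int arguments
   equal the largest one. Requires two scans of the list. */
int count_of_max(int n, ...)
{
    va_list ap;
    int pass, i, max = 0, count = 0;

    for (pass = 0; pass < 2; ++pass) {
        va_start(ap, n);              /* start each scan afresh */
        for (i = 0; i < n; ++i) {
            int v = va_arg(ap, int);

            if (pass == 0) {          /* first scan: find the maximum */
                if (i == 0 || v > max)
                    max = v;
            } else if (v == max)      /* second scan: count matches */
                ++count;
        }
        va_end(ap);                   /* paired with the va_start above */
    }
    return count;
}
```

Because the pair lives in one loop body, there is no path that executes va_end without a matching va_start.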
Many implementations have no need for va_end. The macro expands to
code that does nothing. That means that any errors in using this macro
become time bombs that may not go off for years. They get more expensive
to find and fix with each passing year. Take pains to eliminate the bugs up
front.
Another danger lurks in calling a function with the argument ap (the
data object of type va_list). In some implementations, it may be an array
type. That means that the function parameter actually becomes a pointer
to the first element of the va_list array. When the called function executes
va_arg, the data object changes in the calling function (called f above).
In other implementations, va_list is not an array type. That means that
the argument ap passes by value as it appears to do. When the called
function executes va_arg, the data object in the calling function f does not
change.
If you process all arguments in the called function, the difference doesn't
matter. If you execute va_arg in different function invocations with the
"same" ap, however, it can matter. In fact, you get in trouble if your code
requires that the va_list data object be shared or if it requires that the data
object not be shared.
You can ensure the behavior that you need:
If the va_list data object must be shared, write the argument as &ap.
Declare the corresponding parameter as va_list *pap. Within the function,
execute va_arg(*pap, T) to access each argument in the variable
argument list.
If the va_list data object must not be shared, write the argument as ap.
Declare the corresponding parameter as va_list xap. Within the function,
declare a data object as va_list ap and execute memcpy(ap, xap,
sizeof (va_list)). (memcpy is declared in <string.h>.) Execute
va_arg(ap, T) to access each argument in the variable argument list.
These two recipes will work regardless of the type defined for va_list.
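The first recipe, where the va_list data object must be shared, might look like the sketch below. The names next_int and sum_ints are hypothetical, chosen only for this illustration.

```c
#include <stdarg.h>

/* Hypothetical helper: fetch the next int through a shared va_list.
   The parameter is a pointer, so the caller's ap is what advances. */
static int next_int(va_list *pap)
{
    return va_arg(*pap, int);
}

/* Hypothetical caller: sum n int arguments, one helper call each. */
int sum_ints(int n, ...)
{
    va_list ap;
    int i, total = 0;

    va_start(ap, n);
    for (i = 0; i < n; ++i)
        total += next_int(&ap);   /* write the argument as &ap */
    va_end(ap);
    return total;
}
```

Each call to next_int picks up where the previous one stopped, whether or not va_list happens to be an array type.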
Figure 10.1 shows the file stdarg.h. It is the only code needed to
implement <stdarg.h>. That's assuming that it can be made to work with
a given implementation of Standard C.
The approach assumes that:
A variable argument list occupies a contiguous array of characters in
memory.
Successive arguments occupy successively higher elements of the character array.
The space occupied by an argument begins on a storage boundary that
is some multiple of 2^N bytes.
The size of the space is the smallest multiple of 2^N bytes that can
represent the argument.
Any "hole" left in the space is always at the beginning or always at the
end of the argument data object.
These assumptions hold for many implementations of Standard C.
As usual, the internal header <yvals.h> defines macros that describe
variations among different systems. For the header <stdarg.h>, two parameters are relevant:
_AUPBND is a mask that determines the storage boundary enforced within
the variable argument list. Its value is 2^N - 1.
_ADNBND is a mask that determines whether the hole is at the beginning
or at the end of an argument data object. Its value is 2^N - 1 if the hole is
at the end, otherwise it is zero.
A simple example is the Borland Turbo C++ compiler. For that implementation, the header <yvals.h> contains the definitions:
#define _AUPBND 1
#define _ADNBND 1
Figure 10.1: stdarg.h
I discovered the need for specifying a hole before an argument with the
GNU C compiler for the Sun UNIX workstation. For that system, _AUPBND
has the value 3, but _ADNBND is zero.
Perhaps now you can understand the trickery involved in stdarg.h. The
type va_list is just a pointer to char. Such a data object holds a pointer to
the start of the next argument space.
The macro va_start skips past the named argument, which should be
the last of the fixed arguments. It uses the internal macro _Bnd to round up
the size of its argument to a multiple of 2^N bytes.
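The rounding step can be sketched as an ordinary function. The name round_up is hypothetical, and the expression is one common way to round with a mask of the form 2^N - 1; it is offered as an illustration of the idea, not as the text of the internal macro.

```c
#include <stddef.h>

/* Round size up to the next multiple of (mask + 1).
   mask is assumed to be 2^N - 1, or zero for no rounding. */
size_t round_up(size_t size, size_t mask)
{
    return (size + mask) & ~mask;
}
```

With a mask of 1, sizes round up to even numbers of bytes. With a mask of 3, they round up to multiples of four.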
The macro va_arg is the trickiest of the lot. It begins by incrementing the
contents of the va_list data object to point to the start of the next argument
space. Then it backs up to point to the beginning of the current argument.
Then it type casts that pointer value to be a pointer to the specified type.
Its last act is to dereference the pointer to access the value stored in the data
object. (In this implementation, va_arg is an lvalue. Don't count on that
being true of others.)
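The advance-then-back-up arithmetic can be modeled with plain offsets instead of pointers. The functions below are hypothetical stand-ins written for this illustration; upmask and dnmask play the roles described for _AUPBND and _ADNBND, under the assumptions listed earlier.

```c
#include <stddef.h>

/* Round size up to a multiple of (mask + 1); mask is 2^N - 1 or zero. */
static size_t space_for(size_t size, size_t mask)
{
    return (size + mask) & ~mask;
}

/* Offset where the next argument space begins, given that the
   current argument space begins at start. */
size_t next_start(size_t start, size_t size, size_t upmask)
{
    return start + space_for(size, upmask);
}

/* Offset of the argument value itself: step past the whole
   argument space, then back up over the value (and over any hole
   at the end, when dnmask is nonzero). */
size_t value_offset(size_t start, size_t size,
                    size_t upmask, size_t dnmask)
{
    return next_start(start, size, upmask) - space_for(size, dnmask);
}
```

With upmask 3 and dnmask 0, a one-byte value in a space starting at offset 0 sits at offset 3, so the hole comes first. With both masks equal to 3, the value sits at offset 0 and the hole follows it.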
The macro va_end has nothing to do in this implementation. It expands
to the place-holder expression (void)0.
sizeof (va_list) = 4
SUCCESS testing <stdarg.h>
References
UNIX Programmer's Reference Manual, 4.3 Berkeley Software Distribution
Virtual VAX-11 Version (Berkeley, Ca.: University of California, 1986). Here
is the source of the header <varargs.h> that served as the model for
<stdarg.h>.
Figure 10.2: tstdarg.c
/* test stdarg macros */
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* type definitions */
typedef struct {
    char c;
} Cstruct;

static int tryit(const char *fmt, ...)
{   /* test variable argument list */
    int ctr = 0;
    va_list ap;

    va_start(ap, fmt);
    for (; *fmt; ++fmt)
        switch (*fmt) {     /* switch on argument type */
        case 'i':
            assert(va_arg(ap, int) == ++ctr);
            break;
        case 'd':
            assert(va_arg(ap, double) == ++ctr);
            break;
        case 'p':
            assert(va_arg(ap, char *)[0] == ++ctr);
            break;
        case 's':
            assert(va_arg(ap, Cstruct).c == ++ctr);
        }
    va_end(ap);
    return (ctr);
}

int main()
{   /* test basic workings of stdarg macros */
    Cstruct x = {3};

    assert(tryit("iisdi", '\1', 2, x, 4.0, 5) == 5);
    assert(tryit("") == 0);
    assert(tryit("pdp", "\1", 2.0, "\3") == 3);
    printf("sizeof (va_list) = %u\n", (unsigned)sizeof (va_list));
    puts("SUCCESS testing <stdarg.h>");
    return (0);
}
Exercises
Exercise 10.1 Determine how your C translator stores arguments in a variable argument
list by reading its documentation. Does that tell you enough?
Exercise 10.2 Determine how your C translator stores arguments in a variable argument
list by displaying the header <stdarg.h> that it provides. Does that tell you
enough?
Exercise 10.3 Determine how your C translator stores arguments in a variable argument
list by examining the code produced for the test program tstdarg.c (Figure
10.2). Does that tell you enough? If not, augment the program to provide
the missing information.
Exercise 10.4 Alter the code presented in this chapter to adapt the header <stdarg.h> to
work with the C translator you use.
Exercise 10.5 Write the function char *scat(char *dest, const char *src, ...) that
concatenates one or more strings and writes them to dest. The first string
starts at src. A null pointer terminates the list. The function returns a
pointer to the terminating null character for the string starting at dest.
Exercise 10.6 [Harder] You want to test whether an argument is present in a variable
argument list. If it is present, you want to determine its type. Describe a
notation that lets you do this.
Exercise 10.7 [Very hard] Implement the notation you developed for the previous exercise.
Chapter 11
<stddef.h>
The following types and macros are defined in the standard header <stddef.h>. Some are
also defined in other headers, as noted in their respective subclauses.
The types are
ptrdiff_t
which is the signed integral type of the result of subtracting two pointers;
size_t
which is the unsigned integral type of the result of the sizeof operator; and
wchar_t
which is an integral type whose range of values can represent distinct codes for all members of
the largest extended character set specified among the supported locales; the null character shall
have the code value zero and each member of the basic character set defined in 5.2.1 shall have
a code value equal to its value when used as the lone character in an integer character constant.
The macros are
NULL
which expands to an implementation-defined null pointer constant; and
offsetof(type, member-designator)
which expands to an integral constant expression that has type size_t, the value of which is
the offset in bytes, to the structure member (designated by member-designator), from the
beginning of its structure (designated by type). The member-designator shall be such that given
static type t;
then the expression &(t.member-designator) evaluates to an address constant. (If the specified
member is a bit-field, the behavior is undefined.)
The uses for type and macro definitions in the header <stddef.h> are
essentially unrelated. You include this header if you need one or more of
the definitions it provides. Note, however, that only the type definition
ptrdiff_t and the macro offsetof are unique to this header. You will often
find that including another standard header will supply the definition you
need. I discuss each of the type and macro definitions separately.
When you subtract two pointers in a C expression, the result has type
ptrdiff_t. It is an integer type that can represent negative values. Almost
certainly it is either int or long. It is always the signed type that has the same
number of bits as the unsigned type chosen for size_t, described below.
(I said above that the use of these definitions is essentially unrelated. These
two definitions are themselves highly related.)
You can subtract two pointers only if they have compatible data-object
types. One may have a const type qualifier and the other not, for example,
but both must point to the same data-object type. The translator can check
types and complain if they are inappropriate. It generally cannot verify the
additional constraint - both pointers must point to elements within the
same array data object. Write an expression that violates this constraint and
you often get a nonsense result from the subtraction.
The arithmetic essentially proceeds as follows. The program represents
both pointers as offsets in bytes from a common origin in a common
address space. It subtracts the two offsets algebraically, producing a signed
intermediate result. It then divides this intermediate result by the size in
bytes of the data object pointed to by both pointers. If both pointers point
to elements of a common array, the division will yield no remainder. The
final result is the difference in subscripts of the two array elements, regardless of the type of the elements.
That means, for example, that the expression &a[5] - &a[2] always has
the value 3, of type ptrdiff_t. Similarly, &a[2] - &a[5] always has the value
-3. I assume in both cases that a is an array data object with at least 5
elements. (Pointer arithmetic is still defined for the element "just off the
end" of an array, in this case &a[5] if a has exactly 5 elements.)
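The arithmetic can be spelled out by hand. The function diff_elems below is a hypothetical illustration: it takes the byte difference between two elements of a five-element array of double, then divides by the element size, which should give the same answer as subtracting the element pointers directly.

```c
#include <stddef.h>

/* Hypothetical illustration: subscript difference between a[i] and
   a[j], computed from byte offsets. i and j must lie in 0..5, so
   that both addresses are within (or just off the end of) a. */
ptrdiff_t diff_elems(int i, int j)
{
    static double a[5];
    ptrdiff_t bytes = (char *)&a[i] - (char *)&a[j];

    return bytes / (ptrdiff_t)sizeof a[0];   /* no remainder expected */
}
```

The result depends only on the subscripts, not on how many bytes a double happens to occupy.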
ptrdiff_t can be an inadequate type, in some instances. Consider an
implementation where size_t is the type unsigned int. Then ptrdiff_t is
the type int. Let's say further that you can declare a data object x as an array
of char whose size N is greater than INT_MAX bytes. (The header <limits.h>
defines the macro INT_MAX as the largest positive value representable by
type int.) Then you might write something like:
#include <limits.h>
#include <stddef.h>
#define N INT_MAX+10
.....
char x[N];
ptrdiff_t n = &x[N] - &x[0];
results on the fly. This type has the intrinsic limitation that it cannot reliably
capture all results of pointer subtractions. That limits its usefulness in a
portable program. It's nice to know that you can determine the type of the
result of a pointer subtraction. But I don't know why you would care most
of the time.
When you apply the sizeof operator in a C expression, the result has
type size_t. It is an unsigned integer type that can represent the size of the
largest data object you can declare. Almost certainly it is either unsigned int
or unsigned long. It is always the unsigned type that has the same number
of bits as the signed type chosen for ptrdiff_t, described above.
Unlike ptrdiff_t, however, size_t is very useful. It is the safest type to
represent any integer data object you use as an array subscript. You don't
have to worry if a small array evolves to a very large one as the program
changes. Subscript arithmetic will never overflow when performed in type
size_t. You don't have to worry if the program moves to a machine with
peculiar properties, such as 32-bit bytes and 1-byte longs. Type size_t offers
the greatest chance that your code won't be unduly surprised. The only
sensible type to use for computing the sizes of data objects is size_t.
The Standard C library makes extensive use of the type size_t. You will
find that many function arguments and return values are declared to have
this type. That is a deliberate change over older practice in C that often led
to program bugs. It is part of a general trend away from declaring almost
all integers as type int.
You should make a point of using type size_t anywhere your program
performs array subscripting or address arithmetic. Be warned, however,
that unsigned-integer arithmetic has more pitfalls than signed. You cannot
run an unsigned counter down until it goes negative, because it never will. If the
translator doesn't warn you of a silly test expression, the program may loop
forever. You may find, in fact, that counting down to zero sometimes leads
to clumsy tests. You will occasionally miss the convenience of using negative values (such as EOF, defined in <stdio.h> to signal end-of-file) and
testing for them easily. Nevertheless, the improvement in robustness is well
worth the learning investment.
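One way to count down safely with an unsigned subscript is sketched below. The function last_index_of is hypothetical; it tests the counter before decrementing it, and it reports "not found" by returning n instead of a negative value.

```c
#include <stddef.h>

/* Hypothetical example: index of the last occurrence of c among
   the first n characters of s, or n if c does not occur. */
size_t last_index_of(const char *s, size_t n, char c)
{
    size_t i;

    for (i = n; i-- > 0; )    /* compare first, then decrement */
        if (s[i] == c)
            return i;
    return n;                 /* n serves as the "none" value */
}
```

A test such as i >= 0 would always be true for a size_t counter, which is exactly the looping hazard described above.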
The code in this book uses type size_t wherever it is appropriate. You
may see an occasional place where int data objects hold subscripts. In all
such cases, however, the size of related array data objects should be
naturally limited to a safe range of sizes. I indulge in such practices only
when I have an overriding need to mix negative values with proper
subscript values.
You write a wide character constant as, for example, L'x'. It has type
wchar_t. You write a wide character string literal as, for example, L"hello". It
has type array of wchar_t. wchar_t is an integer type that can represent all
the code values for all wide-character encodings supported by the implementation.
For an implementation with only minimal support for wide characters,
wchar_t may be as small as char. For a very ambitious implementation, it
may be as large as unsigned long. More likely, wchar_t is a synonym for an
integer type that has at least a 16-bit representation, such as short or unsigned
short.
You use wchar_t to represent all data objects that must hold wide
characters. Several functions declared in <stdlib.h> manipulate wide
characters, either one at a time or as part of null-terminated strings. You
will find that many function arguments and return values in this group are
declared to have this type. For this reason, the header <stdlib.h> also
defines type wchar_t.
The macro NULL serves as an almost-universal null pointer constant. You
use it as the value of a data-object pointer that should point to no data object
declared (or allocated) in the program. As I mentioned on page 216, the
macro can have any of the definitions 0, 0L, or (void *)0.
The last definition is compatible with any data object pointer. It is not,
however, compatible with a function pointer. That means you cannot write:
int (*pfun)(void) = NULL;    /* WRONG */
The translator may complain that the expression type is incompatible with
the data object you wish to initialize.
An important traditional use for NULL has largely gone away. Early
versions of the C language had no function prototypes. The translator could
not check whether a function-call argument expression was compatible
with the corresponding function parameter declaration. Hence, it could not
adjust the representation of an expression that was compatible but had a
different type (such as changing tan(1) to tan(1.0)). The programmer had
to ensure that each argument value had the proper representation.
Modern programming style is to declare function prototypes for all
functions that you call. Nevertheless, an important context still exists where
a function argument has no corresponding parameter declaration. That is
when you call a function that accepts a variable argument list (such as
printf, declared in <stdio.h>). For the extra arguments, the older C rules
apply. A few standard type conversions occur, but mostly it
is up to you, the programmer, to get each such argument right.
In the earliest implementations of C, all pointers had the same representation. Usually, this representation was the same size as one of the
integer types int or long. Thus, one of the decimal constants 0 or 0L
masqueraded nicely as a null pointer of any type. Define NULL as one of
these two constants and you could assign it to an arbitrary pointer. The
macro was particularly useful as an argument expression. It advertised that
the expression had some pointer type and was a null-pointer constant.
Then along came implementations where pointers looked quite different
than any of the integer types. The only safe way to write a null pointer was
with a type cast, as in (char *)0. If all pointers looked the same, you could
still define NULL as, say, (char *)0. The macro still served as a useful way
to write argument expressions.
Standard C permits different pointer types to have different representations. You are guaranteed that you can convert any data object pointer
to type pointer to char (or pointer to signed char or pointer to unsigned char) and
back again with no loss of information. The newly introduced type pointer
to void has the same representation as pointer to char, but is assignment-compatible with all data-object pointers. You use pointer to void as a convenient
generic data-object pointer type, particularly for declaring function arguments and return values.
The safest definition for NULL on such an implementation is (void *)0.
There is no guarantee, however, that pointer to void has the same representation as any other (non-character) pointer. It isn't even assignment-compatible with function pointers. That means that you can't write NULL as a
universal null-pointer constant. Nor can you safely use it as an argument
expression in place of an arbitrary data-object pointer. It is guaranteed to
masquerade properly only as a character pointer or as a generic pointer to
void.
One modern style of writing C is to avoid the use of NULL altogether. Write
every null pointer constant religiously with an appropriate type cast, as in
(int *)0. That can lead to wordy programs, but has the virtue of being most
unambiguous. A modification of this style is to write a simple 0 as a
null-pointer constant wherever possible. That can lead to programs clear
enough to the translator but not to human readers.
The style I follow in this book is to use NULL as much as possible. I find
it a useful signal that a null-pointer constant is present. I use type casts to
generate null-pointer constants for function pointers. I also use them for
arguments to functions that accept variable argument lists, particularly if
the required type is other than pointer to void.
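As an illustration of that last habit, here is a sketch of a function with a variable argument list whose callers must mark the end of the list with a null pointer. The function total_length is hypothetical. Because no parameter declaration covers the extra arguments, the caller writes the terminator with a type cast.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: add up the lengths of a list of strings.
   The list ends with a null pointer of type char *. */
size_t total_length(const char *first, ...)
{
    va_list ap;
    const char *s;
    size_t n = 0;

    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        n += strlen(s);
    va_end(ap);
    return n;
}
```

A call looks like total_length("ab", "cde", (char *)NULL). Writing a bare 0 or an uncast NULL as the terminator leaves its representation up to the implementation's definition of the macro.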
You will find the macro NULL defined in half a dozen different headers.
It is easy for you to use the macro if you so choose. My only advice is that
you choose a uniform style, as always, and stick with it.
You use the macro offsetof to determine the offset in bytes of a member
from the start of the structure that contains it. That can be important if you
wish to manipulate the individual members of a structure using a table-driven function. See, for example, the function _Makeloc on page 120 and
the table _Loctab on page 117.
The result of this macro is an integer constant expression of type size_t.
That means you can use it to initialize a static data object such as a constant
table with integer elements. It is the only portable way to do so. If you write
code such as:
struct xx {
    int a, b;
    } x;
static size_t off = (char *)&x.b - (char *)&x;
the behavior of the last declaration is undefined. Some implementations
can choose to evaluate the initializer and obtain the obvious result. Others
can choose to diagnose the expression instead.
Nor can you reliably step from member to member by performing
pointer arithmetic. The macros defined in <stdarg.h> let you step from
argument to argument in a function that accepts a variable argument list.
Those macros, or others like them, are not guaranteed to work within a
structure. That's because the holes between structure members can differ
from the holes between function arguments. They need not follow any
documented rules, in fact.
You need the macro offsetof to write code that is portable:
#include <stddef.h>
struct xx {
    int a, b;
    } x;
static size_t off = offsetof(struct xx, b);
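A table-driven use of offsetof might look like the sketch below. The structure point, the table int_member, and both functions are hypothetical names invented for this illustration.

```c
#include <stddef.h>

struct point {
    char tag;
    int x;
    int y;
};

/* Offsets of the int members, fixed at translation time. */
static const size_t int_member[] = {
    offsetof(struct point, x),
    offsetof(struct point, y)
};

/* Store val in every int member listed in the table. */
void set_int_members(struct point *p, int val)
{
    size_t i;

    for (i = 0; i < sizeof int_member / sizeof int_member[0]; ++i)
        *(int *)((char *)p + int_member[i]) = val;
}

/* Small self-check: returns 1 if both int members were set and
   the tag member was left alone. */
int point_demo(int val)
{
    struct point p = { 'a', 0, 0 };

    set_int_members(&p, val);
    return p.x == val && p.y == val && p.tag == 'a';
}
```

The table never assumes anything about the holes between members. Each offset comes from the translator.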
These definitions work for a wide variety of implementations. Nevertheless, certain implementations may require that one or more of them change.
That's why I chose to parametrize them.
For the macro offsetof I chose to use a common trick. Many implementations let you type cast an integer zero to a data-object pointer type, then
perform pointer arithmetic on the result. That is certainly undefined behavior, so you may well find an implementation that balks at this approach.
The translator must indulge you a bit further for this definition of the
macro to work properly. It must let you type cast the zero-based address
back to an integer type, in this case size_t in disguise. Moreover, it must
tolerate such antics in an integer constant expression. That's what you need
to initialize static data objects.
Luckily, quite a few translators grant such a triple indulgence. If you
encounter one that doesn't, you will have to research how its implementors
expect you to define offsetof. To comply with the C Standard, each
implementation must provide some method.
<stddef.h>
#ifndef _STDDEF
#define _STDDEF
#ifndef _YVALS
#include <yvals.h>
#endif
/* macros */
#define NULL _NULL
#define offsetof(T, member) ((_Sizet)&((T *)0)->member)
/* type definitions */
#ifndef _SIZET
#define _SIZET
typedef _Sizet size_t;
#endif
#ifndef _WCHART
#define _WCHART
typedef _Wchart wchar_t;
#endif
typedef _Ptrdifft ptrdiff_t;
#endif
References
P.J. Plauger, "Data-Object Types," The C Users Journal, 6, no. 3
(March/April 1988). This article discusses a few issues related to the topics
in this chapter.
Exercises
Exercise 11.1 Determine the integer types that your implementation has chosen for
ptrdiff_t, size_t, and wchar_t.
Exercise 11.2 Write a program that determines experimentally an integer type you can
use for wchar_t.
Exercise 11.3 Write a program that determines experimentally the integer types you can
use for ptrdiff_t and wchar_t.
Figure 11.2: tstddef.c
/* test stddef definitions */
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* type definitions */
typedef struct {
    char f1;
    struct {
        float flt;
    } f2;
    int f3;
} Str;

/* static data */
static char *pc = NULL;
static double *pd = NULL;
static size_t offs[] = {
    offsetof(Str, f1),
    offsetof(Str, f2),
    offsetof(Str, f3)};

int main()
Exercise 11.4 [Harder] Some implementations permit you to subtract two pointers in an
integer constant expression if both are based on some static data-object
declaration. Write a definition for offsetof that uses this capability.
Exercise 11.5 [Very hard] Add a null-pointer constant to the C language. The keyword
nul is a null pointer compatible with all pointer types. How do you handle
nul as an argument expression in the absence of a corresponding parameter
declaration?
Chapter 12
Well, almost. Peripheral devices still had fairly strong notions about
what they should be asked to do. When you wrote to a printer, for example,
the first character of each line was diverted to control carriage spacing.
Send the same line to a typewriter and the carriage control characters
printed. And carriage control was a lightweight issue compared to blocking
factors for magnetic tape and disk files, or binary card formats, or how to
specify end-of-file on various inputs. After a while, you learned which pairs
of devices you could switch between for certain flavors of input and output.
A further step toward device independence came with the evolution of
standard peripheral interchange (or PIP) utilities. These were programs that
would let you specify any combination of source and destination devices,
then endeavored to perform a sensible copy operation between the two.
Usually, you had to specify a bizarre set of options to give PIP a reasonable
chance at guessing right. And invariably, some desirable combinations just
flatly failed no matter how many hints you provided.
Then along came the CRT terminal and everybody took one step backward. Do you terminate a line with a carriage return, with a carriage return
followed by a line feed, with a newline character, or with some other
magical incantation? Does the terminal accept horizontal tab settings and
expand tabs, or are tabs anathema to it? How do you signal end-of-file from
the keyboard? As you can imagine, there were about as many answers to
these questions as there were vendors of CRT terminals.
It was into this atmosphere that UNIX came in the early 1970s. Ken
Thompson and Dennis Ritchie, the developers of that now-famous system,
deservedly get credit for packing any number of bright ideas into UNIX.
Their approach to device independence was one of the brightest.
UNIX adopted a standard internal form for all text streams. Each line of
text is terminated by a newline character. That's what any program expects
when it reads text, and that's what any program produces when it writes
it. If such a convention doesn't meet the needs of a text-oriented peripheral
attached to a UNIX machine, then the fixup occurs out at the edges of the
system. None of the code in the middle has to change.
UNIX provides two mechanisms for fixing up text streams "out at the
edges." The preferred mechanism is a generic mapper that works with any
text-oriented device. You can set or test the various parameters for a given
device with the ioctl system call. Using ioctl, you can (among other
things) choose among various conversions between the internal newline
convention and the needs of numerous terminals. Over the years, ioctl has
evolved to a fairly sophisticated little PIP for text-oriented devices.
The second mechanism for fixing up text streams is to tailor the special
software that directly controls the device. For each device that a UNIX
system may need to control, someone has to add a device handler to the UNIX
resident. (MS-DOS has adopted similar machinery.) Early on, Thompson
and Ritchie established the precedent that each device should handle
standard text streams wherever possible.
When Dennis Ritchie got the first C compiler going on PDP-11 UNIX,
the language naturally inherited the simple I/O model of its host operating
system. Along with the uniform representation for text streams came
several other contributions to elegance. Those LUNs of yore had evolved
over the years into small positive integers called file descriptors or handles.
The operating system assumes responsibility for handing out file descriptors. And it keeps all file control information in its own private memory,
rather than burden the user with allocating and maintaining file- and
record-control blocks.
To simplify matters for most programs, the UNIX shell hands out three
standard file descriptors to every program that it runs. These are for the
now-commonplace standard input, standard output, and standard error
streams. (They are not exactly a UNIX invention, having incubated in PL/I
and MULTICS, among other places.) Programmers quickly learned the
wisdom of reading text from the standard input and writing text to the
standard output, whenever possible. Thus was born the software tool.
Another small but important refinement was 8-bit transparency. Nothing in UNIX prevents you from writing arbitrary binary codes to any open
file, or reading them back unchanged from an adequate repository. True,
sending binary to a text-oriented device might have bizarre consequences,
but a file or pipeline is usually ready and willing to field arbitrary stuff.
Programmers eventually learned the wisdom of making their programs
tolerant of arbitrary binary codes, whenever that made sense, even if the
programs originated as text processing tools. Thus did UNIX obliterate the
long-standing distinction between text streams (for interacting with people) and binary streams (for interacting with other programs).
Yet another refinement was exact-length files. Most operating systems
make only a half-hearted attempt to disguise any underlying block structure in files kept on disk, tape, or other record-oriented devices. When you
write data to a file and then read it back, you may be treated to anywhere
between one and a thousand extra characters tacked onto the end. UNIX
records the size of a file to the nearest byte, so you get back only what was
put into the file. Programmers of device handlers mostly learned to provide
machinery for keeping data streams to and from devices just as tidy. Thus
fell one of the last needs for the once ubiquitous PIP utility. (Note, however,
that UNIX still has the command, a modern-day PIP.)
Similarly, making temporary files requires no advanced preparation,
and hardly any thought. Stitching together C programs from different
authors via pipelines works far more often than not. Those early UNIX
systems delivered to universities produced a generation of C programmers
blissfully ignorant of the ugly realities involved in performing I/O on most
other operating systems.
The honeymoon ended when C moved from UNIX to other operating
systems. Those of us involved in those first implementations faced some
tough decisions. Should we fight to preserve the simple I/O model to
A UNIX system is free to ignore the b mode qualifier, as is any operating
system for which the distinction has no meaning. On many systems,
however, the distinction is extremely important. If you want your program
to be portable, think about how each file is used and code its fopen mode
properly. Otherwise, your program can fail in all sorts of subtle ways.
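A sketch of what coding the mode properly means for a file that holds arbitrary byte values. The function binary_roundtrip and its 64-byte limit are hypothetical, chosen to keep the illustration short. It writes the bytes with mode "wb", reads them back with mode "rb", and checks that the leading bytes match.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical example: write n bytes (n <= 64) to the named file
   as a binary stream, read them back, and compare. Returns 1 on a
   match, 0 on a mismatch, -1 if the file cannot be used. */
int binary_roundtrip(const char *name, const unsigned char *buf, size_t n)
{
    unsigned char back[64];
    size_t got;
    FILE *fp;

    if (n > sizeof back)
        return -1;
    if ((fp = fopen(name, "wb")) == NULL)   /* b: no text translation */
        return -1;
    if (fwrite(buf, 1, n, fp) != n) {
        fclose(fp);
        remove(name);
        return -1;
    }
    if (fclose(fp) != 0) {
        remove(name);
        return -1;
    }
    if ((fp = fopen(name, "rb")) == NULL) {
        remove(name);
        return -1;
    }
    got = fread(back, 1, sizeof back, fp);
    fclose(fp);
    remove(name);
    /* Some systems pad binary files, so accept extra bytes at the end. */
    return got >= n && memcmp(back, buf, n) == 0;
}
```

Opening the same file with "w" and "r" instead could alter bytes that look like line terminators or end-of-file markers on systems where the distinction matters.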
A text file is designed to support closely the UNIX model of a stream of
text. This is not always easy. As I indicated on page 226, conventions for
terminating text lines vary considerably. The implementation requires latitude in converting what's out there to what your C program reads, and in
converting what your program writes to what makes sense to other programs once it's out there. That latitude must extend to the set of characters
you write to text files, to how you construct text lines, and even to the
difference between zero and nothing. Let me elaborate.
Some systems are far from 8-bit transparent when it comes to writing
things in text files. A ctl-Z looks like an end-of-file in more than one popular
operating system. Even characters from the basic C character set can be
chancy. Form feeds and vertical tabs may not survive intact in some
environments. For maximum portability, in fact, you should write to a text
file only the printing characters, plus space, newline, and horizontal tab.
Many systems balk at partial (last) lines, since they have no way to
represent the concept of a line without a terminator. If the last character you
write to a text file is not a newline, that partial last line may go away. Or it
may be completed for you, so that you read a newline back that you did
not write out. Or the program may gripe when you run it. Avoid partial last
lines in text files.
Some systems cannot even represent an empty line. When you write one,
the library may actually write a line containing a space. On input, the
system then discards the space from a line containing only a single space.
Some systems discard all trailing spaces on a text line. That gives you nicer
behavior if your program reads a file consisting of fixed-length text records.
All those trailing spaces conveniently disappear. But what this means is that
you cannot rely on writing a text line with trailing spaces and reading those
spaces back later. Don't even try, in a portable program.
At the other extreme, systems have a right to impose an upper limit on
the longest text line that they can read or write. Longer lines may be
truncated, so the trailing characters are lost. Or they may be folded, so you
suddenly encounter newline characters that were not there originally. Or
you may get a complaint when you run your program. The upper limit
guaranteed by the C Standard for the length of a text line is 254 characters.
(The longest logical C source line, after processing backslash continuations,
is 509 characters.)
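The rules so far (no partial last line, no trailing spaces, no line longer than the guaranteed limit) can be folded into one small helper. This is only a sketch under stated assumptions: put_text_line and MAX_TEXT_LINE are hypothetical names, and the limit of 254 is taken to include the terminating newline.

```c
#include <stdio.h>
#include <string.h>

#define MAX_TEXT_LINE 254   /* guaranteed line length, newline included (assumed) */

/* Write s as one portable text line. Returns 0 on success, -1 on failure. */
int put_text_line(FILE *fp, const char *s)
{
    size_t n = strlen(s);

    /* Drop a newline supplied by the caller; exactly one is added below. */
    if (n > 0 && s[n - 1] == '\n')
        --n;
    /* Trailing spaces may not survive a round trip, so never write them. */
    while (n > 0 && s[n - 1] == ' ')
        --n;
    /* Refuse lines longer than the guaranteed minimum. */
    if (n + 1 > MAX_TEXT_LINE)
        return -1;
    if (n > 0 && fwrite(s, 1, n, fp) != n)
        return -1;
    /* Every line, including the last, gets its terminator. */
    return putc('\n', fp) == EOF ? -1 : 0;
}
```

The helper does not check that every character is a printing character, space, or horizontal tab. A stricter version could reject anything else.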
Some systems cannot represent an empty file. If you create a new file,
write nothing to it, then close it, the system has no way to distinguish that
empty file from one that is nonexistent. Hence, Standard C permits an
implementation to remove empty files when you close them. Be warned.
A file that is very long, on the other hand, may also cause problems.
Under UNIX, you can characterize the position of any byte in a file with a
32-bit integer. The traditional file-positioning functions of C thus assume
that a long can represent an arbitrary file-position. That is often not true on
other systems, even for files well short of 2^32 bytes in length. The committee
added an alternate set of file-positioning functions to the Standard C library
to partially ameliorate this problem.
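In Standard C the alternate functions are fgetpos and fsetpos, which store a position in an object of type fpos_t instead of a long. The sketch below uses them to look at the next line without consuming it. The function name peek_line is hypothetical.

```c
#include <stdio.h>

/* Read the next line into buf, then restore the file position.
   Returns 0 on success, -1 if no line could be read or the position was lost. */
int peek_line(FILE *fp, char *buf, int size)
{
    fpos_t here;    /* opaque position: need not fit in a long */

    if (fgetpos(fp, &here) != 0)
        return -1;
    if (fgets(buf, size, fp) == NULL) {
        fsetpos(fp, &here);     /* best effort: go back even on failure */
        return -1;
    }
    return fsetpos(fp, &here) == 0 ? 0 : -1;
}
```

An fpos_t value is only good for returning to a position that fgetpos reported earlier on the same stream. You cannot do arithmetic on it, which is the price of not assuming that a long can describe every position.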
To end the discussion of text files on a more positive note, I offer one bit
of encouragement. If you follow all these rules, then the sequence of
characters that you write to a text file will exactly match the sequence that
you later read. Just don't push your luck by bending the rules, if such
symmetry is of importance to you.
As for binary files, the major compromise was to reintroduce length
uncertainty. An implementation must preserve exactly all the bytes you
write at the start of a file, but it is at liberty to pad a binary file. Any number
of padding characters can be added, so long as all of them have value zero
('\0'). Thus, you may have to be more careful in designing your binary
files. Don't assume you will see end-of-file after you read the last character
you earlier wrote to the file. Either have a way of knowing when the data
ends or be tolerant of trailing zero bytes in the data you read.
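One way of knowing when the data ends is to store its length in front of it. The following is a sketch only: write_blob and read_blob are hypothetical names, and the 4-byte big-endian length header is an assumed file layout, not anything the Standard prescribes.

```c
#include <stdio.h>

/* Write n bytes preceded by a 4-byte big-endian length. Returns 0 on success. */
int write_blob(FILE *fp, const unsigned char *data, unsigned long n)
{
    unsigned char hdr[4];

    hdr[0] = (unsigned char)((n >> 24) & 0xff);
    hdr[1] = (unsigned char)((n >> 16) & 0xff);
    hdr[2] = (unsigned char)((n >> 8) & 0xff);
    hdr[3] = (unsigned char)(n & 0xff);
    if (fwrite(hdr, 1, 4, fp) != 4)
        return -1;
    return fwrite(data, 1, (size_t)n, fp) == n ? 0 : -1;
}

/* Read one blob of at most max bytes. Returns its length, or -1 on failure.
   Only the bytes announced by the header are read, so any zero padding
   appended by the implementation is never mistaken for data. */
long read_blob(FILE *fp, unsigned char *buf, unsigned long max)
{
    unsigned char hdr[4];
    unsigned long n;

    if (fread(hdr, 1, 4, fp) != 4)
        return -1;
    n = ((unsigned long)hdr[0] << 24) | ((unsigned long)hdr[1] << 16)
      | ((unsigned long)hdr[2] << 8) | (unsigned long)hdr[3];
    if (n > max || fread(buf, 1, (size_t)n, fp) != n)
        return -1;
    return (long)n;
}
```

The header is written byte by byte so that the file layout does not depend on the size or byte order of an unsigned long on the machine that wrote it.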
As I indicated on page 226, UNIX I/O represents a considerable simplification over earlier systems. Most systems designed before UNIX took it
for granted that I/O was a complex operation whose complexity could not
be hidden from the executing program. Files had all sorts of structure,
reflected in various attributes such as block or record size, search keys,
printer format controls, and so on seemingly ad infinitum. Different combinations of these attributes had to be specified on each system call that
performed I/O. Still other bits of information had to be retained between
system calls to keep track of the state of each stream.
So the easiest thing, it seemed, was for the system to require each user
program to allocate storage space for passing and/or remembering all
these attributes and other bits of state information. The storage area was
called a "data control block," "file control block," "record access block," or
some equally vague name. You were obliged to set aside space for a control
block before you opened the file, pass a pointer to the control block on the
system call that opened the file, and pass the same pointer on all subsequent
system calls that performed I/O on the file. Any other arguments needed
for an I/O system call got tucked into various fields of the control block.
If you were lucky, the operating system vendor provided a package of
assembly-language macros for allocating these control blocks and addressing the various fields. If you were smart, you used these macros religiously,
since most vendors felt quite free to change the size and layout of control
blocks with each release. The macro interface tended to be reasonably
stable, since the vendor's systems programmers would have been inconvenienced had that changed.
But even with the best macro package in the world, you still had to
contend with a pretty unstructured interface. Assemblers, as a rule, can
hardly enforce that you read and write data of the appropriate type from
the fields of a control block. Even worse, the fields tended to be numerous
and ill-documented. It was often not clear whether you could set certain
fields to advantage before a system call, or whether you could rely on the
fields to contain meaningful information after a system call. The one thing
you could count on was that injudicious scribbling within a control block
could curdle I/O, damage files, or even crash the system.
So it was a real step forward when UNIX eliminated the need for control
blocks in user memory. When you open a file under UNIX, you get back
just a file descriptor, a small positive integer. Any control information is
retained within the system, presumably out of reach of stupid or malicious
user programs. Files are sufficiently unstructured that you need specify
only a few parameters on each I/O system call. It is easy to map from a few
scalar arguments on a function called from C to the minimal (and transient)
structure required by each UNIX system call on any given implementation.
The functions that perform UNIX-style I/O from C have names such as
open, close, read, write, and lseek. They traffic in file descriptors and I/O
buffers. They support a simple I/O model that has been imposed on dozens
of more complex operating systems. They appear to be ideal candidates for
the I/O primitives in Standard C.
There is one small problem, however. While the earliest programs
written for UNIX were content to call these primitives directly, later programs became more sophisticated. They imposed a layer of buffering, in
user memory, to minimize the number of system calls per byte of data
transferred in and out of the program. A program almost always runs
substantially faster if it reads and writes hundreds of bytes per system call
instead of just a few.
A standard library of functions evolved that automatically took care of
allocating and freeing buffers, filling them and draining them, and tracking
error conditions in a uniform style. These functions worked with data
structures of type FILE to control streams. Each stream data object kept
track of the state of I/O to the associated file. It also contained a pointer to
a buffer area and additional state information to keep track of the number
of useful bytes in the buffer.
There was broad consensus among the members of X3J11 that streams
were a necessary addition to the Standard C library. Many people had
learned to work exclusively with streams to ensure decent I/O performance. There were even a few implementations of C that had chosen to
implement stream I/O exclusively, disdaining the simpler UNIX-style
primitives as too inefficient.
Some implementations based on the UNIX primitives often had to buffer
data in user memory for the read and write calls, if only to pack and unpack
records in structured files. Customers using the stream functions suffered
The header <stdio.h> declares three types, several macros, and many functions for
performing input and output.
The types declared are size_t (described in 7.1.6);
FILE
which is an object type capable of recording all the information needed to control a stream,
including its file position indicator, a pointer to its associated buffer (if any), an error indicator
that records whether a read/write error has occurred, and an end-of-file indicator that records
whether the end of the file has been reached; and
fpos_t
which is an object type capable of recording all the information needed to specify uniquely every
position within a file.
The macros are NULL (described in 7.1.6);
_IOFBF
_IOLBF
_IONBF
which expand to integral constant expressions with distinct values, suitable for use as the third
argument to the setvbuf function;
BUFSIZ
which expands to an integral constant expression, which is the size of the buffer used by the
setbuf function;
EOF
which expands to a negative integral constant expression that is returned by several functions to
indicate end-of-file, that is, no more input from a stream;
FOPEN_MAX
which expands to an integral constant expression that is the minimum number of files that the
implementation guarantees can be open simultaneously;
FILENAME_MAX
which expands to an integral constant expression that is the size needed for an array of char
large enough to hold the longest file name string that the implementation guarantees can be
opened;
L_tmpnam
which expands to an integral constant expression that is the size needed for an array of char
large enough to hold a temporary file name string generated by the tmpnam function;
SEEK_CUR
SEEK_END
SEEK_SET
which expand to integral constant expressions with distinct values, suitable for use as the third
argument to the fseek function;
TMP_MAX
which expands to an integral constant expression that is the minimum number of unique file names
that shall be generated by the tmpnam function;
stderr
stdin
stdout
which are expressions of type "pointer to FILE" that point to the FILE objects associated,
respectively, with the standard error, input, and output streams.
Forward references: files (7.9.3), the fseek function (7.9.9.2), streams (7.9.2), the tmpnam
function (7.9.4.4).
7.9.2 Streams
streams
Input and output, whether to or from physical devices such as terminals and tape drives, or
whether to or from files supported on structured storage devices, are mapped into logical data
streams, whose properties are more uniform than their various inputs and outputs. Two forms of
mapping are supported, for text streams and for binary streams.
text
streams
A text stream is an ordered sequence of characters composed into lines, each line consisting
of zero or more characters plus a terminating new-line character. Whether the last line requires a
terminating new-line character is implementation-defined. Characters may have to be added,
altered, or deleted on input and output to conform to differing conventions for representing text
in the host environment. Thus, there need not be a one-to-one correspondence between the
characters in a stream and those in the external representation. Data read in from a text stream
will necessarily compare equal to the data that were earlier written out to that stream only if: the
data consist only of printable characters and the control characters horizontal tab and new-line;
no new-line character is immediately preceded by space characters; and the last character is a
new-line character. Whether space characters that are written out immediately before a new-line
character appear when read in is implementation-defined.
binary
streams
A binary stream is an ordered sequence of characters that can transparently record internal data.
Data read in from a binary stream shall compare equal to the data that were earlier written out to
that stream, under the same implementation. Such a stream may, however, have an implementation-defined number of null characters appended to the end of the stream.
Environmental limits
An implementation shall support text files with lines containing at least 254 characters,
including the terminating new-line character. The value of the macro BUFSIZ shall be at least
256.
7.9.3 Files
opening
files
A stream is associated with an external file (which may be a physical device) by opening a file,
which may involve creating a new file. Creating an existing file causes its former contents to be
discarded, if necessary. If a file can support positioning requests (such as a disk file, as opposed
to a terminal), then a file position indicator associated with the stream is positioned at the start
(character number zero) of the file, unless the file is opened with append mode in which case it
is implementation-defined whether the file position indicator is initially positioned at the beginning or the end of the file. The file position indicator is maintained by subsequent reads, writes,
and positioning requests, to facilitate an orderly progression through the file. All input takes place
as if characters were read by successive calls to the fgetc function; all output takes place as if
characters were written by successive calls to the fputc function.
Binary files are not truncated, except as defined in 7.9.5.3. Whether a write on a text stream
causes the associated file to be truncated beyond that point is implementation-defined.
buffering
files
When a stream is unbuffered, characters are intended to appear from the source or at the
destination as soon as possible. Otherwise characters may be accumulated and transmitted to or
from the host environment as a block. When a stream is fully buffered, characters are intended to
be transmitted to or from the host environment as a block when a buffer is filled. When a stream
is line buffered, characters are intended to be transmitted to or from the host environment as a
block when a new-line character is encountered. Furthermore, characters are intended to be
transmitted as a block to the host environment when a buffer is filled, when input is requested on
an unbuffered stream, or when input is requested on a line buffered stream that requires the
transmission of characters from the host environment. Support for these characteristics is
implementation-defined, and may be affected via the setbuf and setvbuf functions.
closing
files
A file may be disassociated from a controlling stream by closing the file. Output streams are
flushed (any unwritten buffer contents are transmitted to the host environment) before the stream
is disassociated from the file. The value of a pointer to a FILE object is indeterminate after the
associated file is closed (including the standard text streams). Whether a file of zero length (on
which no characters have been written by an output stream) actually exists is implementation-defined.
reopening
files
The file may be subsequently reopened, by the same or another program execution, and its
contents reclaimed or modified (if it can be repositioned at its start). If the main function returns
to its original caller, or if the exit function is called, all open files are closed (hence all output
streams are flushed) before program termination. Other paths to program termination, such as
calling the abort function, need not close all files properly.
The address of the FILE object used to control a stream may be significant; a copy of a FILE
object may not necessarily serve in place of the original.
At program startup, three text streams are predefined and need not be opened explicitly: standard input (for reading conventional input), standard output (for writing conventional
output), and standard error (for writing diagnostic output). When opened, the standard error
stream is not fully buffered; the standard input and standard output streams are fully buffered if
and only if the stream can be determined not to refer to an interactive device.
Functions that open additional (nontemporary) files require a file name, which is a string. The
rules for composing valid file names are implementation-defined. Whether the same file can be
simultaneously open multiple times is also implementation-defined.
Environmental limits
The value of FOPEN_MAX shall be at least eight, including the three standard text streams.
Forward references: the exit function (7.10.4.3), the fgetc function (7.9.7.1), the fopen
function (7.9.5.3), the fputc function (7.9.7.3), the setbuf function (7.9.5.5), the setvbuf
function (7.9.5.6).
remove
Description
The remove function causes the file whose name is the string pointed to by filename to
be no longer accessible by that name. A subsequent attempt to open that file using that name will
fail, unless it is created anew. If the file is open, the behavior of the remove function is
implementation-defined.
Returns
The remove function returns zero if the operation succeeds, nonzero if it fails.
rename
Description
The rename function causes the file whose name is the string pointed to by old to be
henceforth known by the name given by the string pointed to by new. The file named old is no
longer accessible by that name. If a file named by the string pointed to by new exists prior to the
call to the rename function, the behavior is implementation-defined.
Returns
The rename function returns zero if the operation succeeds, nonzero if it fails, in which
case if the file existed previously it is still known by its original name.
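Because renaming onto an existing name is implementation-defined, a cautious program can clear the way with remove first. The sketch below does just that. The function name replace_file is hypothetical, and the approach assumes you can tolerate a brief moment when no file has the target name.

```c
#include <stdio.h>

/* Make the file named fresh take over the name target. Returns 0 on success. */
int replace_file(const char *fresh, const char *target)
{
    /* rename onto an existing name is implementation-defined, so remove any
       old file first. A failure here is ignored on purpose: the usual reason
       is simply that target does not exist yet. */
    remove(target);
    return rename(fresh, target) == 0 ? 0 : -1;
}
```

If rename fails, the new data is still available under its original name, so nothing is lost except the old contents of the target.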
tmpfile
Description
The tmpfile function creates a temporary binary file that will automatically be removed
when it is closed or at program termination. If the program terminates abnormally, whether an
open temporary file is removed is implementation-defined. The file is opened for update with
"wb+" mode.
Returns
The tmpfile function returns a pointer to the stream of the file that it created. If the file
cannot be created, the tmpfile function returns a null pointer.
Forward references: the fopen function (7.9.5.3).
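A typical use of tmpfile is to spool input that cannot be rewound, such as a terminal or a pipe, so that it can be read more than once. The following sketch assumes that need; the function name spool is hypothetical.

```c
#include <stdio.h>

/* Copy everything remaining on in to a temporary file and return that file,
   positioned at its start. Returns NULL on any failure. */
FILE *spool(FILE *in)
{
    FILE *tmp = tmpfile();      /* opened "wb+", removed automatically on close */
    int c;

    if (tmp == NULL)
        return NULL;
    while ((c = getc(in)) != EOF)
        if (putc(c, tmp) == EOF) {
            fclose(tmp);
            return NULL;
        }
    if (ferror(in)) {           /* EOF from getc meant an error, not end-of-file */
        fclose(tmp);
        return NULL;
    }
    rewind(tmp);                /* a positioning call also makes reading legal */
    return tmp;
}
```

Since the temporary file is a binary stream, an implementation that pads binary files may hand back extra zero bytes at the end. A careful caller counts the characters on the way in.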
tmpnam
Description
The tmpnam function generates a string that is a valid file name and that is not the same as
the name of an existing file.
The tmpnam function generates a different string each time it is called, up to TMP_MAX times.
If it is called more than TMP_MAX times, the behavior is implementation-defined.
The implementation shall behave as if no library function calls the tmpnam function.
Returns
If the argument is a null pointer, the tmpnam function leaves its result in an internal static
object and returns a pointer to that object. Subsequent calls to the tmpnam function may modify
the same object. If the argument is not a null pointer, it is assumed to point to an array of at least
L_tmpnam chars; the tmpnam function writes its result in that array and returns the argument
as its value.
Environmental limits
The value of the macro TMP_MAX shall be at least 25.
fclose
Description
The fclose function causes the stream pointed to by stream to be flushed and the
associated file to be closed. Any unwritten buffered data for the stream are delivered to the host
environment to be written to the file; any unread buffered data are discarded. The stream is
disassociated from the file. If the associated buffer was automatically allocated, it is deallocated.
Returns
The fclose function returns zero if the stream was successfully closed, or EOF if any errors
were detected.
fflush
Description
If stream points to an output stream or an update stream in which the most recent operation
was not input, the fflush function causes any unwritten data for that stream to be delivered to
the host environment to be written to the file; otherwise, the behavior is undefined.
If stream is a null pointer, the fflush function performs this flushing action on all streams
for which the behavior is defined above.
Returns
The fflush function returns EOF if a write error occurs, otherwise zero.
Forward references: the fopen function (7.9.5.3), the ungetc function (7.9.7.11).
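A common reason to call fflush is a prompt that lacks a terminating newline. On a buffered stream it may sit in the buffer until something forces it out. A minimal sketch, with the hypothetical function name prompt:

```c
#include <stdio.h>

/* Write a prompt and deliver it to the host environment at once.
   Returns 0 on success, -1 on a write error. */
int prompt(FILE *out, const char *msg)
{
    if (fputs(msg, out) == EOF)
        return -1;
    return fflush(out) == EOF ? -1 : 0;
}
```

A call such as prompt(stdout, "Name: ") followed by a read from stdin then behaves the same whether or not standard output happens to be buffered.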
fopen
Description
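The fopen function opens the file whose name is the string pointed to by filename, and
associates a stream with it.
The argument mode points to a string beginning with one of the following sequences:
r             open text file for reading
w             truncate to zero length or create text file for writing
a             append; open or create text file for writing at end-of-file
rb            open binary file for reading
wb            truncate to zero length or create binary file for writing
ab            append; open or create binary file for writing at end-of-file
r+            open text file for update (reading and writing)
w+            truncate to zero length or create text file for update
a+            append; open or create text file for update, writing at end-of-file
r+b or rb+    open binary file for update (reading and writing)
w+b or wb+    truncate to zero length or create binary file for update
a+b or ab+    append; open or create binary file for update, writing at end-of-file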
Opening a file with read mode ('r' as the first character in the mode argument) fails if the
file does not exist or cannot be read.
Opening a file with append mode ('a' as the first character in the mode argument) causes all
subsequent writes to the file to be forced to the then current end-of-file, regardless of intervening
calls to the fseek function. In some implementations, opening a binary file with append mode
('b' as the second or third character in the above list of mode argument values) may initially
position the file position indicator for the stream beyond the last data written, because of null
character padding.
When a file is opened with update mode ('+' as the second or third character in the above list
of mode argument values), both input and output may be performed on the associated stream.
However, output may not be directly followed by input without an intervening call to the fflush
function or to a file positioning function (fseek, fsetpos, or rewind), and input may not
be directly followed by output without an intervening call to a file positioning function, unless
the input operation encounters end-of-file. Opening (or creating) a text file with update mode may
instead open (or create) a binary stream in some implementations.
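The rule about switching direction on an update stream is easy to forget, so here is a sketch that obeys it. The function upcase_first is a hypothetical example that reads the first character of a file and writes it back in upper case, with a positioning call at each change of direction.

```c
#include <ctype.h>
#include <stdio.h>

/* Capitalize the first character of a stream opened for update.
   Returns 0 on success, -1 on failure. */
int upcase_first(FILE *fp)
{
    int c;

    /* A positioning call makes input legal even if the last operation was output. */
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    if ((c = getc(fp)) == EOF)
        return -1;
    /* Input may not be followed directly by output: reposition first. */
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    if (putc(toupper(c), fp) == EOF)
        return -1;
    /* Flushing leaves the stream in a state where input is permitted again. */
    return fflush(fp) == EOF ? -1 : 0;
}
```

The stream should be opened with a mode such as "r+" or "r+b". With append mode the write would be forced to the end of the file regardless of the fseek call.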
When opened, a stream is fully buffered if and only if it can be determined not to refer to an
interactive device. The error and end-of-file indicators for the stream are cleared.
Returns
The fopen function returns a pointer to the object controlling the stream. If the open operation
fails, fopen returns a null pointer.
Forward references: file positioning functions (7.9.9).
7.9.5.4 The freopen function
Synopsis
#include <stdio.h>
FILE *freopen(const char *filename, const char *mode,
    FILE *stream);
Description
The freopen function opens the file whose name is the string pointed to by filename and
associates the stream pointed to by stream with it. The mode argument is used just as in the
fopen function.
The freopen function first attempts to close any file that is associated with the specified
stream. Failure to close the file successfully is ignored. The error and end-of-file indicators for
the stream are cleared.
Returns
The freopen function returns a null pointer if the open operation fails. Otherwise, freopen
returns the value of stream.
setbuf
Synopsis
#include <stdio.h>
void setbuf(FILE *stream, char *buf);
Description
Except that it returns no value, the setbuf function is equivalent to the setvbuf function
invoked with the values _IOFBF for mode and BUFSIZ for size, or (if buf is a null pointer),
with the value _IONBF for mode.
Returns
The setbuf function returns no value.
setvbuf
Synopsis
#include <stdio.h>
int setvbuf(FILE *stream, char *buf, int mode,
    size_t size);
Description
The setvbuf function may be used only after the stream pointed to by stream has been
associated with an open file and before any other operation is performed on the stream. The
argument mode determines how stream will be buffered, as follows: _IOFBF causes input/output to be fully buffered; _IOLBF causes input/output to be line buffered; _IONBF causes
input/output to be unbuffered. If buf is not a null pointer, the array it points to may be used instead
of a buffer allocated by the setvbuf function. The argument size specifies the size of the
array. The contents of the array at any time are indeterminate.
Returns
The setvbuf function returns zero on success, or nonzero if an invalid value is given for
mode or if the request cannot be honored.
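The ordering constraint (after the open, before anything else) is the part of setvbuf most often gotten wrong. The sketch below shows one safe arrangement. The name open_buffered and the buffer size of four times BUFSIZ are assumptions made for illustration, not recommendations from the Standard.

```c
#include <stdio.h>

/* The buffer must outlive the stream that uses it, so it is static here.
   One buffer means only one stream at a time may be opened this way. */
static char stream_buf[4 * BUFSIZ];

/* Open name for writing with a large, fully buffered, caller-supplied buffer.
   Returns the stream, or NULL on failure. */
FILE *open_buffered(const char *name)
{
    FILE *fp = fopen(name, "w");

    if (fp == NULL)
        return NULL;
    /* setvbuf must come before any other operation on the stream. */
    if (setvbuf(fp, stream_buf, _IOFBF, sizeof stream_buf) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

Never pass setvbuf an automatic array from a function that returns before the stream is closed. The stream would go on using storage that no longer exists.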
fprintf
Synopsis
#include <stdio.h>
int fprintf(FILE *stream, const char *format, ...);
Description
The fprintf function writes output to the stream pointed to by stream, under control of
the string pointed to by format that specifies how subsequent arguments are converted for
output. If there are insufficient arguments for the format, the behavior is undefined. If the format
is exhausted while arguments remain, the excess arguments are evaluated (as always) but are
otherwise ignored. The fprintf function returns when the end of the format string is encountered.
The format shall be a multibyte character sequence, beginning and ending in its initial shift
state. The format is composed of zero or more directives: ordinary multibyte characters (not %),
which are copied unchanged to the output stream; and conversion specifications, each of which
results in fetching zero or more subsequent arguments. Each conversion specification is introduced by the character %. After the %, the following appear in sequence:
Zero or more flags (in any order) that modify the meaning of the conversion specification.
An optional minimum field width. If the converted value has fewer characters than the field
width, it will be padded with spaces (by default) on the left (or right, if the left adjustment flag,
described later, has been given) to the field width. The field width takes the form of an asterisk
* (described later) or a decimal integer.
An optional precision that gives the minimum number of digits to appear for the d, i, o, u, x,
and X conversions, the number of digits to appear after the decimal-point character for e, E,
and f conversions, the maximum number of significant digits for the g and G conversions, or
the maximum number of characters to be written from a string in s conversion. The precision
takes the form of a period (.) followed either by an asterisk * (described later) or by an optional
decimal integer; if only the period is specified, the precision is taken as zero. If a precision
appears with any other conversion specifier, the behavior is undefined.
An optional h specifying that a following d, i, o, u, x, or X conversion specifier applies to a
short int or unsigned short int argument (the argument will have been promoted
according to the integral promotions, and its value shall be converted to short int or
unsigned short int before printing); an optional h specifying that a following n
conversion specifier applies to a pointer to a short int argument; an optional l (ell)
specifying that a following d, i, o, u, x, or X conversion specifier applies to a long int or
unsigned long int argument; an optional l specifying that a following n conversion
specifier applies to a pointer to a long int argument; or an optional L specifying that a
following e, E, f, g, or G conversion specifier applies to a long double argument. If an h,
l, or L appears with any other conversion specifier, the behavior is undefined.
A character that specifies the type of conversion to be applied.
As noted above, a field width, or precision, or both, may be indicated by an asterisk. In this
case, an int argument supplies the field width or precision. The arguments specifying field width,
or precision, or both, shall appear (in that order) before the argument (if any) to be converted. A
negative field width argument is taken as a - flag followed by a positive field width. A negative
precision argument is taken as if the precision were omitted.
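A sketch of the asterisk forms at work: the hypothetical function print_cell takes the field width and precision as ordinary int arguments, in that order, ahead of the value to be converted.

```c
#include <stdio.h>

/* Print value in a field whose width and precision are chosen at run time.
   Returns what fprintf returns: the character count, or a negative value. */
int print_cell(FILE *fp, int width, int prec, double value)
{
    /* Each * consumes one int argument: first the width, then the precision. */
    return fprintf(fp, "%*.*f", width, prec, value);
}
```

With a width of 8 and a precision of 2, the value 3.14159 comes out as four spaces followed by 3.14. A width of -8 acts like the - flag with a width of 8, so the spaces follow the digits instead. A negative precision is treated as if no precision had been given, so the f conversion falls back on its default of 6.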
The flag characters and their meanings are
- The result of the conversion will be left-justified within the field. (It will be right-justified
if this flag is not specified.)
+ The result of a signed conversion will always begin with a plus or minus sign. (It will begin
with a sign only when a negative value is converted if this flag is not specified.)
space If the first character of a signed conversion is not a sign, or if a signed conversion results in
no characters, a space will be prefixed to the result. If the space and + flags both appear,
the space flag will be ignored.
0 For d, i, o, u, x, X, e, E, f, g, and G conversions, leading zeros (following any indication
of sign or base) are used to pad to the field width; no space padding is performed. If the 0
and - flags both appear, the 0 flag will be ignored. For d, i, o, u, x, and X conversions, if
a precision is specified, the 0 flag will be ignored. For other conversions, the behavior is
undefined.
The conversion specifiers and their meanings are
d, i The int argument is converted to signed decimal in the style [-]dddd. The precision
specifies the minimum number of digits to appear; if the value being converted can be
represented in fewer digits, it will be expanded with leading zeros. The default precision is
1. The result of converting a zero value with a precision of zero is no characters.
f The double argument is converted to decimal notation in the style [-]ddd.ddd, where the
number of digits after the decimal-point character is equal to the precision specification. If
the precision is missing, it is taken as 6; if the precision is zero and the # flag is not specified,
no decimal-point character appears. If a decimal-point character appears, at least one digit
appears before it. The value is rounded to the appropriate number of digits.
produce a number with E instead of e introducing the exponent. The exponent always
contains at least two digits. If the value is zero, the exponent is zero.
g, G The double argument is converted in style f or e (or in style E in the case of a G conversion
specifier), with the precision specifying the number of significant digits. If the precision is
zero, it is taken as 1. The style used depends on the value converted; style e (or E) will be
used only if the exponent resulting from such a conversion is less than -4 or greater than or
equal to the precision. Trailing zeros are removed from the fractional portion of the result;
a decimal-point character appears only if it is followed by a digit.
In no case does a nonexistent or small field width cause truncation of a field; if the result of a
conversion is wider than the field width, the field is expanded to contain the conversion result.
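Two of the rules above are easy to check with a few lines of code: a field width never truncates, and a zero value converted with a precision of zero produces no characters at all. The sketch uses sprintf, which applies the same conversion rules as fprintf but writes to an array. The function name format_int is hypothetical.

```c
#include <stdio.h>

/* Convert value with a run-time field width and precision.
   buf must be large enough for the result. Returns the number of
   characters written, not counting the terminating null character. */
int format_int(char *buf, int width, int prec, int value)
{
    return sprintf(buf, "%*.*d", width, prec, value);
}
```

Under these rules format_int(buf, 2, 1, 12345) stores all five digits even though the width is only 2, format_int(buf, 6, 4, 42) stores two spaces followed by 0042, and format_int(buf, 0, 0, 0) stores an empty string and returns zero.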
Returns
The fprintf function returns the number of characters transmitted, or a negative value if an
output error occurred.
Environmental limit
The minimum value for the maximum number of characters produced by any single conversion
shall be 509.
Example
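To print a date and time in the form "Sunday, July 3, 10:02" followed by π to five decimal
places:
#include <math.h>
#include <stdio.h>
/*...*/
char *weekday, *month; /* pointers to strings */
int day, hour, min;
fprintf(stdout, "%s, %s %d, %.2d:%.2d\n",
    weekday, month, day, hour, min);
fprintf(stdout, "pi = %.5f\n", 4 * atan(1.0));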
int fscanf(FILE *stream, const char *format, ...);
Description
The fscanf function reads input from the stream pointed to by stream under control of
the string pointed to by format that specifies the admissible input sequences and how they are
to be converted for assignment, using subsequent arguments as pointers to the objects to receive
the converted input. If there are insufficient arguments for the format, the behavior is undefined.
If the format is exhausted while arguments remain, the excess arguments are evaluated (as always)
but are otherwise ignored.
The format shall be a multibyte character sequence, beginning and ending in its initial shift
state. The format is composed of zero or more directives: one or more white-space characters; an
ordinary multibyte character (neither % nor a white-space character); or a conversion specification.
Each conversion specification is introduced by the character %. After the %, the following appear
in sequence:
An optional assignment-suppressing character *.
An optional nonzero decimal integer that specifies the maximum field width.
An optional h, l (ell) or L indicating the size of the receiving object. The conversion specifiers
d, i, and n shall be preceded by h if the corresponding argument is a pointer to short int
rather than a pointer to int, or by l if it is a pointer to long int. Similarly, the conversion
specifiers o, u, and x shall be preceded by h if the corresponding argument is a pointer to
unsigned short int rather than a pointer to unsigned int, or by l if it is a pointer
to unsigned long int. Finally, the conversion specifiers e, f, and g shall be preceded by
l if the corresponding argument is a pointer to double rather than a pointer to float, or
by L if it is a pointer to long double. If an h, l, or L appears with any other conversion
specifier, the behavior is undefined.
A character that specifies the type of conversion to be applied. The valid conversion specifiers
are described below.
The fscanf function executes each directive of the format in turn. If a directive fails, as
detailed below, the fscanf function returns. Failures are described as input failures (due to the
unavailability of input characters), or matching failures (due to inappropriate input).
A directive composed of white-space character(s) is executed by reading input up to the first
non-white-space character (which remains unread), or until no more characters can be read.
A directive that is an ordinary multibyte character is executed by reading the next characters
of the stream. If one of the characters differs from one comprising the directive, the directive fails,
and the differing and subsequent characters remain unread.
A directive that is a conversion specification defines a set of matching input sequences, as
described below for each specifier. A conversion specification is executed in the following steps:
Input white-space characters (as specified by the isspace function) are skipped, unless the
specification includes a [, c, or n specifier.121
An input item is read from the stream, unless the specification includes an n specifier. An input
item is defined as the longest matching sequence of input characters, unless that exceeds a
specified field width, in which case it is the initial subsequence of that length in the sequence. The
first character, if any, after the input item remains unread. If the length of the input item is zero,
the execution of the directive fails: this condition is a matching failure, unless an error prevented
input from the stream, in which case it is an input failure.
Except in the case of a % specifier, the input item (or, in the case of a %n directive, the count
of input characters) is converted to a type appropriate to the conversion specifier. If the input item
is not a matching sequence, the execution of the directive fails: this condition is a matching failure.
Unless assignment suppression was indicated by a *, the result of the conversion is placed in the
object pointed to by the first argument following the format argument that has not already
received a conversion result. If this object does not have an appropriate type, or if the result of
the conversion cannot be represented in the space provided, the behavior is undefined.
e, f, g Matches an optionally signed floating-point number, whose format is the same as
expected for the subject string of the strtod function. The corresponding argument shall
be a pointer to floating.
p Matches an implementation-defined set of sequences, which should be the same as the set
of sequences that may be produced by the %p conversion of the fprintf function. The
corresponding argument shall be a pointer to a pointer to void. The interpretation of the
input item is implementation-defined. If the input item is a value converted earlier during
the same program execution, the pointer that results shall compare equal to that value;
otherwise the behavior of the %p conversion is undefined.
n No input is consumed. The corresponding argument shall be a pointer to integer into which
is to be written the number of characters read from the input stream so far by this call to the
fscanf function. Execution of a %n directive does not increment the assignment count
returned at the completion of execution of the fscanf function.
% Matches a single %; no conversion or assignment occurs. The complete conversion specification shall be %%.
If a conversion specification is invalid, the behavior is undefined.123
The conversion specifiers E, G, and X are also valid and behave the same as, respectively, e,
g, and x.
If end-of-file is encountered during input, conversion is terminated. If end-of-file occurs before
any characters matching the current directive have been read (other than leading white space,
where permitted), execution of the current directive terminates with an input failure; otherwise,
unless execution of the current directive is terminated with a matching failure, execution of the
following directive (if any) is terminated with an input failure.
If conversion terminates on a conflicting input character, the offending input character is left
unread in the input stream. Trailing white space (including new-line characters) is left unread
unless matched by a directive. The success of literal matches and suppressed assignments is not directly determinable other than via the %n directive.
Returns
The fscanf function returns the value of the macro EOF if an input failure occurs before any
conversion. Otherwise, the fscanf function returns the number of input items assigned, which
can be fewer than provided for, or even zero, in the event of an early matching failure.
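The three possible outcomes of the return value (all items assigned, an early matching failure, or EOF) can be sketched as follows. The sketch uses sscanf in place of fscanf so that the input is a string; the helper name parse_record and its record layout are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Parse "<int> <float> <word>" from text.
   Returns the number of items assigned (0 to 3), or EOF if the input
   ran out before the first conversion.
   name must have room for at least 50 characters. */
int parse_record(const char *text, int *i, float *x, char *name)
{
    assert(text != NULL && i != NULL && x != NULL && name != NULL);
    /* %49s limits the field width so name[50] cannot overflow */
    return sscanf(text, "%d%f%49s", i, x, name);
}
```

A caller should compare the result against the number of items it asked for, and treat anything smaller (including EOF, which is negative) as a failed read.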
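Examples
The call:
#include <stdio.h>
/*...*/
int n, i; float x; char name[50];
n = fscanf(stdin, "%d%f%s", &i, &x, name);
with the input line:
25 54.32E-1 thompson
will assign to n the value 3, to i the value 25, to x the value 5.432, and name will contain
thompson\0.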
The call:
#include <stdio.h>
/*...*/
int i; float x; char name[50];
fscanf(stdin, "%2d%f%*d %[0123456789]", &i, &x, name);
with input:
56789 0123 56a72
will assign to i the value 56 and to x the value 789.0, will skip 0123, and name will contain
56\0. The next character read from the input stream will be a.
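To accept repeatedly from stdin a quantity, a unit of measure and an item name:
#include <stdio.h>
/*...*/
int count; float quant; char units[21], item[21];
while (!feof(stdin) && !ferror(stdin)) {
count = fscanf(stdin, "%f%20s of %20s",
&quant, units, item);
fscanf(stdin, "%*[^\n]");
}
If the stdin stream contains the following lines:
2 quarts of oil
-12.8degrees Celsius
lots of luck
10.0LBS of
dirt
100ergs of energy
the execution of the above example will be analogous to the following assignments:
quant = 2; strcpy(units, "quarts"); strcpy(item, "oil");
count = 3;
quant = -12.8; strcpy(units, "degrees");
count = 2; /* "C" fails to match "o" */
count = 0; /* "l" fails to match "%f" */
quant = 10.0; strcpy(units, "LBS"); strcpy(item, "dirt");
count = 3;
count = 0; /* "100e" fails to match "%f" */
count = EOF;
7.9.6.3 The printf function
Synopsis
#include <stdio.h>
int printf(const char *format, ...);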
Description
The printf function is equivalent to fprintf with the argument stdout interposed
before the arguments to printf.
Returns
The printf function returns the number of characters transmitted, or a negative value if an
output error occurred.
7.9.6.4 The scanf function
Synopsis
#include <stdio.h>
int scanf(const char *format, ...);
Description
The scanf function is equivalent to fscanf with the argument stdin interposed before
the arguments to scanf.
Returns
The scanf function returns the value of the macro EOF if an input failure occurs before any
conversion. Otherwise, the scanf function returns the number of input items assigned, which
can be fewer than provided for, or even zero, in the event of an early matching failure.
7.9.6.5 The sprintf function
Synopsis
#include <stdio.h>
int sprintf(char *s, const char *format, ...);
Description
The sprintf function is equivalent to fprintf, except that the argument s specifies an
array into which the generated output is to be written, rather than to a stream. A null character is
written at the end of the characters written; it is not counted as part of the returned sum. If copying
takes place between objects that overlap, the behavior is undefined.
Returns
The sprintf function returns the number of characters written in the array, not counting the
terminating null character.
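Because the count returned by sprintf excludes the terminating null character, it can serve as the offset for the next piece of output. The following is a minimal sketch of that idiom; the helper name build_pairs and the "name=value;" layout are hypothetical, and the buffer is assumed large enough for the whole result.

```c
#include <assert.h>
#include <stdio.h>

/* Build "name=value;" pairs one after another in buf.
   buf is assumed large enough for the whole result (sprintf cannot check).
   Returns the total number of characters written, not counting the null. */
int build_pairs(char *buf, const char *const names[],
                const int values[], int count)
{
    int used = 0;
    int i;

    assert(buf != NULL);
    buf[0] = '\0';
    for (i = 0; i < count; ++i) {
        /* sprintf returns the characters written, so used tracks the end */
        used += sprintf(buf + used, "%s=%d;", names[i], values[i]);
    }
    return used;
}
```

Each call writes to buf + used, so source and destination never overlap, which keeps the sketch clear of the undefined behavior noted above.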
7.9.6.6 The sscanf function
Synopsis
#include <stdio.h>
int sscanf(const char *s, const char *format, ...);
Description
The sscanf function is equivalent to fscanf, except that the argument s specifies a string
from which the input is to be obtained, rather than from a stream. Reaching the end of the string
is equivalent to encountering end-of-file for the fscanf function. If copying takes place between
objects that overlap, the behavior is undefined.
Returns
The sscanf function returns the value of the macro EOF if an input failure occurs before any
conversion. Otherwise, the sscanf function returns the number of input items assigned, which
can be fewer than provided for, or even zero, in the event of an early matching failure.
7.9.6.7 The vfprintf function
Synopsis
#include <stdarg.h>
#include <stdio.h>
int vfprintf(FILE *stream, const char *format, va_list arg);
Description
The vfprintf function is equivalent to fprintf, with the variable argument list replaced
by arg, which shall have been initialized by the va_start macro (and possibly subsequent
va_arg calls). The vfprintf function does not invoke the va_end macro.124
Returns
The vfprintf function returns the number of characters transmitted, or a negative value if
an output error occurred.
Example
The following shows the use of the vfprintf function in a general error-reporting routine.
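#include <stdarg.h>
#include <stdio.h>
void error(char *function_name, char *format, ...)
{
va_list args;
va_start(args, format);
/* print out name of function causing error */
fprintf(stderr, "ERROR in %s: ", function_name);
/* print out remainder of message */
vfprintf(stderr, format, args);
va_end(args);
}
7.9.6.8 The vprintf function
Synopsis
#include <stdarg.h>
#include <stdio.h>
int vprintf(const char *format, va_list arg);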
Description
The vprintf function is equivalent to printf, with the variable argument list replaced by
arg, which shall have been initialized by the va_start macro (and possibly subsequent
va_arg calls). The vprintf function does not invoke the va_end macro.124
Returns
The vprintf function returns the number of characters transmitted, or a negative value if an
output error occurred.
7.9.6.9 The vsprintf function
Synopsis
#include <stdarg.h>
#include <stdio.h>
int vsprintf(char *s, const char *format, va_list arg);
Description
The vsprintf function is equivalent to sprintf, with the variable argument list replaced
by arg, which shall have been initialized by the va_start macro (and possibly subsequent
va_arg calls). The vsprintf function does not invoke the va_end macro.124 If copying
takes place between objects that overlap, the behavior is undefined.
Returns
The vsprintf function returns the number of characters written in the array, not counting
the terminating null character.
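A typical use of vsprintf is inside a variadic wrapper that adds its own text before the caller's formatted message. The sketch below assumes a hypothetical helper named format_tagged and a buffer large enough for the result; note that the wrapper, not vsprintf, is responsible for calling va_end.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Format a message into buf in the style "[tag] message".
   buf is assumed large enough; returns the total characters written. */
int format_tagged(char *buf, const char *tag, const char *format, ...)
{
    va_list args;
    int n;

    assert(buf != NULL && tag != NULL && format != NULL);
    n = sprintf(buf, "[%s] ", tag);
    va_start(args, format);
    n += vsprintf(buf + n, format, args);  /* vsprintf does not call va_end */
    va_end(args);                          /* so the wrapper must */
    return n;
}
```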
7.9.7 Character input/output functions
7.9.7.1 The fgetc function
Synopsis
#include <stdio.h>
int fgetc(FILE *stream);
Description
The fgetc function obtains the next character (if present) as an unsigned char converted
to an int, from the input stream pointed to by stream, and advances the associated file position
indicator for the stream (if defined).
Returns
The fgetc function returns the next character from the input stream pointed to by stream.
If the stream is at end-of-file, the end-of-file indicator for the stream is set and fgetc returns
EOF. If a read error occurs, the error indicator for the stream is set and fgetc returns EOF.125
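Since fgetc returns EOF both at end-of-file and on a read error, a loop that reads to EOF usually checks ferror afterward to tell the two apart. The following is a minimal sketch; the helper name count_chars is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Count the characters remaining on stream.
   Returns the count, or -1L if reading stopped because of a read error. */
long count_chars(FILE *stream)
{
    long count = 0;
    int c;                       /* int, not char, so EOF stays distinct */

    assert(stream != NULL);
    while ((c = fgetc(stream)) != EOF)
        ++count;
    return ferror(stream) ? -1L : count;
}
```

Storing the result in an int matters: the character arrives as an unsigned char converted to int, so every valid character is distinct from EOF.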
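7.9.7.2 The fgets function
Synopsis
#include <stdio.h>
char *fgets(char *s, int n, FILE *stream);
Description
The fgets function reads at most one less than the number of characters specified by n from
the stream pointed to by stream into the array pointed to by s. No additional characters are read
after a new-line character (which is retained) or after end-of-file. A null character is written
immediately after the last character read into the array.
Returns
The fgets function returns s if successful. If end-of-file is encountered and no characters have
been read into the array, the contents of the array remain unchanged and a null pointer is returned.
If a read error occurs during the operation, the array contents are indeterminate and a null pointer
is returned.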
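Because fgets retains the new-line character and stops early when the array fills, a caller usually has to strip the new-line and be prepared for a long line to arrive in pieces. A minimal sketch, with the hypothetical helper name read_line:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read one line into buf (size characters, including the null).
   Strips the retained new-line, if any.
   Returns 1 if a line (possibly partial) was read,
   0 on end-of-file or error. */
int read_line(FILE *stream, char *buf, int size)
{
    size_t len;

    assert(stream != NULL && buf != NULL && size > 1);
    if (fgets(buf, size, stream) == NULL)
        return 0;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[len - 1] = '\0';     /* fgets keeps the new-line; drop it */
    return 1;
}
```

A line longer than size - 1 characters comes back over several calls; only the last piece ends at the new-line.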
7.9.7.3 The fputc function
Synopsis
#include <stdio.h>
int fputc(int c, FILE *stream);
Description
The fputc function writes the character specified by c (converted to an unsigned char)
to the output stream pointed to by stream, at the position indicated by the associated file position
indicator for the stream (if defined), and advances the indicator appropriately. If the file cannot
support positioning requests, or if the stream was opened with append mode, the character is
appended to the output stream.
Returns
The fputc function returns the character written. If a write error occurs, the error indicator
for the stream is set and fputc returns EOF.
7.9.7.4 The fputs function
Synopsis
#include <stdio.h>
int fputs(const char *s, FILE *stream);
Description
The fputs function writes the string pointed to by s to the stream pointed to by stream.
The terminating null character is not written.
Returns
The fputs function returns EOF if a write error occurs; otherwise it returns a nonnegative
value.
7.9.7.5 The getc function
Synopsis
#include <stdio.h>
int getc(FILE *stream);
Description
The getc function is equivalent to fgetc, except that if it is implemented as a macro, it may
evaluate stream more than once, so the argument should never be an expression with side
effects.
Returns
The getc function returns the next character from the input stream pointed to by stream.
If the stream is at end-of-file, the end-of-file indicator for the stream is set and getc returns EOF.
If a read error occurs, the error indicator for the stream is set and getc returns EOF.
7.9.7.6 The getchar function
Synopsis
#include <stdio.h>
int getchar(void);
Description
The getchar function is equivalent to getc with the argument stdin.
Returns
The getchar function returns the next character from the input stream pointed to by stdin.
If the stream is at end-of-file, the end-of-file indicator for the stream is set and getchar returns
EOF. If a read error occurs, the error indicator for the stream is set and getchar returns EOF.
7.9.7.7 The gets function
Synopsis
#include <stdio.h>
char *gets(char *s);
Description
The gets function reads characters from the input stream pointed to by stdin, into the array
pointed to by s, until end-of-file is encountered or a new-line character is read. Any new-line
character is discarded, and a null character is written immediately after the last character read into
the array.
Returns
The gets function returns s if successful. If end-of-file is encountered and no characters have
been read into the array, the contents of the array remain unchanged and a null pointer is returned.
If a read error occurs during the operation, the array contents are indeterminate and a null pointer
is returned.
7.9.7.8 The putc function
Synopsis
#include <stdio.h>
int putc(int c, FILE *stream);
Description
The putc function is equivalent to fputc, except that if it is implemented as a macro, it may
evaluate stream more than once, so the argument should never be an expression with side
effects.
Returns
The putc function returns the character written. If a write error occurs, the error indicator for
the stream is set and putc returns EOF.
7.9.7.9 The putchar function
Synopsis
#include <stdio.h>
int putchar(int c);
Description
The putchar function is equivalent to putc with the second argument stdout.
Returns
The putchar function returns the character written. If a write error occurs, the error indicator
for the stream is set and putchar returns EOF.
7.9.7.10 The puts function
Synopsis
#include <stdio.h>
int puts(const char *s);
Description
The puts function writes the string pointed to by s to the stream pointed to by stdout, and
appends a new-line character to the output. The terminating null character is not written.
Returns
The puts function returns EOF if a write error occurs; otherwise it returns a nonnegative
value.
7.9.7.11 The ungetc function
Synopsis
#include <stdio.h>
int ungetc(int c, FILE *stream);
Description
The ungetc function pushes the character specified by c (converted to an unsigned
char) back onto the input stream pointed to by stream. The pushed-back characters will be
returned by subsequent reads on that stream in the reverse order of their pushing. A successful
intervening call (with the stream pointed to by stream) to a file positioning function (fseek,
fsetpos, or rewind) discards any pushed-back characters for the stream. The external storage
corresponding to the stream is unchanged.
One character of pushback is guaranteed. If the ungetc function is called too many times on
the same stream without an intervening read or file positioning operation on that stream, the
operation may fail.
If the value of c equals that of the macro EOF, the operation fails and the input stream is
unchanged.
A successful call to the ungetc function clears the end-of-file indicator for the stream. The
value of the file position indicator for the stream after reading or discarding all pushed-back
characters shall be the same as it was before the characters were pushed back. For a text stream,
the value of its file position indicator after a successful call to the ungetc function is unspecified
until all pushed-back characters are read or discarded. For a binary stream, its file position
indicator is decremented by each successful call to the ungetc function; if its value was zero
before a call, it is indeterminate after the call.
Returns
The ungetc function returns the character pushed back after conversion, or EOF if the
operation fails.
Forward references: file positioning functions (7.9.9).
7.9.8 Direct input/output functions
7.9.8.1 The fread function
Synopsis
#include <stdio.h>
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);
Description
The fread function reads, into the array pointed to by ptr, up to nmemb elements whose
size is specified by size, from the stream pointed to by stream. The file position indicator for
the stream (if defined) is advanced by the number of characters successfully read. If an error
occurs, the resulting value of the file position indicator for the stream is indeterminate. If a partial
element is read, its value is indeterminate.
Returns
The fread function returns the number of elements successfully read, which may be less than
nmemb if a read error or end-of-file is encountered. If size or nmemb is zero, fread returns
zero and the contents of the array and the state of the stream remain unchanged.
7.9.8.2 The fwrite function
Synopsis
#include <stdio.h>
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);
Description
The fwrite function writes, from the array pointed to by ptr, up to nmemb elements whose
size is specified by size, to the stream pointed to by stream. The file position indicator for
the stream (if defined) is advanced by the number of characters successfully written. If an error
occurs, the resulting value of the file position indicator for the stream is indeterminate.
Returns
The fwrite function returns the number of elements successfully written, which will be less
than nmemb only if a write error is encountered.
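Both fwrite and fread report progress in whole elements, so the natural check is to compare the return value against nmemb. The sketch below round-trips an array of int through a temporary binary file; the helper name round_trip is hypothetical, and using tmpfile for the scratch file is a choice made for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Write count ints to a temporary binary file, then read them back.
   Returns the number of elements read back, or 0 if any step fails. */
size_t round_trip(const int *src, int *dst, size_t count)
{
    FILE *fp;
    size_t got = 0;

    assert(src != NULL && dst != NULL);
    fp = tmpfile();                          /* opened in "wb+" mode */
    if (fp == NULL)
        return 0;
    if (fwrite(src, sizeof *src, count, fp) == count) {
        rewind(fp);                          /* reposition before reading */
        got = fread(dst, sizeof *dst, count, fp);
    }
    fclose(fp);
    return got;
}
```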
7.9.9 File positioning functions
7.9.9.1 The fgetpos function
Synopsis
#include <stdio.h>
int fgetpos(FILE *stream, fpos_t *pos);
Description
The fgetpos function stores the current value of the file position indicator for the stream
pointed to by stream in the object pointed to by pos. The value stored contains unspecified
information usable by the fsetpos function for repositioning the stream to its position at the
time of the call to the fgetpos function.
Returns
If successful, the fgetpos function returns zero; on failure, the fgetpos function returns
nonzero and stores an implementation-defined positive value in errno.
7.9.9.2 The fseek function
Synopsis
#include <stdio.h>
int fseek(FILE *stream, long int offset, int whence);
Description
The fseek function sets the file position indicator for the stream pointed to by stream.
For a binary stream, the new position, measured in characters from the beginning of the file,
is obtained by adding offset to the position specified by whence. The specified position is
the beginning of the file if whence is SEEK_SET, the current value of the file position indicator
if SEEK_CUR, or end-of-file if SEEK_END. A binary stream need not meaningfully support
fseek calls with a whence value of SEEK_END.
For a text stream, either offset shall be zero, or offset shall be a value returned by an
earlier call to the ftell function on the same stream and whence shall be SEEK_SET.
A successful call to the fseek function clears the end-of-file indicator for the stream and
undoes any effects of the ungetc function on the same stream. After an fseek call, the next
operation on an update stream may be either input or output.
Returns
The fseek function returns nonzero only for a request that cannot be satisfied.
7.9.9.3 The fsetpos function
Synopsis
#include <stdio.h>
int fsetpos(FILE *stream, const fpos_t *pos);
Description
The fsetpos function sets the file position indicator for the stream pointed to by stream
according to the value of the object pointed to by pos, which shall be a value obtained from an
earlier call to the fgetpos function on the same stream.
A successful call to the fsetpos function clears the end-of-file indicator for the stream and
undoes any effects of the ungetc function on the same stream. After an fsetpos call, the next
operation on an update stream may be either input or output.
Returns
If successful, the fsetpos function returns zero; on failure, the fsetpos function returns
nonzero and stores an implementation-defined positive value in errno.
7.9.9.4 The ftell function
Synopsis
#include <stdio.h>
long int ftell(FILE *stream);
Description
The ftell function obtains the current value of the file position indicator for the stream
pointed to by stream. For a binary stream, the value is the number of characters from the
beginning of the file. For a text stream, its file position indicator contains unspecified information,
usable by the fseek function for returning the file position indicator for the stream to its position
at the time of the ftell call; the difference between two such return values is not necessarily a
meaningful measure of the number of characters written or read.
Returns
If successful, the ftell function returns the current value of the file position indicator for the
stream. On failure, the ftell function returns -1L and stores an implementation-defined positive
value in errno.
7.9.9.5 The rewind function
Synopsis
#include <stdio.h>
void rewind(FILE *stream);
Description
The rewind function sets the file position indicator for the stream pointed to by stream to
the beginning of the file. It is equivalent to
(void)fseek(stream, 0L, SEEK_SET)
except that the error indicator for the stream is also cleared.
Returns
The rewind function returns no value.
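7.9.10 Error-handling functions
7.9.10.1 The clearerr function
Synopsis
#include <stdio.h>
void clearerr(FILE *stream);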
Description
The clearerr function clears the end-of-file and error indicators for the stream pointed to
by stream.
Returns
The clearerr function returns no value.
7.9.10.2 The feof function
Synopsis
#include <stdio.h>
int feof(FILE *stream);
Description
The feof function tests the end-of-file indicator for the stream pointed to by stream.
Returns
The feof function returns nonzero if and only if the end-of-file indicator is set for stream.
7.9.10.3 The ferror function
Synopsis
#include <stdio.h>
int ferror(FILE *stream);
Description
The ferror function tests the error indicator for the stream pointed to by stream.
Returns
The ferror function returns nonzero if and only if the error indicator is set for stream.
7.9.10.4 The perror function
Synopsis
#include <stdio.h>
void perror(const char *s);
Description
The perror function maps the error number in the integer expression errno to an error
message. It writes a sequence of characters to the standard error stream thus: first (if s is not a
null pointer and the character pointed to by s is not the null character), the string pointed to by s
followed by a colon (:) and a space; then an appropriate error message string followed by a
new-line character. The contents of the error message strings are the same as those returned by
the strerror function with argument errno, which are implementation-defined.
Returns
The perror function returns no value.
Forward references: the strerror function (7.11.6.2).
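The message layout described for perror can be imitated with sprintf and strerror, which is handy when the text must go somewhere other than the standard error stream. This is only a sketch of the layout, not how any particular library implements perror; the helper names format_perror and format_last_error are hypothetical, and the buffer is assumed large enough.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Build the message perror would write for errnum into buf:
   optional "prefix: ", then the strerror text, then a new-line.
   buf is assumed large enough. Returns the length of the message. */
int format_perror(char *buf, const char *prefix, int errnum)
{
    int n = 0;

    assert(buf != NULL);
    if (prefix != NULL && prefix[0] != '\0')
        n = sprintf(buf, "%s: ", prefix);
    n += sprintf(buf + n, "%s\n", strerror(errnum));
    return n;
}

/* Same, using the current value of errno as perror does. */
int format_last_error(char *buf, const char *prefix)
{
    return format_perror(buf, prefix, errno);
}
```

The message text itself is implementation-defined, so a program should not depend on its exact wording.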
Footnotes
110. If the implementation imposes no practical limit on the length of file name strings, the value
of FILENAME_MAX should instead be the recommended size of an array intended to hold
a file name string. Of course, file name string contents are subject to other system-specific
constraints; therefore all possible strings of length FILENAME_MAX cannot be expected
to be opened successfully.
111. An implementation need not distinguish between text streams and binary streams. In such
an implementation, there need be no new-line characters in a text stream nor any limit to
the length of a line.
112. This is described in the Base Document as a file pointer. That term is not used in this
International Standard to avoid confusion with a pointer to an object that has type FILE.
113. Among the reasons the implementation may cause the rename function to fail are that the
file is open or that it is necessary to copy its contents to effectuate its renaming.
114. Files created using strings generated by the tmpnam function are temporary only in the
sense that their names should not collide with those generated by conventional naming rules
for the implementation. It is still necessary to use the remove function to remove such files
when their use is ended, and before program termination.
115. Additional characters may follow these sequences.
116. The primary use of the freopen function is to change the file associated with a standard
text stream (stderr, stdin, or stdout), as those identifiers need not be modifiable
lvalues to which the value returned by the fopen function may be assigned.
117. The buffer must have a lifetime at least as great as the open stream, so the stream should be
closed before a buffer that has automatic storage duration is deallocated upon block exit.
118. Note that 0 is taken as a flag, not as the beginning of a field width.
119. No special provisions are made for multibyte characters.
120. See "future library directions" (7.13.6).
121. These white-space characters are not counted against a specified field width.
122. No special provisions are made for multibyte characters.
123. See "future library directions" (7.13.6).
124. As the functions vfprintf, vsprintf, and vprintf invoke the va_arg macro, the
value of arg after the return is indeterminate.
125. An end-of-file and a read error can be distinguished by use of the feof and ferror
functions.
Either function returns a non-null value of type pointer to FILE only if it can
open a file whose name is fname with mode fmode and can associate it with
the stream controlled by the data object pointed to by fptr.
Use fptr only as an argument to the other stream I/O service functions
in the Standard C library. Don't try to peek inside the data object it points
to, not even if a particular implementation provides a declaration of FILE
within <stdio.h> that reveals some of the fields. Don't try to alter any of
the fields. Don't even try to copy the contents to another data object of type
FILE and use the copy instead, since implementations are permitted to
assume they know all valid addresses for the data objects that control
streams. (In other words, the address returned by fopen may be magic, not
just the values stored at that address.)
And once you close a stream, with a successful call to fclose (or with a
partially successful call to freopen), do not use the corresponding fptr value
again. The storage it points to may well be deallocated or recycled. (Don't
even copy the pointer value. Strictly speaking, an implementation can
bomb out just sniffing at a pointer that points to deallocated storage.)
type FILE
You don't have to know what is inside a FILE data object. All you know
is that it has some way to represent, among other things:
an end-of-file indicator that notes whether you attempt to read past the
end of the file
an error indicator that notes whether a read or write resulted in an
irrecoverable data transfer error
a file-position indicator that notes the next byte to read or write from the
file (and that may not be defined for certain kinds of files)
buffer information that notes the presence and size of any buffer area for
reads and writes
state information that determines whether a read or write may follow
As for naming files, your best bet is to avoid wiring any file names into
your code. (This is a good idea for a lot of reasons.) If you have to input or
construct a file name, use a buffer that can hold FILENAME_MAX characters.
(The macro is defined in <stdio.h>.) Assume only that a file name is a
conventional null-terminated string. Don't peek inside, and don't rule out
any characters as components of a file name.
If you must make up file names, such as for the names of your header
files, keep them simple. Any implementation will probably accept file
names that consist of one to six alphabetic characters, followed by a dot,
followed by a single alphabetic character. Some examples are "myhdr.h"
and "x.Y". Don't assume that the case of these characters is significant.
Don't assume that it is not. Don't expect these names to survive unscathed
as names within the operating system. The Standard C library may have to
map them to some other form to comply with local usage.
The file mode is a string that begins with one of three letters:
mode
r specifies that you want to open an existing file for reading.
w specifies that you want to open an existing file for writing and discard
its contents, or you want to create a new file that initially has no contents.
a is the same as w, except that an existing file keeps its contents, and with the added
proviso that before each write to the stream the file-position indicator is positioned at the
end of the file.
You can follow the mode with two optional characters, in either order:
+ specifies that you want also to write a file you open for reading (with
r), or you want also to read a file you open for writing (with w or a).
b specifies that you want to open a binary file rather than a text file.
You can write additional characters after these. Each implementation
defines what additional parameters, if any, you can write as part of mode.
A system may, for example, let you write:
fopen (fname, "w, lrecl=132,recfm=fixed")
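As a portable illustration of combining the mode letters, here is a minimal sketch that opens a binary file for update without losing existing contents. The helper name open_for_update is hypothetical, and the two-step strategy (try "r+b", then fall back to "w+b") is an assumption of this sketch rather than anything the library does for you.

```c
#include <assert.h>
#include <stdio.h>

/* Open a binary file for both reading and writing.
   Try "r+b" first so an existing file keeps its contents; fall back to
   "w+b", which creates the file (and would discard existing contents).
   Returns a null pointer if neither attempt succeeds. */
FILE *open_for_update(const char *fname)
{
    FILE *fp;

    assert(fname != NULL);
    fp = fopen(fname, "r+b");
    if (fp == NULL)
        fp = fopen(fname, "w+b");
    return fp;
}
```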
The function ungetc will work even with a stream that does not support
file-positioning requests, such as a stream from a terminal or pipeline. It
lets you put back a different character than you just read. It even lets you
put back a character before the beginning of a file, if you call the function
before the first read on a stream.
Implementations can vary in the number of characters you can push
back between reads, however. You can be sure of one character of pushback even if you intersperse calls to the formatted-input functions (such as
scanf), which also require one character of pushback. For a portable
program, don't assume that you can push back more than one character.
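One character of pushback is enough to look at the next character without consuming it. Here is a minimal sketch; the helper name peek_char is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Return the next character on stream without consuming it, or EOF.
   Relies only on the single character of pushback that is guaranteed. */
int peek_char(FILE *stream)
{
    int c;

    assert(stream != NULL);
    c = fgetc(stream);
    if (c != EOF)
        c = ungetc(c, stream);   /* returns the pushed-back character */
    return c;
}
```

Calling peek_char twice in a row is still portable, because each call reads the character back before pushing it again.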
The ungetc function interacts poorly with the other two mechanisms for
positioning files. Committee X3J11 spent quite a bit of time sorting out the
semantics of various sequences of calls to ungetc and fseek, for instance.
The general rule is that a character you push back with ungetc evaporates
after any other file-positioning request. But you should read the fine print
in the function descriptions to be sure that you get just the result you expect.
My advice is to avoid mixing ungetc calls with anything but read requests.
The functions fseek and ftell (and rewind) are the traditional file-positioning functions from the earliest days of C. They assume that you can
encode a file-position indicator as a long, as I indicated on page 230. This
happens to be true under UNIX, where files never exceed 2^32 bytes in length
and where you can position a file to an arbitrary byte. It is not necessarily
true on a system that supports larger files or that requires more elaborate
file-positioning information.
A text file, for example, may be structured into blocks and records within
blocks. Packing a block number, record number, and offset within record
into a long may require impossible tradeoffs for an arbitrary byte. For these
reasons, the function ftell may fail (returning -1), rather than return a
corrupted encoding of the file-position indicator.
You use fseek and ftell to advantage in randomly accessing the bytes
of a binary file (provided, of course, that the file is not too big). In this case,
the encoded file-position indicator is the offset in bytes from the start of the
file, which is byte zero. You can perform arithmetic on such file-position
indicators, or compute them out of whole cloth, and be sure to get just the
bytes you'd expect.
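For a binary file you can compute the offset yourself. The sketch below fetches one element from a file of int values by multiplying the element number by the element size; the helper name read_int_at is hypothetical, and the file layout (nothing but int values written with fwrite) is an assumption of the example.

```c
#include <assert.h>
#include <stdio.h>

/* Read the int stored as element number index (counting from zero) of a
   binary stream of ints. Returns 1 on success, 0 on failure. */
int read_int_at(FILE *stream, long index, int *out)
{
    assert(stream != NULL && out != NULL && index >= 0);
    /* For a binary file the offset is a plain byte count from the start. */
    if (fseek(stream, index * (long)sizeof *out, SEEK_SET) != 0)
        return 0;
    return fread(out, sizeof *out, 1, stream) == 1;
}
```

After a successful call, ftell reports the offset just past the element read, which is what you would compute by hand.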
The encoded file-position indicator for a text file, however, has a format
that varies among implementations. You use ftell to give you a magic
cookie that marks where the file is currently positioned. (It will return a
failure code if it cannot encode the current file-position indicator.) Later in
the execution of the same program, and before you close the file, you can
pass the same value to fseek to restore the file-position indicator to its
earlier value. Don't assume that you can save such values from one execution of a program to the next, or even from one file opening to the next. An
implementation may play really tricky games with the encoding.
If you are content merely to reposition files at places you have visited
earlier, you should use the third mechanism. The committee added the
functions fgetpos and fsetpos to support positioning within files of arbitrary size and structure. These functions work with values of type fpos_t,
defined in <stdio.h>, which can be as ornate a structure as an implementation needs to encode an arbitrary file-position indicator. Assume that
fpos_t is a structure type that you can only copy, pass as a function
argument, or receive as a function value. Even for a binary file, there is no
defined way to compare such values or perform arithmetic on them.
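As an illustration, here is a minimal sketch that memorizes a position with fgetpos, reads ahead, and then returns with fsetpos. The function name peek_line is hypothetical.

```c
#include <stdio.h>

/* Read one line into buf, then return the stream to where the line began.
   Returns 1 on success, 0 on any failure. */
int peek_line(FILE *str, char *buf, int size)
{
    fpos_t mark;    /* opaque: copy it and pass it around, nothing more */

    if (fgetpos(str, &mark) != 0)
        return 0;
    if (fgets(buf, size, str) == NULL)
        return 0;
    return fsetpos(str, &mark) == 0;
}
```

The code never inspects mark. It only hands the value back to the library, which is all that the type promises to support.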
buffer control
You can, in principle, exercise a certain amount of control over how the
I/O functions buffer data for a stream. You must realize, however, that
buffering is an optimization based on various conjectures about patterns
of I/O. These conjectures are usually correct, and many implementations
follow your advice. But they don't have to. An implementation is free to
ignore most of your buffering requests.
setvbuf
setbuf
Nevertheless, if you think a bigger buffer will improve performance or
a smaller buffer will save space, you can supply your own candidate buffer.
Call the function setvbuf after you open the file and before you perform
any other operations on the stream. (Avoid the older function setbuf, which
is less flexible.) You can specify whether I/O should be fully buffered,
buffered by text lines, or unbuffered. It just might make a difference in how
well your program performs.
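A sketch of such a request follows. The buffer size and the names big_buffer and use_big_buffer are assumptions made for illustration. The call must come before any other operation on the stream, and the buffer must remain in existence until the stream is closed.

```c
#include <stdio.h>

static char big_buffer[16 * 1024];  /* must outlive the open stream */

/* Offer big_buffer to a freshly opened stream and request full buffering.
   Returns 1 if setvbuf reports success, 0 if the library declines. */
int use_big_buffer(FILE *str)
{
    return setvbuf(str, big_buffer, _IOFBF, sizeof big_buffer) == 0;
}
```

Even a successful return is only a report that the request was accepted. It is no promise about how the implementation actually uses the buffer.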
function fflush
Sometimes you want buffering most of the time, but need to exercise
limited control over when output gets flushed to the outside world. The
function fflush ensures that one or more streams have their output flushed
when you call it. That can be useful for pushing out messages in an
interactive environment. It can also make a database more robust in the
teeth of occasional program crashes. Be warned, however, that fflush has
no defined effect on input streams in Standard C. You can't use this function
to reliably discard input before a prompt, as you can under UNIX.
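The prompting case can be sketched as follows. The function name prompt_line and its parameters are hypothetical. In an interactive program you would pass stdin and stdout.

```c
#include <stdio.h>

/* Write a prompt, force it out, then read the reply as one line.
   Returns buf on success, or NULL at end-of-file or on a read error. */
char *prompt_line(FILE *in, FILE *out, const char *prompt,
    char *buf, int size)
{
    fputs(prompt, out);
    fflush(out);        /* defined for output streams only */
    return fgets(buf, size, in);
}
```

Note that the sketch flushes only the output stream. It makes no attempt to discard pending input.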
The Standard C library disallows certain patterns of reads and writes.
The basic rule is that you cannot follow a read with a write, or a write with
a read, without an intervening file-positioning request. More specifically,
the intervening call must be to one of the functions fflush, fseek, fsetpos,
or rewind. A read that sets the end-of-file indicator can be followed immediately by a write. Curiously enough, however, a write preceded by an
implicit seek (to a file opened with a mode that begins with a) cannot
immediately follow a read. Figure 12.1 is a state-transition diagram that
summarizes these rules.
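Here is a minimal sketch of switching between reading and writing on a stream opened for update. The function name replace_first is hypothetical, and the relative seek by -1 assumes a binary stream, where offsets are byte counts.

```c
#include <stdio.h>

/* Replace the first occurrence of the byte from with the byte to in a
   binary stream opened for update.
   Returns 1 if a byte was replaced, 0 otherwise. */
int replace_first(FILE *str, int from, int to)
{
    int c;

    rewind(str);
    while ((c = getc(str)) != EOF)
        if (c == from) {
            /* Read then write: step back over the byte with a
               file-positioning call before writing. */
            if (fseek(str, -1L, SEEK_CUR) != 0)
                return 0;
            if (putc(to, str) == EOF)
                return 0;
            /* Write then read: flush (or reposition) before any
               later read on the stream. */
            return fflush(str) == 0;
        }
    return 0;
}
```

Every change of direction in the sketch is separated by a call to rewind, fseek, or fflush.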
My final piece of advice is to give the stream I/O functions all the
latitude you can. Don't try to control the buffering too closely. You may well
end up optimizing for one implementation and deoptimizing for all others.
And don't push your luck by aggressively mixing reads, writes, and various
file-positioning operations. It is easy to break an implementation if you
push it in this area. It is even easier to break your own program.
Figure 12.1: States of a Stream
Text files have three significant advantages over binary files:
They can be generated or altered by mere mortals such as you and me.
They can be written to a printer or terminal with a large likelihood that
human beings can understand the display.
They can be shared between programs that share few assumptions about
how data is encoded.
print functions
The process of contriving a text representation of encoded data is called
output formatting. The print functions (all with print as part of their names
and all declared in <stdio.h>) produce formatted output. To use the print
functions, you must know how to call them, how they interpret a format,
and what conversions they will perform for you. The Standard C library
provides six different print functions, declared as follows:
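int fprintf(FILE *stream, const char *format, ...);
int printf(const char *format, ...);
int sprintf(char *s, const char *format, ...);
int vfprintf(FILE *stream, const char *format, va_list arg);
int vprintf(const char *format, va_list arg);
int vsprintf(char *s, const char *format, va_list arg);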
Let's say, for example, that you want to write formatted messages to
stderr, each preceded by a standard prefix. You also want to log each error
on a disk file. You can do all this by writing a function eprint that uses
vfprintf to perform the actual output:
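What follows is a sketch of one way to write such a function, not a reproduction of the original listing. The prefix text and the log_file variable are assumptions made for illustration.

```c
#include <stdarg.h>
#include <stdio.h>

static FILE *log_file = NULL;   /* hypothetical log stream; NULL disables logging */

/* Write a prefixed message to stderr and, if enabled, to the log file.
   Returns the character count from the stderr conversion, or a
   negative value on an output error. */
int eprint(const char *fmt, ...)
{
    va_list ap;
    int n;

    fputs("ERROR: ", stderr);
    va_start(ap, fmt);
    n = vfprintf(stderr, fmt, ap);
    va_end(ap);
    if (log_file != NULL) {
        /* Restart the argument list; its value is indeterminate
           once vfprintf has walked it. */
        fputs("ERROR: ", log_file);
        va_start(ap, fmt);
        vfprintf(log_file, fmt, ap);
        va_end(ap);
    }
    return n;
}
```

You call eprint just as you would printf, as in eprint("can't open %s\n", name).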
print formats
The mainspring of every print function call is the format string you
specify for it. You can (and should) think of a format string as a program in
printing literal text
Not every part of a format string calls for the conversion of an additional
argument. In fact, only certain conversion specifications gobble arguments.
conversion specification
flags
field width
precision
print conversion specifiers
character
decimal
floating-point
g - converts the double argument using either the e or the f form. If the
precision p is unspecified, it sets p to 6. If p is 0, it sets p to 1. It chooses
the f form if the e form would
yield an exponent in the inclusive range [-4, p-1]. It omits trailing zeros
from any fraction. It omits the decimal point if no fraction digits remain
and you specify no # flag.
Lg - converts the long double argument the same as g.
G - converts the double argument the same as g, except that it replaces
the e before any exponent with E.
LG - converts the long double argument the same as G.
i, hi, li - are the same as d, hd, ld, respectively.
n - stores the cumulative number of generated characters in the data
object pointed to by the pointer to int argument.
hn - is the same as n for a pointer to short argument.
ln - is the same as n for a pointer to long argument.
o - converts the int argument to unsigned int and then to an unsigned
sequence of at least p octal digits. Default precision is 1.
ho - converts the int argument to unsigned short, then the same as o.
lo - converts the long argument to unsigned long, then the same as o.
p - converts the pointer to void argument to an implementation-defined
sequence of characters (such as the hexadecimal representation of a
storage address).
s - generates one character for each of the (non-null) characters stored
in the string pointed to by the pointer to char argument. If you specify a
precision, it generates no more than p characters.
u - converts the int argument to unsigned int and then to an unsigned
sequence of at least p decimal digits. Default precision is 1.
hu - converts the int argument to unsigned short, then the same as u.
lu - converts the long argument to unsigned long, then the same as u.
x - converts the int argument to unsigned int, then to an unsigned
sequence of at least p hexadecimal digits. It represents digit values 10
through 15 by the letters a through f. Default precision is 1.
hx - converts the int argument to unsigned short, then the same as x.
lx - converts the long argument to unsigned long, then the same as x.
X - converts the int argument the same as x, except that it represents
digit values 10 through 15 by the letters A through F.
hX - converts the int argument to unsigned short, then the same as X.
lX - converts the long argument to unsigned long, then the same as X.
% - converts no argument. It generates a per cent character.
Conversion specifiers handle most of your formatting needs. Where they
fall short, you can get what you want in two steps. First, generate text into
a buffer using sprintf and modify it there. Then write the text using, say,
printf. See the function _Fmtval on page 92 for a practical example.
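The two-step approach can be sketched as follows. The function fmt_money is hypothetical and is far simpler than a locale-aware formatter. It converts digits with sprintf, then edits the text to add comma separators.

```c
#include <stdio.h>

/* Format an amount in cents as dollars with comma separators, so that
   123456789 becomes "1,234,567.89". buf must hold at least 32 characters. */
char *fmt_money(char *buf, unsigned long cents)
{
    char digits[24];
    int n, i, j = 0;

    /* Step 1: let sprintf do the conversion into a scratch buffer. */
    n = sprintf(digits, "%lu", cents / 100);
    /* Step 2: copy the digits, adding a comma before each group of three. */
    for (i = 0; i < n; ++i) {
        if (i > 0 && (n - i) % 3 == 0)
            buf[j++] = ',';
        buf[j++] = digits[i];
    }
    sprintf(buf + j, ".%02lu", cents % 100);
    return buf;
}
```

You then write the finished text with an ordinary call such as printf("%s\n", fmt_money(buf, total)).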
formatted input
Not all programs read input. Those that do can read data directly, using
an assortment of standard library functions, and interpret the data as they
see fit. Converting small integers and text strings for internal consumption
are both exercises that most C programmers perform easily. It is only when
you must convert floating-point values, or recognize a complex mix of data
fields, that standard scanning functions begin to look attractive.
Even then the choice is not always clear. The usability of a program
depends heavily on how tolerant it is to variations in user input. You as a
programmer may not agree with the conventions enforced by the standard
formatted-input functions. You may not like the way they handle errors. In
short, you are much more likely to want to roll your own input scanner.
Obtaining formatted input is not simply the inverse of producing formatted output. With output, you know what you want the program to
generate next and it does it. With input, however, you are more at the mercy
of the person producing the input text. Your program must scan the input
text for recognizable patterns, then parse it into separate fields. Only then
can it determine what to do next.
Not only that, the input text may contain no recognizable pattern. You
must then decide how to respond to such an "error." Do you print a nasty
message and prompt for fresh input? Do you make an educated guess and
bull ahead? Or do you abort the program? Various canned input scanners
have tried all these strategies. No one of them is appropriate for all cases.
It is no surprise, therefore, that the history of the formatted input
functions in C is far more checkered than for the formatted output functions. Most implementations of C have long agreed on the basic properties
of printf and its buddies. By contrast, scanf and its ilk have changed
steadily over the years and have proliferated dialects. Committee X3J11 had
to spend considerable time sorting out the proper behavior of formatted
input.
scan functions
The scan functions are so called because they all have scan as part of their
names. These are the functions that scan input text and convert text fields
to encoded data. All are declared in <stdio.h>. To use the scan functions,
you must know how to call them, how to specify conversion formats, and
what conversions they will perform for you. The Standard C library provides three different scan functions, declared as follows:
int fscanf(FILE *stream, const char *format, ...);
int scanf(const char *format, ...);
int sscanf(const char *src, const char *format, ...);
The function fscanf obtains characters from the stream stream. The
function scanf obtains characters from the stream stdin. Both stop scanning input early if an attempt to obtain a character sets the end-of-file or
error indicator for the stream. The function sscanf obtains characters from
the null-terminated string beginning at src. It stops scanning input early if
it encounters the terminating null character for the string.
All the scan functions accept a variable argument list, just like the print
functions. And just like the print functions, you had better declare any scan
functions before you use them by including <stdio.h>.
All the functions accept a format argument, which is a pointer to a
read-only null-terminated string. The format tells the function what additional arguments to expect, if any, and how to convert input fields to values
to be stored. (A typical argument is a pointer to a data object that receives
the converted value.) It also specifies any literal text or white-space you
want to match between converted fields. If scan formats sound remarkably
like print formats, the resemblance is quite intentional. But there are also
important differences. I discuss scan formats in considerable detail below.
All the scan functions return a count of the number of text fields
converted to values that are stored. If any of the functions stops scanning
early for one of the reasons cited above, however, it returns the value of the
macro EOF, also defined in <stdio.h>. Since EOF must have a negative value,
you can easily distinguish it from any valid count, including zero. Note,
however, that you can't tell how many values were stored before an early
stop. If you need to locate a stopping point more precisely, break your scan
call into multiple calls.
A scan function can also stop scanning early because it obtains a character that it is unprepared to deal with. In this case, the function returns the
cumulative count of values converted and stored. You can determine the
largest possible return value for any given call by counting all the conversions you specify in the format. The actual return value will be between
zero and this maximum value, inclusive.
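Checking the return value might look like the sketch below. The function name parse_point is hypothetical. The format specifies three conversions, so the largest possible return value is 3.

```c
#include <stdio.h>

/* Parse up to three coordinates from a line of text. Returns the number
   of values converted and stored (0 to 3), or EOF if the text runs out
   before the first conversion. */
int parse_point(const char *line, double *x, double *y, double *z)
{
    return sscanf(line, "%lf%lf%lf", x, y, z);
}
```

A caller should accept the result only when the return value equals 3. Any smaller value, or EOF, leaves at least one of the three data objects unset.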
pushing back characters
When either fscanf or scanf obtains such an unexpected character, it
pushes it back to the input stream. (It also pushes back the first character
beyond a valid field when it has to peek ahead to determine the end of the
field.) How it does so is similar to calling the function ungetc. There is a
very important difference, however. You cannot portably push back two
characters to a stream with successive calls to ungetc (and no other intervening operations on the stream). You can portably follow an arbitrary call
to a scan function with a call to ungetc for the same stream.
What this means effectively is that the one-character pushback limit
imposed on ungetc is not compromised by calls to the scan functions. Either
the implementation guarantees two or more characters of pushback to a
stream or it provides separate machinery for the scan functions.
The scan functions push back at most one character. Say, for example,
that you try to convert the invalid field 123EASY as a floating-point value.
Even the subfield 123E is invalid, since the conversion requires at least one
exponent digit. The subfield 123E is consumed and the conversion fails. No
value is stored and the scan function returns. The next character to read
from the stream is A. This behavior matters most for floating-point fields,
which have the most ornate syntax. Other conversions can usually digest
all the characters in the longest subfield that looks valid.
Many of these operations generate values that the scan function stores
in various data objects that you specify with pointer arguments. Any such
arguments must appear in the variable argument list, in the order in which
the format string calls for them. For example:
sscanf("thx 1138", "%s%2o%d", a, &b, &c);
stores the string "thx" in the char array a, the value 9 (octal 11) in the int
data object b, and the value 38 in the int data object c. It is up to you to
ensure that the type of each actual argument pointer matches the type
expected by the scan function. Standard C has no way to check the types
of additional arguments in a variable argument list.
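The example above can be checked with a sketch like the following. The wrapper scan_thx is hypothetical. It bounds the string conversion with a field width and uses an unsigned int for the octal field, since that is the type the o conversion expects.

```c
#include <stdio.h>

/* Scan "thx 1138" as a string, a two-character octal field, and a
   decimal field. Returns the count of stored values reported by sscanf. */
int scan_thx(char a[8], int *b, int *c)
{
    unsigned int octal = 0;     /* %o stores through a pointer to unsigned int */
    /* %7s keeps the string from overrunning the eight-element array a. */
    int n = sscanf("thx 1138", "%7s%2o%d", a, &octal, c);

    *b = (int)octal;
    return n;
}
```

The octal field consumes only the characters 11, which encode the value 9. The remaining characters 38 are left for the decimal conversion.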
Not every part of a format string calls for the conversion of a field and
the consumption of an additional argument. In fact, only certain conversion
specifications gobble arguments. Each conversion specification begins with
the escape character % and matches one of the patterns shown below. The
scan functions treat everything else either as white-space or as literal text.
scanning white-space
White-space in a scan format is whatever the function isspace, declared
in <ctype.h>, says it is. That can change if you call the function setlocale,
declared in <locale.h>. In the "C" locale, white-space is what you have
learned to know and love. (See Chapter 2: <ctype.h>.)
scanning literal text
scan conversion specifications
assignment suppression
field width
scan conversion specifiers
scanning numeric fields
character
decimal
floating-point
general integer
character count
octal
pointer to void
string
unsigned decimal
hexadecimal
per cent
scan sets
limitations of scan functions
p - converts the
BUFSIZ
EOF
FILENAME_MAX
FOPEN_MAX
_IOFBF
_IOLBF
_IONBF
L_tmpnam
NULL
SEEK_CUR
SEEK_END
SEEK_SET
TMP_MAX
stderr
stdin - Use this macro to designate the standard input stream.
stdout - Use this macro to designate the standard output stream.
FILE - You declare a pointer to FILE to store the value returned on a
successful fopen or freopen call. You then use this value as an argument to
various functions that manipulate the stream. You never have occasion to
declare a data object of type FILE, however. The Standard C library provides
all such creatures. Treat the contents of a FILE data object as a black box.
Use the functions declared in <stdio.h> to manipulate its contents.
fpos_t - This is the type of the value returned by fgetpos. It can
represent an arbitrary file-position indicator for any file. That means you
can copy the value and pass it as an argument on a function call, but you
can't perform arithmetic on it. Pass the value to fsetpos to reposition the
file at the point you memorized. Note that the older functions ftell and
fseek can perform much the same service, but they can also fail for certain
files (particularly large ones). Use fgetpos and fsetpos wherever possible.
size_t - See page 219.
clearerr - Use this function to clear the end-of-file and error indicators
on a stream. You need it only if you also use the functions feof or ferror.
fclose - If you open a file by calling fopen, you should probably close
it by a later call to fclose. A program that manipulates an arbitrary number
of files may otherwise exceed the maximum number of files that may be
simultaneously open. (See FOPEN_MAX above.) At program termination, the
Standard C library closes any files that are still open. That is the customary
way to close the three standard streams.
feof - Most functions that read a stream return a special value, such as
EOF, to indicate that the read encountered end-of-file. Should you miss this
opportunity to check, use the function feof. It reports the state of the
end-of-file indicator for a stream. A file-positioning request clears this
indicator if it apparently moves the file-position indicator away from
end-of-file. So too does a call to clearerr.
ferror - A read or write to a stream can fail for any number of reasons.
The error indicator in a stream records all such failures. To check whether
an error has occurred, call ferror. A call to clearerr or rewind clears this
indicator.
fflush - You can ensure that a stream retains no buffered output by
calling fflush for a stream. That may be important if you are writing
prompting messages to an output stream and reading responses from an
input stream. You want to ensure that the person interacting with the
program knows what sort of reply the program expects next. Call
fflush(NULL) to flush all output streams. That prepares a program for a
subsequent loss of control. (The program may be about to execute undebugged code. Or it may have just invited the user to turn off the computer.)
The Standard C library flushes all output streams at program termination.
fgetc - You call this function to obtain the next character from an input
stream. (See page 253.) All functions that read a stream behave as if they
call fgetc to obtain each character. getc has the same specification as fgetc
but is far more likely to have a masking macro that dramatically improves
performance. As a rule, therefore, you should use getc instead of fgetc.
fgetpos - Use this function to memorize a position in a file to which
you want to later return. It returns a value of type fpos_t, described above.
fgets - Use this function to read lines of text from a stream. It stops
reading after it reads and stores a newline or when the buffer you specify
is full. After any successful read, the contents of the buffer are null-terminated. Do not use the function gets in place of this function.
fopen - This is the function you use to open a file. I discuss it at length
starting on page 252. Use freopen to redirect a standard stream.
fprintf - This is the formatted output function that writes to the output
stream you specify. See the description starting on page 257.
fputc - You call this function to write a character to an output stream.
(See page 254.) All other functions that write to a stream behave as if they
call fputc to deliver each character. putc has the same specification as fputc
but is far more likely to have a masking macro that dramatically improves
performance. As a rule, therefore, you should use putc instead of fputc.
fputs - Use this function to write characters from a null-terminated
string to a stream. Unlike puts, fputs does not append a newline to whatever it writes. That makes it more useful for assembling lines of text or for
writing binary data.
fread - Use this function to read binary data into an array data object
or to read up to a fixed number of characters from any stream. If the size
(second) argument is greater than one, you cannot determine whether the
function also read up to size - 1 additional characters beyond what it
reports. As a rule, you are better off calling the function as fread(buf, 1,
size * n, stream) instead of fread(buf, size, n, stream).
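The difference between the two calling styles shows up when a file ends partway through an element. The sketch below uses a hypothetical wrapper named read_bytes. It reports every byte delivered, where an element count would silently drop a trailing partial element.

```c
#include <stdio.h>

/* Fill an array of n elements of the given size, counting bytes rather
   than elements. Returns the number of bytes actually read. */
size_t read_bytes(void *buf, size_t size, size_t n, FILE *str)
{
    return fread(buf, 1, size * n, str);
}
```

On a file of ten bytes, a request for three four-byte elements yields 10 from read_bytes, while fread(buf, 4, 3, str) can report only two whole elements.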
freopen - You use freopen only to recycle a stream that is already open.
It may be convenient, for example, to redirect stdin or stdout to a different
file under some circumstances. Most of the time, however, you will find
that fopen is the function to use.
fscanf - This is the formatted input function that reads from the input
stream you specify. See the description starting on page 263.
fseek - Use this function to modify the file-position indicator for a
stream. You can memorize a position in a file by executing offset =
ftell(stream). Return to that position later by executing fseek(stream,
offset, SEEK_SET). fseek is more useful with a binary stream. In that case,
the offset (second) argument is a long byte displacement within the file.
The mode (third) argument must have one of the values SEEK_CUR, SEEK_END,
or SEEK_SET, described above.
fsetpos
ftell
fwrite
getc
getchar
gets
perror
printf
putc
putchar
puts
remove
rename
rewind
scanf - This is the formatted input function that reads from the standard input stream. It is the most widely used of the scan functions.
setbuf - Use setvbuf instead of this function to get more control.
setvbuf - As a rule, it is best to let the Standard C library decide how
to buffer input/output for you. If you are certain that you want no buffering
or line-at-a-time buffering, then use this function to initialize the stream
properly. Call setvbuf immediately after you open the stream. Almost any
operation on the stream will preempt your right to choose a buffering
strategy. Should you specify your own buffer with this call, don't assume
that the stream will actually use it. And never alter the contents of the buffer
while the stream is open. The mode (third) argument must have one of the
values _IOFBF, _IOLBF, or _IONBF, described above. Also see the macro
BUFSIZ, described above.
sprintf - This is the formatted output function that writes a null-terminated string to the buffer you specify. It is the only way you can convert
encoded values to text without writing to a stream. Note that you cannot
directly specify the maximum number of characters that sprintf stores. Be
wary of conversions that can generate enough characters to store beyond
the end of the buffer. See fprintf, above.
sscanf - This is the formatted input function that reads a null-terminated string from the buffer you specify. You can use it to scan the same
sequence of characters with several different formats, until you find a scan
that succeeds.
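Scanning the same text with several formats can be sketched as follows. The function parse_date and the two date layouts are assumptions made for illustration.

```c
#include <stdio.h>

/* Accept a date written as YYYY-MM-DD or as MM/DD/YYYY by trying each
   format in turn on the same text.
   Returns 1 on success, 0 if neither format matches. */
int parse_date(const char *s, int *y, int *m, int *d)
{
    if (sscanf(s, "%d-%d-%d", y, m, d) == 3)
        return 1;
    /* A failed attempt may already have stored some values, so the
       second scan rewrites all three. */
    if (sscanf(s, "%d/%d/%d", m, d, y) == 3)
        return 1;
    return 0;
}
```

Because each attempt rescans the string from its start, a failed scan costs nothing but time. That is not true when you scan a stream.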
tmpfile - Use tmpfile instead of tmpnam wherever possible. The former
opens the file for you and arranges to have it closed and removed on
program termination. The latter requires you to assume more of these
responsibilities.
tmpnam - Use this function to obtain one or more temporary file names
only if tmpfile doesn't meet your needs. You may want to open the file in
a mode other than "wb+", for example. You may have to open and close the
same file repeatedly. Or you may want to rename the file before program
termination. See the macro TMP_MAX, described above.
ungetc - Use this function in conjunction with the read functions only.
The interaction of ungetc with the file-positioning functions is delicate. You
can push back a different character than the last one read. You can even
push back a character at beginning-of-file. But you cannot portably push
back more than one character between calls to read functions.
vfprintf - Use this function to build special versions of fprintf, as
described on page 258.
vprintf - Use this function to build special versions of printf, as
described on page 258.
vsprintf - Use this function to build special versions of sprintf, as
described on page 258.
#define _NULL    (void *)0  /* value for NULL */
#define _FNAMAX  64         /* value for FILENAME_MAX */
#define _FOPMAX  32         /* value for FOPEN_MAX */
#define _TNAMAX  16         /* value for L_tmpnam */
type FILE
The file stdio.h contains a few other mysteries which shall become clear
in time. For now, I concentrate on the type definition FILE. Its members are:
_Mode - a set of status bits for the stream, defined below
_Handle - the handle, or file descriptor, returned by the operating
system for the opened file
_Buf - a pointer to the start of the stream buffer, or a null pointer if no
buffer has been allocated
_Bend - a pointer to the first character beyond the end of the buffer,
undefined if _Buf is a null pointer
_Next - a pointer to the next character to read or write, never a null pointer
_Rend - a pointer to the first character beyond the end of data to be read,
getc
putc
str->_Wend is always true if space is available in the buffer to write characters to the stream. An expression such as str->_Wend = str->_Buf, for
example, disallows writes to the buffer from these macros.
The functions that you call to read and write streams make more extensive tests. A read function, for example, distinguishes a variety of conditions such as: characters are available, buffer currently exhausted, end-of-file encountered, buffer not yet allocated, reading currently disallowed,
and reading never allowed. The functions rely heavily on the various
indicators in the member _Mode to make those distinctions.
header "xstdio.h"
Only functions within the Standard C library need be privy to the
meaning of these indicators. For that reason, and others, I created the
internal header "xstdio.h". All the functions described in this chapter
include "xstdio.h". It defines macros for the stream-mode indicators. It
includes <stdio.h> and declares all the internal functions used to implement the capabilities of <stdio.h>. It also defines a number of macros and
types of interest only to the formatted input and output functions.
mode indicators
Unlike <stdio.h>, the header "xstdio.h" contains too many distractions
to present at this point. I show you what goes into it as the need arises, then
show you the whole file on page 322. Here, for example, are the macro
names for the various indicators in the member _Mode. Each is defined as a
value with a different bit set, as in 0x1, 0x2, 0x4, 0x8, and so on. The actual
values are unimportant, so I omit them here:
_MOPENR - set if file is open for reading
_MOPENW - set if file is open for writing
_MOPENA - set if all writes append to end of file
_MTRUNC - set if existing file was truncated on open (not used after open)
_MCREAT - set if a new file can be created on open (not used after open)
_MBIN - set if stream is binary, not set if stream is interpreted as text
_MALBUF - set if the buffer must be freed on close
_MALFIL - set if the FILE data object must be freed on close
_MEOF - the end-of-file indicator
_MERR - the error indicator
_MLBF - set if line buffering in effect
_MNBF - set if no buffering should occur
_MREAD - set if a read has occurred since last file-positioning operation
_MWRITE - set if a write has occurred since last file-positioning operation
These macros have private names - beginning with an underscore and an
uppercase letter - even though they don't have to. As I developed the
library, I found myself moving them in and out of <stdio.h>. Some versions
of the macros visible to user programs used these macro names, later
versions did not. In the end, I left the names in this form as insurance. You
may find occasion to introduce macros that manipulate the indicators in
the member _Mode.
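To show how one-bit indicators of this kind are tested and cleared, here is a small sketch. Everything in it is hypothetical: the X_ names, their values, and the struct xfile type merely stand in for the private macros and the _Mode member.

```c
/* Hypothetical indicator values, each with a different bit set. */
enum { X_OPENR = 0x1, X_OPENW = 0x2, X_EOF = 0x100, X_ERR = 0x200 };

/* Hypothetical stand-in for a FILE with only its mode member. */
struct xfile {
    unsigned short mode;
};

/* Test a single indicator by masking it out of the mode word. */
int x_feof(const struct xfile *f)
{
    return (f->mode & X_EOF) != 0;
}

int x_ferror(const struct xfile *f)
{
    return (f->mode & X_ERR) != 0;
}

/* Clear both indicators at once, leaving all other bits alone. */
void x_clearerr(struct xfile *f)
{
    f->mode = (unsigned short)(f->mode & ~(X_EOF | X_ERR));
}
```

Keeping every indicator in one word lets a function test or clear several of them with a single mask.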
Figure 12.2: stdio.h
/* stdio.h standard header */
#ifndef _STDIO
#define _STDIO
#ifndef _YVALS
#include <yvals.h>
#endif
        /* macros */
#define NULL            _NULL
#define _IOFBF          0
#define _IOLBF          1
#define _IONBF          2
#define BUFSIZ          512
#define EOF             -1
#define FILENAME_MAX    _FNAMAX
#define FOPEN_MAX       _FOPMAX
#define L_tmpnam        _TNAMAX
#define TMP_MAX         32
#define SEEK_SET        0
#define SEEK_CUR        1
#define SEEK_END        2
#define stdin           _Files[0]
#define stdout          _Files[1]
#define stderr          _Files[2]
        /* type definitions */
#ifndef _SIZET
#define _SIZET
typedef _Sizet size_t;
#endif
typedef struct {
    unsigned long _Off;     /* system dependent */
    } fpos_t;
typedef struct {
    unsigned short _Mode;
    short _Handle;
    unsigned char *_Buf, *_Bend, *_Next;
    unsigned char *_Rend, *_Rsave, *_Wend;
    unsigned char _Back[2], _Cbuf, _Nback;
    char *_Tmpnam;
    } FILE;
        /* declarations */
void clearerr(FILE *);
int fclose(FILE *);
int feof(FILE *);
int ferror(FILE *);
int fflush(FILE *);
int fgetc(FILE *);
int fgetpos(FILE *, fpos_t *);
char *fgets(char *, int, FILE *);
FILE *fopen(const char *, const char *);
int fprintf(FILE *, const char *, ...);
int fputc(int, FILE *);
int fputs(const char *, FILE *);
size_t fread(void *, size_t, size_t, FILE *);
FILE *freopen(const char *, const char *, FILE *);
int fscanf(FILE *, const char *, ...);
int fseek(FILE *, long, int);
int fsetpos(FILE *, const fpos_t *);
long ftell(FILE *);
size_t fwrite(const void *, size_t, size_t, FILE *);
int getc(FILE *);
int getchar(void);
char *gets(char *);
void perror(const char *);
int printf(const char *, ...);
int putc(int, FILE *);
int putchar(int);
int puts(const char *);
int remove(const char *);
int rename(const char *, const char *);
void rewind(FILE *);
int scanf(const char *, ...);
void setbuf(FILE *, char *);
int setvbuf(FILE *, char *, int, size_t);
int sprintf(char *, const char *, ...);
int sscanf(const char *, const char *, ...);
FILE *tmpfile(void);
char *tmpnam(char *);
int ungetc(int, FILE *);
int vfprintf(FILE *, const char *, char *);
int vprintf(const char *, char *);
int vsprintf(char *, const char *, char *);
long _Fgpos(FILE *, fpos_t *);
int _Fspos(FILE *, const fpos_t *, long, int);
extern FILE *_Files[FOPEN_MAX];
        /* macro overrides */
#define fgetpos(str, ptr)       (int)_Fgpos(str, ptr)
#define fseek(str, off, way)    _Fspos(str, _NULL, off, way)
#define fsetpos(str, ptr)       _Fspos(str, ptr, 0L, 0)
#define ftell(str)              _Fgpos(str, _NULL)
#define getc(str)       ((str)->_Next < (str)->_Rend \
    ? *(str)->_Next++ : (getc)(str))
#define getchar()       (_Files[0]->_Next < _Files[0]->_Rend \
    ? *_Files[0]->_Next++ : (getchar)())
#define putc(c, str)    ((str)->_Next < (str)->_Wend \
    ? (*(str)->_Next++ = c) : (putc)(c, str))
#define putchar(c)      (_Files[1]->_Next < _Files[1]->_Wend \
    ? (*_Files[1]->_Next++ = c) : (putchar)(c))
#endif
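The masking-macro technique used for getc and putc can be sketched in isolation. The type struct src and the name next_byte are hypothetical. They stand in for a FILE, with its _Next and _Rend members, and for getc.

```c
/* Hypothetical in-memory source, standing in for part of a FILE. */
struct src {
    const unsigned char *next, *end;
    int slow_calls;     /* counts trips through the real function */
};

/* The real function. The parentheses around its name keep a macro of
   the same name from expanding here, wherever the macro is defined. */
int (next_byte)(struct src *s)
{
    ++s->slow_calls;
    return s->next < s->end ? *s->next++ : -1;
}

/* The masking macro: take the fast path inline when a byte is available,
   otherwise call the real function. The argument s is evaluated more
   than once, so it must have no side effects. */
#define next_byte(s) \
    ((s)->next < (s)->end ? *(s)->next++ : (next_byte)(s))
```

A call written next_byte(&s) uses the macro, while (next_byte)(&s) always reaches the function. That is the same device the header uses when it writes (getc)(str).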
The indicators are actually the union of two sets. One is the set of
indicators that determines how to open a file. The other is the set of
indicators that helps record the state of the stream. Since the two sets
partially overlap, I chose to keep them all in one "space" of bit encodings.
A tidier implementation might well choose to separate the two uses. You
might also want to define two sets of values if you are starved for bits in
_Mode. In either case, you must add code to translate between the two
representations.
The best way to see how the library uses a FILE data object is to track
one through its lifetime. Figure 12.3 shows the file fopen.c. It defines the
function fopen that you call to open a file by name. That function first looks
for an idle entry in the static array of FILE pointers called _Files. It contains
FOPEN_MAX elements. If all of these point to FILE data objects for open files,
all subsequent open requests fail.
Figure 12.4 shows the file xfiles.c that defines the _Files data object. It
defines static instances of FILE data objects for the three standard streams.
Each is initialized to be open with appropriate parameters. I have wired in
the handles 0 for standard input, 1 for standard output, and 2 for standard
error. This is a widely used convention, inherited from UNIX. You may
have to alter or map these values.
Elements beyond the first three in _Files are initialized to null pointers.
Should fopen discover one of these, the function allocates a FILE data object
and marks it to be freed on close. fopen discovers a closed standard stream
by observing a non-null element of _Files that points at a FILE data object
whose member _Mode is zero.
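The slot search described above can be sketched in isolation. This is only an illustration under stated assumptions: `Slot`, `slots`, `SLOT_MAX`, and `M_ALLOCATED` are hypothetical stand-ins for FILE, _Files, FOPEN_MAX, and the "free on close" indicator, not the library's own definitions.

```c
#include <stddef.h>
#include <stdlib.h>

#define SLOT_MAX 8          /* hypothetical stand-in for FOPEN_MAX */
#define M_ALLOCATED 0x1     /* hypothetical "free on close" indicator */

typedef struct {
    unsigned mode;          /* zero means the stream is closed */
} Slot;

static Slot *slots[SLOT_MAX];   /* null entries have never been used */

/* Return a usable slot, or NULL if every slot holds an open stream. */
Slot *find_slot(void)
{
    size_t i;

    for (i = 0; i < SLOT_MAX; ++i) {
        if (slots[i] == NULL) {         /* never used: allocate one */
            Slot *s = malloc(sizeof *s);

            if (s == NULL)
                return NULL;
            s->mode = M_ALLOCATED;      /* remember to free on close */
            slots[i] = s;
            return s;
        }
        if (slots[i]->mode == 0)        /* preallocated but closed */
            return slots[i];
    }
    return NULL;
}
```

A null entry and a non-null entry with a zero mode are both "idle", but only the first needs storage allocated for it.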
fopen calls on the internal function _Foprep to complete the process of
opening a file. Figure 12.5 shows the file freopen.c. The function freopen
also calls this internal function. Note how it records the state of the indicator
_MALFIL until after fclose has closed the file currently associated with the
stream. The one operation that freopen does not want fclose to perform is
to free the FILE data object.
You may as well see fclose too, at this point. Figure 12.6 shows the file
fclose.c. It undoes the work of the file-opening functions in a fairly
obvious fashion. The one bit of magic is where it calls the function _Fclose
to close the file associated with the stream.
Figure 12.7 shows the file xfoprep.c that defines the function _Foprep.
It parses the mods (second) argument to fopen or freopen, at least as much
as it can understand, and initializes members of the FILE data object
accordingly. In the end, however, it must call on some outside agency to
finish the job of opening the file. _Foprep passes on the file name, the
encoded indicators, and whatever is left of mods to a function called _Fopen.
I describe _Fopen very shortly.
_Fclose and _Fopen are the first of several low-level primitives that stand
between <stdio.h> and the outside world. Each must perform a standardized
function for the Standard C library. Each must also be reasonably
easy to tailor for the divergent needs of different operating systems. This
implementation has nine functions in <stdio.h> that must be tailored to
each operating system. Three are standard functions:
remove - Remove a named file.
rename - Change the name of a file.
tmpnam - Construct a reasonable name for a temporary file.
Figure 12.3: fopen.c
/* fopen function */
#include <stdlib.h>
#include "xstdio.h"

FILE *(fopen)(const char *name, const char *mods)
	{	/* open a file */
	FILE *str;
	size_t i;

	for (i = 0; i < FOPEN_MAX; ++i)
		if (_Files[i] == NULL)
			{	/* setup empty _Files[i] */
			str = malloc(sizeof (FILE));
			if (str == NULL)
				return (NULL);
			_Files[i] = str;
			str->_Mode = _MALFIL;
			break;
			}
		else if (_Files[i]->_Mode == 0)
			{	/* setup preallocated _Files[i] */
			str = _Files[i];
			break;
			}
	if (FOPEN_MAX <= i)
		return (NULL);
	return (_Foprep(name, mods, str));
	}
Figure 12.4: xfiles.c
/* _Files data object */
#include "xstdio.h"

		/* standard error buffer */
static unsigned char ebuf[80];

		/* the standard streams */
static FILE sin = {	/* standard input */
	_MOPENR, 0,
	NULL, NULL, &sin._Cbuf,
	&sin._Cbuf, NULL, &sin._Cbuf, };
static FILE sout = {	/* standard output */
	_MOPENW, 1,
	NULL, NULL, &sout._Cbuf,
	&sout._Cbuf, NULL, &sout._Cbuf, };
static FILE serr = {	/* standard error */
	_MOPENW|_MNBF, 2,
	ebuf, ebuf + sizeof (ebuf), ebuf,
	ebuf, NULL, ebuf, };

		/* the array of stream pointers */
FILE *_Files[FOPEN_MAX] = {&sin, &sout, &serr};
Figure 12.5: freopen.c
/* freopen function */
#include <stdlib.h>
#include "xstdio.h"
Figure 12.6: fclose.c
/* fclose function */
#include <stdlib.h>
#include "xstdio.h"
#include "yfuns.h"

int (fclose)(FILE *str)
	{	/* close a stream */
	int stat = fflush(str);
	int malfil = str->_Mode & _MALFIL;	/* test before _Mode is cleared */

	if (str->_Mode & _MALBUF)
		free(str->_Buf);
	str->_Buf = NULL;
	if (0 <= str->_Handle && _Fclose(str))
		stat = EOF;
	if (str->_Tmpnam)
		{	/* remove temporary file */
		if (remove(str->_Tmpnam))
			stat = EOF;
		free(str->_Tmpnam);
		str->_Tmpnam = NULL;
		}
	str->_Mode = 0;
	str->_Next = &str->_Cbuf;
	str->_Rend = &str->_Cbuf;
	str->_Wend = &str->_Cbuf;
	str->_Nback = 0;
	if (malfil)
		{	/* find _Files[i] entry and free */
		size_t i;

		for (i = 0; i < FOPEN_MAX; ++i)
			if (_Files[i] == str)
				{	/* found entry */
				_Files[i] = NULL;
				break;
				}
		free(str);
		}
	return (stat);
	}
Figure 12.7: xfoprep.c
/* _Foprep function */
#include "xstdio.h"

		/* open a stream */
FILE *_Foprep(const char *name, const char *mods,
	FILE *str)
	{	/* make str safe for fclose, macros */
	str->_Handle = -1;
	str->_Tmpnam = NULL;
	str->_Buf = NULL;
	str->_Next = &str->_Cbuf;
	str->_Rend = &str->_Cbuf;
	str->_Wend = &str->_Cbuf;
	str->_Nback = 0;
	str->_Mode = (str->_Mode & _MALFIL)
		| (*mods == 'r' ? _MOPENR
		: *mods == 'w' ? _MCREAT|_MOPENW|_MTRUNC
		: *mods == 'a' ? _MCREAT|_MOPENW|_MOPENA
		: 0);
	if ((str->_Mode & (_MOPENR|_MOPENW)) == 0)
		{	/* bad mods */
		fclose(str);
		return (NULL);
		}
	/* ... scan of any remaining mods characters ... */
	str->_Handle = _Fopen(name, str->_Mode, mods);
	if (str->_Handle < 0)
		{	/* open failed */
		fclose(str);
		return (NULL);
		}
	return (str);
	}
to the outside world. Only certain functions written for this implementation
need include "yfuns.h". (The internal header <yvals.h>, by contrast,
must be included in several standard headers.) The three macros look like
internal functions with the declarations:
int _Fclose(FILE *str);
int _Fread(FILE *str, char *buf, int size);
int _Fwrite(FILE *str, const char *buf, int size);
Figure 12.8: remove.c
/* remove function -- UNIX version */
#include "xstdio.h"

		/* UNIX system call */
int _Unlink(const char *);

int (remove)(const char *fname)
	{	/* remove a file */
	return (_Unlink(fname));
	}

Figure 12.9: rename.c
/* rename function -- UNIX version */
#include "xstdio.h"

		/* UNIX system calls */
int _Link(const char *, const char *);
int _Unlink(const char *);

int (rename)(const char *old, const char *new)
	{	/* rename a file */
	return (_Link(old, new) ? -1 : _Unlink(old));
	}
Figure 12.10 shows the file tmpnam.c. It defines a simple version of tmpnam
that concocts a temporary file name in the directory /tmp, the customary
place for parking temporary files. It encodes the current process-id to make
a family of names that should be unique to each thread of control.
Figure 12.10: tmpnam.c
/* tmpnam function -- UNIX version */
#include <string.h>
#include "xstdio.h"

		/* UNIX system call */
int _Getpid(void);

char *(tmpnam)(char *s)
	{	/* create a temporary file name */
	int i;
	char *p;
	unsigned short t;
	static char buf[L_tmpnam];
	static unsigned short seed = 0;

	if (s == NULL)
		s = buf;
	seed = seed == 0 ? _Getpid() : seed + 1;
	strcpy(s, "/tmp/t");
	i = 5;
	p = s + strlen(s) + i;
	*p = '\0';
	for (t = seed; 0 <= --i; t >>= 3)
		*--p = '0' + (t & 07);
	return (s);
	}
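The naming scheme in tmpnam can be tried out on its own. The sketch below is a hedged restatement, not the library's code: `octal_name` is a hypothetical helper that appends the low 15 bits of a seed as five octal digits, which is what the loop over `t` does.

```c
#include <string.h>

/* Write prefix followed by the low 15 bits of seed as five octal
   digits. The caller must supply room for strlen(prefix) + 6 chars. */
char *octal_name(char *dst, const char *prefix, unsigned short seed)
{
    int i = 5;
    char *p;

    strcpy(dst, prefix);
    p = dst + strlen(dst) + i;
    *p = '\0';
    for (; 0 <= --i; seed >>= 3)    /* fill digits right to left */
        *--p = (char)('0' + (seed & 07));
    return dst;
}
```

Because only 15 bits of the seed survive, two seeds that differ only in the top bit of a 16-bit value yield the same name.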
Figure 12.11: xfopen.c
/* _Fopen function -- UNIX version */
#include "xstdio.h"

		/* UNIX system call */
int _Open(const char *, int, int);

int _Fopen(const char *path, unsigned int smode,
	const char *mods)
	{	/* open from a file */
	unsigned int acc;

	acc = (smode & (_MOPENR|_MOPENW)) == (_MOPENR|_MOPENW) ? 2
		: smode & _MOPENW ? 1 : 0;
	if (smode & _MOPENA)
		acc |= 010;	/* O_APPEND */
	if (smode & _MTRUNC)
		acc |= 02000;	/* O_TRUNC */
	if (smode & _MCREAT)
		acc |= 01000;	/* O_CREAT */
	return (_Open(path, acc, 0666));
	}
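Figure 12.12: xfgpos.c
/* _Fgpos function -- UNIX version */
#include <errno.h>
#include "xstdio.h"

		/* UNIX system call */
long _Lseek(int, long, int);

long _Fgpos(FILE *str, fpos_t *ptr)
	{	/* get file position */
	long loff = _Lseek(str->_Handle, 0L, SEEK_CUR);

	if (loff == -1)
		{	/* query failed */
		errno = EFPOS;
		return (EOF);
		}
	if (str->_Mode & _MWRITE)
		loff += str->_Next - str->_Buf;
	else if (str->_Mode & _MREAD)
		loff -= str->_Nback
			? str->_Rsave - str->_Next + str->_Nback
			: str->_Rend - str->_Next;
	if (ptr == NULL)
		return (loff);	/* ftell */
	else
		{	/* fgetpos */
		ptr->_Off = loff;
		return (0);
		}
	}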
Figure 12.11 shows the file xfopen.c that defines the function _Fopen. It
maps the codes I chose for the mode indicators to the codes used by the
UNIX system service that opens a file. A proper version of this program
should not include all these magic numbers. Rather, it should include the
appropriate header that UNIX provides to define the relevant parameters.
UNIX makes no distinction between binary and text files. Other operating
systems may have to worry about such distinctions at the time the
program opens a file. Similarly, UNIX has no use for any additional mode
information. (_Fopen could insist that the mode argument be an empty
string here. This version is not so particular.)
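As a rough illustration of the "proper version" mentioned above, the mapping can be written against the symbolic names that a POSIX system declares in <fcntl.h>. This is a sketch only: it assumes a POSIX environment, and the `M_*` mode bits and the function `open_flags` are hypothetical stand-ins for the library's internal indicators.

```c
#include <fcntl.h>

/* Hypothetical mode bits standing in for the internal indicators. */
enum { M_OPENR = 0x01, M_OPENW = 0x02, M_OPENA = 0x04,
       M_TRUNC = 0x08, M_CREAT = 0x10 };

/* Translate the internal indicators to POSIX open() flags. */
int open_flags(unsigned int smode)
{
    int acc;

    if ((smode & (M_OPENR | M_OPENW)) == (M_OPENR | M_OPENW))
        acc = O_RDWR;
    else if (smode & M_OPENW)
        acc = O_WRONLY;
    else
        acc = O_RDONLY;
    if (smode & M_OPENA)
        acc |= O_APPEND;
    if (smode & M_TRUNC)
        acc |= O_TRUNC;
    if (smode & M_CREAT)
        acc |= O_CREAT;
    return acc;
}
```

Using the header's names keeps the code correct on systems whose numeric flag values differ from the ones wired into the figure.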
Figure 12.12 shows the file xfgpos.c that defines the function _Fgpos. It
asks the system to deliver the file-position indicator for the file, then
corrects for any data buffered on behalf of the stream. A file-position
indicator under UNIX can be represented in a long. Hence, type fpos_t,
defined in <stdio.h>, is a structure that contains only one long member. (I
could have defined fpos_t as type long directly, but I wanted to keep the
type as restrictive as possible.) In this case, the functions fgetpos and
fsetpos offer no advantage over the older file-positioning functions. The
difference can be important for other systems, however.
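The buffer correction that _Fgpos applies can be reduced to a small calculation. The sketch below is an assumption-laden simplification: `PosState` and `logical_position` are hypothetical, and pushed-back characters are folded into the `unread` count.

```c
/* Hypothetical, simplified view of a stream's position state. */
typedef struct {
    long sys_pos;       /* position reported by the operating system */
    long unwritten;     /* chars buffered but not yet written */
    long unread;        /* chars read ahead but not yet consumed */
    int writing;        /* nonzero if the buffer holds output */
} PosState;

/* Where the program believes it is in the file. */
long logical_position(const PosState *st)
{
    return st->writing ? st->sys_pos + st->unwritten
                       : st->sys_pos - st->unread;
}
```

Pending output lies ahead of the system's position, while read-ahead lies behind it, hence the opposite signs.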
Figure 12.13: xfspos.c
/* _Fspos function -- UNIX version */
#include <errno.h>
#include "xstdio.h"

		/* UNIX system call */
long _Lseek(int, long, int);

int _Fspos(FILE *str, const fpos_t *ptr, long loff, int way)
	{	/* position a file */
	if (fflush(str))
		{	/* write error */
		errno = EFPOS;
		return (EOF);
		}
	if (ptr)
		loff += ((fpos_t *)ptr)->_Off;	/* fsetpos */
	if (way == SEEK_CUR && str->_Mode & _MREAD)
		loff -= str->_Nback
			? str->_Rsave - str->_Next + str->_Nback
			: str->_Rend - str->_Next;
	if (way == SEEK_CUR && loff != 0
		|| way != SEEK_SET || loff != -1)
		loff = _Lseek(str->_Handle, loff, way);
	if (loff == -1)
		{	/* request failed */
		errno = EFPOS;
		return (EOF);
		}
	else
		{	/* success */
		if (str->_Mode & (_MREAD|_MWRITE))
			{	/* empty buffer */
			str->_Next = str->_Buf;
			str->_Rend = str->_Buf;
			str->_Wend = str->_Buf;
			str->_Nback = 0;
			}
		str->_Mode &= ~(_MEOF|_MREAD|_MWRITE);
		return (0);
		}
	}
FILE *(tmpfile)(void)
	{
	FILE *str;
	char fn[L_tmpnam], *s;
	if
	1)) == NULL)

Figure 12.15: clearerr.c
/* clearerr function */
#include "xstdio.h"

void (clearerr)(FILE *str)
	{	/* clear EOF and error indicators for a stream */
	if (str->_Mode & (_MOPENR|_MOPENW))
		str->_Mode &= ~(_MEOF|_MERR);
	}
Figure 12.13 shows the file xfspos.c that defines the function _Fspos. It
too benefits from the simple UNIX I/O model in the same ways as _Fgpos.
Output causes no problems, since the function flushes any unwritten
characters before it alters the file-position indicator.
The remaining three primitives are macros. All expand to calls on functions
that perform UNIX system services directly. The UNIX version of
"yfuns.h" contains the lines:
#define _Fclose(str)	_Close((str)->_Handle)
#define _Fread(str, buf, cnt)	_Read((str)->_Handle, buf, cnt)
#define _Fwrite(str, buf, cnt)	_Write((str)->_Handle, buf, cnt)
int _Close(int);
int _Read(int, unsigned char *, int);
int _Write(int, const unsigned char *, int);
Now that you have seen the I/O primitives, most of the low-level
functions declared in <stdio.h> should make sense. Let's begin by looking
at the remaining functions that set up or administer streams without
performing input or output. Figure 12.14 shows the file tmpfile.c. Function
tmpfile is a simple application of the functions you have already met.
Figure 12.15 (clearerr.c), Figure 12.16 (feof.c), and Figure 12.17 (ferror.c)
are even simpler. The only reason the functions defined in these files
lack masking macros in <stdio.h> is that they are used so seldom.
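Figure 12.16: feof.c
/* feof function */
#include "xstdio.h"

int (feof)(FILE *str)
	{	/* test end-of-file indicator for a stream */
	return (str->_Mode & _MEOF);
	}

Figure 12.17: ferror.c
/* ferror function */
#include "xstdio.h"

int (ferror)(FILE *str)
	{	/* test error indicator for a stream */
	return (str->_Mode & _MERR);
	}

Figure 12.18: setbuf.c
/* setbuf function */
#include "xstdio.h"

void (setbuf)(FILE *str, char *buf)
	{	/* set up buffer for a stream */
	setvbuf(str, buf, buf ? _IOFBF : _IONBF, BUFSIZ);
	}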
Figure 12.18 shows the file setbuf.c. It consists simply of a call to
setvbuf. Figure 12.19 shows the file setvbuf.c. Most of its work consists of
laundering its arguments. Note that setvbuf will honor requests any time
Figure 12.19: setvbuf.c
/* setvbuf function */
#include <limits.h>
#include <stdlib.h>
#include "xstdio.h"

#include "xstdio.h"

int (fseek)(FILE *str, long off, int smode)
	/* set seek offset for stream */

Figure 12.22: fsetpos.c
/* fsetpos function */
#include "xstdio.h"

long (ftell)(FILE *str)

Figure 12.24: rewind.c
/* rewind function */
#include "xstdio.h"

void (rewind)(FILE *str)
	/* rewind stream */
	_Fspos(str,
	str->_Mode
Figure 12.25: fgetc.c
/* fgetc function */
#include "xstdio.h"

int (fgetc)(FILE *str)
	{	/* get a character from stream */
	if (0 < str->_Nback)
		{	/* deliver pushed back char */
		if (--str->_Nback == 0)
			str->_Rend = str->_Rsave;
		return (str->_Back[str->_Nback]);
		}
	if (str->_Next < str->_Rend)
		;
	else if (_Frprep(str) <= 0)
		return (EOF);
	return (*str->_Next++);
	}
Figure 12.26: getc.c
/* getc function */
#include "xstdio.h"

int (getc)(FILE *str)
	{	/* get a character from stream */
	return (fgetc(str));
	}

Figure 12.27: getchar.c
/* getchar function */
#include "xstdio.h"

int (getchar)(void)
	{	/* get a character from stdin */
	return (fgetc(stdin));
	}

Figure 12.28: ungetc.c
/* ungetc function */
#include "xstdio.h"

int (ungetc)(int c, FILE *str)
Figure 12.29: fread.c
/* fread function */
#include <string.h>
#include "xstdio.h"

size_t (fread)(void *ptr, size_t size, size_t nelem, FILE *str)
	{	/* read into array from stream */
	size_t ns = size * nelem;
	unsigned char *s = ptr;

	if (ns == 0)
		return (0);
	if (0 < str->_Nback)
		{	/* deliver pushed back chars */
		for (; 0 < ns && 0 < str->_Nback; --ns)
			*s++ = str->_Back[--str->_Nback];
		if (str->_Nback == 0)
			str->_Rend = str->_Rsave;
		}
	while (0 < ns)
		{	/* ensure chars in buffer */
		if (str->_Next < str->_Rend)
			;
		else if (_Frprep(str) <= 0)
			break;
			{	/* deliver as many as possible */
			size_t m = str->_Rend - str->_Next;

			if (ns < m)
				m = ns;
			memcpy(s, str->_Next, m);
			s += m, ns -= m;
			str->_Next += m;
			}
		}
	return ((size * nelem - ns) / size);
	}
value on a write error or zero if the stream buffer now contains space to
write characters. Here is where the stream buffer gets allocated. All functions
that write a stream rely on _Fwprep in the end.
Figure 12.37 shows the file fflush.c. Here is where _Fwrite actually gets
called to write the contents of a stream buffer. If the argument is a null
pointer, the function calls itself for each element of the array _Files that is
not null. I chose to use recursion instead of looping here to keep the control
flow cleaner. Performance is not likely to be an issue on such a call.
One other function belongs in this group. Figure 12.38 shows the file
perror.c. It composes an error message and writes it to the standard error
stream. The function _Strerror does the work of the function strerror
(both declared in <string.h>) but with a buffer supplied by the caller. It is
not permissible for perror to alter the contents of the static storage in
strerror. Thus, each function must call _Strerror with its own static buffer.
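Figure 12.30: fgets.c
/* fgets function */
#include <string.h>
#include "xstdio.h"

char *(fgets)(char *buf, int n, FILE *str)
	{	/* get a line from stream */
	unsigned char *s;

	if (n <= 1)
		return (NULL);
	for (s = (unsigned char *)buf; 0 < --n && str->_Nback; )
		{	/* deliver pushed back chars */
		*s = str->_Back[--str->_Nback];
		if (str->_Nback == 0)
			str->_Rend = str->_Rsave;
		if (*s++ == '\n')
			{	/* terminate full line */
			*s = '\0';
			return (buf);
			}
		}
	while (0 < n)
		{	/* ensure buffer has chars */
		if (str->_Next < str->_Rend)
			;
		else if (_Frprep(str) < 0)
			return (NULL);
		else if (str->_Mode & _MEOF)
			break;
			{	/* copy as many as possible */
			unsigned char *s1 = memchr(str->_Next, '\n',
				str->_Rend - str->_Next);
			size_t m = (s1 ? s1 + 1 : str->_Rend) - str->_Next;

			if (n < m)
				s1 = NULL, m = n;
			memcpy(s, str->_Next, m);
			s += m, n -= m;
			str->_Next += m;
			if (s1)
				{	/* terminate full line */
				*s = '\0';
				return (buf);
				}
			}
		}
	if (s == (unsigned char *)buf)
		return (NULL);
	else
		{	/* terminate partial line */
		*s = '\0';
		return (buf);
		}
	}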
294
Chapter 12
Figure 12.3 1:
gets. c
* get8 function */
:include <atring.h>
include "xetdio.h"
Bar
*(gete)(char
*buf)
1
1
for
(;
; )
/*
a[-11 = '\O1;
return (buf);
1
1
1
if (8 ---- (uneigned char *)buf)
return (NULL);
elee
{
*a = '\O';
return (buf);
<stdio. h>
Figure 12.32:
xfrprep.c
Frprep function
~inzlude<etdlib. h>
linclude "xstdio.h"
!include "yfune.hW
I*
*/
/*
str->-Mode I = R R ;
return (-1);
*,
*,
*,
*,
*,
*,
/*
*,
elee if (n = 0)
.I
etr->-Mode = (etr->-&&I
return (0);
&
--MREAD)
1
else
etr->-Mode I = -MREAD;
str->-Rend += n;
return (1);
1
1
1
Figure 12.33: fputc.c
/* fputc function */
#include "xstdio.h"

int (fputc)(int ci, FILE *str)
	{	/* put a character to stream */
	unsigned char c = ci;

	if (str->_Next < str->_Wend)
		;
	else if (_Fwprep(str) < 0)
		return (EOF);
	*str->_Next++ = c;
	if (str->_Mode & (_MLBF|_MNBF))
		{	/* disable macros and drain */
		str->_Wend = str->_Buf;
		if ((str->_Mode & _MNBF || c == '\n') && fflush(str))
			return (EOF);
		}
	return (c);
	}
Other functions have logic that parallels fputc but avoids calling it in
the interest of speed. One variant of fputc is fwrite, defined in Figure 12.39
(fwrite.c). Two others are in Figure 12.40 (fputs.c) and Figure 12.41
(puts.c). The latter is a simple variant of the former.
That's the complete set of low-level input and output functions. As you
can see, none is particularly hard. Nevertheless, the whole collection adds
up to a lot of code. And that's only the beginning. The hard part of
implementing <stdio.h> is performing formatted input and output.
Six functions perform formatted output (the print functions). All call a
common function _Printf that has the declaration:
int _Printf(void *(*pfn)(void *, const char *, size_t),
	void *arg, const char *fmt, va_list ap);
<stdio. h>
Figure 12.34: /*
putc. c
putc function */
#include "xstdio.hW
/*
*/
Figure 12.35: /*
putchar. c
putchar function */
#include "xstdio.hW
int (putchar)(int c)
/*
*/
Figure 12.36: /*
xfwprep c
Fwprep function
#include <stdlib.h>
#include "xstdio.hW
#include "yfuns.hm
int --rep
*/
(FILE *etr)
1
if (etr->-Buf)
(etr->-Buf = malloc(BUFSIZ) ) = NULL)
/* use 1-char -Cbuf
etr-> Buf = 6str->-Cbuf;
etr->>nd
= etr->-Buf + 1;
else if
*,
1
else
/*
etr->-Mode I= - W W ;
str->-Bend = etr->-Buf + BUFSIZ;
1
etr->-Next = etr->-Buf;
etr->-Rend = etr->-Buf;
etr->-Wend = etr->-Bend;
etr->-Mode I = -MWRITE;
return (0);
*,
Figure 12.37: fflush.c
/* fflush function */
#include "xstdio.h"
#include "yfuns.h"

int (fflush)(FILE *str)
	{	/* flush an output stream */
	int n;
	unsigned char *s;

	if (str == NULL)
		{	/* recurse on all streams */
		int nf, stat;

		for (stat = 0, nf = 0; nf < FOPEN_MAX; ++nf)
			if (_Files[nf] && fflush(_Files[nf]) < 0)
				stat = EOF;
		return (stat);
		}
	if (!(str->_Mode & _MWRITE))
		return (0);
	for (s = str->_Buf; s < str->_Next; s += n)
		{	/* try to write buffer */
		n = _Fwrite(str, s, str->_Next - s);
		if (n <= 0)
			{	/* report error and fail */
			str->_Next = str->_Buf;
			str->_Wend = str->_Buf;
			str->_Mode |= _MERR;
			return (EOF);
			}
		}
	str->_Next = str->_Buf;
	str->_Wend = str->_Mode & (_MLBF|_MNBF)
		? str->_Buf : str->_Bend;
	return (0);
	}

Figure 12.38: perror.c
/* perror function */
#include <errno.h>
#include <string.h>
#include "xstdio.h"

void (perror)(const char *s)
	{	/* put error string to stderr */
	static char buf[] = {"error #xxx"};

	if (s)
		{	/* put user-supplied prefix */
		fputs(s, stderr);
		fputs(": ", stderr);
		}
	fputs(_Strerror(errno, buf), stderr);
	fputc('\n', stderr);
	}
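Figure 12.39: fwrite.c
/* fwrite function */
#include <string.h>
#include "xstdio.h"

size_t (fwrite)(const void *ptr, size_t size, size_t nelem,
	FILE *str)
	{	/* write to stream from array */
	const char *s = ptr;
	size_t ns = size * nelem;

	if (ns == 0)
		return (0);
	while (0 < ns)
		{	/* ensure room in buffer */
		if (str->_Next < str->_Wend)
			;
		else if (_Fwprep(str) < 0)
			break;
			{	/* copy in as many as possible */
			const char *s1 = str->_Mode & _MLBF
				? memchr(s, '\n', ns) : NULL;
			size_t m = s1 ? s1 - s + 1 : ns;
			size_t n = str->_Wend - str->_Next;

			if (n < m)
				s1 = NULL, m = n;
			memcpy(str->_Next, s, m);
			s += m, ns -= m;
			str->_Next += m;
			if (s1 && fflush(str))
				{	/* disable macros and fail */
				str->_Wend = str->_Buf;
				break;
				}
			}
		}
	if (str->_Mode & _MNBF)
		{	/* disable macros and drain */
		str->_Wend = str->_Buf;
		fflush(str);
		}
	return ((size * nelem - ns) / size);
	}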
Figure 12.40: fputs.c
/* fputs function */
#include <string.h>
#include "xstdio.h"

int (fputs)(const char *s, FILE *str)
	{	/* put a string to stream */
	while (*s)
		{	/* ensure room in buffer */
		if (str->_Next < str->_Wend)
			;
		else if (_Fwprep(str) < 0)
			return (EOF);
			{	/* copy in as many as possible */
			const char *s1 = str->_Mode & _MLBF
				? strchr(s, '\n') : NULL;
			size_t m = s1 ? s1 - s + 1 : strlen(s);
			size_t n;

			n = str->_Wend - str->_Next;
			if (n < m)
				s1 = NULL, m = n;
			memcpy(str->_Next, s, m);
			s += m;
			str->_Next += m;
			if (s1 && fflush(str))
				{	/* fail on error */
				str->_Wend = str->_Buf;
				return (EOF);
				}
			}
		}
	if (str->_Mode & _MNBF)
		{	/* disable macros and drain */
		str->_Wend = str->_Buf;
		if (fflush(str))
			return (EOF);
		}
	return (0);
	}

Figure 12.41: puts.c
/* puts function */
#include "xstdio.h"

int (puts)(const char *s)
	{	/* put string + newline to stdout */
	return (fputs(s, stdout) < 0
		|| fputc('\n', stdout) < 0 ? EOF : 0);
	}
int (fprintf)(FILE *str, const char *fmt, ...)
	{	/* print formatted to stream */
	int ans;
	va_list ap;

	va_start(ap, fmt);
	ans = _Printf(&prout, str, fmt, ap);
	va_end(ap);
	return (ans);
	}

Figure 12.43: printf.c
/* printf function */
#include "xstdio.h"

static void *prout(void *str, const char *buf, size_t n)
	{	/* write to file */
	return (fwrite(buf, 1, n, str) == n ? str : NULL);
	}

int (printf)(const char *fmt, ...)
	{	/* print formatted to stdout */
	int ans;
	va_list ap;

	va_start(ap, fmt);
	ans = _Printf(&prout, stdout, fmt, ap);
	va_end(ap);
	return (ans);
	}
Figure 12.44 shows the file sprintf.c. Here, the generic pointer indicates
the next place to store characters in the buffer you specify when you call
sprintf. Note also that sprintf writes a terminating null character if _Printf
302
Chapter 12
f * s p r i n t f function
finclude <string.h>
finclude "xstdio. h"
Figure 12.44:
sprintf c
*/
..
#include "xstdio.hW
s t a t i c void *prout (void *&r, const char *buf, size-t n)
/* write t o f i l e */
return (fwrite(buf, 1, n, str) = n ? str : NULL);
1i n t
I
--
*/
1 int
if (0 <= ans)
s[ansl = '\On;
return (ans);
Testing for the per cent (%) escape character is a delicate matter. The only
safe way is to convert the format string to a sequence of wide characters
and look for one corresponding to a per cent. You must compare the data
object wc against the wide-character code for per cent. Unfortunately, some
uncertainty surrounds what that value might be. The C Standard requires
that each of the characters in the basic C character set have a wide-character
code that equals the single-character code. You write the single-character
code for per cent as '%'. You write the wide-character equivalent as L'%'.
Some question remains, however, whether the C Standard should require
such equivalence. It may thus be imprudent to write code that depends on
a delicate point of law.
Still another uncertainty exists. An implementation can support multiple
encodings for wide characters, at least in principle. A program can conceivably
change to a locale where wide-character constants don't match the
current character set. (Yes!) That may be unwise, but it is not specifically
disallowed by the C Standard. Hence, a prudent program might avoid
using either '%' or L'%' as the wide-character code for per cent.
The implementor has three choices for the value to compare against wc:
Use '%' for maximum compatibility with older C translators. Rely on the
codes being equivalent and not changing with locale.
Use L'%' for maximum clarity. Rely on the codes not changing with
locale.
Execute the call mbstowcs(wcs, "%", 1) on each entry to _Printf, with
the declaration wchar_t wcs[2]. That stores the current wide-character
code for per cent in wcs[0]. (mbstowcs is declared in <stdlib.h>.)
I chose the first course as the wisest given the current state of C translators,
the C Standard, and multibyte-character support. Be warned that this area
is rapidly evolving, however. A different choice may be more prudent in
the near future.
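The third choice can be sketched as follows. This is an illustration only, not code from the library: `percent_code` is a hypothetical helper, and it assumes the multibyte string "%" converts to exactly one wide character in the current locale.

```c
#include <stddef.h>
#include <stdlib.h>

/* Store the current locale's wide-character code for '%' in *out.
   Return 0 on success, -1 if the conversion fails. */
int percent_code(wchar_t *out)
{
    wchar_t wcs[2];

    if (mbstowcs(wcs, "%", 1) != 1)     /* convert one character */
        return -1;
    *out = wcs[0];
    return 0;
}
```

The cost of this approach is one conversion per call, which is part of why a constant comparison is attractive when the codes can be trusted not to change.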
Chapter 12
Figure 12.48:
xprintf .c
Part 1
-Printf function */
:include <ctype.h>
:include <stdlib.h>
:include <string.h>
:include "xstdio.hW
:define MAX-PAD (eizeof (spaces) - 1)
:define PAD(s, n) if (0 < (n)) Cint i, j = (n); \
for (; 0 < j; j -= i) \
(i = MAX-PAD < j ? MAX-PAD : j; PUT(s, i); 1 1
:define PUT(s, n) \
if (0 < (n)) (if ( (arg = (*pfn)(arg, s, n)) ! = NULL) \
x.nchar += (n); else return (EOF); 1
";
itatic char epaces [I = "
itatic char zeroes[] = "00000000000000000000000000000000";
*.
size-t 1.
/ * print formatted
C
-Pft x;
for (x.nchar = 0;
; )
C
const char *s = fmt;
C
int n;
wchar-t wc;
-Mbeave state = (01;
~ ~ ~ ( fs
m t ,fmt);
if (n <= 0)
return (x-nchar);
fmt = ++a;
1
C
Continuing
xprintf.c
Part 2
else
/* accumulate width digits
for (x-width= 0; isdigit(*s); ++a)
if (x-width< -WMAX)
x.width = x.width * 10 + *s - '0';
if (*s != ' . ' )
x.prec = -1;
else if (*++a = ' *' )
/* get precision argument
I
x.prec = va-arg (ap, int);
++s;
1
else
/* accumulate precision digits
for (x .prec = 0; isdigit (*a); ++a)
if (x.prec < _WMAX)
x.prec = x-prec * 10 + *s - '0';
x.qua1 = strchr("hlL", *s) ? *a++ : '\OR;
/*
do the conversion
char ac [32];
-Putfld(bx,
fmt = s
1
}
1;
x-nzl + x.n2
x.nz2;
*,
*,
*,
*,
None of the rest of the code in _Printf or its subordinates need worry
about multibyte characters. Conversion specifiers consist of characters
from the basic C character set. Each of these has a one-character encoding.
(In principle, a format string may contain redundant shift codes within a
conversion specifier. I chose not to support such practices.)
_Printf thus frets about multibyte characters only in literal text between
conversion specifiers. Once it discovers a chunk of literal text, it delivers
all such characters up to but not including any per cent character it
encounters. Note the use of the macro PUT, defined at the top of this C source
file, to deliver characters. You cannot package this operation as a function.
It needs to return from _Printf should the delivery function report an error.
No good is served, on the other hand, by writing out such a messy patch
of logic repeatedly. For much the same reasons, I also created the macro PAD
to deliver padding zeros or spaces.
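The reason a macro is needed here, rather than a function, can be shown with a small sketch. Everything in it is hypothetical and chosen for illustration: `PUT_OR_FAIL`, `to_buffer`, `always_fail`, and `put_twice` are not names from the library, and the delivery functions only mimic the calling convention of the generic pointer described above.

```c
#include <stddef.h>
#include <string.h>

typedef void *(*Deliver)(void *, const char *, size_t);

/* Deliver n chars; on failure, return -1 from the ENCLOSING function.
   A helper function could not force its caller to return this way. */
#define PUT_OR_FAIL(pfn, arg, s, n, count) \
    do { \
        if (((arg) = (*(pfn))((arg), (s), (n))) == NULL) \
            return -1; \
        (count) += (int)(n); \
    } while (0)

/* Hypothetical delivery function: append to a char buffer and
   return the next place to store. */
void *to_buffer(void *arg, const char *s, size_t n)
{
    return (char *)memcpy(arg, s, n) + n;
}

/* Hypothetical delivery function that always reports failure. */
void *always_fail(void *arg, const char *s, size_t n)
{
    (void)arg, (void)s, (void)n;
    return NULL;
}

/* Deliver the string s twice; return the count, or -1 on failure. */
int put_twice(Deliver pfn, void *arg, const char *s)
{
    int count = 0;
    size_t n = strlen(s);

    PUT_OR_FAIL(pfn, arg, s, n, count);
    PUT_OR_FAIL(pfn, arg, s, n, count);
    return count;
}
```

The delivery function returns an updated generic pointer on success, so the same driver can write to a buffer or to a stream without knowing which.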
Once _Printf trips across a per cent in a format, it sets about parsing the
conversion specifier that follows. It translates flags into a set of indicators
used throughout _Printf and its subordinates. The header "xstdio.h"
contains the macro definitions:
#define _FSP	0x01
#define _FPL	0x02
#define _FMI	0x04
#define _FNO	0x08
#define _FZE	0x10
	} v;
	char *s;
	int n0, nz0, n1, nz1, n2, nz2, prec, width;
	size_t nchar;
	unsigned int flags;
	char qual;
	} _Pft;
Figure 12.49: xputfld.c
/* _Putfld function */
#include <string.h>
#include "xstdio.h"

		/* macros */
#if _DLONG
#define LDSIGN(x)	\
	(((unsigned short *)&(x))[_D0 ? 4 : 0] & 0x8000)
#else
#define LDSIGN(x)	(((unsigned short *)&(x))[_D0] & 0x8000)
#endif

void _Putfld(_Pft *px, va_list *pap, char code, char *ac)
	{	/* convert a field for _Printf */
	px->n0 = px->nz0 = px->n1 = px->nz1 = px->n2 = px->nz2 = 0;
	switch (code)
		{	/* switch on conversion specifier */
	case 'c':	/* convert a single character */
		ac[px->n0++] = va_arg(*pap, int);
		break;
	case 'd': case 'i':	/* convert a signed decimal integer */
		px->v.li = px->qual == 'l' ?
			va_arg(*pap, long) : va_arg(*pap, int);
		if (px->qual == 'h')
			px->v.li = (short)px->v.li;
		if (px->v.li < 0)	/* negate safely in _Litob */
			ac[px->n0++] = '-';
		else if (px->flags & _FPL)
			ac[px->n0++] = '+';
		else if (px->flags & _FSP)
			ac[px->n0++] = ' ';
		px->s = &ac[px->n0];
		_Litob(px, code);
		break;
	case 'o': case 'u':
	case 'x': case 'X':	/* convert unsigned */
		px->v.li = px->qual == 'l' ?
			va_arg(*pap, long) : va_arg(*pap, int);
		if (px->qual == 'h')
			px->v.li = (unsigned short)px->v.li;
		else if (px->qual == '\0')
			px->v.li = (unsigned int)px->v.li;
		if (px->flags & _FNO && px->v.li != 0)
			{	/* indicate base with prefix */
			ac[px->n0++] = '0';
			if (code == 'x' || code == 'X')
				ac[px->n0++] = code;
			}
		px->s = &ac[px->n0];
		_Litob(px, code);
		break;
	case 'e': case 'E': case 'f':
	case 'g': case 'G':	/* convert floating */
		px->v.ld = px->qual == 'L' ?
			va_arg(*pap, long double) : va_arg(*pap, double);
		if (LDSIGN(px->v.ld))
			ac[px->n0++] = '-';
		else if (px->flags & _FPL)
			ac[px->n0++] = '+';
		else if (px->flags & _FSP)
			ac[px->n0++] = ' ';
		px->s = &ac[px->n0];
		_Ldtob(px, code);
		break;
	case 'n':	/* return output count */
		if (px->qual == 'h')
			*va_arg(*pap, short *) = px->nchar;
		else if (px->qual != 'l')
			*va_arg(*pap, int *) = px->nchar;
		else
			*va_arg(*pap, long *) = px->nchar;
		break;
	case 'p':	/* convert a pointer, hex long version */
		px->v.li = (long)va_arg(*pap, void *);
		px->s = &ac[px->n0];
		_Litob(px, 'x');
		break;
	case 's':	/* convert a string */
		px->s = va_arg(*pap, char *);
		px->n1 = strlen(px->s);
		if (0 <= px->prec && px->prec < px->n1)
			px->n1 = px->prec;
		break;
	case '%':	/* put a '%' */
		ac[px->n0++] = '%';
		break;
	default:	/* undefined specifier, print it out */
		ac[px->n0++] = code;
		}
	}
Figure 12.50: xlitob.c
/* _Litob function */
#include <stdlib.h>
#include <string.h>
#include "xmath.h"
#include "xstdio.h"

static char ldigs[] = "0123456789abcdef";
static char udigs[] = "0123456789ABCDEF";

void _Litob(_Pft *px, char code)
xexp = xexp * 30103L / 100000L - NDIG/2;
That provides an adequate estimate of the prescaling required for ldval (x).
You want to multiply by the minimum number of elements of pows. You
must end up with ldval strictly less than 10^8. You prefer that the first group
of eight digits have at least four nonzero digits. You need to capture the
actual scaling factor (in xexp) to generate a proper exponent later. This
Figure 12.51: xldtob.c Part 1
/* _Ldtob function */
#include <float.h>
#include <stdlib.h>
#include <string.h>
#include "xmath.h"
#include "xstdio.h"

		/* macros */
#define NDIG	8

		/* static data */
static const long double pows[] = {
	1e1L, 1e2L, 1e4L, 1e8L, 1e16L, 1e32L,
#if 0x100 < _LBIAS	/* assume IEEE 754 8- or 10-byte */
	1e64L, 1e128L, 1e256L,
#if _DLONG	/* assume IEEE 754 10-byte */
	1e512L, 1e1024L, 1e2048L, 1e4096L,
#endif
#endif
	};

void _Ldtob(_Pft *px, char code)
	{	/* convert long double to text */
	char ac[32];
	char *p = ac;
	long double ldval = px->v.ld;
	short errx, nsig, xexp;

	if (px->prec < 0)
		px->prec = 6;
	else if (px->prec == 0 && (code == 'g' || code == 'G'))
		px->prec = 1;
	if (0 < (errx = _Ldunscale(&xexp, &px->v.ld)))
		{	/* x == NaN, x == INF */
		memcpy(px->s, errx == NAN ? "NaN" : "Inf", px->n1 = 3);
		return;
		}
	else if (0 == errx)	/* x == 0 */
		nsig = 0, xexp = 0;
	else
		{	/* 0 < |x|, convert it */
			{	/* scale ldval */
			int i, n;

			if (ldval < 0.0)
				ldval = -ldval;
			if ((xexp = xexp * 30103L / 100000L - NDIG/2) < 0)
				{	/* scale up */
				n = (-xexp + (NDIG/2-1)) & ~(NDIG/2-1), xexp = -n;
				for (i = 0; 0 < n; n >>= 1, ++i)
					if (n & 1)
						ldval *= pows[i];
				}
			else if (0 < xexp)
				{	/* scale down */
				long double factor = 1.0;

				xexp &= ~(NDIG/2-1);
				for (n = xexp, i = 0; 0 < n; n >>= 1, ++i)
					if (n & 1)
						factor *= pows[i];
				ldval /= factor;
				}
			}
Continuing xldtob.c Part 2
			while (0 <= --j)
				*--p = '0';
			gen = p - &ac[1];
			for (p = &ac[1], xexp += NDIG-1; *p == '0'; ++p)
				--gen, --xexp;	/* correct xexp */
			nsig = px->prec + (code == 'f' ? xexp + 1
				: code == 'e' || code == 'E' ? 1 : 0);
			if (gen < nsig)
				nsig = gen;
			if (0 < nsig)
				{	/* round and strip trailing zeros */
				const char drop
					= nsig < gen && '5' <= p[nsig] ? '9' : '0';
				int n;

				for (n = nsig; p[--n] == drop; )
					--nsig;
				if (drop == '9')
					++p[n];
				if (n < 0)
					--p, ++nsig, ++xexp;
				}
			}
		}
	_Genld(px, code, p, nsig, xexp);
	}
expression begins that process by effectively multiplying e by log10(2). It
also allows for about four digits to the left of the decimal point. The function
then scales ldval accordingly.
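The log10(2) trick is easy to check in isolation. The sketch below assumes only integer arithmetic; `decimal_exponent_estimate` is a hypothetical name, and it leaves out the NDIG/2 adjustment that the library's statement applies.

```c
/* Estimate a decimal exponent from a binary exponent.
   30103/100000 approximates log10(2), which is about 0.30103.
   Integer division truncates toward zero. */
long decimal_exponent_estimate(long bin_exp)
{
    return bin_exp * 30103L / 100000L;
}
```

A binary exponent of 10 (a value near 1024) gives 3, and 1024 gives 308, which matches the familiar decimal range of an IEEE 754 double.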
The next bizarre approximation is the initializer:
int gen = px->prec
	+ (code == 'f' ? xexp + 2+NDIG : 2+NDIG/2);
rmt - a
ap
argument list
The obtaining function obtains the next character to scan if its second
hgwas a value distinct
argument has the value -WANT, defined in mnxstdio.
from any character code or EOF. Otherwise, it treats the second argument as
a character to push back. The function returns EOF on failure.
Figure 12.53 shows the file fscanf.c. It defines both fscanf and the
obtaining function scin that it uses. In this case, the generic pointer conveys
the FILE pointer from fscanf through _Scanf to scin. scin uses this pointer
to read the stream you specify when you call fscanf. Figure 12.54 shows
the file scanf.c. That function is a simple variant of fscanf. Figure 12.55
shows the file sscanf.c. Here, the generic pointer indicates the next place
to obtain characters in the buffer you specify when you call sscanf. Unlike
the other scan functions, sscanf rewrites the generic pointer. That's why
the obtaining function needs a pointer to pointer argument.
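To make the pointer to pointer argument concrete, here is a minimal sketch of an obtaining function that reads from a string. The names WANT and str_scin are hypothetical stand-ins, not the library's own; WANT plays the part that _WANT plays in "xstdio.h", and its value here is only an assumption chosen to differ from EOF and from every character code.

```c
#include <stdio.h>

#define WANT (EOF - 1)  /* hypothetical stand-in for _WANT */

/* Obtaining function for a string source (illustrative only).
 * str points at a const char * that tracks the read position. */
int str_scin(void *str, int ch)
{
    const char *s = *(const char **)str;

    if (ch == WANT) {
        if (*s == '\0')
            return EOF;                 /* nothing left to deliver */
        *(const char **)str = s + 1;    /* advance the caller's pointer */
        return (unsigned char)*s;
    }
    if (0 <= ch)
        *(const char **)str = s - 1;    /* push back: step one place back */
    return ch;
}
```

Because the function must move the caller's position, it receives the address of the character pointer rather than the pointer itself. A stream version needs no such indirection, since the FILE object keeps its own position.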
Figure 12.56 shows the file xscanf.c. It defines the function _Scanf that
does all the work.
_Scanf packs various bits of information into a structure called x of type
_Sft. Subordinate functions fill in additional information. By the time they
have done their work for a given conversion specification, _Scanf knows
how many characters have been scanned and whether the last conversion
specifier stored a converted value by examining the contents of x. The
header "xstdio.h" contains the type definition:

    typedef struct {
        int (*pfn)(void *, int);
        void *arg;
        va_list ap;
        int nchar, nget, width;
        char noconv, qual, stored;
        } _Sft;
Figure 12.52: xgenld.c, Part 1

/* _Genld function */
#include <locale.h>
#include <string.h>
#include "xstdio.h"

void _Genld(_Pft *px, char code, char *p, short nsig,
    short xexp)
    {   /* generate long double text */
    const char point = localeconv()->decimal_point[0];

    if (nsig <= 0)
        nsig = 1, p = "0";
    if (code == 'f' || (code == 'g' || code == 'G')
        && -4 <= xexp && xexp < px->prec)
        {
        ++xexp;
        if (code != 'f')
            {   /* fixup for 'g' */
            if (!(px->flags & _FNO) && nsig < px->prec)
                px->prec = nsig;
            if ((px->prec -= xexp) < 0)
                px->prec = 0;
            }
        if (xexp <= 0)
            {   /* digits only to right of point */
            px->s[px->n1++] = '0';
            if (0 < px->prec || px->flags & _FNO)
                px->s[px->n1++] = point;
            if (px->prec < -xexp)
                xexp = -px->prec;
            px->nz1 = -xexp;
            px->prec += xexp;
            if (px->prec < nsig)
                nsig = px->prec;
            memcpy(&px->s[px->n1], p, px->n2 = nsig);
            px->nz2 = px->prec - nsig;
            }
        else if (nsig < xexp)
            {   /* zeros before point */
            memcpy(&px->s[px->n1], p, nsig);
            px->n1 += nsig;
            px->nz1 = xexp - nsig;
            if (0 < px->prec || px->flags & _FNO)
                px->s[px->n1] = point, ++px->n2;
            px->nz2 = px->prec;
            }
        else
            ...

Continuing xgenld.c, Part 2

        }
    else
        {   /* 'e' format */
        if (code == 'g' || code == 'G')
            {   /* fixup for 'g' */
            ...
            code = code == 'g' ? 'e' : 'E';
            }
        px->s[px->n1++] = *p++;
        if (0 < px->prec || px->flags & _FNO)
            px->s[px->n1++] = point;
        if (0 < px->prec)
            {   /* put fraction digits */
            if (px->prec < --nsig)
                nsig = px->prec;
            memcpy(&px->s[px->n1], p, nsig);
            px->n1 += nsig;
            px->nz1 = px->prec - nsig;
            }
        p = &px->s[px->n1];     /* put exponent */
        *p++ = code;
        if (0 <= xexp)
            *p++ = '+';
        else
            {   /* negative exponent */
            *p++ = '-';
            xexp = -xexp;
            }
        *p++ = xexp / 10 + '0', xexp %= 10;
        *p++ = xexp + '0';
        px->n2 = p - &px->s[px->n1];
        }
    if ((px->flags & (_FMI|_FZE)) == _FZE)
        {   /* pad with leading zeros */
        int n = px->n0 + px->n1 + px->nz1 + px->n2 + px->nz2;
        ...
The internal function _Mbtowc, declared in <stdlib.h>, parses the format
as a multibyte string using state memory of type _Mbsave that you provide
on each call. The issues are the same as for _Printf, described on page 303.
Note, however, that _Scanf must distinguish white-space as well as percent
characters. It assumes that any wide-character code that can be stored in an
unsigned char can be tested properly by isspace. That is certainly true in the
current C Standard. It would be messy to change for an environment where
'\t' is not necessarily equal to L'\t'.
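A minimal sketch of the same idea, using only the standard function mbtowc in place of the internal _Mbtowc, might look as follows. The function name scan_format and its two counters are hypothetical; the sketch merely classifies each multibyte character of a format as a percent character, as white-space, or as literal text.

```c
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>

/* Walk a format one multibyte character at a time and count the
 * percent characters and white-space characters it contains.
 * scan_format is a hypothetical helper for illustration. */
void scan_format(const char *fmt, int *npercent, int *nspace)
{
    int n;
    wchar_t wc;

    *npercent = *nspace = 0;
    (void)mbtowc(NULL, NULL, 0);        /* reset any shift state */
    while (0 < (n = mbtowc(&wc, fmt, MB_CUR_MAX))) {
        fmt += n;
        if (wc == L'%')
            ++*npercent;
        else if ((unsigned long)wc <= UCHAR_MAX && isspace((int)wc))
            ++*nspace;                  /* safe to test with isspace */
    }
}
```

The range check before the call to isspace mirrors the assumption described above: only codes that fit in an unsigned char are handed to isspace.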
Figure 12.53: fscanf.c

#include "xstdio.h"

static int scin(void *str, int ch)
    {   /* get or put a character */
    if (ch == _WANT)
        return (fgetc((FILE *)str));
    else if (0 <= ch)
        return (ungetc(ch, (FILE *)str));
    else
        return (ch);
    }

int (fscanf)(FILE *str, const char *fmt, ...)
    {   /* read formatted from stream */
    int ans;
    va_list ap;

    va_start(ap, fmt);
    ans = _Scanf(&scin, str, fmt, ap);
    va_end(ap);
    return (ans);
    }
Figure 12.54: scanf.c

/* scanf function */
#include "xstdio.h"

static int scin(void *str, int ch)
    {   /* get or put a character */
    if (ch == _WANT)
        return (fgetc((FILE *)str));
    else if (0 <= ch)
        return (ungetc(ch, (FILE *)str));
    else
        return (ch);
    }

int (scanf)(const char *fmt, ...)
    {
    int ans;
    va_list ap;

    va_start(ap, fmt);
    ans = _Scanf(&scin, stdin, fmt, ap);
    va_end(ap);
    return (ans);
    }
Figure 12.55: sscanf.c

/* sscanf function */
#include "xstdio.h"

static int scin(void *str, int ch)
    {
    ...
            {   /* deliver a character */
            *(char **)str = s + 1;
            return (*s);
            }
    else if (0 <= ch)
        *(char **)str = s - 1;
    return (ch);
    }

int (sscanf)(const char *buf, const char *fmt, ...)
    {   /* read formatted from string */
    int ans;
    va_list ap;

    va_start(ap, fmt);
    ans = _Scanf(&scin, (void **)&buf, fmt, ap);
    va_end(ap);
    return (ans);
    }
Figure 12.56: xscanf.c, Part 1

/* _Scanf function */
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include "xstdio.h"

int _Scanf(int (*pfn)(void *, int), void *arg,
    const char *fmt, va_list ap)
    {   /* read formatted */
    ...
    for (...; ++s)
        {
        int ch;

        {
        int n;
        wchar_t wc;
        _Mbsave state = {0};

        while (0 < (n = _Mbtowc(&wc, s, MB_CUR_MAX, &state)))
            {   /* check type of multibyte char */
            s += n;
            if (wc == '%')
                break;
            else if (wc <= UCHAR_MAX && isspace(wc))
                {   /* match any white-space */
                while (isspace(*s))
                    ++s;
                while (isspace(ch = GET(&x)))
                    ;
                UNGET(&x, ch);
                }
            else    /* match literal text */
                for (s -= n; 0 <= --n; )
                    if ((ch = GET(&x)) != *s++)
                        {   /* bad match */
                        UNGET(&x, ch);
                        return (nconv);
                        }
            }
        if (*s == '\0')
            return (nconv);
        }
        {
        ...

Continuing xscanf.c, Part 2

            UNGET(&x, ch);
            }
        if ((s = _Getfld(&x, s)) == NULL)
            return (0 < nconv ? nconv : EOF);
        if (x.stored)
            ++nconv;
        }
        }
    }
Figure 12.57: xstdio.h, Part 2

        /* declarations */
FILE *_Foprep(const char *, const char *, FILE *);
int _Fopen(const char *, unsigned int, const char *);
int _Frprep(FILE *);
int _Ftmpnam(char *, int);
int _Fwprep(FILE *);
void _Genld(_Pft *, char, char *, short, short);
const char *_Getfld(_Sft *, const char *);
int _Getfloat(_Sft *);
int _Getint(_Sft *, char);
void _Ldtob(_Pft *, char);
void _Litob(_Pft *, char);
int _Printf(void *(*)(void *, const char *, size_t),
    void *, const char *, va_list);
void _Putfld(_Pft *, va_list *, char, char *);
int _Scanf(int (*)(void *, int),
    void *, const char *, va_list);
Figure 12.58: xgetfld.c, Part 1

/* _Getfld function */
#include <ctype.h>
#include <limits.h>
#include <string.h>
#include "xstdio.h"

const char *_Getfld(_Sft *px, const char *s)
    {   /* convert a field */
    int ch;
    char *p;

    px->stored = 0;
    switch (*s)
        {   /* switch on conversion specifier */
    case 'c':   /* convert an array of chars */
        if (px->width == 0)
            px->width = 1;
        p = va_arg(px->ap, char *);
        for (; 0 < px->width; --px->width)
            if ((ch = GET(px)) < 0)
                return (NULL);
            else if (!px->noconv)
                *p++ = ch, px->stored = 1;
        break;
    case 'p':   /* convert a pointer */
    case 'd': case 'i': case 'o':
    case 'u': case 'x': case 'X':
        if (_Getint(px, *s))    /* convert an integer */
            return (NULL);
        break;
    case 'e': case 'E': case 'f':
    case 'g': case 'G':
        if (_Getfloat(px))      /* convert a floating */
            return (NULL);
        break;
    case 'n':   /* return output count */
        if (px->qual == 'h')
            *va_arg(px->ap, short *) = px->nchar;
        else if (px->qual != 'l')
            *va_arg(px->ap, int *) = px->nchar;
        else
            *va_arg(px->ap, long *) = px->nchar;
        break;
    case 's':   /* convert a string */
        px->nget = px->width <= 0 ? INT_MAX : px->width;
        p = va_arg(px->ap, char *);
        while (0 <= (ch = GETN(px)))
            if (isspace(ch))
                break;
            else if (!px->noconv)
                *p++ = ch;
        UNGETN(px, ch);
        if (!px->noconv)
            *p++ = '\0', px->stored = 1;
        break;

Continuing xgetfld.c, Part 2

    case '%':   /* match a '%' */
        ...
    case '[':
        {   /* convert a scan set */
        char comp = *++s == '^' ? *s++ : '\0';
        const char *t = strchr(*s == ']' ? s + 1 : s, ']');
        size_t n = t - s;

        if (t == NULL)
            return (NULL);      /* undefined */
        px->nget = px->width <= 0 ? INT_MAX : px->width;
        p = va_arg(px->ap, char *);
        while (0 <= (ch = GETN(px)))
            if (!comp && !memchr(s, ch, n)
                || comp && memchr(s, ch, n))
                break;
            else if (!px->noconv)
                *p++ = ch;
        UNGETN(px, ch);
        if (!px->noconv)
            *p++ = '\0', px->stored = 1;
        s = t;
        }
        break;
    default:    /* undefined specifier, quit */
        return (NULL);
        }
    return (s);
    }
Figure 12.59: xgetint.c, Part 1

int _Getint(_Sft *px, char code)
    {
    ...
    px->nget = px->width <= 0
        || FMAX < px->width ? FMAX : px->width;
    p = ac, ch = GETN(px);
    if (ch == '+' || ch == '-')
        *p++ = ch, ch = GETN(px);
    if (ch == '0')
        {   /* match possible prefix */
        seendig = 1;
        *p++ = ch, ch = GETN(px);
        if ((ch == 'x' || ch == 'X')
            && (base == 0 || base == 16))
            base = 16, *p++ = ch, ch = GETN(px);
        else
            base = 8;
        }
    ...
    else if (code == 'd' || code == 'i')
        {   /* deliver a signed integer */
        long lval = strtol(ac, NULL, base);

        ...
        if (px->qual == 'h')
            *va_arg(px->ap, short *) = lval;
        else if (px->qual != 'l')
            *va_arg(px->ap, int *) = lval;
        else
            *va_arg(px->ap, long *) = lval;
        }
    else

Continuing xgetint.c, Part 2

        {
        ...
        px->stored = 1;
        if (code == 'p')
            *va_arg(px->ap, void **) = (void *)ulval;
        else if (px->qual == 'h')
            *va_arg(px->ap, unsigned short *) = ulval;
        else if (px->qual != 'l')
            *va_arg(px->ap, unsigned int *) = ulval;
        else
            *va_arg(px->ap, unsigned long *) = ulval;
        }
    return (0);
    }
References
Brian W. Kernighan and P.J. Plauger, Software Tools (Reading, Mass.:
Addison-Wesley, 1975). Also by the same authors, Software Tools in Pascal
(Reading, Mass.: Addison-Wesley, 1978). Both of these books illustrate how
to impose the UNIX I/O model upon a variety of operating systems by
implementing a small number of primitive interface functions.
William D. Clinger, "How to Read Floating-point Numbers Accurately,"
Proceedings of the ACM SIGPLAN '90 Conference on Programming Language
Design and Implementation (New York: Association for Computing Machinery,
1990, pp. 92-101). This article discusses the difficulties of converting a
text string to floating-point representation if your goal is to maintain full
precision.
Guy L. Steele, Jr. and Jon L. White, "How to Print Floating-point Numbers
Accurately," Proceedings of the ACM SIGPLAN '90 Conference on Programming
Language Design and Implementation (New York: Association for
Computing Machinery, 1990, pp. 112-126). This article is an interesting
companion to the one above, from the same conference proceedings.
Figure 12.60: xgetfloa.c

/* _Getfloat function */
#include <ctype.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include "xstdio.h"

int _Getfloat(_Sft *px)
    {
    char *p;
    int ch;
    char ac[FMAX+1];
    char seendig = 0;

    px->nget = px->width <= 0
        || FMAX < px->width ? FMAX : px->width;
    p = ac, ch = GETN(px);
    if (ch == '+' || ch == '-')
        *p++ = ch, ch = GETN(px);
    for (; isdigit(ch); seendig = 1)
        *p++ = ch, ch = GETN(px);
    if (ch == localeconv()->decimal_point[0])
        *p++ = ch, ch = GETN(px);
    for (; isdigit(ch); seendig = 1)
        *p++ = ch, ch = GETN(px);
    if ((ch == 'e' || ch == 'E') && seendig)
        {   /* parse exponent */
        *p++ = ch, ch = GETN(px);
        if (ch == '+' || ch == '-')
            *p++ = ch, ch = GETN(px);
        for (seendig = 0; isdigit(ch); seendig = 1)
            *p++ = ch, ch = GETN(px);
        }
    UNGETN(px, ch);
    if (!seendig)
        return (-1);
    *p = '\0';
    if (!px->noconv)
        {   /* convert and store */
        ...
        }
    return (0);
    }
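The same collect-then-convert pattern can be tried outside the library. The sketch below reads from a string instead of calling GETN, and it assumes '.' as the decimal point rather than consulting localeconv. The name scan_float_field and its parameters are hypothetical.

```c
#include <ctype.h>
#include <stdlib.h>

/* Copy the longest prefix of src that looks like a floating-point
 * field into buf (at most size-1 characters), then convert it with
 * strtod. Returns 0 and stores the value on success, -1 if the
 * field lacks required digits. Hypothetical helper for illustration. */
int scan_float_field(const char *src, char *buf, size_t size, double *out)
{
    char *p = buf, *end;
    int seendig = 0;

    if (size == 0)
        return -1;
    end = buf + size - 1;
    if ((*src == '+' || *src == '-') && p < end)
        *p++ = *src++;
    for (; isdigit((unsigned char)*src) && p < end; seendig = 1)
        *p++ = *src++;
    if (*src == '.' && p < end)
        *p++ = *src++;
    for (; isdigit((unsigned char)*src) && p < end; seendig = 1)
        *p++ = *src++;
    if ((*src == 'e' || *src == 'E') && seendig && p < end) {
        /* an exponent must supply at least one digit of its own */
        *p++ = *src++;
        if ((*src == '+' || *src == '-') && p < end)
            *p++ = *src++;
        for (seendig = 0; isdigit((unsigned char)*src) && p < end; seendig = 1)
            *p++ = *src++;
    }
    if (!seendig)
        return -1;
    *p = '\0';
    *out = strtod(buf, NULL);
    return 0;
}
```

Note that, as in the figure, an exponent marker with no digits after it makes the whole field fail, because seendig is reset before the exponent digits are scanned.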
Exercises
Exercise 12.1 How does the operating system you use represent text files? Do you have
to make any changes to match the internal representation of a text stream in
Standard C?
Exercise 12.2 Write the functions fprintf, printf, and sprintf in terms of calls to
vfprintf and vsprintf.
Exercise 12.3 Write a version of rename that copies a file if it cannot simply rename it.
Delete the original file only after a successful copy.
Exercise 12.4 Write a version of remove that simply renames the file to be removed. Place
the file in an out-of-the-way directory, or give it a name not likely to conflict
with common naming conventions for files. Why would you want this
version?
Exercise 12.5 Write a version of tmpnam that checks for conflicts with existing names. (Try
to open an existing file with that file name for reading.) The function keeps
generating new file names until it cannot open the corresponding file. Why
would you want this version? What happens if two programs executing in
parallel call this function at the same time?
The C Standard says, "The implementation shall behave as if no library
function calls the tmpnam function." (See page 236.) What do you have to
do to satisfy this requirement?
Exercise 12.6 Implement the primitives _Fclose, _Fopen, _Fread, and _Fwrite for the
operating system you use. Do you have to write any assembly language?
Exercise 12.7 [Harder] Implement the functions _Fgetpos and _Fsetpos for an operating
system that terminates each text line with a carriage return plus line feed.
Exercise 12.8 [Harder] Write a function that converts a text string to long double by the
same rules that strtod uses for double. (See page 362.)
Exercise 12.9 [Very hard] Redesign the scan functions so they are more widely usable.
Devise a way to communicate scan failures to the calling program so that
it can:
    spot the failure more precisely
    try an alternate conversion
    recover gracefully from a read error
Figure 12.61: tstdio1.c, Part 1

/* test stdio functions, part 1 */
#include <assert.h>
#include <errno.h>
#include <float.h>
#include <math.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

    ...
    {   /* test vfprintf */
    ...
    }

static void vp(const char *fmt, ...)
    {   /* test vprintf */
    ...
    }

static void vsp(char *s, const char *fmt, ...)
    {   /* test vsprintf */
    ...
    }

int main()
    {   /* test basic workings of stdio functions */
    ...

Continuing tstdio1.c, Part 2

    printf("BUFSIZ = %u\n", BUFSIZ);
    printf("L_tmpnam = %u\n", L_tmpnam);
    printf("FILENAME_MAX = %u\n", FILENAME_MAX);
    printf("FOPEN_MAX = %u\n", FOPEN_MAX);
    printf("TMP_MAX = %u\n", TMP_MAX);
    vsp(buf, "SUC%c%s", 'C', "ESS");
    vfp("%s testing %s", buf, "<stdio.h>");
    vp(", part 1\n");
    return (0);
    }
Figure 12.62: tstdio2.c

/* test stdio functions, part 2 */
#include <assert.h>
#include <errno.h>
...

int main()
    {   /* test basic workings of stdio functions */
    char buf[32], tname[L_tmpnam], *tn;
    FILE *pf;
    static int macs[] = {
        _IOFBF, _IOLBF, _IONBF, BUFSIZ, EOF, FILENAME_MAX,
        FOPEN_MAX, TMP_MAX, SEEK_CUR, SEEK_END, SEEK_SET};
    ...
Chapter 13
The header <stdlib.h> declares four types and several functions of general utility, and
defines several macros.
The types declared are size_t and wchar_t (both described in 7.1.6),
    div_t
which is a structure type that is the type of the value returned by the div function, and
    ldiv_t
which is a structure type that is the type of the value returned by the ldiv function.
The macros defined are NULL (described in 7.1.6);
    EXIT_FAILURE
and
    EXIT_SUCCESS
which expand to integral expressions that may be used as the argument to the exit function to
return unsuccessful or successful termination status, respectively, to the host environment;
    RAND_MAX
which expands to an integral constant expression, the value of which is the maximum value
returned by the rand function; and
    MB_CUR_MAX
which expands to a positive integer expression whose value is the maximum number of bytes in
a multibyte character for the extended character set specified by the current locale (category
LC_CTYPE), and whose value is never greater than MB_LEN_MAX.
Description
The atof function converts the initial portion of the string pointed to by nptr to double
representation. Except for the behavior on error, it is equivalent to
    strtod(nptr, (char **)NULL)
Returns
The atof function returns the converted value.
7.10.1.2 The atoi function
Description
The atoi function converts the initial portion of the string pointed to by nptr to int
representation. Except for the behavior on error, it is equivalent to
    (int)strtol(nptr, (char **)NULL, 10)
Returns
The atoi function returns the converted value.
Forward references: the strtol function (7.10.1.5).
7.10.1.3 The atol function
Description
The atol function converts the initial portion of the string pointed to by nptr to long int
representation. Except for the behavior on error, it is equivalent to
    strtol(nptr, (char **)NULL, 10)
Returns
The atol function returns the converted value.
Forward references: the strtol function (7.10.1.5).
7.10.1.4 The strtod function
Description
The strtod function converts the initial portion of the string pointed to by nptr to double
representation. First, it decomposes the input string into three parts: an initial, possibly empty,
sequence of white-space characters (as specified by the isspace function), a subject sequence
resembling a floating-point constant; and a final string of one or more unrecognized characters,
including the terminating null character of the input string. Then, it attempts to convert the subject
sequence to a floating-point number, and returns the result.
The expected form of the subject sequence is an optional plus or minus sign, then a nonempty
sequence of digits optionally containing a decimal-point character, then an optional exponent part
as defined in 6.1.3.1, but no floating suffix. The subject sequence is defined as the longest initial
subsequence of the input string, starting with the first non-white-space character, that is of the
expected form. The subject sequence contains no characters if the input string is empty or consists
entirely of white space, or if the first non-white-space character is other than a sign, a digit, or a
decimal-point character.
If the subject sequence has the expected form, the sequence of characters starting with the first
digit or the decimal-point character (whichever occurs first) is interpreted as a floating constant
according to the rules of 6.1.3.1, except that the decimal-point character is used in place of a
period, and that if neither an exponent part nor a decimal-point character appears, a decimal point
is assumed to follow the last digit in the string. If the subject sequence begins with a minus sign,
the value resulting from the conversion is negated. A pointer to the final string is stored in the
object pointed to by endptr, provided that endptr is not a null pointer.
In other than the "C" locale, additional implementation-defined subject sequence forms may
be accepted.
If the subject sequence is empty or does not have the expected form, no conversion is
performed; the value of nptr is stored in the object pointed to by endptr, provided that
endptr is not a null pointer.
Returns
The strtod function returns the converted value, if any. If no conversion could be performed,
zero is returned. If the correct value is outside the range of representable values, plus or minus
HUGE_VAL is returned (according to the sign of the value), and the value of the macro ERANGE
is stored in errno. If the correct value would cause underflow, zero is returned and the value of
the macro ERANGE is stored in errno.
7.10.1.5 The strtol function
Synopsis
    #include <stdlib.h>
    long int strtol(const char *nptr, char **endptr, int base);
Description
The strtol function converts the initial portion of the string pointed to by nptr to long
int representation. First, it decomposes the input string into three parts: an initial, possibly empty,
sequence of white-space characters (as specified by the isspace function), a subject sequence
resembling an integer represented in some radix determined by the value of base, and a final
string of one or more unrecognized characters, including the terminating null character of the
input string. Then, it attempts to convert the subject sequence to an integer, and returns the result.
If the value of base is zero, the expected form of the subject sequence is that of an integer
constant as described in 6.1.3.2, optionally preceded by a plus or minus sign, but not including
an integer suffix. If the value of base is between 2 and 36, the expected form of the subject
sequence is a sequence of letters and digits representing an integer with the radix specified by
base, optionally preceded by a plus or minus sign, but not including an integer suffix. The letters
from a (or A) through z (or Z) are ascribed the values 10 to 35; only letters whose ascribed values
are less than that of base are permitted. If the value of base is 16, the characters 0x or 0X may
optionally precede the sequence of letters and digits, following the sign if present.
The subject sequence is defined as the longest initial subsequence of the input string, starting
with the first non-white-space character, that is of the expected form. The subject sequence
contains no characters if the input string is empty or consists entirely of white space, or if the first
non-white-space character is other than a sign or a permissible letter or digit.
If the subject sequence has the expected form and the value of base is zero, the sequence of
characters starting with the first digit is interpreted as an integer constant according to the rules
of 6.1.3.2. If the subject sequence has the expected form and the value of base is between 2 and
36, it is used as the base for conversion, ascribing to each letter its value as given above. If the
subject sequence begins with a minus sign, the value resulting from the conversion is negated. A
pointer to the final string is stored in the object pointed to by endptr, provided that endptr
is not a null pointer.
In other than the "C" locale, additional implementation-defined subject sequence forms may
be accepted.
If the subject sequence is empty or does not have the expected form, no conversion is
performed; the value of nptr is stored in the object pointed to by endptr, provided that
endptr is not a null pointer.
Returns
The strtol function returns the converted value, if any. If no conversion could be performed,
zero is returned. If the correct value is outside the range of representable values, LONG_MAX or
LONG_MIN is returned (according to the sign of the value), and the value of the macro ERANGE
is stored in errno.
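A short sketch shows how the base argument changes what strtol accepts. With a base of zero, the prefix of the text selects decimal, octal, or hexadecimal, just as for an integer constant. The wrapper name parse_long and its return codes are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Convert the leading part of text to long in the given base.
 * Returns 0 on success, -1 if no conversion was performed,
 * -2 if the value does not fit in a long.
 * Hypothetical wrapper for illustration. */
int parse_long(const char *text, int base, long *out)
{
    char *end;
    long val;

    errno = 0;
    val = strtol(text, &end, base);
    if (end == text)
        return -1;          /* no digits valid in this base */
    if (errno == ERANGE)
        return -2;          /* LONG_MAX or LONG_MIN was returned */
    *out = val;
    return 0;
}
```

With base zero, "0x1F" converts as hexadecimal and "017" as octal. With base 36, the single letter "z" converts to 35.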
7.10.1.6 The strtoul function
Synopsis
    #include <stdlib.h>
    unsigned long int strtoul(const char *nptr, char **endptr, int base);
Description
The strtoul function converts the initial portion of the string pointed to by nptr to
unsigned long int representation. First, it decomposes the input string into three parts: an
initial, possibly empty, sequence of white-space characters (as specified by the isspace
function), a subject sequence resembling an unsigned integer represented in some radix
determined by the value of base, and a final string of one or more unrecognized characters, including
the terminating null character of the input string. Then, it attempts to convert the subject sequence
to an unsigned integer, and returns the result.
If the value of base is zero, the expected form of the subject sequence is that of an integer
constant as described in 6.1.3.2, optionally preceded by a plus or minus sign, but not including
an integer suffix. If the value of base is between 2 and 36, the expected form of the subject
sequence is a sequence of letters and digits representing an integer with the radix specified by
base, optionally preceded by a plus or minus sign, but not including an integer suffix. The letters
from a (or A) through z (or Z) are ascribed the values 10 to 35; only letters whose ascribed values
are less than that of base are permitted. If the value of base is 16, the characters 0x or 0X may
optionally precede the sequence of letters and digits, following the sign if present.
The subject sequence is defined as the longest initial subsequence of the input string, starting
with the first non-white-space character, that is of the expected form. The subject sequence
contains no characters if the input string is empty or consists entirely of white space, or if the first
non-white-space character is other than a sign or a permissible letter or digit.
If the subject sequence has the expected form and the value of base is zero, the sequence of
characters starting with the first digit is interpreted as an integer constant according to the rules
of 6.1.3.2. If the subject sequence has the expected form and the value of base is between 2 and
36, it is used as the base for conversion, ascribing to each letter its value as given above.
If the subject sequence begins with a minus sign, the value resulting from the conversion is
negated. A pointer to the final string is stored in the object pointed to by endptr, provided that
endptr is not a null pointer.
In other than the "C" locale, additional implementation-defined subject sequence forms may
be accepted.
If the subject sequence is empty or does not have the expected form, no conversion is
performed; the value of nptr is stored in the object pointed to by endptr, provided that
endptr is not a null pointer.
Returns
The strtoul function returns the converted value, if any. If no conversion could be
performed, zero is returned. If the correct value is outside the range of representable values,
ULONG_MAX is returned, and the value of the macro ERANGE is stored in errno.
Description
The rand function computes a sequence of pseudo-random integers in the range 0 to
RAND_MAX.
The implementation shall behave as if no library function calls the rand function.
Returns
The rand function returns a pseudo-random integer.
Environmental limit
The value of the RAND_MAX macro shall be at least 32767.
7.10.2.2 The srand function
Description
The srand function uses the argument as a seed for a new sequence of pseudo-random
numbers to be returned by subsequent calls to rand. If srand is then called with the same seed
value, the sequence of pseudo-random numbers shall be repeated. If rand is called before any
calls to srand have been made, the same sequence shall be generated as when srand is first
called with a seed value of 1.
The implementation shall behave as if no library function calls the srand function.
Returns
The srand function returns no value.
Example
The following functions define a portable implementation of rand and srand.
    static unsigned long int next = 1;

    int rand(void)      /* RAND_MAX assumed to be 32767 */
    {
        next = next * 1103515245 + 12345;
        return (unsigned int)(next/65536) % 32768;
    }

    void srand(unsigned int seed)
    {
        next = seed;
    }
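The repeatability that srand promises is easy to check. The sketch below seeds the generator and records the first few values that rand returns. The helper name sample_rand is hypothetical; the actual values depend on the implementation, so only their equality across runs with the same seed is guaranteed.

```c
#include <stdlib.h>

/* Seed the generator, then store the first n values of rand in out.
 * Hypothetical helper for illustration. */
void sample_rand(unsigned int seed, int *out, int n)
{
    int i;

    srand(seed);
    for (i = 0; i < n; ++i)
        out[i] = rand();
}
```

Calling sample_rand twice with the same seed fills both arrays with the same sequence, and every value lies between 0 and RAND_MAX.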
7.10.3 Memory management functions
The order and contiguity of storage allocated by successive calls to the calloc, malloc,
and realloc functions is unspecified. The pointer returned if the allocation succeeds is suitably
aligned so that it may be assigned to a pointer to any type of object and then used to access such
an object or an array of such objects in the space allocated (until the space is explicitly freed or
reallocated). Each such allocation shall yield a pointer to an object disjoint from any other object.
The pointer returned points to the start (lowest byte address) of the allocated space. If the space
cannot be allocated, a null pointer is returned. If the size of the space requested is zero, the behavior
is implementation-defined; the value returned shall be either a null pointer or a unique pointer.
The value of a pointer that refers to freed space is indeterminate.
7.10.3.1 The calloc function
Synopsis
    #include <stdlib.h>
    void *calloc(size_t nmemb, size_t size);
Description
The calloc function allocates space for an array of nmemb objects, each of whose size is
size. The space is initialized to all bits zero.
Returns
The calloc function returns either a null pointer or a pointer to the allocated space.
7.10.3.2 The free function
Synopsis
    #include <stdlib.h>
    void free(void *ptr);
Description
The free function causes the space pointed to by ptr to be deallocated, that is, made available
for further allocation. If ptr is a null pointer, no action occurs. Otherwise, if the argument does
not match a pointer earlier returned by the calloc, malloc, or realloc function, or if the
space has been deallocated by a call to free or realloc, the behavior is undefined.
Returns
The free function returns no value.
7.10.3.3 The malloc function
Synopsis
    #include <stdlib.h>
    void *malloc(size_t size);
Description
The malloc function allocates space for an object whose size is specified by size and whose
value is indeterminate.
Returns
The malloc function returns either a null pointer or a pointer to the allocated space.
7.10.3.4 The realloc function
Synopsis
    #include <stdlib.h>
    void *realloc(void *ptr, size_t size);
Description
The realloc function changes the size of the object pointed to by ptr to the size specified
by size. The contents of the object shall be unchanged up to the lesser of the new and old sizes.
If the new size is larger, the value of the newly allocated portion of the object is indeterminate. If
ptr is a null pointer, the realloc function behaves like the malloc function for the specified
size. Otherwise, if ptr does not match a pointer earlier returned by the calloc, malloc, or
realloc function, or if the space has been deallocated by a call to the free or realloc
function, the behavior is undefined. If the space cannot be allocated, the object pointed to by ptr
is unchanged. If size is zero and ptr is not a null pointer, the object it points to is freed.
Returns
The realloc function returns either a null pointer or a pointer to the possibly moved allocated
space.
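Because a failed realloc leaves the original object unchanged, the result should go into a temporary pointer first. Assigning it straight back to the only copy of the old pointer would lose the block on failure. Here is a minimal sketch of that pattern; the name grow_ints and its parameters are hypothetical.

```c
#include <stdlib.h>

/* Grow *buf so it can hold at least want ints.
 * Returns 0 on success, -1 on failure. On failure *buf and *cap
 * are left untouched, so the old contents remain usable.
 * Hypothetical helper for illustration. */
int grow_ints(int **buf, size_t *cap, size_t want)
{
    int *tmp;

    if (want <= *cap)
        return 0;                       /* already large enough */
    if (want > (size_t)-1 / sizeof (int))
        return -1;                      /* byte count would overflow */
    tmp = realloc(*buf, want * sizeof (int));
    if (tmp == NULL)
        return -1;                      /* *buf still owns the old block */
    *buf = tmp;
    *cap = want;
    return 0;
}
```

Starting from a null pointer and a capacity of zero works too, since realloc then behaves like malloc.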
7.10.4.1 The abort function
Description
The abort function causes abnormal program termination to occur, unless the signal
SIGABRT is being caught and the signal handler does not return. Whether open output streams
are flushed or open streams closed or temporary files removed is implementation-defined. An
implementation-defined form of the status unsuccessful termination is returned to the host
environment by means of the function call raise(SIGABRT).
Returns
The abort function cannot return to its caller.
7.10.4.2 The atexit function
Synopsis
    #include <stdlib.h>
    int atexit(void (*func)(void));
Description
The atexit function registers the function pointed to by func, to be called without
arguments at normal program termination.
Implementation limits
The implementation shall support the registration of at least 32 functions.
Returns
The atexit function returns zero if the registration succeeds, nonzero if it fails.
Forward references: the exit function (7.10.4.3).
7.10.4.3 The exit function
Description
The exit function causes normal program termination to occur. If more than one call to the
exit function is executed by a program, the behavior is undefined.
First, all functions registered by the atexit function are called, in the reverse order of their
registration.
Next, all open streams with unwritten buffered data are flushed, all open streams are closed,
and all files created by the tmpfile function are removed.
Finally, control is returned to the host environment. If the value of status is zero or
EXIT_SUCCESS, an implementation-defined form of the status successful termination is
returned. If the value of status is EXIT_FAILURE, an implementation-defined form of the
status unsuccessful termination is returned. Otherwise the status returned is implementation-defined.
Returns
The exit function cannot return to its caller.
Description
The getenv function searches an environment list, provided by the host environment, for a
string that matches the string pointed to by name. The set of environment names and the method
for altering the environment list are implementation-defined.
The implementation shall behave as if no library function calls the getenv function.
Returns
The getenv function returns a pointer to a string associated with the matched list member.
The string pointed to shall not be modified by the program, but may be overwritten by a subsequent
call to the getenv function. If the specified name cannot be found, a null pointer is returned.
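Since a later call to getenv may overwrite the string, a program that needs the value for long should copy it at once. The following sketch does so into a caller-supplied buffer. The name env_copy and its return convention are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Copy the value of the environment name into buf.
 * Returns 0 on success, -1 if the name is not found or the value
 * (with its terminating null) does not fit in size bytes.
 * Hypothetical helper for illustration. */
int env_copy(const char *name, char *buf, size_t size)
{
    const char *val = getenv(name);
    size_t len;

    if (val == NULL)
        return -1;                  /* no such name */
    len = strlen(val);
    if (size <= len)
        return -1;                  /* buffer too small */
    memcpy(buf, val, len + 1);      /* copy before the next getenv call */
    return 0;
}
```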
system
Description
The system function passes the string pointed to by string to the host environment to be
executed by a command processor in an implementation-defined manner. A null pointer may be
used for string to inquire whether a command processor exists.
Returns
If the argument is a null pointer, the system function returns nonzero only if a command
processor is available. If the argument is not a null pointer, the system function returns an
implementation-defined value.
bsearch
Description
The bsearch function searches an array of nmemb objects, the initial element of which is
pointed to by base, for an element that matches the object pointed to by key. The size of each
element of the array is specified by size.
The comparison function pointed to by compar is called with two arguments that point to the
key object and to an array element, in that order. The function shall return an integer less than,
equal to, or greater than zero if the key object is considered, respectively, to be less than, to match,
or to be greater than the array element. The array shall consist of: all the elements that compare
less than, all the elements that compare equal to, and all the elements that compare greater than
the key object, in that order.129
Returns
The bsearch function returns a pointer to a matching element of the array, or a null pointer
if no match is found. If two elements compare as equal, which element is matched is unspecified.
qsort
Description
The qsort function sorts an array of nmemb objects, the initial element of which is pointed
to by base. The size of each object is specified by size.
The contents of the array are sorted into ascending order according to a comparison function
pointed to by compar, which is called with two arguments that point to the objects being
compared. The function shall return an integer less than, equal to, or greater than zero if the first
argument is considered to be respectively less than, equal to, or greater than the second.
If two elements compare as equal, their order in the sorted array is unspecified.
Returns
The qsort function returns no value.
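The two descriptions above share one comparison-function contract, so a single function can serve both calls. The following is a minimal sketch for an array of int; the names cmp_int and sort_and_find are hypothetical.

```c
#include <stdlib.h>

/* Comparison function usable by both qsort and bsearch for int elements. */
static int cmp_int(const void *pa, const void *pb)
{
    int a = *(const int *)pa;
    int b = *(const int *)pb;

    return (a > b) - (a < b);   /* avoids the overflow risk of a - b */
}

/* Sort table into ascending order, then look up key.
   Returns a pointer to a matching element, or NULL if none matches. */
int *sort_and_find(int *table, size_t n, int key)
{
    qsort(table, n, sizeof table[0], cmp_int);
    return bsearch(&key, table, n, sizeof table[0], cmp_int);
}
```

Sorting with the same function that bsearch later uses is the simplest way to guarantee the ordering that bsearch requires.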
abs
Description
The abs function computes the absolute value of an integer j. If the result cannot be
represented, the behavior is undefined.130
Returns
The abs function returns the absolute value.
div
Description
The div function computes the quotient and remainder of the division of the numerator numer
by the denominator denom. If the division is inexact, the resulting quotient is the integer of lesser
magnitude that is the nearest to the algebraic quotient. If the result cannot be represented, the
behavior is undefined; otherwise, quot * denom + rem shall equal numer.
Returns
The div function returns a structure of type div_t, comprising both the quotient and the
remainder. The structure shall contain the following members, in either order:
int quot; /* quotient */
int rem; /* remainder */
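The identity stated in the description can be checked directly. This is a minimal sketch; the helper name div_identity_holds is hypothetical, and the caller is assumed to pass a nonzero denom and to avoid the one unrepresentable case (the most negative int divided by -1).

```c
#include <stdlib.h>

/* Returns 1 if div() satisfies quot * denom + rem == numer.
   denom must be nonzero and the quotient must be representable. */
int div_identity_holds(int numer, int denom)
{
    div_t r = div(numer, denom);

    return r.quot * denom + r.rem == numer;
}
```

For example, div(-7, 2) yields a quotient of -3 and a remainder of -1, because the quotient is the integer of lesser magnitude nearest the algebraic quotient -3.5.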
labs
Description
The labs function is similar to the abs function, except that the argument and the returned
value each have type long int.
ldiv
Description
The ldiv function is similar to the div function, except that the arguments and the members
of the returned structure (which has type ldiv_t) all have type long int.
mblen
Description
If s is not a null pointer, the mblen function determines the number of bytes contained in the
multibyte character pointed to by s. Except that the shift state of the mbtowc function is not
affected, it is equivalent to
mbtowc((wchar_t *)0, s, n);
mbtowc
Description
If s is not a null pointer, the mbtowc function determines the number of bytes that are
contained in the multibyte character pointed to by s. It then determines the code for the value of
type wchar_t that corresponds to that multibyte character. (The value of the code corresponding
to the null character is zero.) If the multibyte character is valid and pwc is not a null pointer, the
mbtowc function stores the code in the object pointed to by pwc. At most n bytes of the array
pointed to by s will be examined.
The implementation shall behave as if no library function calls the mbtowc function.
Returns
If s is a null pointer, the mbtowc function returns a nonzero or zero value, if multibyte
character encodings, respectively, do or do not have state-dependent encodings. If s is not a null
pointer, the mbtowc function either returns 0 (if s points to the null character), or returns the
number of bytes that are contained in the converted multibyte character (if the next n or fewer
bytes form a valid multibyte character), or returns -1 (if they do not form a valid multibyte
character).
In no case will the value returned be greater than n or the value of the MB_CUR_MAX macro.
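The return values described above are enough to walk a multibyte string one character at a time. The following is a minimal sketch; the function name count_mbchars is hypothetical, and the result depends on the LC_CTYPE category of the current locale.

```c
#include <stdlib.h>
#include <string.h>

/* Count the multibyte characters in s, or return -1 on an invalid sequence.
   The shift state is reset first by calling mbtowc with a null string. */
long count_mbchars(const char *s)
{
    size_t left = strlen(s);
    long count = 0;
    int n;

    mbtowc(NULL, NULL, 0);              /* reset to the initial shift state */
    while (left > 0) {
        n = mbtowc(NULL, s, left);
        if (n < 0)
            return -1;                  /* invalid multibyte character */
        if (n == 0)
            break;                      /* reached the null character */
        s += n;
        left -= (size_t)n;
        ++count;
    }
    return count;
}
```

In the "C" locale every character occupies one byte, so the count equals the result of strlen.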
wctomb
Description
The wctomb function determines the number of bytes needed to represent the multibyte
character corresponding to the code whose value is wchar (including any change in shift state).
It stores the multibyte character representation in the array object pointed to by s (if s is not a
null pointer). At most MB_CUR_MAX characters are stored. If the value of wchar is zero, the
wctomb function is left in the initial shift state.
The implementation shall behave as if no library function calls the wctomb function.
Returns
If s is a null pointer, the wctomb function returns a nonzero or zero value, if multibyte
character encodings, respectively, do or do not have state-dependent encodings. If s is not a null
pointer, the wctomb function returns -1 if the value of wchar does not correspond to a valid
multibyte character, or returns the number of bytes that are contained in the multibyte character
corresponding to the value of wchar.
In no case will the value returned be greater than the value of the MB_CUR_MAX
macro.
mbstowcs
Description
The mbstowcs function converts a sequence of multibyte characters that begins in the initial
shift state from the array pointed to by s into a sequence of corresponding codes and stores not
more than n codes into the array pointed to by pwcs. No multibyte characters that follow a null
character (which is converted into a code with value zero) will be examined or converted. Each
multibyte character is converted as if by a call to the mbtowc function, except that the shift state
of the mbtowc function is not affected.
No more than n elements will be modified in the array pointed to by pwcs. If copying takes
place between objects that overlap, the behavior is undefined.
Returns
If an invalid multibyte character is encountered, the mbstowcs function returns (size_t)-1.
Otherwise, the mbstowcs function returns the number of array elements modified, not
including a terminating zero code, if any.132
7.10.8.2 The wcstombs function
Synopsis
#include <stdlib.h>
size_t wcstombs(char *s, const wchar_t *pwcs, size_t n);
Description
The wcstombs function converts a sequence of codes that correspond to multibyte characters
from the array pointed to by pwcs into a sequence of multibyte characters that begins in the initial
shift state and stores these multibyte characters into the array pointed to by s, stopping if a
multibyte character would exceed the limit of n total bytes or if a null character is stored. Each
code is converted as if by a call to the wctomb function, except that the shift state of the wctomb
function is not affected.
No more than n bytes will be modified in the array pointed to by s. If copying takes place
between objects that overlap, the behavior is undefined.
Returns
If a code is encountered that does not correspond to a valid multibyte character, the wcstombs
function returns (size_t)-1. Otherwise, the wcstombs function returns the number of bytes
modified, not including a terminating null character, if any.132
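Footnote 132 matters in practice: a return value equal to n means the output was not terminated. The following minimal sketch converts a string to wide characters and back while checking for that case. The function name mb_round_trip and the capacity MAX_WIDE are hypothetical.

```c
#include <stdlib.h>

#define MAX_WIDE 64     /* illustrative capacity, in wide characters */

/* Convert src to wide characters and back into dst (capacity dstlen bytes).
   Returns 0 on success, -1 on a conversion failure or on truncation. */
int mb_round_trip(const char *src, char *dst, size_t dstlen)
{
    wchar_t wide[MAX_WIDE];
    size_t nw, nb;

    nw = mbstowcs(wide, src, MAX_WIDE);
    if (nw == (size_t)-1 || nw == MAX_WIDE)
        return -1;      /* invalid input, or no room for the zero code */
    nb = wcstombs(dst, wide, dstlen);
    if (nb == (size_t)-1 || nb == dstlen)
        return -1;      /* unconvertible code, or result not terminated */
    return 0;
}
```

Treating a full buffer as a failure is a design choice made for this sketch; a caller that can accept unterminated output would test only for (size_t)-1.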
Footnotes
126. See "future library directions" (7.13.7).
127. Note that this need not be the same as the representation of floating-point zero or a null
pointer constant.
128. Each function is called as many times as it was registered.
129. In practice, the entire array is sorted according to the comparison function.
130. The absolute value of the most negative number cannot be represented in two's complement.
131. If the implementation employs special bytes to change the shift state, these bytes do not
produce separate wide character codes, but are grouped with an adjacent multibyte character.
132. The array will not be null- or zero-terminated if the value returned is n.
Using <stdlib.h>
you can easily consume four to eight times as much storage on the heap.
The heap is also subject to fragmentation. Allocating and freeing data
objects on the heap in arbitrary order inevitably leaves unusable holes
between some of the allocated data objects. That too lowers the usable size
of the heap.
Don't overreact to this knowledge. Gather related data into a structure
and allocate it all at once. That minimizes heap overhead, to be sure, but it
is also good programming style. Do not gather unrelated data just to save
heap overhead. Similarly, allocate data objects with similar lifetimes all at
once, then free them at about the same time. That minimizes heap fragmentation, but it too is good style. Do not advance or defer unrelated heap
operations just to minimize fragmentation. The storage allocation functions
are an important aid to programming flexibility. Use them as they are
intended to be used.
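As a minimal sketch of gathering related data into one allocation, the following stores a small header and the text it describes in a single heap block. The type Record and the function record_new are hypothetical and not part of the library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: the header and its name text share one heap block. */
typedef struct {
    size_t len;
    char *name;     /* points just past the structure, inside the same block */
} Record;

/* Allocate the record and its name together. Release with a single free(). */
Record *record_new(const char *name)
{
    size_t len = strlen(name);
    Record *r = malloc(sizeof *r + len + 1);

    if (r == NULL)
        return NULL;
    r->len = len;
    r->name = (char *)(r + 1);
    memcpy(r->name, name, len + 1);
    return r;
}
```

One allocation pays the per-block overhead once, and the two pieces are guaranteed the same lifetime.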
The other group of related functions helps you manipulate large character
sets. Standard C added this group in response to the rapidly growing
use of Kanji and other large character sets in computer-based products. The
functions support two representations for such character sets:
Multibyte characters are sequences of one or more codes, where each
code can be represented in a C character data type. (The character data
types are char, signed char, and unsigned char. All are the same size in a
given implementation. That size is at least eight bits.) A subset of any
multibyte encoding is the basic C character set, each character of which
is a sequence of length one.
Wide characters are integers of type wchar_t, defined in both <stddef.h>
and <stdlib.h>. (Assume that wchar_t can be any integer type from char
to unsigned long.) Such an integer can represent distinct codes for each
of the characters in the large character set. The codes for the basic C
character set have the same values as their single-character forms.
Multibyte characters are convenient for communicating between the program and the outside world. Magnetic storage and communications links
have evolved to support sequences of eight-bit characters. Wide characters
are convenient for manipulating text within a program. Their fixed size
simplifies handling both individual characters and arrays of characters.
The C Standard defines only the bare minimum needed to support these
two encodings. mblen, mbstowcs, and mbtowc help you translate from multibyte characters to wide characters. wcstombs and wctomb help you do the
reverse. You can be sure that more elaborate sets of functions will soon be
standardized. For now, however, this is what you have.
You may have no immediate intention to write programs that are fluent
with large character sets. That should not deter you from writing programs
that are tolerant of large character sets as much as possible. See, for example,
how such characters can appear in the formats used by the print and scan
functions, declared in <stdio.h>, and by strftime, declared in <time.h>.
I conclude with the usual description of the individual macros defined
and functions declared in <stdlib.h>:
EXIT_FAILURE - Use this macro as the argument to exit or the return
value from main to report unsuccessful program termination. Any other
nonzero value you use instead may have different meanings for different
operating systems.
EXIT_SUCCESS - Use this macro as the argument to exit or the return
value from main to report successful program termination. You can also use
zero. Any other value you use may have different meanings for different
operating systems.
MB_CUR_MAX - No multibyte sequence that defines a single wide character
will be longer than MB_CUR_MAX in the current locale. You can declare a
character buffer of size MB_LEN_MAX, defined in <limits.h>, then safely store
MB_CUR_MAX characters in the initial elements of the buffer. Calling mbtowc
with a third argument of at least MB_CUR_MAX is always sufficient for the
function to determine the next wide character in a valid multibyte sequence. See the example for wctomb on page 352.
RAND_MAX - Use this value to scale values returned from rand. For
example, if you want random numbers of type float distributed over the
interval [0.0, 1.0], write the expression (float)rand() / RAND_MAX. The value
of RAND_MAX is at least 32,767.
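Wrapped in a function, the scaling expression looks like the following minimal sketch. The name rand_unit is hypothetical.

```c
#include <stdlib.h>

/* Return a pseudo-random float in the closed interval [0.0, 1.0]. */
float rand_unit(void)
{
    return (float)rand() / RAND_MAX;
}
```

The cast matters: without it the division is performed in integer arithmetic and yields zero for every value of rand() below RAND_MAX.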
size_t - See page 219.
wchar_t - See page 219.
div_t - Declare a data object of this type to store the value returned by
div, described below.
ldiv_t - Declare a data object of this type to store the value returned
by ldiv, described below.
abort - Call this function only when things go terribly wrong. It
effectively calls raise(SIGABRT), as described in Chapter 9: <signal.h>.
That gives a signal handler for SIGABRT the opportunity to perform any
last-minute operations. On the other hand, you can't be assured that
input/output streams are flushed, files closed properly, or temporary files
removed. Whenever possible, call exit(EXIT_FAILURE) instead.
abs - Call abs(x) instead of writing the idiom x < 0 ? -x : x. A growing
number of Standard C translators generate inline code for abs that is
smaller and faster than the idiom. In addition, you avoid the occasional
surprise when you inadvertently evaluate twice an expression with side
effects. Note that on a two's-complement machine, abs can generate an
overflow. (See page 77.)
atexit - Use this function to register another function to be called when
the program is about to terminate. You may, for example, create a set of
temporary files that you wish to remove before the program terminates.
Write the function void tidy(void) to remove the files. Call atexit(&tidy)
once you store the name of the first file to remove. When main returns or a
atof
atoi
atol
bsearch
Entry *lookup(char *key)
    {   /* lookup key in table */
    return (bsearch(key, symtab,
        sizeof symtab / sizeof symtab[0],
        sizeof symtab[0], &cmp));
    }
A few caveats:
If a key compares equal to two or more elements, bsearch can return a
pointer to any of these elements.
Beware of changes in how elements sort when the execution character
set changes - call qsort, described below, with a compatible comparison function to ensure that an array is properly ordered.
calloc
div
exit
free
getenv
You are not obliged to free storage that you allocate. A good discipline,
however, is to free all allocated storage as soon as possible. Freed storage
can be reallocated, making better use of a limited resource. Moreover, some
implementations can report storage allocated at program termination. That
helps you locate places where you unintentionally fail to free storage.
getenv - Use this function to obtain a pointer to the value string
associated with an environment variable. (See page 82.) If you name an
environment variable that has no definition, you get a null pointer as the
value of the function. Don't alter the value string. A subsequent call to
getenv can alter the string, however. To allocate a private copy, write
something like:
#include <stdlib.h>
#include <string.h>

char *copyenv(const char *name)
    {   /* get and copy environment variable */
    char *s1 = getenv(name);
    char *s2 = s1 ? malloc(strlen(s1) + 1) : NULL;

    return (s2 ? strcpy(s2, s1) : NULL);
    }
labs - See the discussion of abs, above.
ldiv
malloc
mblen
mbstowcs
mbtowc
qsort
, const void
*) ) Sstrcmp.
Figure 13.1:
strtod
Pattern
realloc
strtoul
system
wcstombs
wctomb
#include <limits.h>
#include <stdlib.h>

int wccheck(wchar_t *wcs)
    {   /* return zero if wcs is valid */
    char buf[MB_LEN_MAX];
    int n;

    for (wctomb(NULL, 0); ; ++wcs)
        if ((n = wctomb(buf, *wcs)) <= 0)
            return (-1);
        else if (buf[n - 1] == '\0')
            return (0);
    }
header <yvals.h>
macro _EXFAIL
data object _Mbcurmax
type _Cmpfun
function abs
function div
labs
ldiv
function qsort
Figure 13.3: stdlib.h Part 1
/* stdlib.h standard header */
#ifndef _STDLIB
#define _STDLIB
#ifndef _YVALS
#include <yvals.h>
#endif
        /* macros */
#define NULL            _NULL
#define EXIT_FAILURE    _EXFAIL
#define EXIT_SUCCESS    0
#define MB_CUR_MAX      _Mbcurmax
#define RAND_MAX        32767
        /* type definitions */
#ifndef _SIZET
#define _SIZET
typedef _Sizet size_t;
#endif
#ifndef _WCHART
#define _WCHART
typedef _Wchart wchar_t;
#endif
typedef struct {
    int quot;
    int rem;
    } div_t;
typedef struct {
    long quot;
    long rem;
    } ldiv_t;
typedef int _Cmpfun(const void *, const void *);
typedef struct {
    unsigned char _State;
    unsigned short _Wchar;
    } _Mbsave;
        /* declarations */
void abort(void);
int abs(int);
int atexit(void (*)(void));
double atof(const char *);
int atoi(const char *);
long atol(const char *);
void *bsearch(const void *, const void *,
    size_t, size_t, _Cmpfun *);
void *calloc(size_t, size_t);
div_t div(int, int);
void exit(int);
void free(void *);
char *getenv(const char *);
long labs(long);
ldiv_t ldiv(long, long);
void *malloc(size_t);
int mblen(const char *, size_t);
size_t mbstowcs(wchar_t *, const char *, size_t);
int mbtowc(wchar_t *, const char *, size_t);
void qsort(void *, size_t, size_t, _Cmpfun *);
Continuing stdlib.h Part 2
Figure 13.4: abs.c
/* abs function */
#include <stdlib.h>

int (abs)(int i)
    {   /* compute absolute value of int argument */
    return ((i < 0) ? -i : i);
    }
Figure 13.6: labs.c
/* labs function */
#include <stdlib.h>

long (labs)(long i)
    {   /* compute absolute value of long argument */
    return ((i < 0) ? -i : i);
    }
Figure 13.7: ldiv.c
/* ldiv function */
#include <stdlib.h>

ldiv_t
Figure 13.8: qsort.c Part 1
/* qsort function */
#include <stdlib.h>
#include <string.h>

        /* macros */
#define MAX_BUF 256

Continuing qsort.c Part 2
            if (i < j)
                {
                char buf[MAX_BUF];
                char *q1 = qi;
                char *q2 = qj;
                size_t m, ms;

            char buf[MAX_BUF];
            char *q1 = qi;
            char *q2 = qp;
            size_t m, ms;

            for (ms = size; 0 < ms; ms -= m, q1 += m, q2 += m)
                {   /* swap as many as possible */
                }
        j = n - i - 1, qi += size;
        if (j < i)
Figure 13.9: bsearch.c
/* bsearch function */
#include <stdlib.h>

void *(bsearch)(const void *key, const void *base,
    size_t nelem, size_t size, _Cmpfun *cmp)
    {   /* search sorted table by binary chop */
    const char *p;
    size_t n;

    for (p = (const char *)base, n = nelem; 0 < n; )
        {   /* check midpoint of whatever is left */
        const size_t pivot = n >> 1;
        const char *const q = p + size * pivot;
        const int val = (*cmp)(key, q);

        if (val < 0)
            n = pivot;
        else if (val == 0)
            return ((void *)q);
        else
            {
            p = q + size;
            n -= pivot + 1;
            }
        }
    return (NULL);  /* no match */
    }
Figure 13.10: rand.c
/* rand function */
#include <stdlib.h>

        /* the seed */
unsigned long _Randseed = 1;

int (rand)(void)
    _Randseed = _Randseed
    _Randseed = seed;
function srand
function _Stoul
Figure 13.12: xstoul.c Part 1
/* _Stoul function */
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

        /* macros */
#define BASE_MAX    36  /* largest valid base */

        /* static data */
static const char digits[] = {  /* valid digits */
    "0123456789abcdefghijklmnopqrstuvwxyz"};
static const char ndigs[BASE_MAX+1] = { /* 32-bits! */
    0, 0, 33, 21, 17, 14, 13, 12, 11, 11,
    10, 10, 9, 9, 9, 9, 9, 8, 8, 8,
    8, 8, 8, 8, 7, 7, 7, 7, 7, 7,
    7, 7, 7, 7, 7, 7, 7};

unsigned long _Stoul(const char *s, char **endptr, int base)
    {   /* convert string to unsigned long, with checking */
    const char *sc, *sd;
    const char *s1, *s2;
    char sign;
    ptrdiff_t n;
    unsigned long x, y;

    for (sc = s; isspace(*sc); ++sc)
        ;
    sign = *sc == '-' || *sc == '+' ? *sc++ : '+';
    if (base < 0 || base == 1 || BASE_MAX < base)
        {   /* silly base */
        if (endptr)
            *endptr = (char *)s;
        return (0);
        }
    else if (base)
        {   /* strip 0x or 0X */
        if (base == 16 && *sc == '0'
            && (sc[1] == 'x' || sc[1] == 'X'))
            sc += 2;
        }
    else if (*sc != '0')
        base = 10;
    else if (sc[1] == 'x' || sc[1] == 'X')
        base = 16, sc += 2;
    else
        base = 8;
    for (s1 = sc; *sc == '0'; ++sc)
        ;   /* skip leading zeros */
    x = 0;
Continuing xstoul.c Part 2
    for (s2 = sc; (sd = memchr(digits,
        tolower(*sc), base)) != NULL; ++sc)
        {   /* accumulate digits */
        y = x;  /* for overflow checking */
        x = x * base + (sd - digits);
        }
    if (s1 == sc)
        {   /* check string validity */
        if (endptr)
            *endptr = (char *)s;
        return (0);
        }
    n = sc - s2 - ndigs[base];
    if (n < 0)
        ;
    else if (0 < n || x < x - sc[-1]
        || (x - sc[-1]) / base != y)
        {   /* overflow */
        errno = ERANGE;
        x = ULONG_MAX;
        }
    if (sign == '-')    /* get final value */
        x = -x;
    if (endptr)
        *endptr = (char *)sc;
    return (x);
    }
/* convert string to int */
atol.c
/* convert string to long */
Figure 13.15: strtoul.c
/* strtoul function */
#include <stdlib.h>
Figure 13.16: strtol.c
/* strtol function */
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

long (strtol)(const char *s, char **endptr, int base)
    {   /* convert string to long, with checking */
    const char *sc;
    unsigned long x;

    for (sc = s; isspace(*sc); ++sc)
        ;
    x = _Stoul(s, endptr, base);    /* not sc! */
    if (*sc == '-' && x <= LONG_MAX)
        {   /* negative number overflowed */
        errno = ERANGE;
        return (LONG_MIN);
        }
    else if (*sc != '-' && LONG_MAX < x)
        {   /* positive number overflowed */
        errno = ERANGE;
        return (LONG_MAX);
        }
    else
        return ((long)x);
    }
Figure 13.17: atof.c
/* atof function */
#include <stdlib.h>

double (atof)(const char *s)
    {   /* convert string to double */
    return (_Stod(s, NULL));
    }
Note the rare use of the type ptrdiff_t, defined in <stddef.h>. It ensures
that n can hold the signed difference between two pointers. As I warned on
page 218, ptrdiff_t is not a completely safe type. An argument string with
over 32,767 significant digits can fail to report overflow on a computer with
16-bit pointers. That is an unlikely occurrence, but it can happen. Still, it is
tedious to write the test completely safely. I chose speed in this case over
absolute safety.
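From the caller's side, the overflow checking discussed above shows up as errno set to ERANGE and an end pointer that tells how much of the string was consumed. The following minimal sketch of a checked conversion uses only the standard strtol interface. The wrapper name parse_long and its rule that trailing text is an error are assumptions made for illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse a decimal long from text. Returns 0 on success and stores the value
   in *out. Returns -1 if no digits were found, if trailing text remains, or
   if the value is out of range for long. */
int parse_long(const char *text, long *out)
{
    char *end;
    long val;

    errno = 0;
    val = strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return -1;          /* no conversion, or trailing text */
    if (errno == ERANGE)
        return -1;          /* overflow: strtol returned LONG_MAX or LONG_MIN */
    *out = val;
    return 0;
}
```

Clearing errno before the call is essential, since strtol leaves it untouched on success.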
Figure 13.13 through Figure 13.15 show the files atoi.c, atol.c, and
strtoul.c, respectively. These all define functions that call _Stoul directly.
Note that atoi and atol can overflow. The C Standard does not require that
function strtol
atof
strtod
function _Stod
mbtowc
mblen
function mbstowcs
Figure 13.19: xstod.c Part 1
/* _Stod function */
#include <ctype.h>
#include <float.h>
#include <limits.h>
#include <locale.h>
#include <stdlib.h>
#include "xmath.h"

#define SIG_MAX 32

double _Stod(const char *s, char **endptr)
    {   /* convert string to double, with checking */
    const char point = localeconv()->decimal_point[0];
    const char *sc;
    char buf[SIG_MAX], sign;
    double x;
    int ndigit, nsig, nzero, olead, opoint;

    for (sc = s; isspace(*sc); ++sc)
        ;
    sign = *sc == '-' || *sc == '+' ? *sc++ : '+';

    if (ndigit == 0)
        {   /* set endptr */
        if (endptr)
            *endptr = (char *)s;
        return (0.0);
        }
    for (; 0 < nsig && buf[nsig - 1] == 0; --nsig)
        ;   /* skip trailing digits */
Continuing xstod.c Part 2
    {   /* compute significand */
    const char *pc = buf;
    int n;
    long lo[SIG_MAX/8+1];
    long *pl = &lo[nsig >> 3];
    static double fac[] = {0, 1e8, 1e16, 1e24, 1e32};

    for (*pl = 0, n = nsig; 0 < n; --n)
        if ((n & 07) == 0)  /* start new sum */
            *--pl = *pc++;
        else
            *pl = *pl * 10 + *pc++;
    for (x = (double)lo[0], n = 0; ++n <= (nsig >> 3); )
        if (lo[n] != 0)
            x += fac[n] * (double)lo[n];
    }
    {   /* fold in any explicit exponent */
    long lexp = 0;
    short sexp;

    if (*sc == 'e' || *sc == 'E')
        {   /* parse exponent */
        const char *scsav = sc;
        const char esign = *++sc == '+' || *sc == '-'
            ? *sc++ : '+';

        if (!isdigit(*sc))
            sc = scsav;
        else
            {   /* exponent looks valid */
            for (; isdigit(*sc); ++sc)
                if (lexp < 100000)  /* else overflow */
                    lexp = lexp * 10 + *sc - '0';
            if (esign == '-')
                lexp = -lexp;
            }
        }
    if (endptr)
        *endptr = (char *)sc;
    if (opoint < 0)
        lexp += ndigit - nsig;
    else
        lexp += opoint - olead - nsig;
    sexp = lexp < SHRT_MIN ? SHRT_MIN : lexp < SHRT_MAX
        ? (short)lexp : SHRT_MAX;
    x = _Dtento(x, sexp);
    return (sign == '-' ? -x : x);
    }
    }
Figure 13.20: mblen.c
/* mblen function */
#include <stdlib.h>
        /* static data */
_Mbsave _Mbxlen = {0};
Figure 13.21: mbtowc.c
/* mbtowc function */
#include <stdlib.h>
        /* static data */
_Mbsave _Mbxtowc = {0};
Figure 13.22: mbstowcs.c
/* mbstowcs function */
#include <stdlib.h>
    for (pwc = wcs; 0
    return (pwc - wcs);
function _Mbtowc
xmbtowc.c.
multibyte sequence far enough to develop the next wide character that it
represents. It does so as a finite-state machine executing the state table
stored at _Mbstate, defined in the file xstate.c. (See page 107.)
_Mbtowc must be particularly cautious because _Mbstate can be flawed.
It can change with locale category LC_CTYPE in ways that the Standard C
library cannot control.
Figure 13.23: xmbtowc.c
/* _Mbtowc function */
#include <limits.h>
#include <stdlib.h>
#include "xstate.h"
        *ps = initial;
        return (_Mbstate._Tab[0][0] & ST_STATE);
    ps->_State = _NSTATE;
    return (-1);    /* error return */
Note the various ways that the function can elect to take an error return:
if a transfer occurs to an undefined state
if no state table exists for a given state
if the multibyte string ends part way through a multibyte character
if the function makes so many state transitions since generating a wide
character that it must be looping
if the state table entry specifically signals an error
The rest of _Mbtowc is simple by comparison. The function retains the
wide-character accumulator (ps->_Wchar) as part of the state memory. That
simplifies generating a sequence of wide characters with a common component while in a given shift state. _Mbtowc returns after delivering each
wide character.
Figure 13.24 shows the file wctomb.c. The function wctomb calls the
internal function _Wctomb solely to provide separate state memory. In this
case, the shift state can be stored in a data object of type char. The data object
_Wcxtomb has a name with external linkage so that the header <stdlib.h>
can define a masking macro for wctomb.
Figure 13.25 shows the file wcstombs.c. The function wcstombs calls
_Wctomb repeatedly to translate a wide-character string to a multibyte
string. It too provides its own state memory, but it need not retain the shift
state between calls.
What makes this function complex is the finite length of the char array
it writes. If at least MB_CUR_MAX elements remain, _Wctomb can deliver characters directly. Otherwise, wcstombs must store the generated characters in
an array of length MB_LEN_MAX and deliver as many as it can.
Figure 13.26 shows the file xwctomb.c. The function _Wctomb converts a
wide character to the one or more characters that comprise its multibyte
representation. It does so as a finite-state machine executing the state table
stored at _Wcstate, defined in the file xstate.c. (See page 107.)
_Wctomb must also be cautious because _Wcstate can also be flawed. It
can change with locale category LC_CTYPE in ways that the Standard C
library cannot control. Note the various ways that the function can elect to
take an error return:
if a transfer occurs to an undefined state
if no state table exists for a given state
if the generated multibyte string threatens to become longer than
MB_CUR_MAX characters
if the function makes so many state transitions since generating a character that it must be looping
if the state table entry specifically signals an error
The rest of _Wctomb is likewise simple by comparison. It returns after
consuming each input wide character.
Figure 13.24: wctomb.c
        /* static data */
char _Wcxtomb = {0};
Figure 13.25: wcstombs.c
/* wcstombs function */
#include <limits.h>
    for (sc = s;
        sc += i;
        if (sc[-1] == '\0')
            return (sc - s - 1);
    return (sc - s);
Figure 13.26: xwctomb.c
/* _Wctomb function */
#include <limits.h>
#include <stdlib.h>
#include "xstate.h"

        {   /* set initial state */
        *ps = initial;
        return (_Mbstate._Tab[0][0] & ST_STATE);
        }
    {   /* run finite state machine */
    char state = *ps;
    int leave = 0;
    int limit = 0;
    int nout = 0;
    unsigned short wc = wcin;

    for (; ; )
        {   /* perform a state transformation */
        unsigned short code;
        const unsigned short *stab;

        if (_NSTATE <= state
            || (stab = _Wcstate._Tab[state]) == NULL
            || MB_CUR_MAX <= nout
            || (_NSTATE*UCHAR_MAX) <= ++limit
            || (code = stab[wc & UCHAR_MAX]) == 0)
            break;
        state = (code & ST_STATE) >> ST_STOFF;
        if (code & ST_FOLD)
            wc = wc & ~UCHAR_MAX | code & ST_CH;
        if (code & ST_ROTATE)
            wc = wc >> CHAR_BIT & UCHAR_MAX | wc << CHAR_BIT;
        if (code & ST_OUTPUT)
            {   /* produce an output char */
            if ((s[nout++] = code & ST_CH ? code : wc) == '\0')
                leave = 1;
            limit = 0;
            }
        if (code & ST_INPUT || leave)
            {   /* consume input */
            *ps = state;
            return (nout);
            }
        }
    *ps = _NSTATE;
    return (-1);
    }
    }
#include <stddef.h>
#include <stdlib.h>
#ifndef _YVALS
#include <yvals.h>
#endif
        /* macros */
#define CELL_OFF    (sizeof (size_t) + _MEMBND & ~_MEMBND)
#define SIZE_BLOCK  512 /* minimum block size */
#define SIZE_CELL   \
    ((sizeof (_Cell) + _MEMBND & ~_MEMBND) - CELL_OFF)
        /* type definitions */
typedef struct _Cell {
    size_t _Size;
    struct _Cell *_Next;
    } _Cell;
typedef struct {
    _Cell **_Plast;
    _Cell *_Head;
    } _Altab;
        /* declarations */
void *_Getmem(size_t);
extern _Altab _Aldata;
header "xalloc.h"
macro CELL_OFF
storage boundaries
macro _MEMBND
assume that a worst-case storage boundary exists. Any data object aligned
on such a boundary is thus suitably aligned. The internal header <yvals.h>
defines the macro _MEMBND to specify this worst-case storage boundary. For
a boundary of 2^N, the macro has the value 2^N - 1. On an Intel 80X86 computer,
for example, the macro can be zero (no constraints). You should probably
make it at least 1 (two-byte boundaries). For such a computer with 32-bit
memory, you might want to make it 3 (four-byte boundaries).
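The mask form of the boundary makes rounding up a two-operation expression, which is the pattern that the macros CELL_OFF and SIZE_CELL rely on. The following is a minimal sketch; the function name round_up is hypothetical, and mask plays the role of _MEMBND.

```c
#include <stddef.h>

/* Round n up to the next multiple of (mask + 1), where mask is 2^N - 1. */
size_t round_up(size_t n, size_t mask)
{
    return (n + mask) & ~mask;
}
```

With a mask of 3, sizes 1 through 4 all round to 4. With a mask of zero, every size is returned unchanged, which matches the "no constraints" case.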
Much of the ugly logic in the storage allocation functions results from
this attempt to parametrize the worst-case storage boundary. The macro
CELL_OFF assumes that a list element begins on a worst-case storage boundary. It determines the start of the usable area as the next such boundary
beyond the space set aside for the member _Size. Similarly, the macro
SIZE_CELL yields the smallest permissible value of _Size for a list element.
The list element must be large enough to hold a _Cell data object. It must
also end on a worst-case storage boundary.
The remainder of "xalloc.h" is best explained along with the function
malloc. Figure 13.28 shows the file malloc.c. The function malloc endeavors to allocate a data object of size bytes. To do so, it looks for an element
on the list of available storage that has a usable area at least this large. If it
finds one, it splits off any excess large enough to make an additional list
element. It returns a pointer to the usable area.
The internal function findmem, defined in malloc.c, scans the list of
available storage. It retains two static pointers in the data object _Aldata of
type _Altab, defined in "xalloc.h":
_Head points to the start of the list. If the list is empty, it contains a null
pointer.
_Plast is the address of the pointer to the next list element to consider. It
can point to _Aldata._Head or to the _Next of an available list element.
Or it can be a null pointer.
Whenever possible, findrnem begins its scan where it left off on a previous
call. That strategy reduces fragmentation at the start of a list by distributing
usage over the entire list. m a i i o c itself and the function free cooperate in
maintaining these two pointers.
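The resume-where-you-left-off scan can be sketched in isolation. The version below is an illustration under assumed names (Cell, Altab, find_fit, plast, head are hypothetical, chosen to mirror the members described above); it leaves out the step that buys more storage.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Cell {
    size_t size;
    struct Cell *next;
} Cell;

typedef struct {
    Cell **plast;   /* where the next scan resumes, or NULL */
    Cell *head;     /* first element of the available list, or NULL */
} Altab;

/* Return the address of the link to the first cell that fits, or NULL. */
static Cell **find_fit(Altab *tab, size_t size)
{
    Cell **qb;

    assert(tab != NULL);
    if (tab->plast == NULL) {
        /* no resume point: scan the whole list from the top */
        for (qb = &tab->head; *qb != NULL; qb = &(*qb)->next)
            if (size <= (*qb)->size)
                return qb;
    } else {
        Cell *stop = *tab->plast;

        /* resume where the previous scan left off */
        for (qb = tab->plast; *qb != NULL; qb = &(*qb)->next)
            if (size <= (*qb)->size)
                return qb;
        /* wrap around and cover the part of the list skipped above */
        for (qb = &tab->head; *qb != stop; qb = &(*qb)->next)
            if (size <= (*qb)->size)
                return qb;
    }
    return NULL;
}
```

Returning the address of the link, rather than the cell itself, lets the caller unlink or replace the cell without a second search.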
If findmem cannot find a suitable element on the available list, it endeavors to obtain more storage. (Initially the heap is empty, so the first request
takes this path.) It calls the function _Getmem, declared in "xalloc.h", to do
so. That primitive function must return a pointer to a storage area of at least
the requested size, aligned on the worst-case storage boundary. If it cannot,
it returns a null pointer.
The macro SIZE_BLOCK, defined in "xalloc.h", specifies the smallest
preferred list-element size. I have set it to 512, but you may want to change
it. findmem first requests the larger of the required size and SIZE_BLOCK. If
that fails, it halves the requested size repeatedly until the request is granted
Chapter 13
Figure 13.28: malloc.c Part 1

/* malloc function */
#include "xalloc.h"

		/* static data */
_Altab _Aldata = {0};	/* heap initially empty */

static _Cell **findmem(size_t size)
	{	/* find storage */
	_Cell *q, **qb;

	for (; ; )
		{	/* check freed space first */
		if ((qb = _Aldata._Plast) == NULL)
			{	/* take it from the top */
			for (qb = &_Aldata._Head; *qb;
				qb = &(*qb)->_Next)
				if (size <= (*qb)->_Size)
					return (qb);
			}
		else
			{	/* resume where we left off */
			for (; *qb; qb = &(*qb)->_Next)
				if (size <= (*qb)->_Size)
					return (qb);
			q = *_Aldata._Plast;
			for (qb = &_Aldata._Head; *qb != q;
				qb = &(*qb)->_Next)
				if (size <= (*qb)->_Size)
					return (qb);
			}
			{	/* try to buy more space */
			size_t bs;
			const size_t sz = size + CELL_OFF;

			for (bs = SIZE_BLOCK; ; bs >>= 1)
				{	/* try larger blocks first */
				if (bs < sz)
					bs = sz;
				if ((q = _Getmem(bs)) != NULL)
					break;
				else if (bs == sz)
					return (NULL);	/* no storage */
				}
			/* got storage: add to heap and retry */
			q->_Size = (bs & ~_MEMBND) - CELL_OFF;
			free((char *)q + CELL_OFF);
			}
		}
	}
t s t d l i b . h>
Continuing
* (malloc)(size-t
roid
~ 1 1 0C ~ .
Part 2
size)
-Cell
/*
*q, **qb;
/*
-Aldata.-Plast
= qb ? qb : NULL;
return ((char *) q + CELL-OFF);
resume here * I
Figure 13.29: xgetmem.c

/* _Getmem function -- UNIX version */
#include "xalloc.h"

		/* UNIX system call */
void *_Sbrk(int);

void *_Getmem(size_t size)
	{
	void *p;
	int isize = size;

	return (isize <= 0 || (p = _Sbrk(isize)) == (void *)-1
		? NULL : p);
	}
I
*(calloc)(size-t
void
/*
n);
*/
Figure 13.3 1:
free-c
linclude "xal1oc.h"
roid (free)(void *ptr)
/*
-Cell
*,
*q;
if (ptr = NULL)
return;
q = (-Cell * ) ((char *) ptr
if (q->-Size 6 -MEMEND)
return;
if (-=data .-Head = NULL
I I q < -=data .-Head)
CELL-OFF) ;
bad pointer
*,
*,
*,
/*
/*
1
q->-Next = -Aldata.-Head;
-Aldata .-Head = q;
1
else
/*
-Cell
*qp;
char *qpp;
for (qp = -=data .-Head;
-->-Next
66 q < -->-Next;
)
qp = qp->-Next;
qpp = (char *)qp + CELL-OFF + -->-Size;
if ((char *) q < qpp)
return;
/* erroneous call *A
else if ((char *)q == qpp)
/* merge qp and q *A
1
-->-Size
+= CELL-OFF + q->-Size;
q = qp;
1
else
/*
splice q after qp *A
q->-Next = =->-Next;
-->-Next
= q;
1
1
if (q->-Next h h
(char *)q + CELL-OFF
1
1
-A1data.-Plast
1
= hq->-Next;
/*
- -
Figure 13.32:
r e a l l o c .c
r e a l l o c function
tinclude < s t r i n g . h >
tinclude "xa1loc.h"
f*
*/
* ( r e a l l o c ) (void
roid
{
-C e l l
* p t r , size- t s i z e )
/* r e a l l o c a t e a d a t a object on the heap
*,
*q;
i f ( p t r = NULL)
r e t u r n (malloc ( s i z e ) ) ;
q = (-Cell *) ( (char * ) p t r
i f (T>-Size < s i z e )
CELL-OFF) ;
/* t r y t o buy a l a r g e r c e l l
{
char *const n e w g = malloc ( s i z e );
*,
i f ( n e w g = NULL)
r e t u r n (NULL) ;
-cpy
(newg, p t r , q->-Size) ;
free (ptr);
return (nawg);
1
else i f (q->-Size
< s i z e + CELL-OFF
r e t u r n ( p t r );
else
SIZE-CELL)
/*
/*
-MEMBND)
leave c e l l alone
*,
f r e e excess space
*,
& --MEMBND;
*)((char *)ptr
new-n);
new-n;
1
1
function
realloc
Chapter 13
abort
379
<stdlib.h>
Figure 13.33: /* a b o r t f u n c t i o n */
abort.c
/*
terminate abruptly
*/
function t o c a l l a t e x i t
*/
r a i s e (SIGABRT);
a x i t (EXIT-FAILURE);
Figure 13.34:
a t e x i t .c
/* a t e x i t f u n c t i o n */
#include < s t d l i b . h >
/* e x t e r n a l d e c l a r a t i o n s
e x t e r n void (*-Atfuns [I ) (void);
e x t e r n size- t -Atcount;
*/
/*
i f (-Atcount = 0)
r e t u r n (- 1);
-Atfuns[---Atcount]
return (0);
/*
list i s f u l l
*/
= func;
Figure 13.35: /* e x i t f u n c t i o n */
exit.c
/* macros */
#define NATS
32
/* s t a t i c d a t a */
void (*-Atfuns [NATS] ) (void) = { 0);
size- t -Atcount = {NATS);
void ( a x i t ) ( i n t s t a t u s )
{
/*
t i d y up and e x i t t o system
*r
/*
close a l l f i l e s
*,
size- t i;
f o r ( i = 0; i < FOPENMAX; + + i )
i f (- Files [ i ] )
f c l o s e (- Files [ i ] );
-E x i t ( s t a t u s );
1
Chapter 13
Figure 13.36:
getenv. c
*a; s += etrlen(s)
1)
/ * look for name match * I
'=')
...
);
/ * fork failed * /
else if (pid == 0)
(
/ * continue here as child * /
-E x e ~ l ( ~ / b i n / s hllshw,
~,
" - C w , e, NULL);
exit(EX1T-FAILURE);
else
/ * continue here as parent
while (-Wait(NULL) I = pid)
/ * wait for child
*/
*/
To display the final line and exit successfully, the program must do
several things right. It must supply a handler for SIGABRT that fields the call
to abort. That handler must call exit with successful status EXIT_SUCCESS.
And exit must call the handler done registered with atexit. That handler
must be able to write a line of text to the standard output stream. All that
stuff exercises much of the logic for handling program termination.
References
Donald Knuth, The Art of Computer Programming, Vols. 1-3 (Reading,
Mass.: Addison-Wesley, 1967 and later). Here is a rich source of algorithms,
complete with analysis and tutorial introductions. Volume 1 is Fundamental
Algorithms, volume 2 is Seminumerical Algorithms, and volume 3 is Sorting
and Searching. Some are in second edition.
You will find oodles of information on:
maintaining a heap
computing random numbers
searching ordered sequences
sorting
converting between different numeric bases
Before you tinker with the code presented in this chapter, see what Knuth
has to say.
Ronald F. Brender, Character Set Issues for Ada 9X, SEI-89-SR-17 (Pittsburgh, Pa.: Software Engineering Institute, Carnegie Mellon University,
October 1989). Here is an excellent summary of many of the issues surrounding large character sets and multiple character sets in programming
languages. While the document focuses on the programming language
Ada, it is largely relevant to C as well.
Chapter 13
-
Figure 13.38:
tstdlib.c
Part 1
t e a t s t d l i b functions
#include < a s s e r t . h >
tinclude < l i m i t s . h>
tinclude <signal.h>
tinclude <stdio.h>
#include < s t d l i b . h >
tinclude < s t r i n g . h>
I*
*/
s t a t i c void a b r t ( i n t s i g )
/* handle SIGABRT * A
a x i t (EXIT-SUCCESS);
1
s t a t i c i n t cmp(const void *pl, const void *p2)
/* compare function f o r bsearch and qsort
{
unsigned char c l = *(unsigned char * ) p l ;
unsigned char c2 = *(unsigned char *)p2;
r e t u r n (*(unsigned char * ) p l
*,
1
s t a t i c void done (void)
{
/* g e t c o n t r o l from a t e x i t
pute("SUCCESS t e s t i n g < s t d l i b . h > " ) ;
*,
1
int main ( )
--
*,
t s t d l i b . h>
Continuing
tstdlib.c
Part 2
Chapter 13
384
Exercises
Exercise 13.1 The following locale file defines the "Shift JIS" multibyte encoding for
Kanji. A character code in the intervals [0x81, 0x9F] or [0xE0, 0xFC] signals
the first of a two-character sequence. (Any other code is a single character.)
The second character must be in the interval [0x40, 0xFC]:
ImAm sHIFr-JIS
WlE JIS & with 0 x 8 1 - W F a OA~O-OXFC f 0 l l 4 by 0x40-OXFC
SEPA(br81
SEPBWf
SEPCQXeo
SEPDOxfc
SEX' M 0x40
SEPNOxfc
SEPXO
llb-cur-IllilX 2
nbtowcC0, 0:$#1 $@ SF
$0 $1 $0
nbtowcC0. A:B 1 $@ $SF $R
$I $1
nWxmcCONC:D1 $ @ S F @
SI$1
nbtowcC1, O:$#l
X
nbtowcCLM:Nl $@SF
$0$1$2
nbtowcC2. 0:$#1
0 SF SR
$0
-CON
0:$#1
SR
$1
wctU&[l# 0:$#1
x
- L O
I
SR $0 $1$0
wctarb[lNA:B I $@
$R $0
$2
wctabCINC:D I $@
SR $0
$2
W2tUbC2, O:$#l
X
-12,
M:N 1
$0 $1$0
ImAmd
Exercise 13.2 One definition of EUC ("Extended UNIX Code") is similar to Shift JIS. A
character code in the interval [0xA1, 0xFE] is the first of a two-character
sequence. The second character must be in the interval [0x80, 0xFF]. Alter
the locale file presented in the previous exercise to define this multibyte
encoding. Describe your choice of mapping to wide characters.
Exercise 13.3 The following locale file defines the "JIS" multibyte encoding, which has
locking shift states. The three-character sequence "\33$B" shifts to two-character mode. The three-character sequence "\33(B" shifts back to one-character mode. In two-character mode, both character codes must be in
the interval [0x21, 0x7E]:
ImAm JIS
WlE JIS ccdea with ESC+(+B and ESC+$+B
SET A 0x21
SEPBMe
SEPXO
SET Z 033
lib-cur-IllilX
-10,
0:$#1 $@ SF
$0 $1 $0
-[O,O
]$@SF
-10,z
1
$1 $1
~ [ 1 , 0 : $ # 1
x
-11,
'(' 1
$1 $2
-[I,
I$'
I
$1 $3
~ [ 2 , 0 : $ # ] X
-12,
'B'1
OSFSF
@$O
nbtowC13, O:$#]
X
nktow~[3, 'B' 1
$1 $4
nbtowc[4, O:$#]
X
-r4,z
I
$
$1I
-14.
A:B I $@ SF SF
$1 $5
~ [ 5 , 0 : $ # ] X
-D,
A:B I $@SF
$0 $1 $4
-10,
0:$#1
SF
$1
WctaIb[l,O:$#]
X
-[I,
0 1
SF $0 $1 $0
w%arb[l,
A:B 1 Z
$0
$2
wctaIbr2, 0:$#1 '$I
$0
$3
-13,
0:$#] 'B'
$0
$4
wZtaIb14, O:$#]
x
-[4,
0
I z
$0
$7
-14.
A:B 1 $@
SF $0
$5
w!taIb[5, O:$#]
x
~ ~ t a 1 b 1 5A:B
,
1
$0 $1 $6
w.ztcnbl6. 0:$#1
SF
$4
wzta1b17, O:$#] ' ( '
$0
$7+$1
w=tarb[8, 0:$#] 'B'
$0
$1
U)CAIEd
Exercise 13.4 Alter the storage allocation functions to maintain up to eight lists of
fixed-size elements. Add a freed item to an existing list of elements that
have the same size. (Don't bother to sort these lists by storage address.)
Otherwise, create a new list if not all eight have been established. Allocate
from these lists if the request is exactly the right size. Why would you want
to introduce this extra complexity?
Exercise 13.5 Alter the storage allocation functions to store a signature as well as a size
in each allocated element. You might try a recipe something like:
	p->_Signature = p->_Size ^ (int)p ^ 0x01234567;
(This example assumes that both p and p->_Size occupy 32 bits. It is not
portable code.) Check the signature of each element to be freed. Why would
you want to introduce this extra complexity?
Exercise 13.6 Alter the storage allocation functions to require that all allocated storage
be freed prior to program termination. Do you have to change exit as well?
What discipline does that impose on the use of the storage allocation
functions? Why would you want this extra constraint?
Exercise 13.7 Implement exit, getenv, and system for the C translator that you use. Do
you have to write any assembly language?
Exercise 13.8 [Harder] Alter strtod to translate the input string Inf to the special code
Inf. Translate the input string NaN to the special code NaN. Is this extension
permitted by the C Standard? How can you modify the code in <locale.h>
to turn the translation on and off? Can you devise a notation for specifying
arbitrary not-a-number codes?
Exercise 13.9 [Very hard] Modify a C compiler to generate inline code for abs,
div, labs, and ldiv.
Chapter 14
The names of some functions are mysterious. strcspn and strpbrk, for
example, do not loudly proclaim what they do.
The set of functions is incomplete and inconsistent. strnlen and memrchr
are two sensible additions, for example, whereas strncat is surprising.
Despite these aesthetic gripes, I find the functions declared in <string.h>
to be both important and useful. Several of them are, in fact, leading
contenders for generating inline code. Many C programs use these functions, and use them a lot. They are worth the effort to learn and to optimize.
Description
The memcpy function copies n characters from the object pointed to by s2 into the object
pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.
Returns
The memcpy function returns the value of s1.
Description
The memmove function copies n characters from the object pointed to by s2 into the object
pointed to by s1. Copying takes place as if the n characters from the object pointed to by s2 are
first copied into a temporary array of n characters that does not overlap the objects pointed to by
s1 and s2, and then the n characters from the temporary array are copied into the object pointed
to by s1.
Returns
The memmove function returns the value of s1.
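A case where the difference between the two functions matters is shifting data within a single buffer. The sketch below uses two hypothetical helpers, open_gap and close_gap, that are not part of any library. Both rely on memmove because source and destination can overlap.

```c
#include <assert.h>
#include <string.h>

/* Open a gap of `gap` bytes at `pos` in a buffer holding `len` bytes.
   The caller must guarantee room for len + gap bytes. */
static void open_gap(char *buf, size_t len, size_t pos, size_t gap)
{
    assert(buf != NULL && pos <= len);
    /* source and destination overlap whenever gap < len - pos,
       so memcpy would have undefined behavior here */
    memmove(buf + pos + gap, buf + pos, len - pos);
}

/* Close a gap: the mirror-image move toward lower addresses. */
static void close_gap(char *buf, size_t len, size_t pos, size_t gap)
{
    assert(buf != NULL && pos + gap <= len);
    memmove(buf + pos, buf + pos + gap, len - pos - gap);
}
```

When the two areas are known to be distinct objects, memcpy remains the natural choice.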
strcpy
Description
The strcpy function copies the string pointed to by s2 (including the terminating null
character) into the array pointed to by s1. If copying takes place between objects that overlap,
the behavior is undefined.
Returns
The strcpy function returns the value of s1.
strncpy
Description
The strncpy function copies not more than n characters (characters that follow a null
character are not copied) from the array pointed to by s2 to the array pointed to by s1.134 If
copying takes place between objects that overlap, the behavior is undefined.
If the array pointed to by s2 is a string that is shorter than n characters, null characters are
appended to the copy in the array pointed to by s1, until n characters in all have been written.
Returns
The strncpy function returns the value of s1.
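Both properties of strncpy, the null padding and the possible absence of a terminating null, show up in a bounded copy. The helper below, copy_bounded, is a hypothetical name introduced for this sketch. It shows one common way to use strncpy safely and is not a function from the library.

```c
#include <assert.h>
#include <string.h>

/* Copy src into dst[size], truncating if necessary, and always
   terminate the result. Returns nonzero if src was truncated. */
static int copy_bounded(char *dst, size_t size, const char *src)
{
    assert(dst != NULL && src != NULL && size > 0);
    strncpy(dst, src, size - 1);   /* pads with nulls if src is short */
    dst[size - 1] = '\0';          /* strncpy adds none if src is long */
    return strlen(src) >= size;
}
```

The explicit store of the final null character is what guarantees a valid string when the source is too long.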
Description
The strcat function appends a copy of the string pointed to by s2 (including the terminating
null character) to the end of the string pointed to by s1. The initial character of s2 overwrites the
null character at the end of s1. If copying takes place between objects that overlap, the behavior
is undefined.
Returns
The strcat function returns the value of s1.
char *strncat(char *s1, const char *s2, size_t n);
Description
The strncat function appends not more than n characters (a null character and characters
that follow it are not appended) from the array pointed to by s2 to the end of the string pointed
to by s1. The initial character of s2 overwrites the null character at the end of s1. A terminating
null character is always appended to the result.135 If copying takes place between objects that
overlap, the behavior is undefined.
Returns
The strncat function returns the value of s1.
Forward references: the strlen function (7.11.6.3).
Description
The memcmp function compares the first n characters of the object pointed to by s1 to the
first n characters of the object pointed to by s2.136
Returns
The memcmp function returns an integer greater than, equal to, or less than zero, accordingly
as the object pointed to by s1 is greater than, equal to, or less than the object pointed to by s2.
strcmp
Description
The strcmp function compares the string pointed to by s1 to the string pointed to by s2.
Returns
The strcmp function returns an integer greater than, equal to, or less than zero, accordingly
as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2.
strcoll
Description
The strcoll function compares the string pointed to by s1 to the string pointed to by s2,
both interpreted as appropriate to the LC_COLLATE category of the current locale.
Returns
The strcoll function returns an integer greater than, equal to, or less than zero, accordingly
as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2
when both are interpreted as appropriate to the current locale.
strncmp
Description
The strncmp function compares not more than n characters (characters that follow a null
character are not compared) from the array pointed to by s1 to the array pointed to by s2.
Returns
The strncmp function returns an integer greater than, equal to, or less than zero, accordingly
as the possibly null-terminated array pointed to by s1 is greater than, equal to, or less than the
possibly null-terminated array pointed to by s2.
strxfrm
size_t strxfrm(char *s1, const char *s2, size_t n);
Description
The strxfrm function transforms the string pointed to by s2 and places the resulting string
into the array pointed to by s1. The transformation is such that if the strcmp function is applied
to two transformed strings, it returns a value greater than, equal to, or less than zero, corresponding
to the result of the strcoll function applied to the same two original strings. No more than n
characters are placed into the resulting array pointed to by s1, including the terminating null
character. If n is zero, s1 is permitted to be a null pointer. If copying takes place between objects
that overlap, the behavior is undefined.
Returns
The strxfrm function returns the length of the transformed string (not including the
terminating null character). If the value returned is n or more, the contents of the array pointed to
by s1 are indeterminate.
Example
The value of the following expression is the size of the array needed to hold the transformation
of the string pointed to by s.
	1 + strxfrm(NULL, s, 0)
Description
The memchr function locates the first occurrence of c (converted to an unsigned char)
in the initial n characters (each interpreted as unsigned char) of the object pointed to by s.
Returns
The memchr function returns a pointer to the located character, or a null pointer if the character
does not occur in the object.
strchr
char *strchr(const char *s, int c);
Description
The strchr function locates the first occurrence of c (converted to a char) in the string
pointed to by s. The terminating null character is considered to be part of the string.
Returns
The strchr function returns a pointer to the located character, or a null pointer if the character
does not occur in the string.
strcspn
Description
The strcspn function computes the length of the maximum initial segment of the string
pointed to by s1 which consists entirely of characters not from the string pointed to by s2.
Returns
The strcspn function returns the length of the segment.
strpbrk
Synopsis
#include <string.h>
char *strpbrk(const char *s1, const char *s2);
Description
The strpbrk function locates the first occurrence in the string pointed to by s1 of any
character from the string pointed to by s2.
Returns
The strpbrk function returns a pointer to the character, or a null pointer if no character from
s2 occurs in s1.
7.11.5.5 The strrchr function
Synopsis
#include <string.h>
char *strrchr(const char *s, int c);
Description
The strrchr function locates the last occurrence of c (converted to a char) in the string
pointed to by s. The terminating null character is considered to be part of the string.
Returns
The strrchr function returns a pointer to the character, or a null pointer if c does not occur
in the string.
7.11.5.6 The strspn function
Synopsis
#include <string.h>
size_t strspn(const char *s1, const char *s2);
Description
The strspn function computes the length of the maximum initial segment of the string
pointed to by s1 which consists entirely of characters from the string pointed to by s2.
Returns
The strspn function returns the length of the segment.
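The two span functions complement each other: strspn measures a run of characters that belong to a set, and strcspn measures a run that does not. A minimal sketch of using them together follows. The helper first_word is a hypothetical name, not a library function.

```c
#include <assert.h>
#include <string.h>

/* Find the first word in s, where words are separated by characters
   from sep. Stores the word's length in *len and returns a pointer to
   its first character, or NULL if s holds only separators. */
static const char *first_word(const char *s, const char *sep, size_t *len)
{
    assert(s != NULL && sep != NULL && len != NULL);
    s += strspn(s, sep);          /* skip leading separators */
    if (*s == '\0')
        return NULL;
    *len = strcspn(s, sep);       /* run of non-separators */
    return s;
}
```

Unlike strtok, this approach leaves the scanned string unmodified.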
strstr
Description
The strstr function locates the first occurrence in the string pointed to by s1 of the sequence
of characters (excluding the terminating null character) in the string pointed to by s2.
Returns
The strstr function returns a pointer to the located string, or a null pointer if the string is
not found. If s2 points to a string with zero length, the function returns s1.
7.11.5.8 The strtok function
Synopsis
#include <string.h>
char *strtok(char *s1, const char *s2);
Description
A sequence of calls to the strtok function breaks the string pointed to by s1 into a sequence
of tokens, each of which is delimited by a character from the string pointed to by s2. The first
call in the sequence has s1 as its first argument, and is followed by calls with a null pointer as
their first argument. The separator string pointed to by s2 may be different from call to call.
The first call in the sequence searches the string pointed to by s1 for the first character that is
not contained in the current separator string pointed to by s2. If no such character is found, then
there are no tokens in the string pointed to by s1 and the strtok function returns a null pointer.
If such a character is found, it is the start of the first token.
The strtok function then searches from there for a character that is contained in the current
separator string. If no such character is found, the current token extends to the end of the string
pointed to by s1, and subsequent searches for a token will return a null pointer. If such a character
is found, it is overwritten by a null character, which terminates the current token. The strtok
function saves a pointer to the following character, from which the next search for a token will
start.
Each subsequent call, with a null pointer as the value of the first argument, starts searching
from the saved pointer and behaves as described above.
The implementation shall behave as if no library function calls the strtok function.
Returns
The strtok function returns a pointer to the first character of a token, or a null pointer if
there is no token.
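Example
	#include <string.h>
	static char str[] = "?a???b,,,#c";
	char *t;

	t = strtok(str, "?");	/* t points to the token "a" */
	t = strtok(NULL, ",");	/* t points to the token "??b" */
	t = strtok(NULL, "#,");	/* t points to the token "c" */
	t = strtok(NULL, "?");	/* t is a null pointer */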
Description
The memset function copies the value of c (converted to an unsigned char) into each of
the first n characters of the object pointed to by s.
Returns
The memset function returns the value of s.
Description
The strerror function maps the error number in errnum to an error message string.
The implementation shall behave as if no library function calls the strerror function.
Returns
The strerror function returns a pointer to the string, the contents of which are implementation-defined. The array pointed to shall not be modified by the program, but may be overwritten
by a subsequent call to the strerror function.
Description
The strlen function computes the length of the string pointed to by s.
Returns
The strlen function returns the number of characters that precede the terminating null
character.
Footnotes
133. See "future library directions" (7.13.8).
134. Thus, if there is no null character in the first n characters of the array pointed to by s2, the
result will not be null-terminated.
135. Thus, the maximum number of characters that can end up in the array pointed to by s1 is
strlen(s1)+n+1.
136. The contents of "holes" used as padding for purposes of alignment within structure objects
are indeterminate. Strings shorter than their allocated space and unions may also cause
problems in comparison.
Using <string.h>
NULL
size_t
memchr
memcmp
memcpy
memmove
memset
strcat
strchr
strcoll
strcpy
strcspn
strerror
strlen
strncat
strncmp
strncpy
strpbrk
strrchr
strspn
strstr
strtok
strxfrm
The first call to strtok has a first argument that is not a null pointer. That
starts the scan at the beginning of line. Subsequent calls replace this
argument with NULL to continue the scan. If the return value on any call is
not a null pointer, it points to a null-terminated string containing no
separators. Note that strtok stores null characters in the string starting at
line. Be sure that this storage is writable and need not be preserved for
future processing.
You can specify a different set of separators on each call to strtok that
processes a given string, by the way.
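The calling pattern just described can be wrapped in a small helper. The function split_fields below is a hypothetical name used only for this sketch. It assumes the caller owns a writable copy of the line.

```c
#include <assert.h>
#include <string.h>

/* Split line in place into at most max fields separated by characters
   from sep. Stores a pointer to each field in fields[] and returns
   the number of fields found. The contents of line are altered. */
static size_t split_fields(char *line, const char *sep,
    char *fields[], size_t max)
{
    size_t n = 0;
    char *tok;

    assert(line != NULL && sep != NULL && fields != NULL);
    for (tok = strtok(line, sep); tok != NULL && n < max;
        tok = strtok(NULL, sep))
        fields[n++] = tok;
    return n;
}
```

Because strtok keeps its position in static storage, a helper like this must not be interleaved with another strtok scan in progress.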
strxfrm - Use strxfrm(s1, s2, n) to map the null-terminated string
s2 to a (non-overlapping) version at s1. Strings you map this way can later
be compared by calling strcmp. The comparison determines the locale-specific lexical ordering of the two strings that you mapped from. You must
know the current status of locale category LC_COLLATE to use this function
wisely. (You must at least assume that someone else has set this category
wisely.) Under most circumstances, you may want to use strcoll, described above, instead. Use strxfrm if you plan to make repeated comparisons or if the locale may change before you can make the comparison. Use
malloc, declared in <stdlib.h>, to allocate storage for s1, as in:
	size_t n = strxfrm(NULL, s2, 0);
	char *s1 = malloc(n + 1);

	if (s1)
		strxfrm(s1, s2, n + 1);
The first call to strxfrm determines the amount of storage required. The
second performs the conversion (again) and stores the translated string in
the allocated array. Note that the limit passed on the second call must count
the terminating null character. If strxfrm returns a value that is not less than
the limit, the contents of the array are indeterminate.
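The two-call idiom packages naturally as a function. The sketch below uses the hypothetical names xfrm_dup and sign. It passes n + 1 as the limit on the second call so that the terminating null character is counted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated transformed copy of s, or NULL if
   storage runs out. The caller frees the result. */
static char *xfrm_dup(const char *s)
{
    size_t n;
    char *t;

    assert(s != NULL);
    n = strxfrm(NULL, s, 0);        /* length, excluding the null */
    t = malloc(n + 1);
    if (t != NULL)
        strxfrm(t, s, n + 1);       /* limit counts the null, too */
    return t;
}

/* Reduce a comparison result to -1, 0, or +1. */
static int sign(int x)
{
    return (x > 0) - (x < 0);
}
```

Comparing two such copies with strcmp should then agree in sign with strcoll applied to the original strings, provided the locale has not changed in between.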
Implementing <string.h>
The functions declared in <string.h> work largely independently of each
other. The only exception is the pair strcoll and strxfrm. They perform
the same essential operation two different ways. I discuss them last. The
remaining functions each perform a fairly simple operation. Here, the
challenge is to write them to be clear, robust, and efficient.
Figure 14.1 shows the file string.h. As usual, it inherits from the internal
header <yvals.h> definitions that are repeated in several standard headers.
I discuss the implementation of both the macro NULL and the type definition
size_t in Chapter 11: <stddef.h>.
Figure 14.1: string.h

/* string.h standard header */
#ifndef _STRING
#define _STRING
#ifndef _YVALS
#include <yvals.h>
#endif
		/* macros */
#define NULL	_NULL
		/* type definitions */
#ifndef _SIZET
#define _SIZET
typedef _Sizet size_t;
#endif
		/* declarations */
void *memchr(const void *, int, size_t);
int memcmp(const void *, const void *, size_t);
void *memcpy(void *, const void *, size_t);
void *memmove(void *, const void *, size_t);
void *memset(void *, int, size_t);
char *strcat(char *, const char *);
char *strchr(const char *, int);
int strcmp(const char *, const char *);
int strcoll(const char *, const char *);
char *strcpy(char *, const char *);
size_t strcspn(const char *, const char *);
char *strerror(int);
size_t strlen(const char *);
char *strncat(char *, const char *, size_t);
int strncmp(const char *, const char *, size_t);
char *strncpy(char *, const char *, size_t);
char *strpbrk(const char *, const char *);
char *strrchr(const char *, int);
size_t strspn(const char *, const char *);
char *strstr(const char *, const char *);
char *strtok(char *, const char *);
size_t strxfrm(char *, const char *, size_t);
char *_Strerror(int, char *);
		/* macro overrides */
#define strerror(errcode)	_Strerror(errcode, _NULL)
#endif
#include <string.h>
void *(memchr) (const void *s, i n t c, size- t n)
/* f i n d f i r s t occurrence of c i n s[n]
const unsigned char uc = c;
const unsigned char *su;
*/
Figure 14.3:
melncmp.c
/* memcmp function */
#include <string.h>
i n t (memcmp) (const void * s l , const void *s2,
size- t n)
{
/* compare unsigned char s l [n] , s2 [n]
const unsigned char *sul, *su2;
*/
Chapter 14
Figure 14.4: /* memcpy function */
memcpy .c #include <string.h>
*(memcpy)(void
void
char *sul;
const char *su2;
Figure 14.5: memmove.c

/* memmove function */
#include <string.h>

void *(memmove)(void *s1, const void *s2, size_t n)
	{
	char *sc1;
	const char *sc2;

	sc1 = s1;
	sc2 = s2;
	if (sc2 < sc1 && sc1 < sc2 + n)
		for (sc1 += n, sc2 += n; 0 < n; --n)
			*--sc1 = *--sc2;	/* copy backwards */
	else
		for (; 0 < n; --n)
			*sc1++ = *sc2++;	/* copy forwards */
	return (s1);
	}

Figure 14.6: memset.c

/* memset function */
#include <string.h>

void *(memset)
character type.) memcpy can assume that its source and destination areas do
not overlap. Hence, it performs the simplest copy that it can.
Figure 14.5 shows the file memmove.c. The function memmove must work
properly even when its operands overlap. Hence, it first checks for an
overlap that would prevent the correct operation of an ascending copy. In
that case, it copies elements in descending order.
for
(;
*s = ' \ O ' ;
return (sl);
1
/* strncmp function
strncmp-c #include <string.h>
Figure 14.8:
function
memset
function
etrncat
function
strnm
function
strncpy
strcat
strcrnp
strcpy
*/
Figure 14.6 shows the file memset.c. I chose unsigned char as the working
type within memset on the off chance that some implementation might
generate an overflow storing certain int values in the other character types.
Now consider the three strn functions. Figure 14.7 shows the file
strncat.c. The function strncat first locates the end of the destination
string. Then it concatenates at most n additional characters from the source
string. Note that the function always supplies a terminating null character.
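Those three steps can be sketched directly. The function below is an illustration under a hypothetical name, my_strncat, so that it does not collide with the library function. It is written from the description above rather than copied from the figure.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical name, to avoid colliding with the library's strncat. */
static char *my_strncat(char *s1, const char *s2, size_t n)
{
    char *s;

    assert(s1 != NULL && s2 != NULL);
    for (s = s1; *s != '\0'; ++s)
        ;                           /* find end of s1[] */
    for (; 0 < n && *s2 != '\0'; --n)
        *s++ = *s2++;               /* copy at most n chars from s2[] */
    *s = '\0';                      /* always terminate the result */
    return s1;
}
```

The destination therefore needs room for up to n characters beyond its current length, plus one more for the terminating null.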
Figure 14.8 shows the file strncmp.c. The function strncmp is similar to
memcmp, except that it also stops on a terminating null character. And unlike
memcmp, strncmp can use its pointer arguments directly. It type casts them
to pointer to unsigned char only to compute a nonzero return value.
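A minimal sketch of that logic follows, under the hypothetical name my_strncmp. It is reconstructed from the description, not taken from the figure. It walks the char pointers directly and casts to unsigned char only where the sign of the result is decided.

```c
#include <assert.h>
#include <string.h>

static int my_strncmp(const char *s1, const char *s2, size_t n)
{
    assert(n == 0 || (s1 != NULL && s2 != NULL));
    for (; 0 < n; ++s1, ++s2, --n)
        if (*s1 != *s2)
            /* compare as unsigned char to get the right sign */
            return (*(const unsigned char *)s1
                < *(const unsigned char *)s2) ? -1 : +1;
        else if (*s1 == '\0')
            return 0;               /* both strings ended together */
    return 0;                       /* first n characters all match */
}
```

The cast matters on implementations where plain char is signed: a character code above 127 must still compare greater than any ASCII character.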
Figure 14.9 shows the file strncpy.c. The function strncpy is likewise
similar to memcpy, except that it stops on a terminating null. strncpy also
has the unfortunate requirement that it must supply null padding characters for a string whose length is less than n.
Three of the str functions are direct analogs of the strn functions. Figure
14.10 through Figure 14.12 show the files strcat.c, strcmp.c, and strcpy.c.
The functions strcat, strcmp, and strcpy differ only in not worrying about
a limiting string length n. Of course, strcpy has no padding to contend with.
Figure 14.9: strncpy.c

/* strncpy function */
#include <string.h>

char *(strncpy)(char *s1, const char *s2, size_t n)
	{	/* copy char s2[max n] to s1[n] */
	char *s;

	for (s = s1; 0 < n && *s2 != '\0'; --n)
		*s++ = *s2++;
	for (; 0 < n; --n)
		*s++ = '\0';
	return (s1);
	}
Figure 14.10: strcat.c

char *(strcat)(char *s1, const char *s2)
	{	/* copy char s2[] to end of s1[] */
	char *s;

	for (s = s1; *s != '\0'; ++s)
		;	/* find end of s1[] */
	for (; (*s = *s2) != '\0'; ++s, ++s2)
		;	/* copy s2[] to end */
	return (s1);
	}
int (strcmp)(const char *s1, const char *s2)
	{	/* compare unsigned char s1[], s2[] */
	for (; *s1 == *s2; ++s1, ++s2)
		if (*s1 == '\0')
			return (0);
	return ((*(unsigned char *)s1
		< *(unsigned char *)s2) ? -1 : +1);
	}
Figure 14.12: strcpy.c

/* strcpy function */
#include <string.h>

char *(strcpy)(char *s1, const char *s2)
	{	/* copy char s2[] to s1[] */
	char *s = s1;

	for (s = s1; (*s++ = *s2++) != '\0'; )
		;
	return (s1);
	}
strlen.c

#include <string.h>

size_t (strlen)(const char *s)
	{	/* find length of s[] */
	const char *sc;

	for (sc = s; *sc != '\0'; ++sc)
		;
	return (sc - s);
	}
/* s t r c h r function */
#include <string.h>
Figure 14.14:
strchr.c
f o r (; * s != ch; ++s)
i f (*s = ' \ O n )
r e t u r n (NULL);
r e t u r n ( (char *) s);
*/
*/
C
function
strlen
Chapter 14
Figure 14.16: strpbrk.c

/* strpbrk function */
#include <string.h>

char *(strpbrk)(const char *s1, const char *s2)
	{	/* find index of first s1[i] that matches any s2[] */
	const char *sc1, *sc2;

	for (sc1 = s1; *sc1 != '\0'; ++sc1)
		for (sc2 = s2; *sc2 != '\0'; ++sc2)
			if (*sc1 == *sc2)
				return ((char *)sc1);
	return (NULL);	/* terminating nulls match */
	}
Figure 14.17: strspn.c

/* strspn function */
#include <string.h>

size_t (strspn)(const char *s1, const char *s2)
	{	/* find index of first s1[i] that matches no s2[] */
	const char *sc1, *sc2;

	for (sc1 = s1; *sc1 != '\0'; ++sc1)
		for (sc2 = s2; ; ++sc2)
			if (*sc2 == '\0')
				return (sc1 - s1);
			else if (*sc1 == *sc2)
				break;
	return (sc1 - s1);	/* null doesn't match */
	}
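Figure 14.18: strrchr.c
/* strrchr function */
#include <string.h>

char *(strrchr)(const char *s, int c)
	{	/* find last occurrence of c in char s[] */
	const char ch = c;
	const char *sc;

	for (sc = NULL; ; ++s)
		{	/* check another char */
		if (*s == ch)
			sc = s;
		if (*s == '\0')
			return ((char *)sc);
		}
	}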
Figure 14.19: strstr.c
/* strstr function */
#include <string.h>

char *(strstr)(const char *s1, const char *s2)
	{	/* find first occurrence of s2[] in s1[] */
	if (*s2 == '\0')
		return ((char *)s1);
	for (; (s1 = strchr(s1, *s2)) != NULL; ++s1)
		{	/* match rest of prefix */
		const char *sc1, *sc2;

		for (sc1 = s1, sc2 = s2; ; )
			if (*++sc2 == '\0')
				return ((char *)s1);
			else if (*++sc1 != *sc2)
				break;
		}
	return (NULL);
	}
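Figure 14.20: strtok.c
/* strtok function */
#include <string.h>

char *(strtok)(char *s1, const char *s2)
	{	/* find next token in s1[] delimited by s2[] */
	char *sbegin, *send;
	static char *ssave = "";	/* for safety */

	sbegin = s1 ? s1 : ssave;
	sbegin += strspn(sbegin, s2);
	if (*sbegin == '\0')
		{	/* end of scan */
		ssave = "";	/* for safety */
		return (NULL);
		}
	send = sbegin + strcspn(sbegin, s2);
	if (*send != '\0')
		*send++ = '\0';
	ssave = send;
	return (sbegin);
	}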
Figure 14.19 shows the file strstr.c. The function strstr calls strchr to
find the first character of the string s2 within the string s1. Only then does
it tool up to check whether the rest of s2 matches a substring in s1. The
function treats an empty string s2 as a special case. It matches the implicit
empty string at the start of s1.
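A short usage sketch may make the empty-string rule concrete. It relies only on the Standard C function strstr; the helper names count_occurrences and empty_matches_at_start are hypothetical, introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of pat in text (hypothetical helper).
 * An empty pattern is counted as zero occurrences here, because strstr
 * reports an empty match at the start and the loop would never advance. */
size_t count_occurrences(const char *text, const char *pat)
	{
	size_t n = 0;
	size_t len;

	assert(text != NULL && pat != NULL);
	len = strlen(pat);
	if (len == 0)
		return (0);
	for (; (text = strstr(text, pat)) != NULL; text += len)
		++n;
	return (n);
	}

/* Nonzero if strstr returns its first argument for an empty pattern. */
int empty_matches_at_start(const char *text)
	{
	assert(text != NULL);
	return (strstr(text, "") == text);
	}
```

Calling count_occurrences("abcabcab", "abc") should yield 2, and empty_matches_at_start should return nonzero for any string.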
Figure 14.20 shows the file strtok.c. The function strtok is the last and
the messiest of the seven string scanning functions. It doesn't look bad
because it is written here in terms of strspn and strcspn. It must contend,
however, with writable static storage and multiple calls to process the same
strerror.c
#include <errno.h>
#include <string.h>

char *_Strerror(int errcode, char *buf)
	{	/* copy error message into buffer as needed */
	static char sbuf[] = {"error #xxx"};

	if (buf == NULL)
		buf = sbuf;
	switch (errcode)
		{	/* switch on known error codes */
	case 0:
		return ("no error");
	case EDOM:
		return ("domain error");
	case ERANGE:
		return ("range error");
	case EFPOS:
		return ("file positioning error");
	default:
		if (errcode < 0 || _NERR <= errcode)
			return ("unknown error");
		else
			{	/* generate numeric error code */
			strcpy(buf, "error #xxx");
			buf[9] = errcode % 10 + '0';
			buf[8] = (errcode /= 10) % 10 + '0';
			buf[7] = (errcode / 10) % 10 + '0';
			return (buf);
			}
		}
	}

char *(strerror)(int errcode)
	{
	return (_Strerror(errcode, NULL));
	}
The last two functions declared in <string.h> help you perform locale-specific
string collation. Both strcoll and strxfrm determine collation
sequence by mapping strings to a form that collates properly when compared
using strcmp. The locale category LC_COLLATE determines this mapping.
(See Chapter 6: <locale.h>.) It does so by specifying the state table
used by the internal function _Strxfrm. Thus, strcoll and strxfrm call
_Strxfrm to map strings appropriately.
Figure 14.22 shows the file xstrxfrm.h. All the collation functions include
the internal header "xstrxfrm.h". It includes in turn the standard
header <string.h> and the internal header "xstate.h". (See the file
xstate.h on page 100.) Beyond that, "xstrxfrm.h" defines the type _Cosave
and declares the function _Strxfrm. A data object of type _Cosave stores
state information between calls to _Strxfrm.
Figure 14.23 shows the file strxfrm.c. The function strxfrm best illustrates
how the collation functions work together. It stores the mapped
string in the buffer pointed to by s1, of length n. Once the buffer is full, the
function translates the remainder of the source string to determine the full
length of the mapped string. strxfrm stores any such excess characters in
its own dynamic temporary buffer buf.
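Because strxfrm reports the full length of the mapped string even when the buffer is too small, a caller can ask for the length first and allocate afterward. The following is a minimal sketch of that idiom using only Standard C calls; the helper names xfrm_dup and xfrm_compare are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd transformed copy of s, or NULL if memory runs out
 * (hypothetical helper). The first call passes a zero-length buffer,
 * which the C Standard permits to be a null pointer, just to learn
 * the required length. */
char *xfrm_dup(const char *s)
	{
	size_t n;
	char *buf;

	assert(s != NULL);
	n = strxfrm(NULL, s, 0);	/* length, excluding the null */
	if ((buf = malloc(n + 1)) == NULL)
		return (NULL);
	strxfrm(buf, s, n + 1);
	return (buf);
	}

/* Compare two strings by way of their transformed forms. The sign of
 * the result should agree with strcoll(a, b). This sketch returns 0 if
 * memory runs out; a real caller would report that separately. */
int xfrm_compare(const char *a, const char *b)
	{
	char *ta = xfrm_dup(a);
	char *tb = xfrm_dup(b);
	int ans = 0;

	if (ta != NULL && tb != NULL)
		ans = strcmp(ta, tb);
	free(ta);
	free(tb);
	return (ans);
	}
```

Transforming once and comparing many times with strcmp is the usual reason to prefer strxfrm over repeated calls to strcoll, for example when sorting a large table of keys.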
Figure 14.24 shows the file xstrxfrm.c. It defines the function _Strxfrm
that performs the actual mapping. It does so as a finite-state machine
executing the state table stored at _Wcstate, defined in the file xstate.c.
(See page 107.)
_Strxfrm must be particularly cautious because _Wcstate can be flawed.
It can change with locale category LC_COLLATE in ways that the Standard C
library cannot control.
Note the various ways that the function can elect to take an error return:
if a transfer occurs to an undefined state
if no state table exists for a given state
if the function makes so many state transitions since generating an
output character that it must be looping
if the state table entry specifically signals an error
Figure 14.22: xstrxfrm.h
size_t _Strxfrm(char *, const unsigned char **, size_t, _Cosave *);
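Figure 14.23: strxfrm.c
/* strxfrm function */
#include "xstrxfrm.h"

size_t (strxfrm)(char *s1, const char *s2, size_t n)
	{	/* transform s2[] to s1[] by locale-dependent rule */
	size_t nx = 0;
	const unsigned char *s = (const unsigned char *)s2;
	_Cosave state = {0};

	while (nx < n)
		{	/* translate and deliver */
		size_t i = _Strxfrm(s1, &s, n - nx, &state);

		s1 += i, nx += i;
		if (0 < i && s1[-1] == '\0')
			return (nx - 1);
		else if (*s == '\0')
			s = (const unsigned char *)s2;	/* rescan */
		}
	for (; ; )
		{	/* translate and count */
		char buf[32];
		size_t i = _Strxfrm(buf, &s, sizeof (buf), &state);

		nx += i;
		if (0 < i && buf[i - 1] == '\0')
			return (nx - 1);
		else if (*s == '\0')
			s = (const unsigned char *)s2;	/* rescan */
		}
	}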
Figure 14.24: xstrxfrm.c
/* _Strxfrm function */
#include <limits.h>
#include "xstrxfrm.h"

size_t _Strxfrm(char *sout, const unsigned char **psin,
	size_t size, _Cosave *ps)
	{	/* translate string to collatable form */
	char state = ps->_State;
	int leave = 0;
	int limit = 0;
	int nout = 0;
	const unsigned char *sin = *psin;
	unsigned short wc = ps->_Wchar;

	for (; ; )
		{
		/* ... */
		}
	sout[nout++] = '\0';	/* error return */
	*psin = sin;
	ps->_State = _NSTATE;
	return (nout);
	}
Figure 14.25: strcoll.c
/* strcoll function */
#include "xstrxfrm.h"

		/* type definitions */
typedef struct {
	char buf[32];
	const unsigned char *s1, *s2, *sout;
	_Cosave state;
	} Sctl;

static size_t getxfrm(Sctl *p)
	{	/* get transformed chars */
	size_t i;

	do	{	/* loop until chars delivered */
		p->sout = (const unsigned char *)p->buf;
		i = _Strxfrm(p->buf, &p->s1, sizeof (p->buf), &p->state);
		if (0 < i && p->buf[i - 1] == '\0')
			return (i - 1);
		else if (*p->s1 == '\0')
			p->s1 = p->s2;	/* rescan */
		} while (i == 0);
	return (i);
	}

int (strcoll)(const char *s1, const char *s2)
	{	/* compare s1[], s2[] using locale-dependent rule */
	size_t n1, n2;
	Sctl st1, st2;
	static const _Cosave initial = {0};

	st1.s1 = (const unsigned char *)s1;
	st1.s2 = (const unsigned char *)s1;
	st1.state = initial;
	st2.s1 = (const unsigned char *)s2;
	st2.s2 = (const unsigned char *)s2;
	st2.state = initial;
	for (n1 = n2 = 0; ; )
		{
		int ans;
		size_t n;

		if (n1 == 0)
			n1 = getxfrm(&st1);
		if (n2 == 0)
			n2 = getxfrm(&st2);
		n = n1 < n2 ? n1 : n2;
		if (n == 0)
			return (n1 == n2 ? 0 : 0 < n2 ? -1 : +1);
		else if ((ans = memcmp(st1.sout, st2.sout, n)) != 0)
			return (ans);
		st1.sout += n, n1 -= n;
		st2.sout += n, n2 -= n;
		}
	}
Testing <string.h>
Figure 14.26 shows the file tstring.c. The test program performs several
cursory tests of each of the functions declared in <string.h>. The header
defines no unique macros or types, so there are no interesting sizes to
display. If all goes well, the program simply displays:
SUCCESS testing <string.h>
References
R.E. Griswold, J.F. Poage, and I.P. Polonsky, The SNOBOL4 Programming
Language (Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1971). The programming
language SNOBOL pushes to the extreme both pattern matching and
substitution within text strings. You may be surprised at what powerful
programs you can base largely on string manipulations.
Exercises
Exercise 14.1 The following locale file defines a simple "dictionary" collation sequence
that ignores punctuation and distinctions between uppercase and lowercase letters:
LOCALE DICT
NOTE dictionary collation sequence
collate[0, 0      ] '.'         $O $I $1
collate[0, 1:$#   ]                $I $0
collate[0, 'a':'z'] $@          $O $I $0
collate[0, 'A':'Z'] $@+'a'-'A'  $O $I $0
collate[1, 0:$#   ] $@          $O $I $1
LOCALE end
Figure 14.26: tstring.c Part 1
/* test string functions */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

int main()
	{
Continuing tstring.c Part 2
Exercise 14.2 Modify the locale file in the previous exercise to order names that begin
with Mac interchangeably with names that begin with Mc. Order Mac before
Mc only if the names otherwise compare equal.
Exercise 14.3 Describe a precise specification for:
how names sort in your telephone book
how words sort in the dictionary you use
how text lines sort in the computer sort utility you use
Can you define a locale that matches the behavior of each of these collation
rules? How many states does it take to specify each?
Exercise 14.4 A simple calculator program recognizes the following tokens:
numbers palatable to the function strtod, declared in <stdlib.h> (See
the syntax diagram on page 351.)
operators in the set [+ - * / = c]
comments inside double quotes ("")
These tokens are separated by spaces, horizontal tabs, and newlines. Such
characters can, however, occur inside comments.
Write a function that reads characters from the standard input stream and
parses them into tokens. Use the function strtok, declared in <string.h>.
Rewrite the function to avoid using strtok. Which of the two versions do
you prefer? Why?
Exercise 14.5 Identify the "missing" functions not declared in <string.h> (such as
strnlen and memrchr). Write them. Can you add them to the Standard C library
and still conform to the C Standard? Can you add their declarations to
<string.h> and still conform?
Exercise 14.6 Measure a large corpus of code to determine the five functions declared in
<string.h> that consume the most time. How much could you speed up a
typical program if these functions were instantaneous? How much could
you speed up a typical program if each of these functions ran five times
faster? What are the comparable figures for the program you measured that
would benefit most?
Exercise 14.7 [Harder] Write assembly language versions of the functions you identified
in the previous exercise. Can you achieve a significant speedup just by
altering the C code? How much faster is each function compared to the C
version presented here?
Exercise 14.8 [Very hard] Modify a C compiler to generate inline code for the functions
you identified in the previous two exercises. How much faster is each
function compared to the versions discussed in the previous exercise?
Chapter 15
The header <time.h> defines two macros, and declares four types and several functions for
manipulating time. Many functions deal with a calendar time that represents the current date
(according to the Gregorian calendar) and time. Some functions deal with local time, which is the
calendar time expressed for some specific time zone, and with Daylight Saving Time, which is a
temporary change in the algorithm for determining local time. The local time zone and Daylight
Saving Time are implementation-defined.
The macros defined are NULL (described in 7.1.6); and
	CLOCKS_PER_SEC
which is the number per second of the value returned by the clock function.
The types declared are size_t (described in 7.1.6);
	clock_t
and
	time_t
which are arithmetic types capable of representing times; and
	struct tm
which holds the components of a calendar time, called the broken-down time. The structure shall
contain at least the following members, in any order. The semantics of the members and their
normal ranges are expressed in the comments.[137]
	int tm_sec;	/* seconds after the minute -- [0, 61] */
	int tm_min;	/* minutes after the hour -- [0, 59] */
	int tm_hour;	/* hours since midnight -- [0, 23] */
	int tm_mday;	/* day of the month -- [1, 31] */
	int tm_mon;	/* months since January -- [0, 11] */
	int tm_year;	/* years since 1900 */
	int tm_wday;	/* days since Sunday -- [0, 6] */
	int tm_yday;	/* days since January 1 -- [0, 365] */
	int tm_isdst;	/* Daylight Saving Time flag */
The value of tm_isdst is positive if Daylight Saving Time is in effect, zero if Daylight
Saving Time is not in effect, and negative if the information is not available.
clock
Description
The clock function determines the processor time used.
Returns
The clock function returns the implementation's best approximation to the processor time
used by the program since the beginning of an implementation-defined era related only to the
program invocation. To determine the time in seconds, the value returned by the clock function
should be divided by the value of the macro CLOCKS_PER_SEC. If the processor time used is
not available or its value cannot be represented, the function returns the value (clock_t)-1.[138]
difftime
Description
The difftime function computes the difference between two calendar times: time1 - time0.
Returns
The difftime function returns the difference expressed in seconds as a double.
mktime
Description
The mktime function converts the broken-down time, expressed as local time, in the structure
pointed to by timeptr into a calendar time value with the same encoding as that of the values
returned by the time function. The original values of the tm_wday and tm_yday components
of the structure are ignored, and the original values of the other components are not restricted to
the ranges indicated above.[139] On successful completion, the values of the tm_wday and
tm_yday components of the structure are set appropriately, and the other components are set to
represent the specified calendar time, but with their values forced to the ranges indicated above;
the final value of tm_mday is not set until tm_mon and tm_year are determined.
Returns
The mktime function returns the specified calendar time encoded as a value of type time_t.
If the calendar time cannot be represented, the function returns the value (time_t)-1.
Example
What day of the week is July 4, 2001?
#include <stdio.h>
#include <time.h>
static const char *const wday[] = {
	"Sunday", "Monday", "Tuesday", "Wednesday",
	"Thursday", "Friday", "Saturday", "-unknown-"
};
struct tm time_str;
/* ... */
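The example above breaks off before the call to mktime. One way to answer its question is sketched below. It is not the Standard's own continuation; the helper name weekday_of and the choice of noon as the time of day are assumptions made for this sketch.

```c
#include <assert.h>
#include <time.h>

/* Return the weekday (0 = Sunday ... 6 = Saturday) for a calendar date,
 * or -1 if mktime cannot represent it. The helper name is hypothetical. */
int weekday_of(int year, int month, int day)
	{
	struct tm t = {0};

	assert(1 <= month && month <= 12);
	t.tm_year = year - 1900;	/* years since 1900 */
	t.tm_mon = month - 1;		/* months since January */
	t.tm_mday = day;
	t.tm_hour = 12;			/* noon, away from any DST switch */
	t.tm_isdst = -1;		/* let mktime decide about DST */
	if (mktime(&t) == (time_t)-1)
		return (-1);
	return (t.tm_wday);		/* filled in by mktime */
	}
```

The result of weekday_of(2001, 7, 4) is 3, which selects "Wednesday" from an array such as wday above. A result of -1 would select nothing valid, which is why the Standard's array carries an extra "-unknown-" entry for the failure case.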
time
Description
The time function determines the current calendar time. The encoding of the value is
unspecified.
Returns
The time function returns the implementation's best approximation to the current calendar
time. The value (time_t)-1 is returned if the calendar time is not available. If timer is not
a null pointer, the return value is also assigned to the object it points to.
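asctime
Description
The asctime function converts the broken-down time in the structure pointed to by timeptr
into a string in the form
	Sun Sep 16 01:03:52 1973\n\0
using the equivalent of the following algorithm.
char *asctime(const struct tm *timeptr)
{
	static const char wday_name[7][3] = {
		"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
	};
	static const char mon_name[12][3] = {
		"Jan", "Feb", "Mar", "Apr", "May", "Jun",
		"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
	};
	static char result[26];

	sprintf(result, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
		wday_name[timeptr->tm_wday],
		mon_name[timeptr->tm_mon],
		timeptr->tm_mday, timeptr->tm_hour,
		timeptr->tm_min, timeptr->tm_sec,
		1900 + timeptr->tm_year);
	return result;
}
Returns
The asctime function returns a pointer to the string.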
ctime
Description
The ctime function converts the calendar time pointed to by timer to local time in the form
of a string. It is equivalent to
	asctime(localtime(timer))
Returns
The ctime function returns the pointer returned by the asctime function with that
broken-down time as argument.
Forward references: the localtime function (7.12.3.4).
gmtime
Synopsis
#include <time.h>
struct tm *gmtime(const time_t *timer);
Description
The gmtime function converts the calendar time pointed to by timer into a broken-down
time, expressed as Coordinated Universal Time (UTC).
Returns
The gmtime function returns a pointer to that object, or a null pointer if UTC is not available.
localtime
Description
The localtime function converts the calendar time pointed to by timer into a broken-down
time, expressed as local time.
Returns
The localtime function returns a pointer to that object.
strftime
Description
The strftime function places characters into the array pointed to by s as controlled by the
string pointed to by format. The format shall be a multibyte character sequence, beginning and
ending in its initial shift state. The format string consists of zero or more conversion specifiers
and ordinary multibyte characters. A conversion specifier consists of a % character followed by a
character that determines the behavior of the conversion specifier. All ordinary multibyte
characters (including the terminating null character) are copied unchanged into the array. If copying
takes place between objects that overlap, the behavior is undefined. No more than maxsize
characters are placed into the array. Each conversion specifier is replaced by appropriate
characters as described in the following list. The appropriate characters are determined by the
LC_TIME category of the current locale and by the values contained in the structure pointed to
by timeptr.
"%a" is replaced by the locale's abbreviated weekday name.
"%A" is replaced by the locale's full weekday name.
"%W" is replaced by the week number of the year (the first Monday as the first day of week 1) as
a decimal number (00-53).
"%x" is replaced by the locale's appropriate date representation.
"%X" is replaced by the locale's appropriate time representation.
"%y" is replaced by the year without century as a decimal number (00-99).
"%Y" is replaced by the year with century as a decimal number.
"%Z" is replaced by the time zone name or abbreviation, or by no characters if no time zone is
determinable.
"%%" is replaced by %.
Footnotes
137. The range [0, 61] for tm_sec allows for as many as two leap seconds.
138. In order to measure the time spent in a program, the clock function should be called at
the start of the program and its return value subtracted from the value returned by subsequent
calls.
139. Thus, a positive or zero value for tm_isdst causes the mktime function to presume
initially that Daylight Saving Time, respectively, is or is not in effect for the specified time.
A negative value causes it to attempt to determine whether Daylight Saving Time is in effect
for the specified time.
Using <time.h>
The functions declared in <time.h> determine elapsed processor time
and calendar time. They also convert among different data representations.
You can represent a time as:
type clock_t for elapsed processor time, as returned by the primitive
function clock
type time_t for calendar time, as returned by the primitive function time
or the function mktime
type double for calendar time in seconds, as returned by the function
difftime
type struct tm for calendar time broken down into separate components,
as returned by the functions gmtime and localtime
a text string for calendar time, as returned by the functions asctime,
ctime, and strftime
You have a rich assortment of choices. The hard part is often identifying
just which data representation, and which functions, you want to use for a
particular application.
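The representations listed above connect through a small number of conversions. The sketch below walks through them with Standard C calls only; the function names format_local, seconds_between, and cpu_seconds, and the particular format string, are illustrative choices rather than anything defined by the library.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Convert a calendar time to local text of the form
 * "YYYY-MM-DD HH:MM:SS". Returns the number of characters stored,
 * or 0 on failure (including a buffer that is too small). */
size_t format_local(time_t t, char *buf, size_t size)
	{
	struct tm *bd;

	assert(buf != NULL);
	bd = localtime(&t);			/* time_t -> struct tm */
	if (bd == NULL)
		return (0);
	return (strftime(buf, size, "%Y-%m-%d %H:%M:%S", bd));	/* -> text */
	}

/* Seconds of calendar time between two time_t values, as a double. */
double seconds_between(time_t later, time_t earlier)
	{
	return (difftime(later, earlier));	/* time_t -> double */
	}

/* Processor seconds consumed so far, or -1.0 if clock cannot say. */
double cpu_seconds(void)
	{
	clock_t c = clock();			/* clock_t */

	return (c == (clock_t)-1 ? -1.0 : (double)c / CLOCKS_PER_SEC);
	}
```

A caller would typically obtain a time_t from time(NULL), hand it to format_local for display, and use seconds_between rather than subtracting time_t values directly.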
The one complicated function declared in <time.h> (from the outside, at
least) is strftime. You use it to generate a text representation of a time and
date from a struct tm under control of a format string. In this sense, it is
modeled after the print functions declared in <stdio.h>. It differs in two
important ways:
strftime does not accept a variable argument list. It obtains all time and
date information from one argument.
The behavior of strftime can vary considerably among locales. The
locale category LC_TIME can, for example, specify that the text form of all
dates follow the conventions of the French culture.
localtime(&tO));
If your goal is to display times and dates in accordance with local custom,
then strftime gives you just the flexibility you need. You can even write
multibyte-character sequences between the conversion specifiers. That lets
you convert dates to Kanji and other large character sets.
Here are the conversion specifiers defined for strftime. I follow each
with an example of the text it produces. The examples, from Plauger and
Brodie, all assume the "C" locale and the date and time Sunday, 2 December
1979 at 06:55:15 AM EST:
%a - the abbreviated weekday name (Sun)
%A - the full weekday name (Sunday)
%b - the abbreviated month name (Dec)
%B - the full month name (December)
%c - the date and time (Dec  2 06:55:15 1979)
%d - the day of the month (02)
%H - the hour of the 24-hour day (06)
%I - the hour of the 12-hour day (06)
%j - the day of the year, from 001 (335)
%m - the month of the year, from 01 (12)
%M - the minutes after the hour (55)
%p - the AM/PM indicator (AM)
%S - the seconds after the minute (15)
%U - the Sunday week of the year, from 00 (48)
%w - the day of the week, from 0 for Sunday (0)
%W - the Monday week of the year, from 00 (47)
%x - the date (Dec  2 1979)
%X - the time (06:55:15)
%y - the year of the century, from 00 (79)
%Y - the year (1979)
%Z - the time zone name, if any (EST)
%% - the per cent character (%)
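To try a few of these specifiers, you can build the same broken-down time by hand and pass it to strftime. This is a small sketch under the assumption that the program is still in the "C" locale, which is the locale in effect at program startup; the helper names example_time and format_example are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* Fill *t with the broken-down time used in the list above:
 * Sunday, 2 December 1979, 06:55:15. */
void example_time(struct tm *t)
	{
	assert(t != NULL);
	memset(t, 0, sizeof (*t));
	t->tm_sec = 15;
	t->tm_min = 55;
	t->tm_hour = 6;
	t->tm_mday = 2;
	t->tm_mon = 11;		/* December */
	t->tm_year = 79;	/* 1979 */
	t->tm_wday = 0;		/* Sunday */
	t->tm_yday = 335;	/* days since 1 January */
	t->tm_isdst = 0;
	}

/* Format the example time under control of fmt. Returns the number
 * of characters stored, or 0 if the result does not fit. */
size_t format_example(char *buf, size_t size, const char *fmt)
	{
	struct tm t;

	example_time(&t);
	return (strftime(buf, size, fmt, &t));
	}
```

With a 64-character buffer, the format "%a %b %d %H:%M:%S %Y" stores the 24 characters "Sun Dec 02 06:55:15 1979", and "%A, %B %p %y %%" stores "Sunday, December AM 79 %".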
I conclude with the usual description of the individual types and macros
defined in <time.h>. It is followed by brief notes on how to use the functions
declared in <time.h>.
Note that the functions share two static data objects. All functions that
return a value of type pointer to char return a pointer to one of these data
objects. All functions that return a value of type pointer to struct tm return
The members may occur in a different order, and other members may also
be present. The DST flag is greater than zero if Daylight Savings Time (DST)
is in effect, zero if it is not in effect, and less than zero if its state is unknown.
The unknown state encourages the functions that read this structure to
determine for themselves whether DST is in effect.
asctime - (The asc comes from ASCII, which is now a misnomer.) Use
this function to generate the text form of the date represented by the
argument (which points to a broken-down time). The function returns a
pointer to a null-terminated string that looks like
"Sun Dec  2 06:55:15 1979\n". This is equivalent to calling strftime with
the format string "%c\n" in the "C" locale. Call asctime if you want the
English-language form regardless of the current locale. Call strftime if you
want a form that changes with locale. See the warning about shared data
objects, above.
clock - This function measures elapsed processor time instead of
calendar time. It returns -1 if that is not possible. Otherwise, each call
should return a value equal to or greater than an earlier call during the
same program execution. It is the best measure you can get of the time your
program actually consumes. See the macro CLOCKS_PER_SEC, above.
ctime - ctime(pt) is equivalent to the expression asctime(localtime(pt)).
You use it to convert a calendar time directly to a text form that
is independent of the current locale. See the warning about shared data
objects, above.
difftime - The only safe way to compute the difference between two
times t1 and t0 is by calling difftime(t1, t0). The result, measured in
seconds, is positive if t1 is a later time than t0.
gmtime - (The gm comes from GMT, which is now a slight misnomer.) Use
this function to convert a calendar time to a broken-down UTC time. The
member tm_isdst should be zero. If you want local time instead, use
localtime, below. See the warning about shared data objects, above.
localtime - Use this function to convert a calendar time to a broken-down
local time. The member tm_isdst should reflect whatever the system
knows about Daylight Savings Time for that particular time and date. If
you want UTC time instead, use gmtime, above. See the warning about
shared data objects, above.
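The warning about shared data objects matters most when you want both forms of the same time. Since gmtime and localtime may return pointers to the same static object, a cautious caller copies each result before making the next call. A minimal sketch, with the hypothetical helper name utc_and_local:

```c
#include <assert.h>
#include <time.h>

/* Store both the UTC and the local broken-down form of t.
 * Returns 1 on success, 0 if either conversion is unavailable.
 * The helper name is hypothetical. */
int utc_and_local(time_t t, struct tm *utc, struct tm *local)
	{
	struct tm *p;

	assert(utc != NULL && local != NULL);
	if ((p = gmtime(&t)) == NULL)
		return (0);
	*utc = *p;	/* copy before the next call overwrites it */
	if ((p = localtime(&t)) == NULL)
		return (0);
	*local = *p;
	return (1);
	}
```

Holding on to the pointer returned by gmtime across the call to localtime, instead of copying, would risk reading the local time twice.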
mktime - This function first puts its argument, a broken-down time, in
canonical form. That lets you add seconds, for example, to the member
tm_sec of a broken-down time. The function increases tm_min for every 60
seconds it subtracts from tm_sec until tm_sec is in the interval [0, 59]. The
function then corrects tm_min in a similar way, then each coarser division
of time through tm_year. It determines tm_wday and tm_yday from the other
fields. Clearly, you can also alter a broken-down time by minutes, hours,
days, months, or years just as easily.
mktime then converts the broken-down time to an equivalent calendar
time. It assumes the broken-down time represents a local time. If the
member tm_isdst is less than zero, the function endeavors to determine
whether Daylight Savings Time was in effect for that particular time and
date. Otherwise, it honors the original state of the flag. Thus, the only
reliable way to modify a calendar time is to convert it to a broken-down
time by calling localtime, modify the appropriate members, then convert
the result back to a calendar time by calling mktime.
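The recipe in the last sentence can be sketched as follows. The helper names add_days and normalize are hypothetical, and setting tm_isdst to -1 before each call is a choice made for this sketch so that mktime decides about Daylight Savings Time for the adjusted date.

```c
#include <assert.h>
#include <time.h>

/* Add ndays to a calendar time by way of a broken-down local time.
 * Returns (time_t)-1 on failure. The helper name is hypothetical. */
time_t add_days(time_t t, int ndays)
	{
	struct tm *p = localtime(&t);
	struct tm bd;

	if (p == NULL)
		return ((time_t)-1);
	bd = *p;		/* copy out of the shared static object */
	bd.tm_mday += ndays;	/* may leave the range [1, 31] */
	bd.tm_isdst = -1;	/* DST may differ on the new date */
	return (mktime(&bd));	/* normalizes, then converts */
	}

/* Put an out-of-range broken-down time in canonical form.
 * Returns 1 on success, 0 if the time cannot be represented. */
int normalize(struct tm *bd)
	{
	assert(bd != NULL);
	bd->tm_isdst = -1;
	return (mktime(bd) != (time_t)-1);
	}
```

For example, normalizing 32 January 2001 yields 1 February 2001, and a tm_sec of 125 becomes two extra minutes and five seconds.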
strftime - This function generates a null-terminated text string containing
the time and date information that you specify. You write a format
string argument to specify a mixture of literal text and converted time and
date information. You specify a broken-down time to supply the encoded
time and date information. The category LC_TIME in the current locale
determines the behavior of each conversion. I describe how you write
format strings starting on page 421. See the warning about shared data
objects, above.
time - This function determines the current calendar time. It returns -1
if that is not possible. Otherwise, each call should return a value at the same
time or later than an earlier call during the same program execution. It is
the best estimate you can get of the current time and date.
Figure 15.1: time.h
	};
		/* declarations */
char *asctime(const struct tm *);
clock_t clock(void);
char *ctime(const time_t *);
double difftime(time_t, time_t);
struct tm *gmtime(const time_t *);
struct tm *localtime(const time_t *);
time_t mktime(struct tm *);
size_t strftime(char *, size_t, const char *,
	const struct tm *);
time_t time(time_t *);
#endif
Implementing <time.h>
The functions declared in <time.h> are quite diverse. Many wrestle with
the bizarre irregularities involved in measuring and expressing times and
dates. Be prepared for an assortment of coding techniques.
Figure 15.1 shows the file time.h. As usual, it inherits from the internal
header <yvals.h> definitions that are repeated in several standard headers.
I discuss the implementation of both the macro NULL and the type definition
size_t in Chapter 11: <stddef.h>.
<yvals.h> also defines two macros that describe properties of the primitive
functions clock and time:
The macro _CPS specifies the value of the macro CLOCKS_PER_SEC.
The macro _TBIAS gives the difference, in seconds, between values
returned by time and the time measured from 1 January 1900. (This
macro name does not appear in <time.h>.)
The values of these macros depend strongly on how you implement
clock and time. This implementation represents elapsed processor time as
an unsigned int (type clock_t). It represents calendar time as an unsigned
long (type time_t) that counts UTC seconds since the start of 1 January 1900.
That represents dates from 1900 until at least 2036. You have to adjust
whatever the system supplies to match these conventions.
The macro _TBIAS is a kludge. Normally, you want to set it to zero. The
version of time you supply should deliver calendar times with the appropriate
starting point. UNIX, however, measures time in seconds since 1
January 1970. Many implementations of C offer a function time that
matches this convention. If you find it convenient to use such a time
function directly, then <yvals.h> should contain the definition:
#define _TBIAS	((70 * 365LU + 17) * 86400)
That counts the 70 years, including 17 leap days, that elapsed between the
two starting points. In several places, the functions declared in <time.h>
adjust a value of type time_t by adding or subtracting _TBIAS.
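The arithmetic behind that definition is easy to check. The sketch below works out the bias and converts in both directions; the macro name EPOCH_BIAS and the two function names are illustrative stand-ins, not part of the library presented here, and unsigned long stands in for time_t.

```c
#include <assert.h>

/* Seconds between 1 January 1900 and 1 January 1970: 70 years of
 * 365 days plus 17 leap days (1904 through 1968; 1900 was not a
 * leap year). The value is 25,567 days, or 2,208,988,800 seconds. */
#define EPOCH_BIAS	((70 * 365LU + 17) * 86400)

/* Convert seconds since 1970 to seconds since 1900. With a 32-bit
 * unsigned long the sum wraps for dates past early 2036. */
unsigned long unix_to_1900(unsigned long unix_secs)
	{
	return (unix_secs + EPOCH_BIAS);
	}

/* Convert seconds since 1900 back to seconds since 1970. */
unsigned long from_1900_to_unix(unsigned long secs1900)
	{
	assert(EPOCH_BIAS <= secs1900);	/* no UNIX form before 1970 here */
	return (secs1900 - EPOCH_BIAS);
	}
```

A 32-bit count of seconds spans about 136 years, which is where the limit of 2036 for a count that starts in 1900 comes from.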
Figure 15.2 shows the file time.c. It defines the function time for a UNIX
system. As usual, I assume the existence of a C-callable function with a
reserved name that performs the UNIX system service. For this version of
time, the header <yvals.h> can define the macro _TBIAS to be zero.
UNIX also provides an exact replacement for the function clock. So do
many implementations of C modeled after UNIX. Thus, you may not have
to do any additional work. Just define the macro _CPS appropriately. For a
PC-compatible computer, for example, the value is approximately 18.2.
Figure 15.3 shows the file clock.c. It defines a version of clock you can
use if the operating system doesn't provide a separate measure of elapsed
processor time. The function simply returns a truncated version of the
calendar time. In this case, the header <yvals.h> defines the macro _CPS to
be 1.
Figure 15.2: time.c
/* time function -- UNIX version */
#include <time.h>

time_t _Time(time_t *);	/* UNIX system call */

time_t (time)(time_t *tod)
	{
	time_t t = _Time(NULL) + (70 * 365LU + 17) * 86400;

	if (tod)
		*tod = t;
	return (t);
	}
Figure 15.3: clock.c
clock_t (clock)(void)
	{
	return ((clock_t)time(NULL));
	}
Figure 15.4 shows the file difftime.c. It is careful to correct the biases of
both times before comparing them. It is also careful to develop a signed
difference between two unsigned integer quantities. Note how the function
negates the difference t1 - t0 only after converting it to double.
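The point about negating only after conversion can be shown in isolation. This is a sketch of the technique, not the listing from Figure 15.4: it uses unsigned long in place of time_t, omits the bias correction, and all three function names are hypothetical.

```c
#include <assert.h>

/* Signed difference t1 - t0, in seconds, of two unsigned counts.
 * Subtracting the smaller from the larger keeps the unsigned
 * arithmetic from wrapping; only the double result is negated. */
double diff_seconds(unsigned long t1, unsigned long t0)
	{
	return (t0 <= t1 ? (double)(t1 - t0) : -(double)(t0 - t1));
	}

/* The tempting shortcut: the subtraction wraps around when t1 < t0,
 * so the result is a huge positive number instead of a negative one. */
double naive_diff(unsigned long t1, unsigned long t0)
	{
	return ((double)(t1 - t0));
	}

/* A few spot checks, written as assertions. */
void check_diff_seconds(void)
	{
	assert(diff_seconds(10, 5) == 5.0);
	assert(diff_seconds(5, 10) == -5.0);
	assert(naive_diff(5, 10) > 0.0);	/* wrapped around */
	}
```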
The remaining functions all include the internal header "xtime.h".
Figure 15.5 shows the file xtime.h. It includes the standard header <time.h>
and the internal header "xtinfo.h". (See the file "xtinfo.h" on page 100.)
That internal header defines the type _Tinfo. It also declares the data object
_Times, defined in the file asctime.c. (See page 437.) _Times specifies
locale-specific information on the category LC_TIME.
The header "xtime.h" defines the macro WDAY that specifies the weekday
for 1 January 1900 (Monday). It defines the type Dstrule that specifies the
components of an encoded rule for determining Daylight Savings Time.
(See the file xgetdst.c beginning on page 432.) And it declares the various
internal functions that implement this version of <time.h>.
Figure 15.5: xtime.h
/* xtime.h internal header */
#include <time.h>
#include "xtinfo.h"
		/* macros */
#define WDAY	1	/* to get day of week right */
		/* type definitions */
typedef struct {
	unsigned char wday, hour, day, mon, year;
	} Dstrule;
		/* internal declarations */
int _Daysto(int, int);
const char *_Gentime(const struct tm *, _Tinfo *,
	const char *, int *, char *);
Dstrule *_Getdst(const char *);
const char *_Gettime(const char *, int, int *);
int _Isdst(const struct tm *);
const char *_Getzone(void);
size_t _Strftime(char *, size_t, const char *,
	const struct tm *, _Tinfo *);
struct tm *_Ttotm(struct tm *, time_t, int);
time_t _Tzoff(void);
Figure 15.6: gmtime.c
/* gmtime function */
#include "xtime.h"

struct tm *(gmtime)(const time_t *tod)
	{
	return (_Ttotm(NULL, *tod, 0));
	}
Figure 15.6 shows the file gmtime.c. The function gmtime is the simpler
of the two functions that convert a calendar time in seconds (type time_t)
to a broken-down time (type struct tm). It simply calls the internal function
_Ttotm. The first argument is a null pointer to tell _Ttotm to store the
broken-down time in the communal static data object. The third argument
is zero to insist that Daylight Savings Time is not in effect.
Figure 15.7 shows the file xttotm.c. It defines the function _Ttotm that
tackles the nasty business of converting seconds to years, months, days,
and so forth. The file also defines the function _Daysto that _Ttotm and other
functions use for calendar calculations.
_Daysto counts the extra days beyond 365 per year. To do so, it must
determine how many leap days have occurred between the year you specify
and 1900. The function also counts the extra days from the start of the year
to the month you specify. To do so, it must sometimes determine whether
the current year is a leap year. The function recognizes that 1900 was not a
leap year. It doesn't bother to correct for the non-leap years 1800 and earlier,
or for 2100 and later. (Other problems arise within just a few decades of
those extremes anyway.)
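The leap-day count is simple enough to check by hand for a few dates. The following standalone sketch reproduces the idea for the years 1900 through 2099 only; it is not the library's _Daysto, and the names ordinary, leap, extra_days, and days_since_1900 are hypothetical.

```c
#include <assert.h>

/* Cumulative days before each month, for ordinary and leap years. */
static const short ordinary[] = {0, 31, 59, 90, 120, 151,
	181, 212, 243, 273, 304, 334};
static const short leap[] = {0, 31, 60, 91, 121, 152,
	182, 213, 244, 274, 305, 335};

/* Days beyond 365 per year from 1 January 1900 to the start of month
 * mon (0-11) in year 1900 + year. This sketch handles only
 * 0 <= year < 200, that is, 1900 through 2099. */
int extra_days(int year, int mon)
	{
	int is_leap;

	assert(0 <= year && year < 200 && 0 <= mon && mon < 12);
	is_leap = (year & 03) == 0 && year != 0;	/* 1900 was not leap */
	/* leap days in completed years: 1904, 1908, ... before this year */
	return ((year == 0 ? 0 : (year - 1) / 4)
		+ (is_leap ? leap : ordinary)[mon]);
	}

/* Days from 1 January 1900 to the start of the given month. */
long days_since_1900(int year, int mon)
	{
	return (365L * year + extra_days(year, mon));
	}
```

For 1 January 1970 the function extra_days returns 17, the same 17 leap days that appear in the definition of _TBIAS, and days_since_1900 returns 25,567.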
Figure 15.7: xttotm.c
/* _Ttotm and _Daysto functions */
#include "xtime.h"

        /* macros */
#define MONTAB(year) \
    ((year) & 03 || (year) == 0 ? mos : lmos)

        /* static data */
static const short lmos[] = {0, 31, 60, 91, 121, 152,
    182, 213, 244, 274, 305, 335};
static const short mos[] = {0, 31, 59, 90, 120, 151,
    181, 212, 243, 273, 304, 334};

int _Daysto(int year, int mon)
    {   /* compute extra days to start of month */
    int days;

    if (0 < year)   /* correct for leap year: 1801-2099 */
        days = (year - 1) / 4;
    else if (year <= -4)
        days = 1 + (4 - year) / 4;
    else
        days = 0;
    return (days + MONTAB(year)[mon]);
    }

struct tm *_Ttotm(struct tm *t, time_t secsarg, int isdst)
    {   /* convert scalar time to time structure */
    int year;
    long days;
    time_t secs;
    static struct tm ts;

    secsarg += _TBIAS;
    if (t == NULL)
        t = &ts;
    t->tm_isdst = isdst;
    for (secs = secsarg; ; secs = secsarg + 3600)
        {   /* loop to correct for DST */
        days = secs / 86400;
        t->tm_wday = (days + WDAY) % 7;
            {   /* determine year */
            long i;

            for (year = days / 365;
                days < (i = _Daysto(year, 0) + 365L * year); )
                --year;     /* correct guess and recheck */
            days -= i;
            t->tm_year = year;
            t->tm_yday = days;
            }
            {   /* determine month */
            int mon;
            const short *pm = MONTAB(year);

            for (mon = 12; days < pm[--mon]; )
                ;
            t->tm_mon = mon;
            t->tm_mday = days - pm[mon] + 1;
            }
        secs %= 86400;
        t->tm_hour = secs / 3600;
        secs %= 3600;
        t->tm_min = secs / 60;
        t->tm_sec = secs % 60;
        if (0 <= t->tm_isdst || (t->tm_isdst = _Isdst(t)) <= 0)
            return (t);     /* loop only if <0 => 1 */
        }
    }
_Daysto handles years before 1900 only because the function mktime can
develop intermediate dates in that range and still yield a representable
time_t value. (You can start with the year 2000, back up 2,000 months, and
advance 2 billion seconds, for example.) The logic is carefully crafted to
avoid integer overflow regardless of argument values. Also, the function
counts excess days rather than total days so that it can cover a broader range
of years without fear of having its result overflow.
_Ttotm uses _Daysto to determine the year corresponding to its time
argument secsarg. Since the inverse of _Daysto is a nuisance to write,
_Ttotm guesses and iterates. At worst, it should have to back up one year
to correct its guess. Both functions use the macro MONTAB, defined at the top
of the file, to determine how many days precede the start of a given month.
The macro also assumes that every fourth year is a leap year, except 1900.
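The guess-and-iterate approach can be illustrated in isolation. The following is a minimal sketch, not the code of Figure 15.7: the names year_day, days_to_year, and split_days are hypothetical, the day count is assumed to start at 1 January 1900, and only the years 1900 through 2099 are handled.

```c
/* Hypothetical result type: years since 1900 and day within that year. */
struct year_day { int year; long yday; };

/* Total days from 1 January 1900 to 1 January of year 1900 + year. */
long days_to_year(int year)
{
    long leap = 0 < year ? (year - 1) / 4 : 0;   /* leap days so far */

    return 365L * year + leap;
}

/* Split a nonnegative day count into year and day of year. */
struct year_day split_days(long days)
{
    struct year_day r;
    int year = (int)(days / 365);   /* guess: may overshoot by a year */

    while (days < days_to_year(year))
        --year;                     /* correct the guess and recheck */
    r.year = year;
    r.yday = days - days_to_year(year);
    return r;
}
```

Dividing by 365 ignores leap days, so the guess can only be too large, never too small. That is why the loop needs only to back up.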
The isdst (third) argument to _Ttotm follows the convention for the
isdst member of struct tm:
If isdst is greater than zero, Daylight Savings Time is definitely in effect.
_Ttotm assumes that its caller has made any necessary adjustment to the
time argument secsarg.
If isdst is zero, Daylight Savings Time is definitely not in effect. _Ttotm
assumes that no adjustment is necessary to the time argument secsarg.
If isdst is less than zero, the caller doesn't know whether Daylight
Savings Time is in effect. _Ttotm should endeavor to find out. If the
function determines that Daylight Savings Time is in effect, it advances
the time by one hour (3,600 seconds) and recomputes the broken-down
time.
Thus, _Ttotm will loop at most once. It calls the function _Isdst only if it
needs to determine whether to loop. Even then, it loops only if _Isdst
concludes that Daylight Savings Time is in effect.
Figure 15.8 shows the file xisdst.c. The function _Isdst determines the
status of Daylight Savings Time (DST). _Times._Isdst points at a string that
spells out the rules. (See the file asctime.c in Figure 15.16 for the definition
of _Times. See page 111 for a description of the rule string.)
_Isdst works with the rules in encoded form. Those rules are not current
the first time you call the function or if a change of locale alters the last
encoded version of the string _Times._Isdst. If that string is empty, _Isdst
looks for rules appended to the time-zone information _Times._Tzone. It
calls _Getzone as necessary to obtain the time-zone information. It calls
_Gettime to locate the start of any rules for DST. The function _Getdst then
encodes the current array of rules, if that is possible.
Given an encoded array of rules, _Isdst scans the array for rules that
cover the relevant year. It adjusts the day specified by the rule for any
weekday constraint, then compares the rule time against the time that it is
testing. Note that the first rule for a given starting year begins not in DST.
Successive rules for the same year go in and out of DST.
Figure 15.9 shows the file xgetdst.c. It defines the function _Getdst that
parses the string pointed to by _Times._Isdst to construct the array of rules.
The first character of a (non-empty) string serves as a field delimiter, just
as with other strings that provide locale-specific time information. The
function first counts these delimiters so that it can allocate the array. It then
passes over the string once more to parse and check the individual fields.
_Getdst calls the internal function getint to convert the integer subfields
in a rule. No overflow checks occur because none of the fields can be large
enough to cause overflow. The logic here and in _Getdst proper is tedious
but straightforward.
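The first pass over the rule string, which counts delimiters to size the array, might look like the sketch below. The name count_fields is hypothetical and the code is not taken from Figure 15.9. It assumes only the convention stated above, that the first character of a non-empty string is the field delimiter.

```c
#include <stddef.h>

/* Hypothetical helper: count the fields in a string whose first
 * character is the field delimiter. ":EST:EDT:+0300" has three. */
size_t count_fields(const char *s)
{
    size_t n = 0;
    char delim;

    if (*s == '\0')
        return 0;               /* empty string: no delimiter, no fields */
    for (delim = *s; *s != '\0'; ++s)
        if (*s == delim)
            ++n;                /* each delimiter starts one field */
    return n;
}
```

Because the string itself names its delimiter, the same routine works for ":a:b" and "|a|b" alike.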
Figure 15.10 shows the file localtim.c. The function localtime calls
_Ttotm much like gmtime. Here, however, localtime assumes that it must
convert a UTC time to a local time. To do so, the function must determine
the time difference, in seconds, between UTC and the local time zone.
The file localtim.c also defines the function _Tzoff that endeavors to
determine this time difference (tzoff, in minutes). The time difference is
not current the first time you call the function or if a change of locale alters
the last encoded version of the string _Times._Tzone. If that string is empty,
_Tzoff calls the function _Getzone to determine the time difference from
environment variables, if that is possible.
However obtained, the string _Times._Tzone takes the form
:EST:EDT:+0300. (See page 111.) _Tzoff calls the function _Gettime to
determine the starting position and length (n) of the third field (#2, counting
from zero). The function strtol, declared in <stdlib.h>, must parse this
field completely in converting it to an encoded integer. Moreover, the
magnitude must not be completely insane. (The maximum magnitude is
greater than 12*60 because funny time zones exist on either side of the
International Date Line.)
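The two checks described above, that strtol consumes the whole field and that the result is of sane magnitude, can be sketched as follows. The function name parse_offset and its return convention are hypothetical. The 13-hour limit mirrors the constant maxtz that appears in Figure 15.10, but the code is an illustration rather than the listing itself.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parse the n-character field at s as a signed
 * offset in minutes. Returns 0 and stores the offset at *out on
 * success, or returns -1 if the field is malformed or out of range. */
int parse_offset(const char *s, size_t n, long *out)
{
    const long maxtz = 60 * 13;     /* sanity limit: 13 hours */
    char *end;
    long val = strtol(s, &end, 10);

    if (end == s || (size_t)(end - s) != n)
        return -1;                  /* field not consumed completely */
    if (val < -maxtz || maxtz < val)
        return -1;                  /* magnitude is not believable */
    *out = val;
    return 0;
}
```

Comparing the end pointer against the field length is what rejects a field such as +03x0, which strtol would otherwise accept as 3.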
Figure 15.8: xisdst.c
/* _Isdst function */
#include <stdlib.h>
#include "xtime.h"

        if (_Times._Isdst[0] == '\0')
            {   /* find current dst_rules */
            int n;

            if (_Times._Tzone[0] == '\0')
                _Times._Tzone = _Getzone();
            _Times._Isdst = _Gettime(_Times._Tzone, 3, &n);
            if (_Times._Isdst[0] != '\0')
                --_Times._Isdst;    /* point to delimiter */
            }
        if ((pr = _Getdst(_Times._Isdst)) == NULL)
            return (-1);
        free(rules);
        rules = pr;
        olddst = _Times._Isdst;

        /* check time against rules */
        int ans = 0;
        const int d0 = _Daysto(t->tm_year, 0);
        const int hour = t->tm_hour + 24 * t->tm_yday;
        const int wd0 = (365L * t->tm_year + d0 + WDAY) % 7 + 14;

                if (0 < pr->wday)
                    {   /* shift to specific weekday */
                    int wd = (rday + wd0 - pr->wday) % 7;

                    rday += wd == 0 ? 0 : 7 - wd;
                    if (pr->wday <= 7)
                        rday -= 7;  /* strictly before */
                    }
        return (ans);
        }
    }
Figure 15.9: xgetdst.c
/* _Getdst function */
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include "xtime.h"

static int getint(const char *s, int n)
    {   /* accumulate digits */
    int value;

    for (value = 0; 0 <= --n && isdigit(*s); ++s)
        value = value * 10 + *s - '0';
    return (0 <= n ? -1 : value);
    }

Dstrule *_Getdst(const char *s)
    {   /* parse rules */
    int year = 0;

    for (pr = rules; ; ++pr, ++s)
        {
        if (*s == '(')

                /* invalid week day */
            pr->wday = s[1] == '0' ? 7 : s[1] - '0';
            if (*s == '+')
                pr->wday += 7;
            s += 2;
            }
        if (*s == '\0')

        else if (*s != delim)
            break;
        }
    free(rules);
    return (NULL);
    }
Figure 15.10: localtim.c
/* localtime function */
#include <stdlib.h>
#include "xtime.h"

    {   /* determine local time offset */
    static const char *oldzone = NULL;
    static long tzoff = 0;
    static const long maxtz = 60*13;

    if (oldzone != _Times._Tzone)
        {
        }
    return (tzoff * 60);
    }

struct tm
    /* local time structure */
    -1));
Figure 15.11: xgettime.c
/* _Gettime function */
#include <string.h>
#include "xtime.h"

const char *_Gettime(const char *s, int n, int *len)
    {
    for (; ; --n, s = s1 + 1)
        {   /* find end of current field */
        if ((s1 = strchr(s, delim)) == NULL)
            s1 = s + strlen(s);
        if (n <= 0)
            {   /* found proper field */
            *len = s1 - s;
            return (s);
            }
        else if (*s1 == '\0')
            {   /* not enough fields */
            *len = 1;
            return (s1);
            }
        }
    }
Figure 15.11 shows the file xgettime.c. It defines the function _Gettime
that locates a field in a string that specifies locale-specific time information.
See the description of _Getdst, above, for how _Gettime interprets field
delimiters. If _Gettime cannot find the requested field, it returns a pointer
to an empty string.
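A field lookup of this kind can be written as a short standalone routine. The sketch below is an illustration, not the listing of Figure 15.11. The name find_field and the use of size_t for the length are assumptions made here. It follows the same conventions: the first character is the delimiter, fields count from zero, and a missing field yields a pointer to an empty string.

```c
#include <string.h>

/* Hypothetical helper: locate field n (counting from zero) in a string
 * whose first character is the field delimiter. Stores the field length
 * at *len. A missing field yields a pointer to an empty string. */
const char *find_field(const char *s, int n, size_t *len)
{
    char delim;

    *len = 0;
    if (*s == '\0')
        return s;                       /* no delimiter, no fields */
    delim = *s++;
    for (;; --n) {
        const char *end = strchr(s, delim);

        if (end == NULL)
            end = s + strlen(s);        /* last field runs to the end */
        if (n <= 0) {
            *len = (size_t)(end - s);   /* found the requested field */
            return s;
        }
        if (*end == '\0')
            return end;                 /* ran out of fields */
        s = end + 1;
    }
}
```

With the string :EST:EDT:+0300, field 0 is EST, field 1 is EDT, and field 2 is +0300. The returned pointer is not null terminated at the field boundary, so the caller must honor the stored length.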
Figure 15.12 shows the file xgetzone.c. The function _Getzone calls
getenv, declared in <stdlib.h>, to determine the value of the environment
variable "TIMEZONE". That value should have the same format as the
locale-specific time string _Times._Tzone, described above (possibly with
rules for determining Daylight Savings Time bolted on).
If no value exists for "TIMEZONE", the function _Getzone then looks for
the environment variable "TZ". That value should match the UNIX format
EST05EDT. The internal function reformat uses the value of "TZ" to develop
the preferred form in its static buffer.
If _Getzone finds neither of these environment variables, it assumes that
the local time zone is UTC. In any event, it stores its decision in the static
internal buffer tzone. Subsequent calls to the function return this
remembered value. Thus, the environment variables are queried at most once, the
first time that _Getzone is called.
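The query-once behavior can be sketched as below. This is a simplified illustration and not the code of Figure 15.12: the name zone_string and the fixed 64-character buffer are assumptions, and the fallback to "TZ" with its reformatting step is omitted. The value is copied because the C Standard permits a later call to getenv to overwrite the string that an earlier call returned.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the time-zone string, consulting the
 * environment only on the first call and remembering the answer. */
const char *zone_string(void)
{
    static char buf[64];            /* assumed large enough for a sketch */
    static int done = 0;

    if (!done) {
        const char *s = getenv("TIMEZONE");

        if (s == NULL || sizeof buf <= strlen(s))
            s = ":UTC:UTC:0";       /* unset or too long: assume UTC */
        strcpy(buf, s);             /* keep a private copy */
        done = 1;
    }
    return buf;
}
```

Every call after the first returns the same buffer, so a change to the environment variable during execution goes unnoticed. That matches the behavior the text describes.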
Figure 15.13 shows the file mktime.c. The function mktime computes an
integer time_t from a broken-down time struct tm. It takes extreme pains
to avoid overflow in doing so. (The function is obliged to return the value
-1 if the time cannot be properly represented.)
Figure 15.12: xgetzone.c
/* _Getzone function */
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include "xtime.h"

        /* static data */
static const char *defzone = ":UTC:UTC:0";
static char *tzone = NULL;

static char *reformat(const char *s)
    {   /* reformat TZ */
    int i, val;
    static char tzbuf[] = ":EST:EDT:+0300";

    for (i = 1; i <= 3; ++i)    /* copy standard-time name in order */
        if (isalpha(*s))
            tzbuf[i] = *s++;
        else
            return (NULL);
    tzbuf[9] = *s == '-' || *s == '+' ? *s++ : '+';
    if (!isdigit(*s))
        return (NULL);
    val = *s++ - '0';
    if (isdigit(*s))
        val = 10 * val + *s++ - '0';
    for (val *= 60, i = 14; 10 <= --i; val /= 10)
        tzbuf[i] = val % 10 + '0';  /* store minutes as digit characters */
    for (i = 5; i <= 7; ++i)    /* copy DST name in order */
        if (isalpha(*s))
            tzbuf[i] = *s++;
        else
            return (NULL);
    return (*s == '\0' ? tzbuf : NULL);
    }
Figure 15.13: mktime.c
/* mktime function */
#include <limits.h>
#include "xtime.h"

time_t (mktime)(struct tm *t)
    {   /* convert local time structure to scalar time */
    double dsecs;
    int mon, year, ymon;
    time_t secs;
Figure 15.14: ctime.c
/* ctime function */
#include <time.h>

char *(ctime)(const time_t *tod)
    {   /* convert calendar time to local text */
    return (asctime(localtime(tod)));
    }
Figure 15.15: strftime.c
/* strftime function */
#include "xtime.h"
Figure 15.16: asctime.c
/* asctime function */
#include "xtime.h"

        /* static data */
static const char ampm[] = {":AM:PM"};
static const char days[] = {
    ":Sun:Sunday:Mon:Monday:Tue:Tuesday:Wed:Wednesday"
    ":Thu:Thursday:Fri:Friday:Sat:Saturday"};
static const char fmts[] = {
    "|%a %b %D %H:%M:%S %Y|%b %D %Y|%H:%M:%S"};
static const char isdst[] = {""};
static const char mons[] = {
    ":Jan:January:Feb:February:Mar:March"
    ":Apr:April:May:May:Jun:June"
    ":Jul:July:Aug:August:Sep:September"
    ":Oct:October:Nov:November:Dec:December"};
static const char zone[] = {""};    /* adapt by default */
static _Tinfo ctinfo = {ampm, days, fmts, isdst, mons, zone};

_Tinfo _Times = {ampm, days, fmts, isdst, mons, zone};

char *(asctime)(const struct tm *t)
    {   /* format time as "Day Mon dd hh:mm:ss yyyy\n" */
    static char tbuf[] = "Day Mon dd hh:mm:ss yyyy\n";
The first part of mktime determines a year and month. If they can be
represented as type int, the function calls _Daysto to correct for leap days
since 1900. mktime then accumulates the time in seconds as type double, to
minimize further fretting about integer overflow. If the final value is
representable as type time_t, the function converts it to that type. mktime
calls _Ttotm to put the broken-down time in canonical form. Finally, the
function corrects the time in seconds for Daylight Savings Time and converts
it from local time to UTC. (The resultant code reads much easier than
it wrote.)
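The idea of accumulating in type double and converting only when the result is known to fit can be sketched as follows. This is an illustration under stated assumptions, not the code of Figure 15.13: the name seconds_from_days is hypothetical, and a 32-bit signed result stands in for time_t so that the range limits convert to double exactly.

```c
#include <stdint.h>

/* Hypothetical helper: combine a day count and a seconds count into a
 * 32-bit signed total. Returns 0 and stores the total at *out, or
 * returns -1 if the total cannot be represented. */
int seconds_from_days(long days, long secs, int32_t *out)
{
    /* the sum is formed in double, so the multiply cannot overflow */
    double d = 86400.0 * (double)days + (double)secs;

    if (d < (double)INT32_MIN || (double)INT32_MAX < d)
        return -1;                  /* not representable */
    *out = (int32_t)d;              /* safe: d is within range */
    return 0;
}
```

The range test happens before the conversion because converting an out-of-range double to an integer type has undefined behavior in C.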
The remaining functions declared in <time.h> convert encoded times to
text strings in various ways. All depend, in the end, on the internal function
_Strftime to do the actual conversion. What varies is the choice of locale.
The function asctime (and, by extension, the function ctime) convert times
by a fixed format, following the conventions of the "C" locale regardless of
the current state of the locale category LC_TIME. The function strftime, on
the other hand, lets you specify a format that directs the conversion of a
broken-down time. It follows the conventions of the current locale. Thus,
one of the arguments to _Strftime specifies the locale-specific time
information (of type _Tinfo) to use.
Figure 15.16 shows the file asctime.c. It defines the function asctime that
formats a broken-down time the same way irrespective of the current
locale. The file also defines the data object _Times that specifies the
locale-specific time information. And it defines the internal data object ctinfo,
which replicates the time information for the "C" locale.
Figure 15.14 shows the file ctime.c. The function ctime simply calls
localtime, then asctime, to convert its time_t argument. Thus, it always
follows the conventions of the "C" locale.
Figure 15.15 shows the file strftime.c. The function strftime calls
_Strftime, using the locale-specific time information stored in _Times.
Thus, its behavior changes with locale.
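From the caller's side, the difference between the fixed format and the directed format looks like this. The sketch uses only the standard functions asctime and strftime. The helper names sample_time and iso_date are hypothetical, and the date is the one used by the test program later in this chapter.

```c
#include <string.h>
#include <time.h>

/* Hypothetical helper: fill *t with 06:55:15 on Sunday, 2 December 1979. */
void sample_time(struct tm *t)
{
    memset(t, 0, sizeof *t);
    t->tm_sec = 15, t->tm_min = 55, t->tm_hour = 6;
    t->tm_mday = 2, t->tm_mon = 11, t->tm_year = 79;
    t->tm_wday = 0, t->tm_yday = 335;
}

/* Hypothetical helper: format with a caller-chosen layout. Numeric
 * conversions such as %Y, %m, and %d do not depend on locale names. */
size_t iso_date(char *buf, size_t size, const struct tm *t)
{
    return strftime(buf, size, "%Y-%m-%d", t);
}
```

Calling asctime on the sample time yields the fixed 25-character layout that ends in a newline, while iso_date produces whatever layout its format string requests.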
Figure 15.17 shows the file xstrftim.c. It defines the internal function
_Strftime that does all the work of formatting time information. _Strftime
uses the macro PUT, defined at the top of the file xstrftim.c, to deliver
characters. The macro encapsulates the logic needed to copy generated
characters, count them, and limit the number delivered.
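The copy, count, and limit logic can be expressed as a small function instead of a macro. This sketch is not the PUT macro of Figure 15.17: the structure outbuf and the function put_chars are hypothetical names chosen for the illustration.

```c
#include <string.h>

/* Hypothetical output state: where to store next, the buffer capacity,
 * and how many characters have been requested so far. */
struct outbuf { char *next; size_t size; size_t count; };

/* Hypothetical helper: always count the n characters, but copy them
 * only if the running total still fits in the buffer. */
void put_chars(struct outbuf *ob, const char *s, size_t n)
{
    if (n == 0)
        return;
    ob->count += n;
    if (ob->count <= ob->size) {
        memcpy(ob->next, s, n);
        ob->next += n;
    }
}
```

Counting even the characters that are not copied lets the caller detect at the end that the buffer was too small, simply by comparing the count against the size.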
The internal function _Mbtowc, declared in <stdlib.h>, parses the format
as a multibyte string using state memory of type _Mbsave that you provide
on each call. The issues are the same as for _Printf, described on page 303.
Figure 15.18 shows the file xgentime.c. It defines the function _Gentime
that performs the actual conversions for _Strftime. The function _Gentime
consists primarily of a large switch statement that processes each
conversion separately.
Each conversion determines a pointer p to a sequence of characters that
gives the result of the conversion. It also stores a signed integer count at
*pn. A positive count instructs _Strftime to generate the designated
sequence of characters.
One source of generated characters is the function _Gettime, which
selects a field from one of the strings in the locale-specific time information.
Another is the internal function getval, also defined in the file xgentime.c,
which generates decimal integers. getval stores characters in the
accumulator provided by _Strftime.
Note that _Gentime includes a nonstandard addition. The conversion
specifier %D converts the day of the month with a leading space in place of
a leading 0. That's what asctime insists on.
_Gentime returns a negative count to instruct _Strftime to "push down"
a format string for a locale-specific conversion. Three conversions change
with locale: %c, %x, and %X. (The conversion %x, for example, becomes the
format string "%b %d %Y" in the "C" locale.) You express these conversions
as format strings that invoke the other conversions. (Page 111 describes
how to write a locale file that alters these format strings.) Note that the
function _Strftime supports only one level of format stacking.
The other internal function in the file xgentime.c is wkyr. It counts weeks
from the start of the year for a given day of the year. The week can begin
on Sunday (wstart is 0) or Monday (wstart is 1). The peculiar logic avoids
negative arguments for the modulus and divide operators.
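A week count of this kind can be sketched directly from the C Standard's definition of the %U and %W conversions, in which days before the first Sunday (or Monday) of the year belong to week 0. The function below is written from that definition rather than copied from the listing. The name week_of_year is hypothetical.

```c
/* Hypothetical helper: week of the year for a given day.
 * wstart: 0 if weeks begin on Sunday (%U), 1 if on Monday (%W).
 * wday: day of the week, 0 = Sunday. yday: day of the year, 0 = 1 January.
 * Days before the first wstart day of the year fall in week 0. */
int week_of_year(int wstart, int wday, int yday)
{
    wday = (wday + 7 - wstart) % 7;   /* days since the week began */
    return (yday + 7 - wday) / 7;     /* operands never go negative */
}
```

Adding 7 before each subtraction is what keeps the operands of % and / nonnegative. For Sunday, 2 December 1979 (day 335 of the year), both the Sunday-based and the Monday-based counts come out as week 48, since 1 January 1979 fell on a Monday.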
Figure 15.17: xstrftim.c
/* _Strftime function */
#include <stdlib.h>
#include <string.h>
#include "xtime.h"

        /* macros */
#define PUT(s, na) (void)(nput = (na), \
    0 < nput && (nchar += nput) <= bufsize ? \
        (memcpy(buf, s, nput), buf += nput) : 0)

size_t _Strftime(char *buf, size_t bufsize, const char *fmt,
    const struct tm *t, _Tinfo *tin)
    {   /* format time information */
    const char *fmtsav, *s;
    size_t len, lensav, nput;
    size_t nchar = 0;

    for (s = fmt, len = strlen(fmt), fmtsav = NULL; ; fmt = s)
        {   /* parse format string */
        int n;
        wchar_t wc;
        _Mbsave state = {0};

        while (0 < (n = _Mbtowc(&wc, s, len, &state)))
            {   /* scan for '%' or '\0' */
            s += n, len -= n;
            if (wc == '%')
                break;
            }
        if (fmt < s)    /* copy any literal text */
            PUT(fmt, s - fmt - (0 < n ? 1 : 0));
        if (0 < n)
            {   /* do the conversion */
            char ac[20];
            int m;
            const char *p = _Gentime(t, tin, s++, &m, ac);

            --len;
            if (0 <= m)
                PUT(p, m);
            else if (fmtsav == NULL)
                fmtsav = s, s = p, lensav = len, len = -m;
            }
        if (0 == len && fmtsav == NULL || n < 0)
            {   /* format end or bad multibyte char */
            PUT("", 1);     /* null termination */
            return (nchar <= bufsize ? nchar - 1 : 0);
            }
        else if (0 == len)
            s = fmtsav, fmtsav = NULL, len = lensav;
        }
    }
Figure 15.18: xgentime.c
/* _Gentime function */
#include "xtime.h"

        /* macros */
#define SUNDAY  0   /* codes for tm_wday */
#define MONDAY  1

static int wkyr(int wstart, int wday, int yday)
    {   /* find week of year */
    wday = (wday + 7 - wstart) % 7;
    return (yday - wday + 12) / 7 - 1;
    }

const char *_Gentime(const struct tm *t, _Tinfo *tin,
    const char *s, int *pn, char *ac)
    {
    switch (*s++)
        {   /* switch on conversion specifier */
    case 'a':   /* put short weekday name */
        p = _Gettime(tin->_Days, t->tm_wday << 1, pn);
        break;
    case 'A':   /* put full weekday name */
        p = _Gettime(tin->_Days, (t->tm_wday << 1) + 1, pn);
        break;
    case 'b':   /* put short month name */
        p = _Gettime(tin->_Months, t->tm_mon << 1, pn);
        break;
    case 'B':   /* put full month name */
        p = _Gettime(tin->_Months, (t->tm_mon << 1) + 1, pn);
        break;
    case 'c':   /* put date and time */
        p = _Gettime(tin->_Formats, 0, pn), *pn = -*pn;
        break;
    case 'd':   /* put day of month, from 01 */
        p = getval(ac, t->tm_mday, *pn = 2);
        break;
    case 'D':   /* put day of month, from 1 */
        p = getval(ac, t->tm_mday, *pn = 2);
        if (ac[0] == '0')
            ac[0] = ' ';
        break;
    case 'H':   /* put hour of 24-hour day */
        p = getval(ac, t->tm_hour, *pn = 2);
        break;
    case 'I':   /* put hour of 12-hour day */
        p = getval(ac, t->tm_hour % 12, *pn = 2);
        break;
    case 'j':   /* put day of year, from 001 */
        p = getval(ac, t->tm_yday + 1, *pn = 3);
        break;
    case 'm':   /* put month of year, from 01 */
        p = getval(ac, t->tm_mon + 1, *pn = 2);
        break;
    case 'M':   /* put minutes after the hour */
        p = getval(ac, t->tm_min, *pn = 2);
        break;
    case 'p':   /* put AM/PM */
        p = _Gettime(tin->_Ampm, 12 <= t->tm_hour, pn);
        break;
    case 'S':   /* put seconds after the minute */
        p = getval(ac, t->tm_sec, *pn = 2);
        break;
    case 'U':   /* put Sunday week of the year */
        p = getval(ac,
            wkyr(SUNDAY, t->tm_wday, t->tm_yday), *pn = 2);
        break;
    case 'w':   /* put day of week, from Sunday */
        p = getval(ac, t->tm_wday, *pn = 1);
        break;
    case 'W':   /* put Monday week of the year */
        p = getval(ac,
            wkyr(MONDAY, t->tm_wday, t->tm_yday), *pn = 2);
        break;
    case 'x':   /* put date */
        p = _Gettime(tin->_Formats, 1, pn), *pn = -*pn;
        break;
    case 'X':   /* put time */
        p = _Gettime(tin->_Formats, 2, pn), *pn = -*pn;
        break;
    case 'y':   /* put year of the century */
        p = getval(ac, t->tm_year % 100, *pn = 2);
        break;
    case 'Y':   /* put year */
        p = getval(ac, t->tm_year + 1900, *pn = 4);
        break;
    case 'Z':   /* put time zone name */
        if (tin->_Tzone[0] == '\0')
            tin->_Tzone = _Getzone();   /* adapt zone */
        p = _Gettime(tin->_Tzone, 0 < t->tm_isdst, pn);
        break;
    case '%':   /* put "%" */
        p = "%", *pn = 1;
        break;
    default:    /* unknown field, print it */
        p = s - 1, *pn = 2;
        }
    return (p);
    }
Figure 15.19: ttime.c
/* test time functions */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

int main()
    {   /* test basic workings of time functions */
    char buf[32];
    clock_t tc = clock();
    struct tm ts1;
    time_t tt1, tt2;
    static char *dstr = "Sun Dec  2 06:55:15 1979\n";

    tt1 = time(&tt2);
    assert(tt1 == tt2);
    ts1.tm_sec = 15;
    ts1.tm_min = 55;
    ts1.tm_hour = 6;
    ts1.tm_mday = 2;
    ts1.tm_mon = 11;
    ts1.tm_year = 79;
    ts1.tm_isdst = -1;
    tt1 = mktime(&ts1);
    assert(ts1.tm_wday == 0);
    assert(ts1.tm_yday == 335);
    ++ts1.tm_sec;
    tt2 = mktime(&ts1);
    assert(difftime(tt1, tt2) < 0.0);
    assert(strcmp(asctime(localtime(&tt1)), dstr) == 0);
    assert(strftime(buf, sizeof (buf), "%S",
        gmtime(&tt2)) == 2);
    assert(strcmp(buf, "16") == 0);
    assert(tc <= clock());
    fputs("Current date -- ", stdout);
    time(&tt1);
    fputs(ctime(&tt1), stdout);
    puts("SUCCESS testing <time.h>");
    return (0);
    }
References
W.M. O'Neil, Time and the Calendars, (Sydney, N.S.W.: Sydney University
Press, 1975). Calendars are notoriously idiosyncratic. This book tells you
more than you probably want to know about the history of measuring
calendar time. It also explains why days and dates are named and determined the way they are today.
Exercises
Exercise 15.1 Write a locale file that expresses the time conventions for the French
language. You need to alter:
am_pm
dst_rules
time_zone
days
months
time_formats
Test your new locale. (Hint: You may want to commandeer test programs
in this and earlier chapters as a starting point.)
Exercise 15.2 Determine the rule where you live for beginning and ending Daylight
Savings Time. (If Daylight Savings Time is not observed where you live,
then pick a place that does so where you might like to live.) Write a locale
file that observes this rule. How has the rule changed over the last twenty
years? Can you express all these changes succinctly in a locale-file specification for dst_rules?
Exercise 15.3 Many astronomers believe that the universe "began" approximately 15
billion years ago with a big bang. How many seconds have elapsed since
the big bang? How many bits does it take to represent the seconds that have
elapsed since the big bang?
Exercise 15.4 Leap years generally occur every multiple of four years. They generally do
not occur every multiple of one hundred years. They do occur every
multiple of four hundred years. Alter the function _Daysto, defined in the
file xttotm.c, to determine leap years properly before 1801 and after 2099.
Over what period does it make sense to have this function work properly?
Exercise 15.5 Write the function long delta_days(int year, int mon, int delta_mon)
that counts the days in a span of months. The initial day is the first day of
the month mon in the year year. The span of months is the signed value
delta_mon. Why do you need to specify the initial year?
Exercise 15.6 Implement the primitive functions clock and time for your system. What
can you say about the accuracy (and meaning) of the values returned by
these functions?
Exercise 15.7 In recent years, astronomers have taken to adding "leap seconds" to certain
years, just before midnight on New Year's Eve. (This corrects for the
slowing rotation of the Earth.) Find a list of years that have added leap
seconds. Correct for leap seconds at the appropriate place within the time
functions.
Exercise 15.8 [Harder] Assemble a table of all the time zones in the world. Devise a
mnemonic naming scheme for all the zones. Add a function that lets you
specify your working time zone by this mnemonic name. What do you do
about Daylight Savings Time?
Exercise 15.9 [Very hard] Devise a notation for expressing calendar times succinctly as
text strings. You want people to be able to type these strings easily. Write
the function time_t strtotime(const char *) that parses such a null-terminated calendar time string and produces the corresponding encoded
calendar time. How do you adapt the notation to changes in the current
locale?
Appendix A: Interfaces
This appendix summarizes what you have to do to interface this implementation
of the Standard C library to a given execution environment. It is
aimed primarily at those who intend to do something with the implementation
that I have presented so far. Others may find parts that are of interest,
if only to understand the issues involved. If your concern ends with the C
Standard or with the advice to users, however, you can safely skip what
follows.
Even among potential implementors, goals can vary widely. Some may
wish only to mine the code presented here for a few useful gems. If so, your
challenge is to find a consistent subset that meets your needs, then integrate
it into an existing C implementation. Others may wish to displace completely
an existing C library. If so, you have more work to do. I can only
sketch those extra steps here.
I introduced the header <yvals.h> to summarize as many parameters as
possible. Where that failed, I introduced the header "yfuns.h" to tailor the
names of low-level primitives. I don't pretend that changing these headers
alone will adapt this library to all sensible environments. The code is
riddled with assumptions. Where those assumptions fail to hold, you have
to alter the code to adapt it. Here are the assumptions you must verify:
all files - Review the assumptions starting on page 9. Many parts of the
library also assume that you can define writable static data objects
within the library. See the discussion on page 36.
<ctype.h> - The files xctype.c, xtolower.c, and xtoupper.c assume
that the execution character set is ASCII. Change the tables they contain
for a different character set. These files also assume that a char occupies
eight bits. If a char is larger, you may have to reconsider the approach
based on tables.
<errno.h> - The files errno.c and errno.h assume that you can maintain
errno as a writable static data object. You may have to call a function
on each access to errno to capture a deferred error report.
<float.h> - The files float.h and xfloat.c assume that the format for
floating-point values is IEEE 754 or a closely related form. If the format(s)
differ sufficiently, you may have to reconsider the approach based on
the parameters in <yvals.h>.
<limits.h> - The file limits.h assumes that a char occupies eight bits
and an int occupies either two or four bytes. (See page 77.)
<locale.h> - This code assumes knowledge of the inner workings of
several parts of the library. Look for problems here if you change any
code in: <ctype.h> (translation tables), <limits.h> (MB_LEN_MAX),
<stdlib.h> (multibyte functions), <string.h> (collation functions), or
<time.h> (locale-specific time information).
<math.h> - This code is at least as dependent on floating-point format
as <float.h>, above. (See the discussion beginning on page 127.) Be
prepared to make major changes if double retains more than 56 bits of
precision or has a decimal base.
<stdarg.h> - The file stdarg.h assumes that arguments passed to a
function are stored in ascending storage locations following a predictable
pattern. (See page 211.) You have to reconsider this approach if any
of the assumptions fail to hold.
<stddef.h> - The macro offsetof in file stddef.h assumes that you can
perform several tricks involving pointers and integers. (See page 222.)
If any of those tricks fail, you must find an alternate set of tricks that does
work. (Such a set must exist.)
Nineteen functions depend heavily on the execution environment. You
can think of them as the basic primitives that interface this implementation
to the execution environment. I made little or no attempt to provide
parametric versions of these functions. Expect to make significant changes
here. In many cases, you will find that existing functions in a C implementation
can serve. Unless your goal is to displace completely an existing
library, you can commandeer such functions rather than write your own.
Here is a summary of the primitives:
<setjmp.h> - The functions setjmp and longjmp must be written in
assembly language specially for each implementation. You can probably
adapt the file setjmp.h merely by altering the macro _NSETJMP, defined
in the file yvals.h. Don't even think about using the example files
longjmp.c and setjmp.c, however.
<signal.h> - The files raise.c and signal.c must be modified to
control hardware signals. Some systems provide a direct replacement
for the function signal.
<stdio.h> - Nine functions and macros isolate most of the system
dependencies from the rest of the code. The functions are in the files
remove.c, rename.c, tmpnam.c, xfgpos.c, xfopen.c, and xfspos.c. The
macros are _Fclose, _Fread, and _Fwrite, defined in the file yfuns.h.
Some systems provide direct replacements for a few of these functions.
Check carefully, however, that these candidates have the required behavior
as well as the expected names.
<stdlib.h> - Four functions and macros isolate most of the system
dependencies from the rest of the code. The functions are in the files
getenv.c, system.c, and xgetmem.c. The macro is _Exit, defined in the
file yfuns.h. You can often use the file getenv.c presented here, given a
suitable definition or declaration for the data object _Envp in the file
yfuns.h.
Figure A.1: yfuns.h
/* yfuns.h functions header -- UNIX version */
#ifndef _YFUNS
#define _YFUNS
        /* macros */
#define _Envp   (*_Environ)
#define _Fclose(str)    _Close((str)->_Handle)
#define _Fread(str, buf, cnt)   _Read((str)->_Handle, buf, cnt)
#define _Fwrite(str, buf, cnt)  _Write((str)->_Handle, buf, cnt)
        /* interface declarations */
extern const char **_Environ;
int _Close(int);
void _Exit(int);
int _Read(int, unsigned char *, int);
int _Write(int, const unsigned char *, int);
#endif
<time.h> - Two functions isolate most of the system dependencies from
the rest of the code. The functions are in the files clock.c and time.c.
You can write clock.c in terms of time.c, as I did here. That can be handy
Reserved name    Conventional name
_Environ         environ
_Clock           clock
_Close           close
_Execl           execl
_Exit            exit
_Fork            fork
_Getpid          getpid
_Kill            kill
_Link            link
_Lseek           lseek
_Open            open
_Read            read
_Sbrk            sbrk
_Signal          signal
_Time            time
_Unlink          unlink
_Write           write
I list _Environ first because it names a data object. (Like the macro errno,
defined in <errno.h>, it can be a function call that returns a pointer, if
necessary.) All the rest name functions that provide UNIX system services.
You may well have to write, or alter, assembly language files to supply these
services.
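The parenthetical remark about errno can be sketched as follows. All names here (env_storage, sketch_environ_addr, SKETCH_ENVIRON, sketch_install_environ) are hypothetical. The point is only that a macro can hide a function call behind what looks like an ordinary data object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical storage; a real system might keep this elsewhere. */
static const char **env_storage = NULL;

/* Return the address of the object that holds the environment pointer. */
const char ***sketch_environ_addr(void)
{
    return &env_storage;
}

/* Code that uses the name still treats it as a modifiable data object. */
#define SKETCH_ENVIRON (*sketch_environ_addr())

/* Install a new environment; the assignment goes through the call. */
void sketch_install_environ(const char **envp)
{
    assert(envp != NULL);   /* an empty environment is still an array */
    SKETCH_ENVIRON = envp;
}
```

Because the macro expands to a dereferenced pointer, it remains an lvalue, so existing code that assigns to the name or reads from it needs no change.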
You can cheat and replace the reserved names with the conventional
names. That can be a quick way to get started using this implementation.
But that shortcut also causes a few name collisions. And it violates the rules
in the C Standard about the use of name spaces, of course.
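A minimal sketch of that shortcut, assuming a POSIX host: each reserved name simply forwards to its conventional counterpart. The parameter lists follow the declarations in the file yfuns.h. The use of <unistd.h> and the casts are assumptions of this example, not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>     /* POSIX close, read, write (assumed available) */

/* Close a file descriptor by way of the conventional name. */
int _Close(int fd)
{
    return close(fd);
}

/* Read up to cnt bytes; returns the count read, or -1 on error. */
int _Read(int fd, unsigned char *buf, int cnt)
{
    assert(cnt >= 0);
    return (int)read(fd, buf, (size_t)cnt);
}

/* Write up to cnt bytes; returns the count written, or -1 on error. */
int _Write(int fd, const unsigned char *buf, int cnt)
{
    assert(cnt >= 0);
    return (int)write(fd, buf, (size_t)cnt);
}
```

Each wrapper still refers to close, read, or write by its conventional name, so a program that defines its own function with one of those names collides with it. That is the name-space problem the text warns about.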
Given the necessary primitives, you adapt the remainder of the code by
altering the internal header <yvals.h>. It defines the following macros:
■ _ADNBND - used by stdarg.h to back up an argument pointer (value
typically 0, 1, 3, or 7)
■ _AUPBND - used by stdarg.h to advance an argument pointer (value
typically 0, 1, 3, or 7)
■ _C2 - used by limits.h to distinguish two's-complement representation (value 1) from one's-complement or signed-magnitude (value 0)
■ _CPS - used by time.h to determine the value of the macro
CLOCKS_PER_SEC
■ _D0
■ _DBIAS
■ _DLONG
■ _DOFF
■ _EDOM
■ _EFPOS
■ _ERANGE
■ _ERRMAX
■ _FBIAS
■ _FNAMAX
■ _FOFF - used by xfloat.c to determine the bit offset of a float characteristic in the more-significant word
■ _FOPMAX - used by stdio.h to determine the value of the macro
FOPEN_MAX
■ _FRND - used by float.h to determine the value of the macro FLT_ROUNDS
■ _ILONG
■ _LBIAS
■ _LOFF
■ _MBMAX - ... MB_LEN_MAX
■ _MEMBND - used by several files to enforce the worst-case storage boundary
■ _NSETJMP - used by setjmp.h ... jmp_buf
■ _NULL - ... NULL
■ _SIGABRT - ... SIGABRT
■ _SIGMAX - used by ... signal codes
■ _TBIAS
■ _TNAMAX
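To make two of these parameters concrete, here is a small sketch of how a header might consume them. The names SKETCH_C2, SKETCH_AUPBND, SKETCH_INT_MIN, and sketch_slot_size are hypothetical stand-ins, and the values shown assume a two's-complement machine that aligns arguments on four-byte boundaries. Treating the bound as a mask is one plausible reading of the "typically 0, 1, 3, or 7" values, not necessarily how stdarg.h is written.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for parameters of the kind <yvals.h> supplies. */
#define SKETCH_C2      1    /* 1 for two's complement, 0 otherwise */
#define SKETCH_AUPBND  3    /* round argument sizes up to a multiple of 4 */

/* Most negative int, derived from the most positive one. */
#define SKETCH_INT_MIN (-INT_MAX - SKETCH_C2)

/* Size of an argument slot: nbytes rounded up to the next multiple
   of bnd + 1, where bnd + 1 must be a power of two. */
size_t sketch_slot_size(size_t nbytes, size_t bnd)
{
    assert(((bnd + 1) & bnd) == 0);     /* bnd has the form 2^k - 1 */
    return (nbytes + bnd) & ~bnd;
}
```

With a bound of 3, a one-byte argument occupies a four-byte slot and a five-byte argument occupies eight bytes. A bound of 0 leaves sizes unchanged.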
Figure A.2 shows the file yvals.h. It is a version of the header <yvals.h>
that works with the VAX ULTRIX system. Most of the parameters are
common to many versions of UNIX. The floating-point parameters describe the proprietary format supported by the VAX and the older PDP-11
computer architectures. That format does not truly support codes for Inf
and NaN, but this library defines them anyway. So long as you perform no
arithmetic operations on these special codes, they can survive to convey
useful information.
You can easily modify this version of yvals.h to work with the GNU C
compiler under Sun UNIX (using Motorola MC680X0 microprocessors).
First, change the floating-point parameters to describe IEEE 754 formats:
You must also provide a set of renamed UNIX system services, of course.
If your goal is to displace completely an existing library for a given
compiler, you have two additional concerns:
■ You must supply a C startup header that gets control initially from the
operating system. That requires an intimate knowledge of how the
operating system runs programs. The C startup header ensures that the
call stack is properly set up, that static storage is properly initialized, and
that the three standard streams are open. It calls main, then exit with the
status returned from main. Operating systems vary considerably in how
much of this work they do for you.
■ You must supply any C runtime functions that the generated code may
call. That requires an intimate knowledge of how the compiler generates
code. A switch statement, for example, often calls a runtime function
rather than perform all the compares and branches with inline code.
Compilers vary considerably in how much they depend on C runtime
functions.

Figure A.2: yvals.h
Appendix B: Names
This appendix lists the names of entities defined in this implementation
of the library that have external linkage or are defined in one of the standard
headers. They are the names that your program sees, for good or for ill. A
function name that appears twice has a macro definition that masks its
declaration in the standard header that declares it.
Name               File
BUFSIZ             stdio.h
CHAR_BIT           limits.h
CHAR_MAX           limits.h
CHAR_MIN           limits.h
CLOCKS_PER_SEC     time.h
DBL_DIG            float.h
DBL_EPSILON        float.h
DBL_MANT_DIG       float.h
DBL_MAX            float.h
DBL_MAX_10_EXP     float.h
DBL_MAX_EXP        float.h
DBL_MIN            float.h
DBL_MIN_10_EXP     float.h
DBL_MIN_EXP        float.h
EDOM               errno.h
EFPOS              errno.h
EOF                stdio.h
ERANGE             errno.h
EXIT_FAILURE       stdlib.h
EXIT_SUCCESS       stdlib.h
FILE               stdio.h
FILENAME_MAX       stdio.h
FLT_DIG            float.h
FLT_EPSILON        float.h
FLT_MANT_DIG       float.h
FLT_MAX            float.h
FLT_MAX_10_EXP     float.h
FLT_MAX_EXP        float.h
FLT_MIN            float.h
FLT_MIN_10_EXP     float.h
Name
FOPEN-MAX
HUGE-VU
INT-MAX
INT-MIN
LC-LC-COLLATE
LC-CTYPE
LC-MONETARY
LC-NUMERIC
LC-TIME
LDBL-DIG
LDBL-EPS ILON
LDBL-MANl-DI G
LDBL-4AX
LDBL-MAX-10-EXP
LDBL-MAX-EW
LDBL-MIN
LDBL-MIN-1 0-EXP
LDBL-MIN-EXP
LONG-MAX
LONG-MIN
-
---
SCHAR-MAX
SCHAR_MIN
SEEK-CUR
SEEK-END
SEEK-SET
SHRT-MAX
SHRT-.IN
SIGAERT
SIGE'PE
SIGILL
SIGINT
SIGSEGV
SIGTERM
SIG-DFL
Header
File
Page
float.h
f l o a t .h
f l o a t .h
etdio. h
math.h
limite .h
l i m i t s .h
locale. h
l o c a l e .h
locale. h
l o c a l e .h
locale. h
l o c a l e .h
float.h
f l o a t .h
f l o a t .h
f l o a t .h
f l o a t .h
float.h
f l o a t .h
float.h
f l o a t .h
l i m i t s .h
limits .h
etdio. h
etdlib .h
limite .h
l o c a l e .h
st ddef .h
etdio. h
etdlib. h
etring .h
time. h
etdlib .h
limite. h
l i m i t e .h
etdio.h
etdio. h
etdio. h
limite .h
limite .h
eignal .h
eignal .h
eignal .h
eignal .h
eignal h
eignal .h
eignal .h
66
66
66
276
138
76
76
%
96
96
96
96
%
66
66
66
66
66
66
66
66
66
76
76
276
354
76
96
223
276
354
398
424
354
76
76
276
276
276
76
76
Name
USHRT-MAX
abort
abe
acoe
aectime
aein
I,
I,
aeeert
atan
atan2
atexit
atof
I,
I,
atoi
I,
I,
Header
File
signal. h
signal .h
8tdio.h
1imits.h
1imite.h
1imite.h
1imfte.h
abort. c
ab8.c
acoe c
math.h
a8ctime.c
a8in.c
math.h
assert. h
atan.c
atan2.c
atexit .c
atof .c
etd1ib.h
atoi c
etdl ib. h
at01 c
etd1ib.h
beearch.c
calloc c
cei1.c
c1earerr.c
clock.c
time.h
COB. C
math-h
coeh. c
ctime.c
difftime-c
div c
etdl ib h
errno.c
exit .c
exp c
f abe c
fclose. c
feof .c
ferror-c
ff1ueh.c
f getc c
fgetpoe c
8tdio.h
.
.
beearch
calloc
ceil
clearerr
clock
clock-t
COB
I,
.I
coeh
ctime
dif ftime
div
div-t
errno
exit
exP
fabs
f cloee
feof
f error
fflueh
fgetc
fgetpoe
I,
I,
Page
Name
Header
File
fgete
floor
<stdio. h>
(math. W
(math. W
<etdio.h>
<etdio.h>
<etdio.h>
<etdio.h>
<atdio.h>
<etdio .h>
<etdlib.h>
<etdio.h>
6nath.W
<stdio.h>
<etdio.h>
<stdio.h>
<stdio.h>
<stdio. h>
<stdio.h>
<stdio.h>
<etdio.h>
<stdio.h>
<stdio.h>
<etdio. h>
<stdio h>
<stdlib.h>
<stdio h>
<time. h>
<ctype.h>
<ctype.h>
-type. h>
-type. h>
<ctype.h>
<ctype.h>
<ctype.h>
<ctype. h>
<ctype.h>
<ctype.h>
<ctype.h>
<ctype. h>
<ctype. h>
<ctype. h>
<ctype. h>
<ctype. h>
<ctype. h>
<ctype. h>
<ctype.h>
<ctype.W
fgete. c
floor. c
fm0d.c
fopen. c
etdio. h
f p r i n tf . c
fputc. c
fput8.c
fread. c
free.c
f reopen. c
frexp. c
f ecanf . c
f8eek.c
etdio . h
fsetpos. c
etdio . h
f t e l l .c
s t d i o. h
fwrite . c
getc-c
etdio.h
getchar. c
etdio . h
getenv . c
get8.c
gmtime. c
ctype h
isalnum. c
ctype h
iealpha . c
ctype. h
i e c n t r l. c
ctype . h
i e d i g i t .c
ctype.h
iegraph. c
ctype. h
ielower . c
ctype. h
isprint. c
ctype. h
ispunct . c
ctype. h
isspace. c
ctype. h
ieupper. c
fmod
fpoe-t
fprintf
fputc
fput e
f read
free
f reopen
frexp
fscanf
f seek
I,
11
ftell
*.
,,
getchar
,I
11
getenv
gets
gmtime
iealnum
,.
w.
iealpha
I,
11
iscntrl
I,
11
isdigit
I,
I,
iegraph
I,
11
islower
I,
11
ispunct
11
I,
.
.
Name
jmp_buf
labs
ldexp
ldiv
ldiv-t
localeconv
.I
I,
localtime
log
" *
mbstowce
mbtowc
I.
I,
memchr
memcmp
memcpy
memve
memeet
mktime
modf
off eetof
perror
POW
printf
ptrdif f-t
putc
.I
I,
putchar
I,
I,
puts
qeort
raise
rand
realloc
remova
rename
rewind
ecanf
eetbuf
Header
File
<ctype.h>
ectype .h>
<setjmp. h>
eetd1ib.h~
<math.h>
<stdlib.h>
estd1ib.h~
<locale.h>
<locale.h>
<time.h>
<math.h>
<math.h>
<math.h>
<math.h>
<eetjmp.h>
<etdlib.h>
eetd1ib.h~
<etdlib.h>
<etdlib.h>
<etdlib.h>
<etdlib.h>
e6tring.h~
<etring.h>
e8tring.h~
<etring.h>
q8tring.h~
<time.h>
<math.h>
<etddef.h>
<etdio .h>
<math.h>
cetdio .h>
qetddef .h>
eetdio h>
<etdio.h>
<etdio .h>
<etdio.h>
<etdio .h>
cetd1ib.h~
<eignal.h>
cetd1ib.h~
atd1ib.h~
cetdi0.h~
<etdio .h>
<etdio.h>
Cetdio .h>
<etdio .h>
ctype.h
i8xdigit.c
8etjnlp.h
labs.c
ldexp. c
ldiv. c
etd1ib.h
1ocaleco.c
locale. h
localtim. c
log. c
math-h
log10. c
math.h
long jmp. c
malloc c
mblen. c
etd1ib.h
mbetowce.~
mbtowc c
etd1ib.h
memchr c
memcmp c
memcpy c
memmove c
memeet. c
mktime-c
modf c
etddef .h
perror c
p0w.c
printf .c
etddef .h
putc. c
etdio. h
putchar. c
8tdio.h
put8.c
qsort c
raiee. c
rand. c
realloc c
remove. c
rename. c
rewind. c
ecanf c
eetbuf c
Page
.
.
.
.
.
37
39
187
356
144
356
354
97
96
433
166
138
167
138
189
374
366
354
366
366
354
399
399
400
400
400
436
143
223
298
I68
301
223
297
276
297
276
Name
s e t jmp
I*
11
setlocale
setvbuf
sig-atomic-t
signal
sin
I,
I.
sinh
Header
<setjmp . h>
<set jmp h>
<locale. h>
<stdio.h>
<signal. h>
<signal. h>
<math.h>
<math.h>
<math.h>
<stddef . h>
<stdio h>
< s t d l i b. h>
<string. h>
<time.<stdio.h>
<math.h>
< s t d l i b . h>
< s t d l i b . h>
<stdio.h>
<stdio. h>
<stdio.h>
<stdio. h>
<string. h>
<string. h>
<string. h>
<string. h>
<string. h>
<string. h>
<string. h>
<string. h>
<time. h>
<string. h>
<string. h>
<string. h>
<string. h>
<string. h>
<string. h>
<string. h>
<string. h>
< s t d l i b. h>
<stdlib. h>
<string. h>
<stdlib. h>
< s t d l i b . h>
< s t d l i b . h>
<string. h>
< s t d l i b . h>
<math.h>
sprintf
srand
I,
1,
sscanf
stderr
stdin
stdout
strcat
st rchr
strcmp
strcoll
StrcPY
strcspn
strerror
I,
I,
strftime
strlen
strncat
strncmp
strncpy
strpbrk
strrchr
strspn
strstr
strtod
11
I,
st rtok
strtol
strtoul
11
I,
strxfrm
system
tan
File
s e t jmp . c
setjmp-h
setlocal. c
setvbuf. c
signal. h
signal. c
math.h
sin. c
sinh . c
stddef .h
stdio. h
s t d l i b .h
string.h
time. h
sprintf. c
s q r t .c
srand. c
s t d l i b .h
sscanf .c
stdio. h
stdio. h
stdio. h
s t r c a t .c
st rchr . c
strcmp. c
s t r c o l l. c
strcpy c
strcspn c
s t r e r r o r .c
st ring. h
s t r f t i m e. c
strlen .c
strncat.c
strncmp. c
strncpy. c
strpbrk. c
s t r r c h r.c
strspn. c
strstr . c
s t d l i b.h
strtod.c
strtok. c
strt01. c
stdlib. h
strtou1. c
strxfrm. c
system. c
tan. c
Name
tanh
time
time_t
tmpf i l e
tmpntolower
11
1,
toupper
11
11
ungetc
va-arg
va-end
va-list
va-start
vfprintf
vprintf
vsprintf
wchar-t
.,
wcstcinbs
wctcinb
,I
-A
11
-ADAUPBND
Aldata
Asin
-Assert
-Atan
-BB
Bnd
-C2
-CN
-CPS
-CSIGN
-C!CYPE
-Cmpfun
-Costate
-D
- C t m
-DO
DBIAS
-D I
DLONG
-DOFF
-Daysto
-Dbl
-Dconst
-Def loc
-Dint
Header
<math.h>
<time. h>
<time. h>
<stdio h>
<stdio.h>
<ctype. h>
<ctype h>
<ctype.h>
<ctype.h>
<stdio.h>
<stdarg. h>
<stdarg. h>
<stdarg.h>
<stdarg. h>
<stdio.h>
<stdio. h>
<stdio.h>
<stddef. h>
<etdlib. h>
< s t d l i b . h>
< s t d l i b.h>
< s t d l i b.h>
<yvals. h>
<yvals. h>
"xal1oc.h"
<math.h>
<assert. h>
"xmath.h"
<ctype.h>
<stdarg. h>
<yvals.h>
<ctype. h>
<yvals.h>
< p a l s . h>
<ctype. h>
< s t d l i b. h>
" xstate . h"
<ctype.h>
< w a l e.h>
<yvals h>
xctype. h>
< p a l s.h>
< p a l s.h>
<time. h>
< f l o a t . h>
<math.h>
"xlocale .h"
"xmath. h"
.
.
459
Page
File
tanh.c
time. c
time-h
tmpfile . c
tmpnam-c
ctype.h
tolower. c
ctype . h
toupper. c
ungetc . c
stdarg .h
stdarg .h
stdarg . h
stdarg.h
v f p r i n t f .c
vprintf . c
vsprintf . c
stddef .h
6tdlib.h
w c s t ~. cs
s t d l i b .h
wctcinb. c
yva1s.h
yvals. h
malloc . c
xasin. c
xassert c
xatan . c
ctyp0.h
stdarg .h
yvals. h
d y p e. h
yvals. h
pals.h
ctype. h
s t d l i b.h
x s t a t e .c
xctype.c
yva1s.h
yvals . h
ctype.h
yva1s.h
yva1s.h
xttotm. c
xf l o a t . c
math. h
xdefloc. c
xdint. c
165
426
424
287
284
37
39
37
39
291
211
211
211
211
302
302
Name
Header
File
"xstdio.h"
"xstdio. h"
"xlocale .h"
"xetdio. h"
<etdio.h>
"xetdio.h"
"xstdio. h"
"xtime .h"
"xtime .h"
"xetdio.h"
"xetdio h"
"xetdio. h"
"xlocale h"
"xalloc.hw
"xtime.h"
"xtime.h w
<math.h>
< p a l e . h>
<etdio.h>
<etdio. h>
<etdio.h>
"xmath .h"
"xtime .h"
<yvale.h>
<limits.h>
<ctype.h>
xdnonn.c
xdecale . c
xdtento. c
xdteet . c
xduneca1.c
float. h
yvale. h
yva1e.h
p a l e .h
p a l e. h
errno. h
xexp.c
p a l e .h
float. h
yva1e.h
yva1e.h
p a l e .h
yva1e.h
xf gpoe.c
xfi1ee.c
xf l o a t . c
xfmtval. c
xf open. c
xf oprep. c
xfree1oc.c
xf rprep. c
xf spoe.c
xfwprep. c
xgenld. c
xgentime.c
xgetdst . c
xgetfld. c
xgetfloa . c
xget i n t . c
xgetloc c
xgetmsm. c
xgettime . c
xgetzone .c
xvaluee . c
p a l e .h
etdio. h
etdio. h
etdio. h
xvaluee . c
xiedst .c
yva1e.h
limits. h
ctype-h
-Dnorm
-Decale
-Dtento
Dteet
-Dunecale
-Dvale
-EDOM
-EFPOS
-ERANGE
-E?tRMAx
-ERRNO
-Exp
FBIAS
FLOAT
-FNAMAX
-FOFF
-FOPMAX
-F r n
-Fgpoe
-F i l e s
-F l t
-Fmtval
-Fapen
-Fop=P
Freeloc
-FrPreP
Fspoe
I-reP
n l d
Gentime
Getdat
-Getfld
G e t fl o a t
Getint
-Getloc
Getmem
Gettime
-Get zone
Eugeval
-G
-ILQNG
-IOFBF
-IOLm?
-IONBE'
-Inf
-Iedet
-LBIAS
-LIMITS
-LO
Name
-LOCALE
-LOFF
-Ldbl
-Ldtob
-Ldunscale
-Litob
-Loctab
-Locterm
-Locvar
-MBMAX
-MEMBND
-Makeloc
-Mbcurmax
-Mbsave
-Mbstate
-Mbtowc
-Mbxlen
-Mbxtowc
-NATS
-NCAT
-NERR
-NSETJMP
-NSIG
-NULL
-Nan
-PU
-Poly
-Printf
-P t r d i f f t
-Putfld
-Randseed
-Readloc
-Rteps
-SETJMP
-SIGABRT
-SIGMAX
-SIGNAL
-SIZET
-SIZET
-SIZET
-SIZET
-SIZET
-SP
-STDARG
-STDDEP
-STDIO
Header
<locale. h>
< p a l s h>
< f l o a t . h>
"xstdio. h"
"xmath .h"
"xstdio. h"
"xloca1e.h"
"xloca1e.h"
"xloca1e.h"
<math.h>
<math.h>
< p a l s.h>
< p a l s . h>
"xloca1e.h"
<stdlib. h>
<stdlib. h>
"xstate.h"
< s t d l i b.h>
<stdlib. h>
< s t d l i b . h>
< s t d l i b. h>
(locale. h>
<errno. h>
< p a l s .h>
<signal. h>
< p a l s .h>
"2anath.h"
xctype.h>
"2anath.h"
"xstdio. h"
<yvals. h>
"xstdio. h"
< s t d l i b. h>
"xloca1e.h"
"2anath h"
<set jmp. h>
< p a l s. h>
< p a l s . h>
<signal. h>
<stddef. h>
<stdio. h>
<stdlib. h>
<string. h>
<time.h>
<ctype.h>
<stdarg. h>
<stddef. h>
<stdio.h>
File
locale. h
yva1s.h
xf l o a t . c
xldtob . c
xldunsca . c
x l i t o b .c
xloctab. c
xlocterm .c
xlocterm. c
xlog. c
math. h
p a l s .h
p a l s .h
xmakeloc. c
xstate . c
stdlib.h
x s t a t e .c
~ t o w. c
mblen. c
mbtowc. c
s t d l i b.h
locale. h
errno. h
p a l s .h
signal .h
p a l s .h
xvalues . c
ctype. h
xpoly. c
xprintf. c
pa1s.h
xput f l d . c
rand. c
xreadloc . c
xvalues . c
s e t jmp .h
p a l s .h
p a l s .h
signal. h
stddef .h
stdio. h
stdlib.h
string. h
time .h
ctype. h
stdarg .h
stddef .h
s t d i o. h
461
Page
96
450
68
312
172
310
117
122
122
166
138
450
450
Page
Name
Header
File
-STDLIB
-STR
-STRING
-Scanf
-Setloc
-Sigfun
xetdlib . h>
<assert. h>
<&ring. h>
"xstdio. h"
"xlocale.h"
<signal. h>
<math.h>
<yvals.h>
"x1ocale.h"
< s t d l i b . h>
< s t d l i b . h>
<string. h>
"xtime.h"
"xstrxf nn. hw
<yvals.h>
<time. h>
<yvals. h>
"xtinfo. h"
"xtinfo. h"
xctype. h>
<ctype.h>
<time.<time .h>
xctype. h>
<assert. h>
<stddef. h>
< s t d l i b . h>
<yvals.h>
"xstate. h"
< s t d l i b.h>
< s t d l i b. h>
cctype. h>
<ctype h>
Cctype h>
"2anath.h"
< p a l s . h>
stdlib. h
assert.h
string.h
xscanf c
xsetloc. c
signal .h
xsin . c
yvals. h
xgetloc c
xstod. c
xstoul c
s t r e r r o r.c
xstrftim. c
xstrxfnn. c
yvals. h
time.h
yvals. h
asctime. c
xtinfo .h
xtolower . c
xtoupper .c
xttotm. c
localtim.c
ctype-h
assert.h
stddef .h
stdlib. h
yvals. h
x s t a t e .c
xwctamb.~
wctcmlb. c
ctype.h
ctype.h
ctype.h
xvalues. c
yvals. h
-Sin
-Sizet
-Skip
-Stod
-Stoul
-S t r e r r o r
-Strftime
Strxf nn
-T ITBIAS
-TIME
-TNAMAX
-Times
-Tinfo
-Tolower
-Toupper
-Ttotm
-Tzof f
-UP
-VAL
-WCIlART
-WCIlART
-wchart
-Wcstate
-w c t a m b
-wcxtamb
-XA
-XD
-XS
-Wig
-YVALS
.
.
354
20
398
320
106
200
1 50
450
104
364
360
406
439
409
450
424
450
437
100
40
41
428
433
37
20
223
354
450
107
370
369
37
37
37
1 39
450
Appendix C: Terms
This appendix lists terms that have special meaning within this book.
Check here if you suspect that a term means more (or less) than you might
ordinarily think.
access - to obtain the value stored in a data object or to store a new value
in the data object
address constant expression - an expression that you can use to initialize
a static data object of some pointer type
allocated storage - data objects whose storage is obtained during program
execution
alphabetic character - a lowercase or uppercase letter
alphanumeric character - an alphabetic character or a digit
ANSI - American National Standards Institute, the organization authorized to formulate computer-related standards in the U.S.
argument - an expression that provides the initial value for one of the
parameters in a function call
argument-level declaration - a declaration for one of the arguments in a
function definition or a function prototype
arithmetic type - an integer or floating-point type
array type - a data-object type consisting of a prespecified repetition of a
data-object element
ASCII - American Standard Code for Information Interchange, the U.S.
version of the standard character set ISO 646
assembly language - a programming language tailored to a specific
computer architecture
assertion - a predicate that must be true for a program to be correct
assign - to store a value in a data object
assigning operator - an operator that stores a value in a data object, such
as =, +=, or ++
assignment-compatible types - two data-object types that are valid on
either side of an assigning operator
asynchronous signal - an important event not correlated with the execution of the program, such as someone striking an attention key
atomic - an indivisible operation that synchronizes two threads of control
base - the value used to weigh the digits in a positional number representation, such as base 8 (octal) or base 10 (decimal)
basic C character set - the minimum set of character codes needed to
represent a C source file
beginning-of-file - the file position just before the first byte in a file
benign redefinition - a macro definition that defines an existing macro
to have the same sequence of tokens spelled the same way and with
white-space between the same pairs of tokens
bias - the value added to an exponent to produce the characteristic in a
floating-point representation
binary - as opposed to text, containing arbitrary patterns of bits
binary stream - a stream that can contain arbitrary binary data
block - a group of statements in a C function enclosed in braces
block-level declaration - a declaration within a block
buffer - an array data object used as a convenient work area or for
temporary storage, often between a program and a file
C Standard - a description of the C programming language adopted by
ANSI and ISO to minimize variations in C implementations and programs
call tree - a hierarchical diagram showing how a group of functions call
each other within a program
calling environment - the information in a stack frame that must be
preserved on behalf of the calling function
category - part of a locale that deals with a specific group of services, such
as character classification or time and date formatting
character - a data-object type in C that occupies one byte of storage and
that can represent all the codes in the basic C character set
character class - a set of related character codes, such as digits, uppercase
letters, or punctuation
character constant - a token in a C program, such as 'a', whose integer
value is the code for a character in the execution character set
characteristic - the part of a floating-point representation that holds a
biased exponent
close - to terminate a connection between a stream and a file
code - colloquial term for programming language text or the executable
binary produced from that text
collate - to determine the ordering of two strings by some rule
compiler - a translator that produces an executable file
program termination - the period in the execution of a program just after
main returns or exit is called
push back - to return a character to an input stream so that it is the next
character read
punctuation - printable characters other than letters and digits, used to
separate and delimit character sequences
range error - calling a math function with an argument value (or values)
for which the result is too large or too small to represent as a finite value
read function - one of the functions that obtain input from a stream
read-only - containing a stored value that cannot be altered
recursion - calling a function while an invocation of that function is active
representation - the number of bits used to represent a data-object type,
along with the meanings ascribed to various bit patterns
reserved name - a name available for use only for a restricted purpose
round - to obtain a representation with reduced precision by some rule,
such as round to nearest
rvalue - an expression that designates a value of some type (without
necessarily designating a data object)
scan function - one of the functions that convert text to encoded values
under control of a format string
scan set - a conversion specifier for a scan function that specifies a set of
matching characters
seek - to alter the file-position indicator for a stream to designate a given
character position within a file
semantics - the meaning ascribed to valid sequences of tokens in a
language
sequence point - a place in a program where the values stored in data
objects are in a known state
side effect - a change in the value stored in a data object or in the state of
a file when an expression executes
signal - an event that occurs during program execution that demands
immediate attention
signal handler - a function that executes when a signal occurs
signed integer - an integer type that can represent negative as well as
positive values
signed-magnitude arithmetic - a positional binary encoding where the
negative of a number has its sign bit complemented
significance loss - a reduction in meaningful precision of a floating-point
addition or subtraction caused by cancellation of high-order bits
source file - a text file that a C translator can translate to an object module
VAX - a DEC computer architecture developed as a successor to the DEC
PDP-11, on which C and UNIX are still widely used
void type - a type that has no representation and no values
volatile type - a qualified type for data objects that may be accessed by
more than one thread of control
WG14 - the ISO-authorized committee responsible for C standardization
white-space - a sequence of one or more space characters, possibly mixed
with other characters such as horizontal tab
wide character - a code value of type wchar_t used to represent a very
large character set
width - part of a conversion specification in a format that partially
controls the number of characters to be transmitted
writable - can have its value altered, opposite of read-only
write function - one of the functions that deliver output to a stream
X3J11 - the ANSI-authorized committee that developed the original C
Standard
zero fixup - replacing a floating-point underflow with an exact zero
abort.c 378-379, 455
abort 18, 21, 24, 194-198, 201, 234, 333, 339,
   346, 354, 378-379, 381, 383, 452, 455
abs.c 353, 355, 455
abs 6, 333, 341, 346, 349, 353-355, 382, 386,
   455
access 463
acos.c 152, 155, 455
acos 130, 135, 138, 151-152, 155, 178, 455
Ada 381
address constant expression 217, 463
_ADNBND 211-212, 448-451, 459
_Aldata 371-372, 374-376, 459
alert 31, 33
allocated
   See storage
alphabetic
   See character
alphanumeric
   See character
AM/PM 110-111, 419, 421
ANSI 3, 463
   See C Standard
append
   See file
arbitrary
   See base
argument 463
   array 5, 186
   function 220, 467, 473
   jmp_buf 186
   null pointer 216
   reduction 149, 151, 161, 164
   va_list 210
   variable list 5, 12, 205-212, 214-215, 220,
      222, 258-259, 264-265, 267, 296, 307,
      315, 321, 420, 473
argument-level
   See declaration
arithmetic
   complex 179
   See floating-point
   one's-complement 77, 448, 469
   pointer 217-219, 222, 224, 362
   signed-magnitude 77, 448, 471
   subscript 219
   translation-time 76-78
   two's-complement 35, 77, 218, 309, 343,
      346, 448, 473
   See type
   unsigned-integer 219
array
   See argument
   See type
ASCII
   See character set
   character set 422
asctime.c 101, 426, 430, 437, 455, 462
asctime 418, 420, 422-424, 436-437, 442, 455
asin.c 152, 155, 455
asin 130, 135, 138, 152, 154-155, 178, 455
_Asin 138, 151, 154-155, 459
assembly language 2-3, 47-48, 187, 191, 201,
   230, 283, 329, 386, 414, 446-447, 463
<assert.h> 4, 9, 11, 14, 17-24, 455, 459, 462
assert 11, 17-18, 20-24, 44-45, 54, 70-71,
   125, 176-180, 190-191, 204, 213, 224,
   330-332, 382-383, 412-413, 442, 455
_Assert 20-22, 459
assertion 17-19, 22, 463
assign 463
   See operator
assignment suppression 241-242, 266, 315
assignment-compatible
   See type
asterisk 238-239, 241, 260, 266
asynchronous
   See signal
AT&T Bell Laboratories iii-iv, 73, 81, 473
atan.c 152, 156, 455
atan2.c 152, 157, 455
atan2 131, 135, 138, 152, 155-157, 178, 455
Index
binary
See base
See file
See stream
binary search 358
381-383,455
block 255,464
Atfune
378
See control
atof. c 362-363,455
block-level
atof 5,87,333-334,347,354-355,362-363,
See declaration
383,455
-Bnd 211-212,459
atoi.c 361,363,455
Borland
atoi 5,333-334,347,354-355,361,363,383,
See Turbo C++
455
boundary
at01 .c 361,455
See storage
at01 333,335,347,354-355,361,363,383,
bracket 209,242,268
455
Brender, Ronald F. 381
atomic 46,194-195,198,464
Brodie, Jim xiii, 15
attention key 193,195,197-198,464
broken-down
See time
-AUPBND 211-212,448-451,459
auto 46,183184,466
beearch-c 358,455
beearch 333,340,347-348,350,358,382-383,
455
buffer
backslash 111,115
file 231,474
backspace 31,33,46
BUPSIZ 233-234,238,269,273,276,288,295,
base 381,464
297,325,331-332,453
arbitrary 136,267,336,359
binary 129,164
decimal 113,119,129,136,164,238-239,
241,260-262,267-268,311,419,438,
C Standard
446,464-465
ANSI ix, xi, xiii, 3,15,81-82,228,451,
e 136,164
473-474
hexadecimal 113,119,129,239,241,262,
IS0 iii-iv, ix, xi, xiii, 6,15,81-82,474
268,310,467
C Users Group xii
octal 113,119,239,241,262, 267,464,469 C Users Journal iv, xiii, 223
basic C
-cz 76,448,450-451,459
See character set
localevu 337,421,423,438
calendar
-BB 37-38,42,122,459
beginning-of-file
See time
See file
call tree 94,464
benign
calling
redefinition 12,19,464
environment 201
undefinition 20
calloc.c 373,375,455
Berkeley
calloc 333,338,344,348-349,351,354,373,
See UNIX
375,382,455
bias
carriage
See floating-point
See control
atan 130,135,138,152,155-156,178,455
-Atan 152,156-158,175,459
-Atcount 378
atexit.c 378-379,455
atexit 333,339,344,346347,354,378-379,
carriage return 26,29,31,33,46,226,228,
286,329,452
category
See locale
ceil. c 141,455
ceil 134-135,138,141,143,176,455
CELL-OFF 371-372
-cell 371
CHAR-BIT 74,76,78,367,370,409,453
CHAR-MAX 74-76,78,85-86,90,93,97, 110,
113,122 125,453
CHAR-MIN 74-76,78,453
character 464
alphabetic 32,113,253,463
alphanumeric 28,31-33,463
class 25-27,30-32,34-36,43,108,112-113,
116,123,464
constant 36,108,112-113,217,219, 464
control 28,30-32,108,113,465
conversion 306
See graphic
motion-control 113
multibyte 74,77,112 238,240-241,251,
260,266,303,318,333-334,341-343,
345-346,349,366,368,384, 419,421,
469
padding 230,234,237-239,260-261,269,
306,401
printing 28-29,31,33,42,46,229,234,
240,467,470
punctuation 31-33,35,113,411,464,471
push-back 248,254-255,264,273-274,
288,315,471
See type
wide 112,219-220, 303,318,333,342-343,
345-346,349-350,366,368,384,408,474
character set
ASCII 25-26,30-31,3435,43,112,445,
463
basic C 30,32-33,217,229,303,306,345,
464
EBCDIC 25,34,36,466
execution 26,32,34,43,464,466
IS0 646 35,43,463
Kanji ix, 260,345,384,421
large ix, 344-345,381,421,469
computer architecture 1-3,57-58,73-74,
ctime 418,420,423-424, 436-438,442, 455
137,141,149,257,309-311,323,348,
cctype.h> 4,25-46,87-89,98-99,102 106,
353,371,399,452, 463,465,468,470
108,112-113,116,119,122,265,269,
concatenation
304,320,324,328,360, 362 364,432,
See string
435,445-446,456-457,459-462
constant
ctype 123
character 303
-CTYPE 37-39,41-42,98,102,106,117,124,
See floating-point
459
See integer
currency 468-469
See null pointa
currency symbol 84-87,89,108-110,465
See type
international 84-85,87,89-90, 109-110,
wide-character 303
114
control
IS0 4217 85,89,123,468
block 230-232
D
carriage 226
See character
-DO 67-68,139-142, 144,146-147,172-173,
flow of 18,181-184,472
175,308,448-451,459
See multithread
Dahl, O.J. 22
thread of 36,46,193,284,464,469,
data-object
473-474
See type
conversion
Daylight Savings
specification 238,240-242,260,265-266,
See time
268,307,311,314,321, 465-466,474
-Dayeto 427-429,431,436-437,443,459
specifier 239-242, 260-262,266-267,306,
-DBIAS 67-69,139,142,144,173-174,
310,314,318,321-323,419-421,465,471
448-451,459
converting
DBL-DIG 60,62,66,70-71,453
&e '7pe
DBL-EPSILON 61-62,64-66,70,139,151,176,
copyleft xii
178,180,331,453
copyright ii, xii
DB~J~ANT-DIG60-61,66,70,453
o e . c 151-152, 455
DBL-MN-10-EXP
61-62,66,70,453
coe 131,135-136,138,149,151-152, 178,455
DBL-MAX-EXP 60,62,66,70-71, 453
-Coeave 407
DBL-MAX 61-62, 65-66,70,135,178,453
coeh. c 161-162,455
DBL-MIN-10-EXP 60,62,66,70,453
coeh 131,136,138,161-162,164,180,455
DBL-MIN-DIG 60
-coetate 100,102,106-107,117, 124,409,
DBL-MIN-EXP 62,66,70-71,453
459
DBL-MIN 61-63,65-66,70,453
-CPS 424-425,448,450-451,459
-DH 65-66,68,459
Cray, Seymour 59
-Dconet 137-139,175,459
create
aa 227
See file
debugging 17,19,22,24,182, 191,210,377
creation
DEC
See string
See PDP-11
cross compiler
See ULTRIX
See compiler
See VAX
-CSIGN 76,448,450-451,459
ct ime.c 436,438,455
decimal
See base
point 4-5,83-91,108, 110,114,126,
238-240,261-262,266,314,335,351,
465,473
declaration 465
argumen t-level 463
block-level 464
file-level 4-5,7,12 466
See function
default 465
#define 468
definition 465
See macro
See type
-Def loc 94,101-102,105,124,459
device
See handler
-DI 37-38,42,122,459
diagnostic 17-18,21,27,465
d i f f t i m e.c 426,455
DSIGN 155,310
DST
See time
-~ t e n t 170,174-175,363,365,460
o
-D t e e t 140,144-145,148,150,153-154,
156-157,162-163,165,175,460
143-145,148,157,159,164,166,
-~unecale
168,170-171,174-175,460
-D v a l e 65-66,68-69,460
dynamic
See storage
EBCDIC
See character set
EDOM 49-55,130,140,142-144,148,150,
153-154,156-157,159, 162-163,
165-166,168-169,332,406,412, 448,453
-EWM 53-54,448,450-451,460
efficiency 2,20, 26,74-75
EFPOS 49,53,285-286,406,448,453
d i f f t i m e 416-417,420,423-424,426,442,455
-Eppos 53-54, 448,450-451,460
digit 7,25,28,31-33,43,85-87,89-90,113,
electronic mail 71,177
239-240,261,268,311,314,335-336,
elefunt 129,171,177
359,363,463-465
element 466
hexadecimal 29,31,33,113,268
empty
Dijkstra, E.W. 22
See file
- ~ i n t141-143,149-150,153,167-170,175,
See line
459
end-of-file
d i v.c 353,355,455
See file
div-t 334,341,346,348,353-354,455
See indicator
aiv 333-334,341,346,348-349,353-355,383, enquire 64,71,80
386,452,455
environment 466
divide
calling 182,184-188,464
See zero
freestanding 215-216,452
-DLONG 68,172,308,312,448-451,459
hosted 215,452
-Dnorm 144-147,173,175,460
list 340
-DOFF 67-69,139-140,142,144,146-147,173,
variable 82, 101,108,340,349,378,434,
175,448-451,459
466
dollar sign 112-114,119
-E n v p 378,447
domain
EOF 27-28,30,34,40-45,112,119,219,233,
See error
244-248,264,269,276,280,282,
Dongarra, Jack J. 71
285-286,288,290-291,296,298,300,
dot 9,83,88,238,253,260-261,335,465,470
315,319,321-322,332,453
- D e c a l e 145-146,148,159-161, 169-170,
equal sign 378
174-175,460
480
49-51,53-55,130,135,140,144,159,
162-163,166,168-169,175,335-337,
347,361-362,406,448,453
-ERANGE 53-54,448,450-451,460
-ERRMAX 53-54,448,450-451,460
errno.c 54,445,455
<errno.h> 4,47-56,135,175,272,330,347,
373,395,406,412,445,447-448,
452-453,455,460-461
errno 5,47-55,130,135,140,142-144,148,
150,153-154,156-157,159,162-163,
165-166,168-169,174-175,196,
249-251,272,285-286,298,332,
334-337,347,360-362,373,395,445,
447,452,455
-ERRNO 53,460
error
domain 49,55,128,130-131,133-134,
152,327,465
file-positioning 466
See indicator
range 49,55,128,130-133,161,347,471
read 233,245-248,251-252,254,263,282,
291,329
See stream
write 233,240,243-249,252,254,272,
282,292,296
#error 40
escape 113,260,265,303
EUC 384
exception 192,466
executable
file 468,474
execution
See character set
_EXFAIL 353,451
exit.c 378-379,455
EXIT_FAILURE 22-23,202,204,334,339,346,
348,353-354,379,382-383,453
EXIT_SUCCESS 23,204,334,339,346,348,
353-354,381-382,453
exit 23,194,196-197,201-202,204,234,333,
339,344,346-348,353-354,378-379,
381-382,385-386,447,449,455
_Exit 378,446
exp.c 161-162,455
ERANGE
Index
exp 48,62,132,136-138,161-162,164,180,455
_Exp 160-165,169,175,460
exponent
See floating-point
expression 466
extended precision
See floating-point
external linkage 2,5,9-10,12,48,50,184,
186-187,207,363,368,447,453,468
fabs.c 140,455
fabs 51,134,136,138,140,176,178,180,
331,455
failure
input 241-242,244,263,329
matching 241-242,244,264,266,268,329
fair use xii
_FBIAS 67-68,448-451,460
fclose.c 278,280,455
fclose 105,232,236,252,270,276,278,
280-281,331-332,379,455
_Fclose 278,282,287,329,446
feof.c 287-288,455
feof 243,250-251,270,276,287-288,332,
455
ferror.c 287-288,455
ferror 243,250-251,270,276,287-288,332,
455
fflush.c 292,298,455
fflush 236-237,256,270,276,280,286,292,
296-300,332,455
fgetc.c 288,290,455
fgetc 27,30,232,234,245-246,253-254,
271-272,276,288,290-291,318-319,
332,455
fgetpos.c 288-289,455
fgetpos 232,249,254,256,270-272,276-277,
285,289,331,455
_Fgetpos 329
fgets.c 291,293,456
fgets 115,245,271-272,276,291,293,332,
456
_Fgpos 452
field 110,466
file (continued)
truncation 240
truncate 234,237,275
update 237
width 238-242, 251,260-261,266-267,
file-level
306-307,321
See declaration
file 466
append 234
file-position
See indicator
batch 108
beginning 255
file-positioning
See error
beginning-of 234,249-250,269,273,464
binary 25,228,230,235,237,253,255,
See function
__FILE__ 18
258,269,285,464
buffer 464
FILE 124,231-234,251-252,254,270,
close 182,229,234-237,270,273-275,278,
274-278,288,296,315,322-323,453,466
282,339,346-347,464
FILENAME-MAX 233,251,253,269,276,325,
331-332,448,453
create 229,234-235,237,251,253,272,275
_Files 276-280,292,298,379,460
descriptor 227,231,274,466
empty 229,234
finite-state machine 366,368,467,472
See table
end-of 114,226,229-230,233-234,237,
fixed-length
242,244-247,249,251-253,269-270,
275,282,291,466
See record
<float.h> 4,57-72,74,77,127,135,151,
executable xii, 1,88,464-466
handle 227,274,467
174,176,178,180,215,312,330,333,
364,445-446,448,453-454,459-461
header 7,91,98,201,253,467
include 467
_FLOAT 66,460
floating-point
interactive 232,235,237,255-256,270
length 227,229-230
arithmetic 57
locale 95, 101,108-110,112-116,118-119,
base 60,129
126,384,411,413,438,443
bias 464
long 230
characteristic 67,139,141,145,448,464
See name
constant 64,335
open 114,228,230-231,233-238,251-253,
conversion 108
exception 198
256,269-275,277-278,282,285,329,
339,449,466,469,472
exponent 60,67,129,136-137,143,145,
See record
157,164,170,240,261-262,311,314,
335,363,448,464,466
remove 235,272-274,278,329,339,
346-347
extended precision 149,161,164,170-171
fraction 67,129,132-133,363,466-467
rename 235,272,278,329
reopen 237
gradual underflow 63,127,141,145
hidden bit 67
source xii, 1,7,9-12,16,19,32,94,98,
101,113,181,201,325,464,467,
IEEE 754 55,61,63-65,67,69,71-72,
127-128,137,141,171,311,363,445,
470-471,473
448-449,468-469
temporary 227,233,235-236,269,
Inf 52, 128,135-137,139-140, 167,179,
272-274,278,284,339,346-347
310-311,386,449
text 1,108,228-230,237,253,255,258,
infinity 52,127-128,134-135,311,467
265,269,285-286,329,452,472
floating-point (continued)
NaN 52,128,139-140,167,179,310-311,
386,449
not-a-number 52,127-128,311,386,469
overflow 49,58,62,72,127-128,130,145,
161,164,170,195,198,363,470
precision ix, 58,60,64,77,127-129,135,
145,149,164,171,323,363,446
representation 464
rounding 59-60,72,239,314,471,473
significance loss 49,58,62,64,127,
136-137,152,161,363,471
truncation 59-60,473
See type
underflow 49,58,62-63,128,130,145,
151,161,170,198,335,363,466,
473-474
wobbling precision 129
zero fixup 58,63,128,130,335,474
floor.c 141,456
floor 134,136,138,141,143,176,456
flow
See control
FLT_DIG 60-61,66,70-71,453
FLT_EPSILON 61,66,70-71,331,453
FLT_MANT_DIG 60-61,66,70-71,453
FLT_MAX_10_EXP 61,63,66,70-71,453
FLT_MAX_EXP 60-61,66,70-71,453
FLT_MAX 61,66,70-71,453
FLT_MIN_10_EXP 60-61,63,66,70-71,453
FLT_MIN_DIG 60
FLT_MIN_EXP 61,63,66,70-71,454
FLT_MIN 61,66,70-71,453
FLT_RADIX 60-61,63-66,70-72,454
_FLT_RADIX 67
FLT_ROUNDS 60,64,66,71,448,454
_FLT_ROUNDS 67
_Flt 65-66,68,460
flush
See stream
fmod.c 145,148,456
fmod 134,136,138,145,148,176,456
_Fmtval 90-92,94-95,123,126,262,460
_FNAMAX 276,448,450-451,460
_FOFF 67-68,448-451,460
fopen.c 278-279,456
FOPEN_MAX 233,235,269-270,276-280,298,
325,331-332,379,448,454
fopen 105,228-229,232,236-237,251-253,
270-272,276,278-279,287,331-332,456
_Fopen 278,281-282,284-285,323,329,452,
460
_FOPMAX 276,448,450-451,460
_Foprep 278-281,323,460
form feed 26,29,31,33,229
format 91,94,259-260,264-267,296,303,
306,315,419-420,422-423,437-438,
46547,474
FORTRAN 127,177,206,225
fpos_t 233,256,270-272,276-277,285,456
fprintf.c 296,301,456
fprintf 5,20,238,240,242-244,258-259,
271-273,276,296,301,329,331,456
fputc.c 291,296,456
fputc 27,44,232,234,246,254,271-272,
276,291,296-298,300,332,456
fputs.c 296,300,456
fputs 21,23,44,105,202,209,246,271-272,
276,296,298,300,332,442,456
fraction
See floating-point
fragmentation
storage 345
frame
See stack
fread.c 291-292,456
fread 248,271,276,291-292,332,456
_Fread 282,286-287,291,329,446,452
free 467
Free Software Foundation
See GNU
free.c 373,376,456
free 89,103,105,118,120,280,289,333,
338,344,348-349,351,354, 373-374,
376-377,382,431,433,456
_Freeloc 105,116-119,124,460
freestanding
See environment
freopen.c 278,280,456
freopen 237,251-252,270-271,276,278,280,
331-332,456
frexp.c 143,456
frexp 132,136,138,143,145,176,456
_FRND 66-67,448-451,460
_Frprep 288,290-295,323,460
fscanf.c 315,318,456
fscanf 5,240-244,263-265,271,276,315,
318,331,456
fseek.c 288-289,456
fseek 233,237,248-250,254-256,269-272,
277,289,331,456
fsetpos.c 288,290,456
fsetpos 232,237,248-249,254,256,270,
272,277,285,290,331,456
_Fsetpos 329
_Fspos 277,282,286-290,452,460
ftell.c 288,290,456
ftell 249-250,254-255,269-272,277,290,
331,456
function 467
argument 224
date 82
declaration 1-2,4-5,10
file-positioning 230,237,248-249,
254-255,270,273,275,285,288,452
_Gentime 427,438-440,460
318
getc.c 288,290,456
getc 26-27,30,246,254,271-272,274,277,
288,290,332,456
getchar.c 288,291,456
getchar X, 27,30,246-247,272,274,277,
288,291,332,456
_Getdst 427,430-432,434,460
getenv.c 378,380,446,456
GET
getenv 82,104-105,333,339-340,349,354,
378,380-382,386,434-435,456
_Getfld 321,323-324,460
_Getfloat 323-324,328,460
_Getint 321,323-324,326,460
_Getloc 94,99,101-104,114,116,124,460
_Getmem 371,373-375,460
GETN 321
gets.c 291,294,456
gets 247,271-272,277,291,294,332,456
_Gettime 427,430-431,433-434,438,
440-441,460
_Getzone 427,430-431,433-435,441,460
multibyte 77,87,341,344,363,446
GMT 415,423,467,473
gmtime.c 427,456
nesting 181
gmtime 418,420,423-424,427,430,442,456
numeric conversion 87
GNU
parameter 220,224
C xii,212,449,451,467
print 84,87,94,171,212,225,238,
Project xii
257-261,263-265,271-275,296,301,
309,314,323,325,345,420,467,470
goto 181-182
prototype 206,208,216,220,259,463,467 nonlocal 181,184-185,192
read 253,273,275,471
gradual underflow
See floating-point
scan 87,171,212,225,255,263-266,268,
271,273-275,296,314,318,323,325,
graphic 31,33,467
Griswold, R.E. 411
329,345,351,467,471
storage allocation 344
Grosse, Eric 71
time 100,420,437,467
grouping 84-87,89,110,114,126
guard
write 253,474
_Fwprep 291-292,296-297,299-300,323,460
See macro
fwrite.c 296,299,456
fwrite 248-249,272,277,296,299,301-302,
332,446,456
handle
_Fwrite 282,286-287,329,452
See file
handler
G
device 226,228,465
_Genld 313-314,316,323,460
signal 193-197,199-201,471
Hart, John F. 177
header 1-2,5,12
See file
idempotence 4,7,11,19
independence 4,7,11
internal 53,98,275,281,445,448
See name
standard xi, 4-5,7,9-12,16,53,95,98,
116,123,216,333,425,453,472
heap 89,116,344,467
See storage
hexadecimal
See base
See digit
hidden bit
See floating-point
hiding
See name
Hoare, C.A.R. 22,358
hole
See storage
Horner's Rule 151
hosted
See environment
HUGE_EXP 161
HUGE_RAD 149
HUGE_VAL 130,134-135,137-139,171,
176-177,335,454
_Hugeval 138-139,460
I/O 468
IBM
See PC
See System/370
idempotence
See header
identifier 467
IEEE 467
IEEE 1003
See POSIX
IEEE 754
See floating-point
#if 5,19,50,60,74-75,77,79
ignoring
See signal
_ILONG 76,448,450-451,460
implementation 467
include
See file
#include 1,4,7-8,12,467,473
independence
See header
indicator
end-of-file 233,237,245,247-250,252,
254,256,263,270,275,466
error 233,237,245-247,250-252,254,263,
270,272,275,466
file-position 49,230,233-234,237,
245-246,248-256,269-272,282,
285-287,466,471
Inf
See floating-point
_Inf 139-140,146,159-160,162-163,166,
168,175,460
infinity
See floating-point
inline
See code
input
See failure
See stream
input/output model 225,227-228,231,452
INT_MAX 74,76,78-79,218,224,289,324-325,
436,454
INT_MIN 74,76-78,436,454
integer 467
constant 336
constant expression 221-222,224,468
overflow 33-34,195,198,306,346,352,
359,362-363,401,429-430,434,437,470
See type
Intel
80x86 372,468
80x87 52,64,67,69,140, 468
interactive
See file
interface 47,468,470
internal
See header
international
See currency symbol
International Date Line 430
interpreter 1,468
invalid 465,468
ioctl 226,228
_IOFBF 233,238,269,273,276,288-289,332,
460
_IOLBF 233,238,269,273,276,289,331-332,
460
_IONBF 233,238,269,273,276,288-289,332,
460
isalnum.c 37,456
isalnum 28-29,32,37,43-45,456
isalpha.c 38,456
isalpha 26,28,32,35,37-38,44-45,88,116,
435,456
iscntrl.c 38,456
iscntrl 28-29,33,35,37-38,44-45,456
_Isdst 429
isdigit.c 38,456
isdigit 26,28-29,32-33,37-38,44-45,122,
305,321,328,364-365,432,435,456
_Isdst 100,117,427,429-431,460
isgraph.c 38,456
isgraph 28,33,37-38,45,456
islower.c 38,456
islower 28-30,32-33,35,37-38,44-45,88,
456
ISO 3,468
ISO 4217
See currency symbol
ISO 646
See character set
ISO C Standard
See C Standard
isprint.c 38,456
isprint 27,29,33,35,37-38,44-45,456
ispunct.c 39,456
ispunct 28-29,33,37,39,44-45,456
isspace.c 39,456
isspace 26,28-29,33,35,37,39,44-45,101,
241,265,318,320-321,324,335,351,
360,362,364,456
isupper.c 39,456
isupper 28-30,33,35,37,39,44-45,456
isxdigit.c 39,457
isxdigit 29,32-33,37,39,44-45,457
JIS 384
jmp_buf 182-188,191-192,449,457
See argument
justify 238-239,260
Kahan, W.M. 72
Kanji
See character set
Kernighan and Ritchie 15,73
Kernighan, Brian W. 15,327
keyword 4,7,9,16,109,114-116, 119,224,
347
knock out 95,232,468
Knuth, Donald 381
Koenig, Andy 205
L_tmpnam 233,236,269,276,284,287,325,
331-332,449,454
label
See variable
labs.c 353,356,457
labs 333,341,349,353-354,356,382,386,
457
large
See character set
Lawson, Charles L. 177
_LBIAS 67-68,173,312,448-451,460
LC_ALL 84,86-87,96,102-103,108,125,454
LC_COLLATE 83-84,87,96,106,125,390,395,
397,407,454
LC_CTYPE 83-84,87,96,106,110,125,334,
341,343,353,366,368,454
LC_MONETARY 83-84,86-87,89,96,98,106,
109-110,125,454
LC_NUMERIC 83-84,86-87,89,96,106,110,
125,454
LC_TIME 83-84,87,96,106,110-111,125,
419-420,424,426,437,454
lconv 84-85,89-91,95,98,101,109-110,114,
126
LDBL_DIG 60,66,70-71,313,454
61,66,70-71,331,454
LDBL_MANT_DIG 60,66,70-71,454
LDBL_MAX_10_EXP 61,66,70-71,454
LDBL_MAX_EXP 60,66,70-71,454
LDBL_MAX 61,66,70-71,454
LDBL_MIN_10_EXP 60,66,70-71,454
LDBL_MIN_DIG 60
LDBL_MIN_EXP 66,70-71,454
LDBL_MIN 61,66,70-71,454
_Ldbl 65-66,68-69,461
ldexp.c 144-145,457
76,460
line
empty 229
feed 26,226,228,286,329
length 229,234,251
long 229
partial 229,234
text 229,234,271,286,329
__LINE__ 18,21
_Linfo 98-99,116,118
linker 1-2,15,36,95,199,314,468
ldexp 63,70-71,132,136,138,144-145,
list
176-177,457
See environment
ldiv.c 353,356,457
literal
ldiv_t 334,341,346,354,457
See string
ldiv 310,313,333-334,341,346,349,
353-354,356,383,386,457
_Litob 307-311,323,461
_LO 37-38,42,122,460
LDSIGN 310
local
_Ldtob 307,309,311-312,314,323,461
See time
locale ix, 27-28,30,32-33,35-36,46,74,
_Ldunscale 171-173,175,311-312,
461
leap
81-84,87-89,91,95,98-101,108,
day 425,427
113-114,117,123,126,217,261,266,
second 420,443
303,334,341,343,351,395,413,422,
year 427,429,443
452,468
length
27-29,31-33,35,42,46,84-85,88-89,
See file
96-97,99-100,109,112,116,119,123,
See line
265,335-336,351,381,437,468,473
letter 4,25,31-35,43,108,239,336,468
category 83-85,87,95,98,100-101,
lowercase 7,9,29-34,113,123,411,463,
109-111,334,341,343,353,368,390,
468
395,397,407,419-420,424,426,437,
uppercase 4,9-10,29-31,33-34,50,109,
464
113,123,275,283,411,463-464,473
expression 109,113
librarian 2,468
See file
library 468
mixed 97-98,123
definition 1
native 84,88,96-97,101,108-109,123,469
design x-xi, 2-3,114,373,377,387
reverting 32,88-89,97,99
function 1,5,26,48,127
specific 99-100,111,116,423,426,430,
object-module xii, 2
434,437-438,446,468
shared 36,46,52
"USA" 108-109,114,123
Standard C ix, 215,472
<locale.h> 4,81-126,216,265,316,328,
licensing ii, xii
333,364,386,446,454,457-458,461
<limits.h> 4,40-42,44,59,73-80,90,92,97, m
~ 101,108
~
~
~
~
106-107,110,122,124-125,159,215,
_LOCALE 96,98,461
218,224,289,320,324,346,352,360, localeco.c 95,97,457
362,364,367,369-370,382,409,436,
localeconv 5,84-87,92,95-98,125,316,328,
446,448,453-455,460
LDBL_EPSILON
_LIMITS
lm~ll
localtim.c 430,433,457,462
localtime 418-421,423-424,430,433,436,
442,457
"LOCFILE" 108
_Locitem 116,118
88-89,97,103-105,120-121, 279,287,
289,295,297,333,338,344,348-349,
351,354,372-373,375,377,382,397,
432,435,457
masking
See macro
matching
See failure
malloc
_Loctab 119
_Loctab 115-118,124,221,461
_Locterm 119-120,122,124,461
_Locvar 119,121-122,124,461
<math.h> 4,48-49,51,54,70,127-180,311,
330,446,454-462
_LOFF 67-68,172,448-451,461
log.c 164,166,457
_MATH 138,461
log10.c 164,167,457
log10 70,133,136,138,164,167,180,457
MB_CUR_MAX 110,112,304,320,334,342-343,
346,349,353-354,367-370,381-383,454
log 62,133,136,138,164,166,169,180,457
_Log 138,164,166-167,461
MB_LEN_MAX 74,76-78,106,334,346,352,
368-369,382,446,448,454
long
_Mbcurmax 102,106-107,117,124,353-355,
See file
461
See line
mblen.c 363,366,457,461
366,383,457
longjmp.c 189,446,457
longjmp 24,182-187,189-192,194-197,201,
_MBMAX 76,448,450-451,461
_Mbsave 304,320,354-355,363,366-367,439,
446,457
231,447
lvalue 50,118,198,468
modifiable 52,209,251,469
461
_Mbstate 100,102,106-107,117,124,301,
318,366-367,370,438,461
mbstowcs.c 363,366,457
mbstowcs 99,112,303,333,343,345,350,
354,363,366,383,457
mbtowc.c 363,366,457,461
mbtowc 99,112,301,333,342-343,345-346,
350,352,354-355,363,366,
383-385,457
_Mbtowc 301,304,318,320,355,363,
366-368,438-439,461
_Mbxlen 355,363,366,461
_Mbxtowc 355,363,366,461
machine 468
macro 468
definition 1-2,4-5,19,468
guard 11,19,53,468
masking 5-7,9-10,16,36,42-43,137,
151-152,164,199-201,254,271,
287-288,353,359,363,368,399,453,
member 469
469
_MEMBND 371-372,374-377,448-451,461
unsafe 5,26,246-247,254,473
memchr.c 399,457
Maehly, Hans J. 177
memchr 293-294,299,325-326,361,391,394,
398-399,403,412,457
mail
See electronic
memcmp.c 399,457
main 2,15,24,234,346,348,353,449
memcmp 104-105,382-383,389,394,398-399,
401,410,412-413,457
maintenance 13-14
make 19
memcpy.c 399-400,457
_Makeloc 105,114,116,118-120,124,221,461
malloc.c 372,374-375,457,459
lseek
105,121,188-189,210,292-294,
299-300,302-303,310,312,316-317,
357-358,369,377,388,394,398-401,
412,439,457
memmove.c 400,457
memmove 91,93,388,394-398,400,412,457
memset.c 400-401,457
memset 375,393-394,398,400-401,412,457
Mesztenyi, Charles K. 177
mktime.c 434,436,457
mktime 417,420,423-424,429,434,436-437,
442,457
mode 469
modf.c 143,457
modf 133,135-136,138,143,177,457
module
See object module
monetary 84-87,89-90,126,469
month
See name
Motorola
MC680X0 64,449,469
MC68881 52,469
MS-DOS iv, 82,108,226,228,452,469
multibyte
See character
See character set
See function
Multics iv, 227
multithread 46,82-83,193,198,329,469
memcpy
name 469
category 98
external 94
file 5,7,9-10,12,82,95,233,235-237,
251-253,269,272-274,278,284,329,466
header 7,9,14
hiding 181
length 251
locale 98-100,109,116,126
month 111,419,421,443
reserved 4,7,9,11-12,20,50,83,275,323,
353,399,447,471
space viii-ix, 5, 16,447,469
weekday 111,419,421,443
NaN
See floating-point
139,148,150,153-154,159,166,
168-169,175,461
native
See locale
_NATS 461
_NCAT 96,102-103,461
NDEBUG 4,11,17-20
_NERR 53,55,406,461
nesting
See function
newline 26,29,31,33,46,226,228-229,234,
242,246-247,251,271-272,413,452
Newton's Method 157
nonlocal
See goto
not-a-number
See floating-point
NOTE 109
_NSETJMP 187,446,449-451,461
_NSIG 199-200,202-203,461
null
See character
null pointer 469
See argument
constant 216-217,220-221,343,469
NULL 11,84,91,96,216-217,220-223,233,
269,276,334,353-354,388,394,398,
416,422,424-425,449,454
_NULL 95-96,222-223,276-277,354,398,424,
449-451,461
numeric conversion
See function
_Nan
O
O'Neil, W.M. 443
object
See data
object module xi-xii, 1-2,88,468-469
octal
See base
offset 469
offsetof 116-117,216-217,221-224,446,457
one's-complement
See arithmetic
open
See file
open 231,447
operand 469
operating system 470
operator 469-470
assigning 52,463
right-shift 58
optimization 21,24,53,183,186,188,256,
388
order
See storage
output
See stream
overflow
See floating-point
integer 80,135,145,161,218-219,309,474
overlap
storage 91,474
P
306-307
padding
See character
parameter 463,466-467,470
parametric
See code
paranoia 72,171
parenthesis 10,209
parse 263,321,470
partial
See line
Pascal 2,181,192
PC iv, 187,468-470,473
PDP-11 iii-iv, 25,57,195,198,203,205-206,
227,449,470,474
Pemberton, Steven 71
per cent 238,240-242,262,265,268,303,
306,318,321,419-421,465
performance ix, 13,15,19,26,46,52,99,
129,143,145,157,161,179,183,
231-232,254,256,271,292,318,363,
398-399,413-414
period 470
perror.c 292,298,457
PAD
54-55,251,272,277,292,298,327,
332,395,399,406,457
_Pft 306-307
PIP 226-227,470
PL/I 182,192,227
Plauger and Brodie iv, xiii, 8,15,351-352,
421
Plauger, P.J. 15,223,327
Plum Hall Inc. xii
Plum Hall Validation Suite xii
Plum, Thomas xiii, 15
Poage, J.F. 411
pointer
See arithmetic
See null pointer
See type
Polonsky, I.P. 411
_Poly 151,154,158,175,461
portability ix, 2-3,7,11,35,50,53,58,62,
64,73-75,80, 83,88,119,127,187,193,
195,197,203,205,216,219,221-222,
229,255,258,261,264,268-269,273,
307,343,353,385,395-396,470
POSIX 470
IEEE 1003 73,80,470,474
pound sign 260,306
pow.c 164,168-169,457
pow 63,133,136-138,164,167-168,170,180,
457
precision 238-240,260-262,266,306-307,
311,314,470
floating-point 129
predicate 18,274,463,467,470
preprocessor 75-76,78,470
primitive 137,177, 179,231-232,274,278,
281-283,287,327,329,378,420,425,
443,445-446,448,452,470
print
See function
printf.c 296,301,457
printf 1,3,5,70,78,91,177,191,204,213,
220,224,243,245,258-259,263,
272-273,277,296,301,307,309,329,
331,383,457
_Printf 296,301-304,306-307,311,314,318,
322-323,438,461
perror
printing
See character
program 470
startup 2,50,113,196,232,235,252,344,
351,449,452,470,472
stub 22,452,472
suspension 193-194,196
rand.c 358-359,457,461
RAND_MAX 334,337,346,354,359,381-383,454
rand 333-334,337,344,346,350-351,355,
358-359,383,457
_Randseed 355,359,461
range
termination 17-18,21-22,27,79,193-195, See error
197-198,201,235,251,270,273,327,
Rationale ix, 4,15
334,339,344,346,348-349,353,378,
read
381,471-472
See error
prototype
See function
See function
See stream
ptrdiff_t 216-219,223,362,457
read-only 36,258,264,465,471-472
_Ptrdifft 222-223,450-451,461
_PU 37-39,42,122,461
punctuation 108
See character
push-back
See character
read
231,447
readability 4,11,65
_Readloc 105,114-116,120,124,461
realloc.c 377,457
realloc 333,338-339,344,348,351,355,
373,377,382,457
306
putc.c 291,297,457
record 452
fixed-length 229,253
putc 26-27,247,254,271-272,274,277,297, recursion 292, 358,471
335 457
reduction
putchar. c 291,297,457
See argument
putchar 27,247,272,274,277,
297,332, 457 register 10,46,183-184,188-189,466
putenv 83
remove
_Putfld 305,307-310,314,323,461
See file
PUT
puts.c 296,300,457
remove. c 283,457
puts 22-23,45,54,71,79,125,177,179-180, remove 235,251,272,277-278,280,283,329,
191,204,213,224,247,271-272,
277,
296,300,332, 382-383,413,442,457
Q
qsort.c 353,356-357,457
qsort 333,340-341,347,350,353-354,
357-358,382-383,457
Quicksort 350,353
quotes 413
radix
See base
raise.c 200,202,446,452,457
raise 193,195-204,339,346,379,457
332,457
rename
See file
rename.c 283,446,457
rename 235,251,272,277-278,
457
representation
See type
reserved
See name
reusability xi, 1
reverting
See locale
See signal
See storage
rewind.c 288,290,457
283,329,332,
237,248,250,254-256,270,272,277,
288,290,331,457
_Setloc 94,101-103,106,124,462
setlocal.c 94,99-100,102-103,458
Rice, John R. 177
setlocale 4,27,83-86,88-89,94-95,97-101,
108-109,114,265,458
Ritchie, Dennis 3,15,205,226-227
Rochkind, Mark J. 55
setvbuf.c 288-289,458
setvbuf 233-234,238,256,269,273,277,
288-289,331,458
rounding
See floating-point
RSX-11M iv
_Sft 315
_Rteps 139,151-154,156,158,160-161,163,
165,175,461
shareware xii
shift
rvalue 471
See state
SHRT_MAX 74,76,78-79,365,454
S
SHRT_MIN 74,76,78-79,365,454
side
effect 26,197,246-247,254,346,466,
SAFE-EXP 170
470-471,473
scan
sig_atomic_t 194-197,200,203,458
See function
SIG_DFL 195-196,200-202,204,454
set 242-243,266,268,471
SIG_ERR 23,195-196,199-201,203-204,455
scanf.c 315,319,457
scanf 5,243-244,255,263-265,273,277,315,
319,331,457
SIG_IGN 195-196,199-200,202,204,455
SIG_ILL 196
SIGABRT 23-24,195,197-200,202,204,339,
346,378-379,381-383,449,454
_Scanf 314-315,318-323,462
SCHAR_MAX 74-76,78-79,454
_SIGABRT 199-200,449-451,461
SCHAR_MIN 74-76,78,454
SIGFPE 195-198,200,202-204,454
seek
See stream
_Sigfun 199-200,202-203,462
SIGILL 195,200,202,204,454
SEEK_CUR 233,249,269,271,276,282,286,
332,454
SIGINT 195,197-198,200,202,204,454
_SIGMAX 199-200,449-451,461
SEEK_END 233,249,269,271,276,282,332,
454
sign 84-87,89,109-110,113-114,126,129,
155,239,260-261,268,306-307,
335-337,359,363
SEEK_SET 233,249,269,271,276,282,286,
290,331-332,454
signal
185,193,195-198,201,203,339,449,
semantics 471
452,471
semicolon 98
asynchronous 193-195,197-198,464
separator 397
handler 185-186,339,346, 378,381
See thousands separator
hardware 201,204,446
sequence point 194,471
ignoring 193,195-196,198
SET 109,113,119
reverting 194,196
setbuf.c 288,457
synchronous 193-194,472
setbuf 233-234,238,256,273,277,288,331,
457
signal.c 201,203,446,458
<signal.h> 4,22,24,49,189,193-204,346,
setjmp.c 188,446,458
379,446,449,452,454-455,457-458,
<setjmp.h> 4,24,181-192,194-195,201,446,
461-462
449,457-458,461
signal 22-23,49,186,193,195-201,203-204,
setjmp 5,24,182-192,195, 446,458
382-383,446,458
_SETJMP 187,461
_SIGNAL 200,461
_Setjmp 187,461
rewind
signed integer
See type
signed-magnitude
See arithmetic
significance loss
See floating-point
SIGSEGV 195,198,200,202,204,454
SIGTERM 195,198,200,202,204,454
sin.c 151-152,458
sin 48,131,135-136,138,149,151-152,
178-179,279,409,458
_Sin 138,149-152,161,462
sinh.c 161,163,458
sinh 132,136,138,161,163-164,180,458
size
See code
SIZE_BLOCK 372
SIZE_CELL 372
size_t 11,116,124,216-219,223,233,270,
276-277,322-323,334,346,353-355,
371,388,394,398,407,416,422,
424-425,427,458
sizeof 11,116,119,219
_Sizet 222-223,276,354,398,424,450-451,
461-462
_Skip 101,104-105,115,120-122,124,462
SNOBOL 387,411
source
See file
_SP 37-39,42,122,461
space 12,26,28-31,35,46,101,109,113,
229,234,238-239,251,260,306,413,
472
trailing 229,234
See white-space
specification
See conversion
specifier
See conversion
sprintf.c 301-302,315,458
sprintf 5,91,93-94,244-245,258,273,277,
301-302,329-331,458
sqrt.c 157,159,458
sqrt 48,51-52,54,133,135-138,152,154,
157,159,171,180,458
333,337,344,350-351,355,359,383,
458
sscanf.c 319,458
sscanf 5,244,263,265,268,273,277,315,
319,330-331,458
stack 187-189,191-192,344,438,449,472
creep 191
frame 188,472
standard
See C Standard
See character set
See currency symbol
See floating-point
See header
See POSIX
See stream
See time
Standard C 472
See library
startup
See program
_Statab 99
state
shift 238,240,260,266,301,306,318,
srand
341-343,349-350,352,363,368,381,
384,408,419,438
See table
statement 472
static
See storage
status
successful 14,79,327,334,348,381
unsuccessful 22,193,201,334,339,348
<stdarg.h> 4,12,205-215,258-259,322,330,
371,446,448,459,461
_STDARG 211,461
<stddef.h> 4,11,91,116-117,175,215-224,
333,345,353,360,362,371,398,425,
446,454,457-459,461-462
_STDDEF 223,461
stderr 20-21,23,105,202,233,251-252,259,
270,276,298,332,458
stdin 233,242-244,246-247,251-252,
270-271,276,291,294,319,331-332,458
_Stoul 355,359-363,462
_STR 20-21,462
strcat.c 401-402,458
strcat 382,389,395-396,398,401-403,412,
458
<stdio.h> 40-42,44,49,54-55,70,78,87,91-92,
94,104,112,115,119,124-125,176,178,
180,190,202,204,209,212-213,
219-220,224-332,345,351,373,379,
382,395,399,406,412,420,442,446,
448-449,453-461
_STDIO 276,462
<stdlib.h> 4,6,18,21,23-24,49,77,82,
87-88,99,104-105,112,119-120,124,
194,198,201-202,204,215,220,260,
266-267,279-280,287,289,295,297,
301,303-304,310,312,318,320-321,
323,326,328,333-386,397,413,
430-435,438-439,446,453-459,461-462
_STDLIB 354,462
stdout 44,233,243,247,251-252,258,
270-271,276,227,300-302,330,332,
442,458
Steele, Guy L. 327
Sterbenz, Pat 72
Stevenson, David 55
_Stod 355,362-364,462
storage
alignment 348
allocated 89,99,114,116-117,119,220,
231,236,252,274,333,338-339,
344-345,348-349,351,371-373,377,
385,430,463,466-467
allocation 269
boundary 205,211,371-373,393,448-449
dynamic 182-185,187-188,251,344,358,
407,466
fragmentation 345,372-373
heap 333,345,371-372,381
hole 205,211-212,222,257,345,393,467
order 65,257
overlap 67,189,244-245,343,388-390,
394-397,400,419
reverting 183
static 24,36,46,52,77,196,292,344,
349-350,378,397,405-406,417,422,
427,434,445,449,463,472
storage allocation
See function
store 472
strchr.c 403,458
strchr 93,120,122,300,305,321,325-326,
391,395-396,398,403-405,412,432,
434,458
strcmp.c 401-402,458
strcmp 125,330-332,347-348,350,382-383,
389-390,395,397-398,401-402,407,
412,442,458
strcoll.c 410-411,458
strcoll 84,87,99,333,348,350,390,395,
397-398,407,410-412,458
strcpy.c 401-402,458
strcpy 88,93,97,103,105,120,125,243,
284,287,349,382-383,388,395,398,
402,406,412-413,435,458
strcspn.c 403,458
strcspn 104,388,391,395-396,398,403,
412,458
stream 231-232,234,452,469,471-472
append 237,246,275
binary 227,234,248-251,271,275,464
buffer 232,234-238,251-252,254,256,
269-270,273-275,285-286,288,
291-292,339
flush 234,236,256,339,346-347
input 240,256,271
output 236,238,240,270-271,339
read 237,241,253,264,275,282,315
seek 471
standard error 17-18,21-22,24,55,
114-115, 193,201,227,233,235,
251-252,269-270,272,278,292,395,449
standard input 227,233,235,252,
269-270,273,278,413,449
standard output 22,55,194,209,227,
233,235,252,259,269-270,272,278,
381,449
text 226-227,234,248-251,275,329,473
update 235-236,249
write 237,253,258,275
strerror.c 406,452,458,462
251,272, 292,393,395,398-399,
406,412,458
_Strerror 292,298,398-399,406,462
strftime.c 436,438,458
strftime 84,87,110-111,333,345,417,
419-424,436-438,442,458
_Strftime 427,436-439,462
string
concatenation 21
creation 21
literal 219,387,472
multibyte 87,99,238,240,266,301,318,
343,349-350,352,363,368,381,438
wide-character 99,219,343,350,352,
363,368
<string.h> 2,4,87-88,91-92,94,99,102,
104-105,115,120,122,125,188-189,
210,272,284,287,292-294,298-300,
302-304,308,310,312,316,320,324,
326,328,330,332-333,347-348,350,
357,360,369,375,377,380, 382,
387-414,432,434-435,439,446,454,
457-458,461-462
_STRING 398,462
strlen.c 403,458
strlen 2,10,93,97,103-105,115,120,125,
284,300,309,332,380,382,393,
395-396,398,403,412-413,434-435,
439,452,458
strncat.c 401,458
strncat 388-389,396,398,401,403,412,458
strncmp.c 401,458
strncmp 115,332,380,389-390,396,398,
401,412,458
strncpy.c 401-402,458
strncpy 389,396,398,401-402,412,458
strpbrk.c 403-404,458
strpbrk 388,391,395-396,398,403-405,412,
458
strrchr.c 404,458
strrchr 120,382,391,396,398,404,412,458
strspn.c 403-404,458
strerror
strtod.c 362-363,458
strtod 5,87,242,267,323,328-329,333-335,
347,351,355,362-363,383,386,413,
458
strtok.c 405,458
strtok 392-393,397-398,405-406,413,458
strtol.c 362-363,458
strtol 119,122,241,267,321,326,333-336,
347,351-352,355,362-363,383,430,
433,458
strtoul.c 361,363,458
strtoul 241,267-268,321,327,333,336,
352,355,359,361,363,383,458
structure
See type
strxfrm.c 407-408,458
strxfrm 84,87,99,390-391,395,397-398,
407-408,411,413,458
_Strxfrm 407-411,462
stub
See program
style 10,15,50, 114,129,143,201,221,345,
349
subscript
See arithmetic
Sun UNIX 54,212,449,472
suppression
See assignment suppression
suspension
See program
synchronization 46,193
synchronous
See signal
synonym
See type
syntax 472
system
call 283
service 47-48,51,55,73,82,199,285,373,
378,425,447,449,470,472
system. c 378,380,446,458
System/370 iv, 127-129,253,452,466,472
strspn 104,115,392,396,398,404-405,412,
458
system 333,340,352,355,378,380-382,386,
458
strstr.c 405,458
strstr 392,397-398,405,413,458
time (continued)
processor 416,420,422-423,425,447
tab
standard 82
horizontal 10,12,26,29,31,33,46,101,
zone 82,101,111,415-416,420,430,444,
226,229,234,413
465,473
vertical 26,29,31,33,229
time.c 425-426,447,459
table
<time.h> 4,87,100,110-111,333,345,350,
state 99,101,112-113,118-119,366,368,
415-444,446-448,453-459,461-462
407,472
time_t 416-420,422,424-425,427,429,434,
translation 27,34-35,99,112,119,123,
449,459
445-446,473
time 350,417,424-426,442-443,459
tan.c 151,153,458
_TIME 424,462
tan 130-131,137-138,151,153,161,179,458
tanh.c 164-165,459
_Times 100-102,106,117,124,426,430-431,
433,436-438,462
tanh 132,137-138,164-165,180,459
"TIMEZONE" 111,434
tassert.c 22-23
_Tinfo 100,110,426,437,462
_TBIAS 425-426,428,436,449-451,462
tlimits.c 78-79
tctype.c 42,44-45,126
tlocale.c 123,125
temporary
tm 416-420,422,424,427,434
See file
tmath1.c 171,176-177
termination
tmath2.c 171,178-179
See program
tmath3.c 173,180
terrno.c 54-55
TMP_MAX 233,236,269,273,276,325,
testing 13-15,22,42,55,69,79,123,171,
331-332,455
179,191,203,212,223,325,381,442
tmpfile.c 287,459
text
tmpfile 235,273,277,287,332,339,459
See file
tmpnam.c 284,446,459
See line
tmpnam 233,236,251,269,272-273,277-278,
See stream
284,287,329,331-332,459
tfloat.c 69-71
_TNAMAX 276,449-451,462
Thacher, Henry G. 177
token 12,77,392,397,413,472-473
Thompson, Ken 25,226
tolower.c 39,459
thousands separator 84-85,87,89,110,114, tolower 30,34-35,37,39,112,123,361,459
126,473
_Tolower 37,39-40,98,102,106,117,124,
thread
462
See control
toupper.c 37,39,459
See multithread
toupper 30,34-35,37,39,112,123,459
time
_Toupper 37,39,41,98,102,106,117,124,
broken-down 416-420,422-423,427,429,
462
434,437
trailing
calendar 416-420,422-425,427,449,465
See space
Daylight Savings 82,111,416,420,
translation
422-423,426-427,429-430,434,437,
See table
443,465
unit 1-2,53,181,186,468-469,473
See function
translation-time
local 82,415-419,423,430,465
See arithmetic
translator 1-2,52-53,473
truncation
See field
See file
See floating-point
tsetjmp.c 190-191
tsignal.c 203-204
tstdarg.c 212-214
tstddef.c 223-224
tstdio1.c 325,330-331
tstdio2.c 327,332
tstdlib.c 381-383
tstring.c 411-413
ttime.c 442
_Ttotm 427-430,433,436-437,462
Turbo C++ iv, xii, 54,187,211,451,473
two's-complement
See arithmetic
type 473
arithmetic 422,463
array 186,192,210,217,219,344,
347-348,463,472
assignment-compatible 221,463
character 34,240,242,261,267,345,389,
399-401,445,448
compatible 217,220,224
constant 198,217,404,465
conversion 221
converting 206,220,259,309,465
data-object 217,465
definition 1-2,4,8,11, 473
double 129
floating-point 57,128,179,239-242,257,
261,264,267,307,311,323,329,
334-335,348,351,363,422,445-446,
448,463,467
integer 74,135,194,219-220, 223,257,
307,334-335,345,359,422,448,463,
468,471,473
pointer 220,224,240,242,257,262,268,
310,323,348,470
representation 34-35,40,57,59,61-62,
64-65,67,72,74,77,79-80,129,137,
141,170-171,177, 205,215-220,257,
345,348,359,362,445-446,448-449,
464,471
U
UCHAR_MAX 40-42,44-45,74-76,78-79,107,
113,122,124,320,367,370,409,455
UINT_MAX 75-76,78-79,455
ULONG_MAX 76,78-79,337,352,361,455
ULTRIX iv, xii, 54,449,451,473
#undef 5-6,20,54
underflow
See floating-point
underscore 4,6,9-10,43,275,283
UNGET 318
ungetc.c 288,291,459
ungetc 27,248-249,254-255,264,273,277,
288,291,318-319,332,459
UNGETN 321
union
See type
UNIX iii-iv, 25-26,47-50,55,73,80,82,
194-195,199-200,203,226-232,
255-256,278,283,285-287,327,373,
378,415,425,434,447,449,452, 470,
472-474
Berkeley 212
See Sun
unsafe
See macro
unsigned integer
See arithmetic
See type
_UP 37-39,42,122,462
update
See stream
UniForum
See /usr/group
USHRT_MAX 74,76,78-79,455
/usr/group 73-74
UTC 82,111,415,418,423,425,430,434,
437,465,467,473
100,102,106-107,117,124,368,
370,407,462
wcstombs.c 368-369,459
wcstombs 99,333,343,345,352,355,
368-369,383,459
wctomb.c 368-369,459,462
wctomb 99,112,333,342-343,345-346,352,
355,368-369,383-385,459
_Wctomb 355,368-370,462
va_arg 206-213,244-245,251,305,308-309,
_Wcxtomb 355,368-369,462
324-328,459
weekday
va_end 5,206-213,244-245,259,301-302,
See name
318-319,330,459
WG14 3,82,474
va_list 12,207-212,259,296,314-315,
White, Jon L. 327
322-323,459
white-space
11-12,25-26,29,33,88,101,
See argument
113,116,240-242,251,264-268,318,
va_start 206-213,244-245,259,301-302,
321,335-336,351,359,363,474
318-319,330,459
wide
_VAL 20-21,462
See character
validation 13-14
See character set
<varargs.h> 205-206,212
474
width
variable 473
See field
See argument
Witzgall, Christoph 177
label 182,192
-wmx 306
VAX iv, 54,127-128,188,449,473-474
writable
474
See ULTRIX
write
vfprintf.c 301-302,459
See error
vfprintf 5,12,244,251,258-259,273,277,
See function
302,325,329-330,459
See stream
void
write 231,447
See type
volatile
X
See type
vprintf.c 301-302,459
X3J11 3,474
vprintf 5,12,245,251,258,273,277,302,
_XA 37-38,122,462
325,330,459
"xalloc.h" 371-372,374-377,459-460
vepfrintf 12
xasin.c 151,154-155,459
vsprintf.c 301,303,459
xassert.c 21,459
vsprintf 5,245,251,258,273,277,303,325,
xatan.c 156,158,459
329-330,459
_Xbig 139,161-164,175,462
xctype.c 41-42,445,459
_XD 37,39,42,122,462
xdefloc.c 101,105,459
Waite, William 177
xdint.c 141-142,459
wchar_t 216-217,219,223,334,345-346,
xdnorm.c 145,147,460
353-355,459,474
xdscale.c 145-147,460
_Wchart 222-223,354,450-451,462
xdtento.c 170,174-175,363,460
_Wcstate
xdtest.c 140,460
xdunscal.c 144-145,460
xexp.c 160-161,460
xstate.c 101,107,112,119,353,366,368,
407,459,461-462
99-100,113,118,124,367,370,
xfgpos.c 285,446,460
407,459,461-462
xfiles.c 278-279,460
"xstdio.h" 275,279-281,283-304,306,308,
xfloat .c 65,67-69,72,139,445,448,459-461
310-312,315-316,318-324,326,328,
xfmtval.c 90,92-93,460
460-462
xfopen.c 284-285,446,460
xstod.c 363-365,462
xfoprep.c 278,281,460
xstoul.c 359-361,462
xfreeloc.c 116,118,460
xstrftim.c 438-439,462
xfrprep.c 291,295,460
xstrxfrm.c 407,409,462
xfspos.c 286-287,446,460
"xstrxfrm.h" 407-410,462
xfwprep.c 291,297,460
"xtime.h" 426-428,431-437,439-440,460,
xgenld.c 314,316-317,460
462
xgentime.c 438,440-441,460
"xtinfo.h" 462
xgetdst.c 426,430,432-433,460
"xtinfo.h" 100,124,426-427
xgetfld.c 321,324-325,460
xtolower.c 37,40,445,462
xgetfloa.c 323,328,460
xtoupper.c 37,41,445,462
xgetint.c 321,326-327,460
xttotm.c 427-429,443,459,462
xgetloc.c 101,104-105,460,462
xvalues.c 139,460-462
xgetmem.c 373,375,446,460
xwctomb.c 368,370,462
xgettime.c 434,460
xgetzone.c 434-435,460
xisdst.c 430-431,460
xldtob.c 311-313,461
xldunsca.c 171-173,461
xlitob.c 309-310,461
"xlocale.h"
98-100,102,104-107,115-120,
122-124,459-462
xloctab.c 116-117,461
xlocterm.c 119,122,461
xlog.c 164,166-167,461
xmakeloc.c 119-121,461
"xmath.h"
139-144,146-151,153-163,
165-166,168,170-172,174-175,179,
zero divide 128,193,195,198,472
310-312,363-364,459-462
zero fixup
xmbtowc.c 366-367,461
See floating-point
xpoly.c 151,461
zone
xprintf.c 301,304-305,461
See time
xputfld.c 307-309,461
xreadloc.c 115,461
_XS 37,39,122,462
xscanf.c 315,320-321,462
xsetloc.c 101,106,462
xsin.c 149-151,462
"xstate.h"