Notes on Programming in C
Rob Pike
Introduction
Kernighan and Plauger’s The Elements of Programming Style was an important and rightly
influential book. But sometimes I feel its concise rules were taken as a cookbook approach to good style
instead of the succinct expression of a philosophy they were meant to be. If the book claims that variable
names should be chosen meaningfully, doesn’t it then follow that variables whose names are small essays
on their use are even better? Isn’t MaximumValueUntilOverflow a better name than maxval? I
don’t think so.
What follows is a set of short essays that collectively encourage a philosophy of clarity in program-
ming rather than giving hard rules. I don’t expect you to agree with all of them, because they are opinion
and opinions change with the times. But they’ve been accumulating in my head, if not on paper until now,
for a long time, and are based on a lot of experience, so I hope they help you understand how to plan the
details of a program. (I’ve yet to see a good essay on how to plan the whole thing, but then that’s partly
what this course is about.) If you find them idiosyncratic, fine; if you disagree with them, fine; but if they
make you think about why you disagree, that’s better. Under no circumstances should you program the
way I say to because I say to; program the way you think expresses best what you’re trying to accomplish
in the program. And do so consistently and ruthlessly.
Your comments are welcome.
Issues of typography
A program is a sort of publication. It’s meant to be read by the programmer, another programmer
(perhaps yourself a few days, weeks or years later), and lastly a machine. The machine doesn’t care how
pretty the program is — if the program compiles, the machine’s happy — but people do, and they should.
Sometimes they care too much: pretty printers mechanically produce pretty output that accentuates
irrelevant detail in the program, which is as sensible as putting all the prepositions in English text in bold
font. Although many people think programs should look like the Algol-68 report (and some systems even
require you to edit programs in that style), a clear program is not made any clearer by such presentation,
and a bad program is only made laughable.
Typographic conventions consistently held are important to clear presentation, of course — indenta-
tion is probably the best known and most useful example — but when the ink obscures the intent, typogra-
phy has taken over. So even if you stick with plain old typewriter-like output, be conscious of typographic
silliness. Avoid decoration; for instance, keep comments brief and banner-free. Say what you want to say
in the program, neatly and consistently. Then move on.
Variable names
Ah, variable names. Length is not a virtue in a name; clarity of expression is. A global variable
rarely used may deserve a long name, maxphysaddr say. An array index used on every line of a loop
needn’t be named any more elaborately than i. Saying index or elementnumber is more to type (or
calls upon your text editor) and obscures the details of the computation. When the variable names are
huge, it’s harder to see what’s going on. This is partly a typographic issue; consider
for(i = 0; i < 100; i++)
        array[i] = 0;
vs.
for(elementnumber = 0; elementnumber < 100; elementnumber++)
        array[elementnumber] = 0;
The problem gets worse fast with real examples. Indices are just notation, so treat them as such.
Pointers also require sensible notation. np is just as mnemonic as nodepointer if you con-
sistently use a naming convention from which np means ‘‘node pointer’’ is easily derived. More on this in
the next essay.
As in all other aspects of readable programming, consistency is important in naming. If you call one
variable maxphysaddr, don’t call its cousin lowestaddress.
Finally, I prefer minimum-length but maximum-information names, and then let the context fill in the
rest. Globals, for instance, typically have little context when they are used, so their names need to be rela-
tively evocative. Thus I say maxphysaddr (not MaximumPhysicalAddress) for a global variable,
but np not NodePointer for a pointer locally defined and used. This is largely a matter of taste, but
taste is relevant to clarity.
I eschew embedded capital letters in names; to my prose-oriented eyes, they are too awkward to read
comfortably. They jangle like bad typography.
The use of pointers
Compare stepping to the next element of an array of structures through an explicit index,
node[++i].type
with doing the same through a pointer,
(++lp)->type.
i advances but the rest of the expression must stay constant; with pointers, there’s only one thing to
advance.
Typographic considerations enter here, too. Stepping through structures using pointers can be much
easier to read than with expressions: less ink is needed and less effort is expended by the compiler and
computer. A related issue is that the type of the pointer affects how it can be used correctly, which allows
some helpful compile-time error checking that array indices cannot share. Also, if the objects are struc-
tures, their tag fields are reminders of their type, so
np->left
is sufficiently evocative; if an array is being indexed the array will have some well-chosen name and the
expression will end up longer:
node[i].left.
Again, the extra characters become more irritating as the examples become larger.
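For a concrete sketch of stepping through such an array both ways: the fields follow the examples here, while NNODE, the zero tests and the function itself are invented.

#define NNODE 100       /* invented size */

struct node {
        int     type;
        struct node *left, *right;
};

struct node node[NNODE];

void
clearleft(void)
{
        int i;
        struct node *np;

        /* indexed form: i advances, but node[...] must be repeated on every line */
        for(i = 0; i < NNODE; i++)
                if(node[i].type == 0)
                        node[i].left = 0;

        /* pointer form: only np advances */
        for(np = node; np < node+NNODE; np++)
                if(np->type == 0)
                        np->left = 0;
}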
As a rule, if you find code containing many similar, complex expressions that evaluate to elements of
a data structure, judicious use of pointers can clear things up. Consider what
if(goleft)
        p->left = p->right->left;
else
        p->right = p->left->right;
would look like using a compound expression for p. Sometimes it’s worth a temporary variable (here p) or
a macro to distill the calculation.
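For a feel of the difference, here is a sketch in which p would otherwise be a longer expression; the surrounding names (entry, tab, h, sym, rotate) are invented.

struct node { struct node *left, *right; };
struct entry { struct node *sym; };

void
rotate(struct entry tab[], int h, int goleft)
{
        struct node *p;

        /* without the temporary, the lookup is repeated on every line:
                if(goleft)
                        tab[h].sym->left = tab[h].sym->right->left;
                else
                        tab[h].sym->right = tab[h].sym->left->right;
        */

        /* with it, the calculation is distilled once */
        p = tab[h].sym;
        if(goleft)
                p->left = p->right->left;
        else
                p->right = p->left->right;
}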
Procedure names
Procedure names should reflect what they do; function names should reflect what they return. Func-
tions are used in expressions, often in things like if’s, so they need to read appropriately.
if(checksize(x))
is unhelpful because we can’t deduce whether checksize returns true on error or non-error; instead
if(validsize(x))
makes the point clear and makes a future mistake in using the routine less likely.
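A sketch of the named-for-what-it-returns style; the limit and the test here are made up:

#define MAXSIZE 4096    /* made-up limit */

/* validsize: non-zero when x is an acceptable size; the name says which way the test reads */
int
validsize(long x)
{
        return x > 0 && x <= MAXSIZE;
}

At the call site, if(!validsize(x)) then reads the right way around without consulting the definition.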
Comments
A delicate matter, requiring taste and judgement. I tend to err on the side of eliminating comments,
for several reasons. First, if the code is clear, and uses good type names and variable names, it should
explain itself. Second, comments aren’t checked by the compiler, so there is no guarantee they’re right,
especially after the code is modified. A misleading comment can be very confusing. Third, the issue of
typography: comments clutter code.
But I do comment sometimes. Almost exclusively, I use them as an introduction to what follows.
Examples: explaining the use of global variables and types (the one thing I always comment in large pro-
grams); as an introduction to an unusual or critical procedure; or to mark off sections of a large computa-
tion.
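For instance, an introductory comment on a global might read like this; the program it belongs to is imaginary, and Sym, NHASH and the routines named are invented:

#define NHASH 128
typedef struct Sym Sym;

/*
 * symtab: one entry per identifier seen so far.
 * install() adds entries, lookup() finds them;
 * entries are never removed before exit.
 */
Sym     *symtab[NHASH];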
There is a famously bad comment style:
i=i+1; /* Add one to i */
and there are worse ways to do it:
/**********************************
 *                                *
 *          Add one to i          *
 *                                *
 *********************************/
i=i+1;
Don’t laugh now, wait until you see it in real life.
Avoid cute typography in comments, avoid big blocks of comments except perhaps before vital sec-
tions like the declaration of the central data structure (comments on data are usually much more helpful
than on algorithms); basically, avoid comments. If your code needs a comment to be understood, it would
be better to rewrite it so it’s easier to understand. Which brings us to
Complexity
Most programs are too complicated — that is, more complex than they need to be to solve their prob-
lems efficiently. Why? Mostly it’s because of bad design, but I will skip that issue here because it’s a big
one. But programs are often complicated at the microscopic level, and that is something I can address here.
Rule 1. You can’t tell where a program is going to spend its time. Bottlenecks occur in surprising
places, so don’t try to second guess and put in a speed hack until you’ve proven that’s where the bottleneck
is.
Rule 2. Measure. Don’t tune for speed until you’ve measured, and even then don’t unless one part
of the code overwhelms the rest.
Rule 3. Fancy algorithms are slow when n is small, and n is usually small. Fancy algorithms have
big constants. Until you know that n is frequently going to be big, don’t get fancy. (Even if n does get
big, use Rule 2 first.) For example, binary trees are always faster than splay trees for workaday problems.
Rule 4. Fancy algorithms are buggier than simple ones, and they’re much harder to implement. Use
simple algorithms as well as simple data structures.
The following data structures are a complete list for almost all practical programs:
array
linked list
hash table
binary tree
Of course, you must also be prepared to collect these into compound data structures. For instance, a sym-
bol table might be implemented as a hash table containing linked lists of arrays of characters.
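A sketch of that compound structure in C; the names and sizes are invented:

#define NHASH   128     /* invented: number of hash buckets */
#define NAMELEN 32      /* invented: longest name kept */

typedef struct Sym Sym;
struct Sym {
        char    name[NAMELEN];  /* the array of characters */
        Sym     *next;          /* the linked list within one bucket */
};

Sym     *hash[NHASH];           /* the hash table itself */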
Rule 5. Data dominates. If you’ve chosen the right data structures and organized things well, the
algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.
(See Brooks p. 102.)
Rule 6. There is no Rule 6.
Programming with data
Algorithms, or details of algorithms, can often be encoded compactly, efficiently and expressively as data
rather than, say, as lots of if statements. For instance, if the operating system is driven by a set of tables
that connect I/O requests to the appropriate device drivers, the system may be ‘configured’ by a program
that reads a description of the particular devices connected to the machine in question and prints the
corresponding tables.
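A sketch of what such a table might look like in C; the devices and driver routines are invented:

int consopen(int), consread(int, char*, int), conswrite(int, char*, int);
int dkopen(int), dkread(int, char*, int), dkwrite(int, char*, int);

/*
 * One row per device; an I/O request is dispatched through the pointers in its row.
 * A configuration program could print this initializer from a description of the machine.
 */
struct devsw {
        char    *name;
        int     (*open)(int);
        int     (*read)(int, char*, int);
        int     (*write)(int, char*, int);
} devsw[] = {
        { "console",    consopen,       consread,       conswrite },
        { "disk",       dkopen,         dkread,         dkwrite },
};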
One of the reasons data-driven programs are not common, at least among beginners, is the tyranny of
Pascal. Pascal, like its creator, believes firmly in the separation of code and data. It therefore (at least in its
original form) has no ability to create initialized data. This flies in the face of the theories of Turing and
von Neumann, which define the basic principles of the stored-program computer. Code and data are the
same, or at least they can be. How else can you explain how a compiler works? (Functional languages
have a similar problem with I/O.)
Function pointers
Another result of the tyranny of Pascal is that beginners don’t use function pointers. (You can’t have
function-valued variables in Pascal.) Using function pointers to encode complexity has some interesting
properties.
Some of the complexity is passed to the routine pointed to. The routine must obey some standard
protocol — it’s one of a set of routines invoked identically — but beyond that, what it does is its business
alone. The complexity is distributed.
There is this idea of a protocol, in that all functions used similarly must behave similarly. This
makes for easy documentation, testing, growth and even making the program run distributed over a net-
work — the protocol can be encoded as remote procedure calls.
I argue that clear use of function pointers is the heart of object-oriented programming. Given a set of
operations you want to perform on data, and a set of data types you want to respond to those operations, the
easiest way to put the program together is with a group of function pointers for each type. This, in a nut-
shell, defines class and method. The O-O languages give you more of course — prettier syntax, derived
types and so on — but conceptually they provide little extra.
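A sketch of that organization; Obj, Ops and the two operations are invented names:

typedef struct Obj Obj;
typedef struct Ops Ops;

/* one Ops per data type: the 'class'; each member is a 'method' */
struct Ops {
        void    (*print)(Obj*);
        void    (*free)(Obj*);
};

/* every Obj carries a pointer to the Ops for its type */
struct Obj {
        Ops     *ops;
        void    *data;
};

/* all callers invoke an operation the same way; that sameness is the protocol */
void
objprint(Obj *o)
{
        o->ops->print(o);
}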
Combining data-driven programs with function pointers leads to an astonishingly expressive way of
working, a way that, in my experience, has often led to pleasant surprises. Even without a special O-O
language, you can get 90% of the benefit for no extra work and be more in control of the result. I cannot
recommend an implementation style more highly. All the programs I have organized this way have sur-
vived comfortably after much development — far better than with less disciplined approaches. Maybe
that’s it: the discipline it forces pays off handsomely in the long run.
Include files
Simple rule: include files should never include include files. If instead they state (in comments or
implicitly) what files they need to have included first, the problem of deciding which files to include is
pushed to the user (programmer) but in a way that’s easy to handle and that, by construction, avoids multi-
ple inclusions. Multiple inclusions are a bane of systems programming. It’s not rare to have files included
five or more times to compile a single C source file. The Unix /usr/include/sys stuff is terrible this
way.
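As a sketch with invented file names, an include file can state its prerequisites in a comment and leave the ordering to the source file that uses it:

/* tree.h: requires <stdio.h> (for FILE) to be included first */
typedef struct Node Node;
struct Node {
        int     type;
        Node    *left, *right;
};
void    treeprint(FILE*, Node*);

and the C source file then includes, once each and in order,

#include <stdio.h>
#include "tree.h"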
There’s a little dance involving #ifdef’s that can prevent a file being read twice, but it’s usually
done wrong in practice — the #ifdef’s are in the file itself, not the file that includes it. The result is
often thousands of needless lines of code passing through the lexical analyzer, which is (in good compilers)
the most expensive phase.
Just follow the simple rule.