[PDF][PDF] Literate programming

CJ Van Wyk - Communications of the ACM, 1987 - dl.acm.org
CJ Van Wyk
Communications of the ACM, 1987dl.acm.org
Moderator's Introduction to Column 2 Judging from the amount of mail I received about
Column 1, the readership of “Literate Programming” is gratifyingly large. Part of the reason
for the large mail volume, however, just might be that the program in Column 1 contains a
serious error. The first letter I received about the column came from Michael Shook:“The sort
in the routine print words requires a bucket for each possible frequency of occurrence, not
for each different word. The error would be exercised by input of exactly two identical words.” …
Moderator’s Introduction to Column 2 Judging from the amount of mail I received about Column 1, the readership of “Literate Programming” is gratifyingly large. Part of the reason for the large mail volume, however, just might be that the program in Column 1 contains a serious error. The first letter I received about the column came from Michael Shook:“The sort in the routine print words requires a bucket for each possible frequency of occurrence, not for each different word. The error would be exercised by input of exactly two identical words.” I asked David Hanson to run his program on the input" hello hello"; he quickly reported back that his program had dumped core. In his review of Hanson’s program, John Gilbert had said,“‘total’violates the rule that a global variable should have the most descriptive name possible.” This is a strong hint to why this error was hard to catch: Although Section 6 of the program points out that" total... counts the number of distinct words in the table,” the name tota 1 is misleading enough that one could forget this in Section 7 and imagine that total is the total number of words in the input. Many others wrote to point out that Hanson’s program uses an inefficient hash function. Hanson measured the performance of his program on the sample input, and found that it left 3000 hash slots empty and three hash slots contained more than 100 items. Hanson experimented with Hans Boehm and Joe Warren to devise a better hash function; on the same sample input, it leaves only 1500 hash slots empty, and the longest hash chain it constructs has only seven items. Readers interested in more details can write directly to Hanson at the Department of Computer Science, Princeton University, Princeton, NJ 08544. There was a very tiny overlap between correspondents who noticed the bug and those who noticed the poor performance of the hash function. How much easier it is to read a program attentively to assess its efficiency OY its correctness than to read a program with an eye to both, not to mention its literary qualities. In reply to last year’s “Programming Pearls” about
ACM Digital Library