Lua Programming Gems
edited by
Luiz Henrique de Figueiredo
Waldemar Celes
Roberto Ierusalimschy
Lua.org
Rio de Janeiro
2008
Lua Programming Gems
edited by Luiz Henrique de Figueiredo, Waldemar Celes, Roberto Ierusalimschy.
ISBN 978-85-903798-4-3.
Copyright © 2008 by the editors and individual contributors. All rights reserved.
Book cover by Pedro de Mazza Cerqueira. Lua logo design by Alexandre Nako.
Typesetting by the editors using LaTeX.
Although the editors and the authors have used their best efforts in preparing
this book, they assume no responsibility for errors or omissions, or for any
damage that may result from the use of the information presented here. All product
names mentioned in this book are trademarks of their respective owners.
Contents
Preface
Foreword, by Cameron Laird
Lua and Lightroom, by Mark Hamburg
Contributors
I Programming Techniques
1 Lua Per-Thread Library Context
Doug Currie
2 Lua Performance Tips
Roberto Ierusalimschy
3 Vardump: The Power of Seeing What’s Behind
Tobias Sülzenbrück and Christoph Beckmann
4 Serialization with Pluto
Ben Sunshine-Hill
5 Abstractions for LuaSQL
Tomás Guisasola Gorham
6 Bootstrapping a Forth in 40 Lines of Lua Code
Eduardo Ochs
7 Effecting Large-Scale Change (with little trauma) using Metatables
Sérgio Alvares Maffra and Pedro Miller Rabinovitch
II Design Techniques
8 MVC Web Development with Kepler
André Carregal and Yuri Takhteyev
9 Filters, Sources, Sinks, and Pumps
Diego Nehab
10 Lua as a Protocol Language
Patrick Rapin
IV Game Programming
20 Using Lua in Game and Tool Creation
Konstantin Sokharev and Vadim Groznov
21 A Dynamic and Flexible Event System for Script-Driven Games
Robert Oates
22 Lua for Game Programming
Steve Gargolinski
23 Designing an Efficient Lua Driven Game Scripting Engine
Nicolas Peri
Preface
It gives us great pleasure to publish this collection of Lua gems. Not only does
it record some of the existing wisdom and practice on how to program well in
Lua, but it also reflects the maturity of the Lua community. It is gratifying
to see that Lua has motivated other people to learn it well and to share their
knowledge with other users. In well-written articles that go much beyond the
brief informal exchange of tips in the mailing list or the wiki, the authors share
their mastery of all aspects of Lua programming, elementary and advanced.
Producing this book has required several steps. In response to a call for con-
tributions, we received over 70 abstracts, selected 43, and received full versions
for 28 of these. The authors received our comments and suggestions to prepare
the final version of their articles. The whole process took two years, much longer
than we had imagined. The selection of abstracts proved to be surprisingly dif-
ficult. Many potentially good submissions could not be accepted due to space
limitations. Despite the long time it took and the amount of work it required
(or because of it!), we are very happy to have this collection of articles on Lua
contributed by members of our community. We trust the book was worth waiting
for.
We thank all the authors for their hard work on the articles and everyone
who submitted abstracts in the first phase. We also thank the whole Lua
community for its friendliness and expertise. The active participation of our
users has been to us a constant source of motivation for improving Lua. Finally,
we give our warm thanks to Cameron Laird and Mark Hamburg for writing
forewords to this book.
Additional material and errata will appear on the book's web site:
http://www.lua.org/gems/
Foreword
by Cameron Laird
Do you want to “program well in Lua”? The Lua team set that as a goal when
it first announced its plans for Lua Programming Gems. The final result fulfills
that goal; you’ll like it.
Lua and Lightroom
Mark Hamburg
Founder, Adobe Photoshop Lightroom
When we started work on the project that would become Adobe Photoshop
Lightroom, we knew we wanted to make scriptability an important part of our
story, so early on we reviewed the usual suspects. What drew us to Lua was
its combination of simplicity, power, ease of embedding, and relatively high
performance. Having a straightforward license helped too when it came time
to talk to Adobe’s lawyers. Personally, as an old Scheme fan, I was drawn to
its first-class closure support. I also found the coroutine system intriguing. The
relative minimalism also resonated with a back-to-basics attitude that had us
weaning ourselves away from intensive C++ usage and back toward C.
Still, it was hard to position Lua as anything other than an obscure choice.
We could cite heavy use in the games community and we had set out with a
mission of learning something from game developers, but if asked what materials
one could turn to in order to learn Lua or where we would find experienced Lua
programmers, the answers were limited. For the former, we had the well-written
reference manual, some good material on the Lua users wiki, and an intelligent
forum on the Lua mailing list. This was good material, but there wasn’t a lot of
it. For the latter question, our answer was essentially “Any programmer worth
hiring ought to be able to learn Lua quickly.” This was a situation we were
prepared to deal with and the arrival of Programming in Lua certainly helped, but
it was easy to understand why it might be off-putting to someone looking in from
the outside.
Why this matters is that along with Lua’s simplicity come some issues that
make people with backgrounds in other languages stumble. The beauty of a
small core is that there is a real opportunity for mastery. This is one of C’s great
strengths as well. That small core, however, comes at a price. For example, Lua
has no syntax for exception handling. C doesn’t either but having one seems
almost required in modern languages. Lua has a syntax for object-oriented
Contributors
André Carregal was introduced to Lua in 1994 during his MSc in Computer
Science, which was supervised by Roberto Ierusalimschy. He has been working
with web development using Lua since 1996. He currently coordinates the
Kepler project and the LuaForge site while working as a consultant for Lua-
related projects.
Diego Nehab was introduced to Lua in 1996, while working for Tecgraf in
PUC-Rio. Over the years, he has been involved in a variety of Lua-related
projects, including the IupLua, CDLua, IMLua, and LuaSQL libraries. He is
best known as the author of the LuaThreads and LuaSocket libraries. Diego
received a BEng in Computer Engineering and an MSc in Programming Lan-
guages from PUC-Rio, under the supervision of Roberto Ierusalimschy. He later
received an MSc and a PhD in Computer Graphics from Princeton University.
His research now focuses on high-quality shape acquisition and on real-time
rendering techniques.
special interest in little languages, Doug has also contributed technically to open
source projects such as Moscow ML, Hibernate, Gambit Scheme, and SICStus
Prolog. Doug holds an S.B. degree in Electrical Engineering and Computer
Science from the Massachusetts Institute of Technology.
Nicolas Peri is co-founder and technical director of the French company Stone-
Trip, creator of the 3D game development platform ShiVa. He is in charge,
among other things, of the ShiVa scripting engine, which is based on Lua. Before
that, he worked as engine developer for other gaming companies, including
Kalisto Entertainment and UbiSoft Tiwak.
Ralph Steggink joined Océ in 2001. With a degree in both chemistry and com-
puter science, he now develops controller software for printers. Together with
Wim Couwenberg he prototyped revolutionary concepts using Lua. These cur-
rently find their way into several Océ products. He is an enthusiastic volleyball
player and trainer.
Sérgio Alvares Maffra is an MSc and Computer Engineer from PUC-Rio. He’s
been working with Lua at Tecgraf as a software developer for over a decade now.
Steve Gargolinski spent his early programming days hacking together small
games built with code snippets from a QuickBasic programming manual. He
has since evolved into a professional game developer, working as a member of
the technical teams that produced the Zoo Tycoon 2 series, Star Trek: Legacy,
and the upcoming Empire Earth III. Steve is currently working for Blue Fang
Games as an AI Programmer. His interests include baseball, abstract strategy,
practical AI, and walking in the woods.
Tomás Guisasola has worked with Lua since 1995, when he developed with Roberto
Ierusalimschy (his MSc advisor) the first implementation of the hooks mechanism
and the debug facilities. Since then he has worked mainly with CGILua as
the platform for some administrative systems at PUC-Rio and also contributed
Programming Techniques
1 Lua Per-Thread Library Context
Doug Currie
Libraries written in C for use with Lua sometimes have a context that can be
modified by the Lua program. For example, in the decNumber library, the Lua
program may select the rounding mode and precision for arithmetic operations.
The decNumber library user expects the context to be applied during library
operations, and remain fixed until explicitly changed.
There are many other examples of library context. Libraries may need to
maintain a per-thread global variable, like the POSIX library’s errno. The
C standard libraries have a current input file and current output file that are
implied for many operations.
It would be wrong for a context setting in one Lua thread to affect the setting
in another Lua thread. The other thread would get an unexpected rounding
error, or an unexpected errno value, for example. Each thread should have
its own context so that the library functions it uses operate the same way
independent of the activities of other threads.
Lua does not provide a per-thread variables mechanism directly, though
there are many ways to create this effect. The solution presented in this gem is
to use the mechanism provided by LUA_ENVIRONINDEX. All functions in the library
share a common closure. In this closure is a table used to map the thread’s
identity, i.e., L, to a context. Since only the functions in the library have access to
the common closure, there is no chance of interference from other libraries. The
mechanism is fast, and can be made even faster with caching.
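As a rough Lua-level analogue of this mechanism (the gem itself works in C through LUA_ENVIRONINDEX), the sketch below keeps a private table that maps each coroutine to its own context. The names contexts, new_context, and get_context, and the default field values, are assumptions made for illustration; they are not part of the decNumber binding.

```lua
-- Pure-Lua analogue of a per-thread library context (illustrative only).
-- Weak keys let an entry vanish once its coroutine has been collected.
local contexts = setmetatable({}, { __mode = "k" })

local function new_context()
  return { precision = 9, rounding = "half_even" }  -- hypothetical defaults
end

local function get_context()
  -- Lua 5.1 returns nil for the main thread, so fall back to a fixed key
  local key = coroutine.running() or "main"
  local ctx = contexts[key]
  if ctx == nil then
    ctx = new_context()
    contexts[key] = ctx
  end
  return ctx
end

-- Changing the context in the main thread...
get_context().precision = 34

-- ...does not affect the context seen inside a coroutine, and vice versa.
local co = coroutine.create(function ()
  local ctx = get_context()
  ctx.rounding = "down"
  return ctx.precision, ctx.rounding
end)
local ok, p, r = coroutine.resume(co)
print(ok, p, r)                      -- the coroutine saw the default precision
print(get_context().rounding)        -- the main thread keeps its own rounding
```

Only code that can see the local variable contexts can reach the table, which mirrors the point that other libraries cannot interfere with it.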
This gem presents the solution in a straightforward implementation, and
adds userdata context functions, caching, and performance measurement in
incremental steps.
((rate/100+1)^years)*start
but, if every decNumber function had to have the context supplied, it might look
like this:
decNumber.multiply(CONTEXT,
decNumber.expt(CONTEXT,
decNumber.add(CONTEXT, 1,
decNumber.divide(CONTEXT, rate, 100)),
years),
start)
I suspect you’d prefer the Lua operators to the API with context.
Implementation alternatives
So, how do we implement a per-thread library context?
The Lua 5.1 Reference Manual describes a “thread environment” that is ac-
cessible using LUA_GLOBALSINDEX. But all threads share the same global table as
their thread environment by default. This prevents the thread environment’s
to retrieve it.
• Lua offers no guarantee that the thread will not be relocated by the garbage
collector; if it is relocated, the value L will change from one library invoca-
tion to another, and our key is useless to identify the thread private data;
• Our private storage table would never be garbage collected since Lua
cannot determine if our light userdata is garbage, so the table entries will
remain in the table until we explicitly remove them.
First we will develop a caching approach, and then explore its benefits,
measuring its performance advantage, and discuss some potential drawbacks.
The caching technique is quite straightforward: we simply record the thread
(L) and context on each lookup in the library private storage. Each reference to
the thread context first checks if the reference is from the last thread to perform
a lookup, and if so returns the cached value, avoiding the second and subsequent
lookups.
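The caching idea can be sketched in plain Lua as a one-entry cache placed in front of the per-thread table. This is an illustration under stated assumptions, not the library's C code: the names last_thread, last_context, and lookups, and the default context, are hypothetical.

```lua
-- Illustrative one-entry cache in front of a per-thread context table.
local contexts = setmetatable({}, { __mode = "k" })  -- thread -> context
local last_thread, last_context                       -- the cache
local lookups = 0                                     -- counts real table lookups

local function get_context()
  local key = coroutine.running() or "main"  -- Lua 5.1 yields nil in the main thread
  if key == last_thread then
    return last_context                      -- cache hit: no table lookup
  end
  lookups = lookups + 1
  local ctx = contexts[key]
  if ctx == nil then
    ctx = { precision = 9 }                  -- hypothetical default context
    contexts[key] = ctx
  end
  last_thread, last_context = key, ctx       -- remember for the next call
  return ctx
end

get_context(); get_context(); get_context()  -- one miss, two hits
local co = coroutine.wrap(function ()
  get_context(); get_context()               -- one miss, one hit
end)
co()
get_context()                                -- miss again: the thread changed
print(lookups)  --> 3
```

In this Lua version the cache variables are ordinary strong references, so they cannot dangle the way raw C pointers in static variables can; the price is that the most recently used thread and context stay alive while they are cached.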
This implementation depends on threads and userdata not being moved by
the garbage collector. Fortunately, this is true for present Lua implementations,
and those in the foreseeable future since the Lua authors “have no intention
of allowing userdata addresses to change during GC” (http://lua-users.org/
lists/lua-l/2006-04/msg00384.html).
Lua doesn’t provide a way to push a userdata from the C library; the pointer
returned by lua_touserdata or luaL_checkudata is incremented past the Lua
userdata header. The userdata header structure is opaque to the C library. Most
of the time the library doesn’t care about this; it just wants a pointer to the
context and doesn’t need to put it on the Lua stack. In these cases the cache
works fine. In other rare cases, the C library must push the context to pass
it as an argument to Lua code or return it to the Lua library caller. In these
situations the cache is simply bypassed.
Another optimization is that ldn_set_context now takes the decContext as
an argument. This avoids having to convert the context value on the stack to a
userdata pointer for storage into the cache. The caller generally has this pointer
in hand at the time of the call.
Our C code uses preprocessor macros to enable and disable caching so we
can build with either implementation. Here are the updated functions:
#if LDN_ENABLE_CACHE
static lua_State *L_of_context_cache;
static decContext *context_cache;
#endif
/*
 * either we need a decContext on the Lua stack, so we must bypass the
 * cache, or we have a cache miss
 */
static decContext *ldn_push_context (lua_State *L)
{
    decContext *dc;
    lua_pushthread (L); /* key */
    lua_rawget (L, LUA_ENVIRONINDEX);
    else
    {
        dc = ldn_check_context (L, -1);
#if LDN_ENABLE_CACHE
        /* and cache */
        L_of_context_cache = L;
        context_cache = dc;
#endif
    }
    return dc; /* leaves context on Lua stack */
}
yet, it is not the context of the new thread, which presumably should be a newly
initialized context. So, for example, thread Y could use a rounding mode and
precision set by thread X rather than the defaults, or could use file handles
thread X established for I/O rather than the standard I/O handles.
The root cause of the problem is that we are holding a reference to the thread
and userdata context that is not reachable from Lua’s root set. Of course, if it
were reachable, we’d have the same memory leak that we fixed using a weak
table for the per-thread library context.
Can this failure mode be eliminated?
Adding a __gc method to the context userdata that invalidates the per-thread
library context cache is probably a good place to start; that will prevent access
to freed memory. This is not a general solution to the problem, though, since the
context may not have been collected yet. The context may not be garbage if it can
be shared by multiple threads (in decNumber it can be shared). Furthermore,
even if the context is not shared, it is not guaranteed to be collected on the
same collection cycle as the thread, so there is still a potential problem with
using the wrong context, that of the freed thread (thread X’s context rather than
thread Y’s).
Unfortunately, Lua does not have a gc hook that is called after every collec-
tion; otherwise a hook function could simply invalidate the per-thread library
context cache after every collection. One can emulate this hook by allocating a
sacrificial userdata with a __gc method that invalidates the cache, and
immediately popping it from the stack. Its __gc method would also create another
identical userdata so that the cache is flushed every collection cycle.
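The re-arming sentinel can be tried out from Lua as well. The sketch below is an illustration, not the library's C code: on_every_gc is a hypothetical helper, the callback stands in for "invalidate the cache", and the code assumes either the undocumented newproxy function of Lua 5.1 or table finalizers as available from Lua 5.2 on. Like the trick itself, it relies on collector behavior that the language does not specify.

```lua
-- Re-arming "sentinel" whose finalizer runs a callback at each collection.
local function on_every_gc(callback)
  local function arm()
    local function finalizer()
      callback()
      arm()              -- create the next sentinel for the following cycle
    end
    if newproxy then     -- Lua 5.1: zero-size userdata with its own metatable
      local sentinel = newproxy(true)
      getmetatable(sentinel).__gc = finalizer
    else                 -- Lua 5.2 and later: tables may have finalizers
      setmetatable({}, { __gc = finalizer })
    end
    -- once arm returns, nothing references the sentinel: it is already garbage
  end
  arm()
end

local flushes = 0
on_every_gc(function () flushes = flushes + 1 end)  -- stand-in for a cache flush
collectgarbage()
collectgarbage()
print(flushes)
```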
This is a bit tricky, and dependent on non-specified behavior of the garbage
collector. The Lua garbage collector offers no guarantee that either (a) garbage
is collected in any particular order relative to becoming untraceable, or (b) all
garbage is collected on every cycle. There are garbage collectors that collect in
some arbitrary order, e.g., memory address order, and/or only a portion of the
free memory on each collection cycle. The Caml Light GC works that way, for
example. So, if the thread is collected but not the sacrificial userdata on the
same cycle, the trick (now a bad kludge) doesn’t work.
However, the trick works with the present Lua 5.1 garbage collector, and
probably with most future implementations. As the sacrificial userdata is al-
ready dead when the cycle starts, it will surely be collected. Even if the Lua im-
plementers introduce generational garbage collection, this userdata will never
move to older generations.
Another solution to eliminating the failure mode is to require that each new
thread calls an initialization function in the library that, perhaps among other
things, invalidates the cache. Since this depends on actions of library users to
prevent the failure mode, it is not ideal. However, if your library needs per-
thread initialization for other reasons, it may be a reasonable fix.
In summary, to avoid the highly unlikely caching failure mode, you should
use a __gc method of the context userdata that invalidates the per-thread li-
brary context, and either require library users in each new thread to call an
• For the simple arithmetic loop, caching gave a bit over a 21% reduction in
compute time. In other words, the arithmetic loop run with the context-
cache-enabled library ran in 79% of the time of the no-context-cache li-
brary.
• For the complex calculation loop, caching gave a bit over a 0.6% reduction
in compute time. Lua table lookup is quite fast compared with this calcu-
lation!
So, as you’d expect, the benefits of caching the library context will depend a
lot on the time complexity of your library functions.
Conclusion
Caching the per-thread library context provides a small increase in performance.
The caching failure mode identified above is highly unlikely, but catastrophic.
The solutions to avoid the failure either depend on the Lua implementation or
put a per-thread initialization obligation on library users. Fortunately, the
non-cache version of per-thread library context performs quite well. Unless you need
the small performance gain, or until Lua implements a gc hook, the cache may
be more trouble than it’s worth. If you need the performance, use a solution to
avoid the failure that’s best for your application.
Lua’s LUA_ENVIRONINDEX mechanism has several interesting uses. It was a
joy to discover, and seems to be just the right approach for per-thread library
context. The mechanism is so easy to implement that every library with context
should use per-thread library context.
2 Lua Performance Tips
Roberto Ierusalimschy
In Lua, as in any other programming language, we should always follow the two
maxims of program optimization:
Rule #1: Don’t do it.
Rule #2: Don’t do it yet. (for experts only)
Those rules are particularly relevant when programming in Lua. Lua is famous
for its performance, and it deserves its reputation among scripting languages.
Nevertheless, we all know that performance is a key ingredient of program-
ming. It is not by chance that problems with exponential time complexity are
called intractable. A too late result is a useless result. So, every good program-
mer should always balance the costs from spending resources to optimize a piece
of code against the gains of saving resources when running that code.
The first question regarding optimization a good programmer always asks is:
“Does the program need to be optimized?” If the answer is positive (but only
then), the second question should be: “Where?”
To answer both questions we need some instrumentation. We should not
try to optimize software without proper measurements. The difference between
experienced programmers and novices is not that experienced programmers are
better at spotting where a program may be wasting its time: The difference is
that experienced programmers know they are not good at that task.
A few years ago, Noemi Rodriguez and I developed a prototype for a CORBA
ORB (Object Request Broker) in Lua, which later evolved into OiL (Orb in
Lua). As a first prototype, the implementation aimed at simplicity. To avoid
the need for extra C libraries, the prototype serialized integers using a few
arithmetic operations to isolate each byte (conversion to base 256). It did not
support floating-point values. Because CORBA handles strings as sequences of
characters, our ORB first converted Lua strings into a sequence (that is, a Lua
table) of characters and then handled the result like any other sequence.
When we finished that first prototype, we compared its performance against
a professional ORB implemented in C++. We expected our ORB to be somewhat
slower, as it was implemented in Lua, but we were disappointed by how much
slower it was. At first, we just laid the blame on Lua. Later, we suspected
that the culprit could be all those operations needed to serialize each number.
So, we decided to run the program under a profiler. We used a very simple
profiler, not unlike the one described in Chapter 23 of Programming in Lua.
The profiler results shocked us. Against our gut feelings, the serialization of
numbers had no measurable impact on the performance, among other reasons
because there were not that many numbers to serialize. The serialization of
strings, however, was responsible for a huge part of the total time. Practically
every CORBA message has several strings, even when we are not manipulating
strings explicitly: object references, method names, and some other internal
values are all coded as strings. And the serialization of each string was an
expensive operation, because it needed to create a new table, fill it with each
individual character, and then serialize the resulting sequence, which involved
serializing each character one by one. Once we reimplemented the serialization
of strings as a special case (instead of using the generic code for sequences), we
got a respectable speed up. With just a few extra lines of code, the performance
of our implementation was comparable to the C++ implementation. [1]
So, we should always measure when optimizing a program for performance.
Measure before, to know where to optimize. And measure after, to know whether
the “optimization” actually improved our code.
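For quick before-and-after comparisons, a very small timing helper built on os.clock is often enough. The sketch below is only an illustration: measure, concat_loop, and table_concat are hypothetical names, and the two workloads are arbitrary stand-ins for "the code before" and "the code after" a change.

```lua
-- Minimal timing helper based on os.clock (CPU time used by the program).
local function measure(label, f, ...)
  local start = os.clock()
  f(...)
  local elapsed = os.clock() - start
  print(string.format("%-16s %.3f s", label, elapsed))
  return elapsed
end

-- Two ways of building the same string, used here only as sample workloads.
local function concat_loop(n)
  local s = ""
  for i = 1, n do s = s .. "x" end
  return s
end

local function table_concat(n)
  local t = {}
  for i = 1, n do t[i] = "x" end
  return table.concat(t)
end

local before = measure("concat loop", concat_loop, 20000)
local after  = measure("table.concat", table_concat, 20000)
```

A helper like this does not replace a profiler, since it only tells you how long a piece of code took and not where the whole program spends its time.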
Once you decide that you really must optimize your Lua code, this text may
help you optimize it, mainly by showing what is slow and what is
fast in Lua. I will not discuss here general techniques for optimization, such
as better algorithms. Of course you should know and use those techniques,
but there are several other places where you can learn them. In this article
I will discuss only techniques that are particular to Lua. Throughout the article, I will
constantly measure the time and space performance of small programs. Unless
stated otherwise, I take all measurements on a Pentium IV 2.9 GHz with 1 GB of main
memory, running Ubuntu 7.10, Lua 5.1.1. Frequently I give actual measurements
(e.g., 7 seconds), but what is relevant is the relationship between different
measurements. When I say that a program is “X% faster” than another, I
mean that it runs in X% less time. (A program 100% faster would take no time
to run.) When I say that a program is “X% slower” than another, I mean
that the other is X% faster. (A program 50% slower takes twice
the time.)
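These two conventions are easy to mix up, so here they are as arithmetic. The function names percent_faster and time_if_slower are made up for this illustration, and the sample timings are arbitrary.

```lua
-- "X% faster": the new program runs in X% less time than the old one.
local function percent_faster(t_old, t_new)
  return (1 - t_new / t_old) * 100
end

-- "X% slower": the other program is X% faster, that is,
-- t_other = t * (1 - X/100), so t = t_other / (1 - X/100).
local function time_if_slower(t_other, x)
  return t_other / (1 - x / 100)
end

print(percent_faster(10, 7))    -- about 30: 7 s instead of 10 s
print(time_if_slower(1, 50))    -- "50% slower" than 1 s means 2 s
```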
[1] Of course our implementation was still slower, but not by an order of magnitude.
Basic facts
Before running any code, Lua translates (precompiles) the source into an in-
ternal format. This format is a sequence of instructions for a virtual machine,
similar to machine code for a real CPU. This internal format is then interpreted
by C code that is essentially a while loop with a large switch inside, one case for
each instruction.
Perhaps you have already read somewhere that, since version 5.0, Lua uses
a register-based virtual machine. The “registers” of this virtual machine do not
correspond to real registers in the CPU, because this correspondence would
not be portable and would be quite limited in the number of registers available. Instead,
Lua uses a stack (implemented as an array plus some indices) to accommodate
its registers. Each active function has an activation record, which is a stack
slice wherein the function stores its registers. So, each function has its own
registers. Each function may use up to 250 registers, because each instruction
has only 8 bits to refer to a register.
Given that large number of registers, the Lua precompiler is able to store all
local variables in registers. The result is that access to local variables is very
fast in Lua. For instance, if a and b are local variables, a Lua statement like
a = a + b generates one single instruction: ADD 0 0 1 (assuming that a and b
are in registers 0 and 1, respectively). For comparison, if both a and b were
globals, the code for that addition would be like this:
GETGLOBAL 0 0 ; a
GETGLOBAL 1 1 ; b
ADD 0 0 1
SETGLOBAL 0 0 ; a
So, it is easy to justify one of the most important rules to improve the perfor-
mance of Lua programs: use locals!
If you need to squeeze performance out of your program, there are several
places where you can use locals besides the obvious ones. For instance, if you
call a function within a long loop, you can assign the function to a local variable.
For instance, the code
for i = 1, 1000000 do
  local x = math.sin(i)
end
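A minimal sketch of the rewrite suggested here: the function is fetched once into a local variable before the loop, so each iteration reads a register instead of doing a global lookup followed by a table lookup. The accumulator sum is an addition for this illustration so that the result can be checked; no timing is claimed for this sketch.

```lua
local sin = math.sin      -- the global and field lookups happen once, here
local sum = 0
for i = 1, 1000000 do
  sum = sum + sin(i)      -- sin is a local: a plain register access
end
print(sum)
```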
Access to external locals (that is, variables that are local to an enclosing
function) is not as fast as access to local variables, but it is still faster than
access to globals. Consider the next fragment:
print(foo(10))
print(foo(10))
This second code runs 30% faster than the original one.
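As a rough sketch of the kind of rewrite being described (the function body below is an assumption made for illustration), hoisting math.sin into a local declared outside the function turns it into an external local of foo:

```lua
local sin = math.sin          -- local to the chunk, external to foo

local function foo(x)
  for i = 1, 1000000 do
    x = x + sin(i)            -- sin is accessed as an external local (upvalue)
  end
  return x
end

print(foo(10))
```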
Although the Lua compiler is quite efficient when compared with compilers
for other languages, compilation is a heavy task. So, you should avoid compiling
code in your program (e.g., function loadstring) whenever possible. Unless you
must run code that is really dynamic, such as code entered by an end user, you
seldom need to compile dynamic code.
As an example, consider the next code, which creates a table with functions
to return constant values from 1 to 100000:
function fk (k)
  return function () return k end
end
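To make the contrast concrete, here is a small self-contained sketch that builds such a table both ways: once by compiling a chunk per entry and once with a closure factory like fk. The table names compiled and closures and the reduced limit of 1000 are choices made for this illustration; loadstring is the Lua 5.1 name, and the first line falls back to load for later versions.

```lua
local loadstring = loadstring or load  -- Lua 5.1 has loadstring; later versions use load

local function fk(k)
  return function () return k end
end

local lim = 1000
local compiled, closures = {}, {}
for i = 1, lim do
  -- runs the compiler once per entry
  compiled[i] = loadstring(string.format("return %d", i))
  -- only creates a new closure over k; nothing is compiled at run time
  closures[i] = fk(i)
end

print(compiled[10](), closures[10]())  -- both functions return 10
```

Both tables behave the same from the caller's point of view; the difference lies only in how much work is done to create each function.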
About tables
Usually, you do not need to know anything about how Lua implements tables to
use them. Actually, Lua goes to great lengths to make sure that implementation
details do not surface to the user. However, these details show themselves
through the performance of table operations. So, to optimize programs that use
tables (that is, practically any Lua program), it is good to know a little about
how Lua implements tables.
The implementation of tables in Lua involves some clever algorithms. Every
table in Lua has two parts: the array part and the hash part. The array part
stores entries with integer keys in the range 1 to n, for some particular n. (We
will discuss how this n is computed in a moment.) All other entries (including
integer keys outside that range) go to the hash part.
As the name implies, the hash part uses a hash algorithm to store and find its
keys. It uses what is called an open address table, which means that all entries
are stored in the hash array itself. A hash function gives the primary index of a
key; if there is a collision (that is, if two keys are hashed to the same position),
the keys are linked in a list, with each element occupying one array entry.
When Lua needs to insert a new key into a table and the hash array is full,
Lua does a rehash. The first step in the rehash is to decide the sizes of the new
array part and the new hash part. So, Lua traverses all entries, counting and
classifying them, and then chooses as the size of the array part the largest power
of 2 such that more than half the elements of the array part are filled. The hash
size is then the smallest power of 2 that can accommodate all the remaining
entries (that is, those that did not fit into the array part).
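The sizing rule for the array part can be sketched in Lua itself. The function array_part_size below is a hypothetical helper written for this illustration, not Lua's actual C implementation; it takes a list of the positive integer keys present in a table and returns the largest power of 2, n, such that more than half of the slots 1 to n would be in use.

```lua
-- keys: a list of the positive integer keys present in some table
local function array_part_size(keys)
  local present, total = {}, 0
  for _, k in ipairs(keys) do
    if not present[k] then
      present[k] = true
      total = total + 1
    end
  end
  local best, count = 0, 0
  local lo, n = 1, 1
  -- once n/2 >= total, "more than half filled" is no longer possible
  while n / 2 < total do
    for i = lo, n do                      -- count the keys in the new slice lo..n
      if present[i] then count = count + 1 end
    end
    if count > n / 2 then best = n end    -- more than half of 1..n is in use
    lo, n = n + 1, n * 2
  end
  return best
end

print(array_part_size({1}))         --> 1
print(array_part_size({1, 2}))      --> 2
print(array_part_size({1, 2, 3}))   --> 4
print(array_part_size({1, 100}))    --> 1
```

In the last call the key 100 would leave the array part far less than half full, so under this rule it belongs in the hash part.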
When Lua creates an empty table, both parts have size 0 and, therefore,
there are no arrays allocated for them. Let us see what happens when we run
the following code:
local a = {}
for i = 1, 3 do
  a[i] = true
end
It starts by creating an empty table a. In the first loop iteration, the assignment
a[1]=true triggers a rehash; Lua then sets the size of the array part of the table
to 1 and keeps the hash part empty. In the second loop iteration, the assignment
a[2]=true triggers another rehash, so that now the array part of the table has
size 2. Finally, the third iteration triggers yet another rehash, growing the size
of the array part to 4.
Code like
a = {}
a.x = 1; a.y = 2; a.z = 3
does something similar, except that it grows the hash part of the table.
For large tables, this initial overhead is amortized over the entire creation:
While a table with three elements needs three rehashings, a table with one
million elements needs only twenty. But when you create thousands of small
tables, the combined overhead can be significant.
Older versions of Lua created empty tables with some pre-allocated slots
(four, if I remember correctly), to avoid this overhead when initializing small
tables. However, this approach wastes memory. For instance, if you create
millions of points (represented as tables with only two entries) and each one
uses twice the memory it really needs, you may pay a high price. That is why
currently Lua creates empty tables with no pre-allocated slots.
If you are programming in C, you can avoid those rehashings with the Lua
API function lua_createtable. It receives two arguments after the omnipresent
lua_State: the initial size of the array part and the initial size of the hash part
of the new table. [3] By giving appropriate sizes to the new table, it is easy to avoid
those initial rehashes. Beware, however, that Lua can only shrink a table when
rehashing it. So, if your initial sizes are larger than needed, Lua may never
correct your waste of space.
When programming in Lua, you may use constructors to avoid those initial
rehashings. When you write {true, true, true}, Lua knows beforehand that
the table will need three slots in its array part, so Lua creates the table with
that size. Similarly, if you write {x = 1, y = 2, z = 3}, Lua will create a table
with four slots in its hash part. As an example, the next loop runs in 2.0 seconds:
for i = 1, 1000000 do
  local a = {}
  a[1] = 1; a[2] = 2; a[3] = 3
end
If we create the tables with the right size, we reduce the run time to 0.7 seconds:
for i = 1, 1000000 do
  local a = {true, true, true}
  a[1] = 1; a[2] = 2; a[3] = 3
end
If you write something like {[1] = true, [2] = true, [3] = true}, how-
ever, Lua is not smart enough to detect that the given expressions (literal num-
bers, in this case) describe array indices, so it creates a table with four slots in
its hash part, wasting memory and CPU time.
3 Although the rehash algorithm always sets the array size to a power of two, the array size can
be any value. The hash size, however, must be a power of two, so the second argument is always
rounded to the smallest power of two not smaller than the original value.
The sizes of both parts of a table are recomputed only when the table rehashes,
which happens only when the table is completely full and Lua needs to insert a
new element. As a consequence, if you traverse a table erasing all its fields (that
is, setting them all to nil), the table does not shrink. However, if you insert some
new elements, then eventually the table will have to resize. Usually this is not
a problem: if you keep erasing elements and inserting new ones (as is typical in
many programs), the table size remains stable. However, you should not expect
to recover memory by erasing the fields of a large table: It is better to free the
table itself.
A dirty trick to force a rehash is to insert enough nil elements into the table.
See the next example:
a = {}
lim = 10000000
for i = 1, lim do a[i] = i end -- create a huge table
print(collectgarbage("count")) --> 196626
for i = 1, lim do a[i] = nil end -- erase all its elements
print(collectgarbage("count")) --> 196626
for i = lim + 1, 2*lim do a[i] = nil end -- create many nil elements
print(collectgarbage("count")) --> 17
I do not recommend this trick except in exceptional circumstances: It is slow
and there is no easy way to know how many elements are “enough”.
You may wonder why Lua does not shrink tables when we insert nils. First,
to avoid testing what we are inserting into a table; a check for nil assignments
would slow down all assignments. Second, and more important, to allow nil
assignments when traversing a table. Consider the next loop:
for k, v in pairs(t) do
  if some_property(v) then
    t[k] = nil -- erase that element
  end
end
If Lua rehashed the table after a nil assignment, it would wreak havoc on the traversal.
If you want to erase all elements from a table, a simple traversal is the correct
way to do it:
for k in pairs(t) do
  t[k] = nil
end
A “smart” alternative would be this loop:
while true do
  local k = next(t)
  if not k then break end
  t[k] = nil
end
However, this loop is very slow for large tables. Function next, when called
without a previous key, returns the “first” element of a table (in some random
order). To do that, next traverses the table arrays from the beginning, looking
for a non-nil element. As the loop sets the first elements to nil, next takes longer
and longer to find the first non-nil element. As a result, the “smart” loop takes
20 seconds to erase a table with 100,000 elements; the traversal loop using pairs
takes 0.04 seconds.
About strings
As with tables, it is good to know how Lua implements strings to use them more
efficiently.
The way Lua implements strings differs in two important ways from what is
done in most other scripting languages. First, all strings in Lua are internalized;
this means that Lua keeps a single copy of any string. Whenever a new string
appears, Lua checks whether it already has a copy of that string and, if so,
reuses that copy. Internalization makes operations like string comparison and
table indexing very fast, but it slows down string creation.
Second, variables in Lua never hold strings, but only references to them.
This implementation speeds up several string manipulations. For instance, in
Perl, when you write something like $x = $y, where $y contains a string, the
assignment copies the string contents from the $y buffer into the $x buffer. If
the string is long, this becomes an expensive operation. In Lua, this assignment
involves only copying a pointer to the actual string.
This implementation with references, however, slows down a particular form
of string concatenation. In Perl, the operations $s = $s . "x" and $s .= "x"
are quite different. In the first one, you get a copy of $s and add "x" to its end.
In the second one, the "x" is simply appended to the internal buffer kept by the
$s variable. So, the second form is independent from the string size (assuming
the buffer has space for the extra text). If you have these commands inside loops,
their difference is the difference between a linear and a quadratic algorithm. For
instance, the next loop takes almost five minutes to read a 5MByte file:
$x = "";
while (<>) {
  $x = $x . $_;
}
Lua offers no equivalent of the second, faster form, because its variables have
no buffers associated with them. Instead, we use an explicit buffer: a table
collects the string pieces and table.concat joins them in a single step:
local t = {}
for line in io.lines() do
  t[#t + 1] = line
end
s = table.concat(t, "\n")
Although natural, this representation is not very economical for large polylines, as
it needs a table for each single point. A first alternative is to change the records
into arrays, which use less memory:
For a polyline with one million points, this change reduces the use of memory
from 95 MBytes to 65 MBytes. Of course, you pay a price in readability: p[i].x
is easier to understand than p[i][1].
An even more economical alternative is to use one list for the x coordinates and
another one for the y coordinates:
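The listings for these layouts are not reproduced here. The following sketch shows what the three representations might look like; the coordinates are made up and to_columns is a hypothetical helper, not something taken from the original text.

```lua
-- Layout 1: one record per point (most readable, one table per point).
local as_records = { {x = 10.3, y = 98.5}, {x = 10.3, y = 18.3}, {x = 15.0, y = 98.5} }

-- Layout 2: one small array per point (still one table per point).
local as_arrays = { {10.3, 98.5}, {10.3, 18.3}, {15.0, 98.5} }

-- Layout 3: two parallel lists (three tables in total, whatever the size).
local as_columns = { x = {10.3, 10.3, 15.0}, y = {98.5, 18.3, 98.5} }

-- Hypothetical helper: converts layout 1 into layout 3.
function to_columns (records)
  local xs, ys = {}, {}
  for i, p in ipairs(records) do
    xs[i], ys[i] = p.x, p.y
  end
  return { x = xs, y = ys }
end

-- The same coordinate, read from each layout.
print(as_records[2].y, as_arrays[2][2], as_columns.y[2])  --> 18.3  18.3  18.3
```

The third layout trades the most readability for the least memory, since the number of tables no longer grows with the number of points.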
The same trick may be used for closures, as long as you do not move them out
of the scope of the variables they need. For instance, consider the following
function:
We can avoid the creation of a new closure for each line by moving the inner
function outside the loop:
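As a minimal sketch of this transformation, consider a hypothetical function that adds delta to every number in a list of lines that is not smaller than limit. The function names and the use of a list of lines (instead of reading from a file) are assumptions made to keep the example self-contained.

```lua
-- Version 1: the anonymous function is inside the loop, so a new closure
-- is created for every line.
function shift_numbers_slow (lines, limit, delta)
  local out = {}
  for i, line in ipairs(lines) do
    out[i] = string.gsub(line, "%d+", function (num)
      num = tonumber(num)
      if num >= limit then return tostring(num + delta) end
      -- returning nothing keeps the original number
    end)
  end
  return out
end

-- Version 2: the closure is created once, outside the loop, but still
-- inside the scope of 'limit' and 'delta', which it needs.
function shift_numbers (lines, limit, delta)
  local function replace (num)
    num = tonumber(num)
    if num >= limit then return tostring(num + delta) end
  end
  local out = {}
  for i, line in ipairs(lines) do
    out[i] = string.gsub(line, "%d+", replace)
  end
  return out
end

print(shift_numbers({"a 5 b 20"}, 10, 1)[1])  --> a 5 b 21
```

We cannot move replace out of shift_numbers itself, because it would then lose access to limit and delta.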
For many kinds of string processing, we can reduce the need for new strings
by working with indices over existing strings. For instance, the string.find
function returns the position where it found the pattern, instead of the match.
By returning indices, it avoids creating a new (sub)string for each successful
match. When necessary, the programmer can get the match substring by calling
string.sub.4
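The following sketch illustrates working with indices. The sample string and the helper count_numbers are made up for the example; string.find and string.sub are the standard library functions.

```lua
local s = "width=640, height=480"

-- string.find returns the start and end positions of the match,
-- not the matched text.
local first, last = string.find(s, "%d+")
print(first, last)                 --> 7  9

-- A substring is created only when we explicitly ask for it.
print(string.sub(s, first, last))  --> 640

-- Hypothetical helper: counts the numbers in a string by moving an index
-- along it, without creating any substring.
function count_numbers (str)
  local count, pos = 0, 1
  while true do
    local i, j = string.find(str, "%d+", pos)
    if not i then break end
    count = count + 1
    pos = j + 1          -- continue right after this match
  end
  return count
end

print(count_numbers(s))            --> 2
```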
When we cannot avoid the use of new objects, we still may avoid creating
these new objects through reuse. For strings reuse is not necessary, because Lua
does the job for us: it always internalizes all strings it uses, therefore reusing
them whenever possible. For tables, however, reuse may be quite effective.
As a common case, let us return to the situation where we are creating new
tables inside a loop. This time, however, the table contents are not constant.
Nevertheless, frequently we still can reuse the same table in all iterations,
simply changing its contents. Consider this chunk:
local t = {}
for i = 1970, 2000 do
  t[i] = os.time({year = i, month = 6, day = 14})
end
The next one is equivalent, but it reuses the table:
local t = {}
local aux = {year = nil, month = 6, day = 14}
for i = 1970, 2000 do
  aux.year = i
  t[i] = os.time(aux)
end
A particularly effective way to achieve reuse is through memoizing. The basic
idea is quite simple: store the result of some computation for a given input
so that, when the same input is given again, the program simply reuses that
previous result.
LPeg, a new package for pattern matching in Lua, makes interesting use of
memoizing. LPeg compiles each pattern into an internal representation, which
is a “program” for a parsing machine that performs the matching. This compila-
tion is quite expensive, when compared with matching itself. So, LPeg memoizes
the results from its compilations to reuse them. A simple table associates the
string describing a pattern to its corresponding internal representation.
A common problem with memoizing is that the cost in space to store previous
results may outweigh the gains of reusing those results. To solve this problem
in Lua, we can use a weak table to keep the results, so that unused results are
eventually removed from the table.
In Lua, with higher-order functions, we can define a generic memoization
function:
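The listing is not reproduced here. A minimal sketch of such a function, assuming that f takes a single non-nil argument and returns a single result, might be:

```lua
-- Sketch of a generic memoization function (an assumption about its shape,
-- not necessarily the original listing).
function memoize (f)
  -- Weak keys and values, so that the collector may drop unused entries.
  local mem = setmetatable({}, {__mode = "kv"})
  return function (x)
    local r = mem[x]
    if r == nil then      -- no previous result?
      r = f(x)            -- call the original function
      mem[x] = r          -- store the result for reuse
    end
    return r
  end
end

-- Usage: count how many times the original function really runs.
local calls = 0
local square = memoize(function (n) calls = calls + 1; return n * n end)
print(square(12))  --> 144
print(square(12))  --> 144
print(calls)       --> 1
```

Note that a nil result is never cached in this sketch, so a function that often returns nil gains nothing from it.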
4 It would be a good idea for the standard library to have a function to compare substrings, so that
we could check specific values inside a string without having to extract that value from the string
(thereby creating a new string).
Given any function f, memoize(f) returns a new function that returns the same
results as f but memoizes them. For instance, we can redefine loadstring with
a memoizing version:
loadstring = memoize(loadstring)
We use this new function exactly like the old one, but if there are many repeated
strings among those we are loading, we can have a substantial performance
gain.
If your program creates and frees too many coroutines, recycling may be an
option to improve its performance. The current API for coroutines does not offer
direct support for reusing a coroutine, but we can circumvent this limitation.
Consider the next coroutine:
co = coroutine.create(function (f)
  while f do
    f = coroutine.yield(f())
  end
end)
This coroutine accepts a job (a function to run), runs it, and when it finishes it
waits for a next job.
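A sketch of how such a coroutine might be driven follows. The helper run_job is a hypothetical name, not part of the coroutine API, and the sketch handles only jobs that return a single value.

```lua
-- The recycled coroutine: runs each job it receives, yields the job's
-- result, and then waits for the next job.
local worker = coroutine.create(function (f)
  while f do
    f = coroutine.yield(f())
  end
end)

-- Hypothetical helper: hands a job to the worker and returns its result.
function run_job (job)
  local ok, result = coroutine.resume(worker, job)
  if not ok then error(result, 2) end
  return result
end

print(run_job(function () return 1 + 1 end))   --> 2
print(run_job(function () return "done" end))  --> done
print(coroutine.status(worker))                --> suspended
```

Both jobs run in the same coroutine, so only one coroutine is ever created. Two limitations of this simple scheme are worth keeping in mind: a job that raises an error kills the worker, and a job that yields by itself is mistaken for a finished job.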
Most recycling in Lua is done automatically by the garbage collector. Lua
uses an incremental garbage collector. That means that the collector performs
its task in small steps (incrementally) interleaved with the program execution.
The pace of these steps is proportional to memory allocation: for each amount
of memory allocated by Lua, the garbage collector does some proportional work.
The faster the program consumes memory, the faster the collector tries to recycle
it.
If we apply the principles of reduce and reuse to our program, usually the
collector will not have too much work to do. But sometimes we cannot avoid the
creation of large amounts of garbage and the collector may become too heavy.
The garbage collector in Lua is tuned for average programs, so that it performs
reasonably well in most applications. However, sometimes we can improve the
Final remarks
As we discussed in the introduction, optimization is a tricky business. There
are several points to consider, starting with whether the program needs any
optimization at all. If it has real performance problems, then we must focus on
where and how to optimize it.
The techniques we discussed here are neither the only nor the most impor-
tant ones. We focused here on techniques that are peculiar to Lua, as there are
several sources for more general techniques.
Before we finish, I would like to mention two options that are at the border-
line of improving performance of Lua programs, as both involve changes outside
the scope of the Lua code. The first one is to use LuaJIT, a Lua just-in-time
compiler developed by Mike Pall. He has been doing a superb job and LuaJIT
is probably the fastest JIT for a dynamic language nowadays. The drawbacks
are that it runs only on x86 architectures and that you need a non-standard Lua
interpreter (LuaJIT) to run your programs. The advantage is that you can run
your program 5 times faster with no changes at all to the code.
The second option is to move parts of your code to C. After all, one of Lua's
hallmarks is its ability to interface with C code. The important point in this case
is to choose the correct level of granularity for the C code. On the one hand, if you
move only very simple functions into C, the communication overhead between
Lua and C may kill any gains from the improved performance of those functions.
On the other hand, if you move functions that are too large into C, you lose flexibility.
Finally, keep in mind that those two options are somewhat incompatible. The
more C code your program has, the less LuaJIT can optimize it.
3
Vardump: The Power of
Seeing What’s Behind
Tobias Sülzenbrück and Christoph Beckmann
Here is a simple example of vardump in action. It prints out the variable foo
that contains the string “Hello World”.
> foo = "Hello World"
> vardump(foo)
(string) Hello World
Implementation
As shown in Listing 1, vardump is a Lua function with three parameters: one
for the resulting data value and two others reserved for the recursive function
invocation through itself.
First the key parameter is checked for adding a line prefix. This is used when
printing tables to also print out the index of the table cell. Second, depending
on the table depth (i.e., the current iteration step when printing a table), spaces
are added in front of each line to enhance the readability. When calling vardump
with a simple data type, such as a string, the aforementioned additions
have no effect. They are only of interest when printing
out tables, because the Lua print function automatically adds a line break.
Next the value is checked against some basic types. This includes tables,
functions, threads, userdata, and all other types the data in vardump can have.
When the resulting type is a simple data type, it is displayed with the Lua print
function. The data type is printed in brackets in front of the value. Functions,
threads, userdata, and nil values are printed without their data types at the
beginning of the line. For example, the vardump of a function prints its memory
address, just as the standard print function does.
Tables are the universal structure in Lua and they need special handling in
vardump. As tables may contain other tables it is essential to get all information
out of them, no matter how deep the nesting is. The output of vardump for a set
of nested tables is shown in Listing 2.
The function begins by obtaining the current iteration depth. From the
depth value, the number of leading spaces is determined. A table can
contain other tables or the special metatable. If so, the current value is replaced
by the contents of the metatable. Next, vardump is invoked recursively for each
pair in the table. The new value and the corresponding key are two of the
arguments for the call. The third parameter is the current depth, as mentioned
for adding spaces at the beginning of the line.
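Since Listing 1 is not reproduced here, the following is a simplified reconstruction based only on the description above; it is not the authors' code. The helper vardump_lines, its fourth parameter and the exact output layout are assumptions. Metatable handling is left out, and cyclic tables would make it recurse forever.

```lua
-- Hypothetical helper: builds the output lines instead of printing them,
-- so that the result can be inspected.
function vardump_lines (value, depth, key, out)
  out = out or {}
  depth = depth or 0
  local indent = string.rep("  ", depth)   -- two spaces per nesting level
  local prefix = ""
  if key ~= nil then                       -- inside a table: show the index
    prefix = "[" .. tostring(key) .. "] = "
  end
  local t = type(value)
  if t == "table" then
    out[#out + 1] = indent .. prefix .. "(table)"
    for k, v in pairs(value) do            -- recurse into each pair
      vardump_lines(v, depth + 1, k, out)
    end
  elseif t == "function" or t == "thread" or t == "userdata" or t == "nil" then
    out[#out + 1] = indent .. prefix .. tostring(value)   -- no type label
  else
    out[#out + 1] = indent .. prefix .. "(" .. t .. ") " .. tostring(value)
  end
  return out
end

function vardump (value)
  for _, line in ipairs(vardump_lines(value)) do print(line) end
end

vardump("Hello World")  --> (string) Hello World
```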
Conclusion
The functionality of vardump is powerful and a must for any developer. vardump
gives you the transparency you need for every variable you use. One possible
improvement, to adapt vardump to your workflow, is an extra argument
that sets the maximum iteration depth for tables, which is very useful
when handling large, deeply nested tables.
Serialization refers to the process of taking a piece or set of data from a running
program and writing it to a one-dimensional datastream (such as could be
stored in a string or a file), with the goal of restoring that data later from the
datastream. The most common use of serialization is implementing a “save”
feature. In this instance, the serialized data is the data in a spreadsheet
application, or the state of the gameboard in a chess game. A closely related use
is the creation of “rollback points”, useful in simulations and databases, which
allows the application to revert to an earlier state. The idea of rollback points
is particularly interesting as it relates to coroutines, because the state of the
application is contained not only in the data of the program, but in the execution
state of all extant coroutines. A similar situation can easily be encountered in a
save-game feature in video games, particularly where the state of a character’s
AI is embodied in a coroutine.
The plot, therefore, thickens: like many problems in computer science, seri-
alization starts out sounding trivial, and becomes more complex as the full scope
of the problem is examined. An additional level of complexity arises when one
begins to consider the practical aspects of serialization as opposed to treating it
as an exercise in theory. Indeed, the creation of a serialization system that is
both correct and useful is a remarkably involved task. Because of this, premade
serialization libraries are a convenient way to reduce development time while
ensuring a robust, efficient result. In the ideal case, the complexities of deciding
on a file format and keeping the loading and saving code in sync are swept away
in favor of a simple “serialize this”. Pluto is one such system, which handles the
function creategraph2()
  local objs = {}
  -- creation
  objs[1] = {}
  objs[2] = {}
  objs[3] = "foo"
  -- filling
  insert(1, 3, 2)
  insert(2, 3, 1)
  return objs[1]
end
Listing 1.
Otherwise, assign a new integer index to the object and output that index, and
output the object’s data. For other objects that it references, recursively write
those objects. During deserialization, maintain an array of integer indices and
the objects to which they refer. When an object is encountered, read whether that
object has been encountered before. If it has, simply return the object with the
given index. Otherwise, create a new object of the specified type within the array
at the specified index, and then read the object’s data (recursively invoking the
routine to read objects which have been referenced), finally returning the fully
created object.
The correctness of these algorithms may easily be proven: Assuming that
integer indices are assigned incrementally, it may be shown that during seri-
alization no object completes being written unless all other objects with lower
indices have already been assigned these indices. Likewise, it may be shown that
during deserialization no object begins being written unless all other objects
with lower indices have already been created, and that all created objects are
fully initialized by the time deserialization completes.
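The index-based scheme described above can be sketched in pure Lua. This is only an illustration and not Pluto's implementation: the "stream" is an in-memory array of tokens rather than bytes, only tables are treated as shared objects, and the names serialize and deserialize are hypothetical.

```lua
-- Writes 'root' as a flat array of tokens. A table is written in full the
-- first time it is met; later occurrences become references to its index.
function serialize (root)
  local out, index_of, next_index = {}, {}, 0
  local function write (v)
    if type(v) ~= "table" then
      out[#out + 1] = {tag = "value", value = v}
      return
    end
    local idx = index_of[v]
    if idx then                          -- seen before: output only the index
      out[#out + 1] = {tag = "ref", index = idx}
      return
    end
    next_index = next_index + 1
    index_of[v] = next_index             -- assign the index before the data
    local count = 0
    for _ in pairs(v) do count = count + 1 end
    out[#out + 1] = {tag = "table", index = next_index, count = count}
    for k, val in pairs(v) do            -- recursively write referenced objects
      write(k)
      write(val)
    end
  end
  write(root)
  return out
end

-- Rebuilds the value, keeping an array from indices to created objects.
function deserialize (stream)
  local pos, objects = 0, {}
  local function read ()
    pos = pos + 1
    local tok = stream[pos]
    if tok.tag == "value" then return tok.value end
    if tok.tag == "ref" then return objects[tok.index] end
    local t = {}
    objects[tok.index] = t               -- register before reading the data
    for _ = 1, tok.count do
      local k = read()
      local v = read()
      t[k] = v
    end
    return t
  end
  return read()
end

-- Two tables that reference each other survive the round trip.
local a, b = {name = "foo"}, {}
a.peer, b.peer = b, a
local copy = deserialize(serialize(a))
print(copy ~= a, copy.name, copy.peer.peer == copy)  --> true  foo  true
```

Registering each table under its index before its contents are processed is what allows cycles and shared references to be handled.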
Using Pluto
Pluto is implemented as a Lua module in C, which must be built for a particular
version of the Lua interpreter (due to its direct manipulation of Lua’s data
structures). It can serialize every Lua type except for C functions, which would
not be possible without architectural support. (Such a feature would usually
be undesirable anyway; C functions can be registered as permanent objects,
as described later.) In particular, Pluto correctly supports shared upvalues, a
necessary feature for certain OO systems.
In its simplest form, Pluto can be used as follows:
require("pluto")
-- then, later...
require("pluto")
function persistvertex2d(v)
  local x = v.x
  local y = v.y
  return function()
    return vertex2d(x, y)
  end
end
Permanent objects
When one examines the space of situations where custom serialization routines
are necessary, two distinct patterns emerge. In one pattern, it is fully possible
to serialize the state of the object into a file as long as the system knows how.
In the other pattern, however, the state of the object extends past the scope of
the serialized data. A userdata, for instance, may be a handle to a database, or
to a hardware function used to access the system time.
One can invent different custom serialization routines for these, of course,
but they all tend to involve the same thing: fixing up references during deseri-
alization with preexisting objects. In the case of the system time function, for
instance, it is likely that during deserialization such a routine will have already
been loaded by a module, just as it was before the creation of the data which was
originally serialized. Ideally, therefore, during serialization the system would
not even attempt to save this routine to disk, but instead describe the routine in
some way (likely through a unique identifier). During deserialization, the same
routine in its new instantiation would be connected with the same identifier,
and the objects being loaded would have their references fixed-up to reference
this new routine. Some convention would be used to ensure that the new ob-
ject would be equivalent to the old one, such as giving the function’s “canonical
name” in the global namespace as its identifier.
Pluto supports this fixup behavior with a “permanents table”. During seri-
alization, a table of permanents is passed in (the empty first argument in the
examples above), with the keys being the permanent objects and the values be-
ing the identifiers for the objects. The identifiers are serialized in the normal
way, and can be of any type, although in practice only integers and strings are
used. During deserialization, the reverse is passed in, with the keys being the
identifiers and the values being the permanent objects. In the example shown
in Listing 2, canvas is a userdata created by the Canvas Draw library, which
refers to the hardware screen; for obvious reasons, it cannot be serialized, and
is instead fixed up via an entry in the permanents table.
One common situation in which the permanents table is required is that of
serializing coroutines. When a coroutine has yielded by calling coroutine.yield,
that C function is still referenced by the coroutine’s callstack. If Pluto tried to
serialize the callstack, it would fail to serialize that value. Therefore, in order
to serialize a running coroutine it is necessary to have coroutine.yield in the
permanents table.
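Because the table used for deserialization is the reverse of the one used for serialization, it is convenient to build one from the other. The following sketch does this in plain Lua and does not call Pluto at all. The helper invert_permanents and the choice of canonical names as identifiers are assumptions made for the example.

```lua
-- Permanents table for serialization: permanent object -> identifier.
local perms_save = {
  [coroutine.yield] = "coroutine.yield",
  [print] = "print",
}

-- Hypothetical helper: builds the reverse table (identifier -> object)
-- that is needed during deserialization.
function invert_permanents (perms)
  local inverse = {}
  for object, id in pairs(perms) do
    -- Two objects sharing an identifier could not be told apart on load.
    assert(inverse[id] == nil, "duplicate permanent identifier")
    inverse[id] = object
  end
  return inverse
end

local perms_load = invert_permanents(perms_save)
print(perms_load["coroutine.yield"] == coroutine.yield)  --> true
```

Deriving the second table from the first guarantees that both sides agree on the identifiers.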
Limitations of Pluto
There are certain guarantees which Pluto does not currently provide. None of
these are fundamental limitations of the technology, but rather implementation
decisions which keep the library’s design simple. If any of these features would
be particularly useful to Pluto’s users, they could be added to a future release.
First, Pluto does not handle byte ordering issues. It is assumed that the
memory representations of numbers will be the same between serialization
and deserialization. This makes Pluto of limited use in network protocols for
cross-platform applications, and for other situations where a differently endian
architecture will be deserializing data.
Secondly, Pluto is not hardened against invalid bytestreams. Untrusted
bytestreams should not be deserialized in security-critical situations, as they
could crash the application or even enable code-injection attacks.
Finally, Pluto uses an inefficient algorithm for deserializing certain types of
upvalues, requiring a traversal of the entire garbage collection list. This has not
caused any known significant slowdowns, but applications with extremely large
working sets could conceivably experience problems.
require("pluto")
drawingagent = {
  canvas = nativecanvas,
  drawblueline = function(this, x1, y1, x2, y2)
    this.canvas.SetForeground(cd.BLUE)
    this.canvas.Line(x1, y1, x2, y2)
  end,
  drawredline = function(this, x1, y1, x2, y2)
    this.canvas.SetForeground(cd.RED)
    this.canvas.Line(x1, y1, x2, y2)
  end}
Listing 2.
LuaPickle
The first approach is found in the “Lua Pickle” library available on the lua-
users.org wiki. This library is implemented entirely in Lua and outputs plain
text files, making it fully cross-platform. Elegantly, it outputs data as a Lua
program, which is simply executed to deserialize the data. The ease with which
this is possible reflects Lua’s pedigree as a data-description language.
LuaPickle does not support custom serialization routines or permanent ob-
jects. That is not a result of any fundamental limitations of the technology,
though, and could be added to the library without too much effort. The pure-Lua
approach, however, does limit the number of built-in types which may be seri-
alized. Userdata, coroutines, and functions are all unsupported, as the built-in
Lua libraries do not provide adequate introspection facilities for them.
As the library requires no nonstandard native code, it is the easiest of the
three to integrate with an existing program. If you are certain you will never
need to serialize any of the unsupported data types, its convenience is un-
matched.
lper
lper is a melding of the Lua virtual machine with LPSM, an off-the-shelf per-
sistent memory manager. By maintaining the VM’s entire memory space in a
disk-backed virtual memory region (a surprisingly straightforward task, thanks
to Lua’s support for custom memory allocation routines), the entire Lua universe
may be written and read. The simplicity of this approach is admirable: after all
the worrying about upvalues and userdata, this type-agnostic system can seri-
alize it all, in a manner reminiscent of Alexander’s cutting of the Gordian Knot.
If this approach is sufficient for your needs, it’s difficult to beat for simplicity,
power, and ease of use.
There are definite tradeoffs, however. The chief issue is that the entire Lua
universe must be saved. This severely limits its usefulness for saving games or
documents, unless an effort is made to segregate the data into a separate VM
instantiation (which presents its own set of difficulties relating to data sharing).
lper is therefore best suited for long-running, processor-intensive simulations.
In such a situation, lper could be used for creating rollback points, where poten-
tially most of the system state must be saved regardless. It is also necessary
to ensure that references to memory not allocated by Lua (such as registered C
functions) remain invariant across invocations. Custom serialization routines
are also unsupported. Finally, lper is an experimental library, which has not
been fully tested and is limited to POSIX environments due to the requirements
of LPSM.
Conclusion
Serialization is an important task for many applications, and often a compli-
cated one, particularly if it is not planned for from the outset. At the same time,
it is undeniably prosaic. The implementation of a robust serialization system
is a task to be deferred to third-party solutions whenever possible, to allow the
programmer to concentrate on application-specific tasks. If you are planning out
the technology to be used for an application, or if you need to graft serialization
or persistence onto an existing project, Pluto can minimize the pain involved in
integrating serialization.
5
Abstractions for LuaSQL
Tomás Guisasola Gorham
This article shows how to build an abstraction layer over LuaSQL to ease
the most common uses of the library made by application developers. The
reader is expected to know Lua and the basics of LuaSQL: how to install,
open a connection, and execute SQL statements. We will show some common
uses of LuaSQL’s API, extracted from our own experience, and try to develop,
step by step, a set of abstractions to simplify them, aiming at a higher level
programming style.
We will begin by showing an example from which we point out common pieces
of code that are found in many programs. The following four sections will de-
tail those constructions, showing some forms of generalization and abstraction
that should help make the whole program easier to write, maintain and under-
stand. Finally, a complete abstraction is obtained in the form of a library that
encapsulates the main uses of LuaSQL.
Common uses
Listing 1 shows an example of a common use of the LuaSQL library, which includes
almost all the points we plan to examine. These points are marked with num-
bers between parentheses.
The example starts by loading a LuaSQL driver and opening the connection.
This initialization phase is marked by number (1). Then the example builds an
SQL statement (2), sends it to the database and checks for errors (3), and finally
retrieves the result set (4).1 These phases will be analyzed in the following
sections.
-- Initialization (1)
require"luasql.postgres"
local env = luasql.postgres ()
local conn = assert (env:connect ("lpg"))
-- Closing (5)
cur:close()
conn:close()
env:close()
Listing 1. An ordinary complete sample of LuaSQL use, where the typical phases
are marked by numbers between parentheses.
Defining a module
As mentioned above, we will develop a Lua module to group all the abstractions
together. We shall use a table to encapsulate the actual LuaSQL connection
and add functions/methods to its metatable. The programmer can access the
actual LuaSQL object to perform other operations, such as turning the auto-commit
mode on or off, or calling the commit and the rollback methods. Let us start with a
constructor of this new type of object, which will also be responsible for opening
the connection to the database, and a closing function. This will constitute the
file database.lua.
We will also write a test file. It will be useful for testing, but also as a set of
use samples.
From now on, we will develop the following two files in parallel: the module
file (database.lua) and the test file. Sometimes we will enhance a piece of
code that had been developed earlier and thus it will be replaced by the new
implementation.
1 The code that retrieves the result set can be written more compactly, like this:
for id, name in cur.fetch, cur do print (id, name) end
I chose the more verbose version mainly because I could not find any use of this compact form in a
search on the Internet, at least at the time of this writing. Anyway, my point is that legibility could
be improved in both forms.
Error handling
LuaSQL handles errors just like the standard Lua libraries: an error is raised
only if the arguments do not follow the types defined by the API. Errors gener-
ated by the database client, such as incorrect SQL syntax, unknown identifiers,
or even violation of database restrictions, are reported in the conventional way,
by returning nil and an error message. This behavior gives the programmer
the freedom to check for errors only when they show up. Although it is
tedious to write down an if-test everywhere, the fact is that they are not usually
written anywhere! However, a simple function can do this for us. Let us add the
following definition to our module:
function assertexec (self, stmt)
  local cur, msg = self.conn:execute (stmt)
  return cur or error ((msg or '').." SQL = { "..stmt.." }", 2)
end
To test it, let us create a test database and insert some rows into it, not
forgetting to check if it raises errors properly:
assert (pcall (db.assertexec, db, "wrong SQL statement") == false)
db:assertexec[[create table people (
  id integer,
  name varchar (100),
  sex char(1),
  tel varchar (10)
)]]
a separate module for that, but we will put everything together for concision.
Our main goal is to provide both practicality and robustness. Practicality can
be achieved with a small set of functions covering the most common SQL statements:
delete, insert, update and select. Robustness, at this level, has to be
assured by properly quoting and escaping the statements, which prevents common
mistakes and also reduces tedious work, another common cause of
error.
Infrastructure
As we have mentioned, a common mistake is to forget to quote a string, but a
more common one is to forget to escape a quote inside a quoted string. These
arguments should be enough to force us to define functions for escaping and
quoting a given string. Until LuaSQL 2.1 there was no support for these
operations,2 so both had to be done in Lua. These operations should be
included in the assembly of the SQL statements but we should also be cautious
about their use, so that we do not escape or quote the same string twice.
Nevertheless, sometimes we do not want a quoted string, for example when
using a select as the value of a column (a sub-select), or when using a pre-defined
database value such as NULL or CURRENT_TIMESTAMP. Consequently, it is important
to let the user differentiate these situations in a convenient way.
Since any SQL expression can be written between parentheses (and, in fact,
a sub-select has to be written this way), we decided that
parenthesized strings would not be quoted. Thus, we can write the quote function
so that it quotes only strings that are not enclosed by parentheses:3
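The listing is not reproduced here. A minimal sketch of the two functions, under the assumption that a single quote is escaped by doubling it (the convention of standard SQL; some systems use a backslash instead), might be:

```lua
-- Sketch only: the escaping convention is an assumption and must be
-- adapted to the database in use.
function escape (s)
  -- The extra parentheses discard the second result of string.gsub.
  return (string.gsub(s, "'", "''"))
end

function quote (s)
  s = tostring(s)               -- also accept numbers
  if string.find(s, "^%(.*%)$") then
    return s                    -- parenthesized: taken as an SQL expression
  end
  return "'" .. escape(s) .. "'"
end

print(quote("O'Neill"))                       --> 'O''Neill'
print(quote("(select max(id) from people)"))  --> (select max(id) from people)
print(quote("(NULL)"))                        --> (NULL)
print(quote("NULL"))                          --> 'NULL'
```

A consequence of this convention is that any text that begins and ends with parentheses bypasses quoting, so strings that come from untrusted users should not be passed to quote unchecked.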
Insert
Now let us consider a change to our test file, establishing that every value
retrieved from the database be checked. A reasonable way to do that is by
defining a Lua table with all the values we want. Then, an automatic routine
could store this data in the database and another routine could retrieve and
check the values, item by item, comparing them to the original data.
2 The escape function was added to LuaSQL 2.2 as a consequence of writing this article.
3 Some systems require that a single quote be escaped with two single quotes instead of a
backslash, as shown in the code.
In order for this to work, we need an insert method in our database connection.
Basically, an insert SQL statement contains three “arguments”: a table
name, an optional list of columns and a list of values. The natural way to provide
lists in Lua is using a table as an array. Better yet, since we want two
corresponding lists we can use the same table to provide both pieces of information:
the list of columns is the list of table keys, and the list of values is the list
of values associated with these keys. Therefore, our new method only needs to
receive the name of the table to act on and a table with the column-value pairs,
as in:
db:insert ("people", { id=1, name="John Doe", sex="M", tel="12", })
The function that builds the two lists may be added to our infrastructure as
displayed below:
function twolists (tab)
  local k, v = {}, {}
  local i = 0
  for key, val in pairs (tab) do
    i = i+1
    k[i] = key
    v[i] = quote (val)
  end
  return table.concat (k, ','), table.concat (v, ',')
end
The twolists function can also be used to build parts of other SQL state-
ments as will be shown later.
Hence, the implementation of the insert method can be:
function insert (self, tablename, contents)
  return self:assertexec (string.format (
    "insert into %s (%s) values (%s)",
    tablename, twolists (contents)))
end
Our test script can be rewritten to automatically populate the database with
data from a table, by using the following code:
-- Set of data
data = {
  { name = "John Doe", sex = "M", tel = "12", },
  { name = "Jane Doe", sex = "F", tel = "01", },
  { name = "O'Neill", sex = "M", tel = "98", },
}
-- Adding content to the table
for i, row in ipairs (data) do
  row.id = i
  db:insert ("people", row)
end
Select revisited
We can add the same facility to assemble the SQL statement of our result
set iterator. The second argument, that is, the statement, can be replaced by
a string with a list of columns, followed by the table name, the conditional
expression and any other text. In this way, the iterator will be responsible for
adding some words to guarantee the correct syntax of the statement:
function select (self, columns, tabname, cond, other, modestring)
  -- Assemble the SQL statement
  tabname = tabname and (" from "..tabname) or ""
  cond = cond and (" where "..cond) or ""
  other = other or ""
  local stmt = string.format ("select %s%s%s %s",
    columns, tabname, cond, other)
  -- Do the query
  local cur = self:assertexec (stmt)
  return function ()
    local t
    if modestring then t = {} end
    return cur:fetch (t, modestring)
  end
end
Sometimes it is important to hide the internals of the implementation. In
this case, however, I believe it is better to expose it. In other words, it is
important for the programmer to know that the arguments will be joined to form
the final SQL statement, because he is able to use this to his own advantage.
The programmer can exploit the fact that the list of columns is not just a list
of columns and add more text to enhance the SQL statement being built, such as
renaming a column, concatenating two or more columns with a string separator,
and so on. The same applies to all of the arguments4. Now we can automatically
check each column of each row against the original data:
for row in db:select ("*", "people", nil, "order by id", "a") do
  row.id = tonumber(row.id)
  for col, val in pairs (data[row.id]) do
    assert (row[col] == val)
  end
end
A subtle point to note is the release of open cursors, which are confidently
left to the garbage collector. LuaSQL’s implementation of fetch5 already closes
the cursor when there are no more rows to return, but if the iterator is not
called to the end, the cursor remains open. In systems with severe resource
restrictions, this practice could make the system run out of resources, so the
select iterator has to be used with care. The most effective way to avoid this
situation is to create queries that return the exact number of rows needed, so
that the loop will call fetch until there are no more rows and the cursor will
be closed. Nevertheless, the raw LuaSQL connection is accessible via the conn
field and the usual execute-fetch loop could be used.
4 In fact, this new implementation can be used just like the others with the raw SQL statement
(removing the “select” word from the beginning) as in: db:select"* from people order by id".
5 This behavior (closing the cursor when there are no more rows) is in part a consequence of
writing this article and was planned to be added to LuaSQL version 2.2, which should have been
released by the time this article is published.
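When only the first few rows are wanted, the execute-fetch loop on the raw connection can close the cursor explicitly instead of waiting for the garbage collector. The sketch below assumes a db object whose conn field holds the LuaSQL connection, as described above; firstrows is a hypothetical helper name, not part of the library:

```lua
-- Return at most `limit` rows of a query as tables indexed by column name,
-- closing the cursor explicitly. `firstrows` is a hypothetical helper.
local function firstrows (db, stmt, limit)
  local cur = assert (db.conn:execute (stmt))
  local rows = {}
  while #rows < limit do
    local row = cur:fetch ({}, "a")
    if not row then break end
    rows[#rows + 1] = row
  end
  cur:close ()   -- do not leave the cursor to the garbage collector
  return rows
end
```

A call such as firstrows (db, "select * from people order by id", 2) would then release the cursor even though the result set has more rows.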
Delete
The delete method should be simple, following the same guidelines used for the
insert method: the name of the table and a condition.
As with the select method, the tablename argument can be the complete
SQL statement including the condition6, so the last argument is optional.
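A delete method along these lines could be sketched as follows. It assumes the assertexec connection method that the insert method already relies on; the argument names are chosen here for illustration:

```lua
-- Sketch of a delete method: table name plus an optional condition.
local function delete (self, tabname, cond)
  cond = cond and (" where " .. cond) or ""
  return self:assertexec (string.format ("delete from %s%s", tabname, cond))
end
```

Because the condition is simply appended, passing the whole tail of the statement as the first argument, as in footnote 6, produces the same SQL text.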
Update
While the insert command requires two comma-separated lists (for column names
and column values), the update command requires a single comma-separated
list of pairs in the form column-name = column-value. Since the where clause
also requires a similar list of pairs, we will define a function to cover both uses
by accepting an optional separator. By providing the string " AND " as the
separator, the function can be used to form a typical condition for the where clause.
Since Lua does not guarantee the order of the traversal of a table, I added
a call to table.sort so that pairslist always produces the same string for the
same contents of a table. This predictability makes it easier to test the function.
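One possible shape for pairslist and for the update method built on it is sketched below. The quote function here is a compact stand-in for the one discussed earlier, and the argument order of update is an assumption made for this example:

```lua
-- Stand-in for the quote function: parenthesized strings pass through,
-- everything else is escaped and enclosed in single quotes.
local function quote (s)
  s = tostring (s)
  if s:find ("^%(.*%)$") then return s end
  return "'" .. (s:gsub ("'", "\\'")) .. "'"
end

-- Build "col=value" pairs joined by sep (a comma by default).
-- Sorting makes the result independent of the traversal order.
local function pairslist (tab, sep)
  local list = {}
  for key, val in pairs (tab) do
    list[#list + 1] = key .. "=" .. quote (val)
  end
  table.sort (list)
  return table.concat (list, sep or ",")
end

-- Sketch of an update method; the condition is optional.
local function update (self, tabname, contents, cond)
  cond = cond and (" where " .. cond) or ""
  return self:assertexec (string.format ("update %s set %s%s",
    tabname, pairslist (contents), cond))
end

print (pairslist ({ name = "Jane", id = 2 }, " AND "))
```

The same pairslist call that fills the set clause can produce the where clause when given " AND " as the separator.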
6 As in db:delete"table where status='invalid'".
Extensions
Here we explore the facility to extend this library by showing a pair of examples
and proposing others. LuaSQL is not supposed to be extended directly and for
security reasons it has to be that way. On the other hand, our library is easily
extensible, like most Lua libraries.
Other ideas
An even more sophisticated (and also useful) extension is the implementation of
a pool or a cache of database connections encapsulated by the connect method.
In both cases, care must be taken with the release of open transactions and also
with the sharing policy.
A cache of database connections should improve the efficiency of an applica-
tion that repeatedly connects to the same database, does some stuff and release
7 http://forge.mysql.com/wiki/MySQL_Proxy
Discussion
Our implementation is now complete, although it can receive some additions.
The test case should be much better developed but as the goal here was to show
how to do it, I left this task to be done as an exercise.
I think the main functions should check their arguments’ types whenever
possible, using features such as the luaL_check* set of functions in the Lua
auxiliary library. However, these C functions are not exported to Lua, and
implementing them in Lua can cause a significant performance penalty. In fact,
this could be the subject for another gem.
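To give an idea of what such a check costs when written in Lua, here is a small sketch. The helper checkarg and its message format are assumptions modeled loosely on the messages of the auxiliary library; they are not part of LuaSQL or of the library described here:

```lua
-- Hypothetical helper in the spirit of luaL_checktype.
local function checkarg (value, expected, position, fname)
  local actual = type (value)
  if actual ~= expected then
    -- level 3 blames the caller of the function being checked
    error (string.format ("bad argument #%d to '%s' (%s expected, got %s)",
      position, fname, expected, actual), 3)
  end
  return value
end

-- Example use in a method; positions are counted after self.
local function insert (self, tabname, contents)
  checkarg (tabname, "string", 1, "insert")
  checkarg (contents, "table", 2, "insert")
  -- ... build and execute the statement here ...
  return true
end
```

Each call adds a function call and a type comparison per argument, which is where the performance penalty mentioned above comes from.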
A last but not less important point regards my decisions on the API style
and its organization. I chose the arguments of the SQL constructors guided
by our own usage and, I have to confess, changed them a little while writing
this document. Take the result set iterator, for instance: it could receive five
lengthy arguments which can make the call difficult to understand. To reduce
this problem, the function could have been implemented to receive a table with
the arguments in particular fields, as the following example:
db:select{
  columns = "col1, col2, col3, col4",
  from = "tablename t inner join othertable o",
  having = "t.fk = o.id and t.col3 > 10",
  groupby = "...",
}
The drawbacks of this approach are the growth of the library size and the
possibility of having to deal with differences between the accepted SQL syntax
of the databases or even limit the use of particular extensions.
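A constructor that accepts such a table could be sketched as follows. The function name selectstmt and the where and orderby field names are assumptions for this example; only columns, from, having and groupby appear in the call above:

```lua
-- Hypothetical builder: assembles a select statement from named fields.
local function selectstmt (args)
  local parts = { "select " .. (args.columns or "*") }
  local clauses = {            -- clause order is fixed by SQL
    { "from", " from " },
    { "where", " where " },
    { "groupby", " group by " },
    { "having", " having " },
    { "orderby", " order by " },
  }
  for _, clause in ipairs (clauses) do
    local field, keyword = clause[1], clause[2]
    if args[field] then
      parts[#parts + 1] = keyword .. args[field]
    end
  end
  return table.concat (parts)
end

print (selectstmt { columns = "id, name", from = "people", orderby = "id" })
```

The fixed list of clauses is exactly where database-specific syntax would have to be handled, which illustrates the drawback mentioned above.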
The functions I grouped as “infrastructure” (escape, quote, twolists and
pairslist) could be generalized and stored in pre-existing packages, such as
string and table. Additionally, I packaged all SQL constructors into another
file which helped reuse the select constructor to build sub-selects. I do not think
there is any canonical way to decide whether to put a function in a new module
or inside a pre-existing one.
Conclusion
To illustrate the point we have made, let us rewrite the first example in the
article using the tools developed:
-- Initialization (1)
local database = require"database"
local db = database.connect("lpg", "tomas", nil, "postgres")
-- SQL execution and error handling (3) and Iteration loop (4)
for id, name in db:select("a.id, a.name", tab, cond) do
  print(id, name)
end
There is a huge difference from the previous version to this one. The former
explicitly checked for errors, while in the new one, this is performed automati-
cally by the library functions. The iteration loop is now a concise for-construct
without repeated calls to the fetch method. In addition, the SQL statement
construction is now much better supported, which helps build correct and more
legible code in a convenient way.
Finally, this library lays new ground on which other abstractions can
be defined. Some applications are already built on top of it, and so are
other libraries. An example is a module that provides facilities for the definition
of classes and objects directly associated with database tables and rows. It takes
advantage of the homogeneity of the API (insert and update methods) and also
of the way SQL statements are created (a table of fields becomes a where clause).
6 Bootstrapping a Forth in 40 Lines of Lua Code
Eduardo Ochs
Introduction
The real point of this article is to propose a certain way of implementing a Forth
virtual machine; let’s call this new way “mode-based”. The main loop of a mode-
based Forth is just this:
while mode ~= "stop" do modes[mode]() end
In our mode-based Forth, which is implemented in Lua and that we will refer
to as “miniforth”, new modes can be added dynamically very easily. We will
start with a virtual machine that “knows” only one mode — “interpret”, which
corresponds to less than half of the “outer interpreter” of traditional Forths —
and with a dictionary that initially contains just one word, which means “read
the rest of the line and interpret that as Lua code”. That minimal virtual
machine fits in 40 lines of Lua, and is enough to bootstrap the whole system.
But, “Why Forth?”, the reader will ask. “Forth is old and weird, why shouldn’t
we stick to modern civilized languages, and ignore Forth? What do you still
like in Forth?”. My feeling here is that Forth is one of the two quintessential
extensible languages, the other one being Lisp. Lisp is very easy to extend and
to modify, but only within certain limits: its syntax, given by ‘read’, is hard to
change(1). If we want to implement a little language (as in [1]) with a free-form
syntax on top of Lisp, and we know Forth, we might wonder whether the
right tool for that should have characteristics from both Lisp and Forth.
And this is where Lua comes in — as a base language for building extensible
languages.
Disclaimer: I’m using the term “Forth” in a loose sense throughout this article.
I will say more about this in the last section.
Figure 1. A 16-bit Forth with primitives. Forth instructions with very high values are
primitives.
Figure 2. A 16-bit Forth with no primitives. All Forth instructions point to heads
(double boxes); each head points to a routine in 8086 machine code.
Figure 3. An imaginary 16-bit Forth with 1-byte heads and variable-length Forth
instructions.
Figure 4. Miniforth. Heads and Forth primitives are represented by strings in the
memory cells. Forth non-primitives are represented by numbers.
Bootstrapping miniforth
The program in Listing 1 is all that we need to bootstrap miniforth. It defines
the main loop (run), one mode (interpret), the dictionary (_F), and one word in
the dictionary: %L, meaning “evaluate the rest of the current line as Lua code”.
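A condensed sketch of such a bootstrap, written from the description above rather than copied from Listing 1, could look like this. The names run, modes, _F, eval, parsebypattern, subj, pos and mode appear in the text; getwordornewline and the exact patterns are assumptions:

```lua
-- Works on Lua 5.1 (loadstring) and on later versions (load).
local compile = loadstring or load

-- Globals on purpose: code evaluated by %L must be able to see them.
eval = function (str) return assert(compile(str))() end

subj, pos, mode = "", 1, "interpret"

-- Match pat at position pos; on success advance pos and return the capture.
parsebypattern = function (pat)
  local capture, newpos = string.match(subj, pat, pos)
  if newpos then pos = newpos; return capture end
end

-- Assumed helper: skip blanks, then read a word or a newline.
getwordornewline = function ()
  parsebypattern("^([ \t]*)()")
  return parsebypattern("^([^ \t\n]+)()") or parsebypattern("^(\n)()")
end

_F = {}
_F["%L"] = function () eval(parsebypattern("^([^\n]*)()")) end

modes = {}
modes.interpret = function ()
  local word = getwordornewline() or ""   -- "" means end of text
  local action = _F[word] or error("Can't interpret: " .. word)
  action()
end

run = function () while mode ~= "stop" do modes[mode]() end end

-- A tiny program: define end-of-line and end-of-text, then run some Lua.
subj = [=[
%L _F["\n"] = function () end
%L _F[""] = function () mode = "stop" end
%L result = 6 * 7
]=]
pos = 1; mode = "interpret"; run()
print(result)
```

Only primitives stored in _F are handled here; numbers and non-primitives would need extra cases in modes.interpret.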
The program below is a first program in miniforth. It starts with only "%L"
defined and it defines several new words: what to do on end-of-line, on end-of-
text, and "[L", which evaluates blocks of Lua code that may span more than one
line; then it creates a data stack DS and defines the words "DUP", "*", "5", and
".", which operate on it.
subj = [=[
%L _F["\n"] = function () end
%L _F[""] = function () mode = "stop" end
%L _F["[L"] = function () eval(parsebypattern("^(.-)%sL]()")) end
[L
  DS = { n = 0 }
  push = function (stack, x)
      stack.n = stack.n + 1; stack[stack.n] = x end
  pop = function (stack)
      local x = stack[stack.n]; stack[stack.n] = nil;
      stack.n = stack.n - 1; return x end
  _F["5"] = function () push(DS, 5) end
  _F["DUP"] = function () push(DS, DS[DS.n]) end
  _F["*"] = function () push(DS, pop(DS) * pop(DS)) end
  _F["."] = function () io.write(" "..pop(DS)) end
L]
]=]
After running this program the system is already powerful enough to run
simple Forth programs like, for example,
5 DUP * .
Note that to “run” this Forth program what we need to do is:
subj = "5 DUP * ."; pos = 1; mode = "interpret"; run()
It is as if we were setting the memory (here the subj) and the registers of a
primitive machine by hand, and then pressing its “run” button. Clearly, that
interface could be made better, but here we have other priorities.
Listing 1.
The programs above don’t have support for non-primitives; this will have to
be added later. Look at Figure 4: non-primitives, like “SQUARE”, are represented
in the bytecode as numbers (addresses of heads in the memory[]) and we have
not introduced either the memory or the states “head” or “forth” yet.
Note that the names of non-primitives do not appear in the memory, only
in the dictionary, _F. For convenience in such memory diagrams we will draw
the names of non-primitives below their corresponding heads. For instance, in
Figure 4, we have _F["SQUARE"] = 1 and _F["CUBE"] = 5.
Modes
When the inner interpreter runs (i.e., when the mode is “head” or “forth”; see
Figure 5), at each step the processor reads the contents of the memory at IP
and processes it. When the outer interpreter runs, at each step it reads a word
from subj starting at pos, and processes it. There’s a parallel between these
behaviors...
I have never seen any references to “modes” in the literature about Forth.
In the usual descriptions of inner interpreters for Forth, the “head” mode is not
something separate; it is just a transitory state that is part of the semantics of
executing a Forth word. Also, the “interpret” and “compile” modes do not exist:
the outer interpreter is implemented as a Forth word containing a loop; it reads
one word at a time, and depending on the value of a state variable, it either
“interprets” or “compiles” that word. So, in a sense, “interpret” and “compile”
are “virtual modes”. . .
Let me explain how I arrived at this idea of “modes” — and what I was trying
to do that led me there.
Some words interfere with the variables of the outer interpreter. For example,
":" reads the word that pos is pointing at (for example, SQUARE), adds a
definition for that word (SQUARE) to the dictionary, and advances pos. When
control returns to modes.interpret(), the variable pos is pointing to the position
after SQUARE, so modes.interpret() never tries to process the word SQUARE.
Obviously, this can be used to implement new languages, with arbitrary syntax,
on top of Forth.
Some words interfere with the variables of the inner interpreter — they mod-
ify the return stack. Let’s use a more colorful terminology: we will speak of
words that “eat text” and of words that “eat bytecode”. As we have seen, ":" is
a word that eats text; numerical literals are implemented in Forth code using a
word, LIT, that eats bytecode. In the program below,
: DOZENS 12 * ; --> ok
5 DOZENS . 60 --> ok
the word DOZENS is represented in bytecode in miniforth as:
memory = {"DOCOL", "LIT", 12, "*", "EXIT"}
--        1        2      3   4    5
--        DOZENS
When the LIT in DOZENS executes, it reads the 12 that comes after it, and
places it on the data stack; then it changes the return stack so that in the next
step of the main loop the IP will be 4, not 3. Here is a trace of its execution; note
that there is a new mode, “lit”. The effect of “executing” the 12 in memory[3] in
mode “lit” is to put the 12 in DS.
The code in Lua for the primitive LIT and for the mode “lit” can be synthe-
sized from the trace. By analyzing what happens between steps 2 and 3, and 3
and 4, we see that LIT and “lit” must be:
so from this point on we will consider that the traces give enough information,
and we will not show the corresponding code.
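For readers who want something to run, here is a sketch of an inner interpreter that is just large enough to execute DOZENS, including a LIT primitive and a “lit” mode that behave as described above. The tables heads and prims and the helper fetchcell are names assumed for this example; the stacks follow the conventions of the bootstrap program:

```lua
-- Stacks keep their size in field n, as in the bootstrap program.
local function push (stack, x) stack.n = stack.n + 1; stack[stack.n] = x end
local function pop (stack)
  local x = stack[stack.n]
  stack[stack.n] = nil; stack.n = stack.n - 1
  return x
end

local memory = {"DOCOL", "LIT", 12, "*", "EXIT"}   -- DOZENS, head at address 1
local DS, RS = { n = 0 }, { n = 0 }
local mode
local modes, heads, prims = {}, {}, {}   -- heads and prims: assumed names

-- Read the cell that the top of RS points to, then advance it.
local function fetchcell ()
  local cell = memory[RS[RS.n]]
  RS[RS.n] = RS[RS.n] + 1
  return cell
end

heads["DOCOL"] = function () mode = "forth" end
prims["LIT"]   = function () mode = "lit" end
prims["*"]     = function () push(DS, pop(DS) * pop(DS)) end
prims["EXIT"]  = function ()
  pop(RS)
  if RS.n == 0 then mode = "stop" end
end

modes.head  = function () heads[fetchcell()]() end
modes.forth = function ()
  local instr = fetchcell()
  if type(instr) == "number" then   -- a non-primitive: call its head
    push(RS, instr); mode = "head"
  else
    prims[instr]()
  end
end
modes.lit = function ()             -- eat one cell of bytecode as data
  push(DS, fetchcell())
  mode = "forth"
end

-- Equivalent of "5 DOZENS": put 5 on DS and call the head at address 1.
push(DS, 5); push(RS, 1); mode = "head"
while mode ~= "stop" do modes[mode]() end
print(DS[DS.n])
```

Because “lit” advances the top of RS after reading the 12, the next step in mode “forth” reads cell 4, not cell 3.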
Note that different modes read what they will execute from different places:
“head”, “forth”, and “lit” read from memory[RS[RS.n]] (they eat bytecode),
whereas “interpret” and “compile” read from subj, starting at pos (they eat text).
Our focus here will be on modes and words that eat bytecode.
Virtual modes
How can we create words that eat bytecode, like LIT, in Forth? In the program
below, the word TESTLITS calls first LIT, then VLIT; VLIT should behave similarly
to LIT, but LIT is a primitive and VLIT is not.
This is a full solution, so start by ignoring the cells 2, 3, and 4 of the memory,
and the lines t=5 to t=8 of the trace. From t=5 to t=9 what we need to do is
where the -1 is a magic number: roughly, the number of “call frames” in the
stack between the call to VLIT and the code that will read its literal data,
negated. In other situations this could be -2, -3, ... One way to get rid of that
magic number is to create a new stack — the “parsing stack” (PS) — and to have
“parsing words” that parse bytecode from the position that the top of PS points
to; then a word like VLIT becomes a variation of a word, PCELL, that reads a cell
from memory[PS[PS.n]] and advances PS[PS.n]. The code for VLIT given above
shows how that is done — we wrap PCELL as "R>P PCELL P>R" — and from the
trace we can infer how to define these words.
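One way these words could be defined is sketched below. The memory layout (VLIT at address 1, TESTLITS at address 6) and the exact behavior of R>P and P>R are assumptions inferred from the text: R>P moves the caller's position, which sits just below the top of RS, to PS, and P>R moves it back.

```lua
local function push (stack, x) stack.n = stack.n + 1; stack[stack.n] = x end
local function pop (stack)
  local x = stack[stack.n]
  stack[stack.n] = nil; stack.n = stack.n - 1
  return x
end

local memory = {
  "DOCOL", "R>P", "PCELL", "P>R", "EXIT",   -- VLIT     (head at 1)
  "DOCOL", "LIT", 123, 1, 234, "EXIT",      -- TESTLITS (head at 6)
}
local DS, RS, PS = { n = 0 }, { n = 0 }, { n = 0 }
local mode
local modes, heads, prims = {}, {}, {}      -- assumed table names

-- Read the cell that the top of the given stack points to, then advance it.
local function fetchcell (stack)
  local cell = memory[stack[stack.n]]
  stack[stack.n] = stack[stack.n] + 1
  return cell
end

heads["DOCOL"] = function () mode = "forth" end
prims["EXIT"]  = function () pop(RS); if RS.n == 0 then mode = "stop" end end
prims["LIT"]   = function () mode = "lit" end
prims["PCELL"] = function () push(DS, fetchcell(PS)) end
-- The caller's position is the element just below the top of RS.
prims["R>P"]   = function ()
  local own = pop(RS); push(PS, pop(RS)); push(RS, own)
end
prims["P>R"]   = function ()
  local own = pop(RS); push(RS, pop(PS)); push(RS, own)
end

modes.head  = function () heads[fetchcell(RS)]() end
modes.lit   = function () push(DS, fetchcell(RS)); mode = "forth" end
modes.forth = function ()
  local instr = fetchcell(RS)
  if type(instr) == "number" then push(RS, instr); mode = "head"
  else prims[instr]() end
end

push(RS, 6); mode = "head"                  -- run TESTLITS
while mode ~= "stop" do modes[mode]() end
print(DS[1], DS[2])
```

After the run both literals are on the data stack: the first was read by the primitive LIT, the second by VLIT through the parsing stack.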
Note that the transition from t=2 to t=3 corresponds to the transition from
t=4 to t=10; the mode being “lit” corresponds to having the address of the head
of VLIT at the top of RS, and the mode being “head”; using this idea we can
implement virtual modes in Forth. Better yet: it all becomes a bit simpler if we
regard the mode as being an invisible element that is always above the top of
RS. So, an imaginary mode “vlit” would be translated, or expanded, into a 1 (the
head of VLIT), plus a mode “head”; or another word, similar to VLIT, would just
switch the mode to “vlit”, and the action of that word would be to expand it into
the head of VLIT, plus the mode “head”.
PCELL, in the sense that it reads the data of the polynomial from the memory,
starting at the position PS[PS.n], and advancing PS[PS.n] at each step. This
PPOLY takes a value from the top of the data stack (it will be 10 in our
examples) and replaces it with the result of applying P to it, P(10), which
is 2345.5 for the example above.
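The behavior of PPOLY can be sketched in plain Lua. The data layout used here (a coefficient count followed by the coefficients, highest degree first) is an assumption, chosen so that the polynomial 2x^3 + 3x^2 + 4x + 5.5 gives 2345.5 at x = 10; the stacks are plain arrays for brevity:

```lua
-- Two heads followed by the polynomial data, which starts at cell 3.
local memory = { "DOPOLY", "DOADDR", 4, 2, 3, 4, 5.5 }
local DS, PS = { 10 }, { 3 }   -- x = 10; the parsing stack points at the data

-- Read the cell that the top of PS points to, then advance it.
local function pcell ()
  local top = #PS
  local cell = memory[PS[top]]
  PS[top] = PS[top] + 1
  return cell
end

-- Replace the top of DS by P(top), evaluating P with Horner's rule.
local function ppoly ()
  local x = table.remove (DS)
  local result = 0
  for _ = 1, pcell () do       -- first cell: number of coefficients
    result = result * x + pcell ()
  end
  DS[#DS + 1] = result
end

ppoly ()
print (DS[#DS])
```

After the call, the top of PS points just past the polynomial data, so further parsing words could continue from there.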
By defining POLY from PPOLY, as we defined VLIT from PCELL
we get a word that eats bytecode; a call to POLY should be followed by data of a
polynomial, just like LIT is followed by a number. And we can also do something
else: we can create new heads, DOPOLY and DOADDR, and represent polynomials as
two heads followed by the data of the polynomial. The program and trace below
test this idea.
The trace above does not show what &P(X) does; the effect of running &P(X) is
to put the address of the beginning of data of the polynomial, namely, 3, into the
data stack. Note how a polynomial — which in most other languages would be a
piece of passive data — in Forth is represented as two programs, P(X) and &P(X),
that share their data. Compare that with the situation of closures in Lua — two
closures created by the same mother function, and referring to variables that
were local to that mother function, share upvalues.
(Meta)Lua on miniforth
The parser for the language for Propositional Calculus in the last section had to
be recursive, but it didn’t need backtracking to work. Here is a language that
is evidently useful, even if in this context it looks like an academic exercise,
and whose parser needs a bit of backtracking, or at least lookahead. Consider
the following program in Lua:
foo = function ()
  local storage
  return function () return storage end,
         function (x) storage = x end
end
memory = {
"foo", "=", "function", "(", ")",
"local", "storage",
"return", "function", "(", ")", "return", "storage", "end", ",",
"function", "(", "x", ")", "storage", "=", "x", "end",
"end",
"<eof>" }
One way of “executing” this bytecode made of string tokens could be to produce
in another region of the memory a representation in Lua of the bytecode
language that the Lua VM executes; another would be to convert it to another
sequence of string tokens, like what MetaLua [5] does. Anyway, there’s
nothing special about our choice of Lua here: Lua just happens to be a simple
language that we can suppose that the reader knows well, but it could have been
any language. And as these parsers and transformers would be written in Lua,
they would be easy to modify.
Why Forth?
Caveat lector: there is no single definition for what “Forth” is. . . Around 1994
the community had a big split, with some people working to create an ANSI
Standard for Forth, and the creator of the language and some other people going
in another direction, and not only creating new Forths that went against ideas
of the Standard, but also stating that ANS Forth “was not Forth”. I can only
write this section clearly and make it brief if I choose a very biased terminology;
also, I’m not going to be historically precise, either — I will simplify and distort
the story a bit to get my points across. You have been warned!
Forth was very popular in certain circles at a time when computers were
much less powerful than those of today. Some of the reasons for that popularity
were easy to quantify: compactness of programs, speed, proximity to machine
code, simplicity of the core of the language, i.e., of the inner and the outer
interpreters. None of these things matter so much anymore: computers got
bigger and faster, their assembly languages became much more complex, and
we’ve learned to take for granted several concepts and facilities — malloc and
free, high-level data structures, BNF — and now we feel that it is “simpler” to
send characters through stdout than poking bytes at the video memory. Our
notion of simplicity has changed.
In the mid-90s came the ANS-Forth Standard, and with it a way to write
Forth source that would run without changes in Forths with different memory
models, on different CPU architectures. At about the same time the creator
of the language, Chuck Moore, started to distance himself from the rest of the
community, to work on Forths that were more and more minimalistic, and on
specialized processors that ran Forth natively.
interesting ways of thinking that have practically disappeared, and that have
become hard to communicate.
Conclusion
After a draft of this article had been written, Marc Simpson engaged in a long
series of discussions with me about Forths, Lisp, SmallTalk, several approaches
to minimality, etc., and at one point, over the course of one hectic weekend
in December, 2007, he implemented a usable (rather than just experimental)
dialect of Forth — based mainly on Frank Sergeant’s Pygmy Forth and Chuck
Moore’s cmForth, and borrowing some ideas from this article — on top of Ruby
(“RubyForth”), and later ported his system to Python and C. A port of it to Lua
is underway.
I thank Marc Simpson and Yuri Takhteyev for helpful discussions.
References
[1] Jon Bentley: More Programming Pearls, Addison-Wesley, 1990 (chapter 9:
Little Languages).
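7 Effecting Large-Scale Change (with little trauma) using Metatables
Sérgio Alvares Maffra and Pedro Miller Rabinovitch
Introduction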
In real-world conditions, software maintenance becomes as important as software
development. Environments change, business partners choose different
strategies, technology evolves (and devolves). Distributed systems become
centralized, and centralized ones get scattered around. In particular, requirements
have a peculiar way of being significantly altered once a project approaches
completion... or the day after it has been deployed.
It becomes more and more important to be able to quickly adapt to these
changing conditions. In the following sections we’ll discuss the application of a
particularly powerful feature of Lua to this end. We’ll start by presenting a short
review of metamethods along with a couple of simple examples. We will then
show how Lua’s metatables were used to dramatically change the performance
profile of an application with little effort. Finally, we conclude by showing a few
examples where metatables can help developers change the key features of a
system even if it’s already in a late development phase or even in production.
Copyright © 2008 by Sérgio Alvares Maffra and Pedro Miller Rabinovitch. Used by permission.
72 7 · Effecting Large-Scale Change (with little trauma) using Metatables
local tableOfHeavyData = {}
setmetatable( tableOfHeavyData,
  {
    __index = function( tbl, key )
      -- calculate the required value
      local data = performHeavyComputing( key )
      -- cache response in the table
      tbl[key] = data
      return data
    end
  }
)
-- do serious computing
function performHeavyComputing( x )
  print( "computing the value of "..x )
  return x * x -- dude. Heavy.
end
-- access keys 2, 1, 2, 3 (driver loop inferred from the output below)
for _, key in ipairs{ 2, 1, 2, 3 } do
  print( "value for "..key.." is "..tableOfHeavyData[key] )
end
--[[ Output:
computing the value of 2
value for 2 is 4
computing the value of 1
value for 1 is 1
value for 2 is 4
computing the value of 3
value for 3 is 9
--]]
being of “high performance”. Lines such as “we’ll just tell them to buy more
RAM” are heard and management is confident of the project success. Boat
catalogs are browsed.
This is all well and good until requirements change. Perhaps the code will
have to run on a less powerful platform, such as portable devices. Perhaps the
client can’t afford the extra budget for better equipment. Or perhaps the testing
cycle just got way too long, since each time the application is run, everything is
loaded into memory in one big shot.
Now we have a problem. Our hypothetical application is bloated; its modules
and libraries are not well separated; the libraries it depends on are taking
many more cycles than expected; what is one to do? Picture modules named
pic.lua that define functions with naming conventions as diverse as pic_open,
PICdecode, and pngPICformat. Add a couple of list_of_pictures or images
tables. Multiply that by, say, 30 or 40 functions defined in each of 20+ modules.
Throwing a couple of interns at the problem probably won’t give the best results.
We had such a problem in an actual application we developed — in our case,
the graphical interface library which the application depended on went through
a large change and started taking a lot longer to create dialogs. The change
was for the better as far as the GUI presentation was concerned, of course.
But running it, even if one was just trying to check on the latest changes, was
taking way too long. Granted, our application was not the nightmarish vision
we presented above, but we’re making a point here.
Well, if your development is in a language as powerful as Lua, you can solve
your problems1 in about an hour with the judicious use of metamethods and the
replacement of a couple of system functions. We will present a solution as we
analyze a sample implementation in the following section.
solved, but the perverse naming conventions are more into the realm of physical punishment.
that would require us to track down and alter every reference made to each of
the dialogs used in the application. Thankfully, all GUI code was written in Lua,
which allowed us to adopt a better approach.
The loading time could be reduced if the require calls that created dialogs
were removed from our initialization methods. But that would leave us with
a lot of missing values on our hands. As mentioned in Section 7, the __index
metamethod can be used to provide missing values on the fly. Therefore, our
dialogs could be created when needed by loading their defining modules in an
__index metamethod set in the global environment.
By using this solution we avoided going through all the code of the application.
It required, however, knowing the values defined by each module in
the application. A simple table that maps variable names to the names of the
modules defining them, like the one defined in Listing 2, is all that was necessary.
Granted, creating this table can still require a lot of work. Fortunately, the table
in Listing 2 was generated automatically by using the __newindex metamethod.
We have implemented the solution in the form of a library we’ve dubbed
“Origins”. The library works in a two-phase approach:
function origins:startWatching()
  origins.original_require = require
  require = origins.new_require
  setmetatable( getfenv(0), origins_metatable )
end
function origins:stopWatching()
  setmetatable( getfenv(0), nil )
  require = origins.original_require
end
Setup
During the setup phase we use the __newindex metamethod to establish which
module is providing each global function and variable. This is done by the
function startWatching presented in Listing 3. First, we hook the loading
functions we’re interested in (require in this case; dofile or any other loading
function could be processed as well) by replacing their global reference with our
own versions. These work as illustrated in Listing 4, keeping track of the Lua
file (and therefore library module) being currently processed. Our metatable
is set on the global environment (acquired via getfenv(0)). The __newindex
metamethod, shown in Listing 5, notes each global variable that is set and keeps
track of the Lua file that was being processed at the time of its definition.
After these proceedings, we can load our application normally. As each new
global is set, our variable catalog is built, and by the time the application is up
and running — having loaded every variable we’re interested in — we can save
the stored catalog in a data file that will be used in run time. This is illustrated
in Listing 6, and a sample data file is represented in Listing 2. Notice that the
data file is simply a Lua code file and that the table data keeps an entry for each
variable with the path to its original loading module.
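The setup phase can be tried out in isolation with the sketch below. It is an illustration under several assumptions: the module dialogs.about and its global showAboutDialog are invented, the module is provided through package.preload so that no extra files are needed, and _G is used instead of getfenv(0) so that the code also runs on Lua versions without getfenv:

```lua
local origins = { data = {}, currentFilename = nil }

-- Replacement for require that remembers which module is being loaded.
origins.original_require = require
function origins.new_require (name)
  local previous = origins.currentFilename
  origins.currentFilename = name
  local result = origins.original_require (name)
  origins.currentFilename = previous   -- restore after nested loads
  return result
end

-- Stand-in for a real module file (invented for this example).
package.preload["dialogs.about"] = function ()
  function showAboutDialog () return "about" end
end

-- Start watching: hook require and record the origin of each new global.
require = origins.new_require
setmetatable (_G, {
  __newindex = function (tbl, key, value)
    origins.data[key] = origins.currentFilename
    rawset (tbl, key, value)
  end,
})

require ("dialogs.about")

-- Stop watching.
setmetatable (_G, nil)
require = origins.original_require

-- Write the catalog as Lua code, one entry per global.
local lines = {}
for name, source in pairs (origins.data) do
  lines[#lines + 1] = string.format ("  [%q] = %q,", name, source)
end
table.sort (lines)
local catalog = "origins.data = {\n" .. table.concat (lines, "\n") .. "\n}\n"
print (catalog)
```

The printed text is itself Lua code, so saving it to a file gives a data file that a later dofile can read back.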
local origins_metatable = {
  __newindex = function( table, key, value )
    --print( "[origins] newindex: ", key )
    origins.data[key] = origins.currentFilename
    rawset( table, key, value )
  end,
  __index = function( table, key )
    --print( "[origins] index: ", key )
    local source = origins.data[key]
    if source then
      origins.original_require( source )
    end
    return rawget( table, key )
  end,
}
Quick setup
It is true that one can execute the setup phase by running the application as
normal after calling startWatching. However, Listing 7 shows an alternative —
artificially loading each module used by the application in a single stretch. This
will be enough to refresh all necessary data in most cases, and a proper data
file can be generated. This has the additional benefit that we can easily call the
setup script in an automated build system, automatically updating the data file
as we change the modules.
Runtime
At run time, the application doesn’t load its libraries at startup as usual. Instead,
it loads the “origins” data file through the loadData function, which is
depicted in Listing 8. This loads the reference catalog (through a trivial dofile)
and sets the runtime __index metamethod, shown in Listing 5, which looks for a
reference in the catalog in order to load the required module.
After the loading is done, the value that was being sought should be available.
The application is not even aware that a module was loaded on
the fly, since the variable access that triggered the metamethod simply returns
the value once the metamethod has finished running.
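The runtime behavior can be observed with a small self-contained sketch. The catalog contents, the module dialogs.prefs and the global showPrefsDialog are invented for this example, and the module is again supplied through package.preload instead of a file:

```lua
-- Catalog as produced during setup: global name -> defining module.
local catalog = { showPrefsDialog = "dialogs.prefs" }

-- Stand-in for a real module file; counts how often it is loaded.
local loaded = 0
package.preload["dialogs.prefs"] = function ()
  loaded = loaded + 1
  function showPrefsDialog () return "prefs" end
end

-- Missing globals are looked up in the catalog and loaded on demand.
setmetatable (_G, {
  __index = function (tbl, key)
    local source = catalog[key]
    if source then require (source) end
    return rawget (tbl, key)
  end,
})

print (loaded)               -- nothing has been loaded at startup
print (showPrefsDialog ())   -- the first access triggers the require
print (showPrefsDialog ())   -- later accesses find the global directly
print (loaded)
```

Once the module has defined the global, the __index metamethod is no longer consulted for that name, so the cost is paid only on the first access.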
Limitations
The library uses variable and function names as keys; therefore, conditional
file loading might cause problems if the files define functions with the same name.
Consider the code in Listing 9. The execution path taken during setup time
would be the one “Origins” considers as the source of out_write.
There is a way to pause processing, however (the stopWatching method). This
enables developers to circumvent said limitation. The use of a quick setup script
as described previously would also resolve the situation by not including either
module and only loading them at the original point in run time, as intended by
the code.
Even if conditional loading is not an issue, what should happen if a global
variable is set in more than one module? Such a naming conflict does not
have a resolution we can deem as correct, since even in Lua, the name clash
would cause one of the values to be overwritten. We chose not to address this
issue in this implementation, but some alternatives would include loading all
the modules that defined the conflicting variable (perhaps in the order they
originally appeared), printing out warning messages, or even firing an error
during setup.
Further development
The system introduced here could be the basis for further development. Con-
sider a scheme where standard Lua libraries installed at predetermined path
locations were loaded at run time as code in execution needed them, without
preloading them through require. We could delay the loading of library modules
until their code is actually needed by implementing “Origins” on a system-wide
scale. One could get rid of requires by stipulating that all libraries should be installed
through a program responsible for managing a library function catalog, much
in the manner of a package manager. After that, any running applications that
tried to execute cataloged functions would have the appropriate module loaded
and ready at the first call attempt. Module dependencies, of course, would be
handled automatically.
Some work would have to be done to keep different versions working correctly
together when required. Packaging schemes with property files that describe
required module versions come to mind. But scripting, in particular, would
greatly benefit from such a scheme, especially if an automated download and
installation procedure were available at run time.
Dynamic library loading. Instead of preloading all modules required by the sys-
tem, we could wait to load each of them as they become needed. This can
be done by watching the global environment for access, as shown in the
main section of this gem.
It’s important to note that these changes are easily made and undone. Since
no code is changed in the original application, just setting or not setting the ap-
propriate metatables will activate or deactivate the corresponding functionality.
This makes the approach perfect for experimentation, since alternative methods
can be kept along with working code in the same development branch.
Another way metatables can help us is in debugging or measuring a system’s
correctness or performance profile. This can be done in ways such as:
Tracing field access. One could use the index and newindex metamethods, paired
with other debugging information (such as that provided by the debug library)
to trace which execution paths are altering and accessing a variable
or field.
Asserting variable values. Instead of relying on getter and setter functions, one
could use appropriate metamethods to check that certain variables always
have valid values.
Profiling function calls. The same procedure described for tracing field access,
above, could be used to trace function calls. Although a hook could be used
to trace every function call in the system and then check for the function
names we’re interested in, it is probably faster to set a limited number of
metamethods. With appropriate timing information or even if we’re just
tracking the number and/or origin of calls made, one can obtain quite a lot
of interesting information with such a system in place.
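The first two ideas in this list can be combined in one small sketch. The function watch, its parameters, and the log format are assumptions made for illustration only. The proxy table is kept empty so that every read and write passes through a metamethod.

```lua
-- Wraps `target` in an empty proxy that logs every read and write and
-- rejects writes that fail the optional `validate` function.
local function watch(target, name, validate, log)
  return setmetatable({}, {
    __index = function(_, key)
      log[#log + 1] = "read " .. name .. "." .. tostring(key)
      return target[key]
    end,
    __newindex = function(_, key, value)
      if validate and not validate(key, value) then
        error("invalid value for " .. name .. "." .. tostring(key), 2)
      end
      log[#log + 1] = "write " .. name .. "." .. tostring(key)
      target[key] = value
    end,
  })
end

local log = {}
local config = watch({ width = 640 }, "config", function(key, value)
  return key ~= "width" or (type(value) == "number" and value > 0)
end, log)

config.width = 800                                  -- accepted and logged
local w = config.width                              -- logged read
local ok = pcall(function() config.width = -1 end)  -- rejected
print(w, ok, #log)
```

To find out which execution path performed an access, the result of debug.traceback() could be stored alongside each log entry.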
Conclusion
We have presented in this gem a simple use of what we consider one of the
most versatile and powerful features of Lua. It is easy to see how a simple
change implemented this way could alter the execution of a large system as
a whole, making a significant and perceptible difference in application
performance and usage. In our case it represented the difference between a bit of
development work and rewriting the entire GUI system in order to spare
processing and memory. By implementing the dynamic loading mechanism presented
here we avoided much unnecessary expenditure at little development cost.
The most important lesson to keep, however, is that metatables offer a way to
effect system-wide change with little use of search-and-replace and other
potentially traumatic methods.
Part II
Design Techniques
8
MVC Web Development
with Kepler
André Carregal and Yuri Takhteyev
Introduction
Kepler1 is an open source web development platform based on Lua that brings
many of Lua’s advantages to the development of web applications.2 Like Lua,
Kepler is small, portable, and flexible. Kepler 1.1 provides support for web ap-
plication development that follows the Model-View-Controller (MVC) paradigm,
bringing to Lua some of the benefits provided by popular web frameworks writ-
ten for other programming languages.
MVC refers to the division of application code into three sections and was
originally brought to desktop GUI programming by Smalltalk. When applied to
web development, this metaphor has to be adjusted due to the different nature
of the interaction: the model handles data manipulation and storage, the view
handles client-side interaction, and the controller responds to requests using the
model for data manipulation and generating views that are used by the client.
1 http://www.keplerproject.org/
2 Since this article addresses readers already familiar with Lua, we do not discuss here the exact
advantages that Lua offers in general or specifically for web development. Such advantages are
discussed, however, on the Kepler site.
Having the model clearly separated is an important part of MVC and the
success of such frameworks as Rails had much to do with their handling of
the model. The implementation of a model, however, is not specific to web
development and we will thus assume here a pre-existing model. We will
similarly avoid discussing the view contents in much detail, since the
structure of the view is in no way specific to Lua. Instead, we will focus on two
problems that must be resolved by the controller: dispatching incoming requests
(originating from a view) and generating a response that contains a view for the
next request.
Request dispatching
Overview
The general problem of HTTP request dispatching is to map an incoming request
to an action within the system. The action must at the minimum generate a
response that will be sent to the client, but may also have side effects, such as
altering the data stored in the model. Here is a general model of the way the
MVC controller handles an HTTP request:
For the purpose of this article we will consider a simplified version of this
model, which assumes that the controller relies only on the URL and uses
neither other parts of the request nor the internal state: We do not discuss the
use of HTTP headers and POST parameters, though those parts of the request
are already provided by Kepler in a simple way and their use presents no
conceptual challenges.
For the purpose of this article we will use the term “URL” to refer only
to the local part of a URL (/<PATH>?<QUERY>), since the host domain and port
number should not concern the web application. As the server dispatches the
request, it consumes some part of PATH. The remaining part of PATH is passed to
the application and we will refer to it as PATH_INFO.
A common approach to dynamic web development is to structure the ap-
plication as a collection of functions (in the broad sense of the word) and in-
terpret each request as an identifier of a function and a set of parameters to
this function. The traditional use of CGI scripts follows this model: each CGI
script acts as a function that accepts a set of parameters (encoded in the QUERY
part of the URL). An application is then structured as a collection of scripts,
each responsible for a different type of request. Under this approach, a re-
quest for wiki/show.cgi?p=HomePage is interpreted as a call to a specific function
(/wiki/show.cgi) which is called with one parameter (p=HomePage). If the user
clicks on a link to edit the page, they will generate a request for a different
function, such as /wiki/edit.cgi?p=HomePage.
Kepler supports this style of web development through “Lua Scripts” or “Lua
Pages”. This style, however, has been increasingly criticized in recent years
for the insufficient separation between view generation and the application logic.
In this article we will show how to use Kepler to implement some of the
alternative approaches that are currently used in some of the popular web
frameworks. In doing so, we want to show a range of options available to the
developer.
One way of making the dispatching more flexible is to change the way func-
tions and parameters are stored and represented. For instance, we can triv-
ially map /wiki/show onto a Lua function called show, passing to this function
the table representing the QUERY parameters. (This would be very similar to
the approach used, for example, by CherryPy.) Additionally, we may want to
avoid passing parameters via QUERY, as many people find QUERY-free URLs more
“clean”.
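The simplest of these mappings, from the last component of the path to a Lua function that receives the QUERY parameters as a table, can be sketched as follows. The names parse_query, actions, and dispatch are assumptions for this example and are not part of Kepler; the parser also skips the URL decoding that a real application would need.

```lua
-- Splits the QUERY part of a URL into a table of strings
-- (no %XX or "+" decoding in this sketch).
local function parse_query(query)
  local params = {}
  for key, value in string.gmatch(query or "", "([^&=]+)=([^&=]*)") do
    params[key] = value
  end
  return params
end

-- Hypothetical action table: last path component -> function.
local actions = {
  show = function(params) return "showing " .. params.p end,
  edit = function(params) return "editing " .. params.p end,
}

local function dispatch(url)
  local path, query = string.match(url, "^([^?]*)%??(.*)$")
  local name = string.match(path, "([%w_]+)$")
  local action = actions[name]
  if not action then
    return nil, "no action for " .. url
  end
  return action(parse_query(query))
end

print(dispatch("/wiki/show?p=HomePage"))
```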
We can do this by implementing a dispatcher similar to Rails’ “routes”, which
will let us map a URL prefix and a sequence of parameters onto a function that
would accept a table of parameters:
URLs = {
{"/show/$page_name/$version", show_page}
}
This would map a request for /wiki/show/Home_Page/23 onto a call to
show_page{page_name="Home_Page", version="23"}
which would return version 23 of the page. See Example 1 in this section for the
implementation of this approach.
A yet more flexible and very popular method (used, for example, by Django
and web.py) is to determine the function by matching the URL against a list of
patterns, each associated with a different function. E.g.:
URLs = {
{"/show/([%w_-]*)/(%d*)", show_page},
{"/save/([%w_-]*)", save_page}
}
Kepler setup
If you do not yet have Kepler installed, please follow the instructions on the
Kepler site. The default request handler of Kepler 1.1 is CGILua, which defaults to
CGI-like dispatching with URLs like /index.lp, but also allows the use of more
sophisticated dispatchers. The example code available on the Lua Gems site
includes a gem directory that should be copied to your CGILUA_APPS directory. After
that, access /app.lua/gem in your browser to get the examples home page. From
that page you can run all the examples.3
The following examples will show three different approaches to implement-
ing URL dispatching for a wiki, each using a different URL dispatching model.4
URLs = {
{"/show/$page_name/$version", show_page, "show"},
{"/history/$page_name/$year/$month/$date", show_history, "history"},
{"/diff/$page_name/$version1/$version2", show_diff, "diff"},
}
Here each mapping consists of three values: a path pattern, a function, and
the name of the mapping (used later for generating URLs). So a request for
/example1/diff/Home_Page/24/25 would call
show_diff{page_name="Home_Page", version1="24", version2="25"}
In order to actually dispatch the request we will need a pattern matching
function and a function to iterate over the table looking for matches:
-- Checks if a URL matches a pattern
function match(url, pattern)
  local params = {}
  -- convert the pattern into a Lua-style pattern
  local lua_pattern = string.gsub(pattern, "(/$[%w_-]+)", "/?([^/]*)")
  -- extract param values from the URL
  local param_values = {string.match(url, lua_pattern)}
  -- save them in table fields
  local i = 1
  for name in string.gmatch(pattern, "/$([%w_-]+)") do
    params[name] = param_values[i]
    i = i + 1
  end
  -- return params or nil
  return next(params) and params
end
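The second helper, the one that walks the table looking for a match, is not reproduced here. A minimal sketch of what it might look like follows; the body of map, its error message, and the two handlers are assumptions for illustration, while match is repeated from the text so that the block runs on its own.

```lua
-- match() as given in the text: a table of named parameters, or nil.
local function match(url, pattern)
  local params = {}
  local lua_pattern = string.gsub(pattern, "(/$[%w_-]+)", "/?([^/]*)")
  local param_values = {string.match(url, lua_pattern)}
  local i = 1
  for name in string.gmatch(pattern, "/$([%w_-]+)") do
    params[name] = param_values[i]
    i = i + 1
  end
  return next(params) and params
end

-- Hypothetical handlers that only report what they were called with.
local function show_page(params)
  return "show " .. params.page_name .. " v" .. params.version
end
local function show_diff(params)
  return "diff " .. params.page_name .. " " ..
         params.version1 .. ".." .. params.version2
end

local URLs = {
  {"/show/$page_name/$version", show_page, "show"},
  {"/diff/$page_name/$version1/$version2", show_diff, "diff"},
}

-- Calls the handler of the first mapping whose pattern matches the URL.
local function map(url, urls)
  for _, mapping in ipairs(urls) do
    local params = match(url, mapping[1])
    if params then
      return mapping[2](params)
    end
  end
  return nil, "no mapping for " .. url
end

print(map("/example1/diff/Home_Page/24/25", URLs))
```

Since the generated Lua pattern is not anchored, the prefix consumed by the server (/example1 here) does not need to be stripped before matching.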
For this example to work we also need to implement the functions show_page,
show_history and show_diff. We discuss their implementation later in “Content
generation”. For now, however, we will only mention that most of such functions
will need to generate URLs pointing back to the wiki. We can do this by reusing
our URLs table:
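One way such a generator could be written is sketched here. The name makeurl and its two arguments follow the calls that appear later in this chapter, but the body, the prefix variable, and the error message are assumptions made for illustration.

```lua
local function show_page() end      -- placeholder handlers: only the
local function show_history() end   -- patterns and names matter here
local function show_diff() end

local prefix = "/example1"          -- assumed mount point of the application

local URLs = {
  {"/show/$page_name/$version", show_page, "show"},
  {"/history/$page_name/$year/$month/$date", show_history, "history"},
  {"/diff/$page_name/$version1/$version2", show_diff, "diff"},
}

-- Finds the mapping called `name` and substitutes each $field in its
-- pattern with the corresponding entry of `params`.
local function makeurl(name, params)
  for _, mapping in ipairs(URLs) do
    if mapping[3] == name then
      local path = string.gsub(mapping[1], "$([%w_-]+)", function(field)
        return tostring(params[field] or "")
      end)
      return prefix .. path
    end
  end
  return nil, "unknown mapping: " .. tostring(name)
end

print(makeurl("show", {page_name = "Home_Page", version = 3}))
```

The third value of each mapping is what makes this possible: the handler asks for a URL by name and never needs to know the actual layout of the path.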
URLs = {
{"/show/$page_name/$version", show_page, "show"},
{"/history/$page_name/$year/$month/$date", show_history, "history"},
{"/diff/$page_name/$version1/$version2", show_diff, "diff"},
}
return cgilua.dispatcher.route(URLs)
URLs = {
{"/show/([^%/]*)/?(%d*)", show_page},
{"/history", show_history},
{"/history/([^%/]*)", show_history},
{"/history/([^%/]*)/(%d*)/?", show_history},
{"/history/([^%/]*)/(%d*)/(%d*)/?", show_history},
{"/history/([^%/]*)/(%d*)/(%d*)/(%d*)/?", show_history},
{"/diff/(%d*)/(%d*)", show_diff},
}
Note that this mapping is more verbose, but allows for more precise selection of
URLs. To implement this we replace the match function of Example 1 with
a simple wrapper around string.match:
We also make a small change to map() since our URLs table now has only two val-
ues per row. We then use map() in just the same way as we did in example1.lua.
All code shown in this example is provided in example2.lua.
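A rough sketch of this variant is given below; it is not the code from example2.lua. In particular, anchoring each pattern with ^ and $ and passing the captures to the handler as separate arguments are assumptions of this sketch, as are the two handlers.

```lua
local unpack = table.unpack or unpack   -- Lua 5.2+ or Lua 5.1

-- Returns the list of captures if the whole URL matches, nil otherwise.
local function match(url, pattern)
  local found = {string.find(url, "^" .. pattern .. "$")}
  if found[1] then
    return {unpack(found, 3)}   -- drop the start and end indices
  end
end

-- Hypothetical handlers that only report their arguments.
local function show_page(name, version)
  return "show " .. name .. " v" .. version
end
local function show_history(...)
  return "history " .. table.concat({...}, "/")
end

local URLs = {
  {"/show/([^%/]*)/?(%d*)", show_page},
  {"/history", show_history},
  {"/history/([^%/]*)", show_history},
  {"/history/([^%/]*)/(%d*)/?", show_history},
  {"/history/([^%/]*)/(%d*)/(%d*)/?", show_history},
}

local function map(url, urls)
  for _, mapping in ipairs(urls) do
    local captures = match(url, mapping[1])
    if captures then
      return mapping[2](unpack(captures))
    end
  end
  return nil, "no mapping for " .. url
end

print(map("/history/Home_Page/2007/05", URLs))
```

With anchored patterns the order of the rows no longer decides which /history variant is chosen; without the anchors, the bare "/history" row would match every history URL.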
This approach does not allow for such simple generation of URLs as was the
case in Example 1 so this would have to be done manually. (A more complicated
version of this method would make URL-generation possible, but will not be
discussed here.)
/example3/Home_Page.diff?version=24&version2=25
both History and Calendar pages we want to redefine how show is handled as
well as define additional actions (rss and ical) that might make little sense for
other pages.
We implement our dispatching as follows:
Note that the function that is called in the end is determined by the object.
The set of functions is thus not limited a priori — each object can support dif-
ferent functions, and we can remap them easily. In fact, in the case of Sputnik
the functions can be remapped by the visitor to the site. Note that page.actions
contains names of functions and we rely on load_action_function() to get the
actual callable functions:
function load_action_function(action_name)
  -- if action contains a dot, assume it's defined in an external
  -- module. if it doesn't, assume it refers to a global function
  if string.find(action_name, "%.") then
    local mask = "([%w_]*)%.([%w_]*)"
    local module_name, function_name = string.match(action_name, mask)
    local m = require(module_name)
    return m[function_name]
  else
    return _M[action_name]
  end
end
Home_Page = {
  title = "Home Page",
  actions = {
    show = "show_page",
    diff = "show_diff",
    history = "show_history",
  }
}
History = {
  title = "Site History",
  actions = {
    show = "example3_history.show_history",
    diff = "show_diff",
    history = "show_history",
    rss = "example3_history.show_wiki_history_as_rss",
  }
}
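To show how the pieces of this example fit together, here is a minimal sketch of an object-based dispatcher. It is an illustration under several assumptions: the URL shape /<object>.<action> is inferred from the sample URL of this example, "show" is taken as the default action, and a hypothetical handlers table replaces load_action_function(), which in the text resolves names through require and the module table.

```lua
-- Page objects in the style of the text; actions map to handler names.
local pages = {
  Home_Page = {
    title = "Home Page",
    actions = { show = "show_page", diff = "show_diff" },
  },
  History = {
    title = "Site History",
    actions = { show = "example3_history.show_history", diff = "show_diff" },
  },
}

-- Hypothetical registry standing in for load_action_function().
local handlers = {
  show_page = function(page) return "page: " .. page.title end,
  show_diff = function(page, params)
    return "diff of " .. page.title .. " " ..
           params.version1 .. ".." .. params.version2
  end,
  ["example3_history.show_history"] = function(page)
    return "history: " .. page.title
  end,
}

-- Splits "/Object.action" and lets the object decide which function runs.
local function dispatch(path_info, params)
  local page_name, action = string.match(path_info, "^/([%w_]+)%.?([%w_]*)$")
  local page = pages[page_name]
  if not page then return nil, "unknown page" end
  if action == "" then action = "show" end
  local handler = handlers[page.actions[action]]
  if not handler then return nil, "unsupported action" end
  return handler(page, params or {})
end

print(dispatch("/Home_Page.diff", {version1 = "24", version2 = "25"}))
print(dispatch("/History"))
```

Both pages answer to show, but each one names a different function for it, which is the remapping described above.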
Since we are using standard QUERY parameters, generating URLs can take
advantage of the functionality provided by cgilua:
Content generation
Overview
After dispatching, the controller must generate the necessary parts of an HTTP
response: the HTTP status, the HTTP headers, and the content part. This
section focuses on the approaches to generating content. One approach is to
generate the content programmatically, pushing one string after another into
a buffer. Another approach is to define template strings that are filled with
content when they are processed. We call the first method “scripting” and the
second method “templating”.
Scripting
The most basic way to do scripting in Kepler is to use cgilua.put to push bits of
content. Example 1 uses this approach:
function show_history(params)
  local page = model.get_page(params.page_name)
  local history =
    page:get_history(params.year, params.month, params.date)
  cgilua.htmlheader()
  cgilua.put("<H1>"..page.title.." History</H1>")
  cgilua.put("<UL>")
  for i, p in ipairs(history) do
    p.page_name = params.page_name
    cgilua.put(string.format(" <LI><A HREF=%s>v. %s (%s)</A></LI>",
      makeurl("show", p), p.version, p.time_stamp))
  end
  cgilua.put("</UL>")
end
which would give us something like:
<H1>Home Page History</H1>
<UL>
<LI><A HREF=/example1/show/Home_Page/3>v. 3 (2007-05-29 20:02:01)</A></LI>
<LI><A HREF=/example1/show/Home_Page/2>v. 2 (2007-05-29 10:03:31)</A></LI>
<LI><A HREF=/example1/show/Home_Page/1>v. 1 (2007-05-29 08:20:00)</A></LI>
</UL>
Alternatively, we can use a module like HTK6 to generate the same HTML
(see example4.lua):
require"htk"
function show_history(params)
  local page = model.get_page(params.page_name)
  local history =
    page:get_history(params.year, params.month, params.date)
  cgilua.htmlheader()
  cgilua.put(htk.H1{page.title.." History"})
  local items = {}
  for i, p in ipairs(history) do
    p.page_name = params.page_name
    local url = makeurl("show", p)
    table.insert(items, htk.LI{htk.A{href=url, "v. ", p.version,
                                     " (", p.time_stamp, ")"}})
  end
  cgilua.put(htk.UL(items))
end
6 HTK, by Tomás Guisasola, is available at http://www.tecgraf.puc-rio.br/~tomas/htk/.
Templating
Another content generation approach uses a template string with placeholders
for dynamic content. Such templates allow inclusion of arbitrary code, which
blurs the separation between code and presentation; they have both advantages
and disadvantages. Alternatively, one can use “safe templates” that can only
call the functions that are explicitly given to them. We present here two simple
libraries for those two cases, each implemented in under 150 lines of Lua code.
Kepler’s solution to arbitrary code templates is “Lua Pages”, which are text
files with syntax for two types of placeholders. The first allows arbitrary Lua
code, as in <% cgilua.put(title) %>. The second is a placeholder for a single
Lua expression, as in <%= title %>.
The Lua Pages pre-processor makes global substitutions on the template,
searching for matching pairs of markup and generating the corresponding Lua
code, which can then be executed.
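The idea behind such a pre-processor can be shown with a very small sketch. This is not the Lua Pages implementation: the functions translate and render and the output function put are assumptions of the example, and the two branches in render exist only so that the sketch runs on both Lua 5.1 and later versions.

```lua
-- Turns a template into Lua source: literal text becomes put("..."),
-- <%= expr %> becomes put(tostring(expr)), <% code %> is copied as is.
local function translate(template)
  local out, pos = {}, 1
  while true do
    local s, e, eq, code = string.find(template, "<%%(=?)(.-)%%>", pos)
    if not s then break end
    if s > pos then
      out[#out + 1] = string.format("put(%q)", string.sub(template, pos, s - 1))
    end
    if eq == "=" then
      out[#out + 1] = "put(tostring(" .. code .. "))"
    else
      out[#out + 1] = code
    end
    pos = e + 1
  end
  if pos <= #template then
    out[#out + 1] = string.format("put(%q)", string.sub(template, pos))
  end
  return table.concat(out, "\n")
end

-- Runs the generated code with `env` as its only environment.
-- Note that `env` is extended with put and tostring.
local function render(template, env)
  local buffer = {}
  env.put = function(s) buffer[#buffer + 1] = s end
  env.tostring = tostring
  local code = translate(template)
  local chunk
  if setfenv then                                  -- Lua 5.1
    chunk = assert(loadstring(code, "=template"))
    setfenv(chunk, env)
  else                                             -- Lua 5.2 and later
    chunk = assert(load(code, "=template", "t", env))
  end
  chunk()
  return table.concat(buffer)
end

local template = [[<H1><%= title %></H1>
<UL><% for i, item in ipairs(items) do %><LI><%= item %></LI><% end %></UL>]]

local html = render(template,
  { title = "History", items = {"v. 2", "v. 1"}, ipairs = ipairs })
print(html)
```

The template can call only what its environment table contains, which is also why the real show_history passes cgilua, ipairs, and makeurl explicitly.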
To use Lua Pages with the dispatching methods discussed in the previous
section, we can call them explicitly from inside a Lua script. In example5.lua
we implement show_history as follows:
function show_history(params)
  local page = model.get_page(params.page_name)
  local history =
    page:get_history(params.year, params.month, params.date)
  local env = {
    page = page, history = history, page_name = params.page_name,
    cgilua = cgilua, ipairs = ipairs, makeurl = makeurl,
  }
  cgilua.handlelp ("example5.lp", env)
end
This function delegates most of the content generation to example5.lp, which
looks like this:
<H1><%= page.title %> History</H1>
<UL>
<% for i, p in ipairs(history) do
p.page_name = page_name %>
<LI>
<A HREF="<%=makeurl("show", p)%>"> v.<%=p.version%>
(<%=p.time_stamp%>)</A>
</LI>
<% end %>
</UL>
Kepler also provides a solution for safe templates through Cosmo, a library that
allows two types of template placeholders: $var_name and $fn_name[[template]].
When a template is filled (using the cosmo.fill function), a table must be provided
in addition to the template string. If cosmo.fill encounters a $var_name pattern
it will simply look up the value in the table and substitute it. If it finds
something like $fn_name[[...]], it will look up the fn_name field in the table but will
assume the corresponding value to be a function. Cosmo will then call this
function in a coroutine, expecting it to yield one or more tables (using cosmo.yield).
Each table that is yielded will be used to fill the template inside [[...]], and
all the resulting text will be concatenated and inserted into the output. For
example:
show_history_template = [==[
<H1>$title History</H1>
<UL>
$list_versions[[<LI><A HREF="$url"> v. $version ($time_stamp)</A></LI>]]
</UL>]==]
function show_history(params)
  local page = model.get_page(params.page_name)
  local history =
    page:get_history(params.year, params.month, params.date)
  cgilua.htmlheader()
  cgilua.put(cosmo.fill(show_history_template,
    { title = page.title,
      list_versions = function()
        for i, p in ipairs(history) do
          p.page_name = params.page_name
          p.url = makeurl("show", p)
          cosmo.yield(p)
        end
      end }))
end
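The coroutine-based filling described above can be imitated in a few lines. The sketch below is not Cosmo's implementation: the function fill is a hypothetical stand-in that supports only the two placeholder forms mentioned in the text, without nesting of [[...]] blocks, and the producer yields with coroutine.yield directly.

```lua
-- Fills $name with values[name]; for $name[[sub]] it runs values[name]
-- as a coroutine and fills `sub` once for each table it yields.
local function fill(template, values)
  local out, pos = {}, 1
  while true do
    local s, e, name = string.find(template, "%$([%w_]+)", pos)
    if not s then break end
    out[#out + 1] = string.sub(template, pos, s - 1)
    local bs, be, sub = string.find(template, "^%[%[(.-)%]%]", e + 1)
    if bs then
      local producer = coroutine.wrap(values[name])
      for item in producer do            -- one iteration per yielded table
        out[#out + 1] = fill(sub, item)
      end
      pos = be + 1
    else
      out[#out + 1] = tostring(values[name])
      pos = e + 1
    end
  end
  out[#out + 1] = string.sub(template, pos)
  return table.concat(out)
end

local history = {
  { version = 2, time_stamp = "2007-05-29" },
  { version = 1, time_stamp = "2007-05-28" },
}

local html = fill(
  "<H1>$title</H1><UL>$list_versions[[<LI>v. $version ($time_stamp)</LI>]]</UL>",
  { title = "Home Page",
    list_versions = function()
      for _, p in ipairs(history) do
        coroutine.yield(p)
      end
    end })
print(html)
```

The template itself contains no code: it can only name fields of the table it is given, which is what makes this style safe to hand to designers.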
Conclusion
Kepler allows for MVC-style web development using many of the currently
popular approaches. Instead of locking the user into specific solutions to
such problems as request dispatching and content generation, Kepler focuses
on making web applications portable across operating systems and servers,
letting the application developers choose higher-level solutions appropriate to
their specific case. Your choice between the above-mentioned approaches to
request dispatching may depend on how “clean” you want your URLs to be, how
much control you need over them, and whether the system needs to support
resource-specific actions. Similarly, the choice of method for content generation
may depend on the degree to which you want to separate design work from
programming. Lua Pages may offer a simpler solution in cases where design
and coding are done by the same person, while a safe template solution like Cosmo
may make your life easier if the design work is to be delegated to a designer or
even to anonymous end users.
9
Filters, Sources, Sinks & Pumps
or Functional programming for the rest of us
Diego Nehab
Introduction
Within the realm of networking applications, we are often required to apply
transformations to streams of data. Examples include the end-of-line normaliza-
tion for text, Base64 and Quoted-Printable transfer content encodings, breaking
text into lines with a maximum number of columns, SMTP dot-stuffing, gzip
compression, HTTP chunked transfer coding, and the list goes on.
Many complex tasks require a combination of two or more such transforma-
tions, and therefore a general mechanism for promoting reuse is desirable. In
the process of designing LuaSocket 2.0, we repeatedly faced this problem. The
solution we reached proved to be very general and convenient. It is based on the
concepts of filters, sources, sinks, and pumps, which we introduce below.
Filters are functions that can be repeatedly invoked with chunks of input,
successively returning processed chunks of output. Naturally, the result of
concatenating all the output chunks must be the same as the result of applying
the filter to the concatenation of all input chunks. In fancier language, filters
commute with the concatenation operator. More importantly, filters must handle
input data correctly no matter how the stream has been split into chunks.
A chain is a function that transparently combines the effect of one or more
filters. The interface of a chain is indistinguishable from the interface of its
component filters. This allows a chained filter to be used wherever an atomic
filter is accepted. In particular, chains can be themselves chained to create
arbitrarily complex operations.
Filters can be seen as internal nodes in a network through which data will
flow, potentially being transformed many times along the way. Chains connect
these nodes together. The initial and final nodes of the network are sources and
sinks, respectively. Less abstractly, a source is a function that produces new
chunks of data every time it is invoked. Conversely, sinks are functions that
give a final destination to the chunks of data they receive in successive calls.
Naturally, sources and sinks can also be chained with filters to produce filtered
sources and sinks.
Finally, filters, chains, sources, and sinks are all passive entities: they must
be repeatedly invoked in order for anything to happen. Pumps provide the
driving force that pushes data through the network, from a source to a sink,
and indirectly through all intervening filters.
In the following sections, we start with a simplified interface, which we
later refine. The evolution we present is not contrived: it recreates the steps
we ourselves followed as we consolidated our understanding of these concepts
within our application domain.
A simple example
The end-of-line normalization of text is a good example to motivate our initial
filter interface. Assume we are given text in an unknown end-of-line convention
(including possibly mixed conventions) out of the commonly found Unix (LF),
Mac OS (CR), and DOS (CR LF) conventions. We would like to be able to use the
following code to normalize the end-of-line markers:
local CRLF = "\013\010"
local input = source.chain(source.file(io.stdin), normalize(CRLF))
local output = sink.file(io.stdout)
pump.all(input, output)
This program should read data from the standard input stream and nor-
malize the end-of-line markers to the canonic CR LF marker, as defined by the
MIME standard. Finally, the normalized text should be sent to the standard
output stream. We use a file source that produces data from standard input,
and chain it with a filter that normalizes the data. The pump then repeatedly
obtains data from the source, and passes it to the file sink, which sends it to the
standard output.
In the code above, the normalize factory is a function that creates our normal-
ization filter, which replaces any end-of-line marker with the canonic marker.
The initial filter interface is trivial: a filter function receives a chunk of input
data, and returns a chunk of processed data. When there are no more input
data left, the caller notifies the filter by invoking it with a nil chunk. The filter
responds by returning the final chunk of processed data (which could of course
be the empty string).
Although the interface is extremely simple, the implementation is not so
obvious. A normalization filter respecting this interface needs to keep some kind
of context between calls. This is because a chunk boundary may lie between the
CR and LF characters marking the end of a single line. This need for contextual
storage motivates the use of factories: each time the factory is invoked, it returns
a filter with its own context so that we can have several independent filters
being used at the same time. For efficiency reasons, we must avoid the obvious
solution of concatenating all the input into the context before producing any
output chunks.
To that end, we break the implementation into two parts: a low-level filter,
and a factory of high-level filters. The low-level filter is implemented in C and
does not maintain any context between function calls. The high-level filter fac-
tory, implemented in Lua, creates and returns a high-level filter that maintains
whatever context the low-level filter needs, but isolates the user from its inter-
nal details. That way, we take advantage of C’s efficiency to perform the hard
work, and take advantage of Lua’s simplicity for the bookkeeping.
function normalize(marker)
  return filter.cycle(eol, 0, marker)
end
The normalize factory simply calls a more generic factory, the cycle factory,
passing the low-level filter eol. The cycle factory receives a low-level filter,
an initial context, and an extra parameter, and returns a new high-level filter.
Each time the high-level filter is passed a new chunk, it invokes the low-level
filter with the previous context, the new chunk, and the extra argument. It is
the low-level filter that does all the work, producing the chunk of processed data
and a new context. The high-level filter then replaces its internal context, and
returns the processed chunk of data to the user. Notice that we take advantage
of Lua’s lexical scoping to store the context in a closure between function calls.
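The division of labor just described can be sketched entirely in Lua. The cycle factory below follows the description in the text. The low-level filter eol is a pure-Lua stand-in written for this illustration (the text's low-level filter is implemented in C), and its choice of context, the last end-of-line candidate seen, is an assumption of the sketch.

```lua
local filter = {}

-- Low-level filter: takes the context, a chunk, and the marker, and
-- returns the processed chunk and the new context. The context is 0,
-- or the "\r" or "\n" that may be the first half of a two-byte marker.
local function eol(ctx, chunk, marker)
  if chunk == nil then return "", 0 end  -- end of data: reset the context
  local out = {}
  for i = 1, #chunk do
    local c = string.sub(chunk, i, i)
    if c == "\r" or c == "\n" then
      if ctx == "\r" or ctx == "\n" then
        -- second candidate in a row: CR LF is one line end, LF LF is two
        if c == ctx then out[#out + 1] = marker end
        ctx = 0
      else
        out[#out + 1] = marker
        ctx = c
      end
    else
      out[#out + 1] = c
      ctx = 0
    end
  end
  return table.concat(out), ctx
end

-- High-level factory: keeps the context in a closure between calls.
function filter.cycle(low, ctx, extra)
  return function(chunk)
    local ret
    ret, ctx = low(ctx, chunk, extra)
    return ret
  end
end

local function normalize(marker)
  return filter.cycle(eol, 0, marker)
end

-- The CR LF pair below is split across two chunks on purpose.
local f = normalize("\r\n")
local result = f("one\r") .. f("\ntwo\n") .. f("three\rfour") .. f(nil)

local g = normalize("\r\n")
local whole = g("one\r\ntwo\nthree\rfour") .. g(nil)
```

Both runs produce the same text, which is the property required of filters: the output must not depend on how the input was split into chunks.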
The outer function eol simply interfaces with Lua. It receives the context
and input chunk (as well as an optional custom end-of-line marker), and returns
the transformed output chunk and the new context. Notice that if the input
chunk is nil, the operation is considered to be finished. In that case, the loop
will not execute a single time and the context is reset to the initial state. This
allows the filter to be reused many times:
When designing filters, the challenging part is usually deciding what to store
in the context. For line breaking, for instance, it could be the number of bytes
that still fit in the current line. For Base64 encoding, it could be a string with the
bytes that remain after the division of the input into 3-byte atoms. The MIME
module in the LuaSocket distribution contains many other examples.
Filter chains
Chains greatly increase the power of filters. For example, according to the
standard for Quoted-Printable encoding, text should be normalized to a canonic
end-of-line marker prior to encoding. After encoding, the resulting text must be
broken into lines of no more than 76 characters, with the use of soft line breaks
(a line terminated by the = sign). To help specify complex transformations
like this, we define a chain factory that creates a composite filter from one or
more filters. A chained filter passes data through all its components, and can be
used wherever a primitive filter is accepted.
The chaining factory is very simple. The auxiliary function chainpair chains
two filters together, taking special care if the chunk is the last. This is because
the final nil chunk notification has to be pushed through both filters in turn:
function filter.chain(...)
  local filters = {...}
  local f = filters[1]
  for i = 2, #filters do
    f = chainpair(f, filters[i])
  end
  return f
end
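The auxiliary function is not reproduced here, so the following is a sketch of how chainpair could be written for the simple filter interface described so far. The two demonstration filters, upper and counter, are hypothetical and exist only to show that the final nil reaches both components.

```lua
local filter = {}

-- Chains two filters. On the final nil chunk, f1 is flushed first, its
-- last output is passed through f2, and then f2 itself is flushed.
local function chainpair(f1, f2)
  return function(chunk)
    local ret = f2(f1(chunk))
    if chunk then
      return ret
    else
      return ret .. f2(nil)
    end
  end
end

function filter.chain(...)
  local filters = {...}
  local f = filters[1]
  for i = 2, #filters do
    f = chainpair(f, filters[i])
  end
  return f
end

-- Hypothetical filter without context: converts to upper case.
local function upper(chunk)
  return chunk and string.upper(chunk) or ""
end

-- Hypothetical filter with context: passes data through unchanged and
-- appends the byte count when the end of data is signaled.
local function counter()
  local n = 0
  return function(chunk)
    if chunk then
      n = n + #chunk
      return chunk
    end
    return "[" .. n .. "]"
  end
end

local f = filter.chain(upper, counter())
local result = f("ab") .. f("cd") .. f(nil)
print(result)
```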
Sources
A source returns the next chunk of data each time it is invoked. When there
are no more data, it simply returns nil. In the event of an error, the source can
inform the caller by returning nil followed by the error message.
Below are two simple source factories. The empty source returns no data,
possibly returning an associated error message. The file source yields the
contents of a file in a chunk by chunk fashion:
function source.empty(err)
  return function()
    return nil, err
  end
end
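The file source is not reproduced here. A sketch consistent with the description, and with the way source.file is called on the result of io.open elsewhere in this chapter, might look as follows. The block size and the fallback error message are assumptions, and the demonstration uses a hypothetical stub handle instead of a real file so that it has no dependency on the file system.

```lua
local source = {}
local BLOCKSIZE = 2048   -- assumed chunk size

function source.empty(err)
  return function()
    return nil, err
  end
end

-- Accepts what io.open returns: a handle, or nil plus an error message.
function source.file(handle, io_err)
  if not handle then
    return source.empty(io_err or "unable to open file")
  end
  return function()
    local chunk = handle:read(BLOCKSIZE)
    if not chunk then handle:close() end   -- end of data: release the file
    return chunk
  end
end

-- Hypothetical stub with the two methods the source needs.
local data = string.rep("x", 5000)
local pos, closed = 1, false
local handle = {
  read = function(self, n)
    if pos > #data then return nil end
    local piece = string.sub(data, pos, pos + n - 1)
    pos = pos + n
    return piece
  end,
  close = function(self) closed = true end,
}

local src = source.file(handle)
local chunks = {}
while true do
  local chunk = src()
  if not chunk then break end
  chunks[#chunks + 1] = chunk
end
print(#chunks, closed)
```

With a real file, the call would be source.file(io.open(name, "rb")); a failed io.open then yields an empty source that reports the error on its first call.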
Filtered sources
A filtered source passes its data through the associated filter before returning it
to the caller. Filtered sources are useful when working with functions that get
their input data from a source (such as the pumps in our examples). By chaining
a source with one or more filters, such functions can be transparently provided
with filtered data, with no need to change their interfaces. Here is a factory that
does the job:
function source.chain(src, f)
  return function()
    if not src then
      return nil
    end
    local chunk, err = src()
    if not chunk then
      src = nil
      return f(nil)
    else
      return f(chunk)
    end
  end
end
Sinks
Just as we defined an interface for a source of data, we can also define an
interface for a data destination. We call any function respecting this interface a
sink. In our first example, we used a file sink connected to the standard output.
Sinks receive consecutive chunks of data, until the end of data is signaled
by a nil input chunk. A sink can be notified of an error with an optional extra
argument that contains the error message, following a nil chunk. If a sink
detects an error itself, and wishes not to be called again, it can return nil,
followed by an error message. A return value that is not nil means the sink
will accept more data.
Below are two useful sink factories. The table factory creates a sink that
stores individual chunks into an array. The data can later be efficiently
concatenated into a single string with Lua’s table.concat library function. The null
sink simply discards the chunks it receives:
function sink.table(t)
  t = t or {}
  local f = function(chunk, err)
    if chunk then table.insert(t, chunk) end
    return 1
  end
  return f, t
end
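The null sink is not reproduced here; a sketch of it, together with a short use of the table sink, is given below. The body of sink.null is an assumption based on the description: it accepts every chunk and discards it.

```lua
local sink = {}

function sink.table(t)
  t = t or {}
  local f = function(chunk, err)
    if chunk then table.insert(t, chunk) end
    return 1
  end
  return f, t
end

-- All null sinks can share one function, since it keeps no state.
local function null()
  return 1
end

function sink.null()
  return null
end

-- The factory returns the sink and the table that it fills.
local snk, parts = sink.table()
snk("Hello, ")
snk("world")
snk(nil)                      -- end of data
local text = table.concat(parts)
print(text)
```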
Pumps
Although not on purpose, our interface for sources is compatible with Lua iter-
ators. That is, a source can be neatly used in conjunction with for loops. Using
our file source as an iterator, we can write the following code:
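A small self-contained illustration of this compatibility follows. It uses a hypothetical string-backed source, string_source, in place of the file source so that it does not depend on standard input; with the file source the loop header would read `for chunk in source.file(io.stdin) do`.

```lua
-- Hypothetical source factory: returns the string `s` in pieces of at
-- most `size` bytes, and nil when there are no more data.
local function string_source(s, size)
  local pos = 1
  return function()
    if pos > #s then return nil end
    local chunk = string.sub(s, pos, pos + size - 1)
    pos = pos + size
    return chunk
  end
end

-- The generic for calls the source until it returns nil. The two
-- arguments that the loop passes on each call are simply ignored.
local count, total = 0, 0
for chunk in string_source("filters, sources, sinks", 8) do
  count = count + 1
  total = total + #chunk
end
print(count, total)
```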
Loops like this will always be present because everything we designed so far
is passive. Sources, sinks, filters: none of them can do anything on their own.
The operation of pumping all data a source can provide into a sink is so common
that it deserves its own function:
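One way such functions could look is sketched below, following the source and sink conventions described earlier. The array source and the collecting sink are hypothetical helpers added for the demonstration.

```lua
local pump = {}

-- Move one chunk from the source to the sink.
-- Returns 1 while there is more to do; nil (plus an error message, if any)
-- once the source is exhausted or either side reports a problem.
function pump.step(src, snk)
  local chunk, src_err = src()
  local ret, snk_err = snk(chunk, src_err)
  if chunk and ret then return 1 end
  return nil, src_err or snk_err
end

-- Call the step function until it stops; report success as 1.
function pump.all(src, snk, step)
  step = step or pump.step
  while true do
    local ret, err = step(src, snk)
    if not ret then
      if err then return nil, err end
      return 1
    end
  end
end

-- Hypothetical helpers for the demonstration.
local function array_source(items)
  local i = 0
  return function()
    i = i + 1
    return items[i]
  end
end

local received = {}
local function collect(chunk, err)
  if chunk then received[#received + 1] = chunk end
  return 1
end

local ok, err = pump.all(array_source{"one ", "two ", "three"}, collect)
print(ok, table.concat(received))   --> 1       one two three
```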
The pump.step function moves one chunk of data from the source to the sink.
The pump.all function takes an optional step function and uses it to pump all
the data from the source to the sink. Here is an example that uses the Base64
and the line wrapping filters from the LuaSocket distribution. The program
reads a binary file from disk and stores it in another file, after encoding it to the
Base64 transfer content encoding:
local input = source.chain(
  source.file(io.open("input.bin", "rb")),
  encode("base64"))
local output = sink.chain(
  wrap(76),
  sink.file(io.open("output.b64", "w")))
pump.all(input, output)
The way we split the filters here is not intuitive, on purpose. Alternatively,
we could have chained the Base64 encode filter and the line-wrap filter together,
and then chained the resulting filter with either the file source or the file sink.
It doesn’t really matter.
Exploding filters
Our current filter interface has one serious shortcoming. Consider for example
a gzip decompression filter. During decompression, a small input chunk can be
exploded into a huge amount of data. To address this problem, we decided to
change the filter interface and allow exploding filters to return large quantities
of output data in a chunk by chunk manner.
More specifically, after passing each chunk of input to a filter, and collecting
the first chunk of output, the user must now loop to receive other chunks from
the filter until no filtered data are left. Within these secondary calls, the caller
passes an empty string to the filter. The filter responds with an empty string
when it is ready for the next input chunk. In the end, after the user passes a
nil chunk notifying the filter that there are no more input data, the filter might
still have to produce too much output data to return in a single chunk. The user
has to loop again, now passing nil to the filter each time, until the filter itself
returns nil to notify the user it is finally done.
Fortunately, it is very easy to modify a filter to respect the new interface. In
fact, the end-of-line translation filter we presented earlier already conforms to
it. The complexity is encapsulated within the chaining functions, which must
now include a loop. Since these functions only have to be written once, the user
is rarely affected. Interestingly, the modifications do not have a measurable
negative impact on the performance of filters that do not need the added flexibility.
On the other hand, for a small price in complexity, the changes make exploding
filters practical.
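The calling convention described in this section can be sketched as a small driver loop. Both functions below are illustrations written for this purpose, not code from the LTN12 module: exploder is a hypothetical filter that deliberately produces more output than it is willing to return at once, and drain is a hypothetical caller that follows the protocol.

```lua
-- Hypothetical exploding filter: expands every input byte into `factor`
-- copies, but never returns more than `limit` bytes per call.
local function exploder(factor, limit)
  local pending = ""       -- output produced but not yet handed out
  local finished = false   -- set once the nil (end-of-data) chunk arrives
  return function(chunk)
    if chunk == nil then
      finished = true
    elseif chunk ~= "" then
      local expanded = chunk:gsub(".", function(c) return c:rep(factor) end)
      pending = pending .. expanded
    end
    if pending == "" then
      if finished then return nil end   -- completely done
      return ""                         -- ready for the next input chunk
    end
    local piece = pending:sub(1, limit)
    pending = pending:sub(limit + 1)
    return piece
  end
end

-- Caller side of the protocol: after each real chunk, keep passing ""
-- until the filter answers ""; at the end, keep passing nil until the
-- filter answers nil.
local function drain(filter, input_chunks)
  local out = {}
  for _, chunk in ipairs(input_chunks) do
    local piece = filter(chunk)
    while piece and piece ~= "" do
      out[#out + 1] = piece
      piece = filter("")
    end
  end
  local piece = filter(nil)
  while piece do
    if piece ~= "" then out[#out + 1] = piece end
    piece = filter(nil)
  end
  return out
end

local out = drain(exploder(3, 4), {"ab", "c"})
print(table.concat(out, "|"))   --> aaab|bb|ccc
```

A filter that never explodes returns its whole output on the first call and "" on the second, so the extra loop costs it one additional call per chunk.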
A complex example
The LTN12 module in the LuaSocket distribution implements all the ideas we
have described. The MIME and SMTP modules are tightly integrated with
LTN12, and can be used to showcase the expressive power of filters, sources,
sinks, and pumps. Below is an example of how a user would proceed to define
and send a multipart message, with attachments, using LuaSocket:
assert(smtp.send{
rcpt = "<fulano@example.com>",
from = "<sicrano@example.com>",
source = message})
Conclusion
In this article, we introduced the concepts of filters, sources, sinks, and pumps
to the Lua language. These are useful tools for stream processing in general.
Sources provide a simple abstraction for data acquisition. Sinks provide an ab-
straction for final data destinations. Filters define an interface for data trans-
formations. The chaining of filters, sources and sinks provides an elegant way
to create arbitrarily complex data transformations from simpler components.
Pumps simply push the data through.
Acknowledgments
The concepts described in this text are the result of long discussions with David
Burgess. A version of this text has been released on-line as the Lua Technical
Note 012, hence the name of the corresponding LuaSocket module, LTN12.
Wim Couwenberg contributed to the implementation of the module, and Adrian
Sietsma was the first to notice the correspondence between sources and Lua
iterators.
Lua as a Protocol Language
10
Patrick Rapin
This article describes the use of Lua as a communication vector between client
and server programs. In addition to some implementation choices, we discuss
the advantages and drawbacks of this approach, with particular attention to
security.
Background
Our company uses a custom source control program called CodeAdministrator,
a tool I wrote several years ago in C++ using Microsoft Foundation Classes
(MFC). The program has run satisfactorily until now; however, it has some
limitations: it runs only on Windows, and it cannot be used over a regular
Internet connection, only through a virtual private network.
We tried to find a way to keep the compatibility of the source database and
version numbers, while adding support for Unix-based systems and Internet
functionality. An idea for a solution arose: rewrite the core of the program using
Lua, into a new tool called Lua CodeAdministrator or LCA. At first at least,
there is no need to port all features, notably the administrative tasks, since it
is aimed to be a user add-on to the original tool rather than a full replacement.
Although the goal of this article is to discuss the ideas behind the protocol used,
it will refer to this particular program when needed for the explanation.
Choice of language
What are the advantages of using Lua to implement a version checking utility?
Compared to compiled languages, using a scripting language simplifies the
implementation a lot because:
• It is easy to ensure the portability of the program.
• The code is typically shorter compared to C functions.
• It is easy to customize functions, for example by overriding global vari-
ables.
• Configuration files can be written in the same language as the main pro-
gram.
• The protocol itself can use the same language, which is the main subject of
this article.
Other scripting languages would certainly also fit the requirements for this
tool. The reasons we prefer Lua are the following:
• It is fast, compared to most other scripting languages.
• It is small, thus there is no practical problem embedding it into programs.
• It is easy to compile on all platforms.
• We have very good experience with the language, since we are using it for
our printers.
Protocol
Like CVS, for example, LCA can be run in four different modes:
Standalone. The user directory and the repository can both be accessed directly,
either on a local hard drive or over a mapped network drive. There is no
need to worry too much about security in this case: we can assume that the
file system management already checks for read and write authorizations.
Client. The user directory can be accessed directly by the program, and any
read or write action to the repository must be performed through a request
over the network to the server. Security is not a problem on this side; we
assume that the user has authorized access to his computer. But the client
must be able to provide security checks to the server.
Server. Unlike the other two modes, which are run once for each operation
requested, the server must run permanently, as a daemon. It has full
direct access to the repository, but each time it needs to read or write
to the user directory, it issues requests back to the client. Security is an
important issue for any program accepting requests from the Internet. The
user must first log in, and data can be encrypted to ensure confidentiality.
Other tricks are used, as discussed later.
CGI Server. It has direct access to the repository, but no knowledge of a user
directory. It is not permanent like the previous mode, but is run when
needed by the Web server through the CGI interface. It is able to browse
the database and produce regular HTML output.
While the repository database format is already strictly defined and cannot
be changed without breaking compatibility, we have complete freedom over the
protocol used to exchange data. In order to simplify the coding and to unify all
concepts, we are using Lua code as the native format.
Any transferred data has the form of a Lua script, normally just a call
to a global function with parameters. Parameter values can be arbitrarily
complex: they may contain big table constructors or embedded files encoded with
the string.format("%q") feature. This function is very helpful for embedding
binary data into valid literal Lua strings. The result is a quoted string, with
problematic characters escaped and the other ones copied verbatim.
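The round trip through "%q" can be checked with a few lines. This is a sketch: SaveFile is a hypothetical function name used only to show what a request might look like, and load_string is a small shim so that the example runs on Lua 5.1 (loadstring) as well as on later versions (load).

```lua
-- Compile a chunk from a string on Lua 5.1 and on later versions.
local load_string = loadstring or load

-- Sample payload with bytes that could not appear raw in a short literal.
local data = 'quote " backslash \\ newline \n null \0 high \255 end'

-- "%q" yields a literal that the Lua parser reads back as the same bytes.
local literal = string.format("%q", data)

-- A request is then just Lua source text (hypothetical function name).
local request = string.format("SaveFile(%q, %s)", "src/main.c", literal)

-- Decoding is nothing more than compiling and running the literal.
local decoded = assert(load_string("return " .. literal))()
print(decoded == data, #decoded == #data)   --> true    true
```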
We also need a command terminator, to synchronize the client and the server
through the socket. It is customary in Internet protocols to use a line feed
optionally preceded by a carriage return. As this pattern may be present in valid
Lua code, we prefer to use the single null character ‘\0’. This character can only
appear inside long strings (those of the form [[· · ·]]), a construction never used
in the implementation. A second pattern, consisting of a semicolon immediately
followed by a line feed, can only be found at the end of a Lua statement. If the
whole request is too big, it is possible to execute it piece by piece by cutting it
along this pattern.
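On the receiving side, the null terminator makes framing straightforward. The helper below is a sketch with a hypothetical name; it works on a plain string buffer rather than on a socket, so it makes no assumption about the networking library.

```lua
-- Split a receive buffer into complete messages (each terminated by a
-- null byte) and the unfinished remainder that must wait for more data.
local function split_messages(buffer)
  local messages = {}
  local start = 1
  while true do
    -- plain find (4th argument true): no pattern matching involved
    local stop = buffer:find("\0", start, true)
    if not stop then break end
    messages[#messages + 1] = buffer:sub(start, stop - 1)
    start = stop + 1
  end
  return messages, buffer:sub(start)
end

local msgs, rest = split_messages('A(1)\0B("x")\0C(')
print(#msgs, msgs[1], msgs[2], rest)   --> 2   A(1)    B("x")  C(
```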
A typical transaction looks like the following. The client initiates a socket
and sends a request to the server as a Lua chunk followed by a null byte. The
server listens to the socket until it finds the termination character. After some
basic security checks, it will compile the code and run it. During the execution,
results and additional requested information are formatted into another Lua
chunk, and sent back to the client followed by the null terminator. The client
will then compile and run the response code. An advantage here is that the
protocol is fully symmetrical: the client always initiates the connection, but work
can be requested by both ends. Also, we can deal with complex situations with
hierarchical data and callbacks, without having to define a complicated protocol.
The following code shows the skeleton for the server function (without log-
ging and security checks). Please note the use of setfenv, forcing the chunk to
run in a protected environment, where the only global functions are the ones
we need to export. The "*z" parameter is a small extension we made to the
LuaSocket library, allowing us to read all data up to a null byte (excluded from
the result string). At the end, a complete garbage collection is performed. The
main server function consists simply of calling ExecuteJob in protected mode and
collecting
garbage. In case of any error occurring in the job function, the server recovers
automatically, because all resources, including sockets, are local variables that
will be freed or closed with the collection.
function Server()
  while true do
    pcall(ExecuteJob)
    collectgarbage()
  end
end
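The sandboxing step can be sketched as follows. This is an illustration under stated assumptions, not the LCA source: GetFiles is a hypothetical exported function, and run_request is a helper introduced here. On Lua 5.1 it uses loadstring and setfenv as the text describes; on later versions, where setfenv no longer exists, it passes the environment to load instead.

```lua
-- Hypothetical exported function: the only global a request may see.
local requested = {}
local exported = {
  GetFiles = function(version, files)
    requested[#requested + 1] = { version = version, files = files }
  end,
}

-- Compile `code` and run it with `env` as its global environment.
-- Returns true on success, or false plus an error message.
local function run_request(code, env)
  local chunk, err
  if setfenv then                       -- Lua 5.1
    chunk, err = loadstring(code, "=request")
    if chunk then setfenv(chunk, env) end
  else                                  -- Lua 5.2 and later, text chunks only
    chunk, err = load(code, "=request", "t", env)
  end
  if not chunk then return false, err end
  return pcall(chunk)
end

-- A well-formed request reaches the exported function.
print(run_request('GetFiles("1.2", {"a.c", "b.c"})', exported))

-- Standard libraries are simply absent from the environment, so this
-- fails with a runtime error instead of running a command.
print(run_request('os.execute("echo unsafe")', exported))

-- Syntax errors are reported without running anything.
print(run_request('GetFiles(', exported))
```

Because every failure comes back as a return value, the calling loop can log it and carry on with the next connection.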
The first line is the client request, sending the desired file version, the list of files
to retrieve, and the destination directory. The second line is the server response,
asking the client to save the given data into a file specifying its full file name
and modification time.
Compression
The original database format uses the BZIP2 compression library to drastically
decrease its size. Incidentally, we were surprised to observe that CA databases
for typical C projects have an overall compression ratio of about 95% (or 20
times)! LCA must of course use this library to open the source code database.
The same algorithm is also used to compress the Lua source code stored inside
the executable file, and to compress data sent over the Internet socket. The
latter is optional, because on a local network the transfer time is probably lower
than the compression time, while the reverse holds over a slow Internet
connection. An auxiliary global Load function is used for this task. Its prototype
is:
Load(flags, string_data)
where flags is a combination of Boolean values, indicating whether or not
string_data is compressed, and whether or not it is encrypted (see below). The
string data is again generated with the string.format("%q") feature.
Encryption
Optionally, data can also be encrypted using an MD5 library in cipher-feedback
mode. We chose this algorithm simply because a standard Lua module exists.
The security of this algorithm is certainly enough for our application. The same
Load function is used for this case. It is not strictly required; an idiom like this
one would do the same thing, although it is a little more verbose:
loadstring(Uncompress(Decrypt(string_data)))()
Secured mode
There are two running modes for the server: secured and unsecured. The
unsecured mode is targeted to be used inside a secured local network, while
the secured mode could be opened on the whole Internet.
In secured mode, the authorized user must log in with a password before
he can make any operation. An MD5 hash of a challenge phrase is used for
the authentication procedure. Before login is complete, the only global value in
the environment is the authorization function itself. The server will check and
refuse any request that does not look like a valid authorization request.
If the login succeeds, most of the other custom functions become available, as
in the unsecured mode. Critical functions that could destroy the database may
only be executed over a secured local network.
Benchmark
A small benchmark run between the original MFC program and the Lua-based
version showed surprising results, which may be of interest to other Lua pro-
grammers. For the key feature of extracting a version, the equivalent of cvs
checkout -r tag, LCA in standalone mode happens to take roughly the same
time as its C++ counterpart. However, it uses about twice as much memory.
We can probably explain these measurements with the observations in the next
paragraph.
All CPU intensive operations (compression, checksum computation, merge)
are implemented in C code, using the same libraries. The core work of both
implementations is to build the file lists of any version from the incremental
database. For these, we mainly use strings, and data structures like arrays and
hashes. In Microsoft Foundation Classes these are implemented using CString,
CArray and CMap objects respectively. In Lua, we of course use native strings
and tables. While Lua is itself interpreted, which is a performance penalty over
C++, its native implementation of strings and tables seems to be faster than
the equivalent MFC classes. Concerning the memory usage, we can argue that
objects are not freed immediately when they are not needed anymore, like in
C++, but incrementally collected by the garbage collector. Using default settings,
the garbage collector of Lua 5.1 has a pause of 200%, meaning that it waits
until the memory used has doubled since the last collection before running
again. This factor is approximately the one we saw in the benchmark.
In client/server mode, if both programs run on the same computer, the
checkout time using raw transfers is about 20% higher than in standalone mode.
The overhead rises to about 60% when using both compressed and encrypted
transfers.
Security
Security is a major issue for any system opening sockets over the Internet. As
the protocol consists of plain scripting code, this is quite an invitation for hackers
to send malicious code to the server!
Library functions
Fortunately, Lua gives us some weapons to fight against hacking. First, in this
language we have full control of which functions are exported into the global
environment. None of the standard Lua functions is present in the environment
used to evaluate external command scripts. These functions are in reality
present in Lua state memory, but only as local variables, so there is no way
to access them even with malicious code.
Of the standard libraries, the coroutine, math, table and string libraries
are normally harmless. They are not exported nevertheless because we do not
need them, and do not want to give these facilities to the outside world.
On the other hand, the io, os, and debug libraries are very dangerous. If
a hacker has direct access to the io.open or os.execute function, he can delete
or create files on the server system nearly as he wants (limited only by the
operating system permissions). The base function dofile and the package
library may be used to run external Lua code present somewhere on the server
hard disk, provided the hacker knows where to find it.
The debug library opens a more subtle security backdoor. Using debug.getlocal,
you can access local variables, and with debug.getupvalue, non-local variables.
These are normally completely inaccessible to an outside program, but if
malicious code can use debug.getupvalue to gain access to the os table stored as
a non-local variable, we have lost the game.
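The leak is easy to reproduce. In the sketch below, make_api and its secret table are hypothetical; the table merely stands in for a library such as os that was hidden in a non-local variable. Only the standard debug.getupvalue function is used.

```lua
-- A function that keeps a sensitive table in a non-local variable.
local function make_api()
  local secret = { execute = function() return "ran" end }  -- stand-in for os
  return function(cmd)
    return secret.execute(cmd)
  end
end

local api = make_api()

-- Anyone who can call debug.getupvalue on `api` can pull the table out.
local name, value = debug.getupvalue(api, 1)
print(name, value.execute())   --> secret  ran
```

This is why the debug library must stay out of the environment offered to remote requests.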
Buffer overrun
The most common security issues in networking programs are the so-called
buffer overrun bugs. They typically occur when reading data coming from the
outside world without checking for the maximum size. A simple example in C is
this one:
int InputNumber(void)
{
    char buffer[100];
    gets(buffer);
    return atoi(buffer);
}
Instead of the expected number, a hacker can send more than 100 non-null
characters and patch the function return address normally found in memory
just after the buffer. This forces the program to jump to an arbitrary address,
typically the function buffer itself, where the hacker just placed some executable
code!
Here is the very good news: this type of bug is impossible in the Lua language
(and in most other scripting languages), provided that there are no remaining
bugs in the implementation that could be exploited by malicious people.
Because Lua has been used and tested much more than our application, we can
reasonably assume that there is no security issue related to buffer overruns.
Denial of service
A hacker may also attack a server by trying to exhaust the computer’s resources.
There are plenty of possibilities there: opening hundreds of simultaneous
connections without closing them, overwhelming the bandwidth by requesting huge
amounts of data, sending the server overly complex requests, etc. This type of
attack is not as harmful as the previous ones, because no private data can be
stolen this way from the server, and there should not be any data loss. Restart-
ing the server program (or the whole computer) is enough to recover from such
a problem. Nevertheless, it harms regular users, who won’t be able to access the
server for some period of time. Some critical services cannot afford such a risk
of interruption and must take important measures against denial of service at-
tacks. For others, it may be enough to guarantee that no data loss or corruption
occurs, and that the server will be up again in a reasonable amount of time.
Using Lua as a protocol language makes it very easy to overwhelm the server.
Here are some examples:
The first request uses 100% CPU time, and never finishes. The second
example too: as this is a tail call, it does not consume any stack space and
there is no limit on the call level. The third line will exponentially eat all the
available virtual memory on the computer (physical memory and swap file),
until an out-of-memory error occurs. This shows that loops and tail calls are
dangerous and should be forbidden. But even without loops we can achieve the
same result with a finite but long enough command, as code 4 shows. If we also
forbid concatenation, there is still the possibility of sending a huge command like
code 5, supposing the hacker has enough bandwidth and time for his attack.
It is very difficult to protect against all possible attacks. However, a small
number of checks can be done to drastically decrease vulnerability. They are not
always necessary; it depends on the application design and the desired security
level.
• Before login is complete, refuse any request not matching a strict string
pattern.
• Place a lower limit for memory allocation than the total available on the
computer. For that you just have to provide a custom lua_Alloc function to
lua_newstate, counting allocated memory.
• Run a separate program that will monitor CPU and memory usage of the
server, like top on Unix or the task manager on Windows. If this usage
goes higher than a reasonable limit, the program will kill and rerun the
server.
• Forbid some Lua virtual machine opcodes. This can be achieved by first
compiling the received chunk, then analyzing the binary code. We just
have to avoid 3.5 instructions out of 38:
The code below implements this check. Notice that it needs private headers
and structures. So it is neither portable nor advisable: use it only when
necessary.
#include "lstate.h"
#include "lopcodes.h"
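The first check in the list above, refusing any pre-login request that does not match a strict string pattern, might be sketched in Lua as follows. The request format Login("user", "digest"), the 128-byte limit, and the function name are assumptions made for the illustration; the text does not give the real format.

```lua
-- Hypothetical login request: Login("user", "32 hexadecimal digits")
local LOGIN_PATTERN = '^Login%("([%w_]+)", "(%x+)"%)$'

-- Return user and digest for a well-formed request, nil otherwise.
-- Nothing is compiled or executed here; the text is only inspected.
local function parse_login(request)
  if #request > 128 then return nil end          -- refuse oversized input
  local user, digest = request:match(LOGIN_PATTERN)
  if not user or #digest ~= 32 then return nil end
  return user, digest
end

print(parse_login('Login("alice", "0123456789abcdef0123456789abcdef")'))
print(parse_login('while true do end'))                        --> nil
print(parse_login('Login("a", "00") while true do end'))       --> nil
```

Anchoring the pattern at both ends matters: without the final `$`, extra statements could be appended after a valid-looking call.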
Conclusion
This experience shows that it is possible to use Lua as a protocol language
over an Internet socket. Such a protocol simplifies the implementation and
debugging of the communication tool, if a Lua interpreter is used for other tasks
as well. However, it is clearly not a good choice for critical services, because
the scripting language opens a number of security issues. This approach is best
targeted to quickly developed enterprise tools, run over a secured local area
network.
Lua Script Packaging
11
Han Zhao
the directory hierarchy will be lost — all source files are compiled and packed
together flatly. For simple projects, it’s possible to build a utility to map the
directory hierarchy to the flat structure of the release package. But there will be
a maintenance overhead — the developer will have to keep in mind where a file
is located in the release package. We don’t want our modularization strategy for
Mock require
We’ll leverage the module loading mechanism described above to build a
decorator around the standard require to do the extra work: plugging our module
loader into the package.preload table to load a .lua file from the package and
leaving the rest to the standard require.
The mocked require looks like this:
-- rename standard require
local lua_require = require
-- mocked require
function require(mod_name)
  -- redirect loading function to our package (.dat) loader
  package.preload[mod_name] = fio_loader
  -- Lua standard loading routine; pass its result back to the caller
  return lua_require(mod_name)
end
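The preload mechanism can be tried out without a real package file. In the sketch below, the packed table is a hypothetical stand-in for the .dat package (module name mapped to source text), and the names packed_loader and packed_require are introduced only for this illustration.

```lua
local load_string = loadstring or load   -- Lua 5.1 / later versions

-- Hypothetical stand-in for the package: module name -> Lua source text.
local packed = {
  ["demo.greeter"] = [[
    local M = {}
    function M.hello(name) return "hello " .. name end
    return M
  ]],
}

-- Loader with the signature require expects from package.preload entries.
local function packed_loader(mod_name)
  local chunk = assert(load_string(packed[mod_name], "=" .. mod_name))
  return chunk(mod_name)
end

local lua_require = require

-- Decorated require: register our loader, then defer to the standard one.
local function packed_require(mod_name)
  if packed[mod_name] and not package.loaded[mod_name] then
    package.preload[mod_name] = packed_loader
  end
  return lua_require(mod_name)
end

local greeter = packed_require("demo.greeter")
print(greeter.hello("Lua"))   --> hello Lua
```

This variant registers the loader only for names found in the package, so modules that live elsewhere still go through the normal search path.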
The package loader (fio_loader) will convert the given module name to the
form required by the package format. Then fio_loader will load the module
(.lua file) referenced by the module name from the package:
function fio_loader(mod_name)
  print('fio require '..mod_name)
In the code above FioG is a name space. The .dat package format will be
described in a later section. You can choose a common format or implement a
specific one for your project.
FioG.fio_script_dat is a global user data of type DatFileReg. It holds the
.dat package and keeps a map from module name to an offset to access a specific
module file in the package.
FioG.c_load_chunk_from_dat is an imported C function. It will locate the
script in the package, load it using lua_load, and return the result: a Lua
function or an error message. Since lua_load just loads the chunk without
running it, we need to invoke the returned function with the module name
passed in:
return ret(mod_name)
(For Lua 5.0, the module name (mod_name) can be ignored. If you’re using the
module function introduced in Lua 5.1, as in module(..., package.seeall), the
name must be passed in.)
The implementation of FioG.c_load_chunk_from_dat in C++ is:
DatFile is the actual reader of the packed script: it implements the gzip algorithm
for the .dat file. It’s held by a ZipWrap struct which is passed into lua_load as the
opaque data value for the lua_Reader callback function:
Mock dofile
The dofile function “opens the named file and executes its contents as a Lua
chunk.” Unlike require, each time dofile is invoked, a fresh copy of the script
will be loaded and executed.
dofile is useful when you use Lua for configuration or resource description.
For example, in a game project, a map is a .lua file which contains tables of
cells, items, and critters. When the player steps into a new level, dofile will
dynamically load the map file into memory to construct the level. Like mocked
require, mocked dofile also relies on FioG.c_load_chunk_from_dat, as shown
in Listing 1. Here we use an absolute path (home_dir) for dofile, e.g., to load a
game map:
dofile(home_dir..'master.dat\\level\\dungeon.lua')
Then pattern matching is applied on the path to get the directory where the
script is and the reference name (in the example, the directory is ‘master.dat’
and the reference name is ‘level/dungeon.lua’). After adjusting reference name
format, we are able to load the file from corresponding packed .dat file (here,
FioG.fio_master_dat) just like what we’ve done for mock require.
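The path-splitting step can be sketched on its own. The value of home_dir and the helper name split_path are assumptions for the example. Unlike the listing, which concatenates home_dir into the pattern, this sketch strips the prefix with a plain comparison, so that magic characters in the directory name (such as "." or "-") cannot change the meaning of the pattern.

```lua
local home_dir = "C:\\game\\"   -- hypothetical installation directory

-- Split an absolute path into the package directory and the reference
-- name inside it; returns nil for paths outside home_dir.
local function split_path(filename)
  if filename:sub(1, #home_dir) ~= home_dir then return nil end
  local rest = filename:sub(#home_dir + 1)
  -- the backslash is an ordinary character in Lua patterns
  local dir, ref = rest:match("^([%a%.]+)\\(.+)$")
  return dir, ref
end

local dir, ref = split_path(home_dir .. "master.dat\\level\\dungeon.lua")
print(dir, ref)   --> master.dat      level\dungeon.lua
```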
Sometimes it’s not necessary to pack all resource files into packages. In the
code above, when directory is ‘savegame’, we use standard dofile (renamed to
lua_dofile) to load the saved game record which is just a .lua file saved in
‘savegame’ directory. Thus we have the flexibility to load files from both the
package and the normal file system; this is a powerful mechanism for loading
Lua-described resources.
function FioG.mock_dofile()
  -- save standard dofile
  local lua_dofile = dofile
  function dofile(filename)
    -- find script directory and reference name
    local b, e, dir, ref = string.find(filename,
      home_dir..'([%a%.]+)\\([%a%p%d]+)')
    if(b) then
      -- adjust for reference name in .dat
      ref = string.gsub(ref, '/', '\\')
      local ret = nil
Listing 1.
the packaging (or deploying) strategy doesn’t interfere with the development
file organization structure. A smooth transfer from separate development files
to packed ones for release is necessary. Otherwise, we will have to manually
maintain the mapping between development structure and release package
structure.
As we’ve seen, there’s a code snippet to adjust the reference name in both mocked
require and dofile. This snippet automatically maps the development structure
to the release package structure. And the mapping preserves the directory
structure (e.g., ‘utility/utf8’) in the reference name.
For require, during development we put all the files under a directory named
‘script’, creating subdirectories for submodules if necessary. The standard
require will find and load these files based on package.path or LUA_PATH and
reference name (e.g., require 'utility/utf8').
In the release build, we pack all the files in ‘script’ directory into a .dat file
with the reference names and file sizes built in. The packed .dat file will be
loaded during startup (in a DatFileReg object named FioG.fio_script_dat in
Lua) and the standard require is also mocked here. Later when we require a
module, the mocked one will be invoked. It’ll find the script in the .dat and load
it as described in the previous section.
dofile works in a similar but more flexible way: you can choose to pack
resources into more than one packed file for a better organization.
When to mock?
It depends on your project. The following loading steps work well in several
projects:
1. Two config.lua files: one for development and one for release. Each should
contain a field ‘packed’ which is false for development and true for release.
The config.lua file works like a compiler switch for mocking.
2. Mock functions (mock require, mock dofile) in a separate .lua file. The
two mock functions are defined and loaded here.
3. Your first Lua entry file. This is the bootstrap file which requires other .lua
files. After reading the config.packed field from C++, we can decide which
loader to invoke: the standard lua_load if the project is in the development
phase, or a function like FioG.c_load_chunk_from_dat for a release. The file
should contain the following initialization code at the top:
if(config.packed) then
  mock_require()
  mock_dofile()
end
For a released project, all subsequent require and dofile calls are mocked.
.dat format
Here is a brief description of the .dat format.
.dat is a format used in Black Isle Studios’ Fallout role-playing game series
for game resource packaging. Its simple structure makes it a good candidate.
A .dat file contains a sequence of gzip compressed files with a record at the tail
describing each file’s reference name, file size, and offset.1 In our implementa-
tion, DatFileReg holds the .dat file, maintains a map from reference name to
offset; DatFile is responsible for reading a single zip file (located by a reference
name) in which a resource file is packed. The format has been well studied
and supported by the Fallout modding community — there are many .dat pack-
ers/unpackers available (both command line and GUI ones).2 It’s convenient to
choose this format instead of inventing a new one and building the tool set from
scratch.
Please note that the compiled file is still easy to reverse-engineer: there are some
decent decompilers for Lua out there. You can use the technique described here
as a starting point to build more advanced features, like access verification or
encryption, into your packaging algorithm to protect the source code.
1 .dat file format description: http://wiki.fifengine.de/index.php?title=DAT_architecture#DAT2
2 .dat file tools: http://www.teamx.ru/eng/files/utils/ F2 DAT-files packer/unpacker (DAT2),
Patching
Another consideration in choosing a packaging algorithm is patching. After
the product gets shipped, we’ll have to maintain it, fix bugs, or upgrade. The
modified script files need to be delivered to the user and installed there.
For small projects, we can just repack all files together and release the new
package. But this is not convenient or feasible for large projects. For example, in
a game project you might have packed the scripts with other resources (image,
video, sound, map files, etc.) into one file (perhaps hundreds of megabytes). It’s
awkward for a player to download a big patch just to update several kilobytes of
script files.
Depending on your project, the patching requirement might be vital. There
are two approaches for patching: replacing an existing file in the package, or
appending one or more files to the package.
The ‘append’ approach is easier to implement. If other decisions like encryp-
tion get in the way of the ‘replace’ approach, you can use the ‘append’ approach
instead: the new file will be appended at the tail of the package and you only
need to adjust the mapping from reference name to the new file’s offset without
touching the existing old file which is buried in the package.
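The effect of the ‘append’ approach on the name-to-offset map can be sketched with an in-memory model. The record layout, the file names, and the offsets below are hypothetical and do not reflect the real .dat directory structure; the point is only that a later record for the same reference name overrides the earlier one.

```lua
-- Build the map from reference name to offset from records in file order.
local function build_index(records)
  local index = {}
  for _, rec in ipairs(records) do
    index[rec.name] = rec.offset   -- a later record wins over an earlier one
  end
  return index
end

-- Hypothetical original package contents.
local records = {
  { name = "script/main.lua",         offset = 0 },
  { name = "script/utility/utf8.lua", offset = 2048 },
}

-- Patch: a new version of main.lua is appended at the tail. The old data
-- stay where they are and simply become unreachable through the index.
records[#records + 1] = { name = "script/main.lua", offset = 4096 }

local index = build_index(records)
print(index["script/main.lua"], index["script/utility/utf8.lua"])
--> 4096    2048
```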
Several utilities for .dat format have implemented both the ‘replace’ and the
‘append’ functionalities.
Conclusion
We started from Lua’s basic facilities (require, dofile, and the binding API) to
build a flexible and powerful packaging mechanism. With Lua script packaging,
we can set up a direct mapping from the development structure to the
deployment structure that hides internal organizing details from the end user. The
application will be easier to develop, deploy, and maintain.
The technique described here has been applied in two game projects and a
shareware product.
Objects, Lua-style
12
Reuben Thomas
assigns −5 to p’s x field, 3 to its y field, and “blue” to its color field. (Note
that we use a table rather than a normal argument list so that both named
and unnamed fields may be conveniently initialized.)
Strictly speaking, the above is only true when the prototype object’s _code
field is unmodified. It may usefully be overridden to add initialization code.
An object has no record of its prototype. One can be made explicitly, or a
_prototype field could be set by the default _clone method.
By convention, fields have string keys, and private fields have keys starting
with an underscore. Since private fields aren’t hidden, it’s up to the
programmer to ensure they don’t clash.
Access field: object.field
Since object fields are normal table entries, the standard syntax is used to
read and write to them.
Call method: object:method(...)
As for field access, method invocation works using the standard syntax.
Call class method: Class.method(object, ...)
The obvious way to call a class method is used: simply use dot rather than
colon notation, to pass the object explicitly.
A judicious justification
By now, some readers are probably verging on apoplexy, either because of the use
of prototypes, or because of the object model’s extreme simplicity. Such readers
may be particularly irritated that the design was foisted on them without mo-
tivation or justification. I did this because I thought it better to let the design
speak for itself before wading into the inevitable controversy. Nonetheless, the
design is a good general-purpose object implementation for Lua:
It is simple. The simple, even minimal, design fits with the Lua philosophy. For
every OO devotee bemoaning its naivety there will be a Lua purist who
thinks it’s superfluous.
Prototypes are natural in Lua. Prototypes sit well with Lua’s weakly-typed and
dynamic nature.
It works for ad-hoc wrapping. . . As well as working for pure Lua programs, this
object model can be used for ad-hoc wrapping of other object models. I’ve
used it to make OO interfaces to C structures, for example (sadly, it’s not
code I can share).
. . . but doesn’t claim to be the one true way. If you’re trying to make an exten-
sive, easy-to-use, and above all automatic wrapper for another object model,
you should be rolling your own Lua implementation, because it’s easy and
will work better than compromising with an existing model: this is a case
in which having multiple OO implementations is a good thing.
A delightful detour
Our object implementation needs some basic functions which should be in any
Lua programmer’s toolbox: clone, merge and rearrange. The following imple-
mentations are taken from the stdlib project (http://luaforge.net/projects/
stdlib). The code verges on the trivially simple; precisely for this reason I repro-
duce it here for the reader’s enjoyment. Of crucial importance is that all three
routines are functional: they do not have any side effects. Though Lua is an
imperative language, it is well-suited to a functional style, which in my opinion
should be used whenever applicable, as it encourages clear, robust and re-usable
code.
clone makes a shallow copy of a table, including any metatable:
function clone(t)
  local u = setmetatable({}, getmetatable(t))
  for i, v in pairs(t) do
    u[i] = v
  end
  return u
end
merge merges two tables. The merge, like assignment, goes right to left: fields
of the second argument override those of the first, but the result’s metatable,
if any, is that of the first argument. The left-hand argument, though, is not
overwritten.
function merge(t, u)
  local r = clone(t)
  for i, v in pairs(u) do
    r[i] = v
  end
  return r
end
rearrange rearranges the keys of a table. Its first argument is a map from old
keys to new keys, and its second argument is a table. Only the keys mentioned
in the map are rearranged.
function rearrange(p, t)
  local r = clone(t)
  for i, v in pairs(p) do
    r[v] = t[i]
    r[i] = nil
  end
  return r
end
(Note that both here and later I omit distracting details important in a
production implementation, such as packaging the code as a requirable module.
This is done in stdlib.)
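A short check, assuming the three definitions above, that the routines leave their arguments untouched. The table names (defaults, overrides, positional) are illustrative only and do not come from stdlib.

```lua
-- The three helpers, repeated so that the sketch runs on its own.
function clone(t)
  local u = setmetatable({}, getmetatable(t))
  for i, v in pairs(t) do
    u[i] = v
  end
  return u
end

function merge(t, u)
  local r = clone(t)
  for i, v in pairs(u) do
    r[i] = v
  end
  return r
end

function rearrange(p, t)
  local r = clone(t)
  for i, v in pairs(p) do
    r[v] = t[i]
    r[i] = nil
  end
  return r
end

-- Illustrative data: the second argument of merge overrides the first.
defaults = {color = "red", size = 1}
overrides = {color = "blue"}
merged = merge(defaults, overrides)
-- merged.color is "blue" and merged.size is 1; defaults.color is still "red".

-- The map {"x", "y"} renames key 1 to "x" and key 2 to "y".
positional = {10, 20}
named = rearrange({"x", "y"}, positional)
-- named.x is 10 and named.y is 20; positional[1] is still 10.
```

Because each routine works on a fresh copy made by clone, the inputs can be reused afterwards without surprises.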
Implementation
Given the functions above, the actual object implementation is brief.
Object = {
  _init = {},
  _clone = function(self, values)
    local object = merge(self, rearrange(self._init, values))
    return setmetatable(object, object)
  end,
  __call = function(...)
    return (...)._clone(...)
  end,
}
setmetatable(Object, Object)
The careful reader will want to check that the innocuous-looking first line of
_clone really does what it should, and note that the odd-looking implementation
of __call really is correct: the first (...) adjusts the list to one element, the
object, while the second passes the entire argument list to _clone, which is really
a method, so its first argument is indeed the object itself.
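To see the pieces working together, here is a small usage sketch. The Point prototype and its describe method are hypothetical names chosen for illustration; the helper functions and Object are repeated from the text so that the example is self-contained.

```lua
function clone(t)
  local u = setmetatable({}, getmetatable(t))
  for i, v in pairs(t) do u[i] = v end
  return u
end

function merge(t, u)
  local r = clone(t)
  for i, v in pairs(u) do r[i] = v end
  return r
end

function rearrange(p, t)
  local r = clone(t)
  for i, v in pairs(p) do
    r[v] = t[i]
    r[i] = nil
  end
  return r
end

Object = {
  _init = {},
  _clone = function(self, values)
    local object = merge(self, rearrange(self._init, values))
    return setmetatable(object, object)
  end,
  __call = function(...)
    return (...)._clone(...)
  end,
}
setmetatable(Object, Object)

-- Hypothetical prototype: the first two positional initializers
-- are renamed to the fields x and y by rearrange.
Point = Object {
  _init = {"x", "y"},
  describe = function(self)
    return "(" .. self.x .. ", " .. self.y .. ") " .. tostring(self.color)
  end,
}

-- Calling the prototype goes through __call and clones it.
p = Point {-5, 3, color = "blue"}
description = p:describe()
```

Note that methods are copied into each clone rather than looked up through __index, so a method added to Point after p has been created is not visible from p.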
Weaknesses
The default _clone could be made to discard excess numbered initializers, but
that feels un-Lua-ish, as it imposes behavior that is not required for correct
functioning.
There are also some obvious major omissions. First, since our objects can be
indexed just like ordinary Lua tables, there’s nothing to stop the programmer
treating them as such. In other words, we lack information hiding, one of the
main planks of object orientation. I don’t think this is a problem, however. Lua
is not designed for opacity, and is not a good choice when strong type discipline
is required. (This is not to ignore Lua’s excellence as a language for safe,
sandboxed scripting, which rests on its namespace control.)
Secondly, there is no multiple inheritance. With prototypes, multiple inheritance
is often replaced by aggregation: a number of class objects are cloned
and merged together. For example:
o = merge(c1:_clone{}, merge(c2:_clone{}, c3:_clone{}))
This could be abbreviated
o = subclass(c1, c2, c3)
where subclass is defined:
function subclass(...)
  local r = {}
  for _, c in ipairs{...} do
    r = merge(r, c)
  end
  return r
end
I didn’t include this in the package because I haven’t yet needed it.
Inconclusion
In conclusion, we can pertinently wonder whether having a general-purpose
object system in Lua is useful at all. The principal advantages of an OO style
in Lua are encapsulation of readable syntax for well-structured data types
supporting a limited range of operations. However, it’s often possible to write
shorter, clearer code without objects. For example, when processing poorly
structured, unstructured or arbitrarily structured data, such as text, tag soup or
XML, a table-oriented approach is often clearer, and a functional style briefer.
One example of this is the utility functions used to implement objects, which
perform general table operations. The use of general-purpose functions helps
ensure that the object model has no undefined behavior and is robust, as well as
avoiding the need to write special-purpose code.
The real conclusion is that Lua is flexible, and lends itself to a variety of
approaches. Each should be used when appropriate, but none taken too far.
Exceptions in Lua
13
John Belmonte
Despite the well known advantages of using exceptions for program errors, the
mechanism is underutilized in Lua — both in quantity and quality. One aspect of
this relates to the Lua core and standard library, which tend to raise exceptions
only in the most serious situations such as parse errors, type errors, and invalid
arguments. When exceptions are thrown, they are exclusively string values
which are not enumerated as part of the API. Tables, the primary data structure,
yield nil for a nonexistent key rather than raise an error. All of this leads to an
unspoken bias in Lua that exceptions are something to be thrown but rarely
caught — that they are serious errors which normally go unhandled. In the few
situations where we do catch them, no distinction is made with respect to the
cause of the error.
The core and standard libraries arguably work well as they are, and their
use of errors may not warrant meddling. But why are exceptions also under-
utilized within Lua programs and third party modules? One problem is the
unfriendliness of Lua’s protected call interface to programmers expecting a na-
tive try–catch construct. This in turn discourages library authors from using
exceptions for fear of alienating users. The inability to use coroutines within a
protected call also works to limit uptake by libraries.
Lua possesses the necessary building blocks for exceptions; however, rough
edges appear when one tries to assemble them. This perpetuates disuse of
exceptions and strengthens anti-exception patterns such as signaling errors
by way of return values. To break the cycle, we first need to promote idioms
and know-how for richer use of exceptions. As more Lua developers encounter
the same rough spots, the necessary motivation will exist for some incremental
improvements in the core and language itself.
This gem intends to start the process by presenting some exception tools and
know-how for Lua. First we spell out a criterion as to when a function should raise
an exception versus simply return an error status. For handling the exceptions,
we present a simple try–catch idiom that works with today’s stock Lua. We then
cover why custom error objects are important and address gaps in Lua regarding
their use. Finally, we set out to find the right pattern for exception safety in Lua.
What is an error?
What failure situations should be considered a first-class error, warranting the
use of exceptions? Calling a function with invalid arguments is an obvious
error. In contrast, a negative result from a string matching function is normally
not considered an error. In between these is an expanse of various error-like
situations. What about an attempt to append to a read-only file; a failed hash
table lookup; a database conflict; or an HTTP connection failure? We need a
guideline for evaluating these.
On this subject, “Programming in Lua” suggests that if an error cannot be
easily avoided, it should be signaled with a return code rather than exception.
This logic is geared towards letting you handle error situations without the need
for a try–catch — a decidedly conservative view on the use of exceptions. What
effect does it have on a program?
Let’s consider a Lua program which outputs the length of a file given its
name on the command line:
local f = io.open(arg[1])
local length = f:seek('end')
print(length)
The program lacks error handling — it may be the work of a novice program-
mer or a lazy expert programmer. How does it behave when things go wrong?
Let’s try an input file, “abc”, which doesn’t exist:
The good news is that an unhandled exception occurred, causing the program
to return a non-zero exit code. This is the bare minimum behavior we need
from a command-line program on error. The error message, however, is not very
helpful. In this simple program we can look at the source code and quickly
deduce that io.open returned nil instead of a file object, causing an error on
call of the seek method. In a complex program, debugging could be much more
difficult. The file object could be passed to a different place in the program, and
perhaps not used until long after the io.open call.
Wrapping the io.open call in assert would address this error situation,
producing an exception with an accurate location and message.1 However, the
novice programmer didn’t consider that, and the expert programmer either
didn’t think his program would be used so foolishly, or didn’t care. In large
programs such negligence can go unnoticed until a certain obscure code path
is encountered. Arguably, it’s better not to present the opportunity for an
oversight.
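As a minimal sketch of the assert wrapping mentioned above: the function name file_length and the file name used to provoke the failure are assumptions made for illustration, and the exact wording of the message comes from the operating system.

```lua
-- Returns the length in bytes of the named file. If the file cannot be
-- opened, assert raises an error carrying io.open's own message.
function file_length(name)
  local f = assert(io.open(name, "rb"))
  local length = f:seek("end")
  f:close()
  return length
end

-- A name assumed not to exist, used only to show the failure path.
ok, message = pcall(file_length, "no-such-file.abc")
-- ok is false and message names the file that could not be opened.
```

The failure is now reported at the point where the file was requested rather than at the first later use of the nil value.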
A more liberal guideline for errors is this: if a failure situation is most
often handled by the immediate caller of your function, signal it by return
value. Otherwise, consider the failure to be a first-class error and throw an
exception. The effect is to use exceptions when errors are communicated two
levels up the call stack or higher (including possible program termination). This
is intended to extract the best value from exceptions. When an error is likely to
traverse several levels, we relieve intermediate code from having to propagate
the error — a task which is error prone and clutters both code and API. On
the other hand, when a failure is usually consumed by the caller, we spare the
extravagance and expense of a throw and catch.
What is the outcome when this guideline is applied to io.open? It’s sub-
jective, but programs usually have a strong dependency on the files they open.
When a problem occurs — whether it be a full storage device, permission error,
or missing file — it tends to require handling at a high level in the program, if
it is handled at all. It’s a good guess that the error will be traveling up past the
immediate caller of the I/O function.
1 Wrapping a call with assert assumes it follows the convention of returning a nil and error
message tuple on failure. The convention can’t be used, however, if nil or false happen to be valid
outputs. It can also interfere with code readability when a function has multiple outputs and the
caller elects not to wrap with assert (e.g., a function returns coordinates x and y, but on error y
doubles as a message).
try(function()
  -- Try block
end, function(e)
  -- Catch block. E.g.:
  --   Use e for conditional catch
  --   Re-raise with error(e)
end)
The catch function, should it be invoked, receives the error object as an ar-
gument. After inspecting the error, it can elect to either suppress the exception
by taking no action, re-raise the existing error, or throw a different error.
A notable limitation of using functions to define our code blocks is that flow
control statements, such as return and break, cannot cross outside the try–catch.
For example, the following code would not work as expected:
function foo()
  try(function()
    if some_task() then
      return 10 -- does not cause foo() to return 10
    end
  end, function(e)
    -- ...
  end)
  return 20
end
Lua’s pcall operates by calling the function given to it. Any exception will
be trapped, returning false and the error object. Based on that, the definition of
our try function is trivial:
function try(f, catch_f)
  local status, exception = pcall(f)
  if not status then
    catch_f(exception)
  end
end
Unfortunately coroutines do not mix well with pcall, so this will preclude
their use within our try block. The problem is well known and has various
workarounds, ranging from a pcall replacement implemented in pure Lua to
an extensive Lua core patch.
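The three options open to a catch function, and the flow-control limitation, can be exercised with the try function defined above. This is a sketch only: the error messages are invented, and error is called with level 0 so that no position prefix is added to the strings.

```lua
function try(f, catch_f)
  local status, exception = pcall(f)
  if not status then
    catch_f(exception)
  end
end

-- Suppressing an error: the catch block only records it.
caught = nil
try(function()
  error("disk full", 0)
end, function(e)
  caught = e
end)

-- A return inside the try block leaves only the anonymous function,
-- so foo still returns 20.
function foo()
  try(function()
    return 10
  end, function(e) end)
  return 20
end

-- Re-raising: the error passes through try and reaches the outer pcall.
ok, err = pcall(function()
  try(function()
    error("unexpected", 0)
  end, function(e)
    error(e, 0)
  end)
end)
```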
try(function()
  do_transaction()
end, function(e)
  log('Retrying database transaction')
  do_transaction()
end)
The issue here is that we end up retrying the transaction not only when
there is a database problem, but also for any other error. This could mask
bugs such as calling a function with the wrong arguments, producing strange
program behavior. Clearly we want to be more selective by handling only the
errors we understand and letting the rest pass through. Given the common
practice of throwing strings, however, this becomes tricky. We are faced with
fragile parsing of exception messages which may change in the future, especially
if they originate from a third party’s module.
To address this problem, we take advantage of the often-overlooked ability of
error to throw values other than strings. A table is the natural choice, leaving
room for expanded functionality by way of methods and internal state. The
database module might simply contain the following definition:
ConflictError = {}
This approach serves not only to allow positive identification of an exception,
but also to enumerate the errors which can be raised by a module — it should be
considered part of the API. Now the database module can signal a conflict with
error(ConflictError) and our catch function can be refined as follows:
function(e)
  if e == db.ConflictError then
    log('Retrying database transaction')
    do_transaction()
  else
    error(e) -- re-raise
  end
end
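A self-contained sketch of this selective catch follows. The db table, the stand-in do_transaction, and the helper run_with_retry are hypothetical; a real database module would supply its own.

```lua
function try(f, catch_f)
  local status, exception = pcall(f)
  if not status then
    catch_f(exception)
  end
end

-- Hypothetical database module exposing its error as part of its API.
db = { ConflictError = {} }

-- Stand-in transaction: conflicts on the first attempt, succeeds afterwards.
attempts = 0
function do_transaction()
  attempts = attempts + 1
  if attempts == 1 then
    error(db.ConflictError)
  end
end

-- Runs task, retrying once on a conflict and re-raising anything else.
function run_with_retry(task)
  try(task, function(e)
    if e == db.ConflictError then
      task()
    else
      error(e, 0)  -- level 0 leaves a string message unchanged
    end
  end)
end

run_with_retry(do_transaction)

-- An unrelated error is not mistaken for a conflict.
ok, err = pcall(run_with_retry, function() error("wrong argument", 0) end)
```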
A new problem is lurking however. What if the database conflict should go
unhandled? Let’s simulate the situation in Lua’s interactive interpreter:
> error({})
(error object is not a string)
Unfortunately, the uncaught exception handler which lives inside Lua’s stan-
dard interpreter refuses to do anything with a non-string error value. We’re
missing the human-readable message and call stack which are essential for lo-
cating the source of the error. The required improvement to the interpreter
is minor however: just pass the error value through tostring before invoking
debug.traceback. This change is planned for the next version of Lua. With this
change in the interpreter, and by enhancing our error object with an appropriate
__tostring metamethod, the behavior becomes:
While this is a significant improvement, there is still one detail missing from
the trace: the file name and line number of the exception. Normally, with
a string value, the error function adds this information at the point of the
exception by prefixing it to the string. For other value types the location is
omitted. While the association between an error and its location might best be
maintained by the Lua core, such a change would be substantial. A compromise
is to alter the error function to store the location in a field (assuming the value
is a table) and have it picked up by the interpreter’s handler. This location fix
and the aforementioned tostring fix are available together as a “custom errors”
patch to Lua. (See http://lua-users.org/wiki/LuaPowerPatches.)
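Without patching Lua, part of this can be approximated in user code. The sketch below is an assumption-laden illustration, not the patch itself: the throw helper and the where field are hypothetical names, and the location is gathered with debug.getinfo from the standard debug library.

```lua
-- Hypothetical error class whose instances carry a message and a location.
ConflictError = {}
ConflictError.__index = ConflictError
ConflictError.__tostring = function(e)
  return (e.where or "?") .. ": " .. e.message
end

-- Hypothetical helper: creates an instance, records where it was thrown,
-- and raises it. Level 2 is the function that called throw.
function throw(class, message)
  local info = debug.getinfo(2, "Sl")
  local e = setmetatable({message = message}, class)
  if info then
    e.where = info.short_src .. ":" .. info.currentline
  end
  error(e)
end

ok, e = pcall(function()
  throw(ConflictError, "database conflict")
end)
text = tostring(e)  -- location prefix followed by the message
```

A handler that passes the error value through tostring then sees both the location and the message, much as it would for a string error.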
Continuing with our database application, suppose we wish to catch any
exception specific to the database module. Or perhaps the module author decides
to distinguish between read and write conflicts using separate error types, while
our handler remains interested in both cases. It would be unfortunate to have
to spell out each error to be caught when all we mean is “any database module
error” and “any conflict error”, respectively. This suggests the need for an error
hierarchy, where we can test if a certain instance belongs to a given class of
errors.
In other languages, an error hierarchy tends to be defined by class inheri-
tance. In Lua we are free to do the same, but without a standard class system
the error values from various modules and our own code will lack a common root
and API. As a compromise, the database module author might make a utility
available for testing inheritance among the module’s own objects. The imple-
mentation should be robust, yielding a negative result for foreign values. Our
catch function then becomes:
function(e)
  if db.instance_of(e, db.ConflictError) then
    log('Retrying database transaction')
    do_transaction()
  else
    error(e)
  end
end
having internal state, a new instance must be created for each exception thrown.
In that case equality cannot be used to identify the exception.
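One possible shape for such a utility is sketched below. Everything here is an assumption made for illustration: the class and parent field names, the new_error constructor, and the particular error classes. Raw access is used so that foreign values with unusual metatables cannot interfere.

```lua
db = {}

-- Hypothetical error classes; each names its parent class, if any.
db.Error = {}
db.ConflictError = {parent = db.Error}
db.ReadConflictError = {parent = db.ConflictError}
db.WriteConflictError = {parent = db.ConflictError}

-- Creates a fresh instance so that each exception can carry its own state.
function db.new_error(class, message)
  return {class = class, message = message}
end

-- True if e is an instance of class or of one of its descendants.
-- Foreign values (strings, numbers, unrelated tables) yield false.
function db.instance_of(e, class)
  if type(e) ~= "table" then return false end
  local c = rawget(e, "class")
  while type(c) == "table" do
    if c == class then return true end
    c = rawget(c, "parent")
  end
  return false
end

e = db.new_error(db.ReadConflictError, "row 7 changed during read")
```

With this in place a handler can test for db.ConflictError and catch both read and write conflicts, or test for db.Error and catch anything the module raises.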
The argument for custom errors is that a human-readable error message,
while essential, should be only one component of a richer error object. Errors
should be enumerated as part of an API, providing the ability to positively
identify exceptions and perhaps locate their place within a hierarchy of errors.
Custom error objects can also serve to store arbitrary state at the time of an
exception, which may be useful for debugging and error reporting. All of this is
light work for Lua tables, although the need for hierarchy testing does present
an interoperability issue between modules.
Exception safety
With exceptions comes the issue of exception safety — proper cleanup of acquired
resources and program state when an exception does occur. Acquired resources
might include memory allocated from special pools, device handles, and mutex
objects. Consider the following simplistic function to paint a logo onto the screen:
function display_logo(display_buffer, x, y)
  local canvas = allocate_canvas(50, 50)
  render_logo(canvas)
  display_buffer:lock()
  display_buffer:copy(canvas, x, y)
  display_buffer:unlock()
  canvas:free()
end
During the course of this function we acquire a graphic canvas (perhaps
off-screen video memory) and a lock on the display buffer. If the render_logo
function happens to throw an exception then the canvas may not be freed
in a timely manner — it may happen automatically when the canvas value is
garbage collected, but we don’t know when that will be. More seriously, if the
display_buffer:copy call throws an exception because the input coordinates
are out of range, the display is never unlocked. Clearly, if resources like this are
going to be exposed to the scripting environment, we need a way to free them
even if an exception occurs.
Even if we decide not to expose management of critical resources to scripting,
there are common cases where we must ensure that some program state is
restored despite an error. Say we’d like the text output of a certain third party
function directed to a file, but the module has been hard-coded to use standard
output. We could work around the limitation by changing Lua’s default output
temporarily:
local out = io.output()
io.output(log_file)
somelib.do_task()
io.output(out)
The problem here is that if do_task throws an exception, the default output
will never be restored. One may argue that restoring this state doesn’t matter
because the process will be terminated anyway. This overlooks the possibility
that the exception may be handled at a higher level in the execution stack, al-
lowing the program to continue. That a certain error is too dire to be intercepted
usually turns out to be a myopic view since, at the highest level of the program,
there are always options such as reattempting an operation or switching to a
failover routine. This makes proper exception safety especially important when
implementing a library, where the author cannot imagine all usage scenarios.
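For this particular case, a direct repair with pcall is already possible. The following is a minimal sketch; the helper name with_output is hypothetical, and a temporary file stands in for the log file.

```lua
-- Runs task with Lua's default output redirected to file, restoring the
-- previous default output whether or not task raises an error.
function with_output(file, task)
  local out = io.output()
  io.output(file)
  local ok, err = pcall(task)
  io.output(out)
  if not ok then error(err, 0) end  -- re-raise after cleanup
end

log_file = io.tmpfile()  -- stand-in for a real log file
before = io.output()
ok, err = pcall(with_output, log_file, function()
  io.write("partial output\n")
  error("task failed", 0)
end)
after = io.output()
-- The error still propagates, but the default output is back in place.
```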
Now that we’ve identified the need for exception safety, how is it accom-
plished? The solutions are all variations on one theme: install cleanup code
to be run on exit of the current scope, whether that be normally or by exception.
Traditionally, programming languages have two mechanisms available for this.
One is the try-finally construct, where the scope is defined by a “try” block, and
the cleanup code placed in a “finally” block which always runs afterward. As a
language construct, however, try-finally has fallen out of fashion. To consider
why, let’s pretend that Lua supported try..finally..end and apply it to our
display_logo function:
function display_logo(display_buffer, x, y)
  local canvas = allocate_canvas(50, 50)
  try -- no such thing in Lua
    render_logo(canvas)
    display_buffer:lock()
    try
      display_buffer:copy(canvas, x, y)
    finally
      display_buffer:unlock()
    end
  finally
    canvas:free()
  end
end
The issue is one of code readability and maintenance. A nested try-finally is
needed for each consecutive resource acquired, making the flow of the original
program difficult to follow. Moreover, although having “try” come before “finally”
is most intuitive and the common layout, it tends to maximize the distance
between acquisition and cleanup code. This problem becomes more pronounced
as the size of the function grows — to the point where the programmer cannot
see them on the screen together, and could modify one without considering the
other.
The other traditional mechanism for exception safety is the use of a custom
object which is referenced solely by the local scope. The cleanup code exists
in the destructor of the object so that it will be invoked as the value goes out
of scope. Rather than defining an ad hoc type for each cleanup situation —
which becomes verbose and a burden to maintain — the use of a generic “scope
manager” object is becoming common (e.g., the C++ scope guard pattern, or the
D “scope” statement). A scope manager allows the registration of arbitrary code
which will be called at scope exit. Since registration can take place multiple
times and throughout the scope, it enables natural placement of cleanup code. In
some languages it’s possible for the manager to know if the scope exited normally
or by exception, further enhancing the utility of this pattern.
Unfortunately, Lua provides no way to hook into scope exit.2 As object de-
struction is subject to the whim of the garbage collection system, the trick of
using an object referenced only by the local scope does not provide deterministic
cleanup. As in our try–catch implementation, however, it’s possible to approxi-
mate such a hook by way of an explicit function scope and pcall. We’ll use that
to create a simple scope manager in Lua for our cleanup needs.
function display_logo(display_buffer, x, y)
  scope(function()
    local canvas = allocate_canvas(50, 50)
    on_exit(function() canvas:free() end)
    render_logo(canvas)
    display_buffer:lock()
    on_exit(function() display_buffer:unlock() end)
    display_buffer:copy(canvas, x, y)
  end)
end
from Lua 5.1. I hope that this can be resolved in a future version of the language — perhaps by
creating a new class of variable which notifies its value when it goes out of scope, or by adding a
construct along the lines of Python’s “with” statement.
again it offers more flexibility on placement of cleanup code. Here is the scope
function implementation:
function scope(f)
  local function run(list)
    for _, f in ipairs(list) do f() end
  end
  local function append(list, item)
    list[#list+1] = item
  end
  local success_funcs, failure_funcs, exit_funcs = {}, {}, {}
  local manager = {
    on_success = function(f) append(success_funcs, f) end,
    on_failure = function(f) append(failure_funcs, f) end,
    on_exit = function(f) append(exit_funcs, f) end,
  }
  local old_fenv = getfenv(f)
  setmetatable(manager, {__index = old_fenv})
  setfenv(f, manager)
  local status, err = pcall(f)
  setfenv(f, old_fenv)
  -- NOTE: behavior undefined if a hook function raises an error
  run(status and success_funcs or failure_funcs)
  run(exit_funcs)
  if not status then error(err, 2) end
end
Like the try–catch implementation, this scope hook suffers from an incom-
patibility with coroutine yield, and the inability to use flow control statements
across the scope’s boundary (i.e., return, break, etc.). A more fundamental limi-
tation exists however: cleanup code itself must not raise an exception. Allowing
this would create at least two ambiguities: 1) if an exception happens in one
piece of cleanup code, should the entire cleanup contract be invalidated? 2) if
there are multiple, logically parallel exceptions, which is to be propagated? The
situation is best avoided and, in the implementation presented, its behavior is
left undefined.
A slightly different design for the scope utility would be to pass the manager
object to the user’s function as an argument. Besides eliminating the complexity
of making on_exit and the other registration methods appear implicitly within
the function, this would allow the manager to be passed to utility functions.
For example, the allocate_canvas function could take a scope manager as an
optional argument, and in that case register the canvas cleanup code for us.
On the other hand, the explicit manager variable makes the user’s code more
verbose in the simple case, and opens the door for confusion should someone try
to operate on the manager of an already expired scope.
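The alternative design can be sketched as follows. This is one possible reading of the idea rather than a tested library: the manager is passed as the argument m, the resources are stand-ins that only record events, and hooks run in registration order as in the implementation above. Since it does not touch function environments, it does not depend on getfenv and setfenv.

```lua
-- Variant of scope that hands the manager to the user's function
-- instead of changing the function's environment.
function scope(f)
  local function run(list)
    for _, hook in ipairs(list) do hook() end
  end
  local success_funcs, failure_funcs, exit_funcs = {}, {}, {}
  local manager = {
    on_success = function(h) success_funcs[#success_funcs + 1] = h end,
    on_failure = function(h) failure_funcs[#failure_funcs + 1] = h end,
    on_exit    = function(h) exit_funcs[#exit_funcs + 1] = h end,
  }
  local status, err = pcall(f, manager)
  -- As before, behavior is undefined if a hook function raises an error.
  run(status and success_funcs or failure_funcs)
  run(exit_funcs)
  if not status then error(err, 0) end  -- level 0 keeps the message as is
end

-- Usage with stand-in resources that only record what happened.
events = {}
ok, err = pcall(scope, function(m)
  events[#events + 1] = "allocate canvas"
  m.on_exit(function() events[#events + 1] = "free canvas" end)
  events[#events + 1] = "lock display"
  m.on_exit(function() events[#events + 1] = "unlock display" end)
  m.on_failure(function() events[#events + 1] = "log failure" end)
  error("coordinates out of range", 0)
end)
```

The cost is visible in the user's code: every registration now goes through m, which is the verbosity the text warns about.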
This pattern to assist with exception safety is the final component in our bag
of exception tools. Combined with custom error objects, which allow discern-
ing between errors, and a try–catch construct implemented in pure Lua, pro-
grammers can explore richer use of exceptions in their programs and libraries.
Limitations and rough spots exist for sure, but hopefully this situation is tempo-
rary — the authors of Lua have a good track record of improving the flexibility
of the language and its implementation over time.
Part III
Algorithms and Data Structures
14
Word Ladders
Gavin Wraith
Word Ladders or Doublets is a word game whose invention has been attributed
to Lewis Carroll. The idea is to transform one word into another by changing
only a single letter at each step.
Here is a simple example:
BEST → PEST → POST → POSE → ROSE → RISE → RISK
We present a small Lua program which, given a lexicon of words as a com-
mand line argument, takes a word from the standard input and prints the words
in the lexicon that can be obtained from it by such transformations. This pro-
gram is really just an excuse for presenting a more abstract application: a mod-
ule for calculating the strata of the connected component of a vertex in an undi-
rected graph.
Undirected graphs
From the earliest days of programming this game has been a vehicle for demon-
strating algorithms about undirected graphs. An undirected graph may be de-
fined as a set of vertices together with a Boolean-valued function on the set of
unordered pairs of distinct vertices, which tells which vertices are joined by an
edge. In what follows the term graph should be read as undirected graph.
If two vertices can be joined by a sequence of edges then we say that they
belong to the same component of the graph. Given a lexicon of words we can
construct a graph whose vertices are words, where two vertices are joined by an
edge if the corresponding words differ by only one letter. The Word Ladder game
is about finding a path from a given word to another.
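The edge relation can be written down directly. The function below is a sketch under the assumption that words consist of single-byte letters; the name differ_by_one is chosen for illustration and is not the pattern-based test used by the ladder program.

```lua
-- True if the two words have the same length and differ in exactly one
-- position. Compares bytes, so it assumes single-byte letters.
function differ_by_one(x, y)
  if #x ~= #y then return false end
  local differences = 0
  for i = 1, #x do
    if x:byte(i) ~= y:byte(i) then
      differences = differences + 1
      if differences > 1 then return false end
    end
  end
  return differences == 1
end
```

The relation is symmetric, and a word is never joined to itself, which is what the definition of an undirected graph on distinct vertices requires.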
The mathematical notion of graph is an abstraction. The Word Ladder
game has various features which are thrown away by this abstraction — the
significance of the words, for example, or the fact that we can enumerate words
in lexicographic order. If a program is to be reusable it should be as faithful as
possible to the mathematical abstraction, and avoid building in specific features
of this or that example. However, there is a problem. Mathematicians tend to
use the notion of set in contrast to the notion of enumerated set. Programming
languages, on the other hand, having developed in an era of serial processors,
are usually better adapted to handle enumerated sets. Of course, every finite set
can be enumerated in some way or other (and if you believe the axiom of choice,
every set can be totally ordered), but if a problem does not mandate a particular
enumeration it seems rather ugly to have to choose one in order to program a
solution.
The fundamental datatype in Lua is the table. Any value except nil can be
a key in a table. We can use tables to represent both enumerated sets, using
integers starting from 1 as keys, and non-enumerated sets by taking as the
elements keys with value true. Lua has two iterator functions for tables: ipairs
iterates over integer keys in order starting from 1 and so is appropriate for
enumerated sets, and pairs iterates over all keys but in no predictable order,
and so is appropriate for non-enumerated sets. This unpredictability is the
serial processor’s apology, so to speak, to the parallel world of mathematical
sets (logicians may be thinking of permutation models at this point).
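The two representations look like this in practice. The words are taken from the example ladder; the variable names are illustrative.

```lua
-- An enumerated set: integer keys from 1, visited in order by ipairs.
rungs = {"best", "pest", "post"}
ordered = {}
for i, word in ipairs(rungs) do
  ordered[i] = word
end

-- A non-enumerated set: the elements are the keys, each with value true.
lexicon = {best = true, pest = true, post = true}
count = 0
for word in pairs(lexicon) do  -- visiting order is unspecified
  count = count + 1
end

-- Membership is a single table lookup.
is_member = lexicon["pest"] == true
is_absent = lexicon["rose"] == nil
```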
Although a component of a graph may have no natural enumeration, once
we have chosen a particular vertex in it we get a natural stratification of it. The
first stratum consists just of the chosen vertex. The n-th stratum consists of
those vertices having a shortest path of n − 1 edges to the chosen vertex. Each
stratum has no particular enumeration itself, and so should be represented by a
Lua table whose values we may take to be true. The collection of all the strata
does have an enumeration, and so is represented as an array of tables.
Listing 1 shows a module, graph, that calculates the array of strata, given
a graph and a vertex of it. A graph is a table with keys vertex and edge. The
value for vertex is a true-valued table. The value for edge is a Boolean valued
function on pairs of distinct keys of the vertex table. It should be symmetric,
that is to say if g is a graph then g.edge(x,y) and g.edge(y,x) should have the
same value for any keys x, y of g.vertex.
The function graph.component makes a copy of the table mygraph.vertex and
points the local variable vertex at it. It initializes the strata array to a single
stratum containing just the start node, and then removes the start node from
vertex. It defines a local function, more, which searches the nodes of vertex to
see if they are joined by an edge to the current stratum. If they are, they are
removed from vertex (which is why we had to make a copy of mygraph.vertex)
and put into the next stratum. This avoids redundancy. The more function is
called repeatedly until no more strata can be found.
-- graph
--[[component of start node in graph as a list of sets of nodes]]
local pairs,insert = pairs,table.insert -- needed: module hides the globals
module "graph"
component = function(mygraph,start)
  local vertex,edge = {},mygraph.edge
  for node,val in pairs(mygraph.vertex) do
    vertex[node] = val
  end -- for
  local strata = {{ [start] = true }}
  vertex[start] = nil
  local more = function()
    local new,change = {},nil
    for x,_ in pairs(strata[#strata]) do
      for y,_ in pairs(vertex) do
        if edge(x,y) then -- x cannot equal y
          change = true
          new[y] = true
        end -- if
      end -- for
    end -- for
    if change then
      insert(strata,new)
      for x,_ in pairs(new) do vertex[x] = nil end -- for
    end -- if
    return change
  end -- function
  repeat until not more() -- add new vertices while possible
  return strata
end -- function
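As a quick check of the stratification, the following sketch restates component as a
plain local function (dropping the module call, so that it runs on any Lua 5.x) and
applies it to a small path graph. The graph g, its vertex labels and the helper table
order are made up for illustration.

```lua
-- Stand-alone restatement of graph.component, for illustration only.
local function component(mygraph, start)
  local vertex, edge = {}, mygraph.edge
  for node, val in pairs(mygraph.vertex) do vertex[node] = val end
  local strata = {{ [start] = true }}
  vertex[start] = nil
  local function more()
    local new, change = {}, false
    for x in pairs(strata[#strata]) do
      for y in pairs(vertex) do
        if edge(x, y) then change = true; new[y] = true end
      end
    end
    if change then
      strata[#strata + 1] = new
      for x in pairs(new) do vertex[x] = nil end
    end
    return change
  end
  repeat until not more()
  return strata
end

-- Hypothetical graph: a path a - b - c - d plus an isolated vertex z.
local order = { a = 1, b = 2, c = 3, d = 4, z = 10 }
local g = {
  vertex = { a = true, b = true, c = true, d = true, z = true },
  edge = function(x, y) return math.abs(order[x] - order[y]) == 1 end,
}

local strata = component(g, "b")
-- stratum 1 is {b}, stratum 2 is {a, c}, stratum 3 is {d}; z is never reached
for i, stratum in ipairs(strata) do
  for v in pairs(stratum) do print(i, v) end
end
```

Within a stratum the printing order is unspecified, which is exactly the point made
above about pairs and non-enumerated sets.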
$$nm + n^2 + \frac{\sum_i x_i^2}{2}$$
Summary
The graph module and the ladder program that uses it are tiny and straightfor-
ward, and in themselves unremarkable. They are simply pegs on which to hang
some observations. Readers are, of course, free to quarrel with my personal
preferences.
-- ladder
-- arg[1] holds pathname of lexicon
do
  local read,lines = io.read,io.lines
  print "Enter a word from the lexicon"
  local startword = (read()):lower()
  local n = #startword
  local pat = ("%a"):rep(n) -- pattern for words of same size
  local lexicon = {}
  for line in lines(arg[1]) do
    line:gsub(pat,function(word) lexicon[word] = true end)
  end -- for
  -- add vary to string library
  string.vary = function(s,i) -- pattern - vary i-th char
    return s:sub(1,i-1).."."..s:sub(i+1,-1)
  end -- function
  local differby1 = function(x,y)
    local n = #x
    for i = 1,n do
      if y:match(x:vary(i)) then return true end -- if
    end -- for
    return false
  end -- function
  local wordgraph = { vertex = lexicon; edge = differby1; }
  local printout = function(strata)
    for i,stratum in ipairs(strata) do
      for word,_ in pairs(stratum) do
        print(i,": ",word:upper())
      end -- for
    end -- for
  end -- function
  require "graph"
  printout(graph.component(wordgraph,startword))
end -- do
• I am proud that in Lua functions are first-class values. For that reason
I make it clear that function definitions are assignments, and I do not use
the syntactic sugar provided to conceal this fact, lest it be misinterpreted
as apologetic. If you have got it, flaunt it!
The choice of data structures and the division of labour between main program
and required modules should follow from analysis of the abstractions thrown up
by the problem. In this case the analysis is the trivial fact that a vertex in a
graph stratifies the component in which it lives. It tells us loud and clear that
we need a function that returns a list of tables and not just their union, as a
superficial reading of the task might suggest.
15
Building Data Structures
and Iterators in Lua
Luis Carvalho
Besides being lightweight and fast, Lua is also highly regarded for being an
elegant and expressive language. The main purpose of this gem is to reinforce
this impression by showing how powerful and straightforward the concerted
application of tables, metamethods, and coroutines is in the implementation of
complex data structures and their iterators.
Here we implement a graph object module where graphs are modeled us-
ing a vertex set, a weighted edge/arc set and adjacency lists (set objects are as
described in “Programming in Lua” (PiL)). Graph vertex sets can be iterated
by depth first search (DFS), breadth first search (BFS), and topological sorting,
which illustrate well the application of closures and coroutines. Moreover, classi-
cal routines for shortest paths from a vertex and minimum spanning tree (MST)
are also provided. These routines require additional data structures in order to
achieve optimal time complexity: queues (similar to the ones implemented in
PiL) are used in the BFS iterator, while heaps and partition sets (both as trees
with special properties) are used in the MST routine. Each data structure, on its
own, comprises an individual pure Lua module whose methods satisfy the usual
colon calling convention for objects. A few examples and direct applications of
the routines are also presented.
Introduction
The study and development of algorithms and data structures are tightly cou-
pled together: appropriate data structures are the core of well designed, optimal
algorithms. Besides designing suitable data structures, it is also important to
use a programming language that allows easy specification and implementation
of a data structure and that is rich and expressive enough to favor the realiza-
tion of abstract concepts and possible future extensions. This gem aims to show
that Lua is such a language: with its many appealing resources — including
a powerful unique data structure building block, the table, metamethods, and
coroutines — we can implement many data structures effortlessly.
The main data structure portrayed in the text is a graph, but we also present
others in order to solve some classical graph problems efficiently. A complete
listing of all routines, which are encapsulated in modules, one per data struc-
ture, can be found in the gem repository.
Although we provide an explanation for each method in the text, we try to
be as terse as possible to keep the text short. Familiarity with data structures
and the design of algorithms is highly desirable (this way you can concentrate on
enjoying the Lua code!) but not necessary; the reader can refer to the many
excellent books that cover these topics for more details¹. We should also
mention the excellent “Programming in Lua” (PiL)², by Roberto Ierusalimschy,
one of Lua’s creators: we make many references to it, and try to draw inspiration
from it whenever possible.
Queues
Our first data structure is a queue, a list in which elements are always inserted
at the rear and retrieved from the front, that is, in a first-in, first-out fashion.
Although simple, our queue implementation will be useful later in this chapter
when we talk about graph traversal, and it can serve as a good warm-up exercise.
The implementation comprises a module Queue, very similar to the one in PiL: a
queue object is a table with two pointers, first and last, indicating the current
positions of the queue’s front and rear. A queue is then initialized by
where we use __index to enable colon call notation in our objects. Note that
getfenv takes 1 as default argument and returns the current, module, environment.
The methods in the environment are the typical insert and retrieve, as
presented in PiL³,
1 We particularly recommend the classical “Design and analysis of computer algorithms”, by Aho,
Hopcroft and Ullman, and the more modern “Introduction to algorithms”, by Cormen, Leiserson,
Rivest, and Stein.
2 PiL’s first edition is available online at http://www.lua.org/pil.
Thanks to the __index metamethod we are able, for example, to simply issue
Q:retrieve() instead of Queue.retrieve(Q). We will follow this practice
of defining __index as the module environment for all objects in this chapter,
so the queue module gives the reader a good opportunity to become familiar
with this prototype-based convention⁴.
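The queue listings themselves are not reproduced in this text, so here is a minimal
sketch of what such a module could look like. It is an assumption-laden
reconstruction: a local table Queue stands in for the module environment (the text
obtains it with getfenv, which only exists in Lua 5.1), and isempty is a
hypothetical helper name.

```lua
local Queue = {}   -- plays the role of the module environment

function Queue.new()
  -- first and last delimit the occupied slots; the queue is empty when first > last
  return setmetatable({first = 1, last = 0}, {__index = Queue})
end

function Queue.isempty(Q)
  return Q.first > Q.last
end

function Queue.insert(Q, v)      -- add v after the newest element
  local last = Q.last + 1
  Q[last] = v
  Q.last = last
end

function Queue.retrieve(Q)       -- remove and return the oldest element
  local first = Q.first
  assert(first <= Q.last, "queue is empty")
  local v = Q[first]
  Q[first] = nil                 -- let the slot be garbage-collected
  Q.first = first + 1
  return v
end

local Q = Queue.new()
Q:insert("a"); Q:insert("b"); Q:insert("c")
local x, y = Q:retrieve(), Q:retrieve()
print(x, y)                      -- elements leave in insertion order
```

Because the metatable's __index points at Queue, the calls Q:insert(v) and
Queue.insert(Q, v) are interchangeable.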
Heaps
We now implement a heap data structure, which is a binary tree stored in
an array. With each node in the tree we associate a numeric record field such
that the tree satisfies the (min) heap property: for every node v, record(v) ≤
min{record(l), record(r)}, where l and r are the left and right children of v. If a
tree is a heap, its root contains the smallest record over all nodes in the tree.
Heaps have many applications, but in this gem we focus on heaps as priority
queues, that is, as a data structure for keeping a set of objects ordered by their
record field. Priority queues will be at the core of our solutions to the minimum
spanning tree and shortest path problems in a later section about graphs. Since
records in a heap represent some object feature (such as the time of occurrence
of events in a simulation engine), it is desirable to have a correspondence
between nodes in the heap and objects. We use two additional fields for this
purpose: key stores object labels and ref stores references from keys back to
heap nodes. Of course, key[ref[k]] = k and ref[key[n]] = n for all keys k
and nodes n. Since this correspondence between objects and the heap will be
useful for us later on, our heap methods are implemented with it in mind.
3 They are called push and pop in PiL.
4 For a more thorough presentation, check PiL’s chapter on object-oriented programming, more
specifically the section about classes.
Heaps are similar to queues since we can insert elements, retrieve them,
and check the heap for emptiness. Insertions are performed at the leaves, filling
each level of the tree before going to the next, while retrieval is performed at
the root. Heaps differ from queues in that they also have an update operation
that rearranges the heap after some record is decreased. Still, our main concern
should be for the tree to keep the heap property after each of these operations.
We create heap objects with
function new ()
  return setmetatable({record={}, key={}, ref={}}, {__index = modenv})
end
where our tree is represented by integer-keyed tables record, key and ref in the
following way: if i is a position representing a node in the tree, then 2i is the left
child of i, 2i + 1 is the right child of i, and thus ⌊i/2⌋ is the parent of i. Since we
are going to need parents of nodes a lot, it is convenient to provide:
local function parent (n)
  return (n - n % 2) / 2 -- floor(n / 2)
end
The update method is listed below; we maintain the heap property by perco-
lating the node i with key k and new record value v < record[ref[k]] up the
tree until the property is again satisfied:
function update (H, k, v)
  local record, key, ref = H.record, H.key, H.ref
  local i = ref[k]
  local p = parent(i)
  while i > 1 and record[p] > v do -- climb tree?
    -- exchange nodes
    record[i], key[i], ref[key[p]] = record[p], key[p], i
    i, p = p, parent(p)
  end
  record[i], key[i], ref[k] = v, k, i -- update
end
Note how we actually update three trees, one for each field in the heap, as we
exchange parent and child nodes, but use record exclusively for comparisons.
Insertions are now straightforward; we simply insert a new leaf and call update
to maintain the heap property:
function insert (H, v, k)
  local ref = H.ref
  assert(ref[k] == nil, "key already in heap")
  ref[k] = #H.record + 1 -- insert reference
  update(H, k, v) -- insert record
end
Updates and insertions are simple because there is a unique path from any
node to the root and so we just need to follow that path up looking for a suitable
position to place our new or updated object. Retrieval is a bit more complicated
since it comprises extracting the root and replacing it with the rightmost leaf (the
last position in the array): both left and right subtrees of the root are
still heaps, but now the root might violate the heap property. The task of
fixing the tree in case this happens is handled recursively by heapify:
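The retrieve and heapify listings are not part of this text. The sketch below shows
one way they could be written on top of update and insert as given above; it is a
reconstruction under assumptions, not the original code. The table Heap stands in
for modenv, isempty is a hypothetical helper, and returning the key before the
record from retrieve is our own choice.

```lua
local Heap = {}   -- stands in for the module environment (modenv)

local function parent(n) return math.floor(n / 2) end

function Heap.new()
  return setmetatable({record = {}, key = {}, ref = {}}, {__index = Heap})
end

function Heap.update(H, k, v)            -- as in the text: percolate up
  local record, key, ref = H.record, H.key, H.ref
  local i = ref[k]
  local p = parent(i)
  while i > 1 and record[p] > v do
    record[i], key[i], ref[key[p]] = record[p], key[p], i
    i, p = p, parent(p)
  end
  record[i], key[i], ref[k] = v, k, i
end

function Heap.insert(H, v, k)            -- as in the text
  assert(H.ref[k] == nil, "key already in heap")
  H.ref[k] = #H.record + 1
  Heap.update(H, k, v)
end

function Heap.isempty(H) return H.record[1] == nil end

-- Push the node at position i down until both children have larger records.
local function heapify(H, i)
  local record, key, ref = H.record, H.key, H.ref
  local n = #record
  local l, r, s = 2 * i, 2 * i + 1, i    -- s: position of the smallest record
  if l <= n and record[l] < record[s] then s = l end
  if r <= n and record[r] < record[s] then s = r end
  if s ~= i then
    record[i], record[s] = record[s], record[i]
    key[i], key[s] = key[s], key[i]
    ref[key[i]], ref[key[s]] = i, s
    return heapify(H, s)
  end
end

function Heap.retrieve(H)
  local record, key, ref = H.record, H.key, H.ref
  local n = #record
  assert(n > 0, "heap is empty")
  local v, k = record[1], key[1]
  record[1], key[1] = record[n], key[n]  -- rightmost leaf becomes the root
  ref[key[1]] = 1
  record[n], key[n], ref[k] = nil, nil, nil
  if n > 1 then heapify(H, 1) end
  return k, v
end

local H = Heap.new()
H:insert(5, "e"); H:insert(3, "c"); H:insert(8, "h"); H:insert(1, "a")
H:update("h", 2)                         -- decrease the record of "h" from 8 to 2
local out = {}
while not H:isempty() do
  local k, v = H:retrieve()
  out[#out + 1] = k .. "=" .. v
end
print(table.concat(out, " "))            -- a=1 h=2 c=3 e=5
```

Keys leave the heap in order of increasing record, which is the behavior a priority
queue needs.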
Heapsort
We could not talk about heaps and not mention one of their main applications:
to sort an array. The concept is an ingenious derivation from retrieve: we can
sort an array in-place by iteratively replacing the root by the rightmost leaf —
the first position by the last position in the array — and not actually removing
the root, but simply reducing the heap size. By performing the whole sorting
in-place, we do not need to insert elements either5 . However, since we have a
min heap, we sort decreasingly; to sort increasingly we would need a max heap,
which can be trivially obtained by swapping record comparisons in heapify and
insert. Our heapsort routine does not use a ref field and needs to explicitly set
the size of the heap in heapify since no elements are removed from the heap:
The motivation behind using the key field to store the original position of the
entries in the array to be sorted is to illustrate an interesting use of key: we can
recreate the original state of a sorted table by moving each record back to the
position stored in its key.
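A minimal stand-alone sketch of this idea follows. It is a reconstruction under
assumptions rather than the original listing: heapsort here takes a plain array of
records, builds the key array itself, and returns it; restore is a hypothetical
helper that undoes the sort.

```lua
-- Push the entry at position i down within the first n positions (min heap).
local function heapify(record, key, i, n)
  local l, r, s = 2 * i, 2 * i + 1, i
  if l <= n and record[l] < record[s] then s = l end
  if r <= n and record[r] < record[s] then s = r end
  if s ~= i then
    record[i], record[s] = record[s], record[i]
    key[i], key[s] = key[s], key[i]
    return heapify(record, key, s, n)
  end
end

-- Sorts record in place, in decreasing order, and returns the key array:
-- key[i] is the position that record[i] occupied before sorting.
local function heapsort(record)
  local n = #record
  local key = {}
  for i = 1, n do key[i] = i end
  for i = math.floor(n / 2), 1, -1 do    -- "build heap" loop
    heapify(record, key, i, n)
  end
  for size = n, 2, -1 do
    -- move the current minimum behind the heap and shrink the heap by one
    record[1], record[size] = record[size], record[1]
    key[1], key[size] = key[size], key[1]
    heapify(record, key, 1, size - 1)
  end
  return key
end

-- Rebuild the unsorted array from the sorted records and their keys.
local function restore(record, key)
  local original = {}
  for i = 1, #record do original[key[i]] = record[i] end
  return original
end

local t = {4, 1, 3, 9, 7}
local key = heapsort(t)
print(table.concat(t, " "))                     -- 9 7 4 3 1
print(table.concat(restore(t, key), " "))       -- 4 1 3 9 7
```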
5 You might ask: why not implement heapsort using insertions and retrievals? This is possible, of
course, but less efficient: the “build heap” loop in heapsort has complexity O(n), n being the heap
size, as opposed to O(n log n) if the heap is built using insert.
Partition sets
Consider now a set $S$ and a partition $\{S_i\}$ of $S$, that is, the sets $S_i$ are
pairwise disjoint, $S_i \cap S_j = \emptyset$ for $i \neq j$, and their union is $S$,
$\bigcup_i S_i = S$. We want an efficient way to implement two operations for
partition sets: merge two sets into their union and, given an element $e \in S$,
find the set to which $e$ belongs.
A good representation for partition sets is a forest where every set is represented
by a tree and is identified by the element at the root of its tree. This way,
merging two sets $S_1$ and $S_2$ requires a simple attachment of the tree
representing $S_1$ as a subtree in $S_2$. In addition, finding the set that contains
a given element involves a tree climbing routine from the element up to the root
of the tree representing the containing set. As we will see shortly, to guarantee
efficiency we also need to perform these operations according to some rules, but for
now let’s just assume that we need to keep track of the number of elements in
each set.
Our partition sets are represented by a table with two fields: card, a table
storing the cardinality of each set, and parent, a table storing the parent of
an element in its containing set. An element e is the root of a set S if and
only if card[e] contains the number of elements in S — e represents S — and
parent[e] == nil. New partition objects are created with
function new ()
  return setmetatable({card = {}, parent = {}}, {__index = modenv})
end
while set takes a new element and creates a new set in the partition containing
only the argument,
and sameset checks if two elements belong to the same set in the partition:
For merge we also try to keep the tree shorter by applying a weighting rule:
when merging two sets, attach the smaller set to the larger set. This rule
requires the use of card to keep track of set sizes. The best way to connect
the two sets is then to redefine the root of the smaller set as a child of the root
of the larger set.
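The listings for set, sameset and merge are missing from this text. The following
sketch fills them in under stated assumptions: Partition stands in for modenv, find
is our name for the tree climbing routine, and the path compression inside find is
an assumption on our part (the text only says that some rules are needed for
efficiency).

```lua
local Partition = {}   -- stands in for the module environment (modenv)

function Partition.new()
  return setmetatable({card = {}, parent = {}}, {__index = Partition})
end

function Partition.set(P, e)
  assert(P.card[e] == nil and P.parent[e] == nil, "element already in partition")
  P.card[e] = 1                      -- e is the root of a singleton set
end

function Partition.find(P, e)
  local parent = P.parent
  local root = e
  while parent[root] ~= nil do root = parent[root] end   -- climb to the root
  while e ~= root do                 -- path compression (our assumption)
    local up = parent[e]
    parent[e] = root
    e = up
  end
  return root
end

function Partition.sameset(P, e1, e2)
  return P:find(e1) == P:find(e2)
end

function Partition.merge(P, e1, e2)
  local card, parent = P.card, P.parent
  local r1, r2 = P:find(e1), P:find(e2)
  if r1 == r2 then return r1 end     -- already in the same set
  if card[r1] < card[r2] then r1, r2 = r2, r1 end        -- r1: larger set
  parent[r2] = r1                    -- weighting rule: smaller under larger
  card[r1] = card[r1] + card[r2]
  card[r2] = nil                     -- only roots carry a cardinality
  return r1
end

local P = Partition.new()
for _, e in ipairs{"a", "b", "c", "d", "e"} do P:set(e) end
P:merge("a", "b"); P:merge("c", "d"); P:merge("a", "c")
print(P:sameset("b", "d"), P:sameset("a", "e"))   -- true   false
```

Clearing card[r2] after a merge keeps the invariant stated above: an element carries
a cardinality exactly when it is a root.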
Graphs
There are many ways to represent a graph data structure, each being better
suited to a particular application. For our graph object we adopt a representa-
tion using adjacency lists — or better, tables — where the keys are vertices and
the values hold adjacency relations.
Vertices can be of any type but nil, and are stored in the graph’s vertex
set, vset; similarly, we store the edges or arcs of a graph, if it is undirected or
directed respectively, in its edge (or arc) set eset. To avoid redundancy, vset is
actually the adjacency list: vset[v] is a table where each key is a neighbor u
of v and the corresponding value is the edge {u,v}, also a table. The edge set
eset can be a regular set as presented in PiL — keys are edges and values are
true — or values can hold edge weights.
Graphs are created with our (now canonical) new method:
function new()
  return setmetatable({vset = {}, eset = {}}, {__index = modenv})
end
As basic graph operations we offer the addition of vertices,
function addvertex (G, v)
  assert(v ~= nil, "cannot add nil as vertex")
  local vs = G.vset
  assert(vs[v] == nil, "vertex already in graph")
  vs[v] = {} -- new adjacency list
end
and edges to a graph,
function addedge (G, v1, v2, w)
  assert(v1 ~= nil and v2 ~= nil, "cannot add nil as vertex")
  local vs = G.vset
  -- add v1 and v2 if not in G
  if not vs[v1] then vs[v1] = {} end
  if not vs[v2] then vs[v2] = {} end
  -- update v1 and v2 adjacency lists
  local e = {v1, v2} -- new edge
  vs[v1][v2] = e -- v1 -> v2
  vs[v2][v1] = e -- v1 <- v2
  -- update edge set
  G.eset[e] = w or true
end
Addition of arcs can be handled by a very similar method where the adjacency
lists are updated only according to the arc direction.
It is useful to have some iterators for traversing the vertex and edge sets,
and the neighborhood of a vertex v — the set of vertices adjacent to v. We can
derive such iterators by mimicking the stateless pairs iterator:
It should be noted that, even though pairs could be used in place of each of these
iterators, we define them explicitly to keep the graph methods that follow abstract,
that is, independent of our particular underlying graph implementation: if we
later decide to change the graph representation to solve some problem more
efficiently, our higher level algorithms should require minimal modifications,
or even none at all.
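The iterator listing is not reproduced here. Since pairs(t) returns the triple
next, t, nil, stateless iterators over the vertex set, the edge set and a
neighborhood can be sketched as below. The names vertices and neighborhood appear
later in the text; edges is our guess, Graph stands in for modenv, and addedge is
condensed from the version above.

```lua
local Graph = {}   -- plays the role of the module environment (modenv)

function Graph.new()
  return setmetatable({vset = {}, eset = {}}, {__index = Graph})
end

function Graph.addedge(G, v1, v2, w)   -- condensed from the text
  local vs = G.vset
  vs[v1] = vs[v1] or {}
  vs[v2] = vs[v2] or {}
  local e = {v1, v2}
  vs[v1][v2], vs[v2][v1] = e, e
  G.eset[e] = w or true
end

-- Each iterator returns what pairs would: next, the table to traverse, nil.
function Graph.vertices(G) return next, G.vset, nil end           -- v, adjacency
function Graph.edges(G) return next, G.eset, nil end              -- e, weight
function Graph.neighborhood(G, v) return next, G.vset[v], nil end -- u, edge

local G = Graph.new()
G:addedge("a", "b", 2)
G:addedge("b", "c", 3)

local nb = {}
for u in G:neighborhood("b") do nb[#nb + 1] = u end
table.sort(nb)                         -- traversal order is not specified
local total = 0
for e, w in G:edges() do total = total + w end
print(table.concat(nb, ","), total)    -- a,c   5
```

No closure or table is created per traversal, which is what makes these iterators
stateless.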
In the next sections we treat a few classical graph problems that arise in
many applications. All problems have a theme in common, graph connectivity,
that allows us to exploit our chosen adjacency list representation.
Graph search
Many graph-related problems involve a search through the graph’s vertex set in
order to, say, verify some property or compute some desired quantity. Often the
vertices must be visited in some specific order, as required by the task at hand.
The most common orderings are provided by two graph
search procedures: depth-first search (DFS) and breadth-first search (BFS).
Both procedures take a vertex in the graph as a starting point. As the
names suggest, depth-first search follows some edge leaving the current vertex
in the search as deep as possible, backtracking to explore more vertices when
the options are exhausted, while breadth-first search explores the neighborhood
of the current vertex before attempting to branch the search further.
Let’s address DFS first; check Listing 1. Our DFS method is actually an
iterator wrapped around an auxiliary routine, search. Thanks to Lua’s corou-
tines, implementing such an iterator is simple even when it involves a recursive
routine since we can yield from it. In dfs, visited is a control variable that
keeps a set of already visited vertices. As an iterator, dfs is not as efficient
as neighborhood since a new closure is created for each search; however, even
though dfs is algorithmically more complex, it is, at the same time, semantically
as simple as neighborhood thanks to Lua’s generic for.
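The DFS listing referred to above is not included in this text. A minimal sketch of
a coroutine-based dfs, consistent with the description (a recursive search routine
that yields each vertex, and a visited set), could look as follows. Graph stands in
for modenv, and the small example graph is made up.

```lua
local Graph = {}

function Graph.new()
  return setmetatable({vset = {}, eset = {}}, {__index = Graph})
end

function Graph.addedge(G, v1, v2, w)   -- condensed from the text
  local vs = G.vset
  vs[v1] = vs[v1] or {}
  vs[v2] = vs[v2] or {}
  local e = {v1, v2}
  vs[v1][v2], vs[v2][v1] = e, e
  G.eset[e] = w or true
end

function Graph.neighborhood(G, v) return next, G.vset[v], nil end

function Graph.dfs(G, s)
  local visited = {}                   -- set of already visited vertices
  local function search(v)
    visited[v] = true
    coroutine.yield(v)                 -- hand v to the for loop
    for u in G:neighborhood(v) do
      if not visited[u] then search(u) end
    end
  end
  -- coroutine.wrap gives a function that resumes the search on every call
  -- and returns nothing once the search is over, which ends the loop
  return coroutine.wrap(function() search(s) end)
end

local G = Graph.new()
G:addedge("a", "b"); G:addedge("b", "c"); G:addedge("a", "d")
G:addedge("e", "f")                    -- a second component, unreachable from "a"

local order, pos = {}, {}
for v in G:dfs("a") do
  order[#order + 1] = v
  pos[v] = #order
end
print(table.concat(order, " "))
```

The exact order depends on how next enumerates neighbors, but "a" always comes
first and "c" always directly follows "b", since "c" can only be reached through "b".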
A subgraph S of a graph G is a graph such that V (S) ⊆ V (G) and E(S) ⊆
E(G), where V (G) and E(G) are the vertex and edge sets of G respectively. A
component C of a graph G is a maximal connected subgraph of G: any two
vertices in C have a path connecting them, that is, C is connected, and any
vertex not in C has no edges to a vertex in C, that is, if we add any other vertex,
C is not connected anymore. An important application of graph searches is to
identify the components of a graph; the following method is our first application
of dfs and returns a table containing vertex sets for each component of a graph:
function components (G)
  local comp = {} -- holds components
  local S = {} -- unvisited vertices
  for v in G:vertices() do S[v] = true end
  for v in pairs(S) do
    local C = {} -- component set
    for u in G:dfs(v) do -- or bfs
      C[u] = true -- add to C
      S[u] = nil -- remove from S
    end
    comp[#comp + 1] = C
  end
  return comp
end
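To try components on a concrete graph we need a search iterator. The sketch below
uses a breadth-first one, as the comment in the listing allows. The bfs shown is our
own closure-based stand-in (a plain array with two indices serves as the queue), not
the listing from this chapter, and Graph again stands in for modenv.

```lua
local Graph = {}

function Graph.new()
  return setmetatable({vset = {}, eset = {}}, {__index = Graph})
end

function Graph.addvertex(G, v)
  G.vset[v] = G.vset[v] or {}
end

function Graph.addedge(G, v1, v2, w)
  G:addvertex(v1); G:addvertex(v2)
  local e = {v1, v2}
  G.vset[v1][v2], G.vset[v2][v1] = e, e
  G.eset[e] = w or true
end

function Graph.vertices(G) return next, G.vset, nil end
function Graph.neighborhood(G, v) return next, G.vset[v], nil end

-- Breadth-first search as a closure: vertices leave the queue in FIFO order.
function Graph.bfs(G, s)
  local visited = {[s] = true}
  local queue, first, last = {s}, 1, 1
  return function()
    if first > last then return nil end
    local v = queue[first]
    queue[first] = nil; first = first + 1
    for u in G:neighborhood(v) do
      if not visited[u] then
        visited[u] = true
        last = last + 1; queue[last] = u
      end
    end
    return v
  end
end

function Graph.components(G)           -- as in the text, with bfs for dfs
  local comp, S = {}, {}
  for v in G:vertices() do S[v] = true end
  for v in pairs(S) do
    local C = {}
    for u in G:bfs(v) do C[u] = true; S[u] = nil end
    comp[#comp + 1] = C
  end
  return comp
end

local G = Graph.new()
G:addedge("a", "b"); G:addedge("b", "c")
G:addedge("d", "e")
G:addvertex("f")

local sizes = {}
for i, C in ipairs(G:components()) do
  local n = 0
  for _ in pairs(C) do n = n + 1 end
  sizes[i] = n
end
table.sort(sizes)
print(table.concat(sizes, " "))        -- 1 2 3
```

Clearing entries of S while traversing it with pairs is allowed in Lua, as long as no
new keys are added during the traversal.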
We can also have true iterators, as discussed in PiL, by implementing a
factory that returns a DFS iterator starting at a provided vertex as a closure on
the graph, the control variable, and the auxiliary search routine; see Listing 2.
The iterator takes two functions as optional arguments, visit and finish, that
perform some action as the vertex is visited and finished, respectively. By a
finished vertex we mean a vertex that had all its neighbors traversed by DFS.
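Since Listing 2 is not reproduced in this text, here is a hedged sketch of such a
factory. To keep it short it works on a plain adjacency table adj, where adj[v] is
the set of neighbors of v, instead of a graph object; that simplification and the
parameter names onvisit and onfinish are ours.

```lua
-- adj[v] is the set of neighbors of v (a simplified stand-in for G.vset)
local function dfs(adj, s)
  local visited, visit, finish         -- upvalues shared with search
  local function search(v)
    visited[v] = true
    if visit then visit(v) end
    for u in pairs(adj[v]) do
      if not visited[u] then search(u) end
    end
    if finish then finish(v) end       -- all neighbors of v have been traversed
  end
  -- the "true iterator": it runs the whole traversal, calling the actions
  return function(onvisit, onfinish)
    visited, visit, finish = {}, onvisit, onfinish
    search(s)
  end
end

-- Hypothetical path graph a - b - c
local adj = {
  a = {b = true},
  b = {a = true, c = true},
  c = {b = true},
}
local discovered, finished = {}, {}
local dfser = dfs(adj, "a")
dfser(function(v) discovered[#discovered + 1] = v end,
      function(v) finished[#finished + 1] = v end)
print(table.concat(discovered), table.concat(finished))   -- abc   cba
```

On a path the discovery order is forced, and the finishing order is its reverse.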
Using dfs from the listing above, we can, for example, print the order in which
the vertices are discovered by DFS from a vertex s by just creating an iterator
dfser = dfs(G, s) and then calling dfser(print). This last dfs has a strong
Shortest paths
Consider now two distinct vertices u and v in a connected graph G: there may be
many paths in G connecting them. However, if we assign to each edge in G a
positive cost, we are usually more interested in a shortest path between u and v,
that is, a path of minimum total cost. The cost of a path is the sum of the costs
over all edges in the path, and the cost of a shortest path between u and v is
called the distance between u and v.
Dijkstra’s algorithm finds the shortest paths between a source v and all other
vertices in G9 . His algorithm iteratively marks a vertex once its shortest path
to the source is known, and visits the unmarked vertices in order of increasing
distance to the source. Since the costs are positive, shortest paths leading to
unmarked vertices can only pass through already marked vertices. Thus, we can
compute the distance from an unmarked vertex to the source by only considering
the known distance from one of its marked neighbors to the source.
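The marking scheme just described can be sketched in a few lines. This is a
simplified illustration, not the routine from this chapter: it works on a plain
weighted adjacency table (adj[u][v] is the cost of edge {u,v}) and finds the closest
unmarked vertex by a linear scan instead of a heap, which is simpler but slower. The
names dijkstra, dist and pred are ours.

```lua
local function dijkstra(adj, source)
  local dist, pred, marked = {[source] = 0}, {}, {}
  while true do
    -- pick the unmarked vertex with the smallest known distance
    local u, du = nil, math.huge
    for v, d in pairs(dist) do
      if not marked[v] and d < du then u, du = v, d end
    end
    if u == nil then break end         -- every reachable vertex is marked
    marked[u] = true                   -- the distance to u is now final
    for v, w in pairs(adj[u]) do
      if not marked[v] and (dist[v] == nil or du + w < dist[v]) then
        dist[v], pred[v] = du + w, u   -- shorter path to v through u
      end
    end
  end
  return dist, pred
end

-- Hypothetical weighted graph
local adj = {}
local function addedge(u, v, w)
  adj[u] = adj[u] or {}
  adj[v] = adj[v] or {}
  adj[u][v], adj[v][u] = w, w
end
addedge("s", "a", 1); addedge("s", "b", 4); addedge("a", "b", 2)
addedge("b", "c", 1); addedge("a", "c", 5)

local dist, pred = dijkstra(adj, "s")
print(dist.a, dist.b, dist.c)          -- 1   3   4
print(pred.a, pred.b, pred.c)          -- s   a   b
```

The table pred records, for each vertex, its predecessor on a shortest path from
the source, so a path can be read off backwards from any destination.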
8 You cannot sort a list of n elements with complexity less than O(n log n).
9 Even if we are only interested in the shortest path between two specific vertices, an algorithm
for that problem would not be more efficient in the worst case than the best single-source algorithm.
that creates stateless iterators that backtrack from the destination to the source
along the shortest path, and use one such iterator to implement a path builder:
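The corresponding listing is missing here. Assuming the shortest path routine
leaves a predecessor table pred, with pred[v] the vertex before v on a shortest path
and pred[source] == nil, a backtracker factory and a path builder could be sketched
like this (the names step, path and pred are hypothetical):

```lua
-- Factory: given a predecessor table, build a function that returns a
-- stateless iterator triple walking from dest back to the source.
local function backtracker(pred)
  local function step(dest, v)
    if v == nil then return dest end   -- first call: start at the destination
    return pred[v]                     -- nil after the source ends the loop
  end
  return function(dest) return step, dest, nil end
end

local function path(pred, dest)
  local p = {}
  for v in backtracker(pred)(dest) do
    table.insert(p, 1, v)              -- prepend: the path runs source to dest
  end
  return p
end

-- Hypothetical predecessor table for the path s -> a -> b -> c
local pred = {a = "s", b = "a", c = "b"}
print(table.concat(path(pred, "c"), " -> "))   -- s -> a -> b -> c
```

The only closure is created once per predecessor table; each traversal then reuses
step with the destination as invariant state and the current vertex as control
variable.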
Conclusions
This gem highlights Lua’s simplicity and expressiveness through the implementation
of data structures and iterators. Data structures are easily realized by
using tables and their __index metamethod for object-like notation, while iterators
can be built using Lua’s generic for loop, closures and coroutines. The
resulting code is simple, high-level and abstract, mostly composed of table accesses
and object method calls.
One of the main motivations of this text is to show many possible ways of
achieving a goal in Lua. For instance, there are many ways to construct an
iterator in Lua according to PiL; we cover them all here. Another motivation is
to be true to one of Lua’s maxims in PiL — “Lua gives you the power, you build
the mechanisms” — by providing iterators instead of tables, for example, or ways
of generating a desired table instead of the table itself, as in the backtracker
factory.
16
A Primer of
Scientific Computing in Lua
Luis Carvalho
Introduction
Lua tables are fast enough for many numerical applications. However, for more
specific applications performance is usually critical. One way to achieve a high
performance environment in Lua is to extend the language with suitable objects
and methods from efficient numerical libraries. Of course, these extensions
should profit from Lua’s expressiveness and resourcefulness.
In this gem we explore two common numerical extensions: matrices and dis-
crete Fourier transforms. These together allow us to implement many standard
scientific computing algorithms including numerical linear algebra, interpola-
tions, and quadratures. Moreover, with Lua we are able to devise very efficient
implementations that can beat even well-known scientific computing software!
We assume the reader is fairly familiar with Lua’s C API and feels comfort-
able managing the Lua–C virtual stack. Due to the numerical nature of the text,
a background in engineering or numerical analysis is desirable, but not needed;
the last section, however, deals with more mathematically sophisticated appli-
cations and relies on some knowledge of elementary calculus and linear algebra.
Matrices
In order to keep our interface simple our matrices are real two-dimensional
arrays stored in column-major order. Column-major order is essential because
we will be using Fortran libraries, and that is Fortran’s storage mode for arrays.
The first issue we need to address when representing matrices in Lua is
indexing. For a one-dimensional array (a vector) that is simple enough: just
create a userdatum with fields size and data and return data[i] for index i.
For matrices this is a bit more complicated since we have to go through two
dimensions to access an entry. One solution to accessing index (i, j) is to return
the i-th row as a matrix which in turn behaves like a vector and returns the j-th
entry.
Since matrices are stored in column-major order, consecutive entries of a row are
not adjacent in memory: moving from one entry to the next, that is, between
consecutive columns, skips a fixed offset. This row offset is the number of rows of
the matrix to which the row belongs. We call this offset stride, and we keep track
of it in our data structure.
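A small pure-Lua model may help to picture the stride. It only imitates the
indexing arithmetic with a Lua table; the names rowview and entry are hypothetical,
and the real library does this in C on a block of lua_Number.

```lua
-- A 2 x 3 matrix stored in column-major order:   [ 1 3 5 ]
--                                                 [ 2 4 6 ]
local rows, cols = 2, 3
local data = {1, 2, 3, 4, 5, 6}

-- A row is only a view on data: it starts at the row index and then
-- jumps by stride = rows to reach the next column.
local function rowview(i)
  return {start = i, size = cols, stride = rows}
end

local function entry(view, j)          -- j-th entry of the view, 1-based
  assert(j >= 1 and j <= view.size, "index out of range")
  return data[view.start + (j - 1) * view.stride]
end

local r2 = rowview(2)
print(entry(r2, 1), entry(r2, 2), entry(r2, 3))   -- 2   4   6
```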
It is not uncommon to use a matrix without referencing its entries, that
is, by only using matrix operations. That is actually the most efficient way
to use our library: operate in the higher level, say, adding, multiplying or
transforming matrices, and leave the heavy-duty operations to the optimized,
architecture-dependent libraries under the hood; seasoned users of numerical
software usually refer to this strategy as “vectorizing your code”. This practice
suggests that we should adopt a lazy interning strategy and only create the rows
of a matrix when needed, that is, when some entry is requested. To intern the
rows of a matrix we can use the environment table of the matrix’s userdatum.
The straightforward association is then to store the i-th row at entry i in the
environment table.
The next issue is memory allocation and deallocation. Suppose that we
decide to allocate a memory block to store both structure (number of rows,
columns, and stride) and data of a matrix; what would happen if we assigned one
row of this matrix to a variable and then garbage-collected the matrix? The data
would be collected along with it, and the variable would be left dangling. To avoid this,
we separate the structure and data of a matrix and allocate them individually. A
data memory block can only be collected once all the matrices referencing it have
been collected. To this end, we use a weak-keyed table where the keys are matrix
userdata holding both the structure and a pointer to a data memory block, and
the values are the referenced data memory blocks. To avoid extra table lookups
we do not create a dedicated table to hold the matrix to data associations, but
instead use the matrix userdatum metatable as a holder.
After all these considerations our matrix should have the following structure:
typedef struct {
  int rows;
  int cols;
  int stride;
  lua_Number *data;
} lua_Matrix;
From now on, to make the code more concise, we leave all matrix checks implicit
and assume that the arguments given to all functions are consistent: for
example, if we are adding two matrices, they should have the same number
of rows and columns¹.
Metamethods
To access entries in our matrix we need to implement two metamethods: __index
and __newindex. Let’s start with __index. Since we want to enable colon call
notation for matrix methods — like m:method(...) — our __index should first
check if the key is a number, in which case an entry is requested, or otherwise
delegate to a metatable lookup. Next, if the key is numeric, we should check if
it is valid, that is, if it is positive and less than or equal to the size of the matrix.
Now, according to our previous discussion, we opted for lazily interned rows
when indexing a matrix and for a direct entry if the matrix is a vector; if the row
is already interned, we just get it from the userdatum’s environment, otherwise
we intern it before returning it. This discussion leads to matrix__index in Listing 2.
The first part of __index, when the key is a valid numeric index, is as expected:
it returns an entry if the matrix m is one-dimensional or a row if not, taking care
to store the row if it is not interned yet. The row interning procedure comprises
three steps: getting the data block data associated with m from the function
environment; pushing a new row r such that r->rows = 1, r->cols = m->cols,
r->stride = m->rows, and r->data = &data[k - 1] using pushmatrix, since
r’s data block is only referenced, not allocated; and finally interning r as the k-th
entry in m’s environment.
If the key is not a number we resort to a metatable lookup. The class table
containing all matrix methods is the first upvalue in the __index closure; more
details on how this is set up will appear when we talk about luaopen_lmatrix in
the “Library setup” section.
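The dispatch logic of __index can be modeled in pure Lua. The sketch below is not
the C function matrix__index: ordinary tables replace userdata, a field interned
replaces the userdatum environment, an offset field replaces pointer arithmetic, and
pushmatrix and size only borrow their names from the text.

```lua
local Matrix = {}                      -- class table holding the methods
local mt = {}

local function pushmatrix(rows, cols, stride, data, offset)
  return setmetatable({rows = rows, cols = cols, stride = stride,
                       data = data, offset = offset, interned = {}}, mt)
end

function Matrix.size(m) return m.rows, m.cols end

mt.__index = function(m, k)
  if type(k) ~= "number" then
    return Matrix[k]                   -- not an entry: look up a method
  end
  local vector = (m.rows == 1 or m.cols == 1)
  local size = vector and m.rows * m.cols or m.rows
  assert(k >= 1 and k <= size and k % 1 == 0, "index out of range")
  if vector then                       -- one-dimensional: direct entry
    return m.data[m.offset + (k - 1) * m.stride]
  end
  local row = m.interned[k]
  if row == nil then                   -- intern the row on first use
    row = pushmatrix(1, m.cols, m.rows, m.data, m.offset + k - 1)
    m.interned[k] = row
  end
  return row
end

-- 2 x 3 matrix in column-major order:   [ 1 3 5 ]
--                                        [ 2 4 6 ]
local m = pushmatrix(2, 3, 1, {1, 2, 3, 4, 5, 6}, 1)
print(m[2][3])                         -- 6
print(rawequal(m[2], m[2]))            -- true: the row is created only once
print(m:size())                        -- 2   3
```

The row shares the data table of its matrix and only carries its own offset and
stride, mirroring the "referenced, not allocated" data block described above.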
Matrix entry assignment is accomplished through __newindex only on one-dimensional
matrices. The procedure is very similar, but simpler; it is given in
Listing 3 for the sake of completeness.
We can also implement a __len metamethod,
1 Don’t worry: the complete code in the repository contains all the consistency checks.
which is really useful only for column vectors since it only returns the number
of rows — we have matrix_size to retrieve both dimensions if needed — and a
__tostring metamethod for pretty printing:
Core methods
Now, true to our intention of pushing the heavy-duty work to the C side of our
library, we need to provide some basic methods that operate over all entries of a
matrix.
Our first routine, matrix_fill, sets all entries of a matrix to a number given
as argument:
This routine is a good example of two important practices in our matrix methods:
the stride should always be used when traversing a matrix, as in the for loop;
and the input matrix should always be returned for notational convenience2 as
it allows consecutive colon calls in Lua, as in r,c = m:fill(1):size(). Note
that, since we are returning the first argument, we set the stack top and pop the
second argument.
A useful routine is scalar summation, where a matrix m is added to a number
s yielding a matrix m + s. Although this routine produces a new matrix by
definition, we can use an in-place version of it, which we call matrix_shift, by
simply changing the for loop at line 6 in matrix_fill to
for (i = 0; i < n; i += m->stride) m->data[i] += s;
In-place routines, where the entries of a matrix are updated, should always
be preferred. This guideline is justified, at this lower level of our implementa-
tion, as a way to avoid unnecessary memory allocations; if we really need to copy
a matrix, we should explicitly do so. To perform scalar summation on m and s, for
example, we would copy m to another matrix c and then apply c:shift(s). Scalar
summation will be treated shortly, when we discuss the __add metamethod. An
efficient implementation of a matrix copying routine is presented in the “Exter-
nal libraries” section.
We also need some operations, that is, routines that take two matrices with
same number of columns and rows and return another matrix consistent with
the arguments. An important operation is the element-wise multiplication3 of
two matrices:
static int matrix_ewmul (lua_State *L) {
  lua_Matrix *a = checkmatrix(L, 1);
  lua_Matrix *b = checkmatrix(L, 2);
  int i, n;
  lua_settop(L, 2);
  n = a->rows * a->cols;
  for (i = 0; i < n; i++)
    a->data[i * a->stride] *= b->data[i * b->stride];
  lua_pop(L, 1); /* b */
  return 1; /* a */
}
2 Some might disagree and point out that such feature actually reduces code readability.
3 Also known as Hadamard product, but .* should be more familiar.
Note that the operation is in-place for the first argument, and that the strides of
both arguments are used in the update.
Functional facilities
In the previous section we managed to avoid loops that update entries in a
matrix by providing specialized routines. After all, we do not want to incur
the metamethod overhead of calling __index and __newindex in loops like
for i = 1, #v do
  v[i] = foo(v[i])
end
are actually very common, and we refer to folds arising from their application
as linear folds. Since linear folds can be parameterized by alpha, we can define
a simpler version of matrix_fold:
a suitable initial value, but this is more efficient (not to mention traditional).
External libraries
So far we have been able to provide efficient routines for simple methods in our
matrix library. For many other specialized and more complex matrix routines —
like computing the norm of a matrix, or solving a linear system, or inverting a
matrix — we can resort to optimized code from external libraries.
For our basic needs here we are going to use the high-quality ubiquitous
BLAS (Basic Linear Algebra Subprograms) library, or better, an optimized ver-
sion of it7 .
Our first routine using BLAS is a method to copy matrices:
The actual copying is done by the BLAS routine at lines 10 and 13, which
have signature
dcopy_(int *n, double *x, int *incx, double *y, int *incy);
6 An easy polynomial object from c:
optimized versions of BLAS, depending on the platform, but the most common open source version
is ATLAS: http://math-atlas.sourceforge.net.
184 16 · A Primer of Scientific Computing in Lua
where x is to be copied to y, n is the size of x and y and incx and incy are
the strides of x and y respectively. Note that all arguments are pointers; since
BLAS’s natural implementation is in Fortran8 and Fortran passes arguments
by reference, we should provide variables by their memory addresses. Of course,
we are assuming that lua_Number is defined as double, as in vanilla Lua. We
should also always provide strides for our matrix arguments, similar to what we
did in the previous sections. As a matter of fact, all BLAS routines that we use
here have a common signature pattern: the size of the argument(s) comes first,
followed in some routines by a number meant for scalar multiplication, and then
the matrix argument(s) as a data block address and a stride.
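To make the size-and-stride convention concrete, here is a plain-Lua model of what dcopy computes. It is only an illustration and not a binding to BLAS: the name dcopy_model and the use of 1-based Lua tables are assumptions of this sketch, and only positive strides are handled.

```lua
-- Plain-Lua model of the dcopy semantics (illustrative only).
-- Copies n elements from x to y, reading x with stride incx and
-- writing y with stride incy. Tables are 1-based, unlike C arrays.
local function dcopy_model(n, x, incx, y, incy)
  local ix, iy = 1, 1
  for _ = 1, n do
    y[iy] = x[ix]
    ix = ix + incx
    iy = iy + incy
  end
  return y
end

-- A 2x3 matrix stored column by column: columns {1,2}, {3,4}, {5,6}.
local data = {1, 2, 3, 4, 5, 6}
-- Its first row is the strided view starting at index 1 with stride 2.
local row = dcopy_model(3, data, 2, {}, 1)
print(table.concat(row, " "))  --> 1 3 5
```

The same data block can thus be read as a row or as a column just by changing the stride, which is why every matrix argument travels with one.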
We can provide an optional destination to matrix_copy as a second argument.
If no destination is specified, a data block is allocated and a fresh matrix is
pushed; otherwise, we just copy to the provided destination matrix — of course,
as in our previous routines, we are assuming the matrices are consistent and do
not set any checks in our prototype. The main reason for using a copy destination
is when a procedure performs a copy operation often and we can then use a
buffer to avoid new matrices being created at each operation. Also, note that we
return the copy destination matrix, as expected.
To scale a matrix m by a number s, that is, to multiply each element in m by a
scalar s, we use
where dscal performs the hard work. The signature pattern should be already
familiar. As usual, we provide references as arguments to dscal and return the
scaled matrix.
Continuing with in-place linear operations, we have a routine that incre-
ments a matrix y by a * x, where a is a (not necessarily positive) optional num-
ber and x is consistent with y:
Library setup
Now that we have all methods, we can register them in our library. For this purpose
we create two luaL_regs, one for class methods and another for metamethods,
static const luaL_reg lmatrix_func[] = {
  {"new", matrix_new},
  /* ... list other methods here ... */
  {"dot", matrix_dot},
  {NULL, NULL}
};
Lua side
Now that everything is set up on the C side, we can turn to Lua to enhance our
library, matrix.lua. First of all, we need to load all methods from lmatrix,
Since matrix is global and also the name of our library, the class table
returned by require"lmatrix" becomes the environment after the call to module.
As promised before, we now provide arithmetic metamethods for our matrix
userdata based on add, shift, and scale in Listing 5. All these metamethods
should return a new table, and so we use copy to explicitly copy the values and
apply any needed transformation in-place. Depending on the second argument
to __add and __sub, we either shift or add the copy s of the original matrix a.
Note that __unm uses consecutive calls in colon notation, which could also be
used for __add, for example,
mt.__add = function(a, b)
  return type(b) == "number" and a:copy():shift(b) or a:copy():add(b)
end
local mt = getmetatable(matrix.new(1))
mt.__unm = function(m)
  return m:copy():scale(-1)
end
mt.__add = function(a, b)
  local s = a:copy()
  if type(b) == "number" then s:shift(b) else s:add(b) end
  return s
end
mt.__sub = function(a, b)
  local s = a:copy()
  if type(b) == "number" then s:shift(-b) else s:add(b, -1) end
  return s
end
where math_abs is a local for math.abs defined before the module call.
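Returning to the arithmetic metamethods of Listing 5, the pattern (copy first, then update in place) can be tried without the C library. The sketch below uses a hypothetical table-based Vec class whose copy, shift, scale, and add methods only mimic the behavior described for the matrix methods; it is not the matrix userdata.

```lua
-- Hypothetical stand-in for the matrix userdata: a plain table with
-- in-place methods that return self, so calls can be chained.
local Vec = {}
Vec.__index = Vec

local function vec(t) return setmetatable(t, Vec) end

function Vec:copy()
  local c = {}
  for i = 1, #self do c[i] = self[i] end
  return vec(c)
end

function Vec:shift(s)            -- self[i] = self[i] + s
  for i = 1, #self do self[i] = self[i] + s end
  return self
end

function Vec:scale(s)            -- self[i] = s * self[i]
  for i = 1, #self do self[i] = s * self[i] end
  return self
end

function Vec:add(x, a)           -- self[i] = self[i] + a * x[i]
  a = a or 1
  for i = 1, #self do self[i] = self[i] + a * x[i] end
  return self
end

-- The metamethods never modify their operands: they work on a copy.
Vec.__unm = function(m) return m:copy():scale(-1) end
Vec.__add = function(a, b)
  local s = a:copy()
  if type(b) == "number" then s:shift(b) else s:add(b) end
  return s
end
Vec.__sub = function(a, b)
  local s = a:copy()
  if type(b) == "number" then s:shift(-b) else s:add(b, -1) end
  return s
end

local a, b = vec{1, 2, 3}, vec{10, 20, 30}
local s = -(a + 1 + b - 1 - b)   -- exercises __add, __sub and __unm
print(s[1], s[2], s[3])
```

Each operator allocates one copy, so a long expression creates several temporaries; the in-place methods remain the better choice inside tight loops.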
It is also handy — but not necessarily efficient — to have a method similar
to pairs for matrix traversal. The idea is simple: we just need to keep two
control variables as row and column indexes and update them as we traverse
the matrix. Our first try, in Listing 6, uses coroutines; local variables i and j
are controls for row and column indexes respectively, while v is used to cache
the i-th row and avoid metatable lookup overhead.
Vectors are treated differently because there is actually only one dimension
to traverse. We chose to assert this case here to save space, but it should
be simple to implement it by adapting from the matrix case9 . As expected,
coroutine_wrap and coroutine_yield are locals to their respective methods,
coroutine.wrap and coroutine.yield.
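As a rough illustration of the coroutine approach, the following sketch traverses a matrix stored as a plain table of rows. The name entries mirrors the method discussed above, but the table-of-rows representation is an assumption of this sketch and not the userdata from the library.

```lua
local coroutine_wrap, coroutine_yield = coroutine.wrap, coroutine.yield

-- Iterate over all entries of m (a table of rows), yielding i, j, m[i][j].
local function entries(m)
  return coroutine_wrap(function()
    for i = 1, #m do
      local v = m[i]              -- cache the i-th row
      for j = 1, #v do
        coroutine_yield(i, j, v[j])
      end
    end
  end)
end

local m = {{1, 2}, {3, 4}}
local sum = 0
for i, j, e in entries(m) do
  sum = sum + i * j * e
end
print(sum)
```

When the coroutine body finishes, the wrapped function returns no values, so the generic for sees nil and stops.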
9 Or by drawing inspiration from ipairs. The complete version of entries can be found in the
repository.
Applications
Armed with our matrix module, we can now work on some applications. Since
the methods we implemented are basic, we need to illustrate their power with
some simple — but not elementary! — applications. In the next sections we
will first exercise our module with some straightforward tasks to gain more
10 A good exercise is to provide a stateless iterator for vectors; check the repository for a solution.
familiarity, and then deal with variations on a theme: interpolation. We will talk
about the most common interpolation, Lagrangian interpolation, then discuss
discrete Fourier transforms, and finally apply another type of interpolation to
compute the quadrature of an arbitrary function.
This section, and in particular the latter part, is more mathematically in-
volved, but we try to keep the text as self-contained as possible without going
too much into details. The applications are standard in numerical computing,
and the interested reader can refer to a number of good books on the subject
to quench the curiosity for more details and the desire for more rigor. The last
application demands some degree of familiarity with more complex math, but
serves well to illustrate how versatile and powerful Lua is; it was largely in-
spired by some of Prof. Lloyd Trefethen’s works11 .
Basic operations
Let’s first take our matrix module for a test drive in an interpreter session. We
start off by checking some basic operations:
$ lua
Lua 5.1.2 Copyright (C) 1994-2007 Lua.org, PUC-Rio
> require "matrix"
> n = 4
> a = matrix.linspace(1, n) -- [1, 2, ..., n]’
> b = matrix.new(n) -- [0, 0, ..., 0]’
> x = matrix.linspace(0, math.pi, n) -- [0, pi/(n-1), ..., pi]’
> x:copy(b):map(math.cos) -- b[i] = cos(x[i])
> s = -(a + 1 + b - 1 - b) -- __add, __sub, __unm
> for i = 1, n do print(a[i], x[i], b[i], s[i]) end
1 0 1 -1
2 1.0471975511966 0.5 -2
3 2.0943951023932 -0.5 -3
4 3.1415926535898 -1 -4
function dot(a, b)
  local t = a:copy() -- t[i] = a[i]
  t:ewmul(b) -- t[i] = a[i] * b[i]
  return t:linfold() -- sum_i a[i] * b[i]
end
Our dot is clearly less efficient than matrix.dot since it needs a copy of the
first argument, but it provides a good exercise nonetheless. For instance, once
11 In particular, “Is Gauss quadrature better than Clenshaw–Curtis?” and “An extension of Matlab
we are more comfortable with the colon notation, we can drop temporary local
variables like t above and just concatenate operations — being careful not to
abuse notation at the cost of readability! — as this quick test shows:
that is, if we look at the anti-diagonals starting at the top-left corner of the
matrix we see the rows of Pascal’s triangle. The recursive definition above stems
from a well-known binomial coefficient identity12 and it is particularly useful for
our implementation:
12 $\binom{k+1}{l} = \binom{k}{l} + \binom{k}{l-1}$, where $k = i + j - 3$ and $l = i - 1$.
> pretty(pascal(5))
1 1 1 1 1
1 2 3 4 5
1 3 6 10 15
1 4 10 20 35
1 5 15 35 70
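The output above can be cross-checked with plain Lua tables instead of matrix objects. The version below is only a reference sketch: the name pascal_table is hypothetical, the code does not use the matrix library, and it assumes the recursion that each entry is the sum of its upper and left neighbors, with ones on the first row and column.

```lua
-- Build the n x n Pascal matrix as a table of rows, using
-- P[i][j] = P[i-1][j] + P[i][j-1] with ones on the first row and column.
local function pascal_table(n)
  local p = {}
  for i = 1, n do
    p[i] = {}
    for j = 1, n do
      if i == 1 or j == 1 then
        p[i][j] = 1
      else
        p[i][j] = p[i - 1][j] + p[i][j - 1]
      end
    end
  end
  return p
end

local p = pascal_table(5)
for i = 1, 5 do print(table.concat(p[i], " ")) end
```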
Lagrangian interpolation
Given n points in the plane, $(x_i, y_i)$, $i = 1, \ldots, n$, with distinct $x_i$, the
interpolation problem requires us to find an interpolating function $f$ (the
interpolant) such that $f(x_i) = y_i$. Interpolants are usually expressed as a linear
combination of a set of basis functions $b_1, \ldots, b_n$: $f(x) = \sum_{k=1}^{n} c_k b_k(x)$,
where the coefficients $c_k$ are to be determined in order to satisfy the
interpolation criterion
$$f(x_i) = \sum_{k=1}^{n} c_k b_k(x_i) = y_i, \quad i = 1, \ldots, n. \tag{16.1}$$
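$$B^T c =
\begin{bmatrix}
1 & x_1 & \cdots & x_1^{n-1} \\
1 & x_2 & \cdots & x_2^{n-1} \\
\vdots & \vdots & \ddots & \vdots \\
1 & x_n & \cdots & x_n^{n-1}
\end{bmatrix}
\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}
=
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
= y$$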
The matrix B above has a particular structure where the i-th row can be
obtained from the (i − 1)-th row by element-wise product with x; B is then said to
be a Vandermonde matrix with basis x and order n13. We can easily generate a
Vandermonde matrix with
Observe how we specify the copy destination in the for loop and use the colon
notation: r is v[i-1], and what gets multiplied by b is v[i], the copy destination.
For small n this method works fine, but even though B is not singular it
might get ill-conditioned as n grows — even singular to machine precision —
resulting in very sensitive coefficients. Moreover, we still need to solve the
system in equation (16.1) by providing more bindings to our matrix library from
other external libraries14 .
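The row-by-row construction can be sketched with plain tables. The function name vander_table is hypothetical; it builds B so that row i is the element-wise product of row i − 1 with x, as described above, but without the copy-destination technique of the matrix version.

```lua
-- Build the Vandermonde matrix with basis x and order n as a table of rows.
local function vander_table(x, n)
  n = n or #x
  local b = {}
  b[1] = {}
  for k = 1, #x do b[1][k] = 1 end          -- first row: x^0
  for i = 2, n do
    local r, prev = {}, b[i - 1]
    for k = 1, #x do r[k] = prev[k] * x[k] end
    b[i] = r
  end
  return b
end

local B = vander_table({2, 3, 5})
-- B[i][k] == x[k]^(i-1)
print(B[3][1], B[3][2], B[3][3])
```

Since the entries of row i grow like the (i − 1)-th power of x, the rows quickly differ by many orders of magnitude, which gives a feel for the conditioning problem mentioned above.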
Another option is to specify a different, more numerically stable, basis. For
instance, if we choose the Lagrangian basis
$$b_k(x) = \prod_{j \neq k} \frac{x - x_j}{x_k - x_j}, \quad k = 1, \ldots, n,$$
we can even implement it efficiently using tables. On the other hand, a version
using matrices should avoid explicit loops to be more efficient. Listing 7 presents
both versions in interp1t and interp1 respectively.
In interp1, b computes $b_k(z)$ by folding a closure on xk that accumulates the
product over x. We avoid the singularity when j == k by comparing xk directly
to xj and doing nothing, that is, passing t untouched, if they are equal15. After
evaluating $b_k(z)$ by mapping b to x, we apply a dot product to y to obtain the
13 Some authors actually define the transpose of a Vandermonde matrix as Vandermonde.
14 BLAS provides methods for solving linear systems only when B is triangular, and so we need
to resort to BLAS’s big brother, LAPACK (http://www.netlib.org/lapack), to solve general linear
systems. Unfortunately, this is out of our scope.
15 Assure yourself that it is ok here to compare floating point numbers directly.
function interp1t(x, y)
  local n = #x
  assert(n == #y, "sizes differ")
  return function(z) -- f
    local p = 0 -- f(z)
    for k = 1, n do
      local xk = x[k]
      local t = 1 -- b_k(z)
      for j = 1, n do
        if j ~= k then
          local xj = x[j]
          t = t * (z - xj) / (xk - xj)
        end
      end
      p = p + t * y[k]
    end
    return p
  end
end
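The folding idea behind interp1 can be illustrated with plain tables as well. In the sketch below, fold and map are small hypothetical helpers standing in for the matrix methods of similar names, so this approximates the technique and is not the interp1 listing itself.

```lua
-- fold(t, f, init): accumulate f(acc, t[i]) over the entries of t.
local function fold(t, f, init)
  local acc = init
  for i = 1, #t do acc = f(acc, t[i]) end
  return acc
end

-- map(t, f): new table with f applied to each entry.
local function map(t, f)
  local r = {}
  for i = 1, #t do r[i] = f(t[i]) end
  return r
end

local function interp1_fold(x, y)
  assert(#x == #y, "sizes differ")
  return function(z)
    -- b(xk) evaluates the Lagrangian basis function b_k at z
    local function b(xk)
      return fold(x, function(t, xj)
        if xj == xk then return t end       -- skip the j == k factor
        return t * (z - xj) / (xk - xj)
      end, 1)
    end
    local bz = map(x, b)
    local p = 0                              -- dot product of bz and y
    for k = 1, #y do p = p + bz[k] * y[k] end
    return p
  end
end

local f = interp1_fold({1, 2, 3}, {1, 4, 9})  -- samples of z^2
print(f(2.5))
```

Three samples of a quadratic determine it exactly, so f reproduces z^2 at any z up to rounding.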
Clenshaw–Curtis quadrature
Suppose now that we want to compute the integral of an arbitrary function f .
An approximate approach is to sample f at a number of distinct abscissae xk ,
16 FFTW is free software and can be found at http://www.fftw.org.
17 In case you are wondering: yes, there is one such implementation in the repository, and it is very
educative on its own.
then $c_k$ can be obtained by a discrete cosine transform on $f(x)$ and $w_k$ are simply:
$$w_k = \int_{-1}^{1} T_k(x)\,dx =
\begin{cases}
0, & k \text{ odd} \\
2(1 - k^2)^{-1}, & k \text{ even}
\end{cases}$$
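As a small check of these weights, assume that the interpolant is a Chebyshev series with coefficients $c_k$, so that its integral over [−1, 1] is the sum of $c_k w_k$. The helper names below are hypothetical, and the coefficients are written by hand rather than obtained from a discrete cosine transform.

```lua
-- w_k = integral of T_k over [-1, 1]: 0 for odd k, 2 / (1 - k^2) for even k.
local function ccweight(k)
  if k % 2 == 1 then return 0 end
  return 2 / (1 - k * k)
end

-- Integral over [-1, 1] of sum_k c[k] * T_k(x); c is indexed from 0.
local function chebintegral(c)
  local s, k = 0, 0
  while c[k] do
    s = s + c[k] * ccweight(k)
    k = k + 1
  end
  return s
end

-- x^2 = (T_0 + T_2) / 2, and its integral over [-1, 1] is 2/3.
local c = {[0] = 0.5, [1] = 0, [2] = 0.5}
print(chebintegral(c))
```

Odd-order terms contribute nothing because odd Chebyshev polynomials are odd functions on a symmetric interval.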
There is still room for improvement in cheb. For one, we could provide a pre-
cision as argument instead of the number of sample points, and control the pre-
cision by selecting the appropriate number of points. Another improvement, or
better, extension, would be to implement an interpolation routine that computes
an approximation to f (x) at arbitrary x based on our Chebyshev interpolant.
By choosing a suitable transformation, we can even approximate improper
integrals. Consider the tangent transform,
$$\int_a^b f(t)\,dt = \frac{\pi}{2} \int_{2\tan^{-1}(a)/\pi}^{2\tan^{-1}(b)/\pi}
f\left(\tan\frac{\pi x}{2}\right) \sec^2\frac{\pi x}{2}\,dx,$$
do
  local f = function(x)
    return 1 / math.sqrt(2 * math.pi) * math.exp(-x * x / 2)
  end
  local g = ttintegral(f, 1000)
  pnorm = function(x) return g(-math.huge, x) end
end
require "matrix"
local linspace, dot = matrix.linspace, matrix.dot
local pi, cos, setmetatable = math.pi, math.cos, setmetatable
module(...)
require "cheb"
function ttintegral (func, n) -- tangent transform
  local p2 = math.pi / 2
  local f = function(x)
    local v = x * p2
    local c = 1 / math.cos(v) -- sec(pi * x / 2)
    return func(math.tan(v)) * c * c
  end
  local c = cheb.new(f, n)
  return function(a, b)
    return p2 * c(math.atan(a) / p2, math.atan(b) / p2)
  end
end
gamma = function(t, n)
  assert(t >= 1, "argument is less than 1: " .. t)
  local n = n or 1000
  local f = function(x) return x ^ (t - 1) * math.exp(-x) end
  return ttintegral(f, n)(0, math.huge)
end
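The tangent transform used by ttintegral can be checked numerically without the cheb module. The sketch below swaps the Chebyshev interpolant for a plain midpoint rule, which is an assumption made to keep the example self-contained; the names midpoint and ttmidpoint are hypothetical.

```lua
-- Midpoint rule on [a, b] with n subintervals (a simple stand-in for cheb).
local function midpoint(f, a, b, n)
  local h = (b - a) / n
  local s = 0
  for i = 1, n do s = s + f(a + (i - 0.5) * h) end
  return s * h
end

-- Integrate func over [a, b], possibly infinite, via t = tan(pi * x / 2).
local function ttmidpoint(func, a, b, n)
  local p2 = math.pi / 2
  local g = function(x)
    local v = x * p2
    local c = 1 / math.cos(v)            -- sec(pi * x / 2)
    return func(math.tan(v)) * c * c
  end
  return p2 * midpoint(g, math.atan(a) / p2, math.atan(b) / p2, n or 2000)
end

-- Standard normal density: its integral over the whole line is 1.
local pdf = function(t)
  return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)
end
local total = ttmidpoint(pdf, -math.huge, math.huge)
print(total)
```

The midpoint rule never samples the endpoints x = ±1, where the tangent is unbounded, which is why it is a convenient substitute here.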
Conclusions
In this gem, we have implemented a matrix library and a few applications. Lua’s
resources, including userdatum environments, metamethods, function closures,
first class functions, and proper lexical scoping were invaluable for a simple
yet powerful implementation: our matrices have lazily interned rows, efficient
functional facilities, and proper arithmetic operators, while our Chebyshev in-
terpolants can compute the integral of any function with arbitrary precision and
limits of integration.
Thanks to the C API, it is almost straightforward to extend Lua by either
wrapping routines from high performance, specialized, external libraries or
providing your own routines. We hope this gem inspires the reader to create
new libraries — especially numerical ones! — by following the same approach.
One could also extend the current implementation as suggested in the footnotes,
or even by enhancing the matrix object to account for multidimensionality and
a typing system that would elect the best routine for a specific task if the
matrix were, say, triangular or symmetric. For an implementation of these
latter improvements and other scientific computing facilities, including random
deviates for a number of probability distributions, complex number support,
and specialized functions, the reader can refer to the Numeric Lua project at
http://numlua.luaforge.net.
17 Complex Structured Data Input
Julio M. Fernández-Díaz
Lua is very good at describing (often complex) data structures through tables.
However, apart from spotting syntax errors, Lua cannot deal directly with the
logical structure of read tables.
This gem explains a small library which, in combination with data templates,
enables the user to: introduce complex, controlled structures to any depth; in-
clude test functions to check the validity of the input; declare optional values at
any level for missing fields, if desired; etc. An example, including the appropri-
ate driver, which runs in a convenient protected mode, is also shown.
The problem
As a developer, you are preparing a program (to be used by other people) which
requires (relatively general) data to be processed: control parameters in a
chemical plant, object characteristics in an arcade game, etc.
You may also wish to develop programs that allow some external configuration:
sometimes for your own use; at other times, the end users wish to adapt the
program to their particular needs (which always adds value to the product).
On other occasions, you are preparing a complex program with a graphical
user interface: windows, menus, buttons, radio-buttons, etc. A tool to facili-
tate the implementation would be welcome: a system that describes the menu
structure both well and clearly (with the corresponding actions) would be very
useful. Then, by adding only a general function for managing the structure, we
have nearly all the necessary parts for an operative program.
Let us show an example (for the sake of brevity, this is somewhat simplified
compared with a more realistic case). In a program to make book covers, a box
is described in the form:
box "box 1" "front" "c" 90 10 40 10 20 0.5 0.3 0
Our program manages it well, but the user will probably have to go to the
‘manual’ to understand the meaning of each item. It is evident that a descriptive
format (even with comments) is much better:
box {
  -- this is a comment
  id = "box 1",
  place = "front",
  adjust = "c",
  angle = 90, -- another comment
  position = {x = 10, y = 40},
  width = 10,
  height = 20,
  fill = {color = {r = 0.5, g = 0.3, b = 0}}
}
This second fragment, apart from data, is a chunk of Lua code, which has
to be processed using a function named box (defined in another place inside the
program). Some of the characteristics of the box may even be optional and have
default values. These improvements are easy to implement in the latter case,
but not in the former.
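To illustrate why the table form makes optional fields easy, here is a minimal sketch of a box function that fills in default values. The defaults used (angle = 0 and adjust = "l") are the ones this gem mentions for the example template; the boxes list and the way the defaults are applied are assumptions of the sketch, not the program's actual code.

```lua
local boxes = {}                       -- hypothetical store for parsed boxes
local defaults = {angle = 0, adjust = "l"}

-- Called from the data file as: box { id = ..., place = ..., ... }
function box(t)
  for k, v in pairs(defaults) do
    if t[k] == nil then t[k] = v end   -- only fill in missing fields
  end
  boxes[#boxes + 1] = t
  return t
end

box {
  id = "box 1",
  place = "front",
  position = {x = 10, y = 40},
  width = 10,
  height = 20,
}
print(boxes[1].id, boxes[1].angle, boxes[1].adjust)
```

With the positional form, a missing value would shift every later argument, so optional items could only appear at the end of the line.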
Lua is very good at describing data with a complex structure through tables
(see Section 10.1 in PIL2 for an example). However, the Lua interpreter is
not able to deal directly with the logical structure of those tables, to determine
whether a field must exist or not, to distinguish correct fields from wrong ones,
etc. For example, if the user types plaxe instead of place, the program should
throw an error. If the user inputs r = 2 (forbidden because the amount of
red must be between 0 and 1), the program must signal it. Therefore, a data
description ‘shell’ should be included.
Besides, the possibility of optional fields (with default values or not) should
also exist. Sometimes the program should hinder the use of fields different from
a given set, but at other times this restriction is inadmissible (because the data
file is also used by other programs with diverse necessities).
Thus, we have a problem which may be stated as follows:
To develop a Lua library that allows the input of complex structured
data, with complete control and validation of the contents.
The solution
The initial idea was to include in the program a table with the structure of the
desired input data and to compare the value type field by field. However, this
require "datatest"
local positive = datatest.numrange(0, math.huge)
local purecolor = {VALUE = 1, TEST = datatest.numrange(0, 1)}
local rgbcolor = {r = purecolor, g = purecolor, b = purecolor}
local places = datatest.inset({"front", "back", "spine"},string.lower)
does not allow other validations. For example, we could not distinguish between
different numbers (some correct, others not).
The solution found uses data templates as well, but also includes information
about the actual fields. Besides, the template structure must be related to the
data to be input, with the aim of facilitating their management from Lua.
Using the above example, we shall present and comment on a suitable tem-
plate for a box. Subsequently, we shall set out the design of a function to manage
the data. A version for the example is shown in Listing 1 (in a realistic case,
more fields could appear). This listing is almost self-descriptive. The template
is a table with sub-tables. The fields placed at an odd depth are control fields,
with information about the treatment of the fields placed at the next depth. A
level n in the data table corresponds to fields in CONTAINS (or VALUE) at level 2n
in the template. The possible control fields are described in Table 1.
In the example, only the shown fields are allowed, because in the first level
the control ALLSTRICT = true is declared. Some fields are optional, such as fill
and angle, although the latter has a default value of 0 if not given, due to its
control DEFAULT = 0. For some fields, the only check is the data type (e.g., id must
be a string), but other fields have testing functions (such as the color element
fill.color.r: the number must be between 0 and 1). The adjust field, which is
optional with a default value "l", is solely able to take the values "l", "L", "c",
"C", "r" or "R".
The content of a field VALUE is any value of the expected type (even a function
or a table), which is used for type checking. The contents of the field CONTAINS
(‘non-terminal data’) must be a table with a sub-template.
Although the description of possible contents in a box is somewhat verbose,
the versatility is apparent. It should be borne in mind that the template is
designed once, and that the user ‘does not see it’.
The programmer should develop checking routines for the data. The follow-
ing, which appear in the example, are interesting:
These routines are included in the appropriate place in the Gems repository
(as part of a complete example). Other similar datatest.* routines can be
developed to check other entry types.
In Listing 1, we defined and used several variables (positive, purecolor,
and rgbcolor) based on the above functions, because the code is clearer.
The routines set out in http://lua-users.org/wiki/LuaTypeChecking could
also be useful for data checking (but be careful, as another box function is given
there which has no relationship with ours).
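As an illustration of what checking routines such as numrange and inset might look like, the sketch below builds test functions as closures. The call forms numrange(lo, hi) and inset(list, transform) follow their use in Listing 1, but the bodies and the return convention (true, or false plus a message) are assumptions; the actual implementations are in the repository.

```lua
local datatest_sketch = {}             -- hypothetical namespace

-- Returns a test that accepts numbers v with lo <= v <= hi.
function datatest_sketch.numrange(lo, hi)
  return function(v)
    if type(v) == "number" and v >= lo and v <= hi then return true end
    return false, "value out of range [" .. lo .. ", " .. hi .. "]"
  end
end

-- Returns a test that accepts values found in list; transform (optional)
-- is applied to the value first, e.g. string.lower for case-insensitivity.
function datatest_sketch.inset(list, transform)
  local set = {}
  for _, item in ipairs(list) do set[item] = true end
  return function(v)
    if transform then v = transform(v) end
    if set[v] then return true end
    return false, "value not in the allowed set"
  end
end

local purecolor = datatest_sketch.numrange(0, 1)
local places = datatest_sketch.inset({"front", "back", "spine"}, string.lower)
print(purecolor(0.5), purecolor(2), places("Front"), places("plaxe"))
```

Because each call returns a fresh closure, one routine can serve many fields with different bounds or sets.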
The recursive character of the data structure allows a relatively simple pro-
cess via a recursive function: datatest.main (template, data, label) verifies
whether table data agrees with template. The argument label is used to display
the table name if an error is detected. The function returns true and an empty
string "" if all is correct, and false plus an error message otherwise. An outline
of this function is shown in Listing 2. (In the repository, the library datatest is
provided as a module.)
Listing 2. Outline of datatest.main. Parts of the code summarized inside <<· · ·>>.
Then the function checks CONTAINS and VALUE fields, which are incompatible,
but one of which must appear. After this, the routine analyzes whether the field
exists in the data table. If the field is not optional and is not provided, an error
message is prepared and the error is returned. If the field is optional and is
not given, but it has a DEFAULT field, this is assigned to the corresponding one
in the original table torig. (We use setvar from http://lua-users.org/wiki/
SetVariablesAndTablesWithFunction in this part.) A message is placed in an
auxiliary container (precart) to indicate the analysis of the default value (this
might have been incorrectly input in the template). It is evident that an error in
a template DEFAULT field is more a problem of programming than of the user: the
latter could inform the developer of it. Nevertheless, the user should check the
template with an extensive set of testing data before deployment of the program.
After that, the routine tests whether the item is terminal. If so, the data type
is checked versus the one in the template. Then the corresponding TEST function
is called to verify whether the value is valid, returning an error otherwise.
When data are not terminal, the field in data and template must contain
tables. If this is so, the presence of more fields than allowed (if STRICT or
ALLSTRICT are true) is analyzed, returning an error in that case. Otherwise
all the fields in the sub-table are analyzed by recursively invoking maintest for
each one with the appropriate arguments.
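The flow just described can be condensed into a small recursive checker. The sketch below is a simplified illustration and not the datatest.main of Listing 2: it handles only VALUE, CONTAINS, TEST, DEFAULT and STRICT, it uses a hypothetical OPTIONAL control field to mark optional entries, and it checks an assigned default like any other value instead of reporting it separately.

```lua
-- check(tpl, data, label): tpl is a field template, data the value to test.
-- Returns true, "" on success, or false plus an error message.
local function check(tpl, data, label)
  if tpl.VALUE ~= nil then                      -- terminal field
    if type(data) ~= type(tpl.VALUE) then
      return false, label .. ": expected " .. type(tpl.VALUE)
    end
    if tpl.TEST and not tpl.TEST(data) then
      return false, label .. ": invalid value"
    end
    return true, ""
  end
  -- non-terminal field: both template and data must hold tables
  local sub = tpl.CONTAINS
  if type(sub) ~= "table" or type(data) ~= "table" then
    return false, label .. ": table expected"
  end
  if tpl.STRICT then                            -- reject unknown fields
    for k in pairs(data) do
      if sub[k] == nil then
        return false, label .. "." .. tostring(k) .. ": unknown field"
      end
    end
  end
  for name, ftpl in pairs(sub) do
    if data[name] == nil and ftpl.DEFAULT ~= nil then
      data[name] = ftpl.DEFAULT                 -- fill in the default
    end
    if data[name] == nil then
      if not ftpl.OPTIONAL then
        return false, label .. "." .. name .. ": missing field"
      end
    else
      local ok, msg = check(ftpl, data[name], label .. "." .. name)
      if not ok then return false, msg end
    end
  end
  return true, ""
end

local unit = {VALUE = 1, TEST = function(v) return v >= 0 and v <= 1 end}
local tpl = {STRICT = true, CONTAINS = {
  id    = {VALUE = ""},
  angle = {VALUE = 0, DEFAULT = 0},
  color = {OPTIONAL = true, CONTAINS = {r = unit, g = unit, b = unit}},
}}

local good = {id = "box 1", color = {r = 0.5, g = 0.3, b = 0}}
local ok1 = check(tpl, good, "box")
local ok2, msg2 = check(tpl, {id = "box 2", plaxe = "front"}, "box")
local ok3, msg3 = check(tpl, {id = "box 3", color = {r = 2, g = 0, b = 0}}, "box")
print(ok1, good.angle, ok2, msg2, ok3, msg3)
```

The label grows by one component per recursion level, so an error message names the full path of the offending field.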
How to use
A driver for the above example is shown in Listing 3. Data processing is per-
formed in a protected environment, which allows assignments, calling functions
from the libraries math, string, table, and the ones defined in process (like box).
This methodology is known as ‘sandboxing’. The use of a protected environment
is very important in the present case: it avoids undue use of some dangerous
functions (like os.execute and os.remove) by the user.
Employing this method, variables can be included in the data file for their
use in some parts of the data. For example, we can define a color:
for k = 1, 10 do
  w = k; h = 11-k
  if k ~= 3 then box{width = w, height = h, <<other fields>>} end
end
-- fname stores the name of the data file taken from argument #1
local fname = arg[1]
local proc, msg = loadfile(fname)
if not proc then
  print(msg); os.exit(1)
end
In this case, we could also export pairs, ipairs, tonumber, tostring, and per-
haps os.time and os.date by means of setfenv.
In the data file, we can also define functions to be applied to the data before
processing them inside box (in the example). Any variable definition, whether
local or global, used in the data file (using the driver shown above) remains
internal in it and does not pollute other environments of the program.
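A stripped-down version of the sandboxing idea is sketched below. It is not Listing 3: the chunk is loaded from a string rather than from a file, the environment table and the run_sandboxed helper are hypothetical, and the code falls back to load with an environment argument on Lua versions that lack setfenv.

```lua
-- Run the Lua source src inside the environment env, in protected mode.
local function run_sandboxed(src, env)
  local chunk, msg
  if setfenv then                        -- Lua 5.1
    chunk, msg = loadstring(src, "data")
    if chunk then setfenv(chunk, env) end
  else                                   -- Lua 5.2 and later
    chunk, msg = load(src, "data", "t", env)
  end
  if not chunk then return false, msg end
  return pcall(chunk)
end

local boxes = {}
local env = {
  math = math, string = string, table = table,
  box = function(t) boxes[#boxes + 1] = t end,
}

local ok = run_sandboxed([[
  for k = 1, 3 do
    w = k; h = 11 - k                    -- globals stay inside env
    box{width = w, height = h}
  end
]], env)

-- os is not visible inside the sandbox, so this chunk fails safely.
local bad, err = run_sandboxed([[os.remove("somefile")]], env)
print(ok, #boxes, env.w, w, bad)
```

The variables w and h end up as fields of env, while the global w of the host program is untouched, which is the isolation property described above.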
Conclusions
Sooner or later programmers using Lua are confronted with often complex data
input, which is frequently not under their control (because other people use the
program). A method that allows management of the data, with contents control
and validation, is thus convenient.
This gem presented a solution to this problem, using table templates and
combining these with a function that does the checking. The templates are
tables with control fields placed at odd depth levels, whereas even levels are
used for the proper data field names and values. The template is actually a
qualitative-quantitative description of the input data to be treated.
Control fields are used to define characteristics of data fields: whether these
are optional or not, whether they have default values, whether other fields
not present in the template are allowed in the data, and to define data check
functions.
From a programming point of view, the solution (mainly) uses a function
that behaves as a wrapper and includes a recursive function that performs the
checking of the data table level by level. The number of code lines is small due
to the facilities provided by Lua: tables, recursion, and the first-class character
of functions (among other features).
An example driver is included accompanying this main function. This works
in an appropriate environment to protect the data processing from possible
undue uses.
Finally, due to its versatility, this library can be used to neatly develop
programs for user-friendly data input, adaptive configuration of programs and
graphical user interface design, among other tasks.
18 Lua Implementations of Common Data Structures
Matthew M. Burke
In Lua, as with any programming language, one should learn the style and
idioms that best take advantage of the language’s features, rather than utilizing
techniques from other languages. Jung and Brown state “[b]ecause of tables’
flexibility, you often don’t need a customized data structure. Just ask yourself
how you most often want to access your data — usually an associative table or
an array will do the job” [7, pg. 157].
Lua provides a single built-in data structuring mechanism, the “table,” which
combines the functionality of (re-sizable) arrays and hashes. Any of the fun-
damental data structures and their associated algorithms can be implemented
using Lua tables, but it is not always clear how best to do so. After a brief
discussion of how tables are implemented in Lua, this article describes imple-
mentations of several of the most important and common data structures: lists,
stacks and queues, trees, graphs, and sets. Next, several specialized data struc-
tures, including dictionaries and multisets, are presented. Finally, a few tips are
provided on how to structure data in concordance with the spirit of Lua.
Code examples
Good Lua practice dictates that the functions associated with a data structure
be collected in a (meta)table associated with the data structure. Not only does
this reduce pollution of the global namespace, but it allows one to program in an
object-oriented style.
It is tempting to make use of table.getn for implementing size. But there are
three points to consider: First, table.getn only returns the size of the array
portion of a table. Not only does it not count entries in the hashed portion of the
table, but if there is a gap in the sequence of integer keys, the indices following
the gap are not counted1 . Second, table.getn is O(n). Finally, and, perhaps,
obviously, implementations which use more than one table, particularly linked
data structures, cannot be sized using table.getn.
For these reasons, it is preferable to store the size explicitly in the data
structure. Doing so yields an O(1) implementation, although care must be
taken to ensure that this value is properly updated as the data structure is
manipulated. Implementations will accompany the discussions of the various
data structures in this article.
1 This follows from the fact that non-sequential integer keys are stored in the hash part of the
table.
Lua tables
Lua tables function as a combination of (adjustable-size) arrays and associative
arrays depending on what kinds of values are used as keys. In this article, a Lua
table that has exclusively integral keys is referred to as an array, provided the
table also satisfies the condition that there are no gaps in the sequence of key
values. Tables that do not qualify as arrays will be referred to as mixed tables,
or simply, tables.
Lists
There are two common ADTs for Lists. The first is the Array List2 , whose ADT
is presented in Table 2. The second is the Node List with ADT in Table 3.
Array lists
Array lists are most easily implemented as Lua tables with integer indices. The
functions in the Array List ADT are implemented below. The functions get and
set have (amortized) costs of O(1) and the costs of add and remove are O(n)
because elements must be shifted up or down.
List = {}; List.__index = List
function List:new()
  return setmetatable({ __size = 0 }, self)
end
function List:get(i)
  return self[i]
end
function List:set(i, e)
  self[i] = e
end
function List:add(i, e)
  table.insert(self, i, e)
  self.__size = self.__size + 1
end
function List:remove(i)
  table.remove(self, i)
  self.__size = self.__size - 1
end
function List:size()
  return self.__size
end
Node lists
Since variables store references to tables, it is easy to create linked lists using
one table per node. Moreover, these linked lists can very easily be multiply-
linked lists. Two example implementations follow. The first is a doubly-linked
list. The second is an excerpt from an implementation of a Skip List. In these
examples, a node’s data is accessed with the index value. The other (string)
indices are used for the links. The motivation for Skip Lists is outside the scope
of this article. Interested readers should consult [2].
Note that using a doubly-linked structure makes implementing the Node List
easier, although it can be implemented with a singly-linked structure.
NList = {}; NList.__index = NList
function NList:new()
  local l = { head = {}, tail = {}, __size = 0 }
  l.head.__next, l.tail.__prev = l.tail, l.head
  return setmetatable(l, self)
end
function NList:first()
  if self.__size > 0 then
    return self.head.__next
  else
    return nil
  end
end
function NList:last()
  if self.__size > 0 then
    return self.tail.__prev
  else
    return nil
  end
end
function NList:next(node)
  if node.__next ~= self.tail then
    return node.__next
  else
    return nil
  end
end
function NList:prev(node)
  if node.__prev ~= self.head then
    return node.__prev
  else
    return nil
  end
end
function NList:addFirst(elem)
  local node =
    { __prev = self.head, value = elem, __next = self.head.__next }
  node.__next.__prev = node
  self.head.__next = node
  self.__size = self.__size + 1
end
function NList:addLast(elem)
  local node =
    { __prev = self.tail.__prev, value = elem, __next = self.tail }
  node.__prev.__next = node
  self.tail.__prev = node
  self.__size = self.__size + 1
end
function NList:addBefore(node, elem)
  local new_node =
    { __prev = node.__prev, value = elem, __next = node }
  new_node.__prev.__next = new_node
  node.__prev = new_node
  self.__size = self.__size + 1
end
function NList:addAfter(node, elem)
  local new_node =
    { __prev = node, value = elem, __next = node.__next }
  node.__next = new_node
  new_node.__next.__prev = new_node
  self.__size = self.__size + 1
end
function NList:remove(node)
  node.__prev.__next = node.__next
  node.__next.__prev = node.__prev
  self.__size = self.__size - 1
end
function NList:size()
  return self.__size
end
The Skip List example below is included here to demonstrate the ease with
which one can build structures with arbitrary linkages. A Skip List is imple-
mented using several doubly-linked lists. Each list has copies of some of the
nodes in the previous list. A node in a list points to its predecessor and succes-
sor in the list. It also points to its copies (if they exist) in the previous and next
lists. Therefore, a node in a Skip List has four pointers: next, prev, below, and
above. The example code shows an implementation of the find function.
function SkipList:find(k)
  local p = self.TopLeft
  while p.below do
    p = p.below
    while k >= p.next.key do p = p.next end
  end
  return p  -- node with the largest key <= k on the bottom list
end
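To see the find function in action, the following minimal sketch wires up a small Skip List by hand. The helpers node, link_row, and stack, and the use of -math.huge and math.huge as sentinel keys, are assumptions made for this illustration only; they are not taken from the original implementation.

```lua
-- Hypothetical helpers: a node has a key and up to four links.
local function node(key) return { key = key } end

-- Link a row of nodes left to right with next/prev pointers.
local function link_row(row)
  for i = 1, #row - 1 do
    row[i].next, row[i + 1].prev = row[i + 1], row[i]
  end
end

-- Connect a node to its copy on the list below.
local function stack(upper, lower)
  upper.below, lower.above = lower, upper
end

local SkipList = {}
SkipList.__index = SkipList

function SkipList:find(k)
  local p = self.TopLeft
  while p.below do
    p = p.below
    while k >= p.next.key do p = p.next end
  end
  return p  -- node with the largest key <= k on the bottom list
end

-- Bottom list: -inf, 10, 20, 30, +inf.  Middle list: -inf, 20, +inf.
-- The top list holds only the two sentinels, as find expects.
local inf = math.huge
local bottom = { node(-inf), node(10), node(20), node(30), node(inf) }
local middle = { node(-inf), node(20), node(inf) }
local top    = { node(-inf), node(inf) }
link_row(bottom); link_row(middle); link_row(top)
stack(top[1], middle[1]); stack(top[2], middle[3])
stack(middle[1], bottom[1]); stack(middle[2], bottom[3])
stack(middle[3], bottom[5])

local list = setmetatable({ TopLeft = top[1] }, SkipList)
print(list:find(20).key)  -- key 20 is present
print(list:find(25).key)  -- largest key not exceeding 25
```

The sentinels guarantee that p.next is never nil during the horizontal scan, which is why the inner loop needs no extra checks.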
Stacks
The default value for table.insert’s position parameter is n + 1, where n is the
length of the table. For table.remove, the default value for position is n. Thus,
we can implement push as table.insert(stack, value) and pop as
table.remove(stack). The cost of these functions is interesting because it
depends on whether the table needs to be resized, whether getn compatibility
was enabled when Lua was compiled, and whether the sizes table has been
created (see getsizes in lauxlib.c). In some cases, these function calls may
invoke the internal function luaH_getn in ltable.c, which costs O(log n), to
determine the correct value for the insertion/removal position.
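The following short sketch shows push and pop built on these defaults. The wrapper names push and pop are chosen here for illustration and are not part of the standard library.

```lua
-- A stack kept in the array part of a plain table.
local stack = {}

local function push(s, v) table.insert(s, v) end  -- appends at position #s + 1
local function pop(s) return table.remove(s) end  -- removes position #s

push(stack, "a")
push(stack, "b")
push(stack, "c")

print(pop(stack))  -- the last element pushed
print(#stack)      -- two elements remain
print(pop(stack), pop(stack), pop(stack))  -- popping an empty stack yields nil
```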
Dequeue = {}; Dequeue.__index = Dequeue
function Dequeue:new()
  return setmetatable({ __first = 0, __last = -1 }, self)
end
function Dequeue:addFirst(elem)
self.__first = self.__first - 1
self[self.__first] = elem
end
function Dequeue:addLast(elem)
self.__last = self.__last + 1
self[self.__last] = elem
end
function Dequeue:first()
return self[self.__first]
end
function Dequeue:last()
return self[self.__last]
end
function Dequeue:getFirst()
  if self.__first > self.__last then return nil end
  local result = self[self.__first]
  self[self.__first] = nil  -- release the slot so stale values cannot be read
  self.__first = self.__first + 1
  return result
end
function Dequeue:getLast()
  if self.__first > self.__last then return nil end
  local result = self[self.__last]
  self[self.__last] = nil   -- release the slot so stale values cannot be read
  self.__last = self.__last - 1
  return result
end
function Dequeue:size()
return (self.__last - self.__first + 1)
end
require 'NList'
NDequeue = {}
NDequeue.__index = NDequeue
function NDequeue:new()
local l = { nlist = NList:new() }
return setmetatable(l, self)
end
function NDequeue:addFirst(elem)
self.nlist:addFirst(elem)
end
function NDequeue:addLast(elem)
self.nlist:addLast(elem)
end
function NDequeue:getFirst()
local result = self.nlist:first()
if result then
self.nlist:remove(result)
return result.value
else
return nil
end
end
function NDequeue:getLast()
local result = self.nlist:last()
if result then
self.nlist:remove(result)
return result.value
else
return nil
end
end
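function NDequeue:first()
  local node = self.nlist:first()
  return node and node.value  -- nil when the deque is empty
end
function NDequeue:last()
  local node = self.nlist:last()
  return node and node.value  -- nil when the deque is empty
end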
function NDequeue:size()
return self.nlist:size()
end
Trees
The Tree ADT is presented in Table 7. In the case of binary trees, additional
functions are typically implemented. These are described in Table 8. Trees are
typically implemented in one of two fashions: either collections of linked nodes
or arrays with a protocol for making use of entries. Lua works well for both
implementations. Linked node implementations will be discussed first.
Linked-node trees
A linked-node implementation of a Tree can be accomplished using the same
techniques described above for Node Lists. Each node of the tree is represented
by a table. This table has an entry with index value for the element and an entry
with index children for the references to the node’s children. The children entry
is itself a table whose elements are references to the child nodes. For binary
trees, a slight optimization is to have entries indexed with left and right to hold
the references for the node’s children. The following code implements the binary
tree interface.
NTree = {}; NTree.__index = NTree
function NTree:new()
return setmetatable({ __size = 0 }, self)
end
function NTree:root()
return self.__root
end
function NTree:addRoot(elem)
self.__root = { value = elem }
self.__size = 1
return self.__root
end
function NTree:parent(node)
return node.__parent
end
function NTree:left(node)
return node.__left
end
function NTree:right(node)
return node.__right
end
function NTree:isInternal(node)
return node.__left or node.__right
end
function NTree:isExternal(node)
return not self:isInternal(node)
end
function NTree:isRoot(node)
return (node == self.__root)
end
function NTree:remove(node)
  local parent = node.__parent
  if parent == nil then             -- removing the root empties the tree
    self.__root = nil
  elseif parent.__left == node then
    parent.__left = nil
  else
    parent.__right = nil
  end
  self.__size = self.__size - 1     -- assumes node is external (a leaf)
end
function NTree:size()
return self.__size
end
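The general (non-binary) node layout described at the start of this section, with a value entry and a children entry, can be sketched as follows. The helper names new_node, add_child, and count, and the parent field, are hypothetical and chosen only for this illustration.

```lua
-- Each node is a table: `value` holds the element, `children` is an
-- array of references to the child nodes.
local function new_node(elem)
  return { value = elem, children = {} }
end

local function add_child(parent, elem)
  local child = new_node(elem)
  child.parent = parent
  table.insert(parent.children, child)
  return child
end

-- Count the nodes of the subtree rooted at `node` (depth-first).
local function count(node)
  local total = 1
  for _, child in ipairs(node.children) do
    total = total + count(child)
  end
  return total
end

local root = new_node("root")
local a = add_child(root, "a")
add_child(root, "b")
add_child(a, "a1")
add_child(a, "a2")

print(count(root))  -- five nodes in total
print(count(a))     -- the subtree rooted at "a" has three
```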
Array-based trees
An array-based implementation of a Tree can be accomplished by numbering
the nodes of the tree. For a binary tree this can be done as follows:
n(v) = 1 if v is the root node
n(v) = 2i if v is the left child of node u and n(u) = i
n(v) = 2i + 1 if v is the right child of node u and n(u) = i
More generally, we can number the nodes of a k-ary tree as follows:
n(v) = 1 if v is the root node
n(v) = ki − k + j + 1 if v is the j-th child of node u and n(u) = i, 1 ≤ j ≤ k
With this numbering scheme at hand, implementing a tree with an array simply
requires storing node v at n(v). Note that, with this implementation, it is not
possible to store nil as a tree entry.
Implementations in other languages typically use the zeroth array entry to hold
the number of elements in the tree. This implementation, however, uses the
index __size to store the number of elements in the tree.
ATree = {}; ATree.__index = ATree
function ATree:new()
  return setmetatable({ __size = 0 }, self)
end
function ATree:root()
return self[1]
end
function ATree:addRoot(elem)
self[1] = elem
self.__size = 1
end
function ATree:parent(node)
return math.floor(node/2)
end
function ATree:left(node)
return 2 * node
end
function ATree:right(node)
return 2 * node + 1
end
function ATree:isInternal(node)
return (self[2*node] ~= nil or self[2*node+1] ~= nil)
end
function ATree:isExternal(node)
return not self:isInternal(node)
end
function ATree:isRoot(node)
return (node == 1)
end
function ATree:subsize(node)
if self[node] == nil then
return 0
else
return 1 + self:subsize(self:left(node))
+ self:subsize(self:right(node))
end
end
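function ATree:remove(node)
  if self[node] == nil then return end
  -- Clear the whole subtree so that no orphaned entries remain.
  local left, right = self:left(node), self:right(node)
  self[node] = nil
  self.__size = self.__size - 1
  self:remove(left)
  self:remove(right)
end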
function ATree:size()
return self.__size
end
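The k-ary numbering scheme given earlier in this section can be checked with a few lines of code. The child function transcribes the formula directly; the parent function is obtained here by inverting it and is therefore an assumption of this sketch rather than something stated in the text.

```lua
-- Index of the j-th child (1 <= j <= k) of the node numbered i.
local function child(i, j, k)
  return k * i - k + j + 1
end

-- Index of the parent of the node numbered n (n > 1), obtained by
-- inverting the child formula.
local function parent(n, k)
  return math.floor((n + k - 2) / k)
end

-- For k = 2 the general formula reduces to 2i and 2i + 1.
print(child(3, 1, 2), child(3, 2, 2))

-- In a ternary tree, the children of node 2 are nodes 5, 6, and 7.
for j = 1, 3 do
  local n = child(2, j, 3)
  print(n, parent(n, 3))
end
```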
Dictionary = {}; Dictionary.__index = Dictionary
function Dictionary:new()
  local l = { __size = 0 }
  return setmetatable(l, self)
end
function Dictionary:find(k)
local matches = self[k]
if (matches == nil) then
return nil
else
local _, match = next(matches)
return match
end
end
function Dictionary:findAll(k)
return self[k]
end
function Dictionary:insert(k, v)
local tab = self[k] or {}
table.insert(tab, v)
self[k] = tab
self.__size = self.__size + 1
end
function Dictionary:remove(k, v)
  local tab = self[k]
  if tab ~= nil then
    for i, val in pairs(tab) do  -- i indexes the values stored under key k
      if val == v then
        tab[i] = nil
        self.__size = self.__size - 1
      end
    end
    if next(tab) == nil then
      self[k] = nil
    end
  end
end
function Dictionary:keys()
local keys = {}
for k, _ in pairs(self) do
if k ~= '__size' then
table.insert(keys, k)
end
end
return keys
end
function Dictionary:size()
return self.__size
end
Sets
The mathematical definition of a set is a collection of distinct objects. In partic-
ular, sets do not allow for duplicates, and there is no notion of order amongst the
elements. In addition to functions that manipulate set elements, several binary
operations are defined on sets. The most common of these operations are union
(members of the new set are all objects that are members of either of the input
sets), intersection (whose members are only those objects that belong to both
input sets), and difference (all members of the first set that are not members of
the second set).
Set = {}
Set.__index = Set
function Set:new()
return setmetatable({ __size = 0 }, self)
end
function Set:add(elem)
local result = self[elem]
self[elem] = true
if not result then
self.__size = self.__size + 1
end
end
function Set:remove(elem)
local result = self[elem]
self[elem] = nil
if result then
self.__size = self.__size - 1
end
end
function Set:member(elem)
return (self[elem] or false)
end
function Set:size()
return self.__size
end
The following implementations of the binary set operations are all destructive
in that they modify the first set rather than return a new result set. They are
all implemented by iterating over the keys of one set and modifying the entries
of the first set as appropriate.
function Set:union(b)
for k, _ in pairs(b) do
if k ~= '__size' then
if not self[k] then
self.__size = self.__size + 1
end
self[k] = true
end
end
end
function Set:intersect(b)
for k, _ in pairs(self) do
if k ~= '__size' then
if not b[k] then
self.__size = self.__size - 1
self[k] = nil
end
end
end
end
function Set:difference(b)
for k, _ in pairs(b) do
if k ~= '__size' then
if self[k] then
self.__size = self.__size - 1
end
self[k] = nil
end
end
end
The run times of these three functions are all O(n). Note that the implementa-
tion of these functions can be simplified by using the (functional programming)
map. A non-destructive implementation can be achieved by copying the first set
to a new set before proceeding with the rest of the operation.
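The non-destructive variant mentioned above can be sketched with plain tables whose keys are the set members. The helpers copy and union_new are hypothetical names; unlike the Set class in this section, the sketch does not maintain a __size field.

```lua
-- Shallow copy of a set represented as { member = true, ... }.
local function copy(set)
  local result = {}
  for k in pairs(set) do result[k] = true end
  return result
end

-- Non-destructive union: copy the first set, then add the second.
local function union_new(a, b)
  local result = copy(a)
  for k in pairs(b) do result[k] = true end
  return result
end

local a = { x = true, y = true }
local b = { y = true, z = true }
local u = union_new(a, b)

print(u.x, u.y, u.z)  -- all three members are present
print(a.z)            -- the first input set is left unchanged
```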
Multisets
A common variation of a set is the so-called multiset (also known as a bag). A
multiset discards the restriction that each contained object be distinct. One
can easily adapt the prior implementation of set operations to multisets by
storing an integer count, rather than the Boolean value true, at each key. The
Multiset ADT has two functions in addition to the functions in the Set ADT.
These functions are described in Table 12.
MSet = {}
MSet.__index = MSet
function MSet:new()
local l = { __size = 0 }
return setmetatable(l, self)
end
function MSet:add(elem)
self[elem] = (self[elem] or 0) + 1
self.__size = self.__size + 1
end
function MSet:remove(elem)
local current = self[elem] or 0
if current > 0 then
current = current - 1
self.__size = self.__size - 1
end
self[elem] = (current > 0) and current or nil  -- drop keys whose count is zero
end
function MSet:member(elem)
return ((self[elem] or 0) > 0)
end
function MSet:count(elem)
return self[elem] or 0
end
function MSet:removeAll(elem)
local rcount = self[elem] or 0
self[elem] = nil
self.__size = self.__size - rcount
end
function MSet:size()
return self.__size
end
function MSet:union(b)
for k, v in pairs(b) do
if k ~= '__size' then
local bcount = v or 0
self[k] = (self[k] or 0) + bcount
self.__size = self.__size + bcount
end
end
end
function MSet:intersect(b)
  for k, acount in pairs(self) do
    if k ~= '__size' then
      local common = math.min(acount, b[k] or 0)
      -- Only the copies dropped from this multiset reduce its size.
      self.__size = self.__size - (acount - common)
      self[k] = (common > 0) and common or nil
    end
  end
end
function MSet:difference(b)
  for k, bcount in pairs(b) do
    if k ~= '__size' then
      local acount = self[k] or 0
      if acount > bcount then
        self[k] = acount - bcount
        self.__size = self.__size - bcount
      else
        self[k] = nil
        self.__size = self.__size - acount
      end
    end
  end
end
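As a small self-contained illustration of storing counts instead of true, the sketch below builds multisets from arrays and intersects them without modifying either input. The helper names bag and intersect are hypothetical and independent of the MSet class.

```lua
-- Build a multiset (element -> count) from an array of elements.
local function bag(items)
  local counts = {}
  for _, item in ipairs(items) do
    counts[item] = (counts[item] or 0) + 1
  end
  return counts
end

-- Non-destructive intersection: keep the smaller count of each element.
local function intersect(a, b)
  local result = {}
  for k, acount in pairs(a) do
    local common = math.min(acount, b[k] or 0)
    if common > 0 then result[k] = common end
  end
  return result
end

local a = bag { "x", "x", "x", "y" }
local b = bag { "x", "x", "z" }
local c = intersect(a, b)

print(c.x, c.y, c.z)  -- "x" survives with the smaller count; "y" and "z" are dropped
```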
Partitions
A partition of a set S is a collection of subsets of S with the following property:
Each element of S is a member of exactly one set in the partition. For example,
consider S = {1, 2, 3, 4, 5, 6}. One partition is the collection of singleton sets: {1},
{2}, {3}, {4}, {5}, {6}. Another partition consists of the following sets: {1, 2},
{3, 4}, {5, 6}. As an interesting aside, one can define an ordering on partitions
where a partition A is a refinement of partition B if every set in B is composed
of a union of sets of A. Note this is not a total ordering because it is possible to
have two partitions, neither of which is a refinement of the other. An ADT for
partition is presented in Table 13. An implementation of this ADT follows.
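Partition = {}; Partition.__index = Partition
function Partition:new()
  local l = { __size = 0 }
  return setmetatable(l, self)
end
function Partition:make_set(x)
  -- Label the new singleton set with x itself. Labels must stay unique;
  -- reusing __size as the label would collide with a live set after a merge.
  self[x] = x
  self.__size = self.__size + 1
end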
function Partition:merge(a, b)
  local a_idx = self:find(a)
  local b_idx = self:find(b)
  if a_idx == b_idx then return end  -- already in the same set
  for k, v in pairs(self) do
    if k ~= '__size' and v == b_idx then
      self[k] = a_idx
    end
  end
  self.__size = self.__size - 1
end
function Partition:find(x)
return self[x]
end
function Partition:get_set(x)
local idx = self:find(x)
local res = {}
for k, v in pairs(self) do
if k ~= '__size' and v == idx then
table.insert(res, k)
end
end
return res
end
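The refinement ordering mentioned in the introduction to this section can also be checked mechanically. In the sketch below, a partition is a plain table that maps each element to a set identifier; the function name refines and this representation are assumptions made for illustration, independent of the Partition class.

```lua
-- refines(A, B) checks that every set of A lies inside a single set of B,
-- which is equivalent to every set of B being a union of sets of A.
local function refines(A, B)
  local image = {}  -- set identifier in A -> set identifier in B
  for elem, a_id in pairs(A) do
    local b_id = B[elem]
    if b_id == nil then return false end  -- not the same ground set
    if image[a_id] == nil then
      image[a_id] = b_id
    elseif image[a_id] ~= b_id then
      return false
    end
  end
  return true
end

-- Partitions of S = {1, 2, 3, 4, 5, 6}; position i gives the set of element i.
local singletons = { 1, 2, 3, 4, 5, 6 }              -- {1},{2},...,{6}
local twos       = { "a", "a", "b", "b", "c", "c" }  -- {1,2},{3,4},{5,6}
local halves     = { "p", "p", "p", "q", "q", "q" }  -- {1,2,3},{4,5,6}

print(refines(singletons, twos))  -- singletons refine every partition
print(refines(twos, halves))      -- {3,4} straddles both halves
print(refines(halves, twos))      -- so neither refines the other
```

The last two calls illustrate why refinement is only a partial ordering.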
Graphs
A graph is composed of a set of vertices and a set of edges which connect
the vertices. It is important to remember that a graph is a topological object
rather than a geometric one, i.e., it is the connections between vertices that
are important, not the precise picture used to illustrate a graph. Normally in a
graph, if a node u is connected to a node v, this implies that node v is connected to
node u. In a directed graph, however, it is possible for a node u to be connected
to a node v while node v is not connected to node u. A good analogy to help
understand the difference between directed and undirected graphs is to think
of a road map. A directed graph is a network of one-way streets, whereas an
undirected graph is a network of two-way roads. A graph, either directed or
undirected, may have weights assigned to the edges. Again, if one considers the
road map analogy, weights would correspond to the distances between cities.
There are two common representations of graphs: the adjacency matrix and
the vertex list. Adjacency matrices are simple but waste space in the cases
of undirected graphs and sparse graphs. When using Lua tables, however,
adjacency matrices, even for sparse graphs, are memory efficient. Therefore, the
vertex list representation of graphs is not discussed in this article.
Graph = {}; Graph.__index = Graph
function Graph:new(n)
local l = { __size = n }
local vertices = {}
for i = 1, n do
table.insert(vertices, i)
end
l.__vertices = vertices
local graph = {}
for i = 1, n do
table.insert(graph, {})
end
l.__graph = graph
return setmetatable(l, self)
end
function Graph:vertices()
return self.__vertices
end
function Graph:incidentEdges(v)
local result = {}
for i = 1,v-1 do
if self.__graph[i][v] then table.insert(result, i) end
end
for i, _ in pairs(self.__graph[v]) do
table.insert(result, i)
end
return result
end
function Graph:areAdjacent(v, u)
return ( (self.__graph[v][u] or self.__graph[u][v]) or false)
end
function Graph:insertEdge(v, u)
if u < v then v, u = u, v end
self.__graph[v][u] = true
end
function Graph:removeEdge(v, u)
if u < v then v, u = u, v end
self.__graph[v][u] = nil
end
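The implementation above covers undirected, unweighted graphs. As a hedged sketch of how the same sparse adjacency-matrix idea extends to the directed, weighted graphs described in the introduction to this section, the following uses nested plain tables; the names adj, insert_edge, and weight are hypothetical.

```lua
-- Sparse adjacency matrix for a directed, weighted graph:
-- adj[u][v] holds the weight of the edge u -> v, or nil if absent.
local adj = {}

local function insert_edge(u, v, w)
  adj[u] = adj[u] or {}
  adj[u][v] = w
end

local function weight(u, v)
  return adj[u] and adj[u][v]
end

insert_edge("A", "B", 12)  -- one-way street from A to B
insert_edge("B", "C", 7)
insert_edge("C", "B", 7)   -- B and C are connected in both directions

print(weight("A", "B"))  -- the edge exists
print(weight("B", "A"))  -- no edge in the reverse direction
```

Only the edges that exist occupy memory, which is the property that makes the matrix representation attractive with Lua tables.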
Text processing
Most books on data structures contain a chapter discussing text processing,
particularly pattern matching and compression. This article does not cover these
topics, but the interested reader is directed to read the documentation for Lua’s
string library [4], Reuben Thomas and Shmuel Zeigerman’s rex library [13],
and Roberto Ierusalimschy’s LPEG library [6].
require 'NTree'
OrderTree = {}
OrderTree.__index = OrderTree
function OrderTree:new()
local l = { }
l.tree = NTree:new()
return setmetatable(l, self)
end
function OrderTree:addRoot(elem)
elem = { data = elem, subtree_size = 1 }
return self.tree:addRoot(elem)
end
function OrderTree:increment_ranks(node)
while (node ~= nil) do
node.value.subtree_size = node.value.subtree_size + 1
node = self.tree:parent(node)
end
end
function OrderTree:decrement_ranks(node)
while (node ~= nil) do
node.value.subtree_size = node.value.subtree_size - 1
node = self.tree:parent(node)
end
end
function OrderTree:rank(x)
local root = self.tree:root()
return self:_rank(x, root)
end
function OrderTree:select(k)
if k > self.tree:size() then
error('Tree does not contain that many items.')
end
local root = self.tree:root()
return self:_select(k, root)
end
One can easily modify the data structures presented here so that the inser-
tion routines check the type of the value before inserting it into the structure.
A simple approach would be to use the type function, although in many circum-
stances that is not sufficient. A more robust approach would be to make use
of metatables, particularly since most object-oriented systems in Lua rely on
metatables to implement classes.
Below is the implementation of a factory for creating functions to verify types.
function make_typechecker(spec)
if type(spec) == "string" then
return function(v) return (type(v) == spec) end
else -- spec is a table
return function(v) return (getmetatable(v) == spec) end
end
end
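A possible use of such a checker is sketched below. The typed_stack constructor and the Point class are hypothetical and serve only to show a checker guarding an insertion routine; make_typechecker is repeated so that the example runs on its own.

```lua
-- make_typechecker repeated from the text so the sketch is self-contained.
local function make_typechecker(spec)
  if type(spec) == "string" then
    return function(v) return (type(v) == spec) end
  else -- spec is a table used as a class metatable
    return function(v) return (getmetatable(v) == spec) end
  end
end

-- Hypothetical typed stack: push rejects values of the wrong type.
local function typed_stack(spec)
  local check, items = make_typechecker(spec), {}
  return {
    push = function(v)
      if not check(v) then error("unexpected type", 2) end
      table.insert(items, v)
    end,
    size = function() return #items end,
  }
end

local Point = {}; Point.__index = Point
local numbers, points = typed_stack("number"), typed_stack(Point)

numbers.push(42)
points.push(setmetatable({ x = 1, y = 2 }, Point))
print(pcall(numbers.push, "forty-two"))  -- rejected: a string is not a number
print(pcall(points.push, { x = 1 }))     -- rejected: a plain table is not a Point
print(numbers.size(), points.size())
```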
3. Remember that objects of any type can be table keys (except nil).
4. Make use of both the array portion and hash portion of a table.
Resources
There are a number of resources available to Lua programmers to aid in the
design of data structures. These include both existing libraries and reference
material. Good texts on Lua programming include Programming in Lua [5], the
Lua Reference Manual [4], and Beginning Lua Programming [7].
There are also several existing libraries which contain a range of data struc-
tures and algorithms. These include Reuben Thomas’s stdlib [12] and Paul
Chisano’s Sano Library [1]. Thomas’s work contains a number of useful exten-
sions to Lua’s standard libraries and has a particularly functional programming
style. Chisano’s library contains implementations of almost all the data struc-
tures described in this article.
Complete implementations of the data structures discussed in this article are
available at the book’s web site. And, of course, there are a number of examples
of data structure implementations available at the Lua Wiki [9]. Finally, the
Lua mailing list [10] has an excellent signal-to-noise ratio.
References
[1] Chisano, Paul. Sano Library. http://luaforge.net/projects/sano/.
[2] Goodrich, Michael T. and Roberto Tamassia. Data Structures and Algo-
rithms in Java. 4th Ed. John Wiley and Sons, 2006.
[3] Ierusalimschy, Roberto, de Figueiredo, Luiz H. and Waldemar Celes. “The
implementation of Lua 5.0,” Journal of Universal Computer Science, Vol.
11, No. 7, 2005.
[7] Jung, Kurt and Aaron Brown. Beginning Lua Programming. Wiley, Indi-
anapolis, IN, 2007.
Introduction
People have played games since shortly after the first civilizations arose,
mainly for entertainment. Since then, more and more games have been created,
and each new generation still spends time playing them.
With the advent of computing, many of these games were brought to
computers, and some form of artificial intelligence became indispensable. In fact,
game playing is one of the oldest areas of endeavor in artificial intelligence,
turning up around 1950, when computers had just become programmable [3].
Search problems
To solve a search problem, we build a tree that is superimposed over the
problem space and look for a solution by visiting node after node until we reach
the desired goal node. In some cases, however, building such a tree is not an
easy task. Even a simple problem can generate a huge amount of data, and
the resulting tree then becomes intractable even for modern computers. This
can be observed in Table 1, which shows hypothetical computing times and
memory usage for several tree depths.
Even with this exponential growth, the most common difficulty in problem
solving is finding a good strategy for searching these trees. This is almost
always the bulk of the work in the area of search problems. A strategy can be
evaluated in terms of four criteria:
Completeness. Is the strategy guaranteed to find a solution if one exists?
Time Complexity. How long does it take to find a solution?
Space Complexity. How much memory does it need to perform that search?
Optimality. Does the strategy find the highest-quality solution when there are
several different solutions?
Games
An important consideration when designing games is to treat them as search
problems. However, the presence of an opponent makes the decision problem
• The initial state (root), which includes board position and an indication of
whose move it is.
• A set of operators (transition), which defines the legal moves that a player
can make.
• A terminal set (leaves), which determines when the game is over.
• A utility function (node values), which gives a numeric value for the out-
come of the game. For example, we can assign to leaves the values +1, 0 or
−1 respectively for winning, drawing, or losing the game.
1. Generate the whole game tree, all the way down to the terminal states.
2. Apply the utility function to each terminal state to get its value.
242 19 · Tic-Tac-Toe and the Minimax Decision Algorithm
For programming purposes, imagine the computer playing CIRCLE and the
human playing CROSS. The code in Listing 1 shows a Lua implementation of
the minimax algorithm. The top function, MINIMAX_DECISION, starts the decision
process and selects the computer’s best move from all legal ones, which are
evaluated in turn by the recursive function MINIMAX_VALUE.
When the computer’s turn comes, the CROSS player has left it a board in a
given game situation. MINIMAX_DECISION is then invoked with the current game
board as argument. Next, all legal moves are obtained with getLegalMoves.
Every legal move in the resulting table is played on a separate board
and passed to MINIMAX_VALUE, which calls itself recursively until it reaches the
predefined maximum depth hMax or the game is over for that board state.
In both cases, the recursion stops and the UTILITY function is evaluated.
Otherwise, the recursive function continues, alternating between the players’
turns and maximizing or minimizing the utilities accordingly.
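The process just described can be sketched in a few dozen lines. This is not the chapter's Listing 1: the function names (winner, legal_moves, move, utility, minimax_value, minimax_decision), the board representation as an array of nine strings, and the depth parameter standing in for hMax are all assumptions made for this illustration.

```lua
local CIRCLE, CROSS, EMPTY = "O", "X", " "
local LINES = {
  {1, 2, 3}, {4, 5, 6}, {7, 8, 9},  -- rows
  {1, 4, 7}, {2, 5, 8}, {3, 6, 9},  -- columns
  {1, 5, 9}, {3, 5, 7},             -- diagonals
}

-- Returns CIRCLE or CROSS if that player has three in a row, else nil.
local function winner(board)
  for _, l in ipairs(LINES) do
    local a = board[l[1]]
    if a ~= EMPTY and a == board[l[2]] and a == board[l[3]] then
      return a
    end
  end
  return nil
end

local function legal_moves(board)
  local moves = {}
  for i = 1, 9 do
    if board[i] == EMPTY then moves[#moves + 1] = i end
  end
  return moves
end

-- Plays `player` at position `pos` on a copy of the board.
local function move(board, pos, player)
  local new_board = {}
  for i = 1, 9 do new_board[i] = board[i] end
  new_board[pos] = player
  return new_board
end

-- +1 if the computer (CIRCLE) has won, -1 if it has lost, 0 otherwise.
local function utility(board)
  local w = winner(board)
  if w == CIRCLE then return 1 elseif w == CROSS then return -1 end
  return 0
end

-- Value of `board` when it is `player`'s turn, searching at most
-- `depth` further plies.
local function minimax_value(board, player, depth)
  local moves = legal_moves(board)
  if depth == 0 or winner(board) or #moves == 0 then
    return utility(board)
  end
  local other = (player == CIRCLE) and CROSS or CIRCLE
  local best = (player == CIRCLE) and -math.huge or math.huge
  for _, pos in ipairs(moves) do
    local v = minimax_value(move(board, pos, player), other, depth - 1)
    if player == CIRCLE then
      best = math.max(best, v)
    else
      best = math.min(best, v)
    end
  end
  return best
end

-- Chooses the computer's move: the legal move with the highest value.
local function minimax_decision(board, depth)
  local best_pos, best_val = nil, -math.huge
  for _, pos in ipairs(legal_moves(board)) do
    local v = minimax_value(move(board, pos, CIRCLE), CROSS, depth - 1)
    if v > best_val then best_pos, best_val = pos, v end
  end
  return best_pos, best_val
end

-- CROSS threatens to complete the top row, so CIRCLE must block at 3.
local board = { "X", "X", " ",
                " ", "O", " ",
                " ", " ", " " }
print(minimax_decision(board, 9))
```

Because the utility values are only +1, 0, and -1, this sketch does not distinguish a quick win from a slow one; it returns the first move that reaches the best value.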
Comparing Lua to other programming languages, such as C or Java, we can
spot some particular differences. The first one is related to Lua tables. They
are simple, easy to use, and avoid any additional memory management. As an
example, take this line from MINIMAX_VALUE
This line means that the function move receives a game board (a Lua table) and
a move v to play on this board. The result is a new game board named
newBoard. If we were using C, one of the best options would be to treat
boards as integer vectors. However, explicit memory management would be
required in that case, because copying vectors in C is not as simple as
copying tables in Lua. This demands care and attention with pointers, and
every programmer knows that a single small memory-manipulation mistake can
compromise the integrity of the whole application. This sort of bug
does not happen in Lua programs.
Case study
Up to this point, we have been using the tic-tac-toe game in our explanations.
In fact, this game was chosen as our case study for several reasons. First, it
is a well-known and popular game whose rules need no explanation. Second,
other games like checkers (also known as draughts) and chess are much more
complex than tic-tac-toe. For instance, tic-tac-toe has a maximal branching
factor of 9, while in chess the average is about 35. Moreover, tic-tac-toe has
exactly 623530 nodes in its search tree, whereas chess can easily reach 35^100.
Listing 1.
Conclusions
The first conclusion is that implementing the minimax strategy in Lua is an
easy task. The ability to combine Lua tables to form matrices speeds up
implementation while keeping the source code simple and clear. The abstraction
of memory manipulation also helps make Lua an easy programming
language.
Another important feature of Lua is automatic garbage collection,
which frees programmers from this responsibility.
Lua is nevertheless a powerful programming language that can also be
extended. Future work could implement other game-playing techniques, such as
alpha-beta pruning. Alpha-beta pruning is a technique in which a large number
of nodes are safely removed from the search tree and therefore never
processed. Avoiding unnecessary processing means that the computer answers
faster.
References
[1] J. D. Funge. Artificial Intelligence for Computer Games. Peters Corp., 2004.
[2] R. Ierusalimschy. Programming in Lua. Lua.org, 2006.
[3] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach.
Prentice-Hall, 1995.
Part IV
Game Programming
20
Using Lua in
Game and Tool Creation
Konstantin Sokharev and Vadim Groznov
IDE (debugger)
We soon realized that, for all the convenience and flexibility of Lua,
debugging facilities are lacking once the code base grows large (over 10 Lua
files of several thousand lines each), so we decided to write a full-featured
Lua script debugger. The debugger was written and built into the game. It has
a convenient IDE with a navigation tree for game files, search, multiple tabs,
syntax highlighting, and all the functionality needed for debugging:
breakpoints, step in, step out, edit and continue, watches, and expressions.
It significantly simplified and sped up development, because previously we
had to debug code by logging.
It is worth noting that we used the Scintilla component for text display, and
that Lua itself provides all the functionality needed to create a good debugger
(i.e., we didn’t have to change anything in the Lua source code), so creating
the debugger took only one man-month.
• All structural definitions must always have the same format (Lua allows
one thing to be done in several ways). For example, a function can be
declared as function Running() or as Running = function(); choose whichever
is more convenient and use it consistently.
To conclude this topic, a piece of good advice: review the code frequently to
consolidate its structure (we advise making this part of the weekly development
cycle).
Performance issues
One of the most important questions about using a scripting language is “Will it
be fast enough?” We conducted preliminary synthetic tests and found that
performance was sufficient even with several hundred units operating
simultaneously in the game. In reality, it turned out to be slower: in massive
combats with many game entities (30–60 fighting units on-screen), the game
began to lag significantly. We drew an important conclusion: the game
designers’ vision of battles and of other scripted events happening
simultaneously on the map must be taken into account beforehand. The most
performance-critical parts must then be identified, and it should be decided
whether to keep each of them in C++ or move it to Lua. Regrettably, we did this
too late, so we had to fix these problems during production. We decided to
integrate a Lua profiler (from the Kepler project), which did its job well and
revealed the bottlenecks. It did so at run time, without stopping the game, in
exactly the game situations we needed, which is very important. That way we
found out that a significant amount of processing time was consumed by
Lua-to-C++ data transfer (we will elaborate on that later) and by the next
function used for traversing table elements. The conclusion: plan thoroughly
the structure of the data that you will store in scripts (use arrays
extensively, and general tables only where you really need them). Remember that
the performance cost of traversing a large number of elements is significantly
higher in Lua than in C++. It also makes sense to use custom containers,
exported from C++, that offer faster traversal of their elements.
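One way to follow this advice is to keep homogeneous data in the array part of a table and traverse it with a numeric for loop instead of a generic traversal based on next. The sketch below only contrasts the two loop forms; the unit records are hypothetical, and any actual speed difference should be measured with a profiler in the game itself.

```lua
-- Hypothetical unit records stored in the array part of a table.
local units = {}
for i = 1, 5 do
  units[i] = { id = i, hp = 100 * i }
end

-- Generic traversal: the iterator returned by pairs is built on `next`.
local total_pairs = 0
for _, unit in pairs(units) do
  total_pairs = total_pairs + unit.hp
end

-- Array traversal: a numeric loop indexes the array part directly.
local total_array = 0
for i = 1, #units do
  total_array = total_array + units[i].hp
end

print(total_pairs, total_array)  -- both loops compute the same sum
```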
Conclusion
Our experience showed that using Lua as a convenient database and as a
means for rapid creation of game entities is quite feasible, provided you have
a convenient IDE for debugging and navigating the code.
The well-thought-out architecture of Lua allowed for good integration with
.NET and for the use of Lua as a tool for creating plug-ins for a game editor.
When using Lua, one should consider memory consumption, execution speed,
and the “excessive” flexibility of Lua code; overlooking these peculiarities may
undermine all your efforts to improve your system.
function MapLogic:LocationSubscription()
local loc
loc = getObj("loc_street_enter" )
if ( loc ) then
loc:Subscribe("GE_OBJECT_ENTERS_LOCATION", self,
"OnUnitEnterLocationStreetEnter" )
end
loc = getObj("loc_yard_enter" )
if ( loc ) then
loc:Subscribe("GE_OBJECT_ENTERS_LOCATION", self,
"OnUnitEnterLocationYardEnter" )
end
end
local pos
local index = 1
for _, pos in pairs(self.Constants.StreetPos) do
local obj = self.ProtEnemy[index]()
index = index + 1
if ( obj ) then
obj:SetPosForced(pos)
obj:SetBelong(1002)
end
end
end
Listing 1.
-- PostEvent member
-- ProcessEvent member
-- Calls recipient ( if object) with OnEvent( EventName, SenderId, Params )
-- Calls function by itself with ( EventName, SenderId, Params )
if obj.OnEvent then
obj:OnEvent( Evn[2], Evn[3], Evn[4] )
end
end
end
Listing 2.
Prot.Characters = {
  Defaults = {
    PhysicAnimDeath = true,
    HP = 500,
    Material = "FleshDead",
    FullInstanceCopy = true,
    StandAnimationCount = 1,
    Radius = 0.5,
    HardRadius = 0.5,
    RunSoftRadius = 1,
    StandSoftRadius = 1,
    TimeBeforeRemove = 10,
    EngineProperties = {
      CastShadow = false,
      ReceiveShadow = true,
      DynamicShadow = true,
    },
  },
  TestCharacter1 = Prot {
    Class = Obj,
    ModelFile = "data/Models/NPC/Women/Pols_1/model.xml",
    Material = "FleshAll",
    RotationSpeed = 2 * math.pi / 3,
    HP = 2000,
    Damage = 70,
    CaptureTargetRadius = 20,
    AttackRadius = 2,
    AttackAngle = math.pi / 4,
    AttackEnergy = 10,
    MaxEnergy = 50,
    RestoreEnergyPerSecond = 7,
    EmaciationTIme = 3,
    DamageTimeInAttackAnimation = 0.4,
    MoveInUpdate = true,
    EngineProperties = {
      DefaultAnimationSpeed = 0.78,
      CastShadow = true,
      DefaultPhysicState = "STATIC",
      Physic = {
        ObjectWeight = 7500,
        RigidBodyWeight = 3000,
        DynamicTimeout = 10,
        NoSleepingCheckTimeout = 10,
        DynamicSleepingTimeout = 10,
      },
    },
  },
}
Listing 3.
//Bindings stuff
/**
The types of exports supported by the binding system.
*/
enum eExportType
{
METHOD,
NATIVE_METHOD,
PROPERTY,
//CONST_INT,
//PROP_INT,
//PROP_FLOAT,
//PROP_BOOL,
//PROP_VECTOR,
};
/**
Information about a class entry point.
*/
struct ExportInfo
{
const char* name; // The name of the entry point
eExportType type; // METHOD, NATIVE_METHOD, etc.
void* addr1; // The memory address of the entry point.
void* addr2; // A second address, used by properties
/// these are optional export descriptors
const char* returns; // e.g. "void" or "int"
const char* params; // e.g. "int a, const CStr& s"
const char* desc; // e.g. "returns number of chicks avail"
};
Listing 4.
#define BEGIN_EXPORT_MAP(className) \
static ::engine::ExportInfo _exports_##className[] = {
Lua::Table Obj::GetAnimations()
{
Lua::Table tbl;
for ( int i = 0; i < this->GetNumAnimations(); i++ )
{
tbl.SetNum(float(i + 1), sArg::FromS(this->GetAnimationName(i)));
}
return tbl;
}
Listing 4. (continued)
21
Leveraging Lua and C++ to
Create a Dynamic and Flexible
Event System for Script-Driven
Games
Robert Oates
Lua boasts several features that make it an attractive option for use in event-
driven applications such as video games. These programs are made much
easier to program through judicious application of the subscriber pattern, and
can be given a lot of runtime flexibility through the strategy pattern. In this
article I will explain what these patterns are, how they are beneficial to game
development, and finally how we can implement and optimize these patterns
easily with Lua.
The subscriber
The subscriber pattern is absolutely integral to event-driven programming. Ob-
jects will subscribe to events through an event manager, and the event manager
notifies the subscribers when the event occurs. There is a lot of room for creative
implementation with this pattern, so let’s step back a moment and consider the
requirements we have for our application and see if that helps narrow down the
possibilities.
Since we’re going for an event-driven application here, most of our objects
need to be able to subscribe to and receive events. Characters, UI components,
even entire systems must have a consistent way of interacting with event man-
agers. The goal here is for objects of any type to subscribe to events from multi-
ple event managers and handle them in one place. We can accomplish this goal
by leveraging Lua’s first-class functions. To illustrate the usefulness of such
functions, let’s look at some example code for an imaginary game:
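-- Player1 and Player2 are assumed to be instances of a predefined Player object.
function LeftPlayerKeyHandler(receiver, eventData)
  if eventData.key == "A" then
    --move receiver left
  elseif eventData.key == "D" then
    --move receiver right
  end
end
function RightPlayerKeyHandler(receiver, eventData)
  if eventData.key == "J" then
    --move receiver left
  elseif eventData.key == "L" then
    --move receiver right
  end
end
Player1:Subscribe("OnKeyPress", LeftPlayerKeyHandler)
Player2:Subscribe("OnKeyPress", RightPlayerKeyHandler)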
The first thing we did here was define two slightly different functions to be
used as event callbacks. Second, we created two instances of some predefined
“Player” object. These instances then subscribed to the same event with different
callback functions. Player1’s OnKeyPress callback will cause it to move left and
right when the ‘A’ and ‘D’ keys are pressed. Player2’s OnKeyPress callback uses
the ‘J’ and ‘L’ keys instead. We can see that, regardless of what our object,
event, and callback functions are, the act of subscribing to an event remains
consistent with this interface. But the use of callbacks in event systems is
certainly not groundbreaking and definitely not Lua-exclusive, so what’s the big
deal? The “big deal” is a combination of things. While callback functions are
not exceptional on their own, when they also happen to be first-class functions
it opens up some interesting possibilities.
Subscribers can define simple callbacks inline:
Player1:Subscribe("OnDeath", function()
error("Player Died!")
end)
MyGame:Subscribe("OnKeyPress", EventDataWriter)
MyGame:Subscribe("OnWindowResize", EventDataWriter)
MyGame:Subscribe("OnUDPRecv", EventDataWriter)
MyGame:Subscribe("OnKilled", EventDataWriter)
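Because every callback receives its event data as a plain table, a single function can be registered for any number of events, as the MyGame subscriptions do. A minimal sketch of what such a shared callback might look like, assuming it only logs the fields it receives (FormatEventData is a hypothetical helper, and this body of EventDataWriter is an assumption, not the author's definition):

```lua
-- Hypothetical helper: turn an event data table into a sorted list of
-- "key=value" strings so the output is stable from run to run.
function FormatEventData(eventData)
  local parts = {}
  for key, value in pairs(eventData or {}) do
    parts[#parts + 1] = tostring(key) .. "=" .. tostring(value)
  end
  table.sort(parts)
  return table.concat(parts, ", ")
end

-- A generic callback with the same (receiver, eventData) signature as
-- LeftPlayerKeyHandler; it makes no assumption about which event fired.
function EventDataWriter(receiver, eventData)
  print("event data: " .. FormatEventData(eventData))
end
```

Since the function never inspects the event name or the receiver, it can be attached to key presses, window resizes, and network events alike.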
Now that we have a subscription and callback interface that satisfies our
requirement for consistency and flexibility, we must turn our attention towards
posting events and the interface thereof. To get the most out of our event
manager we will need an interface for sending events that is as flexible as the
one we created for receiving them. Here again we will take advantage of Lua’s flexibility
with respect to data types. Consider the following call to send an event:
PostEvent("OnKilled", someEventDataTable )
All we’ve done here is tell some nebulous event managing system that the
OnKilled event has fired and whatever is in the event data table should be
passed along to any objects subscribed to that event. What’s wonderful about
this system is that there’s no need for predefined events or event data types.
We can post an OnKilled event without any object ever being subscribed to it,
and we can put whatever we want in the event data so long as the subscriber’s
callback function handles (or ignores) it. Most importantly, the call to post an
event looks the same no matter what the event is, who’s getting it, or what data
is associated with the event.
All that’s left is to decide how the internals of our event manager should
work. What happens when we subscribe to an event? What happens when
we post one? In general, an event manager needs to keep a list of subscribed
objects and their callbacks for each event. Such a structure could be visualized
in a manner similar to the picture below:
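Expressed as a Lua table, and assuming the field names used by the Subscribe function in this article (Events, Subscriber, Callback), the structure might look like the following sketch. The subscriber strings and empty callbacks are placeholders:

```lua
-- Assumed layout: one list per event name; each entry pairs a subscriber
-- with the callback it registered.
EventManager = {
  Events = {
    OnKeyPress = {
      { Subscriber = "Player1", Callback = function() end },
      { Subscriber = "Player2", Callback = function() end },
    },
    OnDeath = {
      { Subscriber = "Player1", Callback = function() end },
    },
  },
}
```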
Let’s examine how the subscription and posting of events work in more
detail to provide some insight. Subscribing to an event would add it to the list
of subscribed events above. Once the event name has been found or added, the
subscriber and callback are inserted into the corresponding list as shown below:
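function Subscribe(self, eventName, callback)
  -- Find the subscriber list for this event, adding it on first use.
  local subscribers = EventManager.Events[eventName] or {}
  EventManager.Events[eventName] = subscribers
  table.insert(subscribers, { Subscriber = self, Callback = callback })
end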
This is event subscription at its simplest. A more robust system would need
to add functionality for unsubscribing from events, as well as error checking to
avoid duplicate registrations and similar cases. We must also consider events
that add/remove other events, iterator invalidation, and all of the headaches
that brings. The simplest way to handle these issues is with “pending queues”.
Rather than inserting new event handlers immediately, Subscribe should add
them to a separate list that is merged en masse at the end of the frame.
Likewise, removing a handler should only flag it for deletion at the end of
the frame.
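The pending-queue idea can be sketched as follows. This is only an illustration under stated assumptions: QueuedManager, PendingAdds, Removed, Unsubscribe, and EndFrame are names invented for the example, and unsubscribing a handler that is still waiting in PendingAdds is not handled.

```lua
QueuedManager = { Events = {}, PendingAdds = {} }

-- Queue the subscription instead of touching the live list.
function QueuedManager.Subscribe(subscriber, eventName, callback)
  table.insert(QueuedManager.PendingAdds,
    { Event = eventName, Subscriber = subscriber, Callback = callback })
end

-- Only flag the entry; the list itself is not modified until EndFrame.
function QueuedManager.Unsubscribe(subscriber, eventName)
  for _, entry in ipairs(QueuedManager.Events[eventName] or {}) do
    if entry.Subscriber == subscriber then entry.Removed = true end
  end
end

-- Flagged entries are skipped, so callbacks may unsubscribe safely.
function QueuedManager.PostEvent(eventName, eventData)
  for _, entry in ipairs(QueuedManager.Events[eventName] or {}) do
    if not entry.Removed then
      entry.Callback(entry.Subscriber, eventData)
    end
  end
end

-- Called once per frame, when no event is being posted.
function QueuedManager.EndFrame()
  -- Sweep flagged entries, walking backwards so table.remove is safe.
  for _, list in pairs(QueuedManager.Events) do
    for i = #list, 1, -1 do
      if list[i].Removed then table.remove(list, i) end
    end
  end
  -- Apply the queued subscriptions en masse.
  for _, add in ipairs(QueuedManager.PendingAdds) do
    local list = QueuedManager.Events[add.Event] or {}
    QueuedManager.Events[add.Event] = list
    table.insert(list, { Subscriber = add.Subscriber, Callback = add.Callback })
  end
  QueuedManager.PendingAdds = {}
end
```

The cost of this design is a one-frame delay: a handler subscribed during a frame receives nothing until the frame after.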
Posting events is also remarkably simple. If the list of subscribed events
contains the event being posted, then we iterate over the associated list and
execute the callback for each subscriber. This example too eschews robustness
for brevity:
function PostEvent(eventName, eventDataTable)
  -- An event nobody has subscribed to has no list; treat it as empty.
  for _, pair in ipairs(EventManager.Events[eventName] or {}) do
    pair.Callback(pair.Subscriber, eventDataTable)
  end
end
Strategy
I mentioned earlier that one of the great things about Lua is that its functions
are first-class. This language feature immensely facilitates the strategy design
pattern. The strategy pattern creates differences between objects at runtime
by changing their function references. We can leverage the event manager we
created above as a way to implement strategy in our application by specifying
different callbacks for different objects (as in the very first code example), but we
can also override functions directly when it suits us. Consider the code below:
function Fly()
  --Do stuff
end
function Walk()
  --Do stuff
end
enemy1.MoveFunc = Fly
enemy2.MoveFunc = Walk
Now if we have a list of similar enemies we can simply iterate through it and
call MoveFunc for each one. This is similar to inheritance, with the exception
that you can actually alter the behavior during runtime.
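A minimal sketch of such a runtime switch, under assumptions made only for this example: the movement functions take the enemy as an argument and return a description, and the Name and DistanceToSun fields, the threshold of 10, and UpdateEnemies are all hypothetical.

```lua
-- Stand-in movement strategies; each returns a description for illustration.
function Fly(enemy)  return enemy.Name .. " flies" end
function Walk(enemy) return enemy.Name .. " walks" end

Enemies = {
  { Name = "Icarus", MoveFunc = Fly,  DistanceToSun = 100 },
  { Name = "Grunt",  MoveFunc = Walk, DistanceToSun = 100 },
}

-- Hypothetical per-frame update: every enemy is moved the same way, but a
-- flier that strays too close to the sun is switched to walking at runtime.
function UpdateEnemies(enemies)
  local log = {}
  for _, enemy in ipairs(enemies) do
    if enemy.MoveFunc == Fly and enemy.DistanceToSun < 10 then
      enemy.MoveFunc = Walk
    end
    log[#log + 1] = enemy.MoveFunc(enemy)
  end
  return log
end
```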
Now when our flying enemy gets too close to the sun, his movement function
is changed to make him a walking enemy instead.
Extra credit
Serialization of this event data is very straightforward, and even allows us to
leverage our existing event system for sending network messages. Consider the
following example:
Where will the “OnNetChatMessage” event be generated? From the other com-
puter we’re connected to, of course! What we need to do to make this work is cre-
ate a method for remote systems to tell our event manager to broadcast the On-
NetChatMessage event — with proper data and all. This gets a little involved,
but it’s definitely worth it. This section was written with Winsock and C++ in
mind, but the theory should apply to any network API you can use through Lua.
The remote system wants to send an event to our system. The data will need
to be serialized into a stream of bytes by the remote system, sent across the
network, and then deserialized into a meaningful event on our system. How
should we approach this given that we do not predefine event templates? With
a little trickery, of course! We will give events a predictable header that tells the
receiver how to decode them on the fly. So what sort of information do we need
to send so the receiver can decode our byte string?
• We should send the length of the serialized data as the first integer.
• We will need to store the name string of the event.
· For keyed tables, we will need to store each key string as well as
its value.
Taking all of this into consideration, a call and viable serialization might look
like this:
SendNetEvent("PlayerData",
  {
    Name = "Robert Oates",
    Age = 23,
    Single = false,
    Color = {255, 0, 0},
  })
• Line 1 (header):
Now that our data has been serialized, it is ready to be streamed across the
internet to the waiting event handler on some other computer. The benefit to
self-describing data such as this is that the receiver does not need to know about
an event in order to receive and decode it.
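The self-describing idea can be sketched in plain Lua. This is not the byte layout described above (which is binary and starts with the total length); it is a simplified, text-based stand-in in which every value carries a one-letter type tag. The function names Encode, Decode, SerializeEvent, and DeserializeEvent, and the tag letters, are assumptions made for this example.

```lua
-- Encode one value as a type tag followed by a payload:
--   s<len>:<text>   string        n<len>:<text>    number (as text)
--   T / F           booleans      t<count>:<pairs> table of key/value pairs
-- Numbers travel as text, so some floating-point precision may be lost.
function Encode(value)
  local kind = type(value)
  if kind == "string" then
    return "s" .. #value .. ":" .. value
  elseif kind == "number" then
    local text = tostring(value)
    return "n" .. #text .. ":" .. text
  elseif kind == "boolean" then
    return value and "T" or "F"
  elseif kind == "table" then
    local parts, count = {}, 0
    for k, v in pairs(value) do
      count = count + 1
      parts[#parts + 1] = Encode(k) .. Encode(v)
    end
    return "t" .. count .. ":" .. table.concat(parts)
  end
  error("cannot serialize a " .. kind)
end

-- Decode one value starting at pos; returns the value and the position
-- just past it.
function Decode(data, pos)
  pos = pos or 1
  local tag = data:sub(pos, pos)
  if tag == "T" then return true, pos + 1 end
  if tag == "F" then return false, pos + 1 end
  local colon = data:find(":", pos + 1, true)
  local n = tonumber(data:sub(pos + 1, colon - 1))
  pos = colon + 1
  if tag == "s" then
    return data:sub(pos, pos + n - 1), pos + n
  elseif tag == "n" then
    return tonumber(data:sub(pos, pos + n - 1)), pos + n
  elseif tag == "t" then
    local result = {}
    for _ = 1, n do
      local key, value
      key, pos = Decode(data, pos)
      value, pos = Decode(data, pos)
      result[key] = value
    end
    return result, pos
  end
  error("unknown type tag: " .. tag)
end

-- An event is its name followed by its data table.
function SerializeEvent(eventName, eventData)
  return Encode(eventName) .. Encode(eventData)
end

function DeserializeEvent(data)
  local eventName, pos = Decode(data)
  local eventData = Decode(data, pos)
  return eventName, eventData
end
```

The receiver needs no template for PlayerData: the tags and key strings in the stream are enough to rebuild the table, nested Color list included.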
As you can see, this setup allows for the nesting of tables and all sorts of other
neat stuff. The drawback it suffers is bloat from all of the strings used to describe
the various bits of data. Fortunately if the format of our event data is not going
to change much, we can take advantage of caching. Once this message has
been successfully sent, both sides can remember “Ok. There’s an event called
PlayerData, and its event structure looks a certain way.” The event information
will then be cached and assigned a unique number for future use. Next time the
event gets sent across the network, it only needs to supply the values and none
of the description data (keys, type info). The example I’m about to show makes
the assumption that types will remain consistent even inside of nested tables.
Your implementation may vary. In the event that I needed to send my player
data again the serialized version would now look like this:
Notice that the buffer length in the picture below is now a negative number,
which I’ve decided to use as a flag to indicate that this is a cached message
type. The following byte (you may wish to use a short or uint instead) with
value 1 is the unique cached message type number. With the type information
cached, we’re able to send only the value data and reduce subsequent messages
from 72 bytes to 35 bytes — cutting the message length by more than half while
preserving the ability to send events over the network as easily as we send them
to objects in our own game. When the PlayerData message is received on the
other side, it will be reconstructed and then sent to an event handler which will
then forward it to any objects (such as the game object, a level, a user interface,
or an enemy) that have subscribed to the message.
Closing
Hopefully the simple examples I have provided are adequate to illustrate the
power and flexibility of these patterns. When combined with Lua’s advanced
features (closures, first-class functions) they can be used to build a solid founda-
tion for any script-driven game.
22 Lua for Game Programming
Steve Gargolinski
The goal of this article is to describe a breadth of ways that Lua can be used to
supplement a traditional (C++) game engine. The three areas that we will take
a look at specifically are data representation, adding an extensible structure
for providing dynamic in-game challenges to the player, and supplementing our
game world with Lua-driven artificial intelligence.
When video games were young, all pieces of a typical game were coded
directly into the engine. Maps, sprites, the user interface, game logic, and AI
were all represented in assembly code or C, and later C++. This approach was
cumbersome and inflexible, requiring a build step between changing any sort
of game data and being able to view it in the game. Before long, it became
possible to separate the game engine from its external components.
Maps are now created by editors as external files, Non-Player Character (NPC)
dialogue is stored in a text file rather than C++ code, and sprites are stored as
textures. It is no longer necessary to perform a costly build step after changing
one line of NPC dialogue.
After moving game assets out of the engine and into data, the next step is
to create a separation between our engine code and game logic, which can be
achieved by exposing select areas of the game engine to a scripting language
such as Lua. This separation allows programmers to define clean interfaces
between select areas of the game and the chosen scripting solution. Also, less
technically proficient designers can work at a higher level and tinker with game
systems without needing to write actual C++ code, compile, or fully understand
the engine. Much of the game experience can be tweaked and tuned without
requiring the game to be rebuilt or even restarted.
Example game
To clearly illustrate the goals of this article, we will refer to a sample game.
Hopefully you all remember the classic game Adventure (or Zork). The frame-
work game we are going to use is a very simple version of these “text adven-
tures”.
In this sample game there is a World made up of Locations. Locations are
connected to each other in a sparse graph. There are Items in this World which
can either be at a Location or in the possession of an Actor who can be either
the Player or an NPC. Actors are able to pick up items, drop them, and move
between connected Locations in the World. It is as simple as that.
Note that this article assumes an existing C++/Lua binding solution. A
description of the different available techniques is beyond the scope of this
article, but Celes et al.1 provide a solid discussion.
Data representation
Games use huge amounts of data. In a typical game there are models, ani-
mations, maps, entities, and sounds, each with its own data format. Figuring
out the best way to represent this data is an involved decision with many
implications, including platform-specific issues, memory limitations, and
designer/artist workflow patterns.
Lua can be used to efficiently and flexibly handle loading designer-defined
data. Assume that in the realm of our example game, designers are in charge
of creating each Location in the World. The code in Listing 1 describes a way to
load in two Locations by defining them as a Lua table.
LoadAllLocations is responsible for building up a locationTable and pass-
ing it along to LoadLocationsFromTable where the information is extracted and
used to add Locations to the World through a minimal number of exposed C++
1 W. Celes, L. H. de Figueiredo, and R. Ierusalimschy, “Binding C/C++ Objects to Lua”, Game
function LoadLocationsFromTable(locationTable)
  local world = LPGWorld.GetInstance()
  -- Entries are indexed from 0, so #locationTable is the last index.
  for i = 0, #locationTable do
    world:AddLocation(locationTable[i].name, locationTable[i].desc)
  end
end
function LoadAllLocations()
  local locationTable = {}
  local index = 0
  locationTable[index] = {}
  locationTable[index].name = "FOREST"
  locationTable[index].desc = "A lush forest."
  index = index + 1
  locationTable[index] = {}
  locationTable[index].name = "SWAMP"
  locationTable[index].desc = "A dark swamp."
  LoadLocationsFromTable(locationTable)
end
Listing 1.
functions. Allowing game data to be defined this way is very flexible; it’s easy
to cut and paste, add, and delete Locations with just a few clicks. In the real
world of game development, however, game data is far more complicated than
simple Locations with names and descriptions. Specifying our data explicitly
in a Lua table is not optimal. It is important for designers to have the power
to organize data in a cleaner way — in spreadsheet form, for example. No mat-
ter which external format we decide to use, the result will be a locationTable
passed to LoadLocationsFromTable. Whenever designers want to use an exter-
nal data format (.xml, .csv, etc.) we need to write a bit of code to turn this data
into a properly formatted locationTable. Lua has a decent set of string ma-
nipulation utilities which makes writing these functions easy. If we want to
represent our data above in a spreadsheet, we will end up needing to parse a
.csv (comma separated value) file, as follows:
LoadLocationsFromTable(BuildLocationTableFromCSV("map.csv"))
Loading data in this fashion allows us to store the actual data in any format we
want, while still passing through a common function (LoadLocationsFromTable)
during the loading process. This allows load-time designer defined validation of
function BuildLocationTableFromCSV(csvFile)
  local locationTable = {}
  local index = 0
  for csvEntry in io.lines(csvFile) do
    -- Split at the first comma so that descriptions may contain commas.
    local _, _, locationName, locationDesc =
      string.find(csvEntry, "^([^,]+),(.+)$")
    locationTable[index] = {name = locationName, desc = locationDesc}
    index = index + 1
  end
  return locationTable
end
Listing 2.
data on top of the constraints implemented in our game engine. We can expose
this functionality by adding a single line:
function LoadLocationsFromTable(locationTable)
  local world = LPGWorld.GetInstance()
  -- Entries are indexed from 0, so #locationTable is the last index.
  for i = 0, #locationTable do
    ValidateLocationData(locationTable[i]) -- ***
    world:AddLocation(locationTable[i].name, locationTable[i].desc)
  end
end
ValidateLocationData becomes an opportunity for designers to define vali-
dation requirements for the data they are specifying. A designer could decide
that Locations should always specify a description that is not an empty string.
It would be simple to add this check without requiring any change to the game
code. This is most useful for designer-desired guidelines, with more strict re-
quirements enforced in the engine:
function ValidateLocationData(loc)
  -- Treat a missing description the same as an empty one.
  if loc.desc == nil or string.len(loc.desc) == 0 then
    print("DATA ASSERT - Empty description for location: " .. loc.name)
  end
end
With this structure in place, it is simple to expose the ability to add Locations
at run time. All that we need to do is add a Lua function to build up a
locationTable to pass through the same loading procedure used above. Hooking
a game debug console into this function allows designers to modify the game’s
data easily at runtime, giving them the ability to test out new Locations without
even needing to restart the game.
function AddSingleLocation(locationName, locationDesc)
  local locationTable = {}
  -- Index 0 matches the 0-based tables that LoadLocationsFromTable expects.
  locationTable[0] = {name = locationName, desc = locationDesc}
  LoadLocationsFromTable(locationTable)
end
Dynamic challenges
The goal of this section is to add a mechanism to present the player with chal-
lenges — focused, mix-in situations with risk/reward structures to drive and con-
trol the overall flow of the game. We will be creating a simple example challenge
called “Water The Forest”, in which the player must bring the “Water Jug” Item
to the “Forest” Location in the World. This is a very simple challenge based on
our example game, but the mechanism is powerful and can be applied to many
different types of games. The idea here is that high-level control of the chal-
lenges (initialization, updating, etc.) is handled in the game engine, but the
content is completely controlled through Lua scripts. Each challenge is defined
in a single Lua file. For the purposes of this article we will keep things
very simple: each Lua file only needs to implement four functions,
EvaluatePreReqs(), Update(), Success(), and Failure(). EvaluatePreReqs() is
responsible for controlling when a particular challenge is given out. This
function returns true whenever the prerequisites for this challenge are
deciding which challenge to present to the player. Here is some simple example
(engine level) pseudocode for using EvaluatePreReqs() to choose a valid chal-
lenge based on the current game state:
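Array<Challenges> validChallenges;
for (int i = 0; i < allChallenges.size(); ++i)
{
    if (allChallenges[i]->TriggerEvaluatePreReqs() == true)
        validChallenges.push(allChallenges[i]);
}
// Assumes RandInt(lo, hi) returns an integer in the inclusive range [lo, hi]
// and that at least one challenge passed the test.
activeChallenge = validChallenges[RandInt(0, validChallenges.size() - 1)];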
This piece of code will loop through all of our challenges, building up an
array of the ones which pass our EvaluatePreReqs() test. We then set our active
challenge to a random entry in this array. Something important to note here
is how the allChallenges array gets filled in. Since each of our challenges is
contained within a Lua file, we can simply iterate on all the .lua files in a
specified challenges directory, adding each one to the allChallenges array. This
discovery mechanism is a very useful property since it does not require a list
of challenges to be stored anywhere. Adding a new challenge only requires the
addition of a new file. If we decide to release an expansion pack, downloadable
content, or some combination of the two, there is no need to coordinate an index
file of challenges between these permutations. Each expansion pack simply
needs to drop a few files in the challenges directory. After the engine has chosen
an active challenge, the next responsibility of the engine is to trigger an update
on this challenge. The result of an update can either be success (1), failure (–1),
or no resolution (0).
int challengeResult = activeChallenge->TriggerUpdate();
if (challengeResult == 1)
{
    activeChallenge->TriggerSuccess();
    completedChallenges.push(activeChallenge);
    activeChallenge = NULL;
}
else if (challengeResult == -1)
{
    activeChallenge->TriggerFailure();
    failedChallenges.push(activeChallenge);
    activeChallenge = NULL;
}
These lists can be used to present the player with a history of the challenges
they have attempted, or to filter allChallenges to prevent giving the player
a challenge they have already completed. For our example challenge these
functions are simple. The goal of “Water the Forest” is for the player to bring
the Water Jug into the Forest. We do not want to give out this challenge unless
three preconditions are met:
• The Water Jug Item exists somewhere in the World.
• The Forest is a Location in the World.
• The Water Jug is not already in the Forest.
The EvaluatePreReqs() implementation is very simple. As long as these
three conditions are satisfied, we want to give out the challenge. Here is some
example code to handle evaluating the prerequisites (assume that the world and
itemManager are passed into EvaluatePreReqs() by default):
function EvaluatePreReqs(world, itemManager)
  local forest = world:GetLocationFromIDString("FOREST")
  local waterJug = itemManager:GetItemFromIDString("WATERJUG")
  if (forest == nil) or (waterJug == nil) or (forest:HasItem(waterJug))
  then
    return false
  end
  return true
end
In Update() we only need to perform one check: does the Forest currently
contain the Water Jug? When it does, we provide feedback for the player indicat-
ing that the challenge has been completed. In a full game we would also give
out some gold or experience points. Here is an example Update() function for
our challenge:
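A minimal sketch of such a function, assuming that Update() receives the same world and itemManager arguments as EvaluatePreReqs() and reports its result with the 1 / 0 / -1 convention described above (the feedback message is invented for the example):

```lua
function Update(world, itemManager)
  local forest = world:GetLocationFromIDString("FOREST")
  local waterJug = itemManager:GetItemFromIDString("WATERJUG")
  if forest ~= nil and waterJug ~= nil and forest:HasItem(waterJug) then
    print("You watered the forest. Challenge complete!")
    return 1   -- success
  end
  return 0     -- no resolution yet; this challenge has no failure case
end
```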
things up big time. FSMs can become complicated, assuming control over a
huge number of rich character behaviors. Designers (or programmers) unfamil-
iar with the subtleties of FSM design may have trouble keeping full view of the
‘big picture’ when modifying the AI. Changes in one state may have undesired
side-effects on other states in the machine. These state-to-state relationships
are not always obvious, and exposing them to more people (instead of just AI
programmers) increases risk. Performance is another important factor to
keep in mind when implementing an FSM solution. State logic implemented in
Lua will never be as fast as state logic implemented in C++. Make sure that the
majority of expensive calculations are taking place on the C++ side and develop
metrics to keep an eye on where processor cycles are being used up in your game
AI. As your FSMs near completion, consider moving the more expensive states
into C++ code, trading flexibility for performance.
Here is the basic skeleton of our FSM:
class FSMMachine
{
public:
    void UpdateMachine();
    void ChangeState(FSMState* newState);
    void AddState(FSMState* newState);
private:
    FSMState* m_currentState;
};
FSMMachine is responsible for aggregating the states in our FSM. It keeps
track of the active state, provides a mechanism for switching states, and is the
entry point for updating our FSM.
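class FSMState
{
public:
    virtual ~FSMState() {}
    // Virtual so that derived states such as LuaFSMState can override them.
    virtual void Begin();
    virtual void Update();
    virtual void End();
};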
Each FSMState requires three functions: Begin(), End(), and Update(). Be-
gin() is called whenever the state machine transitions to this state, End() is
called whenever the state machine transitions away from this state, and Up-
date() is called each tick on the active state. This functionality is captured in
the engine as FSMMachine::ChangeState(), which looks like this:
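void FSMMachine::ChangeState(FSMState* newState)
{
    // There is no current state before the first transition.
    if (m_currentState != NULL)
        m_currentState->End();
    m_currentState = newState;
    if (m_currentState != NULL)
        m_currentState->Begin();
}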
The base class FSMState can be extended to implement a specific state in
C++ code. For example, we could add a new state FSMStateHunt, which extends
FSMState and implements the logic necessary to send an NPC out on a hunt. In
order to facilitate a Lua-driven FSM, we’re going to provide the LuaFSMState
class:
This class replaces arbitrary C++ logic in Begin(), Update(), and End() with
Lua function calls in this fashion:
void LuaFSMState::Begin()
{
CallLuaFunction(m_beginFunc);
}
Our FSMMachine is now free to mix and match Lua driven states with C++
driven states.
Now that we’ve got the basic transition structure of state machine logic set
up, how do we actually use it? First we need to define the state machine of
an NPC. We’ll expose a LuaFSMState factory function, and provide a startup
method for each NPC to seed their state machine.
function SetupFSM(fsm)
  local idleState = LuaFSMState.Create("Idle", "npc0.lua",
    "Idle_Begin", "npc0.lua", "Idle_Update", "npc0.lua", "Idle_End")
  local wanderState = LuaFSMState.Create("Wander", "npc0.lua",
    "Wander_Begin", "npc0.lua", "Wander_Update", "npc0.lua", "Wander_End")
  fsm:AddStateToFSM(idleState)
  fsm:AddStateToFSM(wanderState)
  fsm:ChangeState("Idle")
end
This block of code starts off by creating “Idle” and “Wander” states, specifying
which Lua functions to call for the Begin(), Update(), and End() of each. It then
adds these states to our NPC’s FSM, and finally starts him off in the “Idle”
state. Note that it is possible to follow the pattern we defined in the “Data
Representation” section to move this data into a .csv file.
For this example, we only need to define the Update() functions for our two
FSM states (assume that the FSM, the player, and our NPC are passed into
these functions when called from the engine):
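A minimal sketch of the two functions, assuming engine bindings named actor:GetLocation(), npc:MoveToRandomNeighbor(), and fsm:ChangeState(name); those method names, WANDER_UPDATES, and wanderCount are hypothetical. The counter is shared by the whole script for brevity; a real game would store it per NPC.

```lua
-- How many updates an NPC wanders before going back to Idle (assumed value).
local WANDER_UPDATES = 3
local wanderCount = 0

function Idle_Update(fsm, player, npc)
  -- Stay idle until the player shows up at the NPC's location.
  if player:GetLocation() == npc:GetLocation() then
    wanderCount = 0
    fsm:ChangeState("Wander")
  end
end

function Wander_Update(fsm, player, npc)
  -- Take one random step per update, then return to Idle after a few.
  npc:MoveToRandomNeighbor()
  wanderCount = wanderCount + 1
  if wanderCount >= WANDER_UPDATES then
    fsm:ChangeState("Idle")
  end
end
```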
This code will cause the NPC to stay in the Idle state until the player
enters the same location, which triggers a transition into Wander.
This NPC will spend the next few updates wandering randomly through the
world before returning to Idle.
We’re not doing anything in the Begin() or End() functions in a simple exam-
ple like this, but we can show some useful trace code here:
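A minimal sketch of such trace code, assuming the Begin and End functions use the names registered in SetupFSM (TraceTransition and the message format are hypothetical):

```lua
-- Hypothetical helper: log a state transition and return the message.
function TraceTransition(phase, stateName)
  local message = "[FSM] " .. phase .. " " .. stateName
  print(message)
  return message
end

function Idle_Begin()   return TraceTransition("entering", "Idle")   end
function Idle_End()     return TraceTransition("leaving",  "Idle")   end
function Wander_Begin() return TraceTransition("entering", "Wander") end
function Wander_End()   return TraceTransition("leaving",  "Wander") end
```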
This is a simple example that shows the basic structure behind a flexible and
powerful FSM. We’ve given developers the option to move any or all AI decision-
making code into Lua, separating it completely from engine code. We retain
the option to implement each state in either Lua or C++ as we see fit. We can
define and iterate on many states quickly in Lua, while preserving the ability
to implement states in C++ when speed is a priority or the potential exists for
significant debugging.
in game to see how they play out, QA can attach a Lua script to a bug report
to provide more accurate reproduction, and programmers can use the script for
anything from general code testing to performance evaluation.
Making generic Lua function execution available has the added benefit of
increasing the development team’s exposure to your chosen scripting solution.
The more comfortable your team gets with Lua, the more use you will get out of
it.
Conclusion
Hopefully this article has given you some ideas on how to supplement your game
engine by exposing key areas to a Lua scripting solution.
The techniques discussed here barely scratch the surface of potential uses of
Lua for game programming. Before you delve in, analyze the problems that your
particular game is trying to solve and identify areas which lend themselves well
to a data-driven, scripted solution.
23 Designing an Efficient Lua Driven Game Scripting Engine
Nicolas Peri
from tree to tree or evil monsters trying to attack you. When designing a game
engine, you must provide your future users an easy and flexible way to define
those behaviors. Nowadays, scripted artificial intelligence is the best solution to
do that. The problem is that scripts will always be slower than native code; that’s
a fact. When making a game engine, performance is one of the most important
constraints. We must therefore find a way to reduce the CPU cost of script execution
to stay under an acceptable threshold.
while ( gameIsRunning )
do
    for each Object o in the Scene
    do
        for each AIModel instance ai controlling o
        do
            ai.runOneFrame ( )
        done
    done
    scene.draw ( )
done
while ( gameIsRunning )
do
    for each Object o in the Scene
    do
        for each AIModel instance ai controlling o
        do
            if ( ai.isTimeToRunOneFrame ( ) )
            then
                ai.runOneFrame ( )
            end
        done
    done
    scene.draw ( )
done
Listing 1.
while ( gameIsRunning )
do
    for each Object o in the Scene
    do
        if ( o.isActive ( ) )
        then
            for each AIModel instance ai controlling o
            do
                if ( ai.isTimeToRunOneFrame ( ) )
                then
                    ai.runOneFrame ( )
                end
            done
        end
    done
    scene.draw ( )
done
Listing 2.
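The two filters in Listing 2 can be sketched in Lua. This is only an illustration under assumptions: in a real engine the loop would live on the C/C++ side, and the period, nextRunTime, active, and aiModels fields, as well as NewAIModel, IsTimeToRunOneFrame, and RunAIFrame, are names invented for the example.

```lua
-- Illustrative AIModel: runs at most once every `period` seconds.
function NewAIModel(period, behavior)
  return { period = period, nextRunTime = 0, runOneFrame = behavior }
end

-- True when the model is due; also schedules its next run.
function IsTimeToRunOneFrame(ai, now)
  if now >= ai.nextRunTime then
    ai.nextRunTime = now + ai.period
    return true
  end
  return false
end

-- One pass of the loop in Listing 2 (drawing omitted): inactive objects
-- are skipped entirely, and active ones run only the models that are due.
function RunAIFrame(scene, now)
  for _, object in ipairs(scene) do
    if object.active then
      for _, ai in ipairs(object.aiModels) do
        if IsTimeToRunOneFrame(ai, now) then
          ai.runOneFrame(object)
        end
      end
    end
  end
end
```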
In this example, which defines a simple behavior, self is the Lua
syntactic sugar that represents the AIModel instance being executed: its
members are set up by the C/C++ side just before the Lua function is called. It
provides a handle to the object controlled by this AIModel instance. Once we
have this handle, we can pass it to any high-level function that takes
an object handle as a parameter. A more complex behavior could be, for
example:
In this example, input and dynamics are Lua function packages, dedicated
to input devices management (keyboard, mouse, gamepad...), and dynamics
subsystem access.
at the next garbage collection. All of this will fragment your system memory
and will also be time-consuming. In an embedded environment or on a game
console, with a limited and fixed amount of memory, it is vital to avoid dynamic
allocations at runtime as much as possible. A solution is to use a preallocated
pool of small buffers (16, 32, and 96 bytes are good values, but it should depend
on your implementation) that will be dedicated to the Lua runtime. Bigger or
unpredictable allocations are done through your main allocator.
Conclusion
Integrating Lua well into your game engine ultimately comes down to
two things: finding a way to execute only the scripts that are really
needed (every engine should already do that for its other subsystems, such as
animation or dynamics, so doing it for AI should not be a big deal), and reducing
the overhead of Lua execution by avoiding time-consuming operations and
providing high-level dedicated function packages. Lua scripted behaviors are
also useful for remote programming and debugging: it is possible to change
the behavior of one object in a game running on a game console by simply
uploading a string containing the new script, which will automatically be taken
into account at the next frame.
Part V
History
Our company, Olivetti, is active in the ink-jet printer industry. We have always
written specific programs in order to drive our printers for testing and produc-
tion. The most recent one is a very powerful tool, called LuaDura, based on
Lua 5.1. This program can send low level commands to the printer, through a
dedicated communication channel. The protocol used here is similar to the Mass
Storage Class of USB: the host sends a binary command, optionally followed by
data; the printer always sends back a status, along with additional response
data. Several communication channels can be used for this task: USB, serial
port, Ethernet. Other ones may be added in the future, like Bluetooth or WiFi.
We wanted to be able to drive several printers simultaneously within a single
Lua script, for example in a mass production test board. The problem is that
all input/output commands through the preceding channels are blocking: it is
therefore difficult to drive several printers in a single thread. This is why we
need some form of true multithreading for our application.
Multithreading models
In the book “Programming in Lua” (second edition), Roberto Ierusalimschy
explains in chapter 30 that Lua supports two models for multithreading. The
third model is the one we will describe.
Later on, the coroutine will be resumed, and if the command has finished
executing, the output result is pushed onto the Lua stack, or an error is thrown.
Implementation
The annexed file thread.c implements this third solution for multithreading.
It is partly based on the source code of LuaDura, but with all references to
printer communication removed. Instead, it exports some simple tasks, which
are widely available and used. The implementation is intended to be both an
example program and a starting file to which you can add features for your
application.
The code supports both Windows and POSIX preemptive threads. To minimize the
number of compilation switches, the following POSIX functions are implemented
under Windows using native objects: sem_init, sem_post, sem_wait, sem_destroy,
pthread_create, and pthread_cancel. With a similarly simple interface, it
should be possible to run the example on other operating systems as well.
The implementation does not need any change in Lua 5.1 sources. Like
all standard Lua libraries, this code only uses the official API. And like Lua
itself, it is written in clean C, and thus should compile unmodified as both
C and C++.
A pool of C threads is maintained in the form of four lists, all initially empty.
When a coroutine asks for a blocking operation, we look at the idle list of
threads and take one from that list. If the list is empty, a new native thread
is allocated and initialized. A table placed in the Lua registry is used to
keep track of which coroutine is using which C thread at any time.
As with coroutines, there is no need to explicitly close the threads.
They will be collected like any other object. If you are low on system resources,
you can force a garbage collection by calling collectgarbage explicitly. For this
purpose, a second table is present in the registry. Each time a thread is taken
from the pool to execute a command, a userdatum is created and placed in the
table. When the thread becomes idle again, the table entry is set back to nil.
Because this userdatum is bound to a metatable whose __gc method is
overridden, the unused operating system resources can be freed when a garbage
collection occurs.
The module exports one global thread table with a few functions. Four user
functions are exported into that thread table. They were chosen because they
are blocking, simple, and available on most platforms. They are intended only
as examples for user needs, although they may be useful as is.
All functions follow the same scheme. They do not directly execute their
associated base function. Instead, each one fills a message structure with input
parameters from the Lua state. Necessary checks on the parameters are done at
that time. The message is then passed to one native thread for execution. The
function should then yield until the message finishes executing.
Here we face the major difficulty. It is impossible in the standard Lua
implementation to yield inside a C function: only Lua functions can yield.
Although there are non-portable patches that allow this, we will stay with the
standard distribution and thus avoid yielding inside C code. Therefore, it is
necessary to use the following idiom, just after having sent the message to the
thread:
return lua_yield();
The lua_yield function will register the yielded status into the Lua state,
and return -1. The net effect is that the calling Lua function will yield just after
the C function ends. When the coroutine is resumed, it would continue after the
call as if the command had succeeded. Because that is not the case, the C function
must be called again. The second time, the input parameters are useless, but
we will suppose that they are the same as before, to simplify the coding. If the
native thread has now finished executing the command, either the output data is
copied from the message back into the Lua state, or luaL_error is called with the
error message. If the command is still executing, we again return lua_yield().
So we have to call any of the four user functions inside a loop, necessarily
written in Lua. As we wish to hide this complex mechanism from
the user scripts, the initializing function luaopen_thread runs the following Lua
chunk just after having registered the user functions into the thread table:
With this mechanism, each C user function is overridden with a Lua closure
that calls the underlying C function inside a loop until it does not yield, and
returns all its result values. The original user function is stored inside a
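The following is a minimal sketch of such an override, written so that it runs without the C module. It is not the chunk from thread.c. The thread.double function is a hypothetical pure-Lua stand-in for a C user function: it yields while the simulated command is still running, and the call returns whatever is later passed to coroutine.resume, which is the effect of the lua_yield idiom. The run helper is likewise an assumption that plays the role of the scheduler.

```lua
local unpack = table.unpack or unpack   -- Lua 5.2+ / Lua 5.1

local thread = {}

-- Stand-in for a C user function: still "busy" for two polls.
local remaining_polls = 2
function thread.double(value)
  if remaining_polls > 0 then
    remaining_polls = remaining_polls - 1
    return coroutine.yield()            -- returns what resume passes in
  end
  return value * 2
end

-- Override every user function with a closure that calls the raw function
-- again until its first result is no longer the thread table itself.
for name, raw in pairs(thread) do
  thread[name] = function(...)
    local results
    repeat
      results = { raw(...) }
    until results[1] ~= thread
    return unpack(results)
  end
end

-- Tiny driver: resumes with the thread table, as the scheduler does.
local function run(fn, ...)
  local co = coroutine.create(fn)
  local ok, result = coroutine.resume(co, ...)
  while ok and coroutine.status(co) == "suspended" do
    ok, result = coroutine.resume(co, thread)
  end
  return ok, result
end
```

Calling run(function() return thread.double(21) end) polls the stand-in twice before the doubled value comes back. Using the thread table as the marker works because no user function is expected to return that table as its first result.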
Coding
Now let’s take a closer look into the C implementation. Each C thread has a
state variable associated with it for the synchronization mechanism. The four
states used by the state machine are as follows:
• REQUEST: A command has been sent, execution can start. Lua coroutine
yields.
• FINISHED: A command has finished execution, Lua can read back results.
There are four doubly linked lists of threads, one for each state. Each time
a thread state changes, its item is removed from one list and placed into the
next one. A shared semaphore ensures that operations on these lists cannot be
interrupted by other threads. Also, to avoid any timing problem, Lua is only
allowed to exchange data with the thread during the IDLE state for the command,
and during the FINISHED state for the output read back.
An important helper function is retrieve_thread_data. Given the Lua coroutine
represented by the current lua_State parameter, it returns a pointer to a
C thread structure. Four situations can occur:
• If called from the main Lua thread (lua_pushthread returned 1), simply
returns nil.
• If a command is already started for the coroutine, uses the mapping table
to return the associated thread.
• If the list of idle threads is not empty, returns its first element.
• Otherwise, allocates a new structure and fills it with a C thread and two
semaphores. The mapping table is updated accordingly.
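The four situations can be modelled in plain Lua. This is a sketch of the lookup logic only, not the C code: the worker records, the idle list, the by_coroutine mapping table, and release_thread_data are hypothetical stand-ins for the native thread structures and the registry tables.

```lua
local idle = {}            -- idle workers, oldest first
local by_coroutine = {}    -- mapping: coroutine -> worker in use
local created = 0

-- Stand-in for allocating a C thread and its two semaphores.
local function new_worker()
  created = created + 1
  return { id = created, state = "IDLE" }
end

local function retrieve_thread_data()
  -- Lua 5.1 returns nil in the main thread; later versions return co, true.
  local co, is_main = coroutine.running()
  if co == nil or is_main then return nil end        -- 1. main Lua thread
  local worker = by_coroutine[co]
  if worker then return worker end                   -- 2. command in progress
  worker = table.remove(idle, 1) or new_worker()     -- 3. reuse, 4. allocate
  by_coroutine[co] = worker
  return worker
end

-- Give the worker of the current coroutine back to the idle list.
local function release_thread_data()
  local co = coroutine.running()
  local worker = co and by_coroutine[co]
  if worker then
    by_coroutine[co] = nil
    idle[#idle + 1] = worker
  end
end
```

A worker released by one coroutine is handed to the next coroutine that asks, so the number of workers never exceeds the number of commands that were in flight at the same time.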
296 24 · Enhanced Coroutines in Lua
The central function in the code is exchange_with_thread. After having
obtained the native thread structure for the current coroutine, one of these
scenarios happens:
• If the thread state is IDLE, the message is passed to the thread. If an error
occurs during the execution by the native thread, the error string is
stored inside the message structure.
• When the thread state is FINISHED, the result is copied from the native
thread. If an error has been stored in the message structure, luaL_error is
called with that error message.
The synchronization between the native thread and the user functions uses
two semaphores for each thread. The first one controls the start of a command.
It is signalled by the user function and waited on by the thread. At the end of
execution, the opposite is done: the thread signals a semaphore awaited by
the user function. It also signals a shared semaphore that will be checked by
thread.wait.
An important function is thread.wait. It is responsible for stopping the
execution of the main thread, without wasting CPU time, until one of the native
threads has finished executing its command. If the list of finished threads is
not empty, it returns the first one. Otherwise, it waits on the shared semaphore
signalled by the threads and tries again.
to separate the header and the content parts in Lua, using regular
expressions:
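As an illustration of this kind of split, here is a minimal sketch using Lua patterns. It assumes an HTTP-style response in which the first empty line (CR LF CR LF) separates the header from the content; the function name split_response is hypothetical.

```lua
-- Split a response into header and content at the first empty line.
-- Returns nil plus a message when no separator is found.
local function split_response(response)
  -- ".-" is the shortest possible match, so the split happens at the
  -- first blank line even if the content contains blank lines too.
  local header, content = response:match("^(.-)\r\n\r\n(.*)$")
  if not header then
    return nil, "no empty line between header and content"
  end
  return header, content
end
```

In Lua patterns the dot matches any character, including line breaks, so no special flag is needed for multi-line input.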
Error handling
As already mentioned, if an error occurs during a command execution, we cannot
call lua_error directly, because the native thread has no access to the Lua
interpreter object. Instead, it calls the local helper function error. This
function begins by releasing open resources: freeing the output string and
closing the current socket, if available. Depending on whether or not it is
called from a coroutine, it then calls luaL_error directly or stores the message
for the caller. The return value is only a syntactic trick to enable the use of
an idiom like this one inside the user functions:
This is also the reason for the existence of an unnecessary return value in the
user_execute_* functions, although the return status could be used by the main
thread function to determine whether or not the command succeeded (in order
to perform logging, for example).
function thread.sched(threads)
  while next(threads) do
    local thr = thread.wait()
    if not thr then thr = next(threads) end
    if not coroutine.resume(thr, thread) then
      threads[thr] = nil
    end
  end
end
The argument is a table of coroutine objects. Repeatedly, while there are still
active threads, it waits for the first thread that has finished a blocking
command. It tries to resume it, passing the global table thread as argument to
coroutine.resume, so that the loop of the previous listing knows that it had
yielded.
If the resume fails, this means either that the user script has finished
its execution normally, or that there was an error. In both cases, the thread
object is removed from the table, and will be garbage collected some time later
(provided there is no other reference to it). In the case of an error, it would
be preferable to display the error message or log it to some file, depending on
the environment, but this is just a simple example. Another limitation of this
function is that it does not resume coroutines that have yielded for reasons
other than a call into the thread library (by calling coroutine.yield). When a
coroutine is created, it is in the suspended state, so we have to resume it once
before calling the scheduler function (alternatively, thread.sched may resume
all threads before entering the while loop).
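A variant that keeps the error messages could look like the sketch below. It is not part of the thread library: the sched function and its log argument are assumptions, and thread.wait is replaced by a stand-in that never reports a finished native thread, so that the example runs in plain Lua.

```lua
local thread = {}

-- Stand-in for the C function thread.wait: nothing ever finishes here,
-- so the scheduler falls back to any coroutine still in the table.
function thread.wait() return nil end

-- Like thread.sched, but error messages are appended to the log table and
-- coroutines are removed as soon as they are dead.
local function sched(threads, log)
  while next(threads) do
    local thr = thread.wait() or next(threads)
    local ok, err = coroutine.resume(thr, thread)
    if not ok then
      log[#log + 1] = tostring(err)     -- keep the message for later display
      threads[thr] = nil
    elseif coroutine.status(thr) == "dead" then
      threads[thr] = nil                -- finished normally
    end
  end
end
```

Checking coroutine.status after a successful resume separates normal termination from errors, so only real failures end up in the log. With the stand-in wait function, this sketch also resumes coroutines that have not been started yet or that called coroutine.yield themselves.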
Interpreter
In a multithreaded Lua script, the main function is typically the scheduler
function, which is called after the coroutine objects have been initialized. When
called from the standalone Lua interpreter, this means that until the last thread
has finished its job, no prompt is issued to the user. This can be annoying, since
you might want to keep control of what is going on. But it is easy to write
a new interpreter in Lua, using the exported thread.gets function. Here is an
implementation that mimics the behavior of the regular Lua interpreter:
function thread.shell()
  local line = ''
  local prompt = '--> '
  while true do
    io.write(prompt)
    prompt = '--> '
    line = line .. thread.gets(1000)
    if line == 'quit\n' then return end
    line = line:gsub('^=', 'return ')
    local fct, err = loadstring(line, '@stdin')
    if fct then
      local res = { pcall(fct) }
      if not res[1] or #res > 1 then
        print(unpack(res, 2, #res))
      end
      line = ''
    elseif err:sub(-7,-1) == "'<eof>'" then
      prompt = '-->> '
    else
      print(err)
      line = ''
    end
  end
end
The function repeatedly issues a prompt, slightly different from the default
one of lua.c, so that you know you are in a multithreaded script, and reads
a line with the blocking function thread.gets. Until you press the RETURN key,
this script is suspended and other threads can execute other tasks. As in the
regular interpreter, an equal sign at the beginning of the line is a shortcut for
the return statement. The line is then compiled. If a compilation error occurs
at the end of the block, this means that a multiple-line statement is being
typed, so a different prompt is issued, and the next line read is concatenated
to the previous one. You can exit the function by issuing the command “quit” on
its own.
Complete example
The web site contains a more complete example for running simple user scripts.
All four user functions previously discussed are run in parallel. You may notice
that they do not need to worry about the underlying multithreading; the scheduler
function, with the help of the thread.wait function, takes care of all low-level
synchronization management.
25
Using Lua in Pascal
Jeremy Darling
Why Lua?
In the world of Pascal development there are plenty of native solutions for
scripting an application. Most of these are built with the Pascal language
itself, which only limits the user base of the scripting language in
question. Using a well-known and commonly used scripting language within
our applications helps to expand our user base in the end.
There are other common scripting languages out there (JavaScript, Monkey,
and VBA to name a few), but as of the time of this writing, Lua is the most
robust and supported scripting language available. Due to the nature of how it
is built, Lua allows for great flexibility within an application and it incorporates
into Pascal applications seamlessly.
Wrappers as many bugs have been fixed and new features introduced in pLua
that haven’t been applied to the Generic Wrappers.
Instead of focusing on the “common” implementation of Pascal (Delphi), I will
be focusing on generic ANSI-ISO Pascal in its Object Oriented form. This will
allow the same code to be executed within FreePascal, Lazarus, Delphi, Kylix,
and most other implementations of Pascal.
We will also need a design problem to solve.
The problem
In order to provide the best possible walkthrough of integrating Lua into a
Pascal application, I’ll present an application that would benefit from Lua
integration. The application is a game of sorts where the user is up against the
HAL 9000 and must convince it to release control of the ship.
To keep with the ubiquitous first project (“Hello World”), our first implemen-
tation will load whatever script is passed in and execute it.
Start Application
  Check to see if game.lua file exists
    If so then load it
    Else
      Throw an error
  Setup the game and execute the script to prepare the environment
  Player input and processing
Lua API, but instead are part of lauxlib (or Lua Auxiliary Library). Since they are built into the
standard Lua binary release, they have been included in the lua.pas wrapper.
Writing
To modify or create a new global value we need to do three things: first, we have
to push the name or index associated with our value; second, we have to push
the value itself onto the stack; finally, we have to tell Lua to set the value in the
global table:
lua_pushliteral(L, 'MyIdentifier');
lua_pushstring(L, 'Some Value');
lua_settable(L, LUA_GLOBALSINDEX);
As you can see in the code above (lua_pushstring), there are methods to read
and write each variable type that we use (at least until we start using the
variant type wrappers presented within the pLua unit). This should be nothing
new if you have stored or retrieved information from a standard .ini or registry
object.
We place our identifier onto the stack first with a call to lua_pushliteral;
this could also be done with a call to any of the lua_push* methods. Lua uses
a hash table lookup, so any type (except nil) can be used as a key within a Lua
table. We then place the value that we want to associate with the name onto
the stack using the proper lua_push* method. The final call is lua_settable
with our Lua State Pointer as the first argument and the table identifier as the
second (LUA_GLOBALSINDEX is a constant for the globals table). We can simplify
this code by using methods from the pLua and variants libraries within Pascal.
The variant library provides us a VarIsType function that we can use to see if
the variant type is a string (varString). pLua provides us with a method called
plua_RegisterValue that takes a Lua instance, value name, a value (quoted if
the value is a string), and a table index (defaulted to LUA_GLOBALSINDEX) that
performs the above actions for us. You will notice that this method is used quite
a bit throughout the examples that accompany this article.
Reading
Retrieving the value back out of the table is almost as easy:
lua_pushliteral(L, 'MyIdentifier');
lua_rawget(L, LUA_GLOBALSINDEX);
if lua_isnil(L, -1) then
  // the global is not set; handle that case here
else
  MyString := lua_tostring(L, -1);  // -1 is the value on the top of the stack
We push the name of the variable that we want to retrieve onto the stack. We
then call lua_rawget with our Lua state pointer and table index. Then we test
for nil (lua_isnil). If the value on the top of the stack isn’t nil, we get the value
using the proper lua_to* method.
Just like writing a variable, there are some support methods provided by the
pLua library that allow us to minimize our source code. The main one that we
are interested in is plua_tovariant. This function takes the Lua state and the
stack index that we wish to work with and returns a variant type that contains
the value.
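Seen from the Lua side, the stack calls used for writing and reading amount to ordinary table operations on the globals table. The sketch below is only an illustration written in Lua itself; the registry table and its keys are hypothetical.

```lua
-- Writing: the effect of pushing a key, pushing a value, and lua_settable.
_G["MyIdentifier"] = "Some Value"

-- Reading: rawget bypasses metamethods, as lua_rawget does.
local value = rawget(_G, "MyIdentifier")
local missing = rawget(_G, "NoSuchIdentifier")   -- nil when the global is unset

-- Any value except nil can be used as a key in a Lua table.
local registry = {}
local key = {}
registry[key] = "keyed by a table"
registry[3.5] = "keyed by a number"
local nil_key_ok = pcall(function() registry[nil] = "rejected" end)
```

After this runs, a script can refer to the value simply as MyIdentifier, and nil_key_ok is false because Lua raises an error for a nil key.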
More on methods
Once we can read and write global variables, the next logical step is to move up
to surfacing methods (procedures and functions) from our Pascal environment
to our Lua environment. We will also need a way to call Lua methods (functions)
from within our Pascal source code. We touched lightly on this before, but this
time we will be looking at the specific needs of methods inside and out.
function. Procedures don’t allow you to pass back any value, but can still contain
var args.
This isn’t true in Lua. All methods return a value (default is nil), and there
is only one method type (a function). Parameters to a method are also static, and
their values cannot be marked as output of any kind. A method in Lua may
return as many values as it wishes and the return types are not pre-defined.
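A short Lua sketch makes these rules concrete. The function names no_result and min_max are hypothetical and exist only for this illustration.

```lua
local function no_result() end           -- has no return statement

local function min_max(a, b)             -- returns two values
  if a < b then return a, b end
  return b, a
end

local r = no_result()                    -- r is nil
local lo, hi = min_max(7, 3)             -- lo is 3, hi is 7
local first_only = (min_max(7, 3))       -- parentheses keep only the first value
local count = select("#", min_max(7, 3)) -- number of values returned
```

Because results are matched to variables by position, the caller has to know the order in which a function returns its values.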
Surfacing a method
The most basic of methods that you will need to call from a Lua script is the
print function. While the standard Lua libraries surface print for us, there are
some aspects of its implementation that may not be desirable. For instance, by
default, print will only print to the standard output interface (console window).
If we want to have our script print to another output (say a memo), then we will
need to override the default print handler.
Remember that, since Lua is developed in C, we must always use the cdecl
compiler flag to let the compiler know that we expect the method to follow C
rules and not standard Pascal rules.
Our first call is to lua_gettop. This call will tell us how many arguments are
being passed from Lua to our method. In some methods you can use this as a
quick and dirty test to see if you are receiving the proper number of arguments.
In most, you will also need to provide a type check along with the argument
count check.
For our print implementation we will only be using this as the upper bound of
our loop counter. Next we create an empty string that we can place the passed-in
values into. We then iterate through all of the values passed in (notice that the
Lua stack starts at a positive index of 1 instead of 0 like C and Pascal), placing
each argument’s string representation into our container. We then add the
combined string value to our memo.
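The same logic can be sketched on the Lua side, which may help when reading the Pascal version. This is an illustration only: the memo table stands in for the memo control, memo_print is a hypothetical name, and the single space used as separator is an assumption.

```lua
local memo = {}   -- stand-in for the lines of a memo control

local function memo_print(...)
  local count = select("#", ...)          -- counterpart of lua_gettop
  local parts = {}
  for i = 1, count do                     -- arguments are numbered from 1
    parts[i] = tostring((select(i, ...))) -- string form of argument i
  end
  memo[#memo + 1] = table.concat(parts, " ")
  -- nothing is returned, just as the Pascal handler reports 0 results
end
```

Using select("#", ...) rather than the length of a table built from the arguments keeps nil arguments in the output. Assigning memo_print to the global print would redirect every print call made by a script.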
The last step is to tell Lua that we didn’t put anything back onto the stack (a
return value) and thus that it should set the return value itself. Remember that
Lua methods always have a return value.
Now that we have our method defined and ready within our application we
need to surface it to the Lua virtual machine. We do this after we register
the libraries that we want to make use of (in case we are overriding default
behavior).
Our first call is to lua_pushliteral with the Lua instance and the name that
we want scripters to use to call our method. This is followed by placing the
pointer to the method onto the stack using lua_pushcfunction. We then make a
call to lua_settable with the LUA_GLOBALSINDEX (that should now be becoming
very familiar).
many items we have placed on the stack. To demonstrate this, we will surface a
method called Size, which will return the width and height of our application’s
main form.
Unlike print, we don’t care how many arguments Lua is passing in; instead,
we only care about returning values. In cases like this, it’s a good idea not to
waste time checking the number of input arguments and instead just place our
values on the stack.
The two calls to lua_pushinteger place the width and then the height onto
the stack and are followed by us setting the return value of the function to 2.
This tells the Lua virtual machine that we returned two values. It’s important
to note the order that we pushed the return values, as when we use this method
from within our script we will need to know what to expect first and last. I’ve
stayed with the typical X-then-Y structure, but there is no good reason it couldn’t
be Y-then-X.
We register this method the same as any other method; in fact, this is a good
place to introduce a procedure for registering methods. pLua has a wrapper to
achieve this as well, and when you look at the source code you will find it using
plua_RegisterMethod instead of all of the hand code.
to surface objects and records. No worries though, Lua has mechanisms in place
to take care of object handling.
There are no hard and fast rules as to how you implement or handle objects
within Lua. In fact, a quick search of the web will result in many different ways
to handle objects. I’m going to present the one that I use and that I’ve found to
work very well, although in the end it isn’t the be-all and end-all answer to
objects.
If you think that I forgot about records, you’re mistaken. A record is nothing
more than an object without methods. More accurately, objects are just records
with method pointers. Thus, if we can handle objects then records will fall in
and work automatically.
Metatables
From the Lua manual (section 2.8): “Every value in Lua may have a metatable.
This metatable is an ordinary Lua table that defines the behavior of the original
value under certain special operations. You can change several aspects of the
behavior of operations over a value by setting specific fields in its metatable. For
instance, when a non-numeric value is the operand of an addition, Lua checks
for a function in the field __add in its metatable. If it finds one, Lua calls this
function to perform the addition.”
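The addition example from the manual can be tried directly in Lua. This is a minimal sketch; the Vector table and the vector constructor are hypothetical names chosen for the illustration.

```lua
local Vector = {}

-- Called by Lua when a value with this metatable is an operand of "+".
Vector.__add = function(a, b)
  return setmetatable({ x = a.x + b.x, y = a.y + b.y }, Vector)
end

local function vector(x, y)
  return setmetatable({ x = x, y = y }, Vector)
end

local sum = vector(1, 2) + vector(10, 20)   -- handled by Vector.__add
```

Without the __add field, the last line would raise an error, because tables cannot be added by default.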
Metatables make object types possible within Lua. It is outside of the scope
of this article to cover all aspects of metatables, but we will need to at least
understand them at their very basics in order to implement objects and records.
If you understand how exactly OO works, then the concept of a virtual
method table (VMT) should not be alien at all. If you don’t, then think of a
VMT as an array of pointers that says what method address should be called
when a particular method is called in the code. This way an object can contain
only its private information (variables) and use a global method table (the VMT)
along with its private address (object pointer) to save space and make its calls.
This is the basis of polymorphic programming in general (OO). A metatable
allows us to implement a VMT of our own. The primary difference is that the
metatable doesn’t only have to surface methods, it can surface values, alternate
implementations (we won’t cover this), and methods.
Object properties
Lua tables have built-in meta-methods to allow developers to modify their de-
fault behavior. We will use this to our advantage to implement variables (read
and write for the case of this document; read only or write only should be easy
to figure out) and methods. For variables we will need to override the default
__index and __newindex table entries to call our own getters and setters.
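The getter and setter idea can be shown in pure Lua before moving to the Pascal side. The sketch below is not the pLua implementation: new_object, the fields table, and the form example are hypothetical, and the properties are kept in a private table so that every access goes through the metatable.

```lua
-- Build a proxy whose fields are served by getter and setter functions.
local function new_object(getters, setters)
  local fields = {}     -- private storage
  local proxy = {}      -- stays empty, so every access reaches the metatable
  return setmetatable(proxy, {
    __index = function(_, key)
      local get = getters[key]
      if get then return get(fields) end   -- unknown properties read as nil
    end,
    __newindex = function(_, key, value)
      local set = setters[key]
      if not set then
        error("cannot write property '" .. tostring(key) .. "'", 2)
      end
      set(fields, value)
    end,
  })
end

-- Hypothetical form object: width is read/write, area is read only.
local form = new_object(
  { width = function(f) return f.width or 0 end,
    area  = function(f) return (f.width or 0) * 10 end },
  { width = function(f, v) f.width = math.max(0, v) end }
)
```

A read-only property is simply one that has a getter and no setter, and a setter is a natural place for validation such as the clamping shown for width.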
First we have to receive back a copy of the object that Lua is referencing.
This is achieved with a call to a custom function that returns a Pascal Object
from a Lua stack reference.
Within our TLuaObject descendant we will surface helper methods to read
and write variable values. These methods are: GetPropValue, SetPropValue,
GetPropObject, SetPropObject. They do exactly what they say: get a value, set
a value, get a sub-object and set a sub-object.
If you choose to use the pLuaObjects unit then you will find that you can
quickly and easily wrap existing objects. Take a look at the pLua demos folder
for the pLuaObjects and pLuaObjects2 demos for more explanation.
Object methods
Methods are a bit trickier. We will need to tell Lua that the method exists and
what class type the methods are tied to. We will then need to write a method
handler that prepares the proper arguments and then passes them to the object
instance’s actual method. We will need to check for results and out parameters
and put them back on the Lua stack. The latter part is the same as when we
covered methods above; the former is explained in the source code.
Records
Although I said that if we supported objects, records would fall in
automatically, there is one caveat to records: you must use record pointers
instead of actual record types. This is due to the way that records are passed and
assigned within Pascal. If you assign an object instance to an object variable,
the variable contains a pointer to the object instance. On the other hand, if
you assign a record to a record variable then the entire contents of the record
are copied over to the variable. If you can keep this in mind and use record
pointers in all locations, then you will actually gain twofold. First, you will
guarantee that your records will work properly, and second, you will notice that
your application uses less memory.
you might have noticed that you received access violations after the first call.
This is because simply loading the Lua code does not execute it, so the
functions it defines are not yet available in the virtual machine (first part
of the article).
Instead we need to load the source file, execute it once, and only then can we
call our loop routine.
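The difference between loading a chunk and running it can be observed from Lua itself. In this sketch the script text and the global name loop are assumptions made for the illustration; in standard Lua, loading turns the source into a function, and the globals it defines only appear once that function has been called.

```lua
-- Lua 5.1 compiles strings with loadstring; later versions use load.
local compile = loadstring or load

local source = "function loop() return 'tick' end"

local chunk = assert(compile(source, "=game"))  -- load only, nothing has run
local before = rawget(_G, "loop")               -- still nil at this point

chunk()                                         -- execute the script once
local after = loop()                            -- the routine now exists
```

Calling loop before chunk() has run fails with an attempt to call a nil value, which is the Lua-level counterpart of the failure described above.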
Final words
Everything that has been covered in this article has been wrapped up into a nice
little package for you to work with. The files LuaWrapper.pas, pLuaObjects.pas,
and LuaObject.pas have helper classes, functions, and base classes for you to
extend and use. They have been built from experience and have quite a bit of
testing going on with them all of the time. As with all libraries though, you
should read and understand them as the author(s) are not responsible for any
damage to your system.
26
Porting Lua to a Microcontroller
Ralph Hempel
The Lua language was designed from the beginning to be small in its memory
footprint for both the developer and the target machine. The basic philosophy
is to provide a concise and unambiguous syntax that the developer can use and
depend upon.
The purpose of this gem is to outline some of the issues that come up when
porting Lua to an extremely memory constrained target. I’ll go over a basic in-
troduction to the target, which is the LEGO MINDSTORMS NXT brick, talk a
bit about how the run-time library is designed, and then introduce my compro-
mise between 32-bit long integers and single-precision floating-point numbers,
which I call “flongs”.
Forth was about the only language I could imagine running right on the brick,
and the result was pbForth.
In 2005, the new LEGO MINDSTORMS NXT was released. It has an ARM7
based micro with 256K of rewritable on-chip FLASH and 64K of RAM. I could
have easily ported Forth to that device as well, but the syntax of Forth can make
marginally sane programmers cross the line.
In searching for other small interpreted languages, I evaluated two addi-
tional options: Lisp and Lua. While very powerful, Lisp suffers from the same
problems as Forth in terms of syntax. Since most programmers are very comfort-
able with infix notation, Lua provides a familiar syntax compared with Forth’s
postfix and Lisp’s prefix notation.
I really wanted to try to port Lua to this device just to see if it could be done,
and then realized that I would have a very powerful way of programming robots
interactively. This might be useful for academic purposes when the limitations
of the original GUI programming environment provided by LEGO are reached.
There are some special challenges to getting even a minimal Lua system
working on a deeply embedded system. Besides the obvious one of putting
together a toolchain that generates the code image, the more subtle problem
is to figure out what can or should be removed in order to make a useful system.
Being lazy has also made me dependent on make, which I use in all kinds of
projects to make sure that I have to think and type as little as possible once I’ve
figured out how to get a job done.
I have a standard framework that I set up for any embedded systems project,
which I won’t describe in detail here but is available in the pbLua distribution.
In general terms, it starts with a directory that has the processor specific startup
files and a few I/O routines that blink LEDs, make sounds, or read and write
characters through a serial port.
Once I have that framework set up and some minimal code compiled that will
flash LEDs, make sounds, or read and write characters through a serial port,
I’m ready to move on. One other thing being lazy has taught me is that you will
do a lot more work later if you don’t start simple, gain confidence in your
tools, and only then add complexity.
#undef LUA_NUMBER_DOUBLE
#undef LUA_NUMBER
#define LUA_NUMBER_LONG
#define LUA_NUMBER long
/*
@@ LUAI_UACNUMBER is the result of an 'usual argument conversion'
@* over a number.
*/
#undef LUAI_UACNUMBER
#define LUAI_UACNUMBER long
/*
@@ LUA_NUMBER_FMT is the format for writing numbers.
@@ lua_number2str converts a number to a string.
@@ LUAI_MAXNUMBER2STR is maximum size of previous conversion.
@@ lua_str2number converts a string to a number.
*/
#undef LUA_NUMBER_FMT
#undef lua_number2str
#undef LUAI_MAXNUMBER2STR
#undef lua_str2number
/*
@@ The luai_num* macros define the primitive operations over numbers.
*/
#if defined(LUA_CORE)
#include <math.h>
#undef luai_nummod
#undef luai_numpow
Listing 1.
Thread-safe considerations
One of the buzzwords you’ll hear in a discussion on embedded libraries is “thread-
safe” or “re-entrant”. In multi-tasking systems it is quite common for a routine
called by a task to be interrupted at any time for a task switch. The new task
may call the original routine as well. If the routine is thread safe, it won’t get
confused and return incorrect results to either task.
One of the first steps towards becoming thread safe is to not use global
variables. As long as a routine allocates all of its variables on the calling task’s
stack, then chances are the routine is thread safe.
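The effect of a shared variable can be reproduced in plain Lua, with coroutines standing in for tasks and coroutine.yield standing in for a task switch at an unlucky moment. This is a sketch only; unsafe_add, safe_add, and run_interleaved are hypothetical names.

```lua
local scratch   -- shared by every caller, like a C global

-- Not re-entrant: the intermediate value lives in the shared variable.
local function unsafe_add(a, b)
  scratch = a
  coroutine.yield()          -- "task switch" in the middle of the routine
  return scratch + b
end

-- Re-entrant: the intermediate value lives in a local of this call.
local function safe_add(a, b)
  local mine = a
  coroutine.yield()
  return mine + b
end

-- Start two tasks that both enter the routine before either one finishes.
local function run_interleaved(routine)
  local task1 = coroutine.wrap(function() return routine(1, 10) end)
  local task2 = coroutine.wrap(function() return routine(2, 20) end)
  task1()
  task2()
  local result1 = task1()
  local result2 = task2()
  return result1, result2
end
```

With safe_add the two tasks get 11 and 22. With unsafe_add the first task gets 12, because the second task overwrote the shared variable while the first one was interrupted.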
in assembler. If you do find that the routines are too slow, then go ahead and
rewrite them later. In the meantime you’ll have code that works.
The basic string library that I settled on is based on the Minix project from
Vrije Universiteit in Amsterdam. It’s a well known project that has been in use
for years, so I am confident that the library has had most bugs eliminated.
Besides the string routines, the Minix project source also yielded routines to
handle character classification, memory operations, and basic error and locale
handling:
memory      string                    characters   locale
memchr.c    strcat.c    strncpy.c     isalnum.c    errlist.c
memcmp.c    strchr.c    strnlen.c     isalpha.c    locale.c
memcpy.c    strcmp.c    strpbrk.c     isascii.c    setlocale.c
memmove.c   strcoll.c   strrchr.c     iscntrl.c    strerror.c
memset.c    strcpy.c    strstr.c      isdigit.c
            strcspn.c   strtoflong.c  isgraph.c
            strerror.c  strtol.c      islower.c
            strlen.c    tolower.c     isprint.c
            strncat.c   toupper.c     ispunct.c
            strncmp.c                 isspace.c
                                      isupper.c
                                      isxdigit.c
                                      chartab.c
There are a few things I changed in order to save space, and the main one is
in errlist.c where I set all errors to unknown except for the ones that are set
by Lua and its libraries. Other than that I did not touch the code because it was
unlikely that I’d be able to improve on it without breaking something.
The other place where I’ve taken some liberties is with the number conversion
part of the string library. The routine that the Lua interpreter uses to
convert strings to numbers is luaO_str2d. This routine operates as follows:
1. First, it tries to use the lua_str2number macro to read the string as a
number. If it makes no forward progress in reading the string, then the
conversion fails completely and luaO_str2d exits returning 0.
2. Next, it checks whether the conversion stopped at an upper or lower case 'x',
in which case it tries to convert the string as an unsigned hex number using
the strtoul function.
3. Next, it skips any trailing spaces and checks whether it has reached the end
of the string being converted; if so, luaO_str2d exits returning 1.
4. Otherwise there are illegal characters after the number, so luaO_str2d exits
returning 0.
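The steps above can be sketched in plain C as follows. This is a minimal sketch rather than the interpreter source: sketch_str2d is a hypothetical name, and strtod is assumed to stand in for the lua_str2number macro (which is how the stock luaconf.h defines it, but a port is free to change that).

```c
#include <ctype.h>
#include <stdlib.h>

/* Returns 1 and stores the value in *result on success, 0 on failure. */
static int sketch_str2d(const char *s, double *result)
{
  char *endptr;
  *result = strtod(s, &endptr);            /* stand-in for lua_str2number */
  if (endptr == s)
    return 0;                              /* step 1: no forward progress */
  if (*endptr == 'x' || *endptr == 'X')    /* step 2: maybe a hex constant */
    *result = (double)strtoul(s, &endptr, 16);
  while (isspace((unsigned char)*endptr))  /* step 3: skip trailing spaces */
    endptr++;
  return *endptr == '\0';                  /* step 4: leftovers are illegal */
}
```

With this sketch, "42" and "0x10" are accepted (as 42 and 16), while "12abc" and "abc" are rejected.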
I needed to make sure strtoul was available, and it’s part of the strtol.c
file. As you’ll see later in the section on the math library, the trick was figuring
out a way to get the interpreter to differentiate between floating point numbers
and integers in a way that was not too complicated. The custom routine I came
read and try to understand how it’s implemented underneath. The first language I learned (besides
assembler and C) was Forth. It had a wonderful structure that made it obvious how it worked. When
I saw the Lua source code, I knew I was looking at something that had evolved over time and had
some deep thought behind it. It looks obvious when you read the code that this is the “right way” but
based on my experience, there were probably a few failed attempts before this. . . From the high level
parser to the virtual machine, and the API that handles the interaction between C or assembler
libraries and Lua itself, the Lua core is beautifully organized and a model of good code. I have much
greater confidence that code is correct if it looks like it was carefully crafted.
322 26 · Porting Lua to a Microcontroller
reason it’s not a good choice to be the single numerical representation of this
implementation of Lua.
In addition to that, most of the time we’re interested in doing fairly simple
math when we’re designing robots. While the overhead of converting all numbers
to float is less than for doubles, it’s still significant. Even simple operations
like addition and subtraction become much more complicated with floats than
with longs.
While these two tradeoffs competed against one another, I started to think
about ways in which a compromise could be struck — and the result is a hybrid
number type that uses standard representations for long and float while making
maximum use of the benefits of both. I call it “flong”.
The breakthrough came when I realized that the practical application of each
of the number types is in several distinct domains — in other words you use one
numeric type for the task at hand and avoid mixing them unless absolutely
necessary.
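One way to picture such a hybrid type is a small tagged value whose arithmetic stays in integers as long as both operands are integers. The layout and every name below are assumptions made purely for illustration; they are not the representation used in this port.

```c
/* Hypothetical tagged representation: a value is either a long or a float. */
typedef enum { FLONG_LONG, FLONG_FLOAT } flong_kind;

typedef struct {
  flong_kind kind;
  union { long l; float f; } v;
} flong;

static flong flong_from_long(long l)
{
  flong r;
  r.kind = FLONG_LONG;
  r.v.l = l;
  return r;
}

static flong flong_from_float(float f)
{
  flong r;
  r.kind = FLONG_FLOAT;
  r.v.f = f;
  return r;
}

static float flong_as_float(flong a)
{
  return a.kind == FLONG_LONG ? (float)a.v.l : a.v.f;
}

/* Addition uses cheap integer arithmetic when both operands are longs and
   falls back to float only when the two types are mixed. */
static flong flong_add(flong a, flong b)
{
  if (a.kind == FLONG_LONG && b.kind == FLONG_LONG)
    return flong_from_long(a.v.l + b.v.l);
  return flong_from_float(flong_as_float(a) + flong_as_float(b));
}
```

Adding two longs never touches floating point in this sketch, which matches the idea of using one numeric type for the task at hand and paying for conversion only when the types are mixed.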
I often get asked why I don’t use standard or even light userdata for the
new numerical type. The answer is quite simply speed. Light userdata values
don’t have metatables that we can use to override their operators, and standard
userdata exacts a toll on the C API side.
Conclusion
Building a complete Lua interpreter under Linux, BSD, OSX, or even Windows
is relatively simple when you use the makefiles provided with the source code.
Building Lua for use on a microcontroller with no underlying OS support
requires more careful consideration of tradeoffs between speed, size, and accuracy.
Embedding Lua on a constrained micro forces the programmer to think hard
about many things that are taken for granted on a desktop system. From
math to memory, strings to stdio, almost everything is under the control of
the designer, and knowing what the tradeoffs are can help you to make better
decisions.
In the future, this project may take advantage of other work being done in the
Lua world, including keeping some table information in non-volatile memory,
and allowing precompiled code chunks to be stored and used later. I also plan
on implementing a simple, flat filesystem that can be used to log data and store
raw or precompiled code.
In an ideal world, every university and high school would have a LEGO NXT
lab where students could learn about programming by designing and building
simple robots. I may have to settle for a bit less.
Writing C/C++ Modules for Lua
27
Ralph Steggink and Wim Couwenberg
of C/C++ modules. Whereas for Lua 4.0 we tended to write a complete module
in C, we now do only minimal coding in C to expose a library’s API pretty much
verbatim to Lua and then shape it into a scripting-friendly module in Lua code.
A C module for Lua hence consists of two parts: a public Lua script contain-
ing the module’s interface and a private C library that exposes the raw C API
as literally as possible to the Lua script. Here “public” means that only the Lua
script is ever required directly by client code.
The low-level API functions perform little to no argument checking since
they are called only from the accompanying Lua script in a controlled and safe
manner. Of course we must ensure that no low-level API function can “slip” out
of our module and into the open since calling such a function in any other way
than intended might lead to instant disaster.
Hence to make a C module for libevent we create Event.lua and a C-Event.so
(or C-Event.dll) file. Requiring the “Event” module will in fact find, load, and
run the Event.lua script. The Event.lua script will in turn require “C-Event” to
load the private C part and properly set up the entire event module.
The distribution of responsibilities between the Lua and C part of a module
is always roughly the same. The tables below list the typical content for both.
C part:
• Exposes C API to Lua part “as is”.
• Provides garbage collection methods for the module’s object types.
• Can define public constants.
Lua part:
• Defines structured Lua object methods around the low-level API calls.
• Sets up object metatables.
• Exports object constructors in the module’s namespace.
• Offers structured error handling.
• Checks preconditions on method parameters (correct types and values).
Note that in general the C part does not add any function definitions directly
to the module’s namespace. The public interface is taken care of by the Lua
script. It is convenient to add public (numerical or string) constants to the
module’s namespace directly from the C part. This saves us from copying and pasting
their symbolic definition from a header file to the Lua script. The Event module
defines its public constants in the C part.
A private communication channel between the Lua and C parts is set up by
means of two local tables in the Lua script: a prv table that will hold the
low-level C API calls and an aux table that holds anything from the Lua script
that must be available from within the C part. Being local, these tables are
not accessible outside of the module. In particular, the prv table that holds
dangerously low-level C calls is safely tucked away. (Similar to Lua Technical
Note #7 [3].)
When the private C part is required from the Lua script it just returns an
initialization function without any other side effects. This function is called
with aux, prv, and the module table _M as parameters. The aux table is set as
the environment of all low-level C API functions and this API is put in the prv
table. Public constants are defined directly in the module table. For the libevent
library, the script Event.lua would typically begin as follows (the exact script is
presented at the end of this gem):
The initialization function does three things: It defines public constants from
event.h in the module’s namespace. Then it replaces its own environment with
the provided aux table such that it will be easily accessible to all API functions.
Finally, it places all the low-level API functions in the prv table. Note that
each C closure, being pushed on the stack before it is stored in prv, automatically
inherits aux as its environment table. The C part for libevent in Event.c would
roughly look as follows (the exact source file is at the end of this gem):
/* module initialization with "aux", "prv" and module tables */
static int initialize(lua_State *L)
{
  < ... setup constants in module table (code omitted) ... >

  /* set "aux" table as environment. */
  lua_pushvalue(L, 1);
  lua_replace(L, LUA_ENVIRONINDEX);

  /* put low-level API functions in "prv" table. */
  lua_pushvalue(L, 2);
  luaL_register(L, NULL, prv_functions);
  return 0;
}
Notice that this C code is really minimalistic. Not even metatables are
created and manipulated in C: All such interesting stuff will be left to the Lua
script. Having established the generic setup of a module as a pair of a Lua script
and a private C library, we will now discuss how a module can organize an API
into more scripting-friendly objects.
Objects
A library’s API is usually structured in an object-oriented way. This is obvious
for C++ interfaces but is equally true for many C interfaces. For example,
libevent really introduces an “event” object of which the event_set function is a
constructor and the event_add function (among others) is a method. This object-
oriented approach is really convenient from a scripting point of view, so we will
want to structure the event module’s interface in terms of objects.
An object is modeled as a combination of a Lua table and a full userdata.
The userdata part represents an object from the C library while the Lua table
is used to store additional information with this object. For libevent we reserve
space for an event structure in a userdata. A Lua table is used to store the
callback function for an event. The Lua table can be set as the environment of
the userdata to make it easily obtainable from the userdata. Also, the userdata
can be put in the Lua table (by assigning it to some “private” field, say __udata),
so that the userdata can be accessed from Lua.
The possibility to set an environment table for a userdata was introduced in
Lua 5.1 and is a great help to associate Lua data with userdata. In Lua 5.0 we
could only do this by maintaining a userdata-to-table mapping. Such a mapping
can still be necessary for some other purposes even in Lua 5.1, as we will see in
the libevent example, where we use it to retrieve a Lua object from a pointer-to-
void callback argument.
When writing a module primarily in Lua we are confronted with the follow-
ing restrictions of environments and metatables:
1. A userdata’s metatable cannot be set in a Lua script.
2. A userdata’s environment cannot be set or obtained in a Lua script.
Restriction 1 means that the module’s C code must have access to the metatable
to construct a userdata object, since the metatable cannot be set by the Lua script
later on. In our libevent example we put all metatables in the aux table (the
environment of all C API functions) exactly for this reason.
Restriction 2 hinders the implementation of object methods in the following
sense. We could simply implement an object as a userdata with a metatable and
an environment to store associated data. In this case, each method receives a
userdata instance as its first (self) parameter. If object methods are written
in Lua (which we aim for), then a method cannot get at the environment table
of self, which is inconvenient. Even though we could work our way
around this inconvenience (via a getenv function in the module’s prv table), we
implement objects differently.
An object is a plain Lua table that contains the userdata part as its __udata
field. This field is considered “private” although we do not take extra measures
to make it inaccessible. (It is private only by convention.) With this setup
for objects we need two metatables for each object type: one to specify public
methods and properties for an object and another one to specify a garbage
collection function for its userdata part. (Remember that the __gc metamethod
is never called for Lua tables.) The methods and properties metatable can be set
on an object (a Lua table) in the Lua script itself. The userdata metatable must
be set in the C part of the library when the userdata is created. Such metatables
are placed in aux for easy access from C.
Listing 1.
specific libraries before resorting to the all-in-one loader. This allows us to “patch”
selected modules from a distribution.
A libevent module
Finally, we present some of the techniques that we discussed to make a Lua
module for the libevent library. The code below is fully functional but still
intended as an example. Only a very small part of libevent’s API is included,
but enough to see that it is actually working. Of course, there is room for
lots of improvements and variations. Function and method parameters could
be checked for preconditions and error return codes translated into something
more sensible. However, what this example aims to show is that such things are
really easy to add simply in the module’s Lua script.
First, a small example that uses the Event module to echo stdin with a
timeout of 5 seconds:
Example.lua
require "Event"
Event.lua
-- save the used globals
local math, require, setmetatable, pcall =
math, require, setmetatable, pcall
Event.__index = Event
function Event:add(timeout)
  local sec, usec
  if timeout then
    sec = math.floor(timeout)
    -- tv_usec counts microseconds, so scale the fractional part by 1e6
    usec = (timeout % 1)*1e6
  end
  return prv.add(self.__udata, sec, usec)
end
function Event:del()
  return prv.del(self.__udata)
end
-- global functions
function create(fd, event_type, handler)
  local udata, ptr = prv.create(fd, event_type)
  local event = setmetatable({}, Event)
  event.__udata = udata
  event.handler = handler
  events[ptr] = event
  return event
end
function dispatch()
  prv.dispatch()
end
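#include <sys/time.h> /* struct timeval */
#include <event.h>    /* libevent API and EV_* constants */
#include "lua.h"
#include "lauxlib.h"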
/* Module-global dispatch */
static int dispatch(lua_State *L) {
  event_dispatch();
  return 0;
}
/* timeout specified? */
if (lua_type(L, 2) == LUA_TNUMBER) {
t.tv_sec = lua_tointeger(L, 2);
t.tv_usec = lua_tointeger(L, 3);
pt = &t;
}
/* constants */
static const struct constant constants[] = {
  {"EV_TIMEOUT", EV_TIMEOUT},
  {"EV_READ", EV_READ},
  {"EV_WRITE", EV_WRITE},
  {"EV_SIGNAL", EV_SIGNAL},
  {"EV_PERSIST", EV_PERSIST},
  {NULL, 0}
};
/* private functions */
static const luaL_Reg prv[] = {
  {"dispatch", dispatch},
  {"create", create},
  {"add", add},
  {"del", del},
  {NULL, NULL}
};
return 0;
}
References
[1] Roberto Ierusalimschy, Luiz Henrique de Figueiredo, Waldemar Celes,
“Lua 5.1 Reference Manual”, Lua.org, 2006.
[2] Niels Provos, “libevent – an event notification library”.
http://www.monkey.org/~provos/libevent
Lua 5.1 provides a flexible and powerful module mechanism. It can load two
types of modules: Lua modules, which are written in Lua, and binary modules,
which are written in any compilable language that can produce shared libraries.
Through this mechanism it is possible to extend Lua in many ways, making it a
truly extensible language, for use as a general scripting tool.
However, in certain circumstances these two extension mechanisms may not
be enough. Fortunately, the Lua module mechanism has been carefully written
and is itself extensible thanks to a searcher concept. The stock Lua interpreter
comes with a few searchers that implement the Lua 5.1 module system. But
it is possible to create additional searchers that will inject code inside the Lua
interpreter state in whatever way may be needed.
In this article I will introduce a method to extend the Lua module mechanism
to other kinds of modules. As an example, I will show how to support modules
written in uncompiled C, with the help of a tiny embeddable C compiler, TCC (for
Tiny C Compiler). This mechanism will allow us to load uncompiled C modules,
and to compile them on the fly. I will first explain how the searcher mechanism of
Lua works in detail, and how to hook into it. I will present TCC, and especially
libtcc, which is a library that is able to compile C code and relocate it in the
current executable for immediate use. We will finally see how to create a small
binding library for TCC that injects its C compilation ability in the Lua module
framework.
TCC
Tiny C Compiler
TCC is a C compiler targeted for x86 platforms written by Fabrice Bellard (of
ffmpeg and QEMU fame). It is very small and very fast. It is so fast that it is
used as a JIT compiler to interpret C programs, and that’s precisely the feature
we’ll be using in this gem. Originally TCC was derived from OTCC, which
aims to be the smallest self-compiling C compiler. OTCC was not capable of
compiling full C99, only a subset of it, but that subset was C99 compliant and
enough to build itself. TCC has kept that minimalistic approach while being
much more usable in a production environment.
TCC is heading toward full ISO C99 compliance. It does have some extensions,
but not as numerous as GCC’s. It has almost no support for older versions
of C not covered by C99. Also it has no support for C++, so C++ programmers
used to GCC extensions must take care to program strictly in C. But past these
little restrictions, TCC can compile most C code without any problem and very
quickly (for example it can compile and boot a typical Linux 2.4 kernel from
sources in less than 15 seconds on an average PC).
libtcc
TCC can be used as a standalone compiler, but it is internally built around a
compiling library, libtcc. That library can be used from external projects, to
compile and link C code. But libtcc is also able to relocate dynamically generated
code into memory, and to return pointers to functions and other symbols in that
340 28 · Interpreted C Modules
relocated code. That’s the way TCC is used as a C interpreter, and that’s the
feature we’re going to use to create Lua modules at runtime from C source code.
Libtcc follows classic C compilation steps. First you can add include paths,
library paths, and predefined symbols. Then each source file is compiled. All
compiled files are finally linked with each other and with external libraries
to produce the output binary. TCC can link to dynamic libraries too, but on some
platforms it may require a specific import library (read the TCC manual for more
information). The output binary can be saved into an executable file (ELF on
Linux, PE on Windows), or it can stay in memory and be directly executed. In
the latter approach, TCC can retrieve pointers to symbols in the binary and give
them to the calling code. If the symbol is a function, it can be directly executed
as a function pointer.
Special care must be taken when accessing symbols in the relocated binary.
Libtcc has no way to ensure that the retrieved symbol is of a specified type, so
you have to carefully handle the pointers returned and cast them to the proper
type. This is especially important for function pointers, since calling a function
with an incorrect number of parameters or with the wrong calling convention
can corrupt the stack pointer and lead to unpredictable results, of which
a program crash is the least problematic. Here the fact that Lua uses a single
function prototype almost everywhere will be very handy and will avoid many
complications.
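The casting discipline can be sketched without libtcc at all. In the sketch below everything is a hypothetical stand-in: lookup_symbol plays the role of the symbol lookup, fake_state replaces lua_State, and module_function mirrors the shape of lua_CFunction. None of these names belong to the libtcc or Lua APIs.

```c
#include <stddef.h>
#include <string.h>

typedef struct fake_state fake_state;          /* stand-in for lua_State */
typedef int (*module_function)(fake_state *);  /* the one prototype we expect */
typedef void (*raw_symbol)(void);              /* untyped function address */

static int hello(fake_state *state) { (void)state; return 1; }

struct symbol_entry { const char *name; raw_symbol address; };

static const struct symbol_entry symbol_table[] = {
  { "hello", (raw_symbol)hello },
  { NULL, NULL }
};

/* Returns an untyped address, or NULL if the symbol is unknown. */
static raw_symbol lookup_symbol(const char *name)
{
  const struct symbol_entry *e;
  for (e = symbol_table; e->name != NULL; e++)
    if (strcmp(e->name, name) == 0)
      return e->address;
  return NULL;
}

/* Check for NULL first, then cast to the single expected prototype
   before calling; the lookup itself cannot verify the type for us. */
static int call_symbol(const char *name, fake_state *state, int *nresults)
{
  raw_symbol raw = lookup_symbol(name);
  if (raw == NULL)
    return 0;
  *nresults = ((module_function)raw)(state);
  return 1;
}
```

Because every symbol is cast to the same prototype, there is exactly one place where a wrong type could slip in, which is the benefit of having a single function signature.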
Another problem must be handled: hardware data execution prevention
features. On some platforms, including modern x86 derivatives, there
are safety mechanisms built into the memory hardware to separate code from
data in memory. Libtcc output is both data (it is written by the current
process) and code (we want to execute it), so attempting to call functions in TCC
relocated code is considered by the underlying hardware as a violation of the
data/code separation and may interrupt the whole process. In your application
you must make sure that this behaviour is allowed (with its potential
security implications). This is handled neither in libtcc nor in the libtcc binding
presented here, since it’s a very hardware- and OS-specific issue.
TCC binding
The TCC binding I’m going to present here was originally based on the one
written by Javier Guerra. It has been extensively rewritten and extended to be
used as a Lua searcher. This binding is split into two parts. The first part, the
luatcc module presented in this section, is a simple binding to libtcc that allows
you to compile and execute C code. The second part, the luatcc.loader module, is a
module searcher that locates C source files and compiles them as Lua modules.
The TCC binding is built around a context or state concept. A context
is like an instance of the compiler. It has its own paths, you can add several
source files, declare several libraries, and it can produce a single output binary.
However, you can access several symbols in that binary. To create a context, just
call the luatcc.new function. The module source code is not of much interest:
it’s just a simple C library binding, so I won’t explain it here; source files are
self-explanatory. Here is a basic example that extracts a function called hello
from a C source string:
local luatcc = require 'luatcc'
local context = luatcc.new()
context:compile([[
  #include <lua.h>
  int hello(lua_State* L)
  {
    lua_pushstring(L, "Hello World!");
    return 1;
  }
]])
context:relocate()
local hello = context:get_symbol("hello")
print(hello())
As you can see, you must call the methods compile, relocate, and get_symbol
of the TCC context object. compile accepts as a second parameter the chunk
name, which can be useful when you have several source files and an error
occurs. Here we don’t add include paths; TCC will use its predefined ones to locate
lua.h. These predefined paths are defined at TCC compilation time.
TCC searcher
The TCC searcher is very simple. It mimics the Lua and C searchers. We will
examine its source code and comment on each section.
The first action of the searcher will be to locate the C source file for the module
we are trying to load. To do that we use the content of the package.tccpath
variable in the same way that the Lua and C searchers use package.path and
package.cpath respectively. Each tested path which doesn’t match is added to
an error message. If the module is not found, that filename list is returned to
require in a string. The format is the same as the one used by the Lua and C
searchers: each path is prefixed with a new line and a tab character.
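The search itself is written in Lua in the loader, but the logic is easy to restate in C. The sketch below is only an illustration under stated assumptions: the names expand_template and search_path are hypothetical, templates are assumed to be separated by ';' with '?' standing for the module name (as in package.path), and all buffer sizes are assumed to be greater than zero.

```c
#include <stdio.h>
#include <string.h>

/* Expand one template such as "./?.c" by replacing every '?' with name.
   Returns 0 if the result does not fit in out. */
static int expand_template(const char *tmpl, size_t len, const char *name,
                           char *out, size_t outsize)
{
  size_t used = 0, i, nlen = strlen(name);
  for (i = 0; i < len; i++) {
    if (tmpl[i] == '?') {
      if (used + nlen >= outsize) return 0;
      memcpy(out + used, name, nlen);
      used += nlen;
    } else {
      if (used + 1 >= outsize) return 0;
      out[used++] = tmpl[i];
    }
  }
  out[used] = '\0';
  return 1;
}

/* Try each template in path. Returns 1 with the first readable file name in
   found; otherwise returns 0 and lists every rejected file name in errors,
   each one prefixed with a new line and a tab. */
static int search_path(const char *path, const char *name,
                       char *found, size_t foundsize,
                       char *errors, size_t errsize)
{
  const char *p = path;
  errors[0] = '\0';
  while (*p != '\0') {
    size_t len = strcspn(p, ";");
    if (len > 0 && expand_template(p, len, name, found, foundsize)) {
      FILE *f = fopen(found, "r");
      if (f != NULL) { fclose(f); return 1; }
      if (strlen(errors) + strlen(found) + 3 <= errsize) {
        strcat(errors, "\n\t");
        strcat(errors, found);
      }
    }
    p += len;
    if (*p == ';') p++;
  }
  found[0] = '\0';
  return 0;
}
```

When no template matches, the errors buffer holds exactly the kind of string a searcher hands back to require.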
The content of the module source file is read entirely. That way we will be able
to locate pragma directives inside the file (see below). It is not the most efficient
way to load the file, especially if the source file is very big, but that is left as an
optimization for future versions.
Since we have no way to add compilation options specific to a module, the TCC
searcher will read some pragma directives inside the module source file. An
alternative method would have been to load a second file containing those
parameters. With pragma directives we can keep the module configuration atomic.
Also it does not prevent us from adding a second optional file containing other
configuration parameters (for example if we need platform-specific parameters).
Each directive has the form:
#pragma luatcc commandname(commandparameters)
The searcher will load all such commands it finds into the commands table. Each
entry has the command name as key and an array as value. That array contains
one array per occurrence of the command, each holding that occurrence’s
parameters. For example, if you want
to access the second parameter of the first foo command, you would access it
through commands["foo"][1][2]. This system gives us an extensible way to add
commands. All unused commands will be simply ignored. For the moment, the
only command used is use_library, but we could add commands for each luatcc
API function.
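To make the directive format concrete, here is a minimal C sketch that recognizes one such line and splits it into a command name and a raw parameter string. The function names are hypothetical, and keeping the parameters as a single unsplit string is a simplifying assumption; the actual searcher does this work in Lua and stores the parameters in arrays.

```c
#include <stddef.h>
#include <string.h>

static const char *skip_blanks(const char *p)
{
  while (*p == ' ' || *p == '\t')
    p++;
  return p;
}

/* Match keyword at p and require at least one blank after it.
   Returns the position after the blanks, or NULL if there is no match. */
static const char *expect_word(const char *p, const char *keyword)
{
  size_t n = strlen(keyword);
  if (strncmp(p, keyword, n) != 0)
    return NULL;
  p += n;
  if (*p != ' ' && *p != '\t')
    return NULL;
  return skip_blanks(p);
}

/* Parse "#pragma luatcc commandname(commandparameters)".
   Returns 1 and fills name and params on a match, 0 otherwise. */
static int parse_pragma(const char *line,
                        char *name, size_t namesize,
                        char *params, size_t paramsize)
{
  const char *p = skip_blanks(line);
  const char *close;
  size_t n;

  if ((p = expect_word(p, "#pragma")) == NULL) return 0;
  if ((p = expect_word(p, "luatcc")) == NULL) return 0;

  n = strcspn(p, "( \t");               /* the name runs up to '(' */
  if (n == 0 || n >= namesize) return 0;
  memcpy(name, p, n);
  name[n] = '\0';

  p = skip_blanks(p + n);
  if (*p != '(') return 0;
  p++;
  close = strchr(p, ')');               /* parameters end at the first ')' */
  if (close == NULL) return 0;
  n = (size_t)(close - p);
  if (n >= paramsize) return 0;
  memcpy(params, p, n);
  params[n] = '\0';
  return 1;
}
```

Lines that are not luatcc pragmas, such as "#pragma once", are simply rejected, which mirrors the rule that unused commands are ignored.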
-- Interpret pragma commands
--- use_library
local libdeps = {}
if commands.use_library then
  for _,args in ipairs(commands.use_library) do
    local libdep = args[1]
    table.insert(libdeps, libdep)
  end
end
This section of the searcher is the interpretation of the pragma directives. As
mentioned above, the only directive used at the moment is use_library. Here
we simply build a list of libraries using the first parameter of each use_library
command.
Next we create a luatcc context and declare three local variables. Most luatcc
functions return a boolean, success, and, if the boolean is false, an error
message, errmsg. Also, for more safety, we will call these functions through
pcall, and the first return value of pcall, indicating whether the call
succeeded, will be stored in result.
local context = new_context()
-- Compile file
local result,success,errmsg =
  pcall(context.compile, context, source, filename)
if not result then
  error("error loading module '"..modulename.."' from file '"..
    filename.."':\n\t"..success, 0)
end
assert(success, errmsg)
The first step is the compilation of the file. We use filename as the chunk
name passed to TCC since the source has been directly read from the file without
modification. On error we will throw a Lua error. As mentioned before, when
describing the searcher behaviours, a searcher has to return a string if it doesn’t
find a module, but it can throw an error if it finds the module but cannot load
it. This is the behaviour used by the Lua and C searchers, and we simply mimic
it here. Our first error case is when the pcall fails. In that case we throw
a simple error message containing the pcall error message, stored in pcall’s
second return value, here success. If the pcall succeeded but the compilation
failed, the assert will throw the appropriate error message.
-- Add libraries
for _,libdep in ipairs(libdeps) do
  result,success,errmsg = pcall(context.add_library, context, libdep)
  if not result then
    error("error loading module '"..modulename.."' from file '"..
      filename.."':\n\t"..success, 0)
  end
  assert(success, errmsg)
end
This step is similar to compilation. We simply call add_library for each library
declared in the module pragma directives. The error mechanism is the same as
before.
Here again we simply call the relocate method of the luatcc context, with the
same error handling as before.
-- Extract symbol
local chunk
result,chunk,errmsg = pcall(context.get_symbol, context,
  "luaopen_"..string.gsub(modulename, "%.", "_"))
if not result then
  error("error loading module '"..modulename.."' from file '"..
    filename.."':\n\t"..chunk, 0)
end
assert(chunk, errmsg)
return chunk
end
The last step of the loading mechanism is a bit different: instead of a success
boolean value, get_symbol returns a function, which is the module loader. If
all went well, the searcher simply returns the loader to require, which is
responsible for executing it.
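The symbol name looked up in this step follows the same convention as compiled C modules: the prefix luaopen_ followed by the module name with every dot replaced by an underscore. As a minimal sketch, the same mapping can be written in C as follows (loader_symbol is a hypothetical helper name, not part of luatcc):

```c
#include <stddef.h>
#include <string.h>

/* Build "luaopen_" followed by modulename with every '.' replaced by '_'.
   Returns 0 if out is too small to hold the result. */
static int loader_symbol(const char *modulename, char *out, size_t outsize)
{
  static const char prefix[] = "luaopen_";
  size_t plen = sizeof(prefix) - 1;
  size_t mlen = strlen(modulename);
  size_t i;

  if (plen + mlen + 1 > outsize)
    return 0;
  memcpy(out, prefix, plen);
  for (i = 0; i < mlen; i++)
    out[plen + i] = (modulename[i] == '.') ? '_' : modulename[i];
  out[plen + mlen] = '\0';
  return 1;
}
```

For the module name luatcc.loader this yields luaopen_luatcc_loader, which is the function the C source file of that module would have to define.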
local priority = #package.loaders+1
if type(package.tccpriority)=='number' and package.tccpriority>=1 then
  priority = math.min(priority, package.tccpriority)
end
table.insert(package.loaders, priority, search)
The first one will add the interpreted C module searcher to the searcher list
used by require: package.loaders. We simply use table.insert to add the
searcher to the searcher list; note that table.insert takes the position
before the value. You can specify the searcher priority through the
global variable package.tccpriority. If you don’t specify it, the position
defaults to #package.loaders+1, so table.insert adds the searcher at the end
of the list. This will give
interpreted C modules the lowest priority when several modules of different
type have the same name. To change that priority you can simply assign a
positive integer value to package.tccpriority before loading the luatcc.loader
module (the integer 1 is the highest priority).
package.tccpath = "./?.c"
The last line is simply the initialization of the path parameter used by the
searcher to locate the C source files. Here we look for modules in the current
directory, but that path could be extended to include standard system-wide
paths, just like package.path or package.cpath.
Conclusion
The main purpose of this gem was to show you how easy it is to add a completely
new kind of module to Lua. With only a few tens of lines, you can convert an
existing binding to some other form of programming into a module searcher. My
example, which loads uncompiled C modules, is just one illustration. With
the same principle, you could load Java classes or their .NET equivalents. You
could access some web services, just by loading some interface definition file.
You could load modules from Python, TCL, or Ruby.
Module searchers open up another line of development as well. A
new module searcher could simply change the way the code is located.
Instead of loading the modules from some directory according to
package.path, you could look for the modules online, either on some
company-specific intranet or on the open Internet. You could load the modules from a
compressed form, just unzipping them before actual loading. You could locate the
module in some big archive which contains all the data of your game. You could
add a versioning scheme just like the smart one present in RubyGems. You
could decrypt the module on the fly, or simply check it against a hash before
loading it.
The possibilities are endless. The Lua 5.1 module system is a mechanism
that makes Lua module distribution and management much simpler and clearer,
providing a standard. That standard has been cleverly designed: it empowers
programmers in a way that makes Lua interoperability with
other computing systems much easier than it was before.