Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Showing posts with label lisp. Show all posts
Showing posts with label lisp. Show all posts

Tuesday, January 7, 2014

Simple scripting in Common Lisp

I know enough Bash to read simple scripts, but I am always having a hard time when I am trying to write a script that does something that is more complicated than a one-liner that substitutes arguments into a template. I can eventually figure things out, but I have never bothered to develop my Bash skills, programming in it for me is error-prone and time-consuming.

So recently I have started experimenting with scripting in Common Lisp. It turns out to be very, very easy and convenient if you already know CL. I usually start scripting with a skeleton that looks like this:

(cl:defpackage #:script
  (:use #:cl)
  (:export #:run))

(cl:in-package :script)

;;; code comes here

I run things from SLIME and experiment with the REPL. When I think that I am approaching a solution, I write the main function run:

(defun run ()
  ...)

Then I use the following Makefile to compile it with cl-launch:

myscript: myscript.lisp
        cl-launch --lisp sbcl --file myscript.lisp --dump '!'   \
        --output myscript -i '(script:run)'

You can of course also include ASDF libraries, check the help of cl-launch.

For interfacing with the environment (pathnames, files, OS, etc), I find the UIOP compatibility library really helpful. It is included with ASDF, so most likely it is already on your computer.

Tuesday, December 10, 2013

cl-data-frame pretty printed column summaries

I have been refining printed summaries of data frames for my Common Lisp library cl-data-frame. I found that the following approach works best for me for quick eyeballing of data before any processing or analysis:

  1. Real numbers should be summarized by their range and the three quartiles (25%, 50%, 75%). This provides enough information to assess the variation and the "typical" values of the data.
  2. All other values should be summarized by their count and frequency. This is ideal for categorical data (called "factor" in R), and also for various encodings of missing data.
  3. When the column has both numbers and non-numbers, print both of the above. However, when it has very few distinct numbers, don't use quartiles for numbers, just print the frequencies.

For example,

(dframe:df :a #(nil nil nil 1 1 2 3 "missing" "missing"))

prints as

#<CL-DATA-FRAME:DATA-FRAME (1 x 7)
  :A 3 (43%) x NIL, 2 (29%) x 1, 1 (14%) x 2, 1 (14%) x 3>

while

(dframe:df :a (concatenate 'vector
                           #(nil nil nil "missing" "missing")
                           (clnu:numseq 0 100 :by 1/100)))

prints as

#<CL-DATA-FRAME:DATA-FRAME (1 x 10006)
  :A 10001 reals, min=0, q25=24.9975, q50=50, q75=75.0025, max=100;
     3 (0%) x NIL, 2 (0%) x "missing">

A more realistic example with a dataset I am currently working on that has both numeric, categorical, and missing ("") data:

#<CL-DATA-FRAME:DATA-FRAME (20 x 1082)
  :IDHH 1082 reals, min=7, q25=1351, q50=2548, q75=4073, max=5434
  :IDPERS bits, ones: 1082 (100%)
  :AGE 1082 reals, min=25, q25=41.17526, q50=49.492752, q75=59.814816, max=63
  :GENDER 832 (77%) x "weiblich", 250 (23%) x "maennlich"
  :MARITAL 515 (48%) x "geschieden",
           331 (31%) x "ledig",
           153 (14%) x "verwitwet",
           83 (8%) x "dauernd getrennt lebend"
  :SCHOOL 419 (39%) x "mittlere reife, realschulabschluss",
          338 (31%) x "volksschul-/hauptschulabschluss",
          217 (20%) x "abitur (hochschulreife)",
          87 (8%) x "fachoberschule, fachabitur",
          13 (1%) x "keine angabe",
          8 (1%) x "schule ohne abschluss verlassen"
  :WTTYPN 756 (70%) x "montag bis freitag",
          172 (16%) x "samstag",
          154 (14%) x "sonntag"
  :TMW 1082 reals, min=0, q25=0, q50=4.151436, q75=105, max=740
  :HOURS_MAINJOB 1082 reals, min=0, q25=0, q50=3.0288463, q75=9.53125, max=690
  :HOURS_ADDJOB 1082 reals, min=0, q25=0, q50=1.388621, q75=17.039537, max=450
  :THP 1082 reals, min=0, q25=141.15384, q50=263.13727, q75=386.55173, max=760
  :JUKIGR 664 (61%) x "anderer wert/trifft nicht zu oder kein wert vorhanden",
          165 (15%) x "10 bis unter 15",
          104 (10%) x "6 bis unter 10",
          80 (7%) x "18 bis unter 27",
          53 (5%) x "15 bis unter 18",
          16 (1%) x "27 und älter"
  :LEISURE 1082 reals, min=70, q25=437.74194, q50=601.8919, q75=752.069, max=960
  :USUAL_HOURS 1082 reals, min=300, q25=25138.87, q50=82259.11, q75=99997.82,
               max=99999
  :HHTYPE 664 (61%) x 1, 418 (39%) x 2
  :WORK bits, ones: 285 (26%)
  :MAINWAGE 1078 reals, min=0, q25=0, q50=36.24454, q75=153.93013, max=4100;
            4 (0%) x ""
  :ADDWAGE 1063 reals, min=0, q25=0, q50=6.7724867, q75=34.89418, max=1250;
           19 (2%) x ""
  :WAGE 1059 reals, min=0, q25=0, q50=16.293531, q75=49.222637, max=4100;
        23 (2%) x ""
  :NONWAGE_INCOME 1082 reals, min=0, q25=619.7917, q50=935.9649, q75=1293.9656,
                  max=4800>

This is the first time I used CL's pretty printer, so there might be a few bugs in there. The code is in the repository, you also need to update cl-num-utils.

Sunday, March 31, 2013

Deprecated libraries removed

I had some Common Lisp libraries on Github that I have abandonned a while ago, mostly because I have either

  1. found a better library by someone else,
  2. decided to follow alternative approaches, or
  3. write a better version with an incompatible API.

These libraries have been undergoing some quiet bit-rot for a while which makes them unusable without minor or not so minor fixes, for which I have no time; also, there are better libraries out there for the same purpose.

Having abandonned Common Lisp libraries out there is confusing and goes against the principle of consolidating Common Lisp Libraries, so I decided to remove some repositories. In particular, I have removed

  1. cl-2d, which didn't see any new development in the last two years. My recommended replacement is cl-flexplot, which uses PGF (a LaTeX package) as a backend.
  2. cl-text-tables was simply abandonned, I recommend the excellent cl-csv instead.
  3. cl-numlib is superseded by cl-num-utils.

The latter two suggested replacements are in Quicklisp.

If, for some strange reason, anyone is interested in the code of the dead libraries, just write me an e-mail. But I would prefer if these libraries stayed dead, as there are much better replacements out there.

It is very likely that I will remove other deprecated libraries in the future.

Monday, March 25, 2013

LLA update

Lisp Linear Algebra was updated today. This is a major update, so I increased the version number to 0.2. Compared to the previous version, all of the changes are internal and do not affect the already existing API, with the exception of an experimental destructive BLAS interface, which was contributed by Gábor Melis.

Gábor also made the necessary changes to the internals (which were not designed for destructive updates before), and re-enabled the array pinning interface for SBCL. This means that if you arrays have the same float type, and it is one of the four supported float types (single-float, double-float, and their complex counterparts), SBCL will not copy your arrays unless it has to (eg because they would be overwritten or need to be transposed). I am grateful for Gábor's contributions.

The destructive BLAS interface is still experimental, so the API may be refined in future versions.

Wednesday, February 20, 2013

updates to Common Lisp libraries

In the past month I updated many of my Common Lisp libraries. Here is a brief summary of the changes:

  1. cl-num-utils has been growing by accretion for a long time, which did not lead to the cleanest code. I decided to reorganize the library and split off functionally independent parts which may be useful on their own:
    • functions that operate on Common Lisp arrays have been moved to array-operations,
    • functions that handle array slices are now in cl-slice,
    • data frames are in cl-data-frame (which is still experimental, but works fine).

    The first two are meant to have very few dependencies, and in particular do not depend on cl-num-utils. I hope that cl-slice will serve as a basis for a Common Lisp API for slicing arrays and array-like objects.

    Parts of cl-num-utils have been disabled for now because I didn't have time to update their unit tests and I don't like to release untested code. If you need any of those, please open an issue on Github.

  2. lla had to be rewritten a bit since it depends on cl-num-utils. I moved special matrices from the former to the latter since they do not require foreign functions per se, and moved array stacking code to array-operations.
  3. cl-random also depends on cl-num-utils and had to be rewritten a bit. Again, not all functionality is enabled yet, feel free to open an issue if you need something urgently. I will re-enable everything in due time, but I would like to think more about testing multivariate distributions.
  4. Minor changes were made to cl-rmath and let-plus.

Monday, January 7, 2013

announcing trivial-project-pathname

I have been copy-pasting the same code over and over and decided to wrap it up in trivial-project-pathname, which is a very simple Common Lisp library for resolving pathnames relative to some directory, usually the sources.

The typical use case looks like this: I am working on, say, some data analysis and I have the following directory structure:

data/where data files are kept
latex/LaTeX sources
lisp/Common Lisp code, including ASDF system definition
plots/plots (potentially generated from Lisp)

I want to be able to write something like

(analyze-data-and-save-plot (project-path "mydata.csv" :data)
                            (project-path "myplot.pdf" :plots))

and have the data read from data/mydata.csv and the plot saved to plots/myplot.pdf.

This is how I would do it using this library:

(project-pathname:define project-path (:asdf "myproject" '(:relative :up))
  (:data "data/")
  (:plots "plots/"))

This finds the directory for the .asd file, takes its parent directory, and then defines a function that maps symbols to subdirectories.

Unless otherwise specified, NIL, the default for the optional parameter, would map to the base directory. You don't even need to use subdirectories, you can just

(project-pathname:define project-path2 (:asdf "myproject"))
;;; and then use
(project-path2 "my-data-file.csv")

You can also use *load-truename* to define *base-directory* in the .asd file as recommended by Zach Beane and then define

(project-pathname:define project-path (:directory *base-directory*))

Wednesday, November 7, 2012

array-operations resurrected

After reading a few papers on array languages, I decided to resurrect my array-operations library. Nope, this does not make Common Lisp into APL or Nial, but it does provide a concise DSL for array operations which should integrate well into Common Lisp.

The github page provides a quick tour of the library. The code is (or should be) portable, ANSI-conformant Common Lisp, and comes with unit tests for all functions.

The library previously available under this name (which has been dormant for 3 years) is still available, but it is deprecated.

Wednesday, October 17, 2012

cl-num-utils and utf8

I find Unicode math symbols very useful for mixing a bit of math and text — even though they don't even begin to approach the flexibility of LaTeX, all editors and browsers should be able to render them by now. When writing Lisp code, I only use them in docstrings. For example, cl-num-utils contains comments like this:

(defclass interval/infinite-left ()
  ()
  (:documentation "Left endpoint is -∞."))

Anton Vodonosov has recently reported that this causes a problem when the default external format is not utf8, also suggesting that I include an :encoding clause in the system definition. I have just pushed this change and I hope that this fixes the issue for everyone.

I appreciate the work Anton and others have put into cl-test-grid. I usually use a single implementation (the latest released SBCL), but I strive to make my code compatible with all implementations. This is much easier to achieve with cl-test-grid.

Wednesday, October 3, 2012

Chebyshev polynomials in cl-num-utils

I just pushed an update to the cl-num-utils repository, merging a branch which implements function approximation with Chebyshev polynomials.

For example,

(defun myfun (x)
  "Function that we want to approximate."
  (expt x -2))

(defparameter *myapprox* (chebyshev-approximate #'myfun (interval 2 10) 8)
  "Chebyshev approximation for MYFUN on the interval [2,10], using 8 Chebyshev
  polynomials.")

(myfun 5d0)            ; 0.04d0
(funcall *myapprox* 5) ; 0.04018023309369832d0

Approximation with Chebyshev polynomials is very fast and stable because it does not require a matrix inversion (Chebyshev polynomials are orthogonal), and thus it is a very nice procedure if you want to approximate a "smooth" function that is expensive to evaluate.

I also extended the interval code in cl-num-utils, now it handles open and infinite endpoints. This extension is experimental and not all functions understand it at the moment, but you can use my new extended-reals library to denote infinites, for example

(chebyshev-approximate #'sqrt (interval 0 (xr:inf)) 10)

would approximate \(x\mapsto\sqrt{x}\) on \([0,\infty)\). I will blog about extended-reals soon.

Tuesday, October 2, 2012

updates to let-plus

I have just pushed some updates to the let-plus Github repository. The most important change is the handling of ignored values — let+ used to ignore nil, for example

(let+ (((a nil b) '(1 2 3)))   ; OLD SYNTAX, NO LONGER VALID
  ...)

would ignore the second element of the list. The implementation of let+ would simply replace nil with gensyms in the AST and declare them ignored. This caused some unexpected problems, for example in

(let+ (((a nil &optional (c nil c?)) '(1 2 3))) ; OLD SYNTAX, NO LONGER VALID
  ...)

nil is a value, not a placeholder for a variable. I could have extended let+ to work handle this, but that would have required destructuring the lambda lists internally instead of simply relying on destructuring-bind. Being lazy, I decided to change the syntax. From now on, users should use &ign to ignore variables. For example,

(let+ ((#(a &ign b) (vector 1 2 3)))
  (list a b)) ; => (1 3), no warning about the middle element not being used

The change is incompatible with the previous version (nil is no longer understood as being ignored and will most likely cause an error during macroexpansion), so you need to update your code if you are using let+. I am sorry about any inconvenience this may cause.

I also extended let+ to understand nesting in more places (when applicable, mostly for read-only forms). For example,

(let+ (((&slots-r/o ((&complex a b) bar))
        (make-instance 'foo :bar (complex 1 2))))
  (list 1 2)) ; => (1 2)

As usual, bug reports are welcome.

Thursday, January 26, 2012

cl-cairo2: new maintainer, new license

cl-cairo2 was one of the first Common Lisp libraries I wrote, but I haven't been using it much for the last year or so (currently I am experimenting with cl-pdf as a backend for my new plotting library, which will be released soon). I have been pretty busy with research, so I didn't have time to merge (and test) patches, also, I didn't even contemplate updating the library to make use of the latest version of cairo. So when Ryan Pavlik contacted me about adding compatibility with cl-freetype2, I asked him whether he wants to take over as a maintaner. He kindly agreed, so I have transferred the repository to him, and he already merged a lot of patches.

One last thing that I wanted to fix before handing the repository over is the license. Originally, the library was licensed under the GPL — in retrospect, I think that

  1. I was very naive about software licenses, and didn't really understand what GPL means in the context if Common Lisp (I still don't claim that I do :-P), and
  2. I didn't realize that there are a lot of commercial applications in the Common Lisp world, whose authors are wary of depending on GPL'ed libraries.

Consequently, I received many complaints about the license of the library, and decided to change it. I picked the Boost Software License, and contacted all who contributed to the library for permission. All contributors approved the change, so now the library has a simple, permissive non-copyleft free software license. However, it is always possible that I missed someone, so if you contributed to cl-cairo2 in the past but didn't hear from me regarding the license change, please get in touch (eg via the Github issue tracker).

I would like to thank (in alphabetical order) Ala'a Mohammad Alawi, Jay Bromley, Pau Fernández, Johann Korndoerfer, Peter Mikula, and especially Kei Suzuki (who did the last major reorganization) for their contributions to the library (again, I apologize if I missed anyone). I would also like to thank Ryan for taking over — I am convinced that the library is in good hands.

Tuesday, November 1, 2011

Color schemes to prevent eye fatigue

I have been sitting in front of the computer quite a bit in the past few weeks, and I noticed that my eyes get very tired by the end of the day (I used to get back pain too, but stretching helped a lot). After a bit of reading I learned that the maximum-contrast white-on-black-based color scheme I was using is in fact extremely stressful for the eyes, and I started looking for something designed for the specific purpose of reducing eye fatigue. After checking out Zenburn for a few days, I started using Solarized by Ethan Schoonover. I find it really nice and relaxing, here is a screenshot of Emacs editing Common Lisp code:

New and updated libraries

Lately I have been trying to be more organized about updates to my libraries: I always try to keep the master branch on Github in a consistent state for all of them, and merge the development branches for all libraries at the same time. I have been doing that yesterday, here is a brief summary of the changes.

cl-num-utils mostly received minor fixes: I cleaned up the accumulator code a bit, and added a few minor convenience functions. The package now exports median, which causes a symbol conflict with alexandria:median so you need to shadow if you are using both. Maybe I should ask the developers of Alexandria to make mean, variance and median generic.

LLA underwent a medium-scale rewrite: I sanitized the type system (internal types are no longer exposed), cleaned up object creation functions (CLO replaced with HERMITIAN, LOWER, etc), and wrote specialized routines for array sharing which are about 30 times faster than they used to be — this of course doesn't mean that LLA is now 30x faster, just that they overhead for sharing arrays got a lot smaller. Since many operations are \(O(n^\alpha), \alpha \geq 2\), this is mostly noticeable for small arrays. Note that this is still generic CFFI-based code, preliminary tests indicate that implementation-specific tricks could yield array copying code that is 2-5x faster, but currently I have neither the time nor the inclination to write this, because I would rather wait for other parts of LLA to settle down before I micro-optimize. However, if you find that array copying overhead is critical for you, please let me know and I will see what I can do — this rewrite was prompted by the benchmarks sent by Robert Brown.

Realizing that implementing special functions in Common Lisp is perhaps not the best use of my time, I decided to use a foreign library for this purpose in cl-random. I chose R's standalone math library, with an SWIG-generated wrapper (it took about 30 minutes to have this library running, SWIG is amazing). I made the result available as a small standalone library called cl-rmath in case anyone else wants to use it. I really appreciate that the R developers made this available as a standalone library, it makes my life much easier. Besides these changes, I only made small changes to cl-random, for example, the univariate normal now has the syntax (r-normal mean variance) instead of the standard deviation, to be more consistent with the multivariate case.

Finally, it looks like cl-2d refuses to die, so I fixed some of the minor bugs that were the result of other libraries moving on. I am working on a replacement library, but I will probably keep cl-2d updated until the new library is ready. It amazes me how many people are using cl-2d: I get questions and bug reports all the time. It is not a bad library, but I wrote it when I was a CL newbie and I hope that I can write a much better one — we shall see. In any case, if you are not using cl-2d, I would recommend that you don't start now.

Tuesday, September 6, 2011

Is anyone using cl-2d?

I wrote a Common Lisp plotting library called cl-2d in 2009, and I have been using it for my own graphs ever since. I am currently working on a library that implements the same functionality but with a much more friently interface and a different engine, using LaTeX and pgf to generate the plots instead of cairo. This gives me a lot of useful features out of the box, including very nicely formatted formulas if I want to use them in annotations, and it is reasonably fast for my purposes.

I plan to remove the existing cl-2d repository from Github, put up a new one (it is a complete rewrite from scratch, there isn't much sense in carrying around the old source), and recycle the name (for both the package and the library).

I don't think that there is anyone out there using the current version of this library, but in case you are, please let me know soon.

Friday, September 2, 2011

Persistent misconceptions about Lisp

I am in the process of implementing (yet another) graphing library Common Lisp, one that would conform to my requirements better (I will blog about it later on). In order to understand some issues better, I decided to read The Grammar of Graphics by Leland Wilkinson. The book is very well written and clearly organized, and I think it will turn out to be very useful for what I am doing. However, I encountered a couple of misconceptions about Lisp in the Introduction. On page 4, the author claims that

Some languages, like Simula, Smalltalk, and Java, make it easy to implement objects and difficult to implement procedures. Others, like C++, enable development of both objects and procedures. And others, like C, Pascal, Ada, BASIC, Algol, Lisp, APL, and FORTRAN, make it difficult (but not impossible) to develop objects.

O RLY? Most users of CLOS find both objects and "procedures" very easy to use, and at least on par with other OO languages. Moreover, I don't understand how Lisp ended up in the company of C, which doesn't have any OO whatsoever.

On page 5, we are told that

Lisp is ideal for manipulating lists of words and symbols because it is a list processing language.

Strictly speaking, this is true, but Lisp is much more than a list processing language: it is also used to tackle numerical problems and usually does a very good job. Compiled code can be very fast, especially with implementations like SBCL, and Lisp has a very nice foreign function interface library.

That said, from what I have read so far, the book appears to be excellent, and the author is clearly an expert on the topic. Statements about programming languages are tangential to the book in any case, so I don't understand why the author found it necessary to talk about a language that he is not so familiar with.

Thursday, July 21, 2011

Library configuration in Common Lisp: an example

I have been dodging the issue of configuration for LLA for a while, but eventually I had to come up with a solution for the revised version that I pushed to Github yesterday.

LLA users may need to configure two things: the location and name of libraries that they want to use, and whether LLA should use 64-bit integers to interface with BLAS/LAPACK (this is rarely needed, even on 64-bit platforms, as many implementations still use 32-bit integers). In the previous versions of LLA, I tried to do this with read-time conditionals (based on trivial-features) but this was not satisfactory as it is impossible to figure out the list of libraries from nothing but the platform information — for example, a Linux user could use ATLAS or MKL.

I read Maintaining Portable Lisp Programs by Christophe Rhodes and also asked on cl-pro, where I got a lot of good suggestions and decided to follow the advice of Pascal Costanza. This is how configuration works at the moment for LLA:

  1. The user may define a variable *lla-configuration* in the CL-USER package. The variable should contain a plist of configuration options, eg :libraries. The variable does not need to be bound, and it can be NIL or incomplete: the user only needs to deal with configuration when he wants to override the defaults of the library. LLA makes an effort to come up with sensible platform-specific defaults.
  2. When loaded/compiled, LLA checks whether cl-user::*lla-configuration* is bound, and if it is, it uses the corresponding values. If the variable is not bound or doesn't contain the desired property, a default value is used instead.

Internally, some features are implemented by pushing symbols like lla::int64 to *features*. I have do admit that I didn't even consider the possibility of package qualifiers in read-time conditionals before I read the appropriate sections of the Hyperspec, but in retrospect it makes perfect sense as it helps to avoid name clashes. LLA also removes its own symbols from *features* if they don't belong there: this means that you can reload the library with a different configuration without restating your CL image.

Wednesday, July 20, 2011

LLA reorganization almost finished, new version on Github

I have almost completed the reorganization of LLA and merged it to the main branch on Github (sorry, no installation via Quicklisp until the library is fully stable, for the moment you have to check it out from Github). Make sure that you get the latest dependencies, most importantly cl-num-utils and let-plus.

This version of LLA boasts a reorganized interface: most importantly, now it works with plain vanilla Common Lisp arrays:

LLA> (defparameter *a* #2A((1 2) (3 4)))
*A*
LLA> (mm t *a*) ; same as A^T A, but LLA recognizes that the result is Hermitian
#<HERMITIAN-MATRIX 
  10.00000        *
  14.00000 20.00000>
LLA> (elements *)  ; Lisp arrays inside all special matrices
#2A((10.0d0 0.0d0) (14.0d0 20.0d0))
LLA> (svd *a* :thin)
#S(SVD
   :U #2A((-0.40455358483375703d0 0.9145142956773045d0)
          (-0.9145142956773045d0 -0.40455358483375703d0))
   :D #<DIAGONAL 
  5.46499       .
        . 0.36597>
   :VT #2A((-0.5760484367663207d0 -0.817415560470363d0)
           (-0.817415560470363d0 0.5760484367663208d0)))
LLA> (elements (svd-d *)) ; Lisp arrays again
#(5.464985704219043d0 0.3659661906262576d0)

LLA uses clever tricks to avoid transposing when it can. Row-major and column-major representations are transposes of each other, and instead of calculating an SVD as $$A=UDV^T$$ LLA justs calculates $$A^T={V^T}^T D^T U^T$$ and returns \(U\) as \(V^T\) and vice versa. Of course transposing cannot be avoided all the time, but it is not needed often and should not impose a large performance penalty. So LLA can work with row-major Common Lisp arrays natively — in fact, that's the only array representation it supports.

This version of LLA has a reorganized DSL for BLAS/LAPACK calls which makes it really easy to wrap Fortran functions in general, not just LAPACK/BLAS functions for which it has extensions. For example, this is how Cholesky factorization is implemented:

(defmethod cholesky ((a hermitian-matrix))
  (let+ ((a (elements a))
         ((a0 a1) (array-dimensions a)))
    (assert (= a0 a1))
    (lapack-call ("potrf" (common-float-type a)
                          (make-instance 'cholesky :left-square-root 
                                         (make-lower-triangular-matrix l)))
                 #\U (&integer a0) (&array a :output l) (&integer a0) &info)))

The first two lines are just bookkeeping: the array A containing the elements of the hermitian (symmetric) matrix (in the lower triangle) is extracted, and its dimensions are examined. LAPACK-CALL figures out whether to use SPOTRF, DPOTRF, CPOTRF, ZPOTRF from the element type of the matrix, allocates memory for atoms (recall that Fortran takes pointers) using the macros, and automatically examines the INFO parameter at &info to catch exceptions. The #\U in the code suggests that storage is upper-triangular, even though in CL we store elements of the hermitian matrix in the lower triangle: this is because we avoid transposing. I find that it is really easy to write wrappers to Fortran code using this macro family (there is also a BLAS-CALL and a LAPACK-CALL-W/QUERY that queries workspace sizes), and it should be easy to write a FORTRAN-CALL for general Fortran code, or even a wrapper for C functions that handle arrays. Let me know if you are interested in any of these, I would be happy to add them or even factor out this part of LLA into another library.

By the way, contrary to what my previous post on LLA said, you no longer need the C wrappers in MKL or CLAPACK, so ATLAS or something similar is just fine. The README has instructions on how to set up these libraries (unless you want to compile your own, a single apt-get install or similar is enough to satisfy LLA's dependencies) and configure custom library locations.

Even though LLA is now perfectly usable and all the unit tests work just fine, there are still some things left to do:

Write the array implementation-specific sharing macros.
LLA uses a few macros to make arrays available for foreign functions, and all the wrappers are built around these. The idea is to avoid copying if the implementation supports it, and otherwise fall back to copying. I decided to focus my efforts on getting LLA to work again first, and this means that currently all operations fall back to copying, which should entail a slight performance penalty. I will of course remedy this at some point, first for SBCL and ECL and then for the other distributions, but I am waiting for the rest of LLA to stabilize before I start fiddling with low-level optimizations.
Eigenvalues/eigenvectors are not yet implemented.
LAPACK's handling of eigenvalues and especially eigenvectors is a mess, and I need to dust off the code that picks out paired eigenvectors. It should be fairly easy, but I prefer not to add functionality without thorough testing, so currently I am postponing this until someone needs it (just drop me a line).

Also, even though I wrote a lot of unit tests, it is of course possible (and probable) that LLA has bugs. Please report them on Github.

I would like to thank all users of LLA for helpful suggestions and for the gentle nudging I received during LLA's reorganization. I believe that LLA fills an important niche, and I will continue to work on it.

Monday, June 27, 2011

LLA reorganization almost done

As many of you have noticed, the Lisp Linear Algebra library is currently undergoing a major reorganization. The new branch is incomplete, but what is there is perfectly is usable and passes all unit tests. The purpose of this post is to give a preview of changes and invite LLA users to experiment with the new version before it is "released".

Why the new version?

Also relying on LAPACK, the previous version of LLA used a column-major representation. This meant that even though matrices were represented using Common Lisp arrays (in particular, vectors) under the hood, they had to be wrapped to indicate that they were column-major. However, less then a year ago the LAPACK Team finally standardized the C interface for the library, which now provides row-major matrices. This means that you can use Common Lisp arrays in LLA:

(let ((a #2A((1 2)
             (3 4)))
      (b #(5 7)))
  (solve a b))

which computes \(A^{-1}b\), evaluating to

#(-3.0d0 4.0d0)

Note that the result contains double floats instead of integers: in fact, in implementations that support it, the element type of the array will be double-float.

This is because most LLA operations work like this:

  1. Determine the narrowest common element type of each argument. Unless the element type of the array is something specific that LLA can recognize (eg single and double floats and their complex versions), LLA needs to traverse the array, which costs a bit in terms of speed. Use arrays with specialized element types to avoid this cost. Integers and rationals are converted to floats (you can control which, double is the default).
  2. Determine the narrowest common element type of the pertinent arguments. For example, in the code above, LLA decided to use double floats, which also determines the element type.

The use of Common Lisp arrays made LLA much more transparent without any noticable overhead, especially if you use arrays that have the appropriate element types (instead of T). Even if you don't, LLA functions will return arrays that have specialized element types, so if you chain LLA operations, your code should be efficient without any particular effort.

Hermitian and triangular matrices are now wrappers around simple two-dimensional arrays. You should use as-array to convert to a dense matrix.

Required libraries

You need an implementation of LAPACK that provides the C interface. Currently, the two available choices are LAPACKE and Intel's MKL. I have been using the latter to develop the new version of LLA, so I don't have instructions on how to compile and set up LAPACKE with other libraries. If you figure it out, please write it up and I will post it in the README file of LLA.

If you install MKL on Linux, all you have to do is tell the linker where the libraries are: for example, all I had to do was create the file /etc/ld.so.conf.d/mkl.conf with contents

/opt/intel/mkl/lib/intel64
/opt/intel/composerxe/lib/intel64

and run ldconfig.

Portability

LLA uses CFFI, and should be perfectly portable. However, implementations that don't support arrays with element types single-float, double-float, (complex single-float) and (complex double-float) will suffer a performance penalty because LLA has to check the array elements to determine the narrowest type.

State of the new version

Everything works except eigenvalues and singular value decompositions. Drop me a line if you are using the new version and need either one of those, I am happy to implement them — it should be relatively easy, but I have kept features to the minimum while the library is in flux.

In principle, LLA should be clever enough to avoid copying of arrays when possible (ie no type conversion is necessary and the implementation supports foreign arrays — most of them should), but currently I am using the CFFI backend for testing.

As always, feedback, bug reports and suggestions are very welcome.

Monday, May 23, 2011

Introducing LET+, a destructuring extension of LET*

Why another destructuring library?

This Common Lisp library extends the syntax of let*. In that respect it is pretty similar to other destructuring libraries, most importantly Gary King's excellent metabang-bind. I have been using the latter for years now, but at some point I decided to write a library of my own, aiming for a cleaner syntax, more concise implementation and a more consistent interface (whether I have succeeded is of course a matter of judgement — try metabang-bind to see if you like it better).

This library differs from metabang-bind in the following ways:

  1. Placeholder macros like &slots, mainly for providing editor hints (completion, eg syntax reminders in the Emacs status bar with SLIME, displaying of arguments, etc).
  2. More consistent naming of destructuring forms. In particular, when both read-write and read-only forms are available the latter always have the -r/o suffix
  3. &flet and &labels resemble the Common Lisp syntax more closely.
  4. The code is extremely simple (< 300 loc, not counting unit tests), and the library should be easier to extend.

Examples

Lists are destructured unless recognized as something else (values mapping to NIL are ignored):

(let+ (((a (b &optional (c 3)) nil &key (d 1 d?)) '(1 (2) 7 :d 4)))
  (list a b c d d?))  ; => (1 2 3 4 T)

Slots and accessors have a syntax similar to with-slots:

(defclass foo-class ()
  ((a :accessor a :initarg :a)
   (b :accessor b-accessor :initarg :b)))

(let+ (((&slots a (my-b b)) (make-instance 'foo-class :a 1 :b 2)))
  (list a my-b))  ; => (1 2)

(let+ (((&accessors a (b b-accessor)) (make-instance 'foo-class :a 1 :b 2)))
  (list a b))  ; => (1 2)

Slots in structures can be accessed by giving the conc-name:

(defstruct foo-struct c d)
(let+ (((&structure foo-struct- c (my-d d)) (make-foo-struct :c 3 :d 4)))
  (list c my-d))  ; => (3 4)

but if you use defstruct+ to define your structure, you automatically get a corresponding deconstructing form:

(defstruct+ interval left right)

(let+ ((interval (make-interval :left 1 :right 2))
       ((&interval left right) interval))
  (incf right 10)
  interval)  ; => #S(INTERVAL :LEFT 1 :RIGHT 12)

Multiple values are also supported:

(let+ (((&values a nil b) (values 1 2 3)))
  (list a b))  ; => (1 3)

You can access array elements in many different ways:

(let+ ((#(a nil b) (vector 1 2 3)))
  (list a b))  ; => (1 3)

(let+ (((&array-elements (a 0 1)
                         (b 2 0))
        #2A((0 1)
            (2 3)
            (4 5))))
  (list a b))  ; => (1 4)

You can use &flet (and &labels, if the function needs to refer to itself) to write functions:

(let+ (((&flet add2 (x)
          (+ x 2))))
  (add2 5))  ; => 7

&plist and &hash-tables allow access to elements property lists and hash tables.

Read-only forms

All of the &... forms have a read-only version, eg &slots-r/o. With these, the values are read at the beginning, and you can modify them without modifying the original structure. Read-only forms may also give you a slight increase in speed, and promote better style by indicating that you are not modifying the original structure.

How to get it

The library is a available on Github, under the Boost Software License (Version 1.0). The README file there serves as the documentation, see the source code and unit tests for mode details. As always, bug reports and suggestions are welcome.

Wednesday, October 6, 2010

The status of cl-sparsematrix

In 2008, I had to solve some linear equations of the form Ax=b, where A was a sparse matrix (usually a design matrix for B-splines). I wrote up a library called cl-sparsematrix in one afternoon, and put it on the web.

This was not a sophisticated library. It did three very basic things: it helped construct a sparse matrix using a hash-table, had a function to pack that up in CSC format, and finally, a function to call UMFpack via CFFI to solve the equation mentioned above. This was it. It worked fine for what I wanted to use it, but it was not a sophisticated sparse matrix library.

I didn't have a need for sparse matrices ever since, and the library succumbed to bit rot: some other libraries it depended on moved on, quite substantially in some cases, and now it doesn't even load cleanly. But I still get a few e-mails every month asking about this library. So apparently there is a need in the CL community for a sparse matrix library. I have no time to clean it up at the moment, and you are probably better off writing your own. The library is no longer available online, because it doesn't work and I don't want to mislead people who are looking for a sparse matrix library.

When LLA is finalized, I will include sparse matrix functionality at some point. But LLA is undergoing (yet another) major reorganization now — I think I finally figured out how to interface native Lisp arrays to LAPACK in the implementations that support pinned arrays. So sparse matrices are not on the immediate horizon. Unless, of course, I need to solve sparse systems again, and hack up yet another quick-and-dirty solution.