Group-By: A Common Lisp library to help group data into trees

A recurring problem I have experienced while programming is the need to convert a flat list of data into a more complex tree data structure. This is especially common when dealing with results from relational databases (where all data is intrinsically flat, and queries return tables of data). To solve this problem I wrote a small library named group-by (in honor of the SQL operator that performs much the same task).

The easiest example:

(group-by '((a 1 2) (a 3 4) (b 5 6)))
=> ((A (1 2) (3 4)) (B (5 6)))
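By default group-by keys off the first element of each item; as I recall the API (verify against the project page), :key and :value arguments let you control what is grouped and what is stored:

```lisp
;; :key picks the grouping key from each item; :value picks what is
;; stored under that key. These choices reproduce the default behavior
;; shown in the example above.
(group-by '((a 1 2) (a 3 4) (b 5 6))
          :key #'first
          :value #'rest)
;; => ((A (1 2) (3 4)) (B (5 6)))
```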

A more concrete example comes from trac, the ticketing system we use. Trac tickets contain fields for author, project, milestone, and summary (among others). When displaying this data, my project manager wants to be able to see what everybody is working on (a tree view organized by author, project, and milestone), as well as what is being worked on in a project and by whom (a tree view organized by project and milestone). To accomplish this I pull a flat list of ticket objects from the database (using a clsql-orm-generated class), then create a tree from this flat data by calling make-grouped-list. A standard recursive tree walk can then render the desired organization directly.

Example Call:
(make-grouped-list tickets
                   :keys (list #'author #'project #'milestone)
                   :tests (list #'equal #'equal #'equal))
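The recursive tree walk mentioned above can be sketched roughly as follows, assuming the grouping came back as a nested alist of (key . children) and that the leaves are the ticket objects themselves (render-group is a hypothetical helper, not part of the library):

```lisp
(defun render-group (node &optional (depth 0))
  ;; Inner nodes look like (key child1 child2 ...); leaves are tickets.
  (if (consp node)
      (destructuring-bind (key &rest children) node
        (format t "~vT~a~%" (* 2 depth) key)   ; print the grouping key
        (dolist (child children)
          (render-group child (1+ depth))))    ; recurse one level deeper
      (format t "~vT~a~%" (* 2 depth) node)))  ; print a leaf ticket
```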


Group-by supports grouping into alists, hashtables, and CLOS tree-nodes. To hide the differences between these implementations, I created a grouped-list CLOS object that manages all of the grouping and presents a unified interface over each implementation strategy. I support all three because the right choice depends strongly on the workload you anticipate performing with the tree. Grouping once and then recursively rendering the tree is often more efficient with an alist than with a heavier-weight data structure; conversely, hashtables tend to perform better when there are many accesses into the grouping structure.
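The trade-off itself is plain Common Lisp, nothing specific to grouped-list: an alist lookup scans linearly, which is fine for a single rendering pass, while a hashtable costs more to build but stays fast under repeated access:

```lisp
;; Plain-CL illustration of the alist vs. hashtable trade-off described
;; above (this is not the grouped-list API).
(let ((alist '((a . 1) (b . 2)))
      (hash  (make-hash-table)))
  (setf (gethash 'a hash) 1
        (gethash 'b hash) 2)
  (list (cdr (assoc 'a alist))   ; O(n) linear search
        (gethash 'a hash)))      ; ~O(1) hashed lookup
;; => (1 1)
```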

To see more runnable examples, please check out the project page and the examples file.

CL-Creditcard & CL-Authorize-net: Processing credit card payments with Common Lisp

I just pushed cl-creditcard to my github account. CL-Creditcard (and its sub-library cl-authorize-net) is a library that we use to process payments with Authorize.Net. We have a large internal application that tracks and manages all of our business and customer logic, including billing and invoicing (invoices are generated using cl-typesetting). This application charges credit card payments through cl-authorize-net.

It has been stable and charging cards for years; I just got around to releasing it. Soon it will also support ACH (echeck) transactions, and we will be moving from lift to lisp-unit.

As with all payment processing, test very well before putting into production :)

CLSQL-ORM: Turn your existing database schema into Common Lisp CLSQL-View-Classes

CLSQL-ORM is a Common Lisp library I just pushed to my github account. Its primary goal is to generate CLSQL view-classes based on an existing database. It uses the “information_schema” views to introspect relevant data from the database, then builds and evals the view-class definitions. This project is a significant branch of clsql-pg-introspect that attempts to remove the “pg” aspects in favor of the standard “information_schema”. I may have changed some semantics/conventions along the way (I’m not sure, as I didn’t use the original project much, and that was long ago).

I wanted to generate my lisp objects from the database for a couple of reasons. First, I am fairly comfortable with SQL databases and am used to specifying them in whatever variant of SQL the database engine supports. Second, I am most often presented with an extant working database that I want to interact with (such as a wordpress install), where the schema can change, and I just want my Common Lisp to match whatever the database says rather than trying to keep both up to date manually. Obviously this project encodes many of my own personal thoughts and tastes about databases, which may not be the same as yours. It is perhaps best thought of as a jumping-off point for creating your own personal Common Lisp ORM, though it should be usable as is if your tastes and mine coincide.
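Typical usage looks something like the following sketch; the entry-point name and arguments here are from memory and may not match the current README, so treat them as assumptions:

```lisp
;; Connect, then introspect information_schema and define view-classes.
;; GEN-VIEW-CLASSES and its (lack of) arguments are my recollection of
;; the API, not a documented signature.
(clsql:connect '("localhost" "mydb" "dbuser" "dbpass")
               :database-type :postgresql)
(clsql-orm:gen-view-classes)
```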

Symbol-Munger: A common lisp library to help convert symbols between their representations in various systems

I just posted a small Common Lisp library to my github account named Symbol-Munger. Symbol-Munger provides functionality to ease conversion of the same symbol between different environments.

For example, when generating Common Lisp classes from a database schema, I want to change column names into lisp slot/accessor names, and later, when displaying that slot on screen, I want to show its slot-name as an English column header (i.e., my_db_col → my-db-col → My Db Col).

I have had this code lying around for years and use it frequently. Earlier today, while cleaning other code, I ran across this TODO in vana-inflector and realized I already had code that performed this function. The library contains one function that does the bulk of the work (normalize-capitalization-and-spacing) and many functions that supply default arguments to it (lisp->english, english->camel-case, english->keyword, etc.). Its only dependency is iterate.
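With the exported names just mentioned, the conversions from the example above look like this (the results follow the my_db_col chain shown earlier; I have not re-verified the keyword case):

```lisp
;; Using the functions named above; results follow the example chain
;; my_db_col -> my-db-col -> My Db Col.
(symbol-munger:lisp->english 'my-db-col)     ; => "My Db Col"
(symbol-munger:english->keyword "My Db Col") ; presumably :MY-DB-COL
```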

CSS-Selectors: quicker compiler with lambda trees

CSS-Selectors and TALCL are both, at heart, translators from some language (css-selectors and XML respectively) into compiled Common Lisp functions. We translate these expressions by turning the source language into a tree of Common Lisp code, inserting that code into a lambda form, and then calling compile on that lambda. Because this is a central part of the utility they provide, slow compilation can be somewhat annoying. This led me to implement compiler macros and other caching schemes so that compilation could be handled at compile time (when possible/constantp), where the slowness would be less problematic.

A while back I read an article by Xach about compiling queries without calling either “eval” or “compile”, by utilizing lambda-building functions. He wrote that this technique yielded much quicker compilation with no real loss of run-time performance. To investigate, I wrote a second compiler for CSS-Selectors using this technique, then wrote tests to compare the new compiler with the old.

Below are the results comparing the two compilers. My experiment confirmed Xach’s findings from 2007: the lambda-tree version of the compiler is MUCH quicker (~1000x in SBCL) with comparable execution speed. This hasn’t sped up the (cl-yacc-produced) parser, so I think much of the caching and compiler-macro code is still valid and still saving user time, just less crucially so now. The only downside I saw was that debugging became a touch harder, because there is no human-readable version of the fully translated expression (which I could get from the old translator).
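For reference, the lambda-building technique amounts to composing closures instead of constructing a form and compiling it. A minimal sketch (illustrative only, not css-selectors’ actual compiler):

```lisp
;; Each selector fragment becomes a closure over already-compiled code;
;; fragments combine with higher-order functions, so neither EVAL nor
;; COMPILE is ever called when a new selector is built.
(defun match-tag (name)
  (lambda (node) (string-equal (dom:tag-name node) name)))

(defun match-and (&rest matchers)
  (lambda (node)
    (every (lambda (m) (funcall m node)) matchers)))
```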


15:37:03 BEGIN Running Test TEST-COMPILER1-SPEED | form translation and compile
Evaluation took:
7.266 seconds of real time
7.140000 seconds of total run time (5.840000 user, 1.300000 system)
[ Run times consist of 1.780 seconds GC time, and 5.360 seconds non-GC time. ]
98.27% CPU
606 forms interpreted
6,262 lambdas converted
18,119,140,305 processor cycles
683,883,136 bytes consed

15:37:11 BEGIN Running Test TEST-COMPILER2-SPEED | lambda trees
Evaluation took:
0.008 seconds of real time
0.000000 seconds of total run time (0.000000 user, 0.000000 system)
0.00% CPU
19,072,358 processor cycles
1,572,384 bytes consed

15:37:02 BEGIN Running Test TEST-COMPILER1-EXECUTION-SPEED
Evaluation took:
0.907 seconds of real time
0.890000 seconds of total run time (0.880000 user, 0.010000 system)
[ Run times consist of 0.080 seconds GC time, and 0.810 seconds non-GC time. ]
98.13% CPU
2,261,249,550 processor cycles
39,185,920 bytes consed

15:37:11 BEGIN Running Test TEST-COMPILER2-EXECUTION-SPEED
Evaluation took:
0.881 seconds of real time
0.850000 seconds of total run time (0.840000 user, 0.010000 system)
[ Run times consist of 0.040 seconds GC time, and 0.810 seconds non-GC time. ]
96.48% CPU
2,196,945,765 processor cycles
39,155,152 bytes consed

Source Code:

CL-Mediawiki Moved to GitHub

I have moved CL-Mediawiki to github.  Why, you might ask?

  • Better authentication / user scheme (and other people manage it)
  • Pull / merge requests and request viewer (essentially I think github has a better story for distributing patches than repo.or.cz’s mob branch)
  • Everything about the project in one place; no longer need both trac and repo.or.cz, just one github location
  • Prettier git-web
  • I am more comfortable with the github interface
  • It has a name and url I can remember without google’s help

I am still in the process of changing all the links, and I will keep the repo.or.cz links up (at least for a bit) so that people pointing to them don’t get lost.

Reliable Common Lisp Builds with Quicklisp

Earlier this week I released a couple of Common Lisp libraries, and for the rest of the week I have tried to make them usable by anyone who is not me.  The situation is complicated by a few facts about our development environment.  Our shared lisp library directory has accreted over more than 5 years, and some of these libraries have been patched (some in darcs, some in git, and most not patched upstream).  We also make heavy use of precompiled cores to speed up start-up time.  All of this leads to an environment that is very hard to replicate.  Quicklisp to the rescue! Quicklisp makes it easy to have a lisp environment with the same set of libraries as everybody else, which is a tremendous win compared to our difficult-to-replicate system.

To fix the bugs other people would see in software that builds and works fine for me, I wrote an sbcl script to perform a clean build and fetch all unknown dependencies from quicklisp.  (This script is not portable Common Lisp; there is sb-xxx all over the place.)  I learned quite a bit about all of the related systems, so although the end result seems somewhat trivial, perhaps it will provide decent examples. Quicklisp is easy to install, and makes deploying usable Common Lisp libraries MUCH easier.
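The heart of such a script can be sketched in a few lines (the path is an example, this is not the actual run-builds.lisp, and asdf:test-system assumes a reasonably modern ASDF):

```lisp
;; Bootstrap quicklisp, pull the system plus all of its dependencies,
;; then run its test-op via asdf.
(load "~/quicklisp/setup.lisp")
(ql:quickload :buildnode)
(asdf:test-system :buildnode)
```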

By running the following script, I can load and test the buildnode system using only libraries pulled from quicklisp.
~$ sbcl --script ~/run-builds.lisp --test-system buildnode --quicklisp-only

... Style warnings and other build detritus...
** TEST RESULTS: Buildnode **
-----------
ATTRIB-MANIP: 9 assertions passed, 0 failed.
CLASS-MANIP: 7 assertions passed, 0 failed.
TEST-ADD-CHILREN: 1 assertions passed, 0 failed.
TEST-BASIC-HTML-DOC: 1 assertions passed, 0 failed.
TEST-FLATTEN-&-ITER-DOM-CHILDREN: 1 assertions passed, 0 failed.
TEST-INSERT-CHILREN: 3 assertions passed, 0 failed.
TEST-ITER-CHILDREN: 6 assertions passed, 0 failed.
TEST-ITER-NODES: 6 assertions passed, 0 failed.
TEST-ITER-PARENTS: 4 assertions passed, 0 failed.
TEST-TEXT-OF-DOM: 2 assertions passed, 0 failed.
TOTAL: 40 assertions passed, 0 failed, 0 execution errors.
------ END TEST RESULTS ------
... MORE style warnings and other build detritus...

I hope to incorporate this script into an automated lisp builder soon.

Announcing Buildnode, CSS-Selectors & TALCL

I have just pushed three Common Lisp libraries to my github account.  We have been using Buildnode and TALCL internally for quite some time; CSS-Selectors is something of a learning project I just wrote.

Buildnode

Buildnode is a library that makes working with CXML DOM documents and nodes easier.  It smooths some of the DOM interfaces to be a bit nicer and more in line with Common Lisp conventions.  We use buildnode primarily to generate the output of webpages hooked up to our extensively modified UCW lisp web server.  We also use it to generate excel spreadsheet XML and google earth KML.  It facilitates writing small generation functions that can be built up and combined in any way.  We also use it to generate “tag” packages, which are packages of functions that build the XML tree for us (see the example).

Primary Features

  • Iterate drivers for the dom (in-dom-children, in-dom-parents, and in-dom)
  • DOM manipulation functions such as set-attribute and add-children that return the node they are manipulating, to ease stringing many calls together and then appending the result to the dom
  • TAG Packages that make a library of functions for interacting with a specific XML dialect
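Because the manipulation functions return the node, calls chain naturally; a sketch (argument order here is my assumption, not a documented signature):

```lisp
;; Build a div, set an attribute on it, and append it plus a text node
;; to PARENT in one expression; each call returns its node.
(add-children parent
  (set-attribute (dom:create-element doc "div") "class" "note")
  (dom:create-text-node doc "hello"))
```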

TALCL

TALCL is a branch of the ARNESI/YACLML/UCW Template Attribute Language, which in turn was a branch of the Template Attribute Language originally developed in python for Zope.  We think this version of TAL is much improved over the one originally shipped with UCW: it is simpler to use, integrates better with the lisp environment, and has simpler evaluation rules.  TALCL is divorced entirely from the UCW/ARNESI/YACLML stack and should be a usable choice for any templating need (though it certainly specializes in XML templating).  We use this library for HTML templates that our designers can work with, as well as for processing both plain-text and HTML email templates in our internal billing software. Examples can be found in the repository.

CSS-Selectors

CSS-Selectors is a library that implements a parser for css-selectors level 3 and provides a compiler that turns these selectors into node-matcher functions.  It also provides a query function (much like jQuery) that can be used to retrieve a list of matching nodes from the dom.  I use this to select dom nodes to manipulate from the output of an (unreleased) form-controls library, and for pulling information out of DOM documents.  When the selector is a constant, compilation into a node-matcher function occurs at compile time.
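Querying looks roughly like this (I am assuming the exported name; check the repository for the exact package):

```lisp
;; Return a list of dom nodes matching a level-3 selector.
(css-selectors:query "div.ticket a[href]" document)
```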


Aqua Data Studio OSS License Program rocks!

I have been using Aqua Data Studio 7 since its release (and had used a couple of versions before that, I believe). I really enjoy using ADS, but have been wanting a new version for quite a while. Today while browsing their site I saw that a version 9 beta is available, and also that they have a FREE license available for OSS developers (they also offer student licensing). This is seriously quality software that makes my life much easier. If you work with a variety of database systems, you can’t do much better than Aqua Data Studio to talk to all of them with one program.

Thanks Aqua Fold for my free license! (I qualified for my trac plugin work)

Copying Tables in PostgreSQL

If you want to copy a table with all of its data and all of its constraints, indexes, defaults, etc., you have a few options in PostgreSQL.

CREATE TABLE has a couple of forms that are helpful:

This form will create a table like yours with all of your data, but without any sequences, indexes, etc.  As such, it is very quick for copying data and nothing else.
CREATE TABLE new_blah AS SELECT * FROM blah;

This form will create a table without any of the data but with all appropriate schema objects. Be careful: this will cause serial columns on the new table to point toward the sequence values of the table you are copying.
CREATE TABLE new_blah ( LIKE blah INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES);
ALTER TABLE new_blah ALTER id DROP DEFAULT;
CREATE SEQUENCE new_blah_seq;
-- do what you actually need here, 1 might be a good first id for your situation
SELECT setval('new_blah_seq', (SELECT max(id) FROM blah), true);
ALTER TABLE new_blah ALTER id SET DEFAULT nextval('new_blah_seq');

Data can then be inserted with the following, though this can take a long time on a large table, because all constraints and indexes are checked/updated one row at a time.

INSERT INTO new_blah (SELECT * FROM blah);

The next option is to declare all of your constraints and indexes externally so that they can be re-added later. This lets the CREATE TABLE AS syntax do its quick inserts, after which you reinstate all schema objects around the table. This seems to be the absolute fastest way to copy a table, but it requires you to manage your schema objects outside of postgresql, or to pg_dump the schema, do some tricky find-and-replace on names, and then re-run each schema create/alter command.

The final option is to pg_dump the tables you wish to copy, rename the originals, and then restore the copy you made. This takes longer because it must write the data to disk twice (once into the stream/file you dump to, then again when you run the restore).