<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Russ's  Tech Blog</title>
	<atom:link href="http://russ.unwashedmeme.com/blog/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://russ.unwashedmeme.com/blog</link>
	<description>Ranting and Raving about Random Computer bits</description>
	<lastBuildDate>Fri, 12 Oct 2012 15:27:47 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>I Published My Common Lisp Docstring Search Engine</title>
		<link>http://russ.unwashedmeme.com/blog/?p=391</link>
		<comments>http://russ.unwashedmeme.com/blog/?p=391#comments</comments>
		<pubDate>Mon, 30 Jan 2012 21:13:38 +0000</pubDate>
		<dc:creator>russ</dc:creator>
				<category><![CDATA[Lisp]]></category>

		<guid isPermaLink="false">http://russ.unwashedmeme.com/blog/?p=391</guid>
		<description><![CDATA[My Common Lisp documentation search engine has been published to http://lisp-search.acceleration.net. In a previous post I wrote about using the montezuma full-text search engine to build an index of documentation available from within my common lisp runtime. I ended up &#8230; <a href="http://russ.unwashedmeme.com/blog/?p=391">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>My Common Lisp documentation search engine has been published to <a href="http://lisp-search.acceleration.net">http://lisp-search.acceleration.net</a>. In a <a href="/blog/?p=385">previous post</a> I wrote about using the montezuma full-text search engine to build an index of documentation available from within my Common Lisp runtime.  I ended up going the extra mile on this one and indexing all of the documentation available for all of the packages in quicklisp (as well as readme files and other packages that sbcl had already loaded).  The result is a 90M search index (4M tar.gz) that can be used to search through all of the doc strings of all of the easily loadable packages.</p>
<p>The user interface is a bit clunky and searches don&#8217;t always return the most relevant results first, but it is live, fast, and already seems useful.  Perhaps with some help from the internet, this search engine can reach its full potential. I named the software package that does this manifest-search-web, because it was inspired by gigamonkey&#8217;s manifest project.  I still have not come up with a reasonable name for the published search engine (lisp-search seems a touch blasé and under-descriptive).
</p>
<p> Hopefully, I will never again spend time writing a library only to find the already written, open source alternative after I publish mine. Also, perhaps this will inspire better doc-strings, now that doc-strings might be what leads to someone finding your project.
</p>
<p>Other things to do: </p>
<ul>
<li>Integrate manifest-search with slime</li>
<li>Have the documentation index be distributable in quicklisp (not sure how to do that efficiently)</li>
<li>Find a way to unify CLIKI, l1sp.org, lisp-search and other lisp documentation resources into a more cohesive single website / search</li>
<li>Improve the query language to ensure that it behaves according to user expectations as opposed to lucene expectations</li>
</ul>
<p>As always, <a href="https://github.com/AccelerationNet/manifest-search-web/issues">please report bugs and make   suggestions</a> for improvements. Cheers and happy lisping.</p>
]]></content:encoded>
			<wfw:commentRss>http://russ.unwashedmeme.com/blog/?feed=rss2&#038;p=391</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Domain Scammer: Domain Names International (DNIDOMAINMARKET.COM)</title>
		<link>http://russ.unwashedmeme.com/blog/?p=389</link>
		<comments>http://russ.unwashedmeme.com/blog/?p=389#comments</comments>
		<pubDate>Tue, 27 Dec 2011 19:11:09 +0000</pubDate>
		<dc:creator>russ</dc:creator>
				<category><![CDATA[Politics]]></category>
		<category><![CDATA[Spammers and Scammers]]></category>
		<category><![CDATA[Web Hack]]></category>

		<guid isPermaLink="false">http://russ.unwashedmeme.com/blog/?p=389</guid>
		<description><![CDATA[Domain Names International (InTrust Domains, DNIDomainMarket.com) has been repeatedly spamming me with emails about domains similar to ones I own. The emails come from various random domains, but when going to the domain, you are immediately redirected to dnidomainmarket.com. On &#8230; <a href="http://russ.unwashedmeme.com/blog/?p=389">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Domain Names International (InTrust Domains, DNIDomainMarket.com) has been repeatedly spamming me with emails about domains similar to ones I own.  The emails come from various random domains, but when you visit any of them, you are immediately redirected to dnidomainmarket.com.  On the home page of their website they advertise how the BBB says they can be trusted.  This general shadiness makes me not want to follow the opt-out link.  Also, googling makes it fairly obvious that the opt-out will not work anyway.
</p>
<p>I would suggest that others receiving this spam click the link below to the BBB and file another advertising complaint against the company, and also report it to SpamCop or similar services.</p>
<p><a href="http://www.bbb.org/southern-colorado/business-reviews/internet-services/intrust-domains-in-falcon-co-87340850/complaints">The BBB complaints section for InTrust Domains</a></p>
]]></content:encoded>
			<wfw:commentRss>http://russ.unwashedmeme.com/blog/?feed=rss2&#038;p=389</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Manifest Search: A Common Lisp Documentation Search Engine</title>
		<link>http://russ.unwashedmeme.com/blog/?p=385</link>
		<comments>http://russ.unwashedmeme.com/blog/?p=385#comments</comments>
		<pubDate>Fri, 23 Dec 2011 18:44:55 +0000</pubDate>
		<dc:creator>russ</dc:creator>
				<category><![CDATA[Lisp]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://russ.unwashedmeme.com/blog/?p=385</guid>
		<description><![CDATA[A common complaint from a co-worker is not being able to find relevant library functionality. We have libraries that do some tasks well, but if you haven&#8217;t used it before, how are you to know that it is there. More &#8230; <a href="http://russ.unwashedmeme.com/blog/?p=385">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>
A common complaint from a co-worker is not being able to find relevant library functionality.  We have libraries that do some tasks well, but if you haven&#8217;t used one before, how are you to know that it is there?  Moreover, how do you find what you are looking for among all of the available utility libraries currently loaded?
</p>
<p>
After seeing <a href="http://www.youtube.com/watch?v=COEgRaf6acU">Peter Seibel&#8217;s Manifest screencast</a>, I was struck by the idea that you could index all the doc strings to provide a powerful search tool.  I don&#8217;t know about powerful yet, but this idea has turned into at least <a href="https://github.com/AccelerationNet/manifest-search">a search tool: Manifest-Search</a>.  This is the product of one day&#8217;s hacking and so should not be construed as the end-all-be-all Common Lisp search tool.  However, it is at least a step in that direction.
</p>
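<p>To make the idea concrete, here is a minimal sketch of collecting doc strings from a package and searching them. It is not how Manifest-Search works (that uses the montezuma full-text index); a plain substring match stands in for the index, and the names <span class="code">collect-docstrings</span> and <span class="code">search-docstrings</span> are made up for this illustration.</p>

```lisp
;; Sketch only: a naive substring search over function doc strings.
;; COLLECT-DOCSTRINGS and SEARCH-DOCSTRINGS are hypothetical names,
;; not part of manifest-search.

(defun collect-docstrings (package)
  "Return an alist of (symbol . docstring) for the documented,
fbound external symbols of PACKAGE."
  (let ((results '()))
    (do-external-symbols (sym package (nreverse results))
      (let ((doc (and (fboundp sym) (documentation sym 'function))))
        (when doc
          (push (cons sym doc) results))))))

(defun search-docstrings (term entries)
  "Return the ENTRIES whose docstring contains TERM, ignoring case."
  (remove-if-not (lambda (entry)
                   (search term (cdr entry) :test #'char-equal))
                 entries))

;; Example: which documented functions in COMMON-LISP mention "hash"?
;; (The result depends on the doc strings your implementation ships.)
(defparameter *hash-hits*
  (search-docstrings "hash" (collect-docstrings :common-lisp)))
```

<p>A real index adds tokenizing and ranking on top of this, which is where a full-text engine earns its keep.</p>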
<p>
I would like to eventually get this integrated more fully with both <a href="http://www.quicklisp.org/">quicklisp</a> and <a href="https://github.com/gigamonkey/manifest">manifest</a>, but that is all in the future.  I think it would be amazing to search for functionality I need and get documentation for a library that I have not yet installed but that is distributed by quicklisp.</p>
]]></content:encoded>
			<wfw:commentRss>http://russ.unwashedmeme.com/blog/?feed=rss2&#038;p=385</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Access update &#8211; corrected SETF expanders and more examples</title>
		<link>http://russ.unwashedmeme.com/blog/?p=366</link>
		<comments>http://russ.unwashedmeme.com/blog/?p=366#comments</comments>
		<pubDate>Mon, 14 Nov 2011 23:10:02 +0000</pubDate>
		<dc:creator>russ</dc:creator>
				<category><![CDATA[Lisp]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://russ.unwashedmeme.com/blog/?p=366</guid>
		<description><![CDATA[In the first released version of access I defined the setf versions as (defun (setf accesses) (new o &#038;rest keys)&#8230;). In order to make this work out for plists and alists (where adding a key can result in a new &#8230; <a href="http://russ.unwashedmeme.com/blog/?p=366">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>In the first released version of <a href="https://github.com/AccelerationNet/access">access</a> I defined the setf versions as <span class="code">(defun (setf accesses) (new o &#038;rest keys)&#8230;)</span>.  In order to make this work out for plists and alists (where adding a key can result in a new HEAD element), I was forced to return the updated object rather than the &#8220;new&#8221; value that setf usually returns.  I was unhappy with this oddity at the time but didn&#8217;t know directly how to fix it (obviously some macrology was in order to capture the &#8220;place&#8221; being modified).</p>
<p>Today I looked into the docs for <a href="http://clhs.lisp.se/Body/m_defi_3.htm">define-setf-expander</a> and saw how to transform my code into &#8220;correct&#8221; setf&#8217;s.  To do this I transformed my previous setf functions into set-access and set-accesses, which return <span class="code">(values new-value possibly-new-object)</span>.  I then define my setf expanders in terms of calling those functions and setting the place passed in to possibly-new-object.  It took a little while to figure out and I&#8217;m still not entirely sure I wrote the optimal Common Lisp for this.  However, I was able to elide the outer setf from these expressions in the tests <span class="code">(setf pl (setf (access pl 'one) 'new-val))</span> and now the new <span class="code">(setf (access pl 'one) 'new-val)</span> returns <span class="code">'new-val</span> as would be expected.</p>
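<p>For the curious, here is a minimal sketch of that pattern, restricted to plists. It is not the code in access: <span class="code">set-plist-value</span> and <span class="code">plist-value</span> are hypothetical stand-ins for set-access and access, and the expander assumes the place has a single store variable.</p>

```lisp
;; Sketch only: hypothetical names, plists only, keys compared with EQ.

(defun set-plist-value (new-value plist key)
  "Set KEY to NEW-VALUE in PLIST.
Returns (values NEW-VALUE POSSIBLY-NEW-PLIST)."
  (loop for cell on plist by #'cddr
        when (eq (first cell) key)
          do (setf (second cell) new-value)
             (return-from set-plist-value (values new-value plist)))
  ;; Key not present: the pair goes on the front, so the head changes.
  (values new-value (list* key new-value plist)))

(defun plist-value (plist key)
  "Reader counterpart of the setf expander below."
  (getf plist key))

(define-setf-expander plist-value (place key &environment env)
  (multiple-value-bind (temps vals stores setter getter)
      (get-setf-expansion place env)
    (let ((key-var (gensym "KEY"))
          (new-var (gensym "NEW"))
          (obj-var (gensym "OBJ")))
      (values
       (append temps (list key-var))   ; temporaries
       (append vals (list key))        ; their value forms
       (list new-var)                  ; store variable
       ;; Storing form: call the setter function, write the possibly
       ;; new object back into PLACE, then return the new value.
       `(multiple-value-bind (,new-var ,obj-var)
            (set-plist-value ,new-var ,getter ,key-var)
          (let ((,(first stores) ,obj-var))
            ,setter)
          ,new-var)
       ;; Accessing form.
       `(getf ,getter ,key-var)))))
```

<p>With this in place, <span class="code">(setf (plist-value pl :one) 'new-val)</span> updates <span class="code">pl</span> even when the key is new and returns <span class="code">new-val</span>, so no outer setf is needed.</p>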
<p> There were some requests for more, better examples of where access might be useful:</p>
<ul>
<li>My html components have a plist representing direct html attributes.  I update these with <span class="code">(setf (accesses ctl 'attributes 'name) "myFormName")</span> and its corollary <span class="code">(accesses ctl #'attributes :name)</span>. Note that both forms work even though one uses a local symbol and one a keyword (they are compared by symbol-name so that I can think about it less).  Also, I am ok referring to the attributes function by name or function object (both will result in calling the attributes function on ctl). </li>
<li>Another example from the web domain: I often store a reference to a database object on the control that is responsible for displaying it.  Thus getting the database primary key off of the data for a control can be <span class="code">(accesses client-form 'data 'adwolf-db:accountid)</span>. This allows me (where useful) to ignore the difference between an unsaved, new object and an object that hasn&#8217;t been created yet (for things like putting the id in the url, the difference is irrelevant). </li>
<li>While not currently implemented this way, my <a href="https://github.com/AccelerationNet/group-by">group-by</a> library, which groups items into nested alists or hashtables, could potentially use access to handle the different implementations.</li>
<li>Printing my database objects in debug / log messages, I want to output some columns (but only if the database object has those). This way I can define one printer for all my db objects with a minimum of fuss<br />
<code>(defmethod print-object ((o clsql:mssql-db-object) (s stream))<br />
  "Print the database object, and a couple of the most common identity slots."<br />
  (print-unreadable-object (o s :type t :identity t)<br />
    (iter (for c in '(id accountid serviceid transactionid title amount name))<br />
      (for v = (access o c))<br />
      (when v (format s "~A:~A " c v)))<br />
    ))</code></li>
</ul>
<p>In general I find access useful whenever I need to operate on some set of keys that may or may not exist in a dictionary-like object and I don&#8217;t care to receive any errors related to missing keys.
</p>
<p>TODO: </p>
<ul>
<li>A keys / values interface to ease arbitrary dictionary iteration would be a worthy addition (alexandria seems to have all the relevant functions implemented, so it would mostly be a dispatch to those)</li>
<li>When a dictionary doesn&#8217;t exist, there should be some way of telling it how to create that dictionary (currently you will get a plist).</li>
<li>Extensibility to allow support for other dict like structures.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://russ.unwashedmeme.com/blog/?feed=rss2&#038;p=366</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New Common Lisp project: Access</title>
		<link>http://russ.unwashedmeme.com/blog/?p=358</link>
		<comments>http://russ.unwashedmeme.com/blog/?p=358#comments</comments>
		<pubDate>Wed, 09 Nov 2011 21:37:37 +0000</pubDate>
		<dc:creator>russ</dc:creator>
				<category><![CDATA[Lisp]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://russ.unwashedmeme.com/blog/?p=358</guid>
		<description><![CDATA[Access is a common lisp library I just culled out of our immense utility mud ball and refactored into a library all its own. Access makes getting and setting values in common data structures support a single unified api. As &#8230; <a href="http://russ.unwashedmeme.com/blog/?p=358">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><a href="http://github.com/AccelerationNet/access">Access</a> is a Common Lisp library I just culled out of our immense utility mud ball and refactored into a library all its own.  Access makes getting and setting values in common data structures support a single unified api.  As such you could access a specific key from an alist stored in a hashtable stored in the slot of an object as <span class="code">(accesses o 'k1 'k2 'k3)</span>.  It also supports setting values <span class="code">(setf (accesses o 'k1 'k2 'k3) "new-val")</span>.  Obviously there are some limitations to this approach, but for me, with my coding conventions, I don&#8217;t tend to run into them (see the <a href="http://github.com/AccelerationNet/access">README</a> for details).
</p>
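<p>To give a feel for what a unified accessor involves, here is a small sketch covering hash tables, alists and plists only. It is not the implementation in access: <span class="code">lookup</span> and <span class="code">lookup-path</span> are hypothetical names, there is no support for object slots or setting, and keys are compared with the defaults of <span class="code">gethash</span>, <span class="code">assoc</span> and <span class="code">getf</span> rather than by symbol-name.</p>

```lisp
;; Sketch only: hypothetical names, read-only, three container types.

(defun lookup (object key)
  "Fetch KEY from OBJECT, returning NIL when it is missing or when
OBJECT is not a supported container."
  (typecase object
    (hash-table (values (gethash key object)))
    (cons (if (consp (first object))
              (cdr (assoc key object))   ; first element is a pair: alist
              (getf object key)))        ; otherwise treat it as a plist
    (t nil)))

(defun lookup-path (object &rest keys)
  "Follow KEYS one after another through nested containers."
  (reduce #'lookup keys :initial-value object))

;; A plist stored in an alist stored in a hash table.
(defparameter *demo*
  (let ((table (make-hash-table)))
    (setf (gethash 'k1 table)
          (list (cons 'k2 (list 'k3 "value"))))
    table))

;; (lookup-path *demo* 'k1 'k2 'k3)      returns "value"
;; (lookup-path *demo* 'k1 'missing 'k3) returns NIL instead of signalling
```

<p>The type dispatch on every step is exactly the kind of overhead to keep in mind if speed matters.</p>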
<p>Access has let me replace forms like <span class="code">(awhen a (awhen (fn1 it) (fn2 it)))</span> with <span class="code">(accesses a 'fn1 'fn2)</span>.  To me, it allows me to more accurately express what I am trying to do while ignoring the vagaries of shifting implementation details.  It also eases setting values in nested objects because it handles propagating the value up the chain rather than me having to do that myself (i.e. adding a new key-value pair to the front of an alist stored in an object automatically saves the new resulting alist in the object).  I don&#8217;t expect that this is tasteful coding, but it is easier and allows me to not get mired down trying to decide if I want it to be an alist, plist, hashtable, or object because the cost to change it later is essentially zero.</p>
<p>Performance is rarely an issue in the apps that I tend to write.  However, if it were, I would not use access as it does significant type and dispatch analysis that could be avoided by using the specific access functions of the data structure I am using.</p>
<p>A dot syntax familiar to those who use javascript/python/ruby type languages is available as well.  This transforms calls like <span class="code">foo.bar.bast</span> into <span class="code">(accesses foo 'bar 'bast)</span>.  I don&#8217;t use this syntax as I tend to prefer the lisp function-call syntax, but it seems to be an oft requested / discussed feature, and I had fun writing the code.  </p>
]]></content:encoded>
			<wfw:commentRss>http://russ.unwashedmeme.com/blog/?feed=rss2&#038;p=358</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Github Projects Moving</title>
		<link>http://russ.unwashedmeme.com/blog/?p=339</link>
		<comments>http://russ.unwashedmeme.com/blog/?p=339#comments</comments>
		<pubDate>Wed, 09 Nov 2011 20:52:36 +0000</pubDate>
		<dc:creator>russ</dc:creator>
				<category><![CDATA[Lisp]]></category>
		<category><![CDATA[Politics]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://russ.unwashedmeme.com/blog/?p=339</guid>
		<description><![CDATA[I moved all of my Common Lisp github projects from my personal github page to the new AccelerationNet github organization. Sorry for any inconvenience.]]></description>
				<content:encoded><![CDATA[<p>I moved all of my Common Lisp github projects from <a href="http://github.com/bobbysmith007">my personal github page</a> to the new <a href="http://github.com/AccelerationNet">AccelerationNet</a> github organization.  Sorry for any inconvenience.</p>
]]></content:encoded>
			<wfw:commentRss>http://russ.unwashedmeme.com/blog/?feed=rss2&#038;p=339</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>WordPress Image Crop Disabled, or Finding PHP Files with Trailing Space</title>
		<link>http://russ.unwashedmeme.com/blog/?p=331</link>
		<comments>http://russ.unwashedmeme.com/blog/?p=331#comments</comments>
		<pubDate>Mon, 31 Oct 2011 18:56:39 +0000</pubDate>
		<dc:creator>russ</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Web Hack]]></category>
		<category><![CDATA[wordpress]]></category>

		<guid isPermaLink="false">http://russ.unwashedmeme.com/blog/?p=331</guid>
		<description><![CDATA[One of our many wordpresses was not allowing you to crop images. I tracked this down to the image failing to load which in turn was caused by an extra \r\n preceding the image content. This extra line-break is caused &#8230; <a href="http://russ.unwashedmeme.com/blog/?p=331">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>One of our many wordpresses was not allowing you to crop images.  I tracked this down to the image failing to load which in turn was caused by an extra <span class="code">\r\n</span> preceding the image content.  This extra line-break is caused when an included php file ends in <span class="code">?>\r\n</span>.  Because php writes any content outside of a php tag to the output stream, this causes an extra newline to precede any other content you might have been trying to send (such as a jpeg image).  This can cause all sorts of problems, in this case corrupting the JPEG output.</p>
<p>
To fix this problem I investigated how to get grep to search in multiline mode (install pcregrep).  I then had the trouble that <span class="code">$</span> matches end of line rather than end of file.  After some googling I found that <span class="code">\z</span> will match end of file, and with that I was off to the races. This pcregrep expression will allow you to find php files with pesky trailing space issues.
</p>
<p><code>pcregrep -Mri --exclude_dir=.svn --exclude_dir=css '\?>\s+\z' wp-content/plugins</code></p>
<p>
The offending plugin in my case was an older version of wp-e-commerce (which is not easily upgradeable).  After finding all the files with trailing whitespace and removing it, I could now crop images in wordpress again.</p>
]]></content:encoded>
			<wfw:commentRss>http://russ.unwashedmeme.com/blog/?feed=rss2&#038;p=331</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Comparing Recursive-Regex and CL-YACC</title>
		<link>http://russ.unwashedmeme.com/blog/?p=289</link>
		<comments>http://russ.unwashedmeme.com/blog/?p=289#comments</comments>
		<pubDate>Thu, 29 Sep 2011 19:32:23 +0000</pubDate>
		<dc:creator>russ</dc:creator>
				<category><![CDATA[cl-ppcre]]></category>
		<category><![CDATA[Lisp]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://russ.unwashedmeme.com/blog/?p=289</guid>
		<description><![CDATA[In my previous blog post, I discussed how recursive-regex has been maturing but that it still wasn&#8217;t and was not intended to compete with actual parser toolkits. I wanted to quantify this assumption and present a head-to-head analysis of using &#8230; <a href="http://russ.unwashedmeme.com/blog/?p=289">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>In my previous blog post, I discussed how recursive-regex has been<br />
maturing but that it still wasn&#8217;t and was not intended to compete with<br />
actual parser toolkits.  I wanted to quantify this assumption and<br />
present a head-to-head analysis of using recursive-regex vs using<br />
<a href="http://www.pps.jussieu.fr/~jch/software/cl-yacc/">cl-yacc</a>.<br />
I chose cl-yacc because<br />
<a href="https://github.com/AccelerationNet/css-selectors/blob/master/src/parse.lisp#L251"><br />
I already had a working css3-selector parser implemented</a>. My<br />
existing cl-yacc implementation is based on<br />
the <a href="http://www.w3.org/TR/css3-selectors/#grammar">published<br />
CSS3-Selectors Grammar</a> (but modified somewhat to get it to work).</p>
<hr />
<h3>CL-YACC Parser</h3>
<p>I found implementing the parser in cl-yacc to be fairly tedious,<br />
time consuming and error prone, even for such a small language as<br />
css-selectors.  It doesn&#8217;t help that I tried to do it using the<br />
published css3 grammar and lex files which are somewhat awkward (eg:<br />
open parens are in the lexer and close parens are in the grammar).</p>
<p>I wrote a not very well documented (f)lex file reader to read in the<br />
<a href="http://www.w3.org/TR/css3-selectors/#lex">existing CSS3 flex<br />
file</a> and build a lexer compatible with cl-yacc.  After getting a<br />
valid lexer, I started working with the published grammar to convert<br />
it to a format cl-yacc would approve of. Along the way I was finding<br />
the syntax for cl-yacc to be pretty cumbersome, so I used some<br />
read-time evaluation to turn forms like the first one below into forms like<br />
the second (macro expanded) one below which are valid cl-yacc<br />
productions. (This is not a great idea or great code, but it did simplify<br />
the parser def for me.)</p>
<div class="hlcode">
<div class="syntax">
<pre><span class="p">(</span><span class="nv">or-sel</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="nv">comb-sel</span> <span class="ss">:|,|</span> <span class="nv">spaces</span> <span class="nv">or-sel</span><span class="p">)</span>
       <span class="p">(</span><span class="nb">list</span> <span class="ss">:or</span> <span class="nv">comb-sel</span> <span class="nv">or-sel</span><span class="p">))</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="nv">comb-sel</span><span class="p">)</span> <span class="nv">comb-sel</span><span class="p">))</span>



<span class="p">(</span><span class="nv">or-sel</span>
 <span class="p">(</span><span class="nv">comb-sel</span> <span class="ss">:|,|</span> <span class="nv">spaces</span> <span class="nv">or-sel</span>
  <span class="nf">#'</span><span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nv">comb-sel</span> <span class="nv">|,|</span> <span class="nv">spaces</span> <span class="nv">or-sel</span><span class="p">)</span>
      <span class="p">(</span><span class="k">declare</span> <span class="p">(</span><span class="k">ignorable</span> <span class="nv">comb-sel</span> <span class="nv">|,|</span> <span class="nv">spaces</span> <span class="nv">or-sel</span><span class="p">))</span>
      <span class="p">(</span><span class="nb">list</span> <span class="ss">:or</span> <span class="nv">comb-sel</span> <span class="nv">or-sel</span><span class="p">)))</span>
 <span class="p">(</span><span class="nv">comb-sel</span>
  <span class="nf">#'</span><span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="nv">comb-sel</span><span class="p">)</span> <span class="p">(</span><span class="k">declare</span> <span class="p">(</span><span class="k">ignorable</span> <span class="nv">comb-sel</span><span class="p">))</span> <span class="nv">comb-sel</span><span class="p">)))</span>
</pre>
</div>
</div>
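<p>A helper of that kind can be quite small. The following is a hypothetical reconstruction written to match the two forms above, not necessarily the macro in css-selectors: it names the lambda parameters after the production&#8217;s symbols (interning keywords such as <span class="code">:HASH</span> into the current package) and evaluates to the production as list data, which is what <span class="code">#.</span> then splices into the parser definition.</p>

```lisp
;; Hypothetical reconstruction of a RULE helper; the real one may differ.

(defmacro rule (symbols &body body)
  "Evaluate to one cl-yacc style production: SYMBOLS followed by an
action lambda whose parameters are named after those symbols."
  (let ((vars (mapcar (lambda (s) (intern (symbol-name s))) symbols)))
    `'(,@symbols
       #'(lambda ,vars
           (declare (ignorable ,@vars))
           ,@body))))

;; Evaluating the macro form (which is what #. does at read time)
;; yields the production as data.
(defparameter *example-production*
  (rule (:hash) (list :hash hash)))
```

<p>Because the macro returns a quoted list, nothing in the production is evaluated until cl-yacc processes the grammar.</p>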
<p>The difficulties I had in implementation were mostly concerning where<br />
white space could appear, in which productions, at which levels in the<br />
grammar.  It seemed like it was quite a task to get an unambiguous<br />
grammar and have all my extra unimportant white space everywhere still<br />
be parsable.  After I had finally managed to rid my grammar of<br />
reduce/reduce conflicts, I was able to parse my language and it seemed<br />
like a fairly peppy parser.</p>
<p>The only problem I really have with this implementation is that it<br />
seems like it is totally illegible after the fact. Even knowing about<br />
parsers and having written this one, I don&#8217;t feel comfortable<br />
modifying the language.  It seems like it would be difficult to get<br />
working again. Thankfully, I don&#8217;t anticipate having to do much<br />
language rewriting in this case.</p>
<h4>CL-YACC parser definition (lexer not shown)</h4>
<div class="hlcode" style="height:250px; overflow:auto;">
<div class="syntax">
<pre><span class="p">(</span><span class="nv">yacc:define-parser</span> <span class="vg">*css3-selector-parser*</span>
  <span class="p">(</span><span class="ss">:start-symbol</span> <span class="nv">selector</span><span class="p">)</span>
  <span class="p">(</span><span class="ss">:terminals</span> <span class="p">(</span><span class="ss">:|,|</span> <span class="ss">:|*|</span> <span class="ss">:|)|</span> <span class="ss">:|(|</span> <span class="ss">:|&gt;|</span> <span class="ss">:|+|</span> <span class="ss">:|~|</span> <span class="ss">:|:|</span> <span class="ss">:|[|</span> <span class="ss">:|]|</span> <span class="ss">:|=|</span> <span class="ss">:|-|</span>
		<span class="ss">:S</span> <span class="ss">:IDENT</span> <span class="ss">:HASH</span> <span class="ss">:CLASS</span> <span class="ss">:STRING</span> <span class="ss">:FUNCTION</span> <span class="ss">:NTH-FUNCTION</span>
		<span class="ss">:INCLUDES</span> <span class="ss">:DASHMATCH</span> <span class="ss">:BEGINS-WITH</span> <span class="ss">:ENDS-WITH</span> <span class="ss">:SUBSTRING</span>
		<span class="ss">:integer</span><span class="p">))</span>
  <span class="p">(</span><span class="ss">:precedence</span> <span class="p">((</span><span class="ss">:left</span> <span class="ss">:|)|</span> <span class="ss">:s</span> <span class="ss">:|,|</span> <span class="ss">:|+|</span> <span class="ss">:|~|</span> <span class="p">))</span> <span class="p">)</span>
  
  <span class="p">(</span><span class="nv">selector</span> <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="nv">or-sel</span><span class="p">)</span> <span class="nv">or-sel</span><span class="p">))</span>

  <span class="p">(</span><span class="nv">or-sel</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="nv">comb-sel</span> <span class="ss">:|,|</span> <span class="nv">spaces</span> <span class="nv">or-sel</span><span class="p">)</span>
       <span class="p">(</span><span class="nb">list</span> <span class="ss">:or</span> <span class="nv">comb-sel</span> <span class="nv">or-sel</span><span class="p">))</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="nv">comb-sel</span><span class="p">)</span> <span class="nv">comb-sel</span><span class="p">))</span>
  
  <span class="p">(</span><span class="nv">comb-sel</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="nv">and-sel</span> <span class="nv">combinator</span> <span class="nv">comb-sel</span><span class="p">)</span>
       <span class="p">(</span><span class="nb">list</span> <span class="nv">combinator</span> <span class="nv">and-sel</span> <span class="nv">comb-sel</span><span class="p">))</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span>
	 <span class="c1">;; need to handle trailing spaces here</span>
	 <span class="c1">;; to avoid s/r</span>
	 <span class="p">(</span><span class="nv">and-sel</span> <span class="nv">spaces</span><span class="p">)</span> <span class="nv">and-sel</span><span class="p">))</span>

  <span class="p">(</span><span class="nv">combinator</span> 
   <span class="p">(</span><span class="ss">:s</span> <span class="p">(</span><span class="nb">constantly</span> <span class="ss">:child</span><span class="p">))</span>
   <span class="p">(</span><span class="nv">spaces</span> <span class="ss">:|&gt;|</span> <span class="nv">spaces</span> <span class="p">(</span><span class="nb">constantly</span> <span class="ss">:immediate-child</span><span class="p">))</span>
   <span class="p">(</span><span class="nv">spaces</span> <span class="ss">:|~|</span> <span class="nv">spaces</span> <span class="p">(</span><span class="nb">constantly</span> <span class="ss">:preceded-by</span><span class="p">))</span>
   <span class="p">(</span><span class="nv">spaces</span> <span class="ss">:|+|</span> <span class="nv">spaces</span> <span class="p">(</span><span class="nb">constantly</span> <span class="ss">:immediatly-preceded-by</span><span class="p">)))</span>
  
  <span class="p">(</span><span class="nv">and-sel</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="nv">and-sel</span> <span class="nv">simple-selector</span><span class="p">)</span>
       <span class="p">(</span><span class="nb">list</span> <span class="ss">:and</span> <span class="nv">and-sel</span> <span class="nv">simple-selector</span><span class="p">))</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="nv">simple-selector</span><span class="p">)</span> <span class="nv">simple-selector</span><span class="p">))</span>
  
  <span class="p">(</span><span class="nv">simple-selector</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:HASH</span><span class="p">)</span> <span class="o">`</span><span class="p">(</span><span class="ss">:hash</span> <span class="o">,</span><span class="p">(</span><span class="nv">but-first</span> <span class="nv">hash</span><span class="p">)))</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:CLASS</span><span class="p">)</span> <span class="o">`</span><span class="p">(</span><span class="ss">:class</span> <span class="o">,</span><span class="p">(</span><span class="nv">but-first</span> <span class="nc">class</span><span class="p">)))</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:IDENT</span><span class="p">)</span> <span class="o">`</span><span class="p">(</span><span class="ss">:element</span> <span class="o">,</span><span class="nv">ident</span><span class="p">))</span>
   <span class="p">(</span><span class="ss">:|*|</span> <span class="p">(</span><span class="nb">constantly</span> <span class="ss">:everything</span><span class="p">))</span>
   <span class="p">(</span><span class="nv">attrib</span> <span class="nf">#'</span><span class="nb">identity</span><span class="p">)</span>
   <span class="p">(</span><span class="nv">pseudo</span> <span class="nf">#'</span><span class="nb">identity</span><span class="p">))</span>

  <span class="p">(</span><span class="nv">attrib</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:|[|</span> <span class="nv">spaces</span> <span class="ss">:IDENT</span> <span class="nv">spaces</span> <span class="ss">:|]|</span><span class="p">)</span>
       <span class="o">`</span><span class="p">(</span><span class="ss">:attribute</span> <span class="o">,</span><span class="nv">ident</span><span class="p">))</span>
   
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:|[|</span> <span class="nv">spaces</span> <span class="ss">:IDENT</span> <span class="nv">spaces</span> <span class="nv">attrib-value-def</span> <span class="nv">spaces</span> <span class="ss">:|]|</span><span class="p">)</span>
       <span class="o">`</span><span class="p">(</span><span class="ss">:attribute</span> <span class="o">,</span><span class="nv">ident</span> <span class="o">,</span><span class="nv">attrib-value-def</span><span class="p">))</span>
   <span class="p">)</span>

  <span class="p">(</span><span class="nv">attrib-value-def</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="nv">attrib-match-type</span> <span class="nv">attrib-value</span><span class="p">)</span>
       <span class="p">(</span><span class="nb">list</span> <span class="nv">attrib-match-type</span> <span class="nv">attrib-value</span><span class="p">)))</span>

  <span class="p">(</span><span class="nv">attrib-match-type</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:|=|</span><span class="p">)</span> <span class="ss">:equals</span><span class="p">)</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:includes</span><span class="p">)</span> <span class="ss">:includes</span><span class="p">)</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:dashmatch</span><span class="p">)</span> <span class="ss">:dashmatch</span><span class="p">)</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:begins-with</span><span class="p">)</span> <span class="ss">:begins-with</span><span class="p">)</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:ends-with</span><span class="p">)</span> <span class="ss">:ends-with</span><span class="p">)</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:substring</span><span class="p">)</span> <span class="ss">:substring</span><span class="p">))</span>

  <span class="p">(</span><span class="nv">attrib-value</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:ident</span><span class="p">)</span> <span class="nv">ident</span><span class="p">)</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:string</span><span class="p">)</span> <span class="p">(</span><span class="nv">but-quotes</span> <span class="nb">string</span><span class="p">)))</span>
  
  <span class="p">(</span><span class="nv">pseudo</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:|:|</span> <span class="ss">:IDENT</span><span class="p">)</span> <span class="p">(</span><span class="nb">list</span> <span class="ss">:pseudo</span> <span class="nv">ident</span><span class="p">))</span>
   
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:|:|</span> <span class="ss">:FUNCTION</span> <span class="nv">spaces</span> <span class="nv">selector</span> <span class="ss">:|)|</span><span class="p">)</span>
       <span class="p">(</span><span class="nb">list</span> <span class="ss">:pseudo</span> <span class="p">(</span><span class="nv">but-last</span> <span class="k">function</span><span class="p">)</span> <span class="nv">selector</span><span class="p">))</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:|:|</span> <span class="ss">:NTH-FUNCTION</span> <span class="nv">spaces</span> <span class="nv">nth-expr</span> <span class="nv">spaces</span> <span class="ss">:|)|</span> <span class="p">)</span>
       <span class="o">`</span><span class="p">(</span><span class="ss">:nth-pseudo</span> <span class="o">,</span><span class="p">(</span><span class="nv">but-last</span> <span class="nv">nth-function</span><span class="p">)</span>
		     <span class="o">,@</span><span class="nv">nth-expr</span><span class="p">)))</span>

  <span class="p">(</span><span class="nv">nth-expr</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:ident</span><span class="p">)</span>
       <span class="p">(</span><span class="nb">cond</span> <span class="p">((</span><span class="nb">string-equal</span> <span class="nv">ident</span> <span class="s">"even"</span><span class="p">)</span> <span class="p">(</span><span class="nb">list</span> <span class="mi">2</span> <span class="mi">0</span><span class="p">))</span>
	     <span class="p">((</span><span class="nb">string-equal</span> <span class="nv">ident</span> <span class="s">"odd"</span><span class="p">)</span> <span class="p">(</span><span class="nb">list</span> <span class="mi">2</span> <span class="mi">1</span><span class="p">))</span>
	     <span class="p">(</span><span class="no">T</span> <span class="p">(</span><span class="nb">error</span> <span class="s">"invalid nth subexpression"</span><span class="p">))))</span>
   
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="nv">nth-sign</span> <span class="ss">:integer</span><span class="p">)</span>
       <span class="p">(</span><span class="nb">list</span> <span class="mi">0</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">string-equal</span> <span class="nv">nth-sign</span> <span class="s">"-"</span><span class="p">)</span>
		   <span class="p">(</span><span class="nb">*</span> <span class="mi">-1</span> <span class="p">(</span><span class="nb">parse-integer</span> <span class="nc">integer</span><span class="p">))</span>
		   <span class="p">(</span><span class="nb">parse-integer</span> <span class="nc">integer</span><span class="p">))))</span>
   
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="nv">nth-sign</span> <span class="ss">:integer</span> <span class="ss">:ident</span><span class="p">)</span>
       <span class="p">(</span><span class="k">let</span> <span class="p">(</span><span class="nv">extra-num</span><span class="p">)</span>
	 <span class="p">(</span><span class="nb">cond</span>
	   <span class="p">((</span><span class="nb">string-equal</span> <span class="s">"n"</span> <span class="nv">ident</span><span class="p">)</span> <span class="no">T</span><span class="p">)</span>
	   <span class="c1">;; this is because our lexer will recognize n-1 as a valid ident</span>
	   <span class="c1">;; but n+1 will hit the rule below</span>
	   <span class="p">((</span><span class="nv">alexandria:starts-with-subseq</span> <span class="s">"n"</span> <span class="nv">ident</span><span class="p">)</span>
	    <span class="p">(</span><span class="nb">setf</span> <span class="nv">extra-num</span> <span class="p">(</span><span class="nb">parse-integer</span> <span class="p">(</span><span class="nb">subseq</span> <span class="nv">ident</span> <span class="mi">1</span><span class="p">))))</span>
	   <span class="p">(</span><span class="no">T</span> <span class="p">(</span><span class="nb">error</span> <span class="s">"invalid nth subexpression in (what is ~A)"</span> <span class="nv">ident</span><span class="p">)))</span>
	 <span class="p">(</span><span class="nb">list</span> <span class="p">(</span><span class="nb">or</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">string-equal</span> <span class="nv">nth-sign</span> <span class="s">"-"</span><span class="p">)</span>
		       <span class="p">(</span><span class="nb">*</span> <span class="mi">-1</span> <span class="p">(</span><span class="nb">parse-integer</span> <span class="nc">integer</span><span class="p">))</span>
		       <span class="p">(</span><span class="nb">parse-integer</span> <span class="nc">integer</span><span class="p">))</span>
		   <span class="mi">0</span><span class="p">)</span>
	       <span class="p">(</span><span class="nb">or</span> <span class="nv">extra-num</span> <span class="mi">0</span><span class="p">))))</span>
   <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="nv">nth-sign</span> <span class="ss">:integer</span> <span class="ss">:ident</span> <span class="nv">nth-sign</span> <span class="ss">:integer</span><span class="p">)</span>
       <span class="p">(</span><span class="nb">when</span> <span class="p">(</span><span class="nb">and</span> <span class="nv">integer-1</span> <span class="p">(</span><span class="nb">null</span> <span class="nv">nth-sign-1</span><span class="p">))</span>
	 <span class="p">(</span><span class="nb">error</span> <span class="s">"invalid nth subexpression 2n+1 style requires a sign before the second number"</span><span class="p">))</span>
       <span class="p">(</span><span class="nb">list</span> <span class="p">(</span><span class="nb">or</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">string-equal</span> <span class="nv">nth-sign-0</span> <span class="s">"-"</span><span class="p">)</span>
		     <span class="p">(</span><span class="nb">*</span> <span class="mi">-1</span> <span class="p">(</span><span class="nb">parse-integer</span> <span class="nv">integer-0</span><span class="p">))</span>
		     <span class="p">(</span><span class="nb">parse-integer</span> <span class="nv">integer-0</span><span class="p">))</span>
		 <span class="mi">0</span><span class="p">)</span>
	     <span class="p">(</span><span class="nb">or</span> <span class="p">(</span><span class="k">if</span> <span class="p">(</span><span class="nb">string-equal</span> <span class="nv">nth-sign-1</span> <span class="s">"-"</span><span class="p">)</span>
		     <span class="p">(</span><span class="nb">*</span> <span class="mi">-1</span> <span class="p">(</span><span class="nb">parse-integer</span> <span class="nv">integer-1</span><span class="p">))</span>
		     <span class="p">(</span><span class="nb">parse-integer</span> <span class="nv">integer-1</span><span class="p">))</span>
		 <span class="mi">0</span><span class="p">))</span>
       <span class="p">))</span>
  
   <span class="p">(</span><span class="nv">nth-sign</span>
    <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:|+|</span><span class="p">)</span> <span class="ss">:|+|</span><span class="p">)</span>
    <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">(</span><span class="ss">:|-|</span><span class="p">)</span> <span class="ss">:|-|</span><span class="p">)</span>
    <span class="o">#.</span><span class="p">(</span><span class="nv">rule</span> <span class="p">()</span> <span class="p">()))</span>
  
   <span class="p">(</span><span class="nv">spaces</span>
    <span class="p">(</span><span class="ss">:S</span> <span class="p">)</span>
    <span class="p">(</span> <span class="p">)))</span>
</pre>
</div>
</div>
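<p>Each nth-expr rule above reduces to a two-element list (a b): "even" becomes (2 0), "odd" becomes (2 1), and "2n+1" becomes (2 1).  A minimal sketch of how a matcher might consume that pair, assuming the standard CSS reading where a 1-based position matches when it equals a*n+b for some n of zero or more.  The name nth-match-p is hypothetical and is not taken from css-selectors.</p>
<pre>
(defun nth-match-p (a b position)
  "True when the 1-based POSITION equals A*n+B for some integer n of 0 or more."
  (if (zerop a)
      ;; No step: only the single position B can match.
      (= position b)
      (let ((diff (- position b)))
        ;; DIFF must be a whole multiple of A, and that multiple must not be negative.
        (and (zerop (mod diff a))
             (not (minusp (/ diff a)))))))

(nth-match-p 2 1 3)   ; odd: returns T
(nth-match-p 2 0 3)   ; even: returns NIL
(nth-match-p -1 3 2)  ; -n+3, i.e. the first three positions: returns T
</pre>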
<hr />
<h3 id="recursive-regex-parser">Recursive-Regex Parser</h3>
<p>After writing recursive-regex, I wanted a way to shake some of the<br />
bugs out, as well as a good example of what I could use this tool<br />
for and with what performance characteristics.  To accomplish this,<br />
I converted the CL-YACC grammar and lex file into a single<br />
parser definition and<br />
<a href="https://github.com/AccelerationNet/recursive-regex/blob/master/rex-reader.lisp"><br />
wrote some code to turn that file into the recursive dispatch functions</a>.</p>
<h4 id="rex">REX: a recursive expression file format based (loosely) on lex</h4>
<p>I had a really shoddy lex-like reader for the existing lex-based lexer.<br />
I converted this to a similar tool (without the C blocks) for defining<br />
named recursive expressions.  These files have the .rex extension.<br />
They consist of options, inline definitions, and named productions.<br />
<a href="https://github.com/AccelerationNet/css-selectors/blob/recex/src/recex/css3.rex"><br />
The definitions for css-selectors are in css3.rex.</a></p>
<p>Once I had my recursive expression definition, getting the parser was<br />
a fairly easy task.  Along the way I added some code to minimize the<br />
parse tree results by promoting a child whenever its parent had only<br />
that one child and the child matched the same length as the parent.  I also<br />
improved the parse tracing, so that I could observe and debug what<br />
the expression was doing while it was matching.  With the tree minimized,<br />
I also had to revise many of my tests.</p>
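<p>A minimal sketch of that promotion rule, using the list form (name matched-text child...) shown in the results further down rather than the CLOS objects the library actually returns.  The name promote-only-children is hypothetical; this illustrates the idea and is not the library's code.</p>
<pre>
(defun promote-only-children (node)
  "NODE is (name matched-text child...).  Replace any node by its only child
when that child matched text of the same length as the node itself."
  (let ((name (first node))
        (matched (second node))
        ;; Minimize the subtrees first, so chains of wrappers collapse fully.
        (children (mapcar #'promote-only-children (cddr node))))
    (if (and (= 1 (length children))
             (= (length matched) (length (second (first children)))))
        (first children)
        (list* name matched children))))

(promote-only-children
 '(:selector ".foo" (:or ".foo" (:sim-sel ".foo" (:class ".foo" (:ident "foo"))))))
;; returns (:CLASS ".foo" (:IDENT "foo"))
</pre>
<p>The :ident node survives because it matched only "foo", which is shorter than the ".foo" its parent matched.</p>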
<h4>css3.rex: css3-selectors parser definition</h4>
<p><code style="height:250px; overflow:auto;" >%option case-insensitive</p>
<p>h	      =&gt;   [0-9a-f]<br />
nmstart	      =&gt;   [a-z]<br />
nmchar	      =&gt;   [a-z0-9-]<br />
nl	      =&gt;   \n|\r\n|\r|\f<br />
string1	      =&gt;   \"([\t !#$%&#038;(-~]|\\{nl}|\')*\"<br />
string2	      =&gt;   \'([\t !#$%&#038;(-~]|\\{nl}|\")*\'<br />
id	      =&gt;   [-]?{nmstart}{nmchar}*<br />
name	      =&gt;   {nmchar}+<br />
int	      =&gt;   [0-9]+<br />
num	      =&gt;   [0-9]+|[0-9]*\.[0-9]+<br />
string	      =&gt;   {string1}|{string2}<br />
url	      =&gt;   ([!#$%&#038;*-~])*<br />
w	      =&gt;   [ \t\r\n\f]*<br />
s             =&gt;   [\s\f]+</p>
<p>%%</p>
<p>comment       =&gt;   \/\*[^*]*\*+([^/][^*]*\*+)*\/<br />
cdo           =&gt;   &lt;!--<br />
cdc           =&gt;   --&gt;<br />
class         =&gt;   \.(?&lt;ident&gt;)<br />
hash          =&gt;   #(?&lt;ident&gt;{name})<br />
ident         =&gt;   {id}<br />
element       =&gt;   {id}<br />
important_sym =&gt;   !{w}important</p>
<p>selector        =&gt;   (?&lt;or&gt;)<br />
or              =&gt;   (?&lt;comb-sel&gt;)(\s*,\s*(?&lt;or&gt;))?</p>
<p>comb-sel        =&gt;   (?&lt;and&gt;)((?&lt;combinator&gt;)(?&lt;comb-sel&gt;))?<br />
combinator      =&gt;   (?&lt;child&gt;\s+)|(?&lt;immediate-child&gt;&gt;)|(?&lt;preceded-by&gt;~)|(?&lt;immediatly-preceded-by&gt;\+)<br />
and             =&gt;   (?&lt;sim-sel&gt;)(?&lt;and&gt;)?<br />
sim-sel         =&gt;   (?&lt;hash&gt;)|(?&lt;class&gt;)|(?&lt;element&gt;)|\*|(?&lt;attrib&gt;)|(?&lt;pseudo&gt;)<br />
attrib          =&gt;   \[\s*(?&lt;ident&gt;)\s*(?&lt;attrib-val&gt;)?\]<br />
attrib-val      =&gt;   [\^\$\*\|\~]?=((?&lt;ident&gt;)|(?&lt;string&gt;))<br />
pseudo          =&gt;   :(?&lt;ident&gt;)(?&lt;parens&gt;\s*(?&lt;selector&gt;)|(?&lt;nth-expr&gt;)\s*)?<br />
nth-expr        =&gt;   even|odd|[\+-]?{int}(n([\+-]{int})?)?</p>
<p></code></p>
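<p>The names above the %% line are inline definitions, and {name} references to them show up inside later patterns.  Assuming they behave like lex definitions, that is, plain textual substitution (the rex reader itself may well do something smarter), expanding {id} can be sketched as follows.  Both expand-inline and *definitions* are illustrative names, not part of recursive-regex.</p>
<pre>
(defparameter *definitions*
  '(("nmstart" . "[a-z]")
    ("nmchar"  . "[a-z0-9-]")
    ("id"      . "[-]?{nmstart}{nmchar}*"))
  "A few inline definitions copied from css3.rex.")

(defun expand-inline (pattern definitions)
  "Return PATTERN with every {name} found in DEFINITIONS replaced by its
(recursively expanded) definition.  Anything else is copied through as is."
  (with-output-to-string (out)
    (loop with i = 0
          until (= i (length pattern))
          do (let* ((end (and (char= (char pattern i) #\{)
                              (position #\} pattern :start i)))
                    (def (and end
                              (cdr (assoc (subseq pattern (1+ i) end)
                                          definitions :test #'string=)))))
               (cond (def (write-string (expand-inline def definitions) out)
                          (setf i (1+ end)))
                     ;; Not a known reference: copy the character unchanged.
                     (t (write-char (char pattern i) out)
                        (incf i)))))))

(expand-inline "{id}" *definitions*)
;; returns "[-]?[a-z][a-z0-9-]*"
</pre>
<p>This sketch does not guard against circular definitions, which would recurse forever.</p>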
<hr />
<h3>Performance Numbers</h3>
<p>These are the performance numbers for the two parsers, each parsing 6<br />
short inputs 1000 times.  Also included is (a version of) each<br />
parser&#8217;s output. (Recursive-expressions return CLOS object trees, which<br />
have been converted to a list representation in the results below for<br />
ease of viewing.)  As you can see, the recursive-expressions version is<br />
roughly ten times slower (7.5 vs. 0.79 seconds) and conses roughly<br />
eighteen times as much memory (1.17 GB vs. 65 MB).</p>
<p><code></p>
<p>(defun run-some-tests ()<br />
  (timed-side-by-side-parses<br />
   (list ".foo" "#bar" ":bast" "div[onclick]"<br />
         "div[onclick=foo]"<br />
         ":nth-last-child(  2n+1  ), foo.bar>bast:blech( .foo )" )<br />
   1000))</p>
<p>CSS> (run-some-tests)</p>
<p>;;; Recursive-Regex Results<br />
Evaluation took:<br />
  7.548 seconds of real time<br />
  7.480000 seconds of total run time (6.850000 user, 0.630000 system)<br />
  [ Run times consist of 1.410 seconds GC time, and 6.070 seconds non-GC time. ]<br />
  99.10% CPU<br />
  18,821,417,333 processor cycles<br />
  1,167,530,480 bytes consed</p>
<p>((:CLASS ".foo" (:IDENT "foo")) (:HASH "#bar" (:IDENT "bar"))<br />
 (:PSEUDO ":bast" (:IDENT "bast"))<br />
 (:AND "div[onclick]" (:ELEMENT "div")<br />
  (:ATTRIB "[onclick]" (:IDENT "onclick")))<br />
 (:AND "div[onclick=foo]" (:ELEMENT "div")<br />
  (:ATTRIB "[onclick=foo]" (:IDENT "onclick")<br />
   (:ATTRIB-VAL "=foo" (:IDENT "foo"))))<br />
 (:OR ":nth-last-child(  2n+1  ), foo.bar>bast:blech( .foo )"<br />
  (:PSEUDO ":nth-last-child(  2n+1  )" (:IDENT "nth-last-child")<br />
   (:MATCHED-PARENS "(  2n+1  )" (:BODY "2n+1  " (:NTH-EXPR "2n+1"))))<br />
  (:COMB-SEL "foo.bar>bast:blech( .foo )"<br />
   (:AND "foo.bar" (:ELEMENT "foo") (:CLASS ".bar" (:IDENT "bar")))<br />
   (:IMMEDIATE-CHILD ">")<br />
   (:AND "bast:blech( .foo )" (:ELEMENT "bast")<br />
    (:PSEUDO ":blech( .foo )" (:IDENT "blech")<br />
     (:MATCHED-PARENS "( .foo )"<br />
      (:BODY " .foo" (:CLASS ".foo" (:IDENT "foo")))))))))</p>
<p>;;; CL-YACC Results</p>
<p>Evaluation took:<br />
  0.790 seconds of real time<br />
  0.790000 seconds of total run time (0.770000 user, 0.020000 system)<br />
  [ Run times consist of 0.080 seconds GC time, and 0.710 seconds non-GC time. ]<br />
  100.00% CPU<br />
  1,968,753,765 processor cycles<br />
  64,695,792 bytes consed</p>
<p>((:CLASS "foo") (:HASH "bar") (:PSEUDO "bast")<br />
 (:AND (:ELEMENT "div") (:ATTRIBUTE "onclick"))<br />
 (:AND (:ELEMENT "div") (:ATTRIBUTE "onclick" (:EQUALS "foo")))<br />
 (:OR (:NTH-PSEUDO "nth-last-child" 2 1)<br />
  (:IMMEDIATE-CHILD (:AND (:ELEMENT "foo") (:CLASS "bar"))<br />
   (:AND (:ELEMENT "bast") (:PSEUDO "blech" (:CLASS "foo"))))))<br />
</code></p>
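<p>The ratios can be checked directly from the two timing reports above.  A small sketch using only the real-time and bytes-consed figures; the variable and function names are made up for this example.</p>
<pre>
(defparameter *recursive-regex* '(:seconds 7.548 :bytes 1167530480))
(defparameter *cl-yacc*         '(:seconds 0.790 :bytes 64695792))

(defun times-larger (key)
  "How many times larger the recursive-regex figure for KEY is than cl-yacc's."
  (/ (getf *recursive-regex* key) (getf *cl-yacc* key)))

(format t "time: ~,1Fx, consing: ~,1Fx~%"
        (times-larger :seconds) (times-larger :bytes))
;; prints time: 9.6x, consing: 18.0x
</pre>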
<hr />
<h3 id="tldr">TL;DR</h3>
<p>In conclusion, I am certainly not going to replace my working, fast,<br />
memory-efficient cl-yacc parser with my recursive-expressions parser.<br />
However, if I wanted a working, legible (maybe) parser<br />
definition that matches as I intuitively expect, I might use<br />
recursive-expressions.  Because I am so used to using regexes for<br />
matching, if performance were not an issue, I would probably always<br />
prefer the recursive-expressions version.  I could also see the<br />
recursive-expressions solution being a nice prototyping tool to help<br />
develop the cl-yacc parser.</p>
<p><em>Obviously, some of these opinions are going to be biased, because<br />
I wrote one of these libraries and not the other.</em></p>
<h4>CL-YACC</h4>
<h5>Pros</h5>
<ul>
<li>Pretty quick and very memory efficient parser</li>
<li>Easy to customize parser output (each production has a lambda body to build whatever output is necessary)</li>
<li>Theoretically well grounded </li>
</ul>
<h5>Cons</h5>
<ul>
<li>It&#8217;s hard to make unambiguous grammars</li>
<li>Not exceedingly helpful with its suggestions for how to fix your ambiguities</li>
</ul>
<h4>Recursive Regex</h4>
<h5>Pros</h5>
<ul>
<li>Relatively easy to get working</li>
<li>Lexer and parser are the same tool, built on top of cl-ppcre, which<br />
    you presumably already know and which has a good test environment (regex-coach, REPL)
  </li>
<li>Parses ambiguous grammars</li>
<li>Has reasonable parser tracing built in, so debugging can be somewhat easier </li>
</ul>
<h5>Cons</h5>
<ul>
<li>Not very concerned with parse time or memory consumption</li>
<li>Parses ambiguous grammars</li>
<li>Bad pathological cases (exponential search)</li>
<li>Currently no architecture for modifying the parse tree other than an<br />
      after-the-fact rewrite</li>
</ul>
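<p>The exponential-search con is the usual hazard of backtracking matchers, not something measured in this post.  As a toy model, assume a pattern shaped like (a+)+b run against a string of n a's with no trailing b: every way of splitting the a's into groups has to be tried before the match can fail.  The hypothetical count-dead-ends below only counts those splits; it does not call cl-ppcre or recursive-regex.</p>
<pre>
(defun count-dead-ends (remaining)
  "Count the ways a naive backtracker can split REMAINING a's into
one-or-more-sized groups; each complete split is one failed attempt."
  (if (zerop remaining)
      1
      ;; Let the inner a+ take 1..REMAINING characters, then recurse on the rest.
      (loop for take from 1 to remaining
            sum (count-dead-ends (- remaining take)))))

(mapcar #'count-dead-ends '(1 2 3 4 10))
;; returns (1 2 4 8 512)
</pre>
<p>The count doubles with every extra character, so in this model even a modest run of a's means a very large number of failed attempts.</p>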
]]></content:encoded>
			<wfw:commentRss>http://russ.unwashedmeme.com/blog/?feed=rss2&#038;p=289</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Recursive-Regex Update</title>
		<link>http://russ.unwashedmeme.com/blog/?p=283</link>
		<comments>http://russ.unwashedmeme.com/blog/?p=283#comments</comments>
		<pubDate>Tue, 27 Sep 2011 19:03:53 +0000</pubDate>
		<dc:creator>russ</dc:creator>
				<category><![CDATA[cl-ppcre]]></category>
		<category><![CDATA[Lisp]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://russ.unwashedmeme.com/blog/?p=283</guid>
		<description><![CDATA[I think I have enough of the bugs worked out and enough tests now to actually recommend that others use the recursive-regex library up at my github. There is a brief overview of the project in the introductory blog post. &#8230; <a href="http://russ.unwashedmeme.com/blog/?p=283">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>I think I have enough of the bugs worked out and enough tests now to actually recommend that others use the <a href="https://github.com/AccelerationNet/recursive-regex">recursive-regex</a> library on my GitHub.
</p>
<p>
There is a brief overview of the project in the <a title="Introducing Recursive-Regex" href="http://russ.unwashedmeme.com/blog/?p=263">introductory blog post</a>. Also, in response to one of the comments on that blog post, there is an <a href="https://github.com/AccelerationNet/recursive-regex/blob/master/tests/sexp.lisp">example s-exp parser</a> as part of the test suite.
</p>
<p>
While this started as a toy to scratch an intellectual itch, I think that this project is potentially a nice midpoint between a full-blown parser framework and regular expressions. Grammars are hard to get right though, so if you are writing your own language you might want to investigate something <a href="http://www.cliki.net/parser%20generator">from the cliki parser generators page</a> (e.g. <a href="http://www.pps.jussieu.fr/~jch/software/cl-yacc/">cl-yacc</a>).</p>
]]></content:encoded>
			<wfw:commentRss>http://russ.unwashedmeme.com/blog/?feed=rss2&#038;p=283</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introducing Recursive-Regex</title>
		<link>http://russ.unwashedmeme.com/blog/?p=263</link>
		<comments>http://russ.unwashedmeme.com/blog/?p=263#comments</comments>
		<pubDate>Sun, 28 Aug 2011 17:46:11 +0000</pubDate>
		<dc:creator>russ</dc:creator>
				<category><![CDATA[cl-ppcre]]></category>
		<category><![CDATA[Lisp]]></category>
		<category><![CDATA[Nifty Bordom]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://russ.unwashedmeme.com/blog/?p=263</guid>
		<description><![CDATA[Recursive-Regex is the end result of a weekend of playing with the code I published on Thursday about adding named dispatch functions to CL-PPCRE regular expressions. I kept at it and I think that this approach might have some promise &#8230; <a href="http://russ.unwashedmeme.com/blog/?p=263">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>
<a href="https://github.com/AccelerationNet/recursive-regex">Recursive-Regex</a> is the end result of a weekend of playing with the <a href="http://russ.unwashedmeme.com/blog/?p=254">code I published on Thursday</a> about adding named dispatch functions to <a href="http://weitz.de/cl-ppcre/">CL-PPCRE</a> regular expressions.  I kept at it and I think that this approach might have some promise for building up a library of reusable regexp/matcher chunks.  I also found that this made it somewhat easier to obtain results from the regular expression search because I get back a full parse tree rather than the bindings typically supplied by CL-PPCRE.
</p>
<p>
I have it somewhat documented, loadable and testable, with all my current tests passing.  There is even a recursive regex csv parser defined in the default dispatch table (mostly as a simple, but practical proof of concept).
</p>
<p>
<code>  Comma-List: [\t ]*(?:(?&lt;body>[^,]*)[\t ]*,)*[\t ]*(?&lt;body>[^,]*)[\t ]*<br />
  CSV-Row: (?&lt;comma-list>((?&lt;double-quotes>)|[^\n,]*))(?:\n|$)<br />
  CSV-File: (?&lt;csv-row>)*<br />
</code><br />
Double quotes and body both go to custom dispatcher functions. Body defines where the body regex should be matched and what to use if no body is supplied.
</p>
<p>
I don&#8217;t really have long term plans for this project, but it scratched an intellectual itch I was experiencing.  Perhaps it will be useful for someone down the road.</p>
]]></content:encoded>
			<wfw:commentRss>http://russ.unwashedmeme.com/blog/?feed=rss2&#038;p=263</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
