Looping in LISP

Loops are a fairly important part of any programming language and fairly fundamental to a language that is purported to be all about manipulating lists. However it’s not something I use that often in my .emacs code so I thought it might be useful to discuss the various options with some examples.

Problem: I run emacs on a number of machines, each with a different set of sound sets. I want to set up set up a valid sound for erc but I don’t want an overly verbose set of cases depending on what machine I’m on. Instead a given a list of sound files I want a function that will return the first one that actually exists.

This problem can be easily generalised into return the first valid path from a list of paths.

First version: pure emacs lisp

; the 'elisp' way
(defun find-valid-file-elisp-way (list-of-files)
  "Go though a list of files and return the first one that is present"
  (let (r '())
    (mapc '(lambda (f)
             (if (file-exists-p f) (add-to-list 'r f)))
          list-of-files)
    (car r)))

First impressions aren’t good. The lisp parenthesis do seem to get in the way of making what is happening clear. However it’s using one of common mapping functions you see a lot of in lisp. A mapping function essentially takes a list, applies a function to each element of the list and eventually returns a result. The most common of the mapping functions is mapcar which returns a modified list as a result. In this case that isn’t what we want so we use mapc where the only value that is built up is the result r as we identify each valid file. The final return value is just the first entry in that list. This does mean we have processed the whole list of alternatives which is sub-optimal.

Second version: Common Lisp Version

(defun find-valid-file-clisp-way (list-of-files)
  "Go though a list of files and return the first one that is present"
  (loop for path in list-of-files
        until (file-exists-p path)
        finally return path))

This version probably is the easiest to read for people familiar with other programming languages. The intention of the code jumps out at you. However the actual implementation is done with a macro. If you look at the help for loop you’ll see it can take a number of different forms – follow that to the code and you’ll see a fairly complex elisp implementation. However to my mind still easier to follow than the pure elisp version with mapc.

Third Version: Using the dolist macro

; using 'cl-macs 
(defun find-valid-file-dolist-way (list-of-files)
  "Go though a list of files and return the first one that is present"
  (dolist (f list-of-files)
    (if (file-exists-p f)
        (return f))))

This is yet another version using an LISP macro but this one has considerably less potential forms to cause confusion. It’s fairly comprehensible what is going on and even follows the traditional parenthesis happy form. It also takes advantage of the common LISP return to early return from the loop when we detect a valid file. If it makes it to the end of the list it evaluates the 3rd optional form to calculate the result which in this case will be ‘nil.

So what do you think? What version do you prefer? Where does the balance lie between writing code is LISPy ways and for code comprehension? Are there any other ways to solve this particular problem? I’ll be looking forward to your comments.

13 Comments

  1. I think the third version is the most readable.

    Initially I was actually disappointed that it was the most readable form, because it uses return, which I think some functional programmers consider suboptimal. I wondered if in a language without return it would be less readable, so I wrote it in Clojure (except the predicate). I was pleasantly surprised that because of filter (not in emacslisp afaik, but implementations available on emacswiki) and lazy sequences (to keep from trying all the files) it’s actually more readable!

    (defn find-valid-file [list-of-files]
      (first (filter file-exists? list-of-files)))
    
    • Oh, and the (non-lazy) filter equivalent for elisp is either delete-if or remove-if, depending on whether you want destructive behaviour or not. Again, you’ll need to (require ‘cl) for these.

  2. (require 'cl)
    (defun find-valid-file-dolist-way (list-of-files)
      "Go though a list of files and return the first one that is present"
      (find-if 'file-exists-p list-of-files))
    

    @Scott, Clojure has a function similar to find-if: clojure.contrib.seq-utils/find-first.

    • It’s interesting that improvements being suggested all rely on ‘cl. Is there really any reason in this day and age not to use Common Lisp liberally within Emacs?

      • Here’s a pure elisp version that’s (IMO) a little less jarring than your mapc one:

        (defun find-valid-file (file-list)
          (let ((found nil))
            (while (and (not found) file-list)
              (setq found (when (file-exists-p	(car file-list))
                            (car file-list))
                    file-list (cdr file-list)))
            found))
        

        In general, though, I don’t really feel there’s any value in avoiding cl. It’s built in, and huge swathes of third party stuff already use it. And even if you need to avoid it at run time, you can still use it at compile time to get the loop macro.

        • “And even if you need to avoid it at run time, you can still use it at compile time to get the loop macro.”

          Can you expand (or point to something) on that statement?

  3. “Where does the balance lie between writing code is LISPy ways and for code comprehension?”

    Well I suppose it might depend on whether or not one finds the other canonical way to do repetition in (e)Lisp comprehensible (I know some people don’t):

    (defun find-valid-file-elisp-way (list-of-files)
      "Go though a list of files and return the first one that is present"
      (when list-of-files
        (if (file-exists-p (first list-of-files))
            (first list-of-files)
          (find-valid-file-elisp-way (rest list-of-files)))))
    
    • Ahh, recursion 🙂

      I must admit I hadn’t thought of that approach. As far as I can recall the last time I actually used recursion was for some infinite precision maths routines in my Uni days. I wonder why it didn’t occur to me? Perhaps it’s my embedded roots which is wary of the stack usage?

      • I wouldn’t be surprised 🙂 – and oddly enough (GNU) Emacs comes with rather conservative default values for max-specpdl-size and max-lisp-eval-depth – but I find recursion so natural and elegant (esp. in Lisps, of course) that it’s often the first approach I consider.

        Personally, I’ve never had cause to develop a wariness about stack usage but I have often come across the advice to /generally/ try to “avoid consing”. But except in /specific/ cases where it’s clearly applicable, I’ve ignored it (and without ill effect). To my mind, if it’s a good idea to avoid fundamental Lispy constructs in some Lisp implementation, it must be the implementers “doing it wrong”, not me. 😉

  4. Something you don’t mention in your case 1 is that it turns an O(n) operation into an O(n^2), as add-to-list will need to repeatedly loop over the built up list to weed out duplicates.
    I like both the second and third approaches, but you can do a pure elisp solution with an early break using the builtins catch/throw.

    (let (file) 
      (catch 'found 
        (while files-remaining
          (setq file (car files-remaining)
                files-remaining (cdr files-remaining))
          (when (file-exists-p file)
            (throw 'done file)))
         nil)))
    
    • Shouldn’t there me more lets in there for files-remaining and would the throw tag not have to match the catch?

      It’s certainly cleaner but for pure elisp I think I prefer phayes recursive solution. Hew’s ‘cl version with find-if is probably the most compact.

  5. I was assuming files-remaining was being passed in; but yes, I renamed the throw tag at the last minute and didn;t fix it everywhere 🙂

    For pure elisp, you should almost always avoid recursion unless you know it’s a very fixed bound. By default you can’t go very deep, and the warning you get it is pretty incomprehensible to the average person (complaining about max-specpdl-size).

Comments are closed.