Collecting entries from files in a directory

| categories: emacs-lisp | tags:

I am running a class where students will be generating files that contain their answers. I want to quickly get a list of counts of all the answers. For example, six students will create a file called animal.dat in a directory called example/<studentid>, and that file will contain their favorite animal. I made some example files to test this idea out. Here are the contents.

cat example/*/animal.dat
dog
cat
dog
bird
dog
bird

You can see there are three dogs, two birds and a cat. I want code to do this counting, because in my real application there will be 58 of these files, and lots of times I need to aggregate them. Let us start with a simple example that counts the elements in a list.

(let ((animals '(dog cat dog bird dog bird))
      (counts '())
      place)
  (dolist (animal animals)
    (setq place (assoc animal counts))
    (message "place = %s" place)
    (if place
        (setf (cdr place) (+ 1 (cdr place)))
      (setq counts (cons `(,animal . 1) counts))))
  counts)

((bird . 2) (cat . 1) (dog . 3))

Let us turn that into a function.

(defun counts (list)
  (let ((counts '())
        place)
    (dolist (el list)   
      (setq place (assoc el  counts))
    (message "place = %s" place)
    (if place
        (setf (cdr place) (+ 1 (cdr place)))
      (setq counts (cons `(,el . 1) counts))))
    counts))

(counts '(dog cat dog bird dog bird))

((bird . 2) (cat . 1) (dog . 3))

Nice. Now we need a simple way to get that list. We need to glob the files to find them, then open them and read the value. Here is a way to get the files.

(f-entries "example"
           (lambda (f)
             (string= (file-name-nondirectory f) "animal.dat"))
           t)
/Users/jkitchin/blogofile-jkitchin.github.com/blog/collect-entries/example/s1/animal.dat /Users/jkitchin/blogofile-jkitchin.github.com/blog/collect-entries/example/s2/animal.dat /Users/jkitchin/blogofile-jkitchin.github.com/blog/collect-entries/example/s3/animal.dat /Users/jkitchin/blogofile-jkitchin.github.com/blog/collect-entries/example/s4/animal.dat /Users/jkitchin/blogofile-jkitchin.github.com/blog/collect-entries/example/s5/animal.dat /Users/jkitchin/blogofile-jkitchin.github.com/blog/collect-entries/example/s6/animal.dat

Now we just need to run a mapcar over this list of files.

(mapcar
 (lambda (f)
   (with-temp-buffer
     (insert-file-contents f)
     (s-trim (buffer-string))))
 (f-entries "example"
            (lambda (f)
              (string= (file-name-nondirectory f) "animal.dat"))
            t))
dog cat dog bird dog bird

Finally, putting this together, we have some code that maps over the files, and counts the entries.

(counts
 (mapcar
  (lambda (f)
    (with-temp-buffer
      (insert-file-contents f)
      (s-trim (buffer-string))))
  (f-entries "example"
             (lambda (f)
               (string= (file-name-nondirectory f) "animal.dat"))
             t)))

((bird . 2) (cat . 1) (dog . 3))

This will be helpful in dealing with 58 entries during my class!

Copyright (C) 2014 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 8.2.7c

Discuss on Twitter