Make a list of org-files in all the subdirectories of the current working directory

| categories: recursive, org-mode, emacs | tags:

It would be helpful to get a listing of org-files in a directory tree in the form of clickable links. This would be useful, for example, to find all files associated with a project in a directory with a particular extension, or to do some action on all files that match a pattern. To do this, we will have to recursively walk through the directories and examine their contents.

Let us examine some of the commands we will need to use. One command is to get the contents of a directory. We will explore the contents of a directory called literate in my computer.

;; list contents of the directory
(let ((abspath nil)
      (match nil)
      (nosort t))
  (directory-files "literate" abspath match nosort))
makefile-main Makefile main.o main.f90 main literate.org hello.f90 circle.o circle.mod circle.f90 circle-area.png archive a.out .. .

Note the presence of . and ... Those stand for current directory and one directory up. We should remove those from the list. We can do that like this.

;; remove . and ..
(let ((abspath nil)
      (match nil)
      (nosort t))
  (remove "." 
          (remove ".." 
                  (directory-files "literate" abspath match nosort))))
makefile-main Makefile main.o main.f90 main literate.org hello.f90 circle.o circle.mod circle.f90 circle-area.png archive a.out

Next, we need to know if a given entry in the directory files is a file or a directory. Emacs-lisp has a few functions for that. We use absolute filenames here since the paths are relative to the "molecules" directory. Note we could use absolute paths in directory-files, but that makes it hard to remove "." and "..".

;; print types of files in the directory
(let ((root "literate")
      (abspath nil)
      (match nil)
      (nosort t))
  (mapcar (lambda (x)
            (cond
             ((file-directory-p (expand-file-name x root))
              (print (format "%s is a directory" x)))
             ((file-regular-p (expand-file-name x root))
              (print (format "%s is a regular file" x)))))
          (remove "." 
                  (remove ".." 
                          (directory-files root abspath match nosort)))))
"makefile-main is a regular file"

"Makefile is a regular file"

"main.o is a regular file"

"main.f90 is a regular file"

"main is a regular file"

"literate.org is a regular file"

"hello.f90 is a regular file"

"circle.o is a regular file"

"circle.mod is a regular file"

"circle.f90 is a regular file"

"circle-area.png is a regular file"

"archive is a directory"

"a.out is a regular file"

Now, we are at the crux of this problem. We can differentiate between files and directories. For each directory in this directory, we need to recurse into it, and list the contents. There is some code at http://turingmachine.org/bl/2013-05-29-recursively-listing-directories-in-elisp.html which does this, but I found that I had to modify the code to not list directories, and here I want to show a simpler recursive code.

(defun os-walk (root)
  "recursively walks through directories getting list of absolute paths of files"
  (let ((files '()) ; empty list to store results
        (current-list (directory-files root t)))
    ;;process current-list
    (while current-list
      (let ((fn (car current-list))) ; get next entry
        (cond 
         ;; regular files
         ((file-regular-p fn)
          (add-to-list 'files fn))
         ;; directories
         ((and
           (file-directory-p fn)
           ;; ignore . and ..
           (not (string-equal ".." (substring fn -2)))
           (not (string-equal "." (substring fn -1))))
          ;; we have to recurse into this directory
          (setq files (append files (os-walk fn))))
        )
      ;; cut list down by an element
      (setq current-list (cdr current-list)))
      )
    files))

(os-walk "literate")
c:/Users/jkitchin/Dropbox/blogofile-jkitchin.github.com/blog/literate/makefile-main c:/Users/jkitchin/Dropbox/blogofile-jkitchin.github.com/blog/literate/main.o c:/Users/jkitchin/Dropbox/blogofile-jkitchin.github.com/blog/literate/main.f90 c:/Users/jkitchin/Dropbox/blogofile-jkitchin.github.com/blog/literate/main c:/Users/jkitchin/Dropbox/blogofile-jkitchin.github.com/blog/literate/literate.org c:/Users/jkitchin/Dropbox/blogofile-jkitchin.github.com/blog/literate/hello.f90 c:/Users/jkitchin/Dropbox/blogofile-jkitchin.github.com/blog/literate/circle.o c:/Users/jkitchin/Dropbox/blogofile-jkitchin.github.com/blog/literate/circle.mod c:/Users/jkitchin/Dropbox/blogofile-jkitchin.github.com/blog/literate/circle.f90 c:/Users/jkitchin/Dropbox/blogofile-jkitchin.github.com/blog/literate/circle-area.png c:/Users/jkitchin/Dropbox/blogofile-jkitchin.github.com/blog/literate/a.out c:/Users/jkitchin/Dropbox/blogofile-jkitchin.github.com/blog/literate/Makefile c:/Users/jkitchin/Dropbox/blogofile-jkitchin.github.com/blog/literate/archive/empty-text-file.txt

Nice, that gives us a recursive listing of all the files in this directory tree. Let us take this a step further, and apply a function to that list to filter out a list of the org files. We will also create org-links out of these files.

(defun os-walk (root)
  (let ((files '()) ;empty list to store results
        (current-list (directory-files root t)))
    ;;process current-list
    (while current-list
      (let ((fn (car current-list))) ; get next entry
        (cond 
         ;; regular files
         ((file-regular-p fn)
          (add-to-list 'files fn))
         ;; directories
         ((and
           (file-directory-p fn)
           ;; ignore . and ..
           (not (string-equal ".." (substring fn -2)))
           (not (string-equal "." (substring fn -1))))
          ;; we have to recurse into this directory
          (setq files (append files (os-walk fn))))
        )
      ;; cut list down by an element
      (setq current-list (cdr current-list)))
      )
    files))

(require 'cl)

(mapcar 
 (lambda (x) (princ (format "[[%s][%s]]\n" x (file-relative-name x "."))))
 (remove-if-not 
  (lambda (x) (string= (file-name-extension x) "org"))
  (os-walk "literate")))

literate/literate.org

That is certainly functional. It might be nice to format the links a bit nicer to show their structure in a table of contents way, or to sort them in a nice order if there were many of these files.

Copyright (C) 2014 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 8.2.5h

Discuss on Twitter

Some of this, sum of that

| categories: recursive, miscellaneous | tags:

Matlab plot

Python provides a sum function to compute the sum of a list. However, the sum function does not work on every arrangement of numbers, and it certainly does not work on nested lists. We will solve this problem with recursion.

Here is a simple example.

v = [1, 2, 3, 4, 5, 6, 7, 8, 9] # a list
print sum(v)

v = (1, 2, 3, 4, 5, 6, 7, 8, 9)  # a tuple
print sum(v)
45
45

If you have data in a dictionary, sum works by default on the keys. You can give the sum function the values like this.

v = {'a':1, 'b':3, 'c':4}
print sum(v.values())
8

1 Nested lists

Suppose now we have nested lists. This kind of structured data might come up if you had grouped several things together. For example, suppose we have 5 departments, with 1, 5, 15, 7 and 17 people in them, and in each department they are divided into groups.

Department 1: 1 person Department 2: group of 2 and group of 3 Department 3: group of 4 and 11, with a subgroups of 5 and 6 making up the group of 11. Department 4: 7 people Department 5: one group of 8 and one group of 9.

We might represent the data like this nested list. Now, if we want to compute the total number of people, we need to add up each group. We cannot simply sum the list, because some elements are single numbers, and others are lists, or lists of lists. We need to recurse through each entry until we get down to a number, which we can add to the running sum.

v = [1, 
    [2, 3],
    [4, [5, 6]],
    7,
    [8,9]]

def recursive_sum(X):
    'compute sum of arbitrarily nested lists'
    s = 0 # initial value of the sum

    for i in range(len(X)):
        import types  # we use this to test if we got a number
        if isinstance(X[i], (types.IntType,
                             types.LongType,
                             types.FloatType,
                             types.ComplexType)):
            # this is the terminal step
            s += X[i]
        else:
            # we did not get a number, so we recurse
            s += recursive_sum(X[i])
    return s

print recursive_sum(v)
print recursive_sum([1,2,3,4,5,6,7,8,9]) # test on non-nested list
45
45

In Post 1970 we examined recursive functions that could be replaced by loops. Here we examine a function that can only work with recursion because the nature of the nested data structure is arbitrary. There are arbitary branches and depth in the data structure. Recursion is nice because you do not have to define that structure in advance.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Discuss on Twitter

Lather, rinse and repeat

| categories: recursive, math | tags:

Matlab post

Recursive functions are functions that call themselves repeatedly until some exit condition is met. Today we look at a classic example of recursive function for computing a factorial. The factorial of a non-negative integer n is denoted n!, and is defined as the product of all positive integers less than or equal to n.

The key ideas in defining a recursive function is that there needs to be some logic to identify when to terminate the function. Then, you need logic that calls the function again, but with a smaller part of the problem. Here we recursively call the function with n-1 until it gets called with n=0. 0! is defined to be 1.

def recursive_factorial(n):
    '''compute the factorial recursively. Note if you put a negative
    number in, this function will never end. We also do not check if
    n is an integer.'''
    if n == 0:
        return 1
    else:
        return n * recursive_factorial(n - 1)

print recursive_factorial(5)
120
from scipy.misc import factorial
print factorial(5)
120.0

0.1 Compare to a loop solution

This example can also be solved by a loop. This loop is easier to read and understand than the recursive function. Note the recursive nature of defining the variable as itself times a number.

n = 5
factorial_loop = 1
for i in range(1, n + 1):
    factorial_loop *= i

print factorial_loop
120

There are some significant differences in this example than in Matlab.

  1. the syntax of the for loop is quite different with the use of the in operator.
  2. python has the nice *= operator to replace a = a * i
  3. We have to loop from 1 to n+1 because the last number in the range is not returned.

1 Conclusions

Recursive functions have a special niche in mathematical programming. There is often another way to accomplish the same goal. That is not always true though, and in a future post we will examine cases where recursion is the only way to solve a problem.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Discuss on Twitter