Now, let's re-populate it for real. I store my contacts in a variable called "contacts", a list in which each entry is a descriptive string followed by cons cells. These are actually harvested from a set of org-files. It is way too slow to parse these files each time, so I keep the contacts cached in memory and only update them if a file changes.
6047
There are over 6000 contacts. Let's put them in a MongoDB.
Here is a limitation of our approach. Inserting all of the contacts in one call will not work, because the generated shell command ends up being too long for the shell.
(mongo-insert "contacts" "contacts"
              (loop for contact in contacts
                    collect
                    (append `((desc . ,(car contact))) (cdr contact))))
So, we do them one at a time here:
(let ((ct (current-time)))
  (loop for contact in contacts
        do
        (let ((output (mongo-insert "contacts" "contacts"
                                    (append `((desc . ,(car contact))) (cdr contact)))))
          (unless (= 1 (cdr (assoc 'nInserted output)))
            (warn "error: %S for %S" (cdr (assoc 'nInserted output)) contact))))
  (message "Elapsed time %.02f seconds" (float-time (time-since ct))))
Elapsed time 762.95 seconds
That took almost 13 minutes, or about 0.13 seconds per contact. That seems long to me. This next step confirms that they were added.
(mongo-cmd "contacts" "contacts" "count()")
6047
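One possible middle ground between a single huge insert and 6047 separate ones is to insert in batches, so that each generated shell command stays short while the number of calls drops. The following is only a sketch and was not run against this database. It assumes that mongo-insert accepts a list of documents, as in the failed attempt above. The helper my/insert-in-batches and the batch size of 100 are hypothetical, and the batch size would need tuning to stay under the shell's limit.

```elisp
(require 'cl-lib)
(require 'seq)

(defun my/insert-in-batches (docs batch-size insert-fn)
  "Call INSERT-FN on successive slices of DOCS of at most BATCH-SIZE items.
Return the list of values returned by INSERT-FN, one per batch.
This is a hypothetical helper for illustration."
  (cl-loop for batch in (seq-partition docs batch-size)
           collect (funcall insert-fn batch)))

;; Usage sketch (assumes `mongo-insert' takes a list of documents):
;; (my/insert-in-batches
;;  (cl-loop for contact in contacts
;;           collect (append `((desc . ,(car contact))) (cdr contact)))
;;  100
;;  (lambda (batch) (mongo-insert "contacts" "contacts" batch)))
```

Passing the insert function as an argument keeps the batching logic independent of the database, so it can be tried on a plain list first.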
Next we will compare the time it takes to find data in the database with the time it takes to loop through the cached contacts. Here is a timing macro to measure how long it takes to run a bit of code.
;; http://stackoverflow.com/questions/23622296/emacs-timing-execution-of-function-calls-in-emacs-lisp
(defmacro measure-time (&rest body)
  "Measure the time it takes to evaluate BODY."
  `(let ((time (current-time)))
     ,@body
     (message "%.06f seconds elapsed" (float-time (time-since time)))))
measure-time
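For very short snippets a single timing can be noisy. Emacs also ships a built-in benchmark-run macro that repeats its forms and returns a list of the total elapsed time, the number of garbage collections, and the time spent in garbage collection. Here is a minimal sketch of averaging over ten runs with it. The helper name my/time-ten-runs is hypothetical, and the repetition count is written as a literal number because older versions of the macro expect that.

```elisp
(require 'benchmark)

(defun my/time-ten-runs (thunk)
  "Return the average time in seconds of one call to THUNK over 10 calls.
THUNK is a function of no arguments.  Hypothetical helper for illustration."
  ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED); we keep ELAPSED.
  (/ (car (benchmark-run 10 (funcall thunk))) 10.0))

;; Usage sketch:
;; (my/time-ten-runs (lambda () (length contacts)))
```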
Here is the old way I would extract data. Many contacts I have are academics, and I have stored their academic ranks in each contact.
(loop for contact in contacts
      if (string= "Professor" (cdr (assoc "RANK" (cdr contact))))
      collect contact into professors
      if (string= "Associate Professor" (cdr (assoc "RANK" (cdr contact))))
      collect contact into associate-professors
      if (string= "Assistant Professor" (cdr (assoc "RANK" (cdr contact))))
      collect contact into assistant-professors
      finally return `(("Assistant Professor" ,(length assistant-professors))
                       ("Associate Professor" ,(length associate-professors))
                       ("Professor" ,(length professors))))
| Assistant Professor | 313 |
| Associate Professor | 283 |
| Professor           | 879 |
How long did it take to do that?
(measure-time
 (loop for contact in contacts
       if (string= "Professor" (cdr (assoc "RANK" (cdr contact))))
       collect contact into professors
       if (string= "Associate Professor" (cdr (assoc "RANK" (cdr contact))))
       collect contact into associate-professors
       if (string= "Assistant Professor" (cdr (assoc "RANK" (cdr contact))))
       collect contact into assistant-professors
       finally return (list (length assistant-professors)
                            (length associate-professors)
                            (length professors))))
0.008772 seconds elapsed
Not long at all! Comparatively, it is very slow to get this information out of MongoDB, although considerably less code is required. That might not be surprising, considering the JSON parsing that has to be done.
Here is the equivalent code to extract that data from the database.
(loop for rank in '("Assistant Professor" "Associate Professor" "Professor")
      collect (list rank (length (mongo-find "contacts" "contacts"
                                             `((RANK . ,rank))))))
| Assistant Professor | 313 |
| Associate Professor | 283 |
| Professor           | 879 |
It is comparatively slow to do this. It requires three JSON parses, and profiling indicates that a lot of the time is spent parsing the JSON.
(measure-time
 (loop for rank in '("Assistant Professor" "Associate Professor" "Professor")
       collect (list rank (length (mongo-find "contacts" "contacts"
                                              `((RANK . ,rank)))))))
1.914817 seconds elapsed
Here is a smarter way to do it that avoids the JSON parsing.
(loop for rank in '("Assistant Professor" "Associate Professor" "Professor")
      collect (list rank (mongo-cmd "contacts" "contacts" "count(%s)"
                                    (json-encode `((RANK . ,rank))))))
| Assistant Professor | 313 |
| Associate Professor | 283 |
| Professor           | 879 |
And you can see here it is about five times faster than parsing the results of mongo-find (0.35 vs. 1.91 seconds), but still about 40 times slower than running the lisp code on the cache.
(measure-time
 (loop for rank in '("Assistant Professor" "Associate Professor" "Professor")
       collect (list rank (mongo-cmd "contacts" "contacts" "count(%s)"
                                     (json-encode `((RANK . ,rank)))))))
0.349413 seconds elapsed
This is how you might integrate this into a completion command:
(ivy-read "choose: "
          (loop for c across (mongo-find "contacts" "contacts" "")
                collect
                (list (cdr (assoc 'desc c)) c)))
This is basically unusable though, because it takes so long to generate the candidates (over six seconds).
(measure-time
 (loop for c across (mongo-find "contacts" "contacts" "")
       collect
       (list (cdr (assoc 'desc c)) c)))
6.228225 seconds elapsed
We can get back to usable by making the database do more work for us. Here, we simply make the database print a list of cons cells that we can read into lisp. We have to use a javascript function, with some escaping and quoting. The escaping was necessary because there is some bad data in the email field that messed up the cons cells, e.g. things like "name" <email> with nested single and double quoting. The quoting was necessary to get cons cells of the form ("desc" . "email"). Finally, we wrap them in parentheses and read back the list of cons cells. At just over a quarter of a second, this is very usable for getting a list of over 6000 candidates. It is still many times slower than working on the contacts list in memory though. I am not a super fan of the one-line javascript, and if it were much more complicated than this, another strategy would probably be desirable.
(measure-time
 (read
  (concat
   "("
   (shell-command-to-string "mongo contacts --quiet --eval 'db.contacts.find().forEach(function (doc) {print(\"(\\\"\" + doc.desc + \"\\\" . \\\"\" + escape(doc.EMAIL) +\"\\\")\");})'")
   ")")))
0.284730 seconds elapsed
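One way the javascript could be kept readable is to write it as a multi-line string and let shell-quote-argument handle the shell quoting. The sketch below does that, and it swaps the escape() call for JSON.stringify on the assumption that the mongo shell provides it and that its string escapes are readable by the lisp reader. It was not run against this database, so no timing is claimed. The names my/contacts-js, my/mongo-eval-command and my/read-pairs are hypothetical.

```elisp
(defvar my/contacts-js
  "db.contacts.find().forEach(function (doc) {
     print('(' + JSON.stringify(doc.desc || '') + ' . '
               + JSON.stringify(doc.EMAIL || '') + ')');
   });"
  "Javascript that prints one (\"desc\" . \"email\") cons cell per contact.
Hypothetical; assumes the mongo shell supports JSON.stringify.")

(defun my/mongo-eval-command (db js)
  "Return a shell command string that evaluates JS against database DB."
  (format "mongo %s --quiet --eval %s"
          (shell-quote-argument db)
          (shell-quote-argument js)))

(defun my/read-pairs (output)
  "Read OUTPUT, which holds one printed cons cell per line, into a list."
  (read (concat "(" output ")")))

;; Usage sketch:
;; (my/read-pairs
;;  (shell-command-to-string
;;   (my/mongo-eval-command "contacts" my/contacts-js)))
```

Separating the command construction from the reading step means the reading step can be checked on a plain string without a running database.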