## What region is a point in

| categories: programming | tags: | View Comments

Suppose we have a space that is divided by a boundary into two regions, and we want to know if an arbitrary point is on one region or the other. One way to figure this out is to pick a point that is known to be in a region, and then draw a line to the arbitrary point counting the number of times it crosses the boundary. If the line crosses an even number of times, then the point is in the same region and if it crosses an odd number of times, then the point is in the other region.

Here is the boundary and region we consider in this example:

boundary = [[0.1, 0],
[0.25, 0.1],
[0.3, 0.2],
[0.35, 0.34],
[0.4, 0.43],
[0.51, 0.47],
[0.48, 0.55],
[0.44, 0.62],
[0.5, 0.66],
[0.55,0.57],
[0.556, 0.48],
[0.63, 0.43],
[0.70, 0.44],
[0.8, 0.51],
[0.91, 0.57],
[1.0, 0.6]]

import matplotlib.pyplot as plt

plt.plot([p[0] for p in boundary],
[p[1] for p in boundary])
plt.ylim([0, 1])
plt.savefig('images/boundary-1.png')

... ... ... ... ... ... ... ... ... ... ... ... ... ... >>> >>> >>> >>> >>> ... [<matplotlib.lines.Line2D object at 0x00000000062FEBA8>]
(0, 1)


In this example, the boundary is complicated, and not described by a simple function. We will check for intersections of the line from the arbitrary point to the reference point with each segment defining the boundary. If there is an intersection in the boundary, we count that as a crossing. We choose the origin (0, 0) in this case for the reference point. For an arbitrary point (x1, y1), the equation of the line is therefore (provided x1 !=0):

$$y = \frac{y1}{x1} x$$.

Let the points defining a boundary segment be (bx1, by1) and (bx2, by2). The equation for the line connecting these points (provided bx1 != bx2) is:

$$y = by1 + \frac{by2 - by1}{bx2 - bx1}(x - bx1)$$

Setting these two equations equal to each other, we can solve for the value of $$x$$, and if $$bx1 <= x <= bx2$$ then we would say there is an intersection with that segment. The solution for x is:

$$x = \frac{m bx1 - by1}{m - y1/x1}$$

This can only fail if $$m = y1/x1$$ which means the segments are parallel and either do not intersect or go through each other. One issue we have to resolve is what to do when the intersection is at the boundary. In that case, we would see an intersection with two segments since bx1 of one segment is also bx2 of another segment. We resolve the issue by only counting intersections with bx1. Finally, there may be intersections at values of $$x$$ greater than the point, and we are not interested in those because the intersections are not between the point and reference point.

Here are all of the special cases that we have to handle:

We will have to do float comparisons, so we will define tolerance functions for all of these. I tried this previously with regular comparison operators, and there were many cases that did not work because of float comparisons. In the code that follows, we define the tolerance functions, the function that handles almost all the special cases, and show that it almost always correctly identifies the region a point is in.

import numpy as np

TOLERANCE = 2 * np.spacing(1)

def feq(x, y, epsilon=TOLERANCE):
'x == y'
return not((x < (y - epsilon)) or (y < (x - epsilon)))

def flt(x, y, epsilon=TOLERANCE):
'x < y'
return x < (y - epsilon)

def fgt(x, y, epsilon=TOLERANCE):
'x > y'
return y < (x - epsilon)

def fle(x, y, epsilon=TOLERANCE):
'x <= y'
return not(y < (x - epsilon))

def fge(x, y, epsilon=TOLERANCE):
'x >= y'
return not(x < (y - epsilon))

boundary = [[0.1, 0],
[0.25, 0.1],
[0.3, 0.2],
[0.35, 0.34],
[0.4, 0.43],
[0.51, 0.47],
[0.48, 0.55],
[0.44, 0.62],
[0.5, 0.66],
[0.55,0.57],
[0.556, 0.48],
[0.63, 0.43],
[0.70, 0.44],
[0.8, 0.51],
[0.91, 0.57],
[1.0, 0.6]]

def intersects(p, isegment):
'p is a point (x1, y1), isegment is an integer indicating which segment starting with 0'
x1, y1 = p
bx1, by1 = boundary[isegment]
bx2, by2 = boundary[isegment + 1]

# outline cases to handle
if feq(bx1, bx2) and feq(x1, 0.0): # both segments are vertical
if feq(bx1, x1):
return True
else:
return False
elif feq(bx1, bx2):  # segment is vertical
m1 = y1 / x1 # slope of reference line
y = m1 * bx1 # value of reference line at bx1
if ((fge(y, by1) and flt(y, by2))
or (fle(y, by1) and fgt(y,by2))):
# reference line intersects the segment
return True
else:
return False
else: # neither reference line nor segment is vertical
m = (by2 - by1) / (bx2 - bx1) # segment slope
m1 = y1 / x1
if feq(m, m1): # line and segment are parallel
if feq(y1, m * bx1):
return True
else:
return False
else: # lines are not parallel
x = (m * bx1 - by1) / (m - m1) # x at intersection

if ((fge(x, bx1) and flt(x, bx2))
or (fle(x, bx1) and fgt(x, bx2))) and fle(x, x1):
return True
else:
return False

raise Exception('you should not get here')

import matplotlib.pyplot as plt

plt.plot([p[0] for p in boundary],
[p[1] for p in boundary], 'go-')
plt.ylim([0, 1])

N = 100

X = np.linspace(0, 1, N)

for x in X:
for y in X:
p = (x, y)

nintersections = sum([intersects(p, i) for i in range(len(boundary) - 1)])

if nintersections % 2 == 0:
plt.plot(x, y, 'r.')
else:
plt.plot(x, y, 'b.')

plt.savefig('images/boundary-2.png')
plt.show()


If you look carefully, there are two blue points in the red region, which means there is some edge case we do not capture in our function. Kudos to the person who figures it out.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

## Getting a dictionary of counts

| categories: programming | tags: | View Comments

I frequently want to take a list and get a dictionary of keys that have the count of each element in the list. Here is how I have typically done this countless times in the past.

L = ['a', 'a', 'b','d', 'e', 'b', 'e', 'a']

d = {}
for el in L:
if el in d:
d[el] += 1
else:
d[el] = 1

print d

{'a': 3, 'b': 2, 'e': 2, 'd': 1}


That seems like too much code, and that there must be a list comprehension approach combined with a dictionary constructor.

L = ['a', 'a', 'b','d', 'e', 'b', 'e', 'a']

print dict((el,L.count(el)) for el in L)

{'a': 3, 'b': 2, 'e': 2, 'd': 1}


Wow, that is a lot simpler! I suppose for large lists this might be slow, since count must look through the list for each element, whereas the longer code looks at each element once, and does one conditional analysis.

Here is another example of much shorter and cleaner code.

from collections import Counter
L = ['a', 'a', 'b','d', 'e', 'b', 'e', 'a']
print Counter(L)
print Counter(L)['a']

Counter({'a': 3, 'b': 2, 'e': 2, 'd': 1})
3


Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

## Lambda Lambda Lambda

| categories: programming | tags: | View Comments

Is that some kind of fraternity? of anonymous functions? What is that!? There are many times where you need a callable, small function in python, and it is inconvenient to have to use def to create a named function. Lambda functions solve this problem. Let us look at some examples. First, we create a lambda function, and assign it to a variable. Then we show that variable is a function, and that we can call it with an argument.

f = lambda x: 2*x
print f
print f(2)

<function <lambda> at 0x0000000001E6AAC8>
4


We can have more than one argument:

f = lambda x,y: x + y
print f
print f(2, 3)

<function <lambda> at 0x0000000001E3AAC8>
5


And default arguments:

f = lambda x, y=3: x + y
print f
print f(2)
print f(4, 1)

<function <lambda> at 0x0000000001E9AAC8>
5
5


It is also possible to have arbitrary numbers of positional arguments. Here is an example that provides the sum of an arbitrary number of arguments.

import operator
f = lambda *x: reduce(operator.add, x)
print f

print f(1)
print f(1, 2)
print f(1, 2, 3)

<function <lambda> at 0x0000000001DFAAC8>
1
3
6


You can also make arbitrary keyword arguments. Here we make a function that simply returns the kwargs as a dictionary. This feature may be helpful in passing kwargs to other functions.

f = lambda **kwargs: kwargs

print f(a=1, b=3)

{'a': 1, 'b': 3}


Of course, you can combine these options. Here is a function with all the options.

f = lambda a, b=4, *args, **kwargs: (a, b, args, kwargs)

print f('required', 3, 'optional-positional', g=4)

('required', 3, ('optional-positional',), {'g': 4})


One of the primary limitations of lambda functions is they are limited to single expressions. They also do not have documentation strings, so it can be difficult to understand what they were written for later.

## 1 Applications of lambda functions

Lambda functions are used in places where you need a function, but may not want to define one using def. For example, say you want to solve the nonlinear equation $$\sqrt{x} = 2.5$$.

from scipy.optimize import fsolve
import numpy as np

sol, = fsolve(lambda x: 2.5 - np.sqrt(x), 8)
print sol

6.25


Another time to use lambda functions is if you want to set a particular value of a parameter in a function. Say we have a function with an independent variable, $$x$$ and a parameter $$a$$, i.e. $$f(x; a)$$. If we want to find a solution $$f(x; a) = 0$$ for some value of $$a$$, we can use a lambda function to make a function of the single variable $$x$$. Here is a example.

from scipy.optimize import fsolve
import numpy as np

def func(x, a):
return a * np.sqrt(x) - 4.0

sol, = fsolve(lambda x: func(x, 3.2), 3)
print sol

1.5625


Any function that takes a function as an argument can use lambda functions. Here we use a lambda function that adds two numbers in the reduce function to sum a list of numbers.

print reduce(lambda x, y: x + y, [0, 1, 2, 3, 4])

10


We can evaluate the integral $$\int_0^2 x^2 dx$$ with a lambda function.

from scipy.integrate import quad

print quad(lambda x: x**2, 0, 2)

(2.666666666666667, 2.960594732333751e-14)


## 2 Summary

Lambda functions can be helpful. They are never necessary. You can always define a function using def, but for some small, single-use functions, a lambda function could make sense. Lambda functions have some limitations, including that they are limited to a single expression, and they lack documentation strings.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

## Redirecting the print function

| categories: programming | tags: | View Comments

Ordinarily a print statement prints to stdout, or your terminal/screen. You can redirect this so that printing is done to a file, for example. This might be helpful if you use print statements for debugging, and later want to save what is printed to a file. Here we make a simple function that prints some things.

def debug():
print 'step 1'
print 3 + 4
print 'finished'

debug()

... ... ... >>> step 1
7
finished


Now, let us redirect the printed lines to a file. We create a file object, and set sys.stdout equal to that file object.

import sys
print >> sys.__stdout__, '__stdout__ before = ', sys.__stdout__
print >> sys.__stdout__, 'stdout before = ', sys.stdout

f = open('data/debug.txt', 'w')
sys.stdout = f

# note that sys.__stdout__ does not change, but stdout does.
print >> sys.__stdout__, '__stdout__ after = ', sys.__stdout__
print >> sys.__stdout__, 'stdout after = ', sys.stdout

debug()

# reset stdout back to console
sys.stdout = sys.__stdout__

print f
f.close() # try to make it a habit to close files
print f

__stdout__ before =  <open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>
stdout before =  <open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>
>>> >>> >>> >>> ... __stdout__ after =  <open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>
stdout after =  <open file 'data/debug.txt', mode 'w' at 0x2ae7dbcbbb70>
>>> >>> >>> ... >>> >>> >>> <open file 'data/debug.txt', mode 'w' at 0x2ae7dbcbbb70>
>>> <closed file 'data/debug.txt', mode 'w' at 0x2ae7dbcbbb70>


Note it can be important to close files. If you are looping through large numbers of files, you will eventually run out of file handles, causing an error. We can use a context manager to automatically close the file like this

import sys

# use the open context manager to automatically close the file
with open('data/debug.txt', 'w') as f:
sys.stdout = f
debug()
print >> sys.__stdout__, f

# reset stdout
sys.stdout = sys.__stdout__
print f

>>> ... ... ... ... ... <open file 'data/debug.txt', mode 'w' at 0x0000000002071C00>
... >>> <closed file 'data/debug.txt', mode 'w' at 0x0000000002071C00>


See, the file is closed for us! We can see the contents of our file like this.

cat data/debug.txt

step 1
7
finished


The approaches above are not fault safe. Suppose our debug function raised an exception. Then, it could be possible the line to reset the stdout would not be executed. We can solve this with try/finally code.

import sys

print 'before: ', sys.stdout
try:
with open('data/debug-2.txt', 'w') as f:
sys.stdout = f
# print to the original stdout
print >> sys.__stdout__, 'during: ', sys.stdout
debug()
raise Exception('something bad happened')
finally:
# reset stdout
sys.stdout = sys.__stdout__

print 'after: ', sys.stdout
print f # verify it is closed
print sys.stdout # verify this is reset

>>> before:  <open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>
... ... ... ... ... ... ... ... ... ... during:  <open file 'data/debug-2.txt', mode 'w' at 0x2ae7dbcbbf60>
Traceback (most recent call last):
File "<stdin>", line 7, in <module>
Exception: something bad happened
after:  <open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>
<closed file 'data/debug-2.txt', mode 'w' at 0x2ae7dbcbbf60>
<open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>

cat data/debug-2.txt

step 1
7
finished


This is the kind of situation where a context manager is handy. Context managers are typically a class that executes some code when you “enter” the context, and then execute some code when you “exit” the context. Here we want to change sys.stdout to a new value inside our context, and change it back when we exit the context. We will store the value of sys.stdout going in, and restore it on the way out.

import sys

class redirect:
def __init__(self, f=sys.stdout):
"redirect print statement to f. f must be a file-like object"
self.f = f
self.stdout = sys.stdout
print >> sys.__stdout__, 'init stdout: ', sys.stdout
def __enter__(self):
sys.stdout = self.f
print >> sys.__stdout__,  'stdout in context-manager: ',sys.stdout
def __exit__(self, *args):
sys.stdout = self.stdout
print '__stdout__ at exit = ',sys.__stdout__

# regular printing
with redirect():
debug()

# write to a file
with open('data/debug-3.txt', 'w') as f:
with redirect(f):
debug()

# mixed regular and
with open('data/debug-4.txt', 'w') as f:
with redirect(f):
print 'testing redirect'
with redirect():
print 'temporary console printing'
debug()
print 'Now outside the inner context. this should go to data/debug-4.txt'
debug()
raise Exception('something else bad happened')

print sys.stdout

>>> ... ... ... ... ... ... ... ... ... ... ... ... >>> ... ... ... init stdout:  <open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>
stdout in context-manager:  <open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>
step 1
7
finished
__stdout__ at exit =  <open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>
... ... ... ... init stdout:  <open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>
stdout in context-manager:  <open file 'data/debug-3.txt', mode 'w' at 0x2ae7dbcbbb70>
__stdout__ at exit =  <open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>
... ... ... ... ... ... ... ... ... ... init stdout:  <open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>
stdout in context-manager:  <open file 'data/debug-4.txt', mode 'w' at 0x2ae7dca4d030>
init stdout:  <open file 'data/debug-4.txt', mode 'w' at 0x2ae7dca4d030>
stdout in context-manager:  <open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>
temporary console printing
step 1
7
finished
__stdout__ at exit =  <open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>
Traceback (most recent call last):
File "<stdin>", line 10, in <module>
Exception: something else bad happened
<open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>


Here are the contents of the debug file.

cat data/debug-3.txt

step 1
7
finished


The contents of the other debug file have some additional lines, because we printed some things while in the redirect context.

cat data/debug-4.txt

testing redirect
__stdout__ at exit =  <open file '<stdout>', mode 'w' at 0x2ae7d70e01e0>
Now outside the inner context. this should go to data/debug-4.txt
step 1
7
finished


See http://www.python.org/dev/peps/pep-0343/ (number 5) for another example of redirecting using a function decorator. I think it is harder to understand, because it uses a generator.

There were a couple of points in this section:

1. You can control where things are printed in your programs by modifying the value of sys.stdout
2. You can use try/except/finally blocks to make sure code gets executed in the event an exception is raised
3. You can use context managers to make sure files get closed, and code gets executed if exceptions are raised.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

## Mail merge with python

| categories: | tags: | View Comments

Suppose you are organizing some event, and you have a mailing list of email addresses and people you need to send a mail to telling them what room they will be in. You would like to send a personalized email to each person, and you do not want to type each one by hand. Python can automate this for you. All you need is the mailing list in some kind of structured format, and then you can go through it line by line to create and send emails.

We will use an org-table to store the data in.

First name Last name email address Room number
Jane Doe jane-doe@gmail.com 1
John Doe john-doe@gmail.com 2
Jimmy John jimmy-john@gmail.com 3

We pass that table into an org-mode source block as a variable called data, which will be a list of lists, one for each row of the table. You could alternatively read these from an excel spreadsheet, a csv file, or some kind of python data structure.

We do not actually send the emails in this example. To do that you need to have access to a mail server, which could be on your own machine, or it could be a relay server you have access to.

We create a string that is a template with some fields to be substituted, e.g. the firstname and room number in this case. Then we loop through each row of the table, and format the template with those values, and create an email message to the person. First we print each message to check that they are correct.

import smtplib
from email.MIMEMultipart import MIMEMultipart
from email.MIMEText import MIMEText
from email.Utils import  formatdate

template = '''
Dear {firstname:s},

I am pleased to inform you that your talk will be in room {roomnumber:d}.

Sincerely,
John
'''

for firstname, lastname, emailaddress, roomnumber in data:
msg = MIMEMultipart()
msg['From'] = "youremail@gmail.com"
msg['Date'] = formatdate(localtime=True)

msgtext = template.format(**locals())
print msgtext

msg.attach(MIMEText(msgtext))

## Uncomment these lines and fix
#server = smtplib.SMTP('your.relay.server.edu')
#server.sendmail('your_email@gmail.com', # from
#                msg.as_string())
#server.quit()

print msg.as_string()
print '------------------------------------------------------------------'

Dear Jane,

I am pleased to inform you that your talk will be in room 1.

Sincerely,
John

Content-Type: multipart/mixed; boundary="===============1191311863=="
MIME-Version: 1.0
From: youremail@gmail.com
To: jane-doe@gmail.com
Date: Tue, 16 Apr 2013 16:10:23 -0400

--===============1191311863==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

Dear Jane,

I am pleased to inform you that your talk will be in room 1.

Sincerely,
John

--===============1191311863==--
------------------------------------------------------------------

Dear John,

I am pleased to inform you that your talk will be in room 2.

Sincerely,
John

Content-Type: multipart/mixed; boundary="===============1713881863=="
MIME-Version: 1.0
From: youremail@gmail.com
To: john-doe@gmail.com
Date: Tue, 16 Apr 2013 16:10:23 -0400

--===============1713881863==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

Dear John,

I am pleased to inform you that your talk will be in room 2.

Sincerely,
John

--===============1713881863==--
------------------------------------------------------------------

Dear Jimmy,

I am pleased to inform you that your talk will be in room 3.

Sincerely,
John

Content-Type: multipart/mixed; boundary="===============0696685580=="
MIME-Version: 1.0
From: youremail@gmail.com
To: jimmy-john@gmail.com
Date: Tue, 16 Apr 2013 16:10:23 -0400

--===============0696685580==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

Dear Jimmy,

I am pleased to inform you that your talk will be in room 3.

Sincerely,
John

--===============0696685580==--
------------------------------------------------------------------


Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source