Saturday 29 June 2013

GSoC: Week 2 : Reference Frames

This is an update for the second week of my GSoC coding period. It's been long and quite tedious (though very interesting at times), as far as the code was concerned. A lot of stuff got done in this week's commits, and I guess the major components of the reference frame class are now in place. Well, I can't be sure whether it's perfect till Gilbert reviews it, but all that we discussed is more or less there. The things that got done this week-

  • The initializer for the MovingRefFrame class. It's huge and dense, as I had said earlier, but it gets the job done well and prevents any 'special cases' from wreaking havoc, while at the same time giving all the needed functionality. As mentioned last week, the functions for boundary conditions have now been implemented in a file of their own, and they get called in this class's __init__ method. There were some other issues as well, like allowing velocities to be defined in frames that don't 'exist' yet, etc. But thankfully, they got solved after some thinking (read: thinking hard).
  • I rewrote the tree algorithm for finding relative motion parameters, since the one I had implemented earlier went via the global frame every time- something that made the whole process very inefficient (and computationally impractical) at times. To get an idea of what I am talking about, imagine 100 reference frames, each defined at some orientation/position wrt its predecessor. Now I want to know the motion of the 98th frame wrt the 96th. Earlier, I would find the motion of the 98th wrt the first (global) frame, do the same for the 96th, and then 'process' the info to get the required motion params. Some tests with the current framework made me realise how stupid that was. So now, it just goes 98th->97th->96th, the way it's supposed to be (a rough sketch of the idea is shown right after this list).
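Here is a rough, hypothetical sketch of that tree walk (not the actual code in the PR- the parent attribute and the helper names are made up for illustration): climb from both frames towards the root and splice the two chains at their lowest common ancestor, so motion between nearby frames never has to touch the global frame.

def _ancestry(frame):
    """Chain [frame, parent, grandparent, ..., root] (hypothetical parent attribute)."""
    chain = [frame]
    while chain[-1].parent is not None:
        chain.append(chain[-1].parent)
    return chain

def _path_between(frame_from, frame_to):
    """Shortest chain of frames from frame_from to frame_to in the frame tree."""
    up = _ancestry(frame_from)        # frame_from ... root
    down = _ancestry(frame_to)        # frame_to ... root
    ancestors_of_to = set(down)
    path = []
    # climb from frame_from only until we hit an ancestor of frame_to
    for f in up:
        path.append(f)
        if f in ancestors_of_to:
            common = f
            break
    # then descend from that common ancestor down to frame_to
    path.extend(reversed(down[:down.index(common)]))
    return path

For the 100-frame example above, _path_between(R98, R96) would just return [R98, R97, R96], without ever visiting the global frame.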
A small mock SymPy session with the use of this class is shown below-
The conditions are-
1) A frame R1
2) A frame R2 defined wrt R1, such that it initially coincides with R1, but has an angular velocity of k̂ wrt it (i.e. 1 rad/s about R1's third basis vector, as per the ang_vel argument below)
3) A frame R3 defined wrt R2, such that it has a constant position vector of î wrt it.
Since R2 is rotating wrt R1, R3's position vector wrt R1 will be a function of time, as expected. Here's the working-

 >>> from sympy.physics.mechanics import MovingRefFrame  
 >>> R1 = MovingRefFrame('R1', parentframe=None)  
 >>> R2 = MovingRefFrame('R2', parentframe = R1, ang_vel = R1.basis(2))  
 >>> R3 = MovingRefFrame('R3', parentframe = R2, pos_vector = R2.basis(0))
 >>> R1.express(R3.pos_vector_in(R1))  
 - cos(t)*R1.basis(0) - sin(t)*R1.basis(1)  

That's all for now. You can have a look at the PR if you wish. The coming week will mostly be spent fine-tuning my current code and making it fully compatible with Prasoon's API, adding a few other helper functions if needed, and writing unit tests/example docs for all that's implemented so far.
Have a great week :-)

Saturday 22 June 2013

GSoC: Week 1

Hmm. Week 1 of GSoC ends today. It's been a bit of a confused week, with most of my time spent thinking and trying to determine how I am going to code the initialization of reference frames in the new mechanics core. It's going to be a little dense, with all the motion parameters being checked, analysed and stored, and the constraint of immutability making things a little difficult, implementation-wise. But I guess it's better to make the classes immutable to gel well with the rest of SymPy, as the whole point of this part of my GSoC work is to make sympy.physics.mechanics seem 'friendlier' to the rest of the codebase.
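(As a generic illustration of the pattern, and not the actual design we settled on: immutability in Python usually means fixing all the state at construction time and then blocking attribute assignment, roughly like this-)

class ImmutableThing(object):
    """Toy immutable class: all of its state is fixed when the object is created."""
    def __init__(self, value):
        # bypass our own __setattr__ guard, just this once
        object.__setattr__(self, 'value', value)

    def __setattr__(self, name, val):
        raise AttributeError("ImmutableThing instances cannot be modified")

Any later thing.value = something then raises an AttributeError, which is the kind of guarantee we want the frame classes to give.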

What was gained? Well, I did finish (today) determining how exactly I am going to go about the whole initialization procedure, after a lot of brain-racking (it's quite true when they say that programmers do most of their thinking not at their desk, but while doing things like eating and showering :-) ). A lot of math was involved, along with considerable thought given to the exact implementation details. What did I learn? Always plan out what you are going to implement, on paper, before you start typing code (especially if math is involved). Taking time to understand what you are going to do, and then doing it, is much better than constantly changing your code as you keep finding faults. Anyways, I did finish coding the rough API and some elementary functions like those for time derivatives, relative motion etc., but all this will be useful only once I finish the __init__ method.

One good thing we decided to do was to code helper functions in a separate file in the core, to calculate motion parameters given differential equations and boundary conditions- using SymPy's integration and equation-solving functions, of course. I proposed this way of doing things since having all of it inside the reference frame code would be confusing and chaotic, not to mention unnecessary. At the same time, a user would want to create a new frame instance from, say, a velocity vector and initial position boundary values- or something on those lines. So we felt this was a good middle ground to enable both. This is going to take time, but I guess having a good foundation is always good.
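Just to sketch what such a helper could look like (hypothetical name and a deliberately simplified signature- the real helpers are meant to work from differential equations and boundary conditions in general), the velocity-to-position direction boils down to integrating each component with SymPy and adding the boundary value:

from sympy import Symbol, integrate, sympify

t = Symbol('t')

def position_from_velocity(velocity_components, initial_position, t0=0):
    """Hypothetical helper (made-up name, not the actual code).

    velocity_components : iterable of SymPy expressions in t, one per basis direction
    initial_position    : values of the corresponding position components at t = t0
    Returns the position components as functions of t.
    """
    tau = Symbol('tau')
    position = []
    for v, x0 in zip(velocity_components, initial_position):
        v = sympify(v)
        # the definite integral from t0 to t gives the displacement; add the boundary value
        displacement = integrate(v.subs(t, tau), (tau, t0, t))
        position.append(displacement + x0)
    return position

For example, velocity components [cos(t), sin(t), 0] with initial position [0, 0, 5] at t0 = 0 come out as [sin(t), 1 - cos(t), 5].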

In terms of code and API, I will have something concrete in my PR by the end of this week. Will try and show a mock SymPy session in my next post :-). Anyways, that's all for now. I won't be able to do much tomorrow, since I have to travel to a relative's place for lunch, though I guess I could work at night for an hour or two. A long week ahead, wish me luck :-D

Monday 17 June 2013

Smart caching in Python

Prasoon and I are currently looking for ways to reduce the computational load of calculating the DCM (Direction Cosine Matrix) of one frame with respect to another. The DCM is what describes, mathematically, how one reference frame is oriented with respect to another. The reason we need to find a way is that if we are talking about something like 100 frames, each oriented in some random way with respect to one of its predecessors, things get pretty messy- for your computer AND for you.
We thought of various ways, but apart from some rather inefficient methods like 'World' frames (to store these DCMs for a particular run of the program) and using a global frame as an intermediate, we really couldn't find much. Then Gilbert suggested looking at the way SymPy caches some of its methods.
We may or may not use this technique eventually, but I learnt a cool way to implement caching in Python- using dictionaries and method decorators. If you don't know what decorators are, have a look at this - decorators are essentially ways to mutilate (or rather, add to) the behaviour of a function/class.

I am going to try and show a simplified version of what I found in the SymPy source; their original method is similar, but much more detailed- for purposes of immutability, debugging and so on.

The components involved in this technique are these - 

1) CACHE = {}
This is a Python dictionary (empty initially). This is what we will use as our cache.

2) The caching decorator -

from functools import wraps   # wraps keeps the original function's name and docstring

def globalcached(function):
    @wraps(function)
    def wrapper(*args):
        # key = the function object plus the tuple of arguments it was called with
        tup = (function, args)
        if tup in CACHE:
            print('In cache')
            return CACHE[tup]
        else:
            print('Not in cache')
            temp = function(*args)
            CACHE[tup] = temp
            return temp
    return wrapper

Understand what the above will do. wraps is a decorator from the functools Python library- it's part of the standard library, so it's there by default. It enables the 'decorated' function to retain the docstring of the original function- we would probably want to retain the docs.
As you might have understood, 'wrapper' is the modified function that will get executed in place of the original one.
Now, tup is a tuple (tuples are immutable in Python and hence hashable; you can only use hashable objects as dictionary keys) containing info about the following-
i. The function being called
ii. The arguments of the function
Thus, I know exactly what function it is and what it is going to work on. Hence, the code will not directly let the original function go computing again- that may be a waste of resources. It first checks whether CACHE has that particular tup as one of its keys. If it's not there, then this is the first time the function is being called with those arguments. So, I calculate the needed quantity, but before returning it, I store the calculated value with tup as the key. Hence, the next time the same function gets called with the very same arguments, the decorator will have a 'deja vu' and return the value- straight from CACHE. Ingenious, isn't it?
I don't know how many other projects use this method, or something similar. But it's quite cool if you want speed and you know the computation is going to be intensive.
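As an aside, the Python standard library ships a ready-made version of the same idea: functools.lru_cache (available since Python 3.2) memoizes a function on its arguments and can bound the cache size, so you get the behaviour of the decorator above without managing a global dictionary yourself. A minimal sketch, reusing the myfunction example from below:

from functools import lru_cache

@lru_cache(maxsize=None)   # None means the cache can grow without bound, like our CACHE dict
def myfunction(x, y):
    """myfunction docs"""
    return x ** 2 - y ** 2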

Anyways, here is a snippet using the above decorator-
 from functools import wraps  
 CACHE = {}  
 def globalcached(function):  
   @wraps(function)  
   def wrapper(*args):  
     # key = the function object plus the arguments it was called with  
     tup = (function, args)  
     if tup in CACHE:  
       print('In cache')  
       return CACHE[tup]  
     else:  
       print('Not in cache')  
       temp = function(*args)  
       CACHE[tup] = temp  
       return temp  
   return wrapper  
 def printcache():  
   for x in CACHE:  
     print 'Function: ' + str(x[0])  
     print 'Arguments: ' + str(x[1:])  
     print 'Value: ' + str(CACHE[x])  
     print()  
 @globalcached  
 def myfunction(x, y):  
   """myfunction docs"""  
   return x ** 2 - y ** 2  
 class SomeClass(object):  
   def __init__(self, x):  
     self.x = x  
   @globalcached  
   def function1(self, otherinstance):  
     """function1 docs"""  
     return otherinstance.x ** 2 - self.x ** 2  



Using it in a Python 2 IDLE session (which is why the bare print() calls in printcache show up as '()' below)-


 >>> myfunction(8, 3)  
 Not in cache  
 55  
 >>> myfunction(8, 3)  
 In cache  
 55  
 >>> i1 = SomeClass(4)  
 >>> i2 = SomeClass(7)  
 >>> i1.function1(i2)  
 Not in cache  
 33  
 >>> i1.function1(i2)  
 In cache  
 33  
 >>> myfunction(6, 7)  
 Not in cache  
 -13  
 >>> printcache()  
 Function: <function myfunction at 0x026B38B0>  
 Arguments: ((8, 3),)  
 Value: 55  
 ()  
 Function: <function function1 at 0x026CD330>  
 Arguments: ((<__main__.SomeClass object at 0x026C9B10>, <__main__.SomeClass object at 0x026C9B50>),)  
 Value: 33  
 ()  
 Function: <function myfunction at 0x026B38B0>  
 Arguments: ((6, 7),)  
 Value: -13  
 ()  
 >>>   

I hope it's clear now. It's not rocket science, but it's a clever way of doing things.

Anyways, that's all for now. And btw, thanks to Google for giving us a free membership to ACM for one year. The GSoC welcome package will also be arriving soon, and I am excited :-D


Monday 10 June 2013

Some basic math

As suggested by Gilbert, I am uploading some hand-written notes I made about the entry-level math involved in coding the framework Prasoon and I are talking of building. This DID help to some extent as a form of revision, and cleared up some basics in my own brain.
I guess I should do this more often. I have practically lost the skill of writing things down since my tenth grade.
(By the way, this is a good article to refer to, if you are interested in knowing the conceptual differences between Reference Frames and Coordinate Systems)

Friday 7 June 2013

sympy.logic - Part 2

(contd from last post)
Ok. A little late on this one. As promised, here is part 2 about the features of sympy.logic, one of the lesser-used modules of SymPy. Today's post contains information about some stuff that was there before I worked on the module, and some stuff that we worked on. Anyways, here goes-

IV) Equality of boolean expressions

This is a function that became quite a headache for us while we were trying to implement it. We thought of many algorithms and kept refuting them, until we settled on a good one. I coded it, and Christopher improved it considerably (based on the SymPy architecture). It may still have a few bugs (one is right here), but it gets the job done nonetheless.

Let's take an example to see how it works.
Take the function that I discussed in part I) of last post. Its SOP form was -
Or(And(Not(a), d), And(c, d))
and its POS form was-
And(Or(Not(a), c), d)
Let's take the POS form to be And(Or(Not(x), y), z), just to get a change of variables.

So, function1 = Or(And(Not(a), d), And(c, d)) and function2 = And(Or(Not(x), y), z)
According to the docstring of the bool_equal function, it "Returns True if the two expressions represent the same logical behavior for some correspondence between the variables of each (which may be different)"
In the above functions, the mapping a : x, c : y, d : z will give the SAME outputs for every input to the variables (a, c, d) or (x, y, z). How do we check this equality?

>>> from sympy.logic import *
>>> from sympy import symbols
>>> a, c, d, x, y, z = symbols('a c d x y z')      <- Remember all SymPy variables are called Symbols?
>>> function1 = Or(And(Not(a), d), And(c, d))
>>> function2 = And(Or(Not(x), y), z)
>>> bool_equal(function1, function2)

Output - True

Want to check the mapping?

>>> bool_equal(function1, function2, info = True)
(And(Or(Not(a), c), d), {c: y, a: x, d: z})

If there is more than one mapping that makes the expressions equal, any ONE of them is returned. The expression returned as the first element is the simplified version of the function (in terms of one of the two sets of variables). As the function is already in its simplest form here, it gets returned as it is.

V) Inferences from knowledge bases

This is a very handy feature of sympy.logic. It basically lets you 'tell' it a set of True boolean expressions, and then 'ask' it if a given input is True based on the 'knowledge' it has. First, let me introduce you to two other logic classes in sympy.logic-

Implies
Implies(some expression, some expression)

Equivalent
Equivalent(some expression, some expression)

They behave exactly like the corresponding operators in logic theory. VERY useful when converting natural language statements to logic ones. They also let you convert expressions involving them to And, Or and Not instances using the simplify_logic, to_dnf or to_cnf functions, as in the short example below.
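For instance (a minimal illustration- the exact printed form may differ a little between SymPy versions)-

>>> from sympy.logic.boolalg import to_cnf, Implies
>>> from sympy.abc import x, y
>>> to_cnf(Implies(x, y))
Or(Not(x), y)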
OK, now about KBs. The API is as follows-

>>> from sympy.logic.inference import PropKB
>>> from sympy.abc import x, y
>>> l = PropKB()
>>> l.tell(x & ~y)
>>> l.ask(x)
 True

Did you see what it did there? It returns True for any statement that's satisfiable given the statements it has been 'told'.

One more thing.

One inefficient (but simple) way to check if a statement y is valid given a set of statements x1, x2, ... xn is to use the Implies class and simplify_logic.
Just use the following function -

# assumes And, Implies and simplify_logic are available, e.g. via the
# `from sympy.logic import *` done earlier
def check_validity(statements, check):
    statements = simplify_logic(And(*statements))
    output = simplify_logic(Implies(statements, check))
    if output != True:
        output = False
    return statements, output

>>> check_validity([x|y], x)
(Or(x, y), False)

It will return the simplified version of the ANDing of all the statements in the first input, and the validity of the 'check' statement.

How did I use the above function in Database Systems? Put all the known functional dependencies in the 'statements' input and the dependency to be checked in the 'check' input (in the form of Implies objects). If the second output is not True, the functional dependency does not hold. Of course there are better ways of doing this, but well, I had little time and I wanted to check some stuff pronto.
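To make that concrete, here is a hypothetical mini-example (the attribute symbols are made up) reusing the check_validity function from above-

from sympy import symbols
from sympy.logic.boolalg import Implies

a, b, c = symbols('a b c')

# Known (hypothetical) functional dependencies: a -> b and b -> c
fds = [Implies(a, b), Implies(b, c)]

# Does a -> c follow? By transitivity it should, so the second element
# of the result is expected to be True; checking Implies(c, a) instead
# would give False.
simplified, holds = check_validity(fds, Implies(a, c))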

Anyways, that's all I have about sympy.logic. If you want to know more, the source is here. Do have a look.

GSoC update - 

The whole of last week was spent deciding the API of our work on frames, coordinate systems and vectors with Stefan, Prasoon and Gilbert. We have a consensus on some things now, and I am likely to start working this week on a new class to depict ScalarFields in n-dimensional space. I will post the progress by the end of next week, and I hope I will have substantial stuff to write about. Excited. Definitely. Cheers!

Saturday 1 June 2013

sympy.logic - Part 1

As I have mentioned before, my first work for SymPy was on its logic module. Most of it took inspiration from the digital design and CS-logic classes I took in my 3rd semester at college. In this post and the next, I will try to explain some of the cool features of this module that I have applied in useful ways.

Note - Quite a bit of this work was done just a few months back by Christopher Smith (smichr) and me, and hence is NOT included in the latest released version of SymPy (0.7.2), but will find its way into SymPy 0.7.3. Therefore, there is no documentation on the official website either. However, interested people can fork the SymPy source from its github repo - there is enough in-code documentation to help you.

I) Using the truth table to find SOP/POS form 

This is quite useful in digital design, and I myself used it quite a few times during my course.
Suppose you have to find the SOP (sum of products) form of a boolean function that has the following specs-
1. The variables (literals) are a, b, c and d.
2. The function MUST return True for the following combinations (in order-a,b,c,d)-
0,0,0,1
0,0,1,1
0,1,1,1
1,0,1,1
1,1,1,1
 where 0 stands for False and 1 for True
3. The function may give either True or False (a don't-care) for the following combinations-
0,0,0,0
0,0,1,0
0,1,0,1

For those of you who may ask WHY we won't care, it's simple- these combinations will never occur in our application if it is used rightly, OR we don't care about the output even if they do. For example, a function that deals with the binary representations of the digits 0-9 will not care about the inputs from 1010 to 1111, as they will never occur in operation.
4. All the rest of the combinations must return False

Python session for this? Here it is -

>>> from sympy.logic import SOPform
>>> minterms = [[0, 0, 0, 1], [0, 0, 1, 1], [0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 1, 1]]
>>> dontcares = [[0, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 1]]
>>> SOPform(['a','b','c','d'], minterms, dontcares)

Output - Or(And(Not(a), d), And(c, d))

POS form?

>>> from sympy.logic import POSform
>>> POSform(['a','b','c','d'], minterms, dontcares)

Output - And(Or(Not(a), c), d)
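To make the don't-care idea from point 3 concrete, here is a small hypothetical example along the lines of the digits-0-to-9 case mentioned above (the printed output may look slightly different in other SymPy versions): a function that should be True only for the BCD codes of 8 and 9. The six combinations 1010-1111 never occur, so they go in as don't-cares, and the whole expression collapses to just a.

>>> minterms = [[1, 0, 0, 0], [1, 0, 0, 1]]
>>> dontcares = [[1, 0, 1, 0], [1, 0, 1, 1], [1, 1, 0, 0], [1, 1, 0, 1], [1, 1, 1, 0], [1, 1, 1, 1]]
>>> SOPform(['a', 'b', 'c', 'd'], minterms, dontcares)

Output - a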

II) Simplification of boolean expressions

Suppose you want to simplify a cumbersome boolean expression to something simple. How?
Use simplify_logic.

Example expression --> (AB)'(A' + B)    [Symbols take their usual meaning; ' denotes NOT]

On paper, you will have to take the following steps-
(AB)'(A' + B)
(A' + B')(A' + B)    ...De Morgan's Law
A' + B'B             ...Distributive law. This step uses the fact that OR distributes over AND. It can look a bit strange, since addition does not distribute over multiplication.
A'                   ...Complement, Identity.
SymPy console session?

>>> from sympy.logic import simplify_logic
>>> to_be_simplified = '~(A & B) & (~A | B)'
>>> simplify_logic(to_be_simplified)
Output - Not(A)

Period.

III) Conversion between cnf and dnf forms

Conversion of boolean expressions to CNF or DNF forms is also possible.
The code to do this is quite simple -

>>> from sympy.logic.boolalg import to_dnf
>>> from sympy.abc import A, B, C, D
>>> to_dnf(B & (A | C))
    
Output - Or(And(A, B), And(B, C))

sympy.abc generates 'symbols' - Variables that can be used to denote any value. The entire framework of SymPy is based on the concept of Symbols (hence, Symbolic-Py)

That's all for now. In the next post, I will continue with some other features, like boolean equality mapping and checking the truth of a statement given a set of statements (and using that to find redundancies in databases).