Macports, Git and HTTPS

My university recently started providing Git hosting over Smart HTTPS.  Previously, they only provided subversion hosting.  I’ve used git for around 2 years now, so this is great news for me, and I asked them to create a repository for my latest project.  I try to clone the (empty) repository and . . . failure!  It seems as though git failed to authenticate with the server, even though I provided valid credentials and had the appropriate certificates installed.  This failure occurs on my MacBook Pro, so I quickly logged into one of the university machines and tried.  Much to my surprise it everything works from their machines.

So I spent some time hunting down the differences.  The most substantial difference was the version of curl and OpenSSL being used.  Mine were much newer.  After a few hours of searching online, I came to the conclusion that OpenSSL was at fault here.  On a hunch, I resolved my issue by reinstalling curl, but this time with [+gnutls] instead of openssl (the default).  So if you’re having issues dealing with Smart HTTPS hosted repositories using a new version of git (particularly if you are on OSX), then give this solution a try.  Hopefully it will save you some time, as I couldn’t find this solution anywhere else.

Posted in Uncategorized | Leave a comment

Ugly Scala corner case

Eww. I ran into this little wart in the Scala language this morning.  Thank goodness that StackOverflow had an answer, or the frustration might have continued a while longer.   Basically, the wart deals with tuple unwrapping (or matching if you prefer that terminology).  Say that I have some very simple function f that looks like:

def f : (Int, Int) = { return (1,2) }

now, in some other piece of code, I want to call f and assign the result to a tuple.

val (Res1, Res2) = f

but wait; Scala complains about this code! You’ll get the following:

error: not found: value Res1
error: not found: value Res2

But why?  The source of the error is that Res1 and Res2 begin with capital letters.  Thus, when Scala tries to assign the tuple elements by means of pattern matching, it assumes that Res1 and Res2 are previously defined constants, not new values (despite the val annotation in front of them).  It’s fairly easy to correct this; just write

val (res1, res2) = f

and you’ll get the expected behavior. However, this seems to me to be a little bit of a wart in the language. Scala is, in my opinion, a beautiful and incredibly expressive language. However, little things like this often remind me that there are still a lot of improvements to be made, and often the little things count.

Posted in Uncategorized | Leave a comment

Python Hatchlings part 0

So my labmate recently wrote a very nice blog post comparing some different solutions for speeding up Python code. The solutions he explored are either replacement interpreters, (restricted) python -> vm bytecode compilers, or (restricted) python -> C++ compilers. Some of these solutions look very promising. However, they are not all mature yet, and most only support a subset of the current language. I recently had a need to speed up some multithreaded python code that I had written, and in the process, I learned a bit about improving the performance of Python itself (i.e. not replacing the interpreter/bytecode compiler).

In particular, there are two changes that significantly sped up my code; converting while loops to for loops, and replacing list comprehensions with conditional guards with filter statements. So the first of these improvements is rather obvious in retrospect, as a for loop can be unrolled somewhat, whereas the condition guard on the while loop must be re-evaluated by the interpreter for every loop iteration. However, what was not obvious at first was how to convert my while loop to a for loop. The loop in question traverses the edges in a tree from an internal node until it reaches the root. My initial code looked something like this:

def pathToRoot(G, n):
    path = [n]
    while len(G.predecessors(n)) > 0:
        pred = G..predecessors(n)[0]
        path.append(n)
        n = pred
    return path

Simple enough, right? So how do we turn this into a for loop? The solution upon which I settled was to make an initial pass over the tree and to label each node with its depth. Therefore, for an arbitrary node n, n['depth'] holds the number of predecessors I must traverse to reach the root. This allows the above code to be written as:

def pathToRoot(G, n):
    path = [n]
    ndepth = n['depth']
    for i in xrange(ndepth):
        pred = G..predecessors(n)[0]
        path.append(n)
        n = pred
    return path

It’s a very minor change, but it led to a significant speedup in my program.

The other change I made was even more simple, but likewise, it yielded improved performance. This was simply to replace a list comprehension with a filter statement. So initially, the list comprehension looked pretty much like a filter itself

result = [ x for x in input if fun(x) ]

This simply becomes

result = filter( fun, input )

Unlike the previous example, there’s no real though required here. However, I was surprised to discover than the filter statement was significantly faster than the list comprehension. For some (obviously misguided) reason, I thought they would both lead to the same bytecode; they do not. Anyway, these are two very simple Python performance tricks that are easy to implement and can lead to significantly improved performance. If I stumble upon other such tricks in the future, I’ll be sure to drop them here.

Posted in Uncategorized | 2 Comments

Closures and Currying and Partials . . . oh my!

So we were having a discussion in the lab the other day about some of the more functional features of the Python language (and of course, I made sure to give Scala some love as well).  In particular, we were discussing some code where I’d made use of Python’s partials.  My lab mate made a comment to the effect of “this seems related to closures.”

Indeed, there are three functional programming concepts that came to the top of my head which are all related, but these relationships, and indeed  the tools themselves are often overlooked in most programming courses which do not  focus on the functional style.  These three concepts are:

  1. Closures
  2. Currying
  3. Partials

Now, the wikipedia article on closures is pretty good, but alas, it has no scala example!  But before we get to the example, what is a closure?  Well, consider first a “normal” function.

def plus( x : Int, y : Int ) = { x + y }

Scala functions don’t get too much simpler than this.  The plus function takes two Ints and returns their sum.  What I want to draw your attention to here are the variables x and y.  These variables, as we are used to, are explicitly declared in an argument list.  When we call the plus function, we pass in two Ints, which the compiler takes as x and y, and we say that the values we pass in are bound to the variables x and y.  So, what if I defined the following function

def plus( x : Int, y : Int ) = { x + y + z }

Uh ohh; what is this z you might ask? It’s not defined in the body of the function; it’s not passed in as a variable, so what is it?  Well, in the context of closures, z is a free variable.  That is, given just the context I’ve shown you above it is unbound (and also undefined).  However, what if I showed you the following piece of code

def plusZ = {
  val z = 3
  def plus( x : Int, y : Int ) = { x + y + z }
  plus(1, 2)
}

Ok, so this function is not very useful, but this is a pedagogical example. So, what’s the behavior of this function? In particular, what happens when I call “plusZ”? Take a wild guess . . . it returns 6. That free variable, z, in the plus function was bound to 3 in the enclosing lexical scope of the plusZ function. Essentially, a closure is this process; where free variables within functions are bound to values. The term comes from the proper terminology of referring to the free variables in the function; the function is said to be “closed over” its free variables. “Regular” functions, are just those with no free variables. The variables of which they make use are explicit arguments, and are, by construction, bound within the scope of the function.

Next, lets look at currying. The syntax for currying varies from language to language as does the syntax support. In Scala, a curried function may look something like this

def plus(x : Int)(y : Int) = { x + y }

Again, this is our good old plus example. Our normal plus function has the type (x:Int, y:Int)Int, or more readably, (Int, Int) => Int. We say plus is a function which takes two Ints as arguments and returns an Int. What’s they type of our curried plus function though? Well, it’s (Int)(Int)Int, or, again more readably, (Int) => (Int) => Int. That is, the new plus is a function which takes a single Int, and it returns another function which itself takes a single Int and returns an Int. Calling this new plus function as follows:

plus(3, 4)

results in the error “error: too many arguments for method plus: (x: Int)(y: Int)Int”. Rather, the proper way to call this function is

plus(3)(4)

This yields the expected result of “7.” However, the error we received before is rather telling. As we can see by the signature, the curried version of plus does not take two arguments (thus the error). What the second (successful) call really means is call the function plus with the argument 3 (i.e. plus(3) ), then apply the result of this function call ( remember the call plus(3) itself returns a function) to the argument 4. This can be extended to any number of arguments where:

def func(x_{0})(x_{1}) . . . (x_{k}) = { evaluate function }

Is a function which takes a single argument, and returns a function (let’s call it func_1, even though it doesn’t really have a name) which itself takes k-1 arguments. However, func_1, in turn, takes a single argument and returns a function func_2 which takes k-2 arguments. In a similar fashion, each func_d takes a single argument and returns a function taking d-1 arguments, until we get to func_k-1; a function which takes a single argument and returns the result of whatever expression is evaluated in “evaluate function”. Currying takes a function of k arguments, and returns a chain of k function applications, where each function takes a single argument. Perhaps now it’s becoming clear the relation between bound and free variables, closures and currying. Each function application in the chain of curried functions is essentially binding one of the free variables used in the function body’s expression. Ok, time for a test; what if, given the curried definition of plus above, I wrote something like

val p9 = plus(9)_
p9(1)

What’s the result of the “p9(1)” expression? If you guessed 10, then you’re correct! Essentially, p9 is the result of a partial function application to the curried function. In the partially applied function, exists in a scope where the previously free variable “x”, has now been bound to 9. In fact, we could have achieved the same thing by writing the much more cumbersome (and much less intuitive) code:

val p9 = (y:Int) => 9 + y

Here, we’re assigning p9 directly as a function which takes a single integer, “y” and returns the sum of 9 and y. This is equivalent to binding the value 9 to the variable “x” in our curried function.

After all of this discussion, we come full circle, back to the notion of a partial, or “partially applied function.” All we mean by a partial is function, where some subset of the free variables are captured in a closure. When we partially apply a set of arguments to a function, we get back some function where some set of the free variables of the function are bound to those arguments, and the rest of the free variables will (presumably) be passed in at another point. Again, scala has some fairly nice syntax for this, let’s take our old friend, the original plus function, and look at partially applying each of its two arguments.

def plus(x : Int, y : Int) = { x + y }
val p9first = plus(9, _ : Int)
val res0 = p9first(1)

val p9second = plus(_ : Int, 9)
val res1 = p9second(1)

At the end of this little exercise, both res0 and res1 have the same value, 10. However, in the first case, p9first the free variable “x” is bound to the value 9 and the function call p9first(1) is interpreted as binding y to 1 and evaluating 9 + 1; thus returning 10. In the second case, p9second the free variable “y” is bound to the value 9, and the function call p9second(1) is interpreted as binding x to 1 and evaluating 9 + 1; thus returning 10. Now, the commutativity of plus ensures that these results were the same, but if we had instead considered the function minus (defined in the natural way), then the analogs m9first and m9second would have returned different results; namely 8 and -8 respectively.

Alright, enough with these examples. We’ve begun to explore the wonderful world of closures, currying and partially applied functions, and I hope that I’ve given you at least some idea about how these concepts are related (at least conceptually if not in terms of implementation). Though the concepts are distinct, they share a certain flavor and overlap in many meaningful ways. Together, closures, currying and partials provide a very powerful set of functional programming tools that you should remember next time you are faced with a programming problem where they apply . . . that is, assuming that you are programming in a language as awesome as scala which offers access to such functional constructs.

Posted in Uncategorized | Leave a comment

Building Numpy from source

This post should act as a note to future Rob, as well as a hint to anyone else that runs into this problem after building and installing numpy.

> ImportError: path_to_python/site-packages/numpy/linalg/lapack_lite.so:
>
undefined symbol: zgesdd_

I was motivated to build my own version of Numpy because I use a newer version of Python than the one that is provided on some of the machines I use in the lab.  I’ve built up my entire python ecosystem from source; and thus far, everything had worked out smoothly.  The real problem with this numpy error, is that it can happen for so many reasons, none of which was the reason it was occurring for me.

First, make sure that the build script is looking for BLAS and LAPACK in the right place.  If you download the actual numpy sources, this is fairly simple.  Just copy site.cfg.example to site.cfg in the top-level directory, and then make sure that the include and library path variables include the proper paths.  Now, the crux of the problem, which took me a few hours to resolve, is that blas and lapack must be built as shared libraries.  Numpy assumes this is the case and will look to link them at runtime.  I had previously built BLAS and LAPACK as static libraries, so numpy could not find them, and  hence failed with the error stated above.  Simply re-downloading lapack, altering the makefile to build shared libraries, and then installing them solved the problem.  Obviously, it’s always easiest if you have admin privileges on a machine and can handle these things with a decent package manger.  However, if you have to build sources by hand and run into this problem, I hope this note helps you out and saves you some time!

Posted in Uncategorized | Leave a comment

Playing With Scala

I’ve recently been exploring the Scala language. I used it mainly as a scripting language and in conjunction with Python and C++ in a recent research project. Now, I’m starting a more substantial (at least from a coding perspective) project, and I’m faced with the tough decision of what language I should use to write the code. My standard fall-backs are Python and C++. Don’t get me wrong, they’re great languages, both in isolation and when combined. However, for a long time, I’ve been looking for a language with the expressiveness of a high level scripting language, but without the excessive performance penalty.

So far, Scala fits the bill better than anything else I’ve seen. It’s a statically typed language which compiles down to JVM bytecode. Scala attempts to combine the object-oriented and functional programming paradigms in a single language in the most natural way. It has a type system that is among the most powerful I’ve yet seen, and a type inference systems which relieves the user of the burden and verbosity of explicitly annotating the types of most variables.

The power of the type system and the built in functional constructs really make Scala an expressive language. Moreover, the JVM bytecode it produces seems very good. In particular, there seems to be little to no penalty (in terms of performance and resource utilization) over Java. This can probably be chalked up to a combination of the JVM’s awesome performance tuning abilities and the adeptness of the Scala development team at producing efficient JVM bytecode.

Finally, being a JVM language, Scala capable of using Java libraries directly. This means that vast wealth of libraries written in Java are available to a Scala user immediately; in real life projects, this is a huge plus. Anyway, as you can probably tell from the tone of this post, I’m pretty excited about Scala, and the more I explore it, the more I find to like. I’m seriously considering attacking my new research project in Scala, and you’ll likely be hearing much more about my experience with the language on this blog in the future.

Posted in Uncategorized | Leave a comment

Releasing a fun project

So I finally got around to releasing some code I wrote a few months ago. It’s basically an implementation of “Implicit Fairing of Arbitrary Meshes using Diffusion and Curvature Flow” by Desbrun et al. However, the interesting part about this implementation is that it uses a very efficient method to solve the linear system required by this approach.  One of the main ideas put forth in this paper is to replace the common approach to mesh fairing, which relies on integration via the forward Euler method, with an approach which relies on integration via an unconditionally stable backward Euler method.

Imagine we have a set of vectors \mathbf{x}_{i}, i \in \{0,1,2\} of length n; representing the x, y, and z coordinates of our mesh.  The standard approach to smoothing our mesh using some differential operator matrix \mathcal{L} (e.g. some formulation of the Laplacian) is to apply the following formula:

\displaystyle  \mathbf{x}_{i}^{n+1} = (\mathbf{I}+\lambda dt \mathcal{L}) \mathbf{x}_{i}^{n}, i \in \{1,2,3\}  (1)

This tells us how we obtain the coordinates of the vertices at timestep n+1 from the positions of the vertices at timestep n. The main problem with this approach is that to ensure stability, we must choose a sufficiently small \lambda. However, the value of \lambda directly affects the amount of smoothing performed in every iteration. Thus, the smaller we need to make \lambda for the sake of stability, the greater the number of smoothing iterations we must perform to obtain the desired amount of smoothing.

Desbrun et al. get around this problem by replacing equation (1) by the following equation:

\displaystyle  (\mathbf{I}-\lambda dt \mathcal{L})\mathbf{x}_{i}^{n+1}=\mathbf{x}_{i}^{n}, i \in \{1,2,3\}  (2)

This should look familiar; if we set \mathbf{A} = \mathbf{I}-\lambda dt \mathcal{L}, \mathbf{x} = \mathbf{x}_{i}^{n+1}, and \mathbf{b} = \mathbf{x}_{i}^{n}, then we can see that, just like in our introduction to linear algebra course, we want to solve \mathbf{A}\mathbf{x} = \mathbf{b}. Intuitively, equation (1) says we take the vertex positions at the current timestep n, and apply the smoothing process to obtain the vertex positions at the next timestep n+1. Equation (2), on the other hand, asks, what must the vertex positions at the next timestep n+1 be such that by applying the “inverse” of the smoothing process (i.e. (\mathbf{I}-\lambda dt \mathcal{L}) instead of (\mathbf{I}+\lambda dt \mathcal{L})) to these positions we obtain the vertex positions at the current timestep n.

Now, instead of simply applying the smoothing operator as in equation (1), we have to solve a linear system of equations. There are many ways we can tackle this problem; each with certain benefits and detriments. However, a particularly practical and fast approach is to solve the system iteratively. In fact, this is what Desbrun et al. suggest in the paper, where they solve (2) using a PBCG solver. However, Botsch et al. in “Efficient Linear System Solvers for Mesh Processing” show us that for many problems, direct solvers can be much more efficient than iterative solvers. In particular, in situations where a set of linear systems with the same non-zero structure need to be solved, approaches which reuse the factorization can be much faster.

Luckily, our problem presents one of these situations. Since the non-zero structure of the matrix is the same for each smoothing iteration, a symbolic factorization of the necessary matrix can be performed only once, and reused for for all subsequent iterations.  My direct solver implementation of the smoothing algorithm makes use of the symbolic Cholesky factorization of the differential operator matrix. This means that for all but the first iteration, only the numeric factorization and back substitution need to be performed to solve the linear system.  My code provides implementations of implicit smoothing using both an iterative Precomputed Bi-Conjugate Gradient (PBCG) solver as well as a direct solver making using the Cholesky factorization of operator matrix. Compare them yourself to see the speed difference! If I have time in the future, I’d like to add an implementation which solves the linear system iteratively on the GPU. There are a number of nice libraries out there which provide PBCG implementations for GPUs, and it would certainly be interesting to see how an implementation using one of these solvers would compare against either of the CPU based approaches.

Let me know if you find my code useful or cool; or if you find any crazy bugs still lingering in there. I have more fun stuff to release in the future, but it requires some preparation first and I’m short on free time right now. Hopefully, however, I’ll find enough to post again soon.

Posted in Uncategorized | Leave a comment

Hello world . . . again!

So, there are a lot of new things going on in my graduate career right now, and this is also leading to a renewal of my efforts to accomplish other things I’ve previously wanted to do. I’m going to give the whole blogging thing another try. I think it can be a worthwhile way to explore (even if I’m the only participant) interesting issues pertaining to my research and perhaps other various topics as well. I’m going to make an honest and concerted effort to update this thing on a reasonably frequent basis; even if that means some of the posts will be tiny. I believe that one of the most effective ways to form a new habit is to force yourself to perform an activity frequently, until it becomes almost second nature.

Anyway, that’s all for now. By the way, for future reference, I’ve installed a LaTeX module for the blog so that you can include math with [latex] . . . [/latex] tags.

Posted in Uncategorized | Leave a comment