This is a super-brief post, but an interesting little test nonetheless. After my friend Geet suggested the idea a few times, I decided to run the test from my last post using PyPy. Initially, I was fairly convinced that while the program would run substantially faster, memory usage wouldn't improve much. Well, I was wrong. Not only did the PyPy version run substantially faster (26 vs. 91 seconds on my laptop), it also used a good deal less memory (~3.3 vs. ~6.5 GB).
Now, while the PyPy version is about 3.5x faster and uses about half the memory of the CPython version, it's still about 2 times slower and uses about 4 times as much memory as the C++ version. So what can we gather from this little experiment? First, PyPy is pretty awesome. Not only was it able to substantially improve the speed of the Python program, it also improved the memory usage by a large margin. Second, even with PyPy, this task requires way too much memory under Python. The speed overhead between the C++ and Python versions isn't unreasonable, but I think the memory overhead still is. I haven't done a heapy analysis of the PyPy version yet (I don't even know if I can), but whatever it's doing, it's still generating a ton of overhead for what should be a pretty straightforward data structure.
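To give a sense of where that per-object overhead comes from, here's a minimal sketch (my illustration, not the original benchmark) that measures what CPython charges for a plain adjacency-list structure of small integers:

```python
import sys

# Hypothetical stand-in for the kind of structure the benchmark builds:
# an adjacency list where each vertex holds a small list of neighbor ids.
adj = [[1, 2, 3] for _ in range(10_000)]

# Shallow size of one inner list (the container alone, not its elements).
list_bytes = sys.getsizeof(adj[0])
# Size of a single small int object; a C++ int is typically 4 bytes.
int_bytes = sys.getsizeof(adj[0][0])

print(f"inner list: {list_bytes} bytes, one int: {int_bytes} bytes")
```

On a typical 64-bit CPython build, each small int object alone costs on the order of 28 bytes, and every inner list adds a multi-byte header plus an 8-byte pointer per element — which is most of the gap to C++. PyPy narrows this with its specialized list strategies, which can store unboxed integers in a compact array.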
What was the test you ran?
~Cha
The test was reading a 10k-vertex (fairly sparse) graph into memory. The poor speed of Python was astounding, but its horrendous memory usage was even worse. See this post for more details.
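In rough outline, the test amounted to something like the sketch below — assuming a plain whitespace-separated edge list with one "u v" pair per line; the actual file format and code are in the linked post:

```python
from collections import defaultdict

def read_graph(path):
    # Build an adjacency list from an edge-list file with one
    # "u v" pair of integer vertex ids per line (format assumed).
    adj = defaultdict(list)
    with open(path) as f:
        for line in f:
            u, v = line.split()
            adj[int(u)].append(int(v))
    return adj
```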
~Cha