<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments for RoBlog</title>
	<atom:link href="http://robpatro.com/blog/?feed=comments-rss2" rel="self" type="application/rss+xml" />
	<link>http://robpatro.com/blog</link>
	<description>Thoughts and musings on science and life</description>
	<lastBuildDate>Sun, 05 Dec 2010 20:34:55 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>Comment on Python Hatchlings part 0 by Rob</title>
		<link>http://robpatro.com/blog/?p=68#comment-29</link>
		<dc:creator>Rob</dc:creator>
		<pubDate>Sun, 05 Dec 2010 20:34:55 +0000</pubDate>
		<guid isPermaLink="false">http://robpatro.com/blog/?p=68#comment-29</guid>
		<description>Hi Cortland,

  Thanks for the timing tests.  My lists consist of sets of nodes for synthetically generated protein phylogeny networks, with sizes ranging from ~10^3 to ~10^4.  The structure of each network is essentially a forest of trees, and the goal of my filtering function is to extract the root nodes from each forest.  My filter-based code looks something like this:

&lt;code lang=&quot;python&quot;&gt;
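from functools import partial

def rootSet(network):
    # A root has no incoming edges; returning a bool keeps falsy node ids such as 0.
    def isRoot(network, n):
        return len(network.predecessors(n)) == 0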
    return filter( partial(isRoot, network), network )
&lt;/code&gt;

However, after reading over your response and looking again at the NetworkX documentation, I found a simpler and more efficient solution.

&lt;code lang=&quot;python&quot;&gt;
def rootSet(network):
    return [ x[0] for x in network.in_degree_iter( network.nodes_iter() ) if x[1] == 0 ]
&lt;/code&gt;

Even on a set of very small networks (average size of 500 nodes each), the second approach is about 0.1s faster on each network.  That adds up to a significant savings, as I am calling rootSet on a fairly large number of networks.  When I move to larger networks, the savings should be even more significant.

I think my example supports your conclusion.  In addition to performing my filter as a list comprehension, choosing the proper NetworkX functions allowed me to replace my function call with a simple statement.  Thanks again for your input.</description>
		<content:encoded><![CDATA[<p>Hi Cortland,</p>
<p>  Thanks for the timing tests.  My lists consist of sets of nodes for synthetically generated protein phylogeny networks, with sizes ranging from ~10^3 to ~10^4.  The structure of each network is essentially a forest of trees, and the goal of my filtering function is to extract the root nodes from each forest.  My filter-based code looks something like this:</p>
<div class="codecolorer-container python default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;"><div class="python codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #ff7700;font-weight:bold;">from</span> functools <span style="color: #ff7700;font-weight:bold;">import</span> partial<br />
<br />
<span style="color: #ff7700;font-weight:bold;">def</span> rootSet<span style="color: black;">&#40;</span>network<span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># A root has no incoming edges; returning a bool keeps falsy node ids such as 0.</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">def</span> isRoot<span style="color: black;">&#40;</span>network<span style="color: #66cc66;">,</span> n<span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>network.<span style="color: black;">predecessors</span><span style="color: black;">&#40;</span>n<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> <span style="color: #66cc66;">==</span> <span style="color: #ff4500;">0</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">filter</span><span style="color: black;">&#40;</span> partial<span style="color: black;">&#40;</span>isRoot<span style="color: #66cc66;">,</span> network<span style="color: black;">&#41;</span><span style="color: #66cc66;">,</span> network <span style="color: black;">&#41;</span></div></div>
<p>However, after reading over your response and looking again at the NetworkX documentation, I found a simpler and more efficient solution.</p>
<div class="codecolorer-container python default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;"><div class="python codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #ff7700;font-weight:bold;">def</span> rootSet<span style="color: black;">&#40;</span>network<span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: black;">&#91;</span> x<span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">for</span> x <span style="color: #ff7700;font-weight:bold;">in</span> network.<span style="color: black;">in_degree_iter</span><span style="color: black;">&#40;</span> network.<span style="color: black;">nodes_iter</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span> <span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">if</span> x<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span> <span style="color: #66cc66;">==</span> <span style="color: #ff4500;">0</span> <span style="color: black;">&#93;</span></div></div>
<p>Even on a set of very small networks (average size of 500 nodes each), the second approach is about 0.1s faster on each network.  That adds up to a significant savings, as I am calling rootSet on a fairly large number of networks.  When I move to larger networks, the savings should be even more significant.</p>
<p>I think my example supports your conclusion.  In addition to performing my filter as a list comprehension, choosing the proper NetworkX functions allowed me to replace my function call with a simple statement.  Thanks again for your input.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Python Hatchlings part 0 by Cortland Setlow</title>
		<link>http://robpatro.com/blog/?p=68#comment-23</link>
		<dc:creator>Cortland Setlow</dc:creator>
		<pubDate>Fri, 03 Dec 2010 17:51:14 +0000</pubDate>
		<guid isPermaLink="false">http://robpatro.com/blog/?p=68#comment-23</guid>
		<description>I found the following times on my netbook for these variants:

17.8 seconds:
print timeit.timeit(&#039;[x for x in xrange(100000) if (lambda y: y==5)(x)]&#039;, number=100)
11.4 seconds:
print timeit.timeit(&#039;l5 = lambda x: x==5; [x for x in xrange(100000) if l5(x)]&#039;, number=100)
9.8 seconds:
print timeit.timeit(&#039;filter(lambda x: x==5, xrange(100000))&#039;, number=100)
print timeit.timeit(&#039;l5 = lambda x: x==5; filter(l5, xrange(100000))&#039;, number=100)
3.5 seconds:
print timeit.timeit(&#039;[x for x in xrange(100000) if x==5]&#039;, number=100)

When the filter can be written as a statement, under my test conditions the list comprehension is fastest.  Have you tried inlining your filter function?  Python has historically required such ugly tricks for best performance.  

I&#039;d really like to know how big your list is and what you do in your filtering function.</description>
		<content:encoded><![CDATA[<p>I found the following times on my netbook for these variants:</p>
<p>17.8 seconds:<br />
print timeit.timeit('[x for x in xrange(100000) if (lambda y: y==5)(x)]', number=100)<br />
11.4 seconds:<br />
print timeit.timeit('l5 = lambda x: x==5; [x for x in xrange(100000) if l5(x)]', number=100)<br />
9.8 seconds:<br />
print timeit.timeit('filter(lambda x: x==5, xrange(100000))', number=100)<br />
print timeit.timeit('l5 = lambda x: x==5; filter(l5, xrange(100000))', number=100)<br />
3.5 seconds:<br />
print timeit.timeit('[x for x in xrange(100000) if x==5]', number=100)</p>
<p>When the filter can be written as a statement, under my test conditions the list comprehension is fastest.  Have you tried inlining your filter function?  Python has historically required such ugly tricks for best performance.  </p>
<p>I&#8217;d really like to know how big your list is and what you do in your filtering function.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

