Nick Sieger: Tag performance
2010-11-22T18:07:01+00:00
Nick Sieger
2008-01-17T23:48:00+00:00
Next performance fix: Builder::XChar
<p>Next up in our performance series: <code>Builder::XChar</code>. (Another fine Sam Ruby production!) While this piece of code in the Builder library strikes me as perfectly fine, it also tends to slow down quite a bit with larger documents or chunks of text.</p>
<p>Our path to the bottleneck is as follows: <code>ActiveRecord::Base#to_xml => Builder::XMLMarkup#text! => String#to_xs => Fixnum#xchr</code>. Consider:</p>
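<p>To make the cost of that call chain concrete, here is a minimal sketch of per-codepoint escaping. It is not Builder's actual source: <code>escape_codepoint</code> and <code>slow_xs</code> are hypothetical names, and the handling of invalid control characters and Windows-1252 input that the real library performs is left out. The point is only that every character in the document costs a method call and a small string allocation.</p>

```ruby
# Hypothetical, simplified sketch of per-codepoint XML escaping.
# Not Builder's real implementation; names are illustrative only.
PREDEFINED = { 38 => '&amp;', 60 => '&lt;', 62 => '&gt;' }.freeze

# Escape a single Unicode codepoint (an Integer).
def escape_codepoint(n)
  return PREDEFINED[n] if PREDEFINED.key?(n)
  return n.chr if n < 128      # plain ASCII passes through unchanged
  format('&#%d;', n)           # anything else becomes a numeric entity
end

# Escape a whole string: one method call and one small String per character.
def slow_xs(text)
  text.unpack('U*').map { |n| escape_codepoint(n) }.join
end

puts slow_xs('a < b & c')
```

<p>For a page-sized string this loop runs once per character, which is why the cost grows so quickly with larger documents.</p>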
<div class="typocode"><pre><code class="typocode_ruby "><span class="ident">require</span> <span class="punct">'</span><span class="string">rubygems</span><span class="punct">'</span>
<span class="ident">gem</span> <span class="punct">'</span><span class="string">activesupport</span><span class="punct">'</span>
<span class="ident">require</span> <span class="punct">'</span><span class="string">active_support</span><span class="punct">'</span>
<span class="ident">require</span> <span class="punct">'</span><span class="string">benchmark</span><span class="punct">'</span>
<span class="keyword">module </span><span class="module">Benchmark</span>
  <span class="keyword">class </span><span class="punct">&lt;&lt;</span> <span class="constant">self</span>
    <span class="keyword">def </span><span class="method">report</span><span class="punct">(&amp;</span><span class="ident">block</span><span class="punct">)</span>
      <span class="ident">n</span> <span class="punct">=</span> <span class="number">10</span>
      <span class="ident">times</span> <span class="punct">=</span> <span class="punct">(</span><span class="number">1</span><span class="punct">..</span><span class="number">10</span><span class="punct">).</span><span class="ident">map</span> <span class="keyword">do</span>
        <span class="ident">bm</span> <span class="punct">=</span> <span class="ident">measure</span><span class="punct">(&amp;</span><span class="ident">block</span><span class="punct">)</span>
        <span class="ident">puts</span> <span class="ident">bm</span>
        <span class="ident">bm</span>
      <span class="keyword">end</span>
      <span class="ident">sum</span> <span class="punct">=</span> <span class="ident">times</span><span class="punct">.</span><span class="ident">inject</span><span class="punct">(</span><span class="number">0</span><span class="punct">)</span> <span class="punct">{|</span><span class="ident">s</span><span class="punct">,</span><span class="ident">t</span><span class="punct">|</span> <span class="ident">s</span> <span class="punct">+</span> <span class="ident">t</span><span class="punct">.</span><span class="ident">real</span><span class="punct">}</span>
      <span class="ident">mean</span> <span class="punct">=</span> <span class="ident">sum</span> <span class="punct">/</span> <span class="ident">n</span>
      <span class="ident">sumsq</span> <span class="punct">=</span> <span class="ident">times</span><span class="punct">.</span><span class="ident">inject</span><span class="punct">(</span><span class="number">0</span><span class="punct">)</span> <span class="punct">{|</span><span class="ident">s</span><span class="punct">,</span><span class="ident">t</span><span class="punct">|</span> <span class="ident">s</span> <span class="punct">+</span> <span class="ident">t</span><span class="punct">.</span><span class="ident">real</span> <span class="punct">*</span> <span class="ident">t</span><span class="punct">.</span><span class="ident">real</span><span class="punct">}</span>
      <span class="ident">sd</span> <span class="punct">=</span> <span class="constant">Math</span><span class="punct">.</span><span class="ident">sqrt</span><span class="punct">((</span><span class="ident">sumsq</span> <span class="punct">-</span> <span class="punct">(</span><span class="ident">sum</span> <span class="punct">*</span> <span class="ident">sum</span> <span class="punct">/</span> <span class="ident">n</span><span class="punct">))</span> <span class="punct">/</span> <span class="punct">(</span><span class="ident">n</span> <span class="punct">-</span> <span class="number">1</span><span class="punct">))</span>
      <span class="ident">puts</span><span class="punct">("</span><span class="string">Mean: %0.6f SDev: %0.6f</span><span class="punct">"</span> <span class="punct">%</span> <span class="punct">[</span><span class="ident">mean</span><span class="punct">,</span> <span class="ident">sd</span><span class="punct">])</span>
    <span class="keyword">end</span>
  <span class="keyword">end</span>
<span class="keyword">end</span>
<span class="comment"># http://blog.nicksieger.com/files/page.xml</span>
<span class="ident">page</span> <span class="punct">=</span> <span class="constant">File</span><span class="punct">.</span><span class="ident">open</span><span class="punct">("</span><span class="string">page.xml</span><span class="punct">")</span> <span class="punct">{|</span><span class="ident">f</span><span class="punct">|</span> <span class="ident">f</span><span class="punct">.</span><span class="ident">read</span> <span class="punct">}</span>
<span class="constant">Benchmark</span><span class="punct">.</span><span class="ident">report</span> <span class="keyword">do</span>
  <span class="number">20</span><span class="punct">.</span><span class="ident">times</span> <span class="punct">{</span> <span class="ident">page</span><span class="punct">.</span><span class="ident">to_xs</span> <span class="punct">}</span>
<span class="keyword">end</span></code></pre></div>
<p>On Ruby and JRuby, this produces:</p>
<pre><code>$ ruby to_xs.rb
21.430000 0.400000 21.830000 ( 22.022769)
21.530000 0.360000 21.890000 ( 22.005737)
21.540000 0.370000 21.910000 ( 22.065165)
21.530000 0.370000 21.900000 ( 22.028591)
21.500000 0.350000 21.850000 ( 21.990395)
21.550000 0.370000 21.920000 ( 22.033164)
21.520000 0.360000 21.880000 ( 21.984129)
21.550000 0.370000 21.920000 ( 22.116802)
21.550000 0.370000 21.920000 ( 22.051421)
21.520000 0.380000 21.900000 ( 22.084736)
Mean: 22.038291 SDev: 0.041985
$ jruby -J-server to_xs.rb
79.112000 0.000000 79.112000 ( 79.112000)
81.480000 0.000000 81.480000 ( 81.481000)
84.745000 0.000000 84.745000 ( 84.745000)
84.384000 0.000000 84.384000 ( 84.384000)
121.933000 0.000000 121.933000 (121.933000)
85.533000 0.000000 85.533000 ( 85.532000)
82.762000 0.000000 82.762000 ( 82.763000)
82.090000 0.000000 82.090000 ( 82.090000)
81.298000 0.000000 81.298000 ( 81.299000)
80.774000 0.000000 80.774000 ( 80.773000)
Mean: 86.411200 SDev: 12.635700
</code></pre>
<p>(Hmm, I must have accidentally swapped in some large program in the middle of that JRuby run. The perils of benchmarking on a desktop machine. I don’t claim that the numbers are scientific, just illustrative!)</p>
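<p>As a cross-check on the <code>Mean</code> and <code>SDev</code> lines, the following sketch recomputes them from the ten wall-clock times of the <code>ruby</code> run. It uses the two-pass form of the sample standard deviation rather than the sum-of-squares shortcut in the script; <code>mean_and_sdev</code> is a hypothetical helper name, not part of the <code>Benchmark</code> library.</p>

```ruby
# Wall-clock ("real") times copied from the ruby run in the output above.
times = [22.022769, 22.005737, 22.065165, 22.028591, 21.990395,
         22.033164, 21.984129, 22.116802, 22.051421, 22.084736]

# Hypothetical helper: returns [mean, sample standard deviation].
def mean_and_sdev(samples)
  n = samples.size
  mean = samples.inject(0.0) { |s, t| s + t } / n
  # Two-pass form: sum the squared deviations from the mean, divide by n - 1.
  sum_sq_dev = samples.inject(0.0) { |s, t| s + (t - mean)**2 }
  [mean, Math.sqrt(sum_sq_dev / (n - 1))]
end

mean, sd = mean_and_sdev(times)
puts format('Mean: %0.6f SDev: %0.6f', mean, sd)
```

<p>Both forms give the same result up to rounding; the two-pass version is simply less prone to cancellation error when the samples are large relative to their spread.</p>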
<p>Fortunately, the fix again is very simple, and has <a href="http://groups.google.com/group/rubyjam/browse_thread/thread/82a9ddb762019bcc">previously</a> <a href="http://dev.rubyonrails.org/changeset/7773">been acknowledged</a>. The latest (unreleased?) <a href="http://code.whytheluckystiff.net/hpricot/" title="Hpricot, a fast and delightful HTML parser">Hpricot</a> has a new native extension, <code>fast_xs</code>, which is an almost drop-in replacement for the pure-Ruby <code>String#to_xs</code>. (Almost, because it creates the method <code>String#fast_xs</code> instead of <code>String#to_xs</code>. ActiveSupport 2.0.2 and later <a href="http://dev.rubyonrails.org/browser/trunk/activesupport/lib/active_support/core_ext/string/xchar.rb?rev=7773">takes care of aliasing it for you</a>.) As it happens, I recently ported <code>fast_xs</code> to JRuby as part of upgrading extensions that have Java code in them, without knowing it would come in handy so soon. The patch for that is <a href="http://code.whytheluckystiff.net/hpricot/ticket/131">here</a>.</p>
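<p>For readers not on a recent ActiveSupport, the aliasing idea can be sketched roughly as below. This is an illustration under assumptions, not the code from the linked changeset: it assumes the extension can be loaded with <code>require 'fast_xs'</code>, and the pure-Ruby fallback shown here is a deliberately simplified stand-in that only escapes <code>&amp;</code>, <code>&lt;</code> and <code>&gt;</code>.</p>

```ruby
# Hedged sketch: use a native String#fast_xs when it can be loaded,
# otherwise keep whatever String#to_xs already exists (or a simple fallback).
class String
  unless method_defined?(:to_xs)
    # Simplified, hypothetical fallback; the real Builder version does more.
    def to_xs
      gsub(/[&<>]/, '&' => '&amp;', '<' => '&lt;', '>' => '&gt;')
    end
  end
end

begin
  require 'fast_xs'              # assumption: the extension loads under this name
  class String
    alias_method :to_xs, :fast_xs
  end
rescue LoadError
  # Native extension not installed; the existing String#to_xs stays in place.
end

puts 'a < b & c'.to_xs
```

<p>Callers keep using <code>to_xs</code> either way, so the native extension stays an optional speedup rather than a hard dependency.</p>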
<p>I have put the latest Hpricot gems on my server, so you can install them yourself (for either Ruby or JRuby):</p>
<pre><code>gem install hpricot --source http://caldersphere.net
</code></pre>
<p>or</p>
<pre><code>jruby -S gem install hpricot --source http://caldersphere.net
</code></pre>
<p>With that installed, the script now produces these results (going by the means, roughly 45 times faster on Ruby and about 100 times faster on JRuby):</p>
<pre><code>$ ruby to_xs.rb
0.460000 0.080000 0.540000 ( 0.537793)
0.420000 0.070000 0.490000 ( 0.501965)
0.430000 0.070000 0.500000 ( 0.501359)
0.400000 0.070000 0.470000 ( 0.484495)
0.400000 0.070000 0.470000 ( 0.479995)
0.400000 0.070000 0.470000 ( 0.469118)
0.390000 0.070000 0.460000 ( 0.468864)
0.390000 0.070000 0.460000 ( 0.465009)
0.390000 0.060000 0.450000 ( 0.452902)
0.390000 0.070000 0.460000 ( 0.466881)
Mean: 0.482838 SDev: 0.024926
$ jruby -J-server to_xs.rb
0.882000 0.000000 0.882000 ( 0.883000)
0.832000 0.000000 0.832000 ( 0.832000)
0.851000 0.000000 0.851000 ( 0.850000)
0.837000 0.000000 0.837000 ( 0.837000)
0.846000 0.000000 0.846000 ( 0.846000)
0.843000 0.000000 0.843000 ( 0.843000)
0.835000 0.000000 0.835000 ( 0.835000)
0.825000 0.000000 0.825000 ( 0.826000)
0.830000 0.000000 0.830000 ( 0.830000)
0.834000 0.000000 0.834000 ( 0.833000)
Mean: 0.841500 SDev: 0.016379
</code></pre>