Ruby performance tricks
Serhii Potapov September 02, 2012 #ruby #exception #performance #string #concatenation #reader #writer #accessor #variableI did some benchmarks to find out which alternatives to write code work faster. I wanna share it with you. All benchmarks are made against ruby 1.9.3p194 MRI.
Do not use exceptions for a control flow
The next example is pretty stupid but it shows how exceptions slow against conditional statements.
respond_to?(:mythical_method) ? self.mythical_method : nil
end
self.mythical_method
rescue NoMethodError
nil
end
end
obj = Obj.new
= 10_000_000
puts RUBY_DESCRIPTION
Benchmark.bm(15, ) do
rescue_report = x.report() { N.times { obj.with_rescue } }
condition_report = x.report() { N.times { obj.with_condition } }
[rescue_report / condition_report]
end
MRI 1.9.3: ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-linux] user system total real rescue: 111.530000 2.650000 114.180000 (115.837103) condition: 2.620000 0.010000 2.630000 ( 2.633154) rescue/condition: 42.568702 265.000000 NaN ( 43.991767)
MRI 1.8.7 (REE has similar result): ruby 1.8.7 (2011-12-28 patchlevel 357) [x86_64-linux] user system total real rescue: 80.510000 0.940000 81.450000 ( 81.529022) if: 3.320000 0.000000 3.320000 ( 3.330166) rescue/condition: 24.250000 inf -nan ( 24.481970)
String concatenation
Avoid using +=
to concatenate strings in favor of <<
method.
The result is absolutely the same: add a string to the end of an existing one.
What is the difference then?
See the example:
str1 =
str2 =
str1.object_id # => 16241320
str1 += str2 # str1 = str1 + str2
str1.object_id # => 16241240, id is changed
str1 << str2
str1.object_id # => 16241240, id is the same
When you use +=
ruby creates a temporal object which is result of str1 + str2
.
Then it overrides str1
variable with reference to the new built object.
On other hand <<
modifies existing one.
As a result of using +=
you have the next disadvantages:
- More calculation to join strings.
- Redundant string object in memory (previous value of
str1
), which approximates time when GC will trigger.
How +=
is slow? Basically it depends on length of strings you have operation with.
= 1000
= 10
5.times do
length = BASIC_LENGTH * (10 ** factor)
puts * 60 +
Benchmark.bm(10, ) do
concat_report = x.report() do
str1 =
str2 = * length
N.times { str1 += str2 }
end
modify_report = x.report() do
str1 =
str2 = * length
N.times { str1 << str2 }
end
[concat_report / modify_report]
end
end
Output:
____________________________________________________________
LENGTH: 10
user system total real
+= 0.000000 0.000000 0.000000 ( 0.004671)
<< 0.000000 0.000000 0.000000 ( 0.000176)
+= VS << NaN NaN NaN ( 26.508796)
____________________________________________________________
LENGTH: 100
user system total real
+= 0.020000 0.000000 0.020000 ( 0.022995)
<< 0.000000 0.000000 0.000000 ( 0.000226)
+= VS << Inf NaN NaN (101.845829)
____________________________________________________________
LENGTH: 1000
user system total real
+= 0.270000 0.120000 0.390000 ( 0.390888)
<< 0.000000 0.000000 0.000000 ( 0.001730)
+= VS << Inf Inf NaN (225.920077)
____________________________________________________________
LENGTH: 10000
user system total real
+= 3.660000 1.570000 5.230000 ( 5.233861)
<< 0.000000 0.010000 0.010000 ( 0.015099)
+= VS << Inf 157.000000 NaN (346.629692)
____________________________________________________________
LENGTH: 100000
user system total real
+= 31.270000 16.990000 48.260000 ( 48.328511)
<< 0.050000 0.050000 0.100000 ( 0.105993)
+= VS << 625.400000 339.800000 NaN (455.961373)
Be careful with calculation within iterators
Assume you need to write a function to convert an array into a hash where keys and values are same as elements of the array:
func([1, 2, 3]) # => {1 => 1, 2 => 2, 3 => 3}
The next solution would satisfy the requirements:
.inject({}) { h.merge(e => e) }
end
array
And would be extremely slow with big portions of data because it contains
nested methods (inject
and merge
), so it's __ O(n2) __ algorithm.
But it's obviously that it must be __ O(n) __.
Consider the next:
.inject({}) { h[e] = e; h }
end
array
In this case we do only one iteration over an array without any hard calculation within the iterator.
See the benchmark:
.inject({}) { h[e] = e; h }
end
array.inject({}) { h.merge(e => e) }
end
= 10
4.times do
size = BASE_SIZE * (10 ** factor)
params = (0..size).to_a
puts * 60 +
Benchmark.bm(10) do
x.report( ) { n_func(params) }
x.report() { n2_func(params) }
end
end
array
Output:
____________________________________________________________
SIZE: 10
user system total real
O(n) 0.000000 0.000000 0.000000 ( 0.000014)
O(n2) 0.000000 0.000000 0.000000 ( 0.000033)
____________________________________________________________
SIZE: 100
user system total real
O(n) 0.000000 0.000000 0.000000 ( 0.000043)
O(n2) 0.000000 0.000000 0.000000 ( 0.001070)
____________________________________________________________
SIZE: 1000
user system total real
O(n) 0.000000 0.000000 0.000000 ( 0.000347)
O(n2) 0.130000 0.000000 0.130000 ( 0.127638)
____________________________________________________________
SIZE: 10000
user system total real
O(n) 0.020000 0.000000 0.020000 ( 0.019067)
O(n2) 17.850000 0.080000 17.930000 ( 17.983827)
It's an obvious and trivial example. Just keep in mind to not do hard calculation within iterators if it's possible.
Use bang! methods
In many cases bang methods do the same as there non-bang analogues but without
duplication an object. The previous example with merge!
would be much faster:
.inject({}) { h.merge!(e => e) }
end
array.inject({}) { h.merge(e => e) }
end
= 10_000
array = (0..N).to_a
Benchmark.bm(10) do
x.report() { merge!(array) }
x.report() { merge(array) }
end
array
Output:
user system total real
merge! 0.010000 0.000000 0.010000 ( 0.011370)
merge 17.710000 0.000000 17.710000 ( 17.840856)
Use instance variables
Accessing instance variable directly is about two times faster than accessing them with accessor methods:
attr_accessor :var
@n = n
@var = 22
end
Benchmark.bm(10) do
x.report() { @n.times { @var } }
x.report( ) { @n.times { var } }
x.report() { @n.times { @var = i } }
x.report() { @n.times { self.var = i } }
end
end
end
metric = Metric.new(100_000_000)
metric.run
Output:
user system total real
@var 6.980000 0.010000 6.990000 ( 7.193725)
var 13.040000 0.000000 13.040000 ( 13.131711)
@var = 7.960000 0.000000 7.960000 ( 8.242603)
self.var = 14.910000 0.010000 14.920000 ( 15.960125)
Parallel assignment is slower
= 10_000_000
Benchmark.bm(15) do
x.report() do
N.times do
a, b = 10, 20
end
end
x.report() do
N.times do
a = 10
b = 20
end
end
end
Output:
user system total real
parallel 1.900000 0.000000 1.900000 ( 1.928063)
consequentially 0.880000 0.000000 0.880000 ( 0.879675)
Dynamic method defention
What is the faster way to define method dynamically: class_eval
with a code string
or using define_method
? Which way generated methods work faster?
= 1_000_000
N.times do
class_eval(, __FILE__, __LINE__ + 1)
end
end
N.times do
define_method() do
i
end
end
end
end
Benchmark.bm(22) do
x.report() { Metric.class_eval_with_string }
x.report() { Metric.with_define_method }
metric = Metric.new
x.report() { Metric::N.times { metric.smeth_1 } }
x.report() { Metric::N.times { metric.dmeth_1 } }
end
Output:
user system total real
class_eval with string 219.840000 0.720000 220.560000 (221.933074)
define_method 61.280000 0.240000 61.520000 ( 62.070911)
string method 0.110000 0.000000 0.110000 ( 0.111433)
dynamic method 0.150000 0.000000 0.150000 ( 0.156537)
So class_eval
works slower but it's preferred since methods generated with
class_eval
and a string of code work faster.
Links
- How Slow Are Ruby Exceptions
- 6 Optimization Tips for Ruby MRI
(NOTE:
Symbol#to_proc
was ported to Ruby and it's not slow anymore) - "Writing Efficient Ruby Code" by Dr. Stefan Kaes
- Top 10 Ruby on Rails performance tips
- 20 Ruby Performance Tips
- A Beginner’s Guide to Big O Notation
Danke.