I was verifying the performance of an API endpoint using a flamegraph when I noticed that checking whether an object is in a Set or a Hash was consuming a lot of time. This post explains the problem and its solution.
First of all, checking whether an object is in a Set or a Hash is O(1) if we ignore collisions. Even in the presence of collisions, a lookup should cost much less than O(n), where n is the number of items in the set; otherwise, we are effectively operating on a linked list.
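As a quick sanity check of that claim, here is a small sketch (the sizes and names are my own, not from the original post) comparing a Set lookup with a linear Array scan, which behaves like the degenerate linked-list case:

```ruby
require "set"
require "benchmark"

# Rough illustration: Set membership stays near-constant as n grows,
# while Array#include? scans every element until it finds a match.
n = 100_000
array = (1..n).to_a
set = array.to_set

Benchmark.bm(14) do |x|
  # Worst case for the Array: the probe is the last element.
  x.report("Array lookup:") { 1_000.times { array.include?(n) } }
  x.report("Set lookup:") { 1_000.times { set.include?(n) } }
end
```

On any recent Ruby, the Set lookups finish orders of magnitude faster than the Array scans.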
In my case, checking whether an object was in a Set of 1000 objects was consuming 1.62 seconds, or 37% of my request time. For comparison, checking whether an integer is in a Set of 1000 integers takes less than a millisecond.
So, what is the problem? To dig into it, below is the class I was working with. I stripped out the business logic and changed the class name so we can concentrate on the problem and its solution.
```ruby
class CustomID
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def ==(other)
    other.class == self.class && other.id == id
  end
  # Hash and Set look members up via eql?, so keep it in sync with ==
  alias eql? ==

  def hash
    self.class.hash | id.hash
  end
end
```
There is nothing wrong with this code. However, if we use it frequently, it could be a bottleneck.
The method == first compares the two objects' classes and, when they match, compares the two ids. I improved it to use a single, faster integer comparison, which maps to one machine instruction. Integer comparison is O(b), where b is the number of bits used to represent the integer.
The method hash is very expensive: on every call it computes the hash of the class, computes the hash of the id, and combines them with a bitwise OR.
Both methods cited above are where our performance problem starts. To check whether an object is in a Set of 1000 elements, we need to compute the object's hash and then use equality checks against candidate members. Again, this is fine if we do it a few times. However, doing it a few million times daily can be a significant bottleneck.
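To see where that cost accumulates, here is an illustrative sketch (the InstrumentedID class and its counter are mine, not from the original code) that counts how many times building a Set and doing one lookup invoke hash. Set is backed by Hash, which locates members via hash and eql?, so the sketch aliases eql? to ==:

```ruby
require "set"

# Illustrative only: count how often #hash is invoked by Set operations.
class InstrumentedID
  attr_reader :id

  @hash_calls = 0
  class << self
    attr_accessor :hash_calls
  end

  def initialize(id)
    @id = id
  end

  def ==(other)
    other.class == self.class && other.id == id
  end
  alias eql? ==

  def hash
    self.class.hash_calls += 1
    self.class.hash | id.hash
  end
end

ids = Set.new((1..1000).map { |i| InstrumentedID.new(i) })
probe = InstrumentedID.new(500)
ids.include?(probe)

# Every insert and every lookup recomputed the expensive hash:
puts InstrumentedID.hash_calls
```

Each of the 1000 inserts plus the lookup itself triggers a fresh hash computation, which is exactly the work the memoization below avoids.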
The solution is to simplify the comparison in the method == and memoize the expensive hash operation. The new code is below:
```ruby
class ImprovedCustomID
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def ==(other)
    other.hash == hash
  end
  # Hash and Set look members up via eql?, so keep it in sync with ==
  alias eql? ==

  def hash
    @hash ||= self.class.hash | id.hash
  end

  def freeze
    hash
    super
  end
end
```
The new code overrides the method freeze to compute the hash before calling super. The override is necessary because the memoized version of hash sets a new instance variable, @hash, on its first call, and setting an instance variable on a frozen object raises an error.
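A minimal sketch of the failure mode the override prevents (the LazyHashID class is illustrative, not from the original code):

```ruby
# Without the freeze override, the memoized hash tries to write @hash
# on its first call, which raises FrozenError on a frozen object.
class LazyHashID
  def initialize(id)
    @id = id
  end

  def hash
    @hash ||= self.class.hash | @id.hash
  end
end

id = LazyHashID.new(42)
id.freeze

begin
  id.hash # first call tries to set @hash on a frozen object
rescue FrozenError => e
  puts e.class # FrozenError
end
```

Calling hash inside freeze, before super, warms the memo while the object is still mutable.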
To compare both versions, we can use the code below, which uses the benchmark-ips gem:
```ruby
require "benchmark/ips"

custom_id = CustomID.new(61000)
improved_custom_id = ImprovedCustomID.new(61000)

custom_ids = (1..1000).map { |i| CustomID.new(i * 1000) }
improved_custom_ids = (1..1000).map { |i| ImprovedCustomID.new(i * 1000) }

Benchmark.ips do |x|
  x.report("include? before improvements") { custom_ids.include?(custom_id) }
  x.report("include? after improvements") { improved_custom_ids.include?(improved_custom_id) }
  x.compare!
end
```
Running this benchmark, we get the following results:
```
Warming up --------------------------------------
include? before improvements
                        16.564k i/100ms
include? after improvements
                        23.623k i/100ms
Calculating -------------------------------------
include? before improvements
                        165.641k (± 0.3%) i/s -    828.200k in   5.000032s
include? after improvements
                        235.784k (± 0.3%) i/s -      1.181M in   5.009491s

Comparison:
 include? after improvements:   235784.5 i/s
include? before improvements:   165640.9 i/s - 1.42x slower
```
Therefore, in this simple case, the new code is 42% faster. In my production case, the new code was about 2x faster, bringing the check from 1.62 seconds down to 866.16 milliseconds.