概要
先日社内のPRコードレビューを受けて、Array#include?
よりSet#include?
の方が高速だということを知った
その時にソースを追ってみたまとめ
要約
Array#include?
は計算量がO(n)
Set#include?
は計算量がO(1)
Array#include?
1VALUE 2rb_ary_includes(VALUE ary, VALUE item) 3{ 4 long i; 5 VALUE e; 6 for (i=0; i<RARRAY_LEN(ary); i++) { 7 e = RARRAY_AREF(ary, i); 8 if (rb_equal(e, item)) { 9 return Qtrue; 10 } 11 } 12 return Qfalse; 13}
refs. https://github.com/ruby/ruby/blob/master/array.c#L5417-L5430
1/** 2 * This function is an optimised version of calling `#==`. It checks equality 3 * between two objects by first doing a fast identity check using using C's 4 * `==` (same as `BasicObject#equal?`). If that check fails, it calls `#==` 5 * dynamically. This optimisation actually affects semantics, because when 6 * `#==` returns false for the same object obj, `rb_equal(obj, obj)` would 7 * still return true. This happens for `Float::NAN`, where `Float::NAN == 8 * Float::NAN` is `false`, but `rb_equal(Float::NAN, Float::NAN)` is `true`. 9 * 10 * @param[in] lhs Comparison LHS. 11 * @param[in] rhs Comparison RHS. 12 * @retval RUBY_Qtrue They are the same. 13 * @retval RUBY_Qfalse They are different. 14 */ 15 16VALUE rb_equal(VALUE lhs, VALUE rhs);
refs. https://github.com/ruby/ruby/blob/master/include/ruby/ruby.h#L164-L178
説明によるとオブジェクト同士の比較ということ
1describe "rb_equal" do 2 it "returns true if the arguments are the same exact object" do 3 s = "hello" 4 @o.rb_equal(s, s).should be_true 5 end 6 it "calls == to check equality and coerces to true/false" do 7 m = mock("string") 8 m.should_receive(:==).and_return(8) 9 @o.rb_equal(m, "hello").should be_true 10 m2 = mock("string") 11 m2.should_receive(:==).and_return(nil) 12 @o.rb_equal(m2, "hello").should be_false 13 end 14end
refs. https://github.com/ruby/ruby/blob/master/spec/ruby/optional/capi/object_spec.rb#L764-L779
念の為テストも確認して同値比較メソッドであることを確認
つまりArray#include?
は1つずつ同値比較をしているので計算量がO(n)
Set#include?
1# Returns true if the set contains the given object. 2# 3# Note that <code>include?</code> and <code>member?</code> do not test member 4# equality using <code>==</code> as do other Enumerables. 5# 6# See also Enumerable#include? 7 8def include?(o) 9 @hash[o] 10end 11alias member? include?
refs. https://github.com/ruby/ruby/blob/master/lib/set.rb#L397-L406
hashでのアクセスにより O(1)
の計算量を実現していることが分かった
ちなみにSetはイニシャライズでhashとして扱われている
1def initialize(enum = nil, &block) # :yields: o 2 @hash ||= Hash.new(false) 3 enum.nil? and return 4 if block 5 do_with_enum(enum) { |o| add(block[o]) } 6 else 7 merge(enum) 8 end 9end
refs. https://github.com/ruby/ruby/blob/master/lib/set.rb#L245-L255