class RichString < String
  def initialize(string)
    super(string)
    @data = string[0..0] # some manipulation here
  end

  def data
    @data
  end
end

word = RichString.new('word')
puts word # => word
puts word.data # => w

That was not special and worked as expected.
Then I happened to use instances of RichString as keys in a hash. Why shouldn't I? They were still normal Strings, and their data should be ignored when used in the hash.

map = {}
map[word] = :anything
word_key = map.keys[0]
puts word_key # => word
puts word_key.data # => nil

The last line warned me "instance variable @data not initialized". Oops, my little @data had gone missing, as the nil in the last line shows. At first I did not know what was causing the problem. I was baffled, as all tests were green and had good coverage. I spent some time digging and rewriting a lot of functionality until I found that the trouble showed up in Hash#keys() when my RichStrings were used as hash keys.

puts word == word_key # => true
puts word.object_id == word_key.object_id # => false

Aha, Hash changed the keys. It's reasonable to prohibit key changes, so a String passed as a key will be duplicated and frozen. (RTFM always helps ;-) But how did it do that? It did not call dup() on the RichString. As Hash is natively implemented, I ended up in the C source hash.c.

/*
 * call-seq:
 *   hsh[key] = value => value
 *   hsh.store(key, value) => value
 */
VALUE
rb_hash_aset(hash, key, val)
    VALUE hash, key, val;
{
    rb_hash_modify(hash);
    if (TYPE(key) != T_STRING || st_lookup(RHASH(hash)->tbl, key, 0)) {
        st_insert(RHASH(hash)->tbl, key, val);
    }
    else {
        st_add_direct(RHASH(hash)->tbl, rb_str_new4(key), val);
    }
    return val;
}

So when the key is a String and not already included in the hash, rb_str_new4 is called. (I just love descriptive names ;-) Furthermore, string.c revealed some fiddling with the original key.

VALUE
rb_str_new4(orig)
    VALUE orig;
{
    VALUE klass, str;

    if (OBJ_FROZEN(orig)) return orig;
    klass = rb_obj_class(orig);
    if (FL_TEST(orig, ELTS_SHARED) &&
        (str = RSTRING(orig)->aux.shared) &&
        klass == RBASIC(str)->klass) {
        long ofs;
        ofs = RSTRING(str)->len - RSTRING(orig)->len;
        if ((ofs > 0) || (!OBJ_TAINTED(str) && OBJ_TAINTED(orig))) {
            str = str_new3(klass, str);
            RSTRING(str)->ptr += ofs;
            RSTRING(str)->len -= ofs;
        }
    }
    else if (FL_TEST(orig, STR_ASSOC)) {
        str = str_new(klass, RSTRING(orig)->ptr, RSTRING(orig)->len);
    }
    else {
        str = str_new4(klass, orig);
    }
    OBJ_INFECT(str, orig);
    OBJ_FREEZE(str);
    return str;
}
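
Seen from the Ruby side, the two branches of rb_hash_aset quoted above can be observed with plain Strings. This is only a minimal sketch, assuming CRuby; the variable names are made up for illustration, and String.new is used so the key is definitely not frozen to begin with.

```ruby
# Observing the two branches of rb_hash_aset from Ruby, using plain Strings.
map = {}
key = String.new('word') # explicitly unfrozen String

# Branch 1: the key is a String that is not in the hash yet,
# so the hash stores a frozen copy instead of the object we passed in.
map[key] = :first
stored = map.keys[0]
puts stored.equal?(key) # false: the hash holds a different object
puts stored.frozen?     # true: the stored copy is frozen
puts key.frozen?        # false: the original is left alone

# Branch 2: an equal key already exists, so only the value is replaced
# and the key object stored earlier stays in place.
other = String.new('word')
map[other] = :second
puts map.keys[0].equal?(stored) # true: still the key stored in branch 1
puts map[key]                   # second
```

Note that equal? compares object identity, which is the same check as comparing object_id by hand.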
I didn't quite understand what was going on in rb_str_new4(), but it was sufficient to read a few lines: if the original string was frozen, it was used directly. I verified that.

map = {}
map[word.freeze] = :anything
word_key = map.keys[0]
puts word_key # => word
puts word_key.data # => w

Excellent, finally my @data showed up as expected. Fixing the problem added some complexity dealing with frozen values, but it worked.

Freeze your custom Ruby strings when you use them as keys in a hash (and want to retrieve them with Hash#keys()).
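
One way to make that rule hard to forget might be to freeze inside the constructor. The sketch below is not what I actually did; FrozenRichString is a hypothetical variant of RichString, and it only fits if the strings never need to be modified after creation.

```ruby
# FrozenRichString is a hypothetical variant of RichString that freezes itself.
class FrozenRichString < String
  attr_reader :data

  def initialize(string)
    super(string)
    @data = string[0..0] # same placeholder manipulation as in RichString
    freeze               # a frozen String key is stored by Hash as-is
  end
end

word = FrozenRichString.new('word')
map = {}
map[word] = :anything
word_key = map.keys[0]
puts word_key.equal?(word) # => true
puts word_key.data         # => w
```

The trade-off is that such a string can no longer be changed in place, for example with <<, so this suits rich strings that are treated as plain values.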














