RSpec 2 Matcher Fun
Posted by Nick Sieger Thu, 20 Jan 2011 17:47:21 GMT
I was troubleshooting some JRuby code that transforms Java camelCase method names into Ruby snake_case form. We had a bunch of specs that exercised this conversion, for example:
describe "Java instance method names" do
  it "should present javabean properties as attribute readers and writers" do
    methods = MethodNames.instance_methods
    methods.should include("getValue2")
    methods.should include("get_value2")
    methods.should include("value2")
    methods.should include("setValue2")
    methods.should include("set_value2")
    methods.should include("value2=")
  end
end
The problem comes when these specs fail. The default error message produced by the #include matcher looks like:
Failures:
  1) Java instance method names should present javabean properties as attribute readers and writers
     Failure/Error: methods.should include("get_value2")
       expected [...full contents of array here...] to include "get_value2"
       Diff:
       @@ -1,2 +1,186 @@
       -get_value2
       +[...all entries, one per line here...]
That’s not a terrible message, but when your array contains over 100 entries (like an array of method names), it could be a lot better. In particular, I kept scanning the failure message’s big list, unable to clearly see why the methods I was expecting weren’t there.
What I wanted to see was how my changes to the regex that splits a Java camelCase name affected the conversion. So, what I needed was a report of which method names were the closest to the ones that were not in the list. Hey, sounds like a good reason to implement a custom matcher, and take a diversion into fuzzy string matching algorithms!
I settled on porting the Wikipedia pseudocode for the Levenshtein distance, which measures how close two strings are by counting the single-character insertions, deletions, and substitutions needed to turn one into the other. I looked around and found existing Levenshtein ports for Ruby, but they use native code for performance. I don’t need performance because I’m only calling the Levenshtein function when there is a failure. Of course, pure Ruby code is more portable too!
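For readers who want to see what such a port might look like, here is a minimal pure-Ruby sketch of the standard dynamic-programming formulation of Levenshtein distance. It is not the function from this post's full source (which isn't reproduced here); the method name and the full-matrix approach are my assumptions, chosen for readability over speed.

```ruby
# Illustrative sketch only: a straightforward full-matrix Levenshtein distance.
# d[i][j] holds the edit distance between the first i characters of a
# and the first j characters of b.
def levenshtein(a, b)
  a = a.to_s
  b = b.to_s
  rows = a.length
  cols = b.length

  d = Array.new(rows + 1) { Array.new(cols + 1, 0) }
  (0..rows).each { |i| d[i][0] = i }  # turn a prefix into "" by deleting i chars
  (0..cols).each { |j| d[0][j] = j }  # turn "" into a prefix by inserting j chars

  (1..rows).each do |i|
    (1..cols).each do |j|
      cost = a[i - 1] == b[j - 1] ? 0 : 1
      d[i][j] = [
        d[i - 1][j] + 1,        # deletion
        d[i][j - 1] + 1,        # insertion
        d[i - 1][j - 1] + cost  # substitution, or a free match
      ].min
    end
  end

  d[rows][cols]
end

puts levenshtein("kitten", "sitting")
```

The matrix uses memory proportional to the product of the two string lengths, which is fine for method names that are only compared when a spec has already failed.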
The other change I made in the specs was to pass all strings in a single matcher rather than one name per expectation, so we can see all names that fail, not just the first.
So now, the new spec looks more like this:
describe "Java instance method names" do
  let(:members) { MethodNames.instance_methods }

  it "should present javabean properties as attribute readers and writers" do
    members.should have_strings("getValue2",
                                "get_value2",
                                "value2",
                                "setValue2",
                                "set_value2",
                                "value2=")
  end
end
The custom RSpec matcher #have_strings is declared like so:
RSpec::Matchers.define :have_strings do |*strings|
  match do |container|
    @included, @missing = [], []
    strings.flatten.each do |s|
      if container.include?(s)
        @included << s
      else
        @missing << s
      end
    end
    @missing.empty?
  end

  failure_message_for_should do |container|
    "expected array of #{container.length} elements to include #{@missing.inspect}.\n" +
      "#{closest_match_message(@missing, container)}"
  end

  failure_message_for_should_not do |container|
    "expected array of #{container.length} elements to not include #{@included.inspect}."
  end

  def closest_match_message(missing, container)
    missing.map do |m|
      groups = container.group_by {|x| levenshtein(m, x) }
      " closest match for #{m.inspect}: #{groups[groups.keys.min].inspect}"
    end.join("\n")
  end
end
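The group_by step in closest_match_message is what lets the message report several names when they tie for the smallest distance. The standalone sketch below isolates that idea outside RSpec. The helper names edit_distance and closest_matches and the sample names are hypothetical, and the compact two-row distance function is only a stand-in so the snippet runs by itself.

```ruby
# Stand-in distance function (assumption): a compact two-row Levenshtein,
# included only so this sketch is self-contained.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Hypothetical helper: bucket candidates by distance to the target, then
# return the bucket with the smallest distance (all ties are kept).
def closest_matches(target, candidates)
  groups = candidates.group_by { |c| edit_distance(target, c) }
  groups[groups.keys.min]
end

# Made-up sample names, not taken from the specs above.
names = %w[firstname first_names last_name getFirstName]
p closest_matches("first_name", names)
```

One thing to keep in mind with this approach: if the candidate list is empty, there are no groups, so the lookup returns nil rather than an empty array.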
I omitted the #levenshtein function here for brevity. (You can view the full source for details.) Now our failing spec output looks like:
Failures:
  1) Java instance method names should present javabean properties as attribute readers and writers
     Failure/Error: members.should have_strings("getValue2",
       expected array of 185 elements to include ["get_my_value", "my_value", "set_my_value", "my_value="].
        closest match for "get_my_value": ["get_myvalue", "set_myvalue"]
        closest match for "my_value": ["myvalue"]
        closest match for "set_my_value": ["get_myvalue", "set_myvalue"]
        closest match for "my_value=": ["myvalue="]
Now the failure message is giving me exactly the information I need. Much better, don’t you think?