Last week I released an update for one of my older modules, Test::DatabaseRow. In the course of this update I completely rewrote the guts of the module, turning the bloated procedural module into a set of clearly defined Moose-like Perl classes.
Why update now?
Test::DatabaseRow had been failing it’s tests since Perl 5.13.3 (it was another victim of the changing stringification of regular expressions breaking tests.) We’re currently planning to upgrade our systems at work from 5.12 to 5.14 in the new year, and (embarrassing) one of the key modules that breaks our 5.14 smoke is Test::DatabaseRow. Oooops.
Since I had my editor open, I decided it might be a good idea to switch to a more modern build system. And, since I was doing that, I thought it might be a good idea to fix one of my long standing todos (testing all rows returned from the database not just the first.)
In other words, once I’d started, I found it hard to stop, and before I knew it I had a reasonably big task on my hands.
The decision to refactor
When I first wrote Test::DatabaseRow back in 2003, like most testing modules of the time, it sported a simple Exporter based interface. The (mostly correct) wisdom was that simple procedural interfaces make it quicker to write tests. I still think that’s true, but:
Procedual programming ends up either with very long functions or excessive argument passing. The single function interface made the internals of Test::DatbaseRow quite difficult – to avoid having one giant function call I ended up passing all the arguments a multitude of helper functions and then passing the multiple return values of one function onto the next.
Many of the calls you write want to share the same defaults For example, the database handle to use, if we should be verbose in our output, should we do utf–8 conversion?… These are handled reasonably well with package level variables as defaults for arguments not passed to the function (which isn’t such a big deal in a short test script) but the code to support those within the testing class itself isn’t particularly clean having to cope with defaults evaluation in multiple places.
Only being able to return once from the function is a problem. Sometimes you might want to get extra data back after the test has completed. For example, when I wanted to allow you to optionally return the data extracted from the database I had to do it by allowing you to pass in the args to
row_okreferences to variables to be populated as it executes. Again, while this isn’t the end of the world from an external interface point of view, the effect it has on the internals (passing data up and down the stack) is horrible.
For the sake of the internals I wanted things to change. However: I didn’t want to break the API. I decided to split the module into two halves. An simple external facing module that would provide the procedural interface, and an internal object orientated module that would allow me to produce a cleaner implementation.
No Moose, but something similar
As I came to create Test::DatabaseRow::Object I found myself really wanting to write this new class with Moose. Now, Moose is a very heavyweight dependency; You don’t want to have to introduce a dependency on hundreds of modules just because you want to use a simple module to test your code. In fact, Test::DatabaseRow has no non-core dependencies apart from DBI itself, and I wanted to keep it this way with the refactoring. So, no Moose. No Mouse. No Moo. Just me and an editor.
In the end I compromised by deciding to code the module in a Moose “pull accessor” style even if I didn’t have Moose itself to provide the syntax to do this succinctly.
The approach I took was to put most of the logic for Test::DatabaseRow::Object – anything that potentially changes state of the object – into lazy loading read only accessors. Doing this allowed me to write my methods in a declarative style, relying entirely on the accessors performing the calculation to populate themselves the first time they’re accessed. For example. Test::DatabaseRow::Object has a read only accessor called
db_results which goes to the database the first time it’s accessed and executes the SQL that’s needed to populate it (and the SQL itself comes from
sql_and_bind which, unless explicitly stated in the constructor, is populated on first use from the
table accessors and so on.)
Since I wasn’t using Moose this produced a lot more code than we’d normally expect to see, but because I was following a standard Moose conventions it’s still fairly easy to see what’s going on (I even went as far to leave a Moose style
has accessor declaration in a comment above the blocks of code I had to write to sufficiently convey what I was doing.)
A results object
The second big architectural change I made was to stop talking directly to Test::Builder. Instead I changed to returning a results object which was capable of rendering itself out to Test::Builder on demand.
This change made the internals a lot easier to deal with. I was able to break the test up into several functions each returning a success or failure object. As soon I was able to detect failure in any of these functions I could return it to Test::DatabaseRow, but if I got a success – which now hadn’t been rendered out to Test::Builder yet – I could throw it away and move onto the next potentially failing test while I still had other things to check.
This made implementing my missing feature, the ability to report on all rows returned from the database not just the first one, much easier to implement.
Problems, problems, problems
After all this work, and spending hours improving the test coverage of the module, I still botched the release of 2.00. The old module tested the interface with DBI by checking against a database that was on my old laptop in 2003. Since I no longer had that laptop these tests weren’t being run (I actually deleted them since they were useless) and hence I didn’t notice when I’d broken the interface to DBI in my refactoring.
Ilmari pointed out I was being stupid a few minutes after I’d uploaded. Within ten minutes I’d written some DBI tests that test with SQLite (if you have DBD::SQLite installed) and released 2.01.
The biggest surprise was the next day where our overnight Hudson powered smokes failed at work, and the only thing that had changed was Test::DatabaseRow (we re-build our dependencies from the latest from CPAN every night, and it’d automatically picked up the changes.) Swapping in the old version passed. Putting the new version in failed. Had I missed something else in my testing?
No, I hadn’t.
After several hours of head scratching I eventually worked out that there was an extra bug in Test::DatabaseRow 1.04 that I’d not even realised was there, and I’d fixed it with 2.01. The tests were failing in our smokes but not because I’d broken the testing infrastructure, but because I’d fixed it and now I was detecting an actual problem in my Hudson test suite that had previously been unexposed.
Oh, I’m already planning Test::DatabaseRow 2.02. On github I’ve already closed a feature request that chromatic asked for in 2005. Want another feature? File a bug in RT. I’ll get to it sooner or later…