Test::DatabaseRow 2

Last week I released an update for one of my older modules, Test::DatabaseRow. In the course of this update I completely rewrote the guts of the module, turning the bloated procedural module into a set of clearly defined Moose-like Perl classes.

Why update now?

Test::DatabaseRow had been failing its tests since Perl 5.13.3 (it was another victim of the changed stringification of regular expressions breaking tests.) We’re currently planning to upgrade our systems at work from 5.12 to 5.14 in the new year, and (embarrassingly) one of the key modules that breaks our 5.14 smoke is Test::DatabaseRow. Oops.

Since I had my editor open, I decided it might be a good idea to switch to a more modern build system. And, since I was doing that, I thought it might be a good idea to fix one of my long-standing todos (testing all rows returned from the database, not just the first.)

In other words, once I’d started, I found it hard to stop, and before I knew it I had a reasonably big task on my hands.

The decision to refactor

When I first wrote Test::DatabaseRow back in 2003, like most testing modules of the time, it sported a simple Exporter-based interface. The (mostly correct) wisdom was that simple procedural interfaces make it quicker to write tests. I still think that’s true, but:

  • Procedural programming ends up with either very long functions or excessive argument passing. The single-function interface made the internals of Test::DatabaseRow quite difficult to work with: to avoid having one giant function I ended up passing all the arguments to a multitude of helper functions and then passing the multiple return values of one function on to the next.

  • Many of the calls you write want to share the same defaults. For example: which database handle to use, whether we should be verbose in our output, whether we should do UTF-8 conversion… These are handled reasonably well with package-level variables acting as defaults for arguments not passed to the function (which isn’t such a big deal in a short test script,) but the code to support them within the testing class itself isn’t particularly clean, having to cope with defaults evaluation in multiple places.

  • Only being able to return once from the function is a problem. Sometimes you might want to get extra data back after the test has completed. For example, when I wanted to allow you to optionally return the data extracted from the database, I had to do it by allowing you to pass, in the args to row_ok, references to variables to be populated as it executes. Again, while this isn’t the end of the world from an external interface point of view, the effect it has on the internals (passing data up and down the stack) is horrible.
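
The package-level defaults described in the second point can be sketched in plain Perl. This is a hypothetical illustration of the technique only; the names below are invented, and Test::DatabaseRow’s real variables and argument handling differ:

```perl
package Example::Checker;

use strict;
use warnings;

# package-level variable acting as a shared default
our $dbh;

sub check_ok {
    my %args = @_;

    # fall back to the package variable when no handle is passed;
    # this kind of defaults evaluation ends up scattered through the code
    my $handle = exists $args{dbh} ? $args{dbh} : $dbh;
    return defined $handle ? "using $handle" : "no handle";
}

package main;

# the test script sets the default once...
local $Example::Checker::dbh = "default-handle";

# ...and every subsequent call picks it up implicitly
print Example::Checker::check_ok(), "\n";
print Example::Checker::check_ok( dbh => "explicit-handle" ), "\n";
```

This keeps individual calls short in the test script, at the cost of the class having to check for the default in every place an argument might be missing.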

For the sake of the internals I wanted things to change; however, I didn’t want to break the API. I decided to split the module into two halves: a simple external-facing module that would provide the procedural interface, and an internal object-oriented module that would allow me to produce a cleaner implementation.

No Moose, but something similar

As I came to create Test::DatabaseRow::Object I found myself really wanting to write this new class with Moose. Now, Moose is a very heavyweight dependency; you don’t want to have to introduce a dependency on hundreds of modules just because you want to use a simple module to test your code. In fact, Test::DatabaseRow has no non-core dependencies apart from DBI itself, and I wanted to keep it that way through the refactoring. So, no Moose. No Mouse. No Moo. Just me and an editor.

In the end I compromised by deciding to code the module in a Moose “pull accessor” style even if I didn’t have Moose itself to provide the syntax to do this succinctly.

The approach I took was to put most of the logic for Test::DatabaseRow::Object (anything that potentially changes the state of the object) into lazy-loading read-only accessors. Doing this allowed me to write my methods in a declarative style, relying entirely on the accessors performing the calculation needed to populate themselves the first time they’re accessed. For example, Test::DatabaseRow::Object has a read-only accessor called db_results which goes to the database the first time it’s accessed and executes the SQL needed to populate it (and the SQL itself comes from sql_and_bind which, unless explicitly set in the constructor, is populated on first use from the where and table accessors, and so on.)

Since I wasn’t using Moose this produced a lot more code than we’d normally expect to see, but because I was following standard Moose conventions it’s still fairly easy to see what’s going on (I even went as far as to leave a Moose-style has declaration in a comment above each block of code I had to write, to convey what I was doing.)
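
Here’s a minimal sketch of what such a hand-rolled lazy accessor looks like. The class and builder names here are invented stand-ins, not Test::DatabaseRow’s actual code:

```perl
package Example::Lazy;

use strict;
use warnings;

sub new {
    my $class = shift;
    return bless { @_ }, $class;
}

# has db_results => ( is => 'ro', lazy => 1, builder => '_build_db_results' );
sub db_results {
    my $self = shift;

    # populate the slot the first time it's asked for, and never again
    $self->{db_results} = $self->_build_db_results()
        unless exists $self->{db_results};

    return $self->{db_results};
}

sub _build_db_results {
    my $self = shift;
    # a cheap stand-in for "go to the database and run the SQL"
    return [ map { { id => $_ } } 1 .. 3 ];
}

package main;

my $obj = Example::Lazy->new;
print scalar @{ $obj->db_results }, "\n";   # the builder runs here, on first access
```

Because the accessor is read-only and lazily built, a value passed to the constructor (new( db_results => [...] )) simply short-circuits the builder, in the same way that passing sql_and_bind explicitly bypasses its derivation from where and table.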

A results object

The second big architectural change I made was to stop talking directly to Test::Builder. Instead, I switched to returning a results object capable of rendering itself out to Test::Builder on demand.

This change made the internals a lot easier to deal with. I was able to break the test up into several functions, each returning a success or failure object. As soon as I detected a failure in any of these functions I could return it to Test::DatabaseRow, but if I got a success (which now hadn’t been rendered out to Test::Builder yet) I could throw it away and move on to the next potentially failing test while I still had other things to check.

This made my missing feature, the ability to report on all rows returned from the database rather than just the first, much easier to implement.
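
The shape of that change can be sketched with a tiny result class. This is an illustrative mock-up of the idea, not the module’s real internal code:

```perl
package Example::Result;

use strict;
use warnings;

# a success/failure value that can be rendered out later, on demand
sub new {
    my ( $class, %args ) = @_;
    return bless {
        is_success => $args{is_success},
        diag       => $args{diag} || [],
    }, $class;
}

sub is_success { return $_[0]{is_success} }
sub diag       { return @{ $_[0]{diag} } }

package main;

# run several checks; successes are simply discarded, and the first
# failure wins and would be handed to Test::Builder by the caller
my @results = (
    Example::Result->new( is_success => 1 ),
    Example::Result->new( is_success => 0, diag => ["row 2: age was 21, expected 22"] ),
    Example::Result->new( is_success => 1 ),
);

my ($failure) = grep { !$_->is_success } @results;
print $failure ? "not ok\n" : "ok\n";
```

Because nothing is reported until the caller decides, intermediate successes can be thrown away and checking can continue across every row returned from the database.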

Problems, problems, problems

After all this work, and after spending hours improving the test coverage of the module, I still botched the release of 2.00. The old module tested the interface with DBI by checking against a database that was on my old laptop in 2003. Since I no longer had that laptop those tests weren’t being run (I actually deleted them, since they were useless,) and hence I didn’t notice when I broke the interface to DBI in my refactoring.

Ilmari pointed out I was being stupid a few minutes after I’d uploaded. Within ten minutes I’d written some DBI tests that run against SQLite (if you have DBD::SQLite installed) and released 2.01.

The biggest surprise came the next day, when our overnight Hudson-powered smokes failed at work, and the only thing that had changed was Test::DatabaseRow (we re-build our dependencies from the latest on CPAN every night, and it’d automatically picked up the changes.) Swapping in the old version passed. Putting the new version in failed. Had I missed something else in my testing?

No, I hadn’t.

After several hours of head scratching I eventually worked out that there was an extra bug in Test::DatabaseRow 1.04 that I’d not even realised was there, and that I’d fixed it in 2.01. The tests were failing in our smokes not because I’d broken the testing infrastructure, but because I’d fixed it: I was now detecting an actual problem in my Hudson test suite that had previously gone unexposed.

What’s next?

Oh, I’m already planning Test::DatabaseRow 2.02. On GitHub I’ve already closed a feature request that chromatic made in 2005. Want another feature? File a bug in RT. I’ll get to it sooner or later…


Dear Member of Parliament

As a Perl programmer, both my livelihood and a large chunk of my social life rely entirely on the internet. How would you react if the head of your government made public statements about restricting internet access for people that they (and their agencies) "know" are doing wrong things...
...we are working with the Police, the intelligence services and industry to look at whether it would be right to stop people communicating via these websites and services when we know they are plotting violence, disorder and criminality.
David Cameron, UK Prime Minister
In response, I wrote to my MP. I encourage those of you from the UK to do the same.
Dear Duncan Hames,

I write to you today to express my concerns regarding statements made by the prime minister with respect to restricting access to "social media". It should be fairly obvious, when the Chinese regime is praising our censorship plans, that those plans are ill thought through and should be scrapped. However obvious that may be, I feel that I must still enumerate the ways in which this plan is wrong on many levels.

Firstly, your prime minister seems unable to distinguish between the medium and the message. As we move further into the digital age more and more communication will take new forms, and these new forms will replace more traditional forms of communication in society. To seek to control some forms of communication is the modern equivalent of the government seeking to control the ability of its citizens to write to newspapers or talk in the street.

Secondly, the idea of the government silencing its citizens from communicating with one another is chilling. While I can understand that some speech may be criminal by its content, woe befall any government that tries to pre-emptively stop such speech, as these very same controls can be used, and abused, to control its citizens.

Thirdly, the prime minister is seeking to put restrictions on people who have not been convicted of a crime (he said, I quote, "when we know they are plotting violence, disorder and criminality", but that is a matter for the courts, not "[the government,] the Police, the intelligence services and industry", to decide.) What safeguards are being proposed so that I, a law-abiding citizen, may not also be restricted from communicating?

Fourthly, and ironically, your prime minister is suggesting restricting the primary means of communication with wider society for those very individuals who he claims live outside of our society.

Finally, I do not understand your prime minister's desire to push for further attention-grabbing legislation when our police forces can already wield the RIP Act to gather evidence from these new forms of communication. While I may not agree with the RIP Act, let our police forces use those powers to full effect before granting them new ones.

As a member of your constituency I ask you to ensure that your prime minister is questioned in Parliament about such blatant flaws in his proposals.

Thanking you in advance for your help in this matter,

Yours sincerely,
Mark Fowler
Those wanting to do more could do a lot worse than set up a regular donation to the Open Rights Group.


London Calling

Now that I don't live in London anymore (I live in Chippenham, which is eighty-two miles away) I don't often get to go to the London Perl Monger socials, but last night, with the meeting happening right by Paddington Station, it was too good a chance to miss.

The hot topic of conversation was obviously the impending YAPC::Europe conference. I sadly won't be attending, since I just got back from my trip to YAPC::NA (which I owe a blog post on,) but I was able to give good advice on which talks to go and see, having already seen the US versions. There seemed to be a significant number of clashes in the schedule in Latvia, something I can sympathise with. For example, I was recommending Jesse's talk on 5.16 (which I really enjoyed at YAPC::NA,) but it was pointed out that he's up against Smylers, who I think is also an entertaining and informative speaker.

Jesse's talk at YAPC::NA on 5.16 generated quite a bit of conversation around the tables. Taking a straw poll of the people present, I think they liked the direction being proposed, and those that could would be attending the talk in Latvia to hear more in person. People in general liked the idea of making the language (optionally) more consistent and easier to parse without losing the ability to run older, sloppier code. Jesse might have been shocked that in Asheville people clapped rather than booed his suggestion that the indirect object syntax not be allowed under "use 5.16", but at work we enforce "no indirect;" on all our code anyway. The idea of laying the groundwork for possibly re-implementing perl 5 (not Perl the language, but "perl" the interpreter) by creating a cleaner syntax was one thing from Jesse's talk that people at the social thought was interesting. Sam Villian pointed out that Git seems to have been re-implemented multiple times, and that this has been a big advantage for it.
Nicholas Clarke arrived hot and in need of beer after running for the train, having been delayed writing grant proposals. This kicked off a discussion about the TPF core maintenance grant, which morphed into a discussion about the availability of talent to work on Perl 5 core issues (we had both Nicholas and Zefram sitting round the table, which is not a bad chunk of the talent pool in itself.) In short, my opinion is that the more work that's done on the Perl core, the more interest we'll attract, and that's a good thing.

Problems with hiring in general were discussed; I pointed out that at YAPC::NA lots of companies were hiring and offering telecommute positions so they could get the talent they needed. The outrageous costs charged by not-very-effective recruiters were mentioned, and the real need for high-quality, technically savvy recruiters (or at least recruiters with technical experts) was identified as a gap in the market.

For some reason at some point we got into a big discussion about Unicode. Ilmari showed us his awesome library card with "Mannsåker" written as "MannsAyker". "Mannsåker" had obviously gone through some terrible UTF-8-bytes-read-as-Latin-1 conversion, resulting in "MannsÃ¥ker", and then someone seems to have re-typed that into its ASCII equivalent. It's not like his donor card was much better either! This morphed into a discussion about failed attempts to get domain name registrars to adopt proper Unicode characters (and the various security issues related to that.) I wonder if the IT industry will still be dealing with this in twenty years' time? Probably.

As is fitting for any modern IT meetup these days, we talked a bit about the problems of scale.
This progressed into a discussion of the problems of disaster recovery preparation; it's very hard to test without impacting customers (it's easier if you've got completely redundant systems and you're willing to invest in DR with a zero-downtime switchover, but that's rare,) and it's actually quite hard to get a grip on what you have and haven't got covered (systems change rapidly, and delaying rollouts to make sure full DR cover is in place may result in a large lost-opportunity cost.)

Of course, London.pm still (in addition to all the Perl and computing talk) ricochets between geek trivia and the usual trappings of good friends. "Why don't we talk about Buffy any more?", "Well, what about ponies?", "Hey, all the cool kids on the internet like My Little Pony these days." "Speaking of kids, is your daughter crawling yet?" "She's sitting up and waving." "Oh, while I remember, here's the bib your youngest left at our house last week."

As always, I had fun, and I look forward to attending again another time soon.


Unexceptional Exceptions in Perl 5.14

There's a lot to love about Perl 5.14, but one of the best changes is a subtle one: Native exception handling is now reliable. Perl's native exception syntax uses an eval block as the traditional "try" block, and requires checking $@ to see if it contains an exception.
  eval {
    ...; # do something that might throw an exception
  };
  
  if ($@) {
    ...; # handle exception here
  }
Just like in other languages, code in the block is evaluated until an exception is thrown, at which point control jumps out of the block. Perl doesn't have a native catch syntax, however: it simply puts the exception into $@, where you can check it with a standard if statement.

Herein lies the problem in all versions of Perl prior to 5.14. Prior to 5.14, $@ is assigned and then the block is immediately exited; with 5.14, the block is immediately exited and then $@ is assigned. A subtle, but important, difference.

Perl's improvised catch mechanism relies on eval undefining $@ if no exception occurred (so the if block won't execute.) Each time Perl executes an eval it must therefore undefine $@. Prior to 5.14 this interacts badly with object destructors.
  package SomethingComplex;
  sub new { return bless {}, shift };
  sub DESTROY {
    eval {
      ...; # some cleanup code that might throw an exception
    };
    if ($@) {
      ...; # handle exception in cleanup code
    }
  }
  
  package main;
  
  eval {
    my $thingy = SomethingComplex->new();
    ...; # do something that might throw an exception
  };
  if ($@) {  
    ...; # will never be executed even on exception
  }
If an exception occurs in the eval block in main then execution will stop and control will immediately jump out of the block. $thingy will fall out of scope, and when this happens the object's DESTROY block will be executed. This in turn runs its own eval, which will unset $@ as it executes. Assuming another exception doesn't occur during cleanup, by the time we reach the if statement in main, $@ will have been undefined, even though an exception happened in the eval block immediately above. Disaster!

The simple, quite frankly terrible, workaround is to write this:
  eval {
    ...; # do something that might throw an exception
    1;
  } or do {
    ...; # handle the exception here
  }
We're no longer relying on $@ to tell us that an exception has occurred, but on the fact that an eval block will return false when an exception is thrown. Of course, we can't now reliably look at $@ to find out what kind of exception occurred. There are ways around this, but they're even more complex to code. A better fix on Perls prior to 5.14 is to use the Try::Tiny module from the CPAN, which handles all of this for us.
  use Try::Tiny;
  try {
    ...; # do something that might cause an exception.
  } catch {
    ...; # handle the exception stored in $_ here
  };
Of course, no matter how tiny Try::Tiny is, there's no getting away from the fact that it's not a module that's bundled with Perl; I can rely on getting it installed whenever I install software, but not on every machine I might happen to admin and want to make use of the system perl on. Luckily, Perl 5.14 solves this problem entirely for us by executing the block first (thereby executing all destructors that might mess with $@ first) and then, once all that's done, populating $@ with the exception. Thanks, Perl 5 Porters!
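
You can check this for yourself on a 5.14 or later perl. The following is the earlier SomethingComplex example condensed into a runnable script, with the placeholder cleanup code replaced by a harmless eval:

```perl
use strict;
use warnings;

package SomethingComplex;

sub new { return bless {}, shift }

sub DESTROY {
    # cleanup code running its own eval, just like the example above
    eval { 1 };
}

package main;

eval {
    my $thingy = SomethingComplex->new;
    die "the real exception\n";
};

# on 5.14+ the exception survives the destructor's eval;
# on earlier perls $@ would already have been clobbered to ""
print $@ eq "the real exception\n" ? "exception survived\n" : "exception lost\n";
```

On a 5.14 or later perl this prints "exception survived"; on anything older, "exception lost".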


A Rose By Any Other Name

In this blog post I'm going to talk about writing a Perl script to automatically change entries in my local /etc/hosts file, and I'll digress into brief discussions on Net::DNS, how to edit files in place using Tie::File, and the sneaky -s switch for dumb command line argument parsing.

The Problem I'm Trying to Solve

I'll just come out and say it: I don't like using the default DNS servers assigned to my laptop by DHCP. On my home network I get my ISP's buggy DNS server, which doesn't work as often as I'd like. On my work network I often get hostnames resolved to internal IP addresses for servers where (because of my particular job) I really want the public ones. To avoid the issue completely I hard-code my DNS to always point at Google's free DNS service on 8.8.8.8. There's just one problem with this:
bash$ ssh sandbox1.dev.example.priv
ssh: Could not resolve hostname sandbox1.dev.example.priv: nodename nor servname provided, or not known
Ooops! Entries for my development servers only exist on our local work DNS server, and if I'm not using it I can't find any of them!

Luckily my Mac (and other unix-like boxes) allows me to override DNS lookups using the /etc/hosts file (Windows has something similar too.) In its simplest form this file contains one override per line: an IP address followed by one or more hostnames it overrides. For example:

10.0.0.1 sandbox1.dev.example.priv
10.0.0.2 sandbox2.dev.example.priv
10.0.1.1 db.dev.example.priv

And so on. My kludgy solution is, for each development server that I want to use, to put a line in /etc/hosts so I don't have to remember its IP address (and, more importantly, so I can use the addresses in my browser and still have them map to the right virtual host on the webserver.) However, doing this by hand gets old real quick. Running dig against the company DNS server's IP address, then copying and pasting the resolved IP address into the hosts file with a text editor, takes the better part of a minute, is prone to mistakes, and completely interrupts my train of thought. What I want is a simple command to automate the whole process of adding or updating an entry like this:
bash$ hostify sandbox1.dev.example.priv
And maybe I could have it update all the entries that it knows about so they don't get out of date whenever I type:
bash$ hostify -r
Right! Time to write some Perl.

Using Net::DNS to do DNS lookups

You'd think that dealing with the complexities of DNS would be the hard bit, but looking up domain names with Perl is actually really trivial. We can almost copy the example out of the perldoc for Net::DNS:
my $res = Net::DNS::Resolver->new(
  nameservers => [qw(10.5.0.1)],
);

my $query = $res->search($hostname);

if ($query) {
  foreach my $rr ($query->answer) {
    next unless $rr->type eq "A";
    say "Found an A record: ".$rr->address;
  }
}
And that's about all there is to it. Now for the hard bit...

Using Tie::File to edit a file in place

We either need to add an entry to our existing /etc/hosts file or update one or more entries in the middle of the file. However, if we were to use the standard open function that Perl provides, we'd quickly run into a problem: the open (and sysopen) syntax is optimised either for appending data onto the end of the file or, in a pinch, for overwriting byte-for-byte in the middle of the file. What it won't do is automatically handle the case where we want to replace something in the middle of the file with more or fewer bytes. We'd end up having to manually read in and echo out the tail end of the file, which means writing a lot of complex "bookkeeping" code we'd rather not concern ourselves with.

One of the easiest ways in Perl to edit a file in place without worrying about these niggly details is to use a core module called Tie::File. This module uses a magic feature of Perl called tying, where some functionality is tied to a Perl data structure: any attempt to read from or modify the tied data structure causes Perl code to be executed to do something clever, instead of modifying a dumb chunk of memory. In the case of Tie::File, each element in the array that it ties maps to a line in the file on disk. Reading from the array reads lines from the file, and writing to the array writes out to disk.

So, for example, to tie our array to the hosts file, we just need to use the special tie syntax:
tie my @file, 'Tie::File', "/etc/hosts"
  or die "Can't open /etc/hosts: $!";
Now altering a line in the middle of our file is simple:
# alter the 21st line in the file
$file[20] = "10.0.69.1 filestore.example.priv";
Tie::File seamlessly handles all the complicated bits about moving the stuff after the line we've just altered. Perfect!
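
Here's a tiny self-contained demonstration of that, using a scratch temporary file rather than /etc/hosts so it's safe to run anywhere:

```perl
use strict;
use warnings;

use File::Temp qw(tempfile);
use Tie::File;

# build a three-line scratch file to stand in for /etc/hosts
my ( $fh, $filename ) = tempfile( UNLINK => 1 );
print {$fh} "one\ntwo\nthree\n";
close $fh or die "Can't close: $!";

tie my @file, 'Tie::File', $filename
  or die "Can't open $filename: $!";

# replace the middle line with something much longer; Tie::File
# shuffles everything after it along on disk for us
$file[1] = "a considerably longer replacement line";

untie @file;

open my $in, '<', $filename or die "Can't reopen $filename: $!";
print while <$in>;
```

The file still has exactly three lines afterwards, with the middle one grown to the new length, and we never had to touch the surrounding lines ourselves.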

Rudimentary argument parsing with -s

My script needs to accept just one simple command-line option, to tell it to also update all the hostnames it's previously inserted. Because I'm lazy, I didn't even use a module to do this, but rather used the simple -s command-line switch, which tells perl to shove anything it sees on the command line starting with a dash into a similarly named variable in the main namespace:
#!/usr/bin/env perl -s
if ($r) { print "Someone called us with -r\n" }
Of course, with strictures and warnings on I have to write something a little more complex:
#!/usr/bin/env perl -s
use 5.12.0;
use warnings;
if ($::r && $::r) { say "Someone called us with -r" }
I need to use $::r rather than $r because the former, being a fully qualified variable, doesn't need predeclaration when running under use strict (which is implicitly turned on when I use 5.12.0.) I also need to write $::r && $::r rather than just $::r because otherwise warnings would notice that the variable is only used once in the entire run of the code and emit a warning (this is one of the rare cases where that isn't a bug: the variable really does get its value without ever being set by Perl code.)

The Complete Script

And here's the complete finished script.
#!/usr/bin/env perl -s

use 5.12.0;
use warnings;

use Net::DNS;
use Tie::File;

# look at the file
tie my @file, 'Tie::File', "/etc/hosts"
  or die "Can't open /etc/hosts: $!";

# did someone want to update all the cached entries?
if ($::r && $::r) {
  my $found = 0;
  foreach (@file) {
    # skip down until the comment in my /etc/hosts that
    # states that "cached entries are below this point"
    next unless $found ||= m/cached entries/; 

    # then replace each host entry
    s{\A\d+\.\d+\.\d+\.\d+\s+(?<host>.*)\z}{
       dns_lookup($+{host}) . " $+{host}";
     }e;
  }
  exit unless @ARGV;
}

my $host = shift // die "No hostname supplied\n";
my $ip = dns_lookup( $host );

# look for an existing entry and replace it
foreach (@file) {
  exit if s/\A\d+\.\d+\.\d+\.\d+\s+\Q$host\E\z/$ip $host/;
}

# not found?  Add it to the end
push @file, "$ip $host";

########################################################################

sub dns_lookup {
  my $hostname = shift;

  my $res = Net::DNS::Resolver->new(
    nameservers => [qw(10.5.0.1)],
  );

  my $query = $res->search($hostname);
  
  if ($query) {
    foreach my $rr ($query->answer) {
      next unless $rr->type eq "A";
      return $rr->address;
    }
    die "No A record for $hostname";
  }

  die "query for $hostname failed: ", $res->errorstring;
}


Support your Local Library

In this blog post I talk about the first step of modularising code from simple scripts. I'm going to cover extracting routines from the scripts into a shared module in the local file system and using FindBin to locate and load that module from within the scripts.

Iterative Development

We've all written small one-off scripts that have grown over time to become more than what they were originally intended to be, with new features and functionality grafted on as time goes by. The code gets more and more complex, and it's hard to maintain so much code in a simple script. Often around the same time we realise that we need some of the functionality of this script in another script. We could cut and paste, but that ends up as a maintenance nightmare, with the same code repeated in any number of scripts.

The obvious solution to both issues is to modularise: to move this code into a separate module and include that module at the start of our various scripts. Now, lots of Perl programmers will recommend converting your code straight into a distribution (i.e. packaging your code up with a Makefile.PL, tests, etc.) However, this is a big step and involves a lot of work, both upfront and whenever you need to change the code (every time you make a change to a distribution you have to reinstall it.) There's an intermediate step we can take first: we can move the code into a local module in the same directory. This is a lot easier, and any changes we make to the code are 'live'. It's a lot closer to the development process we have right now, just with more than one file.

A worked example

So, let's start with a simple script:
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect(
  "DBI:mysql:database=live;host=livedb.twoshortplanks.com",
  "admin",
  "opensaysme",
  { RaiseError => 1 }
);

...
We'd like to avoid encoding our database username and password at the start of each admin script we write, so we'd like to turn this into code loaded from a module. Let's start by turning the code we want to extract into a function:
#!/usr/bin/perl

use strict;
use warnings;
use DBI;

sub connect_to_live_db {
  return DBI->connect(
    "DBI:mysql:database=live;host=livedb.twoshortplanks.com",
    "admin",
    "opensaysme",
    { RaiseError => 1 }
  );
}

my $dbh = connect_to_live_db();

...
Now, let's move this code into a module called TwoShortPlanksUtil, which we store in a file "TwoShortPlanksUtil.pm" in the same directory as our admin scripts. We make the code available to any script using our module by using Exporter to export the function back into scripts that ask for it in the usual fashion.
package TwoShortPlanksUtil;

use strict;
use warnings;

use DBI;

use base qw(Exporter);
our @EXPORT_OK;

sub connect_to_live_db {
  return DBI->connect(
    "DBI:mysql:database=live;host=livedb.twoshortplanks.com",
    "admin",
    "opensaysme",
    { RaiseError => 1 }
  );
}
push @EXPORT_OK,"connect_to_live_db";

1;
Now let's use it in our script, just as we would if we'd created a full distribution and installed it.
#!/usr/bin/perl

use strict;
use warnings;

use TwoShortPlanksUtil qw(connect_to_live_db);
my $dbh = connect_to_live_db();
Hooray! When we test the script, everything works... as long as we run it from the correct directory, that is.

In order for this to work, the directory TwoShortPlanksUtil.pm is in must be in @INC, the list of places Perl will look for modules to load. This normally contains the current working directory, so if you execute your script from the command line from the directory that contains it, it works. However, if your script lives in your ~/bin directory (or, for that matter, anywhere else in your $PATH) and you expect to be able to execute it from an arbitrary directory, this won't work at all. What we need to do is modify our script's @INC to always contain the directory the script is located in. The magic incantation to insert into our script is:
use FindBin;
use lib $FindBin::Bin;
When you load the FindBin module it examines the $0 variable (which contains the path of the currently executing script) and the current working directory, works out the path to the directory containing the script, and stores it in the $FindBin::Bin variable, which it exports. By passing this to the lib pragma we include that directory in @INC. The boilerplate at the start of our code now looks like:
#!/usr/bin/perl

use strict;
use warnings;

use FindBin;
use lib $FindBin::Bin;

use TwoShortPlanksUtil qw(connect_to_live_db);
my $dbh = connect_to_live_db();
And this now works no matter where we execute our script from!


Indirect Method Calls Must die();

In this blog post I'll talk about the problems that I have with Perl's indirect method call syntax, both stylistically and pragmatically, and what you can do to prohibit its use.

Pop Quiz, Hotshot

What does the following Perl code do?
save $myobject;
Is it:
  • A function call to the save function, passing $myobject as the argument, i.e. the same as
    save($myobject)
  • An indirect method call, calling the save method on $myobject, i.e. the same as
    $myobject->save()
The answer is complicated. It depends entirely on whether a function called save is in scope (either declared above or imported from another module with a use statement) when Perl compiles the code, in which case it'll be the former rather than the latter. Does anyone else find this confusing? I know I sure do.
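
To see the ambiguity in action, here's a small self-contained demonstration (the class and method here are invented for the example). With no save function in scope, perl compiles the line as an indirect method call:

```perl
package SomeClass;

use strict;
use warnings;

sub new  { return bless {}, shift }
sub save { return "method was called" }

package main;

my $myobject = SomeClass->new;

# there is no save() *function* in scope in main, so perl quietly
# compiles the next line as the indirect method call $myobject->save()
my $result = save $myobject;

print "$result\n";
```

Had a save() function been declared or imported before that line, the very same text would instead have compiled as the function call save($myobject).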

Just say no

At work we have a simple rule: you're not allowed to write code that uses indirect method calls. If you want to make a method call, our house coding style demands you write it explicitly using the normal direct method call syntax. That way, if you see the above code, you know it's a function call.

The problem is that it's still entirely possible to use an indirect method call completely accidentally when you intended a function call. Imagine that you've written the code but forgotten the use ObjectSaver qw(save); at the top of your module that imports the save() function. Perl will blindly go ahead and compile your code as an indirect method call on $myobject! The real issue is that this won't result in an error straight away; the problem will only come to light when the code is actually executed and the non-existent save() method is called (or worse, $myobject really might have a save() method that you didn't mean to call.) If the code you're writing is one of those hard-to-reach and therefore hard-to-test codepaths (e.g. obscure error handling) there's a chance you could ship broken code to live without noticing it.

A Solution: The no indirect pragma

The solution is to use the indirect pragma from the CPAN. This pragma gets perl to throw an exception as soon as it encounters anything it would normally compile into an indirect method call. Getting it to do its thing is simple:
no indirect ':fatal';
And that's it. Now:
bash$ perl
use strict;
use warnings;
no indirect ':fatal';

my $myobject = bless {}, "SomeClass";
save $myobject;
Indirect call of method "save" on object "$myobject" at - line 6.
bash$
Hooray!


The PSGI/Mac OS X Dance

In this blog post I'll show you how to get a PSGI web application to start up and listen on a port when you boot your Mac OS X machine. To do this, I'll be quickly covering the plackup command line utility and then delving into the basics of OS X's launchd plist configuration files.

Our example PSGI application

The first step is to create a PSGI compatible application. For the purpose of this blog post, let's just use the example application from the Dancer framework's documentation:
#!/usr/bin/perl

use strict;
use warnings;

use Dancer;

get '/hello/:name' => sub {
   return "Why, hello there " . params->{name};
};

dance;
We should probably check from the command line that this works as we expect before we go any further.
bash$ perl hellow.pl 
>> Dancer server 5758 listening on http://0.0.0.0:3000
== Entering the development dance floor ...
And then in another terminal:
bash$ lwp-request http://127.0.0.1:3000/hello/world
Why, hello there world
Of course, we could just as easily have used a Mojolicious or Catalyst application here! But that's not the point... in just a few lines of code we've got a PSGI compatible web application written and ready to host.

Running this with a PSGI webserver and plackup

The PSGI standard is essentially a compatibility layer between Perl web frameworks and Perl webservers: without changing a line of code you can switch from one webserver to another, and likewise webservers can be written to support any web framework without needing further code for each one. In this example I'm going to use a server called Twiggy as my PSGI compliant webserver, a module that can be installed from the CPAN in the normal manner. I've chosen it because it's fast and has a low memory footprint (the latter being quite important if I don't want to use up too much of my desktop's RAM.) The only drawback with Twiggy is that my application can't use too much CPU or block on IO in a non-AnyEvent-compatible way without holding up the next request. This doesn't matter to me because I'm the only one who's going to be using my machine! Of course, it's a simple configuration change to switch to another PSGI compatible server like Starman, which handles preforking for us. To start our Dancer application with Twiggy we just need to use the plackup command:
bash$ plackup -s Twiggy -p 4077 -a hellow.pl 
Twiggy: Accepting connections at http://0.0.0.0:4077/
And then again, in another terminal:
bash$ lwp-request http://127.0.0.1:4077/hello/world
Why, hello there world
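As an aside, the interface PSGI specifies really is tiny: an application is nothing more than a code reference that takes an environment hashref and returns a three-element arrayref of status, headers, and body lines. Here's a minimal hand-rolled sketch of our hello application with no framework at all (the route matching is deliberately simplistic):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A PSGI application is just a coderef.  In a real .psgi file,
# $app would be the file's last expression, ready for plackup.
my $app = sub {
    my $env = shift;
    my ($name) = $env->{PATH_INFO} =~ m{^/hello/(\w+)};
    $name = "stranger" unless defined $name;
    return [
        200,
        [ 'Content-Type' => 'text/plain' ],
        [ "Why, hello there $name" ],
    ];
};

# Calling it by hand shows there's no magic involved:
my $response = $app->({ PATH_INFO => '/hello/world' });
print $response->[2][0], "\n";    # prints "Why, hello there world"
```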

Configuring plackup to run on boot on a Mac

Mac OS X uses a process called launchd to manage services, replacing the more traditional init.d system you'd find on a typical Linux box. To define a new service we need to create a plist file (a correctly formatted XML file.) The standard place for plists for daemons launched on boot is /System/Library/LaunchDaemons; Mac OS X will load all the plist files in this directory when the machine starts up. Each of these files needs a unique name, one that's guaranteed not to clash with any other service that Apple or third parties might create in the future. To ensure this we use the same "reverse-domain-name" scheme that Java uses for its class names: you start the file name with your domain name in reverse. Today I'm going to create a file called com.twoshortplanks.hellow.plist, which I know is unique because I control the twoshortplanks.com domain name:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>Disabled</key>
        <false/>
        <key>Label</key>
        <string>com.twoshortplanks.hellow</string>
        <key>ProgramArguments</key>
        <array>
                <string>/usr/local/bin/plackup</string>
                <string>-s</string>
                <string>Twiggy</string>
                <string>-p</string>
                <string>4077</string>
                <string>-a</string>
                <string>/usr/local/hellow/hellow.pl</string>
        </array>
        <key>OnDemand</key>
        <false/>
</dict>
</plist>
So, this is fairly straightforward, with the plist containing alternating keys and data structures. Since we want the server to start up straight away on boot, both the Disabled and OnDemand keys need to be set to false. Label needs to be set to the same name we used in the filename. Finally, the slightly confusingly named ProgramArguments needs to contain both the name of the executable and its arguments, exactly as we would have passed them to the shell, but with each space-separated part in its own <string> tag. You'll note that we've used absolute paths here because, obviously, when this is run by launchd it won't have either our current PATH or current working directory. (It's also worth noting at this point, just in case you're using this example to write something to run a daemon other than plackup, that the command should run in the foreground and not fork itself off into a daemon. We're not passing plackup the options to do that, so that's all good.) The first thing we should probably do after writing the plist is check that we got the syntax right and there are no typos (especially as launchd gives the world's most unhelpful error messages.) The system-supplied plutil utility comes with a lint mode that can help here:
bash$ plutil -lint com.twoshortplanks.hellow.plist
com.twoshortplanks.hellow.plist: OK
Once we've done that we can force Mac OS X to load the daemon settings right away (without having to reboot the computer):
bash$ sudo launchctl load /System/Library/LaunchDaemons/com.twoshortplanks.hellow.plist
And now we can check it's loaded:
bash$ sudo launchctl list | grep hellow
2074    -    com.twoshortplanks.hellow
And we can use it as a webserver!
bash$ lwp-request http://127.0.0.1:4077/hello/world
Why, hello there world
Great! It's running! Now what? Well, assuming we're not going to be using plackup's --reload option (which is a little too complicated to go into now) we need to know how to restart the server whenever we make changes. The simplest thing is to unload it and load it again:
bash$ sudo launchctl unload /System/Library/LaunchDaemons/com.twoshortplanks.hellow.plist
bash$ sudo launchctl load /System/Library/LaunchDaemons/com.twoshortplanks.hellow.plist

Conclusion

With PSGI it's possible to have a low-impact custom webserver running on your local Mac without much work at all.


Having a Brew

Back in August in 2009 I wrote a post on how to install a release candidate of Perl. Installing a custom perl in your home directory has since got a lot easier with perlbrew, and in this blog post I'll show the five commands needed to get a new perl onto your system.

A new perl in five commands

First I need to set up perlbrew. This takes three commands (the latter two may differ on your system, but don't worry: in each case the previous command tells you exactly what to run.)

   curl -L http://xrl.us/perlbrewinstall | bash
   /Users/mark/perl5/perlbrew/bin/perlbrew init
   echo >>~/.bashrc "source /Users/mark/perl5/perlbrew/etc/bashrc"

Now to install and switch to the latest development version of perl I only need type:

   perlbrew install perl-5.13.8
   perlbrew switch perl-5.13.8

And to install and switch to the latest stable perl I only need to type:

   perlbrew install perl-5.12.2
   perlbrew switch perl-5.12.2

Again, in excruciating detail

If you're anything like me, you've seen this kind of quickfire instruction on the web before and been left with questions. Wouldn't it be nice if someone posted a complete dump of exactly what goes on when these commands are run? Yes, it would...

Let's start from the top again. First, download and install perlbrew with the one liner:

  travis-4:~ mark$ curl -L http://xrl.us/perlbrewinstall | bash

Off it goes:

    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                   Dload  Upload   Total   Spent    Left  Speed
  102   306  102   306    0     0    236      0  0:00:01  0:00:01 --:--:--  2756

  ## Download the latest perlbrew

  ## Installing
  The perlbrew is installed as:

      /Users/mark/perl5/perlbrew/bin/perlbrew

  You may trash the downloaded /private/tmp/perlbrew from now on.

  Next, if this is the first time you install perlbrew, run:

      /Users/mark/perl5/perlbrew/bin/perlbrew init

  And follow the instruction on screen.

  ## Done. (automatically removes downloaded /tmp/perlbrew)

We'd better follow its instructions, eh?

  travis-4:~ mark$ /Users/mark/perl5/perlbrew/bin/perlbrew init

This prints out some more stuff, including an instruction to modify our path:

  Perlbrew environment initiated, required directories are created under

      /Users/mark/perl5/perlbrew

  Well-done! Congratulations! Please add the following line to the end
  of your ~/.bashrc

      source /Users/mark/perl5/perlbrew/etc/bashrc

  After that, exit this shell, start a new one, and install some fresh
  perls:

      perlbrew install perl-5.12.1
      perlbrew install perl-5.10.1

  For further instructions, simply run:

      perlbrew

  The default help messages will popup and tell you what to do!

  Enjoy perlbrew at $HOME!!

Okay, let's follow the instructions and add that line to our ~/.bashrc:

  travis-4:~ mark$ echo >>~/.bashrc "source /Users/mark/perl5/perlbrew/etc/bashrc"

Now we need to restart bash. The easiest way to do that is to close the current terminal and open a new one (which also stops us getting confused.) After this, installing a new Perl is a doddle, requiring just a single command.

  travis-4:~ mark$ perlbrew install perl-5.13.8

And it does all the hard work for us:

  Attempting to load conf from /Users/mark/perl5/perlbrew/Conf.pm
  Fetching perl-5.13.8 as /Users/mark/perl5/perlbrew/dists/perl-5.13.8.tar.gz
  Installing perl-5.13.8 into /Users/mark/perl5/perlbrew/perls/perl-5.13.8
  This could take a while. You can run the following command on another shell to track the status:

    tail -f /Users/mark/perl5/perlbrew/build.log

  (cd /Users/mark/perl5/perlbrew/build; tar xzf /Users/mark/perl5/perlbrew/dists/perl-5.13.8.tar.gz;cd /Users/mark/perl5/perlbrew/build/perl-5.13.8;rm -f config.sh Policy.sh;sh Configure -de '-Dprefix=/Users/mark/perl5/perlbrew/perls/perl-5.13.8' '-Dusedevel';make;make test && make install) >> '/Users/mark/perl5/perlbrew/build.log' 2>&1 
  Installed perl-5.13.8 as perl-5.13.8 successfully. Run the following command to switch to it.

    perlbrew switch perl-5.13.8

So, as it says, we can switch which perl we're using just by using the "perlbrew switch" command:

  travis-4:~ mark$ perl -v

  This is perl, v5.10.0 built for darwin-thread-multi-2level
  (with 2 registered patches, see perl -V for more detail)

  Copyright 1987-2007, Larry Wall

  Perl may be copied only under the terms of either the Artistic License or the
  GNU General Public License, which may be found in the Perl 5 source kit.

  Complete documentation for Perl, including FAQ lists, should be found on
  this system using "man perl" or "perldoc perl".  If you have access to the
  Internet, point your browser at http://www.perl.org/, the Perl Home Page.

  travis-4:~ mark$ perlbrew switch perl-5.13.8
  travis-4:~ mark$ perl -v

  This is perl 5, version 13, subversion 8 (v5.13.8) built for darwin-2level

  Copyright 1987-2010, Larry Wall

  Perl may be copied only under the terms of either the Artistic License or the
  GNU General Public License, which may be found in the Perl 5 source kit.

  Complete documentation for Perl, including FAQ lists, should be found on
  this system using "man perl" or "perldoc perl".  If you have access to the
  Internet, point your browser at http://www.perl.org/, the Perl Home Page.

We can switch back to the system perl with the "perlbrew off" command:

  travis-4:~ mark$ perlbrew off
  travis-4:~ mark$ perl -v

  This is perl, v5.10.0 built for darwin-thread-multi-2level
  (with 2 registered patches, see perl -V for more detail)

  Copyright 1987-2007, Larry Wall

  Perl may be copied only under the terms of either the Artistic License or the
  GNU General Public License, which may be found in the Perl 5 source kit.

  Complete documentation for Perl, including FAQ lists, should be found on
  this system using "man perl" or "perldoc perl".  If you have access to the
  Internet, point your browser at http://www.perl.org/, the Perl Home Page.

And that's about it. Very simple!


A Method of Coming Clean

Since Perl has no inbuilt way of declaring whether a subroutine is a function or a method, it's perfectly possible to mistakenly override a superclass method with an errant function import, or to accidentally call a function as a method from outside the class it was declared in, violating encapsulation. This is particularly problematic with generically named utility functions (or "language extensions") that are imported into your module's namespace from other modules. In this blog entry I'm going to talk about what can be done to prevent this, and introduce the namespace::clean module, which can automate the process.

An Example of the Problem

By way of example, let's write a simple class that looks up details about a user from their Facebook profile:
  package FacebookUser;
  use Moose;

  use LWP::Simple qw(get);
  use JSON::XS qw(decode_json);
  
  has api_access_token => ( isa => 'Str', is => 'ro', required => 1 );
  has username         => ( isa => 'Str', is => 'ro', required => 1 );
  has data             => ( isa => 'HashRef', is => 'ro', lazy_build => 1 );
  
  sub _build_data {
    my $self = shift;
    my $url = "https://graph.facebook.com/@{[ $self->username ]}?access_token=@{[ $self->api_access_token ]}";
    return decode_json(get($url));
  }
  
  sub get_firstname { return $_[0]->data->{first_name} }
  sub get_lastname  { return $_[0]->data->{last_name} }
  sub get_gender    { return $_[0]->data->{gender} }
  sub get_hometown  { return $_[0]->data->{hometown}{name} }
This is designed to be called like so:
  my $user = FacebookUser->new(
    api_access_token => $access_token,
    username => "2shortplanks",
  );
  say "Hello ".$user->get_firstname;
But what happens if someone calls it like so:
  my $user = FacebookUser->new(
    access_token => $access_token,
    username => "2shortplanks",
  );
  say "Hello ".$user->get('firstname');
Instead of a complaint about calling a non-existent get method, we get this very cryptic error message:
  Can't use a FacebookUser object as a URI at /Library/Perl/5.10.0/HTTP/Request/Common.pm line 103
Why's this? Because Perl tried to call the get function that we imported into our module from LWP::Simple as a method call, passing the invocant as the first argument instead of a URL.
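The mechanics are easy to see in isolation: method resolution simply looks for a subroutine of that name in the object's package (and its superclasses), and whatever subroutine it finds receives the invocant as its first argument. A made-up Greeter package demonstrates:

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Greeter;

sub new { return bless {}, shift }

# Imagine this had been imported from another module, just as
# get() was imported from LWP::Simple in the example above
sub get { return "called with: $_[0]" }

package main;

my $obj = Greeter->new;

# Perl finds the subroutine named "get" in the Greeter package and
# happily calls it as a method: the object itself becomes $_[0],
# not the URL the function author expected
my $result = $obj->get;
print "$result\n";    # e.g. "called with: Greeter=HASH(0x...)"
```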

The obscured parent method problem

Bad error messages when we write stupid code are one thing. Leftover functions totally breaking our code in ways we don't expect are another. Let's try writing the same module using the Class::Accessor object system instead of Moose.
  package FacebookUser;
  use base qw(Class::Accessor);

  # use Class::Accessor to create the api_access_token and username accessor methods
  __PACKAGE__->mk_accessors(qw(api_access_token username));

  use LWP::Simple qw(get);
  use JSON::XS qw(decode_json);

  sub data {
    my $self = shift;
    return $self->{data} ||= $self->_build_data;
  }

  sub _build_data {
    my $self = shift;
    my $url = "https://graph.facebook.com/@{[ $self->username ]}?access_token=@{[ $self->api_access_token ]}";
    return decode_json(get($url));
  }

  sub get_firstname { return $_[0]->data->{first_name} }
  sub get_lastname  { return $_[0]->data->{last_name} }
  sub get_gender    { return $_[0]->data->{gender} }
  sub get_hometown  { return $_[0]->data->{hometown}{name} }
This is designed to be called like so:
  my $user = FacebookUser->new({
    access_token => $access_token,
    username => "2shortplanks",
  });
  say "Hello ".$user->get_firstname;
To our surprise, this doesn't work at all:
  Can't use a FacebookUser object as a URI at /Library/Perl/5.10.0/HTTP/Request/Common.pm line 103
That's the same error we got when we called $user->get('firstname') in our Moose example by mistake, but this time we're sure we're calling $user->get_firstname, so what gives? How is the get subroutine being called accidentally? Let's run our example again with Carp::Always so we get a full stack trace out:
  bash$ perl -MCarp::Always example.pl 
  Can't use a FacebookUser object as a URI at /Library/Perl/5.10.0/HTTP/Request.pm line 70
  	HTTP::Request::uri('HTTP::Request=HASH(0x10091c5f8)', 'FacebookUser=HASH(0x10091c1f0)') called at /Library/Perl/5.10.0/HTTP/Request.pm line 16
  	HTTP::Request::new('HTTP::Request', 'GET', 'FacebookUser=HASH(0x10091c1f0)') called at /Library/Perl/5.10.0/HTTP/Request/Common.pm line 103
  	HTTP::Request::Common::_simple_req() called at /Library/Perl/5.10.0/HTTP/Request/Common.pm line 20
  	HTTP::Request::Common::GET('FacebookUser=HASH(0x10091c1f0)') called at /Library/Perl/5.10.0/LWP/UserAgent.pm line 386
  	LWP::UserAgent::get('LWP::UserAgent=HASH(0x10083c840)', 'FacebookUser=HASH(0x10091c1f0)') called at /Library/Perl/5.10.0/LWP/Simple.pm line 36
  	LWP::Simple::get('FacebookUser=HASH(0x10091c1f0)', 'username') called at /System/Library/Perl/Extras/5.10.0/Class/Accessor.pm line 393
  	Class::Accessor::__ANON__('FacebookUser=HASH(0x10091c1f0)') called at FacebookUser.pm line 18
  	FacebookUser::_build_data('FacebookUser=HASH(0x10091c1f0)') called at FacebookUser.pm line 13
  	FacebookUser::data('FacebookUser=HASH(0x10091c1f0)') called at FacebookUser.pm line 22
  	FacebookUser::get_firstname('FacebookUser=HASH(0x10091c1f0)') called at example.pl line 15
What's happened is that the username method that Class::Accessor created in our package is trying to call the get method in FacebookUser's Class::Accessor superclass, but because we've imported a get function into FacebookUser, that function is being accidentally called as a method instead. Oops!

Solving this: Deleting the subroutine when done

This problem can be solved in either code example by adding the following line to FacebookUser.pm to delete the get subroutine:
  delete $FacebookUser::{get};
Now trying to call the get function as a method won't work, and we'll get a proper error message:
  say "Hello ".$user->get('firstname');
  Can't locate object method "get" via package "FacebookUser" at example.pl line 15
And with these modifications in both cases our get_firstname method call works as expected:
  say "Hello ".$user->get_firstname;
  Hello Mark
Those of you paying attention might be wondering how this can possibly still (if you'll pardon the pun) function. After all, the _build_data method uses the get function to fetch the data from the Facebook servers - and we just deleted the get function! The answer to this conundrum lies in the fact that Perl is not an interpreted language but a compiled one (one that is recompiled each time the program is started and has no executable or bytecode saved to disk.) As such, Perl has multiple phases of execution, with "compile time" (when the code is turned into bytecode) and "run time" (when the code is actually executed) happening at different points. It's at compile time, when the text is turned into bytecode, that perl works out which function is which, and at run time (after the symbol for the function has been deleted) that method calls are resolved and fail to find the errant get. This is the same reason the following bit of code works even though the panic() statement comes before the declaration of the function:
  panic();
  sub panic() { print "Don't panic Mr Mannering!\n"; return }
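We can check both halves of that claim in one self-contained sketch (Demo, helper, and answer are made-up names): deleting the symbol leaves already-compiled calls working, while method lookups, which happen at run time, now fail:

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Demo;

sub new    { return bless {}, shift }
sub helper { return 42 }

# This call to helper() is resolved when the package is compiled
sub answer { my $self = shift; return helper() }

# By the time this runs, answer() has already been compiled with a
# reference to helper(), so deleting the symbol table entry
# doesn't break it...
delete $Demo::{helper};

package main;

my $obj = Demo->new;
print $obj->answer, "\n";    # prints 42

# ...but method lookup happens at run time, and now fails
my $survived = eval { $obj->helper; 1 };
print "helper() is no longer callable as a method\n" unless $survived;
```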

Doing this automatically

I'll be the first to admit that this syntax:
  delete $FacebookUser::{get};
Totally sucks. It's ugly and completely unintelligible unless you happen to have learnt beforehand what it does. Worse still, you need to remember to do it for each and every function you declare or import in your module, and you might not even know all the functions that are imported if you rely on a default import list. In other words, it involves a lot of bookkeeping, and hence chances for mistakes. Why don't we get the computer to automate this for us? This is where the namespace::clean module from CPAN comes in. Its use is simple:
  use namespace::clean;
This deletes all functions from your namespace that were declared or imported above the line where namespace::clean was use-ed. So, in our example, we might do it after the lines where we've imported everything but before we start declaring methods and accessors:
  package FacebookUser;
  use base qw(Class::Accessor);

  use LWP::Simple qw(get);
  use JSON::XS qw(decode_json);

  use namespace::clean;

  # use Class::Accessor to create the api_access_token and username
  # accessor methods
  __PACKAGE__->mk_accessors(qw(api_access_token username));

  ...
Or with Moose:
  package FacebookUser;
  use Moose;

  use LWP::Simple qw(get);
  use JSON::XS qw(decode_json);
  
  use namespace::clean -except => 'meta';
  
  ...
In the latter example we've told namespace::clean to keep the meta subroutine that use Moose imported for us - it's an important part of our object's framework.

Conclusion

The confusion between functions and methods in Perl is one that can cause a fair number of headaches for those not paying very close attention, and often requires us to have too much knowledge of the inner workings of our parent classes. We can handle this problem manually, or the namespace::clean module on CPAN can help alleviate this pain.

