Let's have our cake and eat it too

One of the most difficult tradeoffs in language design is brevity versus explicitness. Having long names for methods, functions and variables, and using verbose patterns, makes your code much clearer and less ambiguous. To my mind there's nothing ambiguous about:
System.out.println("Hello World");
It's obviously printing a line to standard out and, at least as long as you associate "System" with "standard" in your head, there are no surprises lurking here. This kind of explicitness is great. It means anyone can read the code and bit by bit pick it apart to work out exactly what's happening (Oh...so there's a 'System' class....and look, an 'out' property....and a 'println' method...)
Of course, the problem with this level of explicitness is that the verbosity it requires takes too long to write all the time unless you've got a super-charged macro-based IDE. And even then it faces a worse drawback: as the patterns get larger and the verbosity multiplies it gets harder to comprehend the overall picture. Your eyes tend to glaze over after a day's protracted coding and the details start to become obscured. In microcosm, for example, the "System.out" is the least interesting bit of the above code, but it's the part of the statement my eyes are drawn to first. Worse still, programmers tend to write the same number of lines of code no matter how verbose the language they use is. Those languages with more brevity therefore tend, on the whole, to get more done in those lines! Compare and contrast the above Java statement to the following Perl statement:
say "Hello World"
Much shorter, and much easier to read; the eyes are drawn to the "Hello World", which is indeed the interesting part of the statement. With much shorter statements and less wrapper code, the Perl users should be producing many more lines of code a day and beating the pants off the Java programmers. Well, this is sometimes true. And, to be fair, often not. What was the tradeoff with brevity again? Oh, yes...more ambiguity. When does this strike? In maintenance and what I like to call "pre-maintenance", the time when you're developing the code yourself and it reaches the point where it's too big to fit in your head. Consider the two examples above. While the Java version is clearly printing to standard out, "say" is printing to the 'current file handle', which almost always is standard out. Of course, Perl people might consider this potential action at a distance to be a worthwhile abstraction layer. Which really emphasises an important difference between the two languages.
Perl and Java are essentially operating at about the same level of abstraction. They're both a level above C, running on a virtual machine layer that has nice safety nets built in, meaning things like memory allocation, array boundary checking, sorting algorithms, etc. are all taken care of by the core language and API. There's really little to separate the two languages, and they share more in common with each other than, say, C and Prolog do, so it's more interesting to look at the small details that make the two languages different to one another. The Perl and Java programmers have a lot to learn from one another.
This difference in syntax philosophy is really interesting to me. Perl's basic syntax typically allows you to express more in a shorter space by exploiting what is known as context. There are a lot of implicit things. There's list or scalar context, for example, or the current file handle, or there's the topic variables ($_, @_ et al) that are often used as default arguments for calls. This either allows you to hold more in your head (because you can ignore the need to restate the context all the time) or a lot less (because the code isn't clear and you have to worry about what's in the hidden context all the time.) So in theory Perl allows you to express more in a line, but you can also get yourself in a mess a heck of a lot quicker. It can be really easy for beginners to pick up Perl compared to Java because they aren't forced to deal with all the implicit stuff directly, but at the same time they're not aware of all the implicit things going on, so it can be harder to deal with too; a double edged sword. No wonder sometimes Perl is better than Java, and sometimes Java is better than Perl.
The obvious thing that Perl and Java can do, in the grand tradition of language design, is learn from each other's mistakes and steal the good bits from each other without (hopefully) picking up the bad bits. Java stealing regular expressions and Perl stealing layered IO are good examples of worthwhile theft. So what would I change about Perl to take advantage of what Java teaches us? Probably more than some people would like, and a lot less than others. By way of example of what I would change, here's a particularly confusing chunk of code I wrote earlier in the week:
sub log {
  no warnings;
  warn @_;
}
no warnings immediately followed by warn? Gah! Of course, what I'm actually doing is suppressing all warnings that perl will generate (undefined values, printing of wide characters, etc) while it prints out my warning message. Very brief. Different semantic domains entirely. So what do I think we should do to fix this? Nothing. This is just the pain of having a brief, ambiguous language: sometimes you're just going to end up with what I dub "talking at semantic cross purposes" in your code. I could suggest that we force people to be more explicit so it's clear what warnings I'm talking about in each case, but then I'd be changing the feel of the language. I'm in no rush to recreate Java; Java's a fine language and I know where it is when I want to use it.
So what would I change? Ambiguity where there's genuine confusion due to overloading the meaning of things. If you're paying a lot of attention to my example you'll notice that I'm not declaring something that's going to be used as a subroutine there, despite the sub keyword (because, obviously, without syntactic gymnastics you can't call a subroutine called log without calling the log built-in function instead.) In fact, the occasions where you want a method to also be callable as a subroutine (or vice versa) are very thin on the ground. So this is where I'd be more explicit and lose the unnecessary ambiguity: be able to express the difference between a function and a method. Like so:
method log {
  no warnings;
  warn @_
}
Of course, that's exactly what some of the more radical extensions like MooseX::Declare allow you to do, and I'll talk more about that in a future blog entry.
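To give a flavour of what that looks like, here's a tiny class in the style of MooseX::Declare's documentation (treat it as a sketch rather than gospel - I'll go into the details, and the caveats, in that future entry):
use MooseX::Declare;

class BankAccount {
    has 'balance' => ( is => 'rw', isa => 'Num', default => 0 );

    # no 'my $self = shift' needed - the method keyword provides $self for us
    method deposit (Num $amount) {
        $self->balance( $self->balance + $amount );
    }
}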

Share "Let's have our cake and eat it too"

Share on: FacebookTwitter

CPAN Unplugged

CPAN is often described as Perl's Killer App; modern Perl relies on it, with the perl distribution almost being considered, in parts, to be nothing more than a bootstrap for the rest of the language that's out there in the cloud. Which makes it all the more annoying when you're stuck somewhere without an internet connection, missing the vital bit of the language you need. I just had first-hand experience of being offline for a two week holiday, but I didn't have this problem when hacking on personal projects: I took CPAN with me.
So, want CPAN at your fingertips even when you're offline? Yep, you've guessed it: there's a CPAN module for that! It's called CPAN::Mini, and it lets you create a mini-mirror of CPAN. A mini-mirror? What's that? It's a mirror of just the latest non-development versions of the modules from the CPAN - or in other words, it's a mirror of anything you can install by typing "install" and just the module name into the cpan shell. As I type this, the mirror weighs in at about 1.1GB, which is a fair bit smaller than the full archive. So how do we create a mini-mirror? Well, first (when you're actually online) you need to install the module.
bash$ sudo cpan CPAN::Mini
Once you've done that the minicpan command will be installed on your computer. While you can pass arguments on the command line to tell it how to run, it's easier to create a .minicpanrc file in your home directory so you don't have to remember what commands to type each time you want to sync your mirror. This is what mine looks like:
local: /cpan/
remote: http://www.mirrorservice.org/sites/ftp.funet.fi/pub/languages/perl/CPAN/
So I've got minicpan set up to download from mirrorservice.org (my nearest CPAN mirror on the internet when I'm in the UK) and create files in /cpan on my hard drive. All that's left is to run the minicpan command and watch it download.
bash$ minicpan
This prints out each file as it downloads. The first time you run this it might take a while (depending on the speed of your internet connection), so you might want to trigger it when your laptop is going to be in the same place for a while with a fast internet connection (i.e. just before you go to bed, or just after you get into the office for the day.) The second time you run this command it'll update the existing mirror. This means that it won't have to download the whole 1.1GB again, just the index files and the new modules that have been released.
bash$ minicpan
authors/01mailrc.txt.gz ... updated
authors/id/A/AD/ADAMK/Test-POE-Stopping-1.05.tar.gz ... updated
authors/id/A/AD/ADAMK/CHECKSUMS ... updated
authors/id/A/AN/ANDK/CPAN-Testers-ParseReport-0.1.4.tar.bz2 ... updated
authors/id/A/AN/ANDK/CHECKSUMS ... updated
authors/id/A/AT/ATHOMASON/Ganglia-Gmetric-PP-1.01.tar.gz ... updated
authors/id/A/AT/ATHOMASON/CHECKSUMS ... updated
authors/id/A/AT/ATHOMASON/Gearman-WorkerSpawner-1.03.tar.gz ... updated
...
cleaning /cpan/authors/id/A/AA/AAYARS/Fractal-Noisemaker-0.011.tar.gz ...done
cleaning /cpan/authors/id/A/AD/ADAMK/Test-POE-Stopping-1.04.tar.gz ...done
cleaning /cpan/authors/id/A/AL/ALEXLOMAS/CHECKSUMS ...done
cleaning /cpan/authors/id/A/AL/ALEXLOMAS/WWW-Nike-NikePlus-0.02.tar.gz ...done
...
The module will also delete any old versions of modules that are no longer in the index; in the above example you can see Adam released a new version of Test::POE::Stopping, so CPAN::Mini downloaded the new distribution and deleted the old one (as no modules in the index still relied on it). This keeps the size of the local mirror on disk as small as possible.
There are several ways you can configure the CPAN module to use this new local mirror, including typing commands in the CPAN shell. However, my preferred way is to edit the CPAN::Config module on the system directly. First, work out where the module containing your config is installed:
bash$ perl -E 'use CPAN::Config; say $INC{"CPAN/Config.pm"}'
/System/Library/Perl/5.10.0/CPAN/Config.pm
Then edit it, changing the urllist parameter to contain your local CPAN mirror in addition to your normal remote mirror:
'urllist' => [
  q[file:///cpan/],
  q[http://www.mirrorservice.org/sites/ftp.funet.fi/pub/languages/perl/CPAN/]
],
This means your CPAN shell will try to install files from disk first, and if for any reason that fails (for example, you tell it to install a development release) it'll go to the second mirror. Which way round you order the mirrors really depends on how often you update your local mirror and on personal preference. If, as I do, you put your local mirror first, this has the disadvantage that CPAN will seem "frozen" at the last time you ran minicpan, with any new changes being hidden from you until you next update. It does however mean that installs are very quick compared to normal internet installs (be you offline or not), and it avoids having to wait for the internet connection to time out every time CPAN tries to fetch a file and falls back to the local mirror when you're offline.
With all this done, I can now install modules in the usual way with the CPAN shell whether I have an internet connection or not. Of course, I haven't yet explained how I work out what modules I should be using when I'm offline and haven't got access to search.cpan.org. I'll get to that in a future blog post...
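(An aside: if you'd rather not edit Config.pm by hand, the same change can be made from inside the CPAN shell itself - this is the syntax as I remember it, so check the output of "o conf" if it complains:)
cpan> o conf urllist unshift file:///cpan/
cpan> o conf commit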

Share "CPAN Unplugged"

Share on: FacebookTwitter

say What?

Now that I've got Snow Leopard (finally) installed on my Mac, the default perl binary is now 5.10.0. This means many things: the given keyword and smart matching, the defined-or operator, the wonderful additions to the regex engine, and other things I'm bound to blog about later when I get round to enthusing about them. What I wanted to talk about today is the simplest change, and the one that'll be making the most difference to me on a day to day basis: the "say" keyword. say is more or less exactly the same as print, but two characters shorter, and it automatically adds a newline at the end. This is most useful when you're writing one-liners. This quick calculation:
bash$ perl -e 'print 235*1.15, "\n"'
Becomes just:
bash$ perl -E 'say 235*1.15'
(Note the use of -E instead of -e to automatically turn on the 5.10 keywords like say without having to add use 5.010 or use feature 'say'.) This saves us a grand total of nine keypresses (including having to hit shift one less time.) More importantly it saves us having to use double quotes at all. This is really useful when you're already using the quotes for something else. For example, running a Perl one-liner remotely with ssh:
bash$ ssh me@remote "perl -MSomeModule -e 'print SomeModule->VERSION, qq{\n}'"
With 5.10 on the remote machine this becomes just:
bash$ ssh me@remote "perl -MSomeModule -E 'say SomeModule->VERSION'"
This not only has the advantage of saving me a bunch of keystrokes, but also doesn't make me think as much. And the less I have to think, the less chance I'm going to do something stupid and make a mistake.
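The same keyword works inside full scripts too, not just one-liners - there you have to turn it on explicitly, for example:
use 5.010;   # or: use feature 'say';
say "Hello from a proper script";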

Share "say What?"

Share on: FacebookTwitter

meta META update

I'm on holiday in the States at the moment, so I haven't had much chance to do much Perl stuff. That said, I did take the chance to add proposals for the new META.yml spec. For those of you that don't know, META.yml is a small file that ships in modern CPAN distributions and contains, well, meta information. It's a way to know things about the distribution like author, prerequisites, etc, without actually having to examine or even execute other files in the distribution. A lot of the more fancy automated tools that do really clever things with CPAN distributions rely on it.
I added two suggestions. The first suggestion was a way to provide more detailed information on the repository (a.k.a. the version control system) the original source for the module is stored in. Currently we support a simple URL, but it's not clear if this is the URL for the web front end or the version control resource itself, nor how to go from one to the other. I'd also like a format ("git", "svn", "cvs") and type ("github", "sourceforge") so automated tools don't have to derive what they can do from the URL alone. I'm imagining a tool where I can type in a module name and it'll do the right thing to get me to a stage where I can immediately start working on a patch (e.g. forking the project and checking out the fork if it's on github, svn co-ing the project if it's a subversion mirror, etc.)
The second proposal was the more straightforward one. I suggested that we provide an official way for in-house extensions to META.yml for private use that won't get stomped on by future versions of the spec and won't get complained about by tools that use META.yml: simply reserving the 'x-' prefix for keys, much like in HTTP headers.
You can view my full proposals (or submit your own proposals before the 1st of October) on the CPAN Meta Spec Proposals page on the Perl QA wiki.
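For the curious, here's roughly the shape of thing I have in mind - none of these keys exist in the current spec, and the names below are purely illustrative:
resources:
  repository:
    format: git                                     # the version control system
    type: github                                    # the hosting service
    url: git://github.com/example/Some-Module.git   # the repository itself
    web: http://github.com/example/Some-Module/     # its web front end
x-internal-ticket: PROJ-1234                        # a private in-house key using the proposed 'x-' prefix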

Share "meta META update"

Share on: FacebookTwitter

Think McFly, Think!

This week I made a time machine with Perl. So, writing tests for anything that happens with the passage of time is very, very hard to do, as you don't know how long any operation might actually take on the computer you're testing on. Code may take different amounts of time to run on different machines. Sleep instructions may take longer on busy machines than on non-busy machines. The user might suspend the computer in the middle of your test suite! Multiply this across the large number and diverse set of computers that might run your test suite after you upload your module to the CPAN and you're in a situation where you're going to get some false negatives, where the test suite will fail even though nothing's really wrong. Telling your end users to "try it again, it'll probably work this time" isn't exactly a recipe for instilling confidence in your code.
That's why I developed my trusty time machine. Test::Time::HiRes is what I call it, and it allows me fine grained control over the passage of a certain type of time - the time that Time::HiRes reports back to modules using it. The easiest way to think of it is as an alternative implementation of Time::HiRes where time only progresses when you (or the code you're testing) tell it to. This means that all the sleeps your code does take an instant of wallclock time, but the simulated clock still moves as expected. As such this code runs in milliseconds, rather than hours:
use Test::Time::HiRes;  
use Time::HiRes qw(sleep);
sleep(3600 * 10);
We can also jump around in time however we want.
use Test::More tests => 2;
use Test::Time::HiRes;

# Tie::Hash::Expire uses Time::HiRes to get its timing
use Tie::Hash::Expire;
tie my %hash, "Tie::Hash::Expire", { expire_seconds => 10 };
$hash{"foo"} = 1;

ok($hash{"foo"}, "key not expired");
time_travel_by(3600);
ok(!$hash{"foo"}, "key expired");
(or back in time, or to particular points in time.) We have complete control! Of course, I'm not the first person to come up with this idea. The very fine Test::MockTime does the same thing for Perl's inbuilt time(), localtime(), and gmtime() functions. I'm just extending the concept for more accurate time. It's not quite a DeLorean, but it'll do me. The module is on the CPAN and on github. [Update: as of 2009-09-24 20:15:39 UTC, the module still hasn't reached search.cpan.org yet - I'll leave the link in place for when it does, but if you're impatient, head off to github]
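For the record, this is roughly what the Test::MockTime equivalent looks like for the core time functions (from memory, so check its documentation for the exact interface):
use Test::MockTime qw(set_fixed_time set_relative_time restore_time);
set_fixed_time(1253822400);   # freeze time() at a known epoch
# ... exercise code that calls time(), localtime() or gmtime() ...
set_relative_time(3600);      # pretend an hour has passed
restore_time();               # and hand control back to the real clock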

Share "Think McFly, Think!"

Share on: FacebookTwitter

Three Blinding Perl Tips, See How They Execute

I've got a lot of half written posts that I still need to complete, but I don't want to let this blog stagnate. Quick! Time for an n things list - Three random Perl things that I've been using a lot recently:

Default Prompts

If you install Term::ReadLine::Perl you can provide defaults for interactive prompts in your program:
use Term::ReadLine;
use DateTime;

my $term = Term::ReadLine->new('report');
my $filename = $term->readline(
   "file to write output to? ",
   "$ENV{HOME}/report-".DateTime->now->ymd.".csv"
);
When executed, this code creates a line in your terminal that has a default value already filled in for you to edit:
file to write output to? /Users/mark/report-2009-09-16.csv
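One gotcha: if neither Term::ReadLine::Perl nor Term::ReadLine::Gnu is installed, Term::ReadLine quietly falls back to its bundled stub, which (as far as I can tell) just ignores the default value. You can ask the object which backend it's actually using:
print $term->ReadLine, "\n";   # e.g. "Term::ReadLine::Perl"; "Term::ReadLine::Stub" means no default support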

Controlling Where lwp-request Sends Its Requests

The command line utility lwp-request, a handy Perl tool that ships with LWP and allows you to download webpages from the command line, can take proxy settings from the environment variables you pass to it.
bash$ http_proxy=http://127.0.0.1 lwp-request http://www.mywebsite.com
This is really useful for development: you can set up the test apache running on your local machine / virtual machine / dev box to answer on the same virtual host name as your live domain and, via the proxy settings, have lwp-request send requests there rather than to the real live machine whose DNS the domain name points to. It's also really useful for debugging reverse proxies in live: you can choose exactly which machine in your proxy chain to send to, bypassing the nginx / lighttpd / varnish front proxy machine and talking directly to the backend machine if you wish.
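For instance (the backend host and port here are made up), to talk straight to one particular machine in the chain while still requesting the live virtual host:
bash$ http_proxy=http://backend2.example.internal:8080 lwp-request -e -d http://www.mywebsite.com/
The -e and -d switches print the response headers and suppress the body respectively, which is usually all you care about when you're poking at a proxy chain.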

The "Just Show Me The Data" Incarnation for DBI

Most of the time I don't use DBI directly - I use an object relational mapper like DBIx-Class. But sometimes I'll get handed a chunk of SQL from our DBA and I just want to write a wrapper script to run that SQL and do something simple with it. DBI is very flexible, and the documentation talks at length about things like caching parsed queries, reading data in from the database row by row, efficiency of data structures, etc. What it's not very clear about, however, is how to ignore all of that and just get all your data from executing some SQL in the easiest possible format to manipulate. The magic incantation looks like this:
use DBI;

my $dbh = DBI->connect(
  "DBI:mysql:hostname=127.0.0.1;database=foo",
  $username,
  $password,
  { RaiseError => 1 },
);

my $rows = $dbh->selectall_arrayref( <<'SQL', { Slice => {} }, "flintstone" );
  SELECT *
    FROM characters
   WHERE last_name = ?
SQL
This gives you a reference to an array, with each element in this array representing one row returned from the database as a hashref keyed by field name. This is really easy to process:
foreach my $row (@{ $rows }) {
  say " * $row->{first_name} $row->{last_name}";
}
If there's any problem at all DBI will raise an exception, so you don't need to worry about writing lots of error checking code.

Share "Three Blinding Perl Tips, See How They Execute"

Share on: FacebookTwitter

Saying Thanks

Did you know that recent releases of perl come with a way to send thanks back to the authors of Perl?  Neither did I until I saw Paul Fenwick speak at YAPC::Europe this year. Here's what you've got to do.  First, make sure you're running Perl 5.8.9 or 5.10.0 or later. Then from the command line run the "perlthanks" command with your email address:
bash$ perlthanks -r someone@example.com
This fires up a modified version of the "perlbug" command line tool - a utility for reporting bugs in the perl interpreter itself. This version describes itself like so:
This program provides an easy way to send a thank-you message
back to the authors and maintainers of perl.

If you wish to submit a bug report, please run it without the -T
flag (or run the program perlbug rather than perlthanks)

First of all, please provide a subject for the message.
Subject:
So we tap in "We think you're totally awesome in every possible way" and hit return. It takes us onto the next stage:
It's now time to compose your thank-you message.

Some information about your local perl configuration will
automatically be included at the end of your message, because
we're curious about the different ways that people build and use
perl. If you'd rather not share this information, you're welcome
to delete it.

You will probably want to use a text editor to enter the body of
your report. If "vi" is the editor you want to use, then just
press Enter, otherwise type in the name of the editor you would
like to use.

If you have already composed the body of your report, you may
enter "file", and /Users/mark/local/bin/perlthanks will prompt
you to enter the name of the file containing your report.

Editor [vi]: 
I can cope with vi quite happily, so I hit return and then it fires up my editor containing the following text:
This is a thank-you report for perl from someone@example.com,
generated with the help of perlbug 1.39 running under perl 5.10.1.


-----------------------------------------------------------------
[Please enter your thank-you message here]



[You're welcome to delete anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=thanks
    severity=none
---
Site configuration information for perl 5.10.1:

Configured by mark at Sun Aug  9 09:11:41 BST 2009.

Summary of my perl5 (revision 5 version 10 subversion 1) configuration:
...
You can tell this was originally written to collect bug reports, can't you? Okay, so we fill in the bits between the dashed lines with some message like "You rock my world, oh yeah, oh yeah". Save the file and quit vi and we see:
You have finished composing your message. At this point, you have 
a few options. You can:

    * [Se]nd the message to perl-thanks@perl.org, 
    * [D]isplay the message on the screen,
    * [R]e-edit the message
    * Display or change the message's [su]bject
    * Save the message to a [f]ile to mail at another time
    * [Q]uit without sending a message

Action (Send/Display/Edit/Subject/Save to File): 
Typing "se" and hitting return sends the message off to perl-thanks@perl.org. Whee!

Share "Saying Thanks"

Share on: FacebookTwitter

Devel::UseFromCommandLineOnly

On Friday I posted a horrible chunk of code to my blog. For all the scorn I threw at it, there's one bit I didn't cover, however: a sort of saving grace. The second line here:
package Test::EnhancedIs;
use base qw(Devel::UseFromCommandLineOnly);
Devel::UseFromCommandLineOnly? What the heck is that, and why am I subclassing it? Well, that, boys and girls, is another quick module I knocked up while butchering the internals of Test::Builder. You see, I wrote Test::EnhancedIs as a proof of concept that I knew would very likely break just as soon as someone changed Test::Builder in the slightest. I'm offering it as unsupported experimental code, but I knew that no sooner had I typed those warning words than someone else was busy typing
#!/usr/bin/perl

use strict;
use warnings;

use Test::More tests => 1;
use Test::EnhancedIs;  # whooo, easier tests

use Sharks;
is(Sharks->armed,"frickin' laser beams");
And committing it to their version control system. I know what'll happen next: that person or their coworkers, months down the line when this inevitably breaks, will hunt down me or the hardworking geniuses that maintain Test::Builder to blame us for breaking their code. So I decided not to let that happen. If you run the above code it outputs:
bash$ perl test.t
1..1
Invalid use of Test::EnhancedIs in 'test.t' at line 7; This module can only be loaded from the command line at test.t line 7
Yep, that's Devel::UseFromCommandLineOnly kicking in and hitting you with the cluestick. However, if you remove line 7 from the above example and run the test loading the extra module from the command line like so:
travis-4:~ mark$ perl -MTest::EnhancedIs test.t 
1..1
not ok 1
#   Failed test at test.t line 9.
#          got: 'frickin' *sharp teeth'
#     expected: 'frickin' *laser beams'
# Looks like you failed 1 test of 1.
Everything is dandy. Devel::UseFromCommandLineOnly is on CPAN and GitHub if you want to play with it.

Share "Devel::UseFromCommandLineOnly"

Share on: FacebookTwitter

Test::EnhancedIs

The other day I made an idiot out of myself on IRC. Now this isn't exactly news, but the way I did so is interesting. It seemed to me that Test::More's is was unexpectedly failing when it was passed identical inputs. Having pondered over the issue for a good while, I turned to greater minds than myself for enlightenment; thus I pastied this output to the London Perl Mongers' IRC channel:
bash$ perl -Ilib t/01multi.t 
1..3
ok 1 - data okay
not ok 2 - ip
#   Failed test 'ip'
#   at t/01multi.t line 61.
#          got: '123.45.67.98'
#     expected: '123.45.67.89'
ok 3 - port
# Looks like you failed 1 test of 3.
Can you spot the obvious mistake? I didn't. Of course, being the bunch of pedantic so-and-sos the London Perl Mongers are, they immediately pointed out that "89" is not the same as "98", and that's why the tests were failing....Oops. But wait - why should I have to spot that? That's not very lazy. Shouldn't this comparing-strings kind of thing be exactly what the computer is good at? And darn it, why can't it just point out where the string starts being different, preferably with a big flashing arrow, a klaxon, and a troupe of dancing girls... We can't quite manage that, but with my new module, we can get a lot closer:
travis-4:Babel-WideLog mark$ perl -MTest::EnhancedIs -Ilib t/01multi.t 
1..3
ok 1 - data okay
not ok 2 - ip
#   Failed test 'ip'
#   at t/01multi.t line 61.
#          got: '123.45.67.*98'
#     expected: '123.45.67.*89'
ok 3 - port
# Looks like you failed 1 test of 3.
Yep, a white dot on a red background. The universal "Look at Me!". Actually, I did start by just making the dot red, but then that wasn't as clear when you're running under prove, which already colourises your output. Sadly, the implementation of Test::EnhancedIs leaves a lot to be desired, and is really a proof of concept rather than what I'd consider actual safe, shippable code. Let's have a look at the actual code and try not to wince too much:
package Test::EnhancedIs;
use base qw(Devel::UseFromCommandLineOnly);

use strict;
use warnings;
no warnings "redefine"; ## no critic (ProhibitNoWarnings)

our $VERSION = 0.00_01;

use Term::ANSIColor qw(colored);
use List::Util qw(min);

use Test::Builder;

# remember the original subroutine.  Note the BEGIN { } here - this is because
# without it this code will be run after the sub Test::Builder::_is_diag
# has been declared and we'll grab a ref to the wrong subroutine
my $uboat;
BEGIN { $uboat = \&Test::Builder::_is_diag }; ## no critic (ProtectPrivateVars)

# now write a new subroutine, overriding the subroutine in another package
# don't try this at home kids.
sub Test::Builder::_is_diag { ## no critic (ProtectPrivateSubs)
  my( $self, $got, $type, $expect ) = @_;

  # look for either a different character, or the end of either string
  my $index;
  foreach (0..min(length $got,length $expect)) {
    $index = $_;
    last if substr($got,$index,1) ne substr($expect,$index,1);
  }

  # put a marker in there
  substr($got,$index,0,colored("*","white on_red"));
  substr($expect,$index,0,colored("*","white on_red"));

  # run the original code
  return $uboat->($self,$got,$type,$expect);
}

1;
As you can tell from the comments, we're really breaking the rules here. Anything that disables warnings like that and requires multiple Perl Critic tests to be disabled is more than a little bit worrying! The worst of it is that we're redefining a private function that's inside the Test::Builder namespace. By convention, any method or function that starts with an underscore in Perl is considered to be private and can change between versions of the code without notice, meaning that this code will probably not work on versions of Test::Builder other than the one I have installed (which is the latest) - including future versions that are yet to be released.
Still, as long as we're aware of the pitfalls, this isn't too bad a snippet to have around to fire up from the command line for one-off tests when our eyes start to glaze over, at least until it next breaks. We'll just have to be careful not to include it in any commands or scripts we save to disk, lest we start to rely on it. Of course, the real solution is to take this proof of concept to the Perl-QA guys and gals and ask them how we can best get this functionality integrated properly into the next release of Test::Builder. That's a task for another day, however.

Share "Test::EnhancedIs"

Share on: FacebookTwitter

The Upper Hand

When coding I quite often find myself having to set up some state that's temporarily limited to just the block I'm in and the routines the block calls. By way of a simple example, imagine we have a password reset utility on our website. In our example, resetting works by sending the user a url with a token in it to the email address associated with their account, and when the user clicks on that url they're sent to a page where they can send us the token and a new password. The core of the code to do the actual password resetting might look something like this:
sub reset_password_from_email_token {
  my $self = shift;
  my $token = shift;
  my $password = shift;

  # temporarily disable security checks as this user isn't the one
  # the session is logged in as
  $self->disable_security_checks;

  if (any { $_ eq $token } $self->recent_reset_tokens) {
    $self->set_password($password);
    $self->remove_reset_token($token);
  }

  # turn the checks back on again
  $self->enable_security_checks;
 }
This is fairly reasonable code, but prone to subtle bugs. What if there's a problem with setting the password? For example, set_password could have easily been written to raise an exception if the password isn't long enough:
sub set_password {
  my $self = shift;
  my $value = shift;

  # the exact minimum length is illustrative; the important bit is that the
  # exception message matches the /too short/ check used later
  die "New password is too short" if length($value) < 6;

  $self->{value} = $value;
  return $self;
}
And it's fairly reasonable for someone therefore to write something like this:
eval {
  $user->reset_password_from_email_token($token, $password);
};
if ($@) {
  if ($@ =~ /too short/) {
    return render_bad_password_page();
  } else { die $@ }
}
return render_password_reset_page();
Have you spotted the problem yet? Yep, enable_security_checks never got called: render_bad_password_page is running with security off! What we want is to ensure that security is always turned back on when we exit from reset_password_from_email_token, no matter how we do that. We want something like this pseudocode:
sub reset_password_from_email_token {
  my $self = shift;
  my $token = shift;
  my $password = shift;

  # this user isn't the one the session is logged in as
  ...temporarily disable security checks somehow...

  if (any { $_ eq $token } $self->recent_reset_tokens) {
    $self->set_password($password);
    $self->remove_reset_token($token);
  }
}
Now how to write that? One very hacky way of doing it would be to localise a state variable that Perl will automatically restore to the original value as it exits the current scope, i.e. as it exits the subroutine:
sub reset_password_from_email_token {
  my $self = shift;
  my $token = shift;
  my $password = shift;

  # this user isn't the one the session is logged in as
  local $self->{security} = 0;

  if (any { $_ eq $token } $self->recent_reset_tokens) {
    $self->set_password($password);
    $self->remove_reset_token($token);
  }
}
Of course, this has several obvious drawbacks. Firstly, it requires the reset_password_from_email_token routine to understand how security works; if we ever change the security implementation of the module we're going to have to alter this code too. Heaven help us if we try this approach on a third party module! Secondly, it assumes that the implementation of security is sufficiently trivial that it can be controlled by a simple variable. This isn't often the case. You may end up having to localise a whole collection of variables, or even run complex logic to work out what to do. Very, very messy.
Quite simply, just resetting variables back to their original state isn't a powerful enough mechanism. What we actually would like to do is define some code that will be run on exit from the subroutine. One way to do that is to use the End module from the CPAN:
use End qw(end);

sub reset_password_from_email_token {
  my $self = shift;
  my $token = shift;
  my $password = shift;

  # this user isn't the one the session is logged in as
  $self->disable_security_checks;
  my $temp = end { $self->enable_security_checks };

  if (any { $_ eq $token } $self->recent_reset_tokens) {
    $self->set_password($password);
    $self->remove_reset_token($token);
  }
}
As long as $temp stays in scope nothing happens, but as soon as the subroutine exits and $temp goes out of scope the code we passed in will be executed. How does that work? The End module is a way to create an instance that runs some code when it's garbage collected. In the above example, when $temp goes out of scope its DESTROY method will be called, which will in turn call the anonymous subroutine that we passed in, which calls enable_security_checks. Great! We've solved the problem. We've almost invented a kind of backwards try / catch / finally syntax a la Java and friends. The problem with this is that it still requires me to write code every time I disable security, and therefore think, and therefore have a chance to introduce bugs. What I really, really would like to do is write this:
sub reset_password_from_email_token {
  my $self = shift;
  my $token = shift;
  my $password = shift;

  # this user isn't the one the session is logged in as
  $self->temporarily_disable_security_checks;

  if (any { $_ eq $token } $self->recent_reset_tokens) {
    $self->set_password($password);
    $self->remove_reset_token($token);
  }
}
And for it to essentially do the same thing: call disable_security_checks immediately and enable_security_checks at the end of the calling scope. Is this possible? Yes, with the help of another CPAN module, Scope::Upper:
use Scope::Upper qw(reap :words);   # :words supplies constants like UP

sub temporarily_disable_security_checks {
  my $self = shift;
  
  # disable security checks immediately
  $self->disable_security_checks;
  
  # and when the scope that called us exits, re-enable them
  reap sub {
    $self->enable_security_checks;
  }, UP;
}
Whoa! What happened there? Like end, the reap function exported by Scope::Upper allows us to define an anonymous subroutine that will be called when a scope exits - but rather than the current scope, we can say when any scope in our call chain exits. In this example we're passing "UP", a constant exported by Scope::Upper which means "the scope that called us", i.e. run this code when reset_password_from_email_token exits. As you can imagine, this is a really powerful mechanism that can be used to encapsulate complex logic. It's useful for all sorts of things, from cleanup exercises like I've shown here to being really helpful in defining new keywords...
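As a footnote, the garbage collection trick that End (and the $temp example above) relies on is simple enough to roll by hand. A toy version - not End's actual implementation, just the shape of the idea, with a made-up package name - looks something like this:
package My::ScopeGuard;

# stash a code reference inside a blessed object...
sub new {
  my ($class, $code) = @_;
  return bless { code => $code }, $class;
}

# ...and run it when the object is garbage collected, which for a lexical
# variable means the moment it goes out of scope
sub DESTROY {
  $_[0]->{code}->();
}

1;
Holding one of these in a my variable gives you the behaviour of the end { ... } example earlier; what Scope::Upper adds is the ability to attach that cleanup to a scope further up the call chain.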

Share "The Upper Hand"

Share on: FacebookTwitter

blog built using the cayman-theme by Jason Long. LICENSE