A Method of Coming Clean

Since Perl has no built-in way of declaring whether a subroutine is a function or a method, it's perfectly possible to mistakenly override a superclass method with an errant function import, or to accidentally call a function as a method from outside the class it was declared in, violating encapsulation. This is particularly problematic with generically named utility functions (or "language extensions") that are imported into your module's namespace from other modules. In this blog entry I'm going to talk about what can be done to prevent this, and introduce you to the namespace::clean module that can automate this process.

An Example of the Problem

By way of example, let's write a simple class that looks up details about a user from their Facebook profile:
  package FacebookUser;
  use Moose;

  use LWP::Simple qw(get);
  use JSON::XS qw(decode_json);
  
  has api_access_token => ( isa => 'Str', is => 'ro', required => 1 );
  has username         => ( isa => 'Str', is => 'ro', required => 1 );
  has data             => ( isa => 'HashRef', is => 'ro', lazy_build => 1 );
  
  sub _build_data {
    my $self = shift;
    my $url = "https://graph.facebook.com/@{[ $self->username ]}?access_token=@{[ $self->api_access_token ]}";
    return decode_json(get($url));
  }
  
  sub get_firstname { return $_[0]->data->{first_name} }
  sub get_lastname  { return $_[0]->data->{last_name} }
  sub get_gender    { return $_[0]->data->{gender} }
  sub get_hometown  { return $_[0]->data->{hometown}{name} }
This is designed to be called like so:
  my $user = FacebookUser->new(
    api_access_token => $access_token,
    username => "2shortplanks",
  );
  say "Hello ".$user->get_firstname;
But what happens if someone calls it like so:
  my $user = FacebookUser->new(
    api_access_token => $access_token,
    username => "2shortplanks",
  );
  say "Hello ".$user->get('firstname');
Instead of a complaint about calling a non-existent get method, we get this very cryptic error message:
  Can't use a FacebookUser object as a URI at /Library/Perl/5.10.0/HTTP/Request/Common.pm line 103
Why's this? Because Perl tried to call the get function that we imported into our module from LWP::Simple as a method call, passing the invocant as the first argument instead of a URL.
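
You can see the same leak without touching the network. The following is only a minimal sketch: the Temperature class is made up for illustration, and List::Util's max stands in for LWP::Simple's get. It shows that an imported function lands in the very symbol table that method resolution searches.

```perl
use strict;
use warnings;

package Temperature;
use List::Util qw(max);    # max() is now a sub in the Temperature namespace

sub new {
    my ($class, @readings) = @_;
    return bless { readings => [@readings] }, $class;
}

# Intended use: max() called as a plain function.
sub highest {
    my $self = shift;
    return max(@{ $self->{readings} });
}

package main;

my $t       = Temperature->new(12, 30, 21);
my $highest = $t->highest;

# Method resolution searches the same symbol table that import wrote to,
# so the imported function also answers as a method.
my $leaked = Temperature->can('max') ? 1 : 0;
my $same   = Temperature->can('max') == \&List::Util::max ? 1 : 0;

print "highest: $highest\n";
print "Temperature->can('max'): ", ($leaked ? "yes" : "no"), "\n";
print "same sub as List::Util::max: ", ($same ? "yes" : "no"), "\n";
```

Nothing in the class asked for a max method, yet callers outside the class can reach one.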

The obscured parent method problem

Bad error messages when we write stupid code are one thing. Leftover functions totally breaking our code in ways we don't expect is another. Let's try writing the same module using the Class::Accessor object system instead of using Moose.
  package FacebookUser;
  use base qw(Class::Accessor);

  # use Class::Accessor to create the api_access_token and username accessor methods
  __PACKAGE__->mk_accessors(qw(api_access_token username));

  use LWP::Simple qw(get);
  use JSON::XS qw(decode_json);

  sub data {
    my $self = shift;
    return $self->{data} ||= $self->_build_data;
  }

  sub _build_data {
    my $self = shift;
    my $url = "https://graph.facebook.com/@{[ $self->username ]}?access_token=@{[ $self->api_access_token ]}";
    return decode_json(get($url));
  }

  sub get_firstname { return $_[0]->data->{first_name} }
  sub get_lastname  { return $_[0]->data->{last_name} }
  sub get_gender    { return $_[0]->data->{gender} }
  sub get_hometown  { return $_[0]->data->{hometown}{name} }
This is designed to be called like so:
  my $user = FacebookUser->new({
    api_access_token => $access_token,
    username => "2shortplanks",
  });
  say "Hello ".$user->get_firstname;
To our surprise, this doesn't work at all:
  Can't use a FacebookUser object as a URI at /Library/Perl/5.10.0/HTTP/Request/Common.pm line 103
That's the same error we got when we called $user->get('firstname') in our Moose example by mistake, but this time we're sure we're calling $user->get_firstname, so what gives? How is the get subroutine being called accidentally? Let's run our example again with Carp::Always so we get a full stack trace out:
  bash$ perl -MCarp::Always example.pl 
  Can't use a FacebookUser object as a URI at /Library/Perl/5.10.0/HTTP/Request.pm line 70
  	HTTP::Request::uri('HTTP::Request=HASH(0x10091c5f8)', 'FacebookUser=HASH(0x10091c1f0)') called at /Library/Perl/5.10.0/HTTP/Request.pm line 16
  	HTTP::Request::new('HTTP::Request', 'GET', 'FacebookUser=HASH(0x10091c1f0)') called at /Library/Perl/5.10.0/HTTP/Request/Common.pm line 103
  	HTTP::Request::Common::_simple_req() called at /Library/Perl/5.10.0/HTTP/Request/Common.pm line 20
  	HTTP::Request::Common::GET('FacebookUser=HASH(0x10091c1f0)') called at /Library/Perl/5.10.0/LWP/UserAgent.pm line 386
  	LWP::UserAgent::get('LWP::UserAgent=HASH(0x10083c840)', 'FacebookUser=HASH(0x10091c1f0)') called at /Library/Perl/5.10.0/LWP/Simple.pm line 36
  	LWP::Simple::get('FacebookUser=HASH(0x10091c1f0)', 'username') called at /System/Library/Perl/Extras/5.10.0/Class/Accessor.pm line 393
  	Class::Accessor::__ANON__('FacebookUser=HASH(0x10091c1f0)') called at FacebookUser.pm line 18
  	FacebookUser::_build_data('FacebookUser=HASH(0x10091c1f0)') called at FacebookUser.pm line 13
  	FacebookUser::data('FacebookUser=HASH(0x10091c1f0)') called at FacebookUser.pm line 22
  	FacebookUser::get_firstname('FacebookUser=HASH(0x10091c1f0)') called at example.pl line 15
What's happened is that the username method that Class::Accessor created in our package is trying to call the get method in FacebookUser's Class::Accessor superclass, but because we've imported a get function into FacebookUser, that function is being accidentally called as a method instead. Ooops!
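
The shadowing can be reproduced in a few lines with no CPAN modules at all. This is a sketch under stated assumptions: Parent, Fetcher and Child are hypothetical stand-ins for Class::Accessor, LWP::Simple and FacebookUser, and the real modules are of course more involved.

```perl
use strict;
use warnings;

# Hypothetical parent class: its accessors delegate to an inherited get().
package Parent;
sub new  { my ($class, %args) = @_; return bless {%args}, $class }
sub get  { my ($self, $key) = @_; return $self->{$key} }
sub name { my $self = shift; return $self->get('name') }

# Hypothetical utility module exporting a generically named function.
package Fetcher;
use Exporter 'import';
BEGIN { our @EXPORT_OK = ('get') }
sub get { my ($url) = @_; return "fetched $url" }

package Child;
our @ISA = ('Parent');
BEGIN { Fetcher->import('get') }    # what `use Fetcher qw(get)` would do

package main;

my $child  = Child->new(name => 'Mark');
my $result = $child->name;    # Parent::name calls $self->get('name')

print "$result\n";            # a "fetched Child=HASH(...)" string, not "Mark"
```

Parent::name asks for a method called get, Perl starts the search in Child, and the imported function is found before the search ever reaches Parent.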

Solving this: Deleting the subroutine when done

This problem can be solved in either code example by adding the following line to FacebookUser.pm to delete the get subroutine:
  delete $FacebookUser::{get};
Now trying to call the get function as a method won't work, and we'll get a proper error message:
  say "Hello ".$user->get('firstname');
  Can't locate object method "get" via package "FacebookUser" at example.pl line 15
And with these modifications in both cases our get_firstname method call works as expected:
  say "Hello ".$user->get_firstname;
  Hello Mark
Those of you paying attention might be wondering how this can possibly still (if you'll pardon the pun) function. After all, the _build_data method uses the get function to fetch the data from the Facebook servers, and we just deleted the get function! The answer to this conundrum relies on the fact that Perl is not an interpreted language but a compiled one (one that is recompiled each time the program is started, with no executable or bytecode saved to disk). As such, Perl has multiple phases of execution, with "compile time" (when the code is turned into bytecode) and "execution time" (when the code is actually running) happening at different times. It's at compile time that perl works out which function each function call refers to, so the call to get inside _build_data is already bound to the real function by the time our delete statement runs. Method calls, on the other hand, are looked up by name at execution time, after the symbol for the function has been deleted, and so they fail to find the errant get. It's the same mechanism that means the following bit of code still works even though the panic() call comes before the declaration of the function:
  panic();
  sub panic { say "Don't panic Mr Mannering!"; return }
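
Here is a small sketch of both halves of that explanation in one file. The Greeter class and its helper function are invented for the example. The function call keeps working after the delete, while the method lookup fails.

```perl
use strict;
use warnings;

package Greeter;

sub helper { return "hello" }

sub new   { return bless {}, shift }
sub greet { return helper() . ", world" }    # helper() is bound at compile time

# Runs at execution time, after every sub above has been compiled.
delete $Greeter::{helper};

package main;

my $g        = Greeter->new;
my $greeting = $g->greet;                        # function call still works
my $visible  = Greeter->can('helper') ? 1 : 0;   # method lookup finds nothing
my $error    = eval { $g->helper; 1 } ? '' : $@;

print "$greeting\n";
print "can('helper'): ", ($visible ? "yes" : "no"), "\n";
print $error;
```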

Doing this automatically

I'll be the first to admit that this syntax:
  delete $FacebookUser::{get};
Totally sucks. It's ugly and completely unintelligible unless you happen to have learnt what it does beforehand. Worse still, you need to remember to do it for each and every function you declare or import in your class, and you might not even know all the functions that are imported if you rely on the default import list. In other words, it involves a lot of bookkeeping and hence plenty of chances for mistakes. Why don't we get the computer to automate this for us? This is where the namespace::clean module from CPAN comes in. Its use is simple:
  use namespace::clean;
This deletes all functions from your namespace that were declared or imported above the line where namespace::clean was use-ed. So, in our example we might like to do it after the lines where we've imported everything but before we start declaring methods and accessors:
  package FacebookUser;
  use base qw(Class::Accessor);

  use LWP::Simple qw(get);
  use JSON::XS qw(decode_json);

  use namespace::clean;

  # use Class::Accessor to create the api_access_token and username
  # accessor methods
  __PACKAGE__->mk_accessors(qw(api_access_token username));

  ...
Or with Moose:
  package FacebookUser;
  use Moose;

  use LWP::Simple qw(get);
  use JSON::XS qw(decode_json);
  
  use namespace::clean -except => 'meta';
  
  ...
In the latter example we've told namespace::clean to keep the meta subroutine that use Moose imported for us - it's an important part of our object's framework.
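
If you're curious what the manual bookkeeping would involve, the core B module can report which package a subroutine was originally compiled in. The following is only a rough sketch: imported_subs is a hypothetical helper of my own, the Report class is made up, and this is not a description of how namespace::clean does its job internally.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: names of subs in $package that were compiled elsewhere.
sub imported_subs {
    my ($package) = @_;
    no strict 'refs';
    my @imported;
    for my $name (sort keys %{"${package}::"}) {
        next unless defined &{"${package}::$name"};
        my $gv = B::svref_2object(\&{"${package}::$name"})->GV;
        next unless $gv->isa('B::GV');
        push @imported, $name if $gv->STASH->NAME ne $package;
    }
    return @imported;
}

package Report;
use List::Util qw(sum max);

sub new   { my ($class, @values) = @_; return bless { values => [@values] }, $class }
sub total { my $self = shift; return sum(@{ $self->{values} }) }

package main;

my @leaked = imported_subs('Report');
print "imported into Report: @leaked\n";
```

Every name in that list is something a caller could invoke as a method on a Report object, which is exactly the list you'd otherwise have to delete by hand.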

Conclusion

The confusion between functions and methods in Perl is one that can cause a fair number of headaches for those not paying very close attention, and often requires us to have too much knowledge of the inner workings of our parent classes. We can handle this problem manually, or the namespace::clean module on CPAN can help alleviate this pain.
