Under the Hood

Dec 2, 2009

Perl provides a high level of abstraction between you and the computer, allowing you to quickly write very expressive high level code that does a lot. Sometimes, however, when things don't go to plan or you want performance improvements, it's important to find out what's really going on at the lower levels and what perl's doing "under the hood."

What Did perl Think I Said?

Sometimes when code doesn't do what you expect it's nice to see how the Perl interpreter understands your code, in case your understanding of Perl's syntax and perl's understanding of that same syntax differ. One way to do this is to use the B::Deparse module from the command line to regenerate Perl code from the internal representation perl built when it parsed your source code. This is as simple as:

bash$ perl -MO=Deparse myscript.pl

One of my favourite options for B::Deparse is -p which tells it to put in an obsessive amount of brackets so you can see what precedence perl is applying:

bash$ perl -MO=Deparse,-p -le 'print $ARGV[0]+$ARGV[1]*$ARGV[2]'
BEGIN { $/ = "\n"; $\ = "\n"; }
print(($ARGV[0] + ($ARGV[1] * $ARGV[2])));
-e syntax OK

You'll even note there are two sets of brackets immediately after the print statement - one surrounding the addition and one enclosing the argument list to print. This means that B::Deparse can also be used to work out why the following script prints out 25 rather than 5:

bash$ perl -le 'print ($ARGV[0]**2+$ARGV[1]**2)**0.5' 3 4

The brackets we thought we were using to force precedence were actually parsed by perl as enclosing the argument list to print. The sum was printed first, and the **0.5 was then applied to print's return value and the result thrown away:

bash$ perl -MO=Deparse,-p -le 'print ($ARGV[0]**2+$ARGV[1]**2)**0.5' 3 4
BEGIN { $/ = "\n"; $\ = "\n"; }
(print((($ARGV[0] ** 2) + ($ARGV[1] ** 2))) ** 0.5);
-e syntax OK
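
Once you can see how perl parsed the statement, the fix is to make the brackets unambiguous. The following is a minimal sketch of three ways to do that; the hypotenuse helper is a name made up for illustration, and the values 3 and 4 are hard-coded in place of @ARGV.

```perl
use strict;
use warnings;

# Hypothetical helper: length of the hypotenuse for sides $x and $y.
sub hypotenuse {
    my ($x, $y) = @_;
    return ($x**2 + $y**2)**0.5;
}

# Outer brackets belong to print; the inner ones group the arithmetic.
print( (3**2 + 4**2)**0.5, "\n" );    # prints 5

# A leading unary plus stops perl treating "(" as the start of
# print's argument list.
print +(3**2 + 4**2)**0.5, "\n";      # prints 5

# Moving the arithmetic into a sub sidesteps the problem entirely.
print hypotenuse(3, 4), "\n";         # prints 5
```

Running any of these through -MO=Deparse,-p should show the ** 0.5 inside print's argument list rather than outside it.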

What Does That Scalar Actually Contain?

A scalar is many things at once - it can actually hold a string, an integer, a floating point value and convert between them at will. We can see the internal structure with the Devel::Peek module:

use Devel::Peek;
my $foo = 2;
Dump($foo);

This prints

SV = IV(0x100813f78) at 0x100813f80
  REFCNT = 1
  FLAGS = (PADMY,IOK,pIOK)
  IV = 2

This tells you a lot about the object. It tells you it's an int (an IV) and the value of that int is 2. You can see that it's got one reference pointing to it (the $foo alias.) You can also see it's got several flags set on it telling us which of the values stored in the object are still current (in this case the IV, as the IOK and pIOK flags show.) Now let's append an empty string to it:

$foo .= "";
Dump($foo);

This now prints:

SV = PVIV(0x100803c10) at 0x100813f80
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK)
  IV = 2
  PV = 0x100208900 "2"
  CUR = 1
  LEN = 8

We gain a PV (it's a "pointer value" aka a string) and we also gain CUR (current string length) and LEN (total string length allocated before we need to re-alloc and copy the string.) The flags have changed to indicate that the PV value is now the current one: POK and pPOK have replaced IOK and pIOK, even though the old IV of 2 is still sitting in the structure. So we can tell a lot about the internal state of a scalar. Why would we care (assuming we're not going to be using XS that has to deal with this kind of stuff)? Mainly I find myself reaching for Devel::Peek to print out the contents of strings whenever I have encoding issues. Consider this:

my $acme = "L\x{e9}on";
Dump $acme;

On my system this shows that Léon was actually stored internally as a latin-1 byte sequence:

SV = PV(0x100801c78) at 0x100813f98
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK)
  PV = 0x100202550 "L\351on"
  CUR = 4
  LEN = 8

But it doesn't have to be:

utf8::upgrade($acme);
Dump($acme);

Now the internal bytes of the string are stored in utf8 (and the UTF8 flag is turned on):

SV = PV(0x100801c78) at 0x100813f98
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK,UTF8)
  PV = 0x1002010f0 "L\303\251on" [UTF8 "L\x{e9}on"]
  CUR = 5
  LEN = 6

As far as perl is concerned these are the same string:

my $acme  = "L\x{e9}on";
my $acme2 = $acme;
utf8::upgrade($acme);
say "Yep, this will be printed"
  if $acme eq $acme2;
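
If you only want to know which representation a string is currently using, without reading a full Dump, the built-in utf8::is_utf8 function reports the state of the UTF8 flag. This is a small sketch using the same string as above; the variable names are just for illustration.

```perl
use strict;
use warnings;

my $bytes    = "L\x{e9}on";   # stored as Latin-1 bytes, as in the first dump
my $upgraded = $bytes;
utf8::upgrade($upgraded);     # same characters, UTF-8 internal storage

# The two strings compare equal and have the same length in characters...
printf "equal: %s\n", ($bytes eq $upgraded ? "yes" : "no");        # equal: yes
printf "length: %d and %d\n", length($bytes), length($upgraded);   # length: 4 and 4

# ...and only the internal flag tells them apart.
printf "UTF8 flag: %s and %s\n",
    (utf8::is_utf8($bytes)    ? "on" : "off"),
    (utf8::is_utf8($upgraded) ? "on" : "off");                     # UTF8 flag: off and on
```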

In fact, perl may decide to switch between these two internal representations as you concatenate and manipulate your strings. This is not something you normally have to worry about until something goes wrong and you see something horrid being output:

LÃ©on

This is usually a sign that you've read in some bytes that were encoded as UTF-8 and forgotten to decode them with Encode (or you've encoded them twice!), or you've passed a UTF-8 string through a C library, or you had duff data to begin with (garbage in, garbage out.) Of course, you can't really start to work out which of these cases is true unless you look in the variable, and that's hard: you can't just print it out because that will re-encode it with the binmode of that filehandle and your terminal may do all kinds of weirdness with it. The solution, of course, is to Dump it out as above and see an ASCII representation of what's actually stored in memory.
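
To make those failure cases concrete, here is a rough sketch that builds an undecoded string, a correctly decoded one and a double-encoded one, and prints each as plain ASCII code points so the terminal can't interfere. The codepoints helper is a made-up name for illustration, and the input bytes are hard-coded rather than read from a real file or socket.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: show each character of a string as an ASCII code point.
sub codepoints {
    my ($str) = @_;
    return join ' ', map { sprintf 'U+%04X', ord } split //, $str;
}

my $raw   = "L\xc3\xa9on";                # the five UTF-8 bytes for "Léon", not yet decoded
my $text  = decode('UTF-8', $raw);        # four characters, as intended
my $twice = encode('UTF-8', encode('UTF-8', $text));   # encoded once too often

print "raw:   ", codepoints($raw),  "\n";   # U+004C U+00C3 U+00A9 U+006F U+006E
print "text:  ", codepoints($text), "\n";   # U+004C U+00E9 U+006F U+006E
print "twice: ", length($twice), " bytes\n"; # twice: 7 bytes
```

Seeing U+00C3 U+00A9 where you expected U+00E9 is the tell-tale sign of bytes that were never decoded.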

How Much Memory Is That Using?

In general you don't have to worry about memory in Perl - perl handles allocating and deallocating memory for you automatically. On the other hand, perl can't magically give your computer an infinite amount of memory so you still have to worry that you're using too much (especially in a webserver environment where you might be caching data between requests but running multiple Perl processes at the same time.) The Devel::Size module from the CPAN can be a great help here:

bash$ perl -E 'use Devel::Size qw(size); say size("a"x1024)'
1080

So in this case a string of 1024 "a" characters takes up the 1024 bytes for all the "a" characters plus 56 bytes for the internal scalar data structure (the exact size will vary slightly between versions of perl and across architectures.) Devel::Size can also tell you how much memory nested data structures (and objects) are taking up:

bash$ perl -E 'use Devel::Size qw(total_size); say total_size({ z => [("a"x1024)x10] })'
11251

Be aware that Devel::Size will only report how much memory perl has allocated for you - not how much memory XS modules you've loaded into perl are taking up.
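
The difference between size and total_size is easiest to see side by side. The sketch below measures the same nested structure both ways; because Devel::Size comes from the CPAN rather than shipping with perl, it's loaded inside an eval so the script still runs when the module isn't installed. The exact byte counts will depend on your perl, so none are shown.

```perl
use strict;
use warnings;

# Devel::Size is a CPAN module, so load it defensively.
my $have_devel_size = eval { require Devel::Size; 1 };

my $data = { z => [ ("a" x 1024) x 10 ] };
my ($shallow, $deep);

if ($have_devel_size) {
    # size() looks at the hash itself without following the references in it.
    $shallow = Devel::Size::size($data);
    # total_size() follows references all the way down the structure.
    $deep    = Devel::Size::total_size($data);
    print "size:       $shallow bytes\n";
    print "total_size: $deep bytes\n";
}
else {
    print "Devel::Size is not installed\n";
}
```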

How Does perl Execute That?

Perl's interpreter (like those that run Python, Java, JavaScript, Ruby and many other languages) neither compiles your code to native machine instructions nor interprets the source code directly to execute it. It instead compiles the code to a bytecode representation and then 'executes' those bytes on a virtual machine capable of understanding much higher level instructions than the processor in your computer.

When you're optimising your code one of the most important things to do is reduce the number of "ops" (bytecode operations) that perl has to execute. This is because there's significant overhead in actually running the virtual machine itself, so the more you can get each Perl op to do the better, even if that op itself is more expensive to run. For example, here's a script that counts the number of "a" characters in its first argument by using the index function to repeatedly search for the next "a" and increasing a counter whenever we find one:

bash$ perl -E '$c++ while $pos = index($ARGV[0], "a", $pos) + 1; say $c' aardvark
3

Let's look at what ops that program actually creates. This can be done with the B::Concise module that ships with perl:

bash$ perl -MO=Concise -E '$c++ while $pos = index($ARGV[0], "a", $pos) + 1; say $c' aardvark
l  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 42 -e:1) v:%,{ ->3
g     <@> leave vK* ->h
3        <0> enter v ->4
-        <1> null vKP/1 ->g
c           <|> and(other->d) vK/1 ->g
b              <2> sassign sKS/2 ->c
9                 <2> add[t7] sK/2 ->a
7                    <@> index[t6] sK/3 ->8
-                       <0> ex-pushmark s ->4
-                       <1> ex-aelem sK/2 ->5
-                          <1> ex-rv2av sKR/1 ->-
4                             <#> aelemfast[*ARGV] s ->5
-                          <0> ex-const s ->-
5                       <$> const[GV "a"] s ->6
-                       <1> ex-rv2sv sK/1 ->7
6                          <#> gvsv[*pos] s ->7
8                    <$> const[IV 1] s ->9
-                 <1> ex-rv2sv sKRM*/1 ->b
a                    <#> gvsv[*pos] s ->b
-              <@> lineseq vK ->-
e                 <1> preinc[t2] vK/1 ->f
-                    <1> ex-rv2sv sKRM/1 ->e
d                       <#> gvsv[*c] s ->e
f                 <0> unstack v ->4
h     <;> nextstate(main 42 -e:1) v:%,{ ->i
k     <@> say vK ->l
i        <0> pushmark s ->j
-        <1> ex-rv2sv sK/1 ->k
j           <#> gvsv[*c] s ->k

It's not important to really understand this in any great detail; all we need worry about is that firstly it's very big for what we're trying to do and secondly that it's looping, so those ops we can see are going to be executed multiple times. Let's try an alternative approach, using the translation operator to translate all the "a" characters to "a" characters (so, do nothing) and return how many characters it 'changed':

bash$ perl -MO=Concise -E '$c = $ARGV[0] =~ tr/a/a/; say $c' aardvark
b  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 42 -e:1) v:%,{ ->3
6     <2> sassign vKS/2 ->7
-        <1> null sKS/2 ->5
-           <1> ex-aelem sK/2 ->4
-              <1> ex-rv2av sKR/1 ->-
3                 <#> aelemfast[*ARGV] s ->4
-              <0> ex-const s ->-
4           <"> trans sS/IDENT ->5
-        <1> ex-rv2sv sKRM*/1 ->6
5           <#> gvsv[*c] s ->6
7     <;> nextstate(main 42 -e:1) v:%,{ ->8
a     <@> say vK ->b
8        <0> pushmark s ->9
-        <1> ex-rv2sv sK/1 ->a
9           <#> gvsv[*c] s ->a

Ah! Far fewer ops! And no loops! This is because the call to tr is a single op, meaning this whole thing is much faster. Of course, don't take my word for it - run a benchmark:

#!/usr/bin/perl

use Benchmark qw(cmpthese);

cmpthese(10_000_000, {
 'index' => sub { my $c; my $pos; $c++ while $pos = index($ARGV[0], "a", $pos) + 1 },
 'tr'    => sub { my $c; $c = $ARGV[0] =~ tr/a/a/ },
});

bash$ ./benchmark.pl aardvark
           Rate index    tr
index 2439024/s    --  -39%
tr    4016064/s   65%    --
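
Before reading too much into those numbers it's worth checking that the two approaches really do give the same answer, and that the op counts differ the way the B::Concise output suggests. The sketch below does both using only the B module that ships with perl. The count_index, count_tr and op_count subs and the count_visit method are names invented for this example, and op_count reports how many ops are in each sub's compiled tree (including ones perl has optimised away), not how many get executed at run time.

```perl
use strict;
use warnings;
use B ();

# Hypothetical wrappers around the two approaches benchmarked above.
sub count_index {
    my ($str) = @_;
    my ($c, $pos) = (0, 0);
    $c++ while $pos = index($str, "a", $pos) + 1;
    return $c;
}

sub count_tr {
    my ($str) = @_;
    my $c = ($str =~ tr/a/a/);
    return $c;
}

# B::walkoptree calls the named method on every op it visits, so we
# add a (hypothetical) counting method to the base op class.
my $ops_seen = 0;
sub B::OP::count_visit { $ops_seen++ }

# Count the ops in the compiled tree of a code reference.
sub op_count {
    my ($code) = @_;
    $ops_seen = 0;
    B::walkoptree(B::svref_2object($code)->ROOT, 'count_visit');
    return $ops_seen;
}

for my $word (qw(aardvark banana xyz)) {
    printf "%-8s index=%d tr=%d\n", $word, count_index($word), count_tr($word);
}

printf "ops in count_index: %d\n", op_count(\&count_index);
printf "ops in count_tr:    %d\n", op_count(\&count_tr);
```

Because the index version loops, the gap in ops actually executed is wider than the gap between these static counts.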

And finally

This is just a smattering of modules that can help poke around inside the internals of Perl - practically the national sport of the Programming Republic of Perl. The CPAN contains a very large number of modules that can do all kinds of clever things - try looking on the CPAN for "B::" and "Devel::" modules.

As Thick As Two Short Planks

Mark Fowler's Perl Blog