Frac'ing your HTML
In my previous blog entry I talked about encoding weird characters into HTML entities. In this entry I’m going to talk about taking some patterns of ASCII - dumb ways of writing fractions - and turning them into HTML entities or Unicode characters.
Hey Good Looking, What’s Cooking?
Imagine a simple recipe:
<ul>
<li>1/2 cup of sugar</li>
<li>1/2 cup of spice</li>
<li>1/4 cup of all things nice</li>
</ul>
While this is nice, we can do better. There are nice Unicode characters for ¼ and ½, and corresponding HTML entities, that we can use to have the browser render them for us. What we need is some way to change all our mucky ASCII into these entities. Faced with this problem on his recipes site, European Perl hacker Léon Brocard wrote a module called HTML::Fraction that can tweak strings of HTML.
use HTML::Fraction;
my $frac = HTML::Fraction->new();
my $output = $frac->tweak($string_of_html);
This module creates output like:
<ul>
<li>&frac12; cup of sugar</li>
<li>&frac12; cup of spice</li>
<li>&frac14; cup of all things nice</li>
</ul>
Which renders nicely as:
- ½ cup of sugar
- ½ cup of spice
- ¼ cup of all things nice
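To give a feel for the kind of substitution involved, here is a minimal sketch in plain Perl. This is not HTML::Fraction's actual implementation: the %entity_for table and the tweak_fractions function are names I made up for illustration, and it only knows about three fractions.

```perl
use strict;
use warnings;

# Hypothetical lookup table: ASCII fraction => named HTML entity.
my %entity_for = (
    '1/4' => '&frac14;',
    '1/2' => '&frac12;',
    '3/4' => '&frac34;',
);

# Build one alternation out of the table keys.
my $pattern = join '|', map { quotemeta } sort keys %entity_for;

# Replace standalone ASCII fractions with their entities.
# The lookarounds leave things like dates (11/2/2010) alone.
sub tweak_fractions {
    my ($html) = @_;
    $html =~ s{(?<![\d/])($pattern)(?![\d/])}{$entity_for{$1}}g;
    return $html;
}

print tweak_fractions("<li>1/2 cup of sugar</li>\n");
# prints: <li>&frac12; cup of sugar</li>
```

The lookarounds are a design choice of this sketch: without them "11/2/2010" would end up with an entity in the middle of a date.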
HTML::Fraction can even cope with decimal representations in your string. For example:
- 0.5 slugs
- 0.67 snails
- 0.14 puppy dogs tails
Processed with HTML::Fraction, this renders like so:
- ½ slugs
- ⅔ snails
- ⅐ puppy dogs tails
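How a decimal gets matched to a fraction isn't shown in this entry, so the following is only a sketch of one way it could be done, under my own assumptions: a decimal counts as a fraction if it equals that fraction exactly, or (with at least two decimal places) equals it truncated or rounded to the same number of places. The @known list and the decimal_to_fraction function are hypothetical names, not part of HTML::Fraction.

```perl
use strict;
use warnings;

# Fractions this sketch recognises, as [numerator, denominator] pairs.
my @known = ( [1, 2], [1, 4], [3, 4], [1, 3], [2, 3], [1, 7] );

# Return "n/d" for a decimal string such as "0.67", or undef if nothing fits.
sub decimal_to_fraction {
    my ($decimal) = @_;
    my ($digits) = $decimal =~ /\A 0? \. (\d{1,6}) \z/x or return;
    my $places = length $digits;

    for my $pair (@known) {
        my ($n, $d) = @$pair;
        my $scaled = $n * 10**$places;

        # Fraction terminates at this precision: demand an exact match.
        if ($scaled % $d == 0) {
            return "$n/$d" if $digits == $scaled / $d;
            next;
        }

        # Repeating fraction: too ambiguous with only one decimal place.
        next if $places < 2;
        my $truncated = int($scaled / $d);
        my $rounded   = int((2 * $scaled + $d) / (2 * $d));
        return "$n/$d" if $digits == $truncated || $digits == $rounded;
    }
    return;
}

for my $decimal (qw(0.5 0.67 0.14)) {
    print "$decimal => ", decimal_to_fraction($decimal), "\n";
}
# prints:
# 0.5 => 1/2
# 0.67 => 2/3
# 0.14 => 1/7
```

Refusing repeating fractions at one decimal place is deliberate here: "0.3" could just as easily be a badly rounded quarter as a third.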
Unicode Characters Instead
Of course, we don’t always want to render out HTML. Sometimes we just want a plain old string back. Faced with this issue myself, I wrote a quick subclass called String::Fraction:
use String::Fraction;
my $frac = String::Fraction->new();
my $output = $frac->tweak($string);
The entire source code of this module is short enough that I can show it to you here:
package String::Fraction;
use base qw(HTML::Fraction);
use strict;
use warnings;
our $VERSION = "0.30";

# Our superclass sometimes uses named entities, so we
# need our own mapping for those to Unicode characters
my %name2char = (
    '1/4' => "\x{00BC}",
    '1/2' => "\x{00BD}",
    '3/4' => "\x{00BE}",
);

sub _name2char {
    my $self = shift;
    my $str  = shift;

    # see if we can work out the Unicode character
    # from the entity returned by our superclass
    my $entity = $self->SUPER::_name2char($str);
    if ($entity =~ /\A &\#(\d+); \z/x) {
        return chr($1);
    }

    # superclass doesn't return a decimal entity?
    # use our own lookup table
    return $name2char{ $str };
}

# modules must end by returning a true value
1;
We simply override one method, _name2char, so that instead of returning an HTML entity we return the corresponding Unicode character.
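If you want to see that conversion on its own, without either module installed, here is a small standalone sketch. The entity_to_char function and the %named table are made up for this example; they just mirror the two steps in _name2char (decimal entity first, lookup table second).

```perl
use strict;
use warnings;

# Fallback for named entities; same three characters as %name2char above.
my %named = (
    '&frac14;' => "\x{00BC}",
    '&frac12;' => "\x{00BD}",
    '&frac34;' => "\x{00BE}",
);

# entity_to_char is a hypothetical helper, not part of either module.
sub entity_to_char {
    my ($entity) = @_;

    # A decimal entity such as &#8531; carries the code point directly.
    return chr($1) if $entity =~ /\A &\#(\d+); \z/x;

    # Otherwise fall back to the table (undef if we don't know it).
    return $named{$entity};
}

# Tell Perl how to encode the characters on the way out.
binmode STDOUT, ':encoding(UTF-8)';
print entity_to_char('&#8531;'),  " snails\n";   # U+2153, one third
print entity_to_char('&frac12;'), " slugs\n";    # U+00BD, one half
```

Note the binmode call: the strings you get back contain real Unicode characters, so without an encoding layer on the filehandle Perl will warn about wide characters when you print something like ⅓.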