Frac’ing your HTML

In my previous blog entry I talked about encoding weird characters into HTML entities. In this entry I’m going to talk about taking some patterns of ASCII – dumb ways of writing fractions – and turning them into HTML entities or Unicode characters.

Hey Good Looking, What’s Cooking?

Imagine a simple recipe:

<ul>
   <li>1/2 cup of sugar</li>
   <li>1/2 cup of spice</li>
   <li>1/4 cup of all things nice</li>
</ul>

While this is nice, we can do better. There are nice Unicode characters for ¼ and ½, and corresponding HTML entities that we can use to have the browser render them for us. What we need is some way to change all our mucky ASCII into these entities. Faced with this problem on his recipes site, European Perl Hacker Léon Brocard wrote a module called HTML::Fraction that can tweak strings of HTML.

use HTML::Fraction;
my $frac = HTML::Fraction->new();
my $output = $frac->tweak($string_of_html);

This module creates output like:

<ul>
   <li>&frac12; cup of sugar</li>
   <li>&frac12; cup of spice</li>
   <li>&frac14; cup of all things nice</li>
</ul>

Which renders nicely as:

  • ½ cup of sugar
  • ½ cup of spice
  • ¼ cup of all things nice
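
To make the idea concrete, here is a minimal sketch of how this kind of substitution could be done with a lookup table and a regular expression. It is not HTML::Fraction's actual implementation; the names %entity_for and tweak_fractions are made up for illustration, and the table only covers the three fractions that have named entities in HTML 4.

```perl
use strict;
use warnings;

# Hypothetical lookup table: ASCII fraction => named HTML entity.
my %entity_for = (
    '1/4' => '&frac14;',
    '1/2' => '&frac12;',
    '3/4' => '&frac34;',
);

# Replace each known fraction, skipping anything that is part of a
# longer number or a date (so "11/2", "1/25" and "1/2/2013" are left alone).
sub tweak_fractions {
    my ($html) = @_;
    $html =~ s{(?<![\d/])([1-9]/[1-9])(?![\d/])}{ $entity_for{$1} // $1 }ge;
    return $html;
}

print tweak_fractions("<li>1/2 cup of sugar</li>\n");
# prints: <li>&frac12; cup of sugar</li>
```

The lookarounds are the important design choice here: without them a naive search and replace would happily mangle dates and larger numbers.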

HTML::Fraction can even cope with decimal representation in your string. For example:

  • 0.5 slugs
  • 0.67 snails
  • 0.14 puppy dogs tails

Processed with HTML::Fraction, this renders like so:

  • ½ slugs
  • ⅔ snails
  • ⅐ puppy dogs tails
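
A decimal lookup could be sketched along the same lines. This is only an illustration under stated assumptions: the table %codepoint_for and the function tweak_decimals are hypothetical, and the real module's list of recognised decimals may well differ.

```perl
use strict;
use warnings;

# Hypothetical table: a decimal as commonly typed => code point of the
# matching Unicode fraction character.
my %codepoint_for = (
    '0.25' => 0x00BC,   # one quarter
    '0.5'  => 0x00BD,   # one half
    '0.75' => 0x00BE,   # three quarters
    '0.33' => 0x2153,   # one third
    '0.67' => 0x2154,   # two thirds
    '0.14' => 0x2150,   # one seventh
);

# Replace a listed decimal with a decimal HTML entity, leaving longer
# numbers such as "10.5" or "0.55" untouched.
sub tweak_decimals {
    my ($html) = @_;
    $html =~ s{(?<![\d.])(0\.\d+)(?!\d)}{
        exists $codepoint_for{$1} ? sprintf('&#%d;', $codepoint_for{$1}) : $1
    }ge;
    return $html;
}

print tweak_decimals("0.5 slugs, 0.67 snails, 0.14 puppy dogs tails\n");
# prints: &#189; slugs, &#8532; snails, &#8528; puppy dogs tails
```

Fractions outside the HTML 4 named set have to be written as numeric entities, which is why this sketch emits the decimal form throughout.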

Unicode Characters Instead

Of course, we don’t always want to render out HTML. Sometimes we just want a plain old string back. Faced with this issue myself, I wrote a quick subclass called String::Fraction:

use String::Fraction;
my $frac = String::Fraction->new();
my $output = $frac->tweak($string);

The entire source code of this module is short enough that I can show it to you here:

package String::Fraction;
use base qw(HTML::Fraction);

use strict;
use warnings;

our $VERSION = "0.30";

# Our superclass sometimes uses named entities
my %name2char = (
  '1/4'  => "\x{00BC}",
  '1/2'  => "\x{00BD}",
  '3/4'  => "\x{00BE}",
);

sub _name2char {
  my $self = shift;
  my $str = shift;

  # see if we can work from the Unicode character
  # from the entity returned by our superclass
  my $entity = $self->SUPER::_name2char($str);
  if ($entity =~ /\A &\#(\d+); \z/x) {
    return chr($1);
  }

  # superclass doesn't return a decimal entity?
  # use our own lookup table
  return $name2char{ $str };
}

1;

We simply override one method, _name2char, so that instead of returning an HTML entity we return the corresponding Unicode character.
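
The decimal-entity branch of that method can be tried on its own. The following is a small sketch, with entity_to_char as a hypothetical helper name rather than anything the module provides. Note that once you have real Unicode characters in a string you will probably want an encoding layer on your output handle, otherwise Perl warns about wide characters.

```perl
use strict;
use warnings;

# Convert a decimal HTML entity such as "&#8531;" to its character.
# Returns undef for anything else (for example a named entity).
sub entity_to_char {
    my ($entity) = @_;
    return $entity =~ /\A &\#(\d+); \z/x ? chr($1) : undef;
}

# Encode characters as UTF-8 on the way out.
binmode STDOUT, ':encoding(UTF-8)';

print entity_to_char('&#8531;'), " cup of all things nice\n";
```

A named entity such as &frac12; does not match the pattern, which is exactly the case the module's own lookup table is there to handle.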

2 responses to “Frac’ing your HTML”

  1. Neat.

    I assume 0.5 slugs becoming ¼ slugs is a typo :)

    (Don’t know why that other comment went in as fuzzixfuzzix)

  2. walt Wheeler

    Interesting:

    0.5 slugs # half a slug here
    0.67 snails

    Processed with HTML::Fraction renders like so:
    ¼ slugs # where did the other quarter slug go?
    ⅔ snails
