#!/usr/bin/perl -w
use strict;
use Encode;
use HTML::Template;
my $template = HTML::Template->new(
    filehandle => *DATA,
);
print Encode::encode('UTF-8', $template->output);
__DATA__
¡™£¢∞§¶•ªº
prints a mangled take on ¡™£¢∞§¶•ªº (or something like that!)
In the example above, this makes sense, since we're reading from an open filehandle (even if it's only our magical DATA) that we didn't put a file layer filter thingy on. That's easy to fix with binmode; the version below prints ¡™£¢∞§¶•ªº, yay!
#!/usr/bin/perl -w
use strict;
use Encode;
binmode DATA, ':encoding(UTF-8)';
use HTML::Template;
my $template = HTML::Template->new(
    filehandle => *DATA,
);
print Encode::encode('UTF-8', $template->output);
__DATA__
¡™£¢∞§¶•ªº
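To see what that layer actually changes, here's a minimal sketch that leaves HTML::Template out of it entirely. It assumes an in-memory filehandle is a fair stand-in for DATA, and the helper name slurp_with_layer is made up for illustration:

```perl
use strict;
use warnings;
use Encode ();

# The ten characters from the examples, as a Perl character string.
my $chars = "\x{a1}\x{2122}\x{a3}\x{a2}\x{221e}\x{a7}\x{b6}\x{2022}\x{aa}\x{ba}";

# What sits on disk (or after __DATA__) is the UTF-8 byte sequence.
my $bytes = Encode::encode('UTF-8', $chars);

# Hypothetical helper: read $bytes back through an in-memory filehandle,
# optionally with a PerlIO layer on it.
sub slurp_with_layer {
    my ($layer) = @_;
    open my $fh, "<$layer", \$bytes or die "open failed: $!";
    local $/;    # slurp mode
    my $data = <$fh>;
    close $fh;
    return $data;
}

my $raw     = slurp_with_layer('');                  # no layer: raw bytes
my $decoded = slurp_with_layer(':encoding(UTF-8)');  # layer decodes on read

printf "without layer: %d\n", length $raw;      # counts bytes
printf "with layer:    %d\n", length $decoded;  # counts characters
```

Without the layer Perl hands back 23 bytes; with it, the same data arrives as 10 characters, which is the form the rest of the program expects.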
This also works if we just want to pass HTML::Template a reference to a scalar. The following prints ¡™£¢∞§¶•ªº, yay:
#!/usr/bin/perl -w
use strict;
use Encode;
my $content = "\x{a1}\x{2122}\x{a3}\x{a2}\x{221e}\x{a7}\x{b6}\x{2022}\x{aa}\x{ba}";
use HTML::Template;
my $template = HTML::Template->new(
    scalarref => \$content,
);
print Encode::encode('UTF-8', $template->output);
None of this works if we just want to give it the name of a template file. And passing a filename is really useful, since HTML::Template can search through an array of directories (its path option) looking for the file.
And this is where encoding madness begins.
'Cause I know what you're thinking: just treat HTML::Template's output like information that's coming from outside your program (since, if you're using a template *file*, it kinda is).
So, all you need to do is decode (this is the WRONG WAY to solve the problem, but let's just make that mistake...) the return value of ->output, like this:
#!/usr/bin/perl -w
use strict;
use Encode;
my $content = "\x{a1}\x{2122}\x{a3}\x{a2}\x{221e}\x{a7}\x{b6}\x{2022}\x{aa}\x{ba}";
my $filename = 'utf8string.tmpl';
open my $fh, '>:encoding(UTF-8)', $filename or die $!;
print $fh $content;
close $fh;
use HTML::Template;
my $template = HTML::Template->new(
    filename => $filename,
);
my $output = $template->output;
$output = Encode::decode('UTF-8', $output);
print Encode::encode('UTF-8', $output);
prints, ¡™£¢∞§¶•ªº. Yes.
But... what if your template has a variable (it is a templating system, after all) and the value you pass to param() is an already-decoded Unicode string? MUAHAHA!
#!/usr/bin/perl -w
use strict;
use Encode;
my $content = "
<!-- tmpl_var one -->
\x{a1}\x{2122}\x{a3}\x{a2}\x{221e}\x{a7}\x{b6}\x{2022}\x{aa}\x{ba}
";
my $filename = 'utf8string.tmpl';
open my $fh, '>:encoding(UTF-8)', $filename or die $!;
print $fh $content;
close $fh;
use HTML::Template;
my $template = HTML::Template->new(
    filename => $filename,
);
$template->param(
    one => "\x{a1}\x{2122}\x{a3}\x{a2}\x{221e}\x{a7}\x{b6}\x{2022}\x{aa}\x{ba}",
);
my $output = $template->output;
$output = Encode::decode('UTF-8', $output);
print Encode::encode('UTF-8', $output);
Cannot decode string with wide characters at /System/Library/Perl/5.10.0/darwin-thread-multi-2level/Encode.pm line 162.
Bahahaha!
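That error isn't really HTML::Template's doing. Here's a minimal sketch with nothing but Encode, assuming the output string ended up holding at least one character above 0xFF (which the param value guarantees):

```perl
use strict;
use warnings;
use Encode ();

# A character string holding a code point above 0xFF (U+2122, the trademark sign).
my $already_decoded = "\x{2122}";

# decode() expects bytes, so handing it wide characters makes it croak.
my $result = eval { Encode::decode('UTF-8', $already_decoded) };
my $error  = $@;

print defined $result ? "decoded fine\n" : "decode died: $error";
```

So the moment decoded characters from param() land in the same string as the raw bytes from the file, decoding the whole thing is off the table.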
Take out those decode/encode lines (I know it looks strange to have one right after the other) and you'll still get weird output:
¡™£¢∞§¶•ªº
¡™£¢∞§¶•ªº
Darned if you do, darned if you don't. Those two lines should have the same string. They don't. No amount of encoding/decoding the finished output is going to help.
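Here's a rough sketch of what's most likely going on, again with plain Encode and no templates. It assumes the file contents arrive as raw bytes and the param value as decoded characters, which is what the examples above set up:

```perl
use strict;
use warnings;
use Encode ();

my $chars = "\x{a1}\x{2122}";                  # decoded characters (like the param value)
my $bytes = Encode::encode('UTF-8', $chars);   # raw UTF-8 bytes (like the template file)

# Joining the two makes Perl treat every byte as a Latin-1 character.
my $mixed = $bytes . "\n" . $chars . "\n";

# Encoding for output now double-encodes the half that was still bytes.
my $out = Encode::encode('UTF-8', $mixed);
my ($from_file, $from_param) = split /\n/, $out;

printf "template half: %d bytes\n", length $from_file;
printf "param half:    %d bytes\n", length $from_param;
```

The param half comes out as the 5 bytes you'd expect; the template half balloons to 10, because each of its bytes got encoded a second time. Once the two kinds of string are glued together there's no single decode or encode call that repairs both halves.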
The trick, other than tweaking HTML::Template's source to include file filter layer thingamabobs, is to decode the contents of the file it opens up.
How to do that?
Trolling through the HTML::Template mailing list archives leads to the idea of using an HTML::Template filter that matches everything and then does our decoding:
#!/usr/bin/perl -w
use strict;
use Encode;
my $content = "
<!-- tmpl_var one -->
\x{a1}\x{2122}\x{a3}\x{a2}\x{221e}\x{a7}\x{b6}\x{2022}\x{aa}\x{ba}
";
my $filename = 'utf8string.tmpl';
open my $fh, '>:encoding(UTF-8)', $filename or die $!;
print $fh $content;
close $fh;
use HTML::Template;
my $template = HTML::Template->new(
    filename => $filename,
    filter   => [
        { sub => \&decode_str, format => 'scalar' },
    ],
);
$template->param(
    one => "\x{a1}\x{2122}\x{a3}\x{a2}\x{221e}\x{a7}\x{b6}\x{2022}\x{aa}\x{ba}",
);
my $output = $template->output;
print Encode::encode('UTF-8', $output);
sub decode_str {
    my $ref = shift;
    ${$ref} = Encode::decode('UTF-8', ${$ref});
}
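This sort of lines everything up: the template gets decoded on the way in, the param is already decoded, and the whole lot gets encoded once on the way out, which is all that stuff the unicodefaqthingy perldoc (perlunifaq) tells you to do.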
But, oh, it gets better.
DON'T use that filter trick thing if you're using a scalarref, or a properly encoded file handle! You'll get a nice error, like this:
HTML::Template->new() : fatal error occured during filter call: Cannot decode string with wide characters at /System/Library/Perl/5.10.0/darwin-thread-multi-2level/Encode.pm line 162.
at /Library/Perl/5.10.0/HTML/Template.pm line 1697
HTML::Template::_init_template('HTML::Template=HASH(0x1008aafb8)') called at /Library/Perl/5.10.0/HTML/Template.pm line 1238
HTML::Template::_init('HTML::Template=HASH(0x1008aafb8)') called at /Library/Perl/5.10.0/HTML/Template.pm line 1124
Brilliant.
So I don't know what the best advice to give is. If you're passing the template as a scalarref, DON'T use that filter, unless you want to encode your template beforehand (which makes little sense?).
If it's a filename, use that filter trick, perhaps (or edit the source code of HTML::Template).
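One possible middle ground is a filter that only decodes when the text still looks like bytes. This is just a sketch under assumptions: decode_if_bytes is a made-up name, and leaning on utf8::is_utf8 (Perl's internal flag) is a heuristic, so a scalarref holding undecoded Latin-1 text would still get mangled.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of HTML::Template): decode the template text
# only when it still looks like raw bytes.
sub decode_if_bytes {
    my $ref = shift;
    # utf8::is_utf8 reports Perl's internal flag; reading it as
    # "already decoded" is a heuristic, not a guarantee.
    return if utf8::is_utf8(${$ref});
    ${$ref} = Encode::decode('UTF-8', ${$ref});
    return;
}

my $chars       = "\x{a1}\x{2122}\x{a3}";
my $from_file   = Encode::encode('UTF-8', $chars);  # bytes, as read via a plain filename
my $from_scalar = $chars;                           # characters, as with a scalarref

decode_if_bytes(\$from_file);     # gets decoded
decode_if_bytes(\$from_scalar);   # left alone, so no "wide characters" error

print "both inputs now match\n" if $from_file eq $from_scalar;
```

If that holds up in your setup, it could go where decode_str went, as { sub => \&decode_if_bytes, format => 'scalar' }. Treat it as something to try rather than a known-good fix.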
 
2 comments:
You should talk to Sam Tregar and see what ideas he has about it.
At least from going through the HTML::Template mailing lists, the other option is to switch from HTML::Template to HTML::Template::Compiled. I'm not too interested in doing that, since H::T::Compiled doesn't support things like Template Expressions, which is something that I expose as a feature to advanced users of this project.
As far as I understand, Sam is very, very conservative about releases of H::T, but currently H::T doesn't have the best plugin support for extending his fine module.