I have, finally, after much complaining from people who wanted to play, shipped a copy of HTML::Zoom to the CPAN. Herein some of the more fun parts from the docs so far:
SYNOPSIS
use HTML::Zoom;
my $template = <<HTML;
<html>
  <head>
    <title>Hello people</title>
  </head>
  <body>
    <h3 id="greeting">Placeholder</h3>
    <div id="list">
      <span>
        <p>Name: <span class="name">Bob</span></p>
        <p>Age: <span class="age">23</span></p>
      </span>
      <hr class="between" />
    </div>
  </body>
</html>
HTML
my $output = HTML::Zoom
  ->from_html($template)
  ->select('title, #greeting')->replace_content('Hello world & dog!')
  ->select('#list')->repeat_content(
      [
        sub {
          $_->select('.name')->replace_content('Matt')
            ->select('.age')->replace_content('26')
        },
        sub {
          $_->select('.name')->replace_content('Mark')
            ->select('.age')->replace_content('0x29')
        },
        sub {
          $_->select('.name')->replace_content('Epitaph')
            ->select('.age')->replace_content('<redacted>')
        },
      ],
      { repeat_between => '.between' }
    )
  ->to_html;
will produce:
<html>
  <head>
    <title>Hello world &amp; dog!</title>
  </head>
  <body>
    <h3 id="greeting">Hello world &amp; dog!</h3>
    <div id="list">
      <span>
        <p>Name: <span class="name">Matt</span></p>
        <p>Age: <span class="age">26</span></p>
      </span>
      <hr class="between" />
      <span>
        <p>Name: <span class="name">Mark</span></p>
        <p>Age: <span class="age">0x29</span></p>
      </span>
      <hr class="between" />
      <span>
        <p>Name: <span class="name">Epitaph</span></p>
        <p>Age: <span class="age">&lt;redacted&gt;</span></p>
      </span>
    </div>
  </body>
</html>
DANGER WILL ROBINSON
This is a 0.9 release. That means that I'm fairly happy the API isn't going to change in surprising and upsetting ways before 1.0 and a real compatibility freeze. But it also means that if it turns out there's a mistake the size of a politician's ego in the API design that I haven't spotted yet there may be a bit of breakage between here and 1.0. Hopefully not though. Appendages crossed and all that.
Worse still, the rest of the distribution isn't documented yet. I'm sorry. I suck. But lots of people have been asking me to ship this, docs or no, so having got this class itself at least somewhat documented I figured now was a good time to cut a first real release.
DESCRIPTION
HTML::Zoom is a lazy, stream oriented, streaming capable, mostly functional, CSS selector based semantic templating engine for HTML and HTML-like document formats.
Which is, on the whole, a bit of a mouthful. So let me step back a moment and explain why you care enough to understand what I mean:
JQUERY ENVY
HTML::Zoom is the cure for jQuery envy. Your JavaScript guy pushes a piece of data into a document by doing:
$('.username').text(username);
In HTML::Zoom one can write
$zoom->select('.username')->replace_content($username);
which is, I hope, almost as clear, hampered only by the fact that Zoom can't assume a global document and therefore has nothing quite so simple as the $() function to get the initial selection.
HTML::Zoom::SelectorParser implements a subset of the jQuery selector specification, and will continue to track that rather than the W3C standards for the foreseeable future on grounds of pragmatism. Also on the grounds that their spec is written in EN_US rather than EN_W3C, and I read the former much better.
I am happy to admit that it's very, very much a subset at the moment - see the HTML::Zoom::SelectorParser POD for what's currently there, and expect more and more to be supported over time as we need it and patch it in.
CLEAN TEMPLATES
HTML::Zoom is the cure for messy templates. How many times have you looked at templates like this:
<form action="/somewhere">
[% FOREACH field IN fields %]
  <label for="[% field.id %]">[% field.label %]</label>
  <input name="[% field.name %]" type="[% field.type %]" value="[% field.value %]" />
[% END %]
</form>
and despaired of the fact that neither the HTML structure nor the logic is remotely easy to read? Fortunately, with HTML::Zoom we can separate the two cleanly:
<form class="myform" action="/somewhere">
  <label />
  <input />
</form>
$zoom->select('.myform')->repeat_content([
  map { my $field = $_; sub {
    $_->select('label')
      ->add_attribute( for => $field->{id} )
      ->then
      ->replace_content( $field->{label} )
      ->select('input')
      ->add_attribute( name => $field->{name} )
      ->then
      ->add_attribute( type => $field->{type} )
      ->then
      ->add_attribute( value => $field->{value} )
  } } @fields
]);
This is, admittedly, very much not shorter. However, it makes it extremely clear what's happening and therefore less hassle to maintain. Especially because it allows the designer to fiddle with the HTML without cutting himself on sharp ELSE clauses, and the developer to add available data to the template without getting angle bracket cuts on sensitive parts.
Better still, HTML::Zoom knows that it's inserting content into HTML and can escape it for you - the example template should really have been:
<form action="/somewhere">
[% FOREACH field IN fields %]
  <label for="[% field.id | html %]">[% field.label | html %]</label>
  <input name="[% field.name | html %]" type="[% field.type | html %]" value="[% field.value | html %]" />
[% END %]
</form>
and frankly I'll take slightly more code any day over *that* crawling horror.
(addendum: I pick on Template Toolkit here specifically because it's the template system I hate the least - for text templating, I don't honestly think I'll ever like anything except the next version of Template Toolkit better - but HTML isn't text. Zoom knows that. Do you?)
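For the avoidance of doubt about what "escape it for you" means, here is a minimal sketch of text escaping in plain Perl. The escape_html_text helper is a made-up name for illustration and is not HTML::Zoom's own code; it only shows the sort of work you no longer have to remember to do by hand in every template slot.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration; not HTML::Zoom's own escaping code.
sub escape_html_text {
  my ($text) = @_;
  $text =~ s/&/&amp;/g;    # ampersand first, so later entities aren't re-escaped
  $text =~ s/</&lt;/g;
  $text =~ s/>/&gt;/g;
  $text =~ s/"/&quot;/g;   # matters once the value lands inside an attribute
  return $text;
}

# The two awkward strings from the synopsis
my $greeting = escape_html_text('Hello world & dog!');
my $age      = escape_html_text('<redacted>');
```

The point is not that this is hard to write, but that a templating engine which knows its output is HTML can do it at every insertion point without being asked.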
PUTTING THE FUN INTO FUNCTIONAL
The principle of HTML::Zoom is to provide a reusable, functional container object that lets you build up a set of transforms to be applied; every method call you make on a zoom object returns a new object, so it's safe to do so on one somebody else gave you without worrying about altering state (with the notable exception of ->next for stream objects, which I'll come to later).
So:
my $z2 = $z1->select('.name')->replace_content($name);
my $z3 = $z2->select('.title')->replace_content('Ms.');
each time produces a new Zoom object. If you want to package up a set of transforms to re-use, HTML::Zoom provides an 'apply' method:
my $add_name = sub { $_->select('.name')->replace_content($name) };
my $same_as_z2 = $z1->apply($add_name);
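If the "every call returns a new object" idea sounds abstract, it can be sketched in a few lines of plain Perl. Everything here (the Sketch::Transforms package, with_step, steps) is invented for illustration and is not how HTML::Zoom is actually implemented; it just shows a container whose methods copy rather than mutate, plus an apply that runs a packaged transform with $_ set to the object.

```perl
use strict;
use warnings;

package Sketch::Transforms;   # hypothetical name, not part of HTML::Zoom

sub new {
  my ($class, @steps) = @_;
  return bless { steps => [@steps] }, $class;
}

# Return a NEW object carrying one more step; the invocant is left untouched.
sub with_step {
  my ($self, $step) = @_;
  return ref($self)->new(@{ $self->{steps} }, $step);
}

# Run a packaged transform with $_ aliased to the current object.
sub apply {
  my ($self, $code) = @_;
  local $_ = $self;
  return $code->($self);
}

sub steps { return @{ $_[0]{steps} } }

package main;

my $z1 = Sketch::Transforms->new;
my $add_name = sub { $_->with_step('name') };
my $z2 = $z1->apply($add_name);       # $z1 still has no steps
my $z3 = $z2->with_step('title');     # $z2 still has exactly one
```

Because nothing is ever modified in place, handing $z1 to somebody else's code can't change what you get when you use it yourself afterwards.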
LAZINESS IS A VIRTUE
HTML::Zoom does its best to defer doing anything until it's absolutely required. The only point at which it descends into state is when you force it to create a stream, directly by:
my $stream = $zoom->as_stream;
while (my $evt = $stream->next) {
  # handle zoom event here
}
or indirectly via:
my $final_html = $zoom->to_html;
my $fh = $zoom->to_fh;
while (my $chunk = $fh->getline) { ... }
Better still, the $fh returned doesn't create its stream until the first call to getline, which means that until you call that and force it to be stateful you can get back to the original stateless Zoom object via:
my $zoom = $fh->to_zoom;
which is exceedingly handy for filtering Plack PSGI responses, among other things.
Because HTML::Zoom doesn't try and evaluate everything up front, you can generally put things together in whatever order is most appropriate. This means that:
my $start = HTML::Zoom->from_html($html);
my $zoom = $start->select('div')->replace_content('THIS IS A DIV!');
and:
my $start = HTML::Zoom->select('div')->replace_content('THIS IS A DIV!');
my $zoom = $start->from_html($html);
will produce equivalent final $zoom objects, thus proving that there can be more than one way to do it without one of them being a bait and switch.
STOCKTON TO DARLINGTON UNDER STREAM POWER
HTML::Zoom's execution always happens in terms of streams under the hood - that is, the basic pattern for doing anything is -
my $stream = get_stream_from_somewhere;
while (my ($evt) = $stream->next) {
  # do something with the event
}
More importantly, all selectors and filters are also built as stream operations, so a selector and filter pair is effectively:
sub next {
  my ($self) = @_;
  my $next_evt = $self->parent_stream->next;
  if ($self->selector_matches($next_evt)) {
    return $self->apply_filter_to($next_evt);
  } else {
    return $next_evt;
  }
}
Internally, things are marginally more complicated than that, but not enough that you as a user should normally need to care.
In fact, an HTML::Zoom object is mostly just a container for the relevant information from which to build the final stream that does the real work. A stream built from a Zoom object is a stream of events from parsing the initial HTML, wrapped in a filter stream per selector/filter pair provided as described above.
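To make the "filter stream per selector/filter pair" wrapping concrete, here is a rough sketch using closures instead of objects. The array_stream and filter_stream helpers and the event hash keys (type, name, raw, attrs) are assumptions made up for this example, not HTML::Zoom's internals; the shape of the wrapping is what matters.

```perl
use strict;
use warnings;

# Base stream: hands out one event per call, and an empty list at the end.
sub array_stream {
  my @events = @_;
  return sub { @events ? shift @events : () };
}

# Wrap a parent stream: events that match go through the filter,
# everything else is passed along unchanged.
sub filter_stream {
  my ($parent, $matches, $filter) = @_;
  return sub {
    my ($evt) = $parent->() or return;
    return $matches->($evt) ? $filter->($evt) : $evt;
  };
}

# One wrapper per selector/filter pair, layered over the parsed events.
my $stream = filter_stream(
  filter_stream(
    array_stream(
      { type => 'OPEN',  name => 'div' },
      { type => 'TEXT',  raw  => 'Bob' },
      { type => 'CLOSE', name => 'div' },
    ),
    sub { $_[0]{type} eq 'TEXT' },
    sub { +{ %{ $_[0] }, raw => uc $_[0]{raw} } },
  ),
  sub { $_[0]{type} eq 'OPEN' },
  sub { +{ %{ $_[0] }, attrs => { class => 'seen' } } },
);

my @seen;
while (my ($evt) = $stream->()) {
  push @seen, $evt;
}
```

Nothing is pulled from the innermost stream until the outermost one is asked for an event, which is where the laziness comes from.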
The upshot of this is that the application of filters works just as well on streams as on the original Zoom object - in fact, when you run a repeat_content operation your subroutines are applied to the stream for that element of the repeat, rather than constructing a new zoom per repeat element as well.
More concretely:
$_->select('div')->replace_content('I AM A DIV!');
works on both HTML::Zoom objects themselves and HTML::Zoom stream objects and shares sufficient of the implementation that you can generally forget the difference - barring the fact that a stream already has state attached so things like to_fh are no longer available.
POP! GOES THE WEASEL
... and by Weasel, I mean layout.
HTML::Zoom's filehandle object supports an additional event key, 'flush', that is transparent to the rest of the system but indicates to the filehandle object to end a getline operation at that point and return the HTML so far.
This means that in an environment where streaming output is available, such as a number of the Plack PSGI handlers, you can add the flush key to an event in order to ensure that the HTML generated so far is flushed through to the browser right now. This can be especially useful if you know you're about to call a web service or a potentially slow database query or similar to ensure that at least the header/layout of your page renders now, improving perceived user responsiveness while your application waits around for the data it needs.
This is currently exposed by the 'flush_before' option to the collect filter, which incidentally also underlies the replace and repeat filters, so to indicate we want this behaviour to happen before a query is executed we can write something like:
$zoom->select('.item')->repeat(sub {
  if (my $row = $db_thing->next) {
    return sub {
      $_->select('.item-name')->replace_content($row->name)
    }
  } else {
    return
  }
}, { flush_before => 1 });
which should have the desired effect given a sufficiently lazy $db_thing (for example a DBIx::Class::ResultSet object).
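The effect of the flush key on getline can be sketched without any of the real machinery. The make_getline helper and the raw/flush event keys below are hypothetical, and I'm simplifying by ending the chunk straight after the flagged event; the real filehandle object may well cut at a slightly different point.

```perl
use strict;
use warnings;

# Build a getline-alike over an event source. $next is a code ref that
# returns one event per call and an empty list once the stream is done.
sub make_getline {
  my ($next) = @_;
  return sub {
    my $chunk = '';
    while (my ($evt) = $next->()) {
      $chunk .= $evt->{raw};
      last if $evt->{flush};   # hand back what we have so far, right now
    }
    return length($chunk) ? $chunk : undef;
  };
}

my @events = (
  { raw => '<h1>Header</h1>', flush => 1 },   # layout, flushed early
  { raw => '<p>slow row 1</p>' },
  { raw => '<p>slow row 2</p>' },
);
my $getline = make_getline(sub { @events ? shift @events : () });

my @chunks;
while (defined(my $chunk = $getline->())) {
  push @chunks, $chunk;
}
```

The header arrives as its own chunk, so a streaming server can push it to the browser before anything slow gets asked for.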
A FISTFUL OF OBJECTS
At the core of an HTML::Zoom system lurks an HTML::Zoom::ZConfig object, whose purpose is to hang on to the various bits and pieces that things need so that there's a common way of accessing shared functionality.
Were I a computer scientist I would probably call this an "Inversion of Control" object - which you'd be welcome to google to learn more about, or you can just imagine a computer scientist being suspended upside down over a pit. Either way works for me, I'm a pure maths grad.
The ZConfig object hangs on to one each of the following for you:
An HTML parser, normally HTML::Zoom::Parser::BuiltIn
An HTML producer (emitter), normally HTML::Zoom::Producer::BuiltIn