Chapter 12: Custom Mason Subclasses- P2
Output: Compiling to a Different Output
So you've decided that you really hate Mason and you want to use Embperl
instead. But you have a number of Mason components you've already
written that you'd like to save. Well, you can create your own compiler to
generate Embperl code from Mason. In this case, we'll use the lexer as is and
rewrite the compiler from scratch. There isn't really a one-to-one match
between Mason and Embperl's features so this example will, like the lexer
example, be limited in scope. Finding an intelligent way to convert Mason's
methods and subcomponents to Embperl is beyond the scope of this book.
In case you are unfamiliar with Embperl, it uses the following syntax: [+
+] tags contain code whose results should be sent to the browser, like
Mason's substitution tag (<% %>). The [* *] tags contain Perl code that is
not intended to generate output. This is equivalent to Mason's % -lines and
<%perl> blocks. Finally, Embperl also has a [! !] tag similar to Mason's
<%once> block.
There are other Embperl tags but, once again, this is a simplified example.
Embperl does have a feature similar to Mason's inheritance system called
EmbperlObject, but translating between the two is nontrivial.
So let's make our new compiler:
package HTML::Mason::Compiler::ToEmbperl;
$VERSION = '0.01';
use strict;
use HTML::Mason::Lexer;
use HTML::Mason::Exceptions ( abbr =>
[qw(syntax_error)] );
use HTML::Mason::Compiler;
use base qw(HTML::Mason::Compiler);
This pulls in the basic packages we'll need. Even though we really aren't
inheriting much from HTML::Mason::Compiler , we still subclass it as
anything expecting a compiler will check that what it is given is a subclass
of HTML::Mason::Compiler.
Of course, in our case, we won't be using this compiler with the
HTML::Mason::Interp class, so the point is moot but important to
mention.
sub compile {
my ($self, %p) = @_;
$self->lexer->lex( comp_source =>
$p{comp_source},
name => 'Embperl',
compiler => $self );
return $self->component_as_embperl;
}
The only parameter we expect is
comp_source. We tell the lexer the name of the component is
'Embperl' since we don't really care what the name is in this context.
Presumably we are being called by some sort of script that is simply going to
take the Embperl-ized component and write it to disk somewhere. The name
is used for reporting syntax errors when a component is run, but that won't
be an issue in this case.
sub start_component {
my $self = shift;
$self->{once_header} = '';
$self->{header} = '';
$self->{body} = '';
$self->{footer} = '';
$self->{current_block} = '';
}
This method is called to give the compiler a chance to reset its state, so that's
what we do.
We will be storing blocks of code in each of the first four attributes. When
we encounter a <%once> block, it will go in the once_header attribute.
For <%init> blocks, we can put then in the header attribute. % -lines,
<%perl> blocks, <%text> blocks, substitution tags, and text will be
placed immediately into the body attribute. Finally, any <%cleanup>
blocks will go into the footer attribute.
The current_block() attribute will be used to keep track of what type
of block we are in after a call to our start_block() method.
This example will ignore other Mason syntax such as component calls,
subcomponents, methods, and <%shared>. Again, this will be left as an
exercise for the reader.
sub start_block {
my ($self, %p) = @_;
syntax_error "Cannot nest a $p{block_type}
inside a $self->{in_block} block"
if $self->{in_block};
This is to make sure that the component is following the syntax rules we
expect.
$self->{in_block} = $p{block_type};
}
Then we record what kind of block we are starting, which will be something
like init or perl .
The next method, raw_block() , is called for all of the blocks that we
handle except the <%text> block:
sub raw_block {
my ($self, %p) = @_;
for ($self->{in_block}) {
/^once$/ and $self->{once_header}
.= $p{block};
/^init$/ and $self->{header}
.= $p{block};
/^perl$/ and $self->{body}
.= "[* $p{block} *]";
/^cleanup$/ and $self->{footer}
.= $p{block};
}
}
This switchlike statement stores the code given to us in the appropriate
attribute. If it is a <%perl%> block, we wrap it in the relevant Embperl tag;
otherwise, we simply store it as is in the appropriate slot.
sub text_block {
my ($self, %p) = @_;
$self->{body} .= $p{block};
}
sub text {
my ($self, %p) = @_;
$self->{body} .= $p{text};
}
The first method is called when the lexer finds a<%text> block. The
second is called for regular text. Both of these get placed into the body
attribute for later use.
sub substitution {
my ($self, %p) = @_;
$self->{body} .= "[+ $p{substitution} +]";
}
This method handles substitution tags (<% %>) though it ignores the fact
that this method can also be given an escape
parameter. This could be handled via Embperl's $escmode variable (again,
left as an exercise for the reader).
sub perl_line {
my ($self, %p) = @_;
$self->{body} .= "[* $p{line} *]";
}
This method is called for % -lines.
Then we need to implement the end_block() method:
sub end_block {
my ($self, %p) = @_;
syntax_error "end of $p{block_type}
encountered while in $self->{in_block} block"
unless $self->{in_block} eq
$p{block_type};
Another sanity check is in the start_block() method. It's always a
good thing to make sure that the lexer is giving us the kind of input that we
would expect.
$self->{in_block} = undef;
}
And we reset our in_block attribute so that the next call to
start_block() succeeds.
The last method to implement is the component_as_embperl()
method, which simply will return a big block of text, our new Embperl page:
sub component_as_embperl {
my $self = shift;
my $page = '';
if ( length $self->{once_header} ) {
$page .= "[! $self->{once_header} !]\n";
}
if ( length $self->{header} ) {
$page .= "[* $self->{header} *]\n";
}
if ( length $self->{body} ) {
$page .= "$self->{body}\n";
}
if ( length $self->{footer} ) {
$page .= "[* $self->{footer} *]\n";
}
return $page;
}
And there you have it a perfectly good Mason component brutally
butchered and turned into an Embperl page. I hope you're happy with
yourself!
Storage: Replacing the Resolver
Occasionally, people on the Mason users list wonder if they can store their
component source in an RDBMS. The way to achieve this is to create your
own HTML::Mason::Resolver subclass.
The resolver's job is take a component path and figure out where the
corresponding component is.
We will show an example that connects to a MySQL server containing the
following table:
MasonComponent
path VARCHAR(255) PRIMARY KEY
component TEXT NOT NULL
last_modified DATETIME NOT NULL
Our code starts as follows:
package HTML::Mason::Resolver::MySQL;
$VERSION = '0.01';
use strict;
use DBI;
use Params::Validate qw(:all);
use HTML::Mason::ComponentSource;
use HTML::Mason::Resolver;
use base qw(HTML::Mason::Resolver);
__PACKAGE__->valid_params
(
db_name => { parse => 'string', type =>
SCALAR },
user => { parse => 'string', type =>
SCALAR, optional => 1 },
password => { parse => 'string', type =>
SCALAR, optional => 1 },
);
These parameters will be used to connect to the MySQL server containing
our components. Readers familiar with the Perl DBI will realize that there
are a number of other parameters that we could take.
Our constructor method, new(), needs to do a bit of initialization to set up
the database connection, so we override our base class's method:
sub new {
my $class = shift;
my $self = $class->SUPER::new(@_);
We invoke the new() method provided by our superclass, which validates
the parameters in @_ and makes sure they get sent to the right contained
objects. The latter concern doesn't seem so important in this case since we
don't have any contained objects, but the point is that if somebody
subclasses our HTML::Mason::Resolver::MySQL class and adds
contained objects, our new() method will still do the right thing with its
parameters.
Now we connect to the database in preparation for retrieving components
later:
$self->{dbh} =
DBI->connect
( "dbi:mysql:$self->{db_name}",
$self->{user}, $self->{password}, {
RaiseError => 1 } );
return $self;
}
A resolver needs to implement two methods left unimplemented in the
parent HTML::Mason::Resolver class. These are get_info() and
glob_path(). The first is used to retrieve information about the
component matching a particular component path. The second takes a glob
pattern like /path/* or /path/*/foo/* and returns the component
paths of all the components that match that wildcard path.
Additionally, if we want this resolver to be usable with the ApacheHandler
module, we need to implement a method called
apache_request_to_comp_path() , which takes an Apache object
and translates it into a component path.
Given a path, we want to get the time when this component was last
modified, in the form of a Unix timestamp, which is what Mason expects:
sub get_info {
my ($self, $path) = @_;
my ($last_mod) =
$self->{dbh}->selectrow_array
( 'SELECT
UNIX_TIMESTAMP(last_modified)
FROM MasonComponent WHERE path =
?',
{}, $path );
return unless $last_mod;
If there was no entry in the database for the given path, we simply return,
which lets Mason know that no matching component was found:
return
HTML::Mason::ComponentSource->new
( comp_path => $path,
friendly_name => $path,
last_modified => $last_mod,
comp_id => $path,
source_callback => sub { $self-
>_get_source($path) },
);
}
The get_info() method returns its information in the form of a
HTML::Mason::ComponentSource object. This is a very simple class
that holds information about a component.
Its constructor accepts the following parameters:
• comp_path
This is the component path as given to the resolver.
• friendly_name
The string given for this parameter will be used to identify the
component in error messages. For our resolver, the component path
works for this parameter as well because it is the primary key for the
MasonComponent table in the database, allowing us to uniquely
identify a component.
For other resolvers, this might differ from the component path. For
example, the filesystem resolver that comes with Mason uses the
component's absolute path on the filesystem.
• last_modified
This is the last modification time for the component, as seconds since
the epoch.
• comp_id
This should be a completely unique identifier for the component.
Again, since the component path is our primary key in the database, it
works well here.
• source_callback
This is a subroutine reference that, when called, returns the source text
of the component.
Mason could have had you simply create an
HTML::Mason::ComponentSource subclass that implemented
a source() method for your resolver, but we thought that rather
than requiring you to write such a do-nothing subclass, it would be
easier to simply use a callback instead.
Our _get_source() method is trivially simple:
sub _get_source {
my $self = shift;
my $path = shift;
return
$self->{dbh}->selectrow_array
( 'SELECT component FROM
MasonComponent WHERE path = ?', {}, $path );
}
• comp_class
This is the component class into which this particular component
should be blessed when it is created. This must be a subclass of
HTML::Mason::Component. The default is
HTML::Mason::Component.
• extra
This optional parameter should be a hash reference. It is used to pass
information from the resolver to the component class.
This is needed since an HTML::Mason::Resolver subclass and
an HTML::Mason::Component subclass can be rather tightly
coupled, but they must communicate with each other through the
interpreter (this may change in the future).
Next is our glob_path() method:
sub glob_path {
my $self = shift;
my $pattern = shift;
$pattern =~~ s/*/%/g;
The pattern given will be something that could be passed to Perl's glob()
function. We simply replace this with the SQL equivalent for a LIKE
search:
return
$self->{dbh}->selectcol_array
( 'SELECT path FROM MasonComponent
WHERE path LIKE ?', {}, $pattern );
}
Then we return all the matching paths in the database.
Since we may want to use this resolver with ApacheHandler, we will also
implement the apache_request_to_comp_path() method:
sub apache_request_to_comp_path {
my $self = shift;
my $r = shift;
my $path = $r->uri;
return $path
if $self->{dbh}->selectrow_array
( 'SELECT 1 FROM MasonComponent WHERE
path = ?', {}, $path );
return undef unless $r->path_info;
$path .= $r->path_info;
return $path
if $self->{dbh}->selectrow_array
( 'SELECT 1 FROM MasonComponent WHERE
path = ?', {}, $path );
return undef;
}
We generate a component path by taking the requested URI and looking for
that in the database. If it doesn't exist, we will try appending the path info if
possible or just give up. Finally, we try the altered path and, if that doesn't
exist either, we just give up and return undef, which will cause the
ApacheHandler module to return a NOT FOUND status for this request.
That's it, all done. And nothing left as an exercise for the reader this time.
As with the lexer, this can be used either via a httpd.conf
directive:
PerlSetVar MasonResolverClass
HTML::Mason::Resolver::MySQL
or by passing the resolver_class parameter to the new() method for
HTML::Mason::Interp.
Request: A Request Object with a Built-in Session
Wouldn't it be cool to have a request object with a built-in session? "Yes, it
would," you answer. "Child's play," we say.
When a request is made using this object, it should either find an old session
or create a new one. Then in our components we will simply call $m-
>session() to get back a hash reference that will persist between
requests.
For simplicity's sake, we won't make this class configurable as to what type
of session to use, though it could be done.
3
package HTML::Mason::Request::WithSession;
$VERSION = '0.01';
use strict;
# Import a subroutine error( ) which throws an
HTML::Mason::Exception
# object
use HTML::Mason::Exceptions ( abbr => [ 'error' ]
);
use HTML::Mason::ApacheHandler;
use base qw(HTML::Mason::Request);
One problem unique to subclassing to the Request object is that Mason
already comes with two of its own Request subclasses. These are
HTML::Mason::Request::ApacheHandler and
HTML::Mason::Request::CGIHandler, which are used by the
ApacheHandler and CGIHandler, respectively.
In order to cooperate with the ApacheHandler and CGIHandler modules, we
want to subclass the appropriate class. However, we can't know which one to
subclass when we are loaded, because it is possible that we will be loaded
before the ApacheHandler or CGIHandler module. We'll take care of this in
our new() method, which will be discussed momentarily.
Our session will be implemented using cookies and Cache::FileCache
for storage, just as we saw in Chapter 11
:
use Apache::Cookie;
use Cache::FileCache;
use Digest::SHA1;
We solve our subclassing problem with the following code. There is nothing
wrong with changing a class's inheritance dynamically in Perl, so that's what
we do. The alter_superclass() method is provided by the
HTML::Mason::Request base class, and does the right thing even given
multiple inheritance. It also cooperates with Class:Container to make
sure that it sees any changes made to the inheritance hierarchy:
sub new {
my $class = shift;
$class->alter_superclass(
$HTML::Mason::ApacheHandler::VERSION ?
'HTML::Mason::Request::ApacheHandler' :
$HTML::Mason::CGIHandler::VERSION ?
'HTML::Mason::Request::CGI' :
'HTML::Mason::Request' );
return $class->SUPER::new(@_);
}
We make a session, call exec() in our parent class, taking care to preserve
the caller's scalar/list context, and then save the session. If an exception is
thrown, we simply rethrow it:
sub exec {
my $self = shift;
$self->_make_session;
my @result;
if (wantarray) {
@result = eval { $self->SUPER::exec(@_)
};
} elsif (defined wantarray) {
$result[0] = eval { $self-
>SUPER::exec(@_) };
} else {
eval { $self->SUPER::exec(@_) };
}
# copy this in case _save_session overwrites
$@
my $e = $@;
$self->_save_session;
die $e if $e;
return wantarray ? @result : defined
wantarray ? $result[0] : undef;
}
Making a new session for subrequests is probably incorrect behavior, so we
simply reuse our parent's session object if a subrequest is exec()'d:
sub _make_session {
my $self = shift;
if ( $self->is_subrequest ) {
$self->{session} = $self->parent_request-
>session;
return;
}
This code is pulled almost verbatim from Chapter 11:
my %c = Apache::Cookie->fetch;
my $session_id =
exists $c{masonbook_session} ?
$c{masonbook_session}->value : undef;
$self->{session_cache} =
Cache::FileCache->new( { cache_root =>
'/tmp',
namespace =>
'Mason-Book-Session',
default_expires_in => 60 * 60 * 24, # 1 day
auto_purge_interval => 60 * 60 * 24, # 1 day
auto_purge_on_set => 1 } );
my $session;
if ($session_id) {
$session = $self->{session_cache}-
>get($session_id);
}
unless ($session) {
$session = { _session_id =>
Digest::SHA1::sha1_hex( time, rand, $$ ) };
}
Apache::Cookie->new( $self->apache_req,
name =>
'masonbook_session',
value => $session-
>{_session_id},
path => '/',
expires => '+1d',
)->bake;
$self->{session} = $session;
}
Also just like Chapter 11
:
sub _save_session {
my $self = shift;
$self->{session_cache}->set
( $self->{session}{_session_id} => $self-
>{session} );
}
And to finish it off, a simple accessor method:
sub session { $_[0]->{session} }
Wow, nice and simple. Of course, this would need to be customized for your
environment, or you can use the previously mentioned
HTML::Mason::Request::WithApacheSession module available
from CPAN.
Once again, you have two options to use this new subclass. If you are
configuring Mason via your httpd.conf
file, do this:
PerlSetVar MasonRequestClass
HTML::Mason::Request::WithSession
or in your handler.pl
you can load the module and then pass a
request_class parameter to the HTML::Mason::ApacheHandler
class's constructor.
Argument Munging: ApacheHandler
One of the main reasons that you might consider creating your own
ApacheHandler class is to change the way arguments are processed. For
example, we might want to create objects based on certain objects.
Our subclass starts like many others:
package HTML::Mason::ApacheHandler::AddObjects;
$VERSION = '0.01';
use strict;
use HTML::Mason::ApacheHandler;
use base qw(HTML::Mason::ApacheHandler);
This should look pretty familiar. Now we'll load a few more classes, which
we'll be using to create objects:
use Date::ICal; # date object
use MyApp::User; # user object
And now we override the argument-processing subroutine,
request_args():
sub request_args {
my $self = shift;
my ($args, $r, $cgi_object) = $self-
>SUPER::request_args(@_);
ApacheHandler's request_args() method returns three items. The first
is a hash reference containing the arguments that will be passed to the
component. The second is the Apache or Apache::Request object for
the current request, and the third is a CGI.pm object. The CGI.pm object is
created only when the ApacheHandler's args_method attribute is set to
CGI .
if ( exists $args->{epoch} ) {