Here's a presentation I did for the Japanese Perl Association on April 21st, 2009.
It covers 10 aspects of Catalyst that are very useful but may not be documented or discussed as much as they deserve.
The first topic is Catalyst plugins, authoring them and using them.
The simple advice is to stop. Don’t use plugins. They’re grossly abused, misused, overused and generally wrong.
There is only one single case when it is ok to use Plugins. Only one. No more, and no exceptions. When you need to inspect, modify or otherwise muck with dispatching. This includes parameter munging, altering the Catalyst chain of execution and things like this.
There are so many Catalyst::Plugin modules that shouldn’t be out there. While they let you add methods onto your context ($c) object, that’s just bad form. Most of them should be models, and some just shouldn’t exist (Catalyst::Plugin::Message!).
Ok, there are exceptions to the rule. Authentication and Session are probably valid, but there are reasons behind those exceptions. You need to inflate the session before even the root-level begin action is called (for cache-based requests), and authorization makes sense as a plugin because it bypasses dispatching at the application level, which requires the user to already be inflated.
In all these, you could do them as controllers. A root level begin action could handle inflating the session, restoring the user and then running the access controls to determine validity.
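Roughly, such a root-level begin could look like this — a sketch only, where the model names (Session, Users, ACL) and their methods are invented for illustration:

```perl
package MyApp::Controller::Root;
use strict;
use warnings;
use base 'Catalyst::Controller';

__PACKAGE__->config( namespace => '' );

# Runs before every request is dispatched further
sub begin : Private {
    my ( $self, $c ) = @_;

    # Inflate the session, then restore the user from it
    my $session = $c->model('Session')->inflate( $c->request );
    my $user    = $c->model('Users')->from_session($session);

    # Run the access controls to determine validity
    $c->detach('/access_denied')
        unless $c->model('ACL')->allows( $user, $c->action );

    $c->stash( session => $session, user => $user );
}

1;
```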
Convenience methods like $c->user and $c->session are worth the Plugin. However, unless you’re writing something that is completely global, the exception to the “Don’t write plugins” rule probably won’t hold up. If you write plugins, and then upload them to CPAN, the Catalyst core team dies a little bit inside. Especially MST.
Up next is debugging your Catalyst application. On the mailing list, a lot of people post problems and don’t really know how to get enough information on their own to solve them. I’m just going to run down a common set of the problems that users have.
A syntax error in your code can get masked while method attribute parsing is happening, so Catalyst reports an attribute error instead of the real problem. There are patches to fix this, but a lot of repositories don’t have them yet. Running perl -cw on the file will still show you the real syntax error, but it’s frustrating and annoying.
You can get a lot of information from Catalyst when running in debug mode. Aside from what shows up in the console, you can run the request and then dump everything out to the screen in a nicely formatted view. This can be very good for writing automated tests to inspect the stash and configuration, or just fixing that weird bug when you don’t want to modify templates to put in debugging statements.
Writing mechanize tests that use Test::WWW::Mechanize::Catalyst is really easy, and they quickly expose bugs in controllers. I highly recommend doing this; then when your code breaks and you need debugging help, you can say, “I have this test”, and the Catalyst community will jump at the chance to help someone with the foresight to post the problem with a test.
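A minimal live test might look like this (the application name, paths and page content are placeholders):

```perl
# t/live_basic.t
use strict;
use warnings;
use Test::More tests => 2;

# Loads MyApp in-process; no running server needed
use Test::WWW::Mechanize::Catalyst 'MyApp';

my $mech = Test::WWW::Mechanize::Catalyst->new;
$mech->get_ok( '/', 'front page loads' );
$mech->content_contains( 'Welcome', 'greeting is present' );
```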
Every Catalyst component, including the application (MyApp.pm), has a config accessor. As a rule, only modify this BEFORE setup is completed. After that, you can get heisenbugs. You’ve been warned. So, this is how you configure something in a controller.
To configure that same controller at a higher precedence, you do it at the application level. These two configuration examples are identical and produce the same results.
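As a sketch, the two equivalent forms (controller, key and value are placeholders):

```perl
# In the controller itself:
package MyApp::Controller::Foo;
use base 'Catalyst::Controller';
__PACKAGE__->config( foo => 'bar' );

# ...or, at higher precedence, in the application class:
package MyApp;
__PACKAGE__->config(
    'Controller::Foo' => { foo => 'bar' },
);
```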
And to get at it, the big thing is simply to use accessors to read the configuration key.
A Moose example to create the ‘foo’ accessor, since we have the ‘foo’ config key (that was set to ‘bar’)
Or the old school way
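Both styles, sketched for the ‘foo’ key above:

```perl
package MyApp::Controller::Foo;
use Moose;
BEGIN { extends 'Catalyst::Controller' }

# Moose style: the merged 'foo' config key is passed to the
# constructor, so the attribute picks it up automatically
has foo => ( is => 'ro' );

# The old school style would instead be:
#   __PACKAGE__->mk_accessors('foo');
# after which $self->foo returns the same merged value.

1;
```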
And yes, config keys are merged into $self in a Catalyst component.
So, you have the hash entry ‘foo’ that is set for you in your component to work with. This is merged in the way you would expect.
So if you have the configuration specified in the myapp_local file, that gets set in $self->{foo}. It just works down from there, traversing into the myapp.conf file and then the application configuration and finally the controller itself.
But, people still do this wrong.
This is wrong. While it will -probably- be right, don’t do it. Simply use $self->{foo}. That is what you want, and in all cases it will be the right value.
But, that case will almost always be correct. This case, however, will most likely not be. At component creation time, the configuration is merged into $self already. So use that.
As you can see, the rule here is to always use $self for configuration values that are specific to a component. Application level configuration can be inspected from $c->config, but only generalized configuration.
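In other words (key names from the earlier example):

```perl
# Wrong: reaching back into the application config from inside a
# component. It only *happens* to work when nothing overrides the key.
# my $foo = $c->config->{'Controller::Foo'}{foo};

# Right: the merged value is already on $self at component creation time
my $foo = $self->{foo};    # or $self->foo, with an accessor
```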
Once you get an understanding of configuration, you really should use it and abuse it. Doing so will really enhance your applications and hopefully get you writing less code that does more. Which leads conveniently to the next point, number 4.
What makes a better controller? Well, to me, the thinner the better. The best way to do this is base classes.
A lot of controllers fit into specific types of behaviors, most of which are configurable. With the knowledge you now have of configuration, you can make very thin controllers that inherit from thicker base controllers and still delegate as much as possible to the models.
In many cases, your classes can really be nothing more than a config block. If you have a DBIC CRUD application, this is very easy to accomplish.
There’s a package on CPAN that does this exactly, and gets you very far with nothing more than configuration blocks. I urge you to give it a try, as for most cases it works very well. Since it’s a base class, you can override any methods it uses. It uses Chained, so it’s very simple to alter behavior.
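A config-only controller on top of such a base class might look like this (the base class name and its config knobs are hypothetical, not the actual CPAN package):

```perl
package MyApp::Controller::Book;
use base 'MyApp::ControllerBase::CRUD';    # hypothetical CRUD base

__PACKAGE__->config(
    model         => 'DB::Book',             # which DBIC model to CRUD over
    create_fields => [qw( title author )],   # assumed knob on the base
);

1;
```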
Chained is a simple dispatch method that unfortunately confuses a lot of people, and that’s because method attributes are terrible. Maybe not, but that’s my theory.
The entire methodology is a route, letting you incrementally travel from point A to B. The steps can be easily modified and logged, so it’s very very powerful.
It does, however, require you to stop thinking about actions being part of your URI. No more thinking that your action name has to correspond to the URI.
Before Chained, you have a controller, say “Bar”, with a method foo and that becomes the URI. You just have /bar/foo. This, however, is silly.
Your URI is your publicly facing interface; it shouldn’t matter what your private action is called. And it certainly shouldn’t have to track every refactoring step, as common code invariably gets moved out into several private actions.
Right, you see, method attributes suck.
Point A to B is a bad example, because you just have two points. But let’s say you have a real use case here, where you want to test user permissions, then test whether an object exists (like /foo/123) and then run some action based off that in a nice CRUD format. So you start with a URI
Before Chained, you’d have a hard time building a URL like this. That’s not really why you want to use Chained though, you want to use Chained because you can get executable actions at every step of the chain. At /foo, at the 123 and finally at the edit.
You get something like auto actions, except at every step of the chain. Since I’ve switched to Chained, I’ve not used an auto action. Having the power to incrementally build your pipeline and relationships like this is awesome.
If you have an action ‘setup_object’, you tell Catalyst where it is chained from. Here, we’re using “/”, which is the root action. By default, the path part is also the method name, but it can be overridden with the PathPart attribute.
The CaptureArgs attribute is the most important part of Chained. It determines exactly how many arguments to intercept in the dispatch chain. Looking back at the URL, CaptureArgs(1) captures the ‘123’.
The ‘123’ is then passed in to the action’s arguments.
So, you can catch it and then dispatch accordingly. This is usually where you would fetch the object from the database, check permissions and other actions. However, at this point we do not yet have an end point that is accessible to the user. To define the end point, we have to define additional methods.
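The mid-chain action for that URI could be sketched like this (the model name and the /not_found action are assumptions):

```perl
# Chained from the root ('/'), matching the 'foo' path part and
# capturing one argument -- the '123' in /foo/123/edit
sub setup_object : Chained('/') PathPart('foo') CaptureArgs(1) {
    my ( $self, $c, $id ) = @_;

    # Fetch the object and check permissions here
    my $object = $c->model('DB::Foo')->find($id)
        or $c->detach('/not_found');

    $c->stash( object => $object );
}
```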
Here is the end point defined, and it chains itself to the other method
We use the private action path in the Chained argument, which is in most cases simply the method name.
The Args, not CaptureArgs, defines that this method is an end-point.
The end point’s path part, which can also be overridden with PathPart, comes at the end of the URI -- after the captured arguments. In this case, our method is called ‘edit’, so that’s what we end up with.
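Putting it together, a sketch of the end point:

```perl
# Chains from setup_object above; Args(0) makes this an end point.
# Together the two actions dispatch URIs like /foo/123/edit.
sub edit : Chained('setup_object') PathPart('edit') Args(0) {
    my ( $self, $c ) = @_;

    # setup_object already stashed the inflated object
    my $object = $c->stash->{object};
}
```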
I know these slides don’t let you understand it fully, but that’s ok. I just want to convince you to try it, and I have an application for you, just for this purpose.
Look at my Catalyst examples on github, you can play around with some Chained examples that are nicely documented and hopefully get you started.
Server-managed FastCGI means that your webserver (Apache) spawns the FastCGI processes and handles the process management directly. This requires restarting your webserver to restart your application. I don’t like that.
I prefer the external method: using a separate process manager, starting the FastCGI daemon yourself and having it listen on a unix socket. This has a couple of benefits.
The best one is zero downtime, achieved by the wonderful facility of unix sockets having multiple applications listening to them. So, the technique is simply “Start the new version before you shutdown the old one.” If you have a really catastrophic change in your application, a cold start is probably best but it still is a fantastic way to upgrade your application without incurring a penalty when the FastCGI socket dies.
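A hedged sketch of that upgrade dance (all paths, the pid-file handling and the sleep are assumptions; the flags are the standard Catalyst fastcgi script options):

```shell
#!/bin/sh
# Start the new version on the shared unix socket, then retire the old.
APP=/var/www/myapp

# Bring up the new daemon alongside the old one
$APP/script/myapp_fastcgi.pl --listen /tmp/myapp.socket \
    --nproc 5 --pidfile $APP/run/myapp.pid.new --daemon

# Give it a moment to finish starting, then stop the old processes
sleep 5
kill "$(cat $APP/run/myapp.pid)"
mv $APP/run/myapp.pid.new $APP/run/myapp.pid
```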
The second best reason to use external FastCGI is that you are an independent, free-thinking entity, and your application should be too. You shouldn’t have to restart your web server to restart your application. There are a few init scripts floating around, and I’ve posted my favorites on http://our.coldhardcode.com/
There is a simple “just start, restart and be done” script, as well as a zero-downtime script. Both are configurable, and I really want to write something that manages the processes better. If you want to, let me know and we can share brains.
We’ve all heard of mod_perl, and some of us have even used it.
But you should already know enough about mod_perl to use it. I’m not saying it is bad, I’m just saying that if you don’t know that you need it, you probably don’t need it. So lets move on.
HTTP::Prefork is a nice Perl-based preforking server, very similar to Mongrel, but it still uses the Catalyst scripts for process management. I’ve tried to use Mongrel, and it pisses me off. HTTP::Prefork hasn’t yet, and it still offloads what it can to C-level functions (and XS), so it’s very fast, lightweight and works well. You don’t get the middle-man load of FastCGI, and you don’t lose the capability to independently restart your application.
While the code base is pretty robust, stable and written by some smart people who know this stuff, I still don’t know of any high-profile, high-traffic sites deployed on HTTP::Prefork. I’ve used it, and sitting it behind something like nginx is fantastic, but again, only for my own internal applications and nothing with a real user base. So please, give it a try and let us know if it falls down. It has some strong merits, but some other points remain to be solved.
The one that matters to me is that it loses the unix socket zero downtime trick. While there are other ways to do it, such as load balancing between two ports and upgrading each port incrementally, it isn’t as easy to get done as External FastCGI.
Catalyst::Log is a very good start for a basic logging mechanism. It provides a rudimentary buffer, with a flush mechanism and the typical debug levels.
An enhanced logging package already on CPAN is Catalyst::Log::Log4perl, which really brings out a lot of power. You get all of the power of Log::Log4perl, but it still lives at $c->log.
Even if you use a logger that has all sorts of fantastic methods, you probably don’t want to use them. Using the standard debug, info, warn, error and fatal methods gets you what you want, and there really aren’t a lot of convincing arguments against that. If you adhere to those methods, you can swap loggers in and out without having to update your app.
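For example, sticking to the portable five:

```perl
# These methods exist on every Catalyst logger, so swapping in
# Catalyst::Log::Log4perl later requires no code changes
$c->log->debug('entering the list action');
$c->log->info('cache rebuilt');
$c->log->warn('falling back to defaults');
$c->log->error('backend unreachable');
$c->log->fatal('cannot continue');
```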
The fantastic _dump method gives you a pretty-printed output similar to what you’d get with Data::Dump (and uses Data::Dump under the hood).
This is an undocumented “private” method because it deviates from the standard logging methods above, but Catalyst::Log and Catalyst::Log::Log4perl both support it. It emits the message as an info message.
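Usage is a one-liner:

```perl
# Pretty-print any structure into the log at info level
$c->log->_dump( $c->stash );
```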
Action classes are eventually going to be replaced with roles, which is a much better way of doing things. You can skip ahead, and reap the benefits by using...
This exchanges the idea of Action Classes and replaces them with roles, which is really what an action class is (describing that it does something). Catalyst::Controller::ActionRole is on CPAN, and available for playing with.
You can ignore this if you don’t already fully understand roles or use them. They’re very useful, but I’m trying to keep the Meta to a minimum.
Action classes can turn into Roles, which works out a bit better as far as semantic coding.
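A sketch of what that looks like (the Logged role name is hypothetical; by convention it would live under MyApp::ActionRole::):

```perl
package MyApp::Controller::Foo;
use Moose;
BEGIN { extends 'Catalyst::Controller::ActionRole' }

# Does() applies the role to this one action, wrapping its execution
sub edit : Local Does('Logged') {
    my ( $self, $c ) = @_;
    # normal action body here
}

1;
```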
If you want to see more, there is plenty of information out there.
So, back to action classes. These are very useful because they essentially wrap the action’s execution in a method that you control. A very popular action class is the REST action class, which takes the action name, inspects the request method (like POST, PUT) and then dispatches to another method specifically for the request.
You usually want one when you have some “meta” behavior that happens around the action. Perhaps it is parameter munging on a per-action basis that isn’t suitable for a plugin. I recently wrote an ActionClass that fetches page properties for the specific action, so that marketing people can modify certain aspects of the action’s behavior in a GUI. The action class then populates the stash with some defaults. While you can do this with a model, having the action itself take on that role just cleans up some syntax, and also lets you interrupt dispatching in more powerful ways that still keep your controllers thin.
The main point of local::lib is that it separates your application from the vendor supplied paths. What this generally means is that on a per-user basis, you can have your own perl lib tree. Stuffing everything in the vendor perl has a lot of limitations, and it just isn’t a good way to do it. You’re really bound to whatever version of perl, bugs and all, the vendor supplies because updating Perl may break your entire system. Additionally, security patches supplied by your vendor may break Perl (Like Apple’s latest security update).
You can do it, and most people do with few negative side-effects. It doesn’t make it good, and considering how easy local::lib is to get going, there really isn’t a reason to not use it.
In your application, if you keep your Makefile.PL (or Build.PL) up to date, you can install all the dependencies as the user that is running the application. You’re guaranteed that the packages you install via local::lib are those running your application, so on a multi-user system (or multi-application, with one application per user) you can run different versions of different software.
In most cases, your application shouldn’t run as root. You shouldn’t have your dependencies requiring root, either.
If you use external FastCGI and local::lib, you can run and deploy your entire application without ever touching the root account.
To set up local::lib, just install it from CPAN. When you load local::lib, it prints out the environment variable settings to use. Simply store those in your bash or csh profiles and you are set to go. Installing modules from CPAN or from a tarball then puts them in a local, sanitary directory.
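A sketch of the bootstrap, using the default ~/perl5 target:

```shell
# Install local::lib itself
perl -MCPAN -e 'install local::lib'

# Persist its environment settings for every new shell
echo 'eval "$(perl -I$HOME/perl5/lib/perl5 -Mlocal::lib)"' >> ~/.bashrc
. ~/.bashrc

# Installs now land in ~/perl5 instead of the vendor tree
perl -MCPAN -e 'install Catalyst::Runtime'
```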
Want to start from scratch? No problem. Your system Perl is still intact underneath, so removing your local::lib directory won’t destroy Perl, and you can begin again cleanly.
local::lib defaults to a ‘perl’ directory that lives in your home directory. However, it is very trivial to have a lot of local::lib paths, and to even switch between them.
This is particularly useful for doing smoke testing, if you have multiple environments.
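Switching trees is just a matter of re-running the eval with a different target (the directory names here are examples):

```shell
# Work against a stable dependency tree
eval "$(perl -Mlocal::lib=~/perl5-stable)"

# ...or point the same shell at a smoke-testing tree instead
eval "$(perl -Mlocal::lib=~/perl5-smoke)"
```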
Contributing to Catalyst is easy. There’s a lot of things you could do, and a lot of them don’t involve a lot of code.
Documentation is always good
Tests are even better
But getting the word out, showing off applications and simply doing something with Perl and Catalyst is the best thing you can do to help Catalyst and the Perl community. Perl in general is languishing because we’re very inventor-heavy. We have a lot of big names who work on theory and build tools, but not a lot of people who are out building applications.
So, if you find something cool to do with Catalyst then blog about it.
If you have an application out the door, tell the world it is running on Catalyst. Help other people out by showing how you did it, and even discussing some problems. This has a lot of benefits, and it helps get your name out there and even helps hire better developers.
The more applications running and built on Perl, the more developers will be using Perl. It’s viral.
If you want to help Perl and the Catalyst community, build end-user sites. Talk about how you did it.
You’ll get better developers, and more of them.
A big reason why people assume Perl is dead is that too few new sites talk about using Perl. I know a lot of developers who don’t even know that Vox is written in Perl (let alone Catalyst). There are reasons for this.
See, this is good for me but bad for Vox. Perl isn’t any better, but I’m not the top result so you don’t get to see it.
Talk about how you build your site, and you will get better search results. Of course, if you’re working on an internal site you can ignore this point. But don’t ignore the last 9!