To jettison Geronimo
The technically.us server hosts a number of Java webapps, including the Typeturner software that runs this weblog. It’s a decent proving ground: it’s a real server that runs continuously for months, but downtime does not result in angry phone calls from paying clients. Brisk e-mails about the charity work are as bad as it gets.
Several months back I fell hard for Apache Geronimo, that Java EE app server that no one uses for some reason. I didn’t want an EE app server either, but I was looking for something that supported context reloading. Jetty didn’t in its standard configuration, and reloading in Tomcat had given me trouble. Plus, Tomcat’s admin screens feel unsettlingly like 1998.
Lock and reload
Context reloading makes a lot of sense. If you’re running multiple contexts on one server—which is what contexts are billed for—rolling out an update to one application context should not disrupt service to the rest. But since its inception, this important feature has been effectively broken in every servlet container: reloads leak memory and unpredictably swamp the server and all servlet contexts.
And since the year 2000 at least, container programmers have been excusing that failure by reprimanding users for daring to use the context reloading feature in production. As if reloading had any great purpose in development! Rarely does one need to code or debug more than one application context at a time, and the overhead for bouncing the entire server versus only the context is negligible (if not negative). And for that matter, who wants memory leaks and unpredictable behavior in development, either?
Despite my problems with reloading under Tomcat, I was determined to find a container that could do it reliably. So I tried out Geronimo (despite its J2EE horn-blowing) and was immediately pleased to see that it could run Jetty internally, reload contexts, and present a 21st century web interface. I switched over to it for all my home Java serving, and even wrote up Databinder deployment instructions recommending Geronimo.
It took me a while to get back to reality. The first step came when I set up permanent hosting for a client site: even Geronimo’s “Little G” variant consumed significantly more memory than plain Jetty, and it’s foolish to waste any when you only have 128MB total. The second step was realizing that no amount of maximum PermGen would be enough to permit the daily context rollouts I do for Typeturner when I add features, without eventually crashing the entire app server. New face, old problems.
Garbage, Version 2.0
Foolishly, I assumed that the latest, greatest Geronimo pre-release would have solved this pesky PermGen problem once and for all. I’m not sure why I thought that after ten years of the problem languishing it would have been solved without any announcement. We all have a weakness for higher version numbers, and mine kicked in. So I spent hours installing Geronimo JEE5 Jetty6 2.0 M5 (and a shot of non-fat milk, please!) because I have faith in projects tackling the tough, important problems because they’re there.
This new version of Geronimo was so slow I could not believe it. Starting all my contexts under 1.1.1 used to take a little over a minute. Now, it took over ten minutes. I’m not kidding. And once they finally started, the web-apps performed noticeably slower than inside any other container. Why bother releasing anything with such crippling performance regressions? I can only guess that in their effort to incorporate every capital E coming out of Santa Clara, the Geronimo programmers neglected job number one of their container: running applications at an acceptable speed.
I should have known that being in the same execution space as Java EE would have a price, even if I avoided those features myself. My assumption that a motivated, capable project like Geronimo could toss off the enterprise curse was naive: that devil has been stalking through computing in various forms since long before Java and it’s stronger than any of us. Invoked by those who neither understand nor respect programming, it bogs down, suffocates, and sucks the soul out of any software it gets near.
Application server, deconstructed
Container context reloading is never going to work in any meaningful sense. Perhaps if the JVM were largely rewritten—but no one cares enough for that. It’s somehow viewed as a childish feature, as if there were something wrong with updating applications on a live server more often than scheduled monthly corporate releases. Most likely, the context reloading problem will someday retire with Java itself. But we don’t have to suffer with it until then.
There’s a trick answer to the problem: if contexts do not work, do not use them. Serve applications that need to be regularly updated in their own containers and JVMs. Then you can “reload” all day long and no one will come along to slap your wrist. There are obvious redundancies in this setup—having multiple instances of the servlet container and the JVM itself—but in the practical sense they’re irrelevant.
My five Jetty 6.1 servers running now are featherweight compared to a single instance of that awful Geronimo millstone (oops—I mean milestone!), and even compared to 1.1.1 this setup seems faster. Certainly it’s more stable and predictable for replacing, adding, and removing contexts, which is something we just have to do in real life. I’m not sure how we started down the path of a monolithic server JVM. Those mythical Java desktop applications we’re supposed to be running would execute in their own virtual machines; why not run servers, each intended for many users, in separate environments too?
Nuts and bolts
Instructions for this arrangement will eventually replace those for Geronimo over at databinder.net, but here’s the gist: you run each app server on different ports and proxy those out through an Apache web server. (This is common with Mongrel configurations, and others too I’m sure.) Applications can run on named contexts or the root context; as a bonus, you can run root contexts on different virtual domains with no special configuration for servlet containers.
I have a single /etc/jetty.xml file, and an /etc/jetty.conf file that contains the string /etc/jetty.xml. Inside my jetty.xml, I’ve only made a few changes from the jetty.xml distributed with Jetty:
1. I commented out the context deployer. None of my apps need anything beyond the web-app deployer.
2. Inside the web-app deployer, on a single line:
<Set name="webAppDir">/usr/local/webapps/<SystemProperty name="my.webapps" default="../jetty/webapps"/></Set>
This allows me to set a JVM property for the webapp directory and thus use the same jetty.xml for all applications. If my.webapps is not specified, the supplied default points back to Jetty’s regular webapps dir (though I’m not using it for anything).
3. I commented out the request logger, because Apache is taking care of that.
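To illustrate how the property is meant to be used, here is a hypothetical direct launch of one instance (in practice jetty.sh assembles the equivalent command; paths here are illustrative):

```shell
# With -Dmy.webapps=typeturner, webAppDir resolves to /usr/local/webapps/typeturner
java -Dmy.webapps=typeturner -jar /usr/local/jetty/start.jar /etc/jetty.xml
```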
In an /etc/default/jetty script we put variables that are the same for every Jetty instance:
LANG=en_US.UTF-8
JAVA_HOME=/usr/lib/jvm/java-6-sun
JETTY_HOME="/usr/local/jetty"
JAVA_OPTIONS="$JAVA_OPTIONS -Dwicket.configuration=deployment \
-Xms128m -Xmx512m -XX:MaxPermSize=64m"
I made only one change inside the Jetty directory tree (so upgrades will be a cinch), to the file resources/log4j.properties. The default configuration outputs to stdout, which on my headless server is the same as trees falling in forests when no one is around. I set it to log to a rotating file, but I can do even better (and was never able to get this to work under Geronimo) with errors by e-mail:
log4j.rootLogger=info, R, email
log4j.appender.R=org.apache.log4j.RollingFileAppender
log4j.appender.R.File=/var/log/jetty/app.log
log4j.appender.R.MaxFileSize=100KB
# Keep one backup file
log4j.appender.R.MaxBackupIndex=1
log4j.appender.R.layout=org.apache.log4j.PatternLayout
log4j.appender.R.layout.ConversionPattern=%p %t %c - %m%n
log4j.appender.email=org.apache.log4j.net.SMTPAppender
log4j.appender.email.SMTPHost=localhost
log4j.appender.email.From=webmaster@technically.us
log4j.appender.email.To=nathan@technically.us
log4j.appender.email.Subject=Error at technically.us
log4j.appender.email.layout=org.apache.log4j.PatternLayout
log4j.appender.email.layout.ConversionPattern=%p %t %c - %m%n
log4j.appender.email.Threshold=WARN
Don’t forget to create a /var/log/jetty/ directory, and leave out any log4j.properties from your application archive. If your application uses mail internally, you also need to exclude the javax.mail dependency from the archive file so Jetty will use the one from its own classpath. Setting that Maven dependency to “optional” does the trick.
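For reference, the pom.xml stanza would look something like this (the version number is illustrative; use whatever your app already depends on):

```xml
<dependency>
  <groupId>javax.mail</groupId>
  <artifactId>mail</artifactId>
  <version>1.4</version>
  <!-- optional keeps the jar out of WEB-INF/lib, so Jetty's copy is used -->
  <optional>true</optional>
</dependency>
```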
Web applications themselves go into their own subdirectories of /usr/local/webapps, so for a root context you could have /usr/local/webapps/typeturner/ROOT.war. Each application gets its own control script, such as the following /etc/init.d/typeturner:
#!/bin/sh
JETTY_PID="/var/run/typeturner.jetty.pid"
JETTY_PORT="8283"
JAVA_OPTIONS="$JAVA_OPTIONS \
-Dmy.webapps=typeturner"
. /usr/local/jetty/bin/jetty.sh
This script passes run, start, stop, and other control commands to jetty.sh. Try run first, since its output goes right to the console.
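Once run looks healthy, on a Debian-style system (which is what I’m assuming here) all that’s left is making the script executable and registering it to start at boot:

```shell
chmod +x /etc/init.d/typeturner
update-rc.d typeturner defaults
```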
The last step is to configure Apache to proxy the different applications. My Typeturner proxy configuration is a little convoluted, since it’s a root context but shares that URL space with non-Java apps. The databinder.net setup is more typical:
ProxyRequests Off
<Proxy *>
Order deny,allow
Allow from all
</Proxy>
ProxyPass /site/ http://localhost:8282/site/
ProxyPass /baseball/ http://localhost:8281/baseball/
ProxyPass /directory/ http://localhost:8281/directory/
[...]
ProxyPassReverse / http://localhost:8281/
ProxyPassReverse / http://localhost:8282/
All of the Databinder examples (baseball, directory, …) run on different contexts in the same servlet container, as they aren’t updated often and when they are it’s usually as a group.
This setup uses an HTTP reverse proxy instead of AJP; if I ever get tired of its simplicity and reliability I might try adapting it. For now, I’m happy to have something that just works perfectly.
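For the record, the AJP variant would be a small change on the Apache side: something like the following, assuming mod_proxy_ajp is loaded and each Jetty instance has an AJP connector listening on the corresponding port (ports here are illustrative):

```apache
ProxyPass /site/ ajp://localhost:8009/site/
ProxyPassReverse /site/ ajp://localhost:8009/site/
```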
Geronimo was going to be the platform for testing Databinder’s theoretical generic JPA support on something besides Hibernate EntityManager; now it certainly won’t be. Our JPA support will have to exist in some alternate universe where “enterprise” means something besides wasted time, satisfied airheads, and paperwork over performance.
Coder comments
I still have one last hope for achieving complete separation between web apps: OSGi. What I’ve read about it sounds like it’s going to solve many of the Java problems we face today. I suggest you take a look at it, at least before throwing in the towel for good…
Thanks for pointing that out. Maybe some day OSGi modules will replace (or just fix) servlet contexts? Seeing as I have a work-around that I’m very happy with I won’t be spending any more time on it for now, but I can always “retrieve the towel” and jump back in the ring later.
Jetty’s ContextDeployer will do the hot redeploy for you. Have you tried it out?
Yes, I think that’s been built in since 6.1? I tried it but quickly ran into the same problem I’ve had with every hot redeployer: gradual consumption of PermGen space. Jetty rules, regardless; I’m sure this multi-instance setup would be ugly in anything else.
Hi n8han,
I’ve been running into the Tomcat “out of memory” problems due to reloading of webapps for quite some time, and your solution is about the closest I’ve seen to what I would consider “helpful isolation”. It should also make it easier to identify memory leaks, since there is no way to get this info on a context level from Tomcat. I’m looking forward to trying it out.
At any rate, you might want to update your other Geronimo link to say “Update: I’ve abandoned Geronimo in favor of … see this article.”
Thanks again,
Josh
Good idea! I’ve updated the old post. (The Databinder docs will have to wait until I can rewrite them.)
Nice post, Nathan.
It’s funny that even though I hadn’t read your post before, I came to almost the same conclusion as you. I run 4 Jetty servers to power the xoocode.org dynamic sites, all behind a single Apache server (also serving static stuff), and I’m very happy with the memory usage and application isolation.
The only thing I do differently is that I don’t even deploy my apps; I make them plain, regular Java applications, embedding a Jetty web server among other things (a Derby server, a Mule server, a Spring container, depending on my needs). This works very well in development: you don’t need any specific plugin or complex deployment or startup script, you can run the app by launching a Main class, that’s all. Keep it simple :-)
Yay, the Jetty secret weapon squad. I’ve been meaning to use Jetty embedding for the Databinder examples, to make running them inside any IDE automatic (click “go”). I may as well use it for deployment too.
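A minimal sketch of that embedding, against the Jetty 6 API (org.mortbay packages; I haven’t wired this into the examples yet, so treat the paths and port as placeholders):

```java
import org.mortbay.jetty.Server;
import org.mortbay.jetty.webapp.WebAppContext;

public class Main {
    public static void main(String[] args) throws Exception {
        Server server = new Server(8080); // port is arbitrary here

        WebAppContext webapp = new WebAppContext();
        webapp.setContextPath("/");
        webapp.setWar("src/main/webapp"); // or a path to a .war file

        server.setHandler(webapp);
        server.start();
        server.join(); // block until the server stops
    }
}
```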
Excellent post like always Nathan!
What about using mod_jk to avoid the extra HTTP traffic? This could improve performance.
I’ve started using AJP in some places, to make the proxy invisible to code. The performance difference isn’t apparent, though I don’t doubt it’s there.
Heya,
Have you considered enabling PermGen sweeping?
-XX:+UseConcMarkSweepGC \
-XX:+CMSPermGenSweepingEnabled \
-XX:+CMSClassUnloadingEnabled
Supposedly ends the PermGen nightmare…
Cheers, Dan
Hadn’t considered or heard of those parameters! I see your post about them from yesterday. Big servlet containers that reload contexts should really update their scripts to save people some headaches. Me, I’m not going back either way. That version of Geronimo was seriously slow, and my Jetty servers have been kicking butt since I switched over. Actually now I’m embedding them the way Xavier suggests in his comment, and it rules.
excellent post, nathan!
just in case you only use apache for reverse proxying, did you ever consider using a load balancer such as pound? it’s simple and extremely lightweight, avoiding apache’s overhead.
cheers,
francisco
I do serve some plain HTML and PHP, but thanks for the link.