Saturday, August 20, 2005

Scalable Java Web Applications: Part 2

I've recently discovered another way not to handle load-balancing for a Java Web Application...at least not with Tomcat. :)

As mentioned in my previous article, Scalable Java Web Applications, in order to maintain a user's session data in a web application that is distributed across a server farm, your two basic options are:

1) Share the user's data between all of the servers.
2) Use "Server Affinity" to pin a user to one given server while they are logged in, and store their session data on that server.

Server Affinity has some pretty hefty benefits in terms of simplicity, speed of development, performance, and scalability.

However, if your web site encrypts all communication through SSL (https), it can be difficult to achieve Server Affinity. Standard network hardware cannot read and interpret the incoming requests if they are encrypted.

The solution recommended by our network hardware vendor was to add an SSLizer box in front of our load balancer box. The SSLizer decrypts all incoming traffic and encrypts all outbound traffic, and you host your Site Certificate on the SSLizer box instead of on the Web Servers.

So the incoming path goes: Browser -> SSLizer -> Load Balancer -> Web/Java Server.

The outbound path goes: Web/Java Server -> Load Balancer (maybe) -> SSLizer -> Browser

All inbound communication is decrypted by the SSLizer and forwarded on to the load balancer as a plain http request on port 80. This allows the load balancer to read the requests and determine their destination, which makes Server Affinity possible.

Piece of cake, right? Nope!

The problem is that the Web Servers and Java Application Servers are tricked into believing they are running an http site, when in reality this is an https site. (See "Level 2 Kluge" under The Kluge Scale.) So every place in our system that generates an absolute URL for the browser puts "http" in the URL instead of "https".

The Application Server assumes these http requests are coming directly from the user's browser, and it has no way of knowing otherwise (that I know of).

The first problem we ran into was with the "base" tag in all of our JSP pages. The <base> tag is standard HTML; it establishes the root path for all relative URLs on a page. However, we were using standard JSP logic to generate the <base> tag, so Java put "http" into the base URL, and every relative reference on the page resolved to "http" instead of "https". This caused security warnings in the user's browser, as well as other odd behavior.
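
Our actual pages were more involved, but conceptually the <base> tag was built from the incoming request, something like the hedged sketch below. The key point is that request.getScheme() reports "http" when Tomcat sits behind the SSLizer.

    <%-- Hypothetical sketch of a <base> tag built from the request.
         Behind the SSLizer, request.getScheme() returns "http", so every
         relative URL on the page quietly becomes an http reference. --%>
    <%
        String base = request.getScheme() + "://"
                    + request.getServerName()
                    + (request.getServerPort() == 80 || request.getServerPort() == 443
                           ? "" : ":" + request.getServerPort())
                    + request.getContextPath() + "/";
    %>
    <base href="<%= base %>">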

So, I went through our entire application removing all base tags and re-working all of our URLs to be independently relative. (This is a bit error-prone and will cause future maintenance issues, but I guess we can live with it.)

The more serious problem we ran into next was with HTTP redirects, which are generated from many places in a Web application. A "redirect" is a response that tells the browser to fetch a different page. This is quite common, such as sending the user a "GET" redirect in response to a "POST" request. (See the section in my previous article Scalable Java Web Applications on the use of GET and POST in HTML.)
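
To make the mechanics concrete, here is a hedged sketch of that POST-then-redirect pattern (the class and page names are made up). The path handed to sendRedirect() is relative; the container is what expands it into a fully qualified URL.

    // Hypothetical servlet illustrating the POST -> redirect -> GET pattern.
    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class SaveOrderServlet extends HttpServlet {
        protected void doPost(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            // ... process the form submission here ...

            // Redirect so a browser refresh does not re-submit the POST.
            // The container turns this relative path into an absolute URL,
            // and that is where the wrong scheme sneaks in.
            response.sendRedirect("/myApplication/orderConfirmation.html");
        }
    }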

Tomcat is fundamentally designed to generate fully qualified URLs for all redirects. This behavior cannot be overridden; it is even part of the Sun Specification for Java Servlets. Here is the relevant paragraph from the Sun specification of the "sendRedirect" method:

The sendRedirect method will set the appropriate headers and content body to redirect the client to a different URL. It is legal to call this method with a relative URL path, however the underlying container must translate the relative path to a fully qualified URL for transmission back to the client. If a partial URL is given and, for whatever reason, cannot be converted into a valid URL, then this method must throw an IllegalArgumentException.

On an HTTP redirect, Tomcat prepends "http" to the redirect URL, because Tomcat believes the incoming request is "http". So Tomcat sends an "http" redirect to the browser. When the browser receives an "http" redirect in response to its original "https" request, it pops up a security warning to the user. (The Firefox browser does not show such a warning, but Internet Explorer does.)

Example: if I try to redirect the browser to "/myApplication/pageOne.html", Tomcat will generate an HTTP redirect to http://www.mysite.com/myApplication/pageOne.html. And this will cause a browser security warning popup.

So, once we installed the SSLizers, our application was popping up security alerts every other page. That is unacceptable.

For the Java and JSP code that I write, I can add code to fully qualify all the redirects myself and prepend "https" to the front of them. That avoids the problem for my own code.

But there are other redirects that happen internal to other Java Libraries that I can't control. Struts and JSF make use of a "redirect" tag in their XML control layer, and the Tomcat Login Process also makes use of redirects. The Tomcat Login Process is especially problematic because it was designed to prevent any and all alterations to its behavior. And the Tomcat Login process uses a redirect to forward the user into the application after they have successfully logged in.

The login page is the LAST place you want your users seeing any kind of security warning from the browser.

Here are the things I tried that did not work:

Dead End #1: In the server.xml file for Tomcat, you can specify the "scheme" for a given Connector. I can specify a scheme of "https", so that Tomcat will assume that all traffic on this Connector is https. Of course, this means I'll never be able to use regular http on my site, but I can live with that for this particular application. I gave it a try, and found out that the stubborn Tomcat Login Process won't allow it. If you set your Connector scheme to "https", the Tomcat Login Process tacks port ":80" onto the end of the domain in your URLs, forcing you BACK to port 80. Making an https request over port 80 causes an invalid request error on any Web Server.
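
For reference, the change in this dead end amounted to something like the fragment below (the port number is illustrative, and the rest of our Connector attributes are omitted):

    <!-- Hypothetical server.xml fragment for Dead End #1: only the scheme is set. -->
    <Connector port="8080" scheme="https" />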

Dead End #2: I spent a few hours experimenting with Apache mod_rewrite to see if I could construct a trap that would remove that port ":80" from inbound requests. What I discovered was that the request was failing before it even reached mod_rewrite, because the listener on port 80 was receiving encrypted garbage.

Dead End #3: I created a standard Java Servlet Filter so that I could alter all outbound traffic, and replace "http" with "https" in all outbound redirects. It turns out that j_security_check (the Tomcat Login Process) bypasses all Servlet Filters by design.
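
For what it's worth, the filter was roughly shaped like the hedged sketch below (the class name is made up). Wrapping the response is the standard way to intercept sendRedirect() from a filter; the problem was simply that j_security_check never passes through the filter chain.

    // Hypothetical sketch of a filter that rewrites "http" to "https"
    // in any absolute redirect URL issued through sendRedirect().
    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletResponse;
    import javax.servlet.http.HttpServletResponseWrapper;

    public class HttpsRedirectFilter implements Filter {

        public void init(FilterConfig config) {
        }

        public void destroy() {
        }

        public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
                throws IOException, ServletException {
            HttpServletResponse wrapped =
                    new HttpServletResponseWrapper((HttpServletResponse) response) {
                        public void sendRedirect(String location) throws IOException {
                            // Force any absolute http:// redirect back to https://
                            if (location.startsWith("http://")) {
                                location = "https://" + location.substring("http://".length());
                            }
                            super.sendRedirect(location);
                        }
                    };
            chain.doFilter(request, wrapped);
        }
    }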

Dead End #4: I created my own login servlet, so I could manipulate the request data, and then forward the request into j_security_check. That way I could authenticate the user myself, and then hand them over to Tomcat. It turns out j_security_check does not allow itself to be a target of a servlet forward. And there is no way to intercept the login process AFTER j_security_check either.

Dead End #5: Finally, I came up with a truly evil idea: I called the login page in such a way that I didn't tell j_security_check where to send the user after they have logged in. This causes j_security_check to generate an HTTP 400 error after login. In my Application Configuration (web.xml), I set up a trap for 400 errors so that they are automatically sent to a custom JSP page that I wrote. The custom JSP page then detects whether it came from j_security_check, and if so, it forwards the user into the application! Talk about a Rube Goldberg machine! However, it turns out that j_security_check invalidates the user's session if a 400 error is generated. Drat!
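
The 400-error trap itself is ordinary web.xml configuration, along these lines (the JSP name is made up):

    <!-- Hypothetical web.xml fragment for Dead End #5: send HTTP 400 errors
         to a custom JSP that forwards the user into the application. -->
    <error-page>
        <error-code>400</error-code>
        <location>/loginSuccessTrap.jsp</location>
    </error-page>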

I tell you, the people who wrote j_security_check thought of everything! The philosophy is that a secure login process must be atomic (indivisible). If you let a developer inject their own code anywhere into the login process, you risk compromising security if the developer's code breaks or is hackable.

Final Solution
Well, there is no final solution. The fundamental design of the SSLizer is in direct conflict with the fundamental design of Tomcat. We will have to work with our vendors to try something else, but I'm not sure this is a hardware issue.

Perhaps we are in conflict because we are handling load balancing and SSL with hardware, when Tomcat seems to assume these are software concerns. Tomcat has its own software load-balancing mechanisms, and so do commercial Application Servers. It seems almost as if the designers of Tomcat never really considered making Tomcat work behind a hardware load balancer and SSLizer.

Perhaps commercial grade Java Application Servers have configuration settings that allow them to be told that they are operating behind an SSLizer. Perhaps they have more configuration options for controlling the default protocol for outbound traffic.

Work Around
As a temporary work-around, I have pulled a "Neo Kluge" and altered the Tomcat Engine itself to send relative URLs for every redirect. I downloaded all the source code for Tomcat from the Apache site and traced down every place a redirect is generated. I managed to find one key point that all redirects have in common (that was lucky), so I only had to alter the class org.apache.catalina.connector.Response. I rebuilt "catalina.jar" and deployed it to our production Tomcat servers, and everything works perfectly now.

But obviously, this fix cannot be allowed to remain in production for very long. It's just too much of a maintenance nightmare to maintain our own version(s) of Tomcat.

On a side note, if I wanted to fix just the Tomcat Login Process and nothing else, I found that I could alter org.apache.catalina.authenticator.FormAuthenticator. This class controls the login process, as well as that final redirect that happens upon a successful login.

If we decide to replace the Tomcat Login Process with some other module that doesn't have this problem, we would still have redirect problems to deal with in the JSF framework. But I also downloaded the source code for Apache MyFaces (the implementation of JSF that we are using), and I traced its redirect logic down to the "redirect" method of the class org.apache.myfaces.context.servlet.ServletExternalContextImpl. That method can be altered to override the way redirect URLs are generated. What's nice about this solution is that you can drop the altered class right into your Web Application's source directory, and it will take precedence over the same class in the JSF jar files. This way, you don't have to build your own version of the JSF jar files. (This is still a horrible kluge, but it's at least a tiny bit better than modifying Tomcat itself.)

Look for new postings to this blog in the coming days to see how we finally resolve this problem.

To Be Continued....
Well, we finally came up with a semi-decent solution to the problem with redirects in Tomcat.

Here is what you do:

In Tomcat's server.xml, you can configure the Connector to run behind a proxy using the proxyName and proxyPort parameters. Here is the relevant piece of Tomcat documentation on these parameters:

"The proxyName and proxyPort attributes can be used when Tomcat is run behind a proxy server. These attributes modify the values returned to web applications that call the request.getServerName() and request.getServerPort() methods, which are often used to construct absolute URLs for redirects. Without configuring these attributes, the values returned would reflect the server name and port on which the connection from the proxy server was received, rather than the server name and port to whom the client directed the original request."

Well, in my case, I'm running Tomcat behind an SSL decoder, not a Proxy, but the effect is the same, and the solution is the same.

So, in the Connector attributes, I set the "proxyName" to the server name of my website as seen from the outside world ("mysite.mydomain.com"). I set "proxyPort" to 443. I set "scheme" to "https", and I set "secure" to "true". And it works!
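
Put together, the Connector entry in server.xml ends up looking something like the fragment below (the listen port is illustrative, and any other Connector attributes you already have stay the same):

    <!-- Hypothetical server.xml fragment: Tomcat running behind the SSLizer. -->
    <Connector port="8080"
               proxyName="mysite.mydomain.com"
               proxyPort="443"
               scheme="https"
               secure="true" />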

When Tomcat generates absolute URLs, it uses these parameters to build the URL rather than the values from the incoming request.

This solution seems easy in hindsight. But when I was coming at this from the other direction, I didn't know what to look for. I was doing tons and tons of reading on the topic of "SSL with Tomcat" and not on the topic of "proxy servers with Tomcat".

It's one of those "magic word" solutions where you can't find the answer until you guess the magic word to search for.

But I posted many messages on newsgroups and other forums, and nobody else knew the answer to this either. So, I don't feel too bad. :)

So, overall, we have a pretty good end-to-end solution. The only issue we still have is that by configuring the Connector this way, we've locked the entire site into https. We could never have parts of the site be just "http", because we've hard-coded the Connector to treat everything as https.

But that's not so terribly bad. We could still create a separate site for some http content, and jump back and forth between the two sites I suppose.

1 comment:

Tom Gordon said...

Not sure I understand. You don't have to use the tomcat authenticator, so ... just check the credentials yourself and forward to whatever URL you want to.
