
Safe, High-Velocity Library Upgrades in Tomcat Monoliths
Keeping a large production system healthy often feels like changing airplane engines while in flight. At Egnyte, we still operate several sizeable Java monoliths that run inside Apache Tomcat. All high-severity Common Vulnerabilities and Exposures (CVEs) need to be patched quickly—sometimes in a matter of days—to maintain the uncompromising security posture our customers expect. Falling behind isn’t an option, yet switching out low-level libraries, such as database drivers, can break the entire application if even one transitive dependency behaves differently.
Traditionally, teams respond with heavy engineering rituals. They build a second Docker image with the new JAR files, deploy it to a staging stack, run a battery of tests, then run it side by side with the old image in production. That approach works, but it doubles build and storage costs, complicates continuous-integration pipelines, and creates awkward questions like “Which image just failed that health check?” We wanted something leaner: one image, one artifact name, and zero shell scripts shuffling files around at container startup, yet still reversible at the flip of a switch.
Why Library Upgrades Are Risky
Java packages its dependencies as JAR files that live under /WEB-INF/lib inside a web application. When Tomcat starts, it opens every JAR on that path and wires the classes into a class loader, the component responsible for turning a class name like org.postgresql.Driver into executable bytecode. Once a class is loaded, it tends to stay loaded for the lifetime of the JVM. If you swap a JAR on disk and restart Tomcat, you get the new version everywhere, all at once—a “big bang” upgrade. If the new version expects a method that no longer exists or behaves differently under load, the application crashes.
The usual escape hatch is to run two full copies of the application—“old libs” and “new libs”—and decide at routing time which one receives traffic. While conceptually clean, this approach multiplies build time, container registry storage, environment variables, monitoring noise, and incident complexity. We calculated that maintaining a parallel image for every critical library upgrade would be a considerable expense in CI time alone, plus it increases the chance for the two images to drift apart in subtle ways as new commits come in.
A One-Image Strategy
Our goal was surgical: control which libraries Tomcat sees without copying or deleting anything on disk and without maintaining two separate builds. The solution hinged on a little-known extension point inside Tomcat: StandardRoot, the default implementation of the WebResourceRoot interface. At startup, Tomcat asks it to enumerate every resource that belongs to the application (HTML files, configuration files, and, most importantly, JARs). By providing our own subclass of StandardRoot, we could decide at runtime which JARs appear to exist.
Two Directories, One Truth
We introduced a second directory beside the built-in library folder:
- /WEB-INF/lib is the baseline, fully tested library set
- /WEB-INF/alt-lib is an overlay that can override or hide anything in lib
On every container build, we copy both directories into the image. That costs nothing extra and keeps the artifact count at one. Which directory wins is determined at runtime by a single environment variable: “USE_ALT_TOMCAT_LIB=true”.
If the flag is absent or false, our custom StandardRoot behaves exactly like the stock implementation, and the monolith boots with its familiar dependencies. If the flag is true, the same code pretends that alt-lib sits in front of lib. For any filename that exists in both places, the version under alt-lib masks the one under lib. In some cases, we may need to remove a library entirely. Examples include a logging framework that collides with the new driver, or an old JAR whose filename is stamped with a version (driver-5.7.14.jar) and is therefore not masked by its differently named replacement (driver-8.0.23.jar). To remove a library entirely, we drop an empty JAR of the same name into alt-lib. Tomcat then dutifully loads the empty shell, and the real classes never appear.
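As an illustration of the "empty shell" trick, the following minimal sketch writes a JAR that contains nothing but a manifest. It uses only the JDK; the class name EmptyJarWriter and its helper methods are hypothetical and not part of our build tooling, which may well produce these files differently.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical helper: creates a manifest-only JAR to mask a library of the same name.
final class EmptyJarWriter {

    /** Writes a JAR with a manifest and no other entries; returns its path. */
    static Path writeEmptyJar(Path altLibDir, String jarName) throws IOException {
        Files.createDirectories(altLibDir);
        Path target = altLibDir.resolve(jarName);
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        try (OutputStream out = Files.newOutputStream(target);
             JarOutputStream jar = new JarOutputStream(out, manifest)) {
            // Intentionally no entries: the manifest is written by the constructor.
        }
        return target;
    }

    /** Counts .class entries, to confirm the shell really is empty. */
    static int countClassEntries(Path jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            return (int) jar.stream()
                    .filter(entry -> entry.getName().endsWith(".class"))
                    .count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path altLib = Files.createTempDirectory("alt-lib");
        Path jar = writeEmptyJar(altLib, "driver-5.7.14.jar");
        System.out.println(jar.getFileName() + " class entries: " + countClassEntries(jar));
    }
}
```

The filename passed in must match the JAR under lib exactly, since masking is by name.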
No files move, nothing is copied at startup, and the container image digest doesn’t change when the feature flag toggles. Rollback is as simple as unsetting the environment variable and recycling the pod or restarting the VM.
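Because nothing is copied, the overlay rule boils down to a pure computation over two directory listings and one flag. The sketch below models that rule with the JDK only. It is not our StandardRoot subclass and does not touch Tomcat's API; the class name OverlayResolver and its methods are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical model of the masking rule: alt-lib wins per filename when the flag is on.
final class OverlayResolver {

    /** True only when USE_ALT_TOMCAT_LIB is set to "true" (case-insensitive). */
    static boolean overlayEnabled(Map<String, String> env) {
        return Boolean.parseBoolean(env.get("USE_ALT_TOMCAT_LIB"));
    }

    /** Returns filename -> path of the JAR that should be visible. */
    static Map<String, Path> effectiveJars(Path lib, Path altLib, boolean useAlt)
            throws IOException {
        Map<String, Path> byName = new TreeMap<>();
        collectJars(lib, byName);
        if (useAlt) {
            collectJars(altLib, byName); // later entries replace same-named ones from lib
        }
        return byName;
    }

    private static void collectJars(Path dir, Map<String, Path> byName) throws IOException {
        if (!Files.isDirectory(dir)) {
            return; // a missing overlay directory simply contributes nothing
        }
        try (Stream<Path> files = Files.list(dir)) {
            files.filter(p -> p.getFileName().toString().endsWith(".jar"))
                 .forEach(p -> byName.put(p.getFileName().toString(), p));
        }
    }

    public static void main(String[] args) throws IOException {
        Path webInf = Files.createTempDirectory("WEB-INF");
        Path lib = Files.createDirectories(webInf.resolve("lib"));
        Path alt = Files.createDirectories(webInf.resolve("alt-lib"));
        Files.createFile(lib.resolve("driver.jar"));
        Files.createFile(lib.resolve("util.jar"));
        Files.createFile(alt.resolve("driver.jar"));

        boolean useAlt = overlayEnabled(System.getenv());
        effectiveJars(lib, alt, useAlt)
                .forEach((name, path) -> System.out.println(name + " -> " + path));
    }
}
```

With the flag off, the result is exactly the contents of lib, which mirrors the rollback behavior described above.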
A Quick Class Loading Primer (No Prior Knowledge Required)
Think of a class loader as a librarian in charge of a shelf of books. When the application asks for “Chapter 7 of the PostgreSQL manual,” the librarian looks through every book on the shelf in the order they were shelved—first-come, first-served. If two books share the same title, the librarian stops at the first match and never checks the rest. Our custom StandardRoot quietly rearranges the shelf order: It places the books from alt-lib in front of the main stack if (and only if) the feature flag demands it. To the rest of Tomcat, the operation is invisible.
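The librarian analogy can be tried out with the JDK's own URLClassLoader, which searches its URLs in the order given and returns the first match. The sketch below is only a demonstration of that ordering with a plain text resource. Tomcat's web application class loader has additional delegation rules of its own, and the class name ShelfOrderDemo and the resource name title.txt are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo: the first directory on the "shelf" that holds the resource wins.
final class ShelfOrderDemo {

    /** Looks up a resource across existing directories, searched in the order given. */
    static String firstMatch(String resourceName, Path... shelfOrder) throws IOException {
        URL[] urls = new URL[shelfOrder.length];
        for (int i = 0; i < shelfOrder.length; i++) {
            // For an existing directory, the URI ends with '/', which URLClassLoader
            // needs in order to treat the URL as a directory.
            urls[i] = shelfOrder[i].toUri().toURL();
        }
        // A null parent keeps the application class path out of the search.
        try (URLClassLoader loader = new URLClassLoader(urls, null);
             InputStream in = loader.getResourceAsStream(resourceName)) {
            return in == null ? null : new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path lib = Files.createTempDirectory("lib");
        Path alt = Files.createTempDirectory("alt-lib");
        Files.writeString(lib.resolve("title.txt"), "from lib");
        Files.writeString(alt.resolve("title.txt"), "from alt-lib");

        System.out.println(firstMatch("title.txt", alt, lib)); // alt-lib shelved first
        System.out.println(firstMatch("title.txt", lib, alt)); // lib shelved first
    }
}
```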
Handling Dual APIs in Application Code
Of course, switching libraries at runtime is only safe if the rest of the code can survive either version. For data-access layers, we wrap the driver behind a thin adapter that abstracts changes such as connection URL syntax or new SQL types. Reflection and feature detection (“Does class X exist? Does method Y accept three arguments?”) further insulate business logic from API drift. The pattern is familiar to anyone who has supported multiple Java versions in the same codebase.
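The two feature-detection questions quoted above map onto a few lines of standard reflection. This is a minimal sketch, not our adapter code; the class name FeatureDetection and its method names are hypothetical, and the driver class in main is used only as an example of a name to probe.

```java
import java.util.Arrays;

// Hypothetical helpers for probing which API surface is available at runtime.
final class FeatureDetection {

    /** "Does class X exist?" Loads without running static initializers. */
    static boolean classExists(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** "Does method Y accept N arguments?" Checks public methods only. */
    static boolean hasMethod(Class<?> type, String name, int parameterCount) {
        return Arrays.stream(type.getMethods())
                .anyMatch(m -> m.getName().equals(name)
                        && m.getParameterCount() == parameterCount);
    }

    public static void main(String[] args) {
        ClassLoader loader = FeatureDetection.class.getClassLoader();
        // Result depends on whether the driver happens to be on the class path.
        System.out.println("PostgreSQL driver present: "
                + classExists("org.postgresql.Driver", loader));
        System.out.println("String.substring(int, int) present: "
                + hasMethod(String.class, "substring", 2));
    }
}
```

An adapter would typically run such checks once at startup and cache the answer, rather than reflecting on every call.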
To tell which library set has been loaded, should the need arise, the application can check through reflection which JAR a class came from or read the JAR's META-INF content. If neither is available, the developer can place a properties file with the same name into both lib and alt-lib. The copy in alt-lib then overrides the one in lib, just as a JAR would, and the differing content can alter the application flow as needed.
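A minimal sketch of both identification routes follows, using only the JDK. The file name lib-variant.properties, the key variant, and the class LibraryVariant are all hypothetical. How the stream is obtained is also an assumption: in a servlet container it would presumably come from something like ServletContext.getResourceAsStream("/WEB-INF/lib/lib-variant.properties"), which is left out here to keep the example free of external dependencies.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helpers for telling the baseline and overlay library sets apart.
final class LibraryVariant {

    /** Reads Implementation-Version from the JAR manifest, or null if not declared. */
    static String implementationVersion(Class<?> type) {
        Package pkg = type.getPackage();
        return pkg == null ? null : pkg.getImplementationVersion();
    }

    /** Reads the assumed "variant" key from a marker properties stream. */
    static String variant(InputStream marker) throws IOException {
        if (marker == null) {
            return "unknown"; // no marker file was found
        }
        try (InputStream in = marker) {
            Properties props = new Properties();
            props.load(in);
            return props.getProperty("variant", "unknown");
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the marker file that would sit in lib or alt-lib.
        InputStream marker = new ByteArrayInputStream(
                "variant=alt-lib\n".getBytes(StandardCharsets.ISO_8859_1));
        System.out.println("variant: " + variant(marker));
        System.out.println("version: " + implementationVersion(String.class));
    }
}
```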
Real-World Successes
Since late 2024, this system has powered several upgrades that once would have required after-hours deploy windows—for example, the MySQL 5.x to 8.x driver migration.
In each case, the rollout finished during regular business hours, with the same monitoring dashboards and alert definitions. The operational overhead dropped from two parallel stacks to one stack with a switch, and the average time-to-patch for security fixes shrank from roughly a week to under two days.
Operational Benefits
The operational benefits of this approach are as follows:
- Single artifact—simpler registry retention policies and vulnerability scans
- Predictable startup—no shell scripts copying files before Tomcat boots, so fewer moving parts in container entry points
- Instant rollback—panic button equals toggling an environment variable or helm value
- Cost-efficient—build, test, and store one image regardless of experiment count
Looking Ahead
Our monoliths continue to shrink as services peel off into Spring Boot microservices, but Tomcat will be with us for a while. The controlled-overlay trick scales naturally: Multiple feature flags could activate multiple overlays (for example, /alt-lib-beta), and the same mechanism could gate new JSPs or static resources, providing a lightweight A/B framework without a proxy layer.
For now, the lesson is simple: You don’t need heavyweight release patterns to deliver high-pace library upgrades—even under the most stringent of regimes. A deep understanding of your runtime’s extension points, plus a small amount of custom code, can buy enormous flexibility. By letting Tomcat do the work instead of fighting it, we keep our focus where it belongs: shipping features and staying secure, not nursing fragile build pipelines.
A Note About Spring Boot
While we haven’t had the need yet, the same approach can be used in Spring Boot by overriding Spring’s “LaunchedClassLoader” and applying the same logic.