Two weeks ago I had the pleasure of presenting at the NY JavaSIG. The event was hosted by an old friend, Frank
Greco, who has been doing really great work keeping the NY Java community up
to date with the latest and greatest for quite some time (great work, Frank!).
Even though it was one of those freezing NY evenings, the room was packed with
around 300 people.
In this presentation, I used an analogy that I
refer to in many of my recent talks to explain the fundamental limitations of
the tier-based approach. I thought it was worth documenting this analogy for
those of you who are looking for a simple argument to convince your managers to
open their minds to alternative approaches. One thing I have found with
this analogy is that everyone gets it (my wife included :)).
It goes like this:
Imagine a Coca-Cola production line that
consists of three factory lines: one producing the bottles, one filling them and
the third shipping them. The current production line can produce 1,000 bottles
a day.
One day your manager says to you (you are
responsible for the total production): “We’re going to launch a new campaign, and we
expect demand to grow to 10,000 bottles a day. How quickly can we be ready for
this?” Excited to meet that challenge, you immediately call the people responsible
for each of these factories and tell them about the new requirement. Jim,
the bottle factory manager, tells you, “No problemo, I’ve just upgraded my entire
machinery, I should be ready in no time.” Joe, the bottle filling factory line
manager, says, “I’ve already squeezed everything I can get from what I have, and
it’ll take me 6 months to upgrade my production line.” And Ann, the shipping
factory line manager, tells you, “It’ll take me 1 month to get ready.” How long
will it take to meet the 10,000 bottles a day requirement? Easy – exactly 6
months. Why? Because you’re only as strong as your weakest link – in this case
the bottle filling factory line.
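For those who prefer to see the arithmetic, here is a minimal sketch of the "weakest link" rule: the whole line is ready only when its slowest stage is ready. The class and method names are made up for this post, and treating Jim's "in no time" as 0 months is my own assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: stage names and lead times are taken from the story above.
final class ProductionLine {
    // Months each stage needs before it can handle the new demand.
    private final Map<String, Integer> monthsToUpgrade = new LinkedHashMap<>();

    ProductionLine stage(String name, int months) {
        monthsToUpgrade.put(name, months);
        return this;
    }

    // The line as a whole is ready only when its slowest stage is ready.
    int monthsUntilReady() {
        int slowest = 0;
        for (int months : monthsToUpgrade.values()) {
            slowest = Math.max(slowest, months);
        }
        return slowest;
    }

    // Name of the stage that dictates the overall lead time (null if no stages).
    String bottleneck() {
        String name = null;
        int slowest = -1;
        for (Map.Entry<String, Integer> e : monthsToUpgrade.entrySet()) {
            if (e.getValue() > slowest) {
                slowest = e.getValue();
                name = e.getKey();
            }
        }
        return name;
    }

    public static void main(String[] args) {
        ProductionLine line = new ProductionLine()
                .stage("bottles", 0)   // assumption: "in no time" = 0 months
                .stage("filling", 6)
                .stage("shipping", 1);
        // prints "filling is the bottleneck: 6 months"
        System.out.println(line.bottleneck() + " is the bottleneck: "
                + line.monthsUntilReady() + " months");
    }
}
```

Notice that upgrading the bottle or shipping stages any faster changes nothing: only the slowest stage moves the result.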
Coca-Cola Factory – “Tier Based”
You are probably asking by now how all this is related to
computing. Think of the tier-based approach as a production line which consists
of a messaging-tier, a business logic-tier and a data-tier. This production line
produces transactions. To be fulfilled, a business transaction needs to go
through all the tiers, in similar fashion to the Coca-Cola bottles in our production
line.
As with the production line, in order to
process more transactions in a given time period, we need to make sure that all
the tiers in the transaction processing chain are tuned to meet the new
required throughput capacity. Going from a certain
capacity to a bigger one requires a process in which each tier needs to be
tuned, upgraded or even replaced to cope with the new requirement. This is going to be a
continuous effort that repeats with every scaling event. Each time you scale
would require a different level of effort which in most cases this effort is
This is only the tip of the iceberg –
things become much more complex when we add reliability constraints to this
process, i.e. we cannot afford downtime of this production line at any point in
time. In our factory analogy this requires each factory to have its
own DR site. Most likely each one will take a different approach to
implementing this reliability policy, and it’s going to be very hard to make these
policies consistent across the production line. The same goes for our tier-based
implementation. Each tier has its own high availability and fail-over model. The
only way we can ensure that our transaction is consistent is by adding an
external coordinator which looks at each individual transaction and makes
sure each tier has processed it before it can safely “say” that the entire
transaction was processed successfully. This synchronization process is going
to hold up all our operations, i.e. we’re going to be busy most of the time doing
synchronization, which means that we’re not going to be utilizing our existing
resources effectively. In the tier-based world this coordination is basically
the two-phase commit (XA) transaction.
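To make the coordinator's job concrete, here is a rough sketch of the two-phase idea. The `TierParticipant` interface and the other names are hypothetical and heavily simplified; this is not the real `javax.transaction.xa.XAResource` contract, and it ignores timeouts, crashes and recovery logging, which is where most of the real-world pain lives.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified participant contract (NOT the real XA API).
interface TierParticipant {
    boolean prepare(String txId);  // phase 1: vote yes/no
    void commit(String txId);      // phase 2 when every tier voted yes
    void rollback(String txId);    // phase 2 when some tier voted no
}

final class TwoPhaseCoordinator {
    private final List<TierParticipant> tiers;

    TwoPhaseCoordinator(List<TierParticipant> tiers) {
        this.tiers = new ArrayList<>(tiers);
    }

    // Returns true only if every tier prepared and was then told to commit.
    boolean execute(String txId) {
        List<TierParticipant> prepared = new ArrayList<>();
        for (TierParticipant tier : tiers) {
            if (tier.prepare(txId)) {
                prepared.add(tier);
            } else {
                // One "no" vote undoes the work of every tier that voted yes.
                for (TierParticipant p : prepared) {
                    p.rollback(txId);
                }
                return false;
            }
        }
        for (TierParticipant tier : tiers) {
            tier.commit(txId);
        }
        return true;
    }
}

// Test double that records every call, so the round trips are visible.
final class RecordingTier implements TierParticipant {
    private final String name;
    private final boolean vote;
    private final List<String> log;

    RecordingTier(String name, boolean vote, List<String> log) {
        this.name = name;
        this.vote = vote;
        this.log = log;
    }

    public boolean prepare(String txId) { log.add(name + ":prepare"); return vote; }
    public void commit(String txId) { log.add(name + ":commit"); }
    public void rollback(String txId) { log.add(name + ":rollback"); }
}
```

Even in the happy path, every business transaction costs two calls to each tier on top of the actual work, and no tier can finish until the slowest one has voted. That is the synchronization overhead described above.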
I can easily continue down this path, and review the
various limitations of the tier-based model, but I believe you get the picture.
This model is built of “silos”. If you have a fairly static environment this
model may work fine. However, where there is a strong dependency between silos,
and we expect to deal with continuous scaling changes and upgrades, this
approach is broken.
Is there a better way?
To find a solution, we can refer again to
production line optimization experience. One of the methodologies used to
optimize a production line (and also being adopted for optimizing development
processes) is referred to as “lean”.
Lean is a management philosophy. Ultimately, it
focuses on throughput (of whatever is being produced) by taking a
strictly system-level view of things. In other words, it doesn’t focus
on particular components of the value-stream, but on whether all the components of the chain are working as
efficiently as possible, to generate as much overall value as possible.
As a specific example, if we take an end-to-end system-level view, it becomes clear
that if we have a strong dependency between the different units in our production
line, it doesn’t make a lot of sense to put them in different places under
different managers, even if each of them serves a different purpose. By
recognizing this dependency we can restructure our production line – this time
we’re going to build each factory as a self-sufficient unit where each unit will
handle the entire production line, i.e. producing the bottles, filling and
shipping them. Since we can build all of the units as complete replicas of
each other, we gain consistency across all the sites, even if each unit can
deal with only a small subset of the total required capacity. All we need is the
right number of these units to meet the demand. What if we need to increase
capacity? Easy – we just add more of these production units without even
needing to inform the existing units of this change. In this approach, if one
unit fails, it brings down only that unit and doesn’t impact the entire
production line. More importantly, we eliminate the synchronization overhead between
the various components by co-locating them in a single factory. This way our
production line becomes much more agile than the alternative.
Coca-Cola Factory – Self Sufficient Units
In our tier-based world we will do pretty much the same.
Instead of having separate servers per tier we’ll build our application out of
self-sufficient processing units, each containing the messaging, business logic
and data components. We scale our application simply by adding more of these
units and load-balancing the transactions between them. In
other words, we’re doing the following:
- We take all the components of our architecture that are tightly
coupled at run-time (in terms of latency, fail-over and scaling) and group them under a
single self-sufficient unit of work which we refer to as a processing unit.
- We use many of these processing units to handle the required capacity.
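Here is a toy sketch of those two points, just to show the shape of the idea. Everything in it is hypothetical (the `ProcessingUnit` and `UnitRouter` names, the round-robin policy, the upper-casing that stands in for business logic); it is not the API of any product. Each unit keeps its own data next to its logic, and scaling means adding one more identical unit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical self-sufficient unit: messaging entry point, logic and data together.
final class ProcessingUnit {
    private final int id;
    private final List<String> localData = new ArrayList<>(); // co-located "data tier"

    ProcessingUnit(int id) {
        this.id = id;
    }

    // The whole transaction completes inside one unit: no cross-tier coordinator.
    String process(String order) {
        String result = order.toUpperCase(Locale.ROOT); // stand-in for business logic
        localData.add(result);                          // stored next to the logic
        return "unit-" + id + ":" + result;
    }

    int processedCount() {
        return localData.size();
    }
}

// Hypothetical router: round-robin is an assumption, chosen for brevity.
final class UnitRouter {
    private final List<ProcessingUnit> units = new ArrayList<>();
    private int next = 0;

    // Scaling out: existing units are neither changed nor informed.
    void addUnit() {
        units.add(new ProcessingUnit(units.size()));
    }

    String submit(String order) {
        if (units.isEmpty()) {
            throw new IllegalStateException("no processing units available");
        }
        ProcessingUnit unit = units.get(next % units.size());
        next++;
        return unit.process(order);
    }

    ProcessingUnit unit(int index) {
        return units.get(index);
    }

    int unitCount() {
        return units.size();
    }
}
```

A real system would more likely route by some key (say, a customer id) so that related data always lands in the same unit, but the principle is the same: capacity grows with the number of units, and a failed unit takes down only its own share of the work.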
It is not surprising that the production line analogy helps highlight some of
the fundamental deficiencies of tier-based architecture. There are a lot of
parallels and things we can learn from the “Lean” and “Agile” methodologies that have already proven themselves in optimizing production environments. One
of the main lessons is to take the end-to-end system view, rather than trying to
optimize each tier separately.
In many cases, much
more “bang for the buck” can be achieved simply by looking at an extended
value-stream, as opposed to a localized one.
The limitations of the tier-based approach are not just
the result of a certain implementation or a certain API (J2EE).
It is the fundamental thinking underlying the tier-based approach which leads
to the complexity of wiring these tiers together to meet changing runtime requirements.
Our suggestion for solving this problem is that instead of separating our
application based on functionality (API), we separate it based on runtime
dependency and keep the API separation at a logical level and not a physical
one. At first sight this may sound like a major shift in how we build our
applications. However – the good news is that we can abstract a large part of
that change from our application code using virtualization techniques. The name we’ve given to this pattern is Space Based Architecture (SBA). For more information on the pattern you can listen to the podcast of a presentation that was given during the last TSS Symposium event.
By the way – when I started talking about tier-based
architecture, my wife lost me. But the production line worked. Try