Putting together the two words “seamless scaling” in
front of a technical audience is a dangerous thing to do. Technically savvy folks are walking around with plenty of scars from previous attempts to scale their systems – enough to know that "scaling" and "seamless" couldn't be further apart. Nevertheless, in this post I'm going to take the risk and do just that 🙂
What I'm going to argue is that while scaling can't be made seamless across the board, there are techniques that make scaling seamless – or at least very close to it – in certain scenarios. I will use GigaSpaces as an example of how to achieve seamless migration of existing JEE applications to a scale-out model, with zero or minimal change to the code. I'll also outline our general principles, which I believe are applicable to any application seeking seamless scaling.
The seamless scaling dogma
There has been a lot of discussion over the past year about different patterns of scalability, and I have devoted quite a few of my posts to this topic. Most of them centered around architecture – how we can use partitioning to avoid a data bottleneck, how we can use in-memory implementations to get better performance and concurrency than file-system-based implementations, and how we can use an asynchronous, event-driven architecture as a better way to scale our business logic.
Randy Shoup outlined these principles nicely in his InfoQ article, Scalability Best Practices: Lessons from eBay. The dogma behind all these discussions and panels was that scaling requires a very rare set of skills, which average developers don't have – and that's why we're still seeing plenty of online system failures. The most recent was the iPhone launch failure.
Does scaling really have to be complex?
Well, if you look at Network Attached Storage as an example, you'll see there are alternatives to the traditional dogma around scaling. With storage systems, we don't really think about scaling that much. More than that – our applications don't even need to be aware of whether they run over a local disk or a network-attached device. We can scale by adding disks, even hot-swapping them in some cases, while our application is still running.
Now imagine what the world would look like if it wasn't that simple – if our applications needed to be aware of what's behind the scenes of these storage devices and had to be rewritten to deal with these scaling issues. It's not that hard to imagine, is it? Most likely we would still be talking about storage-related system failures caused by bad architecture and implementation issues. But we don't have anything to talk about, because storage gave us a level of abstraction that enabled almost everyone, regardless of their skill-set, to deal with scaling without being an expert at it, or even thinking about it much at all.
Can we learn any lessons from NAS about our ability to achieve seamless scalability?
Let's see what conditions made seamless scaling with storage possible:
1. A well-defined interface (or abstraction)
2. An interface that fits the share-nothing approach, making it suitable for scaling
3. A simple interface
4. A widely-used interface
Now if we examine these criteria as they apply to other layers of the application stack, we get a decent answer as to why we haven't been able to bring the same level of seamless scaling that storage already provides to those layers.
In the data layer, the most commonly used interface is SQL. SQL fits criteria (1) and (4) well but doesn't meet (2) and (3). The HashTable interface fits (2) and (3) well but unfortunately is less commonly used in distributed systems. JavaSpaces, like HashTable, fits (2) and (3) but is even less commonly used than HashTable. In the messaging tier, JMS fits (1), (3) and (4) well but doesn't lend itself well to (2), and so on. And these are the cases where there is a well-defined standard; in other layers of our applications it's even harder to find a well-defined standard that fits all of these criteria.
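To make criterion (2) concrete, here is a minimal sketch of why a key/value interface fits the share-nothing approach so naturally: every operation carries a key, so a client can route each call to exactly one partition. The class and names here are purely illustrative, not any specific product's API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A key/value interface routes every call to exactly one partition,
// so partitions share nothing and could live on separate machines.
class PartitionedMap {
    private final List<Map<String, String>> partitions = new ArrayList<>();

    PartitionedMap(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    // Deterministic routing: the same key always lands on the same partition.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    void put(String key, String value) {
        partitions.get(partitionFor(key)).put(key, value);
    }

    String get(String key) {
        return partitions.get(partitionFor(key)).get(key);
    }
}
```

Because every operation names its key, adding partitions is a routing concern rather than an application concern. An SQL join, by contrast, may need to touch rows on every partition at once – which is exactly why criterion (2) is hard for SQL.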
To overcome this complexity, there have been attempts to use JVM bytecode as the lowest common denominator and introduce seamless scaling not at the middleware API level but at the JVM level, using bytecode manipulation. This seems like an elegant solution to the problem; however, most existing distributed systems were not written as standalone Java applications that get distributed by some sort of magic, so it fails mainly on criterion (4) – it fits mainly new applications that were designed with certain assumptions in mind about how standalone Java code would behave in a distributed environment.
Now to the point – can we scale seamlessly?
Those who expect a simple yes-or-no answer to this question are going to be disappointed – there is no clear answer, because it depends on the specific application scenario, the way the application was written, and the maturity of the various standards around these applications.
In general I would say that Java-framework-based applications are in better shape than applications based on other frameworks, due to the maturity of the standards and the advanced abstraction layers that are now available as part of frameworks such as Spring and Mule.
Seamless scaling at the application layer would most likely mean the ability to plug different underlying scalable implementations into the middleware layer (data, messaging, business logic, presentation). The use of abstraction layers such as IoC in Spring/Mule and the new EJB3 abstractions gives more freedom to plug in different implementations that don't necessarily conform to the exact same standard API. That means your code can remain intact when you plug in a different messaging implementation, for example – whether it is a JMS implementation, space-based messaging, or remoting.
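A minimal sketch of that idea in plain Java: the application codes against a narrow channel abstraction, and the concrete transport is chosen at wiring time. The interface and class names here are hypothetical, invented for illustration:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// The application codes against this narrow abstraction only.
interface MessageChannel {
    void send(String message);
    String receive();            // returns null when the channel is empty
}

// One pluggable implementation; a JMS-backed or space-backed channel
// would implement the same interface and be selected by configuration.
class InMemoryChannel implements MessageChannel {
    private final Queue<String> queue = new ArrayDeque<>();
    public void send(String message) { queue.add(message); }
    public String receive() { return queue.poll(); }
}

// Business logic never names a concrete transport, so swapping the
// implementation is a wiring change, not a code change.
class OrderProcessor {
    private final MessageChannel channel;
    OrderProcessor(MessageChannel channel) { this.channel = channel; }
    void submit(String order) { channel.send(order); }
    String next() { return channel.receive(); }
}
```

In a Spring or Mule application, the constructor argument would come from the IoC container configuration, which is precisely what makes the swap invisible to `OrderProcessor`.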
Some cases are going to be easier than others. For example, taking a SessionBean and scaling it by running multiple instances of that service over a pool of machines, while viewing them all as if they were a single server, can be done through configuration changes only. We can do pretty much the same thing at the messaging layer, where we have a virtual queue and topic rather than a centralized server.
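The "viewing them all as a single server" idea can be sketched as a dispatcher that implements the same business interface as the pooled instances and round-robins calls across them. This is an illustrative stand-in, with invented names, for what a container or proxy layer would do behind the scenes:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// A stateless service contract, analogous to a SessionBean's business interface.
interface QuoteService {
    double quote(String symbol);
}

// Presents a pool of instances as if they were a single server:
// callers see one QuoteService while calls round-robin across the pool.
class VirtualizedQuoteService implements QuoteService {
    private final List<QuoteService> pool;
    private final AtomicInteger next = new AtomicInteger();

    VirtualizedQuoteService(List<QuoteService> pool) { this.pool = pool; }

    public double quote(String symbol) {
        QuoteService target = pool.get(Math.floorMod(next.getAndIncrement(), pool.size()));
        return target.quote(symbol);
    }
}
```

Because the virtualized service implements the same interface, growing the pool from one machine to many is a configuration change from the caller's point of view.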
At the data layer things are trickier, as most of the commonly used standards in this area don't fit criterion (2) very well. If our data model is built as a complex object graph, or if our queries depend on complex joins, then we're not going to be able to scale it out without changes to the code or to the domain model. But even in these more difficult cases, it's possible to minimize the scope of change by using the DAO pattern, declarative transactions, and annotations as a mapping layer on top of the domain model. This means that even if the change can't be completely seamless, it will nevertheless be quite simple to achieve.
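The DAO pattern's contribution to this is simply confinement: all persistence goes through one interface, so the backing store can change without touching business code. A minimal sketch, with hypothetical names:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// All persistence goes through this interface, so swapping the backing
// store (database, in-memory grid, partitioned cache) touches only the DAO.
interface OrderDao {
    void save(String id, String order);
    String findById(String id);
}

// An in-memory implementation; a JDBC-backed or space-backed DAO would
// implement the same interface without changing any of its callers.
class InMemoryOrderDao implements OrderDao {
    private final Map<String, String> store = new ConcurrentHashMap<>();
    public void save(String id, String order) { store.put(id, order); }
    public String findById(String id) { return store.get(id); }
}
```

Moving from a centralized database to a partitioned store then means writing one new DAO implementation, not rewriting the application.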
Learning from the GigaSpaces experience
At this point I’d like to use our specific
experience at GigaSpaces to describe the methods we used to
enable seamless scaling:
- Use standard APIs, but only when it makes sense. For years we chose not to implement large parts of the JEE standard, such as EJB and Entity Beans, because they didn't fit the scale-out environment and were too tightly bound to the database. The point is that implementing a standard API is not always going to make the transition to a scale-out model seamless, so you should be careful which standard you pick.
- Leverage existing abstractions to plug in implementations based on other APIs or technologies than the ones originally used. We use this principle quite extensively in our OpenSpaces framework – to map our own transaction handlers and Remoting abstraction, to enable seamless scaling of SessionBeans, etc.
- Use annotations for
mapping between different models.
- Use aspects to add new behavior when it makes sense. We use aspects in several cases, such as filters/remoting aspects and security aspects, and we will probably use aspects more to address a more advanced level of serialization.
- Apply tighter integration to specific products/frameworks. A good example is our Spring, Mule and upcoming web-tier integration. This sort of integration enables an end-to-end seamless scaling story that makes the user experience significantly better. On the .NET side, our integration with Office and Excel enables something equivalent.
- Use open source as a tool to open up the framework for extensions and other integration work. This is something we introduced quite recently through our new OpenSpaces.org community site, and it has proven a useful tool, with many extensions already available. GigaSpaces users implemented their own extensions and made them available through the community site; the most recent has been Camel integration.
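To make the aspects item above concrete: one way to add cross-cutting behavior (security checks, filters, call counting) without modifying the target code is a JDK dynamic proxy. This is an illustrative plain-Java stand-in for the AOP machinery a framework would provide, with invented interface and method names:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;

interface PriceService {
    double price(String symbol);
}

class ProxyAspectDemo {
    // Wraps any PriceService with a call-counting aspect, without
    // modifying the target class -- callers still see only PriceService.
    static PriceService withCallCounting(PriceService target, AtomicInteger counter) {
        InvocationHandler handler = (proxy, method, args) -> {
            counter.incrementAndGet();        // the cross-cutting behavior
            return method.invoke(target, args);
        };
        return (PriceService) Proxy.newProxyInstance(
                PriceService.class.getClassLoader(),
                new Class<?>[] { PriceService.class },
                handler);
    }
}
```

Because the aspect is applied at wiring time, the same technique can layer in remoting, filtering, or security behavior while the application code remains untouched.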
Real life examples
Of course, this isn’t just a theoretical discussion – we’ve been attempting to achieve this level of seamless scaling in practice since we
introduced our middleware virtualization stack, which was our first attempt to
address scaling of existing applications and not just new applications.
We have been involved in numerous scenarios of scaling out existing applications. An interesting example is detailed in Mickey's recent blog post, in which he describes in detail how he was able to scale out a JBoss/Oracle RAC-based application. Mickey provides a good description with code snippets that show the before and after effects, both in terms of code changes and, obviously, scaling and performance. You can find the details of that experience here. The bottom line of this case study is that he was able to take that application from 15 tx/sec to 1500 tx/sec in less than 4 days! For me, the time it takes to move your EXISTING application and see immediate results is the ultimate measure. You have to agree that if the transition to a scale-out model wasn't seamless, it wouldn't have been possible in such a short time – and, more importantly, without ripping out and replacing the entire application. In Mickey's case, we started by decoupling the database to get the initial scaling, and replaced the other layers incrementally.
Summary
Storage taught us the lesson of seamless scaling. Seamless scaling can be achieved at other layers of our applications as well, using a combination of standard APIs, abstractions, aspects and tailored integration. In most cases, seamless scaling means no changes to our application code, but it does require changes to configuration and packaging. Not all layers can make a fully seamless transition, but in those more difficult cases we can use the same principles to significantly minimize the changes required for scaling.
In this post I wanted to share some of our GigaSpaces experience in this area, as I believe many of the lessons and principles are fairly generic and can be applied to any project or product. It's also important to note that this is not a one-off proposition; it's a continuous effort and requires a long-term roadmap and commitment. We've been working toward this goal for years and have applied every possible method to achieve it. Some required significant refactoring of our entire infrastructure. The latest step has been the addition of our OpenSpaces framework as an open source development framework based on Spring. With this change, we can easily support more APIs and frameworks, as well as build an entire ecosystem around it that will enable others to apply the same model to even more frameworks and applications very easily.
You may wonder why we, as a commercial company, would want to do this – after all, it also means that GigaSpaces can be replaced much more easily. Well, the reason is fairly simple: we believe that our success and adoption will be much greater if we can get to the point where scaling any application through GigaSpaces won't require any changes to code. It took a few years and an intensive effort to get to a point where I can feel comfortable using the two words "Seamless Scaling". Now we're starting to see the fruits of that effort – just see the recent post by Seon Lee, who appears to be one of the Mule users: Mule 2.0 + GigaSpaces 6.5 = Pure Sex:
Gigaspaces released 6.5 with API integration with Mule 2.0 … this is just plain awesome. You can use Gigaspaces as the transport (e.g. in place of JMS) and quickly get a SBA up and running utilizing the same concepts I used at RHG when we were servicing B2B
problems. You also get the advantage of the clustering ability and
fault tolerance that comes with Gigaspaces – which is just pure sex –
not to mention all the other great features that come with this
advanced Javaspaces implementation (i.e. management tools, monitoring
tools, data partitioning, performance features like batching).
I expect to see even more along those lines with our latest 6.6 release, which includes seamless scaling of web applications – check it out!