In this post I’ll cover the difference between multi-core concurrency that is often referred to as Scale-Up and distributed computing that is often referred to as Scale-Out model.
The Difference Between Scale-Up and Scale-Out
One of the common ways to best utilize multi-core architecture in a context of a single application is through concurrent programming. Concurrent programming on multi-core machines (scale-up) is often done through multi-threading and in-process message passing also known as the Actor model.Distributed programming does something similar by distributing jobs across machines over the network. There are different patterns associated with this model such as Master/Worker, Tuple Spaces, BlackBoard, and MapReduce. This type of pattern is often referred to as scale-out (distributed).
Conceptually, the two models are almost identical as in both cases we break a sequential piece of logic into smaller pieces that can be executed in parallel. Practically, however, the two models are fairly different from an implementation and performance perspective. The root of the difference is the existence (or lack) of a shared address space. In a multi-threaded scenario you can assume the existence of a shared address space, and therefore data sharing and message passing can be done simply by passing a reference. In distributed computing, the lack of a shared address space makes this type of operation significantly more complex. Once you cross the boundaries of a single process you need to deal with partial failure and consistency. Also, the fact that you can’t simply pass an object by reference makes the process of sharing, passing or updating data significantly more costly (compared with in-process reference passing), as you have to deal with passing of copies of the data which involves additional network and serialization and de-serialization overhead.
Choosing Between Scale-Up and Scale-Out
The most obvious reason for choosing between the scale-up and scale-out approaches is scalability/performance. Scale-out allows you to combine the power of multiple machines into a virtual single machine with the combined power of all of them together. So in principle, you are not limited to the capacity of a single unit. In a scale-up scenario, however, you have a hard limit -– the scale of the hardware on which you are currently running. Clearly, then, one factor in choosing between scaling out or up is whether or not you have enough resources within a single machine to meet your scalability requirements.
Reasons for Choosing Scale-Out Even If a Single Machine Meets Your Scaling/Performance Requirements
Today, with the availability of large multi-core and large memory systems, there are more cases where you might have a single machine that can cover your scalability and performance goals. And yet, there are several other factors to consider when choosing between the two options:
1. Continuous Availability/Redundancy: You should assume that failure is inevitable, and therefore having one big system is going to lead to a single point of failure. In addition, the recovery process is going to be fairly long which could lead to a extended down-time.
2. Cost/Performance Flexibility: As hardware costs and capacity tend to vary quickly over time, you want to have the flexibility to choose the optimal configuration setup at any given time or opportunity to optimize cost/performance. If your system is designed for scale-up only, then you are pretty much locked into a certain minimum price driven by the hardware that you are using. This could be even more relevant if you are an ISV or SaaS provider, where the cost margin of your application is critical to your business. In a competitive situation, the lack of flexibility could actually kill your business.
3. Continuous Upgrades: Building an application as one one big unit is going to make it harder or even impossible to add or change pieces of code individually, without bringing the entire system down. In these cases it is probably better to decouple your application into concrete sets of services that can be maintained independently.
4. Geographical Distribution: There are cases where an application needs to be spread across data centers or geographical location to handle disaster recovery scenarios or to reduce geographical latency. In these cases you are forced to distribute your application and the option of putting it in a single box doesn’t exist.
Can We Really Choose Between Scale-Up and Scale-Out?
Choosing between scale-out/up based on the criteria that I outlined above sound pretty straightforward, right? If our machine is not big enough we need to couple a few machines together to get what we’re looking for, and we’re done. The thing is, that with the speed in which network, CPU power and memory advance, the answer to the question of what we require at a given time could be very different than the answer a month later.
To make things even more complex, the gain between scale-up and scale-out is not linear. In other words, when we switch between scale-up and scale-out we’re going to see a significant drop in what a single unit can do, as all of a sudden we have to deal with network overhead, transactions, and replication into operations that were previously done just by passing object references. In addition,we will probably be forced to rewrite our entire application, as the programming model is going to shift quite dramatically between the two models. All this makes it fairly difficult to answer the question of which model is best for us.
Beyond a few obvious cases, choosing between the two options is fairly hard, and maybe even almost impossible.
Which brings me to the next point: What if the process of moving between scale-up and scale-out were seamless — not involving any changes to our code?
I often use storage as an example of this. In storage, when we switch between a single local disk to a distributed storage system, we don’t need to rewrite our application. Why can’t we make the same seamless transition for other layers of our application?
Designing for Seamless Scale-Up/Scale-Out
To get to a point of seamless transition between the two models, there are several design principles that are common to both the scale-out and scale-up approaches.
Parallelize Your Application
1. Decouple: Design your application as a decoupled set of services. “All problems in computer science can be solved by another level of indirection” is a famous quote attributed to Butler Lampson. In this specific context: if your code sets have loose ties to one another, the code is easier to move, and you can add more resources when needed without breaking those ties. In our case, designing an application from a set of services that doesn’t assume the locality of other services is used to enable us to handle a scale-up scenario by routing requests to the most available instance.
2. Partition: To parallelize an application, it is often not enough to spawn multiple thread
s, because at some point they are going to hit a shared contention. To parallelize a stateful application we need to find a way to partition our application and data model so that our parallel units share-nothing with one another.
Enabling Seamless Transitions Between Remote and Local Services
First, I’d like to clarify that the pattern I outlined in this section is intended to enable seamless transition between distributed and local service. It is not intended to make the performance overhead between the two models go away.
The core principle is to decouple our services from things that assume locality of either services or data. Thus, we can switch between local and remote services without breaking the ties between them. The decoupling should happen in the following areas:
1. Decouple the communication: When a service invokes an operation on another service we can determine whether that other service is local or remote. The communication layer can be smart enough to go through more efficient communication if the service happens to be local or go through the network if the service is remote. The important thing is that our application code is not going to be changed as a result.
2. Decouple the data access: Similarly, we need to abstract our data access to our data service. A simple abstraction would be a distributed hash table, where we could use the same code to point to a local in-memory hash-table or to a distributed version of that update. A more sophisticated version would be to point to an SQL data store where we could have the same SQL interface to point to an in-memory data store or to a distributed data-store.
Packaging Our Services for Best Performance and Scalability
Having an abstraction layer for our services and data brings us to the point where we could use the same code whether our data happens to be local or distributed. Through decoupling, the decision about where our services should live becomes more of a deployment question, and can be changed over time without changing our code.
In the two extreme scenarios, this means that we could use the same code to do only scale-up by having all the data and services collocated, or scale-out by distributing them over the network.
In most cases, it wouldn’t make sense to go to either of the extreme scenarios, but rather to combine the two. The question then becomes at what point should we package our services to run locally and at what point should we start to distribute them to achieve the scale-out model.
To illustrate, let’s consider a simple order processing scenario where we need to go through the following steps for the transaction flow:
1. Send the transaction
2. Validate and enrich the transaction data
3. Execute it
4. Propagate the result
Each transaction process belongs to a specific user. Transactions of two separate users are assumed to share nothing between them (beyond reference data which is a different topic).
In this case, the right way to assemble the application in order to achieve the optimal scale-out and scale-up ratio would be to have all the services that are needed for steps 1-4 collocated, and therefore set up for scale-up. We would scale-out simply by adding more of these units and splitting both the data and transactions between them based on user IDs. We often refer to this unit-of-scale as a processing unit.
To sum up, choosing the optimal packaging requires:
1. Packaging our services into bundles based on their runtime dependencies to reduce network chattiness and number of moving parts.
2. Scaling-out by spreading our application bundles across the set of available machines.
3. Scaling-up by running multiple threads in each bundle.
The entire pattern outlined in this post is also referred to as Space Based Architecture. A code example illustrating this model is available here.
Today, with the availability of large multi-core machines at significantly lower price, the question of scale-up vs. scale-out becomes more common than in earlier years.
There are more cases in which we could now package our application in a single box to meet our performance and scalability goals.
A good analogy that I have found useful to understanding where the industry is going with this trend is to compare disk drives with storage virtualization. Disk drives are a good analogy to the scale-up approach, and storage virtualization is a good analogy to the scale-out approach. Similar to the advance in multi-core technology today, disk capacity has increased significantly in recent years. Today, we have xTB data capacity on a single disk.
PC hard disk capacity (in GB).The plot is logarithmic,
so the fitted line corresponds to exponential growth
Interestingly enough, the increase in capacity of local disks didn’t replaced the demand for storage, quite the contrary. A possible explanation is that while single-disk capacity doubled every year, the demand for more data grew at a much higher rate as indicated in the following IDC report:
Market research firm IDC projects a 61.7% compound annual growth rate (CAGR) for unstructured data in traditional data centers from 2008 to 2012 vs. a CAGR of 21.8% for transactional data.
Another explanation to that is that storage provides functions such as redundancy, flexibility and sharing/collaboration. Properties that a single disk drive cannot address regardless of its capacity.
The advances with the new multi-core machines will follow similar trends, as there is often a direct correlation between the advance in the capacity of data and the demand for more compute power to manage it, as indicated here:
The current rate of increase in hard drive capacity is roughly similar to the rate of increase in transistor count.
The increased hardware capacity will enable us to manage more data in a shorter amount of time. In addition, the demand for more reliability through redundancy, as well as the need for better utilization through the sharing of resources driven by SaaS/Cloud environments, will force us even more than before towards scale-out and distributed architecture.
So, in essence, what we can expect to see is an evolution where the predominant architecture will be scale-out, but the resources in that architecture will get bigger and bigger, thus making it simpler to manage more data without increasing the complexity of managing it. To maximize the utilization of these bigger resources, we will have to combine a scale-up approach as well.
Which brings me to my final point -– we can’t think of scale-out and scale-up as two distinct approaches that contradict one another, but rather must view them as two complementing paradigms.
The challenge is to make the combination of scale-up/out native to the way we develop and build applications. The Space Based Architecture pattern that I outlined here should serve as an example on how to achieve thi