Security has always been one of those topics that we as developers or architects hate to deal with. In our ideal world, security would be dealt with at some higher level in the data-center, and we wouldn’t need to think about it. It turns out, however, that some of the most recent frauds were caused by someone from within the organization. In one such case, the cost to SocGen was $7.7b. Nothing in this incident was a direct result of an application security failure, but it does force us to examine more carefully some of our current assumptions about security. In this post I focus on some of the potential security vulnerabilities that we may be exposed to as we use data-grids and caching more and more within our organizations.
Why Do We Need a Secured Data-Grid?
We add the data-grid/caching layer as a front end to the database to address the database scalability limitation. One of the common assumptions in this sort of model is that the caching layer is an embedded part of a particular application. In other words, the caching layer can only be accessed by that specific application. Based on that assumption, it is fair to say that controlling access to the caching layer is the application’s responsibility. As a result, security was not generally considered to be a critical aspect of many of the existing caching solutions.
However, there are two things that change that assumption:
- Caching becomes a shared service (just like databases of today), that is shared by more than one application. SOA-based applications are a good example of this. SaaS-based applications impose an even bigger challenge in this regard.
- Virtualization and the drive for better utilization force us to think of a more efficient deployment model through sharing of resources. Having a dedicated in-memory cluster per application can be a fairly inefficient model, but on the other hand, running in a shared environment requires a high degree of isolation support by the data-grid service.
Security assumptions and constraints change significantly when you move from a cluster dedicated to one application to a cluster shared between several applications. It is no longer enough to assume that the application is responsible for granting access to its embedded middleware components. In this case, multiple applications that share the same data must have the right level of isolation and access. Securing the shared middleware resources is the key to achieving this level of isolation and control.
In the next section I describe the specific attributes that a secured data-grid must support.
Securing Your Shared Data-Grid/Caching Layer
1. Granting Access Based on Roles and Content
Imagine the following cases:
- Several applications can access the same data-grid service, and be granted different permissions to view each data item, based on the content of the data.
- Different parts of the same application (e.g. purchasing module vs. sales module) may be granted access to different accounts, even though all the account data is stored in the same shared data-grid.
- One part of an application can be granted read/write access, while another part of an application can be granted read-only access to the same data.
This scenario is very similar to what we expect from most databases today. A user is authenticated through a user-name and password, and is then authorized for read-only or read/write access to the shared data. Providing such a level of control is fairly basic and core to any secured data-grid. However, there are cases where we would want to grant access to different parts of an organization, or where customers need to have different privileges for different parts of the shared data. For example, in a call center application, we might have a shared call center service that serves different departments in our organization. In such a case, we would want callers from a certain department to see only the contacts that belong to that department, and not be granted privileges to see or access contacts that belong to other departments. For that we need to provide fine-grained authorization, based on the actual content of the data.
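The call center case can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the API of any particular data-grid product: `SecuredGrid`, `Principal` and `Contact` are hypothetical names, the grid is a plain in-memory dictionary, and authentication is assumed to have already happened. The point is that the grid itself checks both the caller's role (read-only vs. read/write) and the content of each item (its department).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Contact:
    name: str
    department: str


@dataclass(frozen=True)
class Principal:
    # Hypothetical authenticated caller; how it was authenticated is out of scope.
    user: str
    department: str
    can_write: bool = False


class SecuredGrid:
    """Toy in-memory grid that checks the caller on every operation."""

    def __init__(self) -> None:
        self._items: Dict[str, Contact] = {}

    def _may_see(self, who: Principal, item: Contact) -> bool:
        # Content-based rule: callers only see contacts of their own department.
        return item.department == who.department

    def write(self, who: Principal, key: str, item: Contact) -> None:
        # Role-based rule first, then the content-based rule.
        if not who.can_write:
            raise PermissionError(f"{who.user} has read-only access")
        if not self._may_see(who, item):
            raise PermissionError(f"{who.user} may not write {item.department} data")
        self._items[key] = item

    def read_all(self, who: Principal) -> List[Contact]:
        # The filter is applied inside the grid, not left to the application.
        return [c for c in self._items.values() if self._may_see(who, c)]


grid = SecuredGrid()
sales_rw = Principal("alice", "sales", can_write=True)
sales_ro = Principal("bob", "sales")
support_rw = Principal("carol", "support", can_write=True)

grid.write(sales_rw, "c1", Contact("Acme", "sales"))
grid.write(support_rw, "c2", Contact("Globex", "support"))
```

Here `bob` can read the sales contact but cannot write, and neither sales user ever sees the support contact, even though both contacts live in the same shared grid.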
2. Ensuring Isolation in a Multi-Tenant Environment
There are many cases where we do not want to share the same data-grid between different groups of applications, but at the same time we do want to share the physical memory resources. A multi-tenant environment implies that we run multiple services or applications in a shared environment, but keep them logically separated, as if they were running in their own dedicated environment. That level of separation is referred to as isolation. In order to do that, we need an abstraction layer that enables us to provision our data-grid as a set of fine-grained services that can be managed independently, even if they run in the same process. Abstracting the data-grid as fine-grained services enables us to provision the data-grid services on a shared group of machines or even processes (JVMs), and still manage them completely independently.
There are a few components that enable that level of fine-grained isolation:
- Life cycle management – each data-grid can be started or relocated, completely independently of other data-grid services, even if they run on the same process container.
- Class-loader – each data-grid has its own class-loader to ensure that there is no class version conflict between two data-grid services.
In addition to that, there are other levels of isolation, such as ‘zones’ where we can ensure that certain data-grid services never share the same process or even the same machine. At the top level, we can assign a data-grid to specific virtual machines (VMs), which ensures the highest level of isolation, not just at the service level, but also at the operating system and physical resource level.
Providing this range of isolation control, from data-grid services that share the same process up to a completely separate data-grid that runs on its own virtual machine, is critical if we want to ensure best utilization on the one hand, and isolation on the other hand. The degree of isolation that we can provide determines how effectively we can share resources.
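To make the isolation levels concrete, here is a small Python sketch that checks a deployment plan against them. It assumes three hypothetical levels (may share a process, needs its own process, needs its own machine) and made-up service, machine and process names. A real data-grid would enforce this in its provisioning layer, so treat this as an illustration of the rule rather than an implementation.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Tuple


class Isolation(IntEnum):
    SHARED_PROCESS = 1  # may share a process with other services
    OWN_PROCESS = 2     # needs its own process, may share a machine
    OWN_MACHINE = 3     # needs its own (virtual) machine


@dataclass(frozen=True)
class Placement:
    service: str
    machine: str
    process: str
    isolation: Isolation


def violations(plan: List[Placement]) -> List[Tuple[str, str]]:
    """Return pairs of services whose co-location breaks an isolation rule."""
    bad = []
    for i, a in enumerate(plan):
        for b in plan[i + 1:]:
            # The stricter requirement of the two services wins.
            strictest = max(a.isolation, b.isolation)
            same_machine = a.machine == b.machine
            same_process = same_machine and a.process == b.process
            if strictest == Isolation.OWN_MACHINE and same_machine:
                bad.append((a.service, b.service))
            elif strictest == Isolation.OWN_PROCESS and same_process:
                bad.append((a.service, b.service))
    return bad


plan = [
    Placement("orders", "vm1", "jvm1", Isolation.SHARED_PROCESS),
    Placement("catalog", "vm1", "jvm1", Isolation.SHARED_PROCESS),
    Placement("billing", "vm1", "jvm2", Isolation.OWN_PROCESS),
    Placement("payroll", "vm1", "jvm3", Isolation.OWN_MACHINE),
]
```

In this plan, `orders` and `catalog` can legitimately share a process and `billing` is fine in its own process on the same machine, but `payroll` conflicts with all three because it demands a machine of its own.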
3. Transport Level Security
In most data-grid deployments, it is fair to assume that network level security is the responsibility of the data-center, not the data-grid, or even the application. There are cases, however, where the data-grid spans multiple firewall or data-center zones, and has a replication channel between those zones. In such cases, the replication channel can turn out to be a security loophole. The simplest approach to address this challenge is to enable replication over SSL. With SSL, the traffic is encrypted, so anyone who “sniffs” the packets that go over the wire cannot read their content.
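As a rough illustration of what securing the replication channel involves, the sketch below uses Python's standard `ssl` module (SSL's successor, TLS, is what current libraries actually negotiate). The function names, the host and port, and the optional CA file are all hypothetical; how a given data-grid product exposes this setting is product specific and not shown here.

```python
import socket
import ssl
from typing import Optional


def replication_client_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a context for the side that initiates replication.

    ca_file is a hypothetical path to the CA certificate that signed the
    remote zone's certificate; None falls back to the system trust store.
    """
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # Refuse old protocol versions; the default context already requires
    # a valid certificate and checks the host name.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def open_replication_channel(host: str, port: int,
                             ctx: ssl.SSLContext) -> ssl.SSLSocket:
    """Open an encrypted connection to the remote zone's replication endpoint."""
    raw = socket.create_connection((host, port), timeout=10)
    try:
        return ctx.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

Besides encrypting the traffic, the certificate check means the sending zone also verifies that it is really talking to the intended peer, which matters once the channel crosses a firewall boundary.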
The Administrative Challenge
Providing this level of fine-grained security control and multi-tenancy can pose a huge administrative challenge. For example:
1. A super administrator needs to have full access to all data-grid services.
2. An application administrator only needs to be granted access to data-grid services that belong to a particular application.
3. Operational managers need to be granted monitoring (read-only) access, to view how the system is behaving, but don’t need to be granted access to change the data-grid topology or configuration.
These are only a few of the challenges that we have to deal with once we move from a dedicated data-grid to a shared data-grid. This basically means that we can’t assume that all administrative operations on our data-grid are the same. We need to treat operations staff differently than application managers, and they all need to have visibility into, and some level of control over, our shared services. That means that we have to rethink how to manage security when we deal with shared data-grid services.
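The three administrative roles listed above can be expressed as a combination of allowed actions and a scope of applications. The following Python sketch is one possible way to model that; the `Admin` and `Action` types, the role instances and the application names are hypothetical and only meant to show the shape of the check.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import FrozenSet, Optional


class Action(Enum):
    MONITOR = auto()    # read-only view of how the system behaves
    CONFIGURE = auto()  # change topology or configuration


@dataclass(frozen=True)
class Admin:
    name: str
    actions: FrozenSet[Action]
    # None means "all applications"; otherwise only the listed ones.
    applications: Optional[FrozenSet[str]] = None


def is_allowed(admin: Admin, action: Action, application: str) -> bool:
    in_scope = admin.applications is None or application in admin.applications
    return in_scope and action in admin.actions


ALL_ACTIONS = frozenset(Action)

super_admin = Admin("root", ALL_ACTIONS)
crm_admin = Admin("crm-admin", ALL_ACTIONS, frozenset({"crm"}))
ops_manager = Admin("ops", frozenset({Action.MONITOR}))
```

With this model, the application administrator can reconfigure only the `crm` services, while the operations manager can monitor every service but change none of them.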
Not Just Your Data...
The same arguments that I have mentioned here, apply not just to the data-grid, but to any part of the application middleware that is going to become a shared resource. For example, think of the messaging service, or any other part of your application that becomes a service to other applications. This implies that we need some sort of generic way to handle application security that is consistent across the layers of the application.
How Security Reduces Costs
Being able to share data-grid services between multiple applications is a great cost saver. Here is how:
1. Better utilization – we can better utilize the machines that run our data. In other words, we reduce the total number of machines that run our data services. Using fewer machines to run our data-grid services means that we can save all of the costs associated with them, e.g. cooling, maintenance, space etc.
2. Pooling of spare resources – each data-grid needs to run with some level of spare capacity. Sharing of data-grid services enables us to share that spare capacity, and therefore better utilize it amongst different applications.
3. Reduction of the amount of redundant data – one of the ways that organizations dealt with the lack of ability to share their data services effectively, was by maintaining different copies of that same data. That led to redundant copies, as well as a lot of synchronization cost associated with it. Being able to safely share the data, means that we remove a lot of that overhead, and the cost associated with it.
4. Reduced administrative overhead – the data-grid becomes just like a database: instead of having each application group learn how to provision it separately, and go through the cycles of tuning and sizing it separately, we now have all that knowledge and skill set managed by a shared group. This is the same way in which Amazon manages its SimpleDB, SQS and other services – we as users of those services don’t need to worry about how to provision or maintain them.
Caching/data-grids are going through a similar evolution to databases. As with databases, we started by using caching as an embedded service to the application. Now we are in the phase where we need to be able to share the data between multiple applications, or in cases where we don’t want to share the data, we need to be able to share the resources for managing the data, while keeping a high degree of isolation. The demand for this sort of requirement becomes much more common with SOA or SaaS-based applications.
We used to think of security as a defensive move. However, security is not just about protecting our data – security can create the right level of isolation that is a main enabler for achieving better efficiency and cost savings, through sharing of resources, as I pointed out above.
As we approach the next generation of middleware and data-centers, it becomes clear that we cannot move to the next wave of virtualization and cloud computing without a strong security and isolation solution that is built into all layers of our application and middleware.