Gigaspaces.com - Application Server

Gigaspaces blogs

Real Time analytics for Big Data: Facebook's New Realtime Analytics System

Date: July 8, 2011
Author: Nati Shalom

"Recently, I was reading Todd Hoff's write-up on Facebook's real time analytics system. As usual, Todd did an excellent job in summarizing this video from Alex Himel, Engineering Manager at Facebook.

In this first post, I’d like to summarize the case study, and consider some things that weren't mentioned in the summaries. This will lead to an architecture for building your own Real Time Analytics for Big Data that might be easier to implement, using Facebook's experience as a starting point and guide, as well as the experience gathered through recent work with a few GigaSpaces customers. The second post will provide a summary of that new approach as well as a pattern and a demo for building your own Real Time Analytics system..."

 

Read Full Post


Cloudify for Azure - On-Board Enterprise Java Apps to the Azure Cloud in a Snap!

Date: July 6, 2011
Author: Uri Cohen

"For those of you that missed out, we held a Microsoft Azure Live Academy webcast last night titled Easy On-Boarding of Mission-Critical Java Apps to the Azure Cloud, where GigaSpaces Cloudify for Azure was introduced.

Cloudify for Azure is an enterprise Java application platform that is targeted at on-boarding JEE/Spring/big-data apps to Azure without architectural or code changes.

In this session, Nati Shalom (our CTO) and Adi Paz (our EVP marketing) provided an overview that demonstrated how to on-board a JEE app that uses Cassandra as its database to Azure using Cloudify. The live demo shows how to on-board this app to Azure, and how Cloudify provides monitoring, self-healing, and scale-out capabilities for the application services. You can view the full webcast here, or the live demo in the video below..."

 

Read Full Post


Elastic Distributed Risk Analysis Engine

Date: June 30, 2011
Author: Shay Hassidim

"Financial services applications produce reports constantly. Some of them produce reports where the required data is generated overnight via batch processing, while other reports are produced instantly upon user request. This type of processing activity requires fast access to the raw data and the ability to utilize distributed resources available in the local environment or in the cloud..."

 

Read Full Post


Read/write scale without complete re-write

Date: June 13, 2011
Author: Nati Shalom

"In the last week of May I attended one of our Partner events in Stockholm where I presented the convergence of trends in the data scalability world – specifically the transition from NoSQL to NewSQL and the convergence of trends that brings the existing SQL and new NoSQL world much closer together as I noted in a previous post, "YeSQL: An Overview of the Various Query Semantics in the Post Only-SQL World."

Avanza Bank AB

During the event, Ronnie Bodinger, Head of IT at Avanza Bank AB, gave an excellent talk on how they turned their existing online banking application into a new site that was designed for read/write scaling..."

 

 

Read Full Post


Building Cloud Applications the Easy Way Using Elastic Application Platforms

Date: June 5, 2011
Author: Dotan Horovits

Patterns, Guidelines and Best Practices Revisited

"In my previous post I analyzed Amazon’s recent AWS outage and the patterns and best practices that enabled some of the businesses hosted on Amazon’s affected availability zones to survive the outage.

The patterns and best practices I presented are essential to guarantee robust and scalable architectures in general and on the cloud in particular. Those who dismissed my latest post as exaggeration of an isolated incident got affirmation of my statement last week when Amazon found itself apologizing once again after its Cloud Drive service was overwhelmed by unpredictable peak demand for Lady Gaga’s newly-released album (99 cents, who wouldn’t buy it?!) and was rendered non-responsive. This failure to scale up/out to accommodate fluctuating demands raises the scalability concern in the public cloud, in addition to the resilience concern raised in the AWS outage..."

 

Read Full Post


GigaSpaces Citrix integration on top of OpenStack

Date: May 25, 2011
Author: Nati Shalom

"In one of my previous posts (GigaSpaces OpenStack Explained) I made a reference to the joint work that we are doing with Citrix through the integration of our new PaaS API:

The specific integration with NetScaler (load balancer) and XenServer will be achieved through more open interfaces provided through the Citrix/OpenStack contribution, which means that OpenStack users can use those interfaces to plug in any hypervisor or load balancer.

In this post I’d like to elaborate more specifically on the current and planned integration work..."

 

Read Full Post


Retrospect on recent AWS Outage and Resilient Cloud-Based Architecture

Date: May 19, 2011
Author: Dotan Horovits

"According to the television series “Terminator: the Sarah Connor Chronicles”, the Skynet computer system began its attack against humanity on April 21, 2011. Luckily that hasn’t happened (or has it?) but on that very day another predominant computing system provided us with a painful reminder of how much humanity relies on computers to run the world..."

 

Read Full Post


How to scale static data

Date: May 12, 2011
Author: Guy Lubovitch

"Many articles, blogs, and other documents have been written about how to scale your data linearly. To scale your application you need to partition your data across multiple machines, each handling part of the load. If the data can be partitioned correctly, the grid can scale out to include more machines. But not all data can be partitioned. In some use cases, the majority of the application data can be partitioned except for a small part that is required by all partitions. This is usually referred to as static or reference data: a relatively small amount of data that is rarely updated. This static data is used frequently to process the data on each partition. An example of such data is a dictionary. For example, trade systems need a dictionary of symbols to verify that the data they accept is valid, or it can be a set of simple rules that services should follow. Both of these data types change during off-peak hours and take only several megabytes..."
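The idea in the excerpt above (partition the bulk of the data, but keep a full copy of the small reference data on every partition) can be illustrated with a minimal sketch. It is not GigaSpaces code: the `Grid` and `Partition` classes, the method names, and the CRC32-based routing are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from zlib import crc32


@dataclass
class Partition:
    """One partition: its own slice of trades plus a full copy of the reference data."""
    trades: dict = field(default_factory=dict)
    symbols: frozenset = frozenset()


class Grid:
    """Hypothetical grid: trades are partitioned, the symbol dictionary is replicated."""

    def __init__(self, partition_count, symbols):
        # Reference data is small and rarely updated, so every partition gets a copy.
        self.partitions = [
            Partition(symbols=frozenset(symbols)) for _ in range(partition_count)
        ]

    def route(self, trade_id):
        # Stable hash routing: the same trade id always maps to the same partition.
        return crc32(trade_id.encode("utf-8")) % len(self.partitions)

    def write_trade(self, trade_id, symbol, quantity):
        partition = self.partitions[self.route(trade_id)]
        # Validation reads the local copy of the dictionary; no other partition is consulted.
        if symbol not in partition.symbols:
            raise ValueError(f"unknown symbol: {symbol}")
        partition.trades[trade_id] = (symbol, quantity)

    def refresh_symbols(self, symbols):
        # Off-peak update: replace the copy held by every partition.
        snapshot = frozenset(symbols)
        for partition in self.partitions:
            partition.symbols = snapshot
```

The trade-off in this sketch is that every update to the reference data touches all partitions, which is acceptable only because such data is small and changes rarely.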

 

Read Full Post


PaaS on OpenStack

Date: April 27, 2011
Author: Nati Shalom

"In my last post (GigaSpaces OpenStack Explained) I introduced our plan to add support for OpenStack in our platform:

One of the goals for our second-generation PaaS/SaaS enablement platform was to enable smooth migration between different cloud providers. We were able to achieve this goal through the use of our own abstraction (the Scaling Handler) and through the integration with the JClouds project that provides common abstraction to most of the existing cloud providers. With that, we can ensure that any application can be moved from the likes of Amazon to OpenStack or to an organization's own private cloud with zero changes to the application code or configuration.

The only change involved is setting the user/key of the specific cloud.

By adding support for OpenStack, we now enable users to safely move to an OpenStack-based cloud when they're ready and with little effort, yet they gain all the benefits that come with it in terms of cost, openness etc..."

 

Read Full Post

 


GigaSpaces OpenStack Explained

Date: April 11, 2011
Author: Nati Shalom

"One of the major concerns of many IT organizations is cloud vendor lock-in. This concern was expressed recently in "Banks fear cloud vendor lock-in," from IT Wire:

The onset of cloud computing gives vendors the chance to lock customers in to their infrastructure, using proprietary protocols to ensure they’re on the monthly billing cycle as long as possible.

The OpenStack project emerged with a mission to address this concern by creating a community-led open source project enabling any organization to create and offer cloud computing services running on standard hardware."  

Read Full Post


Schema evolution in XAP.NET 8.0.1

Date: April 7, 2011
Author: .Net team

"GigaSpaces XAP.NET 8.0.1 introduces a few cool new features such as SpaceDocument (code name: Docu.NET API) as well as dynamic properties. Roughly speaking, a SpaceDocument is a virtual document-like type that can be written to and read from the space like any other regular object. However, it provides a more dynamic API by keeping its properties in a key-value dictionary instead of fixed properties that are part of the class code. Even though a document is dynamic, it fully supports indexing. Moreover, since 8.0 now supports dynamic addition of indexes, documents also support indexes defined at runtime (indexes can be added via the GigaSpaces management center at runtime)."...
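To make the document model in the excerpt above more concrete, here is a minimal sketch of a document type whose properties live in a dictionary, together with an index that is added at runtime. It does not use the XAP.NET API: `Document`, `DocumentStore`, and all method names are hypothetical, and the sketch assumes property values are hashable.

```python
from collections import defaultdict


class Document:
    """A typed document whose properties live in a dictionary, not in fixed fields."""

    def __init__(self, type_name, properties=None):
        self.type_name = type_name
        self.properties = dict(properties or {})


class DocumentStore:
    """Hypothetical in-memory store standing in for the space."""

    def __init__(self):
        self._documents = []
        self._indexes = {}  # property name -> {value -> [documents]}

    def write(self, document):
        self._documents.append(document)
        # Keep every existing index up to date for the new document.
        for name, index in self._indexes.items():
            if name in document.properties:
                index[document.properties[name]].append(document)

    def add_index(self, name):
        # Index defined at runtime: backfill it from documents already stored.
        index = defaultdict(list)
        for document in self._documents:
            if name in document.properties:
                index[document.properties[name]].append(document)
        self._indexes[name] = index

    def read_by(self, name, value):
        # Use the index when one exists, otherwise fall back to a full scan.
        if name in self._indexes:
            return list(self._indexes[name].get(value, []))
        return [
            d for d in self._documents
            if name in d.properties and d.properties[name] == value
        ]
```

In this sketch two documents of the same type can carry different property sets, and adding an index later changes only how a lookup is served, not its result.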

Read full post 


XAP 8.0.1 is Out!

Date: April 6, 2011
Author: Uri Cohen

"We’ve just released XAP 8.0.1, with a lot of goodies included. 8.0.1 is the first feature and service pack on top of XAP 8.0.0. It includes many enhancements and a few exciting new features. Here’s a short recap:

 Improved Web UI Dashboard with Alerts View: The dashboard view now gives you a single click view of the entire cluster, including alerts on various problematic conditions. The previous view is now available under the topology tab. This is the first stage in the new Web based UI planned for XAP. You can find more details about it here.

 Elastic Deployment for Stateless and Web Processing Units: The elastic deployment model introduced in 8.0 for stateful and data grid only processing units has now been extended to support stateless and web processing units. You can scale web applications and stateless processing units up and down based on CPU, memory or available resources."...

Read full post


Scaling Tomcat with GigaSpaces

Date: March 22, 2011
Author: Guy Lubovitch

"Webserver session objects have existed since the beginning of web programming. Many developers see them as the most convenient way to store session information. While some of this information is temporary and should be deleted once the session expires, the rest should be saved in some sort of persistent storage for the user’s next visit.

In recent years, web development has become more and more challenging: session data size has increased greatly to support the data the user expects to view, for example their friends or favorite articles. In addition, sites need to support a higher number of open sessions while maintaining high availability of the data."...


GigaSpaces New Cloud Platform - Sneak Preview

Date: March 8, 2011
Author: Nati Shalom

"The presentation below provides a sneak preview into one of the cool features of our upcoming Cloud Enabled Application Platform - Universal Service Manager (USM).

The universal service manager enables the handling of the deployment, elasticity, continuous availability, and scalability of existing applications on any cloud or local data center. In this specific presentation, we will demonstrate how to deploy a NoSQL service (in this case, Cassandra) using the USM. We will then scale it, monitor it, handle failure scenarios, and more. All this without writing a single line of code and without any need for prior knowledge of GigaSpaces!

All you need to do is point to your Cassandra directory and write a few shell scripts for handling pre- and post-deployment; the rest is taken care of by the platform."...

Read full post


Data Grid Querying, Revisited

Date: March 8, 2011
Author: Uri Cohen

"There has been a great deal of talk lately about the new EHCache cache querying capabilities and the advantages of real-time analytics through in-memory cache querying. I find that rather odd, since extensive querying and processing capabilities have been around for years with in-memory data grids like GigaSpaces XAP, Oracle Coherence, Gemstone GemFire and more recently Hazelcast and GridGain. So I don’t really understand the big fuss around EHCache finally supporting it….

But that’s actually a great opportunity to revisit some of the work we’ve done in our recent 8.0 release in the context of querying. There are two main features we’ve introduced in 8.0 that take data grid querying to the next level."...

Read full post


Productivity vs. Control tradeoffs in PaaS

Date: March 7, 2011
Author: Nati Shalom

"Gartner recently published an interesting paper: Productivity vs. Control: Cloud Application Platforms Must Split to Win. (The paper requires registration.)

The paper does a pretty good job covering the evolution that is taking place in the PaaS market toward a more open platform and compares the two main categories: aPaaS (essentially a PaaS running as a service) and CEAP (Cloud Enabled Application Platform), which is the *P* out of PaaS that gives you the platform to build your own PaaS in a private or public cloud.

According to Gartner the main split between the two categories is Productivity vs Control:

The cloud application platform markets are splitting to support two different constituencies: mainstream application developments that are focused on fast time to deployment, and advanced projects requiring the full control of the underlying cloud application platform attributes."

Read full post 


Realistic Elastic

Date: March 3, 2011
Author: Shay Hassidim

"When running a system in production, the last thing you want to do is to shut down the system. This could happen when:

- You need to replace one of the machines running the system.

- You need to upgrade one of the machines running the system.

- You need to increase the memory capacity of the system to support more data to be stored in memory.

- You need to increase the CPU power the system needs to consume to process data fast enough."...

Read full post


An interesting note on Google Megastore CAP,..

Date: February 17, 2011
Author: Nati Shalom

"I was reading James Hamilton's coverage on Google Megastore: The Data Engine Behind GAE (a must read!), where he summarizes the main architecture assumption behind Google Megastore.

The thing that caught my eye was the following line:

Support for consistency unusual for a NoSQL database but driven by (what I believe to be) the correct belief that inconsistent updates make many applications difficult to write (see I Love Eventual Consistency but …)

This was later further elaborated upon in one of James' comments below the posting:"...

Read full post


Fun with XStream

Date: February 15, 2011
Author: Joe Ottinger

"XStream unmarshalling is great fun when you’re not working with a fixed schema.

I’ve been working on a quick start document for GigaSpaces’ data grid edition lately, and I’m doing it with the code in the form of tests. This makes writing it really easy (run the tests, make sure it works, if it fails, wash, rinse, repeat), but that’s not what this post is about.

For each test, I clear out the data grid, and then populate it; I then act on the grid in various ways to show operations.

One of the operations I’m testing is a query-by-example facility, where you create an example object (using null for wildcards by default), populating a few fields, then ask the data grid to hand back all matching objects.

However, this means the object hierarchies have to be somewhat similar, shall we say. If you have two branches of objects, it doesn’t work."...
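The query-by-example operation described in the excerpt above can be sketched as follows. This is an illustration only, not the GigaSpaces API: the `Person` class and the `matches` and `read_multiple` functions are hypothetical, and the sketch assumes that a `None` field in the template acts as a wildcard and that a candidate must be of the template's type or a subtype.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Person:
    name: Optional[str] = None
    city: Optional[str] = None
    age: Optional[int] = None


def matches(template, candidate):
    """Return True if the candidate satisfies every non-None field of the template."""
    # Objects from an unrelated branch of the hierarchy never match.
    if not isinstance(candidate, type(template)):
        return False
    for f in fields(template):
        wanted = getattr(template, f.name)
        # None means "any value"; anything else must be equal.
        if wanted is not None and getattr(candidate, f.name) != wanted:
            return False
    return True


def read_multiple(store, template):
    """Hand back all stored objects that match the example object."""
    return [obj for obj in store if matches(template, obj)]
```

For example, `read_multiple(store, Person(city="Oslo"))` would return every stored `Person` whose city is Oslo, regardless of name or age. The type check also shows why the hierarchies have to line up: a template from one branch cannot select objects from another.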

Read Full Post


Using Selenium at GigaSpaces

Date: February 10, 2011
Author: Eli Polonsky

"How we integrated our distributed testing framework with Selenium for automating our Web UI testing

GigaSpaces XAP – as a distributed application platform and data grid – has some unique and interesting testing requirements. I’d like to explain how we use Selenium to test our upcoming web-based administration console in the grid."...

Read Full post