GigaSpaces FAQ

Below is a list of frequently asked questions about GigaSpaces and their answers. If you cannot find the answer you are looking for, feel free to email us at info@gigaspaces.com.

GigaSpaces XAP

GigaSpaces EDG


GigaSpaces XAP

  • How does GigaSpaces XAP fit in with the open source stack?
    The open source movement has created an alternative middleware stack with frameworks such as Spring, Hibernate and Tomcat, as well as other frameworks growing in popularity, such as Mule ESB. Web 2.0 has generated innovation, such as Hadoop (open source framework for distributed processing of large data sets), Lucene (full-text search library) and Compass (equivalent of Hibernate for Lucene), as well as many AJAX frameworks.
    Many companies are already following the strategy of building their applications using an open middleware stack.

    GigaSpaces helps companies take that strategy to the next level by complementing those frameworks with runtime capabilities that address scalability, high-availability, performance, latency, deployment management, monitoring and more.

    The combination of GigaSpaces and such open frameworks creates a Scale-Out Application Server alternative that is compatible with, but independent of, existing application servers. It is not bound to a specific standard, but is still compatible with different platforms that support those same open frameworks, thus avoiding vendor lock-in.

    The OpenSpaces development framework from GigaSpaces is provided with source code, under the Apache 2 license, as part of the product download. It will be available soon as a fully open source project through OpenSpaces.org. GigaSpaces is also in the process of gradually open sourcing large parts of the core GigaSpaces runtime. The core clustering engine will most likely remain closed source for the foreseeable future.

    At the end of the day, GigaSpaces' value is measured by its ability to improve performance, scalability and reliability, not by its API.

  • How is GigaSpaces XAP different from alternative application servers?
    Unlike traditional application servers, GigaSpaces focuses on the run-time and execution capabilities, while supporting many popular development frameworks, such as Spring, Hibernate, Mule, .Net and certain J2EE APIs. With this approach, developers can plug GigaSpaces into existing applications that use these development frameworks and give them an immediate performance, scalability and reliability boost, simply through configuration. There is no need to change code.
    At the run-time level, GigaSpaces provides unlimited linear scalability, extremely low latency and higher levels of reliability than those offered by traditional J2EE application servers.

  • What development frameworks and platforms are already supported by GigaSpaces XAP?
    Fully supported frameworks include Spring, Hibernate, Mule and Lucene.
    All common relational databases can be scaled out seamlessly using a combination of the Persistence as a Service feature and the GigaSpaces In-Memory Data Grid.

    Integration with other frameworks, such as JPA, SimpleDB, Memcached, as well as caching support for Jetty, Tomcat and iBatis, is available through OpenSpaces.org.


  • What if my application already uses a traditional application server?
    If your application currently uses a JEE application server, there are several approaches you can take with GigaSpaces XAP:

    - Integrate GigaSpaces XAP with your existing application server as a caching or distributed execution environment to reduce the load on your app server

    - If you use an open source application server such as Tomcat (and other elements such as Spring and Hibernate), you can leverage GigaSpaces XAP to add reliability and scalability to the stack

    - You can also completely replace your existing JEE app server with GigaSpaces. If your application uses Spring, or certain APIs (such as JMS or JDBC), this will involve minimal code changes. See: How can I migrate an existing application to GigaSpaces XAP.

  • What if my application already uses a caching product, such as Coherence or Terracotta?
    That's fine. As both Coherence and Terracotta support Spring, it's easy to plug them into GigaSpaces XAP and enhance them with messaging capabilities, the SLA-driven container and deployment capabilities. In fact, in many respects this approach works much better than running them within a J2EE container such as WebLogic. This is due to GigaSpaces XAP's "tier-less" approach and virtualization capabilities. The fact that it relies purely on memory for high availability makes it extremely efficient compared to J2EE app servers. In addition, it was designed from the ground up for a scale-out architecture model (horizontal scalability), which helps get more value out of data grid/distributed caching products. Our native support for Spring makes the integration and configuration extremely simple.

  • What is the ROI of GigaSpaces XAP compared to J2EE application servers?
    A benchmark performed by GigaSpaces compared two implementations of an Order Management System application. One was built with WebLogic and Spring, and the other, using the same code, was implemented with GigaSpaces and Spring. The results demonstrated that running with GigaSpaces XAP significantly reduces total cost of ownership, due to the following:

    - Increased server throughput, which reduces the number of servers and software licenses required to handle the same volumes
    - Shorter time-to-market of new products and services due to reduced deployment complexity, reduced testing and tuning cycles and decreased development complexity
    - Increased reliability through better testability and self-healing capabilities
    - Lower maintenance overhead through better manageability
    - A single platform for Java, .Net and C++
    - The same platform across various application types, including transactional, web, real-time analytics, SOA and more

    The benchmark shows that cost savings grow exponentially as scaling requirements grow, because J2EE application servers cannot scale linearly.
    If you would like to receive more details on this benchmark, please contact sales@gigaspaces.com.

  • Does GigaSpaces have any special pricing offers for start-ups?
    Start-ups or individuals that had less than $5 million in revenues during the last 12 months may qualify to receive a free unlimited license of GigaSpaces XAP or EDG. To read more about the program and to apply, please visit the GigaSpaces Start-Up Program.

  • How can I migrate an existing application to GigaSpaces?
    There are several options for migrating existing applications to GigaSpaces XAP and EDG:
    • Deploy an existing Spring application on the GigaSpaces SLA-driven containers. No code changes are required, only packaging changes (no EAR files)
    • POJO session beans can be turned into scalable services seamlessly by plugging them into GigaSpaces' remoting implementation called the Service Virtualization Framework
    • Migrating messaging-oriented middleware (MOM) to the GigaSpaces virtual message bus is seamless, with the use of the JMS API. Mule can also be used to abstract the message flow
    • Migrating the data-tier to the GigaSpaces In-Memory Data Grid (IMDG) is simple, assuming use of DAO and declarative transactions, and there are several API choices, such as Map/JCache or GigaSpaces extensions to the JavaSpaces API
  • My application does not use Spring. Is GigaSpaces XAP still relevant?
    Of course! It just means you may need to make some modifications to your code to benefit from GigaSpaces XAP's performance, scalability and reliability. But as GigaSpaces XAP supports standard APIs, such as JDBC and JMS, and provides integration with solutions such as Hibernate and Mule, the transition should be smooth and involve minimal code changes.

  • Does GigaSpaces XAP support standard APIs?
    GigaSpaces is focused on the runtime execution platform, not the APIs. GigaSpaces XAP allows you to leverage existing APIs, including standard JEE APIs (JDBC, JMS and JCA), and frameworks such as Spring, Mule and Hibernate. Additional abstractions can be achieved through the use of dependency injection, annotations and the OpenSpaces framework -- GigaSpaces' open source development framework.
  • How does GigaSpaces XAP reduce costs?
    This page explains a variety of ways in which GigaSpaces can reduce Total Cost of Ownership and provide a better Return on Investment than traditional approaches. If you would like to learn more, please contact us at sales@gigaspaces.com.

GigaSpaces Enterprise Data Grid (EDG)

  • How does an in-memory data grid improve the performance, scalability and reliability of a relational database (RDBMS)?
    The fundamental problem with both database replication and database partitioning is the reliance on the performance of the file system/disk and the complexity involved in setting up database clusters. No matter how you look at it, file systems are fairly ineffective when it comes to concurrency and scaling. This is pure physics: disk storage suffers severe latency because each data access must go through serialization/de-serialization, as well as mapping from binary format to a usable format. This puts hard limits on latency. In addition, latency is often severely affected by lack of scalability. Together, these two factors mean that file systems, and the databases that rely heavily on them, suffer from limited performance and scalability.

    These database patterns evolved under the assumption that memory is scarce and expensive, and that network bandwidth is a bottleneck. Today, memory resources are abundant and available at a relatively low cost. So is bandwidth. These two facts allow us to do things differently than we used to, when file systems were the only economically feasible option.

    GigaSpaces takes advantage of this by managing data (including transactional data) as objects in-memory and collocated with the application business logic (running within the same process). This significantly reduces latency. It also allows for better scalability, as the data can be easily partitioned across nodes that have no dependency on each other (each processes a sub-set of the data). Finally, reliability is achieved by maintaining active hot backups of each partition, which can take over instantly upon the failure of the primary node (fail-over).
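
    The partitioning idea can be sketched in a few lines of plain Java. This is only an illustration of routing by key hash: the class PartitionedStore and its methods are hypothetical names invented for this example, not part of the GigaSpaces API, and the product's actual routing logic may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: a toy partitioned store, not the GigaSpaces API. */
class PartitionedStore {
    // Each map stands in for one independent in-memory partition.
    private final List<Map<String, String>> partitions = new ArrayList<>();

    PartitionedStore(int partitionCount) {
        if (partitionCount < 1) {
            throw new IllegalArgumentException("need at least one partition");
        }
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    /** Maps a routing key to a partition index; floorMod keeps the result non-negative. */
    int partitionFor(String routingKey) {
        return Math.floorMod(routingKey.hashCode(), partitions.size());
    }

    /** Writes touch exactly one partition, so partitions never coordinate with each other. */
    void write(String routingKey, String value) {
        partitions.get(partitionFor(routingKey)).put(routingKey, value);
    }

    String read(String routingKey) {
        return partitions.get(partitionFor(routingKey)).get(routingKey);
    }

    int sizeOf(int partition) {
        return partitions.get(partition).size();
    }
}
```

    Because every read and write for a given key lands on the same partition, adding partitions spreads the data without introducing dependencies between nodes.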

  • If GigaSpaces EDG synchronizes with a relational database, doesn't that mean that performance is limited?
    No, because:

    • Data is sent from memory to the database asynchronously and in batches
    • Updates to the database are performed in parallel by all partitions
    • Updates to the database are executed on the same machine as the database through the GigaSpaces Mirror Service. This reduces network overhead and allows the system to benefit from optimizations, such as batch operations
    • The database is not used for high availability purposes. This means that in-flight transactions are not stored in the database, only the end result of the business transaction. This, in turn, reduces the number of updates sent to the underlying database. Also, queries don't hit the database, only updates and inserts. All of this combined means that the in-memory data grid (IMDG) acts as a smart buffer to the database. It is common that the number of reads/updates the IMDG receives is 10x higher than the number of hits on the underlying database

    With GigaSpaces EDG, the database and the application are decoupled, enabling more options for optimization. For example, there are scenarios where writing to the database is required to ensure the durability of the data. In this scenario, the data is stored directly in a persistent log (to ensure durability). The log can be updated at a relatively high rate. Data is read from the persistent log back into the database as an off-line operation. With this approach, update rates can easily reach 30,000 to 40,000 per second with a single low-end database instance (such as MySQL). If this is insufficient, database instances can be clustered for faster database access.
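
    The "smart buffer" behaviour described above can be sketched as a small write-behind buffer. This is a minimal illustration under stated assumptions, not the GigaSpaces Mirror Service: WriteBehindBuffer and BatchWriter are hypothetical names, and the flush runs inline when the batch fills up so the example stays deterministic, whereas a real write-behind would flush from a background thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: a toy write-behind buffer, not the GigaSpaces Mirror Service. */
class WriteBehindBuffer {
    /** Hypothetical stand-in for a batched database write. */
    interface BatchWriter {
        void writeBatch(Map<String, String> rows);
    }

    // Pending rows keyed by primary key; insertion order is preserved.
    private final Map<String, String> pending = new LinkedHashMap<>();
    private final BatchWriter writer;
    private final int batchSize;

    WriteBehindBuffer(BatchWriter writer, int batchSize) {
        this.writer = writer;
        this.batchSize = batchSize;
    }

    /** Records an in-memory update; a later update to the same key replaces the pending one. */
    synchronized void update(String key, String value) {
        pending.put(key, value);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Sends everything pending to the database as one batch. */
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        writer.writeBatch(new LinkedHashMap<>(pending));
        pending.clear();
    }
}
```

    In this sketch, three successive updates to the same order followed by one new order reach the database as a single batch of two rows, which is the sense in which only the end result of a business transaction is persisted.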
  • Doesn't asynchronous replication to the database mean that data might be lost in case of failure?
    No, because asynchronous replication refers to the transfer of data between the in-memory data grid (IMDG) and the database. The IMDG, however, maintains in-memory backups that are synchronously updated. If one of the nodes in a partitioned cluster fails before the replication to the underlying database took place, its backup will be able to instantly continue from that exact point.
  • What happens if one of the in-memory data grid partitions fails?
    The backup partition takes over and becomes the primary. GigaSpaces EDG re-directs the failed operation to the hot backup implicitly. This enables a smooth transition of the client application during failure -- as if nothing happened. Each primary node may have multiple backups to further reduce the chance of total failure. In addition, the cluster manager component detects failure and provisions a new backup instance on one of the available machines.
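
    As a rough illustration of the fail-over sequence described above, the following plain-Java sketch keeps a primary copy and one hot backup in step and promotes the backup when the primary is lost. ReplicatedPartition and its methods are hypothetical names for this example only; in the product, failure detection and re-provisioning are handled by the cluster manager, not by application code.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: one partition with a single hot backup, not the GigaSpaces API. */
class ReplicatedPartition {
    private Map<String, String> primary = new HashMap<>();
    private Map<String, String> backup = new HashMap<>();

    /** Synchronous replication: the write returns only after both copies are updated. */
    void write(String key, String value) {
        primary.put(key, value);
        backup.put(key, value);
    }

    String read(String key) {
        return primary.get(key);
    }

    /** Simulates losing the primary: the backup is promoted and a fresh backup is provisioned. */
    void failPrimary() {
        primary = backup;                 // the hot backup becomes the new primary
        backup = new HashMap<>(primary);  // a new backup is populated from the new primary
    }
}
```

    Because the backup is updated before a write returns, a read issued after failPrimary() sees every write that was acknowledged before the failure.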
  • What happens if the database fails?
    The in-memory data grid (IMDG) maintains a log of all updates and can re-play them as soon as the database becomes available again. It is important to note that during this time the system continues to operate unaffected. The end user will not notice this failure!
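
    A minimal sketch of this replay behaviour, assuming a simple ordered log: updates are always accepted in memory, and are removed from the log only once the database has applied them. RedoLog and its Database interface are hypothetical names used for illustration and do not reflect the product's internal recovery mechanism.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative only: a toy redo log, not GigaSpaces' actual recovery mechanism. */
class RedoLog {
    /** Hypothetical stand-in for the database; apply() returns false while it is unavailable. */
    interface Database {
        boolean apply(String update);
    }

    private final Deque<String> log = new ArrayDeque<>();

    /** Always succeeds from the application's point of view: the update is only logged. */
    void record(String update) {
        log.addLast(update);
    }

    /** Replays logged updates in order, stopping at the first failure. Returns the number applied. */
    int replay(Database db) {
        int applied = 0;
        while (!log.isEmpty()) {
            if (!db.apply(log.peekFirst())) {
                break; // database still down: keep this and later entries for the next attempt
            }
            log.removeFirst();
            applied++;
        }
        return applied;
    }

    int pending() {
        return log.size();
    }
}
```

    Calling replay() repeatedly while the database is down changes nothing; once the database is back, the same call drains the log in the original order.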
  • How do I maintain transactional integrity with GigaSpaces EDG?
    GigaSpaces EDG supports the standard two-phase commit protocol and XA transactions. Having said that, this model should be avoided as much as possible due to the fact that it introduces dependency among multiple partitions, as well as creates a single point of distributed synchronization in the system. Using a classic distributed transaction model doesn't take advantage of the full linear scalability potential of the partitioned topology offered by GigaSpaces EDG. Instead, the recommended approach is to break transactions into small, loosely-coupled services, each of which can be resolved within a single partition. Each partition can maintain transaction integrity using local transactions. This model ensures that in partial failure scenarios the system is kept in a consistent state.
  • How is transactional integrity with the database maintained?
    As noted above, distributed transactions might introduce a severe performance and scalability bottleneck, especially if performed with the database as the system of record. In addition, attempting to execute transactions with the database violates one of the core principles behind the GigaSpaces Persistence as a Service (PaaS) approach: asynchronous updates to the database. To avoid this overhead, the GigaSpaces in-memory data grid (IMDG) ensures that transactions are resolved purely in-memory and are sent to the database in a single batch. If the update to the database fails, the system will re-try the operation until it succeeds.
  • What types of queries are supported in GigaSpaces EDG?
    • Template matching (matching objects based on class name, class hierarchy and attribute values)
    • SQL - supports range queries, 'like' semantics, etc.
    • Continuous queries - through a combination of notification and SQL
    • Parallel query (a.k.a. Map/Reduce) - queries that are not designated to a specific partition are automatically broadcast to all partitions and the result is implicitly aggregated on the client side
    • Iterator - iterates through a large result-set of data
    Code snippets of the different query APIs are available here
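
    To give a feel for the first query type, here is a toy version of template matching on attribute values, in which a null field in the template acts as a wildcard. It is a sketch only: the Order class and the matches and readMultiple helpers are hypothetical names for this example, not GigaSpaces or JavaSpaces API calls, and matching on class name and class hierarchy is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a toy version of template matching, not the GigaSpaces or JavaSpaces API. */
class TemplateMatching {
    /** Hypothetical entry type; a null field in a template acts as a wildcard. */
    static class Order {
        final String symbol;
        final Integer quantity;

        Order(String symbol, Integer quantity) {
            this.symbol = symbol;
            this.quantity = quantity;
        }
    }

    /** An entry matches when every non-null template field equals the entry's field. */
    static boolean matches(Order template, Order entry) {
        return (template.symbol == null || template.symbol.equals(entry.symbol))
            && (template.quantity == null || template.quantity.equals(entry.quantity));
    }

    /** Returns all entries in the given collection that match the template. */
    static List<Order> readMultiple(List<Order> space, Order template) {
        List<Order> result = new ArrayList<>();
        for (Order entry : space) {
            if (matches(template, entry)) {
                result.add(entry);
            }
        }
        return result;
    }
}
```

    For example, a template of new Order("IBM", null) selects every IBM order regardless of quantity, while a template with both fields null selects everything.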

  • This model relies heavily on partitioning. How do I handle queries that need to span multiple partitions?
    Aggregated queries are executed in parallel on all partitions. You can combine this model with stored procedure-like queries to perform more advanced manipulations, such as sum and max. See more details below.
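
    The scatter-gather pattern behind such aggregated queries can be sketched with a standard Java thread pool: each partition computes a partial sum over its own data, and the client combines the partial results. ParallelQuery and sumAcrossPartitions are hypothetical names for this illustration, and the in-process lists merely stand in for remote partitions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative only: a toy scatter-gather query, not the GigaSpaces API. */
class ParallelQuery {
    /** Runs a per-partition sum in parallel, then reduces the partial sums on the "client". */
    static long sumAcrossPartitions(List<List<Integer>> partitions) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (final List<Integer> partition : partitions) {
                // "Map" step: each partition aggregates only its own data.
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int value : partition) {
                        sum += value;
                    }
                    return sum;
                };
                partials.add(pool.submit(task));
            }
            // "Reduce" step: the client combines one partial result per partition.
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for partitions", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a partition query failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

    Only one number per partition travels back to the client, so the cost of the final aggregation depends on the number of partitions rather than on the size of the data set.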
  • What about stored procedures and prepared statements?
    Because the data is stored in memory, we avoid the use of a proprietary language for stored procedures. Instead, we can use either native Java/.Net/C++ or dynamic languages, such as Groovy and JRuby, to manipulate the data in memory. The IMDG provides native support for executing dynamic languages, routes the query to where the data resides, and enables aggregation of the results back to the client. A reducer can be invoked on the client-side to execute second level aggregation. A code example that illustrates how this works can be found here
  • Can these prepared statements and stored procedure equivalents be changed without bringing down the data?
    Yes. When you change the script, it is reloaded on the server while the server is up, without bringing down the data. The same capability exists in case you need to refresh collocated services code on the server side.
  • Do I need to change my application code to use GigaSpaces EDG?
    There are cases in which introducing GigaSpaces EDG's in-memory data grid is completely seamless and there are cases in which you will need to go through a re-write, depending on the programming model:

    • Hibernate 2nd level cache
      Nature of integration with GigaSpaces EDG: Seamless
      Comments/limitations: Best fit for read-mostly applications. Limited performance gain as it still heavily relies on the underlying database.

    • JDBC
      Nature of integration with GigaSpaces EDG: Seamless, but limited
      Comments/limitations: SQL commands written to the in-memory data grid are guaranteed to run with other JDBC resources. Doesn't support full SQL 92 and therefore existing applications may require code changes. Recommended for monitoring and administration. Not recommended for application development as it introduces unnecessary O/R mapping complexity.

    • HashMap
      Nature of integration with GigaSpaces EDG: Seamless
      Comments/limitations: Extensions such as timeout and transaction support are available.

    • GigaSpaces Spring DAO
      Nature of integration with GigaSpaces EDG: Partially seamless
      Comments/limitations: Abstracts transaction handling from the code. Domain model is based on POJOs, and therefore, doesn't require explicit changes, only annotations (annotation can be provided through an external XML file). If the application already uses a DAO pattern then it would require changing the DAO. This allows narrowing down the scope of changes required to use an IMDG-specific interface. This option is highly recommended for best performance and scalability.

  • What topologies are supported by GigaSpaces EDG?
    Replicated (synchronous or asynchronous), partitioned, partitioned-with-backup.
    See details here
  • Does code need to be changed when switching from one topology to another?
    No. The topology is abstracted from the application code. The only caveat is that your code needs to be implemented with partitioning in mind, i.e., moving from a central server or a replicated topology to partitioning doesn't require changes to the code as long as your data includes an attribute that acts as a routing index.
  • How are in-memory data grids (IMDG) and Persistence-as-a-Service (PaaS) different from in-memory databases (IMDB)?
    An IMDG allows storing objects in memory while maintaining a relational model. However, using in-memory storage in an IMDG eliminates the need for an object-relational mapping (ORM) layer. In addition, we don't need separate languages to perform data manipulation. We can use the native application code, or dynamic languages.

    Moreover, one of the fundamental problems with in-memory databases is that relational SQL semantics are not geared to deal with distributed data models. For example, an application that runs on a central server and uses statements like Join, which often maintain references among tables, or even uses aggregated queries such as Sum and Max, doesn't map well to a distributed data model. This is why many IMDB implementations only support very basic topologies and often require significant changes to the data schema and application code. This reduces the motivation for using in-memory relational databases, as they lack transparency.

    The GigaSpaces in-memory data grid implementation exposes a JDBC interface and provides SQL query support. Applications can therefore benefit from the best of both worlds: you can read and write objects directly through the GigaSpaces API, query those same objects using SQL semantics, and view and manipulate the entire data set using regular database viewers.

  • Can I use existing Hibernate mapping to map data from the database to the GigaSpaces in-memory data grid (IMDG)?
    Yes. In addition, with GigaSpaces' Persistence-as-a-Service (PaaS) feature, Hibernate mapping overhead is significantly reduced, as most of it happens in the background, during initial load or during the asynchronous update to the database.

    Further information about Hibernate support is available here

  • Can Persistence-as-a-Service (PaaS) be used with .Net or C++ applications?
    Yes. Starting with GigaSpaces 6.5 both Hibernate (Java) and nHibernate (.Net) are supported. C++ applications defer to the default Hibernate implementation. In addition, with GigaSpaces' new integration with Microsoft Excel, .Net users can easily access data in the IMDG directly from their Excel spreadsheets without writing code.