Apache Cassandra™ is a scalable multi-master database with no single points of failure. The Apache Cassandra Project develops this highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model.
Cassandra is in use at Digg, Facebook, Twitter, Reddit, Rackspace, Cloudkick, Cisco, SimpleGeo, Ooyala, OpenX, and other companies that have large, active data sets. The largest production cluster has over 100 TB of data on over 150 machines. Data is automatically replicated to multiple nodes for fault tolerance. Replication across multiple data centers is supported. Failed nodes can be replaced with no downtime. Every node in the cluster is identical. There are no network bottlenecks. There are no single points of failure.
Cassandra GigaSpaces Mirror
The Cassandra GigaSpaces Mirror implementation allows applications to push long-term data into a Cassandra database asynchronously, without impacting application response time.
The Cassandra Mirror leverages Cassandra CQL and the Cassandra JDBC driver. Every application write or take operation against the IMDG (in-memory data grid) is delegated to the Mirror service, which uses the Cassandra Mirror implementation to execute the corresponding CQL statement and push the changes into the Cassandra database.
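The asynchronous hand-off described above can be illustrated with a minimal write-behind sketch. This is not how XAP implements the Mirror service; the class name WriteBehindSketch, its methods, and the Consumer that stands in for the JDBC call are all hypothetical. It only shows the idea that the caller queues a CQL statement and returns immediately, while a background thread executes the statements in order.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical illustration of write-behind persistence; not the XAP Mirror API.
public class WriteBehindSketch {
    // A single worker thread keeps statements in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // Stand-in for "execute this CQL over JDBC" (an assumption for the sketch).
    private final Consumer<String> cqlExecutor;

    public WriteBehindSketch(Consumer<String> cqlExecutor) {
        this.cqlExecutor = cqlExecutor;
    }

    // Returns immediately; the statement is executed later on the worker thread.
    public void enqueue(String cql) {
        worker.execute(() -> cqlExecutor.accept(cql));
    }

    // Stops accepting work and waits for queued statements to finish.
    // Returns false on timeout or interruption.
    public boolean shutdownAndWait(long timeoutSeconds) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> executed = new CopyOnWriteArrayList<>();
        WriteBehindSketch mirror = new WriteBehindSketch(executed::add);
        mirror.enqueue("DELETE FROM COM_TEST_MYDATA WHERE KEY = 10");
        mirror.enqueue("DELETE FROM COM_TEST_MYDATA WHERE KEY = 11");
        mirror.shutdownAndWait(5);
        executed.forEach(System.out::println);
    }
}
```

In a real deployment the Consumer would wrap a JDBC statement execution against Cassandra; here it only records the statements so the sketch runs without a database.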
How the Cassandra Mirror works
The Cassandra Mirror maps each IMDG object type to a Cassandra column family, where each object is stored as a row and its properties as columns. The IMDG object ID is used as the Cassandra row KEY. The Cassandra database schema can be defined via the CLI or at runtime via CQL.
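As a rough illustration of this mapping, the sketch below turns an object's ID and properties into a CQL INSERT statement in the same quoting style as the statements the example prints. The class CqlInsertSketch and the method buildInsert are hypothetical names, not part of the Mirror's API, and the helper does no escaping, so treat it as a reading aid rather than the actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper showing how object properties could map to a CQL INSERT.
public class CqlInsertSketch {

    // The object ID becomes the row KEY; each property becomes a quoted
    // column name with a quoted value. No escaping is performed.
    public static String buildInsert(String columnFamily, Object key,
                                     Map<String, String> properties) {
        StringJoiner names = new StringJoiner(", ");
        StringJoiner values = new StringJoiner(",");
        names.add("KEY");
        values.add(String.valueOf(key));
        for (Map.Entry<String, String> e : properties.entrySet()) {
            names.add("'" + e.getKey() + "'");
            values.add("'" + e.getValue() + "'");
        }
        return "INSERT INTO " + columnFamily + " (" + names + ") VALUES (" + values + ")";
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the properties in insertion order.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("age", "16");
        props.put("first", "first0");
        props.put("id", "0");
        props.put("last", "last0");
        System.out.println(buildInsert("COM_TEST_MYDATA", 0, props));
        // prints: INSERT INTO COM_TEST_MYDATA (KEY, 'age', 'first', 'id', 'last') VALUES (0,'16','first0','0','last0')
    }
}
```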
Running the Cassandra Mirror
To run the Cassandra Mirror example:
Download GigaSpaces XAP 8.x.
Download the Cassandra Mirror and extract it into the \gigaspaces-xap\examples folder. A new folder called CassandraMirror will be created with the example files.
Connected to: "Test Cluster" on localhost/9160
c0c1a470-938e-11e0-0000-242d50cf1fbc
Waiting for schema agreement...
... schemas agree across the cluster
Authenticated to keyspace: TEST
c103b680-938e-11e0-0000-242d50cf1fbc
Waiting for schema agreement...
... schemas agree across the cluster
This generates a keyspace called TEST and a column family called COM_TEST_MYDATA, which is used to store objects of the space class com.test.MyData.
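The only naming example given here is com.test.MyData becoming COM_TEST_MYDATA. Assuming the rule is simply "replace dots with underscores and upper-case the result" (an inference from that single example, not a documented rule), it could be sketched as follows. The class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical naming helper; the rule is inferred from one example in the text.
public class ColumnFamilyNameSketch {

    // Replaces package separators with underscores and upper-cases the name.
    public static String columnFamilyFor(String className) {
        return className.replace('.', '_').toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(columnFamilyFor("com.test.MyData")); // prints: COM_TEST_MYDATA
    }
}
```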
Run the Cassandra Mirror example.
Move to the \gigaspaces-xap\examples\CassandraMirror folder and run the following on Windows:
> run.bat
or on Linux:
> run.sh
This starts a space cluster and the Mirror, and performs a few space write and take operations. These are delegated to the Cassandra Mirror, which executes Cassandra calls to persist the data. Make sure CASSANDRA_HOME, JSHOMEDIR and GS_HOME are set correctly within the run.bat/run.sh script. Make sure the cassandra.config system property points to the location of the cassandra.yaml file. Make sure the run.sh/run.bat script references the correct Cassandra library names (including the correct version). Before you start the Cassandra server, set the MAX_HEAP_SIZE and HEAP_NEWSIZE variables in /apache-cassandra-0.8.X/conf/cassandra-env.sh.
You will see the space cluster start, then the Mirror start, and later the following output:
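A small preflight check can catch a missing setting before the example is launched. The sketch below is hypothetical and not part of the example package. It assumes the three variables are exported to the process environment and that cassandra.config is passed as a JVM system property; it only reports which names are absent or empty and does not validate the paths.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical preflight check for the settings named in the text.
public class MirrorPreflightSketch {
    static final String[] REQUIRED_ENV = {"CASSANDRA_HOME", "JSHOMEDIR", "GS_HOME"};

    // Returns the names of settings that are missing or blank.
    public static List<String> missingSettings(Map<String, String> env,
                                               String cassandraConfig) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED_ENV) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                missing.add(name);
            }
        }
        if (cassandraConfig == null || cassandraConfig.trim().isEmpty()) {
            missing.add("cassandra.config");
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing =
            missingSettings(System.getenv(), System.getProperty("cassandra.config"));
        System.out.println(missing.isEmpty()
            ? "All settings present"
            : "Missing settings: " + missing);
    }
}
```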
INSERT INTO COM_TEST_MYDATA (KEY, 'age', 'first', 'id', 'last') VALUES (0,'16','first0','0','last0')
...
INSERT INTO COM_TEST_MYDATA (KEY, 'age', 'first', 'id', 'last') VALUES (29,'16','first29','29','last29')
UPDATE COM_TEST_MYDATA SET 'age'='17', 'first'='firstXX0', 'id'='0', 'last'='lastYY0' WHERE KEY=0
...
UPDATE COM_TEST_MYDATA SET 'age'='14', 'first'='firstXX9', 'id'='9', 'last'='lastYY9' WHERE KEY=9
DELETE FROM COM_TEST_MYDATA WHERE KEY = 10
...
DELETE FROM COM_TEST_MYDATA WHERE KEY = 19
INSERT INTO COM_TEST_MYDATA (KEY, 'age', 'first', 'id', 'last') VALUES (1000,'12','first1000','1000','last1000')
...
UPDATE COM_TEST_MYDATA SET 'age'='17', 'first'='first1000', 'id'='1000', 'last'='last1000' WHERE KEY=1000
...
UPDATE COM_TEST_MYDATA SET 'age'='15', 'first'='first1009', 'id'='1009', 'last'='last1009' WHERE KEY=1009
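The UPDATE and DELETE statements in this output follow a regular shape: a space write to an existing object becomes an UPDATE keyed by the object ID, and a take becomes a DELETE for that key. A minimal sketch of builders that reproduce this shape is shown below. The class and method names are hypothetical, no escaping is performed, and this is an illustration of the output format rather than the Mirror's own code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical builders that reproduce the UPDATE and DELETE shapes in the output.
public class CqlUpdateDeleteSketch {

    // Each property becomes a quoted 'name'='value' assignment; the object ID
    // selects the row through the KEY. No escaping is performed.
    public static String buildUpdate(String columnFamily, Object key,
                                     Map<String, String> properties) {
        StringJoiner assignments = new StringJoiner(", ");
        for (Map.Entry<String, String> e : properties.entrySet()) {
            assignments.add("'" + e.getKey() + "'='" + e.getValue() + "'");
        }
        return "UPDATE " + columnFamily + " SET " + assignments + " WHERE KEY=" + key;
    }

    // A take removes the whole row for the given KEY.
    public static String buildDelete(String columnFamily, Object key) {
        return "DELETE FROM " + columnFamily + " WHERE KEY = " + key;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("age", "17");
        props.put("first", "firstXX0");
        props.put("id", "0");
        props.put("last", "lastYY0");
        System.out.println(buildUpdate("COM_TEST_MYDATA", 0, props));
        System.out.println(buildDelete("COM_TEST_MYDATA", 10));
    }
}
```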
View the data within Cassandra using the Cassandra CLI