Lowering the Space Object Footprint - The Binary Serialization Pattern

Summary: Lowering the Space Object Footprint - The Binary Serialization Pattern

Author: Shay Hasidim, Deputy CTO, GigaSpaces
Using XAP:7.0GA
JDK:Sun JDK 1.6
Date: July 2009

Overview

By default, when using the GigaSpace Java API, the space stores space object fields as is. No data compaction or compression is performed while the object is transported across the network or when it is stored within the space. Two exceptions apply:

  • The compressed serialization mode compresses non-primitive fields using the zip utilities.
  • Object data from the C++ and .NET APIs does go through some compaction when sent across the network.

Used with the binary serialization technique, the GigaSpaces serialization API allows you to reduce the memory footprint of space objects stored in the space. This means you can store more space objects per unit of memory.

The basic idea of the binary serialization pattern is simple: you take total control of the format of the space object data, both while it is transported over the network and when it is stored within the space. This technique:

  • Compacts the object payload data when it is transported over the network and when it is stored in memory.
  • Avoids the de-serialization that otherwise takes place when a space object is written to the space from a remote client (for non-primitive fields such as user-defined classes or collection fields).
  • Avoids the de-serialization that otherwise takes place when the object is replicated to the backup space(s).
  • Avoids the serialization that otherwise takes place, at the space side, when an object is read back from the space into the client process.
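
As a rough illustration of the compaction claim, the sketch below compares the number of bytes produced by default Java serialization with the number produced when the same values are written back to back as raw data. It uses only java.io classes as stand-ins and does not measure the in-memory footprint inside a space; the Payload and FootprintComparison classes are hypothetical and are not part of the attached example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical payload holder; the field names only mimic the article's naming.
class Payload implements Serializable {
    private static final long serialVersionUID = 1L;
    Long longFieldA1 = 42L;
    Integer intFieldA1 = 7;
    String stringFieldA1 = "abc";
}

class FootprintComparison {
    // Bytes produced by default Java serialization (class descriptors plus boxed values).
    static int defaultSize(Payload p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(p);
            out.close();
            return bytes.size();
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for in-memory streams
        }
    }

    // Bytes produced when the same values are written back to back as raw data.
    static int packedSize(Payload p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeLong(p.longFieldA1);   // 8 bytes
            out.writeInt(p.intFieldA1);     // 4 bytes
            out.writeUTF(p.stringFieldA1);  // 2-byte length + 3 bytes for "abc"
            out.close();
            return bytes.size();
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for in-memory streams
        }
    }

    public static void main(String[] args) {
        Payload p = new Payload();
        System.out.println("default: " + defaultSize(p) + " bytes");
        System.out.println("packed:  " + packedSize(p) + " bytes"); // 17 bytes
    }
}
```

The gap comes mostly from the class and field descriptors that default serialization writes alongside the values; the packed form carries the values only, so the reader must already know the field order.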

The Basic Flow

With the binary serialization pattern:

  • Before writing the object, you serialize all non-indexed fields (the payload data) into one byte array field using the GigaSpaces serialization API (pack).
  • You serialize and de-serialize all indexed fields as usual (the writeExternal and readExternal implementations write these to, and read them from, the stream).
  • After reading the object from the space, you de-serialize the byte array data (unpack).

When the object is written to the space:

  • The non-indexed fields are compressed and serialized into a single field (as a byte array).
  • All the indexed fields and the byte array are serialized as usual via the writeExternal call.
  • The object arrives in the space and is de-serialized; the indexed fields are stored as usual and the byte array field is stored as is.

When the object is read from the space:

  • The read template undergoes the same steps as an object being written to the space.
  • The matching object is serialized and sent to the client.
  • When the matching object arrives at the client side it is de-serialized, and the byte array data is de-serialized and decompressed (in a lazy manner).
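
One possible reading of "in a lazy manner" is that the packed payload is decoded only when a payload field is first accessed, rather than when the object arrives. The sketch below illustrates that reading under stated assumptions: LazyEntry is a hypothetical class, java.nio.ByteBuffer stands in for the GigaSpaces binary streams, and the example in this article calls unpack explicitly instead.

```java
import java.nio.ByteBuffer;

// Hypothetical entry: one indexed field plus one packed payload field.
class LazyEntry {
    private final Long queryField;  // indexed, readable without unpacking
    private byte[] binary;          // packed payload, as received from the space
    private Long longFieldA1;       // populated only after unpack()
    private boolean unpacked;

    LazyEntry(Long queryField, byte[] binary) {
        this.queryField = queryField;
        this.binary = binary;
    }

    Long getQueryField() { return queryField; }

    // The payload is decoded on first access rather than on arrival.
    Long getLongFieldA1() {
        if (!unpacked) unpack();
        return longFieldA1;
    }

    boolean isUnpacked() { return unpacked; }

    private void unpack() {
        longFieldA1 = ByteBuffer.wrap(binary).getLong();
        binary = null;              // release the packed form once decoded
        unpacked = true;
    }

    public static void main(String[] args) {
        byte[] packed = ByteBuffer.allocate(8).putLong(123L).array();
        LazyEntry e = new LazyEntry(500L, packed);
        System.out.println(e.isUnpacked());      // false
        System.out.println(e.getLongFieldA1());  // 123
        System.out.println(e.isUnpacked());      // true
    }
}
```

With this arrangement a client that only inspects the indexed field never pays the decoding cost.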

The Implementation

Using the binary serialization pattern can drastically reduce the footprint of an object stored within the space. The more fields of the space object you serialize using the GigaSpaces serialization API, the larger the saving compared to the default serialization mode.

The binary serialization pattern involves creating the following methods:

  1. pack method - Packs the object data into one field. Serializes the non-indexed fields into the byte array.
  2. unpack method - Unpacks the object data from that field. De-serializes the non-indexed fields from the byte array.
  3. writeExternal method - Serializes the object data. Required for the Externalizable implementation. Serializes the indexed fields and the byte array.
  4. readExternal method - De-serializes the object data. Required for the Externalizable implementation. De-serializes the indexed fields and the byte array.
  5. checkNulls method - Handles null data for the indexed and byte array fields.
  6. getnulls method - Handles null data for non-indexed fields.

Future versions will generate the binary serialization methods in a transparent manner. See the PackRat project for an example.

BinaryOutputStream and BinaryInputStream

The BinaryOutputStream contains various methods to serialize all of Java's primitive types, their Object wrappers, and their array forms in a compacted mode. BinaryInputStream is its counterpart for de-serialization.

Your pack and unpack methods will use instances of these classes.
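
To make the idea of writing wrappers and arrays in a compacted form concrete, here is a minimal sketch built on java.io.DataOutputStream and DataInputStream. It is not the GigaSpaces BinaryOutputStream or BinaryInputStream API, whose method names and encoding are not shown in this article; the CompactStreams class and all of its methods are hypothetical stand-ins.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Stand-in helpers built on java.io; NOT the GigaSpaces binary stream API.
class CompactStreams {
    // Wrapper value: one presence byte, then the primitive only when non-null.
    static void writeInteger(DataOutputStream out, Integer v) throws IOException {
        out.writeBoolean(v != null);
        if (v != null) out.writeInt(v);
    }

    static Integer readInteger(DataInputStream in) throws IOException {
        return in.readBoolean() ? Integer.valueOf(in.readInt()) : null;
    }

    // Array: element count, then the raw elements.
    static void writeLongArray(DataOutputStream out, long[] a) throws IOException {
        out.writeInt(a.length);
        for (long x : a) out.writeLong(x);
    }

    static long[] readLongArray(DataInputStream in) throws IOException {
        long[] a = new long[in.readInt()];
        for (int i = 0; i < a.length; i++) a[i] = in.readLong();
        return a;
    }

    // Encodes one wrapper value and one array into a byte array.
    static byte[] encode(Integer v, long[] a) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            writeInteger(out, v);
            writeLongArray(out, a);
            out.close();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for in-memory streams
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] packed = encode(null, new long[] {1L, 2L});
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(packed));
        System.out.println(packed.length);            // 21
        System.out.println(readInteger(in));          // null
        System.out.println(readLongArray(in).length); // 2
    }
}
```

A per-value presence byte is the simplest way to handle nulls; the pattern described in this article instead gathers all null flags into one bitmap, which costs one bit per field rather than one byte.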

Example

The attached example uses a space class with 37 fields:

  • 1 Integer field (indexed, used for queries)
  • 12 String fields
  • 12 Long fields
  • 12 Integer fields

In this example the footprint of the default serialization is roughly 2.7 times that of the binary format (2069 / 758 ≈ 2.73 on a 64-bit JVM, 1408 / 525 ≈ 2.68 on a 32-bit JVM):

  • With a 64-bit JVM the regular class consumes 2069 bytes and the binary format class consumes 758 bytes.
  • With a 32-bit JVM the regular class consumes 1408 bytes and the binary format class consumes 525 bytes.

To run this example, copy the example package zip into \GigaSpaces Root\examples\, extract the zip file, and follow the instructions in the readme file.

The Original Space class

Our example involves a space class that will be modified to follow the binary serialization pattern.

The original class includes:

  • One Integer indexed field.
  • 12 String type non indexed fields declared as space class fields
  • 12 Long type non indexed fields declared as space class fields
  • 12 Integer type non indexed fields declared as space class fields
  • Getter and Setter methods for the above fields

The original class looks like this:

@SpaceClass
public class SimpleEntry {

	public SimpleEntry() {
	}
	private Integer _queryField;
	private Long _longFieldA1;

	....

	@SpaceRouting
	@SpaceProperty(index=IndexType.BASIC)
	public Integer get_queryField() {
		return _queryField;
	}

	// getter and setter methods
	public void set_queryField(Integer field) {
		_queryField = field;
	}

	public Long get_longFieldA1() {
		return _longFieldA1;
	}

	public void set_longFieldA1(Long fieldA1) {
		_longFieldA1 = fieldA1;
	}

	// ... getters and setters for the remaining fields
}

The BinaryFormatEntry class

The modified class that implements the Binary serialization pattern includes:

  • The @SpaceClass(includeProperties=IncludeProperties.EXPLICIT) annotation - this lets you control explicitly which fields are space class fields.
  • One indexed field (the query field, exposed as a Long in the listing below).
  • One byte array field declared as a space class field.
  • 12 String type non-indexed fields. These are not space class fields.
  • 12 Long type non-indexed fields. These are not space class fields.
  • 12 Integer type non-indexed fields. These are not space class fields.
  • Getter and setter methods for the above fields.
  • pack and unpack methods and a few helper methods.
  • An Externalizable implementation - the writeExternal and readExternal methods.

The modified class looks like this:

@SpaceClass(includeProperties=IncludeProperties.EXPLICIT)
public class BinaryFormatEntry implements Externalizable {

	public BinaryFormatEntry(){}
	
	private Long       _queryField;   // Long, to match the accessors and the writeLong/readLong calls
	private byte[]     _binary;
	
	private Long       _longFieldA1;
	....
	
	@SpaceRouting
	@SpaceProperty(index=SpaceProperty.IndexType.BASIC)
	public Long getQueryField()
	{
	    return _queryField;
	}
	
	public void setQueryField(Long queryField)
	{
	    _queryField = queryField;
	}
	
	@SpaceProperty()
	public byte[] getBinary() {
		return _binary;
	}
	
	public void setBinary(byte[] _binary) {
		this._binary = _binary;
	}
	
	public Long get_longFieldA1() {
		return _longFieldA1;
	}
	
	public void set_longFieldA1(Long fieldA1) {
		_longFieldA1 = fieldA1;
	}
	...
	
	public void pack(){...}
	public void unpack(){...}
	public void writeExternal(ObjectOutput out){...}
	public void readExternal(ObjectInput in) {...}
	private long getnulls(){...}	
	private short checkNulls() {...}
}

The pack method

The pack method serializes the object's non-indexed data. It is called explicitly before the space write operation.
The method places the data into the byte array field. The null-value indications for all packed fields are stored together in a single field.
The BinaryOutputStream utility class is used to write the binary data into the byte array.

public void pack()
{
    BinaryOutputStream output = new BinaryOutputStream();
    // The bitmap of null fields goes first so that unpack() knows which values to skip.
    long nulls = getnulls();
    output.writeLong(nulls);
    // Only non-null values are written.
    if (_longFieldA1 != null)
        output.writeLong(_longFieldA1);

    // ... etc. for all other compactable fields.

    _binary = output.toByteArray();
    output.close();
}

The unpack method

This method de-serializes the object data by extracting it from the byte array field and populating the fields with their corresponding values. Fields whose value is null are not populated. This method is called after the space read operation. The BinaryInputStream utility class is used to read the binary data and place it into the relevant fields.

public void unpack() {
    BinaryInputStream input = new BinaryInputStream(_binary);
    long nulls = input.readLong();
    int i = 0;

    if ((nulls & 1L << i) == 0)
        _longFieldA1 = input.readLong();
    i++;

    // ... etc. for all other compactable fields.

    input.close();
    _binary = null;
}
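
The following self-contained sketch shows a pack and unpack round trip for a cut-down entry with two payload fields. It assumes java.nio.ByteBuffer as a stand-in for BinaryOutputStream and BinaryInputStream, and the PackedEntry class is hypothetical; the example class in this article packs 36 fields in the same manner.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical two-field entry; ByteBuffer stands in for the GigaSpaces binary streams.
class PackedEntry {
    Long longFieldA1;
    Integer intFieldA1;
    byte[] binary;

    // Bit i is set when payload field i is null.
    private long getnulls() {
        long nulls = 0;
        if (longFieldA1 == null) nulls |= 1L << 0;
        if (intFieldA1 == null)  nulls |= 1L << 1;
        return nulls;
    }

    void pack() {
        ByteBuffer out = ByteBuffer.allocate(8 + 8 + 4); // bitmap + largest possible payload
        out.putLong(getnulls());
        if (longFieldA1 != null) out.putLong(longFieldA1);
        if (intFieldA1 != null)  out.putInt(intFieldA1);
        binary = Arrays.copyOf(out.array(), out.position()); // keep only the bytes written
    }

    void unpack() {
        ByteBuffer in = ByteBuffer.wrap(binary);
        long nulls = in.getLong();
        // Fields must be read in the order pack() wrote them.
        longFieldA1 = (nulls & (1L << 0)) == 0 ? Long.valueOf(in.getLong()) : null;
        intFieldA1  = (nulls & (1L << 1)) == 0 ? Integer.valueOf(in.getInt()) : null;
        binary = null;
    }

    public static void main(String[] args) {
        PackedEntry written = new PackedEntry();
        written.longFieldA1 = 9L;                  // intFieldA1 stays null
        written.pack();
        System.out.println(written.binary.length); // 16: 8-byte bitmap + one long

        PackedEntry read = new PackedEntry();
        read.binary = written.binary;              // as if read back from the space
        read.unpack();
        System.out.println(read.longFieldA1 + " " + read.intFieldA1); // 9 null
    }
}
```

Because null fields contribute no bytes beyond their bit in the bitmap, sparsely populated objects shrink the most.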

The writeExternal method

The writeExternal method serializes the object data into the output stream.
The object data consists of a field indicating which fields are null, the indexed fields, and a byte array field holding all the non-indexed field data (created by the pack method). writeExternal assumes that the pack method was called explicitly before the space write call that triggered it.

public void writeExternal(ObjectOutput out) throws IOException {
    // Bitmap of null fields: bit 0 = _queryField, bit 1 = _binary (see checkNulls).
    short nulls = checkNulls();

    out.writeShort(nulls);
    if (_queryField != null) {
        out.writeLong(_queryField);
    }
    if (_binary != null) {
        out.write(_binary);
    }
}

The readExternal method

The readExternal method essentially performs the opposite of what the writeExternal method does.
This method populates the indexed fields and the byte array field. The remaining fields are populated later, when the unpack method is called.

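public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
    // Bitmap of null fields: bit 0 = _queryField, bit 1 = _binary.
    short nulls = in.readShort();
    int i = 0;

    if ((nulls & 1L << i) == 0)
        _queryField = in.readLong();
    i++;
    if ((nulls & 1L << i) == 0)
    {
        // The stream carries no length for the payload, so this assumes the packed data
        // is at most 500 bytes and is returned by a single read() call.
        byte[] data = new byte[500];
        int len = in.read(data);
        _binary = new byte[len];
        System.arraycopy(data, 0, _binary, 0, len);
    }
}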
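
The listing above reads the payload into a fixed 500-byte buffer with a single read call. A common alternative, shown here only as a sketch and not as the attached example's code, is to write the payload length before the bytes and read it back with readFully, which removes the size limit. The LengthPrefixedEntry class is hypothetical, and standard Java object streams are used purely as a test harness in place of the space.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical variant of the entry that stores the payload length in the stream.
class LengthPrefixedEntry implements Externalizable {
    Long queryField;
    byte[] binary;

    public LengthPrefixedEntry() {}  // Externalizable needs a public no-arg constructor

    public void writeExternal(ObjectOutput out) throws IOException {
        short nulls = 0;
        if (queryField == null) nulls |= 1 << 0;
        if (binary == null)     nulls |= 1 << 1;
        out.writeShort(nulls);
        if (queryField != null) out.writeLong(queryField);
        if (binary != null) {
            out.writeInt(binary.length);  // length prefix
            out.write(binary);
        }
    }

    public void readExternal(ObjectInput in) throws IOException {
        short nulls = in.readShort();
        if ((nulls & (1 << 0)) == 0) queryField = in.readLong();
        if ((nulls & (1 << 1)) == 0) {
            binary = new byte[in.readInt()];
            in.readFully(binary);         // reads exactly binary.length bytes
        }
    }

    // Serializes and de-serializes the entry in memory.
    static LengthPrefixedEntry roundTrip(LengthPrefixedEntry e) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(e);
            out.close();
            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return (LengthPrefixedEntry) in.readObject();
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        LengthPrefixedEntry e = new LengthPrefixedEntry();
        e.queryField = 500L;
        e.binary = new byte[700];         // larger than a fixed 500-byte buffer
        System.out.println(roundTrip(e).binary.length); // 700
    }
}
```

The prefix costs four bytes per object; whether that is acceptable depends on how tight the footprint target is.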

The checkNulls method

This method goes through the indexed fields and the byte array field and records, in a bitmap held in a short, which of them are null.

private short checkNulls() {
    short nulls = 0;
    int i = 0;

    nulls = (short) ((_queryField == null) ? nulls | 1 << i : nulls);
    i++;
    nulls = (short) ((_binary == null) ? nulls | 1 << i : nulls);
    i++;
    return nulls;
}

The getnulls method

This method goes through all the non-indexed fields of the class (those whose data is stored within the byte array) and records, in a bitmap held in a long, which of them are null.

private long getnulls()
{
    long nulls = 0;
    int i=0;


    nulls = ((_longFieldA1 == null)  ? nulls | 1L << i : nulls ) ;
    i++;
    nulls = ((_longFieldB1 == null)  ? nulls | 1L << i : nulls ) ;
    i++;
    ...
    return nulls;
}
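
Both checkNulls and getnulls follow the same bitmap idea: bit i is set when field i is null. The sketch below expresses that idea as a loop over the field values; NullBitmap and its methods are hypothetical helpers, not part of the attached example, which spells out one statement per field.

```java
// Hypothetical helper that builds and queries a null bitmap.
class NullBitmap {
    // Returns a bitmap in which bit i is set when fields[i] is null.
    // A long has room for 64 flags.
    static long of(Object... fields) {
        if (fields.length > Long.SIZE) {
            throw new IllegalArgumentException("at most 64 fields fit in a long");
        }
        long nulls = 0;
        for (int i = 0; i < fields.length; i++) {
            if (fields[i] == null) nulls |= 1L << i;
        }
        return nulls;
    }

    // True when the flag for field i is set.
    static boolean isNull(long nulls, int i) {
        return (nulls & (1L << i)) != 0;
    }

    public static void main(String[] args) {
        long nulls = of(10L, null, "abc", null);
        System.out.println(Long.toBinaryString(nulls)); // 1010
        System.out.println(isNull(nulls, 1));           // true
    }
}
```

A long is enough for the 36 payload fields of the example class, and a short, as used by checkNulls, holds up to 16 flags.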

The Factory method

The example uses a factory method called generateBinaryFormatEntry to create the space object. Once the object has been populated, its pack method is called.

private BinaryFormatEntry generateBinaryFormatEntry(int id){
	BinaryFormatEntry bfe = new BinaryFormatEntry(id, value1, value2 /* , ... */);
	bfe.pack();     // pack is called here, so callers of the factory do not call it themselves
	return bfe;
}

Writing and Reading the Object from the space

The following code snippet illustrates how the binary serialized object is written to the space and read back from it:

GigaSpace _gigaspace;
BinaryFormatEntry testBFE = generateBinaryFormatEntry(500);   // already packed by the factory
_gigaspace.write(testBFE, Lease.FOREVER);
BinaryFormatEntry templateBFE = new BinaryFormatEntry();
templateBFE.setQueryField(500L);   // match on the indexed field; _queryField itself is private
BinaryFormatEntry resBFE = (BinaryFormatEntry)_gigaspace.read(templateBFE, 0);
resBFE.unpack(); // this de-serializes the binary data into the object fields

References

The PackRat project allows you to use the Binary Serialization pattern via simple annotations.
