Cassandra Notes 3 – query data

To query Cassandra, you first need to know the following data structures:

1, ColumnOrSuperColumn
This is a “wrapper”: it has two attributes, “column” and “super_column”, and exactly one of them is set to hold the real “Column” or “SuperColumn”. This is convenient because a column family may store either columns or super columns, and one return type can carry both.
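
To make the wrapper idea concrete, here is a minimal plain-Java sketch. It is not the Thrift-generated class: WrapperSketch, SimpleColumn, SimpleSuperColumn and Wrapper are made-up names, and the column names and values are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the Thrift-generated classes.
final class WrapperSketch {

    // A plain column: a name/value pair.
    static final class SimpleColumn {
        final String name;
        final String value;
        SimpleColumn(String name, String value) { this.name = name; this.value = value; }
    }

    // A super column: a name plus a list of plain columns.
    static final class SimpleSuperColumn {
        final String name;
        final List<SimpleColumn> columns;
        SimpleSuperColumn(String name, List<SimpleColumn> columns) {
            this.name = name;
            this.columns = columns;
        }
    }

    // The wrapper: exactly one of the two fields is non-null.
    static final class Wrapper {
        final SimpleColumn column;
        final SimpleSuperColumn superColumn;
        private Wrapper(SimpleColumn c, SimpleSuperColumn sc) { column = c; superColumn = sc; }
        static Wrapper of(SimpleColumn c) { return new Wrapper(c, null); }
        static Wrapper of(SimpleSuperColumn sc) { return new Wrapper(null, sc); }
        boolean isSetSuperColumn() { return superColumn != null; }
    }

    // Checks which side of the wrapper is set, then renders it as text.
    static String describe(Wrapper w) {
        if (w.isSetSuperColumn()) {
            StringBuilder sb = new StringBuilder("superName:" + w.superColumn.name);
            for (SimpleColumn c : w.superColumn.columns) {
                sb.append(' ').append(c.name).append('=').append(c.value);
            }
            return sb.toString();
        }
        return w.column.name + "=" + w.column.value;
    }

    public static void main(String[] args) {
        Wrapper plain = Wrapper.of(new SimpleColumn("weight", "2kg"));
        Wrapper nested = Wrapper.of(new SimpleSuperColumn("toaster",
                Arrays.asList(new SimpleColumn("quantity", "1"), new SimpleColumn("price", "30"))));
        System.out.println(describe(plain));
        System.out.println(describe(nested));
    }
}
```

The caller always checks which side is set before reading it, which is the same pattern the real wrapper asks for with isSetSuper_column().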

2, Column Parent
If we think of Cassandra storage as a tree, then a “Column Parent” describes a “path” to a parent node. It identifies a group of columns by specifying the “Column Family” they are stored in and, more importantly, the “Super Column” they belong to (which can be null).

3, Column Path
Similar to “Column Parent”, this is also a “path”, but it is mostly used to identify a single “Column”. The path looks like Column-Family->Super-Column->Column-Name.
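
The difference between the two paths can be sketched with nested maps. This is only an in-memory model, not the Cassandra client API: PathSketch, columnsUnder and columnAt are hypothetical names, the column names “price” and “quantity” are invented, and the sketch covers only the case where a super column is given.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory model for illustration only; not the Cassandra client API.
final class PathSketch {

    // One row, modelled as columnFamily -> superColumn -> columnName -> value.
    static Map<String, Map<String, Map<String, String>>> sampleRow() {
        Map<String, String> toaster = new LinkedHashMap<>();
        toaster.put("price", "30");
        toaster.put("quantity", "1");

        Map<String, Map<String, String>> shippedPackages = new LinkedHashMap<>();
        shippedPackages.put("toaster33333", toaster);

        Map<String, Map<String, Map<String, String>>> row = new LinkedHashMap<>();
        row.put("ShippedPackages", shippedPackages);
        return row;
    }

    // "Column Parent": column family + super column -> every column under that node.
    static Map<String, String> columnsUnder(Map<String, Map<String, Map<String, String>>> row,
                                            String columnFamily, String superColumn) {
        Map<String, Map<String, String>> family = row.get(columnFamily);
        Map<String, String> columns = (family == null) ? null : family.get(superColumn);
        return (columns == null) ? Collections.<String, String>emptyMap() : columns;
    }

    // "Column Path": column family + super column + column name -> one value, or null if absent.
    static String columnAt(Map<String, Map<String, Map<String, String>>> row,
                           String columnFamily, String superColumn, String columnName) {
        return columnsUnder(row, columnFamily, superColumn).get(columnName);
    }

    public static void main(String[] args) {
        Map<String, Map<String, Map<String, String>>> row = sampleRow();
        System.out.println(columnsUnder(row, "ShippedPackages", "toaster33333"));
        System.out.println(columnAt(row, "ShippedPackages", "toaster33333", "quantity"));
    }
}
```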

4, Slice Range
Still thinking of a tree, a “Slice Range” describes a “branch”: it contains the “start column” and the “end column”. Additionally, you can specify the ordering and the maximum number of columns to return (like ASC/DESC and LIMIT in SQL). Note: use "".getBytes("utf-8") as the NULL value for the start and finish if you don't want to specify them.
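
The four parts of a slice range can be tried out on a sorted map. This is a sketch under assumptions, not the real API: SliceRangeSketch and its methods are hypothetical, column names are plain strings instead of bytes, and it assumes that both bounds are inclusive and that a reversed slice walks from the higher name down to the lower one, which is how I understand the Thrift behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory model for illustration only; not the Cassandra client API.
final class SliceRangeSketch {

    // Returns the names of up to `count` columns between start and finish (both inclusive).
    // An empty string for start or finish means "no bound" on that side.
    static List<String> slice(NavigableMap<String, String> columns,
                              String start, String finish, boolean reversed, int count) {
        // A reversed slice walks the columns from the highest name to the lowest.
        NavigableMap<String, String> view = reversed ? columns.descendingMap() : columns;
        if (!start.isEmpty()) {
            view = view.tailMap(start, true);
        }
        if (!finish.isEmpty()) {
            view = view.headMap(finish, true);
        }
        List<String> names = new ArrayList<>();
        for (String name : view.keySet()) {
            if (names.size() >= count) {
                break; // same role as LIMIT in SQL
            }
            names.add(name);
        }
        return names;
    }

    // Five columns named a..e, kept sorted by name.
    static NavigableMap<String, String> sample() {
        NavigableMap<String, String> columns = new TreeMap<>();
        for (String name : new String[] {"a", "b", "c", "d", "e"}) {
            columns.put(name, "value-" + name);
        }
        return columns;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> columns = sample();
        System.out.println(slice(columns, "", "", false, 100)); // everything, ascending
        System.out.println(slice(columns, "b", "d", false, 100)); // a bounded branch
        System.out.println(slice(columns, "", "", true, 2)); // last two, descending
    }
}
```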

5, Slice Predicate
With a Slice Range, you pick columns that are stored next to each other, from a start column to an end column. What if you want to hand-pick columns? You can use a “Slice Predicate” and provide a list of column names. A Slice Predicate also wraps the Slice Range: you use either the list of column names or the slice range.
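
The “either names or range” choice can be sketched as one small class with two factory methods. Everything here is hypothetical (SlicePredicateSketch, Predicate, byNames, byRange, and the sample column names); it is an in-memory model, and it assumes results come back in sorted column order in both modes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory model for illustration only; not the Cassandra client API.
final class SlicePredicateSketch {

    // Holds EITHER a list of hand-picked names OR a range, never both.
    static final class Predicate {
        final List<String> columnNames; // non-null only in "hand-picked" mode
        final String start;             // range mode; "" means no lower bound
        final String finish;            // range mode; "" means no upper bound
        final int count;                // range mode; maximum number of columns

        private Predicate(List<String> columnNames, String start, String finish, int count) {
            this.columnNames = columnNames;
            this.start = start;
            this.finish = finish;
            this.count = count;
        }

        static Predicate byNames(String... names) {
            return new Predicate(Arrays.asList(names), null, null, 0);
        }

        static Predicate byRange(String start, String finish, int count) {
            return new Predicate(null, start, finish, count);
        }
    }

    // Returns the selected columns as "name=value" strings, in sorted column order.
    static List<String> apply(NavigableMap<String, String> row, Predicate p) {
        List<String> out = new ArrayList<>();
        if (p.columnNames != null) {
            for (Map.Entry<String, String> e : row.entrySet()) {
                if (p.columnNames.contains(e.getKey())) {
                    out.add(e.getKey() + "=" + e.getValue());
                }
            }
            return out;
        }
        NavigableMap<String, String> view = row;
        if (!p.start.isEmpty()) {
            view = view.tailMap(p.start, true);
        }
        if (!p.finish.isEmpty()) {
            view = view.headMap(p.finish, true);
        }
        for (Map.Entry<String, String> e : view.entrySet()) {
            if (out.size() >= p.count) {
                break;
            }
            out.add(e.getKey() + "=" + e.getValue());
        }
        return out;
    }

    static NavigableMap<String, String> sampleRow() {
        NavigableMap<String, String> row = new TreeMap<>();
        row.put("color", "red");
        row.put("price", "30");
        row.put("quantity", "1");
        row.put("weight", "2kg");
        return row;
    }

    public static void main(String[] args) {
        System.out.println(apply(sampleRow(), Predicate.byNames("weight", "color")));
        System.out.println(apply(sampleRow(), Predicate.byRange("p", "r", 100)));
    }
}
```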

6, Key Slice
This one is a wrapper too: it holds a row key together with a list of Columns/SuperColumns. It is mostly used to hold query results.

Query operations
1, Get a single Column or SuperColumn
ColumnOrSuperColumn get(key-space, key, column-path, consistency-level)
key-space: the warehouse (tree root)
key: identifies the “row”
column-path: defines the path from the column family down to a super column or column
consistency-level: how many replicas must respond before the read returns

2, Get a list of Columns which are owned by the same tree node
List<ColumnOrSuperColumn> get_slice(key-space, key, column-parent, slice-predicate, consistency-level)
key-space: the tree root
key: identifies the row
column-parent: the parent node, either a Column-Family or a Column-Family->Super-Column
slice-predicate: either a list of column names or a branch defined by a slice range

3, Multi-Get. This is “get” on steroids
Here we can provide a list of keys instead of a single key, and get one result per key.

4, Multi-Get-Slice. This is “get-slice” on steroids
Again we provide a list of keys, and get one list of columns per key.
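
The relationship between the single-key and multi-key calls can be sketched in memory: the multi-key version simply runs the single-key read for each key and collects the answers in a map. MultiGetSketch, getSlice, multiGetSlice, the second row key and the column values are all hypothetical, and returning an empty list for an unknown key is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory model for illustration only; not the Cassandra client API.
final class MultiGetSketch {

    // The store: rowKey -> (columnName -> value), columns sorted by name.
    static Map<String, Map<String, String>> sampleStore() {
        Map<String, String> first = new TreeMap<>();
        first.put("price", "30");
        first.put("quantity", "1");
        Map<String, String> second = new TreeMap<>();
        second.put("quantity", "2");

        Map<String, Map<String, String>> store = new LinkedHashMap<>();
        store.put("UPS030292020-22243242", first);
        store.put("UPS-HYPOTHETICAL-0001", second); // made-up key for the example
        return store;
    }

    // Single-key read: up to `count` columns of one row, as "name=value" strings.
    static List<String> getSlice(Map<String, Map<String, String>> store, String key, int count) {
        List<String> out = new ArrayList<>();
        Map<String, String> row = store.get(key);
        if (row == null) {
            return out; // unknown key: empty result (assumption of this sketch)
        }
        for (Map.Entry<String, String> e : row.entrySet()) {
            if (out.size() >= count) {
                break;
            }
            out.add(e.getKey() + "=" + e.getValue());
        }
        return out;
    }

    // Multi-key read: the same slice for every key, returned as rowKey -> columns.
    static Map<String, List<String>> multiGetSlice(Map<String, Map<String, String>> store,
                                                   List<String> keys, int count) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (String key : keys) {
            out.put(key, getSlice(store, key, count));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("UPS030292020-22243242", "UPS-HYPOTHETICAL-0001");
        System.out.println(multiGetSlice(sampleStore(), keys, 100));
    }
}
```

The point of the real multi-key calls is to fetch several rows in one request instead of one request per key; the loop above only models the shape of the result.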

Here is the test case to search the package contents from the “ShippedPackages” column family

	@Test
	public void testShippedPackageContents() throws InvalidRequestException, UnavailableException, TimedOutException, TException, UnsupportedEncodingException{
		TTransport transport = null;
		try{
			transport = new TSocket("localhost", 9160);
			TProtocol protocol = new TBinaryProtocol(transport);
			Cassandra.Client client = new Cassandra.Client(protocol);
			transport.open();
			//client.batch_insert("SuperInventory", packageNumber, oneRowPerColumnFamily, ConsistencyLevel.ALL);
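			// The package number is the row key.
			String packageNumber = "UPS030292020-22243242";
			// Empty start and finish mean "no bound": return up to 100 entries in ascending order.
			SlicePredicate slicePredicate = new SlicePredicate();
			SliceRange sliceRange = new SliceRange("".getBytes("utf-8"),"".getBytes("utf-8"),false,100);
			slicePredicate.setSlice_range(sliceRange);
			// The parent is the column family itself (no super column given).
			List<ColumnOrSuperColumn> results = client.get_slice("SuperInventory", packageNumber, new ColumnParent("ShippedPackages",null), slicePredicate, ConsistencyLevel.ONE);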
			System.out.println("found " + results.size());
			for(ColumnOrSuperColumn r:results){
				if(r.isSetSuper_column()){
					SuperColumn sc = r.getSuper_column();
					String superName = new String(sc.getName(),"utf-8");
					System.out.print("superName:" + superName);
					List<Column> attributes = sc.getColumns();
					for(Column attribute:attributes){
						System.out.print(new String(attribute.getName(),"utf-8") + "=" + new String(attribute.getValue(),"utf-8"));
					}
					System.out.println();
				}else{
					Column sc = r.getColumn();

					System.out.println(new String(sc.getName(),"utf-8") + "=" + new String(sc.getValue(),"utf-8"));

				}
			}
		}
		finally{
			// Guard against a failure before the transport was created.
			if(transport != null){
				transport.close();
			}
		}
	}

Also, we can search the shipped packages of an order from the “OrderShippedPackages” column family

	@Test
	public void testOrderShippedPackages() throws InvalidRequestException, UnavailableException, TimedOutException, TException, UnsupportedEncodingException{
		TTransport transport = null;
		try{
			transport = new TSocket("localhost", 9160);
			TProtocol protocol = new TBinaryProtocol(transport);
			Cassandra.Client client = new Cassandra.Client(protocol);
			transport.open();
			//client.batch_insert("SuperInventory", packageNumber, oneRowPerColumnFamily, ConsistencyLevel.ALL);
			String packageNumber = "UPS030292020-22243242";
			SlicePredicate slicePredicate = new SlicePredicate();
			SliceRange sliceRange = new SliceRange("".getBytes("utf-8"),"".getBytes("utf-8"),false,100);
			slicePredicate.setSlice_range(sliceRange);
			List<ColumnOrSuperColumn> results = client.get_slice("SuperInventory", packageNumber, new ColumnParent("ShippedPackages",null/*"toaster33333".getBytes("utf-8")*/), slicePredicate, ConsistencyLevel.ONE);
			System.out.println("found " + results.size());
			for(ColumnOrSuperColumn r:results){
				if(r.isSetSuper_column()){
					SuperColumn sc = r.getSuper_column();
					String superName = new String(sc.getName(),"utf-8");
					System.out.print("superName:" + superName);
					List<Column> attributes = sc.getColumns();
					for(Column attribute:attributes){
						System.out.print(new String(attribute.getName(),"utf-8") + "=" + new String(attribute.getValue(),"utf-8"));
					}
					System.out.println();
				}else{
					Column sc = r.getColumn();

					System.out.println(new String(sc.getName(),"utf-8") + "=" + new String(sc.getValue(),"utf-8"));

				}
			}
		}
		finally{
			// Guard against a failure before the transport was created.
			if(transport != null){
				transport.close();
			}
		}
	}

One thought on “Cassandra Notes 3 – query data”

  1. Hi,

    I have a requirement to use Cassandra in my application. There is one table with a lot of data, and most of the application uses that table. Because of the amount of data, the application's performance degrades when that table is in Oracle.

    So I have decided to use Cassandra for that one table and keep all the other tables in Oracle. A lot of business logic depends on that table.

    Now my question is: can I use Cassandra for a table that has a lot of business logic behind it?

    I am unable to implement many of the WHERE clauses against Cassandra.

    Is there any supporting tool to use Cassandra in an efficient way?

    Please let me know…
    i am in urgency..

    Thanks in advance

    By Mallik
