How To: RCassandra?

Cassandra is widely adopted around the world because of its scalability. RCassandra is an R package for talking to it.

  • This tutorial assumes that you have Cassandra and R installed and configured correctly.

Why would you want to use a database like Cassandra with R?

I find it very easy to convert raw data into processed data with R. However, there are times when I have a large number of tables to process and not enough memory to keep them all as objects at the same time. So I clean them up and put them into the database one by one. I personally think that most analysts spend the large majority of their time simply turning raw data into nicely formatted, quickly fathomable data, and that this pursuit, mundane as it may be, is an important part of the process for the sake of future analysis.
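
The one-table-at-a-time workflow described above could be sketched roughly like this. The names `process_tables`, `clean_table` and `store` are made up for illustration and are not part of RCassandra; `store` stands in for whatever writes a data frame to the database (for example a small wrapper around RC.write.table, used later in this post). The raw tables are assumed to be CSV files.

```r
# Hypothetical helper: clean and store one table at a time, so that only a
# single table is held in memory at any moment.
process_tables <- function(paths, clean_table, store) {
  for (path in paths) {
    raw  <- read.csv(path, stringsAsFactors = FALSE)  # load one raw table
    tidy <- clean_table(raw)                          # user-supplied cleaning step
    # Name the stored table after the file, without directory or extension.
    store(tools::file_path_sans_ext(basename(path)), tidy)
    rm(raw, tidy)                                     # drop it before the next table
  }
  invisible(length(paths))
}
```

Passing the storage step in as a function keeps the loop independent of the database, so it can be tried out with an in-memory list before pointing it at Cassandra.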

Limitations of RCassandra

RCassandra doesn’t support creating or deleting keyspaces, creating or deleting column families, deleting a row, or deleting a line in a column of data. It also offers far fewer functions than Clojure’s Cassandra package alia, R’s MongoDB packages, etc.

Creating a keyspace and tables in Cassandra’s single-node cluster on localhost

Start the Cassandra command-line client with
cassandra-cli -host 127.0.0.1 -port 9160

First, create a keyspace with
CREATE KEYSPACE rcass with placement_strategy = 'SimpleStrategy' and strategy_options = {replication_factor:1};

Then, to switch to this keyspace, type
USE rcass;

Installation and usage of RCassandra

Install the package with
install.packages("RCassandra")

Load the RCassandra package into your environment with
library(RCassandra)

Now connect to your database with
connect.handle <- RC.connect(host="127.0.0.1", port=9160)

By default Cassandra listens on port 9160, but you can change this to match your configuration.
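
A connection opened with RC.connect can be closed again with RC.close. A minimal sketch of a wrapper that always closes the connection, even when the work in between fails, might look like this. `with_cassandra` is a hypothetical helper, not part of RCassandra; the `connect` and `disconnect` arguments exist only so the wrapper can be tried without a running server.

```r
# Hypothetical wrapper: open a connection, run `work(conn)`, always close.
with_cassandra <- function(work, host = "127.0.0.1", port = 9160L,
                           connect = RCassandra::RC.connect,
                           disconnect = RCassandra::RC.close) {
  conn <- connect(host, port)
  on.exit(disconnect(conn), add = TRUE)  # runs even if `work` throws an error
  work(conn)
}
```

With a server running, `with_cassandra(function(conn) RCassandra::RC.cluster.name(conn))` would be one way to use it.
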
To show the cluster name, type at your prompt
RC.cluster.name(connect.handle)
[1] "Test Cluster"

To list the keyspaces, type
RC.describe.keyspaces(connect.handle)

This shows a list that includes an entry for your keyspace. To describe a single keyspace, use
RC.describe.keyspace(connect.handle, 'rcass')
$name
[1] "rcass"
$strategy_class

Using R’s datasets library to fill a column family
library(datasets)
head(mtcars, 3)

               mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4     21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710    22.8   4  108  93 3.85 2.320 18.61  1  1    4    1

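Select the keyspace and write the data frame to the cars column family; the row names are used as row keys. Since RCassandra cannot create column families (see the limitations above), cars may need to exist in the rcass keyspace before this call.
RC.use(connect.handle, 'rcass')
RC.write.table(connect.handle, "cars", mtcars)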

To get all columns of a single row, pass its key (for mtcars, the row name):
RC.get.range(connect.handle, "cars", "Datsun 710")

To get specific columns of a single row:
RC.get(connect.handle, "cars", "Datsun 710", c("mpg", "gear", "carb"))

To query a range of keys and a range of columns (with the defaults, all of them):
cars_slice <- RC.get.range.slices(connect.handle, "cars")

The cars_slice above is a list, so you can get its elements with
cars_slice[[1]]
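
To work with such a list as a single table, it can be flattened into one long data frame. The sketch below assumes that the list is named by row key and that each element is a data frame with a `key` column (the Cassandra column name) and a `value` column; that shape is an assumption, so check it with `str(cars_slice[[1]])` before relying on it. `slices_to_long` is a made-up helper, and the `mock` list only imitates the assumed shape.

```r
# Hypothetical helper: flatten a list of per-row data frames into one long
# data frame with the columns row_key, column and value.
slices_to_long <- function(slices) {
  pieces <- lapply(names(slices), function(k) {
    cols <- slices[[k]]
    data.frame(row_key = rep(k, nrow(cols)),
               column  = as.character(cols$key),
               value   = as.character(cols$value),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

# Mock data in the assumed shape (not real RCassandra output).
mock <- list(
  "Mazda RX4"  = data.frame(key = c("cyl", "mpg"), value = c("6", "21"),
                            stringsAsFactors = FALSE),
  "Datsun 710" = data.frame(key = c("cyl", "mpg"), value = c("4", "22.8"),
                            stringsAsFactors = FALSE)
)
long <- slices_to_long(mock)
```

The long form keeps every value as a string, which avoids guessing types; converting to numbers can happen afterwards, column by column.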

To read the table into R from the db use:
mycars <- RC.read.table(connect.handle, "cars")
head(mycars)
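
It is worth checking that the table survived the round trip. The rows may come back in a different order than they were written, and the values may come back with a different type, so a plain `identical(mtcars, mycars)` can fail even when the data is fine. A minimal sketch of a more forgiving comparison, assuming an all-numeric table such as mtcars (`same_table` is a made-up name):

```r
# Hypothetical check: do two data frames hold the same numbers once rows and
# columns are put in the same order?
same_table <- function(original, restored, tolerance = 1e-8) {
  if (!setequal(rownames(original), rownames(restored))) return(FALSE)
  if (!setequal(names(original), names(restored))) return(FALSE)
  # Align the restored table with the original by row and column name.
  restored <- restored[rownames(original), names(original), drop = FALSE]
  # Convert every column to numeric, whatever type it came back as.
  to_num <- function(df) {
    unlist(lapply(df, function(col) as.numeric(as.character(col))),
           use.names = FALSE)
  }
  isTRUE(all.equal(to_num(original), to_num(restored), tolerance = tolerance))
}

same_table(mtcars, mtcars[rev(rownames(mtcars)), ])  # TRUE: only the order differs
```

After the read above, `same_table(mtcars, mycars)` would be the call to try.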

Now let’s create a data frame for storing the name, email, password, designation:
employee <- data.frame(name="Mr. Foo", designation="coder",
email="foo@example.com", password="123")

Now to write this data frame into a table in Cassandra:
RC.write.table(connect.handle, "employees", employee)

Now to read this table:
RC.read.table(connect.handle, "employees")

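Insert rows into the table. RC.insert writes one column value per call (key, column name, value), so each row takes one call per column:
RC.insert(connect.handle, "employees", "Mr. Moo", "designation", "tester")
RC.insert(connect.handle, "employees", "Mr. Moo", "email", "moo@example.com")
RC.insert(connect.handle, "employees", "Mr. Moo", "password", "345")
RC.insert(connect.handle, "employees", "Boo", "designation", "HR")
RC.insert(connect.handle, "employees", "Boo", "email", "boo@example.com")
RC.insert(connect.handle, "employees", "Boo", "password", "333")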
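
Assuming RC.insert has the form `RC.insert(conn, c.family, key, column, value)`, as in the RCassandra documentation, writing a whole row means one call per column. A small helper can hide that loop. `insert_row` is a made-up name, and its `insert` argument exists only so the helper can be tried without a server.

```r
# Hypothetical helper: write several columns of one row, one call per column.
insert_row <- function(conn, c.family, key, values,
                       insert = RCassandra::RC.insert) {
  for (column in names(values)) {
    insert(conn, c.family, key, column, values[[column]])
  }
  invisible(key)
}
```

With the connection from this post, a call could look like `insert_row(connect.handle, "employees", "Mr. Moo", list(designation = "tester", email = "moo@example.com", password = "345"))`.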

Now look up the changes made:
RC.read.table(connect.handle, "employees")

1 thought on “How To: RCassandra?”

  1. Hi,

    I just tried your code on the Cassandra VM from PlanetCassandra, using RStudio, and I cannot create a table in Cassandra from R with the command `RC.write.table()`. Have I missed something? Is there anything to do before executing this command?

    Same thing for inserting data with `RC.insert()`…

    Thanks for your answer
    FX Jollois
