Friday, May 6, 2011

Why Riak?

I got my first understanding of NoSQL databases from reading the excellent Dynamo paper. It is well written and explains many of the reasons for using key-value databases, as well as how they are designed. It also contains technical details that are good to know for users of Riak, since Riak builds on the ideas found in Dynamo.

Looking into NoSQL databases, I started with CouchDB, which looks neat. Still, it does not seem to match everything I want, especially in terms of scalability and fault tolerance. Like Riak, CouchDB is built in Erlang and has a RESTful API.

I briefly looked at HBase, but found the technology too complicated. I have limited knowledge of Linux and want to be able to set up the datastore without too much hassle.

I moved on to testing Cassandra, which, like HBase, seems to scale well. Cassandra is built in Java and uses Thrift as its interface. I managed to install Cassandra and compile the Thrift classes in Smalltalk, but lost interest when I discovered Riak.

Riak is similar to Cassandra, but everything seems a bit simpler: the installation, the data model and the API.
  • The installation is dead simple. Installing Riak took me only a few minutes.
  • The data model is easy to understand. You could say that Cassandra has a richer model, but my overall impression is that Riak is able to support most use cases.
  • Basho's choice of a RESTful API for Riak makes sense to me. I could relatively easily create a Smalltalk interface to do some basic operations.
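
To give an idea of what "basic operations" over a RESTful key-value API can look like, here is a minimal sketch in Python rather than Smalltalk. It is not the interface I wrote. It assumes a Riak node listening on localhost:8098 and the `/riak/<bucket>/<key>` URL layout; newer Riak versions may expect a different path, so treat the base URL as an assumption. The helper names (`object_url`, `build_put`, `build_get`, `build_delete`, `send`) are made up for this illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: a local Riak node on its default HTTP port, using the
# /riak/<bucket>/<key> path layout. Adjust for your own installation.
BASE_URL = "http://localhost:8098/riak"


def object_url(bucket, key, base_url=BASE_URL):
    """Build the URL that addresses one key in one bucket."""
    return f"{base_url}/{quote(bucket, safe='')}/{quote(key, safe='')}"


def build_put(bucket, key, value, base_url=BASE_URL):
    """Prepare a PUT request that stores `value` as a JSON document."""
    body = json.dumps(value).encode("utf-8")
    return urllib.request.Request(
        object_url(bucket, key, base_url),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def build_get(bucket, key, base_url=BASE_URL):
    """Prepare a GET request that fetches the stored object."""
    return urllib.request.Request(object_url(bucket, key, base_url), method="GET")


def build_delete(bucket, key, base_url=BASE_URL):
    """Prepare a DELETE request that removes the object."""
    return urllib.request.Request(object_url(bucket, key, base_url), method="DELETE")


def send(request):
    """Send a prepared request; only works against a running node."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read()
```

With a node running, `send(build_put("users", "alice", {"name": "Alice"}))` followed by `send(build_get("users", "alice"))` would store and fetch one object. Building the request separately from sending it makes it possible to inspect the URL, method and headers without a server.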


Boris Popov said...

So far I'm struggling with a couple of issues with Riak:

- any kind of data store that needs to pass through compliance presents a challenge unless one can find an auditor willing to listen, and not many will put their signature on a technology that isn't backed by Microsoft, Oracle or IBM for applications with a potential for considerable data breaches, regardless of what you put in your application layer
- the lack of any kind of authentication is a big minus for the above, even when proxied by nginx, Apache or whatnot

I'm happy to see further indexing improvements coming online and would really like to see some success stories of folks deploying applications on NoSQL that pass through tough compliance routines.

Juraj Kubelka said...

Would you be so kind as to write more about CouchDB and "... especially in terms of scalability and fault tolerance"? I am exploring these NoSQL databases these days and would like to understand what you mean by that.

Thank you a lot!