Wednesday, January 25, 2012

Basho Riak vs Amazon DynamoDB

Up to this point Amazon had two non-relational data-store offerings, but neither really met the need to store large amounts of structured data:

Amazon S3 - Has unlimited storage of key-value data, but no indexing and no conflict resolution mechanism.
SimpleDB - Good indexing and query capabilities, but has limits on the amount of data that can be stored.

Now Amazon has announced DynamoDB: A scalable, structured database, fully served from the cloud.

With DynamoDB, Basho gets competition from the source that inspired them to make Riak. Still, it does not look like Basho is too concerned. I agree with Basho: DynamoDB might turn out to be good for Basho, since DynamoDB does not directly copy the model of Riak.

DynamoDB has some advantages over Riak, like its "scale-by-pay" model: with DynamoDB, you just make an API call to scale up your database. No new hardware needs to be purchased and deployed. Still, I think Riak has a lot of features that will usually make it a better choice than DynamoDB:
  • Riak allows defining multiple indices per object. DynamoDB has a more limited index model. 
  • Riak allows storing objects larger than DynamoDB's 64 KB limit. This is particularly important when you want to serialize a business object graph and store it as a BLOB in a single object/record in the database. With DynamoDB this approach won't (usually) work; 64 KB is too little to fit a serialized graph. You will be forced to give up serializing, break up your entire graph and map it to multiple records. This means manual mapping between objects and the database. If the domain is complex, I fear we enter another Vietnam of Software Development.
  • DynamoDB only replicates between "availability zones", not "regions". Riak Enterprise allows for replicating between data centers in different regions, e.g. Europe and the US. With Riak you can have your data in both Europe and the US to reduce latency for both regions.
  • You can run Riak on your own hardware, with full control of the data. DynamoDB runs only in Amazon's data centers.
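
To make the 64 KB point concrete, here is a minimal Python sketch that serializes an object graph and checks whether it would fit in a single record. Everything in it is an assumption for illustration: the function names are hypothetical, JSON stands in for whatever serializer you use, and no DynamoDB or Riak client is involved. The byte-chunking fallback is a naive stand-in for the manual mapping described above, and it ignores the space that attribute names and keys take up in a real item.

```python
import json

# The 64 KB item limit cited above; treated here as a plain byte budget.
ITEM_LIMIT_BYTES = 64 * 1024


def serialize_graph(graph):
    """Serialize an object graph (dicts/lists) to compact UTF-8 JSON bytes."""
    return json.dumps(graph, separators=(",", ":")).encode("utf-8")


def fits_in_one_item(blob, limit=ITEM_LIMIT_BYTES):
    """Return True if the serialized blob fits within the size limit."""
    return len(blob) <= limit


def split_into_records(key, blob, limit=ITEM_LIMIT_BYTES):
    """Naively cut the blob into (record_key, chunk) pairs of at most `limit` bytes.

    This is byte chunking, not a domain-aware mapping: the chunks are
    meaningless individually and must be re-joined in order before parsing.
    """
    return [
        (f"{key}#{index}", blob[offset:offset + limit])
        for index, offset in enumerate(range(0, len(blob), limit))
    ]


if __name__ == "__main__":
    # Hypothetical business object: one order with many line items.
    order = {
        "id": "order-1",
        "lines": [{"sku": f"sku-{i}", "note": "x" * 40} for i in range(2000)],
    }
    blob = serialize_graph(order)
    print(len(blob), "bytes; fits in one item:", fits_in_one_item(blob))
    for record_key, chunk in split_into_records("order-1", blob):
        print(record_key, len(chunk))
```

Even this simple fallback means several writes per object and a reassembly step on every read, which is the kind of extra plumbing a larger object size limit lets you avoid.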
