Tuesday, June 14, 2011

OrientDB - Pure Java NoSQL Datastore

I have been following the so-called 'NoSQL movement' (as if such a movement exists!). In my opinion, it is just another way of storing and retrieving data. And this new (rather different, I am sure the method is not new) way is only suitable for certain type of applications. Not all applications require NoSQL-type storage and retrieval regardless of those who claim it to be. SQL and NoSQL complement each other. They are not rivals. 

Having said that, I was pretty much interested in the concepts. I love trying out new things. While I was planning the design and architecture of a new open source project that I intend to develop (It is not exactly new. I have already started it but now plan to move it to a new architecture - Greenscape), one of the choices I had to make was on the data storage. My application will be highly dynamic with ability of end users to add and remove columns to a database table frequently. Doing it in relational database will incur heavy performance and complexity. After much consideration, I decided that a non-relational database will be the best choice for my application. But this choice is not without its share of challenges:
  1. I loose JPA/Hibernate support which means I may have to come up with an equivalent framework.
  2. I loose portability. NoSQL is non-standard. There is no standard query language. There are no standard protocol. So, tomorrow, I cannot replace the database with a better performing one without rewriting my application.
  3. Few documentation. What are the best practices, design patterns, etc?
In spite of the challenges, I went ahead with my choice. The compelling reason was the ability to create knowledge because of the void.

Now that the decision was made, I had to choose a NoSQL database. I analysed all the popular ones but none fitted my requirements. I had one criteria for selecting a database: I must be able to code in Java. Most available systems were non-Java based which would be a significant issue for a one man project. Even if they had Java interface, the installation, setup, etc. were a tedious process. Having a database developed purely in Java has many advantages:
  1. Easy packaging with other applications
  2. Easy to install and run
  3. Can be embedded
  4. Can run in same or different VM
  5. Easy to debug
  6. Easy to test
After much searching, I came across OrientDB. Voila! That was what I needed! Going through its features only reinforced by belief in it. It is a pure Java solution and very small in size (500KB! 2MB since last release) . It can be embedded as well as deployed in networked mode. It is both schema-less and schema-based! This topped the feature list. It also supports SQL as a query language (Hmm, need to call NoSQL by some other name).

Some of the features taken from its homepage:

  1. SQL
  2. Super fast
  3. Transactional
  4. GraphDB
  5. Web ready (HTTP, REST, JSON)
  6. Everywhere (Pure Java)
  7. Extremely light
  8. Apache License (Here's the money)

End of architecture choice 1 for Greenscape.
Next decision on application framework pending.

3 comments:

  1. As for your comment about "need to all NoSQL by some other name", NoSQL stands for Not Only SQL.

    ReplyDelete
  2. Oops ... wireless keyboard missed a key ... i meant call not all.

    ReplyDelete
  3. Are you using OrientDB right now?

    ReplyDelete