I have been following the so-called 'NoSQL movement' (as if such a movement exists!). In my opinion, it is just another way of storing and retrieving data. And this new way (rather, different; I am sure the method itself is not new) is only suitable for certain types of applications. Not all applications require NoSQL-type storage and retrieval, regardless of what its advocates claim. SQL and NoSQL complement each other. They are not rivals.
Having said that, I was quite interested in the concepts. I love trying out new things. While I was planning the design and architecture of a new open source project that I intend to develop (it is not exactly new; I have already started it but now plan to move it to a new architecture - Greenscape), one of the choices I had to make was the data storage. My application will be highly dynamic, with end users able to add and remove columns in a database table frequently. Doing that in a relational database would incur a heavy cost in performance and complexity. After much consideration, I decided that a non-relational database would be the best choice for my application. But this choice is not without its share of challenges:
- I lose JPA/Hibernate support, which means I may have to come up with an equivalent framework.
- I lose portability. NoSQL is non-standard. There is no standard query language. There is no standard protocol. So, tomorrow, I cannot replace the database with a better performing one without rewriting my application.
- Little documentation. What are the best practices, design patterns, etc.?
In spite of the challenges, I went ahead with my choice. The compelling reason was the chance to create knowledge precisely because of that void.
Now that the decision was made, I had to choose a NoSQL database. I analysed all the popular ones but none fitted my requirements. I had one criterion for selecting a database: I must be able to code in Java. Most available systems were not Java-based, which would be a significant issue for a one-man project. Even where they had a Java interface, the installation, setup, etc. were tedious. Having a database developed purely in Java has many advantages:
- Easy packaging with other applications
- Easy to install and run
- Can be embedded
- Can run in same or different VM
- Easy to debug
- Easy to test
After much searching, I came across OrientDB. Voila! That was what I needed! Going through its features only reinforced my belief in it. It is a pure Java solution and very small in size (500KB! 2MB since the last release). It can be embedded as well as deployed in networked mode. It is both schema-less and schema-based! This topped the feature list. It also supports SQL as a query language (Hmm, need to call NoSQL by some other name).
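To make the "schema-less and schema-based" idea a bit more concrete, here is a minimal plain-Java sketch of a record whose fields can be added and removed at runtime, with an optional set of required fields standing in for a schema. To be clear, this is not OrientDB's API: the `FlexibleRecord` class and all of its method names are hypothetical, invented only to illustrate the concept.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is NOT OrientDB's API.
public class FlexibleRecord {
    // Field name -> value; insertion order is kept so "columns" stay stable.
    private final Map<String, Object> fields = new LinkedHashMap<>();
    // Fields that must be present; an empty set means schema-less.
    private final Set<String> requiredFields;

    public FlexibleRecord(Set<String> requiredFields) {
        this.requiredFields = new HashSet<>(requiredFields);
    }

    // Convenience factory for a record with no schema constraints at all.
    public static FlexibleRecord schemaLess() {
        return new FlexibleRecord(Collections.emptySet());
    }

    // Adds a new "column" or overwrites an existing one.
    public FlexibleRecord set(String name, Object value) {
        fields.put(name, value);
        return this;
    }

    public Object get(String name) {
        return fields.get(name);
    }

    // Drops a "column" from this record only; no table-wide change needed.
    public FlexibleRecord remove(String name) {
        fields.remove(name);
        return this;
    }

    // True when every required field is present.
    public boolean isValid() {
        return fields.keySet().containsAll(requiredFields);
    }

    public Set<String> fieldNames() {
        return Collections.unmodifiableSet(fields.keySet());
    }
}
```

With something like this, an end user adding a column is just a `set(...)` call on a record, whereas a relational table would typically need a schema change for the same thing.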
Some of the features taken from its homepage:
- SQL
- Super fast
- Transactional
- GraphDB
- Web ready (HTTP, REST, JSON)
- Everywhere (Pure Java)
- Extremely light
- Apache License (Here's the money)
End of architecture choice 1 for Greenscape.
Next decision on application framework pending.