Your data wants to be free
February 16th, 2009 | Published in software development | 2 Comments
In response to Is the Relational Database Doomed and my post REST is a distributed data model, Anders Nawroth wrote an interesting post, The future of RDBMS’s, which made me think about the role of databases.
Anders makes the statement that RDBMSes aren’t very good at storing relationships. He points out that there are better ways to store relationships. Is this really what matters? Especially if this means that your data will be tied to the technology used by the application?
I think that having your data stored independently of the application technology is much more important, even if this means that the database can’t offer the best features possible. Being able to access all the data from anywhere, from any program and any technology, makes it that much more valuable. This is one of the reasons why object databases never became a big hit: they limit the use of your data to object-oriented languages, and in some cases to a single language. The value of data is directly related to the ability to use it.
This is why RDBMSes are so great: they don’t bind your data to a single application. You can collect the data with one application, enrich it with another, publish it with yet another, and combine it with other data to create a comprehensive, consistent whole.
On a related note: RDBMSes may not offer the richest relational model, but it’s a shame that in most SOA implementations you lose what little you have. Data is broken up into service-related portions and then serialized to XML or something similar. Relations between data from different services are lost, and circular relationships are lost when translated to hierarchical data. That is yet another good reason to take a good look at REST, which could restore the ability to have cross-service data relationships.
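To make the point about circular relationships concrete, here is a minimal Python sketch. The customer and order records and the `/orders/...` link format are assumptions made up for illustration; they are not taken from any real service. It shows that a cyclic structure cannot be nested into a hierarchical format such as JSON, while replacing object references with links (roughly what REST encourages) keeps the relationship intact.

```python
import json

# Hypothetical records from two services; all names are illustrative only.
customer = {"id": 1, "name": "Alice", "orders": []}
order = {"id": 100, "total": 25.0, "customer": customer}
customer["orders"].append(order)  # customer -> order -> customer: a cycle


def try_serialize(obj):
    """Return the JSON text, or None if the structure cannot be nested."""
    try:
        return json.dumps(obj)
    except ValueError:
        # json.dumps detects the circular reference and refuses to nest it
        return None


def to_linked(cust):
    """Break the cycle by replacing object references with link strings."""
    return {
        "id": cust["id"],
        "name": cust["name"],
        "orders": [f"/orders/{o['id']}" for o in cust["orders"]],
    }


nested = try_serialize(customer)             # the cyclic form fails
linked = try_serialize(to_linked(customer))  # the linked form succeeds
```

In the linked form the order is referenced by an identifier instead of being embedded, so the relationship survives serialization and can be followed again on the other side.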
But remember: Your data wants to be free.
February 17th, 2009 at 5:44 pm (#)
RDBMSes do bind you to a specific way of modeling your data. There are two typical scenarios that don’t fit into that model very well: 1. semi-structured data; 2. data about social or other networks. That’s why the large-scale sites go for homegrown solutions, not an RDBMS, when they face those scenarios.
I really don’t understand your focus on how data is stored. The nice thing about REST, in my opinion, is that applications can expose an external API, where the client doesn’t have to bother about the underlying technology at all. The problem may be that this isn’t standardized yet, so there is a lack of tool support. Actually, the great tool support could be one of the big advantages of RDBMSes today!
If you want to store data in a truly application-agnostic way, RDF could be the way to go. It’s a standardized way of expressing data as a graph, and most application data fits well in the graph model. There are also multiple implementations of RDF stores and some tools. Neo4j is a generic graph database, but it also has an RDF layer on top. BTW, RDF is all about resources/URIs/links, or a “metadata data model” if you like.
There’s also a standardized query language (SPARQL).
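As a rough illustration of the graph model behind RDF, the sketch below stores facts as (subject, predicate, object) triples in plain Python. The `http://example.org/` URIs and the `match` helper are hypothetical; this is not the API of any RDF store or of Neo4j, and the wildcard lookup only hints at what a SPARQL pattern does.

```python
EX = "http://example.org/"  # made-up namespace for illustration

# Each fact is one (subject, predicate, object) triple.
triples = {
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "bob", EX + "knows", EX + "alice"),
    (EX + "alice", EX + "wrote", EX + "post/1"),
}


def match(pattern, store):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


# Everything known about alice, regardless of predicate.
about_alice = match((EX + "alice", None, None), triples)

# A new kind of relationship is just one more triple; no schema change.
triples.add((EX + "bob", EX + "commentedOn", EX + "post/1"))
about_post = match((None, None, EX + "post/1"), triples)
```

Because every fact has the same shape, any program that understands triples and URIs can read the data, whatever language it is written in.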
As for OODBs, I agree that they are too tightly bound to the programming languages. It shouldn’t be relevant if you just consume a resource, but from a development perspective it’s a pain! On the other hand, an RDBMS can be a pain during development too, like when going from one-to-many to many-to-many for some relationship. In a graph database this requires no work on the database at all: just add the new data! So that would be one more scenario, but more from the development perspective: evolving schemas. Here you can read about evolving databases over time: http://push.cx/2009/rules-of-database-app-aging
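The one-to-many to many-to-many change can be sketched with Python’s built-in `sqlite3` module. The `author`, `book`, and `book_author` tables are invented for this example, and the set of edges at the end is only a stand-in for a graph database, not a real one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    -- one-to-many: each book points at exactly one author
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT,
                       author_id INTEGER REFERENCES author(id));
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Ben');
    INSERT INTO book VALUES (10, 'Graphs', 1);
""")

# Going many-to-many needs a new join table plus a data migration,
# and every query that used book.author_id has to be rewritten.
conn.executescript("""
    CREATE TABLE book_author (book_id INTEGER REFERENCES book(id),
                              author_id INTEGER REFERENCES author(id),
                              PRIMARY KEY (book_id, author_id));
    INSERT INTO book_author SELECT id, author_id FROM book;
    INSERT INTO book_author VALUES (10, 2);  -- the second author
""")
authors = [row[0] for row in conn.execute(
    "SELECT a.name FROM author a "
    "JOIN book_author ba ON a.id = ba.author_id "
    "WHERE ba.book_id = 10 ORDER BY a.name")]

# Graph-style storage: a relationship is an edge, so a second author
# is simply one more edge and nothing else changes.
edges = {("book:10", "written_by", "author:1")}
edges.add(("book:10", "written_by", "author:2"))
```

The relational version needs a schema change and a migration step before the second author can be stored; the edge set accepts it as ordinary data.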
February 17th, 2009 at 6:01 pm (#)
Graph databases may be a better solution than an RDBMS; I don’t know, as I haven’t got any experience with them. As you say, it’s not that important if the solution is universal enough that you can use it from any application.
One important requirement, in my opinion, would be a declarative query engine that lets you easily get the data you need. I think RDF can be serialized as XML, so XQuery could be a solution here.
In the long term, REST may be the ultimate solution, but we need a good query engine first. Maybe existing XQuery engines could be used, or extended to serve this purpose?
Is there already a JSON query extension to XQuery? JQuery would be a good name, but I’m afraid that’s taken.
I just did some research, and it looks like some smart people have already done this: JAQL, an XQuery-like query language for JSON. It seems to work with Hadoop.