NoSQL: some myths

Like every buzzword, NoSQL generates a lot of hype, and consequently a lot of nonsense gets spoken and written about it. In this reference-filled post I will expose some of the most common myths I read and hear around. Maybe that way at least the readers of this blog will avoid falling into these traps. 🙂
Myth # 1: NoSQL is new

The main feature behind NoSQL databases is that they are not based on the relational model. I may be wrong here, but Codd's article, "A Relational Model of Data for Large Shared Data Banks", in which the relational model is presented, came out in 1970.

If the main feature is the non-adoption of the relational model, and NoSQL is new, does this mean there were no databases before 1970? I think not: in 1959, for example, we have the founding of CODASYL, the committee that created the COBOL language and also defined a database standard, the navigational model. Even more interesting, in 1968 the same committee carried out a survey of the database management systems that existed at the time. And hey: there were many! Here is a link to this survey.

Another one: I have seen people saying that the document model is new. Nope! Lotus Notes already had a document database in the 1990s. More about the history of Notes can be seen at this link.

Myth # 2: to hell with the schema!
Heads will roll in the future thanks to this 🙂


This is the myth that will probably cause the most pain in the medium term, and it is quite common. It's nice to have a DBMS that gives you flexibility in the attributes present in the definition of each entity. Martin Fowler, for example, lists the absence of a schema as one of the key attributes of this type of database management system. In fact, for entities whose data is not exactly a record, as so well defined in Kent's excellent 1979 article "Limitations of Record-Based Information Models", the schema hinders much more than it helps.

As time goes by, we see that only in a minority of cases does the information actually fall into the record category. I ask: does this mean we should stop giving a damn about defining the schema? No: it is exactly the opposite. Precisely because you have complete freedom in defining the fields that make up your data, you must have ultra-detailed documentation of what should or should not be present in each document / node / whatever! If everything is valid, then no rule exists. And if no rule exists, what do you have in the end? A trap. And a nasty one!
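To make this concrete, here is a minimal sketch in Python of what that documentation can look like when it is written down as code instead of living in somebody's head. Everything in it is an assumption made for illustration: the `CUSTOMER_SCHEMA` fields, the `validate_document` helper and the sample data are hypothetical and do not come from any particular NoSQL product.

```python
# Hypothetical "schema as documentation" for a customer document.
# Each entry maps a field name to (expected type, required?).
CUSTOMER_SCHEMA = {
    "name": (str, True),
    "email": (str, True),
    "tags": (list, False),
}


def validate_document(doc, schema):
    """Return a list of problems; an empty list means the document is valid."""
    problems = []
    for field, (expected_type, required) in schema.items():
        if field not in doc:
            if required:
                problems.append(f"missing required field: {field}")
            continue
        if not isinstance(doc[field], expected_type):
            problems.append(f"field {field} should be {expected_type.__name__}")
    # Fields nobody documented are reported too: "anything goes" is the trap.
    for field in doc:
        if field not in schema:
            problems.append(f"undocumented field: {field}")
    return problems
```

Calling `validate_document(doc, CUSTOMER_SCHEMA)` before every write is one possible way to keep the rule explicit; the database stays schemaless, but the project does not.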

And hey: there is not much escaping the schema anyway. Here is an excellent post about it.

Myth # 3: NoSQL scalability is always superior because it's NoSQL, duh!

One of the selling points of these databases is the claim that you can obtain high scalability just by adopting them in your project. Ouch. High scalability is not achieved because you use NoSQL: it is achieved because your design is good. Simple as that.

I have seen cases where a project swapped all of its relational DAOs for something NoSQL and the performance gain was monstrous. Looking deeper, you realize that the gain was actually obtained because the tabular data structure was not the most appropriate one for persisting the system's state. I have written about this here: the big problem is notational.

I ask: would a system in which the data is strictly records, as in Kent's text, really be fast in a document, key-value or graph-based DBMS? And just as a reminder, you can have a relational DBMS with weaker ACID guarantees, such as MySQL with the MyISAM engine (which has no transactions) or SQLite with its durability settings relaxed.
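As a hedged illustration of trading guarantees for speed inside a relational engine, the sketch below uses Python's standard `sqlite3` module and SQLite's `PRAGMA synchronous` setting. The `person` table and its data are made up for the example, and the in-memory database is only there to keep it self-contained; the durability trade-off matters for databases stored on disk.

```python
import sqlite3

# In-memory database just for illustration; the setting matters for files on disk.
conn = sqlite3.connect(":memory:")

# PRAGMA synchronous controls how hard SQLite tries to flush writes to disk.
# OFF gives up durability after an OS crash or power loss in exchange for speed.
conn.execute("PRAGMA synchronous = OFF")
sync_level = conn.execute("PRAGMA synchronous").fetchone()[0]

# Still a perfectly ordinary relational table with ordinary SQL.
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO person (name) VALUES (?)", ("Kent",))
conn.commit()

rows = conn.execute("SELECT id, name FROM person").fetchall()
```

The model is still relational and the query language is still SQL; only the guarantees changed. Speed and the relational model are separate decisions.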

Martin Fowler lists as one aspect of NoSQL the fact that these are systems designed to run on large clusters, but you cannot take that literally: if you did, Oracle would fall into the NoSQL category.
Myth # 4: meaningless and completely unfair benchmarks

An easy one to bust: how can you fairly compare persistence models as diverse as key-value, relational, document, graph-oriented, columnar, etc.? I see some benchmarks out there that say things like “our searches were orders of magnitude faster using a key-value database instead of a relational one.”

You take a look at the system and realize that all the queries are by identifier. Of course the key-value store will win: it is made for this. Or you see comparisons showing, for example, that dealing with relationships is faster with something like Neo4j than with MySQL. Obvious: Neo4j is made for this!

What I mean is this: since each DBMS is designed to handle a specific type of data structure, the comparison is only fair when all the contestants deal with the same data structures. Riak vs Redis or MongoDB vs CouchDB, for example, are fair comparisons. Oracle (relational) vs Tokyo Cabinet (key-value) is not.
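A toy sketch in Python shows why a benchmark made only of queries by identifier says so little. The `store` dictionary below stands in for a key-value database; the data and the two helper functions are hypothetical and exist only for this illustration.

```python
# A toy key-value store: a dict keyed by identifier (made-up data).
store = {
    1: {"name": "Ana", "city": "Belo Horizonte"},
    2: {"name": "Bruno", "city": "Recife"},
    3: {"name": "Carla", "city": "Belo Horizonte"},
}


def get_by_id(store, key):
    """The access pattern key-value stores are built for: one direct lookup."""
    return store.get(key)


def find_by_city(store, city):
    """The store is not keyed on city, so every value has to be inspected."""
    return sorted(key for key, value in store.items() if value["city"] == city)
```

A benchmark that only ever calls something like `get_by_id` measures the best case of the key-value model and says nothing about workloads shaped like `find_by_city`, which is exactly where the other contestant may shine.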

Myth # 5: with NoSQL my productivity is much higher!
“With NoSQL it’s as if you had numerous arms”. Frog!


Many a manager falls for this one. Yes, you can get higher productivity with NoSQL: the fact that a schema is often not mandatory allows you, for example, to evolve your model as new situations arise. Queries on a graph are easier to write, and key-value lookups are simple to handle. There is no denying this: you are using the correct data structures this time, after all!

It’s funny to note that few people talk about the downsides, the main one being cultural. If your team is already used to the relational model, adopting something non-relational proves to be a monstrous challenge. It is not uncommon to observe a high level of rejection, especially when people notice that their NoSQL implementation lacks a feature that made the developer's life quite comfortable, such as referential integrity. Yes, I know we do not always need it, but a lot of people cry when they find out it is missing, especially when dealing with the document model, which resembles the relational one.

Another problem: you often transfer to your project's source code responsibilities that previously belonged to the RDBMS, such as query optimization, data integrity, etc. MongoDB, for example, will not validate attributes for you by default. About this transfer of responsibility, I recommend reading the chapter “The NoSQL Ecosystem” in the excellent book “The Architecture of Open Source Applications”.
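Here is a small, hedged sketch in Python of what that transfer looks like for referential integrity. The `customers` and `orders` dictionaries are hypothetical stand-ins for collections in a document store, and `insert_order` is an illustrative helper, not the API of any real driver.

```python
class MissingReference(Exception):
    """Raised when a document points at something that does not exist."""


# Hypothetical in-memory "collections" standing in for a document store.
customers = {"c1": {"name": "Ana"}}
orders = {}


def insert_order(order_id, order):
    """Insert an order only if its customer exists.

    This is the check a foreign key would do for free in an RDBMS;
    here it is the application's job.
    """
    if order["customer_id"] not in customers:
        raise MissingReference(f"unknown customer: {order['customer_id']}")
    orders[order_id] = order
```

Note that a check like this is not atomic the way a foreign-key constraint is: between the lookup and the write, another process could delete the customer. That is one more responsibility that moves into your code.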

And did I mention that, unlike the relational world, which has a standard query language (SQL), the NoSQL world has nothing of the kind, even with SpringSource trying something in this direction?
In conclusion

With every day that passes I see more nonsense being written and spoken about NoSQL because of the hype involved. I sincerely believe that this NoSQL movement is the best thing that could have happened to us. But before you get too carried away and fall for this kind of argument, remember the following: just like a block of brown sugar, NoSQL is sweet, but not as soft as they sell it to us. 🙂

