How do I decide if a NoSQL DB is the right fit for my project?
I will begin this article with a simple statement -
The choice to use a NoSQL database is often based on hype, or a wrong assumption that relational databases cannot perform as well as a NoSQL database. So, in a way, performance becomes the only deciding factor & the other important factors are overlooked by engineers when it comes to selecting the DB. This leads to a quagmire & coming out of which, in itself, becomes a herculean task.
I am assuming that you have a basic understanding of what exactly is a NoSQL DB. If not, I will highly recommend to visit some articles on internet to gain some insights on it first as this article does not explain what a NoSQL DB is. The article only discusses some key factors which helps in the determination of choosing NoSQL as a DB in a project.
Enough talk. Lets start.
Though it is not an easy piece of work to decide for SQL or NoSQL but there exists some basic thumb rules through which you can navigate this tricky land & come out with a conclusion. Let us try to recognize them -
1. Uncertainty
The first thumb rule is to recognize if your project's database design along with tables structures have been decided/finalized or not.
Let us understand it with an example.
Consider that an application needs to be developed quickly for recording employee daily activity work. The functionality of the application is like as below -
- Employee logs into the application
- Employee is shown a page with some fields like Employee name, Employee code, Date, Activity Name, Activity Description.
- There is a button on the page through which the employee can add more fields for Activity Name & Activity Description, if he wishes for adding more than activity.
It needs to be developed as a trial in the first phase. Once launched, the company will monitor the usage & results, and then will decide what kind of major/minor features need to be added. So, in a way, the roadmap of this application is a bit uncertain as of now. It will be decided once the first cut is launched & used. Now if we use a relational db here like MSSql, all the tables schemas/design etc would need to be finalized first. The relations between the tables would also need to be decided. These factors would result in the delay of actual implementation & after launch, if major changes at DB level need to be done(like modify the table structure etc) it may turn out to be a tricky task. While with the help of NoSQL DB we can store all the data without any schema/design considerations resulting in faster implementation & time to market.
2. Time to market
This point is in a way related to the #1 above. When you have the requirement on hands to launch the application quickly for the users, NoSQL is more suited to be the choice. As it does not require db/tables design analysis/finalization, the time taken to implement it would be faster in NoSql than in SQL. Consider if you have to create an application & are not bothered about how data should be stored and just want a storage for data persistence then NoSQL has better capabilities to offer.
3. Scaling
SQL and NoSQL databases scale differently, so you’ll have to think about how your data set will grow in the future.
SQL databases scale vertically, meaning you’ll need to increase the capacity of a single server (increasing CPU, RAM, or SSD) to scale your database. SQL databases were designed to run on a single server to maintain the integrity of the data, so they’re not easy to scale.
NoSQL databases scale horizontally, meaning you can add more servers to power your growing database. This is a huge advantage that NoSQL has over SQL.
The ability of NoSQL databases to horizontally scale has to do with the lack of structure of the data. Because NoSQL requires much less structure than SQL, each stored object is pretty much self-contained and independent. Thus objects can be easily stored on multiple servers without having to be linked. This is not the case for SQL, where each table row and column needs to be related.
4. Want to run a query?
This is a very critical point which must be taken into consideration while mulling over the options. Would you need to run queries against your database?
Let's understand it with an example.
Consider the example mentioned in #1, we may not require running queries against the DB as it's a simple application & is just related to entering the employee daily activities and showing them on screen. Using a NoSQL DB would solve the purpose, but what if the company wants small reports like -
- how many employees are entering their activies on daily basis?
- how many similar type of activies employees are doing?
- how many employees are entering their activites even before lunch time?
Running these type of queries frequently on a NoSQL db is not a seamless task while if we use a relational DB here like MSSQL, we can easily write queries & fetch the required data.
In a nutshell, if we will need to run the queries frequently then a NoSQL DB would prove to be a pain to use.
I know some of you might argue that we can use a CQRS pattern or use a separate DB for datawarehouse to generate these kind of reports. Of course you can & that is a fair argument but implementing these approaches is no child's play. It will require lot of planning, efforts & time resulting further delays. So the point is, if you have to run the queries as well as faster time to market is needed then NoSQL won't fit the bill. I hope you get the drift.
I have created an overly simple diagram just to showcase the basic thumb rules which you can use to start your analysis.
I hope the points I have mentioned in the article would help you in choosing the right fit DB for your needs. Please provide your feedback in comments if you liked(or didn't like 🙂) the article.
Member discussion