We are generating data at an unprecedented pace right now. The scale and size of this data – it’s mind-boggling! Just check out these numbers:
- Facebook generates four petabytes of data in just one day
- Google generates twenty petabytes of data every day
- Furthermore, the Large Hadron Collider (a 27-kilometer-long ring and the world’s most powerful particle accelerator) generates one petabyte of data per second. Most importantly, this data is unstructured
Can you imagine using SQL to work with this volume of data? It’s setting yourself up for a nightmare!
SQL is a wonderful language to learn as a data scientist, and it works well when we’re dealing with structured data. But if your organization works with unstructured data, SQL databases cannot fulfill the requirements.
Structured databases have two major disadvantages:
- Scalability: A structured database becomes very difficult to scale as it grows larger
- Elasticity: Structured databases need data in a predefined format. If the data does not follow the predefined format, a relational database will not store it
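To make the second point concrete, here is a minimal sketch of a fixed schema rejecting data that does not fit it. It uses SQLite through Python’s built-in `sqlite3` module purely as a stand-in for a relational database; the `users` table, its columns, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory relational table with a fixed, predefined schema (hypothetical example).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")

# A row that matches the schema is accepted.
conn.execute("INSERT INTO users (name, age) VALUES (?, ?)", ("Asha", 31))

# A row carrying an attribute the schema does not define is rejected.
try:
    conn.execute(
        "INSERT INTO users (name, age, hobbies) VALUES (?, ?, ?)",
        ("Ravi", 27, "chess"),
    )
    rejected = False
except sqlite3.OperationalError as err:
    rejected = True
    print("Insert failed:", err)

# Only the row that fit the schema was stored.
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print("Rows stored:", row_count)
```

To store the extra attribute, the table itself would first have to be altered, which is exactly the rigidity described above.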
So how do we solve this issue? If not SQL then what?
This is where unstructured databases come in. Among the wide range of such databases, MongoDB is widely used because of its rich query language and the quick access it offers through concepts like indexing. In short, MongoDB is well suited to managing big data. Let’s see the difference between structured and unstructured databases:
| | Structured Databases | Unstructured Databases |
| --- | --- | --- |
| Structure | Every element has the same number of attributes | Different elements can have different numbers of attributes |
| Latency | Comparatively slower storage | Faster storage |
| Ease of learning | Easy to learn | Comparatively tougher to learn |
| Storage volume | Not appropriate for storing Big Data | Can handle Big Data as well |
| Type of data stored | Generally textual data is stored | Any type of data can be stored (audio, video, clickstream, etc.) |
| Examples | MySQL, PostgreSQL | MongoDB, RavenDB |
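The “Structure” row is easiest to see with a small example. MongoDB stores records as documents, and in Python these are naturally written as dictionaries. The sketch below uses plain dictionaries only (no database involved), and the field names and values are invented for illustration.

```python
# Two records meant for the same hypothetical "users" collection.
# Unlike rows in a table, they do not need to share the same attributes.
documents = [
    {"name": "Asha", "age": 31},
    {"name": "Ravi", "age": 27, "hobbies": ["chess", "cycling"], "city": "Pune"},
]

# Number of attributes carried by each document.
field_counts = [len(doc) for doc in documents]

# Every attribute that appears in at least one document.
all_fields = sorted({key for doc in documents for key in doc})

print(field_counts)
print(all_fields)
```

Both records can live side by side even though the second one carries two attributes the first does not have.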
This article is the ultimate guide to getting started with MongoDB using Python. In the coming posts, we will demonstrate various operations on MongoDB with the help of examples and the PyMongo library.
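As a first taste of what working with PyMongo looks like, here is a rough sketch that stores two differently shaped documents. It assumes PyMongo is installed and a MongoDB server is reachable at `mongodb://localhost:27017/`; the database name `demo_db`, the collection name `users`, and both helper functions are hypothetical names chosen for this example.

```python
def store_documents(collection, documents):
    """Insert documents into a PyMongo-style collection and return their ids."""
    result = collection.insert_many(documents)
    return list(result.inserted_ids)


def demo(uri="mongodb://localhost:27017/"):
    """Connect to a MongoDB server and store two differently shaped documents."""
    # Imported here so store_documents() can be reused without PyMongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)  # assumes a server is running at this address
    users = client["demo_db"]["users"]  # hypothetical database and collection
    ids = store_documents(users, [
        {"name": "Asha", "age": 31},
        {"name": "Ravi", "hobbies": ["chess"]},
    ])
    client.close()
    return ids
```

Calling `demo()` against a running server should return the ids MongoDB assigned to the two new documents. Nothing is executed until you call it, so the snippet is safe to paste into a script first.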