What is NoSQL good for?

Colloidal@programming.dev · 1 day ago

What is NoSQL good for?

theit8514@lemmy.world · 1 day ago

NoSQL is best used as key-value storage, where the value can be non-tabular or mixed data. As an example, imagine you have a session cookie value identifying a user. That user might have many different groups, roles, claims, etc. If you wanted to store that data in an RDBMS you would likely need a table for every 1-to-many data point (Session -> SessionRole, Session -> SessionGroup, etc). In NoSQL this would be represented as a single key with a JSON object that could look quite different from other Session JSON objects. If you then need to delete that session it’s a single key delete, whereas in the RDBMS you would have to make sure that delete cascaded to the downstream tables.

These key-value lookups are often very fast, and are also used as a caching layer for complex data calculations.

The big downside to this is indexing and querying the data by anything other than the primary key. It would be hard to find all users in a specific group, as you would need to scan each key-value pair. It looks like NoSQL has some indexing capabilities now, but when I first used it, it did not.
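
To illustrate the querying problem, here is a rough sketch, again using a plain dict as an assumed stand-in for the store. Without an index, finding users by group means visiting every value. A secondary index (built by hand here, with hypothetical function names) trades extra writes and storage for a direct lookup:

```python
from collections import defaultdict

# Hypothetical session values keyed by session id.
sessions = {
    "s1": {"user": "alice", "groups": ["admin", "dev"]},
    "s2": {"user": "bob", "groups": ["dev"]},
    "s3": {"user": "carol", "groups": []},
}


def users_in_group_scan(store: dict, group: str) -> list[str]:
    """Full scan: every value is inspected, so cost grows with the store size."""
    return sorted(v["user"] for v in store.values() if group in v.get("groups", []))


def build_group_index(store: dict) -> dict[str, set[str]]:
    """Secondary index: group name -> set of users, kept alongside the data."""
    index: dict[str, set[str]] = defaultdict(set)
    for value in store.values():
        for group in value.get("groups", []):
            index[group].add(value["user"])
    return index


group_index = build_group_index(sessions)
```

In a real system the index would have to be updated on every write, which is the part the store either does for you or leaves to the application.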

Colloidal@programming.dev · edit-2 1 day ago

Let me see if I got it. It would be like a denormalized table with a flexible number of columns? So instead of multiple rows for a single primary key, you have one row (the file), whose structure is variable, so you don’t need to traverse other tables or rows to gather/change/delete the data.

The downsides are the usual downsides of a denormalized DB.

Am I close?

Azzu@lemm.ee · edit-2 1 day ago

Pretty much. The advantage is not really the unstructuredness per se, but simply the speed at which you can get a single record and the write throughput. It’s essentially sacrificing some of the guarantees of ACID in return for parallelization/speed.

Like when you have a million devices that each send you their GPS position once a second. It’s possible with an RDBMS, but the larger your table gets, the harder it’ll be to get good insertion/retrieval speeds. You’d need to do a lot of tuning and would effectively end up with something like a NoSQL database.

ryedaft@sh.itjust.works · 1 day ago

Yes. You can also have fields that weren’t defined when you created the “table”.

With something like Elasticsearch you also have tokenisation of text which obviously compresses it. If it’s logs (or similar) then you also only have a limited number of unique tokens which is nice. And you can do very fast text search. And everything is set up for other things like tf-idf.
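
A toy sketch of the idea in plain Python, not Elasticsearch's actual implementation: text is tokenised, each token maps to the set of documents containing it (an inverted index), and a basic tf-idf score can be computed from that. The log lines and function names are invented for illustration:

```python
import math
from collections import defaultdict

# Hypothetical log lines keyed by document id.
docs = {
    1: "disk error on node a",
    2: "disk ok on node b",
    3: "network error on node a",
}


def tokenize(text: str) -> list[str]:
    """Very naive tokeniser: lowercase and split on whitespace."""
    return text.lower().split()


# Inverted index: token -> ids of documents containing it.
inverted: dict[str, set[int]] = defaultdict(set)
for doc_id, text in docs.items():
    for token in tokenize(text):
        inverted[token].add(doc_id)


def search(*terms: str) -> set[int]:
    """Return ids of documents containing all the given terms."""
    sets = [inverted.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def tf_idf(term: str, doc_id: int) -> float:
    """Term frequency in the document times log inverse document frequency."""
    tokens = tokenize(docs[doc_id])
    tf = tokens.count(term) / len(tokens)
    df = len(inverted.get(term, ()))
    return tf * math.log(len(docs) / df) if df else 0.0
```

Because logs reuse a small vocabulary, the index has few distinct tokens, and a search only touches the posting sets for the query terms.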

bahbah23@lemmy.world · 1 day ago

Rather than try to relate it to an rdbms, think of it as a distributed hash map/associative array.
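
As a rough sketch of that mental model, assuming each "node" is just a local dict: a key is hashed to pick a shard, and every operation goes to that shard only. The `ShardedMap` class is hypothetical, and real systems add replication, rebalancing and networking on top:

```python
import hashlib


class ShardedMap:
    """Toy distributed hash map: keys are routed to shards by hash."""

    def __init__(self, num_shards: int = 4) -> None:
        # Each dict stands in for a separate node.
        self.shards: list[dict[str, object]] = [{} for _ in range(num_shards)]

    def _shard_for(self, key: str) -> dict[str, object]:
        # Use a stable hash so the same key always lands on the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.shards[int.from_bytes(digest[:8], "big") % len(self.shards)]

    def put(self, key: str, value: object) -> None:
        self._shard_for(key)[key] = value

    def get(self, key: str, default: object = None) -> object:
        return self._shard_for(key).get(key, default)

    def delete(self, key: str) -> bool:
        return self._shard_for(key).pop(key, None) is not None
```

Since a lookup only needs the shard that owns the key, adding shards spreads both data and load, which is where the parallelism comes from.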

Colloidal@programming.dev · 1 day ago

What I’m hearing is that they’re very different beasts for very different applications. A typical web app would likely need both.

ramble81@lemm.ee · 1 day ago

Yup. And this right here is why I generally dismiss people who say you only need one or the other. Each has a specific advantage and use case, and you’ll have the best performance when you choose the “right tool for the job” and don’t just attempt to shoehorn everything into a single solution.

Colloidal@programming.dev · 21 hours ago

Hold a sec. Rolling your own RDBMS out of a NoSQL database is insane. But is the opposite feasible? Wouldn’t it be a simple table with two columns: a key and a JSON blob?

ramble81@lemm.ee · 21 hours ago

Could you do it? Yes, but it’s not something that it’s optimized to do. NoSQL engines are designed to deal with key value pairs much better than an RDBMS. Again, best tool for the job.

Colloidal@programming.dev · 20 hours ago

Got it, thanks.