I’m versed enough in SQL and RDBMS that I can put things in the third normal form with relative ease. But the meta seems to be NoSQL. Backends often don’t even provide a SQL interface.

So, as far as I know, NoSQL is essentially a collection of files, usually JSON, paired with some querying capacity.

  1. What problem is it trying to solve?
  2. What advantages does it have over a traditional RDBMS?
  3. Where are its weaknesses?
  4. Can I make queries with complex WHERE clauses?
  • Ephera@lemmy.ml
    1 day ago

    This isn’t a sophisticated opinion or anything, but personally I find RDBMS to be a bad fit for how data is typically structured in your program. You will usually have an object, often with sub-objects all built up like a tree. If you want to load that into an SQL DB, you need to split it up, equip lots of its parts with IDs and then hope that you can reconstruct it when you take it back out.

    On the other hand, JSON was directly designed for serializing programming objects. The chance of you being able to persist and load your object with hardly any structural changes is high.
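    A minimal sketch of that contrast, assuming a hypothetical `Order` object with nested `LineItem` sub-objects (both names are invented for illustration). The JSON path round-trips the tree nearly unchanged, while the SQLite path has to split it into tables, assign an ID, and join it back together:

```python
import json
import sqlite3
from dataclasses import dataclass, asdict, field

# Hypothetical object tree, invented for illustration.
@dataclass
class LineItem:
    sku: str
    quantity: int

@dataclass
class Order:
    customer: str
    items: list[LineItem] = field(default_factory=list)

order = Order("alice", [LineItem("A-1", 2), LineItem("B-7", 1)])

# Document style: the tree serializes and loads back almost as-is.
doc = json.dumps(asdict(order))
raw = json.loads(doc)
restored = Order(raw["customer"], [LineItem(**item) for item in raw["items"]])

# Relational style: split the tree into tables, add IDs, then rejoin.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
db.execute(
    "CREATE TABLE items ("
    "order_id INTEGER REFERENCES orders(id), sku TEXT, quantity INTEGER)"
)
cur = db.execute("INSERT INTO orders (customer) VALUES (?)", (order.customer,))
order_id = cur.lastrowid
db.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(order_id, item.sku, item.quantity) for item in order.items],
)

# Reconstructing the object means querying both tables and reassembling.
customer = db.execute(
    "SELECT customer FROM orders WHERE id = ?", (order_id,)
).fetchone()[0]
rows = db.execute(
    "SELECT sku, quantity FROM items WHERE order_id = ? ORDER BY rowid",
    (order_id,),
).fetchall()
rebuilt = Order(customer, [LineItem(sku, qty) for sku, qty in rows])
```

    Both paths end up with an equal `Order`, but the relational one needs a schema, a generated key, and a reassembly step for every level of nesting.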

    Of course, this does have other downsides, like the data not being as flexible to query. Similarly, data in an RDBMS is very structured, whereas in many NoSQL databases, individual entries can have different fields than the rest.
    So, that’s perhaps a more general takeaway: SQL makes it hard to put something into the database, but easy to get it out. NoSQL often reverses this.

    • FizzyOrange@programming.dev
      22 hours ago

      I think this is a really good point, but it’s also kind of a missed opportunity for NoSQL. The object-relational mapping is easily the most annoying thing about using a relational database, and I think it’s what most people initially looking at NoSQL wanted to solve.

      But what we ended up with is Mongo which solves that problem but also throws away pretty much every useful feature that relational databases have! No schemas, no type checking, no foreign keys, etc. etc. It’s just a big soup of JSON which is awful to work with.

      I wonder if anyone made any NoSQL databases that avoid the object/table impedance mismatch but also managed to keep schemas, foreign keys, etc.

    • Colloidal@programming.dev (OP)
      23 hours ago

      Right, RDBMS for object persistence is a pain. It’s meant for efficient data storage and retrieval. But I’d counter that a huge number of data problems are of that kind, and using object persistence for general database applications seems very contrived. I’m imagining loading a huge amount of data into memory to filter out the things you need, essentially rolling your own DBMS. Am I missing something?

      • Ephera@lemmy.ml
        23 hours ago

        Well, for use-cases where an SQL database works well, I would recommend using an SQL database. NoSQL generally tries to provide a better alternative for the use-cases where SQL is suboptimal.

        For example, I’m currently building a build system with caching. I need the cache to be persistent on disk between builds, but I just load the cache into memory on startup and if I have a breaking change in the format, I can just wipe the whole cache. So, all the strengths of SQL are irrelevant and the pain points are still there. I mean, truth be told, I’m not using an actual NoSQL DB, but rather just writing a JSON file to disk, but it’s still similar.
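        A rough sketch of that kind of cache, not the actual build system: the file name, the `version` field, and the `load_cache`/`save_cache` helpers are all assumptions made for illustration. The whole cache is read into memory on startup, and anything written in an older format is simply discarded:

```python
import json
import tempfile
from pathlib import Path

# Hypothetical version marker; bump it on any breaking format change.
CACHE_FORMAT_VERSION = 2

def load_cache(path: Path) -> dict:
    """Return cached entries, or an empty cache if the file is
    missing, corrupt, or written in a different format version."""
    try:
        data = json.loads(path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    if not isinstance(data, dict) or data.get("version") != CACHE_FORMAT_VERSION:
        return {}  # breaking change: wipe instead of migrating
    return data.get("entries", {})

def save_cache(path: Path, entries: dict) -> None:
    """Write the whole cache back to disk in the current format."""
    payload = {"version": CACHE_FORMAT_VERSION, "entries": entries}
    path.write_text(json.dumps(payload))

# Usage: persist between "builds", then load everything on startup.
cache_path = Path(tempfile.mkdtemp()) / "build_cache.json"
save_cache(cache_path, {"src/main.c": "hash-abc"})
loaded = load_cache(cache_path)

# Simulate a cache left over from an older format version.
cache_path.write_text(json.dumps({"version": 1, "entries": {"old": "x"}}))
stale = load_cache(cache_path)
```

        There is no migration code at all here, which is the point: when the data is disposable, dropping it is cheaper than maintaining a schema.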

        Another example is that at $DAYJOB, our last project involved making lots of recordings and training a machine learning model on them. The recordings had to be created early on, long before our software was stable, and the data scientists who would work with that data would write all kinds of transformation scripts anyway. In that case, again, I do not think an SQL database would’ve been the best choice, because we needed the flexibility to just push data into a heap and then clean it up later. If an older format had really become unusable, we could’ve just left that data behind, rather than constantly updating all the data to the newest database schema.
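        As an illustration of "push into a heap, clean up later", here is a small sketch with made-up recording metadata. The field names, the `format` numbers, and the `normalize` helper are all hypothetical. Records in different shapes sit side by side, a cleanup step maps them onto the current shape, and the oldest format is left behind rather than migrated:

```python
# Made-up metadata records in three different shapes.
records = [
    {"format": 1, "file": "rec_001.wav", "len": 12.5},
    {"format": 2, "file": "rec_002.wav", "duration_s": 8.0},
    {"format": 3, "file": "rec_003.wav", "duration_s": 30.0, "label": "speech"},
    {"format": 3, "file": "rec_004.wav", "duration_s": 4.0, "label": "noise"},
]

def normalize(record):
    """Map a raw record onto the current shape, or return None
    to leave data in an unusable old format behind."""
    if record["format"] < 2:
        return None
    return {
        "file": record["file"],
        "duration_s": record["duration_s"],
        # Older records have no label field, so fill in a default.
        "label": record.get("label", "unlabeled"),
    }

cleaned = [n for r in records if (n := normalize(r)) is not None]

# Once cleaned, filtering with a compound condition is a plain expression.
long_speech = [
    r["file"]
    for r in cleaned
    if r["label"] == "speech" and r["duration_s"] > 10
]
```

        The trade-off is that every consumer has to cope with missing or renamed fields itself, which is the work a database schema would otherwise do up front.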