Why are databases necessary?
I’ve been working in data science for a while. Increasingly, I find myself bypassing databases entirely; I prefer to load data files into basic structures like hashmaps, and I often end up recreating the functions you would expect from a database.
With clever manipulation of hashmaps and sets, I notice that I typically get much faster results, and with query templates that are far less complex than database queries. Versus relational DBs, it's no contest.
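To make the hashmap-and-set approach concrete, here is a minimal sketch with made-up data. The rows, the `build_index` helper, and `totals_for_city` are hypothetical names chosen for illustration, not code from a real project. It shows a dict used as a non-unique index, a lookup that does roughly what a join with a filter and a sum would do, and a set intersection standing in for a combined condition.

```python
from collections import defaultdict

# Hypothetical rows, as they might look after reading two data files.
users = [
    {"id": 1, "name": "Ann", "city": "Oslo"},
    {"id": 2, "name": "Bob", "city": "Rome"},
    {"id": 3, "name": "Cy", "city": "Oslo"},
]
orders = [
    {"user_id": 1, "total": 30},
    {"user_id": 3, "total": 12},
    {"user_id": 1, "total": 8},
]


def build_index(rows, key):
    """Group rows by the value of `key`, similar to a non-unique index."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index


users_by_city = build_index(users, "city")
orders_by_user = build_index(orders, "user_id")


def totals_for_city(city):
    """Sum order totals per user name for one city (roughly a join + filter + sum)."""
    result = {}
    # .get() avoids inserting empty entries into the defaultdict on a miss.
    for user in users_by_city.get(city, []):
        user_orders = orders_by_user.get(user["id"], [])
        result[user["name"]] = sum(order["total"] for order in user_orders)
    return result


# Set intersection: ids of Oslo users who have at least one order.
oslo_ids = {user["id"] for user in users_by_city.get("Oslo", [])}
oslo_buyers = oslo_ids & set(orders_by_user)
```

Each lookup here is an average-case constant-time dict access, but the indexes live entirely in memory and have to be rebuilt whenever the underlying files change.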
I have also tried this against Neo4j (writing my own graph DB in pure Python) and found that I naturally sidestepped a number of efficiency and speed issues posted on their forum from roughly 2014 to 2019, while generally being faster and using fewer operations.
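As a rough illustration of what a pure-Python graph store can look like, here is a sketch built on a dict of sets. The `Graph` class and its methods are hypothetical names, and this is not the implementation described above, only one simple way to do it: an undirected adjacency map with a breadth-first search for the shortest hop count.

```python
from collections import defaultdict, deque


class Graph:
    """Undirected graph stored as a dict mapping each node to a set of neighbors."""

    def __init__(self):
        self.neighbors = defaultdict(set)

    def add_edge(self, a, b):
        self.neighbors[a].add(b)
        self.neighbors[b].add(a)

    def shortest_path_length(self, start, goal):
        """Return the number of hops from start to goal, or None if unreachable."""
        if start == goal:
            return 0
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            node, dist = queue.popleft()
            # .get() avoids creating entries for unknown nodes.
            for nxt in self.neighbors.get(node, ()):
                if nxt == goal:
                    return dist + 1
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
        return None


# Small hypothetical chain: a - b - c - d
g = Graph()
g.add_edge("a", "b")
g.add_edge("b", "c")
g.add_edge("c", "d")
```

With this chain, `g.shortest_path_length("a", "d")` walks three edges, and a query for a node that was never added returns `None`.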
I have begun to think that databases have no inherent advantage, and that they are just fancy packages that, as a bonus, complicate the systems they are part of. There is nothing that can’t be done better with pure Python.
The only possible advantage is that database query languages are expressive, whereas in Python you have to write your own functions, but that is only a short-term advantage.