Is there something better than SQL?

If you’re modeling relational data, it doesn’t seem like you can get around using a DB that uses SQL, which to me is the worst: most programmers aren’t DB experts and the SQL they output is quite often terrible.

Not to dunk on the lemmy devs, they do a good job, but they themselves know that their SQL is bad. Luckily there are community members who stepped up and are doing a great job at fixing the numerous performance issues and tuning the DB settings, but not everybody has that kind of support, nor time.

Also, the translation step from binary (program) -> text (SQL) -> binary (server), just feels quite wrong. For HTML and CSS, it’s fine, but for SQL, where injection is still in the top 10 security risks, is there something better?

Yes, there are ORMs, but some languages don’t have them (rust has diesel for example, which still requires you to write SQL) and it would be great to “just” have a DB with a binary protocol that makes it unnecessary to write an ORM.

Does such a thing exist? Is there something better than SQL out there?

BlackXanthus,

I don’t know if there is, but it feels like the email protocol problem.

Like, while the protocol sucks in many, many ways, it would take something revolutionary to replace it because it’s everywhere.

It’s been around so long that everything talks the protocol, the binaries that handle it are mature and stable.

Then you have to ask: what would you replace it with? It does the job it’s designed to do very well. There’s nothing the matter with the protocol, and it’s still fit-for-purpose.

That doesn’t mean there aren’t problems - spam, bad actors, and so on, but ultimately that’s not the fault of the protocol (though, maybe, for email, people have been arguing about protocol-level ways of dealing with spam for years).

I don’t have an answer, but I feel like there should be one, but I doubt the is.

recursive_recursion, (edited )
@recursive_recursion@programming.dev avatar

Then you have to ask: what would you replace it with? It does the job it’s designed to do very well. There’s nothing the matter with the protocol, and it’s still fit-for-purpose.

Rust once it’s matured is my guess

Edit: this short comment wasn’t thought out so the downvotes are reasonable, gonna leave this up so I can improve

lnsfw3,

We’re replacing protocols with Rust now?

clay_pidgin,

I imagine they were trying to reply to CEREMENT about Python.

cerement,
@cerement@slrpnk.net avatar

(have heard a similar argument about Python – there’s no killer app and it’s not the best, but it’s good enough and just keeps going)

fubo,

One alternative to both raw SQL and and ORM is a query builder, a procedural library for constructing database queries. Query builders typically don’t have the object/relational “impedance mismatch” of ORMs; they don’t encourage you to pretend that records in the database are the same as objects in your code. But they give you a syntax that looks more like your programming language, and automatically handles escaping (and thus, resistance to injection attacks).

However, query builders often don’t expose all the power of your database. If you’re using PostgreSQL, you’ve got one hell of a powerful set of tools in there. It’s often worth spending the time to master them just so you don’t end up reinventing the stone-age wheel on top of a warp-speed hovercraft.

bill_1992,

I’ve been using Jooq to build my queries (and run them). Beats the hell out of writing prepared statements in strings.

Not sure what power I’m missing though, I’ve been able to do everything via Jooq that I want to do.

koreth,

You’re not missing much power with jOOQ, in my opinion as someone who has used it for years. Its built-in coverage of the SQL syntax of all the major database engines is quite good, and it has easy type-safe escape hatches if you need to express something it doesn’t support natively.

CameronDev,

I agree that SQL is hard, and optimising it is harder, but is a bespoke binary protocol per DB really easier?

You can usually connect to the DB directly via an ODBC driver, but that will involve SQL anyway.

onlinepersona,

A protocol that forces the use of a query builder or ORM would be easier and better, I think. The DB developer would have to provide a library, which means one would have to learn the library, not a new language.

CameronDev,

Making raw sql access harder for applications is probably a good idea. Perhaps with something like rusts “unsafe”, so you can still do it, but you have to deliberately make the decision to do so.

The other users for raw SQL are DB administrators, and I dont think youll be able to take SQL away from them as easily. I dont actually know what or how DB administrators work, but I dont think they are developers a lot of the time, so requiring them to write code to do their job might be a non-starter?

dudeami0,
@dudeami0@lemmy.dudeami.win avatar

SQL is the industry standard for a reason, it’s well known and it does the job quite well. The important part of any technology is to use it when it’s advantageous, not to use it for everything. SQL works great for looking up relational data, but isn’t a replacement for a filesystem. I’ll try to address each concern separately, and this is only my opinion and not some consensus:

Most programmers aren’t DB experts: Most programmers aren’t “experts”, period, so we need to work with this. IT is a wide and varied field that requires a vast depth of knowledge in specific domains to be an “expert” in just that domain. This is why teams break up responsibilities, the fact the community came in and fixed the issues doesn’t change the fact the program did work before. This is all normal in development, you get things working in an acceptable manner and when the requirements change (in the lemmy example, this would be scaling requirements) you fix those problems.

translation step from binary (program): If you are using SQL to store binary data, this might cause performance issues. SQL isn’t an all in one data store, it’s a database for running queries against relational data. I would say this is an architecture problem, as there are better methods for storing and distributing binary blobs of data. If you are talking about parsing strings, string parsing is probably one of the least demanding parts of a SQL query. Prepared statements can also be used to separate the query logic from the data and alleviate the SQL injection attack vector.

Yes, there are ORMs: And you’ll see a ton of developers despise ORMs. They is an additional layer of abstraction that can either help or hinder depending on the application. Sure, they make things real easy but they can also cause many of the problems you are mentioning, like performance bottlenecks. Query builders can also be used to create SQL queries in a manner similar to an ORM if writing plain string-based queries isn’t ideal.

helloben,

I have found that usually the problem isn’t SQL itself, it’s the design of the database or the application or both that makes it slow or hard to use because databases are tricky and it’s very easy to make them slow and difficult to optimize. This goes for any database, SQL or otherwise.

I think the reason that you see so many people use relational databases for relational data is that they were designed for it and have been optimized to do that job for almost 50 years at this point, the right tool for the job if you will. That’s also why I don’t think any language builds it in, there isn’t one type of database that is good at every scenario so it would be difficult to pick a default as a language. Also keep in mind, most large applications are deployed across many servers so anything that uses a local file is out of the question.

As far as performance directly though, I’m definitely not a DBA but I have spent a lot of time helping people troubleshoot slow databases and it seems many people write apps and design databases based only on how they want to store the data and not how the actual data is consumed. The other thing I think most folks don’t seem to realize (including myself for a long time) know the need for creating good indexes based on how you are accessing the data. It’s not an easy problem to solve for sure though.

Semi-Hemi-Demigod,
@Semi-Hemi-Demigod@kbin.social avatar

Database optimization goes from the design of the schemas to the queries used to the configuration to the actual hardware it’s running on. It’s one of the hardest things to optimize in all of tech.

Imacat,

I’m absolutely biased as a data engineer who loves SQL, but there are some good reasons why SQL has been the de facto standard for interacting with databases since the 80s.

One of its draws is that it’s easy to understand. I can show a stakeholder that I’m selecting “sum(sale_amount) from transactions where date=yesterday” and they understand it. Many analysts are even able to write complicated queries when they don’t know anything else about programming.

Since it’s declarative, you rarely have to think about all the underlying fuckery that lets you query something like terabytes of data in redshift in minutes.

Debugging is often pretty nice too. I can take some query that didn’t do what it was supposed to and run it over and over in a console until the output is right.

sip,

it’s just a drag to veer from. it’s not particularly good, but good enough to stick around.

lysdexic,

I’m absolutely biased as a data engineer who loves SQL, but there are some good reasons why SQL has been the de facto standard for interacting with databases since the 80s.

I find it funny how the people who actually have to wrangle data swear by SQL as awesome, but there are always random hacks coming out of the woodwork, who don’t even look at SQL at all, with sweeping statements claiming SQL sucks because reasons.

It’s like the most opinionated people against SQL are the ones who don’t use SQL.

bill_1992,

The point about a binary protocol is interesting, because it would inherently solve the injection issue.

However, constructing an ad-hoc query becomes tedious, as you’re now dealing with bytes and text together. Doing so in a terminal can be pretty tedious, and most people would require a tool to do so. Compare this against SQL, where you can easily build a query in your terminal. I think the tradeoff is similar to protobuf vs json.

You could do a text representation (like textproto), but guess what? Now injection is an issue again.

Another thing would be the complexity of client libraries. With SQL client libraries, the library doesn’t need to parse or know SQL - it can send off the prepared statement as-is. With a binary protocol, the client libraries will likely need to include a query builder that builds the byte representation since no developers are going to be concatenating bytes by hand, which makes the bar higher for open-source libraries. This also means that if you add a new query feature to your DB, all client libraries will likely need to be updated to use the feature.

And you’re still going to need to tune and optimize queries for this new DB. That’s just the nature of the beast: scaling is hard especially when you can’t throw money at the problem.

Quite frankly, it’s a lot of hard tradeoffs to not need to use prepared statements or query builders. Injection is still is an issue for SQL today, but it’s been “solved” as much as it possibly can.

JackbyDev,

No, SQL is great for relational data. There’s like decades of research into it. It’s the best.

sip,

relational databases have years of reseach into them, not the query language itself.

sql was built so people other than devs can use it, but we got stuck with it.

normalexit,

SQL has been around and evolving since the 70s. It is an ISO/IEC standard and is portable across a multitude of databases.

https://en.m.wikipedia.org/wiki/SQL

sip,

portable, my ass. excuse my french.

each system has it’s own dialect and quirks

lysdexic, (edited )

each system has it’s own dialect and quirks

That does not mean that SQL, as specified by one of it’s standard versions, is not portable. It just means that some implementations fail to comply with the standard and/or provide their own extensions.

If an implementation fails to comply with the standard, that’s a failure on the side of the implementation, not a failure of SQL.

Miaou,

Could also be that the standard is lacking in some areas. I’m not sql expert but I always end up using implementation specific features even for rather simple tasks. Are there really people out there sticking 100% to standard SQL? Hell, the biggest implementations don’t even agree on whether table/column names are case sensitive

lysdexic,

Could also be that the standard is lacking in some areas.

I don’t think that explains it.

If we’re talking about extensions that cover custom features then obviously those aren’t supposed to be standardized because they haven’t been widely adopted.

If an implementation is missing a feature then that’s a shortcoming of that particular implementation, not SQL’s.

If an implementation screws up and has non-compliance qwirks, that’s a bug in the implementation, not a problem with SQL.

Take SQLite for example. It explicitly does not support static, rigid typing, and claims it’s a feature. However, SQL supports static typing and other implementations leverage that for performance and cost gains. Additionally, SQLite also keeps a list with a summary of all the SQL features it purposely does not implement.

SQLite is pretty popular. Does this mean SQL is lacking in any way? Is the SQL standard “lacking” because it supports ALTER TABLE foo ADD CONSTRAINT even though SQLite does not? Or is this a problem caused by an implementation failing to comply with a standard?

sip,

sure, ok, but who uses only the subset of standard SQL in a particular engine just to call his queri3s portable? most of the good stuff is unique to each engine and is what makes the engine stand out.

it’s the same with C standards…

lysdexic,

sql was built so people other than devs can use it, but we got stuck with it.

Not really. Being designed with UX in mind, so that it sacrifices conciseness for readability, does not make it something for “people other than devs”.

Likewise, BASIC was also developed with UX in mind, and no one in their right mind would ever claim that it’s not a programming language.

r1veRRR,

SQL is horrible as a language to read or write. There’s a million different variants, because it lacks so many basic things. And when used in other code, you generally end up string concatinating one language in another language, with all the HORRIBLE bugs something like that brings about.

Imagine Backend People said we should just write adhoc Javascript for the frontend by concatinating the “correct” code in the backend.

JackbyDev,
koreth,

it would be great to “just” have a DB with a binary protocol that makes it unnecessary to write an ORM.

Other people have talked about other parts of the post so I want to focus on this one.

The problem an ORM solves is not a problem of SQL being textual. Just switching to a binary representation will have little or no impact on the need for an ORM. The ORM is solving the problem that’s in its name: bridging the conceptual gap between an object-oriented data model and a relational data model. “A relational data model” isn’t about how queries are represented in a wire protocol; instead, it is about how data, and relationships between pieces of data, are organized.

So, okay, what if you get rid of the relational data model and make your database store objects directly? You can! NoSQL databases had a surge in popularity not too long ago, and before that, there have been lots of object databases.

What you’re likely to discover in an application of any real complexity, though, and the reason the industry has cooled somewhat on NoSQL databases after the initial hype cycle, is that the relational model turns out to be popular for a reason: it is extremely useful, and some of its useful properties are awkward to express in terms of operations on objects. True, you can ditch the ORM, but often you end up introducing complex queries to do things that are simple in SQL and the net result is more complex and harder to maintain than when you started. (Note “often” here; sometimes non-relational databases are the best tool for the job.)

And even in an object database, you still have to know what you’re doing! Storing objects instead of relational tuples won’t magically cause all your previously-slow queries to become lightning-fast. You will still need to think about data access patterns and indexes and caching and the rest. If the problem you’re trying to solve is “my queries are inefficient,” fixing the queries is a much better first step than ditching the entire database and starting over.

ursakhiin,

Even some of these issues ORMs are solving can be solved without them by caching a view of the data in structure of the object. Relational DBs are extremely well tuned for looking up and caching data in an easy to view manner.

Somebody else pointed out the problem is bad devs not learning their tools. I’d go so far as to say DB knowledge can (and was due a while) be a specialized field full of skills that will fall by the wayside for most devs because we aren’t doing super complex things in a single DB anymore with the preference going toward microservices. There’s no need to flex those skills and they depreciate over time.

koreth,

SQL, where injection is still in the top 10 security risks

This is absolutely true, but it’s not what it looks like on the surface, and if you dig into the OWASP entry for this, you’ll see they talk about mitigation.

You can completely eliminate the possibility of injection attacks using well-understood technologies such as bind variables, which an ORM will usually use under the covers but which you can also use with your own queries. There are many, many database applications that have never once had a SQL injection vulnerability and never will.

The reason SQL injection is a widespread security risk, to be blunt, is that there are astonishingly large numbers of inexperienced and/or low-skill developers out there who haven’t learned how to use the tools at their disposal. The techniques for avoiding injection vulnerability are simple and have been well-documented for literally decades but they can’t help if a lousy dev decides to ignore them.

Now, a case could be made that it’d be better if instead, we were using a query language (maybe even a variant of SQL) that made injection attacks impossible. I agree in principle, but (a) I think this ends up being a lot harder than it looks if you want to maintain the same expressive power and flexibility SQL has, (b) given that SQL exists, “get bad devs to stop using SQL” doesn’t seem any more likely to succeed than “get bad devs to use bind variables,” and © I have too much faith in the ability of devs to introduce security vulnerabilities against all odds.

frezik, (edited )

It’s also worth noting that the current entry on OWASP is injection generically. It includes SQL, but it also covers things like HTTP links where you concatenate in unchecked user input. SQL injection by itself may no longer be prominent enough make the list.

I’m also going to put the blame squarely on PHP for SQL injection attacks hanging on for so long, particularly when combined with MySQL. That DB didn’t support bind variables for a long time (maybe still doesn’t?). Other languages may have used a library that simulated bind variables for you. Barring that, they tended to always always always show how to use a sanitation function even if it was the first mention in the first tutorial for the top Google result for “sql [language]”. That creates a culture in the language of writing safe SQL code. Not PHP, though; the sanitation functions were there, but they never gave them the prominence that they so badly needed.

StarkillerX42, (edited )

There are other query languages. InfluxDB develops one called Flux. It’s a master class on why corporate led open source can really suck. Daily use features get deprecated all the time and each new release is breaking. The documentation is horribly insufficient and the language syntax is usually not clear and it wraps many behaviors into one vague function name. Yes, SQL is pretty impressive.

jochem,

With Influx 3 the preferred language is gonna be their SQL dialect. Flux is on its way out and I suspect it will get deprecated in the near future. Flux saw very slow adoption and SQL can do everything as well.

StarkillerX42,

That’s the most Influx-like decision ever.

hperrin,

Yes, but they’re not ubiquitous, so you gotta research them.

601error,
@601error@lemmy.ca avatar

I am both a (T-)SQL expert and a language design enthusiast. IMO, SQL the language is mediocre in its grammar and extremely resistant to cleanliness. Once you get past that, the things you can actually do with it are extremely useful.

I’d love for a better syntax to exist, but it’s a Herculean task to make one. Modern SQL dialects have gargantuan, labyrinthine grammars, and they grow with each new product version. It’s a lot easier to keep adding to that than to build a feature-complete replacement. This is also the reason why most ORMs are so frustratingly limiting: it’s too much work to support the advanced features of one SQL dialect, let alone multiple.

Olap,

ElasticSearch tried. It’s without a doubt one of the worst query languages to use. Nobody can remember how to write the queries, or even reading the aggregations. You literally need the docs open on another monitor.

A solid No from me. But I also never got the ORM hate, ActiveRecord on Ruby made it devine imo, sequelize on JS is fine, Diesel in rust is good

Zikeji,
@Zikeji@programming.dev avatar

ORM with an easy and straightforward way to execute custom queries is the sweet spot to me. Let the ORM handle the basics (and migrations), override / add optimized queries where relevant.

KrokanteBamischijf,

Sounds like it’s not really SQL as a query language but rather the whole database paradigm that’s the problem here.

Look into noSQL databases and their respective drivers. They often use JSON-like syntax and are more likely to be seamlessly integrated with whatever programming language you’re using.

If a search engine won’t point you in the right direction I’d suggest having a look at MongoDB, which is well documented and fairly accessible to mess around with.

  • All
  • Subscribed
  • Moderated
  • Favorites
  • random
  • uselessserver093
  • Food
  • aaaaaaacccccccce
  • [email protected]
  • test
  • CafeMeta
  • testmag
  • MUD
  • RhythmGameZone
  • RSS
  • dabs
  • Socialism
  • KbinCafe
  • TheResearchGuardian
  • oklahoma
  • feritale
  • SuperSentai
  • KamenRider
  • All magazines