I’m a tech interested guy. I’ve touched SQL once or twice, but wasn’t able to really make sense of it. That combined with not having a practical use leaves SQL as largely a black box in my mind (though I am somewhat familiar with technical concepts in databasing).
With that, I keep seeing [pic related] as proof that Elon Musk doesn’t understand SQL.
Can someone give me a technical explanation for how one would come to that conclusion? I’d love if you could pass technical documentation for that.
Musk’s statement about the government not using SQL is false. I worked for FEMA for fourteen years, a decade of which was as a Reports Analyst. I wrote Oracle SQL+ code to pull data from a database and put it into spreadsheets. I know, I know. You’re shocked that Elon Musk is wrong. Please remain calm.
I work for a crown corp in Canada we have, off the top of my head, about 800 MSSQL, Oracle, MySQL/MariaDB, Postgres databases across the org (I manage our CMDB). Musk is a retard. The world runs on SQL.
He wouldn’t know this though because he’s a techbro that builds apps with MongoDB b cause he doesn’t understand what normalizing data is and why SQL is the best option for 99.9999999% of applications.
Fucking idiots.
As a former DOD contractor I can also confirm we built whole platforms that use Oracle (shudder) SQL
“The government” is multiple agencies and departments. There is no single computer system, database, mainframe, or file store that the entire US goverment uses. There is no standard programming language used. There is no standard server configuration. Each agency is different. Each software project is different.
When someone says the government doesn’t use sql, they don’t know what they are talking about. It could be refering to the fact that many government systems are ancient mainframe applications that store everything in vsam. But it is patently false that the government doesn’t use sql. I’ve been on a number of government contracts over the years, spanning multiple agencies. MsSQL was used in all but one.
Furthermore, some people share SSNs, they are not unique. It’s a common misconception that they are, but anyone working on a government software learns this pretty quickly. The fact that it seems to be a big shock goes to show that he doesn’t know what he is doing and neither do the people reporting to him.
Not only is he failing to understand the technology, he is failing to understand the underlying data he is looking at.
Yeah, obviously ol’ boy is tripping if he thinks SQL isn’t used in the government.
Big thing I’m prying at is whether there would be a legitimate purpose to have duplicated SSNs in the database (thus showing the Vice Bro doesn’t understand how SQL works).
I’m not aware of any instance where two people share an SSN though. The Social Security Administration even goes as far as to say they don’t recycle the SSNs of dead people (its linked a couple times in other comments and Voyager doesn’t let me save drafts of comments, I’ll make an edit to this comment with that link for you).
Can you point me to somewhere showing multiple people can share an SSN?
Edit: as promised: The Social Security FAQ page
I’d imagine the numbers of dead people eventually get cycled around to. 9 digits only gives you 999,999,999 people to go through, and we have over a third of that in existence right now.
My wife has a tax payment history under two different legal names which share a single SSN
Hmmm, well I can’t speak to how the actual databases are put together, so maybe they would have that as two separate unique primary keys with a duplicated SSN.
But it really seems like bad design if they out it together that way…
Worth noting is that “good” database design evolved over time (https://en.wikipedia.org/wiki/Database_normalization). If anything was setup pre-1970s, they wouldn’t have even had the conception of the normal forms used to cut down on data duplication. And even after they were defined, it would have been quite a while before the concepts trickled down from acedmemia to the engineers actually setting up the databases in production.
On top of that, name to SSN is a many-to-many relationship - a single person can legally change their name, and may have to apply for a new SSN (e.g. in the case of identity theft). So even in a well normalized database, when you query the data in a “useful” form (e.g. results include name and SSN), it’s probably going to appear as if there are multiple people using the same SSN, as well as multiple SSNs assigned to the same person.
Assuming the whole “duplicate SSN” thing isn’t just a complete fabrication, we have no idea what table he was even looking at! A table of transactions e.g. would have a huge number of duplicate SSNs.
I mean I don’t know a ton about SQL but one thing to keep in mind about SSNs is they were not originally meant to be used for identification but because we have no form of national id and places still needed a way to verify who you are people just started using SSNs for that since it’s something everyone has and there wasn’t really a better option. So now the government has been having to try and make them work for that and make them more secure. The better solution would be to make some form of national id that is designed to be secure but Republicans and people like Musk would probably call that government overreach or a way to spy and track people.
This is from 15 years ago, so I don’t know how much has changed since then. But this sounds like the sort of thing they mean.
https://www.nbcnews.com/technolog/odds-someone-else-has-your-ssn-one-7-6c10406347
Because a simple query would have shown that SSN was a compound key with another column (birth date, I think), and not the identifier he thinks it is.
Why would one person, one SSN ever have two different birth dates? That sounds like an issue all onto itself.
As a data engineer for the past 20+ years: There is absolutely no fucking way that the us gov doesnt use sql. This is what shows that he’s stupid not only in sql but in data science in general.
Regarding duplications: its more nuanced than those statements each side put. There can be duplications in certain situations. In some situations there shouldnt be. And I dont really see how duplications in a db is open to fraud.
Yeah, obviously ol’ boy is tripping if he thinks SQL isn’t used in the government.
Big thing I’m prying at is whether there would be a legitimate purpose to have duplicated SSNs in the database (thus showing the First Bro doesn’t understand how SQL works).
It doesn’t matter without scope. Are we looking at a database of SSNs? tax records? A sign in log? The social security number database might require uniques in some way, but tax records could be the same person over multiple years. A sign in gives a unique identifier but you could be signing in every day.
It’s like saying a car VIN shows up multiple times in a database. Where? What database? Was it sold? Tickets? Registered every year?
This is nothing more than a “assume I mean immigrants or tax fraud and get mad!” inflammatory statement with no proof or reason.
If it’s used as an identifier to link together rows from different tables. Also known as “joining” tables. SSN (with birthdate) is a unique identifier, and so it’s natural to choose as a primary/foreign key.
It really is baffling trying to make sense of what he is saying. It’s like the only explanation that makes any sense at all is that he has no idea what he is talking about. Even if he knew just cursory knowledge about database cardinality you wouldn’t say stuff so stupid.
Because SQL is everywhere. If Musk knew what it was, he would know that the government absolutely does use it.
This explanation makes no sense in the context of OP’s question, given the order of comments…
Yeah, a better explanation is that Deduplicating Databases are an absolutely terrible idea for every use case, as it means deleting history from the database.
Because of course the government uses SQL. It’s as stupid as saying the government doesn’t use electricity or something equally stupid. The government is myriad agencies running myriad programs on myriad hardware with myriad people. My damned computers at home are using at least 2-3 SQL databases for some of the programs I run.
SQL is damn near everywhere where data sets are found.
It’s entirely possible that the database is pre SQL.
He didn’t say the SSN database isn’t SQL. He said the GOVERNMENT doesn’t use SQL.
Yeah, obviously ol’ boy is tripping if he thinks SQL isn’t used in the government.
Big thing I’m prying at is whether there would be a legitimate purpose to have duplicated SSNs in the database (thus showing the First Bro doesn’t understand how SQL works).
SSNs being duplicated would be entirely expected depending upon the table’s purpose. There are many forms of normalization in database tables.
I mean just think about this a little bit, if the purpose is transactions or something and each row has a SSN reference in it for some reason, you’d have a duplicate SSN per transaction row.
A tiny bit of learning SQL and you could easily see transactional totals grouped by SSN (using, get this, a group by clause). This shit is all 100% normal depending upon the normalization level of the schema. There are even – almost obviously – tradeoffs between fully normalizing data and being able to access it quickly. If I centralize the identities together and then always only put the reference id in a transactional table, every query that needs that information has to go join to it and the table can quickly become a dependency knot.
There was a “member” table for instance in an IBM WebSphere schema that used to cause all kinds of problems, because every single record was technically a “member” so everything in the whole system had to join to it to do anything useful.
Oh, well another user pointed out that SSN’s are not unique, I think they are recycled after death or something. In any case, I do know that when the SSN system was first created it was created by people who said this is NOT MEANT to be treated as unique identifiers for our populace, and if it were it would be more comprehensive than an unsecure string of numbers that anyone can get their hands on. But lo and behold, we never created a proper solution and we ended up using SSN’s for identity purposes. Poop.
I’m pretty sure there is a federal statute that says ONLY the SSA may collect or use SSNs, as to federal agencies. I argued it once when a federal agency court tried to tell me that it couldn’t process part of my client’s case without it. I didn’t care but my client was crotchety and would only even give me the last four.
Edit. It’s a regulation:
https://www.law.cornell.edu/cfr/text/28/802.23
An agency cannot require disclosure of an SSN for any right or benefit unless a specific federal statute requires it or the agency required the disclosure prior to 1975.
In my case the agency got back to me with some federal statute that didn’t say what they said it said, and eventually they had to admit they were wrong.