Web systems are designed to be simple and reliable. Designing for the everyday person is the goal, but if you don’t consider the outliers, they are the ones who run into problems. This is everyday life for people whose names have features that designers often overlook, such as apostrophes or spaces. It is the life of [Luke O’Sullivan], who even had to fly under a different name than his legal one.
[O’Sullivan] is far from a rare surname, but it presents an interesting challenge for many computer systems. Systems from the era of pinching every bit relied on ASCII, which defines only 128 characters and a very small set of special characters. Some systems didn’t even include all of these characters, to reduce loading times. Throw in the security features put in place to prevent injection attacks, and you have a very unfriendly field for many uncommon names.
Unicode is a newer standard with over 150,000 characters, allowing for nearly any character. However, many older systems are far from easy or cheap to convert to the new standard. This forces many people to adapt to the software rather than the software adapting to the user. While this is simply poor design in general, [O’Sullivan] makes sure to point out how demeaning it can be for many people. Imagine being told that your name isn’t important enough to be included, or that it’s “invalid”.
One excuse that gets thrown about is the aforementioned injection attacks that can be used against these systems. These can cause systems to crash or even change settings; however, it’s not just older systems that are affected. For modern-day prompt injection, check out how AI models can be affected!
Thanks to Ken Fallon for the tip!
If you reject names because you are afraid of injection attacks, your software sucks because you are concatenating strings to form SQL statements that are later parsed. And if you are forced to do it that way, there are a multitude of functions/libraries available that sanitize and quote strings properly for SQL, which means if you are not using them your software sucks even more.
This isn’t a hard problem and it has been solved for decades, and anyone who writes code that doesn’t sanitize inputs should go back to school until they can be trusted to work in a professional setting. What’s worse, just imagine what other problems a piece of software will have if it can’t even process someone’s name properly.
In large systems, it’s more common that you are afraid of injection attacks against some other legacy software that you are not in control of. The risk of someone typing " OR 1=1 into an online form is much greater than that of getting a 1990s ticket agency to type it.
this.
Being Kurzweil’s coding or Karswell’s calligraphy—it all comes down to the proper casting of runes.
If you have a name of a thing–you have power over a thing…just kern your glyphs properly.
Some indigenous people even feared cameras taking their souls.
All of the tedious selfie photos on the internet would seem to confirm this.
SQL has a fundamental problem, in that it’s inherently a text command, so that both the command actions and the command data have to be converted to a text string.
This worked well for direct human input of commands, but doesn’t easily translate to a computer interface.
A more reasonable interface would have a function call and ENUMs for the keywords, and pointers to the various input data. Once you do that, you don’t need sanitizing or ASCII conversion of binary data, you just supply a pointer to the data at the appropriate point.
With varargs it looks a lot like what we have now:
SQL(K_SELECT, "Name,Age", K_FROM, Users, K_WHERE, "Age", K_GT, 30)
Quite readable, but in this model you don’t need any sort of sanitization, or even conversion of data. Just supply a pointer to the data at the appropriate point. (You may also need to pass the length of data if the data is not self-describing.)
It’s easy to imagine a JSON equivalent, just pass a dynamic struct instead of the function signature.
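A rough sketch of what such a call could look like, written in Python against the built-in sqlite3 module. Everything here is illustrative: the `Op` enum, the `SCHEMA` whitelist, and the `select` helper are hypothetical names, and because SQLite only accepts text commands the helper still generates SQL text internally. The point is that the caller never writes that text, and the data value is bound separately rather than spliced in.

```python
import sqlite3
from enum import Enum

class Op(Enum):
    """Comparison keywords, playing the role of K_GT and friends."""
    GT = ">"
    LT = "<"
    EQ = "="

# Hypothetical schema whitelist: identifiers come from here, never from users.
SCHEMA = {"Users": {"Name", "Age"}}

def select(conn, columns, table, where_column, op, value):
    """Run SELECT columns FROM table WHERE where_column <op> value."""
    known = SCHEMA.get(table)
    if known is None or not set(columns) <= known or where_column not in known:
        raise ValueError("unknown table or column")
    # SQLite only accepts text commands, so text is still generated here,
    # but the data value is bound separately and never spliced into it.
    sql = f"SELECT {', '.join(columns)} FROM {table} WHERE {where_column} {op.value} ?"
    return conn.execute(sql, (value,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (Name TEXT, Age INTEGER)")
conn.executemany("INSERT INTO Users VALUES (?, ?)",
                 [("Luke O'Sullivan", 41), ("Bobby Tables", 9)])

rows = select(conn, ["Name", "Age"], "Users", "Age", Op.GT, 30)
print(rows)
```

Keywords and identifiers are restricted to known values, so the only free-form input is the bound value, which the database treats purely as data.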
As a society we could have avoided a ton of hacks, malware, bugs, and headaches by just making SQL into a proper computer API.
Oh, and an XKCD comic as well.
i don’t think your criticism of SQL is valid, because the norm is specifically not to make the SQL-using programmer concatenate strings. for example perl dbi:
you don’t have to escape the argument to a text string, you simply pass the argument to the execute method separately from the string, and it does whatever composing is needed under the covers and with perfect safety
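The same placeholder pattern exists in most database libraries. A minimal sketch using Python's built-in sqlite3 module rather than Perl DBI; the `passengers` table and the sample names are made up for illustration:

```python
import sqlite3

# Hypothetical in-memory table, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE passengers (name TEXT)")

names = ["Luke O'Sullivan", "Robert'); DROP TABLE passengers;--"]

# The SQL text contains only a placeholder; each value travels separately,
# so the driver never parses the name as part of the command.
for name in names:
    conn.execute("INSERT INTO passengers (name) VALUES (?)", (name,))

stored = [row[0] for row in conn.execute("SELECT name FROM passengers ORDER BY rowid")]
print(stored)

# For contrast, naive concatenation breaks on the very first apostrophe.
try:
    conn.execute("INSERT INTO passengers (name) VALUES ('" + names[0] + "')")
    concat_failed = False
except sqlite3.OperationalError:
    concat_failed = True
print("concatenation failed:", concat_failed)
```

Both names come back from the table exactly as entered, with no escaping step in the calling code.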
The real universal solution that has been known for decades is stored procedures.
Instead of giving the web interface full SQL access, you make stored procedures that do single operations like “get user list” “Get user info” “change user address” etc. Each procedure has extremely limited permissions in the database and is made to roll back everything in case of a runtime exception. The web user only has access for executing these procedures.
The great secondary effect is that anybody that lays hands on the web user database credentials or roots the web server cannot see the database schema or mess with the data directly. Even getting the list of procedures is off-limits if you are lucky and they can’t read the source of your web app.
Defense in depth works. Trusting your user interface any more than a DMZ is a rookie mistake.
Prepared statements.
Poor Bobby Tables can’t fly either.
Well, we’ve lost this year’s student records. I hope you’re happy.
And I hope you’ve learned to sanitize your database inputs.
Little Bobby grew up, started a business:
https://www.reddit.com/r/technology/comments/5l47yy/someone_registered_drop_table_companies_ltd_in/
And bought a nice car:
https://imgur.com/00ir7fQ
https://xkcd.com/327
I once had a coworker with “30 years of SQL DB experience” ask me, “Say, just how do you handle apostrophes in names anyway?”
I must admit I kinda stared at him in shock: “Um, like, dude, that’s like page 3 in ‘SQL for Dummies’.” And I couldn’t remember off the top of my head anyway, as I shoved all that into a function years ago. (It’s a bit tricky as it varies between systems and you need to watch out for things like # as well.)
As the old joke goes, often “30 years of experience” is just one year repeated 30 times.
Using Code Page 437 might be a workaround that solves most European and American name issues.
Yes, it’s proprietary and not pure ASCII.
But being part of IBM PC architecture for over 40 years makes it relevant.
Both PC/AT BIOS (CGA font) and VGA BIOS (VGA font) have CP437 stored.
It’s not the best there is, but it’s a standard someone could agree upon.
And it has things such as smileys, umlauts and apostrophes.
To vintage systems it might be easier to understand than UTF-8!
https://en.wikipedia.org/wiki/Code_page_437
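How far CP437 actually gets you is easy to check, since Python ships a `cp437` codec. A small sketch; the list of names is illustrative and the `fits_cp437` helper is a hypothetical name:

```python
# Candidate names; the list is illustrative, not taken from any real system.
names = ["Müller", "Strauß", "O'Sullivan", "O’Sullivan", "João", "Søren", "Łukasz"]

def fits_cp437(text: str) -> bool:
    """Return True if every character in text has a Code Page 437 encoding."""
    try:
        text.encode("cp437")
        return True
    except UnicodeEncodeError:
        return False

for name in names:
    print(f"{name!r}: {'ok' if fits_cp437(name) else 'not representable'}")
```

German umlauts and the plain ASCII apostrophe fit, but the typographic apostrophe (U+2019) and letters such as ã, ø, and Ł have no CP437 code, so it covers only part of the European and American name space.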
Uh, yeah, ok……
Did you miss the bit about apostrophe’s being a solved problem? (Except of course using them incorrectly, but only Reddit losers care about that.)
Well, there’s that “using an apostrophe incorrectly” thing again.
Joshua is from Germany.
German does not use the apostrophe – at all. There is no correct way to use the apostrophe in German. It simply doesn’t exist.
Despite this, it is often seen used by the most illiterate Redditors. It is never used to show possession. It is always used when an “S” is used to indicate a plural word. It is baffling and irritating.
That sums it up pretty well, thank you! 🙂👍
Though I think I must confess that some of us have the bad habit of using apostrophes for stylistic reasons.
Like “Was gibt’s neues bei euch?”
The word “gibt’s” should be written as “gibts” or “gibt es”.
Anyway, it’s a 90s thing maybe, I suppose.
Back then us Germans tried to be more open, less stiff, more relaxed, more casual.
It’s when workers didn’t use suits at work anymore but wore sneakers and t-shirts.
And since mechanical typewriters were still around in the parents’ house, or because we used home computers,
many of us probably haven’t learned to use different apostrophes.
It’s probably the ‘ character that’s used most, even if there’s ` character.
Another oddity was writing “Du” or “Dir” or “Dich” with a capital letter in books and letters.
It was an in-between of “du” (informal) and “Sie” (formal).
The “Du” was a more respectful, yet still intimate form of writing to someone.
Like when an author spoke to the reader in a book. In the introduction part, for example.
Anyway, it was an unofficial thing. It probably isn’t even documented.
What’s funny, though, when thinking about it..
The “du” is often translated as “you”, when in reality they’re not the same.
The German “du” is informal, while English “you” is formal.
So in principle, German “Sie” (formal) must be used for “you” (formal) and German “du” (informal) must equal the obsolete “thou” which was informal.
Another thing that comes to mind.
In German, we’re trying to use gender specific speech now (not my fault).
So that females (and diverse people) are being mentioned especially, are being noticed and don’t feel being excluded.
Such as the word Autorin instead of Autor (generic masculine).
Apparently, the genders of German words (professions, roles etc.) are now being treated the same as the real biological genders of people.
Previously, it was just a grammatical gender in the German language.
English goes the entirely opposite way of solving the issue by omitting gender specific words.
An actress becomes an actor (generic).
Makes sense to me if inequalities are being addressed.
The paradox problem here in Germany is,
that some groups here don’t want to be any “special” but being accepted on eye level with men.
Such as aforementioned female authors,
which fought hard for acceptance/recognition and even used male pseudonyms in the past 100 years.
Another change is the shift from professions to activities.
Students are now “studying ones” (Studenten -> Studierende).
Why? Gender equality. Studenten is male plural (Student is male singular), so it has to be omitted.
By turning it into an activity, it’s gender neutral. Or so it’s supposed to be? I’m having headaches, too.
(In the not so distant past, both words were also seen as generic masculine, with females silently being included.
Studentin/Studentinnen was explicit female version and used if needed.)
Sorry for the very long comment, but I felt I should give an idea about current situation over here.
Because to foreigners and tourists in Germany this must be very confusing. In the past 5-7 years, a lot of things changed.
Hope you don’t mind for mentioning it.
Speaking under correction also, my English isn’t the yellow from the egg.
Another classic mistake is some of us using the wrong quotation marks.
“Hallo” instead of „Hallo“.
It’s also probably due to the typewriter and home computer legacy again.
On the limited keyboards of our generation(s), there weren’t multiple types of quotation marks.
for the “du” vs. “Du”: at school I was told the capital version is used when addressing someone directly (in a letter, for example), otherwise lower case
“German does not use the apostrophe – at all. There is no correct way to use the apostrophe in German. It simply doesn’t exist.”
Um, no? The apostrophe does, indeed, exist.
– For contractions (like “was gibt es?” => “was gibt’s?”)
– In cases of showing possession for subjects ending in an “s sound” (like “Lukas’ Fahrrad”)
IIRC
In German ‘Fik Du’ is impolite!
Say ‘Fiken Si’ to be polite to those of higher class.
Americans have no class! (old, old joke)
Why I’m not German, despite the dual citizenship.
That and ‘the incident’ on the German train when I was 2.
Rich lady I pissed on from the luggage rack wanted me arrested.
Mom convinced the cops and conductor I was 2 and couldn’t even remember doing it.
Hence I was told to apologize and it would end.
When they brought the damp lady in for her apology, I started Bond villain laughing at her again.
Mom decided if she moved back to Germany, I was doomed.
Mom’s family are ‘high class’ Germans.
I am related to German Billionaires (thousand millionaires anyhow).
Not close enough to do me any good though, second cousin once removed has most of the money.
Guy asks legit question.
Other guy stares blankly because it’s so easy.
Other guy can’t remember how to do it.
Other guy says “well actually it’s tricky and depends on the system”
Other guy offers dumb misrepresented quote.
Good job there Other guy. You sure showed him…
If you couldn’t remember off the top of your head why should they?
Asking the question doesn’t have to mean they have never and couldn’t do it right, it might just mean they know they have a problem that needs careful handling and hoped you just remembered or had some new library etc that deals with the challenge cleanly.
It’s easier for humans to just remember “I’ve seen a solution to this problem, it exists” and then look it up on the internet, than it is to remember every individual solution. With such easy access to the internet for most, you could argue that we’re becoming cyborgs.
Falsehoods Programmers Believe About Names – With Examples assumption 40: “People have names.”
Indeed, very strange points there.
I’m wondering if there are people who use musical notes to write their “names” or are used to bark their names in their own culture.
Assumption “11. People’s names are all mapped in Unicode code points” is correct because Unicode allows the construction of characters via composition, so you can make any character now.
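Composition has a practical side effect: the same visible name can be stored as different code point sequences, so comparisons need a normalization step. A short sketch with Python's standard `unicodedata` module; the sample name is arbitrary:

```python
import unicodedata

precomposed = "Zo\u00eb"   # 'ë' as a single code point (U+00EB)
decomposed = "Zoe\u0308"   # 'e' followed by COMBINING DIAERESIS (U+0308)

# Both render the same, yet they are different strings of different length.
print(precomposed == decomposed, len(precomposed), len(decomposed))

# Normalizing both to the same form (NFC here) makes them comparable.
nfc_a = unicodedata.normalize("NFC", precomposed)
nfc_b = unicodedata.normalize("NFC", decomposed)
print(nfc_a == nfc_b)
```

A system that stores or compares names without picking one normalization form can treat two identical-looking spellings as a mismatch.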
Assumption “40. People have names” is also correct because a person in any modern society has a name. No programmer is dealing with randos from an isolated tribe if they aren’t expressly building a system to catalog them.
My full last name is 31 chars long; most systems only support 30 chars. Always fun if you have to buy a plane ticket where the entered name should match your passport.
Part of the problem is likely security.
Targetdrone is on the no-fly list.
Tаrgetdrone is not on the no-fly list.
The second one features the Unicode character U+0430 (\u0430), the Cyrillic small letter a.
https://en.wikipedia.org/wiki/IDN_homograph_attack
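A crude way to spot this kind of lookalike is to check whether a string mixes scripts. The sketch below uses the first word of each character's Unicode name as a stand-in for its script, which is only a heuristic; the helper names are hypothetical, and a real check would rely on the Unicode Script property and confusables data instead.

```python
import unicodedata

def scripts_used(text: str) -> set[str]:
    """Rough script detection: first word of each letter's Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in text
        if ch.isalpha()
    }

def looks_mixed(text: str) -> bool:
    """Flag strings whose letters come from more than one script."""
    return len(scripts_used(text)) > 1

latin = "Targetdrone"
spoofed = "T\u0430rgetdrone"  # second letter is CYRILLIC SMALL LETTER A

print(latin == spoofed)
print(scripts_used(latin), looks_mixed(latin))
print(scripts_used(spoofed), looks_mixed(spoofed))
```

Flagging mixed-script names for human review is gentler than rejecting them outright, since some legitimate names do mix scripts.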
I used to do this over 20 years ago at tibia.org.pl forum, masquerading as different users. Fun times.
Considering you need a valid ID or passport to fly, both of which are digital, I don’t see this as being a problem.
Similar to living in Japan with a middle name…
According to the banks, my first and middle names are one word and along with my last name are in all uppercase, but now it doesn’t match my Japanese government ID or drivers license. So I have to get a human at the bank involved whenever any verification is needed, which is a pain as the system is increasingly set up for online verification methods, BUT they also apparently continue to refuse to allow middle names, or even upper and lower case characters…
And don’t get me started on the arbitrary and mandatory random use of half-width and full-width characters, or even character limits on names so a good portion of western names are simply too long to be entered into the system…
Oh well, that sucks. 😟
But on the other hand it has to be sort of expected when being an “anomaly” in any foreign “system”. 😕
Maybe you can get the equivalent to a stage name that’s Japanese?
That serves as an alias (or nick name) to your real name, but can be understood flawlessly by the “system” so to say?
I think this Hackaday is on to something here.
Youtube is giving me a: “Sign in to confirm you’re not a bot” for this video. LOL.
And that has not really been possible anymore from the time that youtube rejected my user name because it was “unpronounceable”. And that was some 10 to 15 years ago.
Is the TSA using the rubber gloves on this guy?
It’s annoying, but also understandable at the same time. Computers simply have an English heritage.
As a workaround, there are things like umlauts or HTML entities.
Using international ID numbers for a world of global citizens would be another one.
Because as pointed out, there are various combinations of names in the world.
Saving a database name and storing a bitmap of the real name for comparison purposes would be another one.
Especially for those humans living in some rain forest tribes who use their feet for signing documents or something. ;)
Remember the old PaypaI scam? A capital I instead of a lowercase L at the end tricked many people into logging into a fake web site and losing money. Depending on the default font used, the two can be indistinguishable to some people.
Someone can do that with similar looking cyrillic letters, too.
The problem of non-ASCII, non-English letters isn’t just displaying them correctly, but also entering them into the system.
I mean, let’s take that airport example.
If someone with far-east character name books a flight in, say,
France – then is the French airport personnel or French travel agent supposed to open up an input method editor (IME) and tediously enter the far-east characters?
Assuming it’s Kanji characters, then is the European personnel supposed to memorize how all the individual Japanese characters sound?
I mean, they’re already hard-pressed to speak/understand correct English.
Which, by the way, has letters that can be spelled in international NATO alphabet.
I hate sans serif fonts. I want I and l and | to look distinct.
It’s amazing that we still have this issue in 2025. I guess the airline booking infrastructure is ancient. Thinking mainframes running a program written in fortran :-D
Yup.
It’s called SABRE, dates back to about 1960. Most airlines use it.
Far from only airlines. My last name starts with two capital letters. Even government computers can’t handle it, even with actual towns having names that start with two capital letters, which means they should have a system in place that can handle it.
I have this problem too, hence why I have an alias on my passport because in one case it didn’t match and they almost refused to let me fly (Ryanair if you’re curious). But I’m glad it happened because my visa application didn’t match on an intercontinental flight and without the alias they wouldn’t let me in.
But it’s not just airlines, it’s anything with a paywall or that requires name registration, like an electronic visa application.
It’s fairly crazy.
But this sort of one-box-for-all exists everywhere because user stories aren’t considered in a huge amount of development. For example, some online order and delivery companies require a house number or street name. I have neither.
Well have a similar problem with the umlaut in my name…..
The classic XKCD, “Exploits of a Mom”: https://xkcd.com/327
I still have to remove the dash from “Jean-Luc” when filling online forms from time to time.
You’d think that Starfleet would have solved this problem by now.
No, sadly. By 24th century starfleet can’t handle lower-case anymore, even.
It’s all in capitals, as shown on the LCARS panels.
Oh, and then there’s the problem of the Bajorans (?) and that their first and last names are swapped when addressed.
Lt. Worf was lucky enough to be adopted by humans and thus given a real last name. “Son of Mogh” would also have caused him passport problems.
And then there’s Mr. Spock. His full name was unpronounceable by humans, I remember vaguely.
Therefore, it is unlikely that in his era this was stored in the space fleet’s memory banks,
especially since computer input/output was done via voice and buttons, rather than keyboards.
I remember talking to a girl in uni. She didn’t have a first or last name. It was really weird. Her ID said something like Unknown for both fields. I remember her saying that where she was from (an Island tribe, if I remember correctly, part of Papua), they had a celebration when a girl turns 14 or something to celebrate her passing into womanhood and they give them a name and a husband. But she left with her parents before that so they never gave her a name. Seems like that’s a very difficult thing to deal with in modern society. There is this video about people who are actually named James Bond and their encounters with police, which isn’t pretty as the cops assume they are messing with them. Having no name must be so much worse.
Look at Dylan Beattie’s talks about “plain text”.
There’s a guy called something Norwegian, let’s call him “Haakon”, except the aa is a transcript of the a-with-a-circle, and Americans would just transcribe him as Hakon.
Now the passport (Haakon) and plane ticket (Hakon) don’t match. Great “fun”, probably ok if you are white, probably not fun if you have a higher pigmentation.
Don’t even need non-ascii in your name. Last flight I booked to the US, I had to do manual check-in everywhere.
German systems like to use all of the prefixes to your name, so I was Mr. Dr. Elliot. Somewhere between Lufthansa and United, this became MRDRELLIOT, which reads like a death threat and doesn’t match my passport.
I mean, I got through. But the extra hassle and not being able to reserve a seat?
Oh Mann! 🥲
I felt this article deep in my bones.
Version 1.0 of the Hackaday accounting system got broken when we hired Moritz von Sivers.
I feel you feeling it.
My cousin married a woman with the last name “Test”. She was forever being deleted from systems, and was glad to have an excuse to change it.
The problem is that SQL queries are string based, so both the command and the payload/metadata are in the same format. Without sanitization (escaping or substituting characters), names can cause commands to fail or, worse, do something different. They should never have used text as the interface. It caused too many issues.
When growing up my street name had a special character that wasn’t recognized by many systems. Received many letters with garbled letters in my street name. At some point even one of the replaced street signs didn’t have this special character but a substitute, so apparently the street sign factory didn’t know how to process that character either.
Neither. SQL explicitly supports PREPARE statements which completely handle the perceived need for “escaping” or “substituting”.
That is not a SQL problem. That is a code monkey problem.
And even then, SQL problems are not SABRE problems.
doesn’t seem like a big deal to me. at the risk of sounding like neal stephenson at the end of snowcrash, it seems to me like it’s rooted in a misapprehension about the function of symbols.
my ID card doesn’t sing me a song or wrap a gift, even though it has my birth date on it. it isn’t my face even though it has a low quality portrait on it.
your plane ticket has a symbol that represents your name but it isn’t your name. a system that requires them to match perfectly might exist and might be a huge problem. but generally, just fly under OSullivan or Osullivan or OSULLIVAN or osullivan, whatever the database normalizes it to. is it really so bad that this piece of paper represents your name instead of being your name?
if you don’t have umlauts on your keyboard, you can just add ‘e’ like ‘Guenther’. if you don’t have the funny beta “S-set”, just use two S’s. if you can’t put uppercase and spaces in the middle of your name, there’s nothing wrong with “vanmorrison”.
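that kind of normalization can be sketched as a lossy "match key" used only for comparison, while the name as written stays stored elsewhere. the transliteration table below is a small illustrative assumption (ö becomes "oe" in german but not everywhere), and `match_key` is a hypothetical helper name:

```python
import unicodedata

# Conventional German/Nordic transliterations; illustrative and incomplete.
TRANSLIT = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss",
            "å": "aa", "ø": "oe", "æ": "ae"}

def match_key(name: str) -> str:
    """Reduce a name to a lossy, ASCII, uppercase key for comparison only."""
    out = []
    for ch in unicodedata.normalize("NFC", name).lower():
        ch = TRANSLIT.get(ch, ch)
        # Decompose what is left so that, for example, é becomes e plus a
        # combining accent, then keep only ASCII letters and digits.
        ch = unicodedata.normalize("NFKD", ch)
        out.append("".join(c for c in ch if c.isascii() and c.isalnum()))
    return "".join(out).upper()

for written in ["O’Sullivan", "OSULLIVAN", "Günther", "Guenther", "Van Morrison"]:
    print(written, "->", match_key(written))
```

with this table "Håkon" and "Haakon" share a key but "Hakon" does not, which is exactly the passport versus ticket mismatch described in another comment.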
The problem is when one system can cope and another can’t, though. Like, you’d have to legally change your last name to the fake name to get that on your passport, right?
I occasionally run into that when placing on-line orders.
Once, the web site was designed to accept the apostrophe, and my credit card had the apostrophe, so it should have gone through. But their credit card processor choked on the apostrophe, so that actually took a phone call with the vendor to suggest they take out the apostrophe when running it through their processor.
Another time, the pizza place put in about a dozen backslashes to escape the apostrophe.
right but can’t the system actually, in reality, cope? if you write O’Sullivan on your school assignments and osullivan on your school’s login authentication and OSULLIVAN on your passport and O’sullivan on your driver’s license, do you ever actually run into a situation where these symbols don’t match? Doesn’t the airline’s ticket agent understand that OSULLIVAN on your passport is the same person as O’Sullivan on the ticket?
i mean there are cases where the processes are so rigid and the mismatches so extraordinary that there is a failure, and in those cases i don’t think database design is going to save us from a combination of snafu and official bias. but surely everyone has seen O’ cross their desk by the third day on the job?
” Doesn’t the airline’s ticket agent understand that OSULLIVAN on your passport is the same person as O’Sullivan on the ticket?”
From what I’ve seen on YT and TV documentaries, it depends.
Most, but not all have common sense. The US seems very picky, also on purpose because of.. ah, I can’t say that here. You’ll figure out yourself.
the funny beta Eszett is a ligature: https://en.wikipedia.org/wiki/Long_s?useskin=vector#Ligatures
(… and the st ligature is the reason why you don’t line break words between s and t, except when there wouldn’t be a ligature anyways, and the sf ligature does not exist — the f then becomes ph like in Asphalt or the word is set in antiqua like in sforzato — but when using Fraktur in LaTeX “sf” will be mapped to the “ft” ligature)
I once had a street number of 11 1/3. Same problem: not all web sites accepted it. Dropping the 1/3 part is not an option if I want my parcels delivered, because that gives a valid address ~1 km away.
My Irish surname (O’Leary) once crashed bank computers (ABN AMRO, Netherlands) when I tried to open an account there… they had to drop it. Well used to it. It’s not on my green card either.
This makes no sense. The US alone has tons of foreign names, many French ones for instance. And of course with worldwide air travel there simply is no way the software wasn’t adapted close to the start.
Of the billions of people how many need apostrophes alone? So goddamn many.
There are quite a few congresspeople alone who have names like O’something, and with all the uptightness they will fall all over you if your name is different from the ticket, so how can this be a thing in 2025?
Did anybody check this story at all? Is this AI hallucinating?
Even legacy software and systems can be fixed by modifying the machine code. The fact that they don’t bother is insulting.
My school, while being in Switzerland, uses a DB that doesn’t allow names using an accent, which means all my current diplomas don’t have my true name on them
I’m surprised there is no EU regulation requiring businesses to accept legal names.
Diacritics, apostrophes, and such frequently play into writing interfaces between systems. I’ve been managing identities for a decade at a hospital system, and it’s surprising how many systems downstream of the authoritative source are designed with faulty name assumptions.
So many systems accommodate given name, middle/Christian name, and surname but don’t accept the Latin American mother’s surname anywhere, nor the Korean order of family name first, then given name. Nor are these systems built to acknowledge that the lived name may differ from the legal name (like nicknames, but including middle/last names or eschewing conventions altogether). In fact, it is so ingrained that a big day in language schools is the day you are given a western name that you will take for cross-cultural conversation and interaction.
A stranded traveler is something I see on a daily basis in the digital world.
What\’s the problem
Try entering correct Welsh language and you will see a similar problem.
On a related point, I found someone programmed SAP to take the filename of uploaded files and prefix the directory path changing slashes, backward slashes, and colons to underscores. This made for a long name, that could exceed the maximum path length if downloaded into a long directory path. I think only the original filename should be kept.
O’Neill surname here. I don’t usually even bother adding the apostrophe when using computer systems anymore.
I have seen it all: systems saying it’s an invalid character, my name being truncated at the apostrophe, cosmetic string filter procedures saying US letters only, or the entire process just going kaput with an internal error when the submit button is clicked. Some systems replace the apostrophe with a space, and that makes an entire mess of its own because suddenly the “O” looks like a middle initial. I have figured most of it out in 2025, but having been on the internet since the beginning, it’s been a constant struggle. It’s not a new issue either. My parents constantly dealt with issues with the IRS over it too, back before the internet.
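For comparison, a filter does not have to mean "US letters only". The sketch below accepts letters and combining marks from any script plus a handful of punctuation marks; the `NAME_PUNCTUATION` set and the `is_plausible_name` helper are assumptions for illustration, not a complete rule set. It is a cosmetic check only, and protection against injection still belongs in the database layer.

```python
import unicodedata

# Punctuation commonly found inside names; an illustrative, not exhaustive, set.
NAME_PUNCTUATION = {"'", "\u2019", "-", " ", "."}

def is_plausible_name(text: str) -> bool:
    """Accept letters and combining marks from any script plus a little punctuation."""
    text = text.strip()
    if not text:
        return False
    for ch in text:
        category = unicodedata.category(ch)  # e.g. 'Lu', 'Ll', 'Mn', 'Po'
        if category.startswith(("L", "M")) or ch in NAME_PUNCTUATION:
            continue
        return False
    return True

for candidate in ["O'Neill", "O’Neill", "Jean-Luc", "Moritz von Sivers", "Zoë", "Tom;1"]:
    print(repr(candidate), is_plausible_name(candidate))
```

Whatever the filter accepts should then be stored unchanged, without truncating at the apostrophe or swapping it for a space.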