How Three Letters Brought Down UK Air Traffic Control

The UK bank holiday weekend at the end of August is a national holiday in which it sometimes seems the entire country ups sticks and makes for somewhere with a beach. This year though, many of them couldn’t, because the country’s NATS air traffic system went down and left many stranded to grumble in the heat of a crowded terminal. At the time it was blamed on faulty flight data, but news now emerges that the data which brought down an entire country’s air traffic control may not have been faulty at all.

Armed with the official incident report and publicly available flight data, Internet sleuths theorize that the trouble was due to one particular flight: French Bee flight 731 from Los Angeles to Paris. The flight itself was unremarkable, but the data which sent the NATS computers into a tailspin came from two of its waypoints, Devils Lake in North Dakota and Deauville in Normandy, which share the identifier DVL. Given the vast distance between the two points, the system believed it was looking at a faulty route, and refused to process it. A backup system automatically stepped in to try to reconcile the data, but it made the same determination as the primary software, so the whole system apparently ground to a halt.
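To make the duplicate-identifier problem concrete, here is a minimal Python sketch of one way to resolve an ambiguous identifier: pick the candidate closest to the previous point on the route. This is not how the NATS software works, and we have no insight into its code. The `FIXES` table, the `resolve` helper, and the rounded coordinates are all our own illustrative assumptions, not real navigation data.

```python
from math import asin, cos, radians, sin, sqrt

# Rough, illustrative coordinates (latitude, longitude in degrees).
# Hypothetical table for this sketch only; not real navigation data.
FIXES = {
    "DVL": [
        ("Devils Lake, North Dakota", 48.1, -98.9),
        ("Deauville, Normandy", 49.3, 0.3),
    ],
}


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points, in kilometres."""
    p1, p2 = radians(lat1), radians(lat2)
    dp = p2 - p1
    dl = radians(lon2 - lon1)
    a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def resolve(identifier, previous):
    """Return the candidate for `identifier` nearest to `previous` (lat, lon)."""
    return min(
        FIXES[identifier],
        key=lambda fix: great_circle_km(previous[0], previous[1], fix[1], fix[2]),
    )


# Early in the route, just after leaving Los Angeles (approximate position).
first = resolve("DVL", (33.9, -118.4))
# Late in the route, approaching the French coast (approximate position).
second = resolve("DVL", (49.0, -2.0))

print(first[0])
print(second[0])
```

With the previous position as context, the same three letters resolve to two different places, and the long hop between them is no longer a reason to reject the route.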

It’s important to note that there was nothing wrong with the flight plan entered by the French Bee pilots, and that early stories blaming faulty data were themselves at fault. However, we are guessing that air traffic software developers worldwide are currently scrambling to check their code for this particular bug. We’re fortunate indeed that safety wasn’t compromised and that inconvenience was the only major outcome.

Air traffic control doesn’t feature here too often, but we’ve previously looked at a much earlier system.

Header image: John Evans, CC BY-SA 2.0.

51 thoughts on “How Three Letters Brought Down UK Air Traffic Control”

  1. IATA codes are supposed to be unique. (They also assign them to airlines, ports and some rail stations). There are currently about 50k active codes.

    Amusingly, they don’t seem to censor risqué names: Sioux City is SUX, and an airline in Senegal has WTF (from Western Africa air).

    1. “IATA codes are supposed to be unique” but they are not.
      There are known duplicates: some small airports have the same code as major airports.
      Theoretically, no airline should be serving two different airports with the same code.

      1. But it can happen. With a maximum of 17,576 possible three-letter combinations from the 26-letter alphabet, and an estimated 41,700 airports around the world, some are bound to overlap, and there may even be 3 or 4 airports with the same letters.

        If I were to charter a private aircraft, the computers could get brain freeze trying to process an unexpected source and destination code. If I were to fly from DVL to DVL, it could really confuse computers.

        1. It is legal to fly from DVL to DVL, or to fly from the origin fix out and back. Flights that do this are typically search and rescue, emergency (police helicopter), sightseeing operations, parachute jumping… If the computer is properly programmed this should not confuse it.

    2. IATA codes are only used for the internal business processes of airlines, and have nothing to do with flight planning. ICAO codes are used in flight plans. In this case, DVL is used for VOR navigation beacons in France, North Dakota and Malawi. Since only 3 letters are used, there are some duplicates, which are usually pretty obvious to sort out. Except in this case, I guess.

    3. One would use an ICAO code on a flight plan. IATA codes are for commercial purposes – to sell tickets. If you tried to enter an IATA code in your flight management system it would spit it out with a disgusted sound effect, or give you improbable results.

      Deauville airport code is LFRG (IATA: DOL) and Devil’s Lake is KDVL (IATA: DVL).

      Digging a little further into James’ write-up, it turns out that the Deauville VOR is DVL, and so is the Devil’s Lake VOR. Now that makes a little more sense. However, this is not a one-in-a-gazillion occurrence: with three letters, it is *known* that there will be duplicates. When you enter a waypoint that has duplicates in your FMS, you will be shown a list of candidates, usually sorted by distance, so that you pick the correct one (the closest to your previous waypoint, usually).

      Surprisingly (or unsurprisingly – this is what you get when you delegate critical infrastructure stuff to the private sector) that very mundane task was badly handled by NATS code.

  2. There are probably other bugs lurking.

    The (American) system, unbeknownst to me, rejected my IFR flight plan that I filed on the ground via my computer. When I activated it shortly after departure from an uncontrolled airfield, ATC had me maintain VFR while I zig-zagged around the countryside to stay clear of clouds and certain types of airspace. It took the controllers about 45 minutes to convince their systems to accept the flight plan.

    As a geezer engineer, my confidence in our infrastructure has linearly decreased for at least 20 years.

    1. I am convinced everything in the world has just enough effort/programming/etc to function, and it doesn’t take much to break things. Well, maybe not space stuff; that seems to be built better than most.

      1. The Fagan Inspection system was designed to inspect code for the Space Shuttle and make sure there would be no on-orbit problems. In the entire Shuttle program there was only one on-orbit glitch.

    2. “…. As a geezer engineer, my confidence in our infrastructure has linearly decreased for at least 20 years….”

      LOL.. that is a Priceless and True Statement.. From one geezer to another.. I feel really bad for my Grand Kids..

      Cap

    1. True, DOL is the code for Deauville airport, DVL is the code for a VOR station (VOR: VHF Omnidirectional Range) located close to the airport. A flightplan would list the station. VOR stations and other navaid stations are identified by a 3-letter code, so there aren’t that many combinations possible, over the whole planet, plus their codes were chosen by each aviation authority long before international flight was as common as it is now, so it wasn’t a problem at the time.

      1. In both cases, DVL is not an airport, but a VOR, so a navigation aid. They just planned their route using these VORs. By the way, these three letters are not an IATA code, but a VOR identifier…

    2. DVL isn’t an airport. Flight plans use ICAO codes, and all airports have four letter ICAO codes. DVL in this case refers to a VOR navigation beacon, and there are actually three VORs using that same code.

  3. Must be nice to work on systems that don’t require many test cases, if any.
    Or, since it is Britain, maybe it is a Pink Floyd Gravy Train thing. Billable hours for this service call must have been epic.

  4. If some flightplan seems weird, OK, don’t process it and raise an alert for manual processing.
    The real bug is why all other, completely independent, flightplans could no longer be processed automatically.

    1. Yep, and this is the bug fix that was applied in this case (according to the report) – fixing the system failure due to a single flight plan issue, rather than the root cause (poor logic when specific input data appears), because a root cause fix would be changing core logic which I would expect to take several months to test and cautiously deploy, whereas a peripheral fix to error handling is less ‘critical’, but avoids the seizure experienced here… regulations, legal stuff, etc. etc….

    2. Read the report though – It’s not a bug, it’s a decision to fail-safe and place the system into maintenance if the import system couldn’t process a flight plan, rather than risk passing bad data onwards.

      With the benefit of hindsight we feel that one plan which is bad in this way shouldn’t cause a major issue.

      But it’s not impossible that one badly formed flight plan would mess up the state for all subsequent ones, so this design decision is not unreasonable.

      We’d be complaining more if it had silently corrupted all future flight plans.

      1. No, an essential system like ATC needs to validate its input before passing it on to where it would corrupt other data. The people who programmed this are used to websites where if they go down, little problem. NOT people who should be allowed to write essential software that needs to be fault tolerant until they have a lot more experience.

          1. There should be processing to make sure the flight plan is OK BEFORE passing it on to where it could corrupt data. I think a lot of the problem is companies look for cheap people, not those with experience that can predict many of the issues and prevent them from occurring. Then, once they get the experience, they become too expensive. Been there, done that, but am with someplace that respects my over 45 years of experience. And I have shown I am perfectly capable of learning anything needed.

      2. What it should have done (and I suspect will now do in future) is to reject the flight plan it doesn’t like (and issue an error) instead of going into fail-safe mode and locking up.

    3. Exactly. If one flight plan cannot be filed, reject it but don’t crash.
      I think they just hired cheap programmers and no QA department – whose job it is to send erroneous input and make sure the system still works!

  5. I was under the impression that ICAO was meant to coordinate navigation systems, including navaid/intersection designations so this kind of thing couldn’t happen… I still put this on the heads of the coders for not putting in exception handling for duplicate waypoint designations. In fact, I am surprised it took this long for the problem to arise.

  6. Correct, ICAO is the international body governing international aviation. One thing it does is publish recommendations for countries to follow and organize their aviation systems in a coordinated way. At the highest level the world is divided into Flight Information Regions (FIR for short). These regions are typically assigned around a country’s political boundaries, extending into the ocean where the country may have a boundary there. Within this FIR a country or region has control of air traffic. Part of this includes the naming of navigation routes and fixes, of which waypoints and VORs are a part. The ICAO guidance includes rules for choosing names for these fixes. When it comes to duplicates, the rule is that they ARE ALLOWED, as long as they are not inside the same FIR ;). This is why a VOR in the USA can have the same identifier as a VOR in Europe and apparently another in Africa. These are all separate flight information regions.

    If you ever get a chance to look at published air navigation data for aviation computer systems you would find that a navigation fix is defined by many properties, one of which is the FIR to which it belongs. From the programming side, there is published official guidance for how this data must be written and read. These published standards are intended to be used globally and have gone through much validation and testing given how critical the infrastructure the data supports is. Of course nothing is perfect and as mentioned below fools are ingenious, but if you program following the guidance I can tell you that one thing that should NOT be a problem is for the system to accept a flight plan with repeated fix identifiers as long as those lie in separate FIRs.

    On long international flights, repeated fix names happen often, sometimes more than once. Sometimes they are hidden behind expandable token names, for example airway names for the pilots out there. And this could mask the potential bug the NATS system suffered from. Someone decided to spell out the route instead of using a contraction :).
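The FIR rule described in the comments above can be sketched in a few lines of Python: an identifier may repeat across regions, but not inside one. This is only an illustration under our own assumptions. The `FixRegistry` class and the `FIR_US`, `FIR_FR` and `FIR_MW` labels are hypothetical placeholders, not real FIR designators or any published data format.

```python
class FixRegistry:
    """Toy store of navigation fixes keyed by (identifier, FIR).

    Hypothetical structure for illustration; real navigation databases
    carry many more properties per fix.
    """

    def __init__(self):
        self._by_key = {}

    def add(self, identifier, fir, name):
        """Register a fix; the same identifier may not repeat within one FIR."""
        key = (identifier, fir)
        if key in self._by_key:
            raise ValueError(f"duplicate identifier {identifier} inside {fir}")
        self._by_key[key] = name

    def lookup(self, identifier, fir):
        """Return the single fix that `identifier` names inside `fir`."""
        return self._by_key[(identifier, fir)]

    def candidates(self, identifier):
        """List every FIR in which `identifier` is defined, sorted by label."""
        return sorted(fir for (ident, fir) in self._by_key if ident == identifier)


registry = FixRegistry()
# Placeholder FIR labels; the three DVL beacons are the ones named in the thread.
registry.add("DVL", "FIR_US", "Devils Lake VOR")
registry.add("DVL", "FIR_FR", "Deauville VOR")
registry.add("DVL", "FIR_MW", "VOR in Malawi")

print(registry.candidates("DVL"))
print(registry.lookup("DVL", "FIR_FR"))
```

Keying on the pair rather than on the three letters alone means a repeated identifier in a long route is ordinary data, and only a repeat inside a single region is treated as an error.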
