Whenever the topic is raised in popular media about porting a codebase written in an ‘antiquated’ programming language like Fortran or COBOL, very few people tend to object to this notion. After all, what could be better than ditching decades of crusty old code in a language that only your grandparents can remember as being relevant? Surely a clean and fresh rewrite in a modern language like Java, Rust, Python, Zig, or NodeJS will fix all ailments and make future maintenance a snap?
For anyone who has ever had to actually port a large codebase or deal with ‘legacy’ systems, the reflexive response to such announcements most likely ranges from a shaking of one’s head to mad cackling as traumatic memories come flooding back. The old idiom of “if it ain’t broke, don’t fix it”, purportedly coined in 1977 by Bert Lance, captures a feeling that has been shared by countless individuals over millennia. Even worse, how can you ‘fix’ something if you do not even fully understand the problem?
In the case of languages like COBOL this is doubly true, as it is a domain specific language (DSL). This is a very different category from general purpose system programming languages like the aforementioned ‘replacements’. The suggestion of porting the DSL codebase is thus to effectively reimplement all of COBOL’s functionality, which should seem like a very poorly thought out idea to any rational mind.
Sticking To A Domain
The term ‘domain specific language’ is pretty much what it says it is, and there are many such DSLs around, ranging from PostScript and SQL to the shader language GLSL. Although it is definitely possible to push DSLs into doing things which they were never designed for, the primary point of a DSL is to explicitly limit its functionality to that one specific domain. GLSL, for example, is based on C and could be considered to be a very restricted version of that language, which raises the question of why one should not just write shaders in C?
Similarly, Fortran (Formula translating system) was designed as a DSL targeting scientific and high-performance computation. First used in 1957, it still ranks in the top 10 of the TIOBE index, and just about any code that has to do with high-performance computation (HPC) in science and engineering will be written in Fortran or strongly relies on libraries written in Fortran. The reason for this is simple: from the beginning Fortran was designed to make such computations as easy as possible, with subsequent updates to the language standard adding features where needed.
Fortran’s latest standard update was published in November 2023, joining the COBOL 2023 standard as two DSLs which are both still very much alive and very current today.
The strength of a DSL is often underestimated, as the whole point of a DSL is that you can teach this simpler, focused language to someone who can then become fluent in it, without requiring them to become fluent in a generic programming language and all the libraries and other luggage that entails. For those of us who already speak C, C++, or Java, it may seem appealing to write everything in that language, but not to those who have no interest in learning a whole generic language.
There are effectively two major reasons why a DSL is the better choice for said domain:
- Easy to learn and teach, because it’s a much smaller language
- Far fewer edge cases and simpler tooling
In the case of COBOL and Fortran this means only a fraction of the keywords (‘verbs’ for COBOL) to learn, and a language that’s streamlined for a specific task, whether it’s to allow a physicist to do some fluid-dynamic modelling, or a staff member at a bank or the social security offices to write a data processing application that churns through database data in order to create a nicely formatted report. Surely one could force both of these people to learn C++, Java, Rust or NodeJS, but this may backfire in many ways, the resulting code quality being one of them.
Tangentially, this is also one of the amazing things in the hardware description language (HDL) domain, where rather than using (System)Verilog or VHDL, there’s an amazing growth of alternative HDLs, many of them implemented in generic scripting and programming languages. That this prevents any kind of skill and code sharing, and repeatedly (and often poorly) reinvents the wheel, seems to be of little concern to many.
Non-Broken Code
A very nice aspect of these existing COBOL codebases is that they generally have been around for decades, during which time they have been carefully pruned, trimmed and debugged, requiring only minimal maintenance and updates while they happily keep purring along on mainframes as they process banking and government data.
One argument that has been made in favor of porting from COBOL to a generic programming language is ‘ease of maintenance’, pointing out that COBOL is supposedly very hard to read and write and thus maintaining it would be far too cumbersome.
Since it’s easy to philosophize about such matters from a position of ignorance and/or conviction, I recently decided to take up some COBOL programming from the position of both a COBOL newbie as well as an experienced C++ (and other language) developer. Cue the ‘Hello Business’ playground project.
For the tooling I used the GnuCOBOL transpiler, which converts the COBOL code to C before compiling it to a binary, but in a few weeks the GCC 15.1 release will bring a brand new COBOL frontend (gcobol) that I’m dying to try out. As language reference I used a combination of the Wikipedia entry for COBOL, the IBM ILE COBOL language reference (PDF) and the IBM COBOL Report Writer Programmer’s Manual (PDF).
My goal for this ‘Hello Business’ project was to create something that did actual practical work. I took the FileHandling.cob example from the COBOL tutorial by Armin Afazeli as a starting point, which I modified and extended to read in records from a file, employees.dat, before using the standard Report Writer feature to create a report file in which the employees with their salaries are listed, with page numbering and the total salary value in a report footing entry.
My impression was that although it takes a moment to learn the various divisions that the variables, files, I/O, and procedures are put into, it’s all extremely orderly and predictable. The compiler also will helpfully tell you if you did anything out of order or forgot something. While data level numbering to indicate data associations is somewhat quaint, after a while I didn’t mind at all, especially since this provides a whole range of meta information that other languages do not have.
The lack of semi-colons everywhere is nice, with only a single period indicating the end of a scope, even if it concerns an entire loop (perform). I used the modern free-form style of COBOL, which removes the need to use specific columns for parts of the code, which no doubt made things a lot easier. In total it only took me a few hours to create a semi-useful COBOL application.
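A stripped-down sketch along these lines, not the actual project code, gives the flavor: Report Writer is left out, and the file layout and field names are illustrative guesses. GnuCOBOL builds it with cobc -x -free.

IDENTIFICATION DIVISION.
PROGRAM-ID. EMPLOYEE-LIST.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT EMPLOYEES ASSIGN TO "employees.dat"
        ORGANIZATION IS LINE SEQUENTIAL.
DATA DIVISION.
FILE SECTION.
FD EMPLOYEES.
01 EMPLOYEE-RECORD.
   05 EMP-NAME PIC X(30).
   05 EMP-SALARY PIC 9(7)V99.
WORKING-STORAGE SECTION.
01 WS-EOF PIC X VALUE "N".
   88 END-OF-FILE VALUE "Y".
01 WS-TOTAL PIC 9(9)V99 VALUE ZERO.
PROCEDURE DIVISION.
    OPEN INPUT EMPLOYEES
    PERFORM UNTIL END-OF-FILE
        READ EMPLOYEES
            AT END SET END-OF-FILE TO TRUE
            NOT AT END
                ADD EMP-SALARY TO WS-TOTAL
                DISPLAY EMP-NAME " " EMP-SALARY
        END-READ
    END-PERFORM
    CLOSE EMPLOYEES
    DISPLAY "TOTAL SALARIES: " WS-TOTAL
    STOP RUN.

Note how the single period after STOP RUN closes the whole procedure; the loop itself is delimited by END-PERFORM rather than punctuation.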
Would I opt to write a more extensive business application in C++ if I got put on a tight deadline? I don’t think so. If I had to do COBOL-like things in C++, I would be hunting for various libraries, get stuck up to my gills in complex configurations and be scrambling to find replacements for things like Report Writer, or be forced to write my own. Meanwhile in COBOL everything is there already, because it’s what that DSL is designed for. Replacing C++ with Java or the like wouldn’t help either, as you end up doing so much boilerplate work and dependency wrangling.
A Modern DSL
Perhaps the funniest thing about COBOL is that since the 2002 standard it has gained a whole range of features that push it closer to generic languages like Java. Features that include object-oriented programming, bit and boolean types, heap-based memory allocation, method overloading and asynchronous messaging. Meanwhile the simple, case-insensitive, English-like syntax – with allowance for various spellings and acronyms – means that you can rapidly type code without adding symbol soup, and reading it is obvious even to a beginner, as the code literally does what it says it does.
True, the syntax and naming feels a bit quaint at first, but that is easily explained by the fact that when COBOL appeared on the scene, ALGOL was still highly relevant and the C programming language wasn’t even a glimmer in Dennis Ritchie’s eyes yet. If anything, COBOL has proven itself – much like Fortran and others – to be a time-tested DSL that is truly a testament to Grace Hopper and everyone else involved in its creation.
i don’t think this description of domain specific languages really gets to the concepts that matter. fortran is a good example. it’s not that the features of the language are good for scientific programming — they very very very much are not — it’s that its pedigree is scientific programming. decades of practice are what make the language appear domain-specific. a lot of people will try to separate ‘technical reasons’ from ‘cultural reasons’, often using the almost-always-completely-wrong word ‘just’. as in, ‘that’s just because of decades of practice’.
and the same is true of cobol. its featureset is a little oriented towards specific tasks but, truly, doing those tasks in another language like C, C++, Java, or Rust is just a question of idioms. the problem is the actual decades of experience. the decades of experience have created individual programmers, programming institutions, codebases, individual non-technical employees, and non-technical institutions. and all of these things have grown up together and any change to any of it will ripple through the whole body. for the last decade or so, a wave of retirements of individual programmers has threatened programming institutions, for example.
the question isn’t what language to use but simply how to manage change. every used-for-decades codebase has become crufty and inflexible, and you would face an enormous task to clean them up or reimplement them even if you decide to do that work in cobol. the old development team has developed not only a knowledge of cobol and of the existing codebase, but also a knowledge of the rest of the body outside of their department, and a philosophy of testing and deployment as well.
and there’s the very closely-related question of whether you want to re-architect away from the mainframe. there are many approaches to scaling in this world.
Agreed. :-)
I don’t consider COBOL or FORTRAN to be DSLs. They were (are?) general-purpose languages and are a reasonable way to solve any computable problem as long as it doesn’t involve writing an interrupt handler or something that manipulates the page table. C can do those two things quite well (viz. Linux) and hence is referred to as “Systems Programming Language” but that’s far beyond what is needed to be a general-purpose language. Lisp, Scheme, Scala, Erlang… none of those are “Systems Programming Languages”, all are general-purpose languages, and Erlang probably spans the gap between domain-specific and general-purpose.
It’s amazingly, eye-wateringly expensive to license a compiler for a mainframe. You’ll see a strong tendency in mainframe shops to do everything in COBOL whether it “makes sense” or not.
Shops with strong change control processes (banks and insurance companies, in particular) will usually require a million signoffs to “write a program” but consider SAS (“Statistical Analysis System”) to just be an application. I worked for a bank at one point that spent more on SAS licensing than they did on mainframe leasing. They had dozens of departments running SAS “scripts” just so they wouldn’t have to go through change controls. Yes, they were basically end-running security, but whatever…
tl;dr: Languages, especially on mainframes, get selected for non-obvious reasons.
Legitimate question: is there a way to develop Rust programs on MVS? Linux for Z series?
When I recently asked the Program Manager for Languages, I think I mentioned Rust but don’t remember a definitive answer. On their Linux “platforms” (LinuxONE® and Linux on the IBM z/Series®) I suspect there are Rust packages.
Thanks! I got curious and poked around. It looks like Rust on z/Linux is “Tier 2” but evidently works fine or well enough. And supposedly (I didn’t know this) you can put the resulting z (390) executable into a Docker container, move it into one of your z/OS (MVS) LPARs, and run it there.
Not bad for a half-century “experienced” OS. Not bad at all. :-)
Does running a Rust-friendly OS+development environment under an emulator hosted on MVS or Linux for Z series count?
I’m not trying to be flippant here, but sometimes if there is a known Rube Goldberg solution that works, using it will be faster or cheaper than doing it “the right way.”
I take that back, I am trying to be flippant. I’m pretty sure the answer to my question “does running it under emulation count” is “no.” I hope the answer to your question is “yes, and it won’t cost you an arm and a leg.
You absolutely can write Rust code on zLinux. I’ve done it on Ubuntu running on the Hercules emulator. Just for the LOLs.
Now I think running it on an emulator is valid, unless you’re talking about emulating an X86 on MVS. But emulating the mainframe is required as I’m all out of S/360s
i don’t think the ‘purity’ of whether it’s emulated or virtualized or whatever matters…the question is just can you integrate it into the existing ‘mainframe way of doing things’. and my intuition is that you can but only if you accept a strict demarcation point, the same way they use java to write web frontends to mainframe databases. it doesn’t really matter where the java is running, so long as the database still lives on the mainframe
“Lisp… none of those are “Systems Programming Languages” ”
I take it you never used a Lisp Machine…
I have used modernized cobol in Lawson financial systems erp solutions which are now owned by infor. They used a set of generated cobol libraries to read and write through huge oracle-defined databases, generating sql against a database in realtime. If the IRS databases and code can be translated into this 4gl language and tested to produce the same results, there would be a clear way to extract the logic of these cobol systems and convert the systems to modern day programming languages, because I have done as much with generated infor/lawson code to scripted SQL to C/C++ unix processed code.
I honestly don’t know if COBOL or Fortran are DSLs or not – but I don’t think they are general purpose languages, either (at least they weren’t intended as such). The clues about this might be in their naming:
COBOL – (CO)mmon (B)usiness (O)riented (L)anguage
Fortran – (For)mula (tran)slator
So…in the case of COBOL – its name says that “this is meant for commercial business purposes” – which in general (back then, and now) means sifting thru a bunch of data, generating reports, etc. Also, much of the language can almost be read like “plain English” – which was the point: There was a need (desire?) to have a programming language that didn’t take a nerd/geek/whatever to read and write; that a relatively non-programmer person (a manager…or maybe even the CEO) could look at a program, and get a feel for the calculations (assuming they knew what they were looking at – say a payroll system or something) – because rather than reading (all the following is pseudo-code) something like:
tx = brate * pct; // tax equals baserate multiplied by percentage
…the manager could read:
SET TAXES EQUAL TO BASE_RATE MULTIPLIED-BY PERCENTAGE
…and they might say to themselves, “hmm, that needs something”, and could annotate:
; Bob, change the following:
SET TAXES EQUAL TO BASE_RATE MULTIPLIED-BY PERCENTAGE
; to this:
SET TAXES EQUAL TO BASE_RATE MULTIPLIED-BY (PERCENTAGE PLUS RATIO)
; Thanks! – Phil
Again – the above is NOT really COBOL – but it has the flavor, from what I barely recall from playing with a proprietary dialect known as DB/C (basically COBOL with some flat DB added on, insofar as I understood it at the time – decades ago).
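For comparison, actual standard COBOL for that calculation would be close to the following (field names hypothetical; MULTIPLY…GIVING and COMPUTE are both real verbs):

MULTIPLY BASE-RATE BY PERCENTAGE GIVING TAXES.
*> or, equivalently:
COMPUTE TAXES = BASE-RATE * PERCENTAGE.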
The entire codebase would be just as verbose – that was the purpose behind it; it was a plus for those who couldn’t read anything else (and trust me, the languages then were closer to assembler – or were assembler – than anything else), but could read plain English. Of course, for the actual programmers…well, it could be terrible, because you had to write out everything…but at least for many cases, the code was “self-documenting” – to an extent.
Now Fortran – that was a different beast – it was used and abused for all kinds of stuff (look into what was done for early 2D and 3D graphics, circa-1968 forward to oh, 1975-ish?) – you’ll find it being used and abused for almost everything; the Cal-Comp graphics plotting standard was pretty popular. There were a couple of others that were developed at universities that also proved long-lived…
Oh…there were also more than a few Gerber libraries for Fortran…because you have to be able to take your Cal-Comp plots, and convert them into something to share schematics (and other things; the Gerber standard was – as far as I can tell – initially meant for interchanging of “vector” graphics – and everything was a vector graphic back then, because memory for a framebuffer was…well…expensive doesn’t even begin to describe it)…
But “as designed” it was a language meant to allow easy conversion of mathematical formula into a format a computer could digest, and do it efficiently (again, era of “everything is assembly”); a language meant for scientists and engineers (and be easily readable by humans, too). But it was “free” enough to do more general purpose things. Based on code I have read in various DTIC research reports and other things from the era, people seemed to love to “hack” with it! I haven’t found a game done in it, yet – but I did find a rudimentary flight simulator (around the same time that Sub-Logic released its code in BASIC and assembly for various platforms; but this code was independent of Sub-Logic’s stuff – it was written for somebody’s thesis or something from the Naval Postgraduate school, or maybe it was something for the Air Force – I forget).
Again – none of this proves COBOL or Fortran are DSLs, or were meant as such…but they were designed for particular “use-cases” and “industries” – COBOL pretty much stuck to its place, while Fortran “wandered” a bit, but still mostly kept to what it was meant for, formulas and calculations (and plotting/graphing of such) too…
One other thought: At the time, it was a “radical thought” to think of using computers for anything that didn’t have something to do with calculations and formulas, or for business purposes (record keeping and reporting), or occasionally for industry (process control, for instance – but a lot of that was also done with analog computing – which was also a big thing; there were hybrids sold then, too)…
Using a computer to control a robot? Artificial intelligence and games? Graphics? Insanity!
Also, batch processing and punched cards didn’t lend themselves well to “general purpose” stuff…and who could ever own a computer…in their home…all to themselves? Heck, just look at the early microcomputer days, pre-1990-ish…
All of it seems kinda insane to me now…thinking about it…thinking of when I was a kid with a home computer in the 1980s – when I had a modem, and most people didn’t even have a computer, let alone knew what “being online” meant – and that was really “late” in the game; now think about the heady days of the Homebrew Computer Club – when a lot of members didn’t have a computer, because a small one was either too expensive – or didn’t exist at all (there were a very, very few number of people in the world that had a “personal computer” – and generally, those machines had very little memory – on the order of a few 10s of bytes at best – and if you were clocking along at 50 Hz, that was a fast machine for something “homebrew” made out of junk parts from the telephone company or such)…but eventually…once the 4004 became a thing (though there were “electronic magazine projects” for “electronic desk calculators” – I found one recently in a British publication that essentially used something like 54xx series ICs, and a bunch of other discrete components, to implement the basic “ALU” needed for the calculator, and it had various “registers” for storage, and did the calculation using a diode array “ROM” that ran a “microcode” and a ring counter thing to “step thru” each “instruction” to “do the needful” – yes, it was a microcoded, ROM-only computer – it was pretty amazing to see – it was published over about a year’s time-frame in the magazine).
/ok, if you made it down here, consider yourself a “Level-10000 Mage of the Order of Teal Deer”…
FORTRAN 1957
COBOL 1959
The game Spacewar! was developed in 1962, in assembly language on a PDP-1 at MIT.
I think the text game Adventure had a fortran version.
I thought it was written in Fortran to start with.
fortran and cobol seem mission-specific because they were trail blazers. every language after fortran translates formulas. every language after cobol has structured data types.
the reason i consider the idea of fortran as a domain-specific-language to be farcical is that i’ve actually met scientific fortran code. they told me to ‘make it work’ after they moved it from VAX/VMS to Linux, but they didn’t tell me what it was. it was huge, tens of thousands of lines of code. and almost all of it was I/O…file, tape, display, and printer. and a horrendous amount of effort was invested into memory-mapped storage, because one of the platforms over the years couldn’t simply malloc(32MB). and once i was done, i finally learned what they were using it for! it converted a 3D dataset into a 2D histogram. the formula was for all x { for all y { for all z { out[x][y] += in[x][y][z]; } } }
they didn’t use fortran to write a formula, they used it to write an interactive commandline that had layered hacks for each I/O subsystem they’d been forced to deal with over the decades. and because it was fortran it was unfactorable and unmaintainable and unfathomable. worst possible language for that use. but “a hack to be able to work with the only gigabyte data storage available in 1982” is exactly what scientific users need the most of. naming files and viewing subsets of them is extremely important to them. truly, they wanted a unix command shell and a few perl scripts to drive gnuplot. the formulas themselves are often astonishingly simple.
I agree about Fortran. And in enough time Python may take the role of “DSL for science” “just because” it’s becoming the defacto standard.
But I don’t think being capable of general purpose computing stops COBOL definitely being domain specific by design. I mean I could write a 6502 emulator in COBOL or Minecraft datapacks but that doesn’t mean it’s the sort of usecase they were intended for.
Cobol is quite simple to learn, but the environment it lives in, generally a complex network of thousands of applications, databases, reporting frameworks etc, makes it hard to manage, even more so as many are run as ‘black boxes’: no one knows what happens in the code, as many devs that have written the code are already pensioners or no longer among the living. What the programs do in the end isn’t really difficult or particularly complex in and of itself. It can probably easily be built in general purpose software. But the usual bugs that crop up in such languages can have far reaching consequences. Stability is key.
It can mean your bank balance is off by a few thousand, the interest rate on your mortgage is calculated wrong, your insurance coverage could have issues with your claims, etc. We expect from banks, insurance companies etc, that this all works perfectly every time.
Yes, there are outages all the time, even with Cobol based applications, but generally the Cobol programs run flawlessly; much more often the cause of an outage is that other systems, like the mainframe emulation layer, or Windows, or any of the more modern connected applications, have failed, or that something went wrong in the many migrations that take place behind closed doors.
Aside from that, Cobol programs are often run in complex structures in batches, the order and dependencies that these need to run in is critical. And every day and every week or month different batches needs to run, or run with different dependencies. Much of this can be done in modern web based batch scheduler applications and Cobol can run on windows servers with an emulation application. I’ve even seen frameworks that offer unit testing and XML/JSON connectors in combination with Cobol.
If you take all that into account, it becomes much more cumbersome to contemplate a full rewrite in more modern languages. And in the end, are there managers that dare to take responsibility to make such a move? Often they get like 18 months to complete a project, if they don’t complete it, they get fired. So no manager is motivated to take on a project with a small likelihood of success.
Definitely agree with what you are saying here. But I would argue that modern apps also live in a “complex network of thousands of applications, databases, reporting frameworks etc” that is hard to manage. Perhaps the difference is that the people who created them are still around to help decipher it.
Personally, I’ve had difficulty with systems that are only 10-15 years old. The code was manually deployed to bare metal hardware running on an OS that is way past EOL and including components from vendors who are out of business.
But I see the cycle continuing with modern developers building “microservices” glued together with incredibly complex deployment technologies like Kubernetes. 10-15 years from now our children are going to have a lot of fun trying to decipher all of this mess. Will it be any easier to understand than COBOL is now?
I’m sure in the future documentation will be treated the same way as it is now.
i think one blessing we have…something i hate most of the time, of course…is that there’s an awful lot of pointless churn these days. so no one will still be using an unchanged kubernetes cloud deployment configuration in 20 years based on inertia. where ibm goes to pains to avoid breaking the past, most modern things put just as much effort into constantly breaking everything. and worse, constantly linking everything together in fine ways so that you can’t just use back-versions of one component forever to maintain compatibility.
“Will it be any easier to understand than COBOL is now?”
Likewise, will COBOL be any more difficult to understand than it is now?
Very often the “intellectual debt” isn’t even on the systems side of the process, though I’ll be the first to admit that JCL Hacking is almost a job title if not a discipline.
The real gap in knowledge is independent of the language, the environment, or the methodology… the problem is the domain knowledge of all the special cases that despite careful practices (or not) still gets coded into the programs and never written down. Some human artifact with forty-odd years under their belt knows that men’s jeans with a 28″ or smaller waistband in Puerto Rico gets taxed as a boy’s size, not men’s, and once that person is gone no one else knows why that “if” statement is there and if it’s a leftover bug or deeply hidden fraud. Yes, been there, got the gray hair and the T-Shirt. I mean, sometimes the only way to tell in that part of the code if it was heading via Puerto Rico was to see if it had “Blue Container” in the shipping notes. I wish I was kidding.
Go ahead and write that thing in Rust or Scala and it will be just as crazy. And it won’t (?) run on a mainframe anymore. Which is great, except that now you have to interface with a running mainframe system, and that’s usually even worse. LU6 makes the wildest “open systems” interface code you’ve ever seen look like second-semester stuff.
A significant chunk of American Express cards are to this day handled on a system rolled out in 1984 that no one really has the guts to mess with. Careers were made and broken with that system, and it is to be approached with the utmost caution. It runs on the “IMS” database system from IBM, which even Big Blue says is unsuitable for new work. IMS dates from the late 1960s. As recently as a month ago AMEX was looking for programmers with IMS experience. Perhaps looking in nursing homes. The problem with it, again, isn’t the technology. The problem is all the business rules coded into for doing business in dozens of countries.
That jeans size is a nice example of how COBOL can make a program understandable. In COBOL you could make the following definition:
10 waistband pic 99.
88 boy-size value 1 thru 28.
88 adult-size value 29 thru 99.
In the program you can write the following:
If boy-size then … etc.
That is the way you can make COBOL programs very readable and understandable, and that is why it is so very suitable for business programs.
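Wrapped into a complete (if contrived) program, that idea runs as-is under something like GnuCOBOL; the names and values are just the ones from the sketch above, with the level number changed to 01 since the field stands alone here:

IDENTIFICATION DIVISION.
PROGRAM-ID. WAIST-DEMO.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 waistband PIC 99 VALUE 27.
   88 boy-size VALUE 1 THRU 28.
   88 adult-size VALUE 29 THRU 99.
PROCEDURE DIVISION.
    IF boy-size
        DISPLAY "taxed as a boy's size"
    ELSE
        DISPLAY "taxed as an adult size"
    END-IF
    STOP RUN.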
YEAH, those awful, complex Batch Jobs, like what gcc or g++ does under the covers of those wrapper scripts. Or a complex Makefile. Batch is stuff run from a command line or the equivalent of cron, and yes, many COBOL jobs use huge programs and databases, as you would expect in a large Bank or Insurance company. But a significant amount of the COBOL inventory talks to transaction processing software: CICS or Web CICS Interfaces or, increasingly, REST APIs. XML and JSON support is in the standard. GnuCOBOL (free open source software — FOSS) has it and GCC COBOL 15.1 doesn’t yet. It is on the roadmap for 16.1 but will likely be in our packages before then for early adopters.
Source code control was sloppy and software to help was expensive and primitive. That’s a pile of clerical work and git (or hg or …).
So what you’re really saying is that there’s a lot of real work that could be done apart from actually touching the program — checking documentation, identifying environments, organizing source (and documentation) control environments and so on. That, ultimately, is the reason that a shift from COBOL is not a good idea — ultimately the job’s not about what language you code in but all the boring peripheral work that’s needed to make high grade software, which is conspicuously absent in much (most, I’d bet) modern software.
The last thing a bank wants is to have to endure the kind of “will it /won’t it” that characterizes modern Windows releases. Testing is an active job and can actually take more effort than writing the code.
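As a rough sketch of the JSON support mentioned above (available in recent standards and in GnuCOBOL 3.x; the record layout is invented and the exact generated output varies between implementations):

IDENTIFICATION DIVISION.
PROGRAM-ID. JSON-DEMO.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 employee.
   05 emp-name PIC X(8) VALUE "HOPPER".
   05 emp-id PIC 9(4) VALUE 1906.
01 json-buffer PIC X(200).
PROCEDURE DIVISION.
    JSON GENERATE json-buffer FROM employee
        ON EXCEPTION DISPLAY "JSON generation failed"
    END-JSON
    DISPLAY FUNCTION TRIM(json-buffer)
    STOP RUN.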
I read this article twice and I still don’t get what it wants to communicate. It all sounds like an incoherent rant straight from reddit. In other words: Ma’am, This Is A Wendy’s.
What I took away is “COBOL itself isn’t the problem and is probably a better tool for building financial apps than Rust”.
My take on what it’s saying is that domain-specific languages exist for reasons, they’re not that hard to learn, and there’s typically no real need to rewrite a bunch of code written in a DSL into some more “modern” language.
It is certainly cursed, but I turned my job into a full time programming position with VB and VBA. My work has specific requirements and restrictions with code, but those are readily available and I abuse the hell out of them including file system operations, complex reconciliations, database frontends for multiperson simultaneous read/write operations and more.
There are certainly better suited languages out there, but like COBOL; if it works, use it.
I took a LinkedIn COBOL course sometime ago just out of curiosity to see what it was about. COBOL seems very specific in how it is structured which is a must for business applications. I couldn’t imagine trying to convert all that existing code to Python or C or whatever if it simply works and the original source is still available. It also allowed someone with a good technical understanding of the language to write decent code for business purposes.
I was formally trained in programming in university, mostly Java, but also things like MATLAB, Lisp and Scheme. I haven’t programmed much afterwards, and I seem to happen to find a new programming language for each new project I embark on.
My latest creation was 600 lines of Awk which seems the best language to /develop/ in, for the one-off data manipulation task I needed it for.
The company I work for (not a software house) might want to take my program over and exploit it in more divisions, but for business reasons, AWK (or any scripting language) is not the right language – for usability it will probably be ported to a web frontend and hell do I know what they do with it in the backend. I’m just happy that I made someone (who I considered a real programmer, but couldn’t do what took me 3 long evenings) happy, and that I will be in no way responsible for rewriting or maintaining said tool.
Moral of the story: Sometimes it’s not the actual task that decides what is the best language, but the development phase, environment where it will run or even the knowledge set of the programmers decides what is the best language.
The big thing with COBOL is the MOVE verb, it does so much with a well designed data division and is quite hard to replicate with modern languages. The ALTER verb is a different story, that was an abomination that I never used.
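A small invented example of what a single MOVE buys you; the field names and picture clauses are made up, but the editing behaviour is standard:

IDENTIFICATION DIVISION.
PROGRAM-ID. MOVE-DEMO.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 raw-amount PIC 9(7)V99 VALUE 1234567.89.
01 edited-amount PIC $$,$$$,$$9.99.
PROCEDURE DIVISION.
    *> one verb does decimal alignment, comma insertion
    *> and the floating currency sign
    MOVE raw-amount TO edited-amount
    DISPLAY edited-amount
    STOP RUN.

The DISPLAY shows $1,234,567.89; doing the same conversion in most general purpose languages means reaching for a formatting library.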
I came to COBOL first from having been an 8080 programmer. At first, I didn’t get the point of it (I was using a microcomputer version to begin with) and it just seemed a cumbersome way of doing things. Someone then explained it to me as “It just inputs files, twizzles them round and then outputs them again in a different format”. In essence that is what COBOL does best and with that single comment, the penny dropped! I later moved to mainframe COBOL then to maintaining the COBOL compiler for ICL along with the compilers for several other high level languages of theirs (ALGOL, FORTRAN, RPG2 to mention a few) and also their COBOL for their microcomputer. I was the 4th line support for the company and if I couldn’t fix the bugs – they didn’t get fixed! All from that one comment explaining it to me! Oh and I later married the guy that explained it to me.
Just one note to add; I discovered a bug in the “Naval Tests” which were part of the alpha-testing for each new version of any COBOL compiler. It was decided to leave it there as correcting it meant that every single version of a COBOL compiler worldwide and still in use, would have to be retested. The bug is still there and every compiler has to add its own workaround. 🤣
As a company ICL were never happy to have bugs in their compilers pointed out and they really didn’t like it if you fixed one by patching their code! Up until my house was flooded I had a treasured copy of a letter to my head of department that tried to get me sacked for such a heinous act (education was a gossip rich environment and their engineers were such tell tales). I wish now that I had framed it and hung it on the wall :-)
“If it isn’t broke, don’t fix it”. Agree with that. Now we see ‘re-writes’ from perfectly good ol’ C code to Rust. So it goes I guess. And with Cobol now included with the gnu suite (joining gcc, gfortran, etc.), Cobol will be around for a long time. Personally never had to ‘work’ with Cobol. Just the standard intro in college back when. Cobol didn’t see any traction in my Real-Time world nor thankfully VB ;) …
I do wonder if C -> Rust might be a very different thing than Cobol -> basically anything. Because Rust and C fill the same niche and are similar languages, the inherent quirks of Rust notwithstanding. (Also the fact that using Rust could theoretically have legitimate benefit, even if it is only because of a lesser amount of undefined behavior? I don’t have the experience to actually comment.)
When porting a DSL:
First understand it – pray you have an experienced developer for the ‘old’ DSL on the team. If you don’t then pay for a couple seats for an LLM to take that role but check the assumptions you feed it.
Second: Design a replacement framework/architecture for the new target that is appropriate for the selected target language/environment, in support of:
Third: Don’t transliterate the code, statement for statement, attempting to keep the structure. Translate the code to an implementation that’s appropriate for the target execution language and environment.
/me seeing a codebase that
is 40 years++ old
has to handle legislation from several states, countries, whatever legal context
has to handle several edge cases inside these legislations
has to handle all the different input forms (validation etc.)
has to print out that stuff in a readable form
Pffff. I don’t think the problem is to port some language to another language. The challenge is to understand all the grown clutter and port it.
(I would bet a little amount of beer that they do not have a single test case for any validation.)
So maybe the problem with old Cobol code is that it was written in a different era with different documentation and maintainability standards.
Agreed. Does Cobol support unit testing? It would probably present the same problem if it had been written in C – even more because C is lower level so harder to understand.
And zeroth: before touching it, know exactly what this program is for and what it’s supposed to do and how it interfaces with other systems. Reading the code alone (no matter how well formatted and commented) will tell you basically nothing about this
The process requirements are far more important than any actual code, and are the hardest things to acquire from legacy systems regardless of what language they were written in.
Translation:
If you don’t have someone who understands the code, feed it to a random text generator whose output you can’t check because you don’t understand the original system enough to tell how poorly the random garbage generator did.
We had a project recently that involved a peculiar protocol for transmitting small amounts of data over a regular telephone connection using DTMF. Documentation for that protocol was hard to come by, so someone asked a chatbot, and dumped the results in the project plans in place of the expected protocol definition.
The project was handed to me to work on because all the planning was done. I sit down to implement the protocol, and discover that it is a bunch of crap. The chatbot made up a bunch of crap that used terms related to DTMF and to the protocol (and the actual use case), but it was total junk.
We had to put the project on hold while we tracked down a company involved with using the protocol who could give us some real details.
During planning, everyone had looked and saw what looked like a detailed description of the involved protocol. It only fell apart when you actually tried to apply the details – then you found that things didn’t match.
Keep chatbots way far away from your programmers. It’ll only cause you grief.
COBOL is pretty easy, JCL (Job Control Language) and mainframe utilities are not easy. It’s one of reasons UNIX had the one utility, one purpose philosophy.
JCL cards are only used for build or execute tasks. There is not too much JCL compared to cards with COBOL.
Seems cheaper and more efficient and lower risk to de-DOGE than to de-COBOL and de-mainframe.
It would be a better option to read the COBOL code and interpret its purpose, then generate Java or Kotlin, which implements the corresponding actions.
Quote TFA:
A very nice aspect of these existing COBOL codebases is that they generally have been around for decades, during which time they have been carefully pruned, trimmed and debugged, requiring only minimal maintenance and updates while they happily keep purring along on mainframes as they process banking and government data.
End Quote:
The fine author has never had to wallow or root around in a ‘highly evolved’ codebase.
Old code is never touched, not because it’s great, but because it sucks BWDB (big wet donkey balls).
The reason for the minimal maintenance is all the things broken by every fix.
When your coders add 3 new bugs for every fix, there is only one smart move:
Document existing bugs and workarounds.
Don’t touch anything.
Get a guard dog to bite anybody attempting to check in code.
There should be a way to identify this in the process immaturity model, but it operates on another axis of dysfunction.
‘Information hoarding’ and ‘obstruction process’ are characteristic of maintaining old code bases though.
They are insane responses to the management style invariably associated.
The only real sane response is ‘flight’.
Alternative is ‘retire in place’, but madness lies down that road.
Just to add:
Bad software Stockholm syndrome analog.
Been observed many times (JS, Blender, Oracle apps, SAP, emacs etc etc).
Those that work with really awful systems, invariably start to ‘like’ them.
In an awful dysfunctional way, like horribly abused children.
The crew keeping any old system running are always at least half the problem.
Unless I missed it, no one mentioned what may be the most important feature for business applications. COBOL supports math as decimal, which is better suited to computations involving money. Ada has had similar capabilities for decades also. Both languages also have the picture format capability.
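A minimal invented sketch of what that decimal support means in practice; the fields are decimal-scaled, so 0.1 plus 0.2 is exactly 0.3 rather than the 0.30000000000000004 that binary doubles give:

IDENTIFICATION DIVISION.
PROGRAM-ID. DECIMAL-DEMO.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 rate-a PIC 9V9 VALUE 0.1.
01 rate-b PIC 9V9 VALUE 0.2.
01 rate-total PIC 9V9.
01 rate-shown PIC 9.9. *> edited picture, for display
PROCEDURE DIVISION.
    ADD rate-a TO rate-b GIVING rate-total
    MOVE rate-total TO rate-shown
    DISPLAY rate-shown *> prints 0.3
    STOP RUN.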
Good get. Thanks.
If your doing math usinng float then your and idiot, plane and simple. We had to write a simple banking app during our first semester at university and the rule was that any calculation involving money had to be done using uint64_t and only cast to float when printing the result. If you’re doing otherwise you risk losing precision. The trick is to represent the penny as the smallest unit (bit) and don’t work on the after-comma parts of money.
Any modern language has a fixed point/money type/library.
IIRC
Cobol made you define every variable like a number field on a report.
BCD internally.
Pic statements. (Shudder. Also reminded of DataFlex…the most awful language ever! COBOL programmer got drunk AF with GWBasic programmer. Woke up with sore, crabs. DataFlex shit out 9 months later. Arrays missing.)
I did lots of math using floats.
Your first statement is missing ‘financial’.
Even there floats can be the right type.
Do option math in fixed point.
Doubles can be used for financial calculations if the numbers are smallish and rounding is done properly and often.
There are 5 errors in your first sentence. Please tell us the name of the university, so we can avoid it.
The latest version of COBOL was released in 2023. It has a very simple syntax, and the code is easy to read.
I’m wondering if the spark for this article is the 3/29/25 Ars Technica report that Musk employee Steve Davis is leading a small team to try to move the SSA’s financial mainframe code from COBOL to Java in a matter of months? Given that this team would likely have to feed the code base into a machine translator to move 60 million lines of code that quickly, I suspect such a project would go to hell in a hand basket in short order.
I’m generally in favor of the whole DOGE thing, as disruptive as it is, but that sounds like punishing the wrench for the mechanic’s mistakes. In other words, and ironically, a waste of money. Akin to saying “All US government computers must be Macs”.
When I wrote Fortran, I had more comment lines than actual code. Maybe I am one of the few programmers who looked to the future when I would not be available to decipher my coding. As an engineer, I was known for Rube Goldberg solutions that worked and saved time. Sometimes it took a day to explain to my co-workers how I arrived at the solution. I have one Patent that blew my company’s competition away because it was a so ridiculously simple solution to a complex problem. Today, I am retired and still have a MSFortran compiler running in Windows XP.
“When I wrote Fortran, I had more comment lines than actual code.”
Good to know, and you go right to my Do Not Hire pile.
And why they would advance to the top of my Probably Should Hire pile.
I can believe that. Fortran with one character variables (i,j,k, etc) can get obtuse quickly.
That said, when our ‘C’ compiler went from max of 8 chars to 256, one programmer drove us nuts with his ‘long’ variable names. Ie. ‘counter_for_accumulating_the_mwatts_from_the_plant_generator_number_3’ …. Well you get the idea. Variables should state what they are used for, but ‘be reasonable’. A simple ‘gen3_mwatt_accum’ would do. Saves on excessive comments too :) . We implemented a simple rule that variables should be at least 3 chars and say what they do. So instead of ‘i’ use ‘idx’ for example. Only exceptions were variables like x,y,z for coordinates. Also searching for ‘idx’ is a bit easier than ‘i’.
Single character variables are for loops and such ONLY. Bob sold it, I bought it, that settles it.
The real issue isn’t just that legacy systems are old—it’s that they’re incredibly hard to maintain. And by “maintain,” I mean updating them to meet new requirements.
I worked on a modernization project once that took over 18 months and more than 20 developers. Meanwhile, we still have other applications running on COBOL.
A recent legislative change required updates to two systems. I had short notice, but I was able to implement the change on time in a modern system. The same change in the COBOL system? It rolled out late—so late it missed not just one, but two additional legislative changes.
Fast forward several months, and we find out the COBOL update had been silently corrupting data the entire time. Now, multiple developers are working to clean up the mess.
The real kicker? None of the original developers are around anymore. There’s no documentation on where the bodies are buried, downstream impacts are hard to trace, and every developer who ever touched the system used their own naming conventions. It’s a nightmare.
While cost is always a big factor in deciding whether or not to modernize, and while these older systems can technically still do much of what modern platforms can, they were originally designed as a replacement for paper.
In my experience, the data itself is often the bigger problem. It’s dirty. Over the years, users created ad hoc workarounds for edge cases the system couldn’t handle—like adding special characters or using specific dates as flags. But there was no standard. One person might use 12/31/9999, another might use 01/01/1900, and someone else might just throw in a “#” and hope it worked. It was all about getting through the screen and finishing the form.
Java is as old now as COBOL was when I was taking CS in college.
Thank you. I have never seen cobol explained that way before and the code in the sample project really hits the point home.
Nice Job Maya thanks
Many custom-built business applications have some kind of their own DSL built in, be it an XML file that specifies various aspects of business logic or a real language – and some use fluent APIs as something in between. Most programming languages today are too low level to support business apps efficiently and need more layers of abstraction to get things manageable.
There was an interesting project called M at Microsoft that was basically a language for building DSL’s. It had a way to generate code but also provided a specialised (Visual Studio based) editor with syntax checking that would, at some point, even have autocomplete/intellisense functionality. Unfortunately, it was given over to SQL Server folks who never understood what it was nor knew what to do with it… There are still calls to open source the thing.
“Direct” translations to another programming language without huge refactoring steps will result in very ugly code. Well written programs can have a beauty embedded in the code itself, logical flow, and it makes programs look deceptively simple. But to do this, you have to make intelligent use of the unique set of features that any progamming language has. If you attempt a direct translation, then the things that made use of unique features of the old language require weird workarounds, and you can’t make use of the nice but unique features of the new language.
And the effort for rewriting is very much dependent in how far those “special features” of the two languages are apart.
Speaking of that … I just read a post in another forum asking why we just don’t let A.I. do the translation for us. So there you go. So let the A.I. do it. After all if it is intelligent, why not? :rolleyes: This is where we are at with this ‘A. Not I.’ foolishness that is going around.
Translation from COBOL to a ‘modern’ language was the topic…
Just consider that the IBM z architecture is backward-compatible all the way down to the System/360. That should give you a clue.
Couldn’t the system/360 run 701 code? So does 701 code run on z/series?
One thing I didn’t see anyone hit on in this thread is that COBOL is fast compared to a lot of other languages, especially interpreted languages like python. When I worked at a bank, we rewrote part of our statement generation logic from COBOL into a ‘modern’ language: execution time more than tripled, resulting in a day’s worth of statements taking more than a day to generate and print.
All the other observations about sunk costs are certainly valid; but at the end of the day, the code also has to be performant for the use.
I learned Cobol in the 1970’s. I had a background maintaining Burroughs accounting machines and computers. I understood Cobol because it seemed to be designed to support accounting systems.
The problem in computing languages today is that every academic thinks they know better than their predecessors and invent new, supposedly superior programming languages.
I have seen the insides of a Cobol to Java conversion and it is unmaintainable rubbish. But the Java enthusiasts convinced the business people that it was ‘modernisation’ – that Cobol was a ‘dead’ language.
Building business accounting systems in Cobol is a natural fit. Building them in any other academically designed language is bollocks.
I’m not entirely sure the author understands the concept of Domain Specific Languages. And, I know few, if any, programming languages that are domain specific. Even Assembly, which is tied to a specific CPU family’s capabilities is general purpose in use. Just on a specific platform. DSL’s to me are things like Regular Expressions. You can do many things with them, but they represent parsing and tokenization. One cannot really use something in BNF to calculate the distance to the moon. Or bowling scores (there is an entirely different symbol based DSL for that).
COBOL and FORTRAN are evolving just like any other language. They are just as general purpose as Java, C#, Python or Rust. And, in many cases, way faster. Stating COBOL is adopting Java like concepts is narrow minded. Maybe they are just good programming practices? Perhaps as development techniques have matured, the languages evolve with them?
They are tools. They have traditional applications, but they are not locked into them. I have used COBOL to handle scientific calculations and FORTRAN for reporting. RPG is probably the closest to a domain specific language and I have seen UX’s developed with it.
People have the impression that older technology is no longer valid. The problem with that mindset is older technology is what most folks base their arguments for “modern techniques” on. Virtual machines? VM/370 in the early 1970’s. Hypervisors? See VM/370? Multitasking? OS-390, late 1960’s. Open source? SHARE and CBTTAPE mid 1970’s. Polyglot languages? MVS Common Execution Environment, mid 1970’s. Spooling and queuing? HASP late 1960’s.
The #1 largest problem with COBOL and FORTRAN is they are no longer “modern” due to language “snobbery”. Generally discussed by folks that don’t regularly use either language or the environment which spawned them.
Keep in mind, there are mainframe based systems that have existed and been operational for your entire lifetime. Take SABRE for example. It was started in the mid 1950’s. Using Assembly on IBM 360 precursor hardware. Try booking a flight without it. And, so you know, virtually all Customer Reservation Systems have a lineage traceable right back to SABRE.
Just because it’s old, doesn’t mean it’s not useful. You just have to put in the time to learn how to wield it.
All the problems discussed here are related to PEOPLE, not the COBOL programming language.
If companies let programmers (and designers!) do whatever they want, it doesn’t matter what programming language they use. The results will be bad.
COBOL was the result of a D.O.D. request to manage 400 systems programmed in 400 different ASSEMBLY languages: you can imagine the nightmare.
COBOL was developed from a combination of different programming languages, including FACT, COMTRAN, and FLOW-MATIC.
COBOL was primarily developed by the Army, hence its hierarchy into DIVISIONS, SECTIONS, PARAGRAPHS, and sentences, all understandable by a HUMAN, the exact opposite of ASSEMBLY languages and FORTRAN.
JCL has many options, but these are hardware-dependent: you have to specify the length of a record, whether it’s a fixed or variable format, etc. Under UNIX/LINUX, you can do anything (and often do).
COBOL was designed for business applications, not scientific applications. FORTRAN is for scientific applications, and again, it’s about freeing itself from ASSEMBLY.
IBM attempted a synthesis of FORTRAN and COBOL under the name PL/1, but it didn’t work well. Now, for their internal developments, IBM uses a variant called PL/390, where PL/1 and ASSEMBLY language can be mixed. It’s a programming language reserved for IBM only.
COBOL has been very well adapted to tools such as specific versions of ECLIPSE or VS CODE. You can process XML files (and store them in XML columns of a DB2 table! AND query them with a specific dialect of DB2 commands).
I’ve been using this language for almost 40 years. A bit of ASSEMBLER in the 1980s.
Nice summary of the historical context, thank you.
Computers are not the only tech that has this “problem”. I spent my career in the nuclear industry, and they have very similar issues with replacing or repairing old equipment, some of which has not been built in decades. It is not a matter of changing it out to use something that is “better”. It is a problem because there are mountains of documents and analyses that are based on the performance of the original equipment. In many cases, there is no one performance criteria that rules the performance. For one scenario it might be conservative to assume that a pump produces only a minimum flow, but in others, the conservative assumption might be that it produces much more.
When the owner of the facility changes something, they have to update all of the analyses that touch that component, and demonstrate, in a formally reviewed and approved document, the basis for plant operation with the new equipment. This is extremely non-trivial. If they have to get approval from the regulator, it can be even worse.
I have a nephew who works for a company that makes a lot of nuclear-related equipment, and he says it is the main profit center for the company, because they have to document everything they do to produce the stuff, which often has not been built for a very long time. All the dimensions, the material properties, the tested performance of each individual part has to be verified and documented. And even though he has a lot of equipment available that is less expensive, easier to maintain, and works “better” than the old stuff, no nuclear plant owner wants to buy it because changing the plant documentation to deal with the differences would be a nightmare.
The I&C in these plants is also mainly analogue, because that would be a real nightmare to change. Some parts that are not “safety-related” might have been updated, but often this is just an augmentation of the original analogue equipment, which is left in place to control the plant.
Banks are not the only organizations that use old tech. And they all have very good reasons to stick with old stuff, even if they could “save money” by changing the way they work.
I have difficulty agreeing that COBOL or FORTRAN (77 and earlier) are DSLs unless you’re claiming, IMHO, the overly broad domains of general business programming and general scientific programming.
That said, I do agree that a contemporary language, RPG (Report Program Generator), was a DSL –after all, it was intended to aid the transition from punched-card tabulating machines to digital computers. In all honesty, it did a reasonable job if you had a pile of data and needed to produce a tabular report composed of page headings with various detail and summary lines. Of course, that 1960’s implementation had very specific and rigid formatting for the code (don’t forget the H card!)
For the record, I learned FORTRAN, COBOL, RPG, and APL on an IBM 1130 with 8k of core memory.