Programming In Plain English

Star Trek had really smart computers: you could simply tell them what you wanted done and they did it. The [Rzeppa] family has started a plain English compiler. It runs under Windows and appears to be fairly capable.

Plain language programming isn’t exactly a new idea. COBOL was supposed to mimic natural language with statements like:

MULTIPLY HOURS BY RATE GIVING PAYAMOUNT

You could argue this didn’t go over very well, but there is still a whole lot of COBOL doing a whole lot of things in the business world. Today computers have more memory and speed, so programmers have been getting more and more verbose for decades. No more variable names such as X1 and fprdx. Maybe this will catch on.

A function to clear the screen starts out with a list of phrases you might say to call the routine. This is similar to the type of personal assistant logic in which you can speak natural language, but in doing so you had better say something that matches its known template. Here’s the function:

To erase the screen;
To blank out the screen;
To wipe off the screen;
To clear the screen:
Unmask everything.
Draw the screen’s box with the black color and the black color.
Refresh the screen.
Put the screen’s box into the context’s box.
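The four header lines act as aliases for one routine. A minimal Python sketch of that idea follows, assuming the match is an exact phrase lookup. The real compiler's matching is not described in the post and is presumably more involved, and every name here (`ROUTINES`, `register`, `call`, `clear_screen`) is hypothetical.

```python
# Hypothetical sketch of phrase-template dispatch; not the Plain English compiler's code.
ROUTINES = {}


def normalize(phrase: str) -> str:
    """Lower-case, drop a trailing period, and collapse runs of whitespace."""
    return " ".join(phrase.lower().rstrip(".").split())


def register(phrases, routine):
    """Map every alias phrase to the same routine."""
    for phrase in phrases:
        ROUTINES[normalize(phrase)] = routine


def call(sentence: str):
    """Run the routine whose alias matches the sentence exactly, or fail."""
    routine = ROUTINES.get(normalize(sentence))
    if routine is None:
        raise LookupError(f"no routine matches {sentence!r}")
    return routine()


def clear_screen() -> str:
    # Stand-in body; the real routine draws a black box and refreshes.
    return "screen cleared"


register(
    ["erase the screen", "blank out the screen",
     "wipe off the screen", "clear the screen"],
    clear_screen,
)
```

With this table, `call("Erase the screen.")` finds the routine, while `call("blank the screen")` raises `LookupError` because no alias has exactly that wording.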

This will work if you say “erase the screen” or “blank out the screen” but it won’t work if you say “blank the screen.” The hello world program shown in the accompanying graphic looks like this:

To run:
Start up.
Clear the screen.
Use medium letters. Use the fat pen.
Pick a really dark color.
Loop.
Start in the center of the screen.
Turn left 1/32 of the way.
Turn right. Move 2 inches. Turn left.
Write “HELLO WORLD”.
Refresh the screen.
Lighten the current color about 20 percent.
Add 1 to a count. If the count is 32, break.
Repeat.
Wait for the escape key.
Shut down.
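The arithmetic of that loop can be sketched in Python. This assumes that "1/32 of the way" means 1/32 of a full turn, that the heading starts at 0 degrees, and that left turns count as positive angles; the post confirms none of these, and the function name `headings` is made up for illustration.

```python
from fractions import Fraction

FULL_TURN_DEGREES = 360
STEPS = 32  # the count at which the Plain English loop breaks


def headings(steps: int = STEPS) -> list:
    """Return the heading, in degrees, used for each HELLO WORLD."""
    step = Fraction(FULL_TURN_DEGREES, steps)  # assumed meaning of "1/32 of the way"
    heading = Fraction(0)                      # assumed starting direction
    result = []
    count = 0
    while True:                                # Loop.
        heading = (heading + step) % FULL_TURN_DEGREES  # Turn left 1/32 of the way.
        result.append(heading)
        count += 1                             # Add 1 to a count.
        if count == steps:                     # If the count is 32, break.
            break
    return result
```

Under those assumptions each step is 360/32 = 11.25 degrees, so the 32 passes sweep exactly one full circle and every greeting lands on a distinct heading.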

We were interested that some of the primitives let you insert machine code. For example:

To add a number to another number:
Intel $8B85080000008B008B9D0C0000000103.

That means you could do some interesting extensions if you were to take an interest. A cursory attempt shows it does work — at least somewhat — under Wine, if you want to try it out.
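For the curious, the hex string in the add routine can be split into instructions. The Python sketch below records a hand decoding of the bytes as 32-bit x86 and checks that the pieces reassemble into the original string. The decoding is our own reading, not something the post states, and the names `DECODED`, `check_decoding`, and `listing` are illustrative.

```python
MACHINE_CODE = "8B85080000008B008B9D0C0000000103"  # copied from the routine above

# Hand decoding as 32-bit x86 (an assumption; the post does not give a listing).
DECODED = [
    ("8B8508000000", "mov eax, [ebp+0x8]"),  # load pointer to the first number
    ("8B00",         "mov eax, [eax]"),      # fetch the first number's value
    ("8B9D0C000000", "mov ebx, [ebp+0xC]"),  # load pointer to the second number
    ("0103",         "add [ebx], eax"),      # add the value into the second number
]


def check_decoding(hex_string: str, decoded) -> bool:
    """True if the decoded chunks concatenate back to the original hex string."""
    return "".join(chunk for chunk, _ in decoded) == hex_string


def listing(decoded) -> list:
    """Format each chunk as spaced hex bytes next to its mnemonic."""
    return [
        f"{bytes.fromhex(chunk).hex(' ').upper():<18} {asm}"
        for chunk, asm in decoded
    ]
```

If this reading is right, both numbers are passed by reference on the stack and the sum is written back into the second one, which matches the wording "add a number to another number".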

The post focuses on using the language with students, but we aren’t sure these are good habits for future programmers to develop unless it is the leading edge of a trend. We could make the same argument about Scratch and other visual development tools, too, though.

79 thoughts on “Programming In Plain English”

    1. I’m surprised no one has mentioned the xTalk family of languages, which were clearly heavily influenced by COBOL and the like. The xTalk languages’ main progenitor is HyperCard’s HyperTalk, which was mirrored by many others (SuperTalk, Director Lingo, Toolbook, Eggplant, Oracle MediaTalk, etc.) and still has at least two viable descendants: the commercial software LiveCode and new GPL3 offshoots based on its (now former) open-source community edition.
      xTalk languages can be far more English-like than COBOL. The only other thing that might come as close is AppleScript (gasp), but that is, according to Kevin Calhoun when he was at Apple, LISP in disguise.

      Example code:
      on mouseDown
      Answer “Hello World, you English-like?” with “Yes” or “No”
      end mouseDown

  1. Changing the language doesn’t solve the fundamental problem that you have to be really precise in what you say to get a computer to do exactly as you want.

    In my years of experience with customers trying to describe to me what they want, I can tell you that very few people have that skill.

    Even the example above has several problems. The first “turn left” is unclear because we have not yet established an initial direction, and “1/32 of the way” makes no sense in combination with “turn left”, because one is a rotation, the other a distance. Is that 1/32 out of 360 degrees ? And the way to what ? “add 1 to a count”, is that any count ? same count as before ? what was that count initialized as ? There are lots of ways a few of these sentences could be misunderstood.

    Wife asks husband to go to the store and buy bread, and if they have eggs, get a dozen. He comes back with twelve loaves of bread.

    1. And according to your exact wording, he really doesn’t have to buy bread at “the store”, does he? He has only to:
      1) go to the store
      2) buy bread

      Maybe the computer store is selling bread today and he can get it there. And who is “they”? There’s only one store mentioned … singular. Is “they” silly Frenchmen waving their private parts at your Auntie? Or maybe “they” are the Quick Hacks in each week’s HaD podcasts.

      Finally, the linguistics allow him really to get a dozen of anything he wants, eggs be damned!

      Alas, those darn computers are just so demanding!

      1. If he is context-aware, he will know that bread can be bought at certain types of stores, he will have a list of preferred stores based on historical data, and a list of preferred bread types and/or brands…

        The main problem with using natural language for programming is that most of Earthlings’ languages are not very good for the task. One would need a language in which each word has only one meaning, grammatical structures are stiff and immutable, so each sentence of certain type looks like any other sentence of that type, just words get changed, and there are no exceptions to any of the rules. Language should be concise and precise. In short it needs to be constructed language with domain-specific, limited vocabulary. Just like any programming language.

        Also if we want to program computers by talking to them, then each symbol (letter) should be equal to one and only one phoneme, so what you see is what you say. English is so bad for that, I can’t really speak it correctly. My native language, Polish, does it better, as each letter or pair of letters has a corresponding phoneme, and once you learn which is which (and pronounce them – just for kicks go to Google Translate, set language to Polish, paste “zażółcić gęsią jaźń” and press the “say it” button, try to repeat it yourself), you can read Polish like a native. Spanish works like that too.

        1. If you use an unambiguous language, then it will be really hard for most people to express themselves in that language, unless your mind already works as a programmer’s.

          1. Ambiguity comes from the fact that we stack multiple meanings to words we use. Make each word mean one and only one thing, and you get unambiguous language with giant vocabulary. Narrow the domain of the language and vocabulary is rapidly reduced. Pidgins, languages with simplified grammar and reduced vocabulary were created to ease communication between various groups of people in situations limited in context. Usually with lots of finger pointing…

            A typical sentence, no matter in what language, has four basic elements: subject, object, verb, and descriptors. Subject verbs the object, and descriptors hold additional information about them. Grammar just describes how these building blocks are connected together, how they mutate and how to use them to make more complex sentences from simple ones. A computer can easily understand rules, but it can’t deal with words that have multiple meanings. Also, some languages are very hard to machine-process due to the complexity and sheer number of the rules. For example, Polish is very, very hard to learn because for each grammatical rule there is a ton of exceptions, the whole language uses prefixes, suffixes and infixes to add meaning to words, and many of them work differently with different root words (for example, check out the use of the Polish f-word, and how it is mutated (and these changes of meaning are verb-sensitive): http://www.linguatrek.com/blog/2011/07/how-to-swear-like-a-pole-pierdoli%c4%87/). And the position of the subject, object and verb in a sentence doesn’t really matter. So Yoda of Star Wars in Polish translation sounds only slightly unusual. Now just imagine sorting this mess of a language with a computer system…

          2. @Moryc, it’s actually even more complex than you’ve described here. There are languages which do not have subject, object, and verb at the core of their grammar. I’ve studied British Sign Language which has many examples of different meanings for one sign, and meanings that change based on facial expression and position in 3D space.

        2. Enjoyed the Polish swear word article. I speak a different Slavic language, not Polish. Lots of similarities, but my language doesn’t combine consonants to make new sounds… what you see is what is pronounced (usually). But accent or emphasis only comes with practice! Grew up in a Polish neighbourhood of Toronto, and usually broke my tongue trying to read Polish until I figured out the new sounds a couple of combined letters could make. Latin has some resemblance when it comes to many tenses and genders. Learning the Cyrillic alphabet opens new reading opportunities for any Slavic-speaking person… Serbian… Russian… RTFM and practice. How is this different than programming?

          Would natural language programming perhaps be more like a Google search? A collection of words that are to the point, without tense, gender, or connecting / transition words.

          1. My wife is fluent in Russian. I can read, listen and write in English, but my pronunciation of it is all over the place. I learned German, Spanish, Portuguese and Latin at some point, but I’ve forgotten most of them. Spanish is arguably the simplest language in the world. It has a very limited number of irregular verbs, and all the rules are real rules with a small number of exceptions…

            As for Polish, there is one great lie in the grammar: officially Polish has only three tenses: past, present and future. In truth it has as many as English or Latin, but we fake it with prefixes and decorators…

            Writing is one of the things I do from time to time. I had this idea for an urban fantasy setting, where people with special talents write magic as a formal programming language of reality, bending the rules of physics and reality at a cost of wasted energy…

          2. I’ve read “Snow Crash”, one of my favorites. Language and linguistics is one of the interests I picked up when learning English, which I learned mostly by reading English porn short stories and listening to “Dilbert’s Principle” – my English teachers had to start from the basics every time we moved up to the next educational level, because new students in my class at the beginning of middle school or high school didn’t know it. In the second year of high school my teacher told me:
            “I know you know the answers, so just shut up and give others a chance to learn”.
            Still, after over 20 years of listening to and using English, and after almost 4 years of University-level courses in it I can’t speak it in such a way that others can understand me. Even my wife, who knows only basics, knows that my pronunciation is all over the place…

          3. Yes, great book. There’s supposed to be similar elements in “The Big U” but I have not got around to reading that one yet.

            One of the problems with English of course is the wide variety of accents. Both regionally in English speaking countries, and between countries. So one may pick up pronunciations of words in one accent and insert them with words heard in another accent and sound rather strange. I have a little German, and I am sure I say one phrase like I’m in Bavaria, another as a Berliner, and another sounds Austrian. My wife has been using duolingo, and that seems to give a lot of pronunciation practice for the languages we tried, I don’t know if it would bore you though as the free part of it only covers the basics.

        3. “The main problem with using natural language for programming is that most of Earthlings’ languages are not very good for the task. One would need a language in which each word has only one meaning, grammatical structures are stiff and immutable, so each sentence of certain type looks like any other sentence of that type, just words get changed, and there are no exceptions to any of the rules.”

          I used to think there were a couple of languages, such as Latin or Russian, that had such unambiguous structure. But any language can be perverted by politicians to be vague enough to mislead the people.

          So, if a politician says “We will have a brighter future with [blah-blah-bla]”

          The “We” may just mean the politician and the frog in their pocket.

        4. The problem isn’t languages, it’s how we communicate.
          Relevance theory explains this – we say the minimum we need to communicate, and assume a huge amount of shared experience and context – whether that’s you knowing my personal situation, or just general human experience. It doesn’t always work, but it’s very reliable almost all the time. This is unrelated to the language of communication.

          If we over-specify something, it draws attention to it.
          Eg If I say “I’ll drive the blue Bentley today”, you infer I have more than one Bentley, hence needed to specify the colour.

          There’s a lot more to relevance theory, but that’s the basics.

          Computers do not have this shared context. So even if they can understand our words, they cannot easily understand our meaning.

    2. Exactly, people are trying to make programming simple and accessible for anyone since COBOL.

      60 years on, there are still people who think programming is hard because of language. No, it is hard because abstracting tasks and logic into simple steps IS hard.

      Good programming languages may simplify a lot of work, but they do not eliminate the need for logic and abstractions.

      1. Bad programming languages hide the tasks and concepts behind simple interfaces, so that the user thinks it’s easy but has no idea what the true complexity of the operation is.

        There’s a lot more to it than simply getting the answer right.

      2. OOP was invented to make programming simpler. Instead it invented a completely new domain of unnecessary problems.

        Also, if any undefined or ambiguous, platform/domain/context-specific behavior is a feature of a language (C and its ilk have them), the language itself is bad…

  2. A while back, there was a court ruling about code not being free speech, because it wasn’t like English. As a result, there were a number of programming-in-plain-English proofs of concept. One, for example, was English to Perl, which I might still have. (I translated it to Python, because I didn’t have Perl.) I recall a C to English/English to C converter as well.

    One programming language that is very much like English is Inform 7, for writing interactive fiction, which seems nice, but drives me a bit crazy somehow.

    A rather old (Not quite COBOL old) example would be Micro-Prolog, which was more English-like. Been a few decades since I saw it but in modern Prologs “is” is used the same EG “X is 1+2”

    1. Yes, Inform7 is “much like” English. I tried to write some interactive fiction with it, but by default there is too much boilerplate code to stay sane… Anyway, I still like it; perhaps one day I’ll derive my own version of an Inform7-like language?

    2. Found it. The “Plain English Compiler – Roger Espel Llima (21st Apr 1996)”
      He wrote it to demonstrate that a cryptography algorithm could be written in executable plain English.

      It does say “Public Domain” but seems hard to find on the net. Though it seems in part due to the many “Plain English Compiler” projects out there.

  3. Can’t Siri do this for me?

    Seriously, this reminds me of Scratch, which replaces vocabulary and syntax with a drag-and-drop interface, as a result of the misconception that the difficult thing about programming is learning the vocabulary and syntax. But what actually happens is that after a short learning curve, the interface is SLOWER than text-based input.

    This is similar, but is probably even more counterproductive, because it results in a sort of cognitive “uncanny valley” case, where a computer language that’s too much like a human language will cause the user to expect the computer to interpret his commands as a human would. Which is a bad idea because humans don’t follow procedures consistently.

    1. Often, these attempts to simplify only work for the easy cases, but actually make it more difficult to do the harder cases.

      The graphical LabView interface is another miserable attempt.

      1. Wouldn’t call any of this type of system miserable attempts. As they do let you just worry about the flow chart program logic to a large extent. Which is something lots of people struggle with, and as a learning tool or for non-programmers who need to automate something only having to learn one (the almost universal one) of the 3 pillars of any language is a good start.

        Faster with text-based is almost inevitable for experienced programmers – even if they don’t know all the useful functions they could use after a while you know all the important ones you need and can create quickly your own function over looking up for something that does it already – having to search for and place the right box all the time is more processes than just placing the right word in the right place.

        1. It wouldn’t be such a bad approach if the graphical user interface was just a front end for the text-based design that you can open as a file and edit yourself if you want.

          As the programmer/user grows in experience, they would have the option of switching from graphical to text at any point in time.

          The miserable part is where you are forced to used tools that just get in the way of real work.

          1. “The miserable part is where you are forced to used tools that just get in the way of real work.”

            Unreal’s Blueprints must really chafe then.

          2. IF the tool was a learning tool that has taught you the first pillar of strict logical reasoning required to program, it’s done its job already and you can more easily move on to whatever else is better suited to your needs.

            That said I don’t see why there shouldn’t be a xml? text type output you could manually edit if you really wanted to. Probably already exists for many of these tools.

    2. Scratch isn’t really intended to be productive, is it? It’s supposed to be accessible to get kids over the idea that programming is unapproachably difficult. I could totally see a natural-language environment doing something similar for adults. Give them the stuff they’ll need 99% of the time, and gradually expose them to all the things they could do in a more traditional programming language.

    3. Scratch is good for kids who often struggle to remember syntax. Moving my eldest from scratch to JS is a constant reminder of how to write a for loop, and what needs () or {}

      1. Some of the syntax in many languages is just there to drive you mad: all the various braces having unique meanings, sometimes only contextually, can be really confusing. The wonderful world of ++ and how it works springs to mind as a fun lesson in hard-to-spot and easy-to-make mistakes. Sure, having the entire loop on a single line is concise and functional if you really know your language inside-out and backwards, and that is kinda cool, but it’s a big barrier to entry, and when you got the order backwards it is so easy to see what you intended to write rather than what you actually wrote. You also end up having to comment your code much more heavily than if the code speaks for itself more clearly. So you might not actually be saving any coding time. I expect you will save lots on initial creation, but when it needs updates and tweaks a week after you have forgotten exactly how it works, you lose that time again in spades, as nobody ever puts enough comments into their code.

        That’s where a language designed to be more like the spoken word is useful. It wouldn’t use a million special characters and specific punctuation, so it would need to be sufficiently contextual, with undoubtedly much longer commands and heavier precompilation because of it. But if it’s easy to read and to spot that one incorrect punctuation mark, I’m very much in favour. If the code also becomes largely its own comments because it’s so clear, that further helps make it worth using.

        It’s hard to get folks away from the C family though, as all those useful library functions that save you having to reinvent the wheel already exist for them, or at worst are a simple enough translation from whichever C-like language you found them in to the one you actually want to use.

  4. Put aside the problem of interpreting, and just imagine describing any program you have written in whatever speech feels most natural to you. That’s an ideal case for what a natural speech programming language would be. But how awful would that be to read or to debug? To modify? Human language wasn’t designed for expressing algorithms.

    You know what was designed for it? Mathematical notation. Perfect for expressing complex algorithmic ideas in a way that can be understood unambiguously and clearly and can be easily modified and manipulated.
    https://www.jsoftware.com/papers/camn.htm

      1. put key in lock
        I see no lock
        put key in lock on door
        I do not know how to put
        use key
        a key needs to be used with another object.
        use key with door
        you do not have the door
        take door
        you cannot take the door
        FFFFFFFFFFFFFFFFFFFF

        1. Yeah, I loved Hitchhiker’s Guide – it both drove me mad and was hugely satisfying when you did figure out both what to do and how to phrase it.

          Done right, there is no reason human-language-like programming has to be that awkward though – we have the storage space to keep every possible permutation that means x in a lookup, so the compiler can get the right job done. It’s lots of work setting that up in advance, of course, but unlike a game that had to fit on a 3.5″ floppy or less, we can easily devote GBs just to the thesaurus lookups now.

  5. If you make English a programming language, you’ll find that most programmers can’t write in English. English on its own is a terrible language for math.
    We joked about making a PowerPoint (specs) to schematic converter, as the scribbles were handed down to us as a requirement.

  6. Reminds me of the fun programming language Rockstar (https://codewithrockstar.com).

    Here is the definition of the function midnight which takes the argument heart and soul:
    “””
    Midnight takes your heart and your soul
    While your heart is as high as your soul
    Put your heart without your soul into your heart
    Give back your heart
    “””

  7. Back in the 80s I saw a book at a garage sale about a dialect of fortran that was written in English sentences. It used a structured and formal grammar, one paragraph per line number, etc. I regret not picking the book up. The title was something like “Programming in English” or “Writing Fortran in English”. I have never been able to find it again.
