Learn To Program With Literate Programming

My heyday in programming was about five years ago, and I’ve really let my skills fade. I started finding myself making excuses for my lack of ability. I’d tackle harder ways to work around problems just so I wouldn’t have to code. Worst of all, I’d find myself shelving projects because I no longer enjoyed coding enough to do that portion. So I decided to put in the time and get back up to speed.

Normally, I’d get back into programming out of necessity. I’d go on a coding binge, read a lot of documentation, and cut and paste a lot of code. It works, but I’d end up with a really mixed understanding of what I did to get the working code. This time I wanted to structure my learning so I’d end up with a more, well, structured understanding.

However, there’s a problem. Programming books are universally boring. I own a really big pile of them, and that’s after I gave a bunch away. It’s not really the fault of the writer; it’s an awkward subject to teach. It usually starts off by torturing the reader with a chapter or two of painfully basic concepts with just enough arcana sprinkled in to massage a migraine into existence. Typically they also like to mention that the arcana will be demystified in another chapter. The next step is to make you play typist and transcribe a big block of code with new and interesting bits into an editor and run it. Presumably, the act of typing along leaves the reader with such a burning curiosity that the next seventeen pages of dry monologue about the thirteen lines of code are transformed into riveting prose within the reader’s mind. Maybe a structured understanding just isn’t worth it.

I wanted to find a new way to study programming. One where I could interact with the example code as I typed it. I wanted to end up with a full understanding before I pressed that run button for the first time, not after.

When I first read about literate programming, my very first instinct said: “nope, not doing that.” Donald Knuth, who is no small name in computing, proposes a new way of doing things in his Literate Programming. Rather than writing the code in the order the compiler likes to see it, write the code in the order you’d like to think about it along with a constant narrative about your thoughts while you’re developing it. The method by which he’d like people to achieve this feat is with the extensive use of macros. So, for example, a literate program would start with a section like this:

The source of all this trouble.
Herein lies my totally rad game in which a
bird flaps to avoid pipes. The code is
structured in this manner:
<<*>>=
<<included files>>
<<the objects and characters in my game>>
<<the main loop>>
@
This is the main loop. It contains the logic
and state machine, but it also has a loop to
update all the graphics within each object.
<<the main loop>>+=
 <<the game logic>>
 <<the graphics>>
@ 

In this example, you’d later write things like this.

In this next bit I am going to define the bird who flaps. To do this I 
will have to create an object. The object will contain the position of 
my bird, the state of its flapping wings, and the physics by which its 
perpetual descent is governed.
<<the objects and characters in my game>>+=
  class Flapping_Bird:
     <<initialize the bird>>
@
The bird is initialized in the middle of the left side of the screen. 
As soon as the user grazes any key for the first time, the bird will begin 
to suffer. To enable this I will need to know what size window the bird 
inhabits, and with this information I will determine the scale of the bird 
and its initial location.
<<initialize the bird>>+=
  def __init__(self,screenx,screeny):

etc..

Okay, this will take a bit of deciphering, and I probably didn’t nail the syntax. In simple terms, anything between a <<>> and a @ is actual code. If you see <<>>+= it means add this bit of code to the previous definition (which was defined by <<>>=). Well, Wikipedia explained it better.
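To make the chunk idea concrete, here is a minimal sketch of the "tangle" step: the pass that stitches the named chunks back into the order the compiler wants. It is not how noweb or Knuth's WEB is actually implemented. It only assumes the simplified syntax described above (<<name>>= or <<name>>+= opens a chunk, @ closes it, and a line holding only <<name>> is a reference), and the names parse_chunks, tangle, and SOURCE are made up for illustration.

```python
import re

# Simplified chunk syntax, modeled on the example in the article (an assumption,
# not the full noweb grammar):
#   <<name>>=  or  <<name>>+=   opens (or extends) a code chunk
#   @                            closes the chunk; everything else is prose
CHUNK_DEF = re.compile(r"^<<(.+?)>>\+?=\s*$")
CHUNK_REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")


def parse_chunks(text):
    """Collect code lines per chunk name; repeated definitions are appended."""
    chunks = {}
    current = None
    for line in text.splitlines():
        match = CHUNK_DEF.match(line)
        if match:
            current = match.group(1)
            chunks.setdefault(current, [])
        elif line.strip() == "@":
            current = None
        elif current is not None:
            chunks[current].append(line)
    return chunks


def tangle(chunks, name="*", indent=""):
    """Expand chunk references recursively, keeping the reference's indentation."""
    output = []
    for line in chunks[name]:
        ref = CHUNK_REF.match(line)
        if ref:
            output.extend(tangle(chunks, ref.group(2), indent + ref.group(1)))
        else:
            output.append(indent + line)
    return output


SOURCE = """\
A tiny program, told in the order I think about it.
<<*>>=
<<imports>>
<<main>>
@
The main part greets the user.
<<main>>=
def main():
    <<greeting>>
@
<<greeting>>=
print("hello")
@
Only now do I mention the import, because it is boring.
<<imports>>=
import sys
@
And the greeting gets a second line later on.
<<greeting>>+=
print(sys.platform)
@
"""

tangled = "\n".join(tangle(parse_chunks(SOURCE)))
print(tangled)
```

The prose between chunks is simply dropped, and the chunks come out in the order the root chunk <<*>> asks for, regardless of the order they were written in. A real tool does more: it also "weaves" the prose into typeset documentation and reports missing or unused chunks.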

I'm into some heavy technical literature these days. Approach with caution.

That introduction aside, the concept that really stuck with me was the idea of writing down the thoughts in your head along with your code.

Ordinarily, I’d say this would be bad to put into practice. Your coworkers won’t really appreciate paragraphs of exposition between every line. The whole point of learning a programming language is to become fluent in it. The goal is, in my mind at least, to get to the point that you’ll be able to read code and logic flow like you’d read a book. (Along with a helpful smattering of comments to orient you.) Of course, I’m certainly not Donald Knuth, and I’m not even a very good programmer, so don’t take that as gospel.

Programming Should Be Enjoyable

Which gets me to the core of the article. It’s something I started doing soon after I read about literate programming, and it’s tripled how fast I learn programming. It’s also made it more fun.

I’ve been working through a great beginner book on pygame, Making Games with Python and Pygame by Al Sweigart, which is free online. Below is the very first example in the book, and it is presented in the standard way. The reader is expected to type the code into their text editor of choice and run the code. After the code performs as expected, the reader is then expected to go through a line by line dissection in order to figure out what each line did. This works, but I think it’s typically better at teaching someone to transcribe code than to actually code code.

import pygame, sys
from pygame.locals import *

pygame.init()
DISPLAYSURF = pygame.display.set_mode((400, 300))
pygame.display.set_caption('Hello World!')

while True: # main game loop
  for event in pygame.event.get():
    if event.type == QUIT:
       pygame.quit()
       sys.exit()
  pygame.display.update()

With my revelations from literate programming my code started to look like this:

#alright, we remember this. The compiler or interpreter or whatever
#gets mad if we don't import the functions and files we need first

import pygame, sys #pygame is the game library I'm using, sys is full
# of useful system calls. I think I'm importing this one for the sys.exit()
# function later. I skipped ahead and read that this is similar to breaking
# from the loop or pressing ctrl-D. Maybe this is a cleaner way to exit
# a program and a bit more universal? I don't know but I don't want to get
# bogged down at this point, so I'll remember the question for later.

from pygame.locals import * #I've already imported pygame, but I'd like
# to get the stuff inside pygame.locals out. The book mentions that this
# is mostly useful constants like the QUIT I'll be using later.

I would attempt to explain as much as I could about what each line was doing. It’s a bit tedious, but it helped me note the bits I remembered and figure out which parts I only thought I recognized.

pygame.init() #I should initialize pygame before I start using it. No
# complaints with that logic here. Apparently it will throw errors if I don't.
### Yep, tested it by moving pygame.init() down a few lines. It definitely
### throws errors.

I had this up in the robotics lab in my university for a long time.

One thing I realized was pretty fun was to keep a log of my experiments in my code. Since Python doesn’t really care, I’d just add more hashes to show when I’d gone back to try something out. Anyway, the rest is at the end of the article if you want to read the full content of my inane and perhaps incorrect comments.
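Because the experiment notes all start with the same marker, they are easy to pull back out later. Here is a minimal sketch, assuming notes live on their own lines and begin with ###. The function name experiment_notes and the ANNOTATED sample are made up for illustration, and the sample is only scanned as text, never executed.

```python
def experiment_notes(source, marker="###"):
    """Return (line_number, note) pairs for comment lines starting with marker."""
    notes = []
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith(marker):
            # Drop the marker itself and any padding around the note text.
            notes.append((number, stripped[len(marker):].strip()))
    return notes


# A small annotated sample in the style described above.
ANNOTATED = """\
pygame.init() # initialize pygame before using it
### Yep, tested it by moving pygame.init() down a few lines.
DISPLAYSURF = pygame.display.set_mode((400, 300)) # the main surface
### played with it a bit, changed the window size
"""

for number, note in experiment_notes(ANNOTATED):
    print(f"line {number}: {note}")
```

Run against a whole source file, this gives a short lab notebook of everything that was poked at, with line numbers pointing back to the code in question.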

While I’m still not sold on literate programming, and perhaps am not at a level to be sold on it anyway, I definitely benefited from some of its core ideas.

Programming is one of those things that’s really hard to take notes on. I can’t imagine anything more useless than writing, “print(‘the thing you want to print’) – prints something to the console,” on a bit of lined notebook paper. What good will that do?

Programming is about experience, developing good thought patterns, and clearly understanding what the magic is doing. If you’ve been having trouble making programming stick for you, give this a try. Maybe it will help. Looking forward to thoughts in the comments. Any tricks you use for learning programming?

DISPLAYSURF = pygame.display.set_mode((400, 300)) #gosh I hate all caps,
# but I'm gonna trust that the book writer knows what he's up to. This
# is the main "surface" on which everything will be drawn and defines the
# size of the game's window. Not entirely sure what a surface is yet in
# the context of pygame.

###played with it a bit, changed the window size, etc. It doesn't like
### it if the (x,y) of the screen isn't in a tuple

pygame.display.set_caption('Butts') #change the title bar. I'm a dolt.
# Noting that this is all in the display bit of pygame.

while True: # main game loop
    for event in pygame.event.get(): #this bit gets all the events that
    # happened and cycles through them one by one until it's done. I think
    # each event might be an object, but I'm not sure.
        if event.type == QUIT: #The book mentioned that the event types
        # are just numbers and I imported QUIT from pygame.locals earlier
        # with the from statement. I could have left the from statement out
        # and also gotten this with pygame.locals.QUIT

        ###yep, just for the heck of it I tested that and it worked.
            pygame.quit() #quit pygame first. This seems important.
            sys.exit() #quit out of the program second.
    pygame.display.update() #just going out on a limb here, but my guess is
    # that this function updates that DISPLAYSURF I made earlier.
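One question the notes left open was whether sys.exit() is just another way of breaking out of the loop. Here is a minimal sketch that can be run without pygame. The function names are made up for illustration. The difference it shows is that break only leaves the loop, while sys.exit() raises a SystemExit exception that skips everything after the loop as well.

```python
import sys


def leave_with_break():
    """Leave the loop with break; code after the loop still runs."""
    steps = []
    while True:
        steps.append("in loop")
        break
    steps.append("after loop")
    return steps


def leave_with_exit():
    """Leave via sys.exit(); it raises SystemExit, skipping the rest."""
    steps = []
    try:
        while True:
            steps.append("in loop")
            sys.exit()
        steps.append("after loop")  # never reached
    except SystemExit:
        # Caught here only so this sketch can report what happened;
        # left uncaught, the interpreter would shut down.
        steps.append("SystemExit raised")
    return steps


print(leave_with_break())
print(leave_with_exit())
```

In the game loop above nothing catches the exception, so sys.exit() ends the whole program rather than just the while loop.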

72 thoughts on “Learn To Program With Literate Programming”

  1. I do a condensed version of this in the scripts I write. I’m not a programmer by any means, but I note like crazy when I script. Sometimes I don’t go back and use something for 6+ months and it’s nice to have a reference of what each piece does.

    1. Documentation is CRITICAL. Where I work we build industrial motor controllers, some are just phase control soft starters for medium voltage, the latest toy is a medium voltage variable frequency drive.

      You want to talk about code… VFD I think is just south of a million lines so far. Without documentation, it would be impossible to come back to for improvements, add ins, error fixes etc.

      Our first microprocessor unit was coded by a guy who’s no longer here (mid 90’s), and he ‘made it work’ yes, but nobody had any idea how the hell any of it worked. And it did.. most of the time…

    2. In my opinion, and based on 45 years programming, the `real programmer code is self documenting` thing is an urban myth. For nine years I worked at one of the top three telecommunication companies. Reorgs and layoffs every six months were the norm. I was in two programming groups, during my time there, and had 12 different bosses in 9 years. Comments are a necessary shortcut to save time .. especially because commercial programmers never seem to have enough time.

      Because of this I inherited hundreds of websites (tools) composed of several thousand programs, most had no documentation or comments. And, I always had ten new projects being worked on at any given time, adding hundreds more websites to the maintenance pool. To make maintenance even more challenging, the ETL parts of these `tools` spanned multiple platforms, hardware, and languages (e.g. vb.net, c#.net, perl, c, ssis, shell, sql, etc).

      Just did NOT have the time to try and decipher the problem some piece of code was addressing (i.e. intent) .. especially since programmers had to `stand on their ear` to pull data from the network computers .. combined with different skill levels and different programming styles .. getting to the point of some code would just take more time than available. Even for code I had written but not laid eyes on for six months .. comments were very helpful to have.

      My comments went as follows:
      o Summary of program at the top
      o Followed by a change Log .. one line summary of when original code was done and subsequent changes .. one liners with user id, date, change description summary.
      o Summary at beginning of major routines followed by one line comments throughout code where necessary
      o This was normally complex or important code .. NOT things like `spinning through array` when it’s obvious you have a loop spinning through an array.

      Frankly it just became a habit and I never once noticed a time drain doing it. In fact, I know that, long term, I saved immense amounts of time .. folks think they’re saving time by leaving them out but .. the time you think you save is quickly used up the following times where you have to waste hours or a day figuring out why some code is there and its intended purpose.

  2. As far as I know, Knuth is the only person using his literate programming scheme. Note also that it is a hybrid mating of Pascal (a wretched language, that thankfully seems to have entirely vanished) and TeX – the typesetting language Mr. Knuth invented that people either adore or despise. It may have variants using other languages, and the concept has been used in the perl and ruby worlds, the idea being to keep documentation and source code together in one file lest they wander down separate paths.

    But as is often said, real programmers don’t write documentation. Perhaps they should, but they simply don’t.

    My advice? Learn to program. Persevere. Eventually you will become one with the code.

    1. Real programmers believe that source code should be the only documentation needed. It won’t work in the real world of spaghetti code and variables named: i, x, someNumber or var1. Every programmer assumes that his code is perfect, clear and understandable, because he understands it. I assumed that until I opened one of the older projects and wondered for hours, what it was supposed to do and how…

      1. The source code should be the only documentation needed, but that requires you to write clear, and understandable code. It also saves you from the double maintenance burden of making sure the comments match reality. I’m not too fond of the presentation style, but “Clean Code” should be mandatory reading for everyone who fancies themselves a programmer.

        Kind of related to this is the Cucumber family of tools for behavior-driven development (SpecFlow, Behave etc.), where the test scenarios are written in executable, more-or-less natural language.

        1. Now please leave the fantasy land and return to the real world. In the real world every programmer believes that his code is clear and understandable, and every other programmer writes gibberish. In the end when you have many people working on one code base you get an undocumented program that looks like something written by a thousand monkeys randomly typing on a thousand keyboards connected to one IDE. People are not clear and understandable, so how can you expect them to write clear and understandable code? Especially when every one of them thinks he’s a genius?
          The same fallacy of perception created Twilight saga and other, mostly self-published literary crap by deluded morons who believe that typing something equals to being new J.K. Rowling or Stephen King.
          Code is not its own documentation and it never will be. Especially when written by morons…

          1. Maybe try working with grown-ups…? Of course it’s an ideal, but you can always hide away the really gnarly stuff so you don’t have to see the bits that require comments unless you are deliberately touching them. In our team we very rarely have to comment anything, and the difficult to understand bits are nearly always in legacy code from the time before these principles were adopted.

          2. Unfortunately, most people can’t write useful or accurate comments either. It is nearly as hard to write good comments as it is to write clear code.

        2. “The source code should be the only documentation needed”
          that’s like saying that axioms are the only mathematics needed, the rest just follows…
          not everything that is fully specified is necessarily fully obvious nor all the ramifications fully understood.

        3. I liked this article “Code is not literature” code must be de-coded to be understood.
          I’ve found that code can be improved, if you’re willing to re-code a work that’s already working.
          I’ve gone back to my own programs and had the same feeling of shame, but then spent the time to fix/redo it.
          Usually the best thing is to delete huge parts and rewrite it knowing a shorter path between input and output.

    2. While Literate programming isn’t commonplace, it’s not entirely extinct. I know of at least one other example, ULIX, which is a simple OS written as documentation for an OS-development course. It’s written in C/Assembly and uses XeLaTeX for the documentation part.

      http://ulixos.org/

    3. Knuth may be the only person using his original WEB, but there are other Literate Programming schemes out there. I couldn’t get into WEB (or CWEB), so I haven’t pursued other forms of Literate Programming. I find it a fascinating idea, but one that didn’t seem to work for me in practice.

      One problem I had with WEB/CWEB was the way that the code was typeset. I am perfectly capable of reading Pascal or C code, and it doesn’t help me to see the Pascal assignment operator “:=” replaced by “←”, or the C inequality operator “!=” replaced by “≠” because those are more mathematically correct symbols.

      But the main problem is highlighted in that it is hard to read and edit the actual code. Taking the author’s “flapping bird” example, he shows defining snippets of code, reusing by reference snippets of code, and adding to snippets of code, but rarely do you get to see a whole function definition in one place. How do you know where to modify something? How do you read it and know what it does?

      Literate Programming may be very good at explaining what the programmer was thinking, but I suspect it works best when dealing with a single programmer (like Knuth), and not on team projects (like the rest of the world).

      1. The point of Literate Programming is to reform code in the order it’s easiest to reason about, rather than in the order the compiler needs it in. To take your example, the compiler requires variable definitions at the beginning; LP allows you to specify variables used by a chunk of code, then write out the code itself. This can be done recursively: If you have a logically self-contained chunk of code, you can separate it out just like with a subroutine, even in a context where a subroutine would not be appropriate. The ‘tangle’ process compiles all such fragments into compilable source, by collecting all definitions at the beginning, and then putting together all procedural code from the fragments.

        Perhaps that’s where a misunderstanding comes from: You don’t _have_ to divide code into small chunks: you only do it if it makes logical sense because those chunks stand on their own.

    4. He does literate programming in C nowadays (it’s called CWEB). There are actually people who do use LP, and I am one of them, it really helps to organize code.

    5. Real programmers _do_ write documentation – hack(er)s may not.

      As you call Pascal a wretched language I think you don’t know shit about this. BTW Pascal is widely used, it is number 13 in the TIOBE index.

      1. Is anyone using Pascal for new projects though? I mean, it’s below Assembly & VB (which itself is a few notches below VB.net), so I’d hazard a guess that it’s mostly used for maintenance of old projects.

        1. In my small company there are 3 engineers routinely writing Windows applications in Delphi and Lazarus (Object Pascal systems). A lot depends on what the target system is.

    6. “Real” programmers. What is that?
      One thing most people don’t understand is that most code is write once, read many times. Most programmers don’t even bother to write down what their code is supposed to do! In some way they benefit from that because they make themselves indispensable that way. I think it is common courtesy to take the time to document your code so other people won’t have a hard time understanding it.
      Code cannot be self documenting if it does anything other than very simple things. Among other things the comments should point out WHY you do things the way you do to prevent other people taking the wrong route towards a solution.
      Oh how I hate poorly documented code…

  3. The most important thing about writing code comments is capturing intent. Even if you don’t code for a living you can see that X loop is working through Y array, but WHY is it doing that. I like to think of good code comments as the director commentary track on a movie.

    1. Oh thank you. I hear the argument that comments should not be necessary. You just explained why they are.
      Yes, it’s the ‘why’ of it. What is a loop, a case statement — any control structure trying to accomplish?

      One can always analyze code, see inputs and outputs and side-effects but the thing that is too often missing is an indication of what the programmer was thinking.

      I’ve spent a couple decades coding. Maintenance can be exhausting — or a delight. And a few well-constructed comments can make all the difference.

      Scientists and engineers stand on each other’s shoulders. Programmers step on each other’s toes.

    2. Exactly, capturing intent is the point. As a deliberately extreme example, you can take an assembler source, look at each simple instruction and understand perfectly what it does – but it’s basically guaranteed you won’t have any immediate idea what all of them together do. What many people seem to fail to realize is that C (or whatever other language) has the EXACT same problem, only on some arbitrary other scale – you may get a better idea what the specific section you’re looking at does, but minutia of the implementation will still always be obscuring the big picture; no matter how well you code, even looking at your own sources is guaranteed to leave you wondering what you intended to do in certain places if there are no comments at all. Self-documenting code is a myth – granted, some sections of code are more plain than others, but all code more complicated than a “hello world” will have obscure sections no matter what you do, so comments are not optional unless your intent is to deliberately obfuscate…

  4. Not strictly related, but readable code reminds me of a beef I have: “Tutorial” projects with code written by and for Level 99 Engineers. You know what I’m talking about. Someone trying to learn should be presented with code that is clearly labelled and does one thing at a time. Instead, you see code that looks more like line noise – it might be concise in a brilliant way but it’s also not very accessible to most people working their way through a tutorial.

  5. Ah ok now I understand how HaD bloggers program their posts:

    here i will put a fat ass title that nobody will resist clicking on it. I may change it, maybe it´s not controversial enough or lacks superlatives ?
    ok, so here we are. let´s make it easy and write like we talk. i just need three minutes to quickly parse the hack I found (ok it´s not a hack. it´s not a haaaack, i know. Those damn **** readers will flame me again. Ok I pasted the important words/sentences now i can wrap some text around. Ah, two links, and if possible one from the .io. Ok, done. Well, ok, fine, SUBMIT!.

    …ok i forgot to check if this hack hadn´t been published earlier …

  6. Commenting the code is dangerous. In theory it’s a great thing to do, in practice more often than not the comments are completely outdated and so, they are mostly misleading. I’m talking from a pragmatic point of view, of course I’d love to see code abundantly and adequately commented, it just doesn’t happen. If you want your code to be easily understandable, write it as clearly as possible instead of doing whatever and then explaining it in a comment that will be there long after that code has been lost in revisions.

    1. If you’re doing it right, you comment your code, not code your comments, therefore, if you revise your code, you SHOULD revise your comments. It’s a PITA, but part of it.

      1. I actually agree with both [postcomment] and [when did..]. Write code with variables that are self explanatory while comments explain the general gist of what it’s intended to do. I now also am in the habit of commenting on boundaries for debugging. Especially with loops, arrays and pointers, it’s usually best to test if it’s within the boundaries if you can afford it. C# has bounds checking built in and has saved me a huge number of times. Bugs usually bunch up in edge cases that are overlooked.

      2. It’s a PITA, until you realize that if you are revising the code, you are probably revising it because it has survived a while, and if it has survived a while, it will likely survive a while longer, and someone else will end up revising it again. That person might even be you, future you, a you that’s forgotten almost everything you knew about the code.

        Even if it isn’t you, whoever tackles it next will be able to look at the commits in the source control system (which you are using, because you are a professional) and see who the asshole was that didn’t update the comments (or didn’t put any in in the first place).

        Ok, it may still be a PITA, but those who don’t understand that the alternative is worse will be cursed to learn from experience that the alternative is worse (or be murdered before gaining enlightenment).

        1. In my company we review 100% of code headed for production, and since the comments are the first thing reviewers look at to figure out the code, it encourages people to keep them aligned with the code. Once you realize that you will spend 50 times more time working with existing code than writing completely new stuff from scratch, you will start to really value thorough commenting.

    2. @J. – years since I did any serious (Industrial Process Control) coding, but yes, that fragmentation/separation between code & comments can cause major maintenance repercussions in the future – do modern IDEs have features that flag up a neighbouring comment as being ‘out of date’ if the nearby code has been changed??

  7. One thing to consider is that programming can be viewed in many lights and I get the sense that everybody has their own private mental model that describes the nature of the task. Some think of it purely mathematically, some think of visual metaphors, some think of it as a verbal task (akin to giving instructions), and most people form their own mix. When you find how best to represent the task in your own mind I suspect the rest will fall into place.

  8. I mostly learned coding by looking at other peoples code and then changing stuff to what I THOUGHT might happen, and then ran it, and did that millions of times over until I could understand what would happen if I changed this or that variable or order of operations. But that was back on a TRS-80 where typing a gazillion lines of code and getting something wrong was a painful ordeal :/ Now I just google it when I’m stuck.

    Being a decent coder takes a desire, you can’t lead a horse to vodka or something. A joy for problem solving, some logical thinking ability and no guilt at robbing others blind if they have something that works. This crap of “every kid a coder” fails at desire. Not everyone learns to swim, not because they are quadruple amputee named “Bob”, but cause they could care less to know how.

  9. By far the fastest way to learn programming is by following a programming course. Failing that, try youtube or any other video based course with the added advantage of being able to skip or replay things at will. The interaction of an actual live course will be much more engaging than reading any book. Only few of us learn well from books anyway :)

    Literate programming is a fun concept but hardly practical. I think a part of it is important: describe the behavior as completely as you can. Plan ahead as much as possible and you’ll be able to skip a lot of trial and error.

    I’ve wrestled my way through “The C (ansi C) programming language” by Kernighan and Ritchie. Some consider this the holy bible for C programming. It’s only 272 pages but contains such a wealth of information densely written (without bloated text that repeats over and over the same concepts in different ways). The hardest concept was wrapping my head around pointers and structures. After that I attended C++ programming at school and breezed through it but still reiterated and picked up some new concepts I missed, especially C++ specific concepts.

    While I can’t recommend starting programming with C, the investment of your time will pay dividends later on as many other languages (also known as curly braces languages) are derived from the concepts of C like C++, C#, java, javascript, php, etc etc.

    As for doing the actual programming with the lack of full programming knowledge the best in my opinion is to keep a log, a Word document where you write down and explain each new concept you’ve learned and what went wrong and why at the end of the day. This has certain didactic value: reproducing what you’ve learned makes it easier to remember, and you can read it back written in your own language and context.
    One of the reasons I abandoned projects was the lack of documentation I did so getting back in just wasn’t worth the effort. Now I just build on the knowledge I have and continue building my own library of stuff I’ve built, all fully search-able. I’m now at a point I start thinking about object oriented programming, so that code I’ve written is easier to implement in new projects by making them independent of other code and when it is dependent it is clearly defined. There’s no use to reinvent the wheel each project. There’s no shame in being a cookbook programmer ;)

    It also helps a great deal to make a behavioral/functional description of what you want to program. If you’re programming an on/off toggle switch I’d make a functional design like this:
    Input: 1 button,
    output: 1 led.
    (you can go so far as defining which pin on the MCU you want to use)

    Behavior
    The default state of the led is off and the switch is not pressed. When the switch is pressed the led state is inverted. The button must be debounced with a 1ms interval using interrupt.

    mcu peripherals
    – IO pins
    – interrupts
    – timer

    You can derive a lot from the above, and this is only a simple example. Imagine a 3D printer…

    One more thing to keep in mind: learning new concepts, new peripherals, new libraries etc is going to happen regularly so it’s important to keep in mind that the learning is never over. So you’d better have fun along the way :) There’s always a new type of display or sensor, a different kind of mcu etc.

  10. Finally I can stop feeling dumb among all you EE wizards – I’m a self taught developer with a 20+ year career behind me.
    Like many devs my age, I got into it by chance (it wasn’t known as a reasonably well paid career path when I was at school) after finding my initial choice of research chemist would be a dead end unless I was prepared to study for a PhD.
    I’ve always loved programming, and found it easy and fun to learn … that’s what makes people like me good at it, and people like Gerrit struggle.
    I’ve mentored several graduate developers over the years and within half a day I can spot the ones who even though they put the work in, they’ll always find it difficult and never produce good code. On the flip side, it takes even less time, maybe an hour to tell the ones who just “get it” and will do fine.
    You don’t have to be a genius, or great at maths, there’s just a type of mindset that naturally produces elegant efficient code.

    1. I fully agree. There is a mindset for good software development. One part is the ability to keep a large portion of the system in your head so you know if X is changed it will impact Y and Z. That is especially critical when debugging, especially on multi-developer projects.

      Programming languages, like Pascal originally and C++ today, attempt to make sure that if X changes and you don’t adjust Y and Z you’ll get a compiler error.

      1. Well yes and no …

        It depends on what sort of language “structure” you are working with.

        What you say is very true of a lower level language like a “C” variant. These sorts of languages are very flat in that you are mostly using language primitives rather than abstractions. So changing one thing does affect another, and that is a given for these types of languages.

        However other languages, both lower level like Lua, or higher level like PHP, require that you create a layer or several layers of abstraction (for decent code), and that means that x, y, and z only exist in the local scope, which is obvious from what you’re reading rather than having to keep lots of things in your mind.

      2. Try out TDD/BDD, or something like that when you have a little time. With that you only have to keep in mind the untested code you just wrote down, and that way debugging is a lot easier. I am pretty sure writing tests first (or along with the code) works on low level languages too. I am not sure though how it is possible to use the same approach with long compile times. I would never do any project without automated tests, you always end up with a surprising amount of bugs in the code. The last time I tested after I wrote code (10+ years ago) I found somewhere between 25-50 bugs typically by the rarely used parts of the code e.g. deleting something. Writing tests first is better, because it gives some time to think about what you want the code to do. And if you know that “what” part, you just need to figure out the “how” part.
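
A minimal test-first sketch in Python, using the standard `unittest` module. The `Inventory` class and its methods are hypothetical and exist only for illustration; the idea is that the two tests for the rarely exercised "delete" path are written first, and `delete` is then filled in until they pass.

```python
import unittest


class Inventory:
    """Tiny hypothetical container used only to illustrate test-first work."""

    def __init__(self):
        self._items = {}

    def add(self, name, quantity):
        self._items[name] = self._items.get(name, 0) + quantity

    def delete(self, name):
        # Written after the tests below: deleting a missing item must not
        # raise, and deleting an existing one must remove it completely.
        self._items.pop(name, None)

    def count(self, name):
        return self._items.get(name, 0)


class InventoryDeleteTest(unittest.TestCase):
    def test_delete_removes_item(self):
        inv = Inventory()
        inv.add("resistor", 10)
        inv.delete("resistor")
        self.assertEqual(inv.count("resistor"), 0)

    def test_delete_missing_item_is_harmless(self):
        inv = Inventory()
        inv.delete("capacitor")
        self.assertEqual(inv.count("capacitor"), 0)
```

Saved to a file, the tests can be run with `python -m unittest <file>`. The "what" lives in the test names, which leaves only the "how" to work out in `delete`.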

  11. Well, it IS entertaining seeing comments in code that were written by someone other than the writer of the code. But if I was maintaining a program, I would find it very disconcerting to see comments indicating that the writer isn’t really sure how it works!

    I think it was actually something Knuth wrote that was the best advice ever about coding: don’t assume that the future you will be smart enough to remember what the current you was thinking. That is, if it took you a while to figure out how to make it work in the first place, don’t waste your future time by making you figure it out again later. Code is written to make machines understand what you want; comments are written to make people understand it. We don’t expect the machine to execute our comments, so why expect humans to gain understanding from pure code?

    I also had the “nope, not doing it” reaction to Knuth’s “literate programming” when I read about it years ago. I think I understand what’s behind it, though – it’s the programmer’s aversion to writing something twice. If you write the same code twice, it usually means you should have written a function or macro. Otherwise, if you have to change it later, you have to remember to change every instance of it you wrote. Same goes for comments: if you write the same comment twice (for example, in the code itself and also in the formal specification), you’re opening yourself up to the possibility of getting the two out of sync later and contradicting yourself. Knuth’s solution to this was to write a front-end to scrape information from the code comments to automatically generate the documentation. But this comes at the cost of having additional syntax for both the writer and reader to deal with. Having looked at enough open-source code to see a pattern start to emerge, I think that many programmers short-cut this one more step by just never writing the formal documentation. That’s not necessarily a bad thing, since this is another way of avoiding the risk of having two documents contradict each other. The formal spec is what always suffers, so maintainers always have to go straight to the code anyway, but skipping it often means having to search through many files to find the starting point. I find a lot of benefit in having a SMALL readme file that directs me to the files that contain the fundamentals of the program.

    Regarding tutorials, I think the common practice in tutorials of having the student type in a block of code is intended to get them to actually read each line of code. This usually backfires for blocks of more than a few lines, though, since you can easily go into “stenographer” mode, where information goes straight from your eyes to your fingers without slowing down at the brain. And since most of us are getting our tutorials on line, it just gets copy/pasted anyway. I’ve seen a few tutorials that present the examples in the opposite order – first one line at a time with an explanation of why it’s there, then the whole block at the end. For me this is better. I can copy/paste the code one line at a time, which still at least forces me to look at each line of code, and I spend a lot less time scratching my head trying to understand what a line is doing, unaware that the tutor makes it all clear on the following page.

    1. I think writing the code first and writing the formal documentation later might be the wrong order. If you know what you are doing, then you can write the documentation first, then the tests and the code only after that. With BDD I sometimes do it this way, but it requires a lot more thinking than starting with the code. In fact sometimes I start with txt files or github issues and try to figure out there what is the best solution. The code can wait, because refactoring it every 5 mins is wasting time. I might not be the best programmer, but I ended up doing it this way.

  12. I remember writing a TCP stack for a microcontroller about 15 years ago. I literally took the RFC (quite readable), commented out all the lines and wrote the code inside the specification.
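
The commenter’s actual code is not shown, so the following Python sketch only illustrates the style: specification text kept as comments, with the implementing code written directly beneath each statement. It uses the 16-bit Internet checksum as a stand-in example; the spec sentences are paraphrased rather than quoted, and the function name `internet_checksum` is an assumption. A real TCP checksum also covers a pseudo-header, which is left out here.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's complement checksum of `data`."""
    # Spec (paraphrased): if the data has an odd number of octets, pad the
    # last octet on the right with zeros to form a 16-bit word.
    if len(data) % 2:
        data += b"\x00"

    # Spec (paraphrased): take the one's complement sum of all 16-bit words.
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        # One's complement addition: fold any carry back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)

    # Spec (paraphrased): the checksum is the one's complement of that sum.
    return ~total & 0xFFFF
```

Because each code fragment sits under the sentence it implements, a reviewer can check the two against each other line by line.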

    Comments… not what you are doing. Why you are doing it.
    In C. Choose your data structures carefully. They group data, related stuff more than anything else.

  13. I’ve taken up doing Doxygen/Javadoc religiously (even if I don’t have doxygen configured right for the current task). I find that typing out exactly how a function is supposed to behave works as an early debug step, letting me find spots where my implementation was lazy and broken. And the standard set of @param @return @warning @note tags gets me to always write something down. Combine it with commenting anything that isn’t blatantly obvious and I end up writing about as many comments as code.
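
A rough Python analogue of that habit, using a plain docstring with Args/Returns/Raises/Note sections in place of Doxygen’s `@param`/`@return`/`@warning`/`@note` tags. The function `clamp` is a hypothetical example; the point is that spelling out the behaviour forces a decision about the `low > high` case that a quick one-liner would gloss over.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Limit ``value`` to the closed range [low, high].

    Args:
        value: The number to limit.
        low: Smallest value that may be returned.
        high: Largest value that may be returned.

    Returns:
        ``low`` if ``value < low``, ``high`` if ``value > high``,
        otherwise ``value`` unchanged.

    Raises:
        ValueError: If ``low > high``. Writing this section is what exposed
            the case; a bare min/max one-liner would silently return ``low``.

    Note:
        NaN inputs are not handled specially.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))
```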

  14. I fully support this idea, I’ve been doing something similar lately with my first project that started requiring me to have a deeper understanding of how it’s actually working. I use ridiculously descriptive variable names (because copying and pasting is free, and error free) and I comment the living s***t out of my code, including dumb and/or funny explanations for when my ADHD feels bored or overwhelmed by a quintuple nested conditional statement. Also, like in any conversation, the tools of expression can be more important than the content being communicated, because unilateral as it may seem, we can get all emotionally frustrated with badly communicated concepts (especially when dealing with inanimate objects and abstract logical concepts). Programming is supposed to be a human-computer interface, yes?

    1. “ridiculously descriptive variable names” Good for you. A lot of code can be self documenting if the names are chosen correctly.

      One trick is to use a shorter name when writing the original code. It can even be an acronym. Then do a search and replace to create the real, longer name. So MVLDN becomes MyVeryLongDescriptiveName.
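
If the editor at hand has no whole-word search and replace, the same rename can be sketched in a few lines of Python. The helper name `rename_identifier` is hypothetical; the word-boundary anchors keep it from touching longer names that merely contain the short one, though it will still rewrite matches inside strings and comments.

```python
import re


def rename_identifier(source: str, short: str, long: str) -> str:
    """Replace whole-word occurrences of `short` with `long` in source text."""
    pattern = re.compile(r"\b" + re.escape(short) + r"\b")
    # A function replacement avoids any special handling of backslashes.
    return pattern.sub(lambda _match: long, source)
```

For example, `rename_identifier(code, "MVLDN", "MyVeryLongDescriptiveName")` leaves an unrelated name such as `xMVLDN` untouched.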

      A good modern IDE will suggest the name as you start to type it so once you’ve entered “MyVer” it will suggest the entire name in a drop down list.

  15. Literate programming and inline documentation make more sense after you understand the distinction between imperative and declarative statements.

    Imperative statements tell you how to do something, like: “increase the value of X by two and store the result in Y.” Most programming languages are imperative, and provide a much more compact way to express that kind of thing: “Y=X+2”. One of the easiest ways to write bad comments is to re-transcribe the imperative statements of the code with ten times as many keystrokes. It doesn’t add any new information, and can fall out of sync with the part that compiles.

    Declarative statements tell you what’s being done, like: “check the input value for uniqueness”. They contain useful information, but can’t possibly compile. The whole idea of self-documenting code is to choose variable and function names that add declarative information that can’t fall out of sync with what the compiler sees.

    Not all information about a program translates to something executable though: “this section is time-critical and has to complete in 20 microseconds”. That kind of information belongs in a comment.

    Literate programming is a middle-out way to work out the structure of a program, collect information that really belongs in comments, and find variable/function names that are usefully self-commenting, all in a way that leads directly to executable statements you can express as code.
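
The three cases described in this comment can be put side by side in a short Python sketch. All function names are hypothetical, and the constraint in the last comment is invented purely for illustration.

```python
def add_two_restated(x):
    # Imperative comment that only restates the code (adds nothing):
    # increase the value of x by two and store the result in y.
    y = x + 2
    return y


def has_unique_values(values):
    # The declarative "what" lives in the function name, so it cannot
    # drift out of sync with what the interpreter actually runs.
    return len(set(values)) == len(values)


def accept_input(value, seen_values):
    # Information the code cannot express (illustrative assumption):
    # callers invoke this from a time-critical path, so keep it free of I/O.
    if value in seen_values:
        return False
    seen_values.add(value)
    return True
```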

  16. This just looks like an easy to understand approach to a ‘design in, code out’ object oriented methodology. Works extremely well in generating compact, understandable code in complex systems.

  17. Not quite 20 years ago we had to use LP at University during a particular course. I’m afraid that was the only time I ever used it. I HATED having the complete documentation inside the code, because when I’m writing, or more so, reading code, I need to be able to read the code uninterrupted by long wordy TeX syntax jarring documentation.

    I hated it so much that I actually went so far as to write a system for my editor of choice at the time (Nedit) so that I could keep my code and documentation chunks in separate files (so, basically the antithesis of LP) and use special comments in the code file to indicate where the doc chunks were to be placed. You could hit a hotkey on a line in the code file and it would open the documentation file and set the cursor to edit the documentation chunk for that code, then in the syntax highlighting for the code I just made it grey-out all those special documentation markers. Basically C code and TeX doc in their own files which just happened to be cross-linked via special marker comments. “NoNEDIT” is long gone now, but I found my write up about it on the Wayback Machine… https://web.archive.org/web/20020802040609/http://crash.ihug.co.nz/~pabbs/nonedit.html

    Don’t get me wrong, my normal code in the 20 years after, and however many before, is commented pretty heavily, but there is a difference between simple comments, and embedding a completely different (TeX) language in your code.

  18. When I first started programming in FORTRAN IV, the variables had limited length (8 characters?) and a comment required a separate card. Yeah, punched cards.
    Since then I have learned BASIC, Assembly, Forth, C/C++, Ada, Java, VHDL, and others.
    Modern text editors are great for programming. Color really helps to grasp what is happening in the code but is not sufficient to quickly recognize whether the section of code I am looking at is the code I need to change. I have adopted the habit of block commenting each routine with a few comments on code lines, for example:

    //-----------------------------------------------------------
    // Routine name
    // Description of what routine does
    // Entry: what is expected on entry to the routine
    // Exit: what you get when it’s done
    // Note: anything which could end up biting you. This code changes the state of a global variable.
    // Revisions: Ver 1 by Bill Bohan
    //            Ver 2 by Joe Blow – fixed off by one error
    //-----------------------------------------------------------

    Routine
    if (whatever)
    {
        code lines
        more code
        lots of code
        extra code // it always takes more code
        return(the_answer)
    } else { // not whatever
        code and more code
        return(a_good_guess)
    }

    //-----------------------------------------------------------
    // Another routine …
    This has saved countless hours when revisiting code I’ve written long ago or when somebody else maintains my code. I worked at one place for eleven years and had to maintain all the code I’d ever written there. When you change the code to actually do what the comment says, there is no need to change the comment. When you change the code and invalidate the comment, you MUST correct the comment. It’s not as much a PITA as finding that the routine does not do what the comment says.
    I do not have my comment compiler (you know, the one that compiles the comments and ignores the code) working yet but having well-written comments reduces the frustration and panic which comes from looking at spaghetti code with a looming deadline.
    Comments are not meant to be great literature. They don’t have to be complete sentences. They are more like road signs to get you where you want to go.
    Other sometimes useful but often volatile sections in the block comment are:
    // Requires: other routines needed by this routine
    // Used by: routines which use this one
    I really liked the cross-reference utility many assemblers have which supplied the above correlations.
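
For languages without such a utility, the "Requires" and "Used by" lists can be generated instead of maintained by hand. Below is a minimal sketch for Python source using the standard `ast` module. The function name `cross_reference` is hypothetical, and it only tracks direct calls by plain name to functions defined in the same source text (no methods, imports, or aliases).

```python
import ast


def cross_reference(source: str):
    """Return (requires, used_by) dicts for functions defined in `source`."""
    tree = ast.parse(source)
    functions = [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
    defined = {f.name for f in functions}

    requires = {name: set() for name in defined}
    used_by = {name: set() for name in defined}
    for func in functions:
        # Walk the function body looking for calls to other known functions.
        for node in ast.walk(func):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id in defined):
                requires[func.name].add(node.func.id)
                used_by[node.func.id].add(func.name)

    return ({k: sorted(v) for k, v in requires.items()},
            {k: sorted(v) for k, v in used_by.items()})
```

Generating the lists on demand sidesteps the volatility problem, since there is no hand-written copy to fall out of date.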

    1. Public API documentation is its own thing, but if you extract the contents of the if and else cases into their own functions, with clear, descriptive names, you instantly remove a lot of the need for comments. It also brings the if condition within visual range of the else, removing the need for the “not whatever” comment, and so on. Version control comments belong in the version control system, no-one really cares that some piece of code has at some point had a bug.

      My personal opinion is that comments are only needed if the intent of the code is not clear, and there is a better way to fix that.
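
A small Python sketch of the refactoring described in this reply. Every name is hypothetical; the bodies of the two branches are pulled out into named functions so the condition and both outcomes fit in three lines.

```python
def looks_like_number(text):
    """Return True if `text` can be parsed as a float."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def exact_value(text):
    return float(text)


def fallback_value():
    return 0.0


def parse_reading(text):
    # The condition and both branches are visible at a glance, so no
    # "not whatever" marker is needed on the else path.
    if looks_like_number(text):
        return exact_value(text)
    return fallback_value()
```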

  19. I write comments when there is something that needs to be noted that a competent programmer (usually me several years later) cannot grok by just reading the code. Too many comments make it impossible to read code. Clearly this is a judgement call. In essence I write comments for myself.

    As for writing code in a way that indicates intent “rather than what the compiler wants” — down that road lies Haskell and functional languages. These are a lot of fun and will bend your mind in ways that it is good for it to be bent, but I have yet to actually do anything but solve puzzles with functional languages. To get real things done, gimmee C any day.

  20. Emacs Org mode and Babel

    Lately I’ve been using Emacs Org mode for structured text editing (think Markdown, but much more fluid: it really grows on you).

    One very nifty feature (via org-babel) is the possibility to embed code snippets (in many, many languages) and execute them, optionally injecting the results into the text being edited.

    The snippets can be chained, either textually (making a bigger program, cf. literate programming) or by passing the result from one snippet to the next.

    Heck, there’s even a facility to embed a single C function and injecting its result into the text: org-babel takes care of putting a C wrapper around it, of generating the necessary printfs and of calling the compiler and the compiled program.

    Here are some links which explain that better than I ever could:

    https://justin.abrah.ms/emacs/literate_programming.html
    http://howardism.org/Technical/Emacs/literate-programming-tutorial.html
    http://orgmode.org/worg/org-contrib/babel/
    http://orgmode.org/worg/org-contrib/babel/how-to-use-Org-Babel-for-R.html
    http://orgmode.org/worg/org-contrib/babel/intro.html

    Wrapping one’s head around Emacs can be challenging, but *this* alone would be worth it.

    1. tl;dr: follow the links at ‘howardism’ to see the mad-power of emacs and literate dev-ops in action

      long-time lurker, new poster here .. i ignored emacs for 25 years until 3 years ago. i have used emacs/org/babel daily ever since. sure, there was a lot to learn, but the combo lends itself to building a platform that does better than any ide.

      i write text as text and code as code. i made cheat-sheets for everything that’s new in the environment as i build it up and i schedule them so that i do a few minutes of reviewing keys and bindings once every couple of days. .. new cheat sheets rotate in, others rotate out, org-mode keeps them all accessible.

      emacs is life-time software. it has not been obsoleted and it’s unlikely that it will ever become obsolete or incompatible with anything. its client-server mode makes for ultra-fast startup and then there’s zile, a nano/pico replacement that starts even faster for editing the tiniest bits faster yet.

      one can write everything that goes into a project into a file to create scripts, config, data and source-files in multiple languages. one can fold one’s way through the thing from start to finish as if navigating an outline. the git-interface is the best ever, hooks can let you fire off the very scripts you’re maintaining.

      so, i’m so sold.

      the caveat again.. it’s life-time software that works like the other stuff should have worked. had i known vi i trust i would have the best vi using the vi top-layer stuff from the repos. the repos are integrated and many contributions are extremely valuable.

      wait.. i was talking about the caveat! — not a week has gone by that i have not made more of this combo. you learn more, you integrate more, you doc and practice more and you get more out of it as you get sucked in. you can get going by just following examples for babel and learning some basic org-mode and emacs controls, and then you’re suddenly hooked and better organized and more efficient and connected than anyone else on your team.

      >>>

      with ‘babel’ (in emacs) LP works two ways in that you make all your data, experiments, tests, invocations, results and proofs part of your documentation. if you have never seen this, think of it as a step by step replayable REPL session that has all your mental notes written or hyperlinked in it.

      there is profoundly less stress in my world from controlling schedules, todos, dev, installs, ops and docs like that. in turn i type much more and use the mouse far less. that works for me. :)

  21. That DISPLAYSURF line looks like a pretty confusing thing to be giving to a beginner. I don’t know python at all, but it looks to me like it is creating a weirdly named variable from the return value of a lots-of-jobs-in-one “set” command, and then never actually using it anywhere.

    If it’s the display surface would it have killed him to type the last three letters?

  22. I’m a novice programmer at best, but here is my secret technique: I write the comments first.

    I lay out a framework of comments describing every block of code before it is written, so that the first thing completed is a thought map of the entire program:

    #import all the needed modules
    #declare important global variables
    #define functions
    #this function blinks the red LED when button 1 is pressed
    #set up the hardware interrupts and callback functions
    #this is the interrupt for button 1 to make the red LED blink
    #start the main program loop
    #handle exceptions
    #allow the user to close the program

    And so on and so forth…
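
To show where such an outline can lead, here is one way the comment skeleton above might be filled in. It is a sketch that simulates the hardware: `register_interrupt`, `state`, and the event list are stand-ins invented for illustration, not a real GPIO API, and "blink" is simplified to toggling the LED state.

```python
# import all the needed modules
import sys

# declare important global variables
state = {"red_led_on": False}
callbacks = {}


# define functions
# this function blinks the red LED when button 1 is pressed
def blink_red_led():
    state["red_led_on"] = not state["red_led_on"]


# set up the hardware interrupts and callback functions
def register_interrupt(button, callback):
    callbacks[button] = callback


# this is the interrupt for button 1 to make the red LED blink
register_interrupt(1, blink_red_led)


# start the main program loop
def main_loop(events):
    for event in events:
        # allow the user to close the program
        if event == "quit":
            break
        # handle exceptions
        try:
            callbacks[event]()
        except KeyError:
            print(f"no handler for button {event}", file=sys.stderr)
```

The original comments survive unchanged as section headers, so the thought map stays readable after the code is written.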

    1. I’d dare say every good programmer does this, but most of them are doing it in their head.

      Doesn’t make it wrong what you’re doing, and in fact it can help point out where something doesn’t make sense/you’ve forgotten something.

  23. I’m a day-job programmer, but in the evenings I code at home on one of umpteen programs I have stashed at various stages of completion. I often go at least several months before I get interested in a particular code project again. That said, I do 99.9% of my coding at home in .NET Express. For those who don’t know the .NET IDE: you can use a TODO keyword, “//TODO” in C# and “'TODO” in VB.NET, and put your comment after the keyword. If you open the window called “Tasks” you will find all your TODOs listed there. Double-click one and the IDE takes you to that line of code. When you’re happy with that piece of code, just remove the keyword and leave the comment. I don’t know if any other IDEs do this, but it might be something to look for.

    I’ll often start a coding project with nothing but the TODOs to do the narrative flow. Fill in the code when I get interested TODO so.
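
For editors without a task window, a similar list can be approximated with a short script. This Python sketch scans source text for `//TODO`, `#TODO`, or `'TODO` markers; the function name `find_todos` and the exact pattern are assumptions, and the pattern may also match a marker that happens to sit inside a string literal.

```python
import re

# Comment leader (// or # or '), optional spaces, TODO, optional colon.
TODO_PATTERN = re.compile(r"(?://|#|')\s*TODO\b:?\s*(.*)")


def find_todos(source: str):
    """Return a list of (line_number, note) pairs for TODO comments."""
    tasks = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        match = TODO_PATTERN.search(line)
        if match:
            tasks.append((line_number, match.group(1).strip()))
    return tasks
```

Printing the pairs gives a rough equivalent of the task list, with the line number standing in for the double-click.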

    1. I really like this concept, though most of my coding is in more basic text editors. I do tend to write “dragon” at the start of a comment line describing something that isn’t working the way it should or remains to be done. It provides me a string to search for that will not likely be used in any other context.

  24. I believe literate programming is employed by some subset of the Haskell community: https://wiki.haskell.org/Literate_programming
    Probably not a large subset.

    I don’t see how literate programming solves either the problem of learning or relearning programming or the problem of good documentation. Programming is serious brain activity and requires practice, practice, practice, like any other form of serious brain activity.

    It may tickle one’s fancy to imagine a paper about one’s program as already having been written by the time the program is finished, but often programming requires trying one thing, realizing it won’t work, and trying something else. Nothing guarantees that the beautiful prose the program is embedded in remains consistent with the program. Indeed, it seems likely that one’s first musings, having suffered the drifting away of the code it used to describe, will remain unedited “until I get a free moment.” Finally you get a program that works, consisting of pieces of code scattered around a paper that, now that you think of it, should have been organized completely differently, and in any case is simply wrong. So what do you do?

    Chances are you will begin to wish you had a program organized the way programs are supposed to be organized, and a paper organized the way papers are supposed to be organized. A well organized program captures a set of useful abstractions, what’s revealed and what’s hidden in the implementation of these abstractions, and so forth.

    A “white paper” describing this program is going to conceal a lot of detail. For instance, it might describe a problem, then a series of flawed but revealing solutions, each of which avoids some problem with the previous version, until the “actual” program is described. None of these looks much like the real code, which has to be loaded up with distracting implementation details. Furthermore, the earlier, flawed solutions are almost never “subsets” of the final version. It’s up to the writer of the prose to find a sequence of abstract programs that converge, in some hopefully revealing sense, on the actual solution. This is even more difficult than programming, which is why so few programmers write papers.

    The next time you have to write a program you will probably decide literate programming is not for you.

  25. I think literate programming might work just like self-documenting code and writing comments or any other approaches where you explain what the actual chunk of code does. Those are just different styles of coding with different advantages and drawbacks and different methodology. There is no golden hammer here if you ask me.
    By writing text you can use variable names like x, y, z and function names like f, g, h, as we used to do in math class. Another advantage is that it is easy to express yourself in multiple sentences. You can add links or maybe even images, and you end up with instant documentation. In exchange you need to rewrite a relatively long text if you want to change the code, which feels like a burden when you are experimenting with different models or your customers/managers have a new idea each day. If you are not maintaining the text it will get out of sync, which might be even worse than undocumented code. I guess you can avoid that by writing the new text first, but to be honest I never tried literate programming, just BDD with cucumber, and that was enough for me: it was too rigid. I would do it only by writing short, very abstract sentences and self-descriptive code, as I currently do with BDD without cucumber. That gives the flexibility I need in the implementation.
    With self-descriptive code you don’t need to write documentation for the contributors of the project, and that way you cannot get out of sync, but you need to somehow compress the same information you would write as text into function and variable names, which is far from trivial. With a single function you would write very long variable names. To make those names shorter you need to introduce more functions and use the local context. That makes a lot of extra code which you have to maintain, and it takes some experience to find the proper function size; otherwise you’ll over-engineer and add only noise. In the “Clean Code” book there are hints about how to do it properly. An advantage of this approach is that organizing the code and finding the proper names usually helps to improve your model, which you normally don’t get by writing text only. The same might be a disadvantage too, because if you try to write perfect code with perfect names, you can end up in an infinite loop. Another advantage is that it is somewhat easier to change, because it is not as verbose as text.
