Linux Fu: The Shell Forth Programmers Will Love

One of the most powerful features of Unix and Linux is that, with traditional command-line tools, everything is a stream of bytes. Granted, modern software has blurred this a bit, but at the command line, everything is text with certain loose conventions about what separates fields and records. This lets you do things like take a directory listing, sort it, remove the duplicates, and compare it to another directory listing. But what if the shell understood data types other than streams? You might argue it would make some things better and some things worse, but you don’t have to guess. You can install cosh, a shell that provides tools to produce and work with structured data types.

The system is written in Rust, so you will need Rust set up to compile it. For most distributions, that’s just a package install (rust-all in Ubuntu-like distros, for example). Once you have it running, you’ll have a few new things to learn compared to other shells you’ve used.

Examples

A good way to get a quick flavor of the shell’s idiosyncrasies is to contrast it with the usual shell syntax. The GitHub page has several good examples:

  • Find files matching a path, and search them for data:

sh: find . -iname '*test*' -print0 | xargs -0 grep data
cosh: lsr; [test m] grep; [f<; [data m] grep] map

  • Find the total size of all files in the current directory:

sh: ls | xargs stat -c %s | awk '{s+=$1} END {print s}' -
cosh: ls; [is-dir; not] grep; [stat; size get] map; sum

  • Get the second and third columns from each row of a CSV file:

sh: cut -d, -f2,3 test-data/csv
cosh: test-data/csv f<; [chomp; , split; (1 2) get] map

  • Sort files by modification time:

sh: ls -tr
cosh: ls; [[stat; mtime get] 2 apply; <=>] sortp

As you can see, sometimes commands are a little longer, but presumably, there is less to remember, and it is a bit more self-documenting.
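For readers who think more easily in a conventional language, here is a rough Python sketch of the same structured-data idea behind the total-size and sort-by-time examples. It is only a comparison, not how cosh works internally, and the function names (file_stats, total_size, oldest_first) are made up for illustration. Unlike ls -tr, the sketch skips directories, as the total-size example does.

```python
import os

def file_stats(directory="."):
    """Return (name, stat_result) pairs for every non-directory entry."""
    with os.scandir(directory) as entries:
        return [(e.name, e.stat()) for e in entries if not e.is_dir()]

def total_size(directory="."):
    """Sum the sizes in bytes, roughly `[stat; size get] map; sum`."""
    return sum(st.st_size for _, st in file_stats(directory))

def oldest_first(directory="."):
    """File names ordered by modification time, oldest first."""
    pairs = sorted(file_stats(directory), key=lambda pair: pair[1].st_mtime)
    return [name for name, _ in pairs]
```

Calling total_size(".") or oldest_first(".") works on the current directory. The point is that each step passes real objects (names, sizes, timestamps) instead of text that has to be re-parsed.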

But Why?

The key idea is that this shell understands multiple data types. In particular, it can deal with hash maps, sets, and lists. Basic items include booleans, integers (32-bit or of arbitrary size), floats, and strings.

The input prompt is more like a command prompt for a programming language. In fact, Forth programmers will appreciate the RPN capabilities:

/tmp/cosh$ 5 3 /
1
/tmp/cosh$ 5.0 3 /
1.6666666666666667
/tmp/cosh$
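If RPN is new to you, the following toy evaluator in Python mimics the two divisions shown above: two integers divide as integers, while a float operand produces a float. It is a sketch of the idea only. The rpn function is hypothetical, and it makes no claim about how cosh handles negative operands or other corner cases.

```python
import operator

def rpn(line):
    """Evaluate one line of whitespace-separated RPN tokens; return the stack."""
    ops = {"+": operator.add, "-": operator.sub, "*": operator.mul}
    stack = []
    for token in line.split():
        if token in ops:
            b, a = stack.pop(), stack.pop()
            stack.append(ops[token](a, b))
        elif token == "/":
            b, a = stack.pop(), stack.pop()
            if isinstance(a, int) and isinstance(b, int):
                # Integer operands give an integer result (Python floors here).
                stack.append(a // b)
            else:
                # Any float operand gives a float result.
                stack.append(a / b)
        else:
            # Operands: a token with a dot is a float, anything else an integer.
            stack.append(float(token) if "." in token else int(token))
    return stack

print(rpn("5 3 /"))    # [1]
print(rpn("5.0 3 /"))  # [1.6666666666666667]
```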

Storing into variables is similar to Forth, too, using ! and @ with the RPN-style notation. In fact, it all looks like Forth from swap and drop to the way if controls conditionals. However, the stack doesn’t persist between lines, so the examples above do not leave their results on the stack.

The documentation on GitHub is good, but there are a few things you’ll have to work out. For example, the string function ++ is documented, but the example uses the word append, which doesn’t seem to work.

/tmp/cosh$ hacka day ++
hackaday
/tmp/cosh$ hacka day append
hacka
day
append

Commands and Regular Expressions

Most shell commands exist in cosh, too, but not necessarily as external tools. Some have aliases, too. For example, you can use mv, but you can also use rename. Everything is, of course, using the RPN format for arguments.

If you want an external command, prefix it with a dollar sign or, in an interactive shell, with a space if you prefer. For example, if you run ls, you’ll get the cosh version of ls. If you run $ls, that’s the actual ls command you expect. If you put the external name in braces, what is returned is a generator that allows you to walk through the output.
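The generator idea may be easier to picture with a Python analogy: run an external program and hand back its output one line at a time instead of as one blob of text. This is only an analogy. The command_lines helper is hypothetical, and the demo runs a tiny Python one-liner as a portable stand-in for a tool like ls.

```python
import subprocess
import sys

def command_lines(argv):
    """Yield the command's standard output one line at a time."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")

# Portable stand-in for an external tool such as ls.
demo = [sys.executable, "-c", "print('one'); print('two')"]

for line in command_lines(demo):
    print(line)
```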

What’s a shell without regular expressions? With cosh, you have an “m” expression that tells you if you have a match or a “c” condition that returns captures from the expression. There are also “s” expressions for search and replace. You can also add flags to allow different options like case insensitivity.

I found the capture part confusing. You’d think it would provide a list of things matched in parentheses, but either it doesn’t or I couldn’t find the right syntax to make it do so. For example:

/tmp/cosh$ name=al "name=(.*)$" c
(
   0: name=al
)
/tmp/cosh$ name=al,name=jim "name=([a-zA-Z]+)/g" c
(
   0: name=al
   1: name=jim
)
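To make the distinction concrete, here is the same pattern in Python, which separates the whole match (group 0) from the parenthesized capture (group 1). This only illustrates what you might expect a capture operator to return. It says nothing about cosh’s regex engine.

```python
import re

text = "name=al,name=jim"
pattern = re.compile(r"name=([a-zA-Z]+)")

# Group 0 is the entire match; group 1 is the parenthesized part.
whole_matches = [m.group(0) for m in pattern.finditer(text)]
captured = [m.group(1) for m in pattern.finditer(text)]

print(whole_matches)  # ['name=al', 'name=jim']
print(captured)       # ['al', 'jim']
```

The cosh output above corresponds to the first list, while the second list is what I had expected to see.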

The documentation shows some examples of this that don’t work exactly right, too. For example, try this from the documentation:

/tmp/cosh$ asdf "(..)" c

To get the result the document shows, you need the /g flag on the regular expression. Otherwise, only the first match appears.
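The first-match versus all-matches behavior is common to most regex libraries. As a point of comparison only, Python’s re.search stops at the first match, while re.findall plays the role of a global flag:

```python
import re

first = re.search(r"(..)", "asdf")
every = re.findall(r"(..)", "asdf")

print(first.group(1))  # as
print(every)           # ['as', 'df']
```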

Parsing

One big feature of cosh is that it can parse JSON and XML. It can also write out files in those formats. We’d love to see a proper CSV parser, although that’s a little easier to handle directly with cosh primitives than an XML file.

For example, if you want the 3rd and 4th fields from a CSV file, you can read it and use the split and get functions in a map:

/tmp/cosh$ test.csv f<; [chomp; , split; (2 3) get] map

Of course, that’s not going to handle things like quoted values, but that’s typically a problem in other simple shell scripts, too.
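The quoted-value problem is easy to demonstrate. In this Python sketch, with made-up sample data, a naive comma split miscounts the fields in a row that contains a quoted comma, while the standard csv module handles it:

```python
import csv
import io

# Hypothetical sample: the second row has a quoted comma and escaped quotes.
sample = 'id,name,city,note,extra\n1,"Smith, Al",Austin,"likes ""Forth""",x\n'

naive = [line.split(",") for line in sample.splitlines()]
parsed = list(csv.reader(io.StringIO(sample)))

print(len(naive[1]), len(parsed[1]))  # 6 5
print(naive[1][2:4])                  # [' Al"', 'Austin']
print(parsed[1][2:4])                 # ['Austin', 'likes "Forth"']
```

The slice [2:4] picks the 3rd and 4th fields, counting from zero. Only the real parser returns the intended ones.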

Working with JSON is easy. For example, if you want to find the fields that match a regular expression, you can do that:

file.json f<; from-json; keys; [.{4} m] grep;
v[gen (
0: asdf
1: qwer
2: tyui
3: zxcv
)]  # from the official examples
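For comparison, the same key filter in Python looks like this. The JSON document is invented for the example (the short id key is there to show something being filtered out), and the sketch assumes that m behaves like an unanchored search:

```python
import json
import re

# Hypothetical input standing in for file.json.
document = json.loads('{"asdf": 1, "qwer": 2, "tyui": 3, "zxcv": 4, "id": 5}')

# Keep the keys that contain at least four characters.
matching_keys = [key for key in document if re.search(r".{4}", key)]

print(matching_keys)  # ['asdf', 'qwer', 'tyui', 'zxcv']
```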

Winner?

Will we start using cosh every day? Honestly, probably not. But it is interesting. We might keep it in our back pocket for writing scripts that need to process JSON or XML.

We like the Forth syntax, but not everyone will. We also like the data typing. But as a general-purpose shell, it seems to leave something to be desired.

Of course, what we really like is that Linux gives you choices. If you like cosh, knock yourself out. Don’t like it? Pick another shell or, if you are feeling brave, write your own. The world is your oyster.

We couldn’t help but think of the database-like Crush shell while playing with cosh. Then there’s cat9, which is a strange shell, indeed. There are also some more mainstream alternative shells like zsh and fish.

28 thoughts on “Linux Fu: The Shell Forth Programmers Will Love”

  1. PowerShell for Linux is probably a better bet. It understands structured data types as objects with named properties. (I.e. a directory listing has a named size field with an integer type)

  2. Thanks for sharing news of this shell, seems interesting. That said, I personally think the most advanced shell for a long time has been PowerShell, and the recent developments in other shells are just playing catch up with features that PowerShell has had for decades. If you think this is hyperbole, try exploring it, you can use it on Linux now too.

    1. While I mostly agree with you, the statement you quoted is talking about processing json or XML. Yes you can use tools to parse the json/xml, however I think the author is talking about direct processing in the shell itself without external tools.

      Personally I feel that Bash is still the best shell available for several reasons:

      1) It’s prolific across all distros of Linux over the past 20+ years, and even shells on windows like Cygwin and MSYS, etc.
      2) If you work as a system admin (as I do), you will appreciate that you have a common and consistent shell to work with across different distros and releases, avoiding the need to install a shell for some outlying script written in an odd syntax few people understand and/or can maintain.
      3) Extra features like json/xml processing should not be in the shell, it should be an external tool so as to avoid bloating the shell with features very rarely used, and avoiding extra code that could lead to system exploitation due to bugs.

      Why I feel that Cosh (and other less common shells) should not be encouraged:

      1) People new to Linux are already timid or don’t like working in a shell, trying to educate them via remote chat to use their shell when some random distro/user has decided to change to some non-common shell causes all kinds of issues and confusion.
      2) Some young inexperienced system admin somewhere will insist and convince their manager that shell `X` is superior and they should use it, only to find out that down the track when there are issues there is little to no support after investing considerable time writing scripts that make use of it.
      3) If there is too little interest/support in the project to maintain it and the project dies, you are left stuck on an old version of some shell that is buggy and/or incomplete.

      The Linux landscape is already a support nightmare for those of us that provide it to the community. When rendering help to someone they could have their system configured in one of a million ways, to the point where the majority of all support starts with asking a ton of questions to try to determine how they have setup their OS to just debug a fault.

      1. I agree, problems like not having nested dicts in bash complicate XML/JSON parsing.

        Also, agree on all your other points. Although, ‘s/Some young inexperienced system admin/Some system admin'[1], The system admin who insisted in using csh (and modifying the bash scripts to csh scripts) always had complications when rolling out releases.

  3. Ya gotta be kidding. Who wants to type so many crammed together shifted characters? A Forth-like language would turn all that into words with names that put together readable statements.

  4. > We might keep it in our back pocket for writing scripts that need to process json or XML.

    You know Python exists, right? I honestly fail to see what problem this shell is trying to solve (other than imaginary problem of other shells not being written in Rust).

    1. Despite using Python, I’m personally not a huge fan. Inside Jupyter it is ok, but the whole virtual env seems to be a pain and not using it is even worse since everything seems to break everything else.

      I’ve refrained from commenting on the whitespace thing, but I personally hate that too. There are too many times when white space gets mangled (paste from browser or whatever) and I’d rather manage braces that don’t get eaten and don’t get forgotten without a compile problem. I do like some of the features, but not nearly all of them. Anyway, the nice thing is there are lots of tools for everyone and you don’t have to use the ones I like.

      1. I agree. I’m not a big fan of Python either. But between a tool used by millions of programmers and an obscure shell I’d rather use the former since it doesn’t look like the latter has any meaningful advantages.

        > Anyway, the nice thing is there are lots of tools for everyone and you don’t have to use the ones I like.

        You say that but next thing that happens is you commit it to ops repository, I join the team, you move on to another firm and suddenly I have to maintain bunch of esoteric scripts.

        1. > You say that but next thing that happens is you commit it to ops repository, I join the team, you move on to another firm and suddenly I have to maintain bunch of esoteric scripts.

          !!! i’m irrationally mad at this ‘forth’ shell and this reminds me of an anecdote

          5 years ago, i tried to install rust on an arm-linux-elf. back then, rust was not a path-well-travelled. one of the problems i ran into was a script which used #!/bin/sh but relied on a bash-ism. debian has been directing /bin/sh to “dash” or similar, so this failed on my system. but all these bourne shell- or ksh-derived things are basically the same so it was easy for me to see what needed to change. i even pushed a patch for everyone else (a minor drive-by bugfix from a new user, the sine qua non of open source), because the problem was so shallow.

          if you’re doing something more complicated, the arguments for perl vs bespoke hack vs whatever gets more involved. but if it’s a straightforward shell task, there are actual advantages to it looking like straightforward shell syntax.

  5. Thanks but I’m not a masochist. If it runs Python I will use Python, that way if I am forced to use Windows or Mac at work I won’t have to learn something new.

    Time spent learning is time not spent creating.

  6. Forth is far more than RPN and defining data objects. The beauty of Forth is the ability to define your own vocabulary for the domain/discipline you are working in. Then the actual “Business Logic” or CLI that you are using it for is designed so the folks in the discipline get their work done efficiently. Statisticians can run statistics the same way they would describe it to a colleague. Chemists may have a whole different dictionary AND grammar. Programmers might want to think of it as Syntactic Sugar, but it’s more than that.
