Linux Fu: Shell Script File Embedding

You need to package up a bunch of files, send them somewhere, and do something with them at the destination. It isn’t an uncommon scenario. The obvious answer is to create an archive — a zip or tar file, maybe — and include a shell script that you have to tell the user to run after unpacking.

That may be obvious, but it assumes a lot on the part of the remote user. They need to know how to unpack the file and they also need to know to run your magic script of commands after the unpack. However, you can easily create a shell script that contains a file — even an archive of many files — and then retrieve the file and act on it at run time. This is much simpler from the remote user’s point of view. You get one file, you execute it, and you are done.

In theory, this isn’t that hard to do, but there are a lot of details. Shell scripts are not compiled — at least, not typically — so the shell only reads what it needs to do the work. That means if your script is careful to exit, you can add as much “garbage” to the end of it as you like. The shell will never look at it, so it’s possible to store the payload there.
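A quick way to convince yourself of this is a throwaway sketch (the file name and contents here are made up for illustration):

```shell
# Build a tiny script with trailing "garbage" after the exit,
# then run it: the shell stops at exit and never parses the rest.
cat > demo.sh <<'EOS'
#!/bin/sh
echo "script ran"
exit 0
None of this trailing text is ever read by the shell.
EOS

sh demo.sh    # prints: script ran
```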

So Then What?

The only trick, then, is to find the end of the script and, thus, the start of the payload. Consider this file:

#!/bin/bash

WORKDIR=$( mktemp -d )

#find last line +1

SCRIPT_END=$( awk '
  BEGIN { err=1; } 
  /^\w*___END_OF_SHELL_SCRIPT___\w*$/ { print NR+1; err=0; exit 0; } 
  END { if (err==1) print "?"; }
' "$0" )

# check for error

if [ "$SCRIPT_END" == '?' ]
then
   echo Can\'t find embedded file
   exit 1
fi

# Extract file
tail -n +$SCRIPT_END $0 >"$WORKDIR/testfile"

# Do something with the file
echo Here\'s your file:
cat "$WORKDIR/testfile"
echo Deleting...
rm -r "$WORKDIR"
exit 0
# Here's the end of the script followed by the embedded file
___END_OF_SHELL_SCRIPT___
A man, a plan, a canal, Hackaday!

Not exactly a palindrome, but there's no pleh for it.

Multiple Files

If you don’t mind transmitting script files full of binary garbage at the end, the recovered file might just as well be a compressed tar file or a zip file. The trick is to create your base script and append the file to it. So I might have deliver.sh0 as the entire file up to and including the ___END_OF_SHELL_SCRIPT___ identifier. Then, assuming a payload archive named files.tar.gz, you can create the final script with:

cat deliver.sh0 files.tar.gz > deliver.sh
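Putting the whole round trip together might look like this sketch; the file names (files.tar.gz, deliver.sh0, deliver.sh) and the payload contents are invented for illustration:

```shell
# Pack some files, then glue the archive onto the base script.
set -e
mkdir -p files
echo "hello" > files/greeting.txt
tar czf files.tar.gz files

# deliver.sh0: everything up to and including the marker line.
cat > deliver.sh0 <<'EOS'
#!/bin/bash
WORKDIR=$( mktemp -d )
SCRIPT_END=$( awk '/^___END_OF_SHELL_SCRIPT___$/ { print NR+1; exit }' "$0" )
# Everything after the marker is a gzipped tar archive
tail -n +$SCRIPT_END "$0" | tar xzf - -C "$WORKDIR"
cat "$WORKDIR/files/greeting.txt"
rm -r "$WORKDIR"
exit 0
___END_OF_SHELL_SCRIPT___
EOS

# The final script is just the base script plus the binary payload.
cat deliver.sh0 files.tar.gz > deliver.sh
chmod +x deliver.sh
./deliver.sh    # prints: hello
```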

Encode, Reuse, Recycle

Sometimes you don’t want binary characters cluttering up your shell script. Maybe you want to e-mail the script and you are afraid of what the various mail systems in the path might do to your data. It is easy enough to encode your binary data as text strings (with the associated size penalty, of course). For example, again assuming a payload named files.tar.gz, you could just as easily say:

cp deliver.sh0 deliver.sh
base64 files.tar.gz >> deliver.sh

To recover the file, you’d need some additional work in the main body of the script, specifically after the tail command.

tail -n +$SCRIPT_END $0 | base64 -d >"$WORKDIR/testfile"

Of course, you don’t have to store the file. You could just feed it to another program. A tar archive, for example, might have the line:

tail -n +$SCRIPT_END $0 | base64 -d | tar xf -
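As a sketch of the complete base64 variant (again with made-up file names and contents), the encode and decode sides pair up like this:

```shell
# Create a payload and a base script that decodes a text-safe payload.
set -e
mkdir -p pkg
echo "data" > pkg/file.txt
tar cf payload.tar pkg

cat > deliver.sh0 <<'EOS'
#!/bin/bash
WORKDIR=$( mktemp -d )
SCRIPT_END=$( awk '/^___END_OF_SHELL_SCRIPT___$/ { print NR+1; exit }' "$0" )
# Decode the text payload back to binary, then unpack it
tail -n +$SCRIPT_END "$0" | base64 -d | tar xf - -C "$WORKDIR"
cat "$WORKDIR/pkg/file.txt"
rm -r "$WORKDIR"
exit 0
___END_OF_SHELL_SCRIPT___
EOS

# Append the payload as plain text instead of raw binary.
cp deliver.sh0 deliver.sh
base64 payload.tar >> deliver.sh
chmod +x deliver.sh
./deliver.sh    # prints: data
```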

Naturally, your script can do whatever you need to do to get ready and then maybe process the files after you unpack. You might, say, install a library or a font or merge a patch to the system’s existing files.

You could even embed an executable file in a script — even another script — and then execute that script which might unpack another script. It boggles the mind. Just remember that not every system will allow executables to run from /tmp or from other file systems mounted noexec, so plan accordingly.
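For instance, an executable payload might be unpacked somewhere other than /tmp. This sketch extracts into a directory under $HOME (an assumption — pick any location you know isn’t mounted noexec):

```shell
# outer.sh carries a second script as its payload and runs it.
set -e
cat > outer.sh <<'EOS'
#!/bin/bash
# Assumption: $HOME is not on a noexec mount; /tmp sometimes is.
WORKDIR=$( mktemp -d -p "$HOME" )
SCRIPT_END=$( awk '/^___END_OF_SHELL_SCRIPT___$/ { print NR+1; exit }' "$0" )
tail -n +$SCRIPT_END "$0" > "$WORKDIR/inner.sh"
chmod +x "$WORKDIR/inner.sh"
"$WORKDIR/inner.sh"
rm -r "$WORKDIR"
exit 0
___END_OF_SHELL_SCRIPT___
#!/bin/bash
echo "inner script ran"
EOS

chmod +x outer.sh
./outer.sh    # prints: inner script ran
```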

Script Doctor

While bash scripting is often maligned and not without reason, it is very flexible and powerful, as this example shows. It is dead easy to embed files in a script and that opens up a lot of flexible options for distributing complex file setups and applications.

If you are writing serious bash scripts, we suggest you write them carefully. You can even find a “lint” program that can test for errors for you.

25 thoughts on “Linux Fu: Shell Script File Embedding”

    1. Yip… good old Minix days. Sure….

      …or do you remember the tricks like `cat a.tar`?
      Or images that look different depending on their extension?

      Lots of fun for the whole supervillain family!

  1. It does bring back memories of my early UNIX days. HP/UX days, to be more precise. HP support used a similar shell script transport for binaries, much like the one above. It did surprise me.

  2. This works great, if you assume your user can receive binary files, has a compatible shell installed (and uudecode and sed for shar…).
    I recently had to stuff a few hundred megabytes of data over a serial connection, with no common set of transfer protocols available on both systems. In the end I had to stuff a copy of z-modem on a floppy disk, after
    1: transferring 200MB of TAR archives over floppy failed (even gzipped and split to 48 disks, I could only get to disk 12 before the target machine would stop recognizing any more of the disks, even when using a GoTek running FlashFloppy.)
    2: building a shar archive, only to find out that I was missing uudecode on the target system.
    3: seeing that the manual page for the included communications software makes mention of supporting zmodem via sz/rz, but doesn’t actually include sz/rz.
    4: trying kermit, only to have it segfault on both host and client (presumably due to different versions?)

    1. That’s rough. In the old days I’d put LapLink on a floppy and plug in either the parallel cable or a null modem cable to transfer the rest.
      In the embedded world, I’d often get stuck on a system that had only a simple boot monitor in FlashROM and I’d have to pop the chip off and reflash it (one of those coffin style sockets for surface mount parts). Once I hacked y-modem into our boot monitor over the weekend, things went a lot smoother and there were far fewer software developers hanging out in the hardware lab.
      Years later, I realized I could have done a simple serial upload in Intel HEX (or SREC) format instead of a file transfer protocol (x/y/z modem) and skipped the flashing nonsense, using a short checksum utility to ensure the program was transmitted successfully. But once you get used to a particular workflow, even if it is a bad one, it’s hard to break away from that same thinking.

        1. PLIP is even newer. I used to run LANtastic between my two DOS computers over a parallel cable. The second computer had a Hercules card and therefore a second printer port, allowing me to share a printer to boot!

    2. Sad to hear of your Kermit problems – it’s always been one of the most reliable in my experience, as well as one of the fastest (despite its reputation) as long as you use modern versions that can negotiate all the options. And negotiation is its superpower, unlike all the zmodem variants that can’t talk to each other.

  3. Using AWK is just sloppy because A) you’re just writing a program inside a script and B) AWK isn’t properly implemented on most systems. Using sed would be a much better choice.

    1. I love this response so much – professionally I tend to replace awk invocations with sed wherever I can … And don’t even get me started on `cat | grep | awk […]`

  4. In my org, we have an automated process that uses a session’s history and a diff of all files changed to assemble a shar file of all the things that were changed. This gets copied up to a file server and the file is named after the change request or ticket that spurred the change. When done right, it means that we can restore a machine from an old backup, then just execute a pile of scripts off the file server to restore the machine to working condition (Application content like databases and stored files is kept on a high-performance SAN).

    Our mail server, though, will kill such scripts as well as anything else similar to it. One of the filter stages detects an attachment’s magic, then searches for blocks of other data to find any magics located in the file, and will kill the file if it doesn’t match the outer file. The purpose is to kill innocuous files (Like word documents, PDFs, etc) that contain a malicious payload. It’s annoying, but it has saved our asses more than once.

    1. > When done right, it means that we can restore a machine from an old backup, then just execute a pile of scripts off the file server to restore the machine to working condition.

      To restore (and to set up new) machines to working condition, tooling like Ansible and Saltstack is available.

      1. Because we don’t need, nor want, a tool that requires specialized training for something that isn’t all that much better than what we do already. Using a shar means we can use standard tools and our people don’t need to know more than moderate Linux skills rather than forking over piles of cash to some company that might just disappear next year when some other tech company swallows them, or some other tool becomes the dominant fad.

        And I’m not sure where everyone buys their computers, but ours don’t break so much we need an automated tool to fix them. We have 4300 machines, and 100 of those live in a factory that exposes those machines to pretty much every hazard known to humankind, and we still don’t get more than a handful of break/fix, and maybe a dozen or two software tickets in a month.

  5. While I’m sure there are situations where this could come in handy, I find it quite clunky to have to futz around with offsets or use awk, sed or similar for this. Seems overly brittle for limited benefit. I could see the benefit mostly if it’s a large blob that would frequently change and then be auto-appended to the script.
    In most situations where I needed a shell script to come with some embedded files, here documents were the much better solution.

    In bash a typical here document idiom would look like this:
    cat <<EOF > file_with_stuff
    Stuff goes here
    More stuff
    EOF

    … or skip the temp file altogether and just pipe the here document directly to whatever it is you’re doing.

    Similar constructs exist in a great many scripting languages.

      1. Grmbls. I only now saw the hackaday engine gobbled the bash syntax in my comment. Turns out it tries to interpret “less-than somethingsomething more-than” as some sort of markup, and won’t show the text at all if the interpretation fails. I’ve tried again with html syntax, but the hackaday engine responds with “invalid security token”. Huh.

      Is there documentation on the hackaday comment engine’s syntax somewhere? I really feel kinda stupid but I didn’t find anything.

      I’ll dumb it down so the engine won’t gobble it. Please use brain-sed to replace less-than and greater-than with the correct syntax…

      cat less-thanless-thanEOF greater-than file_with_stuff
      Stuff goes here
      More stuff
      EOF file_with_stuff

  6. A) Using a program inside a script… what, you can only use shell builtins? What do you think a shell script is? Sed is Turing complete as well, so pretty much using it is also “writing a program”… other than a worse syntax than awk and being harder to read, who cares which you use? B) I’ve been using awk inside scripts for over 30 years now with no ill effects. There are also zero awks that would die on the simple usage above. Did someone dear to you die from awk-poisoning or something? (Just noticed shar was still installed on the mac by default, I haven’t used that in a long time….)

  7. This post reminds me of my time spent testing the blackberry playbook. I needed to deliver a tarball payload to the devices in the lab … But they didn’t have gunzip, or the gzip libs installed, so I wrote a script with embedded gunzip and untar binaries (statically linked), and the tarball artifact containing all the tests to extract and run the suites on the devices. I also made a script to weave together the script body and the latest artifacts from the build so we could make a Hudson job for it. It wasn’t a very sophisticated solution, but it was effective in getting the tests onto the targets, running the test suites, and reporting the results back.
