Step Up To The 1 KB Challenge

1 kilobyte. Today it sounds like an infinitesimally small number. Computers come with tens of gigabytes of ram, and multiple terabytes of storage space. You can buy a Linux computer with 1 gig of RAM and secondary storage as big as the SD card you throw at it. Even microcontrollers have stepped up their game, with megabytes of flash often available for program storage.

Rapidly growing memory and storage are a great testament to technology marching forward to the beat of Moore’s law. But, we should be careful not to forget the techniques of past hackers who didn’t have so much breathing room. Those were the days when code was written in assembly. Debugging was accomplished with an expensive ICE (an In Circuit Emulator… if you were working for a big company), or a few LEDs if you were hacking away in your basement.

To keep these skills and techniques in play, we’ve created The 1 kB Challenge, a contest where the only limit is what you can do with 1 kB of program memory. Many Hackaday contests are rather loose with constraints — anyone can enter and at least make the judging rounds. This time 1 kB is a hard limit. If your program doesn’t fit, you’re disqualified, and that is a challenge worth stepping up to.

That said, this is Hackaday, we want people to be creative and work around the rules. The important thing to remember is the spirit of the design constraints: this is about doing all you can with 1 kB of program space. Search out the old and wise tricks, like compressing your code and including a decompression program in your 1 kB. Crafty hacks to squeeze more into less is fine. Using the 1 kB as a bootloader to load more code from an SD card is not fine.


Any Hackaday contest needs some awesome prizes, and this one is no different.

  • Grand prize is a Digi-Comp II kit from EMSL
  • First prize is a Maker Select 3D printer V2
  • Second Prize is a Bulbdial Clock kit from EMSL
  • Third Prize is a Blinkytile kit from Blinkinlabs

The full rules, and more information can be found on the 1 kB Challenge contest page. Check it out, then put on your thinking cap, and get hacking!

116 thoughts on “Step Up To The 1 KB Challenge

  1. Are there any restrictions on from which country you are participating?
    Consider this :

    Someone built a ir prank device using attiny13, which was featured on HaD .
    You press any key on your ir remote, it will record it and play ot back after 30 seconds, if its the power key, the TV will turn off after 30 sec

    Attiny13 has 1kB memory.

    1. Yep. That was mine. Actually it was an attiny10. I was considering entering it in this contest, but I’ll need to make it a bit more interesting.
      Entering from Japan has never been a problem for HaD contests, but I imagine the usual list of ineligible countries remains.

          1. @the regnirps
            I would understand if the discussion was about 1000 vs 1024 bytes, but it’s stated quite clear that the limit is imposed on program memory, which means the code, which means there are no limits on ram usage.

          1. MrX: probably since before you were born. Traditionally, “byte” just meant something smaller than an instruction word. It’s only since the Intel 8080 series that “byte” has been pretty well fixed at 8 bits.

          2. A byte is formally defined as the smallest uniquely addressable unit of computer memory. Many computers did not have a byte oriented architecture; a word was the smallest addressable unit. It was the IBM System/360 that really popularized the use of byte oriented architectures.

  2. I believe we have a winner! A digital oscilloscope in 1K

    Here is a quote from the build:
    “The processor is not only pushed to the limits with respect to speed, also the 1k word program memory is literally used to the last byte. The RAM memory of 64 bytes was even a few bytes to small. Fortunately timer 1 was not used so that its registers could be used as 2 additional bytes of memory.”

    I hope Ronald will not be disqualified for the use of the register.

    1. Smashing this together with the ‘Quark85’ project would allow the ‘u-scope’ to work on a VGA screen. I certainly have spares of those lying around, but things with an analog (PAL or NTSC, take your pick) are considerably harder to find.
      I’d also go with the Atmel architecture. I like PICs, but the architecture is just so damn ‘scattergun’!

  3. How about utilisation of pre-existing firmwares, e.g., on 8-bit computers? How much of the underlying OS can you use?

    I understand using say a ~100 byte decompression algorithm in your 1KB binary to decompress the other ~900 bytes, but that’ll turn into an experiment in finding the best decompressor – you might as well declare the competition to be a 4KB no-compression contest!

    1. Ah, “the size of any rom routines or data tables must be included in the 1 kB calculation.” – which really means you can’t use the firmware, as it’s likely more generic than a custom tightly coded routine.

      1. Looks like they just went off “MSRP value”, not what someone would actually want.

        Unless you can code this thing quick the value of the prizes is hardly worth the time invested in programming. Well, there are bragging rights, I guess.

      1. For clarity, that means that loading any data from any external media (Internet, SD card, human mashing buttons) is illegal? What if that was the point of the project? I’m okay with that answer, I’m just establishing whether the parameters effectively rule out projects that can load resources from the environment.

        1. Loading data from external media is fine. 1 kB is a limit on the codespace, and any predefined data tables. A 4 function calculator is a great example of this – having a human user enter data at a keypad (the numbers to be operated on, and the operation) is totally legal.

  4. Hm, so the challenge is to use a total of 1kB of data for the application.

    First, do we mean 1000 Bytes, or 1024 bytes? I think the author meant 1024, as it’s a defacto standard)
    Secondly, what byte size do we mean? Some architectures have 4 bit bytes, while others have 36 bit bytes. I think the author meant one byte as in 8 bit.
    And third, if we were to use an architecture that encodes multiple pieces of data in one byte, then would be still follow the rules?

    And we can encode multiple pieces of data/instructions in one byte all ready, it’s called compression, and “chopping off” bits from each byte and then “gluing” them back together into fewer bytes isn’t a new thing.

        1. Technically yes but byte have for a long time (60ies) been associated with 8 bits. Anybody in the business that hears byte will replace it with 8 bits unless the size of the byte is explicitly specified, not doing that would lead to problems.

          But one have to remember that byte can be of a different size in older computers/technical literature.

  5. I once coded a program in four bytes for DOS and 486. Yes, four bytes. It called an interrupt vector in the BIOS for a warm reboot allowing switching from EMS to XMS memory configs in a couple of seconds. Don’t really count as it needed the bios. :)

  6. ” This time 1 kB is a hard limit. If your program doesn’t fit, you’re disqualified”

    I would say, well i hope your program does something wonderfull before it needs byte 1025

    1. I thought long and hard about this one. 2×16 character LCD’s would blow the 1 kB limit out of the water. The venerable Hitachi HD44780 controller which runs 2×16 LCDs contains a micro – and mask ROM. It also contains way more than 1 kB of character table ROM. You’ll have to find a different way to display your data. Remember – binary is fine! So are BCD to 7 segment decoders and associated LEDs.

        1. What should be better with them? They need high voltage and just look oldfashioned. If you like the looks then use them, but I don’t – so please don’t call them “better”.

      1. Adaim, HD44780 is quite complex logic device, but contains no micro inside. No CPU, no ALU, no registers, no program counter, no actual program that is executed – nothing that could be identified as microcontroller of any kind. It contains ROM to display characters, but this ROM is not executable. By the way, BCD to 7-segment decoders do actually contain “ROM” – it is lookup table. the same principle as character generator inside HD44780.

        1. It’s not about code, but code+tables. As a 44780 has a character table it means you’d be cheating extra data in that part. I’d say that if you want a graphical output it should be fully defined in the program code itself, taking into account the 1kB limit. Am I right about that?

          1. Not sure about it.
            The same goes for 7-segment displays and its decoders Adam mentioned as OK. Should I include segment tables in my code too? The BCD to 7seg decoder is external lookup table, giving meaning to LED segments in the LED display. HD44780 does the same thing for the LCD segments – though in apparently more convoluted way. But IMO principle is still the same.

  7. What about components that include a ROM?

    Take the standard 8×2 character display: there is a ROM in it with the font that you cannot update or remove, even if you don’t use all characters.
    Modern processors often have a bootrom that executes when you turn on the power, and that then jumps to your program, does this count if you don’t jump back to it or use any functionality it provides (except setting up the CPU)?

    1. With emulators, there’s a lot of fancy stuff you can have on a virtual TI that was rare and/or expensive. If you ever wanted some of those 3rd party enhanced Extended BASIC versions or some of the expansion hardware. There’s also Geneve 9640 emulation. I think MESS may finally have a complete 99-8 emulation.

      Want PASCAL, FORTH, P-Code? Something else? There has been a lot of new stuff recently. How about BASIC compilers to turn BASIC programs into machine code so they run much faster?

      The TI-99/4A was the only standalone computer system that ran the PLATO Courseware system. No network connection or DEC minicomputer required. It was supposed to be available for the apple ][ and IBM PC but I don’t think anything was released for them.

      You can put a TI emulator and all the PLATO software on a small memory stick for portable education. CaDD has finally produced a Windows version of their old DOS emulator and with it you can buy The Cyc DVD Won’t sell you The Cyc without also buying the emulator. :P

  8. There may be a serious hole in your rules, you need to limit the run-time (in CPU cycles) for the initialisation step if any form of compression is used because string rewriting sequences could be used to, given enough steps, generate any desired sequence.

    If you have no idea what I am talking about that would explain why there is a hole in the rules, and if you do understand the problem are you going to fix it?

    i.e. Given enough time I can find the code and binary string rewrite sequence that is under 1024 bytes yet expands into a full VM running a Linux desktop distribution, and it would all fit in the 8GB of RAM that is not uncommon these days on new PCs.

      1. …and, as you stated… “8GB of RAM that’s not uncommon these days on new PCs”… wherein a boot-ROM has already been DQ’d. But, assuming said loophole is allowed, plausibly intentionally, (e.g. it’s been explicitly stated that your code can decompress itself, fitting e.g. a 200byte decompressor and 824 bytes of compressed code)… still, can you do what you’re talking about with a machine that doesn’t have a BIOS of some sort, doesn’t have inbuilt lookup-tables for character-generation, etc…?

        1. Yes you can, if you can do it at all you can do it in a way that provides an binary blob in memory that is ready to execute. No BIOS required and the entire VM and it’s contents are in the blob.

          1. The problem is that it can’t be done. It is impossible to compress an arbitrary amount of data to a fixed size, there’s no loophole that can change that. Yes that includes custom creating a pseduo-random generator with a “magic” starting value – there’s _no_ loophole. Yes that also covers the co-creation of code and data (such as data that doesn’t generate 100% correct can be compensated by slightly changed code).

            What you are suggesting is similar to a number of infinite compressing ideas that pops up from time to time, they don’t work and your suggestion will not work.

            With that said there’s a theoretical chance that a pseduo-random generator with a magic starting value can generate a full Linux distribution. No the above is still true, do the maths, the probability that would happen is less than a new Earth clone with everyone cloned would suddenly replace Mars if I did the calculation right…

          2. No Megol it is you who need to do the maths, very specific and well established maths that I describe in my reply to Marcus. You guys seem to have missed out on some theory at some stage, fundamental knowledge that lies at the heart of computer science.

      2. Ah, no, why should it when it is no different from any other magic number used in a code routine. I do not have to defend my use of a given value to initialise a variable. Where do the rules require that? I can generate the code any way I like, otherwise compilers and compressors would be out of the question. I just have to make the tools FOSS.

    1. How do you know you have generated the correct string blob? It’s like the million monkey’s writing shakespear – someone with knowledge of the correct solution still needs to review it and compare it to the origonal to confirm.

      Sure you could compare hashes, but by nature of hashing being a much smaller subset there will be a lot of hash collisions. If it was really so easily possible someone would have already used it for lossless compression algorithms.

      1. How do you know that A=B? FFS that is not a hard problem at all.

        “If it was really so easily possible someone would have already used it…”

        Well nobody said it was easy, even though it is easy, what you don’t understand is that what matters is the question of what does it cost you to to do it, and that would be CPU cycles.

        The logic about other people doing it seems to be “Argumentum ad Ignorantiam”, a fallacy. Not that it is relevant when we already know that it is in fact mathematically proven, the method works and the only limit is space-time.

        In fact if you want to get philosophical about it you can start with nothing and apply the appropriate rules to generate an entire universe. :-)

        1. Where are you going to store “B”, within your 1kB, to compare it to “A”?

          Sure you can generate 8GB of random data. How will you make sure you arrive at the very specific bit correct target 8GB of data?????

          1. You clearly have no idea how rewrite systems work, once you have discovered the starting string value, transformation rules and number of steps then the engine that generates the desired result is deterministic. The discovery of the above mentioned conditions is part of the compilation stage of development and not limited to the 1024 bytes of the submitted executable. The smallest set of Turing complete rules that I know of are only 32 bits long and form a lookup table so the code part is very small, leaving a very large part of the 1024 bytes for the starting string and the step count value.


            And in case you naively clutch at the term “unsolvable”, that means you have to find the solution iteratively, and that is the spatiotemporally expensive part. Also some configurations grow, others are stable and others shrink, so obviously we are after one that grows such that the result after time(x) is the much larger binary blob we need to execute.

        2. Not it isn’t possible. The pigeonhole principle illustrates it perfectly.

          First realize that an algorithm isn’t magic – it is only a transforming process of data,
          Then realize that encoding 8GiB data requires 2^41 bits of information and that 1KiB data have 2^18 bits of information. 2^41/2^18=2^23 or 8Ki (8192).

          What you are proposing is using a transformation that makes 1024 bytes into 8GiB arbitrary data. To illustrate why it doesn’t work in the simplest way possible I’ll reduce the problem _but_ this is 100% analogous to the original problem, this isn’t a trick in any way just a simpler way to express it. If you don’t believe me (and don’t want to read up on this) you can simply expand the problem until you get the original one.

          The difference between the data sizes above was 8Ki, the smallest system with that property is that where the source data is 1 bit and the destination data is 8Ki bits. The source data can be 0 or 1 and can encode 2 values. Therefore if the system (with an external mechanism) is to generate 8Ki data it can only generate two variants of that data, it can’t generate any given output no matter what the external mechanism is. It is simply impossible.

          1. And by the way, this discussion is lame. If you can come up with a way of encoding a huge program into 1024 octets, including the code to decompress it, then that isn’t a hole in the rules, it’s completely fair play. Essentially, an interpreter compresses a number of machine code steps into each keyword, so are you going to outlaw using an interpreter?

          2. Yes the discussion is lame but not understanding why the idea isn’t workable or even possible can lead to real problems. There have been a number of people that have “created” a compression method that can compress already compressed data and lured investors to spend money on those. But that is an impossibility no matter how the data is encoded, one bit can’t encode more than two states (and so on), 8192 bits can’t in any way encode all the possible states a 8GB chunk of data can encode. There’s no way around it, it is 100% impossible.

            Now one could do something else like saying that the result wanted is the value of pi to the nth decimal and then it it possible to encode that in 1024 bytes. Some may say that pi is (expected to) contain all possible permutations of data in such a way that digits x to y somewhere will contain the data one want requiring only storing position x and y (possibly with y stored as y-x to reduce space) and a program generating those digits. However even if that would be true storing x and y would at best be the same size as the input data and in reality much larger.

            Again there’s no way around it.

    2. Yes, it is possible to define a set of iteratively executing rules in 1024 bytes which will create whatever larger sequence of bytes that you want. The problem, however, is to define how many times the rules need to execute before the big byte blob is generated.

      Consider the simplest binary blob generation rule: increment the blob by one. Run it enough times and you will get your 8GB VM running Linux. Problem is that you also need to define how many times you need to run it, and that number will be 8GB large.

      Another rule would be to add 2 to the binary blob. You would only have to run that rule half the amount (8GB – 1 bit) to get within one bit of the desired result, but you still have to use the last bit for parity.

      Of course there are rules which will compress the data, because a lot of it will be structured data. But at some point the compressed data will be effectively random, and any further compression attempts will not give any smaller sizes.

      1. I don’t think it is possible. How about 512 bytes or 64 bytes or 1 byte or 1 bit even. At some point you cannot create the larger sequence. Where is that line between possible and impossible. If the desired sequence contains more than 1024 bytes of information then it is impossible to automatically create it with 1024 bytes.

        1. Please read my complete message again, it is easy enough to code a simple binary blob generation rule in 1024 bytes (Example: Increment binary blob with 1). The problem lies in knowing exactly how many times you need to run it to create a specific blob, and that information will not be possible to store within 1024 bytes for a large, non-structured, binary blob.

    3. Dan,

      While this argument is delightful, why don’t you simply create said algorithm and prove every one wrong?

      I looked into the Semi-Thue System and I recognize it from other bits of code I worked on, therefore I’ll refrain from stating my opinion about SRS.

  9. Check out the demoscene, especially the 32K and 64K demos. Here’s a video of the 2000 prize winner from Farbrausch “the product”. 11 minutes of 3D geometry, textures and audio generated on the fly from 64K of code, and they didn’t use all of that – filled up the last of the space with an outro text scroll that explains quite a bit about how it was done.

    Search for farbrausch werkzeug to find some of the tools the group has used to create their demos.

    1. The first thing i was thinking of when i saw the headline were the good old 64K farbrausch demos. Never got that far to do something fancy like that myself, but it was very fascinating to see what the guys could squeeze out of these “little bits” of code.

  10. > This contest is open to any microcontroller or microprocessor architecture.
    So does the preclude old mini mainframes ?
    Some really old hardware with CISC (Complex Instruction Set Computers [!RISC]) had some nutty assembly instructions. A think that a VAX-11 had an assembly instruction for N! (N factorial).

    I think the coolest use of 1KB would be to detect the processor that the code is running on, using hand crafted custom hex.
    So x86, ARM, 8051, MIPS, AVR, RISC-V, …

  11. I’ve written plenty of code that fits in 1K (a gold box (~100B), several different 4×7 segment displays (<512B) attached to a RS485 network, HA nodes (<1KWord)). Not going to enter them as they'd get their code kicked. In some cases the 1K limit (and 14B of RAM in the gold box) was not a huge issue. Understand that a lot of small microcontrollers are used as peripherals for bigger controllers and computers. I seem to recall creating an EPROM burner for my Atari 800XL that was slight more than 1K in size (but that had the Atari OS and DOS which were much more than 1K). A lot of the HCS II (HA system, RS485) used microcontrollers with the 16C84 (1K Word & 36B RAM).

    Some of the earliest drivers are only a few hundred bytes. That allowed you to bootstrap the system (go from toggle switches to paper tape to boot).

    I'm not saying microcontrollers are better than microcomputers. The right tool for the right job.

    1. I had a parallel printer on the C64. The original “Centronics Interface” was a rather expensive (at least for a schoolboy) HW device. But there was a “software centronics interface” solution, essentially a printer driver. It used the user port and less than 192 bytes of code, so it fitted in the tape data buffer which I did not use anyway as I had a 1541. The use of this address space was nice, as you did not have to change the start of the BASIC program memory. And it had short loading time beeing so small :-)

  12. I’ve been working on an emulator for my 16-bit homebrew CPU architecture. It’s Z80-inspired so it’s not really any more efficient than any other CPU/MCU. Would that be eligible?

  13. qty 8 128×1 bit static ram chips on a hand-wired 1802 board using toggle switches to manually set address and load data into each byte…. and an enter PB plus one run/load toggle switch. Demonstrated to the wife when the first program worked! It blinked a LED. Her reaction; “You spent $125 to blink a light?” There is still a bit of a lump where she whacked me, and it still hurts a bit. That was a heavy umbrella!

    Within the first week a board was built to program 2704 eeprom to PERMANENTLY hold the monitor program being keyed in each time power was removed… … and made a lot of money selling cassette tape interfaces to friends.

    Coding contests betwixt our group members were usually a task to be accomplished in the fewest bytes of RAM… cause NOBODY had enough RAM. Hence RAM was always in all-caps, hence ROM of course followed suit being all-caps. Designed an 8K RAM board for Aardvark Technical Services…. earned a bit of royalties of their sales.

    6502 pretty much took over the scene. Things started to really take off then. The OSI series was my bread and butter. Folks programmed in Machine Language… assemblers were luxury IF you could find one! Wasn’t long and MS Basic in Rom was available and SBC’s really took off then. It became the OSI/Aim/Sym/Kim era.

    Hardware of greatest desire was soon the 110 Baud modem. Then 300 Baud! Sold a lot of those (handmade). Acoustic couplers were hard to find but if you had a supply there was money to be made, there were BBS systems out there you could download code from!

    Teletypes became the “most desired” object… a difficult quest though. The Epson MX-80 solved this. Was President of the Hacker’s Group back in those days… Milwaukee.

    Xmodem. OMG! Made a LOT of money getting Xmodem up and running on systems. And interfacing printers… custom cables… CP/M was lotta fun! Things were getting easier.

    I024 Bytes eh? Yah! Thanks for the Blast from the Past! I suggest though that it be seen what can be done in just 128 bytes. Easier to judge too!

  14. This is a fun competition but I wonder whether it is sending a good message. In the past, such technical limits have resulted in really clever but very poor software that usually end up in completely unmaintainable systems. Of course this is not an issue with a competition based on 1k of memory but it would be nice to also include a requirement that the code be understandable by the judges. That is to say they could inspect the code and within some (short) specified period of time be able to establish how it works.

    1. It is indeed sending the wrong message, but this is not any official “hackaday prize” style thing, it is more like a small weekend meetup project. That said I hardly believe they will pick the winner by how good their code is; as long as it shows impressive results and fits 1kB it is a GO.

  15. The rules regarding using an LCD really need to be clarified, especially when projects being promoted are already using them. And what about a UART? The calculator example stated input from a user via switches was okay, and reading data from an SD card was okay, so can I read data from a UART? And send data back via the same UART and display it in a dumb terminal?

  16. So if I’m reading the rules correctly, you are talking about the actually bytes that make up the program? I’m thinking about using an msp430 chip. It has a vector table, that sucks up a bunch of bytes, however, I’m actually only using the reset vector which is 2 bytes, I don’t change any of the other ISR vectors. If I use the msp430-size command on my .elf file it shows ( of 166 bytes):

    $ msp430-elf-size hackaday_1k.elf
    text data bss dec hex filename
    158 2 6 166 a6 hackaday_1k.elf

    If I look at the hex file which are the actual bytes that are written to the flash, meaning it doesn’t include any of the ram used on the chip, it shows ( of 160 bytes):

    $ msp430-elf-size hackaday_1k.hex
    text data bss dec hex filename
    0 160 0 160 a0 hackaday_1k.hex

    So which one is it 160 bytes or 166 bytes? The code that I flash to the chip or the code + ram used?

  17. What about architectures where RESET vector table has to be at the end of FLASH (MSP430)? If I have 1022 byte app and single 2 byte reset vector at address 0xFFFE, does it mean my app is 1kB or 64kB ?

    Is it possible to have 64kB binary where 63kB are just 0xFFs?

  18. Will the ELF/PE header be counted against the 1kb limit ? The headers don’t provide any “functionality” and are there just to instruct the operating system how to load the file.

Leave a Reply

Please be kind and respectful to help make the comments section excellent. (Comment Policy)

This site uses Akismet to reduce spam. Learn how your comment data is processed.