The X Macro: A Historic Preprocessor Hack

If we told you that a C preprocessor hack dated back to 1968, you’d be within your rights to remind us that C didn’t exist in 1968. However, assemblers with preprocessors did, and where there is a preprocessor, there is an opportunity to do clever things. One of those things is the so-called X macro, which saw a lot of use in DEC System 10 code but probably dates back even earlier. You can still use it today if you like, even though there are, of course, other arguably better ways to get the same result. However, the X macro can be very efficient, and you may well run into it in some code, too.

Background

Preprocessing used to be a staple of programming. The idea is that code is manipulated purely at the text level before it is compiled. These days, languages with a preprocessor usually handle it as part of the compiler, but you can also use an external preprocessor like m4 for more sophisticated uses.

Modern languages tend to provide other ways to accomplish many of the tasks handled by the preprocessor. For example, if you have a constant you want to set at compile time, you could say:

int X = 32;
y = X;

But then you’ve created a real variable, along with the overhead that might entail. A smart compiler might optimize it away for you, but you can make sure by writing:

#define X 32
y = X;

A modern compiler would prefer you to write:

const int X=32;
y = X;

But there are still some common uses for the preprocessor, like including header files. You can also make more sophisticated macros with arguments so you don’t incur a function call penalty, although the modern approach would be to mark those functions as inline.
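
As a minimal sketch of that trade-off (the names here are made up for illustration), a function-like macro and its inline-function equivalent might look like this:

#include <stdio.h>

// Classic function-like macro: pure text substitution, no call overhead,
// but no type checking, and the argument is evaluated twice
#define SQUARE_MACRO(x) ((x) * (x))

// The modern alternative: a type-checked function marked inline
static inline int square_inline(int x) { return x * x; }

int main()
{
  printf("%d %d\n", SQUARE_MACRO(5), square_inline(5)); // both print 25
  return 0;
}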

The Problem

Which brings us to the X macro. As with all great hacks, there is first a problem to solve. Imagine you have a bunch of electronic parts you want to deal with in your code. You don’t want a database, and you don’t want to carry a bunch of strings around, so you define an enumerated type:

enum parts { part_LM7805, part_NE555 }; // will add more later

Of course, you will eventually want to print them, so you do need to store the names somewhere, right?

const char *partnames[] = { "LM7805", "NE555" }; // will add more later

This is all fine until you add a new part like, say, a 2N2222. You must remember to update both the enumerated type and the string array, or havoc will ensue. This seems easy until you realize that you might define the enumerated type in a header file but only define the string array in a source file. It is easy to get them out of sync.
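
As a minimal sketch of how that split typically looks (the file names are hypothetical), the new part gets added to one list and quietly forgotten in the other:

// parts.h
enum parts { part_LM7805, part_NE555, part_2N2222 }; // 2N2222 added here...

// parts.c
const char *partnames[] = { "LM7805", "NE555" };      // ...but forgotten here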

The Hack

The idea is to define a macro that handles all the definitions of parts in one place:

#define PARTS \
X(part_LM7805,"LM7805") \
X(part_NE555,"NE555")

Now when you declare the enum and the string array (which may not be in the same file, remember):

#define X(a,b) a,
enum parts { PARTS };
#undef X

#define X(a,b) b,
const char *partnames[] = { PARTS };
#undef X

If you carefully read the code, you can see how it works. The PARTS macro defines a list of items using the X macro. Before using the list, you define X to “select” one of the pieces. The first #define makes X() return its first argument, and the second #define, the second. Because these preprocessor macros expand before the code is compiled, the preprocessor writes out the same code as in the original example. The advantage is that the ID and the name are joined together in the text, which makes it harder to forget to add or update one when making changes.
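
To see it concretely, this is roughly what the preprocessor hands to the compiler after the two expansions above (the trailing commas are legal in both enumerator lists and initializers):

enum parts { part_LM7805, part_NE555, };

const char *partnames[] = { "LM7805", "NE555", };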

Even Better

Using modern C preprocessor syntax, we can do even better by using token pasting and the stringize operator.

Here’s a quick tutorial if you haven’t encountered these oddball preprocessor operators. The stringize operator # converts whatever you put after it into a quoted string. The token pasting operator ## joins two tokens into one token. So:

#define print(str) printf("%s\n", #str);
#define declare(type, prefix, var) type prefix##var;

declare(int,global_,v);
print(Hello!);

Not that either of these is a good idea, mind you. But you can see that the declare macro will define an integer called global_v, and the print macro will print the token that follows it as a string.
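
In effect (ignoring the extra semicolons the macro definitions leave behind), those two lines expand to:

int global_v;
printf("%s\n", "Hello!");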

Consider this:


#include <stdio.h>
#define PARTS \
   X( LM7805, 0.20 ) \
   X( NE555, 0.09 ) \
   X( 2N2222, 0.03 )

// create enum
#define X(a, b) part_##a,
enum parts { PARTS };
#undef X

//create string table
#define X(a, b) #a,
const char *partnames[]={ PARTS };
#undef X

// create price table
#define X(a, b) b,
float partprice[]= { PARTS };
#undef X

int main()
{
  enum parts p=part_NE555;
  printf("%s costs %0.2f\n", partnames[p], partprice[p]);
  printf("%s costs %0.2f\n", partnames[part_2N2222], partprice[part_2N2222]);
  return 0;
}

Here, we define a table of parts and prices. (Made up prices, to be sure.) The enumerated type uses part_##a to create things like part_NE555. The string table uses #a to get a string “NE555” into the source code. Finally, the price table uses b.
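
For clarity, this is approximately what the compiler sees for the three tables once the preprocessor is done:

enum parts { part_LM7805, part_NE555, part_2N2222, };

const char *partnames[] = { "LM7805", "NE555", "2N2222", };

float partprice[] = { 0.20, 0.09, 0.03, };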

Simple, yet effective. Sure, you could use a structure or an object to help. There are also plenty of other ways you could deal with this in the preprocessor. For example, you could define everything in one file and use #if to select which parts of it are included in different parts of the code. Regardless, the X macro is an elegant hack: it does solve the problem, and it has been doing so since at least 1968.
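
As a rough sketch of that #if-based alternative (the PART_MODE macro and the partlist.inc file name are invented for illustration), a single shared file could carry all of the lists, with each consumer selecting the one it needs before including it:

// partlist.inc -- hypothetical shared definition file
#if PART_MODE == 1        // enumeration constants
part_LM7805, part_NE555, part_2N2222,
#elif PART_MODE == 2      // name strings
"LM7805", "NE555", "2N2222",
#endif

// In one source file:
#define PART_MODE 1
enum parts {
#include "partlist.inc"
};
#undef PART_MODE

// In another:
#define PART_MODE 2
const char *partnames[] = {
#include "partlist.inc"
};
#undef PART_MODE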

The preprocessor can do some pretty amazing things. For example, we’ve built a cross assembler using it. We’ve even seen people do logic gate simulations in the preprocessor.

45 thoughts on “The X Macro: A Historic Preprocessor Hack”

    1. I’ve heard this trope so often, yet many of the uses of macros are not covered by templates or constexpr, and those bring new issues, build times and debugging being only two of them.

      And what is the actual reason to “Avoid macros as much as possible”? What is the issue that this strategy tries to prevent?

      1. It’s C++, they have squirrely ideas about what’s good or bad.

        For example, they despise printf-style formatting, because:

        std::ios_base::fmtflags f(cout.flags());
        cout << setprecision(5) << setw(15) << number;
        cout.flags(f); // Reset cout to default

        is much better than %15.5f, because three whole lines to do something you can type in 6 chars is so much more self-explanatory, don't you agree?

          1. I’ve run into that too with overuse of objects (C++) and other ‘obscure’ features of C++ by some overzealous C++ users … just because they can. Instead of a few lines of code, they end up with a mountain of code and modules :) … But hey, it’s C++! Nothing against C++ users (I am one too) but with my roots in C, I like to keep it simple and straightforward for the next guy that has to maintain the code.

          As for macros, my motto is to keep it simple and use them only when it makes sense.

          1. More terse isn’t always more good. But I absolutely agree that big C++ projects need to have style guides that lay out which subset of language features to use and which to ignore. Otherwise you have a team of regular devs, and then one dork who actually takes advantage of the preprocessor and templates both being Turing complete. So 90% of the codebase is reasonable, and reading the last 10% is like getting a train run on you by the King in Yellow, Cthulhu, and the Watcher in the South.

        2. There are some benefits to the old C++ approach although in many cases they’re not worth the verbosity.

          The old C++ approach can be evaluated at compile time and a type-specific implementation could be inlined and optimized. I believe the C approach still evaluates format strings at runtime although I could be wrong.

          The old C++ approach does not require you to pick a code based on the size of the type you are printing, you just print it and the compiler chooses an implementation. The C approach requires getting the format string correct based on the type – do I use %f, %lf, %Lf, %d, %u, %ld, %lu, …? Maybe not so hard for cases where you have an explicit int or long yourself but what about system-defined typedefs that might be different on different systems? Though admittedly these days if you compile with -Wall -Werror generally the compiler will catch many issues.

          The C++ folks also think the old C++ approach is not ideal and added std::format in C++20 and are adding std::print in C++23. This will let you do

          cout << std::format("{:15.5f}", number);

          or eventually

          std::print("{:15.5f}", number);

          instead of

          printf("%15.5f", number);

          with (I believe) more compile-time benefits.

        3. I use both C and C++ for embedded. We do not even use the C standard library for printing, but use some external library. This is due to stack and heap usage. It is often smaller in size and faster too.
          I prefer C printf over the C++ string streaming. There are now more convenient libraries in C++.
          I’ve been using C++ in embedded since C++17. Prior to C++11 the language lacked features that are very useful and embedded compilers lacked support.

      2. CMake can add build times and even commit hashes. It can generate header files for you.
        Macros cause a lot of headaches and are notoriously hard to debug. There are often better alternatives that also give the compiler the ability to do some checks and optimizations.

    2. Even a simple std::map can handle this with a cleaner approach. You could argue that lookup times on a std::map are expensive on embedded hardware, and that’s true, also related to the size of the map itself. One should always aim to use the better tool for the job. Not arguing for or against C++, but the lack of type safety in C has always been a problem, a lot of which goes away with experience and time, but we are prone to make mistakes.

    3. Absolutisms like “avoid macros as much as possible”, like in real life, are ignorant in programming. Context is for kings.

      In application software, yes, one doesn’t tend to use macros because they are hard to debug and maintain and are not really worth the performance gain on modern hardware vs. that cost.

      But in software running on constrained hardware, drivers, or firmware, the stuff typically written in C, they are desirable because one tends to actually care about every grain of performance they can get.

      So going back to context, this article being about C and on a hardware hacking site, I’d say your sentiment is in error.

  1. I like this form:

    #define PARTS(X) \
    X( LM7805, 0.20 ) \
    X( NE555, 0.09 ) \
    X( 2N2222, 0.03 )

    // create enum
    #define PARTS_TO_ENUM(a, b) part_##a,
    enum parts { PARTS(PARTS_TO_ENUM) };

    because it allows you to have helper macros with descriptive names, predefined in a header file, instead of using the name X from the global scope.

    1. Really nice alternative. I will probably use it since I recently added some X macros to some greenfield code, and this way is probably more self-documenting and avoids redefining macros. Thanks!

  2. Don’t know… Instead of horrible nested macros (nothing wrong with macros, but when there are too many and they are nested it’s difficult to understand) I prefer some Perl/Python/… that will read a simple text file and generate the needed C stuff. Then you can just do a #include "parts_generated.c" (yes, .c) and it will work just fine.

    1. I prefer Lisp or Scheme where the language can easily interact with itself to generate these kinds of things in the normal language instead of through a preprocessor’s weird structure.
      I’ll never use Python for build tools again. It has been a special kind of hell dealing with version issues between different build environments, and Python’s unwillingness to provide a forward-compatible dialect of the language.

    2. This is also what I do.

      But that’s also saying that I’ve probably reinvented a good part of the preprocessor wheel in Python, simply b/c I didn’t know it.

      Stringize and token pasting were new to me, and I _know_ that I’ve written templatey stuff in Python to do something similar. Kinda fun to see how they used to do it back in the old days.

  3. For an example usage on modern code, take a look at the code in Flipper Zero’s scene-based apps. For some reason they decided to store handler pointers in arrays by handler type instead of grouping them together by scenes in structs, but this is how they populate those arrays.

  4. Is it just me and my old eyes, or are the directive lines (prepended with #) in the code examples in a particularly dim color? Are they colored as if they were Python comments or something? It would be less critical if they actually were comments, but they are an important part of the code examples. (Also, as already noted, #defined constants don’t use the equal sign (=), which is what brought me here in the first place, but I like the redefinable X-macro hack. The enum/enum text string issue happens all the time.)

  5. Be careful:

    Assuming that #define X 32 is similar to const int X=32 can be very misleading; one must know their compiler and ecosystem.

    With the C compiler I often use, CCS C, those are two drastically different things.
    :: #define X 32 would be used like a find and replace in source
    :: const int X = 32, would cause the compiler to put 32 in code space memory (not ram)

    When I store a chrset in rom, I use: const chrSet0[size] = { …. };

    1. That’s true, but the fact that it puts it in code space is because it knows it is constant, and that’s my whole point. Sometimes you really do just want text substitution at the point of use, and it is nicer to not have to guess if the compiler will do it or not.

    2. >> const int X = 32, would cause the compiler to put 32 in code space memory (not ram)

      Actually, the compiler will use the value of X directly in any expression in which X appears – unless “constant evaluation” (aka “constant folding”) is disabled. This means that Y = X; will compile as though it were Y = 32;

      On the other hand, if X either is a global variable or its address is used, then there will also be a constant stored in code space – something that can’t be done when X is declared using #define

      Another special case is:
      volatile const int X = 32;

      In this case, X is treated as a variable – a read-only variable. The volatile qualifier tells the compiler that X can be modified “externally”. This could mean that X is “mapped” to a read-only hardware register, or shared with another thread, or an interrupt service routine.

      Often, the declaration would be:
      extern volatile const int X;

  6. i love C preprocessor, i think it is a good balance between smarts and power. i’ve definitely seen a lot of other languages make far more complicated generic programming tools that are much less efficient and powerful and expressive. yeah, i hate C++. i’d go so far as to say the lack of the C preprocessor is actually my single least favorite thing about java. my one exposure to rust ran me into this awful hack, a real failure of generic programming, maybe the guy who came before me didn’t know what he was doing but i couldn’t stop thinking about how rust’s impressive type system fell victim to the worst of C++’s problems when it comes to trying to put together a generic type…the C preprocessor would’ve done a better job.

    the only thing “better” is forth, but it’s got its own challenges…to me, sadly, forth is a toy.

    but i have to say, even though i’m a “compiler guy”, i am still sometimes astonished by the C preprocessor. i have known the rules, i know where to look them up, but i’m still often surprised by the things that suddenly become possible iff you nest your macros. i can’t even summarize it well off the top of my head, i think it is something like macros within macro arguments get evaluated if you call another macro from within your macro. iow, the rules for when macro operands are evaluated make for some very powerful expressive opportunities. but the thing you wind up with doesn’t look expressive at all :)

  7. Oh my god please don’t use this in real code, just use a modern language. In Nim:

    type Part = object
    name: string
    price: float

    const LM7805 = Part(name: "LM7805", price: 0.20)
    const NE555 = Part(name: "NE555", price: 0.09)
    const p2N222 = Part(name: "2N222", price: 0.03)
    echo LM7805.name, " costs ", LM7805.price

    And before you say “it takes up more memory because it’s a variable”, no it doesn’t, because const is evaluated at compile time, so the compiler can just replace the parts in the echo with its value.

    1. you missed the point of the exercise. now create two different arrays, one array of part numbers and one of part names. and do it without typing LM7805 twice.

      i’m not saying there’s not a case to be made for expressing yourself differently — i would say generally it’s true that if your macros are complicated then you should consider reframing your problem — but your example doesn’t accomplish anything the X macro here does.

      1. Nim and other modern languages have really good macro support. If you reaaaaaallly don’t want to type the name twice:
        “`
        type Part = object
        name: string
        price: float

        template createPart(partName: untyped, priceInp: float): untyped =
        const partName = Part(name: astToStr(partName), price: priceInp)

        createPart(LM7805, 0.20)
        createPart(NE555, 0.09)
        createPart(p2N222, 0.03)
        echo LM7805.name, ” costs “, LM7805.price
        “`
        I don’t see why you would do all that array stuff. But I guess my point is don’t do crazy crap in C unless you want bugs.
        (hopefully hackaday uses markdown?)

        1. Gaaah is it bbcode?

          type Part = object
              name: string
              price: float
          
          template createPart(partName: untyped, priceInp: float): untyped =
              const partName = Part(name: astToStr(partName), price: priceInp)
          
          createPart(LM7805, 0.20)
          createPart(NE555, 0.09)
          createPart(p2N222, 0.03)
          echo LM7805.name, " costs ", LM7805.price
          
        2. Does this technique let you put the tables in flash? I’ve reluctantly been doing the X macro method for years, because when you have fixed tables of over a hundred entries, changes can be a nightmare and error-prone if you have to type the information twice.

        3. “I don’t see why you would do all that array stuff.” like i said, it’s a good idea to examine your problem if you find yourself using complicated macros. but nonetheless and all the more, your example still doesn’t accomplish what the X macro accomplishes.

          it’s fine if you don’t like the X macro approach in this article but your nim example simply isn’t related at all. the whole point is to accomplish “all that array stuff”. i find a use for an approach like that one oh i don’t know about once a year. especially when making test cases, there are instances when you want to do “all that array stuff.”

        4. heh and, on a hunch, i tried to run your example through nim. “apt install nim” worked ok. figuring out how to invoke “nim c buh.nim” wasn’t hard. but i get a bunch of error messages from glibc’s stdlib.h: __BEGIN / __END_NAMESPACE_STD, and __extension__.

          so, you know, i’ll see your “I don’t see why you would do all that array stuff” and raise you a “I don’t see why you would want to have a less-well-supported language toolchain” 🙂

  8. I definitely agree that having the preprocessor features implemented as part of the language is safer and more manageable. However, there’s great power in being able to generate any construct with a preprocessor, and to do it in a straightforward manner. The overhead of doing some things in the language proper, both in learning to do it and in reading the resulting code, becomes too much at some level, to the point of diminishing returns.

  9. Wow, thanks for the history lesson. I was familiar with this idiom but didn’t know it had such deep roots. I first saw it in the code for LuaJIT and have been using it ever since.

    As well as being good for initializing large arrays, you can use this to generate code just by using ; or && or || or + or whatever instead of the , in the X macro. You get the benefits of a data-driven coding style without the static (space) or runtime overhead of looping over an array. This definitely does take up significantly less space sometimes, depending on the use case and target architecture.

  10. Not knowing about this macro magic trick, I was the guy trying to understand and fix a large code base that used it extensively for state machines… ugh! A convenience editor like SlickEdit parses the code (called tagging) and should be able to tell you where an enum like part_NE555 is defined and all the places where it is used, but fails out of the box :(

    The convenience of defining it in one place and the avoidance of duplication errors are outweighed by the loss of readability / comprehension.

    Try a text search for “part_NE555” and you won’t find where it is defined in the example (line 4 and/or line 9). Now imagine 1000 H and C files.

    It’s a nice trick but caused me a lot of mental pain.
