Code Craft-Embedding C++: Hidden Activities?

What is an embedded system? The general definition is a computer system dedicated to a specific purpose, i.e. not a general purpose system usable for different tasks. That is a very broad definition. I was just skimming the C++ coding guidelines for the Joint Strike Fighter. That’s a pretty big embedded system and the first DOD project that allowed C++! When you use an ATM to get money you’re using an embedded system. Those are basically hardened PCs. Then at the small end we have all the Internet of Things (IoT) gadgets.

The previous articles about embedding C++ discussing classes, virtual functions, and macros garnered many comments. I find both the positive and critical comments rewarding. More importantly, the critical comments point me toward issues or questions that need to be addressed, which is what got me onto the topic for this article. So thank you, all.

Let’s take a look at when embedded systems should or should not use C++, taking a hard look at the claim that there may be hidden activities ripe to upset your carefully planned code execution.

Limits of Embedded Development Boards

Embedded systems are often thought of as having limited resources, e.g. memory, processing power. Having real-time constraints is another requirement frequently brought up. While those do occur in embedded systems they are not defining characteristics.

At some point processor or memory limits preclude using C++, and often even C. Vendors might resort to a restricted version of C on some processors to provide a high-level language capability, an effort that would be silly for C++.

But we’ve not hit the limit on the boards used in these articles. We see with the Arduino Uno and its relatives that C++ is usable. The Uno is restricted to a subset of C++, in part because the developers did not have a C++ standard library available. (If you really want one, there are ports of the STL for the Uno.) The compiler in the Uno toolset supports C++11 and there is some support for C++14, but I haven’t explored the latter to know what is usable. There are capabilities in C++11, and C++14, that improve C++ use in embedded systems.

The Due, a larger Arduino board I’ve used to contrast with the Uno, does have the full standard library. Switch over to the Raspberry Pi, or equivalents, where you not only get the GCC toolset but can run Eclipse on the board, and it feels like the sky’s the limit.

Should You C++?

While all the above is valid, it misses a critical point. The issue isn’t whether you can use C++ on the smaller systems but whether solving the problem needs C++’s capabilities. What I’m suggesting is changing the question from “Can you use C++?” to “Should you use C++?”

We’ve addressed some of the really basic objections to using C++. Code bloat is not the great explosion folks imagine. Virtual functions are not super slow. But the comments raise other issues. One comment advised against using C++ because of the hidden activities. Specifically mentioned were copy constructors, side effects, hidden allocation, and unexpected operations.

What is a copy constructor, and why do we need one? It’s a constructor that makes a copy of an existing instance. Any time a copy is made the copy constructor is called. Recall that all constructors initialize instances so they are ready to be used.

A copy constructor is required if you pass a parameter by value. That’s a copy. Returning a value from a function causes a copy, although a decent compiler will optimize it away. Assignment also makes a copy, although it goes through the copy assignment operator rather than the copy constructor.
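To make those cases concrete, here is a minimal sketch that counts the calls. The type `Tracked`, its counters, and the functions `byValue` and `showCopies` are hypothetical names invented for this illustration.

```cpp
#include <cstdio>

// Hypothetical type that counts how often it is copied.
struct Tracked {
	static int copies;       // copy constructor calls
	static int assignments;  // copy assignment calls
	int value { 0 };

	Tracked() = default;
	explicit Tracked(int v) : value { v } {}
	Tracked(const Tracked& other) : value { other.value } { ++copies; }
	Tracked& operator=(const Tracked& other) {
		value = other.value;
		++assignments;
		return *this;
	}
};

int Tracked::copies = 0;
int Tracked::assignments = 0;

// Pass by value: the argument is copy constructed into the parameter.
int byValue(Tracked t) { return t.value; }

// Runs the three cases and prints the counters.
void showCopies() {
	Tracked a { 42 };
	byValue(a);        // copy constructor
	Tracked b { a };   // copy constructor
	Tracked c;
	c = a;             // copy assignment, not the copy constructor
	std::printf("copies=%d assignments=%d\n", Tracked::copies, Tracked::assignments);
}
```

Calling `showCopies()` once should report two copy constructor calls and one assignment, since none of these three copies is eligible for elision.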

With built-in types the cost of a copy is low, except maybe if you are using long doubles at 16 bytes a value. For large data structures a copy can be expensive and can be tricky. Rather than bemoan that C++ does copies, we need to recognize they are a necessity. That recognition means we can work to avoid them and get them right when they are needed.

One way to avoid copies is to pass structures by reference. In C, passing a pointer serves as pass by reference. C++ allows that and also introduces reference types. A reference is not just syntactic sugar. For example, references eliminate the null pointer problem since a valid reference always refers to an object. A reference can still dangle if that object is destroyed first, so lifetimes remain your responsibility.
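A small sketch of the three ways to pass a structure may help here. `Samples` and the three `first...` functions are hypothetical names, and the copy counter exists only to make the copies visible.

```cpp
#include <cstddef>

// Hypothetical bulky structure; the counter makes copies visible.
struct Samples {
	static int copies;
	int data[64] {};

	Samples() = default;
	Samples(const Samples& other) {
		for (std::size_t i = 0; i < 64; ++i) data[i] = other.data[i];
		++copies;
	}
};

int Samples::copies = 0;

// By value: all 64 ints are copied on every call.
int firstByValue(Samples s) { return s.data[0]; }

// By pointer, C style: no copy, but the caller may pass a null pointer.
int firstByPointer(const Samples* s) { return s ? s->data[0] : -1; }

// By const reference: no copy, and no null check to write.
int firstByReference(const Samples& s) { return s.data[0]; }
```

Only `firstByValue` bumps `Samples::copies`. The pointer and reference versions read the caller's object directly.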

Which brings up the ownership problem with pointers and the questions they raise for data structure copies. Quite frequently, even in C++, a data structure contains a pointer to another data structure. When you make a copy who owns the structure at the end of the pointer? Do you copy the pointer or the data? If you just copy the pointer you are sharing data between the two copies. One copy can modify the data in the other copy. That is usually not a good thing. Copying the data might be expensive. Also, who ultimately decides when the target of a pointer is deleted, or even if it should be deleted?

C++ doesn’t introduce a problem with copy constructors; it highlights a requirement that needs to be addressed, sometimes by looking to the problem requirements. What is needed by the solution when a copy is made?
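One possible answer to those ownership questions is a deep copy, where each object owns its own data. The sketch below assumes a hypothetical `Buffer` class that owns heap storage; it is one choice among several, not the only reasonable one.

```cpp
#include <cstddef>

// Hypothetical class that owns a heap buffer through a raw pointer.
class Buffer {
public:
	explicit Buffer(std::size_t n) : mSize { n }, mData { new int[n]() } {}

	// Deep copy: the new object gets its own storage.
	Buffer(const Buffer& other) : mSize { other.mSize }, mData { new int[other.mSize] } {
		for (std::size_t i = 0; i < mSize; ++i) mData[i] = other.mData[i];
	}

	// Copy assignment is disabled to keep the sketch short.
	Buffer& operator=(const Buffer&) = delete;

	// The owner releases the storage exactly once.
	~Buffer() { delete[] mData; }

	int& at(std::size_t i) { return mData[i]; }
	int at(std::size_t i) const { return mData[i]; }

private:
	std::size_t mSize;
	int* mData;
};

// Returns true when changing a copy leaves the original untouched.
bool copiesAreIndependent() {
	Buffer a { 4 };
	a.at(0) = 1;
	Buffer b { a };   // deep copy
	b.at(0) = 99;     // must not affect a
	return a.at(0) == 1 && b.at(0) == 99;
}
```

Had the copy constructor copied only the pointer, both objects would share the data and both destructors would try to delete it. That is exactly the ownership question raised above.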

Copying Data

In my robotics work I use an inertial measurement unit (IMU) to help track position and bearing, the robot’s pose. Inside the IMU are an accelerometer, a gyroscope, and a compass. The accelerometer and gyroscope both provide data as a triple of values, i.e. measurements along the x, y, and z axes. There are a number of operations that need to be done on that data to make it usable, many more than we want to look at here. But we can look at how to handle this triple of data and how to add two triples together. This is done with the gyroscope since it reports the angular rate of change per unit of time. By accumulating those readings you can obtain, theoretically, the bearing of the robot.

C++ Implementation

Here’s the declaration of the class Triple and the overloaded addition operator:

class Triple {
public:
	Triple() = default; // C++11 use default constructor despite other constructors being declared
	Triple(const Triple& t);	// copy constructor so we can track usage
	Triple(const int x, const int y, const int z);

	const Triple& operator +=(const Triple& rhs);

	int x() const;
	int y() const;
	int z() const;

private:
	int mX { 0 };	// C++11 member initialization
	int mY { 0 };
	int mZ { 0 };
};

inline Triple operator+(const Triple& lhs, const Triple& rhs);

I’m using a number of C++11 features here. They’re marked, and the implications for most are obvious if you are familiar with earlier versions of C++. The line with Triple() = default; probably isn’t obvious. It requests that the compiler generate the default constructor. Without it we couldn’t create a variable with no arguments on the constructor: Triple t3;. Normally the default constructor is only created by the compiler when no other constructors are defined. Since Triple has two other constructors there would be no default constructor. I requested one using the notation so variables could be created without arguments.

The next constructor, Triple(const Triple& t), is the copy constructor. It is not needed for this class since C++ would have generated one by default that would have worked fine for this simple class. I created it to show how one works and illustrate where it is invoked. My implementation uses a new C++11 feature, the delegating constructor, where a constructor can invoke another constructor to handle the initialization. This came into being to avoid code duplication, which often led to errors, or the use of a class member to perform initialization.
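The definition of the copy constructor is not listed in this article, so the following is only a sketch of what a delegating copy constructor could look like. The inline bodies and the trace message are assumptions added for illustration.

```cpp
#include <cstdio>

// Sketch only: member bodies are assumed, not taken from the article.
class Triple {
public:
	Triple() = default;
	Triple(const int x, const int y, const int z) : mX { x }, mY { y }, mZ { z } {}

	// Delegating constructor (C++11): hand the work to the three-value constructor.
	Triple(const Triple& t) : Triple { t.mX, t.mY, t.mZ } {
		std::puts("copy constructor called");   // trace added for illustration
	}

	int x() const { return mX; }
	int y() const { return mY; }
	int z() const { return mZ; }

private:
	int mX { 0 };
	int mY { 0 };
	int mZ { 0 };
};
```

With this in place, `Triple b { a };` prints the trace line once and leaves `b` holding the same three values as `a`.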

The final constructor allows us to initialize a Triple with three values. Those three values are stored in the data members of the class.

The next function overloads the plus equals operator. It turns out that the most effective way to implement the actual addition operator, seen a few lines below, is to first implement this operator.

The remaining functions are getters because they allow us to get data from the class. Some classes also have setters that allow setting class values. We don’t want them in Triple.

Here are the implementations of the arithmetic operators:

inline const Triple& Triple::operator +=(const Triple& rhs) {
	mX += rhs.mX;
	mY += rhs.mY;
	mZ += rhs.mZ;
	return *this;
}

inline Triple operator+(const Triple& lhs, const Triple& rhs) {
	Triple left { lhs };
	left += rhs;
	return left;
}

The first operator is straightforward; it simply applies the plus equal operator to each value in the class and returns the instance as a reference. This operator modifies the data in the calling object so the returned reference is valid.

The addition operator uses the plus equal operator in its implementation. Here is where the copy constructor comes into play. We have to create a new object to hold the result so one is created from the lhs value. That’s a copy.

The rhs is added to the new object using plus equal operator and the result returned by value, not by reference. The return is another copy. It cannot be returned by reference because the result object, left, was created inside the function.

There are two possible copies in any arithmetic operator. However, the C++ standard specifically allows compilers to optimize away the copy for the return value. This is the return value optimization. You’re welcome to try adjusting the code, but there is no way you can avoid creating a copy or two somewhere during this operation.
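If stepping through with a debugger is not an option, a counter can show the same thing. This sketch assumes inline member bodies and adds a hypothetical global `gCopies` and helper `copiesForOneAddition`; neither is part of the original class.

```cpp
// Assumption: a global counter stands in for stepping through with a debugger.
static int gCopies = 0;

class Triple {
public:
	Triple() = default;
	Triple(const Triple& t) : Triple { t.mX, t.mY, t.mZ } { ++gCopies; }
	Triple(const int x, const int y, const int z) : mX { x }, mY { y }, mZ { z } {}

	const Triple& operator +=(const Triple& rhs) {
		mX += rhs.mX;
		mY += rhs.mY;
		mZ += rhs.mZ;
		return *this;
	}

	int x() const { return mX; }
	int y() const { return mY; }
	int z() const { return mZ; }

private:
	int mX { 0 };
	int mY { 0 };
	int mZ { 0 };
};

inline Triple operator+(const Triple& lhs, const Triple& rhs) {
	Triple left { lhs };   // always one copy
	left += rhs;
	return left;           // another copy unless the compiler elides it
}

// Adds two triples and returns how many copy constructor calls that took,
// or -1 if the sum is wrong.
int copiesForOneAddition() {
	const Triple t1 { 1, 2, 3 };
	const Triple t2 { 10, 20, 30 };
	gCopies = 0;
	const Triple t3 { t1 + t2 };
	return (t3.x() == 11 && t3.y() == 22 && t3.z() == 33) ? gCopies : -1;
}
```

The count depends on the compiler and its settings. The copy into `left` always happens, while the copies on return and into `t3` may be elided, so expect a result somewhere between one and three.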

This code will run on an Arduino, but I created it and ran it on Linux so I could step through the operations to verify where the copy constructor was called and where it wasn’t.

How do you use this? Pretty much the same as any arithmetic operation:

	Triple t1 { 1, 2, 3 };
	Triple t2 { 10, 20, 30 };

	Triple t3 { t1 + t2 };

C Implementation

What would a similar implementation look like in C? How about this:

struct Triple {
	int mX;
	int mY;
	int mZ;
};

void init(struct Triple* t, const int x, const int y, const int z) {
	t->mX = x;
	t->mY = y;
	t->mZ = z;
}

struct Triple add(struct Triple* lhs, struct Triple* rhs) {
	struct Triple result;
	result.mX = lhs->mX + rhs->mX;
	result.mY = lhs->mY + rhs->mY;
	result.mZ = lhs->mZ + rhs->mZ;
	return result;
}

Overall it looks shorter and neater. The struct Triple contains the three data items for the axes. The routine init sets them to user-specified values. The add function adds two Triples and returns the result. The add routine avoids initializing result because we know its content will be overwritten by the addition operations. That’s a bit of a savings for C. There is still a copy when the function returns the value. You just don’t have any control of how that copy is done. In this simple situation it doesn’t matter but with a more complicated data structure, say, one with pointers, the copy might be more challenging. We’d probably need to resort to an output parameter using pass by reference with pointers instead of a return value.
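An output parameter version might look like the following sketch. It is written in the C subset so it reads like the code above, though it is compiled here as C++. The name `addInto` is hypothetical.

```cpp
// C-style code that also compiles as C++; addInto is a hypothetical name.
struct Triple {
	int mX;
	int mY;
	int mZ;
};

// The caller supplies the storage, so nothing is copied on return.
void addInto(struct Triple* result, const struct Triple* lhs, const struct Triple* rhs) {
	result->mX = lhs->mX + rhs->mX;
	result->mY = lhs->mY + rhs->mY;
	result->mZ = lhs->mZ + rhs->mZ;
}
```

The cost is at the call site: the result must be declared first and then passed by address, as in `addInto(&t3, &t1, &t2);`.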

Here is how it is used:

	struct Triple t1;
	init(&t1, 1, 2, 3);

	struct Triple t2;
	init(&t2, 10, 20, 30);

	struct Triple t3 = add(&t1, &t2);

Two values are created and initialized and then added. Simple, but you’ve got to remember to take the addresses of the structures and to ensure the init routine is only called once.

Consider how the two different versions would look if you implemented a complicated expression. I’ll just say I know which I would prefer.
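As a rough illustration, here is the sum of four triples written both ways. The functions `sumFunctionStyle` and `sumOperatorStyle` are hypothetical, and the struct is kept as a plain aggregate so both styles can share it.

```cpp
struct Triple {
	int mX;
	int mY;
	int mZ;
};

// Function-call style helper, mirroring the C implementation.
Triple add(const Triple* lhs, const Triple* rhs) {
	Triple result;
	result.mX = lhs->mX + rhs->mX;
	result.mY = lhs->mY + rhs->mY;
	result.mZ = lhs->mZ + rhs->mZ;
	return result;
}

// Operator built on the same helper so both styles compute identical values.
inline Triple operator+(const Triple& lhs, const Triple& rhs) {
	return add(&lhs, &rhs);
}

// (a + b) + (c + d) with function calls: every intermediate needs a name
// because add() takes addresses.
Triple sumFunctionStyle(const Triple& a, const Triple& b, const Triple& c, const Triple& d) {
	const Triple ab = add(&a, &b);
	const Triple cd = add(&c, &d);
	return add(&ab, &cd);
}

// The same expression with the overloaded operator.
Triple sumOperatorStyle(const Triple& a, const Triple& b, const Triple& c, const Triple& d) {
	return (a + b) + (c + d);
}
```

Both functions do the same arithmetic. The difference is how much bookkeeping the reader has to follow.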

Wrap Up

I didn’t start this article intending to do a direct comparison between the two languages. I only wanted to illustrate that the copy constructor is, if you insist, a necessary evil. Copies occur in multiple places in both C++ and C. They become critical to understand in C++ when using user defined data types, i.e. classes. Copying in C is less obvious but still necessary.

Since I didn’t intend to make a comparison, I don’t have code size or timings for the two versions. As I pointed out and demonstrated in the article on virtual functions, comparing these simple examples on those parameters is often misleading. A C++ capability is used to solve a problem, not just as an exercise of the language features. Only if an equivalent solution in C is created is a comparison valid.

The Embedding C++ Project

Over at, I’ve created an Embedding C++ project. The project will maintain a list of these articles in the project description as a form of Table of Contents. Each article will have a project log entry for additional discussion. Those interested can delve deeper into the topics, raise questions, and share additional findings.

The project also will serve as a place for supplementary material from myself or collaborators. For instance, someone might want to take the code and report the results for other Arduino boards or even other embedded systems. Stop by and see what’s happening.

34 thoughts on “Code Craft-Embedding C++: Hidden Activities?”

      1. Originally Java was meant to be for simple embedded design. They started out with something similar to a Basic STAMP, iirc. Then some bright spark had the idea of embedding the Java runtime in a web browser and suddenly it was all about applets.

        1. Don’t forget the early mantra, Write Once, Run Anywhere
          It looked like it was about to happen too, until MS made a broken version for their OS.
          (apparently they wanted everyone to use Viz y’all BASIC).
          I hear they worked to break the OpenDocument momentum the same way.

  1. Speaking of copies, has using ‘immutable’ objects become popular at all in the embedded world? Obviously it uses more memory and CPU, but it’s considered a much more reliable method in larger systems and multi-threaded systems.

    1. Using immutable objects is only practical in a garbage collected language, so no, it isn’t very usable in an embedded environment.
      Moreover, it is considered more reliable in languages like java which do not have a “default is deep copy” policy, so if you pass around the same mutable object you may get unexpected side effects. C++ on the other hand defaults to deep copy, so it’s not much of a reliability issue.

      1. It is not true that C++ performs deep copy by default. If you have a pointer in your class/struct, the default-generated copy constructor performs a bitwise copy of it. The pointed-to data is not copied; that’s a shallow copy. It’s an old post, yet I found it by Googling something, so I think this must be cleared up. C++ does NOT deep copy by default.

    1. You make sure your code doesn’t run out of memory. Pre-allocate all of your resources on the heap, don’t use dynamic data structures (or pull from a pool of resources and handle it gracefully when you run out), prevent the device from accepting user requests that it can’t service, etc.

      Coding to make something bulletproof is a fair bit different in the kinds of techniques you apply compared with standard software engineering best practices, which are optimized for speed and simplicity of development at the expense of performance and resource efficiency.

        1. Well yeah, if you are doing something that can catastrophically fail in a copy constructor, you would use an exception or do something else similarly drastic like loop until a watchdog reset or directly trigger a reboot. I guess, if you wanted to, you could do something messy like mark the copied object as “invalid” and have a method to check its validity after the copy.

          In an embedded system, though, you’re better off just writing software that doesn’t implicitly do anything at run-time that can fail. Instead of making a really complicated copy constructor, make the copy constructor private and implement a copy function that can return a success bool.

          It’s really quite easy to avoid dynamic memory at run-time, though. I’ve worked on multiple projects with high reliability/safety requirements, and rarely does the solution to a non-trivial problem require runtime dynamic memory allocation. Don’t use STL data structures you don’t understand and don’t call new anywhere except where you really mean it. With everything either static or allocated on the stack, you can be fairly confident that your program will either crash instantly or run indefinitely without a memory fault (unless you use unbounded recursion!)

          This means that you may need to write your C++ code differently than how you would for a desktop or smartphone, but then maybe the next time you write a desktop app, you’ll use some of the same techniques and you’ll end up with a more reliable program.
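A minimal sketch of the "copy function that returns a success bool" idea follows. The class `Message` and its members are hypothetical. It uses C++11 `= delete` rather than a private copy constructor, which has the same effect of blocking implicit copies.

```cpp
#include <cstddef>

// Hypothetical fixed-capacity message; all storage lives inside the object.
class Message {
public:
	static const std::size_t kCapacity = 16;

	Message() = default;

	// No implicit copies: copying must go through copyFrom().
	Message(const Message&) = delete;
	Message& operator=(const Message&) = delete;

	// Stores len bytes; returns false instead of overflowing the buffer.
	bool set(const char* bytes, std::size_t len) {
		if (bytes == nullptr || len > kCapacity) return false;
		for (std::size_t i = 0; i < len; ++i) mBytes[i] = bytes[i];
		mLength = len;
		return true;
	}

	// Explicit copy that reports success to the caller.
	bool copyFrom(const Message& other) { return set(other.mBytes, other.mLength); }

	std::size_t length() const { return mLength; }

private:
	char mBytes[kCapacity] {};
	std::size_t mLength { 0 };
};
```

In this sketch the copy cannot actually fail because both objects have the same capacity. The bool return is where a real failure, such as an exhausted resource pool, would be reported.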

  2. I worked on OSCAR program–C++ for AV8B program. It was a pilot program to prove viability of modern language/modern design (OOP)/”COTS” hardware(Power PC–you know, the nuclear-hardened variety you’d get from Best Buy) for OFPs for military aircraft, and I was under the impression ours was the first. … I could be mistaken. Our first flight was May 1998. For comparison, the googs says F-35 first flight was Dec 2006.

    1. Similar story, I worked on avionics for the F18. Change logs have dates going back to the mid-90s. The code base is used on all F18 variants (I think). With the exception of the BSP, OS, (…and a few interrupt handlers) the whole thing is C++.

  3. Good article. I just want to add that some people afraid of C++ in embedded environments say something like this:
    “If I see a=b; in C I can figure out the time it takes without looking at the rest of the code, while if I see a=b; in C++ it may call a copy constructor and it may take a long time”

    That is nonsense and the truth is that even in C if a and b are int it takes a small time, while if a and b are instances of a large struct it may take much more time to make the copy.

    The C compiler is even allowed to replace the struct assignment with a memcpy under the hood. So even in C a simple assignment may call a function!

    Just as an example, I’ve tried compiling the following C code with gcc optimization O2 on Linux:
    — begin test.c —
    struct S {
    int array[4000];
    };
    void foo() {
    volatile struct S s1,s2;
    s1 = s2;
    }
    — end test.c —

    and look at the assembly generated:
    — begin test.s —
    subq $32024, %rsp
    .cfi_def_cfa_offset 32032
    movl $16000, %edx
    leaq 16000(%rsp), %rsi
    movq %rsp, %rdi
    movq %fs:40, %rax
    movq %rax, 32008(%rsp)
    xorl %eax, %eax
    call memcpy
    movq 32008(%rsp), %rax
    xorq %fs:40, %rax
    jne .L5
    addq $32024, %rsp
    .cfi_def_cfa_offset 8
    call __stack_chk_fail
    — end test.s —

    and yes, it calls memcpy.

      1. How common it is depends entirely on what you are doing. If you want to do something similar to “a + b” in C (using a function) you will need the exact same resources. Also, if you’re using C++11 you won’t copy anything doing “a+b” unless you force it, the compiler will rely on move semantics.

        1. If you do the exact same thing in C, you’ll obviously need the same resources. The difference is that a C programmer is much more aware of what happens. In my C code, I rarely copy structs, and when I do, it’s a very conscious decision.

          Didn’t know about “move semantics”. That’s another problem with C++, it’s so damn complicated.
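For readers who have not met move semantics, here is a minimal sketch. The class `Buffer`, its counters, and `moveAvoidsCopy` are hypothetical names used only for this illustration.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical class that owns heap storage and counts copies and moves.
class Buffer {
public:
	static int copies;
	static int moves;

	explicit Buffer(std::size_t n) : mSize { n }, mData { new int[n]() } {}

	// Copy constructor: allocates new storage and duplicates the contents.
	Buffer(const Buffer& other) : mSize { other.mSize }, mData { new int[other.mSize] } {
		for (std::size_t i = 0; i < mSize; ++i) mData[i] = other.mData[i];
		++copies;
	}

	// Move constructor: take over the storage and leave the source empty.
	Buffer(Buffer&& other) noexcept : mSize { other.mSize }, mData { other.mData } {
		other.mSize = 0;
		other.mData = nullptr;
		++moves;
	}

	Buffer& operator=(const Buffer&) = delete;

	~Buffer() { delete[] mData; }

	std::size_t size() const { return mSize; }

private:
	std::size_t mSize;
	int* mData;
};

int Buffer::copies = 0;
int Buffer::moves = 0;

// Returns true when the storage changed hands without a deep copy.
bool moveAvoidsCopy() {
	Buffer a { 8 };
	Buffer b { std::move(a) };   // explicit move; a is left empty
	return Buffer::copies == 0 && Buffer::moves == 1 && b.size() == 8 && a.size() == 0;
}
```

For a type like Triple that holds only three ints, a move does the same work as a copy. The saving appears only when an object owns resources it can hand over.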

  4. Master C. Your mind will be blown out several times, but if you retry and survive you will be a code master.
    C++ is a straitjacket and padded cell but you can still hurt yourself with bloat and leaks.
    Only use C++ AFTER you’ve mastered C.
    Do not code without full understanding.

    1. I really disagree with this. Go directly to C++. C++ provides everything C does so there is no reason to wait. C will distract you from learning the capabilities of C++. The approach to programming with C++ is ultimately different from C.

      1. I would agree with this *only* when all of the following conditions are met:

        1. You learn proper object-oriented coding methodologies *before* you learn C++

        2. You learn how to use C++ *properly*.

        3. You learn what C++ is actually doing under the hood, particularly in terms of memory allocation.

        4. You step through the assembler on every program you write while learning.

        Now, it is true that you should do all of this when learning to use C in the embedded realm as well. However, as C is really nothing more than a glorified macro language, all of these steps are far, far easier. This is why *I* recommend that everyone learn C first and then learn C++.

      2. Following recommendations, half a lifetime ago I tried to learn C++ with no prior understanding of C using a C++ in 30 days book twice. I’m not saying it’s the worst thing I’ve ever done but it’s in my top 10 life mistakes and it put me off trying to learn C for almost a decade. The method and the description was completely alien to anything I’d done before (Acorn BASIC, FORTRAN77, Assembler (ARM, x86, others), Prolog). When I did start to dabble in C, I found the syntax fairly friendly and quite tolerant. Practically error messages often bear no relation to the mistake, rank amateurs can be quite creative and this produces a fair amount of stress. On the whole though getting to a dabble, read and edit level of understanding of C was hugely easier than attempting C++ from the bottom of a sheer cliff. I have a science background and I expanded into EE.

  5. “shoulda, coulda, woulda…” Taken my code from javascript or python and dropped it down to RTOS?

    Is the worst feeling one can be infected by. Hell. How about the a PHP micro-server running on a WELL KNOWN commercial appliance? WITH A MEM LEAK?

    Busy box nothing. Nothing is faster at killing dreams than trying to debug a PHP front-end running on a corporate based busy box.

    The biggest issue of any developer be it PERL/PYTHON/RUBY or C++ is the extrapolated LIBS that you want on your platform.

    If you have hours upon hours to expend by passing the CORP firewall to HTTP d/l the required packages. one-at-a-time. (bypas CPAN, EGG, ETC) More power to you. The DEPENDENCY HELL is most EPIC RABBIT HOLE you will face.

    My thanks to [Rud Merriam] for the break-down BUT can you break down the amount of time you setup your workstation? dev env? And finally the commit platform?

    Aside from the standard “use arrayfire, llvm, , mathlab, or wolfram. to compute best path analysis”

    Do we have a CLEAR path on optimizing C++ to C?

  6. Hello and thanks for this series.

    I’m a bit late but, aren’t you returning a value from the stack when you do:

    struct Triple add(struct Triple* lhs, struct Triple* rhs) {
    struct Triple result;
    result.mX = lhs->mX + rhs->mX;
    result.mY = lhs->mY + rhs->mY;
    result.mZ = lhs->mZ + rhs->mZ;
    /* Here, shouldn't result be allocated from malloc or similar */
    return result;
    }

  7. Hello and thanks for this series.

    Perhaps I am missing something but when you do:

    struct Triple add(struct Triple* lhs, struct Triple* rhs) {
    struct Triple result;
    result.mX = lhs->mX + rhs->mX;
    result.mY = lhs->mY + rhs->mY;
    result.mZ = lhs->mZ + rhs->mZ;
    return result;
    }

    aren’t you returning a structure allocated on the stack? I expect that to blow up later.

    1. The function returns a value that is on the stack. The value is copied from result to t3. A good compiler may optimize away the copy and use t3 for the calculation.

      struct Triple t3 = add(&t1, &t2);

      A critical part of the article is that C does copies, also. C++ makes you aware this is occurring by telling you about copy constructors.

  8. One good thing you’ll learn from C is how important struct initialization is.
    It is, in my opinion, the most important reason to go to C++.

    Another reason to go to C++ is syntax: a+b is easier to understand than add ( &a, &b ) in pure C, notwithstanding the fact that C cannot return complex objects.

    But let’s be honest, C++ is by far more difficult than C to master but the end result is also a much more secure/powerful application! Try replacing all the printf in your code with stream operators… but yeah you’ll have to learn stl too which is important.

    In the particular field of embedded programming, where I have 10 years of experience, I’d say C++11 reigns supreme. You’ll always need std::atomic for modeling hardware memory-mapped registers, saving you from some huge headaches.

  9. Why don’t you do something like this:
    void add(struct Triple* result, struct Triple* lhs, struct Triple* rhs) {
    result->mX = lhs->mX + rhs->mX;
    result->mY = lhs->mY + rhs->mY;
    result->mZ = lhs->mZ + rhs->mZ;
    }

    struct Triple t3;
    add(&t3, &t1, &t2);

    And in C++ why don’t you do Triple t3 += t1 += t2; ?
