Programming With Rust

[Image: the Rust language logo being branded onto a microcontroller housing]

Do hardware hackers need a new programming language? Your first answer might be no, but hold off on deciding until you’ve heard about a new language called Rust.

We all know real hackers use assembly language to program CPUs directly, right? Well, most of us don’t write as much assembly as we used to. Languages like C can generate tight, predictable code and are easier to manage.

Although some people use more abstract languages in some embedded systems, it is no secret that for real-time systems, device driver development, and other similar tasks, you want a language that doesn’t obscure underlying details or generate code that’s difficult to reason about (garbage collection, for example). Special techniques (like the Real-Time Specification for Java) can adapt higher-level languages to these jobs, but in the general case a lean language is still what most programmers reach for when programming bare metal.

Even C++, which is very popular, obscures some details if you use features like virtual functions (a controversial subject), although it is workable. It is attractive to get the benefit of modern programming tools even if they conceal some of the underlying code more than straight C does.

About Rust

That’s where Rust comes in. I could describe what Rust attempts to achieve, but it is probably easier to just quote the first part of the Rust documentation:

Rust is a systems programming language focused on three goals: safety, speed, and concurrency. It maintains these goals without having a garbage collector, making it a useful language for a number of use cases other languages aren’t good at: embedding in other languages, programs with specific space and time requirements, and writing low-level code, like device drivers and operating systems. It improves on current languages targeting this space by having a number of compile-time safety checks that produce no runtime overhead, while eliminating all data races. Rust also aims to achieve ‘zero-cost abstractions’ even though some of these abstractions feel like those of a high-level language. Even then, Rust still allows precise control like a low-level language would.

High goals, indeed. Rust promises guaranteed memory safety, threads without data races, trait-based generics, type inference, a minimal runtime, and efficient C bindings (the compiler uses LLVM, by the way). You can download the software for Linux, Mac, or Windows. You can even edit and run example code in your browser right from the home page with no software installed.

Rusty Hardware

But, wait. I mentioned this was a language for hardware hackers, right? Since Rust targets Linux, it is usable on the many single-board computers that run Linux. There is an unofficial repository that supports several ARM-based boards to make it easy to put Rust on those computers.

If you want to see an example of Rust on embedded hardware, [Andy Grove] (not the one from Intel) recently posted a hello-world LED blinking example using Rust and a BeagleBone Black. He also found a crate (a Rust library, more or less) to do digital I/O.

Of course, blinking an LED isn’t very compelling, but it does illustrate that the system will work on an embedded board. Rust does a lot of safety checks at compile time. It also has an unusual ownership scheme: each variable owns the memory behind it, and when the variable goes out of scope, the memory is deallocated. Only one variable can own a given piece of memory at a time. You can “borrow” references to the variable, but when you do, you may prevent future changes to the data (enforced at compile time).

Owning and Borrowing Memory

For example, suppose you have this (lifted from the Rust documentation):

fn main() {
 let mut x = vec!["Hello", "world"];
 let y = &x[0];
}

The variable x is mutable (and automatically typed). It owns the memory that makes up the vector created by the vec macro (macro invocation uses the exclamation point). Variable y is a reference to part of x. It doesn’t own any memory; it is just a reference. This similar code, however, will fail to compile:

fn main() {
  let mut x = vec!["Hello", "world"];
  let y = &x[0];
  x.push("foo");
}

The issue is that changing x (with push) may move the vector’s storage, which would invalidate the reference y. The compiler catches this (and is smart enough to know if y fell out of scope before the change to x). You can also clone the vector element into a new variable, which would then own that copy of the element.
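For instance, a minimal sketch (my own, not from the Rust documentation) of the cloning approach, using to_string() to copy the element into a new owned value:

```rust
fn main() {
    let mut x = vec!["Hello", "world"];
    // Clone the element into its own owned String; y no longer borrows from x.
    let y = x[0].to_string();
    x.push("foo"); // mutating x is now fine, since nothing borrows it
    println!("{} / {:?}", y, x);
}
```

Since y owns an independent copy, the compiler has no objection to the later push.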

For special cases, there are ways to mark code unsafe and bypass the usual Rust checks. This is just one example of how Rust works; it takes a little reading and working with code examples to get a feel for all the differences. Luckily, the documentation is pretty good.
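As a small illustration of my own: creating a raw pointer is allowed in safe code, but dereferencing one must be wrapped in an unsafe block:

```rust
fn main() {
    let x: u32 = 42;
    let p = &x as *const u32; // taking a raw pointer is safe
    // Dereferencing it is not; the compiler makes you opt in explicitly.
    let v = unsafe { *p };
    println!("{}", v); // prints 42
}
```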

Other Tools and History

Rust also uses a tool called Cargo that manages the build process. In addition to compiling, it downloads and builds dependencies. Rust calls compilation units crates (a play on cargo, I suppose). So where you would think of libraries in C, you’d talk about crates in Rust.
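For the curious, a Cargo manifest (Cargo.toml) is just a small TOML file; the package name and the dependency below are made up purely for illustration:

```toml
[package]
name = "blinky"            # hypothetical project name
version = "0.1.0"
authors = ["You <you@example.com>"]

[dependencies]
rand = "0.3"               # crates are fetched from crates.io and built by Cargo
```

Running cargo build resolves and compiles the dependencies before building your crate.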

Rust isn’t exactly new. Mozilla employee Graydon Hoare started it as a personal project, and Mozilla began sponsoring it in 2009. Rust 1.0, the first stable release, occurred on May 15, 2015.

Do We Care?

Do we need a new bare metal language? Maybe. I haven’t tried Rust in a real project yet, so I don’t know if it will really perform or not. However, I applaud the idea of finding more problems at compile time. One thing I really like is that Rust doesn’t guess what’s best for you. There are multiple types of pointers, memory cells, and synchronization primitives that allow you to pick between different (well documented) trade-offs. As I mentioned, you can even mark code unsafe and sidestep some checks.

Launching a new language is hard. Only time will tell if Rust will find a home on hackers’ hard drives.

108 thoughts on “Programming With Rust”

    1. There’s currently some work going into merging avr-llvm into LLVM proper upstream: http://reviews.llvm.org/rL251471
      This will hopefully mean the dawn of a lot of nice new languages such as Rust, Nim, etc. for the ever popular AVR platform.
      I’m excited to see what the compile-time niceties of Rust can bring from a code safety, code readability and performance standpoint to the embedded world.

  1. > Do we need a new bare metal language?

    It baffles me that the question has to be asked, much less answered with “maybe”, given how thoroughly awful the design of C is. Nullable pointers, zero-terminated strings, ridiculous overuse of undefined behavior, lack of namespacing or usable abstractions… Not to mention being even more hostile to kernel and microcontroller developers than to everyone else: bitfields are broken and there’s no way to portably specify alignment on memory accesses in a language that’s for some baffling reason called “close to the metal”, if you’ll excuse my use of the phrase.

      1. Given how the vast majority of existing software invokes undefined behavior dynamically, I agree: yes, you can compute garbage very quickly. That must be valuable.

        Overzealous use of undefined behavior also leads to innumerable vulnerabilities that are high-impact and are hard to discover: e.g. https://bugs.chromium.org/p/nativeclient/issues/detail?id=245 or https://lwn.net/Articles/342330/.

        Nullable references specifically are a “billion-dollar mistake” in the words of the person who invented them: https://www.linkedin.com/pulse/20141126171912-7082046-tony-hoare-invention-of-the-null-reference-a-billion-dollar-mistake. Using an option type (https://en.wikipedia.org/wiki/Option_type) instead is both more convenient and vastly safer; this type can be translated by the compiler into a nullable reference and optimized in a similar fashion, which is what happens in Rust.
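To make that concrete, here is a small sketch (mine, not the commenter’s) of Rust’s Option type in action; the compiler forces the caller to handle the None case before using the value:

```rust
// A lookup that may fail returns Option instead of a nullable pointer.
fn find_even(values: &[i32]) -> Option<&i32> {
    values.iter().find(|&&n| n % 2 == 0)
}

fn main() {
    match find_even(&[1, 3, 4]) {
        Some(n) => println!("found {}", n), // prints "found 4"
        None => println!("no even number"),
    }
}
```

Because Option<&i32> is pointer-sized, this costs nothing over a raw nullable pointer at runtime.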

        Zero-terminated strings are literally the worst way to implement a string. You cannot compute the length quickly, you cannot store a null byte (which leads to vulnerabilities when some other language, e.g. Java or Ruby, sanitizes a string and then passes it to C, which uses a truncated version with a different meaning instead), and as a direct consequence of this design, most of the string manipulation functions C gives you are easier to use in a way that introduces a buffer overflow vulnerability than not.

        Together, these properties of C combine to give us a language that is nearly impossible to write secure code in, even for highly experienced programmers, who are a minority anyway.

        1. You’re complaining that C is not like other, more dynamic languages that support the things you wish for. However, C is not a language for doing those things; yet it is often used to create languages that do. Its flexibility is in the design, and to do otherwise would make it less flexible, slower, and larger, thus defeating the purpose for which it was created.

          I find it interesting when anyone says “thoroughly awful the design of C is” while ignoring that, today, it’s one of the two most used languages in the world, designed by computer scientists at Bell Labs (the most prestigious lab in the world at the time) for the most used operating system in the world, and that many other languages base their design on it.

          1. Nobody is ignoring that C is widespread or effective at what it does. That is, in fact, the reason they complain about it- it *also* has a thoroughly awful design for writing robust and secure software, and those problems can be fixed *without* making a language less flexible, or slower, or generate bigger binaries. See Rust, the point of the article you’re commenting on.

      2. Nullable pointers lead to runtime exceptions trying to dereference them. Null terminated strings lead to unterminated strings, and to being unable to represent binary data containing nulls.

        1. If you don’t produce null pointers in your code, there won’t be any exceptions dereferencing them. It’s up to you. And if you do have a good reason to produce null pointers, then you just need to check before dereferencing.

          Not being able to embed an ASCII NUL in a string is a disadvantage, but storing the length in the string isn’t universally better. How many bytes for the length, and how are they aligned?

          1. Right, so all you have to do to not be bitten by this incredibly common issue that frequently results in vulnerabilities is be a perfect programmer who never makes mistakes.

            Or, here in the real world, we could try and build languages that detect or prevent common mistakes that real world people make as early as possible.

          2. It’s not all that common that a pointer can be null by accident. Usually it has a meaning, like malloc() returning null when it’s out of memory. If you didn’t have null pointers, you’d have to encode the out-of-memory condition in some other way, and test for it just the same.

            Common mistakes are also easily fixed. It’s the really deep mistakes that present the biggest problem, and they can’t be fixed by the language.

          3. You literally just cited a case where you can accidentally end up with a null pointer – not checking the return value of malloc(). As others have pointed out, other languages provide a number of ways to represent a possibly-null return value in a way that causes a compile-time error if you don’t handle it properly.

          4. Aha, so you still get a nullable type, but you are forced to check for it, and if it’s not null, you can convert it to a non-null pointer type that you don’t have to check any more. Interesting.

          5. Actually, from what I can find online about Rust, failed allocations seem to cause an abort, although I can’t be sure because a good reference seems to be lacking.

            If an abort is really what is happening, I would hesitate to call this an improvement over returning a null pointer. At least with a null pointer, I can check it myself, and decide what is the best way to recover.

            In addition, Rust seems to rely a lot on having a functional heap. String manipulations, such as adding a letter to an existing string, can cause reallocations, which could potentially fail. Without a graceful way to recover from this, I wouldn’t call this “safe”.

          6. Aborting is safe in that it doesn’t allow for undefined behaviour. Checking for failures is fine in theory, but doing so consistently, and recovering in a meaningful manner is hard.

          7. Aborting may be safe in the sense that your credit card number won’t roll out of the database by accident. It may not be safe if you’re in the middle of controlling an airplane. If you prefer aborting over null pointer dereferencing, it would be trivial to adapt the C malloc() function to abort when it runs out of memory.

          8. Continuing executing with half-initialized data because somewhere a return value wasn’t checked or a buffer size was incorrectly calculated is far worse than aborting and resetting.

          9. “Continuing executing with half-initialized data because somewhere a return value wasn’t checked or a buffer size was incorrectly calculated is far worse than aborting and resetting.”

            That’s like saying that being shot in the head is worse than being stabbed in the chest. They are both unacceptable. The proper solution is to check the return value, and to recover gracefully from any errors. At least C lets you do this. It’s great if people want to improve it to a point where you can’t forget it, but replacing it with an abort is not the solution. Like I said, it would have been an easy fix to implement an abort in the C library call for malloc() when it runs out of memory.

          10. Rust’s use of abort on OOM is not a requirement, it’s “merely” what the standard library does. Embedded systems often avoid malloc-like allocators anyway, especially in critical situations like controlling an airplane, but for the times that you do want to allocate memory it is possible to replace the aborting allocator with something else.

          11. Aborting on OOM is a property of Rust’s standard library, not the language. You can replace it in situations like embedded programming where it’s an inappropriate choice (though you typically don’t want to be doing dynamic allocation in an airplane controller anyway).

          12. > Actually, from what I can find on-line about Rust
            > is that failed allocations seem to cause an abort,
            > although I can’t be sure because a good reference
            > seems to be lacking.

            Rust itself doesn’t abort on OOM, it’s the dynamically-allocating portion of the stdlib that does. It’s generally expected that embedded programmers will jettison this portion of the stdlib for the non-dynamically-allocating subset of the stdlib (called the “core” lib). If you want you can even layer your own faux-stdlib on top of this core lib and reimplement basic things like malloc yourself and make it have whatever behavior you need. Both the core lib and stdlib are all written in Rust and can be found in the repo on Github if you’d like to see how it handles low-level tasks like this.

          13. Using a “string” to store and manipulate binary data is itself somewhat hacky. A string in C is just a byte array. The string manipulation functions expect null termination but you don’t have to use those functions. You don’t need to use strcpy and then complain about it when you really should have used memcpy instead.

          14. > Using a “string” to store and manipulate binary data is itself somewhat hacky.

            Which is why you would use a byte array instead- Rust strings handle utf-8 above and beyond vanilla byte arrays.

      3. Nullable pointers leave open the chance that you forget to set them before dereferencing, leading to errors. Non-nullable pointers, such as C++’s references, avoid this problem, but then there’s no way to explicitly denote the “unset” state. You generally get around that by limiting the scope of the reference though (Yes, it is possible to set a reference to NULL in C++, but that’s only because C++ supports nullable pointers too).

        Similarly, zero-terminated strings have the potential for the programmer to forget the zero (or have it be overwritten), leading to other types of errors.

        You can get used to these pitfalls, to the point where avoiding them seems straightforward and obvious. But might it be nice if you didn’t have to? Or didn’t have to learn them before becoming a proficient C programmer?

        I’m not a Rust programmer mind you, so IDK if the tradeoffs are worth it. Certainly, some people have been bitten by these pitfalls enough to think they are fundamental flaws in the language.

          1. Length-encoded strings aren’t as efficient for many simple operations. Also, in Rust it seems you can only create strings on the heap, and some string operations can cause reallocations, which may result in aborts. In a small embedded system there may not even be a heap.

          2. “Length encoded strings aren’t as efficient for many simple operations”
            Like what? I can find many operations that are less efficient on null terminated strings. For example, given a string if you need to return the mid element, you need to traverse all the string to find the length first if it’s a null terminated string. In the other case you can simply access the mid element in constant time.

          3. Can you give an example of an operation that is more efficient with null terminated over a size_t length string? Neither format is unlimited in length, because both require a pointer. Having spent some time in the past writing word parallel strlen, strcmp and strstr, I can assure you that a length prefix is very nice to have.

            Don’t confuse different with worse. I found Rust to be faster than C for my own highly performance-critical routines, and the memory overhead to be lower. The language shootout echoes my experience.

          4. “Can you give an example of an operation that is more efficient with null terminated over a size_t length string?”

            A common operation is to build a C string by adding single characters like this: *s++ = c

            Append character to Rust string involves updating the length, comparing with max length, and possibly reallocating the string (which apparently could result in an abort if you’re unlucky), and copying it to new location.

            And if you don’t like the C strings, you can always use a different library. Apart from string constants, C doesn’t really have a built-in string type anyway.

          5. Rust strings do not have to be on the heap any more than C strings do, and can also be preallocated with extra capacity just as C strings can be. And, once you’re at that point, you have to store and update a size anyway- appending to a C string involves just as much reallocation, and checking the length (the more frequent operation) is slower than updating the length (the less frequent operation).

          6. > in Rust it seems you can only create strings on the heap

            If you want, you can model strings in Rust just as C does: create an array on the stack and poke bytes into it. The built-in String type exists to do a lot more than you’d expect from C strings, such as ensure well-formed Unicode.
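A quick sketch of that idea (my own example): a fixed buffer on the stack with a manual length, no heap involved:

```rust
fn main() {
    // A C-style buffer on the stack: fixed storage plus an explicit length.
    let mut buf = [0u8; 16];
    let mut len = 0;
    for &b in b"hello" {
        buf[len] = b; // equivalent of *s++ = c, but the indexing is bounds-checked
        len += 1;
    }
    // View the filled portion as a string slice (checked for valid UTF-8).
    let s = std::str::from_utf8(&buf[..len]).unwrap();
    println!("{}", s); // prints "hello"
}
```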

          7. > If you want, you can model strings in Rust just as C does: create an array on the stack and poke bytes into it

            Yeah, but then you lose all benefits of length checking, or not ?

            > appending to a C string involves just as much reallocation, and checking the length (the more frequent operation) is slower than updating the length

            A simple *s++ = c, doesn’t involve any reallocation. Of course, it requires the programmer to be sure that there’s room, but that check doesn’t have to be done for each single character.

          8. > A simple *s++ = c, doesn’t involve any reallocation. Of course, it requires the programmer to be sure that there’s room, but that check doesn’t have to be done for each single character.

            …and the same is true of Rust. Is there any *real* problem you see with how Rust handles strings, or is it just not what you’re used to?

          9. *s++ = c is not safe by itself, as the string is no longer null terminated. It is also a byte-wise operation, which misses the benefit of SIMD. The length field can live in a register while you are performing the operations, and the equivalent operation could be s[l++] = c; which is safe (up to the allocation length), is easy to parallelize, and avoids expensive cache data hazards.

            Other commenters have addressed your other insecurities. C is the only mainstream language that uses null-terminated strings (even C++ doesn’t!), for a good reason.

    1. C may have some issues, but the issues you mention are features, except for “ridiculous overuse of undefined behavior”. The other items you mention are where it gets its speed and small size. Removing the training wheels allows one to really get the maximum performance out of a high-level language. Lack of portability is REQUIRED in order to produce highly optimized, small code. You can’t have something fast, small, and universal. It’s just not possible with today’s computing power. Maybe when quantum compilers show up, you can write a Java program that will compile down to the optimized machine code for a specific target, but not today.

      1. There are many issues with these statements.

        For one, it’s mostly false. Non-nullable pointers, namespacing, and the right abstractions do not increase code size or decrease performance. The underspecified behavior of bitfields and enums is a complete accident of the standardization process, since the incompatibilities introduced into the ABI are caused by arbitrary choices; the lack of a way to force the compiler to access memory using instructions of a given bit width doesn’t improve portability in any way either.

        For another, the belief that runtime performance is the most important characteristic of an embedded system is deeply misguided. A system must be first and foremost correct, which it isn’t when a bored teenager can find an RCE in a day. Only after it is correct should you worry about performance. Sure, if you have a hard realtime system, meeting your timing deadlines is part of correctness, but many systems aren’t.

        In part this is due to the lack of liability of software engineers for the suffering of the people whose lives are affected by their poorly written code.

      2. Sometimes it’s critical that you can specify the order and packing of struct members, even if it’s slower than optimal. There’s absolutely nothing preventing a language from allowing you to specify that when it’s important, and letting it optimise things itself when it’s not.

        Since there’s no native way to do that in C, you either end up having to use compiler-specific pragmas, relying on undefined behaviour, or doing things manually instead of using bitfields, which would otherwise be a perfect match.

    2. > Not to mention being even more hostile to kernel and microcontroller developers than to everyone else

      I know of at least four different kernels for quite advanced operating systems written in C. With (I expect) thousands of developers contributing to those projects for the past twenty years. What’s wrong with them (us)?

    3. It is obvious that you (whitequark) have a passion for hating C. And I tend to agree with some of the others that some of the things you consider flaws, I would (also) consider features. Yes, it is not perfect. And when there’s something better, then sure, I’ll be interested.
      But, right now, there is no other language that I can be as productive in when developing embedded applications. And it is mature (for good and bad), with really good compilers for most targets.
      You do have a lot of valid points, but I don’t think knocking C/C++ off its throne on the embedded market is gonna happen soon, and even when it happens, C is still gonna be around long after that.
      Love it or hate it, right now C is the best option for writing code in an at least semi-portable way, for embedded targets, with decent productivity (and maintainability) while still being effective. While I totally agree that correctness of the system is #1, compiled code size, resource usage and speed DO matter.

      1. The article is about a language designed with the explicit goal of being able to replace C in resource usage and speed, and it has largely accomplished those goals while simultaneously solving the problems whitequark mentioned. C will of course be around forever, but that’s a horrible argument for dismissing the possibility of replacing it *now* for new code.

          1. As has been explained to you several times, only the dynamically-allocating part of Rust’s standard library aborts on allocation failure. The language itself solves the null pointer problem by making pointers non-nullable by default and enabling the implementation of the Option and Result types- an allocator designed for embedded systems could easily use those to signal failure instead of an abort.

            As for bitfields, the language itself does not provide C’s red-herring-like underspecified bitfields, but it does provide a better macro system which has been used to implement nice syntax on top of manual bit masking and shifting.

            All of which you would know if you had done a little research into the language before arguing about it in the hackaday comments.

          2. “only the dynamically-allocating part of Rust’s standard library aborts on allocation failure”

            I understand, but that’s a very large part. For example, I was looking at the Redox OS, and it’s full of ‘push’ calls, which (correct me if I’m wrong) are based on dynamic reallocation that could potentially fail and panic the kernel. So, how would you rewrite this code to be just as safe as the Linux C kernel?

            By the way, I tried to do some research, but https://doc.rust-lang.org/ has been giving me page load errors, and other information is often conflicting or too brief.

          3. The dynamically-allocating part of the stdlib is for the most part libcollections (which includes, for example, String and Vec, but not &str or &[T]). If I were rewriting Linux in Rust, I would ditch libcollections for a library designed for that sort of system, that emphasizes things that do less allocation and that signals allocation failure with the Result type. The generated binary would look a lot like the C version, but the compiler would catch any attempts to allocate without handling potential failure.
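To sketch what “signaling allocation failure with the Result type” could look like, here is a toy example (the API below is invented for illustration, not a real library; a real version would check the allocator rather than a fixed budget):

```rust
#[derive(Debug, PartialEq)]
struct AllocError;

// Hypothetical fallible push: refuse to grow past a fixed budget instead of
// aborting, and let the caller decide how to recover.
fn try_push(v: &mut Vec<u8>, byte: u8, budget: usize) -> Result<(), AllocError> {
    if v.len() >= budget {
        return Err(AllocError);
    }
    v.push(byte);
    Ok(())
}

fn main() {
    let mut v = Vec::new();
    assert!(try_push(&mut v, 1, 2).is_ok());
    assert!(try_push(&mut v, 2, 2).is_ok());
    // The third push fails gracefully instead of panicking the "kernel".
    assert_eq!(try_push(&mut v, 3, 2), Err(AllocError));
    println!("allocation failure handled");
}
```

The compiler’s must-use warnings on Result are what would catch an unhandled failure at compile time.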

          4. “If I were rewriting Linux in Rust, I would ditch libcollections for a library designed for that sort of system”

            Sounds good, I wonder why the Redox developers didn’t do that, or why there isn’t such a library in the first place.

          5. > So where does the Rust manual specify the exact semantics of the Shl operator for different types and shift distances ?

            It’s a panic if you shift more than a type’s width, with methods like wrapping_shl and rotate_left for cases when you don’t want that to happen.

            > I wonder why the Redox developers didn’t do that, or why there isn’t such a library in the first place.

            It feels to me like Redox is (at least currently) more focused on having fun with OS programming than building a robust system. For example, they started out with most of their code marked as unsafe and only recently started trying to take advantage of Rust’s type system.

            As far as a more embedded-friendly library, there has been talk of writing something like that, but the (limited) development efforts have so far been elsewhere. It would certainly be a valuable tool, and I hope we get one (or several) someday.

          6. “It’s a panic if you shift more than a type’s width, with methods like wrapping_shl and rotate_left for cases when you don’t want that to happen”

            Where can I find this information? Also, suppose I want to rule out any possible panics, is there a way to tell the compiler that it should flag an error when using unsafe functions such as shl instead of wrapping_shl?

          7. Panicking from a shift is a specific case of overflow checking, which is described in RFC 560: https://github.com/rust-lang/rfcs/blob/master/text/0560-integer-overflow.md

            Rust uses a very specific definition of (un)safe, which is memory safety- no use-after-free, no data races, etc. Panicking is thus considered safe (and indeed, is often used as a way to preserve memory safety). One thing to note about overflow, however, is that it only panics in debug mode, and otherwise has defined behavior (i.e. won’t summon nasal demons).
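For reference, the explicit-behavior shift methods look like this (a small sketch of my own):

```rust
fn main() {
    let x: u32 = 1;
    // checked_shl returns None for an out-of-range shift instead of panicking.
    assert_eq!(x.checked_shl(33), None);
    // wrapping_shl masks the shift amount modulo the bit width: 33 % 32 == 1.
    assert_eq!(x.wrapping_shl(33), 2);
    // rotate_left rotates bits around rather than discarding them.
    assert_eq!(x.rotate_left(33), 2);
    println!("ok");
}
```

Each variant documents its overflow behavior in its name, so the choice of trade-off is visible at the call site.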

          8. “Shift operations (shl, shr on a value of with N can be passed a shift value greater-than-or-equal to N. It is unclear what behaviour should result from this, so the shift value is unconditionally masked to be modulo N to ensure that the argument is always in range.”

            This part from the RFC doesn’t mention panics. Also, default wrapping would not have been guaranteed to prevent the Chromium bug that whitequark mentioned: https://bugs.chromium.org/p/nativeclient/issues/detail?id=245
            Wrapping still allows the same bug, and relies on careful programmers knowing these details. This RFC also doesn’t specify what happens if the shift amount is negative, or if you do a shift on a signed value.

          9. That issue would have been caught in debug builds in Rust, because the shift would panic. The RFC says this a few lines above your quote:

            > The error conditions that can arise, and their defined results, are as follows. The intention is that the defined results are the same as the defined results today. The only change is that now a panic may result.

        1. I guess I measure things a little differently.

          My targets are low-end PIC10/12/16, STM8/STM32, and most other ARM parts. Is there a Rust port with debugger support for these targets? If not, I’m not interested.

          1. Rust just uses LLVM for code generation at the moment. This means ARM is pretty well supported, but targets like PIC and STM8 are probably not. You might be able to use LLVM’s C backend, if it still exists, but otherwise that is a legitimate advantage of C for the moment.

        1. Carpenters usually prefer working with fewer tools, like ripping and crosscutting with the same saw. They learn special techniques that allow them to work faster with the few tools at hand, even though the tools are not necessarily the best ones for the task at hand.

          The mark of an inexperienced carpenter is a cabinet or a chest full of tools of every size and description, or a number of complex power tools with attachments for every imaginable purpose, most of which are never used. You can watch some woodworking youtube videos for examples, where some people build elaborate jigs for making joinery with a router, where others don’t bother and just go straight to work with a chisel and mallet.

    4. >Nullable pointers,

      You might actually want a pointer that doesn’t point to anything.

      >zero-terminated strings,

      C strings are simple. If you want something “safer” or more flexible you can implement that. Your elaborations on this point seem to boil down to “if the terminator is lost or not there then bad things will happen”. If your data is corrupted, how the length of the string is stored is a non-issue; you’re already fucked. If you use strlen() etc. on “strings” that aren’t null terminated then that is a programmer error. C makes no promises about protecting anything from programmer error.

      FYI: there are tools out there that can find these sorts of programmer errors during your testing phase, and GCC has options for runtime bounds checking and the like that slow everything down but mean you don’t have to actually know what you’re doing.

      >ridiculous overuse of undefined behavior,

      The C spec, and specs like POSIX, say “this is how it should work; doing this is undefined”. It’s usually undefined because it’s platform-specific, and C doesn’t pretend to abstract that stuff away. Languages/VMs like Java pretend that they make everything conform to “the spec”, but in reality the underlying OS etc. can cause things that should run the same everywhere to do quite different things.

      >lack of namespacing or usable abstractions…

      Create a new file, mark whatever you want visible only in that file as static, and boom: you have a namespace.

      >Not to mention being even more hostile to kernel

      The most popular kernel is written in C, as are most other kernels.

      >and microcontroller developers than to everyone else:

      The millions of products that contain microcontrollers running compiled C code beg to differ.

      >there’s no way to portably specify alignment on memory accesses

      How do you make something inherently unportable, like alignment, portable? And is this really a common enough problem that it needs a whole new language to solve?

      1. The point is defaults: pointers default to being non-nullable, but you can express nullable pointers with the Option type; strings default to carrying a length, but you can still use C strings if you need to (and the stdlib provides them); symbols default to being namespaced so they don’t collide.

        As for undefined behavior, you seem to have completely missed the point. If you do something undefined, it’s not so much a problem that your program will become less portable, but that the compiler and optimizer assume you don’t do that and will break your program: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
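        The “defaults” argument above can be sketched in a few lines of Rust (the names here are illustrative, not from any real project):

        ```rust
        // A missing value is an explicit Option, so the compiler forces a
        // check before use; there is no null pointer to forget about.
        fn greet(name: Option<&str>) -> String {
            match name {
                Some(n) => format!("hello, {}", n),
                None => String::from("hello, stranger"),
            }
        }

        // Items are namespaced by the module they live in, so identical
        // names in different modules don't collide.
        mod geometry {
            pub fn area(w: u32, h: u32) -> u32 {
                w * h
            }
        }

        fn main() {
            assert_eq!(greet(Some("world")), "hello, world");
            assert_eq!(greet(None), "hello, stranger");
            assert_eq!(geometry::area(3, 4), 12);
        }
        ```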

        1. On the other hand, removing all the ‘undefined’ and ‘implementation-defined’ behavior isn’t a cure-all. First, it will make implementations suffer on some targets, because the native hardware doesn’t match the behavior the standard enforces. Second, it still requires the programmer to be aware of exactly what is specified to happen in rare corner cases.

          Also, some hardware deviates on purpose. A DSP may have saturating arithmetic, and people pick such a DSP for a project precisely because that’s what they want. It would be bad for the standard to enforce wrap-around overflow instead: not only would performance suffer, but the results would not be what was intended. The same applies to processors with non-standard integer widths.

          1. You’re still missing the point of C’s undefined behavior. You can’t rely on it *at all*, not even in the single-compiler/single-platform case: for example, the optimizer assumes that signed integers never overflow and generates buggy code if you assume otherwise, while the standard dictates that unsigned integers wrap, just like Rust.

            On the other hand, the optimizer cannot assume that implementation-defined or unspecified behavior never happens, so relying on it is merely non-portable rather than impossible. For example, right-shifting a negative signed integer is implementation-defined, and the order of evaluation of function arguments is unspecified.

            In the end, leaving things undefined is only one approach to performance. The approach you’re complaining about, to specify everything, will indeed cost performance without an easy way out. But Rust picks a third way, and it’s again about defaults.

            In C, accessing an uninitialized variable, or dereferencing a null pointer, or overflowing a signed int, invokes undefined behavior, so the optimizer assumes that they never happen, leading to extremely bizarre output if you rely on it (really, read that LLVM series if you haven’t).

            But in Rust, using an uninitialized variable is a compile-time error, accessing a nullable pointer before a null check is a type error, and you have to specify overflow behavior and width in the type, so you get the performance without the implicit hidden pitfalls. Writing your DSP code in Rust would just involve using a Saturating type.
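            For the overflow point, Rust’s integer types expose each behavior as an explicit, stable method, so choosing the semantics per operation is a one-liner:

            ```rust
            fn main() {
                let x: i32 = i32::MAX;
                // Saturating arithmetic clamps at the type's bounds.
                assert_eq!(x.saturating_add(1), i32::MAX);
                // Wrapping arithmetic is available, but only when asked for by name.
                assert_eq!(x.wrapping_add(1), i32::MIN);
                // Checked arithmetic reports overflow as None instead of invoking UB.
                assert_eq!(x.checked_add(1), None);
            }
            ```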

          2. “Writing your DSP code in Rust would just involve using a Saturating type.”

            Ah, cool. And what happens when you try to add regular signed integers on a DSP that only supports saturating arithmetic? A compile-time error, a slow library call, or can you specify?

          3. It doesn’t have to be slower, if you’re willing to use your own libcore that doesn’t implement the Add trait for primitive types; then you could get a compile-time error.

            Don’t quote me on that though; I don’t know how well supported that is right now, or even whether LLVM supports any DSPs.

          4. “It doesn’t have to be slower, if you’re willing to use your own libcore that doesn’t implement the Add trait for primitive types- then you could get a compile-time error”

            The scenario I see is someone on a DSP porting some old integer code for non-math stuff. In that case, you don’t want a compile time error, so you’d have to use slow library calls.

            Of course, what will happen is that compiler vendors take shortcuts, leave out the overflow checks, and give you a footnote in the manual saying the overflow rules deviate from the standard. The same thing happens in C. For example, some platforms may give you 8-bit ints, in violation of the standard.

          5. Of course, along the same lines as removing the Add trait implementations, you could replace them with saturating implementations, giving you the same end result if that’s important enough to break with the standard.

          6. Don’t throw unspecified and undefined behaviour in the same category. They are totally different things.

            Unspecified or implementation-defined behaviour means that otherwise correct programs can behave differently between implementations. Invoking it is not necessarily an error, if the programmer is aware of the portability limitations it causes, or explicitly covers the possible variations. Some of the unspecified behaviour in C has merit, as it genuinely broadens possible use cases and/or significantly improves performance in certain situations. (Though other instances just seem to be stakeholders with existing implementations unwilling to agree on common standards…) Unspecified behaviour is also usually benign, as by itself it will typically result in incorrect results at worst if not properly taken into account.

            Undefined behaviour OTOH is invoked *only* by programming errors causing invalid operations. (Often in turn triggered by unspecified behaviours not taken into account…) It exists solely to avoid performance penalties from runtime checks, and is otherwise a very serious problem, as it regularly causes all kinds of security exploits. Rust however proves that through smart language design, it is possible to avoid this *without* any runtime penalty in most cases!
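            A small illustration of “defined behaviour without a mandatory runtime cost”, using nothing beyond the standard library:

            ```rust
            fn main() {
                let v = [10, 20, 30];
                // An out-of-range index is a defined panic with v[i], or a
                // plain None with the checked form; never undefined behaviour.
                assert_eq!(v.get(5), None);
                // Iterating sidesteps per-element bounds checks entirely, so
                // the safety guarantee needn't cost anything at runtime.
                let sum: i32 = v.iter().sum();
                assert_eq!(sum, 60);
            }
            ```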

  2. You might want to consider Nim (http://nim-lang.org). It is a very concise and elegant language that borrows modern features from popular languages, and it can compile to C. Its garbage collector is optional and can be disabled for manual memory management on resource limited platforms. It’s still under development, but already quite usable.

      1. C is an unreasonably overused language, chosen by people who are too lazy to understand (or don’t want to understand) that certain other languages can deterministically produce virtually the same disassembly as C when you restrict yourself to certain features. Yes, C++ and Rust can use dynamic allocation, but they do so only through SPECIFIC syntax or features. Don’t even get me started about the C++ stdlib… you SHOULD NOT use it. Or the Rust stdlib… Rust even has a no_std feature you can enable! :D

        If there should be an absolutely mandatory requirement, then it is to compile to LLVM IR, not to unportable C shit.

      2. It _currently_ compiles to C, as stated on its homepage, and apparently is very good at it, since it produces some of the fastest and smallest executables among new programming languages. When dealing with embedded systems, code size and speed matter much more than on desktop machines; I would consider Rust for systems programming and networking, since it was written with safety in mind, while Nim strives to be the faster and smaller one, which would make it ideal for small embedded boards.

        But… if you don’t give a rat’s ass about speed, Lua and tekUI will probably fit most bills (http://tekui.neoscientists.org/index.html)

      1. Because C is actually a really hard language to compile into unless you understand all the nuances of undefined behaviour. Better to compile into LLVM bitcode (which C compiles into via Clang).

  3. “There are no bad programs, only bad programmers.”

    Seriously, if Rust solved all programming problems I would learn it tomorrow and never touch C again; and if the world had wanted a type-safe language, we would all have moved to Ada back in the ’80s.

    1. Rust doesn’t solve all programming problems, nor does it claim to. It *does* solve some of the worst ones though, which still regularly plague programs written in C or C++, in spite of all recent tooling improvements etc.

      (The discussion so far doesn’t even mention the “fearless concurrency” aspect, i.e. avoiding memory errors also in multi-threaded programs — which is a much bigger boon yet than “only” avoiding memory errors in single threads…)

      Don’t get me wrong: I love C. I called it my favourite language for more than 20 years; and until recently I also considered all calls to abolish C and/or C++ ridiculous, as the only alternatives were managed high-level languages eminently unsuitable to replace C and C++ in many use cases.

      But Rust is different. Rust is a genuine improvement over C and C++, while retaining most of their nice properties, and being perfectly suitable as a drop-in replacement in most cases.

  4. A modern C++ compiler that enforces the rules from the C++ Core Guidelines project, used together with the Guidelines Support Library, should provide pretty much the same safety guarantees offered by Rust, with the advantage of not having to learn a completely new language. The real challenge is un-learning the old ways of doing things and purging them from existing code. Unfortunately the guidelines are still a work in progress, and IIRC Visual C++ 2015 SP1 is the only existing compiler with any support at all, and even that covers only a few of the basics.

    Bjarne Stroustrup’s keynote from CppCon’15 is definitely worth watching, along with Herb Sutter’s “Writing Good C++14… By Default”

  5. If the Rust team wrote the Linux kernel in Rust, they would prove the value of the language and provide tons of documentation from a real project. If not that, then some RTOS to take Pi-type computers to the next level.

      1. Development has been fast because they’ve only done the fun bits, like a window manager. Under the hood, the code is busy-polling the IDE drive:

        while self.ide_read(ATA_REG_STATUS) & ATA_SR_BSY == ATA_SR_BSY {}

    1. Nobody is going to rewrite the Linux kernel; it’s too big, and there’s relatively little benefit in doing so. Why rewrite the Linux kernel, keeping its architectural flaws, when you can create a new kernel? There are quite a few hobby OS projects and one serious one (Redox).

      If you do want a large chunk of real Rust code, look at the Servo project.

    2. Rewriting Linux is neither realistic nor reasonable. But one of the advantages is that Rust integrates well with other languages, and thus it’s perfectly possible to add new modules written in Rust to existing projects written in C. (Or most other languages for that matter.)
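      A minimal sketch of what that integration can look like: a Rust function exported with the C ABI, so existing C code can declare and call it (the function name here is invented for illustration):

      ```rust
      // Exported with an unmangled name and the C calling convention, so a
      // C translation unit can declare and call it directly:
      //   int32_t rust_add(int32_t a, int32_t b);
      #[no_mangle]
      pub extern "C" fn rust_add(a: i32, b: i32) -> i32 {
          a + b
      }

      fn main() {
          // The same function is, of course, callable from Rust too.
          assert_eq!(rust_add(2, 3), 5);
      }
      ```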

  6. … and in a few years’ time Rust will be “broken by design” according to people who spend more time on social media than actually doing anything productive, and there will be yet another language sent from the heavens to save us all. Everyone will get excited about “maybe using it in a real project in a few years’ time”, only to wait forever for whatever backend it uses to generate machine code to properly support something that isn’t x86 running a proper OS.

    In the meantime I’ll still be compiling C code for hardware that hasn’t been made in decades with the latest GCC, and tools like Valgrind for finding memory issues at runtime will continue to improve.

    1. Your insinuation that the designers of Rust spend more time on social media than being productive is nonsense. They created Rust to solve a specific problem (widespread vulnerabilities in C programs making their way into production), they use it for a specific project (the Servo browser engine), and they see real benefits from it, even on ARM targets.

      Hardware unsupported by LLVM is a legitimate reason not to use Rust, but it’s not a flaw in the language’s design, merely a shortcoming of its current implementation. C was like that once, too.

      1. While I have no hard evidence, I’m somewhat doubtful about the “C was like that once, too” claim. After all, C was specifically created to make Unix more portable — so I suspect it did stand up pretty well compared to other languages of the time even in its infancy…

        (I totally agree on the rest though.)

  7. This is such a good thread!! I hope everyone in the world reads this and may their brains be overloaded with glee! Exciting stuff if Rust can stand up to the plate like it claims. All your base!!

  8. Q: Why don’t all you erudites quit arguing about how many angels can dance on the head of a pin, and just write the goddam assembly language?

    A: Because you CAN’T write assembly language, but you CAN sound important if you do this! Go soak your heads.

  9. This “asdf” guy is clearly a Ruby hipster turned amateur systems programmer. He literally has no clue what he’s talking about. No real systems programmer I’ve ever met talks like that.

      1. “Continuing executing with half-initialized data because somewhere a return value wasn’t checked or a buffer size was incorrectly calculated is far worse than aborting and resetting.”

        Talk about false dichotomy…
