When working on any software project, the developers have to balance releasing on time with optimizations. As long as you are hitting your desired time constraints, why not just ship it earlier? It’s no secret that Super Mario 64, a hotly anticipated launch title for the Nintendo 64 console in 1996, had a lot of optimizations left on the table in order to get it out the door on time. In that spirit, [Kaze Emanuar] has been plumbing the depths of the code, refactoring and tweaking until he had a version with serious performance gains.
Why would anyone spend time improving the code for an old game that only runs on hardware released over two decades ago? There exists a healthy modding community for the game, and many of the newer levels that people are creating are more ambitious than what the original game could handle. But with the performance improvements that [Kaze] has been working on, your budget for larger and more complex levels suddenly becomes much more significant. In addition, it’s rumored that a multi-player mode was originally planned for the game, but Nintendo had to scrap the feature when it was found that the frame rate while rendering two cameras wasn’t up to snuff. With these optimizations, the game can now handle two players easily.
[Kaze] has a multi-step plan for improving the performance involving RAM alignment, compiler optimizations, rendering improvements, physics optimizations, and generally reducing “jankiness.” To be fair to the developers at Nintendo, back then they were working with brand new hardware and pushing the boundaries of what home consoles were capable of. Modeling software, toolchains, compilers, and other supporting infrastructure have vastly improved over the last 20+ years. Along the way, we’ve picked up many tricks around rendering that just weren’t as common back then.
The central theme of [Kaze]’s work is optimizing Rambus usage. As the RCP and the CPU have to share it, the goal is to have as little contention as possible. This means laying out items to improve cachability and asking the compiler to generate smaller code rather than faster code (no loop unrolling here). In addition, certain data structures can be put into particular regions of memory that are write-only or read-only to improve resource contention. Logic bugs are fixed and rendering techniques were improved. The initial results are quite impressive, and while he isn’t done, we’re very much looking forward to playing with the final product.
With the Nintendo 64 on its way to becoming a mainline-supported Linux platform, the old console is certainly seeing a lot of love these days.