Between Tesla Motors’ automobiles and SpaceX’s rockets, Elon Musk’s engineers just have to be getting something right. In part, SpaceX’s success in landing their first stage rockets is due to analysis of telemetry data. You can see some of the data from their launch vehicles on the live videos and there is surely a lot more not shown.
An article in MIT Technology Review provides similar insight into how Tesla came from behind in autonomous vehicle operation by analyzing telemetry from their cars. Since 2014 the Model S has received an increasing number of sensors, all of which report their data over the vehicle’s always-on cellular channel. Sterling Anderson of Tesla reported they collect a million miles of data every 10 hours.
The same approach can help us improve our own systems, but many believe creating a log of key data is costly in time and resources. If your system is perfect (HA HA!) that would be a valid assessment. All too often such data becomes priceless when analysis explains why your drone or robot wanted to go left into a building instead of right into the open field.
I’ve used logging in both my professional and hobby projects. One system used satellite communications to obtain data from remote pipeline locations. Two notable locations were the side of a mountain in Nevada which was totally inaccessible in the winter and a swamp in the middle of Louisiana. These remote units, and their counterparts at the client’s data centers, provided log data about the operation of the communications to our 24/7 operations center. It wasn’t unusual for our operators to contact a client about a remote site going down before the client’s operators noticed the problem.
The amount of data to log varies with the system and its state of development. In a production system the log consists of high-level data and operational steps. A system undergoing intensive testing and debugging will record extensive details, following data and calculations step by step through the processing. The level is determined by the likelihood of failure and the data required to determine why a failure occurred. Some systems allow the level to be adjusted while operating, while others require recompiling.
Usually the most intensive is the debug level, which tracks all the sensor readings, calculations, and outputs in great detail. The least intensive might capture only the sensors, outputs, and the results of calculations that go directly into determining the outputs. Usually there is at least one intermediate level available, and possibly more.
But any way you look at it, people who are serious about building and maintaining complex and robust systems include data logging in their design plan. If you aren’t already doing this with every one of your projects you should be.
How to Organize Data for Journaling (Data Logging)
I call my system a journaling system to avoid name conflicts with logarithm functions. It uses three levels: detail, trace, and log. These can be combined with another flag which tracks function calls. Here is a sample output file, organized as four columns of data.
16:17:50:125 D journal       Journal srr091213.journal opened
16:17:50:140 D journal       Cleanup old journal files
16:17:50:140 D journal       Journal opened
16:17:50:140 D journal       Journal level: eDetail
16:17:50:140 D configuration task: search
16:17:50:140 D configuration lport: COM4
16:17:50:140 D configuration ldevice: 11
16:17:50:140 D configuration rport: COM3
16:17:50:140 D configuration rdevice: 10
16:17:50:140 D configuration mport: com5
16:17:50:140 D configuration mdevice: 12
16:17:50:140 D configuration ttlport: com6
16:17:50:140 D configuration inhibit: 0
16:17:50:140 D configuration inhibit RC: 0
16:17:50:140 D configuration update_rate: 300
16:17:50:140 D configuration name: MYSTICLAKE
16:17:50:140 D configuration task: idle
16:17:50:140 D configuration inhibit: 1
16:17:50:156 D thread        ~Thread
16:17:50:156 D main          **EXCEPTION** in APP Constructor: : COM4 Bad Serial Port Handle
The first column in each line is the time of day to milliseconds. Next there is a designator letter which indicates which type of logging is active, followed by the name of the C++ file that is writing to the log. Finally the actual data is recorded and this can vary greatly from task to task.
Precisely recording the time is important for a few reasons. Obviously it tells us in what order the events occurred, and allows sorting of the logs based on this. But there can be other correlations that make it useful. My robots save video of their operation containing its own time stamps, which can be matched against the entries in the log.
The designator letter, in this case a ‘D’, indicates the entry came from a detailed trace. It could also be ‘T’, ‘L’, or ‘F’ for trace, log, or function, showing the level of detail the log is recording. When things are not going well you want as much data as possible for troubleshooting, and this designator makes the verbosity setting clear in the log itself.
The first three fields are fixed length which allows me to load them into a spreadsheet. Once in the spreadsheet I can sort based on these fields but still return the file to the original order by sorting on the time stamp.
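A minimal sketch of a formatter producing these fixed-width columns might look like the following. The level names and designator letters follow the sample output above; the function names and the 13-character source field are my own illustrative choices, not the author’s actual code.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Severity levels, most to least verbose, mirroring the sample output.
enum Level { eDetail = 0, eTrace, eLog, eFunction };

// Single-letter designators matching the log's second column.
inline char designator(Level lvl) {
    static const char letters[] = { 'D', 'T', 'L', 'F' };
    return letters[lvl];
}

// Format one journal line: HH:MM:SS:mmm, designator, source tag, message.
// The first three fields are fixed width so the file imports cleanly
// into a spreadsheet.
std::string formatEntry(int h, int m, int s, int ms,
                        Level lvl, const std::string& source,
                        const std::string& message) {
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%02d:%02d:%02d:%03d %c %-13s ",
                  h, m, s, ms, designator(lvl), source.c_str());
    return std::string(buf) + message;
}
```

With this scheme, `formatEntry(16, 17, 50, 140, eDetail, "configuration", "lport: COM4")` reproduces a line like those in the sample above.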
Reading Through This Example
The first few entries in this file are simply the startup of the journal system. The first line tells the date for this journal. It’s a bit old, 12 September 2013. It then reports the cleanup of old journal files. This is important because if your system resources are limited, large journal files can consume the file system. I used a PC with a solid-state drive but still cleaned out old files automatically because they quickly became outdated due to code changes.
Once the journal was running, system configuration began. This read an INI file listing the resources to use: a number of control boards connected to serial ports. Each board had an identification number to allow multiple boards on a port. The journal reported the port and identification number used for each controller. For instance, the left motor port, lport, is COM4 with an identifier of 11.
After the configuration, the main processing should have begun. In this case it didn’t: a serial port was not connected, and the attempt to open it threw an exception.
Using the Journaled Data
At one point I had a problem where the robot would move, stop for an unknown reason, and then start moving again. The journal helped me find the answer. The key line was:
16:17:53:172 D chassis kill 0 loV 1 err 121 lim 1 temp 281 targ 0 rc1 0 rc2 0
This line reported the data read from a motor control board. The source file is chassis, since this section of code was concerned with chassis control. As you might guess, the problem was that the input voltage was dropping below the limit I’d set, as indicated by ‘loV’ being 1. You can also see the temperature was 28.1 C and the radio control inputs were not active.
This shows the utility of journaling or logging. Without this log it would have been difficult to determine why the robot was stopping. As I recall the robot was attempting a tight turn which caused motors to stall, increased the current draw from the battery, and dropped the battery voltage.
Using a PC I had plenty of room for logs, and these days the ubiquity of Single Board Computers (often with huge SD cards) continues to make journaling an easy option. If you’re on an Arduino Uno you don’t have much space, but you could send the log over a wireless channel to be recorded on a smartphone or laptop. On a slightly larger system you could keep the log in memory using a circular buffer. The trick there is keeping the unit powered on until you can somehow read out the buffered data.
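The circular-buffer idea can be sketched in a few lines. This is an illustrative example, not code from the article: a fixed-capacity ring that silently discards the oldest entries so memory use never grows.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Fixed-capacity ring buffer that keeps only the most recent entries,
// so a small system can log continuously without growing memory use.
class LogRing {
public:
    explicit LogRing(std::size_t capacity) : buf_(capacity) {}

    void push(const std::string& entry) {
        buf_[next_] = entry;                  // overwrite the oldest slot
        next_ = (next_ + 1) % buf_.size();
        if (count_ < buf_.size()) ++count_;
    }

    // Return entries oldest-first; call this after a failure to dump
    // the last N events before power is removed.
    std::vector<std::string> dump() const {
        std::vector<std::string> out;
        std::size_t start = (next_ + buf_.size() - count_) % buf_.size();
        for (std::size_t i = 0; i < count_; ++i)
            out.push_back(buf_[(start + i) % buf_.size()]);
        return out;
    }

private:
    std::vector<std::string> buf_;
    std::size_t next_ = 0;   // slot for the next entry
    std::size_t count_ = 0;  // number of valid entries
};
```

A buffer of a few hundred entries is often enough to see the events leading up to a fault.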
You can also add a board specifically for logging. The Adafruit Arduino Logging Shield uses I2C while the Sparkfun OpenLog uses a serial connection to handle the data transfer. Since these write to SD cards, the transfer speed might limit how much data you can log. If you run into that limit you might log binary data rather than text; a program on your PC or another board could translate the binary back to text.
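Binary logging can be sketched as a packed record plus a PC-side translator. The field layout below is hypothetical, loosely modeled on the chassis line shown earlier; a real logger would also version the record format.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Compact binary record: 8 bytes per sample instead of a ~60-byte
// text line. Field layout is illustrative only.
#pragma pack(push, 1)
struct ChassisRecord {
    uint32_t ms_of_day;    // timestamp, milliseconds since midnight
    uint8_t  kill;         // kill switch state
    uint8_t  loV;          // low-voltage flag
    uint16_t temp_tenths;  // temperature in 0.1 C units (281 -> 28.1 C)
};
#pragma pack(pop)

// PC-side translator: expand a record back into a readable log line.
std::string toText(const ChassisRecord& r) {
    unsigned s = r.ms_of_day / 1000;
    char buf[80];
    std::snprintf(buf, sizeof(buf),
                  "%02u:%02u:%02u:%03u D chassis kill %u loV %u temp %u.%u",
                  s / 3600, (s / 60) % 60, s % 60, r.ms_of_day % 1000,
                  r.kill, r.loV, r.temp_tenths / 10, r.temp_tenths % 10);
    return buf;
}
```

The embedded side streams the raw structs to the card or serial link; the translator runs later, where speed no longer matters.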
Logging data is a useful tool for projects of some complexity. It’s worth the time to develop utility routines to help with recording the data regardless of the storage media. The data you save might be the data that saves the day.