Query Your C Code

If you’ve ever worked on a large project, your own or a group effort, you know it can be difficult to find exactly where you want to be in the source code. Sure, you can use ctags, and most editors have some way of searching for things. But ClangQL from [AmrDeveloper] lets you treat your code base like a database.

Honestly, we’ve often thought about writing something that parses C code and stuffs it into a SQL database. This tool leverages the Clang parser and lets you write queries like:

SELECT * FROM functions

That may not seem like the best example, but how about:

SELECT COUNT(name) FROM functions WHERE return_type="int"

That’s a bit more interesting. The functions table provides each function’s name, signature, argument count, and return type, plus a flag to indicate methods. We hope the system will grow to let you query on other things, too, like variables, templates, preprocessor defines, and data types. The tool can handle C or C++ and could probably work with other Clang front ends with a little work.

This would be great for estimating the difficulty of tasks. Imagine asking how many functions return a float when trying to decide how long it would take to switch to fixed point. We plan to give it a spin on a source tree for the Linux kernel.
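To make the float-counting idea concrete, here is a minimal sketch that mimics the functions table in an in-memory SQLite database using Python's standard sqlite3 module. This is a stand-in, not ClangQL's own engine or output: only the name and return_type columns appear in the queries above, so the other column names (signature, args_count, is_method) and all of the sample rows are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical rows standing in for what a parser might extract from a
# code base. These are invented for illustration, not real ClangQL output.
SAMPLE_FUNCTIONS = [
    # (name, signature, args_count, return_type, is_method)
    ("main", "int main(int, char **)", 2, "int", 0),
    ("scale", "float scale(float, float)", 2, "float", 0),
    ("area", "float area(float)", 1, "float", 0),
    ("log_msg", "void log_msg(const char *)", 1, "void", 0),
]


def build_db(rows):
    """Create an in-memory database with an assumed 'functions' schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE functions ("
        "name TEXT, signature TEXT, args_count INTEGER, "
        "return_type TEXT, is_method INTEGER)"
    )
    conn.executemany("INSERT INTO functions VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def count_returning(conn, return_type):
    """Count the functions whose return type matches exactly."""
    row = conn.execute(
        "SELECT COUNT(name) FROM functions WHERE return_type = ?",
        (return_type,),
    ).fetchone()
    return row[0]


conn = build_db(SAMPLE_FUNCTIONS)
float_count = count_returning(conn, "float")
int_count = count_returning(conn, "int")
print(float_count, int_count)
```

On a real code base the rows would come from a parser rather than a hand-written list, but the query itself would look much the same.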

Truthfully, we’ve long been surprised that databases haven’t taken over as file systems and as source code storage anyway. A lot of what we do in git could be done in a database. And vice-versa.

17 thoughts on “Query Your C Code”

    1. Reverse engineering software packages like JEB, JADX, IDA Pro and Ghidra actually save decompiled code in a database-like format. It’s nothing new. Until someone decides to implement an open source version.

  1. So I was very interested in this and the author’s similar *QL projects, and after drilling down into the code I found it to be yet another poorly implemented database. I like from-scratch implementations as much as the next programmer, but small in-memory databases are a solved problem! Seriously, just use SQLite; it’s a small, efficient library that is absolutely bulletproof.

    1. Fair comment from a practical standpoint. But sometimes it’s more about the journey than the end product, and doing it all yourself is interesting/fun. Still an interesting idea, either way.

      1. Except it does. Not to insult this project, but I worked at a place 20 years ago that had something like this, except you didn’t write SQL. And to give a better example than what’s in the article, you’d use it for stuff like “we’re adding a new field to a database table. Show me every program that references that table” so you knew the entire list of source files you’d need to look at to see if they needed updates, and could compile all of them for testing. IIRC we also integrated our source control so you could check them all out in one go.

        This was all stored in a database, updated at each code check-in. The database was the same commercial product the application used (think something like MSSQL, although that wasn’t the specific product). It saved a ton of time.

  2. git rekt: Git is already a database, always has been. If you didn’t already know that, and have been wishing it were this whole time, I genuinely suggest you learn more about Git.
