Sometimes, it pays to read the man pages of commands you use often. There might be a gem hidden in there that you don’t know about. Case in point: I’ve used curl (technically, cURL, but I’m going to stick with curl) many times to grab data from some website or otherwise make a web request. But what happens if you want to do the same thing from a C program? Well, you could be lazy and just spawn a copy of curl. But it turns out curl has a trick up its sleeve that can help you. If only I’d read the man page sooner!
First Things
The simplest use of curl is to just name a URL on the command line. For example, consider this session:
$ curl http://www.hackaday.com
<html>
<head><title>301 Moved Permanently</title></head>
<body>
<center><h1>301 Moved Permanently</h1></center>
<hr><center>nginx</center>
</body>
</html>
This isn’t so useful because it is a 301 response (to send you to the https server, in this case). The -L option will make curl go get the page instead of the redirect. Try:
$ curl -L http://www.hackaday.com
You probably want to pipe it through less or use the -o option to send the output to a file. If you want to see the details of the redirect, try:
$ curl -i http://www.hackaday.com
Jack of All
That’s just the very simplest thing you can do with curl. It can do a bewildering array of protocols, including FTP, SMB, POP3, IMAP, SFTP, and many more. It can form different requests and manipulate cookies, certificates, and many other things.
However, it turns out curl doesn’t really do any of those things. It just reads your input and manipulates libcurl which is where all the smarts are. If you have done much with Linux, then you realize that means you could use libcurl, too. But how?
Less Work, More Code
You could look up the details of libcurl. It isn’t a secret. In fact, there are two ways to use the library. The “easy” interface is quick to use and great for most things. If you need heavy-duty multiple transfers and other exotica, you might have to use the “multi” API, which is a bit more complex.
However, you don’t even have to start there. The curl program itself will help you. Here’s how it works. First, build a command line to get the results you want (just like we did earlier to read the front page of Hackaday). Then, add the --libcurl option to the command line along with the name of a C source file (that probably doesn’t already exist). The program will then write a skeletal piece of code to do the exact transfer you specified. You can compile it in the usual way; just add -lcurl to the compile command.
You might have to tweak it a bit, and depending on your application, you might want to make some changes like modifying the URL. But you’ll get a great start.
For example, try:
$ curl -L -o output.txt http://www.hackaday.com --libcurl hackcurl.c
The skeleton will have a main function (you might want to change that if you are adding it to another program). In that main will be code reflecting most of the options you set on the command line.
There will also be a comment showing you some things you might want to set that it can’t figure out for you. Finally, there’s a little driver that just performs the operation, does some cleanup, and then exits with a status code. Here are the options the boilerplate code suggests:
/* Here is a list of options the curl code used that cannot get generated
   as source easily. You may choose to either not use them or implement
   them yourself.

  CURLOPT_WRITEDATA set to a objectpointer
  CURLOPT_INTERLEAVEDATA set to a objectpointer
  CURLOPT_WRITEFUNCTION set to a functionpointer
  CURLOPT_READDATA set to a objectpointer
  CURLOPT_READFUNCTION set to a functionpointer
  CURLOPT_SEEKDATA set to a objectpointer
  CURLOPT_SEEKFUNCTION set to a functionpointer
  CURLOPT_ERRORBUFFER set to a objectpointer
  CURLOPT_STDERR set to a objectpointer
  CURLOPT_HEADERFUNCTION set to a functionpointer
  CURLOPT_HEADERDATA set to a objectpointer
*/
An Example
The thing you’ll most often need to change is what happens to the data you read. The -o option isn’t what you wanted, probably, and so curl doesn’t build that into your C code. Of course, if you just wanted to send a request, that might be all you need. In my case, I will eventually want to know what the Hackaday server said back, so I have some work to do. Luckily, it isn’t much.
At the start of the code (available online), I added a section that includes a few headers and defines a BUFFER structure. This is just a string with a length. I also made a simple function to create a buffer and another that libcurl can call to send me data (writefn).
That last function is the most complex of the custom code. It looks at how much data curl received, reallocates the buffer, and saves it. I didn’t intend this to be general purpose, so there’s no provision for fancy editing to the buffer. Things come in. The buffer grows. Stuff gets put on the end of the buffer. That’s all.
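The article's own code is linked rather than reproduced here, so what follows is only a minimal sketch of what a buffer and write callback along those lines might look like. The BUFFER layout and the buffer_init name are assumptions made for illustration. The writefn function uses the callback signature libcurl documents for CURLOPT_WRITEFUNCTION, and the sketch deliberately avoids including curl.h so it compiles without the library installed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the article's BUFFER: a growable string. */
typedef struct {
    char  *data;  /* heap storage, kept NUL-terminated */
    size_t len;   /* bytes stored, not counting the terminator */
} BUFFER;

/* Start with an empty string. Returns 0 on success, -1 if malloc fails. */
int buffer_init(BUFFER *b)
{
    assert(b != NULL);
    b->len = 0;
    b->data = malloc(1);
    if (b->data == NULL)
        return -1;
    b->data[0] = '\0';
    return 0;
}

/* Same shape libcurl expects for a write callback. libcurl always passes
   size == 1, so nmemb is the byte count; userdata is whatever pointer was
   registered with CURLOPT_WRITEDATA (assumed here to be a BUFFER *). */
size_t writefn(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    BUFFER *b = userdata;
    size_t incoming = size * nmemb;

    /* Grow by the incoming amount plus one byte for the terminator. */
    char *grown = realloc(b->data, b->len + incoming + 1);
    if (grown == NULL)
        return 0;  /* reporting fewer bytes than offered aborts the transfer */

    memcpy(grown + b->len, ptr, incoming);  /* append at the end */
    b->data = grown;
    b->len += incoming;
    b->data[b->len] = '\0';
    return incoming;  /* tell libcurl every byte was consumed */
}
```

Reallocating on every call keeps the sketch short. A program handling large downloads would probably grow the buffer in bigger steps.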
The next part of the custom code appears near the end. It tells libcurl about the writefn function and asks it to pass the address of buff (the buffer) to that function. Obviously, libcurl doesn’t care what that argument is. It just takes whatever you tell it and adds it to the call.
Once the boilerplate curl_easy_perform function returns successfully, the buff structure has the web page in it. For this example, I just print it out, which is boring, but you could, obviously, do whatever you wanted here.
That’s it, and you’ll find that for many tasks, this is sufficient. If it isn’t, the comments and the documentation suggest other ways to configure the library similar to how this example sets the write function.
More Curl Tricks
Need your current IP address (thanks, Amazon)?
curl checkip.amazonaws.com
Want to check if a site is up?
curl -L -Is http://www.hackaday.com | head -n 1 | cut -d ' ' -f2
Need a QR code?
curl qrenco.de/www.hackaday.com
Can’t remember the definitions for the word inductor?
curl dict.org/d:inductor
Of course, you don’t need C code to call these. If you do a quick search, you’ll find there are tons of services that curl can easily access from the command line or, using libcurl, from your programs, too.
Want to test your curl chops? We’ve looked at grabbing Hackaday automatically before.
I’ve seen commercial programs that use curl behind the scenes.
You’ve seen a lot of them. You’re probably using at least 2 right now.
Yes curl allows for free commercial use
Not to forget, custom headers, POST and GET requests and the list goes on
I am sad the dict example doesn’t use the dict protocol.
Use xmllint --html to parse HTML
Interesting. In the Java world, I’ve (ab)used W3C’s _tidy_ tool as an ad hoc HTML processor… “Great minds think alike. Ours do too.”
Is that curling c, or c’ing curl?
curl ipecho.net/plain for a non Amazonified IP address service.
curl stappers.it/t
my fav way to check ip is curl ifconfig.me