DNS Tunneling with an ESP8266

There’s a big problem with the Internet of Things. Everything’s just fine if your Things are happy to sit around your living room all day, where the WiFi gets four bars. But what does your poor Thing do when it wants to go out and get a coffee and it runs into a for-pay hotspot?

[Yakamo]’s solution is for your Thing to do the same thing you would: tunnel your data through DNS requests. It’s by no means a new idea, but the combination of DNS tunneling and IoT devices stands to be as great as peanut butter and chocolate.

DNS tunneling, in short, relies on you setting up your own DNS server with a dedicated subdomain and software that will handle generic data instead of information about IP addresses. You, or your Thing, send data encoded in “domain names” for it to look up, and the server passes data back to you in the response.
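To make the encoding step concrete, here is a minimal Python sketch of how a Thing might pack bytes into a query name and how the server side might unpack them. The domain `t.example.com`, the choice of Base32, and both function names are assumptions for illustration; they are not taken from [Yakamo]'s project.

```python
import base64

TUNNEL_DOMAIN = "t.example.com"  # hypothetical delegated subdomain
MAX_LABEL = 63    # longest single label allowed in a DNS name
MAX_NAME = 253    # longest full name in dotted text form


def encode_query_name(payload: bytes, domain: str = TUNNEL_DOMAIN) -> str:
    """Pack payload bytes into the labels of a name under the tunnel domain."""
    # Base32 survives DNS case-insensitivity; '=' padding is dropped
    # because it is not allowed in a hostname.
    text = base64.b32encode(payload).decode("ascii").rstrip("=").lower()
    labels = [text[i:i + MAX_LABEL] for i in range(0, len(text), MAX_LABEL)]
    name = ".".join(labels + [domain])
    if len(name) > MAX_NAME:
        raise ValueError("payload too large for one query name")
    return name


def decode_query_name(name: str, domain: str = TUNNEL_DOMAIN) -> bytes:
    """Recover the payload bytes from a name built by encode_query_name."""
    suffix = "." + domain
    if not name.lower().endswith(suffix):
        raise ValueError("name is not under the tunnel domain")
    text = name[:-len(suffix)].replace(".", "").upper()
    text += "=" * (-len(text) % 8)  # restore the padding the encoder removed
    return base64.b32decode(text)
```

The Thing would then ask its resolver to look up the resulting name, and the tunnel server would run the decoder on every query it receives for the subdomain.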

DNS tunneling is relatively slow because all data must be shoe-horned into “domain names” that can’t be too long. But it’s just right for your Thing to send its data reports back home while it’s out on its adventure.
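How small is "can't be too long"? A rough calculation, assuming Base32 encoding (5 bits per character), the usual limits of 63 characters per label and 253 characters per name, and a hypothetical tunnel domain of `t.example.com`:

```python
def max_payload_bytes(domain: str, max_name: int = 253, max_label: int = 63) -> int:
    """Estimate how many raw bytes fit in one Base32-encoded query name."""
    # Characters available to the left of the dot that precedes the domain.
    room = max_name - len(domain) - 1
    chars = 0
    while True:
        n = chars + 1
        # Every max_label data characters need one separating dot.
        dots = (n - 1) // max_label
        if n + dots > room:
            break
        chars = n
    # Base32 carries 5 bits per character; round down to whole bytes.
    return chars * 5 // 8


print(max_payload_bytes("t.example.com"))  # 147
```

So each lookup carries on the order of a hundred-odd bytes upstream at best, which is plenty for a sensor reading and hopeless for streaming.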

Oh yeah. DNS tunneling may violate the terms and conditions of whatever hotspot is being accessed. Your Thing may want to consult its lawyer before trying this out in the world.

51 thoughts on “DNS Tunneling with an ESP8266”

    1. If you’re Dan Kaminsky, you’re allowed to do hopelessly evil things to DNS like tunneling ~2Mbps video over it. But I’ve met Dan Kaminsky, and you’re not him. Most DNS tunneling systems are either very slow, or else leave litter all over your ISP’s DNS caches, sometimes even multiple ISPs’ DNS caches, and you really shouldn’t do that.

      A little bit of telemetry like daily requests to 0x12345678.bad-example.com isn’t too bad, but even with a short DNS cache timeout, any significant data rate is going to splatter junk all over.

  1. This reminds me of a college that I went to that would re-direct one to a login page when connected to their wireless network. I found out that I could easily bypass this by simply connecting to an SSH server listening on port 53 and setting up a SOCKS proxy.

    1. Comcast does the same with their damned Xfinity routers. Just spoof the MAC and sign in for a complimentary session once an hour. It’s a pain, but it works. The hard part is coming up with new emails every time. There are only so many permutations of “free@wifi.com” available… not that I would do that. (If my neighbor is reading this, thank you.)

        1. The email address is likely not actually required, much like the many download links that ask for an email. It will probably work if you ignore it entirely and click login (or download). I cannot remember the details regarding the “CableWifi” hotspots. (It was alarming to see four companies sharing a presence on them: TWC, Brighthouse, Cox, and Comcast? Or Charter?)

    1. Because clients cache DNS requests. If you block DNS then your clients may not request the DNS again, assume the domain is non existent and then you get angry people.

  2. Now think about devices that connect to any available hotspot and have to be able to send small batches of data anywhere they are. Like a GPS tracker with WiFi. For example, GSM tracker means country-wide unlimited service, while WiFi means worldwide limited service, basically. If you combine them, you get awesome results – and ESP8266 is a nice device to make sure you do.

      1. My phone often does cache DNS failures. If I end up on the payment screen (run out of credit) from a site I was looking at, that site bounces to the login screen until I do a lot of pissing about.

    1. The more advanced DNS tunneling setups don’t rely on you contacting your own DNS server directly. Instead, IP traffic is embedded in actual, legitimate DNS requests that pass through the hotspot’s recursive DNS server and are potentially cached. Outgoing traffic is in the hostname being queried, so you might query deadbeef.tunnel.com where deadbeef is the outgoing data. Incoming traffic might be in TXT responses. These will then be processed like any other DNS record by the recursive DNS server provided by the hotspot.

  3. I wouldn’t rely on this. Some hotspots will return the same IP (A record) for all DNS requests so that you get the login / payment page. Another problem is that some DNS caches don’t obey TTL/Refresh intervals in SOA records, so your data or DNS request won’t get through to the server and the cache will keep returning an old response.

    In any case it would be fairly naive to use a protocol that is designed for responses that are hours or even days apart for data transfer that is more or less in real time. Put simply – something has to give so something will break.

    Also ‘DNS tunnelling’ is a real thing and this description here is not ‘DNS tunnelling’.

    1. I haven’t seen the former happening, because as you say yourself, caching would interfere with that – you’d continue to resolve to the captive portal even after you’ve paid.

      You can avoid your responses being cached by using a unique query for each message.
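A minimal sketch of that idea in Python: add a label that never repeats, so no cache along the way can already hold an answer for the name. The domain, the per-boot session token, and the function name are all hypothetical.

```python
import itertools
import secrets

TUNNEL_DOMAIN = "t.example.com"   # hypothetical delegated subdomain
_session = secrets.token_hex(2)   # random per-boot tag, so counters do not collide after a reset
_counter = itertools.count()      # increases by one for every message sent


def unique_name(data_label: str, domain: str = TUNNEL_DOMAIN) -> str:
    """Return a query name for data_label that has never been asked before."""
    seq = next(_counter)
    # The extra label changes on every call, so even identical data
    # produces a fresh name that must travel to the authoritative server.
    return f"{data_label}.{seq:x}-{_session}.{domain}"
```

Two calls with the same data, say `unique_name("deadbeef")` twice, yield two different names, and the server can simply ignore the counter label when it extracts the data.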

  4. “But it’s just right for your Thing to send its data reports back home while it’s out on its adventure.”

    That’s some damn fine apostrophic control there. Just playing with us. Bet you this one knows how to spell “collaboratorium”.

  5. I wish WiFi would add a new feature of allowing any device to connect and use it at limited rates(like 100Bps or something). Would be unusable for people, but still ok for internet of things. Hell, I bet you could send a short text email in a reasonable time over a 100Bps connection.

    1. Exactly! The manufacturers have made their modules and now expect to sit back while the community builds the software, and for that to spawn new hardware based on the manufacturers’ modules or chips.

      While this may be a solution to making the ‘Things’ of IoT it’s no solution at all to the Internet of ‘IoT’.

      If you have to add a GPRS modem and associated SIM to your IoT project to allow it to go portable then the new price point dooms it to failure.

      What the community needs is infrastructure to allow portable IoT devices, but the community doesn’t have the influence to make this happen. The manufacturers should get off their asses and get things happening at the infrastructure level, or watch their competitors succeed while they fail.

      If WiFi modem manufacturers had an option to allow an IoT network that would service ANY device in range, then customers would use it even if they only had one IoT device, but the bonus would really be that it becomes yet another area that has IoT access for portable devices. It would need to have some controls like bandwidth limiting, of course.

      WiFi modem manufacturers are not going to listen to the general community because we don’t have a single common voice but I am sure that manufacturers would have some influence especially given that their products are dependent on this.

      1. Look, to be fair I think the modules are a great achievement in terms of silicon and price. Sure, software is missing; it will mostly be built by the community, which will further enhance the value of the chip, bringing profits to the company, not the community.

        Somewhere else some other people are trying to build cellular-like networks for IoT, for example: http://sigfox.com/en/ . We all know a lot of home automation kits come with a gateway; this could eliminate that, even the wifi. Of course, extra features could be added to wifi routers to allow low power sensors to communicate freely through them, but nobody seems to bother about that, even though wifi seems to be in so many places and it could be practically free. The good: you can buy a thing and then readily have its data on your pc/phone. The bad: somebody has to build some extra network for that.

        Just like the DNS request here: imagine there is a special packet that can be transferred to a server even over password-locked wifi networks. This can all be implemented in SW on routers and using low cost stuff like the ESP. Of course, people will find ways to abuse it.

  6. I built a DNS server and configured it to log every query it receives. I then built a program that monitors the log and spits it out on the terminal in a much more readable format.

    It was right after I did this that I noticed a TON of txt queries from my antivirus software where the host part of the query was a seemingly random and long length of data. The official knowledge base document on this ‘feature’ indicated that it was some kind of signature lookup and was part of the software that was keeping my web browsing safe. Ok, maybe so, maybe not. There isn’t enough transparency to really tell me *what* data is being sent to their DNS server (yes – it would hit my recursive DNS, but every query would eventually hit a DNS server that they operated).

    The only thing I haven’t done yet is prove that these queries are being emitted when my browser isn’t running (because my browser is almost never shutdown/restarted). Seems like it would be an interesting research project for someone in the security community…

  7. I saw laptops doing a DNS lookup with a very long, very cryptic ip-address immediately at power-up! Performed by the BIOS, before the hard disk is accessed at all. So somebody somewhere in the world knows immediately where his machine is right now. Then I found out this is due to a feature called theft protection; you can subscribe to this service with the manufacturer. However, disabling it in the BIOS did not stop the machine from sending this data out…

      1. you’re right of course, it was a long *hostname*, similar to something like 3939af83fc7293ac4.dell.com, probably encoding the machine id… And it was on wired ethernet, already at least 5 years ago.

        Just to be used as an argument that transmitting information over DNS is more commonly used than typically realized. And if you sniff ip-traffic today, there are lots and lots of “long cryptic hostnames” floating around. And together with more and more smartphones which try to automatically log into any open WiFi network around, you get a story…

        There is perfectly legit use for coding information into the hostname. Virtual hosts: Many webservers host a lot of domains on the same machine with the same IP. The page you get depends on the hostname. Load balancing: Depending on whatever rule is employed, you get a different IP for the same hostname. So two-way communication via DNS is possible and in common use today.

        For IoT applications, a nameserver may simply act as a broker, selecting a target host depending on the message content, coded into the hostname part. Take some “Thing” with an analog input for example. Small values are ok; they should be sent to some cheap logging host only. But high values above some critical – but variable – threshold mean “alarm” and should be handled by another, expensive, reliable server. So the “Thing” will construct a hostname based on its own ID and the analog value and look up the appropriate target IP for the following communication. The big advantage of this scheme is that in case of blocked network access, where the communication will fail, a fallback is possible: If the logging server, alarm server and nameserver communicate, they can detect that nothing has arrived except the DNS queries, and the nameserver hands the payload contained in the hostname over to the appropriate destination. And depending on the returned IP, the “Thing” knows if it’s above threshold and can react immediately even without the normal answer from the server. The advantage of coding the functionality in this way is the increased reliability from inherently using multiple protocols. That it may also work through most fire- and paywalls is pure accident.

        A pretty neat use for an ESP without any additional hardware, just a battery soldered to it: Scan networks; if there is an unencrypted SSID, connect to it. Transmit all the visible BSSIDs to your servers using the scheme above. Go to deep sleep for some time and repeat… The servers at home can use some geolocation service to locate the beast, either using openwlanlocate or tricking Google or Apple into thinking you see all the BSSIDs yourself. Voila: a 2$ tracking device you can attach with chewing gum to all your legit properties…

        1. Just as an info how to setup:

          Subscribe to some dynamic DNS service and set it up in your router.

          You will get an A record, pointing to your dynamically assigned IP via yourserver.somedyndnsservice.com

          Create an additional NS entry delegating a subdomain, e.g. iot.yourserver.somedyndnsservice.com, to yourserver.somedyndnsservice.com

          Run a dns-server for this domain on your PC, listening either on port 53 directly or, if you have a router doing NAT, create a port forwarding of public port 53 to any port on your PC, thus avoiding binding to a privileged port.

          This server will now receive all requests for the subdomain xxx.iot.yourserver.somedyndnsservice.com

          For demonstration purposes, this “server” can be programmed as a simple three line shell script which pipes netcat through grep into a logfile

          It might respond with the appropriate IPs where your other IoT server scripts are listening on port xy, maybe on yourserver itself.

          On your ESP-Thing send UDP packets containing “payload” to payload.iot.yourserver.somedyndnsservice.com port xy

          Make sure that “payload” conforms to a valid hostname, i.e. only digits, letters, - and _, and no longer than 63 characters

          Normally you will receive the payload on both servers; in case of hotspots with fire- or paywalls, only with your dns-script

          Maybe someone is willing to create a business this way?
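For anyone who would rather not pipe netcat through grep, the logging “server” from the steps above could be sketched in Python roughly like this. It only logs and never answers, so the resolver asking on the Thing's behalf will retry and eventually time out. The port 5353 (standing in for “any port on your PC” behind the port forward), the suffix and both function names are assumptions for illustration.

```python
import socket


def parse_qname(packet: bytes) -> str:
    """Return the dotted name from the first question of a raw DNS query."""
    pos = 12  # the fixed DNS header is 12 bytes; the question follows it
    labels = []
    while True:
        if pos >= len(packet):
            raise ValueError("truncated DNS question")
        length = packet[pos]
        pos += 1
        if length == 0:   # a zero-length label terminates the name
            break
        if length > 63:   # would be a compression pointer, not expected in a question
            raise ValueError("unexpected label type in question")
        end = pos + length
        if end > len(packet):
            raise ValueError("truncated DNS question")
        labels.append(packet[pos:end].decode("ascii", "replace"))
        pos = end
    return ".".join(labels)


def log_queries(bind_addr=("0.0.0.0", 5353),
                suffix=".iot.yourserver.somedyndnsservice.com"):
    """Print the payload part of every query that arrives for the subdomain."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(bind_addr)
    while True:
        packet, peer = sock.recvfrom(512)
        try:
            name = parse_qname(packet).lower()
        except ValueError:
            continue  # ignore anything that is not a well-formed query
        if name.endswith(suffix):
            print(peer[0], name[:-len(suffix)])
```

Calling `log_queries()` blocks forever and prints one line per matching lookup: the address of the resolver that forwarded it, followed by whatever the Thing put in front of the subdomain.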

          1. This is exactly *why* no one is going to set this up professionally.

            You mentioned retrieving the authoritative DNS server for the created name space (sub zone) but not how that is done, so it is assumed that you are using a normal DNS process. Fair and good. Then when you are querying the non-existent name space (with the data string) it’s assumed that you are also using the normal DNS process instead of requests directly to your server.

            Any sys-op who sees what you are doing to their DNS server is going to cut you off in the rudest way possible to reflect your abuse of their system.

            You go to the trouble of setting up a DNS server that relies on everyone else’s equipment to chain the requests to your server instead of making the requests directly to your server in the first place.

            Everyone is going to do this exactly the same way because to do it properly requires a little uncommon knowledge of how the DNS system actually works. And when they do, sys-ops are going to kill it dead – very dead!

  8. ROB, I agree with you: If I was a sysadmin noticing it happen on my system – I would stop this. However, DNS tunneling has been a well known procedure since the last century, with lots of scientific publications about it and possible countermeasures. And still *all* public hotspots I ever tried will allow it. Even many company firewalls based on cisco routers will allow it in their standard configuration. All better home routers which have firewall-like features (“child protection”, limited time guest access etc.) will allow it. Try it yourself! So I assume it is the free(?) decision of hotspot operators and hardware manufacturers to allow DNS tunneling. And all their sysadmins indeed never stop it. You talk about abuse of the DNS chain: In fact no third party equipment is significantly involved/abused: The Thing at the hotspot queries the DHCP-assigned DNS server of the hotspot operator. And this server will then query your home server. (Of course there will also be one single request in the whole process involving a “normal” lookup for the IP of my home server.) And my argument stated above was that, as it is common use of the DNS system to transmit hardware identities back home even from behind firewalls, operators and sysops have accepted it for decades and nobody else is involved/abused – so why is it a bad thing?

    1. This isn’t a *bad thing*! It’s a *Very very bad bad bad thing*!!!

      You are taking a protocol that was intended as a very slow system of updating infrequently changing information and using it to tunnel *not only* high volume information but *also* information in the *opposite* direction to the intended purpose of the protocol.

      The only reason this works is because the very low volume of stolen traffic that is used this way today is probably not worth worrying about.

      Up the rate of data traffic theft and you can bet sys-ops will come crashing down hard on this and hard!

      And the sys-ops won’t be concerned about the very small amount of traffic that is stolen. What will be concerning them is that the traffic is passing through servers and protocols that simply were not ever designed for this purpose and that will cause all sorts of unwanted issues!!!

      If you were to tunnel your DNS requests directly through to your own server then no one would give a damn. But most people are never going to know how to do this and they will find that *it just works* to use the local DNS and they will go with that.

      To give you an idea of how low traffic is on a DNS caching server … the most frequent inbound request will be for google – perhaps thousands of requests per minute, and all of these inbound requests will result in only *one* outbound request every four hours or so. Now add your IoT and do one outbound request per minute and you are now creating 240 times more outbound requests than there are for google.

      This *won’t* scale as DNS tunnelling grows!!! The whole system will collapse unless sys-ops squash it very very quickly, and that my friend is what they’re paid to do.

      1. So why do you think that every hotspot provider allows full DNS while blocking every other packet before login? It even imposes additional effort; there must be a rule implemented somewhere: forbid everything but allow all DNS. And again: No other DNS server will be stressed except that of the provider itself, who deliberately decided to open this channel. Why do they?

          1. But that redirection does not happen via the DNS protocol, it always delivers the correct IP your browser has asked for. Say, you are not logged in yet and your browser wants to open http://www.google.com. It starts with a gethostbyname(“www.google.com”). The hotspot performs a normal DNS request into the internet for that name and returns the correct IP back to the browser. Then the browser sends a HTTP GET / request to that IP. As long as you are not logged in, the hotspot will not relay that request to Google but return a spoofed answer instead, containing a HTTP REDIRECT to “http://www.mystupidhospotwithauthentification.com/login?” and your browser displays the login page. Once you are logged in, the next time you try to open http://www.google.com, everything is normal.

            Why not simply do the spoofing on DNS level already? The hotspot could resolve *every* DNS query with 10.0.0.1 and run a webserver on that IP. Same effect, much simpler and no DNS tunneling possible.

            Well, an argument against the second method could be that your DNS cache may get poisoned and all subsequent attempts to get something from http://www.google.com will always be sent to 10.0.0.1, thus rendering google inaccessible. Simple solution: All the spoofed 10.0.0.1 DNS answers of the hotspot should get a very short TTL, so your browser will resolve http://www.google.com immediately again and get the correct answer this time.

            So why do all hotspots choose the first method that accidentally provides transparent DNS access for everybody?

          2. Client tools like gethostbyname only return a small part of the zone record like the “A” record or IP address.

            The TTL / Refresh periods are contained in the SOA records that the client machine and browser don’t request, so there is no way to reliably control the expire time of an “A” record from the host DNS service. Hence the preference for an HTML META Refresh solution to the problem. In any case the DNS has to work from the start because without a DNS you have no IP address and without an IP address TCP/IP is useless.

            So the system you mentioned still ‘leaks’ the original DNS request back to the remote DNS server. It’s no big deal to fix this. It’s just that the leaked traffic hasn’t yet got to the point that it’s worth fixing.

  9. ROB, RFC 1035 states that *every* resource record in a DNS answer carries a TTL field:

    “a 32 bit signed integer that specifies the time interval
    that the resource record may be cached before the source
    of the information should again be consulted. Zero
    values are interpreted to mean that the RR can only be
    used for the transaction in progress, and should not be
    cached. For example, SOA records are always distributed
    with a zero TTL to prohibit caching. Zero values can
    also be used for extremely volatile data.”

    So a hotspot could simply deliver 10.0.0.1 answers with zero TTL to all DNS requests of the client before it logs in without breaking any future communication. Perfect clean and standard compliant solution without offering any undesired tunnel access.
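As a rough sketch of what such a zero-TTL answer could look like on the wire, here is a hypothetical Python helper that turns a raw A query into a response pointing at 10.0.0.1. It assumes a well-formed query with a single question and does no other validation; the function name is made up for illustration.

```python
import socket
import struct


def portal_response(query: bytes, portal_ip: str = "10.0.0.1") -> bytes:
    """Answer the first question of an A query with portal_ip and TTL 0."""
    ident, flags = struct.unpack("!HH", query[:4])
    # Walk the labels of the question name to find where it ends.
    pos = 12
    while query[pos] != 0:
        pos += 1 + query[pos]
    qtype, _qclass = struct.unpack("!HH", query[pos + 1:pos + 5])
    if qtype != 1:
        raise ValueError("this sketch only answers A queries")
    question = query[12:pos + 5]  # name, terminating zero, QTYPE, QCLASS
    # QR=1 (response) and RA=1, with the RD bit copied from the query;
    # one question, one answer, no authority or additional records.
    header = struct.pack("!HHHHHH", ident, 0x8080 | (flags & 0x0100), 1, 1, 0, 0)
    answer = (
        b"\xc0\x0c"                            # pointer back to the name at offset 12
        + struct.pack("!HHIH", 1, 1, 0, 4)     # TYPE A, CLASS IN, TTL 0, RDLENGTH 4
        + socket.inet_aton(portal_ip)
    )
    return header + question + answer
```

The TTL of zero is the 32-bit field packed just before the record length, so a compliant resolver would use the answer for this one transaction and ask again next time.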

    TCP/IP works perfectly without DNS. The DHCP Protocol is used to assign your IP address. Otherwise local networks without internet connections wouldn’t be possible.

    Regarding fixing the DNS “leakage”: In many cases it is not possible for the sysop! If you watch out for open DNS you’ll find it in ridiculous places. Once I could establish a DNS tunnel during a holiday on a cruise liner on the open ocean. It offered intranet for the passengers but no internet connection – except for the DNS protocol, to my surprise. I spoke to the communication officer about that; he was pretty surprised, too, as the ship just had a 56 *kilo*bit satellite link, meant for crew use only. And passengers’ DNS requests turned out to use a significant part of that bandwidth. The sysop of the ship immediately looked into the case and found – he could not stop it. It was default behavior of the cisco router which he could not change…

    1. A client utility like getHostByName is not a DNS request and doesn’t conform to RFC 1035. If it *were* a DNS request then it would be something like getHostArecord.

      Sure the client system has to do a DNS to get the info in the first place (including TTL/Refresh/Expire) but that doesn’t mean the client Operating System will pass that additional info any further.

      Even *IF* the client OS does make the additional information available, some (or most) browsers ignore it anyway. I am using Firefox and it doesn’t cache DNS – it does a new lookup every single time.

      RFC 1035 is a specification for DNS servers and *not* for local DNS policies on the client machine making the requests. The local (client) machine doesn’t provide DNS services for other machines so there is no need for compliance.

      The RFC’s are well and truly forgotten when you get to client machines and local OS’s.

      Sure DHCP will give you a local IP but … who are you going to call?

      As for the ship’s DNS leakage, subnets would have fixed the problem – that’s what they’re for.

  10. Tough discussion, ROB :) OK, I agree, gethostbyname() is an interface between the application layer and the resolver, which is typically operating on system level. And it does not tell anything about TTLs. So any application which calls gethostbyname(“www.google.com”) only once and then relies on the returned IP forever may fail. But this is why Firefox calls gethostbyname() again and again to handle all situations that can legally occur under RFC 1035. IPs are allowed to change, and thus all “hidden” communication between the resolver, which is part of your operating system, and any external DNS server speaks RFC 1035 and has TTLs attached. And it is the job of the resolver to decide if it returns a cached IP to your application or requests an update from the server. So, for a Win98 PC running Netscape Communicator it might be a problem indeed and the user possibly wouldn’t be automatically redirected to google after hotspot login. That’s the reason to enable world wide free tunnel access for everybody?

    Hotspots always do NAT and assign a local IP. No need to let me communicate with the global DNS system for that…

    Of course subnets with different nameservers would easily solve the problem for intranets. So for me it’s really hard to believe how often you find an open DNS, even behind strict firewalls which will never allow you any other internet access. Not only on hotspots but also in universities, in companies and on home routers employing parental control…

  11. I have done this on the esp8266 and it’s very effective.

    In my view, this may be violating the t&c of hotspots, but the amounts of data used are likely to be so small that nobody will care. Many devices already randomly connect to open APs and do DNS queries anyway.

    DNS can be used to transfer data in both directions but it’s very limited. For special purposes it’s ok, I used it for clock synchronisation and sending back simple telemetry to a server. This works on many (but not all) hotspots.

    There is no logistical way to set up a device like esp8266 to work with a variety of different captive-portals. The vendors change their login logic all the time, and there are few widely supported standards.

    DNS works in a lot of cases. If you don’t abuse it in a noticeable way, I really don’t think it’s a problem.
