Roll Your Own Tracking

The smartphone is perhaps the signature device of our modern lives. For most of the population it is never more than an arm’s length away; it’s on your person more than any other device in your life. Smartphones are packed with all sorts of radios and ways to communicate wirelessly. [Amine Mansouri] built an ESP8266-based tracking device that takes advantage of this.

Most WiFi-enabled devices send out “probe request” frames to search for the SSIDs they have previously connected to. These frames contain the device’s MAC address as well as SSIDs the device has connected to before. Using about 12 components, [Amine] laid out a small board in Eagle. By putting the ESP8266 in monitor mode, the probe frames can be logged and uploaded. The code can be updated OTA, making it easy to service while in the field.
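
To make the frame layout concrete, here is a minimal Python sketch of pulling the sender’s MAC address and the requested SSID out of one captured probe request. This is not [Amine]’s firmware, which runs on the ESP8266; the function name `parse_probe_request` is made up for illustration, and the sketch assumes the bytes start at the 802.11 MAC header with no capture header in front of them.

```python
from typing import Optional, Tuple

# First frame-control byte of a probe request:
# protocol version 0, type 0 (management), subtype 4.
PROBE_REQUEST_FC0 = 0x40
# Frame control (2) + duration (2) + three addresses (18) + sequence control (2).
MAC_HEADER_LEN = 24
SSID_ELEMENT_ID = 0


def parse_probe_request(frame: bytes) -> Optional[Tuple[str, str]]:
    """Return (source_mac, ssid) for a probe request, or None for other frames.

    Assumption: `frame` begins at the 802.11 MAC header.
    """
    if len(frame) < MAC_HEADER_LEN or frame[0] != PROBE_REQUEST_FC0:
        return None

    # Address 2 (bytes 10..15) is the transmitter, i.e. the phone itself.
    source_mac = ":".join(f"{b:02x}" for b in frame[10:16])

    # Walk the tagged elements after the header: id, length, value.
    pos = MAC_HEADER_LEN
    while pos + 2 <= len(frame):
        element_id, length = frame[pos], frame[pos + 1]
        value = frame[pos + 2 : pos + 2 + length]
        if len(value) < length:
            break  # truncated element, stop parsing
        if element_id == SSID_ELEMENT_ID:
            # A zero-length SSID is a wildcard probe ("any network").
            return source_mac, value.decode("utf-8", errors="replace")
        pos += 2 + length

    return source_mac, ""
```

A probe with an empty SSID is a wildcard and reveals only the MAC address; a directed probe names a network the device is looking for.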

With permission from his local library, eight repeater boards were scattered throughout the building to forward the probe packets to where the tracker could pick them up. A simple web interface was built that allows the library to figure out how many people are in the library and how often they frequent the premises.

While this is an awesome project with open-source code on GitHub, it is worth stressing how important it is to get permission to do this kind of tracking. Some phones implement MAC randomization, but there are still many out in the wild that don’t. And while this is similar to another project that listens to radio signals to determine the coming and going of ships and planes, tracking people with this sort of granularity is in a different category altogether.
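
For the curious, randomized addresses can often be spotted in a capture. The Python sketch below checks the “locally administered” bit (0x02 in the first octet), which randomized MACs set. Treat it as a heuristic only: the helper name `is_locally_administered` is hypothetical, and a set bit does not prove the address is random, since any software-assigned MAC may use it.

```python
def is_locally_administered(mac: str) -> bool:
    """True if the MAC has the locally administered bit set.

    Expects colon-separated hex such as "da:a1:19:00:00:01".
    """
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)
```

A logger could use this to count randomized and factory-assigned addresses separately, since a single phone may show up under many different randomized addresses.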

Thanks [Amine] for sending this one in!

35 thoughts on “Roll Your Own Tracking”

  1. > While an awesome project with open-source code on Github, it is important to stress how important is it to get permission to do this kind of tracking.

    What?

    What???

    WHAT?!?!?!

    PERMISSION? WHOSE PERMISSION DID THIS IDIOT GET, AGAIN?

    The people whose permission you need are the people you’re actually spying on, not random librarians. The library’s permission has ZERO bearing on ANYTHING. There is NO legitimate application for that kind of garbage, period. And, yes, I am well aware that retailers do it all the time.

    1. The only permission needed was to install a device that presumably uses the library’s power. You implicitly give your permission by the nature of your device constantly spamming out the names of networks and other identifiable markers without being told to. It is perfectly legal for others to receive unencrypted unsolicited broadcasts that devices under your control are making on the ISM band. Further, your assertion that there is no legitimate use of tracking library patrons is naive and flawed. Libraries regularly track the flow of patrons and their habits to determine how their budgets are spent, and often report the traffic counts to their funding agencies to determine how state-wide funds are allocated.

        1. It depends, if you needed to sign a form for access to the library, there could be a free pass for them to do whatever they like on their premises included in the fine print.

          Like Windows still collects telemetry even if you turn off every single option that they allow you to disable: slightly less telemetry is harvested, but they never fully stop collecting data. And their legal loophole is “does not collect any personal data, it only collects some anonymous data to help improve their service for everyone”.

          Google does something similar: they store a cookie with a globally unique ID in web browsers, and it doesn’t matter to them who the user is; they are building up a profile for that browser’s unique ID and not the individual using the browser. It is a subtle loophole, but as long as the metadata is not linked directly to the personal data related to a living person, they step over a big legal landmine.

          The library could dive through the same loopholes.

          1. Replying to “source?”

            Search for Google user_ID for details. Lots of hits for instructions on how to use it to track users across multiple devices, ostensibly for Google Analytics.

            These are Google’s sites telling you how they can track users for you across sites and across devices, so I would trust that as a source.

    2. This depends on local law. There’s “nothing” preventing you from recording this information in bulk. Except that in some countries, there are privacy laws that make it illegal to do this information collection.

      By analogy, there’s ‘nothing’ preventing you from driving through a red light. Nothing about the traffic light prevents you in any way from driving. Photons don’t stop cars. Yet somehow, the system works.

      Some nations have decided that individual privacy is a fundamental human right, and that violating someone else’s right to privacy by collecting their MAC beacons is illegal.

      Retailers only ‘do it all the time’ where they are allowed to. If you don’t want to allow them, you will need to get involved in changing your local laws.

      I’d totally agree that the people going into the library should be informed that they’re being tracked, and because it’s a public building, perhaps the tracking shouldn’t be done at all — preventing those who value their privacy from accessing public infrastructure is unfair. But as it stands in the States, this is just a moral “we should behave this way” type of thing, which unfortunately gives the worst actors free rein.

      1. “preventing those who value their privacy from accessing public infrastructure is unfair.”

        Privacy in a library would only work if all people were nice all the time, returned all the books on time in the same state they borrowed it, and didn’t use the Internet connection for things they wouldn’t do at home.

        By entering a library you accept you will be recognised by a librarian, and/or on camera.
        By checking out a book you accept you will be logged as having checked out said book.
        By logging on the WiFi you accept at least your MAC address is logged, but from what I’ve seen you also use your library card.

        Tracking the 2.4-5 GHz spectrum doesn’t remove any privacy, as you had none in the first place.

        Best you can hope for, is they don’t keep your logs longer than needed, though unless they are particularly nice, they keep the logs as long as the law permits.

        Or you could walk into the library with your devices off, while wearing a balaclava.

      2. Dumping all the MACs and SSIDs to a hard drive, at least in the EU, could be considered a violation of GDPR, that’s obvious.

        But what if we dumped only random MACs? Since they are random and probably not stored permanently on the device, they are not Personally Identifiable Information, right?

        And what if we hashed SSIDs so that they no longer contained hotel and restaurant names, residential addresses or unique cable modem IDs? Would salting the hash help? Would changing the salt daily, because we care about daily usage and we don’t expect anyone staying overnight, mitigate the risk of a trial?

        Would the fact that all the information is not kept in any permanent storage, just in RAM, change anything? Would powering the recording device with a UPS be fine then? And what if we used encrypted swap with random keys on a local 1TB drive? Would it be permanent storage or not? And what if it used NAS, still randomly encrypted? (see the mentions about permanent storage in European Court of Justice case C-212/13 ruling)

        That’s what I don’t like about GDPR. It’s not always clear what is personal data and what isn’t. What can be used to identify an individual and what can’t. Too often it depends on an arbitrary interpretation. And the interpretation may change with the advance of technology. What about the data collected before? And it may vary from state to state. Should I, as an individual, care about logging IP addresses accessing my web page? Does it matter if it’s hosted from a cloud service or a Raspberry Pi at home? Is it a GDPR article 2 (c) exemption or not? Should I blur the license plates on the photos I post online? Can I put up a CCTV camera recording my private property if it’s not fenced and anyone can enter? Can I be held liable for using Tile or other BLE crowd-finding applications collecting and sending MACs across the border? Etc. etc.
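
The daily-salted hash floated in the comment above can be sketched in a few lines of Python. This only illustrates the mechanism and says nothing about whether the result would satisfy GDPR or any other law. The class name `DailyPseudonymizer`, the 32-byte salt, and the 16-character truncation are all assumptions chosen for the example; it uses HMAC-SHA256 from the standard library as the keyed hash.

```python
import hashlib
import hmac
import secrets
from datetime import date
from typing import Optional


class DailyPseudonymizer:
    """Replace identifiers with keyed hashes; the key changes every day."""

    def __init__(self) -> None:
        self._day: Optional[date] = None
        self._salt = b""

    def pseudonym(self, identifier: str, today: Optional[date] = None) -> str:
        today = today or date.today()
        if today != self._day:
            # New day: draw a fresh random salt and overwrite the old one,
            # so yesterday's pseudonyms can no longer be reproduced.
            self._day = today
            self._salt = secrets.token_bytes(32)
        digest = hmac.new(self._salt, identifier.encode("utf-8"), hashlib.sha256)
        # Truncated for readability in logs (an arbitrary choice here).
        return digest.hexdigest()[:16]
```

Within one day the same MAC or SSID maps to the same pseudonym, so visits can still be counted; once the salt is replaced, the pseudonyms from different days cannot be linked through this object. The salt lives only in memory, so a restart also resets it.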

    1. MAC randomization is only partly helpful. There are folks who use “hidden” networks under the impression that it makes it harder for someone to find and connect to it. It ends up making all the client devices broadcast looking for the network instead of the network broadcasting its existence for clients. That means if you’ve ever connected to a hidden network, your device could be broadcasting looking for that network all the time, even when it isn’t anywhere near you.

  2. Many places keep records of visitors in other sections of the electromagnetic spectrum (e.g. security cameras). Why is logging somebody’s signals in the radio range of the spectrum less legitimate?

    It puzzles me why security systems (home security, etc.) do not routinely keep a log of WiFi/Bluetooth/whatever MAC addresses/signatures of the devices they see.

    Most of this would just be thrown away after a little while, but when a break-in occurs, a log of what devices were present could be preserved.

    If I can show that suspect X’s device was present at the time it seems like that could assist with investigation/securing a conviction.
    (Sure you would only catch the people who weren’t too bright, but that is ever the way.)

    The long time log could be analyzed to determine which devices are normally present (filter out my devices, my neighbors’ devices, etc.)

    1. For one, because MAC addresses (randomized, hashed or otherwise) are considered personal data under GDPR. So in some countries the security provider would face a much greater chance of (and fine for) violating GDPR than for, say, (illegally) publishing the offender’s picture.

      1. Lots of places aren’t subject to GDPR. (e.g. USA)
        Protecting the physical security (and information security, etc.) of a facility might well be in the public interest, help to fulfill a legal obligation (e.g. obligation to keep data secure), or be in the vital interest of the facility owner, employees, customers, etc.
        Not clear that home settings fall under GDPR, so not clear that it would be a limitation in home security.

  3. Did no one catch the fact that the source code only covers how to send the collected data over serial and not to a WiFi relay as stated in this article, or the original?

    A person brought this up in the HAD project page comments, and one of the contributors danced around the issue without actually addressing the problem. Granted, you could always write the code yourself, but at minimum it would be nice for the authors to throw in a little blurb about how not all described functionality is available in the source.

    Also, check out a project called Nzyme.

  4. I have something like this for determining who is home and who is away. If no one is home for X minutes, the alarm enables. The difficult part is relaying this information out. I had to use two ESP8266, one to monitor for devices and one to connect to the wifi and relay the information to the automation system.

      1. I also have motion sensors for detecting if someone is actually home or not. Again, with a long enough idle time during the correct periods during the day, the alarm is enabled. The MAC address detection just speeds up the detection of no one being home. Even if they don’t have their phone, the motion sensors will prevent the alarm from enabling. There’s always ways this goes sideways, but that’s the problem with having humans as part of any security system.
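
The “no known device seen for X minutes” rule described in this thread might look something like the Python sketch below. It is not the commenter’s code: `PresenceTracker` and its methods are hypothetical names, and it assumes each household phone shows a stable MAC address at home (randomized addresses would need different handling). Motion sensors, as mentioned above, would be a separate input.

```python
import time
from typing import Dict, Iterable, Optional


class PresenceTracker:
    """Track when each known device was last heard from."""

    def __init__(self, known_macs: Iterable[str], timeout_s: float) -> None:
        self._known = {m.lower() for m in known_macs}
        self._timeout_s = timeout_s
        self._last_seen: Dict[str, float] = {}

    def saw(self, mac: str, now: Optional[float] = None) -> None:
        """Record a sighting; devices not in the known list are ignored."""
        mac = mac.lower()
        if mac in self._known:
            self._last_seen[mac] = time.monotonic() if now is None else now

    def anyone_home(self, now: Optional[float] = None) -> bool:
        """True if any known device was seen within the timeout window."""
        now = time.monotonic() if now is None else now
        return any(now - t < self._timeout_s for t in self._last_seen.values())
```

The alarm logic would then enable itself once `anyone_home()` turns false. The optional `now` argument exists only so the timing can be exercised without waiting.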

  5. I actually went to a marketing presentation where most of the companies who track people were pimping their platforms (around 9-10 years ago now; Google and Facebook were there, along with a few of the older companies who send targeted spam emails, and other data harvesting companies). I got an invite from a friend who was unsure whether they would need help understanding the technology involved; they did not. The whole thing was pitched perfectly for people in advertising and marketing companies. And you actually got to ask them questions, which they all answered! My eyes were opened: I learned about microsecond auctions that decide which advertiser gets to place the next ad in the perfect position on the screen. None of the actual data harvested is sold, only access to push an ad to either keywords or a demographic selection. That is what is sold. E.g., you want your ad displayed to Caucasian females 14-24 who live in Polk City, Florida, and are pregnant at the time that the ad is displayed. The metric used for tracking success is the click-through rate (which is how they originally learned the best spots on the screen to display ads). Then, with analytics using either cookies or tracking pixels, the previous website can be recorded to monitor and convert click-through rates into real-world sales.

    In terms of marketing, 9-10 years ago is an eternity. I’m sure that things have changed a lot since then. That is where I learned about google’s unique tracking cookies.

    1. I was very confused by that comment as well. I was unfamiliar with GDPR and am learning new things.

      While I support privacy, I do find it strange that a MAC address is considered personal information. It seems to me that something broadcast in the clear, all the time, cannot be considered private. It would be like walking around with a name tag on and acting as if you want to remain anonymous. Or if you approached everyone you passed on the street and said “hi my name is ___.” You cannot be upset if someone remembers your name. If you don’t want them to, you either don’t tell them (your wifi is off or your WiFi does not broadcast its MAC address to everyone) or you give a fake name (random MAC addresses).

      I realize it may compromise privacy to have your WiFi on when you’re in public—but so does simply being in public. I’m just struggling to understand the concept here.
