File Compression By Steganography

In a world with finite storage and an infinite need for more storage space, data compression becomes a very necessary problem. Several algorithms for data compression may be more familiar – Huffman coding, LZW compression – and some a bit more arcane.

[Labunsky] decided to put to use his knowledge of steganography to create a wholly unique form of file compression, perhaps one that may gain greater notoriety among other information theorists.

Steganography refers to the method of concealing messages or files within another file, coming from the Greek words steganos for “covered or concealed” and graphe for “writing”. The practice has been around for ages, from writing in invisible ink to storing messages in moon cakes. The methods used range from hiding messages in images to evade censorship to hiding viruses in files to cause mayhem.

100% not [via xkcd]
The developer explains that since every file is just a bit sequence, observing files leads to the realization that a majority of bits will be equal on the same places. Rather than storing all of the bits of a file, making modifications to the hard drive at certain locations can save storage space. What is important to avoid, however, is lossy file compression that can wreak havoc on quality during the compression stage.

The compression technique they ended up implementing is based on the F5 algorithm that embeds binary data into JPEG files to reduce total space in the memory. The compression uses libjpeg for JPEG decoding and encoding, pcre for POSIX regular expressions support, and tinydir for platform-independent filesystem traversal. One of the major modifications was to save computation resources by disabling a password-based permutative straddling that uniformly spreads data among multiple files.

One caveat – changing even one bit of the compressed file could lead to total corruption of all of the data stored, so use with caution!

Reverse Engineering Liberates Dash Cam Video

If you’ve purchased a piece of consumer electronics in the last few years, there’s an excellent chance that you were forced to use some proprietary application (likely on a mobile device) to unlock its full functionality. It’s a depressing reality of modern technology, and unless you’re willing to roll your own hardware, it can be difficult to avoid. But [krishnan793] decided to take another route, and reverse engineered his DDPAI dash camera so he could get a live video stream from it without using the companion smartphone application.

Like many modern gadgets, the DDPAI camera creates its own WiFi access point that you need to connect to for configuration. By putting his computer’s wireless card into Monitor mode and running Wireshark, [krishnan793] was able to see that the smartphone was communicating with the camera using some type of REST API. After watching the clear-text exchanges for awhile, he not only discovered a few default usernames and passwords, but the commands necessary to configure the camera and start the video stream.

After hitting it with the proper REST messages, an nmap scan confirmed that several new services had started up on the device. Unfortunately, he didn’t get any video when he pointed VLC to the likely port numbers. At this point [krishnan793] checked the datasheet for the camera’s Hi3516E SoC and saw that it supported H.264 encoding. By manually specifying that as the video codec when invoking VLC, it was able to play a video stream from port 6200. A little later, he discovered that port 6100 was serving up the live audio.

Technically that’s all he wanted to do in the first place, as he was looking to feed the video into OpenCV for other projects. But while he was in the area, [krishnan793] also decided to find the download URL for the camera’s firmware, and ran it through binwalk to see what he could find out. Not surprisingly the security turned out to be fairly lax through the entire device, so he was able to glean some information that could be useful for future projects.

Of course, if you’d rather go with the first option and build your own custom dash camera so you don’t have to jump through so many hoops just to get a usable video stream, we’ve got some good news for you.

Linux Fu: Python GUIs For Command Line Programs (Almost) Instantly

Not every programmer likes creating GUI code. Most hacker types don’t mind a command line interface, but very few ordinary users appreciate them. However, if you write command line programs in Python, Gooey can help. By leveraging some Python features and a common Python idiom, you can convert a command line program into a GUI with very little effort.

The idea is pretty simple. Nearly all command line Python programs use argparse to simplify picking options and arguments off the command line as well as providing some help. The Gooey decorator picks up all your options and arguments and creates a GUI for it. You can make it more complicated if you want to change specific things, but if you are happy with the defaults, there’s not much else to it.

At first, this article might seem like a Python Fu and not a Linux Fu, since — at first — we are going to focus on Python. But just stand by and you’ll see how this can do a lot of things on many operating systems, including Linux.

Continue reading “Linux Fu: Python GUIs For Command Line Programs (Almost) Instantly”

This Sentence‌‌‌‌‍‌ Isn’‌‌‌‌‍‌‬t Just ‌‌‌‌‌‬‌‌a‌‌‌‌‍‬‬‍ Sentence‌‌‌‌‍‌‌‌‌‌‌‬‌‌‌‌‌‌‍‬‬‍‌‌‌‌‍‍‌‌‌‌‌‌

Some sentences have more than meets the eye, and we’re not talking about interpretive nonsense. Rather, some sentences may contain up to four paragraphs’ worth of hidden text, invisible to readers.

Thanks to Zero Width Obfuscation, it is possible to use Zero Width Characters – Unicode characters that are invisible even when you try to highlight them. They’re typically used for abstract foreign languages that require separators that don’t take up an entire space. In this case, they’re used to obfuscate and de-obfuscate hidden messages sent through text.

[inzerosight] published a browser extension that identifies, de-obfuscates, and obfuscates these messages for you on the web. It does this by querying each page for the Unicode of the Zero Width Characters (U+FEFF, U+200C, U+200D, U+200E, U+2060, U+180E) and highlighting where they’ve been spotted. The encoding replaces each Unicode character with a permutation of two of the Zero Width Characters, essentially doing a find and replace across the text message.

I’m just waiting to see how long it takes for Zero Width Obfuscation to become the next Konami Code Easter Egg.

How Random Is Random?

Many languages feature a random number generator library for help with tasks like rolling a die or flipping a coin. Why, you may ask, is this necessary when humans are perfectly capable of randomly coming up with values?

[ex-punctis] was curious about the same quandary and decided to code up an experiment to test the true randomness of human. A script guesses the user’s next input from two choices, keeping a tally in the JavaScript backend that holds on to the past five choices. If the script guesses correctly, they take $1 from the user. Otherwise, the user earns $1.05.

The data from gathered from running the script with 200 pseudo-random inputs 100,000 times resulted in a distribution of correct guess approximately normal (µ=50% and σ=3.5%). The probability of the script correctly guessing the user’s input is >57% from calculating µ+2σ. The result? Humans aren’t so good at being random after all.

It’s almost intuitive why this happens. Finger presses tend to repeat certain patterns. The script already has a database of all possible combinations of five presses, with a counter for each combination. Every time a key is pressed, the latest five presses is updated and the counter increases for whichever combination of five presses this falls under. Based on this data, the script is able to make a prediction about the user’s next press.

In a follow-up statistic analysis, [ex-punctis] notes that with more key presses, the accuracy of the script tended to increase, with the exception of 1000+ key presses. The latter was thought to be due to the use of a psuedo random number generator to achieve such high levels of engagement with the script.

Some additional tests were done to see if holding shorter or longer sequences in memory would account for more accurate predictions. While shorter sequences should theoretically work, the risk of players keeping a tally of their own presses made it more likely for the longer sequences to reduce bias.

There’s a lot of literature on behavioral models and framing effects for similar games if you’re interested in implementing your own experiments and tricking your friends into giving you some cash.

An Apartment-Hunting AI

Finding a good apartment is a lot of work and includes searching websites for available places and then cross-referencing with a list of characteristics. This can take hours, days or even months but in a world where cars drive themselves, it is possible to use machine learning in your hunt.

[veesot] lives in a city between Europe and Asia and was looking for a new home, and his goal was to create a model that can use historical data to not only suggest if an advertised price was right, but also recommend waiting by predicting the decrease in the the future. The data-set includes parameters such as “area”, “district”, “number of balconies” etc and tried to determine an optimal property to view.

There is a lot that [veesot] describes in his post which includes cleaning the data in terms of removing flats that are tool small or tool large. This is essentially creating a training data-set for the machine learning system that will allow the system to generate usable output. [veesot] also added parameters such districts which relate to the geographical location, age of the building and even the materials used in the construction.

There is also an interesting bit about analyzing the data variables and determining cross-correlation which ultimately leads to the obvious conclusions that the central/older districts have older apartments and newer ones are larger. It makes for a few cool graphs but the code can certainly come in handy when dealing with similar data-sets. The last part of the writing discusses applying Linear Regression and then testing its accuracy. Interpreting the model produces interesting results about the trained model and the values of the coefficients.

Continue reading “An Apartment-Hunting AI”

A Web API For Your Pi

There are many ways to attach a project to the Internet, and a plethora of Internet-based services that can handle talking to hardware. But probably the most ubiquitous of Internet protocols for the average Joe or Jane is the web browser, and one of the most accessible of programming environments lies within it. If only somebody with a bit of HTML and Javascript could reach a GPIO pin on their Raspberry Pi!

If that’s your wish, then help could be at hand in the form of [Victor Ribeiro]’s RPiAPI. As its name suggests, it’s an API for your Raspberry Pi, and in particular it provides a simple web-accessible endpoint wrapper for the Pi’s GPIO library from which its expansion port pins can be accessed. By crafting a simple path on the address of the Pi’s web server each pin can be read or written to, which while it’s neither the fastest or most accomplished hardware interface for the platform, could make it one of the easiest to access.

Security comes courtesy of Apache password protected directories via .htaccess files, so users would be well-advised to consider the implications of connecting this to a public IP address very carefully. But for non experts in security it still has the potential to make a very useful tool in the armoury of ways to control hardware from the little single board computer. It’s not the first try at this idea as we’ve seen a PHP example early in the Pi’s lifetime as well as one relying upon MySQL, but it does seem to be a simpler option than the others.