“Lazier” Web Scraping Is Better Web Scraping

February 7, 2022 by Donald Papp 13 Comments

Ever needed to get data from a web page? Parsing the content for data is called web scraping, and [Doug Guthrie] has a few tips for making the process of digging data out of a web page simpler and more efficient, complete with code examples in Python. He uses getting data from Yahoo Finance as an example, because it’s apparently a pretty common use case judging by how often questions about it pop up on Stack Overflow. The general concepts are pretty widely applicable, however.

[Doug] shows that while parsing a web page for a specific piece of data (for example, a stock price) is not difficult, there are sometimes easier and faster ways to go about it. In the case of Yahoo Finance, the web page most of us look at isn’t really the actual source of the data being displayed, it’s just a front end.

One way to more efficiently scrape data is to get to the data’s source. In the case of Yahoo Finance, the data displayed on a web page comes from a JavaScript variable that is perfectly accessible to the end user, and much easier to parse and work with. Another way is to go one level lower, and retrieve JSON-formatted data from the same place that the front-end web page does; ignoring the front end altogether and essentially treating it as an unofficial API. Either way is not only easier than parsing the end result, but faster and more reliable, to boot.

How does one find these resources? [Doug] gives some great tips on how exactly to do so, including how to use a web browser’s developer tools to ferret out XHR requests. These methods won’t work for everything, but they are definitely worth looking into to see if they are an option. Another resource to keep in mind is woob (web outside of browsers), which has an impressive list of back ends available for reading and interacting with web content. So if you need data for your program, but it’s on a web page? Don’t let that stop you!

Python Web Proxy Convinces Sonos To Stream YouTube

February 4, 2022 by Dave Rowntree 22 Comments

[Maurice-Michel Didelot] owns a Sonos smart speaker, and was lamenting the devices inability (or plain unwillingness) to stream music from online sources without using a subscription service. YouTube Music will work, but being a subscription product there is a monthly fee, which sucks since you can listen to plenty of content on YouTube for free. [Maurice] decided that the way forward was to dig into how the Sonos firmware accesses ‘web radio’ sources, and see if that could be leveraged to stream audio from YouTube via some kind of on-the-fly stream conversion process.

What? No MP4 support for web radio? Curses!

So let’s dig in to how [Maurice] chose to approach this. The smart speaker can be configured to add various streaming audio sources, and allows you add custom sources for those. The Sonos firmware supports a variety of audio codecs, besides MP3, but YouTube uses the MP4 format. Sonos won’t handle that from a web radio source, so what was there to do, but make a custom converter?

After a little digging, it was determined that Sonos supports AAC encoding (which is how MP4 encodes audio) but needs it wrapped in an ADTS (Audio Data Transport Stream) container. By building a reverse web-proxy application, in python using Flask, it was straightforward enough to grab the YouTube video ID from the web radio request, forward a request to YouTube using a modified version of pytube tweaked to not download the video, but stream it. Pytube enabled [Maurice] to extract the AAC audio ‘atoms’ from the MP4 container, and then wrap them up with ADTS and forward them onto the Sonos device, which happily thinks it’s just a plain old MP3 radio stream, even if it isn’t.

Sonos doesn’t have the best reputation, let’s say, but you can’t deny that there’s some pretty slick tech going on inside. Here’s a neat hack we covered last year, adding Sonos support to an old school speaker, and a nice teardown of a IKEA Sonos-compatible unit, which uses some neat design hacks.

Thanks [mip] for the tip!

Featured image by Charles Deluvio on Unsplash.

Linux Arcade Cab Gives Up Its Secrets Too Easily

February 1, 2022 by Dave Rowntree 9 Comments

Sometimes reverse engineering embedded systems can be a right old faff, with you needing to resort to all kinds of tricks such as power glitching in order to poke a tiny hole in the armour, giving you an way in. And, sometimes the door is just plain wide open. This detailed exploration of an off-the-shelf retro arcade machine, is definitely in that second camp, for an unknown reason. [Matthew Alt] of VoidStar Security, took a detailed look into how this unit works, which reads as a great introduction to how embedded Linux is constructed on these minimal systems.

Could this debug serial port be more obvious?

The hardware is the usual bartop cabinet, with dual controls and an LCD display, with just enough inside a metal enclosure to drive the show. Inside this, the main PCB has the expected minimal ARM-based application processor with its supporting circuit. The processor is the Rockchip RK3128, sporting a quad-core ARM Neon and a Mali400 GPU, but the main selling point is the excellent Linux support. You’ll likely see this chip or its relatives powering cheap Android TV boxes, and it’s the core of this nice looking ‘mini PC’ platform from firefly. Maybe something to consider seeing as though Raspberry Pis are currently so hard to come by?

Anyway, we digress a little, [Matthew] breaks it down for us in a very methodical way, first by identifying the main ICs and downloading the appropriate datasheets. Next he moves on to connectors, locating an internal non-user-facing USB micro port, which is definitely going to be of interest. Finally, the rather obvious un-populated 3-pin header is clearly identified as a serial port. This was captured using a Saleae clone, to verify it indeed was a UART interface and measure the baud rate. After doing that, he hooked it into a Raspberry Pi UART and by attaching the standard screen utility to the serial device, lo-and-behold, a boot log and a root prompt! This thing really is barn-door wide-open.

Is that a root prompt you have for me? Oh why yes it is!

Simply by plugging in a USB stick, the entire flash memory was copied over, partitions and all, giving a full backup in case subsequent hacking messed things up. Being based on U-Boot, it was a trivial matter of just keying in ‘Ctrl-C’ at boot time, and he was dropped straight into the U-Boot command line, and all configuration could be easily read out. By using U-Boot to low-level dump the SPI flash to an external USB device, via a RAM copy, he proved he could do the reverse and write the same image back to flash without breaking something, so it was now possible to reverse engineer the software, make changes and write it back. Automation of the process was done using Depthcharge on the Raspberry Pi, which was also good to read about. We will keep an eye on the blog for what he does with it next!

As we’ve covered earlier, embedded Linux really is everywhere, and once you’ve got hardware access and some software support, hacking in new tricks is not so hard either.

Online Tool Turns STLs Into 3D ASCII Art

January 22, 2022 by Dan Maloney 8 Comments

If you look hard enough, most of the projects we feature on these pages have some practical value. They may seem frivolous, but there’s usually something that compelled the hacker to commit time and effort to its doing. That doesn’t mean we don’t get our share of just-for-funsies projects, of course, which certainly describes this online 3D ASCII art generator.

But wait — maybe that’s not quite right. After all, [Andrew Sink] put a lot of time into the code for this, and for its predecessor, his automatic 3D low-poly generator. That project led to the current work, which like before takes an STL model as input, this time turning it into an ASCII art render. The character set used for shading the model is customizable; with the default set, the shading is surprisingly good, though. You can also swap to a black-on-white theme if you like, navigate around the model with the mouse, and even export the ASCII art as either a PNG or as a raw text file, no doubt suitable to send to your tractor-feed printer.

[Andrew]’s code, which is all up on GitHub, makes liberal use of the three.js library, so maybe stretching his 3D JavaScript skills is really the hidden practical aspect of this one. Not that it needs one — we think it’s cool just for the gee-whiz factor.

Continue reading “Online Tool Turns STLs Into 3D ASCII Art” →

Setup Menu Uses Text Editor Hack

January 20, 2022 by Chris Lott 19 Comments

Many embedded devices that require a setup menu will use a USB serial port which you connect to your favorite terminal emulator. But we recently encountered a generic USB knob that did setup using a text editor, like Notepad or even Vim (although that was a bit ugly). A company called iWit makes several kinds of USB knobs which end up in many such products.

These generic USB knobs are normally just plug-and-play, and are used to control your PC’s volume and muting. Some models, like the iWit, the user can configure the mapping within the device. For example, knob rotation can be set to generate up and down arrow keys, and knob press could be ENTER. One could do this kind of mapping on the PC, but many of these USB knobs can do it for you. The crux of the setup is this menu (which you can see in action in the first 30 seconds of the video below).

Continue reading “Setup Menu Uses Text Editor Hack” →

Learning The Ropes With A Raspberry Pi Mandelbrot Cluster

January 18, 2022 by Tom Nardi 10 Comments

You’ve probably heard it said that clustering a bunch of Raspberry Pis up to make a “supercomputer” doesn’t make much sense, as even a middle-of-the-road desktop could blow it away in terms of performance. While that may be true, the reason most people make Pi clusters isn’t for raw power, it’s so they can build experience with parallel computing without breaking the bank.

So while there was probably a “better” way to produce the Mandelbrot video seen below, creator [Michael Kohn] still learned a lot about putting together a robust parallel processing environment using industry standard tools like Kubernetes and Docker. Luckily for us, he was kind enough to document the whole process for anyone else who might be interested in following in his footsteps. Whatever your parallel task is, and whatever platform it happens to be running on, some of the notes here are likely to help you get it going.

It’s not the biggest Raspberry Pi cluster we’ve ever seen, but the four Pi 4s and the RGB LED festooned enclosure they live in make for an affordable and space-saving cluster to hone your skills on. Whether you’re practicing for the future of software development and deployment, or just looking for something new to play around with, building one of these small-scale clusters is a great way to get in on the action.

Continue reading “Learning The Ropes With A Raspberry Pi Mandelbrot Cluster” →

KiCAD 6.0: What Made It And What Didn’t

January 18, 2022 by Dave Rowntree 56 Comments

I’ve been following the development of KiCAD for a number of years now, and using it as my main electronics CAD package daily for a the last six years or thereabouts, so the release of KiCAD 6.0 is quite exciting to an electronics nerd like me. The release date had been pushed out a bit, as this is such a huge update, and has taken a little longer than anticipated. But, it was finally tagged and pushed out to distribution on Christmas day, with some much deserved fanfare in the usual places.

So now is a good time to look at which features are new in KiCAD 6.0 — actually 6.0.1 is the current release at time of writing due to some bugfixes — and which features originally planned for 6.0 are now being postponed to the 7.0 roadmap and beyond. Continue reading “KiCAD 6.0: What Made It And What Didn’t” →