Web scraping Amazon and Rotten Tomatoes

[Rajesh] put web scraping to good use to gather the information important to him. He’s published two posts about it. One scrapes Amazon daily to see if the books he wants to read have dropped to a certain price threshold. The other scrapes Rotten Tomatoes to display the audience score next to the critics’ score for the top rental movies.

Web scraping uses scripts to gather information programmatically from HTML rather than using an API to access data. We recently featured a conceptual tutorial on the topic, and even came across a hack that scraped all of our own posts. [Rajesh's] technique is pretty much the same.

He’s using Python scripts with the Beautiful Soup module to parse the DOM tree for the information he’s after. In the case of the Amazon script he sets a target price for a specific book and gets an email automatically when the price drops to that level. With Rotten Tomatoes he sometimes likes to see the audience score when considering a movie, but you can’t get it on the list at the website; you have to click through to each movie. His script keeps a database so that it doesn’t continually scrape the same information. The collected numbers are displayed alongside the critics’ scores as seen above.
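To illustrate the two ideas in that paragraph, here is a minimal sketch of a price-threshold check and a small database cache that avoids fetching the same score twice. It is not [Rajesh's] code: the table layout, the function names, and the `fake_fetch` stand-in for the real scraping step are all assumptions made for the example.

```python
import sqlite3

def make_cache(conn):
    # Hypothetical schema: one row per movie, keyed by title.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scores "
        "(movie TEXT PRIMARY KEY, audience INTEGER)"
    )

def audience_score(conn, movie, fetch):
    """Return the cached score, calling fetch(movie) only on a cache miss."""
    row = conn.execute(
        "SELECT audience FROM scores WHERE movie = ?", (movie,)
    ).fetchone()
    if row is not None:
        return row[0]
    score = fetch(movie)  # this is where the actual scrape would happen
    conn.execute(
        "INSERT INTO scores (movie, audience) VALUES (?, ?)", (movie, score)
    )
    conn.commit()
    return score

def price_reached(current_price, target_price):
    """True once the scraped price is at or below the target."""
    return current_price <= target_price

# Stand-in for the scraper so the sketch runs without network access.
calls = []
def fake_fetch(movie):
    calls.append(movie)
    return 87

conn = sqlite3.connect(":memory:")
make_cache(conn)
first = audience_score(conn, "Example Movie", fake_fetch)
second = audience_score(conn, "Example Movie", fake_fetch)  # served from cache
```

The second lookup never touches `fake_fetch`, which is the whole point of keeping the database around. A real script would send the email when `price_reached` returns true.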

Google Talk bot running on Raspberry Pi

[Michael Mitchell] put together a demonstration of how Google Talk can be used to communicate with scripts. Although the concept isn’t new we haven’t seen very many projects that use the chat interface for issuing commands and receiving data. The one that does come to mind is this home automation project which uses Google Talk because it’s quite a bit faster than SMS or email communications.

Luckily there’s already a Python library called pygtalkrobot which handles the XMPP protocol used by Google Talk. In addition to that package, [Michael] also installs some extras which allow him to access the GPIO pins on the RPi via Python. In the video after the break he demonstrates switching an LED on and off, as well as reading from a slide switch connected to pin 8. Of course it’s a snap to code feedback from the OS itself. As you can see in the image above the RPi is reporting its uptime after being issued a command by [Michael].
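The general shape of a chat-driven command bot can be sketched without any chat library at all. The example below is only an illustration of matching incoming messages against patterns and replying with text. `CommandBot`, its `command` decorator, and the handler names are hypothetical and are not the pygtalkrobot API. The GPIO write is left as a comment, and reading `/proc/uptime` assumes a Linux system such as the RPi.

```python
import re

def format_uptime(seconds):
    """Turn a number of seconds into a short human-readable string."""
    days, rem = divmod(int(seconds), 86400)
    hours, rem = divmod(rem, 3600)
    minutes = rem // 60
    return f"up {days}d {hours}h {minutes}m"

def read_uptime_seconds(path="/proc/uptime"):
    # Linux only: the first field of /proc/uptime is seconds since boot.
    with open(path) as f:
        return float(f.read().split()[0])

class CommandBot:
    """Hypothetical dispatcher: maps message patterns to handler functions."""

    def __init__(self):
        self.handlers = []
        self.led_on = False

    def command(self, pattern):
        def register(func):
            self.handlers.append((re.compile(pattern, re.IGNORECASE), func))
            return func
        return register

    def handle(self, message):
        for regex, func in self.handlers:
            match = regex.fullmatch(message.strip())
            if match:
                return func(*match.groups())
        return "unknown command"

bot = CommandBot()

@bot.command(r"led (on|off)")
def set_led(state):
    bot.led_on = state.lower() == "on"
    # On a real Pi, a GPIO output call would go here.
    return f"LED is {state.lower()}"

@bot.command(r"uptime")
def uptime():
    return format_uptime(read_uptime_seconds())
```

In a real bot, each incoming chat message would be passed to `bot.handle()` and the returned string sent back as the reply.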

Printing images with a wood burning CNC machine

Just to clear up any confusion from the title, this wood burning CNC machine runs on electricity. The wood burner acts as the print head. It’s the thing in the upper right of the field that looks a bit like a soldering iron. In this case it’s being used like a dot matrix printer.

We suppose this is a form of halftone printing, although it doesn’t produce the uniformity we’ve seen with mill-based halftone techniques. [Random Sample] built the machine from wood, drawer sliders, and stepper motors with toothed belts. His Python script takes an image and transforms it into a file which can be used to guide each of the three axes of the machine. An Arduino receives these commands via the USB connection. Each image prints in a grid, with darker pixels created by leaving the hot tip in contact with the wood for a longer period of time.
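The darker-means-longer idea is easy to sketch. The example below assumes an 8-bit grayscale image stored as a list of rows and a made-up text command format (`X`, `Y`, and a dwell time `D` in milliseconds). Neither the format nor the 500 ms maximum comes from [Random Sample's] script, and the third axis that lowers the tip is left out.

```python
def dwell_ms(pixel, min_ms=0, max_ms=500):
    """Map a grayscale value (0 = black, 255 = white) to a dwell time."""
    darkness = (255 - pixel) / 255
    return round(min_ms + darkness * (max_ms - min_ms))

def image_to_commands(rows, step_mm=1.0):
    """Walk the pixel grid and emit one hypothetical command per burnt dot."""
    commands = []
    for y, row in enumerate(rows):
        for x, pixel in enumerate(row):
            ms = dwell_ms(pixel)
            if ms == 0:
                continue  # white pixel: skip, the tip never touches the wood
            commands.append(f"X{x * step_mm:.1f} Y{y * step_mm:.1f} D{ms}")
    return commands

# A tiny 2x2 test image: white, black / mid-gray, white.
image = [
    [255, 0],
    [128, 255],
]
commands = image_to_commands(image)
```

Each command string could then be written over the serial link for the Arduino to act on.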

Don’t miss the sample video embedded after the jump.

Picture frame that scrapes train times from the web

Whenever [Gareth James] needs to catch a train he has only to push a button on this frame and the next three departure times will be displayed. As you can see from the post-processing in the photo, this is accomplished by a Raspberry Pi board using a few familiar tools.

Let’s take a look at the hardware first. He acquired a 7″ LCD display which he removed from its plastic case. The bare screen will easily fit inside of the rather deep wood frame and its composite video input makes it quite simple to interface with the RPi board. There was a little work to be done for power. The LCD needs 12V so he’s using a 12V wall wart to feed the frame, and including a USB car charger to power the RPi. The last thing he added is a button connected to the GPIO header to tell the system to fetch a new set of times.

A Python script monitors the button and uses Beautiful Soup to scrape the train info off of a website. To get the look he wanted [Gareth] wrote a GUI using tkinter. Don’t miss the demo after the jump.
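As a rough idea of what the scraping step involves, here is a sketch that pulls the next three departure times out of a table. It uses the standard library's `html.parser` rather than Beautiful Soup so that it runs without extra installs. The sample HTML and the `time` class name are invented for the example, since a real site's markup will differ.

```python
from html.parser import HTMLParser

# Invented markup standing in for a real departures page.
SAMPLE_HTML = """
<table>
  <tr><td class="time">08:15</td><td>London</td></tr>
  <tr><td class="time">08:45</td><td>London</td></tr>
  <tr><td class="time">09:15</td><td>London</td></tr>
  <tr><td class="time">09:45</td><td>London</td></tr>
</table>
"""

class DepartureParser(HTMLParser):
    """Collect the text of every <td class="time"> cell."""

    def __init__(self):
        super().__init__()
        self.times = []
        self._in_time_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "td" and ("class", "time") in attrs:
            self._in_time_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_time_cell = False

    def handle_data(self, data):
        if self._in_time_cell and data.strip():
            self.times.append(data.strip())

def next_departures(html, count=3):
    parser = DepartureParser()
    parser.feed(html)
    return parser.times[:count]

departures = next_departures(SAMPLE_HTML)
```

In the finished frame, a button press would trigger a fresh download of the page, and the resulting list would be handed to the GUI for display.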

If you need a bit of a primer on scraping web data take a look at this guide.

pyMCU test project looks like a Minecraft mob

Hackaday’s own [Jeremy Cook] has been testing out the pyMCU board and managed to put together an animated block head that looks like it could be a foe in Minecraft. That’s thanks mostly to the block of foam he’s using as a diffuser. The face of the project is a set of LEDs. These, along with the servo motors that move the neck are controlled using Python code which you can glance at after the break (there’s a video demo there too).

We first saw pyMCU early in the year. The PIC 16F1939 offers plenty of IO and acts as a USB connected bridge between your hardware and your Python scripts. Speaking of hardware, the test platform used to be an RC helicopter. [Jeremy] scrapped most of it, but kept the servo motors responsible for the pitch of the rotors. The board makes these connections easy, and the concept makes controlling them even easier. In fact, there are only about 17 lines of code for the functions that control the servos. The rest is a simple UI built with Tkinter.

Python can be your best friend when it comes to binary math

If you’re into microcontrollers you know the ability to think and perform math in binary is a must. [Joe Ptiz] has been looking for a way to keep from being distracted by the math when coding while still keeping the binary strings in the forefront of his mind. The solution he came up with is to use the Python interpreter as a binary math aide.

We knew that you could use Python to convert between decimal, hexadecimal, and binary. But we failed to make the leap to using it for troubleshooting bit-wise operations. We can see this being especially useful when working with sixteen-bit I/O ports like those found on STM32 chips. For us it’s easy to do 8-bit math in our head, but doubling that is another story.

The image above is one screenshot from [Joe's] tutorial. It illustrates a few different bit-wise operators given decimal inputs but displaying binary as output. He also illustrates how you can use Python to test out equations from C code by first setting the variables, pasting the equation, then printing the result to see if the output is what was expected.
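For a flavor of what this looks like at the prompt, here is a short sketch along the same lines. It is our own example rather than a copy of [Joe's] session. The `show16` helper and the port value are made up for illustration, and the mask with `0xFFFF` mimics a sixteen-bit register because Python integers have no fixed width.

```python
def show16(value):
    """Format a value as a 16-bit binary string, like a 16-bit port."""
    return format(value & 0xFFFF, "016b")

# Conversions between bases using built-ins.
as_binary = bin(5)            # '0b101'
as_hex = hex(255)             # '0xff'
from_binary = int("1010", 2)  # 10

# Bit-wise operations on a pretend 16-bit I/O port.
port = 0b0000000011110000
with_bit9_set = port | (1 << 9)       # set bit 9
with_bit4_cleared = port & ~(1 << 4)  # clear bit 4
inverted = ~port                      # negative in Python until masked

print(show16(port))               # 0000000011110000
print(show16(with_bit9_set))      # 0000001011110000
print(show16(with_bit4_cleared))  # 0000000011100000
print(show16(inverted))           # 1111111100001111
```

Pasting a C expression such as `port & ~(1 << 4)` works unchanged, as long as you remember to mask the result to the register width before comparing.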

Web scraping tutorial

Web scraping is the act of programmatically harvesting data from a webpage. It consists of finding a way to format the URLs to pages containing useful information, and then parsing the DOM tree to get at the data. It’s a bit finicky, but our experience is that this is easier than it sounds. That’s especially true if you take some of the tips from this web scraping tutorial.

It is more of an intermediate tutorial as it doesn’t feature any code. But if you can bring yourself up to speed on using BeautifulSoup and Python the rest is not hard to implement by trial and error. [Hartley Brody] discusses investigating how the GET requests are formed on your webpage of choice. Once that URL syntax has been figured out, just look through the source code for tags or attributes (CSS classes or otherwise) that can be used as hooks to get at your target data.
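Since the tutorial itself has no code, here is a small sketch of the first step, working out the GET request format and generating URLs from it. The base URL and the `q` and `page` parameter names are placeholders we made up. On a real site you would find them by watching the address bar as you search and page through results.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_url(base, **params):
    """Attach query parameters to a base URL, escaping them properly."""
    return f"{base}?{urlencode(params)}"

# Placeholder site and parameter names, for illustration only.
BASE = "https://example.com/search"

url = build_url(BASE, q="raspberry pi", page=2)

# Going the other way: pick apart a URL copied from the browser
# to see which parameters it carries.
observed = parse_qs(urlsplit(url).query)

# Once the pattern is known, generating every page to visit is a loop.
pages = [build_url(BASE, q="raspberry pi", page=n) for n in range(1, 4)]
```

Each generated URL would then be fetched and its HTML searched for the tags or classes that wrap the data you want.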

So what can this be used for? A lot of things. We’d suggest reading the Reddit comments as there are several real world uses discussed there. But one that immediately springs to mind is the picture harvesting [Mark Zuckerberg] used when he created Facemash.