A couple of weeks ago I was visiting SupplyFrame to meet the new owners of Hackaday. The CEO, [Steve Flagg], asked me how we could introduce FindChips to the readers. I’m used to people trying to get things on our front page, so I had the question ready for him: where’s the hack? Much to my surprise he was ready for me. “What if I tell you that it started as a hack by a NASA Engineer?”
It turns out he was right. He put me in contact with [Randy Sargent], the founder of FindChips.com. If you’ve ever hacked together a script to make your life easier you’ll want to listen to what Randy had to say. You never know when it’ll turn into a full-blown start-up.
It turns out that [Randy] wasn’t a NASA engineer when he founded FindChips, but he is now. For those who aren’t familiar with it, the site offers a unified search for electronic components that gives you data from worldwide vendors on their stock count and price breakdown. This sort of service simply didn’t exist in the 1990s, and [Randy] was getting really frustrated with the time he was spending visiting every possible vendor site to search for components going into his designs. So he did what any good hacker does: he wrote a script to do it for him.
The script and how it spread
What he came up with was a Perl script which used web scraping to do the searching for him. [Randy] recalls that there were no parsing libraries (e.g. Beautiful Soup) available at the time, so he wrote everything himself to grab the data and make sense of it with regular expressions. The software was fragile in that small changes on one vendor’s site could break the scraper, so he wrote libraries that would search for general landmarks. For instance, they looked for any table and matched two column headers to identify it. This way if other columns were added or removed, or if the order of the tables was changed, the script would still run.
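To make the landmark idea concrete, here is a minimal sketch of it in Python. This is not [Randy]’s code (his was Perl, and we haven’t seen it); the function name `find_stock_rows` and the header names "Part Number" and "Stock" are assumptions made up for illustration. It scans every table on a page with regular expressions and accepts the first one whose header row contains both landmark columns, wherever they sit.

```python
import re

# Deliberately simple patterns: nested tables would confuse them.
TABLE_RE = re.compile(r"<table\b[^>]*>(.*?)</table>", re.IGNORECASE | re.DOTALL)
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.IGNORECASE | re.DOTALL)
CELL_RE = re.compile(r"<t[hd]\b[^>]*>(.*?)</t[hd]>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def cell_text(html):
    """Strip any inner tags from a cell and collapse whitespace."""
    return " ".join(TAG_RE.sub(" ", html).split())


def find_stock_rows(page, landmarks=("Part Number", "Stock")):
    """Return (part, stock) tuples from the first table whose header row
    contains both landmark column names (hypothetical names)."""
    wanted = [name.lower() for name in landmarks]
    for table in TABLE_RE.findall(page):
        rows = [[cell_text(c) for c in CELL_RE.findall(r)]
                for r in ROW_RE.findall(table)]
        if not rows:
            continue
        header = [h.lower() for h in rows[0]]
        if all(w in header for w in wanted):
            # Look columns up by name, so extra or reordered columns are harmless.
            idx = [header.index(w) for w in wanted]
            return [tuple(row[i] for i in idx)
                    for row in rows[1:] if len(row) > max(idx)]
    return []


page = """
<table><tr><td>Site navigation</td></tr></table>
<table>
  <tr><th>Price</th><th>Stock</th><th>Part Number</th></tr>
  <tr><td>$1.20</td><td>350</td><td>LM317T</td></tr>
</table>
"""
print(find_stock_rows(page))  # [('LM317T', '350')]
```

Because the columns are found by header text rather than by position, the vendor can add a column or move the table around the page without breaking the scraper.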
At some point in 1998 the first version was up and running from his home DSL connection. He found it invaluable in his own work, and for that reason he made a presentation at the monthly Seattle Robotics Society meeting. The group of about 30-40 people spent most of their free time making robots; this was perfect for them. And so it’s no surprise that traffic quickly outgrew his home connection. He moved on up by physically transplanting the box running the site to a colocation facility.
Vendors started to notice so [Randy] hacked on their search algorithms
The thing about web scraping is, if you start to use a lot of traffic the server owners are going to notice. [Randy] recalls that vendors generally approached him in two different ways. Some either asked him to stop scraping their site, or blocked him directly. But others had the opposite reaction. They told him that his site was the largest single source of traffic for them. That sounds great at first, but since he was using the web search interface it had the side effect of slowing down the vendor site search for actual human users. The solution was to set up a system where a copy of the vendor’s stock database would be uploaded to FindChips once a day.
This wasn’t an API; it was direct access to their data, which had [Randy] back in the coding chair once again, reworking his script to query server side rather than scrape the web. He told me that he modified the grep command to help with this search process and ended up with a faster return from the database query than the vendors themselves were able to offer. Seeing this success, it wasn’t long before more vendors jumped aboard.
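For a rough feel of what searching a daily upload looks like, here is a small sketch. It is only an illustration: we don’t know the format the vendors sent or what [Randy] changed in grep, so the tab-separated layout (part number, quantity, unit price) and the name `search_snapshot` are assumptions. It does a grep-style linear scan over the snapshot and keeps the lines whose part number contains the query.

```python
import io


def search_snapshot(lines, query):
    """Scan a vendor snapshot line by line, grep-style, and return
    (part, quantity, price) for part numbers containing the query.

    Assumed (hypothetical) line format: part<TAB>quantity<TAB>price
    """
    needle = query.lower()
    hits = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # skip lines that don't fit the assumed layout
        part, qty, price = fields
        if needle in part.lower():
            hits.append((part, int(qty), float(price)))
    return hits


# Stand-in for a file uploaded by a vendor once a day.
snapshot = io.StringIO(
    "LM317T\t350\t0.48\n"
    "NE555P\t1200\t0.21\n"
    "LM317LZ\t0\t0.35\n"
)
for hit in search_snapshot(snapshot, "lm317"):
    print(hit)
```

The point of the design is that the search runs against a local copy, so each query costs no traffic on the vendor’s own site.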
Where we are now
[Randy] sold the site to SupplyFrame in 2010. Sure, it’s not a company the size of Apple. But the stories are along the same lines. He developed it to fix his own problem, showed it off to a hobby organization that is surely a predecessor of today’s Hackerspaces, and then turned it into a sizable web start-up. That’s not only impressive, it’s inspiring. Maybe the projects we screw around with will someday be useful to a wide audience?
These days, in addition to being a Computer Scientist for NASA’s Ames Research Center, [Randy Sargent] is a Visiting Scientist for Google and a Senior Systems Scientist for Carnegie Mellon University. One of the projects he’s most excited about right now is the GigaPan Time Machine. It captures and makes available time-lapse photography on the scale of billions of pixels.
[Image Source for Randy’s headshot]