Billy Hoffman has built a site crawler that can hide its activity within normal web traffic. Crawling a website is one of the easiest ways to find exploitable pages, but the systematic nature of the crawl makes it stand out in logs. Billy set out to design a crawler that would behave like a normal web browser. It follows more popular links first (think “news”, not “legal notice”), and it doesn’t hit deep-linked pages directly without first creating an appropriate Google referrer. There are tons of other tricks involved in making the crawler look “human”; you’ll find them in Billy’s slides over at SPI Labs. You can also read about the talk on Wired News.
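To get a feel for the referrer trick, here’s a rough Python sketch. It isn’t Billy’s code, and the URLs, timings, and helper names are made up for illustration: before requesting a deep-linked page, the crawler cooks up a plausible Google search URL and sends it as the Referer, visits popular pages before obscure ones, and pauses a human-ish random interval between requests.

```python
# Illustrative sketch of covert crawling tricks (not Billy Hoffman's code):
# browser-like User-Agent, a plausible Google Referer for deep links,
# popular-first ordering, and randomized delays between requests.
import random
import time
import urllib.error
import urllib.parse
import urllib.request

USER_AGENT = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
              "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36")

def fake_google_referrer(target_url: str) -> str:
    """Build a Google search URL whose query is derived from the target's path."""
    path = urllib.parse.urlparse(target_url).path
    terms = [p for p in path.split("/") if p]
    query = urllib.parse.quote_plus(" ".join(terms) or "example")
    return f"https://www.google.com/search?q={query}"

def covert_fetch(url: str) -> bytes:
    """Fetch a page with a browser-like User-Agent and a plausible Referer."""
    req = urllib.request.Request(url, headers={
        "User-Agent": USER_AGENT,
        "Referer": fake_google_referrer(url),
    })
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read()

# Popular-first ordering: hypothetical URLs, sorted the way a person would
# browse ("news" before "legal notice"), with human-ish pauses in between.
pages = ["https://example.com/news", "https://example.com/about",
         "https://example.com/legal-notice"]
for page in pages:
    try:
        body = covert_fetch(page)
        print(page, len(body), "bytes")
    except urllib.error.URLError as err:
        print(page, "->", err)
    time.sleep(random.uniform(3, 15))  # don't hammer the server like a bot would
```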
Continue reading “Shmoocon 2006: Covert Crawling: A Wolf Among Lambs”