Tech Blog

Author archives: Noel Taylor

Scraping pdf, doc, and docx with Scrapy

In February 2017, Google announced plans to discontinue its Google Site Search product. Imaginary Landscape clients who had relied on Google to provide search on their websites looked to us for a new solution. Finding no obvious replacement, we decided to create our own website scraper and an accompanying search app.