The Ubiquity and Danger of Web Scraping

    Top Victims

    The top web scraping victims by industry in 2015 were real estate, digital publishing, e-commerce, directories and classifieds, and airlines and travel. Many of these industries are being targeted by an influx of startups that are scraping information from industry leaders in order to compete.

    Real estate sites were the No. 1 web scraping victims: the industry had the highest percentage of bad bots, at 32 percent, and saw a 300 percent increase in bad bot activity from 2014 to 2015.

Web scraping is a software method used to extract information from websites. It often includes transforming unstructured website data into a database for analysis, or repurposing stolen content for the scraper's own online operations. Not only does web scraping pose a critical challenge to company branding, it can also threaten sales and conversions, lower SEO rankings or undermine the integrity of content that took considerable time and resources to produce.

Through analysis of top web scraping platforms and services, Distil Networks' 2016 Economics of Web Scraping Report uncovers the ubiquity and danger of this practice. The following findings outline how the democratization of web scraping lets perpetrators effortlessly steal sensitive information on the web.