Part I: Contagion of Scraping – Has It Gone Too Far?
According to Internet World Statistics,
the number of Internet users has increased by over 550% from 360 million users
in 2000 to 2.4 billion users in 2012. While
these statistics may not come as a total surprise, the explosion in the number
of people with access to the Internet via mobile devices has led the number of
users to double over the past five years.
The proliferation of Internet access has led to exponential growth in
the amount of content on the Internet, creating a repository of "publicly
available" information. This
includes not only news, business, and financial information, but also personal
data shared through movie, restaurant, and hotel reviews (e.g., Yelp, Angie’s List, and
TripAdvisor). This same technological explosion, however, has made it easy to extract
this data for commercial use and sale, and to do so for free and without
authorization. This data extraction, commonly referred to as
"scraping," raises legal concerns on both sides of the
issue: those who want to scrape (and love it) and those who want to protect against
it (and hate it).
Scraping of data is extremely common in today’s Web
environment. Start-ups love the ability to scrape because it’s a cheap and
powerful way to gather data without the need for partnerships. For example,
Airbnb built its business by scraping data from Craigslist and posting it to
its own site, which led to a formal “cease and desist”
letter. In the travel industry, roughly 30% of website traffic is traced to web
scraping bots, according to a study by Distil Networks. Big
corporations, for their part, also use web scrapers to collect data for their own benefit,
yet they don’t want others to use bots against them. Further, the “contagion” of scraping data has
spread to the investment community.
Recently, a former colleague employed by a hedge fund admitted that he
routinely hires Ph.D.s to scrape company websites. This level of diligence lets him
refine investment theses for companies that derive a meaningful share
of sales from online traffic. By
scraping data daily from a given company’s website, he can observe,
among other things, the number of visitors, which lets him adjust financial
forecasts to predict sales more accurately over a given quarter or year.