According to Internet World Statistics, the number of Internet users
increased by over 550%, from 360 million in 2000 to 2.4 billion in 2012. While these statistics may not come as a
total surprise, the explosion in the number of people with access to the
Internet via mobile devices has led the number of users to double over the past
five years. The proliferation of
Internet access has led to exponential growth in the amount of content on the
Internet, creating a repository of "publicly available" information. This includes not only news, business, and
financial information, but also personal data in the form of movie, restaurant, and hotel reviews
(e.g., Yelp, Angie’s List, and TripAdvisor). This same technological explosion,
however, has made it easy to extract this data for commercial use and sale, and
to do so for free and without authorization. This data extraction, commonly
referred to as "scraping," creates legal issues and concerns for both
sides: those who want to scrape (love it), and those who want to
protect against it (hate it).

Scraping of data is extremely common in today’s Web environment. Start-ups love the
ability to scrape because it’s a cheap and powerful way to gather data without
the need for partnerships. For example, Airbnb built its business by scraping
data from Craigslist and posting it to its own site, which led to the receipt
of a formal “cease and desist” letter. In the travel industry, roughly 30%
of website traffic is traced to web scraping bots, according to a study by
Distil Networks. Big corporations also use web scrapers to
collect data for their own benefit, yet they don’t want others to use bots
against them. Further, the “contagion”
of scraping data has spread to the investment community. Recently, a former colleague employed by a
hedge fund admitted that he routinely hires Ph.D.s to scrape company websites. This level of diligence lets him
refine investment theses for companies that derive a meaningful amount
of sales from online traffic. By
scraping data on a daily basis from a given company’s website, he is able to
observe, among other things, the number of visitors, enabling him to adjust financial
forecasts to more accurately predict sales over a given quarter or year.