Tuesday, February 02, 2010

The Human Algorithm

I'm interested in content aggregation. In fact, I'm hoping to start a company that creates a hybrid aggregation model, one that leverages both algorithmic aggregation and humans (through crowdsourcing). The only aggregator I know of that uses a hybrid model effectively is Techmeme, but they can't scale it.

Ultimately, I find aggregation is most useful as a discovery tool. To that end, I've been watching the rise of Twitter as a content discovery engine (I'm not really a fan of it otherwise).

The problem I've always found with Twitter as a discovery tool is that surfaced links must stem from some form of popularity (not much different from Digg in that sense). Popularity, however, isn't really a good way to filter information.

Needless to say, I was really interested when I saw this post titled "The Human Algorithm: How Google Ranks Tweets in Real-Time Search." Google has only recently begun trying to integrate Twitter into its search.

Essentially, the author notes that Google is trying to employ a PageRank-style algorithm to apply relevance rankings to Twitter posters (and therefore their Tweets). If a Tweeter has more followers, they will generally be considered more relevant and important. If a popular Tweeter follows you, you are now deemed more important as well.
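To make the idea concrete, here is a rough sketch of what a PageRank-style ranking over a follower graph might look like. To be clear, this is my own illustration of the general technique, not Google's actual method: the function name `follower_rank`, the damping value, and the tiny example graph are all assumptions made up for the sketch.

```python
def follower_rank(follows, damping=0.85, iterations=50):
    """PageRank-style scores over a follower graph (illustrative sketch only).

    `follows` maps each account to the set of accounts it follows.
    Following someone passes along a share of your own score to them.
    """
    # Collect every account that appears as a follower or as someone followed.
    accounts = set(follows)
    for followed in follows.values():
        accounts.update(followed)
    n = len(accounts)

    # Start with every account equally important.
    rank = {account: 1.0 / n for account in accounts}

    for _ in range(iterations):
        # Every account gets a small baseline score regardless of followers.
        new_rank = {account: (1.0 - damping) / n for account in accounts}
        for account in accounts:
            followed = follows.get(account, ())
            if followed:
                # Split this account's score among the accounts it follows.
                share = damping * rank[account] / len(followed)
                for target in followed:
                    new_rank[target] += share
            else:
                # An account that follows no one spreads its score evenly.
                share = damping * rank[account] / n
                for target in accounts:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical accounts: three fans follow a celebrity, who follows one person.
example = {
    "fan1": {"celebrity"},
    "fan2": {"celebrity"},
    "fan3": {"celebrity"},
    "celebrity": {"lucky"},
    "nobody": set(),
}
scores = follower_rank(example)
for account in sorted(scores, key=scores.get, reverse=True):
    print(account, round(scores[account], 3))
```

In this toy graph, "lucky" outranks the fans purely because a well-followed account follows them, which is the second effect described above.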

While this is better than nothing, I think it still misses a fundamental issue in uncovering the most relevant dynamic information online . . . The Tweeter is important for what?

Ashton Kutcher and Britney Spears each have millions of followers. However, if I'm interested in astrophysics, I couldn't really care less if they happen to Tweet a link to some pretty photos from the Hubble. What Neil deGrasse Tyson Tweets will be much more important, even though he has only 10,000 followers.

My sense is that people are more likely to Tweet about a wide variety of topics, whereas websites are much more inclined to keep a stronger editorial focus that is more conducive to applying relevance rankings through links and semantic analysis of the content. One example is Tyson Tweeting about his Netflix rental last night. I doubt that would show up on his blog, but apparently it's Tweet-able.
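One naive way to ask "important for what?" would be to scale an account's general authority by how much of what it posts is actually about the topic in question. The sketch below is only my own illustration of that idea: `topical_score`, the topic labels, and every number in it are invented, and labeling Tweets by topic in the first place is the hard part that it simply assumes away.

```python
def topical_score(authority, posts, topic):
    """Scale general authority by the share of posts that are on `topic`.

    `authority` is any general importance score (e.g. follower-based).
    `posts` is a list of sets of topic labels, one set per post.
    Both are hypothetical inputs for this sketch.
    """
    if not posts:
        return 0.0
    on_topic = sum(1 for labels in posts if topic in labels)
    return authority * on_topic / len(posts)


# Made-up numbers: a celebrity with high general authority who posts about
# astronomy once in 20 posts, versus an astrophysicist with low general
# authority who posts about it 8 times in 10.
celebrity_posts = [{"astronomy"}] + [{"gossip"}] * 19
scientist_posts = [{"astronomy"}] * 8 + [{"movies"}] * 2

celebrity = topical_score(0.9, celebrity_posts, "astronomy")
scientist = topical_score(0.1, scientist_posts, "astronomy")
print(scientist > celebrity)
```

Even this crude weighting pushes the focused account ahead of the merely popular one on its own topic, while the off-topic posts (the Netflix rental) drag the score down rather than up.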

A Google Fellow is quoted in the post as noting, "It is definitely, definitely more than a popularity contest." . . . I have a hard time seeing how at the moment, but I wouldn't be surprised if it turns out to be true.
