Deflating the Filter Bubble

I was asked recently to speak at a symposium on Media Choices at Drexel University. The event drew a fascinating array of scholars who were studying things like Internet addiction, online dating, and political polarization in media consumption.

When someone mentions “media choice” to me, I automatically start thinking about the algorithms that have been developed to help shape that choice.

I have followed avidly the growing use of recommendation systems that you see on sites like Amazon, Netflix, YouTube and Pandora. I saw these mechanisms as a significant move away from demographic marketing (which I find deeply flawed) to marketing based on customer taste.

I did have my reservations, though. I was very moved by Eli Pariser’s TED talk about the danger of “filter bubbles,” which effectively insulate us from opinions and content that we don’t understand or like. His talk really resonated with me because of the deeply divided ideological and taste communities that the Lear Center found in a major survey research project on the correlation between entertainment preferences and political ideology (spoiler: they are even more deeply connected than you might think).

But when I conducted further research about collaborative filtering systems, I made some rather counter-intuitive discoveries. YouTube, for instance, found that “suggesting the videos most closely related to the one a person is already watching actually drives them away.”

Of course, YouTube’s goal is to get you to sit and watch YouTube like you watch TV: to lean back and watch a half hour to an hour of programming, rather than watching for two minutes, getting frustrated trying to find something else worth watching, and then going elsewhere. So, in short, it’s in YouTube’s best interest to introduce some calculated serendipity into its recommendations.

Look at the parallel in successful TV programming blocks: a network would never schedule three extremely similar comedies together. The topics, settings, characters and locations must be different. The unifying feature is more often pace and tone, which can be very difficult to codify and make keyword searchable. That’s why human behavioral data is so incredibly powerful. By mining data about real humans’ consumption habits, you can discover patterns in preference that don’t boil down to simple characteristics like affection for a certain actor, franchise or genre, which in many ways still drive motion picture development.

When you look through the Association for Computing Machinery Conference Series on Recommender Systems, you find plenty of papers proposing methods for increasing the amount of diversity, novelty and serendipity in recommendation engines. The trick is balancing “accuracy” – the predictive factor determining the likelihood that your end user will actually like the recommended content – with newness, difference and the pleasure of the unexpected.
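
To make that balancing act concrete, here is a minimal sketch of one common approach: a greedy re-ranking in the spirit of maximal marginal relevance. It is not any particular site's method. The titles, predicted scores, tag sets and the `trade_off` parameter are all made up for illustration, and tag overlap is only a crude stand-in for how similar two films are.

```python
def jaccard(a, b):
    """Similarity between two tag sets: shared tags divided by all tags."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rerank(candidates, k, trade_off=0.7):
    """Greedily pick k titles, trading predicted appeal against redundancy.

    candidates: dict of title -> (predicted score in [0, 1], set of tags)
    trade_off:  1.0 means pure "accuracy"; lower values favor variety.
    """
    chosen = []
    remaining = dict(candidates)
    while remaining and len(chosen) < k:

        def value(title):
            score, tags = remaining[title]
            # Redundancy = similarity to the closest title already chosen.
            redundancy = max(
                (jaccard(tags, candidates[c][1]) for c in chosen),
                default=0.0,
            )
            return trade_off * score - (1 - trade_off) * redundancy

        best = max(remaining, key=value)
        chosen.append(best)
        del remaining[best]
    return chosen


# Hypothetical catalog: (predicted score, tags) for one viewer.
catalog = {
    "Heist Thriller": (0.95, {"crime", "thriller"}),
    "Heist Thriller 2": (0.93, {"crime", "thriller"}),
    "Crime Drama": (0.90, {"crime", "drama"}),
    "Nature Documentary": (0.60, {"documentary", "nature"}),
}

accuracy_only = rerank(catalog, 2, trade_off=1.0)
with_variety = rerank(catalog, 3, trade_off=0.7)
```

With `trade_off=1.0` the list is simply the two highest-scoring titles, which here are near duplicates. At `0.7` the sequel drops out in favor of the crime drama and the documentary, even though its predicted score is higher than either.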

And this is where I had an “Aha!” moment. One of the findings from our politics and entertainment surveys is that, for a significant portion of Americans, their taste profile includes a predisposition toward things that don’t reflect their values or beliefs. Some people gravitate toward things that feel foreign while others recoil from them. Here’s the conundrum: how do you go about predicting an unexpected pleasure for someone based on their previous behavior?

Participants in a contest sponsored by Netflix noticed something in the data that might start to address this fascinating problem. Several years ago, Netflix held a public contest to improve the accuracy of its Cinematch predictions by 10%. Many of the contestants used a mathematical technique called “singular value decomposition,” which identifies groups of films that share a certain predictive factor such as graphic violence or profanity or sci-fi geekiness or chick-flickiness, for that matter. The most disarming thing about this technique is that the programmers don’t identify the factor beforehand. They just run the algorithm and hope they can figure out what factor the program has isolated in each group. Sometimes the answer seems very clear but other times it’s utterly inscrutable. I can’t help but wonder if it’s demonstrating this chaotic element in human taste: that many of us appreciate precisely what our viewing history has made “unexpected.” (For more on this, see my Lear Center blog on the Technologies of Taste.)
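
For the curious, here is a toy sketch of the idea, not a reconstruction of any contestant's code. Rather than a full singular value decomposition, it uses power iteration to pull out only the single strongest factor (the leading right singular vector) from a tiny ratings table. The films and ratings are invented, and real entries had to cope with millions of missing ratings, which this ignores.

```python
import math

# Hypothetical ratings: rows are viewers, columns are films (made-up data).
films = ["Action A", "Action B", "Romance A", "Romance B"]
ratings = [
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
]


def center_rows(matrix):
    """Subtract each viewer's mean so factors reflect taste, not generosity."""
    centered = []
    for row in matrix:
        mean = sum(row) / len(row)
        centered.append([x - mean for x in row])
    return centered


def leading_film_factor(matrix, iterations=200):
    """Power iteration on A^T A; converges to A's top right singular vector."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Start from the first film's axis, which fixes the sign of the result.
    v = [1.0 if j == 0 else 0.0 for j in range(n_cols)]
    for _ in range(iterations):
        # u = A v: how strongly each viewer expresses the current factor.
        u = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        # w = A^T u: push the viewer scores back onto the films.
        w = [sum(matrix[i][j] * u[i] for i in range(n_rows))
             for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


factor = leading_film_factor(center_rows(ratings))
for film, weight in zip(films, factor):
    print(f"{film}: {weight:+.2f}")
```

The output is only a column of numbers: positive for the first two films, negative for the last two. Nothing in the code knows the word “action” or “romance.” Calling the factor “action versus romance” is our interpretation after the fact, which is exactly the guessing game described above.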

One way that YouTube has tried to safely increase the diversity of its recommendations is by taking into account quality, or rather its proxy for quality, which is whether a bunch of other people watched a lot or all of another video after watching the one you just watched. By using other people’s behavior as a gauge, Google hopes it can diversify recommendations to you while avoiding duds.
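
A rough sketch of how such a proxy could be computed, assuming a simplified log of what people watched next and how much of it they finished. The log format, the video names and the `min_quality` cutoff are hypothetical. This illustrates the idea only and is not YouTube's actual system.

```python
from collections import defaultdict

# Hypothetical log rows:
# (video just watched, video played next, fraction of the next video watched)
sessions = [
    ("cat video", "dog video", 0.95),
    ("cat video", "dog video", 0.80),
    ("cat video", "clickbait", 0.05),
    ("cat video", "clickbait", 0.10),
    ("cat video", "woodworking", 0.90),
    ("cat video", "woodworking", 0.70),
]


def follow_on_quality(log, source):
    """Average fraction watched of each video people played after `source`."""
    totals = defaultdict(lambda: [0.0, 0])  # video -> [sum of fractions, plays]
    for watched, next_video, fraction in log:
        if watched == source:
            totals[next_video][0] += fraction
            totals[next_video][1] += 1
    return {video: total / plays for video, (total, plays) in totals.items()}


def safe_recommendations(log, source, min_quality=0.5):
    """Keep follow-on videos that viewers mostly finished, best first."""
    quality = follow_on_quality(log, source)
    keep = [video for video, q in quality.items() if q >= min_quality]
    return sorted(keep, key=lambda video: quality[video], reverse=True)


picks = safe_recommendations(sessions, "cat video")
```

In this made-up log, the woodworking video has nothing obvious in common with cat videos, but people who clicked through stuck with it, so it survives the filter. The clickbait video gets abandoned almost immediately and is dropped.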

Netflix looks at similar data points, paying very close attention to how long people watch, where they pause, what scenes they play over and over, and which videos are played all the way through. This is all in addition to the rating feedback we offer them, which can be skewed by our laughable desire to seem more highbrow than we really are (you may give Lincoln four stars, but you watched it once, while you gave Talladega Nights three stars even though you’ve watched it four times).

Cinematch is absolutely crucial to Netflix’s business model because new members usually have only a couple dozen movies that they know they want to see. In order to keep these people as subscribers, Cinematch must tell customers about titles they’ve never heard of and that they subsequently like.

No matter how much data you have, this is no easy task. I was not surprised at all to come across a research study on how to maximize profits for movie recommender systems, which found that movie consumers prefer to watch (and even pay for) movies they have already seen rather than movies that are new to them. Obviously, novelty is not always the key to success.

The Cinematch system has been so successful that the vast majority of their rentals are from the backlist – older movies, old TV series, classic films, documentaries and small indie films that never made it into theaters in most U.S. towns. For Netflix to be successful, it’s in their best interest to have a cosmopolitan user base that appreciates a wide range of entertainment choices. It’s much easier to keep those people as long-term subscribers.

So, the lesson here? As we move from a push to a pull media culture, we need to monitor the way in which people segregate themselves into media and entertainment content that reinforces their worldviews, and look at the social, psychological and economic effects of this new media ecosystem. It’s essential that we pay attention to the automated, commercially driven and highly sophisticated recommendation systems that are pushing content in a pull environment. While we may simply presume that these collaborative filtering systems will further insulate people, I see an acute business interest in not allowing that to happen.
