I despise The Big Bang Theory to an almost pathological degree. According to Netflix, The Big Bang Theory is an 88% match to my interests. By contrast, Blackadder is just a 71% match, even though it’s a show I’ve watched and loved my entire life. Breaking Bad, which I’ve watched from start to finish multiple times on Netflix, has a healthy 96% rating. But Brooklyn Nine-Nine, which I used to watch on Netflix until it got crap and I stopped three and a half years ago, has an even healthier 97%. Hannibal, another show I’ve watched from start to finish on Netflix, clocks in at 84%, narrowly ahead of Peppa Pig at 82%. Comedians in Cars Getting Coffee, a show I would only watch if paid a princely sum to review, is a 90% match to my interests. Only Fools and Horses, a show I watch all the time, is rated too low for Netflix to even bother giving me a number. My recommendations are full of anime, even though I haven’t watched any anime since I was a child. Netflix thinks I’d like every single Louis Theroux series it has, even though I have never, ever watched any documentary TV series in my life.
Netflix’s recommendation algorithm seems like it’s broken. But it’s not. It’s working just fine, at least for now. The problem is that the algorithm’s job isn’t to help users find TV shows and movies they would enjoy. It’s to trick Netflix’s investors into thinking the company is worth more than it is.
Two years ago, Netflix replaced its five-star rating system with a thumbs-up, thumbs-down system that was instantly criticised for obviously being dumb as shit. Binary ratings lack the nuance of a star rating, which is already so much less nuanced than a person’s actual feelings about a work of art. I’d give a thumbs up to both The Lego Movie 2 and The Terminator, but I think the former is just fine and the latter is one of the best films ever. I’d give a thumbs down to both Steven Soderbergh’s Bubble and Bohemian Rhapsody, but one is an admirable but flawed film by a director I adore and the other is Rami Malek trying to gently nudge his prosthetic teeth back into place with his lower lip for two hours. And I’d rather watch Bubble again than The Lego Movie 2, even though I’d rate it lower, because it’s interesting to me in ways The Lego Movie 2 never will be.
The official explanation from Netflix for the change was that people were confused about how the star rating system worked: the rating shown on a series or film was a prediction of how you would rate it, not the average of other users’ ratings. But that issue was solved by replacing the predicted rating with the percentage match I mentioned earlier. Why did Netflix also reduce the ability of users to give input?
Netflix is unique among streaming giants in not being a subsidiary of any larger company that makes other things. Prime Video belongs to Amazon, the world’s largest online retailer. Each of the four Hollywood giants – Comcast, Disney, AT&T and National Amusements – owns at least one and often several streaming services, from major players like Hulu and Disney+ (Disney) to smaller fish like DC Universe (AT&T), ESPN+ (Disney) and CBS All Access (National Amusements) to the upcoming HBO Max (AT&T) and Peacock (Comcast). Even some comparatively niche streaming services belong to a larger conglomerate, like Crunchyroll (AT&T), Shudder and Sundance Now (both AMC). Apple TV+ just launched this week. There’s even Facebook Watch, a streaming service embedded for free in Facebook that I literally only found out existed a few weeks ago, because Facebook is so huge it can afford to make several TV shows available for free to its users and not even bother to promote them. YouTube Premium, by contrast, is shoved down my throat every day by Google’s relentless advertising.
Netflix is not part of a larger company that sells other products. Its losses can’t be defrayed by the profits of other divisions, and, as the major studios establish their own streaming services and pull their content from Netflix, it’s had to pivot hard to keeping subscribers with original content, largely financed by taking on huge debt against future earnings. For Netflix to continue operating, it needs to convince its investors it can survive the coming struggle for streaming dominance in the face of mounting evidence it simply can’t. Netflix has no product but itself: it runs no ads on its platform and, while it gathers extensive personal data from its users, it doesn’t sell the data on to third parties. The pitch for Netflix’s continued viability is that it can use all that personal data to create the world’s most addictive streaming service, by ignoring demographic data in favour of deeper insight into watch habits. Its content promises to be driven by predictions of what people grouped in various “taste clusters” will enjoy, and not just what it recommends you watch either. The clusters will be mined for data to guide creative decisions on Netflix’s original properties, influencing the kind of shows and movies that get put into production. Netflix is hyping its unique ability to match its content to its users’ preferences as the thing that will allow it to continue growing subscribers, its main source of revenue. The limiting of user expression to a binary thumb rating represents a hard pivot to prioritising the data it gathers from you passively as you use Netflix over anything you might actively express. Netflix needs to prove it can read your very soul without your input to persuade investors to keep backing it.
In an ideal world, there might be something to it. The notion of cutting through demographic differences to find common passions that unite us across race and gender and faith and sexuality is appealing. But Netflix’s algorithms can’t bring us there. They don’t perform as advertised. They never will.
The first problem is a basic methodological issue that leaves Netflix with an incomplete data set. Netflix only gathers data on how I use it, not how I don’t use it. Netflix can see what I watch and for how long, but when it comes to what I don’t watch, what I don’t even start, Netflix is blind. No matter how many times it recommends me Louis Theroux shows, I’ve never been interested, but Netflix can’t record me choosing not to watch something, because I don’t press any buttons to do it. It also can’t separate what I haven’t watched out of lack of interest from what I haven’t watched for other reasons. Netflix can’t know I hate Sean Penn and don’t want to watch him in anything, or that I’d rather rupture a kidney than sit through an entire Stephen Daldry movie, or that I really, really want to watch The Wandering Earth and just haven’t blocked out two hours to do it in yet. It also can’t really predict what content I’d like, only how I react to content it already has. I could watch a bunch of horror films and see my recommendations fill with them, when what I really want is spaghetti westerns. But in the absence of spaghetti westerns in the first place, Netflix can’t collect data on my unfulfilled desire. There is no input to record.
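To make that blind spot concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the watch log, the minutes and the helper name implicit_signal are hypothetical, and none of it reflects Netflix’s actual code, which is proprietary. It only shows that when plays are the sole thing recorded, a title I refuse to watch and a title I’m dying to watch look exactly the same.

```python
# Hypothetical watch log: only plays generate events. Titles and minutes are invented.
watch_log = {"Breaking Bad": 2900, "Hannibal": 1700}

# Hypothetical catalogue, including one title the viewer avoids on purpose
# and one the viewer wants to see but hasn't found time for.
catalogue = [
    "Breaking Bad",
    "Hannibal",
    "A Louis Theroux series",
    "The Wandering Earth",
]


def implicit_signal(title):
    """Return minutes watched, or None when there was no event to record."""
    return watch_log.get(title)


# Every unwatched title produces the same empty signal, whatever the reason.
unwatched = [title for title in catalogue if implicit_signal(title) is None]
```

Both of the unwatched titles come back as None. The log has no field for “not interested” or “saving it for the weekend”, so any distinction between them would have to be guessed.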
The second problem is a deeper and more fundamental flaw. Netflix’s algorithms can only measure correlation, not causation. It can see I like both Orange is the New Black and Breaking Bad, but it has no idea what shared qualities attract me to either. I could be a particularly huge fan of shows about crime or I could just like them both for completely unrelated reasons. It can only see data and draw links between them. It can’t understand what, if anything, those links imply. It can only say that things are true, not why, and so it can’t say what true things are also relevant, let alone significant. People can interpret the links, I suppose, but the whole point of the algorithms is supposed to be seeing deeper and further than humans can. Besides, it presupposes the data will mean anything to its human interpreters.
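A rough sketch of what “drawing links” can look like, again in Python and again under loud assumptions: the viewing histories are invented, co_watch_similarity is a hypothetical helper, and the measure is a generic Jaccard overlap of the kind used in simple item-to-item recommenders, not anything Netflix has confirmed it uses.

```python
# Hypothetical viewing histories. The viewer labels describe motives
# that the data itself never contains.
histories = {
    "likes_crime_dramas": {"Orange is the New Black", "Breaking Bad"},
    "likes_the_casts": {"Orange is the New Black", "Breaking Bad"},
    "someone_else": {"Breaking Bad", "Hannibal"},
}


def co_watch_similarity(title_a, title_b, histories):
    """Jaccard overlap between the sets of viewers who watched each title."""
    watched_a = {v for v, titles in histories.items() if title_a in titles}
    watched_b = {v for v, titles in histories.items() if title_b in titles}
    union = watched_a | watched_b
    return len(watched_a & watched_b) / len(union) if union else 0.0


# The link between the two shows is a single number...
score = co_watch_similarity("Orange is the New Black", "Breaking Bad", histories)

# ...and two viewers with different reasons leave identical records.
same_data = histories["likes_crime_dramas"] == histories["likes_the_casts"]
```

The score says the two shows are watched together; it says nothing about why. The two viewers with different motives are indistinguishable in the data, which is the whole problem.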
The third problem is a feature of machine learning that may well explain Netflix’s bizarre confidence that I’d really enjoy Peppa Pig. It’s impossible to say for certain, because the code is proprietary, but the kind of learning algorithms that Netflix seems to use, given its descriptions of their complexity, are not told how to group and sort data. Instead, they’re given the data with minimal instructions and figure out how to use the data by analysing it:
“For example, you give a machine learning system thousands of scans of sloppy, handwritten 8s and it will learn to identify 8s in a new scan. It does so, not by deriving a recognizable rule, such as ‘An 8 is two circles stacked vertically,’ but by looking for complex patterns of darker and lighter pixels, expressed as matrices of numbers — a task that would stymie humans. In a recent agricultural example, the same technique of numerical patterns taught a computer how to sort cucumbers.”
Machine reasoning is therefore completely alien to humans, to the extent even its creators can’t reverse engineer how self-teaching programs work. The reasons it draws connections between two pieces of information are incomprehensible to us. We can only explore the links to see what they might imply. This can work wonders with hard, objective, quantifiable data. The use of machine learning in medical diagnosis shows huge promise to improve healthcare. But taste is not hard, objective or quantifiable. It shifts and changes, ebbs and flows, blossoms and withers. Two people can share a lot of the same interests for completely unrelated reasons. Not all facts about the data will be productive or useful. To the extent these facts form the basis of original content, the human layer of decision-making will prevent Netflix making incomprehensible films. But with no human filter on the recommendations side, the inevitable result – one that will get worse with time – is these bizarre, random, useless taste profiles that increasingly only recommend anything we’d enjoy by pure chance.
Much has been written about how streaming services in general, and Netflix in particular, offer an inferior replacement to the video store experience:
“At my local video store, I could surf through hundreds of movies selections too, but I would often head to the shelves where clerks would lovingly place their recommendations for undecided viewers. Of the three employees there, one made every effort to broaden the horizons of each customer. My parents were hardly on the front lines of American Independent film, but, here, was my father, checking out Pulp Fiction and The Piano after chatting with this bold clerk. Worst case scenario, if something we wanted wasn’t there, we’d go home happy with something else thanks to his recommendation.”
Even in a video store without such active and vocal staff, the browsing experience is fundamentally different to that of streaming services. Movies might be grouped by genre or age, but within each group, they’d mostly be displayed alphabetically, forcing you to manually search through them for something to watch. Even if you had a particular title in mind, you might come across something else that caught your eye. That’s how I found The Prestige, without which I might well never have found my passion for film. Netflix’s algorithm, by contrast, has never successfully recommended me a film I didn’t already want to watch. The usual critique of recommendation algorithms is that they create feedback loops, echo chambers or bubbles where our inputs are simply regurgitated back to us, to the extent, in Netflix’s case, we’re often recommended stuff we’ve already watched.
But the reality is even worse. Unmoored from human reasoning, and especially those features of human reasoning often called “common sense”, learning algorithms won’t trap us in bubbles. A bubble has a coherent, predictable structure. Netflix’s recommendations are more like a labyrinth built by an alien who doesn’t know they’re supposed to have exits. There’s no recognisable logic to them at all. They’re something more than random and less than rational, growing more alien and impenetrable with time. Netflix says its predictive power will keep subscribers hooked and hook in new ones. But, speaking as one of its subscribers, Netflix has grown more and more cumbersome to use with time. I remember a time when I was “trapped” in something like a bubble on Netflix, when it did a decent job of suggesting things I might want to watch. Now? It thinks I want to watch The Big Bang Theory.
Maybe that’s not such a big deal. The algorithm is, in many ways, the least of Netflix’s flaws. I still find stuff to watch on Netflix, usually based on word of mouth from people I trust. But I can’t find many movies or shows from before the millennium. Fewer and fewer of its offerings draw from the long histories of major studios as they set up rival services. Most of Netflix’s original content is total rubbish. But the failures of its algorithm scare me most, because the company has staked its entire future on its success. What happens when the revenue to service its debts never materialises? What happens when investors get spooked and sell their stock? What happens when the only streaming giant without a conglomerate behind it goes the way of WeWork and Theranos and Juicero and all the other companies with business models based on imaginary thinking and outright fraud?
The archival implications of Netflix’s potential collapse are mind-boggling. Just think of all the films and shows they’ve made over the past several years. In the best-case scenario, Netflix is stripped for parts and its content library ends up at a company (or several companies) with a decent record on preservation. Even then, most of their output ends up inaccessible since it’s not worth the cost of distributing. The worst-case scenario sees much of that library lost altogether, preserved by pirates, if at all.
I don’t have a solution to this. I don’t understand how Netflix can outlast the limits of its own model and compete long-term against streaming services owned by conglomerates that could potentially subsidise them even when they’re not profitable just to drive Netflix into the ground. But if Netflix has any chance of lasting through the coming onslaught, it won’t be found in its recommendations algorithm. It’s not even broken. If it were, it could be fixed.