You don't need this fancy recommendations system

I spent years looking for a powerful and accurate recommendation system that would give me the titles of the movies I desperately needed to complete my film knowledge (we movie buffs are quite obsessive - and we try to heal ourselves by making lists). The more new movies and new directors I discovered, the more ineffective the recommendation engines I was using proved to be.

It looks like we always try to address the need for recommendations with the same tools and the same approach, regardless of the subject matter. The recommendation system of my favorite cooking recipes website seems to have been coded by the same engineer who designed the recommendation system of the website I browse to dig for music records. Same design and same implementation. But would we expect, in real life, to be advised to listen to “Abbey Road” the same way we’re advised to mix aubergine, hummus, and goat cheese? (I tried it yesterday, and it is delicious.)

I know I’m being provocative. The usual data-science-powered recommendation system just generates recommendations and ignores the context; it even ignores the way the advice will be given and received. But this is precisely the limit I’m seeing.

Most websites could wrap their cold and clinical generated recommendations in a nice UX, to bring some warmth and make them lively and substantial (cf. the dumb Netflix categories), but the connoisseur will always notice the trick. The amateur will not, but he’ll have to live with dumb recommendations that no one would dare to share in real life.

I’m sure that algorithm-generated recommendations are great in many situations. Sometimes the data volume and the business complexity are such that the usual recommendation system will provide the best results, and at a modest implementation cost. Some topics are hard for humans to map mentally. For instance, I have trouble imagining a human-curated recommendation system for cooking recipes. How would you map a link between peanut lovers and squash lovers?

Conversely, there are areas that cover a relatively small data volume and that have been deeply analyzed by academic research for years. One example above all others: movies. No more than a few million movies have been shot. Maybe a few hundred thousand professional ones. And far fewer, maybe one hundred thousand, are worthy of interest for the usual movie buff. Among these one hundred thousand films, I’m including the very rare and almost amateurish French art movies, Bollywood films, etc. At the same time, many books, theses, and interviews have given us a deep understanding of the medium, which allows us to map movie history by drawing connections between filmmakers based on who influenced whom and who liked what.

This is exactly what I tried to do with They Love Pictures, which generates recommendations based on the favorite movies of your favorite directors. I won’t claim this is a good system. It has serious flaws, like the inability to recommend recent movies. If the new Citizen Kane is released tomorrow, it won’t be proposed to you until a relatively famous director cites it as one of his favorite movies. But it gave me films that I at least want to see (and actually, I must say I like that it looks to the past, towards older and “lindy” movies, as Nassim Taleb would say).

Moreover, these recommendations are contextualized, and I can see why they have been proposed to me. This helps me trust the system and better grasp the long and labyrinthine history of movies. A young and curious Tarantino fan doesn’t just want to receive a raw list containing “Le Cercle Rouge / The Good, The Bad and The Ugly / Carrie”; he also wants to know that he received them because Tarantino grew up watching movies by Jean-Pierre Melville, Sergio Leone, and Brian De Palma. This is what will help him develop a solid and precise understanding of movies.
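
To make the idea concrete, here is a minimal sketch in Python of a curated lookup that returns each recommendation together with the reason behind it. This is not the actual They Love Pictures code: the `CURATED_FAVORITES` table, the `Recommendation` class, and the `recommend` function are names I'm assuming for illustration. The Tarantino entries simply reuse the example above (not a verified list of his favorites), and "Hypothetical Director" is invented only to show how several citations of the same film get merged.

```python
from dataclasses import dataclass, field

# Hand-curated table: director -> films they cite as favorites,
# each stored with the name of the film's own director.
# Illustrative data only: the Tarantino entries reuse the example from
# this post, and "Hypothetical Director" is made up for the demo.
CURATED_FAVORITES = {
    "Quentin Tarantino": [
        ("Le Cercle Rouge", "Jean-Pierre Melville"),
        ("The Good, the Bad and the Ugly", "Sergio Leone"),
        ("Carrie", "Brian De Palma"),
    ],
    "Hypothetical Director": [
        ("Carrie", "Brian De Palma"),
    ],
}


@dataclass
class Recommendation:
    title: str
    made_by: str
    cited_by: list = field(default_factory=list)

    def explain(self):
        # The context shown to the user alongside the title.
        who = " and ".join(self.cited_by)
        return f"{self.title} ({self.made_by}), because it is a favorite of {who}"


def recommend(favorite_directors, already_seen=()):
    """Collect the favorites of the user's favorite directors, with reasons."""
    seen = set(already_seen)
    by_title = {}
    for director in favorite_directors:
        for title, made_by in CURATED_FAVORITES.get(director, []):
            if title in seen:
                continue  # skip films the user already knows
            rec = by_title.setdefault(title, Recommendation(title, made_by))
            rec.cited_by.append(director)
    # Films cited by more of the user's favorite directors come first;
    # the sort is stable, so ties keep their insertion order.
    return sorted(by_title.values(), key=lambda r: len(r.cited_by), reverse=True)


if __name__ == "__main__":
    picks = recommend(
        ["Quentin Tarantino", "Hypothetical Director"],
        already_seen=["Le Cercle Rouge"],
    )
    for rec in picks:
        print(rec.explain())
```

Note that nothing here depends on other users' behavior: the whole thing is a dictionary lookup over curated data plus the user's own list of favorite directors, which is why it can work from the very first user.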

Again, I don’t want to turn this blog post into disguised advertising for a toy project I developed in a few days during the lockdown. I speak as an engineer, but mostly as a movie buff who has never met a decent movie recommendation system and yet can read this on Wikipedia:

One of the events that energized research in recommender systems was the Netflix Prize. From 2006 to 2009, Netflix sponsored a competition, offering a grand prize of $1,000,000 to the team that could take an offered dataset of over 100 million movie ratings and return recommendations that were 10% more accurate than those offered by the company’s existing recommender system. This competition energized the search for new and more accurate algorithms.
https://en.wikipedia.org/wiki/Recommender_system#The_Netflix_Prize

I’m glad it at least energized the search for new and more accurate algorithms. This is like an ironic confirmation of the fact that recommendation systems are made for engineers and data scientists looking for technical challenges, not users.

In brief, it seems that we too often tend to overengineer a solution without taking the time to think about the essence of the subject. I believe that many websites could provide a convincing recommendation system to their users based only on human-curated information and the user data. Such recommendations would have a tone and a personality that the usual data-science-heavy recommendation system is unable to match, with the advantages of working from the very first user (no need to have thousands of users and collect a ton of data) and being cheap to build and maintain.

2020-08-10