Spotify: A Product Story Episode 4 - Transcript

This is: A Product Story. And I’m Gustav Söderström. I head up product, engineering, data and design for Spotify.

In this podcast, we’ll bring you the biggest product strategy lessons that we’ve learned at Spotify -- from launching the first desktop app all the way to the new audio formats that we’re developing right now. We’ll break down why we made the decisions we made, what worked, what didn’t, and the stories behind the products -- all told in the words of the people who were actually there, who lived it and who made it happen.

(00:52) In today’s episode, we’ll explain how we evolved from being an online music library and playlisting service that put all the actual work of curating and organizing the world’s music catalogue in the hands of the user, to a machine-learning powered recommendation engine -- one that does the work for you.

Over the years, we’ve developed some of the most powerful recommendation algorithms in the world for audio content. All working toward one simple -- but very hard -- goal: delivering listeners the right content at the right time.

I’ll be the first to admit that our journey into Machine Learning (or ML, as it’s often called) started with a bit of a misstep. Because for the first few years of Spotify, we didn’t quite see how it would actually fit into our bigger goal of bringing users the perfect listening session. Well, most of us didn’t. Myself included.

Oskar Stål: I do remember at one time when we talked about recommendations -- you, Daniel and I -- and we were kind of like, yeah, that's not really core -- we can outsource that to someone else. We don't have to consider that kind of our core thing. And then at some point within a few years after that, we changed our opinion.

That’s Oskar Stål, and to say we changed our opinion is an understatement. So much of an understatement that Oskar’s title is now VP of Personalization at Spotify.

And here is how it came about.

Early on, in 2008 -- the year that Spotify launched -- a master’s candidate at Stockholm’s Royal Institute of Technology by the name of Erik Bernhardsson came to Spotify to finish his thesis -- and eventually joined the engineering team full-time. But already as a student, he saw the potential in giving users personalized recommendations by doing massive matrix multiplications.

Oskar Stål: Yeah, I mean, he was like a math genius type of guy. He was doing some math magic to basically do these matrix computations on our data. And back then, the magic was not specifically the algorithm for recommendation. It was more the algorithm for actually doing the matrix multiplication. So it was kind of like a method for approximating the matrix multiplication in a reasonable way. That was what he did. And I think what he wanted to do was just work on that and make it better. But the work he did in the master's thesis, he never quite got the opportunity to improve for years and years, because we always wanted him to work on something else.

Gustav Söderström: It's interesting, because the theories had been around for many years -- different forms of collaborative filtering or matrix multiplications and so on. But the tricky thing, which not many people had done at the time, was to actually implement that at scale for what was already then tens of millions, if not hundreds of millions of playlists -- and later billions. So a lot of the innovation was, to your point, first of all in figuring out the engineering, because you had to break this up into many computations on different clusters of computers and then approximate some version of the actual algorithm. So there's a lot about actually implementing these things in practice, and that hadn't been done so much at scale.

Oskar Stål: Yeah, exactly. At this time, it was well known that you could do collaborative filtering, and it was well known how to do that through matrix multiplication. But people were doing it on much smaller data, with much more reasonable matrices to work with, so to speak.

“Collaborative filtering” is the fancy name for a rather intuitive idea: when a large group of users put the same bunch of tracks next to each other on the same types of playlists over and over again, they’re telling you that those tracks go well together. And that those tracks probably have something in common.

Algorithms then use that information to figure out how mathematically similar two tracks are, based solely on how often they appear on the same playlists.
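To make the idea concrete, here is a small sketch of playlist co-occurrence similarity in code. To be clear, this is a toy illustration of the general technique, not Spotify’s actual system -- the playlist data and the cosine-over-co-occurrence scoring are purely illustrative:

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Toy playlists -- the real system ran over hundreds of millions of these.
playlists = [
    ["metal_a", "metal_b", "metal_c"],
    ["metal_a", "metal_b"],
    ["pop_x", "pop_y"],
    ["metal_b", "metal_c", "pop_x"],
]

# Count how often each track appears, and how often each pair co-occurs.
track_counts = defaultdict(int)
pair_counts = defaultdict(int)
for pl in playlists:
    for t in pl:
        track_counts[t] += 1
    for a, b in combinations(sorted(set(pl)), 2):
        pair_counts[(a, b)] += 1

def similarity(a, b):
    """Cosine-style similarity of two tracks over playlist membership."""
    co = pair_counts.get(tuple(sorted((a, b))), 0)
    return co / sqrt(track_counts[a] * track_counts[b])

print(similarity("metal_a", "metal_b"))  # high: always playlisted together
print(similarity("metal_a", "pop_y"))    # zero: they never co-occur
```

On a toy data set this runs in microseconds; the hard part Erik faced was doing the equivalent computation across hundreds of millions of playlists, split over clusters of machines, with approximation where the exact math was too expensive.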

But -- at the time -- running that kind of analysis with a data set as large as ours was incredibly difficult.

(05:04) Besides being hard, recommending new music to listeners also seemed secondary. We didn’t discard it, but we didn’t yet see it as core, as our main thing. So -- except for Erik’s work with collaborative filtering -- we outsourced the rest of our recommendations to a Massachusetts-based start-up called The Echo Nest.

Because as far as discovering new music goes, we thought Spotify was already pretty perfect. All you needed was a really good search bar and an advanced playlisting tool; from there, you could soundtrack your life perfectly! What could be easier than that?

It turns out, a lot. A lot of things are easier than that.

Which is the basis for our first product strategy lesson.

Lesson #1: Build for yourself first. But don’t build for yourself only.

It's good to start by building for yourself, because that’s where you have intuition, but pretty quickly you need to ask yourself Mary Meeker’s question from episode 3 - How many of me are there? - or your product will hit a ceiling.

We’d built a product that gave just one type of user the perfect listening session. Users just like us. It was the perfect tool for a huge music fan. Someone with an encyclopedic knowledge of bands and genres, who already keeps up with the latest releases and enjoys spending hours at a time combing through the back catalogue and putting together carefully crafted playlists. In other words, someone who could look at a blinking cursor in an empty search bar and -- instead of feeling intimidated and overwhelmed -- would know exactly what they wanted to hear.

But guess what? There are only so many die-hard music fans out there.

If we wanted to continue to grow, we had to find ways to bring more casual listeners on to the platform.

We called it the aficionado problem. Spotify was a powerful product -- it gave you access to almost all the world’s music. But it wasn’t a very helpful product for those who didn’t already have that time or knowledge. In fact, for them it felt like a lot of work.

By 2011, we saw the macro wind that we mentioned in episode 3 -- the shift from curation-focused services to recommendation-focused services that did a lot of the work for you -- really starting to pick up speed. And we realized that recommendations needed to become a part of our core strategy, and that we needed to hire for it.

The only problem was, everyone else also realized it at the exact same time. Here’s Oskar Stål again.

Oskar Stål: What happened, of course, was that, you know, we didn't really manage to hire anyone, because it was impossible to find these machine learning engineers. So I think for a long time we pretty much had like two machine learning engineers in the entire company. And what they did was basically continue on the same track: how do you do collaborative filtering? The way I remember it was that in the first period we continued on basically our own algorithm for matrix factorization and used that for quite a while. And basically the work was around iterating on this and making it better. And we were out talking about it. So I think the focus for the first year was really just building on that and then applying it to the discovery feed that we were building at the time.
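For readers who haven’t seen matrix factorization before, here is a minimal sketch of the general idea -- plain stochastic gradient descent on a tiny playlist-track matrix. This is not the algorithm Spotify actually ran; the data, rank and hyperparameters are all made up for illustration:

```python
import random

random.seed(0)

# Toy implicit-feedback matrix: rows = playlists, columns = tracks,
# 1 where a playlist contains a track. Purely invented data.
R = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
n_playlists, n_tracks, k = len(R), len(R[0]), 2

# Low-rank factors: a k-dimensional vector per playlist and per track.
P = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_playlists)]
T = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_tracks)]

lr, reg = 0.05, 0.01
for _ in range(2000):  # plain SGD over every cell of the matrix
    for u in range(n_playlists):
        for i in range(n_tracks):
            pred = sum(P[u][f] * T[i][f] for f in range(k))
            err = R[u][i] - pred
            for f in range(k):
                pu, ti = P[u][f], T[i][f]
                P[u][f] += lr * (err * ti - reg * pu)
                T[i][f] += lr * (err * pu - reg * ti)

def track_sim(i, j):
    """Dot product of learned track vectors; co-playlisted tracks score high."""
    return sum(T[i][f] * T[j][f] for f in range(k))

# Tracks 0 and 1 always appear on the same playlists, while track 3
# never appears with track 0 -- the learned vectors reflect that.
print(track_sim(0, 1) > track_sim(0, 3))
```

The engineering challenge Oskar describes is that this triple loop is trivial on a 4x4 matrix but has to be broken up, distributed and approximated when the matrix has hundreds of millions of rows.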

We did have one incredibly valuable advantage, something nobody else had -- our library of, already back then, many hundreds of millions of playlists, which was arguably the largest music curation database in history -- and which has kept growing every minute, to over 4 billion playlists today. But that database came with its own set of problems.

Oskar Stål: It was like a nightmare. There were all these strange things that would be popping up all the time. You know, we would be trying to do something about pop and you would get Christmas music, or you'd get children's music. So it was still really hard for us to make it truly work. It was kind of working, but it was not really working.

We called these mistakes WTFs -- you definitely don’t need me to tell you what that stands for -- and they cropped up because it turns out that there is a lot of noise in hundreds of millions of playlists, and many users playlist tracks together that aren’t actually that similar.

(09:32) So -- as Oskar said -- it was kind of working, in that the code and the models were functioning correctly, but not really working, in that the data itself wasn’t clean enough to prevent these slip-ups and create a really great user experience that seemed intelligent. It would often come up with absolutely amazing and unintuitive suggestions that no human would’ve ever found. But it also made simple mistakes that no human would’ve made. So it still came across as a bit “dumb” to the listener.

And these mistakes get at the heart of our 2nd product strategy lesson: throwing Machine Learning at data you don’t fully understand isn’t enough to give you a great product.

This is often referred to as “black box machine learning”, and most people in the industry have had to learn this painful lesson.

The problem with collaborative filtering is that it doesn’t actually “listen” to the music or understand it in any way. It just looks at how often a song appears together with other songs.

So, if enough people put on some heavy metal right next to ballet on the same playlist, collaborative filtering is going to deduce that when you hit play on some metal -- you might also enjoy a little Swan Lake.

While this may seem very unlikely to you, trust me: whenever you have hundreds of millions or billions of something, anything that can happen will happen -- thousands of times!

Oskar Stål: During this time we were very much a one-trick-pony outfit. So we were doing one thing, and we were doing that pretty well. But it was only one thing. And that one thing was the collaborative filtering of playlists and tracks. But at the same time, we didn't really know anything about the tracks. We couldn't really tell whether a track was pop or Christmas or a lullaby for children or what it was. So what happened was that we created a product that worked really great but continuously had these WTF moments that destroyed the entire product. So if you're listening to a 10-song playlist and you have a lullaby and a Christmas song in it, your opinion is going to be that this sucks.

For two years, we tried to refine our collaborative filtering enough that it would put an end to the WTFs. But that alone could only get us so far.

Oskar Stål: You know, we only had like one or two machine learning engineers. So creating a lot of advanced data processing to identify what track is Christmas, what track is this or that or the other, would also have been a major undertaking -- a big investment that I don't even think we realized we needed to make.

But we would find out soon just how big -- and impactful -- an investment in ML would turn out to be.

(12:37) But first let’s break for some product strategy theory.

When you first start working with machine learning, it’s tempting to treat it like just another tool to keep doing what you’ve always done, just a little bit better, a little bit faster. But the goal should be to create true “Machine Learning-first” products -- products that are entirely built on the premise of machine learning -- rather than just incrementally improving your existing product.

Economist Ajay Agrawal puts it perfectly in his book “Prediction Machines”: imagine the prediction accuracy of a machine learning system as a volume knob on a radio.

When you turn up that knob, you’re turning up the accuracy of the prediction of your machine learning system.

Machine learning teams are constantly trying to crank up that knob by working on algorithms, looking at user behaviour and so on, and they use that to make what users are already doing even easier. For example by recommending similar artists on an artist page or suggesting new tracks for a playlist.

But when you reach a certain point on that knob -- when your predictions are accurate enough -- something happens.

You cross a threshold, where you should actually rethink your whole business model and product based on machine learning.

What do I mean by that?

Well, Agrawal gives online shopping as an example. As machine learning gets more and more accurate, an online store could move from just slowly improving its existing “people-who-bought-this-also-bought-that” recommendations to upending the entire experience of shopping.

When the accuracy reaches, say, 8 out of 10 perfect predictions for what you will buy, the business model could change. Instead of waiting for you to place your order on those 8 items every week, the online store could ship you a box of the 10 items it predicts you will want, and you then “shop” in the comfort and convenience of your own home by choosing which items to keep and sending the rest back.

This is the move from “shopping, then shipping” to “shipping, then shopping”.

But we weren't about to shift our paradigm with two machine learning engineers and a bunch of WTFs.

To really crank up our knob of prediction accuracy, we needed technology that could understand music in ways that collaborative filtering with playlists wasn't capturing. Fortunately, we knew exactly where to look.

Remember The Echo Nest, that start-up we talked about earlier, the one we originally outsourced recommendations to?

Ajay Kalia: The Echo Nest used machine learning algorithms to try to understand music at scale the way that humans understand music.

That’s Ajay Kalia. While we were struggling to squash WTFs, Ajay was hard at work fine-tuning The Echo Nest’s recommendation system.

Ajay Kalia: There were two components to that. One was literally trying to figure out how people describe music -- what are the words people use? And for that, The Echo Nest had this series of crawlers that would go and read the entire Internet, finding all these blogs, reviews, all sorts of stuff, in order to see how music was being described, and then doing natural language processing on top of that. So if you see that there is, you know, an artist that's constantly being described with this phrase “jangle pop”, and it appears specifically with that artist and very infrequently with other artists, we learned to associate the term “jangle pop” with that particular artist. That was the base foundation of it. And then what you could do from there is start to see, given any two artists, what are the connections between these artists in terms of how they're described -- who are the artists that are frequently described with a lot of the same words or a lot of the same musical terms. Because this was being done by machines, we could do it at scale, meaning we could go and crawl for thousands and hundreds of thousands and millions of artists, and see how the very niche audiences that are out on the Internet were describing these artists, beyond whatever a human was able to evaluate over the course of, you know, an hour or two. The other half acknowledged that you can't always put words on a song. Sometimes it's more about how it sounds, or the vibe, or the mood -- people have different ways of describing this. But what you could do is take a song and then break it down acoustically. Like, literally take the signal that is contained inside an MP3 or inside a sound file, chunk it up into little windows, and look at all the characteristics of that song -- both things that are describable to a human, like tempo and beat, as well as things that are probably less describable but still contained in the pattern of the music.

(17:30) In other words, The Echo Nest had what we lacked -- algorithms that said everything about the music itself but nothing about how listeners interacted with it -- and Spotify had what they lacked: listening data on how people interacted with the music. Lots and lots and lots of listening data.
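The text half of what Ajay describes can be sketched with a toy TF-IDF-style scoring. This is an illustration of the general idea, not The Echo Nest’s actual pipeline -- the snippets, artist names and scoring formula are all invented for this example:

```python
import math
from collections import Counter

# Toy "crawled" snippets; each entry is text found about one artist.
docs = {
    "artist_a": ["jangle pop revival band with chiming guitars",
                 "the finest jangle pop record this year"],
    "artist_b": ["dark ambient drones and slow textures",
                 "an ambient record of drones"],
}

# Term frequency per artist, and document frequency across artists.
tf = {a: Counter(w for d in ds for w in d.split()) for a, ds in docs.items()}
df = Counter(w for counts in tf.values() for w in counts)
n_artists = len(docs)

def top_terms(artist, n=3):
    """Rank terms TF-IDF style: frequent for this artist, rare elsewhere."""
    scores = {w: c * math.log((1 + n_artists) / (1 + df[w]))
              for w, c in tf[artist].items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]

print(top_terms("artist_a"))  # "jangle" and "pop" rank at the top
```

Terms that every artist shares (like “record” here) score zero, while terms that cluster around one artist rise to the top -- which is exactly the “jangle pop” effect Ajay describes, just at internet scale.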

With that, we’ve arrived at product strategy lesson #3: If you don’t have one side of an equation inside the company -- look outside for it.

In some cases this could mean an acquisition, where you join forces with an existing team. It may seem like a steep price upfront, but ask yourself: how many years' worth of time and effort will you save compared to building out the same capabilities in-house? And, furthermore, how many years into the future will you jump by combining forces now?

And so, in 2014, we took the plunge and acquired The Echo Nest.

Ajay Kalia: The recommendation models at Spotify at the time were good at finding patterns, but they couldn't really describe those patterns. And it was hard to know why songs were sometimes included in those patterns, even though to a human they felt very unnatural. So with The Echo Nest technology, what you had was a piece of technology that could scan the world of music and identify characteristics of those songs in more specific and grounded ways. And then together you could have a joint model that could understand both concepts -- what does it mean for a song to be happy, for instance? Well, we can look for songs that people tend to put on playlists called “happy”. We can look at the characteristics of those songs, and we can look at how people describe those songs or those artists, and then together have a much clearer view of the world.

With the help of The Echo Nest’s tech, we now had the engineering know-how and raw data we needed to create a powerful, helpful, WTF-free recommendation platform.

But more importantly, we were finally equipped to launch our first true ML-first product.

Only about a year after Spotify acquired The Echo Nest, we launched Discover Weekly, our first fully algorithmically generated playlist, individualized for every user. Every Monday morning, you’d open up the app to find a brand new playlist full of gems that you missed from Spotify’s vast back catalogue of millions of tracks, served up for you by the algos.

With Discover Weekly we switched the paradigm from “shopping then shipping”, to “shipping then shopping”, the way Ajay Agrawal described. We had reached a level of Machine Learning accuracy where we could switch from just giving users even better tools to playlist themselves, to just giving them a weekly playlist and let them save the tracks they really liked. We switched our vision from “even better tools to playlist yourself” to “you should never have to playlist again”.

Discover Weekly was a huge hit for Spotify, but underneath the surface, something astonishing and unexpected happened. The same people with esoteric music tastes -- the music aficionados -- who were convinced that no algorithm could ever encapsulate their unique taste were the ones blown away by Discover Weekly’s accuracy. The more casual mainstream listeners, meanwhile, actually weren’t so impressed.

Like us, you might expect that the easier problem to solve would have been recommendations for more mainstream listeners -- people with less obscure, esoteric taste. But in reality, it was the biggest music fans who playlisted the most, and therefore influenced the data the most. So, if you think about it, it makes sense that the recommendation data would be biased towards the hardcore music lovers.

We had managed to solve the problem of personalized recommendations. But only for a subset of our users -- the really complicated ones! So we were facing the aficionado problem all over again.

It turned out, what we needed was a human in the loop!

In this series, we've heard from a number of people whose work has transformed Spotify from the outside in. And now, I'd like to introduce you to another one, someone whose work has been integral in how companies, including Spotify, leverage machine learning. Andrew Ng.

Andrew Ng: Today, A.I. is transforming every industry. I have been saying A.I. is the new electricity.

Gustav Söderström: You were one of the first, I think, as far as I know -- obviously you have been called one of the grandfathers of A.I. -- but you were also one of the first to go from academia and try to make ML and AI happen in the industry. And since then, others have followed. But as far as I know, you were definitely one of the first.

Andrew Ng: Yeah, boy, I don't know if I want to be called a grandfather of AI -- I'm not sure I'm old enough yet.

Gustav Söderström: Father of AI?

Andrew Ng: Father of Nova -- my daughter -- is good enough for me.

Andrew’s background spans both academia -- he’s a professor at Stanford -- and industry -- he co-founded Google Brain. And he played a major role in taking machine learning and AI out of the abstract academic world and into the real-world applications of the tech industry.

Andrew Ng: With the march of automation it’s always very interesting to see where it ends up. And it feels like it keeps on changing. We went from no automation, to using human input to assist automation, to now even further along on the spectrum.

Gustav Söderström: Right. And that's something I find fascinating -- this point in time where you go from linear improvements to, well, let's just do ML first instead, a completely different user experience. I can tell you when I discovered this: it was Google Photos. It was so clear to me that your iPhone was producing way too many photos for you to ever organize manually into albums again. And then at some point, Google said, no, let's flip it around. Instead of saying, let's give you better tools to manually organize your photos, let's tell you: you will never organize them again, you will never create an album again -- we're just going to index it for you. We're going to group them. So machine learning hit some threshold where you could change the entire experience instead of just improving on the old paradigm. The paradigm changed.

Andrew Ng: Yeah, that's really interesting. Many years ago, maybe like 20 years ago, when machine learning was starting to make its way into programmatic advertising, I remember some of the ad salespeople said, no, look, you could never have an AI place ads on websites. I know this website is a travel website. I know that this travel agency is going to, you know, pay me for ads, and we'll put the ads on this website. And that's how it’s going to work, and AI could never do that. And maybe they were right for the technology of that time. But then the learning algorithms started to come in. Initially we would build learning algorithms -- pretty simple data science algorithms -- that would make some suggestions like, hey, did you know you could send more people to talk to the travel agency, because they're under-indexed in terms of advertising? And then we shifted to taking the salespeople's suggestions as input to the learning algorithm. We'd go to the ad sales teams and say, wow, you're so insightful about this industry -- tell us what you know so we can feed it into the learning algorithm and it will make better predictions together. And we did that. And then the algorithms continued to develop, and, well, the rest is history. As we now know, we use programmatic advertising on many ad platforms. And so the role of the salespeople who were manually placing ads has changed. Today, the large online ad platforms still have large sales teams, and data scientists do help suggest where they direct their efforts.

(25:27) I asked Andrew to join us on the show, because he pioneered the thinking behind our next product strategy lesson. Lesson #4: In a machine learning world, you have to learn the product, not build it!

What does that look like in practice?

For decades, Product Managers, Designers and Engineers have been working together in more or less the same way. We often start with something called a PRD -- a Product Requirements Document -- to describe what the product should do, and a Wireframe to visualize what it should look like and how it should work.

Andrew Ng: In the pre-machine-learning era, the way a product manager would specify a product would be a PRD -- a product requirements document -- and maybe a wireframe, and engineers would execute it. But if you are trying to build a machine learning product -- anything from a self-driving car to a music recommendation system or an e-commerce recommendation system -- you can't wireframe, you know, how a self-driving car should drive. So I think that the new way to specify a product is for the product manager to come up with the test set, go to the machine learning team and say: dear machine learning team, are you able to make predictions at, say, at least 97 percent accuracy -- or some other number -- on this test set? And that is the PRD that the machine learning team can then execute against.

What Andrew means by saying “the test set is the new wireframe” is that -- instead of writing up a spec or drafting a single wireframe -- product managers should go out and source a ton of examples of what the feature should behave like. Machine learning engineers can then use those examples as the “test set” to see how closely their machine learning system actually matches and predicts that data at scale -- or in machine learning speak, how well it scores on that test set.

Just saying that “the product should be great” -- doesn’t cut it, and giving individual anecdotal examples for a feature that needs to respond differently to millions and millions of people’s input isn’t enough. The job of a product manager in a machine learning world is to find metrics, objectives and source datasets that can objectively describe what “great” actually looks like - or in our case - sounds like at scale.
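In code, “the test set is the new PRD” might look something like the sketch below. Everything here is illustrative -- the context features, the labels, the stand-in model, and the 97 percent target are all invented to show the mechanism, not taken from any real Spotify spec:

```python
# Toy "PRD test set": (listener context, expected label) pairs sourced
# by the product manager. Names and values are purely hypothetical.
test_set = [
    ({"hour": 7,  "day": "mon"}, "morning_commute"),
    ({"hour": 22, "day": "fri"}, "party"),
    ({"hour": 23, "day": "sun"}, "sleep"),
    ({"hour": 8,  "day": "tue"}, "morning_commute"),
]

def model_predict(ctx):
    """Stand-in for the ML team's model (a trivial rule-based placeholder)."""
    if ctx["hour"] < 10:
        return "morning_commute"
    return "party" if ctx["day"] == "fri" else "sleep"

def accuracy(model, tests):
    """Score a model against the PM-sourced test set."""
    hits = sum(model(ctx) == label for ctx, label in tests)
    return hits / len(tests)

TARGET = 0.97  # the "PRD": hit at least 97 percent on this test set
score = accuracy(model_predict, test_set)
print(f"accuracy {score:.2f}, PRD met: {score >= TARGET}")
```

The product manager owns the test set and the target; the machine learning team owns whatever replaces `model_predict` to hit that target at scale.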

Meet the human in the loop.

Meg Tarquinio: My name's Meg Tarquinio. I work on the editorial team -- I've been there seven years -- and now I lead a team called Curation Strategy.

In 2013 -- even before we bought The Echo Nest -- we acquired Meg’s former employer, Tunigo, in a bid to reach more casual mainstream listeners through human curation.

Tunigo was founded in Sweden in 2010 by a music producer named Nick Holmsten. He -- along with his team of music critics and DJs -- approached music discovery differently from everyone else at the time.

Meg Tarquinio: So getting down to it, Tunigo was a music app that was powered off the Spotify API. So we were a standalone app, but also an app within the app. And at its heart, the mission was to open up the world of music. So essentially answering that question: what do you listen to when you can listen to absolutely anything at any time?

While the rest of the world -- including Spotify and The Echo Nest -- was trying to describe music using technical terms and categories like EDM, grunge, and trip-hop, Meg and the other Tunigo editors put together soundtracks for what you should do with the music -- what we in product development call “use cases” -- playlists like “High Energy Running”, “Deep Focus”, “Dinner with Friends” and “Your Favorite Coffeehouse”. Terms that would let anyone navigate the music catalogue.

Meg Tarquinio: At Tunigo, our curation approach was really to create a deep ecosystem of music playlists to help listeners find music for as many moods, moments, modes, activities, genres, and lifestyles as possible. So every playlist -- to belong and be in the ecosystem -- has to have a goal, and it has to have a hypothesis for how it plans to reach that goal. It has to think about its editorial intent, about user intent and audience and context, and strategize its selection and sequencing. But the goal is to provide a soundtrack for a particular moment.

In other words, Meg and the other Tunigo editors made playlists that told a story, that captured dimensions of people’s everyday life that not even the most sophisticated algorithms could pick up on by just looking at song similarity or audio characteristics. And because Tunigo was built using Spotify’s API, we knew how wildly popular their playlists actually were with the more mainstream users we were trying to reach -- users who also loved music, but who didn’t speak the music vocabulary of trip hop, lo-fi, electronica and EDM, and wanted to spend less time playlisting and more time listening.

Thinking back to lesson #3, the question wasn’t really whether or not we should acquire Tunigo. It was what Tunigo’s editors could do once they had access to all our listening data.

Once on the inside, a Tunigo editor could drop a track into a playlist and, within minutes, see the number of plays, skips and saves -- and then add, remove and re-rank tracks in a continuous loop to optimize the playlist for the use case.

So, we acquired Tunigo in a bet to build the most data driven playlisting system the world had ever seen.

But acquiring these two companies and all the talent that came with them -- Ajay at The Echo Nest, Meg at Tunigo -- isn’t the same as making them work together.

And so by 2015, we had developed two distinctly different tools for recommendations: personalized, algorithmically generated playlists like Discover Weekly for the hardcore music fans, and “soundtracks” for different situations, curated by editors like Meg, for the more casual listeners. But 1+1 still just added up to 2, rather than the mythical 3 that we wanted.

Meg Tarquinio: Personalization was not something that you could access or have, as an editor or listener, in our curated playlists. In our sets we followed a broadcast model. So we were able to be very culturally relevant and to be temporal, but everyone saw the same thing at the same time.

You might be surprised to hear me say this but in several aspects, the editorial team’s playlists were superior to the algos’ in that they were better at tapping into the zeitgeist and being able to target complicated human emotions and situations, because they were built by complicated humans who understood those situations.

But at the same time, they were very limited in their ultimate ability to perfectly fit any specific listener’s taste, because they had to be an average of everyone’s taste in that use case. In order to find a big audience for a use case, situation or mood, each playlist needed to be largely right for everyone -- but not perfect for anyone. Conversely, the algo playlists could cater to an individual user’s taste, but they lacked depth and understanding of the situation.

It turns out that humans are pretty great, and still very smart compared to most algorithms. But the thing about humans is that they don’t scale so well. In an ideal world, we would be able to hire three professional editors per user; each editor would work an 8-hour shift, going through your listening history, getting to know you, and being ready to whip up a truly perfect, personalized playlist for any event, activity, or mood, just for you -- 24 hours a day. Unfortunately, we just can’t afford to have 900 million editors on staff for 300 million users. While the algos aren’t quite as refined as the editors, they do scale incredibly well, and we literally can have an algo working 24 hours a day for every listener.

For years, the two teams couldn’t help but see each other -- at best -- as tools to make their own existing products incrementally better. Sort of like how we originally treated machine learning.

Here’s Ajay Kalia, the product manager from The Echo Nest, again.

Ajay Kalia: The personalization group was ingesting hundreds of signals about music data for many, many points of view. Our suggestion at the time was: we'll have the editors act as one more input to that, and we'll put it into the machine. We'll update it. We'll evaluate it. And we will make our recommendation based on that. They felt like the best approach for combining editorial and personalization was to think of algorithms and all that data as one more input into how they made decisions when they're making playlists. So they would ask for, you know, song suggestions on playlists, and they could use their brain power and everything they knew to build the rest of that playlist.

And then, one day, in Spotify’s Boston office, Ajay and Meg start chatting.

Ajay Kalia: You know, in this part of the office, there was a really big picture window with a really great view. And the person who would snag that view was Meg Tarquinio, one of our music editors. And so she was sitting right next to us in our little pod - different teams - but we were in the same space.

Meg Tarquinio: I pretty vividly remember Ajay coming over to me, and at the time I was in this beautiful corner desk with exposed brick. It was very Boston, it was very Echo Nest, and it was just this -- I had my little plants and he said something like, if we could just do like a small test, like anything, what could we do?

Ajay Kalia: And she was explaining, you know, as an editor, we have hundreds of playlists that generate tons of consumption around all these different concepts, but it's very frustrating to her that you can only make one playlist with 50 songs that everybody gets. And this seemed like an obvious place we should work together on, because what she was great at doing was figuring out new concepts that should exist in the world and what kind of music someone might like out of that. And what we're really good at was figuring out, well, what kind of music do you like?

This was Andrew Ng’s wireframing in action.

It turns out that, while Meg was hired as a music editor, in Andrew Ng’s world she was a product manager, and she had done exactly what the product manager in a machine-learning-first world should do! She had come up with the product use case, and she had even sourced a small “test set” of what that use case should sound like.

The first thing Ajay did was look for the right playlist - or in product speak, the right product and use case to start experimenting on.

They needed to find a playlist that lots of people opened but not very many actually played. That would indicate the desire was there -- people wanted to listen to music that fit that description; it was the right use case -- but when they saw the tracklist, it wasn’t yet what they expected, and they moved on.
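As a rough illustration, that “opened a lot but rarely played” signal can be ranked from two counters per playlist. Everything below is invented for the sketch -- the numbers and field names are hypothetical, not Spotify data:

```python
# Hypothetical per-playlist counters: how often it was opened vs. played.
playlist_stats = [
    {"name": "Songs to Sing in the Car", "opens": 120_000, "plays": 18_000},
    {"name": "Deep Focus",               "opens": 90_000,  "plays": 70_000},
    {"name": "Morning Commute",          "opens": 40_000,  "plays": 35_000},
]

def open_to_play_gap(p):
    """Fraction of opens that did NOT convert into a play."""
    return 1 - p["plays"] / p["opens"]

# Sort so the biggest gap -- strongest demand, weakest delivery -- comes first.
for p in sorted(playlist_stats, key=open_to_play_gap, reverse=True):
    print(f'{p["name"]}: {open_to_play_gap(p):.0%} of opens never played')
```

A playlist at the top of this ranking is exactly the kind of candidate the text describes: the concept draws people in, but the one-size-fits-all tracklist sends them away.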

Ajay Kalia: And one playlist in particular was sort of off the charts, and it was called Songs to Sing in the Car, and it was made by Meg. And so we turned to our left and asked Meg, like, what is Songs to Sing in the Car?

Meg Tarquinio: Songs to Sing in the Car was just one playlist among thousands and thousands of editorial playlists. So a relatively universal, unique moment in the human experience, which is simply feeling your feelings and singing in the car. I’m just laughing because I'm thinking about how many times I've stopped at a red light and I'm doing this and I'm just like singing my guts out and someone sees me, you know, but that's the moment that this playlist was meant for. So we wanted to deliver music that was familiar to people, that they know all the words to, and that they'd want to sing along to. And without personalization, we just had to rely on the hits. So to guarantee that someone would know all the words, we just had to kind of instead curate to most people. So what are the most popular songs that are also the most singable? As opposed to songs that you personally love and know all the words to.

Ajay Kalia: What we decided to do was a small test with almost like a prototype of the algorithm. You know, you have prototypes in visual design. You could do prototypes with recommendations as well. Then you figure out the components and you build the smallest possible engine on top of that. So in this case, she built the candidate pool. We didn't even use any algorithms.

Meg Tarquinio: So I think we scoped it to about 700 tracks. I think I spent an hour or two max widening the pool. So selection was covered by an editor. So every song that came in kind of filtered through that kind of popularity and singability matrix. But of course, we were able to widen the scope to have more taste profiles and more artists in that pool. And then for sequencing we used our personalization algorithm to kind of really scale that and kind of do that second step of selection. So of these 700 tracks, what are the top 100 for you as an individual, but then also sequencing it by that kind of personal relevance.

Ajay Kalia: With most tests of recommendations technology, you end up getting neutral results. There's really no difference. Sometimes you get a negative result, so at least you've changed something. But even that's pretty rare. If you can get a five or 10 percent boost in engagement for what you did, that's usually considered a pretty huge win. And in this case, we were seeing like a 100 percent increase, like a 2x increase. Like, I've never seen numbers that high in a first test in Spotify and probably never will again. And it was immediately obvious to all of us that something here was really working. So, you know, it's almost like product-market fit. She had been able to identify a real need that a lot of users in Spotify had. But the product wasn't quite working. And so when you put this in, someone was not only seeing something they liked, but clicking into that playlist and getting exactly the songs they expected and wanted within there. Users loved this.
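The two-step flow Meg and Ajay describe -- an editor-curated candidate pool, then per-user selection and sequencing -- can be sketched in a few lines. The affinity scores here are made-up stand-ins for whatever a real personalization model would output:

```python
def personalize(candidate_pool, user_affinity, n=3):
    """Select the user's top-n tracks from the editorial pool,
    sequenced by personal relevance (highest affinity first)."""
    scored = [(user_affinity.get(track, 0.0), track) for track in candidate_pool]
    scored.sort(reverse=True)
    return [track for score, track in scored[:n]]

# Editor's pool: every track has already passed the "singable hit" filter.
pool = ["Track A", "Track B", "Track C", "Track D", "Track E"]

# One listener's (invented) affinity for each track; unknown tracks score 0.
affinity = {"Track A": 0.2, "Track B": 0.9, "Track C": 0.5, "Track D": 0.7}

print(personalize(pool, affinity))  # ['Track B', 'Track D', 'Track C']
```

The division of labor mirrors the story: the editor guarantees every candidate fits the concept, and the algorithm only decides which of those candidates fit this particular listener, and in what order.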

We discovered that when you think of the playlist as a product, and the editor as the product manager -- sourcing the “test set”, the pool of tracks that Meg created -- ML can deliver a truly personalized listening session. And thus algotorial -- algorithmic plus editorial -- was born, and 1 + 1 finally equaled 3 for Spotify.

Meg Tarquinio: What I think we've done, and where the relationship lies, and what we continue to do, is to examine the strengths and the weaknesses and blockers of human curation in any specific situation or opportunity, and look to see where tech or machine learning can unblock and scale editorial creativity and impact. And conversely, over the years, which has been really great to see and explore and be a part of, is how editors can be humans in the loop and can work to fuel our algorithms in a unique way as well. So it's become a very symbiotic, mutually beneficial relationship that scales and amplifies both humans and machines on either side and builds something new together in the middle that isn't possible without that relationship. So I think we've moved beyond those fears and we've started to kind of speak the same language that we've developed together in partnership.

We had moved to learning the product -- in our case the playlist -- instead of building it.

But -- while machine learning on its own just kind of looks in the rearview mirror and drives in a straight line into the future -- with algotorial, we not only found a way to infuse human creativity and intelligence into the products we create. We found a way for a human to have a hand on the wheel -- one who can occasionally choose to make a left or right turn and even take the scenic route every once in a while, but only as long as the passengers still enjoy the ride!

(41:18) And that’s the untold story of how we went from somewhat skeptical of the value of machine learning to realizing that machine learning - and the personalized soundtracks we create - is the product! So what’s next for Spotify?

Most of the machine learning in the tech industry up until recently has been focused on observing and predicting users’ actions and reactions in the moment. What is the user most likely to click on right now?

Tony Jebara is on a mission to change that short-term way of thinking.

Tony Jebara: Hi, my name is Tony Jebara. I'm the vice president of Machine Learning Engineering at Spotify.

Tony is behind Spotify’s push to move beyond baseline machine learning to a method called reinforcement learning.

Reinforcement learning adds another dimension to the system’s rearview mirror -- how the landscape and the user’s behavior change over time -- and applies that to predict what might be best in the long term. We call your position in this landscape the “state.”

Tony Jebara: There’s also this concept of state. And every time you do these loops -- they know they had a great bunch of sessions. And so they're going to behave differently than if they had just had a bunch of bad sessions. They also have changed their taste or discovered new things. And so that, to me, feels more like the journey -- we talk about sometimes users being on a journey, and us changing their state over time -- as opposed to a transactional service which says, oh, here is some information about a user and some content, and I'm going to take this action and then hopefully get a click or stream or, you know, some kind of engagement.

Companies like DeepMind and OpenAI use reinforcement learning to teach AI systems how to make good long-term decisions in games like Go, chess, and StarCraft, where just getting the highest possible score in the moment is often not the best long-term strategy to actually win the game. Going straight for the rook might seem like a good move, but not if it lets your opponent take your queen.

So what if we thought of music recommendations as a co-operative “game” that you play with Spotify, and all the listening history as previous rounds of that game, played by hundreds of millions of people?

Tony lays it out as a kind of three-dimensional landscape, with longitude, latitude, and altitude.

Tony Jebara: Users have some positions in this landscape which really summarize their experience and history with the service, and their audio consumption and their impressions and so forth. What they've discovered, what they're aware of, what they're not aware of. That summary is putting you in some place in this fictitious world. And then we can say, how happy are you where you are -- and you could think of that as the altitude. And so if your state is near a valley, if we do some things that are not so good, you may end up getting pushed and nudged by all our surfaces and the actions into this valley where you're no longer happy, and then you cancel your membership. And ideally, what we want is to move users to higher-altitude locations. And some of them are clearly showing us higher retention, higher rewards every time they come back. And what is different about those users in their state? And can we find users that are like them nearby but are much lower down the hill, and then make a more deliberate effort with our surfaces to make them like the happier users -- the users that have now found a diversity of content and are coming back to it, and have discovered new genres, and have the right healthy mix of habit and familiarity with discovery, and a mix of routines throughout the week.

Gustav Söderström: So one way to think about it is we could use all our existing users and their different levels of happiness and engagement to try to map out a bit of this landscape, to understand where there are peaks and valleys. And then if we, based on your usage, can place you on a longitude and latitude, we could sort of see, like, there's actually a valley close to you that you should avoid, or a hill close to you -- maybe we should try to guide you towards that. Because other users, when they climbed that hill, they were more happy. Right. They had a better state.

Tony Jebara: Exactly. And in reinforcement learning the altitude is called a value function. The value function is a measure of: given this state, how happy are you, and how likely are you to reward, you know, Spotify in the next round of play -- a game that involves, you know, the content and the user, and that kind of loop around taking an action to see if it's rewarded. And so you have to think about not just getting the next instant reward and instant gratification, but also setting up the grounds for success so that, you know, they're overall more likely to reward you.
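To make the value-function idea concrete, here is a deliberately tiny toy model -- my own illustration, not Spotify's system. A listener moves between made-up states, and the value of a state is the discounted sum of future rewards reachable from it, which shows why the action with the biggest instant reward is not the best one:

```python
GAMMA = 0.9  # discount factor: how much future reward matters vs. right now

# state -> {action: (next_state, immediate_reward)} -- all invented.
# "happy" listeners keep coming back; "valley" listeners have churned.
mdp = {
    "start":  {"clickbait": ("valley", 1.0), "healthy_mix": ("happy", 0.5)},
    "happy":  {"stay": ("happy", 1.0)},
    "valley": {"churned": ("valley", 0.0)},
}

def value(state, depth=50):
    """Best discounted return achievable from `state` (finite horizon)."""
    if depth == 0:
        return 0.0
    return max(r + GAMMA * value(s2, depth - 1)
               for s2, r in mdp[state].values())

# Clickbait pays 1.0 now but ends in the valley (total value: just 1.0).
# The healthy mix pays only 0.5 now but keeps the listener returning,
# so the value of "start" ends up around 9.4 -- the patient action wins.
print(value("start"))
```

This is the chess intuition from above in miniature: the greedy move (take the instant 1.0) loses to the move that sets up every future round.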

Gustav Söderström: So why do you think the long-term perspective -- looking for long-term reward versus just immediate reward, the most likely next click, for example -- why do you think that's so important?

Tony Jebara: Because in the end, it's a subscription business, and really we’re about people staying subscribed for a long time, as opposed to an ad network where it's just click, convert, make the money and leave. And I also think users eventually want to move towards a healthier Internet. The Internet has been extremely transactional, and we're seeing some of the adverse effects of that, where it's about the maximum-click, instant-gratification Internet, and lots of controversy around that -- leading to unhealthy behavior, leading to socially damaging behavior, leading to users churning out and leaving platforms and being upset with other companies, as examples, having been maybe a little too instant-gratification, click-focused, as opposed to having healthy consumers that live long on the platform.

Because what you might be most likely to click in the immediate term isn’t necessarily what you will actually value in the long term.

And that brings us to today.

(46:39) Before we go, here are the top 4 lessons we covered in today’s episode:

Lesson #1: Build for yourself first. But don’t build for yourself only. Designing with yourself in mind is a good place to start, but don’t limit your potential to people just like you.

Lesson #2 - Throwing Machine Learning at data you don’t fully understand isn’t enough to give you a great product. You need to understand your data deeply.

Lesson #3 - If you don’t have one side of an equation inside the company -- look outside for it. And weigh the present-day costs against how far ahead you can leapfrog into the future.

Lesson #4 - In a machine learning world, you have to learn the product, not build it!

And that’s a wrap on today’s episode -- Next week -- what do you do when your winning bet becomes your losing bet?

Emil Fredriksson: I remember like ordering millions and millions of dollars of servers as we were like doing the migration because we had to have this overlap. And I mean, it doesn't feel great doing that, knowing that these are going to be in use for a short amount of time. But that's just the way -- that's what you had to do.

Spotify: A Product Story is produced by Munck Studios for Spotify.

We’re edited by Frances Harlow and mixed by Joakim Löfgren, Viktor Bergdahl and Andrea Fantuzzi.

Our theme music was composed by Andrea Fantuzzi.

Veronica Harth is our in-house Spotify correspondent.

Special thanks to:
● Erik Bernhardsson
● Laura Pezzini
● And Glen McDonald

And I’m Gustav Söderström. Thanks for listening.
