How we built a machine learning recommender for a shoe retailer

A very common challenge we encounter with e-commerce businesses is that users are, in fact, terrible at finding the item they like.

A story of Michelle, the shoe shopper

Let me tell you a story about Michelle.

Michelle is an exercise buff, and is looking for a new pair of running shoes. She also has great taste, and wants to find a shoe that complements her wardrobe of workout attire. She likes pastel colours, so her outfits are generally either black or gray, or some form of pastel colour.

To get her new pair of running shoes, she goes online. And online, she looks for “white female running shoes”.

And various e-commerce merchants pop up. Some from name brands themselves, like Nike, or Adidas, or Onitsuka. And various 3rd-party retailers who carry a large collection of brands.

She spots a retailer who seems to carry a nice, white pair of shoes. She loves it, because it sports an elegant design without the brand and logo being too eye-catching, and is from her favorite brand, Onitsuka.

The price seems kind of right too, less than $300, well within what she expected to spend for a decent pair of shoes. At least for her.

So she heads to the site, and browses through every single picture, from every angle.

She sends the links to her running buddies asking their opinions, but she really doesn’t care much. She loves the shoes. She then looks deeper at the shoe’s details – the weight, the technology it features, and the type of workout it is designed for.

And her heart sinks. It says “tennis shoes”.

Dismayed, she closes her browser, intending to resume her search the next day, not realizing that there is another model, incredibly similar, with just a slightly different tongue design (the tongue is a part of the shoe), that is for “running”.

An intelligent recommender system could have told her that.

Recommender systems – what are they?

The name probably already makes it fairly obvious. They’re intelligent systems that recommend things to users that are relevant.

In Michelle’s case, her (likely) ideal running shoes.

When thinking about recommender systems, we can divide them into two approaches: content-based filtering models, and collaborative filtering models.

They both use machine learning, but they work differently.

In fact, modern recommender systems generally use both of them, together, so that they can get even more accurate results.

We’ll quickly describe these two methods below, to provide a high-level view of how recommender systems work in the first place.

Content-based filtering models

Put very simply, content-based filtering uses the content of the items that are up for recommendation to determine how similar they are.

For instance, you’re thinking of buying a pair of Adidas light-weight running shoes, with laces, of a simple gray design.

A content-based method will look at all the metrics of all the shoes, and try to find other shoes that are very similar.

Of course, what “similar” means is something that the algorithm needs to be taught. Perhaps, it may suggest that you might also want to consider this:

And it will almost certainly suggest this:

The **same** Adidas UltraBoost, but in white rather than gray

In other words, content-based models really look at the similarity of the attributes of the items the system has knowledge about.

Collaborative filtering models

The other method is using collaborative filtering models.

Collaborative filtering models basically work by analyzing your behavior, and the behavior of others. In short, behavior.

Behavior can be explicit. The example we all know is when we rate a particular item, or purchase, and give it a bunch of stars. 5 stars for fantastic, 1 star for really bad. It can also come from reviews, where (a different) machine learning system analyzes the text of the review and, say, performs sentiment analysis to determine whether it is a good or bad (or scathing) review.

Behavior can also be implicit. How many times we go back to looking at that item (on the web page), how long we spend looking at it, whether we start looking at all the detailed specifications or we simply go away. It can also include whether or not we favorited the item, or if we asked the merchant or seller questions about it.

Okay, but that doesn’t tell me how collaborative filtering actually works.

You’re right.

Let’s put it in a simple conversation:

David: “Hey, I like shoes A, B, C and D!”

Jane: “For me, I like shoes B, C, D, and E.”

Algorithm: “David and Jane are pretty similar, they have quite a bit of overlapping shoes – they both like shoes B, C and D. Okay, here goes.
David, you may want to consider looking at shoe E.
Jane, you may want to take a look at shoe A.”
A simple view on how collaborative filtering models work.

In other words, collaborative filtering models look at the similarity between users, and recommend things that similar users like.
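To make the David and Jane conversation concrete, here is a minimal sketch in Python. It assumes that "similar" means a high overlap between two users' sets of liked shoes (the Jaccard index), and the `min_similarity` cut-off of 0.5 is an arbitrary value picked for illustration. Neither is taken from the production system.

```python
# "Likes" data taken from the David and Jane conversation.
likes = {
    "David": {"A", "B", "C", "D"},
    "Jane": {"B", "C", "D", "E"},
}


def jaccard(a, b):
    """Overlap between two sets of liked shoes: shared items / all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, likes, min_similarity=0.5):
    """Suggest shoes liked by sufficiently similar users that `user` has not liked yet."""
    suggestions = set()
    for other, other_likes in likes.items():
        if other == user:
            continue
        # Assumption: users count as similar when their overlap passes the cut-off.
        if jaccard(likes[user], other_likes) >= min_similarity:
            suggestions |= other_likes - likes[user]
    return sorted(suggestions)


print(recommend("David", likes))  # ['E']
print(recommend("Jane", likes))   # ['A']
```

David and Jane share 3 of 5 distinct shoes, so their overlap is 0.6, which passes the cut-off and triggers the two suggestions from the conversation.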

Building an actual model for our shoe retailer

The problem when starting with machine learning

When we started building a recommender system for our shoe retailer, we ran into a problem that is pretty common when starting out with machine learning: data, or the lack of it.

However, as we help companies leverage machine learning to scale their business, we refuse to let the lack of data get in our way. Unless there is absolutely zero data, which is a problem for another discussion beyond this one.

Thankfully, most people have some data, even if it’s not perfect. But obviously, the problem has to fit a machine learning solution in the first place.

The good news is that our shoe retailer actually had data. They painstakingly tagged every model of shoes with their key attributes, such as color, size, design, brand, year of production, purpose (running, tennis, cross-trainers, etc.) and so on. And they intend to continue doing this. It’s their standard practice as a great shoe retailer.

The challenge was that our shoe retailer had very little user behaviour data. They certainly had the purchasing information, but they were not set up to track behavior such as which shoes a user looked at, how long she looked at each shoe, and so on.

Designing the high-level parts of the recommender system

So we had to decide how to build a recommender system that could work reasonably well with good item attribute data, but without great user behavior data.

We also had to consider how the recommender system could grow with the data as it accumulated, and increase its accuracy along the way.

Therefore, we had 3 jobs:

Since we have item attribute data, it made sense to build a content-based filtering model.
Since we have user purchasing data, we at least have some data, albeit a very small amount, to tune our collaborative filtering model.
And, since we also want to create an accurate recommender system that uses detailed user behavior, we had to work with our shoe retailer to put in place the technology to capture such user behaviour, and translate that into data that our collaborative filtering model can learn from.

Creating our content-based filtering model

This is the easier one. As we mentioned, our shoe retailer has already painstakingly tagged every shoe with all their critical attributes.

So how do we build our model to be able to recognize similar shoes, based on their attributes, to the shoe our user is looking at?

Simply, we create a vector (which is just a fancy word for list) of all the attributes of a shoe, and compare it with another vector of another shoe.

After we make the comparison of attribute vectors of a shoe vs. another shoe, we assign a “similarity score”, which we can represent as a similarity matrix:

With this, you can quite easily see how we can figure out, say, the top 10 most similar shoes to a given shoe.

We also, after testing, chose to go with a very simple model where all attributes of the shoe are considered equally important. In other words, the colour of the shoe is not considered more, or less, important than the type of the shoe.
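The attribute-vector comparison can be sketched as follows. This is an illustration under stated assumptions, not the retailer's actual model: the catalogue, the attribute set (brand, colour, purpose, closure) and the scoring rule (the fraction of attributes on which two shoes agree) are all made up for the example. The only thing taken from the text is that every attribute carries the same weight.

```python
# Hypothetical catalogue. Each shoe is a vector of (brand, colour, purpose, closure).
shoes = {
    "adidas-run-gray": ("Adidas", "gray", "running", "laces"),
    "adidas-run-white": ("Adidas", "white", "running", "laces"),
    "onitsuka-tennis-white": ("Onitsuka", "white", "tennis", "laces"),
    "nike-trainer-black": ("Nike", "black", "cross-training", "velcro"),
}


def similarity(a, b):
    """Fraction of attributes on which two shoes agree (all attributes weigh the same)."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


# Similarity matrix as a nested dict: matrix[shoe_a][shoe_b] is a score in [0, 1].
matrix = {
    a: {b: similarity(va, vb) for b, vb in shoes.items()}
    for a, va in shoes.items()
}


def top_similar(shoe_id, n=10):
    """Return the n most similar other shoes, best first."""
    scores = [(other, s) for other, s in matrix[shoe_id].items() if other != shoe_id]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:n]


print(top_similar("adidas-run-gray", n=2))
# [('adidas-run-white', 0.75), ('onitsuka-tennis-white', 0.25)]
```

The gray and white Adidas running shoes differ only in colour, so they score 0.75. Introducing per-attribute weights would only change the `similarity` function.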

Now this may or may not work for every shoe retailer, but in our case, we decided on this after we tested and experimented with various weights, and also because the content-based filtering model is designed to be a further filter for the more powerful collaborative filtering model, described later.

In other words, the intention is to take the results of the collaborative filtering model, and present only those which this content-based filtering model also “agreed” with, within some threshold of agreement.

And that is the essence of our content-based filtering model.

Creating our collaborative filtering model

For our collaborative filtering model, we have to take into account users’ behavior.

In other words, we want to:

Find similar users
Figure out what shoes they like (or for starters, bought)
Recommend the shoes that similar users like to a given user

So how do we, in practice, calculate the suggestions for hundreds of thousands of users (i.e. shoes to recommend to these users), based on hundreds of thousands of other users’ behaviour (e.g. purchases)?

In a nutshell, again, using matrix math.

Matrix representation of users and the shoes they like (or bought)

This is a super simplified view. The real matrix is huge. Basically, the rows represent every individual user, and the columns represent every single shoe.

Therefore, for a given row (user) and a given column (shoe), a 1 means that the user has interacted with the shoe (in our simplified model, the user has bought the shoe).

If the user hasn’t interacted with the shoe, it gets a 0.

You can probably already guess that as we add more user behavior data, or “features” in machine learning speak, our model morphs from just a simple 1 and 0 to a weighted score that takes into account all the user behavior, and how important each behavior is (the weight).
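As a small illustration of the simplified 1-and-0 version, the sketch below builds the user-by-shoe matrix from a purchase log. The log format (pairs of user and shoe identifiers) and the names in it are assumptions made for the example.

```python
# Hypothetical purchase log: (user_id, shoe_id) pairs.
purchases = [
    ("david", "A"), ("david", "B"), ("david", "C"), ("david", "D"),
    ("jane", "B"), ("jane", "C"), ("jane", "D"), ("jane", "E"),
    ("michelle", "E"),
]

users = sorted({user for user, _ in purchases})
shoes = sorted({shoe for _, shoe in purchases})
user_index = {user: i for i, user in enumerate(users)}
shoe_index = {shoe: j for j, shoe in enumerate(shoes)}

# Rows are users, columns are shoes; 1 = bought, 0 = no interaction.
interactions = [[0] * len(shoes) for _ in users]
for user, shoe in purchases:
    interactions[user_index[user]][shoe_index[shoe]] = 1

for user, row in zip(users, interactions):
    print(f"{user:>8}: {row}")
```

With hundreds of thousands of users and shoes, almost every cell is 0, so a real system would most likely store only the non-zero entries instead of a dense list of lists.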

Then, we run some matrix math, which is basically a long, complicated factorization that breaks the big matrix down into smaller ones.

When we’re done, we end up with two things:

A vector that represents a user’s preferences
A vector that represents a shoe’s profile
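The text does not say which factorization was used, so the following is only a toy sketch of the general idea. It learns one short vector per user and one per shoe with plain stochastic gradient descent, so that the dot product of a user vector and a shoe vector approximates the matching cell of the matrix. The vector length `k`, the learning rate, the epoch count and the choice to treat every 0 as a known "no" are all assumptions.

```python
import random


def predict(U, V, i, j):
    """Predicted score: dot product of user i's vector and shoe j's vector."""
    return sum(u * v for u, v in zip(U[i], V[j]))


def squared_error(matrix, U, V):
    """Total squared difference between the matrix and its approximation."""
    return sum(
        (matrix[i][j] - predict(U, V, i, j)) ** 2
        for i in range(len(matrix))
        for j in range(len(matrix[0]))
    )


def factorize(matrix, k=2, epochs=500, lr=0.02, seed=0):
    """Learn k-dimensional user and shoe vectors by gradient descent."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in matrix]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in matrix[0]]
    for _ in range(epochs):
        for i, row in enumerate(matrix):
            for j, actual in enumerate(row):
                err = actual - predict(U, V, i, j)
                for f in range(k):
                    u, v = U[i][f], V[j][f]
                    # Nudge both vectors to shrink the error on this cell.
                    U[i][f] += lr * err * v
                    V[j][f] += lr * err * u
    return U, V


# Rows are users, columns are shoes (1 = bought, 0 = no interaction).
interactions = [
    [1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1],
]
start_U, start_V = factorize(interactions, epochs=0)
user_vectors, shoe_vectors = factorize(interactions)
print("error before training:", round(squared_error(interactions, start_U, start_V), 3))
print("error after training: ", round(squared_error(interactions, user_vectors, shoe_vectors), 3))
```

`user_vectors` holds one preference vector per row of the matrix, and `shoe_vectors` one profile vector per column.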

And how do we use those results?

To find which users are similar to each other, which is what mattered in our David and Jane example, we compare the user vectors.

Mathematically, this is a dot product. But anyway, by comparing the two vectors, we know how similar, or dissimilar, 2 users are.

This means that we can find all the users who are similar to a given user, by applying some threshold.

Similarly, we can do the same for the shoes, and figure out which shoes are the most similar to a given shoe.
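Here is a hedged sketch of that comparison. It uses cosine similarity, which is a dot product taken after scaling each vector to unit length. That normalization is a choice made for this example, so that long vectors do not dominate, and is not necessarily what the production system does. The user vectors and the 0.8 threshold are made-up values.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Dot product of the two vectors after scaling each to unit length."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


# Hypothetical 2-dimensional user vectors, as a factorization step might produce.
user_vectors = {
    "david": [0.9, 0.1],
    "jane": [0.8, 0.3],
    "michelle": [0.1, 0.9],
}


def similar_users(user, vectors, threshold=0.8):
    """Return every other user whose similarity to `user` meets the threshold."""
    target = vectors[user]
    return sorted(
        other for other, vec in vectors.items()
        if other != user and cosine(target, vec) >= threshold
    )


print(similar_users("david", user_vectors))     # ['jane']
print(similar_users("michelle", user_vectors))  # []
```

The same two functions work unchanged on shoe vectors to find the shoes most similar to a given shoe.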

Therefore, for a given user, looking at a given shoe, we:

Find similar users, and suggest to you shoes that they have bought
Find similar shoes, and directly suggest them to you

And therein lies the fundamental collaborative filtering recommendation system.

Combining the models

As briefly mentioned, our design was to take the results of the collaborative filtering model, refine them by keeping only what the content-based model agreed with, and present only those results.

An alternative would be to present both sets of results, of course. But after testing, we decided not to do this.
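The filtering design can be sketched roughly like this. The function name, the data shapes, the shoe identifiers and the 0.5 agreement threshold are hypothetical. Only the idea of keeping collaborative filtering results that the content-based model also agrees with comes from the text.

```python
def combine(cf_candidates, content_similarity, viewed_shoe, threshold=0.5):
    """Keep collaborative filtering candidates that the content model also agrees with.

    cf_candidates: list of (shoe_id, cf_score) pairs from the collaborative model.
    content_similarity: dict mapping (shoe_a, shoe_b) to an attribute similarity score.
    """
    kept = [
        (shoe, score)
        for shoe, score in cf_candidates
        if content_similarity.get((viewed_shoe, shoe), 0.0) >= threshold
    ]
    # Rank what is left by the collaborative filtering score, best first.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical scores for a user who is viewing a white tennis shoe.
cf_candidates = [("running-white", 0.91), ("boot-brown", 0.88), ("running-gray", 0.75)]
content_similarity = {
    ("tennis-white", "running-white"): 0.75,
    ("tennis-white", "boot-brown"): 0.10,
    ("tennis-white", "running-gray"): 0.50,
}

print(combine(cf_candidates, content_similarity, "tennis-white"))
# [('running-white', 0.91), ('running-gray', 0.75)]
```

The brown boot scores well on behaviour alone, but it is dropped because its attributes are too far from the shoe being viewed.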

Working with our shoe retailer to capture more user behavior

Another key part of our work with our shoe retailer was to empower them to capture more user behavior. Things like:

How many times the user viewed a shoe
The pattern of viewing times of a given user for a given shoe
The immediate shoes the user viewed after viewing a given shoe
What shoes did the user view much later, if the user came back to the shoe retailer’s website
What specific information of the shoe did the user look at
How long did the user look at specific information of the shoe
Did the user ask questions about the shoe
Did the user add the shoe to his/her cart (this is commonly captured)
Did the user add and then remove the shoe from his/her cart (this is less commonly captured)
And much more

As the shoe retailer’s enhanced website continues to capture this information, it will be cleaned and sanitized, and the collaborative filtering model will be prepared to make use of the data.
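One way the captured behavior could feed the model is by turning raw events into a single weighted score per user and shoe. The sketch below shows the shape of that step only. The event names and their weights are invented for illustration and would need to be tuned against real data.

```python
from collections import defaultdict

# Hypothetical event weights; these are not values from the real system.
EVENT_WEIGHTS = {
    "view": 1.0,
    "view_details": 2.0,
    "ask_question": 3.0,
    "add_to_cart": 4.0,
    "remove_from_cart": -2.0,
    "purchase": 5.0,
}


def score_events(events):
    """Sum the weights of all events for each (user, shoe) pair."""
    scores = defaultdict(float)
    for user, shoe, event_type in events:
        # Unknown event types contribute nothing rather than raising an error.
        scores[(user, shoe)] += EVENT_WEIGHTS.get(event_type, 0.0)
    return dict(scores)


# Hypothetical cleaned event log: (user_id, shoe_id, event_type).
events = [
    ("michelle", "tennis-white", "view"),
    ("michelle", "tennis-white", "view_details"),
    ("michelle", "running-white", "view"),
    ("michelle", "running-white", "add_to_cart"),
    ("michelle", "running-white", "purchase"),
]

print(score_events(events))
# {('michelle', 'tennis-white'): 3.0, ('michelle', 'running-white'): 10.0}
```

These scores would replace the simple 1s and 0s in the user-by-shoe matrix.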

Moving forward and growing with the shoe retailer

The initial results are extremely encouraging. From a website that used a rudimentary recommendation system (the shoe retailer manually tagged shoes which they “felt” were similar), to one that leveraged machine learning, the shoe retailer increased sales by slightly more than 30%. Also, their drop-off rate, which represents the users who came to their site, and did not buy anything, dropped by about 20%.

Of course, these promising results are strongly helped by the fact that the shoe retailer has a strong and diverse inventory, which means that they are somewhat likely to have a product that a user wants, but may not be able to find immediately.

As more user behaviour data is captured and the collaborative filtering model has more data to learn from, we expect this to lead to a much more accurate recommender system.

As a machine learning system, the system scales with the shoe retailer’s business, and correspondingly, the amount of data they have. They do not have to do more work to allow the recommender system to improve.

In short, this means they are even more likely to capture a sale, as long as they have a suitable product to recommend to a user like Michelle.