Curriculum learning. Sounds exactly like what happens when you go to school, right? It is. And it turns out that it helps machine learning systems learn better too.
Seems like we got something right when teaching ourselves.
By and large, machine learning systems, whether they use simple neural networks, deep learning, or anything else you may have heard of, depend heavily on two things:
- Quality of data
- Amount of data
The quality of data is of course extremely important. “Garbage in, garbage out”, you may have heard before. It applies to machine learning as well. If you train a machine learning system with garbage, it will learn the garbage and give you back garbage.
The amount of data is important as well, because the machine learning system needs to “see” enough examples from your data in order to figure out how to do what you want it to do.
For instance, if you want your machine learning system to predict which segment of customers is most likely to buy and stay loyal to your business, you need to give it lots of examples of customers who are likely and unlikely to buy, and of customers who do and do not stay loyal. Only then can the machine learning system understand the nuances behind these different types of customers, and predict for you who will buy and stay loyal.
Simple and makes sense, right? Nothing you didn’t already know.
So what is this about curriculum learning?
Right now, if a machine learning system does not do well – for instance, the model above wrongly predicts which customers will buy and/or stay loyal to you – it usually means it hasn’t been trained on enough data.
And the usual and obvious solution? Get more data, and train it better.
However, by mimicking the way we humans learn, and how we teach our kids in school, we may be able to help the machine learn better without more data. In fact, using the same data but simply having the machine learn in a different way can make it perform better.
That is fantastic, because one of the greatest challenges in machine learning for many businesses is getting enough data in the first place.
How does curriculum learning work?
If you’ve gotten to this point, you probably want to know a little about how the way we humans learn can make our machine learn better.
Turns out, it’s not very difficult at all.
Teaching a child how to recognize cars
Now imagine with me how you would teach your child. Say you want to teach her what a car is. How would you do it? Perhaps every time you see a car on the road, you’ll point to it and tell your child that’s a car. When you read a book together, and there’s a picture of a car, you’ll point it out as well. When you drive her to school, you’ll emphasize that you’re going to the car. Basically, car, car, and car.
What you won’t do at the start is say that’s a Honda, or a Toyota, or a Ferrari. Neither will you say that’s an off-road vehicle, or a sports car, or a 2-door sedan, or a convertible. Basically, you start with the broadest possible label – a “car”. Everything is simply a “car”.
Only after your child has figured out how to recognize cars will you go deeper and break it down into, say, types of cars, or brands of cars, or colours of cars.
Teaching a machine with curriculum learning
So with the machine, we do exactly the same thing. We use your data, and give it broad labels first. Just like a “car”. Then we train it.
Next, we use the exact same data, but give it more refined labels. A “sedan”, a “convertible”, a “sports car”, an “off-road vehicle”. Then we train it again.
And we continue to do this until we reach the level of detail we want to achieve. Say, for instance, a “red Honda sports car”.
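To make the idea concrete, here is a minimal Python sketch of that broad-to-detailed loop. Everything in it is an assumption made for illustration: the `HIERARCHY` table, the file names, and the `train_stage` callback are hypothetical. The `record_stage` function is only a placeholder that notes which labels each stage saw. In a real project you would swap in your own training step that keeps updating the same model.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical hierarchy: each detailed label lists its labels from
# broadest (index 0) to most detailed (last index).
HIERARCHY = {
    "red Honda sports car": ("car", "sports car", "red Honda sports car"),
    "blue Toyota sedan": ("car", "sedan", "blue Toyota sedan"),
    "green Jeep off-road vehicle": (
        "car", "off-road vehicle", "green Jeep off-road vehicle"),
}

Example = Tuple[str, str]  # (input, detailed label); inputs are just file names here
State = Optional[List[List[str]]]


def relabel(examples: Sequence[Example], level: int) -> List[Example]:
    """Return the same inputs with labels coarsened to the given level."""
    return [(x, HIERARCHY[y][level]) for x, y in examples]


def train_with_curriculum(
    examples: Sequence[Example],
    train_stage: Callable[[State, List[Example]], State],
    num_levels: int = 3,
) -> State:
    """Train on the same data once per level, broadest labels first."""
    state: State = None
    for level in range(num_levels):
        # Same inputs every time; only the labels get more detailed.
        state = train_stage(state, relabel(examples, level))
    return state


def record_stage(state: State, staged: List[Example]) -> State:
    """Placeholder for a real training step: records the label set it saw."""
    history = list(state or [])
    history.append(sorted({label for _, label in staged}))
    return history


EXAMPLES = [
    ("img_001.jpg", "red Honda sports car"),
    ("img_002.jpg", "blue Toyota sedan"),
    ("img_003.jpg", "green Jeep off-road vehicle"),
]

history = train_with_curriculum(EXAMPLES, record_stage)
for stage, labels in enumerate(history, start=1):
    print(f"stage {stage}: {labels}")
```

The point of the sketch is the shape of the loop: the data never changes, the state from one stage is handed to the next, and only the labels get finer.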
Is this better?
Yes. And the results are pretty consistent.
For almost any kind of training, adding curriculum learning helps the machine learning system do better. In some cases, the improvement is more than 10%.
Of course, there is some additional effort in adding the broad labels to your data set. However, in our experience, these can usually be derived quite quickly, and in most cases, automated to a high degree. Since we are already working with the detailed labels, the broad labels can usually be worked out from them by following a few rules. And since this is rule-based, it is generally easy to automate.
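As a rough illustration of what such rules could look like, the Python sketch below derives broad labels from a detailed one by simple keyword matching. The `BODY_TYPES` list and the `broad_labels` function are hypothetical names invented for this example, and real data sets will likely need rules that fit their own labels.

```python
from typing import Tuple

# Hypothetical rule set: body types to look for inside a detailed label.
BODY_TYPES = ("sports car", "off-road vehicle", "convertible", "sedan")


def broad_labels(detailed_label: str) -> Tuple[str, str]:
    """Return (broadest label, body-type label) for a detailed label.

    Falls back to "car" for the body type when no rule matches.
    """
    text = detailed_label.lower()
    for body_type in BODY_TYPES:
        if body_type in text:
            return ("car", body_type)
    return ("car", "car")


for label in ("red Honda sports car", "blue 2-door sedan", "vintage roadster"):
    print(label, "->", broad_labels(label))
```

Labels that no rule recognizes fall back to the broadest label, so they can still be used in the first training stage and reviewed by hand later.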
What this means is that this method, while requiring a little bit more effort, nearly guarantees you a better AI model. It can easily be worth it. It is something we consider, and if possible apply, when we build machine learning models, for ourselves and for our clients.