These 5 Mistakes Ruin Your Machine Learning Algorithm

What are the 5 common mistakes that ruin Machine Learning models? That's what we'll cover together in this article!

5 Machine Learning mistakes

Not analyzing your data

A mistake that we often see among beginners.

Machine Learning algorithms can seem so powerful that it's easy to forget that, sometimes, a simple data analysis is enough to solve the problem.

You should always take the time to study your data.

At Inside Machine Learning, we advise you to use the matplotlib package.

This library makes it easy to plot graphs.

With it, you get a quick, visual overview of your data!
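As a minimal sketch of this idea, here is how a single matplotlib histogram can reveal skew or outliers before any model is trained (the dataset here is hypothetical, generated for illustration):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs anywhere
import matplotlib.pyplot as plt

# Hypothetical dataset: 1,000 samples from a right-skewed distribution
rng = np.random.default_rng(seed=0)
data = rng.exponential(scale=2.0, size=1000)

# A quick histogram shows the shape of the data before any modeling
fig, ax = plt.subplots()
ax.hist(data, bins=30)
ax.set_xlabel("value")
ax.set_ylabel("count")
fig.savefig("distribution.png")

# Mean far above the median already hints at skew, no model needed
print(f"mean={data.mean():.2f}, median={np.median(data):.2f}")
```

A few minutes spent on plots like this often answers questions you were about to throw a model at.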

Overfitting

The classic error in Machine Learning: overfitting!

You probably know this one.

It happens when you train your Machine Learning model so much that it becomes ineffective!

You may ask: how is this possible?

Well, during training, your model improves only by looking at the data you've provided. This can lead it to build solutions that work only on that data and not on any other.

The model is said to fail to generalize.

To avoid this mistake, make sure you evaluate your model on both training data and validation data!

When the model's performance keeps improving on the training data but gets worse on the validation data, that's a clear warning sign!
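As a minimal sketch of this check (using a scikit-learn decision tree as a stand-in for any model), you can watch the gap between training and validation scores grow as the model's capacity increases:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical classification task with some label noise
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Increasing max_depth increases capacity, and thus the risk of overfitting
for depth in (2, 5, 20):
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)
    val_acc = model.score(X_val, y_val)
    print(f"depth={depth}: train={train_acc:.2f}, val={val_acc:.2f}")
```

At the largest depth, the tree memorizes the training set (perfect training accuracy) while the validation score lags behind: exactly the negative signal described above.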

For the more curious, we wrote an article with a concrete example of how to counter overfitting; it's here! 💡

Lack of data

Another common mistake in Machine Learning is the lack of data.

This one is the dread of Data Scientists!

Spending a whole project building a Machine Learning model… only to realize at the end that there isn't enough data.

In reality, this is more an oversight than a mistake. But it is still common to see this problem occur in large projects!

To prevent a lack of data, determine your needs in advance.

But above all, measure the complexity of the task you want to solve.

The more difficult it is, the more data you’ll need.


Too Deep Neural Networks

Here, we get into more technical errors.

In Deep Learning, stacking layers of neurons is a powerful idea, and adding more of them to a model can sometimes work miracles!

The GPT-3 model developed by OpenAI has no fewer than 96 layers of neurons!

This model can solve dozens of tasks such as summarizing books or even starting a conversation with you.

But for simpler problems, a large number of layers is often useless… even counterproductive.

Indeed, it leads to particularly long training times and regularly causes overfitting (a mistake you already know!).

However, if your task truly requires a large number of layers, then we advise you to have a good GPU… or a good amount of calm and patience.

And remember, sometimes the Machine Learning model you want to build already exists online. Before jumping into long hours of work, check whether someone has already done what you want to achieve! 😉

Using the wrong activation function

Last but not least: the choice of activation function.

This is a concept that may not be easy for beginners in Machine Learning.

However, it is fundamental in order to build a good prediction model.

That's why we wrote an article specially dedicated to activation functions, with, as a bonus, a summary table showing which one to choose depending on the problem you're solving.
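As a quick sketch of why the choice matters, here are three common activation functions and the kinds of outputs they produce (the logits below are hypothetical values, chosen only for illustration):

```python
import numpy as np

def sigmoid(z):
    # Typical output activation for binary classification: squashes to (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Typical output activation for multi-class problems:
    # turns scores into probabilities that sum to 1
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def relu(z):
    # Common hidden-layer choice: cheap to compute, zeroes out negatives
    return np.maximum(0.0, z)

logits = np.array([2.0, -1.0, 0.5])
print(sigmoid(logits))
print(softmax(logits), softmax(logits).sum())
print(relu(logits))
```

Picking an output activation that doesn't match your problem (say, a plain linear output where you need probabilities) is a quiet way to sabotage an otherwise good model.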

That's all for this article; we hope you find it useful.

More Machine Learning tips await you in this section!

Tom Keldenich

Data Engineer & passionate about Artificial Intelligence!

Founder of the website Inside Machine Learning
