Loss, Validator, Optimizer & Activator
the four pillars...
Four is the only number whose English name has exactly as many letters as its value.
Losses, Validators, Optimizers & Activators. I call them the four pillars of Machine Learning because the performance of a model depends on these four components. We shall begin with losses.
A loss function's main role is to measure the discrepancy between a model's predicted output and the actual output. The goal of training is to minimize the loss, which means pulling the predicted values close to the actual values. In Supervised Learning, we have pre-labeled data to compare against. But what happens in the case of Unsupervised Learning? Let's consider an example.
We know that unsupervised learning forms clusters/groups rather than predicting labels the way supervised learning does. Let's say we have a dataset of customer transactions, and we shall group them based on their spending style. Let's say we will split them into two groups, for which we have to choose two centroids at random. I'll choose transactions 1 and 4 and calculate the distance from each transaction to these centroids. So for every transaction, we get two distances. I will assign the transaction to the group whose centroid is closer. Then I will recompute each centroid as the average of the transactions in its group and repeat the process until I get the same result for two subsequent iterations, at which point I will end the process.
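The steps above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: each transaction is reduced to a single hypothetical feature (the amount spent), the amounts themselves are made up for illustration, and the function name k_means_1d is my own rather than something from a library.

```python
def k_means_1d(values, initial_centroids, max_iters=100):
    """Group 1-D values around centroids by repeated assign/update steps."""
    centroids = list(initial_centroids)
    previous = None
    assignments = []
    for _ in range(max_iters):
        # Assign every value to the group whose centroid is closest.
        assignments = [
            min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            for v in values
        ]
        # Stop once two subsequent iterations give the same grouping.
        if assignments == previous:
            break
        previous = assignments
        # Move each centroid to the average of the values in its group.
        for k in range(len(centroids)):
            members = [v for v, a in zip(values, assignments) if a == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return assignments, centroids


# Hypothetical amounts spent in six transactions (not the article's dataset).
amounts = [20.0, 25.0, 22.0, 210.0, 190.0, 205.0]

# Transactions 1 and 4 (indices 0 and 3) act as the starting centroids.
assignments, centroids = k_means_1d(amounts, [amounts[0], amounts[3]])
print(assignments, centroids)
```

With these made-up amounts, the low spenders and the high spenders end up in separate groups after the second pass, because the second pass reproduces the grouping of the first.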
You might ask, how is loss useful in this scenario? Here the loss is typically the total squared distance between the transactions and the centroids of their groups. Adjusting the cluster assignments and updating the centroids reduces that loss, which in turn makes the grouping as accurate as we desire. This process of classifying is called
What about Reinforcement Learning? Say you're training a chess bot that receives a positive reward for a good move and a negative reward for a bad move. The loss function in these scenarios is set up so that minimizing it maximizes the total reward, which steers the bot away from the moves that earn negative rewards.
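To make the link between loss and reward concrete, here is a toy sketch. It assumes a bot with only two possible moves, hypothetical rewards of +1 and -1, and a policy-gradient style loss of the form -reward * log(probability of the chosen move). That is one common way to set things up, not the only one, and a real chess bot would be far more involved.

```python
import math
import random


def softmax(prefs):
    """Turn raw preferences into probabilities that sum to 1."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


rewards = [1.0, -1.0]  # hypothetical: move 0 is good, move 1 is bad
prefs = [0.0, 0.0]     # the bot starts with no preference
lr = 0.1               # learning rate (an assumed value)
rng = random.Random(0)

for _ in range(500):
    probs = softmax(prefs)
    move = 0 if rng.random() < probs[0] else 1
    reward = rewards[move]
    # loss = -reward * log(probs[move]); its gradient with respect to
    # prefs[k] is -reward * (indicator - probs[k]).
    for k in range(2):
        indicator = 1.0 if k == move else 0.0
        grad = -reward * (indicator - probs[k])
        prefs[k] -= lr * grad  # step against the gradient of the loss

p_good = softmax(prefs)[0]
print(p_good)
```

Every update, whether it follows a rewarded or a punished move, shifts probability toward the good move, so minimizing this loss and maximizing the reward are the same thing here.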
Yay! We're now good with Losses! Now, let's have a glance at the life of Optimizers.
Optimizers tweak the weights and biases of the neurons in a Neural Network based on the reports given by the Loss Function, so that the loss is minimized, and this continues throughout every epoch of your training. Simply put, the loss function calculates the loss and the optimizer tries to reduce it by tweaking the settings of your model.
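Here is a minimal sketch of that loop, assuming the simplest possible model (a single weight w and bias b), a mean squared error loss, and plain full-batch gradient descent as the optimizer. The data points, learning rate, and function names are all my own choices for illustration.

```python
def mse(w, b, xs, ys):
    """Loss: mean squared error of the line y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_step(w, b, xs, ys, lr):
    """Optimizer: one gradient descent update of w and b."""
    n = len(xs)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return w - lr * grad_w, b - lr * grad_b


# Hypothetical data that follows y = 2x + 1 exactly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = 0.0, 0.0
losses = []
for epoch in range(2000):
    losses.append(mse(w, b, xs, ys))          # the loss reports
    w, b = gradient_step(w, b, xs, ys, 0.05)  # the optimizer tweaks

print(w, b, losses[0], losses[-1])
```

The loss function only reports how wrong the model is. The gradient step is what actually moves w and b, and over the epochs they settle close to 2 and 1.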
Activators simply help the neural network learn complex relations in the data. No matter how many hidden layers a neural network has, if there is no activation function, at the end of the day it is simply a linear equation. But a model cannot learn complex patterns with a simple linear equation, right? So we need an activation function to introduce non-linearity into the model.
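A quick way to convince yourself of this is to stack two tiny layers and compare. In the sketch below, each layer is a single neuron with hypothetical weights, and ReLU stands in for the activation function. Any other non-linear activation would make the same point.

```python
def relu(z):
    """ReLU activation: negative values become zero."""
    return max(0.0, z)


# Hypothetical weights and biases for two one-neuron layers.
w1, b1 = 2.0, -1.0
w2, b2 = -3.0, 0.5


def two_layers_no_activation(x):
    hidden = w1 * x + b1
    return w2 * hidden + b2


def single_linear_equation(x):
    # The two layers collapse into one line: (w2 * w1) * x + (w2 * b1 + b2).
    return (w2 * w1) * x + (w2 * b1 + b2)


def two_layers_with_relu(x):
    hidden = relu(w1 * x + b1)
    return w2 * hidden + b2


for x in [-2.0, 0.0, 1.0, 3.0]:
    print(x, two_layers_no_activation(x), single_linear_equation(x),
          two_layers_with_relu(x))
```

Without the activation, the two layers give exactly the same outputs as one linear equation. With ReLU in between, the outputs differ whenever the hidden value would have been negative, and that bend is what lets the network model more than a straight line.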
Say we have a three-layered neural network with 10 neurons in the first layer, 20 in the second, and 5 in the third. The activation function is not applied to the first layer because it just passes the inputs to the second layer. In the second layer, each neuron calculates the weighted sum of its inputs, and then the activation function is applied to that sum to determine the neuron's output. The activation function helps shape the behavior of the neuron by transforming the weighted sum into a desired range, allowing the neuron to learn complex patterns in the data.
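A forward pass through that 10-20-5 network might look like the sketch below. The random weights, the zero biases, the choice of ReLU, and the decision to leave the third layer's weighted sums untouched are all assumptions made for illustration; the helper names dense and random_layer are hypothetical.

```python
import random


def relu(z):
    return max(0.0, z)


def dense(inputs, weights, biases, activation=None):
    """One layer: a weighted sum per neuron, then an optional activation."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs


def random_layer(n_in, n_out, rng):
    """Random weights in [-1, 1] and zero biases for n_out neurons."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


rng = random.Random(42)
w_hidden, b_hidden = random_layer(10, 20, rng)  # first layer -> second layer
w_out, b_out = random_layer(20, 5, rng)         # second layer -> third layer

# The first layer only passes its 10 inputs along, with no activation.
inputs = [rng.uniform(0.0, 1.0) for _ in range(10)]
hidden = dense(inputs, w_hidden, b_hidden, activation=relu)
output = dense(hidden, w_out, b_out)

print(len(inputs), len(hidden), len(output))
```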
Unlike losses and optimizers, Validators mostly come into the picture after the model's training is complete. Validators tell us how good the model is and how well it performs, and they allow us to compare it with other models. Validators are like the ratings of a model :). The motto of a validator is almost the same in all three types of learning, but with a small difference as follows:
In supervised learning, validators assess the accuracy of prediction.
In unsupervised learning, validators evaluate how well data is clustered together.
In reinforcement learning, validators measure the model's performance in terms of rewards.
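As a rough illustration of the three cases above, here is one simple validator for each. These are stand-ins I picked for the sketch (accuracy, within-cluster sum of squares, and average reward), with made-up inputs; many other validators exist for each type of learning.

```python
def accuracy(y_true, y_pred):
    """Supervised: fraction of predictions that match the labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def within_cluster_ss(points, assignments):
    """Unsupervised: squared distance of 1-D points to their cluster mean.

    Lower values mean tighter clusters.
    """
    total = 0.0
    for cluster in set(assignments):
        members = [p for p, a in zip(points, assignments) if a == cluster]
        centroid = sum(members) / len(members)
        total += sum((m - centroid) ** 2 for m in members)
    return total


def average_reward(episode_rewards):
    """Reinforcement: mean reward collected per episode."""
    return sum(episode_rewards) / len(episode_rewards)


# Hypothetical inputs for each validator.
acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
wcss = within_cluster_ss([1.0, 2.0, 9.0, 11.0], [0, 0, 1, 1])
avg = average_reward([1.0, -1.0, 1.0, 1.0])
print(acc, wcss, avg)
```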
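Here are some examples of losses, validators, and optimizers:
Mean Squared Error (MSE), which can serve as a loss and as a validator
Mean Absolute Error (MAE), which can also serve as a loss and as a validator
Stochastic Gradient Descent (SGD), which is an optimizer rather than a loss or a validator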
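The two error measures in the list above are easy to compute by hand. The sketch below uses made-up actual and predicted values; the function names are my own.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences; large misses weigh more."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences; every miss weighs the same."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical actual and predicted values.
y_true = [3.0, 5.0, 2.0]
y_pred = [2.0, 5.0, 4.0]

mse_value = mean_squared_error(y_true, y_pred)
mae_value = mean_absolute_error(y_true, y_pred)
print(mse_value, mae_value)
```

The errors here are 1, 0, and 2, so MAE averages them directly while MSE averages their squares. The prediction that is off by 2 contributes four times as much to MSE as the one that is off by 1.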
Losses, Validators, Optimizers and Activators play a crucial role in shaping the model. Hence, these components are to be chosen according to the data and the use case of the model.
Until next time, Sree Teja Dusi.