Well, the internet is powerless without data. Quite similarly, recommendation systems require tons of data. Not a simple dataset you can train on once and use forever: it must be updated with data every second. Let's assume you watched Tom & Jerry on YouTube and liked it. That makes a positive impact and tells the recommendation algorithm that you like Tom & Jerry. Then you watched another video, just for a few seconds, and closed it. That makes a negative impact and tells the recommender that you didn't like the video. Broadly, there are two types of recommendation systems:

**Collaborative Filtering:** This method makes recommendations based on user behavior. It identifies patterns by analyzing user-item interactions. If users A and B have similar preferences and have liked or purchased similar items, the system will recommend items liked by one to the other.

**Content-Based Filtering:** In this approach, recommendations are made based on the characteristics or content of the items and the user's profile. For instance, if you've shown interest in science fiction movies, the system will recommend more science fiction movies.

While recommendation systems offer incredible advantages, they're not without their challenges. One major hurdle is the "filter bubble" effect, where users are exposed only to content that aligns with their existing beliefs and preferences, potentially limiting diverse viewpoints. So, next time you're presented with that perfect movie or book, remember, it's not magic; it's the remarkable science of recommendation systems.

Till next time, Sree Teja Dusi.

]]>Natural Language Processing is where human expression meets computational intelligence, unlocking the language of possibilities. ~ ChatGPT

Yeah, the above quote was given by the overrated AI chatbot, ChatGPT. So how do these chatbots work? How do they understand human language, and how do they respond to it? Well, it's termed Natural Language Processing, which simply means making a machine process our (natural) language.

Machines can only understand zeroes and ones, but let's not stick to 0s and 1s; let's include other numbers in our discussion (*not imaginary ones, though :p*). Say we have the sentence *I love Machine Learning!* It has 4 words: `I`, `love`, `Machine`, `Learning`. Let's add them to our dictionary as 1 for `I`, 2 for `love`, 3 for `Machine`, and 4 for `Learning`. Now, consider another sentence, *I also love chocolate*. It contains the words `I`, `also`, `love`, and `chocolate`. This time, we only add the words `also` and `chocolate` to our dictionary, because it already contains `I` and `love`.

Once we have such a dictionary, the sentences are mapped through it to produce sequences of numbers instead of sentences. For instance, *I love Machine Learning!* becomes `1 2 3 4` and *I also love chocolate* becomes `1 5 2 6`. These sequences are further processed and fed to a model.
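This dictionary-building idea can be sketched in a few lines of plain Python (the function and variable names here are just for illustration; Keras's `Tokenizer` does this and more for us):

```python
def build_vocab(sentences):
    """Assign each new word an incrementing integer id, starting at 1."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().replace("!", "").split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab

sentences = ["I love Machine Learning!", "I also love chocolate"]
vocab = build_vocab(sentences)
# vocab -> {'i': 1, 'love': 2, 'machine': 3, 'learning': 4, 'also': 5, 'chocolate': 6}
sequences = [[vocab[w] for w in s.lower().replace("!", "").split()] for s in sentences]
# sequences -> [[1, 2, 3, 4], [1, 5, 2, 6]]
```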

Great! That's how NLP works (*in a nutshell; there are a lot more processing techniques 🔥*). Now let's build a small model that can classify our text as positive or negative sentiment.

We need a dataset(*obviously!!*). The dataset I used for this tutorial is "IMDb dataset" which has 50k reviews! You can download it here. Let's begin by reading the dataset,

```python
dataset = pd.read_csv('../Datasets/IMDB_Dataset.csv')
print(dataset.head())
```

and clean our dataset by changing the word `positive` to 1 and `negative` to 0:

```python
clean_dataset = dataset.replace('positive', 1).replace('negative', 0)
print(clean_dataset.head())
```

splitting into train and validation. *I chose to split 80% for training and 20% for validation.*

```python
splitValue = 0.8
train = clean_dataset.sample(frac=splitValue)
validation = clean_dataset.drop(train.index)
train_labels = np.array(train['sentiment'])
validation_labels = np.array(validation['sentiment'])
```

Now there are three important phases in building our model:

1. Tokenizing

2. Padding

3. Building...yay!

As we discussed before, every text has to be converted into a stream of numbers, which we call Tokenization. Firstly, we create our vocabulary,

```python
tokenizer = tf.keras.preprocessing.text.Tokenizer(oov_token='OOV')  # create the Tokenizer, passing an OOV token
tokenizer.fit_on_texts(clean_dataset['review'])  # fit the Tokenizer on our reviews to generate a vocabulary
```

💡 OOV is an acronym for `out-of-vocabulary`, a placeholder for words that aren't in the vocabulary.

Once we have the vocabulary ready, we transform all of our sentences into sequences as follows:

```python
train_sequences = tokenizer.texts_to_sequences(train['review'])
validation_sequences = tokenizer.texts_to_sequences(validation['review'])
```

Sentences don't all come in the same length, so when we convert them to sequences of numbers, the sequences have different lengths too. But a model requires a uniform shape across its data. So, we pad the sequences, simply adding zeroes at the beginning or the end, through which we achieve data with a uniform shape.

and it's done as follows,

```python
train_padded = tf.keras.preprocessing.sequence.pad_sequences(train_sequences, maxlen=120, truncating='post')
validation_padded = tf.keras.preprocessing.sequence.pad_sequences(validation_sequences, maxlen=120, truncating='post')
```

Here, `maxlen` is the maximum length a sequence can have, and `truncating` is the side (*pre/post*) from which sequences longer than `maxlen` are cut. (The side on which zeroes are added is controlled by a separate `padding` argument, which defaults to `'pre'`.)
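To make this concrete, here is a hand-rolled stand-in for `pad_sequences` with pre-padding and post-truncation, using a tiny `maxlen=6` instead of 120 (illustrative only):

```python
def pad(sequences, maxlen):
    """Prepend zeroes to short sequences; cut long ones after maxlen tokens."""
    out = []
    for seq in sequences:
        seq = seq[:maxlen]                           # truncating='post': keep the front
        out.append([0] * (maxlen - len(seq)) + seq)  # padding='pre': zeroes at the start
    return out

pad([[1, 2, 3, 4], [1, 5, 2, 6, 7, 8, 9]], maxlen=6)
# -> [[0, 0, 1, 2, 3, 4], [1, 5, 2, 6, 7, 8]]
```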

finally, we build and compile the model as follows:

```python
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(200000, 16, input_length=120),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
```

**Embedding** is a layer especially used in NLP to convert integers to vector representations, which help extract features from the data.

**Flatten** reduces data of multiple dimensions to a single dimension.

eg: [[[1,2,3],[4,5,6]]] --> [1,2,3,4,5,6]

**Dense** is a layer that takes input from the previous layer, applies weights and a bias, and passes the result to the next layer.

**ReLU** stands for Rectified Linear Unit and is an activation function which returns 0 if the input is negative, and returns the input itself if it is zero or positive.

**Sigmoid** is an activation function that takes any real-valued number as input and squashes it to a range between 0 and 1. Specifically, positive values are mapped close to 1, negative values are mapped close to 0, and the value 0 is mapped to exactly 0.5. This property makes sigmoid suitable for binary classification problems.
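Both activations are simple enough to write out by hand (an illustrative sketch in plain Python, not TensorFlow's implementation):

```python
import math

def relu(x):
    # 0 for negative inputs, the input itself otherwise
    return max(0.0, x)

def sigmoid(x):
    # squashes any real number into (0, 1); sigmoid(0) == 0.5
    return 1.0 / (1.0 + math.exp(-x))

relu(-3.0)    # -> 0.0
relu(2.5)     # -> 2.5
sigmoid(0.0)  # -> 0.5
```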

```python
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(train_padded, train_labels, epochs=10,
          validation_data=(validation_padded, validation_labels))
```

**Adam** is an optimization algorithm and **Binary Crossentropy** is a loss function for binary classification tasks.

Yayyyy! We're done! You can get the notebook file here.

Until next time, Sree Teja Dusi.

]]>Science is the systematic classification of experience ~George Henry Lewes

Let's bring a bit of fun out of ML today, by classifying images into Dogs and Cats. For that, we'll be using a Deep Neural Network(*a neural net that has more than one hidden layer. You can read about it* *here**.*)

💡

We'll be using TensorFlow for Python in this tutorial (just a few lines of code).

I have uploaded a dataset with images of cats and dogs, which is provided by Tensorflow, here. You can download it and get started.

An image is a matrix arrangement of colors. Consider an example,

zooming a bit,

Your image is made of such boxes, where each box is a color with three values: Red, Green, and Blue, each ranging from 0 to 255. So if we have an image of 10x10 pixels, we get a total of 100 pixels with 3 values each, i.e. 300 values for a small 10x10 picture (*YouTube's maximum resolution of 3840x2160 pixels gives 24,883,200 values*). We humans can't depict an image by just reading the numbers, but machines can! And hence, we extract the features of an image from its colors and feed them to a neural net.

Before we design the model, let's define our variables,

```python
train_data_dir = "../Datasets/catsanddogs_tensorflow/train/"
test_data_dir = "../Datasets/catsanddogs_tensorflow/validation/"
width, height = 150, 150
shape = (width, height, 3)
batch_size = 32
epochs = 10
```

You can replace the above paths with the path of your dataset. `width` and `height` are the desired width and height we resize every image to, which must be uniform across the data. `shape` has an extra `3` because each pixel has 3 values: `Red`, `Green`, and `Blue`; for grayscale images, we pass `1` instead. `batch_size` is the number of images we feed per batch during training (*we can't process thousands of images at once 🥲*). `epochs` is the number of times we cycle through the data, which is `10` in this case.

We have many images of different dimensions, which a neural network won't accept: it must be fed data of uniform dimensions, so we preprocess the images down to our desired size. We can also transform images on the fly without saving them (*a feature provided by TensorFlow*), through which we have more data to train on, leading to increased accuracy.

And, we know that each pixel has three values ranging from 0 to 255. Models with such large values take longer to train and need heavy processing power. Hence, we scale them to fall between 0 and 1 as follows,

```python
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1.0 / 255,
    shear_range=0.2,       # shearing the images
    zoom_range=0.2,        # zooming the images
    horizontal_flip=True,  # images are flipped horizontally
)
test_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1.0 / 255,     # augmentation isn't required for test data
)
```

💡 `shear_range` controls the amount of shearing (a slanting distortion of the image), and `zoom_range` controls the amount of zoom. There are many such parameters, and you can read about them here.

Now that we have defined how to process our images, we need to pass the directory of our images to the generators:

```python
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(width, height),
    batch_size=batch_size,
    class_mode="binary",  # binary because we only have 2 classes: Dogs and Cats
)
test_generator = test_datagen.flow_from_directory(
    test_data_dir,
    target_size=(width, height),
    batch_size=batch_size,
    class_mode="binary",
)
```

In the generators, images are scaled to a size of `150x150` and fed `32` at a time.

💡 The class mode is `binary` because we only have two classes: `cat` and `dog`. For multiple classes, we pass `categorical`.

Now that we have generated images that suit our training, let's build our neural network.

Our neural network is defined as follows,

```python
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu", input_shape=shape),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(128, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
```

Woooaaaaahhh! That's a bunch of layers! Don't be afraid; it's simple when we break it into smaller chunks. Let's talk about activation functions later and understand the layers first.

`Conv2D` is a Convolutional layer, which extracts features from the image by applying a specified number of filters (*32, 64, and 128 in this case*). `(3, 3)` is the size of the filter kernel. The first layer, being the input layer, accepts a shape of `(150, 150, 3)`.

The image depicts feature extraction using an operation called `Convolution`.

💡 Specifying the shape for every layer is not necessary; each subsequent layer infers its input shape from the previous layer's output.

`MaxPooling2D` is a data compressor: it slides a window of `2x2` (*in this case*) over the data and keeps the maximum value in each window, which in turn condenses the image.
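To see what max pooling does, here is a hand-rolled sketch on a 4x4 matrix (plain Python for illustration; the real layer operates on batched tensors):

```python
def max_pool_2x2(matrix):
    """Keep the maximum of each non-overlapping 2x2 window."""
    pooled = []
    for i in range(0, len(matrix), 2):
        row = []
        for j in range(0, len(matrix[0]), 2):
            row.append(max(matrix[i][j], matrix[i][j + 1],
                           matrix[i + 1][j], matrix[i + 1][j + 1]))
        pooled.append(row)
    return pooled

max_pool_2x2([[1, 3, 2, 4],
              [5, 6, 1, 2],
              [7, 2, 9, 1],
              [3, 4, 0, 8]])
# -> [[6, 4], [7, 9]]  (a 4x4 input condenses to 2x2)
```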

`Flatten` is a layer that flattens the data from any dimension to one dimension, which in turn makes it easier for a `Dense` layer to operate.

A `Dense` layer connects each neuron in its layer to every neuron in the previous layer. It takes all the inputs from the previous layer, applies weights to them, sums them up, adds a bias term, and then passes the result through an activation function. The output of a dense layer is a set of values that represent the learned features and relationships in the data, which can be used for making predictions or further processing in the neural network. `512` and `1` are the numbers of neurons in the dense layers in this case.

A `Dropout` layer drops random outputs from the previous layer to prevent over-fitting and to keep the model robust (*the dropping follows no pattern; it happens randomly at each iteration*).

`ReLU`, short for `Rectified Linear Unit`, is an activation function that returns `0` if the input is negative and returns the input itself if it is zero or positive. `Sigmoid` is an activation function that takes any real-valued number as input and squashes it to a range between 0 and 1: positive values are mapped close to 1, negative values close to 0, and the value 0 is mapped to exactly 0.5. This property makes sigmoid suitable for binary classification problems.

we compile the model as follows,

```python
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

`adam` (*short for Adaptive Moment Estimation*) is an optimization algorithm used to update the weights of a neural network during the training process. `binary_crossentropy` is a loss function used especially in binary classification tasks, where the goal is to predict whether an input belongs to one class (usually represented as 1) or the other (usually represented as 0). It is commonly employed when you have a binary output and want to compare the predicted probabilities to the true binary labels: it measures the dissimilarity between the two.
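As a rough intuition sketch (not TensorFlow's exact implementation), the binary crossentropy of a single predicted probability `p` against a true label `y` is `-[y*log(p) + (1-y)*log(1-p)]`:

```python
import math

def binary_crossentropy(y_true, p_pred):
    # small loss when the predicted probability agrees with the label,
    # large loss when it confidently disagrees
    return -(y_true * math.log(p_pred) + (1 - y_true) * math.log(1 - p_pred))

round(binary_crossentropy(1, 0.9), 3)  # confident and correct -> 0.105
round(binary_crossentropy(1, 0.1), 3)  # confident and wrong   -> 2.303
```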

training the model,

```python
history = model.fit(
    train_generator,
    steps_per_epoch=len(train_generator),
    epochs=epochs,  # 10 in this case
    validation_data=test_generator,
    validation_steps=len(test_generator),
)
```

`steps_per_epoch` and `validation_steps` specify the number of batches to process from the data in each epoch. Here's what the graph looks like:

Yay!! You have successfully trained a model to classify images of cats vs. dogs. You can refer to my notebook here to learn how to predict custom images from your gallery.

Until next time, Sree Teja Dusi.

]]>Four is the only number that has the same number of letters as its value.

Losses, Validators, Optimizers & Activators. I call them the four pillars of Machine Learning because the performance of a model depends on these four. We shall begin with losses.

Losses/ Loss functions' main role is to measure the discrepancy between the predicted output and the actual output of a model. The goal of a model is to minimize the loss function, which means, pulling the predicted value close to the actual value. In Supervised Learning, we have pre-labeled data. But what happens in the case of Unsupervised Learning? Let's consider an example,

We know that, unsupervised learning forms clusters/groups rather than strictly predicting stuff like supervised learning. Let's say we have a dataset of customer transactions as follows,

| Transaction Id | Price | Frequency (per month) |
| --- | --- | --- |
| 1 | 50 | 1 |
| 2 | 30 | 2 |
| 3 | 80 | 8 |
| 4 | 20 | 1 |
| 5 | 70 | 7 |

and we shall group them based on their spending style. Say we classify them into two groups, for which we have to choose two `centroids` at random. I'll choose transactions 1 and 4 and calculate the distance from each transaction to these centroids, so for every transaction we get two distances. I assign each transaction to the group whose centroid is closer, and I repeat the process until two subsequent iterations give the same result, at which point I end the process.

You might ask: how is loss useful in this scenario? Well, the loss function is used to adjust the cluster assignments and update the centroids, which in turn makes the classification as accurate as we desire. This process of classifying is called `K-Means Clustering`.
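The clustering loop described above can be sketched in a few lines (plain Python, purely illustrative, not a production k-means):

```python
def k_means(points, centroids, iters=10):
    """Toy k-means: assign each point to its nearest centroid,
    recompute each centroid as its cluster's mean, repeat."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return clusters, centroids

# (price, frequency) rows from the transactions table; start from transactions 1 and 4
points = [(50, 1), (30, 2), (80, 8), (20, 1), (70, 7)]
clusters, centroids = k_means(points, [(50, 1), (20, 1)])
# the first cluster gathers the bigger spenders, the second the smaller ones
```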

What about Reinforcement Learning? Say you're training a chess bot and you receive a positive reward for a good move and a negative reward for a bad move. The loss function in these types of scenarios helps in maximizing the reward points by reducing the negative rewards.

Yay! We're now good with Losses! now, let's have a glance at the life of Optimizers.

Optimizers tweak the weights and biases of neurons in a Neural Network, based on the reports given by Loss Functions so that the loss is minimized and this continues for every epoch in your training. Simply, the loss function calculates the loss and the optimizer tries to reduce the loss by tweaking the setting of your model.

Activators simply help the neural network learn complex relations in the data. No matter how many hidden layers a neural network has, without an activation function it is, at the end of the day, simply a linear equation. But a model cannot learn complex patterns with a simple linear equation, right? So we need an activation function to introduce non-linearity into the model.

Say we have a three-layered neural network with 10 neurons in the first layer, 20 in the second, and 5 in the third. The activation function is not applied to the first layer because it just passes the inputs to the second layer. In the second layer, each neuron calculates the weighted sum of inputs, and then the activation function is applied to the neuron to determine the neuron's output. The activation function helps shape the behavior of the neuron by transforming the sum of weighted inputs into a desired range, allowing the neuron to learn complex patterns in the data.

Unlike losses and optimizers, Validators come into the picture after the model's training is complete. Validators tell us how good the model is and how well it performs, and allow us to compare models. Validators are like the *ratings of a model :)*. The role of a validator is almost the same in all three types of learning, but with a small difference as follows:

In supervised learning, validators assess the accuracy of prediction.

In unsupervised learning, validators evaluate how well data is clustered together.

In reinforcement learning, validators measure the model's performance in terms of rewards.

Here are some examples of validators and losses:

**Losses:** `Mean Squared Error (MSE)`, `Mean Absolute Error (MAE)`, `Crossentropy`

**Optimizers:** `Stochastic Gradient Descent`, `RMSprop`, `Adam`

**Activators:** `Sigmoid`, `tanh`, `ReLU`, `Softmax`

**Validators:** `Accuracy`, `Precision`, `R-Squared`
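To get a feel for the first two losses listed above, here they are computed by hand (plain Python with made-up illustrative values):

```python
def mse(y_true, y_pred):
    # average of squared errors: punishes large errors more heavily
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    # average of absolute errors: every unit of error counts equally
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, 5.0, 2.0]
y_pred = [2.0, 5.0, 4.0]
mse(y_true, y_pred)  # -> (1 + 0 + 4) / 3
mae(y_true, y_pred)  # -> (1 + 0 + 2) / 3
```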

Losses, Validators, Optimizers and Activators play a crucial role in shaping the model. And hence, these metrics are to be chosen according to the data and the use case of the model.

Until next time, Sree Teja Dusi.

]]>Your brain doesn't manufacture thoughts. Your thoughts shape neural networks.

It was fun predicting house prices from the area, wasn't it? But the real fun is in cooking the food rather than eating it. I used neural networks to predict housing prices from the area, and now let's learn how neural networks actually work. Here's my previous post: Linear Regression: Housing Prices Prediction.

A neural network is a collection of layers, where each layer holds a specific number of neurons that help in predictions. The following is a basic diagram of a neural network. Neural Networks fall under the category of Supervised Machine Learning.

It contains an input layer, a hidden layer, and an output layer. In every layer, there are circular representations, which are neurons. The input layer takes the input, makes some calculations, and outputs the result, which is given as input to the hidden layer. The hidden layer takes that input (*the output of the input layer*), makes calculations, and passes its result to the output layer, which finally produces the output. There are two important points here to notice:

Though there are multiple lines going out of one neuron, they all carry the same value, fed to multiple neurons in the following layer.

The hidden layer is called the hidden layer because you don't play with it, i.e. it is neither the input layer nor the output layer.

Let's zoom into the network,

The neuron, also called a Perceptron, is a very important component of a neural net.

Assume we have a situation, *"You wanted to go Shopping* 🛍*"*, and It depends on three factors: You are accompanied by your friends; You own a vehicle; The weather is good.

Now, these three factors contribute to your decision-making about going shopping. If the weather is bad, you don't walk out of your house. This means the weather has more weight on your decision as compared to the other two factors. These are called **weights**. weight is a parameter that in simple terms is defined as *the extent of importance an input has in making the decision*.

Now everyone doesn't feel the same about shopping...Some might feel it exciting, some might feel it boring. This is where the **bias** comes in. The person who is more interested in shopping will consider a greater bias and those who feel it boring consider a less bias.

Since we already have historic data to work with, it'll help us in understanding the user's interest in shopping and hence for the model's prediction to match the actual value, the bias is a very helpful element.

💡 In this toy example, the weights fall between 0 and 1, but in general both weights and biases can take any real value.

Let's take an example. The weights are as follows:

- *you are accompanied by your friends*: **0.7**
- *you own a vehicle*: **0.9**
- *the weather is good*: **0.6**

and the inputs are as follows:

- *You are not accompanied by your friends*, so: **0**
- *You own a vehicle*, so: **1**
- *The weather is awesome!*, so: **1**

now, you calculate the total weight of the neuron by,

$$\sum_i (weight_i \times input_i) + bias$$

which leaves us,

$$[ 0(0.7) + 1(0.9) + 1(0.6) ] - 1.3$$

I personally am not much into shopping, so I randomly gave a bias of -1.3. Upon solving the above, we get a value of **0.2**. I only go shopping if I ran out of chips, which is very rare, so let's say I have a threshold of 3.9 to go shopping. As my total (*which is 0.2*) is less than my threshold, I will not go shopping.
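The decision above can be sketched as a tiny perceptron (all values are the made-up ones from this example):

```python
def perceptron(inputs, weights, bias, threshold):
    """'Fire' (decide to go shopping) only if the weighted sum
    of the inputs, plus the bias, reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return total, total >= threshold

# friends = 0, vehicle = 1, good weather = 1, with my made-up bias and threshold
total, go_shopping = perceptron([0, 1, 1], [0.7, 0.9, 0.6], bias=-1.3, threshold=3.9)
# total -> 0.2 (up to floating-point noise), go_shopping -> False
```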

While training your model, the weights and biases are automatically adjusted to make predictions close to the actual values, which we measure as **accuracy**. The threshold, sometimes, is defined by the model or by the programmer. You might be wondering how these weights are defined in the first place and how they are manipulated; let's see how.

The magic lies in the hidden layers of your neural network, which simply are the layers standing between your input and output layers. Once you have given the input to the network, it randomly assigns weights and biases to your inputs, then calculates the weighted sums and passes them to the next layer. In this way, deep through your layers, the data becomes narrower and decisions become easier. The model adjusts the weights and biases with the help of the **Activators** & **Loss Functions**, and this way of adjusting them is called **Backpropagation**.

Neural Nets simply are mathematical fellows that decide the output on how much you care about an input. I think it's too much knowledge for today, so I'll talk about Activators and Loss Functions in my next article.

Until next time, Sree Teja Dusi.

]]>Regression to the Mean (phr.) : no matter how bad things get or how good, things always come back to middle.

Linear Regression is one of the simplest methods in Supervised Machine Learning, so let us put it into practice by predicting a house's price from its area. The two prerequisites are the Python programming language and patience (*you might not get it the first time :x*).

Let's break the recipe into chunks for peace of mind 😌:

1. Breathe in... Breathe out
2. Gathering stuff
3. Cleaning the Data
4. Normalization
5. Training 🎉

Yayy! We're halfway done(of the half 😅).

*With great ~~power~~ tools comes great ~~responsibility~~ achievements*. We need a dataset, a framework, and peace of mind to build ML models. Now that we have peace of mind from the 1st step, let's go get our dataset. The dataset I suggest trying Linear Regression on is Housing Prices, with TensorFlow as the framework; the reason, I'll share with you in another post. Now, let's clean our dataset.

What? & Why?. I know these two questions are swirling in your mind. Firstly, I'll answer the "What" question.

*What is Cleaning the Data?*

Well, data isn't always 100% complete. There are incomplete fields and multiple types of data (viz. numbers, true/false, text). Cleaning the data is simply transforming it into a form that's easier for you to use.

*Why is cleaning data necessary?*

When there are null/incomplete fields in your data, that does not end up well with your model's training, or when there is text in your data, your model doesn't accept it. So *clean it, before you use it.*

Let's begin by importing the necessary packages

```python
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
```

and reading our dataset.

```python
dataset = pd.read_csv('../Datasets/HousingPrices.csv')
dataset.head()
```

It's time to clean.

```python
dataset.drop(columns=['mainroad', 'guestroom', 'basement', 'hotwaterheating',
                      'airconditioning', 'prefarea', 'furnishingstatus'], inplace=True)
```

We don't need the above columns because we don't need them. Just kidding. We're dealing with numbers, so we don't need them.

You might be like: why am I even doing this? 🥲 In our data, especially "area", the values are large numbers of 4 to 5 digits. Training on such huge figures costs time, and you might end up with a less accurate model (*that's a long explanation for "messing up"*). So how about we scale the areas down to the range 0 to 1? And that's what we call **Normalization**. There's a simple formula for doing it:

$$[data - min(data)]/[max(data)-min(data)]$$
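Applying that formula to a few made-up areas by hand (a tiny illustrative sketch):

```python
def min_max_normalize(values):
    # squashes values into [0, 1]: the minimum maps to 0, the maximum to 1
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

min_max_normalize([2000, 4000, 6000])  # -> [0.0, 0.5, 1.0]
```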

before we use that formula, we divide our data into "Train" and "Test" datasets. Because we're going to evaluate our model in the future. The common ratio is 70:30 or 80:20.

```python
train_dataset = dataset.sample(frac=0.7)          # extracts 70% of the data
test_dataset = dataset.drop(train_dataset.index)  # extracts the remaining 30%

# Data Normalization
train_features = (train_dataset - np.min(train_dataset)) / (np.max(train_dataset) - np.min(train_dataset))
test_features = (test_dataset - np.min(test_dataset)) / (np.max(test_dataset) - np.min(test_dataset))

# .pop removes the price column from the dataframe and stores it in the following variables.
# We need labels because this is Supervised Machine Learning. Read my first post
# "A Glimpse into Machine Learning" to know what that is!
train_labels = train_features.pop('price')
test_labels = test_features.pop('price')
```

The following is another way to normalize the data: standardization, which divides the difference between the data and its mean by the standard deviation

$$[data - mean(data)]/std(data)$$

```python
area = np.array(train_features['area'])
area_normalizer = tf.keras.layers.Normalization(input_shape=[1,], axis=None)
area_normalizer.adapt(area)
```

However, this method outputs the best results for our dataset.

The time has finally come...To become a Typical Indian Baba(they usually predict your future without ML 😆).

Now here's the model we're going to train:

```python
model = tf.keras.Sequential([
    area_normalizer,
    tf.keras.layers.Dense(units=64, activation='relu'),
    tf.keras.layers.Dense(units=64, activation='relu'),
    tf.keras.layers.Dense(units=1)
])
```

Sequential is a model type that will easily allow you to create a Neural Network by stacking the layers sequentially. The layers we provide in Sequential are our Normalization layer and three dense layers, which is our magical predictor! I will explain more about these dense layers in my next post.

Now, you compile the model,

```python
model.compile(optimizer=tf.keras.optimizers.legacy.Adam(learning_rate=0.001),
              loss='mean_absolute_error',
              metrics=['mae'])
```

optimizer: The goal of an optimizer is to find the optimal set of values that result in the best performance of the model on the given task.

loss: the loss function (also known as the cost function or objective function) measures how well your model performs on a particular task.

metrics: they just measure your model's performance and contribute nothing to the training itself.

now, you fit the model to your data,

```python
model.fit(train_features['area'], train_labels, epochs=500, validation_split=0.2)
```

You provide features and labels, the number of times it goes through the whole data(termed '*epochs'*), and the percentage of split to validate itself in training.

then you can predict values by,

```python
y = model.predict([64986])
```

Here's what the Regression Line looks like:

That's it! We're done! You can have a complete look at the code here. In my next post, I'll be writing about Behind the Scenes of Linear Regression 🤩. Stay tuned!

*Datasets from* *kaggle.com* *&* *datasetsearch.research.google.com**. For more insights on Tensorflow, visit:* *tensorflow.org*

Until Next Time, Sree Teja Dusi

]]>Numbers constitute the only universal language. ~ Nathanael West.

According to a legend, Mathematics is the Mother of Machines. From Charles Babbage's design of the first mechanical computer in 1822 to modern supercomputers reaching 1,102 petaflops of computational power, mathematics has been the linchpin of the operation. And today, we are amazed by the miracles of Machine Learning, which are only possible through the involvement of Math.

Well, yes. You might wonder if someone is sitting inside your computer and processing all the dumb questions you ask (*just kidding :p*). Well, it's just some weird calculus and statistics behind it.

Let's just put it with an example, say you're trying to predict the price of CocaCola in the coming 40 years, and you have a dataset of costs from the previous 100 years in one hand and a bag of chips in the other. Now, after you're done with your chips, You'll give the data to a model that tells you to save money for buying your favorite cold drink in the future. But how?

The straightforward one you can consider implementing is Linear Regression (*Time Series Forecasting is a bit complex for a first example :]*). Linear Regression makes a scatter plot of the independent variables (like the market and time) against the dependent variable (the price, in this case), then draws a regression line through the data (*a linear equation, `y = mx + b`*) so that the Mean Squared Error (*the average of all the squared errors*) is minimal.

Now, you can extend the regression line further and predict values. It is never 100% accurate, but you get closer to the true value (so it is called Prediction 🥲).
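The line-fitting step can be sketched with the closed-form least-squares solution (the numbers are made up, just to show the mechanics):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + b: the closed-form slope and
    intercept that minimize the mean squared error."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - m * mean_x
    return m, b

# made-up yearly prices that grow roughly linearly
years = [0, 1, 2, 3, 4]
prices = [10, 12, 14, 16, 18]
m, b = fit_line(years, prices)
# m -> 2.0, b -> 10.0; extending the line, the price at year 40 is m*40 + b
```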

You will be a master of ML if you are good at some important topics of Mathematics, some of which include Statistics, Linear Algebra, Calculus, and Matrices.

With Statistics, you'll have reasonable control over the data; Algebra allows you to transform the data; Calculus for Gradient and Loss functions; and Matrices for studying Images.

Though these are just a few popular topics you've encountered in your schooling, they will be ample enough to move forward.

Until we meet with another story, Sree Teja Dusi.

]]>LinkedIn currently lists 452,732 jobs worldwide that require Machine Learning skills, showcasing the growing demand for expertise in this field.

While we witness remarkable AI advancements every day, many enthusiasts remain unfamiliar with the underlying magic.

To grasp the concept of machine learning, it's important to first differentiate it from artificial intelligence (AI). AI encompasses the broader scope of developing intelligent machines capable of emulating human thinking and behavior. Machine learning, on the other hand, is a specific field within AI that enables devices to acquire knowledge from data without explicit programming. In simpler terms, machine learning is a subset of AI that focuses on building intelligent systems capable of adapting to human behavior. However, it's crucial to note that machine learning itself is dependent on data for its functioning.

Machine learning primarily revolves around data acquisition, training, and inference. However, data is not a one-size-fits-all concept. Different data types and use cases require distinct approaches for effective analysis. Just as you can't understand an image by examining each pixel individually, you can't comprehend text by merely glancing at it. The same principle applies to machine learning.

Now that we've scratched the surface of machine learning, let's delve a little deeper into its capabilities.

Machine learning models are typically trained using three different methods:

**Supervised Learning:** Think of this as a scenario where I show you a picture and inform you that it's a dog 🐶. You observe the characteristics, such as wide ears, big round eyes, and a curvy tail. Later, when I show you a picture of a human, you can differentiate it from a dog based on the learned features. Supervised learning involves providing the model with questions and answers, enabling it to learn the relationship between them.

**Unsupervised Learning:** In contrast to supervised learning, unsupervised learning doesn't involve providing explicit answers. Let's say I give you a set of pictures containing both dogs and cats, but without any labels. Your task is to classify them into separate groups. This method discovers hidden patterns and organizes data without relying on pre-existing knowledge or labels.

**Reinforcement Learning:** Remember how you were rewarded with candy in your childhood for passing exams? Reinforcement learning works similarly. The model is rewarded for every positive action it takes, encouraging it to make more beneficial moves. A classic example of reinforcement learning is training bots to play chess. These bots learn the game through trial and error, with positive moves leading to rewards and improved gameplay.

Additionally, you may encounter a fourth type: Semi-Supervised Learning, which combines elements of supervised and unsupervised learning approaches.

By understanding these three main types of learning, you're equipped with a glimpse into machine learning, ready to embark on your own ML journey.

Until we meet again, Sree Teja Dusi.

]]>