Machine Learning — Linear Regression

Shravan C
2 min read · Dec 22, 2019

Linear regression, as mentioned in my previous article, solves problems whose output is continuous. The output has no fixed set of values, so it cannot be categorical. Input variables can be either continuous or discrete.

We will use the linear regression implementation from sklearn for this and predict the maximum temperature from the minimum temperature in a weather dataset (download link for the data). The steps are the same as in the previous article; they are repeated here.

  • Step 1: Load data
  • Step 2: Process data
  • Step 3: Import the regression library
  • Step 4: Find the accuracy of the algorithm
  • Step 5: Predict for the new data

Getting into the code:

#Step 1
import pandas as pd
data = pd.read_csv("weather_lab.csv")
print(data.head(10))

weather_lab.csv is available in the repository. data.head(10) will print the first 10 entries in the CSV.
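Before modelling, it can help to check that the columns are numeric and complete. The snippet below is only a sketch: it uses a small made-up DataFrame as a stand-in for weather_lab.csv, and the column names MinTemp and MaxTemp are assumed from the plot labels used in Step 2.

```python
import pandas as pd

# Hypothetical stand-in for weather_lab.csv; the values are made up for illustration.
sample = pd.DataFrame({
    "MinTemp": [8.0, 14.0, 13.7, None, 7.6],
    "MaxTemp": [24.3, 26.9, 23.4, 15.5, 16.1],
})

# Count missing values in each column.
missing_per_column = sample.isnull().sum()
print(missing_per_column)

# Drop incomplete rows so the regression only sees full (MinTemp, MaxTemp) pairs.
cleaned = sample.dropna()
print(cleaned.shape)
```

Whether dropping rows is the right choice depends on the data; filling in missing values is another common option.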

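#Step 2
import matplotlib.pyplot as plt
#First column is the feature (MinTemp), second column is the target (MaxTemp).
X = data.iloc[:, 0].values
y = data.iloc[:, 1].values
plt.scatter(X, y)
plt.xlabel('MinTemp', fontsize=14)
plt.ylabel('MaxTemp', fontsize=14)
plt.show()
#sklearn expects a 2-D feature array of shape (n_samples, n_features).
X = X.reshape(-1, 1)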

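This plots the data distribution with Matplotlib, with MinTemp on the x-axis and MaxTemp on the y-axis. The final reshape turns X into a single-column 2-D array, which is the shape sklearn expects for its input features.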
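To see what reshape(-1, 1) does to the feature array, here is a minimal sketch on three made-up values (the names x_flat and x_column are just for illustration):

```python
import numpy as np

# A 1-D array of three feature values, shape (3,).
x_flat = np.array([8.0, 14.0, 13.7])

# reshape(-1, 1) keeps the values but arranges them as one column;
# the -1 asks NumPy to infer the number of rows.
x_column = x_flat.reshape(-1, 1)

print(x_flat.shape)
print(x_column.shape)
```

The values are unchanged; only the shape goes from one dimension to a single column with one row per sample.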

#Step 3
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)

This divides the dataset into training and test data: test_size=0.4 holds out 40% of the rows for testing, and random_state=42 makes the split reproducible.
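A quick way to get a feel for the split is to run it on a tiny made-up dataset. This sketch assumes ten synthetic samples (X_demo and y_demo are hypothetical names) and the same test_size and random_state as above:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten made-up samples with a single feature column.
X_demo = np.arange(10, dtype=float).reshape(-1, 1)
y_demo = 2.0 * X_demo.ravel() + 1.0

# test_size=0.4 sends 40% of the rows to the test set.
x_tr, x_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.4, random_state=42
)
print(len(x_tr), len(x_te))

# Using the same random_state again reproduces the same split.
x_tr2, x_te2, _, _ = train_test_split(
    X_demo, y_demo, test_size=0.4, random_state=42
)
print(np.array_equal(x_te, x_te2))
```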

#Step 4
from sklearn.linear_model import LinearRegression
clf = LinearRegression()
clf.fit(x_train, y_train)
#print(x_test)
y_pred_test = clf.predict(x_test)
y_pred_train = clf.predict(x_train)

This is the same as for a classifier. Basically, fit the model on the training data (this calculates the required coefficients/weights) and then run predict on new data.
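To make the "calculates the coefficients" part less of a black box, here is a rough sketch for the single-feature case. It computes the slope and intercept with the textbook closed-form least squares formulas and compares them with what LinearRegression returns. The data is made up, and this is not necessarily how sklearn computes the fit internally.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data that roughly follows y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

# Closed-form ordinary least squares for one feature:
# slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
# intercept = mean_y - slope * mean_x
x_mean, y_mean = x.mean(), y.mean()
beta_manual = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
alpha_manual = y_mean - beta_manual * x_mean

# The same fit through sklearn, which needs a 2-D feature array.
model = LinearRegression().fit(x.reshape(-1, 1), y)

print(alpha_manual, beta_manual)
print(model.intercept_, model.coef_[0])
```

Both lines should print matching values up to floating-point rounding.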

#Step 5
#Root mean squared error (RMSE) on the test set.
import numpy as np
print(np.sqrt(np.mean((y_pred_test - y_test)**2)))
#Plot the fitted line y = alpha + beta * x over the data.
alpha = clf.intercept_
beta = clf.coef_[0]
plt.plot(X, alpha + beta*X, color='r')
plt.scatter(X, y)
plt.show()

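Linear regression finds its solution by minimizing the mean squared error: it takes the squared difference between each observed value and the value predicted by the fitted line, and picks the intercept and slope that make the sum of those squared differences as small as possible. The value printed here is the root mean squared error on the test data.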
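As a small illustration of this idea, the sketch below fits a line to made-up data, computes the sum of squared errors, the RMSE and the R² score, and then checks that shifting the fitted line makes the error larger. The 0.5 shift is an arbitrary choice for the demonstration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data that roughly follows y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0]).reshape(-1, 1)
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

model = LinearRegression().fit(x, y)
fitted = model.predict(x)

# Residual: observed value minus the value on the fitted line.
residuals = y - fitted
sse = np.sum(residuals ** 2)             # sum of squared errors
rmse = np.sqrt(np.mean(residuals ** 2))  # root mean squared error

# R^2 as reported by sklearn, and the same quantity computed by hand.
r2 = model.score(x, y)
r2_manual = 1.0 - sse / np.sum((y - y.mean()) ** 2)

# Any other line, here the fitted line shifted up by 0.5, has a larger error.
sse_shifted = np.sum((y - (fitted + 0.5)) ** 2)

print(sse, rmse, r2)
print(sse_shifted > sse)
```

Note that score is computed on the same data used for fitting here; on real data it is more informative to call it with the held-out test set.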

#Step 6
minimum_temp = input("Enter Minimum Temp:")
#float() accepts fractional temperatures such as 12.5, which int() would reject.
print(clf.predict(np.array([[float(minimum_temp)]])))

Last but not least, the client code to predict new data.
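A slightly more reusable variant of this client code could wrap the parsing and the prediction in a function. This is only a sketch: predict_max_temp is a hypothetical helper, and the toy model is fitted on made-up data so that the snippet runs on its own.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def predict_max_temp(model, raw_value):
    """Hypothetical helper: parse user text and return one predicted value."""
    min_temp = float(raw_value)        # raises ValueError for non-numeric text
    features = np.array([[min_temp]])  # shape (1, 1): one sample, one feature
    return float(model.predict(features)[0])

# Toy model fitted on made-up data that follows y = 1 + 2x exactly.
toy_x = np.array([[0.0], [1.0], [2.0]])
toy_y = np.array([1.0, 3.0, 5.0])
toy_model = LinearRegression().fit(toy_x, toy_y)

prediction = predict_max_temp(toy_model, "12.5")
print(prediction)
```

In the article's code the fitted clf would take the place of toy_model, and the string would come from input().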

Again, this is a beginner-level problem. Understanding each step makes it easier to grasp the machine learning concepts. The code can be found at the GitHub link.

