
As discussed in my previous article, linear regression solves problems whose output is continuous. The output has no fixed set of values, so it cannot be categorical. Input variables can be either continuous or discrete.
We will use scikit-learn's linear regression library for this and predict the maximum temperature from the minimum temperature in a weather dataset (download link for the data). The steps are the same as in the previous article; they are repeated here.
- Step1: Load data
- Step2: Process data
- Step3: Import the regression library
- Step4: Find the accuracy of the algorithm
- Step5: Predict for the new data
Getting into the code:
#Step 1
import pandas as pd
data = pd.read_csv("weather_lab.csv")
print(data.head(10))
weather_lab.csv is available in the repository. data.head(10) will print the first 10 entries in the CSV.
#Step 2
X = data.iloc[:, 0].values
y = data.iloc[:, 1].values
import matplotlib.pyplot as plt
plt.scatter(X, y)
plt.xlabel('MinTemp', fontsize=14)
plt.ylabel('MaxTemp', fontsize=14)
plt.show()
X = X.reshape(-1, 1)
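The data distribution is plotted with Matplotlib, with MinTemp on the x-axis and MaxTemp on the y-axis. X is then reshaped into a single-column 2-D array, which is the input shape scikit-learn expects.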
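The `X.reshape(-1, 1)` call in Step 2 is needed because scikit-learn estimators expect the features as a 2-D array of shape (rows, features). Below is a minimal sketch of what the reshape does; the temperature values are made up for illustration and are not taken from weather_lab.csv.

```python
import numpy as np

# Hypothetical MinTemp readings standing in for the first CSV column.
X = np.array([8.0, 14.0, 13.7, 13.3, 7.6])
print(X.shape)     # (5,)

# -1 lets NumPy infer the number of rows; 1 fixes a single feature column.
X_2d = X.reshape(-1, 1)
print(X_2d.shape)  # (5, 1)
```

The values themselves are unchanged; only the array's shape differs.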
#Step 3
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)
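This divides the dataset into training and test sets. With test_size=0.4, 40% of the rows are held out for testing, and random_state=42 makes the split reproducible.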
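To see how the split sizes work out, here is a small sketch on synthetic data. The 10-row array is an assumption made for illustration, not the weather dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the weather data: 10 rows, one feature.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0

x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=42
)

# 40% of 10 rows go to the test set, the remaining 60% to training.
print(x_train.shape, x_test.shape)  # (6, 1) (4, 1)
```

Every row lands in exactly one of the two sets, so the model is evaluated on data it has not seen during fitting.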
#Step 4
from sklearn.linear_model import LinearRegression
clf = LinearRegression()
clf.fit(x_train, y_train)
#print(x_test)
y_pred_test = clf.predict(x_test)
y_pred_train = clf.predict(x_train)
This works the same way as it does for a classifier. Fit the model on the training data, which calculates the required coefficients (weights), and then run predict on new data.
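As a rough illustration of what fit calculates, the sketch below trains on synthetic points that lie exactly on the line y = 3x + 2 (an assumed example, not the weather data) and reads back the learned intercept and slope. It also calls score, which for scikit-learn regressors returns the R² value and is one common way to report how well the line fits.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data on an exact line y = 3x + 2 (assumed for illustration).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = 3.0 * X.ravel() + 2.0

model = LinearRegression()
model.fit(X, y)

print(model.intercept_)    # close to 2.0
print(model.coef_[0])      # close to 3.0
print(model.score(X, y))   # R^2, close to 1.0 for a perfect fit
```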
#Step 5
#Root mean squared error (RMSE) on the test set.
import numpy as np
print(np.sqrt(np.mean((y_pred_test - y_test)**2)))
alpha = clf.intercept_
beta = clf.coef_[0]
plt.plot(X,alpha+beta*X,color='r')
plt.scatter(X,y)
plt.show()
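Linear regression finds its solution by minimizing the mean squared error. It sums the squared vertical distances between each data point and the fitted line, and picks the intercept (alpha) and slope (beta) that make that sum as small as possible. The value printed above is the root mean squared error on the test set.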
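For a single input variable, the intercept and slope that minimize the squared error can be written in closed form. The sketch below computes them with NumPy on made-up points; the data and the helper name `mse` are assumptions for illustration. It shows the underlying idea and is not necessarily how scikit-learn computes the fit internally.

```python
import numpy as np

# Made-up sample points (assumed for illustration, not the weather data).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Closed-form least-squares estimates for a single feature.
beta = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha = y.mean() - beta * x.mean()

def mse(intercept, slope):
    """Mean squared error of the line intercept + slope * x on the sample."""
    return np.mean((y - (intercept + slope * x)) ** 2)

print(alpha, beta)
print(np.sqrt(mse(alpha, beta)))  # RMSE, same formula as in Step 5

# Nudging the slope away from the fitted value increases the error.
print(mse(alpha, beta) < mse(alpha, beta + 0.1))  # True
```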
#Step 6
minimum_temp = input("Enter Minimum Temp:")
print(clf.predict(np.array([[float(minimum_temp)]])))
Last but not least, the client code to predict new data.
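If the prediction step needs to be reused, it can be wrapped in a small function. The sketch below is one possible shape for that; `predict_max_temp` is a hypothetical helper name, and the toy model is fitted on made-up points lying on y = x + 10 rather than on the weather data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def predict_max_temp(model, minimum_temp):
    """Return the predicted MaxTemp for one MinTemp value (hypothetical helper)."""
    # predict expects a 2-D array: one row, one feature.
    features = np.array([[float(minimum_temp)]])
    return float(model.predict(features)[0])

# Toy model fitted on made-up points lying on y = x + 10.
toy = LinearRegression().fit(
    np.array([[0.0], [10.0], [20.0]]), np.array([10.0, 20.0, 30.0])
)

# Accepts the string that input() would return, including decimals.
print(predict_max_temp(toy, "12.5"))
```

Converting with float rather than int means decimal readings such as 12.5 are accepted.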
Again, this is a beginner-level problem. Understanding each step makes it easier to grasp machine learning concepts. The code can be found at the GitHub link.