Explaining Multiple Linear Regression with an example | Download project in Python code
Prediction through Multiple Linear Regression based Model
In this post, we perform prediction using Multiple Linear Regression (MLR). To implement MLR, we use a dataset consisting of 5 columns: R&D Spend, Administration, Marketing Spend, State and Profit.
In this example, we predict Profit from the other 4 factors. The steps are as follows: we start by importing the libraries and the dataset, then explore the dataset. The data consists of 50 rows with no null values; State is the only categorical column, so it is encoded before training. The data is then split into a training set and a test set. The training set is used to fit the model, which then predicts Profit for data it has not seen.
Step 1: Importing the libraries
Step 2: Importing the dataset
Step 3: Exploring the dataset
Step 4: Encoding categorical data
Step 5: Splitting the dataset into the Training set and Test set
Step 6: Training the Multiple Linear Regression model on the Training set
Step 7: Predicting the Test set results
Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
Importing the dataset
dataset = pd.read_csv('50_Startups.csv')
# All columns except the last are features; the last column (Profit) is the target
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values
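To make the slicing concrete, here is a minimal sketch on a tiny made-up table. The DataFrame `demo` and its column names are hypothetical and are not part of 50_Startups.csv; the point is only that `iloc[:, :-1]` keeps every column except the last, and `iloc[:, -1]` keeps only the last.

```python
import pandas as pd

# Hypothetical three-column table; the last column plays the role of the target
demo = pd.DataFrame({"a": [1, 2], "b": [3, 4], "target": [5, 6]})

# Every row, all columns except the last -> feature matrix
X_demo = demo.iloc[:, :-1].values
# Every row, only the last column -> target vector
y_demo = demo.iloc[:, -1].values

print(X_demo.shape, y_demo.shape)
```

Here the feature matrix is two-dimensional and the target is one-dimensional, which is the layout scikit-learn estimators expect.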
Exploring the dataset
dataset.head(5)
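Besides looking at the first rows, it can help to check for missing values and to see which columns hold text rather than numbers. The sketch below uses a small stand-in DataFrame with the same five column names; the rows and the state labels "A" and "B" are made up for illustration. On the real data you would run the same calls on `dataset`.

```python
import pandas as pd

# Hypothetical stand-in rows with the same column names as the dataset;
# the values are invented for this example.
sample = pd.DataFrame({
    "R&D Spend": [1000.0, 2000.0, 3000.0],
    "Administration": [500.0, 600.0, 700.0],
    "Marketing Spend": [300.0, 400.0, 500.0],
    "State": ["A", "B", "A"],
    "Profit": [1500.0, 2500.0, 3500.0],
})

# Number of missing values in each column
null_counts = sample.isnull().sum()
print(null_counts)

# Non-numeric columns are the ones that need encoding before training
categorical_columns = sample.select_dtypes(exclude="number").columns.tolist()
print(categorical_columns)
```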
Encoding categorical data
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
# One-hot encode column index 3 (State) and pass the other columns through unchanged
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [3])], remainder='passthrough')
X = np.array(ct.fit_transform(X))
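It may not be obvious what the transformed array looks like, so here is a small sketch of the same transformer applied to four made-up rows. The array `X_demo` and the labels "A", "B" and "C" are assumptions for illustration. Note that the one-hot columns are placed first and the passed-through numeric columns follow.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical rows: three numeric columns followed by a category label
X_demo = np.array([
    [10.0, 20.0, 30.0, "A"],
    [11.0, 21.0, 31.0, "B"],
    [12.0, 22.0, 32.0, "C"],
    [13.0, 23.0, 33.0, "A"],
], dtype=object)

# Same configuration as in the post: encode column 3, keep the rest as they are
ct_demo = ColumnTransformer(
    transformers=[("encoder", OneHotEncoder(), [3])],
    remainder="passthrough",
)
X_encoded = np.array(ct_demo.fit_transform(X_demo))

# Three categories become three indicator columns, so 4 columns turn into 6
print(X_encoded.shape)
print(X_encoded[0])
```

The first row has label "A", so its first three values are 1, 0, 0, followed by the original numeric values 10, 20, 30.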
Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
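As a quick check of what `test_size = 0.2` and `random_state = 0` do, the sketch below splits a made-up array of 50 rows. The arrays `X_demo` and `y_demo` are placeholders, not the startup data. One fifth of the rows go to the test set, and fixing `random_state` makes the split repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 50 rows with 6 feature columns each
X_demo = np.arange(300, dtype=float).reshape(50, 6)
y_demo = np.arange(50, dtype=float)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)
print(X_tr.shape, X_te.shape)  # (40, 6) (10, 6)

# Repeating the call with the same random_state gives the same test rows
_, X_te_again, _, _ = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)
print(np.array_equal(X_te, X_te_again))  # True
```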
Training the Multiple Linear Regression model on the Training set
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
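Fitting finds an intercept and one coefficient per feature column, so that the prediction is the intercept plus the weighted sum of the features. The sketch below illustrates this on synthetic data generated from a known rule (intercept 5, coefficients 2 and 3); these numbers and the names `X_demo`, `y_demo` and `model` are assumptions chosen for the example, not values from the startup dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic features: 20 rows, 2 columns, values between 0 and 10
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 10, size=(20, 2))

# Target built from a known linear rule with no noise
y_demo = 5.0 + 2.0 * X_demo[:, 0] + 3.0 * X_demo[:, 1]

model = LinearRegression()
model.fit(X_demo, y_demo)

# The fitted parameters should be very close to 5 and [2, 3]
print(model.intercept_, model.coef_)
```

On the fitted `regressor` from the post, the same two attributes (`intercept_` and `coef_`) show how each input column contributes to the predicted Profit.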
Predicting the Test set results
y_pred = regressor.predict(X_test)
np.set_printoptions(precision=2)
# Print predicted Profit (left column) next to actual Profit (right column)
print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))
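Comparing the two columns by eye works for a small test set. If a single number is preferred, one possible approach is to compute an error metric with scikit-learn. The sketch below uses made-up actual and predicted values (`y_true_demo` and `y_pred_demo` are hypothetical); with the real model you would pass `y_test` and `y_pred` instead.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score

# Hypothetical actual and predicted values, for illustration only
y_true_demo = np.array([100.0, 200.0, 300.0, 400.0])
y_pred_demo = np.array([110.0, 190.0, 310.0, 390.0])

# Average absolute difference between prediction and actual value
mae = mean_absolute_error(y_true_demo, y_pred_demo)
# Share of the variance in the actual values explained by the predictions
r2 = r2_score(y_true_demo, y_pred_demo)

print(mae)           # 10.0
print(round(r2, 3))  # 0.992
```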