Introduction
Linear regression is a widely used supervised learning algorithm that has been a cornerstone of machine learning for decades. While linear regression performs well on simple datasets, it becomes less effective when the relationship between the independent and dependent variables is non-linear. This is where polynomial linear regression comes in – an extension of traditional linear regression that models non-linear relationships by adding polynomial terms of the features.
In this blog, we will delve into the math behind polynomial linear regression, explore a Python implementation using scikit-learn, and provide guidance on creating our own PolynomialLinearRegression class from scratch.
Mathematical Background
Polynomial linear regression is an extension of traditional linear regression. The core idea remains the same: to find the best-fitting line that minimizes the error between predicted and actual values.
Given a dataset with n samples, each sample has a single feature x_i and a target value y_i. We want to model this relationship using a polynomial equation of degree d:

y_i = β_0 + Σ(j=1 to d) β_j x_i^j

where β_0 is the intercept and β_1, ..., β_d are the coefficients of the successive powers of x_i. (With several features, the polynomial expansion also produces cross-terms between features, but the single-feature case captures the core idea.)

To optimize these coefficients, we use ordinary least squares (OLS), a widely used method for linear regression. The OLS objective function can be written as:

Minimize Σ[i=1 to n] (y_i - (β_0 + Σ(j=1 to d) β_j x_i^j))^2
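To make the objective concrete: stacking the powers of x into a design matrix X (a column of ones, then x, x², and so on) turns the problem into minimizing ||y − Xβ||², whose well-known closed-form solution is β = (XᵀX)⁻¹Xᵀy. A minimal NumPy sketch, using made-up data generated from a known quadratic so the solution is easy to check:

```python
import numpy as np

# Toy data generated from a known quadratic: y = 1 + 2x + 3x^2
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1 + 2 * x + 3 * x**2

# Design matrix with columns [1, x, x^2] (degree d = 2)
X = np.column_stack([x**j for j in range(3)])

# Closed-form OLS solution: beta = (X^T X)^{-1} X^T y
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)  # recovers [1. 2. 3.] (intercept, then coefficients)
```

Because the data are noiseless and the points are distinct, the solver recovers the generating coefficients exactly (up to floating-point precision).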
Scikit-learn Implementation
For demonstration purposes, let's use scikit-learn. There is no single polynomial regression estimator in scikit-learn; instead, the built-in `PolynomialFeatures` transformer generates the polynomial terms, and `LinearRegression` fits the coefficients with an optimized linear solver.
```python
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
import numpy as np

# Create a sample dataset with a single feature
X = np.array([[1], [2], [4], [8]])
y = np.array([2, 3, 5, 7])

# Transform the features to include polynomial terms up to degree 2
# (include_bias=False: LinearRegression fits the intercept itself)
poly_features = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly_features.fit_transform(X)

# Create and train a linear regression model on the transformed features
model = LinearRegression()
model.fit(X_poly, y)

print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)
```
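The two steps above can also be chained with scikit-learn's `make_pipeline`, which bundles the transformation and the regression into one estimator so that new data is expanded consistently at prediction time. A short sketch on the same toy data:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

X = np.array([[1], [2], [4], [8]])
y = np.array([2, 3, 5, 7])

# Chain the polynomial expansion and the linear fit into one estimator
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)

# The pipeline applies the same degree-2 expansion before predicting
print(model.predict(np.array([[3], [5]])))
```

This avoids the easy mistake of forgetting to transform new samples before calling `predict`.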
Creating Your Own PolynomialLinearRegression Class
Let’s create our own implementation of PolynomialLinearRegression from scratch. This will allow us to customize the degree and other hyperparameters.
```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

class PolynomialLinearRegression:
    def __init__(self, degree=2):
        self.degree = degree
        self.coef_ = None
        self.intercept_ = None

    def fit(self, X, y):
        # Transform the features to include polynomial terms up to 'degree',
        # keeping the fitted transformer so predict() applies the same mapping
        self.poly_ = PolynomialFeatures(degree=self.degree, include_bias=False)
        X_poly = self.poly_.fit_transform(X)

        # Train a linear regression model on the transformed features
        self.model_ = LinearRegression()
        self.model_.fit(X_poly, y)

        # Expose coefficients and intercept from the trained model
        self.coef_ = self.model_.coef_
        self.intercept_ = self.model_.intercept_
        return self

    def predict(self, X):
        # Apply the transformation fitted during training, then predict
        X_poly = self.poly_.transform(X)
        return self.model_.predict(X_poly)

# Example usage:
X_train, y_train = np.random.rand(100, 2), np.random.rand(100)
model = PolynomialLinearRegression(degree=3)
model.fit(X_train, y_train)
print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)

# New samples must have the same number of features as the training data
y_pred = model.predict(np.array([[1.5, 0.5], [2.8, 0.2]]))
print("Predicted values:", y_pred)
```
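The class above still leans on scikit-learn internally. For a genuinely from-scratch flavor, here is a sketch of what a single-feature version could look like using only NumPy's least-squares solver (the class name and its interface are illustrative, not an established API):

```python
import numpy as np

class NumpyPolynomialRegression:
    """Degree-d polynomial fit for a single feature, via least squares."""

    def __init__(self, degree=2):
        self.degree = degree

    def fit(self, x, y):
        # Build the design matrix [1, x, x^2, ..., x^d] and solve OLS
        X = np.vander(x, N=self.degree + 1, increasing=True)
        coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
        self.intercept_ = coeffs[0]
        self.coef_ = coeffs[1:]
        return self

    def predict(self, x):
        X = np.vander(x, N=self.degree + 1, increasing=True)
        return X @ np.concatenate(([self.intercept_], self.coef_))

# Example: recover y = 4 - x + 0.5 x^2 from noiseless samples
x = np.linspace(-3, 3, 20)
y = 4 - x + 0.5 * x**2
model = NumpyPolynomialRegression(degree=2).fit(x, y)
print(model.intercept_, model.coef_)
```

On noiseless data like this, the solver recovers the generating intercept (4) and coefficients (-1 and 0.5) to floating-point precision.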
Conclusion
Polynomial linear regression is a powerful extension of traditional linear regression that can handle non-linear relationships between the features and the target. By understanding the math behind this algorithm and implementing it in Python, we have gained a deeper insight into how to create models that capture complex relationships.
While chaining scikit-learn's `PolynomialFeatures` and `LinearRegression` provides an efficient implementation, creating our own PolynomialLinearRegression class allows us to customize the degree and other hyperparameters in one place. This approach also enhances understanding of the underlying algorithms and techniques used in machine learning.
As machine learning continues to evolve, we will likely see more complex regression models that incorporate higher-order polynomials or even non-polynomial transformations. By building a solid foundation in polynomial linear regression, we can tackle these challenges with confidence and create more accurate models for real-world problems.

