
Linear Classification and Regression

Linear Classification and Linear Regression are two fundamental techniques in machine learning and statistical modeling that deal with prediction tasks. Although both techniques use linear models, they are applied in different contexts and are aimed at solving different types of problems.

Let’s break down both concepts:


1. Linear Classification

Linear Classification is a type of supervised learning where the goal is to classify data into different categories (or classes) based on a linear decision boundary. The model attempts to find a linear equation that best separates the different classes in the feature space.

  • Goal: In classification, the output is categorical (e.g., spam vs. non-spam, malignant vs. benign tumors, or dog vs. cat). The objective is to assign input data to one of the predefined classes.
  • Linear Classifier: A linear classifier makes predictions by finding a linear decision boundary (hyperplane) that divides the feature space into regions corresponding to different classes. Common examples of linear classifiers are Logistic Regression and Support Vector Machines (SVM) with a linear kernel.

How Linear Classification Works:

A linear classifier uses a linear function to compute the decision boundary between classes. The decision boundary is a line (in two dimensions), plane (in three dimensions), or hyperplane (in higher dimensions) that divides the data points of different classes.

The general form of the linear classifier is:

f(x) = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b

Where:

  • f(x) is the predicted output (the decision function).
  • x_1, x_2, \dots, x_n are the features of the input data.
  • w_1, w_2, \dots, w_n are the weights (coefficients) assigned to the features.
  • b is the bias term, which shifts the decision boundary so it does not have to pass through the origin.
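To make this concrete, here is a minimal sketch in Python (the weights, bias, and input values are made-up numbers, not learned from data) showing how the decision function assigns a class by the sign of f(x):

```python
import numpy as np

# Hypothetical "learned" parameters, for illustration only.
w = np.array([0.8, -1.2, 0.5])   # weights w1..w3
b = 0.3                          # bias term

def predict(x):
    """Linear decision function: class 1 if f(x) >= 0, else class 0."""
    fx = np.dot(w, x) + b
    return 1 if fx >= 0 else 0

x_new = np.array([1.0, 0.4, 2.0])  # one input with three features
print(predict(x_new))              # 1 or 0, depending on which side of the hyperplane x_new falls
```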

Types of Linear Classification Models:

  • Logistic Regression: Despite the name, it's a classification model. It uses a linear function in combination with a logistic (sigmoid) function to model the probability of a binary outcome.
  • Linear Support Vector Machine (SVM): The goal of a linear SVM is to find the hyperplane that maximizes the margin between the classes (a minimal sketch of both models follows below).
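Both models are available in scikit-learn. Here is a minimal sketch on a made-up two-feature dataset (the data points are invented for illustration):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

# Tiny made-up 2-feature dataset: class 0 clusters low, class 1 clusters high.
X = [[0.2, 0.1], [0.4, 0.3], [0.3, 0.2],
     [2.0, 1.8], [2.2, 2.1], [1.9, 2.3]]
y = [0, 0, 0, 1, 1, 1]

log_reg = LogisticRegression().fit(X, y)
svm = LinearSVC().fit(X, y)

print(log_reg.predict([[0.5, 0.5]]))        # likely class 0
print(log_reg.predict_proba([[0.5, 0.5]]))  # class probabilities from the sigmoid
print(svm.predict([[2.1, 2.0]]))            # likely class 1
```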

Example:

  • Binary Classification: You may have a dataset of emails with features such as "contains certain keywords" or "length of email," and you want to classify each email as spam or non-spam. A linear classifier will draw a boundary in the feature space to separate the two classes.

Decision Boundary:

  • In 2D, a linear classifier tries to find a line that separates the classes:

    w_1 x_1 + w_2 x_2 + b = 0

    This equation represents a straight line (in 2D) that can be used to classify new data points, as the sketch below illustrates.
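A small numeric sketch, assuming made-up values for w_1, w_2, and b: points on the boundary satisfy x_2 = -(w_1 x_1 + b) / w_2 (when w_2 ≠ 0), and the sign of the decision function tells us which side a new point falls on.

```python
import numpy as np

# Hypothetical 2D classifier parameters.
w1, w2, b = 1.0, -2.0, 0.5

def side(x1, x2):
    """Which side of the line w1*x1 + w2*x2 + b = 0 a point falls on."""
    return np.sign(w1 * x1 + w2 * x2 + b)

# Points on the boundary itself satisfy x2 = -(w1*x1 + b) / w2 (assuming w2 != 0).
x1_vals = np.linspace(-2, 2, 5)
x2_boundary = -(w1 * x1_vals + b) / w2

print(list(zip(x1_vals, x2_boundary)))  # coordinates along the decision line
print(side(0.0, 1.0), side(0.0, -1.0))  # points on opposite sides get opposite signs
```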


2. Linear Regression

Linear Regression is a type of supervised learning used for predicting a continuous output based on one or more input features. The model finds the best-fitting straight line (in simple linear regression) or hyperplane (in multiple linear regression) that minimizes the error between the predicted values and the actual target values.

  • Goal: In regression, the output is continuous (e.g., predicting house prices, stock prices, or temperature). The goal is to fit a model that predicts a real-valued number.
  • Linear Model: Linear regression uses a linear function to predict the output based on the input features. The difference from classification is that instead of assigning data to discrete categories, the model predicts a continuous value.

How Linear Regression Works:

Linear regression assumes that there is a linear relationship between the input variables (features) and the output (target). The general form of a linear regression model is:

y = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b

Where:

  • y is the predicted output (the dependent variable).
  • x_1, x_2, \dots, x_n are the input features (independent variables).
  • w_1, w_2, \dots, w_n are the weights (coefficients) of the features, which represent the contribution of each feature to the prediction.
  • b is the bias term, which adjusts the output to fit the data better.
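As a minimal sketch (the weights, bias, and feature values here are invented for illustration), the prediction is just a dot product plus the bias, and the result is a real number rather than a class label:

```python
import numpy as np

# Hypothetical "learned" regression parameters, for illustration only.
w = np.array([150.0, 10000.0, -500.0])  # weights for three features
b = 20000.0                              # bias (intercept)

def predict(x):
    """Linear regression prediction: y = w . x + b (a continuous value)."""
    return np.dot(w, x) + b

x_new = np.array([1200.0, 3.0, 15.0])    # e.g., size, bedrooms, age (made-up units)
print(predict(x_new))                    # a real-valued prediction, not a class label
```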

Training Linear Regression:

The goal of linear regression is to find the optimal weights w_1, w_2, \dots, w_n and bias b that minimize the sum of squared errors (SSE) between the predicted values and the true values. The error for each data point is the difference between the predicted value \hat{y} and the actual value y.

The cost function used is:

\text{Cost}(w) = \frac{1}{2m} \sum_{i=1}^{m} (y_i - \hat{y}_i)^2

Where:

  • m is the number of training samples.
  • y_i is the true value.
  • \hat{y}_i is the predicted value for the i-th data point.
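This cost is straightforward to compute directly. A small sketch with made-up values, using the 1/(2m) convention from the formula above:

```python
import numpy as np

def cost(y_true, y_pred):
    """Sum of squared errors, scaled by 1/(2m) as in the formula above."""
    m = len(y_true)
    return np.sum((y_true - y_pred) ** 2) / (2 * m)

# Made-up true values and predictions, for illustration.
y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.5, 6.0])

print(cost(y_true, y_pred))  # 0.25 = (0.25 + 0.25 + 1.0) / (2 * 3)
```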

The optimal weights can be computed in closed form with ordinary least squares (OLS), or found iteratively with Gradient Descent.
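Both approaches fit in a few lines of NumPy. A sketch on made-up data (y ≈ 2x + 1): the closed-form path solves the least-squares problem directly, while the gradient-descent path iterates toward the same answer.

```python
import numpy as np

# Made-up training data: y is roughly 2*x + 1 plus a little noise.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.1, 4.9, 7.2, 8.8])

# Closed-form OLS: append a column of ones for the bias, then solve the
# least-squares problem for the stacked parameter vector [w1, b].
X_b = np.hstack([X, np.ones((len(X), 1))])
theta, *_ = np.linalg.lstsq(X_b, y, rcond=None)
print(theta)  # approximately [2.0, 1.0] -> slope w1 and intercept b

# Iterative alternative: plain gradient descent on the same cost.
w, b, lr = 0.0, 0.0, 0.05
for _ in range(2000):
    y_hat = w * X[:, 0] + b
    error = y_hat - y
    w -= lr * np.mean(error * X[:, 0])  # gradient of the cost w.r.t. w
    b -= lr * np.mean(error)            # gradient of the cost w.r.t. b
print(w, b)  # converges toward the OLS solution
```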

Example:

  • Predicting House Prices: You may have a dataset of houses with features such as size (square footage), number of bedrooms, and age, and you want to predict the price of a house. Linear regression would learn the relationship between these features and the house price, and predict the price for new houses.
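A minimal sketch of this example using scikit-learn's LinearRegression; the houses, features, and prices below are fabricated for illustration:

```python
from sklearn.linear_model import LinearRegression

# Fabricated houses: [square footage, bedrooms, age in years] -> price.
X = [[1400, 3, 20], [1600, 3, 15], [1700, 4, 30],
     [1875, 4, 10], [1100, 2, 40], [1550, 3, 25]]
y = [245000, 312000, 279000, 308000, 199000, 260000]

model = LinearRegression().fit(X, y)

print(model.coef_)                     # learned weights, one per feature
print(model.intercept_)                # learned bias term
print(model.predict([[1500, 3, 18]]))  # predicted price for a new house
```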

Linear Regression in 2D:

In simple linear regression (with one feature), the model tries to find a straight line that best fits the data points:

y = w_1 x + b

Where:

  • y is the predicted target (e.g., price).
  • x is the feature (e.g., square footage).
  • w_1 is the slope of the line.
  • b is the intercept.

The line is drawn such that the sum of squared differences between the predicted values and the actual data points is minimized.
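For a single feature, the minimizing slope and intercept have a well-known closed form: w_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2 and b = \bar{y} - w_1 \bar{x}. A small sketch on made-up (square footage, price) pairs:

```python
import numpy as np

# Made-up (square footage, price) pairs, for illustration.
x = np.array([1000.0, 1500.0, 2000.0, 2500.0])
y = np.array([200000.0, 270000.0, 340000.0, 415000.0])

# Closed-form OLS for one feature:
#   w1 = sum((x - x_mean)(y - y_mean)) / sum((x - x_mean)^2),  b = y_mean - w1 * x_mean
x_mean, y_mean = x.mean(), y.mean()
w1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
b = y_mean - w1 * x_mean

print(w1, b)            # slope (price per square foot) and intercept
print(w1 * 1800.0 + b)  # predicted price for an 1800 sq ft house
```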


Key Differences Between Linear Classification and Linear Regression:

| Aspect | Linear Classification | Linear Regression |
| --- | --- | --- |
| Output Type | Categorical (e.g., class labels) | Continuous (e.g., numerical values) |
| Goal | Assign input to a class | Predict a real-valued number |
| Target Variable | Discrete (e.g., class labels like 0 or 1) | Continuous (e.g., price, temperature) |
| Model Type | Linear decision boundary (e.g., Logistic Regression) | Linear relationship between input features and output |
| Cost Function | Typically cross-entropy (log-loss) | Mean Squared Error (MSE) or sum of squared residuals |
| Examples | Spam classification, image classification, sentiment analysis | House price prediction, stock price prediction, temperature forecasting |

Summary:

  • Linear Classification involves finding a linear decision boundary to classify data into discrete classes (e.g., spam vs. non-spam, fraud vs. non-fraud).
  • Linear Regression involves modeling the relationship between input features and a continuous output, predicting real-valued numbers (e.g., predicting house prices based on features like square footage and number of bedrooms).

Both methods rely on a linear model, but they are applied to different kinds of prediction tasks: classification vs. regression.
