
What is a Fully Connected Neural Network?

Fully Connected Neural Network (FCNN)

A Fully Connected Neural Network (FCNN), also known as a Dense Network, is a type of artificial neural network in which every neuron in a layer is connected to every neuron in the subsequent layer. These networks consist of a sequence of layers of neurons, where each neuron receives input from all neurons of the previous layer, processes it, and passes the result on to the next layer.


Key Characteristics of a Fully Connected Neural Network

  1. Layers:

    • Input Layer: Receives the input features (e.g., pixels for image data).
    • Hidden Layers: One or more layers where neurons process the input data using weighted connections.
    • Output Layer: Produces the final prediction (e.g., class probabilities for classification tasks).
  2. Connections:

    • Fully Connected: Every neuron in a layer is connected to every neuron in the next layer.
    • Each connection has a weight that is learned during training, indicating the strength of the connection between two neurons.
  3. Activation Functions:

    • Neurons in each layer apply an activation function (e.g., ReLU, Sigmoid, Tanh) to their weighted inputs to introduce non-linearity.
  4. Weights and Biases:

    • Weights: Parameters that control the strength of connections between neurons.
    • Biases: Additional parameters added to the weighted inputs to shift the activation function.
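
To make these pieces concrete, here is a minimal sketch of a single fully connected layer in Python/NumPy. The class name, weight-initialization scale, and the choice of ReLU are illustrative assumptions, not taken from any particular library:

```python
import numpy as np

# Minimal sketch of one fully connected (dense) layer. The class name,
# weight initialization, and ReLU choice are illustrative assumptions.
class DenseLayer:
    def __init__(self, n_inputs, n_neurons, rng=np.random.default_rng(0)):
        # Full connectivity: one learnable weight per (input, neuron) pair.
        self.W = rng.standard_normal((n_inputs, n_neurons)) * 0.01
        # One bias per neuron, shifting its activation.
        self.b = np.zeros(n_neurons)

    def forward(self, x):
        # Weighted sum of all inputs, then the ReLU non-linearity.
        return np.maximum(0, x @ self.W + self.b)

layer = DenseLayer(n_inputs=4, n_neurons=3)
print(layer.forward(np.ones(4)))  # activations of the 3 neurons
```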

How Fully Connected Neural Networks Work

  1. Forward Propagation:

    • Input data is passed through the network layer by layer.
    • For each neuron, the weighted sum of its inputs is computed, and the activation function is applied.
    • This process continues through the hidden layers until the output layer is reached.
  2. Backpropagation and Training:

    • Loss Function: Measures the difference between the predicted output and the actual target (e.g., mean squared error for regression, cross-entropy for classification).
    • Backpropagation: A process where the error is propagated back through the network to adjust the weights and biases using gradient descent or other optimization algorithms.
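
These two steps can be sketched end to end on a toy regression problem. Everything below (layer sizes, learning rate, variable names) is an illustrative assumption; production code would typically rely on a framework's automatic differentiation rather than hand-written gradients:

```python
import numpy as np

# Toy sketch: forward propagation, then one backpropagation /
# gradient-descent step for a 2-layer network on random regression data.
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))          # 8 samples, 3 features
y = rng.standard_normal((8, 1))          # regression targets

W1, b1 = rng.standard_normal((3, 4)) * 0.1, np.zeros(4)
W2, b2 = rng.standard_normal((4, 1)) * 0.1, np.zeros(1)
lr = 0.1                                 # learning rate (illustrative)

# Forward propagation: weighted sums plus ReLU non-linearity.
z1 = X @ W1 + b1
a1 = np.maximum(0, z1)                   # ReLU hidden activations
y_hat = a1 @ W2 + b2                     # linear output layer

loss = np.mean((y_hat - y) ** 2)         # mean squared error
print("loss:", loss)

# Backpropagation: chain rule from the loss back to each parameter.
d_yhat = 2 * (y_hat - y) / len(X)
dW2 = a1.T @ d_yhat
db2 = d_yhat.sum(axis=0)
d_a1 = d_yhat @ W2.T
d_z1 = d_a1 * (z1 > 0)                   # gradient through ReLU
dW1 = X.T @ d_z1
db1 = d_z1.sum(axis=0)

# Gradient-descent update of weights and biases.
for p, g in [(W1, dW1), (b1, db1), (W2, dW2), (b2, db2)]:
    p -= lr * g
```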

Mathematics Behind Fully Connected Layers

For a given layer in a fully connected neural network:

  • Let $\mathbf{x}$ be the input vector to the layer.
  • Let $\mathbf{W}$ be the weight matrix and $\mathbf{b}$ the bias vector.
  • The output of the layer before the activation function is applied is: $\mathbf{z} = \mathbf{W}\mathbf{x} + \mathbf{b}$
  • After applying the activation function $f(\cdot)$, the output becomes: $\mathbf{a} = f(\mathbf{z})$, where $\mathbf{a}$ is the output of the layer.
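
As a concrete numeric instance of these formulas (all values arbitrary), here is a 3-input, 2-neuron layer with ReLU as $f$:

```python
import numpy as np

# Concrete instance of z = W x + b and a = f(z), with arbitrary numbers.
x = np.array([1.0, 2.0, 3.0])            # input vector
W = np.array([[0.2, -0.5, 1.0],          # weight matrix, one row per neuron
              [0.7,  0.1, -0.3]])
b = np.array([0.5, -1.0])                # bias vector

z = W @ x + b                            # pre-activation: [2.7, -1.0]
a = np.maximum(0, z)                     # ReLU output:    [2.7,  0.0]
```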

Advantages of Fully Connected Neural Networks

  1. Expressive Power:
    • FCNNs can approximate any continuous function given sufficient neurons and layers, thanks to the Universal Approximation Theorem.
  2. Simplicity:
    • The structure of FCNNs is straightforward and easy to implement.
  3. Flexibility:
    • FCNNs are versatile and can be applied to various tasks, such as regression, classification, and even time series forecasting.

Disadvantages of Fully Connected Neural Networks

  1. Computational Cost:

    • Due to the dense connectivity, the number of parameters grows quickly as the number of neurons and layers increases, leading to high memory and computational requirements.
  2. Overfitting:

    • With a large number of parameters, FCNNs are prone to overfitting, especially with limited training data. Regularization techniques like dropout, weight decay, or early stopping are needed (see the dropout sketch after this list).
  3. Inefficiency in Handling Spatial Data:

    • For tasks like image or video processing, FCNNs are less efficient than specialized architectures like Convolutional Neural Networks (CNNs), which take advantage of spatial hierarchies.
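
As mentioned under point 2, dropout is a common defense against overfitting. A minimal sketch of inverted dropout in NumPy follows; the keep probability and function name are illustrative assumptions:

```python
import numpy as np

# Minimal sketch of inverted dropout, one of the regularization
# techniques mentioned above. The keep probability is illustrative.
def dropout(a, keep_prob=0.8, training=True, rng=np.random.default_rng(0)):
    if not training:
        return a                            # no dropout at inference time
    mask = rng.random(a.shape) < keep_prob  # randomly keep ~80% of units
    return a * mask / keep_prob             # rescale to preserve expectation
```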

Example of a Fully Connected Neural Network for Classification

  1. Input: A dataset with 784 features (e.g., a $28 \times 28$ pixel image flattened into a 1D vector).

  2. Network Architecture:

    • Input Layer: 784 neurons (one for each pixel).
    • Hidden Layer 1: 128 neurons, ReLU activation.
    • Hidden Layer 2: 64 neurons, ReLU activation.
    • Output Layer: 10 neurons (for 10 classes in classification), softmax activation for probability distribution.
  3. Forward Pass:

    • Input data is passed through each layer, and activations are computed based on the weights and biases.
  4. Output: The network outputs a probability distribution over the 10 classes, and the class with the highest probability is selected as the predicted label.
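
One possible realization of this architecture, sketched in PyTorch (the article does not commit to a framework, so the choice here is an assumption):

```python
import torch
from torch import nn

# A 784-128-64-10 fully connected network matching the architecture above.
model = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),      # Hidden Layer 1
    nn.Linear(128, 64), nn.ReLU(),       # Hidden Layer 2
    nn.Linear(64, 10),                   # Output Layer: logits for 10 classes
)

x = torch.randn(1, 784)                  # one flattened 28x28 image
probs = torch.softmax(model(x), dim=1)   # softmax -> probability distribution
pred = probs.argmax(dim=1)               # class with the highest probability
print(pred.item())
```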


Applications of Fully Connected Neural Networks

  • Image Classification (when combined with CNNs).
  • Time Series Forecasting.
  • Speech Recognition.
  • Recommendation Systems.
  • Natural Language Processing (NLP), e.g., fully connected layers within a sequence model.

Summary

A Fully Connected Neural Network is a simple, versatile type of neural network in which each neuron is connected to every neuron in the adjacent layers. FCNNs are used for a wide range of applications but can become computationally expensive as input sizes and layer widths grow. While powerful on their own, they are often combined with specialized networks like Convolutional Neural Networks (CNNs) for tasks like image processing.
