What are Tensors in AI

Tensors in AI: The Building Blocks of Deep Learning Explained

What Is a Tensor? (The 30-Second Version)

A tensor is just a container for numbers, organized in a specific shape. Think of it as:

  • Scalar = Single number (0D tensor)
  • Vector = List of numbers (1D tensor)
  • Matrix = Table of numbers (2D tensor)
  • Tensor = Numbers organized in 3 or more dimensions (strictly speaking, scalars, vectors, and matrices are tensors too, just of lower rank)

That's it! Everything in deep learning is built on this simple concept.
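
Here is a minimal PyTorch sketch of the same hierarchy (assuming PyTorch is installed; the values are arbitrary):

import torch

scalar = torch.tensor(42.0)                  # 0D tensor (rank 0)
vector = torch.tensor([1.2, 3.4, 5.6, 7.8])  # 1D tensor (rank 1)
matrix = torch.tensor([[1, 2, 3],
                       [4, 5, 6],
                       [7, 8, 9]])           # 2D tensor (rank 2)
cube   = torch.zeros(2, 2, 3)                # 3D tensor (rank 3), e.g. a tiny 2x2 RGB image

for t in (scalar, vector, matrix, cube):
    print(t.ndim, tuple(t.shape))
# 0 ()
# 1 (4,)
# 2 (3, 3)
# 3 (2, 2, 3)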

From Numbers to Neural Networks: A Journey

Step 1: The Scalar (0-Dimensional Tensor)

42

Just a single number. Examples in AI:

  • Learning rate: 0.001
  • Loss value: 2.34
  • Accuracy: 95.2%

Step 2: The Vector (1-Dimensional Tensor)

[1.2, 3.4, 5.6, 7.8]

A list of numbers. Examples in AI:

  • Word embedding: [0.2, -0.5, 0.8, ...]
  • Neuron activations: [0, 0.9, 0.1, 0.7]
  • Probabilities: [0.1, 0.3, 0.6] (cat, dog, bird)

Step 3: The Matrix (2-Dimensional Tensor)

[[1, 2, 3],
 [4, 5, 6],
 [7, 8, 9]]

A table of numbers. Examples in AI:

  • Grayscale image: pixels in rows and columns
  • Weight matrix: connections between neural network layers
  • Batch of vectors: multiple word embeddings

Step 4: 3D Tensor (The Real Power Begins)

[[[R, G, B],    # Pixel 1
  [R, G, B]],   # Pixel 2
 [[R, G, B],    # Pixel 3
  [R, G, B]]]   # Pixel 4

Examples in AI:

  • Color image: Height × Width × 3 (RGB channels)
  • Video frame: Height × Width × Channels
  • Text batch: Batch × Sequence length × Embedding size

Step 5: 4D Tensor and Beyond

Shape: [Batch, Height, Width, Channels]
Example: [32, 224, 224, 3]

This represents 32 color images, each 224×224 pixels!
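
You can check this directly; a tiny PyTorch sketch (the tensor is just zeros, used only to inspect the shape):

import torch

batch = torch.zeros(32, 224, 224, 3)  # 32 RGB images, 224x224, channels-last layout
print(batch.ndim)    # 4  (a rank-4 tensor)
print(batch.shape)   # torch.Size([32, 224, 224, 3])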

Why Tensors? The Superpower of AI

1. Batch Processing

Instead of one image:

[224, 224, 3]  # One image

Process 32 at once:

[32, 224, 224, 3]  # 32 images simultaneously!

2. Parallel Computing

GPUs love tensors because they can process all numbers simultaneously:

  • CPU: Process numbers one by one
  • GPU: Process entire tensor at once
  • Result: speedups of 10-100x are common for tensor-heavy workloads

3. Mathematical Elegance

Neural network forward pass:

Output = Input × Weights + Bias

With tensors, this works for 1 sample or 1 million samples - same code!
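
A minimal sketch of that idea in PyTorch (a smaller batch is used here just to keep memory modest; the names weights and bias are illustrative):

import torch

weights = torch.randn(784, 10)      # one linear layer: 784 inputs -> 10 outputs
bias = torch.randn(10)

one_sample = torch.randn(1, 784)
big_batch = torch.randn(10_000, 784)

# The exact same expression handles any batch size
out_one = one_sample @ weights + bias   # shape [1, 10]
out_big = big_batch @ weights + bias    # shape [10000, 10]
print(out_one.shape, out_big.shape)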

Real-World Tensor Examples

📷 Image Classification

# Input tensor shape
image_batch = [32, 224, 224, 3]
# 32 images, 224×224 pixels, 3 colors (RGB)

# After convolution
feature_maps = [32, 112, 112, 64]
# 32 images, smaller size, 64 different features

📝 Natural Language Processing

# Input tensor shape
text_batch = [16, 100, 768]
# 16 sentences, 100 words max, 768-dimensional embeddings

# After attention
attention_output = [16, 100, 768]
# Same shape, but enriched with context

🎥 Video Processing

# Input tensor shape
video_batch = [8, 30, 224, 224, 3]
# 8 videos, 30 frames each, 224×224 pixels, RGB

Tensor Operations: The Magic Moves

1. Reshape - Change the Organization

# From image to flat vector
[224, 224, 3] → [150528]   # 224 × 224 × 3 = 150,528 values

# Why? To feed into a fully connected layer
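
A quick sketch of the same reshape in PyTorch (variable names are just for illustration):

import torch

image = torch.randn(224, 224, 3)
flat = image.reshape(-1)     # -> shape [150528], ready for a fully connected layer
print(flat.shape)            # torch.Size([150528])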

2. Transpose - Flip Dimensions

# Swap batch and sequence for processing
[Batch, Sequence, Features] → [Sequence, Batch, Features]
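
In PyTorch this is a one-line transpose (the sizes are illustrative):

import torch

x = torch.randn(32, 100, 768)   # [Batch, Sequence, Features]
y = x.transpose(0, 1)           # [Sequence, Batch, Features]
print(y.shape)                  # torch.Size([100, 32, 768])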

3. Concatenate - Combine Information

# Combine features from different layers
[32, 64] + [32, 128] → [32, 192]
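
In PyTorch this would be torch.cat along the feature dimension (a minimal sketch):

import torch

a = torch.randn(32, 64)
b = torch.randn(32, 128)
c = torch.cat([a, b], dim=1)   # -> [32, 192]
print(c.shape)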

4. Slice - Extract Parts

# Get first 10 images from batch
[32, 224, 224, 3] → [10, 224, 224, 3]
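
Slicing uses ordinary Python indexing; a small PyTorch sketch:

import torch

batch = torch.randn(32, 224, 224, 3)
first_ten = batch[:10]          # -> [10, 224, 224, 3]
print(first_ten.shape)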

The Tensor Lifecycle in Neural Networks

Step 1: Input Preparation

Raw Data → Tensor
Image file → [Height, Width, Channels]
Text → [Sequence Length, Embedding Size]
Audio → [Time Steps, Frequencies]

Step 2: Forward Pass

Input Tensor → Layer 1 → Tensor 1 → Layer 2 → Tensor 2 → ... → Output Tensor
[32, 784] → [32, 128] → [32, 64] → [32, 10]
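
A hedged sketch of such a stack in PyTorch (layer sizes follow the shapes above):

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 64),  nn.ReLU(),
    nn.Linear(64, 10),
)

x = torch.randn(32, 784)   # input tensor: 32 flattened 28x28 images
out = model(x)
print(out.shape)           # torch.Size([32, 10])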

Step 3: Loss Calculation

Predictions [32, 10] vs. Labels [32, 10] → Loss Scalar

Step 4: Backpropagation

Gradient Tensors flow backward, same shapes as forward!

Common Tensor Shapes in Popular Models

🖼️ CNN (Convolutional Neural Network)

Input: [Batch, Height, Width, Channels]
Conv Layer: [Batch, H/2, W/2, Filters]
Pooling: [Batch, H/4, W/4, Filters]
Output: [Batch, Classes]

🔤 Transformer (BERT, GPT)

Input: [Batch, Sequence, Embedding]
Attention: [Batch, Sequence, Sequence]
Output: [Batch, Sequence, Hidden]

🔄 RNN/LSTM

Input: [Batch, Time, Features]
Hidden: [Batch, Hidden_Size]
Output: [Batch, Time, Output_Size]
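
A small sketch checking these shapes with PyTorch's nn.LSTM (sizes are illustrative; batch_first=True gives the [Batch, Time, ...] layout used above):

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
x = torch.randn(8, 20, 16)          # [Batch, Time, Features]
output, (h_n, c_n) = lstm(x)
print(output.shape)                 # torch.Size([8, 20, 32])  -> [Batch, Time, Output_Size]
print(h_n.shape)                    # torch.Size([1, 8, 32])   -> [Layers, Batch, Hidden_Size]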

Debugging Tensor Shapes: The #1 Skill

Common Error Messages

"Expected input shape (32, 224, 224, 3), got (32, 224, 224)"

Fix: Add channel dimension!

"Matrix multiplication shapes don't match: (32, 784) vs (512, 10)"

Fix: Wrong layer size, should be (784, 10)!

Pro Tips

  1. Always print shapes during development
  2. Use assertions to check expected shapes (a small sketch follows below)
  3. Visualize tensor flow through your network
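
A minimal sketch of tips 1 and 2 in PyTorch (the expected shape here is just an example):

import torch

x = torch.randn(32, 224, 224, 3)   # image batch, channels-last

# Tip 1: print shapes during development
print(x.shape)                      # torch.Size([32, 224, 224, 3])

# Tip 2: assert the shape you expect before feeding it to a layer
assert x.ndim == 4 and x.shape[-1] == 3, f"Expected [B, H, W, 3], got {tuple(x.shape)}"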

Tensor Broadcasting: The Hidden Magic

When tensors have compatible shapes, operations "just work":

[32, 224, 224, 3] + [3] = [32, 224, 224, 3]

The [3] is "broadcast" to match the bigger tensor!

Rules:

  1. Start from rightmost dimension
  2. Dimensions match if equal or one is 1
  3. Missing dimensions are added as 1
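
A minimal PyTorch sketch of that broadcast (the per-channel values are only illustrative):

import torch

images = torch.zeros(32, 224, 224, 3)            # image batch
channel_shift = torch.tensor([0.5, 0.25, 0.1])   # one value per RGB channel, shape [3]

shifted = images + channel_shift   # [3] is broadcast across batch, height and width
print(shifted.shape)               # torch.Size([32, 224, 224, 3])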

Memory Considerations

Tensor Size Calculation

Float32 tensor [32, 224, 224, 3]:
32 × 224 × 224 × 3 × 4 bytes = 19.3 MB
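
You can confirm the arithmetic in PyTorch (element_size() is 4 bytes for float32):

import torch

x = torch.zeros(32, 224, 224, 3, dtype=torch.float32)
size_bytes = x.element_size() * x.nelement()   # 4 bytes per element
print(size_bytes)               # 19267584
print(size_bytes / 1e6, "MB")   # ~19.27 MB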

Memory-Saving Tricks

  1. Use smaller batch sizes
  2. Mixed precision (Float16)
  3. Gradient checkpointing
  4. Model parallelism

Practical Code Examples

PyTorch

import torch

# Create tensors (PyTorch convention is channels-first: [Batch, Channels, Height, Width])
x = torch.randn(32, 3, 224, 224)   # Random image batch
w = torch.randn(64, 3, 3, 3)       # Conv weights: 64 filters, 3 input channels, 3×3 kernel

# Operations
y = torch.nn.functional.conv2d(x, w, padding=1)  # Convolution -> [32, 64, 224, 224]
z = y.mean(dim=[2, 3])                           # Global average pool -> [32, 64]

TensorFlow

import tensorflow as tf

# Create tensors (TensorFlow convention is channels-last: [Batch, Height, Width, Channels])
x = tf.random.normal([32, 224, 224, 3])   # Random image batch
w = tf.random.normal([3, 3, 3, 64])       # Conv weights: 3×3 kernel, 3 input channels, 64 filters

# Operations
y = tf.nn.conv2d(x, w, strides=1, padding='SAME')  # Convolution -> [32, 224, 224, 64]
z = tf.reduce_mean(y, axis=[1, 2])                 # Global average pool -> [32, 64]

The Journey from Tensor to Intelligence

  1. Pixels → Tensor → Edge Detection
  2. Edges → Tensor → Shape Recognition
  3. Shapes → Tensor → Object Detection
  4. Objects → Tensor → Scene Understanding

Each step is just tensor operations!

Key Takeaways

🎯 Remember These

  1. Tensor = Multi-dimensional array of numbers
  2. Shape = The dimensions of the tensor
  3. Rank = Number of dimensions
  4. Broadcasting = Automatic shape matching

💡 Why It Matters

  • Every input to AI is converted to tensors
  • Every AI operation is tensor manipulation
  • Understanding tensors = Understanding AI

🚀 Next Steps

  1. Practice visualizing tensor shapes
  2. Learn basic tensor operations
  3. Debug shape mismatches (you will encounter them!)
  4. Think in tensors, not loops

Final Thought

Tensors might seem abstract, but they're just organized numbers. Every stunning AI image, every chatbot response, every recommendation - it all comes down to tensors flowing through mathematical operations. Master tensors, and you've mastered the language of AI!


Remember: If you can organize numbers in boxes, you understand tensors. Everything else is just details!
