
Posts

Showing posts from September, 2025

P-Value - The Complete Guide

P-Value: The Complete Guide 📊

The p-value is one of the most used (and misused!) concepts in statistics. Let's demystify it completely!

What is a P-Value? 🎯

The p-value is the probability of getting results at least as extreme as what you observed, assuming the null hypothesis is true.

Simple Definition: "If nothing special is happening, how surprised should I be by what I'm seeing?"

Even Simpler:
- Small p-value = "Wow, that's surprising! Maybe something IS happening!"
- Large p-value = "Meh, this could easily happen by chance"

Real-World Analogy 🎰

Imagine you suspect a coin is rigged:
- Null Hypothesis (H₀): "The coin is fair" (50/50 chance)
- You flip it 10 times: Get 9 heads
- Question: If the coin IS fair, what's the probability of getting 9+ heads?
- Answer: p-value ≈ 0.011 (about 1.1%)
- Interpretation: "If the coin is fair, there's only a 1.1% chance of this happening. That's suspicious!...
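If you want to check that 1.1% figure yourself, here is a minimal Python sketch of the coin-flip arithmetic, using the binomial distribution for 9 or more heads in 10 fair flips:

```python
from math import comb

# One-sided p-value: P(X >= 9) for X ~ Binomial(n=10, p=0.5),
# i.e. the chance of 9+ heads if the coin really is fair.
n, p = 10, 0.5
p_value = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(9, n + 1))
print(f"p-value = {p_value:.4f}")  # 0.0107, about 1.1%
```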

Cross-Entropy - Classification Loss Function

Cross-Entropy Loss Function - Complete Guide 📊

Cross-entropy is one of the most important loss functions in machine learning, especially for classification problems. Let's dive deep!

What is Cross-Entropy?

Cross-entropy measures the difference between two probability distributions:
- True distribution: What actually happened (ground truth)
- Predicted distribution: What our model thinks will happen

Think of it as measuring "how wrong" our predictions are, with a special focus on confident wrong predictions.

Mathematical Definition

Binary Cross-Entropy (BCE)

For binary classification (yes/no, cat/dog, spam/not-spam):

Formula: L = -[y × log(p) + (1-y) × log(1-p)]

Where:
- y = actual label (0 or 1)
- p = predicted probability of class 1
- log = natural logarithm

Categorical Cross-Entropy (CCE)

For multi-class classification:

Formula: L = -Σᵢ yᵢ × log(pᵢ)

Where:
- yᵢ = 1 if class i is the true class, 0 otherwise
- pᵢ = predicted probability for class i
...
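Here is a minimal NumPy sketch of both formulas, just to make them concrete (function names are illustrative, not from any particular library):

```python
import numpy as np

def binary_cross_entropy(y, p, eps=1e-12):
    """BCE for one example: L = -[y*log(p) + (1-y)*log(1-p)]."""
    p = np.clip(p, eps, 1 - eps)            # avoid log(0)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def categorical_cross_entropy(y_onehot, p, eps=1e-12):
    """CCE for one example: L = -sum_i y_i * log(p_i)."""
    p = np.clip(p, eps, 1.0)
    return -np.sum(y_onehot * np.log(p))

# Confident wrong predictions are punished much harder:
print(binary_cross_entropy(1, 0.9))   # ~0.105 (confident and right)
print(binary_cross_entropy(1, 0.1))   # ~2.303 (confident and wrong)

# Multi-class example: true class is index 2
print(categorical_cross_entropy(np.array([0, 0, 1]),
                                np.array([0.2, 0.1, 0.7])))  # ~0.357
```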

Activation Functions in Neural Networks

A Guide to Activation Functions in Neural Networks 🧠

Question: Without an activation function, can a neural network with many layers be non-linear? Answer: Provided at the end of this document.

Activation functions are a crucial component of neural networks. Their primary purpose is to introduce non-linearity, which allows the network to learn the complex, winding patterns found in real-world data. Without them, a neural network, no matter how deep, would just be a simple linear model. In the diagram below, f is the activation function that receives the input and sends its output to the next layers.

Commonly used activation functions:
1. Sigmoid Function
2. Tanh (Hyperbolic Tangent)
3. ReLU (Rectified Linear Unit - Like an Electronic Diode)
4. Leaky ReLU & PReLU
5. ELU (Exponential Linear Unit)
6. Softmax
7. GELU, Swish, and SiLU

1. Sigmoid Function

The classic "S-curve," Sigmoid squashes any input value t...
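To make the shapes concrete, here is a minimal NumPy sketch of a few of the functions from the list above (the post's diagram and plots are not reproduced here):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))        # squashes input to (0, 1)

def tanh(x):
    return np.tanh(x)                       # squashes input to (-1, 1)

def relu(x):
    return np.maximum(0.0, x)               # passes positives, blocks negatives (like a diode)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)    # small slope instead of zero for negatives

def softmax(x):
    e = np.exp(x - np.max(x))               # subtract max for numerical stability
    return e / e.sum()                       # outputs sum to 1, usable as probabilities

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x), relu(x), softmax(x))
```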