Skip to main content

Posts

Showing posts from March, 2025

Paper: Rademacher complexity (Bartlett & Mendelson, 2003)

  1. Original Paper (Bartlett & Mendelson, 2002) Title :  "Rademacher and Gaussian Complexities: Risk Bounds and Structural Results" Authors : Peter Bartlett and Shahar Mendelson Journal :  Journal of Machine Learning Research (JMLR) , 2002. PDF :  Download from JMLR Key Contributions : Introduces  Rademacher complexity  as a measure of hypothesis class complexity. Provides  data-dependent generalization bounds  (tighter than VC bounds). Connects Rademacher complexity to  Gaussian complexity  and covering numbers. 2. Why Rademacher Complexity? Rademacher complexity measures the  ability of a hypothesis class to fit random noise , defined as: R n ( H ) = E σ , D n [ sup ⁡ h ∈ H 1 n ∑ i = 1 n σ i h ( x i ) ] R n ​ ( H ) = E σ , D n ​ ​ [ h ∈ H sup ​ n 1 ​ i = 1 ∑ n ​ σ i ​ h ( x i ​ ) ] where  σ i σ i ​  are random Rademacher variables ( ± 1 ± 1  with equal probability). Interpretation : A high Rademacher complexity...

Randomized predictor (e.g., a Bayesian model or stochastic neural network)

A   randomized predictor   is a machine learning model that makes predictions by   sampling from a distribution over possible hypotheses   (or parameters), rather than using a single deterministic function. This approach is central to   Bayesian modeling ,   stochastic neural networks , and   ensemble methods . Here’s a breakdown of how it works and why it matters: 1. What is a Randomized Predictor? A randomized predictor: Outputs a probability distribution  over predictions (not a single point estimate). Samples predictions  from this distribution at inference time. Examples : Bayesian Neural Networks (BNNs) : Model weights are sampled from a posterior distribution. Dropout Networks : Predictions are made by randomly dropping neurons. Stochastic Gradient Langevin Dynamics (SGLD) : Adds noise to SGD to sample from the posterior. 2. Key Properties Property Description Uncertainty-Aware Captures epistemic (model) and aleatoric (data) uncertaint...