🚀 Dev Day with Sam #3: CNN: Teaching Machines How to See 👀
There was a time I thought computers just knew what they were looking at.
You show one a cat… it says “cat.” Simple, right? Then I realized a computer doesn’t actually see anything. It sees numbers. Just pixels. And that’s where Convolutional Neural Networks (CNNs) come in.
🧠 So… What Really Is a CNN?
A Convolutional Neural Network (CNN) is a deep learning model designed to understand images by learning patterns such as edges, shapes, and textures, layer by layer.
Think of it like this: You don’t recognize a face all at once.
You notice edges → shapes → features → identity. CNNs do the same thing… but mathematically.
🔍 Step 1: Convolution: Where Learning Begins
At the heart of CNNs is the convolution operation. A small filter (kernel) moves across the image and extracts features.
Mathematically:
S(i, j) = (I ∗ K)(i, j) = Σ_m Σ_n I(i + m, j + n) · K(m, n)
Where:
I = Input image
K = Kernel (filter)
S = Output (feature map)
👉 This is how a CNN detects things like edges and textures
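To make the sliding-filter idea concrete, here is a minimal sketch in plain Python. It assumes no padding and a stride of 1, and the `image` and `kernel` values are made up for illustration (a trained CNN learns its own kernel values). Strictly speaking this computes cross-correlation, which is what most deep learning libraries implement under the name convolution.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Sum of element-wise products between the kernel and the patch under it.
            total = sum(
                image[i + m][j + n] * kernel[m][n]
                for m in range(kh)
                for n in range(kw)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked vertical-edge kernel, chosen only for this example.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map lights up only in the middle column, exactly where the image goes from dark to bright. That is an edge detector in four numbers.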
⚡ Step 2: Activation: Introducing Non-Linearity
After convolution, we apply an activation function:
f(x) = max(0, x)
This is ReLU (Rectified Linear Unit)
👉 It sets negative values to zero and keeps positive values unchanged
👉 Without this, stacking layers would still give just one… linear function (and that’s useless for complex patterns)
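ReLU is simple enough to sketch in one line. The snippet below applies it element by element to a small, made-up feature map. The `relu` helper name is my own, not something from a library.

```python
def relu(feature_map):
    """Apply ReLU element-wise: negatives become 0, everything else passes through."""
    return [[max(0, value) for value in row] for row in feature_map]


# Hypothetical feature map with a mix of negative and positive responses.
feature_map = [
    [-3, 2],
    [0, -1],
]

activated = relu(feature_map)
print(activated)  # [[0, 2], [0, 0]]
```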
🧊 Step 3: Pooling: Reducing Complexity
Pooling reduces the size of the feature map while keeping key information.
Example: Max Pooling, which keeps only the largest value in each small window (such as 2×2) of the feature map.
👉 Keeps strongest signals
👉 Reduces computation
👉 Helps reduce overfitting
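Here is a rough sketch of max pooling, assuming non-overlapping 2×2 windows (window size equal to stride), which is a common default but not the only option. The `max_pool2d` helper and the input values are illustrative.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling: a `size` x `size` window moved with stride `size`."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the values inside the current window and keep the largest.
            window = [
                feature_map[i * size + m][j * size + n]
                for m in range(size)
                for n in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

Sixteen values go in, four come out, and each survivor is the strongest response in its neighbourhood.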
🧠 Step 4: Fully Connected Layer: Making Decisions
After extracting features, a CNN flattens the feature maps into a single vector and feeds it into a dense layer:
z = Wx + b
Where:
x = Flattened features
W = Weights
b = Bias
👉 This is where the model decides:
“Is this a cat… or not?”
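As a small sketch of the flatten-then-dense step, the code below computes z = Wx + b by hand. The weights, bias, and the two class labels in the comments are invented for the example. In a real network they would be learned during training.

```python
def flatten(feature_map):
    """Turn a 2D feature map into a flat list of values."""
    return [value for row in feature_map for value in row]


def dense(x, W, b):
    """Compute z = Wx + b; each row of W produces one output score."""
    return [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(W, b)
    ]


x = flatten([[4, 2], [2, 8]])  # [4, 2, 2, 8]

# Hypothetical weights and biases: one row per class.
W = [
    [0.5, 0.0, 0.0, 0.25],  # score for "cat"
    [0.0, 0.5, 0.5, 0.0],   # score for "not cat"
]
b = [1.0, 0.0]

z = dense(x, W, b)
print(z)  # [5.0, 2.0]
```

The higher score wins, so with these made-up numbers the layer leans towards "cat".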
🎯 Step 5: Output: Probability
The final output uses Softmax:
softmax(z_i) = e^(z_i) / Σ_j e^(z_j)
👉 Converts outputs into probabilities
👉 Helps the model choose the most likely class
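Softmax exponentiates each score and divides by the sum of all the exponentials, so the results are positive and add up to 1. Here is a minimal sketch using only the standard library. The input scores are made up, and subtracting the maximum first is a common numerical-stability trick that does not change the result.

```python
import math


def softmax(z):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(z)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(value - shift) for value in z]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for two classes.
scores = [5.0, 2.0]

probs = softmax(scores)
predicted = probs.index(max(probs))  # index of the most likely class

print(probs)
print(predicted)  # 0
```

With these scores, the first class ends up with roughly 95% of the probability.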
🔄 Putting It All Together
A CNN pipeline looks like this:
Image → Convolution → ReLU → Pooling → Fully Connected → Output
Or, in more general terms:
Input → Feature Extraction → Dimensionality Reduction → Classification
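To see the whole pipeline as one function, here is a compact end-to-end sketch in plain Python. Everything in it is an assumption for illustration: the helper names, the 5×5 image, the hand-picked kernel, and the weights. A real CNN would learn the kernel and weights from data, and would normally be built with a framework rather than nested lists.

```python
import math


def conv2d(image, kernel):
    """Sliding sum of products, no padding, stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[i + m][j + n] * kernel[m][n]
                for m in range(kh) for n in range(kw))
            for j in range(len(image[0]) - kw + 1)
        ]
        for i in range(len(image) - kh + 1)
    ]


def relu(fm):
    return [[max(0, v) for v in row] for row in fm]


def max_pool2d(fm, size=2):
    """Non-overlapping max pooling."""
    return [
        [
            max(fm[i * size + m][j * size + n]
                for m in range(size) for n in range(size))
            for j in range(len(fm[0]) // size)
        ]
        for i in range(len(fm) // size)
    ]


def dense(x, W, b):
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def softmax(z):
    shift = max(z)
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(image, kernel, W, b):
    features = relu(conv2d(image, kernel))   # feature extraction
    pooled = max_pool2d(features)            # dimensionality reduction
    x = [v for row in pooled for v in row]   # flatten
    return softmax(dense(x, W, b))           # classification


# Hypothetical 5x5 image with a dark-to-bright vertical edge.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]          # hand-picked vertical-edge kernel
W = [[1.0, 0.0, 1.0, 0.0],           # made-up weights for class 0 ("edge")
     [0.0, 1.0, 0.0, 1.0]]           # made-up weights for class 1 ("no edge")
b = [0.0, 0.0]

probs = forward(image, kernel, W, b)
print(probs)
```

With these toy numbers the first class gets most of the probability, because the pooled features respond strongly where the edge is.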
🧠 What Makes CNNs Powerful?
CNNs don’t just memorize images…
They learn patterns hierarchically:
Early layers → edges
Middle layers → shapes
Deep layers → objects
This is why CNNs power applications like:
Face recognition
Medical imaging
Self-driving cars
Security systems
Security systems
🛠️ Realization Moment
While studying CNNs, I had a moment: “This isn’t just image processing… this is perception.”
We’re teaching machines how to interpret the world.
🤝 Dev Day with Sam
This is what this journey is about.
Not just using models… But understanding what’s happening underneath.
We don’t just use AI.
👉 We build it.
