Paper

Very Deep Convolutional Networks for Large-Scale Image Recognition

Architecture

VGG Design rules:

All convolutions are 3x3 with stride 1, pad 1

All max poolings are 2x2 with stride 2

After pool, double # of channels (64 → 128 → 256 → 512; stage 5 stays at 512)
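
A minimal sketch of what the three rules do to tensor shapes, using the standard output-size formula `(size + 2*pad - kernel) // stride + 1`. The 224-pixel input, the 64 starting channels, and the cap at 512 channels follow the paper's configuration table; the function names are made up for illustration.

```python
def conv_out(size, kernel=3, stride=1, pad=1):
    """Spatial size after a convolution (3x3, stride 1, pad 1 by default)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial size after a max pooling (2x2, stride 2 by default)."""
    return (size - kernel) // stride + 1


def trace_stages(input_size=224, base_channels=64, max_channels=512, num_stages=5):
    """Return (channels, spatial size) at the output of each stage."""
    shapes = []
    size, channels = input_size, base_channels
    for _ in range(num_stages):
        size = conv_out(size)  # 3x3 / stride 1 / pad 1 keeps the size unchanged
        size = pool_out(size)  # 2x2 / stride 2 halves the size
        shapes.append((channels, size))
        channels = min(channels * 2, max_channels)  # double after pool, capped
    return shapes


if __name__ == "__main__":
    for stage, (channels, size) in enumerate(trace_stages(), start=1):
        print(f"Stage {stage}: {channels} x {size} x {size}")
```

Because every conv preserves the spatial size, only the pools change it, so each stage halves height and width while the channel count doubles.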

Network has 5 convolutional stages:

Stage 1: conv-conv-pool

Stage 2: conv-conv-pool

Stage 3: conv-conv-conv-[conv]-pool

Stage 4: conv-conv-conv-[conv]-pool

Stage 5: conv-conv-conv-[conv]-pool

(VGG-16 has 3 conv in stages 3, 4 and 5; VGG-19 adds the bracketed 4th conv in each of them)
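
The stage layout can be written down as a small configuration and checked against the network names. This is a sketch, not the authors' code: the per-stage conv counts follow configurations D and E in the paper's Table 1, the three fully connected layers at the end are taken from the paper, and `layer_spec` and `weight_layers` are hypothetical helper names.

```python
# Number of 3x3 conv layers in each of the 5 stages (paper, Table 1, configs D and E).
CONVS_PER_STAGE = {
    "VGG-16": [2, 2, 3, 3, 3],
    "VGG-19": [2, 2, 4, 4, 4],
}
NUM_FC_LAYERS = 3  # assumption from the paper: FC-4096, FC-4096, FC-1000


def layer_spec(name):
    """Expand a config into a flat list: channel counts for convs, 'M' for max pool."""
    spec = []
    channels = 64
    for num_convs in CONVS_PER_STAGE[name]:
        spec.extend([channels] * num_convs)
        spec.append("M")
        channels = min(channels * 2, 512)  # double after each pool, capped at 512
    return spec


def weight_layers(name):
    """Count layers with weights (conv + fully connected); pools have none."""
    return sum(CONVS_PER_STAGE[name]) + NUM_FC_LAYERS


if __name__ == "__main__":
    for name in CONVS_PER_STAGE:
        print(name, weight_layers(name), layer_spec(name))
```

The count explains the names: 13 conv layers plus 3 fully connected layers give VGG-16, and 16 plus 3 give VGG-19.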

AlexNet vs. VGG-16

Compared with AlexNet, VGG-16 is a much bigger network with a simpler, more uniform structure (stable gradient)
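
To put a number on "much bigger" for the VGG-16 side, the parameter count can be worked out by hand. This is a sketch under stated assumptions taken from the paper rather than from these notes: a 224x224 RGB input, biases counted, and a classifier of FC-4096, FC-4096, FC-1000. The function name is made up for illustration.

```python
CONVS_PER_STAGE = [2, 2, 3, 3, 3]  # VGG-16 (configuration D in the paper)
FC_SIZES = [4096, 4096, 1000]      # assumption: classifier sizes from the paper


def vgg16_param_count(input_size=224, in_channels=3):
    """Return (conv parameters, fully connected parameters), biases included."""
    conv_params = 0
    channels_in, channels_out, size = in_channels, 64, input_size
    for num_convs in CONVS_PER_STAGE:
        for _ in range(num_convs):
            # 3x3 kernel: 9 weights per (input channel, output channel) pair, plus biases
            conv_params += 3 * 3 * channels_in * channels_out + channels_out
            channels_in = channels_out
        size //= 2                                 # 2x2 / stride 2 max pool
        channels_out = min(channels_out * 2, 512)  # double after pool, capped at 512

    fc_params = 0
    features_in = channels_in * size * size        # flattened 512 x 7 x 7 feature map
    for features_out in FC_SIZES:
        fc_params += features_in * features_out + features_out
        features_in = features_out
    return conv_params, fc_params


if __name__ == "__main__":
    conv, fc = vgg16_param_count()
    print(f"conv: {conv:,}  fc: {fc:,}  total: {conv + fc:,}")
```

Under these assumptions the total comes to 138,357,544 parameters, of which the three fully connected layers account for 123,642,856.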

AlexNet