Very Deep Convolutional Networks for Large-Scale Image Recognition
VGG Design rules:
All convolutions are 3x3 with stride 1, pad 1
All max poolings are 2x2 with stride 2
After pool, double # of channels
Network has 5 convolutional stages:
Stage 1: conv-conv-pool
Stage 2: conv-conv-pool
Stage 3: conv-conv-conv-[conv]-pool
Stage 4: conv-conv-conv-[conv]-pool
Stage 5: conv-conv-conv-[conv]-pool
(VGG-16 has 3 conv in stages 3, 4 and 5; VGG-19 has 4)
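The design rules above can be sketched as a small shape-tracing script. It is only an illustration: it assumes the per-stage conv counts from the original paper (2, 2, 3, 3, 3 for VGG-16 and 2, 2, 4, 4, 4 for VGG-19), 64 channels in stage 1, channel doubling that stops at 512, and a 224x224 RGB input. The names `CONVS_PER_STAGE`, `STAGE_CHANNELS` and `build_layers` are made up for this sketch, and the fully connected layers are left out.

```python
# Convs per stage (assumed from the original paper's configurations D and E).
CONVS_PER_STAGE = {"vgg16": (2, 2, 3, 3, 3), "vgg19": (2, 2, 4, 4, 4)}
# Channels double after each pool, but the doubling stops at 512.
STAGE_CHANNELS = (64, 128, 256, 512, 512)


def build_layers(variant="vgg16", size=224):
    """Return a list of (name, out_channels, spatial_size) for the conv stages."""
    layers = []
    stages = zip(CONVS_PER_STAGE[variant], STAGE_CHANNELS)
    for stage, (n_convs, out_ch) in enumerate(stages, start=1):
        for i in range(1, n_convs + 1):
            # 3x3 conv, stride 1, pad 1: spatial size is unchanged.
            size = (size + 2 * 1 - 3) // 1 + 1
            layers.append((f"conv{stage}_{i}", out_ch, size))
        # 2x2 max pool, stride 2: spatial size is halved.
        size = (size - 2) // 2 + 1
        layers.append((f"pool{stage}", out_ch, size))
    return layers


if __name__ == "__main__":
    for name, channels, size in build_layers("vgg16"):
        print(f"{name:8s} {channels:4d} x {size} x {size}")
```

Counting only the conv entries gives 13 for VGG-16 and 16 for VGG-19; the three fully connected layers that follow bring the totals to 16 and 19.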
All conv are 3x3 with stride 1, pad 1
Conv(5x5) vs. 2 Conv(3x3)
Two 3x3 convs have the same receptive field as a single 5x5 conv, but have fewer parameters and take less computation!
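A quick way to check this claim is to count weights and multiply-adds directly. The sketch below assumes C input and C output channels for every layer, stride 1, and no bias terms; the helper names (`conv_params`, `conv_macs`, `receptive_field`) and the example sizes are chosen only for illustration.

```python
def conv_params(c_in, c_out, k):
    """Weights in a k x k conv layer (bias ignored)."""
    return c_out * c_in * k * k


def conv_macs(c_in, c_out, k, h, w):
    """Multiply-adds for a stride-1, same-padded k x k conv on an h x w map."""
    return conv_params(c_in, c_out, k) * h * w


def receptive_field(kernel_sizes):
    """Receptive field of a stack of stride-1 convs."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf


if __name__ == "__main__":
    C, H, W = 64, 56, 56  # example sizes, chosen arbitrarily
    one_5x5 = conv_params(C, C, 5)      # 25 * C^2
    two_3x3 = 2 * conv_params(C, C, 3)  # 18 * C^2
    print("receptive field:", receptive_field([5]), "vs", receptive_field([3, 3]))
    print("params:", one_5x5, "vs", two_3x3)
    print("MACs:  ", conv_macs(C, C, 5, H, W), "vs", 2 * conv_macs(C, C, 3, H, W))
```

Under these assumptions the stack of two 3x3 convs needs 18C^2 weights against 25C^2 for the single 5x5 conv, and the same 18:25 ratio holds for the multiply-adds. The stack also allows a nonlinearity between the two layers.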
All max pool are 2x2 stride 2 / After pool, double #channels
Conv layers at each spatial resolution take the same amount of computation!
(Applying the conv after halving H and W and doubling C takes the same amount of computation as before.)
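The arithmetic behind this can be verified with a few lines. The sketch assumes a 3x3 conv that maps C channels to C channels on an H x W map (so the first conv of each stage, which changes the channel count, is not covered), and the function name `conv_cost` is hypothetical.

```python
def conv_cost(c, h, w, k=3):
    """Cost of one k x k conv with c input and c output channels on an h x w map."""
    params = c * c * k * k       # weights, bias ignored
    macs = params * h * w        # multiply-adds at stride 1 with same padding
    activations = c * h * w      # output elements
    return {"params": params, "macs": macs, "activations": activations}


if __name__ == "__main__":
    before = conv_cost(c=64, h=224, w=224)
    # After a 2x2 stride-2 pool and channel doubling: C -> 2C, H -> H/2, W -> W/2.
    after = conv_cost(c=128, h=112, w=112)
    print("MACs equal:       ", before["macs"] == after["macs"])
    print("params ratio:     ", after["params"] / before["params"])
    print("activations ratio:", after["activations"] / before["activations"])
```

The multiply-adds match because 9 * (2C)^2 * (H/2) * (W/2) = 9 * C^2 * H * W. In the same step the weight count grows by 4x and the activation memory halves.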
Much bigger network, Simpler structure (stable gradient)