Contributions

@Jiwon @김지호 @윤민서 @배민성

Lecture 2: Model Architecture (Done)

VGG

ResNet

Comparison

Seq2Seq

Attention with RNN

Image Captioning with visual attention

Attention is All You Need

How to use Attention / Transformers for Vision?

ViT

ViT: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Swin Transformer

MLP-Mixer

Lecture 2 Summary


Lecture 3: Object Detection (Done)

Task: Object Detection

R-CNN