KCCV 2022 Review - actruce's blog

2022-08-08, Oral#2

MON-O-05, DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation (예종철, KAIST)

CLIP (Radford, ICML 2021)
GAN Inversion

StyleCLIP -> StyleGAN-NADA (Siggraph 2021)

◆ Diffusion CLIP
– Not using DDPM
– DDIM (a deterministic, faster-sampling variant of DDPM)
A separately fine-tuned model is needed for each domain…
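
To make the DDPM vs. DDIM note concrete, here is a minimal sketch of one deterministic DDIM update (eta = 0). It is not the DiffusionCLIP code: `ddim_step`, the toy values, and the use of a known `eps` in place of a trained noise-prediction network are all assumptions for illustration. Because the update has no random term, running it toward a higher noise level with the same noise prediction undoes a denoising step.

```python
import math

def ddim_step(x_t, eps, alpha_bar_t, alpha_bar_s):
    """One deterministic DDIM update (eta = 0) from noise level t to level s.

    x_t and eps are lists of floats. eps stands in for the network's noise
    prediction (assumption: a real model would compute it from x_t and t).
    alpha_bar_* are cumulative noise-schedule products in (0, 1].
    """
    # Estimate the clean sample implied by the current noisy sample.
    x0_pred = [(x - math.sqrt(1.0 - alpha_bar_t) * e) / math.sqrt(alpha_bar_t)
               for x, e in zip(x_t, eps)]
    # Re-noise that estimate to level s using the same predicted noise.
    return [math.sqrt(alpha_bar_s) * x0 + math.sqrt(1.0 - alpha_bar_s) * e
            for x0, e in zip(x0_pred, eps)]

# Toy check with an oracle noise prediction (hypothetical values).
x0 = [0.5, -1.0]
eps = [0.3, 0.2]
x_t = [math.sqrt(0.5) * a + math.sqrt(0.5) * e for a, e in zip(x0, eps)]

x_s = ddim_step(x_t, eps, alpha_bar_t=0.5, alpha_bar_s=0.9)     # less noisy
x_back = ddim_step(x_s, eps, alpha_bar_t=0.9, alpha_bar_s=0.5)  # back to level t
```

With a real network the noise prediction changes between steps, so the round trip is only approximate.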

MON-O-06, GAN Inversion for out-of-range images with geometric transformations (조성현, POSTECH)

◆ Problems
– Dataset-Bias
– Out-of-range images

① BD Invert
– Base code : geometric transformation
– Detail code :

The base code is a suitable feature map taken from after a convolution layer

② Regularized Optimization Scheme (for a good content code)
-> encoder for base code
-> regularization for detail code

Focuses on good reconstruction <-> editing performance is poor
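
The reconstruction vs. regularization trade-off can be illustrated with a toy problem. This is only a sketch under strong assumptions, not BDInvert itself: the "generator" is a fixed 2x2 linear map `A`, and the regularizer is a plain L2 pull toward a reference code `w_ref` (for example, an encoder's output). All names here are hypothetical.

```python
import math

def invert(x, A, w_ref, lam, lr=0.1, steps=500):
    """Gradient descent on ||A w - x||^2 + lam * ||w - w_ref||^2."""
    n = len(w_ref)
    w = list(w_ref)  # start from the reference code
    for _ in range(steps):
        # Residual of the toy generator: r = A w - x
        r = [sum(A[i][j] * w[j] for j in range(n)) - x[i] for i in range(len(x))]
        # Gradient: 2 A^T r + 2 lam (w - w_ref)
        grad = [2.0 * sum(A[i][j] * r[i] for i in range(len(x)))
                + 2.0 * lam * (w[j] - w_ref[j]) for j in range(n)]
        w = [wj - lr * g for wj, g in zip(w, grad)]
    return w

def recon_error(w, x, A):
    """Squared reconstruction error ||A w - x||^2."""
    return sum((sum(a * wj for a, wj in zip(row, w)) - xi) ** 2
               for row, xi in zip(A, x))

def distance(w, w_ref):
    """Euclidean distance between a code and the reference code."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(w, w_ref)))

A = [[1.0, 0.5], [0.0, 1.0]]   # toy linear "generator"
x = [2.0, -1.0]                # target "image"
w_ref = [0.0, 0.0]             # reference code

w_free = invert(x, A, w_ref, lam=0.0)  # reconstruction only
w_reg = invert(x, A, w_ref, lam=5.0)   # reconstruction + regularizer
```

Without the regularizer the code reconstructs the target almost exactly but drifts far from the reference. With it, the code stays closer to the reference at the cost of reconstruction error.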

Industry #1

1) Lunit
① 100,000 X 100,000 pixels ( huge data ) -> large images
② Fine-grained annotations required
③ Many sources of variability across images

annotation “CD3DET”
Barlow Twins

2) StradVision
“Software-defined car”
EDU -> DCU ( Domain Control using multi-channel camera )

Invited Talk #1

Antonio Torralba (MIT)

“Learning to see by looking at noise”

-> Most important is sensing

GAN Dissection -> Style GAN
* Layer4 Neuron 119 ‘tree’
* Layer4 Neuron 43 ‘dome’

DatasetGAN

BigDatasetGAN

Pyramid based ~

Shaders21k dataset

~~Images~~ ~~Labels~~

2022-08-09, Oral#3

TUE-O-01 Video-Question Answering Using Language-Guided Deep Compressed-Domain Video Feature

‘Video Compression Features’

Video Coding ?
– Intra-coding (I-Frame)
– P-B frame (motion vector & residual)
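
As a rough illustration of the "motion vector & residual" idea, the sketch below rebuilds a predicted frame from a reference frame. It assumes a single global motion vector for the whole frame and tiny integer images. Real codecs use per-block vectors, transforms, and quantization, none of which is modeled here, and the function names are hypothetical.

```python
def motion_compensate(ref, mv):
    """Predict a frame by shifting the reference by one global vector (dy, dx)."""
    h, w = len(ref), len(ref[0])
    dy, dx = mv
    # Clamp source coordinates so pixels shifted in from outside repeat the border.
    return [[ref[min(max(y - dy, 0), h - 1)][min(max(x - dx, 0), w - 1)]
             for x in range(w)] for y in range(h)]

def encode_p_frame(ref, cur, mv):
    """Residual = current frame minus the motion-compensated prediction."""
    pred = motion_compensate(ref, mv)
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(cur, pred)]

def decode_p_frame(ref, mv, residual):
    """Reconstruction = motion-compensated prediction plus the residual."""
    pred = motion_compensate(ref, mv)
    return [[p + r for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(pred, residual)]

ref = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]   # I-frame (stored whole)
cur = [[1, 1, 2, 3], [5, 5, 6, 8], [9, 9, 10, 11]]    # ref moved right by 1, one pixel changed
mv = (0, 1)

residual = encode_p_frame(ref, cur, mv)
decoded = decode_p_frame(ref, mv, residual)
```

When the motion vector explains most of the change, the residual is nearly all zeros. That is why the compressed domain already carries motion information.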

TUE-O-02 ‘Meta-Learning Sparse Implicit Neural Representations (INR)’, 신진우 KAIST

INR, f(X,Y) = RGB
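
A minimal sketch of what f(X,Y) = RGB means in code: a small network maps continuous coordinates to a color, so the same function can be sampled at any resolution. The network below is untrained, with random weights and sine activations chosen only for illustration. `make_inr` and `render` are hypothetical names, not from the talk.

```python
import math
import random

def make_inr(hidden=16, seed=0):
    """Build an untrained one-hidden-layer coordinate network f(x, y) -> (r, g, b)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(hidden)]
    b1 = [rng.gauss(0, 1) for _ in range(hidden)]
    w2 = [[rng.gauss(0, 1) / math.sqrt(hidden) for _ in range(hidden)]
          for _ in range(3)]

    def f(x, y):
        # Hidden layer with sine activation.
        h = [math.sin(w[0] * x + w[1] * y + b) for w, b in zip(w1, b1)]
        out = [sum(wi * hi for wi, hi in zip(row, h)) for row in w2]
        # Sigmoid keeps each channel in (0, 1).
        return tuple(1.0 / (1.0 + math.exp(-o)) for o in out)

    return f

def render(f, height, width):
    """Sample the function at pixel centers of a height x width grid over [0, 1]^2."""
    return [[f((j + 0.5) / width, (i + 0.5) / height) for j in range(width)]
            for i in range(height)]

f = make_inr()
low_res = render(f, 2, 2)    # the same function ...
high_res = render(f, 4, 4)   # ... sampled on a finer grid
```

In practice the weights are fitted to one signal (an image, a scene, a video), which is where the training cost noted below comes from.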

Advantages : super-resolution (SR), novel-view synthesis, video representation, scalable

Obstacles : Cost of Training

Alternative : Sparse Initial model (less parameter) + Good Initialization

Meta Learning + Network Pruning
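
The pruning half can be sketched as follows, assuming simple magnitude-based pruning (the notes do not record which criterion the talk used). `magnitude_prune` is a hypothetical helper that works on a flat list of weights.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of weights with the smallest-magnitude fraction set to zero."""
    k = int(len(weights) * sparsity)  # number of weights to remove
    if k == 0:
        return list(weights)
    # Indices of the k smallest |w|; ties are broken by original order.
    drop = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

pruned = magnitude_prune([0.5, -0.1, 0.05, -2.0], sparsity=0.5)
```

In a meta-learning setup the pruned weights would serve as a sparse shared initialization that is then fine-tuned per signal. That loop is not shown here.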

Recent Works> Shifting modulation, l_0 regularization

2022-08-10, Oral#5

WED-O-02 ‘End-to-End Trainable Trident Person Search Network Using Adaptive Gradient Propagation’ (심재영)

Person Search = Person(pedestrian) Detection + Person re-ID

Existing approaches: Two-step, End-to-End

Problems :
1) Task Conflict
– P.D. -> person commonness
– P. re-ID -> uniqueness for each identity
2) Insufficient features

AGWF (Adaptive Gradient Weight Function)
: Uses the detection confidence (αi) as the weight
-> Assumes that cosine similarity is less useful when detection is poor
-> Re-ID works better on bounding boxes that were detected well.
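
One way to read the AGWF idea is as a re-ID loss whose per-box term is scaled by the detection confidence. The sketch below is an assumption-laden illustration, not the paper's formula: the loss form (1 - cosine similarity), the plain averaging, and all function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def confidence_weighted_reid_loss(features, targets, confidences):
    """Mean of alpha_i * (1 - cos(feature_i, target_i)) over detected boxes."""
    terms = [alpha * (1.0 - cosine_similarity(f, t))
             for f, t, alpha in zip(features, targets, confidences)]
    return sum(terms) / len(terms)

features = [[1.0, 0.0], [0.0, 1.0]]   # re-ID embeddings of two boxes
targets = [[1.0, 0.0], [1.0, 0.0]]    # identity prototypes they should match
loss_low_conf = confidence_weighted_reid_loss(features, targets, [0.9, 0.1])
loss_high_conf = confidence_weighted_reid_loss(features, targets, [0.9, 0.9])
```

The mismatched second box contributes little when its detection confidence is low, so poorly detected boxes send a weaker gradient into the re-ID branch.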

Invited Talk #3

WED-I Chelsea Finn, “Robust Deep Networks through Invariance and Adaptation”

Oral#6

WED-O-05, XVFI : eXtreme Video Frame Interpolation (김문철, KAIST)

Problems :
– Occlusion
– Deformation
– Large motion

WED-O-06, “Contrastive Learning for Space-Time Correspondence via Self-cycle Consistency” (Jeany Son)

Supervised? No!

Self-Supervised: forward —- backward (cycle-consistency)

supervision -> affinity matrix

self-supervision -> contrastive learning
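
The forward/backward cycle and the affinity matrix can be sketched as follows. This assumes the common random-walk formulation, in which affinities are row-wise softmaxes of feature similarities and the loss asks a forward-then-backward walk to return to its starting node. It is not the paper's exact objective, and the helper names and temperature are hypothetical.

```python
import math

def softmax_rows(m, temperature):
    """Row-wise softmax with a temperature (subtracts the row max for stability)."""
    out = []
    for row in m:
        mx = max(row)
        e = [math.exp((v - mx) / temperature) for v in row]
        s = sum(e)
        out.append([v / s for v in e])
    return out

def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def affinity(fa, fb, temperature=0.1):
    """Transition probabilities from nodes of frame a to nodes of frame b."""
    fb_t = [list(col) for col in zip(*fb)]
    return softmax_rows(matmul(fa, fb_t), temperature)

def cycle_loss(f1, f2, temperature=0.1):
    """Negative log-probability of returning to the start after frame1 -> frame2 -> frame1."""
    round_trip = matmul(affinity(f1, f2, temperature), affinity(f2, f1, temperature))
    n = len(round_trip)
    return -sum(math.log(round_trip[i][i]) for i in range(n)) / n

frame1 = [[1.0, 0.0], [0.0, 1.0]]       # two node features in frame 1
matched = [[1.0, 0.0], [0.0, 1.0]]      # frame 2 with clear correspondences
ambiguous = [[0.7, 0.7], [0.7, 0.7]]    # frame 2 where nodes are indistinguishable

loss_matched = cycle_loss(frame1, matched)
loss_ambiguous = cycle_loss(frame1, ambiguous)
```

No labels are needed: the identity of the starting node is the only supervision. An occluded node has no good match in the next frame, which is the failure case noted below.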

Problem : propagation is interrupted by occlusion

Solutions :
1) self-cycle edges
2) Bayesian model averaging (handling multi-hop)

Industry#3

NAVER LABS

– Computer Vision for in / outdoor mobility
