2020 한국인공지능학회 동계강좌 Notes – 1. Prof. Jaegul Choo (Korea University), Human in the Loop

I signed up for the 2020 한국인공지능학회 동계강좌 (winter school) and attended the three-day program from January 8 to 10, 2020. There were nine speakers in total.

I originally planned to cover everything in a single post, but each topic turned out to have quite a lot of material, so I am writing one post per lecture as a series. This first post covers the lecture “Human in the loop” by Professor Jaegul Choo of Korea University.

 

  1. Recognition vs. Generation
    1. Recognition : compresses a large number of input values into a small number of output values.
    2. Generation : expands a small number of input values into a large number of output values.
    3. Translation : transforms a large number of input values into another large number of output values.
    4. Conditional Generation : an additional input is given, which steers the generation process in a user-driven manner.
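To make the compress/expand contrast concrete, here is a minimal PyTorch sketch (my own illustration, not from the lecture); the 28×28-image and 100-dim-noise sizes are arbitrary MNIST-like stand-ins:

```python
import torch
import torch.nn as nn

# Recognition compresses many input values into few outputs;
# generation expands few input values into many outputs.
recognizer = nn.Sequential(           # 784 pixels -> 10 class scores
    nn.Flatten(),
    nn.Linear(28 * 28, 128), nn.ReLU(),
    nn.Linear(128, 10),
)
generator = nn.Sequential(            # 100-dim noise -> 784 pixels
    nn.Linear(100, 128), nn.ReLU(),
    nn.Linear(128, 28 * 28), nn.Tanh(),
)

x = torch.randn(1, 1, 28, 28)         # stand-in "image"
z = torch.randn(1, 100)               # latent code
print(recognizer(x).shape)            # torch.Size([1, 10])  -- compressed
print(generator(z).shape)             # torch.Size([1, 784]) -- expanded
```

Translation and conditional generation then correspond to many-to-many mappings, with the condition supplied as an extra input.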
  2. Applications of Generative Models
    1. Realistic samples for artwork, super-resolution, colorization, in-painting, etc.
  3. GANs
    1. Terminology
      1. Generative : It is a model for generation.
      2. Adversarial : It improves the generation quality via adversarial training.
      3. Networks : The model is implemented as neural networks.
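Concretely, adversarial training alternates a discriminator step and a generator step. Below is a minimal one-iteration sketch using the standard binary-cross-entropy formulation; the toy shapes and single-layer networks are placeholders, not the lecture's code:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(100, 784), nn.Tanh())   # toy generator
D = nn.Sequential(nn.Linear(784, 1))                # toy discriminator (outputs a logit)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCEWithLogitsLoss()

real = torch.randn(16, 784)                         # stand-in for a real batch
z = torch.randn(16, 100)

# Discriminator step: push D(real) -> 1 and D(G(z)) -> 0
opt_d.zero_grad()
d_loss = bce(D(real), torch.ones(16, 1)) + bce(D(G(z).detach()), torch.zeros(16, 1))
d_loss.backward()
opt_d.step()

# Generator step: push D(G(z)) -> 1, i.e., fool the discriminator
opt_g.zero_grad()
g_loss = bce(D(G(z)), torch.ones(16, 1))
g_loss.backward()
opt_g.step()
```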
    2. Current Status of GANs :
      1. PGGAN : Karras et al., Progressive Growing of GANs for Improved Quality, Stability, and Variation, ICLR’18
      2. StyleGAN : Karras et al., A Style-Based Generator Architecture for Generative Adversarial Networks, CVPR’19
    3. Several types of GANs
      1. DCGAN : no pooling layers (strided convolutions instead), batch normalization, Adam optimizer (lr=0.0002, beta1=0.5, beta2=0.999); see the sketch after this list
      2. pix2pix : Paired Image-to-Image Translation
      3. CycleGAN : Unpaired Image-to-Image Translation
      4. CGAN (Conditional GAN) : Mirza & Osindero, Conditional Generative Adversarial Nets, 2014
      5. ACGAN (Auxiliary Classifier GAN) : Odena et al., Conditional Image Synthesis With Auxiliary Classifier GANs, 2016
        • Improves the training of GANs using class labels
      6. STARGAN : Choi et al., StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation, CVPR 2018
        • Multi-Domain Image-to-Image translation
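As a rough sketch of the DCGAN recipe above (strided convolutions instead of pooling, batch normalization, and the stated Adam settings), here is a toy 64×64 generator/discriminator pair; the channel counts are my own illustrative choices:

```python
import torch
import torch.nn as nn

G = nn.Sequential(  # z: (N, 100, 1, 1) -> image: (N, 3, 64, 64); upsampling via strided transposed convs
    nn.ConvTranspose2d(100, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(),
    nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(),
    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),
    nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(),
    nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),
)
D = nn.Sequential(  # (N, 3, 64, 64) -> (N, 1, 1, 1) logit; downsampling via strided convs, no pooling
    nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2),
    nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2),
    nn.Conv2d(256, 1, 8, 1, 0),
)
opt_g = torch.optim.Adam(G.parameters(), lr=0.0002, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=0.0002, betas=(0.5, 0.999))

z = torch.randn(2, 100, 1, 1)
print(G(z).shape)      # torch.Size([2, 3, 64, 64])
print(D(G(z)).shape)   # torch.Size([2, 1, 1, 1])
```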
  4. Motivations for Human-in-the-Loop Approach
    1. User intent is often too complex to describe as a simple categorical variable.
      • Flexible, sophisticated forms of user inputs are necessary
    2. Some generated outputs may not be satisfactory to users, nor aligned with user intent.
      • Users should be able to partially edit the output in an interactive manner
  5. User Inputs in Generative Models
    1. Global (e.g., male or female) vs. Local (strokes and scribbles)
    2. Reference-based vs. non-reference-based
    3. Strokes and scribbles
      1. Positive vs. Negative clicks (segmentation)
      2. Particular colors (colorization)
      3. https://paintschainer.preferred.tech/index_en.html
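A common way to feed such stroke/scribble hints to a network, in the spirit of user-guided colorization, is to encode them as extra input channels: the clicked color values plus a binary mask of where clicks occurred, concatenated with the grayscale input. The sketch below is a hypothetical minimal version, not any specific paper's code:

```python
import torch
import torch.nn as nn

gray = torch.rand(1, 1, 256, 256)          # L (lightness) channel of the input image
hint_color = torch.zeros(1, 2, 256, 256)   # ab color values at clicked pixels
hint_mask = torch.zeros(1, 1, 256, 256)    # 1 where the user clicked

# Simulate one user click at (row=100, col=120) with some ab color.
hint_color[0, :, 100, 120] = torch.tensor([0.3, -0.2])
hint_mask[0, 0, 100, 120] = 1.0

# Stand-in colorizer: a real model would be a deep U-Net-like network.
net = nn.Conv2d(1 + 2 + 1, 2, kernel_size=3, padding=1)
ab_pred = net(torch.cat([gray, hint_color, hint_mask], dim=1))
print(ab_pred.shape)                       # torch.Size([1, 2, 256, 256])
```

Positive/negative segmentation clicks can be encoded the same way, e.g., as two binary click-map channels.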
    4. Reference image
      1. User’s own vs. one among a pre-given set
      2. Concatenation-based
        1. STARGAN : Choi et al., StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation, CVPR 2018
        2. DRIT : Lee et al., Diverse Image-to-Image Translation via Disentangled Representations, ECCV 2018
      3. Normalization-based (both conditioning styles are contrasted in the sketch after this list)
        1. MUNIT : Huang et al., Multimodal Unsupervised Image-to-Image Translation, ECCV 2018
        2. GDWCT : Cho et al., Image-to-Image Translation via Group-wise Deep Whitening-and-Coloring Transformation, CVPR 2019
        3. SPADE : Park et al., Semantic Image Synthesis with Spatially-Adaptive Normalization, CVPR 2019
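The two conditioning styles can be contrasted in a few lines. In the sketch below (arbitrary layer sizes, not any specific paper's architecture), (a) broadcasts the reference/style code spatially and concatenates it as channels, while (b) uses the code to predict a per-channel scale and shift applied after instance normalization, in the AdaIN-like spirit of the normalization-based methods:

```python
import torch
import torch.nn as nn

content = torch.randn(1, 64, 32, 32)   # content feature map
style = torch.randn(1, 8)              # style/reference code

# (a) Concatenation-based conditioning
style_map = style.view(1, 8, 1, 1).expand(1, 8, 32, 32)
fused_a = nn.Conv2d(64 + 8, 64, 3, padding=1)(torch.cat([content, style_map], dim=1))

# (b) Normalization-based conditioning: style code -> (gamma, beta)
to_params = nn.Linear(8, 2 * 64)
gamma, beta = to_params(style).chunk(2, dim=1)
normed = nn.InstanceNorm2d(64)(content)
fused_b = normed * (1 + gamma.view(1, 64, 1, 1)) + beta.view(1, 64, 1, 1)

print(fused_a.shape, fused_b.shape)    # both torch.Size([1, 64, 32, 32])
```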
    5. GANPaint
      1. A user edits a generated image or a photograph with high-level concepts rather than pixel colors
      2. https://www.youtube.com/watch?v=yVCgUYe4JTM&feature=youtu.be
      3. Bau et al., GAN Dissection: Visualizing and Understanding Generative Adversarial Networks, ICLR 2019
      4. http://gandissect.res.ibm.com/ganpaint.html
      5. https://github.com/CSAILVision/gandissect
      6. Human-machine interaction is almost real-time
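Conceptually, GANPaint builds on GAN Dissection: intermediate feature-map units that correlate with a concept are zeroed out (or activated) to erase (or draw) that concept. The sketch below only illustrates the ablation mechanism with a dummy generator and made-up unit indices; the actual system identifies the units by dissection and restricts edits to the user-selected region:

```python
import torch
import torch.nn as nn

G = nn.Sequential(
    nn.ConvTranspose2d(100, 64, 4, 1, 0), nn.ReLU(),   # layer whose units we edit
    nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),
)
concept_units = [3, 17, 42]   # hypothetical unit indices for some concept

def ablate(module, inputs, output):
    output[:, concept_units] = 0.0   # switch the concept's units off
    return output

handle = G[1].register_forward_hook(ablate)   # hook on the ReLU's output
edited = G(torch.randn(1, 100, 1, 1))         # sample with the concept suppressed
handle.remove()
```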
    6. Semantic Photo Manipulation
      1. Bau et al., Semantic Photo Manipulation with a Generative Image Prior, SIGGRAPH 2019
      2. https://www.youtube.com/watch?v=q1K4QWrbCRM&feature=youtu.be
    7. Interactive Colorization
      1. Zhang et al., Real-Time User-Guided Image Colorization with Learned Deep Priors, SIGGRAPH 2017
        • Supports both local and global hints
        • Global hints can encode a characteristic of the color histogram of a given reference image
        • Concatenation-based conditioning
      2. Sangkloy et al., Scribbler: Controlling Deep Image Synthesis with Sketch and Color, CVPR 2017
        • Interactive Colorization via Sketch and Color Strokes
        • Simulating user sketch inputs at training time (hint simulation is sketched in code after this list)
      3. Zhang et al., Deep Exemplar-based Video Colorization, CVPR 2019
      4. Sun et al., Adversarial Colorization of Icons Based on Structure and Color Conditions, ACM MM 2019
      5. Lee et al., Reference-Based Sketch Image Colorization using Augmented-Self Reference and Dense Semantic Correspondence, under review
      6. Bahng et al., Coloring with Words: Guiding Image Colorization Through Text-based Palette Generation, ECCV 2018
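A recurring ingredient in these papers is simulating user inputs at training time (see also future direction 3 below): since real clicks are unavailable during training, a few ground-truth color pixels are revealed as synthetic hints. Zhang et al. sample small hint patches; the uniform-pixel sampling below is a simplified sketch of that idea:

```python
import torch

def simulate_hints(ab_true, n_hints=5):
    """ab_true: (N, 2, H, W) ground-truth color -> (hint_color, hint_mask)."""
    n, _, h, w = ab_true.shape
    hint_color = torch.zeros(n, 2, h, w)
    hint_mask = torch.zeros(n, 1, h, w)
    for i in range(n):
        ys = torch.randint(0, h, (n_hints,))   # random hint locations
        xs = torch.randint(0, w, (n_hints,))
        hint_color[i, :, ys, xs] = ab_true[i, :, ys, xs]
        hint_mask[i, 0, ys, xs] = 1.0
    return hint_color, hint_mask

hints, mask = simulate_hints(torch.randn(4, 2, 256, 256))
print(mask.sum(dim=(1, 2, 3)))   # roughly n_hints revealed pixels per image
```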
    8. Interactive Segmentation
      1. Acuna et al., Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++, CVPR 2018
      2. Ling et al., Fast Interactive Object Annotation with Curve-GCN, CVPR 2019
      3. Wang et al., Object Instance Annotation with Deep Extreme Level Set Evolution, CVPR 2019
  6. Future Research Directions
    1. Support for real-time, multiple iterative interactions
      • Reflecting higher-order user intent in multiple sequential interactions
    2. Revealing inner workings and providing interaction handles
      • e.g., by explicitly using attention modules
    3. Better simulating user inputs in the training stage
    4. Incorporating data visualization and advanced user interfaces
    5. Leveraging hard rule-based approaches
    6. Incorporating users’ implicit feedback and online learning
  7. Useful Links
    1. 2019 ICML Workshop on Human in the Loop Learning (HILL)
    2. 2020 IUI Workshop on Human-AI Co-Creation with Generative Models
    3. Key researchers