RoIPooling in Object Detection: PyTorch Implementation (with CUDA)
RoIPooling Explanation
Region of interest pooling (also known as RoI pooling) is an operation widely used in object detection with convolutional neural networks, for example to detect multiple cars and pedestrians in a single image. Its purpose is to perform max pooling on inputs of nonuniform sizes to obtain fixed-size feature maps (e....
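As a rough illustration of the idea above, here is a minimal NumPy sketch that max-pools one region of a 2-D feature map down to a fixed grid. The function name `roi_max_pool`, the inclusive `(x1, y1, x2, y2)` box convention, and the floor/ceil binning are assumptions made for this example; the PyTorch/CUDA implementation the post describes may bin the region differently.

```python
import numpy as np


def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool one RoI of a 2-D feature map to a fixed output size.

    feature_map: 2-D array of shape (H, W).
    roi: (x1, y1, x2, y2) in feature-map cells, both corners inclusive
         (an assumed convention for this sketch).
    """
    x1, y1, x2, y2 = roi
    region = feature_map[y1:y2 + 1, x1:x2 + 1]
    h, w = region.shape
    out_h, out_w = output_size

    pooled = np.empty((out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        # Floor for the bin start, ceil for the bin end, so no row is dropped.
        r0 = (i * h) // out_h
        r1 = ((i + 1) * h + out_h - 1) // out_h
        for j in range(out_w):
            c0 = (j * w) // out_w
            c1 = ((j + 1) * w + out_w - 1) // out_w
            pooled[i, j] = region[r0:r1, c0:c1].max()
    return pooled


feature_map = np.arange(36).reshape(6, 6)
small = roi_max_pool(feature_map, (0, 0, 4, 2))   # 3 x 5 region
large = roi_max_pool(feature_map, (1, 1, 5, 5))   # 5 x 5 region
```

Both calls return a 2 × 2 array even though the input regions differ in size, which is the property that lets a detector feed every proposal into the same fully connected layers.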
InstaGAN: Instance-aware Image-to-Image Translation - Sangwoo Mo - ICLR 2019
Info
Title: InstaGAN: Instance-aware Image-to-Image Translation
Task: Image-to-Image Translation
Author: Sangwoo Mo, Minsu Cho, Jinwoo Shin
Date: Dec. 2018
Arxiv: 1812.10889
Published: ICLR 2019
Highlights & Drawbacks
Instance-level translation with semantic map
Sequential mini-batch training strategy
Abstract
Unsuperv...
GANs for Image Generation: ProGAN, SAGAN, BigGAN, StyleGAN
ProGAN
ProGAN is a new technique developed by NVIDIA Labs to improve both the speed and stability of GAN training. Instead of immediately training a GAN on full-resolution images, the paper suggests first training the generator and discriminator on low-resolution images of, say, 4 × 4 pixels and then incrementally adding layers throughout the t...
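To make the progressive schedule concrete, the sketch below lists the resolutions a training run would pass through. It assumes that each added stage doubles the resolution and that the target is 1024 × 1024; both values, and the helper name `progressive_resolutions`, are assumptions for this example rather than details stated in the excerpt.

```python
def progressive_resolutions(start=4, final=1024):
    """Return the image resolutions visited when growing from start to final.

    Assumes each stage doubles the side length (an assumption of this sketch).
    """
    if start <= 0 or final < start:
        raise ValueError("need 0 < start <= final")
    resolutions = [start]
    while resolutions[-1] < final:
        resolutions.append(resolutions[-1] * 2)
    if resolutions[-1] != final:
        raise ValueError("final must be start times a power of two")
    return resolutions


schedule = progressive_resolutions()
```

Each entry in `schedule` would correspond to one stage in which new layers are added to both the generator and the discriminator.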
PyTorch Code for SPADE
Project page
Paper
GTC 2019 demo
Youtube Demo of GauGAN
Installation
Clone this repo.
git clone https://github.com/NVlabs/SPADE.git
cd SPADE/
This code requires PyTorch 1.0 and Python 3+. Please install the dependencies with
pip install -r requirements.txt
This code also requires the Synchronized-BatchNorm-PyTorch rep....
Point-to-Point Video Generation - Tsun-Hsuan Wang - ICCV 2019
Info
Title: Point-to-Point Video Generation
Task: Video Generation
Author: Tsun-Hsuan Wang, Yen-Chi Cheng, Chieh Hubert Lin, Hwann-Tzong Chen, Min Sun
Date: Apr. 2019
Arxiv: 1904.02912
Published: ICCV 2019
Abstract
While image manipulation achieves tremendous breakthroughs (e.g., generating realistic faces) in recent years, video...
MoCoGAN: Decomposing Motion and Content for Video Generation - Tulyakov - CVPR 2018
Info
Title: MoCoGAN: Decomposing Motion and Content for Video Generation
Task: Video Generation
Author: Sergey Tulyakov, Ming-Yu Liu, Xiaodong Yang, Jan Kautz
Date: July 2017
Arxiv: 1707.04993
Published: CVPR 2018
Abstract
Visual signals in a video can be divided into content and motion. While content specifies which objects are ...
PSNR and SSIM Metric: Python Implementation
PSNR: Peak Signal-to-Noise Ratio
Peak signal-to-noise ratio, often abbreviated PSNR, is an engineering term for the ratio between the maximum possible power of a signal and the power of corrupting noise that affects the fidelity of its representation.
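The Formula
$$\mathrm{PSNR} = 10 \cdot \log_{10}\left(\frac{\mathrm{MAX}_I^2}{\mathrm{MSE}}\right)$$
where $\mathrm{MAX}_I$ is the maximum possible pixel value of the image (255 for 8-bit images) and $\mathrm{MSE}$ is the mean squared error between the two images.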
Numpy Implementation
import math
import numpy as np
def calculate_ps...
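Since the listing above is cut off, here is a minimal, self-contained sketch of a PSNR computation using the standard definition, 10 · log10(MAX² / MSE). The function name `psnr` and the default peak value of 255 (8-bit images) are assumptions for this example and are not taken from the original post.

```python
import numpy as np


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    max_value is the largest possible pixel value (255 assumed for 8-bit data).
    """
    reference = np.asarray(reference, dtype=np.float64)
    distorted = np.asarray(distorted, dtype=np.float64)
    if reference.shape != distorted.shape:
        raise ValueError("images must have the same shape")

    mse = np.mean((reference - distorted) ** 2)
    if mse == 0:
        # Identical images: the noise power is zero, so PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))


clean = np.zeros((4, 4), dtype=np.uint8)
noisy = np.ones((4, 4), dtype=np.uint8)
value = psnr(clean, noisy)
```

Casting to `float64` before subtracting avoids the wrap-around that `uint8` arithmetic would otherwise produce.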
Generating Bicubic Low Resolution Images with Python and MATLAB
The following script downsamples HR images to LR images for Super-Resolution data preparation. It requires OpenCV and NumPy.
Python Version
import os
import sys
import cv2
import numpy as np
try:
    sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
    from data.util import imresize_np
except ImportError:
    pass...