Memos on papers related to ML, CV, and NLP.
- Wide Residual Networks
- Densely Connected Convolutional Networks
- Deep Pyramidal Residual Networks with Separated Stochastic Depth
- SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
- Dual Path Networks
- CondenseNet: An Efficient DenseNet using Learned Group Convolutions
- Recurrent Models of Visual Attention
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- SSD: Single Shot MultiBox Detector
- Feature Pyramid Networks for Object Detection
- DSSD : Deconvolutional Single Shot Detector
- Speed/accuracy trade-offs for modern convolutional object detectors
- Focal Loss for Dense Object Detection
- DetNet: A Backbone network for Object Detection
- Light-Head R-CNN: In Defense of Two-Stage Object Detector
- Fully Convolutional Instance-aware Semantic Segmentation
- Mask R-CNN
- Fast and accurate object detection in high resolution 4K and 8K video using GPUs
- Revisiting RCNN: On Awakening the Classification Power of Faster RCNN
- Fully Convolutional Networks for Semantic Segmentation
- SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
- U-Net: Convolutional Networks for Biomedical Image Segmentation
- Self-critical Sequence Training for Image Captioning
- Show and Tell: A Neural Image Caption Generator
- Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
- Deep visual-semantic alignments for generating image descriptions
- Generative Adversarial Nets
- Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
- Image-to-Image Translation with Conditional Adversarial Networks
- cGAN-based Manga Colorization Using a Single Training Image
- Learning from Simulated and Unsupervised Images through Adversarial Training
- TextBoxes++: A Single-Shot Oriented Scene Text Detector
- Synthetic data generation for Indic handwritten text recognition
- Reading Scene Text with Attention Convolutional Sequence Modeling
- Improving Online Multiple Object tracking with Deep Metric Learning
- Mobile Video Object Detection with Temporally-Aware Feature Maps
- Towards High Performance Video Object Detection for Mobiles
- FlowNet: Learning Optical Flow with Convolutional Networks
- FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
- Revisiting Unreasonable Effectiveness of Data in Deep Learning Era
- Learning to Compose Domain-Specific Transformations for Data Augmentation
- Spatial Transformer Networks
- Effective Approaches to Attention-based Neural Machine Translation
- Neural Machine Translation by Jointly Learning to Align and Translate
- Sequence to Sequence Learning with Neural Networks
- Attention Is All You Need
- Unsupervised Deep Embedding for Clustering Analysis
- Attention-Based Models for Speech Recognition
- Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex
- What’s your ML Test Score? A rubric for ML production systems
- Multimodal Emoji Prediction
- Born Again Neural Networks
- Digital Auditor: A Framework for Matching Duplicate Invoices
- Pedestrian Detection: A Benchmark