Very Deep Convolutional Networks for Large-Scale Image Recognition [VGG]
Rethinking the Inception Architecture for Computer Vision [inception_v3]
Scaling Distributed Machine Learning with the Parameter Server (OSDI 2014) [Paper]
Project Adam: Building an Efficient and Scalable Deep Learning Training System (OSDI 2014) [Paper]
Revisiting Distributed Synchronous SGD [Paper]
Neural Architecture Search with Reinforcement Learning (ICLR 2017) [Paper]
[SSP] More Effective Distributed ML via a Stale Synchronous Parallel Parameter Server (NIPS 2013) [Paper]
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour [Paper]
[D-PSGD] Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent (NIPS 2017) [Paper]
[AD-PSGD] Asynchronous Decentralized Parallel Stochastic Gradient Descent (ICML 2018) [Paper]
Heterogeneity-Aware Asynchronous Decentralized Training [Paper]