# Awesome-PaperManager-Efficient DM

A curated list of papers on efficient diffusion models.

## Full List

| Title | Authors | Introduction | Links | Github |
| --- | --- | --- | --- | --- |
| Cache Me if You Can: Accelerating Diffusion Models through Block Caching | Meta GenAI | We reuse outputs from layer blocks of previous steps to speed up inference. Furthermore, we propose a technique to automatically determine caching schedules based on each block's changes over timesteps. | Paper | |
| DeepCache: Accelerating Diffusion Models for Free | Xinyin Ma, Gongfan Fang, Xinchao Wang (National University of Singapore) | DeepCache capitalizes on the inherent temporal redundancy observed in the sequential denoising steps of diffusion models; it caches and retrieves features across adjacent denoising stages, thereby curtailing redundant computations. Utilizing the properties of the U-Net, we reuse the high-level features while updating the low-level features cheaply. | Paper | Github |
| FRDiff: Feature Reuse for Universal Training-free Acceleration of Diffusion Models | Junhyuk So, Jungwon Lee, Eunhyeok Park | We introduce an advanced acceleration technique that leverages the temporal redundancy inherent in diffusion models. Reusing feature maps with high temporal similarity opens up a new opportunity to save computation resources without compromising output quality. | Paper | |
| Approximate Caching for Efficiently Serving Text-to-Image Diffusion Models | Shubham Agarwal, Subrata Mitra (Adobe Research) | We introduce a novel approximate-caching technique that reduces the number of iterative denoising steps by reusing intermediate noise states created during a prior image generation. Based on this idea, we present an end-to-end text-to-image generation system. | Paper | |
| FORA: Fast-Forward Caching in Diffusion Transformer Acceleration | Pratheba Selvaraju (Microsoft) | We present Fast-FORward CAching (FORA), a simple yet effective approach designed to accelerate DiT by exploiting the repetitive nature of the diffusion process. FORA implements a caching mechanism that stores and reuses intermediate outputs from the attention and MLP layers across denoising steps, thereby reducing computational overhead. This approach does not require model retraining and seamlessly integrates with existing transformer-based diffusion models. | Paper | Github |
| Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching | Xinyin Ma, Xinchao Wang (National University of Singapore) | We introduce a novel scheme, named Learning-to-Cache (L2C), that learns to conduct caching in a dynamic manner for diffusion transformers. Specifically, by leveraging the identical structure of layers in transformers and the sequential nature of diffusion, we explore redundant computations between timesteps by treating each layer as the fundamental unit for caching. | Paper | |
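
Most entries above share the same core mechanism: outputs of expensive blocks change slowly across adjacent denoising steps, so they can be computed once and reused for the next few steps. The sketch below is a minimal, hypothetical PyTorch illustration of that pattern, not the implementation from any listed paper; the `Block` module, the fixed `refresh_every` schedule, and the toy update rule are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn


class Block(nn.Module):
    """Toy stand-in for an expensive U-Net / DiT layer block (hypothetical)."""

    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.net(x)


def denoise_with_block_caching(x, blocks, num_steps: int, refresh_every: int = 5):
    """Toy denoising loop that recomputes block outputs only every
    `refresh_every` steps and reuses the cached outputs in between."""
    cache = {}
    for step in range(num_steps):
        h = x
        for i, block in enumerate(blocks):
            if step % refresh_every == 0 or i not in cache:
                cache[i] = block(h)  # recompute and store at refresh steps
            h = cache[i]             # otherwise reuse the cached block output
        x = x - 0.1 * h              # stand-in for the scheduler update
    return x


if __name__ == "__main__":
    torch.manual_seed(0)
    blocks = nn.ModuleList([Block(64) for _ in range(4)])
    x = torch.randn(1, 64)
    with torch.no_grad():
        out = denoise_with_block_caching(x, blocks, num_steps=20, refresh_every=5)
    print(out.shape)
```

In this sketch the refresh schedule is a fixed interval; the papers above differ mainly in how that decision is made (per-block schedules derived from observed feature changes, learned caching decisions as in L2C, or reuse of cached states across different prompts as in approximate caching).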