From 4912066559e7f2e838c85b592fb17ad62828745b Mon Sep 17 00:00:00 2001
From: hao-pt
Date: Mon, 21 Oct 2024 23:32:11 -0400
Subject: [PATCH] update mobile rendering

---
 index.html              | 113 +++++++++++++++-------------------------
 static/css/index.css    |   5 ++
 static/images/speed.jpg | Bin 0 -> 76384 bytes
 3 files changed, 46 insertions(+), 72 deletions(-)
 create mode 100644 static/images/speed.jpg

diff --git a/index.html b/index.html
index 19e82d9..14c9747 100644
--- a/index.html
+++ b/index.html
@@ -108,7 +108,7 @@
-

DiMSUM : Diffusion Mamba - A Scalable and Unified
+

DiMSUM : Diffusion Mamba - A Scalable and Unified Spatial-Frequency Method for Image Generation

@@ -124,10 +124,10 @@

DiMSUM Hoang Phan4
- Dimitris N. Metaxas3
+ Dimitris N. Metaxas2
- Anh Tran1
+ Anh Tran1

@@ -524,11 +524,10 @@

Unconditional Generation

Why is scanning in frequency space helpful?

-
+
-

Previous state-space models, particularly in processing visual data, failed to effectively address the design choice of scanning order due to their exclusive reliance on spatial processing, neglecting crucial long-range relations in the frequency spectrum. We propose a novel approach that integrates frequency scanning with the conventional spatial scanning mechanism.
@@ -595,88 +594,58 @@

Globally-shared Transformer Block

Results

-
-
-
-
- - -
Figure 1. Unconditional generation on CelebA HQ
-
- -
-
-
-
+
+ + + + + + +
Figure 1. Unconditional generation on CelebA HQ 256 & 512
-
-
-
-
- -
Figure 2. Unconditional generation on LSUN Church
-
-
- +
+
+
+ +
Figure 2. Training convergence on CelebA HQ 256.
+
-
-
- -
Figure 3. Class-conditional generation on ImageNet1k 256
-
-
+
+ +
Figure 3. Unconditional generation on LSUN Church
+
- +
+
- -
- Figure 4. Training convergence on CelebA HQ 256.
- Our method achieves faster training convergence, requiring fewer than half the training epochs compared to other diffusion models, while delivering a more stable training curve.
-
+ +
Figure 4. Class-conditional generation on ImageNet1k 256
- - +
+
+ +
+
+ The speed gap between our method and DiT widens as the input resolution increases, highlighting the efficiency of our method for high-resolution synthesis.
+
+
+