Questions about code and some SD start knowledge #17

stillbetter · 2024-12-09T11:40:04Z

Hi, thanks for your patience.

I'd like to understand the difference between the two functions get_velocity and get_approximated_x0 in scheduler/ddpm_scheduler.py, and how they relate to the v_prediction type in the step function at line 356. They all seem to predict the denoised latents, so why are they called in different ways?

stillbetter · 2024-12-09T12:27:00Z

Another question: since we can get pred_original_sample at every step in the pipeline, why don't we take it directly instead of continuing to denoise step by step?

This may be irrelevant to the paper, but I was truly confused and couldn't find a proper answer. It would be great if you could explain it. Thanks.

claudiom4sir · 2024-12-09T14:08:00Z

Hi,

I'd like to understand the difference between the two functions get_velocity and get_approximated_x0 in scheduler/ddpm_scheduler.py, and how they relate to the v_prediction type in the step function at line 356. They all seem to predict the denoised latents, so why are they called in different ways?

get_approximated_x0 does exactly the same thing as the v_prediction type at line 410. It was implemented as a separate function to avoid calling the scheduler's step function.

The implementation of get_velocity is very similar to get_approximated_x0 but serves a different purpose. Indeed, you can see that the equation velocity = sqrt_alpha_prod * noise - sqrt_one_minus_alpha_prod * sample is different from pred_original_sample = (alpha_prod_t**0.5) * sample - (beta_prod_t**0.5) * model_output (i.e., the coefficients of the sample term and the noise term are swapped).
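To make the relation between the two formulas concrete, here is a minimal sketch. It assumes the standard v-parameterization, in which sample in get_velocity is the clean latent x0, while sample in the step function is the noisy latent x_t and model_output is the predicted velocity. The function names below are illustrative and are not the repository's code. The sketch checks numerically that the step formula recovers x0 (and the noise) from the velocity used as the training target.

```python
import math
import random


def add_noise(x0, noise, alpha_prod):
    """Forward diffusion: x_t = sqrt(a) * x0 + sqrt(1 - a) * noise."""
    return math.sqrt(alpha_prod) * x0 + math.sqrt(1.0 - alpha_prod) * noise


def velocity_target(x0, noise, alpha_prod):
    """Training target: v = sqrt(a) * noise - sqrt(1 - a) * x0 (x0 is the CLEAN latent)."""
    return math.sqrt(alpha_prod) * noise - math.sqrt(1.0 - alpha_prod) * x0


def x0_from_velocity(x_t, v, alpha_prod):
    """Inference: x0 = sqrt(a) * x_t - sqrt(1 - a) * v (x_t is the NOISY latent)."""
    return math.sqrt(alpha_prod) * x_t - math.sqrt(1.0 - alpha_prod) * v


def noise_from_velocity(x_t, v, alpha_prod):
    """Inference: noise = sqrt(a) * v + sqrt(1 - a) * x_t."""
    return math.sqrt(alpha_prod) * v + math.sqrt(1.0 - alpha_prod) * x_t


# Scalar toy values; real latents are tensors, but the algebra is elementwise.
rng = random.Random(0)
x0 = rng.gauss(0.0, 1.0)
noise = rng.gauss(0.0, 1.0)
alpha_prod = 0.3  # arbitrary value of the cumulative alpha product at some timestep

x_t = add_noise(x0, noise, alpha_prod)
v = velocity_target(x0, noise, alpha_prod)

x0_rec = x0_from_velocity(x_t, v, alpha_prod)
noise_rec = noise_from_velocity(x_t, v, alpha_prod)

print(abs(x0_rec - x0) < 1e-9, abs(noise_rec - noise) < 1e-9)
```

Under these assumptions the two formulas are not contradictory: one defines the velocity from (x0, noise) during training, the other solves for x0 given (x_t, v) at inference, so the coefficients appear swapped.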

Another question: since we can get pred_original_sample at every step in the pipeline, why don't we take it directly instead of continuing to denoise step by step?

Because pred_original_sample is just an approximation of the final latent. It is computed by combining the noise predicted by the UNet and the noisy latent that is progressively refined. When t is large (e.g., 900), the latent is still very noisy and the noise predicted by the UNet may contain errors. As a consequence, the approximation of x0 is not good enough to be the final output (refer to Figure 3 of the paper). With step-by-step denoising, the current latent becomes progressively better (i.e., less noisy) and the noise predicted by the UNet becomes more accurate. In addition, step-by-step denoising allows us to exploit the proposed bidirectional strategy to ensure temporal consistency.
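As a toy illustration of why the one-step estimate is less reliable at large t, the sketch below looks only at how an error in the network output propagates into x0 under the v_prediction formula x0 = sqrt(a) * x_t - sqrt(1 - a) * v. It assumes a linear beta schedule from 1e-4 to 0.02 over 1000 steps, chosen purely for illustration (the repository's schedule may differ), and the helper names are hypothetical. It does not model the UNet itself, so it captures only part of the argument above.

```python
import math


def alphas_cumprod(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an ASSUMED linear beta schedule."""
    values, prod = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        prod *= 1.0 - beta
        values.append(prod)
    return values


def x0_error_scale(alpha_prod):
    """Factor by which an error in the predicted v is multiplied in the x0 estimate."""
    return math.sqrt(1.0 - alpha_prod)


acp = alphas_cumprod()
for t in (900, 500, 50):
    # Columns: timestep, cumulative alpha product, error scale on x0.
    print(t, round(acp[t], 4), round(x0_error_scale(acp[t]), 4))
```

With this schedule the scale factor grows with t, so the same output error shifts the one-step x0 estimate more at t = 900 than at t = 50.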

stillbetter · 2024-12-10T12:18:38Z

Thank you so much for your reply! I'm so happy I could cry~

So can I simply treat v_prediction as having the denoised latents as its prediction target, and epsilon as having the noise as its target?
I compared get_velocity and get_approximated_x0. get_approximated_x0 is the same as the v_prediction type, and in train.py line 1004, v_prediction corresponds to the get_velocity function. But in the scheduler's step function at line 410, v_prediction uses the inverted formula. Why are the two return values inverted? For get_approximated_x0, I understand that the return value is the noisy latent minus the estimated noise, which gives a 'clean' latent $\widetilde{x_0}$. But for get_velocity, I can't see intuitively what noise minus a latent means.

stillbetter · 2024-12-10T12:32:18Z

Let me make question 2 clearer: why does the same v_prediction correspond to two inverted formulas, one in train.py and one in the step function?
