Question about the loss #51

Open
tulvgengenr opened this issue Nov 12, 2024 · 1 comment
tulvgengenr commented Nov 12, 2024

Hello! In models/modeling_showo.py, the loss is computed as follows:

        if labels is not None:
            # 1. Mask token prediction (discrete diffusion) for image generation
            # Note that, max_seq_length indicates the maximum number of text tokens, maybe a bit confused.
            loss_t2i = F.cross_entropy(
                logits[:batch_size_t2i, max_seq_length + 1:].contiguous().view(-1, self.output_size),
                labels[:batch_size_t2i, max_seq_length + 1:].contiguous().view(-1), ignore_index=-100,
            )

            # 2. Next token prediction for language modeling
            loss_lm = F.cross_entropy(
                logits[batch_size_t2i:batch_size_t2i + batch_size_lm, :-1].contiguous().view(-1, self.output_size),
                labels[batch_size_t2i:batch_size_t2i + batch_size_lm, 1:].contiguous().view(-1), ignore_index=-100,
            )

            # 3. Next token prediction for captioning/multimodal understanding
            loss_mmu = F.cross_entropy(
                logits[-batch_size_mmu:, :-1].contiguous().view(-1, self.output_size),
                labels[-batch_size_mmu:, 1:].contiguous().view(-1), ignore_index=-100,
            )

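I have a question. For the t2i task, the cross-entropy loss is computed from logits[:batch_size_t2i, max_seq_length + 1:] and labels[:batch_size_t2i, max_seq_length + 1:]. This seems to mean that when predicting image tokens, the logits no longer represent the probability of the next token but the probability of the token at the current position. Unlike in the lm and mmu tasks, the logits and labels are not offset by one position. Is this different from conventional autoregressive generation?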

@Sierkinhane
Collaborator

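For image generation we use discrete diffusion (i.e., mask token prediction), so the image tokens are not trained with next-token prediction. For the details, please see the Preliminaries and Method sections of the paper.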
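To make the difference in label alignment concrete, here is a minimal pure-Python sketch. It is not the Show-o implementation: the helper names (`cross_entropy`, `next_token_pairs`, `masked_token_pairs`, `peaked`) and the toy sequence are hypothetical, and the only thing taken from the quoted code is the `-100` ignore value. It assumes that in mask token prediction the labels at unmasked positions are set to the ignore value, so only masked positions contribute to the loss.

```python
import math

IGNORE_INDEX = -100  # same sentinel the quoted code passes as ignore_index


def cross_entropy(logits, labels, ignore_index=IGNORE_INDEX):
    """Mean cross-entropy over positions whose label is not ignore_index."""
    total, count = 0.0, 0
    for row, label in zip(logits, labels):
        if label == ignore_index:
            continue
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[label]
        count += 1
    return total / count


def next_token_pairs(logits, labels):
    """Shifted alignment: logits at position i are scored against labels[i + 1]."""
    return logits[:-1], labels[1:]


def masked_token_pairs(logits, labels):
    """Unshifted alignment: logits at position i are scored against labels[i]."""
    return logits, labels


def peaked(token, vocab_size=3, scale=10.0):
    """Toy logits that put almost all probability on `token` (hypothetical helper)."""
    return [scale if v == token else 0.0 for v in range(vocab_size)]


tokens = [0, 2, 1, 2]  # toy ground-truth sequence over a 3-token vocabulary

# Mask token prediction: assume positions 1 and 3 were masked in the input.
# Labels are the original tokens at masked positions and IGNORE_INDEX elsewhere.
mask_labels = [IGNORE_INDEX, 2, IGNORE_INDEX, 2]
same_position_logits = [peaked(t) for t in tokens]  # predicts the token at its own position
loss_mask = cross_entropy(*masked_token_pairs(same_position_logits, mask_labels))

# Next-token prediction: position i should predict tokens[i + 1].
# The last position has no target, so its logits are dropped by the shift.
next_position_logits = [peaked(t) for t in tokens[1:]] + [peaked(0)]
loss_ntp = cross_entropy(*next_token_pairs(next_position_logits, tokens))

# Scoring same-position predictions with the shifted alignment gives a large loss.
loss_mismatched = cross_entropy(*next_token_pairs(same_position_logits, tokens))
```

In this toy setup `loss_mask` and `loss_ntp` are both close to zero, while `loss_mismatched` is large. That matches the observation in the question: the unshifted slice used for `loss_t2i` scores each image position against its own label, which is the natural alignment when the input token at that position has been replaced by a mask rather than being visible to the model.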
