ConvTranspose2d Encoder Decoder Conv parameter 4, 2, 1 => 5, 2, 2 #11

lunaB · 2021-01-05T18:40:48Z

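@khodid After hearing Harry's explanation of the zi2zi project, I tried changing the Encoder/Decoder net structure.

What Harry added

Previously we used Conv2d 4,2,1 (kernel size, stride, padding) to scale the size by a factor of 2 (a stride-2 Conv2d halves it). Switching Conv2d to 5,2,2 gives the same factor of 2.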
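The equivalence can be checked with the standard Conv2d output-size formula, out = floor((n + 2p - k) / s) + 1. This is a minimal sketch in plain Python; the helper name `conv_out` is hypothetical and not part of the project code, and dilation 1 is assumed.

```python
def conv_out(n, kernel, stride, padding):
    """Output length of a Conv2d along one axis (dilation 1 assumed)."""
    return (n + 2 * padding - kernel) // stride + 1

for n in (128, 256):
    # Both settings halve an even input size.
    print(n, conv_out(n, 4, 2, 1), conv_out(n, 5, 2, 2))
# 128 64 64
# 256 128 128
```

The two settings only agree for even input sizes; for an odd size such as 7 they give 3 and 4 respectively.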

What I added after the explanation

With ConvTranspose2d, using 5,2,2 makes the output size come out one short, so I added output_padding=1 to make it match. (5,2,2,1)
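The mismatch follows from the ConvTranspose2d output-size formula, out = (n - 1) * s - 2p + k + output_padding. The sketch below assumes dilation 1; the helper name `conv_transpose_out` is hypothetical and only serves to illustrate the arithmetic.

```python
def conv_transpose_out(n, kernel, stride, padding, output_padding=0):
    """Output length of a ConvTranspose2d along one axis (dilation 1 assumed)."""
    return (n - 1) * stride - 2 * padding + kernel + output_padding

n = 64
print(conv_transpose_out(n, 4, 2, 1))     # 128: the 4,2,1 setting doubles exactly
print(conv_transpose_out(n, 5, 2, 2))     # 127: one pixel short of 2n
print(conv_transpose_out(n, 5, 2, 2, 1))  # 128: output_padding=1 restores 2n
```

In general 4,2,1 gives 2n, 5,2,2 gives 2n - 1, and 5,2,2 with output_padding=1 gives 2n again, which is why the extra argument is needed for the skip connections to line up.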

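Differences

I did not add any layers to the existing model.

Results

It does seem to behave better than before, but the output came out somewhat ambiguous. I wonder whether interpolation is possible with shapes like these.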

khodid · 2021-01-05T18:45:55Z

Description

I decided to imitate the structure of the zi2zi model, which many font-related GAN projects use as a reference.

Changes
- Input image size changed from 128 x 128 to 256 x 256
- One more layer added
- No size-preserving layers in the decoder
- kernel size = (5, 5), stride = (2, 2), padding = (2, 2)
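As a rough check of how the input size and the layer count relate, the sketch below traces the feature-map size through a stack of stride-2 convolutions with the 5, 2, 2 setting. It assumes eight encoder layers for the 256 x 256 input and, for comparison, seven for 128 x 128; the helper names are hypothetical.

```python
def conv_out(n, kernel=5, stride=2, padding=2):
    """Output length of a Conv2d along one axis (dilation 1 assumed)."""
    return (n + 2 * padding - kernel) // stride + 1

def trace(n, layers):
    """Feature-map sizes after each of `layers` stride-2 convolutions."""
    sizes = []
    for _ in range(layers):
        n = conv_out(n)
        sizes.append(n)
    return sizes

print(trace(256, 8))  # [128, 64, 32, 16, 8, 4, 2, 1]
print(trace(128, 7))  # [64, 32, 16, 8, 4, 2, 1]
```

Doubling the input side length requires one more halving step to reach a 1 x 1 bottleneck, which is consistent with the extra layer listed above.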
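
Conclusion in three lines
- The training process is a bit unusual. In some cases large patches of black or white noise appear partway through.
- The blurring may be somewhat reduced, or it may not be; it is hard to tell.
- Either way, we are still far from the target model.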

Model

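import torch
import torch.nn as nn

# init_weights is assumed to be defined elsewhere in the project.

class Encoder(nn.Module):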
    def __init__(self):
        super(Encoder, self).__init__()
        cs = 64

        # in, out, k, s, p, d
        self.e1 = nn.Sequential(
            nn.Conv2d(1, cs, 5, 2, 2)  # reference: zi2zi uses kernel size = (5, 5), stride = (2, 2), padding = 'same'
        )
        self.e2 = nn.Sequential(
            nn.LeakyReLU(0.2),
            nn.Conv2d(cs, cs*2, 5, 2, 2),
            nn.BatchNorm2d(cs*2),
        )
        self.e3 = nn.Sequential(
            nn.LeakyReLU(0.2),
            nn.Conv2d(cs*2, cs*4, 5, 2, 2),
            nn.BatchNorm2d(cs*4),
        )
        self.e4 = nn.Sequential(
            nn.LeakyReLU(0.2),
            nn.Conv2d(cs*4, cs*8, 5, 2, 2),
            nn.BatchNorm2d(cs*8),
        )
        self.e5 = nn.Sequential(
            nn.LeakyReLU(0.2),
            nn.Conv2d(cs*8, cs*8, 5, 2, 2),
            nn.BatchNorm2d(cs*8),
        )
        self.e6 = nn.Sequential(
            nn.LeakyReLU(0.2),
            nn.Conv2d(cs*8, cs*8, 5, 2, 2),
            nn.BatchNorm2d(cs*8)
        )
        # LeakyReLU(0.2) is assumed for e7 and e8 to match e2-e6;
        # nn.ReLU has no slope parameter (its only argument is inplace).
        self.e7 = nn.Sequential(
            nn.LeakyReLU(0.2),
            nn.Conv2d(cs*8, cs*8, 5, 2, 2),
            nn.BatchNorm2d(cs*8)
        )
        self.e8 = nn.Sequential(
            nn.LeakyReLU(0.2),
            nn.Conv2d(cs*8, cs*8, 5, 2, 2),
            nn.BatchNorm2d(cs*8)
        )

        self.e1.apply(init_weights)
        self.e2.apply(init_weights)
        self.e3.apply(init_weights)
        self.e4.apply(init_weights)
        self.e5.apply(init_weights)
        self.e6.apply(init_weights)
        self.e7.apply(init_weights)
        self.e8.apply(init_weights)

    def forward(self, x):
        d = dict()
        x = self.e1(x)
        d['e1'] = x
        x = self.e2(x)
        d['e2'] = x
        x = self.e3(x)
        d['e3'] = x
        x = self.e4(x)
        d['e4'] = x
        x = self.e5(x)
        d['e5'] = x
        x = self.e6(x)
        d['e6'] = x
        x = self.e7(x)
        d['e7'] = x
        x = self.e8(x)
        d['e8'] = x
        return x, d

        
class Decoder(nn.Module):
    def __init__(self):
        super(Decoder, self).__init__()
        cs = 64

        # 128 = style vector
        self.d1 = nn.Sequential(
            nn.ReLU(),  # nn.ReLU takes no slope argument (its only parameter is inplace)
            nn.ConvTranspose2d(cs*8+128, cs*8, 4, stride=(2, 2), padding=1),
            # nn.BatchNorm2d(cs*8),
            nn.Dropout(0.5)
        )
        self.d2 = nn.Sequential(
            nn.ReLU(),
            nn.ConvTranspose2d(cs*16, cs*8, 4, stride=(2,2), padding=1),
            #nn.BatchNorm2d(cs*8),
            nn.Dropout(0.5),
        )
        self.d3 = nn.Sequential(
            nn.ReLU(),
            nn.ConvTranspose2d(cs*16, cs*8, 4, stride=(2,2), padding=1),
            #nn.BatchNorm2d(cs*8),
            nn.Dropout(0.5),
        )
        self.d4 = nn.Sequential(
            nn.ReLU(),
            nn.ConvTranspose2d(cs*16, cs*8, 4, stride=(2,2), padding=1),
            nn.BatchNorm2d(cs*8),
        )
        self.d5 = nn.Sequential(
            nn.ReLU(),
            nn.ConvTranspose2d(cs*16, cs*4, 4, stride=(2,2), padding=1),
            nn.BatchNorm2d(cs*4),
        )
        self.d6 = nn.Sequential(
            nn.ReLU(),
            nn.ConvTranspose2d(cs*8, cs*2, 4, stride=(2,2), padding=1),
            nn.BatchNorm2d(cs*2),
        )
        self.d7 = nn.Sequential(
            nn.ReLU(),
            nn.ConvTranspose2d(cs*4, cs*1, 4, stride=(2,2), padding=1),
            nn.BatchNorm2d(cs),
        )
        self.d8 = nn.Sequential(
            nn.ReLU(),
            nn.ConvTranspose2d(cs*2, 1, 4, stride=(2,2), padding=1),
            # nn.BatchNorm2d(cs),
            # nn.LeakyReLU(0.2),
            # nn.Sigmoid(),
            nn.Tanh(),
        )

        self.d1.apply(init_weights)
        self.d2.apply(init_weights)
        self.d3.apply(init_weights)
        self.d4.apply(init_weights)
        self.d5.apply(init_weights)
        self.d6.apply(init_weights)
        self.d7.apply(init_weights)
        self.d8.apply(init_weights)

    def forward(self, x, e):
        x = self.d1(x)
        # print('x after d1 = [{}], e7 looks like = [{}]'.format(x.shape, e['e7'].shape))
        x = torch.cat((x, e['e7']), dim=1)
        x = self.d2(x)
        x = torch.cat((x, e['e6']), dim=1)
        x = self.d3(x)
        x = torch.cat((x, e['e5']), dim=1)
        x = self.d4(x)
        x = torch.cat((x, e['e4']), dim=1)
        x = self.d5(x)
        x = torch.cat((x, e['e3']), dim=1)
        x = self.d6(x)
        x = torch.cat((x, e['e2']), dim=1)
        x = self.d7(x)
        x = torch.cat((x, e['e1']), dim=1)
        x = self.d8(x)
        return x

class Discriminator(nn.Module):
    def __init__(self, category_num):
        super(Discriminator, self).__init__()
        cs = 64

        self.category_num = category_num

        self.d1 = nn.Sequential(
            # nn.LeakyReLU(0.2),
            nn.Conv2d(1, cs, 5, 2, 2),
            nn.LeakyReLU(0.2)
        )
        self.d2 = nn.Sequential(
            nn.Conv2d(cs, cs*2, 5, 2, 2),
            nn.BatchNorm2d(cs*2),
            nn.LeakyReLU(0.2),
        )
        self.d3 = nn.Sequential(
            nn.Conv2d(cs*2, cs*4, 5, 2, 2),
            nn.BatchNorm2d(cs*4),
            nn.LeakyReLU(0.2),
        )
        self.d4 = nn.Sequential(
            nn.Conv2d(cs*4, cs*8, 5, 2, 2),
            nn.BatchNorm2d(cs*8),
            nn.LeakyReLU(0.2),
        )
        self.fc_tf = nn.Sequential(
            nn.Flatten(),
            nn.Linear(cs*8*16*16, 1),
        )
        self.fc_cg = nn.Sequential(
            nn.Flatten(),
            nn.Linear(cs*8*16*16, category_num),
        )

        self.d1.apply(init_weights)
        self.d2.apply(init_weights)
        self.d3.apply(init_weights)
        self.d4.apply(init_weights)
        self.fc_tf.apply(init_weights)
        self.fc_cg.apply(init_weights)

    def forward(self, x):
        x = self.d1(x)
        x = self.d2(x)
        x = self.d3(x)
        x = self.d4(x)

        tf = self.fc_tf(x)
        cg = self.fc_cg(x)

        return tf, cg
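The channel counts in the skip connections and the size of the discriminator's Linear layers can be sanity-checked without running the network. The sketch below is plain Python with the numbers read off the class definitions above. It assumes the 128-dimensional style vector is concatenated to the encoder output before `d1`, as the `# 128 = style vector` comment suggests, and all helper names are hypothetical.

```python
cs = 64          # base channel width, as in the classes above
style_dim = 128  # assumed size of the style vector concatenated before d1

# Output channels of e1..e8, and (in, out) channels of d1..d8.
enc_out = [cs, cs*2, cs*4, cs*8, cs*8, cs*8, cs*8, cs*8]
dec_in = [cs*8 + style_dim, cs*16, cs*16, cs*16, cs*16, cs*8, cs*4, cs*2]
dec_out = [cs*8, cs*8, cs*8, cs*8, cs*4, cs*2, cs, 1]

def expected_decoder_inputs(enc_out, dec_out, style_dim):
    """in_channels each decoder block needs under U-Net style concatenation."""
    # d1 sees the bottleneck (e8) concatenated with the style vector.
    expected = [enc_out[-1] + style_dim]
    # Each later block sees the previous output concatenated with a skip: e7, ..., e1.
    skips = enc_out[-2::-1]
    for out_ch, skip_ch in zip(dec_out, skips):
        expected.append(out_ch + skip_ch)
    return expected

def conv_out(n, kernel=5, stride=2, padding=2):
    """Output length of a Conv2d along one axis (dilation 1 assumed)."""
    return (n + 2 * padding - kernel) // stride + 1

def discriminator_features(size, layers=4, channels=cs*8):
    """Flattened feature count after the discriminator's stride-2 convolutions."""
    for _ in range(layers):
        size = conv_out(size)
    return channels * size * size

print(expected_decoder_inputs(enc_out, dec_out, style_dim) == dec_in)  # True
print(discriminator_features(256))  # 131072, i.e. cs*8*16*16
```

Note that `cs*8*16*16` only matches a 256 x 256 input; a 128 x 128 image would leave an 8 x 8 map after four layers, and the Linear layers would need `cs*8*8*8` input features instead.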

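Results

Training process

In the early stage of training, the output looks as if it is drawn over the source (Gothic) glyph

Even at epoch 50, the output does not closely resemble the target image

Black noise and white noise

Epoch 150 (training complete)

Interpolation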

lunaB assigned lunaB and khodid Jan 6, 2021
