📌 Table of Contents

1. DCGAN (Deep Convolutional GAN)
2. WGAN-GP (Wasserstein GAN with Gradient Penalty)
3. CGAN (Conditional GAN)
4. Summary

๐Ÿง  preview: 

GAN์€ ์ƒ์„ฑ์ž์™€ ํŒ๋ณ„์ž๋ผ๋Š” ๋‘ ๋ชจ๋“ˆ๊ฐ„์˜ ์‹ธ์›€์ด๋‹ค.
์ƒ์„ฑ์ž: random noise๋ฅผ ๊ธฐ์กด dataset์—์„œ samplingํ•œ ๊ฒƒ์ฒ˜๋Ÿผ ๋ณด์ด๋„๋ก ๋ณ€ํ™˜
ํŒ๋ณ„์ž: sample์ด ๊ธฐ์กด dataset์—์„œ์ธ์ง€, ์ƒ์„ฑ์ž์—์„œ๋‚˜์™”๋Š”์ง€ ์˜ˆ์ธก.

 

 

 

 

 

 

 

 


1. DCGAN (Deep Convolutional GAN)

DCGAN

Introduced in a 2015 paper; see the original for details.


💸 Generator

Goal: generate images the discriminator cannot tell apart from real ones.
Input: a vector sampled from a multivariate standard normal distribution.
Output: an image the same size as the images in the original training data.

Does this description sound like a VAE?
The generator indeed serves the same purpose as a VAE's decoder:
it lets us manipulate vectors in latent space to change high-level features of images in the original domain.
import torch
import torch.nn as nn

# input_latent (latent vector size), device, and initialize_weights
# come from the rest of the training code.
Gen = nn.Sequential(
    nn.ConvTranspose2d(input_latent, 512, 4, 1, 0, bias=False),  # (latent,1,1) -> (512,4,4)
    nn.BatchNorm2d(512),
    nn.ReLU(True),

    nn.ConvTranspose2d(512, 256, 4, 2, 1, bias=False),           # -> (256,8,8)
    nn.BatchNorm2d(256),
    nn.ReLU(True),

    nn.ConvTranspose2d(256, 128, 4, 2, 1, bias=False),           # -> (128,16,16)
    nn.BatchNorm2d(128),
    nn.ReLU(True),

    nn.ConvTranspose2d(128, 64, 4, 2, 1, bias=False),            # -> (64,32,32)
    nn.BatchNorm2d(64),
    nn.ReLU(True),

    nn.ConvTranspose2d(64, 3, 4, 2, 1, bias=False),              # -> (3,64,64)
    nn.Tanh()                                                    # outputs scaled to [-1,1]
)

Gen = Gen.to(device)
Gen.apply(initialize_weights)
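The `initialize_weights` helper comes from the linked training code and isn't shown in this post; a typical DCGAN-style version, assuming the paper's N(0, 0.02) initialization, might look like:

```python
import torch.nn as nn

def initialize_weights(m):
    # DCGAN-style init (an assumed implementation): conv weights ~ N(0, 0.02),
    # BatchNorm weights ~ N(1, 0.02) with zero bias
    classname = m.__class__.__name__
    if 'Conv' in classname:
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif 'BatchNorm' in classname:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.constant_(m.bias.data, 0.0)
```

`Model.apply` calls this on every submodule, which is why the function dispatches on the class name.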





๐Ÿ” Discriminator

๋ชฉํ‘œ: img๊ฐ€ ์ง„์งœ์ธ์ง€, ๊ฐ€์งœ์ธ์ง€ ์˜ˆ์ธก.
๋งˆ์ง€๋ง‰ Conv2D์ธต์—์„œ Sigmoid๋ฅผ ์ด์šฉํ•ด 0๊ณผ 1์‚ฌ์ด ์ˆซ์ž๋กœ ์ถœ๋ ฅ.
Dis = nn.Sequential(
    nn.Conv2d(3, 64, 4, 2, 1, bias=False),        # (3,64,64) -> (64,32,32)
    nn.LeakyReLU(0.2, inplace=True),

    nn.Conv2d(64, 128, 4, 2, 1, bias=False),      # -> (128,16,16)
    nn.BatchNorm2d(128),
    nn.LeakyReLU(0.2, inplace=True),

    nn.Conv2d(128, 256, 4, 2, 1, bias=False),     # -> (256,8,8)
    nn.BatchNorm2d(256),
    nn.LeakyReLU(0.2, inplace=True),

    nn.Conv2d(256, 512, 4, 2, 1, bias=False),     # -> (512,4,4)
    nn.BatchNorm2d(512),
    nn.LeakyReLU(0.2, inplace=True),

    nn.Conv2d(512, 1, 4, 1, 0, bias=False),       # -> (1,1,1): one score per image
    nn.Sigmoid()
)


Dis = Dis.to(device)
Dis.apply(initialize_weights)




🔨 Train

Generate a batch of images → pass them through the discriminator → get a score for each image.
∙ G_Loss: BCELoss (0: fake img / 1: real img)
∙ D_Loss: BCELoss (0: fake img / 1: real img)
The two networks must be trained alternately so that only one network's weights are updated at a time.
criterion = nn.BCELoss()

Gen_optimizer = torch.optim.Adam(Gen.parameters(), lr=0.0002, betas=(0.5, 0.999))
Dis_optimizer = torch.optim.Adam(Dis.parameters(), lr=0.0002, betas=(0.5, 0.999))

For the rest of the training code, see: https://github.com/V2LLAIN/Vision_Generation/blob/main/Implicit_Density/DCGAN/train.py
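To make the alternating update concrete, here is a minimal sketch of one training step; the `dcgan_train_step` helper and its argument names are illustrative, not taken from the linked code:

```python
import torch
import torch.nn as nn

def dcgan_train_step(Gen, Dis, real_imgs, latent_dim, g_opt, d_opt, criterion):
    """One alternating DCGAN update: discriminator first, then generator."""
    b = real_imgs.size(0)
    device = real_imgs.device
    real_labels = torch.ones(b, device=device)
    fake_labels = torch.zeros(b, device=device)

    # --- Discriminator step: real -> 1, fake -> 0 ---
    z = torch.randn(b, latent_dim, 1, 1, device=device)
    fake_imgs = Gen(z).detach()  # detach: no gradient flows into the generator
    d_loss = (criterion(Dis(real_imgs).view(-1), real_labels) +
              criterion(Dis(fake_imgs).view(-1), fake_labels))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # --- Generator step: try to make D output 1 on fakes;
    #     only the generator's optimizer steps here ---
    z = torch.randn(b, latent_dim, 1, 1, device=device)
    g_loss = criterion(Dis(Gen(z)).view(-1), real_labels)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

    return d_loss.item(), g_loss.item()
```

The `detach()` in the discriminator step and the separate optimizers are what ensure only one network's weights change at a time.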


Note that DCGAN training can be unstable (∵ the discriminator and generator keep competing for the upper hand).
Given enough time, the discriminator tends to win out.
By that point, however, the generator can already produce sufficiently high-quality images, so this is not a major problem.


Label Smoothing

Adding a little random noise to a GAN is also useful: it improves training stability and increases image sharpness.
(Similar in spirit to a denoising autoencoder.)

GAN Training Tips & Tricks

∙ When D >> G:

If the discriminator is too strong, the loss signal becomes too weak
to drive any meaningful improvement in the generator.
We therefore need ways to weaken the discriminator:
∙ Increase the discriminator's dropout rate.
∙ Decrease the discriminator's learning rate.
∙ Reduce the number of conv filters in the discriminator.
∙ Add noise to the labels when training the discriminator (label smoothing).
∙ Randomly flip the labels of some images when training the discriminator.


∙ When G >> D:

Mode collapse: the generator "easily fools" the discriminator with a small set of nearly identical images.
Mode: a single sample that always fools the discriminator.

The generator tends to search for such a mode,
and it can end up mapping every point in latent space to this one image.
Moreover, since the gradient of the loss collapses to values near 0, it is hard to escape this state.




∙ An uninformative loss

One might assume that the smaller the loss, the better the generated images.
But the generator is only ever evaluated against the current discriminator,
and since the discriminator keeps improving, losses evaluated at different points in training cannot be compared.
Indeed, the discriminator loss tends to decrease while the generator loss increases → this is why monitoring GAN training is hard.

2. WGAN-GP (Wasserstein GAN with Gradient Penalty)

GAN Loss

Let's revisit the BCE loss used to train the GAN's discriminator and generator.

Training the discriminator D: compare the prediction p_i = D(x_i) on a real image with the target y_i = 1.
Training the generator G: compare the prediction p_i = D(G(z_i)) on a generated image with the target y_i = 0.

[GAN D_Loss, which D maximizes]:
(1/n) Σ_i [ log D(x_i) + log(1 - D(G(z_i))) ]

[GAN G_Loss, which G minimizes]:
(1/n) Σ_i log(1 - D(G(z_i)))




Wasserstein Loss

[Differences from the GAN loss]:
∙ Targets y_i = 1 and y_i = -1 are used instead of 1 and 0.
∙ The sigmoid is removed from D's final layer.
→ The prediction p_i is no longer confined to [0,1] and can be any number in (-∞, ∞).
For these reasons, the WGAN discriminator is usually called a critic, and it returns a "score" rather than a probability.

[Wasserstein loss function]:
To train the WGAN critic D,
compare the prediction D(x_i) on a real image with the target (= 1),
and the prediction D(G(z_i)) on a generated image with the target (= -1),
∴ giving the loss -(1/n) Σ_i [ y_i · p_i ].


[Minimizing the WGAN critic D_Loss]: maximizes the gap between predictions on real and generated images.

[Minimizing the WGAN G_Loss]: generates images that receive the highest possible score from the critic.
(= fooling the critic into thinking they are real images.)
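In code, these two objectives reduce to a sign flip on mean critic scores; a minimal sketch with illustrative function names:

```python
import torch

def critic_loss(critic_real, critic_fake):
    # Minimizing this maximizes D(real) - D(fake), i.e. targets y=1 / y=-1
    return -(critic_real.mean() - critic_fake.mean())

def generator_loss(critic_fake):
    # The generator wants the critic's score on fakes to be as high as possible
    return -critic_fake.mean()
```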

1-Lipschitz Continuous Functions

Letting the critic output any number in (-∞, ∞), instead of confining it to [0,1] with a sigmoid,
means the Wasserstein loss can be arbitrarily large, and very large numbers are generally to be avoided in neural networks.

Hence, "an additional constraint on the critic is needed".
Specifically, the critic must be a 1-Lipschitz continuous function. Let's look at what that means.

The critic is a function D that maps an image to a single prediction.
For any two input images x1 and x2, D is 1-Lipschitz if it satisfies:
|D(x1) - D(x2)| / |x1 - x2| ≤ 1
where |x1 - x2| is the mean absolute pixel difference between the two images,
and |D(x1) - D(x2)| is the absolute difference between the critic's predictions.

In short, the absolute value of the slope must be at most 1 everywhere;
i.e., the rate at which the critic's prediction can change between two images must be bounded.

WGAN-GP

WGAN์˜ Critic์˜ ๊ฐ€์ค‘์น˜๋ฅผ ์ž‘์€ [-0.01, 0.01]๋ฒ”์œ„์— ๋†“์ด๋„๋ก
train batch ์ดํ›„ weight clipping์œผ๋กœ Lipshitz์ œ์•ฝ์„ ๋ถ€๊ณผํ•œ๋‹ค.

์ด๋•Œ, ํ•™์Šต์†๋„๊ฐ€ ํฌ๊ฒŒ ๊ฐ์†Œํ•˜๊ธฐ์— Lipshitz์ œ์•ฝ์„ ์œ„ํ•ด ๋‹ค๋ฅธ ๋ฐฉ๋ฒ•์„ ์ ์šฉํ•œ๋‹ค:
๋ฐ”๋กœ Wesserstein GAN-Gradient Penalty์ด๋‹ค.
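For comparison, the original WGAN clipping step is tiny; a sketch (the helper name is illustrative):

```python
import torch.nn as nn

def clip_critic_weights(critic: nn.Module, clip_value: float = 0.01):
    # Original WGAN: after each batch, clamp every parameter into [-0.01, 0.01]
    for p in critic.parameters():
        p.data.clamp_(-clip_value, clip_value)
```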

[WGAN-GP]: penalize the model whenever the gradient norm deviates from 1.

[Gradient Penalty Loss]:
the squared difference between 1 and the norm of the gradient of the prediction with respect to the input image.
Since the model naturally seeks weights that minimize the GP term, it is encouraged to follow the Lipschitz constraint.

Computing gradients everywhere during training is intractable, so WGAN-GP evaluates the gradient only at some points:
namely, at interpolated images between pairs of real and fake images.
import torch
from torch.autograd import grad as torch_grad

# Standalone version of the penalty (originally a class method)
def gradient_penalty(critic, real_data, generated_data, gp_weight=10.0):
    batch_size = real_data.size(0)

    # Interpolate between real and fake images
    alpha = torch.rand(batch_size, 1, 1, 1, device=real_data.device)
    alpha = alpha.expand_as(real_data)
    interpolated = (alpha * real_data + (1 - alpha) * generated_data).requires_grad_(True)

    # Critic score for the interpolated examples
    prob_interpolated = critic(interpolated)

    # Gradient of the critic score with respect to the interpolated input
    gradients = torch_grad(outputs=prob_interpolated, inputs=interpolated,
                           grad_outputs=torch.ones_like(prob_interpolated),
                           create_graph=True, retain_graph=True)[0]

    # Gradients have shape (B, C, H, W); flatten to take a per-example norm
    gradients = gradients.view(batch_size, -1)

    # Derivatives of the gradient close to 0 can cause problems because of
    # the square root, so manually compute the norm with an epsilon
    gradients_norm = torch.sqrt(torch.sum(gradients ** 2, dim=1) + 1e-12)

    # Penalize deviation of the gradient norm from 1
    return gp_weight * ((gradients_norm - 1) ** 2).mean()
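Putting the pieces together, one critic update's loss is the Wasserstein term plus the penalty. A self-contained sketch (the function name and the gp weight of 10.0 are assumptions, the latter following common practice):

```python
import torch
from torch.autograd import grad as torch_grad

def wgan_gp_critic_loss(critic, real, fake, gp_weight=10.0):
    # Wasserstein term: push real scores up and fake scores down
    w_loss = -(critic(real).mean() - critic(fake).mean())

    # Gradient penalty on random interpolates between real and fake images
    alpha = torch.rand(real.size(0), 1, 1, 1, device=real.device).expand_as(real)
    interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    scores = critic(interp)
    grads = torch_grad(outputs=scores, inputs=interp,
                       grad_outputs=torch.ones_like(scores),
                       create_graph=True)[0].view(real.size(0), -1)
    gp = gp_weight * ((grads.norm(2, dim=1) - 1) ** 2).mean()
    return w_loss + gp
```

Because of `create_graph=True`, backpropagating through this loss also differentiates through the gradient norm, which is what actually enforces the constraint.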

[Batch Normalization in WGAN-GP]

BN creates correlation between images in the same batch,
which weakens the effect of the gradient penalty loss.
For this reason, WGAN-GP must not use BN in the critic.

3. CGAN (Conditional GAN)

prev.

The models described so far were "GANs that generate realistic images from a given training set".
However, "we could not control the type of image being generated."
(e.g., image types such as large vs. small bricks, black vs. blond hair, etc.)

We can sample a random point from latent space,
but we cannot easily tell which kind of image a given latent variable will produce.

CGAN

[GAN vs. CGAN]:
Unlike a GAN, a CGAN "passes extra label-related information to the generator and the critic".
∙ Generator: simply appends this information, as a one-hot encoded vector, to the latent space sample.
∙ Critic: appends the label information as extra channels alongside the RGB channels of the image.
→ The one-hot encoded vector is repeated until it matches the size of the input image.

[The only structural change]:
concatenating the label information to the existing inputs of G and D.
class Generator(nn.Module):
    def __init__(self, generator_layer_size, z_size, img_size, class_num):
        super().__init__()
        
        self.z_size = z_size
        self.img_size = img_size
        
        self.label_emb = nn.Embedding(class_num, class_num)
     
        self.model = nn.Sequential(
            nn.Linear(self.z_size + class_num, generator_layer_size[0]),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(generator_layer_size[0], generator_layer_size[1]),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(generator_layer_size[1], generator_layer_size[2]),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(generator_layer_size[2], self.img_size * self.img_size),
            nn.Tanh()
        )
    
    def forward(self, z, labels):
        
        # Reshape z
        z = z.view(-1, self.z_size)
        
        # One-hot vector to embedding vector
        c = self.label_emb(labels)
        
        # Concat image & label
        x = torch.cat([z, c], 1)
        
        # Generator out
        out = self.model(x)
        
        return out.view(-1, self.img_size, self.img_size)

class Discriminator(nn.Module):
    def __init__(self, discriminator_layer_size, img_size, class_num):
        super().__init__()
        
        self.label_emb = nn.Embedding(class_num, class_num)
        self.img_size = img_size
        
        self.model = nn.Sequential(
            nn.Linear(self.img_size * self.img_size + class_num, discriminator_layer_size[0]),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Dropout(0.3),
            nn.Linear(discriminator_layer_size[0], discriminator_layer_size[1]),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Dropout(0.3),
            nn.Linear(discriminator_layer_size[1], discriminator_layer_size[2]),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Dropout(0.3),
            nn.Linear(discriminator_layer_size[2], 1),
            nn.Sigmoid()
        )
    
    def forward(self, x, labels):
        
        # Reshape fake image
        x = x.view(-1, self.img_size * self.img_size)
        
        # One-hot vector to embedding vector
        c = self.label_emb(labels)
        
        # Concat image & label
        x = torch.cat([x, c], 1)
        
        # Discriminator out
        out = self.model(x)
        
        return out.squeeze()
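Stripped to its essence, the conditioning step shared by both networks above is just an embedding lookup plus a concat. A quick shape check with illustrative sizes (10 classes, latent size 100):

```python
import torch
import torch.nn as nn

class_num, z_size = 10, 100                      # illustrative sizes
label_emb = nn.Embedding(class_num, class_num)

z = torch.randn(4, z_size)                       # latent batch
labels = torch.randint(0, class_num, (4,))       # class labels
c = label_emb(labels)                            # (4, 10) label embedding
x = torch.cat([z, c], dim=1)                     # (4, 110) conditioned input
```

The first Linear layer of the generator then takes `z_size + class_num` inputs, exactly as in the class definition.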

4. Summary

Earlier we described implicit density models as modeling the density function implicitly, via a stochastic process that generates data directly.

This post introduced three kinds of GAN (for reference):
① DCGAN: suffers from mode collapse and the vanishing gradient problem.

② WGAN: stabilizes training to address DCGAN's problems.
WGAN-GP: adds a 1-Lipschitz condition during training (a term in the loss that pulls the gradient norm toward 1).

③ CGAN: the extra information needed to control the type of generated output image is provided to the networks.
Next, we will look at AR models, which are well suited to modeling sequential data.
