Adversarial Learning

2019/07/21

traditional ML is formulated as optimization; adversarial ML is formulated as a game between players (game theory)

Generative Adversarial Network (GAN)

Generative Models

General idea of GAN: for some distribution metric $D(\cdot, \cdot)$, train the generator so that the generated distribution matches the data distribution: $\min_G D(p_{\text{data}}, p_G)$

Training GANs: Two-Player Game

  • Generator network (G)
    • tries to fool the discriminator by generating real-looking images (a complex mapping from noise to image space)
  • Discriminator network (D)
    • tries to distinguish between real and fake images (a two-class classification model)

  • Train jointly on the minimax objective $\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p(z)}[\log(1 - D(G(z)))]$, where $D(x)$ is the discriminator output for real data $x$ and $D(G(z))$ for generated fake data $G(z)$

  • max over D s.t. D(x) is close to 1 (real) and D(G(z)) is close to 0 (fake)
    • gradient ascent on discriminator D
    • small gradient for good samples, large gradient for bad samples
  • min over G s.t. D(G(z)) is close to 1 (D is fooled into thinking G(z) is real)
    • gradient descent on generator G
    • small gradient for bad samples, large gradient for good samples
    • optimizing the original generator objective $\log(1 - D(G(z)))$ does not work well due to vanishing gradients; in practice the generator maximizes $\log D(G(z))$ instead
  • The two-player game amounts to matching two empirical distributions, $p_{\text{data}}$ and $p_G$
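The vanishing-gradient point above can be checked numerically. The sketch below (a toy calculation, assuming a sigmoid discriminator output $D = \sigma(s)$ on logit $s$) compares the generator gradient of the saturating loss $\log(1 - D(G(z)))$ with the non-saturating alternative $-\log D(G(z))$ on a confidently rejected sample:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# s = discriminator logit on a generated sample, so D(G(z)) = sigmoid(s).
# A "bad" sample is one the discriminator confidently rejects: s << 0.
s_bad = -5.0

# Saturating generator loss: minimize log(1 - D(G(z))).
# d/ds log(1 - sigmoid(s)) = -sigmoid(s)  -> vanishes when D ~ 0.
grad_saturating = -sigmoid(s_bad)

# Non-saturating trick: minimize -log D(G(z)) instead.
# d/ds [-log sigmoid(s)] = -(1 - sigmoid(s))  -> stays large when D ~ 0.
grad_nonsaturating = -(1.0 - sigmoid(s_bad))

print(abs(grad_saturating))      # tiny: vanishing training signal
print(abs(grad_nonsaturating))   # close to 1: useful training signal
```

For a bad sample ($D(G(z)) \approx 0.007$) the saturating loss gives an almost-zero gradient while the non-saturating loss keeps a near-unit gradient, which is exactly why the trick is used in practice.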

Wasserstein GAN (WGAN)

If $\mathbb{E}_{x \sim P}[f(x)] = \mathbb{E}_{x \sim Q}[f(x)]$ for all functions $f$, then $P = Q$

Measure the distance between distributions by the dual form of the Wasserstein distance: $W(P, Q) = \sup_{\|f\|_L \le 1} \mathbb{E}_{x \sim P}[f(x)] - \mathbb{E}_{x \sim Q}[f(x)]$. Using a network $D$ to parametrize the 1-Lipschitz function class gives the Wasserstein GAN objective: $\min_G \max_{\|D\|_L \le 1} \mathbb{E}_{x \sim p_{\text{data}}}[D(x)] - \mathbb{E}_{z}[D(G(z))]$
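As an illustration of the dual form (a toy calculation, not WGAN training), the sketch below estimates the Wasserstein distance between two 1-D Gaussians by searching over a linear 1-Lipschitz critic $f(x) = wx$ with $|w| \le 1$; clipping the weight plays the role of the Lipschitz constraint, as in the original WGAN:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two 1-D empirical distributions: P = N(0,1), Q = N(2,1).
# Their true 1-Wasserstein distance is |0 - 2| = 2.
P = rng.normal(0.0, 1.0, 10000)
Q = rng.normal(2.0, 1.0, 10000)

# Dual form: W(P,Q) = sup_{|f|_L <= 1} E_P[f(x)] - E_Q[f(x)].
# Critic family: linear f(x) = w*x with the weight clipped to |w| <= 1.
best = -np.inf
for w in np.linspace(-1.0, 1.0, 201):
    est = np.mean(w * P) - np.mean(w * Q)
    best = max(best, est)

print(best)   # close to 2.0, attained at w = -1
```

The supremum is attained at the boundary of the clipped weight range, recovering the distance between the two means.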

Adversarial Training

find a small perturbation (noise) that keeps the appearance of the image unchanged but causes the network to make a mistake

Reason for adversarial example

  • overfitting
  • local linearity(ReLU)
  • If a sample lies near the classification surface while the surface is locally too linear to bend around it, the sample is exposed to the classification surface and an adversarial example can be created nearby
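The local-linearity argument can be made concrete: for a locally linear score $w \cdot x$, the worst-case $\epsilon$-bounded perturbation $\delta = \epsilon \cdot \mathrm{sign}(w)$ shifts the score by $\epsilon \|w\|_1$, which grows with the input dimension even though each coordinate changes by only $\epsilon$. A minimal numeric check (with hypothetical random weights):

```python
import numpy as np

rng = np.random.default_rng(1)

d = 1000                        # input dimension
eps = 0.01                      # per-coordinate perturbation bound (L-infinity)
w = rng.choice([-1.0, 1.0], d)  # weights of a locally linear score w.x
x = rng.normal(0.0, 1.0, d)

# Worst-case perturbation inside the eps-ball for a linear score.
delta = eps * np.sign(w)

shift = w @ (x + delta) - w @ x
print(shift)   # = eps * ||w||_1 = 0.01 * 1000 = 10.0
```

An imperceptible per-pixel change of 0.01 moves the score by 10, so in high dimension many small coordinate changes add up to a large change in the output.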

Attacking Methods

White-Box Attack

  • bounded attack methods (perturbation constrained to $\|\delta\| \le \epsilon$)
    • targeted attack (force a specific wrong label)
    • untargeted attack (force any wrong label)
    • FGSM (Fast Gradient Sign Method): one step $x + \epsilon \cdot \mathrm{sign}(\nabla_x L)$
    • PGD (Projected Gradient Descent): iterated smaller steps, each projected back into the $\epsilon$-ball
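A minimal sketch of both attacks on a toy linear classifier (the model and numbers are illustrative, not from the original notes):

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# Toy linear classifier: predict +1 if w.x > 0, else -1.
w = np.array([1.0, 1.0])
x = np.array([0.3, 0.3])   # clean input, true label y = +1
y = 1.0

def loss_grad_x(x):
    # Cross-entropy loss L = -log sigmoid(y * w.x); gradient w.r.t. x.
    return -y * (1.0 - sigmoid(y * (w @ x))) * w

eps = 0.5

# FGSM (untargeted): one step of size eps in the sign of the loss gradient.
x_fgsm = x + eps * np.sign(loss_grad_x(x))

# PGD: several smaller steps, each projected back into the eps-ball around x.
x_pgd = x.copy()
for _ in range(10):
    x_pgd = x_pgd + 0.1 * np.sign(loss_grad_x(x_pgd))
    x_pgd = np.clip(x_pgd, x - eps, x + eps)   # L-infinity projection

print(w @ x)       # 0.6  -> classified +1 (correct)
print(w @ x_fgsm)  # -0.4 -> classified -1 (fooled)
print(w @ x_pgd)   # -0.4 -> classified -1 (fooled)
```

For a linear model one FGSM step is already optimal within the $\epsilon$-ball; PGD matters for nonlinear networks, where repeated small projected steps find stronger perturbations than a single large one.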

Black-Box Attack

without access to the target model's gradients, craft adversarial examples on a substitute model and use their transferability to attack the target model
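A minimal transferability sketch (both models are hypothetical linear classifiers with nearby decision boundaries): an FGSM example crafted with only the substitute model's gradient also fools the target:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# Substitute model (white-box access) and target model (black-box):
# trained on similar data, so their boundaries are close but not identical.
w_sub    = np.array([1.0, 0.9])
w_target = np.array([0.9, 1.1])

x, y = np.array([0.3, 0.3]), 1.0
eps = 0.5

# FGSM using ONLY the substitute model's gradient.
grad = -y * (1.0 - sigmoid(y * (w_sub @ x))) * w_sub
x_adv = x + eps * np.sign(grad)

print(w_target @ x)      # 0.6  -> target classifies +1 (correct)
print(w_target @ x_adv)  # -0.4 -> target is fooled by the transferred example
```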

Defense Methods

Adversarial Training

train the network on adversarial examples generated on the fly so that it learns to classify them correctly
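A minimal sketch of adversarial training on a toy logistic-regression problem (an illustrative setup: the inner maximization $\max_{\|\delta\| \le \epsilon} L$ is approximated by one FGSM step, the outer minimization by gradient descent):

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# Toy 2-D dataset: label +1 if x0 + x1 > 0, else -1.
X = rng.normal(0.0, 1.0, (200, 2))
Y = np.where(X.sum(axis=1) > 0, 1.0, -1.0)

eps, lr = 0.1, 0.5
w = np.zeros(2)

for _ in range(200):
    # Inner max (one FGSM step): perturb each input to increase its loss.
    margin = Y * (X @ w)
    grad_x = (-(1.0 - sigmoid(margin)) * Y)[:, None] * w[None, :]
    X_adv = X + eps * np.sign(grad_x)
    # Outer min: ordinary gradient step on the adversarial batch.
    margin_adv = Y * (X_adv @ w)
    grad_w = np.mean((-(1.0 - sigmoid(margin_adv)) * Y)[:, None] * X_adv, axis=0)
    w = w - lr * grad_w

# Robust accuracy: fraction still classified correctly under a fresh FGSM attack.
margin = Y * (X @ w)
grad_x = (-(1.0 - sigmoid(margin)) * Y)[:, None] * w[None, :]
X_test_adv = X + eps * np.sign(grad_x)
robust_acc = np.mean(Y * (X_test_adv @ w) > 0)
print(robust_acc)
```

Only the samples lying within $\epsilon$ of the decision boundary remain vulnerable; everything else survives the bounded attack.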
