AutoEncoder

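An AutoEncoder is trained so that the output of an encode-then-decode pass moves closer and closer to its input, which makes it a form of self-supervised learning. An input image is passed through the encoder to produce a code, and the decoder then turns that code into the output. Because we control the dimensionality of the encoder's output, we can force the encoder to represent high-dimensional features with a low-dimensional code.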
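
One way to write this training objective down is as a mean squared reconstruction error over $N$ training inputs. The symbols $f$ (encoder), $g$ (decoder) and $\hat{x}$ (reconstruction) are notation assumed here for illustration, and squared error is only one common choice of loss:

$$h = f(x), \qquad \hat{x} = g(h), \qquad \mathcal{L} = \frac{1}{N}\sum_{i=1}^{N} \lVert x_i - g(f(x_i)) \rVert^2$$

Limiting the dimension of $h$ is what keeps the pair $f$, $g$ from simply copying the input through.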

Model

ref1

An autoencoder consists of two parts: an encoder and a decoder.

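The encoder takes $x$ as input and outputs $h$, which usually has fewer dimensions than $x$. For example, if we feed the encoder a $100\times 100$ image, it might output a latent code $h$ of size $100\times 1$. In this case the encoder maps the input image to a lower-dimensional space, so storing $h$ takes far less space than storing the image (100 values instead of 10,000), although this compression also discards some information from the original data.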

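The decoder takes the latent code $h$ as input and tries to reconstruct an image from it. If $h$ has size $100\times 1$, the decoder tries to rebuild a $100\times100$ image from $h$, and we train it so that its output is as close as possible to the original image.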
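
The shapes in this example can be traced with a minimal NumPy sketch. It assumes the simplest possible encoder and decoder, a single untrained linear map each; the names `W_enc`, `W_dec`, `encode` and `decode` are hypothetical, and the inputs are random stand-ins for real images.

```python
import numpy as np

rng = np.random.default_rng(0)

# A batch of 4 stand-in "images", 100x100 each, with values in [0, 1].
x = rng.random((4, 100, 100))

input_dim = 100 * 100   # 10,000 values per image
latent_dim = 100        # size of the latent code h

# Untrained, randomly initialised linear maps (hypothetical weights).
W_enc = rng.normal(scale=0.01, size=(input_dim, latent_dim))
W_dec = rng.normal(scale=0.01, size=(latent_dim, input_dim))


def encode(images):
    """Flatten each image and project it to the latent code h."""
    return images.reshape(len(images), -1) @ W_enc


def decode(h):
    """Map latent codes back to image-shaped arrays."""
    return (h @ W_dec).reshape(-1, 100, 100)


h = encode(x)
x_hat = decode(h)

# Mean squared reconstruction error: the quantity training would minimise.
mse = np.mean((x - x_hat) ** 2)
compression_ratio = input_dim / latent_dim

print(h.shape, x_hat.shape, compression_ratio)
```

With untrained weights the reconstruction error is large; training adjusts both maps to push `mse` down.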

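Dimensionality reduction only works well when the inputs are related to one another (for example, images from the same domain). If the inputs are completely random images, the reduction performs poorly. Since no label data is used anywhere in this process, the autoencoder is an unsupervised model.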
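
The effect of correlated versus random inputs can be illustrated without training a network. The sketch below swaps in a truncated SVD (PCA), which gives the best reconstruction a purely linear encoder/decoder pair can reach under squared error. It is a stand-in for an autoencoder, not the model described here, and the data sizes and the helper `linear_reconstruction_error` are assumptions made for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, latent_dim = 500, 64, 4


def linear_reconstruction_error(data, k):
    """Relative squared error after keeping only the top-k principal directions."""
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, sorted by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k]                      # (k, n_features)
    codes = centered @ basis.T          # encode: (n_samples, k)
    reconstructed = codes @ basis       # decode: (n_samples, n_features)
    return np.sum((centered - reconstructed) ** 2) / np.sum(centered ** 2)


# Correlated data: every sample mixes 4 shared patterns, plus small noise.
patterns = rng.normal(size=(latent_dim, n_features))
weights = rng.normal(size=(n_samples, latent_dim))
noise = 0.05 * rng.normal(size=(n_samples, n_features))
correlated = weights @ patterns + noise

# Unrelated data: every feature is independent noise.
random_data = rng.normal(size=(n_samples, n_features))

err_correlated = linear_reconstruction_error(correlated, latent_dim)
err_random = linear_reconstruction_error(random_data, latent_dim)
print(f"correlated: {err_correlated:.4f}, random: {err_random:.4f}")
```

A 4-dimensional code recovers the correlated data almost perfectly, while most of the random data is lost, because the random samples share no structure that a short code could capture.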

intuition1

video1

Python implementation

Tensorflow - Denoising AE

from keras.layers import Input, Flatten, Dense, Reshape
from keras import Model
from keras.optimizers import Adam
from keras.datasets import mnist
import numpy as np
from tqdm import tqdm
import matplotlib.pyplot as plt
from PIL import Image


class AE():
    def __init__(self):
        self.rows = 28
        self.cols = 28
        self.channels = 1
        self.input_shape = (self.rows, self.cols, self.channels)
        # A 2-dimensional latent code is a very tight bottleneck.
        self.latent_dim = 2
        # Older Keras releases spell this argument `lr`.
        opt = Adam(learning_rate=0.002, beta_1=0.9)

        # Chain encoder and decoder into one trainable model.
        self.encoder = self.make_encoder()
        en_input = Input(self.input_shape)
        en_out = self.encoder(en_input)
        self.decoder = self.make_decoder()
        out_img = self.decoder(en_out)
        self.ae = Model(en_input, out_img)
        # Accuracy is not meaningful for reconstruction; it is kept only so
        # that train_on_batch returns a [loss, metric] list.
        self.ae.compile(optimizer=opt, loss='mse', metrics=['accuracy'])

    def make_encoder(self):
        # 28x28x1 image -> flatten -> two hidden layers -> latent code.
        inputs = Input(self.input_shape)
        h = Flatten()(inputs)
        h = Dense(1000, activation='relu')(h)
        h = Dense(1000, activation='relu')(h)
        output = Dense(self.latent_dim)(h)
        encoder = Model(inputs, output)
        encoder.summary()
        return encoder

    def make_decoder(self):
        # Latent code -> two hidden layers -> 784 values -> 28x28x1 image.
        inputs = Input((self.latent_dim,))
        h = Dense(1000, activation='relu')(inputs)
        h = Dense(1000, activation='relu')(h)
        # tanh matches the [-1, 1] range the training images are scaled to.
        h = Dense(np.prod(self.input_shape), activation='tanh')(h)
        output = Reshape(self.input_shape)(h)
        decoder = Model(inputs, output)
        decoder.summary()
        return decoder

    def train(self, batch_size=128, steps=100):
        # Labels are loaded but never used.
        (x_train, y_train), (x_test, y_test) = mnist.load_data()
        # Scale pixels from [0, 255] to [-1, 1] and add a channel axis.
        x_train = (x_train - 127.5) / 127.5
        x_test = (x_test - 127.5) / 127.5
        x_train = np.expand_dims(x_train, axis=3)
        x_test = np.expand_dims(x_test, axis=3)

        losses = []
        # Each step trains on one random mini-batch, so `steps` counts
        # batches, not full passes over the dataset.
        for i in tqdm(range(steps)):
            idx = np.random.randint(0, len(x_train), batch_size)
            input_imgs = x_train[idx, :, :, :]
            # The input is also the target.
            loss = self.ae.train_on_batch(input_imgs, input_imgs)
            losses.append(loss[0])

        losses = np.array(losses)
        plt.plot(range(1, len(losses) + 1), losses)
        plt.show()
        self.pred(x_test)

    def pred(self, x_test):
        # Reconstruct three random test images and display them.
        idx = np.random.randint(0, len(x_test), 3)
        image = x_test[idx, :, :, :]
        generated_img = self.ae.predict(image)
        generated_img = np.reshape(generated_img, (-1, self.rows, self.cols))
        for pic in generated_img:
            # Undo the [-1, 1] scaling and convert to 8-bit grayscale for PIL.
            pic = np.clip(pic * 127.5 + 127.5, 0, 255).astype(np.uint8)
            img = Image.fromarray(pic)
            plt.imshow(img, cmap='gray')
            plt.show()


if __name__ == '__main__':
    ae = AE()
    ae.train()
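
The listing scales pixels to $[-1, 1]$ before training and maps the decoder's tanh output back to $[0, 255]$ before display. The NumPy sketch below checks that this round trip is lossless for 8-bit pixels. It uses a random array as a stand-in for an MNIST image, and the rounding and clipping step is an assumption added here to get back to integer pixel values.

```python
import numpy as np

# Stand-in for one MNIST image: 28x28 unsigned 8-bit pixels in [0, 255].
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# Forward scaling used before training: [0, 255] -> [-1, 1].
scaled = (pixels.astype(np.float32) - 127.5) / 127.5

# Inverse scaling used before display: [-1, 1] -> [0, 255].
restored = np.clip(np.rint(scaled * 127.5 + 127.5), 0, 255).astype(np.uint8)

in_range = bool(scaled.min() >= -1.0 and scaled.max() <= 1.0)
round_trip_ok = bool(np.array_equal(pixels, restored))
print(in_range, round_trip_ok)
```

Because tanh can only produce values in $(-1, 1)$, scaling the targets to the same range lets the decoder reach nearly every pixel value.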
