【deep learning】dl-with-python-Summary

Summary of Chapters 1-5. Dataset: MNIST. Architecture: a two-layer network with two Affine layers; activation: ReLU; the final layer is softmax, and the loss is cross-entropy.

Goal

Implement MNIST classification with a two-layer network.

Architecture

  • LayerOne: Input (the flattened image vector, 784-dimensional for MNIST)
    • Weights: W1, shape (input_size, hidden_size)
    • Bias: b1
    • Activation: ReLU
  • LayerTwo
    • Weights: W2, shape (hidden_size, output_size)
    • Bias: b2
    • Activation: none; the output feeds directly into the softmax
  • Output
    • Softmax
  • Loss
    • cross-entropy

Details

Fully connected (Affine) layer

  • forward : $Y = XW + b$ (with $X$ a row vector, or a batch of rows; this matches the backward formulas below)
  • backward (see the code sketch after this list):
    • upstream gradient: $dout = \frac{\partial L}{\partial Y}$
      $\frac{\partial L}{\partial X} = \frac{\partial L}{\partial Y} \cdot \frac{\partial Y}{\partial X} = dout \cdot W^T$
      $\frac{\partial L}{\partial W} = \frac{\partial Y}{\partial W} \cdot \frac{\partial L}{\partial Y} = X^T \cdot dout$
      $\frac{\partial L}{\partial b} = \frac{\partial L}{\partial Y} \cdot \frac{\partial Y}{\partial b} = dout$ (summed over the batch axis when $X$ holds a batch of rows)
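
Putting these formulas into code: a minimal NumPy sketch of the Affine layer (this Affine class is the one instantiated in the network-building code at the end of these notes):

import numpy as np

class Affine:
    def __init__(self, W, b):
        self.W = W
        self.b = b
        self.x = None     # cached input for backward
        self.dW = None
        self.db = None

    def forward(self, x):
        self.x = x
        return np.dot(x, self.W) + self.b     # Y = XW + b

    def backward(self, dout):
        dx = np.dot(dout, self.W.T)           # dL/dX = dout · W^T
        self.dW = np.dot(self.x.T, dout)      # dL/dW = X^T · dout
        self.db = np.sum(dout, axis=0)        # dL/db = dout summed over the batch
        return dx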

ReLU layer

  • $y =\begin{cases} x, & x \ge 0 \\ 0 , & x \lt 0\end{cases}$
  • $\frac{\partial y}{\partial x} = \begin{cases} 1, & x \ge 0 \\ 0 , & x \lt 0\end{cases}$
  • mask = x <= 0
  • forward : out[mask] = 0
  • backward : dout[mask] = 0
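
A matching NumPy sketch of the ReLU layer, directly implementing the mask logic above (this Relu class is the one used by the network code below):

import numpy as np

class Relu:
    def __init__(self):
        self.mask = None

    def forward(self, x):
        self.mask = (x <= 0)      # remember where the input was non-positive
        out = x.copy()
        out[self.mask] = 0        # zero those positions
        return out

    def backward(self, dout):
        dout[self.mask] = 0       # gradient is zero wherever the input was <= 0
        return dout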

Softmax with Cross-Entropy layer

  • softmax:
    • $y_k = \frac {exp(a_k)}{\sum_{i=1}^n exp(a_i)}$
      • improvement for numerical stability: $y_k = \frac {exp(a_k)}{\sum_{i=1}^n exp(a_i)} = \frac {C\,exp(a_k)}{C\sum_{i=1}^n exp(a_i)} = \frac {exp(a_k + \log C)}{\sum_{i=1}^n exp(a_i + \log C)} = \frac {exp(a_k + C')}{\sum_{i=1}^n exp(a_i + C')}$, where in practice $C' = -\max_i a_i$, so no exponent can overflow
  • cross-entropy:
    • $E = -\sum_k t_k \log y_k$, where $y_k$ is the network's output for class $k$ and $t_k$ is the ground-truth label; in $t_k$, only the correct-class entry is 1 and all others are 0 (one-hot representation)
  • softmax with cross-entropy (combined)
    • forward : apply softmax first, then compute the cross-entropy
      $y = softmax(x)$
      $loss = cross\_entropy(y, t)$
    • backward (a code sketch of the combined layer follows this list):
      $\frac{\partial L}{\partial a_k} = \sum_j \frac{\partial L}{\partial y_j}\frac{\partial y_j}{\partial a_k}$, where $\frac{\partial L}{\partial y_j} = -\frac{t_j}{y_j}$ and the softmax Jacobian is $\frac{\partial y_j}{\partial a_k} = y_j(\delta_{jk} - y_k)$, so
      $\frac{\partial L}{\partial a_k} = -\sum_j \frac{t_j}{y_j}\, y_j(\delta_{jk} - y_k) = -\sum_j t_j(\delta_{jk} - y_k) = -t_k + y_k\sum_j t_j = y_k - t_k$
      using $\sum_j t_j = 1$ for a one-hot label: the combined layer backpropagates the simple signal $y_k - t_k$
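
A minimal NumPy sketch of the combined layer, assuming batched inputs and one-hot labels. The softmax helper follows the stabilized formula above with $C' = -\max_i a_i$; this SoftmaxWithLoss class is the one used by the network code below:

import numpy as np

def softmax(x):
    x = x - np.max(x, axis=-1, keepdims=True)   # shift by the max for stability
    e = np.exp(x)
    return e / np.sum(e, axis=-1, keepdims=True)

def cross_entropy(y, t):
    eps = 1e-7                                   # avoid log(0)
    return -np.sum(t * np.log(y + eps)) / y.shape[0]

class SoftmaxWithLoss:
    def __init__(self):
        self.y = None   # softmax output
        self.t = None   # one-hot labels

    def forward(self, x, t):
        self.t = t
        self.y = softmax(x)
        return cross_entropy(self.y, t)

    def backward(self, dout=1):
        batch_size = self.t.shape[0]
        return (self.y - self.t) / batch_size    # y - t, averaged over the batch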

Data loading

  • Load MNIST with TensorFlow's bundled loader (this input_data module exists only in TensorFlow 1.x; it was removed in 2.x, see the alternative sketch after the snippets below)
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data',one_hot=True)
  • Inspect the dataset: sample counts, images, and labels
train_nums = mnist.train.num_examples
validation_nums = mnist.validation.num_examples
test_nums = mnist.test.num_examples

train_data = mnist.train.images
val_data = mnist.validation.images
test_data = mnist.test.images

train_labels = mnist.train.labels
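
Since the tutorials module above was removed in TensorFlow 2.x, a rough equivalent with tf.keras.datasets (a sketch, not the loader these notes use; note it exposes a 60k/10k train/test split with no separate validation set) would be:

import tensorflow as tf

# images come back as uint8 arrays of shape (N, 28, 28), labels as integers 0-9
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype('float32') / 255.0  # flatten to 784-dim vectors
x_test = x_test.reshape(-1, 784).astype('float32') / 255.0
y_train = tf.keras.utils.to_categorical(y_train, 10)          # same effect as one_hot=True
y_test = tf.keras.utils.to_categorical(y_test, 10)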

Building the network

  • Network definition and forward propagation
import numpy as np
from collections import OrderedDict

class TwoLayerNet:
    def __init__(self, input_size, hidden_size, output_size, weight_init_std=0.01):
        # weights start as small Gaussian noise, biases as zeros
        self.params = {}
        self.params['W1'] = weight_init_std * np.random.randn(input_size, hidden_size)
        self.params['b1'] = np.zeros(hidden_size)
        self.params['W2'] = weight_init_std * np.random.randn(hidden_size, output_size)
        self.params['b2'] = np.zeros(output_size)

        # layers in forward order: Affine1 -> ReLU -> Affine2; the loss layer is kept separate
        self.layers = OrderedDict()
        self.layers['Affine1'] = Affine(self.params['W1'], self.params['b1'])
        self.layers['Relu'] = Relu()
        self.layers['Affine2'] = Affine(self.params['W2'], self.params['b2'])
        self.lastLayer = SoftmaxWithLoss()

    def predict(self, x):
        # run the input through every layer except the loss layer
        for layer in self.layers.values():
            x = layer.forward(x)
        return x
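
The notes stop at the forward pass. As a sketch, assuming the Affine, Relu, and SoftmaxWithLoss classes from the sections above are in scope, the matching loss and backward pass could be added to TwoLayerNet like this, mirroring the layer-by-layer backpropagation derived earlier:

    def loss(self, x, t):
        # forward through the hidden layers, then through SoftmaxWithLoss
        y = self.predict(x)
        return self.lastLayer.forward(y, t)

    def gradient(self, x, t):
        self.loss(x, t)                       # forward pass (caches intermediates)
        dout = self.lastLayer.backward(1)     # starts as (y - t) / batch_size
        for layer in reversed(list(self.layers.values())):
            dout = layer.backward(dout)       # each layer consumes and passes the gradient
        return {'W1': self.layers['Affine1'].dW, 'b1': self.layers['Affine1'].db,
                'W2': self.layers['Affine2'].dW, 'b2': self.layers['Affine2'].db}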