- Deep Learning cheatsheet
⟶
深度學習參考手冊
- Neural Networks
⟶
神經網路
- Neural networks are a class of models that are built with layers. Commonly used types of neural networks include convolutional and recurrent neural networks.
⟶
神經網路是一種透過層 (layer) 來建構的模型。常用的神經網路類型包括卷積神經網路 (CNN) 和遞迴神經網路 (RNN)。
- Architecture ― The vocabulary around neural networks architectures is described in the figure below:
⟶
架構 - 神經網路架構相關的詞彙如下圖所示:
- [Input layer, hidden layer, output layer]
⟶
[輸入層、隱藏層、輸出層]
- By noting i the ith layer of the network and j the jth hidden unit of the layer, we have:
⟶
我們使用 i 來代表網路的第 i 層、j 來代表某一層中第 j 個隱藏神經元的話,我們可以得到下面的等式:
- where we note w, b, z the weight, bias and output respectively.
⟶
其中,我們分別使用 w 來代表權重、b 代表偏差項、z 代表輸出的結果。
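For reference, this notation resolves to the standard affine form below; the formula itself is not reproduced in this translation file, so this is a reconstruction of the usual expression rather than a copy of the original:

```latex
z_j^{[i]} = {w_j^{[i]}}^{T} x + b_j^{[i]}
```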
- Activation function ― Activation functions are used at the end of a hidden unit to introduce non-linear complexities to the model. Here are the most common ones:
⟶
激活函數 (Activation function) - 激活函數用於隱藏神經元的尾端,目的是為模型引入非線性。以下是幾種最常見的激活函數:
- [Sigmoid, Tanh, ReLU, Leaky ReLU]
⟶
[Sigmoid, Tanh, ReLU, Leaky ReLU]
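A minimal NumPy sketch of these four activations may help; the definitions are the standard ones, and the Leaky ReLU slope 0.01 is a common default rather than a value fixed by the cheatsheet:

```python
import numpy as np

def sigmoid(z):
    # Squashes inputs to (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Squashes inputs to (-1, 1).
    return np.tanh(z)

def relu(z):
    # max(0, z); gradient is zero for z < 0.
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    # Small slope alpha for z < 0 keeps units from going fully "dead".
    return np.where(z > 0, z, alpha * z)
```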
- Cross-entropy loss ― In the context of neural networks, the cross-entropy loss L(z,y) is commonly used and is defined as follows:
⟶
交叉熵損失 - 在神經網路中,我們經常使用交叉熵損失 L(z,y),其定義如下:
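The formula itself is not reproduced in this file; for reference, the usual binary form matching the L(z,y) notation above is:

```latex
L(z, y) = -\Big[ y \log(z) + (1 - y) \log(1 - z) \Big]
```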
- Learning rate ― The learning rate, often noted α or sometimes η, indicates at which pace the weights get updated. This can be fixed or adaptively changed. The current most popular method is called Adam, which is a method that adapts the learning rate.
⟶
學習速率 - 學習速率通常用 α 或 η 來表示,用來控制權重更新的速度。學習速率可以是固定值,也可以隨訓練過程自適應地改變。目前最熱門的方法是 Adam,一種會自適應調整學習速率的最佳化方法。
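A minimal sketch of a single Adam step, using the commonly cited defaults (beta1=0.9, beta2=0.999, eps=1e-8); the function name and calling convention are illustrative, not from the cheatsheet:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and its elementwise square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction compensates for the zero initialization of m and v.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Per-parameter adaptive learning rate.
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```

Here m and v start as zero arrays of the same shape as w, and t counts update steps from 1.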
- Backpropagation ― Backpropagation is a method to update the weights in the neural network by taking into account the actual output and the desired output. The derivative with respect to weight w is computed using chain rule and is of the following form:
⟶
反向傳播演算法 - 反向傳播演算法是一種在神經網路中用來更新權重的方法,更新時會同時考量實際輸出值和期望輸出值。損失對權重 w 的導數是透過連鎖律 (chain rule) 計算,通常會表示成下面的形式:
- As a result, the weight is updated as follows:
⟶
因此,權重會透過以下的方式來更新:
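For reference, with z the unit's output as defined earlier, a standard two-factor form of this chain rule and the resulting update (with learning rate α) is:

```latex
\frac{\partial L(z, y)}{\partial w} = \frac{\partial L(z, y)}{\partial z} \times \frac{\partial z}{\partial w}
\qquad \Longrightarrow \qquad
w \longleftarrow w - \alpha \frac{\partial L(z, y)}{\partial w}
```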
- Updating weights ― In a neural network, weights are updated as follows:
⟶
更新權重 - 在神經網路中,權重的更新會透過以下步驟進行:
- Step 1: Take a batch of training data.
⟶
步驟一:取出一個批次 (batch) 的訓練資料
- Step 2: Perform forward propagation to obtain the corresponding loss.
⟶
步驟二:執行前向傳播演算法 (forward propagation) 來得到對應的損失值
- Step 3: Backpropagate the loss to get the gradients.
⟶
步驟三:將損失值透過反向傳播演算法來得到梯度
- Step 4: Use the gradients to update the weights of the network.
⟶
步驟四:使用梯度來更新網路的權重
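A minimal sketch of these four steps for a one-layer sigmoid network with cross-entropy loss; all names and the plain gradient-descent update are illustrative choices, not prescribed by the cheatsheet:

```python
import numpy as np

def train_step(W, b, X_batch, y_batch, lr=0.1):
    # Step 1: (X_batch, y_batch) is one batch of training data.
    # Step 2: forward propagation to obtain predictions and the loss.
    z = 1.0 / (1.0 + np.exp(-(X_batch @ W + b)))   # sigmoid output in (0, 1)
    eps = 1e-12                                    # numerical safety only
    loss = -np.mean(y_batch * np.log(z + eps)
                    + (1 - y_batch) * np.log(1 - z + eps))
    # Step 3: backpropagate; for sigmoid + cross-entropy, the gradient of the
    # loss w.r.t. the pre-activation simplifies to (z - y).
    dz = (z - y_batch) / len(y_batch)
    dW = X_batch.T @ dz
    db = dz.sum()
    # Step 4: use the gradients to update the weights.
    W -= lr * dW
    b -= lr * db
    return W, b, loss
```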
- Dropout ― Dropout is a technique meant at preventing overfitting the training data by dropping out units in a neural network. In practice, neurons are either dropped with probability p or kept with probability 1−p
⟶
Dropout - Dropout 是一種透過丟棄神經網路中的神經元來避免過擬合訓練資料的技巧。在實務上,神經元會以機率 p 被丟棄,或以機率 1−p 被保留
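A minimal "inverted dropout" sketch; rescaling by 1/(1−p) at training time (so nothing changes at test time) is a common implementation choice, not something the text above specifies:

```python
import numpy as np

def dropout(a, p=0.5, training=True):
    # Drop each unit with probability p, keep it with probability 1 - p.
    if not training or p == 0.0:
        return a
    mask = (np.random.rand(*a.shape) >= p).astype(a.dtype)
    # Inverted dropout: rescale so the expected activation is unchanged.
    return a * mask / (1.0 - p)
```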
- Convolutional Neural Networks
⟶
卷積神經網路 (CNN)
- Convolutional layer requirement ― By noting W the input volume size, F the size of the convolutional layer neurons, P the amount of zero padding, then the number of neurons N that fit in a given volume is such that:
⟶
卷積層的條件 - 我們用 W 表示輸入資料的維度大小、F 表示卷積層神經元 (filter) 的尺寸、P 表示補零 (zero padding) 的數量、S 表示步幅 (stride),則在給定維度中可容納的神經元數量 N 滿足以下公式:
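For reference, these quantities combine into the usual output-size formula; note that the stride S is carried over from the translation, since the English sentence above does not name it:

```latex
N = \frac{W - F + 2P}{S} + 1
```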
- Batch normalization ― It is a step of hyperparameter γ,β that normalizes the batch {xi}. By noting μB,σ2B the mean and variance of that we want to correct to the batch, it is done as follows:
⟶
批次正規化 (Batch normalization) - 這是一個透過超參數 γ、β 來正規化批次 {xi} 的步驟。我們用 μB、σ2B 表示欲對該批次修正的平均數和變異數,計算方式如下:
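For reference, the standard batch-normalization step (ε is a small constant added for numerical stability):

```latex
x_i \longleftarrow \gamma \, \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}} + \beta
```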
- It is usually done after a fully connected/convolutional layer and before a non-linearity layer and aims at allowing higher learning rates and reducing the strong dependence on initialization.
⟶
批次正規化通常在全連接層/卷積層之後、非線性層之前進行,目的是允許使用更高的學習速率,並減少模型對初始化的強烈依賴。
- Recurrent Neural Networks
⟶
遞迴神經網路 (RNN)
- Types of gates ― Here are the different types of gates that we encounter in a typical recurrent neural network:
⟶
閘的種類 - 以下是在典型的遞迴神經網路中會遇到的幾種閘:
- [Input gate, forget gate, gate, output gate]
⟶
[輸入閘、遺忘閘、閘、輸出閘]
- [Write to cell or not?, Erase a cell or not?, How much to write to cell?, How much to reveal cell?]
⟶
[要不要將資料寫入記憶區塊?要不要將儲存在記憶區塊中的資料清除?要寫多少資料到記憶區塊?要從記憶區塊中揭露多少資料?]
- LSTM ― A long short-term memory (LSTM) network is a type of RNN model that avoids the vanishing gradient problem by adding 'forget' gates.
⟶
長短期記憶模型 (LSTM) - 長短期記憶網路是一種遞迴神經網路,藉由加入遺忘閘的設計來避免梯度消失的問題
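For reference, one common formulation of the four gates and the cell update (notation varies across texts; this is a standard version, not copied from the cheatsheet):

```latex
\begin{aligned}
i_t &= \sigma\!\left(W_i [h_{t-1}, x_t] + b_i\right) && \text{input gate} \\
f_t &= \sigma\!\left(W_f [h_{t-1}, x_t] + b_f\right) && \text{forget gate} \\
\tilde{c}_t &= \tanh\!\left(W_c [h_{t-1}, x_t] + b_c\right) && \text{candidate (the ``gate'')} \\
o_t &= \sigma\!\left(W_o [h_{t-1}, x_t] + b_o\right) && \text{output gate} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```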
- Reinforcement Learning and Control
⟶
強化學習及控制
- The goal of reinforcement learning is for an agent to learn how to evolve in an environment.
⟶
強化學習的目標是讓代理人 (agent) 學習如何在環境中演變
- Definitions
⟶
定義
- Markov decision processes ― A Markov decision process (MDP) is a 5-tuple (S,A,{Psa},γ,R) where:
⟶
馬可夫決策過程 - 一個馬可夫決策過程 (MDP) 是一個五元組 (S,A,{Psa},γ,R),其中:
- S is the set of states
⟶
S 是一組狀態的集合
- A is the set of actions
⟶
A 是一組行為的集合
- {Psa} are the state transition probabilities for s∈S and a∈A
⟶
{Psa} 指的是,當 s∈S、a∈A 時,狀態轉移的機率
- γ∈[0,1[ is the discount factor
⟶
γ∈[0,1[ 是衰減係數
- R:S×A⟶R or R:S⟶R is the reward function that the algorithm wants to maximize
⟶
R:S×A⟶R 或 R:S⟶R 指的是獎勵函數,也就是演算法想要去最大化的目標函數
- Policy ― A policy π is a function π:S⟶A that maps states to actions.
⟶
策略 - 一個策略 π 指的是一個函數 π:S⟶A,這個函數會將狀態映射到行為
- Remark: we say that we execute a given policy π if given a state s we take the action a=π(s).
⟶
注意:當給定一個狀態 s 時,若我們採取行動 a=π(s),就稱我們執行了給定的策略 π
- Value function ― For a given policy π and a given state s, we define the value function Vπ as follows:
⟶
價值函數 - 給定一個策略 π 和狀態 s,我們定義價值函數 Vπ 為:
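The formula is not reproduced in this file; for reference, a standard form when the reward depends only on the state is:

```latex
V^{\pi}(s) = \mathbb{E}\!\left[\, \sum_{t \ge 0} \gamma^{t} R(s_t) \;\middle|\; s_0 = s,\ \pi \right]
```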
- Bellman equation ― The optimal Bellman equation characterizes the value function Vπ∗ of the optimal policy π∗:
⟶
貝爾曼方程 - 最佳貝爾曼方程描述了最佳策略 π∗ 的價值函數 Vπ∗:
- Remark: we note that the optimal policy π∗ for a given state s is such that:
⟶
注意:對於給定的狀態 s,最佳策略 π∗ 滿足:
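For reference, the standard optimality equation and the corresponding optimal policy are:

```latex
V^{\pi^*}(s) = R(s) + \max_{a \in A} \, \gamma \sum_{s' \in S} P_{sa}(s') \, V^{\pi^*}(s')
\qquad\text{and}\qquad
\pi^*(s) = \arg\max_{a \in A} \, \sum_{s' \in S} P_{sa}(s') \, V^{*}(s')
```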
- Value iteration algorithm ― The value iteration algorithm is in two steps:
⟶
價值迭代演算法 - 價值迭代演算法包含兩個步驟:
- 1) We initialize the value:
⟶
1) 初始化價值:
- 2) We iterate the value based on the values before:
⟶
2) 根據之前的值,迭代更新價值:
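A minimal tabular sketch of these two steps, assuming P[s][a] maps next states to probabilities and R[s] is a state reward (all names illustrative):

```python
def value_iteration(S, A, P, R, gamma=0.9, tol=1e-6):
    # Step 1: initialize the value of every state.
    V = {s: 0.0 for s in S}
    while True:
        # Step 2: iterate the value based on the previous values
        # (Bellman optimality backup).
        V_new = {
            s: R[s] + gamma * max(
                sum(p * V[s2] for s2, p in P[s][a].items()) for a in A
            )
            for s in S
        }
        if max(abs(V_new[s] - V[s]) for s in S) < tol:
            return V_new
        V = V_new
```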
- Maximum likelihood estimate ― The maximum likelihood estimates for the state transition probabilities are as follows:
⟶
最大概似估計 - 針對狀態轉移機率的最大概似估計為:
- times took action a in state s and got to s′
⟶
在狀態 s 採取行動 a 並到達 s′ 的次數
- times took action a in state s
⟶
在狀態 s 採取行動 a 的次數
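For reference, the two counts above combine into the estimator:

```latex
P_{sa}(s') = \frac{\#\ \text{times took action } a \text{ in state } s \text{ and got to } s'}{\#\ \text{times took action } a \text{ in state } s}
```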
- Q-learning ― Q-learning is a model-free estimation of Q, which is done as follows:
⟶
Q-learning 演算法 - Q-learning 演算法是針對 Q 的一個 model-free 的估計,作法如下:
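A minimal sketch of the tabular Q-learning update Q(s,a) ← Q(s,a) + α[r + γ max Q(s′,a′) − Q(s,a)]; the update rule is standard, while the surrounding names are illustrative:

```python
def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    # Model-free: only the observed transition (s, a, r, s_next) is used;
    # the transition probabilities Psa are never needed.
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    td_target = r + gamma * best_next          # bootstrapped return estimate
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])
    return Q
```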
- View PDF version on GitHub
⟶
前往 GitHub 閱讀 PDF 版本
- [Neural Networks, Architecture, Activation function, Backpropagation, Dropout]
⟶
[神經網路, 架構, 激活函數 (Activation function), 反向傳播演算法, Dropout]
- [Convolutional Neural Networks, Convolutional layer, Batch normalization]
⟶
[卷積神經網路 (CNN), 卷積層, 批次正規化]
- [Recurrent Neural Networks, Gates, LSTM]
⟶
[遞迴神經網路 (RNN), 閘, 長短期記憶模型 (LSTM)]
- [Reinforcement learning, Markov decision processes, Value/policy iteration, Approximate dynamic programming, Policy search]
⟶
[強化學習, 馬可夫決策過程, 價值/策略迭代, 近似動態規劃, 策略搜尋]