  1. Deep Learning cheatsheet

⟶ 深度學習參考手冊

  1. Neural Networks

⟶ 神經網路

  1. Neural networks are a class of models that are built with layers. Commonly used types of neural networks include convolutional and recurrent neural networks.

⟶ 神經網路是一種透過 layer 來建構的模型。經常被使用的神經網路模型包括了卷積神經網路 (CNN) 和遞迴式神經網路 (RNN)。

  1. Architecture ― The vocabulary around neural network architectures is described in the figure below:

⟶ 架構 - 神經網路架構所需要用到的詞彙描述如下:

  1. [Input layer, hidden layer, output layer]

⟶ [輸入層、隱藏層、輸出層]

  1. By noting i the ith layer of the network and j the jth hidden unit of the layer, we have:

⟶ 我們使用 i 來代表網路的第 i 層、j 來代表某一層中第 j 個隱藏神經元的話,我們可以得到下面的等式:

  1. where we note w, b, z the weight, bias and output respectively.

⟶ 其中,我們分別使用 w 來代表權重、b 代表偏差項、z 代表輸出的結果。
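
The equation referenced above is not reproduced in this file; a standard reconstruction under this notation is:

```latex
z_j^{[i]} = {w_j^{[i]}}^{T} x + b_j^{[i]}
```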

  1. Activation function ― Activation functions are used at the end of a hidden unit to introduce non-linear complexities to the model. Here are the most common ones:

⟶ Activation function - Activation function 用於隱藏神經元的尾端,為模型帶入非線性。底下是一些常見的 activation function:

  1. [Sigmoid, Tanh, ReLU, Leaky ReLU]

⟶ [Sigmoid, Tanh, ReLU, Leaky ReLU]
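
For reference, a minimal NumPy sketch of these four functions (the 0.01 slope for leaky ReLU is an assumed, commonly used default):

```python
import numpy as np

def sigmoid(x):
    # squashes inputs to (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # squashes inputs to (-1, 1)
    return np.tanh(x)

def relu(x):
    # zeroes out negative inputs
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # keeps a small slope on negative inputs to avoid dead neurons
    return np.where(x > 0, x, alpha * x)
```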

  1. Cross-entropy loss ― In the context of neural networks, the cross-entropy loss L(z,y) is commonly used and is defined as follows:

⟶ 交叉熵損失 - 在神經網路的領域中,交叉熵損失函數 L(z,y) 經常被使用,其定義如下:
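
The definition itself is not reproduced in this file; the standard binary cross-entropy form is:

```latex
L(z,y) = -\big[\, y \log(z) + (1-y)\log(1-z) \,\big]
```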

  1. Learning rate ― The learning rate, often noted α or sometimes η, indicates at which pace the weights get updated. This can be fixed or adaptively changed. The current most popular method is called Adam, which is a method that adapts the learning rate.

⟶ 學習速率 - 學習速率通常用 α 或 η 來表示,目的是用來控制權重更新的速度。學習速度可以是一個固定值,或是隨著訓練的過程改變。現在最熱門的最佳化方法叫作 Adam,是一種隨著訓練過程改變學習速率的最佳化方法。

  1. Backpropagation ― Backpropagation is a method to update the weights in the neural network by taking into account the actual output and the desired output. The derivative with respect to weight w is computed using chain rule and is of the following form:

⟶ 反向傳播演算法 - 反向傳播演算法是一種在神經網路中用來更新權重的方法,更新的基準是根據神經網路的實際輸出值和期望輸出值之間的關係。損失對權重 w 的導數是根據連鎖律 (chain rule) 來計算,通常會表示成下面的形式:
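
The chain-rule expression is not reproduced in this file; a standard reconstruction, with a denoting the unit's activation, is:

```latex
\frac{\partial L(z,y)}{\partial w} = \frac{\partial L(z,y)}{\partial a} \times \frac{\partial a}{\partial z} \times \frac{\partial z}{\partial w}
```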

  1. As a result, the weight is updated as follows:

⟶ 因此,權重會透過以下的方式來更新:
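
The corresponding update, with α the learning rate, is then:

```latex
w \;\longleftarrow\; w - \alpha \, \frac{\partial L(z,y)}{\partial w}
```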

  1. Updating weights ― In a neural network, weights are updated as follows:

⟶ 更新權重 - 在神經網路中,權重的更新會透過以下步驟進行:

  1. Step 1: Take a batch of training data.

⟶ 步驟一:取出一個批次 (batch) 的資料

  1. Step 2: Perform forward propagation to obtain the corresponding loss.

⟶ 步驟二:執行前向傳播演算法 (forward propagation) 來得到對應的損失值

  1. Step 3: Backpropagate the loss to get the gradients.

⟶ 步驟三:將損失值透過反向傳播演算法來得到梯度

  1. Step 4: Use the gradients to update the weights of the network.

⟶ 步驟四:使用梯度來更新網路的權重
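
Putting the four steps together, a minimal NumPy sketch of one such update for a single sigmoid unit trained with cross-entropy loss; the layer size, batch, and learning rate below are illustrative assumptions, not part of the original cheatsheet:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3)   # weights of a single sigmoid output unit (3 inputs)
b = 0.0                  # bias
alpha = 0.1              # learning rate

def train_step(X, y):
    global w, b
    # Step 1: (X, y) is one batch of training data
    # Step 2: forward propagation and the corresponding cross-entropy loss
    z = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    loss = -np.mean(y * np.log(z) + (1 - y) * np.log(1 - z))
    # Step 3: backpropagate the loss to get the gradients (chain rule)
    dz = (z - y) / len(X)          # dL/d(pre-activation) for sigmoid + cross-entropy
    dw, db = X.T @ dz, dz.sum()
    # Step 4: use the gradients to update the weights
    w -= alpha * dw
    b -= alpha * db
    return loss

X = rng.normal(size=(8, 3))
y = (X[:, 0] > 0).astype(float)
print(train_step(X, y))
```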

  1. Dropout ― Dropout is a technique meant at preventing overfitting the training data by dropping out units in a neural network. In practice, neurons are either dropped with probability p or kept with probability 1−p

⟶ Dropout - Dropout 是一種藉由丟棄神經網路中的部分神經元,來避免對訓練資料過擬合的技巧。實務上,神經元會以機率 p 被丟棄,或以機率 1−p 被保留
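
A minimal sketch of dropout at training time, assuming the common inverted-dropout convention where kept activations are rescaled by 1/(1−p):

```python
import numpy as np

def dropout(a, p, training=True, rng=np.random.default_rng(0)):
    # drop each unit with probability p, keep it with probability 1 - p
    if not training or p == 0.0:
        return a
    keep_mask = rng.random(a.shape) >= p
    # rescale kept activations so their expected value is unchanged (inverted dropout)
    return a * keep_mask / (1.0 - p)

print(dropout(np.ones(10), p=0.5))
```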

  1. Convolutional Neural Networks

⟶ 卷積神經網路 (CNN)

  1. Convolutional layer requirement ― By noting W the input volume size, F the size of the convolutional layer neurons, P the amount of zero padding, then the number of neurons N that fit in a given volume is such that:

⟶ 卷積層的需求 - 我們使用 W 來表示輸入資料的維度大小、F 代表卷積層神經元 (filter) 的尺寸、P 代表墊零 (zero padding) 的數量、S 代表步幅 (stride),則在該維度上能容納的神經元數量 N 滿足以下關係:
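
The relation itself is not reproduced in this file; with S the stride, it reads:

```latex
N = \frac{W - F + 2P}{S} + 1
```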

  1. Batch normalization ― It is a step of hyperparameter γ,β that normalizes the batch {xi}. By noting μB,σ2B the mean and variance of that we want to correct to the batch, it is done as follows:

⟶ 批次正規化 (Batch normalization) - 它是一個藉由超參數 γ,β 來正規化批次 {xi} 的步驟。我們使用 μB、σ²B 分別代表想要校正的該批次的平均數和變異數,做法如下:
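
The normalization step is not reproduced in this file; a standard reconstruction, with ε a small constant added for numerical stability, is:

```latex
x_i \;\longleftarrow\; \gamma \, \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}} + \beta
```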

  1. It is usually done after a fully connected/convolutional layer and before a non-linearity layer and aims at allowing higher learning rates and reducing the strong dependence on initialization.

⟶ 批次正規化的動作通常在全連接層/卷積層之後、在非線性層之前進行,目的在於允許使用較高的學習速率,並降低模型對初始化的強烈依賴。

  1. Recurrent Neural Networks

⟶ 遞歸神經網路 (RNN)

  1. Types of gates ― Here are the different types of gates that we encounter in a typical recurrent neural network:

⟶ 閘的種類 - 以下是在典型的遞歸神經網路中會遇到的幾種閘:

  1. [Input gate, forget gate, gate, output gate]

⟶ [輸入閘、遺忘閘、閘、輸出閘]

  1. [Write to cell or not?, Erase a cell or not?, How much to write to cell?, How much to reveal cell?]

⟶ [要不要寫入記憶區塊?要不要清除記憶區塊中的資料?要寫多少資料到記憶區塊?要揭露多少記憶區塊的內容?]

  1. LSTM ― A long short-term memory (LSTM) network is a type of RNN model that avoids the vanishing gradient problem by adding 'forget' gates.

⟶ 長短期記憶模型 - 長短期記憶模型是一種遞歸神經網路,藉由導入遺忘閘的設計來避免梯度消失的問題

  1. Reinforcement Learning and Control

⟶ 強化學習及控制

  1. The goal of reinforcement learning is for an agent to learn how to evolve in an environment.

⟶ 強化學習的目標就是為了讓代理 (agent) 能夠學習在環境中進化

  1. Definitions

⟶ 定義

  1. Markov decision processes ― A Markov decision process (MDP) is a 5-tuple (S,A,{Psa},γ,R) where:

⟶ 馬可夫決策過程 - 一個馬可夫決策過程 (MDP) 是一個五元組 (S,A,{Psa},γ,R),其中:

  1. S is the set of states

⟶ S 是一組狀態的集合

  1. A is the set of actions

⟶ A 是一組行為的集合

  1. {Psa} are the state transition probabilities for s∈S and a∈A

⟶ {Psa} 指的是,當 s∈S、a∈A 時,狀態轉移的機率

  1. γ∈[0,1[ is the discount factor

⟶ γ∈[0,1[ 是衰減係數

  1. R:S×A⟶R or R:S⟶R is the reward function that the algorithm wants to maximize

⟶ R:S×A⟶R 或 R:S⟶R 指的是獎勵函數,也就是演算法想要去最大化的目標函數

  1. Policy ― A policy π is a function π:S⟶A that maps states to actions.

⟶ 策略 - 一個策略 π 指的是一個函數 π:S⟶A,這個函數會將狀態映射到行為

  1. Remark: we say that we execute a given policy π if given a state s we take the action a=π(s).

⟶ 注意:當給定狀態 s 時,若我們採取行動 a=π(s),就表示我們執行了給定的策略 π

  1. Value function ― For a given policy π and a given state s, we define the value function Vπ as follows:

⟶ 價值函數 - 給定一個策略 π 和狀態 s,我們定義價值函數 Vπ 為:
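
The definition is not reproduced in this file; the standard discounted-return form is:

```latex
V^{\pi}(s) = \mathbb{E}\left[\sum_{t \geq 0} \gamma^{t} R(s_t) \;\middle|\; s_0 = s, \pi \right]
```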

  1. Bellman equation ― The optimal Bellman equations characterizes the value function Vπ∗ of the optimal policy π∗:

⟶ 貝爾曼方程 - 最佳貝爾曼方程刻劃了最佳策略 π∗ 所對應的價值函數 Vπ∗:
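
A standard reconstruction of the optimal Bellman equation under the notation above is:

```latex
V^{\pi^{*}}(s) = R(s) + \max_{a \in A} \, \gamma \sum_{s' \in S} P_{sa}(s') \, V^{\pi^{*}}(s')
```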

  1. Remark: we note that the optimal policy π∗ for a given state s is such that:

⟶ 注意:對於給定一個狀態 s,最佳的策略 π∗ 是:
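
The expression itself is not reproduced in this file; the standard form is:

```latex
\pi^{*}(s) = \underset{a \in A}{\operatorname{argmax}} \; \sum_{s' \in S} P_{sa}(s') \, V^{*}(s')
```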

  1. Value iteration algorithm ― The value iteration algorithm is in two steps:

⟶ 價值迭代演算法 - 價值迭代演算法包含兩個步驟:

  1. 1) We initialize the value:

⟶ 1) 我們將價值初始化:

  1. 2) We iterate the value based on the values before:

⟶ 2) 根據先前的價值來迭代更新價值:
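
A minimal NumPy sketch of these two steps on a tabular MDP; the transition tensor P[s, a, s'], reward vector R, and discount γ below are illustrative assumptions:

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, n_iters=100):
    # P[s, a, s2]: state transition probabilities, R[s]: reward, gamma: discount factor
    n_states = P.shape[0]
    V = np.zeros(n_states)                       # 1) initialize the value
    for _ in range(n_iters):                     # 2) iterate based on the previous values
        V = R + gamma * np.max(P @ V, axis=1)    # (P @ V)[s, a] = sum_s2 P[s, a, s2] * V[s2]
    return V

# tiny 2-state, 2-action example
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[1.0, 0.0], [0.3, 0.7]]])
R = np.array([0.0, 1.0])
print(value_iteration(P, R))
```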

  1. Maximum likelihood estimate ― The maximum likelihood estimates for the state transition probabilities are as follows:

⟶ 最大概似估計 - 針對狀態轉移機率的最大概似估計為:

  1. times took action a in state s and got to s′

⟶ 在狀態 s 採取行動 a 並轉移到狀態 s′ 的次數

  1. times took action a in state s

⟶ 在狀態 s 採取行動 a 的次數
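
Written out, the estimate combines the two counts above:

```latex
P_{sa}(s') = \frac{\#\ \text{times took action } a \text{ in state } s \text{ and got to } s'}{\#\ \text{times took action } a \text{ in state } s}
```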

  1. Q-learning ― Q-learning is a model-free estimation of Q, which is done as follows:

⟶ Q-learning 演算法 - Q-learning 演算法是針對 Q 的一個 model-free 的估計,如下:
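
A minimal sketch of the tabular Q-learning update; the learning rate α and the toy experience tuple below are illustrative assumptions:

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    # model-free update: move Q(s, a) toward the observed reward plus the
    # discounted value of the best action in the next state
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

Q = np.zeros((3, 2))                      # 3 states, 2 actions
Q = q_learning_update(Q, s=0, a=1, r=1.0, s_next=2)
print(Q)
```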

  1. View PDF version on GitHub

⟶ 前往 GitHub 閱讀 PDF 版本

  1. [Neural Networks, Architecture, Activation function, Backpropagation, Dropout]

⟶ [神經網路, 架構, Activation function, 反向傳播演算法, Dropout]

  1. [Convolutional Neural Networks, Convolutional layer, Batch normalization]

⟶ [卷積神經網路, 卷積層, 批次正規化]

  1. [Recurrent Neural Networks, Gates, LSTM]

⟶ [遞歸神經網路 (RNN), 閘, 長短期記憶模型]

  1. [Reinforcement learning, Markov decision processes, Value/policy iteration, Approximate dynamic programming, Policy search]

⟶ [強化學習, 馬可夫決策過程, 價值/策略迭代, 近似動態規劃, 策略搜尋]