Long Short-Term Memory (LSTM)

LSTM - How it Works

see: LSTM - Understanding LSTM Networks

LSTM - Structure

Structure Diagram

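| Equation | Range | Description |
| --- | --- | --- |
| $x_t$ | $\in \mathbb{R}^{n}$ | input vector |
| All $W$s | $\in \mathbb{R}^{h \times (n+h)}$ | weight matrices |
| All $b$s | $\in \mathbb{R}^{h}$ | bias vectors |
| $f_t = \sigma(W_f[h_{t-1}, x_t] + b_f)$ | $\in (0,1)^{h}$ | forget gate's activation vector |
| $i_t = \sigma(W_i[h_{t-1}, x_t] + b_i)$ | $\in (0,1)^{h}$ | input/update gate's activation vector |
| $\tilde{C}_t = \tanh(W_C[h_{t-1}, x_t] + b_C)$ | $\in (-1,1)^{h}$ | cell input activation vector |
| $o_t = \sigma(W_o[h_{t-1}, x_t] + b_o)$ | $\in (0,1)^{h}$ | output gate's activation vector |
| $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{C}_t$ | $\in \mathbb{R}^{h}$ | cell state vector |
| $h_t = o_t \odot \tanh(c_t)$ | $\in (-1,1)^{h}$ | output vector |

Here $n$ is the input dimension, $h$ is the number of hidden units, $[h_{t-1}, x_t]$ is the concatenation of the previous output and the current input, $\sigma$ is the logistic sigmoid, and $\odot$ is the element-wise product.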
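
The equations above can be sketched as a single forward step in NumPy. This is a minimal illustration, not a reference implementation: the names `lstm_step`, `init_params`, and `c_tilde` (the candidate cell values from the tanh row), the dictionary layout of the parameters, and the sizes `n = 3`, `h = 4` are all assumptions made for the example. It uses the standard cell update, in which the input gate scales the current step's candidate values.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n, h, seed=0):
    """Hypothetical helper: small random weights of shape (h, n + h), zero biases of shape (h,)."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("f", "i", "C", "o"):
        params[f"W_{gate}"] = rng.normal(scale=0.1, size=(h, n + h))
        params[f"b_{gate}"] = np.zeros(h)
    return params

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns the new output h_t and cell state c_t."""
    z = np.concatenate([h_prev, x_t])                       # [h_{t-1}, x_t], shape (h + n,)
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])        # forget gate, values in (0, 1)
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])        # input/update gate, values in (0, 1)
    c_tilde = np.tanh(params["W_C"] @ z + params["b_C"])    # candidate cell values, in (-1, 1)
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])        # output gate, values in (0, 1)
    c_t = f_t * c_prev + i_t * c_tilde                      # element-wise cell state update
    h_t = o_t * np.tanh(c_t)                                # output vector, values in (-1, 1)
    return h_t, c_t

# Run a short random sequence through the cell (sizes are illustrative).
n, h = 3, 4
params = init_params(n, h)
h_t, c_t = np.zeros(h), np.zeros(h)
for x_t in np.random.default_rng(1).normal(size=(5, n)):
    h_t, c_t = lstm_step(x_t, h_t, c_t, params)
```

Because the same `params` are reused at every step, the only things carried from one time step to the next are `h_t` and `c_t`, which matches the recurrence in the equations.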

Resources