Short-Time Fourier Transform (STFT)
- represents a signal in the time-frequency domain by computing discrete Fourier transforms (DFT) over short overlapping windows
- it is used to compute a spectrogram, the squared magnitude of the STFT shown over time and frequency
STFT - Definition
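The STFT analyzes a signal $x(t)$ by multiplying it with a window function $w(t - \tau)$ centered at time $\tau$, then taking the Fourier transform:

$$X(\tau, \omega) = \int_{-\infty}^{\infty} x(t)\, w(t - \tau)\, e^{-j\omega t}\, dt$$

The ordinary Fourier transform analyzes the signal $x(t)$ with no window function:

$$X(\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt$$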
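The definition can be sketched in discrete form as a direct, unoptimized loop: slide a window along the samples, multiply, and take a DFT of each windowed segment. This is only an illustration under stated assumptions. The Hann window, the 8 kHz sample rate, the 1 kHz test tone, and the names `hann`, `stft`, `frame_size`, and `hop` are all chosen for this example and do not come from the notes. Each DFT is taken relative to the frame start, so phases differ from the continuous definition by a per-frame factor while magnitudes are unaffected. In practice an FFT routine would replace the inner sum.

```python
import cmath
import math


def hann(n_points):
    """Periodic Hann window of length n_points."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_points) for n in range(n_points)]


def stft(x, frame_size, hop):
    """Return one list of complex DFT bins (0..frame_size//2) per frame."""
    w = hann(frame_size)
    frames = []
    for start in range(0, len(x) - frame_size + 1, hop):
        # Multiply the current segment by the window.
        seg = [x[start + n] * w[n] for n in range(frame_size)]
        # Direct DFT of the windowed segment, non-negative frequencies only.
        bins = [
            sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size))
            for k in range(frame_size // 2 + 1)
        ]
        frames.append(bins)
    return frames


# Illustrative test signal: a 1 kHz sine sampled at 8 kHz.
fs = 8000
x = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(512)]

frames = stft(x, frame_size=64, hop=32)

# Spectrogram: squared magnitude of every STFT bin.
spectrogram = [[abs(c) ** 2 for c in frame] for frame in frames]

# Strongest bin per frame; bin spacing is fs / 64 = 125 Hz, so 1 kHz is bin 8.
peak_bins = [max(range(len(f)), key=lambda k: f[k]) for f in spectrogram]
print(len(frames), peak_bins)
```

With a hop of half the frame size, consecutive frames overlap by 50%, which is what "short overlapping windows" refers to.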
Choosing Window Function
Different window shapes trade main-lobe width (frequency resolution) against side-lobe level (spectral leakage):
- Rectangular window
- Hamming window
- Hann window
- Gaussian window
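One way to compare these windows numerically is to measure the width of the main lobe and the height of the strongest side lobe in each window's magnitude spectrum. The sketch below does this with a zero-padded direct DFT. The helper names `make_window` and `lobe_stats`, the window length of 64, and the Gaussian width (standard deviation equal to 40% of the half-length) are assumptions made for illustration, not values from the notes.

```python
import cmath
import math


def make_window(name, n_points):
    """Return n_points symmetric coefficients for the named window."""
    m = n_points - 1
    if name == "rectangular":
        return [1.0] * n_points
    if name == "hamming":
        return [0.54 - 0.46 * math.cos(2 * math.pi * n / m) for n in range(n_points)]
    if name == "hann":
        return [0.5 - 0.5 * math.cos(2 * math.pi * n / m) for n in range(n_points)]
    if name == "gaussian":
        sigma = 0.4 * m / 2  # assumed width: 40% of the half-length
        return [math.exp(-0.5 * ((n - m / 2) / sigma) ** 2) for n in range(n_points)]
    raise ValueError(f"unknown window: {name}")


def lobe_stats(w, n_fft=2048):
    """Return (main-lobe half-width in bins, peak side-lobe level in dB)."""
    # Zero-padded DFT magnitude on a fine frequency grid.
    mags = [
        abs(sum(w[n] * cmath.exp(-2j * math.pi * k * n / n_fft) for n in range(len(w))))
        for k in range(n_fft // 2 + 1)
    ]
    # Walk down the main lobe from DC until the magnitude starts rising again.
    k = 0
    while k < len(mags) - 1 and mags[k + 1] < mags[k]:
        k += 1
    half_width_bins = k * len(w) / n_fft
    sidelobe_db = 20 * math.log10(max(mags[k:]) / mags[0])
    return half_width_bins, sidelobe_db


stats = {name: lobe_stats(make_window(name, 64))
         for name in ("rectangular", "hamming", "hann", "gaussian")}
for name, (width, level) in stats.items():
    print(f"{name:12s} main-lobe half-width ~{width:.2f} bins, "
          f"peak side lobe {level:.1f} dB")
```

The rectangular window should show the narrowest main lobe and the highest side lobes, while the tapered windows widen the main lobe in exchange for less leakage.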
Choosing Window/Frame Size
| Size (N) | Time span of frame (N/Fs, at Fs = 44.1 kHz) | Frequency resolution Δf = Fs/N | Notes |
|---|---|---|---|
| 256 | ~5.8 ms | ~172 Hz/bin | Very fast, good for transients (drums, speech consonants), but poor pitch resolution |
| 512 | ~11.6 ms | ~86 Hz/bin | Balance for speech, can still track rapid events |
| 1024 | ~23.2 ms | ~43 Hz/bin | Common default, good mix for general audio/music |
| 2048 | ~46.4 ms | ~21.5 Hz/bin | Better frequency detail (notes, harmonics), worse timing |
| 4096 | ~93 ms | ~10.8 Hz/bin | High frequency precision, but smears fast events |
| 8192 | ~186 ms | ~5.4 Hz/bin | Very sharp in frequency, very blurry in time; used in offline spectral analysis, not real-time |
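The table values follow directly from the frame size and the sample rate. The sketch below recomputes them, assuming Fs = 44.1 kHz. That rate is not stated in the notes but is the one that reproduces the listed numbers. The name `frame_stats` is a hypothetical helper introduced for this example.

```python
FS = 44_100  # Hz; assumed sample rate, inferred from the table values


def frame_stats(n, fs=FS):
    """Return (frame duration in ms, bin spacing in Hz) for an n-sample frame."""
    return 1000 * n / fs, fs / n


for n in (256, 512, 1024, 2048, 4096, 8192):
    span_ms, delta_f = frame_stats(n)
    print(f"N={n:5d}  span={span_ms:6.1f} ms  delta_f={delta_f:6.2f} Hz/bin")
```

Frame duration in seconds times bin spacing in Hz is always (N/Fs)(Fs/N) = 1, so doubling N halves Δf and doubles the time span. This is the time-frequency trade-off the Notes column describes.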