Time Series (3)
Spectral Analysis and Filtering
We define a cycle as one complete period of a sine or cosine function defined over a unit time interval:
\[X_t = A\cos(2\pi\omega t + \phi) \quad \quad (4.1)\]
for \(t = 0, \pm 1, \pm 2, ...\), where:
- \(\omega\) is a frequency index, defined in cycles per unit time; for example, if \(\omega = 2\), the series completes \(2\) cycles per unit of time. The frequency is a fixed constant.
- \(A\) determines the height, or amplitude, of the function; it is random.
- \(\phi\) determines the start point of the cosine function (the phase); it is random.
Using the trigonometric identity \(\cos(\alpha + \beta) = \cos\alpha\cos\beta - \sin\alpha\sin\beta\), we can write the above equation as (a short numerical check of this equivalence follows the list below):
\[X_t = U_1 \cos (2 \pi \omega t) + U_2 \sin (2 \pi \omega t) \quad \quad (4.2)\]
Where:
- \(U_1 = A \cos \phi\), often taken to be a normally distributed random variable.
- \(U_2 = - A \sin \phi\), often taken to be a normally distributed random variable.
- \(A = \sqrt{U_1^2 + U_2^2}\)
- \(\phi = \tan^{-1} (-\frac{U_2}{U_1})\)
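A minimal numerical sketch of this equivalence (the amplitude, frequency, and phase values below are arbitrary choices for illustration, not from the text):

```python
import numpy as np

# Arbitrary illustrative values for A, omega, and phi (not from the text)
A, omega, phi = 2.0, 0.1, 0.6 * np.pi
t = np.arange(0, 50)

# Form (4.1): amplitude/phase parameterization
x1 = A * np.cos(2 * np.pi * omega * t + phi)

# Form (4.2): U1 = A cos(phi), U2 = -A sin(phi)
U1, U2 = A * np.cos(phi), -A * np.sin(phi)
x2 = U1 * np.cos(2 * np.pi * omega * t) + U2 * np.sin(2 * np.pi * omega * t)

print(np.allclose(x1, x2))  # True: the two forms generate the same series
```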
Now consider a generalization of the above equation that allows mixtures of periodic series with multiple frequencies and amplitudes:
\[X_t = \sum^q_{k=1} [U_{k_1} \cos(2\pi\omega_k t) + U_{k_2}\sin (2\pi\omega_k t)] \quad \quad (4.3)\]
Where:
- \(U_{k_1}, U_{k_2}\), for \(k = 1, 2, ..., q\), are independent zero-mean random variables with variances \(\sigma^2_k\).
- \(\omega_k\) are distinct frequencies.
- The autocovariance function of the process is \(\gamma(h) = \sum^q_{k=1} \sigma^2_k \cos(2\pi \omega_k h) \quad \quad (4.4)\)
- \(\gamma(0) = \sum^{q}_{k=1} \sigma^2_k\); that is, the variance of the process is the sum of the variances of the individual components (a simulation check follows the list below).
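A minimal sketch checking that the ensemble variance of (4.3) equals \(\sum_k \sigma^2_k\) (the frequencies and component variances below are arbitrary choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative frequencies and component variances (not from the text)
omegas = np.array([0.06, 0.10, 0.40])
sigma2 = np.array([9.0, 25.0, 4.0])

def x_at(t, rng):
    """One draw of X_t from (4.3); U_k1, U_k2 are independent N(0, sigma2[k])."""
    U1 = rng.normal(0.0, np.sqrt(sigma2))
    U2 = rng.normal(0.0, np.sqrt(sigma2))
    return np.sum(U1 * np.cos(2 * np.pi * omegas * t)
                  + U2 * np.sin(2 * np.pi * omegas * t))

# Variance across many independent realizations at a fixed time point
draws = np.array([x_at(10, rng) for _ in range(50_000)])
print(draws.var(), sigma2.sum())  # both close to 38, i.e. gamma(0) = sum of sigma2_k
```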
The Spectral Density
Theorem C1:
A function \(\gamma(h)\), for \(h = 0, \pm 1, \pm 2, ...\), is non-negative definite if and only if it can be expressed as:
\[\gamma(h) = \int^{\frac{1}{2}}_{-\frac{1}{2}}e^{2\pi i \omega h} dF(\omega)\]
where \(F(\cdot)\) is non-decreasing. The function \(F(\cdot)\) is right continuous, bounded on \([-\frac{1}{2}, \frac{1}{2}]\), and uniquely determined by the conditions \(F(-\frac{1}{2}) = 0\) and \(F(\frac{1}{2}) = \gamma(0)\).
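For example (a standard special case, not worked out in the text above), for white noise with variance \(\sigma^2\) we have \(\gamma(0) = \sigma^2\) and \(\gamma(h) = 0\) for \(h \neq 0\), and the spectral distribution is \(F(\omega) = \sigma^2(\omega + \frac{1}{2})\), since
\[\int^{\frac{1}{2}}_{-\frac{1}{2}} e^{2\pi i \omega h} \sigma^2 d\omega = \begin{cases} \sigma^2 & h = 0 \\ 0 & h \neq 0 \end{cases}\]
and \(F\) is non-decreasing with \(F(-\frac{1}{2}) = 0\) and \(F(\frac{1}{2}) = \sigma^2 = \gamma(0)\).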
Theorem C2:
If \(X_t\) is a mean-zero stationary process with spectral distribution \(F(\omega)\) as given in Theorem C1, then there exists a complex-valued stochastic process \(Z(\omega)\) on \([-\frac{1}{2}, \frac{1}{2}]\), having stationary uncorrelated increments, such that \(X_t\) can be written as the stochastic integral
\[X_t = \int^{\frac{1}{2}}_{-\frac{1}{2}} e^{2\pi i \omega t} dZ(\omega)\]
where, for \(-\frac{1}{2} \leq \omega_1 \leq \omega_2 \leq \frac{1}{2}\), \(Var[Z(\omega_2) - Z(\omega_1)] = F(\omega_2) - F(\omega_1)\).
Property 4.1: Spectral Representation of a Stationary Process
In non-technical terms, Theorem C2 states that any stationary time series may be thought of, approximately, as the random superposition of sines and cosines oscillating at various frequencies.
Property 4.2: Spectral Density
If the autocovariance function, \(\gamma(h)\), of a stationary process satisfies:
\[\sum^\infty_{h=-\infty} |\gamma(h)| < \infty \quad \quad (4.10)\]
then it has the representation:
\[\gamma(h) = \int^{\frac{1}{2}}_{-\frac{1}{2}}e^{2\pi i \omega h} f(\omega)d\omega \quad \quad h = 0, \pm 1, \pm 2, ... \quad \quad (4.11)\]
as the inverse transform of the spectral density, which has the representation:
\[f(\omega) = \sum^\infty_{h=-\infty} \gamma(h) e^{-2\pi i \omega h} \quad \quad -\frac{1}{2}\leq \omega \leq \frac{1}{2} \quad \quad (4.12)\]
\(f(\omega)\) is called the spectral density.
From (4.12), \(f(\omega) = f(-\omega)\) and \(f(\omega + 1) = f(\omega)\), so the spectral density is an even function of period one. Because of the evenness, we will typically only plot \(f(\omega)\) for \(\omega \geq 0\). In addition, putting \(h=0\), we have:
\[\gamma(0) = Var[X_t] = \int^{\frac{1}{2}}_{-\frac{1}{2}} f(\omega) d\omega\]
which expresses the total variance as the integrated spectral density over all of the frequencies. When the conditions in Property 4.2 are satisfied, the autocovariance function \(\gamma(h)\) and the spectral density function \(f(\omega)\) contain the same information.
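As a concrete check of (4.12), consider an MA(1) series \(x_t = w_t + \theta w_{t-1}\) with \(Var[w_t] = \sigma^2_w\) (a standard example, not derived in the text above): \(\gamma(0) = (1 + \theta^2)\sigma^2_w\), \(\gamma(\pm 1) = \theta \sigma^2_w\), and \(\gamma(h) = 0\) otherwise, so (4.12) gives \(f(\omega) = \sigma^2_w[1 + \theta^2 + 2\theta\cos(2\pi\omega)]\). A quick numerical sketch (the parameter values are arbitrary illustrations):

```python
import numpy as np

# MA(1) example: x_t = w_t + theta * w_{t-1}; parameter values are arbitrary illustrations
theta, sigma_w2 = 0.5, 1.0

def gamma(h):
    """Autocovariance function of the MA(1) process."""
    if h == 0:
        return (1 + theta**2) * sigma_w2
    if abs(h) == 1:
        return theta * sigma_w2
    return 0.0

def f(omega):
    """Spectral density via (4.12); the sum is finite since gamma(h) = 0 for |h| > 1."""
    return sum(gamma(h) * np.exp(-2j * np.pi * omega * h) for h in range(-1, 2)).real

omega = 0.2
closed_form = sigma_w2 * (1 + theta**2 + 2 * theta * np.cos(2 * np.pi * omega))
print(np.isclose(f(omega), closed_form))  # True
```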
Definition 4.1: Discrete Fourier Transform (DFT)
Given data \(x_1, ..., x_n\), we define the discrete Fourier transform (DFT) to be:
\[d(\omega_j) = \frac{1}{\sqrt{n}} \sum^n_{t=1} x_t e^{-2\pi i \omega_j t}\]
for \(j = 0, 1, ..., n - 1\); the frequencies \(\omega_j = \frac{j}{n}\) are called the Fourier or fundamental frequencies.
The inverse DFT is:
\[x_t = \frac{1}{\sqrt{n}} \sum^{n-1}_{j=0} d(\omega_j) e^{2\pi i \omega_j t}\]
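A minimal sketch computing the DFT directly from the definition (the data are arbitrary). Note that `np.fft.fft` indexes time from \(0\) rather than \(1\), so it differs from \(d(\omega_j)\) by a phase factor only; the moduli agree, and the inverse transform recovers the data:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=8)           # arbitrary illustrative data
n = len(x)
t = np.arange(1, n + 1)          # the definition indexes t = 1, ..., n
omega_j = np.arange(n) / n       # Fourier/fundamental frequencies j/n

# DFT computed directly from the definition
d = (x @ np.exp(-2j * np.pi * np.outer(t, omega_j))) / np.sqrt(n)

# Moduli agree with the FFT (which indexes t = 0, ..., n-1), up to a phase factor
print(np.allclose(np.abs(d), np.abs(np.fft.fft(x)) / np.sqrt(n)))  # True

# The inverse DFT recovers x_t
x_back = (d @ np.exp(2j * np.pi * np.outer(omega_j, t))) / np.sqrt(n)
print(np.allclose(x_back.real, x))  # True
```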
Definition 4.2: Periodogram
Given data \(x_1, ..., x_n\), we define the periodogram to be:
\[I(\omega_j) = |d(\omega_j)|^2 = d(\omega_j)\overline{d(\omega_j)}\]
for \(j = 0, 1, 2, ..., n - 1\).
Notice that \(I(0) = n \bar{x}^2\) where \(\bar{x}\) is the sample mean.
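For instance, the \(I(0) = n\bar{x}^2\) identity is easy to verify numerically (same arbitrary setup as the DFT sketch above):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=8)                       # arbitrary illustrative data
n = len(x)
t = np.arange(1, n + 1)
omega_j = np.arange(n) / n

d = (x @ np.exp(-2j * np.pi * np.outer(t, omega_j))) / np.sqrt(n)
I = np.abs(d) ** 2                           # periodogram I(omega_j) = |d(omega_j)|^2

print(np.isclose(I[0], n * x.mean() ** 2))   # True: I(0) = n * xbar^2
```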
State Space Models
In general, the state space model is characterized by two principles.
- There is a hidden or latent process \(X_t\) called the state process. The state process is assumed to be a Markov process, which means that the future \(\{X_s: s > t\}\) and the past \(\{X_s: s < t\}\) are independent conditional on the present \(X_t\).
- The observations \(Y_t\) are conditionally independent given the states \(X_t\).
This means that the dependence among the observations is generated by the states \(X_t\).
Linear Gaussian Model
The linear Gaussian state space model, or dynamic linear model (DLM), in its basic form employs an order-one, \(p\)-dimensional vector autoregression as the state equation:
\[X_t = \Phi X_{t-1} + W_t\]
Where:
- \(W_t\) are \(p \times 1\) i.i.d. zero-mean normal vectors with covariance matrix \(Q\); that is, \(W_t \overset{i.i.d}{\sim} N_p(\boldsymbol{0}, Q)\).
- The process starts with a normal vector \(X_0 \sim N_p(\boldsymbol{\mu_0}, \Sigma_0)\).
- \(p\) is called the state dimension.
We assume we do not observe the state vector \(X_t\) directly, but only a noisy linear transformation of it, given by the observation equation (a simulation sketch of the full model follows the list below):
\[Y_t = A_t X_t + V_t\]
Where:
- \(A_t\) is a \(q \times p\) measurement or observation matrix.
- \(Y_t\) is \(q \times 1\), where \(q\) can be larger or smaller than \(p\).
- \(V_t \overset{i.i.d}{\sim} N_q(\boldsymbol{0}, R)\) is the additive observation noise.
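A minimal simulation sketch of the state and observation equations above (the dimensions and parameter values are arbitrary choices for illustration, with \(A_t\) held constant over time):

```python
import numpy as np

rng = np.random.default_rng(2)

# Arbitrary illustrative dimensions and parameters (not from the text)
p, q, n = 2, 1, 100
Phi = np.array([[0.9, 0.1],
                [0.0, 0.8]])                 # p x p state transition matrix
A = np.array([[1.0, 0.0]])                   # q x p observation matrix, constant over t here
Q = 0.1 * np.eye(p)                          # state noise covariance
R = 0.5 * np.eye(q)                          # observation noise covariance
mu0, Sigma0 = np.zeros(p), np.eye(p)         # initial state distribution

x = np.zeros((n + 1, p))
y = np.zeros((n, q))
x[0] = rng.multivariate_normal(mu0, Sigma0)  # X_0 ~ N_p(mu0, Sigma0)
for t in range(1, n + 1):
    w = rng.multivariate_normal(np.zeros(p), Q)  # state noise W_t
    v = rng.multivariate_normal(np.zeros(q), R)  # observation noise V_t
    x[t] = Phi @ x[t - 1] + w                    # state equation
    y[t - 1] = A @ x[t] + v                      # observation equation

print(x.shape, y.shape)  # (101, 2) (100, 1)
```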
Filtering, Smoothing, Forecasting
A primary aim of any analysis involving the state space model is to produce estimators of the underlying unobserved signal \(X_t\), given the data \(y_{1:s} = \{y_1, ..., y_s\}\) up to time \(s\). When:
- \(s < t\), the problem is called forecasting.
- \(s = t\), the problem is called filtering.
- \(s > t\), the problem is called smoothing.
Notation used throughout this chapter:
- \(x_t^s = E[X_t | y_{1:s}]\)
- \(P^s_{t_1, t_2} = E[(X_{t_1} - x^s_{t_1}) (X_{t_2} - x^s_{t_2})^T]\); when \(t_1 = t_2\), we write \(P^s_{t_1}\).
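For the linear Gaussian model above, the filtering quantities \(x_t^t\) and \(P_t^t\) can be computed recursively with the standard Kalman filter; a minimal sketch, assuming a time-invariant observation matrix \(A\) (the function name and signature are illustrative, not from the text):

```python
import numpy as np

def kalman_filter(y, Phi, A, Q, R, mu0, Sigma0):
    """Return the filtered estimates x_t^t and covariances P_t^t, t = 1, ..., n.
    Standard Kalman recursions for the linear Gaussian model; A is held constant over time."""
    n, p = len(y), len(mu0)
    xf = np.zeros((n, p))                    # x_t^t
    Pf = np.zeros((n, p, p))                 # P_t^t
    x_prev, P_prev = mu0, Sigma0             # x_0^0 = mu0, P_0^0 = Sigma0
    for t in range(n):
        # Forecast step: x_t^{t-1} and P_t^{t-1}
        xp = Phi @ x_prev
        Pp = Phi @ P_prev @ Phi.T + Q
        # Update step: fold in y_t via the Kalman gain
        S = A @ Pp @ A.T + R                 # innovation covariance
        K = Pp @ A.T @ np.linalg.inv(S)      # Kalman gain
        xf[t] = xp + K @ (y[t] - A @ xp)
        Pf[t] = (np.eye(p) - K @ A) @ Pp
        x_prev, P_prev = xf[t], Pf[t]
    return xf, Pf
```

With the simulated `y`, `Phi`, `A`, `Q`, `R`, `mu0`, `Sigma0` from the sketch in the previous section, `xf, Pf = kalman_filter(y, Phi, A, Q, R, mu0, Sigma0)` returns the filtered state estimates and their error covariances.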