In the last post, I reviewed some initial implications of estimating the spectral density of a signal. We saw that our estimator depends on the properties of the Dirichlet kernel. In this post I go further and discuss the effect this kernel has on our estimator, specifically on a property called bias.
As discussed before, when we measure the world and collect our samples, we do not get an analytic signal but rather a noisy, imperfect realization of an underlying distribution that is strong enough to be captured by our measurements. In some cases this underlying signal is very strong, as when we measure the movement of the moon in the sky; in other cases the signal is not so clear, as with the movement of molecules in a liquid. That is why what we analyze is an estimator of that underlying distribution.
It was also mentioned that the estimator is, hence, a stochastic process (a random variable or random signal), while the underlying distribution is considered to be deterministic. As a random signal, it makes sense to say that the estimator is good if:
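- The expected value (mean) of the estimator matches the underlying distribution, i.e. $E[\hat{S}(f)] = S(f)$, with $\hat{S}(f)$ the estimator and $S(f)$ the underlying distribution.
- The variance is small or, even better, zero, i.e. $\mathrm{Var}[\hat{S}(f)] \approx 0$.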
Satisfying both conditions is the ideal case. It implies that, with our observed data, our estimator is likely to provide a good description of the underlying statistics (the description would be complete if the underlying statistic were Gaussian, because mean and variance are enough to describe a normal distribution). Nonetheless, this is the exception, and usually the situation is more complex.
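Let's focus on the expected value of the estimator. If it does not match the underlying distribution, we can describe the mismatch with another quantity, called the bias. In formal terms, $\mathrm{Bias}[\hat{S}(f)] = E[\hat{S}(f)] - S(f)$, where $\hat{S}(f)$ is the estimator and $S(f)$ is the underlying distribution.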
In everyday language, bias is defined as 'Inclination or prejudice for or against one person or group, especially in a way considered to be unfair'. In statistics you can think of the estimator as being unfair and hence not revealing the true distribution. In the ideal case mentioned above, the estimator is said to be unbiased. This is of course a rare situation. But we want to get close to that ideal! What can we expect then? Well, the most realistic hope is that observing or measuring for long enough is sufficient to get a good estimator. In other words, we hope that the estimator is asymptotically unbiased.
It's not hard to understand a person; it is only hard to listen without bias
Criss Jami, https://www.facebook.com/crissjami
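To make "asymptotically unbiased" concrete before turning to spectra, here is a minimal sketch using a different, textbook estimator: the sample variance that divides by N. Its bias is known to be minus the true variance divided by N, so it vanishes as the number of samples grows. The seed, the number of trials, and the function name `estimator_bias` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
true_var = 1.0                  # variance of the underlying distribution
n_trials = 10_000               # independent realizations of the estimator

def estimator_bias(n_samples):
    """Monte Carlo estimate of E[estimator] - true value for the 1/N variance estimator."""
    x = rng.normal(0.0, np.sqrt(true_var), size=(n_trials, n_samples))
    estimates = x.var(axis=1)   # NumPy default ddof=0, i.e. divide by N
    return estimates.mean() - true_var

for n in (5, 50, 500):
    print(f"N={n:4d}  empirical bias={estimator_bias(n):+.4f}  theory={-true_var / n:+.4f}")
```

The empirical bias should track the theoretical value and shrink toward zero as N grows, which is the behavior we would like the spectral estimator to have as the observation time increases.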
OK, but what does this have to do with spectral estimation? Let's recall where the last post ended. We had that our estimator of the spectrum is:
which is the Dirichlet kernel that you can see in the figure below.
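The kernel can also be evaluated numerically. The sketch below assumes a record of `n_samples` points with uniform spacing `dt` (so T = n_samples · dt) and uses the common closed form sin(π f N Δt) / sin(π f Δt); the function name and the numeric values are assumptions made for illustration. It cross-checks the closed form against the Fourier transform of the rectangular window summed term by term.

```python
import numpy as np

def dirichlet_kernel(f, n_samples, dt):
    """sin(pi f N dt) / sin(pi f dt), with the limiting value N where the denominator vanishes.

    Intended for |f| < 1/dt, where the only zero of the denominator is f = 0.
    """
    f = np.asarray(f, dtype=float)
    num = np.sin(np.pi * f * n_samples * dt)
    den = np.sin(np.pi * f * dt)
    small = np.abs(den) < 1e-12
    return np.where(small, float(n_samples), num / np.where(small, 1.0, den))

n_samples, dt = 100, 0.01            # assumed values: T = 1 s sampled at 100 Hz
T = n_samples * dt
freqs = np.linspace(-5.0, 5.0, 1001)

kernel = dirichlet_kernel(freqs, n_samples, dt)

# Cross-check: Fourier transform of the rectangular window, summed term by term
k = np.arange(n_samples)
window_ft = np.exp(-2j * np.pi * np.outer(freqs, k) * dt).sum(axis=1)

print("max magnitude difference:", np.max(np.abs(np.abs(window_ft) - np.abs(kernel))))
print("value at f = 0:  ", float(dirichlet_kernel(0.0, n_samples, dt)))
print("value at f = 1/T:", float(dirichlet_kernel(1.0 / T, n_samples, dt)))
```

The kernel peaks at f = 0 with height N and has its first zero at f = 1/T, which is where the 2/T width of the central lobe comes from.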
The effect of this estimator is better illustrated with an example. Imagine that our signal is a pure tone, that is, a signal that oscillates at only one frequency, for instance f = 10 Hz. Because only one of the Fourier basis functions is needed, its representation in the spectral domain can be described analytically by two Dirac delta functions. Despite this, when we measure or observe our signal we are windowing the whole signal, and hence the estimated spectrum no longer looks like two delta functions but like the Dirichlet kernel centered at those two deltas. The image below describes the situation.
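The pure-tone example can be reproduced in a few lines. This is only a sketch: the sampling rate, the observation time, and the amount of zero-padding are assumed values, and the last quantity (the share of power lying outside the two central lobes) is just one possible way to put a number on the leakage.

```python
import numpy as np

fs = 100.0      # assumed sampling rate in Hz
T = 2.0         # assumed observation time in seconds
f0 = 10.0       # tone frequency from the example
n_fft = 4096    # zero-padding only evaluates the spectrum on a finer grid

t = np.arange(int(T * fs)) / fs
x = np.cos(2 * np.pi * f0 * t)   # pure tone seen through a window of length T

spectrum = np.fft.fftshift(np.fft.fft(x, n_fft))
freqs = np.fft.fftshift(np.fft.fftfreq(n_fft, d=1 / fs))
power = np.abs(spectrum) ** 2 / len(x) ** 2

# Location of the largest peak on each side of the frequency axis
pos, neg = freqs > 0, freqs < 0
f_peak_pos = freqs[pos][np.argmax(power[pos])]
f_peak_neg = freqs[neg][np.argmax(power[neg])]

# Share of the power lying outside both central lobes (more than 1/T from +/- f0)
outside = np.abs(np.abs(freqs) - f0) > 1.0 / T
leak = power[outside].sum() / power.sum()

print(f"peaks at {f_peak_neg:+.2f} Hz and {f_peak_pos:+.2f} Hz")
print(f"share of power outside the central lobes: {leak:.3f}")
```

Two ideal deltas would put all the power exactly at ±10 Hz. Here the peaks sit at the right place, but a visible share of the power is spread over other frequencies.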
Just visually, the estimator seems far from being a good one. Two features keep this kernel from looking like a delta: the central peak (marked with a star) has a finite width, given by 2/T, and the first side lobe (marked with a dot) has a finite height. Combining this with the definition of bias, it seems clear that the mean value of several realizations of this estimator will not converge to the real underlying distribution. We can then say that the spectral density estimator computed in this way is a biased estimator. In the literature, the two sources of this bias are called narrowband bias, corresponding to the width of the central lobe, and broadband bias, the effect due to the slow decay of the side lobes.
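Both features can be measured directly on the kernel. The sketch below, with an assumed sampling rate and observation time, takes the squared magnitude of the transform of a rectangular window, locates the first null and the first side lobe, and reports the null-to-null width and the side lobe height relative to the central peak.

```python
import numpy as np

fs = 100.0        # assumed sampling rate in Hz
T = 2.0           # assumed observation time in seconds
n_fft = 2 ** 16   # dense frequency grid via zero-padding

window = np.ones(int(T * fs))                      # rectangular observation window
power = np.abs(np.fft.rfft(window, n_fft)) ** 2
power /= power[0]                                  # central peak normalized to 1
freqs = np.fft.rfftfreq(n_fft, d=1 / fs)

# First null: first grid point that is a local minimum of the kernel power
is_min = (power[1:-1] < power[:-2]) & (power[1:-1] <= power[2:])
i_null = np.argmax(is_min) + 1

# First side lobe: first local maximum after that null
tail = power[i_null:]
is_max = (tail[1:-1] > tail[:-2]) & (tail[1:-1] >= tail[2:])
i_side = i_null + np.argmax(is_max) + 1

main_lobe_width = 2 * freqs[i_null]                # null-to-null, counting both sides
side_lobe_db = 10 * np.log10(power[i_side])

print(f"central lobe width: {main_lobe_width:.3f} Hz (2/T = {2 / T:.3f} Hz)")
print(f"first side lobe: {side_lobe_db:.2f} dB relative to the central peak")
```

The width should come out close to 2/T, and the first side lobe close to the roughly -13 dB level usually quoted for a rectangular window.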
Will our estimator at least be asymptotically unbiased? Since the narrowband bias depends on 2/T, this term becomes negligible as T goes to infinity. Hurrah! The broadband bias is subtler: as T increases, the side lobes move closer to the central lobe, but they decay only slowly with T. These effects can be observed in Figure 3. Even so, as the observation period increases, this bias also converges to zero in the limit of T going to infinity. We can hence claim that our spectral density estimator is asymptotically unbiased!
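The trend with T can be checked numerically. As a rough sketch, the code below measures, for several observation times, the share of the kernel power that falls more than a fixed distance from the center. The band of 0.5 Hz, the sampling rate, and the list of observation times are assumptions chosen for illustration, and this share is only a proxy for the bias rather than the bias itself.

```python
import numpy as np

fs = 100.0        # assumed sampling rate in Hz
band = 0.5        # assumed half-width (Hz) of the region counted as "near the tone"
n_fft = 2 ** 18   # dense frequency grid via zero-padding

freqs = np.fft.fftfreq(n_fft, d=1 / fs)
outside = np.abs(freqs) > band

leak_fractions = {}
for T in (1, 4, 16, 64):                           # observation times in seconds
    window = np.ones(int(T * fs))                  # rectangular observation window
    power = np.abs(np.fft.fft(window, n_fft)) ** 2
    leak_fractions[T] = power[outside].sum() / power.sum()
    print(f"T={T:3d} s  central lobe width={2 / T:.4f} Hz  "
          f"power beyond {band} Hz: {leak_fractions[T]:.4f}")
```

The leaked share keeps dropping as T grows, but slowly, which matches the picture of side lobes that crowd toward the central lobe while decaying only gradually.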
This is a great step in statistics. It gives us the confidence to use the spectral estimator in further analysis without compromising any hypothesis or model built on its implications. However, this holds only in the limit of infinite time, which is not the time we usually have to take our measurements. Is there any other way to boost the performance of the estimator, especially against the broadband bias? In addition, so far I have talked only about whether there is a deviation from the underlying truth, but I have not mentioned how much the estimator varies on the way to that limit, i.e. I have not talked about the other term, the variance of the estimator. What will happen with that term? These questions and many others will be answered in future posts. Stay tuned!
If you have any questions about this post, please ask freely in the comments. In the coming days I will upload the Python code I used to generate all the plots to my GitHub so that you can reproduce these examples.
Now, time to solve!