In the last post I reviewed some first implications of estimating the spectral density of a signal. We saw that our estimator depends on the properties of the Dirichlet kernel. In this post I go a step further and discuss the effect this kernel has on our estimator, specifically on a property called bias.

As discussed before, when we measure the world and collect our samples we do not get an analytic signal but a noisy, imperfect realization of an underlying distribution that is strong enough to be captured by our measurements. In some cases this underlying signal is really strong, as when we measure the movement of the moon in the sky. In other cases the signal is not so clear, as with the movement of molecules in a liquid. That is why what we analyze is an estimator of that underlying distribution.

It has also been mentioned that the estimator is therefore a stochastic process (a random variable or signal), while the underlying distribution is considered deterministic. For a random signal, it makes sense to call the estimator good if:

• The expected value (mean) of the estimator matches the underlying distribution, i.e. $E[\hat{S}(f)] = S(f)$, where $\hat{S}(f)$ denotes the estimator and $S(f)$ the underlying spectral density.
• The variance is small or, even better, zero, i.e. $\mathrm{Var}[\hat{S}(f)] \to 0$.

Meeting both conditions is the ideal case. It implies that, with our observed data, the estimator is likely to give a good description of the underlying statistics (the description would be complete if the underlying statistic were Gaussian distributed, because mean and variance are enough to describe a normal distribution). Nonetheless, this is the exception, and the situation is usually more complex.

Let's focus on the expected value of the estimator. If it does not match the underlying distribution, we describe the mismatch with another quantity, called the bias. In formal terms, with $\hat{S}(f)$ the estimator and $S(f)$ the underlying distribution:

$$b(f) = E[\hat{S}(f)] - S(f)$$

In daily language, bias is defined as 'Inclination or prejudice for or against one person or group, especially in a way considered to be unfair'. In statistics you can think of the estimator as being unfair and hence not revealing the true distribution. In the ideal case mentioned above, the estimator is said to be unbiased. This is of course a rare situation, but we want to get close to that ideal! What can we expect then? The most realistic hope is that observing or measuring for long enough is sufficient to get a good estimator. In other words, if the bias tends to zero as the observation time $T$ goes to infinity, the estimator is considered asymptotically unbiased.
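To make "asymptotically unbiased" concrete, here is a minimal sketch using a textbook stand-in that is not the spectral estimator of this post: the variance estimator that divides by N instead of N-1. Its bias is known to be $-\sigma^2/N$, so it vanishes as the number of samples grows. The function name `empirical_bias` and all parameter values are assumptions chosen only for illustration.

```python
import numpy as np

def empirical_bias(n_samples, n_trials=5000, sigma2=4.0, seed=0):
    """Monte Carlo bias of the 1/N variance estimator for Gaussian data."""
    rng = np.random.default_rng(seed)
    # Each row is one realization made of n_samples observations.
    x = rng.normal(0.0, np.sqrt(sigma2), size=(n_trials, n_samples))
    # ddof=0 divides by N, which is what makes this estimator biased.
    estimates = np.var(x, axis=1, ddof=0)
    # Bias = mean of the estimator over realizations minus the true value.
    return estimates.mean() - sigma2

for n in (5, 50, 500):
    # Empirical bias next to the theoretical value -sigma2 / N.
    print(n, empirical_bias(n), -4.0 / n)
```

The more samples each realization contains, the closer the averaged estimate gets to the true value, which is the behavior we hope to find in the spectral estimator as the observation time grows.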

It's not hard to understand a person; it is only hard to listen without bias

##### Criss Jami, https://www.facebook.com/crissjami

OK, but what does this have to do with spectral estimation? Let's recall where the last post ended. We had that our estimator of the spectrum is:



with:



which is the Dirichlet kernel that you can see in the figure below.

FIGURE 1. Example of the Dirichlet kernel for different window/observation times. It was already introduced in the previous post and gives a nice overview of the behavior of the kernel, especially of how the amplitude and width of the lobes change with T.
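As a small numerical companion to the figure, the sketch below assumes the continuous-time form of the kernel for a rectangular window of length $T$, $D_T(f) = \sin(\pi f T) / (\pi f)$. This notation and the helper name `dirichlet_kernel` are assumptions and may differ from the previous post. With this form the central lobe has height $T$ and its first zeros sit at $f = \pm 1/T$, giving a main-lobe width of $2/T$.

```python
import numpy as np

def dirichlet_kernel(f, T):
    """Spectrum of a rectangular window of length T: sin(pi f T) / (pi f)."""
    f = np.asarray(f, dtype=float)
    # np.sinc(x) = sin(pi x) / (pi x), so T * sinc(f T) is well defined at f = 0.
    return T * np.sinc(f * T)

for T in (0.5, 1.0, 2.0):
    peak = float(dirichlet_kernel(0.0, T))            # height of the central lobe
    at_zero = float(dirichlet_kernel(1.0 / T, T))     # value at the first zero, f = 1/T
    print(f"T={T}: peak={peak}, value at 1/T={at_zero:.1e}, main-lobe width={2.0 / T} Hz")
```

Longer observation times make the central lobe taller and narrower, which is the trend visible in the figure.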

The effect of this estimator is better illustrated with an example. Imagine that our signal is a pure tone, that is, a signal that oscillates with only one frequency, for instance f = 10 Hz. Because only one function of the Fourier basis is needed, the representation in the spectral domain can be described analytically by two Dirac delta functions. Despite this, when we measure/observe our signal we are windowing the whole signal, and hence the estimated spectrum no longer looks like two delta functions but like the Dirichlet kernel centered at those two deltas. The image below describes the situation.

FIGURE 2. An example of the practical implications of the estimator. First row, left: a physical signal with f = 10 Hz, a pure tone. First row, right: its spectrum, which is known analytically and has the shape of two Dirac delta functions. Second row, left: the same signal, now considering only the observation time given by the red window. Second row, right: the resulting spectrum, which is completely different from the previous one. In this case the spectrum is characterized by two Dirichlet kernels centered at the points defined by the previous deltas. The two spectra are clearly quite different, which provides a good example of the bias introduced by our estimator. Last row: a zoom into the last spectrum, highlighting two main features, the main lobe and the secondary lobe.
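The situation in the figure can be reproduced approximately with a few lines of NumPy. This is only a sketch and not the code used for the plots: the sampling rate, the zero-padding factor and the function name `windowed_tone_spectrum` are assumptions made for illustration.

```python
import numpy as np

def windowed_tone_spectrum(f0=10.0, T=1.0, fs=1000.0, pad=16):
    """Magnitude spectrum of a cosine of frequency f0 observed for T seconds."""
    n = int(round(T * fs))
    t = np.arange(n) / fs
    x = np.cos(2.0 * np.pi * f0 * t)
    # Zero-padding only interpolates the spectrum so that the lobes become visible.
    n_fft = pad * n
    # Dividing by fs makes the DFT approximate the continuous Fourier transform.
    spectrum = np.abs(np.fft.rfft(x, n=n_fft)) / fs
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    return freqs, spectrum

freqs, spectrum = windowed_tone_spectrum()
peak_freq = freqs[np.argmax(spectrum)]
# Fraction of the peak height that leaks to 15.5 Hz, well away from the tone.
leak = spectrum[np.argmin(np.abs(freqs - 15.5))] / spectrum.max()
print(peak_freq, leak)
```

Instead of a single spike at 10 Hz, the estimate has a main lobe of finite width around 10 Hz and a non-zero value several hertz away from the tone.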

Just visually, the estimator seems far from being a good one. Two factors prevent this kernel from looking like a delta: the central peak (marked with a star) has a finite width, given by 2/T, and the first side lobe (marked with a dot) has a finite height. Putting this together with the definition of bias, it seems clear that the mean value of several realizations of this estimator will not converge to the real underlying distribution. We can then say that the spectral density estimator computed in this way is a biased estimator. Because the bias comes from these two phenomena, the literature distinguishes two kinds: narrowband bias, corresponding to the width of the central lobe, and broadband bias, the effect of the slow decay of the side lobes.

Will our estimator at least be asymptotically unbiased? Since the narrowband bias depends on 2/T, this term becomes negligible when T goes to infinity. Hurrah! The broadband bias has the particularity that, when T increases, the side lobes move closer to the central lobe, although they decay only slowly with T. These effects can be observed in Figure 3. In spite of these effects at increasing observation periods, this bias also converges to zero in the limit when T goes to infinity. We can hence claim that our spectral density estimator is asymptotically unbiased!

FIGURE 3. Example of how the width of the window influences the two types of bias, narrowband and broadband. The first row shows the same pure tone as before, with a slightly reduced amplitude for the sake of visualization, and three different windows. The lower row shows the estimator obtained with the windowed data. Observe how, as T increases, the side lobes approach the frequency of the signal while, as a disadvantage, the amplitude of those lobes decays more slowly. At infinity these lobes will converge to a single point at f = 10 Hz but... nobody has that much time 🙂
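Both trends can be checked roughly with numbers. The sketch below assumes a kernel of the form $\sin(\pi f T)/(\pi f)$, normalized to a peak of 1, and measures the leakage relative to the central peak. The 5 Hz offset, the frequency grid and the function name `bias_indicators` are arbitrary choices for illustration.

```python
import numpy as np

def bias_indicators(T, offset=5.0, f_max=50.0, n_points=200001):
    """Main-lobe width and largest relative leakage beyond `offset` Hz for window length T."""
    f = np.linspace(-f_max, f_max, n_points)
    # |sin(pi f T) / (pi f)| divided by its peak value T, written with np.sinc.
    kernel = np.abs(np.sinc(f * T))
    # Distance between the first zeros at f = -1/T and f = +1/T.
    main_lobe_width = 2.0 / T
    # Highest side lobe found at least `offset` Hz away from the centre.
    far_leakage = kernel[np.abs(f) >= offset].max()
    return main_lobe_width, far_leakage

for T in (0.5, 1.0, 2.0, 4.0):
    width, leakage = bias_indicators(T)
    print(f"T={T:4.1f} s  main lobe={width:5.2f} Hz  relative leakage beyond 5 Hz={leakage:.4f}")
```

The main-lobe width halves every time T doubles, while the relative leakage far from the tone shrinks only in proportion to 1/T, which is why the broadband bias is the more stubborn of the two.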

This is an important result. It gives us the confidence to use the spectral estimator in further analysis without compromising any hypothesis or model built on it. However, it holds only in the limit of infinite observation time, which is not the time we usually have for our measurements. Is there any other way to boost the performance of the estimator, especially against the broadband bias? In addition, so far I have only talked about whether there is a deviation from the underlying truth. I have not said how much the estimator varies before reaching that point, i.e. I have not talked about the other term, the variance of the estimator. What happens with that term? These questions and many others will be answered in future posts. Stay tuned!

If you have any questions about this post, please ask freely in the comments. In the next few days I will upload the Python code I used to generate all the plots to my GitHub so that you can reproduce these examples.

Now, time to solve!