Signal Processing

Signal vs. Noise
Perhaps the most important skill in signal processing is the ability to separate important data (signal) from irrelevant data (noise). The human brain is naturally pretty good at this, which is whi you can sitll camprehend tihs sntence. Computers, on the other hand, are naturally very bad at it, which is why a single missing semicolon can crash an entire server. This has become a major problem, since modern science depends so heavily on computers, and has brought about some really creative techniques to handle noisy data. Even with these clever algorithms, however, less noise is always better than more noise.

But what does it mean for data to have "less" noise? A strong wind can make it hard to hear someone whispering, but has almost no effect on a loud concert. The noise hasn’t changed, yet the first scenario seems much "noisier" than the second. This is because it’s only useful to quantify noise relative to the signal being measured. In other words, the important quantity isn’t the noise level, but rather the signal-to-noise ratio, defined as


 * $$R = \frac{I_{\mathrm{signal}}}{I_{\mathrm{noise}}},$$

where $I$ is average intensity (or power, in the case of a time-dependent signal). Unless you’re a masochist, you want $R$ to be as large as possible. Whenever you’re trying to clean up your signal by boosting or filtering, make sure to think about this ratio. Otherwise, you might be changing the signal and the noise in proportional amounts, and not actually improving your ratio.
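
To make the last point concrete, here is a minimal sketch in Python. It assumes the signal and the noise are available as separate lists of samples, which is rarely true in a real measurement, and it takes $I$ to be the mean of the squared samples. The names `mean_power` and `snr`, the 5-cycle sine, and the noise level are illustrative choices, not part of any standard library.

```python
import math
import random


def mean_power(samples):
    """Average power of a sampled signal: the mean of the squared samples."""
    return sum(x * x for x in samples) / len(samples)


def snr(signal, noise):
    """Signal-to-noise ratio R = I_signal / I_noise, with mean power used as I."""
    return mean_power(signal) / mean_power(noise)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 1000

# Hypothetical test data: 5 full cycles of a unit sine, plus Gaussian noise.
signal = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
noise = [rng.gauss(0.0, 0.5) for _ in range(n)]

r_before = snr(signal, noise)

# "Boost" everything with the same amplifier gain: signal and noise both scale.
gain = 10.0
r_after = snr([gain * s for s in signal], [gain * v for v in noise])

print(r_before, r_after)  # the two ratios agree up to rounding error
```

Amplifying the whole measurement multiplies both powers by the square of the gain, so $R$ is unchanged. Only an operation that treats signal and noise differently, such as a filter that removes frequencies where the noise lives, can raise the ratio.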

Nyquist Sampling and Aliasing
On paper, we treat variables as continuous, meaning they can change by arbitrarily small amounts. When we go to take data, however, we end up with a long list of discrete points, and no information about the spaces in between. The sample rate is defined as one over the spacing between data points, and it limits which features of a signal can and can’t be measured.

The most important constraint is known as the Nyquist frequency, folding frequency, or Nyquist limit. This is the highest frequency that can be faithfully measured at a given sample rate. The incredibly complicated equation used to find the Nyquist frequency is...


 * $$f_N = f_s / 2,$$

where $f_N$ is the Nyquist frequency and $f_s$ is the sampling frequency. In other words, you can’t accurately measure an oscillation without at least two data points per cycle. Any signal component above the Nyquist frequency will be aliased: it shows up in the data at a lower frequency, where it can’t be told apart from a real signal at that frequency. To avoid this, you must sample at least twice as fast as the highest frequency component present. Similarly, non-periodic features shorter than the spacing between data points may not be captured at all. The practice of picking a sample rate based on the expected signal frequency is called Nyquist sampling.
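
Aliasing is easy to see numerically. The sketch below assumes an ideal sampler taking evenly spaced, instantaneous samples of a pure sine wave. The helper `aliased_frequency` and the specific frequencies are illustrative, not taken from any particular library.

```python
import math


def aliased_frequency(f, fs):
    """Apparent frequency of a tone at f when sampled at rate fs.

    Folds f into the range [0, fs / 2], i.e. at or below the Nyquist frequency.
    """
    f_mod = f % fs
    return min(f_mod, fs - f_mod)


fs = 10.0      # sampling frequency in Hz, so the Nyquist frequency is 5 Hz
f_true = 9.0   # true tone, above the Nyquist frequency
f_alias = aliased_frequency(f_true, fs)  # 1.0 Hz

# Sample the 9 Hz sine at 10 Hz.
samples_true = [math.sin(2 * math.pi * f_true * k / fs) for k in range(20)]

# Sample a 1 Hz sine at the same instants, with its sign flipped
# (9 Hz = 10 Hz - 1 Hz, so the alias appears with inverted phase).
samples_alias = [-math.sin(2 * math.pi * f_alias * k / fs) for k in range(20)]

# The two sets of samples agree to within rounding error.
max_diff = max(abs(a - b) for a, b in zip(samples_true, samples_alias))
print(f_alias, max_diff)
```

Once the data have been taken, nothing in the samples distinguishes the 9 Hz tone from a 1 Hz one. That is why the sample rate has to be chosen, or the input filtered, before sampling rather than corrected afterwards.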