Theo Verelst Audio Digital Signal Processing Page

last update: Aug 26 2011

(Previous Page at this URL: Theo Verelst DSP Page)
(Clearly this page isn't finished..)

Introduction

The intention here is to begin with an overview of the main Electrical Engineering type of theories and considerations concerning the broad subject of Digital Signal Processing, with an emphasis on serious audio applications, and some pointers and examples about the studio technologies from when there was still bright daylight in the hot one hundred and so on. I don't suggest this is a course which will make you a successful digital Lexicon reverberation unit user in that ballgame, but I'm sure that direction is valid (and to some extent present here), and a lot of "modern" software isn't bound to go anywhere but the dust-pile, because it should.

The examples I've worked on are either hardware (synths, effects, analog filters, DSP boards, FPGA prototypes) or run on Linux, with just a few exceptions (Cubase on Windows XP Pro), because years ago I delved into some of the existing software for PCs and was disappointed at the quality levels; honestly, for too large a part they still all sound alike, I don't like that sound, and I suspect the main mathematical approaches will never gravitate to something all too OK. This is not necessarily a flaw in the design of the Windows OS, it's just a partial observation.

Theoretically speaking, it is first necessary to temper some of the enthusiasm about processing audio in the digital domain, especially at low sample rates (like CD quality), so we'll look at some of the EE foundations of what it means to put a signal in digital form, and what it takes to get the original signal back with very little distortion. It is also a good idea to know something about generalities from electronics and general audio land, like the power of loudspeaker systems, High Fidelity and signal integrity, standard analog processing boxes like equalizers and limiters, and, as a more advanced subject, the loudness curve and how the natural properties of hearing are important in well-made music and sound recordings.

Of course many readers will want to know about (Fast) Fourier Transforms, sound sampling and the playing of samples, impulse theory, and some practical examples of how to improve digital audio processing work-flows according to what I'm arguing for here: I'll see how far I'll get, but I can guarantee that knowing some essential theoretical rules, how they translate to digital problems, and some possible solutions is worthwhile.


The Sampling theorem in practice

Let's begin with the main problem that always shows up in digital audio chains: converting the recorded digital signal back to an analog signal with a Digital to Analog (DA) converter.

Of course many have heard of the Nyquist theorem from Shannon's sampling theory, which states that an analog signal which is time-sampled (in an equidistant manner, meaning the samples are taken at exactly equal distances) and put into sampled values of a discrete number (quantisation of the value of the samples) can contain signals only of frequencies up to half the sampling rate. Higher frequencies will get "aliased", meaning the higher frequencies are irrecoverably lost (for any general type of sampling setup) and cause very ugly noise which interferes with the other frequencies (in known ways).

In short, the signal entering the Analog to Digital converter must be filtered or otherwise guaranteed to have no frequencies in it higher than half the sampling frequency, or the recorded digital signal cannot be played back with likeness to the original, and that is a hard rule, no real exceptions possible, for the general case of all possible signals. Usually AD converters, as an electronics box/part, will integrate an analog and digital filter set for the purpose of limiting higher-frequency content, but still, for a practical audio recording it can well be important to use microphones and analog precautions to make sure no aliasing takes place during recording.
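
To make the aliasing concrete, here is a rough numerical sketch (I'm assuming Python with numpy purely as the illustration tool, the tone frequencies are just example values): a 30 kHz tone sampled at 48 kHz shows up at 18 kHz, exactly the kind of irrecoverable mess the input filtering must prevent.

    # A 30 kHz tone sampled at 48 kHz folds back to 18 kHz (48 - 30 kHz),
    # since 30 kHz lies above fs/2 = 24 kHz.
    import numpy as np

    fs = 48000                                     # sample rate in Hz
    n = np.arange(4096)
    tone = np.sin(2 * np.pi * 30000 * n / fs)      # frequency above fs/2

    spectrum = np.abs(np.fft.rfft(tone * np.hanning(len(tone))))
    freqs = np.fft.rfftfreq(len(tone), d=1.0 / fs)
    print("apparent frequency: %.0f Hz" % freqs[np.argmax(spectrum)])  # ~18000 Hz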

A second important part of sampling theory is the perfect reconstruction theorem, meaning the theory states it is possible to take a perfectly sampled signal (created in whichever way, and of course considering the accuracy of the quantisation, i.e. "the number of noise-free bits per sample") and recreate, with mathematical perfection as the limit case, the original analog signal from it with a DA converter. This in essence requires, mathematically speaking, a "sinc" function laid through every sample, and all these functions must then be added up to get the original signal back, but the sinc function and the additions require infinite time. So it is possible to define or "measure" correctly sampled samples from an analog signal, and in mathematical form recreate the real original signal from those, with infinite accuracy, but this will take many computations, and in principle takes infinite time.
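
As an illustration of what that reconstruction means mathematically, here is a minimal sketch (again assuming Python/numpy, with made-up signal names): a sinc laid through every sample, summed. A real DA converter can only approximate this, because each sinc stretches out infinitely in time.

    # Ideal sinc reconstruction: x(t) = sum_n x[n] * sinc(fs*t - n)
    import numpy as np

    def sinc_reconstruct(samples, fs, t):
        """Evaluate the reconstructed signal at times t (in seconds)."""
        n = np.arange(len(samples))
        # np.sinc is the normalized sinc: sin(pi x)/(pi x)
        return np.sum(samples[None, :] * np.sinc(t[:, None] * fs - n[None, :]), axis=1)

    fs = 8000.0
    n = np.arange(64)
    x = np.sin(2 * np.pi * 440.0 * n / fs)       # a 440 Hz tone, sampled
    t = np.arange(0, 64 / fs, 1 / (8 * fs))      # evaluate on an 8x finer time grid
    y = sinc_reconstruct(x, fs, t)               # close to the original, except near the edges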

Of course, every practical audio DA converter has a less perfect and less-than-infinite-delay "reconstruction filter" built in, of varying accuracy, but this is never good enough to recreate perfect audio from sampled signals, and honestly, in practice hardly any DA converter in any machine does a good enough job for decent hi-fi in the sensitive mid-frequency range. Some measures can be taken to make the sound pleasant through general converters, and in premeditated cases this can work sufficiently well to make HiFi listeners somewhat satisfied, but the only right way out in general is to make better, longer-filtering DA converters, which probably isn't realistic for live sound, because the reconstruction filter lengths may well surpass significant parts of a second.

For non-live applications it can pay to take a low-sample-frequency signal (like an audio CD) and convert the sample rate up to for instance 96 or even 192 kHz, so that much more data space per song is required, but with a long (and computationally expensive) up-sample filter a 192 kHz converter may in practice make a big difference to the ever-present mid-range distortion, because the sample frequency is more than two octaves higher. Mind that the filters to do this conversion properly are long and heavy to compute, and therefore not the simple upsample filters you can easily find and which do not suggest significant delay; those cannot theoretically work great. That too is hard theoretical fact, no loopholes for general signals.
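
For illustration, a hedged sketch of such an up-sample step, assuming Python with scipy's polyphase resampler as a stand-in for a serious conversion filter: 192000/44100 reduces to the ratio 640/147, and a longer filter window buys a tighter transition band at the price of delay and computation.

    # Polyphase upsampling from 44.1 kHz to 192 kHz (ratio 640/147).
    import numpy as np
    from scipy.signal import resample_poly

    fs_in, fs_out = 44100, 192000
    x = np.random.randn(fs_in)                   # one second of stand-in audio
    y = resample_poly(x, 640, 147, window=('kaiser', 12.0))   # longer window = steeper filter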

Digital filters

Depending on the definition used this can of course be a huge subject, unfortunately filled with people who often range from laymen and quacksalvers to the deluded well-meaning and the pyramid-game-interested. Essentially, of course, a digital filter is a "black box" where digital signals come in and, probably at the same sample rate, signals come out with less than infinite delay.

In formal theory from the past (I think the book that was used, among others, at my university was Papoulis' "Circuits and Systems") a possible way to define digital filters would often be to compare them with analog filters one-on-one, which is essentially possible by treating the difference between subsequent samples as a measure for the derivative of a signal, like a capacitor in electronic filters behaves as a differentiator between voltage and current. The corresponding analog and digital filters can be exactly defined with unambiguous filter type connections (apart from accuracy), but the digital versions will not necessarily behave well, and there is a (possibly small, but not necessarily) approximation error related to a power function. Signals of low frequency compared to the sample rate will work fine with these analog-similarity filters, and sound fairly natural in practice.
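
As a small illustration of that one-on-one correspondence (a sketch, assuming Python/numpy and an RC low-pass as the analog prototype), the capacitor equation RC·dy/dt + y = x can be discretized by replacing the derivative with the difference between subsequent samples:

    # RC low-pass discretized with a backward difference:
    # RC*(y[n]-y[n-1])/T + y[n] = x[n]  ->  y[n] = y[n-1] + a*(x[n]-y[n-1])
    import numpy as np

    def rc_lowpass(x, fs, cutoff_hz):
        rc = 1.0 / (2 * np.pi * cutoff_hz)   # analog time constant
        T = 1.0 / fs                          # sample period
        a = T / (rc + T)                      # discrete coefficient
        y = np.zeros_like(x)
        for i in range(1, len(x)):
            y[i] = y[i - 1] + a * (x[i] - y[i - 1])
        return y

    y = rc_lowpass(np.random.randn(48000), fs=48000, cutoff_hz=1000.0)
    # Accurate well below fs/2; the approximation error grows toward half the sample rate.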

All those filters end up having Infinite Impulse Response (IIR) properties, which essentially means that if a signal spike is sent through an analog filter, the output of that filter will take "forever" to stabilize, which is (apart from quantisation accuracy) the same for digital filters of similar transfer function (an electronics/physics/mechanics concept which theoretically states filter behaviour as a formula of known type), so the digital filter of IIR kind will continue to give out a transient and possibly resonances until either the output and internal signals become smaller than the accuracy or too noisy to be relevant.
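
A quick illustrative sketch of that behaviour (assuming Python with scipy.signal, and a made-up narrow resonance near 1 kHz): a single impulse into an IIR filter keeps ringing far beyond any fixed number of samples.

    # An impulse into a resonant IIR filter rings out for many milliseconds,
    # decaying toward the numerical noise floor rather than stopping at a fixed length.
    import numpy as np
    from scipy.signal import butter, lfilter

    fs = 48000
    b, a = butter(2, [950 / (fs / 2), 1050 / (fs / 2)], btype='bandpass')  # resonance near 1 kHz
    impulse = np.zeros(4800)                     # 100 ms of silence after a single spike
    impulse[0] = 1.0
    ringing = lfilter(b, a, impulse)             # decays for tens of milliseconds, never exactly zero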

The other main type of filter, which mostly only exists in the digital domain (though there are various theoretical connections possible with analog circuits and physics), is called a Finite Impulse Response (FIR) filter, for instance a subsequent-sample-averaging filter, often referred to as having "taps". In practice the oversampling filters in advanced AD converters may well use FIR filter stages of specific types to help anti-aliasing of the sampled signal.
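
A minimal FIR sketch for comparison (Python/numpy assumed, the tap values invented for the example): an 8-tap averaging filter, whose impulse response is exactly its 8 taps, so the output settles 8 samples after the input stops.

    # 8-tap moving average: a finite impulse response of exactly 8 samples.
    import numpy as np

    taps = np.ones(8) / 8.0                      # 8 equal taps
    x = np.random.randn(1000)
    y = np.convolve(x, taps)[:len(x)]            # finite response, fixed (short) delay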

Linearity and its importance, shift invariance

...
The main observation about the theoretical and practical sampling theory to think about is that the well enough behaved IIR and FIR filters described above, the AD converter and to an extent the DA converter (presuming it does sufficiently good reconstruction of the signal through the samples), and all kinds of "simple" digital operations like multiplying a signal to change its amplitude, are all linear processing steps in the important general sense of linearity. It's easy to come up with all kinds of clever or ill-fantasized digital processing steps which seriously impact linearity, while of course there are a lot of kinds of processing which aren't linear in the electrical meaning by nature, such as certain sound effects, complicated reverbs (see also "impulse theory"), and compression and gating (à la the analog electronic effects).
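
A simple way to picture that general sense of linearity is the superposition test: a processing step is linear when processing a*x + b*y gives the same result as processing x and y separately and mixing afterwards. A rough sketch (Python/numpy assumed, reusing the averaging filter idea from above, with a hard clipper as the non-linear counter-example):

    # Superposition test: linear steps pass, a clipper does not.
    import numpy as np

    def fir(x, taps=np.ones(8) / 8.0):
        return np.convolve(x, taps)[:len(x)]

    x, y = np.random.randn(1000), np.random.randn(1000)
    a, b = 0.7, -1.3
    print(np.allclose(fir(a * x + b * y), a * fir(x) + b * fir(y)))     # True: linear
    clip = lambda s: np.clip(s, -0.5, 0.5)
    print(np.allclose(clip(a * x + b * y), a * clip(x) + b * clip(y)))  # False: non-linear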

The theoretical sampling theory has an important property called "shift invariance", meaning that for perfect reconstruction to work on a correctly sampled signal (hard to define in theory, but still) THERE NEED BE NO RELATION BETWEEN THE SIGNAL AND THE SAMPLING MOMENTS. In practice this means that if you want to record your oboe with a microphone and your computer's sound card, no matter how the internal sampling frequency of the sound card creates the sample moments, the recorded oboe signal can be recreated regardless. When (if) reconstruction is (too) inaccurate, differences become audible between different "takes" of a recording of the same instrument, when the sample points on the time axis happen to line up with the oboe signal in a different way.

Theoretically however, where the samples are taken of the analog oboe signal, compared with for instance the beginning of a tone, makes no difference at all for most linear filters (with some ifs and buts related to accuracy), for well done (long...) resample filters, and for well-reconstructing DA converters. In practice a not so well-reconstructing DA converter will lead to clearly distinguishable differences between otherwise "identical" recordings. Unless you're into mid-range PWM-type averaging filters (and know what you're doing and what the difference is between that and dithering), you don't want to care about those different takes; just process your sampled recordings within reason of what you think is OK, and burn the CDs and other media with those processed signals, without all kinds of sample-level signal manglings which appear to "improve" it. In the long run the people having access to your materials, getting better DA converters and such, will thank you.
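
A rough illustrative sketch of that shift invariance (Python with scipy assumed, the tone and the offset purely example values): the same 1 kHz tone sampled with the grid shifted by a fraction of a sample period gives different-looking sample values, yet after a decent resample filter both takes describe the same waveform, merely displaced in time.

    # Two "takes" of the same tone, sampled with slightly different grid alignment,
    # reconstruct to the same waveform (up to the time shift) through a good resample filter.
    import numpy as np
    from scipy.signal import resample_poly

    fs, up = 48000, 8
    n = np.arange(4096)
    shift = 2 / (up * fs)                                # exactly 2 samples at the upsampled rate
    take_a = np.sin(2 * np.pi * 1000 * n / fs)
    take_b = np.sin(2 * np.pi * 1000 * (n / fs + shift))
    fine_a = resample_poly(take_a, up, 1)
    fine_b = resample_poly(take_b, up, 1)
    # away from the edges, take_b is just take_a two fine-grid samples later
    print(np.allclose(fine_b[1000:-1000], fine_a[1002:-998], atol=1e-3))   # True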

...


FFTs and their use

The "big brother" of the probably well known term Fast Fourier Transformation (FFT), also in audio, is the well known electronicist, physicist etc. Fourier Transform, which includes moderately complicated mathematical integral calculations and is most easily summed up as a transformation of a signal into the frequency domain. It's like if we whistle into a microphone attached to a well working spectrum analyzer, we'd see a graph with frequencies (for instance 20Hz - 20,000 Hz) on the horizontal axis and some measure of volume (like dBs) on the vertical axis with a line running from the horizontal axis upward to a point of which the hight depends on the volume of the whistling, and the line's distance from the Y-axis would depend on how high or low we whistle. If two people would whistle we'd see individual lines for both of them. This idea is similar to the working of a Fourier Transformation.

The Fast Fourier Transform, which can be computed reasonably fast on modern computers and DSPs, does something similar, except that the number of lines on the spectrogram it can discern is limited, and the averaging time before it decides on what frequencies are in a signal has a fixed length, which can let it come close to, or end up quite far away from, the "correct" frequency analysis an actual Fourier Transform would give.
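
To put a number on that limited resolution, a small sketch (Python/numpy assumed, the whistle frequency just an example value): a 4096-point FFT at 48 kHz only has bins about 11.7 Hz apart, and a tone falling between two bins smears over several of them.

    # Frequency resolution of a finite FFT: bin spacing = fs / N.
    import numpy as np

    fs, N = 48000, 4096
    t = np.arange(N) / fs
    whistle = np.sin(2 * np.pi * 1517.0 * t)          # a tone between two bin centres
    spectrum = 20 * np.log10(np.abs(np.fft.rfft(whistle * np.hanning(N))) + 1e-12)
    freqs = np.fft.rfftfreq(N, d=1.0 / fs)
    print("bin spacing: %.1f Hz, peak near %.1f Hz" % (fs / N, freqs[np.argmax(spectrum)]))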

Sending an audio signal "through" an FFT-based effect has a number of issues and shortcomings if you take it that the "outcome" of the FFT is like a frequency picture ("cepstrum", "frequency/phase value pairs", etc.) which can then be Inverse Fast Fourier Transformed (IFFT) back to "normal" samples after altering frequency components. There are serious errors and limitations associated with even the simplest equalizations done in this way, and the only way to know for sure you'll get something close to the original sampled signal back is to change nothing about all the frequency+phase outcomes of the transform, and use them without averaging to perform an IFFT, which of course isn't useful, except for the theoretical purpose of proving the IFFT is indeed in practice the inverse of the FFT.
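
That unchanged round trip is easy to check numerically; a minimal sketch (Python/numpy assumed):

    # Forward and inverse transform with nothing changed in between: exact to machine precision.
    import numpy as np

    x = np.random.randn(4096)
    x_back = np.fft.irfft(np.fft.rfft(x), n=len(x))
    print(np.allclose(x, x_back))                 # True
    # Zeroing even a single bin before the inverse transform already spreads an error
    # over the whole block, rather than cleanly equalizing one frequency.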

It is possible to use averaging and many repeated FFTs, possibly with windows applied in the frequency and time domain, to make equalization types of effects, or, more likely, to add a feel of averaging to an audio recording. That's a fairly profound subject, and should probably be considered advanced studio technology, with all kinds of implicit (possibly explicitly known, but I have no sources for that) norms and loudness-related mid-range sound terror prevention rules.
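
For the curious, a hedged sketch of what such a repeated, windowed, overlapping analysis/resynthesis could look like (assuming Python with scipy's stft/istft as stand-ins, the shelving gain purely invented), with all the caveats above about what happens between the bins still applying:

    # Overlapping, windowed FFT analysis and resynthesis with a crude per-bin gain.
    import numpy as np
    from scipy.signal import stft, istft

    fs = 48000
    x = np.random.randn(fs)                          # one second of stand-in audio
    f, t, Z = stft(x, fs=fs, window='hann', nperseg=1024, noverlap=768)
    Z[f > 8000.0, :] *= 0.25                         # crude -12 dB shelf above 8 kHz
    _, y = istft(Z, fs=fs, window='hann', nperseg=1024, noverlap=768)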

...

Impulse theory and practice

Just like the IIR and FIR digital filters mentioned above have in electronic theory a "characteristic impulse" which uniquely defines their behavior in the case of a linear signal processing path, it is possible to record the characteristics of other "filters" in the form of an impulse, which can be fun for use with convolution software, applying the recorded impulse as a filter on some signal put through it. Suppose we were able to determine the impulse behaviour of a guitar loudspeaker cabinet: we could virtually send a signal through it by using the impulse and convolution software, and get that sound without ever putting a signal through the actual cabinet.
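
A minimal convolution sketch of that idea (Python with scipy assumed; the "cabinet" impulse response here is a synthetic stand-in, not a real measurement):

    # Applying a recorded impulse response to a dry signal is one (FFT-based) convolution.
    import numpy as np
    from scipy.signal import fftconvolve

    fs = 48000
    cab_ir = np.exp(-np.arange(2048) / 300.0) * np.random.randn(2048)   # stand-in impulse response
    dry = np.random.randn(fs)                                            # stand-in dry guitar signal
    wet = fftconvolve(dry, cab_ir)[:len(dry)]                            # "through the cabinet", linearly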

Mind you, impulse theory ONLY and EXCLUSIVELY holds for linear systems, so if the speaker also distorts (which it without any doubt does), which is a form of non-linear behavior, the impulse you've recorded isn't in line with the theory anymore, does "stuff" without good theoretical foundation, and certainly isn't a good way to model the loudspeaker anymore.

Analog equalizers are, to a greater or lesser extent, linear enough to take notice of and could be impulse-sampled to imitate them. Some simple forms of reverberation as well, which is fun, but very soon those impulse reverbs (unless used for a well-defined small purpose) are going to sound dull and all the same, because another condition for linear impulse theory to work is a constant system with no changing variables, which the air in a reverberating space is not.






 home page       email: theover@tiscali.nl