Theo Verelst Audio Digital Signal
Processing Page
last update: Aug 26 2011
(Previous Page at this URL: Theo Verelst DSP Page)
(Clearly this page isn't finished...)
Introduction
The intention here is to begin with an overview of the main Electrical
Engineering types of theories and considerations concerning the broad
subject of Digital Signal Processing, with an emphasis on serious
audio applications, and some pointers and examples about the studio
technologies from when there was still bright daylight in the hot one
hundred and so on. I don't suggest this is a course which will make you
a successful digital Lexicon reverberation unit user in that ballgame,
but I'm sure that direction is valid (and to some extent present here),
and a lot of "modern" software isn't bound to go anywhere but the
dust-pile, because it should.
The examples I've worked on are either hardware (synths, effects,
analog filters, DSP boards, FPGA prototypes) or run on Linux, with just
a few exceptions (Cubase on Windows XP Pro), because years ago I
delved into some of the existing software for PCs and was disappointed
at the quality levels. Honestly, for too big a part they still all
sound alike, I don't like that sound, and I suspect the main
mathematical approaches will never gravitate to something all too OK.
This is not necessarily a flaw in the design of the Windows OS; it's
just a partial observation.
Theoretically speaking, it is first necessary to temper some of the
enthusiasm about processing audio in the digital domain, especially at
low sample rates (like CD quality), so we'll look at some of the EE
foundations of what it is like to put a signal in digital form, and
what it takes to get the original signal back with very little
distortion. It is also a good idea to know something about generalities
from electronics and general audio land, like the power of loudspeaker
systems, High Fidelity and signal integrity, standard analog processing
boxes like equalizers and limiters, and, as a more advanced subject,
the loudness curve and how the natural hearing properties are important
in well-made music and sound recordings.
Of course many readers will want to know about (Fast) Fourier
Transforms, sound sampling and the playing of samples, impulse theory,
and some practical examples of how to improve digital audio processing
work-flows according to what I'm arguing for here. I'll see how far
I'll get, but I can surely guarantee that knowing some essential
theoretical rules, and how they translate to digital problems and some
possible solutions, is worthwhile.
The Sampling theorem in practice
Let's begin with the main problem always showing up in digital audio
chains: converting the recorded digital signal back to an analog signal
with a Digital to Analog (DA) converter.
Of course many have heard of the Nyquist
theorem from Shannon's sampling theory, which states that an analog
signal which is time sampled (in an equidistant manner, meaning the
samples are taken at exactly the same distances) and put into sampled
values of a discrete number (quantisation
of the value of the samples) can contain signals only of frequencies up
to half the sampling rate. Higher frequencies will get "aliased",
meaning the higher frequencies are irrecoverably lost (for any general
type of sampling setup) and cause very ugly noise which interferes with
the other frequencies (in known ways).
In short, the signal entering the Analog to Digital converter must be
filtered or otherwise guaranteed to have no frequencies in it higher
than half the sampling frequency, or the recorded digital signal cannot
be played back with likeness to the original, and that is a hard rule,
with no real exceptions possible for the general case of all possible
signals. Usually AD converters, as electronics boxes/parts, will
integrate an analog and digital filter set for the purpose of limiting
higher frequency content, but for a practical audio recording it can
still be important to use microphones and analog precautions to make
sure no aliasing takes place during recording.
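To make the aliasing rule concrete, here's a minimal Python/numpy
sketch (the frequencies are just example values I picked): a 30 kHz
sine sampled at 44.1 kHz yields exactly the same sample values as a
14.1 kHz sine of opposite sign, so the high tone is irrecoverably
folded down into the audio band.

    import numpy as np

    fs = 44100.0            # sample rate in Hz
    f_in = 30000.0          # input frequency, above fs/2 = 22050 Hz
    n = np.arange(64)       # sample indices

    x = np.sin(2 * np.pi * f_in * n / fs)             # "illegally" sampled tone
    alias = np.sin(2 * np.pi * (fs - f_in) * n / fs)  # the 14100 Hz alias

    # The sampled values are indistinguishable from the (sign-flipped) alias:
    print(np.max(np.abs(x + alias)))                  # ~1e-12, i.e. x == -alias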
A second important part of sampling theory is the perfect reconstruction theorem
meaning the theory states it is possible to take a perfectly sampled
signal (created in whichever way, and of course considering the
accuracy of the quantisation, i.e. "the number of noise free bits per
sample" and recreate, with mathematical perfection as the limit case,
the orginal analog signal from it with a DA converter. This in essence
requires mathematically speaking a "sinc" function laid through every
sample, and all these function must then be added up to get the
original signal back, but the sinc function and the additions requires infinite time. So it is possible to
define or "measure" correctly sampled samples form an analog
signal, and in mathematical form recreate the real original signnal
from those, with infinite accuracy, but this will take many
computations, and in principle takes infinite time.
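For illustration, a sketch of that mathematical reconstruction in
Python/numpy, with an arbitrary test tone and a finite window standing
in for the infinite sum, so the result is only nearly exact:

    import numpy as np

    fs = 8000.0                        # example sample rate
    n = np.arange(-200, 200)           # finite window of the "infinite" samples
    x = np.sin(2 * np.pi * 1000.0 * n / fs)   # 1 kHz tone, well below fs/2

    t = 0.000123                       # a moment between the sample instants
    # Ideal reconstruction: x(t) = sum over n of x[n] * sinc(fs*t - n),
    # where np.sinc(u) computes sin(pi*u) / (pi*u)
    x_t = np.sum(x * np.sinc(fs * t - n))

    print(x_t, np.sin(2 * np.pi * 1000.0 * t))   # nearly equal; the small
                                                 # error is the truncated tail

Truncating the sinc sum is exactly what a real reconstruction filter
does, which is where the distortion discussed below comes from.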
Of course, every practical audio DA converter has a less perfect and
less than infinite time delay "reconstruction filter" built in, of
varying accuracy, but this is never good enough to recreate perfect
audio from sampled signals, and honestly in practice hardly any DA
converter in any machine does a good enough job for decent HiFi in the
sensitive mid-frequency range. Some measures can be taken to make the
sound pleasant through general converters, and in premeditated cases
this can work sufficiently well to make HiFi listeners somewhat
satisfied, but the only right way out in general is to make better,
longer-filtering DA converters, which probably isn't realistic for live
sound, because the reconstruction filter lengths may well surpass
significant parts of a second.
For non-live applications it can pay to take a low-sample-rate
signal (like an audio CD) and convert the sample rate up to for
instance 96 or even 192 kHz, so that much more data space per song is
required, but with a long (and computationally expensive) up-sample
filter a 192 kHz converter may in practice make a big difference to
the ever-present mid-range distortion, because the sample frequency is
more than two octaves higher. Mind that the filters to do this
conversion properly are long and hard to compute, and therefore not the
simple up-sample filters you can easily find and which do not suggest
significant delay; those cannot theoretically work great.
That too is hard theoretical fact, with no loopholes for general signals.
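A sketch of the principle in Python/numpy (the 4x ratio, tap count and
Hamming window are assumptions for illustration, not a recipe for a
production-grade up-sampler): zero-stuff the signal, then low-pass at
the old Nyquist frequency to remove the images.

    import numpy as np

    L = 4                                  # up-sample factor, e.g. 48 kHz -> 192 kHz
    taps = 1023                            # a long filter reconstructs better
    m = np.arange(taps) - (taps - 1) / 2.0
    h = np.sinc(m / L) * np.hamming(taps)  # windowed sinc, cutoff at the old fs/2
    h *= L / np.sum(h)                     # gain L compensates the zero-stuffing

    x = np.sin(2 * np.pi * 1000.0 * np.arange(1000) / 48000.0)  # 1 kHz at 48 kHz
    up = np.zeros(len(x) * L)
    up[::L] = x                            # insert L-1 zeros between samples
    y = np.convolve(up, h)                 # the same 1 kHz tone, now at 192 kHz

Note the delay of (taps - 1) / 2 samples at the high rate: that is the
price of the longer, better filter argued for above.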
Digital filters
Depending on the definition used this can of course be a huge subject,
unfortunately filled with people who range from laymen and
quacksalvers to the deluded well-meaning and the pyramid-game-interested.
Essentially, of course, a digital filter is a "black box" where digital
signals come in and, probably at the same sample rate, signals come out
with less than infinite delay.
In formal theory from the past (I think the book that was used among
others at my university was Papoulis' "Circuits and Systems"), a
possible way to define digital filters would often be to compare them
with analog filters one-on-one, which is essentially possible by
treating the difference between subsequent samples as a measure for the
derivative of a signal, like a capacitor in electronic filters behaves
as a differentiator between voltage and current. The corresponding
analog and digital filters can be exactly defined with unambiguous
filter type connections (apart from accuracy), but the digital versions
will not necessarily behave well, and there is a (possibly small, but
not necessarily) approximation error related to a power function.
Signals of low frequency compared to the sample rate will work fine with
these analog-similarity filters, and sound fairly natural in practice.
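For instance, an analog RC low-pass obeys RC*dy/dt + y = x;
substituting (y[n] - y[n-1])/T for the derivative (T being the sample
period) gives a digital one-pole filter. A sketch in Python/numpy,
with an arbitrarily chosen cutoff frequency:

    import numpy as np

    fs = 44100.0
    T = 1.0 / fs
    fc = 1000.0                       # example analog cutoff frequency
    a = 1.0 / (2 * np.pi * fc * T)    # a = RC / T

    def rc_lowpass(x):
        """Digital imitation of the analog RC low-pass."""
        y = np.zeros_like(x)
        prev = 0.0
        for i, xi in enumerate(x):
            # from RC * (y[n] - y[n-1]) / T + y[n] = x[n]:
            prev = (xi + a * prev) / (1.0 + a)
            y[i] = prev
        return y

The approximation error mentioned above shows up as the signal
frequency approaches the sample rate: the digital cutoff and phase
drift away from the analog original.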
All those filters end up having Infinite Impulse Response properties,
which essentially means that if a signal spike is sent through an
analog filter, the output of that filter will take "forever" to
stabilize, which is (apart from quantisation accuracy) the same for
digital filters of similar transfer function (an
electronics/physics/mechanics concept which theoretically states filter
behaviour as a formula of known type). So the digital filter of IIR
kind will continue to give out a transient and possibly resonances
until the output and internal signals become either smaller than the
accuracy or too noisy to be relevant.
The other main type of filter, which mostly only exists in the digital
domain (though there are various theoretical connections possible with
analog circuits and physics), is called a Finite Impulse Response
(FIR) filter, for instance a subsequent-sample averaging filter,
often referred to as having "taps". In practice the oversampling
filters in advanced AD converters may well use FIR filter stages of
specific types to help anti-aliasing of the sampled signal.
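The simplest possible FIR example is an 8-tap moving average (a sketch
in Python/numpy; real converter filters use many more, carefully
designed taps):

    import numpy as np

    taps = np.ones(8) / 8.0          # 8 equal "taps"

    def fir(x, h):
        return np.convolve(x, h)     # direct-form FIR: weighted sums of samples

    impulse = np.zeros(32)
    impulse[0] = 1.0
    print(fir(impulse, taps))        # 8 nonzero output samples, then all zeros:
                                     # the impulse response is finite by construction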
Linearity and its importance, shift invariance
...
The main observation about the theoretical and practical sampling
theory to think about is that the well enough behaved IIR and FIR
filters described above, the AD converter and to an extent the DA
converter (presuming it does sufficiently good reconstruction of the
signal through the samples), and all kinds of "simple" digital
operations like multiplying a signal to change its amplitude, are all
linear processing steps in the important general sense of linearity.
It's easy to come up with all kinds of clever or ill-fantasized digital
processing steps which seriously impact linearity, while of course
there are a lot of processing kinds which aren't linear in the
electrical meaning by nature, such as certain sound effects,
complicated reverbs (see also "impulse theory"), and compression and
gating (à la the analog electronic effects).
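Linearity in that general sense is easy to test numerically: a process
f is linear when f(a*x1 + b*x2) equals a*f(x1) + b*f(x2) for any
signals and constants. A Python/numpy sketch, with a hard clipper
standing in for a crude distortion effect:

    import numpy as np

    def averager(x):                   # the FIR moving average from above: linear
        return np.convolve(x, np.ones(8) / 8.0)

    def clipper(x):                    # hard clipping: non-linear by nature
        return np.clip(x, -0.5, 0.5)

    rng = np.random.default_rng(0)
    x1, x2 = rng.standard_normal(100), rng.standard_normal(100)
    a, b = 0.7, -1.3

    for f in (averager, clipper):
        lhs = f(a * x1 + b * x2)
        rhs = a * f(x1) + b * f(x2)
        print(f.__name__, np.allclose(lhs, rhs))   # averager: True, clipper: False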
The theoretical sampling theory has an important property which is
called "shift invariance", meaning that for perfect reconstruction to
work on a correctly sampled signal (hard to define in theory, but
still) THERE NEEDS TO BE NO RELATION BETWEEN THE SIGNAL AND THE
SAMPLING CLOCK. In practice this means that if you want to record your
oboe with a microphone and your computer's sound card, no matter how
the internal sampling frequency of the sound card creates the sample
moments, the recorded oboe signal can be recreated regardless. When
(if) reconstruction is (too) inaccurate, differences become audible
between different "takes" of a recording of the same instrument when
the sample points on the time axis line up with the oboe signal in a
different way.
Theoretically however, where
the samples are taken of the analog oboe signal, in comparison with for
instance the beginning of a tone, makes no difference at all for most
linear filters (with some ifs and buts related to accuracy), for
well done (long...) resample filters, and for well-reconstructing DA
converters. In practice a not
so well-reconstructing DA converter will lead to clearly
distinguishable differences between otherwise "identical" recordings.
Unless you're into mid-range PWM-type averaging filters (and know
what you're doing and what the difference is between that and
dithering), you don't want to care about those different takes; just
process your sampled recordings within reason of what you think is
OK, and burn the CDs and other media with those processed signals,
without all kinds of sample-level signal manglings which appear to
"improve" it. In the long run the people having access to your
materials, getting better DA converters and such, will thank you.
...
FFTs and their use
The "big brother" of the probably well known term Fast Fourier
Transformation (FFT), also in audio, is the well known electronicist,
physicist etc. Fourier Transform, which includes moderately complicated
mathematical integral calculations and is most easily summed up as a
transformation of a signal into the frequency domain. It's like if we
whistle into a microphone attached to a well working spectrum analyzer,
we'd see a graph with frequencies (for instance 20Hz - 20,000 Hz) on
the horizontal axis and some measure of volume (like dBs) on the
vertical axis with a line running from the horizontal axis upward to a
point of which the hight depends on the volume of the whistling, and
the line's distance from the Y-axis would depend on how high or low we
whistle. If two people would whistle we'd see individual lines for both
of them. This idea is similar to the working of a Fourier
Transformation.
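The whistling picture translates directly to a few lines of
Python/numpy (the frequency and block length are arbitrary example
values):

    import numpy as np

    fs = 44100.0
    N = 4096
    t = np.arange(N) / fs
    whistle = np.sin(2 * np.pi * 2500.0 * t)       # a steady 2.5 kHz "whistle"

    # Magnitude spectrum; the Hann window keeps the single line reasonably clean
    spectrum = np.abs(np.fft.rfft(whistle * np.hanning(N)))
    peak_bin = np.argmax(spectrum)
    print(peak_bin * fs / N)                       # ~2500 Hz: one line, as described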
The Fast Fourier Transformation, which can be computed reasonably fast
on modern computers and DSPs, does something similar, except that the
number of lines on the spectrogram it can discern is limited, and the
averaging time before it decides on what frequencies are in a signal
is a stringent constant, letting it come close to, or end up quite far
away from, the "correct" frequency analysis an actual Fourier Transform
would give.
Sending an audio signal "through" an FFT-based effect has a number of
issues and shortcomings if you want to take it that the "outcome" of
the FFT is like a frequency picture ("spectrum", "frequency/phase value
pairs", etc.) which can then be Inverse Fast Fourier Transformed (IFFT)
back to "normal" samples after altering frequency components. There are
serious errors and limitations associated with even the simplest
equalizations done in this way, and the only way to know for sure you
get something close to the original sampled signal back is to change
nothing about all the frequency+phase outcomes of the transform, and
use them without averaging to perform an IFFT, which of course isn't
useful, except for the theoretical purpose of proving the IFFT is
indeed in practice the inverse of the FFT.
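That theoretical round trip is easy to confirm (a Python/numpy sketch):

    import numpy as np

    x = np.random.default_rng(1).standard_normal(1024)   # any block of samples
    x_back = np.fft.ifft(np.fft.fft(x)).real             # change nothing in between
    print(np.max(np.abs(x - x_back)))                    # ~1e-15: machine precision

The moment you start altering, averaging or windowing the
frequency/phase pairs before the IFFT, this identity no longer protects
you, which is the point made above.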
It is possible to use averaging and many repeated FFTs, possibly with
windows applied in the frequency and time domain, to make equalization
types of effects, or, more likely, to add a feel of averaging to an
audio recording. That's a fairly profound subject, and should probably
be considered advanced studio technology, with all kinds of implicit
(possibly explicitly known, but I have no sources for that) norms and
loudness-related mid-range sound terror prevention rules.
...
Impulse theory and practice
Just like the IIR and FIR digital filters mentioned above have, in
electronic theory, a "characteristic impulse" which uniquely defines
their behavior in the case of a linear signal processing path, it is
possible to record the characteristics of other "filters" in the
form of an impulse, which can be fun to use with convolution
software, applying the recorded impulse as a filter on some signal put
through it. Suppose we were able to determine the impulse behaviour of
a guitar loudspeaker cabinet: we could virtually send a signal through
it by using the impulse and convolution software, and get that sound
without ever putting a signal through the actual cabinet.
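In code such convolution is nearly a one-liner; a Python/numpy sketch
with synthetic stand-in data (a real use would load a recorded impulse
response and a guitar take instead):

    import numpy as np

    rng = np.random.default_rng(2)
    guitar = rng.standard_normal(44100)       # stand-in for one second of guitar
    ir = np.exp(-np.arange(2048) / 300.0) * rng.standard_normal(2048)  # toy "cabinet" IR

    wet = np.convolve(guitar, ir)             # "through the cabinet", virtually
    wet /= np.max(np.abs(wet))                # normalize to avoid clipping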
Mind you, the impulse theory ONLY
and EXCLUSIVELY holds for linear systems, so if the speaker
also distorts (which it without any doubt does), which is a form of
non-linear behavior, the impulse you've recorded isn't in line with the
theory anymore, does "stuff" without good theoretical foundation,
and certainly isn't a good way to model the loudspeaker anymore.
Analog equalizers are to a greater or lesser extent linear enough to
take notice of, and could be impulse-sampled to imitate them. Some
simple forms of reverberation as well, which is fun, but very soon those
impulse reverbs (unless used for a well defined small purpose) are
going to sound dull and all the same, because another condition for
linear impulse theory to work is a constant system with no changing
variables, which the air in a reverberating space is not.
home page
email: theover@tiscali.nl