
Inside the HD2

Introduction

Besides improved audio quality over analog solutions, digital loudspeaker management systems (DLM) offer more flexible audio processing possibilities. Most systems currently on the market use IIR filters to realize the crossover and equalization (EQ) functions. The HD2 takes advantage of FIR filters, which offer controllable group delay, while also offering traditional IIR solutions. Furthermore, separate peak and RMS limiters are provided, and the input dynamic range bottleneck is circumvented by using dual-range AD conversion.

Architecture

The motherboard of the HD2 was designed to be configurable regarding its input and output sections. Due to its flexibility and small size of 20 × 13 cm (or 11 × 13 cm for a 2-in-4 version), it can be mounted inside the cabinet of an active loudspeaker or in a 19” housing to form a complete loudspeaker management system.
The input section consists of up to three dual-range or six single-range AD converters, one AES/EBU digital input, and one I²S input as an interface to, e.g., ADAT, EtherSound, or CobraNet. A sample rate converter can be inserted for input rates other than 96 kHz or to circumvent jitter problems. The output section consists of eight analog outputs and one AES/EBU output. An Ethernet port and an RS232 port allow remote control.

The signal processing is performed by two Motorola 56321 DSPs, each consuming less than 0.4 W, which are fed from a flexible routing matrix. For the input section, a parametric EQ bank and a main delay block are provided. Four output channels are processed in each DSP. Each output signal chain comprises a delay block for inter-driver arrival time adjustments and optional resampling filters with a sample rate reduction factor of up to 32. These can be activated in conjunction with a crossover/EQ FIR filter, which is handled by the DSP56321 filter coprocessor (EFCOP). Each signal path also contains an IIR filter bank to allow for IIR crossover and EQ filters and, of course, an individual gain.
At the end of the chain, two separate limiters are provided. A fast-reacting peak limiter prevents amplifier clipping and mechanical damage to the drivers. An RMS limiter with a slow reaction time models the loudspeaker’s voice coil temperature to protect it from overheating.
Requantization stages from 48 to 24 bit with dither and selectable noise shaper are active behind the input PEQ filter bank and at each channel output.
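The requantization step can be illustrated with a minimal Python sketch. It assumes triangular (TPDF) dither of ±1 LSB and plain rounding; the dither type actually used in the HD2 is not stated here, the selectable noise shaper is omitted, and the function name is hypothetical.

```python
import random

def requantize_48_to_24(sample_48, rng=random):
    """Reduce a signed 48-bit integer sample to 24 bits with TPDF dither.

    Assumption: dither is the difference of two uniform random values,
    which gives a triangular distribution spanning about +/- 1 LSB.
    """
    shift = 24                       # number of low-order bits discarded
    lsb = 1 << shift                 # one 24-bit LSB in 48-bit units
    dither = rng.randrange(lsb) - rng.randrange(lsb)
    # Adding half an LSB turns the flooring shift into round-to-nearest.
    rounded = (sample_48 + dither + lsb // 2) >> shift
    # Saturate to the signed 24-bit range.
    return max(-(1 << 23), min((1 << 23) - 1, rounded))
```

Averaged over many samples, the dithered output reproduces input values that lie between two 24-bit steps, which is the reason for adding dither before truncation.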

IIR Filters

IIR filters are the most commonly used type for equalization and crossover filters. They emulate analog filters and define the frequency and phase response by setting poles and zeroes. Their phase response is related to the frequency response magnitude.
Commonly, filters of second order with two delays and five coefficients are used. They are able to approximate almost all common filters such as high-pass and low-pass with classical characteristics (Butterworth, Bessel, Linkwitz-Riley, etc.). Peak, shelving, and notch filters can also be formed. Filters of higher order are usually realized by cascading second order filters.
The overall transfer-function for one output is composed by serially concatenating a variable number of biquads, realizing the desired crossover bandpass function and performing the passband equalization.

In the HD2, IIR filters can be used for the crossover function as well as for equalization. Instead of deriving the filter coefficients from pre-calculated tables, the DSP is fed with the real filter parameters (type, frequency in Hz, gain in dB, and quality factor) and performs a complete bilinear transform to obtain the digital filter coefficients. Since the DSP56000 family operates with fixed-point numbers, a floating-point library was implemented in order to be able to do the necessary calculations for the transform. The achieved precision of the filter transfer function, especially for tricky filters such as very narrow low-frequency notch filters, is much better than could be achieved by interpolating through pre-calculated table values.
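As an illustration of computing coefficients directly from filter parameters, the following Python sketch applies a prewarped bilinear transform to an analog biquad prototype. The peaking prototype used here is a common textbook form and is an assumption, as are all function names; the HD2 performs the equivalent calculation on the DSP with its own floating-point library.

```python
import cmath
import math

def bilinear_biquad(num_s, den_s, f0_hz, fs_hz):
    """Map an analog biquad (normalised to omega0 = 1 rad/s) to the z-domain.

    num_s and den_s hold the (s^2, s^1, s^0) coefficients. The substitution
    s = k * (1 - z^-1) / (1 + z^-1) with k = 1 / tan(pi * f0 / fs) places
    the analog frequency 1 rad/s exactly at f0 (prewarping).
    """
    k = 1.0 / math.tan(math.pi * f0_hz / fs_hz)

    def transform(c2, c1, c0):
        return (c2 * k * k + c1 * k + c0,
                2.0 * (c0 - c2 * k * k),
                c2 * k * k - c1 * k + c0)

    b = transform(*num_s)
    a = transform(*den_s)
    return [x / a[0] for x in b], [x / a[0] for x in a]

def peaking_biquad(f0_hz, gain_db, q, fs_hz=96000.0):
    """Bell filter from frequency in Hz, gain in dB, and quality factor."""
    amp = 10.0 ** (gain_db / 40.0)
    return bilinear_biquad((1.0, amp / q, 1.0),
                           (1.0, 1.0 / (amp * q), 1.0), f0_hz, fs_hz)

def gain_db_at(b, a, f_hz, fs_hz=96000.0):
    """Magnitude response of the digital biquad in dB at f_hz."""
    z1 = cmath.exp(-2j * math.pi * f_hz / fs_hz)
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return 20.0 * math.log10(abs(h))
```

For example, `peaking_biquad(1000.0, 6.0, 2.0)` returns the five coefficients (with `a[0]` normalised to 1) of a +6 dB bell at 1 kHz, and `gain_db_at` can be used to check the response at any frequency.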

Since the bilinear transform maps the frequency range of an analog filter (DC – ∞) to that of the digital filter (DC – fS/2), its frequency response features become increasingly compressed on the frequency scale when approaching the Nyquist frequency fS/2. Particularly, the common Bell (Peaking)-filters appear with a smaller bandwidth than their analog templates. To remedy this, the quality-factor Q entered in the coefficient calculation can be decreased according to a certain formula.
Since filters of orders > 2 can be realized by cascaded biquads, the parameters frequency, gain, and quality are sufficient to also allow the calculation of these filters. Low-pass and high-pass as well as low-shelf and high-shelf can be modified in their characteristic by altering the quality factor Q.

Audio Performance

Due to the feedback loops, the performance in terms of noise and distortion of an IIR filter is more critical than that of a FIR filter, especially on fixed-point platforms. Limit cycles inside the filter and requantization generate noise. A longer word length is therefore required to perform the calculations. Figure 3 shows the noise of an IIR filter on a fixed point DSP with word lengths of 24 bits, 24 bits with error feedback and 48 bits (double precision arithmetic). It can be clearly seen that only double precision arithmetic and data paths provide satisfying results in terms of the self-generated noise of the filter. In contrast, 24 bits for the coefficients provide enough precision to realize very narrow notch filters, even at low frequencies where they are most sensitive to coefficient quantization.
While 48-bit double-precision biquad calculation on a 24-bit fixed point DSP is cumbersome, involving shift-and-add operations of partial results and taking about three times more processor cycles to execute than a single-precision biquad, it compares favourably to 32-bit floating-point processing in terms of noise and distortion. For very faint signals, the noise floor of 32-bit float processing is typically a bit lower than that of 48-bit integer processing. However, the noise floor of floating-point processing scales with the signal amplitude, while the noise floor of integer processing remains on a very low fixed level for all signal amplitudes. This means that 48-bit integer processing produces considerably less noise and distortion than 32-bit floating-point processing for medium and high signal amplitudes.
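The scaling argument can be made concrete with a small Python sketch that compares quantization step sizes only. It assumes a 48-bit fractional word with full scale 1.0 and IEEE 754 single precision with a 24-bit significand; it does not model the noise of a complete filter, and the names are illustrative.

```python
import math

# LSB of a signed 48-bit fractional word (full scale = 1.0).
FIXED48_STEP = 2.0 ** -47

def float32_step(x):
    """Spacing of adjacent normalised float32 values near a nonzero |x|."""
    # frexp returns abs(x) = mantissa * 2**exponent with 0.5 <= mantissa < 1.
    _, exponent = math.frexp(abs(x))
    return 2.0 ** (exponent - 24)

def step_db(step):
    """Step size expressed in dB relative to full scale."""
    return 20.0 * math.log10(step)
```

Under these assumptions the float32 step at a signal level of 0.5 is 2^-24 (about -144 dB), far coarser than the fixed 2^-47 step (about -283 dB). The float step becomes finer than the fixed-point step only for signal values below 2^-24 of full scale.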

FIR Filters

FIR filters allow a radically different approach for the crossover and system equalization task. While the desired transfer function of an IIR based loudspeaker management system is usually realized with generic filter blocks concatenated from a building set, one single FIR filter can replace the whole IIR filter chain, performing the roles of the crossover and equalizer at once.
Since the impulse response of an FIR filter is identical to its coefficients, it is possible to realize filters with phase and group delay characteristics which can range from minimum-phase through linear-phase up to maximum-phase (which would be the time-inverted minimum-phase-filter). The necessary length and hence number of coefficients of a FIR filter to reach satisfactory spectral selectivity scales with wavelength. If not handled by overlapped FFT convolution techniques, an FIR filter dedicated for the mid, low or sub band has to be run at a reduced sample rate to spare DSP resources.

Minimum phase

The behaviour of a minimum phase FIR filter closely resembles that of a chain of minimum-phase IIR filters as described above, that is, phase distortion is introduced but with the benefit of the lowest physically possible overall group delay. A minimum phase FIR filter can be easily derived from the following two cases by means of the Hilbert transform.
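One common way to carry out this derivation is the cepstral (homomorphic) method, which is the discrete counterpart of the Hilbert transform relation between log-magnitude and phase. The Python sketch below is an assumption about a typical procedure, not the HD2 software; it uses a slow direct DFT to stay self-contained, and the function names are hypothetical.

```python
import cmath
import math

def dft(x, inverse=False):
    """Direct O(N^2) DFT, adequate for a short demonstration."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = [sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
               for m in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out

def minimum_phase(h, n_fft=64):
    """Return a minimum-phase impulse response with the magnitude of h.

    Assumptions: n_fft is even and larger than len(h), and the response
    has no exact zeros on the DFT grid (log of zero is undefined).
    """
    padded = list(h) + [0.0] * (n_fft - len(h))
    log_mag = [math.log(abs(v)) for v in dft(padded)]
    cep = [v.real for v in dft(log_mag, inverse=True)]   # real cepstrum
    half = n_fft // 2
    # Fold the even cepstrum onto positive time: this makes it causal,
    # which corresponds to a minimum-phase spectrum.
    folded = [cep[0]] + [2.0 * c for c in cep[1:half]] + [cep[half]] + [0.0] * (half - 1)
    spectrum = [cmath.exp(v) for v in dft(folded)]
    return [v.real for v in dft(spectrum, inverse=True)]
```

Applied to the two-tap maximum-phase response `[0.5, 1.0]`, the function returns approximately `[1.0, 0.5, 0, ...]`, the time-reversed response with the same magnitude and the energy moved to the front.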

Linear phase

Linear phase filters used to realize the crossover function in the mid and high frequency ranges are an interesting alternative to the conventional IIR designs as they avoid any phase distortion introduced by the filter itself, regardless of the steepness of the transition band. This potentially eliminates problems with dips near the crossover frequency caused by phase mismatch between the two involved drivers. Since the impulse response of the filters is symmetric, they add an overall group delay of half the filter length to the system.
A linear phase FIR filter can be easily derived, e.g., from the minimum phase FIR filter by setting the group delay of its transfer function to a constant value and performing an IFFT. Some loudspeaker management systems permit the use of linear phase FIR filters for the crossover function, while the passband equalization is done conventionally with a chain of IIR PEQs. This possibility also exists in the HD2. However, the more typical application is the “one-FIR-filter-does-it-all” approach.
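The symmetry property and the resulting delay can be illustrated with a windowed-sinc low-pass in Python. This design method is a generic textbook choice made for the example, not the procedure described above, and the function names and tap counts are assumptions.

```python
import math

def linear_phase_lowpass(num_taps, cutoff_hz, fs_hz):
    """Windowed-sinc low-pass with a symmetric, linear-phase impulse response."""
    centre = (num_taps - 1) / 2.0
    fc = cutoff_hz / fs_hz                   # cutoff in cycles per sample
    taps = []
    for n in range(num_taps):
        t = n - centre
        if t == 0:
            ideal = 2.0 * fc
        else:
            ideal = math.sin(2.0 * math.pi * fc * t) / (math.pi * t)
        # Hamming window, symmetric about the centre tap.
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [c / total for c in taps]         # unity gain at DC

def group_delay_ms(num_taps, fs_hz):
    """Constant group delay of a symmetric FIR filter: (N - 1) / 2 samples."""
    return 1000.0 * (num_taps - 1) / 2.0 / fs_hz
```

A 129-tap filter at 96 kHz, for instance, delays all frequencies by 64 samples, or about 0.67 ms; the delay grows in proportion to the filter length.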

Complex-equalizing

Instead of opting for linear phase crossover/EQ filtering, it is even more enticing to let the FIR filter not only equalize the loudspeaker’s amplitude response, but also linearize its phase response, yielding a constant, frequency-independent group delay. In other words, the loudspeaker’s dispersion is removed, resulting in a faithful reproduction of the audio signal’s temporal waveform. However, the resulting perfect transmission behavior is usually restricted to the proximity of the reference point where the transfer function measurement was made.

However, one must be careful to not equalize all dips and peaks to obtain a flat transfer function since they may stem from interference of the direct sound with reflected or refracted components, from reflections in horns, or cone-breakup in compression drivers.
For these reasons, the FIR generation software provides a couple of tools to optionally pre-process the measured TFs, filling dips and smoothing amplitude and group delay courses. These in the end will produce filter-transfer functions whose magnitude may be similar to those obtained with chains of conventional IIR-PEQs.

Complex-equalizing filters as well as linear phase filters may add a considerable amount of group delay to the overall system due to the principle of causality. To illustrate this, let’s consider a 3-way system as depicted in the figures to the left. The magnitude responses of the overall system after equalization are shown in the top graph. The equalization was carried out once with minimum phase filters and once with complex-equalizing filters. The phase responses (after subtracting the overall group delay) are shown in the bottom figure.

Complex-equalizing produces almost no phase distortions except for very low frequencies, which could be corrected by using a longer filter. Not addressing the loudspeaker’s phase response in the minimum phase approach results in considerably more phase shift. The resulting overall basic group delay is 15 ms for the minimum phase approach and 80 ms for the complex-equalization. For pure playback situations this is not a problem, but this latency is prohibitive for live usage. A remedy for this is the use of mixed minimum-phase and complex-equalizing. Since the woofer path normally determines the maximum delay, it is equalized using a minimum-phase filter, thus its phase is not linearized for the sake of lower group delay.
It is also possible to “morph” a FIR filter’s group delay between minimum and complex-equalizing phase.

Regarding audibility, group delay distortion caused by loudspeakers themselves seems to be inaudible even under critical listening conditions. An exception is bandpass-type subwoofers. Their high group delay, ranging up to some tens of ms, often causes a delayed, sluggish bass response that is not in line with the first transient of a bass drum or a slapped bass guitar. Complex-equalizing is able to dramatically change the reproduction of such a device, yielding a very tight, compact bass response strictly synchronized with the mid and high range sound. However, the flip side of the coin is an overall latency of 50 to 100 ms, depending on the particular construction and filter parameters.
So complex-equalizing suffers a dilemma: In the mid and high range, where it is feasible with only a moderate latency penalty, constant group delay and detailed equalization of even the finest spectral features do not necessarily improve the perceived sound quality. In contrast, a considerable improvement is possible for the subwoofers, but only at the expense of an overall latency not compatible with live sound requirements.

Multirate processing

If opting for FIR bandpassing and equalization by direct convolution in the time domain, the only viable way to process the mid and low bands with reasonable computational burden and memory usage is by previous sample rate reduction.
The goal is to achieve a sufficient frequency resolution at low frequencies as well. The number of filter coefficients determines the resolution. For 512 coefficients at 48 kHz, for example, the resolution would be 48000 Hz / 512 = 93.75 Hz; that is, at low frequencies the filter response can only be specified at one frequency bin every 94 Hz. Downsampling by a factor of 16 would refine the resolution to about 5.9 Hz.
The HD2 uses downsampling factors between 2 and 32, which yields a finest frequency resolution of about 3 Hz at low frequencies.
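The arithmetic above can be restated as a one-line Python helper. The function name is hypothetical, and the 512-tap, 48 kHz figures are simply the example values from the text.

```python
def frequency_resolution_hz(fs_hz, num_taps, decimation=1):
    """Bin spacing of an FIR filter running at fs_hz / decimation."""
    return fs_hz / decimation / num_taps

# Values for the example in the text (512 coefficients, 48 kHz input rate).
full_rate = frequency_resolution_hz(48000.0, 512)         # 93.75 Hz
decimated_16 = frequency_resolution_hz(48000.0, 512, 16)  # 5.859375 Hz
decimated_32 = frequency_resolution_hz(48000.0, 512, 32)  # 2.9296875 Hz
```

The same number of coefficients therefore covers a 16 or 32 times finer frequency grid when the band is processed at the reduced rate.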

Dual Range AD Conversion

A loudspeaker management system is typically located between the mixing desk and the power amplifiers. If the mixer is digital, the connection is of course best done digitally via AES3, avoiding additional latency and preserving the dynamic range. However, many consoles for live use are still purely analog. Typically equipped with opamps fed with supply voltages of up to ±18 V, their symmetrical outputs are capable of delivering around +28 dBu (27.5 V peak). In contrast, when the master faders are pulled down to mute the signal, only the output buffer noise remains, letting the noise level easily drop below −100 dBu. The difference between both values yields a dynamic range of approximately 130 dB, which is not attainable by today’s AD converters. For this reason, most loudspeaker management systems have a gain control potentiometer or switches in their input stages, tasking the sound engineer with the decision as to whether high headroom or lowest noise is more important.
Ideally, the loudspeaker management system’s inputs should not clip before the mixing desks in order to not diminish headroom needed for the limiters. At the other end of the input level range, the input circuit should not contribute significantly to the overall system noise when only a moderate volume is needed.
The dual-range conversion principle depicted in the figure to the left solves the problem, generously boosting the dynamic range by using a stereo AD-converter in a mono configuration. The input signal is treated by two preamplifiers, one with unity gain and the other one with a gain equal to the desired dynamic range expansion. The circuit must be carefully designed to avoid crosstalk to the other channel when the preamplifier with the higher gain clips.

The resulting step in distortion at the transition can be kept satisfactorily small, as the figure to the left documents.
At levels below clipping, the digitized higher-gain ADC signal, attenuated by the gain difference and corrected for the DC offset difference between both channels, is elected to be the source signal. When the clipping level is reached, the lower-gain ADC input becomes the source.
Some precautions must be taken to achieve seamless and inaudible switching, such as tracking the gain difference and the DC offset difference between both channels, since they are prone to slight drift due to circuit warm-up and other factors.
While these precautions already guarantee that the transitions are inaudible under any circumstance, some further improvements drastically reduce the possible number of transitions per second. For example, a hysteresis is introduced, making use of psychoacoustic properties of human hearing. Additionally, the thresholds for switching up and down are not equal. This prevents switching when the input signal has an almost constant level.
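The selection logic can be sketched in Python as follows. All numeric values (20 dB gain difference, thresholds, hold time) and the class name are assumptions chosen for illustration, samples are normalised to ±1.0 full scale, and the tracking of gain and offset drift is reduced to a fixed offset value.

```python
class DualRangeSelector:
    """Picks the high-gain or low-gain ADC sample with hysteresis (sketch)."""

    def __init__(self, gain_db=20.0, up_threshold=0.9,
                 down_threshold=0.5, hold_samples=4800):
        self.gain = 10.0 ** (gain_db / 20.0)   # gain difference, linear
        self.up_threshold = up_threshold       # switch to low gain above this
        self.down_threshold = down_threshold   # must stay below this to return
        self.hold_samples = hold_samples       # quiet time before switching back
        self.dc_offset = 0.0                   # offset difference (fixed here)
        self.use_low_gain = False
        self.quiet_count = 0

    def process(self, hi_sample, lo_sample):
        level = abs(hi_sample)
        if level >= self.up_threshold:
            # High-gain channel is close to clipping: use the low-gain one.
            self.use_low_gain = True
            self.quiet_count = 0
        elif self.use_low_gain:
            if level < self.down_threshold:
                self.quiet_count += 1
                if self.quiet_count >= self.hold_samples:
                    self.use_low_gain = False
            else:
                self.quiet_count = 0
        if self.use_low_gain:
            return lo_sample
        # Scale the high-gain sample back to the low-gain channel's level.
        return hi_sample / self.gain - self.dc_offset
```

Because the upward and downward thresholds differ and a hold time is required before returning to the high-gain channel, a signal hovering around one level does not cause repeated switching.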
The technique described here obviously does not increase the instantaneous SNR of the ADC. At a given moment, only one of the two channels is active, so the value of the single ADC channel rules. When switching from the high to the low gain channel, the SNR drops from 118 dB to a little less than 100 dB. However, this noise level continues to be masked even in case of the most critical signals, such as very low frequency tones.
So the dual-range ADC input can be considered a welcome dynamic range extension. The high gain ADC channel assures a conversion with very low input noise level (−108 dBu) up to about +8 dBu, slightly above the typical levels encountered at the mixer output when reaching maximum volume. From there on, the other ADC channel takes over, allowing for 20 dB of headroom to be used by the limiters, which are described in detail next.

Limiters

Limiter design has been the last holdout of analog-addicts. However, digital signal processing allows devising better sounding limiters with much improved precision, no added noise and considerably reduced distortion as compared to analog designs.

Look-ahead peak limiter

Digital peak limiters can make use of a simple yet effective tool not available to analog limiters. A small delay inserted in the peak limiter’s signal path (left figure) allows its control logic to gradually reduce the gain (with a constant dB/s slope) when triggered by a peak.

This limits the transient to exactly the programmed threshold when reaching the output (see figure). This way, clipping in the subsequent signal processing stages is safely avoided. The main benefit, however, is that the gain reduction signal (corresponding to the VCA control voltage in an analog solution) has significantly lower high frequency content, effectively reducing the unpleasant and aggressive sounding distortion introduced if high frequency components modulate the audio signal.
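A minimal Python sketch of the look-ahead idea follows. It assumes a gain ramp that is linear in dB over the look-ahead time and a fixed release rate; the hold phase, the adaptive release, and the optional low-pass filtering of the control signal are left out, and all names and default values are illustrative.

```python
import math
from collections import deque

def lookahead_limit(samples, threshold=0.5, lookahead=96,
                    release_db_per_sample=0.01):
    """Peak limiter whose gain ramp starts `lookahead` samples before a peak.

    The output is the input delayed by `lookahead` samples.
    """
    delay = deque([0.0] * lookahead, maxlen=lookahead + 1)
    need = deque([0.0] * lookahead, maxlen=lookahead + 1)
    gr_db = 0.0                               # current gain reduction in dB
    out = []
    for x in samples:
        delay.append(x)
        level = abs(x)
        # Reduction (dB) this sample requires when it reaches the output.
        need.append(20.0 * math.log10(level / threshold)
                    if level > threshold else 0.0)
        # need[0] is the oldest entry (age = lookahead), need[-1] the newest.
        # Each peak demands a ramp that is complete when the peak exits.
        attack = max(n_db * age / lookahead
                     for age, n_db in zip(range(lookahead, -1, -1), need))
        gr_db = max(attack, gr_db - release_db_per_sample)
        out.append(delay[0] * 10.0 ** (-gr_db / 20.0))
    return out
```

In this sketch the ramp duration is fixed by `lookahead` while its slope depends on how far the peak exceeds the threshold, so the delayed peak arrives at the output exactly when the full reduction has been reached.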

Though already reduced by more than 20 dB compared to an analog limiter with similar parameters and the same input signal (see figure), the high-frequency components can be further attenuated by low-pass filtering the “VCA control voltage” (dashed block in block diagram). A 6th order Bessel filter with 500 Hz cut-off frequency has been found suitable for this purpose. Its constant passband group delay of nearly 1 ms can be easily compensated by adding it to the length of the signal delay line. Though providing a very subtle increase in sound clarity, the additional latency may be objectionable in live sound applications, so a switch allows bypassing the filter.
To summarize, the attack rate is adapted to the amplitude of the detected transient, while the attack time is constant. Its value of 1 ms has been chosen to be independent of the processed band. This means that in the high band, the limiter has time to reduce the gain over at least one half-wave. In contrast, a slowly developing half-wave triggering the limiter of the subwoofer channel is chamfered over a one-millisecond period before reaching the threshold level and remains there until reaching the cusp. This behavior, in conjunction with the “controlled overshoot” feature described in the section of the same name, preserves the “kick” while avoiding continued clipping in the subsequent half-waves.
The limiter control is rounded out by a retriggerable hold phase of 50 ms and a release rate adjustable by the user from 10 dB/s to 200 dB/s, just as in a conventional peak limiter. However, the actual release rate is not static. If the calculated output signal level remains near the limiter threshold, the release process is continuously slowed down. This eliminates a potential problem for slowly decaying sounds like a stroke on a crash cymbal: If the decay rate was slower than the static release rate, release and hold phases would alternate, giving the sound a sawtooth-like modulation.
It must be stressed here that although the look-ahead concept reduces distortion by keeping the gain reduction signal cleaner and by safely avoiding clipping in subsequent stages, the action of the peak limiter is of course not inaudible. Feeding an audio signal with a level well above the threshold and choosing a very fast release will result in exactly the same very dense, aggressively loud sound that is unfortunately overused by many mastering studios and radio stations.

Controlled overshoot

The peak limiter’s purpose is to avoid amplifier clipping and mechanical damage to the drivers. If the connected loudspeaker easily handles the maximum amplifier output (often true for subwoofer arrays), a peak limiter programmed to the amplifier’s rated long-term power could mean squandering its short-term power capability. In order to remedy this waste of 2 or 3 dB (or even more) for transients, the peak limiter concept was modified to simulate the power supply voltage drop occurring when power is drawn. Two additional parameters are needed to program this “controlled overshoot” feature: the “surge” value, indicating the peak power available when the power supply electrolytic capacitors are fully charged, and the “duration” value, which is the time at which the output power would have dropped to the continuous level if it continued along the initial slope of the power drop.

Both values can be easily measured by loading both amplifier channels, applying a burst signal that slightly drives the amplifier into clipping, and analyzing the resulting waveform on an oscilloscope (left figure). If not available, a “surge” value of 2 dB and a “duration” value of 30 ms harmonize well with most common power amps. The power drawn from the amplifier is modeled by squaring the signal and feeding an RC-like accumulator whose output shifts the limiter threshold downwards.
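The threshold shift can be sketched in Python with a first-order accumulator. The mapping from accumulator state to threshold, the normalisation of the drawn power to a sine wave at the continuous threshold, and the class name are all assumptions made for this example; the defaults are the 2 dB and 30 ms values suggested above.

```python
import math

class OvershootThreshold:
    """Limiter threshold that sags from surge level to continuous level."""

    def __init__(self, continuous_db=0.0, surge_db=2.0,
                 duration_s=0.030, fs_hz=96000.0):
        self.continuous_db = continuous_db
        self.surge_db = surge_db
        # First-order (RC-like) smoothing with time constant duration_s.
        self.coeff = 1.0 - math.exp(-1.0 / (duration_s * fs_hz))
        amplitude = 10.0 ** (continuous_db / 20.0)
        # Assumption: mean square of a sine at the continuous threshold.
        self.reference_power = amplitude * amplitude / 2.0
        self.charge = 0.0          # 0 = supply fully charged, 1 = fully sagged

    def update(self, sample):
        """Feed one output sample; return the current threshold in dB."""
        drawn = sample * sample / self.reference_power
        self.charge = min(1.0, self.charge + self.coeff * (drawn - self.charge))
        return self.continuous_db + self.surge_db * (1.0 - self.charge)
```

With no signal the threshold sits `surge_db` above the continuous level. Under sustained drive it settles near the continuous level, and it recovers again during pauses.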

RMS limiter

The independent RMS thermal limiter is programmed with the continuous power rating of the loudspeaker and a time-constant which is the intersection of the initial temperature rise with the temperature limit upon applying the rated power. While the first value is available from the loudspeaker manufacturer, the second one has to be estimated according to the thermal capacity of the driver. The limiter algorithm also includes a second thermal circuit, simulating the heat transfer from the voice coil to the magnet, but due to the absence of reliable data, this feature has not yet been used.
The power transformed into heat is modeled by squaring the output signal, assuming a constant load impedance. The fact that both the impedance and the excursion and thus the main cooling mechanism of a woofer’s voice coil are strongly frequency-dependent makes this assumption fairly imprecise. Exciting a vented box with rated power at its tuning frequency (where excursion and heat convection are minimal) will make the thermal limiter fatally fail in protecting the woofer from overheating. However, just as the rated power information is usually evaluated with AES noise (a pink noise with controlled crest factor), the limiter model is based on the broad spectral statistics of a musical signal.
The RMS limiter acts independently of the peak-limiter and both gain reductions (in dB) are added (see block diagram above). This means that peaks are limited to a lower output value when the RMS limiter is active, a desired protection feature.
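A simplified Python sketch of the thermal model and of the combination of both gain reductions is given below. Note the stated simplification: it is a feed-forward version that squares the limiter input, whereas the text describes squaring the output signal. The second thermal circuit is omitted, and all names and values are assumptions.

```python
import math

class ThermalLimiter:
    """Single time-constant voice coil heating model (feed-forward sketch)."""

    def __init__(self, rated_power, time_constant_s, fs_hz=96000.0):
        self.rated_power = rated_power        # squared-signal units
        self.coeff = 1.0 - math.exp(-1.0 / (time_constant_s * fs_hz))
        self.heat = 0.0                       # 1.0 = temperature limit

    def gain_reduction_db(self, sample):
        """Update the heat estimate and return the reduction in dB."""
        power = sample * sample / self.rated_power
        self.heat += self.coeff * (power - self.heat)
        # Above the limit, reduce power by the amount of the excess.
        return 10.0 * math.log10(self.heat) if self.heat > 1.0 else 0.0

def total_gain(peak_reduction_db, rms_reduction_db):
    """Both reductions are added in dB and applied as one linear gain."""
    return 10.0 ** (-(peak_reduction_db + rms_reduction_db) / 20.0)
```

With a constant load assumed, a sustained signal at four times the rated power eventually produces about 6 dB of reduction, and adding the two reductions means the peak ceiling is lowered by the same amount while the thermal limiter is active.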