Lemur - A Tool for Timbre Manipulation

Kelly Fitz, Lippold Haken, and Bryan Holloway

CERL Sound Group, University of Illinois
103 S. Mathews, Urbana IL 61801, USA

Abstract

We will present Lemur, a system for generating and manipulating sinusoidal models for sampled sound, implemented as a Macintosh(R) program. Our system uses an enhanced McAulay-Quatieri (MQ) style analysis for modeling sampled sounds. The system provides tools for time and frequency scale modifications, partial editing and pruning, timbre morphing, and many other manipulations in the model domain. Time-variant manipulations may be performed using control files. Third-party implementations of the Lemur model will be discussed. A real-time controllable implementation of a bandwidth-enhanced synthesis algorithm will be demonstrated.

Whatsa Lemur?

Lemur is a Macintosh implementation of an extended McAulay-Quatieri (MQ) algorithm for sound analysis and synthesis (McAulay and Quatieri 1986), based on the work of Maher and Beauchamp at the University of Illinois (Maher 1989). Lemur analysis consists of a series of short-time Fourier spectra from which significant frequency components are selected. Similar components in successive spectra are linked to form time-varying partials, called tracks. The number of significant frequency components, and thus the number of tracks, may vary over the duration of a sound. Synthesis is performed by a bank of oscillators, each oscillator reproducing the frequency and amplitude trajectory of a single track. Phase accuracy is maintained using cubic phase (parabolic frequency) interpolation between spectra. The Lemur model allows extensive modification of the sound, either with Lemur's built-in editing functions or with customized editors that modify the intermediate analysis file before resynthesis (Fitz, Walker, and Haken 1992). The Lemur file format contains only amplitude, frequency, initial phase, and linking information for each track, making it easy for users to write customized Lemur file editors.
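To illustrate cubic phase interpolation, the following Python sketch computes the closed-form cubic used in MQ-style synthesis (McAulay and Quatieri 1986) for one track between two analysis frames. It is a sketch under stated assumptions, not Lemur's own code: the function names are hypothetical, phases are in radians, frequencies are in radians per sample, and T is the frame spacing in samples.

```python
import math

def cubic_phase_coeffs(theta0, omega0, theta1, omega1, T):
    """Return (alpha, beta, M) for
    theta(t) = theta0 + omega0*t + alpha*t**2 + beta*t**3, 0 <= t <= T.

    The cubic matches phase and frequency at both frame boundaries.
    """
    # Integer number of extra 2*pi turns that gives the smoothest phase curve.
    M = round(((theta0 + omega0 * T - theta1) + (omega1 - omega0) * T / 2)
              / (2 * math.pi))
    # Phase still to be made up after advancing at the starting frequency.
    d = theta1 - theta0 - omega0 * T + 2 * math.pi * M
    alpha = 3 * d / T**2 - (omega1 - omega0) / T
    beta = -2 * d / T**3 + (omega1 - omega0) / T**2
    return alpha, beta, M

def phase_at(t, theta0, omega0, alpha, beta):
    """Evaluate the interpolated phase t samples after the first frame."""
    return theta0 + omega0 * t + alpha * t**2 + beta * t**3
```

At t = T the phase equals theta1 plus a whole number of turns, and the derivative of the phase (the instantaneous frequency, a parabola in t) equals omega1.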

Track selection and editing

When a Lemur analysis file is opened, it is displayed as shown in Figure 1. In this editing window, it is possible to select and modify tracks. Tracks may be selected individually, by region, or according to a track characteristic. Individual tracks are selected and deselected by the familiar point-and-click mouse technique. Regions may be drawn on the Lemur display using point-click-and-drag mousing. All the tracks born in a region may be selected or deselected. Tracks may be selected according to their duration or amplitude characteristics, or according to a user-assigned track label. If a region has been drawn on the Lemur display, only tracks born in that region are searched for selection by length, threshold clearance, or label. Selected tracks are highlighted in the Lemur display.

Figure 1. The Lemur editing window, displaying the analysis data for a violin tone. Tracks representing the harmonics 1, 4, 5, 9, and 13 have been selected.

Frequency, phase, and magnitude can be modified for individual tracks in a Lemur file. Selected tracks can be scaled and shifted in any of these parameters. Frequency scaling and shifting allow selected components of an analysis (the sustaining harmonic components, for example) to be modified in frequency without affecting the frequency scale of the other components. Magnitude and phase scaling and shifting provide selective filtering capabilities that would be difficult or impossible to achieve using ordinary digital filtering methods, and can be used to create subtle timbre modifications. For example, emphasizing (magnitude scaling) the even-numbered harmonic components of a bowed cello analysis simulates the sound of a bowed artificial harmonic on the cello.
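The even-harmonic example can be sketched in a few lines of Python. This is an illustration only: the track representation (a dictionary with per-frame "freq" and "amp" lists), the function names, and the use of the mean track frequency to estimate a harmonic number are all assumptions made for the example, not part of Lemur.

```python
def harmonic_number(track, fundamental_hz):
    """Estimate which harmonic a track represents from its mean frequency."""
    mean_freq = sum(track["freq"]) / len(track["freq"])
    return round(mean_freq / fundamental_hz)

def scale_even_harmonics(tracks, fundamental_hz, gain):
    """Return copies of the tracks with even-numbered harmonics scaled by gain."""
    result = []
    for track in tracks:
        h = harmonic_number(track, fundamental_hz)
        if h >= 2 and h % 2 == 0:
            amps = [a * gain for a in track["amp"]]  # magnitude scaling
        else:
            amps = list(track["amp"])                # left untouched
        result.append({"freq": list(track["freq"]), "amp": amps})
    return result
```

Because each track is modified independently, the other partials are not affected at all, which an ordinary filter with finite roll-off cannot guarantee.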

Lemur can be used to prune analysis files to extract selected features of a sound. This capability can be used to separate the harmonic parts of an analysis from the noisy or inharmonic parts, to perform voice separation, to separate transient events from sustaining components, to achieve perfect (no roll-off) filtering, and to create many other special effects in the domain of the Lemur model. Figure 2 shows an edited version of the Lemur file in Figure 1.

Figure 2. The Lemur editing window, displaying the analysis data for the Lemur file created by saving only the tracks selected in Figure 1.
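Selection by track characteristic followed by pruning can be sketched as a simple filter. The Python below is a hypothetical illustration: the track representation (a dictionary with a per-frame "amp" list), the function name, and the parameter names are assumptions for the example and do not describe Lemur's interface.

```python
def select_tracks(tracks, frame_seconds, min_duration=0.0, min_peak_amp=0.0):
    """Keep only tracks that last long enough and get loud enough.

    frame_seconds is the time between analysis frames; a track's duration
    is taken to be its number of frames times that spacing.
    """
    selected = []
    for track in tracks:
        duration = len(track["amp"]) * frame_seconds
        if duration >= min_duration and max(track["amp"]) >= min_peak_amp:
            selected.append(track)
    return selected
```

Saving only the returned tracks corresponds to pruning: a duration threshold tends to discard short, noisy tracks, while an amplitude threshold discards weak ones.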

Time and Frequency Scaling

The Lemur model is well suited to time- and frequency-scale modification of analyzed signals. A single set of analysis data can be used to synthesize any number of modified versions of a signal. In Lemur synthesis, time scaling is performed by varying the number of samples computed between analysis frame boundaries. The frequency of the track is unaltered, though its duration may change. Frequency scaling is achieved by modifying the track frequencies without changing the track durations. The number of samples generated is unaffected by changes in the track frequencies. Thus, independent control of the time and frequency scale of a synthesized signal can be easily exercised in the domain of the Lemur model. Since the phase curve is computed at the time of synthesis, even radical modifications produce neither phase discontinuities nor the wide frequency excursions that cause audible frequency artifacts in other sinusoidal methods (Serra 1989).
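The independence of the two scales can be seen in a single-oscillator sketch. The Python below is a simplified illustration, not Lemur's synthesizer: it interpolates frequency linearly between frames and accumulates phase sample by sample instead of using cubic phase interpolation, and the function and parameter names are hypothetical.

```python
import math

def synthesize_track(freqs_hz, amps, hop, sample_rate, start_phase=0.0,
                     time_scale=1.0, freq_scale=1.0):
    """Synthesize one track from per-frame frequency and amplitude values.

    hop is the number of samples between frames at the original time scale.
    """
    # Time scaling changes only how many samples separate two frames.
    samples_per_frame = max(1, round(hop * time_scale))
    phase = start_phase
    out = []
    for k in range(len(freqs_hz) - 1):
        for n in range(samples_per_frame):
            x = n / samples_per_frame
            # Frequency scaling changes only the oscillator frequency.
            freq = freq_scale * ((1 - x) * freqs_hz[k] + x * freqs_hz[k + 1])
            amp = (1 - x) * amps[k] + x * amps[k + 1]
            out.append(amp * math.cos(phase))
            # The phase is computed at synthesis time, so it stays continuous.
            phase += 2 * math.pi * freq / sample_rate
    return out
```

Doubling time_scale doubles the number of output samples without changing pitch, while changing freq_scale leaves the number of samples untouched.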

Synthesis Control Files

In Lemur, time, frequency, and magnitude scaling may be varied continuously over the duration of a synthesis. Lemur can interpret ordinary sample files as control files for time, frequency, and magnitude scaling, and for frequency shifting. Control files are resampled to match the duration of the Lemur file, so their sample rate and duration are arbitrary. Frequency and magnitude scale modifications can be applied to all tracks in an analysis, or only to those tracks bearing a specified label. These modifications are performed during synthesis, and do not alter the Lemur file itself. Figure 3 shows an example of a Lemur synthesis using a control file.

Figure 3. Example of control files used in Lemur synthesis. The control file is used to achieve time-variant frequency scaling, with a base scale of 1.5.
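The resampling of a control file to the duration of an analysis can be sketched as follows. This Python example assumes linear interpolation and assumes that the base scale multiplies the control values; the text does not specify either detail, and the function names are hypothetical.

```python
def resample_control(control, n_frames):
    """Linearly resample a control signal to n_frames values.

    The first and last control samples map to the first and last frames,
    so the control file's own sample rate and length do not matter.
    """
    if n_frames == 1 or len(control) == 1:
        return [control[0]] * n_frames
    out = []
    for k in range(n_frames):
        pos = k * (len(control) - 1) / (n_frames - 1)
        i = min(int(pos), len(control) - 2)
        frac = pos - i
        out.append((1 - frac) * control[i] + frac * control[i + 1])
    return out

def frequency_scale_per_frame(control, n_frames, base_scale=1.0):
    """Per-frame frequency scale factors (assumed: base scale times control)."""
    return [base_scale * c for c in resample_control(control, n_frames)]
```

For example, resample_control([0.0, 1.0], 5) stretches a two-sample ramp over five frames, giving 0.0, 0.25, 0.5, 0.75, 1.0.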

Merging

Lemur files can be merged to form larger or more complex models. The tracks from one file are merged into another by adding the peaks from each frame in the former to the corresponding frame of the latter. Since peaks in consecutive frames in the original analysis must remain in consecutive frames, merging files with different frame lengths can have interesting or undesired time scaling effects on the merged file. The merged analyses need not be the same length, and need not begin together; a time offset may be specified for the merged file. A recent Lemur project involved separating (using track selection) the many short auditory events in recordings of metallic banging and scraping sounds from a train yard and combining (merging) them to make new banging and scraping sounds. Figure 4 shows the result of merging two Lemur files. Merged Lemur files produce the same synthesis as an audio mix (with delay) of the separate syntheses, but the effects of selective and time-variant scaling and shifting on the merged analyses are difficult to achieve with the files kept separate.

Figure 4. Example of merging Lemur files. The second file was merged into the first with a delay of 125 milliseconds. For visibility, the tracks from the second file have been highlighted in the graph of the merged file.
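Frame-by-frame merging with an offset can be sketched in Python as below. The representation is an assumption made for the example: each analysis is a list of frames, each frame a list of (frequency, amplitude) peaks, and the offset is given in frames. The sketch ignores the link data that a real Lemur file carries, which would have to be re-indexed after merging.

```python
def merge_frames(dest, src, offset_frames=0):
    """Merge the peaks of src into dest, starting offset_frames into dest.

    Frame k of src is added to frame offset_frames + k of the result, so
    consecutive frames stay consecutive regardless of each file's frame length.
    """
    n = max(len(dest), offset_frames + len(src))
    # Copy dest's frames and pad with empty frames if src runs past the end.
    merged = [list(dest[k]) if k < len(dest) else [] for k in range(n)]
    for k, peaks in enumerate(src):
        merged[offset_frames + k].extend(peaks)
    return merged
```

Since frames are matched by index, two files analyzed with different frame lengths are aligned frame for frame and not second for second, which is the source of the time scaling effects mentioned above.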

Morphing

Timbre morphing is the process of combining two or more sounds to create a new sound with intermediate timbre and duration. For instance, a long sound with a fast and narrow vibrato may be morphed with a quiet sound with a slow and wide vibrato, to create a morphed sound with a medium length, medium loudness, and an intermediate vibrato speed and width (Tellman, Haken, and Holloway 1995). This process differs from mixing sounds, as only a single sound, with some of the characteristics of each of the original sounds, is audible as the morphed sound.

As a first step in the process, Lemur must determine which tracks should be paired for morphing. For pitched sounds, most of the tracks are approximately integer multiples of the fundamental of the sound. Tracks which are the same multiple of the fundamental in the two sounds are morphed together. Tracks in one sound with no corresponding track in the other sound are morphed with a zero-magnitude partial, with a frequency determined by the ratios of the sounds' fundamentals.
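The pairing rule for pitched sounds can be sketched in Python as follows. The sketch is hypothetical: each partial is reduced to a single (frequency, amplitude) pair, harmonic numbers are estimated by rounding the ratio to the fundamental, and when two partials claim the same harmonic the louder one is kept. None of these choices is taken from Lemur itself.

```python
def index_by_harmonic(partials, fundamental_hz):
    """Map harmonic number -> (freq_hz, amp); the louder partial wins a tie."""
    table = {}
    for freq, amp in partials:
        h = round(freq / fundamental_hz)
        if h >= 1 and (h not in table or amp > table[h][1]):
            table[h] = (freq, amp)
    return table

def pair_for_morph(partials_a, f0_a, partials_b, f0_b):
    """Return a list of (harmonic, partial_a, partial_b) triples."""
    a = index_by_harmonic(partials_a, f0_a)
    b = index_by_harmonic(partials_b, f0_b)
    pairs = []
    for h in sorted(set(a) | set(b)):
        if h in a and h in b:
            pairs.append((h, a[h], b[h]))
        elif h in a:
            # No partner in B: zero-magnitude partial, frequency scaled
            # by the ratio of the two fundamentals.
            pairs.append((h, a[h], (a[h][0] * f0_b / f0_a, 0.0)))
        else:
            pairs.append((h, (b[h][0] * f0_a / f0_b, 0.0), b[h]))
    return pairs
```

With fundamentals of 220 Hz and 330 Hz, a third harmonic present only in the second sound at 990 Hz is paired with a silent partial at 660 Hz in the first.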

To prepare analyses for morphing, the user must identify features in each analysis. Lemur distinguishes between two types of features: unique features and repeatable features. Unique features are specific points in each sound which must be lined up in the morphing process. Unique features include the start of the attack, the peak of the attack, the start of the decay, etc. Repeatable features are ones which may be duplicated or omitted in the morphing process, such as the beginning of each vibrato cycle.

The weight for each sound in the morph is specified by a control file and may vary as the morph progresses. If the weight gradually changes when two sounds with different vibrato rates are morphed, the vibrato rate of the morphed sound will gradually change.
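The effect of a time-varying weight can be sketched with a weighted average of paired parameters. The Python below assumes simple linear interpolation and assumes the two envelopes have already been time-aligned to the same number of frames; the actual interpolation rule used in Lemur's morphing is not stated here, and the function names are hypothetical.

```python
def morph_value(value_a, value_b, weight_b):
    """Weighted average of one parameter; weight_b = 0 gives A, 1 gives B."""
    return (1.0 - weight_b) * value_a + weight_b * value_b

def morph_envelope(env_a, env_b, weights):
    """Morph two time-aligned parameter envelopes frame by frame."""
    return [morph_value(a, b, w) for a, b, w in zip(env_a, env_b, weights)]
```

For example, if one sound has a 4 Hz vibrato and the other an 8 Hz vibrato, weights rising from 0 to 1 move the morphed vibrato rate from 4 Hz through 6 Hz to 8 Hz.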

Roll Your Own - The Lemur File Format

Lemur is an ideal tool for composers wanting to exert fine timbral control over sampled sounds. In addition to using the powerful timbre editing capabilities built into Lemur, composers can create customized programs for editing Lemur data files. Lemur's analysis data is stored in a publicly available file format that is easy to manipulate. The Lemur file format contains only amplitude, frequency, initial phase, and link data for each track. The amplitude and frequency data represent the time-varying behavior of the track. The link data is used to reconstruct the tracks from the Lemur file.

Other implementations of the raw MQ format contain data representing the phase behavior of the track, so that synthesis may be phase-accurate. Any time or frequency modification of a track requires that the phase behavior be recomputed, a complex operation that makes the raw MQ data inconvenient for timbre manipulation. Since the frequency trajectory of a track is the derivative of its phase trajectory, Lemur maintains phase accuracy by retaining only track starting phases and interpolated frequency data, which requires no special treatment during modification.

Folks Who Rolled Their Own

Steve Berkley's QuickMQ application is a tool for frequency-domain transformations of Lemur files. These transformations include convolution, deconvolution, spectrum mix, granular desynthesis, brightening, harmonic rotation, and a spectrum processing language based on the graphics manipulation language, Popi, by Gerard Holzmann. QuickMQ allows editing of Lemur files in a movie environment, where the changes in spectrum may be observed over time. QuickMQ allows the user to describe frequency domain translation algorithms to produce new audio transformations of frequency, time, and amplitude (Berkley 1994).

PAST, developed at Dartmouth by Chris Langmead, is a perceptual timbre modeling tool that uses Lemur as a preprocessor. The timbral dimensions of the model are the morphologies, or shapes, created by the Lemur tracks. PAST graphically displays the morphologies of track onset asynchrony, track density, amplitude and mean frequency envelopes, track length, spectral envelope, and harmonicity. The morphologies are used by the model to calculate similarity measurements between different timbres. The model also allows dimension-specific timbre modification, either by "morphing" the dimension of one timbre onto another or by drawing a new curve for the selected dimension(s) (Langmead 1995).

The timbre morphing feature described above was first implemented as a stand-alone application called LemurMorph (Tellman, Haken, and Holloway 1995). MQT, written by Ted Apel, is a compositional tool for applying transformations to Lemur analysis data (Apel 1993).

QuickMQ, PAST, and MQT are available via anonymous ftp at music.dartmouth.edu. Lemur Pro for the Macintosh or PowerMac is available by anonymous ftp at www.cerlsoundgroup.org, in pub/lemur. The authors may be contacted at

lemur@uiuc.edu

References

Ted Apel, "Transformation of Audio Signals By Use of the McAulay Quatieri Sinusoidal Model of Sound," M.A. thesis and accompanying computer software, Dept. of Electro-Acoustic Music, Dartmouth College, Hanover, NH, 1993.

Steve W. Berkley, "QuickMQ: A Software Tool for the Modification of Time-Varying Spectrum Analysis Files," M.A. thesis and accompanying computer software, Dept. of Electro-Acoustic Music, Dartmouth College, Hanover, NH, 1994.

Kelly Fitz, William Walker, and Lippold Haken, "Extending the McAulay-Quatieri Analysis for Synthesis With a Limited Number Of Oscillators," Proc. Intl. Computer Music Conf., 1992, pp. 381-382.

Chris J. Langmead, "A Theoretical Model of Timbre Perception Based on Morphological Representations of Time-Varying Spectra," M.A. thesis, Dept. of Electro-Acoustic Music, Dartmouth College, Hanover, NH, 1995.

Robert J. McAulay and Thomas Quatieri, "Speech Analysis/Synthesis Based on a Sinusoidal Representation," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 744-754, 1986.

Robert C. Maher, "An Approach for the Separation of Voices in Composite Musical Signals," Ph.D. dissertation, Dept. of Computer Science, Univ. of Illinois at Urbana-Champaign, 1989.

Xavier Serra, "A System for Sound Analysis/Transformation/Synthesis Based on a Deterministic Plus Stochastic Decomposition," Ph.D. dissertation, Dept. of Music, Stanford University, Stanford, CA, 1989.

Edwin Tellman, Lippold Haken, and Bryan Holloway, "Morphing Between Timbres with Unequal Numbers of Features," to appear in Journal of the Audio Engineering Society, 1995.

Macintosh(R) is a registered trademark of the Apple Computer Corporation.
