Lemur - A Tool for Timbre Manipulation
Kelly Fitz, Lippold Haken, and Bryan Holloway
CERL Sound Group, University of Illinois
103 S. Mathews, Urbana IL 61801, USA
lemur@uiuc.edu
Abstract
We will present Lemur, a system for generating and manipulating sinusoidal models for sampled sound,
implemented as a Macintosh(R) program. Our system uses an enhanced McAulay-Quatieri (MQ) style analysis
for modeling sampled sounds. The system provides tools for time and frequency scale modifications, partial editing
and pruning, timbre morphing, and many other manipulations in the model domain. Time-variant manipulations may
be performed using control files. Third-party implementations of the Lemur model will be discussed. A real-time
controllable implementation of a bandwidth-enhanced synthesis algorithm will be demonstrated.
Whatsa Lemur?
Lemur is a Macintosh implementation of an extended McAulay-Quatieri (MQ) algorithm for sound analysis and
synthesis (McAulay and Quatieri 1986) based on the work of Maher and Beauchamp at the University of Illinois
(Maher 1989). Lemur analysis consists of a series of short-time Fourier spectra from which significant frequency
components are selected. Similar components in successive spectra are linked to form time-varying partials, called
tracks. The number of significant frequency components, and, thus, the number of tracks may vary over the
duration of a sound. Synthesis is performed by a bank of oscillators, each oscillator reproducing the frequency and
amplitude trajectory of a single track. Phase accuracy is maintained using cubic phase (parabolic frequency)
interpolation between spectra. The Lemur model allows extensive modification of the sound using Lemur's built-in
editing functions, or using other customized editors to modify the intermediate analysis file before resynthesis
(Fitz, Walker, and Haken 1992). The Lemur file format contains only amplitude, frequency, initial phase, and linking
information for each track, making it easy for users to write customized Lemur file editors.
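The linking step described above can be illustrated with a minimal Python sketch. It is not Lemur's actual algorithm: it assumes a greedy nearest-frequency match, and the function name link_peaks, the max_jump_hz tolerance, and the (frequency, amplitude) peak tuples are hypothetical choices made for illustration.

```python
def link_peaks(frames, max_jump_hz=50.0):
    """Link spectral peaks in successive frames into tracks.

    frames: list of frames; each frame is a list of (freq_hz, amp) peaks.
    Returns a list of tracks; each track is a list of (frame_index, freq_hz, amp).
    Assumption: greedy nearest-frequency matching, not Lemur's own rule.
    """
    tracks = []   # every track found so far, finished or still alive
    active = []   # indices of tracks that reached the previous frame
    for i, peaks in enumerate(frames):
        unclaimed = list(peaks)
        still_active = []
        for t in active:
            last_freq = tracks[t][-1][1]
            if not unclaimed:
                continue  # no peak left to continue this track: it dies
            best = min(unclaimed, key=lambda p: abs(p[0] - last_freq))
            if abs(best[0] - last_freq) <= max_jump_hz:
                tracks[t].append((i, best[0], best[1]))
                unclaimed.remove(best)
                still_active.append(t)
        # Any peak not claimed by an existing track gives birth to a new track.
        for freq, amp in unclaimed:
            tracks.append([(i, freq, amp)])
            still_active.append(len(tracks) - 1)
        active = still_active
    return tracks


# Hypothetical data: two partials, the upper one ending a frame early.
frames = [
    [(440.0, 1.0), (880.0, 0.5)],
    [(442.0, 0.9), (885.0, 0.4)],
    [(444.0, 0.8)],
]
tracks = link_peaks(frames)
```

Because tracks are born and die as peaks appear and vanish, the number of tracks alive in each frame varies over the sound, as described above.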
Track selection and editing
When a Lemur analysis file is opened, it is displayed as shown in Figure 1. In this editing window, it is possible to
select and modify tracks. Tracks may be selected individually, by region, or according to a track characteristic.
Individual tracks are selected and deselected by the familiar point-and-click mouse technique. Regions may be
drawn on the Lemur display using point-click-and-drag mousing. All the tracks born in a region may be selected or
deselected. Tracks may be selected according to their duration or amplitude characteristics, or according to a
user-assigned track label. If a region has been drawn on the Lemur display, only tracks born in that region are
searched for selection by length, threshold clearance, or label. Selected tracks are highlighted in the Lemur display.
Figure 1. The Lemur editing window, displaying the analysis data for a violin tone. Tracks representing
the harmonics 1, 4, 5, 9, and 13 have been selected.
Frequency, phase, and magnitude can be modified for individual tracks in a Lemur file. Selected tracks can be
scaled and shifted in any of these parameters. Frequency scaling and shifting allow selected components (the
sustaining harmonic components, for example) of an analysis to be modified in frequency without affecting the
frequency scale of the other components. Magnitude and phase scaling and shifting provide selective filtering
capabilities that would be difficult or impossible to achieve using ordinary digital filtering methods, and can be used
to create subtle timbre modifications. For example, emphasizing (magnitude scaling) the even-numbered harmonic
components of a bowed cello analysis simulates the sound of a bowed artificial harmonic on the cello.
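As a rough illustration of selective magnitude scaling, the following Python sketch boosts only the even-numbered harmonics of a set of tracks. The track representation (a dictionary with a "harmonic" label and an "amps" envelope) and the function name scale_selected are assumptions made for this example, not Lemur's data structures.

```python
def scale_selected(tracks, is_selected, factor):
    """Return a copy of `tracks` with the amplitudes of selected tracks scaled.

    tracks: list of dicts with hypothetical keys "harmonic" and "amps".
    is_selected: predicate deciding which tracks are modified.
    """
    result = []
    for track in tracks:
        amps = track["amps"]
        if is_selected(track):
            amps = [a * factor for a in amps]
        result.append({"harmonic": track["harmonic"], "amps": list(amps)})
    return result


# Hypothetical data: four harmonics with 1/h amplitude roll-off.
tracks = [{"harmonic": h, "amps": [1.0 / h, 0.5 / h]} for h in range(1, 5)]

# Emphasize the even-numbered harmonics by a factor of 2.
boosted = scale_selected(tracks, lambda t: t["harmonic"] % 2 == 0, 2.0)
```

Unselected tracks pass through unchanged, which is what distinguishes this kind of edit from a conventional filter applied to the whole signal.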
Lemur can be used to prune analysis files to extract selected features of a sound. This capability can be used to
separate the harmonic parts of an analysis from the noisy or inharmonic parts, to perform voice separation, to
separate transient events from sustaining components, to achieve perfect (no roll off) filtering, and many other
special effects in the domain of the Lemur model. Figure 2 shows an edited version of the Lemur file in Figure 1.
Figure 2. The Lemur editing window, displaying the analysis data for the Lemur file created by saving
only the tracks selected in Figure 1.
Time and Frequency Scaling
The Lemur model is well-suited to time- and frequency-scale modification of analyzed signals. A single set of
analysis data can be used to synthesize any number of modified versions of a signal. In Lemur synthesis, time
scaling is performed by varying the number of samples computed between analysis frame boundaries. The
frequency of the track is unaltered, though its duration may change. Frequency scaling is achieved by modifying
the track frequencies without changing the track durations. The number of samples generated is unaffected by
changes in the track frequencies. Thus, independent control of the time and frequency scale of a synthesized signal
can be easily exercised in the domain of the Lemur model. Since the phase curve is computed at the time of
synthesis, even radical modifications produce no phase discontinuities or wide frequency excursions, which
produce audible frequency artifacts in other sinusoidal methods (Serra 1989).
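The independence of the two scales can be sketched for a single track as follows. This is a simplified illustration, not Lemur's synthesizer: it assumes linear interpolation of frequency and amplitude between frames (Lemur uses cubic phase interpolation), and the function and parameter names are hypothetical.

```python
import math


def synthesize_track(freqs, amps, frame_len, sample_rate,
                     time_scale=1.0, freq_scale=1.0, start_phase=0.0):
    """Synthesize one track with independent time and frequency scaling.

    freqs, amps: per-frame frequency (Hz) and amplitude values.
    frame_len: samples between frame boundaries in the unmodified sound.
    Assumption: linear interpolation between frames, for simplicity.
    """
    # Time scaling changes only how many samples are computed per frame.
    n = int(round(frame_len * time_scale))
    out = []
    phase = start_phase
    for k in range(len(freqs) - 1):
        for j in range(n):
            u = j / n
            # Frequency scaling changes only the oscillator frequency.
            f = freq_scale * ((1.0 - u) * freqs[k] + u * freqs[k + 1])
            a = (1.0 - u) * amps[k] + u * amps[k + 1]
            out.append(a * math.sin(phase))
            # Phase is accumulated at synthesis time, so it stays continuous.
            phase += 2.0 * math.pi * f / sample_rate
    return out


freqs = [440.0, 440.0, 440.0]
amps = [1.0, 1.0, 1.0]
normal = synthesize_track(freqs, amps, 100, 44100)
stretched = synthesize_track(freqs, amps, 100, 44100, time_scale=2.0)
transposed = synthesize_track(freqs, amps, 100, 44100, freq_scale=1.5)
```

Here the stretched version is twice as long as the original at the same pitch, while the transposed version has the original length at a higher pitch.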
Synthesis Control Files
In Lemur, time, frequency and magnitude scaling may be varied continuously over the duration of a synthesis.
Lemur can interpret ordinary sample files as control files for time, frequency, and magnitude scaling, and
frequency shifting. Control files are resampled to match the duration of the Lemur file, so their sample rate and
duration are arbitrary. Frequency and magnitude scale modifications can be applied to all tracks in an analysis, or
only to those tracks bearing a specified label. These modifications are performed during synthesis, and do not alter
the Lemur file itself. Figure 3 shows an example of a Lemur synthesis using a control file.
Figure 3. Example of control files used in Lemur synthesis. The control file is used to achieve time-variant
frequency scaling, with a base scale of 1.5.
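The resampling of a control file to the duration of a Lemur file might look like the following Python sketch. It assumes linear interpolation and one control value per analysis frame; the function name resample_control and the way the control values are applied as per-frame frequency scale factors are illustrative assumptions, not a description of Lemur's internals.

```python
def resample_control(samples, n_out):
    """Linearly resample a control signal to exactly n_out values.

    samples: control values of arbitrary length (and hence arbitrary rate).
    n_out: number of values wanted, e.g. one per analysis frame.
    """
    if n_out <= 0:
        return []
    if n_out == 1 or len(samples) == 1:
        return [samples[0]] * n_out
    out = []
    last = len(samples) - 1
    for i in range(n_out):
        pos = i * last / (n_out - 1)   # fractional index into `samples`
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out


# Hypothetical control file: a ramp from 1.0 to 2.0, stretched over 5 frames.
scales = resample_control([1.0, 2.0], 5)

# Apply it as a time-variant frequency scale to one track's frame frequencies.
track_freqs = [440.0] * 5
scaled_freqs = [f * s for f, s in zip(track_freqs, scales)]
```

Since the control values are only read during synthesis, the stored track frequencies are left untouched, which matches the statement that these modifications do not alter the Lemur file itself.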
Merging
Lemur files can be merged to form larger or more complex models. The tracks from one file are merged into
another by adding the peaks from each frame in the former to the corresponding frame of the latter. Since peaks in
consecutive frames in the original analysis must remain in consecutive frames, merging files with different frame
lengths can have interesting or undesired time scaling effects on the merged file. The merged analyses need not be
the same length, and need not begin together; a time offset may be specified for the merged file. A recent Lemur
project consisted of separating (using track selection) the many short auditory events in recordings of metallic
banging and scraping sounds from a train yard and combining (merging) them to make new banging and scraping
sounds. Figure 4 shows the result of merging two Lemur files. Merged Lemur files produce the same synthesis as
an audio mix (with delay) of the separate syntheses. But the effects of selective and time-variant scaling and
shifting on the merged analyses are difficult to achieve separately.
Figure 4. Example of merging Lemur files. The second file was merged into the first with a delay of 125
milliseconds. For visibility, the tracks from the second file have been highlighted in the graph of the merged file.
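The frame-by-frame merge with a time offset can be sketched in Python as below. The sketch assumes both files use the same frame duration and represents each frame as a list of (frequency, amplitude) peaks; the names merge_frames and offset_in_frames are hypothetical.

```python
def offset_in_frames(offset_seconds, frame_duration):
    """Convert a time offset to a whole number of frames (assumed rounding)."""
    return int(round(offset_seconds / frame_duration))


def merge_frames(dest, src, offset_frames=0):
    """Merge the peaks of `src` into `dest`, frame by frame.

    dest, src: lists of frames, each a list of (freq_hz, amp) peaks.
    offset_frames: frame in `dest` at which `src` begins.
    Returns a new list of frames; the inputs are not modified.
    """
    n = max(len(dest), offset_frames + len(src))
    merged = [list(dest[i]) if i < len(dest) else [] for i in range(n)]
    for i, peaks in enumerate(src):
        # Peaks from consecutive source frames land in consecutive merged frames.
        merged[offset_frames + i].extend(peaks)
    return merged


# Hypothetical data: a 125 ms delay with 25 ms frames is 5 frames.
delay = offset_in_frames(0.125, 0.025)

first = [[(440.0, 1.0)], [(441.0, 1.0)]]
second = [[(1000.0, 0.5)], [(1001.0, 0.5)]]
merged = merge_frames(first, second, offset_frames=1)
```

If the two files had different frame lengths, placing source frames one-to-one into destination frames would stretch or compress the merged material in time, which is the time scaling effect noted above.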
Morphing
Timbre morphing is the process of combining two or more sounds to create a new sound with intermediate timbre
and duration. For instance, a long sound with a fast and narrow vibrato may be morphed with a quiet sound with a
slow and wide vibrato, to create a morphed sound with a medium length, medium loudness, and an
intermediate vibrato speed and width (Tellman, Haken, and Holloway 1995). This process differs from mixing
sounds, as only a single sound, with some of the characteristics of each of the original sounds, is audible as the
morphed sound.
As a first step in the process, Lemur must determine which tracks should be paired for morphing. For pitched
sounds, most of the tracks are approximately integer multiples of the fundamental of the sound. Tracks which are
the same multiple of the fundamental in the two sounds are morphed together. Tracks in one sound with no
corresponding track in the other sound are morphed with a zero-magnitude partial, with a frequency determined by
the ratios of the sounds' fundamentals.
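A minimal Python sketch of this pairing step follows. It assumes each track is summarized by a single (frequency, amplitude) pair and that the harmonic number is found by rounding the ratio of track frequency to fundamental; the function names and that rounding rule are assumptions for illustration, not Lemur's documented procedure.

```python
def harmonic_number(freq, fundamental):
    """Nearest integer multiple of the fundamental (at least 1)."""
    return max(1, round(freq / fundamental))


def pair_tracks(tracks_a, f0_a, tracks_b, f0_b):
    """Pair tracks from two pitched sounds by harmonic number.

    tracks_a, tracks_b: lists of (freq_hz, amp) track summaries.
    Returns {harmonic: ((freq_a, amp_a), (freq_b, amp_b))}.
    A track with no partner is paired with a zero-magnitude partial whose
    frequency is scaled by the ratio of the two fundamentals.
    Simplification: if two tracks share a harmonic number, the later one wins.
    """
    by_h_a = {harmonic_number(f, f0_a): (f, a) for f, a in tracks_a}
    by_h_b = {harmonic_number(f, f0_b): (f, a) for f, a in tracks_b}
    pairs = {}
    for h in sorted(set(by_h_a) | set(by_h_b)):
        a = by_h_a.get(h)
        b = by_h_b.get(h)
        if a is None:
            a = (b[0] * f0_a / f0_b, 0.0)
        if b is None:
            b = (a[0] * f0_b / f0_a, 0.0)
        pairs[h] = (a, b)
    return pairs


# Hypothetical data: sound B lacks a second harmonic.
sound_a = [(220.0, 1.0), (440.0, 0.5), (660.0, 0.25)]
sound_b = [(330.0, 0.8), (990.0, 0.2)]
pairs = pair_tracks(sound_a, 220.0, sound_b, 330.0)
```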
To prepare analyses for morphing, the user must identify features in each analysis. Lemur distinguishes between
two types of features: unique features and repeatable features. Unique features are specific points in each sound
which must be lined up in the morphing process. Unique features include the start of the attack, the peak of the
attack, start of the decay, etc. Repeatable features are ones which may be duplicated or omitted in the morphing
process, such as the beginning of each vibrato cycle.
The weight for each sound in the morph is specified by a control file and may vary as the morph progresses. If the
weight gradually changes when two sounds with different vibrato rates are morphed, the vibrato rate of the
morphed sound will gradually change.
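The effect of a time-varying weight can be illustrated with the short Python sketch below. It assumes that the paired tracks have already been time-aligned frame by frame and that the morph is a weighted average of frequency and amplitude; the source does not state the interpolation rule, so the linear blend and the name morph_envelopes are assumptions.

```python
def morph_envelopes(env_a, env_b, weights):
    """Blend two time-aligned track envelopes frame by frame.

    env_a, env_b: lists of (freq_hz, amp) per frame, already aligned.
    weights: per-frame weight of sound B, from 0.0 (all A) to 1.0 (all B).
    Assumption: a simple linear blend of frequency and amplitude.
    """
    morphed = []
    for (fa, aa), (fb, ab), w in zip(env_a, env_b, weights):
        freq = (1.0 - w) * fa + w * fb
        amp = (1.0 - w) * aa + w * ab
        morphed.append((freq, amp))
    return morphed


# Hypothetical data: the weight moves from sound A toward sound B.
env_a = [(200.0, 1.0), (200.0, 1.0), (200.0, 1.0)]
env_b = [(400.0, 0.0), (400.0, 0.0), (400.0, 0.0)]
morphed = morph_envelopes(env_a, env_b, [0.0, 0.5, 1.0])
```

In practice the weights would come from a control file resampled to the length of the morph.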
Roll Your Own - The Lemur File Format
Lemur is an ideal tool for composers wanting to exert fine timbral control over sampled sounds. In addition to using
the powerful timbre editing capabilities built into Lemur, composers can create customized programs for editing
Lemur data files. Lemur's analysis data is stored in a publicly available file format that is easy to manipulate. The
Lemur file format contains only amplitude, frequency, initial phase, and link data for each track. The amplitude and
frequency data represent the time-varying behavior of the track. The link data is used to reconstruct the tracks
from the Lemur file.
Other implementations of the raw MQ format contain data representing the phase behavior of the track, so that
synthesis may be phase-accurate. Any time or frequency modification of a track requires that the phase behavior
be recomputed, a complex operation that makes the raw MQ data inconvenient for timbre manipulation. Since the
frequency trajectory of a track is the derivative of its phase trajectory, Lemur maintains phase accuracy by
retaining only track starting phases and interpolated frequency data, which requires no special treatment during
modification.
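The relationship between stored frequencies and phase can be made concrete with a small Python sketch. It assumes frequency varies linearly between frames, so the phase advance over one frame is the mean frequency times the frame duration; the function name phase_trajectory is hypothetical.

```python
import math


def phase_trajectory(start_phase, freqs_hz, frame_duration):
    """Recover the phase at each frame boundary by integrating frequency.

    start_phase: the track's stored initial phase, in radians.
    freqs_hz: per-frame frequencies of the track.
    Assumption: frequency is linear between frames (trapezoidal integration).
    """
    phases = [start_phase]
    for k in range(len(freqs_hz) - 1):
        mean_f = 0.5 * (freqs_hz[k] + freqs_hz[k + 1])
        phases.append(phases[-1] + 2.0 * math.pi * mean_f * frame_duration)
    return phases


# Hypothetical track: constant 100 Hz with 10 ms frames.
original = phase_trajectory(0.0, [100.0, 100.0, 100.0], 0.01)

# After frequency scaling, the phase is simply recomputed from the new values.
scaled = phase_trajectory(0.0, [200.0, 200.0, 200.0], 0.01)
```

Because the phase is derived rather than stored, an edit only has to change the frequency data; no separate phase correction pass is required.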
Folks Who Rolled Their Own
Steve Berkley's QuickMQ application is a tool for frequency-domain transformations of Lemur files.
These transformations include convolution, deconvolution, spectrum mix, granular desynthesis, brightening,
harmonic rotation, and a spectrum processing language based on the graphics manipulation language, Popi, by
Gerard Holzmann. QuickMQ allows editing of Lemur files in a movie environment, where the changes in spectrum
may be observed over time. QuickMQ allows the user to describe frequency domain translation algorithms to
produce new audio transformations of frequency, time, and amplitude (Berkley 1994).
PAST, developed at Dartmouth by Chris Langmead, is a perceptual timbre modeling tool that uses Lemur
as a preprocessor. The timbral dimensions of the model are the morphologies, or shapes created by the Lemur
tracks. PAST displays graphically the morphologies of track onset asynchrony, track density, amplitude and mean
frequency envelopes, track length, spectral envelope and harmonicity. The morphologies are used by the model to
calculate similarity measurements between different timbres. The model also allows dimension-specific timbre
modification by either "morphing" the dimension of one timbre onto another, or by drawing a new curve for the
selected dimension(s) (Langmead 1995).
The timbre morphing feature described above was first implemented as a stand-alone application called
LemurMorph (Tellman, Haken, and Holloway 1995). MQT, written by Ted Apel, is a
compositional tool for applying transformations to Lemur analysis data (Apel 1993).
QuickMQ, PAST, and MQT are available via anonymous ftp at
music.dartmouth.edu. Lemur Pro for the Macintosh or PowerMac is available by anonymous ftp at
www.cerlsoundgroup.org, in pub/lemur. The authors may be contacted at lemur@uiuc.edu.
References
Ted Apel, "Transformation of Audio Signals By Use of the McAulay Quatieri Sinusoidal Model of Sound," M.A.
thesis and accompanying computer software, Department of Electro-Acoustic Music, Dartmouth College,
Hanover, NH, 1993.
Steve W. Berkley, "QuickMQ: A Software Tool for the Modification of Time-Varying Spectrum Analysis Files,"
M.A. thesis and accompanying computer software, Dept. of Electro-Acoustic Music, Dartmouth College, Hanover,
NH, 1994.
Kelly Fitz, William Walker, and Lippold Haken, "Extending the McAulay-Quatieri Analysis for Synthesis With a
Limited Number Of Oscillators," Proc. Intl. Computer Music Conf., 1992, pp. 381-382.
Chris J. Langmead, "A Theoretical Model of Timbre Perception Based on Morphological Representations of
Time-Varying Spectra," M.A. thesis, Dept. of Electro-Acoustic Music, Dartmouth College, Hanover, NH, 1995.
Robert J. McAulay and Thomas Quatieri, "Speech Analysis/Synthesis Based on a Sinusoidal Representation,"
IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 744-754, 1986.
Robert C. Maher, "An Approach for the Separation of Voices in Composite Musical Signals," Ph.D.
dissertation, Dept. of Computer Science, Univ. of Illinois at Urbana-Champaign, 1989.
Xavier Serra, "A System for Sound Analysis/Transformation/Synthesis Based on a Deterministic Plus
Stochastic Decomposition," Ph.D. dissertation, Dept. of Music, Stanford University, Stanford CA, 1989.
Edwin Tellman, Lippold Haken, and Bryan Holloway, "Morphing Between Timbres with Unequal Numbers of
Features," to appear in Journal of the Audio Engineering Society, 1995.
Macintosh(R) is a registered trademark of the Apple Computer Corporation.