Working with sound

Thor Magnusson, November 2007

FLOSS audio tools... How fantastic! There is a whole world of free and exciting instruments at our disposal for play and work. These instruments try to cater for all the needs that a musician or sound artist might have. Better still: if you program computers, you can get the source code of the software and adapt it to your own specific work pattern, thus transforming yourself from a mere consumer into a co-author of the software. At the moment FLOSS provides most of what commercial software can provide, and sometimes more.

Normally someone decides to make a specific software tool when he or she wants to perform a certain task and no existing tool does the job. Software is based upon people's needs and work habits. A tool might exist that performs a certain task, but not in the desired way, so a new tool is created. If we look at the question of user needs from a very basic overview position, we can define the following categories: Audio Editing (for recording and processing sound); Sequencing (for layering sound events as tracks on a timeline and perhaps applying effects to the tracks); Score Writing (for creating musical scores on staves or in a piano roll interface; this is software based upon the old tradition of writing music as notes on paper); Virtual Instruments (tools that allow you to generate sounds through events such as commands from a sequencer or input from hardware such as a MIDI keyboard); Sound Feature Analysers (for analysing the nature of the sound: its timbre, temporal onsets and amplitude); Patchers - Algorithmic/Interactive/Generative Composition (for working with formal structures, algorithms and generativity - and that's what music essentially is). In the following sections we will look at some of the FLOSS tools that are found in each of these categories.

Audio Editing

The most basic need of anyone working with sound is the capability to record it into the computer. This requires a microphone and a soundcard (built into most computers, although people buy dedicated soundcards for better quality) that digitizes the analog signal from the microphone. Once the sound is digitized it can be recorded onto the hard disk with a sound editor and represented graphically in various ways. The most popular FLOSS audio editor is Audacity [1]. It allows you to record sounds on multiple tracks, process them with effect plugins such as LADSPA [2] or VST, and export the sound in various well-known audio formats, such as WAV, Ogg Vorbis [3] (an open source compression format) or MP3.
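To make the idea of digitized sound concrete, here is a minimal Python sketch of what ends up on the hard disk: a stream of numbers measured at a fixed rate. It does not record from a microphone; as a stand-in it computes one second of a 440 Hz sine tone, quantizes each value to a signed 16-bit integer and stores the result as WAV data using Python's standard wave module. The function names and the chosen tone are assumptions made for illustration, not part of any tool mentioned in this article.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100  # samples per second (the CD-quality rate)


def sine_samples(freq_hz, seconds, amplitude=0.5):
    """Sample a sine wave and quantize each value to a signed 16-bit integer."""
    count = int(SAMPLE_RATE * seconds)
    return [
        int(round(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)))
        for i in range(count)
    ]


def write_wav(target, samples):
    """Write mono 16-bit samples as WAV data to a path or a file-like object."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))


samples = sine_samples(440.0, 1.0)
buffer = io.BytesIO()                  # in-memory stand-in for a file on disk
write_wav(buffer, samples)
```

Passing a filename such as "tone.wav" instead of the in-memory buffer would produce a file that can be opened in an audio editor.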

Spectral and waveform view in Audacity

Audio editors perform tasks that tape would have done in the pre-digital age, but they add the powers of analysis, graphical representation of the sound (very difficult in the pre-digital age), easy and repeated cutting and pasting, and high quality digital signal processing. Audacity goes beyond simple audio editing: it supports multi-tracking, and it has its own scripting language called Nyquist, which can be used to generate sound and to manipulate or generate MIDI data. Audacity exists for Linux, Mac OS and Windows.

Other editors include Snd [4] and WaveSurfer [5].

Sequencing

Sound sequencers are basically a digital implementation of the multi-track tape machines that were found in recording studios in the latter part of the 20th century. They layer sounds into tracks that play at the same time. You can either record directly into a track or import a sound from the hard disk of the computer. Two typical usage situations would be: a) a musician who records sounds in the field or downloads them from the net [see http://freesound.iua.upf.edu]. She manipulates them in an audio editor and then imports them into a sequencer in order to layer them and create a formal structure. b) a band with many instruments that creates a demo recording through a multichannel sound card. Each instrument is recorded live into its own audio track in the sequencer. Later the band processes and mixes the tracks before bouncing them down to a soundfile.

Screenshot of Ardour

Ardour [6] is the most impressive free and open source multi-track sequencer to be found at the moment. It compares to software such as Cubase, Pro Tools or Logic. It supports multi-track recording, audio processing (using native effects, LADSPA or VST), MIDI recording and manipulation, virtual instruments and post-production. It runs on Linux and Mac OS X.

Score Writing

This category overlaps with "Sequencing" but it might have a different user focus. Before computers were powerful enough to deal with real audio (16-bit samples at a 44,100 Hz sample rate) they were often used to create scores that would be played out through MIDI to hardware synthesizers or samplers. The score-writing tools could often switch between various representational modes such as the piano roll (where you would see a vertical piano keyboard on the left, with time represented horizontally) or the good old five-line stave. Obviously the piano roll was better suited to people without formal music education. The most popular software in the late 1980s would be Cubase on the Atari computer.
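The piano roll representation described above can be sketched in a few lines of Python. The sketch assumes a simplified note event of the form (MIDI pitch number, start step, length in steps) and the common naming convention in which MIDI pitch 60 is C4 (middle C); the note list and function names are made up for illustration and do not come from any of the programs discussed here.

```python
# Each note event: (MIDI pitch number, start step, length in steps).
NOTES = [(60, 0, 2), (64, 2, 2), (67, 4, 2), (72, 6, 2)]

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_name(pitch):
    """Name a MIDI pitch, assuming the convention that 60 is C4 (middle C)."""
    return "%s%d" % (NOTE_NAMES[pitch % 12], pitch // 12 - 1)


def piano_roll(notes):
    """Render notes as text rows: highest pitch on top, time from left to right."""
    total_steps = max(start + length for _, start, length in notes)
    rows = []
    for pitch in sorted({p for p, _, _ in notes}, reverse=True):
        cells = ["."] * total_steps
        for p, start, length in notes:
            if p == pitch:
                for step in range(start, start + length):
                    cells[step] = "#"
        rows.append("%-4s%s" % (note_name(pitch), "".join(cells)))
    return "\n".join(rows)


print(piano_roll(NOTES))
# C5  ......##
# G4  ....##..
# E4  ..##....
# C4  ##......
```

Pitch runs along the vertical axis and time along the horizontal one, so the rising arpeggio appears as a diagonal of blocks.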

Score writing software does not focus on recording audio in real time. It is aimed more at the composer who wants to compose music by arranging notes and then let the software play the score in order to hear the results. Rosegarden [7] is perhaps the best FLOSS audio software for writing scores. It has to be noted here that Ardour can also be used for arranging MIDI notes and that Rosegarden in turn records live audio; it's just that the focus of the two tools is different.

Rosegarden's matrix editor

Rosegarden's powerful notation editor

When a MIDI score has been made in Rosegarden, the ideal software to typeset and print it is called LilyPond [8]. It accepts various formats such as MIDI and MusicXML, but it also has its own text-based input language that can be written out from any programming language. LilyPond does a great job of producing beautiful and logical scores for musicians to play.

Virtual Instruments

Above we talked about score sequencer software for arranging notes and how it would send MIDI notes out to external hardware to generate the sound. In the late 1990s affordable computers became powerful enough to do real-time synthesis, and virtual instruments and effects were added to the flora of audio tools. Steinberg, the company behind Cubase (then the most popular sequencer), created the VST (Virtual Studio Technology) audio plugin architecture. The software development kit is open, which has resulted in thousands of developers continuously creating new effects and instruments. Apple has created its own open architecture called Audio Units (or AU). Other architectures include FreeST, which makes it possible to use VST plugins on Linux. The native Linux audio plugin architecture is called LADSPA [9] (Linux Audio Developer's Simple Plugin API) and it is supported by most of the software mentioned in this article.

Virtual instruments can be used in various setups. For example, you could plug your MIDI keyboard into the computer and use a host program to load the instruments and effects you want to use. You can create a chain of instruments and effects, typically choosing a sound, say a guitar, and then routing it through effects such as bandpass filters, reverb, delay or distortion. You could then use the host program to record what you play for later editing. Another usage would be to compose directly by writing notes, perhaps using Rosegarden, and then listen to what you write by playing the score through a virtual instrument.
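The chain of effects described above can be thought of as a list of functions, each taking a block of samples and handing its result to the next. The following Python sketch illustrates that routing idea only; it is not how LADSPA, VST or any host program is implemented, and the two toy effects (a hard-clipping distortion and a feedback delay) and their parameter values are assumptions chosen to keep the arithmetic easy to follow.

```python
def distortion(samples, threshold=0.5):
    """Hard-clip every sample to the range [-threshold, threshold]."""
    return [max(-threshold, min(threshold, s)) for s in samples]


def delay(samples, delay_samples=3, feedback=0.5):
    """Add an attenuated copy of the output from delay_samples steps earlier."""
    out = []
    for i, s in enumerate(samples):
        echo = out[i - delay_samples] if i >= delay_samples else 0.0
        out.append(s + feedback * echo)
    return out


def run_chain(samples, effects):
    """Route the signal through each effect in order, like a plugin chain."""
    for effect in effects:
        samples = effect(samples)
    return samples


dry = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # a single click
wet = run_chain(dry, [distortion, delay])
print(wet)  # [0.5, 0.0, 0.0, 0.25, 0.0, 0.0, 0.125]
```

The click is first clipped to 0.5 and then repeated every three samples at half the previous level. Swapping the order of the two effects in the list would change the result, which is why the order of a chain matters.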

Sound Feature Analysers

Musicians, artists and scientists often need to analyse the sound they are working with. They might want to look at the spectral qualities of the sound (the distribution of its frequencies) and change its harmonic structure by manipulating the sound’s partials. Praat [10] is a great application for this purpose. It can do spectral, formant, pitch, intensity and other types of analysis. It is particularly well suited for speech studies, and it provides neural network algorithms for learning as well as algorithms for speech synthesis.
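As a rough illustration of what spectral analysis means, the Python sketch below computes the magnitude spectrum of a test signal with a naive discrete Fourier transform and then finds its strongest frequency. This is a textbook method chosen for clarity, not the algorithm used by Praat or any other tool named here (real analysers use much faster transforms and windowing). The sample rate, signal length and test tones are assumptions picked so that both partials fall exactly on analysis bins.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive discrete Fourier transform: magnitudes for bins 0 .. N/2."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


SAMPLE_RATE = 8000   # samples per second (assumed for this example)
N = 400              # analysis length, giving bins spaced 8000 / 400 = 20 Hz apart

# Test signal: a 440 Hz partial plus a quieter partial one octave higher.
signal = [
    math.sin(2 * math.pi * 440 * i / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 880 * i / SAMPLE_RATE)
    for i in range(N)
]

spectrum = magnitude_spectrum(signal)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / N
print(peak_hz)  # 440.0
```

The second partial shows up as a smaller peak at 880 Hz (bin 44), with half the magnitude of the first.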

Screenshot of Praat

TAPESTREA [11] from the productive Princeton Sound Lab is another interesting and fun sound feature manipulator. Like Audacity it has scripting capabilities; it uses the ChucK programming language [20] for scripting.

Other applications include Sonic Visualiser [12], Baudline [13], RtFFT [14] and sndpeek [15].

Patchers - Algorithmic/Interactive/Generative

This category of “patchers” is where FLOSS blows away the commercial world in quality, ingenuity and experimentation. The computer does not merely have to imitate old technology from our physical world. It allows us to create our own tools according to our own ideas of how music should be or how tools should behave. For that purpose there are many different patchers out there: basically environments where you can create your own synthesis graphs, control structures and interfaces.

Historically the patchers originate from the Music N languages made by Max Mathews in the 1950s and 60s. The idea here is to create unit generators that generate the sound and then provide a programming environment to control them. This is ideal for sound synthesis and algorithmic composition. From the user interaction perspective, we can divide the patchers into two categories: graphical programming environments such as Pure Data [16] or jMax [17] and textual programming environments such as SuperCollider [18], Csound [19] and ChucK [20].
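The unit generator idea can be sketched with Python generators: each unit produces or transforms a stream of samples, and a patch is simply units plugged into one another. This is an illustration of the concept only and does not reflect how Pure Data, SuperCollider, Csound or ChucK are implemented; the unit names (sine_osc, line_env, multiply) and their parameters are hypothetical.

```python
import itertools
import math

SAMPLE_RATE = 44100  # samples per second


def sine_osc(freq_hz):
    """Unit generator: an endless stream of sine wave samples."""
    phase = 0.0
    step = 2 * math.pi * freq_hz / SAMPLE_RATE
    while True:
        yield math.sin(phase)
        phase += step


def line_env(start, end, n_samples):
    """Unit generator: a linear ramp from start to end that then holds end."""
    for i in range(n_samples):
        yield start + (end - start) * i / n_samples
    while True:
        yield end


def multiply(a, b):
    """Unit generator: the sample-by-sample product of two input streams."""
    for x, y in zip(a, b):
        yield x * y


# The patch: a 440 Hz oscillator shaped by a fade-out lasting 1000 samples.
patch = multiply(sine_osc(440.0), line_env(1.0, 0.0, 1000))
block = list(itertools.islice(patch, 2000))   # pull 2000 samples from the patch
```

A graphical patcher draws the same connections as boxes and cables, while a textual one writes them as nested expressions much like the last two lines.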

Screenshot of Pure Data

The patchers allow you to create your own programs, so you could create an algorithmic composition based on rules, a generative composition that plays music that's never the same, sound toys, interactive installations (using sensors, audio, video and motors), musical analysers, a thought platform for music theory, explorations of sound physics or psychoacoustics and so on... These programming languages are made for sound, but try to shy away from incorporating too much music theory. Therefore you won't find any 16-step sequencers or 12-tone keyboards. It's up to you (and not some software designer) to define the conceptual foundations of your music.
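A rule-based generative composition of the kind mentioned above can be very small. The Python sketch below assumes one deliberately simple rule invented for this example: start on the first note of a scale and move at most one scale degree per step. The scale (C major as MIDI pitch numbers) is likewise just one possible choice, since in these environments the musical foundations are yours to define. Because the steps are chosen at random, each run produces a different melody.

```python
import random

# Assumed pitch material: one octave of C major as MIDI pitch numbers.
C_MAJOR = [60, 62, 64, 65, 67, 69, 71, 72]


def generate_melody(length, rng):
    """Rule: start on the first scale note, then move at most one degree per step."""
    index = 0
    melody = [C_MAJOR[index]]
    for _ in range(length - 1):
        index += rng.choice([-1, 0, 1])                  # step down, stay, or step up
        index = max(0, min(len(C_MAJOR) - 1, index))     # stay inside the scale
        melody.append(C_MAJOR[index])
    return melody


melody = generate_melody(16, random.Random())
print(melody)
```

Passing a seeded generator, for example random.Random(1), makes a particular melody repeatable, which is handy when a generated result is worth keeping.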

What people prefer when choosing their environment varies. In open source software, the most important things to consider when choosing a platform to learn (apart from the sound quality and how readily you take to the environment) are the state of its documentation, its stability, its continuity and its community. A helpful mailing list community is characteristic of all the above-mentioned patchers: more experienced users help less experienced users find their way through the theoretical maze they may find themselves in.

People have different cognitive styles. For some, Pd is the ideal platform for composing their music or instruments, as the interface and the programming language are one and the same thing. It provides a graphical data-flow representation of the internal functionality of the computer. For others, textual programming languages like SuperCollider can be more efficient for what they want to achieve. The power here lies in writing classes, compact code, ease of use and a different type of representation. Each of these environments has its pros and cons, and it is only meaningful to compare them with a specific task in mind.

Screenshot of some SuperCollider code

Conclusion

All musicians and artists have their own agendas and goals, and it is impossible to tell which tools are suitable for each and every person. It is now up to you to download and install these environments and see where they take you. Read the tutorials and subscribe to the mailing lists. There are always people there to help you, and it is easy to unsubscribe again. And remember that people have put their free time into developing these dynamic, free and open source programs. Using them can therefore be an exciting experience in which you establish personal relationships with other users of the tools and with their developers. New ideas or discussions about the tool are always welcome. There are many good reasons to use free and open source software, but perhaps the most important one is the change of status the user experiences: from being a mere “customer” of a company or a “consumer” of software to being a fellow “user” or “co-developer” of it.

Tip: Want to try?

Would you like to try these programs without having to install all of them on your machine? Then you can run a "live CD": basically a Linux operating system that runs from a CD. Planet CCRMA [21], pure:dyne [22], Ubuntu Studio [23] and 64 Studio [24] all provide you with a Linux distro on a CD that can be run on your computer just by booting from the CD drive. This way you can explore and try out most of the software that we have covered in this article.

Notes

[1] http://audacity.sourceforge.net

[2] http://www.ladspa.org

[3] http://www.vorbis.com

[4] http://ccrma.stanford.edu/software/snd

[5] http://www.speech.kth.se/wavesurfer/index.html

[6] http://ardour.org

[7] http://www.rosegardenmusic.com

[8] http://lilypond.org

[9] http://www.ladspa.org

[10] http://www.fon.hum.uva.nl/praat

[11] http://taps.cs.princeton.edu

[12] http://www.sonicvisualiser.org

[13] http://www.baudline.com

[14] http://www.music.mcgill.ca/~gary/rtfft

[15] http://soundlab.cs.princeton.edu/software/sndpeek

[16] http://puredata.info

[17] http://freesoftware.ircam.fr/rubrique.php3?id_rubrique=14

[18] http://supercollider.sourceforge.net

[19] http://www.csounds.com

[20] http://chuck.cs.princeton.edu

[21] http://ccrma.stanford.edu/planetccrma/software

[22] https://devel.goto10.org/puredyne

[23] http://ubuntustudio.org

[24] http://64studio.com

Images

Rosegarden screenshots from http://www.rosegardenmusic.com

Pure Data screenshot courtesy of Frank Barknecht http://footils.org

All other images courtesy of the author.