3. Basics & concepts - ALSA and Ecasound
In this chapter I try to explain some concepts of ALSA - the Advanced Linux Sound Architecture - and ecasound - my favourite recording tool.
3.1 ALSA
The following subsections will deal with a few ALSA concepts. ALSA, the Advanced Linux Sound Architecture, is your soundcard driver. It should always be running, so you can hear any sounds from your card(s). Under Windows and other operating systems you have a different soundcard driver for every card. Usually you get your drivers together with your card. Under Linux there is only one driver for all supported cards. This is ALSA or - in the old days :-) - OSS. This has a lot of advantages:
- You only need to worry about one program :-)
- A lot of features can be added for all cards. So not every single programmer of a certain company needs to reinvent all these features.
- All programmers of applications using sound have only one API to use. Every program should be usable with every card, as long as the card fulfills certain requirements. That means: there's no fun in using a program that's supposed to record 16 tracks at a time with a simple one-channel soundcard. :-)
- There are a lot of people using the same driver. This generates a lot of good feedback and - if needed - good bugreports... So the software is always well-maintained.
- Last but not least: there's only one place you need to know where you can ask questions. This is the alsa-user or alsa-devel mailing-list.
3.1.1 The main devices
I'm always talking about devices. What do I mean by that?
A device - in ALSA terms - is a particular part of your soundcard, e.g. analog line-in, output, microphone-in, an amplifier... There are a lot of small items built into a soundcard. :-)
There are several types of devices, of which I only want to name and roughly describe a few. It would be too complicated to tell you about all of them. If you've got more questions or need anything specific, use the ALSA mailing-lists.
ALSA-PCM-Devices
A PCM-device is usually associated with an in- or output. To keep it easy, just imagine a PCM-device to be a jack on your card. Some place you can plug a cable into. For now this will be enough to understand.
ALSA-MIXER-Devices
I'm not really sure about them, but you don't go looking for them every day. As I understand it, they are only useful for configuring your soundcard. You do that usually with alsamixer or amixer, so don't worry about them.
ALSA-PLUG_Devices
Here it gets a bit trickier. The "devices" of type plug aren't real hardware devices. You can't open your computer, put your finger on the soundcard and say: "Here, that's my plug-device presto-in!" They are so-called "Virtual Devices". "What the devil is that?" A Virtual Device can look to you like a real input or output or whatever. But it's only a configuration. Imagine the following situation:
You've got a card with 10 input jacks. Every jack is a mono input. But you want to record in stereo or quadraphonic. So you can create a "Virtual Device" of type plug, which ties two (or four) of your real jacks together and builds a new device called "my_in". Then you can take your favourite program and record using the device "my_in".
ALSA offers you much more powerful and complex "device types" than a simple plug device. You ask: "How can I create such a device?" I must tell you that we're dealing with it later or never. :-) Sorry, I don't know how much I can figure out for myself. But you can always take a look at the official ALSA webpage or the ALSA Wiki. The latter deals with this topic in more detail. A clue: you need a .asoundrc-file.
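To give you a first taste anyway, here is a minimal sketch of such a .asoundrc entry. The card and device numbers are assumptions - check the output of "arecord -l" for the numbers on your system:

```
# ~/.asoundrc - a hypothetical plug device called "my_in"
pcm.my_in {
    type plug
    slave {
        pcm "hw:0,0"     # the real hardware device behind the plug
        channels 2       # treat two of the card's channels as one stereo input
    }
}
```

With this in place you could point your recording program at the device "my_in" instead of the raw hardware name.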
3.1.2 Alsamixer
So now that you have your ALSA installed and running, you'd like to adjust the volume. But how? One answer to this question can be "alsamixer".
Alsamixer is an ncurses-based mixer program. Here you can change the volume and the mute/unmute and playback/capture state of your soundcard's devices. Below I'll try to explain how to use alsamixer and what some of the most common devices are.
NOTE: Although alsamixer is ncurses-based it is usable with a braille-display. I've used it often enough to confirm this. :-)
General usage of alsamixer
How to call up the program:
alsamixer
This is the simplest - and most often appropriate - way of calling alsamixer. If you have more than one soundcard and want to configure the other card(s), type:
alsamixer -c 1
This will show the mixer settings for your SECOND soundcard (remember to count from 0!). With the additional option "-s" you can tell alsamixer to only show one item at a time. This gives you a better overview.
Alsamixer is navigated by simple keystrokes. Here are the most important:
- cursor-up - increase value (mostly volume)
- cursor-down - decrease value (mostly volume)
- escape - quit alsamixer
- cursor-left - move one item to the left
- cursor-right - move one item to the right
- space - toggle capture/playback
- m - mute/unmute channel
- F1 - get more help
These are the most important keystrokes you'll need.
Most important devices you'll PROBABLY see
A lot of devices are common on a lot of cards. If you have a simple (not multitrack) card, then you'll probably see these:
- Master - the name says it: it's THE master device. Set the volume of this device to get any sound at all. You MUST unmute it! If it is muted (signalled by "MM" on the bar), just press "m".
- PCM - This sets your general pcm-volume. This too MUST be unmuted and the volume needs to be adjusted!
- line - This usually (not always) is there for the line-input. Unmute it, set it to capture (with space) and adjust the volume. NOTE: On some cards this only controls the output of line-in. So if you keep the volume at 0, you'll still be able to record sound, but you WON'T hear it WHILE recording; at least not automatically.
- mic - your microphone. Unmute and raise the volume if you want to use it. Otherwise leave it muted! I noticed that with my old card, the mic-input produced a good deal of noise.
- mic boost - This is a mic-booster. It is an amplifier, as I understand it. You can use it, if your microphone needs a bit more volume.
- cd - If you've connected your cdrom-drive to your soundcard, you can use this device to listen to audio-cds with players like cdcd, jac or cd-console.
- capture - This is for "the other" soundcards, which use line-in only as a "monitor". Adjust this device and you can record from your line-in.
NOTE: These hints are only _VERY_ general. Mostly you'll have to adjust some more devices to record.
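If you'd rather make these adjustments from a script than from the ncurses interface, the same settings can be done with amixer, alsamixer's command-line sibling. A sketch - the control names are the common ones from the list above; your card may use different ones:

```
# unmute the master and pcm controls and give them some volume
amixer set Master 80% unmute
amixer set PCM 80% unmute

# unmute line-in and mark it as a capture source
amixer set Line 70% unmute cap

# the same for a second card: add "-c 1"
amixer -c 1 set Master 80% unmute
```

Run "amixer" without arguments to see which controls your card really offers.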
3.1.3 Plugins and .asoundrc
Now here we are: how to create ALSA plugins. You'll need to create a ".asoundrc" in your home-directory or an "asound.conf" in /etc. As discovered in chapter 2 there are already some predefined plugins, such as "default". As mentioned, "default" is set to use the first soundcard and the first device on that soundcard. Internally devices (in-/outputs) are numbered starting at 0. There is one other plugin I know: dmix. Dmix is a feature introduced in ALSA 0.9.1. With dmix you can do the following:
aplay -D plug:dmix 1.wav
Change console and while 1.wav is still playing type:
aplay -D plug:dmix 2.wav
You'll hear 1.wav and 2.wav playing at the same time.
Practical uses: you're listening to a long mp3-file on one console and rendering new mp3s on another. Then you'd like to hear if the newly created mp3s are ok. Usually you'd have to stop your long mp3 and then start the newly rendered file. Afterwards you'd have to restart your long mp3 and search for your position. If you start your mp3-player with the dmix plugin as output, you can just pause and take a short listen to your new file. Then unpause the long track and go on listening. No searching at all! Before dmix you needed a soundserver such as artsd or esd for this. Now you don't have to start another program and worry whether your mp3-player supports that server.
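In case "plug:dmix" doesn't already exist on your system, you can define a dmix device yourself. A minimal sketch - the device name "mixed", the ipc_key value and the card numbers are assumptions:

```
# ~/.asoundrc or /etc/asound.conf - a hypothetical dmix setup
pcm.mixed {
    type dmix
    ipc_key 1024          # any unique number; identifies the shared mixer
    slave {
        pcm "hw:0,0"      # the real hardware all the streams get mixed onto
    }
}
```

Then both players would open "plug:mixed" and their output gets mixed in software.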
How to create virtual devices yourself and how to use some of the plugins will be added at a later date. I'll need some good preparation for this. Please excuse this shortcoming at the moment!
3.2 Ecasound
Ecasound will be the main tool of this studio-guide. It can record, process and mix your songs. It enables you to create multitrack harddisk recording sessions, use LADSPA effects and the JACK Audio Connection Kit audio server.
3.2.1 Input/Output
Ecasound can use a lot of objects for input and output. You can e.g. play a file to your soundcard or convert it to another file-format (which is also JUST outputting the original). To listen to a file, you can use alsa, oss, jack, esd... I don't know how to work with all of these. But I'll tell you about alsa, jack and some file-formats at least.
If you know the difference between a .wav-file and an ogg-file and if you know what bit depth, channel-number and sampling-frequency are, just skip the following section.
general info about input/output
An audiofile - as you may or may not know - has a format. This format is made up of several small options. Those are:
- File type
- The type of a file. The type is usually determined by the extension of the file. So if you have files named hello.wav and goodbye.ogg, ecasound knows the first one is a wave (.wav) file and the second is an Ogg/Vorbis file by the .wav or .ogg. You can use the .wav file format, which is nice for recording or burning to cd. You can also use .ogg or .mp3; those are compressed formats. They "steal" a bit of your quality, but are WAY smaller than a wav-file. They are good for storing audio on your computer. For other formats see the manual page of ecasound and of libsndfile (if you installed it).
You could compare the filetype to a brand of car. You can buy a BMW or a Mercedes. If you pick the right models, they are in general the same, but the design looks different.
- bit-depth
- This determines how many bits a sample gets. Here the old saying "a lot helps a lot" is true, but 16 bits will in general suffice. 16 bit is the bit-depth of a cd (most of the simple soundcards can't do any better).
- channels
- This means mono, stereo, quadraphonic or whatever. Ecasound simply counts them (there are no specific names for special channel-numbers). So just put up your fingers and count. :-)
- Sampling rate (or sampling frequency)
- This number tells you how many samples (small bits of audio) are played per second. The usual value for this is 44100 Hz (or 44.1kHz). This number you probably know; it's the sampling rate of a normal cd (and again the best the simple cards can do).
Taking up the car example again you could say: the bit-depth, channel-number and sampling-frequency are "physical" factors. You can buy a truck, a van or a bus. They can use different kinds of petrol. Those factors are independent of the brand: Mercedes builds vans as well as BMW might. A .wav-file can have a 44100 Hz frequency just like a raw file or an ogg-file can.
So now you want to tell ecasound which format you'd like. Imagine you have a simple audio book on your computer. A voice sounds good even if it doesn't have cd-quality. So you could adjust the format a bit. Let's say to 16bit, mono and 22050 Hz. So to convert your file you could do:
ecasound -f:16,2,44100 -i good_file.wav -f:16,1,22050 -o bad_file.wav
In this example we converted a cd-quality file (16 bits, stereo, 44100 Hz sampling rate) to a file with 16 bits, 1 channel (mono) and 22050 Hz (half the cd-frequency). To specify those "physical" parts of a format, you use the -f option. It is rather simple once you've understood what those bits, channels and frequencies are... :-) The syntax is:
-f:bits,channels,frequency
Now that you've got this info, we can continue with ecasound's different inputs and outputs.
Ecasound's inputs and outputs
Here is a list of a few input and output types ecasound supports:
- file input/output
- Reading or writing simple files, is one thing you'll do very often. In general it is very simple:
ecasound -i my_file.wav -o your_file.cdr
This will read my_file.wav and write your_file.cdr. Doing that, you just convert the file format from .wav to .cdr.
NOTE: Don't get confused by the extension .cdr. It is not - as I expected - 16 bit (little endian), stereo, 44100 Hz, but 16 bit (BIG endian), stereo, 44100 Hz. .cdr-files also will be adjusted to have the correct length for burning to cd. But the cd won't usually play very well. The usual format for burning something directly to cd would be: -f:s16_le,2,44100 (the default on all linux-systems I know).
- OSS - open sound system
- OSS is a soundcard-driver. You can use OSS to record from or play to your soundcard(s). You do it like that:
ecasound -i /dev/dsp -o my_file.wav
this will record from your first soundcard to a file named my_file.wav. To play back:
ecasound -i my_file.wav -o /dev/dsp1
This will play back my_file.wav to your second available output (use /dev/dsp or /dev/dsp0 for your first output).
- NULL - the NULL I/O
- NULL is a way to record silence or play a sound into the great nothing. It's mostly useful for testing or evaluating something. It works like that:
ecasound -i null -o file.wav
OR:
ecasound -i file.wav -o null
There is also an rtnull (real-time null) device. But I don't know if it still exists or how it works. This rtnull device simulated a soundcard that did: NOTHING. :-)
- ALSA - Advanced Linux Sound Architecture
- ALSA is - I think - the modern way of communicating with your soundcard. It is a soundcard driver (like OSS). But it has a lot of features, that OSS doesn't have. If you want to work with all those modern audio-tools, you better use it.
NOTE: ALSA can simulate OSS.
To work with ALSA try this:
ecasound -i alsa,default -o my_file.wav
this will record from your first soundcard to the file my_file.wav. Why? ALSA has named PCM-devices, as discussed in the previous section. There is always a device called "default". This is the same as /dev/dsp with OSS. You could also write:
ecasound -i alsahw,0,0 -o my_file.wav
this would usually have the same effect. Now you've already seen two ways of talking to ALSA: alsa,name or alsahw,card_number,device_number[,subdevice_number]. The subdevice_number is optional. The easiest way however is to use alsa,name. Just using alsa,name usually has the nice side-effect that you can avoid conversion problems.
- JACK - Jack Audio Connection Kit (jack_alsa object)
- Jack is an audio server. It offers you the possibility to connect programs with virtual cables. So you can connect two ecasounds, or ecasound and a software synth, without plugging in any REAL cable. A basic way of using jack input/output is:
ecasound -i jack_alsa,in -o my_file.wav
This will connect ecasound to the first two soundcard inputs (the jack ports alsa_capture_1 and alsa_capture_2). The same works for output:
ecasound -i my_file.wav -o jack_alsa,out
- JACK (jack_auto object)
- This is the next - very fine - way to communicate with jack. Imagine you have fluidsynth - a software synth - running and want to connect ecasound to it, so you can record the synthesizer's sounds. Then you could do:
ecasound -i jack_auto,fluidsynth -o my_file.wav
this would create jack_ports for ecasound and then take the "invisible" cables and plug them into fluidsynth and ecasound. So you can just start playing and it will be recorded. The same works for output.
- JACK (jack object)
- This is for those of you, who are a bit more advanced. You can type:
ecasound -i jack -o my_file.wav
This will create jack-ports for ecasound, but they won't be connected. It's like noticing that your soundcard has inputs, but no cables are plugged in. You have to do the "in-plugging" manually with jack_connect.
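That manual "in-plugging" could look roughly like this. The port names below are assumptions - they vary between jack versions, so run jack_lsp first to see what your system really calls them:

```
# list all jack ports currently available
jack_lsp

# plug the soundcard's capture ports into ecasound's unconnected inputs
jack_connect alsa_pcm:capture_1 ecasound:in_1
jack_connect alsa_pcm:capture_2 ecasound:in_2
```

jack_disconnect works the same way if you want to pull a "cable" out again.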
- JACK (jack_generic object)
- This is more advanced still. You can type something like that:
ecasound -i jack_generic,my_in -o my_file.wav
This will create the inputs ecasound:my_in_1 and ecasound:my_in_2. Usually ecasound ports are simply named in_1 or out_1. With the jack_generic object you can set your own name-prefix.
- loop devices
- A loop device is a fairly advanced object; it is used to route audio between different chains. Let's say you have two mp3-files, 1.mp3 and 2.mp3, and you want to play them back simultaneously. This would be easy, but: now you want both mp3s to get the same effect (e.g. a lowpass filter). So you can do it with loops:
ecasound -a:1 -i 1.mp3 -o loop,1 -a:2 -i 2.mp3 -o loop,1 -a:3 -i loop,1 -o alsa,default -efl:800
What is happening: we have two chains (1 and 2). Those take the mp3-files (1.mp3 and 2.mp3) as input. Their output is a loop device with the number 1 (loop,1). This number - the loop_id - is necessary to know which loop device you're referring to. Then there is the chain (3), which takes loop,1 as input and outputs to alsa,default. It also adds a filter (-efl:800). What will be heard: you'll hear both mp3s. They'll sound a bit like lofi.
- .ewf-files
- .ewf-files are special to ecasound. We'll see what they are a bit further on. They are simple to use:
ecasound -i my.ewf -o alsa,default
The same works for output.
For more VERY good information and a nice hands-on tutorial see the ecasound webpage. Click on documentation and choose. A good place to start is always the examples. Other good places for detailed information are the "Ecasound user's guide" and the manpages.
3.2.2 What is a chain
From the ecasound user's guide:
"Chain is the central signal flow abstraction. In many ways chains are similar to audio cables. You have one input and one output to which you can connect audio producers and consumers (like guitar and amplifier for instance)."
You've already seen chains without knowing you did. We always said:
ecasound -i some_input.wav -o some_output.wav
Ecasound creates a default chain for this. To explicitly specify a chain you do:
ecasound -a:my_chain -i input -o output
This specifies the chain "my_chain". You can use names or numbers for them. But as mentioned, a chain is more than a simple audio-cable. A fully-fledged chain can look like this:
input -> chain-operators -> output
What is a chain-operator
A chain operator is usually an effect (like chorus, a filter, delay...). How does this look on a commandline? Imagine you are singing into your microphone. Now you want to add a delay (echo) to your voice. You could use ecasound's built-in delay effect (-etd). The syntax is: -etd:delay_time_in_msecs,surround_mode,number_of_delays,mix-%. A nice adjustment would be: -etd:125,2,2,50. 125 milliseconds delay, surround-mode 2 (left-right stereo-delay), 2 delays and 50 percent mix. So now you can try the following:
ecasound -a:my_chain -i alsa,microphone -o my_file.wav -etd:125,2,2,50
When you listen to the recording, you'll see what it did. :-)
Ecasound comes with a few effects of its own; you can read about those in the manpage (man ecasound). Another great feature of ecasound's is that it can use LADSPA effects. LADSPA is a "simple" system for writing effects - simple if you know a little about effects and are familiar with C. But there are many effects already and the collection is growing. Take a look at Steve Harris' SWH plugins or at the linux-sound page. Ecasound offers two ways of using LADSPA. One is by name:
ecasound -i input.wav -o output.wav -el:delay,0.2
If there is a LADSPA-effect named "delay" and if it has one parameter (delay-time in seconds), then this commandline would add a delay of 0.2 seconds to your input. The other way is to use LADSPA-effect by their unique ID:
ecasound -i input.wav -o output.wav -eli:1202,12,22050
If I remember correctly, effect number 1202 is a decimator. It simulates bad audio-quality. You can get a kind of lofi atmosphere. So I used the decimator with 12 (12bit depth) and 22050 (samplingrate).
NOTE: I think this is easier than -el. Names can be hideous! Numbers have no capital letters :-). Also: usually the unique LADSPA-ids are directly added to the names of the .so-files in which they are stored. So this is no problem.
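To find out which LADSPA plugins you have installed - with their names, unique IDs and parameters - the ladspa-sdk ships two small helper programs. A sketch; the file name in the second command is an assumption, your plugin directory and file names may differ:

```
# list every plugin in every library found on the LADSPA_PATH
listplugins

# show the names, unique IDs and parameters of one plugin library
analyseplugin /usr/lib/ladspa/decimator_1202.so
```

The output of analyseplugin tells you exactly what to put after -el or -eli.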
More about LADSPA should follow in chapter 4 or so. :-)
3.2.3 The .ewf-format
The .ewf file format is just a small text-based wrapper.
NOTE: .ewf-files DON'T contain audio. So DON'T delete your audio-files after creating a .ewf-file with a line like:
ecasound -i my.wav -o my.ewf
This is FATAL!!! (Trust me: I did it once :-))
So what can you do with .ewf-files? - Let's take an example. Get one audio-file - i.e. my.wav - and then use some editor to create a file my.ewf like this:
source = my.wav
length = 2
If you've done that, try:
ecasound -i my.ewf
This should play two seconds of my.wav. Now let's take a closer look and don't worry, it's simple!
.ewf files are mostly used to save diskspace. You can place a snippet of audio in the middle of some long recording or loop it as long as you wish (could be nice for drumloops). There are not many keywords to use:
- source = <filename>
- Specify the filename of the audio-object you want to use (a wave file, mp3, ogg... Whatever)
- length = <time in seconds>
- Specify how many seconds of the file should be played
- offset = <time in seconds>
- How many seconds should go by before the file starts playing. To clarify this matter, imagine this: you've got a recording with a long piano intro and then yourself singing. But you recorded your singing so that your voice starts right at the beginning of the file my_singing.wav. So now you want to piece them together, and you have to place your singing at - say two minutes - into the piano play. So you'd do: offset = 120.0. This would cause your voice to start after 120 seconds. Clear?
- start-position = <time in seconds>
- Start position in the file you want to play. Imagine: You started singing and recognise afterwards, that you don't like the first 20 seconds of it. Then you can just say:
start-position = 20
and the unwanted 20 seconds won't be played.
- loop = <true OR false>
- Determines whether the audio source is looped or not. Well, you would mostly only use this if you wish the source to be looped anyway. :-)
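Putting all the keywords together, a complete .ewf-file might look like this. The file name and all the times are made up for illustration:

```
source = my_singing.wav
offset = 120.0
start-position = 20
length = 45
loop = false
```

This would drop 45 seconds of my_singing.wav into the piece, starting two minutes in and skipping the first 20 seconds of the wav itself.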
Well, that's mostly it. Now you can take some wave-file or so and play around. For me this is really mostly useful to paste in a sample at a certain time or to concatenate parts of a long piece. So I can create .ewf-files for part2 and part3 saying something like that:
part2.ewf
source = p2.wav
offset = 62.3
Which means that part1 is 62.3 seconds long.
part3.ewf
source = p3.wav
offset = 212.6
There we go! We connected three parts into one piece.
TIP: You can either use ecasound's interactive mode (-c commandline option) and then use the command "getpos" to find exact locations, or - if it is not important to get a perfect flow - use the "ecalength" program to get the length of your audiofile as it is. Still make sure the whole volume is good!
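Finding the offsets for the example above could then be a matter of two quick commands. A sketch, assuming the files from the part2/part3 example; the exact output format of ecalength may differ between versions:

```
# print the length of each part
ecalength p1.wav
ecalength p2.wav

# part2's offset = length of p1
# part3's offset = length of p1 + length of p2
```

With those two numbers you can fill in the offset lines of part2.ewf and part3.ewf directly.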
3.2.4 Oscillators
What is an oscillator in general? - An oscillator is an object that periodically moves between certain points. Take the simplest oscillator I can think of: a "sawtooth oscillator". If you attached it to a screen, this oscillator would generate a zigzagging line, like the blade of a simple saw.
What can oscillators be used for? - They can be used to:
- Generate basic sounds - this is done by every subtractive, additive and FM synthesizer.
- Modulate an effect - a tremolo or an auto-wah for instance works like this. With the tremolo, the volume rises and drops as the curve of the oscillator rises and drops.
In ecasound the second of these alternatives is used. You can use an oscillator to move values of a controller up and down. But this is all very abstract. To get a basic idea of what we can do with it, here's an example:
ecasound -c -i my_guitar_sample.wav -o jack_alsa -ea:100 -kos:1,50,100,1,0.5
The above example does the following:
- Read my_guitar_sample.wav
- Apply an amplifier (-ea)
- Modulate it with a sine-oscillator to create a tremolo effect
- Output the result to the soundcard using jack
About the parameters of oscillators. What are the parameters?
- fx-parameter - parameter of the effect you want to modify - in this case this is the first parameter of the -ea effect.
- start-value - lowest value of the oscillator - in this case the lowest volume should be 50.
- end-value - highest value of the oscillator - in this case volume should "oscillate" between 50 (lowest) and 100 (highest).
- frequency of the oscillator - how many times per second the oscillator should move around (from lowest to highest). For effect-modulation put this between 0.1 and max. 30 Hz. If you turn this value higher, you'll get the effect of a new constant tone building up.
- i-phase - initial phase of the oscillator. I assume every one of you has seen a drawing of a sine curve in some school lesson. Now: where do you want to start - at zero, where the value is also zero, or at 0.5*PI, where the sine has reached its peak?
All the oscillators and controllers (-ksomething options) have the first three parameters in common. All oscillators (as far as I know) have the fourth parameter in common, too. This means their first four parameters are: the fx-parameter you want to change, start-value, end-value and frequency of the oscillator.
Where can it be helpful or interesting to apply oscillators:
- With volume (-ea) to create tremolo
- With lowpass-filters (e.g. -ef3) to create auto-wah
- With panning (-epp) to let the sound swing around a bit
- With pitch-shifting (-ei) to create a wobbling tape (apply it slow and soft: 0.2Hz and 97-103 as start/end :-)
Of course you can try a lot of other weird effects and hear what comes of it. This may be a very psychedelic experience.
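As one more idea from the list above, an auto-wah could be sketched like this. The filter values are starting-point guesses, not gospel - play with them:

```
ecasound -i my_guitar_sample.wav -o alsa,default -ef3:800,1.5,0.9 -kos:1,200,2000,0.4,0
```

Here -ef3 is ecasound's resonant lowpass filter (cutoff frequency, resonance, gain) and -kos sweeps its first parameter - the cutoff - between 200 and 2000 Hz, 0.4 times per second, starting at phase 0.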
Last modified : Feb 12