 |
::| SUBSYNTH
low-level audio synthesis subsystem: a modular analog synthesizer in software
|
: oss.notes :
Notes from the OSS programming guide found at:
http://www.opensound.com/pguide/oss.pdf
OSS (formerly called VoxWare)
- c api
- device driver for sound cards and other sound devices under unix
- derived from original linux sound driver
- runs on more than a dozen OSs, supporting many sound cards/devices.
sound cards
- have several devices or ports that produce or record sound
- digitized voice device /dev/dsp
- codec
- pcm
- dsp
- adc/dac
- mixer device
- handles I/O volume levels.
- synthesizer device
- plays music
- sound effects
- 2 kinds typically
- Yamaha FM synth chip
- OPL-2, two operator
- OPL-3, four operator
- wave table
- plays back pre-recorded samples
- very realistic
- MIDI interface
- standard
- serial interface
- 31.25 kbps
- designed to work with on-stage equipment
- synths
- keyboards
- stage props
- lighting controllers
- communicate through a MIDI cable
- joystick port
- not controlled by OSS
- CD-ROM interface
- not controlled by OSS
programming OSS
- c headers: in /usr/lib/oss/include (compile with -I/usr/lib/oss/include)
- OSS provides/supports these devices:
- /dev/mixer
- access to built-in mixer circuit on sound card
- adjust playback and recording levels for sound channels
- linein
- master
- cd
- wav
- synth
- mic
- supports several mixers on the same system
- named /dev/mixer0, /dev/mixer1
- /dev/mixer is just a sym link to one of them (usually /dev/mixer0)
- /dev/sndstat
- for diagnostic purposes
- human readable data
- info for all ports and devices detected by OSS
- cat /dev/sndstat
- /dev/dsp and /dev/audio
- main device files for digitized voice apps
- data written here is played with the
DAC/PCM/DSP device on the sound card
- data read here comes from the input source (default microphone)
- difference?
- /dev/audio is logarithmic (mu-law), provided for SunOS compatibility
- /dev/dsp is 8-bit unsigned linear
- use ioctl to select encoding method (affects both)
- several are supported:
- /dev/dsp0, /dev/dsp1, ...
- /dev/dsp is a symlink to one of them.
- /dev/audio works similarly
- /dev/sequencer
- electronic music
- sound effects (i.e. in games)
- access synth located on sound cards, or external music synth
connected to MIDI port.
- allows control of up to 15 synth chips and up to 16 MIDI ports
- allows more precise control than /dev/music
- /dev/music
- similar to /dev/sequencer
- easier than /dev/sequencer
- handles synth and MIDI devices in the same way
- enables device-independent programming
- based purely on MIDI protocols
- /dev/midi
- low level interface to MIDI bus ports
- similar to TTY (character terminal) devices in raw mode
- not intended for realtime use (timing not guaranteed)
- everything sent to MIDI port immediately
- useful for MIDI SysEx and sample librarian applications (data dumps)
- /dev/midi00, /dev/midi01, etc.
- /dev/midi is a sym link to one of the others.
- /dev/dmfm
- low level register access to FM chip
- /dev/dmmidi
- raw interface to MIDI devices
- direct TTY-like access to the MIDI bus
- for special applications
- portability considerations
- gui
- endian issues
- when system and audio device differ
- e.g. a big endian RISC system with a little endian PCI card
- OSS hides device specific features behind its API
- the API is based on universal properties of sound and music, not hardware
- design your application so the user can select the /dev/ file
- default /dev/dsp0 file might be broken
- /dev/dsp1 might be preferred by the user
- don't assume undocumented values
- need to query default values
- e.g. the /dev/sequencer timer is usually 100hz, but can be 60 or 1024hz
- don't open a device twice
- not all sound cards have a mixer... :)
- older cards
- unsupported cards
- some high end digital only cards
- not all mixers have a master volume control
- your app should query the available channels from the driver
- don't use a device before checking that it exists.
- mixer programming
- check if it exists: ioctl will fail and set errno to ENXIO if no mixer
- based on channels.
- each channel represents the physical slider
- values vary between 0 (off) and 100 (max vol)
- most channels are stereo (set them separately for balance)
- usually only one input source is available for recording
- query the capabilities with ioctl (see the sketch below)
- see what channels are actually present
- channels vary from card to card
- the API defines up to 30 channel names; few cards implement them all
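a minimal sketch of the above in c (a sketch, not lifted from the guide; assumes
<sys/soundcard.h>; the 75% pcm level is illustrative):

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <sys/soundcard.h>

    int main(void)
    {
        int fd = open("/dev/mixer", O_RDWR);
        if (fd == -1) { perror("/dev/mixer"); return 1; }

        int devmask = 0;
        /* which channels actually exist on this card?
           fails with errno == ENXIO when no mixer is present */
        if (ioctl(fd, SOUND_MIXER_READ_DEVMASK, &devmask) == -1) {
            perror("SOUND_MIXER_READ_DEVMASK");
            close(fd);
            return 1;
        }

        if (devmask & (1 << SOUND_MIXER_PCM)) {
            /* levels run 0..100; left in the low byte, right in the next */
            int level = 75 | (75 << 8);
            if (ioctl(fd, SOUND_MIXER_WRITE_PCM, &level) == -1)
                perror("SOUND_MIXER_WRITE_PCM");
        }

        close(fd);
        return 0;
    }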
- general sound background info
- Wavetable Synthesis
- a method of sound generation that uses digital sound samples stored
in memory
- FM synthesis
- frequency modulation synthesis
- procedural method of sound generation
- uses wave generators/modulators sometimes in combination to produce
sound.
- small memory footprint
- operator
- waveform oscillator used to produce sound in FM synthesis
- more operators produce more realistic sounds
- voice
- an independent sound generator
- sequencer
- device (hardware or software) which controls (sequences) the playing
of notes on a music synthesizer
- patch
- the device settings for a sound generator (i.e. piano)
- dB (decibel)
- unit to measure volume of sound. the scale is logarithmic since the
human ear has logarithmic response
- audio programming
- sound is stored as a sequence of samples taken from an
audio signal at constant time intervals.
- each sample represents the volume of the signal at the moment
it was measured
- each sample requires one or more bytes of storage.
- the number of bytes in a sample depends on channels (mono/stereo) and format
(8 or 16 bits)
- the time interval between each sample gives the sampling rate
- expressed in samples per second (hertz)
- common: 8khz (telephone) to 48khz (DAT tape) to 96khz (DVD audio)
- hardware
- ADC (analog to digital converter)
- DAC (Digital to analog converter)
- codec (contains both ADC and DAC)
- referred to as a "dsp"
- fundamental parameters that affect sound quality
- samp rate
- expressed in samples per second or hertz
- limits the highest frequency that can be stored (Nyquist Frequency)
- Nyquist's Sampling Theorem states:
- highest freq that can be reproduced is at most half of the samp freq
- e.g. if samp rate == 44khz, then the highest freq is 22khz
- format / samp size
- expressed in bits
- affects the dynamic range
- e.g. 8-bit gives a range of 48dB, 16-bit gives 96dB (~6dB per bit)
- in practice
- record and play using the standard system calls
- open
- close
- read
- write
- default params are usually very low (speech quality)
- change device parameters with ioctl
- all dev files support read/write, but some devices can't record
- most devices can work in half duplex mode (O_RDONLY or O_WRONLY)
- record and play but not at the same time
- some devices work in full duplex (O_RDWR)
- simplest way to record audio is to use UNIX commands
- cat /dev/dsp > recorded.file.raw
- devices are exclusive; if in use, open() fails with EBUSY
- include:
- <fcntl.h>
- <unistd.h>
- <sys/ioctl.h>
- <sys/soundcard.h>
- mandatory data:
- int audio_filedescriptor;
- unsigned char audio_buffer[BUF_SIZE];
- for real time performance, keep the buffer short. 1024 - 4096 is
a good range. Choose a value that is a multiple of your sample size.
- samp_size = channels * sample format size in bytes
- three parameters are needed to describe sampled audio data.
- sample format (num of bits)
- num of channels (mono, stereo)
- samp rate (speed, sampling frequency)
- for stereo data
- two samples for each time slot.
- left channel sample is always stored before the right channel sample
- this extends to more than 2 channels
- the device must be reset before setting them: ioctl(fd, SNDCTL_DSP_RESET, 0);
see the sketch below
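a minimal sketch of the parameter setup in c (the format/rate values and
BUF_SIZE are illustrative; error paths trimmed):

    #include <stdio.h>
    #include <string.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <sys/soundcard.h>

    #define BUF_SIZE 4096

    int main(void)
    {
        unsigned char audio_buffer[BUF_SIZE];
        int fd = open("/dev/dsp", O_WRONLY);
        if (fd == -1) { perror("/dev/dsp"); return 1; }

        int fmt = AFMT_S16_LE;   /* sample format: 16-bit signed little endian */
        int channels = 2;        /* stereo: left sample stored before right */
        int speed = 44100;       /* sampling rate in hz */

        ioctl(fd, SNDCTL_DSP_RESET, 0);   /* reset before setting parameters */

        /* each ioctl writes back the value the driver actually accepted */
        if (ioctl(fd, SNDCTL_DSP_SETFMT, &fmt) == -1 || fmt != AFMT_S16_LE)
            fprintf(stderr, "16-bit format not supported (got %d)\n", fmt);
        if (ioctl(fd, SNDCTL_DSP_CHANNELS, &channels) == -1)
            perror("SNDCTL_DSP_CHANNELS");
        if (ioctl(fd, SNDCTL_DSP_SPEED, &speed) == -1)
            perror("SNDCTL_DSP_SPEED");

        /* write silence; a real app fills the buffer with rendered samples */
        memset(audio_buffer, 0, sizeof audio_buffer);
        if (write(fd, audio_buffer, sizeof audio_buffer) == -1)
            perror("write");

        close(fd);
        return 0;
    }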
- OSS programming:
- OSS supports multichannel audio (more than 2 channels)
- professional multichannel audio devices
- 16 or more mono channels (8 stereo pairs)
- how to encode multiple channels:
- interleaved (like with stereo sound data)
- multiple /dev/dsp devices
- mixed (dsp1 in 2-channel mode, dsp2 in mono, dsp3 in quad, etc...)
- to access the mixer
- don't access it directly through /dev/mixer
- can't guarantee that it is associated with the /dev/dsp you need
- instead access the mixer settings through the /dev/dsp's own ioctls.
- for quality sound (avoid clicking)
- when using a single audio buffer, you cannot write to it while it
is being read; writing anyway produces a click.
- the click is your audio hardware starving.
- forces your application to do only one thing at a time to minimize popping
- to fix, use the double buffering method (see the sketch below)
- 2 buffers:
- one is being read while...
- ...the other is being written to
- then swap; repeat for as long as the device is in use.
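a conceptual sketch of the swap loop in c (render() is a hypothetical
stand-in for your synthesis code; here it just writes silence):

    #include <string.h>
    #include <unistd.h>

    #define BUF_SIZE 4096

    static void render(unsigned char *buf, int len)
    {
        memset(buf, 0, len);   /* real code generates samples here */
    }

    static void play_loop(int fd)
    {
        static unsigned char bufA[BUF_SIZE], bufB[BUF_SIZE];
        unsigned char *front = bufA, *back = bufB;

        render(back, BUF_SIZE);              /* prime the first block */
        for (;;) {
            unsigned char *tmp = front;      /* swap the two roles */
            front = back;
            back = tmp;
            if (write(fd, front, BUF_SIZE) == -1)  /* device drains this... */
                break;
            render(back, BUF_SIZE);          /* ...while we fill the other */
        }
    }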
- buffer overruns
- too much data (results in data loss)
- buffer underrun
- not enough data
- improving latency (for generated sound with realtime requirements)
- use smaller buffer (fragment) size to improve latency
- useful to sync sound with the graphics.
- use select (see the sketch below)
- if writing an effect processor (read/write simultaneously)
- use full duplex,
- or use two devices together...
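a sketch of the fragment-size/select approach (the 8 x 512-byte fragment
request is an illustrative value, not a recommendation):

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <sys/select.h>
    #include <sys/soundcard.h>

    int main(void)
    {
        unsigned char buf[512] = { 0 };
        int fd = open("/dev/dsp", O_WRONLY);
        if (fd == -1) { perror("/dev/dsp"); return 1; }

        /* low 16 bits: log2 of fragment size, high 16 bits: fragment count.
           here: 8 fragments of 2^9 = 512 bytes. set this right after open. */
        int frag = (8 << 16) | 9;
        if (ioctl(fd, SNDCTL_DSP_SETFRAGMENT, &frag) == -1)
            perror("SNDCTL_DSP_SETFRAGMENT");

        for (;;) {
            fd_set writers;
            FD_ZERO(&writers);
            FD_SET(fd, &writers);
            /* sleep until the driver can take more data without blocking */
            if (select(fd + 1, NULL, &writers, NULL, NULL) > 0)
                if (write(fd, buf, sizeof buf) == -1)  /* render here */
                    break;
        }
        close(fd);
        return 0;
    }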
- MIDI
- what is it
- Musical Instrument Digital Interface
- communication protocol
- hardware level interface
- communication protocol between devices using the MIDI hardware
level interface.
- doesn't produce the audio; it controls some kind of external synthesizer
which performs the sound generation
- hardware details
- interface
- asynchronous
- serial
- byte-oriented
- similar to RS-232 (serial port)
- transfer rate is 31250 bits per second
- MIDI cables
- connect devices
- 5-pin DIN connector at each end
- single direction (data flows one way per cable)
- MIDI ports (called MPU401 device - developed by Roland corp.)
- there are dedicated (professional) MIDI only cards, without audio
- dumb serial port
- no sound generation capabilities
- used to connect to an external MIDI device using MIDI cabling
- MIDI external device
- MIDI keyboard
- rack mounted tone generator w/out keyboard
- audio mixer
- flamethrower, washing machine, etc...
- connection
- possible to have an unlimited number of devices
- through daisy chaining
- MIDI multiplexers (like a Y-cable, but more expensive)
- one command on the MIDI cable may be processed by an unlimited number
of devices. each of them can react to the command as it wishes.
- protocol
- MIDI simply sends and receives bytes
- you don't care what device is attached, you just see a MIDI port
- it is even possible that there is nothing connected to the port.
- each port has 16 possible channels
- no sound transmitted, only control messages
- instrument change messages
- trigger, release (key was pressed)
- note number (which key)
- velocity (how hard)
- what is a synthesizer?
- a synth is a tone generator.
- hardware tone generation/mixing
- internal to a computer
- mounted directly to sound card or motherboard.
- Yamaha OPL2/3 FM synth
- OPL2 was used in the AdLib (late 80's)
- OPL3 was used in the Sound Blaster Pro
- OPL4 is FM combined with wave table (OPL3 compatible)
- Gravis UltraSound (GUS)
- first wave table sound card on the market
- 32 simultaneous voices
- wave samples stored in a table in card's internal RAM
- 512k to 8MB
- limited memory, so the application needed to manage patch loading
and caching.
- de facto API supported by many wave table device drivers
- Emu8000
- the chip in the SoundBlaster 32/64/AWE cards.
- similar to GUS, but provides the GM patch set in ROM
- patch load/cache is not necessary (but possible)
- or external
- Roland Sound Canvas
- etc...
- a sound chip can have capabilities beyond what MIDI defines
- but caps are unavailable if you're only using MIDI to control it
- software tone generation/mixing
- www.fruityloops.com
- www.propellerheads.se
- SoftOSS
- software-based wavetable engine by 4Front Technologies.
- does mixing in software
- can use any audio card (without wave table) to play MIDI with wave
table quality.
- a synth is usually what is connected to a MIDI port device
- this allows a standard interface to the synth
- allows portable code, etc...
- MIDI file format
- file extension .mid
- contain only control data, no instrument data (unlike MOD format)
- instrument timbres are assigned by the playing system.
- MIDI in OSS
- use /dev/midi, or /dev/midi00, /dev/midi01, /dev/midi02
- data is sent ASAP (midi data is queued)
- queue can hold about 60 bytes
- you can use Midilib which comes with OSS to read MIDI files.
- uses the MIDI 1.0 spec
- follows General MIDI (GM) and Yamaha XG specifications
- see also www.midi.org, and www.ysba.com
- MIDI
- is a highly real time process
- timing errors can be very noticeable to an experienced listener
- /dev/music and /dev/sequencer can hold enough data for several seconds.
- simply need to write more data before the queue becomes empty.
- queuing is the central part of the API
- queue for playback and for recording
- playback happens in the background (async)
- queue is non-blocking (application never waits)
- queue is organized as a stream of events
- events are records of 4 or 8 bytes
- command code
- parameter data
- two main types of event:
- timing
- time stamp
- included before input events
- used to delay playback
- input
- e.g. play a note, volume change, etc...
- instantaneous
- when no timing events are present, the hardware tries to play all inputs
as fast as possible (simultaneously)
- events are processed in the order written to the device (FIFO queue)
- for realtime events, you can send an immediate event ahead of the queue.
- good for playing real time events (live performance, etc.); see the sketch below
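a sketch of queued playback through /dev/music using the event macros from
<sys/soundcard.h> (device 0 and the tick counts are illustrative; a real
app queries the available synth devices first):

    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/soundcard.h>

    SEQ_DEFINEBUF(1024);     /* local event buffer required by the macros */
    int seqfd;

    void seqbuf_dump(void)   /* the macros call this when the buffer fills */
    {
        if (_seqbufptr && write(seqfd, _seqbuf, _seqbufptr) == -1) {
            perror("write /dev/music");
            exit(1);
        }
        _seqbufptr = 0;
    }

    int main(void)
    {
        seqfd = open("/dev/music", O_WRONLY);
        if (seqfd == -1) { perror("/dev/music"); return 1; }

        SEQ_START_TIMER();
        SEQ_START_NOTE(0, 0, 60, 64);   /* dev 0, chn 0, middle C, vel 64 */
        SEQ_DELTA_TIME(96);             /* timing event: delay the next input */
        SEQ_STOP_NOTE(0, 0, 60, 64);
        SEQ_DUMPBUF();                  /* flush the queue to the driver */

        sleep(2);   /* playback continues in the background (async) */
        close(seqfd);
        return 0;
    }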
- MIDI Instruments
- emulate acoustic and artificial sounds
- MIDI devices are generally multitimbral
- can emulate more than one instrument
- to change instrument, send the MIDI port a "program change" message (sketch below)
- programs (instruments) numbered between 0 and 127 (7 bit addressing)
- modern devices support the GM (general midi) specification
- GM maps instruments to defined program numbers
- e.g. piano is 0
- numbering starts at 0 on the wire, though some books list them starting at 1
- usually support other device specific numbering schemes.
- the MIDI device implements these instruments (timbres)
usually using:
- procedural methods such as FM
- explicit (canned) methods like wave table (which uses recorded samples)
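a minimal sketch of sending a program change over the raw MIDI device
(channel 0 and program 0 are illustrative; GM maps program 0 to piano):

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/dev/midi", O_WRONLY);
        if (fd == -1) { perror("/dev/midi"); return 1; }

        /* status 0xC0 | channel = program change; one data byte follows */
        unsigned char msg[2] = { 0xC0, 0 };
        if (write(fd, msg, sizeof msg) == -1)
            perror("write");

        close(fd);
        return 0;
    }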
- MIDI Notes
- playing notes is the main task (see the sketch after this list)
- there are 2 note messages in MIDI
- note on
- this msg signals a key press
- it contains info about the key that was pressed (controls instrument pitch)
- it also contains velocity (controls instrument volume and envelope)
- note off
- this msg signals a key release
- after this msg, the sound will decay according to the instrument
characteristics
- each message signals the note number (0 to 127)
- number of the key on the keyboard.
- middle C is 60
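a minimal sketch of the note-on / note-off pair as raw bytes (channel, note
and velocity values are illustrative):

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/dev/midi", O_WRONLY);
        if (fd == -1) { perror("/dev/midi"); return 1; }

        unsigned char on[3]  = { 0x90, 60, 100 }; /* note on: chn 0, middle C, vel 100 */
        unsigned char off[3] = { 0x80, 60, 0 };   /* note off for the same key */

        if (write(fd, on, sizeof on) == -1) perror("write");
        sleep(1);             /* timing is the application's job on /dev/midi */
        if (write(fd, off, sizeof off) == -1) perror("write");

        close(fd);
        return 0;
    }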
- Voices and Channels
- to play a note, the device usually needs one or more voices
- some notes use many voices (for layering)
- the number of possible voices played at one time is limited by the device
- 9 with OPL2
- 18 with OPL3
- currently most support 30 or 32
- future trend is to support 64 or 128