::| SUBSYNTH: low-level audio synthesis subsystem. modular analog synthesizer in software |
OSS (formerly called VoxWare)
- C API - device driver for sound cards and other sound devices under unix
- derived from the original linux sound driver
- runs on more than a dozen OSs, supporting many sound cards/devices

sound cards
- have several devices or ports that produce or record sound
  - digitized voice device /dev/dsp
    - codec - pcm - dsp - adc/dac
  - mixer device
    - handles I/O volume levels
  - synthesizer device
    - plays music - sound effects
    - 2 kinds typically
      - Yamaha FM synth chip
        - OPL-2, two operator
        - OPL-3, four operator
      - wave table
        - plays back pre-recorded samples - very realistic
  - MIDI interface
    - standard - serial interface - 31.25 kbps
    - designed to work with on-stage equipment
      - synths - keyboards - stage props - lighting controllers
    - devices communicate through a MIDI cable
  - joystick port - not controlled by OSS
  - CD-ROM interface - not controlled by OSS

programming OSS
- C headers in /usr/lib/oss/include (compile with -I/usr/lib/oss/include)
- OSS provides/supports these devices:
  - /dev/mixer
    - access to the built-in mixer circuit on the sound card
    - adjust playback and recording levels for sound channels
      - linein - master - cd - wav - synth - mic
    - supports several mixers on the same system
      - named /dev/mixer0, /dev/mixer1
      - /dev/mixer is just a symlink to one of them (usually /dev/mixer0)
  - /dev/sndstat
    - for diagnostic purposes - human-readable data
    - info for all ports and devices detected by OSS
    - cat /dev/sndstat
  - /dev/dsp and /dev/audio
    - main device files for digitized voice apps
    - data written here is played with the DAC/PCM/DSP device on the sound card
    - data read here comes from the input source (default microphone)
    - difference?
      - /dev/audio is logarithmic, provided for SunOS compatibility
      - /dev/dsp is 8-bit unsigned linear
      - use ioctl to select the encoding method (affects both)
    - several are supported: /dev/dsp0, /dev/dsp1
      - /dev/dsp is a symlink to one of them
      - /dev/audio works similarly
  - /dev/sequencer
    - electronic music - sound effects (i.e. in games)
    - access a synth located on the sound card, or an external music synth connected to the MIDI port
    - allows control of up to 15 synth chips and up to 16 MIDI ports
    - allows more precise control than /dev/music
  - /dev/music
    - similar to /dev/sequencer, but easier
    - handles synth and MIDI devices in the same way
    - enables device-independent programming
    - based purely on MIDI protocols
  - /dev/midi
    - low-level interface to MIDI bus ports
    - similar to TTY (character terminal) devices in raw mode
    - not intended for realtime use (timing not guaranteed)
    - everything is sent to the MIDI port immediately
    - useful for MIDI SysEx and sample librarian applications (data dumps)
    - /dev/midi00, /dev/midi01, ...
      - /dev/midi is a symlink to one of the others
  - /dev/dmfm
    - low-level register access to the FM chip
  - /dev/dmmidi
    - raw interface to MIDI devices - direct TTY-like access to the MIDI bus
    - for special applications

portability considerations
- gui
- endian issues
  - when system and audio device differ
  - i.e. big-endian RISC system and little-endian PCI card
- OSS hides device-specific features behind its API
  - the API is based on universal properties of sound and music, not hardware
- design your application so the user can select the /dev/ file
  - the default /dev/dsp0 file might be broken
  - /dev/dsp1 might be preferred by the user
- don't assume undocumented values - query the default values
  - i.e. the /dev/sequencer timer is usually 100hz, but can be 60 or 1024hz
- don't open a device twice
- not all sound cards have a mixer... :)
  - older cards - unsupported cards - some high-end digital-only cards
- not all mixers have a master volume control
  - your app should query the available channels from the driver
- don't use a device before checking that it exists

mixer programming
- check if it exists: ioctl will fail and set errno to ENXIO if there is no mixer
- based on channels
- each channel represents a physical slider
  - varies between 0 (off) and 100 (max vol)
  - most channels are stereo (set them separately for balance)
- usually only one input source is available for recording
- query the capabilities with ioctl
  - see which channels are actually present
  - channels vary from card to card - up to about 30 channels

general sound background info
- wavetable synthesis
  - a method of sound generation that uses digital sound samples stored in memory
- FM synthesis - frequency modulation synthesis
  - procedural method of sound generation
  - uses wave generators/modulators, sometimes in combination, to produce sound
  - small memory footprint
- operator
  - waveform oscillator used to produce sound in FM synthesis
  - more operators produce more realistic sounds
- voice - an independent sound generator
- sequencer
  - device (hardware or software) which controls (sequences) the playing of notes on a music synthesizer
- patch - the device settings for a sound generator (i.e. piano)
- dB (decibel)
  - unit to measure volume of sound
  - the scale is logarithmic since the human ear has a logarithmic response

audio programming
- sound is stored as a sequence of samples taken from an audio signal at constant time intervals
  - each sample represents the volume of the signal at the moment it was measured
  - each sample requires one or more bytes of storage
  - the number of bytes in a sample depends on channels (mono/stereo) and format (8 or 16 bits)
- the time interval between samples gives the sampling rate
  - expressed in samples per second (hertz)
  - common: 8khz (telephone) to 48khz (DAT tape) to 96khz (DVD audio)
- hardware
  - ADC (analog to digital converter)
  - DAC (digital to analog converter)
  - codec (contains both ADC and DAC) - referred to as a "dsp"
- fundamental parameters that affect sound quality
  - sampling rate
    - expressed in samples per second or hertz
    - limits the highest frequency that can be stored (Nyquist frequency)
    - Nyquist's sampling theorem states:
      - the highest freq that can be reproduced is at most half the sampling freq
      - i.e. if sampling rate == 44khz, then the highest freq is 22khz
  - format / sample size
    - expressed in bits
    - affects the dynamic range
    - i.e. 8-bit gives a range of 48db, 16-bit gives 96db
- in practice
  - record and play using the standard system calls: open - close - read - write
  - default params are usually very low (speech quality)
  - change device parameters with ioctl
  - all dev files support read/write, but some devices can't record
  - most devices work in half duplex mode (O_RDONLY or O_WRONLY)
    - record and play, but not at the same time
  - some devices work in full duplex (O_RDWR)
  - simplest way to record audio is to use UNIX commands
    - cat /dev/dsp > recorded.file.raw
  - devices are exclusive; if in use they return EBUSY
  - include: - - - -
  - mandatory data:
    - int audio_filedescriptor;
    - unsigned char audio_buffer[BUF_SIZE];
  - for real-time performance, keep the buffer short
    - 1024 - 4096 is a good range
    - choose a value that is a multiple of your sample size: samp_size = channels * format
- three parameters are needed to describe sampled audio data
  - sample format (num of bits)
  - num of channels (mono, stereo)
  - sampling rate (speed, sampling frequency)
- for stereo data - two samples for each time slot
- the left channel sample is always stored before the right channel sample
  - this extends to more than 2 channels
- the device must be reset before setting them: ioctl( SNDCTL_DSP_RESET )

OSS programming:
- OSS supports multichannel (i.e. more than 2)
  - professional multichannel audio devices - 16 or more mono channels (8 stereo pairs)
  - how to encode multiple channels:
    - interleaved (like with stereo sound data)
    - multiple /dev/dsp devices
    - mixed (dsp1 in 2-channel mode, dsp2 in mono, dsp3 in quad, etc...)
- to access the mixer
  - don't access it directly through /dev/mixer
    - can't guarantee that it is associated with the /dev/dsp you need
  - instead access the mixer settings through the /dev/dsp's own ioctls
- for quality sound (avoid clicking)
  - when using a single audio buffer, you cannot write to it while it is being read; writing to it produces a click
    - the click is your audio hardware starving
    - forces your application to do only one thing to ensure minimal popping
  - to fix, use the double buffering method
    - 2 buffers: one is being read while the other is being written to
    - then swap; repeat for as long as the device is in use
  - buffer overrun - too much data (results in data loss)
  - buffer underrun - not enough data
- improving latency (for generated sound with realtime requirements)
  - use a smaller buffer (fragment) size to improve latency
  - useful to sync sound with the graphics
  - use select
- if writing an effect processor (read/write simultaneously)
  - use full duplex,
  - or use two devices together...

MIDI
- what is it
  - Musical Instrument Digital Interface
  - a hardware level interface, plus a communication protocol between devices using that interface
- doesn't produce the audio; it controls some kind of external synthesizer which performs the sound generation
- hardware details
  - interface
    - asynchronous - serial - byte-oriented
    - similar to RS-232 (serial port)
    - transfer rate is 31250 bits per second
  - MIDI cables
    - connect devices
    - 5-pin DIN connector at each end
    - single direction per cable
  - MIDI ports (called MPU401 device - developed by Roland corp.)
    - there are dedicated (professional) MIDI-only cards, without audio
      - dumb serial port - no sound generation capabilities
      - used to connect to an external MIDI device using MIDI cabling
  - MIDI external device
    - MIDI keyboard
    - rack-mounted tone generator w/out keyboard
    - audio mixer
    - flamethrower, washing machine, etc...
  - connection
    - possible to have an unlimited number of devices
      - through daisy chaining
      - MIDI multiplexers (like a Y-cable, but more expensive)
    - one command on the MIDI cable may be processed by an unlimited number of devices; each of them can react to the command as it wishes
- protocol
  - MIDI simply sends and receives bytes
    - you don't care what device is attached, you just see a MIDI port
    - it is even possible that there is nothing connected to the port
  - each port has 16 possible channels
  - no sound transmitted, only control messages
    - instrument change messages
    - trigger, release (key was pressed)
      - note number (which key)
      - velocity (how hard)
- what is a synthesizer?
  - a synth is a tone generator
  - hardware tone generation/mixing
    - internal to a computer - mounted directly on a sound card or motherboard
      - Yamaha OPL2/3 FM synth
        - OPL2 was used in the AdLib (late 80's)
        - OPL3 was used in the Sound Blaster Pro
        - OPL4 is FM combined with wave table (OPL3 compatible)
      - Gravis UltraSound (GUS)
        - first wave table sound card on the market
        - 32 simultaneous voices
        - wave samples stored in a table in the card's internal RAM - 512k to 8MB
        - limited memory, so the application needed to manage patch loading and caching
        - de facto API supported by many wave table device drivers
      - Emu8000
        - the chip in the SoundBlaster 32/64/AWE cards
        - similar to GUS, but provides the GM patch set in ROM
        - patch load/cache is not necessary (but possible)
    - or external
      - Roland Sound Canvas
      - etc...
    - a sound chip can have capabilities beyond what MIDI defines
      - but those caps are unavailable if you're only using MIDI to control it
  - software tone generation/mixing
    - www.fruityloops.com
    - www.propellerheads.se
    - SoftOSS
      - software-based wavetable engine by 4front technologies
      - does mixing in software
      - can use any audio card (without wave table) to play MIDI with wave table quality
  - usually what is connected to a MIDI port device
    - allows a standard interface to the synth
    - allows portable code, etc...
- MIDI file format
  - file extension .mid
  - contains only control data, no instrument data (unlike the MOD format)
  - instrument timbres are assigned by the playing system
- MIDI in OSS
  - use /dev/midi, or /dev/midi00, /dev/midi01, /dev/midi02
  - data is sent ASAP (midi data is queued)
    - the queue can hold about 60 bytes
  - you can use Midilib, which comes with OSS, to read MIDI files
  - uses the MIDI 1.0 spec
  - follows the General MIDI (GM) and Yamaha XG specifications
  - see also www.midi.org and www.ysba.com
- MIDI is a highly real-time process
  - timing errors can be very noticeable to an experienced listener
  - /dev/music and /dev/sequencer can hold enough data for several seconds
    - simply write more data before the queue becomes empty
- queuing is the central part of the API
  - one queue for playback and one for recording
  - playback happens in the background (async)
  - the queue is non-blocking (the application never waits)
  - the queue is organized as a stream of events
    - events are records of 8 or 4 bytes
      - command code
      - parameter data
    - two main types of event:
      - timing
        - time stamp
        - included before input events
        - used to delay playback
      - input
        - i.e. play a note, volume change, etc...
- instantaneous
  - when there are no timing events, the hardware tries to play all inputs as fast as possible (simultaneously)
- events are processed in the order written to the device (FIFO queue)
- for realtime events, you can send an immediate event ahead of the queue
  - good for playing real-time events (live performance, etc...)

MIDI Instruments
- emulate acoustic and artificial sounds
- MIDI devices are generally multitimbral
  - can emulate more than one instrument at a time
- to change instrument, send the MIDI port a "program change" message
  - programs (instruments) are numbered between 0 and 127 (7-bit addressing)
- modern devices support the GM (General MIDI) specification
  - GM maps instruments to defined program numbers
    - i.e. piano is 0
    - numbering starts at 0; some books list them starting at 1 (wrong)
  - devices usually support other device-specific numbering schemes too
- the MIDI device implements these instruments (timbres) usually using:
  - procedural methods such as FM
  - explicit (canned) methods like wave table (which uses recorded samples)

MIDI Notes
- playing notes is the main task
- there are 2 note messages in MIDI
  - note on
    - this msg signals a key press
    - it contains info about the key that was pressed (controls inst pitch)
    - it also contains velocity (controls inst volume and envelope)
  - note off
    - this msg signals a key release
    - after this msg, the sound decays according to the instrument characteristics
- each message carries the note number (0 to 127)
  - the number of the key on the keyboard
  - middle C is 60

Voices and Channels
- to play a note, the device usually needs one or more voices
  - some notes use many voices (for layering)
- the number of voices playable at one time is limited by the device
  - 9 with OPL2
  - 18 with OPL3
  - currently most support 30 or 32
  - the future trend is to support 64 or 128