Saturday, March 07, 2009

NAudio Tutorial 3 - Sample Properties

The 3rd installment in this NAudio tutorial series will be focusing on controlling sample properties.

Because this third instalment in the NAudio tutorial series is quite large and there is a fair bit of code to review, I have packaged up the entire tutorial, including an RTF and AbiWord version, in a convenient zip for you. Available here.

Additionally, we will look at building the starting blocks of an NAudioInterface class, to provide a generic level of abstraction that can be called from other applications. This component is specifically to start addressing my own needs for the NAudio API, as a replacement for the audio APIs OpenSebJ currently uses (SDL.Net and OpenAL.Net) and previously used (DirectSound through Microsoft's Managed DirectX).

The goals for the NAudioInterface we will be building in this tutorial will be to:

  • Provide a single method for the default configuration and setup of audio output, working on the basis that all audio samples are designated for mixing together

  • Abstract all of the wave file and stream loading operations, such that only a file name needs to be provided

  • Set up a sample concept, such that all audio sample operations can occur through this instance, such as:

    • Play

    • Pause

    • Pan

    • Volume

    • Looping

    • Sample Playback Position





We will be building upon the first two Tutorials, so if you haven't had a chance to review what's going on, they would be a good place to start.

NAudio & the relationship to NAudioInterface

NAudio does a great job of abstracting the underlying actions required to load audio files and stream them to various output devices and it does present an admirably well thought out mix of complexity vs. the amount of functionality. So why bother abstracting this any further?

Well, in the specific scenario that I'll be presenting here, we want to reduce some of this complexity and tie the common options and functions together, so that interacting with the samples is more about the functions being requested and less about how those functions are implemented. So think of the NAudioInterface as the level of abstraction that moves us from how it's done to just getting it done.

Architecture

I'll insert an obligatory diagram, describing the interaction that is going between the various classes and how we are logically grouping these interactions together.



Note: This diagram is not trying to portray an accurate physical implementation of the classes and the internal workings of NAudio; rather it is designed to depict and explain the logical relationships and their basic level of interaction.

Starting at the top, we have

  • Audio Application; this is our tutorial application, however it could be any application which would benefit from utilizing an interface structure such as this.

  • NAudioInterface; groups all of the audio functionality together, presenting a single unified location for all interaction with the samples by the Audio Application.

  • WaveMixerStream32; is set up by the NAudioInterface in a setup method, at which time the WaveMixerStream32 instance is linked to the IWavePlayer. No changes to this class are made by this tutorial.

  • IWavePlayer; is our waveOutDevice. In this tutorial we will just initialize an ASIO stream straight to the device. No changes to this class are made by this tutorial.

  • AudioSample; is inherited from the WaveStream class and is the place where the non-core NAudio sample functionality is stored and used to manipulate the data that is read into the byte array passed through to the WaveMixerStream32, e.g. setting up a sample to loop.

  • WaveStream; overridden by the AudioSample class, to send the byte arrays which are to be mixed by the WaveMixerStream32 and subsequently played by the sound card. No changes to this class are made by this tutorial.



Because we constrain all of the NAudio-related modifications to the AudioSample class (which is inherited from the WaveStream class) and to the new NAudioInterface class, any changes we need to make to the audio that is playing are isolated in specific locations that mirror the functionality they invoke, without directly updating the NAudio libraries. So no branch of the underlying NAudio API is required.


NAudioInterface

We will want to access these functions from multiple locations within our program and, in a scenario where there are many windows, from many different classes. We also need to keep what we are requesting consistent. As such, the first step is to declare this as a static class, so that it can be accessed anywhere within our program without being instantiated.

using NAudio.Wave;

namespace AudioInterface
{
   public static class NAudioInterface
   {

& then declare all of the items we need to keep global state information about, i.e. we will only need one waveOutDevice and one mixer, which all audio streams will be mixed to.

       //Declarations required for audio out and mixing
       private static IWavePlayer waveOutDevice;
       private static WaveMixerStream32 mixer;
       
       // The Sample array we will load our Audio Samples in to
       private static AudioSample[] Sample;

The new introduction here, compared with the previous tutorials, is the AudioSample class. I mentioned in Tutorial 2 that there is a really good implementation in one of the NAudio demo applications called MixDiff, specifically in MixDiffStream.cs, which provides a great level of abstraction from the heavy lifting. That class has been reused in this tutorial and expanded to add the additional functionality mentioned in the goals.

Conceptually this has been grouped together in to the AudioSample class, such that any interaction with a sample from the calling program should be able to access any sample related functionality through the Sample instance we have created.

Now that we have an array of AudioSamples, we need to initiate our waveOutDevice. This is handled in a single method called SetupAudio in the NAudioInterface class, which is designed to be called just once, when the calling program is setting up all of its requirements.

       /// <summary>
       /// Setup Audio via NAudio.
       /// Defaults to using Asio for Audio Output.
       /// </summary>
       public static void SetupAudio(int Samples)
       {
           //Setup the Mixer
           mixer = new WaveMixerStream32();
           mixer.AutoStop = false;
           
           if (waveOutDevice == null)
           {
               waveOutDevice = new AsioOut();
               waveOutDevice.Init(mixer);
               waveOutDevice.Play();
           }

          Sample = new AudioSample[Samples];
       }

After the waveOutDevice has been setup and the mixer associated, we will now start loading waves in to the AudioSample instance. As such we have a simplified method to handle loading of the samples, funnily enough named LoadSample.

      public static void LoadSample(string fileName, int sampleNumber)
       {
           Sample[sampleNumber] = new AudioSample(fileName);
           mixer.AddInputStream(Sample[sampleNumber]);
           
           // The stop is required because the mixer is never
           // turned off: as soon as an input stream is added it
           // starts playing, which is audible for longer samples.
           // As a workaround we move the playback position to
           // the end of the stream straight away.
           Stop(sampleNumber);
       }

Fairly straightforward, but two things to note. First, we are adding the sample which has been loaded to the mixer's input streams. Second, it's important to remember that we have already associated the mixer with the waveOutDevice and set the device to a status of playing. This can result in the wave file being played midway through loading, once enough data is available. You wouldn't notice it for a small wave sample, but it is certainly noticeable for a longer one. As such we call the method Stop; let's have a look at this now.

      public static void Stop(int sampleNumber)
       {
           // Set the position at the end of the sample length
           Sample[sampleNumber].Position = Sample[sampleNumber].Length;
       }

OK, so why have we set the playing position of the Sample to the length of the sample in order to stop it playing? It comes back to our structure, where the mixer and the waveOutDevice are in a constant state of Play. Therefore we need to make sure that we stop streaming audio from the Sample to the mixer, but rather than unloading the sample, we move the current position to the end of the sample, which ensures that no more data is loaded into the mixer stream for playback. Similarly, or perhaps conversely, we have the Play method.

      public static void Play(int sampleNumber)
       {
           Sample[sampleNumber].Position = 0;
       }
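Both Stop and Play come down to moving a read cursor, and that idea can be tried outside NAudio. Below is a minimal sketch that uses a plain .NET Stream as a stand-in for the sample; the CursorSketch class and its method names are hypothetical and not part of NAudio or the tutorial code. One assumption to keep in mind: a MemoryStream reports zero bytes once the cursor is at the end, whereas an NAudio stream may hand back silence instead.

```csharp
using System.IO;

public static class CursorSketch
{
    // "Stop": park the read cursor at the end so nothing is left to read.
    public static void Stop(Stream sample)
    {
        sample.Position = sample.Length;
    }

    // "Play": rewind so the next read starts from the first byte.
    public static void Play(Stream sample)
    {
        sample.Position = 0;
    }

    // Reads up to 'count' bytes and reports how many were delivered.
    public static int ReadSome(Stream sample, int count)
    {
        byte[] buffer = new byte[count];
        return sample.Read(buffer, 0, count);
    }
}
```

With an 8 byte MemoryStream, calling Stop and then ReadSome for 4 bytes delivers nothing, while calling Play and reading again delivers all 4 bytes from the start.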

Play is just a simple matter of setting the current position of the sample back to the beginning. Now that we have established a fairly easy way to stop the Sample playback, we have everything required to construct a Pause() and a Resume() method.

      public static void Pause(int sampleNumber)
       {
           Sample[sampleNumber].Pause();
       }

       public static void Resume(int sampleNumber)
       {
           Sample[sampleNumber].Resume();
       }

I would agree that these two methods don't have a lot of detail, so let's have a sneak peek into the AudioSample class to see what's going on:

namespace AudioInterface
{
   public class AudioSample : WaveStream
   {
       // General Sample Settings (Info)
       string _fileName = "";
       bool _loop;
       long _pausePosition = -1;
       bool _pauseLoop;
               
       // Sample WaveStream Settings
       WaveOffsetStream offsetStream;
       WaveChannel32 channelSteam;
       bool muted;
       float volume;

<Snip>

      public void Pause()
       {
           // Store the current stream settings
           _pausePosition = Position;
           _pauseLoop = _loop;

           // Ensure the sample is temporarily
           // not looped and set the position to the
           // end of the stream
           _loop = false;
           Position = Length;

           // Set the loop status back, so that any
           // further modifications of the loop
           // status are observed
           _loop = _pauseLoop;
       }

       public void Resume()
       {
           // Ensure that the sample had actually been
           // paused and that we are not just jumping
           // to a random position
           if (_pausePosition >= 0)
           {
               // Set the position of the stream
               // back to where it was paused
               Position = _pausePosition;

               // Set the pause position to negative
               // so that we know the sample is not currently paused
               _pausePosition = -1;
           }
       }

If you have been keeping up with me so far, I'll assume this doesn't need more explanation than what's already included in the comments. But for clarity, or obsessive compulsive explanation, I'll just confirm that on Pause we effectively stop the sample the same way we did previously and keep a reference to what the position was before stopping. Then when Resume is called we just let it go and start streaming again from the previous position.

There are a few other functions from NAudioInterface to introduce before moving into the detail of how they are implemented in the AudioSample class. These are rather self-explanatory and, as they are in the NAudioInterface class, they contain very little detail about what's actually going on under the hood.

Looping will, well, loop the sample. When it reaches the end of playback, as determined by there being no more audio to read from the end of the stream, reading wraps back to the beginning and the process starts over again.

      public static void Loop(int sampleNumber, bool Loop)
       {
           Sample[sampleNumber].SetLoop(Loop);
       }

Set Pan - Left to Right. We are using a float value that goes from -1.0 at one end of the speaker pan, through 0 at the center, to 1.0 at the complete other end.

       public static void SetPan(int sampleNumber, float pan)
       {
           Sample[sampleNumber].SetPan(pan);
       }

We set the volume here, again using a float. The spectrum goes from 0 to 1.

       public static void SetVolume(int sampleNumber, float volume)
       {
           Sample[sampleNumber].Volume = volume;

       }
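To make the two ranges concrete, here is a small sketch of how a pan of -1 to 1 and a volume of 0 to 1 could be turned into a gain for each speaker. This is only an illustration: it uses a simple linear pan law chosen for clarity, it assumes -1 means fully left, and it is not taken from NAudio, whose internal pan handling may well differ. GainSketch and its methods are hypothetical names.

```csharp
using System;

public static class GainSketch
{
    // Keeps a value inside [min, max].
    public static float Clamp(float value, float min, float max)
    {
        return Math.Max(min, Math.Min(max, value));
    }

    // Linear pan (an assumed law): -1 = all left, 0 = equal, 1 = all right.
    // Returns { leftGain, rightGain }, each scaled by the overall volume.
    public static float[] ChannelGains(float pan, float volume)
    {
        float p = Clamp(pan, -1.0f, 1.0f);
        float v = Clamp(volume, 0.0f, 1.0f);
        float left = v * (1.0f - p) / 2.0f;
        float right = v * (1.0f + p) / 2.0f;
        return new float[] { left, right };
    }
}
```

Clamping first means an out of range value from a caller can never push a gain below 0 or above the requested volume.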

Setting the position of the sample playback is certainly a concept previously introduced.

       internal static void SetPostion(int sampleNumber, long position)
       {
           Sample[sampleNumber].Position = position;
       }

& Finally, wrapping it all together is the get length method for the audio. For clarity, this returns a value in the same units that the set position method uses, which is based on the Length of the source stream / the number of bytes per sample, i.e. the number of individual samples.

       internal static long GetLength(int sampleNumber)
       {
           return Sample[sampleNumber].Length;
       }
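The relationship between a byte offset, a number of samples and a playback time is easy to get wrong, so here is a hedged sketch of the arithmetic. It assumes the value in hand is a byte offset, as it is for a standard .NET Stream; whether AudioSample reports bytes or samples depends on its Length override, which is not shown here. PositionSketch and its parameters are hypothetical.

```csharp
using System;

public static class PositionSketch
{
    // Bytes occupied by one sample frame (one sample for every channel).
    public static int BytesPerFrame(int channels, int bitsPerSample)
    {
        return channels * (bitsPerSample / 8);
    }

    // Converts a byte offset into a count of whole sample frames.
    public static long BytesToFrames(long byteOffset, int channels, int bitsPerSample)
    {
        return byteOffset / BytesPerFrame(channels, bitsPerSample);
    }

    // Converts a byte offset into a playback time.
    public static TimeSpan BytesToTime(long byteOffset, int sampleRate, int channels, int bitsPerSample)
    {
        long frames = BytesToFrames(byteOffset, channels, bitsPerSample);
        return TimeSpan.FromSeconds((double)frames / sampleRate);
    }
}
```

For 16-bit stereo audio at 44100 Hz, a byte offset of 176400 works out to 44100 frames, which is exactly one second.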

Well, that wraps up what the NAudioInterface looks like. Time to get on to the AudioSample class and all its related methods. Harrar.


AudioSample

As mentioned earlier, this class is inherited from the WaveStream class, where most of the underlying functionality will still exist and in this scenario very little has been overridden in the AudioSample class.

Although we have already had a sneak peek at how this class is set up, it's worth reviewing the key elements here again:

using NAudio.Wave;

namespace AudioInterface
{
   public class AudioSample : WaveStream
   {
       // General Sample Settings (Info)
       string _fileName = "";
       bool _loop;
       long _pausePosition = -1;
       bool _pauseLoop;
               
       // Sample WaveStream Settings
       WaveOffsetStream offsetStream;
       WaveChannel32 channelSteam;
       bool muted;
       float volume;

The AudioSample object now contains all the information that relates to an individual sample, which helps us keep the NAudioInterface class structured: it does not need to be aware of the individual sample properties unless it is specifically interested. So internally we store whether or not the sample is set up to loop, the filename of the original sample, whether the sample is paused & what the pause position was, what the volume for the sample is, etc.

In our constructor for the AudioSample class, we only need to take in the filename initially, and we continue to use our other defaults for the sample until told otherwise.

       public AudioSample(string fileName)
       {
           _fileName = fileName;
           WaveFileReader reader = new WaveFileReader(fileName);
           offsetStream = new WaveOffsetStream(reader);
           channelSteam = new WaveChannel32(offsetStream);
           muted = false;
           volume = 1.0f;
       }

Key to the constructor is setting up the channel stream (channelSteam in the code), based on the WaveFileReader. The offset stream is used if we want to change the offset of a sample, i.e. where it starts. I haven't looked into this much yet but thought it may be useful in the future, so I've left it in for the time being. Performance overhead for this? I have no idea.

Rather than pull out every override that is practically the same, I'll focus on those that are non-obvious or have a slightly less straightforward explanation. If you have managed to read this far into the tutorial, let me know - it would be nice to know that someone had bothered reading this chapter from the novel that is my Audio Programming Experience.

Actually, doing this leaves me with only one method worth talking about that hasn't already been covered: the overridden method Read.

Read is the central artery of the WaveStream in NAudio and as such is an interesting and fun place to override and make constructive changes. In this example we have used the override to introduce looping capability for samples; the code for this looping was ripped from one of the other NAudio demos.

      public override int Read(byte[] buffer, int offset, int count)
       {
           // Check if the stream has been set to loop
           if (_loop)
           {
               // Looping code taken from NAudio Demo
               int read = 0;
               while (read < count)
               {
                   int required = count - read;
                   int readThisTime = channelSteam.Read(buffer, offset + read, required);
                   if (readThisTime < required)
                   {
                       channelSteam.Position = 0;
                   }

                   if (channelSteam.Position >= channelSteam.Length)
                   {
                       channelSteam.Position = 0;
                   }
                   read += readThisTime;
               }
               return read;
           }
           else
           {
               // Normal read code, sample has not been set to loop
               return channelSteam.Read(buffer, offset, count);
           }
       }

What is happening here is that the Read method knows how much data the mixer has asked for (count), so we keep reading until that much has been supplied. We know we have reached the end of the stream when readThisTime is less than the amount requested (the WaveChannel32 instance, channelSteam, returns an integer from Read representing the number of bytes actually read). I'm not sure what a short read actually sounds like, i.e. whether there is a momentary gap in the audio, but in my testing so far I have not been able to distinguish any audible issues from this.

int readThisTime = channelSteam.Read(buffer, offset + read, required);

The other check is whether the current Position of the channel stream has reached or passed the full Length of the channel stream. In either case we move back to the first position in the channel stream and read the remaining data from the beginning of the stream.

if (channelSteam.Position >= channelSteam.Length)
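The wrap-around idea can be tried in isolation. The sketch below applies the same loop to a plain .NET Stream rather than a WaveChannel32, so it runs without NAudio; LoopSketch and ReadLooped are hypothetical names. It differs from the listing above in two small ways, both my own assumptions: it rewinds only when a read delivers nothing at all, and it returns early for an empty source, which would otherwise keep the loop spinning forever.

```csharp
using System.IO;

public static class LoopSketch
{
    // Fills 'count' bytes of 'buffer' from 'source', rewinding to the start
    // whenever the end is reached. Returns the number of bytes written.
    public static int ReadLooped(Stream source, byte[] buffer, int offset, int count)
    {
        // Guard: an empty source could never fill the buffer.
        if (source.Length == 0)
        {
            return 0;
        }

        int read = 0;
        while (read < count)
        {
            int readThisTime = source.Read(buffer, offset + read, count - read);
            if (readThisTime == 0)
            {
                // Nothing left: wrap around and continue from the start.
                source.Position = 0;
            }
            read += readThisTime;
        }
        return read;
    }
}
```

Reading 8 bytes from a 3 byte source containing 1, 2, 3 fills the buffer with 1, 2, 3, 1, 2, 3, 1, 2, which is the looping behaviour the Read override is after.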

Audio Application

Time to put a bow on all of this and call it done. I've put together two forms that are going to be used to show off the magic that is NAudio Tutorial 3. The first form is a launcher that can be used to load different wave samples, and it allows the same wave sample to be loaded to multiple positions simultaneously for multi-but-same sample fun.





Brutally boring and beautifully powerful. Check one or a number of boxes and then click Open and find a wave sample to load. Once the sample is loaded a number of windows (based on how many positions you have checked) pop up.



If you have ever seen OpenSebJ and think this window looks strikingly similar to the properties window that's in OpenSebJ - you wouldn't be wrong. For comparison:

 

I've only pasted this here so that you can see how this fits in to the big picture of a Real Time mixing Open Source audio composition tool - OpenSebJ; that and because if I can't bang the drum about OpenSebJ here, then where can I?

Sample Properties

So, back on track: the sample properties window allows us to control all of the functionality we have bubbled to the surface through our NAudioInterface and, as you can see, we have a button, switch or slider for all of it. So let's have a look at what's required to do all this:

using AudioInterface;

namespace NAudioTutorial3
{
   public partial class SampleProperties : Form
   {
       int samplePosition = -1;

       public SampleProperties(int SamplePosition)
       {
           samplePosition = SamplePosition;
           InitializeComponent();
       }

We keep a local variable of the SamplePosition populated so that we know which sample this UI relates to when sending instructions to the NAudioInterface.

       private void SampleLauncher_Load(object sender, EventArgs e)
       {
           trkPosition.Maximum = (int)NAudioInterface.GetLength(samplePosition);
       }

On load, we set up the track bar so that its maximum is the length of the sample, so that when we scroll the slider we move the current play position of the sample. Think of moving a needle along a record, but electronically. As a scratch analogy, it is more like picking the needle up, moving it to the exact spot and letting it play; dragging backwards is like repeatedly picking it up, putting it down a little earlier and letting it play for a moment, a leapfrog effect in reverse. I will be looking into how to reverse a sample at a later point in time, to simulate record scratching properly.

       private void cmbPlay_Click(object sender, EventArgs e)
       {
           NAudioInterface.Play(samplePosition);
       }

       private void chkLoop_CheckedChanged(object sender, EventArgs e)
       {
           NAudioInterface.Loop(samplePosition, chkLoop.Checked);
       }

       private void cmbPause_Click(object sender, EventArgs e)
       {
           if (cmbPause.Text == "Pause")
           {
               NAudioInterface.Pause(samplePosition);
               cmbPause.Text = "Resume";
           }
           else
           {
               NAudioInterface.Resume(samplePosition);
               cmbPause.Text = "Pause";
           }
           
       }

One thing to note with the Pan and Volume changes is that the AudioSample class uses a float value of 0 to 1 for the volume and -1 to 1 for the pan. As such I have set the track bar ranges to 0 to 1000 and -1000 to 1000, so that the track bars feel smooth when moving them and give a greater level of precision to the adjustments that are occurring.

       private void trkPan_ValueChanged(object sender, EventArgs e)
       {
           NAudioInterface.SetPan(samplePosition, (float)trkPan.Value / (float)1000);
       }

       private void trkVolume_ValueChanged(object sender, EventArgs e)
       {
           NAudioInterface.SetVolume(samplePosition, (float)trkVolume.Value / (float)1000);
       }

The last handler changes the current playing position of the sample in real time. Lots of fun can be had with this - even if I do say so myself.

       private void trkPosition_MouseMove(object sender, MouseEventArgs e)
       {
           if (e.Button == MouseButtons.Left)
           {
               NAudioInterface.SetPostion(samplePosition, (long)trkPosition.Value);
           }
       }

   }
}
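The track bar handling above boils down to two small conversions, sketched here without any WinForms dependency. ToUnit is the 1000-step scaling used for pan and volume. AlignToFrame is my own addition rather than something the tutorial code does: it assumes the position is a byte offset and rounds it down to the start of a sample frame, so that a seek from the slider does not land in the middle of a frame. TrackBarSketch and both method names are hypothetical.

```csharp
using System;

public static class TrackBarSketch
{
    // Converts an integer track bar value into a float, e.g. 500 / 1000 = 0.5.
    public static float ToUnit(int trackValue, int scale)
    {
        return (float)trackValue / scale;
    }

    // Rounds a byte position down to the start of a sample frame so that a
    // seek never lands in the middle of a frame.
    public static long AlignToFrame(long position, int bytesPerFrame)
    {
        return position - (position % bytesPerFrame);
    }
}
```

For example, a pan track bar at -1000 with a scale of 1000 gives -1.0, and a requested position of 1003 bytes with 8 byte frames is pulled back to 1000.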


Sample Launcher

Most of the code here relates to the UI and ensuring we only load a single sample once but I'll throw it in for completeness. Refer to in-line comments for details.


using AudioInterface;

namespace NAudioTutorial3
{
   public partial class NAudioTutorial3 : Form
   {
       // sampleLoaded used to make sure we don't try and load in to the
       // same sample position twice
       bool[] sampleLoaded = new bool[256];
       
       // Our Sample Properties window.
       SampleProperties[] sampleWindow = new SampleProperties[256];


       public NAudioTutorial3()
       {
           InitializeComponent();
       }

       private void NAudioTutorial3_Load(object sender, EventArgs e)
       {
           // Setup the NAudioInterface first and only required once
           // for the runtime.
           NAudioInterface.SetupAudio(256);
           
           // Set all data to initial values.
           for (int i = 0; i < 256; i++)
           {
               sampleLoaded[i] = false;
               chkList.Items.Add(i.ToString(), false);
           }
       }

       private void cmbOpen_Click(object sender, EventArgs e)
       {
           // prompt for file load
           OpenFileDialog openFileDialog = new OpenFileDialog();
           openFileDialog.Filter = "WAV Files (*.wav)|*.wav";
           if (openFileDialog.ShowDialog() == DialogResult.OK)
           {
               // Want to iterate through every possible sample
               for (int i = 0; i < 256; i++)
               {
                   // Check if it has previously
                   // been ticked and not previously loaded
                   if (chkList.GetItemChecked(i) == true && sampleLoaded[i] == false)
                   {
                       // One call to the NAudioInterface to
                       // load the sample
                       NAudioInterface.LoadSample(openFileDialog.FileName, i);
                       
                       // Setup the sample window and show it
                       sampleWindow[i] = new SampleProperties(i);
                       sampleWindow[i].Show();

                       // Make sure we don't load the
                       // sample position again
                       chkList.SetItemChecked(i, false);
                       chkList.Items[i] = "Loaded";
                       sampleLoaded[i] = true;
                   }
                   else
                   {
                       // Untick any item which was ticked
                       // to be loaded but has
                       // already been previously loaded
                       chkList.SetItemChecked(i, false);
                   }

               }//end for
           }//end if
           
       }


   }
}

That's it!

Conclusion

In this tutorial we have run through how to set up a static class that acts as a broker for all of the related NAudio functionality, such that the code required in forms and extended classes is minimalist in nature.

Reviewing the tutorial against our original goals we have:

Simple Audio Setup
Provide a single method for the default configuration and setup of audio output, working on the basis that all audio samples are designated for mixing together

We demonstrated this through a single call:  

NAudioInterface.SetupAudio(256);


Simple File Loading

Abstract all of the wave file and stream loading operations, such that only a file name needs to be provided

Demonstrated through only one method call required to load a sample from our Audio Application:  

NAudioInterface.LoadSample(openFileDialog.FileName, i);

Sample Concept

Set up a sample concept, such that all audio sample operations can occur through this instance, such as:

  • Play

  • Pause

  • Pan

  • Volume

  • Looping

  • Sample Playback Position


 
I won't cut and paste the Sample class, but you can check the AudioSample class if you don't believe me.

So we have met all our goals and have a Sample Properties interface I can almost drop in as-is to OpenSebJ, brilliant.


Next Time 

There are a few options for the following tutorials and I would be interested in hearing people's thoughts on what you would like to see covered. Some of the items I am personally interested in are:

  • How we can introduce a reverse option to a stream, so that we can start supporting a real-time scratch interface.

  • Adding Audio Effects to a Stream

  • Transposing the frequency of the stream being played back

  • Setting up stream to disk recording and (hopefully) getting it to work while we are still using ASIO (or in my case ASIO4ALL)



Final Thoughts

This tutorial ended up being much longer than I expected and became a way for me to impart my experience with audio applications in C#, with generalized concepts and a nice framework that I can use in OpenSebJ.

However through this process I have also re-factored the tutorial a few times to make it more universally applicable. Moving the sample properties to a separate class for instance, so that it can be instantiated and used to control many samples and not just one - demonstrating the mixing capability and the number of simultaneous streams that can be handled.

There is still more room for improvement on this; perhaps some of these actions should be made simpler in the NAudioInterface and re-factored into the AudioSample class, but we can do that later.


Reminder: You can download the entire tutorial, including an RTF and AbiWord version, with all of the associated code, in a convenient zip here.

2 comments:

Diego said...

Great tutorials!

I cannot describe how much I'm thankful for those who built and support NAudio library.

I'm using it direct from SVN and I just love it. I ain't no sound expert, but I do write code for living and DirectSound was really starting to piss me off.

It's incredible the way the complexity of these libraries is hidden from the developer.

Thank you all of you guys for the great work.

Thomas said...

AMAZING Tutorials!!!!!!
I searched alot in the www and tried out DirectSound pe. Finally in wanted and needed ASIO (because of my DiplomaThesis) and find NAudio with youre tutorials!

Have you finished this point?
"Setting up stream to disk recording and (hopefully) getting it to work while we are still using ASIO (or in my case ASIO4ALL)"

Would be great if you give an answer!