Code Monkey vs. Silverlight Audio Capture

I’ve been working on a new project (Symphony, https://github.com/ermau/Symphony) that is heavily media-centric. The general idea is to provide better support in .NET for a whole range of multimedia scenarios, whether that’s recording and playing back audio, decoding a song, or what have you. In order to reach the broadest audience possible, I decided early on to include as much Silverlight support as I possibly could. Part of that support is bringing Silverlight’s built-in microphone support under Symphony’s unified API (which is to say, wrapping it). My head and desk are no longer on speaking terms, but more to the point, I’d love to know (at least in the case of Silverlight) whether Microsoft bothers to get people who have experience in the area they’re assigned to write libraries for.

A lesson I learned the hard way in developing Gablarski was that the available audio devices change, a lot. Some enable or disable themselves automatically when they detect a 3.5mm plug being connected or disconnected, others are USB devices that get plugged in and removed at will, and so on. As any logical person would deduce from this, it becomes important to be able to track which device is currently the default, when that default changes, and when a device is removed. So let’s look at Silverlight’s APIs and see what’s available; first we have CaptureDeviceConfiguration:

public static class CaptureDeviceConfiguration
{
	public static bool AllowedDeviceAccess { get; }
 
	public static bool RequestDeviceAccess();
	public static ReadOnlyCollection<VideoCaptureDevice> GetAvailableVideoCaptureDevices();
	public static ReadOnlyCollection<AudioCaptureDevice> GetAvailableAudioCaptureDevices();
	public static VideoCaptureDevice GetDefaultVideoCaptureDevice();
	public static AudioCaptureDevice GetDefaultAudioCaptureDevice();
}

Ok, so this gives us a way to get the current default and a list of all available devices. As you’ll notice, however, it does not provide an event notifying you that the default device has changed. I’ve found that the best user experience for voice-based applications is to start with the default audio device and to track it. If I manually select another device, the application should gladly remember what I’ve selected, but I should also be given a “default” choice that follows the current default instead of just the default at selection time. Without a notification to tell the application when this has changed, the developer is left with no good way to provide an excellent user experience. The best workaround I could come up with is to poll the current default device in order to provide the notification that Symphony’s API demands. Having the notification at all is a positive outcome, but having to trade off the delay between polls against the cycles wasted on checking is undesirable. Here’s what I came up with:

public class SilverlightAudioCaptureDeviceProvider
{
	public SilverlightAudioCaptureDeviceProvider()
	{
		if (!CaptureDeviceConfiguration.AllowedDeviceAccess)
			CaptureDeviceConfiguration.RequestDeviceAccess();
 
		UpdateDevices();
		this.pollTimer.Interval = TimeSpan.FromMilliseconds (250);
		this.pollTimer.Tick += PollTimerOnTick;
		this.pollTimer.Start();
	}
 
	public event EventHandler DefaultDeviceChanged;
 
	public AudioCaptureDevice DefaultDevice
	{
		get;
		private set;
	}
 
	private readonly DispatcherTimer pollTimer = new DispatcherTimer();
	private AudioCaptureDevice lastDefaultDevice;
 
	private void PollTimerOnTick (object sender, EventArgs eventArgs)
	{
		UpdateDevices();
	}
 
	private void UpdateDevices()
	{
		AudioCaptureDevice defaultDevice = CaptureDeviceConfiguration.GetDefaultAudioCaptureDevice();
		if (DefaultDevice != defaultDevice) // This is actually a bug which I’ll explain later
		{
			DefaultDevice = defaultDevice;
			OnDefaultDeviceChanged (EventArgs.Empty);
		}
	}
 
	private void OnDefaultDeviceChanged (EventArgs e)
	{
		var changed = DefaultDeviceChanged;
		if (changed != null)
			changed (this, e);
	}
}

(This is not the Symphony API; I’ve reduced it to a decorator of sorts so that readers can use these functions in any Silverlight project.)

Now that we have default device notifications, we need to look into notifications for devices becoming unavailable (which could be a user manually disabling the device in Windows, unplugging a USB mic, etc.). It’s obviously not on the CaptureDeviceConfiguration class, so let’s check the AudioCaptureDevice class:

public class CaptureDevice
	: DependencyObject
{
	public static readonly DependencyProperty FriendlyNameProperty;
	public static readonly DependencyProperty IsDefaultDeviceProperty;
 
	public string FriendlyName { get; }
	public bool IsDefaultDevice { get; }
}
 
public sealed class AudioCaptureDevice
	: CaptureDevice
{
	public static readonly DependencyProperty AudioFrameSizeProperty;
 
	public int AudioFrameSize { get; set; }
	public ReadOnlyCollection<AudioFormat> SupportedFormats { get; }
	public AudioFormat DesiredFormat { get; set; }
}

So there are no status-changed events and no properties to check whether the device is even still alive. Let’s step back and take a look at a basic usage of this API. First you new up a CaptureSource:

public sealed class CaptureSource
	: DependencyObject
{
	public static readonly DependencyProperty VideoCaptureDeviceProperty;
	public static readonly DependencyProperty AudioCaptureDeviceProperty;
 
	public CaptureSource();
 
	public event EventHandler<ExceptionRoutedEventArgs> CaptureFailed;
	public event EventHandler<CaptureImageCompletedEventArgs> CaptureImageCompleted;
 
	public VideoCaptureDevice VideoCaptureDevice { get; set; }
	public AudioCaptureDevice AudioCaptureDevice { get; set; }
	public CaptureState State { get; }
 
	public void Start();
	public void Stop();
	public void CaptureImageAsync();
}

So, we take our audio device, assign it to CaptureSource.AudioCaptureDevice and call Start(). CaptureFailed looks promising; I would expect it to be raised if the device had disappeared before or after starting the capture. I can’t help but notice that there’s an explicit method for capturing a single image, but there’s no way from right here to get audio samples (or multiple images). In order to actually get the audio samples you then have to create (as there are no built-in implementations) and initialize an AudioSink:

public abstract class AudioSink
{
	public CaptureSource CaptureSource { get; set; }
 
	protected abstract void OnCaptureStarted();
	protected abstract void OnCaptureStopped();
	protected abstract void OnFormatChange (AudioFormat audioFormat);
	protected abstract void OnSamples ( long sampleTimeInHundredNanoseconds, long sampleDurationInHundredNanoseconds,
					byte[] sampleData);
	~AudioSink();
}

Here are some more candidates: OnCaptureStopped() might be called if the device goes away during capture. So, I set up a test with all of these components, put breakpoints everywhere, started the capture and disabled a device while it was running. Can you guess which event/callback breakpoint was hit? I asked a few different people that question when I was attempting to do all of this, and the answer is so ridiculous that no one guessed it on their first try. The answer is NONE of them. If the device you’re capturing with goes away while you’re capturing, you receive no notification whatsoever. No event raised, no callback fired, no 0 value arguments to OnSamples, no CaptureSource.State == CaptureState.Failed, no exception, nothing.

As before, this is a pretty critical part of the user experience. The user may not even know that the device just went away. I can just imagine the “Hey, can anyone hear me?” and subsequent “I’ve been talking for 30 minutes and no one could hear me...”. If you’ve used a VoIP application while playing games, you’ve probably done this before with a mute switch on your headset. So, there are two alternatives: we can poll the complete list of devices to see whether the device is still listed, or we can have a timeout for OnSamples calls (as it does stop being called when the device goes away). Since we already have a poll for the default device, I decided to go with polling.
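
For comparison, the timeout alternative could look roughly like the sketch below. This is not Symphony code: SampleWatchdog, NotifySamples and Check are hypothetical names, and the sketch assumes you call NotifySamples() from your AudioSink’s OnSamples override and Check() from a timer such as the poll timer above. The clock is injected purely so the class can be exercised without waiting on real time.

```csharp
using System;

// Hypothetical helper, not part of Silverlight or Symphony.
public class SampleWatchdog
{
	public SampleWatchdog (TimeSpan timeout, Func<DateTime> clock)
	{
		if (clock == null)
			throw new ArgumentNullException ("clock");

		this.timeout = timeout;
		this.clock = clock;
		this.lastSample = clock();
	}

	// Raised once each time samples stop arriving for longer than the timeout.
	public event EventHandler TimedOut;

	// Assumed to be called from the AudioSink's OnSamples override.
	public void NotifySamples()
	{
		lock (this.sync)
		{
			this.lastSample = this.clock();
			this.timedOut = false;
		}
	}

	// Assumed to be called periodically, e.g. from a timer tick.
	public void Check()
	{
		lock (this.sync)
		{
			if (this.timedOut || this.clock() - this.lastSample < this.timeout)
				return;

			this.timedOut = true;
		}

		// Raise the event outside the lock so handlers can't deadlock us.
		var handler = TimedOut;
		if (handler != null)
			handler (this, EventArgs.Empty);
	}

	private readonly object sync = new object();
	private readonly TimeSpan timeout;
	private readonly Func<DateTime> clock;
	private DateTime lastSample;
	private bool timedOut;
}
```

The catch with this approach is picking a timeout that is longer than the normal gap between OnSamples calls but short enough to be useful, which is the same kind of tuning trade-off as the poll interval.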

First, we’ll need a decorator to hold the status and event:

public class AudioCaptureDeviceEx
{
	public AudioCaptureDeviceEx (AudioCaptureDevice device)
	{
		if (device == null)
			throw new ArgumentNullException ("device");
 
		this.device = device;
	}
 
	public event EventHandler IsAvailableChanged;
 
	public bool IsAvailable
	{
		get { return this.isAvailable; }
		set
		{
			if (this.isAvailable == value)
				return;
 
			this.isAvailable = value;
			var changed = IsAvailableChanged;
			if (changed != null)
				changed (this, EventArgs.Empty);
		}
	}
 
	// 1:1 AudioCaptureDevice wrappings left out for brevity
 
	private readonly AudioCaptureDevice device;
	private bool isAvailable = true;
}

We’ll want a list of devices with our additions in the device provider:

private readonly Dictionary<AudioCaptureDevice, AudioCaptureDeviceEx> devices =
	new Dictionary<AudioCaptureDevice, AudioCaptureDeviceEx>();
 
public IEnumerable<AudioCaptureDeviceEx> AudioDevices
{
	get
	{
		lock (this.devices)
			return this.devices.Values.Where (d => d.IsAvailable).ToList();
	}
}

Now we need to update our polling routine:

private void UpdateDevices()
{
	AudioCaptureDeviceEx currentDefault = null;
 
	lock (this.devices)
	{
		var newDevices = new HashSet<AudioCaptureDevice> (CaptureDeviceConfiguration.GetAvailableAudioCaptureDevices());
		foreach (var kvp in this.devices)
		{
			if (kvp.Value.IsDefaultDevice)
				currentDefault = kvp.Value;
 
			if (newDevices.Remove (kvp.Key))
			{
				kvp.Value.IsAvailable = true;
				continue;
			}
 
			kvp.Value.IsAvailable = false; // not in the new list, so the device has gone away
		}
 
		foreach (AudioCaptureDevice device in newDevices)
		{
			var d = new AudioCaptureDeviceEx (device);
			if (d.IsDefaultDevice)
				currentDefault = d;
 
			this.devices.Add (device, d);
		}
	}
 
	// Note: DefaultDevice is now typed as AudioCaptureDeviceEx instead of AudioCaptureDevice
	if (DefaultDevice != currentDefault)
	{
		DefaultDevice = currentDefault;
		OnDefaultDeviceChanged (EventArgs.Empty);
	}
}

Seems like it should work, right? Yeah, I thought so too. In my tests (on my machine anyway), I discovered that AudioCaptureDevice.IsDefaultDevice always returns false. You’d think there’d be an Assert.IsTrue (CaptureDeviceConfiguration.GetDefaultAudioCaptureDevice().IsDefaultDevice) somewhere in their tests. So instead I need to go back to querying CaptureDeviceConfiguration.GetDefaultAudioCaptureDevice() and then reconcile the result with the AudioCaptureDeviceEx instances. This is where I was in for another surprise; remember that bug I said I’d tell you about later? AudioCaptureDevice does not implement equality functions, and every call to CaptureDeviceConfiguration’s methods returns new instances. That means the DefaultDevice != defaultDevice check in the first version is a reference comparison that is always true, so the changed event fires on every poll. There’s no other way to compare devices but by their “FriendlyName” (which is actually cut off, making it impossible to display an exact match to what’s in Windows). Comparing by name will probably work most of the time, but it’s definitely not ideal. So now we have:

private class AudioCaptureDeviceEqualityComparer
	: IEqualityComparer<AudioCaptureDevice>
{
	public static readonly AudioCaptureDeviceEqualityComparer Instance
		= new AudioCaptureDeviceEqualityComparer();
 
	public bool Equals (AudioCaptureDevice x, AudioCaptureDevice y)
	{
		if (x == y)
			return true;
		if (x == null || y == null)
			return false;
 
		return (x.FriendlyName == y.FriendlyName);
	}
 
	public int GetHashCode (AudioCaptureDevice obj)
	{
		if (obj == null)
			throw new ArgumentNullException ("obj");
 
		return obj.FriendlyName.GetHashCode();
	}
}

We’ll need to update the dictionary holding the devices:

private readonly Dictionary<AudioCaptureDevice, AudioCaptureDeviceEx> devices =
	new Dictionary<AudioCaptureDevice, AudioCaptureDeviceEx> (AudioCaptureDeviceEqualityComparer.Instance);

And finally the updated UpdateDevices:

private void UpdateDevices()
{
	AudioCaptureDevice currentDefault = CaptureDeviceConfiguration.GetDefaultAudioCaptureDevice();
	AudioCaptureDeviceEx currentDefaultEx = null;
 
	lock (this.devices)
	{
		var newDevices = new HashSet<AudioCaptureDevice> (CaptureDeviceConfiguration.GetAvailableAudioCaptureDevices(),
							AudioCaptureDeviceEqualityComparer.Instance);
		foreach (var kvp in this.devices)
		{
			if (currentDefault != null && kvp.Value.FriendlyName == currentDefault.FriendlyName)
				currentDefaultEx = kvp.Value;
 
			if (newDevices.Remove (kvp.Key))
			{
				kvp.Value.IsAvailable = true;
				continue;
			}
 
			kvp.Value.IsAvailable = false; // not in the new list, so the device has gone away
		}
 
		foreach (AudioCaptureDevice device in newDevices)
		{
			var d = new AudioCaptureDeviceEx (device);
			if (currentDefault != null && d.FriendlyName == currentDefault.FriendlyName)
				currentDefaultEx = d;
 
			this.devices.Add (device, d);
		}
	}
 
	if (DefaultDevice != currentDefaultEx)
	{
		DefaultDevice = currentDefaultEx;
		OnDefaultDeviceChanged (EventArgs.Empty);
	}
}

When I was wrapping AudioCaptureDevice for Symphony’s capture device contract, I discovered another fun bit. Symphony has its own definition for audio formats, so I have to keep a map of formats so I can pass the correct one to Silverlight. Ok, no problem, I’ll just call ToDictionary with a little conversion function I wrote, and... ArgumentException. AudioCaptureDevice.SupportedFormats actually contains duplicates. I realize this is probably just directly reading from whatever API it’s sitting on, but when you’re trying to provide a simple API you should really clean up crap like this.
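
One way to sidestep the duplicates is to group by the conversion key before building the map, roughly as sketched below. Everything here is a stand-in so the snippet compiles on its own: FakeAudioFormat substitutes for Silverlight’s AudioFormat, and the string key built by ToKey substitutes for Symphony’s format type, which isn’t shown in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for Silverlight's AudioFormat (assumption, for illustration only).
public class FakeAudioFormat
{
	public FakeAudioFormat (int channels, int bitsPerSample, int samplesPerSecond)
	{
		Channels = channels;
		BitsPerSample = bitsPerSample;
		SamplesPerSecond = samplesPerSecond;
	}

	public int Channels { get; private set; }
	public int BitsPerSample { get; private set; }
	public int SamplesPerSecond { get; private set; }
}

public static class FormatMap
{
	// Hypothetical conversion function; a real one would return the library's own format type.
	public static string ToKey (FakeAudioFormat format)
	{
		return format.Channels + "ch/" + format.BitsPerSample + "bit/" + format.SamplesPerSecond + "Hz";
	}

	// Groups equivalent formats first, so ToDictionary never sees the same key twice.
	public static Dictionary<string, FakeAudioFormat> Build (IEnumerable<FakeAudioFormat> formats)
	{
		if (formats == null)
			throw new ArgumentNullException ("formats");

		return formats
			.GroupBy (f => ToKey (f))
			.ToDictionary (g => g.Key, g => g.First());
	}
}
```

Keeping the first format of each group is an arbitrary choice; it assumes the duplicates really are interchangeable.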

Let’s review

  • No notifications for default device changed
  • No notifications for devices added / removed
  • AudioCaptureDevice.FriendlyName is cut off at a fixed width
  • AudioCaptureDevice.IsDefaultDevice is just plain broken
  • AudioCaptureDevice isn’t equatable
  • AudioCaptureDevice.SupportedFormats contains duplicates

If someone handed me code like this for such a wide-reaching project as Silverlight and told me it was done, I’d fire them. That it actually made it to production baffles me; somewhere a manager completely failed at their job. I’m really not sure how you can go this wrong, but the more I explore Silverlight, the more I come to the conclusion that it’s a hack job and that if it wasn’t for WP7 I really wouldn’t bother wasting my time trying to support it for anything. The Mono crew have unified platform support figured out; how is it that Microsoft, with all its resources (or perhaps that’s exactly the reason?), has managed to create such different platforms? I very quickly ran out of votes on Silverlight’s UserVoice page: http://dotnet.uservoice.com/forums/4325-silverlight-feature-suggestions

The complete final code for the device provider is available here: https://gist.github.com/1047203

P.S. If I got this wrong, I’d love to be corrected.
