Windows Core Audio API

Anyone who has used advanced audio equipment knows that the most widely used driver is ASIO: quite commonly, each audio interface ships with a proprietary implementation of the ASIO specification, and the driver is generally used from within a Digital Audio Workstation or another application able to stream through this protocol.

ASIO (Audio Stream Input/Output) is a proprietary system produced by Steinberg whose main aim is low latency: the idea behind it is to bypass the Operating System audio path completely and connect the audio client directly to the hardware.

Starting with Windows Vista (and improved in Windows 7), Microsoft introduced the Core Audio API: a new set of user-mode components that give clients safe, convenient and efficient access to the machine's audio system. The strengths of the system are listed on the API's “About” page:

  • Low-latency, glitch-resilient audio streaming.
  • Improved reliability (many audio functions have moved from kernel mode to user mode).
  • Improved security (processing of protected audio content takes place in a secure, lower-privilege process).
  • Assignment of particular system-wide roles (console, multimedia, and communications) to individual audio devices.
  • Software abstraction of the audio endpoint devices (for example, speakers, headphones, and microphones) that the user manipulates directly.

These APIs are used by other higher-level APIs, such as DirectSound, DirectMusic, Windows Multimedia and Media Foundation. A lot more information is available in the Core Audio API documentation, so there is no need to copy/paste it here. Our goal is to understand how the various parts work, so that we can stream data directly from memory to an audio endpoint.

The first API we want to play with is the Windows Multimedia Device (MMDevice) API, which can be used to enumerate the audio endpoint devices and gather information about them. The header to include is “Mmdeviceapi.h”, and the first interface to study is IMMDeviceEnumerator, which “provides methods for enumerating multimedia device resources. In the current implementation of the MMDevice API, the only device resources that this interface can enumerate are audio endpoint devices.” The code we’ll execute is as follows:

const CLSID CLSID_MMDeviceEnumerator = __uuidof(MMDeviceEnumerator);
const IID IID_IMMDeviceEnumerator = __uuidof(IMMDeviceEnumerator);
IMMDeviceEnumerator* pEnumerator = NULL;

// Initialize the COM library on this thread before creating any COM object.
HRESULT hr = CoInitializeEx(NULL, COINIT_APARTMENTTHREADED);
if (SUCCEEDED(hr))
	hr = CoCreateInstance(CLSID_MMDeviceEnumerator, NULL, CLSCTX_ALL, IID_IMMDeviceEnumerator, (void**)&pEnumerator);

The call to CoInitializeEx is necessary because it initializes the COM library on the calling thread: should you forget it, the subsequent CoCreateInstance will fail with CO_E_NOTINITIALIZED, an error code that reminds you to do so.
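Since every COM call in this article returns an HRESULT, it may help to see how such a value is interpreted. What follows is only a minimal, portable sketch: the HResult alias, the k-prefixed constants and both helper functions are hypothetical stand-ins defined here so that the snippet compiles outside Windows. In real code you would use HRESULT and the SUCCEEDED/FAILED macros from the Windows SDK. The numeric values are the documented ones for S_OK, S_FALSE, RPC_E_CHANGED_MODE and CO_E_NOTINITIALIZED.

```cpp
#include <cstdint>
#include <string>

// Stand-in for the Windows HRESULT type (a 32-bit signed integer).
using HResult = std::int32_t;

// Documented SDK values, redefined here under hypothetical names.
constexpr HResult kS_OK                = 0;
constexpr HResult kS_FALSE             = 1;
constexpr HResult kRPC_E_CHANGED_MODE  = static_cast<HResult>(0x80010106u);
constexpr HResult kCO_E_NOTINITIALIZED = static_cast<HResult>(0x800401F0u);

// Same rule as the SUCCEEDED macro: the sign bit is clear on success.
constexpr bool Succeeded(HResult hr) { return hr >= 0; }

// Maps the results relevant to COM initialization to a readable message.
std::string DescribeComResult(HResult hr)
{
    switch (hr) {
    case kS_OK:                return "COM initialized on this thread";
    case kS_FALSE:             return "COM was already initialized on this thread";
    case kRPC_E_CHANGED_MODE:  return "COM already initialized with a different threading model";
    case kCO_E_NOTINITIALIZED: return "CoInitializeEx has not been called";
    default:                   return Succeeded(hr) ? "other success code" : "other failure code";
    }
}
```

Note that S_FALSE counts as a success: CoInitializeEx returns it when COM was already initialized on the thread, and that call still has to be balanced by a CoUninitialize.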

Once that is done, we can call CoCreateInstance; here’s its signature:

HRESULT CoCreateInstance(
  REFCLSID  rclsid,
  LPUNKNOWN pUnkOuter,
  DWORD     dwClsContext,
  REFIID    riid,
  LPVOID    *ppv
);

Description of the function: “Creates a single uninitialized object of the class associated with a specified CLSID” – so:

  • it creates an object
  • whose class is associated with a specified CLSID

…what’s a CLSID? The Microsoft documentation explains it this way:

A CLSID is a globally unique identifier that identifies a COM class object. If your server or container allows linking to its embedded objects, you need to register a CLSID for each supported class of objects.

from the CLSID page.
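To make this concrete: a CLSID is just a 128-bit GUID with a fixed four-field layout. The portable sketch below mirrors that layout and prints a value in the usual registry form. The Guid struct and the GuidToString helper are hypothetical names introduced for illustration; the value itself is, to my knowledge, the one that mmdeviceapi.h attaches to the MMDeviceEnumerator class, i.e. what __uuidof(MMDeviceEnumerator) yields.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Portable mirror of the Windows GUID layout (hypothetical name).
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};

// CLSID of the MMDeviceEnumerator class, as declared in mmdeviceapi.h.
constexpr Guid kMMDeviceEnumeratorClsid = {
    0xBCDE0395u, 0xE52F, 0x467C,
    { 0x8E, 0x3D, 0xC4, 0x57, 0x92, 0x91, 0x69, 0x2E }
};

// Formats a GUID in registry style: {XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}.
std::string GuidToString(const Guid& g)
{
    char buf[39]; // 38 characters plus the terminating NUL
    std::snprintf(buf, sizeof buf,
        "{%08lX-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X}",
        static_cast<unsigned long>(g.data1),
        static_cast<unsigned>(g.data2), static_cast<unsigned>(g.data3),
        static_cast<unsigned>(g.data4[0]), static_cast<unsigned>(g.data4[1]),
        static_cast<unsigned>(g.data4[2]), static_cast<unsigned>(g.data4[3]),
        static_cast<unsigned>(g.data4[4]), static_cast<unsigned>(g.data4[5]),
        static_cast<unsigned>(g.data4[6]), static_cast<unsigned>(g.data4[7]));
    return std::string(buf);
}
```

An interface ID (IID) has exactly the same shape; only the meaning differs: a CLSID names a class to instantiate, an IID names the interface you want to talk to it through.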

Let’s take a look first at the parameters expected by the CoCreateInstance function:

  • REFCLSID rclsid – in C++ this is a const reference to an IID, which in turn is defined as a GUID (CLSID and IID share the same underlying type). In the context of this function it is “the CLSID associated with the data and code that will be used to create the object.” Reference to a class ID ==> rclsid.
  • LPUNKNOWN pUnkOuter – optional parameter, a pointer to an IUnknown interface. “If NULL, indicates that the object is not being created as part of an aggregate. If non-NULL, pointer to the aggregate object’s IUnknown interface (the controlling IUnknown)”.
  • DWORD dwClsContext – “Context in which the code that manages the newly created object will run. The values are taken from the enumeration CLSCTX”.
  • REFIID riid – again defined as a const reference to an IID; it identifies the interface used to communicate with the object. Reference to an interface ID ==> riid.
  • LPVOID *ppv – Address of pointer variable that receives the interface pointer requested in riid.

Since we want an MMDeviceEnumerator object and the interface that will be used to communicate with it, we first have to retrieve the CLSID of that class and the IID of that interface. To do that we can use the Microsoft-specific __uuidof() operator, which retrieves the GUID attached to a given type or expression; these two lines:

const CLSID CLSID_MMDeviceEnumerator = __uuidof(MMDeviceEnumerator);
const IID IID_IMMDeviceEnumerator = __uuidof(IMMDeviceEnumerator);

are used to retrieve the GUIDs of the MMDeviceEnumerator class and the IMMDeviceEnumerator interface, which are passed as the first and fourth parameters. On success, CoCreateInstance stores the requested interface pointer in the variable whose address is passed as ppv (here pEnumerator). As the documentation states, this is not the only way to get an instance of a class, but it is definitely the simplest; when multiple instances are needed, a more efficient alternative is to get the class factory once and then use it to create the objects:

CoGetClassObject(rclsid, dwClsContext, NULL, IID_IClassFactory, &pCF);
hresult = pCF->CreateInstance(pUnkOuter, riid, ppvObj);
pCF->Release();

As a final step, we’ll try to retrieve the name of the “default” output device of the machine. To do so we’ll use the GetDefaultAudioEndpoint method of IMMDeviceEnumerator to get a pointer to an IMMDevice, the interface that encapsulates the generic features of a multimedia device resource. The IMMDevice method OpenPropertyStore() can then be used to access the device’s property store through an IPropertyStore interface:

IMMDevice* pDevice;
IPropertyStore* pProperties;
hr = pEnumerator->GetDefaultAudioEndpoint(EDataFlow::eRender, ERole::eMultimedia, &pDevice);
hr = pDevice->OpenPropertyStore(STGM_READ, &pProperties);

The IPropertyStore interface can finally be used to browse the properties of an audio endpoint. As a first step, we’ll read the “friendly name” of the device: for that we include the “Functiondiscoverykeys_devpkey.h” and “propvarutil.h” headers and add “propsys.lib” to the linker’s input libraries:

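PROPVARIANT varName;
WCHAR szDeviceName[128];
PropVariantInit(&varName);
hr = pProperties->GetValue(PKEY_Device_FriendlyName, &varName);
if (SUCCEEDED(hr) && SUCCEEDED(PropVariantToString(varName, szDeviceName, ARRAYSIZE(szDeviceName))))
	std::wcout << szDeviceName << std::endl;
PropVariantClear(&varName); // frees the string allocated by GetValue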

This is the result:

[Screenshot: default output device browsed through the Core Audio API]

Summary

The MMDevice API can be used to discover the audio endpoints available in the system, determine their capabilities and create driver instances for those devices; Mmdeviceapi.h is the header that defines its interfaces:

Interface               Description
IMMDevice               Represents an audio device.
IMMDeviceCollection     Represents a collection of audio devices.
IMMDeviceEnumerator     Provides methods for enumerating audio devices.
IMMEndpoint             Represents an audio endpoint device.
IMMNotificationClient   Provides notifications when an audio endpoint device is added or removed, when the state or properties of a device change, or when there is a change in the default role assigned to a device.
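The same interfaces can also list every endpoint instead of only the default one. The following is a hedged sketch rather than a reference implementation: ListRenderEndpointNames is a hypothetical helper name, the Windows branch assumes an MSVC-style compiler (for __uuidof) and linking against ole32.lib, and the non-Windows branch is only a stub so the snippet compiles elsewhere. It relies on IMMDeviceEnumerator::EnumAudioEndpoints, IMMDeviceCollection::GetCount and IMMDeviceCollection::Item, and reads the friendly name directly from the PROPVARIANT.

```cpp
#include <string>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#include <mmdeviceapi.h>
#include <functiondiscoverykeys_devpkey.h>

// Returns the friendly names of all active render (output) endpoints.
std::vector<std::wstring> ListRenderEndpointNames()
{
    std::vector<std::wstring> names;
    if (FAILED(CoInitializeEx(NULL, COINIT_APARTMENTTHREADED)))
        return names;

    IMMDeviceEnumerator* pEnumerator = NULL;
    IMMDeviceCollection* pDevices = NULL;
    UINT count = 0;

    HRESULT hr = CoCreateInstance(__uuidof(MMDeviceEnumerator), NULL, CLSCTX_ALL,
                                  __uuidof(IMMDeviceEnumerator), (void**)&pEnumerator);
    if (SUCCEEDED(hr))
        hr = pEnumerator->EnumAudioEndpoints(eRender, DEVICE_STATE_ACTIVE, &pDevices);
    if (SUCCEEDED(hr))
        hr = pDevices->GetCount(&count);

    for (UINT i = 0; SUCCEEDED(hr) && i < count; ++i) {
        IMMDevice* pDevice = NULL;
        IPropertyStore* pProps = NULL;
        if (SUCCEEDED(pDevices->Item(i, &pDevice)) &&
            SUCCEEDED(pDevice->OpenPropertyStore(STGM_READ, &pProps))) {
            PROPVARIANT v;
            PropVariantInit(&v);
            // The friendly name is stored as a wide string (VT_LPWSTR).
            if (SUCCEEDED(pProps->GetValue(PKEY_Device_FriendlyName, &v)) && v.vt == VT_LPWSTR)
                names.push_back(v.pwszVal);
            PropVariantClear(&v);
        }
        if (pProps) pProps->Release();
        if (pDevice) pDevice->Release();
    }

    if (pDevices) pDevices->Release();
    if (pEnumerator) pEnumerator->Release();
    CoUninitialize();
    return names;
}
#else
// Stub for non-Windows builds: the MMDevice API is not available there.
std::vector<std::wstring> ListRenderEndpointNames()
{
    return std::vector<std::wstring>();
}
#endif
```

Passing eCapture instead of eRender would list the input endpoints, and a different DEVICE_STATE_* mask would include disabled or unplugged devices.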

Full code:

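#include "Mmdeviceapi.h"
#include "Functiondiscoverykeys_devpkey.h"
#include "propvarutil.h"
#include <iostream>
#include <string>

// Prints the friendly name of the default multimedia render (output) endpoint.
void ShowDefaultAudioEndpoint()
{
	const CLSID CLSID_MMDeviceEnumerator = __uuidof(MMDeviceEnumerator);
	const IID IID_IMMDeviceEnumerator = __uuidof(IMMDeviceEnumerator);

	// COM must be initialized on this thread before any object is created.
	HRESULT hr = CoInitializeEx(NULL, COINIT_APARTMENTTHREADED);
	if (FAILED(hr))
		return;

	IMMDeviceEnumerator* pEnumerator = NULL;
	IMMDevice* pDevice = NULL;
	IPropertyStore* pProperties = NULL;

	hr = CoCreateInstance(CLSID_MMDeviceEnumerator, NULL, CLSCTX_ALL, IID_IMMDeviceEnumerator, (void**)&pEnumerator);
	if (SUCCEEDED(hr))
		hr = pEnumerator->GetDefaultAudioEndpoint(EDataFlow::eRender, ERole::eMultimedia, &pDevice);
	if (SUCCEEDED(hr))
		hr = pDevice->OpenPropertyStore(STGM_READ, &pProperties);

	if (SUCCEEDED(hr))
	{
		PROPVARIANT varName;
		WCHAR szDeviceName[128];
		PropVariantInit(&varName);
		hr = pProperties->GetValue(PKEY_Device_FriendlyName, &varName);
		if (SUCCEEDED(hr) && SUCCEEDED(PropVariantToString(varName, szDeviceName, ARRAYSIZE(szDeviceName))))
			std::wcout << szDeviceName << std::endl;
		PropVariantClear(&varName); // frees the string allocated by GetValue
	}

	// Release the interfaces in reverse order of acquisition, then leave COM.
	if (pProperties) pProperties->Release();
	if (pDevice) pDevice->Release();
	if (pEnumerator) pEnumerator->Release();
	CoUninitialize();
}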
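The Release() calls at the end of the listing are easy to forget once early returns are added. One possible way to automate them is to let a smart pointer give the reference back. The sketch below is portable and purely illustrative: ComRelease, ComHolder and FakeUnknown are hypothetical names, and FakeUnknown only stands in for a real interface such as IMMDevice so that the idea can be tried without the Windows SDK.

```cpp
#include <memory>

// Deleter that gives back a COM reference instead of calling delete.
struct ComRelease {
    template <typename T>
    void operator()(T* p) const { if (p) p->Release(); }
};

// A unique_ptr that calls Release() when it goes out of scope.
template <typename T>
using ComHolder = std::unique_ptr<T, ComRelease>;

// Stand-in object for illustration only; it starts with one reference,
// like an interface pointer freshly returned by a COM call.
struct FakeUnknown {
    unsigned long refs = 1;
    unsigned long Release() { return --refs; }
};
```

With the real API the pattern would be to receive the raw pointer as before (for example from GetDefaultAudioEndpoint) and immediately hand it to a ComHolder<IMMDevice>. The Windows SDK also ships ready-made alternatives such as Microsoft::WRL::ComPtr and ATL's CComPtr, which are probably the better choice in production code.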
