Anyone who has used professional audio equipment knows that the most widely used driver type is ASIO: most audio interfaces ship with a vendor-specific implementation of the ASIO specification, and the driver is generally used from within a Digital Audio Workstation or another application that is able to stream through this protocol.
ASIO (Audio Stream Input/Output) is a proprietary system developed by Steinberg. Its main aim is low latency: the idea behind it is to completely bypass the Operating System audio path and connect the audio client directly to the hardware.
Starting with Windows Vista (and with improvements in Windows 7), Microsoft introduced the Core Audio APIs: a new core set of user-mode components that give clients safe, convenient and efficient access to the machine's audio system. The strengths of the system are listed on the About page of the API:
- Low-latency, glitch-resilient audio streaming.
- Improved reliability (many audio functions have moved from kernel mode to user mode).
- Improved security (processing of protected audio content takes place in a secure, lower-privilege process).
- Assignment of particular system-wide roles (console, multimedia, and communications) to individual audio devices.
- Software abstraction of the audio endpoint devices (for example, speakers, headphones, and microphones) that the user manipulates directly.
These APIs are used by other higher-level APIs, such as DirectSound, DirectMusic, Windows Multimedia and Media Foundation. A lot more information is available from the Core Audio API link above, so there is no need to copy/paste the whole documentation here. Our target is to understand how the various parts work so that we can stream data directly from memory to an audio endpoint.
The first API we want to play with is the Windows Multimedia Device (MMDevice) API, which can be used to enumerate the audio endpoint devices and gather various information about them. The header to include to work with this API is “Mmdeviceapi.h”, and the first interface to study is IMMDeviceEnumerator, which “provides methods for enumerating multimedia device resources. In the current implementation of the MMDevice API, the only device resources that this interface can enumerate are audio endpoint devices.” The code we’ll execute is as follows:
const CLSID CLSID_MMDeviceEnumerator = __uuidof(MMDeviceEnumerator);
const IID IID_IMMDeviceEnumerator = __uuidof(IMMDeviceEnumerator);
IMMDeviceEnumerator* pEnumerator = NULL;
// Initialize COM on this thread, then create the device enumerator.
HRESULT hr = CoInitializeEx(NULL, COINIT_APARTMENTTHREADED);
hr = CoCreateInstance(CLSID_MMDeviceEnumerator, NULL, CLSCTX_ALL, IID_IMMDeviceEnumerator, (void**)&pEnumerator);
The call to CoInitializeEx is necessary: should you forget it, the subsequent CoCreateInstance will fail and return an error code that reminds you to make it. The call is needed to initialize the COM library on the current thread.
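To make the "error code" idea concrete, here is a minimal, portable sketch of how an HRESULT is read. It is not the Windows SDK: HResult, Succeeded, Failed and FormatHResult are hypothetical stand-ins for HRESULT and the SUCCEEDED/FAILED macros, and 0x800401F0 is the value that winerror.h documents for CO_E_NOTINITIALIZED, which is the code we would expect to see when COM has not been initialized.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for HRESULT: a signed 32-bit value whose top bit marks failure.
using HResult = std::int32_t;

// Values as documented in winerror.h.
constexpr HResult kSOk = 0;
constexpr HResult kCoENotInitialized = static_cast<HResult>(0x800401F0u);

// Mirrors the SUCCEEDED()/FAILED() macros: a non-negative value means success.
constexpr bool Succeeded(HResult hr) { return hr >= 0; }
constexpr bool Failed(HResult hr) { return hr < 0; }

// Renders the code as eight hex digits, the way it usually appears in documentation.
std::string FormatHResult(HResult hr) {
    char buf[11];  // "0x" + 8 hex digits + terminating NUL
    const int written = std::snprintf(buf, sizeof(buf), "0x%08X",
                                      static_cast<unsigned>(hr));
    assert(written == 10);
    (void)written;
    return std::string(buf);
}
```

In real code the same check is simply `if (FAILED(hr)) { ... }` right after each COM call; printing the value in hexadecimal makes it easy to look up.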
Once that is done we can call CoCreateInstance; here’s the signature:
HRESULT CoCreateInstance(
    REFCLSID rclsid,
    LPUNKNOWN pUnkOuter,
    DWORD dwClsContext,
    REFIID riid,
    LPVOID *ppv
);
Description of the function: “Creates a single uninitialized object of the class associated with a specified CLSID” – so:
- it creates an object
- whose class is associated with a specified CLSID
…what’s a CLSID? The Microsoft documentation explains it this way:
“A CLSID is a globally unique identifier that identifies a COM class object. If your server or container allows linking to its embedded objects, you need to register a CLSID for each supported class of objects.” (from the CLSID page)
Let’s first take a look at the parameters expected by the CoCreateInstance function:
- REFCLSID rclsid – this is a const reference to an IID, which in turn is defined as a GUID. In the context of this function it is “the CLSID associated with the data and code that will be used to create the object.” Reference to a class ID ==> rclsid.
- LPUNKNOWN pUnkOuter – optional parameter, a pointer to an IUnknown interface. “If NULL, indicates that the object is not being created as part of an aggregate. If non-NULL, pointer to the aggregate object’s IUnknown interface (the controlling IUnknown)”.
- DWORD dwClsContext – “Context in which the code that manages the newly created object will run. The values are taken from the enumeration CLSCTX”.
- REFIID riid – again defined as a const reference to an IID. Reference to an interface ID ==> riid.
- LPVOID *ppv – address of the pointer variable that receives the interface pointer requested in riid.
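Since both rclsid and riid boil down to a reference to a GUID, it may help to see what such a value looks like. The following is a rough, portable sketch, not the SDK definition: Guid, GuidToString, SameGuid and the two constants are hypothetical names, the struct only mirrors the field layout of the Windows GUID type, and the two values are assumed to match the ones that Mmdeviceapi.h attaches to the MMDeviceEnumerator class and the IMMDeviceEnumerator interface.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in mirroring the field layout of the Windows GUID struct (128 bits).
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};

// Assumed to match the identifiers declared in Mmdeviceapi.h.
const Guid kClsidMMDeviceEnumerator = {0xBCDE0395, 0xE52F, 0x467C,
    {0x8E, 0x3D, 0xC4, 0x57, 0x92, 0x91, 0x69, 0x2E}};
const Guid kIidIMMDeviceEnumerator = {0xA95664D2, 0x9614, 0x4F35,
    {0xA7, 0x46, 0xDE, 0x8D, 0xB6, 0x36, 0x17, 0xE6}};

// Formats a Guid in the usual 8-4-4-4-12 hexadecimal form.
std::string GuidToString(const Guid& g) {
    char buf[37];  // 36 characters plus the terminating NUL
    const int written = std::snprintf(buf, sizeof(buf),
        "%08X-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X",
        static_cast<unsigned>(g.data1), g.data2, g.data3,
        g.data4[0], g.data4[1], g.data4[2], g.data4[3],
        g.data4[4], g.data4[5], g.data4[6], g.data4[7]);
    assert(written == 36);
    (void)written;
    return std::string(buf);
}

// Two GUIDs identify the same class or interface only when all 128 bits match.
bool SameGuid(const Guid& a, const Guid& b) {
    return a.data1 == b.data1 && a.data2 == b.data2 && a.data3 == b.data3 &&
           std::memcmp(a.data4, b.data4, sizeof(a.data4)) == 0;
}
```

The class ID and the interface ID are two different values: the first tells COM which object to create, the second which interface on that object we want a pointer to.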
Since we want an MMDeviceEnumerator object and the interface that will be used to communicate with it, we first have to retrieve the CLSID of the class and the IID of the interface; to do that we can use the __uuidof() keyword, a Microsoft-specific extension that retrieves the GUID attached to a given expression; these two statements:
const CLSID CLSID_MMDeviceEnumerator = __uuidof(MMDeviceEnumerator);
const IID IID_IMMDeviceEnumerator = __uuidof(IMMDeviceEnumerator);
are used to retrieve the GUIDs of the MMDeviceEnumerator class and of the IMMDeviceEnumerator interface, which are passed as the first and fourth parameters. The result of the call is stored through the ppv parameter, which receives a pointer to the requested interface on the instance created by CoCreateInstance. As the documentation states, this is not the only way to get an instance of a given class, but it’s definitely the simplest; an alternative and more efficient way, in case multiple instances are needed, is to get the class factory and then use it to create the objects:
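IClassFactory* pCF = NULL;
HRESULT hresult = CoGetClassObject(rclsid, dwClsContext, NULL, IID_IClassFactory, (void**)&pCF);
if (SUCCEEDED(hresult)) {
    hresult = pCF->CreateInstance(pUnkOuter, riid, ppvObj); // repeat for each instance needed
    pCF->Release();
}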
As a final step, we’ll try to retrieve the name of the “default” output device of the machine; to do so we’ll use the GetDefaultAudioEndpoint method of IMMDeviceEnumerator to get a pointer to an IMMDevice, which is the interface that encapsulates the features of a multimedia device resource. The IMMDevice method OpenPropertyStore() can then be used to retrieve the property store through an IPropertyStore interface:
IMMDevice* pDevice;
IPropertyStore* pProperties;
hr = pEnumerator->GetDefaultAudioEndpoint(EDataFlow::eRender, ERole::eMultimedia, &pDevice);
hr = pDevice->OpenPropertyStore(STGM_READ, &pProperties);
The IPropertyStore interface can finally be used to browse the properties of an audio endpoint device. As a first step we’ll read the “friendly name” of the device: we’ll include the “Functiondiscoverykeys_devpkey.h” and “propvarutil.h” headers and add “propsys.lib” to the linker’s input libraries:
PROPVARIANT varName;
WCHAR szDeviceName[128];
PropVariantInit(&varName);
pProperties->GetValue(PKEY_Device_FriendlyName, &varName);
PropVariantToString(varName, szDeviceName, 128);
std::wcout << szDeviceName;
PropVariantClear(&varName); // free the string that GetValue allocated
Running this prints the friendly name of the default output device to the console.
Summary
The MMDevice API can be used to discover the audio endpoint devices available in the system, determine their capabilities and create driver instances for those devices; Mmdeviceapi.h is the header that defines these interfaces.
Interface | Description |
IMMDevice | Represents an audio device |
IMMDeviceCollection | Represents a collection of audio devices |
IMMDeviceEnumerator | Provides methods for enumerating audio devices. |
IMMEndpoint | Represents an audio endpoint device |
IMMNotificationClient | Provides notifications when an audio endpoint device is added or removed, when the state or properties of a device change, or when there is a change in the default role assigned to a device |
Full code:
#include "Mmdeviceapi.h"
#include "Functiondiscoverykeys_devpkey.h"
#include "propvarutil.h"
#include <iostream>
#include <string>
void ShowDefaultAudioEndpoint()
{
    const CLSID CLSID_MMDeviceEnumerator = __uuidof(MMDeviceEnumerator);
    const IID IID_IMMDeviceEnumerator = __uuidof(IMMDeviceEnumerator);
    // Initialize COM on this thread before creating any COM object.
    HRESULT hr = CoInitializeEx(NULL, COINIT_APARTMENTTHREADED);
    if (FAILED(hr)) return;
    IMMDeviceEnumerator* pEnumerator = NULL;
    IMMDevice* pDevice = NULL;
    IPropertyStore* pProperties = NULL;
    hr = CoCreateInstance(CLSID_MMDeviceEnumerator, NULL, CLSCTX_ALL, IID_IMMDeviceEnumerator, (void**)&pEnumerator);
    // Default render (output) endpoint for the multimedia role.
    if (SUCCEEDED(hr)) hr = pEnumerator->GetDefaultAudioEndpoint(EDataFlow::eRender, ERole::eMultimedia, &pDevice);
    if (SUCCEEDED(hr)) hr = pDevice->OpenPropertyStore(STGM_READ, &pProperties);
    if (SUCCEEDED(hr))
    {
        PROPVARIANT varName;
        WCHAR szDeviceName[128];
        PropVariantInit(&varName);
        hr = pProperties->GetValue(PKEY_Device_FriendlyName, &varName);
        if (SUCCEEDED(hr) && SUCCEEDED(PropVariantToString(varName, szDeviceName, 128)))
            std::wcout << szDeviceName << std::endl;
        PropVariantClear(&varName); // free the string that GetValue allocated
    }
    // Release whatever was acquired, in reverse order, then leave COM.
    if (pProperties) pProperties->Release();
    if (pDevice) pDevice->Release();
    if (pEnumerator) pEnumerator->Release();
    CoUninitialize();
}
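The function ends with explicit Release() calls, one per interface pointer. These are easy to miss once early returns are added. A common way around this is a small owning wrapper that calls Release() when it goes out of scope, which is the idea behind real helpers such as Microsoft::WRL::ComPtr and ATL's CComPtr. The sketch below only illustrates the mechanism and compiles anywhere: FakeUnknown and ScopedRelease are hypothetical names, and unlike a real COM object the stand-in does not delete itself when its count reaches zero.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM object: it only counts references.
class FakeUnknown {
public:
    unsigned long AddRef() { return ++refs_; }
    unsigned long Release() {
        assert(refs_ > 0);  // releasing more often than acquiring is a bug
        return --refs_;
    }
    unsigned long RefCount() const { return refs_; }
private:
    unsigned long refs_ = 1;  // a new object is handed out holding one reference
};

// Minimal owning pointer: takes over one reference and releases it exactly once.
template <typename T>
class ScopedRelease {
public:
    explicit ScopedRelease(T* p = nullptr) : p_(p) {}
    ~ScopedRelease() { if (p_) p_->Release(); }
    ScopedRelease(const ScopedRelease&) = delete;             // no accidental double release
    ScopedRelease& operator=(const ScopedRelease&) = delete;
    T* operator->() const { return p_; }
    T* get() const { return p_; }
private:
    T* p_;
};
```

With such a wrapper around pEnumerator, pDevice and pProperties, every exit path of the function releases what it acquired without extra code.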