XAudio2 Tutorial 6

Author: Jay Tennant

A Brief Look at XAudio2: Introducing Submix Voices, Controlling Volume

XAudio2 is a sound API available on Windows Vista/7+ and the Xbox 360. This tutorial briefly demonstrates how to connect Submix Voices to the audio graph and use them to control the volume of two separate groups: "Action" sounds and "Music" sounds. Readers should be at least intermediate-level C++ programmers. Experience with DirectShow is beneficial, though not required. Moderate familiarity with Win32 programming is required. Previous experience with DirectSound will not help as much as you may think. Sorry. :(

In this series, we use the rule: code first, ask questions later. The demo wave files are available here and at the end. So here is the code:

//by Jay Tennant 3/28/12
//A Brief Look at XAudio2: Intro to Submix voices
//demonstrates using submix voices to control collective volume
//win32developer.com
//this code provided free, as in public domain; score!

#include <windows.h>
#include <tchar.h>
#include <xaudio2.h>
#include "staticWave.h"

//XAudio2 objects
IXAudio2* g_engine = NULL;
IXAudio2MasteringVoice* g_master = NULL;

//play event
HANDLE g_hPlayEvent = NULL;

//custom window creation
bool createCustomWindow( IXAudio2SubmixVoice* ActionGroup, IXAudio2SubmixVoice* MusicGroup );

int WINAPI WinMain( HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nShowCmd )
{
	//required by XAudio2
	CoInitializeEx( NULL, COINIT_MULTITHREADED );

	//create the engine
	if( FAILED( XAudio2Create( &g_engine ) ) )
	{
		CoUninitialize();
		return -1;
	}

	//create the mastering voice
	if( FAILED( g_engine->CreateMasteringVoice( &g_master ) ) )
	{
		g_engine->Release();
		CoUninitialize();
		return -2;
	}

	//load a few sound files
	StaticWave sfx, song;
	if( !sfx.load( TEXT("sfx.wav") ) || !song.load( TEXT("humoresqueClip.wav") ) )
	{
		g_engine->Release();
		CoUninitialize();
		return -3;
	}

	//create source voice for the buffer
	IXAudio2SourceVoice* actionSource[5] = {NULL}, *musicSource = NULL;
	for(int i = 0; i < 5; i++)
	{
		if( FAILED( g_engine->CreateSourceVoice( actionSource + i, sfx.wf() ) ) )
		{
			g_engine->Release();
			CoUninitialize();
			return -4;
		}
	}
	if( FAILED( g_engine->CreateSourceVoice( &musicSource, song.wf() ) ) )
	{
		g_engine->Release();
		CoUninitialize();
		return -4;
	}

	//create submix voice for each group (an action sound group, and background music group)
	IXAudio2SubmixVoice* actionGroup = NULL, *musicGroup = NULL;
	XAUDIO2_VOICE_DETAILS sd1 = {0}, sd2 = {0};
	actionSource[0]->GetVoiceDetails( &sd1 );
	musicSource->GetVoiceDetails( &sd2 );

	if( FAILED( g_engine->CreateSubmixVoice( &actionGroup, sd1.InputChannels, sd1.InputSampleRate ) ) || 
		FAILED( g_engine->CreateSubmixVoice( &musicGroup, sd2.InputChannels, sd2.InputSampleRate ) ) )
	{
		g_engine->Release();
		CoUninitialize();
		return -5;
	}

	//prepare the voice sends structure
	XAUDIO2_SEND_DESCRIPTOR sendDesc = {0};
	sendDesc.Flags = 0;
	sendDesc.pOutputVoice = actionGroup;

	XAUDIO2_VOICE_SENDS voiceSends = {0};
	voiceSends.SendCount = 1;
	voiceSends.pSends = &sendDesc;

	//set the output voice of the source to the submix
	for(int i = 0; i < 5; i++)
		actionSource[i]->SetOutputVoices( &voiceSends );

	//do the same for the other submix group
	sendDesc.pOutputVoice = musicGroup;
	musicSource->SetOutputVoices( &voiceSends );

	//start consuming audio
	for(int i = 0; i < 5; i++)
		actionSource[i]->Start();
	musicSource->Start();

	//create the play event
	g_hPlayEvent = CreateEvent( NULL, FALSE, FALSE, NULL );

	//create custom window, passing our two submix groups
	if( !createCustomWindow( actionGroup, musicGroup ) )
	{
		g_engine->Release();
		CloseHandle( g_hPlayEvent );
		CoUninitialize();
		return -6;
	}

	//used with music
	XAUDIO2_VOICE_STATE voiceState = {0};

	//message loop
	bool quitting = false;
	MSG msg;
	while( !quitting )
	{
		if( PeekMessage( &msg, NULL, 0, 0, PM_REMOVE ) )
		{
			TranslateMessage( &msg );
			DispatchMessage( &msg );
			if( msg.message == WM_QUIT )
				quitting = true;
		}

		//if the play event occurs, play the music and action sounds
		if( 0 == WaitForSingleObject( g_hPlayEvent, 0 ) )
		{
			//only queue the music if no music buffer is currently queued or playing
			musicSource->GetState( &voiceState );
			if( voiceState.BuffersQueued == 0 )
				musicSource->SubmitSourceBuffer( song.buffer() );

			//play all action sounds, with 1/2 second delay
			for(int i = 0; i < 5; i++)
			{
				actionSource[i]->SubmitSourceBuffer( sfx.buffer() );
				Sleep( 500 );
			}
		}
	}

	//flush the buffers
	for(int i = 0; i < 5; i++)
	{
		actionSource[i]->Stop();
		actionSource[i]->FlushSourceBuffers();
	}
	musicSource->Stop();
	musicSource->FlushSourceBuffers();

	//release the engine, cleanup
	g_engine->Release();
	CloseHandle( g_hPlayEvent );
	CoUninitialize();

	return 0;
}

//structure handed to the window through CREATESTRUCT::lpCreateParams
struct SubmixVoiceGroups
{
	IXAudio2SubmixVoice* action;
	IXAudio2SubmixVoice* music;
};

#define ID_BUTTON_APPLY 101
#define ID_BUTTON_PLAY 102

//custom window procedure
LRESULT CALLBACK customWindowProc( HWND hWnd, UINT Msg, WPARAM wParam, LPARAM lParam )
{
	static HWND m_hActionText = NULL, m_hMusicText = NULL;
	static SubmixVoiceGroups m_svg = {0};

	switch( Msg )
	{
	case WM_CREATE:
		//copy the svg over
		m_svg = *reinterpret_cast<SubmixVoiceGroups*>( reinterpret_cast<CREATESTRUCT*>(lParam)->lpCreateParams );

		//create the controls
		CreateWindowEx( 0, TEXT("STATIC"), TEXT("Action volume (0-127):"), WS_CHILD | WS_VISIBLE | SS_SIMPLE, 10, 10, 150, 20, hWnd, NULL, GetModuleHandle(NULL), NULL );
		m_hActionText = CreateWindowEx( 0, TEXT("EDIT"), TEXT("127"), WS_CHILD | WS_VISIBLE | ES_NUMBER, 10, 30, 50, 20, hWnd, NULL, GetModuleHandle(NULL), NULL );
		CreateWindowEx( 0, TEXT("STATIC"), TEXT("Music volume (0-127):"), WS_CHILD | WS_VISIBLE | SS_SIMPLE, 10, 50, 150, 20, hWnd, NULL, GetModuleHandle(NULL), NULL );
		m_hMusicText = CreateWindowEx( 0, TEXT("EDIT"), TEXT("127"), WS_CHILD | WS_VISIBLE | ES_NUMBER, 10, 70, 50, 20, hWnd, NULL, GetModuleHandle(NULL), NULL );
		CreateWindowEx( 0, TEXT("BUTTON"), TEXT("Apply"), WS_CHILD | WS_VISIBLE | BS_PUSHBUTTON, 10, 100, 180, 40, hWnd, (HMENU)ID_BUTTON_APPLY, GetModuleHandle(NULL), NULL );
		CreateWindowEx( 0, TEXT("BUTTON"), TEXT("Play"), WS_CHILD | WS_VISIBLE | BS_DEFPUSHBUTTON, 10, 150, 180, 40, hWnd, (HMENU)ID_BUTTON_PLAY, GetModuleHandle(NULL), NULL );
		break;
	case WM_DESTROY:
		PostQuitMessage(0);
		break;
	case WM_COMMAND:
		switch( LOWORD( wParam ) )
		{
		case ID_BUTTON_APPLY:
			if( HIWORD( wParam ) == BN_CLICKED )
			{

				TCHAR buffer[256] = TEXT("");
				unsigned int value = 0;

				//read value in edit control
				//wParam is the buffer size in characters, not bytes
				SendMessage( m_hActionText, WM_GETTEXT, sizeof(buffer) / sizeof(buffer[0]), (LPARAM)buffer );
				value = _ttoi( buffer );
				value = min( value, 127 );

				//apply that value to the group
				m_svg.action->SetVolume( (float)value / 127.0f );

				//reset that value in the edit box
				memset( buffer, 0, sizeof(buffer) );
				_itot( value, buffer, 10 );
				SendMessage( m_hActionText, WM_SETTEXT, -1, (LPARAM)buffer );

				//read value in other edit control
				//wParam is the buffer size in characters, not bytes
				SendMessage( m_hMusicText, WM_GETTEXT, sizeof(buffer) / sizeof(buffer[0]), (LPARAM)buffer );
				value = _ttoi( buffer );
				value = min( value, 127 );

				//apply that value to the group
				m_svg.music->SetVolume( (float)value / 127.0f );

				//reset that value in the edit box
				memset( buffer, 0, sizeof(buffer) );
				_itot( value, buffer, 10 );
				SendMessage( m_hMusicText, WM_SETTEXT, -1, (LPARAM)buffer );
			}
			break;
		case ID_BUTTON_PLAY:
			if( HIWORD( wParam ) == BN_CLICKED )
				SetEvent( g_hPlayEvent );
			break;
		}
		break;
	default:
		return DefWindowProc( hWnd, Msg, wParam, lParam );
	}
	return 0;
}

//create our custom window
bool createCustomWindow( IXAudio2SubmixVoice* ActionGroup, IXAudio2SubmixVoice* MusicGroup )
{
	SubmixVoiceGroups svg = { ActionGroup, MusicGroup };

	WNDCLASSEX wc = {0};
	wc.cbSize = sizeof(wc);
	wc.cbWndExtra = sizeof(svg);
	wc.hbrBackground = (HBRUSH)(COLOR_WINDOW + 1); //system color indices must have 1 added
	wc.hCursor = LoadCursor( NULL, IDC_ARROW );
	wc.hIcon = LoadIcon( NULL, IDI_APPLICATION );
	wc.hIconSm = LoadIcon( NULL, IDI_APPLICATION );
	wc.hInstance = (HINSTANCE)GetModuleHandle( NULL );
	wc.lpfnWndProc = (WNDPROC)customWindowProc;
	wc.lpszClassName = TEXT("test");
	wc.style = CS_HREDRAW | CS_VREDRAW;

	ATOM atom = RegisterClassEx( &wc );
	if( !atom )
		return false;

	HWND hWnd = CreateWindowEx( 0, (LPCTSTR)atom, TEXT("ABLAX: Intro To Submix voices"), WS_OVERLAPPEDWINDOW, 50, 50, 220, 240, NULL, NULL, wc.hInstance, &svg );

	if( hWnd == 0 || hWnd == INVALID_HANDLE_VALUE )
		return false;

	ShowWindow( hWnd, SW_NORMAL );
	UpdateWindow( hWnd );
	return true;
}

And here is the helper waveInfo.h header:

//waveInfo.h
//by Jay Tennant 3/8/12
//loads the information for a wave file using non-buffered disk reads
//win32developer.com
//this code provided free, as in public domain; score!

#ifndef WAVEINFO_H
#define WAVEINFO_H

#include <windows.h>
#include <xaudio2.h>

class WaveInfo
{
private:
	WAVEFORMATEXTENSIBLE m_wf;
	DWORD m_dataOffset;
	DWORD m_dataLength;

protected:
	//looks for the FOURCC chunk, returning -1 on failure
	DWORD findChunk( HANDLE hFile, FOURCC cc, BYTE* memBuffer, DWORD sectorAlignment ) {
		DWORD dwChunkId = 0;
		DWORD dwChunkSize = 0;
		DWORD i = 0; //guaranteed to be always aligned with the sectors, except when done searching
		OVERLAPPED overlapped = {0};
		DWORD sectorOffset = 0;
		DWORD bytesRead = 0;

		bool searching = true;
		while( searching )
		{
			sectorOffset = 0;
			overlapped.Offset = i;
			if( FALSE == ReadFile( hFile, memBuffer, sectorAlignment, &bytesRead, &overlapped ) )
			{
				return -1;
			}

			bool needAnotherRead = false;
			while( searching && !needAnotherRead )
			{
				if( 8 + sectorOffset > sectorAlignment ) //reached the end of our memory buffer
				{
					needAnotherRead = true;
				}
				else if( 8 + sectorOffset > bytesRead ) //reached EOF, and not found a match
				{
					return -1;
				}
				else //looking through the read memory
				{
					dwChunkId = *reinterpret_cast<DWORD*>( memBuffer + sectorOffset );
					dwChunkSize = *reinterpret_cast<DWORD*>( memBuffer + sectorOffset + 4 );

					if( dwChunkId == cc ) //found a match
					{
						searching = false;
						i += sectorOffset;
					}
					else //no match found, add to offset
					{
						dwChunkSize += 8; //add offsets of the chunk id, and chunk size data entries
						dwChunkSize += 1;
						dwChunkSize &= 0xfffffffe; //guarantees WORD padding alignment

						if( i == 0 && sectorOffset == 0 ) //just in case we're at the 'RIFF' chunk; the dwChunkSize here means the entire file size
							sectorOffset += 12;
						else
							sectorOffset += dwChunkSize;
					}
				}
			}

			//if still searching, search the next sector
			if( searching )
			{
				i += sectorAlignment;
			}
		}

		return i;
	}

	//reads a certain amount of data in, returning the number of bytes copied
	DWORD readData( HANDLE hFile, DWORD bytesToRead, DWORD fileOffset, void* pDest, BYTE* memBuffer, DWORD sectorAlignment ) {
		if( bytesToRead == 0 )
			return 0;

		DWORD totalAmountCopied = 0;
		DWORD copyBeginOffset = fileOffset % sectorAlignment;
		OVERLAPPED overlapped = {0};
		bool fetchingData = true;
		DWORD pass = 0;
		DWORD dwNumberBytesRead = 0;

		//while fetching data
		while( fetchingData )
		{
			//calculate the sector to read
			overlapped.Offset = fileOffset - (fileOffset % sectorAlignment) + pass * sectorAlignment;

			//read the amount in; if the read failed, return 0
			if( FALSE == ReadFile( hFile, memBuffer, sectorAlignment, &dwNumberBytesRead, &overlapped ) )
				return 0;

			//if the full buffer was not filled (ie. EOF)
			if( dwNumberBytesRead < sectorAlignment )
			{
				//calculate how much can be copied
				DWORD amountToCopy = 0;
				if( dwNumberBytesRead > copyBeginOffset )
					amountToCopy = dwNumberBytesRead - copyBeginOffset;
				if( totalAmountCopied + amountToCopy > bytesToRead )
					amountToCopy = bytesToRead - totalAmountCopied;

				//copy that amount over
				memcpy( ((BYTE*)pDest) + totalAmountCopied, memBuffer + copyBeginOffset, amountToCopy );

				//add to the total amount copied
				totalAmountCopied += amountToCopy;

				//end the fetching data loop
				fetchingData = false;
			}
			//else
			else
			{
				//calculate how much can be copied
				DWORD amountToCopy = sectorAlignment - copyBeginOffset;
				if( totalAmountCopied + amountToCopy > bytesToRead )
					amountToCopy = bytesToRead - totalAmountCopied;

				//copy that amount over
				memcpy( ((BYTE*)pDest) + totalAmountCopied, memBuffer + copyBeginOffset, amountToCopy );

				//add to the total amount copied
				totalAmountCopied += amountToCopy;

				//set the copyBeginOffset to 0
				copyBeginOffset = 0;
			}

			//if the total amount equals the bytesToRead, end the fetching data loop
			if( totalAmountCopied == bytesToRead )
				fetchingData = false;

			//increment the pass
			pass++;
		}

		//return the total amount copied
		return totalAmountCopied;
	}

public:
	WaveInfo( LPCTSTR szFile = NULL ) : m_dataOffset(0), m_dataLength(0) {
		memset( &m_wf, 0, sizeof(m_wf) );
		load( szFile );
	}
	WaveInfo( const WaveInfo& c ) : m_wf(c.m_wf), m_dataOffset(c.m_dataOffset), m_dataLength(c.m_dataLength) {}

	//loads the wave format, offset to the wave data, and length of the wave data;
	//returns true on success, false on failure
	bool load( LPCTSTR szFile ) {
		memset( &m_wf, 0, sizeof(m_wf) );
		m_dataOffset = 0;
		m_dataLength = 0;

		if( szFile == NULL )
			return false;

		//load the file without system caching
		HANDLE hFile = CreateFile( szFile, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_FLAG_NO_BUFFERING, NULL );

		if( hFile == INVALID_HANDLE_VALUE )
			return false;

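		//figure the sector size for reading; bail out if it cannot be determined,
		//since a zero sector size is not a valid alignment for _aligned_malloc
		DWORD dwSectorSize = 0;
		{
			DWORD dw1, dw2, dw3;
			if( !GetDiskFreeSpace( NULL, &dw1, &dwSectorSize, &dw2, &dw3 ) || dwSectorSize == 0 )
			{
				CloseHandle( hFile );
				return false;
			}
		}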

		//allocate the aligned memory buffer, used in finding and reading the chunks in the file
		BYTE *memBuffer = (BYTE*)_aligned_malloc( dwSectorSize, dwSectorSize );
		if( memBuffer == NULL )
		{
			CloseHandle( hFile );
			return false;
		}

		//look for 'RIFF' chunk
		DWORD dwChunkOffset = findChunk( hFile, MAKEFOURCC( 'R', 'I', 'F', 'F' ), memBuffer, dwSectorSize );
		if(dwChunkOffset == -1)
		{
			_aligned_free( memBuffer );
			CloseHandle( hFile );
			return false;
		}

		DWORD riffFormat = 0;
		//inFile.seekg( dwChunkOffset + 8, std::ios::beg );
		//inFile.read( reinterpret_cast<char*>(&riffFormat), sizeof(riffFormat) );
		if( sizeof(DWORD) != readData( hFile, sizeof(riffFormat), dwChunkOffset + 8, &riffFormat, memBuffer, dwSectorSize ) )
		{
			_aligned_free( memBuffer );
			CloseHandle( hFile );
			return false;
		}
		if(riffFormat != MAKEFOURCC('W', 'A', 'V', 'E'))
		{
			_aligned_free( memBuffer );
			CloseHandle( hFile );
			return false;
		}

		//look for 'fmt ' chunk
		dwChunkOffset = findChunk( hFile, MAKEFOURCC( 'f', 'm', 't', ' ' ), memBuffer, dwSectorSize );
		if( dwChunkOffset == -1 )
		{
			_aligned_free( memBuffer );
			CloseHandle( hFile );
			return false;
		}

		//read in first the WAVEFORMATEX structure
		//inFile.seekg( dwChunkOffset + 8, std::ios::beg );
		//inFile.read( reinterpret_cast<char*>(&m_wf.Format), sizeof(m_wf.Format) );
		if( sizeof(m_wf.Format) != readData( hFile, sizeof(m_wf.Format), dwChunkOffset + 8, &m_wf.Format, memBuffer, dwSectorSize ) )
		{
			_aligned_free( memBuffer );
			CloseHandle( hFile );
			return false;
		}
		if( m_wf.Format.cbSize == (sizeof(m_wf) - sizeof(m_wf.Format)) )
		{
			//read in whole WAVEFORMATEXTENSIBLE structure
			//inFile.seekg( dwChunkOffset + 8, std::ios::beg );
			//inFile.read( reinterpret_cast<char*>(&m_wf), sizeof(m_wf) );
			if( sizeof(m_wf) != readData( hFile, sizeof(m_wf), dwChunkOffset + 8, &m_wf, memBuffer, dwSectorSize ) )
			{
				_aligned_free( memBuffer );
				CloseHandle( hFile );
				return false;
			}
		}

		//look for 'data' chunk
		dwChunkOffset = findChunk( hFile, MAKEFOURCC( 'd', 'a', 't', 'a' ), memBuffer, dwSectorSize );
		if(dwChunkOffset == -1)
		{
			_aligned_free( memBuffer );
			CloseHandle( hFile );
			return false;
		}

		//set the offset to the wave data, read in length, then return
		m_dataOffset = dwChunkOffset + 8;
		//inFile.seekg( dwChunkOffset + 4, std::ios::beg );
		//inFile.read( reinterpret_cast<char*>(&m_dataLength), 4 );
		if( sizeof(m_dataLength) != readData( hFile, sizeof(m_dataLength), dwChunkOffset + 4, &m_dataLength, memBuffer, dwSectorSize ) )
		{
			_aligned_free( memBuffer );
			CloseHandle( hFile );
			return false;
		}

		_aligned_free( memBuffer );

		CloseHandle( hFile );

		return true;
	}

	//returns true if the format is WAVEFORMATEXTENSIBLE; false if WAVEFORMATEX
	bool isExtensible() const { return (m_wf.Format.cbSize > 0); }
	//retrieves the WAVEFORMATEX structure
	const WAVEFORMATEX* wf() const { return &m_wf.Format; }
	//retrieves the WAVEFORMATEXTENSIBLE structure; meaningless if the wave is not WAVEFORMATEXTENSIBLE
	const WAVEFORMATEXTENSIBLE* wfex() const { return &m_wf; }
	//gets the offset from the beginning of the file to the actual wave data
	DWORD getDataOffset() const { return m_dataOffset; }
	//gets the length of the wave data
	DWORD getDataLength() const { return m_dataLength; }
};

#endif

And here is the helper staticWave.h header:

//staticWave.h
//by Jay Tennant 3/8/12
//loads PCM wave data into memory as a static buffer
//win32developer.com
//this code provided free, as in public domain; score!

#ifndef STATICWAVE_H
#define STATICWAVE_H

#include "waveInfo.h"

//loads wave data using Windows functions, non-buffered reads
class StaticWave : public WaveInfo
{
private:
	BYTE *m_dataBuffer; //the data buffer, which is the length of the audio data + sector alignment for reading the file
	XAUDIO2_BUFFER m_xaBuffer; //the buffer used by xaudio2
public:
	StaticWave( LPCTSTR szFile = NULL ) : WaveInfo( NULL ), m_dataBuffer( NULL ) {
		memset( &m_xaBuffer, 0, sizeof(m_xaBuffer) );

		load(szFile);
	}
	StaticWave( const StaticWave& c ) : WaveInfo( c ), m_dataBuffer( NULL ), m_xaBuffer( c.m_xaBuffer ) {
		//check whether valid data exists in the source object
		if( c.m_xaBuffer.AudioBytes == 0 || c.m_xaBuffer.pAudioData == NULL )
			return;

		//allocate the buffer (we don't care if it's aligned or not since we're not reading from a file, but we'll set it to 4)
		m_dataBuffer = (BYTE*)_aligned_malloc( m_xaBuffer.AudioBytes, 4 );

		//copy the data over
		memcpy( m_dataBuffer, c.m_xaBuffer.pAudioData, m_xaBuffer.AudioBytes );
		m_xaBuffer.pAudioData = m_dataBuffer;
	}
	~StaticWave() {
		if( m_dataBuffer != NULL )
			_aligned_free( m_dataBuffer );
		m_dataBuffer = NULL;

		WaveInfo::load( NULL );
	}

	//loads the file for wave playback from memory
	bool load( LPCTSTR szFile ) {
		if( m_dataBuffer != NULL )
			_aligned_free( m_dataBuffer );
		m_dataBuffer = NULL;

		memset( &m_xaBuffer, 0, sizeof(m_xaBuffer) );

		//load the wave information, testing whether the file is valid
		if( !WaveInfo::load( szFile ) )
			return false;

		//calculate the sector alignment
		DWORD dw1, dw2, dw3, dwSectorAlignment;
		GetDiskFreeSpace( NULL, &dw1, &dwSectorAlignment, &dw2, &dw3 );

		//calculate how much memory to allocate
		DWORD memoryToAllocate = getDataLength();
		if( memoryToAllocate % dwSectorAlignment )
		{
			//round up to a whole number of sectors, since reads happen in sector-aligned quantities
			memoryToAllocate = memoryToAllocate - (memoryToAllocate % dwSectorAlignment) + dwSectorAlignment;
		}

		//sector aligned padding for the read operation being on a possible file position offset
		memoryToAllocate += dwSectorAlignment;

		//allocate the memory for the read operation
		m_dataBuffer = (BYTE*)_aligned_malloc( memoryToAllocate, dwSectorAlignment );
		if( m_dataBuffer == NULL )
			return false;

		//open the file
		HANDLE hFile = CreateFile( szFile, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_FLAG_NO_BUFFERING, NULL );
		if( hFile == INVALID_HANDLE_VALUE )
			return false;

		//calculate the offset position to read the file at
		DWORD beginOffset = getDataOffset() % dwSectorAlignment;
		OVERLAPPED overlapped = {0};
		overlapped.Offset = getDataOffset() - beginOffset;

		//read the file
		DWORD dwBytesRead = 0;
		if( FALSE == ReadFile( hFile, m_dataBuffer, memoryToAllocate, &dwBytesRead, &overlapped ) )
		{
			CloseHandle( hFile ); //don't leak the file handle on failure
			return false;
		}

		//use the lesser of the data length and the number of bytes read (minus the offset) in setting the xaudio2 data
		if( dwBytesRead < beginOffset )
			dwBytesRead = 0;
		else
			dwBytesRead -= beginOffset;
		m_xaBuffer.AudioBytes = min( dwBytesRead, getDataLength() );

		//set the xaudio2 data to point at the beginning of the offset of the dataBuffer
		m_xaBuffer.pAudioData = m_dataBuffer + beginOffset;
		m_xaBuffer.Flags = XAUDIO2_END_OF_STREAM;

		//close the file
		CloseHandle( hFile );

		//return success
		return true;
	}

	//returns the xaudio2 buffer
	const XAUDIO2_BUFFER* buffer() const {return &m_xaBuffer;}
};

#endif

Again, feel free to use the code, and modify it to fit your needs. I certainly will!

The Hills are Alive...

With the sound of music, and really, really loud blasters! Blast those blasters!

A common need in game applications is the ability to control the volume of the background music and the action sounds separately. Sometimes, one just wants to hear music. And other times, the dialogue may be of greater interest. The same could be true in an audio processing program, where a group of sound tracks is to be controlled with the same volume envelope parameters. Although an application could loop through every source voice and set its volume individually, keeping track of every voice that belongs to a group quickly becomes cumbersome and restrictive.

An easy solution is to process all the source streams normally, and push the output to a different mixing group instead of the master. This mixing group can have one volume envelope to control the output of that mix, and push that processed data to the mastering voice for presentation.

Enter: Submix Voice! This is the mixing group described as the solution. While a simple volume envelope affecting many source mixes can be achieved with this Submix Voice, the scope is much greater, which we will unfold as we start using X3DAudio and XAPO. For this tutorial, we simply control all the action volumes and music volumes separately by using two separate submix groups, adjusting those submix group volumes appropriately.
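
To make the routing concrete, here is a toy model of the idea in plain C++. It is not the XAudio2 API: ToySubmix and mixToMaster are hypothetical names I made up for illustration, and the "voices" are just vectors of samples. It only shows the order of operations assumed above: sum the sources of a group, apply one group volume, then sum the groups at the master.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a submix voice: sums its inputs, then applies one gain.
// This is NOT the XAudio2 API; every name here is hypothetical.
struct ToySubmix
{
	float volume = 1.0f;                      // plays the role of the group volume
	std::vector<std::vector<float>> inputs;   // one block of samples per connected source

	// Mixes all inputs sample by sample, then scales the mix by the group volume.
	std::vector<float> process( std::size_t frames ) const
	{
		std::vector<float> out( frames, 0.0f );
		for( const std::vector<float>& in : inputs )
		{
			assert( in.size() >= frames );
			for( std::size_t i = 0; i < frames; i++ )
				out[i] += in[i];
		}
		for( float& s : out )
			s *= volume;
		return out;
	}
};

// Toy stand-in for the mastering voice: sums the output of both groups.
inline std::vector<float> mixToMaster( const ToySubmix& action, const ToySubmix& music, std::size_t frames )
{
	std::vector<float> a = action.process( frames );
	const std::vector<float> m = music.process( frames );
	for( std::size_t i = 0; i < frames; i++ )
		a[i] += m[i];
	return a;
}
```

Changing action.volume here affects every source in that group at once, without touching the individual sources. That is the whole point of the submix.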

And now, to analyze the source. I will not explain code that has been explained in previous tutorials, so if you need help with events, XAudio2 engine and voice creation, look to those previous tutorials.

StaticWave sfx, song;

Creates, then loads the sound effect and song that feed into the action and music source voices. That helper class makes it very easy to load waves, so feel free to use it! (I know I'm loading an 11MB file in memory for the "song", but it's to simplify the discussion away from streaming voices.)

XAUDIO2_VOICE_DETAILS sd1 = {0}, sd2 = {0};
actionSource[0]->GetVoiceDetails( &sd1 );
musicSource->GetVoiceDetails( &sd2 );

We create two instances of XAUDIO2_VOICE_DETAILS and fill them with the details of the action and music source voices. The structure is defined as:

typedef struct XAUDIO2_VOICE_DETAILS {
    UINT32 CreationFlags;
    UINT32 InputChannels;
    UINT32 InputSampleRate;
} XAUDIO2_VOICE_DETAILS;

We are only interested in the number of channels and sample rate of the source voice. These values are used in the creation of the submix voices that will be receiving the output of the source voices.

if( FAILED( g_engine->CreateSubmixVoice( &actionGroup, sd1.InputChannels, sd1.InputSampleRate ) ) || 
	FAILED( g_engine->CreateSubmixVoice( &musicGroup, sd2.InputChannels, sd2.InputSampleRate ) ) )

CreateSubmixVoice accepts more parameters than the three used here, but we are only interested in the number of input channels and the input sample rate. The remaining parameters have default values that we aren't concerned with right now.

XAUDIO2_SEND_DESCRIPTOR sendDesc = {0};
sendDesc.Flags = 0;
sendDesc.pOutputVoice = actionGroup;

XAUDIO2_VOICE_SENDS voiceSends = {0};
voiceSends.SendCount = 1;
voiceSends.pSends = &sendDesc;

Two structures are introduced here, defined as:

typedef struct XAUDIO2_SEND_DESCRIPTOR {
    UINT32 Flags;
    IXAudio2Voice *pOutputVoice;
} XAUDIO2_SEND_DESCRIPTOR;

typedef struct XAUDIO2_VOICE_SENDS {
    UINT32 SendCount;
    XAUDIO2_SEND_DESCRIPTOR *pSends;
} XAUDIO2_VOICE_SENDS;

The first structure allows us to specify the output voice, which can only be a submix voice or the mastering voice, as well as whether a filter is applied (we'll describe filters in a later tutorial). The second structure allows us to specify a number of output voices. Potentially, then, we could output to a number of submix voices simultaneously. Can you imagine the kind of applications this has? In later tutorials, we'll examine benefits to outputting to multiple voices.

IMPORTANT! As of 3/29/12, the documentation is still incorrect about how to use these structures in the article "How to use Submix Voices". The documentation incorrectly states to set the XAUDIO2_SEND_DESCRIPTOR::Flags value to 1, which is an invalid value. The only valid values are 0 or XAUDIO2_SEND_USEFILTER (which is 0x80 on Windows).
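
If you want to guard against that mistake in your own code, a tiny check along these lines could help. This is only a sketch: isValidSendFlags and kSendUseFilter are hypothetical names, and the constant is a local mirror of XAUDIO2_SEND_USEFILTER (0x80, as noted above) so the snippet compiles without xaudio2.h.

```cpp
#include <cstdint>

// Local mirror of XAUDIO2_SEND_USEFILTER (0x80); defined here only so the
// sketch is self-contained. In real code, use the constant from xaudio2.h.
const std::uint32_t kSendUseFilter = 0x80;

// Hypothetical helper: true only for the two Flags values named as valid above.
inline bool isValidSendFlags( std::uint32_t flags )
{
	return flags == 0 || flags == kSendUseFilter;
}
```

A debug assertion on sendDesc.Flags just before calling SetOutputVoices would catch the bad value of 1 early.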

musicSource->SetOutputVoices( &voiceSends );

This call sets the output voices of the object. This call is valid for both source voices and submix voices. Obviously, since mastering voices do not produce output except through the speakers, this call will fail on mastering voices. Interestingly, it is possible to set "0" output voices. We will demonstrate this in a later tutorial, where we create a file writer object using the XAPO framework.

The rest of the main function is well commented, and its material was covered in previous tutorials. The Win32 code is ordinary control management, simplified for this project.
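
One piece of that control code is worth isolating: the Apply handler maps the 0-127 value from the edit box onto the 0.0-1.0 amplitude passed to SetVolume. A minimal sketch of that arithmetic follows. editValueToVolume is a hypothetical helper, not something in the listing, and unlike the listing (which stores the parsed number in an unsigned int) it clamps negative input to 0.

```cpp
#include <algorithm>

// Hypothetical helper mirroring the Apply handler's arithmetic: clamps a parsed
// integer to the 0-127 range and maps it linearly onto a 0.0-1.0 amplitude.
inline float editValueToVolume( int parsed )
{
	const int clamped = std::max( 0, std::min( parsed, 127 ) );
	return static_cast<float>( clamped ) / 127.0f;
}
```

In the window procedure, this could be used as m_svg.action->SetVolume( editValueToVolume( _ttoi( buffer ) ) ). Remember that the result is a linear amplitude multiplier, so the halfway value does not sound "half as loud".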

One more comment about Submix voices, and sample rates. You may notice that playing source voices with a sample rate of, say, 22050Hz "magically" renders onto a 48000Hz mastering voice. There is a hidden sample rate conversion (SRC) occurring to render it correctly in the output stream. This will start to become an issue when a source voice must output to multiple submix voices, with different sample rate requirements. The problem is that a voice can only have one sample rate for all of its output voices. Therefore, to solve this problem given the current restrictions, multiple submix voices can be used as SRC buffers so that, though the source voice will output one sample rate, multiple simultaneous submix voices can receive that sample rate, convert it to an appropriate target sample rate, and pass that conversion on. All submix voices can perform SRC, but will skip the step if the input and output sample rates are the same.
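
To give a feel for what a sample rate conversion does, here is a naive linear-interpolation resampler. I am not claiming this is how XAudio2 converts internally; resampleLinear is a hypothetical teaching function that only shows the basic idea of producing outRate samples per second from inRate samples per second.

```cpp
#include <cstddef>
#include <vector>

// Naive linear-interpolation resampler for one channel of samples.
// A teaching sketch only; not the converter XAudio2 uses.
inline std::vector<float> resampleLinear( const std::vector<float>& in, unsigned inRate, unsigned outRate )
{
	std::vector<float> out;
	if( in.empty() || inRate == 0 || outRate == 0 )
		return out;

	// number of output frames covering the same duration as the input
	const std::size_t outFrames = static_cast<std::size_t>(
		( static_cast<unsigned long long>( in.size() ) * outRate ) / inRate );
	out.reserve( outFrames );

	for( std::size_t n = 0; n < outFrames; n++ )
	{
		// position of output frame n, measured in input frames
		const double pos = static_cast<double>( n ) * inRate / outRate;
		const std::size_t i0 = static_cast<std::size_t>( pos );
		const std::size_t i1 = ( i0 + 1 < in.size() ) ? i0 + 1 : i0; // hold the last sample at the end
		const float frac = static_cast<float>( pos - static_cast<double>( i0 ) );
		out.push_back( in[i0] + ( in[i1] - in[i0] ) * frac );
	}
	return out;
}
```

When inRate equals outRate, every output sample lands exactly on an input sample and the data passes through unchanged, which matches the intuition for why a voice can skip the step in that case.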

Things to Try

Load multiple, different action sounds to be played into the action submix group.
Load multiple songs to be played into the music submix group.
Programmatically fade between both groups.
Buy Rachmaninoff's performance of his own pieces on Amazon
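
For the fading suggestion, one possible starting point is a simple linear crossfade. FadeGains and crossfade are hypothetical names of my own, and a linear fade is just the simplest choice, not the only one.

```cpp
#include <algorithm>

// Pair of group volumes, intended for the action and music submix voices.
struct FadeGains
{
	float action;
	float music;
};

// Hypothetical helper: linear crossfade. t = 0 is all action, t = 1 is all music;
// values of t outside 0..1 are clamped.
inline FadeGains crossfade( float t )
{
	const float clamped = std::max( 0.0f, std::min( t, 1.0f ) );
	FadeGains g;
	g.action = 1.0f - clamped;
	g.music = clamped;
	return g;
}
```

Step t from 0 to 1 over the fade time (from a timer, say) and pass g.action and g.music to SetVolume on the action and music submix voices.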

Additional Information

Demo wave files. The piano recording is a sample of a live recording I did on 4/7/12 for this demo in particular. The clip is from Sergey Rachmaninoff's Humoresque, Op. 10, No. 5.


Next tutorial

Tutorial 7 - XAudio2: Using filters