XAudio2 Tutorial 8

Author: Jay Tennant

A Brief Look at XAudio2: Adjusting the Frequency Ratio

XAudio2 is a sound API available on Windows Vista, Windows 7 and later, and the Xbox 360. This tutorial briefly demonstrates how to adjust the frequency ratio on source voices. The program produces output that is sped up or slowed down (like an LP playing at the wrong speed).

The target audience is programmers with at least intermediate C++ experience. Moderate familiarity with Win32 programming is required.

In this series, we use the rule: code first, ask questions later. The wave file is available here and at the end. So here is the code:

//by Jay Tennant 4/7/12
//A Brief Look at XAudio2: Adjusting the Frequency Ratio
//demonstrates changing the frequency ratio
//win32developer.com
//this code provided free, as in public domain; score!

#include <windows.h>
#include <tchar.h>
#include <stdio.h>  //_stprintf_s
#include <stdlib.h> //_tstof
#include <xaudio2.h>
#include "staticWave.h"

//XAudio2 objects
IXAudio2* g_engine = NULL;
IXAudio2MasteringVoice* g_master = NULL;
IXAudio2SourceVoice* g_source = NULL;

//custom window creation
void createCustomWindow();

int WINAPI WinMain( HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nShowCmd )
{
	//required by XAudio2
	CoInitializeEx( NULL, COINIT_MULTITHREADED );

	//create the engine
	if( FAILED( XAudio2Create( &g_engine ) ) )
	{
		CoUninitialize();
		return -1;
	}

	//create the mastering voice
	if( FAILED( g_engine->CreateMasteringVoice( &g_master ) ) )
	{
		g_engine->Release();
		CoUninitialize();
		return -2;
	}

	//load a sound effect
	StaticWave sfx;
	if( !sfx.load( TEXT("thisisatest.wav") ) )
	{
		g_engine->Release();
		CoUninitialize();
		return -3;
	}

	//create source voice
	if( FAILED( g_engine->CreateSourceVoice( &g_source, sfx.wf(), 0, XAUDIO2_MAX_FREQ_RATIO ) ) )
	{
		g_engine->Release();
		CoUninitialize();
		return -4;
	}

	//start consuming audio
	g_source->Start();

	//create the custom window
	createCustomWindow();

	//main message loop
	XAUDIO2_VOICE_STATE voiceState;
	MSG msg;
	bool quitting = false;
	while( !quitting )
	{
		if( PeekMessage( &msg, 0, 0, 0, PM_REMOVE ) )
		{
			TranslateMessage( &msg );
			DispatchMessage( &msg );
			if( msg.message == WM_QUIT )
				quitting = true;
		}

		g_source->GetState( &voiceState );
		if( voiceState.BuffersQueued < 1 )
			g_source->SubmitSourceBuffer( sfx.buffer() );
	}

	//flush source buffer
	g_source->Stop();
	g_source->FlushSourceBuffers();

	//release the engine, cleanup
	g_engine->Release();
	CoUninitialize();

	return 0;
}

#define ID_BUTTON_APPLY 101

//custom window procedure
LRESULT CALLBACK customWindowProc( HWND hWnd, UINT Msg, WPARAM wParam, LPARAM lParam )
{
	static HWND m_hFrequencyText = NULL;

	switch( Msg )
	{
	case WM_CREATE:
		{
			//create the controls
			CreateWindowEx( 0, TEXT("STATIC"), TEXT("Frequency Ratio (1.0 normal):"), WS_CHILD | WS_VISIBLE | SS_SIMPLE, 10, 10, 200, 20, hWnd, NULL, GetModuleHandle(NULL), NULL );
			m_hFrequencyText = CreateWindowEx( 0, TEXT("EDIT"), TEXT("1.300"), WS_CHILD | WS_VISIBLE, 10, 30, 50, 20, hWnd, NULL, GetModuleHandle(NULL), NULL );
			CreateWindowEx( 0, TEXT("BUTTON"), TEXT("Apply"), WS_CHILD | WS_VISIBLE | BS_DEFPUSHBUTTON, 10, 60, 180, 40, hWnd, (HMENU)ID_BUTTON_APPLY, GetModuleHandle(NULL), NULL );

			g_source->SetFrequencyRatio( 1.3f );
		} break;
	case WM_DESTROY:
		PostQuitMessage(0);
		break;
	case WM_COMMAND:
		switch( LOWORD( wParam ) )
		{
		case ID_BUTTON_APPLY:
			if( HIWORD( wParam ) == BN_CLICKED )
			{
				TCHAR buffer[16] = TEXT("");
				float f = 0.0f;

				//read value in edit control; WM_GETTEXT takes the buffer size in characters, not bytes
				SendMessage( m_hFrequencyText, WM_GETTEXT, ARRAYSIZE(buffer), (LPARAM)buffer );
				f = (float)_tstof( buffer );
				f = max( f, XAUDIO2_MIN_FREQ_RATIO );
				f = min( f, XAUDIO2_MAX_FREQ_RATIO );

				//write the clamped value back to the edit box
				memset( buffer, 0, sizeof(buffer) );
				_stprintf_s( buffer, TEXT("%.3f"), f );
				SendMessage( m_hFrequencyText, WM_SETTEXT, 0, (LPARAM)buffer ); //wParam is unused

				//apply the frequency ratio to the voice
				g_source->SetFrequencyRatio( f );
			}
			break;
		}
		break;
	default:
		return DefWindowProc( hWnd, Msg, wParam, lParam );
	}
	return 0;
}

//create our custom window
void createCustomWindow()
{
	WNDCLASSEX wc = {0};
	wc.cbSize = sizeof(wc);
	wc.hbrBackground = (HBRUSH)(COLOR_WINDOW + 1); //system color indices need +1 when used as a class brush
	wc.hCursor = LoadCursor( NULL, IDC_ARROW );
	wc.hIcon = LoadIcon( NULL, IDI_APPLICATION );
	wc.hIconSm = LoadIcon( NULL, IDI_APPLICATION );
	wc.hInstance = (HINSTANCE)GetModuleHandle( NULL );
	wc.lpfnWndProc = (WNDPROC)customWindowProc;
	wc.lpszClassName = TEXT("test");
	wc.style = CS_HREDRAW | CS_VREDRAW;

	ATOM atom = RegisterClassEx( &wc );
	if( !atom )
		return;

	HWND hWnd = CreateWindowEx( 0, (LPCTSTR)atom, TEXT("ABLAX: Adjusting Frequency Ratio"), WS_OVERLAPPEDWINDOW, 50, 50, 220, 240, NULL, NULL, wc.hInstance, NULL );

	if( hWnd == NULL ) //CreateWindowEx returns NULL on failure
		return;

	ShowWindow( hWnd, SW_NORMAL );
	UpdateWindow( hWnd );
}

The accompanying headers staticWave.h and waveInfo.h are available in tutorial 6, Streaming a Wave. At a glance, this code is not much different from the previous tutorial's. The differences are highlighted here:

Follow the Yellow Brick Road!

The output generated from this code sounds like the speaker is one of the Munchkins from The Wizard of Oz: the voice is substantially raised in pitch and is faster. The two important pieces to highlight are the source voice creation call and the SetFrequencyRatio() method of the source voice.

g_engine->CreateSourceVoice( &g_source, sfx.wf(), 0, XAUDIO2_MAX_FREQ_RATIO )

The source voice was created with the extra MaxFrequencyRatio argument set to XAUDIO2_MAX_FREQ_RATIO, which has the value 1024.0. The default is XAUDIO2_DEFAULT_FREQ_RATIO, which is 2.0. This argument fixes the upper bound for the lifetime of the voice: if SetFrequencyRatio() is given a value outside the valid range, it clamps the value to between XAUDIO2_MIN_FREQ_RATIO (defined in xaudio2.h as 1/1024.0, roughly 0.001) and the maximum set at source voice creation.
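The clamping described above can be sketched without the XAudio2 headers. This is only a model of the behavior, not the library's implementation: effectiveFrequencyRatio is a hypothetical helper, and the three constants are copied from my reading of xaudio2.h, so check them against your SDK.

```cpp
#include <algorithm>

// Assumed to mirror XAUDIO2_MIN_FREQ_RATIO, XAUDIO2_MAX_FREQ_RATIO and
// XAUDIO2_DEFAULT_FREQ_RATIO; redefined here so the sketch compiles alone.
constexpr float kMinFreqRatio     = 1.0f / 1024.0f;
constexpr float kMaxFreqRatio     = 1024.0f;
constexpr float kDefaultFreqRatio = 2.0f;

// Hypothetical helper: the ratio a voice would end up using when 'requested'
// is passed to SetFrequencyRatio() on a voice that was created with
// 'maxRatioAtCreation' as its MaxFrequencyRatio argument.
inline float effectiveFrequencyRatio( float requested, float maxRatioAtCreation )
{
	return std::min( std::max( requested, kMinFreqRatio ), maxRatioAtCreation );
}
```

With the default maximum of 2.0, a request for 3.0 would be held at 2.0. Creating the voice with the maximum of 1024.0, as this tutorial does, lets the same request through unchanged.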

g_source->SetFrequencyRatio( f );

This call sets the frequency ratio. The frequency ratio is the ratio of the source frequency to the target frequency: source / target. A ratio above 1.0 plays the voice higher and faster, and a ratio below 1.0 plays it lower and slower. The 1.3 applied at startup, for example, plays the clip 1.3 times as fast, so it finishes in about 77% of its normal time.
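To make the "higher and faster" relationship concrete, here is a small sketch of the arithmetic. Both function names are hypothetical. It assumes the ratio simply scales the rate at which source samples are consumed, which is how the paragraph above describes it.

```cpp
// Rate (in samples per second) at which source samples are consumed when
// the voice plays at the given frequency ratio.
inline double effectiveSampleRate( double nativeSampleRate, double ratio )
{
	return nativeSampleRate * ratio;
}

// Wall-clock time needed to play 'sampleCount' samples (per channel) of a
// wave recorded at 'nativeSampleRate' when played at the given ratio.
inline double playbackSeconds( double sampleCount, double nativeSampleRate, double ratio )
{
	return sampleCount / effectiveSampleRate( nativeSampleRate, ratio );
}
```

For a 44100 Hz wave, a ratio of 2.0 consumes 88200 samples per second and one second of audio plays in half a second. A ratio of 0.5 stretches the same audio to two seconds.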

The word frequency refers to the frequency of the waveform, measured in hertz. For example, a waveform that represents the musical note A4 is a wave with a frequency of 440 Hz. The musical note one octave higher (A5) is double the frequency, 880 Hz. One octave higher again (A6) is double that frequency, so 1760 Hz. Jumping by octaves, then, is simply doubling or halving the frequency, depending on the direction you're moving.
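Since an octave is a factor of two, a musical interval can be converted to a frequency ratio suitable for SetFrequencyRatio(). The sketch below assumes twelve equal semitones per octave. The function names are hypothetical. I believe xaudio2.h offers a comparable helper, XAudio2SemitonesToFrequencyRatio, but verify that against your SDK before relying on it.

```cpp
#include <cmath>

// Frequency ratio for a pitch shift of 'semitones' (positive = up),
// assuming twelve equal semitones per octave: ratio = 2^(semitones / 12).
inline double semitonesToRatio( double semitones )
{
	return std::pow( 2.0, semitones / 12.0 );
}

// Inverse mapping: how many semitones a given ratio shifts the pitch.
inline double ratioToSemitones( double ratio )
{
	return 12.0 * std::log2( ratio );
}
```

Shifting A4 (440 Hz) up by 12 semitones gives a ratio of 2.0 and lands on 880 Hz, matching the octave example above. A shift of 7 semitones gives a ratio of roughly 1.498.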

As an interesting side point, in tuning a piano, you could tune every string to a perfect, calculated frequency. It will be mathematically in tune, but when you play a chord that spans the piano's whole range, it sounds horrendous. This is largely because real piano strings are slightly inharmonic (their overtones run a little sharp of exact multiples), combined with how the ear judges pitch at the extremes. To make it sound right, you have to "bend" the pitches sharper (or higher) as you get to the top frequencies, and "bend" the pitches flatter (or lower) as you get to the bottom frequencies. This practice is usually called "stretching" the tuning (it is related to, but not the same as, tempering), and is an art form that varies in preference from tuner to tuner, and performer to performer.

The frequency adjustment is going to be important for the Doppler shift effect, which is used in the X3DAudio tutorial.

Things to Try

Set the frequency to gradually diminish, like when a record player's motor is slowing.
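One possible starting point for this exercise is a function that maps elapsed time to a ratio. This is a sketch under assumptions, not the only way to do it: spinDownRatio is a hypothetical name, the ramp is linear (a real motor would more likely decay exponentially), and kMinFreqRatio is assumed to match XAUDIO2_MIN_FREQ_RATIO.

```cpp
#include <algorithm>

// Assumed to mirror XAUDIO2_MIN_FREQ_RATIO from xaudio2.h.
constexpr float kMinFreqRatio = 1.0f / 1024.0f;

// Hypothetical helper: ratio to use 'elapsedSeconds' after the motor starts
// slowing, ramping linearly from 'startRatio' down to the minimum ratio
// over 'durationSeconds'.
inline float spinDownRatio( float startRatio, float elapsedSeconds, float durationSeconds )
{
	if( durationSeconds <= 0.0f || elapsedSeconds >= durationSeconds )
		return kMinFreqRatio;	//ramp finished: hold at the floor
	if( elapsedSeconds <= 0.0f )
		return startRatio;	//ramp not started yet

	const float t = elapsedSeconds / durationSeconds;	//0..1 progress
	const float ratio = startRatio + ( kMinFreqRatio - startRatio ) * t;
	return std::max( ratio, kMinFreqRatio );
}
```

In the tutorial's message loop, you could measure elapsed time (for example with GetTickCount()) and pass the result to g_source->SetFrequencyRatio() on each pass.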

Additional Information

Demo wave


