Web Audio tuner
This is a simple tuner built using the WebAudio API. Like tuners in the real world, you can also use it to generate a reference tone so you can tune by ear.
APIs used
Web Audio
This API is the core of this demo. We use it for several different tasks, from generating synthetic sounds, to analysing the sound we capture, to channelling the sound to whatever the default audio output device is.
Note: Some of the code snippets below are fragments of the source code of this demo, and as such the initialization of some variables may not appear in them. You can find them in other sections on this page and/or the source code of the demo.
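Before diving into the individual nodes, here is a minimal overview sketch (our own summary, not code lifted from the demo) of the two signal paths the tuner builds: an oscillator routed to the default output for the reference tone, and the microphone stream routed into an analyser for pitch detection.
// Overview sketch only; the demo's real wiring is shown section by section below.
var ctx = new (window.AudioContext || window.webkitAudioContext)();

// Reference tone path: OscillatorNode -> destination (default audio output)
var osc = ctx.createOscillator();
osc.connect(ctx.destination);

// Tuning path: microphone -> MediaStreamAudioSourceNode -> AnalyserNode
navigator.mediaDevices.getUserMedia({ audio: true }).then(function (stream) {
    var mic = ctx.createMediaStreamSource(stream);
    var analyser = ctx.createAnalyser();
    mic.connect(analyser);
});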
AudioContext
This is the entry point of Web Audio, and it is responsible for creating all the AudioNode instances we use throughout this demo.
We start by checking whether the browser supports Web Audio, i.e. whether window.AudioContext (or its webkit-prefixed version) is defined. This check also assigns window.AudioContext, so we can easily instantiate it later if the browser does support it.
Example
var isAudioContextSupported = function () {
    // This feature is still prefixed in Safari
    window.AudioContext = window.AudioContext || window.webkitAudioContext;
    if (window.AudioContext) {
        return true;
    }
    else {
        return false;
    }
};

var audioContext;
if (isAudioContextSupported()) {
    audioContext = new window.AudioContext();
}
OscillatorNode
In order to generate a synthetic sound to tune by ear we use an OscillatorNode, which we can configure to play at a specific frequency.
The "Base frequency" controls that you see when you enable this part of the tuner adjusts the frequency of A4, which is used as a reference for the rest of the notes. Although all of them can be calculated from A4's frequency, we have pre-calculated them and placed them in a notes.json?_ts=1548612692908 file that we dynamically load at runtime. After that, it's just a matter of iterating through the array of notes for a particular A4 frequency and set the correct frequency in the oscillator node.
Example
var freqTable, baseFreq = 440, currentNoteIndex = 57; // A4 in the array of notes

$.getJSON('notes.json', function(data) {
    freqTable = data;
});

var audioContext = new window.AudioContext();
var notesArray = freqTable[baseFreq];
var sourceAudioNode = audioContext.createOscillator();
sourceAudioNode.frequency.value = notesArray[currentNoteIndex].frequency;
sourceAudioNode.connect(audioContext.destination);
sourceAudioNode.start();

// This function is called passing either a +2 or a -2 to increase or decrease
// the A4 frequency we are using as a reference
var changeBaseFreq = function(delta) {
    var newBaseFreq = baseFreq + delta;
    if (newBaseFreq >= 432 && newBaseFreq <= 446) {
        baseFreq = newBaseFreq;
        notesArray = freqTable[baseFreq.toString()];
        // We set this flag to 'true' when the reference sound feature is enabled
        if (isRefSoundPlaying) {
            // We only change the frequency if we are playing a sound,
            // since sourceAudioNode will be an instance of OscillatorNode
            var newNoteFreq = notesArray[currentNoteIndex].frequency;
            sourceAudioNode.frequency.value = newNoteFreq;
        }
    }
};

// Function used to change the note currently playing, using whatever the
// current array of notes is (changed by the function above)
var changeReferenceSoundNote = function(delta) {
    if (isRefSoundPlaying) {
        var newNoteIndex = currentNoteIndex + delta;
        if (newNoteIndex >= 0 && newNoteIndex < notesArray.length) {
            currentNoteIndex = newNoteIndex;
            var newNoteFreq = notesArray[currentNoteIndex].frequency;
            sourceAudioNode.frequency.value = newNoteFreq;
        }
    }
};
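Incidentally, the pre-calculated table is just twelve-tone equal temperament applied to each base frequency. A hypothetical helper, not part of the demo (the name noteFrequency and its parameters are ours), would look like this:
// Hypothetical helper: frequency of the note 'semitonesFromA4' semitones away
// from A4 in twelve-tone equal temperament.
var noteFrequency = function (semitonesFromA4, a4Freq) {
    return a4Freq * Math.pow(2, semitonesFromA4 / 12);
};

noteFrequency(0, 440);   // 440 Hz (A4)
noteFrequency(3, 440);   // ~523.25 Hz (C5)
noteFrequency(-9, 440);  // ~261.63 Hz (C4, middle C)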
MediaStreamAudioSourceNode
We use this type of node as the source AudioNode for the stream of data we get from the Media Stream API (see below). Keep in mind that the sampling frequency used will match the sampling rate of your output device (typically 44.1 kHz or 48 kHz).
Example
// micStream is the MediaStream object we get from the Media Stream API
var sourceAudioNode = audioContext.createMediaStreamSource(micStream);
sourceAudioNode.connect(analyserAudioNode); // See initialization in the AnalyserNode section of the demo.
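If you are curious which rate the context is actually running at (we rely on this value later when turning the autocorrelation lag into a frequency), you can read it straight from the AudioContext. This line is just for illustration and is not part of the demo:
console.log(audioContext.sampleRate); // typically 44100 or 48000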
AnalyserNode
This node receives data from the MediaStreamAudioSourceNode and performs a Fast Fourier Transform on those samples. This data is later used by an autocorrelation algorithm to detect the pitch of the sound. For this node we set an fftSize of 2048 (the maximum allowed by the Web Audio API). Although that is a very tight window for such a high sampling rate (we can only fit a tiny fraction of a second in that space), it is the best we can do without downsampling the stream ourselves.
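To put that in numbers, assuming a 48 kHz stream, a buffer of 2048 samples only covers
\[ \frac{2048}{48000\ \text{Hz}} \approx 42.7\ \text{ms} \]
of audio, i.e. just a handful of periods of the lowest notes we care about.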
Example
var analyserAudioNode, sourceAudioNode, micStream;

var streamReceived = function(stream) {
    micStream = stream;

    analyserAudioNode = audioContext.createAnalyser();
    analyserAudioNode.fftSize = 2048;

    sourceAudioNode = audioContext.createMediaStreamSource(micStream);
    sourceAudioNode.connect(analyserAudioNode);

    /* This is our pitch detection algorithm.
       You can find its implementation in the Autocorrelation section of this demo. */
    detectPitch();
};
// The legacy API also expects an error callback (reportError is the demo's error handler, see below)
navigator.getUserMedia({audio: true}, streamReceived, reportError);
Media Stream
From this API we only use one specific function to access the audio input device (generally the microphone): getUserMedia. Just as with AudioContext, we should consider the possibility that the API is prefixed in some browsers or in older versions of them.
Example
var isGetUserMediaSupported = function() {
    navigator.getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia || navigator.mozGetUserMedia;
    if ((navigator.mediaDevices && navigator.mediaDevices.getUserMedia) || navigator.getUserMedia) {
        return true;
    }
    return false;
};

if (isGetUserMediaSupported()) {
    var getUserMedia = navigator.mediaDevices && navigator.mediaDevices.getUserMedia ?
        navigator.mediaDevices.getUserMedia.bind(navigator.mediaDevices) :
        function (constraints) {
            return new Promise(function (resolve, reject) {
                navigator.getUserMedia(constraints, resolve, reject);
            });
        };

    getUserMedia({audio: true}).then(streamReceived).catch(reportError);
}
Pitch detection
Autocorrelation
There is a variety of methods to detect the pitch of a sound: some work in the frequency domain (like HPS, or Harmonic Product Spectrum), while others work in the time domain (like autocorrelation). With such a high sampling rate, we can only fit a small fraction of a second in the buffer used by the AnalyserNode. Under these conditions the latter algorithm usually does a better job than the former, which is why we chose it for this demo.
Autocorrelation is the process of cross-correlating a signal with a time-delayed version of itself. In other words, we will be comparing a signal at two different points in time. As Wikipedia puts it:
It is a mathematical tool for finding repeating patterns, such as the presence of a periodic signal obscured by noise, or identifying the missing fundamental frequency in a signal implied by its harmonic frequencies.
Given a delay time \(k\) we:
- Find the value at a time \(t\)
- Find the value at a time \(t+k\)
- Multiply those values together
- Accumulate those products over a series of times (1024 samples in our code)
- Divide by the number of samples to get the average
As seen on this page about autocorrelation, the resulting formula would be something like this:
\[ R(k) = \frac{1}{t_{max} - t_{min}} \int_{t_{min}}^{t_{max}}s(t)s(t+k)dt \]
In addition, as you can see in the example, we also normalize the data. Since we are working with an array of bytes (0-255), we subtract 128 and divide by the same value.
Remember that we are working with periodic signals. As you may imagine, the highest correlation will happen once the signal "repeats itself", i.e. when the lag \(bestK\) matches the period (in frames) of the fundamental frequency. To get that frequency, we just need to divide the sampling rate by \(bestK\).
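For example, at a 48 kHz sampling rate a best lag of 109 frames would yield
\[ \frac{48000\ \text{Hz}}{109} \approx 440.4\ \text{Hz}, \]
which the following sections resolve to the note A4.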
Example
var findFundamentalFreq = function(buffer, sampleRate) {
    // We use Autocorrelation to find the fundamental frequency.
    // In order to correlate the signal with itself (hence the name of the algorithm), we will check two points 'k' frames away.
    // The autocorrelation index will be the average of these products. At the same time, we normalize the values.
    // Source: http://www.phy.mty.edu/~suits/autocorrelation.html
    // Assuming the sample rate is 48000Hz, a 'k' equal to 1000 would correspond to a 48Hz signal (48000/1000 = 48),
    // while a 'k' equal to 8 would correspond to a 6000Hz one, which is enough to cover most (if not all)
    // the notes we have in the notes.json file.
    var n = 1024, bestR = 0, bestK = -1;
    for(var k = 8; k <= 1000; k++) {
        var sum = 0;
        for(var i = 0; i < n; i++) {
            sum += ((buffer[i] - 128) / 128) * ((buffer[i + k] - 128) / 128);
        }
        var r = sum / (n + k);

        if(r > bestR) {
            bestR = r;
            bestK = k;
        }

        if(r > 0.9) {
            // Let's assume that this is good enough and stop right here
            break;
        }
    }

    if(bestR > 0.0025) {
        // The period (in frames) of the fundamental frequency is 'bestK'. Getting the frequency from there is trivial.
        var fundamentalFreq = sampleRate / bestK;
        return fundamentalFreq;
    }
    else {
        // We haven't found a good correlation
        return -1;
    }
};
var frameId;
var detectPitch = function () {
    var buffer = new Uint8Array(analyserAudioNode.fftSize);
    // See initializations in the AudioContext and AnalyserNode sections of the demo.
    analyserAudioNode.getByteTimeDomainData(buffer);

    var fundamentalFreq = findFundamentalFreq(buffer, audioContext.sampleRate);
    if (fundamentalFreq !== -1) {
        var note = findClosestNote(fundamentalFreq, notesArray); // See the 'Finding the right note' section.
        var cents = findCentsOffPitch(fundamentalFreq, note.frequency); // See the 'Calculating the cents off pitch' section.
        updateNote(note.note); // Function that updates the note on the page (see demo source code).
        updateCents(cents); // Function that updates the cents on the page and the gauge control (see demo source code).
    }
    else {
        updateNote('--');
        updateCents(-50);
    }

    frameId = window.requestAnimationFrame(detectPitch);
};
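The frameId returned by requestAnimationFrame is kept around so the detection loop can be stopped again, for instance when switching back to the reference tone. A minimal sketch (the name stopPitchDetection is ours, not necessarily what the demo calls it):
// Hypothetical helper: stop the pitch-detection loop started above.
var stopPitchDetection = function () {
    if (frameId) {
        window.cancelAnimationFrame(frameId);
        frameId = null;
    }
};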
Finding the right note
Now that we have the fundamental frequency, we just need to find the note with the closest frequency. Since the notesArray that we showed in previous code snippets is already sorted by this value, we only need to perform a binary search to find it.
Example
// 'notes' is an array of objects like { note: 'A4', frequency: 440 }.
// See initialization in the source code of the demo
var findClosestNote = function(freq, notes) {
    // Use binary search to find the closest note
    var low = -1, high = notes.length;
    while (high - low > 1) {
        var pivot = Math.round((low + high) / 2);
        if (notes[pivot].frequency <= freq) {
            low = pivot;
        } else {
            high = pivot;
        }
    }

    if (Math.abs(notes[high].frequency - freq) <= Math.abs(notes[low].frequency - freq)) {
        // notes[high] is closer to the frequency we found
        return notes[high];
    }
    return notes[low];
};
Calculating the cents off pitch
In the last step, given the fundamental frequency we have found and the frequency of the closest note, we need to find how far the former is from the latter. This is done using the following formula.
\[ cents = \left \lfloor 1200 \frac{ \log_{10}(f/refF) }{\log_{10}2} \right \rfloor \]
Where \(f\) is the fundamental frequency and \(refF\) is the frequency of the closest note. Since the ratio of logarithms is the same in any base, the code works with natural logarithms and precalculates the constant \(\ln 2\), which it uses directly.
Example
var findCentsOffPitch = function(freq, refFreq) {
    // We need to find how far 'freq' is from 'refFreq' in cents
    var log2 = 0.6931471805599453; // Math.log(2)
    var multiplicativeFactor = freq / refFreq;

    // We use Math.floor to get the integer part and ignore decimals
    var cents = Math.floor(1200 * (Math.log(multiplicativeFactor) / log2));
    return cents;
};
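As a quick sanity check (our own numbers, not from the demo): a detected frequency of 445 Hz measured against A4 at 440 Hz comes out about 19 cents sharp.
findCentsOffPitch(445, 440); // 19  (about 19 cents sharp of A4)
findCentsOffPitch(440, 440); // 0   (perfectly in tune)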