{"id":391,"date":"2012-08-07T11:46:12","date_gmt":"2012-08-07T11:46:12","guid":{"rendered":"http:\/\/garysieling.com\/blog\/?p=391"},"modified":"2012-08-07T11:46:12","modified_gmt":"2012-08-07T11:46:12","slug":"how-to-find-pitches-in-music","status":"publish","type":"post","link":"https:\/\/www.garysieling.com\/blog\/how-to-find-pitches-in-music\/","title":{"rendered":"How to find Pitches in Music"},"content":{"rendered":"<p>Have you ever wanted to find what chords were in a song? A good first step is to read the notes played during a short time interval. If several notes are played simultaneously, in order to figure out the notes, we need to separate the waveform into the individual frequencies. Fortunately there is a well-established mathematical process for doing this. Music stored in a WAV file is a good starting point, as it is a relatively simple audio format. WAV files store audio as the y-value, or loudness, of points on a sine wave.<\/p>\n<p><a href=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/08\/music2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-400\" title=\"music2\" src=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/08\/music2.png\" alt=\"\" width=\"468\" height=\"120\" srcset=\"https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/08\/music2.png 468w, https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/08\/music2-300x77.png 300w\" sizes=\"(max-width: 468px) 100vw, 468px\" \/><\/a><\/p>\n<p>The <a href=\"http:\/\/en.wikipedia.org\/wiki\/Fourier_transform\">Fourier transform<\/a> is a mathematical process designed to separate a mathematical function into its component frequencies. It works on a wide range of functions, including polynomial functions (e.g. parabolas). If you applied the Fourier Transform to a polynomial, the result would be an infinite series of waves added together &#8211; a bit of a mind bender. Interestingly, you can go the other direction using the Taylor series, which represents a wave as an infinite series of summed polynomials.<\/p>\n<p>In mathematical theory, these operations are done on an infinitely large plane with continuous lines, which is impractical in real engineering exercises. The <a href=\"http:\/\/en.wikipedia.org\/wiki\/Fast_Fourier_transform\">Fast Fourier Transform<\/a> is a variation which operates on a finite list of points along a graph, like that in a WAV file. Since there are potentially infinite frequencies in a sound sample, the output is placed into buckets based on the loudness of each pitch (10-20hz, 20-30hz, 30-40hz, etc). The size and number of these buckets are controlled by the end application and sound quality.<\/p>\n<p>There is an open source project to read wave files, which I discussed previously <a href=\"http:\/\/garysieling.com\/blog\/how-to-write-a-wav-file-in-scala\">here<\/a>. This library will identify the sample rate of the file &#8211; this is how many loudness measurements are stored for each second of audio. 44,100 is a common rate.<\/p>\n<pre lang=\"scala\">wavFile.readFrames(frameData, dataSize)<\/pre>\n<p>The <a href=\"http:\/\/commons.apache.org\/math\/\">Apache Commons Math<\/a> library provides code to execute an FFT. This bit of Scala code calls it with sampled data that has been read from a file.<\/p>\n<pre lang=\"scala\">\nval buckets : Array[Complex] = \n   fft.transform(frameData, TransformType.FORWARD);\n<\/pre>\n<p>The results are actually an array of vectors. To find the loudness of a pitch, you must compute the length of the vector. For a vector (a, b), this is computed by &#x221a;(a<sup>2<\/sup> + b<sup>2<\/sup>).<\/p>\n<pre lang=\"scala\">\nvar index = 0\nfor (val c : Complex {\nval magnitude : Double =\nMath.sqrt(c.getImaginary() * c.getImaginary() +\nc.getReal() * c.getReal())\n\nindex = index + 1\n<\/pre>\n<p>The position of each bucket combined with the sample rate, determines the range of frequencies this tone represents. To find which notes are included in output, I take the center of the range:<\/p>\n<pre lang=\"scala\">val frq = (.5 + index) * sampleRate \/ dataSize;<\/pre>\n<p>In this application, all octaves will be combined. This tests each notes in the lowest octave (A&#8230;G) and finds the closest match:<\/p>\n<pre lang=\"scala\">var numOctaves = Math.log(frq\/noteFrq)\/Math.log(2)\nvar normFrq = Math.abs(numOctaves - Math.floor(numOctaves));\nif (normFrq &lt; distance)\n\u2026found closer match...<\/pre>\n<p>I generated WAV files for all combinations of five notes in an octave &#8211; for the current version, 80% of generated test audio files are identified correctly. See this post on <a href=\"http:\/\/garysieling.com\/blog\/using-prolog-to-generate-test-data\">using Prolog to generate all chord combinations<\/a>.<\/p>\n<p>Working with audio is challenging, but rewarding. Files are hard to test without perfect pitch, and you must be remember to start with the speaker volume on low \ud83d\ude42<\/p>\n<p>The full source to this project is <a href=\"https:\/\/github.com\/garysieling\/fft-scala\/blob\/master\/pitches.scala\">available on Github<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Have you ever wanted to find what chords were in a song? A good first step is to read the notes played during a short time interval. If several notes are played simultaneously, in order to figure out the notes, we need to separate the waveform into the individual frequencies. Fortunately there is a well-established &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.garysieling.com\/blog\/how-to-find-pitches-in-music\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;How to find Pitches in Music&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[4,5],"tags":[109,178,228,237,300,373,480,584],"aioseo_notices":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/391"}],"collection":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/comments?post=391"}],"version-history":[{"count":0,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/391\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/media?parent=391"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/categories?post=391"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/tags?post=391"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}