{"id":531,"date":"2012-09-08T21:47:44","date_gmt":"2012-09-08T21:47:44","guid":{"rendered":"http:\/\/garysieling.com\/blog\/?p=531"},"modified":"2012-09-08T21:47:44","modified_gmt":"2012-09-08T21:47:44","slug":"onset-detection-with-r","status":"publish","type":"post","link":"https:\/\/www.garysieling.com\/blog\/onset-detection-with-r\/","title":{"rendered":"Finding the beat in R"},"content":{"rendered":"<p>In a previous article, I described a method for <a href=\"http:\/\/garysieling.com\/blog\/detecting-pitches-in-music-with-r\">detecting chords in an audio file<\/a> (<a href=\"http:\/\/garysieling.com\/blog\/how-to-find-pitches-in-music\">also available for Scala<\/a>). Continuing on this theme, the following will find the onset of a drumbeat in a file, <a href=\"http:\/\/garysieling.com\/blog\/book-review-r-cookbook\">using R<\/a>. I&#8217;m using a single drumstick click, which you can <a href=\"http:\/\/www.freesound.org\/people\/TicTacShutUp\/sounds\/432\/\">hear on freesound.org<\/a>.<\/p>\n<p>This method detects sudden increases in volume; it is not designed to respond to changes in pitch or timbre (e.g. a song that marks the beat by changing pitch or switching instruments, respectively). However, the methods that handle those cases seem to be based on the method described below.<\/p>\n<p>We&#8217;re looking for the onset of the drumbeat, where the anticipation starts. 
From reading the literature, it appears that this onset, rather than, say, the loudest point, is what we perceive as the beat in music.<\/p>\n<p>Load the file into memory, and build four- and eight-beat test files by repeating the click:<\/p>\n<pre>\n# Load the single drumstick click\nlibrary(sound)\n\nfile<-'432__tictacshutup__prac-perc-4.wav'\nsample<-loadSample(file)\n\n# Repeat the click to build four- and eight-beat test files\nfourbeats<-appendSample(sample, sample, sample, sample)\nsaveSample(fourbeats, \"out\\\\fourbeats.wav\")\n\neightbeats<-appendSample(sample, sample, \n                         sample, sample, \n                         sample, sample,\n                         sample, sample)\nsaveSample(eightbeats, \"out\\\\eightbeats.wav\")\n<\/pre>\n<p>Next, define the first order differential function. (R has a built-in function called diff that does essentially the same thing.)<\/p>\n<pre>\n# Difference between each sample and the sample lag positions earlier\nfirstOrderDiff<-function(x, lag){ x[(1+lag):length(x)] - \n                  x[1:(length(x)-lag)] }\nwav<-sample$sound\nplot(1:length(wav), abs(wav), type=\"l\")<\/pre>\n<p><a href=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/09\/wav-graph11.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/09\/wav-graph11.png\" alt=\"\" title=\"wav-graph1\" width=\"463\" height=\"368\" class=\"alignnone size-full wp-image-559\" srcset=\"https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/09\/wav-graph11.png 463w, https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/09\/wav-graph11-300x238.png 300w\" sizes=\"(max-width: 463px) 100vw, 463px\" \/><\/a><\/p>\n<p>Clearly there is a lot going on, so for the sake of example, let's zoom in on the first 2000 samples:<\/p>\n<pre>\nbegin<-abs(wav[1:2000])\nplot(1:length(begin), abs(begin), type=\"l\")\n<\/pre>\n<p>We're really interested in the magnitude (loudness) of the sound, which is much easier to work with once you take the absolute value.<\/p>\n<p>Still, there are a lot of peaks and valleys. The first order differential is approximately the derivative, and can be computed over a range (e.g. 
sample 99 - sample 0, sample 100 - sample 1, etc.), but experimentally this seems unstable. Instead, I compute the rolling mean over a small window, then compute the derivative. This is part of the value of working with only positive numbers: a rolling mean is useless on values that alternate between negative and positive, because they average out to roughly zero.<\/p>\n<p><a href=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/09\/wav-graph22.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/09\/wav-graph22.png\" alt=\"\" title=\"wav-graph2\" width=\"490\" height=\"316\" class=\"alignnone size-full wp-image-556\" srcset=\"https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/09\/wav-graph22.png 490w, https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/09\/wav-graph22-300x193.png 300w\" sizes=\"(max-width: 490px) 100vw, 490px\" \/><\/a><\/p>\n<pre>\nlibrary(zoo)\nsmoothed<-rollmean(begin, 100)\nplot(abs(smoothed), type=\"l\")\n<\/pre>\n<p><a href=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/09\/wav-graph31.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/09\/wav-graph31.png\" alt=\"\" title=\"wav-graph3\" width=\"488\" height=\"343\" class=\"alignnone size-full wp-image-560\" srcset=\"https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/09\/wav-graph31.png 488w, https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/09\/wav-graph31-300x211.png 300w\" sizes=\"(max-width: 488px) 100vw, 488px\" \/><\/a><\/p>\n<p>And now the key step: find the maximum value of the derivative to determine where the sound rises fastest:<\/p>\n<pre>\nstart<-which.max(firstOrderDiff(begin, 100))\nabline(v=start)\n<\/pre>\n<p><a href=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/09\/wav-graph4.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/09\/wav-graph4.png\" alt=\"\" title=\"wav-graph4\" width=\"469\" height=\"324\" 
class=\"alignnone size-full wp-image-542\" srcset=\"https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/09\/wav-graph4.png 469w, https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/09\/wav-graph4-300x207.png 300w\" sizes=\"(max-width: 469px) 100vw, 469px\" \/><\/a><\/p>\n<p>In the future I will describe how to generalize this for finding each beat, and handling more types of music. <\/p>\n<p>If you're interested in R (the statistical programming language), check out my review of the <a href=\"http:\/\/garysieling.com\/blog\/book-review-r-cookbook\">R Cookbook<\/a>. You may also be interested in <a href=\"http:\/\/www.amazon.com\/gp\/product\/0966017633\/ref=as_li_ss_tl?ie=UTF8&#038;camp=1789&#038;creative=390957&#038;creativeASIN=0966017633&#038;linkCode=as2&#038;tag=thesecrelifeo-20\">The Scientist & Engineer's Guide to Digital Signal Processing<\/a><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.assoc-amazon.com\/e\/ir?t=thesecrelifeo-20&#038;l=as2&#038;o=1&#038;a=0966017633\" width=\"1\" height=\"1\" border=\"0\" alt=\"\" style=\"border:none !important; margin:0px !important;\" \/><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In a previous article, I described a method for detecting chords in an audio file (also available for Scala). Continuing on this theme, the following will find the onset of a drumbeat in a file, using R. I&#8217;m using a single drumstick click, which you can hear on freesound.org. 
This method detects sudden volume increases- &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.garysieling.com\/blog\/onset-detection-with-r\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Finding the beat in R&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[5,6,27],"tags":[178,228,373,402,450,532],"aioseo_notices":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/531"}],"collection":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/comments?post=531"}],"version-history":[{"count":0,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/531\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/media?parent=531"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/categories?post=531"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/tags?post=531"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}