{"id":598,"date":"2012-10-05T13:36:29","date_gmt":"2012-10-05T13:36:29","guid":{"rendered":"http:\/\/garysieling.com\/blog\/?p=598"},"modified":"2012-10-05T13:36:29","modified_gmt":"2012-10-05T13:36:29","slug":"marking-time-in-r","status":"publish","type":"post","link":"https:\/\/www.garysieling.com\/blog\/marking-time-in-r\/","title":{"rendered":"Marking time in R"},"content":{"rendered":"<p>In a previous example, I showed how to find the onset of a <a href=\"http:\/\/garysieling.com\/blog\/onset-detection-with-r\">single drumbeat<\/a>, as well <a href=\"http:\/\/garysieling.com\/blog\/detecting-pitches-in-music-with-r\">as the chord at an instant<\/a>. <\/p>\n<p>This new example extends the method to detect the onset of several notes in a row, and demonstrates some interesting challenges involved in musical transcription. The general process is to read in audio, smooth the input, then find the most significant sudden increases in loudness (other techniques measure energy or pitch).<\/p>\n<p>Each of these steps involves choices; for instance, there are several ways to smooth audio. I used a rolling mean, but one paper I read recommended a low-pass filter. <\/p>\n<p>The greatest challenge is to keep false positives and false negatives at an acceptable level. 
One heuristic improvement to the technique below might filter out sudden sounds when the music volume is low, preventing a cough from registering as a beat in a quiet section of an orchestral piece.<\/p>\n<pre>\nlibrary(sound)  # loadSample\nlibrary(zoo)    # rollmean, rollapply\n\nfourbeats<-loadSample(\"out\\\\fourbeats.wav\")\nfiftyms<-fourbeats$rate\/20  # samples in 50 ms\ntenms<-fiftyms \/ 5         # samples in 10 ms\nplot(1:length(fourbeats$sound), \n     abs(fourbeats$sound), type=\"l\")\n<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/10\/fourbeats-1.png\" alt=\"\" title=\"fourbeats-1\" width=\"554\" height=\"426\" class=\"alignnone size-full wp-image-600\" srcset=\"https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/10\/fourbeats-1.png 554w, https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/10\/fourbeats-1-300x231.png 300w\" sizes=\"(max-width: 554px) 100vw, 554px\" \/><\/p>\n<pre>\nx<-fourbeats$sound[1,]   # first channel\nax<-abs(x)               # rectified signal\nsx<-rollmean(ax, 100)    # 100-sample rolling mean\nplot(1:length(x), ax, type=\"l\")\n<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/10\/fourbeats-2.png\" alt=\"\" title=\"fourbeats-2\" width=\"533\" height=\"432\" class=\"alignnone size-full wp-image-601\" srcset=\"https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/10\/fourbeats-2.png 533w, https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/10\/fourbeats-2-300x243.png 300w\" sizes=\"(max-width: 533px) 100vw, 533px\" \/><\/p>\n<pre>\nplot(1:length(sx), sx, type=\"l\")\n<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/10\/fourbeats-3.png\" alt=\"\" title=\"fourbeats-3\" width=\"512\" height=\"423\" class=\"alignnone size-full wp-image-602\" srcset=\"https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/10\/fourbeats-3.png 512w, https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/10\/fourbeats-3-300x248.png 300w\" sizes=\"(max-width: 512px) 100vw, 
512px\" \/><\/p>\n<pre>\ndx<-firstOrderDiff(x, 50)\nplot(1:length(dx), dx, type=\"l\")\n<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/10\/fourbeats-4.png\" alt=\"\" title=\"fourbeats-4\" width=\"549\" height=\"387\" class=\"alignnone size-full wp-image-603\" srcset=\"https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/10\/fourbeats-4.png 549w, https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/10\/fourbeats-4-300x211.png 300w\" sizes=\"(max-width: 549px) 100vw, 549px\" \/><\/p>\n<p>Next, we look for local maxima in overlapping 50ms sections. The sections overlap so that a beat falling right on a dividing line is still detected.<\/p>\n<pre>\ncandidates<-rollapply(dx, fiftyms, \n   which.max, align=\"left\", by=fiftyms\/2)\n# convert each window-relative index to an index into dx\nix<-(0:(length(candidates)-1))*fiftyms\/2 + candidates\n\nfor (i in seq_along(ix)) { \n  if (dx[ix[i]] >= 0.01) { \n    abline(v=ix[i], col=\"red\")  # mark the candidate at its sample position\n  } \n}\n<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/10\/fourbeats-5.png\" alt=\"\" title=\"fourbeats-5\" width=\"524\" height=\"410\" class=\"alignnone size-full wp-image-604\" srcset=\"https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/10\/fourbeats-5.png 524w, https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/10\/fourbeats-5-300x235.png 300w\" sizes=\"(max-width: 524px) 100vw, 524px\" \/><\/p>\n<p>As you can see above, this technique detects several local maxima in the area where the beat occurs, despite looking within 25ms intervals. 
An improvement to this technique would limit the number of candidates in a given time period, based on the knowledge that the frequency of musical beats is typically limited to what a musician can play (although this is not true of electronic music).<\/p>\n<p>If we cheat a little, we can also set a minimum level for what is considered to be a beat - in a more realistic scenario this would likely be a ratio to the surrounding music. This filtering parameter would also provide a good hook to link this technique to an AI training algorithm.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/172.104.26.128\/wp-content\/uploads\/2012\/10\/fourbeats-6.png\" alt=\"\" title=\"fourbeats-6\" width=\"516\" height=\"389\" class=\"alignnone size-full wp-image-605\" srcset=\"https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/10\/fourbeats-6.png 516w, https:\/\/www.garysieling.com\/blog\/wp-content\/uploads\/2012\/10\/fourbeats-6-300x226.png 300w\" sizes=\"(max-width: 516px) 100vw, 516px\" \/><\/p>\n<p><b>See also:<\/b> <a href=\"http:\/\/garysieling.com\/blog\/book-review-r-cookbook\">R Cookbook Review<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In a previous example, I showed how to find the onset of a single drumbeat, as well as the chord at an instant. This new example extends the method to detect the onset of several notes in a row, and demonstrates some interesting challenges involved in musical transcription. 
The general process is to read in &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.garysieling.com\/blog\/marking-time-in-r\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Marking time in R&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[4,27],"tags":[178,450,585,586],"aioseo_notices":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/598"}],"collection":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/comments?post=598"}],"version-history":[{"count":0,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/598\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/media?parent=598"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/categories?post=598"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/tags?post=598"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}