{"id":6700,"date":"2022-09-11T18:35:59","date_gmt":"2022-09-11T18:35:59","guid":{"rendered":"https:\/\/www.garysieling.com\/blog\/?p=6700"},"modified":"2022-09-11T18:37:02","modified_gmt":"2022-09-11T18:37:02","slug":"a-simple-implementation-of-data-shadowing-in-r","status":"publish","type":"post","link":"https:\/\/www.garysieling.com\/blog\/a-simple-implementation-of-data-shadowing-in-r\/","title":{"rendered":"A simple implementation of data shadowing in R"},"content":{"rendered":"\n<p>Idiomatic R often uses a neat piece of syntactic sugar called shadowing.<\/p>\n\n\n\n<p>Imagine you have a data frame (<code>df<\/code>) containing the costs of summer camps:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td>Child<\/td><td>Week<\/td><td>Cost<\/td><\/tr><tr><td>1<\/td><td>1<\/td><td>$60<\/td><\/tr><tr><td>1<\/td><td>2<\/td><td>$200<\/td><\/tr><tr><td>2<\/td><td>1<\/td><td>$400<\/td><\/tr><tr><td>2<\/td><td>2<\/td><td>$275<\/td><\/tr><tr><td>2<\/td><td>2<\/td><td>$1000<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>A common access pattern is to filter the data frame using the column names (the bracket form shown here is the one the data.table package supports). Note, however, that we pass an expression rather than a lambda, as we would in other languages:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>df&#91;Week == 1]<\/code><\/pre>\n\n\n\n<p>To learn why this works, we can implement a crude version of a filtering function ourselves.<\/p>\n\n\n\n<p>First, let&#8217;s define a function that takes a data frame and an expression. The <code>enexpr<\/code> function comes from the rlang package:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>library(rlang)  # provides enexpr()\n\nsimple_filter &lt;- function(df, e) {\n  print(enexpr(e))\n}<\/code><\/pre>\n\n\n\n<p>If you call this as <code>simple_filter(df, Week == 1)<\/code>, it will print out:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code data-enlighter-language=\"r\" class=\"EnlighterJSRAW\">Week == 1<\/code><\/code><\/pre>\n\n\n\n<p>This output is the printed form (the <code>.toString<\/code> equivalent) of the abstract syntax tree of the expression passed as <code>e<\/code>. 
<\/p>\n\n\n\n<p>Note that if we replace <code>print(enexpr(e))<\/code> with <code>print(e)<\/code>, we&#8217;ll get an error:<\/p>\n\n\n\n<p><code>Error in print(e) : object 'Week' not found<\/code><\/p>\n\n\n\n<p>There is no variable named <code>Week<\/code> in scope in the environment. We only get this error when <code>e<\/code> is actually used, because function arguments are promises: they stand in for the result of the expression passed in and are evaluated lazily, the first time their value is needed.<\/p>\n\n\n\n<p>The difference between the expression <code>Week == 1<\/code> and the lambda <code>(Week) => Week == 1<\/code> is that a lambda provides its own context. That context includes both the lambda&#8217;s arguments and the variables it closes over. Together these form the &#8220;environment&#8221; in which the function runs.<\/p>\n\n\n\n<p>To complete our <code>simple_filter<\/code> implementation, we need to build an environment in which to evaluate the expression:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code data-enlighter-language=\"r\" class=\"EnlighterJSRAW\">env &lt;- new.env()<\/code><\/code><\/pre>\n\n\n\n<p>Then all we need to do is loop over the rows of the data frame, populating one variable per column as we look at each row. The <code>assign<\/code> function inserts a value into the environment:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cols &lt;- colnames(df)\n# seq_len() also copes with a data frame that has zero rows\nfor(i in seq_len(nrow(df))) {\n  for(j in seq_len(ncol(df))) {\n    col &lt;- cols&#91;j]\n    # Bind the column name to the value in row i\n    assign(col, df&#91;i, j], env)\n  }\n\n  ...\n}<\/code><\/pre>\n\n\n\n<p>For our final step, we can put this all together into a function which prints out rows that match our expression. 
This adds a call to <code>eval<\/code>, which evaluates the given expression in the context of the environment we&#8217;ve built from the row:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Requires library(rlang) for enexpr()\nsimple_filter &lt;- function(df, e) {\n  # Capture the expression passed as e without evaluating it\n  query &lt;- enexpr(e)\n  cols &lt;- colnames(df)\n  env &lt;- new.env()\n  # seq_len() also copes with a data frame that has zero rows\n  for(i in seq_len(nrow(df))) {\n    for(j in seq_len(ncol(df))) {\n      col &lt;- cols&#91;j]\n      # Bind the column name to the value in row i\n      assign(col, df&#91;i, j], env)\n    }\n\n    # Evaluate the captured expression against the current row\n    if(eval(query, envir = env)) {\n      print(df&#91;i, ])\n    }\n  }\n}<\/code><\/pre>\n\n\n\n<p>And there you have it!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Idiomatic R often uses a neat syntactic sugar called shadowing. Imagine you have a dataframe (df) containing the costs of summer camps: Child Week Cost 1 1 $60 1 2 $200 2 1 $400 2 2 $275 2 2 $1000 A common access pattern is to filter the data frame using the column names. Note &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.garysieling.com\/blog\/a-simple-implementation-of-data-shadowing-in-r\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;A simple implementation of data shadowing in 
R&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[4,6],"tags":[152,244,441,450],"aioseo_notices":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/6700"}],"collection":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/comments?post=6700"}],"version-history":[{"count":3,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/6700\/revisions"}],"predecessor-version":[{"id":6703,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/6700\/revisions\/6703"}],"wp:attachment":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/media?parent=6700"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/categories?post=6700"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/tags?post=6700"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}