{"id":6810,"date":"2024-01-02T14:24:31","date_gmt":"2024-01-02T14:24:31","guid":{"rendered":"https:\/\/www.garysieling.com\/blog\/?p=6810"},"modified":"2024-01-02T14:24:32","modified_gmt":"2024-01-02T14:24:32","slug":"r-code-to-summarize-dept-of-ed-civil-rights-data-by-school-district-lea","status":"publish","type":"post","link":"https:\/\/www.garysieling.com\/blog\/r-code-to-summarize-dept-of-ed-civil-rights-data-by-school-district-lea\/","title":{"rendered":"R Code to summarize Dept of Ed, Civil Rights Data by school district \/ LEA"},"content":{"rendered":"\n<p>Notes:<\/p>\n\n\n\n<ul>\n<li>Takes the data from <a href=\"https:\/\/civilrightsdata.ed.gov\/\">Civil Rights Data | U.S. Department of Education<\/a><\/li>\n\n\n\n<li>This rolls it up to the school district level by summing numeric columns and &#8220;or&#8221;ing boolean columns<\/li>\n\n\n\n<li>Filters to a specific state (PA)<\/li>\n\n\n\n<li>Source data is 733mb and result is 3.7mb<\/li>\n\n\n\n<li>Data definitions are here: <a href=\"https:\/\/data.nysed.gov\/files\/essa\/19-20\/crdcGlossaryAndGuide.pdf\">crdcGlossaryAndGuide.pdf (nysed.gov)<\/a><\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>library(dplyr)\n\ncrdc &lt;- \"C:\\\\data\\\\CRDC\\\\CRDC\\\\School\"\nload_csv_files &lt;- function(folder_path) {\n  csv_files &lt;- list.files(folder_path, pattern = \"\\\\.csv$\", full.names = TRUE)\n  \n  loaded_dataframes &lt;- list()\n  \n  for (csv_file in csv_files) {\n    dataframe &lt;- read.csv(csv_file) \n    pa_data &lt;- dataframe %>% filter(LEA_STATE == 'PA')\n    \n    loaded_dataframes&#91;&#91;basename(csv_file)]] &lt;- pa_data\n  }\n  \n  return(loaded_dataframes)\n}\n\nall_data &lt;- load_csv_files(crdc)\n\ngroup_and_summarize &lt;- function(data_frame) {\n  grouped_data &lt;- data_frame %>%\n    group_by(LEA_STATE, LEA_STATE_NAME, LEAID, LEA_NAME) %>%\n    select(-SCHID, -SCH_NAME, -COMBOKEY, -JJ)\n\n  grouped_data&#91;grouped_data&lt;0] &lt;- NA\n\n  return (grouped_data %>% summarise_all(function(x) {\r\n      if (is.numeric(x)) {\r\n        sum(x, na.rm = TRUE)\r\n      } else if (is.logical(x)) {\r\n        any(x)\r\n      } else {\r\n        first(x)\r\n      }\r\n    }))\n}\n\nsummarize_all_dfs &lt;- function(all_data) {\n  summarized_dataframes &lt;- list()\n\n  for (name in names(all_data)) {\n    summarized_dataframes&#91;&#91;name]] &lt;- group_and_summarize(all_data&#91;&#91;name]])\n  }\n\n  return (summarized_dataframes)\n}\n\nall_summarized &lt;- summarize_all_dfs(all_data)\n\nwrite_dataframes_to_csv &lt;- function(dataframes_list, output_folder) {\r\n  if (!file.exists(output_folder)) {\r\n    dir.create(output_folder)\r\n  }\r\n  \r\n  for (csv_name in names(dataframes_list)) {\r\n    output_path &lt;- file.path(output_folder, csv_name)\r\n    \r\n    write.csv(dataframes_list&#91;&#91;csv_name]], file = output_path, row.names = FALSE)\r\n  }\r\n}\n\nwrite_dataframes_to_csv (all_summarized, \"C:\\\\data\\\\CRDC\\\\CRDC\\\\School\\\\summarized\")<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Notes:<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[6],"tags":[450],"aioseo_notices":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/6810"}],"collection":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/comments?post=6810"}],"version-history":[{"count":2,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/6810\/revisions"}],"predecessor-version":[{"id":6812,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/6810\/revisions\/6812"}],"wp:attachment":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/media?parent=6810"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/categories?post=6810"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/tags?post=6810"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}