{"id":4628,"date":"2016-07-04T14:43:44","date_gmt":"2016-07-04T14:43:44","guid":{"rendered":"http:\/\/www.garysieling.com\/blog\/?p=4628"},"modified":"2016-07-04T14:43:44","modified_gmt":"2016-07-04T14:43:44","slug":"import-set-json-files-rethinkdb","status":"publish","type":"post","link":"https:\/\/www.garysieling.com\/blog\/import-set-json-files-rethinkdb\/","title":{"rendered":"How to import a set of JSON files into RethinkDB"},"content":{"rendered":"<p>RethinkDB ships with utilities for doing imports and exports. These serve two purposes: database backup and restore, and the import of new data.<\/p>\n<p>Importing new data is probably the more interesting challenge, since you have to get your import process to map to what RethinkDB wants.<\/p>\n<p>If you haven&#8217;t done this before, the import tool requires the RethinkDB Python driver:<\/p>\n<pre lang=\"bash\">\napt-get install -y python-pip\npip install rethinkdb\n<\/pre>\n<p>Once you do this, you&#8217;ll need to create a database in RethinkDB. My example is a series of JSON exports from the Watson API, so I&#8217;ve called it &#8220;Watson&#8221;.<\/p>\n<p>This will let you import a single JSON file:<\/p>\n<pre lang=\"bash\">\nrethinkdb import -f \\\n  .\/watson\/transcript_s_TextGetEmotion_1608.json \\\n  --table Watson.transcript_s_TextGetEmotion\n<\/pre>\n<p>When you run this, it creates the table automatically. It seems to treat the file as a single row (possibly because mine contains one object). If you import into the same table again, you will need to use &#8220;--force&#8221;, because the importer won&#8217;t otherwise write to an existing table. 
The &#8220;--force&#8221; option will put the new data in as new rows.<\/p>\n<p>In my case I have a folder that holds all the JSON files, each named for the originating ID and the API that produced it.<\/p>\n<p>Thus, to import an entire folder, I can do this:<\/p>\n<pre lang=\"bash\">\ncd watson\n\nfor f in *.json\ndo\n  # Strip the trailing _ID.json to get the table name\n  table=$(echo \"$f\" | sed \"s\/\\(.*\\)_[0-9]\\+\\.json\/\\1\/\")\n  # Table names can't contain \"-\", so use underscores\n  table=$(echo \"$table\" | sed \"s\/-\/_\/g\")\n\n  rethinkdb import -f \"$f\" --table \"Watson.$table\" --force\ndone\n\ncd ..\n<\/pre>\n<p>Note that you can&#8217;t use &#8220;-&#8221; in a RethinkDB table name, so you&#8217;ll want to replace those with underscores if you have them in your source file names.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>How to import a folder of JSON files into RethinkDB<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[4],"tags":[466],"aioseo_notices":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/4628"}],"collection":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/comments?post=4628"}],"version-history":[{"count":0,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/4628\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/media?parent=4628"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\
/www.garysieling.com\/blog\/wp-json\/wp\/v2\/categories?post=4628"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/tags?post=4628"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}