Converting JSON to a CSV file with Python

In a previous post, I showed how to extract data from the Google Maps API, which leaves a series of JSON files, like this:

{"address_components":
[{"long_name":"576","short_name":"576","types":["street_number"]},
{"long_name":"Concord Road","short_name":"Concord Road","types":["route"]},{"long_name":"Glen Mills","short_name":"Glen Mills","types":["locality","political"]},{"long_name":"PA","short_name":"PA","types":...

Ideally we want selections from these as a CSV for manual review, and import into mapping software. First, we load a list of files:

import os
 
folder = 'G:\\maps\\churches\\db.tar\\db\\db'
 
files = []
for root, dirs, f in os.walk(folder):
    files = files + f

Then, we define a function which reads the contents of each file, and parses it as JSON, skipping non-JSON files:

import json
 
def js(fn):
  try:
    with open(fn, 'r') as f:
      contents = f.read()
      f.close()
      return json.loads(contents)
  except:
    return ''
 
text = [js(f) for f in files]

We can then easily retrieve rows as tuples – since JSON is typically a hierarchical format, you may need to manually filter these in some way.

csv_rows = \
  [(j['name'], \
   get(j, 'website'), \
   j['geometry']['location']['kb'], \
   j['geometry']['location']['jb'], \
   j['formatted_address']) for j in text]

Finally, we use a simple CSV API provided by Python to write the file. Typically values in CSV files are quoted where necessary and separated by commas, although this API lets you control that:

with open('maps.csv', 'wb') as csvfile:
 mapswriter = csv.writer(csvfile, delimiter=',',
                         quotechar='"', quoting=csv.QUOTE_MINIMAL)
  for r in csv_rows:
    mapswriter.writerow(r)

One of the great things about Python as a scripting language for data analysis is the host of APIs available to make this operation simple – as you can see this is quite easy, as it doesn’t require downloading any additional add-on libraries.

Interested in Python? I send out weekly, personalized emails with articles and conference talks. Click here to see an example and subscribe.

4 replies
  1. Eduardo
    Eduardo says:

    Hello,

    I’m getting
    “Traceback (most recent call last):
    File “json2csv3.py”, line 30, in
    mapswriter = csv.writer(csvfile, delimiter=’,’,
    NameError: name ‘csv’ is not defined”

    Do you know hot to fix it?

    Reply
  2. Bhanu Pratap Singh
    Bhanu Pratap Singh says:

    How do I convert 1000 json files in to 1000 csv files using python. I found several codes using python but it is only for converting single files. I want to write a code in which ; I can browse the folder and select 1000 or upto more than 1000 files, and covert them directly into a CSV file.

    Please help, I’m trying this for long time. “OR” I can use by any other language e.g, Javascript

    Reply
  3. martin
    martin says:

    hello, this is great!!! Many thanks!

    currently want to dive into python with some real world projects and connecting to the endpoint of Overpass. – see http://www.overpass-turbo.eu

    i ant to get a result in a list of comma seperated values – is this doable?

    see https://github.com/mvexel/overpass-api-python-wrapper

    the developer has left some examples – at least one.

    But it would be great to have some more examples and explanation
    see https://github.com/mvexel/overpass-api-python-wrapper

    scroll down – and see. Note that you don’t have to include any of the output meta statements. The wrapper will, well, wrap those.
    You will get your result as a dictionary, which represents the JSON output you would get from the Overpass API directly. So you could do this for example:

    print [(feature[‘tags’][‘name’], feature[‘id’]) for feature in response[‘elements’]]
    [(u’Salt Lake City’, 150935219), (u’Salt Lake City’, 585370637), (u’Salt Lake City’, 1615721573)]

    You can specify the format of the response. By default, you will get GeoJSON using the responseformat parameter. Alternatives are plain JSON (json) and OSM XML (xml), as ouput directly by the Overpass API.

    response = api.Get(‘node[“name”=”Salt Lake City”]’, responseformat=”xml”)
    import overpass
    api = overpass.API()
    response = api.Get(‘node[“name”=”Salt Lake City”]’)

    first of all: this is no python code for version 3xy

    btw – i want to import something like so…:

    currently want to dive into python with some real world projects and connecting to the endpoint of Overpass.
    i ant to get a result in a list of comma seperated values – is this doable?
    i want to translate the following overopass-turbo request into the overpass-api-python wrapper

    [out:csv(::id,::type,”name”,”addr:postcode”,”addr:city”,”addr:street”,”addr:housenumber”,”website”,” contact:email=*”)][timeout:600];
    {{geocodeArea:Schweiz}}->.a;
    ( node(area.a)[amenity=hospital];
    way(area.a)[amenity=hospital];
    rel(area.a)[amenity=hospital];);
    out;

    with the overpass-api-python wrapper i want to do the first steps today:
    the examples that are shown here: https://github.com/mvexel/overpass-api-python-wrapper – they are written in Python version 2xy
    so they do not work with the Version 3 xy

    well this example code here:

    import overpass
    api = overpass.API()
    response = api.Get (area(3601744366)->.a;
    (node(area.a)[amenity=hospital];
    way(area.a)[amenity=hospital];);
    (._;>;);

    can i do so?

    well i love to hear from you

    regards martin

    Reply

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published. Required fields are marked *