{"id":1483,"date":"2013-07-29T11:40:03","date_gmt":"2013-07-29T11:40:03","guid":{"rendered":"http:\/\/garysieling.com\/blog\/?p=1483"},"modified":"2013-07-29T11:40:03","modified_gmt":"2013-07-29T11:40:03","slug":"scraping-google-maps-search-results-with-javascript-and-php","status":"publish","type":"post","link":"https:\/\/www.garysieling.com\/blog\/scraping-google-maps-search-results-with-javascript-and-php\/","title":{"rendered":"Scraping Google Maps Search Results with Javascript and PHP"},"content":{"rendered":"<p>Google Maps provides several useful APIs for accessing data: a geocoding API to convert addresses to latitude and longitude, a search API to provide locations matching a term, and a details API for retrieving location metadata.<\/p>\n<p>Many mapping tasks need a large list of locations (restaurants, churches, etc.). Because this data is valuable, Google rate-limits access to it and encourages caching query results.<\/p>\n<p>You can load a specific area of a map. The easiest way to find a starting latitude and longitude is to run an address through a geocoding API:<\/p>\n<pre lang=\"Javascript\">\n// curLat and curLong hold the center of the area currently being scanned\nmap = new google.maps.Map(document.getElementById('map-canvas'), {\n  mapTypeId: google.maps.MapTypeId.ROADMAP,\n  center: new google.maps.LatLng(curLat, curLong),\n  zoom: 15,\n  styles: [\n    {\n      stylers: [\n        { visibility: 'simplified' }\n      ]\n    },\n    {\n      elementType: 'labels',\n      stylers: [\n        { visibility: 'off' }\n      ]\n    }\n  ]\n});\n<\/pre>\n<p>To run a search, you can use the radarSearch API, which appears to return up to 200 results. 
However, this only returns latitudes and longitudes, not place names or anything else you&#8217;d really want in a full application.<\/p>\n<pre lang=\"Javascript\">\n// Wait for the map to report its bounds once, then search the visible area\ngoogle.maps.event.addListenerOnce(map, 'bounds_changed', performSearch);\n\nfunction performSearch() {\n  var request = {\n    bounds: map.getBounds(),\n    keyword: 'church'\n  };\n  service.radarSearch(request, callback);\n}\n<\/pre>\n<p>Once the search finishes, it runs a callback. In the callback we save the results so far and set up a timer to fetch the full details of each entity. I determined experimentally that the Maps API won&#8217;t let you run a query more than once every two seconds, so the timer adds a little extra lag: I&#8217;d rather the script keep running than risk an error from querying slightly too soon.<\/p>\n<pre lang=\"Javascript\">\nfunction callback(results, status) {\n  for (var i = 0, place; place = results[i]; i++) {\n    createMarker(place);\n\n    // Stagger the detail lookups 2.2 seconds apart to stay under the rate limit\n    setTimeout(loadPlace, 2200 * i);\n  }\n}\n<\/pre>\n<p>Each &#8220;place&#8221; is hydrated using the getDetails function on the Maps API, then saved back to a server:<\/p>\n<pre lang=\"Javascript\">\nfunction loadPlace() {\n  // Take the next unprocessed search result\n  var place = places[placeIdx++];\n\n  service.getDetails(place,\n    function(result, status) {\n      if (status != google.maps.places.PlacesServiceStatus.OK) {\n        // Keep the scan moving even when a lookup fails\n        next();\n        return;\n      }\n      // Send the full place record to the server for storage\n      $.post(\n        \"save.php\",\n        {text: JSON.stringify(result)},\n        function() {\n          next();\n        });\n    });\n}\n<\/pre>\n<p>This requires a simple PHP file. The results can be extracted later or used as a cache.<\/p>\n<pre lang=\"php\">\n// Raw JSON text posted by the browser\n$text = $_POST['text'];\n$json = json_decode($text, true);\nif ($json === null) {\n  exit; // ignore anything that is not valid JSON\n}\n\n// A hash of the contents is the file name, so duplicate records overwrite each other\n$id = md5($text);\nfile_put_contents('db\/' . $id, $text);\n<\/pre>\n<p>Up to this point, we can only scan a single segment of a map. In reality we likely want to loop back and forth across an area. 
I found a bounding box that covers Philadelphia and the surrounding counties reasonably well by experiment, loading the map in several areas until I found good edges.<\/p>\n<p>Interestingly, a map view does not seem to span the same number of degrees in each direction: I found one view to cover about 20 times as much longitude as latitude. Ideally each step is slightly smaller than one view, which gives a little overlap and records a few entries twice.<\/p>\n<pre lang=\"Javascript\">\n// Bounding box around Philadelphia and the surrounding counties\nvar minLat = 39.873;\nvar minLong = -75.483;\nvar maxLat = 40.453;\nvar maxLong = -75.163;\n\n// Step sizes in degrees for each move of the map\nvar dLat = 0.01;\nvar dLong = 0.2;\n<\/pre>\n<p>Finally, we need a function that moves the current map location one step at a time, sweeping up through latitude and then over in longitude, until we have read the entire area we want:<\/p>\n<pre lang=\"Javascript\">\nfunction next() {\n  // Only move on once every place in the current view has been handled\n  if (placeIdx >= places.length) {\n    curLat += dLat;\n    if (curLat > maxLat) {\n      // Finished this column: shift east and restart at the bottom\n      curLong += dLong;\n      curLat = minLat;\n    }\n    if (curLong <= maxLong) {\n      setTimeout(initialize,\n        Math.max(\n          2100,\n          2100 * (places.length - placeIdx)));\n    }\n  }\n}\n<\/pre>\n<p>This function must be called in a few places: anywhere an error or a finished task would otherwise stop the script. If we don't do this, the scan will stop partway through:<\/p>\n<pre lang=\"Javascript\">\nif (status != google.maps.places.PlacesServiceStatus.OK) {\n  // Push the index past the end of any result list so next() advances the map\n  placeIdx = 1000000;\n  next();\n  return;\n}\n\nplaces = results;\n\n// Nothing found in this view: move straight to the next one\nif (!results) {\n  next();\n  return;\n}\nif (results.length == 0) {\n  next();\n  return;\n}\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Google Maps provides several useful APIs for accessing data: a geocoding API to convert addresses to latitude and longitude, a search API to provide locations matching a term, and a details API for retrieving location metadata. 
For many mapping tasks it is valuable to get a large list of locations (restaurants, churches, etc) &#8211; since &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.garysieling.com\/blog\/scraping-google-maps-search-results-with-javascript-and-php\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Scraping Google Maps Search Results with Javascript and PHP&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[4],"tags":[255,302,432,495],"aioseo_notices":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/1483"}],"collection":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/comments?post=1483"}],"version-history":[{"count":0,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/1483\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/media?parent=1483"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/categories?post=1483"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/tags?post=1483"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}