Scala: Read JSON from Solr

The Play framework ships a standalone library, play-json, for creating and reading JSON. To use it, together with scalaj-http for making the request, add two lines to your SBT build:

libraryDependencies += "com.typesafe.play" %% "play-json" % "2.4.6"
libraryDependencies += "org.scalaj" %% "scalaj-http" % "2.3.0"

To hit Solr, you can build a URL and pull the result with the HTTP library I imported:

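import scalaj.http.Http

// Ask Solr for the "domain" field of every root-level certificate, as JSON
val url =
  "http://localhost:8983" +
  "/solr/ssl_certificates/" +
  "select?q=level:root" +
  "&rows=83059" +
  "&fl=domain" +
  "&wt=json"

// Send the GET request and keep only the response body
val result = Http(url)
  .header("Content-Type", "application/json")
  .header("Charset", "UTF-8")
  .asString
  .body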
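
As an alternative to concatenating the query string by hand, the same request can be assembled with scalaj-http's param method, which leaves the query-string encoding to the library. This is only a sketch: the SolrQuery object and its select and fetch helpers are hypothetical names of mine, and the host, core, and field values are simply the ones used above.

```scala
import scalaj.http.{Http, HttpRequest}

object SolrQuery {
  // Hypothetical helper: builds (but does not send) a Solr select request.
  def select(baseUrl: String, core: String, query: String,
             rows: Int, fields: String): HttpRequest =
    Http(s"$baseUrl/solr/$core/select")
      .param("q", query)
      .param("rows", rows.toString)
      .param("fl", fields)
      .param("wt", "json")

  // Hypothetical helper: sends the request and returns the body
  // only when Solr answers with a 2xx status.
  def fetch(request: HttpRequest): Either[String, String] = {
    val response = request.asString
    if (response.isSuccess) Right(response.body)
    else Left(s"Solr returned HTTP ${response.code}")
  }
}
```

Calling SolrQuery.fetch(SolrQuery.select("http://localhost:8983", "ssl_certificates", "level:root", 83059, "domain")) should then give the same body as above, wrapped in a Right, or a Left with the status code if the request failed.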

This is an example of what the output looks like (taken from a request with rows=1 and indent=true, to keep it short):

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "fl":"domain",
      "indent":"true",
      "q":"level:root",
      "wt":"json",
      "rows":"1"}
  },
  "response":{"numFound":83059,"start":0,"docs": [
      {
        "domain":["www.01com.com"]
      }
    ]
  }
}

To parse this, you can chain play-json's \ operator into a path expression, much like an XPath, to reach the part of the document you want, and then pattern match on the JSON types (arrays, objects, strings, numbers, etc.) to pull out the important bits. If the schema is regular, you can also map it onto case classes.

import play.api.libs.json._

val jsonResult = Json.parse(result)
val domains =
  (jsonResult \ "response" \ "docs").get match {
    case domainObjList: JsArray =>
      domainObjList.value.map {
        case obj: JsObject =>
          obj.value("domain") match {
            // "domain" is multi-valued: take the first entry as a plain String
            // (toString would keep the surrounding JSON quotes)
            case arr: JsArray => arr.value.head.as[String]
            case _ => ???
          }
        case _ => ???
      }
    case _ => ???
  }
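
The case-class approach mentioned above could look roughly like the following. It is a sketch that assumes every document has the shape shown in the sample response; Doc, SolrResponse, and SolrJson are hypothetical names, not anything Solr or play-json defines. Using validate instead of get means a response with an unexpected shape comes back as a JsError rather than throwing.

```scala
import play.api.libs.json._

// Hypothetical case classes mirroring the "response" part of the sample output
case class Doc(domain: Seq[String])
case class SolrResponse(numFound: Long, start: Long, docs: Seq[Doc])

object SolrJson {
  // Doc has a single field, so its Reads is written out by hand
  implicit val docReads: Reads[Doc] =
    (__ \ "domain").read[Seq[String]].map(Doc(_))

  // The macro derives a Reads from the case class fields;
  // it relies on docReads being in scope for the "docs" field
  implicit val responseReads: Reads[SolrResponse] = Json.reads[SolrResponse]

  // Parses a raw Solr response body and returns the first domain of each doc
  def domains(body: String): JsResult[Seq[String]] =
    (Json.parse(body) \ "response")
      .validate[SolrResponse]
      .map(_.docs.flatMap(_.domain.headOption))
}
```

With the result string from earlier, SolrJson.domains(result) can be matched against JsSuccess(domains, _) for the happy path and JsError for anything that does not fit the case classes.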
