{"id":5624,"date":"2018-01-07T17:51:14","date_gmt":"2018-01-07T17:51:14","guid":{"rendered":"http:\/\/www.garysieling.com\/blog\/?p=5624"},"modified":"2018-01-07T17:51:14","modified_gmt":"2018-01-07T17:51:14","slug":"query-solr-lucene-repository-scala","status":"publish","type":"post","link":"https:\/\/www.garysieling.com\/blog\/query-solr-lucene-repository-scala\/","title":{"rendered":"Query a Solr\/Lucene Repository from Scala"},"content":{"rendered":"<p>You can query a Solr index directly with the Lucene APIs, as long as you know the path to the index and the Lucene version that wrote it.<\/p>\n<p>These are the dependencies you need:<\/p>\n<pre lang=\"scala\">\nlibraryDependencies += \"org.apache.lucene\" % \"lucene-core\" % \"7.2.0\"\nlibraryDependencies += \"org.apache.lucene\" % \"lucene-queryparser\" % \"7.2.0\"\n<\/pre>\n<p>And the imports:<\/p>\n<pre lang=\"scala\">\nimport java.nio.file.FileSystems\n\nimport org.apache.lucene.analysis.standard.StandardAnalyzer\nimport org.apache.lucene.index.DirectoryReader\nimport org.apache.lucene.queryparser.classic.MultiFieldQueryParser\nimport org.apache.lucene.search.{IndexSearcher, ScoreDoc}\nimport org.apache.lucene.store.SimpleFSDirectory\n\nimport scala.collection.JavaConverters._\n<\/pre>\n<p>Finally, the code to run a query and print the title of each result:<\/p>\n<pre lang=\"scala\">\n\/\/ Location of the Lucene index inside the Solr data directory\nval indexLocation = \"C:\\\\projects\\\\solr-7.0.0\\\\server\\\\solr\\\\talks\\\\data\\\\index\"\nval indexPath = FileSystems.getDefault().getPath(indexLocation)\nval directory = new SimpleFSDirectory(indexPath)\n\nval reader = DirectoryReader.open(directory)\nval searcher = new IndexSearcher(reader)\nval analyzer = new StandardAnalyzer()\n\n\/\/ Search both fields, boosting title matches over transcript matches\nval qp = new MultiFieldQueryParser(\n  Array[String](\"title_s\", \"auto_transcript_txt_en\"),\n  analyzer,\n  Map(\n    \"title_s\" -> java.lang.Float.valueOf(2.0f),\n    \"auto_transcript_txt_en\" -> java.lang.Float.valueOf(1.0f)\n  ).asJava\n)\n\nval q = qp.parse(\"title_s:python OR auto_transcript_txt_en:python\")\n\n\/\/ Fetch up to 1000 hits and print the title of each one\nval results = searcher.search(q, 1000)\nprintln(\"total: \" + results.totalHits)\nresults.scoreDocs.foreach(\n  (docMeta: ScoreDoc) => {\n    val doc = reader.document(docMeta.doc)\n    println(doc.getField(\"title_s\").stringValue())\n  }\n)\n\n\/\/ Release the index files once the results have been read\nreader.close()\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Query a Lucene index from Scala<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[4],"tags":[300,348,480,517],"aioseo_notices":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/5624"}],"collection":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/comments?post=5624"}],"version-history":[{"count":0,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/5624\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/media?parent=5624"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/categories?post=5624"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/tags?post=5624"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}