Query a Solr/Lucene Repository from Scala

You can query a Solr index directly with the Lucene APIs, provided you know the path to the index and the Lucene version that wrote it.

These are the dependencies you need:

libraryDependencies += "org.apache.lucene" % "lucene-core" % "7.2.0"
libraryDependencies += "org.apache.lucene" % "lucene-queryparser" % "7.2.0"

And the imports:

import org.apache.lucene.analysis.standard.StandardAnalyzer
import org.apache.lucene.index.DirectoryReader
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser
import org.apache.lucene.search.{IndexSearcher, ScoreDoc}
import org.apache.lucene.store.SimpleFSDirectory

import java.nio.file.FileSystems
import scala.collection.JavaConverters._

Finally, the code to run a query and print the title of each result:

// Path to the index directory of the Solr core (here, the "talks" core).
val indexLocation = "C:\\projects\\solr-7.0.0\\server\\solr\\talks\\data\\index"
val indexPath = FileSystems.getDefault().getPath(indexLocation)
val directory = new SimpleFSDirectory(indexPath)

val reader = DirectoryReader.open(directory)
val searcher = new IndexSearcher(reader)
val analyzer = new StandardAnalyzer()

// Query parser whose default fields are title_s (boost 2.0)
// and auto_transcript_txt_en (boost 1.0).
val qp = new MultiFieldQueryParser(
  Array[String]("title_s", "auto_transcript_txt_en"),
  analyzer,
  Map(
    "title_s" -> java.lang.Float.valueOf(2.0f),
    "auto_transcript_txt_en" -> java.lang.Float.valueOf(1.0f)
  ).asJava
)

val q = qp.parse("title_s:python OR auto_transcript_txt_en:python")

// Fetch up to 1000 hits and print the stored title of each one.
val results = searcher.search(q, 1000)
println("total: " + results.totalHits)
results.scoreDocs.foreach { (docMeta: ScoreDoc) =>
  val doc = reader.document(docMeta.doc)
  println(doc.get("title_s"))
}

// Release the index files once we are done.
reader.close()
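
The index location used above has the shape <Solr install dir>/server/solr/<core name>/data/index. As a minimal sketch, assuming your installation keeps that layout (a core can be configured to store its data elsewhere), the path can be assembled from its parts instead of being hard-coded. The object name IndexPaths and the method indexDir are hypothetical names made up for this illustration; they are not part of Solr or Lucene.

```scala
import java.nio.file.{Path, Paths}

// Hypothetical helper, not part of Solr or Lucene.
object IndexPaths {
  // Assumes the layout <install dir>/server/solr/<core name>/data/index,
  // which is the one seen in the hard-coded path in this post.
  def indexDir(solrInstallDir: String, coreName: String): Path =
    Paths.get(solrInstallDir, "server", "solr", coreName, "data", "index")
}
```

For the setup in this post, IndexPaths.indexDir("C:\\projects\\solr-7.0.0", "talks") could then be passed to SimpleFSDirectory in place of indexPath. Building the path with Paths.get also avoids hand-escaping backslashes for every path segment.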
