{"id":4368,"date":"2016-06-13T02:40:12","date_gmt":"2016-06-13T02:40:12","guid":{"rendered":"http:\/\/www.garysieling.com\/blog\/?p=4368"},"modified":"2016-06-13T02:40:12","modified_gmt":"2016-06-13T02:40:12","slug":"random-sort-order-solr","status":"publish","type":"post","link":"https:\/\/www.garysieling.com\/blog\/random-sort-order-solr\/","title":{"rendered":"Random Sort order in Solr"},"content":{"rendered":"<p>Sorting by random values in Solr is an interesting concept. A few people have done this<sup><a href=\"#footnote_0_4368\" id=\"identifier_0_4368\" class=\"footnote-link footnote-identifier-link\" title=\"https:\/\/lucene.apache.org\/solr\/6_0_1\/solr-core\/org\/apache\/solr\/schema\/RandomSortField.html\">1<\/a><\/sup>, but I want to expand on a few more options here. First, there is a built-in random field<sup><a href=\"#footnote_1_4368\" id=\"identifier_1_4368\" class=\"footnote-link footnote-identifier-link\" title=\"http:\/\/solr.pl\/en\/2013\/04\/02\/random-documents-from-result-set-giveaway-results\/\">2<\/a><\/sup>, which you can sort by.<\/p>\n<p>If you just want a random sort order, you can do this:<\/p>\n<pre>\nsort=random_1234 DESC\n<\/pre>\n<p>This uses the built-in dynamic field, which treats any field name starting with random_ as a random value:<\/p>\n<pre lang=\"xml\">\n<dynamicField name=\"random_*\" type=\"random\"\/>\n<\/pre>\n<p>Unfortunately, you can&#8217;t use this field with a &#8220;copyField&#8221; to put random numbers into your index, as you might expect, so if you want these values in the index itself, you&#8217;ll have to generate them in your loading mechanism.<\/p>\n<p>Sorting by a random field does not add data to the index either; it just uses the field name to seed the random number generator. I tested this by sorting on a number of different field names and checking the index size over time. Every time you use the same field name, you will get the same sort order (at least until the index changes). If you don&#8217;t want this, you&#8217;ll need to generate new names, e.g. 
if you use the current date in the field name, the sort order will change once a day automatically.<\/p>\n<p>If you prefer to sort by relevancy score (or another attribute) and use the random field only to break ties, do this:<\/p>\n<pre>\nsort=score DESC,random_1234 DESC\n<\/pre>\n<p>If you want to do something more complex, this usage can fall apart. For instance, say you want to keep Solr&#8217;s relevancy algorithm, but fuzz it a bit. As of this writing there does not seem to be a random function available, so you may need to consider adding random numbers into your index.<\/p>\n<p>Suppose you are designing a system where each document has a topic, and you want to see roughly even numbers of documents matching each topic at the top of search results (all other things being equal).<\/p>\n<p>To do this, you can create a field containing a randomized numeric value, call it &#8220;topic_boost&#8221;. For each document, you choose a random number at index time, from 0 to the number of documents in its topic &#8211; this ensures that until a topic runs out, search results sorted by this value show a roughly even mix of topics.<\/p>\n<p>If you are using the edismax or dismax parser, you can then add this random number to the score with a boost function, like so:<\/p>\n<pre>\nbf=category_l1_ss_boost\n<\/pre>\n<p>When I was doing this, I found it easier to store negative values in this field, because otherwise the results sort in the wrong order. 
This also made it really easy to identify documents that didn&#8217;t have this treatment applied correctly, as any document with a positive score was missing the randomized fields.<\/p>\n<p>The nice thing about this approach is that you can apply it to several fields and add the results, or weight them, to adjust the balance of results you get back.<\/p>\n<pre>\nbf=add(category_l1_ss_boost,topic_boost)\n<\/pre>\n<ol class=\"footnotes\"><li id=\"footnote_0_4368\" class=\"footnote\">https:\/\/lucene.apache.org\/solr\/6_0_1\/solr-core\/org\/apache\/solr\/schema\/RandomSortField.html<span class=\"footnote-back-link-wrapper\"> [<a href=\"#identifier_0_4368\" class=\"footnote-link footnote-back-link\">&#8617;<\/a>]<\/span><\/li><li id=\"footnote_1_4368\" class=\"footnote\">http:\/\/solr.pl\/en\/2013\/04\/02\/random-documents-from-result-set-giveaway-results\/<span class=\"footnote-back-link-wrapper\"> [<a href=\"#identifier_1_4368\" class=\"footnote-link footnote-back-link\">&#8617;<\/a>]<\/span><\/li><\/ol>","protected":false},"excerpt":{"rendered":"<p>Exploring concepts for randomized search in 
Solr<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[11],"tags":[517],"aioseo_notices":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/4368"}],"collection":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/comments?post=4368"}],"version-history":[{"count":0,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/4368\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/media?parent=4368"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/categories?post=4368"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/tags?post=4368"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}