Apache Solr – Facets Configuration

In the previous post, we saw how to configure the Apache Solr data import handler to import data from a database into Solr, and we ran some basic search queries against it. In this post, we will see how to configure facets and how to filter the results based on them. Before configuring facets, we will answer a couple of questions.

What is a facet?

A facet is a specific property extracted from the data. It might be a flat list that allows only one choice, or it might be a hierarchical list. The combination of all facets and their values is often called the facet taxonomy.

What is the importance of facets or guided navigation?

This is best explained with an example. Suppose that, as an end user searching for products in an e-commerce store:

  • I want shoes

  • I want “Puma” shoes

  • I want 9-inch shoes

  • I want white shoes

Without grouping or facets, finding 9-inch, white, Puma shoes among the thousands of shoes in the product catalog is very difficult. To make search enjoyable, we need to offer facets to the user. From the facets, the user can pick a group to filter the results and find the shoes they want. So, providing the right facets ultimately improves the user experience and increases sales.

With Solr, we can configure date facets, range facets (for example, on price), and normal field facets (on String fields, Text fields, and so on). In this example, we will configure Solr to provide facets on “Retailer name” (the retailerName field) and “Sale Amount” (the salePrice field). We can enable facets either by adding the configuration to the request handler or by sending the facet parameters at query time.

The request handler configuration is given below.


<requestHandler name="standard" class="solr.StandardRequestHandler" default="true">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
  <lst name="invariants">
    <str name="facet">on</str>
    <str name="facet.field">retailerName</str>
    <str name="facet.mincount">1</str>
    <str name="facet.range.other">after</str>
    <str name="facet.range">salePrice</str>
    <int name="f.salePrice.facet.range.start">0</int>
    <int name="f.salePrice.facet.range.end">400</int>
    <int name="f.salePrice.facet.range.gap">100</int>
  </lst>
</requestHandler>
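
As a rough sketch of the query-time alternative, the same facet settings can be passed as request parameters instead of handler configuration. The Python snippet below builds such a URL. The host and port (localhost:8983) and the function name are assumptions for illustration; the core name and parameter values mirror the configuration above. Keep in mind that if the handler keeps these settings under invariants, request parameters with the same names will not override them.

```python
from urllib.parse import urlencode

# Assumed host and port; the core name "sampleCatalog" comes from the URLs in this post.
SOLR_SELECT_URL = "http://localhost:8983/solr/sampleCatalog/select"


def build_facet_query_url(base_url=SOLR_SELECT_URL):
    """Return a select URL that requests the same facets at query time."""
    params = [
        ("q", "*:*"),
        ("wt", "xml"),
        ("indent", "true"),
        ("facet", "on"),
        # Field facet on the retailer name.
        ("facet.field", "retailerName"),
        ("facet.mincount", "1"),
        # Range facet on the sale price: buckets of 100 from 0 to 400,
        # plus a count of everything above the end of the range.
        ("facet.range", "salePrice"),
        ("facet.range.other", "after"),
        ("f.salePrice.facet.range.start", "0"),
        ("f.salePrice.facet.range.end", "400"),
        ("f.salePrice.facet.range.gap", "100"),
    ]
    # urlencode percent-encodes reserved characters such as "*" and ":".
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_facet_query_url())
```

Sending the parameters per request is handy while experimenting; moving them into the handler keeps client code shorter once the facets are settled.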

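Now, when querying Solr, send the “facet=true” query parameter. This tells Solr to return the facets as part of the response. (Because the handler above already sets facet=on as an invariant, the parameter is redundant with this particular configuration, but it is required when faceting is not enabled in the handler.) For example:

http://<host>:<port>/solr/sampleCatalog/select?q=*%3A*&wt=xml&indent=true&facet=true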

The search response for facets is given below.


<lst name="facet_counts">
<lst name="facet_queries"/>
<lst name="facet_fields">
<lst name="retailerName">
<int name="Classic Metal Creations">10</int>
<int name="Carousel DieCast Legends">9</int>
<int name="Exoto Designs">9</int>
<int name="Gearbox Collectibles">9</int>
<int name="Highway 66 Mini Classics">9</int>
<int name="Motor City Art Classics">9</int>
<int name="Autoart Studio Design">8</int>
<int name="Min Lin Diecast">8</int>
<int name="Second Gear Diecast">8</int>
<int name="Studio M Art Models">8</int>
<int name="Unimax Art Galleries">8</int>
<int name="Welly Diecast Productions">8</int>
<int name="Red Start Diecast">7</int>
</lst>
</lst>
<lst name="facet_dates"/>
<lst name="facet_ranges">
<lst name="salePrice">
<lst name="counts">
<int name="0.0">8</int>
<int name="100.0">40</int>
<int name="200.0">46</int>
<int name="300.0">11</int>
</lst>
<double name="gap">100.0</double>
<double name="start">0.0</double>
<double name="end">400.0</double>
<int name="after">5</int>
</lst>
</lst>
</lst>
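
To show how a client might consume this part of the response, here is a minimal Python sketch that reads the field and range facet counts from the XML. The function name and the trimmed SAMPLE_RESPONSE are illustrative assumptions; the element layout follows the response shown above.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the facet response above, wrapped in <response> for parsing.
SAMPLE_RESPONSE = """<response>
<lst name="facet_counts">
  <lst name="facet_fields">
    <lst name="retailerName">
      <int name="Classic Metal Creations">10</int>
      <int name="Red Start Diecast">7</int>
    </lst>
  </lst>
  <lst name="facet_ranges">
    <lst name="salePrice">
      <lst name="counts">
        <int name="0.0">8</int>
        <int name="100.0">40</int>
      </lst>
      <double name="gap">100.0</double>
      <int name="after">5</int>
    </lst>
  </lst>
</lst>
</response>"""


def parse_facets(xml_text):
    """Return (field_facets, range_facets) as nested dicts of counts."""
    root = ET.fromstring(xml_text)
    facet_counts = root.find("lst[@name='facet_counts']")

    # Field facets: {field name: {facet value: count}}
    field_facets = {}
    for field in facet_counts.findall("lst[@name='facet_fields']/lst"):
        field_facets[field.get("name")] = {
            entry.get("name"): int(entry.text) for entry in field.findall("int")
        }

    # Range facets: {field name: {bucket start: count}}
    range_facets = {}
    for rng in facet_counts.findall("lst[@name='facet_ranges']/lst"):
        range_facets[rng.get("name")] = {
            entry.get("name"): int(entry.text)
            for entry in rng.findall("lst[@name='counts']/int")
        }
    return field_facets, range_facets


if __name__ == "__main__":
    fields, ranges = parse_facets(SAMPLE_RESPONSE)
    print(fields)
    print(ranges)
```

Each key under a range facet is the lower bound of its bucket, so "100.0" with a gap of 100.0 stands for the bucket that starts at 100.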

As per the facet configuration, the response contains facets on retailer name and sale price. Now, we will filter the search results for the retailer “Red Start Diecast” by applying the filter query parameter (fq=retailerName:"Red Start Diecast"). According to the facet counts above, this retailer has 7 results. For example:

http://<host>:<port>/solr/sampleCatalog/select?q=*%3A*&wt=xml&indent=true&facet=true&fq=retailerName:%22Red%20Start%20Diecast%22

The search response is given below.


<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">82</int>
<lst name="params">
<str name="facet">true</str>
<str name="indent">true</str>
<str name="q">*:*</str>
<str name="wt">xml</str>
<str name="fq">retailerName:"Red Start Diecast"</str>
</lst>
</lst>
<result name="response" numFound="7" start="0">...</result>
<lst name="facet_counts">
<lst name="facet_queries"/>
<lst name="facet_fields">
<lst name="retailerName">
<int name="Red Start Diecast">7</int>
</lst>
</lst>
<lst name="facet_dates"/>
<lst name="facet_ranges">
<lst name="salePrice">
<lst name="counts">
<int name="0.0">1</int>
<int name="200.0">5</int>
<int name="300.0">1</int>
</lst>
<double name="gap">100.0</double>
<double name="start">0.0</double>
<double name="end">400.0</double>
<int name="after">0</int>
</lst>
</lst>
</lst>
</response>
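
The filter query used above can also be assembled in code. The following Python sketch quotes the facet value as a phrase and URL-encodes the request; the helper names and the localhost:8983 address are assumptions for illustration only.

```python
from urllib.parse import urlencode


def field_filter(field, value):
    """Build a phrase filter such as retailerName:"Red Start Diecast"."""
    # Escape backslashes and double quotes so the value stays one quoted phrase.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def build_filtered_url(base_url, fq):
    """Return a select URL that matches all documents and applies one filter query."""
    params = [
        ("q", "*:*"),
        ("wt", "xml"),
        ("indent", "true"),
        ("facet", "true"),
        ("fq", fq),
    ]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    fq = field_filter("retailerName", "Red Start Diecast")
    # Assumed host and port; the core name comes from the URLs in this post.
    print(build_filtered_url("http://localhost:8983/solr/sampleCatalog/select", fq))
```

Here urlencode writes spaces as "+" rather than "%20"; in a query string the two forms are equivalent.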

Next, we will filter the search results to the 100 to 200 price range. According to the range facet counts above, this range should return 40 results. For example:

http://<host>:<port>/solr/sampleCatalog/select?q=*%3A*&wt=xml&indent=true&facet=true&fq=salePrice:[100%20TO%20199]

The search response is given below.


<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">4</int>
<lst name="params">
<str name="facet">true</str>
<str name="indent">true</str>
<str name="q">*:*</str>
<str name="wt">xml</str>
<str name="fq">salePrice:[100 TO 199]</str>
</lst>
</lst>
<result name="response" numFound="40" start="0">
<doc>...</doc>
<doc>...</doc>
<doc>...</doc>
<doc>...</doc>
<doc>...</doc>
<doc>...</doc>
<doc>...</doc>
<doc>...</doc>
<doc>...</doc>
<doc>...</doc>
</result>
<lst name="facet_counts">
<lst name="facet_queries"/>
<lst name="facet_fields">
<lst name="retailerName">
<int name="Studio M Art Models">5</int>
<int name="Autoart Studio Design">4</int>
<int name="Classic Metal Creations">4</int>
<int name="Gearbox Collectibles">4</int>
<int name="Highway 66 Mini Classics">4</int>
<int name="Min Lin Diecast">4</int>
<int name="Motor City Art Classics">4</int>
<int name="Unimax Art Galleries">4</int>
<int name="Carousel DieCast Legends">2</int>
<int name="Exoto Designs">2</int>
<int name="Welly Diecast Productions">2</int>
<int name="Second Gear Diecast">1</int>
</lst>
</lst>
<lst name="facet_dates"/>
<lst name="facet_ranges">
<lst name="salePrice">
<lst name="counts">
<int name="100.0">40</int>
</lst>
<double name="gap">100.0</double>
<double name="start">0.0</double>
<double name="end">400.0</double>
<int name="after">0</int>
</lst>
</lst>
</lst>
</response>
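
One detail worth sketching: with the default facet.range.include setting (lower), each range bucket includes its lower bound and excludes its upper bound, so the "100.0" bucket covers 100 up to, but not including, 200. A filter of [100 TO 199] returns the same 40 documents here, but it would miss a price such as 199.50. The Python helper below derives a filter from a bucket start and gap using a mixed inclusive/exclusive range; it assumes a Solr version that accepts that syntax (4.0 and later, as far as I know), and the function names are hypothetical.

```python
def bucket_filter(field, bucket_start, gap):
    """Build a filter for one range bucket: lower bound inclusive, upper exclusive."""
    lower = bucket_start
    upper = bucket_start + gap
    # "[" includes the lower bound and "}" excludes the upper bound.
    # The :g format drops a trailing ".0", so 100.0 is written as 100.
    return f"{field}:[{lower:g} TO {upper:g}}}"


def after_filter(field, range_end):
    """Build a filter matching the 'after' count: everything from range_end up."""
    return f"{field}:[{range_end:g} TO *]"


if __name__ == "__main__":
    print(bucket_filter("salePrice", 100.0, 100.0))
    print(after_filter("salePrice", 400.0))
```

The resulting string would be sent as the fq parameter, URL-encoded like the other examples in this post.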

I am Siva Prasad Rao Janapati, working as a software developer. I have hands-on experience with ATG Commerce (DAS/DPS/DCS), Mozu commerce, Broadleaf Commerce, Java, JEE, Spring, Play, JPA, Hibernate, Velocity, JMS, Jboss, Weblogic, Tomcat, Jetty, Apache, Apache Solr, Spring Batch, JQuery, NodeJS, SOAP, REST, MySQL, Oracle, Mongo DB, Memcached, HazelCast, Git, SVN, CVS, Ant, Maven, Gradle, Amazon Web services, Rackspace, Quartz, JMeter, Junit, Open NLP, Facebook Graph, Twitter4J, YouTube Gdata, Bazzarvoice, Yotpo, 4-Tell, Alatest, Shopzilla, and Linkshare, covering both open source and commercial technologies.
