Fixing a broken faceted search in Sitecore with Lucene

September 29, 2016

A faceted search should allow users to filter search results easily by selecting each facet, but what happens when it doesn't work as you expect?

On one multilingual site, facets worked perfectly fine in most languages, except German.

The facet names were still in English at that point, yet on the German version of the site each name was being truncated, and the cause wasn't immediately obvious.

The facets themselves came from a computed content type field:

public object ComputeFieldValue(IIndexable indexable)
{
    // SitecoreIndexableItem converts implicitly to Item.
    Item item = indexable as SitecoreIndexableItem;

    if (item == null)
    {
        return null;
    }

    if (item.Fields["content type"] == null)
    {
        return null;
    }

    // Index the name of every item selected in the "content type" multilist.
    Sitecore.Data.Fields.MultilistField multilistField = item.Fields["content type"];
    List<string> typeList = new List<string>();

    foreach (Item i in multilistField.GetItems())
    {
        typeList.Add(i.Name);
    }

    return typeList.ToArray();
}

Nothing in it was specific to German, though. The results that came back from the search itself all had their facet values truncated:

var queryableResultItems = context.GetQueryable<SiteSearchResultItem>()
    .Where(keyExpression)
    .Where(i => i["_hide_from_search"] != "1")
    .Where(i => !i.Path.Contains("content-blocks"))
    .Where(i => !i.Path.Contains("sitecore/templates"))
    .Where(i => i.Language == Context.Language.Name)
    .FacetOn(i => i.ComputedContentType);

var queryableResults = queryableResultItems.GetResults();
searchResults.Facets = queryableResults.Facets;
searchResults.TotalRecordCount = queryableResults.Count();
searchResults.SearchRequest.PaginationInfo.RecordCount = searchResults.TotalRecordCount;
searchResults.TotalResults = searchResults.PageResults(queryableResults.Hits.Select(hit => hit.Document).ToList());

//Break up the results into categories according to the first facet
foreach (var value in searchResults.Facets.Categories[0].Values)
{
    var list = queryableResults.Hits
        .Select(hit => hit.Document)
        .Where(i => i.ComputedContentType == value.Name.ToLower())
        .ToList();

    searchResults.FacetedResultItems.Add(new FacetedResultItem
    {
        Keyword = searchResults.SearchRequest.Keyword,
        FacetValue = value.Name,
        Results = searchResults.PageResults(list),
        RecordCount = list.Count
    });
}

return searchResults;

As it turns out, marking the computed field as UN_TOKENIZED isn't enough on its own. With computed fields, you do need an entry under raw:AddComputedIndexField to define the field in the first place:

<field fieldName="computedcontenttype" storageType="YES" indexType="UN_TOKENIZED">Custom.ComputedContentType, Custom.Domain</field>

However, you also need to add a field with a string type under the fieldMap > raw:AddFieldByFieldName section:

<field fieldName="computedcontenttype" storageType="YES" indexType="UN_TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">
  <analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
</field>

If you don't add the second field, Lucene still uses word stemming on the computed field and doesn't treat it as a single entity. Adding it tells Lucene to treat the field as a single un-tokenized string.
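
To make the difference concrete, here is a rough, self-contained sketch of why an analyzed field produces broken facets while a keyword field does not. It does not use Lucene or Sitecore at all: FacetTermDemo, TokenizedTerms, KeywordTerms, CountFacets and the sample content type names are all made up for illustration, and the "analyzers" are simplified stand-ins (plain whitespace splitting, no stemming), so treat it as a model of the idea rather than of what Lucene actually does internally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for illustration only; these are NOT Lucene's analyzers.
public static class FacetTermDemo
{
    // Roughly what an analyzed (tokenized) field does: split the value into
    // separate lower-cased terms. Real analyzers may also stem or drop words.
    public static string[] TokenizedTerms(string value)
    {
        return value
            .ToLowerInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Roughly what a lower-case keyword analyzer does: keep the whole value
    // as a single lower-cased term.
    public static string[] KeywordTerms(string value)
    {
        return new[] { value.ToLowerInvariant() };
    }

    // Count facet values the way a facet query sees them: one bucket per term.
    public static Dictionary<string, int> CountFacets(
        IEnumerable<string> values, Func<string, string[]> analyze)
    {
        return values
            .SelectMany(value => analyze(value))
            .GroupBy(term => term)
            .ToDictionary(group => group.Key, group => group.Count());
    }

    public static void Main()
    {
        // Hypothetical content type names, one per indexed item.
        var contentTypes = new[] { "White Paper", "Case Study", "White Paper" };

        foreach (var pair in CountFacets(contentTypes, TokenizedTerms))
        {
            Console.WriteLine("tokenized: {0} ({1})", pair.Key, pair.Value);
        }

        foreach (var pair in CountFacets(contentTypes, KeywordTerms))
        {
            Console.WriteLine("keyword:   {0} ({1})", pair.Key, pair.Value);
        }
    }
}
```

With the tokenized version, "White Paper" ends up as two separate facet buckets ("white" and "paper"), which is the kind of chopped-up facet name the search was returning. With the keyword version there is a single "white paper" bucket, and because it is lower-cased, comparing against value.Name.ToLower() as the search code does is at least consistent with what is in the index.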
