Indexing additional content with SmartSearch
15th of August, 2016

SmartSearch is a great way to get good performance for searches, and reduce your SQL query count. This applies to any Kentico object, including custom classes in your modules.

One limitation that could trouble you is that relational data is not indexed. Your index can include all columns of your class, but there are no configuration options to include any related data.

What are we dealing with?

I am not going to go into detail about Lucene and Lucene.Net (the engine that SmartSearch is based on); however, it is important to have at least a basic understanding of what it is. In a nutshell, your data is stored in flat files, and what ends up in them depends on the configuration of your index and the search settings of your object's fields.

Let's take a look at some of the columns inside an index from the Dancing Goat sample site:

Name Value
_content coffee-processing-techniques coffee processing techniques /articles/coffee processing techniques coffee processing techniques before a ripe coffee plant transforms into a delicious cup sitting on your table, it
_created 20141001215810
_culture en-us
_id 119;73
_index DancingGoat.Pages
_site dancinggoat
_type cms.document
classname dancinggoat.coffee
documentcreatedbyuserid 10000000053
documentcreatedwhen 20141001215810
documentculture en-us

Kentico has documented how to customise the content of search indexes; however, this only affects the _content field.

The _content field is used for searching by text input, and usually contains keywords. It cannot be used for storing custom data that can be easily retrieved.

Indexing additional content on pages (documents)

There are two ways to accomplish this - the DocumentEvents.GetContent.Execute event, and overriding the GetSearchFields method. Both methods will allow you to use the GetSearchValue(string columnName) transformation method to display the value from your field.


DocumentEvents

By subscribing to the DocumentEvents.GetContent.Execute event, you can access the SearchDocument property of the event's DocumentSearchEventArgs. The SearchDocument property allows us to add additional fields to the SmartSearch index.

Assuming you have a custom module:

protected override void OnInit()
{
    // Subscribe to the event
    DocumentEvents.GetContent.Execute += Document_GetContent_Execute;
}

private static void Document_GetContent_Execute(object sender, DocumentSearchEventArgs e)
{
    var customData = "Custom field data";

    // Add custom data to the index
    e.SearchDocument.Add("CustomField", customData, true, false);
}

The parameters of the Add method are:

  • string name - the name of your field. This name will be used to reference the field in your transformations and conditions
  • string value - the value to store in the field
  • bool store = true - indicates if the data should be stored in the index. You need this set to true
  • bool tokenized = false - indicates if the data should be tokenised, to allow search to find results that match tokens (subsets) of the field's value. This is usually not needed, and can be left as false

Pros:

  • Straightforward way to add custom data - by simply subscribing to a global event, and injecting custom data.
  • You could subscribe to the event multiple times, to declutter your code and separate concerns

Cons:

  • Runs for all documents that are covered by indexes. You will need to manually check for the page type and/or index before adding custom data. This increases CPU cycles, and you might forget to limit your code to apply to certain pages only.
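The page type check mentioned above could look something like the sketch below. It assumes that DocumentSearchEventArgs exposes the page being indexed through a Node property, and that the page type code name is available as NodeClassName; verify both against your Kentico version. The class name DancingGoat.Coffee comes from the index sample shown earlier, and CoffeeSearchHandlers is a made-up name used only for illustration.

```csharp
using System;
using CMS.DocumentEngine;

// Hypothetical helper class, named for illustration only
public static class CoffeeSearchHandlers
{
    // Page type code name taken from the Dancing Goat index sample
    private const string CoffeeClassName = "DancingGoat.Coffee";

    public static void Document_GetContent_Execute(object sender, DocumentSearchEventArgs e)
    {
        // Assumption: e.Node is the page currently being indexed
        var page = e.Node;

        // Skip every page that is not of the page type we care about
        if (page == null ||
            !CoffeeClassName.Equals(page.NodeClassName, StringComparison.OrdinalIgnoreCase))
        {
            return;
        }

        // Same call as in the handler above: stored, not tokenised
        e.SearchDocument.Add("CustomField", "Custom field data", true, false);
    }
}
```

The handler would be subscribed in OnInit in the same way as before. Returning early keeps the extra work for unrelated pages down to a single string comparison, and keeping one handler per page type makes it harder to forget the check.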

GetSearchFields

Another way of accomplishing the task is to override the GetSearchFields method.

For this method, classes for your page type must be generated.

In a partial class that extends the generated class of your page type:

public override ISearchFields GetSearchFields(ISearchIndexInfo index, ISearchFields searchFields = null)
{
  var sf = base.GetSearchFields(index, searchFields);

  // Add custom data
  sf.Add("CustomField", true, false, () =>
  {
    var customData = "Custom field data";

    return customData;
  }, true);

  return sf;
}

The parameters of the Add method are:

  • string fieldName - the name of your field. This name will be used to reference the field in your transformations and conditions
  • bool searchable - indicates if the field can be searched (used in search conditions of SmartSearch webparts)
  • bool tokenized - indicates if the data should be tokenised, to allow search to find results that match tokens (subsets) of the field's value. This is usually not needed
  • Func<object> getValueFunc - function that returns the value of the field. The function (method or lambda expression) will be executed when SmartSearch is retrieving the value of your field
  • bool insertDirectly - indicates if the field should be directly inserted into the index. If set to true, I have noticed that the value is converted to lowercase, so you would have this set to false in most cases

Pros:

  • No need to check for the page type
  • Probably a cleaner solution

Cons:

  • The page type code must be generated for the pages you are adding additional data to

Indexing additional content in custom classes

The methods for adding additional fields to indexes of custom classes (also known as general indexes) are slightly different from the ones used for pages. Classes also have a GetContent event (ObjectEvents.GetContent.Execute); however, its event arguments do not contain a SearchDocument property.

This leaves us with a single option - overriding the GetSearchFields method.

You will need the Info and InfoProvider classes generated for your custom classes.

In the Info class of your custom class:

public override ISearchFields GetSearchFields(ISearchIndexInfo index, ISearchFields searchFields = null)
{
  var sf = base.GetSearchFields(index, searchFields);

  // Add custom data
  sf.Add("CustomField", true, false, () =>
  {
    var customData = "Custom field data";

    return customData;
  }, true);

  return sf;
}

A bonus trick is to use the magic of IDataContainer to clean up the GetSearchFields method. We can use RelatedData to add additional properties to our custom class.

Following Kentico's instructions, we subscribe to the OnLoadRelatedData event:

protected override void OnInit()
{
    // Register the handler that supplies related data for MyCustomClass objects
    MyCustomClass.TYPEINFO.OnLoadRelatedData += MyCustomClass_LoadRelatedData;
}

private MyCustomClassData MyCustomClass_LoadRelatedData(BaseInfo infoObj)
{
    // Cast the infoObj to our class
    var myClass = (MyCustomClass)infoObj;

    // Retrieve data from the database or an external source...

    // Return a MyCustomClassData object
    return new MyCustomClassData
    {
        CustomField = "Custom field data"
    };
}

The MyCustomClassData class would implement the IDataContainer interface. It can also contain the logic to retrieve the additional data, as opposed to having it in the MyCustomClass_LoadRelatedData method.

public class MyCustomClassData : IDataContainer
{
  private string _customField;

  public string CustomField
  {
    get { return _customField; }
    set { _customField = value; }
  }

  public List<string> ColumnNames
  {
    get
    {
      return new List<string>
      {
        "CustomField"
      };
    }
  }

  public bool ContainsColumn(string columnName)
  {
    switch (columnName.ToLower())
    {
      case "customfield":
      return true;

      default: return false;
    }
  }

  public object GetValue(string columnName)
  {
    switch (columnName.ToLower())
    {
      case "customfield":
      return _customField;

      default:
      return null;
    }
  }

  public bool SetValue(string columnName, object value)
  {
    switch (columnName.ToLower())
    {
      case "customfield":
      _customField = (string)value;
      return true;

      default:
      return false;
    }
  }

  public bool TryGetValue(string columnName, out object value)
  {
    switch (columnName.ToLower())
    {
      case "customfield":
      value = _customField;
      return true;

      default:
      value = null;
      return false;
    }
  }

  public object this[string columnName]
  {
    get { return GetValue(columnName); }
    set { SetValue(columnName, value); }
  }
}

This will let us use GetValue("CustomField") on objects of the MyCustomClass type, and a CustomField property in macros. But the main benefit is that it simplifies our GetSearchFields method:

public override ISearchFields GetSearchFields(ISearchIndexInfo index, ISearchFields searchFields = null)
{
  var sf = base.GetSearchFields(index, searchFields);

  sf.Add("CustomField", true, false, () => GetValue("CustomField"), true);

  return sf;
}

Use the force

If you are ever stuck trying to get the right data in your index, and you have no idea if it is there or not, use Luke. It is a useful tool that will display the actual data in your index.

Written by Kristian Bortnik


