Coveo for Sitecore: Customizing Index Parameters
Zoran Babovic, Perficient Blogs, Fri, 11 Aug 2017
https://blogs.perficient.com/2017/08/11/coveo-sitecore-use-no-index-field-exclude-items-index/

Excluding items from Coveo indexing and search

Working with Coveo for Sitecore, you’ve probably encountered the need to prevent some items from being indexed or displayed in your search results. The other day a client asked me how they could exclude specific user-related pages (e.g. login, registration, and change password pages) from showing up in search results. In their own words, they needed an easy way to mark specific pages and prevent them from being indexed.

Sure, you could exclude certain templates by adding them to the excludeTemplate section in Coveo.SearchProvider.config. However, since these pages share a common template with other pages that should be indexed, that isn’t an option here. Another option would be to add filtering rules that keep items out of your search results based on certain criteria. However, these rules become part of the advanced query that gets dispatched each time you run a search, so adding too many of them may impact performance.

This is where a custom field comes into play. In the example below we used the No Index field, which comes with the SCORE accelerator out of the box as part of the Page Meta Data template. However, any Checkbox field can be used to achieve this functionality.

No Index field offers a more granular approach

In order to use this field during indexing, you will need to add a custom processor to the coveoInboundFilterPipeline. The first step is to implement a class that inherits from AbstractCoveoInboundFilterProcessor:

public class NoIndexFilter : AbstractCoveoInboundFilterProcessor
{
    public override void Process(CoveoInboundFilterPipelineArgs args)
    {
        // Skip items that are missing, already excluded, or not targeted by this processor.
        if (args.IndexableToIndex == null || args.IsExcluded || !ShouldExecute(args))
        {
            return;
        }

        var item = args.IndexableToIndex.Item;

        // A checked Sitecore Checkbox field stores "1" as its raw value.
        if (item.GetField("No Index") != null && item.GetFieldValue("No Index") == "1")
        {
            args.IsExcluded = true;
        }
    }
}
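
The exclusion decision itself is small enough to isolate and unit test without the Coveo or Sitecore assemblies. The following is only a sketch: NoIndexRule and ShouldExclude are hypothetical names, not part of Coveo, Sitecore, or SCORE. It assumes the field is a Sitecore Checkbox, whose raw value is "1" when checked and empty otherwise.

```csharp
using System;

// Hypothetical helper for illustration; not part of Coveo, Sitecore, or SCORE.
public static class NoIndexRule
{
    // Raw value Sitecore stores for a checked Checkbox field.
    private const string CheckedValue = "1";

    // Returns true when the raw "No Index" value means "exclude this item".
    // A null value stands for an item whose template lacks the field.
    public static bool ShouldExclude(string rawFieldValue)
    {
        return string.Equals(rawFieldValue, CheckedValue, StringComparison.Ordinal);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(NoIndexRule.ShouldExclude("1"));   // True
        Console.WriteLine(NoIndexRule.ShouldExclude(""));    // False
        Console.WriteLine(NoIndexRule.ShouldExclude(null));  // False
    }
}
```

Keeping the rule in one place like this means the processor only has to read the raw field value and pass it along, and the same check can be reused if you later switch to a different Checkbox field.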

Last step, configure the Coveo Search Provider

After that, all that remains is to add this processor to the coveoInboundFilterPipeline in Coveo.SearchProvider.Custom.config:

<pipelines>
  <coveoInboundFilterPipeline>
    <processor type="MyNamespace.Data.Processors.NoIndexFilter, MyNamespace.Data" />
  </coveoInboundFilterPipeline>
</pipelines>

Using this method gives you more granular control over what gets indexed, and it can speed up your indexing operations because excluded items are skipped. Until next time.

Happy Coveoing 🙂
Zoran

Coveo Database Connector: Refresh Schedule
Zoran Babovic, Perficient Blogs, Fri, 21 Jul 2017
https://blogs.perficient.com/2017/07/21/coveo-database-connector-refresh-schedule/

Assuming that you have set up the Coveo database connector correctly and that you are able to index data, the next challenge you will probably face is how to set up a refresh schedule. If you are dealing with a database source that’s frequently updated, you need to propagate these changes to your index so that your search results display up-to-date data. This can be challenging if you’re dealing with a large data set. Let’s look at some of the options for setting up a refresh scheduling strategy.

Keep calm and schedule on

Setting a Refresh Schedule

Incremental Refresh will allow you to propagate new and modified items from your database source to your index. You can set up Incremental Refresh on short intervals to have up-to-date data in your index. As a safety net, you probably want to do a full Rebuild once in a while to handle deleted items and make sure that your index data is consistent.

In the example below, we have set up Incremental Refresh to run every 30 minutes and Rebuild to run once a week, every Sunday.

Refresh Schedule

Incremental Refresh

A prerequisite for this feature is a date-type field in your database source, and that field must be updated in the database each time the record is updated. To set up Incremental Refresh, you must use this field in the WHERE clause of your query. As you can see in the example below, the field name in our database is dateModified. To fetch only the latest content from our database source, we compare it against @LastRefresh, a variable that Coveo populates dynamically with the time of the last refresh.

<Accessor type="query"
          OrderByFieldName="dateCreated"
          OrderByFieldType="DateTime"
          IncrementalRefreshFieldName="dateModified">
  <![CDATA[
    SELECT id, title, dateModified, content, author
    FROM blog
    WHERE dateModified >= @LastRefresh
    ORDER BY dateModified;
  ]]>
</Accessor>
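
To make the effect of the WHERE clause concrete, here is a minimal in-memory sketch of the same filter. It is not how the connector is implemented: BlogRow and IncrementalRefreshSketch are hypothetical names, and the rows and timestamps are made-up sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a row of the blog table.
public sealed class BlogRow
{
    public int Id { get; set; }
    public string Title { get; set; }
    public DateTime DateModified { get; set; }
}

public static class IncrementalRefreshSketch
{
    // Mirrors: WHERE dateModified >= @LastRefresh ORDER BY dateModified
    public static List<BlogRow> RowsToRefresh(IEnumerable<BlogRow> table, DateTime lastRefresh)
    {
        return table
            .Where(row => row.DateModified >= lastRefresh)
            .OrderBy(row => row.DateModified)
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var lastRefresh = new DateTime(2017, 7, 21, 12, 0, 0);
        var table = new List<BlogRow>
        {
            new BlogRow { Id = 1, Title = "Unchanged", DateModified = new DateTime(2017, 7, 20, 9, 0, 0) },
            new BlogRow { Id = 2, Title = "Edited", DateModified = new DateTime(2017, 7, 21, 12, 10, 0) },
            new BlogRow { Id = 3, Title = "New", DateModified = new DateTime(2017, 7, 21, 12, 5, 0) },
        };

        foreach (var row in IncrementalRefreshSketch.RowsToRefresh(table, lastRefresh))
        {
            Console.WriteLine($"{row.Id}: {row.Title}");
        }
        // Prints "3: New" then "2: Edited"; row 1 is skipped.
    }
}
```

The sketch also shows why the date field has to be maintained on every update: a record edited without touching dateModified never passes the filter. Likewise, a deleted record simply stops appearing in the result, so an incremental pass has no way to notice it.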

Don’t Forget to Set a Maintenance Rebuild Schedule

OK, we are indexing only modified data every 30 minutes, and the time has come for the weekly cleanup: a scheduled full Rebuild. Coveo will use the same query and populate the value of @LastRefresh so that it pulls all the content from our database source. But what happens if this is a large data set? Fetching it in one go could take a long time and possibly end up timing out your session.

In this case, we want to use pagination and fetch our database records in batches. To be able to do that, we must add an OFFSET clause when configuring our query. Coveo will set the values of @startRow and @endRow and run the query multiple times until it has fetched all the records. Now a full index Rebuild can be performed.

<Accessor type="query"
          OrderByFieldName="dateCreated"
          OrderByFieldType="DateTime"
          IncrementalRefreshFieldName="dateModified">
  <![CDATA[
    SELECT id, title, dateModified, content, author
    FROM blog
    WHERE dateModified >= @LastRefresh
    ORDER BY dateModified
    /* ADD PAGINATION TO YOUR QUERY */
    OFFSET @startRow ROWS FETCH NEXT (@endRow - @startRow) ROWS ONLY;
  ]]>
</Accessor>
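
The arithmetic behind the paged query can be sketched as follows. This is an illustration under stated assumptions, not the connector's actual logic: PagingSketch and Windows are hypothetical names, and the sketch assumes that @startRow is zero-based and that @endRow is always @startRow plus the page size.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the Coveo connector.
public static class PagingSketch
{
    // Yields (StartRow, EndRow) pairs that cover totalRows in batches of pageSize.
    // Assumption: zero-based startRow, and endRow = startRow + pageSize.
    public static IEnumerable<(int StartRow, int EndRow)> Windows(int totalRows, int pageSize)
    {
        if (pageSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(pageSize));
        }

        for (int startRow = 0; startRow < totalRows; startRow += pageSize)
        {
            yield return (startRow, startRow + pageSize);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // 25 rows fetched 10 at a time need three runs of the query.
        foreach (var (startRow, endRow) in PagingSketch.Windows(25, 10))
        {
            Console.WriteLine($"OFFSET {startRow} ROWS FETCH NEXT {endRow - startRow} ROWS ONLY");
        }
        // Prints:
        // OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY
        // OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY
        // OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY
    }
}
```

Under these assumptions the last window asks for 10 rows but the database returns only the 5 that remain. Note that in SQL Server, OFFSET ... FETCH is only valid together with an ORDER BY clause, which the query above already has.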

Configure Your Batch Page Size

If you are using Coveo On-Prem, you are probably wondering how to set your batch page size. The answer is that you can add it to your index configuration as an additional parameter: QueryPageSize.

Missing Piece

Conclusion

Working with the Coveo database connector is fun. It allows you to index the database directly and integrate database content into a unified Coveo index. Setting up your refresh schedule can be tricky, but Coveo’s scheduling options are flexible enough to meet your requirements. I hope you’ll find this blog post useful when choosing the right scheduling strategy.

Happy Coveoing 🙂 Until next time. Zoran
