First of all, I want to mention that EPiServer Find (now known as Optimizely Search and Navigation) is a great tool for searching content on an Optimizely site, whether it's CMS or Commerce. It comes with a range of customization options, plus filtering and boosting, to help you get the best results. Often, the out-of-the-box indexing and searching work just fine.
However, there are cases where you need something very specific or unique and have to write custom logic to get there. In my case, that was when I needed to pull all expired content from Find and process it a certain way.
Most of the search results I found online on the topic pointed me to code that filters expired content out of Find search results.
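For reference, that kind of snippet usually looks something like the sketch below. I'm assuming the static SearchClient.Instance client and a PageData search here, and the VisitorSearch class and its Search method are made-up names for illustration. As far as I know, FilterForVisitor() is the call that keeps deleted, unpublished and expired content out of the results.

```csharp
using EPiServer.Core;
using EPiServer.Find;
using EPiServer.Find.Cms;
using EPiServer.Find.Framework;

// Hypothetical wrapper, named only for this illustration.
public static class VisitorSearch
{
    // Runs a free-text search over pages and lets FilterForVisitor()
    // strip out content a site visitor should not see.
    public static IContentResult<PageData> Search(string query)
    {
        return SearchClient.Instance      // assumption: the static client is configured
            .Search<PageData>()
            .For(query)
            .FilterForVisitor()
            .GetContentResult();
    }
}
```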
However, that's not what I really needed. My goal was to actually retrieve all the expired assets in the search results, not to have them excluded. Another objective was to get them all with one query instead of querying each type individually. This bit took me a while to figure out, but I finally managed to put together a query that does exactly this.
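Pieced together from the breakdown that follows, the query looks roughly like this. Treat it as a sketch: the static SearchClient.Instance client and the ExpiredAssetFinder wrapper are assumptions for illustration, while the chained calls are the ones discussed piece by piece below.

```csharp
using EPiServer.Core;
using EPiServer.Find;
using EPiServer.Find.Cms;
using EPiServer.Find.Framework;

// Hypothetical wrapper, named only for this illustration.
public static class ExpiredAssetFinder
{
    // Returns up to 1000 expired blocks and media items in a single query.
    public static IContentResult<IContentData> GetExpiredAssets()
    {
        return SearchClient.Instance      // assumption: the static client is configured
            .Search<IContentData>()
            // Blocks and media, including any custom types derived from them.
            .Filter(x => x.MatchTypeHierarchy(typeof(BlockData)) | x.MatchTypeHierarchy(typeof(MediaData)))
            // Only items whose StopPublish date is already in the past.
            .Filter(x => ((IVersionable)x).StopPublish.BeforeNow())
            .Take(1000)
            .GetContentResult();
    }
}
```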
Here's a little breakdown of what each piece does:
.Search<IContentData>()
Why am I pulling IContentData here and not IContent?
Because blocks don't implement IContent. They inherit from the ContentData class, which in turn implements IContentData, and media types derive from ContentData as well. So by pulling items of type IContentData, I can expect the result to include both blocks and media (and more) types of content.
.Filter(x => x.MatchTypeHierarchy(typeof(BlockData)) | x.MatchTypeHierarchy(typeof(MediaData)))
Why am I filtering on BlockData and MediaData types only?
Because I only needed expired assets, and assets are always one of these two types.
Why am I using MatchTypeHierarchy() instead of MatchType()?
Because most of the time, we create custom base types for each of the Optimizely types in our projects to add whatever base functionality the project needs. My project was no different: it had a custom BaseBlockType that inherited from BlockData, and a few custom base media types that inherited from MediaData to handle different kinds of media such as images, documents and other files.
But when content is indexed in Find, it is indexed with the type it has in the project and not the base type it was derived from. So how do we tell Find to pull it using the Optimizely base types? By using MatchTypeHierarchy(). This matches the specified type and all subtypes derived from it, which gives you the desired results.
Now, if I instead needed only one specific block type from my project in the search results, I would have used .MatchType(typeof(MySpecificBlockType)) to get just that.
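To make the difference concrete, here is a rough analogy in plain C# reflection. It is not how Find evaluates the filters internally, and the classes below are stand-ins I've defined so the sample runs on its own (TeaserBlock is a made-up type); it only shows "this type or anything derived from it" versus "exactly this type".

```csharp
using System;
using System.Linq;

// Stand-ins for the CMS and project types; defined here only for illustration.
public abstract class BlockData { }
public abstract class BaseBlockType : BlockData { }
public class TeaserBlock : BaseBlockType { }
public class MySpecificBlockType : BaseBlockType { }

public static class TypeMatchDemo
{
    public static void Main()
    {
        // The concrete types, as they would appear in the index.
        Type[] indexedTypes = { typeof(TeaserBlock), typeof(MySpecificBlockType) };

        // Hierarchy-style match: anything deriving from BlockData qualifies.
        var hierarchy = indexedTypes.Where(t => typeof(BlockData).IsAssignableFrom(t));

        // Exact-style match: only the named type qualifies.
        var exact = indexedTypes.Where(t => t == typeof(MySpecificBlockType));

        Console.WriteLine(string.Join(", ", hierarchy.Select(t => t.Name))); // TeaserBlock, MySpecificBlockType
        Console.WriteLine(string.Join(", ", exact.Select(t => t.Name)));     // MySpecificBlockType
    }
}
```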
.Filter(x => ((IVersionable)x).StopPublish.BeforeNow())
Why am I looking for StopPublish date to be before now?
Because that would mean the content has already expired, and that's what I need.
Why am I casting my type to IVersionable?
Because StopPublish is a property of IVersionable and isn't available on IContentData, the type my query is written against. So in order to filter on the StopPublish date, I have to cast to IVersionable.
.Take(1000)
Why do I need this?
Because by default, Find only returns 10 results. If you need more, you use .Take(x) and specify whatever number makes sense. The highest you can go is 1000 (at least on the dev license; I know this because I tried upping it to 10000 and it threw an error).
For my scenario this works, because I don't expect more than 1000 expired assets to be in the index at any one time. In the worst case that there are, since I use this code in a scheduled job, I'll just have to run it a few times to get through all of them. Or, if you are using it on a page, you could use the .Skip(x).Take(y) approach: set x and y appropriately and run the query in a loop until you've retrieved everything.
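The Skip/Take loop could be sketched like this. To keep the sample runnable on its own, the fetchPage delegate is a hypothetical stand-in for the Find query ending in .Skip(skip).Take(take).GetContentResult(), and the Paging and PagingDemo names are made up for illustration; the demo pages through an in-memory list instead of a real index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // Calls fetchPage(skip, take) repeatedly until a page comes back
    // shorter than pageSize, collecting every item along the way.
    public static List<T> FetchAll<T>(Func<int, int, IReadOnlyList<T>> fetchPage, int pageSize = 1000)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));

        var all = new List<T>();
        var skip = 0;
        while (true)
        {
            var page = fetchPage(skip, pageSize);
            all.AddRange(page);
            if (page.Count < pageSize) break; // short (or empty) page: nothing left
            skip += pageSize;
        }
        return all;
    }
}

public static class PagingDemo
{
    public static void Main()
    {
        // Fake "index" of 2500 items standing in for the expired assets.
        var fakeIndex = Enumerable.Range(1, 2500).ToList();

        var items = Paging.FetchAll<int>(
            (skip, take) => fakeIndex.Skip(skip).Take(take).ToList(),
            pageSize: 1000);

        Console.WriteLine(items.Count); // 2500
    }
}
```

One thing to keep in mind with a real index: if the processing removes or republishes items between pages, the offsets shift, so in that case it may be simpler to keep re-running the first page until it comes back empty.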
.GetContentResult()
Why this and not GetResult()?
Because I need the actual content items for the processing that follows, and GetContentResult() gets me the actual content objects. Also, since my query is already typed to IContentData, using GetResult() would throw an error.
Hope this was helpful!