Want to elevate your Optimizely PaaS CMS site’s search capabilities? Azure AI Search could be just the tool you need! In this blog, I’ll discuss how to connect your CMS with Microsoft’s advanced AI-driven search platform to create fast, smart search experiences that surpass regular keyword searches.
Azure AI Search is Microsoft’s cloud-based search service powered by AI. It enables you to index, search, and analyze extensive amounts of content utilizing full-text searches, faceted navigation, and machine-learning features (such as language comprehension and semantic search).
Why it’s great
In short: it’s smart search made user-friendly.
Before we get into the benefits, let’s take a moment to consider how Azure AI Search compares to Optimizely’s native search functionalities. Optimizely Search (which relies on Lucene or Find/Search & Navigation) works well for straightforward keyword searches and basic filters, and it’s closely tied to the CMS. However, it doesn’t offer the advanced AI features, scalability, or flexibility that Azure provides right off the bat. Azure AI Search enriches the search experience with functionalities like semantic search, cognitive enhancements, and external data indexing, making it perfect for enterprise-level sites with intricate search requirements.
Here’s why merging these two solutions is beneficial.
To set up Azure AI Search, create a search service in the Azure portal.
Once created, make sure to note down the Search Service Name and Admin API Key – you’ll need these to send and retrieve documents.
By utilizing the Optimizely ServiceAPI, we can effectively get updated content and synchronize it with Azure AI Search. This process avoids the need to re-index the entire site, which helps boost performance.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using EPiServer.PlugIn;
using EPiServer.Scheduler;

[ScheduledPlugIn(DisplayName = "Sync Updated Content to Azure Search")]
public class AzureSearchJob : ScheduledJobBase
{
    private readonly HttpClient _httpClient;
    private readonly string _serviceApiBaseUrl = "https://yourwebsite.com/episerverapi/content/";

    public AzureSearchJob()
    {
        _httpClient = new HttpClient();
        IsStoppable = true;
    }

    public override string Execute()
    {
        // Step 1: Get content updated in the last 24 hours
        var yesterday = DateTime.UtcNow.AddDays(-1).ToString("o");
        var contentApiUrl = $"{_serviceApiBaseUrl}?updatedAfter={Uri.EscapeDataString(yesterday)}";

        var response = _httpClient.GetAsync(contentApiUrl).Result;
        if (!response.IsSuccessStatusCode)
            return "Failed to fetch updated content from ServiceAPI.";

        var contentJson = response.Content.ReadAsStringAsync().Result;

        // Map each ServiceAPI item onto the Azure AI Search index schema
        var documents = JsonSerializer.Deserialize<JsonElement>(contentJson).EnumerateArray()
            .Select(content => new Dictionary<string, object>
            {
                ["id"] = content.GetProperty("ContentGuid").ToString(),
                ["name"] = content.GetProperty("Name").GetString(),
                ["content"] = content.GetProperty("ContentLink").GetRawText(),
                ["type"] = content.GetProperty("ContentTypeName").GetString()
            }).ToList();

        // Step 2: Push to Azure AI Search
        var json = JsonSerializer.Serialize(new { value = documents });
        var request = new HttpRequestMessage(HttpMethod.Post,
            "https://servicename.search.windows.net/indexes/<index-name>/docs/index?api-version=2021-04-30-Preview")
        {
            Content = new StringContent(json, Encoding.UTF8, "application/json")
        };
        request.Headers.Add("api-key", "<your-admin-key>");

        var result = _httpClient.SendAsync(request).Result;
        return result.IsSuccessStatusCode ? "Success" : "Failed to index in Azure Search.";
    }
}
You can filter and transform the ServiceAPI response further to match your index schema.
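For example, here’s a minimal sketch of such a transformation (the Status and Url fields are hypothetical – substitute whatever your ServiceAPI response and index schema actually expose):

// Illustrative only: keep just published pages and map ServiceAPI fields
// onto the fields defined in your Azure AI Search index schema.
var documents = JsonSerializer.Deserialize<JsonElement>(contentJson)
    .EnumerateArray()
    .Where(c => c.TryGetProperty("Status", out var s)   // hypothetical field
                && s.GetString() == "Published")
    .Select(c => new Dictionary<string, object>
    {
        ["id"] = c.GetProperty("ContentGuid").ToString(),
        ["name"] = c.GetProperty("Name").GetString(),
        ["type"] = c.GetProperty("ContentTypeName").GetString(),
        ["url"] = c.TryGetProperty("Url", out var u)    // hypothetical field
            ? u.GetString()
            : null
    })
    .ToList();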
Create a new page type to serve as a Search Results page.
[ContentType(DisplayName = "Search Results Page",
    GUID = "3C918F3E-D82B-480B-9FD8-A3A1DA3ECB1B",
    Description = "Search using Azure Search")]
public class AzureSearchPage : PageData
{
    [Display(Name = "Search Placeholder")]
    public virtual string PlaceholderText { get; set; }
}
public class AzureSearchPageController : PageController<AzureSearchPage>
{
    public ActionResult Index(AzureSearchPage currentPage, string q = "")
    {
        var results = new List<string>();
        if (!string.IsNullOrEmpty(q))
        {
            // Encode the user's query before placing it in the URL
            var url = $"https://<search-service>.search.windows.net/indexes/<index-name>/docs?api-version=2021-04-30-Preview&search={Uri.EscapeDataString(q)}";
            using var client = new HttpClient();
            client.DefaultRequestHeaders.Add("api-key", "<your-query-key>");

            var response = client.GetStringAsync(url).Result;
            var doc = JsonDocument.Parse(response);
            results = doc.RootElement.GetProperty("value")
                .EnumerateArray()
                .Select(x => x.GetProperty("name").GetString())
                .ToList();
        }

        ViewBag.Results = results;
        ViewBag.Query = q;
        return View(currentPage);
    }
}
@model AzureSearchPage
@{
    Layout = "~/Views/Shared/_Layout.cshtml";
}
<h1>Search Results</h1>
<form method="get">
    <input type="text" name="q" value="@ViewBag.Query" placeholder="@Model.PlaceholderText" />
    <button type="submit">Search</button>
</form>
<ul>
    @foreach (var result in ViewBag.Results as List<string>)
    {
        <li>@result</li>
    }
</ul>
Integrating Azure AI Search with Optimizely CMS can truly take your site search from basic to brilliant. With a bit of setup and some clean code, you’re empowering users with fast, smart, and scalable content discovery.
Perficient is excited to announce we’ve won Coveo’s exclusive partner award for the third time since the award’s inception! The Accelerator Award commemorates a Coveo partner and customer that worked together, exhibited deep knowledge and technical expertise, and clearly delivered value-driven business outcomes.
A Coveo platinum partner with 100+ Coveo consultants globally, our award-winning work in collaboration with ADI | Snap One paved the way for increased conversions and purchases with fewer rules while improving employee efficiency. Coveo’s Merchandising Hub and dashboards provide ADI | Snap One with a single, centralized location to seamlessly understand and better service customers.
This award highlights the outstanding efforts and achievements our team delivers using the Coveo platform to accelerate value-driven business outcomes for our clients. A big thank you to our client, ADI | Snap One. Our joint success is a result of the true collaborative spirit you brought to the project and fostered among your team. And to Eric Immermann, Director Enterprise Search, Zachary Fischer, Senior Solutions Architect, and Kyla Faust, Alliance Manager, at Perficient, we are grateful for your dedication to the Coveo partnership, and to the entire team for their ongoing collaboration in ensuring the success of our joint customers.
“We are honored to receive the Coveo Accelerator Award, which reflects the strength of our partnership with Coveo and ADI | Snap One,” said Michael Patterson, Managing Director, Data and Analytics. “This recognition reinforces our shared commitment to innovation, and we are excited about the opportunities that lie ahead. We appreciate the support and collaboration from the Coveo team, which has been instrumental in our success.”
Our expertise across various technologies and platforms allows us to effortlessly integrate intelligent search with numerous enterprise applications, unlocking valuable information and transforming your business. By utilizing the top features from leading industry platforms, we deliver innovative solutions tailored to the unique needs of each client. With Coveo, you can anticipate tangible benefits such as enhanced productivity, better customer satisfaction, and increased revenue.
“I am so proud of this team and humbled to be recognized by Coveo for the delivery of value-driven business outcomes,” said Eric Immermann. “Coveo’s platform allows us to rapidly deploy personalized and highly relevant commerce, knowledge, and generative experiences. By collaborating with Coveo, we can bring cutting edge and holistic implementations to our customers.”
Sitecore Search is a powerful tool that helps users quickly find relevant content by indexing various sources and presenting results in an intuitive, customizable way. To get the most out of it, businesses must optimize how search widgets function, manage crawlers to keep content fresh, and enhance query handling through automated tools. Let’s dive into these essential components to understand how they work together.
Search widgets shape how users interact with search results. These elements power features such as autocomplete, filtering, sorting, and ranking adjustments, making searches more intuitive and personalized. Businesses can configure widgets to offer spell-check, AI-driven recommendations, and customized ranking rules, ensuring that users find what they need with minimal effort. A well-optimized search widget improves relevance and usability, allowing users to refine results based on parameters like relevance, date, or popularity.
The Search API Explorer is another valuable tool that allows businesses to test and refine search queries before deploying them. It acts as a testing ground for adjusting ranking logic, playing with filters, and debugging settings in real time. By viewing API responses in JSON format, developers can fine-tune search behavior to better align with user expectations and business goals.
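To make that concrete, here is a rough sketch of the kind of request the API Explorer helps you build and test (the endpoint, domain ID, rfk_id, and payload fields below are placeholders – copy the exact values from your own account’s API Explorer session):

using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static class SearchApiSketch
{
    // Placeholder endpoint and payload shape – not a verified contract.
    public static async Task<string> QueryAsync(HttpClient client, string keyphrase)
    {
        var body = @"{
            ""context"": { ""locale"": { ""country"": ""us"", ""language"": ""en"" } },
            ""widget"": { ""items"": [ {
                ""entity"": ""content"",
                ""rfk_id"": ""rfkid_7"",
                ""search"": { ""limit"": 10, ""query"": { ""keyphrase"": """ + keyphrase + @""" } }
            } ] }
        }";

        var response = await client.PostAsync(
            "https://discover.sitecorecloud.io/discover/v2/<domain-id>",
            new StringContent(body, Encoding.UTF8, "application/json"));

        // The JSON response is what you inspect and fine-tune in the explorer.
        return await response.Content.ReadAsStringAsync();
    }
}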
For search results to remain accurate, Sitecore Search relies on crawlers to fetch and index content from different sources. These crawlers can be scheduled to update content in real time, at set intervals, or manually. Businesses have control over crawl frequency, depth, and which URLs should be included or excluded. Additionally, authentication settings ensure that restricted content can be indexed when necessary.
There are different types of crawlers, depending on where the content resides.
A crucial part of making all content searchable is the Document Extractor, which enables search engines to read and index PDFs and Word documents. Since these file types often contain valuable information that would otherwise be hidden from search results, the extractor converts document content into searchable text and extracts metadata such as titles, authors, and dates. This makes it easier for users to find relevant information even when it exists within non-web-based files.
Automation plays a key role in delivering personalized and relevant search results. Search request triggers dynamically modify search behavior based on user actions, ensuring that results align with their needs. For instance, businesses can set up triggers to automatically apply filters based on past searches, boost specific results for certain keywords, or redirect users to relevant pages when they search for particular terms.
Similarly, search request extractors enhance the quality of search results by interpreting user intent, applying synonyms, and refining queries. These tools help improve accuracy by understanding variations in phrasing and context, ensuring that users receive the most relevant results even if their initial search terms aren’t perfectly aligned.
Sitecore Search is more than just a search tool—it’s a dynamic system that can be tailored to enhance user experience and ensure content is indexed efficiently. By optimizing search widgets, managing crawlers effectively, and leveraging automation, businesses can significantly improve search accuracy and usability.
If you need help configuring Sitecore Search or have questions about optimizing your setup, drop a comment or reach out—we’d love to chat!
Optimizely Graph lets you fetch content and sync data from other Optimizely products. For content search, this lets you create custom search tools that transform user input into a GraphQL query and then process the results into a search results page.
Why use Graph for Content Search?
An Optimizely Graph-based search service brings a number of benefits.
Let’s explore the steps to make this work using the Alloy project. First, obtain the Content graph keys/secret from Optimizely.
Install the Optimizely.ContentGraph.Cms package, and note that the Content Delivery API must also be installed as a prerequisite for the graph to function. Register the graph services on IServiceCollection; additional configuration options are available as needed. Now that the server-side querying is ready, let’s configure the client side to query from the application code.
Add the StrawberryShake.Server and StrawberryShake.Transport.Http packages to the project, and install the StrawberryShake.Tools CLI on the machine. Write your query in a .graphql file, create a Queries folder, and place the query file inside it. Replace OptimizelyGraphSingleKeyValue with the key received from Optimizely (as shown in the appSettings step).
When you generate the client with dotnet graphql init and pass -n AlloyGraphClient, the generated registration extension will be AddAlloyGraphClient and the injectable interface will be IAlloyGraphClient. Inject it into StartPageController and verify the results.
The client is generated from the .graphql query file. If you make any additions or modifications to the query, ensure that the latest schema is downloaded when you rebuild the code.
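As a rough sketch, the client registration might look like the following (the client name assumes -n AlloyGraphClient was used, and the "Optimizely:GraphSingleKey" configuration path is an assumption – use wherever you stored the single key from Optimizely):

// Minimal sketch of wiring the generated StrawberryShake client in Startup.
public void ConfigureServices(IServiceCollection services)
{
    // Assumed config path; substitute your own appSettings key.
    var singleKey = Configuration["Optimizely:GraphSingleKey"];

    services
        .AddAlloyGraphClient()                 // generated by StrawberryShake
        .ConfigureHttpClient(client =>
            client.BaseAddress = new Uri(
                $"https://cg.optimizely.com/content/v2?auth={singleKey}"));
}

The generated IAlloyGraphClient can then be constructor-injected (for example, into StartPageController), and each operation in the .graphql file becomes a strongly typed method on it.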
In the new composable world, it’s common for medium to large Sitecore solutions to include a search appliance like Coveo and a digital asset management tool like Sitecore Content Hub. A typical use-case is to build search sources in Coveo that index content residing in Content Hub. Those indexes, in turn, can then be used to build front-end search experiences. In this blog post, I’d like to cover a few tips for working with the Content Hub REST API to populate search sources in Coveo. These tips are based on my experiences on a recent project that used the Content Hub REST API to index PDF documents in Coveo.
Having not previously used the Content Hub REST API, I wasn’t initially aware that there are several endpoints. Here’s a quick rundown of a few of them:
Query API (GET http://<hostname>/api/entities/query/)
The Querying feature allows you to query for specific entities using specific indexed metadata fields. This basic querying is contrasted against the more elaborate search functionality offered by the M.Content API.
Scroll API (GET http://<hostname>/api/entities/scroll/)
You can use the Scroll API to retrieve a large number of results (or even all results) from a single query.
It does not support the skip parameter and only lets you request the next page through the resource. You can continue paging until it no longer returns results or you have reached the last page.
SearchAfter API (POST http://<hostname>/api/entities/searchafter/)
The SearchAfter API is used to fetch multiple pages of search results sequentially. To start, a request is made for the first page of results, which includes a last_hit_data value corresponding to the last item in the page. This value is then used to fetch subsequent pages until all results are retrieved.
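To make that flow concrete, here’s a rough C# sketch (the request body and response field names such as "items" and "last_hit_data" are assumptions based on the description above and the Query API response shown later – consult the Content Hub docs for the exact contract):

// Sketch only, inside an async method; needs System.Net.Http, System.Text,
// System.Text.Json, and System.Linq.
using var client = new HttpClient();
JsonElement? lastHit = null;
var allItems = new List<JsonElement>();

while (true)
{
    var payload = new Dictionary<string, object?>
    {
        ["query"] = "Definition.Name=='M.Asset'",   // your full query here
        ["last_hit_data"] = lastHit                  // null on the first request
    };

    var response = await client.PostAsync(
        "https://<hostname>/api/entities/searchafter/",
        new StringContent(JsonSerializer.Serialize(payload), Encoding.UTF8, "application/json"));

    var root = JsonDocument.Parse(await response.Content.ReadAsStringAsync()).RootElement;
    var items = root.GetProperty("items").EnumerateArray().ToList();
    if (items.Count == 0)
        break;                                       // no more pages

    allItems.AddRange(items);
    if (!root.TryGetProperty("last_hit_data", out var next))
        break;
    lastHit = next;                                  // token for the next page
}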
On this particular project, the Query API was used to pull PDFs. By design, the Query API returns a maximum of 10k results. In this case, that was okay – there were something like ~9k assets in Content Hub at the time (without any additional filtering applied). However, in order to future-proof the query a little and to avoid unnecessary processing of non-PDF documents, it made sense to make the query more specific (see #2, below).
Net out: If you know you’ll need to pull 10k+ items from Content Hub and efficiently paginate through all of them, use the SearchAfter API. If your number of assets is smaller than 10k, then the Query API is probably fine. Note that the SearchAfter API will soon deprecate and replace the Scroll API, so it’s best to avoid the Scroll API for any new work.
I like to think that I can figure most things out if I read the documentation. However, when it came to updating the query to filter down to approved PDFs for indexing, it wasn’t at all clear to me how to do that. As mentioned above, the Query API is limited to 10k results and we were pretty close to that in terms of total asset count. It was important to be more selective when pulling assets such that only approved PDFs were returned.
After unsuccessfully experimenting for a while, I broke down and opened a Sitecore support ticket to ask how that could be accomplished. I got an answer…and it worked, but it wasn’t as obvious as I would have liked it to be. Who likes magic numbers?
To query for PDF assets: ... AND Parent('AssetMediaToAsset').id==1057.
To ensure that only approved assets are included: ... AND Parent('FinalLifeCycleStatusToAsset').id==544.
Putting it together, the full query URL (without any ordering applied; see #3, below) was:
{baseURL}/api/entities/query?query=Definition.Name=='M.Asset' AND Parent('AssetMediaToAsset').id==1057 AND Parent('FinalLifeCycleStatusToAsset').id==544
In other words:
Give me all assets whose file type is PDF and whose approval status is approved.
Now, I think these IDs are common across all Content Hub instances but, just in case, please make sure they match the appropriate values in your Content Hub instance prior to using the same IDs in your queries. You can find the asset media type IDs under Taxonomy Management in Content Hub:
Asset media types in Content Hub’s Taxonomy Management interface.
When you’re building a REST API source in Coveo with the intention of iterating through hundreds or thousands of assets in Content Hub, it’s best to return them in a consistent order. At one point during the troubleshooting of some indexing issues, Coveo support suggested that the Content Hub API was returning results in an inconsistent order and that that was potentially a contributing factor. While that was never conclusively shown to be the case, it does make sense to apply a sort, even if only to ensure assets are processed in a specific, predictable order.
The query was updated to sort on createdOn ascending (oldest first); the updated query URL looked like this:
{baseURL}/api/entities/query?query=Definition.Name=='M.Asset' AND Parent('AssetMediaToAsset').id==1057 AND Parent('FinalLifeCycleStatusToAsset').id==544&sort=createdOn&order=Asc
Interestingly enough, I found that created_on worked, too, but, according to Sitecore support, createdOn should be used instead.
REST API sources in Coveo will almost always be configured to paginate through the results coming from the external API; otherwise, only the first page’s worth of data will be processed and indexed. It’s important to ensure paging is configured correctly to allow for reasonable index rebuild and rescan times, too. In this case, using the Query API with a page size of 25 items per page, the paging configuration section in the Coveo REST API source looked like this:
... "paging": { "pageSize": 25, "offsetType": "url", "nextPageKey": "next.href", "parameters": { "limit": "take" }, "totalCountKey": "total_items" }, ...
The corresponding paging properties as returned in the Query API response (for the first page) looked like this:
{ "items": [ ... ], "total_items": 12345, "returned_items": 25, "next": { "href": "https://{baseURL}/api/entities/query?skip=25&take=25&query=Definition.Name%3D%3D%27M.Asset%27%20AND%20Parent(%27AssetMediaToAsset%27).id%3D%3D1057%20AND%20Parent(%27FinalLifeCycleStatusToAsset%27).id%3D%3D%20544&sort=createdOn&order=Asc", ... }, ... }
Note that the paging configuration may need to change if you’re using a different Content Hub API endpoint. For more information about configuring paging in Coveo REST API sources, refer to the official documentation.
In Coveo, the maximum size for a single item is approximately 256 MB (reference). That number includes the item’s permissions, metadata, and content. For larger files, the content isn’t indexed, just the metadata. This limit came to light indirectly on this recent project.
While outside the scope of this post, Coveo supports extensions that can be attached to search sources. Extensions are bits of Python code that Coveo runs in the context of each document while processing the source. On this project, an extension was used to do things like conditionally reject (skip indexing) documents, set metadata fields based on other properties, etc. At one point, the extension attempted to resolve the file type for the document using the following code:
filetype = document.get_meta_data_value("detectedfiletype")[0]
For any documents not above the maximum size, the filetype variable would have the expected value: "pdf". For any documents that were above the maximum size, the variable had a generic value that, while non-empty, was also not the expected file type. Because the document was too large, the document object available within the extension didn’t have the expected values, including detectedfiletype. As a result, some logic within the extension broke, as this case wasn’t accounted for.
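A defensive version of that lookup (our own sketch, not code from the project’s extension) avoids assuming the metadata is present and well-formed:

# Guard against oversized documents whose metadata lacks the expected value.
values = document.get_meta_data_value("detectedfiletype")
filetype = values[0] if values else None
if filetype != "pdf":
    document.reject()  # or handle the missing/unexpected type explicitly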
Upon further investigation of the PDFs in Content Hub, it was noted that, of the 10 or so that consistently exhibited indexing issues, all of them were 300+ MB in size.
For more information on indexing pipeline extensions (IPE), please see Indexing pipeline extension overview.
Net out: If you’re using an extension on a source and you’re noticing that the document object has one or more properties that aren’t returning what you’d expect to see, double-check that the underlying document isn’t > 256 MB and that you aren’t trying to access properties within the extension that will never correctly resolve.
Thanks for the read!
Perficient is excited to announce we’ve won Coveo’s exclusive partner award! The Accelerator Award commemorates a Coveo partner that exhibited deep knowledge and technical expertise, understands a customer’s business challenges, and clearly delivers value-driven business outcomes.
Perficient is a trusted Coveo Platinum Partner with expertise in modern intelligent search solutions. Our award-winning work helps organizations provide their customers and employees with relevant search results and recommendations while increasing revenue, boosting conversion rates, and improving employee efficiency.
This award marks the second time Perficient has received a Coveo partner award. These awards are a testament to the incredible work our teams deliver using the Coveo platform to accelerate outcomes for our clients. Huge shoutout to Perficient’s own Eric Immermann, Director Enterprise Search, and Kyla Faust, Alliance Manager, for their investment in the partnership, as well as the extended team for all their continuous collaboration in making our joint customers successful.
“We’re honored to be recognized by Coveo for the delivery of value-driven business outcomes,” said Eric Immermann. “Coveo serves as a powerful platform as customers demand more personalized and conversational experiences. In partnering with them, Perficient can deploy the latest advancements in AI and search to our clients.”
Our cross-technology and platform expertise enables us to seamlessly integrate intelligent search with a variety of enterprise applications to unlock the value of information and transform your business. We leverage the best features across industry-leading platforms to provide innovative solutions and drive outcomes that meet the unique needs of each client. With Coveo, you can expect to see tangible results such as higher productivity, improved customer satisfaction, and increased revenue.
Optimizely Configured Commerce introduces Elasticsearch v7 for a better search experience. In the dynamic landscape of the commerce world, there is always room for extended code customization, and Optimizely offers detailed instructions on customizing Elasticsearch v7 indexes.
There are many advantages to using Elasticsearch v7.
In this post, we will walk through how to add a custom column to the Elasticsearch v7 index, step by step.
The very first step is to set the default provider in the admin console.
After configuring the default provider in the admin section, the site will use Elasticsearch v7, conducting searches on indexes newly established by Elasticsearch v7.
If we want to add a new custom field to these indexes, Optimizely provides some pipelines to add the new custom field.
In this class, we create a property named StockedInWarehouses, which is a list of strings.
namespace Extensions.Search.ElasticsearchV7.DocumentTypes.Product
{
    using System.Collections.Generic;
    using Insite.Search.ElasticsearchV7.DocumentTypes.Product;
    using Nest7;

    [ElasticsearchType(RelationName = "product")]
    public class ElasticsearchProductCustom : ElasticsearchProduct
    {
        // This constructor copies all base code properties.
        public ElasticsearchProductCustom(ElasticsearchProduct source)
            : base(source)
        {
        }

        [Keyword(Index = true, Name = "stockedInWarehouses")]
        public List<string> StockedInWarehouses { get; set; }
    }
}
To populate the custom property, use a PrepareToRetrieveIndexableProducts pipe extension. Handle data retrieval within custom code by composing a LINQ query to fetch the required data. The best performance is achieved by returning a dictionary, e.g. .ToDictionary(record => record.ProductId). Here is an example code snippet:
namespace Extensions.Search.ElasticsearchV7.DocumentTypes.Product.Index.Pipelines.Pipes.PrepareToRetrieveIndexableProducts
{
    using System.Linq;
    using Insite.Core.Interfaces.Data;
    using Insite.Core.Plugins.Pipelines;
    using Insite.Data.Entities;
    using Insite.Search.ElasticsearchV7.DocumentTypes.Product.Index.Pipelines.Parameters;
    using Insite.Search.ElasticsearchV7.DocumentTypes.Product.Index.Pipelines.Results;

    public sealed class PrepareToRetrieveIndexableProducts
        : IPipe<PrepareToRetrieveIndexableProductsParameter, PrepareToRetrieveIndexableProductsResult>
    {
        // This pipeline has no base code, so Order can be anything.
        public int Order => 0;

        public PrepareToRetrieveIndexableProductsResult Execute(
            IUnitOfWork unitOfWork,
            PrepareToRetrieveIndexableProductsParameter parameter,
            PrepareToRetrieveIndexableProductsResult result)
        {
            // Map each product id to a comma-separated list of warehouse names.
            result.RetrieveIndexableProductsPreparation = unitOfWork.GetRepository<ProductWarehouse>().GetTableAsNoTracking()
                .Join(unitOfWork.GetRepository<Product>().GetTableAsNoTracking(),
                    x => x.ProductId, y => y.Id, (x, y) => new { prodWarehouse = x })
                .Join(unitOfWork.GetRepository<Warehouse>().GetTableAsNoTracking(),
                    x => x.prodWarehouse.WarehouseId, y => y.Id,
                    (x, y) => new { Name = y.Name, productId = x.prodWarehouse.ProductId })
                .GroupBy(z => z.productId).ToList()
                .Select(p => new
                {
                    productId = p.Key.ToString(),
                    warehouses = string.Join(",", p.Select(i => i.Name))
                })
                .ToDictionary(z => z.productId, x => x.warehouses);

            return result;
        }
    }
}
After retrieving data into the “RetrieveIndexableProductsPreparation” result property, set the data into a custom property for indexable products. To achieve this, create a class “ExtendElasticsearchProduct” that implements IPipe<CreateElasticsearchProductParameter, CreateElasticsearchProductResult>.
In the Execute method, the parameter exposes the RetrieveIndexableProductsPreparation property, which contains our data; fetch from it using the TryGetValue method.
Avoid performing data retrieval inside this CreateElasticsearchProduct extension class; writing the data retrieval logic here will hurt the performance of building the product indexes.
Here you’ll find an illustrative code snippet:
namespace Extensions.Search.ElasticsearchV7.DocumentTypes.Product.Index.Pipelines.Pipes.CreateElasticsearchProduct
{
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using Insite.Core.Interfaces.Data;
    using Insite.Core.Plugins.Pipelines;
    using Insite.Search.ElasticsearchV7.DocumentTypes.Product.Index.Pipelines.Parameters;
    using Insite.Search.ElasticsearchV7.DocumentTypes.Product.Index.Pipelines.Results;

    public sealed class ExtendElasticsearchProduct
        : IPipe<CreateElasticsearchProductParameter, CreateElasticsearchProductResult>
    {
        public int Order => 150;

        public CreateElasticsearchProductResult Execute(
            IUnitOfWork unitOfWork,
            CreateElasticsearchProductParameter parameter,
            CreateElasticsearchProductResult result)
        {
            var elasticsearchProductCustom = new ElasticsearchProductCustom(result.ElasticsearchProduct);

            // The preparation pipe produced a Dictionary<string, string> keyed by product id.
            if (((Dictionary<string, string>)parameter.RetrieveIndexableProductsPreparation)
                .TryGetValue(elasticsearchProductCustom.ProductId.ToString(), out var stockedInWarehouses))
            {
                elasticsearchProductCustom.StockedInWarehouses = ExtractList(stockedInWarehouses);
            }

            result.ElasticsearchProduct = elasticsearchProductCustom;
            return result;
        }

        private static List<string> ExtractList(string content)
        {
            if (string.IsNullOrWhiteSpace(content))
                return new List<string>();

            return content
                .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                .ToList();
        }
    }
}
After a full product index rebuild, the newly created ‘StockedInWarehouses’ field appears in the product indexes. The screenshot below shows the index with its value.
Now you can easily use the StockedInWarehouses field in a term query to filter search results.
var currentWarehouseId = SiteContext.Current.PickUpWarehouseDto == null
    ? SiteContext.Current.WarehouseDto.Name
    : SiteContext.Current.PickUpWarehouseDto.Name;

result.StockedItemsOnlyQuery = result
    .SearchQueryBuilder
    .MakeTermQuery("stockedInWarehouses.keyword", currentWarehouseId);
Reference Link : https://docs.developers.optimizely.com/configured-commerce/docs/customize-the-search-rebuild-process
This blog post concerns an intermittent issue with Sitecore Solr search results in SolrCloud. Depending on the application and architecture implementation, root causes can differ, but I would like to share the problem we faced and its fix, so it may help someone facing the same problem apply the same fix.
We created a sophisticated search results page that leverages SXA search components and filters. Additionally, we integrated custom code in various website sections to retrieve data from Solr indexes using Sitecore Content Search APIs. Everything functioned seamlessly in our local environment and even in higher-level environments. However, after some days, we encountered intermittent issues retrieving data from the indexes in a higher environment.
Intermittent issues arose, meaning that when loading a page that relied on indexed data, it would occasionally fetch incomplete or no data. Refreshing the page would sometimes resolve the issue, providing access to all the data. In essence, the results were inconsistent.
Furthermore, we implemented a custom item resolver for Bucket item URLs, in which we indexed user-friendly URLs for bucket-able items as part of a computed field. While URL resolution worked at times, it also intermittently resulted in 404 errors.
In a nutshell, any feature dependent on the Solr search does not work consistently with all the requests.
We did not discover any Sitecore error logs specific to features or indexes. Even after attempting to rebuild the indexes, these issues were not resolved. We contacted Sitecore Support to report the problem and learned that the replicas within the Solr cloud were not synchronized. This implies that the document count for a given index, such as ‘sitecore_web_index,’ differs across each replica node.
To confirm the issue of ‘document counts not being in sync among replica nodes,’ we logged in to the Solr dashboard of the affected environment. We checked the Nodes tab under the Cloud tab on the left, as shown below.
Solr Cloud Nodes
As shown below, the document count is not in sync for the sitecore_web_index: sitecore_web_index_s1r1 and sitecore_web_index_s1r2 have 15.3k docs, whereas sitecore_web_index_s1r4 has 12.9k docs.
The document count is not in sync for the sitecore_web_index
Click the ‘more’ link to access other indexes and verify the document counts for the relevant indexes. These counts are rounded and displayed in thousands, so if they appear to be the same, check the exact count. To determine the exact count, follow the steps outlined in the ‘Self-Diagnosis’ section of Identical Queries Produce Inconsistent Results – SearchStax Docs.
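For a quick exact count, you can also query each replica core directly (host and core names below are placeholders) with distrib=false, so the request isn’t distributed across the cluster, and compare the numFound values in the responses:

https://<solr-host>:8983/solr/sitecore_web_index_s1r1/select?q=*:*&rows=0&distrib=false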
Once we confirmed the issue, we proceeded with the solution to sync the document count across all replicas. This required a rolling restart of the Solr cluster, which SearchStax Support suggested to the Sitecore support team. We requested that the Sitecore support team perform the rolling restart for the concerned environment, as they have the required access for this action.
Following the rolling restart, we verified that the document counts were in sync, and the intermittent search issue was successfully resolved!
If you encounter a similar situation, please verify the exact resolution for your case. If the website is new, it is advisable to perform this document count check for all the indexes; if they are not in sync, it is good to have a rolling restart at least once before the UAT phase or going live.
A related gotcha: a computed field gets updated in the Solr index only when we fully rebuild the index. Also, only one Sitecore instance within the environment cluster must be assigned the indexing role, meaning a single Sitecore instance actively handles indexing. When multiple servers assume the indexing role, it leads to issues within the indexes, such as computed field values being cleared or updates occurring only after a complete index rebuild.
Performance optimizations when using Solr
Special thanks to our dedicated colleagues for their invaluable support during this investigation, and thanks to the Sitecore Support Team and SearchStax Support for their fantastic assistance.
Hope this helps. Happy Sitecore searching!
I have always said that the Sitecore MVP Summit is the best perk of all the MVP program benefits available. Usually, it is a tsunami of insightful information and considerations, shared with the group of MVPs for discussion and feedback in return. Because of that, MVPs can have a planning horizon in terms of work, development, and contributions.
Due to the sensitive nature of such early access information, the vast majority of it cannot be shared with the wider public, and this is always explicitly highlighted. Only a small portion, comparable to the tip of an iceberg above the waterline, can be shared. Today I would like to present all the information from the MVP Summit we are allowed to share, so that you can also benefit from it.
With that in mind, here we go!
The portal team is working to make an even more granular set of permissions for the portal, and I can only applaud that. Over the past six months, that was a very frequent request from our customers, asking for specific fine-tuned permissions, and I am glad that we will now be able to do that. Note the new roles, beyond the previously available User and Admin, below.
OrderCloud is finally making its way to the Sitecore Cloud Portal. It will be available for relevant subscriptions, as it is not included in XM Cloud Plus. If you’ve never heard the term XM Cloud Plus, I’d recommend reading a blog post from my colleague David San Filippo about this new offering.
Not just XM Cloud, but also Search, Commerce, and Personalize will share the unified experience. Components will be easily draggable onto a page; for example, you can drag and drop a search results component onto a storefront page powered by your OrderCloud.
We’ve got a React-based starter kit for creating an admin experience.
Next is really big! Making Edge personalization possible opens plenty of opportunities – starting with per-component personalization as it was on the XP platform, rather than the whole-page variants implemented in XM Cloud today. That will also reduce the amount of custom logic in your head application, keeping it clean with fewer dependencies. The other benefit of edge personalization is that it removes an intermediate request to the decisioning engine between the browser and your head application, making for low-latency calls, which is especially valuable in geo-distributed solutions.
One more loud shout-out! We’ve been waiting so long, and it is coming in a matter of weeks rather than months. Forms are coming back with a new, truly composable implementation, as a separate product. Forms will be provided with every XM Cloud subscription, including XM Cloud Plus.
With Forms, Sitecore decided to re-use existing technology rather than building one from scratch.
Can you guess what they took as a foundation? XP Forms? Incorrect! XP Forms require a CD server and were generally a non-SaaS solution, so much would have had to be redone. Instead, they chose another product that already features better, composable forms – Sitecore Send. I recently wrote a full overview of this product, including the forms and how exactly they function. That allows deploying forms as a component not just to Pages, but universally elsewhere, for example on a third-party platform. SaaS forms are also much friendlier for less technical users.
What type of fields will be available?
Where do the form submissions go? That’s for you to decide! Forms will give you a webhook to define that – a deliberate flexibility that leaves the Forms product with maximum compliance around PII data storage. I believe, as an option, there will be some connectors, say for CDP, and/or simplified integration for those clients who are less focused on compliance.
I was really impressed by the UX and UI of the new Forms experience, it is really easy, user-friendly, and powerful!
Lots of work has recently been done on site management in XM Cloud. As an example, you can now access sites created from your own existing source code rather than the templates provided in the Dashboard. These sites are labeled as non-SXA managed, and the Dashboard now has a filter so you can decide which sites to display on the Sites tab.
Pages and Components, as well as Sites, are coming together into a single visual building experience called Experience Studio. The idea is to bring a seamless transition from what developers provide in the form of, say, a React component to marketers actually utilizing those inputs with various composable products, not just XM Cloud alone.
Leveraging Site Templates was a long promise from Sitecore and I am glad it is now getting to the implementation.
Embedded Personalization has been integral to XM Cloud from day one and, covering the most common scenarios, it has proven to be a success. However, as soon as your websites become personalization-heavy, you need a more robust solution than generating numerous page variants. That was one of the most wanted features, and it is coming soon – component-level personalization executed at the Edge. That in turn enables more flexible A/B testing scenarios.
Unified Tracking will finally get there, collecting plenty of meaningful behavioral information into CDP. Instead of individual trackers for different products (i.e., Search, Send, CDP), there will be a single unified API, which also means creating a single, unified API key for it.
The new Engage SDK replaces Boxever scripts with an engage script that you can embed externally, plus an npm-based Engage SDK for React and Next.js frameworks. Unlike the old Boxever scripts, it supports geolocation out of the box.
You can create API Keys to work with Sitecore Personalize so that you use them to create experiences and experiments, please see relevant documentation.
Another interesting slide.
You can create custom conditions.
The new Site Analytics homepage gives you deeper analytics with interactive heatmap and usage dynamics.
Content interaction Analytics. Going beyond Sites and Pages – not just page visits and sessions, but more in-depth analysis of visitors’ actions – how and where they go, behavioral trends, etc.
With the entire range of all the products in a composable family, Sitecore is rethinking the overall developer experience for the better. That includes productivity, reducing time to receiving operable products, minimizing configuration to getting them up and running. We already saw multiple starter kits for almost any product, which indeed helped us (developers) to speed up.
The biggest change from the Deploy App is that we can now keep and deploy the front end separately from a CM instance and the rest of our codebase. That means frontend developers, who typically prefer working on Macs, will be able to have fully autonomous local development. They no longer need to spin up complicated Windows-based containers for the single purpose of working on the Next.js app. Their workflow has now been proven to be genuinely cross-platform, and the corresponding Deploy App tooling has been provided. They of course need the Sitecore CLI, which is already cross-platform, as it is written in cross-platform .NET and works universally everywhere.
One more user-friendly feature: upon completion, the tool exposes all the important settings and secrets to me, exactly what I may need for further setup.
Next, let’s talk about content migration. There were not one but two migration solutions presented to us, with slightly different coverage and use cases. The integrated tool is under NDA, so I will only focus on the other migration tool developed by Ivan Brygar which was also presented:
It is a standalone WPF application capable of moving both content and users from your legacy platform to your XM Cloud.
Migrating users is an important point. It is crucial to note that in the SaaS version, users are not part of XM Cloud but exist within Unified Identity and belong to the Cloud Portal itself. From there, admins or organization owners can define the exact set of permissions for users across the available applications.
With that in mind, this migration tool creates such users by emailing invites and waiting for activation. Later, it will also be able to set up the desired permissions.
Here is the more detailed roadmap for this migration product:
It is expected to become open-source soon, as the creator confirmed to me.
I feel so impatient to share much more exciting things learned at MVP Summit which aren’t yet allowed for sharing. As soon as relevant products and updates come out of NDA, I’ll be able to provide you a detailed overview of them, so please stay tuned!
When Sitecore announced their Cloud offerings at Sitecore Symposium in October 2022, it crystallized their composable strategy of providing separate product offerings that are each best of breed but better together. When you look at Engagement Cloud and Commerce Cloud, with products that have clear capabilities and little crossover, that strategy is well understood.
But when you look at Content Cloud, things get a bit fuzzier. Even more so when you consider how their platform DXP solutions do very similar things. With multiple CMS offerings and several “Content Hub” products, it may not be clear what the differences are and when you should consider one product versus another. In this series I’ll compare and contrast these options. In my first post, I compared Sitecore’s CMS offerings. In my second post, I covered Sitecore’s Content Hub offerings. Today I’ll look at Sitecore Search and other composable alternatives.
Sitecore’s flagship CMS product, XM Cloud, brings a different architecture compared to the traditional platform DXP. Sitecore XP and XM required a SOLR instance to manage internal search features, but it also supported website search needs.
When items were published, it automatically updated the search index, keeping results up to date, and that index could be queried using Sitecore’s Content Search API, a flexible API that was easier to use than consuming the SOLR API’s directly. Sitecore’s SXA toolkit even provided additional features to make it even easier to configure queries and boosting rules, but everything was dependent on the content search API’s and SOLR.
XM Cloud is a pure SaaS solution for content management. It does not include indexes for published content, as content is only published to “Experience Edge.” Although it does leverage SOLR for internal search needs under the covers, it does not provide the ability to create custom indexes, and the Content Search API is not supported. XM Cloud doesn’t include Content Delivery servers, so there is no place to serve a custom search API from.
XM Cloud does support GraphQL, which provides the ability to run simple queries against the content published to Experience Edge. It does not support faceting, boosting, or any other advanced relevancy features.
So websites on XM Cloud that require more than simple query capabilities need to find another solution. This is not only needed if implementing a site search feature but any other feature that requires faceting and filtering. Blogs, News, Events, Locations, and other similar features would require a composable search solution.
Even if you’re not on XM Cloud today, you may want to consider the implications of these architectural restrictions when implementing search features on your site now. If you want to be able to move to XM Cloud in the future, you’ll need to remove any dependencies on the web index, Content Search APIs, and CD servers. Even though these techniques may work on a headless solution running on XP or XM today, you’ll need to rewrite that functionality completely in order to move to XM Cloud. It may be better to invest in a composable search solution today to avoid having to throw away that code in the future.
Besides compatibility with modern architectures, composable search solutions offer a number of advantages over Sitecore’s out-of-the-box search capabilities.
In October 2022, Sitecore announced a new composable search product: Sitecore Search. Building off the search technology behind Sitecore Discover (formerly Reflektion), Sitecore Search aims to bring the relevancy, speed and flexibility of Sitecore Discover’s product driven search to content-based search.
Other Perficient bloggers have written in-depth articles about how Sitecore Search works and what its capabilities are, which are all worth a read, especially if you’re moving toward an implementation phase. I suspect many clients will purchase Sitecore Search when they license XM Cloud, as it fills a functionality gap, and there will be additional advantages to leveraging Sitecore Search with XM Cloud over other options.
For more details on Sitecore Search, read this series of posts by Eric Sanner.
It is worth noting that there are other composable search options to consider besides Sitecore search. There are many platforms that support enterprise search needs, but there are a few that we tend to see used with Sitecore implementations.
Coveo and Sitecore have a long history. Coveo filled a lot of the gaps that required customization to achieve with out-of-the-box Sitecore. It integrated directly into Sitecore, re-indexing on publish, and provided some of the best AI-driven search results available. Today, Coveo remains a market leader in search experience. Perficient even has a Coveo practice with six Coveo MVPs supporting customers on the platform, with and without Sitecore.
Coveo is probably one of the most fully featured solutions out there, but it is also more expensive than some of the other options. Still, there are some key requirements that would quickly make it the best option.
For more details on Coveo and its capabilities in a Sitecore context, read this article from Martin Miles.
Sitecore customers may be familiar with SearchStax as a SOLR PaaS solution; Sitecore Managed Cloud customers typically used this service to manage their SOLR environments. SearchStax Studio is a separate, stand-alone composable search solution that can be integrated with XM Cloud or any other site to provide search capabilities.
Although this is probably one of the more affordable options, it brings a lot of functionality that meets most search requirements out of the box, including search components, analytics, auto-suggest, related searches, and more.
For more details on implementing Search Stax Studio, read this article from Martin Miles.
You may be tempted to build out your own search implementation, especially if you have simpler requirements. I strongly suggest not going down that path. Such solutions are deceptively complex; as you peel the onion, you’ll find even more complexity, and any of these composable search solutions will be much more robust than anything you roll yourself. Don’t forget about indexing, performance, troubleshooting, and monitoring, not to mention the effort needed to build a custom search interface.
If you don’t want to heed this advice, for more details on how to approach it, read this article from Martin Miles.
Composable Search solutions fill a real need in modern architectures and something you need to consider if XM Cloud is on your roadmap. This post wraps up my series on Making Sense of Sitecore’s Content Cloud. You can read the first post on content management options here, and the second post on Content Hub offerings here.
If you need help with evaluating your composable search options, we’d love to help. Reach out to me on LinkedIn, Twitter or fill out our contact form.
What do your customers crave from your brand? It’s simple – personalization.
The Next in Personalization 2021 Report from McKinsey reveals that personalization matters more now than ever with 71% of consumers expecting companies to deliver personalized experiences.
One way you can demonstrate customer intimacy and personalization is with search. And not just your average run-of-the-mill traditional search. You need natural language search.
Instead of using manual tagging or keywords queried against an index to provide results, natural language search uses a machine learning technique called natural language processing. It can infer meaning from complex queries and allows users to conduct a search using human language, which is why natural language search is also often called conversational search.
When your search is able to consider how your customers think and behave, and to predict their expectations, you can deliver the personalized experiences they demand.
Join experts from Coveo, Adobe and Perficient for a webinar where you’ll learn the latest trends in search technology and what all the hype is around natural language search.
During this session, you’ll hear about:
Plus, attendees who book a meeting post-webinar will receive a special, complimentary gift!
Register for this 45-minute session today. See you there!
While crawling the ContentDocumentLink object from Salesforce, you might come across this issue, which is a SOQL limitation on the ContentDocumentLink object. The exact error from the Coveo source activity log looks like this:
Implementation restriction: ContentDocumentLink requires a filter by a single Id on ContentDocumentId or LinkedEntityId using the equals operator or multiple Id’s using the IN operator. (MALFORMED_QUERY) -> Failed to GET resource at ‘https://org-url/services/data/v51.0/query/01g79000003Bd2sAAC-1’. [BEGIN RESPONSE BODY][{“message”:”Implementation restriction: ContentDocumentLink requires a filter by a single Id on ContentDocumentId or LinkedEntityId using the equals operator or multiple Id’s using the IN operator.”,”errorCode”:”MALFORMED_QUERY”}][END RESPONSE BODY] -> The remote server returned an error: (400) Bad Request.
This can be handled from the Coveo Admin Console source configuration; below are the steps to fix this issue:
select Id, SystemModStamp, ContentDocument.CreatedDate, ContentDocument.Id,
    ContentDocument.LastModifiedDate, ContentDocument.SystemModStamp,
    ContentDocument.LatestPublishedVersion.CreatedDate,
    ContentDocument.LatestPublishedVersion.Id,
    ContentDocument.LatestPublishedVersion.LastModifiedDate,
    ContentDocument.LatestPublishedVersion.SystemModStamp,
    ContentDocument.LatestPublishedVersion.Title,
    ContentDocument.LatestPublishedVersion.VersionData
from ContentDocumentLink
where Id != null and LinkedEntityId in (select id from opportunity)
order by Id desc
To save the changes, click the + button, then click Apply changes at the bottom, and finally Save and rebuild source (or Save).