SharePoint Search provides excellent search capabilities and features that many organizations can leverage with minimal customization and configuration. Plus, with the addition of FAST for SharePoint and its big brother FAST Search for Internet Sites (FSIS), Microsoft has deep search offerings to solve countless search opportunities.
However, even with features such as metadata-based results refinement and query suggestions, users of SharePoint search often need to climb the highest mountains and run through the fields, and still not find what they are looking for (thanks, Bono).
Much of this is due to the flexible way SharePoint presents content, and to how SharePoint crawls that content. Let’s say you are searching for ‘Joshua Tree’ in your vast SharePoint-based discography, built as an asset library of mp3 files with lookup columns into other custom lists that specify the Albums and Artists. By default, you’ll get back something like this:
At first glance, this looks great, but looking further you’ll see that SharePoint is returning duplicate results. Instead of just one ‘hit’ for the Joshua Tree album, we’re seeing quite a few: one for the album itself and one for each ‘View’ of the Albums list (All Items, By Artist, By Genre). In addition, because we have multiple views in our Songs Library too, we’re getting multiple hits for each song as well. Once you’ve made this realization, the results appear ‘less great’ and frustration sets in.
Luckily, the solution is easy once you know what SharePoint is doing. When SharePoint crawls content, it looks at each and every page of the site and indexes what it finds. The views we created to provide flexibility in navigating to our albums and songs by artist, genre, etc. are actually making our search results less effective. In addition, SharePoint is indexing the songs and albums themselves, so we end up with many search results, all pointing to the same thing.
Our solution is to configure search to ignore the types of results we don’t want displayed. In this case, we are only interested in the Albums and Songs themselves, not the lists that contain them. There are two ways in which we can configure SharePoint to accomplish this: Query Results or Crawl Rules.
Every item displayed in search results has a Content Class. The Content Class indicates whether the result was found in a List, Library, Calendar, Page, etc. Because it is a search managed property, we can specify which Content Classes we want returned (or not returned) in our query. To test this, you can enter the following syntax into your query box:
Joshua Tree AND ContentClass<>STS_List_GenericList
Since the ‘Albums’ List was created as a Custom List, this query eliminates any result stemming from a custom list, thus eliminating the duplicate and unnecessary views from our search results, as shown below:
Further, we can also eliminate Asset Libraries by adding another clause:
Joshua Tree AND ContentClass<>STS_List_GenericList AND ContentClass<>STS_List_851
STS_List_851 represents a SharePoint 2010 Asset Library, and thus our query results are reduced to exactly what we expect when performing our search:
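To make the effect of the two clauses concrete, here is a minimal Python sketch that models search hits and drops the ones whose Content Class is excluded. It is only an illustration of the filtering logic, not how SharePoint evaluates queries. The `Hit` type, the titles, the URLs, and the item-level class names (`STS_ListItem_GenericList`, `STS_ListItem_851`) are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One search result (hypothetical shape, for illustration only)."""
    title: str
    url: str
    content_class: str


# Hypothetical hits for 'Joshua Tree' before any filtering is applied.
HITS = [
    Hit("The Joshua Tree", "/Lists/Albums/DispForm.aspx?ID=1", "STS_ListItem_GenericList"),
    Hit("Albums - All Items", "/Lists/Albums/AllItems.aspx", "STS_List_GenericList"),
    Hit("Albums - By Artist", "/Lists/Albums/By Artist.aspx", "STS_List_GenericList"),
    Hit("Albums - By Genre", "/Lists/Albums/By Genre.aspx", "STS_List_GenericList"),
    Hit("Songs - All Items", "/Songs/Forms/AllItems.aspx", "STS_List_851"),
    Hit("Where the Streets Have No Name", "/Songs/streets.mp3", "STS_ListItem_851"),
]

# The two classes excluded by the query clauses in the text.
EXCLUDED = {"STS_List_GenericList", "STS_List_851"}


def filter_hits(hits, excluded):
    """Keep only hits whose content class is not in the excluded set."""
    return [hit for hit in hits if hit.content_class not in excluded]


if __name__ == "__main__":
    for hit in filter_hits(HITS, EXCLUDED):
        print(hit.title, "->", hit.url)
```

In this toy data set, only the album item and the song item survive the filter; the list and library views are dropped.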
So, now that we can form a query that will return the results we need, we can modify our search results page to automatically append our Content Class filter clauses to each and every query performed on the page. (A SharePoint Search Results page is simply a set of web parts that allow us to enter a search term and then display results, and filter on those results.) To do this, simply modify the ‘Results Query Options’ of the Search Results web part to include a query clause in the ‘Append Text to Query’ field as shown below:
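The text to paste into the ‘Append Text to Query’ field can be assembled from a list of Content Classes. The Python sketch below builds that clause and shows the query that results for a given search term. It assumes the web part simply joins the user's terms and the appended text with a space, which is a simplification; the function names are hypothetical.

```python
def exclusion_clause(content_classes):
    """Build 'ContentClass<>A AND ContentClass<>B' from a list of class names."""
    return " AND ".join(f"ContentClass<>{name}" for name in content_classes)


def effective_query(user_terms, appended_text):
    """Approximate the final query: user terms followed by the appended text.

    Assumption: plain concatenation with a single space.
    """
    user_terms = user_terms.strip()
    if not appended_text:
        return user_terms
    return f"{user_terms} {appended_text}"


if __name__ == "__main__":
    classes = ["STS_List_GenericList", "STS_List_851"]
    # The leading AND makes the appended text combine with whatever the user typed.
    appended = "AND " + exclusion_clause(classes)
    print(appended)
    print(effective_query("Joshua Tree", appended))
```

Keeping the class names in a list makes it easy to add or remove an exclusion later without retyping the whole clause.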
Modifying a Query Results web part is certainly easy and can provide some quick results, but when the number of results being returned is large, it is less than efficient to perform the filtering at query time; instead we should consider modifying our crawl rules to prevent these pages from ever reaching the index.
When specifying crawl rules, additional options are presented. Not only can we use the exact same query clauses to restrict results by content class, but we can also restrict results based on specific paths, such as:
*/Lists/Albums/By Artist.aspx, or
*/Lists/Albums/AllItems.aspx, or even
The exact techniques used to specify how crawl rules are created and managed are outside the scope of this blog, but details can be found here: http://technet.microsoft.com/en-us/library/ee792871.aspx
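Before adding path-based exclusions, it can help to check which URLs a set of patterns would catch. The Python sketch below does this with the standard library's `fnmatch` module. This only approximates wildcard matching and is not SharePoint's own crawl rule engine; the `http://intranet/music` URLs are hypothetical, and the case-insensitive comparison is an assumption.

```python
from fnmatch import fnmatchcase

# Path patterns taken from the examples above.
EXCLUDE_PATTERNS = [
    "*/Lists/Albums/By Artist.aspx",
    "*/Lists/Albums/AllItems.aspx",
]


def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """Return True if the URL matches any exclusion pattern.

    Both sides are lowercased so the comparison ignores case (an assumption).
    """
    lowered = url.lower()
    return any(fnmatchcase(lowered, pattern.lower()) for pattern in patterns)


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    candidates = [
        "http://intranet/music/Lists/Albums/AllItems.aspx",
        "http://intranet/music/Lists/Albums/By Artist.aspx",
        "http://intranet/music/Lists/Albums/DispForm.aspx?ID=1",
    ]
    for url in candidates:
        print("exclude" if is_excluded(url) else "crawl  ", url)
```

In this sketch the two view pages match an exclusion pattern, while the display form for an individual album does not, so the item itself would still be crawled.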
Office 365 Consideration: For those of us who are leveraging SharePoint Online in Microsoft’s Office 365 suite, specifying crawl rules is not an option, as administering the Search Service Application is unavailable.
So, in summary, when the number of results is small, as in this example, it is easy to understand the results and locate the correct link to follow. But when we think about the amount of content contained in most SharePoint implementations, it becomes critical to plan for how our search results will be returned, and to make the necessary adjustments to how content is crawled or how it is displayed, in order to provide effective and relevant search results.