This post is the first in a series on Perficient’s newest product: Handshake, the Extensible Search Connector. Perficient’s Data Solution group has built more than 30 custom search connectors, including FileNet P8, Documentum, JIRA/Confluence, Salesforce, and Adobe Experience Manager. As more customers recognize the benefits of search, the demand for additional content sources grows. We’ve developed Handshake to meet this need and go beyond it.
What is a search connector?
Search use cases are abundant. Enterprise, Commerce, Website and Service are the major categories of implementation we work with. For any use case, however, the search engine is only as good as the fuel it consumes.
Enterprise search applications come bundled with source connectors, which create the fuel. They crawl and index the content customers want to see in a search result. They know how to map fields to act as facets or filters for the search engine. Each solution has a set of supported connection types. These vary from product to product and company to company. Web content crawling, for example, is a near-ubiquitous source type and is available in almost every product. Sitecore, on the other hand, is supported out of the box by Coveo only.
A search connector is typically a standalone application. It is designed to interface with a specific source repository and crawl it for searchable content, metadata, and security permissions. Content is typically pushed to a search engine for indexing. In our practice, a connector is built for a specific job (a specific subset of content) and executes a standard workflow: connection, standardization or transformation, and transmission to the search repository.
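The three-step workflow described above can be sketched in a few lines of Python. This is an illustration only, not Handshake code: the `Document` shape, the `run_connector` function, and the in-memory source and index are all hypothetical stand-ins for a real repository and search engine.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class Document:
    """A crawled item: body text, metadata, and the principals allowed to see it."""
    doc_id: str
    content: str
    metadata: dict = field(default_factory=dict)
    allowed: set = field(default_factory=set)


def run_connector(
    crawl: Callable[[], Iterable[Document]],
    transform: Callable[[Document], Document],
    push: Callable[[Document], None],
) -> int:
    """Run the workflow once and return the number of documents sent."""
    sent = 0
    for doc in crawl():        # 1. connection: read from the source repository
        push(transform(doc))   # 2. standardize, then 3. transmit to the index
        sent += 1
    return sent


# Hypothetical in-memory source, standing in for a real repository.
def crawl_source():
    yield Document("1", "  Quarterly report  ", {"Author": "Ann"}, {"finance"})
    yield Document("2", "Release notes", {"Author": "Bob"}, {"everyone"})


def normalize(doc: Document) -> Document:
    # Trim the body and lowercase metadata keys so field names are consistent.
    return Document(
        doc.doc_id,
        doc.content.strip(),
        {key.lower(): value for key, value in doc.metadata.items()},
        set(doc.allowed),
    )


# Hypothetical in-memory index, standing in for a real search engine.
index = {}


def push_to_index(doc: Document) -> None:
    index[doc.doc_id] = doc


count = run_connector(crawl_source, normalize, push_to_index)
```

Because each step is passed in as a function, the same loop can serve a different source or a different search engine by swapping a single argument.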
As Integration Specialists, we’ve built connectors for most use cases. By far the most common use case is indexing an unsupported source. These are sources, like content management solutions, that are either too complicated or too obscure to be included in a partner’s standard product roadmap.
Over time, we have found that the most labor-intensive aspect of developing a connector is defining the interface to a source. We asked ourselves the same questions again and again:
- What protocols are available to crawl content?
- How do we coalesce metadata so that a search engine can consume it?
- What is the best way to flatten or map permissions of varying complexity?
- How do we read content quickly?
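To make the permissions question concrete, here is a minimal Python sketch of one possible flattening approach: expanding nested groups into a flat set of users that a search engine could filter on. It assumes a simple read-access list and a group directory held in a dictionary; the function name `flatten_principals` and the sample data are hypothetical, and real repositories often add deny rules and inheritance that this sketch ignores.

```python
def flatten_principals(acl, group_members):
    """Expand nested groups in an ACL into a flat set of user names.

    acl: iterable of principals (users or group names) granted read access.
    group_members: dict mapping a group name to its direct members.
    Any principal that is not a key in group_members is treated as a user.
    """
    users = set()
    seen_groups = set()
    stack = list(acl)
    while stack:
        principal = stack.pop()
        if principal in group_members:
            if principal not in seen_groups:  # guard against cyclic membership
                seen_groups.add(principal)
                stack.extend(group_members[principal])
        else:
            users.add(principal)
    return users


# Hypothetical directory: "staff" contains a nested group, and "eng" loops back.
groups = {
    "staff": ["alice", "eng"],
    "eng": ["bob", "carol", "staff"],
}

readers = flatten_principals(["staff", "dave"], groups)
```

Here `readers` ends up containing alice, bob, carol, and dave. Tracking visited groups keeps the expansion from looping forever when group memberships are circular.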
In developing Handshake, we set out to abstract these components into reusable, plug-and-play code elements, and we have achieved that.
We’ve designed Handshake to generate connector instances. These instances consist of pipelines, controlled through a central user interface. Unlike standalone connectors, they don’t require redeploying a Java application for every tweak, major or minor. Additional transformations can easily be added to a pipeline. Search destinations can be swapped out in minutes. Different rules can be applied to multiple connector instances with relative ease.
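One way to picture a configurable pipeline is as data that names reusable stages, so that a change is an edit to configuration and not a redeployment. The Python sketch below illustrates that idea under stated assumptions; it does not show how Handshake is implemented, and the stage names, the `MemoryDestination` class, and the configuration keys are all hypothetical.

```python
# Hypothetical registry of reusable transformations, keyed by configuration name.
TRANSFORMS = {
    "strip_whitespace": lambda doc: {**doc, "body": doc["body"].strip()},
    "lowercase_title": lambda doc: {**doc, "title": doc["title"].lower()},
}


class MemoryDestination:
    """Stand-in for a search engine: it only records what it receives."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def send(self, doc):
        self.received.append(doc)


# Hypothetical registry of search destinations.
DESTINATIONS = {
    "engine_a": MemoryDestination("engine_a"),
    "engine_b": MemoryDestination("engine_b"),
}


def run_pipeline(config, documents):
    """Apply the configured transformations in order, then send each document."""
    steps = [TRANSFORMS[name] for name in config["transforms"]]
    destination = DESTINATIONS[config["destination"]]
    for doc in documents:
        for step in steps:
            doc = step(doc)
        destination.send(doc)
    return destination


docs = [{"title": "Annual REPORT", "body": "  numbers  "}]

# Two connector instances share the same code and differ only in configuration.
first = run_pipeline(
    {"transforms": ["strip_whitespace"], "destination": "engine_a"}, docs
)
second = run_pipeline(
    {"transforms": ["strip_whitespace", "lowercase_title"], "destination": "engine_b"},
    docs,
)
```

Adding a transformation or swapping the destination changes only the configuration dictionary; the stages themselves are written once and shared by every instance.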
Our framework allows higher precision and reuse of source interfaces. This represents a massive reduction in the custom code needed to transform content. We write once and reuse many times. Connectors are no longer standalone applications but instances of shared code, which reduces time to deploy and empowers administrators to control the flow of data to their search solutions.
In future posts, we will discuss some of the technology we’ve used to solve this problem, go in depth on some of Handshake’s features, and cover the challenges and solutions for search connectors in general, the challenges and benefits of product development at a consulting firm, and what it’s like to be integration experts.