Posts Tagged ‘document filter’

Document filter inside out (p4): work with Lister/Retriever Model

In the last installment, we carefully examined how document filters are used in creating feed XML. The feed protocol is a push mechanism by which the content source sends information to the GSA. Starting with connector framework 3.0, the GSA introduced a Lister/Retriever model, which was first implemented in the File System Connector. The connector is no longer using the […]

Document filter inside out (part 3): the anatomy working with feed

In this installment, we will discuss how the Connector Manager uses the document filter to achieve its functionality. Sometimes the easiest way to show what’s happening is with the code itself. I duplicated many sections from the Google site and also provide links to the original source. Since Google engineers keep updating their implementation, the observation […]

Document filter inside out (part 2): the configuration

Last time, we discussed the basics of document filters. In this installment, we will talk about their practical aspects. How do you configure document filters? Google had a document explaining the usage of document filters. There are two ways you can configure document filters. The first is at the Connector Manager level, specifically within <Tomcat>/webapps/connector-manager/WEB-INF/documentFilters.xml. […]

Document filter inside out (part 1): the fundamentals

Document Filter is a mechanism in Google connector framework 3.x for manipulating documents during connector traversal. It is mainly supported at the Connector Manager (CM) level. Thanks to the open source nature of the Google connector framework, we can carefully examine how the document filter is defined and implemented. com.google.enterprise.connector.spi.Document Document is an interface defined by […]