Microsoft

Blog Categories

Subscribe to RSS feed

Archives

Debugging your FAST Search Server: Document Processing Pipeline

For an example of the processing pipeline itself, check out Jeff Monnette’s blog here.

The FAST Processing Pipeline is a great method of feeding in your own Crawled Properties into FAST, and then being able to expose those properties by mapping to managed properties and searching on them. We ran into a situation with some custom code which stored ratings and comments for a document together in a column, which caused a mess when FAST showed the data. While I’m not going to go into what the entire processing pipeline code does, I am going to step through my debugging process and hopefully you will find something that will help when it comes to your own processing pipeline code.

We will look at three different errors that I encountered, and use log4net as a logger and use the FFDumper to look at crawled properties

Error 1: “The FAST Search backend reported warnings when processing the item. ( Customer-supplied command failed: Process terminated abnormally: Unknown error (0×80131700) )”

This error is common, and is generally an issue somewhere within the .exe file that you created. Usually the code is attempting to execute on something that doesn’t exist (hooray null reference). If the crawled property that you are pulling in does not have a value attached to it, then often this error will be thrown. Be sure to handle this by checking if the value exists before executing on it. For example:

Here is the value that I am reading in from FAST pipelineextensibility.xml. Sometimes it exists, often time it does not.

1: var rateValue = (from cp in inDoc.Descendants(“CrawledProperty”) 2: where new Guid(cp.Attribute(“propertySet”).Value).Equals(PROPERTYSET) && 3: cp.Attribute(“propertyName”).Value == “ows_RateValue” && 4: cp.Attribute(“varType”).Value == “31″ 5: select cp.Value);

.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode, .ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{font-size:small;color:black;font-family:consolas, “Courier New”, courier, monospace;background-color:#ffffff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .rem
{color:#008000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .kwrd
{color:#0000ff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .str
{color:#006080;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .op
{color:#0000c0;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .preproc
{color:#cc6633;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .asp
{background-color:#ffff00;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .html
{color:#800000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .attr
{color:#ff0000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .alt
{background-color:#f4f4f4;width:100%;margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .lnum
{color:#606060;}

To ensure that you only use this value when it exists, simply enclose your code with this:

1: if (rateValue.Count() > 0 && rateValue.First().Length > 0) 2: { 3: // CALL YOUR CODE HERE, FOR EXAMPLE 4: string output = rateValue.First(); 5: }

 

.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode, .ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{font-size:small;color:black;font-family:consolas, “Courier New”, courier, monospace;background-color:#ffffff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .rem
{color:#008000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .kwrd
{color:#0000ff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .str
{color:#006080;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .op
{color:#0000c0;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .preproc
{color:#cc6633;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .asp
{background-color:#ffff00;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .html
{color:#800000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .attr
{color:#ff0000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .alt
{background-color:#f4f4f4;width:100%;margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .lnum
{color:#606060;}

There are other possible points to check if this error arises, but the best place to start is with the code in your .exe file that you are attaching to the pipeline.

Error 2: The Crawl completes without processing pipeline errors, but your custom crawled properties do not show up as managed properties when you search.

This is definitely the harder one to debug, since there can be so many places where you may or may not have gone wrong. The good news here is that your code is executing without throwing any kind of errors, the bad news is you’re probably not getting inputs and/or outputs from and to FAST.

First, let’s ensure that we are getting inputs in from FAST. Vassili gave me an example using log4net to be able to log what was coming in and out of the pipeline. Download the .dll for log4net, and ensure that it is copied over along with your .exe to the FAST server.

In your code, add an App.config file for your debugging purposes. Should looks something similar to this:

1: <?xml version=“1.0″?> 2: <configuration> 3: <configSections> 4: <section name=“log4net” type=“log4net.Config.Log4NetConfigurationSectionHandler, log4net”/> 5: </configSections> 6: <log4net> 7: <appender name=“RollingFileAppender” type=“log4net.Appender.RollingFileAppender”> 8: <threshold value=“DEBUG”/> 9: <file value=“C:UserssvcfastAppDataLocalLowPB.FAST.PipelineExtensibility.log”/> 10: <appendToFile value=“true”/> 11: <rollingStyle value=“Size”/> 12: <maxSizeRollBackups value=“0″/> 13: <maximumFileSize value=“10000000KB”/> 14: <staticLogFileName value=“true”/> 15: <layout type=“log4net.Layout.PatternLayout”> 16: <conversionPattern value=“%date [%thread] %-5level -%message%newline”/> 17: </layout> 18: </appender> 19: <root> 20: <level value=“All”/> 21: <appender-ref ref=“RollingFileAppender”/> 22: </root> 23: </log4net>

.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode, .ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{font-size:small;color:black;font-family:consolas, “Courier New”, courier, monospace;background-color:#ffffff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .rem
{color:#008000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .kwrd
{color:#0000ff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .str
{color:#006080;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .op
{color:#0000c0;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .preproc
{color:#cc6633;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .asp
{background-color:#ffff00;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .html
{color:#800000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .attr
{color:#ff0000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .alt
{background-color:#f4f4f4;width:100%;margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .lnum
{color:#606060;}

The important part of this is the <file value=”C:UsersUSERNAMEAppDataLocalLowYOURLOGNAME.log” />. FAST can only write to the AppDataLocalLow location. Be sure that you are pointing the logger here (for any logging you do, unless you write to SQL).

You will also want to add the following to your AssemblyInfo.cs:

1: [assembly: log4net.Config.XmlConfigurator(Watch = false)]

.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode, .ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{font-size:small;color:black;font-family:consolas, “Courier New”, courier, monospace;background-color:#ffffff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .rem
{color:#008000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .kwrd
{color:#0000ff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .str
{color:#006080;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .op
{color:#0000c0;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .preproc
{color:#cc6633;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .asp
{background-color:#ffff00;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .html
{color:#800000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .attr
{color:#ff0000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .alt
{background-color:#f4f4f4;width:100%;margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .lnum
{color:#606060;}

In your code now, you can add the following at the beginning of your code for the inputs:

1: class Program 2: { 3: private static readonly ILog _log = LogManager.GetLogger(“YOURLOGFILE”); /* log4net code */ 4: 5: public static void Main(string[] args) 6: { 7: try { 8: string input = args[0]; 9: string output = args[1]; 10: 11: XDocument inDoc = XDocument.Load(input); 12: 13: _log.Debug(“Input Xml:”); /* log4net code */ 14: _log.Debug(inDoc.ToString()); /* log4net code */

.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode, .ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{font-size:small;color:black;font-family:consolas, “Courier New”, courier, monospace;background-color:#ffffff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .rem
{color:#008000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .kwrd
{color:#0000ff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .str
{color:#006080;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .op
{color:#0000c0;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .preproc
{color:#cc6633;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .asp
{background-color:#ffff00;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .html
{color:#800000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .attr
{color:#ff0000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .alt
{background-color:#f4f4f4;width:100%;margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .lnum
{color:#606060;}

And at the end of your code for the outputs:

1: _log.Debug(“Output Xml:”); 2: _log.Debug(docElement.ToString());

.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode, .ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{font-size:small;color:black;font-family:consolas, “Courier New”, courier, monospace;background-color:#ffffff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .rem
{color:#008000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .kwrd
{color:#0000ff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .str
{color:#006080;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .op
{color:#0000c0;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .preproc
{color:#cc6633;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .asp
{background-color:#ffff00;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .html
{color:#800000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .attr
{color:#ff0000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .alt
{background-color:#f4f4f4;width:100%;margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .lnum
{color:#606060;}

 

 

Copy over your new code and kick off a crawl. Go to the folder C:UsersUSERNAMEAppDataLocalLowYOURLOGNAME.log and you should see a log file being populated live as your crawl runs. The data will look something similar to this:

CorrectProperties

As you can see here, we’ve got proper inputs coming in, and proper outputs going out. Given that your output crawled properties have the correct properties in the XML, the crawled properties should be getting set and able to be mapped to their managed properties.

If the inputs in the XML are empty (and your output is dependent on them), then the outputs will be empty and will not show up in the log.

emptyxml

So now what? If you know for sure that there should be crawled property data coming in and your code and logged XML look correct, something is not coming in from FAST properly.

Error (and Fix) 3: The Crawled Property elements in PipelineExtensibility.xml ARE Case Sensitive

This error is about as vague and transparent to catch as any error could be. Since nothing gets thrown except an empty log file, it was difficult to chase this down. Looking up crawled properties in the UI and PowerShell was also misleading, since they both show up as fully lowercase.

Incorrectcase

So not helpful at all here in determining how exactly the crawled property should be displayed, and I spent too much time looking in the wrong places. You will need to look in the FFDDumper to inspect the crawled properties that are coming in.

To enable your FFDumper debugging process, go to FASTSEARCHetcconfig_dataDocumentProcessor and open the optionalprocessing.xml file. Here you will want to change <processor active=”no” name=”FFDDumper”/> to <processor active=”yes” name=”FFDDumper” />. Reset your document processors by typing reset psctrl in your FAST PowerShell.

As a warning, only use this for debugging purposes and remember to turn this off when you are finished debugging your code. Having FFDDumper enabled will slow down your crawls and use up quite a bit of disk space.

Now, kick off a full crawl, or queue a single document up for incremental crawl. Queuing a single document for an increment crawl is recommend since a full crawl will have multitudes of files and text that will be difficult to get through.

Go to FASTSEARCHdataffd. Here you will find folders for each document processor you have running. Go into a folder and you will see multitudes of .ffd files, which can be opened in NotePad. I wrote a simple, probably slightly inefficient script to loop through the files and find the property name that I am looking for:

1: $items = Get-ChildItem -Path FULL PATH OF FFD FOLDER

"

2: 3: # enumerate the items array 4: 5: foreach ($item in $items) 6: { 7: if (Get-Content $item | Select-String “ows_ratevalue

")

8: { 9: Write-Host “LOOK HERE!

" $item

10: } 11: else 12: { 13: Write-Host “FALSE” $item 14: } 15: }

.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode, .ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{font-size:small;color:black;font-family:consolas, “Courier New”, courier, monospace;background-color:#ffffff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .rem
{color:#008000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .kwrd
{color:#0000ff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .str
{color:#006080;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .op
{color:#0000c0;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .preproc
{color:#cc6633;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .asp
{background-color:#ffff00;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .html
{color:#800000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .attr
{color:#ff0000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .alt
{background-color:#f4f4f4;width:100%;margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .lnum
{color:#606060;}

This finally gave me the correct camel case for my crawled property (ows_RateValue) after digging through the .ffd file that I pinpointed with my script; which I placed in my pipelineextensiblity.xml file.

1: <CrawledProperty propertySet=“00130329-0000-0130-c000-000000131346″ varType=“31″ propertyName=“ows_RateValue” />

.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode, .ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{font-size:small;color:black;font-family:consolas, “Courier New”, courier, monospace;background-color:#ffffff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode pre
{margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .rem
{color:#008000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .kwrd
{color:#0000ff;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .str
{color:#006080;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .op
{color:#0000c0;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .preproc
{color:#cc6633;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .asp
{background-color:#ffff00;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .html
{color:#800000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .attr
{color:#ff0000;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .alt
{background-color:#f4f4f4;width:100%;margin:0em;}
.ExternalClass7303424B2DFC41F5A35EDBD1F9A47EC8 .csharpcode .lnum
{color:#606060;}

For more specific FFDDumper steps, check out Neil Richard’s blog.

I hope this has helped some of you through the process of debugging your processing pipeline. Happy Searching!

Tags: ,

Leave a Reply