Cookie-based crawl rules require a <![CDATA[ ]]> section that contains the whole cookie

A few days ago I wrote a quick post about installing the AddRule utility, which lets you create crawl rules for forms-based and cookie-based sites. You create a crawl rule by writing a short XML document that contains the access information the crawler needs to log in to the target site.
While the documentation is adequate for building this document, a couple of points may trip you up when preparing your XML. The way the documentation describes adding cookie parameters seems to indicate that you should create a <cookie></cookie> tag for each parameter, using the [key]=[value] format as the content of each tag. So if your cookie needs to include a login and a password, it would look like the following:
<cookie>login=mylogin</cookie>
<cookie>password=pwd</cookie>
The reality is that the cookie tag should contain the entire POST body or GET query string sent from the form to the server. This being the case, your XML should look like this:
<cookie>login=mylogin&password=pwd</cookie>
Now if you look at that line closely, you will see that it is not well-formed XML: a bare & is read as the start of an entity reference, so an XML parser rejects it. To make this line work in the XML document, you must enclose the content of the cookie in a <![CDATA[ ]]> section. The correct code will look like:
<cookie><![CDATA[login=mylogin&password=pwd]]></cookie>
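If you want to check a cookie line before handing it to AddRule, a generic XML parser will show the difference. The sketch below uses Python's standard library rather than anything from AddRule itself. The helper name is_well_formed is made up for illustration, and whether AddRule's own parser behaves identically is an assumption.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The cookie line with a bare ampersand, as first attempted in the post.
bare = "<cookie>login=mylogin&password=pwd</cookie>"

# The same content wrapped in a CDATA section.
wrapped = "<cookie><![CDATA[login=mylogin&password=pwd]]></cookie>"

print(is_well_formed(bare))         # the bare ampersand is rejected
print(is_well_formed(wrapped))      # the CDATA version parses
print(ET.fromstring(wrapped).text)  # the parser hands back the raw cookie string
```

The CDATA section only changes how the text is written in the file. The parser still delivers login=mylogin&password=pwd as the element's content.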
While totally missing the ball on installing the AddRule application may be attributed to my lack of attention, this is one section of the documentation that isn’t quite clear and can cost you some time during configuration. I am hoping that the SP will provide a much easier interface for building these crawl rules.

PointBridge Blogs