Google includes a comprehensive, HTML-based Admin Console for making changes to the Google Search Appliance (GSA) configuration, such as adding KeyMatches, changing crawl URL patterns, and exporting reports. This works very well during development and for minor changes thereafter. But in a large, complex deployment, it is often desirable to script changes or write code to make them automatically. Manual changes can lead to errors, and some actions, such as daily backups, need to run on a schedule.
Google provides two mechanisms for programmatically making changes to the GSA: the Administrative API and the gsa_admin.py script.
The Administrative API can make certain changes to the configuration and gives you access to certain reports and data, but, unfortunately, it is not 100% comprehensive. Certain settings cannot be changed by the API, and certain files and reports cannot be downloaded. The Administrative API is available through an XML protocol, or through Java and .NET wrappers. The following code shows an example of creating a Search Report using the Administrative API in Java:
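import com.google.enterprise.apis.client.GsaClient;
import com.google.enterprise.apis.client.GsaEntry;

// Connect to the appliance's admin port with an administrator account.
GsaClient client = new GsaClient(host, port, user, pwd);

// Describe the Search Report to generate.
GsaEntry insertEntry = new GsaEntry();
insertEntry.addGsaContent("reportName", "MyReport");
insertEntry.addGsaContent("collectionName", "default_collection");
insertEntry.addGsaContent("reportDate", "date_08_24_2015");

// Submit the request to the searchLog feed.
client.insertEntry("searchLog", insertEntry);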
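For comparison, the same request can be expressed over the XML protocol as an Atom entry. The Python sketch below only builds the request body. It does not authenticate or send anything. The gsa namespace URI and the build_entry helper are assumptions for illustration, so check the element names against the Administrative API protocol guide before relying on them.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Assumed namespace for the GSA content elements; verify against the protocol guide.
GSA_NS = "http://schemas.google.com/gsa/2007"

# Serialize Atom as the default namespace and GSA elements with a "gsa:" prefix.
ET.register_namespace("", ATOM_NS)
ET.register_namespace("gsa", GSA_NS)


def build_entry(properties):
    """Return an Atom <entry> string with one gsa:content child per (name, value) pair."""
    entry = ET.Element("{%s}entry" % ATOM_NS)
    for name, value in properties:
        content = ET.SubElement(entry, "{%s}content" % GSA_NS, {"name": name})
        content.text = value
    return ET.tostring(entry, encoding="unicode")


# Same three properties as the Java example.
report_entry = build_entry([
    ("reportName", "MyReport"),
    ("collectionName", "default_collection"),
    ("reportDate", "date_08_24_2015"),
])
print(report_entry)
```

The Java and .NET wrappers take care of this serialization, along with authentication, which is one reason they are usually the easier starting point.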
I recommend using the Administrative API because it is very reliable and less complex than the second mechanism. It might take a combination of both techniques to accomplish what you need to do, but start with the Administrative API wherever possible.
Google provides a Python script called gsa_admin.py that essentially remote-controls the HTML-based Admin Console, providing access to settings and downloads not available through the Administrative API. It handles all the request parameters that the GSA Admin Console requires, including the hidden security token. The script allows you to do things like stop and start the crawler or export the All URLs file. By copying and adapting parts of its code, we have modified the script to export other types of files, such as Search Reports and ASR Logs.
The script also allows you to export and import the entire configuration XML file – and make changes to it in the process. This opens up many possibilities. The configuration XML file contains almost every setting in the GSA, including Dynamic Navigation, Front Ends and OneBox Modules. If you can find the setting you want to change in the XML file, this Python script allows you to download the configuration, make the change, and re-upload the configuration.
The following commands show how to export, change, and upload the GSA Configuration File:
python ./gsa_admin.py -n 192.168.1.2 --port 8000 -u admin -p <password> -e --sign-password 12341234 -o ./config.xml -v
... [make some changes to config.xml] ...
python ./gsa_admin.py -n 192.168.1.2 --port 8000 -u admin -p <password> -s --sign-password 12341234 -f ./config.xml -v -o ./signed_config.xml
python ./gsa_admin.py -n 192.168.1.2 --port 8000 -u admin -p <password> -i --sign-password 12341234 -f ./signed_config.xml -v
Note: the gsa_admin.py script requires Python 2.x and does not run under Python 3.x.
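The "make some changes" step between export and signing can itself be scripted. The following is a minimal sketch in Python, assuming the setting you want lives in a single element that can be addressed by a path. The set_setting helper and the element path in the usage note are hypothetical, because the real layout of config.xml has to be read from your own export. The helper is independent of gsa_admin.py and is written for Python 3.

```python
import xml.etree.ElementTree as ET


def set_setting(config_path, element_path, new_value):
    """Set the text of the first element matching element_path.

    Rewrites the file in place and returns the previous text, so the
    caller can log what was changed.
    """
    tree = ET.parse(config_path)
    node = tree.getroot().find(element_path)
    if node is None:
        # Fail loudly rather than upload an unchanged or half-edited file.
        raise KeyError("no element matches %r" % element_path)
    old_value = node.text
    node.text = new_value
    tree.write(config_path, encoding="utf-8", xml_declaration=True)
    return old_value
```

A call might look like set_setting("./config.xml", "./globalparams/demo_setting", "new value"), where the path is purely illustrative. ElementTree does not preserve comments or CDATA markers when it rewrites a file, so it is worth diffing the result against the export before signing and uploading it.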
We have used gsa_admin.py for a variety of useful applications, such as doing nightly backups of the configuration file, automating the migration of a GSA configuration from DEV to TEST, and downloading ASR log files every day for analysis. In the case of the ASR log files, we used a combination of both the API and the Python script: the API told the GSA to generate a new log file, and the Python script downloaded it. It wasn’t the prettiest solution, but it got the job done.
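As an illustration of the nightly backup case, the sketch below wraps the export command shown earlier and writes each day's configuration to a dated file. It is not the script we ran. The function names, the file naming scheme, and the python2 interpreter name are assumptions, while the gsa_admin.py flags are the ones used in the export command above.

```python
import datetime
import os
import subprocess


def build_export_command(host, user, password, sign_password, out_dir, day=None):
    """Build the gsa_admin.py export command with a dated output file."""
    day = day or datetime.date.today()
    out_file = os.path.join(out_dir, "config_%s.xml" % day.isoformat())
    return [
        "python2", "./gsa_admin.py",      # interpreter name is an assumption
        "-n", host, "--port", "8000",
        "-u", user, "-p", password,
        "-e", "--sign-password", sign_password,
        "-o", out_file, "-v",
    ]


def run_backup(host, user, password, sign_password, out_dir):
    """Run the export; raises CalledProcessError if gsa_admin.py exits non-zero."""
    subprocess.check_call(
        build_export_command(host, user, password, sign_password, out_dir))
```

Scheduling is left to cron or a similar tool, which would call run_backup once a day. Keeping the command construction separate from the subprocess call makes it easy to check the arguments without touching an appliance.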