A common task for Adobe Experience Manager (AEM) full-stack developers is migrating content from other applications into AEM. Source data can arrive in various formats, such as JSON, XML, or CSV. When the source format is JSON, transforming it into the target structure usually means writing complex code to read and parse it. In this blog, I will show you an efficient and configurable solution to this problem.
I recently faced this challenge with one of our clients. They have a Solr service, in which the most recent information about their shops around the world is stored. I wanted to avoid calling that service every time we present that information in AEM. For this purpose, I decided to implement a daily batch process to get the data from the service and turn it into JCR nodes and properties.
The problem was that the information is stored in a format that does not directly match the structure required for shop pages in our current AEM implementation. Therefore, I needed to perform a transformation of that data before uploading it to the system.
My first thought was to write an elaborate transformation algorithm. However, I realized that this was probably a common problem and decided to look for a library to do the job. The best option I found was Jolt. It is an open-source library with zero external dependencies, used in other open-source projects such as Apache Camel and Apache NiFi, and widely discussed in software development forums.
What is Jolt?
Jolt is an open-source JSON to JSON transformation library written in Java.
Jolt’s main features are:
- Provides a set of transformations to deal with the JSON-to-JSON conversion.
- Focuses on transforming the structure of the JSON data, not on manipulating its values.
- Uses a JSON file for the specification.
Demo: http://jolt-demo.appspot.com/#inception
GitHub repo: https://github.com/bazaarvoice/jolt
Jolt supports the following transformations:
- Shift: copies data from the input to the output tree.
- Default: applies default values to the tree.
- Remove: removes data from the tree.
- Sort: sorts the map keys alphabetically.
- Cardinality: adjusts the cardinality of input data.
Further information and examples of these transformations can be found on Jolt’s GitHub page.
Steps to Integrate Jolt in an AEM Project
First, we need to add the Maven dependency. Apache ServiceMix offers an OSGi wrapper for Jolt, so we do not need to embed it in our bundle. Remember to install the JAR through the Bundles section of the Adobe Experience Manager Web Console.
<dependency>
    <groupId>org.apache.servicemix.bundles</groupId>
    <artifactId>org.apache.servicemix.bundles.bazaarvoice-jolt</artifactId>
    <version>0.1.1_1</version>
</dependency>
Now, we can use Jolt to do all the heavy lifting work for us.
For demonstration purposes, imagine that you have this JSON document that comes from an external system:
{
    "store_id": "1234",
    "address": {
        "street_addresses": ["742 Evergreen Terrace"],
        "city": "Springfield"
    }
}
And we want to load it into AEM like this:
{
    "storeId": "1234",
    "location": {
        "address": ["742 Evergreen Terrace"],
        "city": "Springfield"
    }
}
Jolt can chain different transformations to process a record. So, our transformation specification file must consist of an array of operations. In our case, we only need one.
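To make the idea of a chain concrete, here is a minimal sketch in plain Java. It is not Jolt's implementation and does not use Jolt's API: the `Operation` interface, the `rename` and `withDefault` helpers, and the `country` field are all hypothetical names made up for this illustration. It only shows the principle that operations run in order and each one receives the output of the previous one.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

public class ChainSketch {

    // Hypothetical stand-in for one operation in a chain: a function from tree to tree.
    public interface Operation extends UnaryOperator<Map<String, Object>> {}

    // Applies each operation in order, feeding the output of one into the next.
    public static Map<String, Object> runChain(List<Operation> chain, Map<String, Object> input) {
        Map<String, Object> current = input;
        for (Operation op : chain) {
            current = op.apply(current);
        }
        return current;
    }

    // Toy "rename" step: copies every entry, replacing oldKey with newKey.
    public static Operation rename(String oldKey, String newKey) {
        return in -> {
            Map<String, Object> out = new LinkedHashMap<>();
            in.forEach((k, v) -> out.put(k.equals(oldKey) ? newKey : k, v));
            return out;
        };
    }

    // Toy "default" step: adds key=value only when the key is absent.
    public static Operation withDefault(String key, Object value) {
        return in -> {
            Map<String, Object> out = new LinkedHashMap<>(in);
            out.putIfAbsent(key, value);
            return out;
        };
    }

    public static void main(String[] args) {
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("store_id", "1234");

        // "country" is an invented field, used only to show a second step.
        List<Operation> chain = List.of(
                rename("store_id", "storeId"),
                withDefault("country", "unknown"));

        System.out.println(runChain(chain, input)); // {storeId=1234, country=unknown}
    }
}
```

In a real Jolt spec file, each element of the top-level array plays the role of one `Operation` here.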
To do our transformation, we will use the shift operation. It copies properties from the input JSON to the desired location in the output file. In the spec property, we tell Jolt the initial and final locations for a given field.
Let’s see how to move street_addresses to address.
First, we start with a copy of the input:
"spec": { "address": { "street_addresses": } }
Then we define where we want to move the property:
"spec": { "address": { "street_addresses": "location" } }
Finally, we specify the new name for it:
"spec": { "address": { "street_addresses": "location.address" } }
With this simple specification, we have moved address.street_addresses to location.address.
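Conceptually, a shift spec mirrors the shape of the input, and each leaf is a dot-separated path in the output. The toy sketch below shows that idea in plain Java. It is not how Jolt is implemented and it ignores wildcards, arrays of targets, and the rest of the shift syntax; the `ToyShift` class and its methods are hypothetical names used only for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ToyShift {

    // Walks spec and input side by side; each String leaf in the spec is a
    // dot-separated output path that receives the matching input value.
    @SuppressWarnings("unchecked")
    public static void shift(Map<String, Object> spec, Map<String, Object> input,
                             Map<String, Object> output) {
        for (Map.Entry<String, Object> e : spec.entrySet()) {
            Object value = input.get(e.getKey());
            if (value == null) {
                continue; // nothing in the input at this key, so nothing to copy
            }
            if (e.getValue() instanceof Map && value instanceof Map) {
                // Both sides are objects: descend one level on each.
                shift((Map<String, Object>) e.getValue(), (Map<String, Object>) value, output);
            } else if (e.getValue() instanceof String) {
                put(output, ((String) e.getValue()).split("\\."), value);
            }
        }
    }

    // Creates intermediate maps along the path, then stores the value at the last segment.
    @SuppressWarnings("unchecked")
    private static void put(Map<String, Object> root, String[] path, Object value) {
        Map<String, Object> node = root;
        for (int i = 0; i < path.length - 1; i++) {
            node = (Map<String, Object>) node.computeIfAbsent(
                    path[i], k -> new LinkedHashMap<String, Object>());
        }
        node.put(path[path.length - 1], value);
    }

    public static void main(String[] args) {
        Map<String, Object> spec = Map.<String, Object>of(
                "address", Map.<String, Object>of("street_addresses", "location.address"));
        Map<String, Object> input = Map.<String, Object>of(
                "address", Map.<String, Object>of(
                        "street_addresses", List.of("742 Evergreen Terrace")));

        Map<String, Object> output = new LinkedHashMap<>();
        shift(spec, input, output);
        System.out.println(output); // {location={address=[742 Evergreen Terrace]}}
    }
}
```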
Adding the remaining fields gives the following spec file:
[
    {
        "operation": "shift",
        "spec": {
            "store_id": "storeId",
            "address": {
                "street_addresses": "location.address",
                "city": "location.city"
            }
        }
    }
]
As you can see, the specification file is short and easy to read.
Now that we know how to do transformations using Jolt, it is time to use them in Java.
To use Jolt, we need to create an instance of Chainr. Here is example code that executes the transformation discussed above:
import com.bazaarvoice.jolt.Chainr;
import com.bazaarvoice.jolt.JsonUtils;

public class JoltTest {

    public static void transformJsonJolt() {
        // Load the spec (an array of operations) from the classpath and build the chain.
        final Chainr chainr = Chainr.fromSpec(JsonUtils.classpathToList("/path/to/specFile.json"));
        // Load the input document from the classpath.
        final Object jsonInput = JsonUtils.classpathToObject("/path/to/jsonInput.json");
        // Apply every operation in the spec, in order.
        final Object jsonOutput = chainr.transform(jsonInput);
        System.out.println(jsonOutput);
    }
}
The output:
{storeId=1234, location={address=[742 Evergreen Terrace], city=Springfield}}
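Note that this output is not JSON text. The transform result is made of plain Java collections (nested Map and List instances), and System.out.println prints their toString() form. If you need a JSON string, Jolt's JsonUtils.toJsonString(Object) can serialize the result. The sketch below builds the same structure by hand, as a stand-in for the transform result so that it runs without Jolt on the classpath, and shows one way to read a nested value from it. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TransformResultSketch {

    // Hand-built stand-in for the Object returned by chainr.transform(...) in the example above.
    public static Object sampleResult() {
        Map<String, Object> location = new LinkedHashMap<>();
        location.put("address", List.of("742 Evergreen Terrace"));
        location.put("city", "Springfield");

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("storeId", "1234");
        root.put("location", location);
        return root;
    }

    // Reads location.city from the nested maps; returns null if any level is missing.
    public static String cityOf(Object result) {
        if (!(result instanceof Map)) {
            return null;
        }
        Object location = ((Map<?, ?>) result).get("location");
        if (!(location instanceof Map)) {
            return null;
        }
        Object city = ((Map<?, ?>) location).get("city");
        return city instanceof String ? (String) city : null;
    }

    public static void main(String[] args) {
        Object result = sampleResult();
        System.out.println(result);         // {storeId=1234, location={address=[742 Evergreen Terrace], city=Springfield}}
        System.out.println(cityOf(result)); // Springfield
    }
}
```

Checking types at each level, rather than casting blindly, keeps the batch job from failing with a ClassCastException when a source record is missing a field.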
Conclusion
Jolt is a powerful tool that helps you to perform complex JSON transformations without reinventing the wheel. It is simple, open-source, and dependency-free.
Now that you know how to transform your JSON data, check this blog from one of my colleagues to learn how to import it into AEM using the ContentImporter component.