Loading JSON Content into AEM

Let’s talk about extract, transform, and load, also known as ETL. If you are an AEM professional, this is something you have probably dealt with before. The content might be products, user bios, or store locations.

The extract and transform parts may differ depending on your source and requirements. The loading part is almost always going to be into AEM. There are a few ways to do that, so let’s talk about what is available out of the box.

Sling Post Servlet

As an AEM developer, you should be familiar with the Sling Post Servlet. In particular, it has an import operation, which allows us to do the following:

curl -L https://www.boredapi.com/api/activity | \
curl -u admin:admin \
     -F":contentFile=@-" \
     -F":nameHint=activity" \
     -F":operation=import" \
     -F":contentType=json" \
     http://localhost:4502/content/mysite/us/en/jcr:content/root/container

You can run this many times. Each run adds another activity_* node under /content/mysite/us/en/jcr:content/root/container. This assumes that the source is already in the format you want, meaning you have already done the transform part.
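
What might that transform part look like? Here is a minimal sketch in plain Java, not something taken from a real project. It assumes a flat source record with activity and type fields. The ActivityTransform class, the mysite/components/activity resource type, and the activityType property are all made up for illustration. It builds the JSON by hand to stay dependency-free; in a real project you would likely reach for a JSON library instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical transform step: turns a flat source record into import-ready JSON. */
final class ActivityTransform {

    private ActivityTransform() { }

    /** Escapes the characters that must not appear raw inside a JSON string. */
    static String escape(final String value) {
        final StringBuilder sb = new StringBuilder();
        for (final char c : value.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become \\uXXXX escapes.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Maps the assumed source fields onto the properties of a single JCR node. */
    static String toImportJson(final Map<String, String> source) {
        final Map<String, String> node = new LinkedHashMap<>();
        node.put("jcr:primaryType", "nt:unstructured");
        node.put("sling:resourceType", "mysite/components/activity"); // hypothetical component
        node.put("jcr:title", source.getOrDefault("activity", ""));
        node.put("activityType", source.getOrDefault("type", "")); // hypothetical property
        return node.entrySet().stream()
                .map(e -> "\"" + escape(e.getKey()) + "\":\"" + escape(e.getValue()) + "\"")
                .collect(Collectors.joining(",", "{", "}"));
    }
}
```

The string returned by toImportJson is what you would pipe into the import operation in place of the raw API response.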

The import operation can also handle more complex JSON structures, and even XML. Here is one possible output of a transform:

{
    "jcr:primaryType": "cq:Page",
    "jcr:content": {
        "jcr:primaryType": "cq:PageContent",
        "jcr:title": "My Page",
        "sling:resourceType": "mysite/components/page",
        "cq:template": "/conf/mysite/settings/wcm/templates/page-content",
        "root": {
            "jcr:primaryType": "nt:unstructured",
            "sling:resourceType": "mysite/components/container",
            "layout": "responsiveGrid",
            "container": {
                "jcr:primaryType": "nt:unstructured",
                "sling:resourceType": "mysite/components/container"
            }
        }
    }
}

Save this to a file named mypage.json and run the following curl command.

curl -u admin:admin \
     -F":name=my-page" \
     -F":contentFile=@mypage.json" \
     -F":operation=import" \
     -F":contentType=json" \
     -F":replace=true" \
     -F":replaceProperties=true" \
     http://localhost:4502/content/mysite/us/en

And boom! You have an instant page. This time, instead of :nameHint, I used the :name, :replace, and :replaceProperties parameters. Because the node name is now fixed and replacing is allowed, running this command again will update the existing page rather than create a new one. The loading part becomes really trivial, and you need only worry about extracting and transforming.
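
If your transform already runs on the JVM, you may prefer to post from Java rather than shell out to curl. The sketch below is a rough equivalent of the command above, under a few assumptions: it uses the JDK 11+ java.net.http API, it sends the JSON inline through the Post Servlet’s :content parameter instead of uploading a file with :contentFile, and it uses a form-encoded body rather than multipart. The SlingImportRequest class and its method names are mine. It only builds the request. Whether your instance accepts it depends on its authentication and CSRF or referrer filter setup, which I have not verified here.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper that builds a Sling POST import request without sending it. */
final class SlingImportRequest {

    private SlingImportRequest() { }

    /** The same parameters as the curl call, with the JSON passed inline via :content. */
    static Map<String, String> importFields(final String nodeName, final String json) {
        final Map<String, String> fields = new LinkedHashMap<>();
        fields.put(":operation", "import");
        fields.put(":contentType", "json");
        fields.put(":name", nodeName);
        fields.put(":content", json);
        fields.put(":replace", "true");
        fields.put(":replaceProperties", "true");
        return fields;
    }

    /** Encodes the fields as an application/x-www-form-urlencoded body. */
    static String formBody(final Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    /** Builds a POST against the parent path using HTTP basic authentication. */
    static HttpRequest build(final String parentUrl, final String user, final String password,
                             final String nodeName, final String json) {
        final String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(parentUrl))
                .header("Authorization", "Basic " + credentials)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(importFields(nodeName, json))))
                .build();
    }
}
```

To send it, pass the result of build to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) and check the status code of the response.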

Finding the Content Importer Service

While the Sling Post Servlet is well documented, its internal implementation is not. Luckily, it is open source. You won’t have to do any decompiling today! Let’s read the doPost method of the implementation. There are too many goodies we could dive into, so let’s stay focused. We are looking for the import operation. Did you find it?

You should have wound up at the doRun method of ImportOperation.java. This is where all those request parameters from the curl commands above come into play. Go further down. You will find a call to ContentImporter.importContent(Node, String, String, InputStream, ImportOptions, ContentImportListener). Can you find its implementation?

Finally, you should have wound up at DefaultContentImporter.java, an OSGi component that implements the ContentImporter interface.

Programmatically Using the Content Importer

Yes! Programmatically doing things. Now that we know that the ContentImporter is available as an OSGi component, all we need is:

@Reference
private ContentImporter contentImporter;

Assuming you have your content as an InputStream, you can import it under any node. As an example, I am using the SimpleServlet generated as part of the AEM Maven Archetype. I’m using Lombok to speed things up a little.

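import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import javax.jcr.Node;
import javax.jcr.RepositoryException;
import javax.servlet.Servlet;

import org.apache.commons.io.IOUtils;
import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.SlingHttpServletResponse;
import org.apache.sling.api.servlets.HttpConstants;
import org.apache.sling.api.servlets.SlingSafeMethodsServlet;
import org.apache.sling.jcr.contentloader.ContentImportListener;
import org.apache.sling.jcr.contentloader.ContentImporter;
import org.apache.sling.jcr.contentloader.ImportOptions;
import org.apache.sling.servlets.annotations.SlingServletResourceTypes;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;
import org.osgi.service.component.propertytypes.ServiceDescription;

import lombok.Builder;
import lombok.Getter;
import lombok.ToString;
import lombok.extern.slf4j.Slf4j;

@Component(service = { Servlet.class })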
@SlingServletResourceTypes(resourceTypes = "mysite/components/page", methods = HttpConstants.METHOD_GET, extensions = "txt")
@ServiceDescription("Simple Demo Servlet")
@Slf4j
public class SimpleServlet extends SlingSafeMethodsServlet {

    private static final long serialVersionUID = 1L;

    @Reference
    private ContentImporter contentImporter;

    @Override
    protected void doGet(final SlingHttpServletRequest request, final SlingHttpServletResponse response)
            throws IOException {

        final MyContentImportListener contentImportListener = new MyContentImportListener();
        final Node node = request.getResource().adaptTo(Node.class);

        if (node != null) {

            final MyImportOptions importOptions = MyImportOptions.builder()
                                                                 .overwrite(true)
                                                                 .propertyOverwrite(true)
                                                                 .build();
            try (InputStream inputStream = IOUtils.toInputStream("{\"foo\":\"bar\"}", StandardCharsets.UTF_8)) {
                this.contentImporter.importContent(node, "my-imported-structure", "application/json", inputStream, importOptions, contentImportListener);
            } catch (final RepositoryException e) {
                log.error(e.getMessage(), e);
            }
        }

        response.setContentType("text/plain");
        response.getWriter().println(contentImportListener);

    }

    @Builder
    @Getter
    private static final class MyImportOptions extends ImportOptions {

        private final boolean checkin;
        private final boolean autoCheckout;
        private final boolean overwrite;
        private final boolean propertyOverwrite;

        @Override
        public boolean isIgnoredImportProvider(final String extension) { return false; }
    }

    @Getter
    @ToString
    private static final class MyContentImportListener implements ContentImportListener {

        private final com.google.common.collect.Multimap<String, String> changes =
                com.google.common.collect.ArrayListMultimap.create();

        @Override
        public void onReorder(final String orderedPath, final String beforeSibling) { this.changes.put("onReorder", String.format("%s, %s", orderedPath, beforeSibling)); }

        @Override
        public void onMove(final String srcPath, final String destPath) { this.changes.put("onMove", String.format("%s, %s", srcPath, destPath)); }

        @Override
        public void onModify(final String srcPath) { this.changes.put("onModify", srcPath); }

        @Override
        public void onDelete(final String srcPath) { this.changes.put("onDelete", srcPath); }

        @Override
        public void onCreate(final String srcPath) { this.changes.put("onCreate", srcPath); }

        @Override
        public void onCopy(final String srcPath, final String destPath) { this.changes.put("onCopy", String.format("%s, %s", srcPath, destPath)); }

        @Override
        public void onCheckin(final String srcPath) { this.changes.put("onCheckin", srcPath); }

        @Override
        public void onCheckout(final String srcPath) { this.changes.put("onCheckout", srcPath); }
    }
}
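
The listener above leans on Guava’s Multimap to collect what happened. If Guava is not available in your bundle, a small JDK-only recorder can stand in. This is just a sketch: the ChangeLog class and its record method are hypothetical names of my own, not part of Sling.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical JDK-only replacement for the Guava Multimap used in the listener. */
final class ChangeLog {

    // Event name -> recorded paths, kept in the order the events first occurred.
    private final Map<String, List<String>> changes = new LinkedHashMap<>();

    /** Records one event; multiple paths (source, destination) are joined with ", ". */
    void record(final String event, final String... paths) {
        this.changes.computeIfAbsent(event, key -> new ArrayList<>())
                    .add(String.join(", ", paths));
    }

    /** Read-only view of everything recorded so far. */
    Map<String, List<String>> asMap() {
        return Collections.unmodifiableMap(this.changes);
    }

    @Override
    public String toString() {
        return this.changes.toString();
    }
}
```

Each listener callback then becomes a one-liner, for example changeLog.record("onCreate", srcPath) or changeLog.record("onMove", srcPath, destPath).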

Conclusion

First, extract and transform your content into the desired JSON structure. It should represent the content as you want it. Then you can leverage the Sling Post Servlet’s import feature to pipe it into AEM. If you need to be within the context of the AEM instance, you can use the Content Importer service instead.

In the end, loading becomes trivial, leaving you to focus on the harder extract and transform steps. Now, check out my colleague’s blog that goes over one way to simplify the transformation process.

About the Author

Juan Ayala is a Lead Developer in the Adobe practice at Perficient, Inc., focused on the Adobe Experience platform and the things revolving around it.
