Let’s talk about extract, transform, and load, also known as ETL. If you are an AEM professional, this is something you have probably dealt with before. The data could be products, user bios, or store locations.
The extract and transform parts may differ depending on your source and requirements. The loading part is almost always going to be into AEM. While there may be a few ways to do that, let us talk about what is there for you out-of-the-box.
Sling Post Servlet
As an AEM developer, the Sling Post Servlet is something you should be familiar with. In particular, there is an import operation. This allows us to do the following:
curl -L https://www.boredapi.com/api/activity | \
  curl -u admin:admin \
    -F":contentFile=@-" \
    -F":nameHint=activity" \
    -F":operation=import" \
    -F":contentType=json" \
    http://localhost:4502/content/mysite/us/en/jcr:content/root/container
You can run this many times. Each run adds another activity_* node under /content/mysite/us/en/jcr:content/root/container. This assumes that the source is already in the format you desire, meaning you have already done the transform part.
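When the source is not yet in the desired shape, the transform can be as small as adding the JCR and Sling properties a component node needs. The following is a minimal sketch in plain Java, not taken from the original setup: the class name ActivityTransform, its helper methods, the field names activity and type, and the resource type mysite/components/activity are all assumptions made for illustration. The hand-rolled escaping only covers common cases, so a real JSON library would be the safer choice in production.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical transform step: shapes a flat source record into importable node JSON. */
final class ActivityTransform {

    /** Wraps a value in quotes, escaping quotes, backslashes, and common whitespace escapes. */
    static String quote(final String value) {
        final StringBuilder sb = new StringBuilder("\"");
        for (final char c : value.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Puts the node type and resource type first, then copies the source fields. */
    static Map<String, String> toNode(final Map<String, String> source, final String resourceType) {
        final Map<String, String> node = new LinkedHashMap<>();
        node.put("jcr:primaryType", "nt:unstructured");
        node.put("sling:resourceType", resourceType); // assumed component, not from the article
        node.putAll(source);
        return node;
    }

    /** Serializes a flat map of string properties as a single JSON object. */
    static String toJson(final Map<String, String> node) {
        return node.entrySet().stream()
                .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
                .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(final String[] args) {
        final Map<String, String> source = new LinkedHashMap<>();
        source.put("activity", "Learn to \"juggle\""); // made-up sample record
        source.put("type", "recreational");
        System.out.println(toJson(toNode(source, "mysite/components/activity")));
    }
}
```

The printed JSON could then be piped into the same import request in place of the raw API response.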
The import operation can also deal with more complex JSON structures, and even XML. Here is a possible output that could be provided by a transform:
{
  "jcr:primaryType": "cq:Page",
  "jcr:content": {
    "jcr:primaryType": "cq:PageContent",
    "jcr:title": "My Page",
    "sling:resourceType": "mysite/components/page",
    "cq:template": "/conf/mysite/settings/wcm/templates/page-content",
    "root": {
      "jcr:primaryType": "nt:unstructured",
      "sling:resourceType": "mysite/components/container",
      "layout": "responsiveGrid",
      "container": {
        "jcr:primaryType": "nt:unstructured",
        "sling:resourceType": "mysite/components/container"
      }
    }
  }
}
Save this to a file named mypage.json and run the following curl command.
curl -u admin:admin \
  -F":name=my-page" \
  -F":contentFile=@mypage.json" \
  -F":operation=import" \
  -F":contentType=json" \
  -F":replace=true" \
  -F":replaceProperties=true" \
  http://localhost:4502/content/mysite/us/en
And boom! You have an instant page. This time, instead of :nameHint, I used the :name, :replace, and :replaceProperties parameters. Running this command again will update the page. The loading part becomes trivial, and you need only worry about extracting and transforming.
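If the transform already runs in Java, the same import request can be sent without curl. The sketch below is an assumption-laden illustration rather than a tested client: it uses the JDK 11+ HttpClient, sends the JSON through the Sling Post Servlet's :content parameter (the string-valued counterpart to :contentFile) as a URL-encoded form, and the class ImportRequestSketch with its helper methods is hypothetical. Filter settings on your instance, such as CSRF protection, may need adjusting before it works.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical client that mirrors the curl import command as a form POST. */
final class ImportRequestSketch {

    /** The same parameters as the curl command, with :content replacing :contentFile. */
    static Map<String, String> importFields(final String name, final String json) {
        final Map<String, String> fields = new LinkedHashMap<>();
        fields.put(":operation", "import");
        fields.put(":contentType", "json");
        fields.put(":name", name);
        fields.put(":content", json);
        fields.put(":replace", "true");
        fields.put(":replaceProperties", "true");
        return fields;
    }

    /** Encodes the fields as application/x-www-form-urlencoded. */
    static String formBody(final Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    /** Builds a POST against the parent path using basic authentication. */
    static HttpRequest buildRequest(final String parentUrl, final String user,
            final String password, final String name, final String json) {
        final String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(parentUrl))
                .header("Authorization", "Basic " + credentials)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(importFields(name, json))))
                .build();
    }

    /** Sends the request and returns the HTTP status code. */
    static int send(final HttpRequest request) throws Exception {
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .statusCode();
    }

    public static void main(final String[] args) {
        // Prints the encoded body only; call send(buildRequest(...)) against a running instance.
        System.out.println(formBody(importFields("my-page", "{\"jcr:primaryType\":\"cq:Page\"}")));
    }
}
```

Keeping the request construction separate from send makes the encoding easy to inspect before anything touches the repository.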
Finding the Content Importer Service
While the Sling Post Servlet is well documented, its internal implementation is not. Luckily, it is open source. You won’t have to do any decompiling today! Let’s read the doPost function of the implementation. There are too many goodies we could dive into. Let’s stay focused. We are looking for the import operation. Did you find it?
You should have wound up at the doRun function of ImportOperation.java. This is where all those request parameters from the curl commands above come into play. Go further down. You will find a call to ContentImporter.importContent(Node, String, String, InputStream, ImportOptions, ContentImportListener). Can you find its implementation?
Finally, you should have wound up on the DefaultContentImporter.java implementation, an OSGi component that implements the ContentImporter interface.
Programmatically Using the Content Importer
Yes! Programmatically doing things. Now that we know that the ContentImporter is available as an OSGi component, all we need is:
@Reference private ContentImporter contentImporter;
And assuming you have your content as an InputStream, we can import the content under any node. As an example, I am using the SimpleServlet generated as part of the AEM Maven Archetype. I’m using Lombok to speed things up a little.
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import javax.jcr.Node;
import javax.jcr.RepositoryException;
import javax.servlet.Servlet;

import org.apache.commons.io.IOUtils;
import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.SlingHttpServletResponse;
import org.apache.sling.api.servlets.HttpConstants;
import org.apache.sling.api.servlets.SlingSafeMethodsServlet;
import org.apache.sling.jcr.contentloader.ContentImportListener;
import org.apache.sling.jcr.contentloader.ContentImporter;
import org.apache.sling.jcr.contentloader.ImportOptions;
import org.apache.sling.servlets.annotations.SlingServletResourceTypes;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;
import org.osgi.service.component.propertytypes.ServiceDescription;

import lombok.Builder;
import lombok.Getter;
import lombok.ToString;
import lombok.extern.slf4j.Slf4j;

@Component(service = { Servlet.class })
@SlingServletResourceTypes(
        resourceTypes = "mysite/components/page",
        methods = HttpConstants.METHOD_GET,
        extensions = "txt")
@ServiceDescription("Simple Demo Servlet")
@Slf4j
public class SimpleServlet extends SlingSafeMethodsServlet {

    private static final long serialVersionUID = 1L;

    @Reference
    private ContentImporter contentImporter;

    @Override
    protected void doGet(final SlingHttpServletRequest request, final SlingHttpServletResponse response)
            throws IOException {

        final MyContentImportListener contentImportListener = new MyContentImportListener();
        final Node node = request.getResource().adaptTo(Node.class);

        if (node != null) {
            // Equivalent to :replace=true and :replaceProperties=true in the curl command.
            final MyImportOptions importOptions = MyImportOptions.builder()
                    .overwrite(true)
                    .propertyOverwrite(true)
                    .build();

            try (InputStream inputStream = IOUtils.toInputStream("{\"foo\":\"bar\"}", StandardCharsets.UTF_8)) {
                this.contentImporter.importContent(node, "my-imported-structure", "application/json",
                        inputStream, importOptions, contentImportListener);
            } catch (final RepositoryException e) {
                log.error(e.getMessage(), e);
            }
        }

        // The listener's generated toString() reports every change the importer made.
        response.setContentType("text/plain");
        response.getWriter().println(contentImportListener);
    }

    // Lombok's @Getter generates the isXxx() methods that ImportOptions requires.
    @Builder
    @Getter
    private static final class MyImportOptions extends ImportOptions {

        private final boolean checkin;
        private final boolean autoCheckout;
        private final boolean overwrite;
        private final boolean propertyOverwrite;

        @Override
        public boolean isIgnoredImportProvider(final String extension) {
            return false;
        }
    }

    // Collects each callback from the importer, keyed by the callback name.
    @Getter
    @ToString
    private static final class MyContentImportListener implements ContentImportListener {

        private final com.google.common.collect.Multimap<String, String> changes =
                com.google.common.collect.ArrayListMultimap.create();

        @Override
        public void onReorder(final String orderedPath, final String beforeSibling) {
            this.changes.put("onReorder", String.format("%s, %s", orderedPath, beforeSibling));
        }

        @Override
        public void onMove(final String srcPath, final String destPath) {
            this.changes.put("onMove", String.format("%s, %s", srcPath, destPath));
        }

        @Override
        public void onModify(final String srcPath) {
            this.changes.put("onModify", srcPath);
        }

        @Override
        public void onDelete(final String srcPath) {
            this.changes.put("onDelete", srcPath);
        }

        @Override
        public void onCreate(final String srcPath) {
            this.changes.put("onCreate", srcPath);
        }

        @Override
        public void onCopy(final String srcPath, final String destPath) {
            this.changes.put("onCopy", String.format("%s, %s", srcPath, destPath));
        }

        @Override
        public void onCheckin(final String srcPath) {
            this.changes.put("onCheckin", srcPath);
        }

        @Override
        public void onCheckout(final String srcPath) {
            this.changes.put("onCheckout", srcPath);
        }
    }
}
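The listener above leans on Guava's Multimap to group changes by callback name. If you would rather not depend on Guava, the same bookkeeping can be sketched with a plain map of lists. The ChangeLog class and its record method below are hypothetical names, and the paths in main are made up; each ContentImportListener callback would simply delegate to record.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the Multimap: groups change details by event name. */
final class ChangeLog {

    // LinkedHashMap keeps events in the order they were first seen.
    private final Map<String, List<String>> changes = new LinkedHashMap<>();

    /** Appends a detail to the list for the given event, creating the list on first use. */
    void record(final String event, final String detail) {
        this.changes.computeIfAbsent(event, key -> new ArrayList<>()).add(detail);
    }

    Map<String, List<String>> getChanges() {
        return this.changes;
    }

    @Override
    public String toString() {
        return this.changes.toString();
    }

    public static void main(final String[] args) {
        final ChangeLog log = new ChangeLog();
        // Made-up paths; a real listener would pass along whatever the importer reports.
        log.record("onCreate", "/content/example/my-imported-structure");
        log.record("onModify", "/content/example/my-imported-structure/foo");
        System.out.println(log);
    }
}
```

Inside the listener, a callback such as onCreate would then become a one-line call like changeLog.record("onCreate", srcPath).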