How to Be Friends with a Search Engine

When a website is replatformed, its organic search ranking often dips for a short period after launch. Many customers don’t anticipate this and are concerned about the effect on their business. I have read that a normal traffic dip is between 3% and 5% when everything goes according to plan. If it is more than 5%, something was done incorrectly or something was missing, and often it is because SEO requirements were overlooked during the early stages of the replatforming.
I typically spot these gaps before our customers do and begin asking questions during the requirements gathering and acceptance criteria stages of a project. I ask these questions early because, in my experience, most SEO requirements can be retrofitted into the code base at a later stage; however, a few, such as URL structure, servlet paths, and API design, can prove costly to retrofit and should be considered upfront as we gather the requirements.
There are several places in an AEM implementation where the designer needs to be mindful of SEO paradigms, and it is important to articulate these clearly to everyone who touches the replatforming project, from the IT resources to the SEO manager and marketing team.
I have aggregated some of the major SEO requirements that often arise in a web replatforming project, from creating the right URL structure and rewriting URLs to sitemap management and redirects. This information can help any AEM replatforming project go more smoothly and give your marketing users insight into how to minimize the typical traffic dip.

URL STRUCTURE

The basic idea here is to make your URLs simple and easy to understand. Readability goes beyond the human reader and extends to Google and Bing. A good rule of thumb is that if a URL is easy for a person to read, it will be easy for a search engine. Below are five tech tips to follow when considering the URL design:

  • Use hyphens “-” to separate words.
  • Avoid underscores “_” between words.
  • Avoid query parameters, such as ?= or ?id=; use selectors instead if the response needs to be cached.
  • Make the URL structure keyword rich, including, for example, product family, subfamily, and model names.
  • URLs can follow the same hierarchy as the HTML sitemap. For example, the URL structure could look something like this: /printer/laser/M120
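
The tips above can be sketched as a small helper. This is only an illustration in Python, not AEM code: the function names slugify and build_url_path are hypothetical, and lowercasing every segment is my own assumption rather than a requirement stated above.

```python
import re
import unicodedata


def slugify(label: str) -> str:
    """Lowercase a label and join its words with hyphens."""
    # Strip accents so the slug stays plain ASCII.
    ascii_label = (
        unicodedata.normalize("NFKD", label)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters (spaces, underscores,
    # punctuation) into a single hyphen, then trim hyphens at the ends.
    return re.sub(r"[^a-z0-9]+", "-", ascii_label.lower()).strip("-")


def build_url_path(*segments: str) -> str:
    """Join slugified segments in sitemap order: family, subfamily, model."""
    return "/" + "/".join(slugify(segment) for segment in segments)


print(build_url_path("Printer", "Laser", "M120"))         # /printer/laser/m120
print(build_url_path("Ink_Jet Printers", "Photo Smart"))  # /ink-jet-printers/photo-smart
```

In an AEM project the same rule would more likely be applied when authors name pages, so that the node names themselves are already hyphenated and keyword rich.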

Based on the above requirements, especially if you are using Multi Site Manager (MSM), it is important that your information architecture is solid. I recommend a subdirectory over a subdomain, as the latter ranks lower in search engines than the former. Follow the MSM best practice and set up your structure similar to /sitename/locale/regions/L1/L2

URL REWRITING

AEM stores content internally under the /content/sitename/locale/… structure, but almost every time we get requirements to make URLs customer friendly. There are a couple of different ways of achieving rewrites, but Adobe recommends using the Sling Resource Resolver for rewriting outgoing URLs. On the flip side, Apache mod_rewrite can be used for mapping incoming URLs.
Additionally, AEM provides a vanity URL feature that lets authors offer alternate ways to request the same content. However, this fragments the SEO value of the page, so a canonical URL tag should be added to the page to avoid this issue.
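
To make the two directions concrete, here is a minimal Python sketch of the mapping idea. It is not the Sling or mod_rewrite API: the root /content/sitename is taken from the structure above, while the function names and the decision to hide the .html extension are assumptions made for illustration.

```python
INTERNAL_ROOT = "/content/sitename"  # repository root that should stay hidden
EXTENSION = ".html"                  # assumption: public URLs drop the extension


def to_public_url(internal_path: str) -> str:
    """Outgoing direction: shorten a repository path for use in rendered links."""
    if not internal_path.startswith(INTERNAL_ROOT + "/"):
        return internal_path  # not part of this site, leave it untouched
    public = internal_path[len(INTERNAL_ROOT):]
    if public.endswith(EXTENSION):
        public = public[: -len(EXTENSION)]
    return public


def to_internal_path(public_url: str) -> str:
    """Incoming direction: restore the root and extension before resolution."""
    return f"{INTERNAL_ROOT}{public_url}{EXTENSION}"


short = to_public_url("/content/sitename/en/printer/laser.html")
print(short)                    # /en/printer/laser
print(to_internal_path(short))  # /content/sitename/en/printer/laser.html
```

The point of the sketch is that the two functions must be exact inverses; if the outgoing and incoming rules drift apart, links on the page stop resolving.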

SITEMAP

In its simplest form, a sitemap is an XML file that lists the URLs of a site along with additional metadata about each URL. It tells search engines which pages to index and serve to searchers. There are multiple ways to achieve this in AEM. Below are a few possible solutions:

  1. Register a Sling servlet that listens for requests containing sitename.sitemap.xml and sitename.sitemapdam.xml. The servlet iterates through the current page and its children and outputs an XML rendition.
  2. Create a sitemap template and sitemap component that generate the sitemap.xml and sitemapdam.xml pages.
  3. Create sitemap.xml with a scheduled job that generates the list and publishes it, which invalidates the dispatcher cache.
  4. Use the ACS AEM Commons sitemap generator.

All of the above approaches generate an XML rendition that should be cached at the dispatcher. The location of the XML file should be referenced in the Sitemap property of the robots.txt file, or the sitemap XML can be submitted to search engines directly. Depending on your dispatcher configuration, a custom flush rule will need to be implemented so that this file is flushed whenever a new page is activated. The right solution depends on how big your site is and how frequently you need to generate the sitemap.
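
Whichever approach you choose, the output has the same shape. The following Python sketch shows that shape only; it is not an AEM servlet, and the build_sitemap function, the example.com URLs, and the dates are hypothetical. The namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages) -> str:
    """Return sitemap XML for an iterable of (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod is optional metadata; an ISO date such as 2016-06-30 is valid.
        ET.SubElement(entry, "lastmod").text = last_modified
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages; a servlet would walk the page tree instead.
print(build_sitemap([
    ("https://example.com/printer/laser", "2016-06-30"),
    ("https://example.com/printer/laser/m120", "2016-06-30"),
]))
```

In a servlet-based solution the list of pages would come from iterating over the current page and its children, as described in option 1.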

CANONICAL TAGS

Canonical tags are short snippets of code that typically sit in the head section of the HTML page. They signal to search engines which version of the content is the “original”, or the one that you wish to have appear in search results.
In short, canonical tags are used to reduce duplicate content in search results, allowing the pages that we want to rank to appear without competing with duplicate-content URLs. This matters especially when you have an MSM setup and pages get live copied from the source to other locales or regions. Each page should have a self-referencing canonical tag unless the page is known to be a duplicate of another page on the site. If the latter is the case, provide a page property to override the canonical URL so that it points to the source page.
Example where national pages get live copied to regional pages:
Default State: <link rel="canonical" href="https://domain-name/site/national/page">
Overridden State: <link rel="canonical" href="https://domain-name/site/$regional/page">
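
The rule described in this section (self-reference unless a page property overrides it) can be sketched in a few lines of Python. This is a minimal illustration, not an AEM component: the canonical_link function and its override_url parameter are hypothetical stand-ins for the page property.

```python
from html import escape
from typing import Optional


def canonical_link(page_url: str, override_url: Optional[str] = None) -> str:
    """Build the canonical <link> tag for a page.

    Falls back to a self-referencing canonical unless an override is set.
    """
    target = override_url or page_url
    # Escape the URL so characters such as & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(target, quote=True)}">'


# No override: the page points at itself.
print(canonical_link("https://domain-name/site/regional/page"))

# Override set by the author: the page points at the page it duplicates.
print(canonical_link(
    "https://domain-name/site/regional/page",
    override_url="https://domain-name/site/national/page",
))
```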

SEMANTIC AND SCHEMA MARKUP

Semantic HTML is the use of HTML markup to reinforce the semantics, or meaning, of the information in webpages rather than merely to define its presentation or look. Semantics relates to the study of words and their logic; in search, it improves accuracy by capturing a searcher’s intent through contextual meaning.
Using Schema.org markup can give search engines context about the information on a page and can render rich snippets in SERPs. There are various areas where semantic markup should be used to boost your search rank, such as when identifying a person, video, address, event, or social content (Open Graph, Twitter, etc.).
When designing markup and AEM components, the following items should be considered:

  • Social (Open Graph)
  • Shareability (Twitter Cards)
  • Accessibility (ADA compliance, page speed, etc.)
  • Crawlability (pages should be available with or without JS and CSS)
  • On-Page Elements (title, keywords, descriptions, H1 tags, etc.)
  • Image Optimization (alt tags, short and unique file names, etc.)
  • Video Optimization (captions, keywords, descriptions, etc.)
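
As one way to picture Schema.org markup, the Python sketch below emits a Product description as a JSON-LD script tag. Two assumptions to flag: the reference at the end of this post covers the microdata syntax, and JSON-LD is swapped in here only because it is easy to generate; the product_json_ld function and the sample values are hypothetical.

```python
import json


def product_json_ld(name: str, description: str, image_url: str) -> str:
    """Return a <script> tag describing a product with Schema.org vocabulary."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "image": image_url,
    }
    # Escape "</" so a value can never close the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


print(product_json_ld(
    "Laser Printer M120",                    # hypothetical product
    "A compact laser printer.",
    "https://example.com/images/m120.png",
))
```

In an AEM component the same properties would normally be read from the page or product node rather than passed in by hand.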

USING SLING SELECTORS

AEM is based on Apache Sling, an open source RESTful web framework that associates content nodes with resource types. It then applies script resolution principles to the requests coming into the system.
AEM provides us with two options when writing servlets, referred to as Sling servlets and bin servlets. It is important to keep in mind that requests containing query strings are generally not cacheable in the dispatcher, whereas requests containing selectors are fully cacheable. However, take care to whitelist selectors to avoid unwanted caching.
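
The difference between selectors and query strings is easier to see when a URL is taken apart. The Python sketch below is a simplified decomposition, not Sling's actual resolver: it ignores suffixes, and the ALLOWED_SELECTORS whitelist and both function names are hypothetical.

```python
from urllib.parse import urlsplit

ALLOWED_SELECTORS = {"sitemap", "print"}  # hypothetical whitelist


def decompose(url: str) -> dict:
    """Split a URL into resource path, selectors, extension, and query."""
    parts = urlsplit(url)
    directory, _, last_segment = parts.path.rpartition("/")
    # In "products.print.html" the name is "products", the extension is
    # "html", and everything in between is a selector.
    name, *rest = last_segment.split(".")
    return {
        "resource_path": f"{directory}/{name}",
        "selectors": rest[:-1],
        "extension": rest[-1] if rest else "",
        "query": parts.query,
    }


def is_cacheable(url: str) -> bool:
    """Treat a request as cacheable only without a query string and with
    whitelisted selectors."""
    info = decompose(url)
    if info["query"]:
        return False
    return all(selector in ALLOWED_SELECTORS for selector in info["selectors"])


print(is_cacheable("/content/site/en/products.print.html"))  # True
print(is_cacheable("/content/site/en/products.html?id=7"))   # False
print(is_cacheable("/content/site/en/products.foo.html"))    # False
```

Rejecting unknown selectors matters because any selector string yields a distinct cache file, so an open selector list lets arbitrary requests fill the cache.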

REDIRECTS

Redirects signal to search engines that a page has moved and instruct browsers to request the new page. There are many types of redirects; below are two of the most common and their uses:

  • 301 redirect – Signals to browsers and search engines that a page has permanently moved to another location. These redirects transfer link authority to the new page. They should typically be used for site migrations and URL structure changes.
  • 302 redirect – Signals to search engines and browsers that a page has temporarily moved to a new location. The original page keeps its link authority, which is not transferred to the new page. These redirects are typically used to signal that an e-commerce page has sold out but will return at a later date.
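
The two redirect types can be captured in a small lookup table. The Python sketch below is illustrative only: the paths in REDIRECTS and the resolve_redirect function are hypothetical, and in practice such rules usually live in the web server or dispatcher configuration.

```python
from http import HTTPStatus

# Hypothetical redirect rules: old path -> (status, new location).
REDIRECTS = {
    # Permanent move after a URL structure change.
    "/old-site/printers/laser-m120": (HTTPStatus.MOVED_PERMANENTLY, "/printer/laser/m120"),
    # Temporary move while a product is sold out.
    "/printer/laser/m130": (HTTPStatus.FOUND, "/printer/laser"),
}


def resolve_redirect(path: str):
    """Return (status_code, location) for a redirected path, else None."""
    rule = REDIRECTS.get(path)
    if rule is None:
        return None
    status, location = rule
    return int(status), location


print(resolve_redirect("/old-site/printers/laser-m120"))  # (301, '/printer/laser/m120')
print(resolve_redirect("/printer/laser/m130"))            # (302, '/printer/laser')
print(resolve_redirect("/printer/laser/m120"))            # None
```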

CONCLUSION

One goal for any replatforming project should be to minimize the impact that the new site has on traffic to your website. Following these tips is a good start. Having a partner that understands the impacts and can guide you through the technical specifics of mitigating them is critical to any replatforming project.
References:
SEO: https://moz.com/blog/15-seo-best-practices-for-structuring-urls
Multi Site Manager: https://docs.adobe.com/docs/en/aem/6-0/administer/sites/multi-site-manager.html
Adobe SEO Guideline: https://docs.adobe.com/docs/en/aem/6-1/manage/seo-and-url-management.html
Semantic Schema: http://schema.org/docs/gs.html#microdata_itemscope_itemtype
Google Sitemap: https://support.google.com/webmasters/answer/156184?hl=en
