
Revisiting Semantic Layers: Lessons from Their Rise and Return

For most of my data-focused career, I’ve been dealing with semantic layers in one way or another: either because the tool I was using to present data explicitly required one, or because the solution itself needed relationships defined on the data to make sense of it and keep it organized.

With the recent focus and hype around AI-infused solutions, there has been an increasing amount of chatter about semantic layers. What are they? What are they used for? Does my organization need one? And what do they have to do with AI?

What are semantic layers?

In its simplest form, a semantic layer is a collection of rules that define the relationships between different data concepts. For example, your organization may have the concepts of office locations and territories, where each office location belongs to one (and only one) territory. A semantic layer would contain the definition that a group of office locations constitutes a territory. Similarly, a Person may have a current address assigned to them. These definitions are typically established by the business and its operational practices; a typical business analyst in your organization would be able to articulate them.

The semantic layer bridges the gap between how the data is stored and how it’s used by the business.
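To make this concrete, here is a minimal sketch of how such business rules might be captured in code. Everything here (the table names, the office/territory entities, the `total_sales` metric) is a hypothetical illustration, not the schema of any particular product:

```python
# A minimal sketch of a semantic layer as plain data structures.
# All names (dim_office, total_sales, etc.) are invented for illustration.

# Business entities mapped to their physical storage.
ENTITIES = {
    "office": {"table": "dim_office", "key": "office_id"},
    "territory": {"table": "dim_territory", "key": "territory_id"},
}

# Business rules: each office belongs to exactly one territory.
RELATIONSHIPS = [
    {"from": "office", "to": "territory", "cardinality": "many-to-one",
     "join_on": ("territory_id", "territory_id")},
]

# A governed metric, defined once and reused by every consumer.
METRICS = {
    "total_sales": {"expression": "SUM(fact_sales.amount)", "format": "currency"},
}

def describe(term: str) -> str:
    """Resolve a business term to its physical definition."""
    if term in ENTITIES:
        e = ENTITIES[term]
        return f"{term} -> table {e['table']} (key {e['key']})"
    if term in METRICS:
        return f"{term} -> {METRICS[term]['expression']}"
    raise KeyError(f"unknown business term: {term}")
```

The point is not the implementation but the separation of concerns: the business defines the terms and rules once, and every downstream consumer resolves them the same way.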

History of semantic layers

Pre 2000s

1970s-1980s: As relational databases were being conceptualized, there was a need to create high-level, business-oriented views. These sometimes included business logic in the form of rollups, simple aggregations, and other functions. These concepts laid the groundwork for modern-day semantic layers.

1980s-1990s: Data warehousing became common, with the primary purpose of supporting analytical processing for business use. We saw the rise of Ralph Kimball’s dimensional modeling approach (still very much relevant today), which put business needs at the center of how data tables in a warehouse relate to one another.

Additionally, we saw the invention of Online Analytical Processing (OLAP) cubes. These took data warehousing a step further: multi-dimensional “cubes” allowed data to be accessed at any intersection of the dimensions it was related to. Picture a 3-dimensional cube that hosts transaction data, with the axes being Time, Cashier, and Product, and the measure at each intersection being Sales. Any point in the cube holds the sales for one particular combination of the dimension values.
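The cube intuition above can be sketched in a few lines of code. This is a toy model, not how an OLAP engine is implemented (real engines precompute and index aggregates), and the sales figures are made up; but it shows the core idea of addressing data by dimension coordinates and rolling up along an axis:

```python
from collections import defaultdict

# A toy 3-dimensional "cube": sales keyed by (time, cashier, product).
# All values are invented for illustration.
sales = {
    ("2024-Q1", "Ana", "Widget"): 120.0,
    ("2024-Q1", "Ana", "Gadget"): 80.0,
    ("2024-Q1", "Bob", "Widget"): 50.0,
    ("2024-Q2", "Ana", "Widget"): 200.0,
}

def rollup(cube, axis):
    """Aggregate along one axis (0=time, 1=cashier, 2=product)."""
    totals = defaultdict(float)
    for coords, amount in cube.items():
        # Drop the rolled-up dimension; sum what remains.
        key = tuple(c for i, c in enumerate(coords) if i != axis)
        totals[key] += amount
    return dict(totals)

# Total sales per (time, product), summed over all cashiers:
by_time_product = rollup(sales, axis=1)
```

An OLAP cube is essentially this lookup structure with every useful rollup already materialized, which is what made slicing and dicing fast for business users.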

Prior to 2000, accessing data still required a high level of technical skill in addition to understanding how the business would use the data in order to solve problems or perform day-to-day operations.

Early 2000s: The Rise of Semantic Layers

The early 2000s saw a significant increase in the popularity of semantic layers. This was primarily driven by the adoption of business intelligence tools. Companies like Business Objects, Cognos, Hyperion, and MicroStrategy all had their own semantic layers. The aim was to make it easier for business users to access data.

Business Intelligence tools utilize their own semantic layers to provide:

  • Consistency and governance
  • Performance optimization
    • Caching and precalculated aggregates
    • Some tools included their own in-memory layer for storing aggregated data for fast retrieval
  • Dashboards and reporting
    • Users could build their own reports and dashboards without involving IT by leveraging business-friendly entities, without worrying about the underlying data structures.

The Fall of Semantic Layers

As BI tools (and semantic layers) gained popularity, a new professional emerged: the Business Intelligence Professional. These were highly analytical individuals who sat between IT and business users, translating business requirements into IT requirements. Additionally, they were able to create semantic models and configure various business intelligence platforms to extract the necessary business value from the stored data.

As business intelligence tools became more monolithic and harder to maintain, we started to see the emergence of departmental business intelligence tools. The most notable example is Tableau. 

In 2005, Tableau launched globally with the promise of “eliminating IT” from business intelligence. Users could connect directly to databases, spreadsheets, and other data sources, removing the need for an organization’s IT staff to provide connectivity or curate the data.

Because business users could so easily connect to data and manipulate it, there was no “single version of the truth,” no governance over the data being consumed, and certainly no centralized semantic layer housing the enterprise’s business rules. Instead, each business user or department had their own view (and presentation) of the data. The time from needing data to be presented or reported on to actually getting it was reduced dramatically. It was during this time that enterprise-wide semantic layers fell out of favor.

In parallel, many business rules became increasingly incorporated into ETL and ELT processes, allowing some of the semantics to be precalculated before reaching the business intelligence layer. This had many drawbacks that were not apparent to the typical data engineer but were very apparent (and important) to business intelligence professionals.

The rise (again) of Semantic Layers

As time passed, we began to see business executives, business operators, and other data consumers questioning the veracity of the data. Since there was no centralized location, there was no central owner of the data. This is when the industry started seeing the creation of the Chief Data Officer role, which, among other things, typically carries responsibility for data governance.

For some years, the battle between centralized BI and departmental BI continued. Agility versus uniformity constantly fueled arguments, and as companies started to force centralized BI, shadow IT groups began popping up within organizations. You can likely see this in your own organization, where departments run part of their operations in Excel because they lack access to proper data.

We also saw the popularity of Analytics Centers of Excellence increasing. They took care of data governance and the single version of the truth. The greatest tool at their disposal was the mighty semantic layer.

Enter Gen AI

No doubt, generative AI has taken the world by storm. Everyone is trying to make sense of it: how do I use it? Am I doing it correctly? What do I not know? One thing is certain: for Gen AI to work properly, it needs to understand how users utilize the data and what it means to them. This is accomplished by semantic layers. This little concept that has been sticking with us for decades is suddenly even more important than it was in the past.

There is a current push for smaller, purpose-built LLMs, which will only increase the importance of semantic layers in feeding the necessary metadata to the applications that use them.

What’s going on right now?

Currently, we are seeing an increasing number of semantic-layer-only tools that are decoupled from business intelligence platforms. Companies like AtScale, Denodo, and Dremio promise to host the business rules and apply them to queries issued by business intelligence and visualization tools, acting as a broker between those tools and the underlying data. In theory, this has the great benefit of letting many tools share the same semantics, with each user working in their favorite tool of choice, whether that is a command-line SQL interface, a REST API call, or a visualization tool like Tableau. Additionally, companies like Tableau, which previously lacked semantic layers, are now incorporating semantic layer capabilities into their suite of tools. Others, such as Strategy (formerly MicroStrategy), are decoupling their powerful semantic layers from their BI suites to offer them as standalone products.
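The broker pattern described above can be sketched as a small query compiler. This is a hypothetical illustration of the idea, not any vendor’s actual implementation; the table names, join paths, and the `total_sales` metric are all invented:

```python
# A hypothetical sketch of a standalone semantic layer acting as a broker:
# it rewrites business terms into SQL before the query reaches the database.
# All table, column, and metric names are invented for illustration.

SEMANTIC_MODEL = {
    "metrics": {"total_sales": "SUM(f.amount)"},
    "dimensions": {"territory": "t.territory_name"},
    "joins": (
        "fact_sales f "
        "JOIN dim_office o ON f.office_id = o.office_id "
        "JOIN dim_territory t ON o.territory_id = t.territory_id"
    ),
}

def compile_query(metric: str, by: str) -> str:
    """Translate a business question into SQL using the shared model."""
    m = SEMANTIC_MODEL["metrics"][metric]
    d = SEMANTIC_MODEL["dimensions"][by]
    return (f"SELECT {d}, {m} AS {metric} "
            f"FROM {SEMANTIC_MODEL['joins']} "
            f"GROUP BY {d}")

# Any client (BI tool, REST caller, SQL shell) asks in business terms:
sql = compile_query("total_sales", by="territory")
```

Because every client goes through the same model, “total sales by territory” means exactly the same thing in Tableau, in a REST call, or at a SQL prompt, which is the single-version-of-the-truth promise these tools are making.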

Does my organization need one?

By now, you probably already have an idea of the answer. If you want your organization to succeed in its quest to leverage AI properly and derive real business insight from it, you should think about what is telling AI how your business operates and how its data is organized.

What do I do now?

Contact Perficient for a conversation around how we can help your organization leverage analytical tools (including artificial intelligence) properly through our experience with semantic models.


Roberto Trevino

Roberto is an engineer (M.S.) with a passion for automation, data-driven solutions, and technology in general. He is almost always talking about something new in the technology world and enjoys learning about it, and he's done some pretty interesting work, both personal and professional. Currently, he develops solutions for all sorts of clients at Perficient, Inc. His strongest competency is Analytics and Business Intelligence, but he has experience designing a plenitude of solutions and integrations ranging from front-end and client-facing to back-end, transactional, and server-side.
