There is a proliferation of acronyms with the Ops suffix for the software architect to choose from. It’s reasonable to question whether all of them are necessary. All of these are, at their core, targeted expressions of foundational business management methodology. The end goal is continuous improvement in some business-critical metric. Ops has its roots in statistical process control.
Statistical Process Control
Statistical Process Control involves bringing a process under statistical control to identify special causes of variation.
- Identify target processes critical to the business
- Determine measurable attributes of the target process
- Determine the measurement method and its repeatability and reproducibility
- Develop a repeatable plan to sample data
- Establish upper and lower bounds
- Monitor process variation that exceeds these bounds
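The last two steps can be sketched in a few lines of Python. This is only an illustration, assuming bounds set at the baseline mean plus or minus three sample standard deviations (a common convention, not something the steps above prescribe). The function names and the measurements are hypothetical.

```python
from statistics import mean, stdev

def control_limits(baseline, sigmas=3.0):
    """Return (lower, upper) bounds computed from in-control baseline samples."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation of the baseline
    return center - sigmas * spread, center + sigmas * spread

def out_of_control(samples, lower, upper):
    """Return (index, value) pairs for samples that fall outside the bounds."""
    return [(i, x) for i, x in enumerate(samples) if x < lower or x > upper]

# Hypothetical measurements, e.g. nightly build duration in minutes.
baseline = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 9.7, 10.0]
lower, upper = control_limits(baseline)

# New samples collected under the same sampling plan.
new_samples = [10.1, 9.9, 12.5, 10.0]
flagged = out_of_control(new_samples, lower, upper)
print(flagged)
```

Anything in `flagged` is a candidate special cause worth investigating. Points inside the bounds are treated as ordinary process variation.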
The first step of SPC has less rigor than the steps that follow. The Theory of Constraints focuses on the importance of target identification.
Theory of Constraints
The Theory of Constraints proposes that every system has at least one constraint that serves as a barrier to higher performance.
“Any improvements made anywhere besides the bottleneck are an illusion.”
The Theory of Constraints proposes the Five Focusing Steps as a process for breaking the constraint:
- Identify the constraint
- Exploit the constraint
- Subordinate everything to the constraint
- Elevate the constraint
- Don’t let inertia become the constraint
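A toy model makes the bottleneck quote concrete. The sketch below assumes a pipeline of stages running in series, so the slowest stage caps the whole system. The stage names and capacities are hypothetical.

```python
def find_constraint(capacities):
    """Return the name of the stage with the lowest capacity."""
    return min(capacities, key=capacities.get)

def system_throughput(capacities):
    """Stages run in series, so the slowest stage caps the whole system."""
    return min(capacities.values())

# Hypothetical delivery pipeline, capacity in work items per day.
stages = {"develop": 12, "review": 5, "test": 8, "deploy": 20}

constraint = find_constraint(stages)
before = system_throughput(stages)

# Improving a stage that is not the constraint leaves throughput unchanged.
after_non_constraint = system_throughput({**stages, "deploy": 40})

# Elevating the constraint raises throughput, and the constraint moves.
elevated = {**stages, "review": 10}
after_elevate = system_throughput(elevated)
new_constraint = find_constraint(elevated)

print(constraint, before, after_non_constraint, after_elevate, new_constraint)
```

Doubling deploy capacity changes nothing, while doubling review capacity lifts throughput until testing becomes the new constraint. That shift is why the fifth step sends you back to the first.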
Manufacturers found that small lot sizes tended to lower latency, reduce errors, and raise overall system throughput. This is sometimes known as Lean manufacturing. Agile software development proposes that short development iterations yield similar benefits in software development.
Agile Project Management
Agile project management is a project framework that takes an iterative approach towards completing a project. The Agile Manifesto proposes four core values:
- Individuals and interactions over processes and tools
- Working software over comprehensive documentation
- Customer collaboration over contract negotiation
- Responding to change over following a plan
Agile is a development methodology designed to maintain productivity and drive consistent deliverables within a short timeframe, which leaves room for changing priorities. Ops is a culture of the development and maintenance of software.
Ops is a culture in which all aspects of the software development lifecycle work together to increase efficiency through automation and programmable processes. If Agile’s goal is to reduce the length of time needed to deliver a unit of business value, then the goal of Ops is to optimize the work-in-progress pipeline. There are two major manifestations of Ops: DevOps and DataOps.
DevOps focuses on continuous integration and continuous delivery of software by using an infrastructure-as-code model allowing for the automation of the integration, testing and delivery of code. If Agile is famous for delivering units of business value in weeks instead of months, DevOps has been known to reduce software release cycle time from months to seconds.
- DevSecOps – integrate security into development and operations
- GitOps – infrastructure automation (typically Kubernetes) where Git is the single source of truth
- AIOps – enhancing traditional DevOps with AI and machine learning to manage large scale environments and data
- CloudOps – migration of DevOps to the cloud
- NoOps – automate IT infrastructure to negate the need for an in-house operations team
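The GitOps idea of Git as the single source of truth can be illustrated with a reconciliation step: compare the desired state declared in the repository with what is actually running, then plan the changes needed to converge. This is a simplified sketch, not how any particular GitOps tool is implemented. The `plan_changes` function, the service names, and the dictionary layout are all hypothetical.

```python
def plan_changes(desired, live):
    """Compare desired state (as declared in Git) with live state and list actions."""
    actions = []
    for name, spec in desired.items():
        if name not in live:
            actions.append(("create", name))
        elif live[name] != spec:
            actions.append(("update", name))
    for name in live:
        if name not in desired:
            actions.append(("delete", name))  # running, but not declared in Git
    return actions

# Hypothetical desired state, as it might be parsed from files in a Git repository.
desired = {
    "web": {"image": "web:1.4", "replicas": 3},
    "worker": {"image": "worker:2.0", "replicas": 2},
}

# Hypothetical state reported by the running environment.
live = {
    "web": {"image": "web:1.3", "replicas": 3},
    "cron": {"image": "cron:0.9", "replicas": 1},
}

actions = plan_changes(desired, live)
print(actions)
```

Because the plan is derived entirely from the repository, a rollback is just a Git revert followed by another reconciliation.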
DataOps is not very similar to DevOps, so it’s better to consider their similarities the result of a common root. DataOps came after DevOps, but it is not a variation of DevOps. DataOps orchestrates, monitors, and manages the data factory with the goal of improving data quality and accessibility.
- DataGovOps – data catalogs, data lineage, data quality, security and roles and responsibilities as code
- ETLOps – largely vaporware. Do ELT instead of ETL and combine DevOps and DataGovOps.
- ModelOps – automate the monitoring, training, deployment, and governance of machine learning models
- AnalyticOps – ensure operational safety of deployed analytics
Personally, I would consider DevSecOps to be the foundation from which I build out enterprise applications, since security is notoriously difficult to fix right before production. Shifting left on security may slow down turnaround time in some instances, but that debt was always there. Pay as you go. If you are architecting a modern platform, containerization and the cloud should be at the core of the design, which brings GitOps into play. If everything is code, Git should be at the center of everything. For the most part, the only practical route I see for most of the larger-scale environments I work with is to leverage AIOps where it makes sense as we move to NoOps. When you consider that most IT departments use 70% of their annual budget to keep the lights on, shrinking the technical rent whenever and wherever possible is the real key to being able to respond to new challenges and priorities.
Pretty much every organization needs to start seriously working on a DataGovOps solution, because your current solution is either expensive and not working or non-existent. This is also where you will likely use AIOps: you need to be incorporating machine learning into your data quality checks. You have PII stored in comment fields in your customer service database.
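To make the comment-field problem concrete, here is a minimal rule-based check expressed as code. It uses regular expressions rather than machine learning, so treat it as a baseline that a trained classifier could later replace behind the same function. The patterns are deliberately simplified, and the table rows are hypothetical.

```python
import re

# Simplified, illustrative patterns; real PII detection needs far more coverage.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def find_pii(text):
    """Return the sorted names of the PII patterns that match the text."""
    return sorted(name for name, pattern in PII_PATTERNS.items() if pattern.search(text))

# Hypothetical rows from a customer service table: (ticket_id, comment).
rows = [
    (1, "Customer asked about the refund policy."),
    (2, "Call back at jane.doe@example.com after 5pm."),
    (3, "Verified identity with SSN 123-45-6789."),
]

# Collect only the tickets whose comment matched at least one pattern.
violations = {}
for ticket_id, comment in rows:
    hits = find_pii(comment)
    if hits:
        violations[ticket_id] = hits

print(violations)
```

Run as a scheduled check, a non-empty `violations` result is the kind of signal a data quality pipeline would raise for review.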
Machine learning models are likely to come under the same broad regulatory scrutiny as operational databases. In some industries, this is already the case. ModelOps deals directly with the automation of the machine learning pipeline. Even before ModelOps, you need to start with AnalyticOps. Unlike the other roles, the AnalyticOps engineer answers to the business leads. The responsibility of taking the raw data and analytics and deploying them to the production system falls to the AnalyticOps engineer. In practice, this may require rewriting R or Python code for Spark, or converting pandas code to Koalas, so that a local model can be deployed at scale. There are tools for ModelOps, but AnalyticOps relies solely on people and processes.