The Operations Coordinator plays a key role in DevSecOps. In my previous post, DevSecOps and Release Coordination, I introduced four key responsibilities in the DevSecOps-mediated release management process. The idea is to consolidate the validation and approval steps of a “gated” process involving many approvers, and to shift the actual work of validation earlier in development. To illustrate these roles, I have been using the technique of “personas”. So far we have met the Release Coordinator and the Security Architect. In this post, I introduce the next critical role in this process: the DevSecOps Operations Coordinator.
So, if you will, please meet Sandy, your Operations Coordinator.
Operations Coordinator Persona
As the Operations Coordinator, I ensure that all production applications are properly deployed, supported, and monitored. I also ensure my team is prepared to immediately respond to any outages or incidents.
My team and I are responsible for the applications running in production environments. We ensure that every application has proper system monitors (CPU, network utilization, storage, process memory, etc.) so that potential issues are detected early, before they result in an incident or outage. Moreover, I coordinate with the Release Coordinator, Product Owner, and Security Architect to ensure that all scheduled application releases have been properly documented for support and troubleshooting.
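To make Sandy's idea of early detection a little more concrete, here is a minimal sketch of a two-level threshold check. The metric names follow the monitors listed above, but the `Threshold` class, the `classify` and `evaluate` helpers, and every numeric limit are hypothetical and chosen only for illustration; real values depend on the application and the monitoring platform in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    warn: float      # percent at which an early warning is raised
    critical: float  # percent at which an incident is likely


# Hypothetical limits, for illustration only.
THRESHOLDS = {
    "cpu": Threshold(warn=70.0, critical=90.0),
    "network": Threshold(warn=70.0, critical=90.0),
    "storage": Threshold(warn=80.0, critical=95.0),
    "process_memory": Threshold(warn=75.0, critical=90.0),
}


def classify(metric: str, value: float) -> str:
    """Return 'ok', 'warning', or 'critical' for one utilization reading."""
    limits = THRESHOLDS[metric]
    if value >= limits.critical:
        return "critical"
    if value >= limits.warn:
        return "warning"
    return "ok"


def evaluate(sample: dict[str, float]) -> dict[str, str]:
    """Classify every known metric in a sample; unknown metrics are skipped."""
    return {m: classify(m, v) for m, v in sample.items() if m in THRESHOLDS}
```

The "warning" band is what gives the operations team lead time: a reading that crosses it prompts investigation while the application is still healthy.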
As discussed previously, there are four required key system release readiness states (Figure 2). These four product statuses give the release team confidence that the software candidate has met all defined standards for production release. The team consists of the Product Owner, Operations Coordinator, Security Architect, and Release Coordinator. This team is the sole decider of what is, and is not, released to production. The decision is influenced by product quality, compliance, and the organization’s readiness to support the release. Each member of the release team is responsible for one of these states. The Operations Coordinator ensures that both the target environment and the operations group are ready to accept responsibility for production support. During regularly scheduled release meetings, the team reviews each scheduled product release and captures the four readiness states. A product release moves forward only when all four readiness states are approved.
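The all-or-nothing approval rule can be sketched in a few lines. This is only an illustration: it tracks one readiness state per release team member, and the `Role` enum and the `pending_roles` and `can_release` helpers are hypothetical names, not part of any particular release tool.

```python
from enum import Enum


class Role(Enum):
    PRODUCT_OWNER = "Product Owner"
    OPERATIONS_COORDINATOR = "Operations Coordinator"
    SECURITY_ARCHITECT = "Security Architect"
    RELEASE_COORDINATOR = "Release Coordinator"


def pending_roles(approvals: dict[Role, bool]) -> list[Role]:
    """Roles that have not approved; a missing entry counts as not approved."""
    return [role for role in Role if not approvals.get(role, False)]


def can_release(approvals: dict[Role, bool]) -> bool:
    """A release moves forward only when every readiness state is approved."""
    return not pending_roles(approvals)
```

Treating a missing entry as "not approved" is a deliberate choice in this sketch: a release cannot slip through simply because one role never recorded a status.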
Tool Use and Workflow Responsibilities
The Operations Coordinator has overall responsibility for ensuring that production environments are available, performant, and secure. As part of the release coordination team, this translates to ensuring that all of the necessary software support artifacts are correct, concise, and complete. While the Operations Coordinator is not responsible for creating the Run Book, they are responsible for reviewing its contents. Moreover, if the Release Plan changes (say, by the addition of a new data source that must be connected), or additional environment elements are required (e.g. servers, load balancers, etc.), then it is the Operations Coordinator who creates the necessary change tickets. Finally, the Operations Coordinator reviews any changes to monitoring, logging, and auditing tools.
In short, the Operations Coordinator’s responsibilities include the following:
- Review of all release-related documents (Run Book, Release Plan, Release Notes, etc.)
- Coordination with the Infrastructure team on necessary changes to production systems
- Update to the Incident Response Plan
- Monitoring of the release deployment automation for successful completion
- Verification of system logs and audit monitors post-deployment
There are several key artifacts that the Operations Coordinator uses, tracks, or otherwise manages:
- Run Book – A description of error conditions, application start/stop procedures, configuration settings, security features, and a repair and recovery guide. Typically, this is provided by the development team.
- Troubleshooting Guide – A set of frequently asked questions for a specific application or system. This is used by the call center in support of customer inquiries or other application issues.
- Release Plan – A detailed set of steps followed to ensure a proper application/system deployment into the production environment. This is also typically created by the development team in partnership with operations.
- Incident Response Plan – A set of procedures to be followed in the event of an outage, incident, or other service interruption. The plan should include contact points for support, notification, and coordination. An incident response team (IRT) will execute the appropriate plan steps to return to normal operations and support root cause investigation.
The Operations Coordinator’s role in release coordination focuses primarily on the deployment process and subsequent production support. This includes ensuring that all supporting documentation is up to date. Ongoing production monitoring confirms that the newly deployed system continues to function as expected. In this way, the Operations Coordinator helps the organization as a whole maintain business continuity and system availability.