Benjamin Lieberman, Author at Perficient Blogs
https://blogs.perficient.com/author/blieberman/

An Automated DevSecOps Framework
https://blogs.perficient.com/2022/12/05/an-automated-devsecops-framework/
Mon, 05 Dec 2022

Automation for Automation – An Executable Framework for DevSecOps

In an age where automated Continuous Integration and Continuous Delivery (CI/CD) is becoming more and more critical to the success of any organization, why are we still building our DevSecOps environments by hand?  Instead, why don't we leverage automation for our automation?  DevSecOps support teams frequently face multiple challenges, and using an automated DevSecOps framework directly aids in meeting the expectations of the organization and its development teams.

Commonly heard DevSecOps team frustrations:

  • We need to work with several corporate groups (network, infrastructure, security, etc.) to modify our automation tool ecosystem
  • We don’t own our environments and need to get multiple approvals to make improvements/updates
  • Installation and integration of any new tooling takes a very long time, including updates to all impacted development pipelines
  • On-boarding new teams to pipelines is very time-consuming and requires significant manual pipeline configuration
  • Pipeline and tool integration governance is poor across the organization, making support by our tiny team all but impossible

Development teams too are held back by the very automation that is intended to make their lives easier.  The issues include pipeline failures, numerous approvals for deployments, and difficulties in getting feedback on security vulnerability information from CI/CD build results.

Commonly heard Development team frustrations:

  • We just want our automation to work – we don’t have time to figure out pipeline failures

How then can we best resolve these issues?  Here we introduce a new approach to DevSecOps automation: the CoStar Automation Framework.


The CoStar Framework focuses on three key benefits: building the DevSecOps automation environment, delivering software into production with greater consistency, and maintaining the overall platform across the organization.

Build

The first benefit is the creation of system build automation (aka Continuous Integration).  The intent is to establish a fully integrated and functional DevSecOps environment in minutes rather than months through the use of now-standard Infrastructure as Code (IaC) techniques.  This approach relies on automated environment creation and configuration, and depends on cloud-hosted resources.

DevSecOps as Code – Rapid automated environment resource creation and integration

As shown in the figure, the CoStar Framework architecture is implemented through the use of a cloud-resource manager (e.g. Terraform), a resource configurator (e.g. Ansible), and a workflow management tool (e.g. Azure DevOps).  Each of these tools is leveraged where it is best suited: the cloud-resource manager allows for cloud-agnostic deployment of the initial infrastructure, which in this example is the creation of a set of Virtual Machines (VMs) or Kubernetes clusters that host Docker-containerized DevSecOps tools.  The resource configurator is then employed to configure each tool, for example to generate integration security tokens or register a client application.  Finally, a workflow manager ties together and orchestrates the overall execution of the framework instantiation.
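As one sketch of how these tools could be stitched together, consider the following Azure DevOps pipeline that provisions infrastructure with Terraform and then configures the tooling with Ansible. All paths, variable-group names, and playbook names here are illustrative assumptions, not the actual CoStar implementation:

```yaml
# Hypothetical orchestration pipeline: provision infrastructure, then configure tooling.
trigger: none            # run on demand rather than on commit

pool:
  vmImage: ubuntu-latest

variables:
  - group: costar-framework-secrets   # assumed variable group holding cloud credentials

stages:
  - stage: Provision
    jobs:
      - job: terraform_apply
        steps:
          - script: |
              terraform init
              terraform apply -auto-approve
            workingDirectory: infrastructure/     # assumed Terraform module location
            displayName: Create VMs / clusters that host the DevSecOps tools

  - stage: Configure
    dependsOn: Provision
    jobs:
      - job: ansible_configure
        steps:
          # Assumed playbook: generates tokens, registers client applications, etc.
          - script: ansible-playbook -i inventory/hosts.yml configure_tools.yml
            displayName: Configure each tool after provisioning
```

Separating the stages this way is what allows the configuration playbooks to be re-run later (for example, for token rotation) without re-provisioning the infrastructure.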

Figure 1. CoStar DevSecOps Automation Framework

An example of one way to integrate these activities is shown in Figure 2.  The automation pipelines are separated to allow them to be run independently. After the initial build step is completed the token generator can be re-run to perform token rotation, or the configuration playbooks can be run separately to register new applications into the framework.


Figure 2. CoStar pipeline orchestration

Another benefit of this approach is the ability to treat the DevSecOps environment as a product rather than a platform.  This enables creation of a DEV/TEST area to explore new DevSecOps tools, improved pipeline orchestration, or test updates to supported tooling without impacting production pipelines.  By the very nature of the framework, both the lower and upper environments are built identically which simplifies testing and approval of changes prior to roll-out across the organization.

Generate Pipelines from Templates – Rapid pipeline establishment and execution

Pipeline configuration for on-boarding new teams to a DevSecOps platform can be complex and require a great deal of manual editing of pipeline definition files (e.g. YAML).  To simplify configuration and improve pipeline consistency, the CoStar Framework uses a set of configurable template pipelines.  These pipelines are pre-configured with connectivity and in-line tasks to integrate with the supported DevSecOps tooling.  For example, for SonarQube connectivity the pipelines re-use the same access tokens generated during initial configuration as part of the CoStar system build-out.  This allows a development team to on-board to the platform with a few initial product-differentiating values (e.g. product name, unique product key, organization) and access to the source code to be built.  The CoStar configuration Ansible playbooks update the integrated tools with these values, including generation of the initial CI and CD pipelines.  The development team is then ready to use the build system and respond to any discovered issues/vulnerabilities.
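A minimal sketch of such a template pipeline, assuming the SonarQube Azure DevOps extension is installed and a service connection was created during the framework build-out (parameter and connection names below are hypothetical):

```yaml
# Hypothetical pipeline template; the parameters are the product-differentiating values.
parameters:
  - name: productName
    type: string
  - name: productKey      # unique product key, re-used as the SonarQube project key
    type: string
  - name: organization
    type: string

steps:
  - checkout: self
  - task: SonarQubePrepare@5        # requires the SonarQube extension and a
    inputs:                         # service connection created during build-out
      SonarQube: costar-sonarqube
      scannerMode: CLI
      configMode: manual
      cliProjectKey: ${{ parameters.productKey }}
      cliProjectName: ${{ parameters.productName }}
  - script: ./build.sh              # assumed team build entry point
    displayName: Build ${{ parameters.productName }}
  - task: SonarQubeAnalyze@5
```

A team's generated pipeline then only needs to reference this template and supply the three parameter values.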

Consolidated Dashboard – Integrated observability into all pipeline quality measures and security findings

The next, and perhaps most useful, aspect of the CoStar Framework is integrated observability into all pipeline execution results.  This permits teams to respond to the majority of findings without needing to access additional tooling interfaces.  Moreover, if more specific information is needed, a direct access link is provided as part of the detail drill-down.  This is a key advantage: development teams are not required to navigate a separate interface for each tool and can instead focus on correcting discovered issues.

Deliver

The only true measure of value in software development is the secure release of capability into production.  The CoStar Framework is designed to provide continuous delivery (CD) through automation of key workflows and readiness states (for a more detailed treatment of release coordination please refer to the Perficient blog series on Release Coordination).  The combination of governed workflow automation, infrastructure-as-code target environment management, and built-in tracking metrics allows the CoStar Framework to scale across the enterprise.

Workflow Automation – Secure SDLC definition, governance, and execution

A critical part of repeatable, reliable product release is a set of well-governed automated workflows.  The intent of this automation is to allow "presumptive release", where every system candidate release is assumed ready for production.  The consequence of this approach is the need to unambiguously define what is required to go into production with confidence (i.e. "readiness states").  As part of the framework, once a production candidate release is identified it is automatically placed onto the production release schedule.  By analogy, the candidate release is like a rocket on the pad: all of the preparation steps are automated on the way to launch; only intervention by the launch director will prevent lift-off.
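To make the idea concrete, a readiness-state definition might look like the following. The schema and state names are purely illustrative inventions, not the actual CoStar format; the point is that each gate is explicit and machine-checkable:

```yaml
# Purely illustrative readiness-state definition; schema and names are invented here.
release_candidate:
  readiness_states:
    - name: quality_gate_passed       # e.g. SonarQube quality gate is green
      required: true
    - name: security_scan_clean       # no open critical/high vulnerabilities
      required: true
    - name: change_record_approved    # governed workflow sign-off
      required: true
  on_all_ready: schedule_production_release   # the rocket proceeds to launch
  on_failure: notify_release_coordinator      # only intervention stops lift-off
```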

Figure: DevSecOps Release Coordination

In the CoStar framework the four critical process areas (Intake, Construction, Release, Maintenance) are designed to automate as much of the release coordination activities as possible.  For steps that absolutely require manual intervention, the framework provides task management and automated notifications.

Metrics and Measures – DevSecOps historical metric gathering and reporting

A process is only as good as the feedback that allows it to grow.  Metrics and measures in DevSecOps are often difficult to collect due to the large number of integrated systems typically found in automated CI/CD pipelines.  In the CoStar framework the metrics are collected as a consequence of build results being captured and summarized in a common “single-pane” dashboard.  By analyzing the history of build results over time, each system development team will be able to track their improvements in release consistency – fewer uncaught issues/vulnerabilities.

Continuous Delivery – Configuration control and automated target environment compliance

The final step in releasing a system candidate to production is automating the configuration of the target environments.  Given the size and complexity of modern software deployments, this is a task best left to purpose-built release management and deployment tools (e.g. CloudBees CD/RO, UrbanCode, Octopus).  The CoStar framework creates and manages a set of "deployable units" that can be targeted to specific deployment/release target environments.  In a future version of the CoStar Framework, continuous compliance will verify that deployment target environments meet pre-defined organizational security policies, such as operating system hardening or network whitelisting.

Maintain

Cloud Resource Management – Update or replace any automation tooling with direct redeployment

The architecture of the CoStar framework is based on modularity; the tooling is built into independent containers and deployed separately.  This allows for update/replacement of any given set of tools with minimal disruption to the remaining systems.  The automated playbooks are built to allow for on-the-fly configuration of running systems, including the generation of new security access tokens.  This approach minimizes the need to involve other teams in the organization, simplifies the process for framework maintenance, and places the authority over DevSecOps tooling with the team responsible for pipeline stability.
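One way such on-the-fly reconfiguration could look is a rotation playbook that revokes and regenerates a tool token against a running system. The following sketch targets the SonarQube user-token API; variable names are hypothetical:

```yaml
# Sketch of on-the-fly token rotation tasks (SonarQube user-token API);
# variable names are hypothetical.
- name: Revoke the old SonarQube token
  ansible.builtin.uri:
    url: "http://{{ sonar_url }}:9000/api/user_tokens/revoke"
    user: "{{ sonar_user_name }}"
    password: "{{ sonar_user_pwd }}"
    method: POST
    force_basic_auth: yes
    body_format: form-urlencoded
    body:
      name: "{{ token_name }}"
    status_code: [200, 204]

- name: Generate a replacement token under the same name
  ansible.builtin.uri:
    url: "http://{{ sonar_url }}:9000/api/user_tokens/generate"
    user: "{{ sonar_user_name }}"
    password: "{{ sonar_user_pwd }}"
    method: POST
    force_basic_auth: yes
    body_format: form-urlencoded
    body:
      name: "{{ token_name }}"
  register: new_token_result
  no_log: true            # never log the generated secret
```

Because the rotation is idempotent from the caller's point of view, it can be scheduled without involving other teams.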

DevSecOps Platform Governance – Ensure platform configuration and consistency across the enterprise

Automation has proven to be one of the best, most efficient ways to establish and maintain consistency for enterprise systems.  A platform for DevSecOps is no exception.  By utilizing the aforementioned automation to build the CoStar Framework, governance is established around system tooling, user access, application pipelines, and the other critical aspects of DevSecOps best-practices.  As changes are introduced to improve the framework, they can be tested in stable on-demand environments prior to roll-out to the larger enterprise.  This provides a means for training environment creation as well, so that development teams are aware of improvements to the overall automation frameworks.

Organizational Personas – Clearly define organizational roles and responsibilities

Finally, it is important to remember that people are always going to be part of any development organization.  No amount of automation can replace the need for talented, creative, and goal-driven teams.  In order to build these teams, and to show how each individual works within the processes and tools defined in the CoStar Framework, we are leveraging Organizational Personas.  These role and responsibility descriptions detail each individual's responsibility area.  They also provide detailed tool expectations/guides, typical deliverable artifacts, decision rights, expected skills, and workflow assignments.


Figure 3. Organizational Persona – CI/CD Engineer

Essentially, an Organizational Persona is similar to, but the opposite of, a marketing persona: the individual takes on the predefined qualities of a specific role rather than representing a diverse marketing target audience.  In essence, the individual is "putting on the badge" and assuming the well-defined responsibilities.  The value of this approach is that it unambiguously defines role boundaries directly understood by each individual.  If someone wants to perform a task assigned to another persona, that individual becomes responsible for the entirety of the second role!  This strongly discourages blurred responsibility ownership and enables better collaboration between individuals and teams.

Conclusion

Most organizations have recognized the importance of hosting a DevSecOps program to automate their software and infrastructure development.  However, DevSecOps teams face significant challenges as they work to support larger and more complex corporate systems.  Many of these issues can be solved by treating the DevSecOps environment itself as a product.  The CoStar Framework represents one such approach to providing "automation for automation".

Using Ansible URI module with SonarQube tokens
https://blogs.perficient.com/2022/05/31/using-ansible-uri-module-with-sonarqube-tokens/
Tue, 31 May 2022

Red Hat Ansible is a very flexible configuration management tool that comes with a variety of built-in modules.  One of these modules, ansible.builtin.uri, is provided as an alternative to running "curl" commands through the ansible.builtin.shell or ansible.builtin.command modules.  However, the module documentation does not provide a specific example of how to use the URI module with token-based authentication.  This blog post shows examples of how this can be done using SonarQube API calls.

Example #1: User Name and Password authentication

When using the URI module for SonarQube API calls with username and password authentication (the credentials can be retrieved from Ansible Vault if desired), it is necessary to provide the "user:" and "password:" fields as well as "force_basic_auth".  In this example we generate a new token for the provided default user using a unique token name:

# generate a new user token for this session using the project_key
- name: Generate a SonarQube Session Token
  ansible.builtin.uri:
    url: http://{{sonar_url}}:9000/api/user_tokens/generate
    user: "{{sonar_user_name}}"
    password: "{{sonar_user_pwd}}"
    method: POST
    force_basic_auth: yes
    body_format: form-urlencoded
    body:
      name: "{{project_key}}"
  register: get_sonar_output
  no_log: true

- set_fact: sonar_admin_token="{{ get_sonar_output.json.token }}"

The return result will contain the newly generated token under the "json" section, which can be stored to a variable using 'set_fact':

"json": {
            "createdAt": "2022-05-31T20:34:12+0000",
            "login": "admin",
            "name": "costar125",
            "token": "4f12fd916c878f936e9bf3f04fc460ee4b225be3"
        },

By storing this token into a “fact” variable it can be used in subsequent API calls.

Example #2: Token based authentication – creating a project

In order to use a SonarQube-derived token, the API expects the token in the "user:" field.  However, the "password:" field must also be provided and left null.  This forces the URI module to generate the password-less API call that SonarQube expects.  In this example, we use the token generated above to authenticate the API call to create a new SonarQube project.  Note that if the project already exists, the call will generate an error.  This can be avoided with a prior task that retrieves all projects (using the unique project key) and a 'when' clause that validates the new project is not already present:

- name: Get SonarQube Projects
  ansible.builtin.uri:
    url: http://{{sonar_url}}:9000/api/projects/search
    user: "{{sonar_admin_token}}"
    password: ""
    method: GET
    force_basic_auth: yes
    body_format: form-urlencoded
    body:
      projects: "{{project_key}}"
  no_log: false
  register: get_sonar_projects

- set_fact: check_project_name="{{ get_sonar_projects.json.components | json_query(query) }}"
  vars:
    query: "[?name=='{{ project_name }}'].name"

- name: Create SonarQube Project
  ansible.builtin.uri:
    url: http://{{sonar_url}}:9000/api/projects/create
    user: "{{sonar_admin_token}}"
    password: ""
    method: POST
    force_basic_auth: yes
    body_format: form-urlencoded
    body:
      name: "{{project_name}}"
      project: "{{project_key}}"
      visibility: public
  register: create_sonar_output
  no_log: false
  when: check_project_name is not search(project_name|string)


Example #3: Token based authentication – revoking the token

Finally, if the token is not to be used further it can be revoked in the same manner that it was created:

# revoke the session token
- name: Revoke a SonarQube Session Token
  ansible.builtin.uri:
    url: http://{{sonar_url}}:9000/api/user_tokens/revoke
    user: "{{sonar_admin_token}}"
    password: ""
    method: POST
    force_basic_auth: yes
    body_format: form-urlencoded
    body:
      name: "{{project_key}}"
    status_code: [200, 204]
  register: get_sonar_output
  no_log: true

In this blog post we explored how the Ansible URI module can be used to a) generate a new security token, b) pass a security token to SonarQube APIs to create a new project (if not already present), and c) revoke the newly created token at the end of the Ansible session.  In these examples the SonarQube API calls expected the token in the 'user:' field with a null password.  Other API calls may require the "password:" field to contain the token and ignore the user field.  Check your documentation carefully to see which approach is used for each specific API call.

Software Attack Surface Analysis
https://blogs.perficient.com/2021/08/31/software-attack-surface-analysis/
Tue, 31 Aug 2021

All software systems exist in an insecure state, which creates the need for a way to conduct software attack surface analysis.  This is because any useful system must connect in some way with the outside world, and therefore contains at least one point of interaction with that world.  These communication paths accept data and instructions into the system and report processing results out.  Modern web-enabled software systems, as opposed to older client-server systems, are usually directly connected to the broader Internet.  These connection points are required for the system to provide value to its stakeholders, but they also represent opportunities for attackers to suborn the system.

In this blog post we will explore a visual modeling approach to attack surface discovery: rapidly identifying software system assets, evaluating attack-point vulnerabilities, defining controls against those risks, and reporting evidence of attack mitigation.


Figure 1. Example Attack Surface Model


As shown in Figure 1, an Attack Surface Model is a technique for evaluating and assessing the vulnerabilities of a system that are potentially exposed and available for exploit.  The purpose of this exercise is to identify the organizational assets that have value to an attacker and to associate them with appropriate risks.  The model focuses on the external access points, or "surface", of the target system, as these are the most likely points for an external/internal actor to target for access.  For example, a web-site hosted on a corporate network may be vulnerable to a variety of external exploits such as denial-of-service, cross-site scripting, unauthorized data exfiltration, and malicious code execution, just to name a few.  To mitigate these exposed vulnerabilities, a series of controls are established to either eliminate the vulnerability or reduce the potential for exploit.  Examples of controls for "data leaks" (aka unauthorized data exfiltration) include encryption, removal of unneeded sensitive/proprietary information, or anonymization of the data.

Identification of Assets

An organization’s assets are represented by any system, data, or artifact that has value.  For example, a corporate human resources system contains highly sensitive and/or private data regarding compensation, bonus awards, equity awards, and the like.  Exposure, loss, or corruption of this system will result in a high business, and possibly legal, impact.  Identification and characterization of assets is beyond the scope of this post, but for more information please refer to the ISO 27001/27002 standards.  For the purpose of Attack Surface modeling, it is sufficient to identify all components of a software system that are potentially exposed to exploitation.

Table 1. Example Description of Software System Assets

Asset: AEM Platform
  Description: AWS-hosted Adobe Experience Manager development and testing environments.
  Value Statement: Lower environments are essential to development efforts; loss or corruption of these will result in extra time/effort to recover functionality.
  Threat Profile: These platforms are hosted on the AWS cloud, which involves the Shared Security Model.  The organization is responsible for the virtual machines, network configuration, and access management (i.e. not physical security of the data center).  As a lower development environment this poses a moderate attack target.

Asset: Adobe SharePoint
  Description: This data store is used as the primary repository for AEM content deployment.
  Value Statement: Web-site content is versioned and maintained in this system for use in public-facing web applications.
  Threat Profile: As publicly facing information, this represents a significant attack target.  Loss or corruption may affect organization reputation, customer confidence, or limit system functional behavior.

 

Understanding Attackers

There are many possible motivations behind a software system attacker.  An 'extortionist' may simply be after monetary reward to avoid causing damage to the target systems or reputation.  A 'vandal' by contrast may be interested in causing as much damage as possible.  Understanding the types of attackers likely to target a particular system gives insight into the means and mechanisms used by these actors, and in turn aids in identifying system vulnerabilities.

Table 2. Description of Attackers and Motivations

Actor: Extortionist
  Description: This Actor is looking for opportunities to insert ransomware or other non-destructive ways of forcing the organization to pay for return of data and/or system capability.  Typically the attack does not expose private data, but rather prevents approved access.
  Goal: Force target organization to pay a ransom for return of data / system access.
  Motivation: Money and pride are key motivations.

Actor: Vandal
  Description: This Actor is looking to cause as much disruption and destruction of property as possible.  They desire to disrupt the organization by blocking access, corrupting data, inserting false data, or otherwise co-opting production systems.
  Goal: Disruption of business activities, degradation of organizational reputation, exposure to legal / governmental consequence.
  Motivation: Pride, activism, or revenge are key motivations.

Actor: Thief
  Description: This Actor is focused on accessing and acquiring valuable data.  Typically, they will access systems covertly (sometimes for years) collecting private data on customers, clients, and any other target of interest.
  Goal: Acquisition of private data for sale, business disruption, espionage, identity theft, or other means of producing profit from data theft.
  Motivation: Money is the primary motivation.

Discovery of Risks/Vulnerabilities

Software systems, and in particular web-applications, are vulnerable to a variety of different attacks.  Nefarious actors probe these attack points in order to uncover vulnerabilities that can be exploited to compromise the system.  Shown in Table 3 is a short collection of such attack points grouped under a general category of risks.  There are many available resources to identify and detail potential risks, such as the Open Web Application Security Project®, the open-source National Vulnerability Database, the HITRUST Alliance, and the Center for Internet Security.  By categorizing potential vulnerabilities, and rapidly discarding ones that are not relevant to the current investigation, the analysis space can be quickly bounded.  For example, a web-application that is hosted by a cloud provider does not need to consider physical security of the servers (which falls to the vendor under the shared responsibility model).  Refer to Figure 1 for the hierarchy of risks, attacks, vulnerabilities, and exploits.

Table 3. Example Risks – Attacks – Vulnerabilities

Risk: System Operation
  Attack: Insufficient Monitoring
  Vulnerability: Intrusion Awareness
  Consequence: Unknown unauthorized system access
  Description: Logging and monitoring is the process of performing and storing audit logs for sign-ins to detect unauthorized security-related actions performed on a framework or application that forms, transmits, or stores sensitive data.  This vulnerability occurs when the security event is not logged properly and/or the system is not actively monitored.  Lack of implementation of such practices can make malicious activities harder to detect, affecting the process by which the incident is handled.  While logging and monitoring are universally important to all aspects of data security, this vulnerability becomes particularly acute when bad actors with valid credentials (such as Trusted Insiders) are enabled to traverse a system and exfiltrate data undetected due to lack of comprehensive access logs.
  Vulnerability: Platform Health
  Consequence: Reduced system availability / compromised behavior

Risk: Invalid Authorization
  Attack: Session Spoof
  Vulnerability: X-Site Scripting
  Consequence: User session compromised
  Description: Often initiated through "sniffing" (the grabbing of unencrypted network data through the use of a network controller in Monitor mode), the Session Spoof vulnerability is enacted when a highly qualified specialist actor obtains the identifiers (TCP Sequence Number and TCP Acknowledgement Number) of a user's active web service session.  The actor can then use the current identifiers to create a falsified data packet which can be sent from any internet connection to fool the service that the actor's session is legitimate, providing the actor with access control of whatever credentials the user was implementing.  Session Spoofing is rarely used by modern actors, as OS providers have developed defenses against these attacks; however, some estimates put the number as high as 35% of modern web-systems still being vulnerable to Session Spoofing.
  Vulnerability: Header Impersonation
  Consequence: User session compromised
  Vulnerability: Automated Response
  Consequence: User session compromised

Definition of Controls

Controls are defined as technical, procedural, or administrative mechanisms used to prevent or mitigate one or more vulnerabilities (see ISO 27001, Annex A for details on control categories).  For the Attack Surface Model the key points are the type of control, the specific vulnerability targeted, the mitigation mechanism, and the resulting evidence of mitigation.  Examples of common controls are noted in Table 4.  As part of the Attack Surface Model analysis approach, once a set of potential vulnerabilities is identified, the next step is to investigate what (if any) controls have been applied.  As also shown in Table 4, the mechanism used for mitigation (and the evidence of effectiveness) is tied to the way the control is implemented.

Table 4 – Example Controls and Mitigations

Control: Establish Secure Configuration Process for Network Infrastructure
  Target Vulnerability: Public port availability
  Mitigation Mechanism: Automated port access grant/restrict network configuration
  Evidence of Mitigation: Network port availability report

Control: WebApp Firewall / AWS CloudWatch
  Target Vulnerability: Intrusion Awareness
  Mitigation Mechanism: Monitoring of network traffic for invalid sources and/or packet patterns
  Evidence of Mitigation:
    • Firewall intrusion report
    • Firewall configuration report
    • AWS CloudWatch report

Control: Load-balancer Alarm
  Target Vulnerability: Platform Health
  Mitigation Mechanism: Evaluation of platform operation via health-check (i.e. 'heart-beat' request)
  Evidence of Mitigation: System/Platform uptime report

Control: Patch Process
  Target Vulnerabilities:
    • Malware Distribution
    • Security Monitor Disable
    • Component Exploit
  Mitigation Mechanism: Ensuring timely application of all upgrade and security patches
  Evidence of Mitigation: Vulnerability mitigation report

Control: SSH Secure Key Access
  Target Vulnerability: Log Data Access
  Mitigation Mechanism: Shared secret access management for platform logs
  Evidence of Mitigation: Implementation of SSH platform security with periodic key rotation

Using the Attack Surface Model for Investigation

The key to an effective security investigation is a consistent, thorough approach.  The model presented here provides guidance for such an approach, but should not be considered the only way to conduct attack surface modeling.  In the end, it takes only one critical security miss to make the newspaper headlines.

Limit system scope to focus on a limited risk area.  Taking on too large an initial investigation will result in confusion for the development teams and provide opportunities for missed vulnerabilities.  A good rule of thumb is to keep each investigation centered on a single functional area, such as a web-site or a set of micro-services.

Work with risk areas as a unit, as controls are often related.  By leveraging similarities among vulnerabilities it is much easier to identify appropriate controls.  For example, when considering data risks, a common control across a wide variety of vulnerabilities is encryption.  Likewise, user session vulnerabilities can often be mitigated by a properly configured web-server that leverages modern session management.

Eliminate potential vulnerabilities that are not relevant.  For most systems, not all of the possible risks/vulnerabilities are present.  As one example, session management is typically only relevant for web-based systems; a database management system would not have the same risks.  Limiting the vulnerability space to a small set also helps with control identification for the reason given above.

Note areas of potential high risk consequence.  Not all vulnerabilities are equal in the potential impact to the business.  Therefore, it is a good practice to rank the identified vulnerabilities according to the value of the asset involved, and the potential consequence of a successful attack.  Note all vulnerabilities without adequate mitigation and rank by consequence (i.e. Catastrophic, Major, Moderate, Minor).

Finally, all vulnerability mitigations require evidence of effectiveness.  It is not enough to state in documentation that a particular control is in place, it is also necessary to show proof that the vulnerability has been mitigated.  For example, if proxy-servers are used to control against unauthorized network access, then a periodic test must be run to ensure the network address configurations are still in place and functioning.
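As a sketch of such a periodic, evidence-producing test, an Ansible task like the following could verify that only the intended ports are reachable; the hosts and ports shown are hypothetical:

```yaml
# Sketch of a periodic evidence-producing check; hosts/ports are hypothetical.
- name: Verify that only the proxied HTTPS port is reachable
  ansible.builtin.wait_for:
    host: "{{ item.host }}"
    port: "{{ item.port }}"
    state: "{{ item.expected }}"    # 'started' = must be open, 'stopped' = must be closed
    timeout: 5
  loop:
    - { host: web01.example.com, port: 443,  expected: started }   # allowed via proxy
    - { host: web01.example.com, port: 8080, expected: stopped }   # direct app port blocked
  register: port_check              # captured as evidence in the periodic report
```

The registered result, run on a schedule, becomes the stored evidence that the control is still functioning.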

Conclusion

There are many techniques for performing security threat assessments.  The Attack Surface Model approach has been shown to be effective and complete when investigating system vulnerabilities and controls.  As this post illustrates, significant effort is spent up-front to create a risk/vulnerability framework for a given set of assets.  However, once built, the same framework can be applied across a wide variety of software and network systems.  This approach is therefore recommended for critical business support systems as part of a full security assessment.

]]>
https://blogs.perficient.com/2021/08/31/software-attack-surface-analysis/feed/ 0 296607
GitHub Code Migration Using DevOps Automation https://blogs.perficient.com/2021/04/15/github-code-migration-using-devops-automation/ https://blogs.perficient.com/2021/04/15/github-code-migration-using-devops-automation/#respond Fri, 16 Apr 2021 02:38:14 +0000 https://blogs.perficient.com/?p=290755

Migration from one code management system to another is a non-trivial exercise.  Most of the time the team wishes to maintain code history, branch structure, team permissions, and integrations. This blog post investigates one such migration from Bitbucket to GitHub for a large health maintenance organization.

Due to growth and acquisition over time, the organization found that development teams were using multiple source control systems. This led to increased expense from duplicate support efforts and license costs.  These duplicated efforts included platform management, automated Continuous Integration / Continuous Delivery (CI/CD) integration, and end-user support. To resolve these issues, GitHub was chosen as the single platform for source control. The GitHub enterprise product offers multiple benefits, including tool integrations (e.g., web-hooks, SSH key based access, workflow plugins), an intuitive UI for team and project management, and notifications on specific behavior driven events (e.g., pull-request, merge, branch creation). Additionally, there is the option for cloud or on-premises deployment of their source code management (SCM) platform.

The migration of several thousand repositories presented a significant challenge. Beyond the logistics of coordination, the DevOps team also had to meet the tight timeframe imposed by an upcoming license renewal; to avoid that additional expense, all migrations had to be complete before the renewal date. Teams were required to migrate not just the code base, but all of the associated metadata (e.g., branch history, user permissions, tool integrations). In the approach detailed below we extensively leveraged CloudBees Jenkins™ workflows, Red Hat Ansible™ playbooks, and Python™ scripting to perform much of the required setup and migration work.

Approach

As shown in Figure 1, the migration effort involved creating a Jenkins migration workflow driven by user-provided information defining the Bitbucket source project, the GitHub target project, team ownership, repository details, and additional integration requirements.  This migration information was stored in a new file added at the root of the source tree (‘app-info.yml’).  This approach facilitates future automation integration and provides a simple way to track application metadata within the code base itself.
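As a rough illustration, a migration script might read that metadata file as follows; the field names and the flat key: value layout are assumptions, and a real implementation would use a full YAML parser such as PyYAML rather than this minimal reader:

```python
# Parse a flat, hypothetical 'app-info.yml' (key: value lines only).
REQUIRED_KEYS = {"source_project", "target_org", "team", "repository"}

def parse_app_info(text):
    """Return a dict from flat key: value lines, ignoring blanks and comments."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key.strip()] = value.strip()
    missing = REQUIRED_KEYS - info.keys()
    if missing:
        raise ValueError(f"app-info.yml missing keys: {sorted(missing)}")
    return info

sample = """\
# migration metadata
source_project: legacy-bitbucket-proj
target_org: example-org
team: payments-dev
repository: claims-service
"""
info = parse_app_info(sample)
print(info["repository"])
```

Validating required keys up-front lets the Jenkins workflow fail fast on a malformed metadata file instead of partway through a migration.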


Figure 1. GitHub migration automation workflow

There were multiple considerations to address in the GitHub migration automation, including ensuring the target GitHub project had proper visibility permissions (e.g., public/private), using consistent project naming standards, integrating with pre-existing or to-be established security scanning automation, applying organization defined branch protection rules, and maintaining all necessary CI/CD pipeline automation.

Code Transfer

While technically the most straightforward migration operation was to clone the code into the new repository, this required significant manual modifications to several key automation files maintained at the root of the project folder structure.  For example, the pre-existing Jenkins configuration (‘Jenkinsfile’) was updated post-migration to point to the correct shared-library project; these libraries had been previously migrated from Bitbucket to GitHub. Unfortunately, because each development team used a specific library version, this step was a manual rather than automated onboarding activity.

Branch Protection Rules

The organization had established a set of consistent branch management rules for source control trees.  For example, the policy requires that a pull-request be approved by at least one reviewer prior to code merges for the ‘master’, ‘release’, and ‘develop’ branches within the repository.  These rules were encoded within the migration Python scripts and pulled from the Ansible playbook during GitHub project creation.
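As a sketch of how such rules can be encoded, the payload below follows the shape of the GitHub REST branch-protection endpoint; the specific fields chosen (approval count, admin enforcement, no status-check or push restrictions) are assumptions about the organization's policy rather than the actual migration scripts:

```python
# Build GitHub branch-protection payloads for the organization's policy.
PROTECTED_BRANCHES = ["master", "release", "develop"]

def protection_payload(required_approvals=1):
    """Payload for PUT /repos/{owner}/{repo}/branches/{branch}/protection."""
    return {
        "required_pull_request_reviews": {
            "required_approving_review_count": required_approvals,
        },
        "enforce_admins": True,
        "required_status_checks": None,   # left to per-team CI policy
        "restrictions": None,             # no push restrictions
    }

rules = {branch: protection_payload() for branch in PROTECTED_BRANCHES}
print(sorted(rules))
```

An Ansible playbook can then iterate over `rules` and apply one API call per protected branch as each GitHub project is created.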

Automated CI/CD Pipeline Modifications 

To support the existing CI/CD pipelines, the migrated code bases required pipeline configuration file updates. This included configuration links for automated Jira issue updates, proper Jenkins master/agent execution (i.e., web-hooks), security automation scans, and integration with library package control (e.g., JFrog Artifactory™).  These modifications were captured in migration Python scripts and pulled from the Ansible playbook during GitHub code migration.

Access Key and Service Account Management

Automated CI/CD processes often require the use of service accounts and shared-secret access keys to function properly. During the GitHub migration it was critically important to maintain these access keys to prevent improper exposure to logs, notifications, or any other insecure reporting.  The GitHub migration team used the Ansible vault feature and Groovy scripts to update built-in Jenkins credential management to ensure that project specific secrets/accounts/keys were securely transferred to the newly created GitHub linked jobs during the migration process.

GitHub Pre-Migration Setup

The GitHub Jenkins integration was built as a separate job to create the GitHub ‘team’. This included configuring the team with a proper name and administration users, and matching it to the corresponding Jenkins build folder. For each repository we also set a Jenkins “web-hook” to ensure the proper Jenkins master is used to run each CI/CD pipeline.
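The pre-migration setup can be sketched as two payload builders; the payload shapes follow GitHub's team-creation and webhook REST endpoints and the Jenkins GitHub plugin's `/github-webhook/` receiver convention, but the names and field choices here are illustrative assumptions:

```python
# Sketch of pre-migration setup payloads (names are hypothetical).
def team_payload(name, maintainers):
    """Payload for POST /orgs/{org}/teams."""
    return {"name": name, "privacy": "closed", "maintainers": maintainers}

def jenkins_webhook_payload(jenkins_master_url):
    """Payload for POST /repos/{owner}/{repo}/hooks targeting a Jenkins master."""
    return {
        "name": "web",
        "active": True,
        "events": ["push", "pull_request"],
        "config": {
            "url": f"{jenkins_master_url}/github-webhook/",
            "content_type": "json",
        },
    }

hook = jenkins_webhook_payload("https://jenkins.example.com")
print(hook["config"]["url"])
```

Driving both payloads from the same per-repository metadata keeps the GitHub team, Jenkins folder, and webhook consistently named.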

Automated Testing Integration

As a part of code quality control, SonarQube code scanning is tied to a defined repository and required as part of the Jenkins CI/CD workflow.  The scan results are reported to a separate GitHub tab which needed to be matched up with the project team.  In this way, the newly created GitHub project could directly report to developers the results of the automated code quality analysis.

Results

The DevOps enablement team was required to meet a very tight deadline of four months to complete the full migration from Bitbucket to GitHub and avoid the expense of license renewals.  Given the scope of the challenge, the only viable solution was to automate as much of the migration as possible.  Where manual intervention was required, the DevOps team clearly communicated a checklist of activities to the affected teams for both pre- and post-migration changes.  Using the combined tool set of scripted Jenkins jobs, Ansible playbooks, and Python scripting, the DevOps team successfully completed all migrations and modifications to all code bases several weeks prior to the deadline. The organization’s information technology team has reported that all teams are active on GitHub and the Bitbucket repositories have been archived.

]]>
https://blogs.perficient.com/2021/04/15/github-code-migration-using-devops-automation/feed/ 0 290755
DevSecOps – Canary Deployment Pattern https://blogs.perficient.com/2020/07/01/devsecops-canary-deployment-pattern/ https://blogs.perficient.com/2020/07/01/devsecops-canary-deployment-pattern/#respond Wed, 01 Jul 2020 13:00:07 +0000 https://blogs.perficient.com/?p=275892

The Canary Deployment Pattern, or canary release, is a DevSecOps deployment strategy that minimizes risk by targeting a limited audience.  As with all deployment patterns, the goal is to introduce the newly deployed system to users with as little risk and in as secure a manner as possible.  As noted below, the motivation of this particular approach is to identify a small segment of the user community that can act as an initial response group.  Typically, this means the selected user segment is “friendly” to the idea of trying out a new and possibly unfamiliar set of system features.  By limiting the general release in this manner, feedback can be gathered on the impact and acceptance of the new features.  Once the “canary” group has had sufficient time to validate the deployment, the full user base can be updated.

Canary Deployment

  • Pattern Name: Canary Release

  • Intent: A canary release is a way to identify potential system problems early without exposing all users. The intent is to deploy the application to a limited user audience and gain feedback on any issues that may arise.
  • Also Known As: Limited Rollout, Feature Trial, Beta Release, Soak Deployment
  • Motivation (Forces): Reduce the impact of a new and potentially disruptive change to the user community.  For example, a significant change to a user experience, such as release of a new user interface.
  • Applicability: Any system where users can be impacted by a significant functional change; often applied to system releases that have readily identified user sub-populations (i.e. ‘friendly’ or ‘pre-release’)
  • Structure: 

Figure 1. Structure of rollout in the canary release pattern

  • Participants: 
| Participant | Role | Description |
| --- | --- | --- |
| Production Release | Current production release | This is the current production release system. It will not be affected by the experimental release. |
| Primary User Group | Represents the core user group for the system | This group is comprised of the core system users who will remain on the current release as the “control” group. |
| Load Balancer | Channels traffic from specific user groups or regions to the target server environment | Depending on how the user population is segregated (e.g. organization internal users vs. external users), the load balancer is used to direct user traffic to a well-defined release endpoint. |
| ‘Canary’ User Group | Represents a ‘friendly’ user group for experimental, potentially disruptive releases | This is a select group of system users who have agreed to try out the new functionality/capabilities and provide feedback as needed. |
| ‘Canary’ Release | Proposed experimental release | This is the experimental or potentially disruptive release. It will be separated from the current production release. |

 

  • Collaboration: This pattern depends on the ability of the load balancer to properly shift user traffic so that the “canary” group is the only set of users who see the experimental release.  Typically, this group will include a set of pre-selected ‘friendlies’ who understand that they will be using a new, possibly unstable system.  It is expected that the ‘canary’ users will be contacted afterward to review their experience with the system.
  • Consequences: Given that the ‘canary’ release is potentially disruptive or unstable it should be closely monitored during the ‘canary’ testing period.  Follow up with the targeted user group is also highly recommended to gain feedback on usability and stability.  If necessary, the ‘canary’ release can be rolled-back to the previous release version while discovered issues are remediated.
  • Implementation: This pattern requires that the production environment be capable of segregation into two groups.  It is necessary that the production release servers be capable of handling the full user load in the case where the ‘canary’ release must be rolled back.  Moreover, the user base must have some differentiating factor that can be used at the load balancer level to route traffic to the appropriate end-point (e.g. IP address range, internal/external user group, VPN/IPSEC tunnel, etc.).

Figure 2. Canary deployment traffic routing

  • Trade Offs: This approach requires that a portion of the production environment be deployed with a different release version.  There is additional monitoring overhead, and potentially a data migration will be required once the ‘canary’ period expires.  While the ‘canary’ version is in production, there may also be difficulty with additional release development and support, as the development team may be required to review issues and observations.
  • Known Uses: This pattern is often found during ‘beta-releases’ of new systems where stability and/or capabilities are rapidly changing.  It is also found where development teams are rolling-out significant modifications to core system functionality and are concerned with user acceptance or system stability under production loads.
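The load-balancer routing described under Implementation can be sketched as a small routing function; the named user group, percentage rollout, and hash-based bucketing are illustrative assumptions, since real deployments route at the load balancer by IP range, user group, or tunnel as the pattern notes:

```python
import hashlib

# Route a user to the 'canary' or 'production' release (illustrative only).
CANARY_USERS = {"qa-team", "beta-tester-1"}   # pre-selected 'friendly' users
CANARY_PERCENT = 5                            # optional percentage rollout

def route(user_id):
    """Deterministically pick a backend for a user, canary group first."""
    if user_id in CANARY_USERS:
        return "canary"
    # Stable hash so a given user always lands on the same release.
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "canary" if bucket < CANARY_PERCENT else "production"

print(route("qa-team"))  # always canary
```

Because the bucketing is deterministic, a user's experience does not flip between releases across requests, which keeps feedback from the canary group coherent.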

Conclusion

The canary release pattern offers product development teams an opportunity to validate system changes on a small subset of the user population, thereby avoiding widespread disruption if a set of new features fails to meet expectations.  Moreover, for very large user groups (e.g. national or international in scope), this approach provides a mechanism to target specific user communities in isolation.  The ‘canary’ group can therefore provide valuable feedback that can be incorporated into the overall system prior to a full user-base exposure.

]]>
https://blogs.perficient.com/2020/07/01/devsecops-canary-deployment-pattern/feed/ 0 275892
DevSecOps Best Practices – Automated Compliance https://blogs.perficient.com/2020/06/05/devsecops-best-practices-automated-compliance/ https://blogs.perficient.com/2020/06/05/devsecops-best-practices-automated-compliance/#respond Fri, 05 Jun 2020 13:00:39 +0000 https://blogs.perficient.com/?p=273385

Secure software practices are at the heart of all system development; doubly so for highly regulated industries such as health-care providers.  Multiple regulatory controls are required for the custodianship of patient and customer data, creation of secure software systems, governance of development environments, and proper management of audit information. As a best practice, it is recommended to automate certain security audits, integrate compliance oversight into key development process areas (e.g. Intake, Construction, Release Management), and adopt DevOps pipeline tooling.

A critical aspect of security and audit in DevSecOps is automation of system code vulnerability scanning.  This includes static application structure, dynamic application behavior, third-party component patch/version levels, and overall deployment environment compliance with hardened operating systems.  This automation reduces the time security personnel spend auditing, gathering, and publishing system vulnerabilities for remediation, and provides assurance that discovered issues are resolved in a timely manner.  With an automated approach there is more frequent security governance, improved vulnerability detection, and evidence of remediation for external auditors.

Secure Development and Automated Security Audit

Software development is a complex process and there are multiple best practices that exist to address application vulnerability.  The following sections cover the primary areas of concern when developing a secure coding practice and automation of security governance.

Source Code Analysis

A first-line defensive measure is to perform static scanning of source code during a continuous build process.  This active detection mechanism prevents known vulnerabilities from being inadvertently embedded into the system. The security and architecture teams should be involved in the cultivation of a standard set of security practices to help support a tool-assisted process. For efficiency, these policies should be encoded into an automated scanning tool, with the output being a set of recommended remediation steps to reduce or remove the detected vulnerabilities.  For example, if a developer creates a user interface element (i.e. text box) for data collection but does not validate the contents of that element, then the possibility exists for a SQL-injection attack, cross-site script execution attack, or other known vulnerabilities (see Figure 1).  The secure code scan will detect these situations, flag them for further attention, and create an audit log of the scan result.  In the Continuous Integration (CI) process, these scans will occur during or just prior to the build step of the DevOps pipeline.


Figure 1. SonarQube Security Scan Result
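To make the SQL-injection example concrete, here is a minimal sketch using Python's built-in sqlite3 module; the table and input payload are contrived purely for illustration:

```python
import sqlite3

# Unvalidated input concatenated into SQL enables injection; parameterized
# queries (the control a static scanner checks for) avoid it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin'), ('bob', 'user')")

user_input = "x' OR '1'='1"      # classic injection payload from a text box

# Vulnerable: the payload rewrites the WHERE clause and matches every row.
vulnerable = conn.execute(
    f"SELECT name FROM users WHERE name = '{user_input}'").fetchall()

# Safe: the driver binds the value; the payload is treated as a literal.
safe = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)).fetchall()

print(len(vulnerable), len(safe))  # 2 0
```

A static scanner flags the string-formatted query; the parameterized form is the remediation it recommends.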

Dynamic Application Analysis

Dynamic application analysis is an automated process run against a deployed application to detect various forms of known vulnerabilities.  Given that the OWASP top ten vulnerabilities are well known in the development community but continue to drive incidents, outages, losses, and data breaches, it is important to include a dynamic application vulnerability scan with each and every system deployment.  This is especially important when a system is deployed to production, but it also has significant value as part of the overall Continuous Deployment (CD) process.  Several tools are available in the commercial marketplace, such as Veracode dynamic analysis, that can scan deployed applications/APIs to detect possible vulnerabilities, report these findings to the compliance and development teams, and provide for an audit record of detection/remediation.

Data Stewardship

Although data governance is not technically part of the DevSecOps automated security audit architecture, and lies beyond the scope of the CI/CD automation discussed in this blog post, it is critical to a highly compliant organization.  It is nevertheless useful to consider adding automation for audit of data encryption validation, data visibility rules and permissions, data retention policies, and secure data transport as part of the overall development security stance.

Third-Party Components

A significant number of the libraries and components that are utilized in modern software development are reused across projects.  Many development groups rely on third-party components to gain development efficiency (i.e. open-source or commercially licensed), with the consequence that possible vulnerabilities are introduced.  It is important to develop a plan for governing what types of libraries, components, licenses, and versions of those dependencies are permitted for use in the computing environment. An active process must track the components that are approved for use in development and the versioning of those components in existing applications.  This means that a central controlled repository (e.g. JFrog Artifactory, Nexus Repo, Azure DevOps Artifacts, etc.) must be established to store and make available all of the approved components and reusable code.  In the case of most open-source and commercial components, the NIST National Vulnerability Database provides a continuously updated list of known or suspected vulnerabilities.  However, given that this database is not updated as rapidly as threats are detected, several commercial products have been created to provide a more timely reporting mechanism (e.g. Sonatype Nexus IQ).


Figure 2. Sonatype Nexus IQ Security Violation Report

Additionally, by continuously monitoring third-party component version status for software deployed into production environments, the overall attack surface for the application (and its dependencies) is greatly reduced.  Automation of these scans via commercial tools is therefore highly recommended, with all detected vulnerabilities immediately researched for validity and remediation.
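The core of such a dependency audit can be sketched in a few lines; the component names, versions, and advisory data below are entirely hypothetical, and a real tool would pull its vulnerability feed from the NVD or a commercial source rather than a hard-coded set:

```python
# Check a build's dependencies against an approved-component list
# (names, versions, and advisories are hypothetical).
APPROVED = {
    "left-pad": {"1.3.0"},
    "http-client": {"4.5.13", "4.5.14"},
}
KNOWN_VULNERABLE = {("http-client", "4.5.12")}   # e.g. from an NVD feed

def audit(dependencies):
    """Return a list of (component, version, reason) findings."""
    findings = []
    for name, version in dependencies:
        if (name, version) in KNOWN_VULNERABLE:
            findings.append((name, version, "known vulnerability"))
        elif version not in APPROVED.get(name, set()):
            findings.append((name, version, "not on approved list"))
    return findings

findings = audit([("left-pad", "1.3.0"), ("http-client", "4.5.12")])
print(findings)
```

Run as a pipeline gate, a non-empty findings list fails the build and writes the audit record the compliance team needs.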

Operational Security

As a core part of development, integration, and deployment of software systems it is necessary to configure the software application for deployment into multiple lower and upper computing environments (e.g. DEV/TST/INT/UAT/STG/PRD).  This requires that every application be configured with connection/configuration information that often includes various keys, certificates, credentials, and other information that must be kept private and secure.  In the DevSecOps model, this is accomplished by separation of the configuration item name (e.g. “DB_Connection”) from the value (e.g. “SERVER=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=MyHost)(PORT=MyPort))(CONNECT_DATA=(SERVICE_NAME=MyOracleSID)));uid=myUsername;pwd=myPassword;”).  A common approach is to segregate these secrets in a “vault” that is accessed during the Continuous Deployment process.  Secrets are securely retrieved via the deployment automation, associated with the deployment unit produced by CI automation, and the resulting product is deployed to the target environment.

The management and governance of these application secrets should be separate from the development team itself, but readily accessible via the deployment automation.
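A minimal sketch of the name/value substitution step described above; the `{{NAME}}` placeholder syntax and the secret names are assumptions, and a real pipeline would fetch values from a vault API at deploy time rather than from an in-memory dict:

```python
import re

# Substitute vault-held secrets for placeholders at deploy time
# (placeholder syntax and secret names are hypothetical).
def render_config(template, secrets):
    """Replace {{NAME}} placeholders with values fetched from a vault."""
    def lookup(match):
        name = match.group(1)
        if name not in secrets:
            raise KeyError(f"secret '{name}' not found in vault")
        return secrets[name]
    return re.sub(r"\{\{(\w+)\}\}", lookup, template)

vault = {"DB_CONNECTION": "Server=db.internal;uid=svc;pwd=***"}
template = "connection_string={{DB_CONNECTION}}\nlog_level=INFO"
rendered = render_config(template, vault)
print(rendered)
```

Failing loudly on a missing secret keeps a half-configured application from ever reaching the target environment.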

Secrets Management

Secrets are necessary for the operation of all modern software systems.  These secrets may be in the form of an SSH encryption key, digitally signed certificate, or user credentials (e.g. username/password).  They must be kept secure and separate from the primary system code.  As discussed above, applications are configured with the appropriate credentials/keys/certificates during the deployment process, when these configuration items are substituted for placeholder values in the application configuration file/datastore.  A common approach to the challenge of managing secrets is to use an application that is specifically designed to secure such information but is programmatically accessible by the deployment automation as needed.  Examples include HashiCorp Vault, Azure Key Vault, and other secret management systems.

Separation of Duties

A critical part of the creation of secure systems is to separate the duties of those responsible for creating the software system from those tasked with auditing the security state of those systems.  In most organizations, this separation is enforced by having a centralized security/compliance team that works in close relationship with the development teams.  These “Compliance Officers” are tasked with the responsibility of establishing the proper secure coding practices, development policies, and policy automation for efficient and frequent audit.  In turn, development teams are required to know all applicable corporate policies and documentation requirements, and to understand where and when in the development process the security Compliance Officer should be consulted.

Reference

NIST 800 Series – IT Security Standard – The US Government standards for IT computing security

OWASP Security Center – A special interest group that specializes in IT Security, secure coding practices, and evaluation of organizational maturity of software security practices

CIS Operating System Hardening – A collection of operating system specific controls to minimize the attack surface for applications running on those operating systems

]]>
https://blogs.perficient.com/2020/06/05/devsecops-best-practices-automated-compliance/feed/ 0 273385
DevSecOps – Blue/Green Deployment Pattern https://blogs.perficient.com/2020/05/14/devsecops-blue-green-deployment-pattern/ https://blogs.perficient.com/2020/05/14/devsecops-blue-green-deployment-pattern/#respond Thu, 14 May 2020 14:01:21 +0000 https://blogs.perficient.com/?p=273383

Blue/Green Deployment Pattern

The goal of any software development program is to release system changes into production.  There are many ways to safely and securely deploy software into a production environment.  In most cases these patterns follow a similar strategy of limiting exposure of the released software changes to the overall user audience.  This is done to reduce the impact of a failed deployment, to offer an opportunity for feedback from a select user group, or to provide a “zero-outage” deployment with no downtime to the users.  One of the most common is the ‘blue/green’ deployment pattern.

A blue/green deployment is where the production system is segmented into two environments – one (active) with the current production system deployed and the other (staging) for the new release.  As shown in Figure 1 below, these environments switch roles from one release to the next, acting in turn as the staging or active target.  The remainder of this blog post will present the blue/green deployment strategy using the standard software pattern format.

  • Pattern Name: Blue/Green Deployment
  • Intent: Reduce new release risk by deploying to a parallel non-live environment
  • Also Known As: Zero Downtime Deployment
  • Motivation (Forces): Deployment to production always entails some risk of a failed deployment leading to system outage and user impact.  By deploying to a separate non-live environment the release can be completed and verified prior to being made the production site.  Additionally, the original production environment can be maintained as a fall-back during the “burn-in” period for the new release.
  • Applicability: This pattern is well suited to rapid production release schedules based on Continuous Deployment (CD) via an automated DevOps pipeline.  In particular, this pattern supports agile development practices with short release cycles (e.g. 2-3 weeks) by allowing controlled deployments to a ‘staging’ environment prior to full production release.
  • Structure: 


Figure 1.  Blue-Green Deployment Pattern

  • Participants:

Table 1. Blue/green deployment pattern participants

  • Collaboration: The ‘blue’ and ‘green’ environments rotate responsibility for hosting the production application/system.  The Load Balancer is key to the pattern implementation to facilitate transition of user traffic from one environment to the other with minimal disruption.  The addition of automated deployment via a DevOps pipeline ensures a secure and consistent mechanism of deployment configuration.
  • Consequences: This pattern requires the ability to have all production traffic directed toward one environment (i.e. the ‘blue’ environment) while a release deployment is in progress on the other target.  Post-deployment, user traffic is redirected to the newly deployed system (e.g. ‘green’) while the original production system is maintained as a fallback (e.g. ‘blue’).  For subsequent deployments the pattern of environment use is alternated.
  • Implementation: 

On-Premise (Ground)

For a standard data-center hosted solution, the typical implementation is to have long-standing environments that are configured as a primary (“hot”) and secondary (“cold”) set of servers.  These are fronted by the Load Balancer acting as a gate-keeper for traffic routing to the two environments.  In this configuration, all of the network elements are pre-configured with the addresses for the target servers (virtual or non-virtual), which do not change during or after the release is complete.  Both the ‘blue’ and ‘green’ environments are required to be identical, but may be hosted in different geographic locations.

Off-Premise (Cloud)

For a cloud-based solution to this pattern, the typical implementation will be via hosted virtual servers contained within a virtual network.  This option is more flexible in that resources can be created or destroyed as needed (e.g. see Variant – Infrastructure as Code below), and network addressing is automatically updated as necessary for the virtual routers.  Moreover, servers can be automatically configured to expand capacity (‘auto-scale’) as required by user traffic.  Similar to the ‘ground’ based pattern implementation, the resources used in the ‘blue’ and ‘green’ environments can be geographically distributed.

  • Variant – Infrastructure as Code (IaC)


Figure 2. Blue-Green Deployment Variant – Infrastructure as Code (IaC)

In this variant of the pattern the release deployment target environment does not exist until it is created by the DevOps pipeline.  This requires that all of the necessary environment configuration (including security hardening) is coded and tested prior to the release deployment.  Using Infrastructure as Code (IaC), the target servers and other network components can be pre-defined and stored with the application code-base.  In this manner the application and the target environment are always evolving together.  In the blue-green deployment the ‘green’ environment is first created and then the deployable unit is deployed to the new environment.  Post-deployment, the original ‘blue’ environment is scheduled for destruction once the ‘green’ environment is considered stable in production.

Note that this variant may be used in either ‘ground’ or ‘cloud’ implementations, but it is more common to find IaC approaches used in cloud-based agile development.

  • Variant – Container-based Deployment


Figure 3. Blue-Green Deployment Variant – Containers

In this variant of the pattern the release deployment target is represented as a collection of one or more containers (e.g. Docker).  The target environment is managed by the ‘container manager’ that is responsible for running and managing all container hosts (e.g. Kubernetes).  The deployment is managed by the DevOps pipeline automation, but in this instance the deployable artifact is a container image maintained in a container registry rather than a traditional ‘deployable unit’.  Post-release, once the ‘green’ environment is considered stable in production, the containers in the ‘blue’ container group are scheduled for destruction.

  • Trade Offs: Depending on the variant of the pattern used (ground-based, cloud-based, IaC, or container) there are benefits and drawbacks.  In the case of ‘ground’-based implementations there is a requirement to maintain a separate “staging” environment at all times.  This represents a significant expense, but is offset to some degree by the support for fail-over business continuity.  In the IaC-based approach it is necessary to fully define the target environment as declaration files (Ansible/Chef Infra/Azure Resource Template/etc.).  This includes all necessary secret management, network configuration, and operating system security hardening.  For container-based deployments, a registry and container manager must be added to the solution pattern, as well as proper creation of the initial container images used for deployment.
  • Known Uses: This pattern is well-known and utilized by a variety of software development teams.  It is most often found with agile development teams utilizing cloud-based resources, but is also frequently found implemented in on-premise data centers using non-virtualized target environments.
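The alternating role switch at the heart of the pattern can be sketched in a few lines; the environment names and the verification hook are illustrative, standing in for the load balancer re-pointing and post-deployment checks described above:

```python
# Minimal sketch of the blue/green role switch at the load balancer
# (environment names and the verification hook are illustrative).
class BlueGreenRouter:
    def __init__(self):
        self.active, self.staging = "blue", "green"

    def deploy(self, verify):
        """Deploy to staging; promote it only if verification passes."""
        if not verify(self.staging):
            return False                       # active env left untouched
        self.active, self.staging = self.staging, self.active
        return True

router = BlueGreenRouter()
router.deploy(lambda env: True)       # green verified, becomes active
print(router.active, router.staging)  # green blue
```

Note that a failed verification leaves the active environment untouched, which is exactly the fall-back property the pattern is designed to provide.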

Conclusion

A consistent pattern for automated deployment, such as presented here, provides your teams with a predictable mechanism for production releases.  There are many patterns related to blue/green, such as progressive disclosure, canary deployment, feature toggle, and A/B testing.  Each of these patterns offers a different focus for an automated production deployment, and will be explored in more detail in future blog posts.

]]>
https://blogs.perficient.com/2020/05/14/devsecops-blue-green-deployment-pattern/feed/ 0 273383
Understanding Security Policies for Development https://blogs.perficient.com/2020/04/10/understanding-security-policies-for-development/ https://blogs.perficient.com/2020/04/10/understanding-security-policies-for-development/#respond Fri, 10 Apr 2020 15:16:19 +0000 https://blogs.perficient.com/?p=271353

Secure Software Development

Understanding security policies and how they apply to development practices is key to delivery of secure software.  Unfortunately, most development teams do not have a clear understanding of security implementation.  This may be due to several factors, but a common theme is that security professionals speak a different ‘language’ from developers (i.e. requirements vs. controls, code vs. vulnerability, feature vs. risk factors).  This lack of a common way to communicate security compliance impedes identification and removal of system vulnerabilities.  There are many examples of system and data breaches that have occurred as a result of misunderstood or lax security standards.

This is not to say that there are not very useful resources already available.  The Open Web Application Security Project (OWASP) group regularly publishes advice on how to identify and prevent common code-level vulnerabilities.  The HITRUST Common Security Framework (CSF) provides a unified security framework across virtually all of the current compliance and regulatory standards, such as NIST, ISO, and HIPAA (see Figure 1).  The Center for Internet Security (CIS) provides open-source resources for hardening of all major operating systems, as well as tools to help establish and maintain secure deployment environments.


Figure 1.  HITRUST CSF v. 9.3.1 Control Categories

So while there is no lack of information available for identifying and remediating security risks, it is difficult to determine exactly what will meet a specific organization's goal for security policy compliance.  Even a well-organized standard, such as the HITRUST CSF, is well over 500 pages!  Expecting all of your development teams to read through and understand every aspect is simply not reasonable.  Instead, the development team needs guidance for finding the exact resources needed to meet policy via proper controls.

Development Security Policies

To accomplish this, I propose that the organization's solution architects and security experts combine forces to create a set of 'development security policies'.  As shown in Figures 2 and 3, these restatements of standard security policies enable development teams to quickly identify the guidelines, standards, and practices that will meet the compliance objective.  In these examples (taken from the HITRUST CSF, version 9.3) the policies are organized by control category and objective.  The policy statement goes further to define specific 'indicators' for when the policy is applicable, 'controls' for ensuring compliance, and 'implications' to note likely side effects of compliance.  The policy statement then directs the development team to applicable standards, guidelines, practices, and useful references.

Control Category: Access Control

Objective Name:  Authorized Access to Information Systems

Control Reference:  User Authentication for External Connections

Description: Appropriate authentication methods shall be used to control access by remote users.

Indicators

  • OAuth2/OpenID is required for access to additional internal or external resources
  • Single Sign-On is required for multi-system access
  • Multi-factor authentication is required by regulation or business policy (e.g. FISMA, PCI, HIPAA, etc.)

Controls

  • Utilization of industry standards and methods (e.g. OAuth2, MFA, SAML, etc.)
  • Auditing of system design to ensure compliance with regulations and company standards

Implications

  • Systems that require enhanced authentication mechanisms will need to be carefully reviewed in the design phase to ensure that proper architectural mechanisms are in place to meet the additional requirements.

Related Standards:

OAuth2/OpenID resource authentication, SAML single sign-on, Kerberos authentication

References:

NIST SP 800-63 Digital Identity Guidelines

Figure 2. Example development security policy for identity management

This approach significantly reduces development team confusion.  It restates the security policy in specific terms that a development team can understand, implement, and test.  It also bridges the gap between development groups and security professionals by translating global statements into actionable requirements.  Rather than being subject to external security audits, each development team is empowered to evaluate its own systems.  The security team gains valuable insights into how development teams operate, especially in agile practices where "shift-left" applies equally to security needs.  Finally, by creating and curating a focused set of defined policies, there is increased adoption of best practices, leading to safer and more reliable software releases.
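One way to make such policies even more actionable is to keep them as structured data that pipelines and review tools can query, so each team sees only the policies whose indicators match its system.  The sketch below is a hypothetical schema, not part of HITRUST or any standard; the field names simply mirror the policy layout used in the examples here.

```python
from dataclasses import dataclass, field

@dataclass
class DevSecurityPolicy:
    """Machine-readable restatement of a security policy (hypothetical schema)."""
    control_category: str
    objective: str
    indicators: list = field(default_factory=list)  # when does the policy apply?
    controls: list = field(default_factory=list)    # how is compliance ensured?
    references: list = field(default_factory=list)

    def applies_to(self, system_traits: set) -> bool:
        # The policy is in scope if any indicator matches the system's traits.
        return bool(set(self.indicators) & system_traits)

auth_policy = DevSecurityPolicy(
    control_category="Access Control",
    objective="Authorized Access to Information Systems",
    indicators=["external-access", "sso", "mfa-required"],
    controls=["OAuth2", "SAML", "MFA"],
    references=["NIST SP 800-63"],
)

# A team describes its system's traits and gets back only the relevant policies.
print(auth_policy.applies_to({"external-access", "pci"}))
```

A curated catalog of such records could then drive a self-service lookup, replacing the 500-page document dive with a filtered list of applicable controls.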

Control Category: Information System Acquisition, Development, and Maintenance

Objective Name: Correct Processing in Applications

Description:

To ensure the prevention of errors, loss, unauthorized modification, or misuse of information in applications, controls shall be designed into applications, including user-developed applications, to ensure correct processing. These controls shall include the validation of input data, internal processing, and output data.

Indicators

  • The application accepts user-supplied or externally sourced input
  • Data is exchanged with other internal or external systems
  • Data integrity is required by regulation or business policy (e.g. HIPAA, SOX, etc.)

Controls

  • Validation of input data, internal processing, and output data per OWASP guidance
  • Auditing of system design to ensure compliance with regulations and company standards

Implications

  • Systems that process untrusted input will need to be carefully reviewed in the design phase to ensure that proper validation and error-handling mechanisms are in place to meet the additional requirements.

Related Standards:

OWASP Data Validation Security

References:

NIST SP 800-171 Protecting Controlled Unclassified Information in Nonfederal Systems and Organizations

Figure 3. Example development security policy for data integrity

 

Conclusion

Security teams understand compliance standards and policies.  Development teams understand requirements and create solutions to problems.  Restating compliance policies as noted in this post will benefit both teams by creating a common security compliance definition.  Having these teams work together is crucial to the delivery of secure, stable, and productive software.

Security Threat Assessment Modeling https://blogs.perficient.com/2020/03/17/security-threat-assessment-modeling/ https://blogs.perficient.com/2020/03/17/security-threat-assessment-modeling/#comments Tue, 17 Mar 2020 18:43:12 +0000 https://blogs.perficient.com/?p=271029

Security threat assessment models are an important tool in an overall security and compliance program.  In order to create an effective set of security policies, it is necessary to understand the types of threats, their likelihood of occurrence, the impact of a breach/incident, and how the business can mitigate or control against these threats.  Many different threat analysis techniques have been developed for various industries.  These approaches all involve some degree of systematic "divide and conquer," where the security space is divided into categories that are then investigated.  For example, the HITRUST Common Security Framework (CSF) divides the overall organizational security space into 14 key categories.  When performing a threat assessment, it is good practice to focus on areas of high business impact, as will be discussed below.

Key Concepts

There are a few key security concepts that should be considered while conducting a threat assessment: confidentiality, integrity, and availability (CIA).  Confidentiality refers to the ability of a system to keep sensitive information secure against unauthorized access.  Integrity ensures that the information remains correct, consistent, and complete.  Availability ensures that information is accessible to those authorized to review or modify it.  When considering threats to a particular system, make sure to include all three CIA areas.

A security risk is any human or environmental impact that could disrupt information confidentiality, integrity, or availability.  There are literally hundreds (if not thousands) of possible risks, but usually only a limited subset applies to any given situation.  For example, an unencrypted database information store is subject to possible intrusion or data loss, but is not very likely to be physically stolen from a data center.  Likewise, a physical plant element (e.g. a network cable) does not need to be "encrypted" to ensure the security of the element.

A control represents a mitigation against a defined security risk.  Controls can take many forms, both administrative and technical, and are intended to provide a level of protection.  While many controls are general in nature (e.g. data encryption), some controls are defined after a threat is identified so they may be properly tailored against the specific threat.  For example, the threat of a critical system failure can be mitigated by high-availability techniques (e.g. multiple data center deployment and active-active automatic failover).

FRAP – Facilitated Risk Analysis Process

One well-established threat assessment technique is the Facilitated Risk Analysis Process (FRAP).  In this approach, business value drives the threat assessment rather than a pure security/compliance viewpoint.  This is important for several reasons.  First, it leverages the internal experience and expertise of the various teams, rather than relying on external groups to discover the risks and provide necessary controls.  Second, it is a lightweight, workshop-based approach.  This avoids a common problem with organizational threat assessments, which can take months if not years to perform.  Finally, it is based on a qualitative rather than quantitative methodology.  It is often difficult to precisely determine the probability and business impact of any particular risk, so using categories rather than values simplifies the analysis.  As illustrated in Figure 1, a qualitative approach allows the relative grouping of identified risks based on the probability of occurrence vs. the expected business impact.  Any risks in the 'red' area must be addressed, risks categorized as 'orange' should be addressed, and the remainder may be addressed as resources permit.

Figure 1. Likelihood of Occurrence vs. Business Impact
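The qualitative grouping in Figure 1 can be sketched as a simple lookup: likelihood and impact categories map to a red/orange/green priority band.  The category names and thresholds below are illustrative, not prescribed by FRAP itself.

```python
# Illustrative qualitative scales; a real FRAP team defines its own.
LEVELS = {"low": 0, "medium": 1, "high": 2}

def frap_priority(likelihood: str, impact: str) -> str:
    """Map qualitative likelihood/impact to a priority band (assumed thresholds)."""
    score = LEVELS[likelihood] + LEVELS[impact]
    if score >= 3:   # e.g. high/high, high/medium
        return "red: must be addressed"
    if score == 2:   # e.g. medium/medium, high/low
        return "orange: should be addressed"
    return "green: address as resources permit"

# Hypothetical risks from a workshop session.
risks = [
    ("unencrypted PHI at rest", "medium", "high"),
    ("stale TLS configuration", "low", "medium"),
]
for name, likelihood, impact in risks:
    print(f"{name}: {frap_priority(likelihood, impact)}")
```

The point of the qualitative model is exactly this simplicity: the workshop argues about categories, not probabilities, and the banding falls out mechanically.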

Methodology

FRAP is based on a series of threat discovery sessions.  These are typically held by a team assembled for this purpose, derived from across the organization.  In general, the approach is as follows:

  1. The pre-FRAP meeting takes about an hour and includes the business manager, project lead, and facilitator.  The focus is on scope, team construction, modeling forms, definitions, and meeting mechanics.
  2. The FRAP session takes approximately four hours and ideally involves 7 to 15 people.  A facilitated workshop approach is used that addresses "what can happen?" and "what is the consequence if it did?"
  3. FRAP analysis and report generation usually take 4 to 6 days and are completed by the facilitator and scribe.
  4. The post-FRAP read-out session takes about an hour and has the same attendees as the pre-FRAP meeting.

As part of the final readout, the FRAP team should capture all investigated systems, what risks were identified for those systems, and recommended controls/mitigations for those risk areas.

The results of the FRAP are a comprehensive document that identifies threats, assigns priorities to those threats and identifies controls that will help mitigate those threats.

Reference: Peltier, T., “Facilitated Risk Analysis Process (FRAP)”, Auerbach Press (2000)

OCTAVE – Operationally Critical Threat, Asset, and Vulnerability Evaluation

The intended audience for the original OCTAVE method is large organizations with 300 or more employees.  Due to the high overhead associated with the original approach, several variants have been developed to allow smaller groups to conduct security risk assessments without needing to gain full organizational approval (e.g. OCTAVE Allegro).  This approach is similar to the FRAP technique in that facilitated workshops and questionnaires are used to collect and organize the assessment information.

More specifically, it was designed for organizations that:

  • have a multi-layered hierarchy
  • maintain their own computing infrastructure
  • have the ability to run vulnerability evaluation tools
  • have the ability to interpret the results of vulnerability evaluations

Figure 2. OCTAVE Phase-based Approach

The goals of the approach are to:

  • Establish drivers, where the organization develops risk measurement criteria that are consistent with organizational drivers.
  • Profile assets, where the assets that are the focus of the risk assessment are identified and profiled and the assets’ containers are identified.
  • Identify threats, where threats to the assets—in the context of their containers—are identified and documented through a structured process.
  • Identify and mitigate risks, where risks are identified and analyzed based on threat information, and mitigation strategies are developed to address those risks.

The methodology is illustrated in Figure 2 and is broken down further into a series of steps:

Step 1 – Establish Risk Measurement Criteria

A formal measurement approach is identified for every risk that is discovered.  This may be similar to the approach used for FRAP (see Figure 1) or may involve more dimensions, such as breaking business impact into loss of revenue, loss of reputation, and loss of regulatory compliance.  Regardless of the measurement approach, it should be consistently applied across all discovered risk areas.

Step 2 – Create an Information Asset Profile

The assets under consideration include physical, informational, monetary, proprietary, and other types of business value areas.  These should include applications as well as the information/data processed by those applications.  Many large organizations capture this type of information in a configuration management database (CMDB).  The assets may be further categorized by subject area (e.g. formulary, intellectual property, customer data, etc.).

Step 3 – Organize Assets into “Containers” (stored, processed, transferred)

The assets are assigned into "containers" which represent where they will "live" within the organization.  Software systems are typically contained within servers (note: this applies to cloud-based resources as well as on-premise).  Data is often stored in various data stores (e.g. relational databases) that are then implemented on physical disks.  Information may be stored, transferred, and processed in different "containers" during its lifetime.
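Steps 2 and 3 together amount to an inventory that maps each information asset to the containers where it is stored, processed, or transferred.  A minimal sketch of such an inventory follows; all asset and container names are hypothetical.

```python
# Map each asset to its containers, keyed by lifecycle role (stored/processed/transferred).
asset_containers = {
    "customer-data": {
        "stored":      ["orders-db", "nightly-backup-bucket"],
        "processed":   ["billing-service"],
        "transferred": ["partner-sftp"],
    },
    "formulary": {
        "stored":      ["formulary-db"],
        "processed":   ["pricing-engine"],
        "transferred": [],
    },
}

def containers_for(asset: str) -> set:
    """All containers an asset touches: each one is a place a threat can act."""
    roles = asset_containers[asset]
    return set().union(*roles.values())

print(sorted(containers_for("customer-data")))
```

In practice this data often lives in a CMDB, as noted above; the value of restating it per-asset is that Steps 4 and 5 then have a concrete list of locations to walk when identifying areas of concern.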

Step 4 – Identify Areas of Concern

For all collected assets, determine where threats exist and are of concern to the enterprise.  To help focus the discovery effort, decide on what is most important to the business function.  For example, an enterprise focused on manufacturing will be concerned with supply chains, delivery models, customer sales, and through-put.  A financial institution will be concerned with ensuring the integrity of transactions, providing customers with access to their financial data, and timely processing of business events (e.g. trades, contracts, transfers, etc.).

Step 5 – Establish Threat Scenarios

Threat scenarios are an exercise in “what if”.  This means for every identified asset determining what could possibly cause disruption or loss.  This requires both a detailed understanding of the asset in question and the possible risks to that asset.

Step 6 – Identify and Analyze Risks

For each threat scenario, there will be one or more vulnerabilities or risks involved with that threat.  For example, the threat of data loss can be caused by fire (environmental), a data breach exposing sensitive information (human actor), or a flaw in the design of the system itself (systematic).

Step 7 – Select Mitigations

Once all threats and risks are identified for the selected assets, a set of mitigations can be proposed against those threats.  These may include administrative or technical controls.

Reference: Caralli, R.A., et al., "Introducing OCTAVE Allegro," Carnegie Mellon (2007); Alberts, C. & Dorofee, A., "Managing Information Security Risks: The OCTAVE Approach." Boston, MA: Addison-Wesley, 2002 (ISBN 0-321-11886-3).

ISO 27005 – Risk Assessment for Information Systems

Another approach to threat analysis is provided by the ISO standards.  One difference, however, is that the ISO standard doesn't specify, recommend, or even name any specific risk management method.  It does imply a continual process consisting of a structured sequence of activities, some of which are iterative:

  • Establish the risk management context (e.g. the scope, compliance obligations, approaches/methods to be used and relevant policies and criteria such as the organization’s risk tolerance);
  • Quantitatively or qualitatively assess (i.e. identify, analyze and evaluate) relevant information risks, taking into account the information assets, threats, existing controls and vulnerabilities to determine the likelihood of incidents or incident scenarios, and the predicted business consequences if they were to occur, to determine a ‘level of risk’;
  • Treat (i.e. modify [use information security controls], retain [accept], avoid and/or share [with third parties]) the risks appropriately, using those ‘levels of risk’ to prioritize them;
  • Keep stakeholders informed throughout the process; and
  • Monitor and review risks, risk treatments, obligations and criteria on an ongoing basis, identifying and responding appropriately to significant changes.
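The "treat" activity above can be sketched as a rule over a risk register: each entry's level of risk is compared against the organization's risk tolerance to pick one of the four ISO 27005 treatment options.  The numeric scale, threshold, and register entries below are assumptions for illustration.

```python
# Assumed risk appetite: levels at or below this are acceptable as-is.
RISK_TOLERANCE = 6

def choose_treatment(level: int, transferable: bool = False) -> str:
    """Pick a treatment per ISO 27005's options: modify, retain, avoid, or share."""
    if level <= RISK_TOLERANCE:
        return "retain"   # within appetite: accept and monitor
    if transferable:
        return "share"    # e.g. via insurance or a third party
    if level >= 9:
        return "avoid"    # stop the activity that creates the risk
    return "modify"       # apply information security controls

# Hypothetical register entries: (risk, level, can it be shared with a third party?)
register = [("lost laptop", 4, False), ("ransomware", 9, False), ("vendor breach", 8, True)]
for name, level, transferable in register:
    print(name, "->", choose_treatment(name and level, transferable) if False else choose_treatment(level, transferable))
```

The iteration the standard implies then closes the loop: after treatment, levels are re-assessed and the register is re-run on the next monitoring cycle.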

Figure 3. ISO Standard Risk Assessment Approach

The organization should establish and maintain a procedure to identify requirements for:

  • Selection of risk evaluation, impact, and acceptance criteria
  • Definition of scope/boundaries for information security risk management
  • Risk evaluation approach
  • Risk treatment and risk reduction plans
  • Monitoring, review, and improvement of risk plans
  • Asset identification and valuation
  • Risk impact estimation

Reference: ISO/IEC 27005:2018 – Information security risk management

FMEA – Failure Mode and Effects Analysis

Originally developed in the 1950s to study failure problems with military equipment, FMEA has since been revised to apply to a wide array of system reliability assessments.  This approach is a very formal review of assets and all of their possible failure modes.  It is a very thorough analysis, intended to "leave no stone unturned."  As such, it should only be used when highly critical assets are required to be reviewed to this level of detail (e.g. by regulation or business requirements).

The analysis should always start by listing the functions that the design needs to fulfill.  Functions are the starting point of an FMEA, and using functions as the baseline provides the best yield.  After all, a design is only one possible solution for performing the functions that need to be fulfilled.  This way an FMEA can be done on concept designs as well as detail designs, on hardware as well as software, no matter how complex the design.

Worksheets are used to uniformly capture the details of each possible failure mode: (source: Failure Mode and Effect Analysis – Wikipedia)

 

The worksheet columns are: FMEA Ref., Item, Potential Failure Mode, Potential Cause(s)/Mechanism, Mission Phase, Local Effects of Failure, Next Higher Level Effect, System Level End Effect, (P) Probability (estimate), (S) Severity, (D) Detection (Indications to Operator, Maintainer), Detection Dormancy Period, Risk Level P*S (+D), Actions for Further Investigation/Evidence, and Mitigation/Requirements.

An example entry:

  • FMEA Ref.: 1.1.1.1
  • Item: Brake Manifold Ref. Designator 2b, channel A, O-ring
  • Potential failure mode: Internal leakage from channel A to B
  • Potential cause(s)/mechanism: a) O-ring compression set (creep) failure; b) surface damage during assembly
  • Mission phase: Landing
  • Local effects of failure: Decreased pressure to main brake hose
  • Next higher level effect: No left wheel braking
  • System level end effect: Severely reduced aircraft deceleration on ground and side drift; partial loss of runway position control; risk of collision
  • (P) Probability: (C) Occasional
  • (S) Severity: (V) Catastrophic (this is the worst case)
  • (D) Detection: (1) Flight Computer and Maintenance Computer will indicate "Left Main Brake, Pressure Low"
  • Detection dormancy period: Built-In Test interval is 1 minute
  • Risk level P*S (+D): Unacceptable
  • Actions for further investigation/evidence: Check dormancy period and probability of failure
  • Mitigation/requirements: Require redundant independent brake hydraulic channels and/or redundant sealing; classify the O-ring as Critical Part Class 1

Probability ratings:

  • A – Extremely Unlikely (virtually impossible, or no known occurrences on similar products or processes with many running hours)
  • B – Remote (relatively few failures)
  • C – Occasional (occasional failures)
  • D – Reasonably Possible (repeated failures)
  • E – Frequent (failure is almost inevitable)
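The worksheet's "Risk Level P*S (+D)" column is the product of the probability and severity ratings.  One way to compute it is to map the qualitative ratings to numbers, as sketched below; the numeric scales and acceptability threshold are illustrative assumptions, not part of any FMEA standard.

```python
# Illustrative numeric scales for the qualitative FMEA ratings.
PROBABILITY = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}      # Extremely Unlikely .. Frequent
SEVERITY    = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5}  # Minor .. Catastrophic

def risk_level(prob: str, sev: str, threshold: int = 12) -> tuple:
    """Risk level = P * S; flag anything at or above an (assumed) threshold."""
    score = PROBABILITY[prob] * SEVERITY[sev]
    return score, "Unacceptable" if score >= threshold else "Acceptable"

# The brake-manifold example row: Occasional (C) x Catastrophic (V).
score, verdict = risk_level("C", "V")
print(score, verdict)
```

A "(+D)" variant would fold a detection rating into the product (the classic Risk Priority Number is P*S*D); the two-factor form above matches the example row, where detection is tracked separately.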

Reference:  D. H. Stamatis, “Failure Mode and Effect Analysis: FMEA from Theory to Execution.” American Society for Quality, Quality Press (2003)

Attack-Vulnerability-Vector Risk Modeling

Yet another approach to threat assessment is to capture the organizational risk analysis results in an Attack-Vulnerability-Vector risk model.  As illustrated in Figure 4, this approach uses a visual modeling technique to tie assets to risks and controls.  Moreover, it breaks down the information into a collection of attack-vulnerability-vector subgroups that can then be assigned an appropriate control.  In the example below, a physical lock secures the asset.  The analysis shows the ways such a device can be subverted: through picking, decoding, bypassing, or a brute-force attack.  One or more vulnerabilities are leveraged in each attack, such as the use of an angle grinder in a destructive attack.  Mitigations are used to combat vulnerabilities, reducing or eliminating the associated attack vector.

 


Figure 4. Physical Lock Threat Model – Unauthorized Access
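The attack-vulnerability-vector breakdown lends itself to a simple data representation: each vector pairs an attack technique with the vulnerability it exploits, and a mitigation that covers the vulnerability removes the vector.  The sketch below uses the padlock example; the vulnerability names are hypothetical labels for illustration.

```python
# Each attack vector pairs an attack technique with the (hypothetically named)
# vulnerability it exploits, following the padlock example in Figure 4.
vectors = [
    ("picking",     "pin-tolerance"),
    ("decoding",    "pin-tolerance"),
    ("bypassing",   "actuator-access"),
    ("brute-force", "soft-shackle"),
]

def residual_vectors(mitigated_vulns: set) -> list:
    """Attack vectors that survive after the given vulnerabilities are mitigated."""
    return [attack for attack, vuln in vectors if vuln not in mitigated_vulns]

# Hardening the shackle and pins still leaves the bypass path open.
print(residual_vectors({"soft-shackle", "pin-tolerance"}))
```

This makes the model's payoff explicit: once the graph is captured, you can ask which attacks remain after any proposed set of controls, rather than reasoning about each control in isolation.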

 

Conclusion

While threats to our systems and practices will always exist, there are a number of ways to discover and mitigate these risks.  In this post, we looked at several common approaches to the discovery, mitigation, and remediation of software system threats.  In future posts, we will investigate creating security policies in a way that facilitates development, and methods for determining software system attack surfaces.

Security Incident Management https://blogs.perficient.com/2020/02/28/security-incident-management/ https://blogs.perficient.com/2020/02/28/security-incident-management/#respond Fri, 28 Feb 2020 14:00:47 +0000 https://blogs.perficient.com/?p=250838

Security Incident Management

Incident Management can be defined as "effectively managing unexpected disruptive events with the objective of minimizing impacts and restoring normal operations" (1). For security-related incidents, this involves all of the steps prior to, during, and subsequent to an information security incident.  These may have consequences far beyond the restoration of normal service, for example the involvement of law enforcement, notification of customers/clients, and public relations management efforts.  The purpose of establishing incident management is to ensure that the organization anticipates, and prepares ahead of time, the capability to rapidly and effectively respond to outages, attacks, intrusions, and other security-related events.

Incident Response Team

Incident Response Teams (IRTs) are typically well trained to identify an ongoing situation, take immediate protective steps to limit/contain the damage, and ensure capture of forensic evidence to facilitate later research and investigation.  These teams may be full-time business units (e.g. in financial institutions, health-care organizations, government organizations, etc.) or may share other IT security responsibilities (e.g. design/system review, policy education, secure system automation, secure coding practices, hardened operating systems, etc.).  As an analogy, consider the unexpected event of a house fire.  There is a time between ignition and detection, notification of the emergency, the arrival of "first response" in the form of firefighters, and mitigation of the initial incident by extinguishing the flames.  However, after the fire is out there are still a large number of activities that often follow, such as cleaning up the damage to restore normal use of the household, potentially an investigation into the cause of the fire (accidental, natural, or man-made), and some form of remediation for prevention (up to and including law enforcement!).

Likewise, an IT security IRT will have a well-defined approach to managing any exceptional business event that impairs or disrupts normal IT system behavior.

Detection

The initial part of any incident management is based on the detection of an exceptional condition or event.  This may be via an automated intrusion detection system (IDS), observation of abnormal system behavior or resource utilization (e.g. excessive CPU use), reports from customers of a system outage, or other detection mechanisms. Proactive system surveillance and monitoring include several approaches for ensuring the integrity and resilience of any “defense in depth” approach to security awareness.  One such approach is to conduct regular penetration testing against production and non-production systems to look for weaknesses in the firewall, application, data protection, or other security features of the IT organization.

Once a possible event has been detected the appropriate IRT is notified to immediately begin a validity determination (to rule out “false positive” events), capture and protection of log and audit data for forensic analysis, utilization of anti-malware/vulnerability remediation applications, and deployment of other intrusion prevention systems (IPS).

Triage

Once a security incident is detected and verified, it is important to determine the nature of the threat and the appropriate response.  This security “triage” is similar to the treatment of multiply injured patients when there are limited medical resources; the most critical injuries are treated first.  In the case of an IT Security event, the intent is to contain the systematic damage or information leakage to prevent any non-affected systems from becoming compromised.  There is typically a period of time while the IRT determines what the exact attack/incident involves, which makes pre-planning with a set of prepared contingencies absolutely critical.  These plans must be considered well ahead of any potential incident and include a hierarchy of notification points (see Escalation below), establish a set of emergency response procedures intended to minimize exposure, and provide for a business continuity plan that can be executed while the primary IT systems are not available.

Escalation

Security incident management typically involves the determination of the need for additional skills and/or knowledge for resolution.  Given the highly technical nature of IT system management, it is a common practice to establish multiple levels of response to an incident.  For most organizations a three-tiered response structure with defined escalation points is considered sufficient:

Tier 1 – Call Center / Service Desk – This function allows for IT system users to report incidents, system flaws, outages, and other abnormal IT events.  A call center is usually provided with a set of standard procedures to diagnose and evaluate the seriousness of any reported incident.  These can include workarounds for system behavior flaws, assistance in recovery from a given error condition, or other routine assistance.  For situations that are of greater impact, such as an outage, the incident is forwarded to the Tier 2 team for handling.

Tier 2 – Incident Response Team (IRT) – As discussed above, the IRT is a part-time or full-time group that is brought in to handle more serious IT security incidents.  It is important that the Tier 1 team be well trained in when to call upon the Tier 2 response team, given the disruptive nature of such requests.  This training should include the ability to recognize a potential security incident (intrusion, denial-of-service attack, data breach, etc.) and knowledge of the correct point of contact for escalation.

Tier 3 – Technical Support and System Development – The final response team is typically the IT technical teams who have a deep understanding of the network, software, data, and other aspects of the IT environment.  These groups are contacted and brought into an IT security event by the IRT to assist in the identification of the threat, containment of loss, and remediation for future prevention.
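The three-tier structure above boils down to a routing rule: Tier 1 handles routine reports, potential security incidents and higher-impact events go to the Tier 2 IRT, and Tier 3 is pulled in by the IRT rather than contacted directly.  A minimal sketch, where the keyword list and incident descriptions are hypothetical:

```python
# Hypothetical trigger phrases that mark a report as a potential security incident.
SECURITY_KEYWORDS = {"intrusion", "denial-of-service", "data breach"}

def route_incident(description: str, outage: bool = False) -> int:
    """Route a reported incident to the lowest tier able to handle it first."""
    text = description.lower()
    if any(keyword in text for keyword in SECURITY_KEYWORDS):
        return 2  # potential security incident: escalate straight to the IRT
    if outage:
        return 2  # greater-impact events are forwarded by Tier 1
    return 1      # routine flaw/assistance handled by the service desk

print(route_incident("password reset help"))
print(route_incident("suspected data breach in orders DB"))
```

In a real service desk this logic would be embedded in the standard procedures Tier 1 works from; the point is that the escalation criteria are decided ahead of time, not improvised during an incident.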

Root Cause Analysis

Once the immediate situation around an IT security incident is resolved, the next step is to investigate the root cause of the incident.  This can be a straightforward exercise for easily identified causes (e.g. a system design flaw), or it may involve deep analysis of log files, audit records, database changes, and other forensic evidence.  In some cases, it will be necessary to involve law enforcement and the corporate legal department as part of the overall investigation.

  • Collect Data – the sources of information will often be found in system logs, audit records, data change records, file edits, or other persistent records of a system change.  For example, if a user’s credentials have been subverted then any audit records generated after that point can be used to investigate changes made under those credentials.
  • Construct Causal Factor Chart – this exercise is to create a connected list of possible causes for the security event.  One such model is the “fish-bone” chart that has a series of connected causes that directly lead to a specific event.  Alternatively, risk/threat trees can be used to investigate all of the possible causes of an event.
  • Identify Root Cause – The identification of the actual root-cause for a security event may take a great deal of time and investigation.  However, taking action before the root-cause is understood may result in unnecessary business complications and revenue loss.
  • Generate Remediation Approach – As the final step in root-cause analysis, a remediation approach is created and applied.  These ‘controls’ can be for additional detection and monitoring, restriction of a network segment, installation of additional/upgraded firewalls, etc.

Prevention

As a final aspect of security incident management, preventative action should be considered for all known risks.  While not every risk can be mitigated at a reasonable cost, many common system errors can be avoided.  Maintaining a regular operating system patching schedule, ensuring that applications use secure versions of third-party components, reviewing audit logs for irregularities, and deploying intrusion detection and prevention systems are among the many measures that can be taken to reduce or eliminate unexpected system security events.
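These preventative measures lend themselves to a recurring automated check that flags drifting systems before they become incidents.  The control names, thresholds, and system facts below are all hypothetical illustrations.

```python
from datetime import date

# Preventative controls and the predicate each system must satisfy (assumed thresholds).
CONTROLS = {
    "os-patching":  lambda s: (date.today() - s["last_patched"]).days <= 30,
    "deps-current": lambda s: not s["vulnerable_components"],
    "ids-deployed": lambda s: s["ids_enabled"],
}

def audit(system: dict) -> list:
    """Return the names of preventative controls the system currently fails."""
    return [name for name, check in CONTROLS.items() if not check(system)]

# A hypothetical system record, e.g. pulled from a CMDB and a dependency scanner.
system = {
    "last_patched": date.today(),
    "vulnerable_components": ["log4j-2.14"],  # flagged by the scanner
    "ids_enabled": True,
}
print(audit(system))
```

Running such a check on a schedule turns prevention from a one-time hardening exercise into continuous monitoring, which feeds directly back into the detection phase described earlier.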

Reference

  1. Incident Management and Response (WhitePaper)
  2. Homeland Security – IT Security Essential Body of Knowledge
  3. SEI Incident Management Body of Knowledge
DevSecOps Release – Product Owner https://blogs.perficient.com/2020/02/16/devsecops-release-product-owner/ https://blogs.perficient.com/2020/02/16/devsecops-release-product-owner/#respond Sun, 16 Feb 2020 16:44:49 +0000 https://blogs.perficient.com/?p=244080

The Product Owner plays a particularly important role in DevSecOps and release coordination. In this final blog post on DevSecOps and release coordination, we will explore the Product Owner persona.

So far we have met the Release Coordinator, Security Architect, and the Operations Coordinator.  Together with these other three key members of the release team, the Product Owner is responsible for ensuring the product has been built to the requirements.  This typically includes what I term "full-spectrum testing" – unit, functional, regression, security, integration, user acceptance, and performance testing.

In my previous post on DevSecOps and Release Coordination, I defined the concept of “presumptive release.”  The idea is that all candidate releases are targeted for eventual deployment into production.  This makes sense given that business value is derived from a development effort once it is delivered into the hands of the end-users.  To that end, the Product Owner works closely with the other release team members to minimize the amount of overhead and approvals required to deliver value.  This is the primary purpose of the release team – to ensure products are deployed to production safely, efficiently, and securely.

And so, last but not least, please meet Katie our Product Owner!

Introduction: Product Owner

As the Product Owner, I have responsibility and authority over my product, and my primary role is to understand and communicate application functional behavior based on defined business needs. 

I am responsible for reviewing and prioritizing business requests as they relate to my product.  I coordinate regularly with the business stakeholders to update product strategy and positioning in support of overall business goals.  During development, I work closely with the Development Lead to size and assign work to specific sprints according to available resources, target release dates, and team delivery capacity.  For the release process, I am responsible for assuring that the product readiness is complete for release into production.

Product Owner

As discussed previously, there are four key release readiness states required for any system. These four product statuses give the release team confidence that the software candidate has met all defined standards for production release. The release team consists of the Product Owner, Operations Coordinator, Security Architect, and Release Coordinator. This team represents the sole deciders of what is, and is not, released to production.  As illustrated below, the Product Owner is ultimately responsible for ensuring that a product has been properly evaluated, tested, and readied for production.  It should be noted, however, that most of the evaluation is conducted by other teams, such as the development team performing code reviews and unit testing and the business owner performing user acceptance testing.  The Product Owner is expected to be involved in all product readiness assurance activities.

Release Readiness States

Figure 2. Example Release Readiness States
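As a sketch of how this release gate might be enforced in tooling, consider the following. The state names here are illustrative assumptions, not the author's fixed list:

```python
# Sketch of a release gate over four readiness states (state names are
# illustrative; substitute your organization's defined statuses).
READINESS_STATES = ["functionally_tested", "security_verified",
                    "operations_approved", "deployment_verified"]

def release_ready(product_status: dict) -> bool:
    """Return True only if every readiness state has been asserted."""
    return all(product_status.get(state) is True for state in READINESS_STATES)

status = {"functionally_tested": True, "security_verified": True,
          "operations_approved": True, "deployment_verified": False}
print(release_ready(status))  # False: one unmet state blocks the release
```

The point of automating the gate is that the release team asserts each state explicitly; a single unmet status is enough to hold the candidate back.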

During the Release Coordination Meeting, the Product Owner reviews all of the testing and verification results with the release team, discusses any outstanding system issues, and asserts that the product is ready for delivery.

 

Tool Use and Workflow Responsibilities

The Product Owner may use many tools in the execution of her duties.  Some of the possible tools are shown below (e.g. Jira/Confluence for requirements and collaboration).  The workflow responsibilities are also shown and include backlog management, prioritization grooming, agile sprint planning, and development support.

Product Owner Resp 1

Figure 3. Product Owner Backlog Responsibilities and Tool Use

 

Product Owner Resp 2

Figure 4. DevSecOps Product Owner Sprint Planning and Execution

 

Key Artifacts

There are several key artifacts that the Product Owner either uses, tracks, or otherwise manages:

  • User Stories – A full description of a particular piece of application functional behavior. Each story is defined (basic name and description) and detailed (data elements, common behavior, exception handling, interface design) to a degree sufficient for estimation, development, and testing.
  • Business Requests – These requests are collected separately from user stories and evaluated by the Product Owner against the product vision, budget, and delivery schedule.
  • Product Strategy/Vision – This is an overall description of a particular software product, such as a web-site or collection of APIs.  The product strategy outlines the purpose of the product/application, target audience, the value provided to the organization, the near and mid-term vision for product enhancement, and a detailed description of the functional behavior.
  • Prioritized Work Item Backlog – This is a collection of work items (i.e. tasks, or user stories) that are prioritized in a single work backlog.  The Development Lead and the Product Owner work together to groom the backlog items into sprint planning.
  • Sprint Planning – Sprint planning is the core element of agile development.  A sprint content is typically developed at least one sprint ahead of current work.  The intent is to ensure that the team is assigned prioritized work based on the expected level of effort, team capability, resource count, and sprint length (e.g. 2 weeks).
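The capacity-based assignment described above can be sketched as a simple greedy fill: take backlog items in priority order until the team's point capacity for the sprint is reached. The story IDs and point values are hypothetical:

```python
# Greedy sprint fill: assign prioritized backlog items until team capacity
# (in story points) for the sprint is exhausted. Data is illustrative.
def plan_sprint(backlog, capacity_points):
    """backlog: list of (story_id, points) tuples, already in priority order."""
    sprint, used = [], 0
    for story_id, points in backlog:
        if used + points <= capacity_points:
            sprint.append(story_id)
            used += points
    return sprint

backlog = [("US-1", 5), ("US-2", 8), ("US-3", 3), ("US-4", 2)]
print(plan_sprint(backlog, 13))  # ['US-1', 'US-2']
```

Real sprint planning also weighs dependencies and team skills, but the core mechanic is exactly this trade-off between priority order and available capacity.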

Conclusion

It should be clear at this point that release coordination is an essential aspect of software development.  Without some way to govern the release process, it is all but inevitable that increasing system integration will result in issues.  However, there is also a need for agile release practices that permit rapid development and delivery of business value.  In these blog posts, I have presented one approach to balancing these two opposing forces – governance and agility.  The use of various DevSecOps personas illustrates not only the roles and responsibilities of the team members but also how and when they interact.  I hope these posts will help you with your DevSecOps journey!

]]>
https://blogs.perficient.com/2020/02/16/devsecops-release-product-owner/feed/ 0 244080
DevSecOps – Reference Architecture https://blogs.perficient.com/2020/01/28/devsecops-reference-architecture/ https://blogs.perficient.com/2020/01/28/devsecops-reference-architecture/#respond Tue, 28 Jan 2020 21:58:29 +0000 https://blogs.perficient.com/?p=250292

DevSecOps Reference Architecture

When approaching a complex DevSecOps implementation, it is often useful to consider a Reference Architecture as a starting point.  As illustrated in Figure 1, the automation activities can be broken up into three major areas: Continuous Integration (CI), Continuous Deployment (CD) and Continuous Compliance (CC).  Each of these areas encompasses a separate target for DevSecOps implementation.  For example, a team can automate the build, test, and security scan aspects of CI without fully implementing automated deployment.  However, to realize the full benefit of the architecture, it is necessary to have all areas built out and functioning as a unit.  This includes proper integration across the selected pipeline of tools.

There are many tools available in the marketplace to enable automated DevSecOps delivery.  For example, many automated pipelines are managed by CloudBees(TM) Jenkins workflow automation.  Source control is often fulfilled by a variant of ‘git’, such as BitBucket, GitLab, or GitHub.  Automated code scanning can be accomplished with a variety of open-source and commercial products, such as SonarQube, Veracode, Sonatype-Nexus IQ, or Blackduck.  Xebialabs (provider of deployment tools) offers a very nice summary page for most of the currently available tooling.

Devsecops Pipeline Reference Architecture

Figure 1. DevSecOps automated pipeline reference architecture

Beyond tools, it is important to consider how the development, operations, and security processes will integrate with the automated delivery pipeline.  For example, release coordination is critically important to the delivery of products into production environments.  An inefficient release review and approval process can derail the most efficient DevSecOps pipeline.  In the reference architecture, this is represented under the Governance aspect along with operational monitoring and deployment practices.

Continuous Integration (CI)

Continuous Integration automates the repetitive aspects of system building, packaging, unit testing, and security scanning.  This allows development teams to focus energy on fulfilling feature requests rather than on the mechanics of system building.  Moreover, this approach addresses the dreaded “it works on my laptop” problem, where code changes break when combined with contributions from other team members.  Building a full system on every approved submission (e.g. merge to the Development or Master code line) allows for verification of unit test coverage, early detection of security policy violations, integration failures, and collisions with other developers’ code.

Source Control

Source control is a key aspect of CI automation.  The selection of the proper SCM tool facilitates integration with the pipeline workflow such that key events can trigger pipeline activities.  For example, a product such as BitBucket can integrate directly with Jenkins to initiate the CI pipeline build process upon specific events (such as a code merge or a pull request approval).  By selectively choosing these code-level events, the CI pipeline can be leveraged to dramatically speed development.
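The event-driven triggering described above usually reduces to a policy decision inside the webhook handler: which event types, on which branches, start a build. A minimal sketch of that decision (event names and branch names are assumptions, not BitBucket's exact payload values):

```python
# Decide whether a source-control webhook event should trigger the CI
# pipeline. Event and branch names are illustrative placeholders.
TRIGGER_EVENTS = {"pull_request_merged", "push"}
WATCHED_BRANCHES = {"develop", "master"}

def should_trigger_build(event_type: str, branch: str) -> bool:
    """Only selected events on selected branches start a pipeline run."""
    return event_type in TRIGGER_EVENTS and branch in WATCHED_BRANCHES

print(should_trigger_build("pull_request_merged", "develop"))  # True
print(should_trigger_build("push", "feature/login"))           # False
```

Keeping this policy explicit and narrow is what prevents every feature-branch commit from consuming build capacity while still building the shared code lines on every merge.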

Build management

Many, if not most, software systems must be compiled, linked, or otherwise converted from raw code into an executable product.  These steps typically include the resolution of dependencies, such as third-party components or libraries.  The end result of the build phase is a versioned “deployable unit” that serves as the basis of automated deployment.

Automated Unit Test and Test-Driven Development

After the system builds, it is common in DevSecOps pipelines to perform some level of automated unit testing. Teams who practice TDD (test-driven development) are best positioned to leverage automated unit testing, since the unit tests are the first code artifacts created for any given feature.
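In the TDD style, the test below would be written first and the function written to satisfy it; the CI stage then simply re-runs the same suite on every build. The masking function is a hypothetical example, not code from any particular project:

```python
import unittest

# In TDD the test is authored first; this implementation exists to make
# the test pass. Function and test names are illustrative.
def mask_account_number(account: str) -> str:
    """Show only the last four digits of an account number."""
    return "*" * (len(account) - 4) + account[-4:]

class MaskAccountNumberTest(unittest.TestCase):
    def test_masks_all_but_last_four(self):
        self.assertEqual(mask_account_number("123456789"), "*****6789")

if __name__ == "__main__":
    unittest.main()
```

Because the suite runs unattended in the pipeline, a failing test blocks the merge automatically rather than relying on a developer remembering to run it.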

Security and Compliance Assurance

Automation of secure code verification is a core best practice for DevSecOps automation.  There are three automated scans performed for each CI build: secure code practices, third-party component use, and application penetration testing.  The first two scans are often considered “static” code scans as they refer to the code base before compilation.  The third test looks at the end product (such as a web-application) and verifies resistance against common attacks (e.g. OWASP Top Ten vulnerabilities).
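The findings from all three scans typically feed a single pass/fail gate in the pipeline. A minimal sketch, assuming an aggregated severity count per build (the thresholds shown are illustrative policy, not a standard):

```python
# Fail the CI build when aggregated scan findings exceed policy thresholds.
# Severity names and limits are illustrative assumptions.
POLICY = {"critical": 0, "high": 0, "medium": 5}

def passes_security_gate(findings: dict) -> bool:
    """findings: severity -> count, merged from static, component, and
    penetration-test scan results."""
    return all(findings.get(sev, 0) <= limit for sev, limit in POLICY.items())

print(passes_security_gate({"critical": 0, "high": 0, "medium": 3}))  # True
print(passes_security_gate({"critical": 1, "high": 0, "medium": 0}))  # False
```

Encoding the thresholds as data rather than tribal knowledge is what lets the security team tighten policy in one place and have every pipeline pick up the change.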

Continuous Deployment (CD)

In contrast to CI, Continuous Deployment manages the consistent delivery of deployable units into target environments.  This element of DevSecOps automation manages system parameters, connection details, encryption, secrets, and other ‘run-time’ requirements.  Deployment tooling simplifies the management of these values.

Repository Implementation

All deployments should be made from a verified deployable unit.  This is to ensure that each promotion to subsequent environments (e.g. Dev to Test to Production) always leverages the exact same build products.  The use of a repository is therefore critical to ensuring versioning of these deployable units.  Moreover, repositories can manage libraries, components, and other build-time dependencies.
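One common way to enforce the "exact same build products" guarantee is to record a cryptographic digest of the deployable unit at build time and verify it before each promotion. A minimal sketch using SHA-256:

```python
import hashlib

# Verify that the artifact promoted to the next environment is byte-identical
# to the deployable unit recorded in the repository at CI build time.
def verify_deployable_unit(artifact_bytes: bytes, recorded_sha256: str) -> bool:
    return hashlib.sha256(artifact_bytes).hexdigest() == recorded_sha256

unit = b"app-1.4.2 build output"               # illustrative artifact content
digest = hashlib.sha256(unit).hexdigest()       # stored alongside the version
print(verify_deployable_unit(unit, digest))     # True: safe to promote
print(verify_deployable_unit(b"tampered", digest))  # False: block the deploy
```

Commercial repositories such as Nexus or Artifactory perform this check for you, but the principle is the same: promotion moves a verified, versioned binary, never a rebuild.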

Infrastructure as Code

A goal of DevSecOps automation is to reduce or eliminate repetitive tasks from the delivery pipeline.  The target environment for a particular system build should always be in a predictable state.  This avoids time-consuming deployment problems when one environment configuration drifts from another.  By encoding the expected system configuration, secondary installed components, and network settings as code, this uniformity can be enforced at deploy time.

Configuration Management

Along with establishing and maintaining a target environment configuration, the associated system parameters and connection information must be maintained.  As noted above, this is the purpose of configuration and secret management tools.  Decoupling environment configuration information from the deployable unit greatly reduces the likelihood of a deployment error breaking the automation.
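The decoupling described above means the same deployable unit resolves its configuration at deploy time from a store keyed by environment. A minimal sketch (environment names, keys, and hostnames are all illustrative; in practice this data lives in a configuration or secret-management tool, not in code):

```python
# Environment-specific values live outside the deployable unit; the same
# build resolves its configuration at deploy time. All values illustrative.
ENVIRONMENTS = {
    "dev":  {"db_url": "db-dev.internal",  "log_level": "DEBUG"},
    "test": {"db_url": "db-test.internal", "log_level": "INFO"},
    "prod": {"db_url": "db-prod.internal", "log_level": "WARN"},
}

def resolve_config(environment: str) -> dict:
    """Fail fast on an unknown environment rather than deploying blind."""
    if environment not in ENVIRONMENTS:
        raise KeyError(f"unknown environment: {environment}")
    return ENVIRONMENTS[environment]

print(resolve_config("test")["db_url"])  # db-test.internal
```

Failing fast on an unknown environment name is deliberate: a silent fallback default is exactly the kind of drift this practice is meant to eliminate.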

Continuous Compliance (CC) and Governance

The final DevSecOps reference architecture section covers compliance and governance.  For many organizations, compliance with regulatory, industry, or customer demands requires expending valuable time verifying and approving adherence to security policy.  To the greatest extent possible, this compliance audit function should be automated.

Automated Compliance Audit

It is important to verify the configuration state of any target environment to ensure the consistency of servers, networks, and data access points.  Likewise, the security hardening of operating systems, network policy/restrictions, firewall/load balancer configuration, and other system security aspects should be verified before system deployment.  This is particularly necessary for production deployment, but should not be neglected for lower environments.
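An automated pre-deployment audit usually reduces to comparing the actual environment state against a hardening baseline and reporting deviations. A minimal sketch (the checked settings are illustrative examples of hardening controls, not a complete baseline):

```python
# Automated pre-deployment audit: compare actual environment settings
# against a hardening baseline. Checked keys are illustrative examples.
BASELINE = {
    "ssh_root_login": False,
    "tls_min_version": "1.2",
    "firewall_enabled": True,
}

def audit_environment(actual: dict) -> list:
    """Return the settings that deviate from the baseline (empty = compliant)."""
    return [key for key, expected in BASELINE.items()
            if actual.get(key) != expected]

prod = {"ssh_root_login": False, "tls_min_version": "1.0",
        "firewall_enabled": True}
print(audit_environment(prod))  # ['tls_min_version'] -> block the deployment
```

Tools such as Chef InSpec or OpenSCAP apply this same pattern at scale; the pipeline treats a non-empty deviation list as a failed gate for the target environment.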

Release Coordination

As discussed in my post on Release Coordination, a poor system release process can significantly degrade the advantages gained by automating CI and CD operations.  Joining an agile release process to DevSecOps automation is what assures the expected delivery performance.

Environment Monitoring and Control

Support for operations and ongoing system monitoring is the final aspect of the DevSecOps reference architecture. Monitoring enables the operations team to detect and resolve emerging problems before they impact end users.

Conclusion

DevSecOps implementation is a non-trivial event.  Multiple integrated tools, processes, and policies must be aligned.  Leveraging a consistent reference architecture will ensure that all aspects of the automation work closely together.  In this post, I have detailed one such architecture.

]]>
https://blogs.perficient.com/2020/01/28/devsecops-reference-architecture/feed/ 0 250292