Six Tests to Determine if Your Cloud Environment is Configured Properly
I’m frequently asked, “How do I know if there is waste in my cloud environment?” or “How do I know if my cloud environment is following best practices?” In response, I have six initial tests to evaluate the state of an already deployed cloud environment. These tests can also be used in reverse, as a checklist of critical requirements for a cloud environment you are deploying for the first time. In this article I’ll walk through each of those six tests and how to evaluate them.
Test #1: Can you map spend back to business services?
The first test is the most crucial, as it directly represents the relationship between your cloud environment and value. A properly deployed cloud environment can efficiently show how cost translates to the business services consuming it. At the very minimum, this shows where the costs are going. To complete the story, each business service should also have a named owner, ideally a cost center, and a way to prove to the business that the cost is producing a return on investment.
The translation of costs to business services and owners is also crucial to driving efficiency from the deployed environment. As a working estimate, assume that 20% to 30% of your spend is waste at any given time; mapping costs to business services is the first step in analyzing where that cost can be reclaimed. I’ve found that applying energy to this optimization pays for itself, especially in larger cloud environments.
The following is an example of looking at costs by business unit and environment type:
So, what information is needed on every deployed resource to facilitate this?
- Business Unit
- Business Service (or Application Name)
- Cost Center (ideally)
- Classification (ideally)
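As a rough illustration of why these tags matter, the following Python sketch rolls up cost by business unit and environment and flags resources that cannot be mapped at all. The tag keys (BusinessUnit, BusinessService, Environment), the resource names, and the cost figures are all assumptions made for the example; a real report would read this data from your cloud provider's billing export.

```python
from collections import defaultdict

# Hypothetical tag keys; use whatever naming standard your organization adopts.
REQUIRED_TAGS = ("BusinessUnit", "BusinessService")


def rollup_costs(resources):
    """Sum monthly cost per (business unit, environment).

    Resources missing a required tag cannot be mapped to a business
    service, so they are returned separately for follow-up.
    """
    totals = defaultdict(float)
    untagged = []
    for res in resources:
        tags = res.get("tags", {})
        if any(not tags.get(key) for key in REQUIRED_TAGS):
            untagged.append(res["name"])
            continue
        key = (tags["BusinessUnit"], tags.get("Environment", "unknown"))
        totals[key] += res["monthly_cost"]
    return dict(totals), untagged


# Illustrative data only, not taken from a real environment.
resources = [
    {"name": "vm-erp-01", "monthly_cost": 420.0,
     "tags": {"BusinessUnit": "Finance", "BusinessService": "ERP", "Environment": "prod"}},
    {"name": "sql-erp-01", "monthly_cost": 310.0,
     "tags": {"BusinessUnit": "Finance", "BusinessService": "ERP", "Environment": "prod"}},
    {"name": "vm-web-dev", "monthly_cost": 55.0,
     "tags": {"BusinessUnit": "Marketing", "BusinessService": "Website", "Environment": "dev"}},
    {"name": "disk-orphan", "monthly_cost": 18.5, "tags": {}},
]

totals, untagged = rollup_costs(resources)
for (unit, env), cost in sorted(totals.items()):
    print(f"{unit:<10} {env:<5} {cost:>8.2f}")
print("Unmapped resources:", untagged)
```

The list of unmapped resources is usually the most useful output: anything that lands there is spend nobody can currently account for.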
Test #2: Are your workloads micro-segmented?
The majority of on-premises environments are NOT micro-segmented. This is a problem, as it facilitates the easy movement of ransomware and other malicious payloads across the environment and has resulted in many, many organizations being compromised over the years. An under-rated, but potentially the MOST important reason to move legacy workloads to the cloud is to micro-segment your workloads. This essentially creates a vertical network for every independent business service in the cloud, with strong firewall rules governing what the application can and can’t talk to.
Why is the move to the cloud the right time to do this? Because to move a legacy app, you already need to know what it talks to, have a validation plan, and test the app. That overhead is exactly why micro-segmentation never got priority in the on-premises environment… because it is hard. The migration makes it slightly less hard, and it modernizes the deployment approach at the same time.
The following is an example of workload micro-segmentation:
What about non-legacy workloads? Modern workloads, such as PaaS, serverless, data services, and queues, are already segmented: the back end of each workload is inaccessible. However, in these cases the tech teams need to get comfortable with a network design that assumes external accessibility and builds the platforms to a higher degree of resilience.
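For the legacy case, the idea of "strong firewall rules governing what the application can and can’t talk to" can be sketched as a default-deny allow-list. This is only a model of the logic, not a real firewall API; the service names and ports are hypothetical, and in practice the rules would live in your cloud provider's network security constructs.

```python
# Hypothetical allow-list of (source service, destination service, port).
# Anything not listed here is denied by default.
ALLOWED_FLOWS = {
    ("erp-web", "erp-app", 443),
    ("erp-app", "erp-db", 1433),
}


def is_flow_allowed(source, destination, port, allowed=ALLOWED_FLOWS):
    """Default deny: a flow passes only if it is explicitly listed."""
    return (source, destination, port) in allowed


def denied_flows(observed, allowed=ALLOWED_FLOWS):
    """Return the observed flows that the segmentation rules would block."""
    return [flow for flow in observed if flow not in allowed]


# Flows discovered while mapping the application before migration
# (illustrative values only).
observed = [
    ("erp-web", "erp-app", 443),
    ("erp-web", "erp-db", 1433),  # web tier bypassing the app tier
    ("hr-app", "erp-db", 1433),   # lateral movement between services
]

for flow in denied_flows(observed):
    print("would be blocked:", flow)
```

Running discovered traffic through a check like this before cutover is one way to use the migration's validation plan to confirm the rules are neither too loose nor too strict.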
Test #3: Do you have a pipeline built for each environment?
A pipeline is a direct relationship between what an environment should be and what it is. The use of pipelines has long been commonplace in mature development organizations, where not using one would be considered gross negligence. However, in legacy VM environments or data environments it is frequently treated as optional, mostly because of a skill gap or the use of packaged applications. That said, there are pipelines available for any environment, including:
- Infrastructure-as-code pipeline (such as VM size, memory, storage) – viable for any application
- Configuration-as-code pipeline (such as app configuration) – viable for some applications
- Application-as-code pipeline (such as actual app code) – viable for custom applications
The opportunity for even legacy workloads when moving to the cloud is to enact strict infrastructure-as-code pipelines, requiring the infrastructure config to live in a source code repository. This mitigates outages caused by mistaken infrastructure changes, as the environment can easily be restored to an earlier, known-good configuration.
Another reason to do this in a cloud environment is that the cloud is ALREADY infrastructure-as-code, and the configuration can be easily exported and placed into a pipeline. Closing the skill gap takes effort, but it is an essential step, as your team needs that competency to operate and govern the cloud environment anyway.
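The "should be versus is" relationship at the heart of a pipeline can be illustrated with a small drift check. The sketch below assumes the desired configuration comes from the source repository and the actual configuration from an export of the deployed environment; the setting names and values are made up for the example, and real infrastructure-as-code tools do this comparison for you.

```python
def find_drift(desired, actual):
    """Compare desired and actual settings.

    Returns {setting: (desired_value, actual_value)} for every mismatch.
    A setting missing on one side is reported with None on that side.
    """
    drift = {}
    for key in sorted(set(desired) | set(actual)):
        if desired.get(key) != actual.get(key):
            drift[key] = (desired.get(key), actual.get(key))
    return drift


# Desired state as it might be stored in the repository (illustrative).
desired = {"vm_size": "Standard_D4s_v5", "disk_gb": 256, "subnet": "snet-erp-app"}

# Actual state as it might be exported from the environment (illustrative).
actual = {"vm_size": "Standard_D8s_v5", "disk_gb": 256,
          "subnet": "snet-erp-app", "public_ip": True}

for setting, (want, have) in find_drift(desired, actual).items():
    print(f"{setting}: repository says {want!r}, environment has {have!r}")
```

Any reported difference is either an unauthorized change to roll back or a legitimate change that still needs to be committed to the repository.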
Test #4: Do your admins require MFA and conditional access?
The majority of administrator accounts in cloud environments do not require multi-factor authentication. Let that sink in. All that is protecting many companies from compromise is a username and password? Pretty scary. If you do only one thing TODAY, validate that your administrator accounts not only use multi-factor authentication, but also have additional forms of conditional access applied. A user with elevated privileges should have the following:
- An account with multi-factor auth enforced
- A non-risky device with a validated managed state
- A device that isn’t compromised
- …and multiple approvals through PIM (Privileged Identity Management) if certain actions are being taken
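To make the combined requirement concrete, here is a minimal Python sketch of the decision logic. It is only a model: actual enforcement belongs in your identity provider's conditional access and PIM configuration, and the field names and the default of two approvals are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class SignIn:
    """Hypothetical summary of an administrator sign-in attempt."""
    mfa_passed: bool
    device_managed: bool
    device_compromised: bool
    approvals: int = 0


def admin_access_decision(sign_in, sensitive_action=False, required_approvals=2):
    """Return (allowed, reasons); every listed condition must hold."""
    reasons = []
    if not sign_in.mfa_passed:
        reasons.append("MFA not satisfied")
    if not sign_in.device_managed:
        reasons.append("device is not in a validated managed state")
    if sign_in.device_compromised:
        reasons.append("device is flagged as compromised")
    if sensitive_action and sign_in.approvals < required_approvals:
        reasons.append("not enough approvals for a sensitive action")
    return (not reasons, reasons)


# A healthy sign-in is allowed for routine work...
print(admin_access_decision(SignIn(True, True, False)))
# ...but the same sign-in is blocked for a sensitive action without approvals.
print(admin_access_decision(SignIn(True, True, False), sensitive_action=True))
```

The point of the model is that the checks are cumulative: passing MFA alone is not sufficient if the device or approval conditions fail.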
Test #5: Are you using a CAF-consistent landing zone structure?
This is specific to Azure, but there are similar structures in other cloud environments. The days of inventing your own best practices approach are gone and the Cloud Adoption Framework is here. The Cloud Adoption Framework enterprise landing zone structure gives a mature approach to a scale-up / scale-down implementation of an Azure environment. The basic tenets of this environment include:
- Management groups for Platform, Landing Zones (think any environment), Decommissioned, and Sandbox environments
- Isolation of Identity, Management, and Connectivity elements
- Landing Zone subscriptions that scale out and are business group or application aligned
- Sandboxes that are easily torn down
An easy picture of this is here:
The delineation between platform and landing zones facilitates least privilege and mitigates the self-inflicted outages that come from combining too much into one environment. It also pairs well with our final topic, which is about the alignment of policy with structure.
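The structure described above can be sketched as a simple tree. The root name "Contoso" and the landing zone names under "Landing Zones" are placeholders, not values prescribed by the Cloud Adoption Framework; the helper just shows how a group's position in the hierarchy can be looked up, for example to decide whether it sits on the platform side or the landing zone side.

```python
# Hypothetical management group hierarchy modeled as nested dicts.
HIERARCHY = {
    "Contoso": {
        "Platform": {"Identity": {}, "Management": {}, "Connectivity": {}},
        "Landing Zones": {"Finance": {}, "Marketing": {}},
        "Decommissioned": {},
        "Sandbox": {},
    }
}


def path_to(name, tree=HIERARCHY, trail=()):
    """Depth-first search for a group; returns the path from the root, or None."""
    for group, children in tree.items():
        current = trail + (group,)
        if group == name:
            return list(current)
        found = path_to(name, children, current)
        if found:
            return found
    return None


def is_platform(name):
    """True if the group is the Platform group or sits beneath it."""
    path = path_to(name)
    return path is not None and "Platform" in path


print(path_to("Connectivity"))
print(is_platform("Finance"))
```

Because access and policy are inherited down the tree, knowing where a group sits is what lets platform teams and application teams be given different, narrower permissions.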
Test #6: Do you have policies enforcing tags, backup, and critical settings?
The final test is whether policy enforces the critical elements of the environment above (such as tags, structure, naming, and infrastructure-as-code) and deploys the security controls tied to the organization’s business rules. Here are examples of what you might use policy for:
- Enforcing tagging requirements
- Enforcing backup for servers
- Enforcing firewall rules
- Enforcing encryption requirements
- Enforcing security classifications
- Enforcing RBAC roles
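A few of the policies above can be expressed as simple rules evaluated against each resource. The sketch below is an assumption-laden illustration of the idea, not the syntax of any real policy engine; the resource fields (type, tags, backup_enabled, encrypted) and the rule names are hypothetical.

```python
# Each policy is a (name, check) pair; check returns True when compliant.
POLICIES = [
    ("required tags present",
     lambda r: {"BusinessUnit", "BusinessService"} <= set(r.get("tags", {}))),
    ("backup enabled on servers",
     lambda r: r.get("type") != "server" or r.get("backup_enabled", False)),
    ("encryption at rest enabled",
     lambda r: r.get("encrypted", False)),
]


def evaluate(resource, policies=POLICIES):
    """Return the names of the policies the resource violates."""
    return [name for name, check in policies if not check(resource)]


# Illustrative resources only.
compliant = {
    "type": "server",
    "tags": {"BusinessUnit": "Finance", "BusinessService": "ERP"},
    "backup_enabled": True,
    "encrypted": True,
}
non_compliant = {
    "type": "server",
    "tags": {"BusinessUnit": "Finance"},
    "encrypted": True,
}

print(evaluate(compliant))
print(evaluate(non_compliant))
```

In a real environment the same rules would be assigned at the management group or subscription level so that they apply automatically to everything deployed beneath it.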
So, in summary, ask your team if these elements have been deployed in your cloud environment. If you don’t have a cloud environment yet, or haven’t deployed governance, think about these elements as a critical part of your enterprise build-out and understand how the cloud provides a unique opportunity to implement controls that don’t exist in on-premises environments.