Container Cost Allocation

Overview

Datadog Cloud Cost Management (CCM) automatically allocates the costs of your cloud clusters to individual services and workloads running in those clusters. Use cost metrics enriched with tags from pods, nodes, containers, and tasks to visualize container workload cost in the context of your entire cloud bill.

Clouds: CCM allocates costs of your AWS, Azure, or Google Cloud host instances. A host is a computer (such as an EC2 instance in AWS, a virtual machine in Azure, or a Compute Engine instance in Google Cloud) that is listed in your cloud provider’s cost and usage report and may be running Kubernetes pods.
Resources: CCM allocates costs for Kubernetes clusters and includes cost analysis for many associated resources such as Kubernetes persistent volumes used by your pods.

On the Containers page, CCM displays costs for resources including CPU, memory, and more, depending on the cloud and orchestrator you are using.

Cloud cost allocation table showing requests and idle costs over the past month on the Containers page

Prerequisites

AWS

CCM allocates costs of AWS ECS clusters as well as all Kubernetes clusters, including those managed through Elastic Kubernetes Service (EKS).

The following table lists the supported features and the minimum Agent and Cluster Agent versions required for each.

Feature	Minimum Agent version	Minimum Cluster Agent version
Container Cost Allocation	7.27.0	1.11.0
GPU Container Cost Allocation	7.54.0	7.54.0
AWS Persistent Volume Allocation	7.46.0	1.11.0
Data Transfer Cost Allocation	7.58.0	7.58.0

Configure the AWS Cloud Cost Management integration on the Cloud Costs Setup page.
For Kubernetes support, install the Datadog Agent in a Kubernetes environment and ensure that you enable the Orchestrator Explorer in your Agent configuration.
For AWS ECS support, set up Datadog Container Monitoring in ECS tasks.
Optionally, enable AWS Split Cost Allocation for usage-based ECS allocation.
To enable storage cost allocation, set up EBS metric collection.
To enable GPU container cost allocation, install the Datadog DCGM integration.
To enable data transfer cost allocation, set up Cloud Network Monitoring. Note: Additional charges apply.

Azure

CCM allocates costs of all Kubernetes clusters, including those managed through Azure Kubernetes Service (AKS).

The following table lists the supported features and the minimum Agent and Cluster Agent versions required for each.

Feature	Minimum Agent version	Minimum Cluster Agent version
Container Cost Allocation	7.27.0	1.11.0
GPU Container Cost Allocation	7.54.0	7.54.0

Configure the Azure Cost Management integration on the Cloud Costs Setup page.
Install the Datadog Agent in a Kubernetes environment and ensure that you enable the Orchestrator Explorer in your Agent configuration.
To enable GPU container cost allocation, install the Datadog DCGM integration.

Google Cloud

CCM allocates costs of all Kubernetes clusters, including those managed through Google Kubernetes Engine (GKE).

The following table lists the supported features and the minimum Agent and Cluster Agent versions required for each.

Feature	Minimum Agent version	Minimum Cluster Agent version
Container Cost Allocation	7.27.0	1.11.0
GPU Container Cost Allocation	7.54.0	7.54.0

Configure the Google Cloud Cost Management integration on the Cloud Costs Setup page.
Install the Datadog Agent in a Kubernetes environment and ensure that you enable the Orchestrator Explorer in your Agent configuration.
To enable GPU container cost allocation, install the Datadog DCGM integration.

Allocate costs

Cost allocation divides host compute and other resource costs from your cloud provider among the individual tasks or pods associated with them. These divided costs are then enriched with tags from related resources so you can break down costs by any associated dimension.

Use the allocated_resource tag to visualize the spend resource associated with your costs at various levels, including the Kubernetes node, container orchestration host, storage volume, or entire cluster level.

These divided costs are enriched with tags from nodes, pods, tasks, and volumes. You can use these tags to break down costs by any associated dimensions.

AWS

Compute

For Kubernetes compute allocation, a Kubernetes node is joined with its associated host instance costs. The node’s cluster name and all node tags are added to the entire compute cost for the node. This allows you to associate cluster-level dimensions with the cost of the instance, without considering the pods scheduled to the node.

Next, Datadog looks at all of the pods that ran on that node during the day. The cost of the node is allocated to each pod based on the resources it used and the length of time it ran. This calculated cost is enriched with all of the pod’s tags.

Note: Only tags from pods and nodes are added to cost metrics. To include labels, enable labels as tags for nodes and pods.

All other costs are given the same value and tags as the source metric aws.cost.amortized.

Persistent volume storage

For Kubernetes Persistent Volume storage allocation, Persistent Volumes (PV), Persistent Volume Claims (PVC), nodes, and pods are joined with their associated EBS volume costs. All associated PV, PVC, node, and pod tags are added to the EBS volume cost line items.

Next, Datadog looks at all of the pods that claimed the volume on that day. The cost of the volume is allocated to a pod based on the resources it used and the length of time it ran. These resources include the provisioned capacity for storage, IOPS, and throughput. This allocated cost is enriched with all of the pod’s tags.

AWS ECS on EC2

For ECS allocation, Datadog determines which tasks ran on each EC2 instance used for ECS. If you enable AWS Split Cost Allocation, the metrics allocate ECS costs by usage instead of reservation, providing more granular detail.

Based on the resources each task has used, Datadog assigns the appropriate portion of the instance’s compute cost to that task. The calculated cost is enriched with all of the task’s tags and with the tags of all containers running in the task (except container names).

AWS ECS on Fargate

ECS tasks that run on Fargate are already fully allocated in the AWS Cost and Usage Report (CUR). CCM enriches that data by adding out-of-the-box tags and container tags to the AWS Fargate cost.

Data transfer

For Kubernetes data transfer allocation, a Kubernetes node is joined with its associated data transfer costs from the CUR. The node’s cluster name and all node tags are added to the entire data transfer cost for the node. This allows you to associate cluster-level dimensions with the cost of the data transfer, without considering the pods scheduled to the node.

Next, Datadog examines the daily workload resources running on that node. The node’s data transfer cost is allocated to each workload according to its network traffic volume. This calculated cost is enriched with all of the workload resource’s tags.
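
As a rough illustration of allocating in proportion to traffic volume, the following Python sketch splits one node's data transfer cost across its workloads. It is not Datadog's implementation: the function name allocate_by_traffic, the byte counts, and the workload names are all hypothetical.

```python
def allocate_by_traffic(
    node_transfer_cost: float, bytes_by_workload: dict[str, int]
) -> dict[str, float]:
    """Split a node's data transfer cost in proportion to each workload's traffic."""
    total_bytes = sum(bytes_by_workload.values())
    if total_bytes == 0:
        # No monitored traffic: nothing can be allocated.
        return {}
    return {
        workload: node_transfer_cost * sent / total_bytes
        for workload, sent in bytes_by_workload.items()
    }


# Hypothetical daily traffic (in bytes) for two workloads on one node.
allocated = allocate_by_traffic(10.0, {"checkout": 750, "search": 250})
```

In this sketch, a workload that generated three quarters of the node's monitored traffic receives three quarters of the node's data transfer cost.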

Note: Only tags from pods and nodes are added to cost metrics. To include labels, enable labels as tags for nodes and pods.

Cloud Network Monitoring must be enabled on all AWS hosts to allow accurate data transfer cost allocation. If some hosts do not have Cloud Network Monitoring enabled, the data transfer costs for these hosts are not allocated and may appear as an n/a bucket depending on filter and group-by conditions.

Datadog supports data transfer cost allocation for the six standard workload resources only. For custom workload resources, data transfer costs can be allocated down to the cluster level only, not to the node or namespace level.

Azure

Compute

Note: Only tags from pods and nodes are added to cost metrics. To include labels, enable labels as tags for nodes and pods.

All other costs are given the same value and tags as the source metric azure.cost.amortized.

Google Cloud

Compute

Note: Only tags from pods and nodes are added to cost metrics. To include labels, enable labels as tags for nodes and pods.

All other costs are given the same value and tags as the source metric gcp.cost.amortized.

Agentless Kubernetes costs

To view the costs of GKE clusters without enabling Datadog Infrastructure Monitoring, use GKE cost allocation. Enable GKE cost allocation on unmonitored GKE clusters to access this feature set.

Limitations and differences from the Datadog Agent

There is no support for tracking workload idle costs.
The cost of individual pods is not tracked, only the aggregated cost of a workload and the namespace. There is no pod_name tag.
GKE enriches data using pod labels only and ignores any Datadog tags you add.
The full list of limitations can be found in the official GKE documentation.

To enable GKE cost allocation, see the official GKE documentation.

Understanding spend

Use the allocated_spend_type tag to visualize the spend category associated with your costs at various levels, including the Kubernetes node, container orchestration host, storage volume, or entire cluster level.

AWS

Compute

The cost of a host instance is split into two components: 60% for the CPU and 40% for the memory. If the host instance has GPUs, the cost is split into three components: 95% for the GPU, 3% for the CPU, and 2% for the memory. Each component is allocated to individual workloads based on their resource reservations and usage.

Costs are allocated into the following spend types:

Spend type	Description
Usage	Cost of resources (such as memory, CPU, and GPU) used by workloads, based on the average usage on that day.
Workload idle	Cost of resources (such as memory, CPU, and GPU) that are reserved and allocated but not used by workloads. This is the difference between the total resources requested and the average usage.
Cluster idle	Cost of resources (such as memory, CPU, and GPU) that are not reserved by workloads in a cluster. This is the difference between the total cost of the resources and what is allocated to workloads.
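
To make the three spend types concrete, here is a minimal Python sketch for a single host without GPUs. It is an illustration under stated assumptions, not Datadog's implementation: the Workload class, the split_node_cost function, and the example numbers are hypothetical, requests and usage are expressed as fractions of the node's capacity, and average usage is assumed never to exceed the request.

```python
from dataclasses import dataclass

# Component weights stated in the text for hosts without GPUs.
CPU_WEIGHT = 0.60
MEMORY_WEIGHT = 0.40


@dataclass
class Workload:
    """Hypothetical daily summary for one workload on a node.

    All values are fractions of the node's total capacity (0.0 to 1.0).
    """

    name: str
    cpu_requested: float
    cpu_used: float
    mem_requested: float
    mem_used: float


def split_node_cost(node_cost: float, workloads: list[Workload]) -> dict[str, float]:
    """Split a node's daily cost into usage, workload idle, and cluster idle.

    Simplifying assumption: average usage never exceeds the request.
    """
    cpu_cost = node_cost * CPU_WEIGHT
    mem_cost = node_cost * MEMORY_WEIGHT

    # Cost of what workloads actually used on average.
    usage = sum(w.cpu_used * cpu_cost + w.mem_used * mem_cost for w in workloads)
    # Cost of everything workloads reserved.
    requested = sum(
        w.cpu_requested * cpu_cost + w.mem_requested * mem_cost for w in workloads
    )
    return {
        "usage": usage,
        "workload_idle": requested - usage,  # reserved but not used
        "cluster_idle": node_cost - requested,  # not reserved by any workload
    }


# Hypothetical node costing 100 per day, running two workloads.
result = split_node_cost(
    100.0,
    [
        Workload("api", cpu_requested=0.50, cpu_used=0.25, mem_requested=0.50, mem_used=0.25),
        Workload("worker", cpu_requested=0.25, cpu_used=0.25, mem_requested=0.25, mem_used=0.25),
    ],
)
```

The three values always sum to the node's cost, which is why every dollar of the host appears under exactly one spend type. For a host with GPUs, the same idea would apply with the 95:3:2 weights described above.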

Persistent volume

The cost of an EBS volume has three components: IOPS, throughput, and storage. Each is allocated according to a pod’s usage when the volume is mounted.

Spend type	Description
Usage	Cost of provisioned IOPS, throughput, or storage used by workloads. Storage cost is based on the maximum amount of volume storage used that day, while IOPS and throughput costs are based on the average amount of volume storage used that day.
Workload idle	Cost of provisioned IOPS, throughput, or storage that are reserved and allocated but not used by workloads. Storage cost is based on the maximum amount of volume storage used that day, while IOPS and throughput costs are based on the average amount of volume storage used that day. This is the difference between the total resources requested and the average usage. Note: This tag is only available if you have enabled `Resource Collection` in your AWS Integration. To prevent being charged for `Cloud Security Posture Management`, ensure that during the `Resource Collection` setup, the `Cloud Security Posture Management` box is unchecked.
Cluster idle	Cost of provisioned IOPS, throughput, or storage that are not reserved by any pods that day. This is the difference between the total cost of the resources and what is allocated to workloads.

Note: Persistent volume allocation is only supported in Kubernetes clusters, and is only available for pods that are part of a Kubernetes StatefulSet.

Data transfer

Costs are allocated into the following spend types:

Spend type	Description
Usage	Cost of data transfer that is monitored by Cloud Network Monitoring and allocated.
Not monitored	Cost of data transfer not monitored by Cloud Network Monitoring. This cost is not allocated.

Azure

Compute

Costs are allocated into the following spend types:

Spend type	Description
Usage	Cost of resources (such as memory, CPU, and GPU) used by workloads, based on the average usage on that day.
Workload idle	Cost of resources (such as memory, CPU, and GPU) that are reserved and allocated but not used by workloads. This is the difference between the total resources requested and the average usage.
Cluster idle	Cost of resources (such as memory, CPU, and GPU) that are not reserved by workloads in a cluster. This is the difference between the total cost of the resources and what is allocated to workloads.

Google Cloud

Compute

Costs are allocated into the following spend types:

Spend type	Description
Usage	Cost of resources (such as memory, CPU, and GPU) used by workloads, based on the average usage on that day.
Workload idle	Cost of resources (such as memory, CPU, and GPU) that are reserved and allocated but not used by workloads. This is the difference between the total resources requested and the average usage.
Cluster idle	Cost of resources (such as memory, CPU, and GPU) that are not reserved by workloads in a cluster. This is the difference between the total cost of the resources and what is allocated to workloads.
Not monitored	Cost of resources where the spend type is unknown. To resolve this, install the Datadog Agent on these clusters or nodes.

Understanding resources

Depending on the cloud provider, certain resources may or may not be available for cost allocation.

Resource	Azure	Google Cloud
CPU
Memory
Persistent volumes: Storage resources within a cluster, provisioned by administrators or dynamically, that persist data independently of pod lifecycles.
Managed service fees: Cost of associated fees charged by the cloud provider for managing the cluster, such as fees for managed Kubernetes services or other container orchestration options.
ECS costs	N/A	N/A
Data transfer costs	Limited*	Limited*
GPU
Local storage: Directly-attached storage resources for a node.	Limited*	Limited*

Limited* resources have been identified as part of your Kubernetes spend, but are not fully allocated to specific workloads or pods. These resources are host-level costs, not pod or namespace-level costs, and are identified with allocated_spend_type:<resource>_not_supported.

Cost metrics

When the prerequisites are met, the following cost metrics automatically appear.

AWS

Cost Metric	Description
`aws.cost.amortized.shared.resources.allocated`	EC2 costs allocated by the CPU & memory used by a pod or ECS task, using a 60:40 split for CPU & memory respectively and a 95:3:2 split for GPU, CPU, & memory respectively if a GPU is used by a pod. Also includes allocated EBS costs. Based on `aws.cost.amortized`
`aws.cost.net.amortized.shared.resources.allocated`	Net EC2 costs allocated by CPU & memory used by a pod or ECS task, using a 60:40 split for CPU & memory respectively and a 95:3:2 split for GPU, CPU, & memory respectively if a GPU is used by a pod. Also includes allocated EBS costs. Based on `aws.cost.net.amortized`, if available

Azure

Cost Metric	Description
`azure.cost.amortized.shared.resources.allocated`	Azure VM costs allocated by the CPU & memory used by a pod or container task, using a 60:40 split for CPU & memory respectively and a 95:3:2 split for GPU, CPU, & memory respectively if a GPU is used by a pod. Also includes allocated Azure costs. Based on `azure.cost.amortized`

Google Cloud

Cost Metric	Description
`gcp.cost.amortized.shared.resources.allocated`	Google Compute Engine costs allocated by the CPU & memory used by a pod, using 60:40 split for CPU & memory respectively and a 95:3:2 split for GPU, CPU, & memory respectively if a GPU is used by a pod. This allocation method is used when the bill does not already provide a specific split between CPU and memory usage. Based on `gcp.cost.amortized`

These cost metrics include all of your cloud costs. This allows you to continue visualizing all of your cloud costs at one time.

For example, say you have the tag team on a storage bucket, a cloud provider managed database, and Kubernetes pods. You can use these metrics to group costs by team, which includes the costs for all three.
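
The grouping described in this example can be sketched in Python. This is only an illustration of the idea, not how Datadog stores or queries cost data: the rows, tag values, and the group_cost_by_tag helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical cost rows: (resource description, tags, cost).
rows = [
    ("storage bucket", {"team": "payments"}, 12.0),
    ("managed database", {"team": "payments"}, 30.0),
    ("kubernetes pod", {"team": "payments", "kube_namespace": "checkout"}, 8.0),
    ("kubernetes pod", {"team": "search"}, 5.0),
]


def group_cost_by_tag(rows, tag_key):
    """Sum cost per value of tag_key; rows without the tag go to 'n/a'."""
    totals = defaultdict(float)
    for _resource, tags, cost in rows:
        totals[tags.get(tag_key, "n/a")] += cost
    return dict(totals)


by_team = group_cost_by_tag(rows, "team")
```

Because allocated container costs carry the same tag as the other cloud resources, the total for a team covers the bucket, the database, and the pods together.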

Applying tags

Datadog consolidates and applies the following tags from various sources to cost metrics.

AWS

Kubernetes

In addition to Kubernetes pod and Kubernetes node tags, the following out-of-the-box tags are applied to cost metrics (this list is not exhaustive):

Out-of-the-box tag	Description
`orchestrator:kubernetes`	The orchestration platform associated with the item is Kubernetes.
`kube_cluster_name`	The name of the Kubernetes cluster.
`kube_namespace`	The namespace where workloads are running.
`kube_deployment`	The name of the Kubernetes Deployment.
`kube_stateful_set`	The name of the Kubernetes StatefulSet.
`pod_name`	The name of any individual pod.

Conflicts are resolved by favoring higher-specificity tags such as pod tags over lower-specificity tags such as host tags. For example, a Kubernetes pod tagged service:datadog-agent running on a node tagged service:aws-node results in a final tag service:datadog-agent.
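
The precedence rule can be sketched as a simple dictionary merge in Python. This is a simplified illustration rather than Datadog's implementation: it assumes each tag key holds a single value, and the merge_tags function and the kube_cluster_name and kube_namespace values are hypothetical.

```python
def merge_tags(host_tags: dict[str, str], pod_tags: dict[str, str]) -> dict[str, str]:
    """Merge tags so that pod tags override host tags when keys conflict."""
    merged = dict(host_tags)  # start with the lower-specificity tags
    merged.update(pod_tags)   # higher-specificity tags win on conflict
    return merged


node_tags = {"service": "aws-node", "kube_cluster_name": "prod"}
pod_tags = {"service": "datadog-agent", "kube_namespace": "monitoring"}
final_tags = merge_tags(node_tags, pod_tags)
```

Here the conflicting service key resolves to the pod's value, while tags that exist only on the node are kept.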

Persistent volume

In addition to Kubernetes pod and Kubernetes node tags, the following out-of-the-box tags are applied to cost metrics.

Out-of-the-box tag	Description
`persistent_volume_reclaim_policy`	The Kubernetes Reclaim Policy on the Persistent Volume.
`storage_class_name`	The Kubernetes Storage Class used to instantiate the Persistent Volume.
`volume_mode`	The Volume Mode of the Persistent Volume.
`ebs_volume_type`	The type of the EBS volume. Can be `gp3`, `gp2`, or others.

Amazon ECS

In addition to ECS task tags, the following out-of-the-box tags are applied to cost metrics.

Note: Most tags from ECS containers are applied (excluding container_name).

Out-of-the-box tag	Description
`orchestrator:ecs`	The orchestration platform associated with the item is AWS ECS.
`ecs_cluster_name`	The name of the ECS cluster.
`is_aws_ecs`	All costs associated with running ECS.
`is_aws_ecs_on_ec2`	All EC2 compute costs associated with running ECS on EC2.
`is_aws_ecs_on_fargate`	All costs associated with running ECS on Fargate.

Data transfer

The following out-of-the-box tags are applied to cost metrics associated with Kubernetes workloads:

Out-of-the-box tag	Description
`source_availability_zone`	The availability zone name where data transfer originated.
`source_availability_zone_id`	The availability zone ID where data transfer originated.
`source_region`	The region where data transfer originated.
`destination_availability_zone`	The availability zone name where data transfer was sent.
`destination_availability_zone_id`	The availability zone ID where data transfer was sent.
`destination_region`	The region where data transfer was sent.
`allocated_resource:data_transfer`	The tracking and allocation of costs associated with data transfer activities.

In addition, some Kubernetes pod tags that are common between all pods on the same node are also applied.

Azure

Kubernetes

In addition to Kubernetes pod and Kubernetes node tags, the following out-of-the-box tags are applied to cost metrics (this list is not exhaustive):

Out-of-the-box tag	Description
`orchestrator:kubernetes`	The orchestration platform associated with the item is Kubernetes.
`kube_cluster_name`	The name of the Kubernetes cluster.
`kube_namespace`	The namespace where workloads are running.
`kube_deployment`	The name of the Kubernetes Deployment.
`kube_stateful_set`	The name of the Kubernetes StatefulSet.
`pod_name`	The name of any individual pod.
`allocated_resource:data_transfer`	The tracking and allocation of costs associated with data transfer activities used by Azure services or workloads.
`allocated_resource:local_storage`	The tracking and allocation of costs at a host level associated with local storage resources used by Azure services or workloads.

Google Cloud

Kubernetes

In addition to Kubernetes pod and Kubernetes node tags, the following out-of-the-box tags are applied to cost metrics (this list is not exhaustive):

Out-of-the-box tag	Description
`orchestrator:kubernetes`	The orchestration platform associated with the item is Kubernetes.
`kube_cluster_name`	The name of the Kubernetes cluster.
`kube_namespace`	The namespace where workloads are running.
`kube_deployment`	The name of the Kubernetes Deployment.
`kube_stateful_set`	The name of the Kubernetes StatefulSet.
`pod_name`	The name of any individual pod.
`allocated_spend_type:not_monitored`	The tracking and allocation of Agentless Kubernetes costs associated with resources used by Google Cloud services or workloads that the Datadog Agent is not monitoring.
`allocated_resource:data_transfer`	The tracking and allocation of costs associated with data transfer activities used by Google Cloud services or workloads.
`allocated_resource:gpu`	The tracking and allocation of costs at a host level associated with GPU resources used by Google Cloud services or workloads.
`allocated_resource:local_storage`	The tracking and allocation of costs at a host level associated with local storage resources used by Google Cloud services or workloads.