
# Calculating compute cost

> How to calculate the cost of your deployment on Cerebrium

Deployment cost is based on the hardware selected and the execution time. <b>Compute is billed whenever code runs or a machine is configured to stay running</b>. GPU, CPU, and memory usage are billed per second; persistent storage is billed per GB per month. View compute pricing on the [pricing page](https://www.cerebrium.ai/pricing).

Deploying a model incurs two billable processes:

1. **Build process**: sets up the app environment, which is a Python environment with the specified parameters, required apt packages, Conda and Python packages, and any model files.
   A build is charged only when the environment needs rebuilding, i.e., a `build` or `deploy` command runs with changed requirements, parameters, or code. Each build step is cached, so subsequent builds cost substantially less than the first.

2. **App runtime**: the time code runs from start to finish on each request. A request passes through up to three phases, two of which are billed:

* <u>Cold-start</u>: The time to spin up server(s), load the environment,
  connect storage, etc. Cerebrium continuously optimizes cold-start latency.
  <b>Cold-start time is not billed.</b>
* <u>Model initialization</u>: Code outside the request function that only runs
  on cold start (e.g., loading a model into GPU RAM, importing packages). This
  time is billed.
* <u>Function runtime</u>: Code inside the request function, executed on every
  request. This time is billed.
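The split between model initialization and function runtime can be sketched as follows. This is a minimal illustration, not Cerebrium's API: `load_model` and `predict` are hypothetical names, and the "model" is a stand-in that uppercases its input.

```python
import time


def load_model():
    """Hypothetical stand-in for loading weights into GPU memory."""
    time.sleep(0.01)  # simulate a slow, one-time load
    return lambda prompt: prompt.upper()


# Model initialization: module-level code runs once per cold start (billed).
model = load_model()


def predict(prompt: str) -> dict:
    # Function runtime: this body runs on every request (billed).
    return {"result": model(prompt)}


print(predict("hello"))
```

Keeping expensive setup at module level means it is paid for once per cold start rather than once per request.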

**Example cost calculation**

A model deployment requires:

* 24 GB VRAM (A10): \$0.000306 per second
* 2 CPU cores: 2 \* \$0.00000655 per second
* 20 GB memory: 20 \* \$0.00000222 per second
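As a quick sanity check, the three line items above can be combined into a single per-second rate and converted to an hourly figure. The constant names are illustrative; the rates are the ones listed above.

```python
GPU_RATE = 0.000306       # A10, per second
CPU_RATE = 0.00000655     # per core, per second
MEMORY_RATE = 0.00000222  # per GB, per second

# 2 CPU cores and 20 GB of memory, as in the list above
rate_per_second = GPU_RATE + 2 * CPU_RATE + 20 * MEMORY_RATE
rate_per_hour = rate_per_second * 3600

print(f"Per second: ${rate_per_second:.7f}")
print(f"Per hour:   ${rate_per_hour:.4f}")
```

This works out to roughly \$0.00036 per second, or about \$1.31 per hour of billed time. The GPU accounts for most of the rate.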

Assume the app works on the first deployment, incurring a single 2-minute build. The app has 10 cold starts per day with an average initialization of 2 seconds and an average runtime (predict) of 2 seconds. The expected monthly volume is 100,000 inferences.

```python
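# Your variables
average_initialization_time = 2  # seconds
cold_starts_per_month = 300  # 10 a day for 30 days
average_inference_time = 2  # seconds
number_of_inferences = 100000  # number of inferences per month

GPU_cost = 0.000306  # per second
CPU_cost = 0.00000655  # per second per core
memory_cost = 0.00000222  # per second per GB
num_of_cpu_cores = 2
gb_of_RAM = 20
build_seconds = 120  # 2 minutes

# Persistent storage: placeholder values, not real prices.
# Set these to your usage and the rate from the pricing page.
gb_of_persistent_storage = 0  # GB
persistent_storage_cost = 0.0  # per GB per month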

# cost calculation
compute_rate = GPU_cost + (CPU_cost * num_of_cpu_cores) + (memory_cost * gb_of_RAM)

total_build_compute_cost = build_seconds * compute_rate
total_initialization_time = average_initialization_time * cold_starts_per_month
total_inference_time = average_inference_time * number_of_inferences

initialization_compute_cost = total_initialization_time * compute_rate
inference_compute_cost = total_inference_time * compute_rate
storage_cost = gb_of_persistent_storage * persistent_storage_cost

total_cost = inference_compute_cost + storage_cost + total_build_compute_cost + initialization_compute_cost

# Print one line per cost component
print(f"Build compute cost: ${total_build_compute_cost:.2f} (one-time)",
      f"Initialization compute cost: ${initialization_compute_cost:.2f}/month",
      f"Inference compute cost: ${inference_compute_cost:.2f}/month",
      f"Storage cost: ${storage_cost:.2f}/month",
      f"Total cost: ${total_cost:.2f}/month",
      sep="\n")
```
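To compare scenarios, the same arithmetic can be wrapped in a function. The sketch below is illustrative only: `estimate_monthly_compute_cost` is a hypothetical helper, it excludes persistent storage, and the second scenario (ten times as many cold starts) is an assumed variation on the example above.

```python
def estimate_monthly_compute_cost(
    rate_per_second: float,
    build_seconds: float,
    cold_starts: int,
    init_seconds: float,
    inferences: int,
    inference_seconds: float,
) -> dict:
    """Return compute cost per component; persistent storage is excluded."""
    costs = {
        "build": build_seconds * rate_per_second,
        "initialization": cold_starts * init_seconds * rate_per_second,
        "inference": inferences * inference_seconds * rate_per_second,
    }
    costs["total"] = sum(costs.values())
    return costs


# Per-second rate for the A10, 2 CPU cores, and 20 GB of memory
rate = 0.000306 + 2 * 0.00000655 + 20 * 0.00000222

baseline = estimate_monthly_compute_cost(rate, 120, 300, 2, 100_000, 2)
# Assumed variation: 100 cold starts a day instead of 10
more_cold_starts = estimate_monthly_compute_cost(rate, 120, 3000, 2, 100_000, 2)

for name, result in [("baseline", baseline), ("10x cold starts", more_cold_starts)]:
    print(f"{name}: ${result['total']:.2f}/month")
```

With the example's inputs, inference accounts for about \$72.70 of a total near \$72.96, so the build and initialization costs are small by comparison. Even ten times as many cold starts adds only about \$2 per month at a 2-second initialization.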
