- Build process: sets up the app environment, which consists of a Python environment with the specified parameters, required apt packages, Conda and Python packages, and any model files. A build is only charged when the environment needs rebuilding, i.e., when a `build` or `deploy` command runs with changed requirements, parameters, or code. Each build step is cached, so subsequent builds cost substantially less than the first.
- App runtime: the time code runs from start to finish on each request. Three cost components apply:
- Cold-start: The time to spin up server(s), load the environment, connect storage, etc. Cerebrium continuously optimizes cold-start latency. Cold-start time is not billed.
- Model initialization: Code outside the request function that only runs on cold start (e.g., loading a model into GPU RAM, importing packages). This time is billed.
- Function runtime: Code inside the request function, executed on every request.
- 24 GB VRAM (A10): $0.000306 per second
- 2 CPU cores: 2 * $0.00000655 per second
- 20 GB memory: 20 * $0.00000222 per second