Tutorial
Oct 28, 2024
ML apps at scale: ASGI support now available on Cerebrium
Kyle Gani
Senior Technical Product Manager
Looking to deploy machine learning models in production? Struggling with real-time inference or model serving at scale? Trying to run Gradio and Streamlit apps internally, but finding it cumbersome? Cerebrium now supports ASGI (Asynchronous Server Gateway Interface) applications, solving common MLOps challenges around model deployment, scalability, and real-time processing.
Let's explore what ASGI support on the Cerebrium platform unlocks for you (including examples), as well as how to build and deploy an ASGI application in under 5 minutes.
What ASGI support enables you to do:
ASGI is the backbone of modern Python web applications, enabling them to handle many concurrent connections efficiently. With Cerebrium's ASGI support, you now have complete control over how your ML applications handle requests, process data, and run inference. This means you can build everything from real-time streaming applications to complex ML pipelines, all while maintaining cost efficiency and performance.
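Under the hood, an ASGI application is just an async callable that receives a connection scope plus receive/send channels; frameworks like FastAPI and Starlette build on this interface. Here's a minimal sketch of the raw interface (you'd normally let a framework handle this for you):

```python
# A bare-bones ASGI app: an async callable taking (scope, receive, send).
# Run it with any ASGI server, e.g. `uvicorn module_name:app`.
async def app(scope, receive, send):
    assert scope["type"] == "http"
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({
        "type": "http.response.body",
        "body": b"Hello from ASGI",
    })
```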
Example applications you can try now
Implement WebSocket streaming for real-time voice applications. Check out our updated Twilio voice agent example.
Build intuitive dashboards and web interfaces for your applications. Take a look at our Gradio example.
Batch process your requests for cost and performance efficiency (example coming soon).
The best part? Because all of these applications run in the same cluster on Cerebrium, they communicate with each other at ultra-low latency. This means your monitoring dashboard gets instant updates from your model application, your batch processing system can efficiently manage GPU resources, and your real-time applications can maintain consistent performance.
Want to see how this works in practice? Let's build an ASGI FastAPI application. Check out the complete code here.
Deploy your ASGI app to production
Here's how to deploy a FastAPI ASGI application quickly and easily on the Cerebrium platform. Add the following to your main.py file:
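As a minimal sketch (your routes and inference logic will differ), a FastAPI app along these lines works; the /health and /predict routes here are illustrative stand-ins:

```python
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
def health():
    # Simple liveness endpoint, handy for platform health checks
    return {"status": "ok"}

@app.post("/predict")
async def predict(payload: dict):
    # Replace this stub with your model's inference logic;
    # for now it just echoes the request body back
    return {"result": payload}
```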
Next, update your cerebrium.toml file to include the following configurations:
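A rough sketch of what that configuration might look like is below; the exact keys (especially the custom-runtime section) depend on your CLI version, so treat these as indicative and check the Cerebrium docs for the current schema:

```toml
[cerebrium.deployment]
name = "my-asgi-app"
python_version = "3.11"

[cerebrium.hardware]
cpu = 2
memory = 8.0

# Tell Cerebrium to run the app as a custom ASGI server rather than the
# default handler. Key names here are indicative; verify against the docs.
[cerebrium.runtime.custom]
entrypoint = ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8765"]
port = 8765
healthcheck_endpoint = "/health"

[cerebrium.dependencies.pip]
fastapi = "latest"
uvicorn = "latest"
```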
Deploy to production with a single command:
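From your project directory, the Cerebrium CLI handles the build and rollout:

```bash
cerebrium deploy
```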
Lastly, calling your application is as simple as running this command in your terminal (note the placeholders, which you'll have to replace with your own application specifics):
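Assuming a POST /predict route like the one sketched above, the request looks roughly like this; the angle-bracket placeholders and the exact base URL come from your Cerebrium dashboard:

```bash
curl -X POST https://api.cortex.cerebrium.ai/v4/<PROJECT_ID>/<APP_NAME>/predict \
  -H "Authorization: Bearer <CEREBRIUM_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"input": "hello"}'
```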
Need support?
Building an ML startup? We know costs can be challenging when you're just getting started. Reach out to support@cerebrium.ai for additional credits and deployment support.
Want to learn more about deploying ML models in production? Check out our guides.
Need more?
Explore our examples repository
Start with $30 in free credits (no credit card required to sign up). Sign up here.
Join our Discord community for deployment support.
Don't forget to star our example repository and share your ML deployment success stories. Our team is constantly adding new examples based on real-world deployment scenarios.