What is Iris Takeoff? 🦅

Iris Takeoff is an inference server built by TitanML for fast inference and deployment of large language models (LLMs).

Iris Takeoff is designed to make it as easy as possible to start experimenting with and deploying generative text models, with accelerated inference on both CPUs and NVIDIA GPUs.

Check out the kinds of results you can achieve with the Takeoff server!
