These docs are outdated! Please see https://docs.titanml.co/docs/category/titan-takeoff for the latest information on the Titan Takeoff server. If anything is not covered there, please contact us on our Discord.

Iris Takeoff is an inference server, built by TitanML, for fast inference and deployment of large language models (LLMs).

Iris Takeoff is designed to make it as easy as possible to start experimenting with and deploying generative text models, with accelerated inference on both CPUs and NVIDIA GPUs.

