Using a local model

These docs are outdated! Please check out https://docs.titanml.co for the latest information on the TitanML platform. If there's anything that's not covered there, please contact us on our Discord.

How do I use a model I have saved locally?

If you have already fine-tuned a model, you might want to run it in the Takeoff server instead of a model from the Hugging Face Hub.

To do this, save the model to a subfolder of the local cache folder ~/.iris_cache.

For example, you can do this using the Hugging Face save_pretrained interface:

import os

model = ...     # load your fine-tuned model
tokenizer = ... # load its tokenizer

# save_pretrained does not expand "~", so expand the path explicitly
save_dir = os.path.expanduser('~/.iris_cache/<my_model_folder>')
model.save_pretrained(save_dir)
tokenizer.save_pretrained(save_dir)
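
For a more complete sketch, here is the whole round trip, assuming your fine-tuning run wrote its output to a local folder. The paths ('./my_finetune_output', 'my_finetuned_model') and the AutoModelForCausalLM class are illustrative assumptions; substitute your own checkpoint and the auto class that matches your task.

import os
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical location of your fine-tuned checkpoint
finetuned_dir = './my_finetune_output'
model = AutoModelForCausalLM.from_pretrained(finetuned_dir)
tokenizer = AutoTokenizer.from_pretrained(finetuned_dir)

# Copy the model into the iris cache so the Takeoff server can find it
save_dir = os.path.expanduser('~/.iris_cache/my_finetuned_model')
model.save_pretrained(save_dir)
tokenizer.save_pretrained(save_dir)

With the model saved under ~/.iris_cache/my_finetuned_model, the folder name my_finetuned_model is what you pass to iris takeoff below.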

Now you can start the Takeoff server using:

iris takeoff --model <my_model_folder>
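
Once the server is running, you can send it a generation request from Python. This is a minimal sketch only: the localhost port 8000, the /generate endpoint, and the {'text': ...} payload are assumptions, not confirmed here; see the "Using the Takeoff API (Client-side)" page for the exact interface.

import requests

# Assumed endpoint and payload shape; check the Takeoff API docs
response = requests.post(
    'http://localhost:8000/generate',
    json={'text': 'Summarise the benefits of knowledge distillation:'},
)
print(response.json())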

Caching HF Models

The ~/.iris_cache folder is also where models are saved once they have been downloaded from the Hugging Face Hub. This avoids lengthy repeated downloads of large language models.
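
If you want to see which models are already cached, you can inspect the folder directly. A minimal sketch:

import os

cache_dir = os.path.expanduser('~/.iris_cache')
if os.path.isdir(cache_dir):
    # Each subfolder is a cached download or a locally saved model
    print(sorted(os.listdir(cache_dir)))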
