TitanML Documentation
TitanHub Dashboard
  • 💡 Overview
    • Guide to TitanML...
    • Need help?
  • 🦮 Guides
  • Getting started
    • Installing iris
    • Sign up & sign in
    • Iris commands
      • Using iris upload
      • iris API
  • 🛫 Titan Takeoff 🛫: Inference Server
    • When should I use the Takeoff Server?
    • Getting started
    • Supported models
    • Using the Takeoff API (Client-side)
    • Chat and Playground UI
    • Shutting down
    • Using a local model
    • Generation Parameters
  • 🎓 Titan Train 🎓: Finetuning Service
    • Quickstart
    • Supported models & Tasks
    • Using iris finetune
      • Benchmark experiments for finetuning
      • A closer look at iris finetune arguments
      • Evaluating the model performance
    • Deploying and Inferencing the model
    • When should I use Titan Train?
  • ✨ Titan Optimise ✨: Knowledge Distillation
    • When should I use Titan Optimise?
    • How to get the most out of Titan Optimise
    • Supported models & Tasks
    • Using iris distil
      • Benchmark experiments for knowledge distillation
      • A closer look at iris distil arguments
      • Monitoring progress
    • Evaluating and selecting a model
    • Deploying the optimal model
      • Which hardware should I deploy to?
      • Pulling the model
      • Inferencing the model
  • 🤓 Other bits!! 🤓
    • Iris roadmap

Guide to TitanML...


These docs are outdated! Please check out https://docs.titanml.co for the latest information on the TitanML platform. If there's anything that's not covered there, please contact us on our Discord.

Titan Takeoff 🛫: Inference Server

What does it do?

  • Quickly experiment with inferencing different LLMs (sketched below)

  • Super fast optimised inference, even on local hardware like CPUs

  • Create inference servers that are local and private (think HF Inference Endpoints, but running locally)

  • Create local versions of ChatGPT & The Playground

All for generative models.

Supported models: Most open-source generative model architectures
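
To make that concrete, a typical session looks something like the sketch below: one command starts a server for an open-source model, and any HTTP client can then query it. The model name, port, and `/generate` route here are illustrative assumptions, not guaranteed defaults; see Getting started and Using the Takeoff API (Client-side) for the authoritative commands and endpoints.

```bash
# Start a local Takeoff inference server. The model name is an
# illustrative assumption -- any supported architecture should work.
iris takeoff --model tiiuae/falcon-7b-instruct --device cpu

# Query the running server from another shell. The port (8000) and
# the /generate route are assumptions; check "Using the Takeoff API
# (Client-side)" for the exact endpoint and payload schema.
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"text": "Explain what an inference server does."}'
```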

Titan Train 🎓: Finetuning Service

What does it do?

  • Fine-tuning of language models

  • Uses QLoRA for highly memory-efficient training

  • Super simple: only a few lines of code (see the sketch after this list)

  • No infrastructure to worry about: everything is hosted by TitanML

Supported models: Both generative and non-generative language models
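
As a rough sketch of how little code is involved: a hosted fine-tuning job is submitted with a single iris command, and TitanML's infrastructure does the rest. Every flag and value below is an illustrative assumption; Using iris finetune and A closer look at iris finetune arguments document the real interface.

```bash
# Submit a hosted QLoRA fine-tuning job. All flags and values here
# are illustrative assumptions -- consult "Using iris finetune" for
# the actual arguments.
iris finetune \
  --model facebook/opt-125m \
  --dataset ./my_training_data.jsonl \
  --task language_modelling
```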

Titan Optimise ✨: Knowledge Distillation

What does it do?

  • Compresses models for Natural Language Understanding (NLU) tasks

  • Helps when latency, memory, or cost is a severe bottleneck

  • Uses the latest compression techniques, like pruning & knowledge distillation, for non-generative tasks (see the sketch below)

Supported models: Non-generative models
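
In the same spirit, a distillation job is kicked off with one command and then followed from TitanHub, as covered in Monitoring progress. The flags and values below are illustrative assumptions; Using iris distil and A closer look at iris distil arguments describe the actual arguments.

```bash
# Submit a knowledge-distillation job for a non-generative model.
# All flags and values are illustrative assumptions -- see
# "Using iris distil" for the actual arguments.
iris distil \
  --model bert-base-uncased \
  --dataset ./sst2_train.csv \
  --task sequence_classification
```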
