TitanML Documentation

Supported models & Tasks


Last updated 1 year ago

These docs are outdated! Please check out https://docs.titanml.co for the latest information on the TitanML platform. If there's anything that's not covered there, please contact us on our Discord.

Titan Optimise was created to optimise non-generative language models and tasks. For generative model optimisation, check out Titan Takeoff.

Supported tasks

  • Text classification - sequence_classification or glue

    Note that GLUE, the General Language Understanding Evaluation benchmark, is a collection of sequence classification tasks used to evaluate natural language understanding systems - so you'd have to specify which columns in your dataset contain the sequences which are to be classified, as well as how many labels there are. Using glue as the task and the GLUE task as the dataset is a handy shortcut.

  • Question answering - question_answering

    The most common datasets for question answering are the SQuAD datasets, but TitanML does support others. If you decide to use a different training dataset, you must indicate to Iris whether your dataset contains unanswerable questions (see how to do this here).

  • Token classification - token_classification

    TitanML also supports classification tasks involving individual tokens (including Named Entity Recognition). As with sequence classification, you must indicate which columns in your input dataset are to be classified, and how many labelled classes they are to be classified into.

  • Causal language modelling - language_modelling

    Causal language modelling is how large language models like GPT-4 and Claude are trained. The model learns to predict the next word (technically, token) given a string of previous words (tokens). TitanML supports language modelling for LLMs like OPT and pythia. Large models will automatically use state-of-the-art parameter-efficient training. See below for supported models.

  • Conditional language modelling (sequence to sequence) - language_modelling

    TitanML also supports conditional language modelling as a task. Conditional language modelling (also known as sequence-to-sequence modelling) involves producing output tokens conditioned on both previous tokens and an additional input sequence. Examples include translation and summarization. Provide language_modelling as the iris task, and the platform will automatically deduce the task type from the model used. See below for supported models.
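
To make the expected inputs concrete, here is a sketch of what a single record might look like for each of the non-generative tasks above. The column names (text, label, question, context, answers, tokens, ner_tags) follow common Hugging Face dataset conventions and are illustrative assumptions, not fixed iris requirements.

```python
# Illustrative record shapes for the non-generative tasks.
# Column names follow common Hugging Face dataset conventions;
# they are examples, not fixed iris requirements.

# sequence_classification: one text column plus an integer label.
seq_cls_record = {"text": "The film was a delight.", "label": 1}

# question_answering (SQuAD v2 style): unanswerable questions carry
# an empty answers list and are flagged explicitly, which is why iris
# needs to know whether your dataset contains them.
qa_record = {
    "question": "Who wrote the report?",
    "context": "The report was written by the audit team in 2021.",
    "answers": {"text": ["the audit team"], "answer_start": [26]},
    "is_impossible": False,
}

# token_classification (e.g. NER): one label id per input token.
token_cls_record = {
    "tokens": ["TitanML", "is", "based", "in", "London"],
    "ner_tags": [3, 0, 0, 0, 5],  # integer ids into a label list
}

# The token and label columns must stay aligned one-to-one.
assert len(token_cls_record["tokens"]) == len(token_cls_record["ner_tags"])
```

Whatever your columns are called, the key point is the same: tell iris which columns hold the inputs, which hold the labels, and (for question answering) whether unanswerable examples are present.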

Supported models

The TitanML platform supports optimising models for sequence_classification, token_classification, and question_answering; language_modelling is coming soon! Only the following model families are supported.

  • BERT (make sure you use BERT uncased and not cased!)

  • ELECTRA

  • DeBERTa (V1, V2, V3)

  • ALBERT

Never use DistilBERT again!

There's usually little reason to use DistilBERT! You'll typically get far better results by starting with a much better and larger model, like BERT or ELECTRA, and then using TitanML to compress it down to a similar size to DistilBERT.
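
To put rough numbers on that: the commonly quoted approximate parameter counts are about 110M for BERT-base and about 66M for DistilBERT, so distilling a stronger base model down to a similar footprint sacrifices little in size while starting from a better teacher. (These are the widely cited approximate figures, not exact counts.)

```python
# Commonly quoted approximate parameter counts, in millions.
bert_base_m = 110
distilbert_m = 66

# Fractional size reduction DistilBERT achieves relative to BERT-base.
reduction = 1 - distilbert_m / bert_base_m
print(f"DistilBERT is ~{reduction:.0%} smaller than BERT-base")  # ~40%
```

Titan Optimise targets a comparable compression ratio, but applied to whichever (stronger) base model you start from.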

Are there models that you use that you would like to see supported? Please let us know at hello@titanml.co or on the Discord.
