# Evaluating and selecting a model

{% hint style="danger" %}
These docs are outdated! Please check out <https://docs.titanml.co> for the latest information on the TitanML platform.\
\
If there's anything that's not covered there, please contact us on our [discord](https://discord.com/invite/83RmHTjZgf).
{% endhint %}

On TitanHub (app.titanml.co) you can compare your input models with the Titan-optimised output models. The TitanML engine produces a number of models that depends on the input architecture (e.g. four for BERT), each with a different accuracy / size / latency trade-off. Use this information to decide which trade-off best suits your use case and deployment.

### Baseline Optimised

This model (shown in green) is the baseline model with best-practice lossless optimisations applied, so it loses negligible accuracy. It is typically **6x faster** and roughly **2x smaller** than the baseline.

### Titan M/S/XS

The Titan engine produces a family of compressed models of fixed sizes, shown in blue above: the Titan M, S and XS models. These models typically trade some accuracy against the original model in exchange for their smaller size and lower latency. As a user, you can select the model that is most relevant to your use case.

For example, if you want to deploy a BERT-based model on an edge device, you might select Titan-XS; if you want to achieve good accuracy on a low-powered GPU, you might go with Titan-M. It would typically take a skilled engineer multiple weeks to produce just one of these models; with TitanML you can generate them in a few hours, greatly reducing experimentation time.

The Titan models currently come in three fixed sizes:

<table><thead><tr><th width="121">Model</th><th width="86">Size</th><th width="245">Size reduction vs Bert-large</th><th>Latency reduction vs Bert-large</th></tr></thead><tbody><tr><td>TitanM</td><td>33MB</td><td>40x</td><td>25x</td></tr><tr><td>TitanS</td><td>22MB</td><td>60x</td><td>46x</td></tr><tr><td>TitanXS</td><td>11MB</td><td>120x</td><td>60x</td></tr></tbody></table>
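As a sanity check, the size-reduction factors in the table are consistent with a Bert-large FP32 checkpoint of roughly 1.3GB. Note the baseline size is an assumption here, since this page does not state it:

```python
# Hypothetical sanity check of the table's size-reduction factors.
# Assumes a Bert-large FP32 baseline of ~1340 MB (not stated on this page).
BASELINE_MB = 1340

titan_sizes_mb = {"TitanM": 33, "TitanS": 22, "TitanXS": 11}
claimed_reduction = {"TitanM": 40, "TitanS": 60, "TitanXS": 120}

for name, size_mb in titan_sizes_mb.items():
    factor = BASELINE_MB / size_mb
    print(f"{name}: {size_mb} MB -> {factor:.0f}x smaller "
          f"(table claims {claimed_reduction[name]}x)")
```

Each computed factor lands within a couple of percent of the table's rounded claim.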

## Comparing results

To make your final choice of model, you can use TitanHub to easily compare any of the three improved models with the original baseline model.

Before your compressed models are ready, navigate to **Experiments**, select the experiment whose progress you want to track, and view how far along your compressed models are. Once the distillation process has finished, selecting an experiment shows a graph of accuracy, F1 or loss against model cost or size (according to your choice of metric). Hover over a data point to see its precise statistics:

<figure><img src="https://2010767686-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F62Ho1Wqz3OE1YkIDbG5o%2Fuploads%2FjhIstqiczZNotTresUbt%2FScreenshot%202023-06-16%20at%2016.30.58.png?alt=media&#x26;token=b154ce15-42a8-419d-a484-16c0b4d9d04e" alt=""><figcaption></figcaption></figure>

Click on a data point to open a more detailed results display:

<figure><img src="https://2010767686-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F62Ho1Wqz3OE1YkIDbG5o%2Fuploads%2FqPkDPguNqlNZD0ZnvUmf%2FScreenshot%202023-06-16%20at%2016.44.57.png?alt=media&#x26;token=850d81ee-84f7-486f-8c7a-949b56dc6fd3" alt=""><figcaption></figcaption></figure>

Then, simply choose the compressed model which is most suitable for your cost and performance requirements!


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://titanml.gitbook.io/iris-documentation/titan-optimise-knowledge-distillation/evaluating-and-selecting-a-model.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
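As a minimal sketch, the query above can be issued from a script. The page URL is taken from this document, but the question text and the use of Python's standard library are illustrative:

```python
import urllib.parse
import urllib.request

# Page URL from the documentation above.
PAGE_URL = ("https://titanml.gitbook.io/iris-documentation/"
            "titan-optimise-knowledge-distillation/"
            "evaluating-and-selecting-a-model.md")

def build_ask_url(question: str) -> str:
    """Append the natural-language question as a URL-encoded `ask` parameter."""
    return PAGE_URL + "?" + urllib.parse.urlencode({"ask": question})

# Example question (hypothetical).
url = build_ask_url("What sizes do the Titan compressed models come in?")
print(url)

# To actually fetch the answer (requires network access):
# with urllib.request.urlopen(url) as resp:
#     print(resp.read().decode())
```

URL-encoding the question keeps spaces and punctuation safe in the query string.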
