# Evaluating and selecting a model

{% hint style="danger" %}
These docs are outdated! Please check out <https://docs.titanml.co> for the latest information on the TitanML platform.\
\
If there's anything that's not covered there, please contact us on our [discord](https://discord.com/invite/83RmHTjZgf).
{% endhint %}

On TitanHub (app.titanml.co) you can compare your input models with the Titan-optimised output models. The TitanML engine produces a number of models that depends on the input architecture (e.g. four for BERT), each with a different accuracy / size / latency trade-off. Use this information to decide which trade-off best suits your use case and deployment.

### Baseline Optimised

This model (shown in green) is the baseline model with best-practice lossless optimisations applied, so it loses negligible accuracy. It is typically **6x faster** and roughly **2x smaller** than the baseline.

### Titan M/S/XS

The Titan engine produces a family of compressed models of fixed sizes, shown in blue above: the Titan M, S and XS models. These models typically trade some accuracy against the original model in exchange for their smaller size and lower latency. As a user, you can select the model that is most relevant to your use case.

For example, if you want to deploy a BERT-based model on an edge device, you might select Titan-XS; if you want to achieve good accuracy on a low-powered GPU, you might go with Titan-M. It would typically take a skilled engineer multiple weeks to produce just one of these models; with TitanML you can generate them in a few hours, greatly reducing experimentation time.

The Titan models currently come in three fixed sizes:

<table><thead><tr><th width="121">Model</th><th width="86">Size</th><th width="245">Size reduction vs Bert-large</th><th>Latency reduction vs Bert-large</th></tr></thead><tbody><tr><td>TitanM</td><td>33MB</td><td>40x</td><td>25x</td></tr><tr><td>TitanS</td><td>22MB</td><td>60x</td><td>46x</td></tr><tr><td>TitanXS</td><td>11MB</td><td>120x</td><td>60x</td></tr></tbody></table>
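As a sanity check, the size-reduction factors in the table are consistent with a Bert-large FP32 checkpoint of roughly 1.3GB. Note the baseline size is an assumption here, since this page does not state it:

```python
# Hypothetical sanity check of the table's size-reduction factors.
# Assumes a Bert-large FP32 baseline of ~1340 MB (not stated on this page).
BASELINE_MB = 1340

titan_sizes_mb = {"TitanM": 33, "TitanS": 22, "TitanXS": 11}
claimed_reduction = {"TitanM": 40, "TitanS": 60, "TitanXS": 120}

for name, size_mb in titan_sizes_mb.items():
    factor = BASELINE_MB / size_mb
    print(f"{name}: {size_mb} MB -> {factor:.0f}x smaller "
          f"(table claims {claimed_reduction[name]}x)")
```

Each computed factor lands within a couple of percent of the table's rounded claim.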

## Comparing results

To make your final choice of model, you can use TitanHub to easily compare any of the three improved models with the original baseline model.

Before your compressed models are ready, navigate to **Experiments**, select the experiment whose progress you want to track, and view how far along your compressed models are. Once the distillation process has finished, selecting an experiment shows a graph of accuracy, F1 or loss against model cost or size (according to your choice of metric). Hover over a data point to see its precise statistics:

<figure><img src="https://2010767686-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F62Ho1Wqz3OE1YkIDbG5o%2Fuploads%2FjhIstqiczZNotTresUbt%2FScreenshot%202023-06-16%20at%2016.30.58.png?alt=media&#x26;token=b154ce15-42a8-419d-a484-16c0b4d9d04e" alt=""><figcaption></figcaption></figure>

Click on a data point to open a more detailed results display:

<figure><img src="https://2010767686-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F62Ho1Wqz3OE1YkIDbG5o%2Fuploads%2FqPkDPguNqlNZD0ZnvUmf%2FScreenshot%202023-06-16%20at%2016.44.57.png?alt=media&#x26;token=850d81ee-84f7-486f-8c7a-949b56dc6fd3" alt=""><figcaption></figcaption></figure>

Then, simply choose the compressed model which is most suitable for your cost and performance requirements!


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://titanml.gitbook.io/iris-documentation/titan-optimise-knowledge-distillation/evaluating-and-selecting-a-model.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
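As a minimal sketch, the query above can be issued from a script. The page URL is taken from this document, but the question text and the use of Python's standard library are illustrative:

```python
import urllib.parse
import urllib.request

# Page URL from the documentation above.
PAGE_URL = ("https://titanml.gitbook.io/iris-documentation/"
            "titan-optimise-knowledge-distillation/"
            "evaluating-and-selecting-a-model.md")

def build_ask_url(question: str) -> str:
    """Append the natural-language question as a URL-encoded `ask` parameter."""
    return PAGE_URL + "?" + urllib.parse.urlencode({"ask": question})

# Example question (hypothetical).
url = build_ask_url("What sizes do the Titan compressed models come in?")
print(url)

# To actually fetch the answer (requires network access):
# with urllib.request.urlopen(url) as resp:
#     print(resp.read().decode())
```

URL-encoding the question keeps spaces and punctuation safe in the query string.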
