# Operator Annotations

> Figure out TensorRT compatibility on a per operation / layer basis at a glance from the model visualizer

## Introduction

Cellulose integrates NVIDIA TensorRT into the dashboard. For each operator in a
machine learning model, it identifies whether that operator is *compatible*
with, and *convertible* to, a particular TensorRT version.

<Note>
  This feature is only enabled for users on the Professional / Enterprise plan.
  You can read more about our pricing [here](https://docs.cellulose.ai/pricing/overview).
</Note>

## How It Works

For example, if TensorRT version X.Y.Z is selected as the runtime target,
we'll add *Annotations* to each operator in the ONNX graph that indicate whether
it is *convertible* or not.

<Note>
  There are several ML framework entry points to TensorRT today. This page covers
  the only path we currently support: ONNX. We plan to also support other
  frameworks such as PyTorch (and Torch-TRT) in the future.
</Note>

## Selecting a Runtime and Version

Navigate to a tracked model and look for this runtime selector at the top right
corner of the page.

<Frame caption="Runtime selector at the top right of a model visualizer page">
  <img src="https://mintcdn.com/cellulose/x0ZxTL5USk0t6UqF/images/tensorrt/runtime-selection-dropdown-tooltip.png?fit=max&auto=format&n=x0ZxTL5USk0t6UqF&q=85&s=e9c62fea745bba8a5c48c353aa088531" alt="Runtime dropdown tooltip" width="692" height="398" data-path="images/tensorrt/runtime-selection-dropdown-tooltip.png" />
</Frame>

Pick a *Runtime Type* and *Desired Precision*. We'll go with
*TensorRT 8.6.1* and *FP16* respectively in this example.

<Frame>
  <img src="https://mintcdn.com/cellulose/x0ZxTL5USk0t6UqF/images/tensorrt/runtime-selection-dropdown-selected.png?fit=max&auto=format&n=x0ZxTL5USk0t6UqF&q=85&s=27dbb89ecbf4db8baa13bb08ac35c462" alt="Selected TensorRT 8.6.1 and FP16" width="704" height="274" data-path="images/tensorrt/runtime-selection-dropdown-selected.png" />
</Frame>

You'll note that many of the nodes now have a *TRT* badge with a green checkmark.

<Frame>
  <img src="https://mintcdn.com/cellulose/x0ZxTL5USk0t6UqF/images/tensorrt/tensorrt-compatibility-badge.png?fit=max&auto=format&n=x0ZxTL5USk0t6UqF&q=85&s=0dda8b9d04909202c923e704dbce9d1f" alt="TensorRT compatibility badge" width="1654" height="1074" data-path="images/tensorrt/tensorrt-compatibility-badge.png" />
</Frame>

This means that these nodes are *compatible* with and *convertible* to TensorRT 8.6.1!

<Frame>
  <img src="https://mintcdn.com/cellulose/x0ZxTL5USk0t6UqF/images/tensorrt/tensorrt-support-valid.png?fit=max&auto=format&n=x0ZxTL5USk0t6UqF&q=85&s=42e7a21e378ccbee4df294f79ce57c12" alt="Convertible nodes" width="1768" height="878" data-path="images/tensorrt/tensorrt-support-valid.png" />
</Frame>

Phew! That's a good sign. Now we know that the model can be converted to a
TensorRT *FP16* engine.

Let's try something else. What about *INT8*?

<Tip>
  Go ahead and select *INT8* in the runtime selector at the top right corner again.
</Tip>

<Frame>
  <img src="https://mintcdn.com/cellulose/x0ZxTL5USk0t6UqF/images/tensorrt/tensorrt-support-invalid.png?fit=max&auto=format&n=x0ZxTL5USk0t6UqF&q=85&s=6ec231303f7390458b3520490ca33138" alt="Non-convertible nodes" width="1418" height="942" data-path="images/tensorrt/tensorrt-support-invalid.png" />
</Frame>

Uh oh, it seems like many of these nodes won't work.
That *Reshape* node is the only convertible one in this subgraph. Note that
*INT8* precision also requires calibration dataset integration. We'll cover this
in detail under the quantization section.

<Tip>
  Some nodes here may be marked convertible by TensorRT even though implicit
  downcasts are applied so that engines can be exported successfully.

  While this is fine for most workflows, we'd ideally know what has been
  done to the model before it is shipped as a production asset. Cellulose plans
  to fill this gap over time by providing even more insights than we already have
  here.
</Tip>

## Understanding TensorRT Convertibility for a Given Node

Let's dig a little deeper into that *Reshape* node. Click on the node to open
the drawer, then navigate to the *Supported Runtimes* tab:

<Frame>
  <img src="https://mintcdn.com/cellulose/x0ZxTL5USk0t6UqF/images/tensorrt/supported-runtimes-reshape-example.png?fit=max&auto=format&n=x0ZxTL5USk0t6UqF&q=85&s=cd70b897d57d5d0b41081031b5e08f70" alt="Support TensorRT 8.6.1 precisions for Reshape" width="1762" height="1148" data-path="images/tensorrt/supported-runtimes-reshape-example.png" />
</Frame>

We now see that *FP16*, *FP32*, *INT32*, *INT8* and *BOOL* are supported precisions
for *Reshape* in TensorRT 8.6.1.

Let's look at another node like *BatchNormalization*.

<Frame>
  <img src="https://mintcdn.com/cellulose/x0ZxTL5USk0t6UqF/images/tensorrt/supported-runtimes-batchnorm-example.png?fit=max&auto=format&n=x0ZxTL5USk0t6UqF&q=85&s=745a5789fe986b2e40747b264a60b496" alt="Support TensorRT 8.6.1 precisions for BatchNormalization" width="1782" height="1764" data-path="images/tensorrt/supported-runtimes-batchnorm-example.png" />
</Frame>

In contrast, only *FP16* and *FP32* are supported for *BatchNormalization*.
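
To make the relationship between supported precisions and the *TRT* badge concrete, here is a minimal Python sketch. It is an illustration only, not how the dashboard is implemented: the table holds just the two operators shown in the screenshots above, the names `SUPPORTED_PRECISIONS`, `is_convertible` and `annotate` are hypothetical, and treating unknown operator types as not convertible is an assumption.

```python
from typing import Dict, FrozenSet, List

# Precisions copied from the two "Supported Runtimes" examples above
# (TensorRT 8.6.1). Only these two operators are listed; a real table
# would cover far more operator types.
SUPPORTED_PRECISIONS: Dict[str, FrozenSet[str]] = {
    "Reshape": frozenset({"FP16", "FP32", "INT32", "INT8", "BOOL"}),
    "BatchNormalization": frozenset({"FP16", "FP32"}),
}


def is_convertible(op_type: str, precision: str) -> bool:
    """Return True if the operator type lists the precision as supported.

    Assumption: operator types missing from the table are reported as
    not convertible.
    """
    return precision in SUPPORTED_PRECISIONS.get(op_type, frozenset())


def annotate(op_types: List[str], precision: str) -> Dict[str, bool]:
    """Map each operator type to a convertible / not convertible flag."""
    return {op: is_convertible(op, precision) for op in op_types}


if __name__ == "__main__":
    ops = ["Reshape", "BatchNormalization"]
    print(annotate(ops, "FP16"))  # {'Reshape': True, 'BatchNormalization': True}
    print(annotate(ops, "INT8"))  # {'Reshape': True, 'BatchNormalization': False}
```

This mirrors what the walkthrough showed: both operators pass for *FP16*, while only *Reshape* remains convertible once *INT8* is selected.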

## Model Specific Information

### ONNX Models

We currently use the [ONNX-TensorRT](https://github.com/onnx/onnx-tensorrt) tool
to evaluate TensorRT compatibility. You can read more about it in the
[NVIDIA TensorRT developer guide](https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#onnx-intro).
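
As a rough sketch of how a parser-level support report could be turned into per-node annotations, the Python below assumes the report has the shape returned by `OnnxParser.supports_model` in the TensorRT 8.x Python API: a list of `(node_indices, supported)` pairs, one per subgraph. It is an illustration only, not a description of how the dashboard is implemented. The function name, the node names and the sample report are hypothetical, and defaulting unlisted nodes to not convertible is an assumption.

```python
from typing import Dict, List, Tuple

# Assumed shape of one report entry: the indices of the nodes in a
# subgraph, and whether that subgraph is supported.
SubgraphSupport = Tuple[List[int], bool]


def annotations_from_subgraphs(
    node_names: List[str],
    subgraphs: List[SubgraphSupport],
) -> Dict[str, bool]:
    """Flatten a per-subgraph support report into per-node flags.

    Assumption: nodes that appear in no subgraph are conservatively
    marked as not convertible.
    """
    annotations = {name: False for name in node_names}
    for node_indices, supported in subgraphs:
        for index in node_indices:
            annotations[node_names[index]] = supported
    return annotations


if __name__ == "__main__":
    # Hypothetical three-node graph and report, for illustration only.
    names = ["Conv_0", "BatchNormalization_1", "Reshape_2"]
    report = [([0, 1], False), ([2], True)]
    for name, ok in annotations_from_subgraphs(names, report).items():
        print(f"{name}: {'convertible' if ok else 'not convertible'}")
```

Keeping the flattening step separate from the parser call means the same function can be exercised with hand-written reports, without a TensorRT installation.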

## Have questions / need help?

Please reach out to [support@cellulose.ai](mailto:support@cellulose.ai), and we'll get back to you as soon
as possible.
