TensorRT + Gradient

How to use Gradient and TensorRT together

Using TensorRT to deploy models on Gradient

TensorRT is NVIDIA's SDK for high-performance deep learning inference, built on CUDA. It combines an inference optimizer and a runtime, and is designed for GPU-enabled production environments that demand high throughput and low latency.
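As a concrete illustration of the optimization step, a trained model is typically converted into a TensorRT engine before it is served. A minimal sketch using trtexec, the command-line tool bundled with TensorRT (the file names here are placeholders):

# Build an optimized TensorRT engine from an ONNX model, enabling FP16 precision
trtexec --onnx=model.onnx --saveEngine=model.engine --fp16

The resulting engine file is what the runtime loads at inference time; precision flags such as --fp16 trade a small amount of accuracy for significantly higher throughput on supported GPUs.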

Gradient natively integrates with TensorRT and ships with a pre-built TensorRT image that is updated regularly. Alternatively, customers can use a customized version of TensorRT by supplying their own Docker image hosted on a public or private Docker registry.
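If you bring your own image, a common starting point is NVIDIA's official TensorRT containers on NGC, which you can extend and push to your own registry. A sketch (the tag is illustrative; choose a release compatible with your GPU driver):

# Pull NVIDIA's official TensorRT container from NGC as a base for a custom image
docker pull nvcr.io/nvidia/tensorrt:23.08-py3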

Deploying models with TensorRT

When creating a Deployment, you can select the pre-built image or bring your own custom image. You can do this via the web UI, via the CLI, or as a step within an automated pipeline.

Selecting the prebuilt TensorRT image when creating a deployment

When using the CLI, the command would look something like this:
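A minimal sketch of such a command follows. The flag values (deployment type, machine type, image URL, model ID) are placeholders and assumptions, not the exact options for your CLI version; consult the Gradient CLI reference before running it:

# Sketch: create a Deployment that serves a model with a TensorRT image.
# Flag values are placeholders; verify option names against the Gradient CLI docs.
gradient deployments create \
  --name "tensorrt-deployment" \
  --deploymentType TensorRT \
  --modelId <your-model-id> \
  --machineType P4000 \
  --imageUrl nvcr.io/nvidia/tensorrt:23.08-py3 \
  --instanceCount 1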

Learn more about the Gradient TensorRT integration in the docs.