
What is TorchScript and how do you use it to deploy a model?


The Scenario

You are an ML engineer at a medical imaging company. Your team has developed a new PyTorch model that can detect tumors in MRI scans. The model has been trained and tested, and it is now ready to be deployed to the company’s C++ desktop application.

The C++ application does not ship with a Python runtime, so you cannot simply embed Python and call the model from there. You need a way to run the trained model natively from C++.

The Challenge

Explain your strategy for deploying this model to the C++ application using TorchScript. What are the key benefits of using TorchScript, and what are the trade-offs between tracing and scripting?

Wrong Approach

A junior engineer might suggest rewriting the model in C++, which would be a very time-consuming and error-prone task. They might not be aware of TorchScript, which is a much simpler and more robust solution.

Right Approach

A senior engineer would reach for TorchScript. They would explain the trade-offs between tracing and scripting, pick the conversion method that fits the model, and have a clear plan for loading and running the result in the C++ application via the LibTorch API.

Step 1: Why TorchScript?

TorchScript is a way to create serializable and optimizable models from PyTorch code. A TorchScript program can be saved from a Python process and loaded in a process with no Python dependency, which makes it the natural way to deploy PyTorch models to a non-Python environment such as our C++ application.

| Feature | TorchScript | Other options (e.g., ONNX) |
| --- | --- | --- |
| Performance | The JIT can apply graph-level optimizations such as operator fusion. | ONNX Runtime is also heavily optimized; which is faster depends on the model and hardware. |
| Integration | Exported directly from PyTorch and loaded with the LibTorch C++ API. | Requires a separate export step and runtime, which adds integration work. |
| Flexibility | Covers most PyTorch models, including data-dependent control flow via scripting. | Export can fail for models that use operators without an ONNX equivalent. |
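
To make "serializable" concrete: even a plain Python function can be compiled to TorchScript, inspected, and saved as a self-contained file. Here is a minimal sketch (the `normalize` function is a hypothetical stand-in, not part of the MRI model):

import torch

@torch.jit.script
def normalize(x: torch.Tensor) -> torch.Tensor:
    # Once scripted, this is a TorchScript program, not Python code.
    return (x - x.mean()) / (x.std() + 1e-6)

print(normalize.code)           # inspect the compiled TorchScript source
normalize.save("normalize.pt")  # the saved file needs no Python runtime to load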

Step 2: Choose the Right Conversion Method

There are two ways to convert a PyTorch model to TorchScript:

| Method | Description | When to use it |
| --- | --- | --- |
| Tracing (`torch.jit.trace`) | Runs the model once on an example input and records the operations that execute. | When the model's control flow is static and does not depend on the input data. |
| Scripting (`torch.jit.script`) | Recursively compiles the model's Python source into TorchScript. | When the model has data-dependent control flow, such as `if` statements or `for` loops whose behavior varies with the input. |

For our MRI tumor detection model, we will use tracing, because its forward pass contains no data-dependent control flow. For contrast, the sketch below shows what scripting looks like.
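
Here is a minimal, hypothetical sketch of a module that tracing would mishandle: the `if` branch depends on the input values, so a trace would silently record only the branch taken by the example input, while `torch.jit.script` preserves both:

import torch
import torch.nn as nn

class ThresholdedHead(nn.Module):
    # Hypothetical module with data-dependent control flow.
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if x.sum() > 0:  # depends on the *values* in x
            return torch.sigmoid(x)
        return torch.zeros_like(x)

scripted = torch.jit.script(ThresholdedHead())
print(scripted(torch.randn(2, 3)))   # both branches remain live
scripted.save("thresholded_head.pt")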

Step 3: Convert the Model to TorchScript

Here’s how we can convert the model to TorchScript using tracing:

import torch

# ... (load your trained model) ...

# Put the model in evaluation mode so layers such as dropout and
# batch norm behave deterministically when the trace is recorded.
model.eval()

# Create an example input with the shape the model expects.
example_input = torch.randn(1, 3, 224, 224)

# Trace the model: run it once and record the operations executed.
traced_model = torch.jit.trace(model, example_input)

# Serialize the traced model to a single, self-contained file.
traced_model.save("mri_tumor_detection_model.pt")
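
Before handing the file to the C++ team, it is worth a quick sanity check: load the serialized model back in Python and confirm it matches the eager model on the example input. A minimal sketch, reusing `model` and `example_input` from above:

loaded = torch.jit.load("mri_tumor_detection_model.pt")

with torch.no_grad():
    eager_out = model(example_input)
    traced_out = loaded(example_input)

# The traced model should reproduce the eager model's output.
assert torch.allclose(eager_out, traced_out, atol=1e-6)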

Step 4: Load and Run the Model in C++

Once we have the serialized TorchScript model, we can load and run it in the C++ application using LibTorch, PyTorch's C++ distribution.

#include <torch/script.h>
#include <iostream>
#include <vector>

int main() {
  torch::jit::script::Module module;
  try {
    // Deserialize the ScriptModule from the file saved in Python.
    module = torch::jit::load("mri_tumor_detection_model.pt");
  } catch (const c10::Error& e) {
    std::cerr << "Error loading the model\n";
    return -1;
  }

  // Create a vector of inputs.
  std::vector<torch::jit::IValue> inputs;
  inputs.push_back(torch::randn({1, 3, 224, 224}));

  // Execute the model and turn its output into a tensor.
  at::Tensor output = module.forward(inputs).toTensor();

  std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';
  return 0;
}
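
To build this, the application links against LibTorch. With CMake, that typically means pointing CMAKE_PREFIX_PATH at the unzipped LibTorch distribution, calling find_package(Torch REQUIRED), and linking the executable against ${TORCH_LIBRARIES}; the exact setup depends on your build system and LibTorch version.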

By using TorchScript, we can easily deploy our PyTorch model to the C++ desktop application without having to rewrite the model in C++.

Practice Question

You are converting a model to TorchScript with `torch.jit.trace`, but the trace emits warnings and the converted model behaves incorrectly because the model has a `for` loop whose number of iterations depends on the input data. Which conversion method should you use instead?