
How do you save and load a model in TensorFlow?


The Scenario

You are an ML engineer at a smart home company. You have just finished training a new model that can detect whether a person in a video feed is a resident of the home or an intruder.

You need to save the model for three different purposes:

  1. Continued training: You want to be able to load the model back into memory to continue training it later.
  2. Production deployment: You want to deploy the model to a production server so that it can be used for inference by the company’s mobile app.
  3. Edge deployment: You want to deploy the model to a small edge device (like a smart camera) that has limited resources.

The Challenge

Explain your strategy for saving the model for each of these three purposes. What are the different saving formats that you would use, and what are the trade-offs between them?

Wrong Approach

A junior engineer might just use the `model.save()` method without considering the different saving formats or the specific requirements of each use case. They might not be aware of the `SavedModel` format or the difference between saving the entire model and saving only the weights.

Right Approach

A senior engineer would know that different use cases require different saving formats. They would be able to explain the trade-offs between the `SavedModel` format and the HDF5 format, and they would know when to save the entire model and when to save only the weights.

Step 1: Choose the Right Format for Each Use Case

The first step is to choose the right saving format for each use case.

| Use Case | Recommended Format | Why? |
| --- | --- | --- |
| Continued training | SavedModel or HDF5 | Both formats save the entire model: the architecture, the weights, and the optimizer state. This makes it easy to resume training from where you left off. |
| Production deployment | SavedModel | SavedModel is the recommended format for deploying models to TensorFlow Serving. It is a language-neutral format that can be loaded in any environment that supports the TensorFlow runtime. |
| Edge deployment | TensorFlow Lite | TensorFlow Lite is a lightweight version of TensorFlow designed for running models on mobile and embedded devices with limited resources. |
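The "continued training" row is worth seeing end to end. The sketch below uses a tiny stand-in model (the layer sizes and file name are illustrative, not the actual intruder-detection network); it shows that a full-model save preserves the optimizer state, so a restored model can resume `fit()` rather than starting from scratch. Note that recent Keras releases require an explicit `.keras` (or `.h5`) extension on the path, while older TF 2.x versions also accepted a bare directory path for the SavedModel format.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in for the intruder-detection model (hypothetical architecture).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

x = np.random.rand(16, 8).astype("float32")
y = np.random.randint(0, 2, size=(16, 1)).astype("float32")
model.fit(x, y, epochs=1, verbose=0)

# Full-model save: architecture + weights + optimizer state.
model.save("resume_demo.keras")

# Later session: restore the model and pick up training where it stopped.
restored = tf.keras.models.load_model("resume_demo.keras")
restored.fit(x, y, epochs=1, verbose=0)
```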

Step 2: Save the Model

Here’s how we can save the model in each format:

1. SavedModel format:

model.save("my_model")

2. HDF5 format:

model.save("my_model.h5")

3. TensorFlow Lite format:

import tensorflow as tf

# Convert the model to the TensorFlow Lite format.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the model.
with open("my_model.tflite", "wb") as f:
    f.write(tflite_model)
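Saving is only half the story; each format is loaded back differently. The sketch below (model, paths, and input shape are illustrative) round-trips a tiny model through HDF5 with `tf.keras.models.load_model` and through TensorFlow Lite with `tf.lite.Interpreter`, which is how the model would actually run on the edge device.

```python
import numpy as np
import tensorflow as tf

# Illustrative stand-in model; the real network classifies video frames.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])

# HDF5: the whole model lives in a single file and loads back directly.
model.save("demo.h5")
loaded_h5 = tf.keras.models.load_model("demo.h5")

# TensorFlow Lite: convert, write the flatbuffer, then run inference
# through the interpreter (the API used on mobile/embedded devices).
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_bytes = converter.convert()
with open("demo.tflite", "wb") as f:
    f.write(tflite_bytes)

interpreter = tf.lite.Interpreter(model_path="demo.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

sample = np.random.rand(1, 4).astype("float32")
interpreter.set_tensor(inp["index"], sample)
interpreter.invoke()
probs = interpreter.get_tensor(out["index"])
```

Both loaded copies should produce (numerically near-identical) predictions matching the original model.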

Step 3: When to Save Only the Weights

In some cases, you might want to save only the model’s weights. This can be useful if:

  • You can rebuild the architecture in code and only need the trained parameters, not the optimizer state or training configuration.
  • You want to transfer the weights from one model to another (e.g., for transfer learning).

Saving the weights:

model.save_weights("my_model_weights.h5")

Loading the weights:

# Create the model architecture first
model = create_my_model()

# Load the weights
model.load_weights("my_model_weights.h5")
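The transfer-learning case above can be sketched concretely. In this illustrative example (the `build_model` factory and file name are hypothetical), one model's weights are written out and loaded into a second model with the same architecture; because a weights file stores only parameters, the receiving model's structure must be built first. The `.weights.h5` suffix is used because recent Keras releases require it for `save_weights`, while older TF 2.x accepted any `.h5` path.

```python
import numpy as np
import tensorflow as tf

def build_model():
    # Hypothetical factory for the architecture; a weights file stores only
    # parameters, so the structure must be reconstructed before loading.
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(8,)),
        tf.keras.layers.Dense(4, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

source = build_model()
source.save_weights("demo.weights.h5")  # weights only, no optimizer state

# A second model with the same architecture picks up the trained weights,
# e.g. as the starting point for transfer learning or fine-tuning.
target = build_model()
target.load_weights("demo.weights.h5")
```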

Practice Question

You want to deploy your model to an Android app. Which format should you use to save your model?