Deploying a TensorFlow model on a Jetson Nano using TensorFlow Serving and K3s
The Nvidia Jetson Nano is a low-cost platform for AI applications, ideal for edge computing. However, due to the architecture of its CPU, deploying applications to the SBC can be challenging. In this guide, we'll install and configure K3s, a lightweight Kubernetes distribution made specifically for edge devices. Once that is done, we'll build and deploy a TensorFlow model in the K3s cluster.
Environment preparation
Our objective is to deploy a TensorFlow Serving container in a Kubernetes cluster running on the Jetson Nano. Moreover, this container should take full advantage of the CUDA capabilities of the Nano. Consequently, some preliminary environment preparation is necessary. The following is based on this guide. Additionally, this article assumes that the reader has a Docker container registry available to push to and pull from.
Docker configuration
JetPack, the default Linux distribution that comes with the Jetson Nano, ships with Docker preinstalled. However, Docker must be configured to take advantage of the Jetson Nano's GPU capabilities. This can be done by editing the /etc/docker/daemon.json file. Moreover, if we intend to use a private Docker registry served over HTTP, its address must be specified in the same file. Here is what the file should look like after modification:
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "insecure-registries" : ["192.168.1.2:5000"]
}
Here, 192.168.1.2:5000 is to be replaced by the URL of your registry.
These changes can be applied by restarting Docker:
sudo systemctl restart docker
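To check that the new configuration has been taken into account, docker info can be queried; it should now report nvidia as the default runtime:
sudo docker info | grep "Default Runtime"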
K3s installation
The objective is to deploy an AI model to a Kubernetes cluster running on the Jetson Nano. However, due to the latter's architecture, most Kubernetes distributions, such as MicroK8s, cannot run on it. This is why we are going to use K3s, a Kubernetes distribution made specifically for edge devices. K3s can be installed as follows:
curl -sfL https://get.k3s.io | sh -
By default, K3s uses containerd as its container engine. However, here, we would prefer to use Docker since we've configured it to fit our needs in the previous step. This can be done by editing the /etc/systemd/system/k3s.service file and adding the --docker flag to the ExecStart line as follows:
ExecStart=/usr/local/bin/k3s \
server \
--docker
Since a systemd unit file was modified, systemd must be reloaded and K3s restarted:
sudo systemctl daemon-reload
sudo systemctl restart k3s
Further details can be found in the official documentation.
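Once K3s is back up, we can verify that the node is ready and that it uses Docker as its container engine (the CONTAINER-RUNTIME column should start with docker://):
sudo kubectl get nodes -o wide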
To check if K3s properly uses the Jetson Nano's GPU, one can deploy the following container in it:
sudo kubectl run -i -t gpu-check --image=jitteam/devicequery --restart=Never
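This pod runs the CUDA deviceQuery sample; its output should list the Jetson Nano's integrated GPU and end with Result = PASS. If the pod has already exited, its output can still be retrieved with:
sudo kubectl logs gpu-check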
TensorFlow Serving
First, we are going to need a model to deploy. Here, we'll use the model from the TensorFlow getting started example, which classifies images from the MNIST dataset:
# Importing TensorFlow
import tensorflow as tf
# Loading the data
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Data preprocessing (here, normalization)
x_train, x_test = x_train / 255.0, x_test / 255.0
# Building the model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])
# Loss function declaration
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
# Model compilation
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])
# Training
model.fit(x_train, y_train, epochs=5)
# Exporting
model.save('./mymodel/1/')
Running this snippet should generate a folder called mymodel in the working directory, containing our exported model. Thus, the next step is to embed the model in a TensorFlow Serving container.
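Optionally, the exported SavedModel can be inspected with the saved_model_cli tool that ships with TensorFlow; this confirms the serving signature and the expected 28x28 input shape:
saved_model_cli show --dir ./mymodel/1/ --all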
The official TensorFlow Serving image is unfortunately incompatible with the Jetson Nano's architecture. However, Docker Hub user helmuthva provided this alternative, which works nicely. It can be used just like a regular TensorFlow Serving image:
docker run -d --name serving_base helmuthva/jetson-nano-tensorflow-serving
docker cp ./mymodel serving_base:/models/mymodel
docker commit --change "ENV MODEL_NAME mymodel" serving_base my-registry/mymodel-serving
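Before pushing the image, it can be tested locally. The following sketch assumes port 8501 (TensorFlow Serving's REST port) is free on the Jetson Nano; the status endpoint should reply with the model's version status:
sudo docker run -d -p 8501:8501 --name mymodel-test my-registry/mymodel-serving
curl http://localhost:8501/v1/models/mymodel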
The container image can then be pushed to our registry, making it available for Kubernetes to deploy:
docker push my-registry/mymodel-serving
Kubernetes
Now that an image of our TensorFlow Serving container is available in our container registry, it is time to pull it into the Kubernetes cluster. To do so, we create a Kubernetes manifest file, for example kubernetes_manifest.yml, with the following content:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mymodel-serving
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mymodel-serving
  template:
    metadata:
      labels:
        app: mymodel-serving
    spec:
      containers:
      - name: mymodel-serving
        image: my-registry/mymodel-serving
        ports:
        - containerPort: 8501
---
apiVersion: v1
kind: Service
metadata:
  name: mymodel-serving
spec:
  ports:
  - port: 8501
    nodePort: 30111
  selector:
    app: mymodel-serving
  type: NodePort
These resources can be created using kubectl apply:
kubectl apply -f kubernetes_manifest.yml
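The progress of the deployment can be followed with kubectl; the first image pull may take a few minutes, after which the pod should be Running and the service exposed on node port 30111:
sudo kubectl get pods
sudo kubectl get svc mymodel-serving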
The model should now be deployed to the Kubernetes cluster. This can be verified by pointing a web browser to http://<Kubernetes cluster IP>:30111/v1/models/mymodel (30111 being the nodePort defined above), which should yield the following JSON:
{
    "model_version_status": [
        {
            "version": "1",
            "state": "AVAILABLE",
            "status": {
                "error_code": "OK",
                "error_message": ""
            }
        }
    ]
}
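Finally, the deployed model can be queried for predictions through TensorFlow Serving's REST predict endpoint. Below is a minimal sketch in Python; the cluster IP used here (192.168.1.2) is an assumption to be replaced with your node's address, and the requests package is assumed to be installed:
import json

import requests
import tensorflow as tf

# Address of the K3s node (hypothetical value, replace with your Jetson Nano's IP)
CLUSTER_IP = "192.168.1.2"

# Load a test image and normalize it exactly like during training
(_, _), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
sample = (x_test[0] / 255.0).tolist()

# Query the REST predict endpoint exposed through the NodePort service
response = requests.post(
    f"http://{CLUSTER_IP}:30111/v1/models/mymodel:predict",
    data=json.dumps({"instances": [sample]}),
)

# The model returns logits; the index of the largest one is the predicted digit
logits = response.json()["predictions"][0]
print("Predicted digit:", max(range(10), key=lambda i: logits[i]))
print("Expected label:", y_test[0])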