How-To - Gary Sieling

RetinaNet is an object detection model that is supposed to be well suited to tagging objects in videos. To use it, you need to install the keras-retinanet project from GitHub.

The following example, taken from the excellent pyimagesearch book, shows how to train it:

retinanet-train \
  --batch-size 4 \
  --steps 349 \
  --epochs 50 \
  --weights logos/resnet50_coco_best_v2.1.0.h5 \
  --snapshot-path $DIR/snapshots \
  csv logos/retinanet_train.csv \
  logos/retinanet_classes.csv
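
The --steps value is the number of batches per epoch. The excerpt does not say how 349 was chosen; a common convention (an assumption here) is the number of training images divided by the batch size, rounded up. A minimal sketch, assuming the keras-retinanet CSV layout with the image path in the first column; steps_per_epoch is a hypothetical helper, not part of keras-retinanet:

```shell
# Hypothetical helper: estimate --steps from an annotations CSV.
# Assumes one annotation per row with the image path in column 1,
# and that one epoch should visit every image once (a convention, not a rule).
steps_per_epoch() {
  csv_file=$1
  batch_size=$2
  # Count distinct image paths; an image with several boxes spans several rows.
  images=$(cut -d, -f1 "$csv_file" | sort -u | wc -l)
  # Integer ceiling division.
  echo $(( (images + batch_size - 1) / batch_size ))
}

# Usage, guarded so the snippet is harmless when the file is absent.
if [ -f logos/retinanet_train.csv ]; then
  steps_per_epoch logos/retinanet_train.csv 4
fi
```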

Under the hood this uses TensorFlow, which is easy to get running on a CPU but slow there: the default install doesn’t use Intel vector instructions, so even on a CPU it’s slower than it needs to be.

When I ran the above command it estimated 5 hours per epoch, so 50 training epochs would take around ten days. Based on my laptop’s electricity consumption, this would cost around $5.25 in electricity (at $0.21/kWh).
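
The estimate is simple arithmetic and can be reproduced in the shell. A rough sketch: the 100 W average draw is an assumption backed out from the $5.25 figure, not a measured value, so substitute your own machine’s number.

```shell
hours_per_epoch=5
epochs=50
price_per_kwh=0.21
watts=100   # ASSUMPTION: average laptop draw implied by the $5.25 figure

total_hours=$((hours_per_epoch * epochs))
# awk handles the floating point math that POSIX shell arithmetic lacks.
days=$(awk -v h="$total_hours" 'BEGIN { printf "%.1f", h / 24 }')
cost=$(awk -v h="$total_hours" -v w="$watts" -v p="$price_per_kwh" \
  'BEGIN { printf "%.2f", h * w / 1000 * p }')

echo "CPU training: ${total_hours} hours (${days} days), about \$${cost} of electricity"
```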

The “easy” answer is to use a GPU (a 1080 Ti), which brings the time down to 5 minutes per epoch (about 4.2 hours for 50 epochs).
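
To put the two timings side by side, here is a short sketch that uses only the per-epoch figures quoted in this post (variable names are illustrative):

```shell
epochs=50
cpu_minutes_per_epoch=300   # 5 hours per epoch, the CPU estimate
gpu_minutes_per_epoch=5     # the 1080 Ti estimate

gpu_hours=$(awk -v e="$epochs" -v m="$gpu_minutes_per_epoch" \
  'BEGIN { printf "%.1f", e * m / 60 }')
speedup=$((cpu_minutes_per_epoch / gpu_minutes_per_epoch))

echo "GPU training: ${gpu_hours} hours in total, roughly ${speedup}x faster than the CPU"
```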

While it may look like this saves roughly ten days, you could easily spend that time just installing dependencies: reinstalling your OS to a version CUDA supports (I chose Ubuntu 18.04), identifying which of the six ways to install CUDA works best (the .run file, if you want Steam to keep working), building a custom TensorFlow version (if using CUDA 10, which isn’t officially supported at the time of writing), and identifying the correct version of Google’s build tool (bazel) to build TensorFlow.

Enter Docker.

TensorFlow provides convenient pre-built Docker images, so you can avoid a lot of this hassle.

docker run --rm -it --runtime=nvidia tensorflow/tensorflow:latest-gpu-py3

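To use these, you must install the NVIDIA Docker runtime, so you still need a working NVIDIA driver on the host, but this lets you avoid the pain of configuring CUDA and the downstream libraries (TensorFlow, mxnet, etc.) by delegating that to the library maintainers. On Docker 19.03 and later with the NVIDIA Container Toolkit, the equivalent flag is --gpus all.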
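
Before training anything, it can help to confirm that Docker can actually see the GPU. A sketch, assuming the nvidia-docker2 style runtime used in this post; has_nvidia_runtime is a hypothetical helper that looks for the Runtimes line printed by docker info:

```shell
# Hypothetical helper: succeeds if `docker info` output on stdin lists an nvidia runtime.
has_nvidia_runtime() {
  grep -qi 'runtimes:.*nvidia'
}

if command -v docker >/dev/null 2>&1; then
  if docker info 2>/dev/null | has_nvidia_runtime; then
    # The NVIDIA runtime makes nvidia-smi available inside the container.
    docker run --rm --runtime=nvidia tensorflow/tensorflow:latest-gpu-py3 nvidia-smi
  else
    echo "Docker is installed, but no nvidia runtime is registered"
  fi
else
  echo "docker not found on PATH"
fi
```

If the GPU table from nvidia-smi appears, the container has access to the card.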

You may be wondering how this integrates with your main machine: you want tagged images and the model stored outside the container, with minimal I/O overhead. A volume mount (the -v flag used below) handles this. Because we’re using the Linux version of Docker, we avoid the I/O problems of the macOS versions (there are workarounds, but you can waste a few hours the first time you hit these problems).

To use this to run Keras / keras-retinanet, we also need OpenCV; the version on pip seems to work OK for this:

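# Start from the official GPU-enabled TensorFlow image (Python 3)
FROM tensorflow/tensorflow:latest-gpu-py3
# git fetches keras-retinanet; libsm6 and libxrender1 are shared
# libraries that opencv-python needs at import time
RUN apt-get clean && \
    apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y git libsm6 libxrender1
RUN git clone https://github.com/fizyr/keras-retinanet
# --user installs the retinanet-* scripts into /root/.local/bin
RUN cd keras-retinanet && pip install . --user
RUN pip install opencv-python
# Forward every container argument to retinanet-train; "--" fills $0
ENTRYPOINT ["/bin/bash", "-c", "/root/.local/bin/retinanet-train \"$@\"", "--"]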

We’ll take any arguments and pass them on to retinanet-train, so you can treat the container as if it were the equivalent command-line app running locally.
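
The forwarding relies on how bash -c assigns positional parameters: the first word after the command string becomes $0 (here the placeholder "--"), and everything after it becomes $1, $2, and so on. This small demonstration needs no Docker and compares an unquoted ${*} with a quoted "$@"; the file name with a space is a made-up example:

```shell
# "--" fills $0; the remaining words become the positional parameters.
unquoted=$(bash -c 'printf "[%s]" ${*}' -- --weights "my weights.h5")
quoted=$(bash -c 'printf "[%s]" "$@"' -- --weights "my weights.h5")

echo "$unquoted"   # [--weights][my][weights.h5]
echo "$quoted"     # [--weights][my weights.h5]
```

The unquoted form re-splits arguments on whitespace, so it only behaves well when no path contains a space; the quoted form keeps each argument intact.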

docker build . -t retinanet-train

I found it convenient to trick the Docker container into looking like its file system had the same structure as my laptop’s, so the volume mount is the output of “pwd -P” (the physical path with symlinks resolved; in my case a deep link into a drive partition).

# Set DIR first: a VAR=value prefix on the same line is not visible to $DIR expansions on that line
DIR=$(pwd -P)
docker run --rm -it \
  --runtime=nvidia \
  -v "$DIR:$DIR" \
  retinanet-train \
  --batch-size 4 \
  --steps 349 \
  --epochs 50 \
  --weights "$DIR/logos/resnet50_coco_best_v2.1.0.h5" \
  --snapshot-path "$DIR/snapshots" \
  csv "$DIR/logos/retinanet_train.csv" \
  "$DIR/logos/retinanet_classes.csv"
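
If you run this often, the invocation can be wrapped in a shell function so it behaves even more like a local command. A sketch only: retinanet_train and the DOCKER override are hypothetical names, and the -w flag (which sets the container’s working directory) is an addition so that relative paths resolve inside the container.

```shell
# Hypothetical wrapper: run the retinanet-train image against the current directory.
# The body runs in a subshell, so "dir" does not leak into the calling shell.
retinanet_train() (
  dir=$(pwd -P)
  # DOCKER can be overridden (for example DOCKER=echo) to preview the command.
  "${DOCKER:-docker}" run --rm -it \
    --runtime=nvidia \
    -v "$dir:$dir" \
    -w "$dir" \
    retinanet-train "$@"
)

# Example (not executed here):
#   retinanet_train --batch-size 4 --steps 349 --epochs 50 \
#     --weights logos/resnet50_coco_best_v2.1.0.h5 \
#     csv logos/retinanet_train.csv logos/retinanet_classes.csv
```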

Once you do this it works like a charm, and you’ll be building new models in no time.


Retinanet Dockerfile with GPU support