Efficient City Traffic Metering With PeopleNet/YOLOv8, Savant, And Grafana At Scale

Ivan Kud · Published in Inside In-Sight · 13 min read · May 8, 2023

UPD 2023.05.25: The article is updated to use Savant 0.2.2.

The article demonstrates how to deploy a ready-to-use traffic metering pipeline for people, cars, and other objects. The counting solution is deployed “as-is” and scales to many RTSP cameras.

You only need to be familiar with Docker and Docker Compose to use it.

To count the objects passing through, two registration lines, which need not be parallel, differentiate between "entry" and "exit" events. When an object, such as a person or a car, crosses these lines, the crossing is recorded, and Grafana displays the counters.
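To make the mechanics concrete, here is a minimal sketch of such two-line logic: the sign of a 2D cross product tells which side of a line a point is on, and the order in which the two lines are crossed yields the direction. This is an illustration under simplifying assumptions, not Savant's actual code.

# Illustration only (not Savant's implementation). Lines are
# [x1, y1, x2, y2], as in line_config.yml; segment bounds are
# ignored for brevity.
def side(line, pt):
    # Sign of the 2D cross product: which side of the line pt lies on.
    x1, y1, x2, y2 = line
    return (x2 - x1) * (pt[1] - y1) - (y2 - y1) * (pt[0] - x1) > 0

def crossed(line, prev_pt, curr_pt):
    # True when the step prev_pt -> curr_pt passes over the line.
    return side(line, prev_pt) != side(line, curr_pt)

def update(state, prev_pt, curr_pt, line_from, line_to):
    """Feed consecutive track points; returns 'entry', 'exit', or None."""
    if crossed(line_to, prev_pt, curr_pt) and state.get('first') == 'from':
        return 'entry'   # crossed 'from' earlier, now 'to': entering
    if crossed(line_from, prev_pt, curr_pt) and state.get('first') == 'to':
        return 'exit'    # crossed 'to' earlier, now 'from': leaving
    if crossed(line_from, prev_pt, curr_pt):
        state['first'] = 'from'
    elif crossed(line_to, prev_pt, curr_pt):
        state['first'] = 'to'
    return None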

This module can efficiently analyze more than 10 cameras on a modern GPU. The solution can be deployed on AWS or any other provider that offers Nvidia GPU instances, or you can choose to host it on your server.

The diagram below depicts the software architecture:

In the upcoming sections, we will explore how to deploy the system with a single docker-compose file to get familiar with it. Then, we will adjust it to work with multiple cameras.

Counting people or cars doesn’t require any knowledge of AI or programming. However, if you want to use your custom-trained YOLOv8, we will guide you through the process in the final section of this article.

To implement the solution, you may use an x86 server with a discrete Nvidia GPU or opt for devices such as Nvidia Jetson Xavier NX/AGX or newer models.

The software is licensed under Apache 2.0, allowing you to tailor it according to your specific needs.

About Tools

PeopleNet is a model developed by Nvidia specifically designed to detect people. It has been optimized for TensorRT and quantized to int8, which makes it highly efficient. It is more effective at detecting people than publicly available models trained on open datasets.

The YOLOv8 model, created by Ultralytics, is highly popular for detecting a wide range of objects and can easily be trained to improve accuracy on custom datasets.

Nvidia DCF Tracker is an online multi-object tracker that employs a discriminative correlation filter for visual object tracking, allowing independent object tracking even when detection results are unavailable. It combines correlation filter responses and bounding box proximity for data association.

We used Savant, a framework based on Nvidia DeepStream, to make it easier to deploy computer vision and video analysis pipelines for production purposes. Our pipeline includes a detector, a tracker, and custom Python code, which comprise our solution.

Grafana is an open-source data visualization system focused on data from IT monitoring systems. It is implemented as a web application built around “dashboards” with charts, graphs, and tables.

Graphite is a metrics store: it collects the metrics, and Grafana executes time-based queries against it.
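For reference, pushing a counter into Graphite takes one line of its plaintext protocol (TCP port 2003). A minimal sketch follows; the metric path is hypothetical, and the sample's actual paths may differ.

import socket, time

def send_metric(path, value, host='127.0.0.1', port=2003):
    # Graphite plaintext protocol: "<path> <value> <unix-timestamp>\n"
    with socket.create_connection((host, port)) as s:
        s.sendall(f'{path} {value} {int(time.time())}\n'.encode())

send_metric('traffic_meter.cam-1.entry', 1)  # hypothetical metric path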

Counting People Traffic

To keep the tutorial focused, we count people with the PeopleNet model throughout most of it. In the last sections, we demonstrate how to run the pipeline with YOLOv8 and how to replace our YOLOv8 model with a custom-trained one.

All-In-One Docker Compose Solution

First, we would like to guide you through a ready-made bundle that counts people on a looped video. It helps to understand every component's role and implementation. The diagram below depicts the services within the compose file:

For now, the compose bundle does not use external RTSP cameras, to keep the system design simple and easy to understand. In the next phase, we will detach the Video Loop Source adapter and the AO-RTSP Sink from the compose bundle to make it compatible with external sources. Now, let us explore every component to understand how the system works.

Video Loop Source Adapter (Savant Component). It is a utility module that repeatedly sends a video file into the analytics module, simulating an infinite RTSP stream.

AI Module (Savant Component). It is a runtime that deploys and runs the AI pipeline with models and Python analytics code. The module is initialized with the YAML configuration file where analytics blocks are configured.

Always-On RTSP Sink Adapter (AO-RTSP Sink, Savant Component). It is a utility module that casts the processed video with metadata drawn over it to users via RTSP and LL-HLS protocols. We use it to demonstrate how the system works on a particular video stream.

We previously mentioned Graphite and Grafana: the former collects metrics, and the latter displays them in a web UI as bar charts. The compose bundle includes a preconfigured dashboard that can be used without any additional modifications.

First things first, ensure that your environment is configured correctly to run Savant. After the environment is ready, we can download the source code and launch it:

git clone --depth 1 --branch v0.2.2 https://github.com/insight-platform/Savant.git
cd Savant/samples/traffic_meter
git lfs pull

# if you want to share with us where you are from,
# run the following command; it is completely optional
curl --silent -O -- https://hello.savant.video/traffic_meter.html

# if x86 (for Jetson, change 'x86' to 'l4t').
docker compose -f docker-compose.x86.yml pull
docker compose -f docker-compose.x86.yml build
docker compose -f docker-compose.x86.yml up

The first launch takes time to initialize because the compose downloads “pudgy” Nvidia images from their registry.

After the start, Savant compiles a detection model into a highly optimized TensorRT engine required by Nvidia; it takes up to 1 minute, depending on your hardware. You can monitor the process by looking at the CPU utilization: it significantly decreases after completion.
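If you want to automate that observation, a few lines of Python on the host will do. This is a crude convenience sketch, assuming psutil is installed (pip install psutil) and using an arbitrary 50% threshold.

import psutil, time

# Poll system-wide CPU load; a sustained drop usually means the
# TensorRT engine compilation has finished.
while psutil.cpu_percent(interval=5) > 50:   # 50% is an arbitrary threshold
    print('still compiling...')
print('CPU load dropped; the TensorRT engine is likely ready')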

Meanwhile, you can open the RTSP URL in your favorite video player (like VLC) and wait until the video appears: rtsp://127.0.0.1:554/stream. If you want to use a browser, navigate to http://127.0.0.1:888/stream. Until frames are actually processed, you will see a stub picture like this:

If you can see it in a video player or a browser, AO-RTSP works properly, and you only have to wait until processed frames arrive. You should see the scene depicted in the following image:

In the upper part of the scene, a dashboard displays the number of people crossing the lines in both directions. Also, every tracked person is marked with a colored dot. When a person crosses the lines, they are annotated with a badge displaying the crossing direction.
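If you prefer to verify the stream programmatically rather than in a player, a few lines of OpenCV will confirm frames are flowing (assuming opencv-python with FFmpeg support is installed):

import cv2

# Grab one frame from the AO-RTSP output to confirm frames are flowing.
cap = cv2.VideoCapture('rtsp://127.0.0.1:554/stream')
ok, frame = cap.read()
print('stream is up, frame shape:', frame.shape if ok else None)
cap.release()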

Once the video stream is playing in a player or a browser, you can open the Grafana UI (http://127.0.0.1:3000) and log in with admin/admin to view the chart: http://127.0.0.1:3000/d/WM6WimE4z/entries-exits?orgId=1&refresh=5s.

Congratulations, you have made it: the bundle handles the video and displays the footage and graphs. The next step is decomposing the “compose” file for real-life usage by decoupling sources and sinks.

Take your time exploring Savant/samples/traffic_meter/docker-compose.x86.yml to get acquainted with its implementation.

Registration lines are defined in Savant/samples/traffic_meter/line_config.yml.

Building a Real-Life Topology

To convert the all-in-one compose bundle into a real-life system, we need to do the following:

  • use RTSP data sources instead of looped video;
  • be able to play footage for a chosen video stream on demand;
  • specify registration lines per stream;
  • configure the persistent storage for metrics.

Let us begin with decoupling. In the following picture, you may see what will be done:

The compose file corresponding to the new architecture is listed below:

version: "3.3"
services:
  module:
    build:
      context: .
      dockerfile: docker/Dockerfile.x86
    volumes:
      - zmq_sockets:/tmp/zmq-sockets
      - ../../models/traffic_meter:/models
      - ../../downloads/traffic_meter:/downloads
      - .:/opt/savant/samples/traffic_meter
    command: samples/traffic_meter/module-${DETECTOR}.yml
    environment:
      - ZMQ_SRC_ENDPOINT=sub+bind:ipc:///tmp/zmq-sockets/input-video.ipc
      - ZMQ_SINK_ENDPOINT=pub+bind:ipc:///tmp/zmq-sockets/output-video.ipc
      - FPS_PERIOD=1000
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
  graphite:
    image: graphiteapp/graphite-statsd
  grafana:
    image: grafana/grafana-oss
    volumes:
      - ./grafana_datasources:/etc/grafana/provisioning/datasources/
      - ./grafana_dashboards:/etc/grafana/provisioning/dashboards/
    ports:
      - "3000:3000"
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_NAME=Main Org.
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Viewer

volumes:
  zmq_sockets:

Now we want to allow RTSP Source adapters to connect to the module over the network, so we change UNIX domain sockets to TCP/IP sockets:

version: "3.3"
services:
  module:
    build:
      context: .
      dockerfile: docker/Dockerfile.x86
    volumes:
      # - zmq_sockets:/tmp/zmq-sockets # not needed
      - ../../models/traffic_meter:/models
      - ../../downloads/traffic_meter:/downloads
      - .:/opt/savant/samples/traffic_meter
    command: samples/traffic_meter/module-${DETECTOR}.yml
    environment:
      - ZMQ_SRC_ENDPOINT=sub+bind:tcp://0.0.0.0:3331 # changed
      - ZMQ_SINK_ENDPOINT=pub+bind:tcp://0.0.0.0:3332 # changed
      - FPS_PERIOD=1000
    ports:
      - "3331:3331"
      - "3332:3332"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

# the rest is skipped
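The sub+bind / pub+connect notation maps directly onto ZeroMQ socket roles: the module binds a SUB socket, and every adapter connects with a PUB socket, so many sources can feed one module. A bare pyzmq sketch of this topology (socket roles only, not Savant's wire protocol):

import zmq

ctx = zmq.Context()

# The module side: ZMQ_SRC_ENDPOINT=sub+bind:tcp://0.0.0.0:3331
module_in = ctx.socket(zmq.SUB)
module_in.bind('tcp://0.0.0.0:3331')
module_in.setsockopt(zmq.SUBSCRIBE, b'')

# An adapter side: ZMQ_ENDPOINT=pub+connect:tcp://<module-ip>:3331
adapter_out = ctx.socket(zmq.PUB)
adapter_out.connect('tcp://127.0.0.1:3331')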

Also, we want to make the Graphite database persistent, so we specify a host directory for it:

version: "3.3"
services:
  # the module service skipped
  graphite:
    image: graphiteapp/graphite-statsd
    volumes:
      - ./graphite-storage:/opt/graphite/storage # changed
  grafana:
    image: grafana/grafana-oss
    volumes:
      - ./grafana_datasources:/etc/grafana/provisioning/datasources/
      - ./grafana_dashboards:/etc/grafana/provisioning/dashboards/
      - grafana_data:/var/lib/grafana/ # changed
    ports:
      - "3000:3000"
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_NAME=Main Org.
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Viewer

volumes: # changed
  grafana_data: # changed

The final compose file looks as follows:

version: "3.3"
services:
  module:
    build:
      context: .
      dockerfile: docker/Dockerfile.x86
    volumes:
      - ../../models/traffic_meter:/models
      - ../../downloads/traffic_meter:/downloads
      - .:/opt/savant/samples/traffic_meter
    command: samples/traffic_meter/module-${DETECTOR}.yml
    environment:
      - ZMQ_SRC_ENDPOINT=sub+bind:tcp://0.0.0.0:3331
      - ZMQ_SINK_ENDPOINT=pub+bind:tcp://0.0.0.0:3332
      - FPS_PERIOD=1000
    ports:
      - "3331:3331"
      - "3332:3332"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
  graphite:
    image: graphiteapp/graphite-statsd
    volumes:
      - ./graphite-storage:/opt/graphite/storage
  grafana:
    image: grafana/grafana-oss
    volumes:
      - ./grafana_datasources:/etc/grafana/provisioning/datasources/
      - ./grafana_dashboards:/etc/grafana/provisioning/dashboards/
      - grafana_data:/var/lib/grafana/
    ports:
      - "3000:3000"
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_NAME=Main Org.
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Viewer

volumes:
  grafana_data:

At this point, we have a compose file that runs only the analytics module (configured to communicate with adapters over TCP/IP), Graphite with a volume mapped to the host, and Grafana.

The resulting compose file is Savant/samples/traffic_meter/docker-compose-no-adapters.x86.yml.

Run it in the detached mode (-d) or in the interactive mode (without -d):

docker compose -f docker-compose-no-adapters.x86.yml up -d

# to stop use
# docker compose -f docker-compose-no-adapters.x86.yml stop

# to remove use
# docker compose -f docker-compose-no-adapters.x86.yml rm -f

Please ensure that the services are running:

docker compose -f docker-compose-no-adapters.x86.yml ps  | awk '{print $1}'

NAME
traffic_meter-grafana-1
traffic_meter-graphite-1
traffic_meter-module-1

and Grafana is up and running on http://127.0.0.1:3000/d/WM6WimE4z/entries-exits?orgId=1&refresh=5s.

Connecting Adapters

No video sources or sinks are attached to the module in the current configuration. You can add an arbitrary number of RTSP cams to the module (as long as the CPU and GPU can process the traffic). You can also attach the AO-RTSP Sink adapter to a stream to visually control its processing.

GeForce Video Encoding Limitation

The GeForce GPU family has a hardware limit of three simultaneously encoded streams: it is impossible to run more than 3 streams when hardware encoding is used. Read the article explaining Savant in detail to learn more about it. To run the system on GeForce, you need to disable output_frame encoding completely or switch it to raw_rgba.

The raw_rgba mode decreases performance significantly, so we recommend not using it in production; thus, consider switching to Tesla, Quadro GPUs, or Nvidia Jetson.

So, if you are on GeForce, change the following lines in module-peoplenet.yml:

output_frame:
  codec: h264

to raw_rgba:

output_frame:
  codec: raw_rgba

or disable output frames by commenting out the output_frame section.

Configure Per-Cam Registration Lines

The lines are configured per cam in the line_config.yml file. They are measured in the [1280x900] coordinate space: the initial frame is scaled to 1280x720, and a padding of 180 px is added on top (see the helper sketch after the listing below).

Assuming that you have a cam named cam-1, you need to add its lines to the config:

town-centre-processed:
  from: [305, 600, 685, 700]
  to: [325, 580, 705, 680]

# add other source ids like so
cam-1:
  from: [105, 665, 1160, 665]
  to: [115, 645, 1175, 645]
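If you measure line endpoints on the camera's native frame, convert them into this padded 1280x900 space first. A hypothetical helper (the function name and rounding are ours, not part of the sample):

def to_line_space(x, y, src_width, src_height):
    # Scale the native frame to 1280x720, then shift down by the
    # 180 px padding added on top.
    return round(x * 1280 / src_width), round(y * 720 / src_height) + 180

# E.g., a point at (960, 540) on a 1920x1080 camera frame:
print(to_line_space(960, 540, 1920, 1080))  # -> (640, 540)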

After changing the file, you must restart the module:

docker compose -f docker-compose-no-adapters.x86.yml restart module

Connect The RTSP Cam “cam-1”

To connect an RTSP cam to the module, you need to use the RTSP Source adapter provided by Savant:

# you are in Savant/samples/traffic_meter

docker run --rm -it --name rtsp-cam-1 \
--entrypoint /opt/savant/adapters/gst/sources/rtsp.sh \
-e ZMQ_ENDPOINT=pub+connect:tcp://10.10.10.1:3331 \
-e SOURCE_ID=cam-1 \
-e RTSP_URI=rtsp://10.10.10.10 \
ghcr.io/insight-platform/savant-adapters-gstreamer:0.2.2

Assuming that the module is available at 10.10.10.1:3331 and the cam is available at rtsp://10.10.10.10, the adapter captures the cam’s traffic and sends it to the module for processing under the name cam-1.

Take a look at SOURCE_ID=cam-1. It must be unique among all the cams connected to the module (and if you use a globally shared Graphite, the name must be globally unique because it is used in metric names).

Visual Inspection Of The “cam-1” RTSP Cam

Now, when you have connected the cam, you can ensure it works properly. Let’s use the Always-On RTSP Sink adapter for it:

# you are in Savant/samples/traffic_meter

docker run --rm -it --name sink-always-on-rtsp \
--gpus=all \
--entrypoint python \
-p 554:554 -p 888:888 \
-e ZMQ_ENDPOINT=sub+connect:tcp://10.10.10.1:3332 \
-e SOURCE_ID=cam-1 \
-e STUB_FILE_LOCATION=/stub_imgs/smpte100_1280x720.jpeg \
-e DEV_MODE=True \
-v $(pwd)/../assets/stub_imgs:/stub_imgs:ro \
ghcr.io/insight-platform/savant-adapters-deepstream:0.2.2-6.2 \
-m adapters.ds.sinks.always_on_rtsp

Take a look at SOURCE_ID=cam-1. It corresponds to the name of the cam configured in the RTSP Source adapter.

Now you can open the URL rtsp://X.Y.Z.C:554/stream in your video player, where the address X.Y.Z.C is the IP where you started AO-RTSP. Alternatively, you can navigate to http://X.Y.Z.C:888/stream in your browser (LL-HLS).

You should see the lines drawn and the traffic metered as shown in the video (but from your RTSP cam):

Configure Charts For “cam-1”

The last step to complete is configuring the charts related to “cam-1”. We will use the existing chart provided with the sample as a template.

Log in to Grafana (following the previous assumption that the compose is deployed at 10.10.10.1): http://10.10.10.1:3000/login. Use the admin/admin credentials to sign in.

Navigate to “Dashboards”:

Open the dashboard and click “Duplicate” in the “More” menu:

Next, click “Edit” to customize the newly created dashboard:

Change the fields as demonstrated in the following picture:

You are good to go. Metrics for cam-1 should appear soon.

Adding More Cams

You can add more cams, as demonstrated in the previous sections. Remember to supply the unique SOURCE_ID to each cam.

To efficiently process multiple streams at once, you need to tweak the pipeline's batch size. Experiment with model.batch_size for the detector unit, increasing its value from 1 to 2, 4, etc., to find the value that allows processing the highest number of streams simultaneously.

Changing batch_size causes the model to be recompiled.

About YOLOv8

You may find a module working with the YOLOv8 model in the same directory. The pros of YOLO are that it detects various classes out of the box and can be trained to detect other classes. However, if you are focused on person detection, we recommend PeopleNet because of its lower CPU utilization and higher quality.

Take a look at the file module-yolov8m.yml. To switch the module to it, change peoplenet to yolov8m in the .env file:

DETECTOR=peoplenet
# DETECTOR=yolov8m

Now, when you run the compose, the pipeline will use YOLOv8.

TensorRT compiles YOLOv8 for a very long time: expect to wait about 10 minutes until it is ready. When deploying at scale, you may want to distribute ready-to-use engines to avoid a long boot time in ephemeral environments without persistent volumes. Consult the Savant samples to learn how to distribute precompiled engines.

If you take a look at the contents of module-yolov8m.yml, you will find lines like the following (counting people):

detected_object:
  id: 0
  label: person
  # id: 2
  # label: car

By changing it to:

detected_object:
  # id: 0
  # label: person
  id: 2
  label: car

you will make the system track cars, as demonstrated in the following video:

If you carefully analyze the video, you may notice that the pipeline counts only passenger cars, leaving trucks and buses out of scope. To count them in the same group, you may “merge” multiple classes as shown in the following listing:

    # detector
    - element: nvinfer@detector
      name: yolov8m
      model:
        remote:
          url: s3://savant-data/models/yolov8m/yolov8m.zip
          checksum_url: s3://savant-data/models/yolov8m/yolov8m.md5
          parameters:
            endpoint: https://eu-central-1.linodeobjects.com
        format: custom
        config_file: config_infer_primary_yoloV8.txt
        custom_lib_path: /opt/savant/lib/libnvdsinfer_custom_impl_Yolo.so
        output:
          objects:
            - class_id: 0 # person
              label: ${parameters.detected_object.label}
            - class_id: 1 # bicycle
              label: ${parameters.detected_object.label}
            - class_id: 2 # car
              label: ${parameters.detected_object.label}
            - class_id: 3 # motorcycle
              label: ${parameters.detected_object.label}
            - class_id: 5 # bus
              label: ${parameters.detected_object.label}
            - class_id: 7 # truck
              label: ${parameters.detected_object.label}

Don’t forget to change the object type in Grafana:

Custom YOLOv8

You can use your custom-trained YOLOv8. Download the archive with the model used in the example from https://eu-central-1.linodeobjects.com/savant-data/models/yolov8m/yolov8m.zip. Investigate its contents and craft a similar archive for your model. Then, upload the archive to S3 or HTTP(S) storage and change the URL in the module.
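To peek at the reference archive's layout without unpacking it by hand, the Python standard library is enough:

import io, urllib.request, zipfile

# Download the reference model archive and list its contents.
url = ('https://eu-central-1.linodeobjects.com/'
       'savant-data/models/yolov8m/yolov8m.zip')
with urllib.request.urlopen(url) as resp:
    zipfile.ZipFile(io.BytesIO(resp.read())).printdir()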

For details, refer to the Savant documentation.

Conclusion

We have demonstrated how to use the Savant framework, Graphite, and Grafana to count various kinds of traffic and visualize the results. The system can be deployed in production with Docker or K8s. There are various ways to optimize performance, like introducing heuristics that skip processing when there are no objects in the frame, or detecting when the viewport is too dark to track anything. With Savant, it is also possible to scale with non-real-time processing, which helps utilize hardware 24x7 when the results are not required immediately.

You may want to decouple Graphite and Grafana from the compose, leaving only the module inside: this is beneficial when multiple GPU machines are involved in processing.

Please visit our Discord server and GitHub Discussions to learn more and discuss Savant. Consider giving us a 🌠 on GitHub: it makes our developers happy and helps Savant become more visible.
