Monitoring a Dockerised Celery Cluster with Flower

A flower, sometimes known as bloom or blossom, is the reproductive structure found in flowering plants. Celery is a marshland plant in the family Apiaceae that has been cultivated as a vegetable since antiquity. A docker is a waterfront manual laborer who is involved in loading and unloading ships, trucks, trains or airplanes (Wikipedia).

Flower is a web-based tool for monitoring Celery workers and task progress. Install with pip:

# install flower via pip
$ pip install flower

# start flower, pass the message broker url and flower port
$ flower --broker=redis://localhost:6379/0 --port=8888

Open http://localhost:8888 in your browser.

Flower on Docker

Use the official mher/flower Docker image to dockerize Flower. Define the docker-compose flower service:

# docker-compose.yaml
flower:
  image: mher/flower
  command: ["flower", "--broker=redis://redis:6379/0", "--port=8888"]
  ports:
    - 8888:8888

This solution is not ideal. If you need to change your broker URL, you have to touch the flower command, and the Celery worker services too, since they use the same broker. That gets messy if you run your app in more than one environment (say, QA and production). And this is not even a complex setup.

Stuff like broker url and flower port is configuration. The twelve-factor app stores config in environment variables. Environment variables are easy to change between deploys. Docker supports and encourages the use of environment variables for config. Both Celery and Flower support configuration via environment variables out of the box. Flower is (roughly speaking) a Celery extension and thus supports all Celery settings.

All Celery settings (the list is available here) can be set via environment variables: write the setting name in capital letters and prefix it with CELERY_. For example, to set broker_url, use the CELERY_BROKER_URL environment variable. The Flower-specific settings can also be set via environment variables. A full list is available here; uppercase the option name and prefix it with FLOWER_. For instance, to configure port, use the FLOWER_PORT environment variable.
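
The naming convention can be sketched in a few lines of Python. The helper env_var_name is hypothetical and exists only to illustrate the rule described above; it is not part of Celery or Flower.

```python
def env_var_name(setting: str, prefix: str) -> str:
    """Build an environment variable name: prefix + upper-cased setting name.

    Hypothetical helper for illustration only, not a Celery or Flower API.
    """
    return f"{prefix}{setting.upper()}"


# The two examples from the text above.
print(env_var_name("broker_url", "CELERY_"))  # CELERY_BROKER_URL
print(env_var_name("port", "FLOWER_"))        # FLOWER_PORT
```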

Refactor the docker-compose flower service:

# docker-compose.yaml
flower:
  image: mher/flower
  environment:
    - CELERY_BROKER_URL=redis://redis:6379/0
    - FLOWER_PORT=8888
  ports:
    - 8888:8888
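
If your own application code also needs these values, it can read the same variables, so that QA and production differ only in their environment. The following is a minimal sketch that assumes the variable names used above; load_settings and its fallback values (taken from the local example earlier in this post) are hypothetical.

```python
import os


def load_settings(environ=os.environ):
    """Read broker URL and Flower port from the environment.

    Hypothetical helper; the defaults mirror the local, non-Docker example.
    """
    return {
        "broker_url": environ.get("CELERY_BROKER_URL", "redis://localhost:6379/0"),
        "flower_port": int(environ.get("FLOWER_PORT", "8888")),
    }


# Simulate the environment that docker-compose would inject into the container.
compose_env = {"CELERY_BROKER_URL": "redis://redis:6379/0", "FLOWER_PORT": "8888"}
print(load_settings(compose_env))
```

Passing the environment in as an argument keeps the function easy to exercise without touching the real process environment.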

Celery Worker on Docker

The Flower dashboard lists all Celery workers connected to the message broker. Celery assigns the worker name, which defaults to celery@hostname. In a container environment, hostname is the container hostname. By default, Docker sets that to the container's short ID, which is a meaningless string.

screenshot

As long as you run only one type of Celery worker, this is not an issue. It becomes one when you run specialised workers in dedicated containers, each processing different queues: you cannot tell which worker is which by looking at the Flower dashboard. All you see is a list of celery@gibberish workers.

One way to solve this is to control the hostname. Docker gives you control over the hostname via the hostname property:

# docker-compose.yaml
worker_1:
  hostname: worker_1
  command: ["celery", "worker", "--app=worker.app", "--loglevel=INFO"]

worker_2:
  hostname: worker_2
  command: ["celery", "worker", "--app=worker.app", "--loglevel=INFO"]

The Flower dashboard now shows these workers as celery@worker_1 and celery@worker_2. Unfortunately, this solution does not scale. Docker uses the same hostname for all containers that belong to the same service.

Scaling worker_1 to two containers results in two workers named celery@worker_1. This is OK with Celery but not so much with Flower. Flower shows only one worker in the dashboard, which is arguably a bug. But even if it did show both workers with the same name, you would not be able to tell them apart.

There is an alternative solution hidden in the Celery docs. Celery provides a --hostname command line argument to set the worker name. The --hostname argument itself supports variables:

%h: hostname, including domain name
%n: hostname only
%d: domain name only
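
To make the three placeholders concrete, here is a small Python sketch of the substitution. It illustrates the behaviour listed above and is not Celery's own code; expand_nodename and the example hostnames are made up.

```python
import re


def expand_nodename(template: str, hostname: str) -> str:
    """Expand %h, %n and %d in a worker name template (illustration only)."""
    # Split "box.example.com" into name "box" and domain "example.com".
    name, _, domain = hostname.partition(".")
    values = {"h": hostname, "n": name, "d": domain}
    return re.sub(r"%([hnd])", lambda match: values[match.group(1)], template)


# A container hostname usually has no domain part, so %h and %n coincide.
print(expand_nodename("worker_1@%h", "3f2a9c1b7d4e"))
print(expand_nodename("%n in %d", "box.example.com"))
```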

Refactor the docker-compose worker services:

# docker-compose.yaml
worker_1:
  command: ["celery", "worker", "--app=worker.app", "--hostname=worker_1@%h", "--loglevel=INFO"]

worker_2:
  command: ["celery", "worker", "--app=worker.app", "--hostname=worker_2@%h", "--loglevel=INFO"]

screenshot

Here, we set the names worker_1 and worker_2. You can make these names meaningful, for example by naming each worker after the queue it subscribes to. The hostname %h is still container hostname gibberish, but it ensures a unique name, even at scale. Plus, it allows you to link back to the container, which can be useful for logging and debugging.
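
With names of the form role@container, you can split a worker name back into its two parts, for instance when post-processing logs. A minimal sketch: group_by_role is a hypothetical helper and the container hostnames are invented for the example.

```python
from collections import defaultdict


def group_by_role(worker_names):
    """Group worker names like 'worker_1@3f2a9c1b7d4e' by the part before '@'."""
    groups = defaultdict(list)
    for full_name in worker_names:
        role, _, container = full_name.partition("@")
        groups[role].append(container)
    return dict(groups)


# Invented example: worker_1 scaled to two containers, worker_2 to one.
names = [
    "worker_1@3f2a9c1b7d4e",
    "worker_1@a81c0e55d902",
    "worker_2@77b0c4e1f6aa",
]
for role, containers in group_by_role(names).items():
    print(role, containers)
```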

Gotchas and shortfalls

Flower has no idea which Celery workers you expect to be up and running. The Flower dashboard shows workers as and when they turn up. When a Celery worker comes online for the first time, the dashboard shows it. When a Celery worker disappears, the dashboard flags it as offline.

When you run a Celery cluster on Docker that scales up and down quite often, you end up with a lot of offline workers. That's a lot of dashboard clutter. As of now, the only solution is to restart Flower. There is an open GitHub issue for this.

Conclusion

Flower is the de facto monitoring tool for Celery. It is easy to set up and deploy into a containerized stack. But it takes some tweaking to make it work effectively in a microservices setup.