Auto-Scaling Services With Instrumented Metrics¶
Docker Scaler provides an alternative to the Jenkins-based service scaling shown in Docker Flow Monitor's auto-scaling tutorial. In this tutorial, we will construct a system that scales a service based on response time. Here is an overview of the triggered events in our self-adapting system:
- The go-demo service's response times become high.
- Docker Flow Monitor queries the service's metrics, notices the high response times, and alerts the Alertmanager.
- The Alertmanager is configured to forward the alert to Docker Scaler.
- Docker Scaler scales the service up.
This tutorial assumes you have Docker Machine version v0.8+ that includes Docker Engine v1.12+.
Info
If you are a Windows user, please run all the examples from Git Bash (installed through Docker for Windows). Also, make sure that your Git client is configured to check out the code AS-IS. Otherwise, Windows might change carriage returns to the Windows format.
We will be using Slack webhooks to notify us. Create a Slack channel and set up a webhook by consulting Slack's Incoming Webhooks page. After obtaining a webhook URL, set it as an environment variable:
export SLACK_WEBHOOK_URL=[...]
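If you want to verify the webhook before moving on, you can post a test message to it. This optional check uses Slack's standard incoming-webhook payload; the message text is arbitrary:
curl -X POST "$SLACK_WEBHOOK_URL" \
    -H 'Content-Type: application/json' \
    -d '{"text": "Docker Scaler tutorial webhook test"}'
A message should appear in the channel you attached the webhook to.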
Setting Up A Cluster¶
Info
Feel free to skip this section if you already have a Swarm cluster that can be used for this tutorial.
We create a Swarm cluster consisting of three nodes with Docker Machine.
git clone https://github.com/thomasjpfan/docker-scaler.git
cd docker-scaler
./scripts/ds-swarm.sh
eval $(docker-machine env swarm-1)
The repo contains all the scripts and stack files needed throughout this tutorial. Next, we executed ds-swarm.sh, creating the cluster with docker-machine. Finally, we used the eval command to tell our local Docker client to use the remote Docker Engine on swarm-1.
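To verify that the cluster is ready, list its nodes. The output should show swarm-1, swarm-2, and swarm-3:
docker node ls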
Deploying Docker Flow Proxy (DFP) and Docker Flow Swarm Listener (DFSL)¶
For convenience, we will use Docker Flow Proxy and Docker Flow Swarm Listener to get a single access point to the cluster.
docker network create -d overlay proxy

docker stack deploy \
    -c stacks/docker-flow-proxy-mem.yml \
    proxy
Please visit proxy.dockerflow.com and swarmlistener.dockerflow.com for details on the Docker Flow stack.
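Before moving on, you can confirm that the proxy stack's replicas reached the running state:
docker stack ps proxy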
Deploying Docker Scaler¶
We can now deploy the Docker Scaler stack:
docker network create -d overlay scaler

docker stack deploy \
    -c stacks/docker-scaler-service-scale-tutorial.yml \
    scaler
This stack defines a single Docker Scaler service:
...
services:
  scaler:
    image: thomasjpfan/docker-scaler
    environment:
      - ALERTMANAGER_ADDRESS=http://alertmanager:9093
      - SERVER_PREFIX=/scaler
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - scaler
    deploy:
      replicas: 1
      labels:
        - com.df.notify=true
        - com.df.distribute=true
        - com.df.servicePath=/scaler
        - com.df.port=8080
      placement:
        constraints: [node.role == manager]
...
This definition constrains Docker Scaler to run on manager nodes and gives it access to the Docker socket so that it can scale services in the cluster. The label com.df.servicePath=/scaler and the environment variable SERVER_PREFIX=/scaler allow us to interact with the scaler service.
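As with the proxy, it is worth confirming that the scaler stack is running before proceeding:
docker stack ps scaler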
Deploying Docker Flow Monitor and Alertmanager¶
The next stack defines the Docker Flow Monitor and Alertmanager services. Before we deploy the stack, we define our Alertmanager configuration as a Docker secret:
echo "global: slack_api_url: '$SLACK_WEBHOOK_URL' route: receiver: 'slack' group_by: [service, scale, type] group_interval: 5m repeat_interval: 5m routes: - match_re: scale: up|down type: service receiver: 'scale' - match: alertname: scale_service group_by: [alertname, service, status] repeat_interval: 10s group_interval: 1s group_wait: 0s receiver: 'slack-scaler' receivers: - name: 'slack' slack_configs: - send_resolved: true title: '[{{ .Status | toUpper }}] {{ .GroupLabels.service }} service is in danger!' title_link: 'http://$(docker-machine ip swarm-1)/monitor/alerts' text: '{{ .CommonAnnotations.summary }}' - name: 'slack-scaler' slack_configs: - title: '{{ .GroupLabels.alertname }}: {{ .CommonAnnotations.request }}' color: '{{ if eq .GroupLabels.status \"error\" }}danger{{ else }}good{{ end }}' title_link: 'http://$(docker-machine ip swarm-1)/monitor/alerts' text: '{{ .CommonAnnotations.summary }}' - name: 'scale' webhook_configs: - send_resolved: false url: 'http://scaler:8080/v1/scale-service' " | docker secret create alert_manager_config -
This configuration groups alerts by their service, scale, and type labels. The routes section defines a match_re entry that directs scale alerts to the scale receiver. Another route directs alerts from the scaler service to the slack-scaler receiver. Now we deploy the monitor stack:
docker network create -d overlay monitor

DOMAIN=$(docker-machine ip swarm-1) \
    docker stack deploy \
    -c stacks/docker-flow-monitor-slack.yml \
    monitor
The alert-manager service is configured to read the alert_manager_config secret in the stack definition as follows:
...
  alert-manager:
    image: prom/alertmanager:v0.14.0
    networks:
      - monitor
      - scaler
    secrets:
      - alert_manager_config
    command: --config.file=/run/secrets/alert_manager_config --storage.path=/alertmanager
...
With access to the scaler network, alert-manager can send scaling requests to the scaler service. More information about the Docker Flow Monitor stack can be found in its documentation.
Let us confirm that the monitor stack is up and running:
docker stack ps monitor
Please wait a few moments for all the replicas to reach the running status. Once the monitor stack is up and running, we can move on to scaling services.
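If you prefer not to re-run the command by hand, a small shell loop can do the waiting. This is a convenience sketch, not part of the tutorial's scripts, and it assumes a Docker CLI recent enough to support --format on docker stack ps:
# Poll every 5 seconds until no replica reports a state other than Running
until [[ -z $(docker stack ps monitor -f desired-state=running \
    --format '{{.CurrentState}}' | grep -v Running) ]]; do
    sleep 5
done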
Manually Scaling Services¶
For this section, we deploy a simple sleeping service:
docker service create -d --replicas 1 \
    --name demo \
    -l com.df.scaleUpBy=3 \
    -l com.df.scaleDownBy=2 \
    -l com.df.scaleMin=1 \
    -l com.df.scaleMax=7 \
    alpine:3.6 sleep 100000000000
Labels com.df.scaleUpBy=3 and com.df.scaleDownBy=2 configure how many replicas to scale up and down by, respectively. Labels com.df.scaleMin=1 and com.df.scaleMax=7 denote the minimum and maximum number of replicas. We manually scale up this service by sending a POST request:
curl -X POST http://$(docker-machine ip swarm-1)/scaler/v1/scale-service -d \
    '{"groupLabels": {"scale": "up", "service": "demo"}}'
We confirm that the demo service has been scaled up by 3, from 1 replica to 4 replicas:
docker service ls -f name=demo
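The output should resemble the following (the ID column is omitted):
NAME MODE REPLICAS IMAGE PORTS
demo replicated 4/4 alpine:3.6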
Similarly, we manually scale down the service by sending:
curl -X POST http://$(docker-machine ip swarm-1)/scaler/v1/scale-service -d \
    '{"groupLabels": {"scale": "down", "service": "demo"}}'
We confirm that the demo service has been scaled down by 2, from 4 replicas to 2 replicas:
docker service ls -f name=demo
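This time the REPLICAS column should read 2/2:
NAME MODE REPLICAS IMAGE PORTS
demo replicated 2/2 alpine:3.6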
Before continuing, remove the demo service:
docker service rm demo
Deploying Instrumented Service¶
The go-demo service already exposes response time metrics with labels for Docker Flow Monitor to scrape. We deploy the service to be scaled based on the response time metrics:
docker stack deploy \
    -c stacks/go-demo-instrument-alert-short.yml \
    go-demo
The full stack definition can be found at go-demo-instrument-alert-short.yml. We will focus on the go-demo_main service labels relating to scaling and alerting:
main:
  ...
  deploy:
    ...
    labels:
      ...
      - com.df.alertName.1=resptimeabove
      - com.df.alertIf.1=@resp_time_above:0.1,5m,0.99
      - com.df.alertName.2=resptimebelow_unless_resptimeabove
      - com.df.alertIf.2=@resp_time_below:0.025,5m,0.75_unless_@resp_time_above:0.1,5m,0.9
      - com.df.alertLabels.2=receiver=system,service=go-demo_main,scale=down,type=service
      - com.df.alertAnnotations.2=summary=Response time of service go-demo_main is below 0.025 and not above 0.1
      ...
The alertName, alertIf, and alertFor labels use the AlertIf Parameter Shortcuts to create Prometheus expressions that fire alerts. The alert com.df.alertIf.1=@resp_time_above:0.1,5m,0.99 fires when fewer than 99% of the responses in the last 5 minutes are faster than 0.1 seconds. The second alert fires when more than 75% of the responses in the last 5 minutes are faster than 0.025 seconds, unless the first alert is firing.
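For reference, a shortcut such as @resp_time_above:0.1,5m,0.99 expands into a Prometheus expression along the following lines. This is a sketch based on Docker Flow Monitor's shortcut definitions; the exact metric and label names depend on the service's instrumentation:
sum(rate(http_server_resp_time_bucket{job="go-demo_main", le="0.1"}[5m]))
    / sum(rate(http_server_resp_time_count{job="go-demo_main"}[5m]))
    < 0.99
The alert fires while this expression evaluates to true.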
We can view the alerts generated by these labels:
open "http://$(docker-machine ip swarm-1)/monitor/alerts"
Let's confirm that the go-demo stack is up and running:
docker stack ps -f desired-state=running go-demo
There should be three replicas of the go-demo_main service and one replica of the go-demo_db service. Please wait for all replicas to be up and running.
We can confirm that Docker Flow Monitor is monitoring the go-demo replicas:
open "http://$(docker-machine ip swarm-1)/monitor/targets"
There should be two or three targets depending on whether Prometheus already sent the alert to de-scale the service.
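The same information is available from Prometheus' HTTP API if you prefer the command line, assuming the proxy forwards the /monitor prefix to Prometheus as in this tutorial's stacks:
curl "http://$(docker-machine ip swarm-1)/monitor/api/v1/targets"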
Automatically Scaling Services¶
Let's go back to Prometheus' alert screen:
open "http://$(docker-machine ip swarm-1)/monitor/alerts"
By this time, the godemo_main_resptimebelow_unless_resptimeabove alert should be red because no requests have been sent to go-demo. The Alertmanager receives this alert and sends a POST request to the scaler service to scale down go-demo. The label com.df.scaleDownBy on go-demo_main is set to 1, so the number of replicas goes from 4 to 3.
Let's look at the logs of scaler:
docker service logs scaler_scaler
There should be a log message that states Scaling go-demo_main from 4 to 3 replicas (min: 2). We can check that this happened:
docker service ls -f name=go-demo_main
The output should be similar to the following:
NAME MODE REPLICAS IMAGE PORTS
go-demo_main replicated 3/3 vfarcic/go-demo:latest
Please visit your Slack channel and you should see a notification stating that go-demo_main has scaled from 4 to 3 replicas.
Let's see what happens when the service's response times become high. We send requests that will result in slow responses:
for i in {1..30}; do
    DELAY=$[ $RANDOM % 6000 ]
    curl "http://$(docker-machine ip swarm-1)/demo/hello?delay=$DELAY"
done
Let's look at the alerts:
open "http://$(docker-machine ip swarm-1)/monitor/alerts"
The godemo_main_resptimeabove alert turned red, indicating that the threshold was reached. Alertmanager receives the alert and sends a POST request to the scaler service, and docker-scaler scales go-demo_main up by the value of com.df.scaleUpBy. In this case, that value is two. Let's look at the logs of docker-scaler:
docker service logs scaler_scaler
There should be a log message that states Scaling go-demo_main from 3 to 5 replicas (max: 7). This message is also sent through Slack to notify us of this scaling event.
We can confirm that the number of replicas indeed scaled to five by listing the service:
docker service ls -f name=go-demo_main
The output should look similar to the following:
NAME MODE REPLICAS IMAGE PORTS
go-demo_main replicated 5/5 vfarcic/go-demo:latest
What Now?¶
We just went through a simple example of a system that automatically scales and de-scales services. Feel free to add additional metrics and services to this self-adapting system to customize it to your needs.
Please remove the demo cluster we created and free your resources:
docker-machine rm -f swarm-1 swarm-2 swarm-3