26 May 2026
Clustering RabbitMQ on AWS ECS with EC2
A while back I worked on a PoC for running a RabbitMQ cluster on AWS ECS with EC2 as the underlying compute layer. Most material online assumes Kubernetes, so working out the ECS path took more trial and error than it should have. This is the writeup I wish I'd had back then.
Why ECS on EC2
RabbitMQ nodes form a cluster over Erlang's distribution protocol (port 25672) and epmd (4369). We ran the task with network_mode = "host" so those ports bind directly on the instance, which is the simplest way to let cluster traffic flow without extra hops. The trade-off is one task per instance, since two RabbitMQ tasks on the same host would compete for the same fixed ports. So the setup is a three-node Auto Scaling Group with three tasks, one pinned to each instance.
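To make that concrete, here is a minimal Python sketch of what the task and service definitions could look like, written as the argument dictionaries that boto3's `register_task_definition` and `create_service` accept. The family, service and cluster names, the image tag and the memory figure are hypothetical placeholders, not the values from the PoC. Using the `distinctInstance` placement constraint to get one task per instance is also an assumption; it is one way to do the pinning, not necessarily the one used here. Port 5672 (AMQP for clients) is added alongside the two clustering ports.

```python
# Sketch only: names, image tag and memory size are hypothetical placeholders.

def task_definition_args(image="rabbitmq:3-management"):
    """Arguments for ecs.register_task_definition() using host networking."""
    return {
        "family": "rabbitmq",  # hypothetical family name
        "networkMode": "host",
        "requiresCompatibilities": ["EC2"],
        "containerDefinitions": [
            {
                "name": "rabbitmq",
                "image": image,
                "memory": 2048,  # MiB, placeholder value
                "essential": True,
                # In host mode the container port is the instance port,
                # so hostPort can be left out.
                "portMappings": [
                    {"containerPort": port, "protocol": "tcp"}
                    for port in (4369, 5672, 25672)
                ],
            }
        ],
    }


def service_args(cluster="rabbitmq-cluster"):
    """Arguments for ecs.create_service(): three tasks, never two per instance."""
    return {
        "cluster": cluster,  # hypothetical cluster name
        "serviceName": "rabbitmq",
        "taskDefinition": "rabbitmq",
        "desiredCount": 3,
        "launchType": "EC2",
        # Assumption: this constraint is one way to keep tasks on separate instances.
        "placementConstraints": [{"type": "distinctInstance"}],
    }
```

With boto3 installed and credentials configured, these would be passed along as `ecs.register_task_definition(**task_definition_args())` and `ecs.create_service(**service_args())`.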
Peer discovery
Instead of hardcoding peers we used the rabbitmq_peer_discovery_aws plugin, which asks the AWS API for siblings inside the same Auto Scaling Group. The crucial part of rabbitmq.conf:
cluster_formation.peer_discovery_backend = aws
# or whatever region you are deploying to
cluster_formation.aws.region = eu-central-1
cluster_formation.aws.use_autoscaling_group = true
cluster_partition_handling = autoheal
For this to work the EC2 instance profile needs permission to read the ASG and its members (IAM reference):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingInstances",
        "ec2:DescribeInstances"
      ],
      "Resource": "*"
    }
  ]
}
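To see why exactly those two actions are needed, it helps to picture the kind of lookup involved. The following Python sketch is an illustration only, not the plugin's code (the plugin is written in Erlang): it finds the caller's Auto Scaling Group, lists the group's members, then resolves them to private DNS names. The function name `discover_peers` and the two fake client classes are made up for this example. The fakes mimic the response shape of the boto3 `autoscaling` and `ec2` clients so the snippet runs without AWS access. Pagination is left out for brevity.

```python
def discover_peers(instance_id, autoscaling, ec2):
    """Return the private DNS names of every instance in the caller's ASG."""
    # 1. Which Auto Scaling Group does this instance belong to?
    own = autoscaling.describe_auto_scaling_instances(InstanceIds=[instance_id])
    group = own["AutoScalingInstances"][0]["AutoScalingGroupName"]

    # 2. Which instances share that group? (needs autoscaling:DescribeAutoScalingInstances)
    everyone = autoscaling.describe_auto_scaling_instances()
    member_ids = [
        item["InstanceId"]
        for item in everyone["AutoScalingInstances"]
        if item["AutoScalingGroupName"] == group
    ]

    # 3. Turn instance IDs into hostnames (needs ec2:DescribeInstances)
    described = ec2.describe_instances(InstanceIds=member_ids)
    return sorted(
        instance["PrivateDnsName"]
        for reservation in described["Reservations"]
        for instance in reservation["Instances"]
    )


class FakeAutoScaling:
    """Hypothetical stand-in for boto3.client("autoscaling")."""
    def __init__(self, membership):
        self.membership = membership  # instance ID -> ASG name

    def describe_auto_scaling_instances(self, InstanceIds=None):
        ids = InstanceIds or list(self.membership)
        return {"AutoScalingInstances": [
            {"InstanceId": i, "AutoScalingGroupName": self.membership[i]} for i in ids
        ]}


class FakeEC2:
    """Hypothetical stand-in for boto3.client("ec2")."""
    def __init__(self, dns_names):
        self.dns_names = dns_names  # instance ID -> private DNS name

    def describe_instances(self, InstanceIds):
        return {"Reservations": [
            {"Instances": [{"InstanceId": i, "PrivateDnsName": self.dns_names[i]}]}
            for i in InstanceIds
        ]}


asg = FakeAutoScaling({"i-a": "rabbit", "i-b": "rabbit", "i-c": "other"})
ec2 = FakeEC2({
    "i-a": "ip-10-0-0-1.internal",
    "i-b": "ip-10-0-0-2.internal",
    "i-c": "ip-10-0-0-3.internal",
})
peers = discover_peers("i-a", asg, ec2)
print(peers)
```

Against real AWS, the two fakes would be replaced with `boto3.client("autoscaling")` and `boto3.client("ec2")`. Without either permission, the corresponding step would fail with an access-denied error.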
The shared Erlang cookie is pulled from Secrets Manager and injected as RABBITMQ_ERLANG_COOKIE, which is what lets the three nodes authenticate to each other once they've discovered one another.
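One way to do that injection on ECS is the `secrets` field of a container definition, which maps an environment variable name to a Secrets Manager ARN. The sketch below assumes that mechanism; the post doesn't say which one the PoC actually used. The helper name `with_erlang_cookie` and the ARN are hypothetical. For ECS to resolve the secret, the task execution role also needs `secretsmanager:GetSecretValue` on it.

```python
def with_erlang_cookie(container_def, secret_arn):
    """Return a copy of an ECS container definition with the cookie wired in."""
    updated = dict(container_def)
    # Drop any earlier cookie entry so the variable is defined only once.
    secrets = [
        entry for entry in container_def.get("secrets", [])
        if entry["name"] != "RABBITMQ_ERLANG_COOKIE"
    ]
    secrets.append({"name": "RABBITMQ_ERLANG_COOKIE", "valueFrom": secret_arn})
    updated["secrets"] = secrets
    return updated


# Hypothetical ARN, for illustration only.
COOKIE_ARN = "arn:aws:secretsmanager:eu-central-1:123456789012:secret:rabbitmq/erlang-cookie"

container = with_erlang_cookie(
    {"name": "rabbitmq", "image": "rabbitmq:3-management"},
    COOKIE_ARN,
)
print(container["secrets"])
```

Keeping the cookie in `secrets` rather than `environment` means the value never appears in the task definition itself, only the reference to it.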
And that's basically it. Three EC2 instances, one task each, host networking, a plugin and a short IAM policy. The whole setup runs pretty smoothly, without any major hassle. Hope this helps anyone facing similar requirements!