I have an application deployed in a Docker Swarm that has two publicly reachable services, HTTP and WS.
I created two target groups, one for each service, and the registered instances are the managers of the Docker Swarm. Then I created the ALB and added two HTTPS listeners, each one pointing to the specific target group.
Now comes the problem. The HTTP health check passes without any problem, but the WebSocket check is always unhealthy, and I don’t know why. According to http://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-listeners.html, using an HTTP/HTTPS listener should work for WS/WSS as well.
In the WS check, I have tried both / and the path the application is actually using, /ws, as the health check path. Neither of them passes the health check.
It’s not a problem related to firewall either. Security groups are wide open and there are no iptables rules, so connection is possible in both directions.
I launched the WebSocket container outside of Docker Swarm, just to test whether it was something related to Swarm (which I was pretty sure it was not, but hell, for testing’s sake), and it did not work either, so now I’m a little out of hope.
What configuration might I be missing, so that HTTP services work but WebSocket services don’t?
I’m back with this issue. After further research, the problem seems to be the target group, not the ALB per se. Reading through the documentation at http://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-target-groups.html, I realized I had forgotten to enable the stickiness option. However, I have now enabled it, and the problem persists.
It looks like the ELB is not upgrading the connection from HTTP to WebSocket.
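One way to check whether the upgrade really happens through the load balancer is to send the opening handshake by hand and look for a 101 response with a matching Sec-WebSocket-Accept header. The sketch below only builds the request and validates a response head; the function names are made up for illustration, and the host and path you pass in are your own values, not anything from this thread.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(key: str) -> str:
    """Return the Sec-WebSocket-Accept value a server must send for this key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_handshake(host: str, path: str, key: str) -> bytes:
    """Build the HTTP/1.1 request that asks the server to switch to WebSocket."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def is_upgraded(response_head: str, key: str) -> bool:
    """True if the response head is a 101 with the correct accept header."""
    status_line, _, rest = response_head.partition("\r\n")
    parts = status_line.split()
    if len(parts) < 2 or parts[1] != "101":
        return False
    headers = {}
    for line in rest.split("\r\n"):
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers.get("sec-websocket-accept") == expected_accept(key)
```

To use it, you would open a TCP (or TLS) socket to the load balancer, send the bytes from build_handshake, read up to the blank line, and pass that text to is_upgraded. A 200, 400 or 426 instead of 101 would suggest the request is being treated as plain HTTP somewhere along the way.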
ALBs do not support WebSocket health checks, per the documentation:
“Health checks do not support WebSockets.”
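Since the health check is a plain HTTP request, a common workaround is to have the WebSocket service also answer an ordinary GET with a 200 on some path and point the target group's health check at that path. Below is a minimal sketch of the idea using only the Python standard library. The /health path, port 8080 and the 426 answer on /ws are assumptions for illustration; your WebSocket server may answer a non-upgrade request with a different status.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

HEALTH_PATH = "/health"  # hypothetical path; must match the target group setting
WS_PATH = "/ws"          # the path the application uses for WebSocket traffic


def plain_get_status(path: str) -> int:
    """Status code to return for a GET that carries no Upgrade header."""
    if path == HEALTH_PATH:
        return 200  # what the health check expects by default
    if path == WS_PATH:
        return 426  # Upgrade Required: a plain GET is not a WebSocket handshake
    return 404


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only plain HTTP is handled here; the real WebSocket handshake
        # would be served by the application's own WebSocket library.
        self.send_response(plain_get_status(self.path))
        self.send_header("Content-Length", "0")
        self.end_headers()


if __name__ == "__main__":
    # Port 8080 is an assumption; use whatever port the target group targets.
    HTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```

With something like this in place, the health check path in the target group would be set to /health instead of / or /ws. Alternatively, the success codes in the target group's health check settings can be widened to include whatever status the WebSocket endpoint returns for a plain GET.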
The issue is that, despite AWS claiming that the ALB supports HTTP/2, it in fact downgrades everything to HTTP/1, does its thing, and then upgrades it to HTTP/2 again, which breaks everything.