Overforcing LiveView WebSockets on Coolify
TL;DR
Add these params to the Traefik command:
- '--entrypoints.https.transport.respondingTimeouts.readTimeout=0s'
- '--entrypoints.https.transport.respondingTimeouts.idleTimeout=3600s'
Add these service labels to your Phoenix app:
traefik.http.middlewares.proto-headers.headers.customrequestheaders.X-Forwarded-Proto=https
traefik.http.middlewares.no-alt-svc.headers.customresponseheaders.Alt-Svc=
traefik.http.routers.https-1-z8008og8o8wc0oc8c44wkwk8.middlewares=gzip,proto-headers,no-alt-svc
The Bait
I got shame-sniped by some other blog audit so I started compressing PNGs, self-hosting fonts, preloading CSS and LCP images, etc. To check if any of that worked, I opened Chrome DevTools and ran Lighthouse. Everything looked good.
Then I throttled the network to simulate 3G and tried interacting with some other post's widgets. The collapse/expand of the search result was unusually slow.
Quick look at the network tab revealed that the WebSocket connection was never established. Phoenix was using a longpoll fallback instead. Locally, WebSocket worked fine, but when deployed (on a Hostinger VPS running Coolify) it never connected.
Coolify, Traefik and HTTP/3
I checked the HAR logs:
h3 GET 200 https://blog.aezakmi.top/
h3 GET 200 https://blog.aezakmi.top/assets/js/app-...js
h3 GET 200 https://blog.aezakmi.top/live/longpoll?...
h3 GET 200 https://blog.aezakmi.top/live/longpoll?...
h3 POST 200 https://blog.aezakmi.top/live/longpoll?...
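Everything, including the initial page, was served over h3, i.e. HTTP/3 (QUIC). That seemed to explain it: the classic WebSocket handshake is an HTTP/1.1 Upgrade request, and HTTP/3 has no Upgrade mechanism. (WebSockets can be bootstrapped over HTTP/3 with extended CONNECT, per RFC 9220, but both the client and the proxy have to support it.)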
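Instead of eyeballing the export, a few lines of JS can tally which protocol and which LiveView transport show up in a HAR file. This is only a rough sketch: it assumes the usual HAR layout (`log.entries[].request.url` and `response.httpVersion`), the exact version strings differ between browsers, and both `summarizeHar` and the sample entries (on example.com) are made up for illustration.

```js
// Tally HTTP versions and spot which LiveView transport appears in a HAR export.
function summarizeHar(har) {
  const protocols = {};
  const transports = new Set();
  for (const entry of har.log.entries) {
    const version = entry.response.httpVersion || "unknown";
    protocols[version] = (protocols[version] || 0) + 1;
    // Only the path matters here; query strings are ignored.
    const path = new URL(entry.request.url).pathname;
    if (path.endsWith("/longpoll")) transports.add("longpoll");
    if (path.endsWith("/websocket")) transports.add("websocket");
  }
  return { protocols, transports: [...transports] };
}

// Hypothetical sample data, shaped like the log above.
const sampleHar = {
  log: {
    entries: [
      { request: { url: "https://example.com/" }, response: { httpVersion: "h3" } },
      { request: { url: "https://example.com/live/longpoll?token=abc" }, response: { httpVersion: "h3" } },
    ],
  },
};

console.log(summarizeHar(sampleHar));
```

If `transports` only ever contains `longpoll`, the socket never got as far as a WebSocket request.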
But why is HTTP/3 enabled? I don't recall setting that. And the WebSocket connection isn't even attempted - we see longpoll immediately. Does the browser skip WebSocket entirely when it negotiates HTTP/3?
Coolify uses Traefik as the proxy, so it's probably configured there. I couldn't find this option anywhere in the Coolify UI (there were some Traefik-related labels in the General tab, but nothing that mentioned HTTP/3).
I inspected the coolify-proxy container directly:
$ docker inspect coolify-proxy --format '{{json .Args}}' | python3 -m json.tool
[
...
"--entrypoints.https.http3",
...
]
Yep, Traefik had HTTP/3 enabled globally on the https entrypoint, and the browser was happily using it. Time to test whether a WebSocket connection can be established at all.
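With more than a handful of arguments, the same check can be scripted. The sketch below assumes the argument list has already been parsed from the `docker inspect` JSON; `http3Entrypoints` is a hypothetical helper, and it treats the mere presence of an `http3` flag as "enabled". It matches case-insensitively because the proxy config mixes `entryPoints` and `entrypoints`.

```js
// Return the names of entrypoints that carry an http3 flag in a Traefik argument list.
function http3Entrypoints(args) {
  // Matches --entrypoints.<name>.http3 and --entrypoints.<name>.http3.<option>=...
  const pattern = /^--entrypoints\.([^.=]+)\.http3(?:$|[.=])/i;
  const names = new Set();
  for (const arg of args) {
    const match = pattern.exec(arg);
    if (match) names.add(match[1].toLowerCase());
  }
  return [...names];
}

const exampleArgs = [
  "--entrypoints.http.address=:80",
  "--entryPoints.https.http2.maxConcurrentStreams=250",
  "--entrypoints.https.http3",
];

console.log(http3Entrypoints(exampleArgs));
```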
Initial curling
Maybe it's a Phoenix-only issue? Let's try connecting to the WebSocket directly with curl:
$ curl -i -N \
-H "Connection: Upgrade" \
-H "Upgrade: websocket" \
-H "Sec-WebSocket-Version: 13" \
-H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
https://blog.aezakmi.top/live/websocket
HTTP/2 400
'connection' header must contain 'upgrade', got []
Two things here:
- Connection and Upgrade headers were being stripped
- Something was wrong before the request even reached Phoenix
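For context, this is roughly what a server checks before it agrees to upgrade. The sketch follows the opening handshake requirements from RFC 6455; it is not the code Phoenix actually runs, and `handshakeProblems` is a name invented here.

```js
// Minimal server-side sanity check of a WebSocket opening handshake (RFC 6455).
function handshakeProblems(headers) {
  // Header names are case-insensitive, so normalize them first.
  const h = {};
  for (const [name, value] of Object.entries(headers)) h[name.toLowerCase()] = String(value);

  // Connection is a comma-separated token list, e.g. "keep-alive, Upgrade".
  const tokens = (h["connection"] || "").split(",").map((t) => t.trim().toLowerCase());

  const problems = [];
  if (!tokens.includes("upgrade")) problems.push("connection header lacks 'upgrade'");
  if ((h["upgrade"] || "").toLowerCase() !== "websocket") problems.push("upgrade header is not 'websocket'");
  if (h["sec-websocket-version"] !== "13") problems.push("sec-websocket-version is not 13");
  if (!h["sec-websocket-key"]) problems.push("sec-websocket-key is missing");
  return problems;
}

// What the server effectively saw in the curl call above: no Connection, no Upgrade.
console.log(handshakeProblems({
  "Sec-WebSocket-Version": "13",
  "Sec-WebSocket-Key": "dGhlIHNhbXBsZSBub25jZQ==",
}));
```

With the two hop-by-hop headers gone, the first complaint is the same one the 400 response reported.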
How about bypassing Traefik and hitting the Phoenix container directly inside the Docker network?
$ docker inspect z8008og8o8wc0oc8c44wkwk8-164726038186 \
--format '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}'
10.0.1.14
$ curl -i -N --http1.1 \
-H "Connection: Upgrade" \
-H "Upgrade: websocket" \
-H "Sec-WebSocket-Version: 13" \
-H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
http://10.0.1.14:4000/live/websocket
HTTP/1.1 301 Moved Permanently
location: https://blog.aezakmi.top/live/websocket
We hit it directly over plain http:// and got a 301 to https://. Instead of 101 Switching Protocols we got a redirect. A second, separate problem.
Fixing X-Forwarded-Proto
The prod config:
config :blog_ex, BlogExWeb.Endpoint,
force_ssl: [
rewrite_on: [:x_forwarded_proto],
exclude: [
hosts: ["localhost", "127.0.0.1"]
]
]
Phoenix's force_ssl uses rewrite_on: [:x_forwarded_proto] to detect HTTPS. Without that header, it assumes plain HTTP and redirects. Since Traefik terminates TLS and proxies plain HTTP to Phoenix, it needs to send X-Forwarded-Proto: https. It wasn't.
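The decision boils down to something like the following. This is a simplified model written in JS for illustration, not the real implementation behind force_ssl; `forceSslDecision` and its option names are hypothetical, and it reuses the request host for the redirect target where Phoenix would use the endpoint's configured host.

```js
// Simplified model of the force_ssl redirect decision.
function forceSslDecision(request, options) {
  const headers = {};
  for (const [name, value] of Object.entries(request.headers)) headers[name.toLowerCase()] = value;

  // rewrite_on: [:x_forwarded_proto] lets the proxy vouch for the original scheme.
  let scheme = request.scheme;
  if (options.rewriteOn.includes("x_forwarded_proto") && headers["x-forwarded-proto"] === "https") {
    scheme = "https";
  }

  if (scheme === "https" || options.excludeHosts.includes(request.host)) return { action: "pass" };
  return { action: "redirect", status: 301, location: `https://${request.host}${request.path}` };
}

const options = { rewriteOn: ["x_forwarded_proto"], excludeHosts: ["localhost", "127.0.0.1"] };
const plain = { scheme: "http", host: "blog.aezakmi.top", path: "/live/websocket", headers: {} };

console.log(forceSslDecision(plain, options));
console.log(forceSslDecision({ ...plain, headers: { "X-Forwarded-Proto": "https" } }, options));
```

The first call models a request arriving without the header and gets the 301; the second models a proxy that sets it and passes through.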
We can fix that in Configuration -> General -> Container Labels in Coolify (uncheck Readonly Labels first) by adding these labels:
traefik.http.middlewares.proto-headers.headers.customrequestheaders.X-Forwarded-Proto=https
traefik.http.routers.https-1-z8008og8o8wc0oc8c44wkwk8.middlewares=gzip,proto-headers
Quick curl to confirm:
$ curl -i -N --http1.1 \
-H "Connection: Upgrade" \
-H "Upgrade: websocket" \
-H "Sec-WebSocket-Version: 13" \
-H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
https://blog.aezakmi.top/live/websocket
HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Sec-Websocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Upgrade: websocket
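The Sec-Websocket-Accept value is not arbitrary, so it doubles as proof that a real WebSocket server answered. Per RFC 6455 it is the base64-encoded SHA-1 of the client key concatenated with a fixed GUID. A small Node sketch (`expectedAccept` is just a name picked here):

```js
const crypto = require("node:crypto");

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Compute the Sec-WebSocket-Accept value a server must return for a given key.
function expectedAccept(key) {
  return crypto.createHash("sha1").update(key + WEBSOCKET_GUID).digest("base64");
}

console.log(expectedAccept("dGhlIHNhbXBsZSBub25jZQ=="));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

The key used in the curl calls is the sample nonce from the RFC, which is why the response matches the RFC's example value.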
WebSocket connection works now. But how do we make the browser use it too?
Timeouts, HTTP/3 and Alt-Svc
Even after the redirect fix, the browser keeps using longpoll. Presumably because it negotiated HTTP/3 and the WebSocket upgrade can't happen over HTTP/3.
Meanwhile, I found the global Traefik config (in Servers -> localhost -> Proxy):
name: coolify-proxy
networks:
aezakmi-net:
external: true
coolify:
external: true
services:
traefik:
container_name: coolify-proxy
image: 'traefik:v3.6'
restart: unless-stopped
extra_hosts:
- 'host.docker.internal:host-gateway'
networks:
- aezakmi-net
- coolify
ports:
- '80:80'
- '443:443'
- '443:443/udp'
- '8080:8080'
healthcheck:
test: 'wget -qO- http://localhost:80/ping || exit 1'
interval: 4s
timeout: 2s
retries: 5
volumes:
- '/var/run/docker.sock:/var/run/docker.sock:ro'
- '/data/coolify/proxy/:/traefik'
command:
- '--ping=true'
- '--ping.entrypoint=http'
- '--api.dashboard=true'
- '--entrypoints.http.address=:80'
- '--entrypoints.https.address=:443'
- '--entrypoints.http.http.encodequerysemicolons=true'
- '--entryPoints.http.http2.maxConcurrentStreams=250'
- '--entrypoints.https.http.encodequerysemicolons=true'
- '--entryPoints.https.http2.maxConcurrentStreams=250'
- '--entrypoints.https.http3'
- '--providers.file.directory=/traefik/dynamic/'
- '--providers.file.watch=true'
- '--certificatesresolvers.letsencrypt.acme.httpchallenge=true'
- '--certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=http'
- '--certificatesresolvers.letsencrypt.acme.storage=/traefik/acme.json'
- '--api.insecure=false'
- '--providers.docker=true'
- '--providers.docker.exposedbydefault=false'
labels:
- traefik.enable=true
- traefik.http.routers.traefik.entrypoints=http
- traefik.http.routers.traefik.service=api@internal
- traefik.http.services.traefik.loadbalancer.server.port=8080
- coolify.managed=true
- coolify.proxy=true
There's the --entrypoints.https.http3 flag from the container inspect. I could remove it globally, but that kills HTTP/3 for all projects. I don't think I need it for any of them right now, but it feels wrong to disable it globally.
While we're here, I also added timeouts so WebSocket connections don't get killed prematurely:
- '--entrypoints.https.transport.respondingTimeouts.readTimeout=0s'
- '--entrypoints.https.transport.respondingTimeouts.idleTimeout=3600s'
Browsers negotiate HTTP/3 via the Alt-Svc header. We can strip it for just the blog, without affecting other Coolify services:
traefik.http.middlewares.no-alt-svc.headers.customresponseheaders.Alt-Svc=
traefik.http.routers.https-1-z8008og8o8wc0oc8c44wkwk8.middlewares=gzip,proto-headers,no-alt-svc
This prevents the browser from discovering HTTP/3 for this service, so it stays on HTTP/2. When it needs a WebSocket, it can open a separate HTTP/1.1 connection for the upgrade.
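To check whether a response still advertises HTTP/3, the Alt-Svc value can be parsed with a few lines of JS. This is a loose sketch rather than a full parser for the header grammar (it splits on commas and ignores parameters such as `ma`); `parseAltSvc` and `advertisesHttp3` are hypothetical helpers, and the sample value is a typical one rather than something copied from this server.

```js
// Parse an Alt-Svc header value into { protocol, authority } entries.
function parseAltSvc(value) {
  const trimmed = (value || "").trim();
  // A missing header, an empty value, or "clear" means no alternatives are advertised.
  if (trimmed === "" || trimmed.toLowerCase() === "clear") return [];
  return trimmed.split(",").map((entry) => {
    const [service] = entry.trim().split(";"); // drop parameters like ma=...
    const eq = service.indexOf("=");
    return {
      protocol: service.slice(0, eq).trim(),
      authority: service.slice(eq + 1).trim().replace(/^"|"$/g, ""),
    };
  });
}

// True when any advertised alternative is HTTP/3 (or a draft version of it).
function advertisesHttp3(value) {
  return parseAltSvc(value).some((alt) => alt.protocol === "h3" || alt.protocol.startsWith("h3-"));
}

console.log(advertisesHttp3('h3=":443"; ma=2592000')); // true
console.log(advertisesHttp3(undefined)); // false
```

After the middleware is applied, the header value read from a response to the blog should make `advertisesHttp3` return false.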
Almost there?
So that should fix it, right? Nope. Deploy, check the network tab, loaded via HTTP/2... still longpoll. The Traefik config changes were clearly applied, so what gives?
The answer lies in the Phoenix socket code:
// From deps/phoenix/assets/js/phoenix/socket.js line 427:
if(this.getSession(`phx:fallback:${fallbackTransportName}`)){ return fallback("memorized") }
Phoenix stores the preferred transport in session storage. Once the WebSocket attempt failed (before our fixes), it saved a flag:
// from the browser console
sessionStorage
// Storage {phx:fallback:LongPoll: 'true', length: 1}
Right, of course. Clear that and reload:
sessionStorage.removeItem("phx:fallback:LongPoll")
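If the fallback transport has a different name, or there are several of these flags, a slightly more general version removes every key with the `phx:fallback:` prefix. A sketch that works on any Storage-like object (`clearPhoenixFallbacks` is a made-up helper, not part of Phoenix):

```js
// Remove every memorized Phoenix transport fallback from a Storage-like object.
function clearPhoenixFallbacks(storage) {
  // Collect keys first: removing while iterating would shift the indexes.
  const keys = [];
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    if (key !== null && key.startsWith("phx:fallback:")) keys.push(key);
  }
  keys.forEach((key) => storage.removeItem(key));
  return keys;
}
```

In the browser console that would be `clearPhoenixFallbacks(sessionStorage)`, followed by a reload.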
Success! Or..
This fixed it. But did we actually gain anything? Not really. This is a blog, it doesn't have any live features that need WebSockets. The expand/collapse that surfaced this issue is faster now on simulated 3G, but the proper fix for that is to load the expanded content upfront and use a JS hook, which would be instant. It was already instant without throttling.
Removing HTTP/3 might actually hurt mobile performance, since surviving network switches (say, Wi-Fi to cellular) without reconnecting is one of the main benefits of QUIC.
Better just to compress those PNGs.
Clear your session storage:
```js
sessionStorage.removeItem("phx:fallback:LongPoll")
```