# Overforcing LiveView WebSockets on Coolify

## TL;DR

Add these params to the Traefik command:

```yaml
- '--entrypoints.https.transport.respondingTimeouts.readTimeout=0s'
- '--entrypoints.https.transport.respondingTimeouts.idleTimeout=3600s'
```

Add these service labels to your Phoenix app:
```
traefik.http.middlewares.proto-headers.headers.customrequestheaders.X-Forwarded-Proto=https
traefik.http.middlewares.no-alt-svc.headers.customresponseheaders.Alt-Svc=
traefik.http.routers.https-1-z8008og8o8wc0oc8c44wkwk8.middlewares=gzip,proto-headers,no-alt-svc
```

## The Bait

I got shame-sniped by [some other blog audit](https://x.com/Gregorein/status/2038953944475472316?s=20) so I started compressing PNGs, self-hosting fonts, preloading CSS and LCP images, etc. To check if any of that worked, I opened Chrome DevTools and ran Lighthouse. Everything looked good.

Then I throttled the network to simulate 3G and tried interacting with some [other post's widgets](https://blog.aezakmi.top/rolling-your-own-ofac-search). The collapse/expand of the search result was unusually slow.

Quick look at the network tab revealed that the WebSocket connection was never established. Phoenix was using a longpoll fallback instead. Locally, WebSocket worked fine, but when deployed (on a Hostinger VPS running Coolify) it never connected.

## Coolify, Traefik and HTTP/3

I checked the HAR logs:

```
h3  GET 200 https://blog.aezakmi.top/
h3  GET 200 https://blog.aezakmi.top/assets/js/app-...js
h3  GET 200 https://blog.aezakmi.top/live/longpoll?...
h3  GET 200 https://blog.aezakmi.top/live/longpoll?...
h3  POST 200 https://blog.aezakmi.top/live/longpoll?...
```

The initial page was served over `h3` - HTTP/3 (QUIC). That seemed to explain it: the WebSocket handshake in [RFC 6455][rfc6455] is an HTTP/1.1 Upgrade request, and HTTP/3 [does not support the Upgrade mechanism][rfc9114-4.5].
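A listing like the one above can be produced from an exported HAR file with a few lines of Node. This is a minimal sketch, assuming the HAR 1.2 layout that Chrome exports (`log.entries[].request` and `.response.httpVersion`); `summarizeHar`, `countProtocols` and the sample data are hypothetical names and values, not anything from the real export.

```javascript
// Hypothetical helper: one "protocol method status url" line per HAR entry.
// Assumes the HAR 1.2 layout (log.entries[].request / .response).
function summarizeHar(har) {
  return har.log.entries.map((entry) => {
    const { method, url } = entry.request;
    const { status, httpVersion } = entry.response;
    return `${httpVersion}  ${method} ${status} ${url}`;
  });
}

// Hypothetical helper: count how many requests used each protocol.
function countProtocols(har) {
  const counts = {};
  for (const entry of har.log.entries) {
    const proto = entry.response.httpVersion;
    counts[proto] = (counts[proto] || 0) + 1;
  }
  return counts;
}

// Tiny inline sample standing in for a real export (values are made up).
const sampleHar = {
  log: {
    entries: [
      { request: { method: "GET", url: "https://example.test/" },
        response: { status: 200, httpVersion: "h3" } },
      { request: { method: "POST", url: "https://example.test/live/longpoll" },
        response: { status: 200, httpVersion: "h3" } },
    ],
  },
};

console.log(summarizeHar(sampleHar).join("\n"));
console.log(countProtocols(sampleHar));
```

If every entry comes back as `h3` and there is no `websocket` entry at all, you are looking at the same symptom.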

But why is HTTP/3 enabled? I don't recall setting that. And the WebSocket connection isn't even attempted - we see longpoll immediately. Does the browser skip WebSocket entirely when it negotiates HTTP/3?

Coolify uses Traefik as the proxy, so it's probably configured there. I couldn't find this option anywhere in the Coolify UI (there were some Traefik-related labels in the `General` tab, but nothing that mentioned HTTP/3).

I inspected the `coolify-proxy` container directly:

```bash
$ docker inspect coolify-proxy --format '{{json .Args}}' | python3 -m json.tool
[
    ...
    "--entrypoints.https.http3",
    ...
]
```

Yep, Traefik enables HTTP/3 on the `https` entrypoint for every service behind it, which looked like the reason the WebSocket never connected.
Time to test if we can establish the WebSocket connection at all.

## Initial curling

Maybe it's a Phoenix-only issue? Let's try connecting to the WebSocket directly with curl:

```bash
$ curl -i -N \
  -H "Connection: Upgrade" \
  -H "Upgrade: websocket" \
  -H "Sec-WebSocket-Version: 13" \
  -H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
  https://blog.aezakmi.top/live/websocket

HTTP/2 400
'connection' header must contain 'upgrade', got []
```

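Two things here:
1. `Connection` and `Upgrade` headers were being stripped. The response line says `HTTP/2`, and HTTP/2 has no `Connection`/`Upgrade` handshake, so by the time the request was validated the header list was empty (`got []`)
2. Something was wrong before the request even reached Phoenix, so the upgrade has to be attempted over HTTP/1.1 explicitly (hence `--http1.1` in the next attempts)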
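To make the first point concrete, here is a rough model of what happens to HTTP/1.1-style headers on an HTTP/2 hop. It is a sketch under assumptions, not curl's or Traefik's actual code: `toHttp2Headers` is a hypothetical helper, and the header list follows the connection-specific fields named in RFC 9113, section 8.2.2.

```javascript
// Header names that RFC 9113 section 8.2.2 treats as connection-specific.
const CONNECTION_SPECIFIC = new Set([
  "connection",
  "proxy-connection",
  "keep-alive",
  "transfer-encoding",
  "upgrade",
]);

// Hypothetical helper: drop connection-specific headers and lowercase the rest,
// roughly what an HTTP/2 client has to do before sending a request.
function toHttp2Headers(headers) {
  const out = {};
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    if (CONNECTION_SPECIFIC.has(lower)) continue;
    out[lower] = value;
  }
  return out;
}

// The headers from the curl command above.
const curlHeaders = {
  Connection: "Upgrade",
  Upgrade: "websocket",
  "Sec-WebSocket-Version": "13",
  "Sec-WebSocket-Key": "dGhlIHNhbXBsZSBub25jZQ==",
};

console.log(toHttp2Headers(curlHeaders));
// Only the two sec-websocket-* headers survive, so a server checking for
// "connection: upgrade" sees nothing.
```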

How about bypassing Traefik and hitting the Phoenix container directly inside the Docker network?

```bash
$ docker inspect z8008og8o8wc0oc8c44wkwk8-164726038186 \
  --format '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}'
10.0.1.14

$ curl -i -N --http1.1 \
  -H "Connection: Upgrade" \
  -H "Upgrade: websocket" \
  -H "Sec-WebSocket-Version: 13" \
  -H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
  http://10.0.1.14:4000/live/websocket

HTTP/1.1 301 Moved Permanently
location: https://blog.aezakmi.top/live/websocket
```

We hit Phoenix directly over plain `http://` and got a `301` redirect to `https://` instead of `101 Switching Protocols`. A second, separate problem.

## Fixing X-Forwarded-Proto

The prod config:
```elixir
config :blog_ex, BlogExWeb.Endpoint,
  force_ssl: [
    rewrite_on: [:x_forwarded_proto],
    exclude: [
      hosts: ["localhost", "127.0.0.1"]
    ]
  ]
```

Phoenix's `force_ssl` uses `rewrite_on: [:x_forwarded_proto]` to detect HTTPS. Without that header, it assumes plain HTTP and redirects. Since Traefik terminates TLS and proxies plain HTTP to Phoenix, it needs to send `X-Forwarded-Proto: https`. It wasn't.
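The decision can be modelled in a few lines. This is a simplified sketch of the behaviour described above, not Plug.SSL's actual code: `forceSslDecision` and its option names are hypothetical, and the model assumes the redirect target host comes from configuration.

```javascript
// Simplified model of the force_ssl decision (NOT Plug.SSL's real implementation).
// req:  { scheme, host, path, headers } with lowercase header names
// opts: { rewriteOn, excludeHosts, host } (hypothetical option names)
function forceSslDecision(req, opts) {
  let scheme = req.scheme;

  // rewrite_on: trust the proxy's header to tell us the original scheme.
  const forwarded = req.headers["x-forwarded-proto"];
  if (opts.rewriteOn.includes("x_forwarded_proto") && forwarded) {
    scheme = forwarded;
  }

  // exclude: never redirect these hosts.
  if (opts.excludeHosts.includes(req.host)) return { action: "pass" };
  if (scheme === "https") return { action: "pass" };

  return {
    action: "redirect",
    status: 301,
    location: `https://${opts.host}${req.path}`,
  };
}

const sslOpts = {
  rewriteOn: ["x_forwarded_proto"],
  excludeHosts: ["localhost", "127.0.0.1"],
  host: "blog.aezakmi.top",
};

// Plain HTTP request with no forwarded header: redirected, like the curl above.
console.log(forceSslDecision(
  { scheme: "http", host: "10.0.1.14", path: "/live/websocket", headers: {} },
  sslOpts
));

// Same request with the header the proxy should be sending: passes through.
console.log(forceSslDecision(
  { scheme: "http", host: "10.0.1.14", path: "/live/websocket",
    headers: { "x-forwarded-proto": "https" } },
  sslOpts
));
```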

We can fix that in `Configuration -> General -> Container Labels` in Coolify (uncheck `Readonly Labels` first) by adding these labels:

```
traefik.http.middlewares.proto-headers.headers.customrequestheaders.X-Forwarded-Proto=https
traefik.http.routers.https-1-z8008og8o8wc0oc8c44wkwk8.middlewares=gzip,proto-headers
```

Quick curl to confirm:

```bash
$ curl -i -N --http1.1 \
  -H "Connection: Upgrade" \
  -H "Upgrade: websocket" \
  -H "Sec-WebSocket-Version: 13" \
  -H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
  https://blog.aezakmi.top/live/websocket

HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Sec-Websocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Upgrade: websocket
```
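The `Sec-Websocket-Accept` value in that response is not arbitrary, so it doubles as a sanity check that a real WebSocket server answered. A minimal Node sketch, following the handshake rule in RFC 6455 (SHA-1 of the key plus a fixed GUID, base64-encoded); `webSocketAccept` is a hypothetical helper name:

```javascript
const { createHash } = require("crypto");

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Hypothetical helper: compute the accept value a server must return for a key.
function webSocketAccept(key) {
  return createHash("sha1").update(key + WS_GUID).digest("base64");
}

// The key used in the curl commands is the sample key from the RFC itself.
console.log(webSocketAccept("dGhlIHNhbXBsZSBub25jZQ=="));
// prints s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

It matches the header in the curl output, so the `101` really came from the WebSocket handshake and not from something in between.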

WebSocket connection works now. But how do we make the browser use it too?

## Timeouts, HTTP/3 and Alt-Svc

Even after the redirect fix, the browser keeps using longpoll. Presumably because it negotiated HTTP/3 and the WebSocket upgrade can't happen over HTTP/3.

Meanwhile, I found the global Traefik config (in `Servers -> localhost -> Proxy`):
```yaml
name: coolify-proxy
networks:
  aezakmi-net:
    external: true
  coolify:
    external: true
services:
  traefik:
    container_name: coolify-proxy
    image: 'traefik:v3.6'
    restart: unless-stopped
    extra_hosts:
      - 'host.docker.internal:host-gateway'
    networks:
      - aezakmi-net
      - coolify
    ports:
      - '80:80'
      - '443:443'
      - '443:443/udp'
      - '8080:8080'
    healthcheck:
      test: 'wget -qO- http://localhost:80/ping || exit 1'
      interval: 4s
      timeout: 2s
      retries: 5
    volumes:
      - '/var/run/docker.sock:/var/run/docker.sock:ro'
      - '/data/coolify/proxy/:/traefik'
    command:
      - '--ping=true'
      - '--ping.entrypoint=http'
      - '--api.dashboard=true'
      - '--entrypoints.http.address=:80'
      - '--entrypoints.https.address=:443'
      - '--entrypoints.http.http.encodequerysemicolons=true'
      - '--entryPoints.http.http2.maxConcurrentStreams=250'
      - '--entrypoints.https.http.encodequerysemicolons=true'
      - '--entryPoints.https.http2.maxConcurrentStreams=250'
      - '--entrypoints.https.http3'
      - '--providers.file.directory=/traefik/dynamic/'
      - '--providers.file.watch=true'
      - '--certificatesresolvers.letsencrypt.acme.httpchallenge=true'
      - '--certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=http'
      - '--certificatesresolvers.letsencrypt.acme.storage=/traefik/acme.json'
      - '--api.insecure=false'
      - '--providers.docker=true'
      - '--providers.docker.exposedbydefault=false'
    labels:
      - traefik.enable=true
      - traefik.http.routers.traefik.entrypoints=http
      - traefik.http.routers.traefik.service=api@internal
      - traefik.http.services.traefik.loadbalancer.server.port=8080
      - coolify.managed=true
      - coolify.proxy=true
```

There's the `--entrypoints.https.http3` flag from the container inspect. I could remove it globally, but that kills HTTP/3 for all projects. I don't think I need it for any of them right now, but it feels wrong to disable it globally.

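While we're here, I also added timeouts so long-lived WebSocket connections don't get killed prematurely. `readTimeout=0s` disables the read timeout, and `idleTimeout=3600s` lets an idle keep-alive connection stay open for up to an hour: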

```yaml
- '--entrypoints.https.transport.respondingTimeouts.readTimeout=0s'
- '--entrypoints.https.transport.respondingTimeouts.idleTimeout=3600s'
```

Browsers negotiate HTTP/3 via the `Alt-Svc` header. We can strip it for just the blog, without affecting other Coolify services:

```
traefik.http.middlewares.no-alt-svc.headers.customresponseheaders.Alt-Svc=
traefik.http.routers.https-1-z8008og8o8wc0oc8c44wkwk8.middlewares=gzip,proto-headers,no-alt-svc
```

This prevents the browser from [discovering HTTP/3 for this service][alt-svc], so it stays on HTTP/2. When it needs a WebSocket, it can open a separate HTTP/1.1 connection for the upgrade.
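To see what the browser reads out of that header, here is a small parser sketch. It assumes a typical value such as `h3=":443"; ma=2592000` (the sample value is an assumption, check your own response headers), and `parseAltSvc` / `advertisesHttp3` are hypothetical helpers that ignore edge cases like commas inside quoted strings.

```javascript
// Hypothetical helper: parse an Alt-Svc header value (RFC 7838) into entries.
// Simplified: splits on "," and ";" without handling quoted separators.
function parseAltSvc(value) {
  const trimmed = value.trim();
  if (trimmed === "" || trimmed === "clear") return [];
  return trimmed.split(",").map((part) => {
    const [service, ...params] = part.trim().split(";").map((s) => s.trim());
    const eq = service.indexOf("=");
    const entry = {
      protocol: service.slice(0, eq),
      authority: service.slice(eq + 1).replace(/^"|"$/g, ""),
    };
    for (const param of params) {
      const [name, val] = param.split("=");
      if (name === "ma") entry.maxAge = Number(val); // cache lifetime in seconds
    }
    return entry;
  });
}

// Hypothetical helper: does this response invite the browser onto HTTP/3?
function advertisesHttp3(value) {
  return parseAltSvc(value || "").some((e) => e.protocol.startsWith("h3"));
}

console.log(parseAltSvc('h3=":443"; ma=2592000'));
console.log(advertisesHttp3('h3=":443"; ma=2592000')); // true
console.log(advertisesHttp3(""));                       // false, header stripped
```

One caveat worth keeping in mind: browsers cache the advertisement for `ma` seconds, so a browser that already saw the header may keep using HTTP/3 for a while after the label is deployed.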

## Almost there?

So that should fix it, right? Nope. Deploy, check the network tab, loaded via HTTP/2... still longpoll.
The Traefik config changes were clearly applied, so what gives?

The answer lies in the Phoenix socket code:

```js
// From deps/phoenix/assets/js/phoenix/socket.js line 427:
if(this.getSession(`phx:fallback:${fallbackTransportName}`)){ return fallback("memorized") }
```

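Phoenix memorizes the fallback transport in `sessionStorage`. Once the WebSocket attempt failed (before our fixes) and longpoll connected instead, it saved a flag, and every later page load in that tab skipped the WebSocket attempt entirely: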

```js
// from the browser console
sessionStorage
// Storage {phx:fallback:LongPoll: 'true', length: 1}
```

Right, of course. Clear that and reload:

```js
sessionStorage.removeItem("phx:fallback:LongPoll")
```
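The effect of that flag is easy to reproduce outside the browser. This is a simplified model of the memo logic, not Phoenix's actual code: `FakeStorage` stands in for `sessionStorage`, and `webSocketWorks` is a hypothetical flag replacing a real connection attempt.

```javascript
// Minimal in-memory stand-in for the browser's sessionStorage.
class FakeStorage {
  constructor() { this.items = new Map(); }
  getItem(key) { return this.items.has(key) ? this.items.get(key) : null; }
  setItem(key, value) { this.items.set(key, String(value)); }
  removeItem(key) { this.items.delete(key); }
}

const MEMO_KEY = "phx:fallback:LongPoll";

// Simplified model of the transport choice (NOT Phoenix's real implementation).
function pickTransport(storage, webSocketWorks) {
  // A memorized fallback short-circuits everything else.
  if (storage.getItem(MEMO_KEY)) return "LongPoll (memorized)";
  if (webSocketWorks) return "WebSocket";
  // WebSocket failed: fall back and remember it for this tab.
  storage.setItem(MEMO_KEY, "true");
  return "LongPoll (fallback)";
}

const storage = new FakeStorage();
console.log(pickTransport(storage, false)); // LongPoll (fallback): before the proxy fixes
console.log(pickTransport(storage, true));  // LongPoll (memorized): fixed proxy, same tab
storage.removeItem(MEMO_KEY);
console.log(pickTransport(storage, true));  // WebSocket: after clearing the flag
```

Since this lives in session storage, a fresh tab would have worked too. I was just testing in the one tab that had already given up.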

## Success! Or...

This fixed it. But did we actually gain anything? Not really. This is a blog, it doesn't have any live features that need WebSockets. The expand/collapse that surfaced this issue is faster now on simulated 3G, but the proper fix for that is to load the expanded content upfront and use a JS hook, which would be instant. It was already instant without throttling.

Hiding HTTP/3 from the browser might actually hurt mobile performance, since surviving network switches (Wi-Fi to cellular and back) without reconnecting is one of the main benefits of QUIC.

Better just to compress those PNGs.


[rfc6455]: https://datatracker.ietf.org/doc/html/rfc6455
[rfc7838]: https://datatracker.ietf.org/doc/html/rfc7838
[rfc8441]: https://datatracker.ietf.org/doc/html/rfc8441
[rfc9114-4.5]: https://datatracker.ietf.org/doc/html/rfc9114#section-4.5
[alt-svc]: https://http.dev/3
[cloudflare-what-is-http3]: https://www.cloudflare.com/learning/performance/what-is-http3