## Summary

Three related changes to `turnutils_uclient` that together stop the loadgen from being the bottleneck when benchmarking the relay:

1. **Sender thread pool** (`--sender-threads <N>`, max 4, auto-bumped to 2 at `-m >= 4`). Mirrors the listener pool that landed in #1911. Each sender thread owns its own libevent base, a session shard (round-robin assigned at allocation time via `elem->sender_id`), and a 100 µs timer that runs the burst loop just as the legacy main-thread `timer_handler` did. Send-side counters (`tot_send_messages`, `tot_send_bytes`, `tot_send_dropped`, `load_sent_packets`) and the completion accumulators in `client_timer_handler` (`total_loss` / `total_latency` / `total_jitter`) are written into per-thread cache-line-aligned slabs and reduced into the globals after `pthread_join`. This avoids the cross-core atomic-counter contention that the listener-pool work already documented.
2. **UDP-GSO send batching** in `send_buffer` for the plain-UDP path. The sender pool opens a thread-local batch window around its per-tick iteration; within the window, `send_buffer` copies the payload into a per-thread slot and appends it to a scatter-gather `iov[]`. On flush:
   - If `count > 1` and all segments share the same size → one `sendmsg(2)` with a `UDP_SEGMENT` cmsg.
   - If GSO is unavailable (kernel returns `EINVAL`/`ENOPROTOOPT`/`EOPNOTSUPP`) → sticky-disable per thread and fall back to `sendmmsg(2)` over the same iov array.
   - Per-entry `send(2)` as the final fallback for whatever `sendmmsg` refused (the `EAGAIN` tail, etc.).

   Auto-flush triggers: a different fd (next session in the iteration), a different segment size, batch capacity (64), or end of iteration.
3. **`recv_pps` in `print_load_generator_rate`**, alongside the existing `send_pps`. Once the sender pool and GSO let uclient push well over 1 Mpps of UDP, the meaningful end-to-end metric is the round-trip count, not the send-side count: the relay/peer pipeline drops 95+% of packets when uclient outpaces it.
The progress line now reads:

```
send_pps=6012928.00, recv_pps=101486.00, total_sent=112975924, total_recv=1853369
```

## Why

When benchmarking `--multiplex-client` / `--multiplex-peer` on a c-4 DigitalOcean droplet, the loadgen's single-threaded `timer_handler` saturated one CPU at around 300 kpps regardless of `-m`. The relay was never put under real pressure, so the value of the multiplex paths couldn't be measured. With this patch the loadgen can produce over 6 Mpps from a single c-4 droplet, far above the relay's per-thread saturation point, so the bottleneck moves to the server where it belongs.

## Benchmark

Setup: multiplex-client turnserver, c-4 loadgen, m=4, 20 s.

| Round | OLD (master) | NEW (this PR) | Lift |
|-------|--------------|---------------|------|
| 1 | 246k send_pps | 7.48M | 30.4× |
| 2 | 459k | 6.06M | 13.2× |
| 3 | 360k | 5.07M | 14.1× |
| **avg** | **355k** | **6.20M** | **17.5×** |

The throughput cap shifts from the loadgen to the relay. End-to-end recv_pps (now first-class in the progress line) is ~100 kpps in this configuration, limited by the relay, not uclient.

## Design notes

- **Cache-line alignment** on `uclient_sender` mirrors the listener pool's slab pattern. Same false-sharing trap, same fix.
- **Main-thread timer slows to 10 ms** when the sender pool is engaged. The main timer still fires for lifecycle work and the `__turn_getMSTime` refresh, but `timer_handler` early-returns when `num_sender_threads > 0` so we don't burn a core on no-op 100 µs ticks.
- **Stop ordering**: `stop_sender_threads()` runs before `stop_listener_threads()`. The senders own session mutation (wmsgnum, to_send_timems, shutdown), so joining them first prevents a race where a listener accumulates a stat into a session whose owning sender is still iterating it.
- **UDP-GSO copy**: the per-slot memcpy is intentional. The caller (`client_write`) reuses `elem->out_buffer` across burst iterations, so pointing `iov[i]` at the session buffer would alias all entries to the most recent payload.
  A rotating per-session output ring would eliminate the copy; it is left out of this PR because the kernel-side savings from collapsing N `sendmsg` calls into one GSO `sendmsg` dominate the per-packet copy cost at the rates we measured.
- **Linux-only**: the send-side batching machinery is gated by `#if defined(__linux__)`. Non-Linux builds get no-op `uclient_send_batch_begin`/`_end`, and `uclient_tx_enqueue` returns false, falling through to the legacy `send(2)` loop.

## Test plan

- [x] macOS local build (Apple Silicon, AppleClang). Sender-pool code paths compile under both the Linux and non-Linux gates.
- [x] `clang-format-15 --dry-run --Werror` clean.
- [x] Linux build on a c-4 Ubuntu 24.04 droplet (`cmake -DCMAKE_BUILD_TYPE=Release`).
- [x] `--help` includes the new `--sender-threads` option with a valid-range hint; out-of-range values are rejected.
- [x] Benchmark on two c-4 droplets in nyc1 against `turnserver --multiplex-client`: 3 alternating rounds OLD vs NEW, 17.5× average send-side lift (data table above).
- [x] `print_load_generator_rate` output verified: `send_pps`, `recv_pps`, `total_sent`, `total_recv` are all populated and consistent across listener slab reductions.

## Limitations

- `--multiplex-peer` is not driven by this PR. uclient's pattern (each `-m N` opens two internal sessions per client that share the same peer port) hits the multiplex-peer "one allocation per peer endpoint" rule; benchmarking that flag at high concurrency requires a separate small change (a per-session secondary peer port) that is not in scope here.
- The wider per-round variance under the sender pool (rounds in our bench ranged from 13× to 30× lift) is timing/scheduler noise at small per-thread shards. It smooths out as `-m` and per-thread session counts grow.
# Coturn TURN server
coturn is a free, open-source implementation of a TURN and STUN server. A TURN server is a VoIP media traffic NAT traversal server and gateway.
## Installing / Getting started
Linux distros may ship a version of coturn, which you can install and run with:

```sh
apt install coturn
turnserver --log-file stdout
```
Or run coturn in a Docker container:

```sh
docker run -d -p 3478:3478 -p 3478:3478/udp -p 5349:5349 -p 5349:5349/udp -p 49152-65535:49152-65535/udp coturn/coturn
```

See the Docker Readme for more details about running the container.
## Developing
### Dependencies

coturn requires the following dependency to be installed first:

- libevent2

Optional:

- openssl (to support TLS and DTLS, authorized STUN and TURN)
- libmicrohttpd and prometheus-client-c (Prometheus interface)
- MariaDB/MySQL (user database)
- Hiredis (user database, monitoring)
- SQLite (user database)
- PostgreSQL (user database)
### Building

```sh
git clone git@github.com:coturn/coturn.git
cd coturn
./configure
make
```
## Features
STUN specs:
- RFC 3489 - "classic" STUN
- RFC 5389 - base "new" STUN specs
- RFC 5769 - test vectors for STUN protocol testing
- RFC 5780 - NAT behavior discovery support
- RFC 7443 - ALPN support for STUN & TURN
- RFC 7635 - oAuth third-party TURN/STUN authorization
TURN specs:
- RFC 5766 - base TURN specs
- RFC 6062 - TCP relaying TURN extension
- RFC 6156 - IPv6 extension for TURN
- RFC 7443 - ALPN support for STUN & TURN
- RFC 7635 - oAuth third-party TURN/STUN authorization
- RFC 8016 - Mobility with Traversal Using Relays around NAT (TURN)
- DTLS support (http://tools.ietf.org/html/draft-petithuguenin-tram-turn-dtls-00)
- TURN REST API (http://tools.ietf.org/html/draft-uberti-behave-turn-rest-00)
- Origin field in TURN (Multi-tenant TURN Server) (https://tools.ietf.org/html/draft-ietf-tram-stun-origin-06)
- TURN Bandwidth draft specs (http://tools.ietf.org/html/draft-thomson-tram-turn-bandwidth-01)
- TURN-bis (with dual allocation) draft specs (http://tools.ietf.org/html/draft-ietf-tram-turnbis-04)
ICE and related specs:
- RFC 5245 - ICE
- RFC 5768 - ICE-SIP
- RFC 6336 - ICE-IANA Registry
- RFC 6544 - ICE-TCP
- RFC 5928 - TURN Resolution Mechanism
The implementation fully supports the following client-to-TURN-server protocols:
- UDP (per RFC 5766)
- TCP (per RFC 5766 and RFC 6062)
- TLS (per RFC 5766 and RFC 6062): including TLS1.3; ECDHE is supported.
- DTLS1.0 and DTLS1.2 (http://tools.ietf.org/html/draft-petithuguenin-tram-turn-dtls-00)
- SCTP (experimental implementation).
Relay protocols:
- UDP (per RFC 5766)
- TCP (per RFC 6062)
User databases (for user repository, with passwords or keys, if authentication is required):
- SQLite
- MariaDB/MySQL
- PostgreSQL
- Redis
- MongoDB
Management interfaces:
- telnet CLI
- HTTPS interface
Monitoring:
- Redis can be used for status and statistics storage and notification
- Prometheus interface (not available in the apt package)
Message integrity digest algorithms:
- HMAC-SHA1, with MD5-hashed keys (as required by STUN and TURN standards)
TURN authentication mechanisms:
- 'classic' long-term credentials mechanism;
- TURN REST API (a modification of the long-term mechanism, for time-limited secret-based authentication, for WebRTC applications: http://tools.ietf.org/html/draft-uberti-behave-turn-rest-00);
- experimental third-party oAuth-based client authorization option;
Performance and Load Balancing:
When used as part of an ICE solution for VoIP connectivity, this TURN server can handle thousands of simultaneous calls per CPU when the TURN protocol is used, or tens of thousands of calls when only the STUN protocol is used. For virtually unlimited scalability, a load-balancing scheme can be used. Load balancing can be implemented with the following tools (either one of them or a combination):
- DNS SRV based load balancing;
- built-in 300 ALTERNATE-SERVER mechanism (requires 300 response support by the TURN client);
- network load-balancer server.
Traffic bandwidth limitation and congestion-avoidance algorithms are implemented.
Target platforms:
- Linux (Debian, Ubuntu, Mint, CentOS, Fedora, Redhat, Amazon Linux, Arch Linux, OpenSUSE)
- BSD (FreeBSD, NetBSD, OpenBSD, DragonFlyBSD)
- Solaris 11
- Mac OS X
- Cygwin (for non-production R&D purposes)
- Windows (native with, e.g., MSVC toolchain)
This project can be successfully used on other *NIX platforms, too, but that is not officially supported.
The implementation is designed to be simple and easy to install and configure. The project focuses on performance, scalability and simplicity. The aim is to provide an enterprise-grade TURN solution.
To achieve high performance and scalability, the TURN server is implemented with the following features:
- High-performance industrial-strength Network IO engine libevent2 is used
- Configurable multi-threading model implemented to allow full usage of available CPU resources (if OS allows multi-threading)
- Multiple listening and relay addresses can be configured
- Efficient memory model used
- The TURN project code can be used in a custom proprietary networking environment. The TURN server code uses an abstract networking API; only a couple of files in the project have to be rewritten to plug the TURN server into a proprietary environment. This project provides an implementation only for the standard UNIX networking/IO API, but users can implement any other environment. The TURN server code was originally developed for a high-performance proprietary corporate environment and later adapted to the UNIX networking API
- The TURN server works as a user space process, without imposing any special requirements on the system
## Links
- Project homepage: https://coturn.github.io/
- Repository: https://github.com/coturn/coturn/
- Issue tracker: https://github.com/coturn/coturn/issues
- Google group: https://groups.google.com/forum/#!forum/turn-server-project-rfc5766-turn-server