Experience-Driven Congestion Control: When Multi-Path TCP Meets Deep Reinforcement Learning
IEEE Journal on Selected Areas in Communications (JSAC)
¶ Overview
- Scenario: Multi-Path TCP
- One RL agent $\longleftrightarrow$ MPTCP flows on an end host
- Implemented in the Linux kernel
- Policy gradient, Actor-Critic
- LSTM
- Setting of testing scenario
¶ Model
Flows $\to$ LSTM $\to$ RL
¶ LSTM
state of all flows $s_t = [s_t^1, …, s_t^N] \to$ one output
¶ State
- Indices: flow $i$ (total: $N$), subflow $k$ (total: $K_i$ for flow $i$), epoch $t$
- Agent $\to$ flows (TCP & MPTCP) $\to$ subflows (exactly 1 for a regular TCP flow)
- $s_t^{i, k} = [b, g, d, v, w]$
  - $b$: sending rate of the subflow
  - $g$: goodput
  - $d$: average RTT
  - $v$: mean deviation of RTTs
  - $w$: congestion window (cwnd)
- $s_t = [s_t^1, …, s_t^N]$
  - $s_t^i = [s_t^{i, 1}, …, s_t^{i, K_i}]$
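The nested state layout above can be sketched in a few lines of Python. This is only an illustration of the data structure, not the paper's implementation: the class and function names (`SubflowState`, `flow_state`, `global_state`) and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubflowState:
    """Per-subflow measurements for one epoch: s_t^{i,k} = [b, g, d, v, w]."""
    b: float  # sending rate
    g: float  # goodput
    d: float  # average RTT
    v: float  # mean deviation of RTTs
    w: float  # congestion window (cwnd)

    def as_list(self) -> List[float]:
        return [self.b, self.g, self.d, self.v, self.w]


def flow_state(subflows: List[SubflowState]) -> List[List[float]]:
    """s_t^i: one 5-element row per subflow (a regular TCP flow has one row)."""
    return [sf.as_list() for sf in subflows]


def global_state(flows: List[List[SubflowState]]) -> List[List[List[float]]]:
    """s_t = [s_t^1, ..., s_t^N]: one entry per flow on the end host."""
    return [flow_state(subflows) for subflows in flows]


# Hypothetical measurements: one MPTCP flow with 2 subflows, one regular TCP flow.
mptcp = [SubflowState(8.0, 7.5, 0.2, 0.01, 40.0),
         SubflowState(4.0, 3.8, 0.1, 0.02, 20.0)]
tcp = [SubflowState(2.0, 1.9, 0.05, 0.005, 10.0)]
s_t = global_state([mptcp, tcp])
```

Because $K_i$ differs per flow, $s_t$ is a ragged structure with a variable number of flows, which is presumably why the notes have an LSTM consume the flows one by one and emit a single output.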
¶ Action
- $a_t = [x_t^1, …, x_t^K]$
  - $x_t^k$: change to the current cwnd of the $k$-th subflow
- DRL-CC only takes an action on one (target) MPTCP flow
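As a rough illustration of what "changes to the subflows' cwnd" means, the sketch below adds each action component to the corresponding subflow window of the target flow. The function name `apply_action` and the lower bound `MIN_CWND` are assumptions made for this example; the notes do not say how (or whether) the resulting window is clamped.

```python
from typing import List

MIN_CWND = 1.0  # assumed lower bound on cwnd; not specified in the notes


def apply_action(cwnds: List[float], action: List[float],
                 min_cwnd: float = MIN_CWND) -> List[float]:
    """Return the new per-subflow cwnds of the target flow.

    cwnds[k]  : current cwnd of subflow k
    action[k] : x_t^k, the change to apply to subflow k
    """
    if len(cwnds) != len(action):
        raise ValueError("action must have one entry per subflow")
    # Add each change and keep the window at or above the assumed minimum.
    return [max(min_cwnd, w + x) for w, x in zip(cwnds, action)]


# Hypothetical target flow with 2 subflows.
new_cwnds = apply_action([10.0, 20.0], [2.0, -25.0])
```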
¶ Reward
- $r_t = \sum_{i=1}^{N} U(i, t)$
- $U$ depends on upper-layer apps
- in paper: $U(i, t) = \log g_t^i$ ($g_t^i$: average goodput of flow $i$ during epoch $t-1$)
- maximizing this utility function leads to proportional fairness (Why?)
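A brief sketch of why a sum of log utilities yields proportional fairness, assuming the set $\mathcal{G}$ of feasible goodput vectors is convex (an assumption not stated in these notes; $\hat{g}$ denotes the maximizer and the epoch index is dropped):

$$
\hat{g} = \arg\max_{g \in \mathcal{G}} \sum_{i=1}^{N} \log g^i
\;\Longrightarrow\;
\sum_{i=1}^{N} \left.\frac{\partial \log g^i}{\partial g^i}\right|_{\hat{g}} \left(g^i - \hat{g}^i\right)
= \sum_{i=1}^{N} \frac{g^i - \hat{g}^i}{\hat{g}^i} \le 0
\quad \forall g \in \mathcal{G}
$$

The implication is the first-order optimality condition for maximizing a concave function over a convex set. The final inequality says that moving from $\hat{g}$ to any other feasible allocation makes the sum of proportional changes non-positive, which is the usual (Kelly) definition of a proportionally fair allocation.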
¶ Training
¶ Pre-training
- Environment
  - iPerf3 continuously generates packets
  - 2 laptops $\longleftarrow$ Gigabit switch $\longrightarrow$ 2 servers
  - each MPTCP flow = 2 subflows: 8 Mbps bandwidth, 200 ms delay, 0.5% packet loss
- 50000 epochs
- 2.5 hours
¶ Online Test
- Benchmark
  - Jain’s fairness index: $\bar{x}^2 / \overline{x^2}$ (squared mean over mean of squares)
  - goodput
- General Environment
  - client $\longleftarrow$ 5 MPTCP flows $\longrightarrow$ server
  - transferring documents over HTTP / iPerf3 traffic
  - 0.5 ms to convergence
- Parameters
  - delay: $50 \to 400$ ms
  - packet loss rate: $0.5\% \to 4\%$
  - bottleneck bandwidth: $2 \to 16$ Mbps
  - document: 2 $\to$ 8 MB
- Scenarios
  - 4 (HTTP) + 3 (iPerf3) + 1 (wireless)
  - 5th: dynamic establishment and termination of MPTCP flows
    - establishment: Poisson process; each flow lasts 30 s
  - 6th: 5 MPTCP flows at the beginning; close 1 subflow every 60 s
  - 7th: MPTCP and TCP coexist $\to$ TCP-friendliness

  - 8th: wireless environment
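The Jain's fairness index listed under Benchmark is easy to compute from per-flow goodputs. A minimal sketch (the function name `jain_index` and the sample values are made up for illustration):

```python
from typing import Sequence


def jain_index(xs: Sequence[float]) -> float:
    """Jain's fairness index: (mean of x)^2 / (mean of x^2).

    Equals 1.0 when all flows get the same goodput and 1/n when a
    single flow gets everything.
    """
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one flow")
    sum_sq = sum(x * x for x in xs)
    if sum_sq == 0:
        raise ValueError("index is undefined when all values are zero")
    mean = sum(xs) / n
    mean_sq = sum_sq / n
    return mean * mean / mean_sq


# Hypothetical per-flow goodputs in Mbps.
fair = jain_index([4.0, 4.0, 4.0, 4.0])
unfair = jain_index([8.0, 0.0, 0.0, 0.0])
```

With equal shares the index is 1.0; with one flow taking all the bandwidth among four flows it drops to 0.25.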