Sunday, April 10, 2016

TCP reloaded (part 1)

TCP mathematical model

Abstract

The Transmission Control Protocol (TCP) is one of the main protocols of the Internet protocol suite. TCP provides reliable, ordered, and error-checked delivery of a stream of octets between applications running on hosts communicating by an IP network. Internet applications such as the World Wide Web, email, remote administration, and file transfer rely on TCP. For a detailed description, please see ref. [4].

In this post series, I am going to analyse the evolution with time of a TCP connection, in terms of sender window size, round-trip-time, transmission rate and further. The TCP connection traffic crosses a link of known capacity and it is rate-limited to the link capacity. The excess rate fraction is queued into a FIFO buffer of known size. Some packet discarding policy is applied based on RED (Random Early Discard) Active Queue Management (AQM) criteria. As a result TCP sender window size will be updated by sender and the resulting traffic rate will change.

The aim of such analysis is to describe the dynamic behavior of a TCP connection and, based on exploratory analysis of simulated variables, reveals more of the TCP protocol nature.

All that is possible thanks to the TCP mathematical model as discussed in literature (ref. [1] and [2]) that I am going to detail.

TCP model

Be:

  • W(t): TCP sender window size at instant t

  • R(t): round trip time at instant t

  • q(t): FIFO buffer queue length

  • x(t): exponential moving average of q(t)

  • p(.): traffic discarding probability

  • S(t): tcp transmission rate

and:

  • C: link capacity (pkt/s)

  • D: fixed transmission delay (s)

  • \(\alpha\): queue exponential moving average (EMA) constant

  • \(\beta\): TCP transmission window size reduction factor

  • \(T_{min}\): EWMA queue minimum threshold for applying RED traffic drop policy

  • \(T_{max}\): EWMA queue maximum threshold for applying RED traffic drop policy

  • \(Q_{max}\): FIFO buffer maximum queue size

  • \(SST\): TCP slow start threshold

Based on results already established by the TCP behavior analysis literature, TCP equations are herein below outlined.

\[ \begin{equation} \begin{cases} t &=\ t_{s} + R(t_{s})\ =\ t_{s} + \tau \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (eq.\ 1) \\ S(t) &= \dfrac{W(t)}{R(t)} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (eq. \ 2) \\ \\ \dfrac{dW(a_{ck})}{da_{ck}} &= \left\{ \begin{array}{@{}ll@{}} \ 1, & \text{if}\ W < SST \\ \dfrac{1}{W}, & \text{if}\ W >= SST \end{array} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (eq.\ 3) \right. \\ \dfrac{da_{ck}(t)}{dt} &= min \{S(t),\ C\} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (eq. \ 4) \\ \\ \dfrac{dW(t)}{dt} &= \dfrac{dW(a_{ck})}{da_{ck}}\dfrac{da_{ck}(t-\tau)}{dt}{(1-p(x(t-\tau)))} - \beta \ W(t)\dfrac{da_{ck}(t-\tau)}{dt}p(x(t-\tau)) \ \ \ \ \ \ \ \ \ (eq.\ 5) \\ \\ \dfrac{dq(t)}{dt} &= \ S(t) - \textbf{1}(q(t))*C \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (eq.\ 6) \\ \\ R(t) &= \dfrac{q(t)}{C} + D \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (eq.\ 7) \\ \\ \dfrac{dR(t)}{dt} &= \dfrac{1}{C}\dfrac{dq(t)}{dt} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (eq.\ 8) \\ \\ \dfrac{dx(t)}{dt} &= K(x(t) - q(t)) \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (eq.\ 9) \\ \\ K &= \dfrac{log_{e}(1-\alpha)}{\delta} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (eq.\ 10) \\ \\ p(x) &=\left\{ \begin{array}{@{}ll@{}} 0 & \text{if}\ x < T_{min} \\ 1 & \text{if}\ x > T_{max} \\ \dfrac{x-T_{min}}{T_{max}-T_{min}}& \text{if}\ T_{min}<= x <= T_{max} \end{array} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (eq.\ 11) \right. \end{cases} \end{equation} \]

Some comments to clarify follow.

The first equation states that when at time \(t_{s}\) the source sends a packet, the corresponding acknowledgment will be received after the round-trip-time \(R(t_{s})\). On the other hand, when receiving an acknowledgment, the round-trip-time and probability loss both refer to their values at time \(t_{s}\). As a consequence:

\[ \begin{equation} \begin{aligned} t\ -\ \tau = R(t-\tau) \\ \end{aligned} \end{equation} \]

Further, the starting value of R can be computed by equation 7 by imposing \(q(t) = 0\):

\[ \begin{equation} \begin{aligned} R(0) = D \\ \end{aligned} \end{equation} \]

The second equation shows how the TCP tranmission rate is determined, based on TCP sender window size and round trip time.

The third equation models the acknowledge receiving rate which cannot exceed the link capacity.

The fourth equation states the sender window increase policy can be different based upon the comparison of the current window size with the TCP slow-start threshold (SST).

The fifth equation states the evolution with time of the TCP sender window is driven by additive-increse-multiplicative-decrease (AIMD) which is represented by two terms. The first one is the additive increase weighted on discarding probability complement (packet is not discarded):

\[ \begin{equation} \begin{aligned} \dfrac{dW(t)}{da_{ck}}\dfrac{da_{ck}(t-\tau)}{dt}{(1-p(x(t-\tau))} \\ \end{aligned} \end{equation} \]

It means that based on the positive acknowledgment rate of packet received at lag , the window size is increased based on equation 4. The second term:

\[ \begin{equation} \begin{aligned} -\beta\ W(t)\dfrac{da_{ck}(t-\tau)}{dt}p(x(t-\tau)) \\ \end{aligned} \end{equation} \]

takes into account the multiplicative decrease behavior in case of dropped packets occured at lag equal to \(\tau\), which is round trip time expressed by equation 1. The \(\beta\) decrease factor is typically constant and equal to \(\dfrac{1}{2}\). The term \(\beta\ W(t)\) represents the amount of TCP window size decrease. The term \(\dfrac{da_{ck}(t-\tau)}{dt}p(x(t-\tau))\) is the rate of such decrease based on the rate duplicated acknowledge packets are received by TCP source and packet discarding probability, all at lag \(\tau\).

Equation 6 states the variation with time of the FIFO buffer queue length is based on two terms. The first term \(S(t)\) is the queue length increase as a result of incoming traffic as fed by the TCP source. The second term \(-\textbf{1}(q(t))*C\) is the queue length decrement at constant rate C, occurring whenever there are FIFO buffer enqueued packets, in other words when \(q(t) > 0\). The \(1()\) function takes into account such latter boundary condition.

Equation 7 states that the round trip time is determined by the current queued packets \(q(t)\) draining rate plus a fixed delay \(D\).

Equation 8 is derived from equation 7 by taking derivative on t of both terms.

Equations 9 and 10 model the exponential weighted moving average (EWMA) of the istantaneous queue length, whose purpose is to filter out high frequency queue oscillations. The EWMA can be so expressed:

\[ \begin{equation} x((k+1)\delta)) = (1-\alpha)x(k\delta)\ +\ \alpha x(k\delta), \ \ \ \ 0\ <\ \alpha\ <\ 1 \end{equation} \]

The parameter is the sampling time expressed in seconds. Based on \(\alpha\) and \(\delta\) parameters, the \(K\) parameter is computed by equation 10.

Equation 11 models the discarding probability profile based on parameters \(T_{min}\) and \(T_{max}\) supposed as constant.

As a result, we have the following parameters to set:

\[ \begin{equation} P_{0}:=\ \{W_{0}, SST, C, D, \alpha, \beta, \delta, T_{min}, T_{max}, Q_{max}\} \end{equation} \]

and the initial values of unkwown \(Y(t)\), t = 0:

\[ \begin{equation} Y(0):=\ \{W(0), R(0), q(0), x(0), p(0)\}\ =\ \{W_{0}, D, 0, 0, 0\} \end{equation} \]

To computed the evolution with time of unknown:

\[ \begin{equation} Y(t):=\ \{W(t), R(t), q(t), x(q(t)), p(x(t))\} \end{equation} \]

I am going to apply an Euler based procedure to solve above system of equations. Further, from resulting W(t) and R(t), the S(t) transmission rate will be computed (based on equation 2).

Simulation

First I define known constant parameters, initial values and minimum and maximum values. The latter two shall be used to ensure feasible values.

# constant parameters
param.init <- c(W0 = 2, # initial window size [packets],
                SST = 10, # slow start threshold
                C = 30, # link bandwidth [packets/s]
                D = 0.2, # delay [s]
                alpha = 0.0001, # EWMA constant
                beta = 0.5, # TCP window decrease factor
                delta = 0.000266, # sampling time [s]
                Tmin = 20, # RED minimum threshold
                Tmax = 60, # RED maximum threshold
                Qmax = 80 # maximum buffer size
                )

# initial value of unknown y
yinit <- c(W = param.init["W0"], 
           R = param.init["D"], 
           q = 0, 
           x = 0, 
           p = 0)

# output y vector minimal feasible values
ymin <- c(W = param.init["W0"], 
          R = param.init["D"],
          q = 0,
          x = 0,
          p = 0)

# output y vector maximal feasible values
ymax <- c(W = 100, 
          R = param.init["D"] + param.init["Tmax"]/param.init["C"],
          q = param.init["Qmax"],
          x = param.init["Tmax"],
          p = 1)

names(yinit) <- c("W", "R", "q", "x", "p")
names(ymax) <- names(yinit)
names(ymin) <- names(yinit)

Further, I define some utility functions to implement the 1() function, the RED drop probability computation and feasibility constraint enforcement. Further, the get_tau utility function is deputed to determine the time tick corresponding to time lag \(\tau\) as modeled by equation 1.

# one function used for equation 3
one_f <- function(x) {
   ifelse (x > 0, 1 ,0)  
}

# RED drop probability
drop_p <- function(avg, Tmin, Tmax) {
  r <- 0
  if (avg >= Tmax) { 
    r <- 1
  } else if (avg > Tmin) {
    r <- (avg-Tmin)/(Tmax-Tmin)
  }
  r
}

# enforcement of Y feasible values to stand within [ymin, ymax] interval 
feasible <- function(z, zmin, zmax) {
  r <- z
  if (r < zmin) {
    r <- zmin
  } else if (z > zmax) {
    r <- zmax
  }
  r
}

# return the tick the lag terms refer to
get_tau <- function(y, t) {
  # using the global vector t_tau.v to store lag-tau tick values
  tau <- t_tau.v[t]
  if (is.na(tau)) {
    # if it is not available, looking backward for precedent available value
    tt <- t-1
    f <- FALSE
    if (tt >=1) {
      for (k in seq(tt, 1, -1)) {
        if (!is.na(t_tau.v[k])) {
          f <- TRUE
          break;
        }
      }
      if (f == TRUE) {
        # if found a value, replacing potential NA values with found one
        tau <- t_tau.v[k]
        for (s in seq(k+1, t, 1)) {
          if (is.na(t_tau.v[s])) {
              t_tau.v[s] <<- tau
          }
        }
      }
    }
  }
  tau
}

I am going to implement some core functions to solve above outlined system of equation.

Specifically for every time tick, I will compute the next tick \(Y(t):=\ \{W(t), R(t), q(t), x(q(t)), p(x(t))\}\) value based on current values and their variations.

# TCP function to determine next y value
TCP <- function(t, y, parms, dt) {
  # current transmission rate 
  S.t <- y[t,"W"]/y[t,"R"]
  # current round trip time
  rtt <- y[t,"R"]
  # round trip time acknowledge packets refer to t_tau tick
  t_tau <- get_tau(y, t)
  # if no lag tau data is yet available
  if (is.na(t_tau)) {
    # no increment of W as no acknowledge has yet been received
    dy1 <- 0
  } else {
    # lag tau values
    lag <- y[t_tau,]
    # lag tau transmission rate
    S.tau <- lag["W"]/lag["R"] 
    # equation 3
    dw.dack <- ifelse(y[t,"W"] >= parms["SST"], 1/y[t,"W"], 1)
    # equation 4
    dack.dt <- min(S.tau, parms["C"])
    # equation 5
    dy1 = dw.dack*dack.dt*(1-lag["p"]) - parms["beta"]*y[t,"W"]*dack.dt*lag["p"]
  }
  # equation 6
  dy3 <- (-one_f(y[t,"q"])*parms["C"] + S.t)
  # equation 8
  dy2 = dy3/parms["C"]
  # equation 10
  alpha <- parms["alpha"]
  delta <- parms["delta"]
  K <- (log(1-alpha))/delta
  # equation 9
  dy4 = K*(y[t,"x"] - y[t,"q"])
  # equation 11
  y5 <- drop_p(y[t,"x"], parms["Tmin"], parms["Tmax"])
  # computing y increment
  dy <- c(dy1, dy2, dy3, dy4)*dt
  names(dy) <- c("W", "R", "q", "x")
  # collecting next tick values and ensuring they are feasible values
  ynew <- vector(mode="numeric", length=ncol(y))
  ynew[1] <- feasible(y[t,"W"] + dy["W"], ymin["W"], ymax["W"])
  ynew[2] <- feasible(y[t,"R"] + dy["R"], ymin["R"], ymax["R"])
  ynew[3] <- feasible(y[t,"q"] + dy["q"], ymin["q"], ymax["q"])
  ynew[4] <- feasible(y[t,"x"] + dy["x"], ymin["x"], ymax["x"])
  ynew[5] <- feasible(y5, ymin["p"], ymax["p"])
  # computing the tick to refer to the current values, lag tau in the future
  delta_tick <- t+rtt/dt
  if (delta_tick <= length(t_tau.v)) {
    t_tau.v[delta_tick] <<- t
  }
  # final result is next tick values ynew vector
  ynew
}

# tcp equation solver
tcp.solve <- function(y0, time, tcp.func, parms, dt) {
  ncol <- length(y0)
  nrow <- length(time)
  y <- matrix(NA, nrow=nrow, ncol=ncol)
  colnames(y) <- names(y0)
  # initial value stored as first tick values
  y[1,] <- y0
  # main loop to collect y time evolution
  for (i in 1:(nrow-1)) {
    t <- i # 't' is the current time tick
    y[t+1,] <- tcp.func(t, y, parms, dt)
  }
  y
}

All is ready to run a simulation.

# time steps
dt.step <- 0.02
# timeline in seconds
times <- seq(0, 50, by = dt.step)
# array storing tick lag terms refer to
t_tau.v <- rep(NA, length(times))
# tcp simulation run
y.res <- tcp.solve(y0 = yinit, time = times, tcp.func = TCP, parms = param.init, dt = dt.step)

I gather all results in a single data frame and compute the TCP transmission rate \(S(t)\) based on equation 2. Further some plots are shown.

df <- data.frame(cbind(time = times, y.res))
df <- transform(df, S = df$W/df$R)
dim(df)
## [1] 2501    7
head(df)
time W R q x p S
0.00 2 0.2000000 0.0 0.0000000 0 10.000000
0.02 2 0.2066667 0.2 0.0000000 0 9.677419
0.04 2 0.2000000 0.0 0.0015038 0 10.000000
0.06 2 0.2066667 0.2 0.0014925 0 9.677419
0.08 2 0.2000000 0.0 0.0029851 0 10.000000
0.10 2 0.2066667 0.2 0.0029627 0 9.677419
tail(df)
time W R q x p S
2496 49.90 27.58127 0.9907832 23.72350 21.85527 0.0460201 27.83785
2497 49.92 27.41972 0.9893418 23.68025 21.86931 0.0463816 27.71512
2498 49.94 27.25502 0.9878185 23.63456 21.88293 0.0467328 27.59112
2499 49.96 27.08724 0.9862126 23.58638 21.89610 0.0470732 27.46593
2500 49.98 26.91647 0.9845232 23.53570 21.90881 0.0474025 27.33960
2501 50.00 26.74279 0.9827496 23.48249 21.92104 0.0477203 27.21221
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(Rmisc))
p1 <- ggplot(data=df, aes(x=time, y = W)) + geom_line() + ggtitle("TCP sender window size")
p2 <- ggplot(data=df, aes(x=time, y = R)) + geom_line() + ggtitle("TCP round-trip-time")
p3 <- ggplot(data=df, aes(x=time, y = S)) + geom_line() + ggtitle("TCP transmission rate")
p4 <- ggplot(data=df, aes(x=time, y = q)) + geom_line() + ggtitle("FIFO buffer queue length")
p5 <- ggplot(data=df, aes(x=time, y = x)) + geom_line() + ggtitle("FIFO Buffer queue EWMA")
p6 <- ggplot(data=df, aes(x=time, y = p)) + geom_line() + ggtitle("RED drop probability")
multiplot(p1, p3, p5, p2, p4, p6, cols=2)

As final step, I save the results into a file to be used in next post for exploratory analysis.

write.csv(df, file = "TCP.csv", row.names = FALSE, sep = ",")

Conclusions

In this post I showed a simulation run of the TCP dynamic behavior based upon its mathematical model.

Next post will provide an exploratory analysis of collected simulation results.

References

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.