Saturday, December 26, 2015

Optimization in networks

Single Commodity Network Flow problem

In this post, I am going to show the solution of a single commodity resource allocation problem involving a small network. Such network comprises a set of nodes and links which have the capability to forward and deliver traffic. Nodes are connected each other by bidirectional links.

The network links are characterized by two quantities:

  • capacity, determining how much traffic such link can transport
  • weight, representing the cost per traffic unit spent by forwarding traffic through

As hypothesis:

  • some traffic demand has to be handled by a specific node, the source node, and it is required to be delivered to the destination node, while minimizing the overall transportation cost. Such overall transportation cost is determined by the sum of costs of the links the optimal pah crosses. Such costs are implied by the link weights per each transported unit of traffic.

  • a path is represented as an alternating sequence of nodes and links.

  • traffic demand can also be split across multiple links and nodes, if that improves the overall cost.

  • intermediates node are the ones which are not source or destination nodes.

  • an intermediate node cannot send back traffic to the source node.

  • the destination node cannot send back traffic to other nodes (it is a sink).

Goal of the problem is to determine the traffic delivery path and related traffic amounts the original demand is splitted off (in the case).

All that determines the optimization problem that I am going to solve taking advantage of the linprog package.

Our sample network is the following

sample network.

Be:

\(x_{1,2}\) the traffic amount delivered through link connecting nodes 1 and 2

\(x_{1,3}\) the traffic amount delivered through link connecting nodes 1 and 3

\(x_{2,3}\) the traffic amount delivered through link connecting nodes 2 and 3

\(x_{3,2}\) the traffic amount delivered through link connecting nodes 3 and 2

\(x_{2,4}\) the traffic amount delivered through link connecting nodes 2 and 4

\(x_{3,4}\) the traffic amount delivered through link connecting nodes 3 and 4

In the order, above listed variables are the unknowns of our problem.

The objective function is:

\(min_{\textbf{x}} F(\textbf{x}) = min_{\textbf{x}} (w_{1,2}*x_{1,2} + w_{1,3}*x_{1,3} + w_{2,3}*x_{2,3} + w_{3,2}*x_{3,2} + w_{2,4}*x_{2,4} + w_{3,4}*x_{3,4})\)

where:

\(\textbf{x} = (x_{1,2}, x_{1,3}, x_{2,3}, x_{3,2}, x_{2,4}, x_{3,4})\)

is the vector of unknowns.

A first set of constraints is determined by the traffic demand at the source and destination nodes and the absence of traffic losses throughout the network.

\(x_{1,2} + x_{1,3} = D\)

\(x_{1,2} - x_{2,3} - x_{2,4} + x_{3,2} = 0\)

\(x_{1,3} + x_{2,3} - x_{3,2} - x_{3,4} = 0\)

\(x_{2,4} + x_{3,4} = D\)

where D is the traffic demand.

Further constraints are determined by the link capacity limits and non-negative traffic flow amounts.

\(x_{1,2} <= c_{1,2}\), \(x_{1,3} <= c_{1,3}\), \(x_{2,3} <= c_{2,3}\), \(x_{2,4} <= c_{2,4}\), \(x_{3,2} <= c_{2,3}\), \(x_{3,4} <= c_{3,4}\)

\(x_{1,2} >= 0\), \(x_{1,3} >= 0\), \(x_{2,3} >= 0\), \(x_{2,4} >= 0\), \(x_{3,2} >= 0\), \(x_{3,4} >= 0\)

Of course the constraints inequalities can be put in matricial form. Having finished up with the mathematical formulation of our problem, let us talk about the R language solution implementation.

I will take advantages of two R packages, igraph and linprog.

The igraph package basically allows to create graphs of various types, associate attributes to vertexes and edges, plot them and to compute properties as defined the mathematical graph theory.

The linprog package basically allows to define and solve a linear programming optimization problem. In the following I may use the term node in place of vertex and the term link in place of edge.

A requirement of the approach I will follow here below is that to define the equations to solve the single commodity network flow starting from the igraph object as defined by:

  • nodes (vertexes), which are identified by a numeric label
  • links (edges), which are characterized by capacity and weights numeric values

As further input, the source node label, the destination node label and the traffic demand to be satisfied will be specified.

The advantage of the approach I am proposing is to have available a function capable to compute all the variables involved by the implied linear programming optimization problem, which are:

  • the number of the unknowns
  • the definition of the unknowns
  • the objective function
  • the constraint inequalities

and then easily input such quantities to the linprog package optimization routine and having back the solution.

The igraph object which represents the network as above displayes is herein shown.

suppressPackageStartupMessages(library(igraph))

g <- graph.formula(1-2, 1-3, 2-3, 2-4, 3-4)
E(g)$weight <- c(10, 12, 8, 7, 5)
E(g)$capacity <- c(10, 9, 8, 10, 10)

As you can see above, I defined a four nodes and five links graph. The links can carry traffic both directions. Further, I defined the links weights and their capacities. As already said, the weights represent the unitary cost for transporting traffic throughout the associated link. The links capacity will be involved in defining some of the constraints.

As further input to our problem, the request as defined below.

request <- list("node.source" = 1, "node.destination" = 4, "demand" = 15)

The above request asks for determining a path able to carry traffic from node 1 to node 4. The traffic demand may also be forwarded by splitting it in two or more sub-flows crossing the network throughout different paths starting always from node 1 and reaching node 4. That will be implied by any demand whose value is strictly greater than the links capacities at some stage of the forwarding process. Considering the above demand value of 15 and the links capacities which are not strictly greater than 10, this is the case.

The objective function results:

\(min_{\textbf{x}} F(\textbf{x}) = min_{\textbf{x}} (10*x_{1,2} + 12*x_{1,3} + 8*x_{2,3} + 8*x_{3,2} + 7*x_{2,4} + 5*x_{3,4})\)

as \(w_{3,2}\) is equal to \(w_{2,3}\)

The constraints are:

\(x_{1,2} + x_{1,3} = 15\)

\(x_{1,2} - x_{2,3} - x_{2,4} + x_{3,2} = 0\)

\(x_{1,3} + x_{2,3} - x_{3,2} - x_{3,4} = 0\)

\(x_{2,4} + x_{3,4} = 15\)

\(x_{1,2} <= 10\), \(x_{1,3} <= 9\), \(x_{2,3} <= 8\), \(x_{2,4} <= 10\), \(x_{3,2} <= 8\), \(x_{3,4} <= 10\)

\(x_{1,2} >= 0\), \(x_{1,3} >= 0\), \(x_{2,3} >= 0\), \(x_{2,4} >= 0\), \(x_{3,2} >= 0\), \(x_{3,4} >= 0\)

The compute_variables() function as below implemented fulfills the following tasks:

  • compute the graph adjacency matrix

  • it deletes the source node column and destination node row of the adjacency matrix to fulfill with the requirement that the intermediate nodes cannot send back traffic to the source node and the destination node cannot send back traffic to the intermediate nodes

  • computes the number of unknowns

  • determine the label associated to the unknowns

  • computes the constraint matrix (M), its right hand side (rhs) and the inequalities directins (dir)

  • computes the weights vector starting from the graph link weights and considering their association to the unknowns.

About the last point, note that two unknowns, \(x_{2,3}\) and \(x_{3,2}\) have been defined to represent traffic sent from node 2 to node 3 and viceversa. Since we do not know in advance what path shall be the optimal one, we need to represent both within our objective function and hence associate weights to both of them.

compute_variables <- function(g, request) {
  s <- request$node.source
  d <- request$node.destination
  demand <- request$demand
  adj.m <- get.adjacency(g)
  m <- as.matrix(adj.m)
  m[, s] <- 0
  m[d, ] <- 0
  n <- sum(m)
  unknowns <- vector(mode="character", length=n)
  k <- 1
  for (i in 1:nrow(m)) {
    for(j in 1:ncol(m)) {
      if (m[i,j] == 1) {
        unknowns[k] <- paste(i, j, sep='|')
        k <- k + 1
      }
    }
  }
  
  n.vertex <- length(V(g))
  n.edge <- length(E(g))
  n.unknowns <- length(unknowns)
  rhs.len <- n.vertex + 2*n.unknowns
  rhs <- rep(0, rhs.len)
  weights <- rep(0, n.unknowns)

  M <- matrix(0, nrow=length(rhs), ncol=n.unknowns)
  for ( i in 1:n.vertex) {
    if (i == s) {
      p <- paste(i, "|", sep="")
      f <- grep(p, unknowns, fixed=TRUE)
      M[i, f] <- 1
      rhs[i] <- demand
    } else if (i == d) {
      p <- paste("|", i, sep="")
      f <- grep(p, unknowns, fixed=TRUE)
      M[i, f] <- 1
      rhs[i] <- demand
    } else {
      p <- paste("|", i, sep="")
      f <- grep(p, unknowns, fixed=TRUE)
      M[i,f] <- 1
      p <- paste(i, "|", sep="")
      f <- grep(p, unknowns, fixed=TRUE)
      M[i,f] <- (-1)
      rhs[i] <- 0
    }
  }

  e.v <- attr(E(g), "vnames")
  
  for (i in 1:n.unknowns) {
    M[n.vertex+i,i] <- 1
    item <- unknowns[i]
    elem <- strsplit(item, "|")
    dual <- as.numeric(elem[[1]][3])
    item.dual <- paste(elem[[1]][3], elem[[1]][1], sep="|")
    f <- grep(item, e.v, fixed=TRUE)
    if (length(f) > 0) {
      rhs[n.vertex+i] <- E(g)$capacity[f]
      weights[i] <- E(g)$weight[f]
    }
    f <- grep(item.dual, e.v, fixed=TRUE)
    if (length(f) > 0) {
      rhs[n.vertex+i] <- E(g)$capacity[f]
      weights[i] <- E(g)$weight[f]
    }
  }
  
  for (i in 1:n.unknowns) {
    M[n.vertex+n.edge+i,i] <- 1
  }
  nrow.M <- nrow(M)
  dir <- rep("=", nrow.M )
  dir[n.vertex+1:(n.vertex+n.edge)] <- "<="
  dir[(n.vertex+n.edge+1):nrow.M] <- ">="

  list("n.unknowns"=n, "unknowns" = unknowns, "weights" = weights, 
       "matrix"=M, "rhs"=rhs, "dir"=dir)
}

It is time now to compute the variables of our problem based upon the graph and request items.

var <- compute_variables(g, request)
var
## $n.unknowns
## [1] 6
## 
## $unknowns
## [1] "1|2" "1|3" "2|3" "2|4" "3|2" "3|4"
## 
## $weights
## [1] 10 12  8  7  8  5
## 
## $matrix
##       [,1] [,2] [,3] [,4] [,5] [,6]
##  [1,]    1    1    0    0    0    0
##  [2,]    1    0   -1   -1    1    0
##  [3,]    0    1    1    0   -1   -1
##  [4,]    0    0    0    1    0    1
##  [5,]    1    0    0    0    0    0
##  [6,]    0    1    0    0    0    0
##  [7,]    0    0    1    0    0    0
##  [8,]    0    0    0    1    0    0
##  [9,]    0    0    0    0    1    0
## [10,]    1    0    0    0    0    1
## [11,]    0    1    0    0    0    0
## [12,]    0    0    1    0    0    0
## [13,]    0    0    0    1    0    0
## [14,]    0    0    0    0    1    0
## [15,]    0    0    0    0    0    1
## [16,]    0    0    0    0    0    0
## 
## $rhs
##  [1] 15  0  0 15 10  9  8 10  8 10  0  0  0  0  0  0
## 
## $dir
##  [1] "="  "="  "="  "="  "<=" "<=" "<=" "<=" "<=" ">=" ">=" ">=" ">=" ">="
## [15] ">=" ">="

We can recognize the unknowns vector so represented:

\("1|2" -> x_{1,2}\)

\("1|3" -> x_{1,3}\)

\("2|3" -> x_{2,3}\)

\("2|4" -> x_{2,4}\)

\("3|2" -> x_{3,2}\)

\("3|4" -> x_{3,4}\)

the matrix M of constraints multiplied by the vector of unknowns as follows:

\(\textbf{M * x}\)

is the left hand side of constraint inequalities, whose direction and right hand side values can be found in dir and rhs.

Finally, it is time to compute the optimal solution.

suppressPackageStartupMessages(library(linprog))
opt.sol <- solveLP(var$weights, var$rhs, var$matrix, maximum=FALSE, var$dir, lpSolve=TRUE)
opt.sol
## 
## 
## Results of Linear Programming / Linear Optimization
## (using lpSolve)
## 
## Objective function (Minimum): 255 
## 
## Solution
##   opt
## 1  10
## 2   5
## 3   0
## 4  10
## 5   0
## 6   5
## 
## Constraints
##    actual dir bvec free
## 1      15   =   15    0
## 2       0   =    0    0
## 3       0   =    0    0
## 4      15   =   15    0
## 5      10  <=   10    0
## 6       5  <=    9    4
## 7       0  <=    8    8
## 8      10  <=   10    0
## 9       0  <=    8    8
## 10     15  >=   10    5
## 11      5  >=    0    5
## 12      0  >=    0    0
## 13     10  >=    0   10
## 14      0  >=    0    0
## 15      5  >=    0    5
## 16      0  >=    0    0
str(opt.sol)
## List of 9
##  $ status    : num 0
##  $ lpStatus  : int 0
##  $ solution  : Named num [1:6] 10 5 0 10 0 5
##   ..- attr(*, "names")= chr [1:6] "1" "2" "3" "4" ...
##  $ opt       : num 255
##  $ con       :'data.frame':  16 obs. of  4 variables:
##   ..$ actual: num [1:16] 15 0 0 15 10 5 0 10 0 15 ...
##   ..$ dir   : Factor w/ 3 levels "<=","=",">=": 2 2 2 2 1 1 1 1 1 3 ...
##   ..$ bvec  : num [1:16] 15 0 0 15 10 9 8 10 8 10 ...
##   ..$ free  : num [1:16] 0 0 0 0 0 4 8 0 8 5 ...
##  $ maximum   : logi FALSE
##  $ lpSolve   : logi TRUE
##  $ solve.dual: logi FALSE
##  $ maxiter   : num 1000
##  - attr(*, "class")= chr "solveLP"

Note the status and lpStatus bot equals to zero inidicate successfully solution of the linear programming problem.

This is the optimal solution:

\(x_{1,2} = 10\)

\(x_{1,3} = 5\)

\(x_{2,3} = 0\)

\(x_{2,4} = 10\)

\(x_{3,2} = 0\)

\(x_{3,4} = 5\)

implying an objective function value equal to:

\(F(\textbf{x}) = 10*x_{1,2} + 12*x_{1,3} + 8*x_{2,3} + 7*x_{2,4} + 8*x_{3,2} + 5*x_{3,4}\) = 10 * 10 + 12 * 5 + 8 * 0 + 7 * 10 + 8 * 0 + 5 * 5 = 255

If the demands is less than 10, no path split is necessary, as herein shown

request$demand <- 10
var <- compute_variables(g, request)
var
## $n.unknowns
## [1] 6
## 
## $unknowns
## [1] "1|2" "1|3" "2|3" "2|4" "3|2" "3|4"
## 
## $weights
## [1] 10 12  8  7  8  5
## 
## $matrix
##       [,1] [,2] [,3] [,4] [,5] [,6]
##  [1,]    1    1    0    0    0    0
##  [2,]    1    0   -1   -1    1    0
##  [3,]    0    1    1    0   -1   -1
##  [4,]    0    0    0    1    0    1
##  [5,]    1    0    0    0    0    0
##  [6,]    0    1    0    0    0    0
##  [7,]    0    0    1    0    0    0
##  [8,]    0    0    0    1    0    0
##  [9,]    0    0    0    0    1    0
## [10,]    1    0    0    0    0    1
## [11,]    0    1    0    0    0    0
## [12,]    0    0    1    0    0    0
## [13,]    0    0    0    1    0    0
## [14,]    0    0    0    0    1    0
## [15,]    0    0    0    0    0    1
## [16,]    0    0    0    0    0    0
## 
## $rhs
##  [1] 10  0  0 10 10  9  8 10  8 10  0  0  0  0  0  0
## 
## $dir
##  [1] "="  "="  "="  "="  "<=" "<=" "<=" "<=" "<=" ">=" ">=" ">=" ">=" ">="
## [15] ">=" ">="
opt.sol <- solveLP(var$weights, var$rhs, var$matrix, maximum=FALSE, var$dir, lpSolve=TRUE)
opt.sol
## 
## 
## Results of Linear Programming / Linear Optimization
## (using lpSolve)
## 
## Objective function (Minimum): 170 
## 
## Solution
##   opt
## 1  10
## 2   0
## 3   0
## 4  10
## 5   0
## 6   0
## 
## Constraints
##    actual dir bvec free
## 1      10   =   10    0
## 2       0   =    0    0
## 3       0   =    0    0
## 4      10   =   10    0
## 5      10  <=   10    0
## 6       0  <=    9    9
## 7       0  <=    8    8
## 8      10  <=   10    0
## 9       0  <=    8    8
## 10     10  >=   10    0
## 11      0  >=    0    0
## 12      0  >=    0    0
## 13     10  >=    0   10
## 14      0  >=    0    0
## 15      0  >=    0    0
## 16      0  >=    0    0

If the demand input variable exceeds 19 (strictly greater than), there is no path that can be found to deliver traffic. In that case the lpSolve routine will return status = 1 and lpStatus = 2, as herein shown.

request$demand <- 20
var <- compute_variables(g, request)
var
## $n.unknowns
## [1] 6
## 
## $unknowns
## [1] "1|2" "1|3" "2|3" "2|4" "3|2" "3|4"
## 
## $weights
## [1] 10 12  8  7  8  5
## 
## $matrix
##       [,1] [,2] [,3] [,4] [,5] [,6]
##  [1,]    1    1    0    0    0    0
##  [2,]    1    0   -1   -1    1    0
##  [3,]    0    1    1    0   -1   -1
##  [4,]    0    0    0    1    0    1
##  [5,]    1    0    0    0    0    0
##  [6,]    0    1    0    0    0    0
##  [7,]    0    0    1    0    0    0
##  [8,]    0    0    0    1    0    0
##  [9,]    0    0    0    0    1    0
## [10,]    1    0    0    0    0    1
## [11,]    0    1    0    0    0    0
## [12,]    0    0    1    0    0    0
## [13,]    0    0    0    1    0    0
## [14,]    0    0    0    0    1    0
## [15,]    0    0    0    0    0    1
## [16,]    0    0    0    0    0    0
## 
## $rhs
##  [1] 20  0  0 20 10  9  8 10  8 10  0  0  0  0  0  0
## 
## $dir
##  [1] "="  "="  "="  "="  "<=" "<=" "<=" "<=" "<=" ">=" ">=" ">=" ">=" ">="
## [15] ">=" ">="
opt.sol <- solveLP(var$weights, var$rhs, var$matrix, maximum=FALSE, var$dir, lpSolve=TRUE)
opt.sol
## 
## 
## Results of Linear Programming / Linear Optimization
## (using lpSolve)
## lpSolve returned error code '2'

In this post I showed how to solve the single commodity network flow problem starting from the network graph and its nodes and links attributes. A function to compute the input variables for the associated linear programming optimization problem was shown. Of course if you like, you may combine those two steps as a single function.