9. Punishment values and BLPs

The first step is to construct punishment values over each partition element of the state space. This will turn out to be amenable to separable bilinear programming formulations.

Punishment payoffs for the large player G are constructed as follows.

  • For each (\lambda',a)\in D\times A, define a correspondence \tilde{\mathcal{W}}:D\times A \rightrightarrows \mathbb{R}^\mathcal{Z} by

    \begin{align*}
    \tilde{\mathcal{W}}(\lambda',a) := \{w\in \mathcal{W}(\lambda')\,|\,\ & \forall j\in \mathcal{Z},\ \forall a'\in \tilde{A}:
    \\
    & \delta[p^j(a')-p^j(a_j)]\cdot w\leq (1-\delta) [\phi(a')-\phi(a_j)]\}.
    \end{align*}

    Note

    \tilde{\mathcal{W}} encodes two requirements:

    • continuation values w must be consistent with the set \mathcal{W}(\lambda'), where w itself enforces the action-continuation-state pair (a,\lambda') \equiv (a,\lambda P(a)); and
    • individually, each action-and-promised-value pair (a_j, w_j) is optimal (i.e. “incentive compatible”).
  • Next, construct government punishment vectors

    • (\check{\pi}_k)_{k=1}^K, and
    • (\hat{\pi}_k)_{k=1}^K
    by letting

    \begin{align*}
    (1)\qquad \pi(\lambda):=\max_{b\in B}\ \min_{a,\lambda',w}\ &[(1-\delta)\lambda\cdot v(a,b)+\delta \lambda'\cdot w],\\
    \text{s.t.}\ \ &\lambda'=\lambda P(a),\\
    &-\lambda b^T \geq 0,\\
    &w\in \tilde{\mathcal{W}}(\lambda',a).
    \end{align*}

    Then define \check{\pi}_k:=\min_{\lambda\in Q_k}\pi(\lambda) and \hat{\pi}_k:=\max_{\lambda\in Q_k}\pi(\lambda). (A numerical sketch of the incentive constraints that enter (1) through \tilde{\mathcal{W}} follows below.)
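
To fix ideas, here is a minimal Python sketch of how the incentive constraints defining \tilde{\mathcal{W}}(\lambda',a) can be stacked into a linear inequality system G w \leq g on the continuation values w. The function name, array shapes, and argument names are illustrative assumptions, not the library's interface.

    import numpy as np

    def ic_constraints(p, phi, a_idx, delta):
        """Stack the incentive constraints defining W-tilde(lambda', a) as G w <= g.

        Illustrative shapes (assumptions, not the library's API):
          p     : (Z, n_actions, Z) array; p[j, i] is the transition row p^j(a_i)
          phi   : (n_actions,) array of per-period payoffs phi(a_i)
          a_idx : (Z,) integer array; a_idx[j] is the recommended action a_j in state j
          delta : common discount factor
        """
        Z, n_actions, _ = p.shape
        rows, rhs = [], []
        for j in range(Z):                  # each private agent state j
            aj = a_idx[j]                   # recommended action a_j
            for ap in range(n_actions):     # each feasible deviation a'
                # delta * [p^j(a') - p^j(a_j)] . w <= (1 - delta) * [phi(a') - phi(a_j)]
                rows.append(delta * (p[j, ap] - p[j, aj]))
                rhs.append((1.0 - delta) * (phi[ap] - phi[aj]))
        return np.asarray(rows), np.asarray(rhs)

Appending (G, g) to the inequalities describing \mathcal{W}(\lambda') then gives an H-representation of \tilde{\mathcal{W}}(\lambda',a).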

Note

  • If \mathcal{W}(\lambda') is defined as a convex polytope, then \tilde{\mathcal{W}}(\lambda',a) is also a convex polytope, since it is the intersection of \mathcal{W}(\lambda') with finitely many half-spaces (one per incentive constraint).
  • Given (a,b) \in A \times B, the minimization program in (1) is a mildly nonlinear programming problem, namely a separable bilinear programming (BLP) formulation of the following generic form:

    \begin{align*}
    (2)\qquad \min_{\lambda,w}\ & u^{T}\lambda + \lambda^{T} Q w
    \\
    \text{s.t.}\ & \lambda \in \mathcal{F}_1,
    \\
    & w \in \mathcal{F}_2.
    \end{align*}

  • Moreover, \mathcal{F}_1 and \mathcal{F}_2 are disjoint, convex, and bounded polytopes.

Existence of Optimum

If \mathcal{F}_1 and \mathcal{F}_2 are bounded, then there exists an optimal solution of (2), (\lambda^{\ast}, w^{\ast}), with \lambda^{\ast} \in \mathcal{F}_1 and w^{\ast} \in \mathcal{F}_2; moreover, an optimum is attained at a vertex of \mathcal{F}_1 \times \mathcal{F}_2. (A toy instance exploiting this vertex property is sketched below.)

See [HPT2000] or [HT1996].
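
The existence result can be seen in a small numerical example. Because the objective of (2) is linear in \lambda for fixed w (and linear in w for fixed \lambda), a global optimum is attained at a vertex pair, so for a tiny instance we may enumerate the vertices of \mathcal{F}_1 and solve an LP in w at each one. The data below (u, Q, a simplex \mathcal{F}_1, a box \mathcal{F}_2) are made up purely for illustration; scipy's linprog is used only as a convenient LP solver.

    import numpy as np
    from scipy.optimize import linprog

    # Toy separable BLP in the generic form (2):
    #   min_{lam, w}  u . lam + lam' Q w   s.t.  lam in F1,  w in F2
    # F1 = two-dimensional unit simplex (listed by its vertices), F2 = the box [0, 1]^2.
    u = np.array([0.5, -1.0, 0.2])
    Q = np.array([[ 1.0, -2.0],
                  [ 0.0,  1.5],
                  [-1.0,  0.5]])
    F1_vertices = np.eye(3)                      # unit vectors e_1, e_2, e_3

    best = None
    for lam in F1_vertices:
        # For fixed lam the problem is an LP in w:  min (Q' lam) . w  + u . lam
        res = linprog(c=Q.T @ lam, bounds=[(0.0, 1.0)] * 2)
        val = u @ lam + res.fun
        if best is None or val < best[0]:
            best = (val, lam.copy(), res.x)

    print("global value:", best[0])
    print("lambda*, w* :", best[1], best[2])     # both are vertices of F1 and F2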

Separable BLP is NP-complete

In general, separable bilinear programming is NP-complete: for example, the linear complementarity problem, itself NP-complete, can be recast as a separable bilinear program.

See [Man1995].


Assume that we are solving the minimization problem in (2). We employ a well-known deterministic global optimization algorithm known as branch-and-bound (BNB). First, a relaxation of the bilinear program is solved, typically by solving inexpensive LPs in which the bilinear term is replaced by its convex lower envelope [CM2009]; the solution of this relaxed problem yields a lower bound on the globally optimal value. The relaxation is then iteratively refined by partitioning the domain (the feasible sets) and successively eliminating dominated local optima. (This is also a common strategy for solving integer linear programs.) An upper-bound estimate of the optimum can be found by running local nonlinear solvers (e.g. SNOPT or IPOPT) over each branch. Thus we have successively refined partitions of the domain (branching) and successively tighter lower and upper bounds (bounding) on the \epsilon-global optimum.
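
As a concrete illustration of the lower-bounding step, the toy problem below relaxes a single bilinear term with its McCormick envelopes [McC1976] and solves the resulting LP; any feasible point then supplies an upper bound that a local solver would refine. The example is made up for illustration and is unrelated to program (1).

    import numpy as np
    from scipy.optimize import linprog

    # Toy problem:  min  -x*y   s.t.  x + y <= 1,  (x, y) in [0, 1]^2.
    # The true global minimum is -0.25 at (0.5, 0.5).
    #
    # Root-node relaxation: introduce z ~ x*y and keep only the four linear
    # McCormick inequalities valid for z = x*y on [0, 1]^2:
    #   z >= 0,   z >= x + y - 1,   z <= x,   z <= y.
    # Variables are ordered (x, y, z); the relaxed objective is -z.
    c = np.array([0.0, 0.0, -1.0])
    A_ub = np.array([
        [ 0.0,  0.0, -1.0],   #  z >= 0          ->  -z <= 0
        [ 1.0,  1.0, -1.0],   #  z >= x + y - 1  ->  x + y - z <= 1
        [-1.0,  0.0,  1.0],   #  z <= x          ->  z - x <= 0
        [ 0.0, -1.0,  1.0],   #  z <= y          ->  z - y <= 0
        [ 1.0,  1.0,  0.0],   #  x + y <= 1
    ])
    b_ub = np.array([0.0, 1.0, 0.0, 0.0, 1.0])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1), (0, 1), (None, None)])

    x, y, _ = res.x
    lower_bound = res.fun            # -0.5: valid lower bound from the relaxation
    upper_bound = -x * y             # -0.25: value of the feasible point (x, y)
    print(lower_bound, upper_bound)  # BNB would now branch on x or y to close the gap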

Note

The BNB algorithm we use follows [McC1976] and is implemented through the Bilinear Matrix Inequality BNB interface (BMIBNB) available in Johan Löfberg’s YALMIP. To solve the local lower-bounding LPs, we use the open-source GNU GLPK optimizer, and to solve the upper-bounding nonlinear programs, we use either SNOPT or IPOPT.

9.1. Implementing punishment values

The following pseudocode implements the punishment calculations for the outer-approximation scheme; a schematic Python rendering of the same loop follows the pseudocode.

Pseudocode

Input: H, c, PolyLcons

For each (Q_k,a,b) \in D \times A \times B:
  • Markov map P \gets P(a)

  • Simplex T \gets P(Q_k)

  • Get \mathbf{I}(a,k) := \{ k' \in \mathbf{K}: k' \mapsto Q_{k'} \in D,\ Q_{k'} \cap T \neq \emptyset \}

    • Uses: xTriIndex from Simplex_IntersectPmap

  For each k' \in \mathbf{I}(a,k):
    • Get linear inequality representations of T \cap Q_{k'} as

    (M_{k'}, d_{k'}) \gets xPolyLcons(a,k,k')

    • Get current payoff profile v(a,b)
    • Solve separable BLP

    \begin{align*}
    \check{\pi}(Q_k,a,b,k') = \min_{\lambda \in Q_k,\, w}\ & \lambda \left[ (1-\delta)v(a,b) + \delta P w \right]\\
    \text{s.t.}\quad & M_{k'} (\lambda P)^T \leq d_{k'},\\
    & \lambda b^T \leq 0,\\
    & H w \leq c^{o}(Q_{k'}, \cdot),\\
    & \delta[p^j(a')-p^j(a_j)]\cdot w \leq (1-\delta) [\phi(a')-\phi(a_j)], \quad \forall j \in \mathcal{Z},\ \forall a' \in \tilde{A}
    \end{align*}

  • Get \check{\pi}(Q_k,a,b) = \min_{k' \in \mathbf{I}(a,k)} \check{\pi}(Q_k,a,b,k').

  • Get \check{\pi}(Q_k)  = \max_{b \in B} \min_{a \in A}\check{\pi}(Q_k,a,b).
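
Below is a schematic Python rendering of the loop above. It assumes a `model` object bundling the primitives (the partition Q, action sets A and B, the Markov maps P(a), payoffs v(a,b), and the xTriIndex/xPolyLcons helpers) and a caller-supplied `solve_blp` routine returning the value of each separable BLP (in the actual implementation this is the BMIBNB call described earlier). These names are assumptions for exposition, not the library's API.

    import numpy as np

    def punish_outer(model, H, c_o, delta, solve_blp):
        """Outer-approximation punishment values (schematic version of the pseudocode).

        model     : bundles Q (list of partition elements Q_k), A, B, P(a), v(a, b),
                    and the helpers xTriIndex / xPolyLcons -- illustrative names only.
        H, c_o    : inequality representation of the current outer step correspondence.
        solve_blp : callable returning the value pi_check(Q_k, a, b, k') of the
                    separable BLP above (e.g. a wrapper around BMIBNB).
        """
        K = len(model.Q)
        pival = np.empty(K)
        for k, Qk in enumerate(model.Q):
            table = np.empty((len(model.B), len(model.A)))   # rows: b, columns: a
            for ib, b in enumerate(model.B):
                for ia, a in enumerate(model.A):
                    P = model.P(a)                           # Markov map under a
                    I_ak = model.xTriIndex(k, a)             # k' with Q_k' meeting P(Q_k)
                    vals = []
                    for kp in I_ak:
                        M, d = model.xPolyLcons(a, k, kp)    # T intersect Q_k' as M x <= d
                        val = solve_blp(Qk, M, d, H, c_o[kp], model.v(a, b), P, b, delta)
                        vals.append(val)
                    table[ib, ia] = min(vals)                # min over k' in I(a, k)
            pival[k] = table.min(axis=1).max()               # max_b min_a
        return pival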

Note

  • The separable constraint sets \mathcal{F}_1 and \mathcal{F}_2 for \lambda and w, respectively, are given by the constraints in the BLP above. These constraints say the following.

    • \mathcal{F}_1: the first two constraints require \lambda \in Q_k to be such that

      • for each a \in A, k \in \mathbf{K} and k' \in
\mathbf{I}(a,k), the resulting continuation state \lambda P(a) \in Q_{k'} \cap P(Q_k); and
      • given a fixed policy b, the choice over \lambda renders b feasible according to the government budget constraint;
    • \mathcal{F}_2 is given by the requirements that w be

      • consistent with respect to the step correspondence slice \mathcal{W}(k'), which has constant levels over the partition element Q_{k'}; and
      • such that w is incentive compatible for all small agents.
  • Constructing the punishment values \hat{\pi}_k for the inner-approximation scheme is similar to the construction detailed above for \check{\pi}_k. The only differences are

  1. in the second-to-last step of the pseudocode above, replace that line with:

    \hat{\pi}(Q_k,a,b) = \max_{k' \in \mathbf{I}(a,k)} \hat{\pi}(Q_k,a,b,k').

    Of course, we should also relabel all the \check{\pi} notation for the punishment value function as \hat{\pi}; and

  2. maximize (rather than minimize) over \lambda \in Q_k in the main BLP problem; see the brief sketch below.
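
For completeness, here is the corresponding schematic change set for the inner scheme relative to punish_outer above; the keyword sense is a hypothetical argument of the hypothetical solve_blp wrapper.

    # Inner-approximation variant (schematic): relative to punish_outer above,
    # only two lines change, plus the sense of the BLP over lambda:
    #
    #   val = solve_blp(Qk, M, d, H, c_i[kp], model.v(a, b), P, b, delta,
    #                   sense="max")             # maximize over lambda in Q_k
    #   table[ib, ia] = max(vals)                # max over k' in I(a, k)
    #
    # where c_i holds the inner step-correspondence levels and the results are
    # relabelled pi_hat throughout.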


Relevant functions

Punish_Outer(self)

Returns:

  • pival :

    • A (K \times 1) numeric array containing elements \check{\pi}(Q_k) where k \in \mathbf{K}.

See also

PunishK

Punish_Inner(self)

Returns:

  • pival :

    • A (K \times 1) numeric array containing elements \hat{\pi}(Q_k) where k \in \mathbf{K}.

See also

PunishK




[CM2009]Caprara, Alberto and Michele Monaci (2009): “Bidimensional packing by bilinear programming”. Mathematical Programming, Series A, 118, 75–108.
[HPT2000]Horst, R., P. Pardalos and N. Thoai (2000): Introduction to Global Optimization. 2nd Edition, Boston: Springer.
[HT1996]Horst, R. and Hoang Tuy (1996): Global Optimization. 3rd Edition, New York: Springer.
[Man1995]Mangasarian, Olvi L. (1995): “The linear complementarity problem as a separable bilinear program”. Journal of Global Optimization, 12, 1–7.
[McC1976]McCormick, Garth P. (1976): “Computability of global solutions to factorable nonconvex programs: Part I - Convex underestimating problems”. Mathematical Programming, 10, 147–175.