I was watching a tutorial for some modeling software when I saw a reference to Lagrange multipliers. Lagrange multipliers... it rang a bell from my distant past, but I didn't recall what it was. Fortunately Google is my friend, and I pulled up an MIT video of a "recitation" -- what MIT calls the sessions with teaching assistants, held between the lectures taught by professors, designed to provide practice, review, and supplemental material.

The problem posed in the video was the following: suppose I want to optimize (maximize or minimize) a function

f(x, y, z) = x^{2} + x + 2 y^{2} + 3 z^{2},

where the solution is constrained on the unit sphere:

g(x, y, z) = x^{2} + y^{2} + z^{2} - 1 = 0

Lagrange multipliers are based on the observation that if f(x, y, z) is optimal among the points satisfying g(x, y, z) = 0, then the gradient of f and the gradient of g must differ only by a scale factor λ at that point, assuming the derivatives of both functions are continuous. This means there exists a λ for which the function f'(x, y, z) = f(x, y, z) - λ g(x, y, z) has zero gradient: the gradient of f and the gradient of g, pointing along the same line, cancel when the gradient of g is multiplied by the Lagrange multiplier λ.

This is very clever, as is pretty much everything in math named after someone. To test myself, I applied it to the test problem. And, after I fixed an extremely embarrassing algebra error, it worked. I got 6 critical points for f'(x, y, z), of which the global maximum was a tie between the two points (1/4, 0, ±sqrt(15)/4) in x, y, z space, with a value of 25/8, and the global minimum was (-1, 0, 0), with a value of zero.
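As a sanity check, these extrema can be confirmed numerically. The following is a minimal brute-force sketch in Python; the grid resolution n is an arbitrary choice:

```python
import math

# f from the problem statement
def f(x, y, z):
    return x**2 + x + 2*y**2 + 3*z**2

# Scan the unit sphere in spherical coordinates and track the
# extreme values of f.
n = 400
f_max, f_min = -math.inf, math.inf
for i in range(n + 1):
    theta = math.pi * i / n        # polar angle
    for j in range(2 * n):
        phi = math.pi * j / n      # azimuthal angle
        x = math.sin(theta) * math.cos(phi)
        y = math.sin(theta) * math.sin(phi)
        z = math.cos(theta)
        v = f(x, y, z)
        f_max = max(f_max, v)
        f_min = min(f_min, v)
# f_max approaches 25/8 and f_min approaches 0, matching the
# Lagrange multiplier result
```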

But Lagrange multipliers were overkill, I thought. This problem is easily solved without them. All I need to do is apply the constraint to substitute for either x, y, or z, then find extrema in the 2-dimensional plane.

So that's easy: I picked z^{2} = 1 - x^{2} - y^{2}, which changed my function f to the following:

f(x, y) = x^{2} + x + 2 y^{2} + 3 ( 1 - x^{2} - y^{2} ),

which is trivially simplified:

f(x, y) = -2 x^{2} + x - y^{2} + 3,

which yields the derivatives:

f_{x} = -4 x + 1,

f_{y} = -2 y .

Some care needs to be applied here: I need to make sure the resulting points actually lie on the unit sphere, so x^{2} + y^{2} must be no greater than 1.

No problem so far. But here's where things went bad. There's only one solution to f_{x} = f_{y} = 0, and that's x = 1/4, y = 0, z = ±sqrt(15)/4. Note I found these same solutions with the Lagrange multiplier method, but here I'm missing the point (-1, 0, 0). What's wrong?
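A quick numeric illustration of what went wrong (the reduced function f2 below is the substituted f from above, keeping its constant term):

```python
import math

# Reduced objective after substituting z^2 = 1 - x^2 - y^2:
# f2 = x^2 + x + 2y^2 + 3(1 - x^2 - y^2), valid only on the
# disk x^2 + y^2 <= 1.
def f2(x, y):
    return -2*x**2 + x - y**2 + 3

# The unique interior stationary point: f_x = -4x + 1 = 0, f_y = -2y = 0
x0, y0 = 0.25, 0.0
peak = f2(x0, y0)          # 3.125 = 25/8, the constrained maximum

# But scanning the boundary circle x^2 + y^2 = 1 (where z = 0) reveals
# values the interior analysis never flags.
t_min = min(f2(math.cos(math.pi * k / 1000), math.sin(math.pi * k / 1000))
            for k in range(2000))
# t_min is 0, at (x, y) = (-1, 0): the global minimum the
# substitution missed
```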

The problem is a boundary, a singularity. Although the gradient with respect to x is non-zero at (-1, 0, 0), and for that matter at (1, 0, 0), moving along the surface of the unit sphere results in no change (to first order) in x: dx = 0. So these points can still be constrained extrema even though the 2-dimensional function, with z^{2} replaced using the constraint, is not stationary there.

What I should have done was recognize I had a boundary at x^{2} + y^{2} = 1 and test the points along this boundary for local extrema in f(x, y). Along this boundary z = 0, so f(x, y) = x^{2} + x + 2 y^{2}.

For these points I can simply substitute x = ±sqrt[ 1 - y^{2} ], keeping track of both signs of the square root, yielding:

f(y) = 1 + y^{2} ± sqrt[ 1 - y^{2} ].

Since f is now a function of y alone, it has a single derivative:

f_{y} = 2 y ∓ y / sqrt[ 1 - y^{2} ].

This has a singularity at |y| = 1, but it is clearly zero at y = 0 on both branches, which reveals the local extrema at the points:

( ±1, 0, 0 ).

(Setting the derivative to zero on the positive branch also recovers y = ±sqrt(3)/2, giving back the critical points (1/2, ±sqrt(3)/2, 0) from the Lagrange multiplier solution.)

These points were made obvious by the Lagrange multiplier method. I would have missed them otherwise.
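The boundary analysis can also be checked numerically by parametrizing the circle directly as x = cos t, y = sin t, which sidesteps the square-root branch bookkeeping entirely:

```python
import math

# f restricted to the boundary circle: x = cos t, y = sin t, z = 0
def f_boundary(t):
    x, y = math.cos(t), math.sin(t)
    return x**2 + x + 2*y**2

vals = [f_boundary(2 * math.pi * k / 4000) for k in range(4000)]
# minimum 0 at t = pi, i.e. (-1, 0, 0); maximum 9/4 at t = +/- pi/3,
# i.e. (1/2, +/-sqrt(3)/2, 0); local minimum 2 at t = 0, i.e. (1, 0, 0)
```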

Anyway, I don't get to do as much calculus as I'd like at the day job. It's fun to return to this basic stuff to keep from losing it completely. It's tempting to solve all extremum problems numerically, which tends to work well for a single variable, but get up to 3 and it's not so easy. Constraint problems are quite common, so this is a useful tool to have at hand. However, the classic problem with textbook examples like this one is that they are hand-picked to be solvable. Trying to apply the method to arbitrary analytic functions is considerably more challenging.
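As an example of the numerical route, here is a sketch of projected gradient ascent applied to the sphere problem above: step along the gradient of f, then renormalize back onto the unit sphere. The step size and iteration count are arbitrary choices.

```python
import math

def f(x, y, z):
    return x**2 + x + 2*y**2 + 3*z**2

def grad_f(x, y, z):
    # gradient of f
    return (2*x + 1, 4*y, 6*z)

# Start at a generic point on the sphere (not at a saddle).
p = (1/math.sqrt(3), 1/math.sqrt(3), 1/math.sqrt(3))
eta = 0.02                 # step size (arbitrary)
for _ in range(5000):
    g = grad_f(*p)
    q = tuple(pi + eta * gi for pi, gi in zip(p, g))
    norm = math.sqrt(sum(c * c for c in q))
    p = tuple(c / norm for c in q)   # project back onto the sphere
# p converges to (1/4, 0, sqrt(15)/4), the constrained maximum,
# where f = 25/8
```

Gradient ascent only finds a local maximum, of course; from a generic starting point it settles on one of the two global maxima here.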

For example, to test myself on a Caltrain commute I tried to find the extrema of the following simple equation:

f = (sin(x) + cos(2y)) exp(-x^{2} - y^{2}),

subject to the constraint:

g = (x - 1)^{2} + y^{2} = 0

Unfortunately this got fairly ugly fairly quickly:

f' = (sin(x) + cos(2y)) exp(-x^{2} - y^{2}) - λ [ (x - 1)^{2} + y^{2} ]

f'_{x} = [ cos(x) - 2x (sin(x) + cos(2y)) ] exp(-x^{2} - y^{2}) - 2 λ (x - 1) = 0

f'_{y} = [ -2 sin(2y) - 2y (sin(x) + cos(2y)) ] exp(-x^{2} - y^{2}) - 2 λ y = 0

f'_{λ} = -[ (x - 1)^{2} + y^{2} ] = 0

I don't know what to do with this.

What does this all have to do with cycling? Constrained optimization problems come up in cycling as in many other areas. For example, suppose I have to ride a course consisting of ten one-kilometer segments and I want to ride the course as quickly as possible at a given normalized power. I might assume power is constant over each one-kilometer segment, as are the road grade and wind speed. I'm not sure if Lagrange multipliers would help here, but it's a problem I've solved before numerically.
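For what it's worth, here is a toy sketch of that pacing problem. Everything in it is an assumption made up for illustration: the mass, drag, and rolling-resistance numbers, the grades, and especially the "normalized power", which is simplified to a fourth-power mean rather than the real 30-second rolling-average definition. It holds the simplified normalized power fixed by rescaling, then runs a crude coordinate descent on per-segment power:

```python
import math

# All physical parameters below are invented for illustration.
G = 9.81
MASS = 80.0        # rider + bike, kg (assumed)
AERO = 0.18        # lumped aero drag: p_aero = AERO * v^3 (assumed)
CRR = 0.004        # rolling resistance coefficient (assumed)
SEG_LEN = 1000.0   # meters per segment
GRADES = [0.00, 0.02, 0.04, 0.00, -0.02, -0.04, 0.01, 0.00, 0.03, -0.04]
NP_TARGET = 250.0  # watts

def speed(power, grade):
    # Invert power = AERO*v^3 + MASS*G*(CRR + grade)*v by bisection.
    b = MASS * G * (CRR + grade)
    lo, hi = 0.1, 40.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if AERO * mid**3 + b * mid < power:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def np4(powers):
    # simplified "normalized power": fourth-power mean
    return (sum(p**4 for p in powers) / len(powers)) ** 0.25

def project(weights):
    # rescale so the simplified normalized power hits the target exactly
    s = NP_TARGET / np4(weights)
    return [w * s for w in weights]

def total_time(powers):
    return sum(SEG_LEN / speed(p, g) for p, g in zip(powers, GRADES))

# Coordinate descent on per-segment power weights, starting from
# constant power; only accept moves that reduce total time.
w = [1.0] * len(GRADES)
best = total_time(project(w))
for _ in range(100):
    improved = False
    for i in range(len(w)):
        for d in (0.05, -0.05):
            trial = list(w)
            trial[i] = max(0.2, trial[i] + d)
            t = total_time(project(trial))
            if t < best - 1e-9:
                w, best, improved = trial, t, True
    if not improved:
        break

uniform = total_time([NP_TARGET] * len(GRADES))  # constant-power baseline
# best <= uniform by construction; the accepted moves typically raise
# power on the climbs and lower it on the descents
```

The accepted moves shift power toward the climbs, which matches the usual intuition for pacing at a fixed normalized power.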
