
6.801/6.866: Machine Vision, Lecture 9

Professor Berthold Horn, Ryan Sander, Tadayuki Yoshitake


MIT Department of Electrical Engineering and Computer Science
Fall 2020

These lecture summaries are designed to be a review of the lecture. Though I do my best to include all main topics from the
lecture, the lectures themselves contain more elaborate explanations than these notes.

1 Lecture 9: Shape from Shading, General Case - from First Order Nonlinear
PDE to five ODEs
In this lecture, we will begin by exploring some applications of magnification, shape recovery, and optics through Transmission
and Scanning Electron Microscopes (TEMs and SEMs, respectively). Then, we will discuss how we can derive shape from shading
using needle diagrams, which capture surface orientations at each (x, y) pixel in the image. This procedure will motivate the use
of Green’s Theorem, “computational molecules”, and a discrete approach to our standard unconstrained optimization problem.
We will conclude by discussing more about recovering shape from shading for Hapke surfaces using initial curves and rotated
coordinate systems.

1.1 Example Applications: Transmission and Scanning Electron Microscopes (TEMs and
SEMs, respectively)
We will begin with a few motivating questions/observations:
• How do TEMs achieve amazing magnification? They are able to do so because these machines are not restricted by the
wavelength of visible light (since they are active sensors, they image using their own “light”, in this case electrons).

• Why are SEM images more enjoyable to look at than TEM images? This is because SEM images reflect shading,
i.e. differences in brightness based on surface orientation. TEM images do not do this.

• How do SEMs work? They rely on an electron source/beam, magnetic scanning mechanisms, and photodiode sensors to
measure the secondary electron current. Specifically:
– Many electrons lose energy and create secondary electrons. Secondary electrons are what allow us to make measurements.
– Secondary electron currents vary with surface orientation.
– Objects can be scanned in a raster-like format.
– The electron current is used to modulate a light ray. Magnification is determined by the degree of deflection.
– Gold plating is typically used to ensure the object is conductive in a vacuum.
– Inclines/angles can be used for perturbing/measuring different brightness values.
– From a reflectance map perspective, measuring brightness gives us the slope (a scalar), but it does not give us the
gradient (a vector). This is akin to knowing speed, but not velocity.

1.2 Shape from Shading: Needle Diagram to Shape


This is another class of problems from the overarching “Shape from Shading” problem. Let us first begin by defining what a
needle diagram is:

A needle diagram is a 2D representation of the surface orientation of an object for every pixel in an image, i.e. for every $(x, y)$ pair, we have a surface orientation $(p, q)$, where $p \triangleq \frac{\partial z}{\partial x}$ and $q \triangleq \frac{\partial z}{\partial y}$. Recall from photometric stereo that we cannot simply
parameterize $z(x, y)$; we can only parameterize the surface gradients $p(x, y)$ and $q(x, y)$.

In this problem, our goal is that given (p, q) for each pixel (i.e. given the needle diagram), recover z for each pixel. Note
that this leads to an overdetermined problem (more constraints/equations than unknowns) [1]. This actually will allow us to
reduce noise and achieve better results.

For estimating $z$, we have:

$$x: \quad z(x) = z(0) + \int_0^x p\,dx'$$

$$y: \quad z(y) = z(0) + \int_0^y q\,dy'$$

$$x \,\&\, y: \quad z(x, y) = z(0, 0) + \int (p\,dx' + q\,dy')$$

Let us define $dz = p\,dx' + q\,dy'$. Next, we construct a contour in the $(x, y)$ plane of our $(p, q)$ measurements, where the contour
starts and ends at the origin, and passes through a measurement. Our goal is to have the integrals of $p$ and $q$ be zero over these
contours, i.e.

$$\oint (p\,dx' + q\,dy') = 0$$

This is equivalent to “$z$ being conserved” over the contour.

But note that these measurements are noisy, and since we estimate p and q to obtain estimates for z, this is not necessarily true.

Note that an easy way to break this problem down from one large problem into many smaller problems (e.g. for computational
parallelization, greater accuracy, etc.) is to decompose larger contours into smaller ones - if z is conserved for a series of
smaller loops, then this implies z is conserved for the large loop as well.
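To make the loop-conservation test concrete, here is a minimal NumPy sketch (our own construction, not from the lecture; the helper name `loop_residuals` is hypothetical) that evaluates the closed-loop integral of $(p\,dx + q\,dy)$ around every unit pixel square at once:

```python
import numpy as np

def loop_residuals(p, q):
    """Closed-loop integral of (p dx + q dy) around each unit pixel square.

    This is the discrete curl dq/dx - dp/dy; it vanishes exactly when the
    needle diagram is integrable, i.e. when z is conserved around every
    small loop (and therefore around every large loop as well).
    """
    return (q[:, 1:] - q[:, :-1]) - (p[1:, :] - p[:-1, :])

# A (p, q) field obtained by differencing an actual surface z is integrable:
rng = np.random.default_rng(0)
z = rng.standard_normal((65, 65)).cumsum(0).cumsum(1)  # an arbitrary surface
p = z[:, 1:] - z[:, :-1]                               # forward difference in x
q = z[1:, :] - z[:-1, :]                               # forward difference in y
print(np.abs(loop_residuals(p, q)).max())              # exactly 0
# Noisy measurements break conservation, motivating the least-squares fit:
print(np.abs(loop_residuals(p + rng.normal(0, 0.1, p.shape), q)).max())
```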

1.2.1 Derivation with Taylor Series Expansion


Let us now suppose we have a point $(x, y)$, centered around a square loop with side lengths $\delta x$ and $\delta y$. Then, applying the formula
above, we have that:

$$p\Big(x, y - \frac{\delta y}{2}\Big)\delta x + q\Big(x + \frac{\delta x}{2}, y\Big)\delta y - p\Big(x, y + \frac{\delta y}{2}\Big)\delta x - q\Big(x - \frac{\delta x}{2}, y\Big)\delta y = 0$$

If we now take the first-order Taylor Series Expansion of this equation above, we have:

Expansion:
$$\Big(p(x, y) - \frac{\delta y}{2}\frac{\partial p(x, y)}{\partial y}\Big)\delta x + \Big(q(x, y) + \frac{\delta x}{2}\frac{\partial q(x, y)}{\partial x}\Big)\delta y - \Big(p(x, y) + \frac{\delta y}{2}\frac{\partial p(x, y)}{\partial y}\Big)\delta x - \Big(q(x, y) - \frac{\delta x}{2}\frac{\partial q(x, y)}{\partial x}\Big)\delta y = 0$$

Rewriting:
$$p(x, y)\delta x - \frac{\delta y\,\delta x}{2}\frac{\partial p(x, y)}{\partial y} + q(x, y)\delta y + \frac{\delta x\,\delta y}{2}\frac{\partial q(x, y)}{\partial x} = p(x, y)\delta x + \frac{\delta y\,\delta x}{2}\frac{\partial p(x, y)}{\partial y} + q(x, y)\delta y - \frac{\delta x\,\delta y}{2}\frac{\partial q(x, y)}{\partial x}$$

Simplifying:
$$\delta y\,\delta x\,\frac{\partial p(x, y)}{\partial y} = \delta x\,\delta y\,\frac{\partial q(x, y)}{\partial x}$$

Solution:
$$\frac{\partial p(x, y)}{\partial y} = \frac{\partial q(x, y)}{\partial x}$$

This is consistent with theory: since our parameters $p \approx \frac{\partial z}{\partial x}$ and $q \approx \frac{\partial z}{\partial y}$, the condition approximately becomes
(under perfect measurements):

$$\frac{\partial p(x, y)}{\partial y} = \frac{\partial}{\partial y}\Big(\frac{\partial z}{\partial x}\Big) = \frac{\partial^2 z}{\partial y\,\partial x}$$

$$\frac{\partial q(x, y)}{\partial x} = \frac{\partial}{\partial x}\Big(\frac{\partial z}{\partial y}\Big) = \frac{\partial^2 z}{\partial x\,\partial y}$$

$$\frac{\partial p(x, y)}{\partial y} = \frac{\partial q(x, y)}{\partial x} \implies \frac{\partial^2 z}{\partial y\,\partial x} = \frac{\partial^2 z}{\partial x\,\partial y},$$

which is consistent with Fubini's Theorem as well [2]. This will not hold exactly in practice, however: we are measuring $p$ and $q$, and thus we will encounter some measurement noise.

1.2.2 Derivation with Green’s Theorem


We can now show the same result via Green's Theorem, which relates contour integrals to area integrals:

$$\oint_L (L\,dx + M\,dy) = \iint_D \Big(\frac{\partial M}{\partial x} - \frac{\partial L}{\partial y}\Big)\,dx\,dy$$

where the term $L\,dx + M\,dy$ on the lefthand side is evaluated along the boundary $L$ of the contour of interest, and the term $\frac{\partial M}{\partial x} - \frac{\partial L}{\partial y}$ is evaluated over
the interior $D$ of the contour.

Green’s Theorem is highly applicable in machine vision because we can reduce two-dimensional computations to one-dimensional
computations. For instance, Green’s Theorem can be helpful for:
• Computing the area of a contoured object/shape
• Computing the centroid of a blob or object in two-dimensional space, or more generally, geometric moments of a surface.
Moments can generally be computed just by going around the boundary of a contour (see the sketch after this list).
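As a concrete illustration of this 2D-to-1D reduction, here is a small Python sketch (our own example; the function name is hypothetical) that recovers the area and centroid of a polygon from boundary points alone, using Green's theorem with $L = -y/2$, $M = x/2$:

```python
import numpy as np

def polygon_area_and_centroid(xs, ys):
    """Area and centroid of a simple polygon from its boundary alone,
    via Green's theorem: A = (1/2) * closed-integral of (x dy - y dx)."""
    x0, y0 = np.asarray(xs, float), np.asarray(ys, float)
    x1, y1 = np.roll(x0, -1), np.roll(y0, -1)      # next vertex on the contour
    cross = x0 * y1 - x1 * y0                      # per-edge contribution to 2A
    area = 0.5 * np.sum(cross)
    cx = np.sum((x0 + x1) * cross) / (6.0 * area)  # first moments, also from
    cy = np.sum((y0 + y1) * cross) / (6.0 * area)  # the boundary only
    return area, (cx, cy)

# Unit square traversed counterclockwise: area 1, centroid (0.5, 0.5).
print(polygon_area_and_centroid([0, 1, 1, 0], [0, 0, 1, 1]))
```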
Let us now apply Green’s Theorem to our problem:
$$\oint_L (p\,dx + q\,dy) = \iint_D \Big(\frac{\partial q(x, y)}{\partial x} - \frac{\partial p(x, y)}{\partial y}\Big)\,dx\,dy = 0$$

This requires $\frac{\partial q(x, y)}{\partial x} - \frac{\partial p(x, y)}{\partial y} = 0 \implies \frac{\partial q(x, y)}{\partial x} = \frac{\partial p(x, y)}{\partial y} \;\; \forall\, (x, y) \in D$.

We could solve for estimates of our unknowns of interest, p and q, using unconstrained optimization, but this will be more
difficult than before. Let us try using a different tactic, which we will call “Brute Force Least Squares”:
$$\min_{z(x, y)} \iint_D \Big[\Big(\frac{\partial z}{\partial x} - p\Big)^2 + \Big(\frac{\partial z}{\partial y} - q\Big)^2\Big]\,dx\,dy$$
I.e. we are minimizing the squared distance between the partial derivatives of z with respect to x and y and our respective
parameters over the entire image domain D.

However, this minimization approach requires having a finite number of variables, but here we are optimizing over a continuous
function (which has an infinite number of variables). Therefore, we have infinite degrees of freedom. We can use calculus of
variations here to help us with this. Let us try solving this as a discrete problem first.

1.2.3 Shape with Discrete Optimization


For this, let us first take a grid of unknowns $\{z_{ij}\}_{(i,j) \in D}$. Our goal is to minimize the errors of the spatial derivatives of $z$ with
respect to $p$ and $q$, our measurements in this case (given by $\{p_{ij}, q_{ij}\}_{(i,j) \in D}$). Our objective for this can then be written as:

$$\min_{\{z_{ij}\}} \sum_i \sum_j \Big[\Big(\frac{z_{i,j+1} - z_{i,j}}{\epsilon} - p_{i,j}\Big)^2 + \Big(\frac{z_{i+1,j} - z_{i,j}}{\epsilon} - q_{i,j}\Big)^2\Big]$$

where $\epsilon$ is the grid spacing. Note that these discrete derivatives of $z$ with respect to $x$ and $y$ present in the equation above use finite forward differences.
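In code, this objective is only a few lines. The sketch below is our own (the function name is hypothetical; errors are accumulated only where the forward differences are defined):

```python
import numpy as np

def objective(z, p, q, eps=1.0):
    """Discrete least-squares objective: forward-difference derivatives of z
    versus the measured needle diagram (p, q), summed over the grid."""
    ex = (z[:, 1:] - z[:, :-1]) / eps - p[:, :-1]  # x-derivative error terms
    ey = (z[1:, :] - z[:-1, :]) / eps - q[:-1, :]  # y-derivative error terms
    return np.sum(ex**2) + np.sum(ey**2)
```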

Even though we are solving this discretely, we can still think of this as solving our other unconstrained optimization problems,
and therefore can do so by taking the first-order conditions (FOCs) with respect to each of our unknowns, i.e. $\forall\,(k, l) \in D$. The FOCs are
given by $|D|$ equations (these will actually be linear!):

$$\frac{\partial}{\partial z_{k,l}} J(\{z_{i,j}\}_{(i,j) \in D}) = 0 \quad \forall\,(k, l) \in D$$

Let us take the two groups of terms of $J$ in which a specific $z_{k,l}$ appears, and use their FOC to write a partial differential equation:

• Terms with $(i, j) = (k, l)$:

$$\frac{\partial}{\partial z_{k,l}}\big[\cdots\big] = -\frac{2}{\epsilon}\Big(\frac{z_{k,l+1} - z_{k,l}}{\epsilon} - p_{k,l}\Big) - \frac{2}{\epsilon}\Big(\frac{z_{k+1,l} - z_{k,l}}{\epsilon} - q_{k,l}\Big)$$

• Terms with $(i, j) = (k, l-1)$ and $(i, j) = (k-1, l)$, where $z_{k,l}$ appears as a neighbor:

$$\frac{\partial}{\partial z_{k,l}}\big[\cdots\big] = \frac{2}{\epsilon}\Big(\frac{z_{k,l} - z_{k,l-1}}{\epsilon} - p_{k,l-1}\Big) + \frac{2}{\epsilon}\Big(\frac{z_{k,l} - z_{k-1,l}}{\epsilon} - q_{k-1,l}\Big)$$

Setting the sum of these contributions to zero and gathering all terms (we can cancel the common factor of $\frac{2}{\epsilon}$, then divide through by $\epsilon$):


$$(1)\quad \frac{p_{k,l} - p_{k,l-1}}{\epsilon} \approx \frac{\partial p}{\partial x}$$

$$(2)\quad \frac{q_{k,l} - q_{k-1,l}}{\epsilon} \approx \frac{\partial q}{\partial y}$$

$$(3)\quad \frac{1}{\epsilon^2}\big(4z_{k,l} - (z_{k,l+1} + z_{k+1,l} + z_{k,l-1} + z_{k-1,l})\big) \approx -\Delta z = -\nabla^2 z \quad \text{(the negative Laplacian of } z\text{)}$$

where $(1) + (2) + (3) = 0$.

Using the approximations of these three terms, our equation becomes:

$$\frac{\partial p}{\partial x} + \frac{\partial q}{\partial y} - \Delta z = 0 \implies \frac{\partial p}{\partial x} + \frac{\partial q}{\partial y} = \Delta z$$
This approach motivates the use of “computational molecules”.

1.2.4 “Computational Molecules”


These are computational molecules that use finite differences [3] to estimate first and higher-order derivatives. They can be
thought of as filters, functions, and operators that can be applied to images or other multidimensional arrays capturing spatial
structure. Some of these are (please see the handwritten lecture notes for what these look like graphically):

1. $z_x = \frac{1}{\epsilon}(z(x, y) - z(x-1, y))$ (Backward Difference), $\frac{1}{\epsilon}(z(x+1, y) - z(x, y))$ (Forward Difference)

2. $z_y = \frac{1}{\epsilon}(z(x, y) - z(x, y-1))$ (Backward Difference), $\frac{1}{\epsilon}(z(x, y+1) - z(x, y))$ (Forward Difference)

3. $\Delta z = \nabla^2 z = \frac{1}{\epsilon^2}\big((z(x-1, y) + z(x+1, y) + z(x, y-1) + z(x, y+1)) - 4z(x, y)\big)$

4. $z_{xx} = \frac{\partial^2 z}{\partial x^2} = \frac{1}{\epsilon^2}(z(x-1, y) - 2z(x, y) + z(x+1, y))$

5. $z_{yy} = \frac{\partial^2 z}{\partial y^2} = \frac{1}{\epsilon^2}(z(x, y-1) - 2z(x, y) + z(x, y+1))$

These computational molecules extend to much higher-order derivatives as well. Let us visit the Laplacian operator $\Delta(\cdot)$. This operator
comes up a lot in computer vision:

• Definition: $\Delta z = \nabla^2 z = \nabla \cdot (\nabla z) = \frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2}$

• The Laplacian is the lowest-order rotationally-invariant linear operator, i.e. for a rotated coordinate system
$(x', y')$ rotated from $(x, y)$ by some rotation matrix $R \in SO(2)$, we have:

$$z_{x'x'} + z_{y'y'} = z_{xx} + z_{yy}$$

I.e. the result of the Laplacian is the same in both coordinate systems.
As we can see, the Laplacian is quite useful in our derived solution above.
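Viewed as filters, these molecules are just small convolution kernels. Here is a minimal sketch (our own, assuming SciPy is available and grid spacing $\epsilon = 1$) that applies the 5-point Laplacian molecule to a test surface:

```python
import numpy as np
from scipy.signal import convolve2d

eps = 1.0
# 5-point Laplacian molecule; it is symmetric, so the kernel flip
# performed by convolution does not matter here.
LAPLACIAN = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]]) / eps**2

y, x = np.mgrid[0:32, 0:32].astype(float)
z = x**2 + y**2                          # true Laplacian is 4 everywhere
lap = convolve2d(z, LAPLACIAN, mode="valid")
print(lap.min(), lap.max())              # 4.0 4.0 (exact for quadratics)
```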

1.2.5 Shape with Discrete Optimization Cont.
Let us return to our discrete optimization problem. Our derivation above is a least-squares solution, and it turns out to be the
discrete version of our original continuous problem. Since these first-order equations are linear, we can solve them as a system
of linear equations with Gaussian elimination. But note that this process takes $O(N^3)$ time. We can avoid this complexity by
taking advantage of the sparsity in these equations.

Iterative Approach: The sparse structure in our first-order equations allows us to use an iterative approach for shape
estimation. Our “update equation” updates the current depth/shape estimate $z_{k,l}$ using its neighboring indices in two dimensions:

$$z_{k,l}^{(n+1)} = \frac{1}{4}\Big(z_{k,l+1}^{(n)} + z_{k+1,l}^{(n)} + z_{k,l-1}^{(n)} + z_{k-1,l}^{(n)}\Big) - \frac{\epsilon}{4}\big((p_{k,l} - p_{k,l-1}) + (q_{k,l} - q_{k-1,l})\big)$$
A few terminology/phenomenological notes about this update equation:
• The superscripts $n$ and $n + 1$ denote the number of times a given indexed estimate has been updated (i.e. the number of
times this update equation has been invoked). It is essentially the iteration number.
• The subscripts $k$ and $l$ refer to the indices.
• The first term on the righthand side, $\frac{1}{4}(\cdot)$, is the local average of $z_{k,l}$ using its neighbors.
• This iterative approach converges to the solution much more quickly than Gaussian elimination.
• This iterative approach is also used in similar ways for solving problems in the Heat and Diffusion Equations (also
PDEs).
• This procedure can be parallelized so long as the computational molecules do not overlap/touch each other. For instance,
we could divide this into blocks of size 3 x 3 in order to achieve this.
• From this approach, we can develop robust surface estimates! (A minimal implementation sketch follows below.)
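Below is a minimal NumPy sketch of this update (our own implementation, not from the lecture; boundary values are pinned at zero for simplicity, whereas a real system would use known boundary/initial conditions):

```python
import numpy as np

def reconstruct_depth(p, q, eps=1.0, n_iters=10000):
    """Iterate the update equation above: new z = local 4-neighbor average
    minus (eps/4) * ((p_kl - p_k,l-1) + (q_kl - q_k-1,l)).
    Rows index y (k), columns index x (l); p, q, z share one grid."""
    z = np.zeros_like(p, dtype=float)
    for _ in range(n_iters):
        avg = 0.25 * (z[1:-1, 2:] + z[1:-1, :-2] +   # left/right neighbors
                      z[2:, 1:-1] + z[:-2, 1:-1])    # up/down neighbors
        corr = 0.25 * eps * ((p[1:-1, 1:-1] - p[1:-1, :-2]) +
                             (q[1:-1, 1:-1] - q[:-2, 1:-1]))
        z[1:-1, 1:-1] = avg - corr                   # update interior pixels
    return z
```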

1.2.6 Reconstructing a Surface From a Single Image


Recall this other shape from brightness problem we solved for Hapke surfaces (Lecture 8). For Hapke surfaces, we have that our
brightness in the image (radiance) $L$ is given by:

$$L = \sqrt{\frac{\cos\theta_i}{\cos\theta_e}} = \sqrt{\frac{\hat{n} \cdot \hat{s}}{\hat{n} \cdot \hat{v}}}$$
Recall from our last lecture that this gives us a simple reflectance map of straight lines in gradient $(p, q)$ space. By rotating this
gradient space coordinate system from $(p, q) \to (p', q')$, we can simplify our estimates for shape.

With this rotation, we also claimed that rotating the system in gradient space is equivalent to using the same rotation matrix $R$ in our image space $(x, y)$. Here we prove this:

Rotating Points: $x' = x\cos\theta - y\sin\theta$, $\quad y' = x\sin\theta + y\cos\theta$

Reverse Rotating Points: $x = x'\cos\theta + y'\sin\theta$, $\quad y = -x'\sin\theta + y'\cos\theta$

Taking Derivatives: $\dfrac{\partial z}{\partial x'} = \dfrac{\partial z}{\partial x}\dfrac{\partial x}{\partial x'} + \dfrac{\partial z}{\partial y}\dfrac{\partial y}{\partial x'}$, $\quad \dfrac{\partial z}{\partial y'} = \dfrac{\partial z}{\partial x}\dfrac{\partial x}{\partial y'} + \dfrac{\partial z}{\partial y}\dfrac{\partial y}{\partial y'}$

Combining the Above: $p' = p\cos\theta - q\sin\theta$, $\quad q' = p\sin\theta + q\cos\theta$
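As a quick numerical sanity check of this claim (our own sketch, using a made-up test surface), differentiating $z$ in the rotated image coordinates agrees with rotating $(p, q)$ in gradient space:

```python
import numpy as np

theta = 0.3
c, s = np.cos(theta), np.sin(theta)

def z(x, y):
    return np.sin(x) + 0.5 * x * y        # arbitrary smooth test surface

def z_rot(xp, yp):
    # z expressed in rotated coordinates via the reverse rotation above.
    x = xp * c + yp * s
    y = -xp * s + yp * c
    return z(x, y)

# Numerical p' = dz/dx' at a point in the rotated frame:
xp, yp, h = 0.7, -0.2, 1e-6
p_prime_numeric = (z_rot(xp + h, yp) - z_rot(xp - h, yp)) / (2 * h)

# Analytic p' = p cos(theta) - q sin(theta) at the corresponding (x, y):
x, y = xp * c + yp * s, -xp * s + yp * c
p, q = np.cos(x) + 0.5 * y, 0.5 * x       # dz/dx and dz/dy
print(p_prime_numeric, p * c - q * s)     # the two values agree closely
```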

Then, in our rotated coordinate system where $p'$ is along the brightness gradient, we have that:

$$p' = \frac{p_s p + q_s q}{\sqrt{p_s^2 + q_s^2}} = \frac{r_s E^2 - 1}{\sqrt{p_s^2 + q_s^2}}$$

(where $p' \triangleq \frac{\partial z}{\partial x'}$ is the slope of the surface of interest in a particular direction). This phenomenon only holds for Hapke surfaces
with linear isophotes. We can integrate this expression out for surface estimation:

$$z(x) = z(x_0) + \int_{x_0}^{x} p'(x')\,dx'$$

Integrating out as above allows us to build a surface height profile of our object of interest. We can do this for every horizontal profile (each value of $y$) as well:

$$z(x, y) = z(x_0, y) + \int_{x_0}^{x} p'(x', y)\,dx'$$

A few notes from this, which we touched on in lecture 8 as well:

• Adding a constant to z does not change the shape of our profiles; it only offsets them. Therefore, in order to obtain absolute
height measurements of z, we need to include initial values.

• In this case, we need an initial condition for every horizontal row/profile. This is the same as requiring an “initial curve”,
and allows us to effectively reduce our computations from 2D to 1D. Note from our previous lecture that these initial
conditions are needed to determine the surface orientation/shape at interfaces between these different profiles.

• Let us examine what this looks like when we parameterize an initial curve with $\eta$:

Take the initial curve $(x(\eta), y(\eta), z(\eta))$, and step away from it along profiles parameterized by $\xi$.

Then we can compute $\delta z$ to recover shape, along with $\delta x$ and $\delta y$:

1. $\delta x = \dfrac{p_s}{\sqrt{p_s^2 + q_s^2}}\,\delta\xi$

2. $\delta y = \dfrac{q_s}{\sqrt{p_s^2 + q_s^2}}\,\delta\xi$

3. $\delta z = \dfrac{r_s E^2(x, y) - 1}{\sqrt{p_s^2 + q_s^2}}\,\delta\xi$

(Note that $\delta z = p\,\delta x + q\,\delta y$ then reproduces the expression for $p'$ above.) Note that we can adjust the speed of motion here by adjusting the step size $\delta\xi$; a sketch of one such step appears below.
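Here is a sketch of one explicit step along such a profile (our own construction, not from the lecture; `E` is assumed to be a callable returning measured image brightness, and `ps`, `qs`, `rs` come from the known source direction):

```python
import numpy as np

def step_profile(x, y, z, E, ps, qs, rs, dxi=0.1):
    """Advance (x, y, z) by one step of size dxi along a profile, using the
    three update rules above; dz = p*dx + q*dy gives (rs*E^2 - 1)/norm."""
    norm = np.hypot(ps, qs)                 # sqrt(ps^2 + qs^2)
    x_new = x + (ps / norm) * dxi           # move along the brightness gradient
    y_new = y + (qs / norm) * dxi
    z_new = z + ((rs * E(x, y) ** 2 - 1.0) / norm) * dxi
    return x_new, y_new, z_new
```

Starting each profile from a point $(x(\eta), y(\eta), z(\eta))$ on the initial curve and repeating this step traces out the surface; shrinking $\delta\xi$ trades speed of motion for accuracy.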

Next time, we will generalize this from Hapke reflectance maps to arbitrary reflectance maps!

1.3 References
1. Overdetermined System, https://en.wikipedia.org/wiki/Overdetermined_system
2. Fubini's Theorem, https://en.wikipedia.org/wiki/Fubini%27s_theorem
3. Finite Differences, https://en.wikipedia.org/wiki/Finite_difference

MIT OpenCourseWare
https://ocw.mit.edu

6.801 / 6.866 Machine Vision


Fall 2020

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms
