$$ \newcommand{\vb}{\mathbf} \newcommand{\wt}{\widetilde} \newcommand{\mc}{\mathcal} \newcommand{\bmc}[1]{\boldsymbol{\mathcal{#1}}} \newcommand{\sup}[1]{^{\text{#1}}} \newcommand{\pard}[2]{\frac{\partial #1}{\partial #2}} \newcommand{\VMV}[3]{ \Big\langle #1 \Big| #2 \Big| #3 \Big\rangle} $$

Adjoint-based optimization in meep: Implementation Notes

These notes are intended as something of a companion to the user’s manual and tutorial documentation for adjoint-based optimization in meep; whereas the goal of those pages is to document the user interface and explain how to use the solver for practical problems, our focus here will be what’s going on beneath the hood—how the solver actually works.

Actually, as will be clear to anyone who has ever reviewed the fundamentals of adjoint sensitivity analysis, the theoretical basis of the method and the derivation of its key formulas are almost trivially straightforward, with the only potential source of difficulty being how to massage the mechanics of the procedure into a meep-friendly form.

Toy problem

Ultimately we will want to use adjoint methods to differentiate complex objective functions—involving quantities such as Poynting fluxes and mode-expansion coefficients—with respect to various different types of parameters describing material geometries. However, before tackling the problem in that full generality, it’s useful to build up to it by starting with a simple toy problem and adding complications one at a time. Thus we consider a simple waveguide geometry, excited by a point source at , and define our objective function to be simply the frequency-domain electric-field amplitude at a point ; we will compute the derivative of this objective function with respect to the permittivity in a small “design region” centered at a third point .

AdjointToyGeometry1

Permittivity derivative by finite-differencing

An obvious brute-force way to get at this is simply to do two meep calculations, with augmented by a small finite amount on the second run, then compute the difference between the frequency-domain electric fields at and divide by to estimate the derivative. Here are plots of the unperturbed field and the field perturbation (difference between perturbed and unperturbed field):

  • (unperturbed):

Unperturbed field

  • (perturbed-unperturbed): Perturbation field

(Here and below we use the tilde sign () to indicate frequency-domain fields and sources.)

Permittivity derivative from effective sources

One way to think about the effect of a localized permittivity bump goes like this: Increasing the permittivity in some localized region of a material body corresponds to increasing the polarizability in that region—that is, the ease with which positive and negative charges in the material, ordinarily bound so tightly together that they neutralize each other as sources, can be induced by applied electric fields to separate (“polarize”), whereupon they cease to cancel each other and act as effective sources contributing to electromagnetic fields. Of course, if there were no electric field in the material, then we could increase its polarizability as much as we pleased without producing any sources—zero times a bigger coefficient being still zero—but here there is a nonzero electric field throughout our geometry, due to the point source in the unperturbed problem, which means that the effect of bumping the permittivity of the design region may be approximated by adding new sources in that region, with strength proportional to and to the unperturbed electric field. More specifically, in a frequency-domain problem involving time-harmonic fields and sources with angular frequency (time dependence ), the following perturbations are equivalent:

AdjointToyGeometry1

Superposing this effective source with the original point source at yields a source configuration that, acting on the unperturbed geometry, produces the same fields as the point source alone acting on the perturbed geometry.

Alternatively, by exploiting the linearity of Maxwell’s equations (and assuming we have linear media!) we could just as easily remove the original point source and compute the fields of alone, which, upon dividing through by , give just the derivatives of field components with respect to . In other words, Analogous reasoning yields a prescription for magnetic-field derivatives:

Digression: Configuring time-domain sources for desired frequency-domain fields in meep

In frequency-domain electromagnetism we usually consider a time-harmonic source distribution of the form and we ask for the time-harmonic electric field distribution radiated by this distribution: where indicates frequency-domain amplitudes. A typical frequency-domain solver might input and output :

On the other hand, when using meep to compute the fields produced by a given spatial source distribution, we typically construct a time-domain source of the form where is a Gaussian temporal envelope. More specifically, for meep‘s GaussianSrc with center frequency , frequency width , and peak time , we have The Fourier transform of this is The meep version of the above input/output diagram looks like

The upshot is that the frequency-domain fields obtained from a meep run with a Gaussian source come out multiplied by a factor of that should be divided out to yield the desired frequency-domain quantities.

Invoking reciprocity

It is convenient to describe the process described above in the language of frequency-domain Green’s functions, which express the fields radiated by monochromatic source distributions as spatial convolutions: with the electric-electric dyadic Green’s function of the material geometry (giving the electric field produced by a unit-strength electric current). In this language, the effective-source representation of the permittivity derivative reads It is convenient to think of the RHS here as a double convolution of two vector-valued functions with the kernel: or where denotes convolution, is short for and the bra-ket notation describes a machine that inputs two vector-valued functions and a kernel and outputs a scalar quantity: (Note that this is not a Hermitian inner product, i.e. the first factor is not conjugated.)

For the magnetic-field derivative we have similarly where is the magnetic-electric Green’s function, giving the magnetic field produced by an electric current.

Computationally, inner products like for arbitrary functions may be evaluated in meep as follows:

  1. Create an electric current source with spatially-varying amplitude and Gaussian temporal envelope .

  2. Timestep and DFT to compute the frequency-domain electric field produced by this source.

  3. Compute the inner product (The normalization prefactor was discussed above.)

The virtue of writing things this way is that it allows the physical property of reciprocity to be expressed as the mathematical property that the aforementioned inner-product machine is insensitive to the order of its arguments, i.e. we can flip the and inputs and still get the same scalar output:

Applying reciprocity to the above expressions for field derivatives yields where in going to the last line we invoked the identity

Note that equations (3a) and (3b), notwithstanding their nearly identical appearance, describe two rather different meep calculations: In the former case we place an electric source at and timestep to compute the resulting magnetic field, while in the latter case we place a magnetic source and timestep to compute the resulting electric field. (In both cases, upon computing the field in question we proceed to compute its overlap with the unperturbed field in the design region.)

Differentiating more complicated functions of field components

Thus far we have only considered derivatives of individual field components, and then only at a single point ; more generally, we will want to differentiate functions of multiple field components over a subregion of the grid, which we will call the objective region .

E-field energy in region

As one example, the electric field energy in the objective region is defined by an integral over that region, which meep approximate by a weighted sum over grid points: Here the sum is over all field components and all grid points lying in , and is a cubature weight associated with point (as returned by get_dft_array_metadata). Differentiating, we have

Poynting flux

A case that arises frequently is that in which the objective region is a cross-sectional surface cutting normally through a waveguide or similar structure and the objective function is the normal Poynting flux through . For example, the -directed Poynting flux is given by where refers to three other terms of the form . Differentiating and rearranging slightly, we have

Mode coefficient