TOP:
Introduction
FORWARD:
Laser guide stars
The problem of measuring wave-front distortions is common in optics (e.g. in the fabrication and control of telescope mirrors) and is typically solved with the help of interferometers. Why not use standard laser interferometers as Adaptive Optics Wave-Front Sensors (WFSs)?
First, an AO system must use the light of stars passing through the turbulent atmosphere to measure the wave-fronts, and hence must use incoherent (and sometimes non-point) sources. Even laser guide stars are not coherent enough to work in typical interferometers. A WFS must work on white-light incoherent sources.
Second, the interference fringes are chromatic. We cannot afford to filter the stellar light, because we want to use faint stars. A WFS must use the photons very efficiently.
Third, interferometers have an intrinsic phase ambiguity of $2\pi$, whereas atmospheric phase distortions typically exceed $2\pi$. The WFS must be linear over the full range of atmospheric distortions. There are algorithms to "un-wrap" the phase and to remove this ambiguity, but they are slow, while atmospheric turbulence evolves fast, on a millisecond time scale: a WFS must be fast.
These requirements are fulfilled in several existing WFS concepts. Each WFS consists of the following main components:
Needless to say, any real WFS has a finite spatial resolution, which must
match the size of the correcting elements (e.g. the inter-actuator spacing of the DM).
Wave-front distortions of smaller size are not sensed. However, they influence
the WFS signal, causing the so-called aliasing error (like an aliasing
error in temporal signals with finite sampling, see the Figure). The turbulence
spectrum decreases at high spatial frequencies, hence the aliasing error is often of
little importance compared to other AO errors, e.g. the fitting error.
The well-known Hartmann test, devised initially for the control of telescope optics, was adapted for AO and is the most frequently used type of WFS. An image of the exit pupil is projected onto a lenslet array, a collection of small identical lenses. Each lens takes a small part of the aperture, called a sub-pupil, and forms an image of the source. All images are formed on the same detector, typically a CCD.
When the incoming wave-front is plane, all images are located on a regular
grid defined by the lenslet array geometry. As soon as the wave-front is
distorted, the images become displaced from their nominal positions.
Displacements of the image centroids in two orthogonal directions $x, y$
are proportional to the average wave-front slopes in $x, y$
over the sub-apertures. Thus, a Shack-Hartmann
(S-H) WFS measures the wave-front slopes. The wave-front itself is
reconstructed from the arrays of measured slopes, up to a constant which is of
no importance for imaging. The resolution of a S-H WFS is equal to the sub-aperture
size.
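The centroid measurement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the detector frame is split into a square grid of `n_sub` x `n_sub` cells of `pix_per_sub` pixels each, the nominal spot position is the centre of each cell, and `pixel_scale` (angle on the sky per pixel, in radians) is known from calibration. The function name and its parameters are hypothetical, not taken from any real AO system.

```python
import numpy as np

def sh_slopes(frame, n_sub, pix_per_sub, pixel_scale):
    """Return (sx, sy), each of shape (n_sub, n_sub): spot displacements
    converted to wave-front slopes (angles in radians).

    frame       -- 2-D detector image, n_sub * pix_per_sub pixels on a side
    pixel_scale -- angle on the sky per pixel, radians (assumed calibrated)
    """
    # Pixel coordinates measured from the centre of one sub-aperture cell.
    c = np.arange(pix_per_sub) - (pix_per_sub - 1) / 2.0
    sx = np.zeros((n_sub, n_sub))
    sy = np.zeros((n_sub, n_sub))
    for i in range(n_sub):          # cell row (y direction)
        for j in range(n_sub):      # cell column (x direction)
            cell = frame[i * pix_per_sub:(i + 1) * pix_per_sub,
                         j * pix_per_sub:(j + 1) * pix_per_sub]
            total = cell.sum()
            if total <= 0:
                continue            # no light: leave the slope at zero
            # Intensity-weighted mean position along x (columns) and y (rows).
            sx[i, j] = (cell.sum(axis=0) @ c) / total * pixel_scale
            sy[i, j] = (cell.sum(axis=1) @ c) / total * pixel_scale
    return sx, sy
```

In a real system the nominal positions would come from the calibration with an artificial point source rather than from the cell centres assumed here.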
Question: What is the maximum angular size
of the source at which images from adjacent sub-apertures begin to overlap? Take
a lenslet size of 0.5 mm and a focal distance of 50 mm. Will this lenslet array be
adequate for an AO system with sub-aperture size
$d$ = 1 m?
Question: Estimate the r.m.s. slopes of
wave-fronts on the sub-apertures as a function of the sub-aperture size $d$
and the Fried parameter $r_0$
(use the coefficients of atmospheric tip and tilt from Sect. 1.10).
Compute for $d$ = 1 m and 1 arcsecond seeing.
A good feature of the S-H WFS is that it is completely achromatic: the slopes
do not depend on the wavelength. It can also work on non-point (extended)
sources. If
is the wave-front phase, the x-slope measured by a S-H WFS is
computed as
![]() |
(1) |
![]() |
(2) |
We now estimate the error of slope measurement that arises from the photon noise.
Let $\beta$ radians be the radius of the image formed by each sub-aperture. For
extended sources, $\beta$
is equal to the source size (more precisely, to the dispersion of the
intensity distribution around the center). For point sources, $\beta = \lambda/d$
if the sub-apertures are smaller than $r_0$
(diffraction-limited images), or $\beta = \lambda/r_0$
for large sub-apertures (image size determined by the atmospheric
blur); here $\lambda$ is the wavelength and $d$ is the sub-aperture size.
The image intensity distribution can be regarded as a probability density
distribution of the arriving photons. Hence, each arriving photon determines
the image position with an error of $\beta$. When $N_{ph}$
photons are detected during the exposure time, the photon error of the
centroid position (i.e. slope) becomes $\beta/\sqrt{N_{ph}}$,
as after repeating the same measurement $N_{ph}$ times.
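The square-root law for the centroid error can be checked with a small Monte Carlo experiment. The sketch below assumes a Gaussian image profile of r.m.s. radius `beta` (any profile with the same dispersion would give the same scaling) and ignores detector noise and pixel sampling. The function name and its parameters are illustrative only.

```python
import numpy as np

def centroid_error(beta, n_photons, n_trials=20000, seed=0):
    """Empirical r.m.s. error of the centroid of n_photons photon positions."""
    rng = np.random.default_rng(seed)
    # Each photon lands at a random position drawn from the image profile
    # (Gaussian with dispersion beta is an assumption of this sketch).
    positions = rng.normal(0.0, beta, size=(n_trials, n_photons))
    centroids = positions.mean(axis=1)   # one centroid per simulated exposure
    return centroids.std()

# The measured error should be close to beta / sqrt(n_photons).
err = centroid_error(beta=1.0, n_photons=100)
```

With `beta=1.0` and 100 photons the returned value should be near 0.1, i.e. ten times smaller than the single-photon error.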
In the photometric band R (wavelength around 600 nm), where modern
detectors are most sensitive, a star of magnitude 0 gives a flux of 8000
photons per second per square centimeter per nanometer of bandpass (the effective
bandpass may reach 300 nm for a good CCD). For a star of magnitude $m$ the
flux is reduced by a factor of $10^{0.4m}$.
In calculating the flux available for the WFS detector, the
optical transmission must be taken into account.
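As a minimal sketch, the photon budget described above can be written as a one-line function. The zero-magnitude flux of 8000 photons/s/cm²/nm and the 300 nm bandpass are the values quoted in the text; the function name, the argument names and the numbers in the usage example are illustrative assumptions.

```python
def detected_photons(magnitude, area_cm2, exposure_s, bandpass_nm=300.0,
                     transmission=1.0, quantum_efficiency=1.0,
                     flux_mag0=8000.0):
    """Number of photons detected from a star of the given R magnitude.

    flux_mag0 is the flux of a magnitude-0 star in photons/s/cm^2/nm.
    """
    # Each magnitude reduces the flux by a factor 10**0.4.
    flux = flux_mag0 * 10.0 ** (-0.4 * magnitude)
    return (flux * area_cm2 * bandpass_nm * exposure_s
            * transmission * quantum_efficiency)

# Example: 10th-magnitude star, 50 cm x 50 cm sub-aperture (2500 cm^2),
# 1 ms exposure, transmission 0.5, quantum efficiency 0.8.
n_ph = detected_photons(10.0, 2500.0, 1e-3,
                        transmission=0.5, quantum_efficiency=0.8)
```

For these example values the result is about 240 detected photons per exposure.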
Question: Compute the number of photons detected in 1 ms exposure time per sub-aperture of 1 m from a star of 15th magnitude. Assume a total transmission of 0.3 and a quantum efficiency of 0.6.
It is generally agreed to express all wave-front errors in radians. We
multiply the slope error by
to obtain the variance of phase difference between the edges of
sub-aperture in square radians:
![]() |
(3) |
Question: How many photons per exposure are
needed to achieve a 1 radian photon error in a S-H WFS with
? Assume that imaging and sensing is done at the
same wavelength.
The error of reconstructed wave-fronts is proportional to
with a coefficient called noise propagation. It is known that
for a S-H WFS noise propagation is of the order of 1 and increases only slowly
with the number of elements (the slopes are integrated in the reconstructor, so
noise is not amplified).
The photon flux is proportional to the square of sub-aperture size
. It means that, for a given
, the photon error of a S-H WFS is independent of the size of its
sub-apertures. This conclusion applies only to an ideal detector; in real
systems with CCDs (e.g. NAOS at the VLT) larger sub-apertures are selected for fainter guide stars.
How many detector pixels must be allocated for
each sub-aperture? In order to compute the centroids accurately, the individual
images must be well sampled, with more than 4x4 pixels per sub-aperture. However,
each pixel of a CCD detector contributes readout noise, which dominates the
photon noise for the faintest guide stars. Thus, in some designs (e.g. Altair
for Gemini-North) there are only 2x2 pixels per sub-aperture. In this case
each element works as a quad cell, and the x,y slopes are deduced from
the intensity ratios:
| (4) |
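A quad-cell signal can be sketched as follows. The pixel layout (rows along y, columns along x) and the sign convention are assumptions of this example and may differ from the numbering used in equation (4). The second function gives the response to a displaced spot under the additional assumption of a Gaussian spot of r.m.s. radius `beta`; it illustrates why the response saturates for large displacements and why its slope near zero scales inversely with the spot size.

```python
import math
import numpy as np

def quad_cell_signals(cell):
    """Normalized x,y signals from a 2x2 array of intensities cell[row, col]."""
    cell = np.asarray(cell, dtype=float)
    total = cell.sum()
    if total <= 0:
        raise ValueError("no light in the quad cell")
    sx = (cell[:, 1].sum() - cell[:, 0].sum()) / total   # right minus left
    sy = (cell[1, :].sum() - cell[0, :].sum()) / total   # upper minus lower
    return sx, sy

def quad_cell_response(x0, beta):
    """Signal for a Gaussian spot (dispersion beta) displaced by x0.

    The fraction of light on each side of the cell boundary follows the
    Gaussian cumulative distribution, so the normalized difference is an
    error function of x0 / beta.
    """
    return math.erf(x0 / (beta * math.sqrt(2.0)))
```

For small displacements `quad_cell_response(x0, beta)` is close to `sqrt(2/pi) * x0 / beta`, so doubling the spot size halves the response coefficient.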
The response of a quad-cell slope detector is linear only for slopes less
than
, the response coefficient is proportional to
(hence may be variable, depending on seeing or object
size). This is the price to pay for the increased sensitivity, which is of major
importance to astronomers.
Question: What shape of the guide star image is needed to achieve the exactly linear response curve of a quad cell?
The S-H WFSs are very common because they rely on a proven technology and solid experience, and are compact and stable. These WFSs require a calibration of the nominal spot positions, which is achieved by imaging an artificial point source.
Curvature wave-front sensing has been developed by F. Roddier since 1988. His idea was to couple a curvature sensor (CS) and a bimorph DM directly, without the need for intermediate calculations (although nobody actually does this).
Let
be the light intensity distribution in the intra-focal stellar image,
defocused by some distance
, and
- the corresponding intensity distribution in the extra-focal image.
Here
is the coordinate in the image plane and
is the focal distance of the telescope. These two images are like
pupil images reduced by a factor of
. In the geometrical optics approximation, a local wave-front curvature
makes one image brighter and the other one dimmer; the normalized intensity
difference is written as
![]() |
(5) |
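The normalized intensity difference itself is simple to compute once the two defocused images are available. The sketch below assumes that the intra- and extra-focal images have already been registered to the same pupil orientation and sampled on the same grid; pixels that receive no light in either image are given zero signal. The function name and the `eps` threshold are illustrative choices.

```python
import numpy as np

def curvature_signal(intra, extra, eps=1e-12):
    """Normalized difference (I1 - I2) / (I1 + I2) of two registered images."""
    intra = np.asarray(intra, dtype=float)
    extra = np.asarray(extra, dtype=float)
    total = intra + extra
    signal = np.zeros_like(total)
    lit = total > eps                      # skip pixels outside the pupil
    signal[lit] = (intra[lit] - extra[lit]) / total[lit]
    return signal
```

The normalization by the sum makes the signal insensitive to the overall brightness of the guide star.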
Question: Draw the pairs of intra- and extra-focal images for Zernike aberrations from 2 to 6. Hint: defocused images from astigmatism to number 12.
For a source of finite angular size
the intra- and extra-focal images are blurred by the amount of
. The blur must be less than the projected size of sub-aperture
:
| (6) |
| (7) |
Larger de-focusing is needed to measure the wave-front with higher resolution, and the sensitivity of the CS is reduced accordingly. This means that a CS may have problems sensing high-order aberrations.
For point sources and large sub-apertures (a case of practical interest) the
blur
is defined by the atmospheric aberrations,
, as in the S-H WFS. If the AO system works in the closed loop and the
residual aberrations (at the sensing wavelength) become small, the blur is
reduced to
, which allows the de-focusing to be reduced and the sensitivity
to be increased. This feature is actually used to a limited extent in real
AO systems: de-focusing is reduced once the loop is closed.
The high-frequency wave-front distortions (smaller than the sub-aperture size)
have a power spectrum (variance of Fourier amplitudes) proportional to
$f^{-11/3}$, where $f$ is the spatial frequency, but their curvature spectrum is
proportional to $f^{1/3}$
and may cause a large aliasing error. To prevent this, the
signal must be smoothed before being sub-divided into sub-apertures (sampled).
Smoothing is achieved by decreasing the defocusing,
which also increases the sensitivity. In short, the choice of defocusing
in a CS is critical, and the defocusing must be adjusted to
varying seeing conditions. The signal of a CS is only a more or less crude
approximation of the true wave-front curvature...
We give without derivation the formula for a phase variance due to photon
noise in a CS when the defocusing is adjusted to its optimum value:
![]() |
(8) |
The scale of the intra- and extra-focal images depends on the defocusing,
which must be changed during operation. This is
not convenient; in fact the curvature signal is detected in a pupil image with
fixed scale, while the amount of de-focusing is adjusted by a special optical
element (see below). The outer sub-apertures project onto the pupil boundary,
and their signal provides information on the radial phase gradients, including
global tip and tilt (see the Figure).
The CSs that actually work in astronomical AO systems (e.g. in PUEO
and Hokupa'a) use Avalanche Photo-Diodes (APDs) as light detectors. These are
single-pixel devices, like photo-multipliers. The individual photons are
detected and converted to electrical pulses with no readout noise and a small dark
count; the maximum quantum efficiency is around 60%. Individual segments of the
pupil are isolated by a lenslet array (which, typically, matches the radial
geometry of the bimorph DM), then the light from each segment is focused and
transmitted to the corresponding APD via an optical fiber. The number of APDs is
equal to the number of segments. Outer segments sample the edge of the aperture,
and their signals are proportional to the wave-front gradients along the normal.
APDs are bulky and expensive, hence this design is suitable only for
low-order systems. In order to have only 1 detector per segment, the intra- and
extra-focal images are switched in time and directed to the same APD, then the
signal is de-modulated in the wave-front computer. The focus modulation is done
by placing an oscillating membrane mirror in the focal plane (typical frequency
is 2 kHz). The defocusing
is inversely proportional to the amplitude of the membrane oscillation,
which is adjusted to varying seeing conditions and can be reduced once the AO
loop is closed, increasing the sensitivity of the CS. Some useful turbulence
compensation was achieved even with signals as low as 1 photon per sub-aperture
per loop cycle!
An alternative solution would be to use CCDs as light detectors in the CS. This has been discussed for a long time, but is not yet implemented in real systems. The drawback of CCDs is their readout noise, which becomes the dominant noise source at low light levels. Special CCDs were developed at ESO that permit multiple modulation cycles per single readout.
Question: Suppose that a CCD with 5 electrons readout noise is used in the WFS. How large must the number of detected photons be to make the readout noise smaller than the photon noise?
The problems of interferometric wave-front measurement can be overcome
when the interfering beams represent wave-fronts with a small lateral shift
(this is called a shearing interferometer). If the shear is less
than
, the phase differences are less than 1 wavelength, and there is no
ambiguity. The light intensity in the
interferogram is
![]() |
(9) |
For small shifts the phase difference is proportional to the first derivative (slope), hence the signal of a shearing interferometer is similar to that of a S-H WFS. Two shears in orthogonal directions are needed to measure the x,y slopes. The first successful AO system (RTAC) used a WFS based on the shearing interferometer, but this approach is now completely abandoned in favor of the S-H WFS.
Question: Estimate the maximum shear that preserves a linear response of the shearing interferometer under given seeing conditions (given $r_0$).
Other types of interferometers were suggested for wave-front sensing. Some of them can provide signals directly proportional to the phase (thus not needing a reconstructor), although in a limited dynamic range. Such solutions can be interesting for correcting high-order residual aberrations (e.g. in AO systems with a very high degree of compensation, as needed for detecting extra-solar planets).
The pyramid WFS (P-WFS) is being developed by Italian astronomers. A transparent pyramid is placed in the focal plane and dissects the stellar image into four parts. Each beam is deflected, and these beams form four images of the telescope pupil on the same CCD detector. Thus, each sub-aperture is detected by 4 CCD pixels. This optical setup is similar to the Foucault knife-edge test.
Let us suppose that the light source is extended and use geometric
optics. A wave-front slope at some sub-aperture changes the source position on
the pyramid, and hence changes the light flux detected by the 4 pixels, which would
otherwise be equal. By computing the normalized intensity differences we get two
signals proportional to the wave-front slopes in two directions. The sensitivity
of a P-WFS depends on the source size.
A P-WFS can be viewed as an array of quad-cells and is similar to a S-H
WFS.
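In the geometric-optics picture above, the slope signals follow from the four pupil images pixel by pixel. The sketch below is only an illustration: the assignment of the four images to pyramid facets (`i_a`: -x,-y; `i_b`: +x,-y; `i_c`: -x,+y; `i_d`: +x,+y) and the sign convention are assumptions, and the four images are taken to be already cut out of the CCD frame and registered to each other.

```python
import numpy as np

def pyramid_signals(i_a, i_b, i_c, i_d):
    """Per-pixel normalized x,y signals from four registered pupil images."""
    i_a, i_b, i_c, i_d = (np.asarray(im, dtype=float)
                          for im in (i_a, i_b, i_c, i_d))
    total = i_a + i_b + i_c + i_d
    lit = total > 0                        # pixels inside the pupil
    sx = np.divide((i_b + i_d) - (i_a + i_c), total,
                   out=np.zeros_like(total), where=lit)
    sy = np.divide((i_c + i_d) - (i_a + i_b), total,
                   out=np.zeros_like(total), where=lit)
    return sx, sy
```

Binning the four images before calling the function reduces the number of sub-apertures, which is the flexibility for faint stars mentioned in this section.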
What happens when a point source (star) is used and when diffraction effects
are taken into account? The intensity distributions in the four pupil images
become complicated and non-linear functions of the wave-front shape, and the P-WFS
no longer measures slopes. In case of weak aberrations (amplitude much less
than
) the wave-front shape can still be reconstructed, although in a more
complex way. In order to restore linearity, the star is rapidly moved over
the pyramid edge (e.g. in a circular pattern), creating a ring-shaped source.
This is not modulation (as in the CS), but simply smearing of the point
source, because the signal is integrated over one or more wobble cycles.
Question: Draw the four pupil images in a P-WFS for the case of defocusing (Zernike mode number 6).
What are the advantages of a P-WFS? First, there is no lenslet array: the sub-apertures are defined by the detector pixels. It means that for faint stars the number of sub-apertures can be reduced simply by binning the CCD. Second, the amplitude of the star wobble can be adjusted as a trade-off between sensitivity (smaller wobble) and linearity (larger wobble). At small amplitudes the sensitivity of a P-WFS can be higher than that of a S-H WFS (see Astron. Astrophys. V. 369, P. L9, 2001). Finally, it is possible (at least in principle) to place several pyramids in the focal plane, in order to combine the light from several faint guide stars on a single detector. Despite the general interest in the P-WFS, there are as yet no working AO systems with this kind of WFS.
The phase can be retrieved from the analysis of two simultaneous images of a star, one in-focus and the other one defocused (or, generally, with some known aberration). This approach is called phase diversity. The algorithm is non-linear (hence slow?), the advantages of its application to AO are not yet clear.
The "ideal" WFS has not yet been invented. There is no general theorem which would state the absolute sensitivity limit of any WFS due to photon noise. Instead, we have several empirical solutions; we optimize their parameters and choose the best among the available options.
In this section the problem of computing the wave-front shape from the data provided by a WFS is addressed in a general way.
The measurements (WFS data) can be represented by a vector $\mathbf{S}$
(its length is twice the number of sub-apertures $N$
for a S-H WFS, because slopes in two directions are measured, and equal to
$N$ for a CS). The unknowns (wave-front) form a vector $\boldsymbol{\varphi}$,
which can be specified as phase values on a grid, or,
more frequently, as Zernike
coefficients. It is supposed that the relation between the measurements and
the unknowns is linear, at least in the first approximation. The most general form
of a linear relation is given by matrix multiplication,
$$\mathbf{S} = A \boldsymbol{\varphi}, \qquad (10)$$
where the matrix $A$ is called the interaction matrix.
A reconstructor matrix $B$ performs the inverse operation,
retrieving the wave-front vector $\boldsymbol{\varphi}$ from the measurements $\mathbf{S}$:
$$\boldsymbol{\varphi} = B \mathbf{S}. \qquad (11)$$
Question: For a given number of sub-apertures N, estimate the number of arithmetic operations needed to reconstruct phase. How does it depend on the imaging wavelength (for given Strehl ratio)?
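The number of measurements is typically larger than the number of unknowns, so
a least-squares solution is useful. In the least-squares approach we look for
the phase vector $\boldsymbol{\varphi}$
that best matches the data. The resulting reconstructor is
$$B = (A^T A)^{-1} A^T, \qquad (12)$$
where $A$ is the interaction matrix relating the wave-front to the measurements.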
In almost all cases the matrix inversion presents problems because the matrix
to be inverted is singular. It means that some parameters (or combinations of
parameters) are not constrained by the data. For example, we cannot determine
the first Zernike
mode (piston) from the slope measurements. In practice the matrix inversion is
done by removing the undetermined (or poorly determined) parameters with the
help of the Singular Value Decomposition algorithm. In S-H systems with
square geometry, poorly determined modes typically include "waffle"
(quasi-periodic deformation with actuator-grid frequency).
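The role of the Singular Value Decomposition can be illustrated on a toy problem. In the sketch below the "wave-front" is a 1-D array of phase values and the "WFS" measures phase differences between neighbouring and next-neighbouring points; this interaction matrix is a made-up example, not the matrix of any real sensor. There are more measurements than unknowns, yet the normal matrix is singular because the piston is unconstrained. The reconstructor is built by inverting only the singular values above a relative threshold (`rel_threshold` is an arbitrary illustrative choice).

```python
import numpy as np

def toy_interaction_matrix(n):
    """Phase differences over steps of 1 and 2 points; shape (2n - 3, n)."""
    rows = []
    for step in (1, 2):
        for i in range(n - step):
            row = np.zeros(n)
            row[i] = -1.0
            row[i + step] = 1.0
            rows.append(row)
    return np.array(rows)

def svd_reconstructor(a, rel_threshold=1e-10):
    """Pseudo-inverse of a, discarding poorly determined modes."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    keep = s > rel_threshold * s.max()
    inv_s = np.zeros_like(s)
    inv_s[keep] = 1.0 / s[keep]            # rejected modes get zero gain
    return (vt.T * inv_s) @ u.T            # V diag(1/s) U^T

a = toy_interaction_matrix(6)
b = svd_reconstructor(a)
phi = np.array([0.3, -1.2, 0.5, 2.0, -0.7, 0.1])
phi_rec = b @ (a @ phi)                    # equals phi with its mean removed
```

The reconstructed vector reproduces the input up to the piston term, which is exactly the mode removed by the threshold. Raising `rel_threshold` would also reject weakly sensed modes, trading fitting error for lower noise.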
How many Zernike
modes can be reconstructed with a S-H WFS having $N$
sub-apertures? At first sight, up to $2N$.
In fact, only $N$,
because the x,y slopes are not completely independent: they
are redundant. For a CS, the maximum number of modes is also $N$.
The least-squares reconstructor is not the best one. It is known from statistics textbooks that a better reconstruction can be achieved by using a priori information on the signal properties. In the case of AO, this information is the statistics of wave-front perturbations (e.g. a covariance of Zernike modes) and the statistics of WFS noise. Looking for a solution that gives the minimum expected residual phase variance (hence maximum Strehl ratio), we obtain a reconstructor matrix which is similar to a Wiener filter.
In case of one-dimensional signals, the Wiener filter in frequency space is
written as
![]() |
(13) |
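For orientation, the textbook form of the one-dimensional Wiener filter is the ratio of the signal power spectrum to the sum of the signal and noise power spectra. The sketch below uses that standard form; its notation may differ from equation (13), and the power-law signal spectrum in the usage lines is an arbitrary illustrative choice rather than an atmospheric model.

```python
import numpy as np

def wiener_gain(signal_psd, noise_psd):
    """Frequency-by-frequency Wiener gain P_s / (P_s + P_n).

    Both spectra must not vanish at the same frequency.
    """
    signal_psd = np.asarray(signal_psd, dtype=float)
    noise_psd = np.asarray(noise_psd, dtype=float)
    return signal_psd / (signal_psd + noise_psd)

# Illustrative use: decreasing signal spectrum, white noise.
f = np.linspace(0.1, 10.0, 100)
gain = wiener_gain(f ** -2.0, 0.25)
```

The gain is close to 1 where the signal dominates and falls towards 0 where the noise dominates; it equals 0.5 at the frequency where the two spectra cross.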
Question: The spatial power spectrum of
slope errors is white (independent of frequency f) and the power spectrum
of atmospheric tilts is proportional to
. How does the maximum frequency of the compensated aberrations depend
on the noise level
?
In AO systems the expressions for minimal variance reconstructor involve the interaction matrix and the covariance matrices of noise and atmospheric perturbations. Similar results are obtained using other statistical approaches (maximum likelihood or maximum a posteriori probability).
For any reconstructor B, the noise of the reconstructed phase
is
| (14) |
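As a hedged illustration of noise propagation: if the measurement noise is uncorrelated and has the same variance in every element of the data vector, the covariance of the reconstructed vector is that variance times B Bᵀ. The sketch below returns the mean variance per reconstructed element in units of the measurement noise variance. These assumptions and the normalization are choices of this example and may differ from those of equation (14).

```python
import numpy as np

def noise_propagation(b):
    """Mean output variance per element for unit-variance, uncorrelated input noise.

    Equal to trace(B B^T) / n_out, i.e. the sum of squared elements of B
    divided by the number of reconstructed elements (rows of B).
    """
    b = np.asarray(b, dtype=float)
    return np.sum(b ** 2) / b.shape[0]
```

Averaging two independent measurements, `noise_propagation([[0.5, 0.5]])`, gives 0.5, the familiar halving of the variance.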
Summary. The wave-front sensor is the most critical part of astronomical AO systems because guide stars are often faint, limiting the achievable degree of turbulence compensation. The two most common WFS concepts, Shack-Hartmann and curvature, were studied. For both of them we can compute the photon error and estimate the error of reconstructed wave-fronts as a function of guide star magnitude and system parameters. The basic ideas of wave-front reconstruction were introduced without going into much detail.