combine (Aug07)
combine input output
dcombine input output
fcombine input output
scombine input output
zcombine input output
cgroup input output
Because the variants share the same parameters, though with different default values, and because these tasks are intended to be customized for different packages, the default values shown in this section are only examples.
The following four parameters define keyword expressions. General information about expressions is found under the help topic expressions (V2.13 and later). The sections below on grouping and expressions and the examples provide additional help.
none      - No rejection
minmax    - Reject the nlow and nhigh pixels
ccdclip   - Reject pixels using CCD noise parameters
crreject  - Reject only positive pixels using CCD noise parameters
sigclip   - Reject pixels using a sigma clipping algorithm
avsigclip - Reject pixels using an averaged sigma clipping algorithm
pclip     - Reject pixels using sigma based on percentiles
grid [n1] [s1] [n2] [s2] ...
where ni is the number of images in dimension i and si is the step in dimension i. For example "grid 5 100 5 100" specifies a 5x5 grid with origins offset by 100 pixels.
variance in DN = (rdnoise/gain)^2 + DN/gain + (snoise*DN)^2

variance in e- = rdnoise^2 + gain*DN + (snoise*gain*DN)^2
               = rdnoise^2 + Ne + (snoise*Ne)^2
where DN is the data number and Ne is the number of electrons. Sensitivity noise typically comes from noise introduced during flat fielding.
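As a hedged illustration of these formulas, assume nominal values of rdnoise = 5 e-, gain = 2 e-/DN, snoise = 0.01, and a pixel with DN = 1000 (so Ne = 2000 e-):

variance in DN = (5/2)^2 + 1000/2 + (0.01*1000)^2 = 6.25 + 500 + 100 = 606.25
variance in e- = 5^2 + 2*1000 + (0.01*2000)^2     = 25 + 2000 + 400  = 2425
               = gain^2 * 606.25

The two forms differ only by the factor gain^2, as expected.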
combine  -- Combine science/object exposures
dcombine -- Combine dark exposures
fcombine -- Combine dome flat and twilight sky exposures
scombine -- Combine dithered exposures for sky flats
zcombine -- Combine zero (bias) exposures
cgroup   -- Create grouping lists
These tasks are related to the core system imcombine task but add features for instrumental data reduction. They also support multiextension format (MEF) files, a capability not currently included in imcombine. The task cgroup provides only the parameters related to the selection and grouping and produces lists of image files rather than combined images.
The images to be combined or grouped are specified by an input list (input) of files from which a subset satisfying an optional boolean keyword expression (select) may be selected. The resulting set of files may then be grouped (group, seqval, extension) and each group combined or the filenames written to a grouping list file. When grouping, the output file names (output, headers, bpmasks, rejmasks, nrejmasks, expmasks, sigmas) may be used as root names with group identifiers appended and the files may be multiextension files containing multiple images, masks, or text blocks.
These tasks accept input which are multiextension format (MEF) files. Only the image extensions are used. In this case either all the image extensions are combined into a single output image or an output MEF file is created with each extension group as an extension.
A group of images are combined by registering them in pixel space (offsets) and making an output image which is either the maximum overlap or a trimmed section (outlimits). The overlapping pixels may be scaled (scale, zero) and rejected (masktype, lthreshold, hthreshold, reject). The remaining overlapping pixels are then summed, medianed, or weighted averaged (combine, weights). If all pixels are rejected or the output image size includes regions with no overlapping images then a blank value (blank) is output.
For combined images the output header is a copy of the first image in the combined set. In addition, the number of images combined is recorded under the keyword NCOMBINE. The input image names which are combined are also recorded under the indexed keywords IMCMBnnn provided there are fewer than 100 images. If a bad pixel mask is created, the name of the mask will be included in the output image header under the keyword BPM. The output pixel type is set by the parameter outtype. If left blank then the input datatype of highest precision is used. If there is a mixture of short and unsigned short images then the highest precision is integer.
In addition to the output combined images various optional outputs include header listings of the combined images (headers), bad pixel masks (bpmasks) identifying regions with no input data, masks specifying the number of pixels rejected and which pixels were rejected (nrejmasks, rejmasks), the effective exposure time of each output pixel (expmasks), and the sigma of the combined pixels (sigmas). Information about the combining is also written to a log file (logfile) which may be the terminal output or a text file.
INPUT SELECTION AND GROUPING WITH KEYWORD EXPRESSIONS
One may control the images to be combined explicitly with the input list. In this case the selection and grouping parameters are set to null strings, though when combining MEF files the extension parameter may still be needed to combine matching image extensions into an output MEF file.
The other way to control the images to be combined is through the four keyword expression selection and grouping parameters select, group, seqval and extension. Before discussing these we describe the general keyword expression syntax. The use of these expressions provides a more powerful alternative to translation files for dealing with data from different observatories and instruments.
The variables in the expressions in these tasks are unquoted keyword names whose values are to be used. Expressions usually refer to a single keyword but more complex combinations of keyword values may be used. For a generic discussion of expressions see the expressions help page. Expressions may evaluate to strings, numbers, or booleans (true or false). It is important to know what type of value an expression parameter must produce. Except for the seqval expression, keyword values are automatically converted to strings so there is no need to cast keywords with numeric values to strings.
The combining and grouping tasks provide three special functions.
These functions are useful for grouping and setting the suffix strings to be applied to the output rootname. Note that "mkid" should be used if there is a possibility of whitespace or non-alphanumeric characters. The functions can be embedded within each other.
The select, group, and seqval expressions only use keywords from the first image in an MEF file while the extension expression applies to all extensions in an MEF and is not used for single images.
The select expression must evaluate to a boolean. Those input files which evaluate to false are eliminated from the input list. The typical use of this parameter is to select exposures of a particular type such as bias, dark, flat, or object. The NOAO mosaic cameras identify the type by an OBSTYPE keyword. The selection is often based on a string valued keyword and the string matching syntax of expressions is used.
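For instance, a minimal sketch of selecting only the bias exposures (assuming a hypothetical OBSTYPE keyword whose value is "zero" for such frames, and a hypothetical list file all.list) is

cl> zcombine @all.list Zero select="obstype='zero'"

where files with any other OBSTYPE value are dropped from the input list before combining.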
The group expression must evaluate to a string. Any whitespace or trailing period in the result is removed. A group consists of all files with the identical group value. Because the group value is appended to the output root name one should try to produce short but unique values. The mkid function is useful for this. The common use for the group parameter is to group dark exposure sequences by exposure time and flat field and science exposures by filter. Spectroscopic images would be grouped by keywords related to the dispersion configuration.
The seqval expression must evaluate to a number. After grouping by the group and extension expressions the images in the group are sorted by the sequence value. The sorted list is then scanned looking for increments in the sequence value greater than the value given by the seqgap parameter. When such a gap is found a new sequence group is formed. The sequence groups are numbered and the number is appended to the output root name. The purpose of this parameter is to group a large set of input images into sequences. Typically the sequences are calibration exposures (biases, darks, or dome flats) and the interval between exposures within a sequence is usually small. The sequence value is often a time. If the time is cyclic special care must be taken and a modified Julian date may be desirable. Alternatively, some data may have an explicit sequence number that is unique within a sequence. In this case the sequence number could be used with the group parameter or with the seqval parameter using a seqgap value of zero.
The extension expression, used only with MEF files, must evaluate to a string. An extension group consists of all files with the identical value. The use for the extension parameter is to match extensions in MEF files. This is often done for data taken with a mosaic camera consisting of multiple detectors or readout channels. The matching may be done with keywords unique to each channel which, in many cases, could be the extension name.
OPTIONAL OUTPUTS
There are various additional outputs that may be produced by providing the filenames.
This mask is the only way to record whether a particular input image pixel contributed to the output image. As an example, consider the case of three input two dimensional images of sizes 1020x1020, 1010x1010, and 1000x1000 with relative offsets of (0,0), (10,10), and (20,20). The output image would then be 1020x1020.
Suppose that the only input pixels not used are pixels (1,1) in each input image. Because of the offsets the first 10 rows and columns of the output will be based on just a single pixel except for (1,1) which has no input pixels. The next 10 rows and columns of the output will be a combination of 2 input pixels except (11,11) which is just based on pixel (11,11) in the first input image. Finally all other pixels except (21,21) will be a combination of 3 values.
The rejection mask will be three dimensional of size 1020x1020x3. Plane 1 will correspond to the first input image, plane 2 to the second, and so on. All of the pixels will be zero except for the following pixels which will have a value of 1. In the first plane only pixel (1,1,1) will be one. In the second plane the first 10 rows and columns and pixel (11,11,2) will be one. And in the third plane, the first 20 rows and columns and pixel (21,21,3) will be one. So if we want to know about output pixel (11,11) the rejection mask will tell us that pixels from the second and third images were excluded.
This is a complex example because of the offsets and dissimilar sizes. In the more common and simpler case of equal sizes and registered images, the mask planes would have one to indicate that the pixel in the input image at that coordinate was not used. For instance if pixel (12,15,2) is one in the rejection mask then pixel (12,15) in the second input image was excluded.
Note that one can use image sections to extract a mask matching the input image. For the example case with the offsets, masks in the input coordinates can be extracted with the commands
cl> imcopy rejmask[*,*,1] mask1
cl> imcopy rejmask[11:1020,11:1020,2] mask2
cl> imcopy rejmask[21:1020,21:1020,3] mask3
For the case of equal sized and registered images one could also use imslice.
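As a sketch, assuming equal sized and registered input images so that no image sections are required, the three dimensional rejection mask could be split into one two dimensional mask per input image with

cl> imslice rejmask mask 3

where the last argument selects the dimension along which to slice; the exact output naming is determined by imslice.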
OFFSETS
The images to be combined need not be of the same size or overlap. They do have to have the same dimensionality which will also be the dimensionality of the output image. Images of any dimensionality supported by IRAF may be used.
The overlap of the images is determined by a set of integer pixel offsets with an offset for each dimension of each input image. For example offsets of 0, 10, and 20 in the first dimension of three images will result in combining the three images with only the first image in the first 10 columns, the first two images in the next 10 columns and all three images starting in the 21st column. At the 21st output column the 21st column of the first image will be combined with the 11th column of the second image and the 1st column of the third image.
The output image size is set by the maximum extent in each dimension of any input image after applying the offsets. In the above example if all the images have 100 columns then the output image will have 120 columns corresponding to the 20 column offset in the third image. Note that this same output image size is computed and used as the basis for the outlimits parameter to specify a subregion to actually be output.
The input image offsets are set using the offset parameter. There are four ways to specify the offsets. If the word "none" or the empty string "" is used then all offsets will be zero and all pixels with the same coordinates will be combined. The output image size will be equal to the largest dimensions of the input images.
If "wcs" offsets are specified then the world coordinate systems (wcs) in the image headers are used to derived the offsets. The world coordinate at the center of the first input image is evaluated. Then integer pixel offsets are determined for each image to bring the same world coordinate to the same point. Note the following caveats. The world coordinate systems must be of the same type, orientation, and scale and only the nearest integer shift is used.
If the input images have offsets in a regular grid or one wants to make an output image in which the input images are "mosaiced" together in a grid then the special offset string beginning with the word "grid" is used. The format is
grid [n1] [s1] [n2] [s2] ...
where ni is the number of images in dimension i and si is the step in dimension i. For example "grid 5 100 5 100" specifies a 5x5 grid with origins offset by 100 pixels. Note that one must ensure that the input images are specified in the correct order. This may best be accomplished using a "@" list. One useful application of the grid is to make a nonoverlapping mosaic of a number of images for display purposes. Suppose there are 16 images which are 100x100. The offset string "grid 4 101 4 101" will produce a mosaic with a one pixel border, having the value set by the blank parameter, between the images.
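A hedged sketch of the display mosaic just described, assuming a hypothetical list file mosaic.list containing the 16 images in the desired order, is

cl> combine @mosaic.list mosaic combine=average reject=none \
>>> offset="grid 4 101 4 101" blank=0.

where the blank parameter sets the value of the one pixel border between the images.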
The offsets may be defined in a file by specifying the file name in the offset parameter. (Note that the special file name STDIN may be used to type in the values terminated by the end-of-file character). The file consists of a line for each input image. The lines must be in the same order as the input images and so an "@" list may be useful. The lines consist of whitespace separated offsets one for each dimension of the images. In the first example cited above the offset file might contain:
0 0
10 0
20 0
where we assume the second dimension has zero offsets.
The offsets need not have zero for one of the images. The offsets may include negative values or refer to some arbitrary common point. When the offsets are read by the program it will find the minimum value in each dimension and subtract it from all the other offsets in that dimension. The above example could also be specified as:
225 15
235 15
245 15
There may be cases where one doesn't want the minimum offsets reset to zero. If all the offsets are positive and the comment "# Absolute" appears in the offset file then the images will be combined with blank values between the first output pixel and the first overlapping input pixel. Continuing with the above example, the file
# Absolute
10 10
20 10
30 10
will have the first pixel of the first image in the 11th pixel of the output image. Note that there is no way to "pad" the other side of the output image.
OUTPUT OF SUBREGIONS
The output image size is computed from all of the input images including the offsets as described previously. The outlimits may be used to specify a subregion of this full size to be created. The syntax of this parameter is pairs of whitespace separated numbers selecting the first and last pixel in each output dimension. The pairs for each dimension are also whitespace separated and are given in order of the dimensions. Any missing values at the end of the string default to the full limits of the output image. If the requested limits fall outside the full output image they are reset to the size of the full computed output size.
As an example, consider combining 10 images of size 1000x1000 with offsets of 0, 1, ..., 9 along the first dimension. Because of the offsets the full output size is 1010x1000. To output only the region [1:100,101:200] of this full size the parameter value would be the string "1 100 101 200". Note that if the value was just "1 100" then the output region would be [1:100,1:1000].
The intended purpose for this option is to allow creating subregions using a smaller number of images in the case of offset data taken at a raster of positions. This is important since when the number of images becomes too large (>4000) the program either has to prestack the images into a higher dimensional single image (requiring equal sized images) or utilize an inefficient algorithm where images are opened and closed for each input line. A detail of how this task works is that the significant quantity is the number of images that must be accessed for each output line.
The following example was developed when the maximum number of images open at one time was ~240. In V2.12 the number was increased to more than 4000 so it is not as relevant though it may apply to very large surveys with many small images.
As an example, consider a survey of a region of the sky composed of 8000 images which are each 500x1000. The offsets between each image are 50 pixels along the first dimension and 900 pixels along the second dimension, give or take a few pixels due to telescope pointing errors. Thus this survey consists of strips of exposures. Within a strip the images overlap by about 450 pixels. Between strips the overlap is 100 pixels. So the strips consist of 400 exposures and there are 20 strips.
The full size of this survey is then about 20450x18900. At any point in a single strip the number of images contributing is no more than 10. Including the overlap of the strips the maximum number is then 20. In order to combine the data for such a survey one would like to create subregion outputs which use about 120 images from each strip. The lines where the two strips overlap then require 240 images. To produce roughly equal size regions we choose sizes along the first dimension of 5200 pixels. The number of lines in the output subregions might be the full size of the survey. However, it might be desirable to also break the second dimension into blocks for ease of display and manipulation.
The method for combining this example survey is then to combine the data in four groups along the first dimension to produce subimages each 5200 pixels wide which have no overlap. The reason for wanting to create nonoverlapping subregions is to simplify creation of the related masks, most importantly, the exposure masks. The outlimits parameter would have the values "1 5200", "5201 10400", "10401 15600", and "15601 20800". The second pair of limits is not specified to obtain the full size along the second dimension. Note that the last block will actually be smaller than 5200 pixels since the survey is less than 20800 pixels.
In each combining step all the images must be specified for the input in order to compute the full output size but then only those images needed to produce an output line will be accessed at the same time. By design this is roughly 240 images for lines where the strips overlap. The nonoverlapping blocks can be mosaiced together with this task as a final step if desired.
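A hedged sketch of one such combining step, assuming a hypothetical list file survey.list containing all the images and a file offsets.dat with their offsets, is

cl> combine @survey.list block1 offset="offsets.dat" outlimits="1 5200"

and the remaining blocks would be made by repeating the command with the other outlimits values and output names.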
SCALES AND WEIGHTS
In order to combine images with rejection of pixels based on deviations from some average or median they must be scaled to a common level. There are two types of scaling available, a multiplicative intensity scale and an additive zero point shift. The intensity scaling is defined by the scale parameter and the zero point shift by the zero parameter. These parameters may take the values "none" for no scaling, "mode", "median", or "mean" to scale by statistics of the image pixels, "exposure" (for intensity scaling only) to scale by the exposure time keyword in the image header, any other image header keyword specified by the keyword name prefixed by the character '!', and the name of a file containing the scale factors for the input image prefixed by the character '@'.
Examples of the possible parameter values are shown below where "myval" is the name of an image header keyword and "scales.dat" is a text file containing a list of scale factors.
scale = none         No scaling
zero = mean          Intensity offset by the mean
scale = exposure     Scale by the exposure time
zero = !myval        Intensity offset by an image keyword
scale = @scales.dat  Scales specified in a file
The image statistics are computed by sampling a uniform grid of points with the smallest grid step that yields less than 100000 pixels; sampling is used to reduce the time needed to compute the statistics. If one wants to restrict the sampling to a region of the image the statsec parameter is used. This parameter has the following syntax:
[input|output|overlap] [image section]
The initial modifier defaults to "input" if absent. The modifiers are useful if the input images have offsets. In that case "input" specifies that the image section refers to each input image, "output" specifies that the image section refers to the output image coordinates, and "overlap" specifies the mutually overlapping region of the input images. In the latter case an image section is ignored.
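Some illustrative values (a sketch only; the sections shown are arbitrary) are

statsec = "input [100:400,100:400]"   Section of each input image
statsec = "output [1:512,1:512]"      Section in output image coordinates
statsec = "overlap"                   Mutually overlapping region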
The statistics are as indicated by their names. In particular, the mode is a true mode using a bin size which is a fraction of the range of the pixels and is not based on a relationship between the mode, median, and mean. Also masked pixels are excluded from the computations as well as during the rejection and combining operations.
The "exposure" option in the intensity scaling uses the value of the image header keyword specified by the expname keyword. As implied by the parameter name, this is typically the image exposure time since intensity levels are linear with the exposure time in CCD detectors. Note that the exposure keyword is also updated in the final image as the weighted average of the input values. Thus, if one wants to use a nonexposure time keyword and keep the exposure time updating feature the image header keyword syntax is available; i.e. !<keyword>.
Scaling values may be defined as a list of values in a text file. The file name is specified by the standard @file syntax. The list consists of one value per line. The order of the list is assumed to be the same as the order of the input images. It is a fatal error if the list is incomplete and a warning if the list appears longer than the number of input images. Because the scale and zero levels are adjusted only the relative values are important.
If both an intensity scaling and zero point shift are selected the zero point is added first and then the scaling is done. This is important if the scale and offset values are specified by header keywords or from a file of values. However, in the log output the zero values are given as the scale times the offset, hence those numbers would be interpreted as scaling first and zero offset second.
The image statistics and scale factors are recorded in the log file unless they are all equal, which is equivalent to no scaling. The scale factors are normalized so that the first input image has no scaling. This is done because the header of the first input image is used as the template header for the combined output image. By scaling to this first image this means that flux related keywords, such as exposure time and airmass, are representative of the output (except when the "sum" option is used).
Scaling affects not only the mean values between images but also the relative pixel uncertainties. For example scaling an image by a factor of 0.5 will reduce the effective noise sigma of the image at each pixel by the square root of 0.5. Changes in the zero point also change the noise sigma if the image noise characteristics are Poissonian. In the various rejection algorithms based on identifying a noise sigma and clipping large deviations relative to the scaled median or mean, one may need to account for the scaling induced changes in the image noise characteristics.
In those algorithms it is possible to eliminate the "sigma correction" while still using scaling. The reasons this might be desirable are 1) if the scalings are similar the corrections in computing the mean or median are important but the sigma corrections may not be important and 2) the image statistics may not be Poissonian, either inherently or because the images have been processed in some way that changes the statistics. In the first case, because computing square roots and making corrections to every pixel during the iterative rejection operation may significantly limit the computational speed, the parameter sigscale selects how dissimilar the scalings must be to require the sigma corrections. This parameter is a fractional deviation which, since the scale factors are normalized to unity, is the actual minimum deviation in the scale factors. For the zero point shifts the shifts are normalized by the mean shift before adjusting the shifts to a zero mean. To always use sigma scaling corrections the parameter is set to zero and to eliminate the correction in all cases it is set to a very large number.
If the final combining operation is "average" then the images may be weighted during the averaging. The weights are specified in the same way as the scale factors. In addition the NCOMBINE keyword, if present, will be used in the weights. The weights, scaled to a unit sum, are printed in the log output.
The weights are used for the final weighted average, sigma image, and exposure mask output. They are not used to form averages in the various rejection algorithms. For weights in the case of no scaling or only multiplicative scaling the weights are used as given or determined so that images with lower signal levels will have lower weights. However, for cases in which zero level scaling is used and the zero levels are determined from image statistics (not from an input file or keyword) the weights are computed from the initial weights (the exposure time, image statistics, or input values) using the formula:
weight_final = weight_initial / (scale * sky)
where the sky values are those from the image statistics before conversion to zero level shifts and adjustment to zero mean over all images. The reasoning is that if the zero level is high the sky brightness is high and so the S/N is lower and the weight should be lower. If any sky value determined from the image statistics comes out to be negative a warning is given and none of the weights are adjusted for sky levels.
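As a hedged numeric illustration, suppose two images with equal initial weights have scale factors of 1.0 and 0.5 and sky levels of 100 and 400 from the image statistics. Then

weight_1 = 1 / (1.0 * 100) = 0.0100
weight_2 = 1 / (0.5 * 400) = 0.0050

so after normalization to a unit sum the first image receives twice the weight of the second, reflecting its lower sky brightness.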
The weights are not adjusted when the zero offsets are input from a file or keyword since these values do not imply the actual image sky value. In this case if one wants to account for different sky statistics in the weights the user must specify the weights in a file taking explicit account of changes in the weights due to different sky statistics.
When forming the final weighted averages if the sum of the weights of the non-rejected or excluded pixels is zero then instead of producing a zero average the unweighted average of the pixels is produced. Similarly, in the sigma calculation when the weights of the pixels are all zero then the sigma is computed as if all pixels have unit weights.
When there are zero weights only the pixels with non-zero weights are used in computing the output exposure time mask. Note that the actual weight values are not used but simply the sum of all exposure times of pixels from images with non-zero weights is produced.
The purpose of using zero weights is to identify images that are of poor quality (such as non-photometric or bad seeing) which are then excluded in the final weighted average or exposure time. However, they contribute to the final image when there is no good quality data but with an output exposure time of zero.
INPUT PIXEL MASKS
A pixel mask is a type of IRAF file having the extension ".pl" or a FITS extension of "type=mask" which identifies an integer value with each pixel of the images to which it is applied. In future masks may also be stored as special FITS extensions. The integer values may denote regions, a weight, a good or bad flag, or some other type of integer or integer bit flag. In the common case where many values are the same, this type of file is compact. It is most compact and efficient if the majority of the pixels have a zero mask value so frequently zero is the value for good pixels. Note that these files, while not stored as a strict pixel array, may be treated as images in programs. This means they may be created by programs such as mkpattern, edited by imedit, examined by imexamine, operated upon by imarith, graphed by implot, and displayed by display.
To use pixel masks with combine one must associate a pixel mask file with an image by entering the pixel list file name in the image header under the keyword BPM (bad pixel mask) or some other keyword to be specified. This can be done with hedit. Note that the same pixel mask may be associated with more than one image as might be the case if the mask represents defects in the detector used to obtain the images.
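For example, a minimal sketch of associating a mask with a set of images, where badpix.pl and the image names are hypothetical, is

cl> hedit obj*.fits BPM badpix.pl add+ ver-

after which the mask will be found through the BPM keyword.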
If a pixel mask is associated with an image the mask is used when the masktype parameter is set to a value other than "none" or "". Note that when it is set to "none", mask information is not used even if it exists for the image. The values of masktype which apply masks are "goodvalue", "badvalue", "goodbits", "badbits", and "!<keyword>". The last choice allows specifying the keyword whose value is the mask to be used otherwise the keyword "BPM" is used.
The masktype choices are used in conjunction with the maskvalue parameter. When the mask type is "goodvalue" or an explicit keyword is specified, the pixels with mask values matching the specified value are included in combining and all others are rejected. For a mask type of "badvalue" the pixels with mask values matching the specified value are rejected and all others are accepted. The bit types are useful for selecting a combination of attributes in a mask consisting of bit flags. The mask value is still an integer but is interpreted by bitwise comparison with the values in the mask file.
If a mask operation is specified and an image has no mask image associated with it (the BPM or specified keyword is absent), the mask values are taken as all zeros. In those cases be careful that zero is an accepted value otherwise the entire image will be rejected.
When the number of input images exceeds the maximum number of open files allowed by IRAF and the input images need to be "stacked" then the masks are also stacked. The stacking requires all the images to have the same size.
THRESHOLD REJECTION
In addition to rejecting masked pixels, pixels in the unscaled input images which are below or above the thresholds given by the parameters lthreshold and hthreshold are rejected. Values of INDEF mean that no threshold value is applied. Threshold rejection may be used to exclude very bad pixel values or as an alternative way of masking images. In the latter case one can use a task like imedit or imreplace to set parts of the images to be excluded to some very low or high magic value.
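A minimal sketch with arbitrary threshold values, excluding unphysically low pixels and pixels near saturation, is

cl> combine obj* comb lthreshold=-100. hthreshold=60000.

Pixels outside this range are excluded from the combining in the same way as masked pixels.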
REJECTION ALGORITHMS
The reject parameter selects a type of rejection operation to be applied to pixels not masked or thresholded. If no rejection operation is desired the value "none" is specified.
MINMAX
A specified fraction of the highest and lowest pixels are rejected. The fraction is specified as the number of high and low pixels, the nhigh and nlow parameters, when data from all the input images are used. If pixels have been rejected by offsetting, masking, or thresholding then a matching fraction of the remaining pixels, truncated to an integer, are used. Thus,
nl = n * nlow/nimages + 0.001
nh = n * nhigh/nimages + 0.001
where n is the number of pixels surviving offsetting, masking, and thresholding, nimages is the number of input images, nlow and nhigh are task parameters and nl and nh are the final number of low and high pixels rejected by the algorithm. The factor of 0.001 is to adjust for rounding of the ratio.
As an example with 10 input images and specifying one low and two high pixels to be rejected the fractions to be rejected are nlow=0.1 and nhigh=0.2 and the number rejected as a function of n is:
n   0  1  2  3  4  5  6  7  8  9 10
nl  0  0  0  0  0  0  0  0  0  0  1
nh  0  0  0  0  0  1  1  1  1  1  2
CCDCLIP
If the images are obtained using a CCD with known read out noise, gain, and sensitivity noise parameters and they have been processed to preserve the relation between data values and photons or electrons then the noise characteristics of the images are well defined. In this model the sigma in data values at a pixel with true value <I>, as approximated by the median or average with the lowest and highest value excluded, is given by:
sigma = ((rn / g) ** 2 + <I> / g + (s * <I>) ** 2) ** 1/2
where rn is the read out noise in electrons, g is the gain in electrons per data value, s is a sensitivity noise given as a fraction, and ** is the exponentiation operator. Often the sensitivity noise, due to uncertainties in the pixel sensitivities (for example from the flat field), is not known in which case a value of zero can be used. See the task stsdas.wfpc.noisemodel for a way to determine these values (though that task expresses the read out noise in data numbers and the sensitivity noise parameter as a percentage).
The read out noise is specified by the rdnoise parameter. The value may be a numeric value to be applied to all the input images or an image header keyword containing the value for each image. Similarly, the parameter gain specifies the gain as either a value or image header keyword and the parameter snoise specifies the sensitivity noise parameter as either a value or image header keyword.
The algorithm operates on each output pixel independently. It starts by taking the median or unweighted average (excluding the minimum and maximum) of the unrejected pixels provided there are at least two input pixels. The expected sigma is computed from the CCD noise parameters and pixels more than lsigma times this sigma below or hsigma times this sigma above the median or average are rejected. The process is then iterated until no further pixels are rejected. If the average is used as the estimator of the true value then after the first round of rejections the highest and lowest values are no longer excluded. Note that it is possible to reject all pixels if the average is used and is sufficiently skewed by bad pixels such as cosmic rays.
If there are different CCD noise parameters for the input images (as might occur using the image header keyword specification) then the sigmas are computed for each pixel from each image using the same estimated true value.
If the images are scaled and shifted and the sigscale threshold is exceeded then a sigma is computed for each pixel based on the image scale parameters; i.e. the median or average is scaled to that of the original image before computing the sigma and residuals.
After rejection the number of retained pixels is checked against the nkeep parameter. If there are fewer pixels retained than specified by this parameter the pixels with the smallest residuals in absolute value are added back. If there is more than one pixel with the same absolute residual (for example the two pixels about an average or median of two will have the same residuals) they are all added back even if this means more than nkeep pixels are retained. Note that the nkeep parameter only applies to the pixels used by the clipping rejection algorithm and does not apply to threshold or bad pixel mask rejection.
This is the best clipping algorithm to use if the CCD noise parameters are adequately known. The parameters affecting this algorithm are reject to select this algorithm, mclip to select the median or average for the center of the clipping, nkeep to limit the number of pixels rejected, the CCD noise parameters rdnoise, gain and snoise, lsigma and hsigma to select the clipping thresholds, and sigscale to set the threshold for making corrections to the sigma calculation for different image scale factors.
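As a sketch, a typical invocation (assuming the noise parameters are stored under hypothetical RDNOISE and GAIN keywords in each image header) might be

cl> combine obj* comb combine=average reject=ccdclip \
>>> rdnoise="rdnoise" gain="gain" lsigma=3. hsigma=3.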
CRREJECT
This algorithm is identical to "ccdclip" except that only pixels above the average are rejected based on the hsigma parameter. This is appropriate for rejecting cosmic ray events and works even with two images.
SIGCLIP
The sigma clipping algorithm computes at each output pixel the median or average excluding the high and low values. The sigma is then computed about this estimate (without excluding the low and high values). There must be at least three input pixels, though for this method to work well there should be at least 10 pixels. Values deviating by more than the specified sigma threshold factors are rejected. These steps are repeated, except that after the first time the average includes all values, until no further pixels are rejected or there are fewer than three pixels.
After rejection the number of retained pixels is checked against the nkeep parameter. If there are fewer pixels retained than specified by this parameter the pixels with the smallest residuals in absolute value are added back. If there is more than one pixel with the same absolute residual (for example the two pixels about an average or median of two will have the same residuals) they are all added back even if this means more than nkeep pixels are retained. Note that the nkeep parameter only applies to the pixels used by the clipping rejection algorithm and does not apply to threshold or bad pixel mask rejection.
The parameters affecting this algorithm are reject to select this algorithm, mclip to select the median or average for the center of the clipping, nkeep to limit the number of pixels rejected, lsigma and hsigma to select the clipping thresholds, and sigscale to set the threshold for making corrections to the sigma calculation for different image scale factors.
AVSIGCLIP
The averaged sigma clipping algorithm assumes that the sigma about the median or mean (average excluding the low and high values) is proportional to the square root of the median or mean at each point. This is described by the equation:
sigma(column,line) = sqrt (gain(line) * signal(column,line))
where the estimated signal is the mean or median (hopefully excluding any bad pixels) and the gain is the estimated proportionality constant having units of photons/data number.
This noise model is valid for images whose values are proportional to the number of photons recorded. In effect this algorithm estimates a detector gain for each line with no read out noise component when information about the detector noise parameters are not known or available. The gain proportionality factor is computed independently for each output line by averaging the square of the residuals (at points having three or more input values) scaled by the median or mean. In theory the proportionality should be the same for all rows but because of the estimating process will vary somewhat.
Once the proportionality factor is determined, deviant pixels exceeding the specified thresholds are rejected at each point by estimating the sigma from the median or mean. If any values are rejected the median or mean (this time not excluding the extreme values) is recomputed and further values rejected. This is repeated until there are no further pixels rejected or the number of remaining input values falls below three. Note that the proportionality factor is not recomputed after rejections.
If the images are scaled differently and the sigma scaling correction threshold is exceeded then a correction is made in the sigma calculations for these differences, again under the assumption that the noise in an image scales as the square root of the mean intensity.
After rejection the number of retained pixels is checked against the nkeep parameter. If there are fewer pixels retained than specified by this parameter the pixels with the smallest residuals in absolute value are added back. If there is more than one pixel with the same absolute residual (for example the two pixels about an average or median of two will have the same residuals) they are all added back even if this means more than nkeep pixels are retained. Note that the nkeep parameter only applies to the pixels used by the clipping rejection algorithm and does not apply to threshold or bad pixel mask rejection.
This algorithm works well for even a few input images. It works better if the median is used though this is slower than using the average. Note that if the images have a known read out noise and gain (the proportionality factor above) then the "ccdclip" algorithm is superior. The two algorithms are related in that the average sigma proportionality factor is an estimate of the gain.
The parameters affecting this algorithm are reject to select this algorithm, mclip to select the median or average for the center of the clipping, nkeep to limit the number of pixels rejected, lsigma and hsigma to select the clipping thresholds, and sigscale to set the threshold for making corrections to the sigma calculation for different image scale factors.
PCLIP
The percentile clipping algorithm is similar to sigma clipping using the median as the center of the distribution except that, instead of computing the sigma of the pixels from the CCD noise parameters or from the data values, the width of the distribution is characterized by the difference between the median value and a specified "percentile" pixel value. This width is then multiplied by the scale factors lsigma and hsigma to define the clipping thresholds above and below the median. The clipping is not iterated.
The pixel values at each output point are ordered in magnitude and the median is determined. In the case of an even number of pixels the average of the two middle values is used as the median value, and the lower or upper of the two is taken as the median pixel when counting from the median to the percentile pixel. The parameter pclip selects the percentile pixel as the number (if the absolute value is greater than unity) or fraction of the pixels from the median in the ordered set. The direction of the percentile pixel from the median is set by the sign of the pclip parameter with a negative value signifying pixels with values less than the median. Fractional values are internally converted to the appropriate number of pixels for the number of input images. A minimum of one pixel and a maximum corresponding to the extreme pixels from the median are enforced. The value used is reported in the log output. Note that the same percentile pixel is used even if pixels have been rejected by offsetting, masking, or thresholding; for example, if the 3rd pixel below the median is specified then the 3rd pixel will be used whether there are 10 pixels or 5 pixels remaining after the preliminary steps.
After rejection the number of retained pixels is checked against the nkeep parameter. If there are fewer pixels retained than specified by this parameter the pixels with the smallest residuals in absolute value are added back. If there is more than one pixel with the same absolute residual (for example the two pixels about an average or median of two will have the same residuals) they are all added back even if this means more than nkeep pixels are retained. Note that the nkeep parameter only applies to the pixels used by the clipping rejection algorithm and does not apply to threshold or bad pixel mask rejection.
Some examples help clarify the definition of the percentile pixel. In the examples assume 10 pixels. The median is then the average of the 5th and 6th pixels. A pclip value of 2 selects the 2nd pixel above the median (6th) pixel which is the 8th pixel. A pclip value of -0.5 selects the point halfway between the median and the lowest pixel. In this case there are 4 pixels below the median, half of that is 2 pixels which makes the percentile pixel the 3rd pixel.
The percentile clipping algorithm is most useful for clipping small excursions, such as the wings of bright objects when combining disregistered observations for a sky flat field, that are missed when using the pixel values to compute a sigma. It is not as powerful, however, as using the CCD noise parameters (provided they are accurately known) to clip about the median.
The parameters affecting this algorithm are reject to select this algorithm, pclip to select the percentile pixel, nkeep to limit the number of pixels rejected, and lsigma and hsigma to select the clipping thresholds.
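A hedged sketch of making a sky flat from disregistered exposures with percentile clipping (the file names are hypothetical) is

cl> scombine sky*.fits skyflat combine=median reject=pclip \
>>> pclip=-0.5 scale=mode

where pclip=-0.5 characterizes the width of the distribution by the pixel halfway between the median and the lowest value.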
GROW REJECTION
Neighbors of pixels rejected by the rejection algorithms may also be rejected. The number of neighbors to be rejected is specified by the grow parameter which is a radius in pixels. If too many pixels are rejected in one of the grown pixels positions (as defined by the nkeep parameter) then the value of that pixel without growing will be used.
COMBINING
After all the steps of offsetting the input images, masking pixels, threshold rejection, scaling, and applying a rejection algorithm the remaining pixels are combined and output. The pixels may be combined by computing the median or by computing a weighted average.
SIGMA OUTPUT
In addition to the combined image an optional sigma image may be produced. The sigma computed is the standard deviation, corrected for a finite population by a factor of n/(n-1), of the unrejected input pixel values about the output combined pixel values.
Input, output and intermediate stack images must be less than 2Gb in size. Various workarounds must then be used. For large mosaics the outlimits parameter may be used to produce smaller tiles of the output region.
In the first group of examples we demonstrate the selection and grouping parameters. The cgroup task is used for the illustrations though all the combining tasks select and group in the same way. In these examples the image files have the keywords shown in the table.
FILE    OBSTYPE  OBSTYPE1  SEQNUM  MJD-OBS     EXPTIME  FILTER
---------------------------------------------------------------
test01  dark     dark      120     58323.1235  10       DARK
test02  dark     dark      120     58323.1236  10       DARK
test03  dark     dark      120     58323.1237  10       DARK
test04  flat     dome      121     58323.1278  10       V #123
test05  flat     dome      121     58323.1279  10       V #123
test06  flat     dome      121     58323.1280  10       V #123
test07  flat     dome      122     58323.1321  10       R #124
test08  flat     dome      122     58323.1322  10       R #124
test09  flat     dome      122     58323.1323  10       R #124
test10  flat     dome      123     58323.1364  15       V #123
test11  flat     dome      123     58323.1365  15       V #123
test12  flat     dome      123     58323.1366  15       V #123
test13  flat     dome      124     58323.1407  15       R #124
test14  flat     dome      124     58323.1408  15       R #124
test15  flat     dome      124     58323.1409  15       R #124
test16  flat     sky       125     58323.1450  10       V #123
test17  flat     sky       125     58323.1451  10       V #123
test18  flat     sky       125     58323.1452  10       V #123
test19  flat     sky       126     58323.1493  10       R #124
test20  flat     sky       126     58323.1494  10       R #124
test21  flat     sky       126     58323.1495  10       R #124
test22  flat     sky       127     58323.1536  15       V #123
test23  flat     sky       127     58323.1537  15       V #123
test24  flat     sky       127     58323.1538  15       V #123
test25  flat     sky       128     58323.1579  15       R #124
test26  flat     sky       128     58323.1580  15       R #124
test27  flat     sky       128     58323.1581  15       R #124
test28  object   program   129     58323.1622  300      V #123
test29  object   program   129     58323.1623  300      V #123
test30  object   program   129     58323.1624  300      V #123
test31  object   program   130     58323.1665  300      R #124
test32  object   program   130     58323.1666  300      R #124
test33  object   program   130     58323.1667  300      R #124
1. The first example shows the case where no keyword expressions are used resulting in all the files in the input list being combined.
cl> cgroup test* example1 select="" group="" seqval=""
test01.fits -> example1.list
    ...
test33.fits -> example1.list
2. In this example the files are grouped by the first word in the FILTER keyword.
cl> cgroup test* example2 select="" \
>>> group="mkid(filter,1,1)" seqval="" select="" test01.fits -> example2DARK.list test02.fits -> example2DARK.list test03.fits -> example2DARK.list test04.fits -> example2V.list ... test30.fits -> example2V.list test07.fits -> example2R.list ... test33.fits -> example2R.list
3. In this example only files with OBSTYPE keyword having the value "object" are selected and no grouping is used.
cl> cgroup test* example3 select="obstype='object'" \
>>> group="" seqval="" test28.fits -> example3.list ... test33.fits -> example3.list
4. Files with OBSTYPE of "flat" are selected and then grouped by filter and by a combination of OBSTYPE and OBSTYPE1. This is a fairly complex grouping expression showing various features of string manipulation.
cl> cgroup test* example4 select="obstype='flat'" seqval="" \
>>> group="'_'//mkid(filter,1,1)//'_'//obstype//'_'//obstype1"
test04.fits -> example4_V_flat_dome.list
test05.fits -> example4_V_flat_dome.list
test06.fits -> example4_V_flat_dome.list
test10.fits -> example4_V_flat_dome.list
test11.fits -> example4_V_flat_dome.list
test12.fits -> example4_V_flat_dome.list
test07.fits -> example4_R_flat_dome.list
test08.fits -> example4_R_flat_dome.list
test09.fits -> example4_R_flat_dome.list
test13.fits -> example4_R_flat_dome.list
test14.fits -> example4_R_flat_dome.list
test15.fits -> example4_R_flat_dome.list
test16.fits -> example4_V_flat_sky.list
test17.fits -> example4_V_flat_sky.list
test18.fits -> example4_V_flat_sky.list
test22.fits -> example4_V_flat_sky.list
test23.fits -> example4_V_flat_sky.list
test24.fits -> example4_V_flat_sky.list
test19.fits -> example4_R_flat_sky.list
test20.fits -> example4_R_flat_sky.list
test21.fits -> example4_R_flat_sky.list
test25.fits -> example4_R_flat_sky.list
test26.fits -> example4_R_flat_sky.list
test27.fits -> example4_R_flat_sky.list
5. In this example files are selected with a particular combination of keywords and then grouped by filter and exposure time.
cl> cgroup test* example5 \
>>> select="obstype='flat'&&obstype1='dome'" \
>>> group="'_'//mkid(filter,1,1)//'_'//exptime" seqval=""
test04.fits -> example5_V_10.list
test05.fits -> example5_V_10.list
test06.fits -> example5_V_10.list
test07.fits -> example5_R_10.list
test08.fits -> example5_R_10.list
test09.fits -> example5_R_10.list
test10.fits -> example5_V_15.list
test11.fits -> example5_V_15.list
test12.fits -> example5_V_15.list
test13.fits -> example5_R_15.list
test14.fits -> example5_R_15.list
test15.fits -> example5_R_15.list
6. A simpler version of the previous example illustrates selecting dark exposures and grouping by exposure time.
cl> cgroup test* example6 select="obstype='dark'" \
>>> group="'_'//exptime" seqval=""
test03.fits -> example6_10.list
test02.fits -> example6_10.list
test01.fits -> example6_10.list
7. The next example selects dome flats and groups them by filter and a sequence number. The sequence gap is zero since any increment in the sequence number means a new sequence. Note the difference with example 5 where the exposure time was used to separate the sequences.
cl> cgroup test* example7 \
>>> select="obstype='flat'&&obstype1='dome'" \
>>> group="'_'//mkid(filter,1,1)" \
>>> seqval="seqnum" seqgap=0. test05.fits -> example7_V_1.list test06.fits -> example7_V_1.list test04.fits -> example7_V_1.list test11.fits -> example7_V_2.list test10.fits -> example7_V_2.list test12.fits -> example7_V_2.list test08.fits -> example7_R_1.list test09.fits -> example7_R_1.list test07.fits -> example7_R_1.list test14.fits -> example7_R_2.list test13.fits -> example7_R_2.list test15.fits -> example7_R_2.list
8. Since sequence numbers already contain sequence information an alternative to using the sequence expression is just to use the sequence number. In this example the sequence number is part of the grouping expression. The filter in the grouping expression is redundant but provides a more recognizable name.
cl> cgroup test* example8 \
>>> select="obstype='flat'&&obstype1='dome'" \
>>> group="'_'//mkid(filter,1,1)//'_'//seqnum"
test04.fits -> example8_V_121.list
test05.fits -> example8_V_121.list
test06.fits -> example8_V_121.list
test07.fits -> example8_R_122.list
test08.fits -> example8_R_122.list
test09.fits -> example8_R_122.list
test10.fits -> example8_V_123.list
test11.fits -> example8_V_123.list
test12.fits -> example8_V_123.list
test13.fits -> example8_R_124.list
test14.fits -> example8_R_124.list
test15.fits -> example8_R_124.list
9. In this example the sequences are defined by the time of observations. In particular, when there is a gap of more than 7 minutes in a particular group.
cl> cgroup test* example9 \
>>> select="obstype='flat'&&obstype1='dome'" \
>>> group="'_'//mkid(filter,1,1)" seqval="@'mjd-obs'" \
>>> seqgap=0.005
test05.fits -> example9_V_1.list
test06.fits -> example9_V_1.list
test04.fits -> example9_V_1.list
test11.fits -> example9_V_2.list
test10.fits -> example9_V_2.list
test12.fits -> example9_V_2.list
test08.fits -> example9_R_1.list
test09.fits -> example9_R_1.list
test07.fits -> example9_R_1.list
test14.fits -> example9_R_2.list
test13.fits -> example9_R_2.list
test15.fits -> example9_R_2.list
10. In the next two examples we consider MEF files. The files have two image extensions with extension names "im1" and "im2". In both examples the object exposures are selected and grouped by filter. First the extension parameter is set to null to combine all extensions to a single output image. In the second case the extensions are combined separately. Note the use of the substring function to use the numeric part of the extension name.
cl> cgroup meftest* example10 select="obstype='object'" \
>>> group="'_'//mkid(filter,1,1)" seqval="" extension=""
meftest28.fits[1] -> example10_V.list
meftest28.fits[2] -> example10_V.list
meftest29.fits[1] -> example10_V.list
meftest29.fits[2] -> example10_V.list
meftest30.fits[1] -> example10_V.list
meftest30.fits[2] -> example10_V.list
meftest31.fits[1] -> example10_R.list
meftest31.fits[2] -> example10_R.list
meftest32.fits[1] -> example10_R.list
meftest32.fits[2] -> example10_R.list
meftest33.fits[1] -> example10_R.list
meftest33.fits[2] -> example10_R.list

cl> cgroup meftest* example11 select="obstype='object'" \
>>> group="'_'//mkid(filter,1,1)" seqval="" \
>>> extension="substr(extname,3,3)" meftest28.fits[1] -> example11_V1.list meftest29.fits[1] -> example11_V1.list meftest30.fits[1] -> example11_V1.list meftest28.fits[2] -> example11_V2.list meftest29.fits[2] -> example11_V2.list meftest30.fits[2] -> example11_V2.list meftest31.fits[1] -> example11_R1.list meftest32.fits[1] -> example11_R1.list meftest33.fits[1] -> example11_R1.list meftest31.fits[2] -> example11_R2.list meftest32.fits[2] -> example11_R2.list meftest33.fits[2] -> example11_R2.list
The next set of examples illustrate actual combining.
11. To average and median images without any other features:
cl> combine obj* avg combine=average reject=none
cl> combine obj* med combine=median reject=none
12. To reject cosmic rays:
cl> combine obs1,obs2 Obs reject=crreject rdnoise=5.1 gain=4.3
13. To make a grid for display purposes with 21 64x64 images:
cl> combine @list grid offset="grid 5 65 5 65"
14. To apply a mask image with good pixels marked with a zero value and bad pixels marked with a value of one:
cl> hedit ims* bpm badpix.pl add+ ver-
cl> combine ims* final combine=median masktype=goodval
15. To scale image by the exposure time and then adjust for varying sky brightness and make a weighted average:
cl> combine obj* avsig combine=average reject=avsig \
>>> scale=exp zero=mode weight=exp expname=exptime