Welcome to PCAT-DE’s documentation!

This readthedocs page describes the software PCAT-DE (Probabilistic CATaloging in the presence of Diffuse Emission). PCAT-DE was first used in Butler & Feder et al. (2021) and is tested in more detail in Feder et al. (2022).

The advantage of PCAT-DE is its flexibility: in principle, any spatial-spectral template may be fit alongside a point source population in which the number of sources is unknown. This allows one to probe the transdimensional covariance between a given signal and the union of point source models with varying \(N_{src}\), otherwise referred to as a metamodel.

Existing applications of PCAT-DE include:

  • Detection and measurement of point-like sources in the presence of diffuse galactic cirrus

  • Measurement of spatially extended Sunyaev-Zel’dovich effect in the presence of cosmic infrared background (CIB) galaxies and diffuse galactic cirrus

Changelog

This is where updates on PCAT-DE will be posted, referencing the main PCAT-DE Github branch. The current version is 0.0.1 (1/1/2023); as such, the existing software should be treated as a beta release. Please do not hesitate to file an issue through Github or to contact me directly at rfederst@caltech.edu if you encounter any bugs in the code.

(6/21/23): The code has been consolidated and result plots are correctly made at the end of sampling. Note however that we have identified a bug in the PCAT-DE visual mode, which plots the data/model/residuals in real time as the fit converges. Until this is resolved we advise running with visual=False.

Existing work on probabilistic cataloging

This work builds on a long list of existing implementations and extensions of the framework of probabilistic cataloging. All of these methods build on transdimensional inference, in which the number of model components is inferred simultaneously with the properties of the components. This list is almost certainly non-exhaustive:

Implementation details

The code is structured as follows. First, the pcat_main() class is instantiated, using a combination of user-provided paths (stored in config.py) and tunable hyperparameters which are specified in params.py. Each time PCAT is run, a parameter file is saved in pickled and readable forms (params.txt, params_read.txt) within the results folder.
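The parameter bookkeeping described above can be sketched in a few lines. The filenames params.txt and params_read.txt match the description above, but the save_params helper and the example dictionary below are hypothetical, included only to illustrate saving a run's configuration in both pickled and human-readable forms:

```python
import pickle
from pathlib import Path

def save_params(params, result_dir):
    """Save a run's hyperparameters in pickled and human-readable forms,
    mirroring the params.txt / params_read.txt files described above."""
    result_dir = Path(result_dir)
    result_dir.mkdir(parents=True, exist_ok=True)
    # Pickled form, convenient for re-loading a previous configuration
    with open(result_dir / "params.txt", "wb") as f:
        pickle.dump(params, f)
    # Readable form, one "key: value" pair per line
    with open(result_dir / "params_read.txt", "w") as f:
        for key, val in sorted(params.items()):
            f.write(f"{key}: {val}\n")

params = {"nsamp": 4000, "nloop": 1000, "psf_fwhm": 3.0}
save_params(params, "results/example_run")
```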

In practice, many of the hyperparameters do not need to be modified; however, doing so (along with adding new parameters) is straightforward. The existing hyperparameters can be broken down into several groups:

Data configuration parameters

This includes the location of the files (which can be input through data_path or a combination of im_fpath and err_fpath) and details of the noise model implementation. These should be the names of FITS files (with .fits extensions). The data can be fed in either as a single observed map (e.g., image_extnames=['SIGNAL']) or as a sum of several maps (e.g., image_extnames=[{signal_noiseless}, {noise}]), where {signal_noiseless} and {noise} should be customized to match the saved FITS image cards.

Additional Gaussian noise can be added by setting add_noise to True and either specifying a constant noise level (scalar_noise_sigma) or using the uncertainty map (add_noise=True and use_uncertainty_map=True). The PSF can be specified either as a beam full width at half maximum (FWHM, psf_fwhm) in pixel units (this assumes a Gaussian beam) or as a generic PSF postage stamp. When an empirical PSF estimate is available, it can be fed into PCAT using the psf_postage_stamp keyword.

If one wants to run PCAT on a masked version of the image, the most straightforward way to do this is to set all masked pixels in the uncertainty map to zero/inf/NaN. PCAT will still predict model values for these pixels; however, they are zero-weighted in the likelihood evaluation.
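As an illustration of the Gaussian-beam option above, a PSF postage stamp of the kind that could be passed through the psf_postage_stamp keyword can be built directly from a FWHM in pixel units. The make_gaussian_psf helper below is a hypothetical sketch, not PCAT's internal routine; it only demonstrates the FWHM-to-sigma conversion and normalization:

```python
import numpy as np

def make_gaussian_psf(psf_fwhm, nbin=25):
    """Build a normalized Gaussian PSF postage stamp from a FWHM in pixel units."""
    # Convert FWHM to Gaussian sigma: FWHM = 2*sqrt(2*ln(2))*sigma
    sigma = psf_fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    half = nbin // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    psf = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    # Normalize so the stamp conserves total flux
    return psf / psf.sum()

stamp = make_gaussian_psf(psf_fwhm=3.0)
```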

PCAT sampler parameters/model hyperparameters

The number of samples is set by nsamp. By default the chains are thinned by a factor of nloop=1000, so a run with nsamp=4000 really comprises \(4 \times 10^6\) model evaluations. For computational efficiency, it is recommended to set a max_nsrc for the model; the maximum should be sufficiently far from the bulk of the posterior on \(N_{src}\). Oftentimes a diverging number of sources means something is not correct in the data parsing or the astrometric calibration. The remaining parameter groups are:

  • Hyperparameters describing the model for the constant background/mean normalization of the maps (“Background Parameters”), any fixed spatial templates (“Template Parameters”), and the Fourier component templates (“Fourier Component Parameters”)

  • Run time diagnostics/posterior plot details

  • Optional parameters for computing the condensed catalog from posterior samples (“Condensed catalog”)
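The relationship between nsamp, the thinning factor nloop, and the total number of model evaluations is simple arithmetic:

```python
# Thinned samples (nsamp) times thinning factor (nloop) gives the
# total number of model evaluations in a run.
nsamp, nloop = 4000, 1000
total_evaluations = nsamp * nloop  # 4000 * 1000 = 4,000,000
```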

These parameters can be modified directly in the configuration file, or passed as keyword arguments when the lion class is instantiated. Model proposals are called many times within the PCAT chains; they are implemented in the Proposal() class and are drawn according to the model components and proposal weights (“moveweights”).
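The weighted proposal selection can be sketched generically. The move names and weights below are illustrative placeholders, not PCAT's actual "moveweights" defaults:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical proposal types and their relative weights
moves = ["perturb", "birth-death", "split-merge"]
moveweights = np.array([80.0, 40.0, 40.0])
probs = moveweights / moveweights.sum()

# Each loop iteration draws the next proposal type with
# probability proportional to its weight
drawn = rng.choice(moves, size=1000, p=probs)
```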

Data parsing/Map pre-processing parameters

One important (and error-prone) step in running PCAT is the proper parsing of maps and other data products. Because PCAT builds a generative model for the observed data, it typically needs:

  • The observed maps

  • A model for the point spread function (PSF) of the telescope optics and the pixel function

  • A noise model image for each map

  • If running on several maps (e.g., multiband data), a consistent astrometric reference frame across images (along with consistent trimming of maps)

In PCAT-DE, two sets of diagnostics are included to ensure the data products are parsed correctly. To validate the astrometry, PCAT has a test module validate_astrometry() which projects a grid of points across each of the images. The second diagnostic shows the data as they are parsed in, and is enabled by setting show_input_maps to True. To view these plots in real time for an individual run, set an interactive matplotlib backend (e.g., matplotlib.use('TkAgg')); otherwise they will be saved as files in the results folder (specified by config.result_path).
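The idea behind validate_astrometry() can be illustrated with a minimal round-trip check: project a grid of points from one image's pixel frame into another's and back, and confirm the positions are recovered. The affine transform below is a hypothetical stand-in for the real WCS-based projection between bands:

```python
import numpy as np

# Hypothetical affine mapping from image A pixels to image B pixels
# (a stand-in for a WCS-based projection between bands)
scale, offset = 2.0, np.array([10.0, -5.0])

def a_to_b(xy):
    return scale * xy + offset

def b_to_a(xy):
    return (xy - offset) / scale

# Grid of test points spanning image A
xs, ys = np.meshgrid(np.linspace(0, 100, 5), np.linspace(0, 100, 5))
grid = np.stack([xs.ravel(), ys.ravel()], axis=1)

# Round trip A -> B -> A should recover the original grid positions
roundtrip = b_to_a(a_to_b(grid))
max_err = np.abs(roundtrip - grid).max()
```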

Examples

Example scripts can be found in the repository under example1.py. Some code implementing artificial star tests can be found in the script artificial_star_test.py. More detailed demos will be included in the future.

Posteriors and Diagnostics

Verifying the proper convergence of PCAT can be done by inspecting the posteriors and other diagnostics derived from posterior samples.

  • The chi-squared of the samples and the reduced chi-squared statistic

  • Pixel-wise residual maps

  • Number of sources. Does the posterior on \(N_{src}\) reside well within the range of \([N_{min}, N_{max}]\)?

  • Acceptance fractions for different proposals. If these are too low, it may suggest the model has not converged. If they are too high, it may suggest the proposal kernels are too narrow, such that the delta log posterior between models is close to zero.
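The chi-squared diagnostic, with masked pixels zero-weighted as described in the data configuration section, might be computed along these lines (a sketch under the conventions above, not PCAT's internal routine):

```python
import numpy as np

def reduced_chi_squared(data, model, sigma, nparams=0):
    """Chi-squared per degree of freedom, ignoring pixels whose
    uncertainty is zero, infinite, or NaN (zero-weighted in the fit)."""
    sigma = np.asarray(sigma, dtype=float)
    good = np.isfinite(sigma) & (sigma > 0)
    chisq = np.sum(((data[good] - model[good]) / sigma[good]) ** 2)
    dof = good.sum() - nparams
    return chisq / dof

# Toy example: uniform unit uncertainties with one masked pixel,
# and data offset from the model by exactly one sigma everywhere
model = np.zeros((10, 10))
sigma = np.ones((10, 10))
sigma[0, 0] = np.nan  # masked pixel, zero-weighted in the fit
data = model + 1.0
rchi2 = reduced_chi_squared(data, model, sigma)
```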