edges.filters.xrfi

Functions for excising RFI.

class edges.filters.xrfi.FilterModeler(kernel: ndarray)

A Modeler that uses a convolutional filter to model data or std.

classmethod gaussian(size: int) -> Self

Create a Gaussian kernel.

get_model(model: None, data: ndarray, weights: ndarray) -> ndarray

Convolve the data with the kernel to get a smoother model.

classmethod mean(size: int) -> Self

Create a mean kernel.
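
As a rough illustration of kernel-based modeling, the sketch below builds a mean and a Gaussian kernel and convolves them with weighted data. It does not call FilterModeler. The helper names, the Gaussian width (size / 4), the edge handling, and the way weights enter the convolution are all assumptions made for this example, and plain lists stand in for ndarrays so that it has no dependencies.

```python
import math


def mean_kernel(size: int) -> list[float]:
    """Boxcar kernel: `size` equal weights that sum to one."""
    return [1.0 / size] * size


def gaussian_kernel(size: int) -> list[float]:
    """Normalized Gaussian kernel. The width is an assumption of this sketch."""
    sigma = size / 4.0  # assumed width; the library may choose differently
    centre = (size - 1) / 2.0
    raw = [math.exp(-0.5 * ((i - centre) / sigma) ** 2) for i in range(size)]
    total = sum(raw)
    return [w / total for w in raw]


def smooth(data, weights, kernel):
    """Weighted convolution in which zero-weight samples are ignored."""
    half = len(kernel) // 2
    out = []
    for i in range(len(data)):
        num = den = 0.0
        for k, kw in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(data):  # truncate the kernel at the band edges
                num += kw * weights[j] * data[j]
                den += kw * weights[j]
        out.append(num / den if den > 0 else float("nan"))
    return out


data = [1.0, 1.0, 100.0, 1.0, 1.0]
weights = [1.0, 1.0, 0.0, 1.0, 1.0]  # the spike carries zero weight
model = smooth(data, weights, mean_kernel(3))
```

Because the spike has zero weight, the smoothed model stays at the level of its neighbours, which is what lets the residual at that channel stand out.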

class edges.filters.xrfi.IterativeXRFIInfo(n_flags_changed: list[int], total_flags: list[int], model_params: list[dict], data_models: list[ndarray], std_params: list[dict], thresholds: list[float], stds: list[ndarray[float]], x: ndarray, data: ndarray, flags: list[ndarray[bool]])

A simple object representing the information returned by model_filter().

get_model(indx: int = -1)

Get the model values.

get_residual(indx: int = -1)

Get the residuals.

get_std_model(indx: int = -1)

Get the model of the absolute residuals.

property n_iters: int

Get the number of iterations.

class edges.filters.xrfi.LinearModeler(model: Model, min_terms=NOTHING, max_terms=NOTHING, term_increase: int = 0)

A Modeler that uses a linear model to fit either data or std.

get_model(model: FixedLinearModel, data: ndarray, weights: ndarray) -> ndarray

Perform a model fit and evaluate it.

init_model(params: dict, freqs: ndarray, model: FixedLinearModel | None = None) -> FixedLinearModel

Initialize the model at the known frequencies, and at the default terms.

set_params(iteration: int) -> dict[str, Any]

Set the number of terms for the model for a given iteration.

stopping_condition(flags: ndarray, iteration: int) -> bool

Extra stopping conditions specific to this kind of modeler.

In this case, stop if the number of unflagged channels is less than twice the number of terms in the model, because a model with that few data points per term cannot be fit reliably.
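
The rule can be written as a small check. This is a minimal sketch of the condition as described, not the library's implementation; the function name and the plain-list flags are assumptions.

```python
def too_few_channels(flags: list[bool], n_terms: int) -> bool:
    """Return True when fewer than 2 * n_terms channels remain unflagged."""
    n_unflagged = sum(1 for flagged in flags if not flagged)
    return n_unflagged < 2 * n_terms


flags = [False] * 20 + [True] * 80  # 20 usable channels out of 100
stop_small = too_few_channels(flags, n_terms=5)   # 20 is not below 10
stop_large = too_few_channels(flags, n_terms=15)  # 20 is below 30
```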

class edges.filters.xrfi.MedianFilterModeler(size: int)

A Modeler that uses a median filter to model data or std.

get_model(model: None, data: ndarray, weights: ndarray) -> ndarray

Perform a median filter to get a smoothed model.

get_std(model, resids: ndarray, weights: ndarray) -> ndarray

Calculate the rolling median-absolute-deviation.
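
For readers unfamiliar with the two quantities, the sketch below computes a rolling median and a rolling median-absolute-deviation with the standard library. It is an illustration only: the function names, the truncated windows at the band edges, and the absence of weights and of any scaling from MAD to a standard deviation are assumptions of this example.

```python
from statistics import median


def rolling_median(values, size):
    """Median of a centred window of about `size` samples, truncated at edges."""
    half = size // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def rolling_mad(resids, size):
    """Median of |x - median(window)| within each centred window."""
    half = size // 2
    out = []
    for i in range(len(resids)):
        window = resids[max(0, i - half): i + half + 1]
        centre = median(window)
        out.append(median([abs(r - centre) for r in window]))
    return out


data = [1.0, 2.0, 3.0, 50.0, 5.0, 6.0, 7.0]
smoothed = rolling_median(data, size=3)
spread = rolling_mad(data, size=3)
```

Both statistics are barely moved by the single outlier at index 3, which is why median-based models suit RFI excision.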

class edges.filters.xrfi.ModelFilterInfoContainer(models: list[IterativeXRFIInfo] = <factory>)

A container of IterativeXRFIInfo objects.

This is almost a perfect drop-in replacement for a single IterativeXRFIInfo instance, but it combines several of them seamlessly. This can be useful if several sub-models were fit to one long stream of data.

append(model: IterativeXRFIInfo) -> Self

Create a new object by appending a set of info to the existing ones.

property data

The raw data that was filtered.

property flags

The returned flags on each iteration.

get_absres_model(indx: int = -1)

Get the model of the absolute residuals.

get_model(indx: int = -1)

Get the model values.

get_residual(indx: int = -1)

Get the residual values.

property n_flags_changed

The number of flags changed on each filtering iteration.

property n_iters

The number of iterations of the filtering.

property stds

The standard deviations at each datum for each iteration.

property thresholds

The threshold at each iteration.

property total_flags

The total number of flags after each iteration.

property x

The data coordinates.

class edges.filters.xrfi.Modeler

Class for modeling either data or standard deviation as a function of frequency.

This class is used for RFI excision.

abstractmethod get_model(model, data: ndarray, weights: ndarray) -> ndarray

Get the model for the given data and weights.

get_std(model, resids: ndarray, weights: ndarray) -> ndarray

Get the standard deviation for the given residuals and weights.

init_model(params: dict, freqs: ndarray, model: T | None = None) -> T

Initialize the model.

Use this method to initialize any data that doesn’t need to be updated on each iteration.

set_params(iteration: int) -> dict[str, Any]

Set the parameters for the model for a given iteration.

stopping_condition(flags: ndarray, iteration: int) -> bool

Extra stopping conditions specific to this kind of modeler.

edges.filters.xrfi.xrfi_explicit(spectrum: ndarray | None = None, *, freq: ndarray, flags: ndarray | None = None, rfi_file=None, extra_rfi=None) -> ndarray[bool]

Excise RFI from given data using an explicitly set list of flag ranges.

Parameters:
  • spectrum – This parameter is unused in this function.

  • freq – Frequencies, in MHz, of the data.

  • flags – Known flags.

  • rfi_file (str, optional) – A YAML file containing the key ‘rfi_ranges’, which should be a list of 2-tuples giving the (min, max) frequency range of known RFI channels (in MHz). By default, uses a file included in edges-analysis with known RFI channels from the MRO.

  • extra_rfi (list, optional) – A list of extra RFI channels (in the format of the rfi_ranges from the rfi_file).

Returns:

flags (array-like) – Boolean array of the same shape as spectrum indicating which channels/times have flagged RFI.
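
The idea of range-based flagging can be sketched as follows. This does not call xrfi_explicit and does not read a YAML file: the function name, the inclusive treatment of range endpoints, and the example ranges are assumptions for illustration.

```python
def flag_ranges(freq, rfi_ranges, flags=None):
    """Flag every channel whose frequency (MHz) lies inside any (min, max) range."""
    # Start from the known flags so that they are always retained.
    out = list(flags) if flags is not None else [False] * len(freq)
    for fmin, fmax in rfi_ranges:
        for i, f in enumerate(freq):
            if fmin <= f <= fmax:  # inclusive endpoints are an assumption
                out[i] = True
    return out


freq = [50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
rfi_ranges = [(55.0, 65.0), (88.0, 108.0)]  # hypothetical RFI bands
known = [True, False, False, False, False, False]
flags = flag_ranges(freq, rfi_ranges, flags=known)
```

A list passed as extra_rfi would simply be concatenated with rfi_ranges before the loop in a sketch like this one.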

edges.filters.xrfi.xrfi_iterative(data: ndarray, *, freqs: ndarray, flags: ndarray | None = None, weights: ndarray | None = None, data_modeler: Modeler = LinearModeler(model=Fourier(parameters=None, n_terms=37, _transform=IdentityTransform(), xtransform=IdentityTransform(), basis_scaler=None, data_transform=IdentityTransform(), period=6.283185307179586), min_terms=37, max_terms=37, term_increase=0), std_modeler: Modeler = LinearModeler(model=Fourier(parameters=None, n_terms=15, _transform=IdentityTransform(), xtransform=IdentityTransform(), basis_scaler=None, data_transform=IdentityTransform(), period=6.283185307179586), min_terms=15, max_terms=15, term_increase=0), threshold_setter: Callable = <function <lambda>>, max_iter: int = 20, watershed: dict[float, int] | None = None, flag_if_broken: bool = True, init_flags: ndarray | None = None)

Run a generalized iterative RFI excision algorithm.

This algorithm works by iteratively modeling the data and its standard deviation, flagging data points that are likely affected by RFI based on a z-score threshold.

Parameters:
  • data (np.ndarray) – The input data to be processed.

  • freqs (np.ndarray) – The frequency channels corresponding to the data.

  • flags (np.ndarray | None) – Initial flags for the data points. If this is defined, the output flags will at least include these flags.

  • weights (np.ndarray | None) – Weights for the data points. If not given, use uniform weights of one. If given, zero weights will be treated the same as flags.

  • data_modeler (Modeler) – The modeler to use for the data.

  • std_modeler (Modeler) – The modeler to use for the standard deviation.

  • threshold_setter (Callable) – A callable that sets the threshold for flagging on each iteration. This callable should take the current iteration number as input and return the threshold value to use for that iteration.

  • max_iter (int) – The maximum number of iterations to run.

  • watershed (dict[float, int] | None) – Parameters for watershed flagging. Each key is a threshold multiplier, and each value is the number of surrounding channels to include in the flagging. The key is multiplied by the basic z-score threshold; that is, it is not itself a number of sigma, but a multiple of the number of sigma used for basic flagging.

  • flag_if_broken (bool) – Whether to flag an entire integration if the iterative process stops due to either hitting the maximum number of iterations, or hitting some model-specific condition without convergence of the flags.

  • init_flags (np.ndarray | None) – Initial flags for the data points. If given, the first iteration will use these flags, but they will be updated in subsequent iterations. This can be used to flag out regions of known likely RFI that might adversely affect the first model.

Returns:

  • np.ndarray – The flags for the data points.

  • IterativeXRFIInfo – Information about the iterative RFI excision process.
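
The overall loop can be pictured with the simplified sketch below. It is not the library's algorithm: a constant mean and a constant standard deviation stand in for data_modeler and std_modeler, flags are recomputed from scratch on each iteration, and weights, watershed, flag_if_broken and init_flags are omitted. The function name and the example threshold schedule are assumptions.

```python
import math


def iterative_flag(data, threshold_setter, max_iter=20):
    """Flag points whose z-score exceeds a per-iteration threshold."""
    flags = [False] * len(data)
    n_flags_changed = []
    for iteration in range(max_iter):
        good = [d for d, flagged in zip(data, flags) if not flagged]
        if not good:
            break  # nothing left to model
        # Stand-ins for the data and std modelers: constants over the band.
        model = sum(good) / len(good)
        std = math.sqrt(sum((d - model) ** 2 for d in good) / len(good))
        threshold = threshold_setter(iteration)
        new_flags = [abs(d - model) > threshold * std for d in data]
        changed = sum(a != b for a, b in zip(flags, new_flags))
        n_flags_changed.append(changed)
        flags = new_flags
        if changed == 0:  # converged: flags are stable
            break
    return flags, n_flags_changed


# Deterministic pseudo-noise around 10 with two injected outliers.
data = [10.0 + 0.1 * ((7 * i) % 5 - 2) for i in range(50)]
data[20] = 30.0
data[35] = 12.0
flags, n_flags_changed = iterative_flag(
    data, threshold_setter=lambda it: max(3.0, 5.0 - it)
)
```

The weaker outlier is only caught on the second pass, after the strong one has been removed from the fit and the estimated standard deviation has dropped. This is the reason for iterating, and for letting the threshold depend on the iteration number.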

edges.filters.xrfi.xrfi_iterative_sliding_window(spectrum: ndarray, *, freqs: ndarray, model: Model, flags=None, window_frac: int = 16, min_window_size: int = 10, max_iter: int = 100, threshold: float = 2.5, watershed: dict | None = None, reflag_thresh: float = 1.01, fit_kwargs: dict | None = None, weights: ndarray | None = None)

Flag RFI using a model fit and a sliding RMS window.

This function is algorithmically the same as that used in Bowman+2018. The differences between this and xrfi_model() (which is the recommended function to use) are:

  • This does flagging inside the sliding window – i.e. once you move the window up by one channel, the flags can be different in the previous bins. This is a bit strange, since it makes the process more non-linear. If you were to start from the top of the band and slide the window down, you’d get different results.

  • The watershedding (flagging channels around the “bad” one) only happens if the main central channel is far enough away from the edges of the band.

  • It only flags positive outliers.

Parameters:
  • spectrum (array-like) – The 1D spectrum to flag.

  • freqs – The frequencies associated with the spectrum.

  • model (edges_cal.modelling.Model) – The model to fit to the spectrum to get residuals.

  • flags (array-like, optional) – The initial flags to use. If not given, all channels are unflagged.

  • window_frac (int, optional) – The size of the sliding window as a fraction of the number of channels (i.e. the final window is int(Nchannels / window_frac) in size).

  • min_window_size (int, optional) – The minimum size of the sliding window, in number of channels.

  • max_iter (int, optional) – The maximum number of iterations to perform.

  • threshold (float, optional) – The threshold for flagging a channel. The threshold is the number of standard deviations the residuals are from zero.

  • watershed (dict, optional) – The parameters for the watershedding algorithm. If not given, no watershedding is performed. Each key should be a float giving the number of threshold*stds away from zero that a channel must be for watershedding to apply. The value should be the number of channels to flag on either side of the flagged channel for that threshold. For example, {3: 2} would flag 2 channels on either side of any channel that is 3*threshold standard deviations away from zero.

  • reflag_thresh (float, optional) – The basic algorithm has “memory”, i.e. if a channel is flagged in one iteration, it will remain flagged for all following iterations, even if it is no longer an outlier for the updated model. This parameter allows you to re-consider a flag on a later iteration, if it was originally flagged at less than reflag_thresh times the threshold. This can improve conformity to the results of Bowman+2018, because the model fits are very slightly different between the codes used, but it is very difficult to predict exactly how the parameter will affect the results.

  • fit_kwargs (dict, optional) – Any additional keyword arguments to pass to the model fit. Use the key “method” with value “alan-qrd” for the closest match to the Bowman+2018 code.

Returns:

  • flags (array-like) – Boolean array of the same shape as spectrum indicating which channels/times have flagged RFI.

  • info (ModelFilterInfo) – A ModelFilterInfo object containing information about the fit at each iteration.
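
To make the watershed parameter concrete, the sketch below applies a dictionary such as {3: 2} to a list of z-scores (residuals divided by the local standard deviation). It is an illustration rather than the library code: the function name is hypothetical, and the spread is clipped at the band edges, whereas the function documented here only watersheds when the central channel is far enough from the edges.

```python
def apply_watershed(zscores, threshold, watershed):
    """Flag positive outliers, then widen flags around the strongest ones."""
    n = len(zscores)
    # Basic flagging: positive outliers only, as described above.
    flags = [z > threshold for z in zscores]
    for multiplier, n_side in watershed.items():
        for i, z in enumerate(zscores):
            if z > multiplier * threshold:
                # Flag n_side channels on either side (clipped at the edges).
                for j in range(max(0, i - n_side), min(n, i + n_side + 1)):
                    flags[j] = True
    return flags


zscores = [0.1, 0.3, 0.2, 9.0, 0.4, 0.1, 2.8, 0.2, 0.0]
flags = apply_watershed(zscores, threshold=2.5, watershed={3: 2})
```

Channel 3 exceeds 3 * 2.5 = 7.5, so channels 1 to 5 are flagged. Channel 6 exceeds only the basic threshold, so it is flagged without spreading.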

edges.filters.xrfi.xrfi_watershed(spectrum: ndarray | None = None, *, freqs: ndarray | None = None, flags: ndarray | None = None, weights: ndarray | None = None, tol: float | tuple[float] = 0.5, inplace=False)

Apply a watershed over frequencies and times for flags.

Make sure that times/freqs with many flags are all flagged.

Parameters:
  • spectrum – Not used in this routine.

  • flags (ndarray of bool) – The existing flags.

  • tol (float or tuple) – The tolerance – i.e. the fraction of entries that must be flagged before flagging the whole axis. If a tuple, the first element is for the frequency axis, and the second for the time axis.

  • inplace (bool, optional) – Whether to update the flags in-place.

Returns:

  • ndarray – Boolean array of flags.

  • dict – Information about the flagging procedure (empty for this function).
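
A minimal sketch of this kind of two-axis watershed is given below. Several details are assumptions, since they are not specified above: the (time, frequency) array layout, the strict "greater than" comparison, the reading of the first tuple element as the tolerance for flagging a whole frequency channel, and the use of the original flags when computing both fractions. Nested lists stand in for a 2D ndarray.

```python
def watershed_2d(flags, tol=0.5):
    """Flag whole channels or whole times whose flagged fraction exceeds tol."""
    tol_freq, tol_time = (tol, tol) if isinstance(tol, (int, float)) else tol
    n_time, n_freq = len(flags), len(flags[0])
    out = [row[:] for row in flags]  # copy, i.e. the inplace=False behaviour

    # Flag a whole frequency channel if too many of its times are flagged.
    for j in range(n_freq):
        frac = sum(flags[i][j] for i in range(n_time)) / n_time
        if frac > tol_freq:
            for i in range(n_time):
                out[i][j] = True

    # Flag a whole time if too many of its channels are flagged.
    for i in range(n_time):
        frac = sum(flags[i]) / n_freq
        if frac > tol_time:
            out[i] = [True] * n_freq
    return out


T, F = True, False
flags = [
    [T, F, F],
    [T, F, F],
    [T, F, F],
    [F, T, T],
]
new_flags = watershed_2d(flags, tol=0.5)
```

Channel 0 is flagged in 3 of 4 times, so it becomes fully flagged; the last time has 2 of 3 channels flagged, so that entire row is flagged as well.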