Utility classes and functions


This subpackage contains general utility classes and functions.

Calibrator class.

class dysh.util.calibrator.Calibrator(name: str, scale: str, coefs: list, method: Callable, calibrator_table: CalibratorTable | None = None, las: float | None = None, cal: bool = True, nu_min: Quantity | None = None, nu_max: Quantity | None = None)[source]#

Bases: object

Holds calibrator parameters for a specific calibrator. Using these parameters it is possible to get the flux density for the calibrator using the Calibrator.compute_sed method.

Parameters:

namestr: Calibrator name.
scalestr: Scale in which the calibrator flux density is defined.
coefslist: List of coefficients that define the flux density as a function of frequency. The exact definition of the coefficients is determined by the method used.
methodCallable: Function used to compute the flux density of the calibrator as a function of frequency given the input coefs.
calibrator_tableCalibratorTable: Table with calibrator details. This is only used when creating a Calibrator using the from_name class method. Otherwise it is kept for reference.
lasfloat or None: Largest angular size for the calibrator in degrees.
calbool: Is the calibrator suitable for flux density calibration? This is used to issue warnings.
nu_minQuantity or None: Minimum frequency over which the coefficients are valid.
nu_maxQuantity or None: Maximum frequency over which the coefficients are valid.

Methods

`compute_sed`(nu[, hpbw])	Evaluate the calibrator flux density at `nu`.
`from_name`(name[, scale, calibrator_table])	Create a `Calibrator` given a calibrator `name`.

Examples

Create a Calibrator for 3C123 using the Perley & Butler 2017 flux scale.

>>> from dysh.util import calibrator
>>> c = calibrator.Calibrator.from_name("3C123")

Compute the flux density at 1 GHz

>>> import astropy.units as u
>>> c.compute_sed(1*u.GHz).value
63.34320007697665

compute_sed(nu: Quantity, hpbw: float | None = None)[source]#

Evaluate the calibrator flux density at nu.

Parameters:

nuQuantity: Frequency values.
hpbwfloat or None: Half Power Beam Width in degrees.

Warns:

UserWarning: If the Calibrator is not suitable as a flux density calibrator, if nu is above or below the range where coefs are defined, or if hpbw is smaller than the largest angular size of the calibrator.

classmethod from_name(name: str, scale: Literal['Perley-Butler 2017', 'Ott 1994'] | None = None, calibrator_table: CalibratorTable | None = None)[source]#

Create a Calibrator given a calibrator name.

Parameters:

namestr: Calibrator name, case insensitive.
scaleNone or str or “Perley-Butler 2017” or “Ott 1994”: The name of the flux scale to use, case insensitive. Minimum string match allowed. If set to None, the default, it will use the Perley & Butler 2017 scale if available, otherwise Ott et al. 1994. Only the Perley & Butler 2017 and Ott et al. 1994 scales are shipped with dysh. If a custom calibrator_table is provided this can be the name of a scale defined in that table.

Raises:

ValueError: If scale is not one of the keys in the calibrator_table, or if name is not one of the keys in calibrator_table.data[scale].

class dysh.util.calibrator.CalibratorTable(calibrator_table_file: Path | str | None = None)[source]#

Bases: object

This class is used to hold calibrator information. It takes a json file as input. By default it will look for dysh/data/calibrators.json, which is distributed with dysh.

The structure of the input table should be:

{
    "Objects" : {
        "name" : {
            "LAS"   : number,
            "cal"   : boolean,
            "alias" : array
            }
        },
    "Scale name" : {
        "fluxscale" : string,
        "objects" : {
            "name" : {
                "coefs"  : array,
                "nu_min" : number,
                "nu_max" : number
                }
            },
        "method" : string
        }
}

Where “name” is the object name. The same name must be used under “Objects” and under the “objects” entry of each “Scale name”.

“cal” is a boolean indicating if the object is suitable as a flux calibrator. This is mainly used to issue warnings.

“alias” is a list of aliases for the object name.

“LAS” is the largest angular extent of the object in degrees.

“Scale name” is the name of the scale, for example “Ott 1994”.

“fluxscale” is a repeat of “Scale name”.

“coefs” are the coefficients that define the spectral energy distribution of the object. The order and the meaning of the coefficients depend on the function defined by “method”.

“nu_min” is the minimum frequency, in GHz, where the coefficients are valid.

“nu_max” is the maximum frequency, in GHz, where the coefficients are valid.

“method” is the method used to compute the spectral energy distribution given the list of coefficients provided for each object. The method should be defined in this file, calibrators.py.
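As a rough illustration of the layout described above, the following sketch builds a minimal table as a Python dictionary, writes it to a json file, and checks that every object listed under a scale also appears under “Objects”. The object name, alias, coefficients, frequency limits, scale name and method string are placeholders invented for this example, not values shipped with dysh. It also assumes that each scale is a top-level key next to “Objects”.

```python
import json
import tempfile
from pathlib import Path

# Placeholder table: every name and number here is made up for illustration.
table = {
    "Objects": {
        "FakeSource": {"LAS": 0.01, "cal": True, "alias": ["FS1"]},
    },
    "My Scale": {
        "fluxscale": "My Scale",
        "objects": {
            "FakeSource": {"coefs": [1.0, -0.5, 0.0], "nu_min": 1.0, "nu_max": 50.0},
        },
        "method": "my_method",  # hypothetical method name
    },
}


def check_names(tbl: dict) -> list:
    """Return object names used by a scale but missing from "Objects"."""
    known = set(tbl["Objects"])
    missing = []
    for key, scale in tbl.items():
        if key == "Objects":
            continue
        missing += [name for name in scale["objects"] if name not in known]
    return missing


# Round-trip through a json file, as a custom table would be stored on disk.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "my_calibrators.json"
    path.write_text(json.dumps(table, indent=4))
    loaded = json.loads(path.read_text())

print(check_names(loaded))  # an empty list means the names are consistent
```

For such a table to be usable, the “method” entry would have to name a function that actually exists alongside the other methods, which the placeholder here does not.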

Parameters:

calibrator_table_filePath or str or None: Path to the json table with calibrator information. The contents of the file are defined above. If None, it will use dysh/data/calibrators.json

Methods

`load`()	Load the json table with calibrator information.
`valid_names`([scale])	Recognized calibrator names and their aliases defined in the json table.
`valid_scales`()	Valid flux scales defined in the json table.

load()[source]#: Load the json table with calibrator information.

valid_names(scale: str | None = None)[source]#

Recognized calibrator names and their aliases defined in the json table.

Parameters:

scalestr or None: If set to None, it will return all the object names found in the table. If set to a known flux scale, it will return the names of the objects defined for that flux scale.

valid_scales()[source]#: Valid flux scales defined in the json table.

dysh.util.calibrator.poly_ott(nu: Quantity, coefs: list)[source]#

From Table 5 in Ott et al. 1994 [1].

Parameters:

nuQuantity: Frequency values.
coefsarray: Coefficients that define the radio spectral energy distribution. For the Ott et al. 1994 model this must have three values.

Returns:

snuQuantity: Radio flux density evaluated at nu in Jy.

Notes

The flux density, $S$, is computed from

\[\log(S)=a+b\log(\nu)+c\log^{2}(\nu)\]

with $a$ the first element of coefs and $\nu$ the frequency in MHz.
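The formula above can be sketched in a few lines of plain Python. This is not the dysh implementation: it works on bare floats rather than Quantity objects, it assumes the logarithm is base 10, and it assumes that $b$ and $c$ are the second and third elements of coefs (the text only states that $a$ is the first). The coefficients are placeholders chosen to make the arithmetic easy to follow.

```python
import math


def ott_flux_density(nu_mhz: float, coefs: list) -> float:
    """Evaluate log10(S) = a + b*log10(nu) + c*log10(nu)**2 and return S in Jy."""
    a, b, c = coefs  # the model needs exactly three coefficients
    x = math.log10(nu_mhz)  # nu must be given in MHz
    return 10.0 ** (a + b * x + c * x * x)


# Placeholder coefficients, not taken from Ott et al. 1994.
example_coefs = [2.0, -0.5, 0.0]
print(ott_flux_density(100.0, example_coefs))  # log10(S) = 2 - 0.5*2 = 1
```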

References

[1]

Ott et al. (1994)

Equation (1) in Perley & Butler 2017 [1].

Parameters:

nuQuantity: Frequency values.
coefsarray: Coefficients that define the radio spectral energy distribution.

Returns:

snuQuantity: Radio flux density evaluated at nu in Jy.

Notes

The flux density, $S$, is computed from

\[\log(S)=a_{0}+a_{1}\log(\nu)+a_{2}\log^{2}(\nu)+\cdots\]

with $a_{0}$ the last element of coefs, $a_{1}$ the second to last, and so on. $\nu$ is the frequency in GHz.
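Because the constant term is stored last, coefs is ordered from the highest power down to $a_{0}$, which is the ordering Horner's rule consumes directly. The sketch below illustrates this with plain floats; it is not the dysh implementation, it assumes a base-10 logarithm, and the coefficients are placeholders rather than published values.

```python
import math


def pb_flux_density(nu_ghz: float, coefs: list) -> float:
    """Return S in Jy from log10(S) = a0 + a1*x + a2*x**2 + ..., x = log10(nu/GHz)."""
    x = math.log10(nu_ghz)
    log_s = 0.0
    # coefs is ordered highest power first, so Horner's rule applies directly.
    for coef in coefs:
        log_s = log_s * x + coef
    return 10.0 ** log_s


# Placeholder coefficients [a2, a1, a0]; not taken from any published table.
print(pb_flux_density(10.0, [0.1, -0.6, 1.5]))  # x = 1, so log10(S) is about 1
```

At 1 GHz the logarithm vanishes, so the flux density reduces to $10^{a_{0}}$, which is a quick sanity check on the coefficient ordering.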

References

[1]

Perley & Butler (2017)

dysh.util.files.dysh_data(sdfits=None, test=None, example=None, accept=None, dysh_data=None, gui=False)[source]#

Resolves the filename within the dysh data system without the need for an absolute path by passing mnemonics to any of four entry points (sdfits, test, example, accept).

Currently configured to work at GBO. For other sites users need to configure a $DYSH_DATA directory, properly populated with (symlinks to) directories as described below. Optionally, an explicit dysh_data can be given, which overrides any possible $DYSH_DATA environment (or configuration) that may exist.

Only one of the keywords sdfits, test, example, accept can be given to probe for data.

As an exception, if the first argument (sdfits) has an absolute filename, it is passed unchecked.

gui mode is experimental and may disappear or be re-implemented at a later stage.

The locations of various dysh_data directory roots are presented in the following Table, where $DYSH is the repo root for developers (this can be found using dysh.util.get_project_root).

keyword	location at GBO	$DYSH_DATA root
sdfits=	/home/sdfits	$DYSH_DATA/sdfits
test=	$DYSH/testdata	$DYSH_DATA/testdata
example=	/home/dysh/example_data	$DYSH_DATA/example_data
accept=	/home/dysh/acceptance_testing	$DYSH_DATA/acceptance_testing
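The third column of the table can be read as a simple keyword-to-subdirectory mapping. The sketch below shows only that mapping; candidate_path is a hypothetical helper written for this illustration, not part of dysh, and it performs none of the existence checks, mnemonic expansion, $SDFITS_DATA handling, or URL fallback that dysh_data does.

```python
import os
from pathlib import Path
from typing import Optional

# Sub-directory of the $DYSH_DATA root used by each keyword (third table column).
SUBDIRS = {
    "sdfits": "sdfits",
    "test": "testdata",
    "example": "example_data",
    "accept": "acceptance_testing",
}


def candidate_path(keyword: str, name: str, root: Optional[str] = None) -> Path:
    """Hypothetical helper: join the root, the keyword's sub-directory and a name."""
    if root is None:
        root = os.environ["DYSH_DATA"]  # raises KeyError if the variable is unset
    return Path(root) / SUBDIRS[keyword] / name


# "/data/dysh" is an arbitrary example root.
print(candidate_path("sdfits", "AGBT21B_024_54", root="/data/dysh"))
```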

test resolves to the same filename as the util.get_project_testdata() function but it is otherwise only available for developers (the testdata directory is not available if you pip install dysh).

If present, the $SDFITS_DATA directory is honored instead of the default for sdfits and overrides the $DYSH_DATA directory.

Notes

If $DYSH_DATA exists, it is prepended to the argument of get_dysh_data() and the result is checked for existence. If $DYSH_DATA does not exist but $SDFITS_DATA exists (a GBTIDL feature), that is used instead.
If /home/dysh exists, it is prepended and the result is checked for existence. This keeps GBO users happy; offsite, a symlink should also work.
If none of those gives a valid name, it falls back to making a URL by prepending http://www.gb.nrao.edu/dysh/ and using from_url, for as long as that is supported. astropy caching is also an option.
Directories (names not ending in .fits) cannot be downloaded using from_url.
Configuration is not implemented yet.

Examples

Using mnemonics

>>> fn = dysh_data(test='getps')
>>> fn = dysh_data(example='getfs')

Using full paths

>>> fn = dysh_data(example='onoff-L/data/TGBT21A_501_11.raw.vegas')

Using a project id

>>> fn = dysh_data('AGBT21B_024_54')

This will return /home/sdfits/AGBT21B_024_54 at GBO, or ${DYSH_DATA}/sdfits/AGBT21B_024_54 if the $DYSH_DATA environment variable is set.

dysh.util.files.fdr(filename, path=None, recursive=False, wildcard=False, maxfiles=None)[source]#

Parameters:

filename: File name to search for. Can contain a wildcard (but see the wildcard option).
path: Optional. Can be a colon-separated list of directories; an entry can start with $ if it is an environment variable.
recursive: Search recursively. Default: False
wildcard: Automatically wildcard the filename. Default: not used
maxfiles: Maximum number of files to be returned. Default: all

Returns:

List of found filenames, with maxfiles entries if applicable. The list may be empty. If multiple paths are given, maxfiles is applied to each sublist.

See also:

astropy’s getdata ???
pdrptry.pdrutils.get_testdata()
astropy.utils.data.get_pkg_data_filenames

Examples:

fdr('ngc1234.fits') - this exact file
fdr('*.fits') - all fits files in this directory
fdr('ngc1234.fits', '/tmp') - this file in /tmp
fdr('*.fits', '/tmp') - all fits files in /tmp
fdr('ngc1234.fits', '$DYSH_DATA_PATH')
fdr('ngc1234.fits', '$DYSH_DATA_PATH', True)
fdr('ngc1234.fits', '$DYSH_DATA_PATH:/data/gbt')

dysh.util.files.main_cli()[source]#

class dysh.util.gaincorrection.BaseGainCorrection[source]#

Bases: ABC

This class is the base class for gain corrections. It is intended to be subclassed for specific antennas. Subclasses will be used to calculate various gain corrections to go from antenna temperature to other scales like brightness temperature or flux density. Subclasses can implement the following attributes:

ap_eff_0
Long wavelength aperture efficiency (number between 0 and 1), $\eta_{a}$, i.e., in the absence of surface errors, $\lambda \gg \epsilon_0$.
epsilon_0
Default rms surface accuracy with units of length (Quantity)
physical_aperture
Antenna physical aperture with units of length**2 (Quantity)
loss_eff_0
The telescope efficiency combining radiation efficiency $\eta_r$ and rearward scattering and spillover efficiency, $\eta_{rss}$. $\eta_{loss} = \eta_r\eta_{rss}$. This is the term $\eta_l$ as defined by Kutner & Ulich (1981) equation 12. See https://articles.adsabs.harvard.edu/pdf/1981ApJ…250..341K

Attributes:

jyperk: The default gain of the telescope in Jy/K, $G = 2 k_B/A_p$, where $k_B$ is Boltzmann’s constant and $A_p$ is the area of the physical aperture of the telescope.
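To give a sense of scale for $G = 2 k_B/A_p$, the sketch below evaluates it with plain floats. The geometry is an assumption made for the example (a circular aperture 100 m in diameter), and jy_per_k is a hypothetical helper, not the dysh property.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)
JY = 1.0e-26  # one jansky in W m^-2 Hz^-1


def jy_per_k(physical_aperture_m2: float) -> float:
    """Return G = 2 k_B / A_p in Jy/K for an aperture area in square metres."""
    return 2.0 * K_B / physical_aperture_m2 / JY


# Assumed geometry: a circular aperture 100 m in diameter.
area = math.pi * 50.0**2
print(round(jy_per_k(area), 4))  # roughly 0.35 Jy/K
```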

Methods

`airmass`(angle, zd, **kwargs)	Computes the airmass at given elevation(s) or zenith distance(s).
`aperture_efficiency`(specval, **kwargs)	Calculate the antenna aperture efficiency.
`zenith_opacity`(specval, **kwargs)	Compute the zenith opacity.

abstract airmass(angle: Angle | Quantity, zd: bool, **kwargs) → float | ndarray[source]#

Computes the airmass at given elevation(s) or zenith distance(s). Subclasses should implement an airmass function specific to their application.

Parameters:

angleAngle or Quantity: The elevation(s) or zenith distance(s) at which to compute the airmass
zd: bool: True if the input value is zenith distance, False if it is elevation.
**kwargsAny: Other possible parameters that affect the airmass, e.g. weather data.

Returns:

airmassfloat or ndarray: The value(s) of the airmass at the given elevation(s)/zenith distance(s)

abstract aperture_efficiency(specval: Quantity, **kwargs) → float | ndarray[source]#

Calculate the antenna aperture efficiency.

Parameters:

specvalQuantity: The spectral value – frequency or wavelength – at which to compute the efficiency
**kwargsAny: Other possible parameters that affect the aperture efficiency, e.g., elevation angle.

Returns:

aperture_efficiencyfloat or ndarray: The value(s) of the aperture efficiency at the given frequency/wavelength. The return value(s) are float(s) between zero and one.

property jyperk#: The default gain of the telescope in Jy/K, $G = 2 k_B/A_p$, where $k_B$ is Boltzmann’s constant and $A_p$ is the area of the physical aperture of the telescope.

zenith_opacity(specval: Quantity, **kwargs) → float | ndarray[source]#

Compute the zenith opacity.

Parameters:

specvalQuantity: The spectral value – frequency or wavelength – at which to compute the opacity
**kwargsAny: Other possible parameters that affect the output opacity, e.g., MJD

Returns:

taufloat or ndarray: The values of the zenith opacity at the given frequency/wavelength. The return value(s) are non-negative float(s).

class dysh.util.gaincorrection.GBTGainCorrection(gain_correction_table: Path | None = None)[source]#

Bases: BaseGainCorrection

Gain correction class and functions specific to the Green Bank Telescope.

Parameters:

gain_correction_tablestr or pathlib.Path: File to read that contains the parameterized gain correction as a function of zenith distance and time (see GBT Memo 301). Must be in a QTable-readable format. If None (the default), dysh’s internal GBT gain correction table is used.

Attributes:

valid_scalestuple

Strings representing valid options for scaling spectral data, specifically

‘ta’ : Antenna Temperature in K
‘ta*’ : Antenna temperature corrected to above the atmosphere in K
‘flux’ : flux density in Jansky

Methods

`airmass`(angle[, zd])	Computes the airmass at given elevation(s) or zenith distance(s).
`aperture_efficiency`(specval, angle, date[, ...])	Compute the aperture efficiency, as a float between zero and 1.
`atm_temperature`(specval[, mjd, coeffs, ...])	Compute the atmospheric temperature `Tatm`, optionally interfacing with the GBO `getForecastValues` script.
`gain_correction`(angle, date[, zd])	Compute the gain correction scale factor, to be used in the aperture efficiency calculation.
`get_weather`(specval, vartype[, mjd, coeffs])	Call the GBO `getForecastValues` script with the given inputs.
`is_valid_scale`(scale)	Check that a string represents a valid option for scaling spectral data.
`scale_ta_to`(tscale, specval, angle, date, ...)	Scale the antenna temperature to a different brightness temperature unit.
`surface_error`(date)	Lookup the applicable surface error in the gain correction table for the observation date.
`zenith_opacity`(specval[, mjd, coeffs, ...])	Compute the zenith opacity, optionally interfacing with the GBO `getForecastValues` script.

airmass(angle: Angle | Quantity, zd: bool = False, **kwargs) → float | ndarray[source]#

Computes the airmass at given elevation(s) or zenith distance(s). The formula used is

$A = -0.0234 + 1.014/\sin(El + 5.18/(El + 3.35))$

for elevation $El$ in degrees. This function is specific to the GBT location and was derived from vertical weather data. Source: (Maddalena 2007)
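A plain-Python sketch of the airmass formula follows. It takes bare floats in degrees rather than Angle or Quantity objects, and it assumes that the whole argument of the sine is expressed in degrees, like the elevation itself. gbt_airmass is a name chosen for this example only.

```python
import math


def gbt_airmass(angle_deg: float, zd: bool = False) -> float:
    """Airmass from A = -0.0234 + 1.014 / sin(El + 5.18 / (El + 3.35))."""
    el = 90.0 - angle_deg if zd else angle_deg  # convert zenith distance to elevation
    # Assumption: the argument of the sine is in degrees, like El itself.
    arg_deg = el + 5.18 / (el + 3.35)
    return -0.0234 + 1.014 / math.sin(math.radians(arg_deg))


for el in (90.0, 30.0, 10.0):
    print(el, round(gbt_airmass(el), 3))
```

Under these assumptions the value at the zenith comes out slightly below one, and the value at 30 degrees elevation is close to the plane-parallel estimate of two.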

Parameters:

angleAngle or Quantity: The elevation(s) or zenith distance(s) at which to compute the airmass
zd: bool: True if the input value is zenith distance, False if it is elevation. Default: False

Returns:

airmassfloat or ndarray: The value(s) of the airmass at the given elevation(s)/zenith distance(s)

aperture_efficiency(specval: Quantity, angle: Angle | Quantity, date: Time, zd: bool = False, surface_error: Quantity | None = None, **kwargs) → float | ndarray[source]#

Compute the aperture efficiency, as a float between zero and 1. The aperture efficiency $\eta_a$, is determined by:

\[\eta_a = \eta_0 G(ZD) \exp(-(4\pi\epsilon/\lambda)^2)\]

where $\eta_0$ is the long wavelength aperture efficiency, $G(ZD)$ is the gain correction factor at a zenith distance $ZD$ (zd), $\epsilon$ (surface_error) is the surface error, and $\lambda$ is the wavelength.
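The expression above can be evaluated directly once the wavelength is derived from the frequency. The sketch below uses plain floats in SI units and placeholder inputs ($\eta_0 = 0.7$, $G = 1$, a 250 micron rms surface error); none of these numbers are taken from dysh or from the GBT gain correction table, and the function is an illustration rather than the method documented here.

```python
import math

C_M_S = 299_792_458.0  # speed of light in m/s


def aperture_efficiency(eta_0: float, gain: float, eps_m: float, freq_hz: float) -> float:
    """eta_a = eta_0 * G(ZD) * exp(-(4 pi eps / lambda)**2)."""
    wavelength_m = C_M_S / freq_hz
    return eta_0 * gain * math.exp(-((4.0 * math.pi * eps_m / wavelength_m) ** 2))


# Placeholder inputs: eta_0 = 0.7, G = 1, a 250 micron rms surface error.
for freq_ghz in (1.0, 30.0, 100.0):
    print(freq_ghz, aperture_efficiency(0.7, 1.0, 250e-6, freq_ghz * 1e9))
```

The exponential term is negligible at long wavelengths, where the result approaches $\eta_0 G(ZD)$, and it dominates as the wavelength approaches the scale of the surface error.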

Rules for input of multiple dates, spectral values, and angles

For a single date:

If one spectral value and multiple angles, then the aperture efficiency at each angle is returned.

If multiple spectral values and one angle, then the aperture efficiency at each spectral value is returned

If multiple spectral values and multiple angles, then it is assumed they are to be paired and the aperture efficiency at each pair will be returned. Therefore the lengths must be equal.

For multiple dates:

For one spectral value and one angle, the aperture efficiency at each date is returned.

For multiple spectral values and multiple angles, it is assumed they go together, so the lengths must match the number of dates. The aperture efficiency for each (spectral value, angle, date) tuple will be returned.

Parameters:

specvalQuantity: The spectral value(s) – frequency or wavelength – at which to compute the efficiency
angleAngle or Quantity: The elevation(s) or zenith distance(s) at which to compute the efficiency
dateTime: The date(s) at which to compute the efficiency.
zdbool: True if the input value is zenith distance, False if it is elevation. Default: False
surface_errorQuantity or None: The value of $\epsilon$ to use, the surface rms error. If given, must have units of length (typically microns). If None, the measured value from observatory testing will be used (See surface_error()).

Returns:

eta_afloat or ndarray: The aperture efficiency at the given inputs

atm_temperature(specval: Quantity, mjd: Time | float | None = None, coeffs: bool = True, use_script: bool = True, **kwargs) → ndarray[source]#

Compute the atmospheric temperature Tatm, optionally interfacing with the GBO getForecastValues script. If multiple specval are given, an array is returned; otherwise a float is returned.

For frequencies below 2 GHz, the value at 2 GHz will be returned since the getForecastValues does not cover < 2GHz. Returned values will be sorted by frequency, low to high.
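One way to picture the two rules in the previous paragraph (the 2 GHz floor and the low-to-high ordering) is the small sketch below. It operates on plain floats in GHz and is only an illustration of the described behaviour; forecast_frequencies is a hypothetical name and the actual implementation may differ.

```python
def forecast_frequencies(freqs_ghz: list, floor_ghz: float = 2.0) -> list:
    """Raise frequencies below the floor to the floor, then sort low to high."""
    return sorted(max(f, floor_ghz) for f in freqs_ghz)


print(forecast_frequencies([22.2, 1.4, 9.0]))  # [2.0, 9.0, 22.2]
```

Because of the sorting, returned values do not necessarily line up with the order in which the frequencies were passed in.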

Parameters:

specvalQuantity: The spectral value – frequency or wavelength – at which to compute Tatm.
mjdTime or float: The date at which to compute Tatm. If given as a float, it is interpreted as Modified Julian Day. Default: None, meaning ignore this parameter. If the user is not on the GBO network, this argument is ignored and Tatm will only be a function of frequency.
coeffsbool: If True and at GBO, getForecastValues will be passed the -coeffs argument which returns polynomial coefficients to fit Tatm as a function of frequency for each MJD.
use_scriptbool: If at GBO, use the getForecastValues script to determine Tatm. This argument is ignored if the user is not on the GBO network.

Returns:

tatmndarray: The atmospheric temperature at the given input(s) as a $N_{mjd} \times N_{freq}$ array

gain_correction(angle: Angle | Quantity, date: Time, zd: bool = True) → float | ndarray[source]#

Compute the gain correction scale factor, to be used in the aperture efficiency calculation. The factor is a float between zero and 1. (See GBT Memo 301). The factor is determined by:

$G = A_0 + A_1 ZD + A_2 ZD^2$

where $A_n$ are the time-dependent coefficients and $ZD$ is the zenith distance angle in degrees.
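The polynomial is straightforward to evaluate by hand. In the sketch below the three coefficients are hypothetical values for a single epoch, invented for the example; in dysh they come from the gain correction table and depend on the observation date.

```python
def gain_correction(zd_deg: float, a0: float, a1: float, a2: float) -> float:
    """Evaluate G = A0 + A1*ZD + A2*ZD**2 for a zenith distance in degrees."""
    return a0 + a1 * zd_deg + a2 * zd_deg**2


# Hypothetical coefficients for a single epoch; real values come from the table.
A0, A1, A2 = 0.9, 0.004, -0.00004

elevation_deg = 60.0
zd_deg = 90.0 - elevation_deg  # the polynomial is written in zenith distance
print(gain_correction(zd_deg, A0, A1, A2))  # 0.9 + 0.12 - 0.036, about 0.984
```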

Parameters:

angleAngle or Quantity: The elevation(s) or zenith distance(s) at which to compute the gain correction factor
dateTime: The date at which to compute the gain correction factor
zd: bool: True if the input value is zenith distance, False if it is elevation. Default: True

Returns:

gain_correctionfloat or ndarray: The gain correction scale factor(s) at the given elevation(s)/zenith distance(s)

property gain_correction_table#: The table containing the parameterized gain correction as a function of zenith distance and time

get_weather(specval: Quantity, vartype: str, mjd: Time | float | None = None, coeffs: bool = True, **kwargs) → ndarray[source]#

Call the GBO getForecastValues script with the given inputs. For frequencies below 2 GHz, the value at 2 GHz will be returned since the getForecastValues does not cover < 2GHz. Returned values will be sorted by frequency, low to high.

See GBTdocs for more details.

Parameters:

specvalQuantity, optional: The spectral value – frequency or wavelength – at which to compute vartype. For data such as ‘Winds’ that don’t depend on frequency, specval can be None.
vartypestr, optional: Which weather variable to fetch. See Notes for a description of valid values. If the user is not on the GBO network, the only variable available is Opacity.
mjdTime or float: The date at which to compute the opacity. If given as a float, it is interpreted as Modified Julian Day. Default: None, meaning the data will be fetched at the most recent MJD available. If the user is not on the GBO network, this argument is ignored and the opacity will only be a function of frequency.
coeffsbool: If True and at GBO, getForecastValues will be passed the -coeffs argument which returns polynomial coefficients to fit vartype as a function of frequency for each MJD. This is only valid for `vartype` “Opacity” or “Tatm.”

Returns:

weather_datandarray: The requested value(s) at the given input(s) as a $N_{mjd} \times N_{freq}$ array

classmethod is_valid_scale(scale)[source]#

Check that a string represents a valid option for scaling spectral data. See: valid_scales.

Parameters:

scalestr: temperature scale descriptive string.

Returns:

bool: True if scale is a valid scaling option, False otherwise

scale_ta_to(tscale: str, specval: Quantity, angle: Angle | Quantity, date: Time, zenith_opacity, zd=False, ap_eff=None, surface_error=None) → float | ndarray[source]#

Scale the antenna temperature to a different brightness temperature unit.

tscalestr

The brightness scale unit for the output scan, must be one of (case-insensitive)

‘Ta’ : Antenna Temperature in K
‘Ta*’ : Antenna temperature corrected to above the atmosphere in K
‘Flux’ : flux density in Jansky

If ‘Ta*’ or ‘Flux’ the zenith opacity must also be given.

specvalQuantity

The spectral value(s) – frequency or wavelength – at which to compute the efficiency

angleAngle or Quantity

The elevation(s) or zenith distance(s) at which to compute the efficiency

dateTime

The date(s) at which to compute the efficiency.

zenith_opacity: float

The zenith opacity to use in calculating the scale factors for the integrations.

zdbool

True if the input value is zenith distance, False if it is elevation. Default: False

ap_efffloat or None

Aperture efficiency to be used when scaling data to brightness temperature or flux. The provided aperture efficiency must be a number between 0 and 1. If None, dysh will calculate it using aperture_efficiency(). Only one of ap_eff or surface_error can be provided.

surface_errorQuantity or None

The value of $\epsilon_0$ to use, the surface rms error. If given, must have units of length (typically microns). If None, the measured value from observatory testing will be used (See surface_error()).

surface_error(date: Time) → Quantity[source]#

Lookup the applicable surface error in the gain correction table for the observation date.

Parameters:

dateTime: Date of observation

Returns:

Quantity: Surface error for the given date.

valid_scales: tuple[str, str, str] = ('ta', 'ta*', 'flux')#

zenith_opacity(specval: Quantity, mjd: Time | float | None = None, coeffs: bool = True, use_script: bool = True, **kwargs) → ndarray[source]#

Compute the zenith opacity, optionally interfacing with the GBO getForecastValues script. If multiple specval are given, an array is returned; otherwise a float is returned.

For frequencies below 2 GHz, the value at 2 GHz will be returned since the getForecastValues does not cover < 2GHz. Returned values will be sorted by frequency, low to high.

Parameters:

specvalQuantity: The spectral value – frequency or wavelength – at which to compute the zenith opacity.
mjdTime or float or list of float: The date(s) at which to compute the opacity. If given as a float, it is interpreted as Modified Julian Day. Default: None, meaning the data will be fetched at the most recent MJD available. If the user is not on the GBO network, this argument is ignored and the opacity will only be a function of frequency.
coeffsbool: If True and at GBO, getForecastValues will be passed the -coeffs argument which returns polynomial coefficients to fit opacity as a function of frequency for each MJD.
use_scriptbool: If at GBO, use the getForecastValues script to determine the opacity. This argument is ignored if the user is not on the GBO network.

Returns:

taundarray: The zenith opacity at the given input(s) as a $N_{mjd} \times N_{freq}$ array.

class dysh.util.selection.Flag(initobj, aliases={'dec': 'crval3', 'elevation': 'elevatio', 'freq': 'crval1', 'gallat': 'crval3', 'gallon': 'crval2', 'glat': 'crval3', 'glon': 'crval2', 'pol': 'plnum', 'ra': 'crval2', 'source': 'object', 'subref': 'subref_state'}, **kwargs)[source]#

Bases: SelectionBase

This class contains the methods for creating rules to flag data from an SDFITS object. Data (rows) can be selected for flagging using any column name in the input SDFITS object. Exact selection, range selection, upper/lower limit selection, and any-of selection are all supported.

Users create flag rules by specifying keyword (SDFITS columns) and value(s) to be flagged. Briefly, the flag methods are:

flag() - Flag exact values

flag_range() - Flag ranges of values

flag_within() - Flag a value +/- epsilon

flag_channel() - Flag channels or ranges of channels

The Flag object maintains a DataFrame for each flag rule created by the user. The final() flag is the logical OR of these rules. Users can examine the current flags with show() which will show the current rules and how many rows each rule selects for flagging from the unfiltered data.
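The idea that the final flag is the logical OR of the individual rules can be illustrated without dysh or pandas. In the sketch below each dictionary stands in for one row of an SDFITS index, the column names are assumed for the example, and the two rules are hypothetical. It mimics the bookkeeping only (rows matched per rule, then their union) and is not how the Flag class is implemented.

```python
# Toy index: each dict stands in for one row of an SDFITS index table.
rows = [
    {"scan": 10, "plnum": 0, "ifnum": 0},
    {"scan": 10, "plnum": 1, "ifnum": 0},
    {"scan": 11, "plnum": 0, "ifnum": 1},
    {"scan": 12, "plnum": 1, "ifnum": 1},
]

# Two hypothetical rules, written as predicates on a row.
rules = [
    lambda r: r["scan"] == 10,  # exact value on one column
    lambda r: r["ifnum"] == 1 and r["plnum"] == 1,  # exact values on two columns
]

# Rows matched by each rule, and the union (logical OR) of all rules.
per_rule = [[i for i, r in enumerate(rows) if rule(r)] for rule in rules]
final = sorted({i for matched in per_rule for i in matched})

print(per_rule)  # [[0, 1], [3]]
print(final)     # [0, 1, 3]
```

The per-rule lists correspond to the row counts that show() reports for each rule, while the union corresponds to the rows that end up flagged.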

The actual flags, which are per channel, are stored in the GBTFITSLoad object, not in the Flag object. The Flag object just contains the flagging rules.

Aliases of keywords are supported. The user may add an alias for an existing SDFITS column with alias(). Some default aliases() have been defined.

GBTIDL Flags can be read in with read().

Attributes:

T

The transpose of the DataFrame.

aliases

The aliases that may be used to refer to SDFITS columns.

at

Access a single value for a row/column label pair.

attrs

Dictionary of global attributes of this dataset.

axes

Return a list representing the axes of the DataFrame.

columns

The column labels of the DataFrame.

>>> df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
>>> df
     A  B
0    1  3
1    2  4
>>> df.columns
Index(['A', 'B'], dtype='object')

dtypes

Return the dtypes in the DataFrame.

empty

Indicator whether Series/DataFrame is empty.

final

Create the final flag selection.

flags

Get the properties associated with this pandas object.

iat

Access a single value for a row/column pair by integer position.

iloc

Purely integer-location based indexing for selection by position.

index

The index (row labels) of the DataFrame.

The index of a DataFrame is a series of labels that identify each row. The labels can be integers, strings, or any other hashable type. The index is used for label-based access and alignment, and can be accessed or modified using this attribute.

pandas.Index: The index labels of the DataFrame.

DataFrame.columns : The column labels of the DataFrame. DataFrame.to_numpy : Convert the DataFrame to a NumPy array.

>>> df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Aritra'],
...                    'Age': [25, 30, 35],
...                    'Location': ['Seattle', 'New York', 'Kona']},
...                   index=([10, 20, 30]))
>>> df.index
Index([10, 20, 30], dtype='int64')

In this example, we create a DataFrame with 3 rows and 3 columns, including Name, Age, and Location information. We set the index labels to be the integers 10, 20, and 30. We then access the index attribute of the DataFrame, which returns an Index object containing the index labels.

>>> df.index = [100, 200, 300]
>>> df
       Name  Age  Location
100   Alice   25   Seattle
200     Bob   30  New York
300  Aritra   35      Kona

In this example, we modify the index labels of the DataFrame by assigning a new list of labels to the index attribute. The DataFrame is then updated with the new labels, and the output shows the modified DataFrame.

loc

Access a group of rows and columns by label(s) or a boolean array.

ndim

Return an int representing the number of axes / array dimensions.

shape

Return a tuple representing the dimensionality of the DataFrame.

size

Return an int representing the number of elements in this object.

style

Returns a Styler object.

values

Return a Numpy representation of the DataFrame.

Methods

`abs`()	Return a Series/DataFrame with absolute numeric value of each element.
`add`(other[, axis, level, fill_value])	Get Addition of dataframe and other, element-wise (binary operator `add`).
`add_prefix`(prefix[, axis])	Prefix labels with string `prefix`.
`add_suffix`(suffix[, axis])	Suffix labels with string `suffix`.
`agg`([func, axis])	Aggregate using one or more operations over the specified axis.
`aggregate`([func, axis])	Aggregate using one or more operations over the specified axis.
`alias`(aliases)	Alias a set of keywords to existing columns.
`align`(other[, join, axis, level, copy, ...])	Align two objects on their axes with the specified join method.
`all`([axis, bool_only, skipna])	Return whether all elements are True, potentially over an axis.
`any`(*[, axis, bool_only, skipna])	Return whether any element is True, potentially over an axis.
`apply`(func[, axis, raw, result_type, args, ...])	Apply a function along an axis of the DataFrame.
`applymap`(func[, na_action])	Apply a function to a Dataframe elementwise.
`asfreq`(freq[, method, how, normalize, ...])	Convert time series to specified frequency.
`asof`(where[, subset])	Return the last row(s) without any NaNs before `where`.
`assign`(**kwargs)	Assign new columns to a DataFrame.
`astype`(dtype[, copy, errors])	Cast a pandas object to a specified dtype `dtype`.
`at_time`(time[, asof, axis])	Select values at particular time of day (e.g., 9:30AM).
`backfill`(*[, axis, inplace, limit, downcast])	Fill NA/NaN values by using the next valid observation to fill the gap.
`between_time`(start_time, end_time[, ...])	Select values between particular times of the day (e.g., 9:00-9:30 AM).
`bfill`(*[, axis, inplace, limit, limit_area, ...])	Fill NA/NaN values by using the next valid observation to fill the gap.
`bool`()	Return the bool of a single element Series or DataFrame.
`boxplot`([column, by, ax, fontsize, rot, ...])	Make a box plot from DataFrame columns.
`clear`()	Remove all selection rules
`clip`([lower, upper, axis, inplace])	Trim values at input threshold(s).
`columns_selected`()	The names of any columns which were used in a selection rule
`combine`(other, func[, fill_value, overwrite])	Perform column-wise combine with another DataFrame.
`combine_first`(other)	Update null elements with value in the same location in `other`.
`compare`(other[, align_axis, keep_shape, ...])	Compare to another DataFrame and show the differences.
`convert_dtypes`([infer_objects, ...])	Convert columns to the best possible dtypes using dtypes supporting `pd.NA`.
`copy`([deep])	Make a copy of this object's indices and data.
`corr`([method, min_periods, numeric_only])	Compute pairwise correlation of columns, excluding NA/null values.
`corrwith`(other[, axis, drop, method, ...])	Compute pairwise correlation.
`count`([axis, numeric_only])	Count non-NA cells for each column or row.
`cov`([min_periods, ddof, numeric_only])	Compute pairwise covariance of columns, excluding NA/null values.
`cummax`([axis, skipna])	Return cumulative maximum over a DataFrame or Series axis.
`cummin`([axis, skipna])	Return cumulative minimum over a DataFrame or Series axis.
`cumprod`([axis, skipna])	Return cumulative product over a DataFrame or Series axis.
`cumsum`([axis, skipna])	Return cumulative sum over a DataFrame or Series axis.
`describe`([percentiles, include, exclude])	Generate descriptive statistics.
`diff`([periods, axis])	First discrete difference of element.
`div`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator `truediv`).
`divide`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator `truediv`).
`dot`(other)	Compute the matrix multiplication between the DataFrame and other.
`drop`([labels, axis, index, columns, level, ...])	Drop specified labels from rows or columns.
`drop_duplicates`([subset, keep, inplace, ...])	Return DataFrame with duplicate rows removed.
`droplevel`(level[, axis])	Return Series/DataFrame with requested index / column level(s) removed.
`dropna`(*[, axis, how, thresh, subset, ...])	Remove missing values.
`duplicated`([subset, keep])	Return boolean Series denoting duplicate rows.
`eq`(other[, axis, level])	Get Equal to of dataframe and other, element-wise (binary operator `eq`).
`equals`(other)	Test whether two objects contain the same elements.
`eval`(expr, *[, inplace])	Evaluate a string describing operations on DataFrame columns.
`ewm`([com, span, halflife, alpha, ...])	Provide exponentially weighted (EW) calculations.
`expanding`([min_periods, axis, method])	Provide expanding window calculations.
`explode`(column[, ignore_index])	Transform each element of a list-like to a row, replicating index values.
`ffill`(*[, axis, inplace, limit, limit_area, ...])	Fill NA/NaN values by propagating the last valid observation to next valid.
`fillna`([value, method, axis, inplace, ...])	Fill NA/NaN values using the specified method.
`filter`([items, like, regex, axis])	Subset the dataframe rows or columns according to the specified index labels.
`first`(offset)	Select initial periods of time series data based on a date offset.
`first_valid_index`()	Return index for first non-NA value or None, if no non-NA value is found.
`flag`([tag, check])	Add one or more exact flag rules, e.g., `key1 = value1, key2 = value2, ...` If `value` is array-like then a match to any of the array members will be flagged.
`flag_channel`(channel[, tag])	Flag channels and/or channel ranges for all data.
`flag_range`([tag, check])	Flag a range of inclusive values for a given key(s).
`flag_within`([tag, check])	Flag a value within a plus or minus for a given key(s).
`floordiv`(other[, axis, level, fill_value])	Get Integer division of dataframe and other, element-wise (binary operator `floordiv`).
`from_dict`(data[, orient, dtype, columns])	Construct DataFrame from dict of array-like or dicts.
`from_records`(data[, index, exclude, ...])	Convert structured or record ndarray to DataFrame.
`ge`(other[, axis, level])	Get Greater than or equal to of dataframe and other, element-wise (binary operator `ge`).
`get`(key)	Get the selection/flag rule by its ID
`groupby`([by, axis, level, as_index, sort, ...])	Group DataFrame using a mapper or by a Series of columns.
`gt`(other[, axis, level])	Get Greater than of dataframe and other, element-wise (binary operator `gt`).
`head`([n])	Return the first `n` rows.
`hist`([column, by, grid, xlabelsize, xrot, ...])	Make a histogram of the DataFrame's columns.
`idxmax`([axis, skipna, numeric_only])	Return index of first occurrence of maximum over requested axis.
`idxmin`([axis, skipna, numeric_only])	Return index of first occurrence of minimum over requested axis.
`infer_objects`([copy])	Attempt to infer better dtypes for object columns.
`info`([verbose, buf, max_cols, memory_usage, ...])	Print a concise summary of a DataFrame.
`insert`(loc, column, value[, allow_duplicates])	Insert column into DataFrame at specified location.
`interpolate`([method, axis, limit, inplace, ...])	Fill NaN values using an interpolation method.
`isetitem`(loc, value)	Set the given value in the column with position `loc`.
`isin`(values)	Whether each element in the DataFrame is contained in values.
`isna`()	Detect missing values.
`isnull`()	DataFrame.isnull is an alias for DataFrame.isna.
`items`()	Iterate over (column name, Series) pairs.
`iterrows`()	Iterate over DataFrame rows as (index, Series) pairs.
`itertuples`([index, name])	Iterate over DataFrame rows as namedtuples.
`join`(other[, on, how, lsuffix, rsuffix, ...])	Join columns of another DataFrame.
`keys`()	Get the 'info axis' (see Indexing for more).
`kurt`([axis, skipna, numeric_only])	Return unbiased kurtosis over requested axis.
`kurtosis`([axis, skipna, numeric_only])	Return unbiased kurtosis over requested axis.
`last`(offset)	Select final periods of time series data based on a date offset.
`last_valid_index`()	Return index for last non-NA value or None, if no non-NA value is found.
`le`(other[, axis, level])	Get Less than or equal to of dataframe and other, element-wise (binary operator `le`).
`lt`(other[, axis, level])	Get Less than of dataframe and other, element-wise (binary operator `lt`).
`map`(func[, na_action])	Apply a function to a Dataframe elementwise.
`mask`(cond[, other, inplace, axis, level])	Replace values where the condition is True.
`max`([axis, skipna, numeric_only])	Return the maximum of the values over the requested axis.
`mean`([axis, skipna, numeric_only])	Return the mean of the values over the requested axis.
`median`([axis, skipna, numeric_only])	Return the median of the values over the requested axis.
`melt`([id_vars, value_vars, var_name, ...])	Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.
`memory_usage`([index, deep])	Return the memory usage of each column in bytes.
`merge`(how[, on])	Merge selection rules using a specific type of join.
`min`([axis, skipna, numeric_only])	Return the minimum of the values over the requested axis.
`mod`(other[, axis, level, fill_value])	Get Modulo of dataframe and other, element-wise (binary operator `mod`).
`mode`([axis, numeric_only, dropna])	Get the mode(s) of each element along the selected axis.
`mul`(other[, axis, level, fill_value])	Get Multiplication of dataframe and other, element-wise (binary operator `mul`).
`multiply`(other[, axis, level, fill_value])	Get Multiplication of dataframe and other, element-wise (binary operator `mul`).
`ne`(other[, axis, level])	Get Not equal to of dataframe and other, element-wise (binary operator `ne`).
`nlargest`(n, columns[, keep])	Return the first `n` rows ordered by `columns` in descending order.
`notna`()	Detect existing (non-missing) values.
`notnull`()	DataFrame.notnull is an alias for DataFrame.notna.
`nsmallest`(n, columns[, keep])	Return the first `n` rows ordered by `columns` in ascending order.
`nunique`([axis, dropna])	Count number of distinct elements in specified axis.
`pad`(*[, axis, inplace, limit, downcast])	Fill NA/NaN values by propagating the last valid observation to next valid.
`pct_change`([periods, fill_method, limit, freq])	Fractional change between the current and a prior element.
`pipe`(func, *args, **kwargs)	Apply chainable functions that expect Series or DataFrames.
`pivot`(*, columns[, index, values])	Return reshaped DataFrame organized by given index / column values.
`pivot_table`([values, index, columns, ...])	Create a spreadsheet-style pivot table as a DataFrame.
`plot`	alias of `PlotAccessor`
`pop`(item)	Return item and drop from frame.
`pow`(other[, axis, level, fill_value])	Get Exponential power of dataframe and other, element-wise (binary operator `pow`).
`prod`([axis, skipna, numeric_only, min_count])	Return the product of the values over the requested axis.
`product`([axis, skipna, numeric_only, min_count])	Return the product of the values over the requested axis.
`quantile`([q, axis, numeric_only, ...])	Return values at the given quantile over requested axis.
`query`(expr, *[, inplace])	Query the columns of a DataFrame with a boolean expression.
`radd`(other[, axis, level, fill_value])	Get Addition of dataframe and other, element-wise (binary operator `radd`).
`rank`([axis, method, numeric_only, ...])	Compute numerical data ranks (1 through n) along axis.
`rdiv`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator `rtruediv`).
`read`(fileobj[, ignore_vegas])	Read a GBTIDL flag file and instantiate Flag object.
`reindex`([labels, index, columns, axis, ...])	Conform DataFrame to new index with optional filling logic.
`reindex_like`(other[, method, copy, limit, ...])	Return an object with matching indices as other object.
`remove`([id, tag])	Remove (delete) a selection rule(s).
`rename`([mapper, index, columns, axis, copy, ...])	Rename columns or index labels.
`rename_axis`([mapper, index, columns, axis, ...])	Set the name of the axis for the index or columns.
`reorder_levels`(order[, axis])	Rearrange index levels using input order.
`replace`([to_replace, value, inplace, limit, ...])	Replace values given in `to_replace` with `value`.
`resample`(rule[, axis, closed, label, ...])	Resample time-series data.
`reset_index`([level, drop, inplace, ...])	Reset the index, or a level of it.
`rfloordiv`(other[, axis, level, fill_value])	Get Integer division of dataframe and other, element-wise (binary operator `rfloordiv`).
`rmod`(other[, axis, level, fill_value])	Get Modulo of dataframe and other, element-wise (binary operator `rmod`).
`rmul`(other[, axis, level, fill_value])	Get Multiplication of dataframe and other, element-wise (binary operator `rmul`).
`rolling`(window[, min_periods, center, ...])	Provide rolling window calculations.
`round`([decimals])	Round a DataFrame to a variable number of decimal places.
`rpow`(other[, axis, level, fill_value])	Get Exponential power of dataframe and other, element-wise (binary operator `rpow`).
`rsub`(other[, axis, level, fill_value])	Get Subtraction of dataframe and other, element-wise (binary operator `rsub`).
`rtruediv`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator `rtruediv`).
`sample`([n, frac, replace, weights, ...])	Return a random sample of items from an axis of object.
`select_dtypes`([include, exclude])	Return a subset of the DataFrame's columns based on the column dtypes.
`sem`([axis, skipna, ddof, numeric_only])	Return unbiased standard error of the mean over requested axis.
`set_axis`(labels, *[, axis, copy])	Assign desired index to given axis.
`set_flags`(*[, copy, allows_duplicate_labels])	Return a new object with updated flags.
`set_index`(keys, *[, drop, append, inplace, ...])	Set the DataFrame index using existing columns.
`shift`([periods, freq, axis, fill_value, suffix])	Shift index by desired number of periods with an optional time `freq`.
`show`()	Print the current selection rules.
`skew`([axis, skipna, numeric_only])	Return unbiased skew over requested axis.
`sort_index`(*[, axis, level, ascending, ...])	Sort object by labels (along an axis).
`sort_values`(by, *[, axis, ascending, ...])	Sort by the values along either axis.
`sparse`	alias of `SparseFrameAccessor`
`squeeze`([axis])	Squeeze 1 dimensional axis objects into scalars.
`stack`([level, dropna, sort, future_stack])	Stack the prescribed level(s) from columns to index.
`std`([axis, skipna, ddof, numeric_only])	Return sample standard deviation over requested axis.
`sub`(other[, axis, level, fill_value])	Get Subtraction of dataframe and other, element-wise (binary operator `sub`).
`subtract`(other[, axis, level, fill_value])	Get Subtraction of dataframe and other, element-wise (binary operator `sub`).
`sum`([axis, skipna, numeric_only, min_count])	Return the sum of the values over the requested axis.
`swapaxes`(axis1, axis2[, copy])	Interchange axes and swap values axes appropriately.
`swaplevel`([i, j, axis])	Swap levels i and j in a `MultiIndex`.
`tail`([n])	Return the last `n` rows.
`take`(indices[, axis])	Return the elements in the given positional indices along an axis.
`to_clipboard`(*[, excel, sep])	Copy object to the system clipboard.
`to_csv`([path_or_buf, sep, na_rep, ...])	Write object to a comma-separated values (csv) file.
`to_dict`([orient, into, index])	Convert the DataFrame to a dictionary.
`to_excel`(excel_writer, *[, sheet_name, ...])	Write object to an Excel sheet.
`to_feather`(path, **kwargs)	Write a DataFrame to the binary Feather format.
`to_gbq`(destination_table, *[, project_id, ...])	Write a DataFrame to a Google BigQuery table.
`to_hdf`(path_or_buf, *, key[, mode, ...])	Write the contained data to an HDF5 file using HDFStore.
`to_html`([buf, columns, col_space, header, ...])	Render a DataFrame as an HTML table.
`to_json`([path_or_buf, orient, date_format, ...])	Convert the object to a JSON string.
`to_latex`([buf, columns, header, index, ...])	Render object to a LaTeX tabular, longtable, or nested table.
`to_markdown`([buf, mode, index, storage_options])	Print DataFrame in Markdown-friendly format.
`to_numpy`([dtype, copy, na_value])	Convert the DataFrame to a NumPy array.
`to_orc`([path, engine, index, engine_kwargs])	Write a DataFrame to the ORC format.
`to_parquet`([path, engine, compression, ...])	Write a DataFrame to the binary parquet format.
`to_period`([freq, axis, copy])	Convert DataFrame from DatetimeIndex to PeriodIndex.
`to_pickle`(path, *[, compression, protocol, ...])	Pickle (serialize) object to file.
`to_records`([index, column_dtypes, index_dtypes])	Convert DataFrame to a NumPy record array.
`to_sql`(name, con, *[, schema, if_exists, ...])	Write records stored in a DataFrame to a SQL database.
`to_stata`(path, *[, convert_dates, ...])	Export DataFrame object to Stata dta format.
`to_string`([buf, columns, col_space, header, ...])	Render a DataFrame to a console-friendly tabular output.
`to_timestamp`([freq, how, axis, copy])	Cast to DatetimeIndex of timestamps, at beginning of period.
`to_xarray`()	Return an xarray object from the pandas object.
`to_xml`([path_or_buffer, index, root_name, ...])	Render a DataFrame to an XML document.
`transform`(func[, axis])	Call `func` on self producing a DataFrame with the same axis shape as self.
`transpose`(*args[, copy])	Transpose index and columns.
`truediv`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator `truediv`).
`truncate`([before, after, axis, copy])	Truncate a Series or DataFrame before and after some index value.
`tz_convert`(tz[, axis, level, copy])	Convert tz-aware axis to target time zone.
`tz_localize`(tz[, axis, level, copy, ...])	Localize tz-naive index of a Series or DataFrame to target time zone.
`unstack`([level, fill_value, sort])	Pivot a level of the (necessarily hierarchical) index labels.
`update`(other[, join, overwrite, ...])	Modify in place using non-NA values from another DataFrame.
`value_counts`([subset, normalize, sort, ...])	Return a Series containing the frequency of each distinct row in the Dataframe.
`var`([axis, skipna, ddof, numeric_only])	Return unbiased variance over requested axis.
`where`(cond[, other, inplace, axis, level])	Replace values where the condition is False.
`xs`(key[, axis, level, drop_level])	Return cross-section from the Series/DataFrame.

property final#

Create the final flag selection. This is done by a logical OR of each of the flag rules (specifically pandas.merge(how='outer')). Unlike Selection which uses AND logic to progressively narrow down data, Flag uses OR logic to cumulatively flag any data matching any rule.

Returns:

finalDataFrame: The resultant flagged rows from all the rules.
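The difference between OR and AND combination can be illustrated with plain pandas. This is a minimal sketch, not dysh's implementation: the `index` table, its column names, and the two rule frames are made up for illustration, and it assumes each rule simply holds the rows it matched from the same table.

```python
import pandas as pd

# Hypothetical index table: one row per spectrum (column names are illustrative).
index = pd.DataFrame(
    {
        "ROW": [0, 1, 2, 3],
        "OBJECT": ["3C273", "3C273", "MBM12", "MBM12"],
        "IFNUM": [0, 1, 0, 1],
    }
)

# Each rule keeps the rows it matches.
rule_a = index[index["OBJECT"] == "3C273"]  # rows 0 and 1
rule_b = index[index["IFNUM"] == 0]         # rows 0 and 2

# OR logic: an outer merge keeps rows matched by either rule.
flagged = pd.merge(rule_a, rule_b, how="outer").sort_values("ROW")

# AND logic: an inner merge keeps only rows matched by both rules.
selected = pd.merge(rule_a, rule_b, how="inner")

print(flagged["ROW"].tolist())   # [0, 1, 2]
print(selected["ROW"].tolist())  # [0]
```

With no `on` argument, `pandas.merge` joins on all columns the two frames share, so a row counts as the same row only if every column agrees.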

flag(tag=None, check=False, **kwargs)[source]#

Add one or more exact flag rules, e.g., key1 = value1, key2 = value2, ... If value is array-like then a match to any of the array members will be flagged. For instance flag(object=['3C273', 'NGC1234']) will flag data for either of those objects and flag(ifnum=[0,2]) will flag IF number 0 or IF number 2. Channels for selected data can be flagged using keyword channel, e.g., flag(object='MBM12',channel=[0,23]) will flag channels 0 through 23 inclusive for object MBM12.

Parameters:

tagstr: An identifying tag by which the rule may be referred to later. If None, a randomly generated tag will be created.
checkbool: If True, check that no previous rule gives a result identical to this one.
keystr: The key (SDFITS column name or other supported key)
valueany: The value to select
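The any-of matching for array-like values can be sketched with pandas `isin`. The helper `match_any` and the small `index` table below are hypothetical and only illustrate the matching rule described above; they are not part of dysh.

```python
import pandas as pd


def match_any(df, key, value):
    """Hypothetical helper: rows where df[key] equals value, or any member of a list-like value."""
    values = value if isinstance(value, (list, tuple, set)) else [value]
    return df[df[key].isin(values)]


# Illustrative table; the column names mirror the keywords used in the text above.
index = pd.DataFrame(
    {"object": ["3C273", "NGC1234", "MBM12", "3C273"], "ifnum": [0, 1, 2, 2]}
)

print(match_any(index, "object", ["3C273", "NGC1234"]).index.tolist())  # [0, 1, 3]
print(match_any(index, "ifnum", [0, 2]).index.tolist())                 # [0, 2, 3]
print(match_any(index, "object", "MBM12").index.tolist())               # [2]
```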

flag_channel(channel, tag=None, **kwargs)[source]#

Flag channels and/or channel ranges for all data. These are NOT used in final() but rather will be used to create a mask for flagging. Single arrays/tuples will be treated as channel lists; nested arrays will be treated as inclusive ranges. For instance:

```
# flag channel 24
flag_channel(24)
# flag channels 1 and 10
flag_channel([1,10])
# flag channels 1 thru 10 inclusive
flag_channel([[1,10]])
# flag channel ranges 1 thru 10 and 47 thru 56 inclusive, and channel 75
flag_channel([[1,10], [47,56], 75])
# tuples also work, though can be harder for a human to read
flag_channel(((1,10), [47,56], 75))
```

Note: channel numbers start at zero.

Parameters:

channelnumber, or array-like: The channels to flag

Returns:

None.
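To make the list-versus-range convention concrete, here is a minimal sketch of how such a channel specification could be expanded into a boolean mask. The function `channel_mask` and its `nchan` argument are hypothetical and are not dysh's implementation; the sketch only follows the rules stated above (top-level entries are single channels, nested pairs are inclusive ranges, numbering starts at zero).

```python
import numpy as np


def channel_mask(channel, nchan):
    """Hypothetical helper: boolean mask of length nchan, True where a channel is flagged."""
    mask = np.zeros(nchan, dtype=bool)
    # A bare number is treated as a one-element channel list.
    items = [channel] if np.isscalar(channel) else channel
    for item in items:
        if np.isscalar(item):
            mask[int(item)] = True  # single channel
        else:
            lo, hi = item
            mask[int(lo) : int(hi) + 1] = True  # inclusive range
    return mask


print(np.flatnonzero(channel_mask(24, 100)).tolist())         # [24]
print(np.flatnonzero(channel_mask([1, 10], 100)).tolist())    # [1, 10]
print(int(channel_mask([[1, 10]], 100).sum()))                # 10
print(int(channel_mask(((1, 10), [47, 56], 75), 100).sum()))  # 21
```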

flag_range(tag=None, check=False, **kwargs)[source]#

Flag a range of inclusive values for a given key(s), e.g., key1 = (v1,v2), key2 = (v3,v4), ... will flag data v1 <= data1 <= v2, v3 <= data2 <= v4, ... Upper and lower limits may be given by setting one of the tuple values to None, e.g., key1 = (None,v1) for an upper limit data1 <= v1 and key1 = (v1,None) for a lower limit data1 >= v1. Lower limits may also be specified by a one-element tuple key1 = (v1,).

Parameters:

tagstr, optional: An identifying tag by which the rule may be referred to later. If None, a randomly generated tag will be created.
checkbool: If True, check that no previous rule gives a result identical to this one.
keystr: The key (SDFITS column name or other supported key)
valuearray-like: Tuple or list giving the lower and upper limits of the range.

Returns:

None.
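The limit conventions (a two-element range, None for an open end, and a one-element tuple for a lower limit) can be sketched with pandas comparisons. The helper `range_rule` and the `elevation` column below are hypothetical and only illustrate the semantics described above.

```python
import pandas as pd


def range_rule(df, key, limits):
    """Hypothetical helper: rows with lower <= df[key] <= upper; None means no bound."""
    lower = limits[0]
    # A one-element tuple gives only a lower limit.
    upper = limits[1] if len(limits) > 1 else None
    mask = pd.Series(True, index=df.index)
    if lower is not None:
        mask &= df[key] >= lower
    if upper is not None:
        mask &= df[key] <= upper
    return df[mask]


df = pd.DataFrame({"elevation": [10.0, 20.0, 30.0, 40.0]})

print(range_rule(df, "elevation", (20, 30))["elevation"].tolist())    # [20.0, 30.0]
print(range_rule(df, "elevation", (None, 20))["elevation"].tolist())  # [10.0, 20.0]
print(range_rule(df, "elevation", (30,))["elevation"].tolist())       # [30.0, 40.0]
```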

flag_within(tag=None, check=False, **kwargs)[source]#

Flag values within plus or minus epsilon of a given value for one or more keys, e.g., key1 = [value1,epsilon1], key2 = [value2,epsilon2], ... will flag data value1-epsilon1 <= data1 <= value1+epsilon1, value2-epsilon2 <= data2 <= value2+epsilon2, ...

Parameters:

tagstr, optional: An identifying tag by which the rule may be referred to later. If None, a randomly generated tag will be created.
checkbool: If True, check that no previous rule gives a result identical to this one.
keystr: The key (SDFITS column name or other supported key)
valuearray-like: Tuple or list giving the value and epsilon

Returns:

None.
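The value-plus-or-minus-epsilon test is an inclusive range centered on the value. A minimal pandas sketch, using a hypothetical helper `within_rule` and an illustrative `elevation` column (neither is part of dysh):

```python
import pandas as pd


def within_rule(df, key, value, epsilon):
    """Hypothetical helper: rows with value - epsilon <= df[key] <= value + epsilon."""
    # Series.between is inclusive at both ends by default.
    return df[df[key].between(value - epsilon, value + epsilon)]


df = pd.DataFrame({"elevation": [45.0, 45.4, 46.0, 46.2]})

print(within_rule(df, "elevation", 45.5, 0.5)["elevation"].tolist())  # [45.0, 45.4, 46.0]
```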

read(fileobj, ignore_vegas=False, **kwargs)[source]#

Read a GBTIDL flag file and instantiate Flag object.

Parameters:

fileobjstr, file-like or pathlib.Path: File to read. If a file object, must be opened in a readable mode.
ignore_vegasbool: If True, ignore any flag rules which contain ‘VEGAS_SPUR’ in the line, as these are usually flagged via algorithm. See calc_vegas_spurs().
**kwargsdict: Extra keyword arguments to apply to the flag rule. (This is mainly for internal use.)

Returns:

None.

class dysh.util.selection.Selection(initobj, aliases={'dec': 'crval3', 'elevation': 'elevatio', 'freq': 'crval1', 'gallat': 'crval3', 'gallon': 'crval2', 'glat': 'crval3', 'glon': 'crval2', 'pol': 'plnum', 'ra': 'crval2', 'source': 'object', 'subref': 'subref_state'}, **kwargs)[source]#

Bases: SelectionBase

This class contains the methods for creating rules to select data from an SDFITS object. Data (rows) can be selected using any column name in the input SDFITS object. Exact selection, range selection, upper/lower limit selection, and any-of selection are all supported.

Users create selection rules by specifying keywords (SDFITS columns) and the value(s) to be selected. Briefly, the selection methods are:

select() - Select exact values

select_range() - Select ranges of values

select_within() - Select a value +/- epsilon

select_channel() - Select channels or ranges of channels

The Selection object maintains a DataFrame for each selection rule created by the user. The final selection is the logical AND of these rules, so each added rule narrows the selected data further. Users can examine the current selections with show(), which lists the current rules and how many rows each rule selects from the unfiltered data.

Aliases of keywords are supported. The user may add an alias for an existing SDFITS column with alias(). Some default aliases have been defined; they are listed in the aliases argument of the class signature.
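Conceptually, an alias is a second name for a column, so resolving one is a dictionary lookup. The sketch below copies a few entries from the default `aliases` mapping shown in the class signature; the helper `resolve_aliases` is hypothetical and is not dysh's implementation.

```python
# A few of the default aliases from the class signature above.
ALIASES = {
    "dec": "crval3",
    "elevation": "elevatio",
    "freq": "crval1",
    "pol": "plnum",
    "ra": "crval2",
    "source": "object",
}


def resolve_aliases(rule, aliases):
    """Hypothetical helper: replace aliased keywords with the column they refer to."""
    # Keys without an alias are passed through unchanged.
    return {aliases.get(key.lower(), key): value for key, value in rule.items()}


rule = resolve_aliases({"source": "3C273", "pol": 0, "ifnum": 1}, ALIASES)
print(rule)  # {'object': '3C273', 'plnum': 0, 'ifnum': 1}
```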

Attributes:

T

The transpose of the DataFrame.

aliases

The aliases that may be used to refer to SDFITS columns.

at

Access a single value for a row/column label pair.

attrs

Dictionary of global attributes of this dataset.

axes

Return a list representing the axes of the DataFrame.

columns

The column labels of the DataFrame.

>>> df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
>>> df
     A  B
0    1  3
1    2  4
>>> df.columns
Index(['A', 'B'], dtype='object')

dtypes

Return the dtypes in the DataFrame.

empty

Indicator whether Series/DataFrame is empty.

final

Create the final selection.

flags

Get the properties associated with this pandas object.

iat

Access a single value for a row/column pair by integer position.

iloc

Purely integer-location based indexing for selection by position.

index

The index (row labels) of the DataFrame.

The index of a DataFrame is a series of labels that identify each row. The labels can be integers, strings, or any other hashable type. The index is used for label-based access and alignment, and can be accessed or modified using this attribute.

pandas.Index: The index labels of the DataFrame.

DataFrame.columns: The column labels of the DataFrame.
DataFrame.to_numpy: Convert the DataFrame to a NumPy array.

>>> df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Aritra'],
...                    'Age': [25, 30, 35],
...                    'Location': ['Seattle', 'New York', 'Kona']},
...                   index=([10, 20, 30]))
>>> df.index
Index([10, 20, 30], dtype='int64')

In this example, we create a DataFrame with 3 rows and 3 columns, including Name, Age, and Location information. We set the index labels to be the integers 10, 20, and 30. We then access the index attribute of the DataFrame, which returns an Index object containing the index labels.

>>> df.index = [100, 200, 300]
>>> df
       Name  Age  Location
100   Alice   25   Seattle
200     Bob   30  New York
300  Aritra   35      Kona

In this example, we modify the index labels of the DataFrame by assigning a new list of labels to the index attribute. The DataFrame is then updated with the new labels, and the output shows the modified DataFrame.

loc

Access a group of rows and columns by label(s) or a boolean array.

ndim

Return an int representing the number of axes / array dimensions.

shape

Return a tuple representing the dimensionality of the DataFrame.

size

Return an int representing the number of elements in this object.

style

Returns a Styler object.

values

Return a Numpy representation of the DataFrame.

Methods

`abs`()	Return a Series/DataFrame with absolute numeric value of each element.
`add`(other[, axis, level, fill_value])	Get Addition of dataframe and other, element-wise (binary operator `add`).
`add_prefix`(prefix[, axis])	Prefix labels with string `prefix`.
`add_suffix`(suffix[, axis])	Suffix labels with string `suffix`.
`agg`([func, axis])	Aggregate using one or more operations over the specified axis.
`aggregate`([func, axis])	Aggregate using one or more operations over the specified axis.
`alias`(aliases)	Alias a set of keywords to existing columns.
`align`(other[, join, axis, level, copy, ...])	Align two objects on their axes with the specified join method.
`all`([axis, bool_only, skipna])	Return whether all elements are True, potentially over an axis.
`any`(*[, axis, bool_only, skipna])	Return whether any element is True, potentially over an axis.
`apply`(func[, axis, raw, result_type, args, ...])	Apply a function along an axis of the DataFrame.
`applymap`(func[, na_action])	Apply a function to a Dataframe elementwise.
`asfreq`(freq[, method, how, normalize, ...])	Convert time series to specified frequency.
`asof`(where[, subset])	Return the last row(s) without any NaNs before `where`.
`assign`(**kwargs)	Assign new columns to a DataFrame.
`astype`(dtype[, copy, errors])	Cast a pandas object to a specified dtype `dtype`.
`at_time`(time[, asof, axis])	Select values at particular time of day (e.g., 9:30AM).
`backfill`(*[, axis, inplace, limit, downcast])	Fill NA/NaN values by using the next valid observation to fill the gap.
`between_time`(start_time, end_time[, ...])	Select values between particular times of the day (e.g., 9:00-9:30 AM).
`bfill`(*[, axis, inplace, limit, limit_area, ...])	Fill NA/NaN values by using the next valid observation to fill the gap.
`bool`()	Return the bool of a single element Series or DataFrame.
`boxplot`([column, by, ax, fontsize, rot, ...])	Make a box plot from DataFrame columns.
`clear`()	Remove all selection rules
`clip`([lower, upper, axis, inplace])	Trim values at input threshold(s).
`columns_selected`()	The names of any columns which were used in a selection rule
`combine`(other, func[, fill_value, overwrite])	Perform column-wise combine with another DataFrame.
`combine_first`(other)	Update null elements with value in the same location in `other`.
`compare`(other[, align_axis, keep_shape, ...])	Compare to another DataFrame and show the differences.
`convert_dtypes`([infer_objects, ...])	Convert columns to the best possible dtypes using dtypes supporting `pd.NA`.
`copy`([deep])	Make a copy of this object's indices and data.
`corr`([method, min_periods, numeric_only])	Compute pairwise correlation of columns, excluding NA/null values.
`corrwith`(other[, axis, drop, method, ...])	Compute pairwise correlation.
`count`([axis, numeric_only])	Count non-NA cells for each column or row.
`cov`([min_periods, ddof, numeric_only])	Compute pairwise covariance of columns, excluding NA/null values.
`cummax`([axis, skipna])	Return cumulative maximum over a DataFrame or Series axis.
`cummin`([axis, skipna])	Return cumulative minimum over a DataFrame or Series axis.
`cumprod`([axis, skipna])	Return cumulative product over a DataFrame or Series axis.
`cumsum`([axis, skipna])	Return cumulative sum over a DataFrame or Series axis.
`describe`([percentiles, include, exclude])	Generate descriptive statistics.
`diff`([periods, axis])	First discrete difference of element.
`div`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator `truediv`).
`divide`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator `truediv`).
`dot`(other)	Compute the matrix multiplication between the DataFrame and other.
`drop`([labels, axis, index, columns, level, ...])	Drop specified labels from rows or columns.
`drop_duplicates`([subset, keep, inplace, ...])	Return DataFrame with duplicate rows removed.
`droplevel`(level[, axis])	Return Series/DataFrame with requested index / column level(s) removed.
`dropna`(*[, axis, how, thresh, subset, ...])	Remove missing values.
`duplicated`([subset, keep])	Return boolean Series denoting duplicate rows.
`eq`(other[, axis, level])	Get Equal to of dataframe and other, element-wise (binary operator `eq`).
`equals`(other)	Test whether two objects contain the same elements.
`eval`(expr, *[, inplace])	Evaluate a string describing operations on DataFrame columns.
`ewm`([com, span, halflife, alpha, ...])	Provide exponentially weighted (EW) calculations.
`expanding`([min_periods, axis, method])	Provide expanding window calculations.
`explode`(column[, ignore_index])	Transform each element of a list-like to a row, replicating index values.
`ffill`(*[, axis, inplace, limit, limit_area, ...])	Fill NA/NaN values by propagating the last valid observation to next valid.
`fillna`([value, method, axis, inplace, ...])	Fill NA/NaN values using the specified method.
`filter`([items, like, regex, axis])	Subset the dataframe rows or columns according to the specified index labels.
`first`(offset)	Select initial periods of time series data based on a date offset.
`first_valid_index`()	Return index for first non-NA value or None, if no non-NA value is found.
`floordiv`(other[, axis, level, fill_value])	Get Integer division of dataframe and other, element-wise (binary operator `floordiv`).
`from_dict`(data[, orient, dtype, columns])	Construct DataFrame from dict of array-like or dicts.
`from_records`(data[, index, exclude, ...])	Convert structured or record ndarray to DataFrame.
`ge`(other[, axis, level])	Get Greater than or equal to of dataframe and other, element-wise (binary operator `ge`).
`get`(key)	Get the selection/flag rule by its ID
`groupby`([by, axis, level, as_index, sort, ...])	Group DataFrame using a mapper or by a Series of columns.
`gt`(other[, axis, level])	Get Greater than of dataframe and other, element-wise (binary operator `gt`).
`head`([n])	Return the first `n` rows.
`hist`([column, by, grid, xlabelsize, xrot, ...])	Make a histogram of the DataFrame's columns.
`idxmax`([axis, skipna, numeric_only])	Return index of first occurrence of maximum over requested axis.
`idxmin`([axis, skipna, numeric_only])	Return index of first occurrence of minimum over requested axis.
`infer_objects`([copy])	Attempt to infer better dtypes for object columns.
`info`([verbose, buf, max_cols, memory_usage, ...])	Print a concise summary of a DataFrame.
`insert`(loc, column, value[, allow_duplicates])	Insert column into DataFrame at specified location.
`interpolate`([method, axis, limit, inplace, ...])	Fill NaN values using an interpolation method.
`isetitem`(loc, value)	Set the given value in the column with position `loc`.
`isin`(values)	Whether each element in the DataFrame is contained in values.
`isna`()	Detect missing values.
`isnull`()	DataFrame.isnull is an alias for DataFrame.isna.
`items`()	Iterate over (column name, Series) pairs.
`iterrows`()	Iterate over DataFrame rows as (index, Series) pairs.
`itertuples`([index, name])	Iterate over DataFrame rows as namedtuples.
`join`(other[, on, how, lsuffix, rsuffix, ...])	Join columns of another DataFrame.
`keys`()	Get the 'info axis' (see Indexing for more).
`kurt`([axis, skipna, numeric_only])	Return unbiased kurtosis over requested axis.
`kurtosis`([axis, skipna, numeric_only])	Return unbiased kurtosis over requested axis.
`last`(offset)	Select final periods of time series data based on a date offset.
`last_valid_index`()	Return index for last non-NA value or None, if no non-NA value is found.
`le`(other[, axis, level])	Get Less than or equal to of dataframe and other, element-wise (binary operator `le`).
`lt`(other[, axis, level])	Get Less than of dataframe and other, element-wise (binary operator `lt`).
`map`(func[, na_action])	Apply a function to a Dataframe elementwise.
`mask`(cond[, other, inplace, axis, level])	Replace values where the condition is True.
`max`([axis, skipna, numeric_only])	Return the maximum of the values over the requested axis.
`mean`([axis, skipna, numeric_only])	Return the mean of the values over the requested axis.
`median`([axis, skipna, numeric_only])	Return the median of the values over the requested axis.
`melt`([id_vars, value_vars, var_name, ...])	Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.
`memory_usage`([index, deep])	Return the memory usage of each column in bytes.
`merge`(how[, on])	Merge selection rules using a specific type of join.
`min`([axis, skipna, numeric_only])	Return the minimum of the values over the requested axis.
`mod`(other[, axis, level, fill_value])	Get Modulo of dataframe and other, element-wise (binary operator `mod`).
`mode`([axis, numeric_only, dropna])	Get the mode(s) of each element along the selected axis.
`mul`(other[, axis, level, fill_value])	Get Multiplication of dataframe and other, element-wise (binary operator `mul`).
`multiply`(other[, axis, level, fill_value])	Get Multiplication of dataframe and other, element-wise (binary operator `mul`).
`ne`(other[, axis, level])	Get Not equal to of dataframe and other, element-wise (binary operator `ne`).
`nlargest`(n, columns[, keep])	Return the first `n` rows ordered by `columns` in descending order.
`notna`()	Detect existing (non-missing) values.
`notnull`()	DataFrame.notnull is an alias for DataFrame.notna.
`nsmallest`(n, columns[, keep])	Return the first `n` rows ordered by `columns` in ascending order.
`nunique`([axis, dropna])	Count number of distinct elements in specified axis.
`pad`(*[, axis, inplace, limit, downcast])	Fill NA/NaN values by propagating the last valid observation to next valid.
`pct_change`([periods, fill_method, limit, freq])	Fractional change between the current and a prior element.
`pipe`(func, *args, **kwargs)	Apply chainable functions that expect Series or DataFrames.
`pivot`(*, columns[, index, values])	Return reshaped DataFrame organized by given index / column values.
`pivot_table`([values, index, columns, ...])	Create a spreadsheet-style pivot table as a DataFrame.
`plot`	alias of `PlotAccessor`
`pop`(item)	Return item and drop from frame.
`pow`(other[, axis, level, fill_value])	Get Exponential power of dataframe and other, element-wise (binary operator `pow`).
`prod`([axis, skipna, numeric_only, min_count])	Return the product of the values over the requested axis.
`product`([axis, skipna, numeric_only, min_count])	Return the product of the values over the requested axis.
`quantile`([q, axis, numeric_only, ...])	Return values at the given quantile over requested axis.
`query`(expr, *[, inplace])	Query the columns of a DataFrame with a boolean expression.
`radd`(other[, axis, level, fill_value])	Get Addition of dataframe and other, element-wise (binary operator `radd`).
`rank`([axis, method, numeric_only, ...])	Compute numerical data ranks (1 through n) along axis.
`rdiv`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator `rtruediv`).
`reindex`([labels, index, columns, axis, ...])	Conform DataFrame to new index with optional filling logic.
`reindex_like`(other[, method, copy, limit, ...])	Return an object with matching indices as other object.
`remove`([id, tag])	Remove (delete) one or more selection rules.
`rename`([mapper, index, columns, axis, copy, ...])	Rename columns or index labels.
`rename_axis`([mapper, index, columns, axis, ...])	Set the name of the axis for the index or columns.
`reorder_levels`(order[, axis])	Rearrange index levels using input order.
`replace`([to_replace, value, inplace, limit, ...])	Replace values given in `to_replace` with `value`.
`resample`(rule[, axis, closed, label, ...])	Resample time-series data.
`reset_index`([level, drop, inplace, ...])	Reset the index, or a level of it.
`rfloordiv`(other[, axis, level, fill_value])	Get Integer division of dataframe and other, element-wise (binary operator `rfloordiv`).
`rmod`(other[, axis, level, fill_value])	Get Modulo of dataframe and other, element-wise (binary operator `rmod`).
`rmul`(other[, axis, level, fill_value])	Get Multiplication of dataframe and other, element-wise (binary operator `rmul`).
`rolling`(window[, min_periods, center, ...])	Provide rolling window calculations.
`round`([decimals])	Round a DataFrame to a variable number of decimal places.
`rpow`(other[, axis, level, fill_value])	Get Exponential power of dataframe and other, element-wise (binary operator `rpow`).
`rsub`(other[, axis, level, fill_value])	Get Subtraction of dataframe and other, element-wise (binary operator `rsub`).
`rtruediv`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator `rtruediv`).
`sample`([n, frac, replace, weights, ...])	Return a random sample of items from an axis of object.
`select`([tag, check])	Add one or more exact selection rules, e.g., `key1 = value1, key2 = value2, ...` If `value` is array-like then a match to any of the array members will be selected.
`select_channel`(channel[, tag])	Select channels and/or channel ranges.
`select_dtypes`([include, exclude])	Return a subset of the DataFrame's columns based on the column dtypes.
`select_range`([tag])	Select a range of inclusive values for one or more keys.
`select_within`([tag])	Select values within plus or minus a tolerance of a given value for one or more keys.
`sem`([axis, skipna, ddof, numeric_only])	Return unbiased standard error of the mean over requested axis.
`set_axis`(labels, *[, axis, copy])	Assign desired index to given axis.
`set_flags`(*[, copy, allows_duplicate_labels])	Return a new object with updated flags.
`set_index`(keys, *[, drop, append, inplace, ...])	Set the DataFrame index using existing columns.
`shift`([periods, freq, axis, fill_value, suffix])	Shift index by desired number of periods with an optional time `freq`.
`show`()	Print the current selection rules.
`skew`([axis, skipna, numeric_only])	Return unbiased skew over requested axis.
`sort_index`(*[, axis, level, ascending, ...])	Sort object by labels (along an axis).
`sort_values`(by, *[, axis, ascending, ...])	Sort by the values along either axis.
`sparse`	alias of `SparseFrameAccessor`
`squeeze`([axis])	Squeeze 1 dimensional axis objects into scalars.
`stack`([level, dropna, sort, future_stack])	Stack the prescribed level(s) from columns to index.
`std`([axis, skipna, ddof, numeric_only])	Return sample standard deviation over requested axis.
`sub`(other[, axis, level, fill_value])	Get Subtraction of dataframe and other, element-wise (binary operator `sub`).
`subtract`(other[, axis, level, fill_value])	Get Subtraction of dataframe and other, element-wise (binary operator `sub`).
`sum`([axis, skipna, numeric_only, min_count])	Return the sum of the values over the requested axis.
`swapaxes`(axis1, axis2[, copy])	Interchange axes and swap values axes appropriately.
`swaplevel`([i, j, axis])	Swap levels i and j in a `MultiIndex`.
`tail`([n])	Return the last `n` rows.
`take`(indices[, axis])	Return the elements in the given positional indices along an axis.
`to_clipboard`(*[, excel, sep])	Copy object to the system clipboard.
`to_csv`([path_or_buf, sep, na_rep, ...])	Write object to a comma-separated values (csv) file.
`to_dict`([orient, into, index])	Convert the DataFrame to a dictionary.
`to_excel`(excel_writer, *[, sheet_name, ...])	Write object to an Excel sheet.
`to_feather`(path, **kwargs)	Write a DataFrame to the binary Feather format.
`to_gbq`(destination_table, *[, project_id, ...])	Write a DataFrame to a Google BigQuery table.
`to_hdf`(path_or_buf, *, key[, mode, ...])	Write the contained data to an HDF5 file using HDFStore.
`to_html`([buf, columns, col_space, header, ...])	Render a DataFrame as an HTML table.
`to_json`([path_or_buf, orient, date_format, ...])	Convert the object to a JSON string.
`to_latex`([buf, columns, header, index, ...])	Render object to a LaTeX tabular, longtable, or nested table.
`to_markdown`([buf, mode, index, storage_options])	Print DataFrame in Markdown-friendly format.
`to_numpy`([dtype, copy, na_value])	Convert the DataFrame to a NumPy array.
`to_orc`([path, engine, index, engine_kwargs])	Write a DataFrame to the ORC format.
`to_parquet`([path, engine, compression, ...])	Write a DataFrame to the binary parquet format.
`to_period`([freq, axis, copy])	Convert DataFrame from DatetimeIndex to PeriodIndex.
`to_pickle`(path, *[, compression, protocol, ...])	Pickle (serialize) object to file.
`to_records`([index, column_dtypes, index_dtypes])	Convert DataFrame to a NumPy record array.
`to_sql`(name, con, *[, schema, if_exists, ...])	Write records stored in a DataFrame to a SQL database.
`to_stata`(path, *[, convert_dates, ...])	Export DataFrame object to Stata dta format.
`to_string`([buf, columns, col_space, header, ...])	Render a DataFrame to a console-friendly tabular output.
`to_timestamp`([freq, how, axis, copy])	Cast to DatetimeIndex of timestamps, at beginning of period.
`to_xarray`()	Return an xarray object from the pandas object.
`to_xml`([path_or_buffer, index, root_name, ...])	Render a DataFrame to an XML document.
`transform`(func[, axis])	Call `func` on self producing a DataFrame with the same axis shape as self.
`transpose`(*args[, copy])	Transpose index and columns.
`truediv`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator `truediv`).
`truncate`([before, after, axis, copy])	Truncate a Series or DataFrame before and after some index value.
`tz_convert`(tz[, axis, level, copy])	Convert tz-aware axis to target time zone.
`tz_localize`(tz[, axis, level, copy, ...])	Localize tz-naive index of a Series or DataFrame to target time zone.
`unstack`([level, fill_value, sort])	Pivot a level of the (necessarily hierarchical) index labels.
`update`(other[, join, overwrite, ...])	Modify in place using non-NA values from another DataFrame.
`value_counts`([subset, normalize, sort, ...])	Return a Series containing the frequency of each distinct row in the DataFrame.
`var`([axis, skipna, ddof, numeric_only])	Return unbiased variance over requested axis.
`where`(cond[, other, inplace, axis, level])	Replace values where the condition is False.
`xs`(key[, axis, level, drop_level])	Return cross-section from the Series/DataFrame.

select(tag=None, check=False, **kwargs)[source]#

Add one or more exact selection rules, e.g., key1 = value1, key2 = value2, ... If value is array-like then a match to any of the array members will be selected. For instance select(object=['3C273', 'NGC1234']) will select data for either of those objects and select(ifnum=[0,2]) will select IF number 0 or IF number 2.

Parameters:

tagstr: An identifying tag by which the rule may be referred to later. If None, a randomly generated tag will be created.
checkbool: If True, check that no previous selection gives the same result as this one.
keystr: The key (SDFITS column name or other supported key)
valueany: The value to select
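
The matching rule described above can be modeled in plain Python. This is only a sketch of the semantics, not dysh's implementation: `exact_select` and `rows` are hypothetical names, and the sketch assumes that multiple keys given in one call are combined with a logical AND.

```python
# Illustrative sketch only: a plain-Python model of exact-match selection.
# `exact_select` and `rows` are hypothetical names, not part of dysh.
def exact_select(rows, **rules):
    """Keep rows where every key equals its value, or any member of a list-like value."""
    selected = []
    for row in rows:
        keep = True
        for key, value in rules.items():
            # A list, tuple, or set means "match any of these members".
            allowed = value if isinstance(value, (list, tuple, set)) else [value]
            if row[key] not in allowed:
                keep = False
                break
        if keep:
            selected.append(row)
    return selected


rows = [
    {"object": "3C273", "ifnum": 0},
    {"object": "NGC1234", "ifnum": 1},
    {"object": "M82", "ifnum": 2},
    {"object": "3C273", "ifnum": 2},
]

# Either object matches.
by_object = exact_select(rows, object=["3C273", "NGC1234"])
# Assumed AND between keys: object is 3C273 and IF number is 0 or 2.
by_both = exact_select(rows, object="3C273", ifnum=[0, 2])
```
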

select_channel(channel, tag=None)[source]#

Select channels and/or channel ranges. These are NOT used in final() but rather will be used to create a mask for calibration or flagging. Single arrays/tuples will be treated as channel lists; nested arrays will be treated as inclusive ranges. Channel numbers start at zero.

Parameters:

channelnumber, or array-like: The channels to select

Returns:

None

Examples

Select channel 24.

>>> select_channel(24)

Select channels 1 and 10.

>>> select_channel([1,10])

Select channels 1 thru 10 inclusive.

>>> select_channel([[1,10]])

Select channel ranges 1 thru 10 and 47 thru 56 inclusive, and channel 75.

>>> select_channel([[1,10], [47,56], 75])

Tuples also work. To select the same channels as above:

>>> select_channel(((1,10), [47,56], 75))
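
To make the list-versus-range convention concrete, the sketch below expands a channel specification into explicit zero-based channel numbers. It is an illustration of the rules stated above, not dysh code; `expand_channels` is a hypothetical helper.

```python
# Illustrative sketch only; `expand_channels` is a hypothetical helper, not dysh code.
def expand_channels(channel):
    """Expand a channel spec into a sorted list of zero-based channel numbers."""
    if isinstance(channel, int):
        return [channel]
    channels = set()
    for item in channel:
        if isinstance(item, (list, tuple)):
            lo, hi = item  # nested pair: inclusive range
            channels.update(range(lo, hi + 1))
        else:
            channels.add(item)  # bare number: a single channel
    return sorted(channels)


single = expand_channels(24)                       # one channel
pair = expand_channels([1, 10])                    # two channels, not a range
span = expand_channels([[1, 10]])                  # channels 1 through 10
mixed = expand_channels(((1, 10), [47, 56], 75))   # two ranges plus channel 75
```

Note that `[1, 10]` and `[[1, 10]]` differ: the first names two channels, the second names an inclusive range.
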

select_range(tag=None, **kwargs)[source]#

Select a range of inclusive values for one or more keys, e.g., key1 = (v1,v2), key2 = (v3,v4), ... will select data with v1 <= data1 <= v2, v3 <= data2 <= v4, ... Upper and lower limits may be given by setting one of the tuple values to None, e.g., key1 = (None,v1) for an upper limit data1 <= v1 and key1 = (v1,None) for a lower limit data1 >= v1. Lower limits may also be specified by a one-element tuple key1 = (v1,).

Parameters:

tagstr, optional: An identifying tag by which the rule may be referred to later. If None, a randomly generated tag will be created.
keystr: The key (SDFITS column name or other supported key)
valuearray-like: Tuple or list giving the lower and upper limits of the range.

Returns:

None.
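
The limit conventions for select_range (two-sided, None for an open end, one-element tuple for a lower limit) can be sketched as a small predicate. This is a plain-Python illustration of the documented semantics, not dysh's implementation; `in_range` and the sample `elevations` list are hypothetical.

```python
# Illustrative sketch only; `in_range` is a hypothetical helper, not dysh code.
def in_range(value, limits):
    """Return True if value lies in the inclusive range described by limits."""
    lower = limits[0]
    # A one-element tuple gives only a lower limit.
    upper = limits[1] if len(limits) > 1 else None
    if lower is not None and value < lower:
        return False
    if upper is not None and value > upper:
        return False
    return True


elevations = [10.0, 25.0, 40.0, 55.0]  # made-up sample values

both = [e for e in elevations if in_range(e, (25.0, 40.0))]        # 25 <= e <= 40
upper_only = [e for e in elevations if in_range(e, (None, 25.0))]  # e <= 25
lower_only = [e for e in elevations if in_range(e, (40.0, None))]  # e >= 40
one_element = [e for e in elevations if in_range(e, (40.0,))]      # same as lower_only
```
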

select_within(tag=None, **kwargs)[source]#

Select values within plus or minus a tolerance of a given value for one or more keys, e.g., key1 = [value1,epsilon1], key2 = [value2,epsilon2], ... will select data with value1-epsilon1 <= data1 <= value1+epsilon1, value2-epsilon2 <= data2 <= value2+epsilon2, ...

Parameters:

tagstr, optional: An identifying tag by which the rule may be referred to later. If None, a randomly generated tag will be created.
keystr: The key (SDFITS column name or other supported key)
valuearray-like: Tuple or list giving the value and epsilon

Returns:

None.
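
The tolerance rule for select_within amounts to an inclusive range centered on the value. The sketch below shows that equivalence in plain Python; `within` and the sample data are hypothetical and are not taken from dysh.

```python
# Illustrative sketch only; `within` is a hypothetical helper, not dysh code.
def within(value, target, epsilon):
    """Return True if target - epsilon <= value <= target + epsilon."""
    return target - epsilon <= value <= target + epsilon


elevations = [48.0, 49.5, 50.0, 50.5, 52.0]  # made-up sample values

# Comparable in spirit to a rule like elevation=[50.0, 0.5].
near_50 = [e for e in elevations if within(e, 50.0, 0.5)]

# The same rows follow from the inclusive range (target - epsilon, target + epsilon).
as_range = [e for e in elevations if 49.5 <= e <= 50.5]
```
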

class dysh.util.selection.SelectionBase(initobj, aliases={'dec': 'crval3', 'elevation': 'elevatio', 'freq': 'crval1', 'gallat': 'crval3', 'gallon': 'crval2', 'glat': 'crval3', 'glon': 'crval2', 'pol': 'plnum', 'ra': 'crval2', 'source': 'object', 'subref': 'subref_state'}, **kwargs)[source]#

Bases: DataFrame

This class is the base class for selection and flagging. Selection and flagging are both kinds of data selection, so SelectionBase can encapsulate most of the necessary functionality. Derived classes implement specific named methods, e.g., select_channel and flag_channel, that simply call the base class methods.

Attributes:

T

The transpose of the DataFrame.

aliases

The aliases that may be used to refer to SDFITS columns.

at

Access a single value for a row/column label pair.

attrs

Dictionary of global attributes of this dataset.

axes

Return a list representing the axes of the DataFrame.

columns

The column labels of the DataFrame.

>>> df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
>>> df
     A  B
0    1  3
1    2  4
>>> df.columns
Index(['A', 'B'], dtype='object')

dtypes

Return the dtypes in the DataFrame.

empty

Indicator whether Series/DataFrame is empty.

final

Create the final selection.

flags

Get the properties associated with this pandas object.

iat

Access a single value for a row/column pair by integer position.

iloc

Purely integer-location based indexing for selection by position.

index

The index (row labels) of the DataFrame.

The index of a DataFrame is a series of labels that identify each row. The labels can be integers, strings, or any other hashable type. The index is used for label-based access and alignment, and can be accessed or modified using this attribute.

Returns:

pandas.Index: The index labels of the DataFrame.

See also:

DataFrame.columns: The column labels of the DataFrame.
DataFrame.to_numpy: Convert the DataFrame to a NumPy array.

>>> df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Aritra'],
...                    'Age': [25, 30, 35],
...                    'Location': ['Seattle', 'New York', 'Kona']},
...                   index=([10, 20, 30]))
>>> df.index
Index([10, 20, 30], dtype='int64')

In this example, we create a DataFrame with 3 rows and 3 columns, including Name, Age, and Location information. We set the index labels to be the integers 10, 20, and 30. We then access the index attribute of the DataFrame, which returns an Index object containing the index labels.

>>> df.index = [100, 200, 300]
>>> df
       Name  Age  Location
100   Alice   25   Seattle
200     Bob   30  New York
300  Aritra   35      Kona

In this example, we modify the index labels of the DataFrame by assigning a new list of labels to the index attribute. The DataFrame is then updated with the new labels, and the output shows the modified DataFrame.

loc

Access a group of rows and columns by label(s) or a boolean array.

ndim

Return an int representing the number of axes / array dimensions.

shape

Return a tuple representing the dimensionality of the DataFrame.

size

Return an int representing the number of elements in this object.

style

Returns a Styler object.

values

Return a Numpy representation of the DataFrame.

Methods

`abs`()	Return a Series/DataFrame with absolute numeric value of each element.
`add`(other[, axis, level, fill_value])	Get Addition of dataframe and other, element-wise (binary operator `add`).
`add_prefix`(prefix[, axis])	Prefix labels with string `prefix`.
`add_suffix`(suffix[, axis])	Suffix labels with string `suffix`.
`agg`([func, axis])	Aggregate using one or more operations over the specified axis.
`aggregate`([func, axis])	Aggregate using one or more operations over the specified axis.
`alias`(aliases)	Alias a set of keywords to existing columns.
`align`(other[, join, axis, level, copy, ...])	Align two objects on their axes with the specified join method.
`all`([axis, bool_only, skipna])	Return whether all elements are True, potentially over an axis.
`any`(*[, axis, bool_only, skipna])	Return whether any element is True, potentially over an axis.
`apply`(func[, axis, raw, result_type, args, ...])	Apply a function along an axis of the DataFrame.
`applymap`(func[, na_action])	Apply a function to a DataFrame elementwise.
`asfreq`(freq[, method, how, normalize, ...])	Convert time series to specified frequency.
`asof`(where[, subset])	Return the last row(s) without any NaNs before `where`.
`assign`(**kwargs)	Assign new columns to a DataFrame.
`astype`(dtype[, copy, errors])	Cast a pandas object to a specified dtype `dtype`.
`at_time`(time[, asof, axis])	Select values at particular time of day (e.g., 9:30AM).
`backfill`(*[, axis, inplace, limit, downcast])	Fill NA/NaN values by using the next valid observation to fill the gap.
`between_time`(start_time, end_time[, ...])	Select values between particular times of the day (e.g., 9:00-9:30 AM).
`bfill`(*[, axis, inplace, limit, limit_area, ...])	Fill NA/NaN values by using the next valid observation to fill the gap.
`bool`()	Return the bool of a single element Series or DataFrame.
`boxplot`([column, by, ax, fontsize, rot, ...])	Make a box plot from DataFrame columns.
`clear`()	Remove all selection rules.
`clip`([lower, upper, axis, inplace])	Trim values at input threshold(s).
`columns_selected`()	The names of any columns which were used in a selection rule.
`combine`(other, func[, fill_value, overwrite])	Perform column-wise combine with another DataFrame.
`combine_first`(other)	Update null elements with value in the same location in `other`.
`compare`(other[, align_axis, keep_shape, ...])	Compare to another DataFrame and show the differences.
`convert_dtypes`([infer_objects, ...])	Convert columns to the best possible dtypes using dtypes supporting `pd.NA`.
`copy`([deep])	Make a copy of this object's indices and data.
`corr`([method, min_periods, numeric_only])	Compute pairwise correlation of columns, excluding NA/null values.
`corrwith`(other[, axis, drop, method, ...])	Compute pairwise correlation.
`count`([axis, numeric_only])	Count non-NA cells for each column or row.
`cov`([min_periods, ddof, numeric_only])	Compute pairwise covariance of columns, excluding NA/null values.
`cummax`([axis, skipna])	Return cumulative maximum over a DataFrame or Series axis.
`cummin`([axis, skipna])	Return cumulative minimum over a DataFrame or Series axis.
`cumprod`([axis, skipna])	Return cumulative product over a DataFrame or Series axis.
`cumsum`([axis, skipna])	Return cumulative sum over a DataFrame or Series axis.
`describe`([percentiles, include, exclude])	Generate descriptive statistics.
`diff`([periods, axis])	First discrete difference of element.
`div`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator `truediv`).
`divide`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator `truediv`).
`dot`(other)	Compute the matrix multiplication between the DataFrame and other.
`drop`([labels, axis, index, columns, level, ...])	Drop specified labels from rows or columns.
`drop_duplicates`([subset, keep, inplace, ...])	Return DataFrame with duplicate rows removed.
`droplevel`(level[, axis])	Return Series/DataFrame with requested index / column level(s) removed.
`dropna`(*[, axis, how, thresh, subset, ...])	Remove missing values.
`duplicated`([subset, keep])	Return boolean Series denoting duplicate rows.
`eq`(other[, axis, level])	Get Equal to of dataframe and other, element-wise (binary operator `eq`).
`equals`(other)	Test whether two objects contain the same elements.
`eval`(expr, *[, inplace])	Evaluate a string describing operations on DataFrame columns.
`ewm`([com, span, halflife, alpha, ...])	Provide exponentially weighted (EW) calculations.
`expanding`([min_periods, axis, method])	Provide expanding window calculations.
`explode`(column[, ignore_index])	Transform each element of a list-like to a row, replicating index values.
`ffill`(*[, axis, inplace, limit, limit_area, ...])	Fill NA/NaN values by propagating the last valid observation to next valid.
`fillna`([value, method, axis, inplace, ...])	Fill NA/NaN values using the specified method.
`filter`([items, like, regex, axis])	Subset the dataframe rows or columns according to the specified index labels.
`first`(offset)	Select initial periods of time series data based on a date offset.
`first_valid_index`()	Return index for first non-NA value or None, if no non-NA value is found.
`floordiv`(other[, axis, level, fill_value])	Get Integer division of dataframe and other, element-wise (binary operator `floordiv`).
`from_dict`(data[, orient, dtype, columns])	Construct DataFrame from dict of array-like or dicts.
`from_records`(data[, index, exclude, ...])	Convert structured or record ndarray to DataFrame.
`ge`(other[, axis, level])	Get Greater than or equal to of dataframe and other, element-wise (binary operator `ge`).
`get`(key)	Get the selection/flag rule by its ID.
`groupby`([by, axis, level, as_index, sort, ...])	Group DataFrame using a mapper or by a Series of columns.
`gt`(other[, axis, level])	Get Greater than of dataframe and other, element-wise (binary operator `gt`).
`head`([n])	Return the first `n` rows.
`hist`([column, by, grid, xlabelsize, xrot, ...])	Make a histogram of the DataFrame's columns.
`idxmax`([axis, skipna, numeric_only])	Return index of first occurrence of maximum over requested axis.
`idxmin`([axis, skipna, numeric_only])	Return index of first occurrence of minimum over requested axis.
`infer_objects`([copy])	Attempt to infer better dtypes for object columns.
`info`([verbose, buf, max_cols, memory_usage, ...])	Print a concise summary of a DataFrame.
`insert`(loc, column, value[, allow_duplicates])	Insert column into DataFrame at specified location.
`interpolate`([method, axis, limit, inplace, ...])	Fill NaN values using an interpolation method.
`isetitem`(loc, value)	Set the given value in the column with position `loc`.
`isin`(values)	Whether each element in the DataFrame is contained in values.
`isna`()	Detect missing values.
`isnull`()	DataFrame.isnull is an alias for DataFrame.isna.
`items`()	Iterate over (column name, Series) pairs.
`iterrows`()	Iterate over DataFrame rows as (index, Series) pairs.
`itertuples`([index, name])	Iterate over DataFrame rows as namedtuples.
`join`(other[, on, how, lsuffix, rsuffix, ...])	Join columns of another DataFrame.
`keys`()	Get the 'info axis' (see Indexing for more).
`kurt`([axis, skipna, numeric_only])	Return unbiased kurtosis over requested axis.
`kurtosis`([axis, skipna, numeric_only])	Return unbiased kurtosis over requested axis.
`last`(offset)	Select final periods of time series data based on a date offset.
`last_valid_index`()	Return index for last non-NA value or None, if no non-NA value is found.
`le`(other[, axis, level])	Get Less than or equal to of dataframe and other, element-wise (binary operator `le`).
`lt`(other[, axis, level])	Get Less than of dataframe and other, element-wise (binary operator `lt`).
`map`(func[, na_action])	Apply a function to a DataFrame elementwise.
`mask`(cond[, other, inplace, axis, level])	Replace values where the condition is True.
`max`([axis, skipna, numeric_only])	Return the maximum of the values over the requested axis.
`mean`([axis, skipna, numeric_only])	Return the mean of the values over the requested axis.
`median`([axis, skipna, numeric_only])	Return the median of the values over the requested axis.
`melt`([id_vars, value_vars, var_name, ...])	Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.
`memory_usage`([index, deep])	Return the memory usage of each column in bytes.
`merge`(how[, on])	Merge selection rules using a specific type of join.
`min`([axis, skipna, numeric_only])	Return the minimum of the values over the requested axis.
`mod`(other[, axis, level, fill_value])	Get Modulo of dataframe and other, element-wise (binary operator `mod`).
`mode`([axis, numeric_only, dropna])	Get the mode(s) of each element along the selected axis.
`mul`(other[, axis, level, fill_value])	Get Multiplication of dataframe and other, element-wise (binary operator `mul`).
`multiply`(other[, axis, level, fill_value])	Get Multiplication of dataframe and other, element-wise (binary operator `mul`).
`ne`(other[, axis, level])	Get Not equal to of dataframe and other, element-wise (binary operator `ne`).
`nlargest`(n, columns[, keep])	Return the first `n` rows ordered by `columns` in descending order.
`notna`()	Detect existing (non-missing) values.
`notnull`()	DataFrame.notnull is an alias for DataFrame.notna.
`nsmallest`(n, columns[, keep])	Return the first `n` rows ordered by `columns` in ascending order.
`nunique`([axis, dropna])	Count number of distinct elements in specified axis.
`pad`(*[, axis, inplace, limit, downcast])	Fill NA/NaN values by propagating the last valid observation to next valid.
`pct_change`([periods, fill_method, limit, freq])	Fractional change between the current and a prior element.
`pipe`(func, *args, **kwargs)	Apply chainable functions that expect Series or DataFrames.
`pivot`(*, columns[, index, values])	Return reshaped DataFrame organized by given index / column values.
`pivot_table`([values, index, columns, ...])	Create a spreadsheet-style pivot table as a DataFrame.
`plot`	alias of `PlotAccessor`
`pop`(item)	Return item and drop from frame.
`pow`(other[, axis, level, fill_value])	Get Exponential power of dataframe and other, element-wise (binary operator `pow`).
`prod`([axis, skipna, numeric_only, min_count])	Return the product of the values over the requested axis.
`product`([axis, skipna, numeric_only, min_count])	Return the product of the values over the requested axis.
`quantile`([q, axis, numeric_only, ...])	Return values at the given quantile over requested axis.
`query`(expr, *[, inplace])	Query the columns of a DataFrame with a boolean expression.
`radd`(other[, axis, level, fill_value])	Get Addition of dataframe and other, element-wise (binary operator `radd`).
`rank`([axis, method, numeric_only, ...])	Compute numerical data ranks (1 through n) along axis.
`rdiv`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator `rtruediv`).
`reindex`([labels, index, columns, axis, ...])	Conform DataFrame to new index with optional filling logic.
`reindex_like`(other[, method, copy, limit, ...])	Return an object with matching indices as other object.
`remove`([id, tag])	Remove (delete) one or more selection rules.
`rename`([mapper, index, columns, axis, copy, ...])	Rename columns or index labels.
`rename_axis`([mapper, index, columns, axis, ...])	Set the name of the axis for the index or columns.
`reorder_levels`(order[, axis])	Rearrange index levels using input order.
`replace`([to_replace, value, inplace, limit, ...])	Replace values given in `to_replace` with `value`.
`resample`(rule[, axis, closed, label, ...])	Resample time-series data.
`reset_index`([level, drop, inplace, ...])	Reset the index, or a level of it.
`rfloordiv`(other[, axis, level, fill_value])	Get Integer division of dataframe and other, element-wise (binary operator `rfloordiv`).
`rmod`(other[, axis, level, fill_value])	Get Modulo of dataframe and other, element-wise (binary operator `rmod`).
`rmul`(other[, axis, level, fill_value])	Get Multiplication of dataframe and other, element-wise (binary operator `rmul`).
`rolling`(window[, min_periods, center, ...])	Provide rolling window calculations.
`round`([decimals])	Round a DataFrame to a variable number of decimal places.
`rpow`(other[, axis, level, fill_value])	Get Exponential power of dataframe and other, element-wise (binary operator `rpow`).
`rsub`(other[, axis, level, fill_value])	Get Subtraction of dataframe and other, element-wise (binary operator `rsub`).
`rtruediv`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator `rtruediv`).
`sample`([n, frac, replace, weights, ...])	Return a random sample of items from an axis of object.
`select_dtypes`([include, exclude])	Return a subset of the DataFrame's columns based on the column dtypes.
`sem`([axis, skipna, ddof, numeric_only])	Return unbiased standard error of the mean over requested axis.
`set_axis`(labels, *[, axis, copy])	Assign desired index to given axis.
`set_flags`(*[, copy, allows_duplicate_labels])	Return a new object with updated flags.
`set_index`(keys, *[, drop, append, inplace, ...])	Set the DataFrame index using existing columns.
`shift`([periods, freq, axis, fill_value, suffix])	Shift index by desired number of periods with an optional time `freq`.
`show`()	Print the current selection rules.
`skew`([axis, skipna, numeric_only])	Return unbiased skew over requested axis.
`sort_index`(*[, axis, level, ascending, ...])	Sort object by labels (along an axis).
`sort_values`(by, *[, axis, ascending, ...])	Sort by the values along either axis.
`sparse`	alias of `SparseFrameAccessor`
`squeeze`([axis])	Squeeze 1 dimensional axis objects into scalars.
`stack`([level, dropna, sort, future_stack])	Stack the prescribed level(s) from columns to index.
`std`([axis, skipna, ddof, numeric_only])	Return sample standard deviation over requested axis.
`sub`(other[, axis, level, fill_value])	Get Subtraction of dataframe and other, element-wise (binary operator `sub`).
`subtract`(other[, axis, level, fill_value])	Get Subtraction of dataframe and other, element-wise (binary operator `sub`).
`sum`([axis, skipna, numeric_only, min_count])	Return the sum of the values over the requested axis.
`swapaxes`(axis1, axis2[, copy])	Interchange axes and swap values axes appropriately.
`swaplevel`([i, j, axis])	Swap levels i and j in a `MultiIndex`.
`tail`([n])	Return the last `n` rows.
`take`(indices[, axis])	Return the elements in the given positional indices along an axis.
`to_clipboard`(*[, excel, sep])	Copy object to the system clipboard.
`to_csv`([path_or_buf, sep, na_rep, ...])	Write object to a comma-separated values (csv) file.
`to_dict`([orient, into, index])	Convert the DataFrame to a dictionary.
`to_excel`(excel_writer, *[, sheet_name, ...])	Write object to an Excel sheet.
`to_feather`(path, **kwargs)	Write a DataFrame to the binary Feather format.
`to_gbq`(destination_table, *[, project_id, ...])	Write a DataFrame to a Google BigQuery table.
`to_hdf`(path_or_buf, *, key[, mode, ...])	Write the contained data to an HDF5 file using HDFStore.
`to_html`([buf, columns, col_space, header, ...])	Render a DataFrame as an HTML table.
`to_json`([path_or_buf, orient, date_format, ...])	Convert the object to a JSON string.
`to_latex`([buf, columns, header, index, ...])	Render object to a LaTeX tabular, longtable, or nested table.
`to_markdown`([buf, mode, index, storage_options])	Print DataFrame in Markdown-friendly format.
`to_numpy`([dtype, copy, na_value])	Convert the DataFrame to a NumPy array.
`to_orc`([path, engine, index, engine_kwargs])	Write a DataFrame to the ORC format.
`to_parquet`([path, engine, compression, ...])	Write a DataFrame to the binary parquet format.
`to_period`([freq, axis, copy])	Convert DataFrame from DatetimeIndex to PeriodIndex.
`to_pickle`(path, *[, compression, protocol, ...])	Pickle (serialize) object to file.
`to_records`([index, column_dtypes, index_dtypes])	Convert DataFrame to a NumPy record array.
`to_sql`(name, con, *[, schema, if_exists, ...])	Write records stored in a DataFrame to a SQL database.
`to_stata`(path, *[, convert_dates, ...])	Export DataFrame object to Stata dta format.
`to_string`([buf, columns, col_space, header, ...])	Render a DataFrame to a console-friendly tabular output.
`to_timestamp`([freq, how, axis, copy])	Cast to DatetimeIndex of timestamps, at beginning of period.
`to_xarray`()	Return an xarray object from the pandas object.
`to_xml`([path_or_buffer, index, root_name, ...])	Render a DataFrame to an XML document.
`transform`(func[, axis])	Call `func` on self producing a DataFrame with the same axis shape as self.
`transpose`(*args[, copy])	Transpose index and columns.
`truediv`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator `truediv`).
`truncate`([before, after, axis, copy])	Truncate a Series or DataFrame before and after some index value.
`tz_convert`(tz[, axis, level, copy])	Convert tz-aware axis to target time zone.
`tz_localize`(tz[, axis, level, copy, ...])	Localize tz-naive index of a Series or DataFrame to target time zone.
`unstack`([level, fill_value, sort])	Pivot a level of the (necessarily hierarchical) index labels.
`update`(other[, join, overwrite, ...])	Modify in place using non-NA values from another DataFrame.
`value_counts`([subset, normalize, sort, ...])	Return a Series containing the frequency of each distinct row in the Dataframe.
`var`([axis, skipna, ddof, numeric_only])	Return unbiased variance over requested axis.
`where`(cond[, other, inplace, axis, level])	Replace values where the condition is False.
`xs`(key[, axis, level, drop_level])	Return cross-section from the Series/DataFrame.

alias(aliases)[source]#

Alias a set of keywords to existing columns. Multiple aliases for a single column are allowed, e.g., {'glon': 'crval2', 'lon': 'crval2'}.

Aliases whose target columns don’t exist in the DataFrame are silently skipped. This allows default aliases to work with partial index data (e.g., from .index files that don’t include WCS columns).

Parameters:

aliases{}: The dictionary of keywords and column names, where the new alias is the key and the column name is the value, i.e., {alias: column}.

Returns:

None.
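The skipping behaviour described above can be illustrated with a minimal sketch. This is not dysh's implementation: `resolve_aliases` is a hypothetical helper, the column names are made up, and a plain list stands in for the DataFrame's columns.

```python
def resolve_aliases(aliases, columns):
    """Return only the {alias: column} pairs whose target column exists.

    Hypothetical helper for illustration; not part of dysh.
    """
    available = set(columns)
    return {alias: col for alias, col in aliases.items() if col in available}


# Two aliases point at the same column; one points at a missing column.
columns = ["crval2", "object"]
aliases = {"glon": "crval2", "lon": "crval2", "lat": "crval3"}
accepted = resolve_aliases(aliases, columns)
print(accepted)
```

Here `lat` is dropped without an error because `crval3` is not among the available columns, while both aliases of `crval2` are kept.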

property aliases#

The aliases that may be used to refer to SDFITS columns.

Returns:

dict: The dictionary of aliases and SDFITS column names

clear()[source]#: Remove all selection rules

columns_selected()[source]#

The names of any columns which were used in a selection rule

Returns:

colnames - set: A set of str column names. An empty set is returned if no selection rule has yet been made.

property final#

Create the final selection. This is done by a logical AND of each of the selection rules (specifically pandas.merge(how='inner')).

Returns:

finalDataFrame: The resultant selection from all the rules.

get(key)[source]#

Get the selection/flag rule by its ID

Parameters:

keyint: The ID value. See show().

Returns:

DataFrame: The selection/flag rule

merge(how, on=None)[source]#

Merge selection rules using a specific type of join.

Parameters:

how{‘left’, ‘right’, ‘outer’, ‘inner’, ‘cross’}, no default.: The type of join to be performed. See pandas.merge().
on: label or list: Column or index level names to join on. These must be found in both DataFrames. If on is None and not merging on indexes then this defaults to the intersection of the columns in both DataFrames.

Returns:

finalDataFrame: The resultant selection from all the rules.

remove(id=None, tag=None)[source]#

Remove (delete) a selection rule(s). You must specify either id or tag but not both. If there are multiple rules with the same tag, they will all be deleted.

Parameters:

idint: The ID number of the rule as displayed in show()
tagstr: An identifying tag by which the rule may be referred to later.

show()[source]#

Print the current selection rules. Only columns with a rule are shown. The first two columns are the ID number and a TAG string. Either of these may be used to remove() a row. The final column, # SELECTED, gives the number of rows that a given rule selects from the original. The final selection may have fewer rows because the selection rules are logically AND’ed to create it.

Returns:

None.

class dysh.util.timers.Benchmark(description=None, logger=None, track_memory=False)[source]#

Bases: object

Simple context manager for timing code blocks with optional memory tracking.

Parameters:

descriptionstr, optional: Description of the operation being timed
loggercallable, optional: Logging function to use (default: print). Can be a logger.debug, logger.info, etc.
track_memorybool, optional: Whether to track memory usage (default: False for minimal overhead)

Examples

>>> with Benchmark("Loading data", logger=logger.debug):
...     data = load_data()
Loading data in 1.234 seconds

>>> with Benchmark("Processing", track_memory=True):
...     process(data)
Processing in 2.345 seconds, memory delta: 150.23 MB
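For readers unfamiliar with the pattern, a timing context manager can be sketched with the standard library alone. `SimpleTimer` below is a hypothetical stand-in, not the Benchmark class itself; it omits memory tracking, and its message format only imitates the output shown above.

```python
import time


class SimpleTimer:
    """Hypothetical timing context manager (not dysh's Benchmark)."""

    def __init__(self, description=None, logger=print):
        self.description = description or "Block"
        self.logger = logger  # any callable that accepts a string
        self.elapsed = None

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.perf_counter() - self._start
        self.logger(f"{self.description} in {self.elapsed:.3f} seconds")
        return False  # do not swallow exceptions raised in the block


messages = []
with SimpleTimer("Summing", logger=messages.append) as timer:
    total = sum(range(100_000))
```

Passing a callable as `logger` is what lets the same class write to `print`, `logger.debug`, or, as here, a list.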

class dysh.util.timers.DTime(benchname='generic', units='ms', active=True, data_cols=None, data_units=None, data_types=None, args=None)[source]#

Bases: object

This class encapsulates some popular timing/performance tools.

Parameters:

benchnamestr

Identifying name of the benchmark stored in the metadata of the table

unitsstr

Units. Allowed are “ms” (the default), others not implemented yet, if ever.

data_colslist

List of names of the extra columns (in addition to the default name and time) written to an Astropy Table at the report stage of this class.

data_unitslist

List of unit names of the extra columns.

data_typeslist

List of data types of the extra columns.

args: dict

This dictionary controls a number of common variables used in dysh benchmarking.

out : output filename (astropy Table). Default is no file is written.
append : append to previous output file (astropy Table).
overwrite : overwrite a previous output file (astropy Table).
profile : run the profiler. Default False.
statslines : number of profiler statistics lines to print. Default 25.
sortkey : how to sort the profiler statistics, “cumulative” or “time”. Default “cumulative” (SortKey.CUMULATIVE).

Methods

`disable`()	Disable the profiler
`enable`()	Enable the profiler
`total`()	report total CPU time so far

active
close
report
tag

Examples

>>> dt = DTime()
>>> dt.tag("test1")
>>> dt.tag("test2")
>>> dt.tag("test3")
>>> dt.report()

By default it simply builds a delta-time of the time it took between the different tags, as labeled by their tag name. If DTime() is supplied a number of data items for extra columns, these will be reported, or stored in a table, if out= is supplied.
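The delta-time bookkeeping described above can be sketched as follows. `TagTimer` is a hypothetical class, not DTime: it records only a name and the elapsed milliseconds since the previous tag, with no extra data columns, table output, or profiler.

```python
import time


class TagTimer:
    """Hypothetical delta-time recorder; not dysh's DTime."""

    def __init__(self):
        self._last = time.perf_counter()
        self.rows = []  # (tag name, milliseconds since the previous tag)

    def tag(self, name):
        now = time.perf_counter()
        self.rows.append((name, (now - self._last) * 1000.0))
        self._last = now

    def report(self):
        for name, dt_ms in self.rows:
            print(f"{name:10s} {dt_ms:10.3f} ms")


tt = TagTimer()
tt.tag("test1")
tt.tag("test2")
tt.report()
```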

active()[source]#

close()[source]#

disable()[source]#: Disable the profiler

enable()[source]#: Enable the profiler

report(debug=False)[source]#

tag(name, data=None)[source]#

total()[source]#: report total CPU time so far

Created on Wed Feb 12 13:13:33 2025

@author: mpound

class dysh.util.weatherforecast.BaseWeatherForecast[source]#

Bases: ABC

A generic interface to get weather forecast values from an external source

Methods

fetch

abstract fetch(valueType: list | None = None, mjd: Time | ndarray | None = None, coeffs=None, **kwargs) → ndarray[source]#

class dysh.util.weatherforecast.GBTForecastScriptInterface(path: Path | str = '/users/rmaddale/bin/getForecastValues', **kwargs)[source]#

Bases: object

An interface to call the GBO weather forecast script. Generally, users will not use this class directly, but rather use GBTWeatherForecast.

Parameters:

pathstr or pathlib.Path: The script to run to get forecast values.
debugbool: If True, don’t check that path exists. This is useful for testing when not on GBO network. Default: False

Attributes:

valid_vartypes: List of the valid weather variable type names that can be retrieved.

Methods

__call__([vartype, freq, mjd, coeffs])

Call the GBO weather script and parse the results into numbers.

__call__(vartype: str = 'Opacity', freq: list | None = None, mjd: list | None = None, coeffs: bool = True) → ndarray[source]#

Call the GBO weather script and parse the results into numbers.

Parameters:

vartypestr, optional

Which weather variable to fetch. See Notes for a description of valid values. The default is “Opacity”.

freqlist, optional

An input frequency list in GHz at which to evaluate the weather data. If coeffs=True, the polynomial is fetched first and then the vartype at each frequency is evaluated. The default is None.

mjdlist, optional

An input date list in MJD at which to evaluate the weather data. The default is None.

coeffsbool, optional

Fetch the polynomial coefficients by passing ‘-coeffs’ to the script. This is only valid for `vartype` “Opacity” or “Tatm.” The default is True. If polynomial coefficients are requested, the return values will be computed as a function of frequency:

$value = \sum_{i=0}^{n} C_i \nu^i$

where $C_i$ are the coefficients and $\nu$ is the frequency in GHz. Because the polynomial is only defined from 2 GHz to 116 GHz, for values below 2 GHz the value for 2 GHz will be returned.

Returns:

weather_datandarray: The requested weather data evaluated at the input frequencies and MJDs. These will be sorted by frequency, low to high.

Notes

The vartype name for values that can be returned are described below. These are case-sensitive. This description comes from the help text for getForecastValues and may not fully describe the data that are returned, e.g., when other arguments are ignored.

Opacity: the total zenith opacity
Tatm: the opacity-weighted (i.e., representative) temperature of the atmosphere and is the value that should be used when fitting traditional tipping curves.
AtmTsys: the part of the system temperature due to just the atmosphere, doesn’t include CMB, spillover, and electronics, and is calculated for the specified elevation.
TotalTsys: the above-described Tsys value, augmented by an estimate of the contributions from the receiver, spillover, and the CMB.
Est: the Effective System Temperature for the specified elevation.
Rest: the Relative Effective System Temperature, which includes the contributions from the CMB, spillover, and electronics, and is calculated for the specified elevation. Essentially, the predicted loss in gain due to atmospheric opacity
Trcvr: An estimate of the receiver temperature for the given frequency and MJD
Tau0: The best possible opacity for the given frequency and MJD
Tau10, Tau25, Tau50, Tau75, Tau90: The opacity for various percentile weather conditions for the given frequency and MJD, based on multi-year statistical studies.
Tatm0: The atmospheric temperature Tatm for the given frequency and MJD at the time of the best possible opacity.
Tatm10, Tatm25, Tatm50, Tatm75, Tatm90: The Tatm for various percentile weather conditions for the given frequency and MJD, based on multi-year statistical studies.
Winds: The wind speed in MPH. A specified freqList is ignored.
WindEffect: The predicted loss in point-surface efficiency due to winds
SurfaceEffect: The predicted loss in point-surface efficiency due to a deformed surface during PTCS daytime observing.
TotalEffect: The product of the Rest and the wind and surface effects. That is, the predicted loss in gain due to all the various weather factors.
MinElev: Suggested minimum elevation for an object that rises to the elevation given by the value of -elev. Observing above the suggested elevation should keep the loss in gain due to atmospheric opacity to no more than 70% of the loss at transit (i.e., < factor of ~2 increase in observing time).

Examples

Fetch the wind data for a range of dates.

from dysh.util.weatherforecast import GBTForecastScriptInterface
import numpy as np

g = GBTForecastScriptInterface()
winds = g(vartype="Winds", mjd=np.arange(60722,60732), coeffs=False)
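The polynomial evaluation described for coeffs=True, including the clamp below 2 GHz, can be sketched in plain Python. `eval_forecast_polynomial` is a hypothetical helper and the coefficient values are invented for illustration; they are not real forecast output.

```python
def eval_forecast_polynomial(coefficients, freq_ghz, min_freq_ghz=2.0):
    """Evaluate sum_i C_i * nu**i, clamping nu to the lower validity limit.

    Hypothetical helper; not part of dysh or getForecastValues.
    """
    nu = max(freq_ghz, min_freq_ghz)  # below 2 GHz, reuse the 2 GHz value
    return sum(c * nu**i for i, c in enumerate(coefficients))


coeffs = [0.01, 0.002, 0.0005]  # illustrative only
at_2ghz = eval_forecast_polynomial(coeffs, 2.0)
at_1ghz = eval_forecast_polynomial(coeffs, 1.0)
at_10ghz = eval_forecast_polynomial(coeffs, 10.0)
```

With these made-up coefficients, the 1 GHz request returns the same number as the 2 GHz request, which is the behaviour the parameter description states.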

property valid_vartypes: list#: List of the valid weather variable type names that can be retrieved.

class dysh.util.weatherforecast.GBTWeatherForecast(**kwargs)[source]#

Bases: BaseWeatherForecast

Methods

fetch([specval, vartype, mjd, coeffs])

Call the GBO weather script and parse the results into numbers.

fetch(specval: Quantity | None = None, vartype: str = 'Opacity', mjd: Time | float | None = None, coeffs=True) → ndarray[source]#

Call the GBO weather script and parse the results into numbers. For frequencies below 2 GHz, the value at 2 GHz will be returned, since getForecastValues does not cover frequencies below 2 GHz. Returned values will be sorted by frequency, low to high.

Parameters:

specvalQuantity, optional: The spectral value – frequency or wavelength – at which to compute vartype. For data such as ‘Winds’ that don’t depend on frequency, specval can be None.
vartypestr, optional: Which weather variable to fetch. See Notes for a description of valid values. If the user is not on the GBO network, the only variable available is Opacity.
mjdTime or float: The date at which to compute the opacity. If given as a float, it is interpreted as Modified Julian Day. Default: None, meaning the data will be fetched at the most recent MJD available. If the user is not on the GBO network, this argument is ignored and the opacity will only be a function of frequency.
coeffsbool: If True and at GBO, getForecastValues will be passed the -coeffs argument which returns polynomial coefficients to fit vartype as a function of frequency for each MJD. This is only valid for `vartype` “Opacity” or “Tatm.” Because the polynomial is only defined from 2 GHz to 116 GHz, for values below 2 GHz the value for 2 GHz will be returned.

Returns:

weather_datandarray: The requested weather data evaluated at the input frequencies and MJDs. These will be sorted by frequency, low to high.

Notes

The vartype name for values that can be returned are described below. These are case-sensitive. This description comes from the help text for getForecastValues and may not fully describe the data that are returned, e.g., when other arguments are ignored.

Opacity: the total zenith opacity
Tatm: the opacity-weighted (i.e., representative) temperature of the atmosphere and is the value that should be used when fitting traditional tipping curves.
AtmTsys: the part of the system temperature due to just the atmosphere, doesn’t include CMB, spillover, and electronics, and is calculated for the specified elevation.
TotalTsys: the above-described Tsys value, augmented by an estimate of the contributions from the receiver, spillover, and the CMB.
Est: the Effective System Temperature for the specified elevation.
Rest: the Relative Effective System Temperature, which includes the contributions from the CMB, spillover, and electronics, and is calculated for the specified elevation. Essentially, the predicted loss in gain due to atmospheric opacity
Trcvr: An estimate of the receiver temperature for the given frequency and MJD
Tau0: The best possible opacity for the given frequency and MJD
Tau10, Tau25, Tau50, Tau75, Tau90: The opacity for various percentile weather conditions for the given frequency and MJD, based on multi-year statistical studies.
Tatm0: The atmospheric temperature Tatm for the given frequency and MJD at the time of the best possible opacity.
Tatm10, Tatm25, Tatm50, Tatm75, Tatm90: The Tatm for various percentile weather conditions for the given frequency and MJD, based on multi-year statistical studies.
Winds: The wind speed in MPH. A specified freqList is ignored.
WindEffect: The predicted loss in point-surface efficiency due to winds
SurfaceEffect: The predicted loss in point-surface efficiency due to a deformed surface during PTCS daytime observing.
TotalEffect: The product of the Rest and the wind and surface effects. That is, the predicted loss in gain due to all the various weather factors.
MinElev: Suggested minimum elevation for an object that rises to the elevation given by the value of -elev. Observing above the suggested elevation should keep the loss in gain due to atmospheric opacity to no more than 70% of the loss at transit (i.e., < factor of ~2 increase in observing time).

Examples

Fetch the wind data for a range of dates.

from dysh.util.weatherforecast import GBTWeatherForecast
import numpy as np

g = GBTWeatherForecast()
wind = g.fetch(vartype="Winds", mjd=np.arange(60722,60732), coeffs=False)

Core utility definitions, classes, and functions

dysh.util.core.abbreviate_to(length, value, squeeze=True) → str[source]#

Abbreviate a value for display in limited space. The abbreviated value will have initial characters, ellipsis, and final characters, e.g. ‘[(a,b),(c,d),…,(w,x),(y,z)]’.

Parameters:

lengthint: Maximum string length.
valueany: The value to be abbreviated.
squeezebool, optional: Squeeze blanks. If True, replace “, “ (comma space) with “,” (comma). The default is True.

Returns:

strvstr: Abbreviated string representation of the input value.
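One way to produce a head, ellipsis, tail abbreviation is sketched below. `abbreviate` is a hypothetical function; dysh's abbreviate_to may pick the cut points or the ellipsis character differently.

```python
def abbreviate(length, value, squeeze=True):
    """Shorten str(value) to at most `length` characters, keeping both ends.

    Hypothetical sketch; not dysh's abbreviate_to.
    """
    text = str(value)
    if squeeze:
        text = text.replace(", ", ",")
    if len(text) <= length:
        return text
    ellipsis = "..."
    keep = length - len(ellipsis)
    if keep <= 0:
        return text[:length]
    head = (keep + 1) // 2  # the head gets the extra character if keep is odd
    tail = keep // 2
    return text[:head] + ellipsis + (text[-tail:] if tail else "")


pairs = [(i, i + 1) for i in range(0, 40, 2)]
short = abbreviate(25, pairs)
print(short)
```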

dysh.util.core.calc_vegas_spurs(vsprval: float | ndarray, vspdelt: float | ndarray, vsprpix: float | ndarray, maxchan: float, keep_central=False) → MaskedArray[source]#

Calculate VEGAS spur channel locations.

SPUR_CHANNEL = (J-VSPRVAL)*VSPDELT+VSPRPIX - 1

where 0 <= J < 32.

Spur channels are counted from zero.

Parameters:

vsprvalfloat or ndarray: VEGAS spur channel offset.
vspdeltfloat or ndarray: VEGAS spur separation width in channels.
vsprpixfloat or ndarray: VEGAS spur reference pixel.
maxchanfloat: Maximum channel number (counting from zero), above which calculated spurs are masked.
keep_central: bool: Whether to keep the central VEGAS spur location in the returned array or not. The GBO SDFITS writer by default replaces the value at the central SPUR with the average of the two adjacent channels, and hence the central channel is not typically flagged.

Returns:

masked_array: The array of channel numbers where spurs occur, with shape (N_vsp,31) where N_vsp is the length of the VSP arrays. Invalid spur locations will be masked.

Notes

All input arrays must have the same shape.
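The spur formula can be checked by hand with a scalar-only sketch. `spur_channels` is a hypothetical function: it returns None where dysh would mask a value, it does not treat the central spur specially, and the header values below are made up so the arithmetic is easy to follow.

```python
def spur_channels(vsprval, vspdelt, vsprpix, maxchan):
    """Evaluate (J - VSPRVAL) * VSPDELT + VSPRPIX - 1 for 0 <= J < 32.

    Hypothetical sketch; out-of-range channels become None instead of
    being masked.
    """
    channels = []
    for j in range(32):
        chan = (j - vsprval) * vspdelt + vsprpix - 1
        channels.append(chan if 0 <= chan <= maxchan else None)
    return channels


# Made-up values: spurs every 1024 channels, reference pixel 16385.
spurs = spur_channels(vsprval=16, vspdelt=1024, vsprpix=16385, maxchan=20000)
valid = [c for c in spurs if c is not None]
```

With these inputs J=0 lands on channel 0, J=16 on channel 16384, and every spur from J=20 onward exceeds `maxchan` and is dropped.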

dysh.util.core.consecutive(data, stepsize=1)[source]#

Returns the indices of elements in data that are separated by less than stepsize, split into groups.

Parameters:

dataarray: Array with values to split.
stepsizeint: Maximum separation between elements of data to be considered a single group.

Returns:

groupsndarray: Array with values of data separated into groups.

dysh.util.core.convert_array_to_mask(a, length, value=True)[source]#

This method interprets a simple or compound array and returns a numpy mask of length length. Single arrays/tuples will be treated as element index lists; nested arrays will be treated as inclusive ranges.

Parameters:

anumber or array-like: The channels to mask. See the examples for use.
lengthint: The length of the mask to return, e.g. the number of channels in a spectrum.
valuebool: The value to fill the mask with. True to mask data, False to unmask.

Returns:

maskndarray: A numpy array where the mask is value.

Examples

Mask elements 1 and 10 of a 100-channel spectrum.

>>> convert_array_to_mask([1,10], 100)

Mask elements 1 thru 10 inclusive.

>>> convert_array_to_mask([[1,10]], 100)

Mask ranges 1 thru 10 and 47 thru 56 inclusive, and element 75.

>>> convert_array_to_mask([[1,10], [47,56], 75], 100)

Tuples also work. To do the same as above:

>>> convert_array_to_mask(((1,10), [47,56], 75), 100)
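The interpretation rule (flat entries are indices, nested pairs are inclusive ranges) can be sketched in pure Python. `channels_to_mask` is a hypothetical function that returns a list rather than a NumPy array, and it assumes that entries not named in `a` get the opposite of `value`.

```python
def channels_to_mask(a, length, value=True):
    """Build a boolean list from element indices and inclusive ranges.

    Hypothetical sketch; unlisted entries are assumed to be `not value`.
    """
    mask = [not value] * length
    items = a if isinstance(a, (list, tuple)) else [a]
    for item in items:
        if isinstance(item, (list, tuple)):
            first, last = item
            for i in range(first, last + 1):  # inclusive on both ends
                mask[i] = value
        else:
            mask[item] = value
    return mask


mask = channels_to_mask([[1, 10], [47, 56], 75], 100)
two_elements = channels_to_mask([1, 10], 100)
```

The two calls show why the nesting matters: `[[1, 10]]`-style input selects ten channels per range, while the flat `[1, 10]` selects only two.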

dysh.util.core.eliminate_flagged_rows(df, flag)[source]#

Remove rows from an index (selection) where all channels have been flagged.

Parameters:

dfDataFrame: The input dataframe from which flagged rows will be removed.
flagDataFrame: The flag dataframe. Should be the result of e.g. final

Returns:

A data frame which is the input data frame with flagged rows removed.

dysh.util.core.gbt_timestamp_to_time(timestamp)[source]#

Convert the GBT SDFITS timestamp string format to a Time object. GBT SDFITS timestamps have the form YYYY_MM_DD_HH:MM:SS in UTC.

Parameters:

timestampstr or list-like: The GBT format timestamp as described above. If str, a Time object containing a single time is returned. If list-like, a Time object containing multiple UTC times is returned.

Returns:

timeTime: The time object
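The timestamp layout can be parsed with the standard library as a quick sanity check. `parse_gbt_timestamp` is a hypothetical helper that returns a timezone-aware datetime, whereas the dysh function returns an astropy Time; the timestamp value is made up.

```python
from datetime import datetime, timezone


def parse_gbt_timestamp(timestamp):
    """Parse 'YYYY_MM_DD_HH:MM:SS' into a timezone-aware UTC datetime.

    Hypothetical stdlib-only sketch; not dysh's gbt_timestamp_to_time.
    """
    parsed = datetime.strptime(timestamp, "%Y_%m_%d_%H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


t = parse_gbt_timestamp("2021_02_10_07:38:37")
print(t.isoformat())
```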

dysh.util.core.generate_tag(values, hashlen, add_time=True)[source]#

Generate a unique tag based on input values. A hash object is created from the input values using SHA256, and a hex representation is created. The first hashlen characters of the hex string are returned.

Parameters:

valuesarray-like: The values to use in creating the hash object.
hashlenint, optional: The length of the returned hash string.
add_timebool: Add the time of the call to the values for hash generation.

Returns:

tagstr: The hash string.
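A truncated SHA256 tag can be sketched with hashlib. `make_tag` is a hypothetical function; in particular, joining the string form of each value before hashing is an assumption about how the inputs are combined, so its output will not necessarily match dysh's tags.

```python
import hashlib
import time


def make_tag(values, hashlen, add_time=True):
    """Return the first `hashlen` hex characters of a SHA256 digest.

    Hypothetical sketch; the way values are combined is an assumption.
    """
    parts = [str(v) for v in values]
    if add_time:
        parts.append(str(time.time_ns()))  # makes repeated calls differ
    digest = hashlib.sha256("".join(parts).encode("utf-8")).hexdigest()
    return digest[:hashlen]


tag = make_tag(["OBJECT", "NGC2415", 42], 9, add_time=False)
```

With `add_time=False` the tag is reproducible for the same inputs; adding the call time trades that reproducibility for uniqueness.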

dysh.util.core.get_project_data() → Path[source]#

Returns the directory where dysh configuration files are kept.

Returns:

Path: The project configuration directory.

dysh.util.core.get_project_root() → Path[source]#: Returns the project root directory.

dysh.util.core.get_project_testdata() → Path[source]#: Returns the project testdata directory

dysh.util.core.get_size(obj, seen=None)[source]#: Recursively finds size of objects. See https://goshippo.com/blog/measure-real-size-any-python-object/

dysh.util.core.get_valid_channel_range(channel: list | ndarray) → list[source]#

Check that a channel range (e.g., that was given to Selection) defines a contiguous range of channels and return a list of length 2 if valid. The returned list can be used in data calibration. Unlike select_channel(), if the input channel list has two elements, it will be assumed that channel[0] is the first chan and channel[1] is the last channel. However, channel ranges specified identically to select_channel() can also be used as input.

Parameters:

channellist|np.ndarray: List of beginning and end channel

Returns:

list: An inclusive list of [first_channel, last_channel]

Raises:

ValueError: If the input channel list cannot be converted to [first,last]

dysh.util.core.grouper(iterable, n, *, incomplete='fill', fillvalue=None)[source]#: Collect data into non-overlapping fixed-length chunks or blocks.
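The signature matches the `grouper` recipe in the Python itertools documentation. Assuming the behaviour follows that recipe (an assumption, since the docstring does not describe the `incomplete` modes), it can be sketched as:

```python
from itertools import zip_longest


def grouper_sketch(iterable, n, *, incomplete="fill", fillvalue=None):
    """Collect items into non-overlapping chunks of length n."""
    iterators = [iter(iterable)] * n  # n references to the same iterator
    if incomplete == "fill":
        return zip_longest(*iterators, fillvalue=fillvalue)
    if incomplete == "strict":
        return zip(*iterators, strict=True)  # requires Python 3.10+
    if incomplete == "ignore":
        return zip(*iterators)
    raise ValueError("Expected fill, strict, or ignore")


filled = list(grouper_sketch("ABCDEFG", 3, fillvalue="x"))
dropped = list(grouper_sketch("ABCDEFG", 3, incomplete="ignore"))
```

In this sketch, "fill" pads the last chunk, "ignore" drops it, and "strict" raises ValueError when the input length is not a multiple of n.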

dysh.util.core.in_notebook() → bool[source]#

Check if the code is being run inside a notebook.

This returns False for Spyder and other IPython-based IDEs that are not Jupyter environments, even though they may have IPKernelApp.

dysh.util.core.indices_where_value_changes(colname, df)[source]#

Find the DataFrame indices where the value of the input column name changes.

Parameters:

colnamestr: The column name to query.
dfDataFrame: The DataFrame to search

Returns:

indicesnumpy.ndarray: The indices of the DataFrame where colname changes value.

dysh.util.core.inner_channel_slice(nchan: int, fedge: float = 0.1) → slice[source]#

Return a slice cropping fedge channels at each end. This is inclusive on the upper end to reproduce GBTIDL results.

Parameters:

nchanint: Number of channels.
fedgefloat: Fraction of edges to crop on each end. For example, fedge=0.1 will crop 10% of the channels on each end.

Returns:

channel_sliceslice: Slice that will select the inner 1-fedge channels.

dysh.util.core.isot_to_mjd(isot)[source]#: Convert an ISOT string to MJD.

dysh.util.core.keycase(d, case='upper')[source]#

Change the case of dictionary keys

Parameters:

ddict: The input dictionary
casestr, one of ‘upper’, ‘lower’: Case to change keys to. The default is “upper”.

Returns:

newDictdict: A copy of the dictionary with keys changed according to case
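A minimal sketch of the key-case conversion, assuming all keys are strings. `keycase_sketch` is a hypothetical name, and raising on an unrecognised `case` is a choice made here, not documented behaviour.

```python
def keycase_sketch(d, case="upper"):
    """Return a copy of `d` with every key upper- or lower-cased."""
    if case not in ("upper", "lower"):
        raise ValueError("case must be 'upper' or 'lower'")
    convert = str.upper if case == "upper" else str.lower
    return {convert(key): value for key, value in d.items()}


header = {"object": "NGC2415", "Scan": 152}
upper = keycase_sketch(header)
lower = keycase_sketch(header, case="lower")
```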

dysh.util.core.merge_ranges(ranges)[source]#

Merge overlapping and adjacent ranges and yield the merged ranges in order. The argument must be an iterable of pairs (start, stop).

Taken from: https://codereview.stackexchange.com/a/21333

Parameters:

rangesiterable: Pairs of (start, stop) ranges.

Yields:

iterable: Merged ranges.

Examples

>>> list(merge_ranges([(5,7), (3,5), (-1,3)]))
[(-1, 7)]
>>> list(merge_ranges([(5,6), (3,4), (1,2)]))
[(1, 2), (3, 4), (5, 6)]
>>> list(merge_ranges([]))
[]

dysh.util.core.minimum_list_match(strings, valid_strings, casefold=False)[source]#

Return the list of valid strings given a list of minimum string inputs.

Parameters:

stringsstr or list of str: The strings to compare for minimum match
valid_stringslist of str: list of full strings to min match on.
casefold: bool: If True, do a case insensitive match

Returns:

list: List of all minimum matches or None if no matches found

dysh.util.core.minimum_string_match(s, valid_strings, casefold=False)[source]#

Return the valid string from a list, given a minimum string input.

Example: minimum_string_match('a', ['alpha', 'beta', 'gamma']) returns 'alpha'.

Parameters:

sstring: string to use for minimum match
valid_stringslist of strings: list of full strings to minimum match on.
casefold: bool: If True, do a case insensitive match

Returns:

string: matched string, if one is found. An exact match will also count as a match, even if others are present with longer match. Otherwise “None” is returned.
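Minimum matching can be sketched as a prefix search in which an exact match takes priority. `min_match` is a hypothetical function; returning None when a prefix matches more than one candidate is an assumption, since the text above does not spell out the ambiguous case.

```python
def min_match(s, valid_strings, casefold=False):
    """Return the single valid string that starts with `s`, else None.

    Hypothetical sketch; ambiguous prefixes are assumed to return None.
    """
    def norm(text):
        return text.casefold() if casefold else text

    target = norm(s)
    for candidate in valid_strings:
        if norm(candidate) == target:  # an exact match always wins
            return candidate
    matches = [c for c in valid_strings if norm(c).startswith(target)]
    return matches[0] if len(matches) == 1 else None


unique = min_match("a", ["alpha", "beta", "gamma"])
exact = min_match("alp", ["alp", "alpha"])
ambiguous = min_match("al", ["alpha", "alps"])
```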

dysh.util.core.powerof2(number)[source]#

Computes the closest power of 2 for a given number.

Parameters:

numberfloat: number to determine the closest power of 2.

Returns:

pow2int: the closest power of 2.

dysh.util.core.replace_col_astype(t: Table, colname: str, astype, fill_value)[source]#

dysh.util.core.select_from(key, value, df)[source]#

Select data where key=value.

Parameters:

keystr: The key value (SDFITS column name)
valueany: The value to match
dfDataFrame: The DataFrame to search

Returns:

dfDataFrame: The subselected DataFrame

dysh.util.core.show_dataframe(df, show_index=False, max_rows=None, max_cols=None)[source]#

Function to show a DataFrame in IPython or Jupyter.

Parameters:

dfDataFrame: The DataFrame to be shown.
show_indexbool: Show the index of the DataFrame.
max_rowsint or None: Maximum number of rows to display.
max_colsint or None: Maximum number of columns to display.

dysh.util.core.sq_weighted_avg(a, axis=0, weights=None)[source]#

Compute the mean square weighted average of an array (2nd moment).

$v = \sqrt{\frac{\sum_i{w_i~a_i^{2}}}{\sum_i{w_i}}}$

Parameters:

andarray: The data to average
axisint: The axis over which to average the data. Default: 0
weightsndarray or None: The weights to use in averaging. The weights array must be the length of the axis over which the average is taken. Default: None will use equal weights.

Returns:

averagendarray: The average along the input axis
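For a one-dimensional input the formula above reduces to a few lines of plain Python. `sq_weighted_avg_1d` is a hypothetical helper that ignores the `axis` argument and works on sequences rather than NumPy arrays.

```python
import math


def sq_weighted_avg_1d(a, weights=None):
    """Return sqrt(sum(w_i * a_i**2) / sum(w_i)) for a 1-D sequence."""
    if weights is None:
        weights = [1.0] * len(a)  # equal weights
    if len(weights) != len(a):
        raise ValueError("weights must have the same length as a")
    numerator = sum(w * x * x for w, x in zip(weights, a))
    return math.sqrt(numerator / sum(weights))


equal = sq_weighted_avg_1d([3.0, 4.0])
weighted = sq_weighted_avg_1d([3.0, 4.0], weights=[1.0, 3.0])
```

With equal weights the result is sqrt((9 + 16) / 2); weighting the second element three times as heavily gives sqrt((9 + 48) / 4).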

dysh.util.core.to_mjd_list(time_val: Time | float) → ndarray[source]#

Convert an astropy Time, list of MJD, or single MJD to a list of MJD

Parameters:

time_valTime or float or list of float: The time value to convert.

Returns:

mjdnp.ndarray: The Modified Julian Day values in an array (or None if time_val was None).

dysh.util.core.to_quantity_list(q: Quantity | Sequence) → Quantity[source]#

dysh.util.core.uniq(seq)[source]#: Remove duplicates from a list while preserving order. from http://stackoverflow.com/questions/480214/how-do-you-remove-duplicates-from-a-list-in-python-whilst-preserving-order
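Order-preserving de-duplication is commonly written with a set that tracks what has been seen. The sketch below shows that idiom under the assumption that the items are hashable; `uniq_sketch` is a hypothetical name and the scan numbers are made up.

```python
def uniq_sketch(seq):
    """Return the items of `seq` without duplicates, in first-seen order."""
    seen = set()
    result = []
    for item in seq:
        if item not in seen:  # items must be hashable
            seen.add(item)
            result.append(item)
    return result


scans = uniq_sketch([152, 153, 152, 154, 153])
print(scans)
```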
