Note
This tutorial was generated from a Jupyter notebook that can be downloaded here.
Pulsar Data Toolbox:¶
psrfits
class¶
The psrfits
class allows easy access to the specialized FITS files
used in the Pulsar/Radio Astronomy community know as PSRFITS files. The
standard can be found on the CSIRO Pulsar Group
website.
In the current version of pdat
this class is based on the Python
package fitsio
which is a wrapper for the c-library cfitsio
. In
the future we plan to also make a version that uses the
astropy.io.fits
package, however the c
library is fast,
efficient, allows appending and accessing of BinTables without loading
the whole file to memory. Since PSRFITS files carry large BinTables
these types of efficiencies are very useful.
Loading and Appending¶
import pdat
import os
pFits1 = '../../../templates/search_scratch.fits'
pFits2 = '../../../templates/search_template.fits'
Check file sizes¶
a=os.path.getsize(pFits1)
b=os.path.getsize(pFits2)
print('Size of 1st file:',a)
print('Size of 2nd file:',b)
Size of 1st file: 5302080
Size of 2nd file: 5302080
Load files¶
psrf1 = pdat.psrfits(pFits1)
Loading PSRFITS file from path:
'../../../templates/search_scratch.fits'.
Append the Secondary BinTables to an existing PSRFITS¶
The append_from_file
method appends all of the secondary BinTables
of a PSRFITS, given as a file path, to the already loaded PSRFITS. The
secondary BinTables include SUBINT
,POLYCO
, HISTORY
and
PARAM
. This is only possible between identical mode
files
(SEARCH
, PSR
or CAL
). By default the order of the tables is
assumed identical. If the BinTables are in different orders there is an
optional table
flag to provide a list of the order of the original
BinTables. Alternatively, you may only select a subset of BinTables to
append.
psrf1.append_from_file(pFits2)
os.path.getsize(pFits1)
5302080
Checking the size we see it has grown, but not doubled. That is because
the PRIMARY
header was not changed.
The psrfits
class comes with all of the functionality built into
fitsio
. The class represents a list of HDUs. The header information
is accesible through the read_header
method.
psrf1[1].read_header()
XTENSION= 'BINTABLE' / * Subintegration data * BITPIX = 8 / N/A NAXIS = 2 / 2-dimensional binary table NAXIS1 = 264268 / width of table in bytes NAXIS2 = 20 / Number of rows in table (NSUBINT) PCOUNT = 0 / size of special data area GCOUNT = 1 / one data group (required keyword) TFIELDS = 17 / Number of fields per row TTYPE1 = 'TSUBINT ' / Length of subintegration TFORM1 = '1D ' / Double TTYPE2 = 'OFFS_SUB' / Offset from Start of subint centre TFORM2 = '1D ' / Double TTYPE3 = 'LST_SUB ' / LST at subint centre TFORM3 = '1D ' / Double TTYPE4 = 'RA_SUB ' / RA (J2000) at subint centre TFORM4 = '1D ' / Double TTYPE5 = 'DEC_SUB ' / Dec (J2000) at subint centre TFORM5 = '1D ' / Double TTYPE6 = 'GLON_SUB' / [deg] Gal longitude at subint centre TFORM6 = '1D ' / Double TTYPE7 = 'GLAT_SUB' / [deg] Gal latitude at subint centre TFORM7 = '1D ' / Double TTYPE8 = 'FD_ANG ' / [deg] Feed angle at subint centre TFORM8 = '1E ' / Float TTYPE9 = 'POS_ANG ' / [deg] Position angle of feed at subint centre TFORM9 = '1E ' / Float TTYPE10 = 'PAR_ANG ' / [deg] Parallactic angle at subint centre TFORM10 = '1E ' / Float TTYPE11 = 'TEL_AZ ' / [deg] Telescope azimuth at subint centre TFORM11 = '1E ' / Float TTYPE12 = 'TEL_ZEN ' / [deg] Telescope zenith angle at subint centre TFORM12 = '1E ' / Float TTYPE13 = 'DAT_FREQ' / [MHz] Centre frequency for each channel TFORM13 = ' 128E' / NCHAN floats TTYPE14 = 'DAT_WTS ' / Weights for each channel TFORM14 = ' 128E' / NCHAN floats TTYPE15 = 'DAT_OFFS' / Data offset for each channel TFORM15 = ' 128E' / NCHAN*NPOL floats TTYPE16 = 'DAT_SCL ' / Data scale factor for each channel TFORM16 = ' 128E' / NCHAN*NPOL floats TTYPE17 = 'DATA ' / Subint data table TFORM17 = ' 262144B' / NBIN*NCHAN*NPOL*NSBLK int, byte(B) or bit(X) TDIM17 = '(1, 128, 1, 2048)' / Dimensions (NBITS or NBIN,NCHAN,NPOL,NSBLK) INT_TYPE= 'TIME ' / Time axis (TIME, BINPHSPERI, BINLNGASC, etc) INT_UNIT= 'SEC ' / Unit of time axis (SEC, PHS (0-1), DEG) SCALE = 'FluxDen ' / Intensity units (FluxDen/RefFlux/Jansky) NPOL = 1 / Nr of polarisations POL_TYPE= 'IQUV ' / Polarisation identifier (e.g., AABBCRCI, AA+BB) TBIN = 2.04833984375E-05 / [s] Time per bin or sample NBIN = 1 / Nr of bins (PSR/CAL mode; else 1) NBIN_PRD= 0 / Nr of bins/pulse period (for gated data) PHS_OFFS= 0. / Phase offset of bin 0 for gated data NBITS = 8 / Nr of bits/datum (SEARCH mode 'X' data, else 1) NSUBOFFS= 0 / Subint offset (Contiguous SEARCH-mode files) NCHAN = 128 / Number of channels/sub-bands in this file CHAN_BW = 1.5625 / [MHz] Channel/sub-band width NCHNOFFS= 0 / Channel/sub-band offset for split files NSBLK = 2048 / Samples/row (SEARCH mode, else 1) EXTNAME = 'SUBINT ' / name of this binary table extension TUNIT1 = 's ' / Units of field TUNIT2 = 's ' / Units of field TUNIT3 = 's ' / Units of field TUNIT4 = 'deg ' / Units of field TUNIT5 = 'deg ' / Units of field TUNIT6 = 'deg ' / Units of field TUNIT7 = 'deg ' / Units of field TUNIT8 = 'deg ' / Units of field TUNIT9 = 'deg ' / Units of field TUNIT10 = 'deg ' / Units of field TUNIT11 = 'deg ' / Units of field TUNIT12 = 'deg ' / Units of field TUNIT13 = 'MHz ' / Units of field TUNIT17 = 'Jy ' / Units of subint data EXTVER = 1 / auto assigned by template parser
The data in a PSRFITS
is found in the SUBINT
BinTable.
psrf1
file: ../../../templates/search_scratch.fits
mode: READWRITE
extnum hdutype hduname[v]
0 IMAGE_HDU
1 BINARY_TBL SUBINT[1]
Here SUBINT
is the 2nd HDU. The data is accesible as a
numpy.recarray
with NSUBINT
rows. Think of a recarray as a
spreadsheet where the individual entries can be strings, floats or whole
arrays.
data=psrf1[1].read()
print(data.shape)
data.dtype.descr
(20,)
[('TSUBINT', '>f8'),
('OFFS_SUB', '>f8'),
('LST_SUB', '>f8'),
('RA_SUB', '>f8'),
('DEC_SUB', '>f8'),
('GLON_SUB', '>f8'),
('GLAT_SUB', '>f8'),
('FD_ANG', '>f4'),
('POS_ANG', '>f4'),
('PAR_ANG', '>f4'),
('TEL_AZ', '>f4'),
('TEL_ZEN', '>f4'),
('DAT_FREQ', '>f4', (128,)),
('DAT_WTS', '>f4', (128,)),
('DAT_OFFS', '>f4', (128,)),
('DAT_SCL', '>f4', (128,)),
('DATA', '|u1', (2048, 1, 128, 1))]
While the DATA
array above is 4 dimensional (this is the case in
SEARCH
files, it is 3 dimensional in PSR
and CAL
files).
However there are NSUBINT
of those arrays. To access the data one
uses the name of the column, DATA
, then a single entry square
bracket denoting the row. This gives one of the NSUBINT
arrays in
the BinTable.
data['DATA'][0].shape
(2048, 1, 128, 1)
This object is then a normal numpy array that can be accessed with numpy array slice notation. Access a single entry by choosing four integers in the range of dimensions.
data['DATA'][0][1000,0,3,0]
7
Other arrays are accessed similarly, but without as many indices. There
are NSUBINT
rows of 1-dimensional arrays for each of the DAT_X
parameters and NSUBINT
floats of the other entries.
print(data['DAT_OFFS'].shape)
data['DAT_OFFS'][2]
(20, 128)
array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)
print(data['GLON_SUB'].shape)
data['GLON_SUB'][2]
(20,)
97.721010667684681
One can clear the file from memory using the close
method.
psrf1.close()
PSR
and CAL
files¶
The PSRFITS standard actually has many BinTable extensions, and many
files come with more than two HDUs. The psrfits
class will
generically build a Python version of any of these file types. In this
package there are three template types, corresponding to the three most
common file types used by the NANOGrav Pulsar Timing array. If you would
like another template included please start an issue on our GitHub page.
A PSR
mode file is data from an observation where the data is folded
at the frequency of the pulsar to build up signal-to-noise ratio in real
time. A CAL
file has the same set of HDUs but is not folded. It is
data take of a calibration source. Here we access the PSR
template
file and look at a different BinTable extension.
pFits3 = '../../../templates/psr_template.fits'
psrf2 = pdat.psrfits(pFits3)
Loading PSRFITS file from path:
'/Users/jeffrey/PSS/guppi_57691_J1909-3744_0004_0001.fits'.
psrf2
file: /Users/jeffrey/PSS/guppi_57691_J1909-3744_0004_0001.fits
mode: READWRITE
extnum hdutype hduname[v]
0 IMAGE_HDU
1 BINARY_TBL HISTORY[1]
2 BINARY_TBL PSRPARAM[1]
3 BINARY_TBL POLYCO[1]
4 BINARY_TBL SUBINT[1]
psrf2[3].read_header()
XTENSION= 'BINTABLE' / * Polyco history * BITPIX = 8 / N/A NAXIS = 2 / 2-dimensional binary table NAXIS1 = 222 / width of table in bytes NAXIS2 = 1 / number of rows in table PCOUNT = 0 / size of special data area GCOUNT = 1 / one data group (required keyword) TFIELDS = 13 / Number of fields per row TTYPE1 = 'DATE_PRO' / Polyco creation date and time (UTC) TFORM1 = '24A ' / 24-char string TTYPE2 = 'POLYVER ' / Polyco version ID TFORM2 = '16A ' / 16-char string TTYPE3 = 'NSPAN ' / Span of polyco block in min TFORM3 = '1I ' / Integer TTYPE4 = 'NCOEF ' / Nr of coefficients (<=15) TFORM4 = '1I ' / Integer TTYPE5 = 'NPBLK ' / Nr of blocks (rows) for this polyco TFORM5 = '1I ' / Integer TTYPE6 = 'NSITE ' / Observatory code TFORM6 = '8A ' / 8-char string TTYPE7 = 'REF_FREQ' / Reference frequency for phase TFORM7 = '1D ' / Double TTYPE8 = 'PRED_PHS' / Predicted pulse phase at observation start TFORM8 = '1D ' / Double TTYPE9 = 'REF_MJD ' / Reference MJD TFORM9 = '1D ' / Double TTYPE10 = 'REF_PHS ' / Reference phase TFORM10 = '1D ' / Double TTYPE11 = 'REF_F0 ' / Zero-order pulsar frequency TFORM11 = '1D ' / Double TTYPE12 = 'LGFITERR' / Log_10 of polynomial fit rms error in periods TFORM12 = '1D ' / Double TTYPE13 = 'COEFF ' / Polyco coefficients TFORM13 = '15D ' / NCOEF doubles EXTNAME = 'POLYCO ' / name of this binary table extension TUNIT7 = 'MHz ' / Units of field TUNIT11 = 'Hz ' / Units of field EXTVER = 1 / auto assigned by template parser
psrf2[3]['COEFF'][:]
array([[ 6.37061369e-07, -3.84007940e-01, 1.63071384e-03,
-1.91944367e-06, 1.07255013e-09, 6.72218368e-12,
-8.60574070e-12, 1.25507648e-13, 1.71341258e-14,
-2.97308173e-16, -1.79229301e-17, 2.50414099e-19,
9.50130849e-21, -7.26854989e-23, -2.02121757e-24]])
psrf2[2]['PARAM'][:]
array([ b'PSRJ 1909-3744 ',
b'RAJ 19:09:47.4380095699897 ',
b'DECJ -37:44:14.3162347000103 ',
b'PEPOCH 53000.0000000000000000 ',
b'F 3.3931569275871846D+02 ',
b'F1 -1.6150815823660001D+00 ',
b'PMDEC -3.6776299999999999D+01 ',
b'PMRA -9.5500000000000007D+00 ',
b'POSEPOCH 53000.0000000000000000 ',
b'PX 1.3517999999999999D+00 ',
b'DM 1.0394679999999999D+01 ',
b'START 53219.2149999999965075 ',
b'FINISH 54614.2710000000006403 ',
b'CLK UTC(NIST) ',
b'EPHEM DE405 ',
b'TZRMJD 53293.02028990324198077 ',
b'TZRFRQ 8.4256500000000005D+02 ',
b'TZRSITE 1 ',
b'BINARY ELL1 ',
b'A1 1.8979909859999999D+00 ',
b'PB 1.5334494510779999D+00 ',
b'SINI 9.9727800000000000D-01 ',
b'M2 2.2327900000000001D-01 ',
b'EPS1 3.7300000000000003D-08 ',
b'EPS2 1.1340000000000000D-07 ',
b'TASC 53113.9505872139998246 ',
b'TRES 4.2999999999999999D-01 ',
b'NTOA 746 '],
dtype='|S128')
Glossary:¶
BinTable: A table of binary data.
HDU: Header Unit. The main division of a FITS file.
ImageHDU: An HDU that either holds a 2-d data array, usually represnting an image, of the primary HDU, acting as the main header file for the FITS file.
SUBINT HDU: The BinTable extension (HDU) that holds the data from a
pulsar/radio observation. In a PSR
(folded) mode PSRFITS file these
are actually subintegrations of folded pulsar data.
HISTORY HDU: The BinTable extension (HDU) that has some information about the history of the observation and what may have been done to the data in the file.
FITS Card: The header information in FITS files is held in a FITS
card. In Python these are usually held as dictionary-type variables.
There is a card string
which hold the information that appears when
you call the header. One of the dictionary entries is the actual value
called when accesing the data.
POLYCO HDU: The BinTable extension (HDU) that has a list of the Chebyshev polynomial coefficients used for a short timescale timing model when using the backend of a telescope in ‘PSR’ (folding) mode.
PARAM HDU: The BinTable extensino (HDU) that hols the parameters of
the pulsar. Most often these are text lines taken from a .par
(parameter) file.