# Pulsar Data Toolbox
## `psrfits` class
The `psrfits` class provides easy access to the specialized FITS files used in the pulsar and radio astronomy community, known as PSRFITS files. The standard is documented on the [CSIRO Pulsar Group website](http://www.atnf.csiro.au/people/pulsar/index.html?n=Main.Psrfits). In the current version of `pdat` this class is based on the Python package `fitsio`, which is a wrapper for the C library `cfitsio`. In the future we plan to also make a version that uses the `astropy.io.fits` package. However, the C library is fast and efficient, and it allows appending to and accessing BinTables without loading the whole file into memory. Since PSRFITS files carry large BinTables, these efficiencies are very useful.

## Loading and Appending

In [1]:
import pdat
import os

In [3]:
pFits1 = '../../../templates/search_scratch.fits'
pFits2 = '../../../templates/search_template.fits'

## Check file sizes

In [4]:
a=os.path.getsize(pFits1)
b=os.path.getsize(pFits2)
print('Size of 1st file:',a)
print('Size of 2nd file:',b)

Size of 1st file: 5302080
Size of 2nd file: 5302080
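
As a quick sanity check on the sizes above, one can verify that they correspond to a whole number of FITS blocks, since the FITS standard stores every header and data unit in 2880-byte blocks. The helper name `fits_blocks` is made up for this sketch and is not part of `pdat`.

```python
FITS_BLOCK = 2880  # bytes; FITS pads every header and data unit to this size


def fits_blocks(size_in_bytes):
    """Return (number of whole 2880-byte blocks, leftover bytes)."""
    return divmod(size_in_bytes, FITS_BLOCK)


# Size reported by os.path.getsize above.
n_blocks, leftover = fits_blocks(5302080)
print(n_blocks, leftover)  # 1841 0
```

A non-zero remainder would suggest a truncated or partially written file.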


## Load files

In [5]:
psrf1 = pdat.psrfits(pFits1)

Loading PSRFITS file from path:
'../../../templates/search_scratch.fits'.


## Append the Secondary BinTables to an existing PSRFITS 
The `append_from_file` method appends all of the secondary BinTables of a PSRFITS file, given as a file path, to the already loaded PSRFITS file. The secondary BinTables include `SUBINT`, `POLYCO`, `HISTORY` and `PARAM`. This is only possible between files of the same `mode` (`SEARCH`, `PSR` or `CAL`). By default the tables are assumed to be in the same order in both files. If the BinTables are in different orders, the optional `table` flag takes a list giving the order of the original BinTables. Alternatively, you may select only a subset of BinTables to append.

In [6]:
psrf1.append_from_file(pFits2)

In [7]:
os.path.getsize(pFits1)

5302080

After appending, the file should grow but not double in size, because only the secondary BinTables are appended and the `PRIMARY` header is left unchanged.

The `psrfits` class comes with all of the functionality built into `fitsio`. The class represents a list of HDUs. The header information is accessible through the `read_header` method.

In [8]:
psrf1[1].read_header()


XTENSION= 'BINTABLE'           / ***** Subintegration data  *****
BITPIX  =                    8 / N/A
NAXIS   =                    2 / 2-dimensional binary table
NAXIS1  =               264268 / width of table in bytes
NAXIS2  =                   20 / Number of rows in table (NSUBINT)
PCOUNT  =                    0 / size of special data area
GCOUNT  =                    1 / one data group (required keyword)
TFIELDS =                   17 / Number of fields per row
TTYPE1  = 'TSUBINT '           / Length of subintegration
TFORM1  = '1D      '           / Double
TTYPE2  = 'OFFS_SUB'           / Offset from Start of subint centre
TFORM2  = '1D      '           / Double
TTYPE3  = 'LST_SUB '           / LST at subint centre
TFORM3  = '1D      '           / Double
TTYPE4  = 'RA_SUB  '           / RA (J2000) at subint centre
TFORM4  = '1D      '           / Double
TTYPE5  = 'DEC_SUB '           / Dec (J2000) at subint centre
TFORM5  = '1D      '           / Double
TTYPE6  = 'GLON_SUB'     
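
`fitsio` already parses the header for you, so the following is only a minimal sketch of how a card string like the ones printed above breaks down into keyword, value and comment. The function `parse_card` is hypothetical (it is not part of `pdat` or `fitsio`), and it assumes simple cards: no quoted strings containing doubled quotes and no `CONTINUE` cards.

```python
def parse_card(card):
    """Split one FITS header card string into (keyword, value, comment)."""
    keyword = card[:8].strip()
    # Columns 9-10 hold '= ' when the card carries a value.
    rest = card[10:] if card[8:10] == '= ' else card[8:]
    if rest.lstrip().startswith("'"):
        # String value: take everything between the first pair of quotes.
        start = rest.index("'")
        end = rest.index("'", start + 1)
        value = rest[start + 1:end].rstrip()
        comment = rest[end + 1:].partition('/')[2].strip()
    else:
        raw, _, comment = rest.partition('/')
        raw, comment = raw.strip(), comment.strip()
        try:
            value = int(raw)
        except ValueError:
            try:
                value = float(raw)
            except ValueError:
                value = raw  # leave anything unrecognized as text
    return keyword, value, comment


# Card strings copied from the header printed above.
cards = [
    "NAXIS1  =               264268 / width of table in bytes",
    "TTYPE1  = 'TSUBINT '           / Length of subintegration",
    "BITPIX  =                    8 / N/A",
]
parsed = [parse_card(c) for c in cards]
for keyword, value, comment in parsed:
    print(keyword, repr(value), comment)
```

In practice the header object returned by `read_header` can be indexed by keyword directly, so a parser like this is rarely needed.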

The data in a PSRFITS file is found in the `SUBINT` BinTable.

In [9]:
psrf1


  file: ../../../templates/search_scratch.fits
  mode: READWRITE
  extnum hdutype         hduname[v]
  0      IMAGE_HDU       
  1      BINARY_TBL      SUBINT[1]

Here `SUBINT` is the 2nd HDU. The data is accessible as a `numpy.recarray` with `NSUBINT` rows. Think of a recarray as a spreadsheet in which the individual entries can be strings, floats or whole arrays.

In [15]:
data=psrf1[1].read()
print(data.shape)
data.dtype.descr

(20,)


[('TSUBINT', '>f8'),
 ('OFFS_SUB', '>f8'),
 ('LST_SUB', '>f8'),
 ('RA_SUB', '>f8'),
 ('DEC_SUB', '>f8'),
 ('GLON_SUB', '>f8'),
 ('GLAT_SUB', '>f8'),
 ('FD_ANG', '>f4'),
 ('POS_ANG', '>f4'),
 ('PAR_ANG', '>f4'),
 ('TEL_AZ', '>f4'),
 ('TEL_ZEN', '>f4'),
 ('DAT_FREQ', '>f4', (128,)),
 ('DAT_WTS', '>f4', (128,)),
 ('DAT_OFFS', '>f4', (128,)),
 ('DAT_SCL', '>f4', (128,)),
 ('DATA', '|u1', (2048, 1, 128, 1))]
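
To get a feel for this layout without a PSRFITS file at hand, the sketch below builds a small synthetic recarray with the same kinds of columns: a scalar per row, a 1-dimensional array per row, and a 4-dimensional array per row. The sizes and values are invented for illustration; only the column names and type codes are taken from the listing above.

```python
import numpy as np

nsubint, nchan, nsblk = 3, 4, 8  # made-up sizes, much smaller than a real file

subint_dtype = np.dtype([
    ('TSUBINT', '>f8'),                    # one float per row
    ('DAT_FREQ', '>f4', (nchan,)),         # one 1-d array per row
    ('DATA', 'u1', (nsblk, 1, nchan, 1)),  # one 4-d array per row
])

table = np.zeros(nsubint, dtype=subint_dtype).view(np.recarray)
table['TSUBINT'] = 1.5
table['DAT_FREQ'] = np.linspace(1400.0, 1430.0, nchan)  # broadcast to every row

print(table.shape)              # (3,)
print(table['DAT_FREQ'].shape)  # (3, 4)
print(table['DATA'].shape)      # (3, 8, 1, 4, 1)
print(table[0]['TSUBINT'])      # 1.5
```

Indexing by column name returns an array whose first axis runs over the rows, which is why the per-row shape gains a leading dimension.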

The `DATA` array above is 4-dimensional; this is the case in `SEARCH` files, whereas it is 3-dimensional in `PSR` and `CAL` files. There are `NSUBINT` of those arrays, one per row. To access the data, use the name of the column, `DATA`, followed by a single index in square brackets denoting the row. This gives one of the `NSUBINT` arrays in the BinTable.

In [18]:
data['DATA'][0].shape

(2048, 1, 128, 1)

This object is then a normal numpy array that can be accessed with numpy array slice notation. Access a single entry by choosing four integers, each within the range of the corresponding dimension.

In [20]:
data['DATA'][0][1000,0,3,0]

7
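
Slices work the same way as single entries. The sketch below uses a synthetic stand-in for `data['DATA']` and assumes the four per-row axes are (time sample, polarization, channel, 1), which matches the `(2048, 1, 128, 1)` shape above but is an interpretation rather than something stated in this notebook. It extracts the time series of one channel and stacks all subintegrations into a single 2-dimensional time-by-channel array.

```python
import numpy as np

nsubint, nsblk, npol, nchan = 4, 16, 1, 8  # made-up sizes
rng = np.random.default_rng(0)

# Synthetic stand-in for data['DATA'] from a SEARCH-mode file.
raw = rng.integers(0, 256, size=(nsubint, nsblk, npol, nchan, 1), dtype=np.uint8)

one_subint = raw[0]                 # shape (nsblk, npol, nchan, 1)
channel_3 = one_subint[:, 0, 3, 0]  # all time samples of channel 3

# Drop the polarization and trailing axes, then join the rows end to end.
dynamic_spectrum = raw[:, :, 0, :, 0].reshape(nsubint * nsblk, nchan)

print(channel_3.shape)         # (16,)
print(dynamic_spectrum.shape)  # (64, 8)
```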

Other arrays are accessed similarly, but without as many indices. There are `NSUBINT` rows of 1-dimensional arrays for each of the `DAT_X` parameters and `NSUBINT` floats for each of the other entries.

In [25]:
print(data['DAT_OFFS'].shape)
data['DAT_OFFS'][2]

(20, 128)


array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.], dtype=float32)

In [26]:
print(data['GLON_SUB'].shape)
data['GLON_SUB'][2]

(20,)


97.721010667684681
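
The `DAT_X` columns are typically combined with `DATA` channel by channel. The sketch below shows the broadcasting involved, using synthetic arrays. It assumes the convention `value = raw * DAT_SCL + DAT_OFFS`, followed by multiplication with `DAT_WTS`, and a single polarization; please check this against the PSRFITS definition before relying on it.

```python
import numpy as np

nsubint, nsblk, nchan = 2, 4, 3  # made-up sizes

# Synthetic stand-ins for the columns of the SUBINT table.
raw = np.arange(nsubint * nsblk * nchan, dtype=np.uint8).reshape(
    nsubint, nsblk, 1, nchan, 1)
dat_scl = np.full((nsubint, nchan), 2.0, dtype=np.float32)
dat_offs = np.full((nsubint, nchan), 0.5, dtype=np.float32)
dat_wts = np.ones((nsubint, nchan), dtype=np.float32)
dat_wts[:, 1] = 0.0  # pretend channel 1 has been flagged

# Keep (row, time sample, channel) and convert to float before scaling.
samples = raw[:, :, 0, :, 0].astype(np.float32)

# Insert a length-1 time axis so the per-channel columns broadcast over samples.
calibrated = (samples * dat_scl[:, None, :] + dat_offs[:, None, :]) * dat_wts[:, None, :]

print(calibrated.shape)     # (2, 4, 3)
print(calibrated[1, 2, 2])  # 40.5
```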

One can clear the file from memory using the `close` method.

In [27]:
psrf1.close()




## `PSR` and `CAL` files

The PSRFITS standard has many BinTable extensions, and many files come with more than two HDUs. The `psrfits` class will generically build a Python version of any of these file types. This package includes three template types, corresponding to the three most common file types used by the NANOGrav pulsar timing array. If you would like another template included, please open an issue on our GitHub page.

A `PSR` mode file holds data from an observation in which the data are folded at the frequency of the pulsar to build up signal-to-noise ratio in real time. A `CAL` file has the same set of HDUs but is not folded; it holds data taken of a calibration source. Here we access the `PSR` template file and look at a different BinTable extension.

In [28]:
pFits3 = '../../../templates/psr_template.fits'
psrf2 = pdat.psrfits(pFits3)

Loading PSRFITS file from path:
'/Users/jeffrey/PSS/guppi_57691_J1909-3744_0004_0001.fits'.


In [29]:
psrf2


  file: /Users/jeffrey/PSS/guppi_57691_J1909-3744_0004_0001.fits
  mode: READWRITE
  extnum hdutype         hduname[v]
  0      IMAGE_HDU       
  1      BINARY_TBL      HISTORY[1]
  2      BINARY_TBL      PSRPARAM[1]
  3      BINARY_TBL      POLYCO[1]
  4      BINARY_TBL      SUBINT[1]

In [39]:
psrf2[3].read_header()


XTENSION= 'BINTABLE'           / ***** Polyco history *****
BITPIX  =                    8 / N/A
NAXIS   =                    2 / 2-dimensional binary table
NAXIS1  =                  222 / width of table in bytes
NAXIS2  =                    1 / number of rows in table
PCOUNT  =                    0 / size of special data area
GCOUNT  =                    1 / one data group (required keyword)
TFIELDS =                   13 / Number of fields per row
TTYPE1  = 'DATE_PRO'           / Polyco creation date and time (UTC)
TFORM1  = '24A     '           / 24-char string
TTYPE2  = 'POLYVER '           / Polyco version ID
TFORM2  = '16A     '           / 16-char string
TTYPE3  = 'NSPAN   '           / Span of polyco block in min
TFORM3  = '1I      '           / Integer
TTYPE4  = 'NCOEF   '           / Nr of coefficients (<=15)
TFORM4  = '1I      '           / Integer
TTYPE5  = 'NPBLK   '           / Nr of blocks (rows) for this polyco
TFORM5  = '1I      '           / Integer
TTYPE6  = 'NSITE

In [43]:
psrf2[3]['COEFF'][:]

array([[  6.37061369e-07,  -3.84007940e-01,   1.63071384e-03,
         -1.91944367e-06,   1.07255013e-09,   6.72218368e-12,
         -8.60574070e-12,   1.25507648e-13,   1.71341258e-14,
         -2.97308173e-16,  -1.79229301e-17,   2.50414099e-19,
          9.50130849e-21,  -7.26854989e-23,  -2.02121757e-24]])
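
For orientation, here is a minimal sketch of how such coefficients are commonly turned into a pulse phase. It assumes the TEMPO polyco convention, in which the coefficients multiply successive powers of the time offset in minutes. The function `polyco_phase` is hypothetical, and the reference epoch, phase and frequency below are invented placeholders rather than values read from this file.

```python
def polyco_phase(mjd, ref_mjd, ref_phase, ref_f0, coeffs):
    """Pulse phase (in turns) at `mjd`, assuming the TEMPO polyco convention."""
    dt = (mjd - ref_mjd) * 1440.0  # offset from the reference epoch in minutes
    poly = 0.0
    for c in reversed(coeffs):     # Horner's rule: c0 + c1*dt + c2*dt**2 + ...
        poly = poly * dt + c
    return ref_phase + dt * 60.0 * ref_f0 + poly


# First three coefficients from the output above; everything else is made up.
coeffs = [6.37061369e-07, -3.84007940e-01, 1.63071384e-03]
ref_mjd, ref_phase, ref_f0 = 57691.5, 0.25, 339.3

phase_at_ref = polyco_phase(ref_mjd, ref_mjd, ref_phase, ref_f0, coeffs)
print(phase_at_ref)
```

At the reference epoch the offset is zero, so the result reduces to the reference phase plus the first coefficient.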

In [44]:
psrf2[2]['PARAM'][:]

array([ b'PSRJ              1909-3744                                                                                                     ',
       b'RAJ               19:09:47.4380095699897                                                                                        ',
       b'DECJ             -37:44:14.3162347000103                                                                                        ',
       b'PEPOCH            53000.0000000000000000                                                                                        ',
       b'F                 3.3931569275871846D+02                                                                                        ',
       b'F1               -1.6150815823660001D+00                                                                                        ',
       b'PMDEC            -3.6776299999999999D+01                                                                                        ',
       b'PMRA      
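
Each row is a fixed-width byte string holding one line of a parameter file. A minimal sketch for turning such rows into a dictionary follows. The helper `parse_param_rows` is hypothetical and not part of `pdat`; it keeps only the first value on each line and converts Fortran-style `D` exponents to floats where possible.

```python
def parse_param_rows(rows):
    """Map parameter names to floats where possible, otherwise to strings."""
    params = {}
    for row in rows:
        text = row.decode('ascii') if isinstance(row, bytes) else row
        fields = text.split()
        if len(fields) < 2:
            continue  # skip blank or name-only lines
        key, value = fields[0], fields[1]
        try:
            # Fortran-style exponents ('D+02') are not understood by float().
            params[key] = float(value.replace('D', 'E'))
        except ValueError:
            params[key] = value  # e.g. names and sexagesimal coordinates
    return params


# Rows copied from the output above, with the trailing padding shortened.
rows = [
    b'PSRJ              1909-3744              ',
    b'RAJ               19:09:47.4380095699897 ',
    b'F                 3.3931569275871846D+02 ',
    b'F1               -1.6150815823660001D+00 ',
]
params = parse_param_rows(rows)
print(params['PSRJ'], params['F'])
```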

## Glossary:
__BinTable__: A table of binary data. 

__HDU__: Header Data Unit. The main division of a FITS file.

__ImageHDU__: An HDU that either holds a 2-d data array, usually representing an image, or is the primary HDU, acting as the main header for the FITS file.

__SUBINT HDU__: The BinTable extension (HDU) that holds the data from a pulsar/radio observation. In a `PSR` (folded) mode PSRFITS file these are actually subintegrations of folded pulsar data.

__HISTORY HDU__: The BinTable extension (HDU) that has some information about the history of the observation and what may have been done to the data in the file. 

__FITS Card__: The header information in FITS files is held in FITS cards. In Python these are usually held as dictionary-type variables. There is a `card string`, which holds the information that appears when you call the header. One of the dictionary entries is the actual value returned when accessing the data.

__POLYCO HDU__: The BinTable extension (HDU) that has a list of the polynomial coefficients (polycos) used for a short timescale timing model when using the backend of a telescope in `PSR` (folding) mode.

__PARAM HDU__: The BinTable extension (HDU) that holds the parameters of the pulsar. Most often these are text lines taken from a `.par` (parameter) file.