File reading/writing utility

Overview

This module provides functions for file input/output. All of these are wrapper functions, based on existing functions in other Python classes. Functions are provided to save a two-dimensional array to a text file, load selected columns of data from a text file, load a column header line, reduce strings to only legal filename characters, and recursively match filename patterns (the last is taken from the Python Cookbook).

See the __main__ function for examples of use.

This package was partly developed to provide additional material in support of students and readers of the book Electro-Optical System Analysis and Design: A Radiometry Perspective, Cornelius J. Willers, ISBN 9780819495693, SPIE Monograph Volume PM236, SPIE Press, 2013. http://spie.org/x648.html?product_id=2021423&origin_id=x646

Module functions

pyradi.ryfiles.saveHeaderArrayTextFile(filename, dataArray, header=None, comment=None, delimiter=None)

Save a numpy array to a file, including header lines.

This function saves a two-dimensional array to a text file, with an optional user-defined header. Comparable functionality is available in numpy.savetxt (header argument) from numpy 1.7 onward.

Args:
filename (string): name of the output ASCII flatfile.
dataArray (np.array[N,M]): a two-dimensional array.
header (string): the optional header.
comment (string): the symbol used to comment out lines, default value is None.
delimiter (string): delimiter used to separate columns, default is whitespace.
Returns:
Nothing.
Raises:
No exception is raised.
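
A minimal sketch of the same idea with plain numpy, assuming numpy 1.7 or later, where numpy.savetxt accepts the header and comments arguments. This is not the pyradi implementation; the file name, header text and array values are illustrative.

```python
import os
import tempfile

import numpy as np

# Illustrative 3x2 array: column 0 is an abscissa, column 1 a value.
data = np.array([[1.0, 0.10], [2.0, 0.20], [3.0, 0.30]])

# Hypothetical output path in a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "example.txt")

# The header is written as the first line, prefixed by the comments string.
np.savetxt(out_path, data, header="wavelength value", comments="% ", delimiter=" ")

with open(out_path) as f:
    first_line = f.readline().rstrip("\n")

# Lines starting with the comment symbol are skipped when reading back.
reloaded = np.loadtxt(out_path, comments="%")
```

Here first_line holds the commented header and reloaded has the same shape and values as data.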
pyradi.ryfiles.loadColumnTextFile(filename, loadCol=[1], comment=None, normalize=0, skiprows=0, delimiter=None, abscissaScale=1, ordinateScale=1, abscissaOut=None, returnAbscissa=False)

Load selected column data from a text file, processing as specified.

This function loads column data from a text file, scaling and interpolating the read-in data according to user specification. The first (0'th) column has special significance: it is considered the abscissa (x-values) of the data set, while the remaining columns are any number of ordinate (y-value) vectors. The user passes a list of columns to be read (default is [1]); only these columns are read, processed and returned when the function exits. The user also passes an abscissa vector to which the input data is interpolated and then subsequently amplitude scaled or normalised.

Note: leave only single separators (e.g. spaces) between columns! Also watch out for a single space at the start of line.

Args:
filename (string): name of the input ASCII flatfile.
loadCol ([int]): the M = len(loadCol) column(s) to be loaded as the ordinate, default value is column 1.
comment (string): the symbol used to comment out lines, default value is None.
normalize (int): flag to indicate if data must be normalized.
skiprows (int): the number of rows to be skipped at the start of the file (e.g. headers).
delimiter (string): the delimiter used to separate columns, default is whitespace.
abscissaScale (float): scale by which abscissa (column 0) must be multiplied
ordinateScale (float): scale by which ordinate (column >0) must be multiplied
abscissaOut (np.array[N,] or [N,1]): abscissa vector on which output variables are interpolated.
returnAbscissa (bool): return the abscissa vector as second item in return tuple.
Returns:
ordinatesOut (np.array[N,M]): The interpolated, M columns of N rows, processed array.
abscissaOut (np.array[N,M]): The abscissa where the ordinates are interpolated.
Raises:
No exception is raised.
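
The load, interpolate and normalise sequence described above can be sketched with numpy.loadtxt and numpy.interp. This is an illustration under assumptions, not the pyradi code: the sample data is made up, and normalising each column to a maximum of 1 is an assumed meaning of the normalize flag.

```python
import io

import numpy as np

# Illustrative content: column 0 is the abscissa, columns 1 and 2 are ordinates.
text = io.StringIO(
    "% x y1 y2\n"
    "0.0 0.0 10.0\n"
    "1.0 2.0 20.0\n"
    "2.0 4.0 30.0\n"
)

load_col = [1, 2]                         # ordinate columns to return
abscissa_out = np.array([0.5, 1.0, 1.5])  # grid to interpolate onto

table = np.loadtxt(text, comments="%")
x = table[:, 0]

# Interpolate each requested column onto the new abscissa, one output column each.
ordinates_out = np.column_stack(
    [np.interp(abscissa_out, x, table[:, c]) for c in load_col]
)

# Assumed normalisation: scale each column so that its maximum is 1.
normalized = ordinates_out / ordinates_out.max(axis=0)
```

The result has N = 3 rows (one per abscissa_out value) and M = 2 columns (one per entry in load_col).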
pyradi.ryfiles.loadHeaderTextFile(filename, loadCol=[1], comment=None)

Loads column header data from the first line of a text file.

Loads column header data from the first row of a file. Headers must be delimited by commas. The function loadColumnTextFile provides more comprehensive capabilities.

Args:
filename (string): the name of the input ASCII flatfile.
loadCol ([int]): list of numbers, the column headers to be loaded, default value is column 1
comment (string): the symbol to comment out lines
Returns:
[string]: a list with selected column header entries
Raises:
No exception is raised.
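
A minimal sketch of selecting comma-delimited header entries by column index. The helper name load_header is hypothetical, and stripping a leading comment symbol is an assumption about how the comment argument is used.

```python
import io

def load_header(fileobj, load_col, comment=None):
    """Return the comma-delimited header entries at the given column indices."""
    first_line = fileobj.readline().strip()
    # Drop a leading comment symbol, if one is given and present (assumed behaviour).
    if comment and first_line.startswith(comment):
        first_line = first_line[len(comment):]
    entries = [entry.strip() for entry in first_line.split(",")]
    return [entries[i] for i in load_col]

sample = io.StringIO("% wavelength, radiance, transmittance\n1.0, 2.0, 0.5\n")
selected = load_header(sample, load_col=[1, 2], comment="%")
# selected == ["radiance", "transmittance"]
```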
pyradi.ryfiles.cleanFilename(sourcestring, removestring=' %:/, .\\[]<>*?')

Clean a string by removing selected characters.

Creates a legal and ‘clean’ source string from a string by removing some clutter and characters not allowed in filenames. A default set is given but the user can override the default string.

Args:
sourcestring (string): the string to be cleaned.
removestring (string): remove all these characters from the string (optional).
Returns:
(string): A cleaned-up string.
Raises:
No exception is raised.
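
One way to remove a set of characters in Python 3 is str.translate with a deletion table. The sketch below uses the default removestring from the signature above; the helper name clean_filename is hypothetical and the pyradi implementation may differ.

```python
def clean_filename(source, remove=' %:/, .\\[]<>*?'):
    """Return source with every character listed in remove deleted."""
    # The third argument of str.maketrans lists characters to delete.
    return source.translate(str.maketrans('', '', remove))

cleaned = clean_filename("my file: [v1.0]?.txt")
# cleaned == "myfilev10txt"
```

Note that the default set includes the period, so the dot before a file extension is removed as well; pass a different remove string to keep it.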
pyradi.ryfiles.listFiles(root, patterns='*', recurse=1, return_folders=0, useRegex=False)

Lists the files/directories meeting specific requirements.

Returns a list of file paths to files in a file system, searching a directory structure along the specified path, looking for files that match the glob pattern. If specified, the search will continue into sub-directories. A list of matching names is returned. The function supports a local or network reachable filesystem, but not URLs.
Args:
root (string): directory root from where the search must take place
patterns (string): glob/regex pattern for filename matching. Multiple patterns may be present, each one separated by ;
recurse (int): flag to indicate if subdirectories must also be searched (optional)
return_folders (int): flag to indicate if folder names must also be returned (optional)
useRegex (bool): flag to indicate if patterns are regular expression strings (optional)
Returns:
A list with matching file/directory names
Raises:
No exception is raised.
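
The search described above can be sketched with os.walk and fnmatch. The helper name list_files is hypothetical and only glob patterns are handled here; folder return and regular expressions are left out.

```python
import fnmatch
import os
import tempfile

def list_files(root, patterns="*", recurse=True):
    """Return paths under root whose file names match any ';'-separated glob."""
    pattern_list = patterns.split(";")
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if any(fnmatch.fnmatch(name, p) for p in pattern_list):
                matches.append(os.path.join(dirpath, name))
        if not recurse:
            break  # only the top-level directory was requested
    return matches

# Build a small throwaway tree to search.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "sub"))
for rel in ("a.txt", "b.dat", os.path.join("sub", "c.txt")):
    open(os.path.join(root, rel), "w").close()

all_txt = list_files(root, "*.txt;*.csv")            # a.txt and sub/c.txt
top_only = list_files(root, "*.txt", recurse=False)  # a.txt only
```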
pyradi.ryfiles.execOnFiles(cmdline, root, patterns='*', recurse=1, return_folders=0, useRegex=False, printTask=False)

Execute a program on a list of files/directories meeting specific requirements.

Seek files recursively and then execute a program on those files. The program is defined as a command line string as would be typed on a terminal, except that a token is given in the place where the filename must be. The token is a string ‘{0}’ (with the braces as shown). During execution the token is replaced with the filename found in the recursive search. This replacement is done with the standard string formatter, where the filename replaces all occurrences of {0}: task = cmdline.format(filename)

Example: cmdline = ‘bmpp -l eps.object {0}’

Args:
cmdline (str): string that defines the program to be executed
root (string): directory root from where the search must take place
patterns (string): glob/regex pattern for filename matching
recurse (int): flag to indicate if subdirectories must also be searched (optional)
return_folders (int): flag to indicate if folder names must also be returned (optional)
useRegex (bool): flag to indicate if patterns are regular expression strings (optional)
printTask (bool): flag to indicate if the commandline must be printed (optional)
Returns:
A list with matching file/directory names
Raises:
No exception is raised.
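
The token replacement can be illustrated without running any program. The file names below are hypothetical search results, and splitting the task with shlex is one possible way to prepare arguments for subprocess; it is not necessarily how pyradi launches the command.

```python
import shlex

cmdline = "bmpp -l eps.object {0}"          # command template from the example above
filenames = ["fig1.bmp", "plots/fig2.bmp"]  # hypothetical search results

# Every occurrence of {0} is replaced by the file name.
tasks = [cmdline.format(name) for name in filenames]

# Split each task into an argument list, e.g. for subprocess.run(args).
arg_lists = [shlex.split(task) for task in tasks]
```

A file name containing spaces would be split into several arguments unless the token is quoted in the template, for example '"{0}"'.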
pyradi.ryfiles.readRawFrames(fname, rows, cols, vartype, loadFrames=[])

Load multi-frame two-dimensional arrays from a raw data file of known data type.

The file must consist of multiple frames, all with the same number of rows and columns. Frames of different data types can be read, according to the user specification. The user can specify which frames must be loaded (if not the whole file).
Args:
fname (string): filename
rows (int): number of rows in each frame
cols (int): number of columns in each frame
vartype (np.dtype): numpy data type of data to be read
int8, int16, int32, int64
uint8, uint16, uint32, uint64
float16, float32, float64
loadFrames ([int]): optional list of frames to load, zero-based; an empty list (default) loads all frames
Returns:
frames (int) : number of frames in the returned data set,
0 if error occurred
rawShaped (np.ndarray): vartype numpy array of dimensions (frames,rows,cols),
None if error occurred
Raises:
An exception is raised on IOError.
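
A minimal sketch of reading equally sized frames from a raw file with numpy.fromfile and reshaping to (frames, rows, cols). The file name and frame contents are illustrative, and the sketch assumes the file holds nothing but frame data in native byte order.

```python
import os
import tempfile

import numpy as np

rows, cols, vartype = 2, 3, np.uint16

# Write three illustrative frames to a raw binary file.
original = np.arange(3 * rows * cols, dtype=vartype).reshape(3, rows, cols)
raw_path = os.path.join(tempfile.mkdtemp(), "frames.raw")
original.tofile(raw_path)

# Read the flat data back and reshape to (frames, rows, cols).
flat = np.fromfile(raw_path, dtype=vartype)
frames = flat.size // (rows * cols)
raw_shaped = flat.reshape(frames, rows, cols)

# Keep only selected zero-based frames, in the spirit of loadFrames.
load_frames = [0, 2]
subset = raw_shaped[load_frames]
```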
pyradi.ryfiles.rawFrameToImageFile(image, filename)

Writes a single raw image frame to an image file. The file type must be given, e.g. png or jpg. The image need not be scaled beforehand; scaling is done prior to writing out the image. The file type could be one of BMP, JPG, JPEG, PNG, PPM, TIFF, XBM, XPM, but the file types available depend on the QT imsave plugin in use.

Args:
image (np.ndarray): two-dimensional array representing an image
filename (string): name of file to be written to, with extension
Returns:
Nothing
Raises:
No exception is raised.
pyradi.ryfiles.arrayToLaTex(filename, arr, header=None, leftCol=None, formatstring='%10.4e', filemode='wt')

Write a numpy array to LaTeX table format in an output file.

The table can contain only the array data (no top header or left column side-header), or you can add either or both of the top row or side column headers. Leave ‘header’ or ‘leftCol’ as None if you don’t want these.

The output format of the array data can be specified, i.e. scientific notation or fixed decimal point.

Args:
filename (string): text writing output path and filename
arr (np.array[N,M]): array with table data
header (string): column header in final latex format (optional)
leftCol ([string]): left column each row, in final latex format (optional)
formatstring (string): output format precision for array data (see np.savetxt) (optional)
filemode (string): file open mode [a=append, w=new file][t=text, b=binary] use binary for Python 3 (optional)
Returns:
None, writes a file to disk
Raises:
No exception is raised.
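
How the header, left column and format string could combine into table rows is sketched below. The helper name latex_rows and the exact row layout are assumptions made for illustration; the file written by arrayToLaTex may be laid out differently.

```python
def latex_rows(arr, header=None, left_col=None, formatstring="%10.4e"):
    """Return the body lines of a LaTeX tabular for a 2-D sequence."""
    lines = []
    if header is not None:
        # The header is already in final LaTeX form; only the row end is added.
        lines.append(header + r" \\ \hline")
    for i, row in enumerate(arr):
        cells = [formatstring % value for value in row]
        if left_col is not None:
            cells.insert(0, left_col[i])
        lines.append(" & ".join(cells) + r" \\")
    return lines

lines = latex_rows(
    [[1.0, 2.5], [3.0, 4.25]],
    header="& A & B",
    left_col=["row 1", "row 2"],
    formatstring="%.2f",
)
```

With formatstring="%.2f" the values are written in fixed decimal notation; the default "%10.4e" gives scientific notation.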
pyradi.ryfiles.epsLaTexFigure(filename, epsname, caption, scale=None, vscale=None, filemode='a', strPost='')
Write the LaTeX code to include an eps graphic as a latex figure.
The text is added to an existing file.
Args:
filename (string): text writing output path and filename.
epsname (string): filename/path to eps file (relative to where the LaTeX document is built).
caption (string): figure caption
scale (double): figure scale to textwidth [0..1]
vscale (double): figure scale to textheight [0..1]
filemode (string): file open mode (a=append, w=new file) (optional)
strPost (string): string to write to file after latex figure block (optional)
Returns:
None, writes a file to disk
Raises:
No exception is raised.
pyradi.ryfiles.read2DLookupTable(filename)

Read a 2D lookup table and extract the data.

The table has the following format:

line 1: xlabel ylabel title
line 2: 0 (vector of y (col) abscissa)
lines 3 and following: (element of x (row) abscissa), followed
by table data.

From line/row 3 onwards the first element is the x abscissa value followed by the row of data, one point for each y abscissa value.

The file format can be depicted as follows:

x-name y-name ordinates-name
0 y1 y2 y3 y4
x1 v11 v12 v13 v14
x2 v21 v22 v23 v24
x3 v31 v32 v33 v34
x4 v41 v42 v43 v44
x5 v51 v52 v53 v54
x6 v61 v62 v63 v64

This function reads the file and returns the individual data items.

Args:
filename (string): input path and filename
Returns:
xVec ((np.array[N])): x abscissae
yVec ((np.array[M])): y abscissae
data ((np.array[N,M])): data corresponding to the x, y abscissae
xlabel (string): x abscissa label
ylabel (string): y abscissa label
title (string): dataset title
Raises:
No exception is raised.
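
Parsing the layout shown above can be sketched with numpy. The table values are illustrative, and the sketch assumes that the three labels on line 1 contain no whitespace.

```python
import io

import numpy as np

# Illustrative table in the layout shown above: 2 x-values, 3 y-values.
text = io.StringIO(
    "x-name y-name ordinates-name\n"
    "0 10 20 30\n"
    "1 0.1 0.2 0.3\n"
    "2 0.4 0.5 0.6\n"
)

# Line 1 holds the three labels.
xlabel, ylabel, title = text.readline().split()

# The remaining lines form one numeric block.
block = np.loadtxt(text)
y_vec = block[0, 1:]    # first row, skipping the leading 0
x_vec = block[1:, 0]    # first column, skipping the leading 0
data = block[1:, 1:]    # table body, shape (len(x_vec), len(y_vec))
```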
pyradi.ryfiles.downloadFileUrl(url, saveFilename=None, proxy=None)

Download a file, given a URL.

The URL is used to download a file, to the saveFilename specified. If no saveFilename is given, the basename of the URL is used. Before downloading, the function first tests whether the file already exists.

Args:
url (string): the url to be accessed.
saveFilename (string): path to where the file must be saved (optional).
proxy (string): path to proxy server (optional).

The proxy string is something like this: proxy = {'https': r'https://username:password@proxyname:portnumber'}

Returns:
(string): Filename saved, or None if failed.
Raises:
Exceptions are handled internally and signaled by return value.
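
The file name defaulting and the exists-before-download check can be sketched as follows. The helper name default_save_name and the URL are hypothetical, and the download step itself is deliberately left out.

```python
import os
from urllib.parse import urlparse

def default_save_name(url, save_filename=None):
    """Return save_filename, or the basename of the URL path if none is given."""
    if save_filename is None:
        save_filename = os.path.basename(urlparse(url).path)
    return save_filename

name = default_save_name("https://example.com/data/archive.tgz")
# name == "archive.tgz"

# A download would only be attempted when the file is not already present.
needs_download = not os.path.exists(name)
```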
pyradi.ryfiles.unzipGZipfile(zipfilename, saveFilename=None)

Unzip a file that was compressed using the gzip format.

The file named by zipfilename is opened and decompressed to the saveFilename specified. If no saveFilename is given, the basename of the zipfilename is used, but with the file extension removed.

Args:
zipfilename (string): the zipfilename to be decompressed.
saveFilename (string): to where the file must be saved (optional).
Returns:
(string): Filename saved, or None if failed.
Raises:
Exceptions are handled internally and signaled by return value.
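
A minimal sketch of gzip decompression with the standard library. The helper name gunzip is hypothetical. Unlike the description above, this sketch keeps the output next to the input file instead of using only the basename.

```python
import gzip
import os
import shutil
import tempfile

def gunzip(zipfilename, save_filename=None):
    """Decompress a gzip file; the default output name drops the last extension."""
    if save_filename is None:
        save_filename = os.path.splitext(zipfilename)[0]
    with gzip.open(zipfilename, "rb") as src, open(save_filename, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return save_filename

# Create a small gzip file to decompress.
workdir = tempfile.mkdtemp()
gz_path = os.path.join(workdir, "notes.txt.gz")
with gzip.open(gz_path, "wb") as f:
    f.write(b"hello")

out_path = gunzip(gz_path)  # writes notes.txt in the same directory
```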
pyradi.ryfiles.untarTarfile(tarfilename, saveDirname=None)

Untar a tar archive, and save all files to the specified directory.

The tarfilename is used to open a file, extracting to the saveDirname specified. If no saveDirname is given, the local directory ‘.’ is used.

Args:
tarfilename (string): the name of the tar archive.
saveDirname (string): to where the files must be extracted
Returns:
([string]): list of filenames saved, or None if failed.
Raises:
Exceptions are handled internally and signaled by return value.
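
Extraction with the standard tarfile module can be sketched as below. The archive is built on the spot so the example is self-contained; all file and directory names are illustrative.

```python
import os
import tarfile
import tempfile

workdir = tempfile.mkdtemp()

# Build a small archive containing one file.
member_path = os.path.join(workdir, "data.txt")
with open(member_path, "w") as f:
    f.write("42\n")
tar_path = os.path.join(workdir, "archive.tar")
with tarfile.open(tar_path, "w") as tar:
    tar.add(member_path, arcname="data.txt")

# Extract everything into a separate directory and collect the member names.
save_dir = os.path.join(workdir, "out")
os.makedirs(save_dir)
with tarfile.open(tar_path, "r") as tar:
    names = tar.getnames()
    tar.extractall(save_dir)
```

Only archives from trusted sources should be extracted this way, since member paths in a tar file can point outside the target directory.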
pyradi.ryfiles.downloadUntar(tgzFilename, url, destinationDir=None, tarFilename=None, proxy=None)

Download and untar a compressed tar archive, and save all files to the specified directory.

The tarfilename is used to open the tar file, extracting to the destinationDir specified. If no destinationDir is given, the local directory ‘.’ is used. Before downloading, a check is done to determine if the file was already downloaded and exists in the local file system.

Args:
tgzFilename (string): the name of the tar archive file
url (string): url where to look for the file (not including the filename)
destinationDir (string): to where the files must be extracted (optional)
tarFilename (string): downloaded tar filename (optional)
proxy (string): path to proxy server (optional).

The proxy string is something like this: proxy = {'https': r'https://username:password@proxyname:portnumber'}

Returns:
([string]): list of filenames saved, or None if failed.
Raises:
Exceptions are handled internally and signaled by return value.
pyradi.ryfiles.open_HDF(filename)

Open and return an HDF5 file with the given filename.

See https://github.com/NelisW/pyradi/blob/master/pyradi/hdf5-as-data-format.md for more information on using HDF5 as a data structure.

Args:
filename (string): name of the file to be opened
Returns:
HDF5 file.
Raises:
No exception is raised.

Author: CJ Willers

pyradi.ryfiles.erase_create_HDF(filename)

Create and return a new HDF5 file with the given filename, erasing the file if it already exists.

See https://github.com/NelisW/pyradi/blob/master/pyradi/hdf5-as-data-format.md for more information on using HDF5 as a data structure.

Args:
filename (string): name of the file to be created
Returns:
HDF5 file.
Raises:
No exception is raised.

Author: CJ Willers

pyradi.ryfiles.print_HDF5_text(vartext)

Prints text in the visiting algorithm for an HDF5 file.

See https://github.com/NelisW/pyradi/blob/master/pyradi/hdf5-as-data-format.md for more information on using HDF5 as a data structure.

Args:
vartext (string): string to be printed
Returns:
HDF5 file.
Raises:
No exception is raised.

Author: CJ Willers

pyradi.ryfiles.print_HDF5_dataset_value(var, obj)

Prints a data set in the visiting algorithm for an HDF5 file.

See https://github.com/NelisW/pyradi/blob/master/pyradi/hdf5-as-data-format.md for more information on using HDF5 as a data structure.

Args:
var (string): path to a dataset
obj (h5py dataset): dataset to be printed
Returns:
HDF5 file.
Raises:
No exception is raised.

Author: CJ Willers

pyradi.ryfiles.get_HDF_branches(hdf5File)

Print list of all the branches in the file

See https://github.com/NelisW/pyradi/blob/master/pyradi/hdf5-as-data-format.md for more information on using HDF5 as a data structure.

Args:
hdf5File (H5py file): the file to be opened
Returns:
HDF5 file.
Raises:
No exception is raised.

Author: CJ Willers

pyradi.ryfiles.plotHDF5Bitmaps(hfd5f, prefix, pformat='png', lstimgs=None, debug=False)

Plot arrays in the HDF5 file as scaled bitmap images.

See https://github.com/NelisW/pyradi/blob/master/pyradi/hdf5-as-data-format.md for more information on using HDF5 as a data structure.

Zero in the array is retained as black in the image; only the maximum value is scaled to 255.

Args:
hfd5f (H5py file): the file to be opened
prefix (string): prefix to be prepended to filename
pformat (string): type of file to be created png/jpeg
lstimgs ([string]): list of paths to images in the HDF5 file
Returns:
Nothing.
Raises:
No exception is raised.

Author: CJ Willers
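
The scaling rule for bitmaps (zero stays black, the maximum maps to 255) can be sketched in numpy as follows. The array values are illustrative, and the sketch assumes non-negative data with a maximum greater than zero.

```python
import numpy as np

arr = np.array([[0.0, 0.5], [1.0, 2.0]])  # illustrative image data

# Scale by the maximum only, so 0 stays 0 (black) and the maximum maps to 255.
scaled = (arr * 255.0 / arr.max()).astype(np.uint8)
```

Because no offset is subtracted, the ratio between pixel values is preserved, apart from truncation to integers.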

pyradi.ryfiles.plotHDF5Images(hfd5f, prefix, colormap=<matplotlib.colors.LinearSegmentedColormap object>, cbarshow=True, lstimgs=None, logscale=False, debug=False)

Plot images contained in hfd5f with colour map to show magnitude.

See https://github.com/NelisW/pyradi/blob/master/pyradi/hdf5-as-data-format.md for more information on using HDF5 as a data structure.

http://wiki.scipy.org/Cookbook/Matplotlib/Show_colormaps

Args:
hfd5f (H5py file): the file to be opened
prefix (string): prefix to be prepended to filename
colormap (Matplotlib colour map): colour map to be used in plot
cbarshow (boolean): indicate if colour bar must be shown
lstimgs ([string]): list of paths to images in the HDF5 file
logscale (boolean): True if display must be on log scale
Returns:
Nothing.
Raises:
No exception is raised.

Author: CJ Willers

pyradi.ryfiles.plotHDF5Histograms(hfd5f, prefix, format='png', lstimgs=None, bins=50)

Plot histograms of images contained in hfd5f

See https://github.com/NelisW/pyradi/blob/master/pyradi/hdf5-as-data-format.md for more information on using HDF5 as a data structure.

Retain zero in the array as black in the image, only scale the max value to 255

Args:
hfd5f (H5py file): the file to be opened
prefix (string): prefix to be prepended to filename
format (string): type of file to be created png/jpeg
lstimgs ([string]): list of paths to images in the HDF5 file
bins ([int]): Number of bins to be used in histogram
Returns:
Nothing.
Raises:
No exception is raised.

Author: CJ Willers