Guide for Python

A module to read and write data in ebf format.

EBF is a binary format for storing data. It is designed so that data can be read and written easily and efficiently.

  • Store multiple data items in one file, each having a unique tag name
    • tagnames follow the convention of unix style pathname e.g. /x or /mydata/x
    • this allows hierarchical storage of data
  • Automatic type and endian conversion
  • Support for multiple programming languages
    • data can easily be read in C, C++, Fortran, Java, IDL and Matlab
    • facilitates easy distribution of data
  • Comprehensive numpy support
    • data is read back as numpy arrays
    • almost any numpy array can be written
    • Nested numpy structures are also supported
  • Read and write directly a recursive dictionary of numpy arrays
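
The hierarchical tag names behave much like paths in a small file system. As an illustration only, the sketch below groups flat tag names into a nested dictionary using the standard library. It is not code from the ebf module, and `nest_tags` is a hypothetical helper introduced here.

```python
import posixpath


def nest_tags(flat):
    """Group a flat {tagname: value} mapping into nested dictionaries.

    Hypothetical helper for illustration; it is not part of the ebf module.
    """
    tree = {}
    for tagname, value in flat.items():
        # "/mydata/x" -> directory "/mydata", leaf name "x"
        directory, leaf = posixpath.split(tagname)
        node = tree
        for part in directory.strip("/").split("/"):
            if part:  # skip the empty string produced by the root path "/"
                node = node.setdefault(part, {})
        node[leaf] = value
    return tree


flat = {"/x": 1, "/mydata/x": 2, "/mydata/y": 3}
nested = nest_tags(flat)
print(nested)
```

Items under /mydata/ end up in their own sub-dictionary, which mirrors the idea of reading or writing a recursive dictionary of arrays.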

To install

$pip install ebfpy           OR
$pip install ebfpy --user

Alternatively

$tar -zxvf ebfpy_x.x.x.tar.gz
$cd ebfpy_x.x.x
$python setup.py install --user                            OR 
$python setup.py install --user --install-scripts=mypath   OR
$python setup.py install  --install-scripts=mypath 

The --install-scripts option, if specified, determines where the command line script ebftkpy is installed; the ebf module itself is always installed in a standard location. It is better to set this manually (to something like ‘/usr/local/bin’ or a directory in your home folder) because the standard script installation location might not be in your search path. With the --user option the scripts are generally installed in ~/.local/bin/.
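
To find out whether the per-user script directory is in the search path, a small check along the following lines can help. This is a sketch that assumes a POSIX layout (scripts under the user base plus "bin"); `user_script_dir` and `is_on_path` are hypothetical helpers, not part of ebfpy.

```python
import os
import site


def user_script_dir():
    """Return the per-user script directory used by --user installs.

    Assumes the POSIX layout (<user base>/bin); Windows uses another name.
    """
    # site.getuserbase() is typically ~/.local on Linux
    return os.path.join(site.getuserbase(), "bin")


def is_on_path(directory, path_value=None):
    """Return True if directory is one of the entries of a PATH-style string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    target = os.path.normpath(directory)
    entries = [os.path.normpath(p) for p in path_value.split(os.pathsep) if p]
    return target in entries


script_dir = user_script_dir()
print(script_dir, "on PATH:", is_on_path(script_dir))
```

If the check reports False, either add the directory to PATH or pass an explicit location with --install-scripts.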

To run the test suite just do (from within folder ebfpy_x.x.x)

$./ebf.py 

Example:

Write specific numpy arrays.

>>> import ebf
>>> import numpy
>>> x = numpy.random.rand(2,5)
>>> y = numpy.random.rand(2,5)
>>> ebf.write('check.ebf', '/x', x, "w")
>>> ebf.write('check.ebf', '/y', y, "a")

Write to a different path within an ebf file.

>>> ebf.write('check.ebf', '/mypath/x', x, "a")
>>> ebf.write('check.ebf', '/mypath/y', y, "a")

Read back the written arrays

>>> x1 = ebf.read('check.ebf', '/x')
>>> y1 = ebf.read('check.ebf', '/mypath/y')

Read all items in an ebf path as a dictionary, such that data[“x”] holds the same values as x1 and data[“y”] holds the same values as y1.

>>> data = ebf.read('check.ebf', '/mypath/')

Check the contents of the file.

>>> ebf.info('check.ebf')
check.ebf 2460 bytes
------------------------------------------------------------------
name                           dtype    endian  unit       dim       
------------------------------------------------------------------
/.ebf/info                     int64    little             [5]       
/.ebf/htable                   int8     little             [1256]    
/x                             float64  little             [2 5]     
/y                             float64  little             [2 5]     
/mypath/x                      float64  little             [2 5]     
/mypath/y                      float64  little             [2 5]     

Split a structure and write individual data items in path “/mypath/” in an ebf file.

>>> dth = numpy.dtype([('data_u1', 'u1', (2, 5)), ('data_u2', 'u2', (2, 5))])
>>> data = numpy.zeros(1, dtype = dth)
>>> ebf.write('check.ebf', '/mypath/', data, "w")
>>> data1 = ebf.read('check.ebf', '/mypath/')
>>> ebf.info('check.ebf') 
check.ebf 1906 bytes
------------------------------------------------------------------
name                           dtype    endian  unit       dim       
------------------------------------------------------------------
/.ebf/info                     int64    little             [5]       
/.ebf/htable                   int8     little             [1256]    
/mypath/data_u1                uint8    little             [2 5]     
/mypath/data_u2                uint16   little             [2 5] 

Write a nested structure and read it back.

>>> dth = numpy.dtype([('data_u1', 'u1', (2, 5)), ('data_u2', 'u2', (2, 5))])
>>> dth1 = numpy.dtype([('data_u1', 'u1', (2, 5)), ('point1', dth, (1, ))])
>>> data = numpy.zeros(10, dtype = dth1)
>>> ebf.write("check.ebf", "/data", data, "w")
>>> data1 = ebf.read("check.ebf", "/data")
>>> ebf.info("check.ebf")
check.ebf 2247 bytes
------------------------------------------------------------------
name                           dtype    endian  unit       dim       
------------------------------------------------------------------
/.ebf/info                     int64    little             [5]       
/.ebf/htable                   int8     little             [1256]    
/data                          struct   little             [10]      
structure definition:
ver-1 
struct {
uint8 data_u1 2  2  5  ;
struct {
uint8 data_u1 2  2  5  ;
uint16 data_u2 2  2  5  ;
} point1 1 1   ; 
} anonymous 1 1   ; 

Write a string and read it back as a string. Note that the return type is numpy.ndarray, so the tostring() method must be used to convert it back to a string.

>>> x = "abcdefghijkl"
>>> ebf.write("check.ebf", "/mystr", numpy.array(x), "w")
>>> y = ebf.read("check.ebf", "/mystr").tostring()

Write a list of strings and read it back as a numpy.ndarray of type numpy.string

>>> x = ["abc", "abcdef"]
>>> ebf.write("check.ebf", "/mystr", numpy.array(x), "w")
>>> y = ebf.read("check.ebf", "/mystr")
>>> print y[0] == "abc",y[1] == "abcdef"
True True

Write with units and read it back.

>>> data = numpy.zeros(1, dtype = "int32")
>>> ebf.write('check.ebf', '/data', data, "w",dataunit="100 m/s")
>>> print(ebf.unit('check.ebf', '/data'))

Check if a data item is present.

>>> ebf.containsKey('check.ebf', '/data')

ebf.cat(filename, tagname, delimiter=' ', tableon=0)

Print data items in ASCII format

Args:

filename(str):

tagname(str):

Kwargs:
delimiter(str) - ‘ ‘ or ‘, ‘ for csv

Example:

>>> ebf.cat('check.ebf','/x /y',', ')
>>> ebf.cat('check.ebf','/x+',', ')
>>> ebf.cat('check.ebf','/x+',', ',1)

ebf.check(filename)

Check that the file is not corrupted

Args:
filename(str):

Kwargs:

Example:

>>> ebf.check('check.ebf')

ebf.clearEbfMap()

Clears cached information about all files. This could be used to conserve memory after a lot of different files have been read.

>>> ebf.clearEbfMap()

ebf.containsKey(filename, dataname)

Check if a data item is present in an ebf file.

Args:

filename : a string specifying filename

dataname : name of data item

Returns:
1 if an item is present else 0

Example:

>>> ebf.containsKey('check.ebf','/x')

ebf.copy(filename1, filename2, mode='a', tagnames='', outpath=None)

Copy data items from one file to another

Args:

filename1(str):

filename2(str):

mode(str) : ‘w’ or ‘a’

tagnames(str) : if blank, all items are copied; otherwise supply a space-separated list of data items as a single string

outpath(str): Path ending with ‘/’ into which to copy items

Example:

>>> ebf.copy("check1.ebf",'check2.ebf',mode='w',tagnames='/x /y')
>>> ebf.copy("check1.ebf",'check2.ebf',tagnames='/x')
>>> ebf.copy("check1.ebf",'check2.ebf')

ebf.dict2npstruct(data, basekey=None, keylist=None)

Convert a python dict containing numpy arrays to a numpy struct

Args:

data :

basekey(str): Only those items in dict whose size matches that of data[basekey] will be used.

keylist(str): list of keys to be used when constructing npstruct

ebf.diff(filename1, filename2)

Perform diff operation on two files. Ignores data items starting with “/.” which are for internal use. If file contents are same it does not print anything.

Args:

filename1(str):

filename2(str):

Example:

>>> ebf.diff("check1.ebf","check2.ebf")

ebf.getHeader(filename, dataname)

Get header of the data item

Args:

filename(str):

dataname(str):

Returns:
str.

Example:

>>> ebf.getHeader("check.ebf","/x")

ebf.info(filename, option=0)

Get summary of the contents of a file

Args:
filename(str):

Kwargs:

Example:

>>> ebf.info('check.ebf')

ebf.initialize(filename)

Initialize a file for writing with mode=’w’. After this one can use mode=’a’ to write rest of the items.

Args:
filename(str):

Example:

>>> ebf.initialize('check.ebf')
>>> ebf.write('check.ebf','/x',[0,1,2],'a')
>>> ebf.write('check.ebf','/y',[0,1,2],'a')
is the same as
>>> ebf.write('check.ebf','/x',[0,1,2],'w')
>>> ebf.write('check.ebf','/y',[0,1,2],'a')

ebf.iterate(filename, tagname, cache)

An iterator that reads data in parts of a given size. Useful for reading big arrays that are difficult to fit in RAM.

Args:

filename(str):

tagname(str) : the name of data to be read. Multiple items of the same size can be read by appending a + sign

cache(int) : number of data items to read at a time

Example:

>>> temp=0.0
>>> for x in ebf.iterate('check.ebf','/x',1000):
...     temp=temp+numpy.sum(x)

To read all items whose size matches that of “/x”

>>> temp=0.0
>>> for data in ebf.iterate('check.ebf','/x+',1000):
...     temp=temp+numpy.sum(data['/x'])
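
The idea behind reading part by part can be sketched with the standard library alone. The example below is only an illustration of the chunked-accumulation pattern: it reads a headerless file of native-endian doubles, which is not the EBF layout, and `iter_chunks` is a hypothetical helper rather than anything provided by the ebf module.

```python
import array
import os
import tempfile


def iter_chunks(path, chunk_len, typecode="d"):
    """Yield successive array.array chunks of at most chunk_len values.

    Hypothetical helper: reads a headerless file of native-endian values,
    which is NOT the EBF layout.
    """
    itemsize = array.array(typecode).itemsize
    with open(path, "rb") as handle:
        while True:
            raw = handle.read(chunk_len * itemsize)
            if not raw:
                break
            chunk = array.array(typecode)
            chunk.frombytes(raw)
            yield chunk


# Write 2500 doubles to a temporary file, then sum them 1000 at a time.
values = array.array("d", (float(i) for i in range(2500)))
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(values.tobytes())
    tmp_path = tmp.name

chunk_sizes = []
total = 0.0
for chunk in iter_chunks(tmp_path, 1000):
    chunk_sizes.append(len(chunk))
    total += sum(chunk)
os.remove(tmp_path)

print(chunk_sizes, total)
```

Only one chunk is held in memory at a time, so the peak memory use is set by the chunk length rather than by the size of the array on disk.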

ebf.npstruct2dict(data)

Convert an array of numpy struct to a python dict of numpy arrays

Args:
data :

ebf.read(filename, path='/', recon=0, ckon=1, begin=0, end=None)

Read data from an ebf file

Args:

filename(str) :

path(str) : tagname of data to be read from the ebf file or a path to the data items within the file. If ending with + then all arrays in the same path having the same size as the specified array are read. Useful to load tables where individual columns are written separately.

recon(integer): Should be 1 to load data objects recursively, or 0 to load only the data objects under the current path. Default is 0.

ckon : option that determines if checksum is to be compared with checksum on file. Default is to compare, but if there is little possibility of file being externally modified then it can be set to 0.

Returns:
numpy.ndarray or a dictionary of numpy.ndarray. If multiple data items are to be read as a dictionary, the path must end with ‘/’.
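
The selection rule for a path ending with + can be pictured as follows. This is a sketch under stated assumptions, not the module's implementation: `same_size_siblings` is a hypothetical helper, the shapes are made up, and "same size" is taken here to mean an equal leading dimension (the number of rows of a table).

```python
import posixpath


def same_size_siblings(shapes, tagname):
    """Return tags in the same path as tagname whose leading dimension matches.

    shapes maps tag names to shape tuples. Hypothetical helper; comparing the
    leading dimension is an assumption about what "same size" means.
    """
    directory = posixpath.dirname(tagname)
    rows = shapes[tagname][0]
    return sorted(
        name
        for name, shape in shapes.items()
        if posixpath.dirname(name) == directory and shape and shape[0] == rows
    )


shapes = {
    "/x": (100,),
    "/y": (100,),
    "/id": (100, 3),
    "/meta": (5,),
    "/mypath/x": (100,),
}
selected = same_size_siblings(shapes, "/x")
print(selected)
```

Items in other paths (/mypath/x) and items of a different length (/meta) are left out, which is what makes the + form convenient for tables stored column by column.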

ebf.read_ind(filename, tagname, ind)

Read data from specified locations in a file

Args:

filename(str):

tagname(str) : the name of data to be read

ind(str) : list or array of indices to be read

ebf.rename(filename, oldkey, newkey)

Rename a data item in an ebf file

Args:

filename: string

oldkey: a string, the name of key to rename

newkey: a string, the new name. If the new key is blank ‘’, then a name of the form ‘/.tr’+oldkey+’.X’ is created. Here X is an integer greater than or equal to zero, which is incremented each time an item with the same name is deleted.

Example:

>>> ebf.rename('check.ebf','/x1','/x2')
>>> ebf.rename('check.ebf','/x1','')    
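
The naming rule used when newkey is blank can be illustrated with plain strings. The sketch below follows the rule as described in this section; `trash_name` is a hypothetical helper and choosing the first unused counter is an assumption about how the increment is resolved.

```python
def trash_name(oldkey, existing):
    """Return the first unused name of the form '/.tr' + oldkey + '.X'.

    Hypothetical helper illustrating the naming rule; existing is any
    collection of tag names already present in the file.
    """
    counter = 0
    while True:
        candidate = "/.tr" + oldkey + "." + str(counter)
        if candidate not in existing:
            return candidate
        counter += 1


tags = {"/x1", "/y"}
first = trash_name("/x1", tags)
tags.add(first)
second = trash_name("/x1", tags)
print(first, second)
```

Because the generated names start with "/.", they fall under the items reserved for internal use.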

ebf.stat(filename, tagname, recon=0)

Get statistics of a data item

Args:

filename(str):

tagname(str):

Kwargs:

Example:

>>> ebf.stat('check.ebf','/x /y ')

ebf.swapEndian(filename)

Swaps the endianness of the file: little to big or big to little

Args:
filename(str):

Example:

>>> ebf.swapEndian("check.ebf")
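
What swapping endianness means at the byte level can be shown with the standard struct module. This sketch works on a raw byte string only, not on an EBF file (which also carries headers), and `swap_item_bytes` is a hypothetical helper.

```python
import struct


def swap_item_bytes(raw, itemsize):
    """Reverse the byte order of every itemsize-byte item in raw."""
    return b"".join(
        raw[i:i + itemsize][::-1] for i in range(0, len(raw), itemsize)
    )


values = [1.5, -2.0, 1024.0]

# Pack three float64 values in little-endian order, then swap each item.
little = struct.pack("<3d", *values)
big = swap_item_bytes(little, 8)

# Decoding the swapped bytes as big-endian recovers the same numbers.
decoded = list(struct.unpack(">3d", big))
print(big == struct.pack(">3d", *values), decoded)
```

The numeric values are unchanged; only the order of the bytes within each item differs between the two representations.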

ebf.unit(filename, dataname)

Get the physical units of a data item if supplied in the file, otherwise an empty string

Args:

filename(str):

dataname(str):

Returns:
str.

Example:

>>> ebf.unit("check.ebf","/x")

ebf.update_ind(filename, dataname, data, ind=None)

Update existing data array in a file at user given index positions.

Args:

filename(str):

dataname(str) : the name of data to be updated

data : data to be updated

ind : indices of the array on file that need to be updated.

ebf.write(filename, tagname, data, mode, dataunit='')

Write data to a file

Args:

filename(str):

tagname(str) : the name of data to be written to the ebf file or a path ending with ‘/’ if multiple items are to be written

data(numpy.ndarray) : data to be written

mode(str) : writing mode, “w” to write a fresh file or “a” to append to an existing file

Kwargs:
dataunit(str): units of data default is a blank string
