Python module
Overview
Convenience functions
dump – Dump (nested) dictionary to file.
copy – Copy groups/datasets from one HDF5-archive to another.
copy_dataset – Copy a dataset from one file to another.
copydatasets – Copy datasets from one HDF5-archive to another.
compare – Compare two files. Return dictionary with differences.
compare_rename – Compare two files. Return three dictionaries with differences.
Manipulate path
abspath – Return absolute path.
join – Join path components.
Iterators
getdatapaths – Get paths to all datasets and groups that contain attributes.
getdatasets – Iterator to traverse all datasets in a HDF5-archive.
getgroups – Paths of all groups in a HDF5-archive.
filter_datasets – From a list of paths, filter those paths that do not point to datasets.
copydatasets – Copy datasets from one HDF5-archive to another.
Verify
verify – Try reading each dataset.
exists – Check if a path exists in the HDF5-archive.
exists_any – Check if any of the input paths exist in the HDF5-archive.
exists_all – Check if all of the input paths exist in the HDF5-archive.
equal – Check that a dataset is equal in both files.
allequal – Check that all listed datasets are equal in both files.
Documentation
- GooseHDF5.G5list(args: list[str])
Command-line tool to print datasets from a file, see --help.
- Parameters
args – Command-line arguments (should be all strings).
- GooseHDF5.G5print(args: list[str])
Command-line tool to print datasets from a file, see --help.
- Parameters
args – Command-line arguments (should be all strings).
- GooseHDF5.abspath(path)
Return absolute path.
- Parameters
path (str) – A HDF5-path.
- Returns
The absolute path.
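To give a feel for what an absolute HDF5-path looks like, the sketch below normalizes a path with the standard library's posixpath. The helper name to_absolute is hypothetical, and the normalization rules are an assumption: GooseHDF5.abspath may handle edge cases differently.

```python
import posixpath


def to_absolute(path: str) -> str:
    """Hypothetical stand-in: anchor the path at the root "/" and normalize it."""
    # HDF5 paths use "/" as separator on every platform, hence posixpath.
    return posixpath.normpath(posixpath.join("/", path))


print(to_absolute("data/c"))
print(to_absolute("/data/../e"))
```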
- GooseHDF5.allequal(source: h5py._hl.files.File, dest: h5py._hl.files.File, source_datasets: list[str], dest_datasets: Optional[list[str]] = None, root: Optional[str] = None, attrs: bool = True, matching_dtype: bool = False)
Check that all listed datasets are equal in both files.
- Parameters
source (h5py.File) – The source HDF5-archive.
dest (h5py.File) – The destination HDF5-archive.
source_datasets (list) – List of dataset-paths in source.
dest_datasets (list) – List of dataset-paths in dest, defaults to source_datasets.
root – Path prefix for all dest_datasets.
attrs – Compare attributes (the same way as datasets).
matching_dtype – Check that not only the data but also the type matches.
- GooseHDF5.compare(a: Union[str, h5py._hl.files.File], b: Union[str, h5py._hl.files.File], paths_a: list[str] = None, paths_b: list[str] = None, attrs: bool = True, matching_dtype: bool = False)
Compare two files. Return dictionary with differences:
    {
        "->" : ["/path/in/b/but/not/in/a", ...],
        "<-" : ["/path/in/a/but/not/in/b", ...],
        "!=" : ["/path/in/both/but/different/data", ...],
        "==" : ["/data/matching", ...]
    }
- Parameters
a – HDF5-archive (as an opened h5py.File or as a filepath).
b – HDF5-archive (as an opened h5py.File or as a filepath).
paths_a – Paths from a to consider. Default: read from getdatapaths().
paths_b – Paths from b to consider. Default: read from getdatapaths().
attrs – Compare attributes (the same way as datasets).
matching_dtype – Check that not only the data but also the type matches.
- Returns
Dictionary with the differences.
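The meaning of the four keys can be illustrated without any HDF5 file. The sketch below applies the same classification to two plain {path: value} dictionaries. The name compare_mappings is hypothetical; this is not the GooseHDF5 implementation, which works on archives and can also compare attributes and dtypes.

```python
def compare_mappings(a: dict, b: dict) -> dict:
    """Hypothetical stand-in: classify the paths of two {path: value} mappings."""
    ret = {"->": [], "<-": [], "!=": [], "==": []}
    for path in sorted(set(a) | set(b)):
        if path not in a:
            ret["->"].append(path)  # present in b but not in a
        elif path not in b:
            ret["<-"].append(path)  # present in a but not in b
        elif a[path] != b[path]:
            ret["!="].append(path)  # present in both, different data
        else:
            ret["=="].append(path)  # present in both, matching data
    return ret


a = {"/x": [1, 2], "/y": 3.0, "/only_a": 0}
b = {"/x": [1, 2], "/y": 4.0, "/only_b": 0}
diff = compare_mappings(a, b)
print(diff)
```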
- GooseHDF5.compare_rename(a: h5py._hl.files.File, b: h5py._hl.files.File, rename: Optional[list[str]] = None, paths_a: Optional[list[str]] = None, paths_b: Optional[list[str]] = None, attrs: bool = True, matching_dtype: bool = False)
Compare two files. Return three dictionaries with differences:
    # plain comparison between a and b
    {
        "->" : ["/path/in/b/but/not/in/a", ...],
        "<-" : ["/path/in/a/but/not/in/b", ...],
        "!=" : ["/path/in/both/but/different/data", ...],
        "==" : ["/data/matching", ...]
    }
    # comparison of renamed paths: list of paths in a
    {
        "!=" : ["/path/in/a/with/rename/path/not_equal", ...],
        "==" : ["/path/in/a/with/rename/path/matching", ...]
    }
    # comparison of renamed paths: list of paths in b
    {
        "!=" : ["/path/in/b/with/rename/path/not_equal", ...],
        "==" : ["/path/in/b/with/rename/path/matching", ...]
    }
- Parameters
a – HDF5-archive (as an opened h5py.File or as a filepath).
b – HDF5-archive (as an opened h5py.File or as a filepath).
rename – List of renamed pairs: [["/a/0", "/b/1"], ...].
paths_a – Paths from a to consider. Default: read from getdatapaths().
paths_b – Paths from b to consider. Default: read from getdatapaths().
attrs – Compare attributes (the same way as datasets).
matching_dtype – Check that not only the data but also the type matches.
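The three returned dictionaries can be sketched on plain {path: value} dictionaries. The name compare_with_rename is hypothetical. The sketch also assumes that renamed paths are left out of the plain comparison, which the text above does not state explicitly.

```python
def compare_with_rename(a: dict, b: dict, rename: list) -> tuple:
    """Hypothetical stand-in working on {path: value} mappings."""
    # Assumption: paths listed in ``rename`` are excluded from the plain comparison.
    keys_a = set(a) - {pa for pa, _ in rename}
    keys_b = set(b) - {pb for _, pb in rename}
    plain = {
        "->": sorted(keys_b - keys_a),
        "<-": sorted(keys_a - keys_b),
        "!=": sorted(p for p in keys_a & keys_b if a[p] != b[p]),
        "==": sorted(p for p in keys_a & keys_b if a[p] == b[p]),
    }

    # Each renamed pair is compared once and reported under both names.
    renamed_a = {"!=": [], "==": []}
    renamed_b = {"!=": [], "==": []}
    for path_a, path_b in rename:
        key = "==" if a[path_a] == b[path_b] else "!="
        renamed_a[key].append(path_a)
        renamed_b[key].append(path_b)

    return plain, renamed_a, renamed_b


a = {"/a/0": 1, "/same": 2, "/old": 5}
b = {"/b/1": 1, "/same": 2, "/new": 6}
plain, ren_a, ren_b = compare_with_rename(a, b, [["/a/0", "/b/1"], ["/old", "/new"]])
```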
- GooseHDF5.copy(source: h5py._hl.files.File, dest: h5py._hl.files.File, source_datasets: list[str], dest_datasets: Optional[list[str]] = None, root: Optional[str] = None, recursive: bool = True, skip: bool = False, expand_soft: bool = True)
Copy groups/datasets from one HDF5-archive source to another HDF5-archive dest. The datasets can be renamed by specifying a list of dest_datasets (whose entries should correspond to the source_datasets). In addition, a root (path prefix) for the destination dataset names can be specified.
- Parameters
source – The source HDF5-archive.
dest – The destination HDF5-archive.
source_datasets – List of dataset-paths in source.
dest_datasets – List of dataset-paths in dest, defaults to source_datasets.
root – Path prefix for all dest_datasets.
recursive – If the source is a group, copy all objects within that group recursively.
skip – Skip datasets that are not present in source.
expand_soft – Copy the underlying data of a link, or copy as link with the same path.
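How source_datasets, dest_datasets, and root interact can be shown by computing only the destination names. The helper resolve_destinations below is hypothetical and copies no data. It reflects the defaults described above, but the way root is joined to each name is an assumption.

```python
import posixpath


def resolve_destinations(source_datasets, dest_datasets=None, root=None):
    """Hypothetical helper: map each source path to its destination path."""
    if dest_datasets is None:
        dest_datasets = list(source_datasets)  # default: keep the same names
    if len(dest_datasets) != len(source_datasets):
        raise ValueError("dest_datasets must correspond to source_datasets")
    if root is not None:
        # Assumption: the prefix is prepended to every destination name.
        dest_datasets = [posixpath.join(root, d.lstrip("/")) for d in dest_datasets]
    return dict(zip(source_datasets, dest_datasets))


print(resolve_destinations(["/a", "/b/c"]))
print(resolve_destinations(["/a", "/b/c"], ["/x", "/y"], root="/backup"))
```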
- GooseHDF5.copy_dataset(source, dest, paths, compress=False, double_to_float=False)
Copy a dataset from one file to another. This function also copies possible attributes.
- Parameters
source (h5py.File) – The source HDF5-archive.
dest (h5py.File) – The destination HDF5-archive.
paths (str, list) – (List of) HDF5-path(s) to copy.
compress (bool) – Compress the destination dataset(s).
double_to_float (bool) – Convert doubles to floats before copying.
- GooseHDF5.copydatasets(source: h5py._hl.files.File, dest: h5py._hl.files.File, source_datasets: list[str], dest_datasets: Optional[list[str]] = None, root: Optional[str] = None)
Copy datasets from one HDF5-archive source to another HDF5-archive dest. The datasets can be renamed by specifying a list of dest_datasets (whose entries should correspond to the source_datasets). If the source is a Group object, by default all objects within that group will be copied recursively.
In addition, a root (path prefix) for the destination dataset names can be specified.
- Parameters
source – The source HDF5-archive.
dest – The destination HDF5-archive.
source_datasets – List of dataset-paths in source.
dest_datasets – List of dataset-paths in dest, defaults to source_datasets.
root – Path prefix for all dest_datasets.
- GooseHDF5.dump(file: h5py._hl.files.File, data: dict, root: str = '/')
Dump (nested) dictionary to file.
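One plausible reading of "dumping a nested dictionary" is that nested keys become groups and leaves become datasets. The sketch below shows only that mapping from keys to paths, using a hypothetical helper flatten. It writes no file, and the exact layout GooseHDF5.dump produces is an assumption here.

```python
def flatten(data: dict, root: str = "/") -> dict:
    """Hypothetical helper: flatten a nested dictionary to {path: value}."""
    ret = {}
    for key, value in data.items():
        path = root.rstrip("/") + "/" + str(key)
        if isinstance(value, dict):
            ret.update(flatten(value, path))  # a nested dict would become a group
        else:
            ret[path] = value  # a leaf would become a dataset
    return ret


nested = {"foo": {"a": 1, "b": [2, 3]}, "bar": 4.0}
flat = flatten(nested)
print(flat)
```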
- GooseHDF5.equal(source: h5py._hl.files.File, dest: h5py._hl.files.File, source_dataset: str, dest_dataset: Optional[str] = None, root: Optional[str] = None, attrs: bool = True, matching_dtype: bool = False)
Check that a dataset is equal in both files.
- Parameters
source (h5py.File) – The source HDF5-archive.
dest (h5py.File) – The destination HDF5-archive.
source_dataset (str) – Dataset-path in source.
dest_dataset (str) – Dataset-path in dest, defaults to source_dataset.
root – Path prefix for dest_dataset.
attrs – Compare attributes (the same way as datasets).
matching_dtype – Check that not only the data but also the type matches.
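The effect of attrs and matching_dtype can be sketched with a small stand-in object instead of a real dataset. FakeDataset and datasets_equal are hypothetical names, and the order of the checks is an assumption. The point is only that equal values of different type compare equal unless matching_dtype is set.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDataset:
    """Hypothetical stand-in for a dataset: values, a type name, and attributes."""

    data: list
    dtype: str
    attrs: dict = field(default_factory=dict)


def datasets_equal(a, b, attrs=True, matching_dtype=False):
    if matching_dtype and a.dtype != b.dtype:
        return False  # same values are not enough: the type must match too
    if a.data != b.data:
        return False
    if attrs and a.attrs != b.attrs:
        return False  # attributes are compared like the data
    return True


x = FakeDataset([1, 2, 3], "int64", {"unit": "m"})
y = FakeDataset([1.0, 2.0, 3.0], "float64", {"unit": "m"})
z = FakeDataset([1, 2, 3], "int64", {"unit": "s"})
```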
- GooseHDF5.exists(file, path)
Check if a path exists in the HDF5-archive.
- Parameters
file (h5py.File) – A HDF5-archive.
path (str) – HDF5-path.
- GooseHDF5.exists_all(file, paths)
Check if all of the input paths exist in the HDF5-archive.
- Parameters
file (h5py.File) – A HDF5-archive.
paths (list) – List of HDF5-paths.
- GooseHDF5.exists_any(file, paths)
Check if any of the input paths exist in the HDF5-archive.
- Parameters
file (h5py.File) – A HDF5-archive.
paths (list) – List of HDF5-paths.
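The three existence checks differ only in how the per-path results are combined. The sketch below uses a nested dictionary as a stand-in for an archive. The names path_exists, any_exists, and all_exist are hypothetical and are not the GooseHDF5 functions themselves.

```python
def path_exists(tree: dict, path: str) -> bool:
    """Hypothetical stand-in: walk a nested dict along a "/"-separated path."""
    node = tree
    for part in (p for p in path.split("/") if p):
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    return True


def any_exists(tree: dict, paths: list) -> bool:
    return any(path_exists(tree, p) for p in paths)


def all_exist(tree: dict, paths: list) -> bool:
    return all(path_exists(tree, p) for p in paths)


tree = {"data": {"c": 1, "d": 2}, "e": 3}
print(path_exists(tree, "/data/c"), any_exists(tree, ["/x", "/e"]), all_exist(tree, ["/x", "/e"]))
```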
- GooseHDF5.filter_datasets(file, paths)
From a list of paths, filter those paths that do not point to datasets.
- Parameters
file (h5py.File) – A HDF5-archive.
paths (list) – List of HDF5-paths.
- Returns
Filtered paths.
- GooseHDF5.getdatapaths(file, root: str = '/')
Get paths to all datasets and groups that contain attributes.
- Parameters
file – A HDF5-archive.
root – Start at a certain point along the path-tree.
- Returns
list[str].
- GooseHDF5.getdatasets(file, root='/', max_depth=None, fold=None)
Iterator to traverse all datasets in a HDF5-archive. One can choose to fold (not traverse deeper than):
  - Groups deeper than a certain max_depth.
  - A (list of) specific group(s).
- Parameters
file (h5py.File) – A HDF5-archive.
root (str) – Start at a certain point along the path-tree.
max_depth (int) – Set a maximum depth beyond which groups are folded.
fold (list) – Specify groups that are folded.
- Returns
Iterator.
- Example
Consider this file:
    /path/to/first/a
    /path/to/first/b
    /data/c
    /data/d
    /e
Calling:
    with h5py.File("...", "r") as file:
        for path in GooseHDF5.getpaths(file, max_depth=2, fold="/data"):
            print(path)
Will print:
    /path/to/...
    /data/...
    /e
The ... indicates that it concerns a folded group, not a dataset. Here, the first group was folded because of the maximum depth, the second because it was specifically requested to be folded.
- GooseHDF5.getgroups(file: h5py._hl.files.File, root: str = '/', has_attrs: bool = False, max_depth: Optional[int] = None) -> list[str]
Paths of all groups in a HDF5-archive.
- Parameters
file – A HDF5-archive.
root – Start at a certain point along the path-tree.
has_attrs – Return only groups that have attributes.
max_depth (int) – Set a maximum depth beyond which groups are folded.
- Returns
list[str].
- GooseHDF5.getpaths(data, root='/', max_depth=None, fold=None)
Iterator to traverse all datasets in a HDF5-archive. One can choose to fold (not traverse deeper than):
  - Groups deeper than a certain max_depth.
  - A (list of) specific group(s).
- Parameters
data (h5py.File) – A HDF5-archive.
root (str) – Start at a certain point along the path-tree.
max_depth (int) – Set a maximum depth beyond which groups are folded.
fold (list) – Specify groups that are folded.
- Returns
Iterator.
- Example
Consider this file:
    /path/to/first/a
    /path/to/first/b
    /data/c
    /data/d
    /e
Calling:
    with h5py.File('...', 'r') as data:
        for path in GooseHDF5.getpaths(data, max_depth=2, fold='/data'):
            print(path)
Will print:
    /path/to/...
    /data/...
    /e
The ... indicates that it concerns a folded group, not a dataset. Here, the first group was folded because of the maximum depth, and the second because it was specifically requested to be folded.
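The folding rules can be reproduced on a nested dictionary, which makes the role of max_depth and fold easier to see. The generator iter_paths below is a hypothetical stand-in, not the GooseHDF5 implementation. In particular, how depth is counted (root at depth 0, a group folded once it sits at max_depth) is an assumption chosen to match the example above.

```python
def iter_paths(tree: dict, max_depth=None, fold=None):
    """Hypothetical stand-in: yield dataset paths of a nested dict, folding groups."""
    if fold is None:
        fold = []
    elif isinstance(fold, str):
        fold = [fold]  # accept a single group as well as a list of groups

    def walk(node, path, depth):
        for name, child in node.items():
            child_path = path.rstrip("/") + "/" + name
            if not isinstance(child, dict):
                yield child_path  # a dataset
            elif child_path in fold or (max_depth is not None and depth + 1 >= max_depth):
                yield child_path + "/..."  # a folded group: do not descend
            else:
                yield from walk(child, child_path, depth + 1)

    yield from walk(tree, "/", 0)


tree = {
    "path": {"to": {"first": {"a": 1, "b": 2}}},
    "data": {"c": 3, "d": 4},
    "e": 5,
}
for path in iter_paths(tree, max_depth=2, fold="/data"):
    print(path)
```

Without max_depth and fold, the same generator yields every dataset path in full.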
- GooseHDF5.isnumeric(a)
Returns True if an array contains numeric values.
- Parameters
a (array) – An array.
- Returns
bool
- GooseHDF5.join(*args, root=False)
Join path components.
- Parameters
args (list) – Piece of a path.
- Returns
The concatenated path.
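Joining path components mostly means avoiding doubled or missing "/" separators. The sketch below shows one way to do that. The name join_paths is hypothetical, and the behaviour given to root (force a leading "/") is an assumption, since the root argument is not described above.

```python
def join_paths(*parts: str, root: bool = False) -> str:
    """Hypothetical stand-in: join path components with single "/" separators."""
    # Strip separators from every component and skip empty ones.
    pieces = [p.strip("/") for p in parts if p.strip("/")]
    joined = "/".join(pieces)
    # Keep a leading "/" if the first component had one, or force it (assumed meaning of root).
    if root or (parts and parts[0].startswith("/")):
        joined = "/" + joined
    return joined


print(join_paths("/data", "c"))
print(join_paths("data", "/sub/", "c"))
print(join_paths("data", "c", root=True))
```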
- GooseHDF5.print_attribute(source, paths: list[str])
Print paths to dataset and to all underlying attributes.
- Parameters
paths – List of paths.
- GooseHDF5.print_info(source, paths: list[str])
Print the paths to all datasets (one per line), including type information.
- Parameters
paths – List of paths.
- GooseHDF5.print_plain(source, paths: list[str], show_links: bool = False)
Print the paths to all datasets (one per line).
- Parameters
paths – List of paths.
show_links – Show the path the link points to.
- GooseHDF5.verify(file, datasets, error=False)
Try reading each dataset.
- Parameters
file (h5py.File) – A HDF5-archive.
datasets (list) – List of HDF5-paths to datasets.
error (bool) – If True, the function raises an error if reading failed. If False, the function just continues.
- Returns
List with only those datasets that can be successfully opened.
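The role of the error flag can be sketched with any callable that may fail when reading. The helper readable below is hypothetical. A plain dictionary lookup stands in for reading a dataset, and a KeyError stands in for a failed read.

```python
def readable(read, datasets, error=False):
    """Hypothetical stand-in: keep the paths for which ``read(path)`` succeeds."""
    ret = []
    for path in datasets:
        try:
            read(path)
        except Exception:
            if error:
                raise  # error=True: propagate the failure
            continue  # error=False: skip the unreadable dataset
        ret.append(path)
    return ret


store = {"/a": [1, 2], "/b": [3]}
ok = readable(store.__getitem__, ["/a", "/missing", "/b"])
print(ok)
```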