swiftsimio.accelerated module

Functions that can be accelerated by Numba. Unfortunately, Numba does not use classes, so everything here is a plain function.

swiftsimio.accelerated.ranges_from_array(array: array) → ndarray

Finds contiguous ranges of IDs in a sorted list of IDs.

Parameters: array (np.array of int) – sorted list of IDs
Returns: list of length-two arrays giving the contiguous ranges of IDs in the input array; each range is [start, stop) with the stop value exclusive, as in the example below
Return type: np.ndarray

Examples

The array

[0, 1, 2, 3, 5, 6, 7, 9, 11, 12, 13]

would return

[[0, 4], [5, 8], [9, 10], [11, 14]]
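The grouping in this example can be reproduced with a few lines of NumPy. The following is a minimal sketch, not the library's implementation: the name contiguous_ranges is hypothetical, and it assumes the input is a sorted, non-empty array of unique integers.

```python
import numpy as np


def contiguous_ranges(ids):
    """Hypothetical helper: group sorted integer IDs into [start, stop) ranges."""
    ids = np.asarray(ids)
    # Positions where the next ID is not the current ID plus one.
    breaks = np.flatnonzero(np.diff(ids) != 1)
    # A new range starts at the first ID and after every break.
    starts = np.concatenate(([ids[0]], ids[breaks + 1]))
    # Stop values are exclusive, matching the example above.
    stops = np.concatenate((ids[breaks], [ids[-1]])) + 1
    return np.column_stack((starts, stops))


print(contiguous_ranges([0, 1, 2, 3, 5, 6, 7, 9, 11, 12, 13]))
```

Each row of the result is one range, so a single isolated ID such as 9 becomes the one-element range [9, 10].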

swiftsimio.accelerated.read_ranges_from_file_unchunked(handle: Dataset, ranges: ndarray, output_shape: Tuple, output_type: type = numpy.float64, columns: IndexExpression = slice(None, None, None)) → array

Takes an HDF5 dataset and the set of ranges from ranges_from_array(), and reads only those ranges from the file.

Unfortunately this functionality is not built into HDF5.

Parameters:

handle (Dataset) – HDF5 dataset to slice data from
ranges (np.ndarray) – Array of ranges (see ranges_from_array())
output_shape (Tuple) – Resultant shape of output.
output_type (type, optional) – numpy type of output elements. If not supplied, we assume np.float64.
columns (np.lib.index_tricks.IndexExpression, optional) – Selector for columns if using a multi-dimensional array. If the array is only a single dimension this is not used.

Returns:

array – Result from reading only the relevant values from handle.

Return type:

np.ndarray
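The idea of reading only selected ranges can be illustrated as one slice read per range, copied into a preallocated output. This is a minimal sketch under stated assumptions, not the library's code: read_ranges_sketch is a hypothetical name, ranges are taken to be [start, stop) pairs, and a NumPy array stands in for the HDF5 dataset because it accepts the same slicing syntax.

```python
import numpy as np


def read_ranges_sketch(handle, ranges, output_shape, output_type=np.float64,
                       columns=np.s_[:]):
    """Hypothetical helper: copy each [start, stop) slice of handle into one array."""
    output = np.empty(output_shape, dtype=output_type)
    already_read = 0
    for start, stop in ranges:
        size = stop - start
        if output.ndim > 1:
            # Multi-dimensional data: apply the column selector as well.
            output[already_read:already_read + size] = handle[start:stop, columns]
        else:
            # One-dimensional data: the column selector is not used.
            output[already_read:already_read + size] = handle[start:stop]
        already_read += size
    return output


# A NumPy array standing in for an HDF5 dataset (an assumption for illustration).
data = np.arange(20.0).reshape(10, 2)
print(read_ranges_sketch(data, [[0, 2], [5, 7]], output_shape=(4, 2)))
```

The output shape has to be known up front, which is why output_shape is a required argument in this sketch: its first dimension must equal the total number of rows covered by the ranges.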

swiftsimio.accelerated.index_dataset(handle: Dataset, mask_array: array) → array

Indexes the dataset using the mask array.

As of March 2019, this is not a feature of h5py.

Parameters:

handle (Dataset) – data to be indexed
mask_array (np.array) – mask used to index data

Returns:

Subset of the data specified by the mask

Return type:

np.array

swiftsimio.accelerated.concatenate_ranges(ranges: ndarray) → ndarray

Returns an array of ranges with consecutive ranges merged if there is no gap between them

Parameters: ranges (np.ndarray) – Array of ranges (see ranges_from_array())
Returns: two dimensional array of ranges
Return type: np.ndarray

Examples

>>> concatenate_ranges([[1,5],[6,10],[12,15]])
np.ndarray([[1,10],[12,15]])
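The merge shown in this example can be sketched as a single pass over the ranges. This is an illustrative sketch rather than the library's implementation: merge_adjacent_ranges is a hypothetical name, the input is assumed to be sorted by start value, and "no gap" is read as "the next range starts at most one past the previous end", which is what the example implies.

```python
import numpy as np


def merge_adjacent_ranges(ranges):
    """Hypothetical helper: merge consecutive ranges with no gap between them."""
    merged = [list(ranges[0])]
    for lower, upper in ranges[1:]:
        if lower <= merged[-1][1] + 1:
            # Starts at most one past the previous end: extend the last range.
            merged[-1][1] = max(merged[-1][1], upper)
        else:
            # A real gap: start a new range.
            merged.append([lower, upper])
    return np.asarray(merged)


print(merge_adjacent_ranges([[1, 5], [6, 10], [12, 15]]))
```

Merging neighbouring ranges reduces the number of separate reads needed later, since each merged range can be fetched with one slice.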

swiftsimio.accelerated.get_chunk_ranges(ranges: ndarray, chunk_size: ndarray, array_length: int) → ndarray

Return indices indicating which HDF5 chunk each range from ranges belongs to.

Parameters:

ranges (np.ndarray) – Array of ranges (see ranges_from_array())
chunk_size (int) – size of the HDF5 dataset chunks
array_length (int) – size of the dataset

Returns:

two dimensional array of bounds for the chunks that contain each range from ranges

Return type:

np.ndarray
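One way to picture the chunk bounds is to round each range outwards to the nearest chunk boundaries. The sketch below is a simplified illustration, not the library's implementation: chunk_bounds is a hypothetical name, ranges are assumed to be [start, stop) pairs along a one-dimensional dataset, and overlapping or repeated chunk bounds are left unmerged.

```python
import numpy as np


def chunk_bounds(ranges, chunk_size, array_length):
    """Hypothetical helper: bounds of the chunks that enclose each range."""
    ranges = np.asarray(ranges)
    # Round each start down to the nearest chunk boundary.
    lower = (ranges[:, 0] // chunk_size) * chunk_size
    # Round each stop up to the next chunk boundary (ceiling division),
    # without running past the end of the dataset.
    upper = np.minimum(-(-ranges[:, 1] // chunk_size) * chunk_size, array_length)
    return np.column_stack((lower, upper))


print(chunk_bounds([[3, 7], [12, 18]], chunk_size=5, array_length=17))
```

The array_length argument matters for the final chunk, which is usually shorter than chunk_size when the dataset length is not an exact multiple of it.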

swiftsimio.accelerated.expand_ranges(ranges: ndarray) → array

Return an array of indices that are within the specified ranges

Parameters: ranges (np.ndarray) – Array of ranges (see ranges_from_array())
Returns: 1D array of indices that fall within each range specified in ranges
Return type: np.array
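Expanding ranges back into individual indices is the inverse of grouping IDs into ranges. A minimal sketch, assuming [start, stop) ranges as produced in the ranges_from_array example; the name indices_in_ranges is hypothetical and this is not the library's code.

```python
import numpy as np


def indices_in_ranges(ranges):
    """Hypothetical helper: list every index covered by the [start, stop) ranges."""
    return np.concatenate([np.arange(start, stop) for start, stop in ranges])


print(indices_in_ranges([[0, 4], [5, 8], [9, 10], [11, 14]]))
```

Applied to the ranges from the ranges_from_array example, this recovers the original list of IDs.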

swiftsimio.accelerated.extract_ranges_from_chunks(array: ndarray, chunks: ndarray, ranges: ndarray) → ndarray

Returns elements from array that are located within specified ranges

array is a portion of the dataset being read consisting of all the chunks that contain the ranges specified in ranges. The chunks array contains the indices of the upper and lower bounds of these chunks. To find the elements of the dataset that lie within the specified ranges we first create an array indexing which chunk each range belongs to. From this information we create an array of adjusted ranges that takes into account that the array is not the whole dataset. We then return the values in array that are within the adjusted ranges.

Parameters:

array (np.ndarray) – array containing data read in from snapshot
chunks (np.ndarray) – two dimensional array of bounds for the chunks that contain each range from ranges
ranges (np.ndarray) – Array of ranges (see ranges_from_array())

Returns:

subset of array whose elements are within each range in ranges

Return type:

np.ndarray
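The index adjustment described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: extract_sketch is a hypothetical name, chunks and ranges are taken to be sorted [start, stop) pairs, the chunks do not overlap, and every range lies entirely inside one of the chunk entries.

```python
import numpy as np


def extract_sketch(array, chunks, ranges):
    """Hypothetical helper: pull dataset ranges out of concatenated chunk data."""
    chunks = np.asarray(chunks)
    ranges = np.asarray(ranges)
    # Position within `array` at which each chunk's data begins.
    lengths = chunks[:, 1] - chunks[:, 0]
    offsets = np.concatenate(([0], np.cumsum(lengths)[:-1]))
    # Index of the chunk that contains the start of each range.
    which = np.searchsorted(chunks[:, 0], ranges[:, 0], side="right") - 1
    # Shift dataset indices so that they address `array` instead.
    shift = chunks[which, 0] - offsets[which]
    adjusted = ranges - shift[:, None]
    return np.concatenate([array[lo:hi] for lo, hi in adjusted])


dataset = np.arange(100) * 10
chunks = [[10, 20], [40, 50]]
# Only the data belonging to the two chunks has been read.
array = np.concatenate([dataset[lo:hi] for lo, hi in chunks])
print(extract_sketch(array, chunks, [[12, 15], [41, 43], [47, 50]]))
```

The printed values are the same ones that slicing the full dataset with the original ranges would give, even though only the two chunks were loaded.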

swiftsimio.accelerated.read_ranges_from_file_chunked(handle: Dataset, ranges: ndarray, output_shape: Tuple, output_type: type = numpy.float64, columns: IndexExpression = slice(None, None, None)) → array

Takes an HDF5 dataset and the set of ranges from ranges_from_array(), and reads only those ranges from the file.

Unfortunately this functionality is not built into HDF5.

Parameters:

handle (Dataset) – HDF5 dataset to slice data from
ranges (np.ndarray) – Array of ranges (see ranges_from_array())
output_shape (Tuple) – Resultant shape of output.
output_type (type, optional) – numpy type of output elements. If not supplied, we assume np.float64.
columns (np.lib.index_tricks.IndexExpression, optional) – Selector for columns if using a multi-dimensional array. If the array is only a single dimension this is not used.

Returns:

array – Result from reading only the relevant values from handle.

Return type:

np.ndarray

swiftsimio.accelerated.read_ranges_from_file(handle: Dataset, ranges: ndarray, output_shape: Tuple, output_type: type = numpy.float64, columns: IndexExpression = slice(None, None, None)) → array

Wrapper function that selects which version of read_ranges_from_file should be used: read_ranges_from_file_chunked() or read_ranges_from_file_unchunked().

Parameters:

handle (Dataset) – HDF5 dataset to slice data from
ranges (np.ndarray) – Array of ranges (see ranges_from_array())
output_shape (Tuple) – Resultant shape of output.
output_type (type, optional) – numpy type of output elements. If not supplied, we assume np.float64.
columns (np.lib.index_tricks.IndexExpression, optional) – Selector for columns if using a multi-dimensional array. If the array is only a single dimension this is not used.

Returns:

array – Result from reading only the relevant values from handle.

Return type:

np.ndarray