swiftsimio.accelerated module

Define functions that can be accelerated by numba.

Numba does not use classes, unfortunately.

swiftsimio.accelerated.jit(signature_or_function=None, locals=mappingproxy({}), cache=False, pipeline_class=None, boundscheck=None, **options)

This decorator is used to compile a Python function into native code.

Parameters:
  • signature_or_function – The (optional) signature or list of signatures to be compiled. If not passed, required signatures will be compiled when the decorated function is called, depending on the argument values. As a convenience, you can directly pass the function to be compiled instead.

  • locals (dict) – Mapping of local variable names to Numba types. Used to override the types deduced by Numba’s type inference engine.

  • pipeline_class (type numba.compiler.CompilerBase) – The compiler pipeline type for customizing the compilation stages.

  • options

    For a cpu target, valid options are:
    nopython: bool

    Set to True to disable the use of PyObjects and Python API calls. Default value is True.

    forceobj: bool

    Set to True to force the use of PyObjects for every value. Default value is False.

    looplift: bool

    Set to True to enable jitting loops in nopython mode while leaving surrounding code in object mode. This allows functions to allocate NumPy arrays and use Python objects, while the tight loops in the function can still be compiled in nopython mode. Any arrays that the tight loop uses should be created before the loop is entered. Default value is True.

    error_model: str

    The error model affects divide-by-zero behavior. Valid values are ‘python’ and ‘numpy’. The ‘python’ model raises an exception. The ‘numpy’ model sets the result to +/-inf or nan. Default value is ‘python’.

    inline: str or callable

    The inline option determines whether a function is inlined into its caller when called. String options are ‘never’ (default), which never inlines, and ‘always’, which always inlines. If a callable is provided, it is called with the call expression node that is requesting inlining, the caller’s IR and the callee’s IR as arguments, and is expected to return a truthy value if the function should be inlined. NOTE: This inlining is performed at the Numba IR level and is in no way related to LLVM inlining.

    boundscheck: bool or None

    Set to True to enable bounds checking for array indices. Out-of-bounds accesses will raise IndexError. The default is to not do bounds checking. If False, bounds checking is disabled and out-of-bounds accesses can produce garbage results or segfaults. However, enabling bounds checking will slow down typical functions, so it is recommended to use this flag only for debugging. You can also set the NUMBA_BOUNDSCHECK environment variable to 0 or 1 to globally override this flag. The default value is None, which under normal execution equates to False, but if debug is set to True then bounds checking will be enabled.

Returns:

A callable usable as a compiled function. Actual compiling will be done lazily if no explicit signatures are passed.

Examples

The function can be used in the following ways:

  1. jit(signatures, **targetoptions) -> jit(function)

    Equivalent to:

        d = dispatcher(function, targetoptions)
        for signature in signatures:
            d.compile(signature)

    Create a dispatcher object for a python function. Then, compile the function with the given signature(s).

    Example:

        @jit("int32(int32, int32)")
        def foo(x, y):
            return x + y

        @jit(["int32(int32, int32)", "float32(float32, float32)"])
        def bar(x, y):
            return x + y

  2. jit(function, **targetoptions) -> dispatcher

    Create a dispatcher function object that specializes at call site.

    Examples:

        @jit
        def foo(x, y):
            return x + y

        @jit(nopython=True)
        def bar(x, y):
            return x + y
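As a minimal sketch of the lazy-compilation form on a loop-heavy function: the name harmonic_sum and the no-op fallback decorator are hypothetical, added only so the snippet runs whether or not numba is installed. It imports jit from numba directly; the jit documented here is assumed to be applied in the same way.

```python
try:
    from numba import jit
except ImportError:
    # Hypothetical fallback for this sketch only: a decorator factory
    # that hands the function back unchanged when numba is missing.
    def jit(*args, **kwargs):
        def wrapper(func):
            return func

        return wrapper


@jit(nopython=True)
def harmonic_sum(n):
    """Return 1/1 + 1/2 + ... + 1/n using an explicit loop."""
    total = 0.0
    for i in range(1, n + 1):
        total += 1.0 / i
    return total


# No signature was given, so with numba present the function is
# compiled on this first call, for an integer argument.
result = harmonic_sum(4)
```

With numba available, later calls with the same argument types reuse the compiled version.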

class swiftsimio.accelerated.prange(*args)

Bases: object

Provides a 1D parallel iterator that generates a sequence of integers. In non-parallel contexts, prange is identical to range.
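A short sketch of prange in a reduction loop may help. The function name sum_of_squares is hypothetical, and the fallback branch (a no-op decorator with prange set to range) is an assumption made so the snippet also runs without numba.

```python
try:
    from numba import jit, prange
except ImportError:
    # Hypothetical fallback for this sketch only: no compilation, and
    # prange degrades to the built-in range.
    def jit(*args, **kwargs):
        def wrapper(func):
            return func

        return wrapper

    prange = range


@jit(nopython=True, parallel=True)
def sum_of_squares(n):
    """Sum i*i for i in [0, n); iterations may run in parallel."""
    total = 0
    for i in prange(n):
        # A scalar accumulated with += is treated as a reduction.
        total += i * i
    return total


result = sum_of_squares(1000)
```

Without parallel=True the same loop runs serially and gives the same result, which matches the statement that prange is identical to range in non-parallel contexts.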

swiftsimio.accelerated.ranges_from_array(array: array) → ndarray

Find contiguous ranges of IDs in sorted list of IDs.

Parameters:

array (np.array of int) – Sorted list of IDs.

Returns:

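List of length-two arrays, one per contiguous range of IDs in the input array. Each pair is [first ID, last ID + 1], so the lower bound is inclusive and the upper bound is exclusive (see the example below).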

Return type:

np.ndarray

Examples

The array:

[0, 1, 2, 3, 5, 6, 7, 9, 11, 12, 13]

would return:

[[0, 4], [5, 8], [9, 10], [11, 14]]
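The example above can be reproduced with a few NumPy calls. This is an illustrative sketch, not the library's implementation; the name ranges_from_array_sketch is hypothetical, and it assumes a non-empty, sorted input.

```python
import numpy as np


def ranges_from_array_sketch(ids):
    """Return [start, stop) pairs for each run of consecutive IDs."""
    ids = np.asarray(ids)
    # Positions i where ids[i + 1] does not directly follow ids[i].
    breaks = np.flatnonzero(np.diff(ids) != 1)
    # A run starts at the first ID and after every break.
    starts = np.concatenate(([ids[0]], ids[breaks + 1]))
    # A run ends at every break and at the last ID; add 1 so that the
    # upper bound is exclusive, as in the example above.
    stops = np.concatenate((ids[breaks], [ids[-1]])) + 1
    return np.column_stack((starts, stops))


ranges = ranges_from_array_sketch([0, 1, 2, 3, 5, 6, 7, 9, 11, 12, 13])
```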
swiftsimio.accelerated.read_ranges_from_file_unchunked(handle: Dataset, ranges: ndarray, output_shape: tuple, output_type: type = <class 'numpy.float64'>, columns: slice = slice(None, None, None)) → array

Read only a selection of index ranges from a dataset that is not chunked.

Takes an HDF5 dataset and the set of ranges from ranges_from_array(), and reads only those ranges from the file.

Unfortunately this functionality is not built into HDF5.

Parameters:
  • handle (Dataset) – HDF5 dataset to slice data from.

  • ranges (np.ndarray) – Array of ranges (see ranges_from_array()).

  • output_shape (tuple) – Resultant shape of output.

  • output_type (type, optional) – numpy.dtype of output elements. If not supplied, we assume numpy.float64.

  • columns (slice, optional) – Selector for columns if using a multi-dimensional array. If the array is only a single dimension this is not used.

Returns:

Result from reading only the relevant values from handle.

Return type:

np.ndarray
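The idea of reading range by range into a preallocated output can be sketched as follows. This is not the library's code: read_ranges_sketch is a hypothetical name, a NumPy array stands in for the HDF5 dataset (both support slice indexing), and ranges are assumed to be [start, stop) pairs as produced by ranges_from_array().

```python
import numpy as np


def read_ranges_sketch(handle, ranges, output_shape,
                       output_type=np.float64, columns=np.s_[:]):
    """Copy only the rows inside each [start, stop) range of handle."""
    output = np.empty(output_shape, dtype=output_type)
    already_read = 0
    for start, stop in ranges:
        size = stop - start
        if size == 0:
            continue  # nothing to read for an empty range
        destination = slice(already_read, already_read + size)
        if handle.ndim > 1:
            # The column selector only applies to multi-dimensional data.
            output[destination] = handle[start:stop, columns]
        else:
            output[destination] = handle[start:stop]
        already_read += size
    return output


# A NumPy array stands in for the on-disk dataset in this sketch.
data = np.arange(20, dtype=np.float64)
selected = read_ranges_sketch(data, np.array([[0, 4], [5, 8], [11, 14]]), (10,))
```

The caller has to supply output_shape because the total number of selected rows is only known from the ranges.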

swiftsimio.accelerated.index_dataset(handle: Dataset, mask_array: array) → array

Index the dataset using the mask array.

As of March 2019, this is not a feature of h5py.

Parameters:
  • handle (Dataset) – Data to be indexed.

  • mask_array (np.array) – Mask used to index data.

Returns:

Subset of the data specified by the mask.

Return type:

np.ndarray

swiftsimio.accelerated.get_chunk_ranges(ranges: ndarray, chunk_size: ndarray, array_length: int) → ndarray

Return indices indicating which hdf5 chunk each range from ranges belongs to.

Parameters:
  • ranges (np.ndarray) – Array of ranges (see ranges_from_array()).

  • chunk_size (int) – Size of the hdf5 dataset chunks.

  • array_length (int) – Size of the dataset.

Returns:

Two dimensional array of bounds for the chunks that contain each range from ranges.

Return type:

np.ndarray
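One plausible way to compute such bounds is to round each range outward to chunk boundaries and merge results that touch or overlap. The sketch below is an assumption about the approach, not the library's implementation; get_chunk_ranges_sketch is a hypothetical name, and ranges are assumed to be sorted [start, stop) pairs.

```python
import numpy as np


def get_chunk_ranges_sketch(ranges, chunk_size, array_length):
    """Round each [start, stop) range outward to chunk boundaries."""
    chunk_ranges = []
    for start, stop in ranges:
        # Round the lower bound down to a multiple of chunk_size.
        lower = (start // chunk_size) * chunk_size
        # Round the upper bound up (ceiling division), but never past
        # the end of the dataset.
        upper = min(-(-stop // chunk_size) * chunk_size, array_length)
        if chunk_ranges and lower <= chunk_ranges[-1][1]:
            # Touches or overlaps the previous span: extend it.
            chunk_ranges[-1][1] = max(chunk_ranges[-1][1], upper)
        else:
            chunk_ranges.append([lower, upper])
    return np.array(chunk_ranges)


chunk_bounds = get_chunk_ranges_sketch([[2, 5], [7, 12], [42, 45]], 10, 48)
```

Here the first two ranges fall in neighbouring chunks and are merged into one span, while the last range is clipped to the dataset length.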

swiftsimio.accelerated.expand_ranges(ranges: ndarray) → array

Return an array of indices that are within the specified ranges.

Parameters:

ranges (np.ndarray) – Array of ranges (see ranges_from_array()).

Returns:

1D array of indices that fall within each range specified in ranges.

Return type:

np.ndarray
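A minimal sketch of this expansion, assuming [start, stop) pairs with an exclusive upper bound as in the ranges_from_array() example; expand_ranges_sketch is a hypothetical name and this is not the library's code.

```python
import numpy as np


def expand_ranges_sketch(ranges):
    """Return every index covered by the [start, stop) pairs, in order."""
    ranges = np.asarray(ranges)
    if len(ranges) == 0:
        return np.empty(0, dtype=np.int64)
    return np.concatenate([np.arange(start, stop) for start, stop in ranges])


indices = expand_ranges_sketch([[0, 4], [5, 8], [9, 10], [11, 14]])
```

Under that assumption, this is the inverse of the ranges_from_array() example: the pairs expand back to the original list of IDs.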

swiftsimio.accelerated.extract_ranges_from_chunks(array: ndarray, chunks: ndarray, ranges: ndarray) → ndarray

Return elements from array that are located within specified ranges.

array is a portion of the dataset being read consisting of all the chunks that contain the ranges specified in ranges. The chunks array contains the indices of the upper and lower bounds of these chunks. To find the elements of the dataset that lie within the specified ranges we first create an array indexing which chunk each range belongs to. From this information we create an array of adjusted ranges that takes into account that the array is not the whole dataset. We then return the values in array that are within the adjusted ranges.

Parameters:
  • array (np.ndarray) – Array containing data read in from snapshot.

  • chunks (np.ndarray) – Two dimensional array of bounds for the chunks that contain each range from ranges.

  • ranges (np.ndarray) – Array of ranges (see ranges_from_array()).

Returns:

Subset of array whose elements are within each range in ranges.

Return type:

np.ndarray
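The steps described above (find the chunk span for each range, shift the range into the coordinates of array, then slice) can be sketched as follows. The name extract_ranges_sketch is hypothetical; the sketch assumes non-overlapping chunk spans sorted by lower bound, [start, stop) ranges, and that array is the concatenation of the spans in order.

```python
import numpy as np


def extract_ranges_sketch(array, chunks, ranges):
    """Pick the elements of array that fall inside the given ranges."""
    chunks = np.asarray(chunks)
    ranges = np.asarray(ranges)
    # Where each chunk span starts inside array: the total size of all
    # spans that come before it.
    sizes = chunks[:, 1] - chunks[:, 0]
    offsets = np.concatenate(([0], np.cumsum(sizes)[:-1]))
    # Index of the chunk span that contains the start of each range.
    owner = np.searchsorted(chunks[:, 0], ranges[:, 0], side="right") - 1
    # Shift from dataset coordinates to positions within array.
    shift = chunks[owner, 0] - offsets[owner]
    adjusted = ranges - shift[:, None]
    return np.concatenate([array[start:stop] for start, stop in adjusted])


dataset = np.arange(100) * 10
chunks = np.array([[0, 10], [40, 60]])
# Only the two chunk spans have been "read" from the dataset.
loaded = np.concatenate([dataset[0:10], dataset[40:60]])
picked = extract_ranges_sketch(loaded, chunks, [[2, 5], [42, 45], [55, 58]])
```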

swiftsimio.accelerated.read_ranges_from_file_chunked(handle: Dataset, ranges: ndarray, output_shape: tuple, output_type: type = <class 'numpy.float64'>, columns: slice = slice(None, None, None)) → array

Read only a selection of index ranges from a dataset that is chunked.

Takes an HDF5 dataset and the set of ranges from ranges_from_array(), and reads only those ranges from the file.

Unfortunately this functionality is not built into HDF5.

Parameters:
  • handle (Dataset) – HDF5 dataset to slice data from.

  • ranges (np.ndarray) – Array of ranges (see ranges_from_array()).

  • output_shape (tuple) – Resultant shape of output.

  • output_type (type, optional) – numpy.dtype of output elements. If not supplied, we assume numpy.float64.

  • columns (slice, optional) – Selector for columns if using a multi-dimensional array. If the array is only a single dimension this is not used.

Returns:

Result from reading only the relevant values from handle.

Return type:

np.ndarray

swiftsimio.accelerated.read_ranges_from_file(handle: Dataset, ranges: ndarray, output_shape: tuple, output_type: type = <class 'numpy.float64'>, columns: slice = slice(None, None, None)) → array

Select the appropriate reader (chunked or unchunked, see below) for the dataset and use it to read the requested ranges.

Parameters:
  • handle (Dataset) – HDF5 dataset to slice data from.

  • ranges (np.ndarray) – Array of ranges (see ranges_from_array()).

  • output_shape (tuple) – Resultant shape of output.

  • output_type (type, optional) – numpy.dtype of output elements. If not supplied, we assume numpy.float64.

  • columns (slice, optional) – Selector for columns if using a multi-dimensional array. If the array is only a single dimension this is not used.

Returns:

Result from reading only the relevant values from handle.

Return type:

np.ndarray

See also

read_ranges_from_file_chunked

Reads data ranges for chunked hdf5 file.

read_ranges_from_file_unchunked

Reads data ranges for unchunked hdf5 file.

swiftsimio.accelerated.read_ranges_from_hdfstream(handle: Dataset, ranges: ndarray, output_shape: tuple, output_type: type = <class 'numpy.float64'>, columns: slice = slice(None, None, None)) → array

Request the specified ranges from the hdfstream server.

Takes an hdfstream remote dataset and the set of ranges from ranges_from_array(), and sends an HTTP request for those ranges.

Parameters:
  • handle (Dataset) – HDF5 dataset to slice data from.

  • ranges (np.ndarray) – Array of ranges (see ranges_from_array()).

  • output_shape (tuple) – Resultant shape of output.

  • output_type (type, optional) – numpy.dtype of output elements. If not supplied, we assume numpy.float64.

  • columns (slice, optional) – Selector for columns if using a multi-dimensional array. If the array is only a single dimension this is not used.

Returns:

Result from reading only the relevant values from handle.

Return type:

np.ndarray

swiftsimio.accelerated.list_of_strings_to_arrays(lines: list[str]) → array

Convert a list of space-delimited values to arrays.

Parameters:

lines (list[str]) – List of strings containing numbers separated by a set of spaces.

Returns:

List of numpy arrays, one per column.

Return type:

list[np.array]

Notes

Currently not suitable for numba acceleration due to mixed datatype usage.
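To illustrate the conversion, here is a sketch that splits each line on whitespace and builds one array per column. The name columns_from_lines_sketch is hypothetical, and the rule used to choose between integer and float columns (inspecting the tokens of the first line) is an assumption made for this example.

```python
import numpy as np


def columns_from_lines_sketch(lines):
    """Turn space-delimited text lines into one NumPy array per column."""
    rows = [line.split() for line in lines]
    columns = []
    # zip(*rows) transposes the table: one tuple of tokens per column.
    for tokens in zip(*rows):
        first = tokens[0]
        # Assumed rule: a decimal point or exponent marks a float column.
        if "." in first or "e" in first.lower():
            columns.append(np.array([float(t) for t in tokens], dtype=np.float64))
        else:
            columns.append(np.array([int(t) for t in tokens], dtype=np.int64))
    return columns


arrays = columns_from_lines_sketch(["1 0.5 2e3", "2 1.5 4e3"])
```

The per-column mix of integer and float types is the kind of mixed datatype usage that the note above refers to.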

swiftsimio.accelerated.to_native_byteorder_inplace(arr: ndarray) → None

Ensure that arr is native endian without making a copy.

Parameters:

arr (np.ndarray) – Array to convert to native endian.
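A copy-free conversion can be sketched with NumPy by swapping the bytes in the existing buffer and then reinterpreting that buffer with a native-endian dtype. Unlike the function documented above, which returns None, this hypothetical as_native_byteorder_sketch returns a view that shares memory with its input; that difference is a choice made for this example only.

```python
import numpy as np


def as_native_byteorder_sketch(arr):
    """Swap bytes in place if needed and return a native-endian view."""
    if arr.dtype.isnative:
        return arr  # already native: nothing to do
    # Reverse the bytes of every element inside the existing buffer.
    arr.byteswap(inplace=True)
    # Reinterpret the same buffer with the native-endian dtype.
    return arr.view(arr.dtype.newbyteorder("="))


big_endian = np.array([1, 2, 3], dtype=">i4")
native = as_native_byteorder_sketch(big_endian)
```

After the call, the original big_endian object should no longer be used on a little-endian machine, because its buffer has been byte-swapped underneath it.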