DatasetsΒΆ

Data is a wrapper for the driver class to show the data saved into his structure.

The datasets have groups of n-dimensions:

from dama.data.ds import Data
from dama.drivers.core import Zarr
import numpy as np

array_0 = np.random.rand(100, 1)
array_1 = np.random.rand(100, 2)
array_2 = np.random.rand(100, 3)
array_3 = np.random.rand(100, 6)
array_4 = (np.random.rand(100)*100).astype(int)
array_5 = np.random.rand(100).astype(str)
with Data(name=name, driver=Zarr(mode="w")) as data:
    data.from_data({"x": array_0, "y": array_1, "z": array_2, "a": array_3, "b": array_4, "c": array_5})

In the above example the dataset have x, y, z, a, b and c groups, each one with distinct shape, but with the same length.

with Data(name=name, driver=Zarr(mode="r"), auto_chunks=True) as data:
    print(data)
    print(data[["x", "y"]])
    print(data["x"] + data["y"])  # same as above
    data["x"] = data["x"].darray * 3
    print(data["x"].darray.dask)
DaGroup OrderedDict([('a', (100, 6)), ('b', (100,)), ('c', (100,)), ('x', (100, 1)), ('y', (100,)), ('z', (100, 3))])
DaGroup OrderedDict([('x', (100, 1)), ('y', (100,))])
DaGroup OrderedDict([('x', (100, 1)), ('y', (100,))])
<dask.highlevelgraph.HighLevelGraph object at 0x7f682a8e5b70>