vsf.dataset module
- class vsf.dataset.BaseDataset[source]
Bases: object
Base class for sequential datasets. Each entry is a sequence of items, typically dicts mapping sequence names to NumPy arrays.
- class vsf.dataset.DatasetConfig(type: str = '', path: str = '', keys: dict = <factory>, cache: bool = True, sensor_keys: ~typing.Dict[str, str] = <factory>, control_keys: ~typing.Dict[str, str] = <factory>)[source]
Bases: object
Configuration of a dataset, for the two types of dataset we support.
- cache: bool = True
Whether to cache data in memory. Only used for MultiModalDataset.
- control_keys: Dict[str, str]
the keys in the dataset used for the controls
- keys: dict
key definition for MultiModalDataset
- path: str = ''
Path where the dataset is stored.
- sensor_keys: Dict[str, str]
the keys in the dataset used for matching sensor measurements
- type: str = ''
Dataset type: either 'file_loader' or an empty string.
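As an illustration of the fields listed above, the following minimal sketch defines a local stand-in dataclass with the same field names and defaults as the documented signature, so it runs without vsf installed. `DatasetConfigSketch` is not part of vsf, and the path, key names, and the assumption that the `<factory>` defaults produce empty dicts are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DatasetConfigSketch:
    """Local stand-in mirroring the documented DatasetConfig fields (not the vsf class)."""
    type: str = ''
    path: str = ''
    keys: dict = field(default_factory=dict)  # assumed: factory yields an empty dict
    cache: bool = True
    sensor_keys: Dict[str, str] = field(default_factory=dict)
    control_keys: Dict[str, str] = field(default_factory=dict)


# The path and every key name below are hypothetical placeholders.
config = DatasetConfigSketch(
    type='file_loader',
    path='data/my_dataset',
    keys={'joint_angles': 7},
    sensor_keys={'tactile': 'tactile_readings'},
    control_keys={'arm': 'joint_angles'},
)

# With no arguments, every field falls back to its documented default.
default_config = DatasetConfigSketch()
print(config)
print(default_config)
```

With the real class, the equivalent call would presumably be `vsf.dataset.DatasetConfig(...)` with the same keyword arguments. How the values of `sensor_keys` and `control_keys` are interpreted is not stated here, so the mappings above are only placeholders.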
- class vsf.dataset.MultiModalDataset(data_types: dict, dir_path: str, cache_data=True, sensor_keys=None, control_keys=None)[source]
Bases: BaseDataset
A dataset that can store multiple types of data along with metadata.
The dataset structure consists of a top-level directory containing multiple subfolders, each named with seq_ followed by a sequence ID.
- Parameters:
data_types (dict) – Mapping between dataset entry names and their types. Each type is one of:
  - int: 1D column vector of the default NumPy array type.
  - (int, dtype): 1D column vector of the specified type.
  - ((shape), dtype): N-dimensional array of the given shape and type.
  dtype can be any array type, or “img” for images (saved and loaded as multiple PNG files instead of a NumPy array).
dir_path (str) – Path of the top-level folder that data is loaded from and saved to. If it does not exist, it is created. Every time a new sequence is collected, a subfolder is created using the Unix timestamp and populated with data (NumPy arrays / images). This class can read and write multiple sequences within this top-level folder.
cache_data (bool, optional) – Whether to cache data in memory. Defaults to True.
sensor_keys (list, optional) – Keys used for matching sensor measurements.
control_keys (list, optional) – Keys used for the controls.
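The three documented forms of a `data_types` entry can be illustrated with a small sketch. The helper `normalize_spec`, the entry names, and the shapes are hypothetical and not part of vsf. Dtypes are written as strings to keep the example dependency-free, and a 1D vector of length n is assumed to have shape `(n,)`. The real class may process these specs differently.

```python
from typing import Tuple

# NumPy's default floating-point array type, used when only a length is given.
DEFAULT_DTYPE = "float64"


def normalize_spec(spec) -> Tuple[Tuple[int, ...], str]:
    """Hypothetical helper: expand one documented spec into (shape, dtype)."""
    if isinstance(spec, int):
        # Form 1: int -> 1D vector of the default type.
        return (spec,), DEFAULT_DTYPE
    size_or_shape, dtype = spec
    if isinstance(size_or_shape, int):
        # Form 2: (int, dtype) -> 1D vector of the given type.
        return (size_or_shape,), dtype
    # Form 3: ((shape), dtype) -> N-dimensional array.
    return tuple(size_or_shape), dtype


# Entry names and sizes are placeholders chosen for illustration.
data_types = {
    "joint_angles": 7,
    "joint_torques": (7, "float32"),
    "depth": ((480, 640), "uint16"),
    "rgb": ((480, 640, 3), "img"),  # "img" entries are stored as PNG files
}

normalized = {name: normalize_spec(spec) for name, spec in data_types.items()}
for name, (shape, dtype) in normalized.items():
    print(f"{name}: shape={shape}, dtype={dtype}")
```

A dict like `data_types` above is the kind of value the `data_types` parameter describes. Whether dtypes may be passed as strings rather than NumPy dtype objects is an assumption of this sketch.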
- data_types
Processed mapping of dataset entry names and their types.
- Type:
dict
- dir_path
The dataset storage directory.
- Type:
str
- seq_names
List of sequence subfolders.
- Type:
list
- cache_data
Whether data is cached in memory.
- Type:
bool
- seq_cache
Cached data storage.
- Type:
dict
- sensor_keys
Keys used for sensor matching.
- Type:
list
- control_keys
Keys used for control matching.
- Type:
list
- IMAGE16_DTYPE
alias of uint16
- IMAGE8_DTYPE
alias of uint8
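The on-disk layout described for MultiModalDataset (a top-level directory whose subfolders are named seq_ followed by a sequence ID) can be sketched with the standard library alone. The function `list_sequences` and the sequence IDs are hypothetical. This only illustrates how a list such as `seq_names` could be gathered, not how vsf actually does it.

```python
import os
import tempfile

SEQ_PREFIX = "seq_"


def list_sequences(dir_path):
    """Return the sorted names of subfolders whose names start with 'seq_'."""
    return sorted(
        name
        for name in os.listdir(dir_path)
        if name.startswith(SEQ_PREFIX)
        and os.path.isdir(os.path.join(dir_path, name))
    )


# Build a throwaway top-level folder with two sequence subfolders.
with tempfile.TemporaryDirectory() as root:
    for seq_id in ("1700000000", "1700000060"):  # placeholder timestamp-like IDs
        os.makedirs(os.path.join(root, SEQ_PREFIX + seq_id))
    os.makedirs(os.path.join(root, "notes"))  # ignored: lacks the seq_ prefix
    found = list_sequences(root)
    print(found)
```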