ocl.cli.compute_dataset_size

Script to compute the size of a dataset.

This is useful when subsampling data using transformations, in order to determine the final dataset size. The size of the dataset is typically needed when running distributed training to ensure that all nodes and GPU training processes are presented with the same number of batches.
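
As a rough illustration of why the count matters, here is a minimal sketch that counts the samples surviving a subsampling filter and derives how many full batches each training process could draw. The helper names `count_samples` and `batches_per_process`, the filter, and the batch and world sizes are all hypothetical and are not part of `ocl`; the script itself may compute the size differently.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def count_samples(samples: Iterable[T], keep: Callable[[T], bool]) -> int:
    """Count the samples that survive a (hypothetical) subsampling filter."""
    return sum(1 for sample in samples if keep(sample))


def batches_per_process(dataset_size: int, batch_size: int, world_size: int) -> int:
    """Number of full batches every process can draw, dropping the remainder."""
    return dataset_size // (batch_size * world_size)


# Illustrative data: 1000 integer "samples", of which every third one is kept.
size = count_samples(range(1000), lambda x: x % 3 == 0)

# Assumed setup: 4 training processes, each using a batch size of 8.
n_batches = batches_per_process(size, batch_size=8, world_size=4)
print(size, n_batches)
```

Because the filter is only evaluated while iterating, the final size cannot be read off the raw dataset in advance, which is why a separate counting pass is useful.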

ComputeSizeConfig dataclass

Configuration of a training run.

Source code in ocl/cli/compute_dataset_size.py
import dataclasses
from typing import Any, Dict


@dataclasses.dataclass
class ComputeSizeConfig:
    """Configuration of a training run."""

    dataset: Any
    plugins: Dict[str, Dict] = dataclasses.field(default_factory=dict)
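
The `plugins` field uses `default_factory=dict`, so each instance gets its own empty dictionary rather than sharing one mutable default. The sketch below redefines the dataclass locally to show this; the dataset values and the `example_plugin` entry are placeholders chosen for illustration, not values the real configuration expects.

```python
import dataclasses
from typing import Any, Dict


@dataclasses.dataclass
class ComputeSizeConfig:
    """Local copy of the dataclass shown above, for illustration only."""

    dataset: Any
    plugins: Dict[str, Dict] = dataclasses.field(default_factory=dict)


# Placeholder dataset values; the real config expects a dataset description.
first = ComputeSizeConfig(dataset="placeholder-a")
second = ComputeSizeConfig(dataset="placeholder-b")

# Hypothetical plugin entry added to the first instance only.
first.plugins["example_plugin"] = {"option": 1}

print(first.plugins)
print(second.plugins)
```

Only `dataset` is required; `plugins` can be omitted and filled in later without affecting other instances.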