
ocl.typing

Types used in the object-centric learning framework.

ImageData = TensorType['batch_size', 'channels', 'height', 'width'] module-attribute

VideoData = TensorType['batch_size', 'frames', 'channels', 'height', 'width'] module-attribute

ImageOrVideoData = Union[VideoData, ImageData] module-attribute

TextData = TensorType['batch_size', 'max_tokens'] module-attribute

CNNImageFeatures = ImageData module-attribute

TransformerImageFeatures = TensorType['batch_size', 'n_spatial_features', 'feature_dim'] module-attribute

ImageFeatures = TransformerImageFeatures module-attribute

VideoFeatures = TensorType['batch_size', 'frames', 'n_spatial_features', 'feature_dim'] module-attribute

ImageOrVideoFeatures = Union[ImageFeatures, VideoFeatures] module-attribute

Positions = TensorType['n_spatial_features', 'spatial_dims'] module-attribute

PooledFeatures = TensorType['batch_size', 'feature_dim'] module-attribute

ObjectFeatures = TensorType['batch_size', 'n_objects', 'object_dim'] module-attribute

EmptyIndicator = TensorType['batch_size', 'n_objects'] module-attribute

ObjectFeatureAttributions = TensorType['batch_size', 'n_objects', 'n_spatial_features'] module-attribute

ConditioningOutput = TensorType['batch_size', 'n_objects', 'object_dim'] module-attribute

Output of conditioning modules.
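
The aliases above are shape annotations in the style of torchtyping's TensorType. The sketch below, with purely illustrative sizes (a 14x14 = 196-token backbone is an assumption used only for the numbers), shows plain torch tensors that match them:

import torch

# Illustrative sizes only; any consistent values work.
batch_size, n_objects, object_dim = 8, 7, 64
n_spatial_features, feature_dim = 196, 768

# ImageData: (batch_size, channels, height, width).
image = torch.randn(batch_size, 3, 224, 224)

# TransformerImageFeatures / ImageFeatures: (batch_size, n_spatial_features, feature_dim).
features = torch.randn(batch_size, n_spatial_features, feature_dim)

# ConditioningOutput: (batch_size, n_objects, object_dim), e.g. initial slot states.
conditioning = torch.randn(batch_size, n_objects, object_dim)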

FrameFeatures dataclass

Features associated with a single frame.

Source code in ocl/typing.py
@dataclasses.dataclass
class FrameFeatures:
    """Features associated with a single frame."""

    features: ImageFeatures
    positions: Positions

features: ImageFeatures class-attribute

positions: Positions class-attribute
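
A minimal construction sketch; the sizes are illustrative, and positions here use two spatial dimensions, matching the Positions annotation:

import torch

from ocl.typing import FrameFeatures

batch_size, n_spatial_features, feature_dim = 4, 196, 768

frame = FrameFeatures(
    features=torch.randn(batch_size, n_spatial_features, feature_dim),
    positions=torch.rand(n_spatial_features, 2),  # (n_spatial_features, spatial_dims)
)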

FeatureExtractorOutput dataclass

Output of feature extractor.

Source code in ocl/typing.py
@dataclasses.dataclass
class FeatureExtractorOutput:
    """Output of feature extractor."""

    features: ImageOrVideoFeatures
    positions: Positions
    aux_features: Optional[Dict[str, torch.Tensor]] = None

    def __iter__(self) -> Iterable[FrameFeatures]:
        """Iterate over features and positions per frame."""
        if self.features.ndim == 4:
            # Video features: split along the frame dimension.
            for frame_features in torch.split(self.features, 1, dim=1):
                yield FrameFeatures(frame_features.squeeze(1), self.positions)
        else:
            # Image features already describe a single frame.
            yield FrameFeatures(self.features, self.positions)

features: ImageOrVideoFeatures class-attribute

positions: Positions class-attribute

aux_features: Optional[Dict[str, torch.Tensor]] = None class-attribute

__iter__

Iterate over features and positions per frame.

Source code in ocl/typing.py
def __iter__(self) -> Iterable[FrameFeatures]:
    """Iterate over features and positions per frame."""
    if self.features.ndim == 4:
        # Video features: split along the frame dimension.
        for frame_features in torch.split(self.features, 1, dim=1):
            yield FrameFeatures(frame_features.squeeze(1), self.positions)
    else:
        # Image features already describe a single frame.
        yield FrameFeatures(self.features, self.positions)
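
A usage sketch with illustrative sizes: for 4-dimensional video features, iteration yields one FrameFeatures per frame, with the positions tensor shared across frames. The aux_features key is hypothetical:

import torch

from ocl.typing import FeatureExtractorOutput

batch_size, frames, n_spatial_features, feature_dim = 2, 5, 196, 768

output = FeatureExtractorOutput(
    features=torch.randn(batch_size, frames, n_spatial_features, feature_dim),
    positions=torch.rand(n_spatial_features, 2),
    aux_features={"cls_token": torch.randn(batch_size, frames, feature_dim)},
)

for frame in output:
    # Each frame has image-shaped features: (batch_size, n_spatial_features, feature_dim).
    assert frame.features.shape == (batch_size, n_spatial_features, feature_dim)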

PerceptualGroupingOutput dataclass

Output of a perceptual grouping algorithm.

Source code in ocl/typing.py
@dataclasses.dataclass
class PerceptualGroupingOutput:
    """Output of a perceptual grouping algorithm."""

    objects: ObjectFeatures
    is_empty: Optional[EmptyIndicator] = None  # noqa: F821
    feature_attributions: Optional[ObjectFeatureAttributions] = None  # noqa: F821

objects: ObjectFeatures class-attribute

is_empty: Optional[EmptyIndicator] = None class-attribute

feature_attributions: Optional[ObjectFeatureAttributions] = None class-attribute
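
A minimal construction sketch; the sizes are illustrative, and the semantics suggested in the comments for the optional fields (empty-slot mask, attention-style attributions) are assumptions based on the field names, not part of the type definitions:

import torch

from ocl.typing import PerceptualGroupingOutput

batch_size, n_objects, object_dim, n_spatial_features = 2, 7, 64, 196

grouping = PerceptualGroupingOutput(
    objects=torch.randn(batch_size, n_objects, object_dim),
    # Assumed semantics: marks slots that did not bind to any object.
    is_empty=torch.zeros(batch_size, n_objects, dtype=torch.bool),
    # Assumed semantics: per-object weights over the input spatial features.
    feature_attributions=torch.rand(batch_size, n_objects, n_spatial_features),
)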