
unitorch.models.dinov2

DinoV2Processor

Bases: HfImageClassificationProcessor

Processor for DINOv2-based image classification models.

Initializes the DinoV2Processor by loading a BitImageProcessor from the JSON file at vision_config_path.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| vision_config_path | str | Path to the vision processor configuration file. | required |

Source code in src/unitorch/models/dinov2/processing.py
def __init__(
    self,
    vision_config_path: str,
):
    """
    Initializes the DinoV2Processor.

    Args:
        vision_config_path (str): Path to the vision processor configuration file.
    """
    vision_processor = BitImageProcessor.from_json_file(vision_config_path)
    super().__init__(vision_processor=vision_processor)
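
Because the constructor reads its settings from a JSON file, it can help to see what such a file might contain. The sketch below writes one with the standard library. The keys are standard BitImageProcessor options, but the specific values (256-pixel resize, 224-pixel crop, ImageNet mean and standard deviation) are illustrative assumptions, not values taken from this page. The final commented line assumes unitorch is installed and that DinoV2Processor is importable from unitorch.models.dinov2.

```python
import json
import os
import tempfile

# Illustrative preprocessing settings (assumed values, not from this page).
vision_config = {
    "image_processor_type": "BitImageProcessor",
    "do_resize": True,
    "size": {"shortest_edge": 256},
    "resample": 3,  # PIL bicubic
    "do_center_crop": True,
    "crop_size": {"height": 224, "width": 224},
    "do_rescale": True,
    "rescale_factor": 1 / 255,
    "do_normalize": True,
    "image_mean": [0.485, 0.456, 0.406],
    "image_std": [0.229, 0.224, 0.225],
    "do_convert_rgb": True,
}

# Write the configuration to a temporary directory.
config_dir = tempfile.mkdtemp()
vision_config_path = os.path.join(config_dir, "preprocessor_config.json")
with open(vision_config_path, "w") as f:
    json.dump(vision_config, f, indent=2)

# With unitorch installed, the path could then be passed to the processor:
# from unitorch.models.dinov2 import DinoV2Processor
# processor = DinoV2Processor(vision_config_path)
```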

DinoV2ForImageClassification

Bases: GenericModel

DINOv2 model for image classification tasks.

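Initializes the DinoV2ForImageClassification model: builds a Dinov2Model from the configuration file and a linear classifier over the concatenated CLS and mean patch features (hence the input width of hidden_size * 2).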

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| config_path | str | Path to the DINOv2 configuration file. | required |
| num_classes | int | Number of output classes. | 1 |

Source code in src/unitorch/models/dinov2/modeling.py
def __init__(
    self,
    config_path: str,
    num_classes: Optional[int] = 1,
):
    """
    Initializes the DinoV2ForImageClassification model.

    Args:
        config_path (str): Path to the DINOv2 configuration file.
        num_classes (int, optional): Number of output classes. Defaults to 1.
    """
    super().__init__()
    config = Dinov2Config.from_json_file(config_path)
    self.dinov2 = Dinov2Model(config)
    self.classifier = nn.Linear(config.hidden_size * 2, num_classes)
    self.init_weights()

prefix_keys_in_state_dict (class attribute, instance attribute)

prefix_keys_in_state_dict = {
    "^embeddings.": "dinov2.",
    "^layernorm.": "dinov2.",
    "^encoder.": "dinov2.",
}
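
The mapping above pairs a regular expression with a prefix. A plausible reading is that any checkpoint key matching a pattern gets the prefix prepended, so that weights saved from a bare backbone line up with the `dinov2` submodule. The sketch below illustrates that reading only. The helper `add_prefixes` and the sample keys are hypothetical, and how unitorch actually applies the mapping when loading a checkpoint is an assumption here.

```python
import re

prefix_keys_in_state_dict = {
    "^embeddings.": "dinov2.",
    "^layernorm.": "dinov2.",
    "^encoder.": "dinov2.",
}


def add_prefixes(keys, prefix_map):
    """Prepend the mapped prefix to every key matching one of the patterns.

    Hypothetical helper: it mirrors one plausible use of the mapping,
    not necessarily what unitorch does internally.
    """
    renamed = []
    for key in keys:
        for pattern, prefix in prefix_map.items():
            if re.match(pattern, key):
                key = prefix + key
                break  # apply at most one prefix per key
        renamed.append(key)
    return renamed


# Sample key names, chosen for illustration.
checkpoint_keys = [
    "embeddings.cls_token",
    "encoder.layer.0.attention.attention.query.weight",
    "layernorm.weight",
    "classifier.weight",
]
renamed_keys = add_prefixes(checkpoint_keys, prefix_keys_in_state_dict)
for old, new in zip(checkpoint_keys, renamed_keys):
    print(f"{old} -> {new}")
```

Under this reading, keys that match none of the patterns (such as `classifier.weight`) pass through unchanged.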

dinov2 (instance attribute)

dinov2 = Dinov2Model(config)

classifier (instance attribute)

classifier = Linear(hidden_size * 2, num_classes)

forward

forward(pixel_values: Tensor)

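Forward pass of the DinoV2ForImageClassification model. The CLS token embedding is concatenated with the mean of the patch token embeddings, and the result is passed to the linear classifier.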

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| pixel_values | Tensor | Input image tensor of shape [batch_size, channels, height, width]. | required |

Returns:

| Type | Description |
|---|---|
| torch.Tensor | Classification logits of shape [batch_size, num_classes]. |

Source code in src/unitorch/models/dinov2/modeling.py
def forward(
    self,
    pixel_values: torch.Tensor,
):
    """
    Forward pass of the DinoV2ForImageClassification model.

    Args:
        pixel_values (torch.Tensor): Input image tensor of shape [batch_size, channels, height, width].

    Returns:
        torch.Tensor: Classification logits of shape [batch_size, num_classes].
    """
    vision_outputs = self.dinov2(pixel_values=pixel_values)[0]
    cls_output = vision_outputs[:, 0]
    patch_output = vision_outputs[:, 1:]
    pooled_output = torch.cat([cls_output, patch_output.mean(dim=1)], dim=-1)
    return self.classifier(pooled_output)
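
The pooling step in `forward` explains why the classifier takes `hidden_size * 2` inputs. The following minimal sketch traces the tensor shapes, using a random tensor as a stand-in for the backbone output. The sizes (2 images, 16 patches, hidden size 8, 3 classes) are arbitrary assumptions chosen for readability.

```python
import torch

# Arbitrary sizes for illustration only.
batch_size, num_patches, hidden_size, num_classes = 2, 16, 8, 3

# Stand-in for the backbone's last hidden state:
# one CLS token followed by the patch tokens.
hidden_states = torch.randn(batch_size, 1 + num_patches, hidden_size)

cls_output = hidden_states[:, 0]               # [batch_size, hidden_size]
patch_mean = hidden_states[:, 1:].mean(dim=1)  # [batch_size, hidden_size]

# Concatenating along the feature axis doubles the feature width.
pooled_output = torch.cat([cls_output, patch_mean], dim=-1)

classifier = torch.nn.Linear(hidden_size * 2, num_classes)
logits = classifier(pooled_output)

print(pooled_output.shape)  # [batch_size, 2 * hidden_size]
print(logits.shape)         # [batch_size, num_classes]
```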