unitorch.models.xlm_roberta

XLMRobertaProcessor

Bases: HfTextClassificationProcessor

Processor for the XLM-RoBERTa model on text classification tasks.

Initializes the XLMRobertaProcessor.

Parameters:

vocab_path (str): Path to the vocabulary file. Required.
max_seq_length (Optional[int]): Maximum sequence length. Defaults to 128.
source_type_id (Optional[int]): Source type ID. Defaults to 0.
target_type_id (Optional[int]): Target type ID. Defaults to 0.

Source code in src/unitorch/models/xlm_roberta/processing.py (lines 25-50):
def __init__(
    self,
    vocab_path: str,
    max_seq_length: Optional[int] = 128,
    source_type_id: Optional[int] = 0,
    target_type_id: Optional[int] = 0,
):
    """
    Initializes the XLMRobertaProcessor.

    Args:
        vocab_path (str): Path to the vocabulary file.
        max_seq_length (Optional[int]): Maximum sequence length. Defaults to 128.
        source_type_id (Optional[int]): Source type ID. Defaults to 0.
        target_type_id (Optional[int]): Target type ID. Defaults to 0.
    """
    tokenizer = get_xlm_roberta_tokenizer(
        vocab_path,
    )
    super().__init__(
        tokenizer=tokenizer,
        max_seq_length=max_seq_length,
        source_type_id=source_type_id,
        target_type_id=target_type_id,
        position_start_id=self.pad_token_id + 1,
    )
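
Usage sketch: a minimal construction of the processor. This assumes the class is importable from unitorch.models.xlm_roberta (the module documented on this page) and that a local SentencePiece vocabulary file exists; the path below is hypothetical. Tokenization and encoding helpers come from the HfTextClassificationProcessor base class.

from unitorch.models.xlm_roberta import XLMRobertaProcessor

# Hypothetical path to an XLM-RoBERTa SentencePiece model file.
processor = XLMRobertaProcessor(
    vocab_path="/path/to/sentencepiece.bpe.model",
    max_seq_length=128,
)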

XLMRobertaForClassification

Bases: GenericModel

XLM-RoBERTa model for classification tasks.

Initializes the XLMRobertaForClassification model.

Parameters:

config_path (str): Path to the configuration file. Required.
num_classes (Optional[int]): Number of classes. Defaults to 1.
gradient_checkpointing (Optional[bool]): Whether to use gradient checkpointing. Defaults to False.

Source code in src/unitorch/models/xlm_roberta/modeling.py (lines 25-45):
def __init__(
    self,
    config_path: str,
    num_classes: Optional[int] = 1,
    gradient_checkpointing: Optional[bool] = False,
):
    """
    Initializes the XLMRobertaForClassification model.

    Args:
        config_path (str): Path to the configuration file.
        num_classes (Optional[int]): Number of classes. Defaults to 1.
        gradient_checkpointing (Optional[bool]): Whether to use gradient checkpointing. Defaults to False.
    """
    super().__init__()
    self.config = XLMRobertaConfig.from_json_file(config_path)
    self.config.gradient_checkpointing = gradient_checkpointing
    self.roberta = XLMRobertaModel(self.config)
    self.dropout = nn.Dropout(self.config.hidden_dropout_prob)
    self.classifier = nn.Linear(self.config.hidden_size, num_classes)
    self.init_weights()

forward

forward(
    input_ids: Tensor,
    attention_mask: Optional[Tensor] = None,
    token_type_ids: Optional[Tensor] = None,
    position_ids: Optional[Tensor] = None,
)

Forward pass of the XLMRobertaForClassification model.

Parameters:

input_ids (Tensor): Input tensor of shape [batch_size, sequence_length]. Required.
attention_mask (Optional[Tensor]): Attention mask tensor of shape [batch_size, sequence_length]. Defaults to None.
token_type_ids (Optional[Tensor]): Token type IDs tensor of shape [batch_size, sequence_length]. Defaults to None.
position_ids (Optional[Tensor]): Position IDs tensor of shape [batch_size, sequence_length]. Defaults to None.

Returns:

Tensor: Output logits of shape [batch_size, num_classes].

Source code in src/unitorch/models/xlm_roberta/modeling.py (lines 47-79):
def forward(
    self,
    input_ids: torch.Tensor,
    attention_mask: Optional[torch.Tensor] = None,
    token_type_ids: Optional[torch.Tensor] = None,
    position_ids: Optional[torch.Tensor] = None,
):
    """
    Forward pass of the XLMRobertaForClassification model.

    Args:
        input_ids (torch.Tensor): Input tensor of shape [batch_size, sequence_length].
        attention_mask (Optional[torch.Tensor]): Attention mask tensor of shape [batch_size, sequence_length].
            Defaults to None.
        token_type_ids (Optional[torch.Tensor]): Token type IDs tensor of shape [batch_size, sequence_length].
            Defaults to None.
        position_ids (Optional[torch.Tensor]): Position IDs tensor of shape [batch_size, sequence_length].
            Defaults to None.

    Returns:
        (torch.Tensor): Output logits of shape [batch_size, num_classes].
    """
    outputs = self.roberta(
        input_ids,
        attention_mask=attention_mask,
        token_type_ids=token_type_ids,
        position_ids=position_ids,
    )
    pooled_output = outputs[1]

    pooled_output = self.dropout(pooled_output)
    logits = self.classifier(pooled_output)
    return logits
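
Usage sketch: a minimal forward pass with dummy inputs, assuming the class is importable from unitorch.models.xlm_roberta and that a local XLM-RoBERTa config file exists; the config path below is hypothetical. In practice the input tensors would come from the XLMRobertaProcessor above rather than torch.randint.

import torch

from unitorch.models.xlm_roberta import XLMRobertaForClassification

# Hypothetical path to an XLM-RoBERTa config.json.
model = XLMRobertaForClassification(
    config_path="/path/to/config.json",
    num_classes=2,
)
model.eval()

# Dummy batch of 2 sequences with 16 tokens each.
input_ids = torch.randint(0, model.config.vocab_size, (2, 16))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    logits = model(input_ids, attention_mask=attention_mask)

print(logits.shape)  # torch.Size([2, 2]), i.e. [batch_size, num_classes]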

XLMRobertaForMaskLM

Bases: GenericModel

XLM-RoBERTa model for masked language modeling tasks.

Initializes the XLMRobertaForMaskLM model.

Parameters:

config_path (str): Path to the configuration file. Required.
gradient_checkpointing (Optional[bool]): Whether to use gradient checkpointing. Defaults to False.

Source code in src/unitorch/models/xlm_roberta/modeling.py (lines 87-105):
def __init__(
    self,
    config_path: str,
    gradient_checkpointing: Optional[bool] = False,
):
    """
    Initializes the XLMRobertaForMaskLM model.

    Args:
        config_path (str): Path to the configuration file.
        gradient_checkpointing (Optional[bool]): Whether to use gradient checkpointing. Defaults to False.
    """
    super().__init__()
    self.config = XLMRobertaConfig.from_json_file(config_path)
    self.config.gradient_checkpointing = gradient_checkpointing
    self.roberta = XLMRobertaModel(self.config, add_pooling_layer=False)
    self.lm_head = RobertaLMHead(self.config)
    self.init_weights()
    self.roberta.embeddings.word_embeddings.weight = self.lm_head.decoder.weight

forward

forward(
    input_ids: Tensor,
    attention_mask: Optional[Tensor] = None,
    token_type_ids: Optional[Tensor] = None,
    position_ids: Optional[Tensor] = None,
)

Forward pass of the XLMRobertaForMaskLM model.

Parameters:

input_ids (Tensor): Input tensor of shape [batch_size, sequence_length]. Required.
attention_mask (Optional[Tensor]): Attention mask tensor of shape [batch_size, sequence_length]. Defaults to None.
token_type_ids (Optional[Tensor]): Token type IDs tensor of shape [batch_size, sequence_length]. Defaults to None.
position_ids (Optional[Tensor]): Position IDs tensor of shape [batch_size, sequence_length]. Defaults to None.

Returns:

Tensor: Output logits of shape [batch_size, sequence_length, vocabulary_size].

Source code in src/unitorch/models/xlm_roberta/modeling.py (lines 107-137):
def forward(
    self,
    input_ids: torch.Tensor,
    attention_mask: Optional[torch.Tensor] = None,
    token_type_ids: Optional[torch.Tensor] = None,
    position_ids: Optional[torch.Tensor] = None,
):
    """
    Forward pass of the XLMRobertaForMaskLM model.

    Args:
        input_ids (torch.Tensor): Input tensor of shape [batch_size, sequence_length].
        attention_mask (Optional[torch.Tensor]): Attention mask tensor of shape [batch_size, sequence_length].
            Defaults to None.
        token_type_ids (Optional[torch.Tensor]): Token type IDs tensor of shape [batch_size, sequence_length].
            Defaults to None.
        position_ids (Optional[torch.Tensor]): Position IDs tensor of shape [batch_size, sequence_length].
            Defaults to None.

    Returns:
        (torch.Tensor): Output logits of shape [batch_size, sequence_length, vocabulary_size].
    """
    outputs = self.roberta(
        input_ids,
        attention_mask=attention_mask,
        token_type_ids=token_type_ids,
        position_ids=position_ids,
    )
    sequence_output = outputs[0]
    logits = self.lm_head(sequence_output)
    return logits
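
Usage sketch: per-token vocabulary logits from the masked language modeling head, under the same assumptions as above (importable from unitorch.models.xlm_roberta, hypothetical local config path, dummy inputs standing in for processor output).

import torch

from unitorch.models.xlm_roberta import XLMRobertaForMaskLM

# Hypothetical path to an XLM-RoBERTa config.json.
model = XLMRobertaForMaskLM(config_path="/path/to/config.json")
model.eval()

input_ids = torch.randint(0, model.config.vocab_size, (2, 16))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    logits = model(input_ids, attention_mask=attention_mask)

print(logits.shape)  # torch.Size([2, 16, vocab_size]), i.e. [batch_size, sequence_length, vocabulary_size]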

XLMRobertaXLForClassification

Bases: GenericModel

XLM-RoBERTa XL model for classification tasks.

Initializes the XLMRobertaXLForClassification model.

Parameters:

config_path (str): Path to the configuration file. Required.
num_classes (Optional[int]): Number of classes for classification. Defaults to 1.
gradient_checkpointing (Optional[bool]): Whether to use gradient checkpointing. Defaults to False.

Source code in src/unitorch/models/xlm_roberta/modeling.py (lines 145-165):
def __init__(
    self,
    config_path: str,
    num_classes: Optional[int] = 1,
    gradient_checkpointing: Optional[bool] = False,
):
    """
    Initializes the XLMRobertaXLForClassification model.

    Args:
        config_path (str): Path to the configuration file.
        num_classes (Optional[int]): Number of classes for classification. Defaults to 1.
        gradient_checkpointing (Optional[bool]): Whether to use gradient checkpointing. Defaults to False.
    """
    super().__init__()
    self.config = XLMRobertaXLConfig.from_json_file(config_path)
    self.config.gradient_checkpointing = gradient_checkpointing
    self.roberta = XLMRobertaXLModel(self.config)
    self.dropout = nn.Dropout(self.config.hidden_dropout_prob)
    self.classifier = nn.Linear(self.config.hidden_size, num_classes)
    self.init_weights()

forward

forward(
    input_ids: Tensor,
    attention_mask: Optional[Tensor] = None,
    token_type_ids: Optional[Tensor] = None,
    position_ids: Optional[Tensor] = None,
)

Forward pass of the XLMRobertaXLForClassification model.

Parameters:

input_ids (Tensor): Input tensor of shape [batch_size, sequence_length]. Required.
attention_mask (Optional[Tensor]): Attention mask tensor of shape [batch_size, sequence_length]. Defaults to None.
token_type_ids (Optional[Tensor]): Token type IDs tensor of shape [batch_size, sequence_length]. Defaults to None.
position_ids (Optional[Tensor]): Position IDs tensor of shape [batch_size, sequence_length]. Defaults to None.

Returns:

Tensor: Output logits of shape [batch_size, num_classes].

Source code in src/unitorch/models/xlm_roberta/modeling.py (lines 167-199):
def forward(
    self,
    input_ids: torch.Tensor,
    attention_mask: Optional[torch.Tensor] = None,
    token_type_ids: Optional[torch.Tensor] = None,
    position_ids: Optional[torch.Tensor] = None,
):
    """
    Forward pass of the XLMRobertaXLForClassification model.

    Args:
        input_ids (torch.Tensor): Input tensor of shape [batch_size, sequence_length].
        attention_mask (Optional[torch.Tensor]): Attention mask tensor of shape [batch_size, sequence_length].
            Defaults to None.
        token_type_ids (Optional[torch.Tensor]): Token type IDs tensor of shape [batch_size, sequence_length].
            Defaults to None.
        position_ids (Optional[torch.Tensor]): Position IDs tensor of shape [batch_size, sequence_length].
            Defaults to None.

    Returns:
        (torch.Tensor): Output logits of shape [batch_size, num_classes].
    """
    outputs = self.roberta(
        input_ids,
        attention_mask=attention_mask,
        token_type_ids=token_type_ids,
        position_ids=position_ids,
    )
    pooled_output = outputs[1]

    pooled_output = self.dropout(pooled_output)
    logits = self.classifier(pooled_output)
    return logits
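
Usage sketch: the XL variant mirrors XLMRobertaForClassification but builds on XLMRobertaXLConfig and XLMRobertaXLModel, so it needs an XL-sized configuration file. Same assumptions as the earlier sketches (importable from unitorch.models.xlm_roberta, hypothetical config path, dummy inputs).

import torch

from unitorch.models.xlm_roberta import XLMRobertaXLForClassification

# Hypothetical path to an XLM-RoBERTa XL config.json.
model = XLMRobertaXLForClassification(
    config_path="/path/to/xl/config.json",
    num_classes=3,
)
model.eval()

input_ids = torch.randint(0, model.config.vocab_size, (2, 16))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    logits = model(input_ids, attention_mask=attention_mask)

print(logits.shape)  # torch.Size([2, 3]), i.e. [batch_size, num_classes]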