unitorch.models.xlm_roberta

XLMRobertaProcessor

Bases: HfTextClassificationProcessor

Processor for the XLM-RoBERTa model on text classification tasks.

Initializes the XLMRobertaProcessor.

Parameters:

vocab_path (str): Path to the vocabulary file. Required.
max_seq_length (Optional[int]): Maximum sequence length. Defaults to 128.
source_type_id (Optional[int]): Source type ID. Defaults to 0.
target_type_id (Optional[int]): Target type ID. Defaults to 0.

Source code in src/unitorch/models/xlm_roberta/processing.py (lines 25-50):
def __init__(
    self,
    vocab_path: str,
    max_seq_length: Optional[int] = 128,
    source_type_id: Optional[int] = 0,
    target_type_id: Optional[int] = 0,
):
    """
    Initializes the XLMRobertaProcessor.

    Args:
        vocab_path (str): Path to the vocabulary file.
        max_seq_length (Optional[int]): Maximum sequence length. Defaults to 128.
        source_type_id (Optional[int]): Source type ID. Defaults to 0.
        target_type_id (Optional[int]): Target type ID. Defaults to 0.
    """
    tokenizer = get_xlm_roberta_tokenizer(
        vocab_path,
    )
    super().__init__(
        tokenizer=tokenizer,
        max_seq_length=max_seq_length,
        source_type_id=source_type_id,
        target_type_id=target_type_id,
        position_start_id=self.pad_token_id + 1,
    )
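
Usage sketch: a minimal construction of the processor. This assumes the class is importable from unitorch.models.xlm_roberta (the module documented on this page) and that a local SentencePiece vocabulary file exists; the path below is hypothetical. Tokenization and encoding helpers come from the HfTextClassificationProcessor base class.

from unitorch.models.xlm_roberta import XLMRobertaProcessor

# Hypothetical path to an XLM-RoBERTa SentencePiece model file.
processor = XLMRobertaProcessor(
    vocab_path="/path/to/sentencepiece.bpe.model",
    max_seq_length=128,
)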

XLMRobertaForClassification

Bases: GenericModel

XLM-RoBERTa model for classification tasks.

Initializes the XLMRobertaForClassification model.

Parameters:

config_path (str): Path to the configuration file. Required.
num_classes (Optional[int]): Number of classes. Defaults to 1.
gradient_checkpointing (Optional[bool]): Whether to use gradient checkpointing. Defaults to False.

Source code in src/unitorch/models/xlm_roberta/modeling.py (lines 25-45):
def __init__(
    self,
    config_path: str,
    num_classes: Optional[int] = 1,
    gradient_checkpointing: Optional[bool] = False,
):
    """
    Initializes the XLMRobertaForClassification model.

    Args:
        config_path (str): Path to the configuration file.
        num_classes (Optional[int]): Number of classes. Defaults to 1.
        gradient_checkpointing (Optional[bool]): Whether to use gradient checkpointing. Defaults to False.
    """
    super().__init__()
    self.config = XLMRobertaConfig.from_json_file(config_path)
    self.config.gradient_checkpointing = gradient_checkpointing
    self.roberta = XLMRobertaModel(self.config)
    self.dropout = nn.Dropout(self.config.hidden_dropout_prob)
    self.classifier = nn.Linear(self.config.hidden_size, num_classes)
    self.init_weights()

forward

forward(
    input_ids: Tensor,
    attention_mask: Optional[Tensor] = None,
    token_type_ids: Optional[Tensor] = None,
    position_ids: Optional[Tensor] = None,
)

Forward pass of the XLMRobertaForClassification model.

Parameters:

input_ids (Tensor): Input tensor of shape [batch_size, sequence_length]. Required.
attention_mask (Optional[Tensor]): Attention mask tensor of shape [batch_size, sequence_length]. Defaults to None.
token_type_ids (Optional[Tensor]): Token type IDs tensor of shape [batch_size, sequence_length]. Defaults to None.
position_ids (Optional[Tensor]): Position IDs tensor of shape [batch_size, sequence_length]. Defaults to None.

Returns:

Tensor: Output logits of shape [batch_size, num_classes].

Source code in src/unitorch/models/xlm_roberta/modeling.py (lines 47-79):
def forward(
    self,
    input_ids: torch.Tensor,
    attention_mask: Optional[torch.Tensor] = None,
    token_type_ids: Optional[torch.Tensor] = None,
    position_ids: Optional[torch.Tensor] = None,
):
    """
    Forward pass of the XLMRobertaForClassification model.

    Args:
        input_ids (torch.Tensor): Input tensor of shape [batch_size, sequence_length].
        attention_mask (Optional[torch.Tensor]): Attention mask tensor of shape [batch_size, sequence_length].
            Defaults to None.
        token_type_ids (Optional[torch.Tensor]): Token type IDs tensor of shape [batch_size, sequence_length].
            Defaults to None.
        position_ids (Optional[torch.Tensor]): Position IDs tensor of shape [batch_size, sequence_length].
            Defaults to None.

    Returns:
        (torch.Tensor): Output logits of shape [batch_size, num_classes].
    """
    outputs = self.roberta(
        input_ids,
        attention_mask=attention_mask,
        token_type_ids=token_type_ids,
        position_ids=position_ids,
    )
    pooled_output = outputs[1]

    pooled_output = self.dropout(pooled_output)
    logits = self.classifier(pooled_output)
    return logits
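
Usage sketch: a minimal forward pass with dummy inputs, assuming the class is importable from unitorch.models.xlm_roberta and that a local XLM-RoBERTa config file exists; the config path below is hypothetical. In practice the input tensors would come from the XLMRobertaProcessor above rather than torch.randint.

import torch

from unitorch.models.xlm_roberta import XLMRobertaForClassification

# Hypothetical path to an XLM-RoBERTa config.json.
model = XLMRobertaForClassification(
    config_path="/path/to/config.json",
    num_classes=2,
)
model.eval()

# Dummy batch of 2 sequences with 16 tokens each.
input_ids = torch.randint(0, model.config.vocab_size, (2, 16))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    logits = model(input_ids, attention_mask=attention_mask)

print(logits.shape)  # torch.Size([2, 2]), i.e. [batch_size, num_classes]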

XLMRobertaForMaskLM

Bases: GenericModel

XLM-RoBERTa model for masked language modeling tasks.

Initializes the XLMRobertaForMaskLM model.

Parameters:

config_path (str): Path to the configuration file. Required.
gradient_checkpointing (Optional[bool]): Whether to use gradient checkpointing. Defaults to False.

Source code in src/unitorch/models/xlm_roberta/modeling.py (lines 87-105):
def __init__(
    self,
    config_path: str,
    gradient_checkpointing: Optional[bool] = False,
):
    """
    Initializes the XLMRobertaForMaskLM model.

    Args:
        config_path (str): Path to the configuration file.
        gradient_checkpointing (Optional[bool]): Whether to use gradient checkpointing. Defaults to False.
    """
    super().__init__()
    self.config = XLMRobertaConfig.from_json_file(config_path)
    self.config.gradient_checkpointing = gradient_checkpointing
    self.roberta = XLMRobertaModel(self.config, add_pooling_layer=False)
    self.lm_head = RobertaLMHead(self.config)
    self.init_weights()
    self.roberta.embeddings.word_embeddings.weight = self.lm_head.decoder.weight

forward

forward(
    input_ids: Tensor,
    attention_mask: Optional[Tensor] = None,
    token_type_ids: Optional[Tensor] = None,
    position_ids: Optional[Tensor] = None,
)

Forward pass of the XLMRobertaForMaskLM model.

Parameters:

input_ids (Tensor): Input tensor of shape [batch_size, sequence_length]. Required.
attention_mask (Optional[Tensor]): Attention mask tensor of shape [batch_size, sequence_length]. Defaults to None.
token_type_ids (Optional[Tensor]): Token type IDs tensor of shape [batch_size, sequence_length]. Defaults to None.
position_ids (Optional[Tensor]): Position IDs tensor of shape [batch_size, sequence_length]. Defaults to None.

Returns:

Tensor: Output logits of shape [batch_size, sequence_length, vocabulary_size].

Source code in src/unitorch/models/xlm_roberta/modeling.py (lines 107-137):
def forward(
    self,
    input_ids: torch.Tensor,
    attention_mask: Optional[torch.Tensor] = None,
    token_type_ids: Optional[torch.Tensor] = None,
    position_ids: Optional[torch.Tensor] = None,
):
    """
    Forward pass of the XLMRobertaForMaskLM model.

    Args:
        input_ids (torch.Tensor): Input tensor of shape [batch_size, sequence_length].
        attention_mask (Optional[torch.Tensor]): Attention mask tensor of shape [batch_size, sequence_length].
            Defaults to None.
        token_type_ids (Optional[torch.Tensor]): Token type IDs tensor of shape [batch_size, sequence_length].
            Defaults to None.
        position_ids (Optional[torch.Tensor]): Position IDs tensor of shape [batch_size, sequence_length].
            Defaults to None.

    Returns:
        (torch.Tensor): Output logits of shape [batch_size, sequence_length, vocabulary_size].
    """
    outputs = self.roberta(
        input_ids,
        attention_mask=attention_mask,
        token_type_ids=token_type_ids,
        position_ids=position_ids,
    )
    sequence_output = outputs[0]
    logits = self.lm_head(sequence_output)
    return logits
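
Usage sketch: per-token vocabulary logits from the masked language modeling head, under the same assumptions as above (importable from unitorch.models.xlm_roberta, hypothetical local config path, dummy inputs standing in for processor output).

import torch

from unitorch.models.xlm_roberta import XLMRobertaForMaskLM

# Hypothetical path to an XLM-RoBERTa config.json.
model = XLMRobertaForMaskLM(config_path="/path/to/config.json")
model.eval()

input_ids = torch.randint(0, model.config.vocab_size, (2, 16))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    logits = model(input_ids, attention_mask=attention_mask)

print(logits.shape)  # torch.Size([2, 16, vocab_size]), i.e. [batch_size, sequence_length, vocabulary_size]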

XLMRobertaXLForClassification

Bases: GenericModel

XLM-RoBERTa XL model for classification tasks.

Initializes the XLMRobertaXLForClassification model.

Parameters:

config_path (str): Path to the configuration file. Required.
num_classes (Optional[int]): Number of classes for classification. Defaults to 1.
gradient_checkpointing (Optional[bool]): Whether to use gradient checkpointing. Defaults to False.

Source code in src/unitorch/models/xlm_roberta/modeling.py (lines 145-165):
def __init__(
    self,
    config_path: str,
    num_classes: Optional[int] = 1,
    gradient_checkpointing: Optional[bool] = False,
):
    """
    Initializes the XLMRobertaXLForClassification model.

    Args:
        config_path (str): Path to the configuration file.
        num_classes (Optional[int]): Number of classes for classification. Defaults to 1.
        gradient_checkpointing (Optional[bool]): Whether to use gradient checkpointing. Defaults to False.
    """
    super().__init__()
    self.config = XLMRobertaXLConfig.from_json_file(config_path)
    self.config.gradient_checkpointing = gradient_checkpointing
    self.roberta = XLMRobertaXLModel(self.config)
    self.dropout = nn.Dropout(self.config.hidden_dropout_prob)
    self.classifier = nn.Linear(self.config.hidden_size, num_classes)
    self.init_weights()

forward

forward(
    input_ids: Tensor,
    attention_mask: Optional[Tensor] = None,
    token_type_ids: Optional[Tensor] = None,
    position_ids: Optional[Tensor] = None,
)

Forward pass of the XLMRobertaXLForClassification model.

Parameters:

input_ids (Tensor): Input tensor of shape [batch_size, sequence_length]. Required.
attention_mask (Optional[Tensor]): Attention mask tensor of shape [batch_size, sequence_length]. Defaults to None.
token_type_ids (Optional[Tensor]): Token type IDs tensor of shape [batch_size, sequence_length]. Defaults to None.
position_ids (Optional[Tensor]): Position IDs tensor of shape [batch_size, sequence_length]. Defaults to None.

Returns:

Tensor: Output logits of shape [batch_size, num_classes].

Source code in src/unitorch/models/xlm_roberta/modeling.py (lines 167-199):
def forward(
    self,
    input_ids: torch.Tensor,
    attention_mask: Optional[torch.Tensor] = None,
    token_type_ids: Optional[torch.Tensor] = None,
    position_ids: Optional[torch.Tensor] = None,
):
    """
    Forward pass of the XLMRobertaXLForClassification model.

    Args:
        input_ids (torch.Tensor): Input tensor of shape [batch_size, sequence_length].
        attention_mask (Optional[torch.Tensor]): Attention mask tensor of shape [batch_size, sequence_length].
            Defaults to None.
        token_type_ids (Optional[torch.Tensor]): Token type IDs tensor of shape [batch_size, sequence_length].
            Defaults to None.
        position_ids (Optional[torch.Tensor]): Position IDs tensor of shape [batch_size, sequence_length].
            Defaults to None.

    Returns:
        (torch.Tensor): Output logits of shape [batch_size, num_classes].
    """
    outputs = self.roberta(
        input_ids,
        attention_mask=attention_mask,
        token_type_ids=token_type_ids,
        position_ids=position_ids,
    )
    pooled_output = outputs[1]

    pooled_output = self.dropout(pooled_output)
    logits = self.classifier(pooled_output)
    return logits
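
Usage sketch: the XL variant mirrors XLMRobertaForClassification but builds on XLMRobertaXLConfig and XLMRobertaXLModel, so it needs an XL-sized configuration file. Same assumptions as the earlier sketches (importable from unitorch.models.xlm_roberta, hypothetical config path, dummy inputs).

import torch

from unitorch.models.xlm_roberta import XLMRobertaXLForClassification

# Hypothetical path to an XLM-RoBERTa XL config.json.
model = XLMRobertaXLForClassification(
    config_path="/path/to/xl/config.json",
    num_classes=3,
)
model.eval()

input_ids = torch.randint(0, model.config.vocab_size, (2, 16))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    logits = model(input_ids, attention_mask=attention_mask)

print(logits.shape)  # torch.Size([2, 3]), i.e. [batch_size, num_classes]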