
Configuration

The unitorch command-line workflow for model training and inference uses a unified configuration system. In this example, we'll explore the configuration of the BartForGeneration class in unitorch.

Here is a training command example for BART generation with local files.

unitorch-train \
    configs/generation/bart.ini \
    --train_file train.tsv \
    --dev_file dev.tsv \
    --core/model/generation/bart@num_beams 20 \
    --core/model/generation/bart@no_repeat_ngram_size 0

This is the training command for BART generation. In this command, we provide the path to the configuration file configs/generation/bart.ini and specify additional parameters using the -- syntax. Parameters provided after --core/model/generation/bart@ override the corresponding values in the configuration file; here we override num_beams to 20 and no_repeat_ngram_size to 0. The configuration file uses the INI format and looks as follows.

# model
[core/model/generation/bart]
pretrained_name = bart-base
no_repeat_ngram_size = 3
max_gen_seq_length = 15

# dataset
[core/dataset/ast]
names = ['encode', 'decode']

# ...

In this configuration, we specify the following parameters:

  • pretrained_name: The name of the pretrained model. In this example, it is set to bart-base.
  • no_repeat_ngram_size: The size of n-grams to avoid repeating in the generated sequences. It is set to 3 in this example, but because we override this parameter on the command line, its final value is 0.
  • max_gen_seq_length: The maximum length of the generated sequences. It is set to 15 in this example.
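The override behaviour described above can be sketched with Python's standard configparser module. This is only an illustration of the `--section@key value` convention, not unitorch's actual configuration class:

```python
import configparser

# INI content mirroring the bart.ini fragment above.
ini_text = """
[core/model/generation/bart]
pretrained_name = bart-base
no_repeat_ngram_size = 3
max_gen_seq_length = 15
"""

def apply_overrides(config, cli_args):
    """Apply (flag, value) pairs like ('--core/model/generation/bart@num_beams', '20').

    The flag is split into a section (before '@') and a key (after '@'),
    and the value replaces or adds the corresponding entry.
    """
    for flag, value in cli_args:
        section, key = flag.lstrip("-").split("@", 1)
        if not config.has_section(section):
            config.add_section(section)
        config.set(section, key, value)

config = configparser.ConfigParser()
config.read_string(ini_text)
apply_overrides(config, [
    ("--core/model/generation/bart@num_beams", "20"),
    ("--core/model/generation/bart@no_repeat_ngram_size", "0"),
])

section = config["core/model/generation/bart"]
print(section["num_beams"])             # added from the CLI override
print(section["no_repeat_ngram_size"])  # file value 3 replaced by 0
```

After the overrides are applied, num_beams is 20 and no_repeat_ngram_size is 0, matching the final values used by the model.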

Now let's look at the BartForGeneration model class code.

class BartForGeneration(_BartForGeneration):
    def __init__(
        self,
        config_path: str,
        gradient_checkpointing: Optional[bool] = False,
    ):
        pass

    @classmethod
    @add_default_section_for_init("core/model/generation/bart")
    def from_core_configure(cls, config, **kwargs):
        config.set_default_section("core/model/generation/bart")
        pretrained_name = config.getoption("pretrained_name", "bart-base")
        config_path = config.getoption("config_path", None)
        config_path = pop_value(
            config_path,
            nested_dict_value(pretrained_bart_infos, pretrained_name, "config"),
        )

        config_path = cached_path(config_path)
        gradient_checkpointing = config.getoption("gradient_checkpointing", False)

        inst = cls(config_path, gradient_checkpointing)
        pretrained_weight_path = config.getoption("pretrained_weight_path", None)
        weight_path = pop_value(
            pretrained_weight_path,
            nested_dict_value(pretrained_bart_infos, pretrained_name, "weight"),
            check_none=False,
        )
        if weight_path is not None:
            inst.from_pretrained(weight_path)

        return inst

    @add_default_section_for_function("core/model/generation/bart")
    def generate(
        self,
        input_ids: torch.Tensor,
        num_beams: Optional[int] = 5,
        decoder_start_token_id: Optional[int] = 2,
        decoder_end_token_id: Optional[int] = 2,
        num_return_sequences: Optional[int] = 1,
        min_gen_seq_length: Optional[int] = 0,
        max_gen_seq_length: Optional[int] = 48,
        repetition_penalty: Optional[float] = 1.0,
        no_repeat_ngram_size: Optional[int] = 0,
        early_stopping: Optional[bool] = True,
        length_penalty: Optional[float] = 1.0,
        num_beam_groups: Optional[int] = 1,
        diversity_penalty: Optional[float] = 0.0,
        do_sample: Optional[bool] = False,
        temperature: Optional[float] = 1.0,
        top_k: Optional[int] = 50,
        top_p: Optional[float] = 1.0,
    ):
        pass
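The from_core_configure method above relies on two small helpers, pop_value and nested_dict_value. Their exact unitorch implementations are not shown here, but their behaviour can be inferred from how they are called; the following is a minimal sketch under that assumption, with a hypothetical registry entry and URL for illustration:

```python
from typing import Any, Optional

def nested_dict_value(d: dict, *keys) -> Optional[Any]:
    """Walk a nested dict by successive keys; return None if any key is missing."""
    for key in keys:
        if not isinstance(d, dict) or key not in d:
            return None
        d = d[key]
    return d

def pop_value(*values, check_none: bool = True) -> Optional[Any]:
    """Return the first non-None candidate; optionally raise if all are None."""
    for value in values:
        if value is not None:
            return value
    if check_none:
        raise ValueError("all candidate values are None")
    return None

# Hypothetical registry entry mirroring the config-path lookup in
# from_core_configure (the real pretrained_bart_infos lives in unitorch).
pretrained_bart_infos = {
    "bart-base": {"config": "https://example.com/bart-base/config.json"},
}

# config_path was not set in the configuration, so the registry value wins.
config_path = pop_value(
    None,
    nested_dict_value(pretrained_bart_infos, "bart-base", "config"),
)
```

With check_none=False, as used for the weight path, pop_value simply returns None when no candidate is set, which lets from_core_configure skip loading weights.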

The from_core_configure method is a class method that creates an instance of BartForGeneration from a provided configuration object (config). It retrieves the relevant options from the configuration, initializes the instance with the resolved values, and loads pretrained weights when a pretrained_weight_path is available. The generate method carries the add_default_section_for_function decorator, which overrides its parameter defaults with values from the core/model/generation/bart section of the configuration object. In this example, num_beams is therefore set to 20 and no_repeat_ngram_size to 0 (the file value of 3 was overridden on the command line).
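One way such a decorator could work is sketched below. This is a hypothetical simplification, assuming the model keeps a reference to its configuration object; the real unitorch decorator is more involved:

```python
import functools
import inspect

def add_default_section_for_function(section):
    """Replace a method's keyword defaults with values from a config section.

    Any parameter the caller does not supply explicitly is looked up in the
    given section of self.config before falling back to the signature default.
    """
    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            self.config.set_default_section(section)
            bound = sig.bind_partial(self, *args, **kwargs)
            for name, param in sig.parameters.items():
                if name == "self" or name in bound.arguments:
                    continue
                if param.default is inspect.Parameter.empty:
                    continue
                value = self.config.getoption(name, None)
                if value is not None:
                    kwargs[name] = value
            return func(self, *args, **kwargs)
        return wrapper
    return decorator

# Dummy stand-ins for illustration only.
class DummyConfig:
    def __init__(self, data):
        self.data, self.section = data, None
    def set_default_section(self, section):
        self.section = section
    def getoption(self, key, default=None):
        return self.data.get(self.section, {}).get(key, default)

class Model:
    def __init__(self, config):
        self.config = config

    @add_default_section_for_function("core/model/generation/bart")
    def generate(self, input_ids, num_beams=5, no_repeat_ngram_size=3):
        return {"num_beams": num_beams, "no_repeat_ngram_size": no_repeat_ngram_size}

model = Model(DummyConfig({
    "core/model/generation/bart": {"num_beams": 20, "no_repeat_ngram_size": 0},
}))
result = model.generate("input ids")
```

Calling model.generate without explicit keyword arguments picks up num_beams=20 and no_repeat_ngram_size=0 from the configuration, matching the behaviour described above.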