How do I use regularizers in AllenNLP?

data-mining deep-learning nlp allennlp
2022-02-18 13:09:45

Apologies if this sounds a bit naive.

I am trying to use AllenNLP for my NLP task and would like to use regularization to reduce overfitting. However, in all the online tutorials I have found, the regularizer is set to None, and after many attempts I still cannot figure out how to actually use one.

Taking the example from the official tutorial (https://github.com/titipata/allennlp-tutorial): what if I want to add regularizers to the LSTM and feed-forward layers?

from typing import Dict, Optional

import numpy as np
import torch
import torch.nn.functional as F

from allennlp.data import Vocabulary
from allennlp.models import Model
from allennlp.modules import FeedForward, Seq2VecEncoder, TextFieldEmbedder
from allennlp.nn import InitializerApplicator, RegularizerApplicator
from allennlp.nn.util import get_text_field_mask
from allennlp.training.metrics import CategoricalAccuracy


class AcademicPaperClassifier(Model):
    """
    Model to classify venue based on input title and abstract
    """
    def __init__(self, 
                 vocab: Vocabulary,
                 text_field_embedder: TextFieldEmbedder,
                 title_encoder: Seq2VecEncoder,
                 abstract_encoder: Seq2VecEncoder,
                 classifier_feedforward: FeedForward,
                 initializer: InitializerApplicator = InitializerApplicator(),
                 regularizer: Optional[RegularizerApplicator] = None) -> None:
        super(AcademicPaperClassifier, self).__init__(vocab, regularizer)
        self.text_field_embedder = text_field_embedder
        self.num_classes = self.vocab.get_vocab_size("labels")
        self.title_encoder = title_encoder
        self.abstract_encoder = abstract_encoder
        self.classifier_feedforward = classifier_feedforward
        self.metrics = {
                "accuracy": CategoricalAccuracy(),
                "accuracy3": CategoricalAccuracy(top_k=3)
        }
        self.loss = torch.nn.CrossEntropyLoss()
        initializer(self)

    def forward(self, 
                title: Dict[str, torch.LongTensor],
                abstract: Dict[str, torch.LongTensor],
                label: torch.LongTensor = None) -> Dict[str, torch.Tensor]:

        embedded_title = self.text_field_embedder(title)
        title_mask = get_text_field_mask(title)
        encoded_title = self.title_encoder(embedded_title, title_mask)

        embedded_abstract = self.text_field_embedder(abstract)
        abstract_mask = get_text_field_mask(abstract)
        encoded_abstract = self.abstract_encoder(embedded_abstract, abstract_mask)

        logits = self.classifier_feedforward(torch.cat([encoded_title, encoded_abstract], dim=-1))
        class_probabilities = F.softmax(logits, dim=-1)
        argmax_indices = np.argmax(class_probabilities.cpu().data.numpy(), axis=-1)
        labels = [self.vocab.get_token_from_index(x, namespace="labels") for x in argmax_indices]
        output_dict = {
            'logits': logits, 
            'class_probabilities': class_probabilities,
            'predicted_label': labels
        }
        if label is not None:
            loss = self.loss(logits, label)
            for metric in self.metrics.values():
                metric(logits, label)
            output_dict["loss"] = loss

        return output_dict
1 Answer

The API is a bit confusing. You pass the model a RegularizerApplicator instance, which takes a list of tuples of the form (regex, regularizer). The regex is matched against your model's parameter names. For example, if you have a layer with parameters named linear_relu_stack.0.bias and linear_relu_stack.0.weight, you can apply a single regularizer to both with the regex "^linear_relu_stack.0.(bias|weight)$".
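To write a regex that actually matches, it helps to know how PyTorch reports parameter names: the attribute path to the submodule, plus .weight or .bias. A small self-contained sketch using the example names from above (the list is hardcoded here for illustration; in practice it would come from model.named_parameters()):

```python
import re

# Hypothetical parameter names, as Module.named_parameters() would
# report them for a submodule attribute called "linear_relu_stack".
param_names = [
    "linear_relu_stack.0.weight",
    "linear_relu_stack.0.bias",
    "linear_relu_stack.2.weight",
    "linear_relu_stack.2.bias",
]

# The regex from the answer, matched against each parameter name.
pattern = re.compile(r"^linear_relu_stack.0.(bias|weight)$")
matched = [n for n in param_names if pattern.match(n)]
print(matched)  # ['linear_relu_stack.0.weight', 'linear_relu_stack.0.bias']
```

Only the parameters of layer 0 match; layer 2 is left unregularized.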

The regularizer itself is an instance of L1Regularizer or L2Regularizer, on which you specify the alpha.
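For reference, the penalty each regularizer contributes to the loss is simple: L2 adds alpha times the sum of squared parameter values, while L1 uses absolute values instead. A framework-free sketch of the L2 case, with made-up parameter values:

```python
# Minimal sketch of an L2 penalty: alpha * sum of squared parameter
# values. The parameter values below are made up for illustration.
def l2_penalty(parameters, alpha):
    return alpha * sum(x * x for p in parameters for x in p)

params = [[0.5, -0.5], [1.0]]  # two tiny "parameter tensors"
penalty = l2_penalty(params, alpha=0.01)
print(penalty)  # 0.01 * (0.25 + 0.25 + 1.0) = 0.015
```

During training, this penalty is simply added to the model's loss, nudging weights toward zero.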

Putting it together, your model with regularization added would look like this:

from allennlp.nn.regularizers.regularizer_applicator import RegularizerApplicator
from allennlp.nn.regularizers.regularizers import L2Regularizer

AcademicPaperClassifier(*model_args, regularizer=RegularizerApplicator([
    (".*LSTM.*", L2Regularizer(alpha=0.01)),
    (".*FFN.*",  L2Regularizer(alpha=0.01)),
]))
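One caveat: parameter names are lowercase attribute paths (in this model they would start with title_encoder., abstract_encoder., or classifier_feedforward.), so the ".*LSTM.*" and ".*FFN.*" regexes above are placeholders you would adapt to your model's actual names. A rough, framework-free sketch of what the applicator does internally, with hypothetical names and alphas:

```python
import re

# Rough sketch of RegularizerApplicator's behavior: each parameter name
# is tested against the regexes, and a match contributes that rule's
# penalty to the total. Names and values here are hypothetical.
def apply_regularizers(named_params, rules):
    total = 0.0
    for name, values in named_params:
        for regex, alpha in rules:
            if re.search(regex, name):
                total += alpha * sum(v * v for v in values)  # L2 penalty
                break
    return total

params = [
    ("title_encoder._module.weight_ih_l0", [1.0, -1.0]),
    ("classifier_feedforward._linear_layers.0.weight", [2.0]),
    ("text_field_embedder.token_embedder_tokens.weight", [3.0]),  # no match
]
rules = [(r"encoder", 0.01), (r"feedforward", 0.01)]
print(round(apply_regularizers(params, rules), 6))  # 0.06
```

Printing name for each entry of model.named_parameters() is the quickest way to see what your regexes need to match.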

The API docs are the best resource: http://docs.allennlp.org/v0.9.0/api/allennlp.nn.regularizers.html#allennlp.nn.regularizers.regularizer_applicator.RegularizerApplicator.from_params