How do I use regularizers in AllenNLP?

data-mining deep-learning nlp allennlp
2022-02-18 13:09:45

Apologies if this sounds a bit naive.

I am trying to use AllenNLP for my NLP task and would like to use regularization to reduce overfitting. However, in all the online tutorials I have found, the regularizer is set to None, and after many attempts I still cannot figure out how to actually use one.

Taking the example from the official tutorial (https://github.com/titipata/allennlp-tutorial): what if I want to add regularizers to the LSTM and feed-forward layers?

from typing import Dict, Optional

import numpy as np
import torch
import torch.nn.functional as F

from allennlp.data import Vocabulary
from allennlp.models import Model
from allennlp.modules import FeedForward, Seq2VecEncoder, TextFieldEmbedder
from allennlp.nn import InitializerApplicator, RegularizerApplicator
from allennlp.nn.util import get_text_field_mask
from allennlp.training.metrics import CategoricalAccuracy


class AcademicPaperClassifier(Model):
    """
    Model to classify venue based on input title and abstract
    """
    def __init__(self, 
                 vocab: Vocabulary,
                 text_field_embedder: TextFieldEmbedder,
                 title_encoder: Seq2VecEncoder,
                 abstract_encoder: Seq2VecEncoder,
                 classifier_feedforward: FeedForward,
                 initializer: InitializerApplicator = InitializerApplicator(),
                 regularizer: Optional[RegularizerApplicator] = None) -> None:
        super(AcademicPaperClassifier, self).__init__(vocab, regularizer)
        self.text_field_embedder = text_field_embedder
        self.num_classes = self.vocab.get_vocab_size("labels")
        self.title_encoder = title_encoder
        self.abstract_encoder = abstract_encoder
        self.classifier_feedforward = classifier_feedforward
        self.metrics = {
                "accuracy": CategoricalAccuracy(),
                "accuracy3": CategoricalAccuracy(top_k=3)
        }
        self.loss = torch.nn.CrossEntropyLoss()
        initializer(self)

    def forward(self, 
                title: Dict[str, torch.LongTensor],
                abstract: Dict[str, torch.LongTensor],
                label: torch.LongTensor = None) -> Dict[str, torch.Tensor]:

        embedded_title = self.text_field_embedder(title)
        title_mask = get_text_field_mask(title)
        encoded_title = self.title_encoder(embedded_title, title_mask)

        embedded_abstract = self.text_field_embedder(abstract)
        abstract_mask = get_text_field_mask(abstract)
        encoded_abstract = self.abstract_encoder(embedded_abstract, abstract_mask)

        logits = self.classifier_feedforward(torch.cat([encoded_title, encoded_abstract], dim=-1))
        class_probabilities = F.softmax(logits, dim=-1)
        argmax_indices = np.argmax(class_probabilities.cpu().data.numpy(), axis=-1)
        labels = [self.vocab.get_token_from_index(x, namespace="labels") for x in argmax_indices]
        output_dict = {
            'logits': logits, 
            'class_probabilities': class_probabilities,
            'predicted_label': labels
        }
        if label is not None:
            loss = self.loss(logits, label)
            for metric in self.metrics.values():
                metric(logits, label)
            output_dict["loss"] = loss

        return output_dict
1 Answer

The API is a bit confusing. You pass the model a RegularizerApplicator instance, which takes a list of tuples of the form (regex, regularizer). The regex is matched against your model's parameter names. For example, if you have a layer with parameters named linear_relu_stack.0.bias and linear_relu_stack.0.weight, you can apply a single regularizer to both with the regex "^linear_relu_stack.0.(bias|weight)$".
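To write a regex that actually matches, it helps to know how PyTorch reports parameter names: the attribute path to the submodule, plus .weight or .bias. A small self-contained sketch using the example names from above (the list is hardcoded here for illustration; in practice it would come from model.named_parameters()):

```python
import re

# Hypothetical parameter names, as Module.named_parameters() would
# report them for a submodule attribute called "linear_relu_stack".
param_names = [
    "linear_relu_stack.0.weight",
    "linear_relu_stack.0.bias",
    "linear_relu_stack.2.weight",
    "linear_relu_stack.2.bias",
]

# The regex from the answer, matched against each parameter name.
pattern = re.compile(r"^linear_relu_stack.0.(bias|weight)$")
matched = [n for n in param_names if pattern.match(n)]
print(matched)  # ['linear_relu_stack.0.weight', 'linear_relu_stack.0.bias']
```

Only the parameters of layer 0 match; layer 2 is left unregularized.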

The regularizer itself is an instance of L1Regularizer or L2Regularizer, on which you specify the alpha.
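For reference, the penalty each regularizer contributes to the loss is simple: L2 adds alpha times the sum of squared parameter values, while L1 uses absolute values instead. A framework-free sketch of the L2 case, with made-up parameter values:

```python
# Minimal sketch of an L2 penalty: alpha * sum of squared parameter
# values. The parameter values below are made up for illustration.
def l2_penalty(parameters, alpha):
    return alpha * sum(x * x for p in parameters for x in p)

params = [[0.5, -0.5], [1.0]]  # two tiny "parameter tensors"
penalty = l2_penalty(params, alpha=0.01)
print(penalty)  # 0.01 * (0.25 + 0.25 + 1.0) = 0.015
```

During training, this penalty is simply added to the model's loss, nudging weights toward zero.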

Putting it together, your model with regularization added would look like this:

from allennlp.nn.regularizers.regularizer_applicator import RegularizerApplicator
from allennlp.nn.regularizers.regularizers import L2Regularizer

AcademicPaperClassifier(*model_args, regularizer=RegularizerApplicator([
    (".*LSTM.*", L2Regularizer(alpha=0.01)),
    (".*FFN.*",  L2Regularizer(alpha=0.01)),
]))
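One caveat: parameter names are lowercase attribute paths (in this model they would start with title_encoder., abstract_encoder., or classifier_feedforward.), so the ".*LSTM.*" and ".*FFN.*" regexes above are placeholders you would adapt to your model's actual names. A rough, framework-free sketch of what the applicator does internally, with hypothetical names and alphas:

```python
import re

# Rough sketch of RegularizerApplicator's behavior: each parameter name
# is tested against the regexes, and a match contributes that rule's
# penalty to the total. Names and values here are hypothetical.
def apply_regularizers(named_params, rules):
    total = 0.0
    for name, values in named_params:
        for regex, alpha in rules:
            if re.search(regex, name):
                total += alpha * sum(v * v for v in values)  # L2 penalty
                break
    return total

params = [
    ("title_encoder._module.weight_ih_l0", [1.0, -1.0]),
    ("classifier_feedforward._linear_layers.0.weight", [2.0]),
    ("text_field_embedder.token_embedder_tokens.weight", [3.0]),  # no match
]
rules = [(r"encoder", 0.01), (r"feedforward", 0.01)]
print(round(apply_regularizers(params, rules), 6))  # 0.06
```

Printing name for each entry of model.named_parameters() is the quickest way to see what your regexes need to match.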

The API docs are the best resource: http://docs.allennlp.org/v0.9.0/api/allennlp.nn.regularizers.html#allennlp.nn.regularizers.regularizer_applicator.RegularizerApplicator.from_params