I noticed that the current model does not set a default activation function. Since the paper confirms that the classification head must use softmax, I suggest adding it to the config.json file to point users to the recommended practice.

When someone changes the classification dropout in the model parameters (RobertaForSequenceClassification.from_pretrained), the classifier_dropout parameter, which according to the documentation typically specifies the classifier's dropout ratio, can be set to any activation function (softmax, sigmoid, tanh).

This PR suggests setting the default classification dropout to "softmax" in the model configuration.

stevugnin changed pull request status to open

It appears the documentation is right. The parameter is indeed the dropout rate and cannot be used to select an alternative activation function, unfortunately.
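For anyone landing here later, a minimal sketch of the distinction, assuming the usual setup where the classification head returns raw logits and softmax is applied to them afterwards. In the RobertaConfig documentation, classifier_dropout is a float (the dropout ratio for the classification head), so it has nothing to do with the activation. The logit values and the number of classes below are made up for illustration; they do not come from this model.

```python
import math

def softmax(logits):
    """Convert raw classifier logits into probabilities that sum to 1."""
    # Subtract the maximum logit first for numerical stability.
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one input over four classes (illustrative values only).
logits = [2.0, 0.5, -1.0, 0.1]

probs = softmax(logits)
predicted_class = probs.index(max(probs))

print(probs)
print(predicted_class)
```

So softmax is a post-processing step on the model output, not something a dropout setting in config.json can switch on.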

I am closing the PR.

stevugnin changed pull request status to closed
Computer Incident Response Center Luxembourg org

Hello @stevugnin

Thank you for your comment. For your information, the model is generated with this project:
https://github.com/vulnerability-lookup/VulnTrain
The paper is a bit more than a year old now, and we are still testing and improving the trainer, which has already changed since then. If you have ideas, feel free to create an issue on GitHub.

Thank you
