Spacy Transformers Versions

🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy

4 years ago

Add support for GLUE benchmark tasks.
Support text-pair classification. The details are likely to change; see run_glue.py for current usage.
Improve reliability of tokenization and alignment.
Add support for segment IDs to the PyTT_Wrapper class. These can now be passed in as a second column of the RaggedArray input. See the model_registry.get_word_pieces function for example usage.
Set default maximum sequence length to 128.
Fix bug that caused settings not to be passed into PyTT_TextCategorizer on model initialization.
Fix serialization of XLNet model.
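
As a rough illustration of the segment ID entry above, the sketch below pairs each word-piece ID with a segment ID, giving the two-column layout the notes describe for text-pair input. It is plain Python and does not call PyTT_Wrapper, RaggedArray or model_registry.get_word_pieces. The function name pair_rows, the example IDs, and the choice to truncate from the end at the 128 default are all assumptions made for illustration, not the library's behaviour.

```python
MAX_SEQ_LENGTH = 128  # default maximum sequence length named in the notes


def pair_rows(ids_a, ids_b, max_length=MAX_SEQ_LENGTH):
    """Return (word_piece_id, segment_id) rows for a pair of texts.

    Hypothetical helper: column one holds the word-piece IDs of both
    texts back to back, column two holds 0 for the first text and 1
    for the second. Truncating from the end is an assumption.
    """
    rows = [(wp_id, 0) for wp_id in ids_a] + [(wp_id, 1) for wp_id in ids_b]
    return rows[:max_length]


# Made-up word-piece IDs for two short texts.
rows = pair_rows([101, 7592, 102], [2088, 102])
for wp_id, segment_id in rows:
    print(wp_id, segment_id)
```

For actual usage with the library's own types, model_registry.get_word_pieces and run_glue.py remain the references.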

4 years ago

4 years ago

⚠️ This version requires downloading new models.

Fix issue #15: Fix serialization of config and make models load correctly offline.
Improve accuracy of textcat by passing hyper-parameters correctly (Adam epsilon, L2).
Support pooler output for BERT model.
Add fine_tune_pooler_output model architecture option for pytt_textcat.
Add GLUE benchmark script in examples/tasks/run_glue.py.
Improve overall stability.

4 years ago