These implementations have been tested on several datasets (see the examples) and should match the performance of the associated TensorFlow implementations (e.g. ~91 F1 on SQuAD for BERT, ~88 F1 on ...