Lexical Resource

Embeddings trained on CONLL2017 Corpora


eng The embeddings were trained with finalfrontier on the CONLL2017 corpora with more than 100 million tokens. For all languages, embeddings were trained with the skipgram and structgram algorithms and contain subword n-grams. All embeddings are stored in the finalfusion format and can be used and processed with tools provided by the finalfusion ecosystem.

N-gram range (inclusive): 3 - 6
Number of hashing buckets: 2^21
Hashing function: FNV-1a
Window size: 10
Negative samples: 5
Dimensions: 300
Minimum token frequency: 30
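
To illustrate how the subword parameters above fit together, here is a minimal Python sketch that extracts character n-grams of length 3 to 6 from a word and maps each one to one of 2^21 buckets with 64-bit FNV-1a. It is a sketch under stated assumptions, not the finalfrontier implementation: the `<` and `>` word-boundary markers, the 64-bit hash width, hashing the raw UTF-8 bytes, and reducing the hash by masking its low 21 bits are all assumptions, and the function names are hypothetical. The resulting indices are therefore not guaranteed to match those stored in the published embeddings.

```python
# 64-bit FNV-1a constants (standard published values).
FNV_OFFSET_BASIS = 0xCBF29CE484222325
FNV_PRIME = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """Hash a byte string with 64-bit FNV-1a (XOR first, then multiply)."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & MASK_64
    return h


def subword_ngrams(word: str, min_n: int = 3, max_n: int = 6) -> list:
    """Return all character n-grams with min_n <= n <= max_n (inclusive).

    Assumption: the word is wrapped in '<' and '>' boundary markers first.
    """
    bracketed = f"<{word}>"
    return [
        bracketed[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(bracketed) - n + 1)
    ]


def bucket_index(ngram: str, bucket_exp: int = 21) -> int:
    """Map an n-gram to one of 2**bucket_exp buckets.

    Assumption: the bucket is the low bucket_exp bits of the hash
    of the n-gram's UTF-8 bytes.
    """
    return fnv1a_64(ngram.encode("utf-8")) & ((1 << bucket_exp) - 1)


if __name__ == "__main__":
    for ngram in subword_ngrams("the"):
        print(ngram, bucket_index(ngram))
```

Because several n-grams can fall into the same bucket, a fixed table of 2^21 subword vectors can cover an open-ended set of n-grams, at the cost of occasional collisions.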

2020-09-15

1

41d1ad19-4548-45f9-b43c-186315227aff

8cefa5dd-f5fb-4527-8acb-88cc6824eb48

No linked resources are available!