Embeddings trained on CONLL2017 Corpora

Lexical resource

Title

eng Embeddings trained on CONLL2017 Corpora

Resource_description

eng The embeddings were trained with finalfrontier on the CONLL2017 corpora with more than 100 million tokens. For all languages, embeddings were trained with the skipgram and structgram algorithms and contain subword n-grams. All embeddings are stored in the finalfusion format and can be used and processed with tools provided by the finalfusion ecosystem.
N-gram range (inclusive): 3 - 6
Number of hashing buckets: 2^21
Hashing function: FNV-1a
Window size: 10
Negative samples: 5
Dimensions: 300
Minimum token frequency: 30
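The subword parameters above (n-gram range 3 - 6, 2^21 buckets, FNV-1a) can be illustrated with a minimal sketch of hashed subword indexing. This is not the finalfrontier implementation: the `<` and `>` word boundary markers, the hashing of raw UTF-8 bytes, and the reduction to a bucket by bit masking are assumptions made for illustration, and the helper names `fnv1a_64`, `subword_ngrams`, and `bucket_indices` are hypothetical. The indices it produces should not be expected to match those stored in the published embeddings.

```python
# Illustrative sketch only; not the finalfrontier/finalfusion indexer.
FNV_OFFSET_BASIS = 0xCBF29CE484222325  # standard 64-bit FNV offset basis
FNV_PRIME = 0x100000001B3              # standard 64-bit FNV prime
MASK_64 = 0xFFFFFFFFFFFFFFFF

MIN_N, MAX_N = 3, 6    # n-gram range (inclusive), from the record
BUCKET_EXP = 21        # 2^21 hashing buckets, from the record


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a: XOR each byte into the state, then multiply by the prime."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & MASK_64
    return h


def subword_ngrams(word: str, min_n: int = MIN_N, max_n: int = MAX_N) -> list:
    """Character n-grams of the word wrapped in '<' and '>' (assumed markers)."""
    bracketed = f"<{word}>"
    return [
        bracketed[start:start + n]
        for n in range(min_n, max_n + 1)
        for start in range(len(bracketed) - n + 1)
    ]


def bucket_indices(word: str, bucket_exp: int = BUCKET_EXP) -> list:
    """Map each n-gram to one of 2^bucket_exp buckets (masking is an assumption)."""
    mask = (1 << bucket_exp) - 1
    return [fnv1a_64(ngram.encode("utf-8")) & mask for ngram in subword_ngrams(word)]


if __name__ == "__main__":
    for ngram, index in zip(subword_ngrams("the"), bucket_indices("the")):
        print(ngram, index)
```

Under this scheme a word's vector can be composed from the vectors of its n-gram buckets, which is what lets subword embeddings produce representations for words not seen in training.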

Md_id

https://doi.org/10.57754/FDAT.eh5fz-7ec28

Md_timestamp

2020-09-15

Lc_version

Tech_landing_page

https://doi.org/10.57754/FDAT.eh5fz-7ec28

entityId

41d1ad19-4548-45f9-b43c-186315227aff

sourceId

8cefa5dd-f5fb-4527-8acb-88cc6824eb48

No linked resources are available!