GPT2.Tokenizer

GPT-2 tokenizer instance with BPE.
Create a BPE tokenizer for GPT-2. Either provide vocab_file and merges_file paths, or a model_id to download from Hugging Face (defaults to gpt2).
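A possible usage sketch. The exact signature of the constructor is not shown above, so the function name `create`, its labelled arguments, and the trailing `unit` parameter are all assumptions based on the description:

```ocaml
(* Assumed signature, not confirmed by the docs above:
   val create :
     ?vocab_file:string -> ?merges_file:string -> ?model_id:string ->
     unit -> t *)

(* Download vocabulary and merges from Hugging Face (model_id defaults
   to gpt2 per the description above). *)
let tokenizer = GPT2.Tokenizer.create ~model_id:"gpt2" ()

(* Or load from local BPE files instead. *)
let tokenizer_local =
  GPT2.Tokenizer.create
    ~vocab_file:"vocab.json" ~merges_file:"merges.txt" ()
```

Either construction path should yield the same tokenizer type `t` used by the encoding functions below.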
Encode text directly to input tensors ready for the forward pass.
val encode_batch :
  t ->
  ?max_length:int ->
  ?padding:bool ->
  string list ->
  (int32, Rune.int32_elt) Rune.t

Encode multiple texts with optional padding.
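A sketch of calling `encode_batch` with the signature above. The tokenizer value `tok` is assumed to come from the module's constructor, and the specific `max_length` and input strings are illustrative only:

```ocaml
(* [tok : GPT2.Tokenizer.t] is assumed to have been created earlier. *)
let batch : (int32, Rune.int32_elt) Rune.t =
  GPT2.Tokenizer.encode_batch tok
    ~max_length:128 ~padding:true
    [ "Hello, world!"; "OCaml meets GPT-2." ]
(* [batch] holds int32 token ids; with ~padding:true the shorter text
   is padded so both rows share one length, ready for a forward pass. *)
```

Passing `?padding` and `?max_length` as labelled optional arguments lets callers omit them for single-length batches.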