Module Saga_models.Ngram

N-gram language models (unigram, bigram, trigram)
N-gram language models for text generation
type t
An n-gram model.

type stats
Statistics about the trained model.
Smoothing strategies:
- Add_k k: classic add-k (Laplace) smoothing.
- Stupid_backoff alpha: back off to lower orders, scaled by alpha.

create ~n ?smoothing ?cache_capacity tokens builds a model with configurable smoothing and an optional logits cache.
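For concreteness, a minimal sketch of building a model, following the call shape above. The token ids, alpha value, and cache size are made-up illustrations, and the assumption that Stupid_backoff carries a float alpha is inferred from the description rather than stated here.

  (* Toy corpus of token ids; in practice these come from a tokenizer. *)
  let tokens = [| 0; 1; 2; 1; 2; 3; 0; 1; 2 |]

  (* Trigram model with stupid backoff (alpha assumed to be a float)
     and a small logits cache. All values are illustrative. *)
  let model =
    Saga_models.Ngram.create ~n:3
      ~smoothing:(Saga_models.Ngram.Stupid_backoff 0.4)
      ~cache_capacity:1024
      tokens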
logits model ~context returns log probabilities given context. Context length should be n-1 for an n-gram model.
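Continuing the sketch above, querying the model for next-token scores. Treating the result as a float array indexed by token id is an assumption for illustration; this page does not specify the return type.

  (* For the trigram model above, the context must hold n - 1 = 2 ids. *)
  let scores = Saga_models.Ngram.logits model ~context:[| 1; 2 |]

  (* scores.(i) would then be the log probability that token i follows
     the context, under the assumed return type. *)
  let () = Printf.printf "log P(3 | 1 2) = %f\n" scores.(3)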
perplexity model tokens computes perplexity on the test tokens.
log_prob model tokens returns the sum of log-probabilities of the observed tokens under the model.
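If perplexity here follows the textbook definition, the two evaluation functions are related by ppl = exp(-log_prob / N) over N test tokens. A hedged check, reusing the model value from the sketch above; the convention is assumed, not confirmed by this page.

  let test_tokens = [| 0; 1; 2; 3; 0; 1 |]
  let ppl = Saga_models.Ngram.perplexity model test_tokens
  let lp = Saga_models.Ngram.log_prob model test_tokens

  (* Under the standard convention these two numbers should agree. *)
  let () =
    Printf.printf "ppl = %.4f, exp(-lp/N) = %.4f\n" ppl
      (exp (-. lp /. float_of_int (Array.length test_tokens)))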
val generate :
  t ->
  ?max_tokens:int ->
  ?temperature:float ->
  ?seed:int ->
  ?start:int array ->
  unit ->
  int array

generate model ?max_tokens ?temperature ?seed ?start () generates tokens from the model.
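A usage sketch for generate. The signature is as shown above, but the parameter semantics (token cap, sampling temperature, RNG seed, prompt prefix) are inferred from the names rather than stated on this page.

  (* Sample up to 50 tokens at temperature 0.8, seeded for
     reproducibility, starting from a two-token prompt. *)
  let sampled =
    Saga_models.Ngram.generate model ~max_tokens:50 ~temperature:0.8
      ~seed:42 ~start:[| 1; 2 |] ()

  let () = Printf.printf "generated %d tokens\n" (Array.length sampled)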
stats model returns statistics about the highest-order n-grams.
save_text model filename serializes the model to a text file.
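Putting the last two entry points together, a short hedged sketch; the filename is illustrative and the shape of the stats value is not specified here.

  (* Inspect the highest-order n-gram statistics and persist the model. *)
  let _stats = Saga_models.Ngram.stats model
  let () = Saga_models.Ngram.save_text model "trigram-model.txt"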