Hugging Face
Run Selene-Mini through Hugging Face transformers.
Model card
You can find our Hugging Face model card here.
Quickstart:
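A minimal sketch of loading Selene-Mini with `transformers` and generating an evaluation. The repo id (`AtlaAI/Selene-1-Mini-Llama-3.1-8B`) and the prompt wording below are assumptions — check the model card and the prompt templates for the exact ones.

```python
def build_messages(instruction: str, response: str) -> list[dict]:
    """Wrap an instruction/response pair into a chat message for the judge.

    The prompt wording here is an assumption; in practice, use the prompt
    templates provided with the model.
    """
    prompt = (
        "Evaluate the following response to an instruction.\n\n"
        f"Instruction: {instruction}\n"
        f"Response: {response}\n\n"
        "Give your reasoning, then a final score from 1 to 5."
    )
    return [{"role": "user", "content": prompt}]


def run_selene(instruction: str, response: str) -> str:
    """Load the model and generate an evaluation (downloads the 8B weights)."""
    # Imported lazily so the prompt helper above can be used
    # without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "AtlaAI/Selene-1-Mini-Llama-3.1-8B"  # assumed repo id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer.apply_chat_template(
        build_messages(instruction, response),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=512)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```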
Prompt templates
To achieve best results, we provide the prompts we used for training here.
These templates match the format the model was trained on, so using them verbatim avoids a distribution shift between training and inference.
Cookbooks
Try our cookbooks to start running two popular use cases straight away.
Absolute scoring
This example gets you started running evals with absolute scores, using a sample set from FLASK, a public benchmark of 1,740 human-annotated samples drawn from 120 NLP datasets. Evaluators assign a score from 1 to 5 for each annotated skill, based on the reference (ground-truth) answer and skill-specific scoring rubrics.
Here, we evaluate the completeness of AI responses, i.e. ‘Does the response provide a sufficient explanation?’
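With absolute scoring, the judge returns a free-text critique, so you need to extract the final 1–5 score from its output. A small sketch of one way to do this — the `Score:` marker is an assumption, so match the pattern to whichever prompt template you actually use:

```python
import re


def parse_absolute_score(judge_output: str, lo: int = 1, hi: int = 5):
    """Extract the final integer score from a judge's free-text critique.

    Prefers an explicit "Score: N" marker (format is an assumption), falls
    back to the last standalone integer, and returns None when no value in
    the [lo, hi] range is found.
    """
    m = re.search(r"[Ss]core\s*[:\-]?\s*([1-9]\d*)", judge_output)
    if m:
        value = int(m.group(1))
    else:
        # Fall back to the last bare integer in the text.
        candidates = re.findall(r"\b([1-9]\d*)\b", judge_output)
        if not candidates:
            return None
        value = int(candidates[-1])
    return value if lo <= value <= hi else None
```

Returning `None` instead of a default score keeps unparseable outputs visible, so they can be re-run or inspected rather than silently skewing the eval.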
RAG hallucination
This example gets you started detecting hallucinations, running over a sample set from RAGTruth, a public benchmark: a large-scale corpus of naturally generated hallucinations with detailed word-level annotations designed specifically for retrieval-augmented generation (RAG) scenarios.
Here, we check AI responses for hallucination, i.e. ‘Is the information provided in the response directly supported by the context given in the related passages?’
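For hallucination detection, the judge answers a binary grounding question rather than assigning a score. A sketch of assembling the RAG prompt from retrieved passages and reading off a Yes/No verdict — the prompt wording and verdict format are assumptions, so use the published prompt templates in practice:

```python
def build_rag_prompt(question: str, passages: list[str], response: str) -> str:
    """Assemble a grounding-check prompt (wording is an assumption)."""
    context = "\n\n".join(
        f"Passage {i + 1}: {p}" for i, p in enumerate(passages)
    )
    return (
        f"Question: {question}\n\n"
        f"Related passages:\n{context}\n\n"
        f"Response: {response}\n\n"
        "Is the information provided in the response directly supported by "
        "the context given in the related passages? Answer Yes or No."
    )


def parse_verdict(judge_output: str):
    """Map the judge's answer to True (supported) / False (hallucinated)."""
    text = judge_output.strip().lower()
    if text.startswith("yes"):
        return True
    if text.startswith("no"):
        return False
    return None  # verdict not recognized; inspect the raw output
```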