Example Files
Task List
An example task.yml file:
- name: product
  template: "find the product of {a} and {b}"
  values:
    a: [2,3,5]
    b: [3,4,5]
  reference: calculated
  scorer: "calculated_answer"
  format: "IntAnswer"
- name: product_combination
  template: "find the product of {a} and {b}"
  values:
    a: [2,3,5]
    b: [3,4,5]
  reference: calculated
  value_combinations: combinations
  scorer:
    - calculated_answer
    - add_justify
  format: "IntJustification"
- name: symbol
  template: "what is the name for the following symbol? {symb}"
  values:
    symb: ["@","$","#"]
    exmeta: ['internet', 'world', 'internet']
  reference: ["at", "dollar sign", "pound"]
  scorer: "contains"
- name: symbol_dir
  template: "what direction does the {symb} symbol point? and what is its name "
  values:
    symb: ["^","<",">"]
    name: [carat, less,greater]
    direction: ["up", "left", "right"]
  scorer: check_name_dir
  reference: calculated
  format: NameSource
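To make the difference between the product and product_combination tasks concrete, here is a minimal sketch of how a template might be expanded into prompts. It assumes that value lists are paired element-wise by default and that value_combinations: combinations means every combination of the listed values is used. Both behaviors, and the helper name expand_prompts, are assumptions for illustration and not the tool's actual implementation.

```python
from itertools import product


def expand_prompts(template, values, combinations=False):
    """Fill a template once per set of values (hypothetical helper).

    Assumption: lists are paired element-wise unless combinations=True,
    in which case every combination of the listed values is used.
    """
    keys = list(values)
    rows = product(*values.values()) if combinations else zip(*values.values())
    return [template.format(**dict(zip(keys, row))) for row in rows]


template = "find the product of {a} and {b}"
values = {"a": [2, 3, 5], "b": [3, 4, 5]}

paired = expand_prompts(template, values)
combined = expand_prompts(template, values, combinations=True)

print(len(paired))    # 3
print(len(combined))  # 9
print(paired[0])      # find the product of 2 and 3
```

Under these assumptions the same three-value lists yield three prompts when paired and nine when combined.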
Folder-Specified Task
A folder-specified task comprises a text file containing the template:
what is {a} + {b}?
and a CSV file containing the values for the template fields:
a,b,reference
2,3,5
4,5,9
8,9,17
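As a rough illustration of how the template and the CSV fit together, the sketch below fills the template once per CSV row using only the standard library. The two strings stand in for the files in the task folder, and the helper name build_prompts is hypothetical; the tool's own loading code may differ.

```python
import csv
import io

# Stand-ins for the two files in the task folder (contents from the example above).
template_text = "what is {a} + {b}?"
values_csv = "a,b,reference\n2,3,5\n4,5,9\n8,9,17\n"


def build_prompts(template, csv_text):
    """Return (prompt, reference) pairs, one per CSV row (hypothetical helper)."""
    pairs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        reference = row.pop("reference")  # remaining columns fill the template
        pairs.append((template.format(**row), reference))
    return pairs


for prompt, reference in build_prompts(template_text, values_csv):
    print(prompt, "->", reference)
```

Note that csv.DictReader returns every field as a string, so a scorer comparing against these references would need to convert types itself.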
Runner Specification
Single run settings
runner_type: ollama
model: 'llama3.2'
Specifying multiple models
runner_type: ollama
model:
  - 'llama3.2'
  - 'gemma3'
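Because model may be either a single string or a list, code that consumes a runner specification has to handle both shapes. The sketch below shows one way to normalize them. The dictionaries stand in for the parsed YAML, and models_from_config is a hypothetical helper, not part of the tool.

```python
def models_from_config(config):
    """Return the configured models as a list (hypothetical helper)."""
    model = config["model"]
    # A single model is given as a string; multiple models as a list.
    return [model] if isinstance(model, str) else list(model)


# Stand-ins for the two runner specifications above, as parsed YAML.
single = {"runner_type": "ollama", "model": "llama3.2"}
multiple = {"runner_type": "ollama", "model": ["llama3.2", "gemma3"]}

for config in (single, multiple):
    for model in models_from_config(config):
        print(config["runner_type"], model)
```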
Custom Response Format
Classes should be modeled on those in the responses class. Use pydantic.BaseModel for the response formats and enum.Enum to restrict options.
from pydantic import BaseModel
from enum import Enum
To constrain the options, create a class for them:
class Direction(str, Enum):
    left = "left"
    right = "right"
    up = "up"
Then that class can be a field in a response:
class NameSource(BaseModel):
    name: str
    direction: Direction
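The effect of the Direction enum can be checked with the standard library alone. The sketch below repeats the enum so that it runs on its own, and parse_direction is a hypothetical helper added only to show that values outside the listed options are rejected.

```python
from enum import Enum


class Direction(str, Enum):
    left = "left"
    right = "right"
    up = "up"


def parse_direction(text):
    """Return the matching Direction, or None if text is not an allowed option."""
    try:
        return Direction(text)
    except ValueError:
        return None


print(parse_direction("up") == "up")  # True: str-based members compare equal to plain strings
print(parse_direction("down"))        # None: "down" is not one of the options
```

Inheriting from str as well as Enum is what lets a scorer compare the parsed value directly against a plain string.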
Custom Scorer
For custom scoring, add a file named custom_scorer.py. One file can contain multiple functions. Each function should take two inputs: the response and the reference.
If reference is set to "calculated" in the tasks.yml for a list-style task or the info.yml for a folder-style task, then the reference will be a dictionary of the values for that prompt, with the prompt_id added.
import json


def calculated_answer(response, values):
    '''
    example function for calculating the correct answer from the values
    '''
    # parse the response object
    response_object = json.loads(response)
    # compute the answer
    ref = values['a'] * values['b']
    # compare the computed answer with the response
    return int(ref == response_object['answer'])


def check_name_dir(response, values):
    '''
    example function that scores the name and the direction separately
    '''
    # parse the response object
    response_object = json.loads(response)
    return {'name': values['name'] == response_object['name'],
            'direction': values['direction'] == response_object['direction']}


def add_justify(response, values):
    '''
    example function that checks whether the justification mentions "add"
    '''
    response_object = json.loads(response)
    return int('add' in response_object['justification'])
The function can return either a scalar numerical value or a dictionary for multiple values.
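A scorer can be tried out locally before it is used in a benchmark. The sketch below calls check_name_dir on a hand-written response and shows the dictionary it returns. It assumes the response arrives as a JSON string matching the NameSource format and that the values dictionary looks like one row of the symbol_dir task. The prompt_id value and the summarize helper are hypothetical, included only to contrast scalar and dictionary scores.

```python
import json


def check_name_dir(response, values):
    """Score name and direction separately, returning one entry per field."""
    response_object = json.loads(response)
    return {
        "name": values["name"] == response_object["name"],
        "direction": values["direction"] == response_object["direction"],
    }


def summarize(score):
    """Collapse a scalar or dictionary score into one number (hypothetical helper)."""
    if isinstance(score, dict):
        return sum(int(v) for v in score.values()) / len(score)
    return float(score)


# Assumed inputs: a JSON response and one row of values with a made-up prompt_id.
response = json.dumps({"name": "less", "direction": "right"})
values = {"symb": "<", "name": "less", "direction": "left", "prompt_id": 1}

score = check_name_dir(response, values)
print(score)             # {'name': True, 'direction': False}
print(summarize(score))  # 0.5
```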