CLI#
We can initialize a benchmark without tasks, then add a task and run it:
cd demos
benchtool init testbench -a "to test a simple example" --no-git
cd testbench
benchtool add-task ../new_test/ FillIn ../datasets/miscops/
benchtool run demos/folderbench
benchtool#
BenchTools is a tool that helps researchers set up benchmarks.
Usage
benchtool [OPTIONS] COMMAND [ARGS]...
add-task#
Set up a new task.
Usage
benchtool add-task [OPTIONS] TASK_NAME
Options
- -p, --benchmark-path <benchmark_path>#
The path to the benchmark repository where the task will be added.
- -s, --task-source <task_source>#
Required The relative path to content that already exists.
- -t, --task-type <task_type>#
Required The type of the task content being added.
- Options:
folders | list
Arguments
- TASK_NAME#
Required argument
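As a sketch of a typical invocation (assuming benchtool is installed and on PATH; the task name, benchmark path, and source path below are placeholders):

```shell
# Add a task named "FillIn" to an existing benchmark at ./testbench,
# copying content from a relative source path; task type "folders"
# is one of the documented options (folders | list).
benchtool add-task FillIn -p ./testbench -s ../new_test/ -t folders
```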
init#
Initializes a new benchmark.
If BENCHMARK_NAME is not provided, it will be requested interactively.
This command creates the folder for the benchmark.
Usage
benchtool init [OPTIONS] [BENCHMARK_NAME]
Options
- -p, --path <path>#
The path where the new benchmark repository will be placed
- -a, --about <about>#
Benchmark description. Content will go in the about.md file.
- --no-git#
Don’t make the benchmark a git repository. Default is False.
Arguments
- BENCHMARK_NAME#
Optional argument
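A sketch of a non-interactive invocation, assuming benchtool is installed; the benchmark name, path, and description here are illustrative:

```shell
# Create a benchmark called "mybench" under ./benchmarks,
# with a short description (written to about.md) and no git repository.
benchtool init mybench -p ./benchmarks -a "a simple example benchmark" --no-git
```

Omitting BENCHMARK_NAME instead triggers the interactive prompt described above.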
run#
Runs the benchmark and generates logs. BENCHMARK_PATH is the path to the benchmark repository where all the tasks reside.
Usage
benchtool run [OPTIONS] BENCHMARK_PATH
Options
- -r, --runner-type <runner_type>#
The engine that will run your LLM.
- Options:
ollama | openai | aws
- -m, --model <model>#
The LLM to be benchmarked.
- -a, --api-url <api_url>#
The API URL required to access the runner engine.
- -l, --log-path <log_path>#
The path to a log directory.
Arguments
- BENCHMARK_PATH#
Required argument
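Putting the options together, a run against a local model served by Ollama might look like the following sketch (model name, URL, and paths are placeholders, not values taken from this documentation):

```shell
# Run every task in the benchmark at ./testbench against a local
# Ollama model, writing logs to ./logs.
benchtool run ./testbench -r ollama -m llama3 -a http://localhost:11434 -l ./logs
```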
run-task#
Runs a single task and generates logs. BENCHMARK_PATH is the path to the benchmark repository where all the tasks reside; TASK_NAME is the name of the specific task you would like to run.
Usage
benchtool run-task [OPTIONS] BENCHMARK_PATH TASK_NAME
Options
- -r, --runner-type <runner_type>#
The engine that will run your LLM.
- Options:
ollama | openai | aws
- -m, --model <model>#
The LLM to be benchmarked.
- -a, --api-url <api_url>#
The API URL required to access the runner engine.
- -l, --log-path <log_path>#
The path to a log directory.
Arguments
- BENCHMARK_PATH#
Required argument
- TASK_NAME#
Required argument
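run-task takes the same options as run but targets one task. A sketch, with an illustrative benchmark path, task name, and model:

```shell
# Run only the "FillIn" task from the benchmark at ./testbench.
benchtool run-task ./testbench FillIn -r openai -m gpt-4o -l ./logs
```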