BenchTools#
A library for building and running benchmarks.
Install#
You can install BenchTools directly from GitHub or from a local clone.
Direct install#
To install directly from GitHub:
pip install git+https://github.com/ml4sts/benchtools.git
Install from a clone#
You can clone the repository first:
git clone https://github.com/ml4sts/benchtools.git
and then install it:
pip install ./benchtools
(or pip3, depending on your system)
If you clone in order to develop, you may want to install with pip’s -e (editable) option:
pip install -e benchtools
To update, pull and install again.
Usage#
benchtools allows you to express templated tasks in multiple ways:
- a YAML file listing the tasks, with a values key for the variations of each task
- a folder for each task, containing a txt file with the template and a csv file of values for the variations of the task
A benchmark can consist of tasks that all use a single format above, or a mixture of meta-tasks, each represented as a folder containing specific tasks in one of the forms above.
You can mix and match between the two formats within a single benchmark.
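As a rough sketch of the two formats (the exact schema is not documented here; only the values key comes from the text above, so all other field and file names are assumptions):

```yaml
# Hypothetical sketch only -- field names other than "values" are assumed.
# List form: one YAML file describing tasks and their value variations.
tasks:
  - name: addition
    template: "What is {a} + {b}?"
    values:
      - {a: 1, b: 2}
      - {a: 3, b: 4}

# Folder form: one directory per task, e.g.
#   addition/
#     template.txt   <- the prompt template
#     values.csv     <- one row per variation of the task
```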
Contents:
benchtools#
BenchTools is a tool that helps researchers set up benchmarks.
Usage
benchtools [OPTIONS] COMMAND [ARGS]...
add-task#
Set up a new task.
Usage
benchtools add-task [OPTIONS] TASK_NAME
Options
- -p, --benchmark-path <benchmark_path>#
The path to the benchmark repository where the task will be added.
- -s, --task-source <task_source>#
Required The relative path to content that already exists.
- -t, --task-type <task_type>#
Required The type of the task content being added: folders (one folder per task) or list (a single YAML file).
- Options:
folders | list
Arguments
- TASK_NAME#
Required argument
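For example, adding a task called addition from existing content in folder form (the benchmark and task paths here are placeholders):

```shell
benchtools add-task -p my-benchmark -s tasks/addition -t folders addition
```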
init#
Initializes a new benchmark.
If BENCHMARK_NAME is not provided, it will be requested interactively.
This command creates the folder for the benchmark.
Usage
benchtools init [OPTIONS] [BENCHMARK_NAME]
Options
- -p, --path <path>#
The path where the new benchmark repository will be placed
- -a, --about <about>#
Benchmark description. Content will go in the about.md file.
- --no-git#
Don’t make the benchmark a git repository. Default is False, so a git repository is created unless this flag is passed.
Arguments
- BENCHMARK_NAME#
Optional argument
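For example, creating a benchmark called my-benchmark in the current directory (the name and description are placeholders):

```shell
benchtools init -p . -a "Arithmetic tasks for small models" my-benchmark
```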
run#
Run the benchmark and generate logs. BENCHMARK_PATH is the path to the benchmark repository where all the tasks reside.
Usage
benchtools run [OPTIONS] BENCHMARK_PATH
Options
- -r, --runner-type <runner_type>#
The engine that will run your LLM.
- Options:
ollama | openai | aws
- -m, --model <model>#
The LLM to be benchmarked.
- -a, --api-url <api_url>#
The API URL used to access the runner engine.
- -l, --log-path <log_path>#
The path to a log directory.
Arguments
- BENCHMARK_PATH#
Required argument
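For example, running a benchmark against a local Ollama model (the model name, log directory, and benchmark path are placeholders):

```shell
benchtools run -r ollama -m llama3 -l logs my-benchmark
```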
run-task#
Run a single task and generate logs. BENCHMARK_PATH is the path to the benchmark repository where all the tasks reside; TASK_NAME is the name of the specific task you would like to run.
Usage
benchtools run-task [OPTIONS] BENCHMARK_PATH TASK_NAME
Options
- -r, --runner-type <runner_type>#
The engine that will run your LLM.
- Options:
ollama | openai | aws
- -m, --model <model>#
The LLM to be benchmarked.
- -a, --api-url <api_url>#
The API URL used to access the runner engine.
- -l, --log-path <log_path>#
The path to a log directory.
Arguments
- BENCHMARK_PATH#
Required argument
- TASK_NAME#
Required argument
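For example, running only the task named addition from a benchmark (all names here are placeholders):

```shell
benchtools run-task -r ollama -m llama3 my-benchmark addition
```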