hlibbabii/bohr


Ideally, the added heuristics improve the metrics on all test datasets, which can be seen in the comment that the GitHub bot leaves on the submitted PR. In this case the PR can be merged, and the set of heuristics will be updated 🎉

However, metrics might not improve, or might even become worse, on some (or all) test datasets despite the heuristics being reasonable. We suggest the following steps to debug the heuristics.

1. If the metrics haven't changed, make sure that BOHR 'saw' the added heuristic(s). Check that the file metrics/<task_name>/analysis_<dataset_name>.json has changed for any dataset and now has entries for the newly added heuristic(s).
2. If BOHR 'saw' the heuristic, it might not cover enough datapoints to have a significant impact on the metrics. See how the coverage has changed for the different datasets by checking the following file: metrics/<task_name>/<corresponding-heuristic-group>/heuristic_metrics_<dataset_name>.json. If the coverage value is lower than expected, there might be a bug in your heuristic. Note! Some heuristics can be designed to have zero coverage on the test set(s). Adding such heuristics can still increase performance by improving the weights of the heuristics that do fire on the test set(s).
3. Also check metrics/<task_name>/analysis_<dataset_name>.json for suspicious values of 'Polarity', 'Coverage', 'Conflicts', 'Correct', 'Incorrect', and 'Emp. Accuracy' (TODO: elaborate more).
4. Have the metrics got worse on all the test datasets, or have some of them improved? TODO how do we handle this case?
5. If there seems to be no bug in your heuristic and all the metrics consistently got worse, use the BOHR debugging suite to inspect individual datapoints:

git checkout new-heuristic-branch
dvc pull
dvc checkout
bohr debug <task-name> <dataset-name>

This will show you the datapoints whose probabilistic labels have changed the most. Example output of the command:

To see a single datapoint in detail and how each fired heuristic contributed to its label, run the following command:

bohr debug <task-name> <dataset-name> <datapoint-id>
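The first checks above (did BOHR 'see' the new heuristic, and what values does it report?) can also be scripted. The following is a minimal sketch, not part of BOHR: the function name `find_heuristic_entries`, the heuristic name `my_new_heuristic`, and the sample JSON layout are all assumptions made for illustration. Because the exact layout of analysis_<dataset_name>.json is not described here, the function simply searches the whole JSON tree for keys equal to the heuristic name.

```python
import json
from typing import Any, Dict


def find_heuristic_entries(analysis: Any, heuristic_name: str) -> Dict[str, Any]:
    """Map each JSON path whose last key equals `heuristic_name` to its value.

    Hypothetical helper: it makes no assumption about the file layout and
    walks nested dicts and lists until it finds matching keys.
    """
    found: Dict[str, Any] = {}

    def walk(node: Any, path: str) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                child_path = f"{path}/{key}"
                if key == heuristic_name:
                    found[child_path] = value
                walk(value, child_path)
        elif isinstance(node, list):
            for index, value in enumerate(node):
                walk(value, f"{path}/{index}")

    walk(analysis, "")
    return found


# Assumed sample content; the real analysis file may be structured differently.
# For a real file, use json.load() on the opened file instead.
sample = json.loads(
    '{"Coverage": {"my_new_heuristic": 0.02, "old_heuristic": 0.4},'
    ' "Conflicts": {"my_new_heuristic": 0.0, "old_heuristic": 0.1}}'
)

entries = find_heuristic_entries(sample, "my_new_heuristic")
print(entries)  # {'/Coverage/my_new_heuristic': 0.02, '/Conflicts/my_new_heuristic': 0.0}
```

An empty result suggests that BOHR did not pick up the heuristic; a very small coverage value points to the second step of the list above.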


DebuggingHeuristics.md 2.4 KB


hlibbabii / bohr mirror of https://github.com/giganticode/bohr.git
