Note
- Goal of creating the benchmark:
- Build trust - show the transparency of AI accuracy.
- Have a sense of capacity (?).
- Use this to informally evaluate Holistics AI vs. other tools.
- https://www.sigmacomputing.com/blog/text-to-sql-data-chat
- Common benchmarks such as Spider and BIRD target the text-to-SQL problem, not the business-question-to-insight problem.
- Idea: sync common AMQL questions from Zendesk.
Done
- DONE Finalize the test suite approach
Read Holistics’s AI Philosophy.
Research common/standard GenBI benchmarking methods.
If no standard GenBI benchmarking method exists, go with text-to-SQL.
Write the document and review it with Dat.
/ 2025-07-01
Created Tue, 01 Jul 2025 00:00:00 +0000
Modified Mon, 25 May 2026 06:02:25 +0000