Build production evaluation pipelines for LLM applications — golden datasets, LLM-as-judge, rubrics, statistical significance, regression detection, and evals vs tests.
Statistical distributions, optimization, integration, signal processing, and linear algebra with SciPy. Builds on NumPy arrays.
navigation
actions
cheat sheet pages