Discover how econometricians use math and statistics to analyze and predict economic trends, serving in finance and academia ...
SD-Eval is a benchmark dataset aimed at multidimensional evaluation of spoken dialogue understanding and generation. SD-Eval focuses on paralinguistic and environmental information and includes 7,303 ...
simple-evals is a lightweight, easy-to-deploy evaluation framework for large language models (LLMs), based on openai/simple-evals. It aims to provide researchers and developers with a simple ...