UpTrain is an open-source unified platform to evaluate and improve Gener...
[ACL 2024] Benchmarking the Hallucination of Chinese Large Language Mode...
Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate yo...