CV
Education
- B.Eng. student in Software Engineering, Fudan University, 2023–Present
Research Interests
- Trustworthy evaluation of large language models
- Medical NLP and real-world clinical benchmarks
- Open-ended novelty assessment and scientific intelligence
Links
- Website: https://huayusha.org
- GitHub: https://github.com/HuayuSha
- ORCID: https://orcid.org/0009-0006-1742-5816
- OpenReview: https://openreview.net/profile?id=~Huayu_Sha1
- Email: [email protected]
Selected Publications
- SciAgentGym: Benchmarking Multi-Step Scientific Tool-use in LLM Agents
  arXiv preprint · ICML 2026 submission (under review), 2026
- OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment
  arXiv preprint, 2026
- LLMEval-Fair: A Large-Scale Longitudinal Study on Robust and Fair Evaluation of Large Language Models
  ACL 2026 submission (under review), 2025
- LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation
  Findings of EMNLP 2025, 2025