publications

Please see my Google Scholar page for the full publication list.

2025

  1. Statutory Construction and Interpretation for Artificial Intelligence
    Luxi He*, Nimra Nadeem*, Michel Liao, Howard Chen, Danqi Chen, and 2 more authors
    NeurIPS RegML Workshop (Oral), 2025
  2. The Model Hears You: Audio Language Model Deployments Should Consider the Principle of Least Privilege
    Luxi He*, Xiangyu Qi*, Michel Liao, Inyoung Cheong, Prateek Mittal, and 2 more authors
    AIES (Oral), 2025
  3. Fantastic Copyrighted Beasts and How (Not) to Generate Them
    Luxi He*, Yangsibo Huang*, Weijia Shi*, Tinghao Xie, Haotian Liu, and 5 more authors
    ICLR 2025, ICML GenLaw Workshop (Spotlight), 2025
  4. Metadata Conditioning Accelerates Language Model Pre-training
    Tianyu Gao, Alexander Wettig, Luxi He, Yihe Dong, Sadhika Malladi, and 1 more author
    ICML, 2025
  5. On Evaluating the Durability of Safeguards for Open-Weight LLMs
    Xiangyu Qi, Boyi Wei, Nicholas Carlini, Yangsibo Huang, Tinghao Xie, and 5 more authors
    ICLR, 2025
  6. SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
    Tinghao Xie, Xiangyu Qi, Yi Zeng, Yangsibo Huang, Udari Madhushani Sehwag, and 11 more authors
    ICLR, 2025

2024

  1. What is in Your Safe Data? Identifying Benign Data that Breaks Safety
    Luxi He*, Mengzhou Xia*, and Peter Henderson
    Conference on Language Modeling (COLM), ICLR Workshop on Data Problems for Foundation Models (Best Paper), 2024
  2. CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
    Zirui Wang, Mengzhou Xia, Luxi He, Howard Chen, Yitao Liu, and 8 more authors
    NeurIPS Datasets & Benchmarks, 2024

2023

  1. Aleatoric and Epistemic Discrimination: Fundamental Limits of Fairness Interventions
    Hao Wang, Luxi He, Rui Gao, and Flavio Calmon
    NeurIPS (Spotlight), 2023