publications

Please see my Google Scholar page for the full publication list.

2025

  1. Fantastic Copyrighted Beasts and How (Not) to Generate Them
    Luxi He*, Yangsibo Huang*, Weijia Shi*, Tinghao Xie, Haotian Liu, and 5 more authors
    ICLR 2025, ICML GenLaw (Spotlight), 2025
  2. Metadata Conditioning Accelerates Language Model Pre-training
    Tianyu Gao, Alexander Wettig, Luxi He, Yihe Dong, Sadhika Malladi, and 1 more author
    Preprint, 2025
  3. The Deployment of End-to-End Audio Language Models Should Take into Account the Principle of Least Privilege
    Luxi He, Xiangyu Qi, Michel Liao, Inyoung Cheong, Prateek Mittal, and 2 more authors
    Preprint, 2025

2024

  1. What is in Your Safe Data? Identifying Benign Data that Breaks Safety
    Luxi He*, Mengzhou Xia*, and Peter Henderson
    Conference on Language Modeling (COLM), ICLR Data Problems in Foundation Model (Best Paper), 2024
  2. CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
    Zirui Wang, Mengzhou Xia, Luxi He, Howard Chen, Yitao Liu, and 8 more authors
    NeurIPS Datasets & Benchmarks, 2024

2023

  1. Aleatoric and Epistemic Discrimination: Fundamental Limits of Fairness Interventions
    Hao Wang, Luxi He, Rui Gao, and Flavio Calmon
    NeurIPS (Spotlight), 2023