Yizhong Wang

PhD student
Paul G. Allen School of Computer Science & Engineering
University of Washington, Seattle, WA

Email: yizhongw [at] cs.washington.edu

Short Bio

I am a fifth-year PhD student at the Paul G. Allen School of Computer Science & Engineering, University of Washington. I am very fortunate to be co-advised by Hannaneh Hajishirzi and Noah Smith. I am also a part-time research intern at the Allen Institute for Artificial Intelligence (AI2). I have previously interned at Meta AI, Microsoft Research, and Baidu NLP. Prior to UW, I obtained my Master's degree from Peking University and my Bachelor's degree from Shanghai Jiao Tong University.

My primary research interests lie in natural language processing and machine learning. I am excited about the generality of large language models. In particular, my recent research focuses on building general-purpose instruction-following models and on algorithms for adapting them to diverse scenarios using various forms of supervision. This line of work includes Super-NaturalInstructions, Self-Instruct, and Tülu.

My name is written as 王义中 in Chinese characters.

  • Sep. 22, 2023
  • 📢 We are organizing a Workshop on Instruction Tuning and Instruction Following at NeurIPS 2023. Please consider submitting your paper or joining us at the conference!
  • Sep. 22, 2023
  • Tülu was accepted to the NeurIPS 2023 Datasets and Benchmarks Track. See you in New Orleans!
  • June 9, 2023
  • We posted a paper on arXiv that systematically studies instruction tuning resources and released Tülu, a suite of full-parameter instruction-tuned models ranging from 7B to 65B! [Tweets]
  • May 2, 2023
  • We had three papers accepted to ACL 2023. Looking forward to meeting people in Toronto!
  • Apr. 18, 2023
  • I gave a talk about instruction tuning of large language models at JHU. [Slides][Video]
  • Jan. 23, 2023
  • I started a part-time research internship at AI2.
  • Dec. 20, 2022
  • We posted Self-Instruct on arXiv, a new way to align language models with little human annotation. [Tweets]
  • Apr. 16, 2022
  • We released Natural Instructions V2, which covers 1,600+ NLP tasks together with their instructions!

    Selected Publications [Google Scholar]

    How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources (Spotlight)

    Yizhong Wang*, Hamish Ivison*, Pradeep Dasigi, Jack Hessel, Tushar Khot, Khyathi Raghavi Chandu, David Wadden, Kelsey MacMillan, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi

    NeurIPS 2023 (Datasets and Benchmarks Track)
    Self-Instruct: Aligning Language Models with Self-Generated Instructions

    Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A Smith, Daniel Khashabi, Hannaneh Hajishirzi

    ACL 2023
    Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

    Yizhong Wang*, Swaroop Mishra*, Pegah Alipoormolabashi, Yeganeh Kordi et al.

    EMNLP 2022
    Probing Across Time: What Does RoBERTa Know and When?

    Leo Z. Liu*, Yizhong Wang*, Jungo Kasai, Hannaneh Hajishirzi, Noah A. Smith

    EMNLP 2021 Findings
    Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics

    Swabha Swayamdipta, Roy Schwartz, Nicholas Lourie, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith and Yejin Choi

    EMNLP 2020
    Do Neural NLP Models Know Numbers? Probing Numeracy in Embeddings

    Eric Wallace*, Yizhong Wang*, Sujian Li, Sameer Singh and Matt Gardner

    EMNLP-IJCNLP 2019
    DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs

    Dheeru Dua, Yizhong Wang, Pradeep Dasigi, Gabriel Stanovsky, Sameer Singh and Matt Gardner

    NAACL 2019
    Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification

    Yizhong Wang, Kai Liu, Jing Liu, Wei He, Yajuan Lyu, Hua Wu, Sujian Li and Haifeng Wang

    ACL 2018
    DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications

    Wei He, Kai Liu, Yajuan Lyu, Shiqi Zhao, Xinyan Xiao, Yuan Liu, Yizhong Wang, Hua Wu, Qiaoqiao She, Xuan Liu, Tian Wu, Haifeng Wang

    ACL 2018 Workshop on Machine Reading for Question Answering
    A Two-Stage Parsing Method for Text-level Discourse Analysis (Outstanding Paper Award)

    Yizhong Wang, Sujian Li and Houfeng Wang

    ACL 2017

    * indicates equal contribution.