ICML 2022 Workshop on

Pre-training: Perspectives, Pitfalls, and Paths Forward

Saturday, July 23, 2022

HALL F, Baltimore Convention Center, Baltimore, MD.


The past five years have seen rapid progress in large-scale pre-trained models across a variety of domains, such as computer vision, natural language processing, robotics, and bioinformatics. Leveraging a huge number of parameters, large-scale pre-trained models are capable of encoding rich knowledge from labeled and/or unlabeled examples. Supervised and self-supervised pre-training have been the two most representative paradigms, through which pre-trained models have demonstrated large benefits on a wide spectrum of downstream tasks. For example, convolutional neural networks pre-trained on a large-scale labeled image dataset (e.g., ImageNet) and later fine-tuned on specific vision tasks with a relatively small training set are highly successful. By resorting to carefully designed self-supervised tasks, self-supervised pre-trained models (e.g., MoCo and BERT) enjoy impressive generalization and applicability. There are also other pre-training paradigms, e.g., meta-learning for few-shot learning, where pre-trained models are trained so that they quickly adapt to solve new tasks.

However, many challenges and new opportunities remain for pre-training. In this workshop, we propose the following two foci, informed by recent advances in pre-training.

  • Which pre-training methods transfer across different applications/domains, which ones don’t, and why? This question is highly relevant to the reliability of different pre-trained models. We are particularly interested in bringing together researchers from various domains who work on different kinds of pre-trained models to discuss the pitfalls of pre-trained models and to present both empirical and theoretical explanations.
  • In what settings should we expect pre-training to be effective, compared to learning from scratch? In many domains, such as medical image analysis, the benefits of popular pre-trained models have been shown to diminish because the data are fairly fine-grained. To this end, we aim to invite researchers from different areas to discuss which aspects of pre-training (e.g., labeled/unlabeled data size, architecture) account for such performance differences and how to maximize the power of pre-training in various fields. Both theoretical and empirical contributions are expected.

Call for Papers

We welcome submissions on pre-trained models, few-shot learning, transfer learning, self-supervised learning, meta-learning, and related areas. We also invite submissions from researchers in other application areas such as physics, chemistry, and biology. To summarize, the topics include but are not limited to:

  • Theoretical foundations of connections and differences between different pre-training methods (supervised pre-training, self-supervised pre-training with different auxiliary tasks, meta-learning, etc.)
  • Empirical analysis of various pre-training methods
  • Generalization bounds of different pre-training methods
  • Novel pre-training methods to maximize generalization
  • Model selection among a zoo of pre-trained models
  • New fine-tuning techniques for maximally leveraging a pre-trained model
  • Pre-training for various application domains, such as computer vision, natural language processing, robotics, physics, drug discovery, and environmental sustainability

Submission URL:   https://openreview.net/group?id=ICML.cc/2022/Workshop/Pre-Training

Format:  All submissions must be in PDF format and anonymized. Submissions are limited to four content pages, including all figures and tables; unlimited additional pages containing references and supplementary materials are allowed. Reviewers may choose to read the supplementary materials but will not be required to. Camera-ready versions may go up to five content pages.

Style file:   You must format your submission using the ICML 2022 LaTeX style file. For your convenience, we have modified the main conference style file to refer to the Pre-training workshop: pre-training.sty. Please include the references and supplementary materials in the same PDF as the main paper. The maximum file size for submissions is 50MB. Submissions that violate the ICML style (e.g., by decreasing margins or font sizes) or page limits may be rejected without further review.

Dual-submission policy:  We welcome ongoing and unpublished work. We will also accept papers that are under review at the time of submission, or that have been recently accepted without published proceedings.

Non-archival:  The workshop is a non-archival venue and will not have official proceedings. Workshop submissions can be subsequently or concurrently submitted to other venues.

Visibility:  Submissions and reviews will not be public. Only accepted papers will be made public.

Contact:  For any questions, please contact us at pretraining2022@googlegroups.com.

If you would like to become a reviewer for this workshop, please let us know at https://t.co/gaHT0VA2Sl.

Important Dates


    Submission deadline: May 25th, 2022, AOE (extended from May 22nd, 2022, AOE)

    Notification to authors: June 13th, 2022, AOE

    Video recording deadline (contributed talk only): July 1st, 2022

    Final workshop program, camera-ready deadline: July 8th, 2022

Accepted papers

Please kindly find the list of accepted papers here.


This is the tentative schedule of the workshop. All times are given in Eastern Time (ET).

Morning Session

[8:50 - 9:00] Introduction and opening remarks
[9:00 - 9:30] Invited talk 1: Nathan C. Frey
[9:30 - 10:00] Invited talk 2: Oriol Vinyals
[10:00 - 10:15] Contributed talk 1: Multimodal Masked Autoencoders Learn Transferable Representations
[10:15 - 10:45] Invited talk 3: Maithra Raghu
[10:45 - 11:15] Invited talk 4: Charles Sutton
[11:15 - 12:15] Panel Discussion

Afternoon Session

[13:30 - 14:00] Invited talk 5: Hanie Sedghi
[14:00 - 14:15] Contributed talk 2: Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Prior
[14:15 - 14:45] Invited talk 6: Xinlei Chen
[14:45 - 15:00] Contributed talk 3: Plex: Towards Reliability using Pretrained Large Model Extensions
[15:00 - 16:30] Poster Session
[16:30 - 17:00] Invited talk 7: Mohit Bansal
[17:00 - 17:30] Invited talk 8: Sara Beery

Invited Speakers

Hanie Sedghi

Google Brain


Jason D. Lee

Princeton University

Zhangyang (Atlas) Wang

University of Texas at Austin

James Zou

Stanford University

Workshop Organizers

Huaxiu Yao

Stanford University

Hugo Larochelle

Google Brain

Percy Liang

Stanford University

Colin Raffel

UNC Chapel Hill

Jian Tang

Mila-Quebec AI Institute & HEC Montreal

Ying Wei

City University of Hong Kong

Eric P. Xing


Chelsea Finn

Stanford University