Pre-training: Perspectives, Pitfalls, and Paths Forward

Overview

The past five years have seen rapid progress in large-scale pre-trained models across a variety of domains, such as computer vision, natural language processing, robotics, bioinformatics, etc. Leveraging a huge number of parameters, large-scale pre-trained models are capable of encoding rich knowledge from labeled and/or unlabeled examples. Supervised and self-supervised pre-training have been the two most representative paradigms, through which pre-trained models have demonstrated large benefits on a wide spectrum of downstream tasks. For example, convolutional neural networks pre-trained on a large-scale labeled image dataset (e.g., ImageNet) and later fine-tuned on specific vision tasks with a relative small training set are highly successful. By resorting to carefully designed self-supervised tasks, self-supervised pre-trained models (e.g., MoCo and BERT) enjoy impressive generalization and applicability. There are also other pre-training paradigms, e.g., meta-learning for few-shot learning, where pre-trained models are trained so that they quickly adapt to solve new tasks.

However, there are still many remaining challenges and new opportunities ahead for pre-training, In this workshop, we propose to have the following two foci, informed by recent advancement in pre-training.

Which pre-training methods transfer across different applications/domains, which ones don’t, and why? This question is highly relevant to the reliability of different pre-trained models. We are particularly interested in bringing together various domains of researchers that work on different kinds of pre-trained models to discuss the pitfalls of pre-trained models as well as present both empirical and theoretical explanations.
In what settings should we expect pre-training to be effective, compared to learning from scratch? In many domains, such as medical image analysis, the benefits of popular pre-trained models have been shown to diminish because the data are fairly fine-grained. To this end, we aim to invite researchers from different areas to discuss which aspects of pre-training (e.g., labeled/unlabeled data size, architecture) account for such performance difference and how to maximize the power of pre-training in various fields. Both theoretical and empirical contributions are expected.

Call for Papers

We welcome submissions from areas of pre-trained models, few-shot learning, transfer learning, self-supervised learning, meta-learning, etc. We also invite submissions from researchers in other application areas such as physics, chemistry, biology. To summarize, the topics include but are not limited to:

Theoretical foundations of connections and differences between different pre-training methods (supervised pre-training, self-supervised pre-training with different auxiliary tasks, meta-learning, etc.)
Empirical analysis of various pre-training methods
Generalization bounds of different pre-training methods
Novel pre-training methods to maximize generalization
Model selection among a zoo of pre-trained models
New fine-tuning techniques for maximally leveraging a pre-trained model
Pre-training for various application domains, such as computer vision, natural language processing, robotics, physics, drug discovery, and environmental sustainability

Submission URL: https://openreview.net/group?id=ICML.cc/2022/Workshop/Pre-Training

Format: All submissions must be in PDF format and anonymized. Submissions are limited to four content pages, including all figures and tables; unlimited additional pages containing references and supplementary materials are allowed. Reviewers may choose to read the supplementary materials but will not be required to. Camera-ready versions may go up to five content pages.

Style file: You must format your submission using the ICML 2022 LaTeX style file. For your convenience, we have modified the main conference style file to refer to the Pre-training workshop: pre-training.sty. Please include the references and supplementary materials in the same PDF as the main paper. The maximum file size for submissions is 50MB. Submissions that violate the ICML style (e.g., by decreasing margins or font sizes) or page limits may be rejected without further review.

Dual-submission policy: We welcome ongoing and unpublished work. We will also accept papers that are under review at the time of submission, or that have been recently accepted without published proceedings.

Non-archival: The workshop is a non-archival venue and will not have official proceedings. Workshop submissions can be subsequently or concurrently submitted to other venues.

Visibility: Submissions and reviews will not be public. Only accepted papers will be made public.

Contact: For any questions, please contact us at pretraining2022@googlegroups.com.

If you would like to become a reviewer for this workshop, please let us know at https://t.co/gaHT0VA2Sl.

Schedule

This is the tentative schedule of the workshop. All slots are provided in Eastern Time (ET).

Morning Session

[8:50 - 9:00]	Introduction and opening remarks
[9:00 - 9:30]	Invited talk 1: Nathan C. Frey
[9:30 - 10:00]	Invited talk 2: Oriol Vinyals
[10:00 - 10:15]	Contributed talk 1: Multimodal Masked Autoencoders Learn Transferable Representations
[10:15 - 10:45]	Invited talk 3: Maithra Raghu
[10:45 - 11:15]	Invited talk 4: Charles Sutton
[11:15 - 12:15]	Panel Discussion

Afternoon Session

[13:30 - 14:00]	Invited talk 5: Hanie Sedghi
[14:00 - 14:15]	Contributed talk 2: Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Prior
[14:15 - 14:45]	Invited talk 6: Xinlei Chen
[14:45 - 15:00]	Contributed talk 3: Plex: Towards Reliability using Pretrained Large Model Extensions
[15:00 - 16:30]	Poster Session
[16:30 - 17:00]	Invited talk 7: Mohit Bansal
[17:00 - 17:30]	Invited talk 8: Sara Beery

Overview

Call for Papers

Important Dates

Accepted papers

Schedule

Morning Session

Afternoon Session

Invited Speakers

Oriol Vinyals

Hanie Sedghi

Xinlei Chen

Mohit Bansal

Nathan C. Frey

Sara Beery

Maithra Raghu

Charles Sutton

Panelists

Jason D. Lee

‪Zhangyang (Atlas) Wang

James Zou

Oriol Vinyals

Maithra Raghu

Xinlei Chen

Workshop Organizers

Huaxiu Yao

Hugo Larochelle

Percy Liang

Colin Raffel

Jian Tang

Ying Wei

Saining Xie

Eric P. Xing

Chelsea Finn