Skip to main content
Cornell University
We gratefully acknowledge support from
the Simons Foundation and member institutions.
arxiv logo > cs > arXiv:2112.04094

Help | Advanced Search

Computer Science > Machine Learning

(cs)
[Submitted on 8 Dec 2021]

Title:The Effect of Model Size on Worst-Group Generalization

Authors:Alan Pham, Eunice Chan, Vikranth Srivatsa, Dhruba Ghosh, Yaoqing Yang, Yaodong Yu, Ruiqi Zhong, Joseph E. Gonzalez, Jacob Steinhardt
Download PDF
Abstract: Overparameterization is shown to result in poor test accuracy on rare subgroups under a variety of settings where subgroup information is known. To gain a more complete picture, we consider the case where subgroup information is unknown. We investigate the effect of model size on worst-group generalization under empirical risk minimization (ERM) across a wide range of settings, varying: 1) architectures (ResNet, VGG, or BERT), 2) domains (vision or natural language processing), 3) model size (width or depth), and 4) initialization (with pre-trained or random weights). Our systematic evaluation reveals that increasing model size does not hurt, and may help, worst-group test performance under ERM across all setups. In particular, increasing pre-trained model size consistently improves performance on Waterbirds and MultiNLI. We advise practitioners to use larger pre-trained models when subgroup labels are unknown.
Comments: The first four authors contributed equally to the work
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2112.04094 [cs.LG]
  (or arXiv:2112.04094v1 [cs.LG] for this version)
  https://doi.org/10.48550/arXiv.2112.04094
arXiv-issued DOI via DataCite

Submission history

From: Alan Pham [view email]
[v1] Wed, 8 Dec 2021 03:45:47 UTC (524 KB)
Full-text links:

Download:

  • PDF
  • Other formats
(license)
Current browse context:
cs.LG
< prev   |   next >
new | recent | 2112
Change to browse by:
cs

References & Citations

  • NASA ADS
  • Google Scholar
  • Semantic Scholar

DBLP - CS Bibliography

listing | bibtex
Yaoqing Yang
Yaodong Yu
Ruiqi Zhong
Joseph E. Gonzalez
Jacob Steinhardt
a export bibtex citation Loading...

Bookmark

BibSonomy logo Mendeley logo Reddit logo ScienceWISE logo

Bibliographic and Citation Tools

Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)
  • About
  • Help
  • contact arXivClick here to contact arXiv Contact
  • subscribe to arXiv mailingsClick here to subscribe Subscribe
  • Copyright
  • Privacy Policy
  • Web Accessibility Assistance
  • arXiv Operational Status
    Get status notifications via email or slack