The Effect of Model Size on Worst-Group Generalization

Pham, Alan; Chan, Eunice; Srivatsa, Vikranth; Ghosh, Dhruba; Yang, Yaoqing; Yu, Yaodong; Zhong, Ruiqi; Gonzalez, Joseph E.; Steinhardt, Jacob

doi:10.48550/arXiv.2112.04094

arXiv:2112.04094 (cs)

[Submitted on 8 Dec 2021]

Abstract: Overparameterization is shown to result in poor test accuracy on rare subgroups under a variety of settings where subgroup information is known. To gain a more complete picture, we consider the case where subgroup information is unknown. We investigate the effect of model size on worst-group generalization under empirical risk minimization (ERM) across a wide range of settings, varying: 1) architectures (ResNet, VGG, or BERT), 2) domains (vision or natural language processing), 3) model size (width or depth), and 4) initialization (with pre-trained or random weights). Our systematic evaluation reveals that increasing model size does not hurt, and may help, worst-group test performance under ERM across all setups. In particular, increasing pre-trained model size consistently improves performance on Waterbirds and MultiNLI. We advise practitioners to use larger pre-trained models when subgroup labels are unknown.
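The quantity the abstract tracks, worst-group test performance, can be illustrated with a minimal sketch. It assumes the common definition of worst-group accuracy as the minimum per-subgroup accuracy on a labeled evaluation set; the abstract itself does not spell out the metric, and the function name, the toy predictions, and the group names "a" and "b" below are hypothetical and not taken from the paper. Group labels are used here only for evaluation, not for training.

```python
from collections import defaultdict


def worst_group_accuracy(preds, labels, groups):
    """Return (per-group accuracy dict, worst group id, worst-group accuracy).

    Assumption: worst-group accuracy is the minimum accuracy over subgroups.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for p, y, g in zip(preds, labels, groups):
        total[g] += 1
        correct[g] += int(p == y)
    # Accuracy of each subgroup, computed independently of group size.
    per_group = {g: correct[g] / total[g] for g in total}
    # The worst group is the one with the lowest accuracy.
    worst = min(per_group, key=per_group.get)
    return per_group, worst, per_group[worst]


# Hypothetical toy data: group "a" is classified perfectly, group "b" is not.
preds = [1, 1, 0, 0, 1, 0]
labels = [1, 1, 0, 1, 1, 1]
groups = ["a", "a", "a", "b", "b", "b"]

per_group, worst, worst_acc = worst_group_accuracy(preds, labels, groups)
average_acc = sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)
print(per_group, worst, worst_acc, average_acc)
```

In this toy case the average accuracy is noticeably higher than the accuracy on group "b", which is the kind of gap that average-only reporting hides.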

Comments:	The first four authors contributed equally to the work
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2112.04094 [cs.LG]
	(or arXiv:2112.04094v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2112.04094

Submission history

From: Alan Pham
[v1] Wed, 8 Dec 2021 03:45:47 UTC (524 KB)
