Last updated (UTC): 2023-08-26.

# Deep Learning Tuning Playbook

This document helps you train deep learning models more effectively.
Although this document emphasizes hyperparameter tuning, it also touches on
other aspects of deep learning training, such as training pipeline
implementation and optimization.

This document assumes your machine learning task is either a
[supervised learning](/machine-learning/glossary#supervised-machine-learning)
problem or a similar problem (for example,
[self-supervised learning](/machine-learning/glossary#self-supervised-learning)).
That said, some of the advice in this document may also apply to other types
of machine learning problems.

| **Note:** This document is based on an earlier version, which is stored on
[GitHub](https://github.com/google-research/tuning_playbook). The names and
affiliations of the authors are available in the GitHub version.

Target audience
---------------

We've aimed this document at engineers and researchers with at least a basic
knowledge of machine learning and
[deep learning](/machine-learning/glossary#deep-model).
If you don't have that background, consider taking
[Machine Learning Crash Course](/machine-learning/crash-course).

Why did we write this document?
-------------------------------

Currently, there is an astonishing amount of toil and guesswork involved in
getting deep neural networks to work well in practice. Even worse, the actual
recipes people use to get good results with deep learning are rarely
documented. Papers gloss over the process that led to their final results in
order to present a cleaner story, and machine learning engineers working on
commercial problems rarely have time to step back and generalize their
process. Textbooks tend to eschew practical guidance and prioritize
fundamental principles, even when their authors have the applied experience
necessary to provide useful advice.

When preparing to create this document, we couldn't find any comprehensive
attempt to explain *how to get good results with deep learning*. Instead, we
found snippets of advice in blog posts and on social media, tricks peeking out
of the appendixes of research papers, occasional case studies about one
particular project or pipeline, and a lot of confusion. There is a vast gulf
between the results achieved by deep learning experts and less skilled
practitioners using superficially similar methods. At the same time, the
experts readily admit that some of what they do might not be well-justified.
As deep learning matures and has a larger impact on the world, the community
needs more resources covering useful recipes, including all the practical
details that can be so critical for obtaining good results.

We are a team of five researchers and engineers who have worked in deep
learning for many years, some of us since as early as 2006. We have applied
deep learning to everything from speech recognition to astronomy. This
document grew out of our own experience training neural networks, teaching new
machine learning engineers, and advising our colleagues on the practice of
deep learning.

It has been gratifying to see deep learning go from a machine learning
approach practiced by a handful of academic labs to a technology powering
products used by billions of people. However, deep learning is still in its
infancy as an engineering discipline, and we hope this document encourages
others to help systematize the field's experimental protocols.

This document came about as we tried to crystallize our own approach to deep
learning. Thus, it represents our opinions at the time of writing, not any
sort of objective truth. Our own struggles with hyperparameter tuning made it
a particular focus of our guidance, but we also cover other important issues
we have encountered in our work (or seen go wrong). We intend this to be a
living document that grows and evolves as our beliefs change. For example, the
material on debugging and mitigating training failures would not have been
possible for us to write two years ago because it is based on recent results
and ongoing investigations.

Inevitably, some of our advice will need to be updated to account for new
results and improved workflows. We don't know the *optimal* deep learning
recipe, but until the community starts writing down and debating different
procedures, we cannot hope to find it. To that end, we encourage readers who
find issues with our advice to produce alternative recommendations, along with
convincing evidence, so we can update the playbook. We would also love to see
alternative guides and playbooks with different recommendations so we can work
towards best practices as a community.

About that robot emoji
----------------------

The robot emoji (🤖) indicates areas where we would like to do more research.
Only after trying to write this playbook did it become completely clear how
many interesting and neglected research questions can be found in the deep
learning practitioner's workflow.