Text classification is a fundamental machine learning problem with applications across many products. In this guide, we have broken the text classification workflow down into several steps. For each step, we have suggested an approach tailored to the characteristics of your specific dataset. In particular, we use the ratio of the number of samples to the number of words per sample to suggest a model type that gets you close to the best performance quickly. The other steps are built around this choice. We hope that the guide, the accompanying code, and the flowchart will help you learn, understand, and arrive at a swift first-cut solution to your text classification problem.
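As a reminder of how the ratio-based choice works, here is a minimal sketch in Python. It assumes that "words per sample" is measured as the median whitespace-separated word count, and that a single cutoff separates the two model families. The function names, the `threshold` default of 1500, and the returned labels are illustrative assumptions, not values stated on this page, so adjust them to match the flowchart you are following.

```python
from statistics import median


def samples_to_words_ratio(texts):
    """Return (number of samples) / (median number of words per sample).

    Assumption: words are counted by splitting on whitespace.
    """
    if not texts:
        raise ValueError("texts must contain at least one sample")
    words_per_sample = median(len(text.split()) for text in texts)
    if words_per_sample == 0:
        raise ValueError("samples contain no words")
    return len(texts) / words_per_sample


def suggest_model_type(texts, threshold=1500):
    """Suggest a model family from the samples-to-words ratio.

    Assumption: below the threshold an n-gram (bag-of-words) model is
    suggested; at or above it, a sequence model. The default threshold
    is a placeholder to tune for your own data.
    """
    ratio = samples_to_words_ratio(texts)
    return "n-gram" if ratio < threshold else "sequence"


if __name__ == "__main__":
    # Hypothetical toy dataset: 10 samples of 4 words each.
    toy_texts = ["good movie great cast"] * 10
    print(samples_to_words_ratio(toy_texts))
    print(suggest_model_type(toy_texts))
```

The intuition behind this kind of rule is that a small dataset with long samples gives a sequence model too little signal to learn word order, so a simpler n-gram model tends to be the safer first choice.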
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies.
Last updated 2023-10-30 UTC.