Supervised Learning
Supervised learning's tasks are well-defined and can be applied to a multitude of scenarios, such as identifying spam or predicting precipitation.
Foundational supervised learning concepts
Supervised machine learning is based on the following core concepts: data, model, training, evaluating, and inference.
Data
Data is the driving force of ML. Data comes in the form of words and numbers stored in tables, or as the values of pixels and waveforms captured in images and audio files. We store related data in datasets. For example, we might have datasets of cat images, housing prices, or weather information.
Datasets are made up of individual examples that contain features and a label. You can think of an example as analogous to a single row in a spreadsheet. Features are the values that a supervised model uses to predict the label. The label is the "answer," or the value we want the model to predict. In a weather model that predicts rainfall, the features could be latitude, longitude, temperature, humidity, cloud coverage, wind direction, and atmospheric pressure. The label would be rainfall amount.
Examples that contain both features and a label are called labeled examples.
Two labeled examples
In contrast, unlabeled examples contain features but no label. After you create a model, the model predicts the label from the features.
Two unlabeled examples
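To make the distinction concrete, here is a minimal sketch (in Python, with made-up feature values) of how the weather examples above might be represented; real datasets are typically loaded from files rather than written out by hand.

```python
# A labeled example: feature values plus the label we want the model to predict.
labeled_example = {
    "latitude": 41.2,
    "longitude": -112.0,
    "temperature": 18.5,             # degrees Celsius
    "humidity": 0.62,                # relative humidity
    "cloud_coverage": 0.40,
    "wind_direction": 270,           # degrees
    "atmospheric_pressure": 1012.0,  # hPa
    "rainfall_amount": 0.75,         # the label, in inches
}

# An unlabeled example: the same features, but no rainfall label.
unlabeled_example = {
    "latitude": 41.2,
    "longitude": -112.0,
    "temperature": 21.0,
    "humidity": 0.55,
    "cloud_coverage": 0.10,
    "wind_direction": 180,
    "atmospheric_pressure": 1018.0,
}
```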
Dataset characteristics
A dataset is characterized by its size and diversity. Size indicates the number of examples. Diversity indicates the range those examples cover. Good datasets are both large and highly diverse.
Datasets can be large and diverse, large but not diverse, or small but highly diverse. In other words, a large dataset doesn't guarantee sufficient diversity, and a highly diverse dataset doesn't guarantee sufficient examples.
For instance, a dataset might contain 100 years' worth of data, but only for the month of July. Using this dataset to predict rainfall in January would produce poor predictions. Conversely, a dataset might cover only a few years but contain every month. This dataset might also produce poor predictions, because it doesn't contain enough years to account for variability.
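As a rough sketch of checking diversity, the snippet below counts how many examples a hypothetical dataset has per month; the `examples` list and its fields are illustrative assumptions.

```python
from collections import Counter

# Hypothetical dataset: many July examples, very few January examples.
examples = [{"month": 7, "rainfall_amount": 0.2}] * 1200 + \
           [{"month": 1, "rainfall_amount": 1.1}] * 3

month_counts = Counter(example["month"] for example in examples)
print(month_counts)                                  # Counter({7: 1200, 1: 3})
print(f"Months covered: {len(month_counts)} of 12")  # large, but not diverse
```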
Check Your Understanding
What attributes of a dataset would be ideal to use for ML?
Large size / High diversity
A large number of examples that cover a variety of use cases is essential for a machine learning system to learn the underlying patterns in the data. A model trained on this kind of dataset is more likely to make good predictions on new data.
Large size / Low diversity
Machine learning models are only as good as the examples used to train them. A model will produce poorer predictions on novel data that it was never trained on.
Small size / High diversity
Most models can't find reliable patterns in a small dataset. The predictions will lack the confidence a larger dataset provides.
Small size / Low diversity
If your dataset is small and without much variation, you may not benefit from machine learning.
A dataset can also be characterized by the number of its features. For example, some weather datasets might contain hundreds of features, ranging from satellite imagery to cloud coverage values. Other datasets might contain only three or four features, like humidity, atmospheric pressure, and temperature. Datasets with more features can help a model discover additional patterns and make better predictions. However, datasets with more features don't always produce models that make better predictions, because some features might have no causal relationship to the label.
Model
In supervised learning, a model is the complex collection of numbers that define the mathematical relationship from specific input feature patterns to specific output label values. The model discovers these patterns through training.
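For intuition only, the sketch below shows one of the simplest possible such collections of numbers: a hypothetical linear model with one weight per feature plus a bias. The weights and bias here are made up, not learned from real data.

```python
# A hypothetical linear model: one weight per feature, plus a bias term.
weights = {"temperature": -0.02, "humidity": 2.1, "atmospheric_pressure": -0.001}
bias = 1.3

def predict(features: dict) -> float:
    """Map input feature values to a predicted rainfall amount (in inches)."""
    return bias + sum(weights[name] * value for name, value in features.items())

print(predict({"temperature": 18.5, "humidity": 0.62, "atmospheric_pressure": 1012.0}))
```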
Training
Before a supervised model can make predictions, it must be trained. To train a model, we give the model a dataset with labeled examples. The model's goal is to work out the best solution for predicting the labels from the features. The model finds the best solution by comparing its predicted value to the label's actual value. Based on the difference between the predicted and actual values, defined as the loss, the model gradually updates its solution. In other words, the model learns the mathematical relationship between the features and the label so that it can make the best predictions on unseen data.
For example, if the model predicted 1.15 inches of rain but the actual value was 0.75 inches, the model modifies its solution so that its prediction is closer to 0.75 inches. After the model has looked at each example in the dataset (in some cases, multiple times), it arrives at a solution that makes the best predictions, on average, for each of the examples.
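As a concrete sketch, a squared-error loss for that prediction would be computed as below; the choice of squared error is an assumption for illustration, since the text only says the loss measures the difference between the predicted and actual values.

```python
predicted = 1.15  # inches of rain the model predicted
actual = 0.75     # inches of rain actually observed

loss = (predicted - actual) ** 2  # squared error: (0.4)**2, roughly 0.16
print(loss)
```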
The following demonstrates how a model is trained:
The model takes in a single labeled example and provides a prediction.
Figure 1. An ML model making a prediction from a labeled example.
The model compares its predicted value with the actual value and updates its solution.
Figure 2. An ML model updating its predicted value.
The model repeats this process for each labeled example in the dataset.
Figure 3. An ML model updating its predictions for each labeled example in the training dataset.
In this way, the model gradually learns the correct relationship between the features and the label. This gradual understanding is also why large and diverse datasets produce a better model: the model has seen more data across a wider range of values and has refined its understanding of the relationship between the features and the label.
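Putting those steps together, the following is a minimal training-loop sketch for a one-feature linear model using gradient descent on a squared-error loss. The tiny dataset, the learning rate, and the choice of humidity as the single feature are all illustrative assumptions, not the course's actual implementation.

```python
# Tiny hypothetical dataset of labeled examples: (humidity, rainfall_amount in inches).
dataset = [(0.30, 0.10), (0.55, 0.60), (0.62, 0.75), (0.90, 1.40)]

weight, bias = 0.0, 0.0  # the "model": a small collection of numbers
learning_rate = 0.1

for epoch in range(200):                      # look at the dataset multiple times
    for humidity, actual in dataset:          # take in a single labeled example...
        predicted = weight * humidity + bias  # ...and provide a prediction
        error = predicted - actual            # compare predicted and actual values
        # Update the solution to reduce the squared-error loss.
        weight -= learning_rate * 2 * error * humidity
        bias -= learning_rate * 2 * error

print(f"Learned weight={weight:.2f}, bias={bias:.2f}")
```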
During training, ML practitioners can make subtle adjustments to the configurations and features the model uses to make predictions. For example, certain features have more predictive power than others, so ML practitioners can select which features the model uses during training. Suppose a weather dataset contains time_of_day as a feature. In this case, an ML practitioner can add or remove time_of_day during training to see whether the model makes better predictions with or without it.
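A sketch of that kind of comparison is shown below. It uses scikit-learn purely for brevity (an assumption; the course itself doesn't prescribe a library), with made-up data in which time_of_day carries little signal. In practice the comparison would be made on a held-out evaluation set rather than on the training data.

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Hypothetical training data: [humidity, time_of_day (hour)] -> rainfall (inches).
X_with = [[0.30, 6], [0.55, 12], [0.62, 18], [0.90, 0], [0.45, 9], [0.70, 15]]
y = [0.10, 0.60, 0.75, 1.40, 0.40, 0.95]
X_without = [[humidity] for humidity, _ in X_with]  # same examples, feature removed

for name, X in [("with time_of_day", X_with), ("without time_of_day", X_without)]:
    model = LinearRegression().fit(X, y)
    mae = mean_absolute_error(y, model.predict(X))
    print(f"{name}: training MAE = {mae:.3f}")
```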
Evaluating
We evaluate a trained model to determine how well it learned. When we evaluate a model, we use a labeled dataset, but we give the model only the dataset's features. We then compare the model's predictions to the label's actual values.
Figure 4. Evaluating an ML model by comparing its predictions to the actual values.
Depending on the model's predictions, we might do more training and evaluating before deploying the model in a real-world application.
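Continuing the illustrative one-feature model, evaluation might look like the sketch below; the held-out examples, the trained parameters, and the use of mean absolute error as the metric are all assumptions.

```python
# Held-out labeled examples the model never trained on: (humidity, actual rainfall).
evaluation_set = [(0.40, 0.30), (0.80, 1.20)]

def predict(humidity: float) -> float:
    """Hypothetical trained model: weight and bias learned during training."""
    return 2.1 * humidity - 0.5

# Give the model only the features, then compare its predictions to the actual labels.
errors = [abs(predict(humidity) - actual) for humidity, actual in evaluation_set]
print(f"Mean absolute error: {sum(errors) / len(errors):.2f} inches")
```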
Check Your Understanding
Why does a model need to be trained before it can make predictions?
A model needs to be trained to learn the mathematical relationship between the features and the label in a dataset.
A model needs to be trained so that it won't require data to make a prediction.
Inference
Once we're satisfied with the results of evaluating the model, we can use the model to make predictions, called inferences, on unlabeled examples. In the weather app example, we would give the model the current weather conditions, like temperature, atmospheric pressure, and relative humidity, and it would predict the amount of rainfall.
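For example, inference with a hypothetical trained model might look like the sketch below; the function, its parameters, and the numbers are illustrative assumptions only.

```python
def predict_rainfall(temperature: float, pressure: float, humidity: float) -> float:
    """Hypothetical trained model; the weights and bias are illustrative only."""
    return 1.3 - 0.02 * temperature - 0.001 * pressure + 2.1 * humidity

# Current (unlabeled) conditions: features only; the rainfall label is what we predict.
print(f"Predicted rainfall: {predict_rainfall(18.5, 1012.0, 0.62):.2f} inches")
```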