Test Your Understanding
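The following questions help you solidify your understanding of core machine learning (ML) concepts.
Predictive power
Supervised ML models are trained using datasets with labeled examples. The model learns how to predict the label from the features. However, not every feature in a dataset has predictive power. In some instances, only a few features act as predictors of the label. In the dataset below, use price as the label and the remaining columns as the features.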
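Which three features do you think are likely the greatest predictors of a car's price?
Make_model, year, miles.
A car's make/model, year, and miles are likely to be among the strongest predictors of its price.
Color, height, make_model.
A car's height and color are not strong predictors of its price.
Miles, gearbox, make_model.
The gearbox isn't a main predictor of price.
Tire_size, wheel_base, year.
Tire size and wheel base aren't strong predictors of a car's price.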
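One rough way to check a numeric feature's predictive power is to measure how strongly it correlates with the label. The sketch below is only an illustration: the rows are made-up toy values rather than the dataset from this page, and `pearson` is a hypothetical helper written for this example. Correlation captures linear relationships only and does not apply directly to categorical features such as make_model.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up toy rows (assumption, not real car data); price is the label.
miles = [10_000, 20_000, 30_000, 40_000]
height = [150, 160, 160, 150]
price = [40_000, 30_000, 20_000, 10_000]

miles_corr = pearson(miles, price)    # strongly negative in this toy data
height_corr = pearson(height, price)  # no linear relationship in this toy data
```

A correlation near -1 or 1 suggests a feature carries information about the label, while a value near 0 suggests it carries little linear signal.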
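Supervised and unsupervised learning
Depending on the problem, you'll use either a supervised or an unsupervised approach. For example, if you know beforehand the value or category you want to predict, use supervised learning. However, if you want to learn whether your dataset contains any segmentations or groupings of related examples, use unsupervised learning.
Suppose you had a dataset of users for an online shopping website, and it contained the following columns:
If you wanted to understand the types of users that visit the site, would you use supervised or unsupervised learning?
Unsupervised learning.
Because we want the model to cluster groups of related customers, we'd use unsupervised learning. After the model clustered the users, we'd create our own names for each cluster, for example, "discount seekers," "deal hunters," "surfers," "loyal," and "wanderers."
Supervised learning, because I'm trying to predict which class a user belongs to.
In supervised learning, the dataset must contain the label you're trying to predict. In this dataset, there is no label that refers to a category of user.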
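To make the clustering idea concrete, here is a minimal k-means sketch in plain Python. It is one possible clustering technique, not necessarily what a production system would use. The two features (visits per month, average order value), the toy rows, and the choice of starting centroids are all assumptions made for illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        centroids = new_centroids
    return assignments, centroids

# Hypothetical features per user: (visits per month, average order value).
users = [(1, 20), (2, 25), (1, 22), (20, 200), (22, 210), (21, 190)]
assignments, centroids = kmeans(users, centroids=[users[0], users[3]])
# assignments -> [0, 0, 0, 1, 1, 1]
```

The model only returns cluster indices. Naming the clusters (for example, "surfers" versus "loyal") remains a human step, as described above.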
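Suppose you had an energy usage dataset for homes with the following columns:
What type of ML would you use to predict the kilowatt hours used per year for a newly constructed house?
Supervised learning.
Supervised learning trains on labeled examples. In this dataset, "kilowatt hours used per year" would be the label because this is the value you want the model to predict. The features would be "square footage," "location," and "year built."
Unsupervised learning.
Unsupervised learning uses unlabeled examples. In this example, "kilowatt hours used per year" would be the label because this is the value you want the model to predict, so the problem calls for supervised learning.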
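As a small illustration of learning from labeled examples, the sketch below fits a straight line by ordinary least squares. It assumes a single feature (square footage) and made-up toy values, and `fit_line` is a hypothetical helper. A real model would likely also use location and year built.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up labeled examples (assumption): feature = square footage, label = kWh per year.
square_footage = [1000, 1500, 2000, 2500]
kwh_per_year = [6000, 8000, 10000, 12000]

slope, intercept = fit_line(square_footage, kwh_per_year)

# Predict the label for a new house that was not in the training data.
predicted_kwh = slope * 1800 + intercept  # 9200.0 for this toy data
```

Because the output is a number rather than a category, this is also an example of regression.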
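Suppose you had a flight dataset with the following columns:
If you wanted to predict the cost of an airplane ticket, would you use regression or classification?
Regression
A regression model's output is a numeric value.
Classification
A classification model's output is a discrete value, normally a word. In this case, the cost of an airplane ticket is a numeric value.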
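Based on the dataset, could you train a classification model to classify the cost of an airplane ticket as "high," "average," or "low"?
Yes, but we'd first need to convert the numeric values in the airplane_ticket_cost column to categorical values.
It's possible to create a classification model from the dataset. You would do something like the following:
- Find the average cost of a ticket from the departure airport to the destination airport.
- Determine the thresholds that would constitute "high," "average," and "low."
- Compare the predicted cost to the thresholds and output the category the value falls within.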
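The three steps above can be sketched as follows. This is only one way to do it: the airport codes and costs are made-up values, the function names are hypothetical, and the 20% band around the route average is an arbitrary assumption for where "low" and "high" begin.

```python
from collections import defaultdict

def average_cost_by_route(tickets):
    """Step 1: average ticket cost for each (departure, destination) pair."""
    totals = defaultdict(lambda: [0.0, 0])  # route -> [sum of costs, count]
    for departure, destination, cost in tickets:
        entry = totals[(departure, destination)]
        entry[0] += cost
        entry[1] += 1
    return {route: total / count for route, (total, count) in totals.items()}

def categorize(cost, average, band=0.2):
    """Steps 2 and 3: derive thresholds from the average, then pick a category."""
    if cost < average * (1 - band):
        return "low"
    if cost > average * (1 + band):
        return "high"
    return "average"

# Made-up rows (assumption): (departure airport, destination airport, airplane_ticket_cost).
tickets = [
    ("SFO", "JFK", 300.0),
    ("SFO", "JFK", 400.0),
    ("SFO", "JFK", 500.0),
    ("SFO", "JFK", 200.0),
]

averages = average_cost_by_route(tickets)
labels = [categorize(cost, averages[(dep, dest)]) for dep, dest, cost in tickets]
# labels -> ["average", "average", "high", "low"]
```

The resulting categorical labels could then replace the numeric column as the target of a classification model.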
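No. It's not possible to create a classification model. The airplane_ticket_cost values are numeric, not categorical.
With a little bit of work, you could create a classification model.
No. Classification models only predict two categories, like spam or not_spam. This model would need to predict three categories.
Classification models can predict multiple categories. They're called multiclass classification models.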
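Training and evaluating
After we've trained a model, we evaluate it by using a dataset with labeled examples and comparing the model's predicted value to the label's actual value.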
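As a minimal sketch of this comparison, the snippet below computes the mean absolute error between actual labels and predictions. Mean absolute error is just one possible metric, chosen here for simplicity, and the numbers are made-up values rather than output from a real model.

```python
def mean_absolute_error(labels, predictions):
    """Average absolute difference between actual labels and predicted values."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    return sum(abs(y - p) for y, p in zip(labels, predictions)) / len(labels)

# Made-up evaluation data (assumption): actual label values and a model's predictions.
actual = [100, 200, 300, 400]
predicted = [110, 190, 320, 400]

error = mean_absolute_error(actual, predicted)  # 10.0 for this toy data
```

A smaller error means the predictions are closer to the actual values. What counts as "far off" depends on the problem and the units of the label.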
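Select the two best answers for the question.
If the model's predictions are far off, what might you do to make them better?
Retrain the model, but use only the features you believe have the strongest predictive power for the label.
Retraining the model with fewer features that have more predictive power can produce a model that makes better predictions.
You can't fix a model whose predictions are far off.
It's possible to fix a model whose predictions are off. Most models require multiple rounds of training until they make useful predictions.
Retrain the model using a larger and more diverse dataset.
Models trained on datasets with more examples and a wider range of values can produce better predictions because the model has a better generalized solution for the relationship between the features and the label.
Try a different training approach. For example, if you used a supervised approach, try an unsupervised approach.
A different training approach would not produce better predictions.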
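You're now ready to take the next step in your ML journey:
- People + AI Guidebook. If you're looking for a set of methods, best practices, and examples for using ML, presented by Googlers, industry experts, and academic researchers.
- Problem Framing. If you're looking for a field-tested approach for creating ML models and avoiding common pitfalls along the way.
- Machine Learning Crash Course. If you're ready for an in-depth and hands-on approach to learning more about ML.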
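Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-07-27 UTC.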