# Assembling an ML team
ML projects require teams whose members have a range of skills, expertise, and responsibilities related to machine learning. These are the most common roles found on typical ML teams:
| Role | Knowledge and skills | Main deliverable |
|------|----------------------|------------------|
| ML product manager | ML product managers have a deep understanding of ML strengths and weaknesses and of the ML development process. They align business problems to ML solutions by working directly with the ML team, end users, and other stakeholders. They create the product vision, define use cases and requirements, and plan and prioritize projects. | Product requirements document (PRD). |
| Engineering manager | Engineering managers achieve business goals by setting, communicating, and achieving team priorities. Like ML product managers, they align ML solutions to business problems. They set clear expectations for team members, conduct performance evaluations, and assist with career and professional development. | Design docs, project plans, and performance evaluations. |
| Data scientist | Data scientists use quantitative and statistical analysis to extract insights and value from data. They help identify and test features, prototype models, and improve model interpretability. | Reports and data visualizations that answer business questions through statistical analysis. |
| ML engineer | ML engineers design, build, productionize, and manage ML models. They are strong software engineers with a deep understanding of ML technologies and best practices. | A deployed model with sufficient prediction quality to meet business goals. |
| Data engineer | Data engineers build data pipelines for storing, aggregating, and processing large amounts of data. They develop the infrastructure and systems for collecting raw data and transforming it into formats useful for model training and serving. Data engineers are responsible for the data across the entire ML development process. | Fully productionized data pipelines with the necessary monitoring and alerting. |
| Developer operations (DevOps) engineer | DevOps engineers develop, deploy, scale, and monitor the serving infrastructure for ML models. | An automated process for serving, monitoring, testing, and alerting on a model's behavior. |
Successful ML projects have teams with each role well represented. In smaller teams, individuals will need to handle the responsibilities of multiple roles.
## Establish team practices
Because roles, tools, and frameworks vary widely in ML development, it's critical to establish common practices through excellent process documentation. For example, one engineer might think that simply getting the right data is sufficient to begin training a model, while a more responsible engineer will validate that the dataset is correctly anonymized and will document its metadata and provenance. Making sure engineers share common definitions for processes and design patterns reduces confusion and increases the team's velocity.
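The anonymization-and-provenance check described above can be sketched as a lightweight pre-training validation step. This is a minimal illustration, not a prescribed implementation: the forbidden column names and the metadata fields are hypothetical, and a real team would substitute its own schema and anonymization rules.

```python
# Minimal sketch of a pre-training data check, assuming rows are plain dicts.
# FORBIDDEN_COLUMNS is a hypothetical denylist; real teams would plug in
# their own schema and anonymization policy.
FORBIDDEN_COLUMNS = {"user_email", "user_name", "ip_address"}

def validate_training_data(rows, source):
    """Verify the dataset looks anonymized, then record metadata and provenance."""
    columns = set().union(*(row.keys() for row in rows))
    leaked = FORBIDDEN_COLUMNS & columns
    if leaked:
        raise ValueError(f"Non-anonymized columns present: {sorted(leaked)}")
    # Record the metadata and provenance the text recommends documenting.
    return {
        "source": source,          # provenance: where the data came from
        "num_rows": len(rows),     # basic metadata
        "columns": sorted(columns),
    }

rows = [{"user_id_hash": "a1", "clicks": 3}, {"user_id_hash": "b2", "clicks": 0}]
meta = validate_training_data(rows, source="logs/2025-07-01")
```

A check like this is cheap to run in a pipeline, and the returned metadata dict can be logged alongside the trained model so the dataset's origin is always traceable.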
### Process documentation
Process docs should define the tools, infrastructure, and processes the team will use for ML development. Good process docs help align new and current team members. They should answer the following types of questions:
- How is the data generated for the model?
- How do we examine, validate, and visualize the data?
- How do we modify an input feature or label in the training data?
- How do we customize the data generation, training, and evaluation pipeline?
- How do I change the model architecture to accommodate changes in input features or labels?
- How do we obtain testing examples?
- What metrics will we use to judge model quality?
- How do we launch our models in production?
- How will we know if something is wrong with our model?
- What upstream systems do our models depend on?
- How do I make my SQL maintainable and reusable?
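One common answer to the last question is to keep queries as named, version-controlled templates instead of string-concatenating SQL inline throughout the codebase. The sketch below assumes hypothetical query, table, and partition names; it only illustrates the pattern, not any particular team's convention.

```python
from string import Template

# Hypothetical registry of named, version-controlled SQL templates.
# Centralizing queries here makes them reviewable and reusable.
QUERIES = {
    "daily_labels": Template(
        "SELECT user_id, label FROM $table WHERE ds = '$ds'"
    ),
}

def render_query(name: str, **params: str) -> str:
    """Render a named SQL template with explicit, named parameters."""
    return QUERIES[name].substitute(**params)

sql = render_query("daily_labels", table="training.labels", ds="2025-07-01")
```

In production code, values that come from users should go through the database driver's parameterized-query mechanism rather than string substitution; the template approach here is for structural pieces such as table names and partition keys.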
#### More potential questions

**Model**

- Can I train models on different datasets in the same pipeline, like for fine-tuning?
- How do I add a new test dataset to my pipeline?

**Training**

- How do I check the model's prediction on a hand-crafted example?
- How do I find, examine, and visualize examples where the model made mistakes?
- How do I determine which feature was most responsible for a given prediction?
- How do I understand which features have the most impact on predictions within a given sample?
- How do I compute or plot model predictions on a chosen dataset or sample?
- How do I compute standard metrics for my model's predictions on a chosen dataset?
- How do I develop and compute custom metrics?
- How do I compare my model with other models offline?
- Can I perform meta-analysis for multiple model evaluations in a single development environment?
- Can I compare the current model with the one from 10 months ago?

**Productionization, monitoring, and maintenance**

- I think I created a good model. How can I launch it in production?
- How do I verify that my new model is running correctly in production?
- Can I get the history of model evaluations over time?
- How will I know when something is wrong with the model?
- I got assigned a page or bug mentioning something about the model. What should I do?

**Pipelines**

- How could I customize the data generation, training, or evaluation pipeline?
- When and how should I create a completely new pipeline?

**SQL**

- I need SQL to generate some data. Where should I put it?

**Infrastructure**

- How does our model serving work? Is there a diagram?
- What upstream systems does my model depend on that I should be aware of?

**Communication**

- I can't figure something out. Who (and how) should I contact?
### Keep in mind

What constitutes "ML best practices" can differ between companies, teams, and individuals. For example, some team members might consider experimental Colabs the main deliverable, while others want to work in R. Some might have a passion for software engineering, another thinks monitoring is the most important thing, and yet another knows good feature productionization practices but wants to use Scala. Everyone is "right" from their own perspective; if steered correctly, the mix will be a powerhouse. If not, it can be a mess.

Establishing the tools, processes, and infrastructure the team will use before writing a line of code can be the difference between the project failing after two years and successfully launching a quarter ahead of schedule.
## Performance evaluations

Because of the ambiguity and uncertainty inherent in ML, people managers need to set clear expectations and define deliverables early.

When determining expectations and deliverables, consider how they'll be evaluated if a project or approach isn't successful. In other words, it's important that a team member's performance isn't directly tied to the success of the project. For example, it's not uncommon for team members to spend weeks investigating solutions that are ultimately unsuccessful. Even in these cases, their high-quality code, thorough documentation, and effective collaboration should contribute positively toward their evaluation.
### Check Your Understanding

What is the primary reason for having excellent process documentation and establishing common practices?

**Increase project velocity.**
Correct. Good process documentation and common practices reduce confusion and streamline the development process.

**Establish best practices across a company.**
Because ML development varies from project to project, teams typically establish their own sets of best practices to work effectively and increase their velocity.

**Ensure all engineers on the team have the same level of expertise.**
ML teams typically have engineers with a variety of skills and knowledge. Process documentation helps engineers align on best practices to increase their velocity.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-07-27 UTC.