Object detection and tracking
Key capabilities
- Fast object detection and tracking
Detect objects and get their locations in the image. Track objects across successive image frames (a configuration sketch follows this list).
- Optimized on-device model
The object detection and tracking model is optimized for mobile devices and intended for use in real-time applications, even on lower-end devices.
- Prominent object detection
Automatically determine the most prominent object in an image.
- Coarse classification
Classify objects into broad categories, which you can use to filter out objects you are not interested in. The following categories are supported: home goods, fashion goods, food, plants, and places.
- Classification with a custom model
Use your own custom image classification model to identify or filter specific object categories. You can make your custom model perform better by leaving out the background of the image.
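To make these capabilities concrete, the following is a minimal Kotlin sketch based on the ML Kit object detection library for Android (com.google.mlkit:object-detection). The class and option names reflect that library's public API as commonly documented; verify them against the Android guide for your ML Kit version before relying on them.

```kotlin
import com.google.mlkit.vision.objects.ObjectDetection
import com.google.mlkit.vision.objects.ObjectDetector
import com.google.mlkit.vision.objects.defaults.ObjectDetectorOptions

// Build a detector for a live camera feed: STREAM_MODE keeps tracking IDs
// stable across successive frames, and enableClassification() turns on the
// built-in coarse classifier (home goods, fashion goods, food, plants, places).
fun buildStreamingDetector(): ObjectDetector {
    val options = ObjectDetectorOptions.Builder()
        .setDetectorMode(ObjectDetectorOptions.STREAM_MODE)
        .enableClassification()
        .build()
    return ObjectDetection.getClient(options)
}
```

STREAM_MODE is intended for live camera feeds where the same object should keep its tracking ID across frames; SINGLE_IMAGE_MODE, shown further down, is intended for static images.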
Example results
Tracking the most prominent object across images
The example below shows the tracking data from three successive frames with the default coarse classifier provided by ML Kit. A Kotlin sketch that reads these same fields from the detector follows the example.
| | Frame 1 | Frame 2 | Frame 3 |
|---|---|---|---|
| Tracking ID | 0 | 0 | 0 |
| Bounds | (95, 45), (496, 45), (496, 240), (95, 240) | (84, 46), (478, 46), (478, 247), (84, 247) | (53, 45), (519, 45), (519, 240), (53, 240) |
| Category | PLACE | PLACE | PLACE |
| Classification confidence | 0.9296875 | 0.8710938 | 0.8828125 |

Photo: Christian Ferrer [CC BY-SA 4.0]
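The fields in the table above (tracking ID, bounds, category, classification confidence) correspond to the values the API returns for each detected object. Below is an illustrative Kotlin sketch under the same assumptions as the earlier snippet; processFrame and the log tag are hypothetical names used only for this example.

```kotlin
import android.graphics.Bitmap
import android.util.Log
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.objects.ObjectDetector

// Hypothetical helper: feed one frame to a STREAM_MODE detector and log the
// same fields shown in the table above (tracking ID, bounds, category, confidence).
fun processFrame(detector: ObjectDetector, frame: Bitmap, rotationDegrees: Int) {
    val image = InputImage.fromBitmap(frame, rotationDegrees)
    detector.process(image)
        .addOnSuccessListener { detectedObjects ->
            for (obj in detectedObjects) {
                val id = obj.trackingId        // stays the same across successive frames
                val bounds = obj.boundingBox   // axis-aligned android.graphics.Rect
                for (label in obj.labels) {
                    Log.d("MLKitDemo", "id=$id bounds=$bounds " +
                        "category=${label.text} confidence=${label.confidence}")
                }
            }
        }
        .addOnFailureListener { e -> Log.e("MLKitDemo", "Detection failed", e) }
}
```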
Multiple objects in a static image
The example below shows the data for the four objects detected in the image with the default coarse classifier provided by ML Kit. A single-image detection sketch follows the table.

| | Bounds | Category | Classification confidence |
|---|---|---|---|
| Object 0 | (1, 97), (332, 97), (332, 332), (1, 332) | FASHION_GOOD | 0.95703125 |
| Object 1 | (186, 80), (337, 80), (337, 226), (186, 226) | FASHION_GOOD | 0.84375 |
| Object 2 | (296, 80), (472, 80), (472, 388), (296, 388) | FASHION_GOOD | 0.94921875 |
| Object 3 | (439, 83), (615, 83), (615, 306), (439, 306) | FASHION_GOOD | 0.9375 |
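A static-image scenario like the one above uses single-image mode with multiple-object detection enabled. The following is an illustrative Kotlin sketch under the same assumptions as the earlier snippets; detectObjectsInPhoto and its callback are hypothetical names.

```kotlin
import android.graphics.Bitmap
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.objects.DetectedObject
import com.google.mlkit.vision.objects.ObjectDetection
import com.google.mlkit.vision.objects.defaults.ObjectDetectorOptions

// Detect several objects in one static image, as in the table above:
// SINGLE_IMAGE_MODE disables tracking, and enableMultipleObjects() reports
// multiple detected objects instead of only the most prominent one.
fun detectObjectsInPhoto(photo: Bitmap, onResult: (List<DetectedObject>) -> Unit) {
    val options = ObjectDetectorOptions.Builder()
        .setDetectorMode(ObjectDetectorOptions.SINGLE_IMAGE_MODE)
        .enableMultipleObjects()
        .enableClassification()
        .build()
    ObjectDetection.getClient(options)
        .process(InputImage.fromBitmap(photo, 0))
        .addOnSuccessListener { objects -> onResult(objects) }
}
```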
Using a custom TensorFlow Lite model
The default coarse classifier is built for five categories and provides only limited information about the detected objects. You might need a more specialized classifier model that covers a narrower domain of concepts in more detail; for example, a model that distinguishes between species of flowers or types of food.
This API supports custom image classification models from a wide range of sources, letting you tailor it to your specific use case. To learn more, see Custom models with ML Kit. Custom models can be bundled with your app or downloaded dynamically from the cloud using Firebase Machine Learning's model deployment service. An illustrative configuration sketch follows the platform links below.
iOS
Android
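As a rough illustration of how a bundled custom model is plugged in on Android, the sketch below builds a detector from a local TensorFlow Lite file. It assumes the custom object detection library (com.google.mlkit:object-detection-custom); the asset name flower_classifier.tflite and the threshold values are placeholders, not recommendations.

```kotlin
import com.google.mlkit.common.model.LocalModel
import com.google.mlkit.vision.objects.ObjectDetection
import com.google.mlkit.vision.objects.custom.CustomObjectDetectorOptions

// Point the detector at a TensorFlow Lite classifier bundled in the app's assets.
// "flower_classifier.tflite" is a placeholder asset name for this example.
val localModel = LocalModel.Builder()
    .setAssetFilePath("flower_classifier.tflite")
    .build()

// Replace the default coarse classifier with the custom model.
val customOptions = CustomObjectDetectorOptions.Builder(localModel)
    .setDetectorMode(CustomObjectDetectorOptions.SINGLE_IMAGE_MODE)
    .enableClassification()
    .setClassificationConfidenceThreshold(0.5f)  // placeholder threshold
    .setMaxPerObjectLabelCount(3)                // placeholder label count
    .build()

val customObjectDetector = ObjectDetection.getClient(customOptions)
```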
Input image preprocessing
If needed, object detection and tracking uses bilinear image scaling and stretching to adjust the input image's size and aspect ratio so that they fit the requirements of the underlying model.
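This resizing happens inside the library, so there is nothing you need to implement. Purely as a conceptual illustration, the operation is comparable to a filtered bitmap scale on Android; the 320x320 target below is a placeholder, not the actual input size of the model.

```kotlin
import android.graphics.Bitmap

// Conceptual illustration only: the API performs this kind of resizing internally.
// A filtered scale approximates bilinear resampling; 320x320 is a placeholder,
// not the actual input size of the underlying model.
fun resizeForModel(frame: Bitmap, width: Int = 320, height: Int = 320): Bitmap =
    Bitmap.createScaledBitmap(frame, width, height, /* filter = */ true)
```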