# Understand the user's environment on iOS

Learn how to use the [Scene Semantics API](/ar/develop/scene-semantics) in your own apps.

The Scene Semantics API enables developers to understand the scene surrounding the user by providing ML model-based, real-time semantic information. Given an image of an outdoor scene, the API returns a label for each pixel across a set of useful semantic classes, such as sky, building, tree, road, sidewalk, vehicle, person, and more. In addition to pixel labels, the Scene Semantics API also offers a confidence value for each pixel label and an easy-to-use way to query the prevalence of a given label in an outdoor scene.

From left to right: examples of an input image, the semantic image of pixel labels, and the corresponding confidence image.

Prerequisites
-------------

Make sure that you understand [fundamental AR concepts](/ar/develop/fundamentals) and how to [configure an ARCore session](/ar/develop/ios/session-config) before proceeding.

Enable Scene Semantics
----------------------

In [a new ARCore session](/ar/develop/ios/session-config), check whether the user's device supports the Scene Semantics API. Not all ARCore-compatible devices support the Scene Semantics API, due to processing power constraints.

To save resources, Scene Semantics is disabled by default on ARCore. Enable semantic mode to have your app use the Scene Semantics API:

    GARSessionConfiguration *configuration = [[GARSessionConfiguration alloc] init];
    if ([self.garSession isSemanticModeSupported:GARSemanticModeEnabled]) {
      configuration.semanticMode = GARSemanticModeEnabled;
    }

    NSError *error;
    [self.garSession setConfiguration:configuration error:&error];
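`setConfiguration:error:` reports failures through its `NSError` out-parameter, so it is worth checking before relying on semantic data. A minimal sketch of that check; the log message is illustrative, not part of the API:

    // Apply the configuration and surface any failure (illustrative handling).
    NSError *error;
    [self.garSession setConfiguration:configuration error:&error];
    if (error) {
      NSLog(@"Failed to enable Scene Semantics: %@", error.localizedDescription);
    }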
Obtain the semantic image
-------------------------

Once Scene Semantics is enabled, the semantic image can be retrieved. The semantic image is a [`kCVPixelFormatType_OneComponent8`](https://developer.apple.com/documentation/corevideo/1563591-pixel_format_identifiers/kcvpixelformattype_onecomponent8?language=objc) image, where each pixel corresponds to a semantic label defined by [`GARSemanticLabel`](/ar/reference/ios/group/GARSemanticLabel).

Use [`GARFrame.semanticImage`](/ar/reference/ios/category/GARFrame(Semantics)#category_g_a_r_frame_07_semantics_08_1abb18fd408efa4a5225402772f0deef3b) to acquire the semantic image:

    CVPixelBufferRef semanticImage = garFrame.semanticImage;
    if (semanticImage) {
      // Use the semantic image here.
    } else {
      // Semantic images are not available.
      // The output image may be missing for the first couple of frames, before the
      // model has had a chance to run.
    }
Output semantic images should be available after about 1-3 frames from the start of the session, depending on the device.
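Because the semantic image is a one-byte-per-pixel `CVPixelBuffer`, individual labels can be read with the standard CoreVideo accessors. A minimal sketch, assuming the pixel coordinates are within the buffer's dimensions; the coordinates themselves are illustrative and bounds checks are omitted:

    // Read the label of a single pixel from the semantic image.
    CVPixelBufferRef semanticImage = garFrame.semanticImage;
    if (semanticImage) {
      CVPixelBufferLockBaseAddress(semanticImage, kCVPixelBufferLock_ReadOnly);
      const uint8_t *baseAddress =
          (const uint8_t *)CVPixelBufferGetBaseAddress(semanticImage);
      // One byte per pixel; rows are strided by bytesPerRow, not by width.
      size_t bytesPerRow = CVPixelBufferGetBytesPerRow(semanticImage);
      size_t x = 100;  // Illustrative pixel coordinates.
      size_t y = 50;
      GARSemanticLabel label = (GARSemanticLabel)baseAddress[y * bytesPerRow + x];
      CVPixelBufferUnlockBaseAddress(semanticImage, kCVPixelBufferLock_ReadOnly);
    }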
Obtain the confidence image
---------------------------
In addition to the semantic image, which provides a label for each pixel, the API also provides a confidence image of corresponding pixel confidence values. The confidence image is a [`kCVPixelFormatType_OneComponent8`](https://developer.apple.com/documentation/corevideo/1563591-pixel_format_identifiers/kcvpixelformattype_onecomponent8?language=objc) image, where each pixel holds a value in the range `[0, 255]` corresponding to the probability associated with that pixel's semantic label.

Use [`GARFrame.semanticConfidenceImage`](/ar/reference/ios/category/GARFrame(Semantics)#category_g_a_r_frame_07_semantics_08_1abeda29d610d8c5b9295559cc20658bff) to acquire the semantic confidence image:
    CVPixelBufferRef confidenceImage = garFrame.semanticConfidenceImage;
    if (confidenceImage) {
      // Use the confidence image here.
    } else {
      // Confidence images are not available.
      // The output image may be missing for the first couple of frames, before the
      // model has had a chance to run.
    }

Output confidence images should be available after about 1-3 frames from the start of the session, depending on the device.
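A raw confidence byte can be mapped to a probability in `[0.0, 1.0]` by dividing by 255. A sketch that reads one pixel's confidence, assuming the same one-byte-per-pixel layout as the semantic image; the coordinates and the 0.5 threshold are illustrative:

    // Convert a raw confidence byte into a probability in [0.0, 1.0].
    CVPixelBufferRef confidenceImage = garFrame.semanticConfidenceImage;
    if (confidenceImage) {
      CVPixelBufferLockBaseAddress(confidenceImage, kCVPixelBufferLock_ReadOnly);
      const uint8_t *baseAddress =
          (const uint8_t *)CVPixelBufferGetBaseAddress(confidenceImage);
      size_t bytesPerRow = CVPixelBufferGetBytesPerRow(confidenceImage);
      size_t x = 100;  // Illustrative pixel coordinates.
      size_t y = 50;
      float probability = baseAddress[y * bytesPerRow + x] / 255.0f;
      CVPixelBufferUnlockBaseAddress(confidenceImage, kCVPixelBufferLock_ReadOnly);
      if (probability < 0.5f) {
        // Illustrative policy: treat low-confidence labels as unreliable.
      }
    }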
Query the fraction of pixels for a semantic label
-------------------------------------------------

You can also query the fraction of pixels in the current frame that belong to a particular class, such as sky. This query is more efficient than retrieving the semantic image and performing a pixel-wise search for a specific label. The returned fraction is a float value in the range `[0.0, 1.0]`.

Use [`fractionForSemanticLabel:`](/ar/reference/ios/category/GARFrame(Semantics)#category_g_a_r_frame_07_semantics_08_1a20c44998075101f44c360592025b2ed4) to acquire the fraction for a given label:

    // Ensure that semantic data is present for the GARFrame.
    if (garFrame.semanticImage) {
      float fraction = [garFrame fractionForSemanticLabel:GARSemanticLabelSky];
    }
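One possible use of the fraction query is gating a rendering feature on how much of a class is visible in the frame. A sketch of that pattern; the 0.3 threshold and the sky-effect idea are arbitrary examples, not API behavior:

    // Enable a sky-dependent effect only when enough of the frame is sky.
    if (garFrame.semanticImage) {
      float skyFraction = [garFrame fractionForSemanticLabel:GARSemanticLabelSky];
      if (skyFraction > 0.3f) {  // Illustrative threshold.
        // Render the sky effect here.
      }
    }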
[null,null,["最后更新时间 (UTC):2025-07-26。"],[[["\u003cp\u003eThe Scene Semantics API lets you understand a user's surroundings by identifying objects like buildings, trees, and people in real-time using an ML model.\u003c/p\u003e\n"],["\u003cp\u003eTo use the API, enable semantic mode in your ARCore session after checking for device compatibility.\u003c/p\u003e\n"],["\u003cp\u003eYou can access the semantic image, which assigns a label to each pixel, and the confidence image, which provides the probability of each label's accuracy.\u003c/p\u003e\n"],["\u003cp\u003eThe API also allows you to efficiently query the percentage of pixels representing a specific object, like the sky, within a scene.\u003c/p\u003e\n"]]],["The Scene Semantics API provides real-time semantic information for outdoor scenes. To use it, first check if the device supports it and enable semantic mode in the ARCore session configuration. Then, retrieve the semantic image, a pixel-labeled image, using `GARFrame.semanticImage`. The API also offers a confidence image, `GARFrame.semanticConfidenceImage`, indicating pixel label probabilities. You can query the prevalence of a specific label, like \"sky,\" using `fractionForSemanticLabel:`, which returns a fraction representing its presence.\n"],null,["# Understand the user's environment on iOS\n\nLearn how to use the [Scene Semantics API](/ar/develop/scene-semantics) in your own apps.\n\nThe Scene Semantics API enables developers to understand the scene surrounding the user, by providing ML model-based, real-time semantic information. Given an image of an outdoor scene, the API returns a label for each pixel across a set of useful semantic classes, such a sky, building, tree, road, sidewalk, vehicle, person, and more. In addition to pixel labels, the Scene Semantics API also offers confidence values for each pixel label and an easy-to-use way to query the prevalence of a given label in an outdoor scene.\n\nYour browser does not support the video tag.\n\nFrom left to right, examples of an input image, the semantic image of pixel labels, and the corresponding confidence image:\n\nPrerequisites\n-------------\n\nMake sure that you understand [fundamental AR concepts](/ar/develop/fundamentals)\nand how to [configure an ARCore session](/ar/develop/ios/session-config) before proceeding.\n\nEnable Scene Semantics\n----------------------\n\nIn [a new ARCore session](/ar/develop/ios/session-config), check whether a user's device supports the Scene Semantics API. Not all ARCore-compatible devices support the Scene Semantics API due to processing power constraints.\n\nTo save resources, Scene Semantics is disabled by default on ARCore. Enable semantic mode to have your app use the Scene Semantics API. \n\n GARSessionConfiguration *configuration = [[GARSessionConfiguration alloc] init];\n if ([self.garSession isSemanticModeSupported:GARSemanticModeEnabled]) {\n configuration.semanticMode = GARSemanticModeEnabled;\n }\n\n NSError *error;\n [self.garSession setConfiguration:configuration error:&error];\n\nObtain the semantic image\n-------------------------\n\nOnce Scene Semantics is enabled, the semantic image can be retrieved. 
The semantic image is a [`kCVPixelFormatType_OneComponent8`](https://developer.apple.com/documentation/corevideo/1563591-pixel_format_identifiers/kcvpixelformattype_onecomponent8?language=objc) image, where each pixel corresponds to a semantic label defined by [`GARSemanticLabel`](/ar/reference/ios/group/GARSemanticLabel).\n\nUse [`GARFrame.semanticImage`](/ar/reference/ios/category/GARFrame(Semantics)#category_g_a_r_frame_07_semantics_08_1abb18fd408efa4a5225402772f0deef3b) to acquire the semantic image: \n\n CVPixelBuffer semanticImage = garFrame.semanticImage;\n if (semanticImage) {\n // Use the semantic image here\n } else {\n // Semantic images are not available.\n // The output image may be missing for the first couple frames before the model has had a\n // chance to run yet.\n }\n\nOutput semantic images should be available after about 1-3 frames from the start of the session, depending on the device.\n\nObtain the confidence image\n---------------------------\n\nIn addition to the semantic image, which provides a label for each pixel, the API also provides a confidence image of corresponding pixel confidence values. The confidence image is a [`kCVPixelFormatType_OneComponent8`](https://developer.apple.com/documentation/corevideo/1563591-pixel_format_identifiers/kcvpixelformattype_onecomponent8?language=objc) image, where each pixel corresponds to a value in the range `[0, 255]`, corresponding to the probability associated with the semantic label for each pixel.\n\nUse [`GARFrame.semanticConfidenceImage`](/ar/reference/ios/category/GARFrame(Semantics)#category_g_a_r_frame_07_semantics_08_1abeda29d610d8c5b9295559cc20658bff) to acquire the semantic confidence image: \n\n CVPixelBuffer confidenceImage = garFrame.semanticConfidenceImage;\n if (confidenceImage) {\n // Use the semantic image here\n } else {\n // Semantic images are not available.\n // The output image may be missing for the first couple frames before the model has had a\n // chance to run yet.\n }\n\nOutput confidence images should be available after about 1-3 frames from the start of the session, depending on the device.\n\nQuery the fraction of pixels for a semantic label\n-------------------------------------------------\n\nYou can also query the fraction of pixels in the current frame that belong to a particular class, such as sky. This query is more efficient than returning the semantic image and performing a pixel-wise search for a specific label. The returned fraction is a float value in the range `[0.0, 1.0]`.\n\nUse [`fractionForSemanticLabel:`](/ar/reference/ios/category/GARFrame(Semantics)#category_g_a_r_frame_07_semantics_08_1a20c44998075101f44c360592025b2ed4) to acquire the fraction for a given label: \n\n // Ensure that semantic data is present for the GARFrame.\n if (garFrame.semanticImage) {\n float fraction = [garFrame fractionForSemanticLabel:GARSemanticLabelSky];\n }"]]