Detect poses with ML Kit on iOS

ML Kit provides two optimized SDKs for pose detection.

SDK name	PoseDetection	PoseDetectionAccurate
Implementation	Assets for the base detector are statically linked to your app at build time.	Assets for the accurate detector are statically linked to your app at build time.
App size impact	Up to 29.6 MB	Up to 33.2 MB
Performance	iPhone X: ~45 FPS	iPhone X: ~29 FPS

Try it out

Play around with the sample app to see an example usage of this API.

Before you begin

Include the following ML Kit pods in your Podfile:

# If you want to use the base implementation:
pod 'GoogleMLKit/PoseDetection', '8.0.0'

# If you want to use the accurate implementation:
pod 'GoogleMLKit/PoseDetectionAccurate', '8.0.0'

After you install or update your project's pods, open your Xcode project using its xcworkspace. ML Kit is supported in Xcode version 13.2.1 or higher.

1. Create an instance of `PoseDetector`

To detect a pose in an image, first create an instance of PoseDetector and, optionally, specify the detector settings.

`PoseDetector` options

Detection mode

The PoseDetector operates in two detection modes. Be sure to choose the one that matches your use case.

stream (default): The pose detector first detects the most prominent person in the image and then runs pose detection. In subsequent frames, the person-detection step is skipped unless the person becomes obscured or is no longer detected with high confidence. The pose detector attempts to track the most prominent person and returns their pose in each inference. This reduces latency and smooths detection. Use this mode when you want to detect poses in a video stream.
singleImage: The pose detector detects a person and then runs pose detection. The person-detection step runs for every image, so latency is higher and the person is not tracked. Use this mode when you run pose detection on static images or when tracking is not desired.

Specify the pose detector options:

Swift

// Base pose detector with streaming, when depending on the PoseDetection SDK
let options = PoseDetectorOptions()
options.detectorMode = .stream

// Accurate pose detector on static images, when depending on the
// PoseDetectionAccurate SDK
let options = AccuratePoseDetectorOptions()
options.detectorMode = .singleImage

Objective-C

// Base pose detector with streaming, when depending on the PoseDetection SDK
MLKPoseDetectorOptions *options = [[MLKPoseDetectorOptions alloc] init];
options.detectorMode = MLKPoseDetectorModeStream;

// Accurate pose detector on static images, when depending on the
// PoseDetectionAccurate SDK
MLKAccuratePoseDetectorOptions *options =
    [[MLKAccuratePoseDetectorOptions alloc] init];
options.detectorMode = MLKPoseDetectorModeSingleImage;

Finally, get an instance of PoseDetector. Pass the options you specified:

Swift

let poseDetector = PoseDetector.poseDetector(options: options)

Objective-C

MLKPoseDetector *poseDetector =
    [MLKPoseDetector poseDetectorWithOptions:options];

2. Prepare the input image

To detect poses, do the following for each image or frame of video. If you enabled stream mode, you must create VisionImage objects from CMSampleBuffers.

Create a VisionImage object using a UIImage or a CMSampleBuffer.

If you use a UIImage, follow these steps:

Create a VisionImage object with the UIImage. Make sure to specify the correct .orientation.

Swift

// `image` is the UIImage you want to analyze.
let visionImage = VisionImage(image: image)
visionImage.orientation = image.imageOrientation

Objective-C

MLKVisionImage *visionImage = [[MLKVisionImage alloc] initWithImage:image];
visionImage.orientation = image.imageOrientation;

If you use a CMSampleBuffer, follow these steps:

Specify the orientation of the image data contained in the CMSampleBuffer.

To get the image orientation:

Swift

func imageOrientation(
  deviceOrientation: UIDeviceOrientation,
  cameraPosition: AVCaptureDevice.Position
) -> UIImage.Orientation {
  switch deviceOrientation {
  case .portrait:
    return cameraPosition == .front ? .leftMirrored : .right
  case .landscapeLeft:
    return cameraPosition == .front ? .downMirrored : .up
  case .portraitUpsideDown:
    return cameraPosition == .front ? .rightMirrored : .left
  case .landscapeRight:
    return cameraPosition == .front ? .upMirrored : .down
  case .faceDown, .faceUp, .unknown:
    return .up
  }
}

Objective-C

- (UIImageOrientation)
  imageOrientationFromDeviceOrientation:(UIDeviceOrientation)deviceOrientation
                         cameraPosition:(AVCaptureDevicePosition)cameraPosition {
  switch (deviceOrientation) {
    case UIDeviceOrientationPortrait:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationLeftMirrored
                                                            : UIImageOrientationRight;

    case UIDeviceOrientationLandscapeLeft:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationDownMirrored
                                                            : UIImageOrientationUp;
    case UIDeviceOrientationPortraitUpsideDown:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationRightMirrored
                                                            : UIImageOrientationLeft;
    case UIDeviceOrientationLandscapeRight:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationUpMirrored
                                                            : UIImageOrientationDown;
    case UIDeviceOrientationUnknown:
    case UIDeviceOrientationFaceUp:
    case UIDeviceOrientationFaceDown:
      return UIImageOrientationUp;
  }
}

Create a VisionImage object using the CMSampleBuffer object and orientation:

Swift

let image = VisionImage(buffer: sampleBuffer)
image.orientation = imageOrientation(
  deviceOrientation: UIDevice.current.orientation,
  cameraPosition: cameraPosition)

Objective-C

 MLKVisionImage *image = [[MLKVisionImage alloc] initWithBuffer:sampleBuffer];
 image.orientation =
   [self imageOrientationFromDeviceOrientation:UIDevice.currentDevice.orientation
                                cameraPosition:cameraPosition];

3. Process the image

Pass the VisionImage to one of the pose detector's image processing methods. You can use either the asynchronous process(image:) method or the synchronous results(in:) method.

To detect poses synchronously:

Swift

// results(in:) throws on failure and returns a non-optional array on success.
let detectedPoses: [Pose]
do {
  detectedPoses = try poseDetector.results(in: image)
} catch let error {
  print("Failed to detect pose with error: \(error.localizedDescription).")
  return
}
guard !detectedPoses.isEmpty else {
  print("Pose detector returned no results.")
  return
}

// Success. Get pose landmarks here.

Objective-C

NSError *error;
NSArray *poses = [poseDetector resultsInImage:image error:&error];
if (error != nil) {
  // Error.
  return;
}
if (poses.count == 0) {
  // No pose detected.
  return;
}

// Success. Get pose landmarks here.

To detect poses asynchronously:

Swift

poseDetector.process(image) { detectedPoses, error in
  guard error == nil else {
    // Error.
    return
  }
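  // The poses array is optional in the completion handler, so unwrap it first.
  guard let detectedPoses = detectedPoses, !detectedPoses.isEmpty else {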
    // No pose detected.
    return
  }

  // Success. Get pose landmarks here.
}

Objective-C

[poseDetector processImage:image
                completion:^(NSArray * _Nullable poses,
                             NSError * _Nullable error) {
                    if (error != nil) {
                      // Error.
                      return;
                    }
                    if (poses.count == 0) {
                      // No pose detected.
                      return;
                    }

                    // Success. Get pose landmarks here.
                  }];

4. Get information about the detected pose

If a person is detected in the image, the pose detection API either passes an array of Pose objects to the completion handler or returns the array, depending on whether you called the asynchronous or synchronous method.

If the person is not completely inside the image, the model assigns the missing landmarks coordinates outside of the frame and gives them low inFrameLikelihood values.

If no person was detected, the array is empty.

Swift

for pose in detectedPoses {
  let leftAnkleLandmark = pose.landmark(ofType: .leftAnkle)
  if leftAnkleLandmark.inFrameLikelihood > 0.5 {
    let position = leftAnkleLandmark.position
  }
}

Objective-C

for (MLKPose *pose in poses) {
  MLKPoseLandmark *leftAnkleLandmark =
      [pose landmarkOfType:MLKPoseLandmarkTypeLeftAnkle];
  if (leftAnkleLandmark.inFrameLikelihood > 0.5) {
    MLKVision3DPoint *position = leftAnkleLandmark.position;
  }
}
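
The threshold check above can be generalized to a whole set of landmarks. The following is a minimal sketch, not ML Kit code: `SketchLandmark` and `landmarksInFrame` are hypothetical stand-ins that mirror only the `inFrameLikelihood` idea, so the snippet runs without the SDK. The 0.5 threshold is the same illustrative value used in the loop above.

```swift
// Hypothetical stand-in for a pose landmark; this is NOT an ML Kit type.
struct SketchLandmark {
  let name: String
  let x: Double
  let y: Double
  let inFrameLikelihood: Double
}

/// Keeps only the landmarks whose in-frame likelihood exceeds the threshold.
func landmarksInFrame(
  _ landmarks: [SketchLandmark],
  threshold: Double = 0.5
) -> [SketchLandmark] {
  return landmarks.filter { $0.inFrameLikelihood > threshold }
}

// Made-up sample values for illustration only.
let sampleLandmarks = [
  SketchLandmark(name: "leftAnkle", x: 120, y: 610, inFrameLikelihood: 0.93),
  SketchLandmark(name: "rightAnkle", x: 180, y: 805, inFrameLikelihood: 0.12),
  SketchLandmark(name: "nose", x: 150, y: 90, inFrameLikelihood: 0.99),
]

let visibleLandmarks = landmarksInFrame(sampleLandmarks)
print(visibleLandmarks.map { $0.name })  // prints ["leftAnkle", "nose"]
```

In an app, the same filter would be applied to the landmarks of each detected pose before drawing or measuring anything, so that points placed outside the frame are ignored.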

Tips to improve performance

The quality of your results depends on the quality of the input image:

For ML Kit to accurately detect a pose, the person in the image should be represented by sufficient pixel data. For best performance, the subject should be at least 256x256 pixels.
If you detect poses in a real-time application, you might also want to consider the overall dimensions of the input images. Smaller images can be processed faster, so capture images at lower resolutions to reduce latency. However, keep in mind the resolution requirement above, and make sure the subject occupies as much of the image as possible.
Poor image focus can also impact accuracy. If you don't get acceptable results, ask the user to recapture the image.
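
One rough way to apply the 256x256 guideline is to measure the area spanned by the detected landmarks. The sketch below is an assumption-laden illustration, not part of ML Kit: `SketchPoint`, `subjectExtent`, and `subjectIsLargeEnough` are hypothetical names, the landmark bounding box is used only as a proxy for the subject's size, and the points are assumed to be in input-image pixel coordinates.

```swift
// Hypothetical 2D point in input-image pixel coordinates; NOT an ML Kit type.
struct SketchPoint {
  let x: Double
  let y: Double
}

/// Returns the width and height of the box spanned by the points,
/// or nil when there are no points.
func subjectExtent(of points: [SketchPoint]) -> (width: Double, height: Double)? {
  guard let first = points.first else { return nil }
  var minX = first.x, maxX = first.x
  var minY = first.y, maxY = first.y
  for point in points.dropFirst() {
    minX = min(minX, point.x)
    maxX = max(maxX, point.x)
    minY = min(minY, point.y)
    maxY = max(maxY, point.y)
  }
  return (width: maxX - minX, height: maxY - minY)
}

/// True when the spanned box is at least `minimumSide` pixels on both sides.
func subjectIsLargeEnough(_ points: [SketchPoint], minimumSide: Double = 256) -> Bool {
  guard let extent = subjectExtent(of: points) else { return false }
  return extent.width >= minimumSide && extent.height >= minimumSide
}

// Made-up coordinates: the first subject spans 300x400, the second 100x400.
let largeSubject = [SketchPoint(x: 50, y: 40), SketchPoint(x: 350, y: 440)]
let smallSubject = [SketchPoint(x: 50, y: 40), SketchPoint(x: 150, y: 440)]
print(subjectIsLargeEnough(largeSubject))  // prints true
print(subjectIsLargeEnough(smallSubject))  // prints false
```

A check like this could be used to prompt the user to move closer to the camera when the subject is too small in the frame.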

If you want to use pose detection in a real-time application, follow these guidelines to achieve the best frame rates:

Use the base PoseDetection SDK and stream detection mode.
Consider capturing images at a lower resolution. However, also keep in mind this API's image dimension requirements.
To process video frames, use the detector's synchronous results(in:) API. Call this method from the AVCaptureVideoDataOutputSampleBufferDelegate's captureOutput(_, didOutput:from:) function to get results synchronously from the given video frame. Keep AVCaptureVideoDataOutput's alwaysDiscardsLateVideoFrames set to true to throttle calls to the detector. If a new video frame becomes available while the detector is running, it is dropped.
If you use the output of the detector to overlay graphics on the input image, first get the result from ML Kit, then render the image and the overlay in a single step. By doing so, you render to the display surface only once for each processed input frame. For an example, see the previewOverlayView and MLKDetectionOverlayView classes in the showcase sample app.
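
The frame-throttling guideline can be sketched with plain AVFoundation. This is a minimal illustration under stated assumptions: `PoseFrameReceiver`, `makeThrottledVideoOutput`, and the queue label are hypothetical names, and the delegate body is left as a comment where the detector call would go. The `#if canImport(AVFoundation)` guard only keeps the snippet compilable on platforms without AVFoundation.

```swift
#if canImport(AVFoundation)
import AVFoundation

// Hypothetical delegate that would receive camera frames.
final class PoseFrameReceiver: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
  func captureOutput(
    _ output: AVCaptureOutput,
    didOutput sampleBuffer: CMSampleBuffer,
    from connection: AVCaptureConnection
  ) {
    // Build a VisionImage from `sampleBuffer` and call the detector's
    // synchronous results(in:) method here, as described in steps 2 and 3.
  }
}

/// Creates a video output that drops frames arriving while the delegate
/// is still busy with an earlier frame.
func makeThrottledVideoOutput(
  delegate: AVCaptureVideoDataOutputSampleBufferDelegate
) -> AVCaptureVideoDataOutput {
  let output = AVCaptureVideoDataOutput()
  // Late frames are discarded instead of queued, which throttles detector calls.
  output.alwaysDiscardsLateVideoFrames = true
  // Hypothetical queue label; frames are delivered serially on this queue.
  let queue = DispatchQueue(label: "com.example.pose.videoQueue")
  output.setSampleBufferDelegate(delegate, queue: queue)
  return output
}

let frameReceiver = PoseFrameReceiver()
let videoOutput = makeThrottledVideoOutput(delegate: frameReceiver)
#endif
```

The returned output would then be added to an AVCaptureSession alongside the camera input. Because the detector call is synchronous, the delegate queue stays blocked while a frame is processed, and that is what lets the output discard the frames that arrive in the meantime.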

Next steps

To learn how to use pose landmarks to classify poses, see Pose Classification Tips.
See the ML Kit quickstart sample on GitHub for an example of this API in use.
