You can use ML Kit to recognize well-known landmarks in an image.
Before you begin
- If you have not already added Firebase to your app, do so by following the steps in the getting started guide.
- Include the ML Kit libraries in your Podfile:
pod 'Firebase/MLVision', '6.25.0'
After you install or update your project's Pods, be sure to open your Xcode project using its .xcworkspace.
- In your app, import Firebase:
Swift
import Firebase
Objective-C
@import Firebase;
- If you have not already enabled Cloud-based APIs for your project, do so now:
  - Open the ML Kit APIs page of the Firebase console.
  - If you have not already upgraded your project to a Blaze pricing plan, click Upgrade to do so. (You will be prompted to upgrade only if your project isn't on the Blaze plan.) Only Blaze-level projects can use Cloud-based APIs.
  - If Cloud-based APIs aren't already enabled, click Enable Cloud-based APIs.
Configure the landmark detector
By default, the Cloud detector uses the stable version of the model and returns up to 10 results. If you want to change either of these settings, specify them with a VisionCloudDetectorOptions object, as in the following example:
Swift
let options = VisionCloudDetectorOptions()
options.modelType = .latest
options.maxResults = 20
Objective-C
FIRVisionCloudDetectorOptions *options =
    [[FIRVisionCloudDetectorOptions alloc] init];
options.modelType = FIRVisionCloudModelTypeLatest;
options.maxResults = 20;
In the next step, pass the VisionCloudDetectorOptions object when you create the Cloud detector object.
Run the landmark detector
To recognize landmarks in an image, pass the image as a UIImage or a CMSampleBufferRef to the VisionCloudLandmarkDetector's detect(in:) method:
- Get an instance of VisionCloudLandmarkDetector:
Swift
lazy var vision = Vision.vision()
let cloudDetector = vision.cloudLandmarkDetector(options: options)
// Or, to use the default settings:
// let cloudDetector = vision.cloudLandmarkDetector()
Objective-C
FIRVision *vision = [FIRVision vision];
FIRVisionCloudLandmarkDetector *landmarkDetector = [vision cloudLandmarkDetector];
// Or, to change the default settings:
// FIRVisionCloudLandmarkDetector *landmarkDetector =
//     [vision cloudLandmarkDetectorWithOptions:options];
- Create a VisionImage object using a UIImage or a CMSampleBufferRef.
To use a UIImage:
  - If necessary, rotate the image so that its imageOrientation property is .up.
  - Create a VisionImage object using the correctly rotated UIImage. Do not specify any rotation metadata; the default value, .topLeft, must be used.
Swift
let image = VisionImage(image: uiImage)
Objective-C
FIRVisionImage *image = [[FIRVisionImage alloc] initWithImage:uiImage];
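The rotation step above can be handled by redrawing the image. The following is a minimal Swift sketch, not part of the ML Kit API: the helper name normalizedToUp is hypothetical, and it assumes that redrawing with UIKit's UIGraphicsImageRenderer (iOS 10 or later) is acceptable for your image sizes.

```swift
import UIKit

/// Hypothetical helper: returns a copy of `image` whose pixel data has been
/// redrawn so that `imageOrientation` is `.up`. Images that are already
/// upright are returned unchanged.
func normalizedToUp(_ image: UIImage) -> UIImage {
    guard image.imageOrientation != .up else { return image }

    // Keep the source scale so the pixel dimensions are preserved.
    let format = UIGraphicsImageRendererFormat.default()
    format.scale = image.scale

    // `image.size` already accounts for the orientation, and `draw(in:)`
    // applies the orientation while drawing, so the output is upright.
    let renderer = UIGraphicsImageRenderer(size: image.size, format: format)
    return renderer.image { _ in
        image.draw(in: CGRect(origin: .zero, size: image.size))
    }
}

// Possible usage with the VisionImage initializer shown above:
// let image = VisionImage(image: normalizedToUp(uiImage))
```

Redrawing allocates a new bitmap, so for very large photos you may prefer to downscale in the same pass.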
To use a CMSampleBufferRef:
  - Create a VisionImageMetadata object that specifies the orientation of the image data contained in the CMSampleBufferRef buffer.
To get the image orientation:
Swift
func imageOrientation(
    deviceOrientation: UIDeviceOrientation,
    cameraPosition: AVCaptureDevice.Position
) -> VisionDetectorImageOrientation {
    switch deviceOrientation {
    case .portrait:
        return cameraPosition == .front ? .leftTop : .rightTop
    case .landscapeLeft:
        return cameraPosition == .front ? .bottomLeft : .topLeft
    case .portraitUpsideDown:
        return cameraPosition == .front ? .rightBottom : .leftBottom
    case .landscapeRight:
        return cameraPosition == .front ? .topRight : .bottomRight
    case .faceDown, .faceUp, .unknown:
        return .leftTop
    }
}
Objective-C
- (FIRVisionDetectorImageOrientation)
    imageOrientationFromDeviceOrientation:(UIDeviceOrientation)deviceOrientation
                           cameraPosition:(AVCaptureDevicePosition)cameraPosition {
  switch (deviceOrientation) {
    case UIDeviceOrientationPortrait:
      if (cameraPosition == AVCaptureDevicePositionFront) {
        return FIRVisionDetectorImageOrientationLeftTop;
      } else {
        return FIRVisionDetectorImageOrientationRightTop;
      }
    case UIDeviceOrientationLandscapeLeft:
      if (cameraPosition == AVCaptureDevicePositionFront) {
        return FIRVisionDetectorImageOrientationBottomLeft;
      } else {
        return FIRVisionDetectorImageOrientationTopLeft;
      }
    case UIDeviceOrientationPortraitUpsideDown:
      if (cameraPosition == AVCaptureDevicePositionFront) {
        return FIRVisionDetectorImageOrientationRightBottom;
      } else {
        return FIRVisionDetectorImageOrientationLeftBottom;
      }
    case UIDeviceOrientationLandscapeRight:
      if (cameraPosition == AVCaptureDevicePositionFront) {
        return FIRVisionDetectorImageOrientationTopRight;
      } else {
        return FIRVisionDetectorImageOrientationBottomRight;
      }
    default:
      return FIRVisionDetectorImageOrientationTopLeft;
  }
}
Then, create the metadata object:
Swift
let cameraPosition = AVCaptureDevice.Position.back  // Set to the capture device you used.
let metadata = VisionImageMetadata()
metadata.orientation = imageOrientation(
    deviceOrientation: UIDevice.current.orientation,
    cameraPosition: cameraPosition
)
Objective-C
FIRVisionImageMetadata *metadata = [[FIRVisionImageMetadata alloc] init];
AVCaptureDevicePosition cameraPosition =
    AVCaptureDevicePositionBack;  // Set to the capture device you used.
metadata.orientation =
    [self imageOrientationFromDeviceOrientation:UIDevice.currentDevice.orientation
                                 cameraPosition:cameraPosition];
  - Create a VisionImage object using the CMSampleBufferRef object and the rotation metadata:
Swift
let image = VisionImage(buffer: sampleBuffer)
image.metadata = metadata
Objective-C
FIRVisionImage *image = [[FIRVisionImage alloc] initWithBuffer:sampleBuffer];
image.metadata = metadata;
- Then, pass the image to the detect(in:) method:
Swift
cloudDetector.detect(in: image) { landmarks, error in
    guard error == nil, let landmarks = landmarks, !landmarks.isEmpty else {
        // Detection failed or no landmarks were recognized.
        // ...
        return
    }

    // Recognized landmarks
    // ...
}
Objective-C
[landmarkDetector detectInImage:image
                     completion:^(NSArray<FIRVisionCloudLandmark *> *landmarks,
                                  NSError *error) {
  if (error != nil) {
    return;
  } else if (landmarks != nil) {
    // Got landmarks
  }
}];
If landmark recognition succeeds, an array of VisionCloudLandmark objects will be passed to the completion handler. From each object, you can get information about a landmark recognized in the image.
For example:
Swift
for landmark in landmarks {
    let landmarkDesc = landmark.landmark
    let boundingPoly = landmark.frame
    let entityId = landmark.entityId

    // A landmark can have multiple locations: for example, the location the image
    // was taken, and the location of the landmark depicted.
    for location in landmark.locations {
        let latitude = location.latitude
        let longitude = location.longitude
    }

    let confidence = landmark.confidence
}
Objective-C
for (FIRVisionCloudLandmark *landmark in landmarks) {
  NSString *landmarkDesc = landmark.landmark;
  CGRect frame = landmark.frame;
  NSString *entityId = landmark.entityId;

  // A landmark can have multiple locations: for example, the location the image
  // was taken, and the location of the landmark depicted.
  for (FIRVisionLatitudeLongitude *location in landmark.locations) {
    double latitude = [location.latitude doubleValue];
    double longitude = [location.longitude doubleValue];
  }

  float confidence = [landmark.confidence floatValue];
}
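Because the detector can return several candidates, you may want to keep only the more confident ones before showing them to the user. The sketch below is one possible way to do that in Swift. The LandmarkSummary type, the rankedLandmarks function, and the 0.5 threshold are all hypothetical and are not part of ML Kit; the commented-out mapping assumes the landmark properties are optional, so adjust it to match the types the SDK actually declares.

```swift
import Foundation

/// Hypothetical plain-value copy of the fields read in the example above.
struct LandmarkSummary {
    let name: String
    let confidence: Float
    let latitude: Double?
    let longitude: Double?
}

/// Returns the summaries whose confidence is at least `minimumConfidence`,
/// ordered from most to least confident. The default threshold is an
/// arbitrary illustration, not a recommended value.
func rankedLandmarks(
    _ summaries: [LandmarkSummary],
    minimumConfidence: Float = 0.5
) -> [LandmarkSummary] {
    return summaries
        .filter { $0.confidence >= minimumConfidence }
        .sorted { $0.confidence > $1.confidence }
}

// Inside the completion handler, the summaries could be built roughly like
// this (assumption: the properties are optional; adapt to the SDK's types):
//
// let summaries = landmarks.map { landmark in
//     LandmarkSummary(
//         name: landmark.landmark ?? "Unknown",
//         confidence: landmark.confidence?.floatValue ?? 0,
//         latitude: landmark.locations?.first?.latitude?.doubleValue,
//         longitude: landmark.locations?.first?.longitude?.doubleValue
//     )
// }
// let best = rankedLandmarks(summaries).first
```

Copying the values into a plain struct keeps the ranking logic independent of the SDK, which also makes it easy to unit test.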
Next steps