Image Recognition using Google Vision API

September 20, 2018

What is Image recognition?

Image recognition is an approach used in machine learning that focuses on identifying objects, places, people, and other variables in an image. The first step in image recognition is image classification, in which the system extracts the important information and features from the image.

For example, if you want to separate a cat from the background of an image, you will notice a significant variation in RGB pixel values between the two. The task in image classification is to take an array of pixels that represents a single image and assign a label to it.
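The variation in RGB values can be made concrete with a minimal sketch. It assumes pixels are packed as 0xRRGGBB integers; the class name PixelContrast and the two sample colours are hypothetical and chosen only for illustration.

```java
// Hypothetical illustration: comparing the RGB values of two pixels.
final class PixelContrast {

    // Splits a packed 0xRRGGBB pixel into its red, green and blue channels.
    static int[] channels(int rgb) {
        return new int[] {(rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF};
    }

    // Sum of the absolute per-channel differences between two pixels.
    static int distance(int a, int b) {
        int[] ca = channels(a);
        int[] cb = channels(b);
        int sum = 0;
        for (int i = 0; i < 3; i++) {
            sum += Math.abs(ca[i] - cb[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        int fur = 0xC8A064;   // made-up orange-brown "cat" pixel
        int grass = 0x228B22; // made-up green "background" pixel
        System.out.println("RGB distance: " + distance(fur, grass));
    }
}
```

A large distance between neighbouring pixels is the kind of variation that marks the boundary between the cat and its background.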

1. General Approaches to Image Classification:

Unsupervised Classification – Unsupervised classification (commonly referred to as clustering) is the method of partitioning remote sensing image data in multispectral feature space and extracting land-cover information.
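As a rough sketch of clustering, the example below runs a two-cluster k-means over single-band pixel intensities. This is a simplification: real remote sensing data is multispectral, and k-means is only one of several clustering methods. The class name and the choice of the minimum and maximum values as starting centroids are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration: two-cluster k-means on one band of pixel intensities.
final class KMeans1D {

    // Returns a cluster index (0 or 1) for each input value.
    static int[] cluster(double[] values, int iterations) {
        // Assumption: start the centroids at the smallest and largest value.
        double c0 = Arrays.stream(values).min().orElse(0);
        double c1 = Arrays.stream(values).max().orElse(0);
        int[] labels = new int[values.length];
        for (int it = 0; it < iterations; it++) {
            double sum0 = 0, sum1 = 0;
            int n0 = 0, n1 = 0;
            // Assignment step: attach each value to its nearest centroid.
            for (int i = 0; i < values.length; i++) {
                boolean nearFirst = Math.abs(values[i] - c0) <= Math.abs(values[i] - c1);
                labels[i] = nearFirst ? 0 : 1;
                if (nearFirst) { sum0 += values[i]; n0++; } else { sum1 += values[i]; n1++; }
            }
            // Update step: move each centroid to the mean of its members.
            if (n0 > 0) c0 = sum0 / n0;
            if (n1 > 0) c1 = sum1 / n1;
        }
        return labels;
    }

    public static void main(String[] args) {
        double[] intensities = {10, 12, 11, 200, 210, 205};
        System.out.println(Arrays.toString(cluster(intensities, 5)));
    }
}
```

No labels are supplied up front; the groups emerge from the data, and an analyst would name them (for example, "water" or "vegetation") afterwards.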

Supervised Classification – Supervised classification is a technique used for the quantitative analysis of remote sensing image data. It can be divided into two phases:

  • Training phase – In this phase, the classification algorithm is given the information it needs to identify each class uniquely: a representative set of pixels is assigned to the class to which it belongs.
  • Classification phase – In this phase, the algorithm uses the training data to compare each pixel against the trained classes and assign a class to each pixel.
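The two phases can be sketched with a minimum-distance-to-mean classifier, one simple supervised method among many. The class name, the method names, and the sample pixel values are hypothetical; pixels are assumed to be {red, green, blue} arrays.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: minimum-distance-to-mean classification of RGB pixels.
final class MinDistanceClassifier {
    private final Map<String, double[]> classMeans = new LinkedHashMap<>();

    // Training phase: store the mean RGB vector of the labelled sample pixels.
    void train(String label, int[][] samplePixels) {
        if (samplePixels.length == 0) {
            throw new IllegalArgumentException("at least one sample pixel is required");
        }
        double[] mean = new double[3];
        for (int[] pixel : samplePixels) {
            for (int c = 0; c < 3; c++) {
                mean[c] += pixel[c];
            }
        }
        for (int c = 0; c < 3; c++) {
            mean[c] /= samplePixels.length;
        }
        classMeans.put(label, mean);
    }

    // Classification phase: return the class whose mean is nearest to the pixel,
    // or null if no class has been trained yet.
    String classify(int[] pixel) {
        String best = null;
        double bestDistance = Double.MAX_VALUE;
        for (Map.Entry<String, double[]> entry : classMeans.entrySet()) {
            double distance = 0;
            for (int c = 0; c < 3; c++) {
                double diff = pixel[c] - entry.getValue()[c];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        MinDistanceClassifier classifier = new MinDistanceClassifier();
        classifier.train("water", new int[][] {{10, 20, 200}, {20, 30, 220}});
        classifier.train("vegetation", new int[][] {{30, 180, 40}, {40, 200, 60}});
        System.out.println(classifier.classify(new int[] {15, 25, 210}));
    }
}
```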

2. Image Recognition Techniques:

In mobile, web, and software development, many techniques are available: on one side, libraries and models such as TensorFlow projects by Google and DeepFace by Facebook; on the other, hosted APIs such as Google Cloud Vision and the Microsoft Computer Vision API. These hosted APIs are paid.

Here we will discuss the Google Cloud Vision API on Android and learn how to use a mobile device's camera to identify images.

How does Google Cloud Vision API work?

The Google image recognition API identifies images using models pre-trained on large image datasets. It classifies images into thousands of categories to detect objects, places, people, and faces, and then returns the results along with a confidence value.
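Because every result carries a confidence value, a client usually keeps only the labels it trusts. The sketch below assumes the label annotations have already been parsed into simple objects holding the description and score fields; the class names, the sample labels, and the 0.5 threshold are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration: keep confident labels, highest score first.
final class LabelFilter {

    // Minimal stand-in for one parsed label annotation.
    static final class Label {
        final String description;
        final double score;

        Label(String description, double score) {
            this.description = description;
            this.score = score;
        }
    }

    // Returns the labels whose score is at least minScore, sorted by descending score.
    static List<Label> confident(List<Label> labels, double minScore) {
        List<Label> kept = new ArrayList<>();
        for (Label label : labels) {
            if (label.score >= minScore) {
                kept.add(label);
            }
        }
        kept.sort(Comparator.comparingDouble((Label l) -> l.score).reversed());
        return kept;
    }

    public static void main(String[] args) {
        List<Label> labels = Arrays.asList(
                new Label("Carpet", 0.42), new Label("Whiskers", 0.91), new Label("Cat", 0.98));
        for (Label label : confident(labels, 0.5)) {
            System.out.println(label.description + " " + label.score);
        }
    }
}
```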

Google has made the Cloud Vision API available to developers, who can use it from C#, Go, Java, Node.js, PHP, Python, and Ruby. The API takes an image as input and returns its output in the form of a BatchAnnotateImagesResponse.
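To illustrate what "takes an image as input" means at the REST level, the sketch below builds the JSON body of an images:annotate request: the image is sent as base64 text together with the feature to run. The class name is hypothetical, and the hand-built JSON is an assumption made to keep the example dependency-free; a real app would more likely use the client library's request classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration: JSON body for a Cloud Vision images:annotate call.
final class AnnotateRequestBody {

    // featureType is expected to be a known API constant such as "LABEL_DETECTION";
    // it is inserted as-is, without JSON escaping.
    static String build(byte[] imageBytes, String featureType, int maxResults) {
        String content = Base64.getEncoder().encodeToString(imageBytes);
        return "{\"requests\":[{"
                + "\"image\":{\"content\":\"" + content + "\"},"
                + "\"features\":[{\"type\":\"" + featureType + "\",\"maxResults\":" + maxResults + "}]"
                + "}]}";
    }

    public static void main(String[] args) {
        // Placeholder bytes; a real caller would pass the JPEG or PNG file contents.
        byte[] placeholder = "not a real image".getBytes(StandardCharsets.UTF_8);
        System.out.println(build(placeholder, "LABEL_DETECTION", 5));
    }
}
```

The requests array is what makes the call a batch: each element pairs one image with the features requested for it, and the response contains one entry per request.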

How to use the Google Image Recognition API?

  • Setting up the Google Vision API

  1. To use the Google Vision API, you have to sign up for a Google Compute Engine (GCE) account. GCE is free to try, but you will need a credit card to sign up.
  2. Then create a project in the developer console.
  3. Click the Credentials drop-down menu and select OAuth Client ID, then select Android as the application type. Enter your SHA1 fingerprint and the package name of your app; the package name must match the one declared in your app's build.gradle. Then get an API key from the left-hand menu.
  4. Enable the Cloud Vision API on your project by clicking Enable the API on the API's page.

Now you're ready to go! Next, we will see how Google's API gives us output using a mobile device's camera.

In build.gradle, add the required dependencies.


We will use the Google API Client library because we are making an OAuth request, so first we need to obtain an auth token from Google.



Now we have the parameters needed to call the Cloud Vision API and receive the results.
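As a rough sketch of how the token and the request come together, the example below attaches an OAuth access token to a POST against the public images:annotate endpoint. It uses the JDK 11+ java.net.http classes, which is an assumption made to keep the example self-contained; those classes are not part of the Android platform, where the Google API Client library would make this call instead. The class name and the token value are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical illustration: an authenticated Cloud Vision request (desktop JDK 11+).
final class VisionCall {
    static final String ENDPOINT = "https://vision.googleapis.com/v1/images:annotate";

    // Builds, but does not send, a POST request carrying the OAuth token and JSON body.
    static HttpRequest build(String accessToken, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Authorization", "Bearer " + accessToken)
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        // "test-token" is a placeholder, not a valid credential.
        HttpRequest request = build("test-token", "{\"requests\":[]}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request with an HttpClient would return the JSON form of the BatchAnnotateImagesResponse as the response body.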





Benefits of Cloud Vision API

1. Entity Detection 

Google Cloud Vision API can detect entities, including the location of each object within the image, through Vision API and AutoML.
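Object locations are reported as a bounding polygon whose vertices are normalized to the 0 to 1 range, so a client has to scale them to the image size before drawing a box. The sketch below shows that conversion; the class name and the sample vertices are hypothetical, and the input is assumed to contain at least one vertex.

```java
import java.util.Arrays;

// Hypothetical illustration: scaling a normalized bounding polygon to pixels.
final class BoundingBox {

    // Takes {x, y} vertices in the 0..1 range and returns {left, top, right, bottom} in pixels.
    static int[] toPixels(double[][] normalizedVertices, int imageWidth, int imageHeight) {
        double minX = Double.POSITIVE_INFINITY, minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
        for (double[] vertex : normalizedVertices) {
            minX = Math.min(minX, vertex[0]);
            maxX = Math.max(maxX, vertex[0]);
            minY = Math.min(minY, vertex[1]);
            maxY = Math.max(maxY, vertex[1]);
        }
        return new int[] {
            (int) Math.round(minX * imageWidth),
            (int) Math.round(minY * imageHeight),
            (int) Math.round(maxX * imageWidth),
            (int) Math.round(maxY * imageHeight)
        };
    }

    public static void main(String[] args) {
        double[][] vertices = {{0.25, 0.5}, {0.75, 0.5}, {0.75, 1.0}, {0.25, 1.0}};
        System.out.println(Arrays.toString(toPixels(vertices, 800, 600)));
    }
}
```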

2. Edge Devices 

AutoML builds and deploys high-performance models that classify images and trigger real-time actions based on local data. It also supports a variety of edge devices where resources are constrained and latency is critical.

3. Purchase Friction 

With the Vision API, retailers can create an engaging mobile experience in which customers upload a photo of the item they want and are immediately shown a list of exact or similar items.

4. Detecting Text 

Google Vision API uses OCR to detect text in images; it can recognize up to 50 languages and handles various file types. It also helps process millions of documents quickly and automatically through Document Understanding AI.

5. Recognizing Explicit Content 

Vision’s Safe Search reviews an image and estimates the likelihood that it is explicit, contains adult content, or depicts violence.
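Safe Search reports each category as a likelihood level rather than a number, ranging from VERY_UNLIKELY to VERY_LIKELY (plus UNKNOWN). A moderation rule can therefore be a simple threshold comparison, as in this minimal sketch; the class name, the method names, and the choice of threshold are hypothetical.

```java
// Hypothetical illustration: flagging an image from its Safe Search likelihoods.
final class SafeSearchCheck {

    // Likelihood levels in increasing order of certainty.
    enum Likelihood { UNKNOWN, VERY_UNLIKELY, UNLIKELY, POSSIBLE, LIKELY, VERY_LIKELY }

    // Flags the image when either category reaches the threshold.
    static boolean shouldFlag(String adult, String violence, Likelihood threshold) {
        return atLeast(adult, threshold) || atLeast(violence, threshold);
    }

    // Compares a likelihood name from the response against the threshold.
    private static boolean atLeast(String value, Likelihood threshold) {
        return Likelihood.valueOf(value).ordinal() >= threshold.ordinal();
    }

    public static void main(String[] args) {
        System.out.println(shouldFlag("VERY_UNLIKELY", "LIKELY", Likelihood.LIKELY));
    }
}
```

Where the threshold sits is a product decision: POSSIBLE catches more content at the cost of more false positives, while VERY_LIKELY flags only the clearest cases.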

Quick QnA on Google Cloud Vision API

Q1. How is Google Cloud Vision API impacting the recognition industry?

Ans: Companies that use the Google Cloud Vision API benefit because they are investing in a technology that can be used for image sentiment analysis, moderation of offensive content, and image pattern matching, without paying a third-party vendor or building their own APIs.

Q2. What advantages does Google give as compared to other vendors?

Ans: Google has designed a sophisticated large-scale system for the Vision API that can handle huge data sets and iterate fast. The three main advantages that Google provides are:

  • Capturing huge data sets
  • Large-scale data processing
  • Tackling generic domains

Q3. What are the key functionalities and features of Google Vision API?

Ans: Some of the most powerful Cloud Vision API features and functionalities are described below:

  • Label Detection – Detects broad sets of categories within an image
  • Web Detection – Searches online for related images
  • OCR – Detects and extracts text in an image
  • Landmark Recognition – Identifies popular natural and man-made structures
  • Face Detection – Detects multiple faces within an image
  • Moderation of Content – Automatically detects explicit content
  • Handwriting Recognition – Recognizes handwritten text
  • Logo Recognition – Detects popular product logos
  • Object Localizer – Locates objects within an image
  • REST API – Requests one or more annotation types per image
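In a REST request, each of these features is selected by a type constant such as LABEL_DETECTION, WEB_DETECTION, TEXT_DETECTION, LANDMARK_DETECTION, FACE_DETECTION, SAFE_SEARCH_DETECTION, LOGO_DETECTION, or OBJECT_LOCALIZATION. The minimal sketch below builds the features array for several annotation types at once; the class name is hypothetical, and the JSON is assembled by hand only to avoid extra dependencies.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical illustration: requesting several annotation types for one image.
final class FeatureList {

    // Turns feature type constants into the JSON "features" array of a request.
    static String toJson(List<String> featureTypes) {
        StringJoiner joiner = new StringJoiner(",", "[", "]");
        for (String type : featureTypes) {
            joiner.add("{\"type\":\"" + type + "\"}");
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJson(Arrays.asList("LABEL_DETECTION", "TEXT_DETECTION")));
    }
}
```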

Choose the Right Vision Product for Your Application

Features                     AutoML Vision    Vision API
REST API                     Yes              Yes
RPC API                      Yes              Yes
Predefined Labels            -                Yes
Custom Labels                Yes              -
Annotation                   Yes              Yes
Deploy ML at the edge        Yes              -
Comparing Images             -                -
OCR                          -                Yes
Face Recognition             -                Yes
Landmarks Identification     -                Yes
Logo Identification          -                Yes
Detect general attributes    -                Yes
Web Entities                 -                Yes
Detect Explicit Content      -                Yes
Detecting Objects            Yes              Integrate with ML Kit

Cloud Vision gives you insight from your images through powerful pre-trained API models, and AutoML Vision (beta) lets you train custom vision models without compatibility issues.

Along with Google, Microsoft also offers ML APIs that let developers build smart applications using Azure Machine Learning, for example to detect the age of people in photos. IBM Watson is another service available to developers.

If you want to integrate image recognition functionality into your application, get in touch with us. We can help automate the quality control process, enable edge devices to identify underlying defects, and help you find products of interest within images and search for products visually.

We can also help you classify, enrich, and extract information, and make images searchable across broad topics and scenes. At Signity, we have the best machine learning developers, and you can consider outsourcing to India to reap the associated benefits.
