Google releases the latest version of the world's largest dataset 'Open Images Dataset'



Google first released 'Open Images', an image dataset for machine learning, in 2016. On May 8, 2019, it released the latest version, 'Open Images Dataset V5'.

Google AI Blog: Announcing Open Images V5 and the ICCV 2019 Open Images Challenge
https://ai.googleblog.com/2019/05/announcing-open-images-v5-and-iccv-2019.html

Because machine learning needs a large amount of data to learn from, Google released the image dataset 'Open Images v4' in 2018. It contains about 9 million images annotated with labels and bounding boxes. With a total of 15.4 million bounding boxes for objects in 600 categories, along with more than 300,000 visual relationship annotations, Open Images v4 was the world's largest dataset with object location annotations.

Then, on May 8, 2019, Google released the new Open Images Dataset V5.

Open Images Dataset V5
https://storage.googleapis.com/openimages/web/index.html



The headline feature of Open Images Dataset V5 is 2.8 million segmentation masks for object instances, covering 350 categories. A bounding box only identifies the general region in which an object is located. A segmentation mask, by contrast, traces the outline of the object and describes its spatial extent in much finer detail. The Google development team paid particular attention to annotating instances consistently: for example, the mask for a cat always includes its tail, and luggage carried by a camel or a person is included in the mask of whoever is carrying it. More importantly, the development team says the masks cover a broader range of object categories and more instances than any previous dataset.
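
The difference between a bounding box and a segmentation mask can be illustrated with a small sketch. The toy mask below is hypothetical and is not taken from Open Images, and the helper names `bounding_box`, `mask_area` and `box_area` are made up for this example. It derives the tightest box around a mask and compares how many pixels each representation covers.

```python
# Toy binary mask: 1 marks a pixel that belongs to the object, 0 is background.
# The shape is hypothetical and not taken from Open Images.
MASK = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 0],
]


def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max), inclusive, of the tightest box around the mask."""
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        raise ValueError("mask contains no object pixels")
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def mask_area(mask):
    """Number of pixels that belong to the object."""
    return sum(sum(row) for row in mask)


def box_area(box):
    """Number of pixels inside an inclusive (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min + 1) * (y_max - y_min + 1)


if __name__ == "__main__":
    box = bounding_box(MASK)
    print("bounding box:", box)
    print("pixels in mask:", mask_area(MASK))
    print("pixels in box:", box_area(box))
```

In this toy case the box covers twice as many pixels as the object itself, which is the extra background that a mask leaves out.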



The masks were produced by an interactive segmentation process, which is much more efficient than drawing them by hand and still accurate enough to score 84% in Intersection-over-Union (IoU). Google has also released masks for the validation and test sets, which it describes as 'near-perfect', capturing even the fine details of complex object boundaries.
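
For readers unfamiliar with the metric, IoU is the number of pixels two masks share divided by the number of pixels covered by either of them. The following is a minimal sketch of that calculation on two hypothetical toy masks. The function name `mask_iou` is made up for this example, and this is not the evaluation code Google used.

```python
def mask_iou(a, b):
    """Intersection-over-Union of two equally sized binary masks (lists of rows)."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("masks must have the same shape")
    intersection = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            if pa and pb:
                intersection += 1  # pixel is in both masks
            if pa or pb:
                union += 1  # pixel is in at least one mask
    # Assumption for this sketch: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Hypothetical reference mask and a slightly different predicted mask.
REFERENCE = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
PREDICTED = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
]

if __name__ == "__main__":
    print(f"IoU: {mask_iou(REFERENCE, PREDICTED):.3f}")
```

Here the two masks share 5 pixels and together cover 7, so the IoU is 5/7, or about 71%. A score of 84% means the overlap is considerably tighter than in this toy case.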



In addition to the masks, Google has added 6.4 million new human-verified image-level labels, bringing the total to 36.5 million labels across nearly 20,000 categories. It has also increased the annotation density for the 600 object categories in the validation and test sets, which allows object detection models to be evaluated more accurately.

Alongside the release of Open Images Dataset V5, Google also announced the object detection competition 'Open Images Challenge 2019'. The challenge uses Open Images V5 and will be hosted on Kaggle, opening on June 3, 2019.

Open Images Challenge 2019
https://storage.googleapis.com/openimages/web/challenge2019.html

in Software, Posted by darkhorse_log