Meta releases AI model 'Segment Anything Model' that can separate and select objects in photos



Meta has announced 'Segment Anything Model (SAM)', an AI model that can identify individual objects in images and videos, even objects it was never trained on.

Segment Anything | Meta AI Research

https://ai.facebook.com/research/publications/segment-anything/

Introducing Segment Anything: Working toward the first foundation model for image segmentation
https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/

'Image segmentation' divides images and videos into distinct segments, making them easier to analyze and process. Meta expects image segmentation to be useful for understanding the content of web pages, for augmented reality (AR) applications, and for image editing. Meta also says it can be applied to scientific research by automatically locating animals and objects in video footage.
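As a rough conceptual illustration (not Meta's code), each segmented object can be represented as a boolean mask the same size as the image, with True marking the pixels that belong to that object; the tiny 4x6 'image' below is entirely made up:

```python
import numpy as np

# A hypothetical 4x6 image: one object's segmentation mask is a
# boolean array of the same height and width, True on its pixels.
height, width = 4, 6
object_mask = np.zeros((height, width), dtype=bool)
object_mask[1:3, 2:5] = True  # pixels covered by the object

# Segmenting an image means producing one such mask per object,
# so downstream code can count, measure, or cut out each object.
print(object_mask.sum(), "pixels belong to this object")
```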

You can see how accurately SAM segments images in the following examples provided by Meta. First, a photo of a kitchen scene.



When image segmentation is performed with SAM, it looks like this. Every knife and every lemon in the basket is cleanly identified, and each knife's blade and handle are segmented separately.
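The released `segment_anything` Python package can produce this kind of 'segment everything' result through its automatic mask generator. A minimal sketch, assuming you have downloaded the `vit_h` checkpoint named in the project README; the image path is a placeholder:

```python
import cv2
from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

# Load a pretrained SAM checkpoint (downloaded separately from the repo).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

# SAM expects an HxWx3 RGB uint8 array; OpenCV loads BGR, so convert.
image = cv2.cvtColor(cv2.imread("kitchen.jpg"), cv2.COLOR_BGR2RGB)

# Returns one record per detected object: a boolean 'segmentation'
# mask plus metadata such as 'area', 'bbox', and 'predicted_iou'.
masks = mask_generator.generate(image)
print(f"found {len(masks)} segments")
```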



Next, a photo of a box full of vegetables.



Each vegetable can be individually recognized.



When a region was selected by dragging, only the vegetables inside that region were selected.
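In the released API, this kind of rectangular selection corresponds to a box prompt passed to `SamPredictor`. A sketch under the same assumptions as above; the image path and box coordinates are made up:

```python
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)
predictor.set_image(cv2.cvtColor(cv2.imread("vegetables.jpg"), cv2.COLOR_BGR2RGB))

# A box prompt in XYXY pixel coordinates (illustrative values);
# SAM predicts a mask for the region enclosed by the box.
box = np.array([100, 150, 400, 380])
masks, scores, _ = predictor.predict(box=box, multimask_output=False)
print(masks.shape)  # (1, H, W) boolean mask
```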



A demo that performs image segmentation on prepared sample photos is available below, and you can also run segmentation on images you upload yourself.

Segment Anything | Meta AI

https://segment-anything.com/demo

After opening the SAM demo, check the box for 'I have read and agree to the Segment Anything Terms and Conditions'.



After checking the box, click 'Upload an image' at the top of the screen. A file selection dialog will open, so choose the image you want to load.



This time, I uploaded an image of the 'Torotamama Cheese Teriyaki Burger ~Hokkaido Gouda Cheese~'. When I clicked on the patty, only the patty was cleanly selected.



This is what it looks like when the lettuce is selected.



When you click the bun, only the bun turns blue, as shown below.



Furthermore, clicking 'multi-mask' in the left column makes the bun pop out and displays it in three dimensions. Objects on the same layer are clipped automatically, so not only the top bun but also the bottom bun was extracted.
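In the released API, a click maps to a point prompt, and behavior like the demo's 'multi-mask' view corresponds to the `multimask_output` flag, which returns three nested candidate masks for an ambiguous click. A sketch; the image path and click coordinates are made up:

```python
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)
predictor.set_image(cv2.cvtColor(cv2.imread("burger.jpg"), cv2.COLOR_BGR2RGB))

# One foreground click (label 1) on the bun; coordinates are illustrative.
point = np.array([[320, 120]])
label = np.array([1])

# multimask_output=True returns three candidate masks for the click
# (e.g. part / object / whole), each with a predicted quality score.
masks, scores, _ = predictor.predict(
    point_coords=point, point_labels=label, multimask_output=True
)
best = masks[np.argmax(scores)]  # keep the highest-scoring candidate
print(masks.shape, scores)       # (3, H, W) and three scores
```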



SAM is an image segmentation model that allows you to isolate specific objects in an image in response to text prompts or user clicks. Image segmentation technology itself is not new, but SAM is unique in that it can identify objects that are not present in the training dataset.

According to Meta, building a highly accurate image segmentation model typically 'requires AI training infrastructure and highly specialized work by technical experts with access to large amounts of carefully annotated data.' Meta hopes SAM will 'democratize image segmentation' by reducing the need for such specialized training and expertise, further accelerating computer vision research.

The SA-1B dataset used to train SAM consists of approximately 1.1 billion high-quality segmentation masks, collected with Meta's data engine on images licensed from a major photography company, and it has been released for research purposes.
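For reference, SA-1B stores each mask in COCO run-length-encoded (RLE) form inside per-image JSON files, which can be decoded with `pycocotools`; the file name below is a placeholder, and the JSON layout is assumed from the dataset description in the repository:

```python
import json
from pycocotools import mask as mask_utils

# Each SA-1B image ships with a JSON file whose 'annotations' list
# holds masks as COCO RLE dicts ({'size': [H, W], 'counts': ...}).
with open("sa_000000.json") as f:
    annotations = json.load(f)["annotations"]

# Decode the first mask into an HxW uint8 array (1 = object pixel).
mask = mask_utils.decode(annotations[0]["segmentation"])
print(mask.shape, mask.sum())
```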

The source code is also published on GitHub under the Apache 2.0 open license, with the trained SAM model weights available as separate downloads.

GitHub - facebookresearch/segment-anything: The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
https://github.com/facebookresearch/segment-anything

in Review, Software, Web Service, Web Application, Posted by log1i_yk