I shot a variety of videos to see how Atomtech's smart camera 'ATOM Cam GPT', which uses AI to convert video content into text, summarizes them



'ATOM Cam GPT', a camera with built-in generative AI, is said to 'generate a description of the content of the video in Japanese text', so I filmed a variety of subjects to check its performance.

ATOM Cam GPT | ATOM Official Store

https://www.atomtech.co.jp/products/atomcamgpt

Table of Contents
1: Setup
2: Basic functions other than text generation
3: Text generation function

◆1: Setup
First, set it up. When you connect the ATOM Cam GPT to a power source, the camera will start and you can start adjusting the position.

Startup scene of 'ATOM Cam GPT', a smart camera equipped with generative AI - YouTube


Next, set up the connection using the 'ATOM v2' app. The app is available for iOS and Android, and I used the Android version this time. Open the app and sign in with your ATOM account.



Once you've signed in, tap 'Add Device.'



Tap 'ATOM Cam GPT'.



Tap 'Next'.



Allow the app to use your location. This time, tap 'Only while using the app.'



Next, allow the app to discover and connect to nearby devices and to determine their relative location.



Next, press and hold the reset button on the ATOM Cam GPT unit.



The device will be displayed in the app, so tap 'It's mine.'



Enter the name and password of the Wi-Fi network you want to connect to and tap 'Next'.



The connection process will begin, so please wait for a while. The countdown started from '120 seconds', but the connection was completed in about 20 seconds.



Tap 'Finish' to complete the setup.



◆2: Basic functions other than text generation
After the setup is complete, the app displays the live feed from the ATOM Cam GPT in 2K (2592 x 1944), with audio. You can also view the feed and operate the camera when your smartphone is not connected to the same network.



If you connect two or more cameras, you can easily switch between footage by swiping vertically within the app.



The control panel at the bottom allows you to access voice control, snapshot and video capture, and other settings.



The audio sounds like this:

The speaker volume of the smart camera 'ATOM Cam GPT' equipped with generative AI is about this - YouTube


Under other settings, you can configure motion tracking, turn on the spotlight, turn off the camera, adjust the viewing area, share footage, shoot time-lapses, and open the settings screen.



When sharing videos, you can generate a link that allows multiple people to watch the video at the same time. The link expires in 5 to 30 minutes, and you can also set a password.



Anyone who opens the generated link can watch the video at the same time, even without an ATOM account. Viewers can only watch; they cannot control the camera.



A panel for moving the camera is displayed at the top of the operation panel.



You can move the camera up, down, left and right by swiping the panel. ATOM Cam GPT can rotate up to 350 degrees horizontally and 135 degrees vertically. When you select 'power off,' the camera turns off and physically points downward.

'ATOM Cam GPT', a smart camera equipped with generative AI, can rotate the camera up, down, left and right - YouTube


The brightness of the spotlight can be adjusted using a slider.



ATOM Cam GPT is not a night vision camera, so it will not show anything in a pitch black room.



When you turn on the light, it is bright enough to make out what is in frame within about 3 to 4 meters in front of the camera.



The light is white.



◆3: Text generation function
Next, let's check out the text generation function (ATOM Intelligence: AI), which is the core feature of ATOM Cam GPT. According to Atomtech, this function is implemented using 'an open source model that is not made by OpenAI.' The text generation function is a paid service, and at the time of writing, the following five functions are available.

・1: Meal Yokai Gohankun (600 yen per month)
It records meal moments, gives healthy eating advice and helps you review your diet in a fun way.
・2: Babysitter Nao (900 yen per month)
It captures and records every moment of your baby's growth, carefully observes their development in real time, and advises on predicted phenomena and how to deal with them.
・3: Dog Observer Maru (600 yen per month)
Keep track of your dog's daily eating habits and toileting behavior, and record them to capture every precious moment.
・4: Cat Observer Momo (600 yen per month)
Always keep a careful record of your cat's daily eating habits and litter box behavior.
・5: Detective View Say (900 yen per month)
Accurate insight and detailed explanation of everything in the footage.
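
For reference, the monthly fees listed above can be totaled with a short calculation. This is only a rough sketch: it assumes each function is billed separately and that the fees simply add up with no bundle discount, which the list does not confirm. The names `MONTHLY_FEES_YEN` and `monthly_total` are made up for this illustration.

```python
# Monthly fees in yen, copied from the list above.
# Assumption: each function is billed separately, with no bundle discount.
MONTHLY_FEES_YEN = {
    "Meal Yokai Gohankun": 600,
    "Babysitter Nao": 900,
    "Dog Observer Maru": 600,
    "Cat Observer Momo": 600,
    "Detective View Say": 900,
}


def monthly_total(agents):
    """Sum the monthly fee in yen for the given function names."""
    return sum(MONTHLY_FEES_YEN[name] for name in agents)


all_agents_monthly = monthly_total(MONTHLY_FEES_YEN)
pet_agents_monthly = monthly_total(["Dog Observer Maru", "Cat Observer Momo"])

print(all_agents_monthly)       # all five functions, per month
print(all_agents_monthly * 12)  # all five functions, over twelve months
print(pet_agents_monthly)       # the two pet observers only, per month
```

Under that assumption, subscribing to everything comes to 3,600 yen per month.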

The above features can be used free of charge for one month, but only by the first 20,000 purchasers. At the time of writing, 11,206 units had been sold, so I was able to try it for free. This time I will use 'Detective View Say'. To start using it, tap 'Agent Store' and then 'Detective View Say'.



Tap 'Get Free.'



Tap 'Start free trial.'



Tap 'Next'.



Tap 'Save.'



This completes the process. Detective View Say will be displayed in the 'My Services' tab where you can manage your contracted services.



Let's try using Detective View Say. As background, when ATOM Cam GPT detects a moving object, it starts an 'event recording' and saves it to the main unit's storage. The AI analyzes the image from immediately after the event recording starts and generates text from it.
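
The flow described above can be sketched roughly as follows. This is not Atomtech's implementation, which is not public; it is a minimal illustration that assumes motion is detected by simple frame differencing and that only the first frame after the trigger is handed to the model. `describe_frame_stub` is a hypothetical stand-in for the image-to-text model, and the threshold value is arbitrary.

```python
from typing import Callable, List, Optional

Frame = List[int]  # a frame as a flat list of grayscale pixel values (0-255)


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def first_motion_index(frames: List[Frame], threshold: float = 10.0) -> Optional[int]:
    """Index of the first frame that differs enough from the previous one."""
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            return i
    return None


def caption_event(frames: List[Frame],
                  describe: Callable[[Frame], str],
                  threshold: float = 10.0) -> Optional[str]:
    """Describe only the frame right after motion is first detected."""
    idx = first_motion_index(frames, threshold)
    if idx is None:
        return None  # no motion: no event recording and no text
    return describe(frames[idx])


def describe_frame_stub(frame: Frame) -> str:
    """Hypothetical placeholder for the image-to-text model."""
    return f"frame with mean brightness {sum(frame) / len(frame):.0f}"


still = [50] * 16
moved = [50] * 8 + [200] * 8
frames = [still, still, moved, still]

caption = caption_event(frames, describe_frame_stub)
print(caption)
```

In this sketch, everything after the trigger frame is ignored when generating the text, and a scene with no detected motion produces no text at all.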

I then filmed a few scenes. First, I filmed a camera being taken off a shelf, and ATOM Cam GPT generated the following text: 'A studio where the camera equipment is neatly organized. Everything you need for shooting is there.'



The actual footage can be seen below.

Part 1 of the recording of 'ATOM Cam GPT', a smart camera equipped with generative AI that converts images into text. The generated text is 'A studio where the camera equipment is neatly organized. Everything you need for shooting is there.' - YouTube


Next is a video of someone pouring milk into a glass. The generated text was: 'You can see him holding a magnifying glass in his hand and carefully examining the details of the wall.'



The actual footage is below.

Part 2 of the recording of the smart camera 'ATOM Cam GPT' that uses generative AI to convert video into text. The generated text is 'You can see him holding a magnifying glass in his hand and carefully examining the details of the wall.' - YouTube


Next, I had a person stand in front of the camera. The generated text was: 'A mysterious landscape in the mist. A wonderful photo that lets you feel the beauty of nature.'



This is actual footage.

Part 3 of the recording of the smart camera 'ATOM Cam GPT' that uses generative AI to convert video into text. The generated text is 'A mysterious landscape in the mist. A wonderful photo that lets you feel the beauty of nature.' - YouTube


The final clip shows a person entering and exiting through the front door. The generated text was: 'A gray front door and a green emergency exit sign are visible. A person is visible in the room.'



This is the actual footage.

Part 4 of the recording of the smart camera 'ATOM Cam GPT' that uses generative AI to convert video into text. The generated text is 'A gray front door and a green emergency exit sign are visible. A person is visible in the room.' - YouTube


After several tries, it became clear that ATOM Cam GPT most likely captures and summarizes only the moment immediately after it detects a moving object. Probably for that reason, it could not summarize details spanning the whole recording, such as what the subject did or where it went.

What's more, the generated text is extremely abstract and provides little useful information. Having Google's AI 'Gemini' describe the image would work better. On a Google Pixel, depending on the settings, pressing and holding the home button takes a screenshot and launches Gemini, so all you have to do is attach the image and send a prompt such as 'Please describe what is in this image.'



When motion is detected, a notification is sent to the smartphone, so I thought, 'If I could check the generated text in the notification bar, I wouldn't have to go to the trouble of opening the camera app to see what was captured.' Unfortunately, all that was displayed was 'Motion detected by ATOM Cam GPT.' If the text were displayed in the notification bar, it would be a significant differentiator from other security cameras.



However, the ATOM Cam GPT seems to have problems with the accuracy of its motion detection. People appeared in frame many times throughout the review, but in a significant share of cases they were not detected as moving objects. For example, a person went undetected even at the following distances:



Thinking that 'it might not work if the subject is too close,' I pointed the camera at a distant view, and a car happened to pass by, but this was not detected either. Detection is so unreliable that it is no exaggeration to say a successful detection is rare. Event recording and text generation will not start unless a moving object is detected, so I hope that the detection accuracy will be improved in future updates.



By the way, I also tried out 'Meal Yokai Gohankun,' but no event was ever recognized as a meal.



In addition, ATOM Cam GPT has two recording methods: '12-second slice recording', which records for 12 seconds and then cools down for 5 minutes, and 'full detection recording', which records for a while and then starts the next recording without a cooldown. 'For a while' means that the number of seconds depends on the plan. Without paying, each recording lasts 12 seconds, and if you pay 660 yen per month, recording continues indefinitely until motion is no longer detected. This time, I tried recording with both settings, but motion still went undetected even with 'full detection recording'.
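
The difference between the two recording methods is easier to see with a small simulation. The sketch below is one interpretation of the description above, not the camera's actual logic: it assumes the free tier caps each clip at 12 seconds, that the paid tier keeps recording until motion stops, and that 'motion stops' means no motion for `idle_s` seconds, a hypothetical parameter that does not appear in the app. Times are in seconds.

```python
from typing import List, Optional, Tuple

Clip = Tuple[float, float]  # (start, end) in seconds


def slice_recording(motion_times: List[float],
                    clip_s: float = 12.0,
                    cooldown_s: float = 300.0) -> List[Clip]:
    """Record a fixed-length clip, then ignore motion during the cooldown."""
    clips: List[Clip] = []
    ready_at = float("-inf")
    for t in sorted(motion_times):
        if t >= ready_at:
            clips.append((t, t + clip_s))
            ready_at = t + clip_s + cooldown_s
    return clips


def full_detection_recording(motion_times: List[float],
                             clip_s: Optional[float] = 12.0,
                             idle_s: float = 5.0) -> List[Clip]:
    """No cooldown. With clip_s=None, a clip is extended while motion continues
    and ends idle_s seconds after the last motion (idle_s is an assumption)."""
    clips: List[Clip] = []
    for t in sorted(motion_times):
        if clips and t <= clips[-1][1]:
            if clip_s is None:
                clips[-1] = (clips[-1][0], t + idle_s)  # motion continues: extend
            continue  # motion during a fixed-length clip starts nothing new
        clips.append((t, t + (clip_s if clip_s is not None else idle_s)))
    return clips


motion_times = [0, 5, 20, 100, 400]

sliced = slice_recording(motion_times)
full_free = full_detection_recording(motion_times)
full_paid = full_detection_recording(motion_times, clip_s=None)

print(len(sliced), len(full_free), len(full_paid))
```

With these sample timestamps, the 5-minute cooldown makes slice recording skip the motion at 20 and 100 seconds entirely, while full detection recording starts a new clip for each of them.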



Recorded videos are stored in the device's internal storage and can be saved or deleted later. The capacity was 64GB.



The ATOM Cam GPT is a camera that can shoot in 2K, check live video, and rotate over a wide range, but the AI function is not that great, so the overall impression is inevitably poor. The idea of integrating a summary function into a smart camera is interesting, and if the accuracy improves, it should be easy to use. I look forward to seeing what happens in the future.

In addition, Atomtech has posted a release titled 'Development of ATOM Cam GPT and Apology,' which states, 'ATOM Cam GPT is a completely new product line, and it is just the beginning, so of course it is not the finished product. As technology and AI continue to grow, this product will grow with it. It will be different from the conventional camera lineup, and it will take time to get used to it, but we believe that you will be able to find new ways to use it. If you purchased it mainly for security purposes, this camera may not be the best choice. We apologize for the insufficient explanation on our side. For this reason, users who purchased it at the early bird price will be offered a free return service for any reason within 14 days. Please contact our official customer service within 14 days of receiving the product and complete the return procedure.'

About the development of ATOM Cam GPT and our apology | ATOM Tech
https://info.atomtech.co.jp/news/news/2918/

ATOM Cam GPT can be purchased through the official website. The price is 8,280 yen including tax, but the first 20,000 units will be available at a special launch price of 4,980 yen including tax.

in Review, Hardware, Video, Posted by log1p_kr