this post was submitted on 07 Oct 2023
51 points (96.4% liked)

No Stupid Questions

In Star Wars Rebels, there was an E-XD-series infiltrator droid that could quickly take inventory of everything in a Rebel warehouse. With the advanced object recognition capabilities of modern AI, it seems only a matter of time before an Android app can accurately and rapidly identify and catalogue objects in real time from video capture. It would work like a home inventory app, except users would just walk around the house capturing video instead of taking pictures and labeling items one by one. When do you think such an app will become available? And what is the closest app available right now?

edit: I didn't say offline or on-device, so I don't know why everyone assumes that. I mean a service offered through an Android app.

all 29 comments
[–] over_clox@lemmy.world 11 points 1 year ago* (last edited 1 year ago) (3 children)

Modern AI, as you're seeing it today, is processed by massive data centers online with thousands of processing units running in parallel, not by your local device. Your device would be way too slow to expect any sort of realtime object recognition, at least with the current state of technology.

TL;DR - I don't think it'll happen anytime soon, at least not on your local device. It would take a super fast and steady connection to the AI service.

[–] deegeese@sopuli.xyz 10 points 1 year ago* (last edited 1 year ago) (1 children)

Don’t underestimate the potential for optimization when you can constrain the problem to a narrow range of uses. Model pruning and custom silicon go far. Voice assistants used to be purely cloud compute, but a lot of common use cases are done on device now.
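
As a rough illustration of that kind of shrinking, here is a minimal sketch of post-training quantization with TensorFlow Lite. It shows quantization rather than the pruning mentioned above, and the model and file names are placeholders, not anything from this thread.

```python
# Minimal sketch: post-training quantization with TensorFlow Lite.
# MobileNetV2 is only a stand-in; any trained Keras model would do.
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights="imagenet")

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # quantize weights
tflite_model = converter.convert()

# The resulting .tflite file is a fraction of the original model's size
# and can be run on-device with the TFLite interpreter.
with open("mobilenet_v2_quant.tflite", "wb") as f:
    f.write(tflite_model)
```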

[–] over_clox@lemmy.world 5 points 1 year ago

Yes, I've been testing FUTO Voice Recognition lately. It's awesome as hell, but it is far from realtime. And this ain't even object recognition, it's only voice recognition.

https://voiceinput.futo.org/

https://play.google.com/store/apps/details?id=org.futo.voiceinput

[–] umbrella@lemmy.ml 2 points 1 year ago (1 children)

dunno, some mobile devices are starting to ship with pretty passable gpus nowadays

[–] over_clox@lemmy.world 6 points 1 year ago (1 children)

We're not talking about image rendering, we're talking about image recognition. Although they may seem related, they are not.

It's one thing to sling a 3D model and textures to a GPU, but it's totally a different thing to take a photo and sling it against a humongous AI model being run at a datacenter with billions of images to compare it to.

[–] umbrella@lemmy.ml 4 points 1 year ago (1 children)

image recognition is also done on gpus, a powerful enough gpu on say, a phone can do a variety of ai tasks

a mobile integrated intel gpu can already do facial recognition on a video stream for example

data centers have to be big because they centralize a lot of work

[–] over_clox@lemmy.world 3 points 1 year ago* (last edited 1 year ago) (1 children)

Recognizing a face is one thing, that's more or less just knowing certain geometries. Recognizing who that face actually is, or what model car that is, or whatever, requires processing through a huge database of information.

Also, as of right now, not all AI systems are even smart enough to distinguish a human from a monkey. They both have faces yo...

[–] umbrella@lemmy.ml 2 points 1 year ago (1 children)

tell that to my frigate nvr

[–] over_clox@lemmy.world 1 points 1 year ago (1 children)

No shit Watson, that's my whole point. AI as anyone today knows it is cloud based, meaning you're tethered to the internet. Your device can't process it all by its little measly lonesome self.

[–] umbrella@lemmy.ml 2 points 1 year ago* (last edited 1 year ago) (1 children)

you should look up what frigate is.

my desktop gpu can generate ai art pretty quickly too

[–] over_clox@lemmy.world 2 points 1 year ago (2 children)

Again, we are not discussing AI image generation, we are discussing recognizing images from a camera. That requires parallel processing and many terabytes, if not petabytes, of images to compare against.

You got a petabyte of storage and a 1024 core processor to scavenge through all those images to tell you that the picture of your butt plug looks like a purple booty packer 3000?

[–] Akrenion@programming.dev 3 points 1 year ago (1 children)

Modern AI only needs those images during training, to learn features and project them into embeddings. I don't disagree with your assessment, but you are describing reverse image search, not neural nets.
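
A minimal sketch of that distinction, assuming a generic pretrained backbone (the model choice and file name are illustrative): the trained network maps an image to an embedding, or to class scores, purely from its learned weights; no database of training images is consulted at inference time.

```python
# Sketch: inference uses only the network's learned weights, not a
# database of training images. Model choice and file name are placeholders.
import numpy as np
import tensorflow as tf

backbone = tf.keras.applications.MobileNetV2(
    weights="imagenet", include_top=False, pooling="avg")

img = tf.keras.utils.load_img("photo.jpg", target_size=(224, 224))
x = tf.keras.applications.mobilenet_v2.preprocess_input(
    np.expand_dims(tf.keras.utils.img_to_array(img), axis=0))

embedding = backbone.predict(x)  # a single 1280-dim feature vector
print(embedding.shape)

# A small classifier head (or a nearest-centroid lookup over class
# embeddings) turns this vector into a label without ever touching
# the original training images.
```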

[–] over_clox@lemmy.world 1 points 1 year ago

You act like I don't have experience programming neural networks. I actually do, starting from the early 2000s, initially for OCR (Optical Character Recognition).

If you want to do literal realtime object recognition, you'll have to reverse search images, otherwise you'll be dealing with a system that can only generalize the object, but can't even tell if a face is a human or a monkey.

https://www.philstar.com/headlines/2023/09/06/2294255/nbi-test-shows-sim-registration-system-accepts-ids-animal-faces

[–] umbrella@lemmy.ml 2 points 1 year ago* (last edited 1 year ago)

again, look up frigate. its similar although not the same as what you describe.

you dont need that much hardware to do ai, im not really saying the average joe will train huge intricate models for personal use on his laptop/phone.

[–] qwertyqwertyqwerty@lemmy.world 2 points 1 year ago (1 children)

Honestly, I expect some form of it in the next five years. Tech can move fast when it wants to and there’s 💵 involved.

[–] Zarxrax@lemmy.world 10 points 1 year ago

There is an app called Object Detector which does this. It's not particularly accurate and can't recognize a lot of objects, but it does run on phones in realtime.

[–] 4am@lemm.ee 5 points 1 year ago* (last edited 1 year ago)

Frigate does this on a Raspberry Pi or Intel NUC already. It would be power hungry in a phone, but if you are not training ML models and just looking for objects the model already knows, the tech would be ready today.

EDIT: Here’s Google’s article on how to create your own TensorFlow app for Android https://developers.google.com/ml-kit/vision/object-detection/custom-models/android
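
For the "just looking for objects the model already knows" case, a rough sketch of that kind of inference with the TensorFlow Lite interpreter might look like this; the model file and its output tensor layout are assumptions about a typical pre-trained SSD MobileNet export, not something taken from the linked article.

```python
# Sketch: running a pre-trained TFLite object detector on one frame.
# The model file and its output layout are assumed to follow a typical
# SSD MobileNet export (boxes, class ids, scores, detection count).
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="ssd_mobilenet.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
outs = interpreter.get_output_details()

# One video frame, resized to the model's expected input shape.
h, w = inp["shape"][1], inp["shape"][2]
frame = np.zeros((1, h, w, 3), dtype=inp["dtype"])  # stand-in for a real frame

interpreter.set_tensor(inp["index"], frame)
interpreter.invoke()

boxes = interpreter.get_tensor(outs[0]["index"])
class_ids = interpreter.get_tensor(outs[1]["index"])
scores = interpreter.get_tensor(outs[2]["index"])
print(class_ids[0][scores[0] > 0.5])  # ids of confident detections
```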

[–] fartsparkles@sh.itjust.works 4 points 1 year ago

Don’t know about an Android app but YOLOv8 Detect and similar models can detect objects in videos and classify them.
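
A minimal sketch of that with the Ultralytics package (assuming `pip install ultralytics`; the video file name is a placeholder):

```python
# Sketch: detect and classify objects in a video with a pre-trained
# YOLOv8 model. The video path is a placeholder.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # small pre-trained detection model

# stream=True yields results frame by frame instead of buffering them all
for result in model("room_walkthrough.mp4", stream=True):
    for box in result.boxes:
        label = model.names[int(box.cls)]
        print(label, round(float(box.conf), 2))
```

Whether that keeps up with a live camera feed on a phone is a separate question, but it covers the detect-and-classify part.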

[–] ryathal@sh.itjust.works 4 points 1 year ago

An isolated phone, not for a while. A phone with a dedicated 5G connection would be pretty close.

[–] altima_neo@lemmy.zip 3 points 1 year ago (1 children)

Google Goggles used to be able to do that

[–] Kolanaki@yiffit.net 6 points 1 year ago

Google Lens still can.

[–] QubaXR@lemmy.world 3 points 1 year ago

GPT-4 with image uploads gets pretty damn close, though it's not real-time and is processed server-side.

[–] breadsmasher@lemmy.world 2 points 1 year ago (1 children)

Doesn’t google lens basically do this already?

[–] over_clox@lemmy.world 8 points 1 year ago (1 children)

Yes, but OP is referring to realtime object recognition. Object recognition right now doesn't take very long, but you still have to wait a bit, and that's not quite realtime.

[–] adespoton@lemmy.ca 2 points 1 year ago

Think about what Apple currently has: dedicated ML processing chip with multiple cores, and yet the on-device object recognition is still an “overnight while plugged in” process for a single image, and only detects a limited number of object types.

Real-time mobile offline OR is still the mythical “at least ten years out.” It needs improvements in processors, sample sets, training data and algorithms to get to real-time.

[–] DrownedRats@lemmy.world 1 points 1 year ago

Not likely in the near future, probably not feasible in the long term either. It's not just about recognising an object. You could write a program that recognises a screw but you'd need far more complicated sensors and algorithms to identify the dimensions, specific characteristics, material composition, design specifications, etc, then apply that to every screw, bolt, washer, small component and assembly, tubes, threaded rods, tyres, pistons, brake pads, resistors, capacitors, diodes, seals, consumables, etc.

For a long time, I think that kind of thing would be wildly inaccurate, hugely expensive, massively complicated, and much less efficient than asking a human to kindly go over there and check all those things manually.