this post was submitted on 03 Dec 2024


Lokas : Record and transcribe your meetings in complete confidentiality ! (framablog.org)

submitted 2 weeks ago by Framasoft@lemmy.world to c/opensource@lemmy.ml


[–] solrize@lemmy.world 3 points 2 weeks ago* (last edited 2 weeks ago)

I get the impression that recreating the Whisper training code is possibly doable, but the data is a bigger task.

This is a possible Whisper alternative with maybe similar issues: https://petewarden.com/2024/10/21/introducing-moonshine-the-new-state-of-the-art-for-speech-to-text/

Yes, it's nice that the phone app is free, but speech-to-text (STT) is the difficult and important part. With Moonshine it might be possible to run the transcriber completely on the phone instead of having the STT on a remote server.

It's interesting that they are able to do all that speaker distinguishing with just a single mic as found on both phones. There was a thread about phone features recently. Given this STT stuff, it could be useful to have a phone with 3 or 4 mics in the corners of the phone, like one of those tabletop conference mics, so it can figure out directionality of sound sources.
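
To make the multi-mic idea a bit more concrete, here is a rough sketch of how two mics can give a direction: find the delay between the two recordings by cross-correlation, then turn that delay into an angle. Everything here is an assumption for illustration (the function names, the 15 cm mic spacing, the 48 kHz sample rate, a far-away source, no echoes). It is not how Lokas or any real phone does it, and real arrays use more robust methods.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def estimate_delay_samples(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best lines up with `ref`.

    A positive lag means the sound reached `other` later than `ref`.
    Plain brute-force cross-correlation over lags in [-max_lag, max_lag].
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += sample * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle_degrees(delay_samples, sample_rate, mic_spacing):
    """Far-field angle from broadside, using sin(theta) = c * tau / d."""
    tau = delay_samples / sample_rate          # delay in seconds
    s = SPEED_OF_SOUND * tau / mic_spacing     # sine of the arrival angle
    s = max(-1.0, min(1.0, s))                 # clamp against rounding or noise
    return math.degrees(math.asin(s))


def make_test_signals(n=512, delay=10, seed=0):
    """Hypothetical test data: noise at mic A, the same noise delayed at mic B."""
    rng = random.Random(seed)
    mic_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mic_b = [0.0] * delay + mic_a[: n - delay]
    return mic_a, mic_b


if __name__ == "__main__":
    sample_rate = 48_000   # Hz, assumed
    mic_spacing = 0.15     # metres, roughly the long edge of a phone (assumed)
    # Largest physically possible delay between the two mics, in samples.
    max_lag = math.ceil(mic_spacing / SPEED_OF_SOUND * sample_rate)

    mic_a, mic_b = make_test_signals()
    lag = estimate_delay_samples(mic_a, mic_b, max_lag)
    angle = arrival_angle_degrees(lag, sample_rate, mic_spacing)
    print(f"delay: {lag} samples, angle: {angle:.1f} degrees from broadside")
```

With only two mics you can't tell front from back, which is one reason 3 or 4 mics in the corners would help: each extra pair gives another delay to cross-check against.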