Self-hosted voice assistant with mobile app
from eager_eagle@lemmy.world to selfhosted@lemmy.world on 21 Feb 12:44
https://lemmy.world/post/43424949

Any experiences with a self-hosted assistant like the modern Google Assistant? Looking for something LLM-powered that is smarter than older assistants, which would just try to call third-party tools directly and miss or misunderstand requests half the time.

I’d like integration with a mobile app to use it from the phone and while driving. I see Home Assistant has an Android Auto integration. Has anyone used this, or another similar option? Any blatant limitations?

#selfhosted

threaded - newest

artyom@piefed.social on 21 Feb 12:59 next collapse

Multi-billion dollar companies like Google and Apple can’t even figure this shit out, doubt some nerd is gonna do it for free.

eager_eagle@lemmy.world on 21 Feb 13:20 next collapse

  1. They can and do; 2. LLMs can do tool calling just fine, even self-hosted ones.
artyom@piefed.social on 21 Feb 13:37 collapse

LOL they can’t even reliably turn the lights on, WTF are you talking about?

eager_eagle@lemmy.world on 21 Feb 14:03 next collapse

maybe the last time you tried it was over 6 months ago, maybe you’re using the old Google Assistant, or idk, but it definitely works for me

artyom@piefed.social on 21 Feb 14:33 next collapse

Everything I’ve read says Gemini is like 10x worse than Google Assistant.

eager_eagle@lemmy.world on 21 Feb 16:36 collapse

so you haven’t tried it recently

NarrativeBear@lemmy.world on 21 Feb 21:28 collapse

Gemini is a hot pile of garbage.

When I ask Gemini for directions it starts to give me a definition, as opposed to opening maps and showing me the way. If I ask to turn off the lights I get a conversation and I end up walking to the light switch myself.

eager_eagle@lemmy.world on 21 Feb 21:30 collapse

idk what to tell you, because I just tried it and it works

iamthetot@piefed.ca on 22 Feb 00:30 collapse

Heh, I’ve actually moved away from using Google Home stuff because it’s shoving Gemini down my throat and it’s been worse in the last six months than it was a year ago.

avidamoeba@lemmy.ca on 22 Feb 07:54 collapse

Using Home Assistant with Qwen locally. It functions better than any version of Google Home I’ve had. Understands me without my having to think about how I phrase things. Can ask it for one or multiple things at the same time. Can even make it pretend to be Santa Claus while responding. My wife was ecstatic when she heard the ho-ho-ho after asking it to turn on the coffee machine on Christmas.

Mubelotix@jlai.lu on 22 Feb 09:13 collapse

You are not wrong. I read an article a while ago (like several years) from an Amazon engineer who revealed they couldn’t wrap their minds around upgrading Alexa to the latest technologies. He described them going in circles and people eventually leaving the company because of how bad a mess it was. And it turns out they haven’t made any progress since then; Alexa is still absolute garbage. It can’t hear you when there is music, it can’t hear you when you cook, it can’t hear you from more than 5 meters, and even when it does hear you, it responds completely off track and goes on and on, refusing to stop the nonsense.

wildbus8979@sh.itjust.works on 21 Feb 13:04 next collapse

Home Assistant can absolutely do that. If you’re OK with simple intent-based phrasing, it’ll do it out of the box. If you want complex understanding and reasoning, you’ll have to run a local LLM, like Llama, on top of it.
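For a sense of what the out-of-the-box, intent-based phrasing looks like: Home Assistant lets you map fixed sentence templates onto built-in intents via custom-sentences YAML. A minimal sketch (the file path and phrasing are illustrative, not from this thread):

```yaml
# config/custom_sentences/en/lights.yaml (hypothetical example)
language: "en"
intents:
  HassTurnOn:
    data:
      - sentences:
          # Matches e.g. "switch on the desk lamp" for any exposed entity name
          - "switch on [the] {name}"
```

With this approach the phrasing must match a template; the LLM route discussed below is what handles free-form requests.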

eager_eagle@lemmy.world on 21 Feb 13:21 collapse

yeah, that’s what I’m looking for. Do you know of a way to integrate ollama with HA?

lyralycan@sh.itjust.works on 21 Feb 13:48 collapse

I don’t think there’s a straightforward way like a HACS integration yet, but you can access Ollama from the web with Open WebUI and save the page to your home screen:

<img alt="" src="https://sh.itjust.works/pictrs/image/ae206b55-421f-4035-ac6a-a51774739fcd.jpeg">

Just be warned, you’ll need a lot of resources depending on which model you choose and its parameter count (4B, 7B, etc.). Gemma3 4B uses around 3GB of storage, 0.5GB of RAM, and 4GB of VRAM to respond. It’s a compromise, as I can’t get replacement RAM, and it tends to be wildly inaccurate with large responses. The one I’d rather use, Dolphin-Mixtral 22B, takes 80GB of storage and a minimum of 17GB of RAM, the latter of which I can’t afford to take from my other services.
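As a rough rule of thumb (my own back-of-the-envelope, not from the thread): weight memory is approximately parameter count × bytes per parameter at the chosen quantization, and runtime overhead (KV cache, activations) comes on top of that:

```python
def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Rough weight-only memory estimate: parameters x bytes per parameter.

    Ignores KV cache and activation overhead, so real usage is higher.
    """
    bytes_total = params_billion * 1e9 * (bits_per_param / 8)
    return bytes_total / 1e9

# A 4B model at 4-bit quantization needs roughly 2 GB for weights alone:
print(round(weight_memory_gb(4, 4), 1))   # → 2.0
# A 22B model at the same quantization, roughly 11 GB:
print(round(weight_memory_gb(22, 4), 1))  # → 11.0
```

This is why the parameter count and quantization level, more than anything else, decide whether a model fits on your card.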

excursion22@piefed.ca on 21 Feb 16:52 collapse

There’s an Ollama integration that adds it as a conversation agent.

hendrik@palaver.p3x.de on 21 Feb 17:08 next collapse

And there’s another custom component, integrating all servers with an OpenAI-compatible API endpoint: https://github.com/jekalmin/extended_openai_conversation
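For context on what “tool calling” through an OpenAI-compatible endpoint means here: the client sends the model a list of callable functions alongside the prompt, and the model replies with a structured call instead of prose. A minimal sketch of such a request body (the function name and schema are made up for illustration; no server is contacted):

```python
import json

# Hypothetical device-control tool exposed to the model.
request_body = {
    "model": "qwen3:8b",
    "messages": [
        {"role": "user", "content": "Turn off the kitchen lights"},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "light_turn_off",
                "description": "Turn off a light in the house",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "area": {"type": "string", "description": "Room name"},
                    },
                    "required": ["area"],
                },
            },
        }
    ],
}

# This JSON would be POSTed to the server's /v1/chat/completions endpoint;
# a capable model answers with a structured tool call naming
# "light_turn_off" and an "area" argument, which the integration executes.
payload = json.dumps(request_body)
```

This is the mechanism that lets the LLM decide *which* device action to take from free-form speech, rather than matching fixed phrases.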

wildbus8979@sh.itjust.works on 21 Feb 17:41 collapse
eager_eagle@lemmy.world on 21 Feb 20:38 collapse

ah, this puts it together and it’s exactly what I was looking for, thanks

penguin@lemmy.pixelpassport.studio on 21 Feb 13:10 next collapse

Home Assistant can do that; the quality will really depend on what hardware you have to run the LLM. If you only have a CPU, you’ll be waiting 20 seconds for a response, and the response itself could also be pretty poor if you have to run a small quantized model.

Kirk@startrek.website on 21 Feb 13:16 next collapse

Maybe things have improved but the last time I tried the Home Assistant er- assistant, it was garbage at anything other than the most basic commands given perfectly.

avidamoeba@lemmy.ca on 22 Feb 07:47 collapse

You gotta hook it to a local LLM. Then it’s boss.

Kirk@startrek.website on 22 Feb 09:21 collapse

Any pointers where to begin?

avidamoeba@lemmy.ca on 22 Feb 10:47 collapse

Install Ollama on a machine with a fast CPU or GPU and enough RAM. I currently use Qwen3, which takes 8GB of RAM and runs on an NVIDIA GPU; running it on CPU is also fast enough, and there’s a 4GB version which is also decent for device control. Then: add the Ollama integration in Home Assistant, connect it to the Ollama instance on the other machine, add Ollama as a conversation agent in Home Assistant’s voice assistant, and expose the HA devices you want to be controllable. That’s about it at a high level.
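Before pointing Home Assistant at the other machine, it’s worth checking that the server is reachable and the model is actually pulled. A small sketch using Ollama’s `GET /api/tags` endpoint, which lists locally available models (the host address and model name below are placeholders):

```python
import json
import urllib.error
import urllib.request


def ollama_model_available(base_url: str, model: str, timeout: float = 2.0) -> bool:
    """Return True if `model` is already pulled on the Ollama server at base_url.

    Queries Ollama's GET /api/tags endpoint, which lists local models;
    returns False if the server is unreachable or the model is missing.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            tags = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return False
    return any(m.get("name", "").startswith(model) for m in tags.get("models", []))


# e.g. ollama_model_available("http://192.168.1.10:11434", "qwen3")
```

If this returns False, fix the network path or `ollama pull` the model before touching the Home Assistant side.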

irotsoma@piefed.blahaj.zone on 21 Feb 14:46 next collapse

You have to run an LLM of your own and link it if you want quality even close to approaching Google’s, but Home Assistant with the Nabu Casa “Home Assistant Voice Preview Edition” speakers is working well enough for me. I don’t use it for much beyond controlling my home automation components, though. It’s still very early tech and it doesn’t understand all that much unless you add a lot of your own configuration. I eventually plan to add an LLM, but even just running on the Home Assistant Yellow hardware with a Raspberry Pi Compute Module 5 works OK for the basics, though there is a slight delay.

I haven’t tried it, but Nabu Casa also offers a subscription service for the voice processing if you want something more robust and can’t host your own LLM. That means sending your data out, though, even if they have good privacy policies, which I’m not interested in: while I somewhat trust Nabu Casa’s current business model and policies, being hosted in the US means it’s susceptible to the current regime’s police-state policies. I’m waiting for hardware costs to recover from the AI bubble to self-host an LLM, personally.

hendrik@palaver.p3x.de on 21 Feb 17:15 next collapse

LiveKit can be used to build voice assistants, but it’s more a framework for building an agent yourself than a ready-made solution.

cymor@midwest.social on 21 Feb 17:48 next collapse

Try ollama.com; you can download and try whatever you want. Quality is mostly a function of how much VRAM your video card has.

grue@lemmy.world on 21 Feb 19:32 next collapse

I don’t like the guy’s breathless over-enthusiasm, but NetworkChuck has a video on how to integrate LLM-based voice assistants with Home Assistant using Whisper and Ollama.
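For reference, the Whisper half of that kind of setup is usually exposed to Home Assistant as a speech-to-text service over the Wyoming protocol. A compose sketch along these lines (image tag, model choice, and port are assumptions on my part, so check them against current docs):

```yaml
# docker-compose.yml fragment (hypothetical example)
services:
  whisper:
    image: rhasspy/wyoming-whisper
    command: --model tiny-int8 --language en
    ports:
      - "10300:10300"  # Wyoming protocol port HA connects to
    volumes:
      - ./whisper-data:/data
```

In Home Assistant you’d then add the Wyoming integration pointed at that host and port, and pick it as the speech-to-text engine in the voice assistant pipeline.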

eager_eagle@lemmy.world on 21 Feb 20:29 collapse

ah yes, I stopped watching the guy because of that and the clickbait, but he does make some interesting content sometimes.

Mubelotix@jlai.lu on 22 Feb 09:01 collapse

He covers interesting subjects, but he believes his audience is dumb and unknowledgeable, which leads to this. He thinks he has to play the usual YouTube game to retain us, but he’s just boring everyone.

avidamoeba@lemmy.ca on 22 Feb 07:58 next collapse

HA with a local LLM on Ollama. You can integrate the Android app as the default phone assistant. I don’t think it can use a wake word on the phone, though. I invoke it by holding the power button, like a walkie-talkie.

billwashere@lemmy.world on 26 Feb 13:56 collapse

But what’s acting as the little box in your house that listens, like an Amazon Echo?