Selfhosted

A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.


Frustratingly bad at self hosting. Can someone help me access LLMs on my rig from my phone (lemmy.zip)

submitted 2 months ago by BlackSnack@lemmy.zip to c/selfhosted@lemmy.world


TL;DR

Can someone give me step-by-step instructions (ELI5) on how to access the LLMs on my rig from my phone?

Jan seems the easiest, but I've also tried Ollama, LibreChat, etc.

.....

I've taken steps to secure my data and now I'm going the self-hosting route. I don't care to become a savant with the technical aspects of this stuff, but even the basics are hard to grasp! I've been able to install LLM providers on my rig (Ollama, LibreChat, Jan, all of 'em) and I can successfully get models running on them. BUT what I would LOVE to do is access the LLMs on my rig from my phone while I'm nearby. I've read that I can do that over Wi-Fi or the LAN or something like that, but I have had absolutely no luck. Jan seems the easiest because all you have to do is something with an API key, but I can't even figure that out.

Any help?
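For readers following along, here is a minimal sketch of what "access over the LAN" can look like with Ollama, since OP already has it installed. It assumes Ollama is running on the rig with the `OLLAMA_HOST` environment variable set to `0.0.0.0` so it listens on the network instead of only on localhost, and that it uses its default port 11434. The IP address `192.168.1.50` and the model name `llama3.2` are placeholders, not values from this thread; the helper names `build_generate_request` and `ask` are made up for illustration.

```python
import json
import urllib.request

OLLAMA_PORT = 11434  # Ollama's default port


def build_generate_request(host, model, prompt, port=OLLAMA_PORT):
    """Build a POST request for Ollama's /api/generate endpoint."""
    url = f"http://{host}:{port}/api/generate"
    # "stream": False asks for one JSON object instead of a stream of chunks.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(host, model, prompt, timeout=120):
    """Send the prompt to the rig and return the generated text."""
    req = build_generate_request(host, model, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example call from another device on the same network
# (placeholder IP and model name, replace with your own):
# print(ask("192.168.1.50", "llama3.2", "Say hello in five words."))
```

A phone chat app that speaks the Ollama API would be pointed at the same `http://<rig-ip>:11434` address; the script is only meant to show what travels over the wire. Both devices need to be on the same network, and the rig's firewall has to allow the port.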


tal@lemmy.today 1 point 2 months ago* (last edited 2 months ago)

Ollama does have some features that make it easier to use for a first-time user, including:

Automatically calculating how many layers fit in VRAM, loading that many onto the GPU, and keeping the rest in main memory to run on the CPU. llama.cpp can't do that automatically yet.
Automatically unloading the model from VRAM after a period of inactivity.
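To make the first point concrete, here is a rough sketch of the arithmetic behind a GPU/CPU layer split. It is not Ollama's actual algorithm: it assumes every layer is the same size and that a fixed amount of VRAM is held back for things like the context cache, and the function name and numbers are invented for illustration.

```python
def split_layers(free_vram_bytes, layer_bytes, total_layers, reserve_bytes=0):
    """Return (gpu_layers, cpu_layers) under a simplified equal-size-layer model.

    reserve_bytes is VRAM held back for everything that is not model weights
    (an assumption of this sketch, not a value taken from any real tool).
    """
    if layer_bytes <= 0:
        raise ValueError("layer_bytes must be positive")
    usable = max(free_vram_bytes - reserve_bytes, 0)
    # As many whole layers as fit in the usable VRAM, capped at the model size.
    gpu_layers = min(total_layers, usable // layer_bytes)
    return gpu_layers, total_layers - gpu_layers


MIB = 1024 ** 2
GIB = 1024 ** 3

# Hypothetical card and model: 8 GiB free, 1 GiB reserved, 32 layers of 400 MiB.
gpu, cpu = split_layers(8 * GIB, 400 * MIB, 32, reserve_bytes=1 * GIB)
print(gpu, cpu)  # 17 15
```

With llama.cpp, a number like the first one is what you would pass by hand through the `--n-gpu-layers` (`-ngl`) option.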

I had an easier time setting up ollama than other stuff, and OP does apparently already have it set up.

brucethemoose@lemmy.world 1 point 2 months ago* (last edited 2 months ago)

Yeah. But it also messes stuff up relative to the llama.cpp baseline, hides or doesn't support some features/optimizations, and definitely doesn't support the more efficient iq_k quants of ik_llama.cpp or its specialized MoE offloading.

And that's not even getting into the various controversies around ollama (like broken GGUFs or indications they're going closed source in some form).

...It just depends on how much performance you want to squeeze out, and how much time you want to spend on the endeavor. Small LLMs are kinda marginal though, so IMO squeezing out that performance is important if you really want to try; otherwise one is probably better off spending a few bucks on an API that doesn't log requests.