What AI services are you selfhosting? Or, have tested and passed on

kiol@lemmy.world · 1 day ago

What AI services are you selfhosting? Or, have tested and passed on

SmokeyDope@lemmy.world · edit-2 23 hours ago

I run kobold.cpp, a cutting-edge local model engine, on my gaming rig turned server. I like to play around with the latest models to see how they improve/change over time. The current chain-of-thought thinking models like the DeepSeek R1 distills and Qwen QwQ are fun to poke at with advanced open-ended STEM questions.

STEM questions like “What does Gödel’s incompleteness theorem imply about scientific theories of everything?” Or “Could the speed of light be more accurately referred to as ‘the speed of causality’?”

As for actual daily use, I prefer using Mistral Small 24B and treating it like a local search engine with the legitimacy of Wikipedia. It’s a starting point to ask questions about general things I don’t know about or want advice on, then do further research through more legitimate sources.

It’s important to not take the LLM too seriously, as there’s always a small statistical chance it hallucinates some bullshit, but most of the time it’s fairly accurate and is a pretty good jumping-off point for further research.

Let’s say I want an overview of how to repair small holes forming in concrete, or general ideas on how to invest financially, how to change fluids in a car, how much fat and protein is in an egg, etc.

If the LLM says a word or related concept I don’t recognize I grill it for clarifying info and follow it through the infinite branching garden of related information.

I’ve used an LLM to help me go through old declassified documents and speculate on internal gov terminology I was unfamiliar with.

I’ve used a text-to-speech model to get it to speak, just for fun. I’ve used a multimodal model to get it to see/scan documents for info.

I’ve used websearch to get the model to retrieve information it didn’t know off a DDG search, again mostly for fun.

Feel free to ask me anything, I’m glad to help get newbies started.

ikidd@lemmy.world · 21 hours ago

LMStudio is pretty much the standard. I think it’s opensource except for the UI. Even if you don’t end up using it long-term, it’s great for getting used to a lot of the models.

Otherwise there’s OpenWebUI, which I would imagine would work as a docker compose, as I think there are ARM images for OWU and ollama.

kata1yst@sh.itjust.works · 1 day ago

I use Ollama & Open-WebUI: Ollama on my gaming rig and Open-WebUI as a frontend on my server.

It’s been a really powerful combo!
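
For anyone curious what that split looks like under the hood: the frontend just talks to Ollama over HTTP. Below is a minimal sketch, assuming Ollama is reachable on its default port 11434. The hostname `gaming-rig.lan` and the model name `mistral-small` are placeholders I made up, not details from this setup.

```python
import json
import urllib.request

# Hypothetical hostname for the machine running Ollama; 11434 is Ollama's default port.
OLLAMA_URL = "http://gaming-rig.lan:11434"


def build_generate_request(prompt, model="mistral-small", base_url=OLLAMA_URL):
    """Build (but do not send) a POST request for Ollama's /api/generate endpoint."""
    # "stream": False asks for a single JSON reply instead of a token stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the prompt to the remote Ollama instance and return its text reply."""
    req = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask("Why is the sky blue?")` from the server would then be answered by the GPU box, provided Ollama there is configured to accept connections from other hosts.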

kiol@lemmy.world · 1 day ago

Would you please talk more about it? I forgot about Open-WebUI, but I’m intending to start playing with it. Honestly, what do you actually do with it?

mac@lemm.ee · edit-2 24 hours ago

I have Linkwarden pointed at my ollama deployment, so it auto tags links that I archive which is nice.

I’ve seen other people send images captured by their security cameras in Frigate to ollama to get it to describe the image.

There’s a bunch of other use cases I’ve thought of for coding projects, but haven’t started on any of them yet
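
The camera snapshot idea boils down to attaching a base64-encoded image to an Ollama request. A rough sketch of just the request body, assuming a vision-capable model has been pulled (the name `llava` here is only an example) and leaving out how the snapshot gets fetched from the camera software:

```python
import base64
import json


def build_describe_payload(image_bytes, model="llava",
                           prompt="Describe what is happening in this image."):
    """Return the JSON body for Ollama's /api/generate with one attached image."""
    return json.dumps({
        "model": model,  # assumption: any multimodal model available locally
        "prompt": prompt,
        # Ollama expects images as a list of base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    })
```

The resulting string would be POSTed to `/api/generate` on the Ollama host, and the description comes back in the `response` field of the reply.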

Lucy :3@feddit.org · 1 day ago

Sex chats. For other uses, just simple searches are better 99% of the time. And for the 1%, something like Kagi’s FastGPT helps to find the correct keywords.

truxnell@infosec.pub · 24 hours ago

I run ollama and auto1111 on my desktop when it’s powered on. I run open-webui in my homelab, always on, and also connected to openrouter. This way I can always use openwebui with openrouter models, and it’s pretty cheap per query and a little more private than using a big tech chatbot. And if I want local, I turn on the desktop and have local llama and stable diffusion.

I also get bugger all benefit out of it; it’s a cute toy.

kiol@lemmy.world · 22 hours ago

How do you like auto1111? I’ve never heard of it.

colourlesspony@pawb.social · 1 day ago

I messed around with home assistant and the ollama integration. I have passed on it and just use the default one with voice commands I set up. I couldn’t really get ollama to do or say anything useful. Like I asked it what’s a good time to run on a treadmill for beginners and it told me it’s not a doctor.

metoosalem@feddit.org · 1 day ago

Like I asked it what’s a good time to run on a treadmill for beginners and it told me it’s not a doctor.

Kirkland brand Meeseeks energy.

psmgx@lemmy.world · 19 hours ago

Hey now, Kirkland brand is respectable, usually premium brands repackaged. Such as how Costco vodka was secretly (“secretly”) Grey Goose.

Starfighter@discuss.tchncs.de · edit-2 1 day ago

There are some experimental models made specifically for use with Home Assistant, for example home-llm.

Even though they are tiny 1-3B I’ve found them to work much better than even 14B general purpose models. Obviously they suck for general purpose questions just by their size alone.

That being said they’re still LLMs. I like to keep the “prefer handling commands locally” option turned on and only use the LLM as a fallback.
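
To illustrate the “local first, LLM as fallback” idea in general terms, here is a toy sketch. This is not how Home Assistant implements it internally; the function and the intent table are made up purely for illustration.

```python
def handle_command(text, local_intents, llm_fallback):
    """Try exact local intents first; only call the LLM when nothing matches."""
    # Normalize case and whitespace so "Turn on  the lights" still matches.
    key = " ".join(text.lower().split())
    action = local_intents.get(key)
    if action is not None:
        return ("local", action())
    # Nothing matched locally, so hand the original text to the (slower) LLM.
    return ("llm", llm_fallback(text))
```

With `local_intents = {"turn on the lights": some_function}`, the common commands never touch the model at all, which is why the small models only need to cope with the leftovers.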

SmokeyDope@lemmy.world · 23 hours ago

Sounds like ollama was loaded up with an either overly censored or plain brain dead language model. Do you know which model it was? Maybe try mistral if it fits in your computer.

kiol@lemmy.world · 1 day ago

Haha, that is hilarious. Sounds like it gave you some snark. afaik you have to clarify by asking again when it says such things. “I’m not asking for medical advice, but…”

RonnyZittledong@lemmy.world · 1 day ago

None currently. Wish I could afford a GPU to play with some stuff.

state_electrician@discuss.tchncs.de · 9 hours ago

Yeah. I have a mini PC with an AMD GPU. Even if I were to buy a big GPU I couldn’t use it. That frustrates me, because I’d love to play around with some models locally. I refuse to use anything hosted by other people.

moomoomoo309@programming.dev · 7 hours ago

Your M.2 port can probably fit an M.2 to PCIe adapter and you can use a GPU with that. ollama supports AMD GPUs just fine nowadays (well, as well as it can, ROCm is still very hit or miss).

state_electrician@discuss.tchncs.de · 6 hours ago

Oh, then I need to give it another try.

kiol@lemmy.world · 1 day ago

Well, let me know your suggestions if you wish. I took the plunge and am willing to test on your behalf, assuming I can.

Rikudou_Sage@lemmings.world · 23 hours ago

Try running an AI Horde worker, it’s a really great service!

kiol@lemmy.world · 22 hours ago

Not sure I know what that is. As in Hoarder?

Rikudou_Sage@lemmings.world · 14 hours ago

It’s a cluster of workers where everyone can generate images/text using workers connected to the service.

So if you ran a worker, people could generate stuff using your PC. For that you would gain kudos, which in turn you can use to generate stuff on other people’s computers.

Basically you do two things: you help ordinary people without access to powerful machines, and you bank your spare capacity as kudos that you can spend whenever you want, even on the road where you can’t turn on your PC.

Grandwolf319@sh.itjust.works · 1 day ago

I have Immich that has AI searching for my photos. Pretty useful for finding stuff actually

gdog05@lemmy.world · 9 hours ago

Once I changed the default model, immich search became amazing. I want to show it off to people but alas, way too many NSFW pics in my library. I would create a second “clean” version to show off to people but I’ve been too lazy.

superglue@lemmy.dbzer0.com · 21 hours ago

Can anyone suggest a model for light coding? I’m on a 3070 mobile.

ikidd@lemmy.world · 21 hours ago

Claude is the standard that all others are judged by. But it’s not cheap.

Gemini is pretty good, and Qwen-coder isn’t bad. I’d suggest you watch a few vids on GosuCoder’s YT channel to see what works for you, he reviews a pile of them and it’s quite up to date.

And if you use VScode, I highly recommend the Roocode extension. Gosucoder also goes into revising the roocode prompt to reduce costs for Claude. Another extension is Cline.

NeatoBuilds@lemmy.today · 1 day ago

I have immich machine learning and ollama with openwebui

I use immich search a lot to find things like pictures of the side of the road to post on my community !sideoftheroad@lemmy.today

I almost never use the ollama though, not really sure what to do with it other than ask it dumb questions just to see what it says

I use the duckduckgo one when it automatically has an answer to something I searched, but it’s not too reliable.

acockworkorange@mander.xyz · 22 hours ago

I am curious about trying an application specific AI. Like just for coding, for instance. I assume the memory requirements would be much lower.

kiol@lemmy.world · 22 hours ago

afaik Ollama would fit that bill, but perhaps others can chime in. You could probably run it on your local computer with a small model based on CPU alone.

acockworkorange@mander.xyz · 15 hours ago

I haven’t sunk much time into it, but I’m not aware of any training data focusing on code only. There’s nothing preventing me from running with general purpose data, but I imagine I’d have a snappier response with a smaller, focused dataset, without losing accuracy.

y0shi@lemm.ee · 1 day ago

I’ve an old gaming PC with a decent GPU lying around and I’ve thought of doing that (currently use it for linux gaming and GPU related tasks like photo editing etc). However, I’m currently stuck using LLMs on demand locally with ollama. The energy costs of having it powered on all the time for on-demand queries seem a bit overkill to me…

pezhore@infosec.pub · 1 day ago

I put my Plex media server to work doing Ollama - it has a GPU for transcoding that’s not awful for simple LLMs.

y0shi@lemm.ee · 1 day ago

That sounds like a great way of leveraging existing infrastructure! I host Plex together with other services on a server with an Intel CPU capable of transcoding. I’m quite sure I would get much better performance with the GPU machine, might end up following this path!

kiol@lemmy.world · 1 day ago

Have to agree on that. Certainly only makes sense to have up when you are using it.

Helmaar@lemmy.world · 1 day ago

I was able to run a distilled version of DeepSeek on Linux. I ran it inside a Podman container with ROCm support (I have an AMD GPU). It wasn’t super fast, but for a locally deployed and self-hosted option the performance was okay. Apart from that I have deployed Fooocus for image generation in a similar manner. Currently, I am working on deploying Stable Diffusion with either ComfyUI or Automatic1111 inside a Podman container with ROCm support.

kiol@lemmy.world · 1 day ago

Didn’t know about these image generation tools, besides Stable Diffusion. Thanks!
