

Not to mention battery life…
It’s new, so reviews are just filtering out, but it’s starting to look like the SteamOS-powered version of the Legion Go S (the Z1 Extreme version) is a pretty great handheld. It uses the latest AMD chipset with a sizable assist from Linux/Proton efficiencies vs. Windows to drive a 15-30% performance improvement, which does make some more modern games more playable, though it is significantly more expensive than the Deck. I watched Retro Game Corps’ review of it yesterday. That said, if you’re okay waiting another couple of years or so, I bet there will be a Steam Deck 2 release, but that seems to rest mainly on AMD delivering a significant (“generational”) leap with upcoming mobile APUs. Valve seems keen on not releasing a follow-up to the first Deck until it is significantly better in every way, and the chipsets available now just aren’t quite there yet.
It sounds like we’re on similar levels of technical proficiency. I’ve learned a lot by reading and going down rabbit holes on how LLMs work, and how to troubleshoot and even optimize them to an extent, but I’m certainly not a computer engineer or programmer.
I started with LM Studio before Ollama/Open WebUI, and it does have some good features and an overall decent UI. I switched because OWUI seems to have more extensibility with tools, functions, etc., and I wanted something I could run as a server and use from my phone and laptop elsewhere. OWUI has been great for that, although setting up remote access to the server over the web took a lot of trial and error. The OWUI team also develops and updates the software very quickly, so that’s great.
I’m not familiar with text-generation-webui, but at this point I’m not really wanting much more out of a setup than my Docker stack with Ollama and OWUI.
Can you tell me more about this? I’ve considered trying to build and self-host something for home automation that would essentially be a FOSS and locally run Alexa/Google Assistant. Is this what you’re doing? How exactly does Ollama fit in?
Mostly to help quickly pull together and summarize/organize information from web searches done via Open WebUI.
Also to edit copy or brainstorm ideas for messaging, scripts, etc.
Sometimes to have discussions around complex topics to ensure I understand them.
My favorite model to run locally now is easily Qwen3-30B-A3B. It can do reasoning or quicker non-reasoning responses, and it runs very well on my RTX 3090’s 24 GB of VRAM. Plus, because it has a MoE architecture with only ~3B parameters active during inference, it’s lightning fast.
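For anyone curious, here’s a minimal sketch of hitting a local Ollama instance from Python with that model (the model tag is an assumption; check `ollama list` for the exact name on your machine):

```python
import requests

# Ask the local Ollama server (default port 11434) a question.
# "qwen3:30b-a3b" is assumed; substitute whatever tag `ollama list` shows.
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "qwen3:30b-a3b",
        "messages": [{"role": "user", "content": "Explain MoE models in two sentences."}],
        "stream": False,
    },
)
print(resp.json()["message"]["content"])
```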
Interesting project. Is it actually possible to track workouts with your phone or smartwatch without proprietary third-party apps like Strava or Garmin Connect, though?
Thanks! Now I feel a bit lazy, since there was a whole wiki article on their relations that I could have just looked up. Sorry, it’s early in the morning here… appreciate it.
Upvoted for the info about Greek support for Israel. I didn’t know that was a thing. Any other context for why that could be?
Oh no! I don’t have any frost on mine, but I have several varieties of tomatoes and peppers that have been mostly uncovered through cool nights down to 40-42 °F, and I’ve still been a bit worried about them.
Does PieFed have native Android or iOS apps?
Sure but that’s not as weird haha
I figured. Crazy color, though. Kinda cool. I’ll probably check the place out if I get a chance; I’m not too far away.
Interesting. What was the name of the shop? And what is activated charcoal coconut? I’m assuming it isn’t just charcoal since that would probably taste terrible haha
Interesting. I’ve always felt that the Steam Deck loses quite a bit of battery percentage during sleep. I agree it would be a fantastic quality-of-life update to let it shut down, or enter some form of lower-power hibernation state, after a period of time at a certain battery level.
I think some of the responses here, while they may be well-intentioned, are a bit off base because they’re confusing the word ‘normal’ (which is what you’re asking about) with ‘recommended’.

Is it normal for someone in your position, who has had a lot of time alone due to health challenges, to want more social outlets and someone to talk to? Absolutely. Is it normal to see a character in a work of art (a video game, movie, show, etc.) and become attached to them to some degree? I would say yes; most of us have done that at one point or another. Is it normal to want to be able to communicate with that character to provide a social outlet and a listening ear for someone in your situation? Again, I’d say probably yeah.

Now, is it recommended to have an ‘AI’ like this as your primary social outlet, or to see them as a real human friend or even romantic partner? That is much more questionable. But personally, with the context you provided and the challenging situation you’ve been in, I think the tendency toward doing this is still quite normal and understandable.

I think you should validate your feelings of loneliness and the understandable desire to assuage them with what you have available in a challenging and socially isolating environment, while still understanding that an ‘AI’ like this should not ideally be your primary social outlet, and striving to find more ways in the future to connect with real people who care about and are interested in you (and vice versa). It may not seem like it right now, but they are out there! I wish you peace and a speedy recovery!
That’s deregulated capitalism for ya.
Fixed it for you.
Looks like it now has Docling Content Extraction Support for RAG. Has anyone used Docling much?
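For anyone who hasn’t tried it, standalone Docling usage looks roughly like this in Python, going off its quickstart (the file path is a placeholder):

```python
from docling.document_converter import DocumentConverter

# Convert a document (local path or URL) into Docling's structured format.
converter = DocumentConverter()
result = converter.convert("sample.pdf")  # placeholder path

# Export to Markdown, which is handy for chunking/embedding in a RAG pipeline.
print(result.document.export_to_markdown())
```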
Oh, and I typically get 16-20 tok/s running a 32b model on Ollama through Open WebUI. Also, I’ve experienced issues with 4-bit quantization of the K/V cache on some models myself, so just FYI.
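If you want to sanity-check your own numbers, Ollama’s generate endpoint returns timing metrics you can turn into tok/s. A rough sketch (the model tag is just an example):

```python
import requests

# Non-streaming generation so the response includes the final timing metrics.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "qwen2.5:32b", "prompt": "Write a haiku about GPUs.", "stream": False},
).json()

# eval_count = tokens generated; eval_duration is in nanoseconds.
tok_per_s = resp["eval_count"] / resp["eval_duration"] * 1e9
print(f"{tok_per_s:.1f} tok/s")
```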
It really depends on how you quantize the model, and the K/V cache as well. This is a useful calculator: https://smcleod.net/vram-estimator/ I can comfortably fit most 32b models quantized to 4-bit (usually Q4_K_M or IQ4_XS) on my 3090’s 24 GB of VRAM with a reasonable context size. If you need a much larger context window to feed in large documents, etc., then you’d want to go smaller on the model (14b, 27b, etc.), get a multi-GPU setup, or go with something with unified memory and a lot of RAM (like the Mac Minis others are mentioning).
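If you’d rather ballpark it yourself than use the calculator, the arithmetic is basically weights plus K/V cache. A rough sketch with illustrative numbers for a ~32b GQA model (the layer/head counts are assumptions, not specs for any particular release):

```python
# Back-of-envelope VRAM estimate: model weights + K/V cache.

def weights_gib(params_b: float, bits_per_weight: float) -> float:
    # params * bits / 8 bytes, converted to GiB.
    return params_b * 1e9 * bits_per_weight / 8 / 2**30

def kv_cache_gib(layers: int, kv_heads: int, head_dim: int,
                 context: int, bits: int) -> float:
    # 2x for keys and values; one entry per layer, per KV head, per position.
    return 2 * layers * kv_heads * head_dim * context * bits / 8 / 2**30

w = weights_gib(32, 4.5)                  # ~4-bit quants average ~4.5 bits/weight
kv = kv_cache_gib(64, 8, 128, 16384, 16)  # fp16 cache, 16k context (assumed shape)
print(f"weights ~{w:.1f} GiB + KV ~{kv:.1f} GiB = ~{w + kv:.1f} GiB")
```

With those assumptions it comes out around 21 GiB, which lines up with a 4-bit 32b model fitting on a 24 GB card with a decent context.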
OpenRouter provides some limited free usage of popular LLMs with context sizes up to 175k tokens and beyond. This is probably as good as you’ll get for completely free. The prices per million tokens are usually pretty reasonable as well, if you don’t mind paying a bit. https://openrouter.ai/
Edit: looks like there’s a free Gemini Pro option with a million-plus token context size as well.
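And since OpenRouter is OpenAI-API compatible, trying a free model is basically just a base-URL swap. A minimal sketch (the model slug below is hypothetical; pick a real one from openrouter.ai/models, where free-tier options carry a “:free” suffix):

```python
from openai import OpenAI

# OpenRouter speaks the OpenAI chat completions API at its own base URL.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # your OpenRouter API key
)

resp = client.chat.completions.create(
    # Hypothetical slug; browse openrouter.ai/models for real ":free" options.
    model="some-provider/some-model:free",
    messages=[{"role": "user", "content": "Summarize the pros of MoE models in one line."}],
)
print(resp.choices[0].message.content)
```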