PumpkinEscobar

3 days ago

OpenAI to remove non-profit control and give Sam Altman equity

First, a caveat: you'll need a beefy GPU to run larger models, though there are some smaller models that perform pretty well.

Adding some extra information for you or anyone else who might want to get into running models locally.

Tools

Ollama - great app for downloading/managing/running models locally
OpenWebUI - A web app that provides a UI like the ChatGPT web app, but can use local models
continue.dev - A VS Code extension that can use Ollama to provide a GitHub Copilot-like AI assistant running against a local model (it can also connect to Anthropic Claude, etc.)
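
If you'd rather script against a local model than go through one of the UIs, here is a rough sketch of calling Ollama's HTTP API from Python. It assumes Ollama is running on its default local port (11434) and that a model tagged llama3.1:8b has already been pulled; the helper names build_request and ask are made up for illustration.

```python
import json
import urllib.request

# Assumption: a default local Ollama install listening on port 11434.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt, model="llama3.1:8b"):
    """Build the JSON body for a single non-streaming generation request."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(prompt, model="llama3.1:8b", url=OLLAMA_URL):
    """Send the prompt to a running Ollama server and return the generated text."""
    body = json.dumps(build_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]

# Example (needs the server running): print(ask("Explain quantization in one sentence."))
```

Setting "stream" to False asks for one complete JSON reply instead of a stream of partial chunks, which keeps the sketch short.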

Models

If you look at https://ollama.com/library?sort=featured you can browse the available models.

Model size is measured by parameter count. Generally, models with more parameters are better ("smarter", more accurate), but it's very challenging/slow to run anything over 25b parameters on consumer GPUs. I tend to find 8-13b parameter models are a sort of sweet spot. The 1-4b parameter models are meant more for really low-power devices; they'll give you OK results for simple requests and summarizing, but they're not going to wow you.

If you look at the 'tags' for the models listed below, you'll see things like 8b-instruct-q8_0 or 8b-instruct-q4_0. The q part refers to quantization, i.e. shrinking/compressing the model, and the number after it is roughly how many bits are kept per weight, so smaller numbers mean more aggressive compression. Note the size of each tag and how it shrinks as the quantization gets more aggressive. You can roughly think of this size number as "how much video RAM do I need to run this model". For me, I try to aim for q8 models, or fp16 if they can run on my GPU. I wouldn't try to use anything below q4 quantization; there seems to be a lot of quality loss below q4. Models can run partially or even fully on a CPU, but that's much slower. Ollama doesn't yet support the NPUs found in new laptops/processors, but work is happening there.
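
To make the size math concrete, here's a back-of-the-envelope sketch in Python. It treats the number after q as the nominal bits per weight and fp16 as 16 bits, and multiplies by the parameter count from the tag. The function names are made up for illustration, and this is only a floor: the sizes listed on the tags page tend to be somewhat larger, and you need extra headroom for context.

```python
import re

def parse_tag(tag):
    """Return (billions of parameters, nominal bits per weight) for a tag
    such as '8b-instruct-q4_0' or '8b-instruct-fp16'."""
    size = re.match(r"(\d+(?:\.\d+)?)b", tag)
    if size is None:
        raise ValueError(f"no parameter count found in {tag!r}")
    quant = re.search(r"-q(\d+)", tag)
    if quant:
        bits = int(quant.group(1))  # q8 -> 8 bits, q4 -> 4 bits (nominal)
    elif "fp16" in tag:
        bits = 16
    else:
        raise ValueError(f"no quantization found in {tag!r}")
    return float(size.group(1)), bits

def estimate_gb(tag):
    """Rough weight size in GB: parameters * bits per weight / 8 bits per byte."""
    params_b, bits = parse_tag(tag)
    return params_b * bits / 8

for tag in ("8b-instruct-fp16", "8b-instruct-q8_0", "8b-instruct-q4_0"):
    print(f"{tag}: ~{estimate_gb(tag):.1f} GB")
# prints:
# 8b-instruct-fp16: ~16.0 GB
# 8b-instruct-q8_0: ~8.0 GB
# 8b-instruct-q4_0: ~4.0 GB
```

So an 8b model at q4 fits comfortably on an 8 GB card, q8 is tight, and fp16 wants something in the 24 GB class.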

Llama 3.1 - The 8b instruct model is pretty good, decent speed and good quality. This is a good "default" model to use
Llama 3.2 - This model was just released yesterday. I'm only seeing the 1b and 3b models right now. They've changed the 8b model to 11b, and I'm assuming the 11b model is going to be my new go-to when it's available.
Deepseek Coder v2 - A great coding assistant model
Command-r - This is a more niche model, mainly useful for RAG. It's only available in a 35b parameter model, so not all that feasible to run locally
Mistral small - A really good model, in the ballpark of Llama. I haven't had quite as much luck with this as with Llama, but it is good. I just saw that a new version was released 8 days ago, so I'll need to check it out again.

3 days ago

OpenAI to remove non-profit control and give Sam Altman equity

It’s a good thing that real open source models are getting good enough to compete with or exceed OpenAI.

5 days ago

What are the best fictional books you’ve ever read?

The Moon is a Harsh Mistress by Heinlein, who also wrote Starship Troopers. Starship Troopers is also great and pretty different from the movie.

1 wk. ago

Are modern LLMs closer to AGI or next word predictor? Where do they fall in this graph with 10 on x-axis being human intelligence.

I'll preface by saying I think LLMs are useful and in the next couple years there will be some interesting new uses and existing ones getting streamlined...

But they're just next word predictors. The best you could say about intelligence is that they have an impressive ability to encode knowledge in a pretty efficient way (the storage density, not the execution of the LLM), but there's no logic or reasoning in their execution or interaction with them. It's one of the reasons they're so terrible at math.

1 wk. ago

Zelda-Inspired Plucky Squire Shows What Happens When A Game Doesn't Trust Its Players

I like the game, but agree with the over-tutorialed complaints. They have two difficulty modes, and I wish only story mode got all the handholding. I think there are enough obvious indicators to get you through all the game mechanics.

1 wk. ago

What comes to mind?

It has been on my list to figure out how to move to Forgejo. I need to do it soon, before the migration process breaks or gets awful.

2 wk. ago

any actually good ear buds out there?

The Beats Fit Pros are nice. I don’t do a ton of exercise in them but think they’d handle all the issues you mentioned pretty well.

2 wk. ago

I redid the meme with what hurts me

Coming from C#, then TypeScript and Next.js, Rye feels very intuitive and like a nice bridge / gateway drug into Python.

2 wk. ago

Removed

Do You Still Use Git in the Terminal?

VS Code’s git features are pretty good for staging changes, resolving merge conflicts, pushing changes. I still do most branch changing and creating with the CLI, and yeah, any sort of problem generally needs the CLI.

We’ve also been using Graphite at work and there’s a lot I like about it. They have a VS Code extension I haven’t used in a while, but their CLI is pretty nice.

3 wk. ago

Elon Musk on pace to become world’s first trillionaire by 2027, report says

surely he'll be less of a twat then. right?

3 wk. ago

South-up map orientation

Cuz it's freaking me out...

4 wk. ago

[Discussion] Of all the films you’ve gone into blind, which one truly stands out as your favorite find?

Donnie Darko - Just such a great, strange movie

4 wk. ago

Trump bizarrely claims people have stopped eating bacon because of wind power

I guess it wasn't bacon I hate for breakfast yesterday.

Why do you hate bacon, are you a windmill?

1 mo. ago

Looking for software KVM I can't remember the name of (solved)

Lan-mouse looks great but keep in mind that there’s no network encryption right now. There is a GitHub ticket open and the developer seems eager to add encryption. It’s just worth understanding that all your keystrokes are going across the network unencrypted.

1 mo. ago

Ken Paxton Is in Big Trouble After Raiding Homes of Latino Democrats

Things I will bet money on

They will produce no evidence of any wrongdoing uncovered from any of these raids
They will give some cryptic statement that tries to make it sound like they did find something
Texas lawmakers will continue to not hold Paxton accountable for anything

1 mo. ago

How can I improve my communication with a friend I like?

Shoot your shot, player.

Don’t go crazy or over the top, don’t overdo it, but just say it. If they’re a good friend they won’t be scared away. If they’re like you that way you’ll both be happier.

Don’t overthink it, ask them if they’d ever like to hang out or do something more like a date.

Ballsy, direct, badass. That can be you.

Dating is awkward but life gets a lot better once you get more comfortable with it. Everyone is a dating idiot until they’re not. There’s a good chance your friend is still in the idiot stage, and maybe he’ll be over the moon that you helped push through it.

1 mo. ago

Anyone having issues with newer laptop's?

Rather than distro hopping, maybe try a Zen kernel, compiling the kernel yourself with a different config and scheduler, or a newer version of the stock kernel?

I’m not super current on what’s in each kernel but I’d expect latest mainline to handle newer processors better than some of the older stable kernels in some of the more mainstream slower releasing distros.

1 mo. ago

What are some good flight simulator games?

The realism is amazing

1 mo. ago

anyone who uses Linux on apple silicon or another arm device

Ran Asahi for several months, tried it out again recently. It’s good/fine, I just don’t love Fedora.

There’s some funkiness with the more complicated install, the AI acceleration doesn’t work, and there’s no Thunderbolt / docking station support.

MacBooks are great hardware but I don’t think they’re the best option for Linux right now. If you’re never going to boot into macOS then I’d look for x13, new Qualcomm, or isn’t there a Framework arm64 option now, or was that a RISC-V module?

I’m also assuming you’re not looking to do any gaming? Because gaming on ARM is not really a thing right now and doesn’t feel like it will be for a long while.

1 mo. ago

Meta cancels Vision Pro competitor, which was too pricey to ‘sell well’

I’m really curious how the visor headset gets reviewed and performs. Their subscription pricing model is interesting.

VR has had some interesting success in the last few years but it feels like a tough job to strike the right balance on cost and performance.