Overall, when tested on 40 prompts, DeepSeek was found to have a similar energy efficiency to the Meta model, but DeepSeek tended to generate much longer responses and therefore was found to use 87% more energy.
That’s kind of a weird benchmark. Wouldn’t you want a more detailed reply? How is quality measured? I thought the biggest technical feats here were the ability to run reasonably well in constrained memory settings and the lower cost to train (and less energy used there).
This is more about the “reasoning” aspect of the model, where it outputs a bunch of “thinking” before the actual result. In a lot of cases that easily adds 2-3x to the number of tokens that need to be generated. This isn’t really useful output. It’s the model getting itself into a state where it can respond better.
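As a rough sketch of why that matters for the energy comparison (the per-token figure here is a made-up placeholder, not a number from the article): if energy per generated token is roughly the same across models, total energy just scales with token count, so emitting ~87% more tokens means ~87% more energy.

```python
# Back-of-the-envelope: total energy scales with tokens generated if
# per-token energy is roughly equal across models. The constant below is
# an illustrative placeholder, not a measurement from the article.
ENERGY_PER_TOKEN_J = 0.3  # hypothetical joules per generated token


def response_energy(answer_tokens: int, reasoning_multiplier: float = 1.0) -> float:
    """Energy for one response; reasoning_multiplier captures the extra
    'thinking' tokens emitted before the final answer."""
    total_tokens = answer_tokens * reasoning_multiplier
    return total_tokens * ENERGY_PER_TOKEN_J


plain = response_energy(500)                                  # non-reasoning model
reasoning = response_energy(500, reasoning_multiplier=1.87)   # ~87% more tokens
print(f"plain: {plain:.0f} J, reasoning: {reasoning:.0f} J "
      f"({reasoning / plain - 1:.0%} more)")
```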
Longer != Detailed
Generally what they’re calling out is that DeepSeek currently rambles more. With LLMs the challenge is getting to the right answer as succinctly as possible, because each extra word costs time and money.
That being said, I suspect that really it’s all roughly the same. We’ve been seeing this back and forth with LLMs for a while and DeepSeek, while using a different approach, doesn’t really break the mold.
The benchmark feels just like the referenced Jevons Paradox to me: efficiency gains are eclipsed by a rise in consumption to produce more/better products.
A more detailed and accurate reply is preferred, but length isn’t a proxy for that. If anything, that’s the problem with most LLMs: they tend to ramble more than they need to, and it’s hard (at least with prompting alone) to rein that in and narrow the response down to just the answer.
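For what it’s worth, the usual blunt instruments are a terse system prompt plus a hard token cap. A minimal sketch, assuming the OpenAI Python client (the model name and prompts are placeholders), though in practice models still tend to pad their answers and a cap just truncates:

```python
# Sketch: trying to rein in verbosity with prompting plus a hard token cap.
# Assumes the OpenAI Python client; model name and prompts are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system",
         "content": "Answer in one sentence. No preamble, no caveats."},
        {"role": "user", "content": "Why is the sky blue?"},
    ],
    max_tokens=60,  # hard stop; this truncates output rather than making it more succinct
)

print(resp.choices[0].message.content)
```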