Article: https://proton.me/blog/deepseek

Calls it “Deepsneak”, while failing to make clear that the reason people love DeepSeek is that you can download it and run it securely on your own private devices or servers, unlike most of the competing SOTA AIs.

I can’t speak for Proton, but the last couple of weeks have shown some very clear biases coming out.

  • lily33@lemm.ee · 22 hours ago

    To be fair, most people can’t actually self-host DeepSeek, but there are already other providers offering API access to it.
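
    Most of those providers expose an OpenAI-compatible endpoint, so calling one can be as simple as the sketch below (the base URL, model name, and API key are placeholders, not any particular provider’s real values):

    ```python
    # Hypothetical provider hosting DeepSeek R1 behind an OpenAI-compatible API.
    # The endpoint and model identifier are illustrative placeholders.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://example-provider.com/v1",  # placeholder endpoint
        api_key="YOUR_API_KEY",
    )

    resp = client.chat.completions.create(
        model="deepseek-r1",  # exact model name varies by provider
        messages=[{"role": "user", "content": "Why is the sky blue?"}],
    )
    print(resp.choices[0].message.content)
    ```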

    • halcyoncmdr@lemmy.world · 22 hours ago

      There are plenty of step-by-step guides for running DeepSeek locally. Hell, someone even had it running on a Raspberry Pi. It seems to be much more efficient than the current alternatives.

      That’s about as openly available to self-host as you can get without a one-button installer.
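
      As a concrete sketch of what that looks like, assuming Ollama and its Python client (pip install ollama) are installed (note this pulls one of the distilled variants discussed further down the thread, not the full R1):

      ```python
      import ollama

      # Download the weights if they are not already cached locally.
      ollama.pull("deepseek-r1:14b")

      # Run a prompt entirely on local hardware.
      response = ollama.chat(
          model="deepseek-r1:14b",
          messages=[{"role": "user", "content": "Explain mixture-of-experts in two sentences."}],
      )
      print(response["message"]["content"])
      ```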

      • tekato@lemmy.world · 21 hours ago

        You can run an imitation of the DeepSeek R1 model, but not the actual one unless you literally buy a dozen of whatever NVIDIA’s top GPU is at the moment.

        • lily33@lemm.ee · 19 hours ago

          A server-grade CPU with a lot of RAM and memory bandwidth would work reasonably well, and cost “only” ~$10k rather than $100k+…
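
          For rough numbers: the full R1 is generally cited at about 671B total parameters, so the weights alone dominate the memory budget. A back-of-envelope sketch (precisions illustrative):

          ```python
          # Approximate size of the full R1 weights at different precisions.
          # 671B total parameters is the commonly cited figure; a real deployment
          # also needs headroom for the KV cache and activations.
          PARAMS = 671e9

          for bits in (16, 8, 4):
              gib = PARAMS * bits / 8 / 2**30
              print(f"{bits:>2}-bit weights: ~{gib:,.0f} GiB")
          ```

          That’s why a big-RAM server can hold a quantized copy, while doing the same in GPU VRAM means stacking a lot of top-end cards.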

      • Dyf_Tfh@lemmy.sdf.org · 21 hours ago (edited)

        Those are not DeepSeek R1. They are unrelated models, like Llama 3 from Meta or Qwen from Alibaba, “distilled” by DeepSeek.

        This is a common method for transferring capability from a larger model into a smaller one.

        Ollama should never have labelled them deepseek:8B/32B. Way too many people misunderstood that.
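
        For anyone wondering what “distilled” means mechanically, here is a minimal sketch of the classic soft-target distillation loss, where a smaller student model is trained to match a larger teacher’s softened output distribution. This is the generic technique, not DeepSeek’s exact recipe, and the temperature and weighting values are illustrative:

        ```python
        import torch.nn.functional as F

        def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
            # KL divergence between the softened teacher and student distributions.
            soft_teacher = F.softmax(teacher_logits / T, dim=-1)
            log_student = F.log_softmax(student_logits / T, dim=-1)
            kd = F.kl_div(log_student, soft_teacher, reduction="batchmean") * (T * T)
            # Ordinary cross-entropy against the ground-truth labels.
            ce = F.cross_entropy(student_logits, labels)
            return alpha * kd + (1 - alpha) * ce
        ```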

        • ☆ Yσɠƚԋσʂ ☆@lemmy.ml · 19 hours ago

          I’m running deepseek-r1:14b-qwen-distill-fp16 locally, and I find it produces really good results. Yeah, it’s a reduced version of the online one, but it’s still far better than anything else I’ve tried running locally.

            • ☆ Yσɠƚԋσʂ ☆@lemmy.ml · 7 hours ago

              The main difference is speed and memory usage. Qwen is a full-sized, high-parameter model, while qwen-distill is a smaller model created using knowledge distillation to mimic Qwen’s outputs. If you have the resources to run Qwen fast, then I’d just go with that.

              • morrowind@lemmy.ml · 16 minutes ago

                I think you’re confusing the two. I’m talking about the regular Qwen before it was fine-tuned by DeepSeek, not the regular DeepSeek.