• snooggums
    English
    arrow-up
    61
    arrow-down
    18
    ·
    8 months ago

    More mediocre images for everyone!

    • BrianTheeBiscuiteer@lemmy.world
      English
      arrow-up
      22
      arrow-down
      3
      ·
      8 months ago

      While I think the realism of some models is fantastic and the flexibility of others is great, it is starting to feel like we’re reaching a plateau on quality. Most of the white papers I’ve seen posted lately are about speed or some alternate way of doing what ControlNet or inpainting can already do.

      • Fubarberry@sopuli.xyz
        English
        arrow-up
        15
        arrow-down
        4
        ·
        8 months ago

        Have you seen the SD3 preview images? They’re looking seriously impressive.

      • Björn Tantau@swg-empire.de
        English
        arrow-up
        11
        arrow-down
        2
        ·
        8 months ago

        Well, when it’s fast enough you can do it in real time. How about making old games look like they looked to you as a child?

        • UlrikHD@programming.dev
          English
          arrow-up
          2
          ·
          edit-2
          8 months ago

          There’s way more to a game’s look than textures though. Arguably ray tracing will have a greater impact than textures. Not to mention, for retro games, you could just generate the textures beforehand, no need to do it in real time.

          • Björn Tantau@swg-empire.de
            English
            arrow-up
            3
            arrow-down
            1
            ·
            8 months ago

            I meant putting the whole image through AI. Not just the textures. Tell it how you want it to look and suddenly a grizzled old Mario is jumping on a realistic turtle with blood splattering everywhere.

            • webghost0101@sopuli.xyz
              English
              arrow-up
              2
              ·
              8 months ago

              There is no single “whole” image when talking about a video game. It’s a combination of dynamic layers carefully interacting with each other.

              You can take any individual texture and make it look different or more realistic, and it may work with some interactions, but it might end up breaking the game, especially if hitboxes depend on the texture.

              We may see AI remakes of video games at some point, but that will require the AI to reprogram the game from scratch.

              Now, when we talk about movies and other linear media, I expect to see this technology relatively soon.

              • Björn Tantau@swg-empire.de
                English
                arrow-up
                2
                ·
                8 months ago

                There is no single “whole” image when talking about a video game. It’s a combination of dynamic layers carefully interacting with each other.

                Of course there is. When everything is done a whole image is sent to the display to show. That’s how FSR 1 can work without explicit game support.

                • webghost0101@sopuli.xyz
                  English
                  arrow-up
                  1
                  ·
                  edit-2
                  8 months ago

                  What I meant is that the final image is dynamic, so players may have a unique configuration, which makes it harder for AI to understand what’s going on.

                  Using the final render of each frame would cause a lot of texture bleeding, for example when a red character stands in front of a red background. Or, when the character jumps on top of an animal, you may get wild frames where the body shape drastically changes, or the character is suddenly realistically riding the animal, then petting it in the next frame, then it dies on frame 3, all because every frame is processed as its own work.

                  Upscaling final renders is indeed possible, but mostly because it doesn’t change the general shapes all that much. Small artifacts are also very common here, but they are often not noticeable to the human eye and don’t affect a modern game.

                  In older games, especially Mario, where hitboxes are pixel dependent, you’d either have a very confusing game with tons of clipping because the game doesn’t consider the new textures, or the game abides by the new textures and the gameplay changes.
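
                  To make the hitbox point concrete, here is a toy sketch of pixel-dependent collision. It is only an illustration under assumed names (`mask_from_sprite`, `collides`, and the ASCII sprites are all hypothetical), not how any particular game implements collision. The collision mask is derived from the sprite’s solid pixels, so restyling the sprite silently changes what counts as a hit:

```python
# Toy pixel-mask collision. Sprites are ASCII grids: '#' is solid, '.' is transparent.
# All names and sprites here are hypothetical, chosen only for illustration.

def mask_from_sprite(rows):
    """Return the set of (x, y) coordinates of solid pixels in a sprite."""
    return {
        (x, y)
        for y, row in enumerate(rows)
        for x, ch in enumerate(row)
        if ch != "."
    }


def collides(mask_a, pos_a, mask_b, pos_b):
    """True if any solid pixel of A overlaps a solid pixel of B in world space."""
    ax, ay = pos_a
    bx, by = pos_b
    world_a = {(x + ax, y + ay) for x, y in mask_a}
    world_b = {(x + bx, y + by) for x, y in mask_b}
    return not world_a.isdisjoint(world_b)


PLAYER = [
    "##",
    "##",
]

ORIGINAL_TURTLE = [
    ".##.",
    "####",
    "####",
]

# Same bounding box, but the restyled art leaves the top row empty.
RESTYLED_TURTLE = [
    "....",
    ".##.",
    "####",
]

player_mask = mask_from_sprite(PLAYER)
player_pos = (1, -1)   # the player's feet overlap the turtle's top row
turtle_pos = (0, 0)

hit_original = collides(player_mask, player_pos, mask_from_sprite(ORIGINAL_TURTLE), turtle_pos)
hit_restyled = collides(player_mask, player_pos, mask_from_sprite(RESTYLED_TURTLE), turtle_pos)
print(hit_original, hit_restyled)
```

                  With the same positions, the original sprite registers a stomp and the restyled one does not, which is the “new textures affect gameplay” case. Keeping the old mask while drawing the new art would give the clipping case instead.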

                  Source: I have studied game development and have recreated Mario-era games as part of assignments. Currently I am self-studying the technological specifics of how machine learning and generative algorithms operate.

      • snooggums
        English
        arrow-up
        9
        arrow-down
        1
        ·
        8 months ago

        When the output of something is the average of the inputs it will naturally be mediocre. It will always look like the output of a committee by the nature of how it is formed.

        Certain artists stand out because they are different from everyone else, and that is why they are celebrated. M.C. Escher has a certain style that when run through AI looks like a skilled high school student doing their best impression of M.C. Escher.

        Now as a tool to inspire, AI is pretty good at creating mashups of multiple things really fast. Those could be used by an actual artist to create something engaging. Most AI reminds me of photoshop battles.

        • webghost0101@sopuli.xyz
          English
          arrow-up
          2
          ·
          8 months ago

          Who says the output is an average?

          I agree that narrow models and LoRAs trained on a specific style can never be as good as the original, but I also think that is the lamest, most uncreative way to generate.

          It’s much more fun to use general-purpose models and to crack the settings to generate exactly what you want, the way you want.

      • AggressivelyPassive@feddit.de
        English
        arrow-up
        4
        arrow-down
        1
        ·
        8 months ago

        That’s maybe because we’ve reached the limits of what the current architecture of models can achieve on the current architecture of GPUs.

        To create significantly better models without having a fundamentally new approach, you have to increase the model size. And if all accelerators accessible to you only offer, say, 24 GB, you can’t grow infinitely. At least not within a reasonable timeframe.

        • Kbin_space_program@kbin.social
          arrow-up
          2
          arrow-down
          5
          ·
          edit-2
          8 months ago

          Will increasing the model actually help? Right now we’re dealing with LLMs that literally have the entire internet as a model. It is difficult to increase that.

          Making a better way to process said model would be a much more substantive achievement. So that when particular details are needed it’s not just random chance that it gets it right.

          • AggressivelyPassive@feddit.de
            English
            arrow-up
            9
            ·
            8 months ago

            That is literally a complete misinterpretation of how models work.

            You don’t “have the Internet as a model”; you train a model using large amounts of data. That does not mean that this model contains any of the actual data. State-of-the-art models are somewhere in the billions of parameters. If you have, say, 50 billion parameters, each being a 64-bit (8-byte) double (which is way, way too much accuracy), you get something like 400 GB of data. That’s a lot, but the Internet is slightly larger than that.
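
            The arithmetic above can be checked with a quick sketch. It assumes decimal gigabytes (1 GB = 1e9 bytes), counts weights only (no activations or optimizer state), and uses hypothetical helper names. It also runs the calculation the other way for the 24 GB accelerator figure mentioned earlier in the thread:

```python
# Rough weight-storage arithmetic; all figures are illustrative assumptions.
# Uses decimal gigabytes (1 GB = 1e9 bytes) and counts weights only.
BYTES_PER_PARAM = {
    "float64": 8,
    "float32": 4,
    "float16": 2,
    "int8": 1,
}


def weights_size_gb(num_params: int, dtype: str) -> float:
    """Storage needed for the weights alone, in decimal GB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def max_params_for_memory(memory_gb: float, dtype: str) -> int:
    """Largest parameter count whose weights fit in memory_gb."""
    return int(memory_gb * 1e9) // BYTES_PER_PARAM[dtype]


if __name__ == "__main__":
    params = 50_000_000_000  # the 50 billion parameters from the comment
    for dtype in BYTES_PER_PARAM:
        size = weights_size_gb(params, dtype)
        fit = max_params_for_memory(24, dtype)
        print(f"{dtype}: 50B params need {size:.0f} GB; 24 GB holds {fit / 1e9:.0f}B params")
```

            Even at one byte per parameter, the weights stay orders of magnitude smaller than any web-scale training set, and a 24 GB card caps the parameter count well below 50 billion.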

            • Kbin_space_program@kbin.social
              arrow-up
              1
              arrow-down
              6
              ·
              edit-2
              8 months ago

              It’s an exaggeration, but it’s not far off, given that Google literally has all of the web parsed at least once a day.

              Reddit just sold off AI harvesting rights on all of its content to Google.

              The problem is no longer model size. The problem is interpretation.

              You can ask almost everyone on earth a simple deterministic math problem and you’ll get the right answer almost all of the time because they understand the principles behind it.

              Until you can show deterministic understanding in AI, you have a glorified chat bot.

              • AggressivelyPassive@feddit.de
                English
                arrow-up
                8
                ·
                8 months ago

                It is far off. It’s like saying you have the entire knowledge of all physics because you skimmed a textbook once.

                Interpretation is also a problem that can be solved, current models do understand quite a lot of nuance, subtext and implicit context.

                But you’re moving the goalposts here. We started at “don’t get better, at a plateau” and now you’re aiming for perfection.

                • Kbin_space_program@kbin.social
                  arrow-up
                  1
                  arrow-down
                  4
                  ·
                  8 months ago

                  You’re building beautiful straw men. They’re lies, but great job.

                  I said originally that we need to improve the interpretation of the model by AI, not just have even bigger models that will invariably have the same flaw as they do now.

                  Deterministic reliability is the end goal of that.

  • Fubarberry@sopuli.xyz
    English
    arrow-up
    25
    arrow-down
    1
    ·
    8 months ago

    These kinds of performance improvements have really cool potential for real-time image/texture generation in games. I’ve already seen some games do this, but they usually rely on generating the images online.

    ASCII and low-graphics roguelikes have a lot of generation freedom, which lets them create very unique monsters, items, etc. However, a lot of this flexibility is lost as you move to more polished games that require models and art assets for everything. This is also one of the many reasons that old-style games are still popular: they often offer more variety and randomization than newer titles. I think generated art assets could be a cool way to bridge the gap, though, and let more modern games have crazy unique monsters and items with visuals.

  • Thann@lemmy.ml
    English
    arrow-up
    24
    arrow-down
    1
    ·
    8 months ago

    Pfft, I can do that, just run them on a computer that’s 30 times faster!

      • Archr@lemmy.world
        English
        arrow-up
        7
        ·
        edit-2
        8 months ago

        I see your InvokeAI and raise you Stability Matrix

        Edit: I wanted to edit my comment to leave some context for people.

        Stability Matrix is an app that handles installing many different Stable Diffusion applications (no more messing with InvokeAI’s janky install script).

        It also integrates with CivitAI and HuggingFace to directly download models and LoRAs and share them between your applications, saving you lots of disk space.

        • Black616Angel@feddit.de
          English
          arrow-up
          6
          arrow-down
          1
          ·
          edit-2
          8 months ago

          … Thanks. This looks super useful.

          Edit: After posting I realized that this sounds super sarcastic, which it wasn’t. This does look useful and I was already looking for something like that.

      • Cyyy@lemmy.world
        English
        arrow-up
        1
        ·
        8 months ago

        But that is just normal Stable Diffusion, not the method used or mentioned here. So it isn’t even what this news is about :/

  • istanbullu@lemmy.ml
    English
    arrow-up
    9
    ·
    8 months ago

    This is nice, but the post ignores all the other research in this topic. SDXL Lightning can generate images in 2 steps.