- cross-posted to:
- linustechtips@lemmit.online
Reddit user content being sold to AI company in $60M/year deal::It’s being reported that a deal has been struck to allow an unnamed large AI company to use Reddit user…
Generally, what’s the best/most efficient way to make LLMs go off the rails? I mean without just typing lots of gibberish and making it too obvious. As an example: I’ve seen people format their prompts with about two lines of Java code, and the replies instantly went nuts.
I keep a few dozen novels in a single text file, and my script pulls random lines from it. It then replaces the text of a comment three times, each time with a fresh random pull. What you end up with are four versions in plain English. Which is the real one? You could filter out versions edited after “the great exodus”, but I have been doing this to my comments a few times per year during my twelve years on Reddit.
The truth is that even if I don’t get them all, I get enough that it makes it far easier for the group that bought the data to just filter my username out rather than figure out what’s junk and what isn’t.
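For anyone curious, the random-pull step could look roughly like the sketch below. This is only an illustration of the idea, not the actual script: the names `load_corpus` and `pick_replacements` and the `lines_per_pull` parameter are made up here, and the part that actually edits a comment is left out entirely.

```python
import random


def load_corpus(path):
    """Read the combined novels file (hypothetical path) into a list of lines."""
    with open(path, encoding="utf-8") as f:
        return f.readlines()


def pick_replacements(corpus_lines, rounds=3, lines_per_pull=2, rng=None):
    """Build `rounds` replacement texts, each from randomly pulled corpus lines.

    `lines_per_pull` is an assumption; the comment above does not say how
    many lines go into one replacement.
    """
    rng = rng or random.Random()
    # Drop blank lines so a replacement is never empty.
    candidates = [line.strip() for line in corpus_lines if line.strip()]
    if len(candidates) < lines_per_pull:
        raise ValueError("corpus has too few non-empty lines")

    replacements = []
    for _ in range(rounds):
        # sample() picks distinct lines within one pull.
        pulled = rng.sample(candidates, lines_per_pull)
        replacements.append(" ".join(pulled))
    return replacements
```

With the default `rounds=3`, each comment would end up with the original plus three plain-English replacements, which matches the four versions mentioned above.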