A tale of a new Lemmy instance, a bot infestation, the fallout, and how we dealt with it - Lemmy Ninja Clan
## Summary

We started a Lemmy instance on June 13 during the Reddit blackout. While we were configuring the site, we accumulated a few thousand bot accounts, leading some sites to defederate from us. Read on to see how we cleaned up the mess.

## Introduction

Like many of you, we came to Lemmy during the Great Reddit Blackout. @MrEUser started Lemmy.ninja on the 13th, and the rest of us on the site got to work populating some initial rules and content, learning how Lemmy worked, and finding workarounds for bugs and issues in the software.

Unfortunately for us, one of the challenges to getting the site up turned out to be getting the email validation to work. So, assuming we were small and beneath notice, we opened our registration for a few days until we could figure out whether the problems we were experiencing were caused by our configuration or by software bugs.
In that brief time, we were discovered by malicious actors, and hundreds of new bot users were being created on the site. Of course we had no idea, since Lemmy provides no user management features. We couldn’t see them, and the bots didn’t participate in any of our local content.

## Discovering the Bots

Within a couple of days, we discovered some third-party tools that gave us the only insights we had into our user base. [Lemmy Explorer](https://lemmyverse.net/) and [The Federation](https://the-federation.info) were showing us that a huge number of users had registered. It took a while, but we eventually tracked down a post that described how to output a list of users from our Lemmy database. Sure enough, there were thousands of users there. It took [some investigation](https://lemmy.ninja/post/7420), but we were eventually able to see which users were actually registered at lemmy.ninja. There were thousands, just like the third-party tools told us.

## Meanwhile…

While we were figuring this out, others
on Lemmy had [noticed](https://lemm.ee/post/197715) a coordinated bot attack, and some were rightly taking steps to cordon off the sites with bots as they began to interact with federated content. Unfortunately, this news never reached us because our site was still young, and young Lemmy servers don’t automatically download all federated content right away. (In fact, despite daily efforts to connect lemmy.ninja to as many communities as possible, I didn’t even learn about the [lemm.ee](http://lemm.ee) mitigation efforts until today.)

We know now that the bots began to interact with other Mastodon and Lemmy instances at some point, because we learned (again, today) that we had been blocked by a few of them. (Again, even discovering this required [third-party tools](https://fba.ryona.agency).) At the time, we were completely unaware of the attack, that we had been blocked, or that the bots were doing anything at all.

## Cleaning Up
The moment we learned that the bots were in our database, we set out to eliminate them. The first step, of course, was to enable a captcha and activate email validation so that no new bots could sign up. [Note: The captcha feature was eliminated in Lemmy 0.18.0.]

Then we had to delete the bot users. First, we made a backup. Always make a backup! After that, we asked the database to output all the users so we could manually review the data. After logging into the database Docker container, we executed the following command:

```sql
-- List every local account together with its person record.
select p.name, p.display_name, a.person_id, a.email, a.email_verified, a.accepted_application
from local_user a, person p
where a.person_id = p.id;
```

That showed us that yes, every user after #8 or so was indeed a bot. Next, we composed a SQL statement to wipe all the bots.

```sql
BEGIN;
-- 85347 is specific to our instance; substitute your own cutoff.
CREATE TEMP TABLE temp_ids AS SELECT person_id FROM local_user WHERE person_id > 85347;
DELETE FROM local_user WHERE person_id IN (SELECT person_id FROM temp_ids);
DELETE FROM person WHERE id IN (SELECT person_id FROM temp_ids);
DROP TABLE temp_ids;
COMMIT;
```

And to finalize the change:

```sql
-- Recount local users so the site statistics match the database.
UPDATE site_aggregates SET users = (SELECT count(*) FROM local_user) WHERE site_id = 1;
```

If you read the code, you’ll see that we deleted records whose person_id was > 85347. That’s the approach that worked for us. But you could just as easily delete all users who haven’t passed email verification, for example. If that’s the approach you want to use, try this SQL statement:

```sql
BEGIN;
-- Collect every local account that never verified its email address.
CREATE TEMP TABLE temp_ids AS SELECT person_id FROM local_user WHERE email_verified = 'f';
DELETE FROM local_user WHERE person_id IN (SELECT person_id FROM temp_ids);
DELETE FROM person WHERE id IN (SELECT person_id FROM temp_ids);
DROP TABLE temp_ids;
COMMIT;
```

And to finalize the change:

```sql
UPDATE site_aggregates SET users = (SELECT count(*) FROM local_user) WHERE site_id = 1;
```
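Before committing either DELETE, it can help to preview which accounts it would touch. The following is a minimal sketch rather than part of the procedure described above. It uses the same `local_user` and `person` tables, and it additionally assumes that `person` has a `published` timestamp column recording when the account was created (check your own schema with `\d person` before relying on it). The seven-day window is an arbitrary example value.

```sql
-- Preview only: these statements read data and change nothing.

-- How many local accounts have not verified their email?
SELECT count(*) AS unverified_accounts
FROM local_user
WHERE email_verified = 'f';

-- List a sample of the accounts a cleanup would remove.
-- Assumption: person.published holds the account creation time.
-- The 7-day window is an arbitrary example; it spares people who
-- signed up recently and have not clicked the verification link yet.
SELECT p.id, p.name, p.published
FROM local_user a
JOIN person p ON p.id = a.person_id
WHERE a.email_verified = 'f'
  AND p.published < now() - interval '7 days'
ORDER BY p.id
LIMIT 50;
```

If the sample contains only accounts you expect to lose, the same conditions can be carried over to the `SELECT` that fills `temp_ids`, so that the delete matches what was previewed.
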
Even more aggressive mods could put these commands into a nightly cron job, wiping accounts every day if they don’t finish their registration process. We chose not to do that (yet). Our user count has remained stable with email verification on.

Once the deletion ran, the bots were gone. Third-party tools reflected the change in about 12 hours. We did some testing to make sure we hadn’t destroyed the site, but found that everything worked flawlessly.

## Wrapping Up

We chose to write this up for the rest of the new Lemmy administrators out there who may unwittingly be hosting bots. Hopefully having all of the details in one place will help speed their discovery and elimination. Feel free to ask questions, but understand that we aren’t experts. Hopefully other, more knowledgeable people can respond to your questions in the comments here.
Excellent post that hopefully other admins will see and can implement. @Ernest this might be useful for the moderation tools you plan to implement in the future.