Anubis is awesome and I want to talk about it
from SmokeyDope@piefed.social to selfhosted@lemmy.world on 28 Nov 14:09
https://piefed.social/c/selfhosted/p/1520552/anubis-is-awesome-and-i-want-to-talk-aout-it

I got into the self-hosting scene this year when I wanted to start up my own website, run on an old recycled ThinkPad. A lot of time was spent learning about ufw, reverse proxies, header security hardening, and fail2ban.

Despite all that, I still had a problem with bots knocking on my ports and spamming my logs. I tried some hackery getting fail2ban to read Caddy logs, but that didn't work for me. I nearly gave up and went with Cloudflare like half the internet does, but my stubbornness about open-source self-hosting, plus the recent Cloudflare outages this year, encouraged me to try alternatives.

Coinciding with that, I'd been running into this thing more and more in the places I frequent, like Codeberg. This is Anubis, a proxy-type firewall that forces the browser client to do a proof-of-work security check, plus some other clever things to stop bots from knocking. I got interested and started thinking about beefing up security.

I'm here to tell you to try it if you have a public-facing site and want to break away from Cloudflare. It was VERY easy to install and configure with a Caddyfile on a Debian system with systemctl. Within an hour it had filtered multiple bots, and so far the knocking seems to have slowed down.

anubis.techaro.lol

My botspam woes have been seriously mitigated, if not completely eradicated. I'm very happy with tonight's little security upgrade project, which took no more than an hour of my time to install and read through the documentation. Current chain is Caddy reverse proxy -> points to Anubis -> points to services
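For anyone picturing the chain, here's a minimal sketch of the Caddy side. The domain, port, and backend address are placeholders, not Anubis defaults; check the docs for your setup:

```caddyfile
example.com {
	# Caddy terminates TLS and hands every request to Anubis
	reverse_proxy localhost:8923
}
```

Anubis is then pointed at the real service (via its TARGET setting, something like `TARGET=http://localhost:3000`), so only clients that pass the challenge ever reach the backend.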

Good place to start for install is here

anubis.techaro.lol/docs/admin/native-install/

#selfhosted

threaded - newest

henfredemars@infosec.pub on 28 Nov 14:44 next collapse

I appreciate a simple piece of software that does exactly what it’s supposed to do.

merc@sh.itjust.works on 28 Nov 19:12 collapse

The front page of the website is excellent. It describes what it does and covers its feature set in quick, simple terms.

I can't tell you how many times I've gone to a website for some open-source software and had no idea what it was or what it was trying to do. They often dive deep into the 300 different ways of installing it, tell you what the current version is and what features it has over the last version, but they just assume you know the basics.

orbituary@lemmy.dbzer0.com on 28 Nov 14:46 next collapse

It’s a great service. I hate the character.

CoyoteFacts@piefed.ca on 28 Nov 14:54 next collapse

You can customize the images if you want: https://anubis.techaro.lol/docs/admin/botstopper#customizing-images

natecox@programming.dev on 28 Nov 15:22 collapse

I can’t access the page to validate this because I don’t allow JS; isn’t that gated behind a paywall?

CoyoteFacts@piefed.ca on 28 Nov 15:29 next collapse

It looks like it might be; I just know someone that has a site using it and they use a different mascot, so I thought it would have been trivial. I kind of wonder why it wouldn’t be possible to just docker bind mount a couple images into the right path, but I’m guessing maybe they obfuscate/archive the file they’re reading from or something?

Axolotl_cpp@feddit.it on 28 Nov 16:49 collapse

It's actually possible. Also, it's open source, so nothing stops you from making a fork with your own images and building that.

orbituary@lemmy.dbzer0.com on 28 Nov 16:20 next collapse

Not sure why you’re getting down votes for just asking a question.

natecox@programming.dev on 28 Nov 19:28 collapse

Lots of idol worship in the dev community, question the current darling and people get upset.

Lemminary@lemmy.world on 28 Nov 22:56 collapse

Not idol worship; rather, it's silly to complain about JS when tools like NoScript let you selectively choose what runs instead of guessing what it is. It's simply a documentation page, like it says in the URL. I mean, docs pages are incredibly tame on the danger scale to leave your guard all the way up for, and instead you take a jab at the entire community, which had nothing to do with your personal choices.

natecox@programming.dev on 28 Nov 23:03 collapse

Who jabbed at anything?

I can’t get to that page, so I asked a question about the contents.

Someone here is being silly, we just disagree about who.

Lemminary@lemmy.world on 28 Nov 23:13 collapse

It gets quite silly when you blame the entire dev community for supposedly downvoting you over ideals rather than being overly strict about them. I also prefer HTML-first and think it should be the norm, but I draw the line somewhere reasonable.

I can’t get to that page, so I asked a question

Yeah, and you can run the innocuous JS or figure out what it is from the URL. You’re tying your own hands while dishing it out to everyone else.

MissingInteger@lemmy.zip on 28 Nov 16:26 collapse

You can just fork it and replace the image.

The author talks about it a bit more here on their blog.

SmokeyDope@piefed.social on 28 Nov 14:55 collapse

You know, the thing is that they know the character is a problem/annoyance; that's how they grease the wheels on selling subscription access to a commercial version with different branding.

https://anubis.techaro.lol/docs/admin/botstopper/

pricing from site

Commercial support and an unbranded version

If you want to use Anubis but organizational policies prevent you from using the branding that the open source project ships, we offer a commercial version of Anubis named BotStopper. BotStopper builds off of the open source core of Anubis and offers organizations more control over the branding, including but not limited to:

  • Custom images for different states of the challenge process (in process, success, failure)
  • Custom CSS and fonts
  • Custom titles for the challenge and error pages
  • “Anubis” replaced with “BotStopper” across the UI
  • A private bug tracker for issues

In the near future this will expand to:

  • A private challenge implementation that does advanced fingerprinting to check if the client is a genuine browser or not
  • Advanced fingerprinting via Thoth-based advanced checks

In order to sign up for BotStopper, please do one of the following:

  • Sign up on GitHub Sponsors at the $50 per month tier or higher
  • Email <sales@techaro.lol> with your requirements for invoicing, please note that custom invoicing will cost more than using GitHub Sponsors for understandable overhead reasons

I have to respect the play, tbh, it's clever. Absolutely the kind of greasy shit play that Julian from Trailer Park Boys would do if he were an open source developer.

<img alt="" src="https://media.piefed.social/posts/Os/IW/OsIWaaBeg4fea86.jpg">

webghost0101@sopuli.xyz on 28 Nov 15:19 collapse

I wish more projects did stuff like this.

It just feels silly and unprofessional while being seriously useful. Exactly my flavour of software, makes the web feel less corporate.

mrbn@lemmy.ca on 28 Nov 15:02 next collapse

When I visit sites on my cellphone, Anubis often doesn’t let me through.

cmnybo@discuss.tchncs.de on 28 Nov 15:44 next collapse

I’ve never had any issues on my phone using Fennec or Firefox. I don’t have many addons installed apart from uBlock Origin. I wouldn’t be surprised if some privacy addons cause issues with Anubis though.

mrbn@lemmy.ca on 28 Nov 16:20 collapse

Yeah, my setup is almost like yours; I'm also on Firefox with uBlock, and the only difference is that I'm also using Privacy Badger.

url@feddit.fr on 29 Nov 04:25 collapse

Just imagine my pain on my phone: JS disabled, and it takes a year to complete ☠️

And in a private tab, I have to go through it every time.

non_burglar@lemmy.world on 28 Nov 15:03 next collapse

Anubis is an elegant solution to the AI bot scraper issue; I just wish the solution to everything wasn't spending compute everywhere. In a world where we need to rethink our energy consumption and generation, even on clients, this is a stupid use of computing power.

Dojan@pawb.social on 28 Nov 15:09 next collapse

It also doesn’t function without JavaScript. If you’re security or privacy conscious chances are not zero that you have JS disabled, in which case this presents a roadblock.

On the flip side of things, if you are a creator and you’d prefer to not make use of JS (there’s dozens of us) then forcing people to go through a JS “security check” feels kind of shit. The alternative is to just take the hammering, and that feels just as bad.

No hate on Anubis. Quite the opposite, really. It just sucks that we need it.

natecox@programming.dev on 28 Nov 15:21 next collapse

I feel comfortable hating on Anubis for this. The compute cost per validation is vanishingly small to someone with the existing budget to run a cloud scraping farm, it’s just another cost of doing business.

The cost to actual users though, particularly to lower income segments who may not have compute power to spare, is annoyingly large. There are plenty of complaints out there about Anubis being painfully slow on old or underpowered devices.

Some of us do actually prefer to use the internet minus JS, too.

Plus the minor irritation of having anime catgirls suddenly be a part of my daily browsing.

bitcrafter@programming.dev on 28 Nov 15:50 next collapse

What would you propose as an alternative?

natecox@programming.dev on 28 Nov 15:56 collapse

There’s a caddy config out there that works as well as Anubis without the catgirls and mining: fxgn.dev/blog/anubis/

Axolotl_cpp@feddit.it on 28 Nov 16:45 next collapse

Not having catgirls is def a con

rtxn@lemmy.world on 28 Nov 17:30 next collapse

No numbers, no testimonials, or even anecdotes… “It works, trust me bro” is not exactly convincing.

poVoq@slrpnk.net on 28 Nov 18:01 collapse

That blog post is fundamentally misunderstanding what Anubis actually does.

url@feddit.fr on 29 Nov 04:22 collapse

Imagine friends seeing the catgirl in your browser, and now you have to explain it to someone who has zero knowledge of any of this.

SmokeyDope@piefed.social on 28 Nov 15:40 next collapse

There's a challenge option that doesn't require JavaScript. The responsibility lies on site owners to configure it properly, IMO, though you can make the argument that it's not the default, I guess.

https://anubis.techaro.lol/docs/admin/configuration/challenges/metarefresh

From docs on Meta Refresh Method

Meta Refresh (No JavaScript)

The metarefresh challenge sends a browser a much simpler challenge that makes it refresh the page after a set period of time. This enables clients to pass challenges without executing JavaScript.

To use it in your Anubis configuration:

# Generic catchall rule
- name: generic-browser
  user_agent_regex: >-
    Mozilla|Opera
  action: CHALLENGE
  challenge:
    difficulty: 1 # Number of seconds to wait before refreshing the page
    algorithm: metarefresh # Specify a non-JS challenge method

This is not enabled by default while this method is tested and its false positive rate is ascertained. Many modern scrapers use headless Google Chrome, so this will have a much higher false positive rate.

z3rOR0ne@lemmy.ml on 28 Nov 19:49 next collapse

Yeah, I actually use the NoScript extension, and I refuse to whitelist sites unless I'm very certain I trust them.

I run into Anubis checks all the time, and while I appreciate the software, having to temporarily whitelist these sites over and over does get cumbersome at times. I hope they make the no-JS implementation the default soon.

Prathas@lemmy.zip on 29 Nov 08:38 collapse

Wait, you keep temporarily allowing them over and over again? Why temporary?

z3rOR0ne@lemmy.ml on 29 Nov 12:44 collapse

Most of the Anubis encounters I have are with redlib instances that are shuffled around, go down all the time, and generally are more ephemeral than other sites. Because I use another extension called LibRedirect to shuffle which redlib instance I visit when clicking on a reddit link, I don't bother whitelisting them permanently.

I've already solved this on my desktop by self-hosting my own redlib instance on localhost and using LibRedirect to just point there, but on my phone I still do the whole no-JS temp-unblock random-redlib-instance dance. Eventually I plan on using WireGuard to host a private redlib instance on a VPS so I can just not deal with this.

This is a weird case, I know, but it's honestly not that bad.

Dojan@pawb.social on 28 Nov 19:49 collapse

This is news to me! Thanks for enlightening me!

cecilkorik@piefed.ca on 28 Nov 15:58 next collapse

if you are a creator and you’d prefer to not make use of JS (there’s dozens of us) then forcing people to go through a JS “security check” feels kind of shit. The alternative is to just take the hammering, and that feels just as bad.

I’m with you here. I come from an older time on the Internet. I’m not much of a creator, but I do have websites, and unlike many self-hosters I think, in the spirit of the internet, they should be open to the public as a matter of principle, not cowering away for my own private use behind some encrypted VPN. I want it to be shared. Sometimes that means taking a hammering. It’s fine. It’s nothing that’s going to end the world if it goes down or goes away, and I try not to make a habit of being so irritating that anyone would have much legitimate reason to target me.

I don’t like any of these sort of protections that put the burden onto legitimate users. I get that’s the reality we live in, but I reject that reality, and substitute my own. I understand that some people need to be able to block that sort of traffic to be able to limit and justify the very real costs of providing services for free on the Internet and Anubis does its job for that. But I’m not one of those people. It has yet to cost me a cent above what I have already decided to pay, and until it does, I have the freedom to adhere to my principles on this.

To paraphrase another great movie: why should any legitimate user be inconvenienced when the bots are the ones who suck? I refuse to punish the wrong party.

[deleted] on 28 Nov 17:09 next collapse

.

quick_snail@feddit.nl on 29 Nov 04:58 collapse

This is why we need these sites to have .onions. Tor Browser has a PoW that doesn't require JS.

cadekat@pawb.social on 28 Nov 15:32 next collapse

Scarcity is what powers this type of challenge: you have to prove you spent a certain amount of electricity in exchange for access to the site, and because electricity isn’t free, this imposes a dollar cost on bots.

You could skip the detour through hashes/electricity and do something with a proof-of-stake cryptocurrency, and just pay for access. The site owner actually gets compensated instead of burning dead dinosaurs.

Obviously there are practical roadblocks to this today that a JavaScript proof-of-work challenge doesn’t face, but longer term…
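The economics here can be made concrete with a toy proof-of-work check. This is a sketch of the general idea (find a nonce whose SHA-256 hash has a given number of leading zero bits), not Anubis's exact algorithm:

```python
import hashlib

def solve_pow(challenge: str, difficulty_bits: int) -> int:
    """Brute-force a nonce until sha256(challenge + nonce) starts
    with `difficulty_bits` zero bits."""
    nonce = 0
    while True:
        digest = int(hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest(), 16)
        if digest >> (256 - difficulty_bits) == 0:  # top bits all zero?
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Checking a solution costs one hash, however hard it was to find."""
    digest = int(hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest(), 16)
    return digest >> (256 - difficulty_bits) == 0

# Each extra difficulty bit doubles the expected work for the client,
# while verification on the server stays a single hash.
nonce = solve_pow("example-challenge", 16)
assert verify("example-challenge", nonce, 16)
```

That asymmetry (expensive to solve, one hash to check) is the scarcity lever described above: difficulty directly scales the electricity a client must burn, which is also exactly why the cost lands on legitimate users too.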

natecox@programming.dev on 28 Nov 16:00 next collapse

The cost here only really impacts regular users, too. The type of users you actually want to block have budgets which easily allow for the compute needed anyways.

chicken@lemmy.dbzer0.com on 28 Nov 17:47 collapse

I think maybe they wouldn't, if they're trying to scale their operations to scanning millions of sites and yours is just one of them.

cadekat@pawb.social on 28 Nov 17:59 collapse

Yeah, exactly. A regular user isn’t going to notice an extra few cents on their electricity bill (boiling water costs more), but a data centre certainly will when you scale up.

artyom@piefed.social on 29 Nov 07:26 collapse

You could skip the detour through hashes/electricity and do something with a proof-of-stake cryptocurrency, and just pay for access. The site owner actually gets compensated instead of burning dead dinosaurs.

Maybe if the act of transferring crypto didn’t use a comparable or greater amount of energy…

cadekat@pawb.social on 29 Nov 13:27 collapse

That’s why I specified a proof-of-stake cryptocurrency. They use so much less energy that it is practically negligible in comparison, and more on the order of traditional online transactions.

quick_snail@feddit.nl on 29 Nov 04:57 collapse

We have memory-hard cryptographic functions.

natecox@programming.dev on 28 Nov 15:14 next collapse

Counterpoint: Anubis is not awesome: lock.cmpxchg8b.com/anubis.html

Cyberflunk@lemmy.world on 28 Nov 16:37 collapse

Thank you! This needed to be said.

  • This post is a bit critical of a small well-intentioned project, so I felt obliged to email the maintainer to discuss it before posting it online. I didn’t hear back.

I used to watch the dev on Mastodon; they seemed pretty radicalized about killing AI, and anyone who uses it (kidding!!). I'm not even surprised you didn't hear back.

Great take on the software, and as far as I can tell, Playwright still works/completes the unit of work. At scale Anubis still seems to work if you have popular content, but it hasn't stopped me using Claude Code + virtual browsers.

I'm not actively testing it, though. I'm probably very wrong about a few things, but I know Anubis isn't hindering my personal scraping. It does fuck up the Perplexity and ChatGPT bots, which is fun to see.

Good luck, Blue Team!

SmokeyDope@piefed.social on 28 Nov 17:44 next collapse

What use cases does perplexity do that Claude doesn’t for you?

natecox@programming.dev on 28 Nov 19:11 next collapse

For clarity: I didn’t write the article, it’s just a good reference.

kilgore_trout@feddit.it on 28 Nov 21:51 collapse

the dev […] seemed pretty radicalized on killing Ai

As one should, to lead a similar project.

perishthethought@piefed.social on 28 Nov 15:16 next collapse

I don't really understand what I am seeing here, so I have to ask: are these security issues a concern?

https://github.com/TecharoHQ/anubis/security

I have a server running a few tiny web sites, so I am considering this, but I’m always concerned about the possibility that adding more things to it could make it less secure, versus more. Thanks for any thoughts.

lime@feddit.nu on 28 Nov 15:29 next collapse

all of the issues listed are closed so any recent version is fine.

also, you probably don’t need to deploy this unless you have a problem with bots.

SmokeyDope@piefed.social on 28 Nov 15:33 next collapse

Security issues are always a concern; the question is how much. Looking at them, they seem at most to be ways to circumvent the Anubis redirect system and get to your page using very specific exploits. These are marked as low to moderate priority, and I don't see anything that implies system-level access, which is the big concern. Obviously do what you feel is best, but IMO it's not worth sweating about. The nice thing about open source projects is that anyone can look through and fix them; if this gets more popular you can expect bug bounties and professional pen-testing submissions.

artyom@piefed.social on 29 Nov 07:35 collapse

This isn’t really a security issue as much as it is a DDOS issue.

Imagine you own a brick and mortar store. And periodically one thousand fucking people sprint into your store and start recording the UPCs on all the products, knocking over every product in the store along the way. They don’t buy anything, they’re exclusively there to collect information from your store which they can use to grift investors and burn precious resources, and if they fuck your shit up in the process, that’s your problem.

This bot just sits at the door and ensures the people coming in are actually shoppers interested in the items in your store.

Fizz@lemmy.nz on 28 Nov 15:33 next collapse

It's a fun little project and I like the little character, but it doesn't actually do anything at this point.

daniskarma@lemmy.dbzer0.com on 29 Nov 08:02 collapse

I don't know about "anything". But people surely overestimate its capabilities.

It's only a PoW challenge. Any bot can execute a PoW challenge. For a small to medium number of bots, the energy difference is negligible.

Anubis is useful when millions of bots want to attack a site. Then the energy cost of the PoW (especially because Anubis increases the challenge difficulty when there's a big number of requests) can be enough to make the attacker desist, or maybe it's not enough, but at least then it's doing something.

I see it as more useful against DDoS than AI scraping, and only if the service being DDoSed is heavier than Anubis itself; otherwise you can get DDoSed via the Anubis requests themselves. For AI scraping I don't see the point: you don't need millions of bots to scrape a site unless you're talking about a massively big site.

panda_abyss@lemmy.ca on 28 Nov 15:35 next collapse

I like the quirky SPH character

tux0r@feddit.org on 28 Nov 16:56 next collapse

I use it with OpenBSD’s relayd and I find it amazing how little maintenance it needs.

Arghblarg@lemmy.ca on 28 Nov 17:13 next collapse

I have a script that watches Apache or Caddy logs for poison-link hits and a set of bot user agents, adding IPs to an ipset blacklist and blocking them with iptables. I should polish it up for others to try. My list of unique IPs is well over 10k in just a few days.

git repos seem to be real bait for these damn AI scrapers.

JustTesting@lemmy.hogru.ch on 28 Nov 23:08 next collapse

This is the way. I also have rules for hits to URLs that should never be reached without a referer, with some threshold to account for a user hitting F5, plus a whitelist of real users (ones that got a 200 on a login endpoint). Mostly the Huawei and Tencent crawlers have fake user agents and no referer. Another thing crawlers don't do is caching: a user would never download the same .js file hundreds of times in an hour; all their devices' browsers would have cached it. There are quite a lot of these kinds of patterns that can be used to block bots. It just takes watching the logs a bit to spot them.

Then there's rate limiting and banning IPs that hit the rate limit regularly. Use nginx as a reverse proxy, set rate limits for URLs where it makes sense, with some burst set, and ban IPs that got rate-limited more than x times in the past y hours based on the rate-limit message in the nginx error.log. It might need some fine-tuning to get the thresholds right, but it can catch some very spammy bots. It doesn't help with those that crawl from hundreds of IPs but only use each IP once an hour, though.

Ban based on the bot user agents, for those that set one. Sure, theoretically robots.txt should be the way to deal with well-behaved crawlers, but if it's your homelab and you just don't want any crawlers, you might as well block them in the firewall the first time you see them.

Download abuse-IP lists nightly and ban those; that's around 60k abusive IPs gone. At that point you probably need to use nftables sets directly instead of iptables or going through ufw, as having 60k individual rules would be a bad idea.

There are also lists of all datacenter IP ranges out there that you could block, though that's a pretty nuclear option, so better make sure the traffic you want is whitelisted. E.g. for Lemmy, you can fetch a list of the IPs of all other instances nightly, so you don't accidentally block them. Lemmy traffic is very spammy…

There's so much that can be done with f2b and a bit of scripting/writing filters.

iopq@lemmy.world on 29 Nov 04:56 collapse

Can’t you just bookmark the page?

JustTesting@lemmy.hogru.ch on 29 Nov 05:15 collapse

You mean for the referer part? Of course you don't want it for all URLs, and there are some legitimate cases. I have it on specific URLs where it's highly unlikely, not every URL. E.g. a direct link to a single comment in Lemmy, with logged-in users whitelisted, plus a limit, like >3 times an hour before a ban. It's already pretty unusual to bookmark a link to a single comment.

It's a pretty consistent bot pattern: they go to some subsubpage with no referer and no prior traffic from that IP, and then no other traffic from that IP for a bit (since they cycle through IPs on each request), but you get a ton of these requests across all the IPs they use. It was one of the most common patterns I saw when I followed the logs for a while.

Of course, having a honeypot URL in a hidden link or something gives more reliable results, if you can add such a link, but if you're hosting software you can't easily add that to, suspicious patterns like the one above can work really well in my experience. Just don't enforce it right away; run it with the 'dummy' action in f2b for a while and double-check.

And I mostly intended that as an example of spotting suspicious traffic in the logs and tailoring a rule to it. It doesn't take very long and can be very effective.

pedroapero@lemmy.ml on 29 Nov 00:09 next collapse

Hi, there are also pre-made ipset lists, e.g. github.com/ktsaou/blocklist-ipsets

quick_snail@feddit.nl on 29 Nov 05:00 collapse

You just described what wazuh does ootb

Goretantath@lemmy.world on 28 Nov 17:19 next collapse

My phone hates anubis.

A_norny_mousse@feddit.org on 28 Nov 22:33 collapse

it’s mentioned in this article

sudo@programming.dev on 28 Nov 18:00 next collapse

I've repeatedly stated this before: Proof-of-Work bot management is only Proof-of-JavaScript bot management. It's nothing for a headless browser to bypass. Proof of JavaScript does work and will stop the vast majority of bot traffic; that's how Anubis actually works. You don't need to punish actual users by abusing their CPU. PoW is a far higher cost on your actual users than on the bots.

Last I checked, Anubis has a JavaScript-less strategy called "Meta Refresh". It first serves you a blank HTML page with a <meta> tag instructing the browser to refresh and load the real page. I highly advise using the Meta Refresh strategy. It should be the default.
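The interstitial boils down to something like this (a hand-written illustration of the mechanism, not Anubis's actual markup):

```html
<!-- Served instead of the real page. A real browser waits a moment
     and re-requests; a naive scraper grabs this shell and moves on. -->
<!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="refresh" content="1">
    <title>Checking your browser…</title>
  </head>
  <body>Making sure you're not a bot…</body>
</html>
```

No client-side computation is involved, just a short wait, which is why it's so much cheaper for users than PoW.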

I'm glad someone is finally making an open-source and self-hostable bot-management solution. And I don't give a shit about the catgirls, nor should you. But Techaro admitted they had little idea what they were doing when they started, and they went for the "nuclear option". Fuck Proof of Work. It was a dead-on-arrival idea decades ago. Techaro should strip it from Anubis.

I haven’t caught up with what’s new with Anubis, but if they want to get stricter bot-management, they should check for actual graphics acceleration.

SmokeyDope@piefed.social on 28 Nov 18:18 next collapse

Something that hasn't been mentioned much in discussions about Anubis is that it has a graded tier system for how sketchy a client is, changing the kind of challenge based on a weighted priority system.

With the default bot policies it comes with, squeaky-clean regular clients are passed straight through; only slightly weighted clients/IPs get the metarefresh; and it's only at moderate-suspicion level that the JavaScript proof of work kicks in. The bot policy, the weight triggers for these levels, the challenge action, and the duration of a client's validity are all configurable.

It seems to me that the sites that heavy-hand the proof of work for every client, with validity that only lasts 5 minutes, are the ones giving Anubis a bad rap. The default bot policy settings Anubis comes with don't trigger PoW on the regular Firefox Android clients I've tried, including hardened IronFox; meanwhile, other sites show the finger wag on every connection no matter what.

It's understandable why some choose strict policies, but they give the impression that this is the only way it should be done, which is overkill. I'm glad there are config options to mitigate the impact on the normal user experience.

sudo@programming.dev on 29 Nov 05:56 collapse

it has a graded tier system of how sketchy a client is, changing the kind of challenge based on a weighted priority system

Last I checked, that was just User-Agent regexes and IP lists. But that's where Anubis should continue development, and hopefully they've improved since. Discerning real users from bots is how you do proper bot management, not imposing a flat tax on all connections.

___qwertz___@feddit.org on 29 Nov 00:00 next collapse

Funnily enough, PoW was a hot topic in academia around the late 90s / early 2000s, and it's somewhat clear that the author of Anubis has not read much of the discussion from back then.

There was a paper called "Proof of work does not work" (or similar; can't be bothered to look it up) that argued that PoW cannot work for spam protection, because you have to support low-powered consumer devices while blocking spammers with heavy hardware. And that is a very valid concern. Then there was a paper arguing that PoW can still work, as long as you scale the difficulty such that a legit user (e.g. sending only one email) gets a low difficulty, while a spammer (sending thousands of emails) gets a high difficulty.

The idea of blocking known bad actors actually is used in email quite a lot, in the form of DNS block lists (DNSBLs) such as Spamhaus (this has nothing to do with PoW, but such a distributed list could be used to determine PoW difficulty).

Anubis, on the other hand, does nothing like that, and a bot developed to pass Anubis would do so trivially.

Sorry for the long text.

Flipper@feddit.org on 29 Nov 00:27 next collapse

At least in the beginning, the scrapers just used curl with a different user agent. Forcing them to use a headless client is already a 100x increase in resources for them. That in itself is a small victory, and so far it is working beautifully.

sudo@programming.dev on 29 Nov 06:00 collapse

Well, in most cases it would be Python requests, not curl. But yes, forcing them to use a browser is the real cost, not just in CPU time but in programmer labor. PoW is overkill for that, though.

sudo@programming.dev on 29 Nov 05:49 collapse

Then there was a paper arguing that PoW can still work, as long as you scale the difficulty in such a way that a legit user

Telling a legit user from a fake user is the entire game; if you can do that, you just block the fake user. Professional bot blockers like Cloudflare or Akamai have machine-learning systems that analyze trends in network traffic and serve JS challenges to suspicious clients. Last I checked, all Anubis uses is User-Agent filters, which is extremely behind the curve. Bots are able to get down to faking TLS fingerprints and matching them to User-Agents.

rtxn@lemmy.world on 29 Nov 04:03 next collapse

POW is a far higher cost on your actual users than the bots.

That sentence tells me that you either don’t understand or consciously ignore the purpose of Anubis. It’s not to punish the scrapers, or to block access to the website’s content. It is to reduce the load on the web server when it is flooded by scraper requests. Bots running headless Chrome can easily solve the challenge, but every second a client is working on the challenge is a second that the web server doesn’t have to waste CPU cycles on serving clankers.

POW is an inconvenience to users. The flood of scrapers is an existential threat to independent websites. And there is a simple fact that you conveniently ignored: it fucking works.

sudo@programming.dev on 29 Nov 05:41 collapse

It's like you didn't understand anything I said. Anubis does work; I said it works. But it works because most AI crawlers don't have a headless browser to solve the PoW. To operate efficiently at the high volume required, they use raw HTTP requests. The vast majority are probably using the basic Python requests module.

You don't need PoW to throttle general access to your site, and that's not the fundamental assumption of PoW anyway. PoW assumes (incorrectly) that bots won't pay the extra FLOPS to scrape the website. But bots are paid to scrape the website; users aren't. They'll just scale horizontally and open more parallel connections. They have the money.

poVoq@slrpnk.net on 29 Nov 05:59 collapse

You are arguing a strawman. Anubis works because because most AI scrapers (currently) don’t want to spend extra on running headless chromium, and because it slightly incentivises AI scrapers to correctly identify themselves as such.

Most of the AI scraping is frankly just shoddy code written by careless people who don’t want to DDoS the independent web but can’t be bothered to actually fix that on their side.

sudo@programming.dev on 29 Nov 06:04 collapse

You are arguing a strawman. Anubis works because most AI scrapers (currently) don’t want to spend extra on running headless Chromium

WTF, that’s what I already said? That was my entire point from the start!? You don’t need PoW to force headless usage. Any JavaScript challenge will suffice. I even said the Meta Refresh challenge Anubis provides is sufficient and explicitly recommended it.

poVoq@slrpnk.net on 29 Nov 06:08 collapse

And how do you actually check for working JS in a way that can’t be easily spoofed? Hint: PoW is a good way to do that.

Meta refresh is a downgrade in usability for everyone but a tiny minority that has disabled JS.

sudo@programming.dev on 29 Nov 06:32 collapse

And how do you actually check for working JS in a way that can’t be easily spoofed? Hint: PoW is a good way to do that.

Accessing the browser’s APIs in any way is far harder to spoof than some hashing. I already suggested checking whether the browser has graphics acceleration; that would filter out the vast majority of headless browsers too. PoW is just math and is easy to solve without running any JavaScript. You can even do it faster than real JavaScript users with something like Rust or C.
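To make that point concrete, here is a sketch of solving a generic hash-prefix PoW challenge in plain Python, no browser or JavaScript engine involved. The challenge string and difficulty are invented for illustration; this is the general shape of such schemes, not Anubis’s exact protocol.

```python
import hashlib
import itertools

def solve_pow(challenge: str, difficulty: int) -> int:
    """Find a nonce so that sha256(challenge + nonce) starts with
    `difficulty` hex zeroes. Generic hash-prefix PoW; the format is
    illustrative, not Anubis's real wire protocol."""
    target = "0" * difficulty
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce

# Difficulty 4 takes a fraction of a second even in plain Python;
# a C or Rust solver is orders of magnitude faster still.
nonce = solve_pow("example-challenge", 4)
```

A scraper that embeds a loop like this pays far less per page than a phone running the same work in a JS interpreter.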

Meta refresh is a downgrade in usability for everyone but a tiny minority that has disabled JS.

What are you talking about? It just refreshes the page without doing any of the extra computation that PoW does. What extra burden does it put on users?

poVoq@slrpnk.net on 29 Nov 07:11 collapse

If you check for GPU (not generally a bad idea) you will have the same people who currently complain about JS complaining about this breaking their anti-fingerprinting browser addons.

But no, you can’t spoof PoW, obviously; that’s the entire point of it. Whether you do the calculation in JavaScript or not doesn’t really matter for it to work.

In the current shape Anubis has zero impact on usability for 99% of the site visitors, not so with meta refresh.

sudo@programming.dev on 29 Nov 08:37 collapse

You will have people complain about their anti-fingerprinting being blocked with every bot-management solution. Your ability to navigate the internet anonymously is directly correlated with a bot’s ability to scrape. That has never been my complaint about Anubis.

My complaint is that the calculations Anubis forces you to do are an absolutely negligible burden for a bot to solve. The hardest part is just having a JavaScript interpreter available. Making the author of the scraper write custom code to deal with your website is the most effective way to prevent bots.

Think about how much computing power AI data centers have. Do you think they give a shit about hashing some values for Anubis? No. They burn more compute power generating a single LLM answer than a thousand Anubis challenges cost. PoW is a backwards solution.

Please think. Captchas worked because they’re supposed to be hard for a computer to solve but easy for a human. PoW is the opposite.

In the current shape Anubis has zero impact on usability for 99% of the site visitors, not so with meta refresh.

Again, I ask you: what extra burden does meta-refresh impose on users? How does setting a cookie and immediately refreshing the page burden the user more than making them wait longer while draining their battery before doing the exact same thing? It’s strictly less intrusive.
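For readers following along, the mechanism being argued about is tiny. A minimal sketch of a meta-refresh challenge (the cookie name and token handling here are invented for illustration):

```python
import secrets

# Tokens handed out by the challenge page. In a real deployment this
# would be a signed cookie rather than server-side state.
VALID_TOKENS: set[str] = set()

def handle_request(cookies: dict) -> str:
    """Serve real content to clients carrying a valid token; serve a
    set-cookie-and-refresh challenge page to everyone else."""
    if cookies.get("challenge-token") in VALID_TOKENS:
        return "<html>actual site content</html>"
    token = secrets.token_hex(16)
    VALID_TOKENS.add(token)
    # In a real server the token goes out in a Set-Cookie header; the
    # meta tag makes any normal browser reload the page immediately.
    return (
        f"<!-- Set-Cookie: challenge-token={token} -->\n"
        '<html><head><meta http-equiv="refresh" content="0"></head></html>'
    )
```

A real browser stores the cookie and reloads in a blink; a raw python-requests scraper that never parses HTML sits at the challenge page forever, with no hashing asked of anyone.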

poVoq@slrpnk.net on 29 Nov 08:51 next collapse

No one is disputing that in theory (!) Anubis offers very little protection against an adversary that specifically tries to circumvent it, but we are dealing with an elephant in the porcelain shop kind of situation. The AI companies simply don’t care if they kill off small independently hosted web-applications with their scraping and Anubis is the mouse that is currently sufficient to make them back off.

And no, forced site reloads are extremely disruptive for web-applications and often force a lot of extra load for re-authentication etc. It is not as easy as you make it sound.

sudo@programming.dev on 29 Nov 09:30 collapse

Anubis forces the site to reload when doing the normal PoW challenge! Meta Refresh is a sufficient mouse to block 99% of all bot traffic without being any more burdensome than PoW.

You’ve failed to demonstrate why meta-refresh is more burdensome than PoW and have pivoted to arguing the point I was making from the start as though it was your own. I’m not arguing with you any further. I’m satisfied that I’ve convinced any readers of our discussion.

natecox@programming.dev on 29 Nov 09:02 collapse

Heads up, you’re really invested in arguing with someone who does not appear to be arguing in good faith. Just block them and move on, you will be a happier person for it.

sudo@programming.dev on 29 Nov 09:34 collapse

🤙

quick_snail@feddit.nl on 29 Nov 04:54 collapse

Hashcash works great, what are you going on about?

sudo@programming.dev on 29 Nov 06:06 collapse

LOL

0_o7@lemmy.dbzer0.com on 28 Nov 20:37 next collapse

I don’t mind Anubis but the challenge page shouldn’t really load an image. It’s wasting extra bandwidth for nothing.

Just parse the challenge and move on.

Allero@lemmy.today on 28 Nov 21:10 next collapse

Afaik, you can set it up to use no image, or any other one.

Voroxpete@sh.itjust.works on 29 Nov 03:56 collapse

It’s actually a brilliant monetization model. If you want to use it as is, it’s free, even for large corporate clients.

If you want to get rid of the puppygirls though, that’s when you have to pay.

(The absolute Chads at the UN left the puppygirls in, and I have to respect that.)

frongt@lemmy.zip on 29 Nov 09:37 collapse

It’s open source, so you could always just patch it without paying too. But you should support the maintainers if you think they deserve it.

kilgore_trout@feddit.it on 28 Nov 21:19 collapse

It’s a palette of 10 colours. I would guess it uses an indexed colorspace, reducing the size to a minimum.
edit: 28 KB on disk

CameronDev@programming.dev on 28 Nov 22:20 collapse

An HTTP GET request is a few hundred bytes. The response is 28 KB, roughly a 100x amplification. If a large botnet wanted to deny service to an Anubis-protected site, requesting that image could be enough.

Ideally, Anubis should serve as little data as possible until the PoW is completed. Caching the PoW script (and the image) on a CDN would also mitigate the issue.

teolan@lemmy.world on 28 Nov 23:36 collapse

The whole point of Anubis is to not have to go through a CDN to withstand scraping botnets.

CameronDev@programming.dev on 29 Nov 01:21 collapse

I dunno if that is true; nothing in the docs indicates that it is explicitly anti-CDN. And using a CDN for a static JavaScript resource and an image isn’t the same as running the entire site through a CDN proxy.

krooklochurm@lemmy.ca on 28 Nov 21:01 next collapse

“Anubis has risen, Wendell”

“Are you Jane’s Addiction?”

silmarine@discuss.tchncs.de on 28 Nov 22:52 next collapse

Thanks for this! I’m going to set this up for myself.

A_norny_mousse@feddit.org on 28 Nov 22:53 next collapse

At the time of commenting, this post is 8h old. I read all the top comments, many of them critical of Anubis.

I run a small website and don’t have problems with bots. Of course I know what a DDOS is - maybe that’s the only use case where something like Anubis would help, instead of the strictly server-side solution I deploy?

I use CrowdSec (it seems to work with caddy btw). It took a little setting up, but it does the job.
(I think it’s quite similar to fail2ban in what it does, plus community-updated blocklists)

Am I missing something here? Why wouldn’t that be enough? Why do I need to heckle my visitors?

Despite all that I still had a problem with bots knocking on my ports spamming my logs.

By the time Anubis gets to work, the knocking already happened so I don’t really understand this argument.

If the system is set up to reject a certain type of request, these are microsecond transactions that do no harm (DDoS excepted).

SmokeyDope@piefed.social on 29 Nov 00:35 next collapse

If CrowdSec works for you that’s great, but it’s also a corporate product whose premium sub tier starts at $900/month. Not exactly a pure self-hosted solution.

I’m not a hypernerd; I’m still figuring all this out among the myriad of possible solutions with different complexity and setup times. All the self-hosters in my internet circle started adopting Anubis, so I wanted to try it. Anubis was relatively plug-and-play, with prebuilt packages and great install documentation.

Allow me to expand on the problem I was having. It wasn’t just that I was getting a knock or two; I was getting 40 knocks every few seconds, scraping every page and searching for a bunch of pages that didn’t exist but would be exploit points in unsecured production VPS systems.

On a computational level, the constant network activity from webpages, zip files and images downloaded by scrapers pollutes traffic. Anubis stops this by trapping them on a landing page that transmits very little information from the server side. By trapping the bot in an Anubis page, which it hammers 40 times on a single open connection before giving up, it reduces overall network activity and data transferred (often billed as a metered thing), as well as log volume.

And this isn’t all or nothing. You don’t have to pester all your visitors, only those with sketchy clients. Anubis uses a weighted priority which grades how legit a browser client is. Most regular connections get through without triggering anything; weird connections get various grades of checks depending on how sketchy they are. Some checks don’t require proof-of-work or JavaScript.
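As a sketch of what such weighted screening looks like (the signals and weights below are invented for illustration; Anubis’s real policy engine is configurable and more involved):

```python
def challenge_level(user_agent: str, headers: dict) -> str:
    """Toy sketch of weight-based screening: ordinary browsers pass
    untouched, sketchier clients get progressively harder checks."""
    weight = 0
    if "Mozilla" not in user_agent:
        weight += 2  # virtually every real browser claims Mozilla
    if "Accept-Language" not in headers:
        weight += 1  # real browsers send language preferences
    if any(bot in user_agent for bot in ("curl", "python-requests", "Scrapy")):
        weight += 3  # self-identified automation
    if weight == 0:
        return "allow"           # no challenge at all
    if weight <= 2:
        return "meta-refresh"    # cheap non-PoW, non-JS-heavy check
    return "proof-of-work"       # full challenge
```

The point is the graduated response: an ordinary Firefox visit never sees a challenge, while a bare python-requests client gets the full treatment.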

On a psychological level, it gives me a bit of relief knowing that the bots are getting properly sinkholed and that I’m punishing/wasting the compute of some asshole trying to find exploits in my system to expand their botnet. And a bit of pride knowing I did this myself on my own hardware without having to cop out to a corporate product.

It’s nice that people of different skill levels and philosophies have options to work with. One tool can often complement another, too. Anubis worked for what I wanted: filtering out bots that waste network bandwidth, and giving me peace of mind where before I had no protection, all while not being noticeable for most people, because I can configure it not to heckle every client every 5 minutes like some sites do.

A_norny_mousse@feddit.org on 29 Nov 03:59 collapse

If CrowdSec works for you that’s great, but it’s also a corporate product

It’s also fully FLOSS with dozens of contributors (not to speak of the community-driven blocklists). If they make money with it, great.

not exactly a pure self-hosted solution.

Why? I host it, I run it. It’s even in Debian Stable repos, but I choose their own more up-to-date ones.

Allow me to expand on the problem I was having. It wasn’t just that I was getting a knock or two; I was getting 40 knocks every few seconds, scraping every page and searching for a bunch of pages that didn’t exist but would be exploit points in unsecured production VPS systems.

  • Again, a properly set up WAF will deal with this pronto
  • You should not have exploit points in unsecured production systems, full stop.

On a computational level, the constant network activity from webpages, zip files and images downloaded by scrapers pollutes traffic. Anubis stops this by trapping them on a landing page that transmits very little information from the server side.

  • And instead you leave the computations to your clients. Which becomes a problem on slow hardware.
  • Again, with a properly set up WAF there’s no “traffic pollution” or “downloading of zip files”.

Anubis uses a weighted priority which grades how legit a browser client is.

And apart from the user agent and a few other responses, all of which are easily spoofed, this means “do some JavaScript stuff on the local client” (there’s a link to an article here somewhere that explains this well), which will eat resources on the client’s machine and becomes a real PITA on e.g. smartphones.

Also, I use one of those less-than-legit, weird and non-regular browsers, and I am being punished by tools like this.

SmokeyDope@piefed.social on 29 Nov 07:26 collapse

why? I run it.

Mmm, how to say this. I suppose what I’m getting at is the philosophy of development and the known behaviors of corporate products.

So, here’s what I understand about CrowdSec. It’s essentially a centralized collection of continuously updated iptables rules and bot-scanning detectors that clients install locally.

In a way its crowd-sourcing is like a centralized mesh network: each client is a scanner node which phones threat data home to the corporate mothership, which updates the shared rules.

Notice the operative word: centralized. The company owns that central home, and it’s their proprietary black box to do what they want with. And you know what for-profit companies like to do to their services over time? Enshittify them by:

  • adding subscription-tier price models

  • putting once-free features behind paywalls

  • changing data-sharing requirements as a condition for free access

  • restricting free API access tighter and tighter to encourage paid tiers

  • making paid tiers cost more and do less

  • intentionally ruining features in one service to drive power users to a different one

They can and do use these tactics to drive up profit or reduce overhead once a critical mass has been reached. I do not expect altruism and respect for users from corporations; I expect bean counters using altruism as a vehicle to attract users in the growth phase, then flipping the switch in their ToS to go full penny-pinching once they’re too big to fail.

CrowdSec’s pricing update from last year:

CrowdSec updated pricing policy

Hi everyone,

Our former pricing model led to some incomprehensions and was sub-optimal for some use-cases.

We remade it entirely here. As a quick note, in the former model, one never had to pay $2.5K to get premium blocklists. This was Support for Enterprise, which we poorly explained. Premium blocklists were and are still available from the premium SaaS plan, accessible directly from the SaaS console.

Here are the updates:

Security Engine: All its embedded features (IDS, IPS and WAF) were, are and will remain free.

SAAS: The free plan offers up to three silver-grade blocklists (on top of receiving IP related to signals your security engines share). Premium plans can use any free, premium and gold-grade blocklists. Previously, we had a premium and an enterprise plan with more features. All features are now merged into a unique SaaS enterprise plan. The one starting at $31/month. As before, those are available directly from the SaaS console page: https://app.crowdsec.net

SUPPORT: The $2.5K (which were mostly support for Enterprise) are now becoming optional. Instead, a client can contract $1K for Emergency bug & security fixes and $1K for support if they want to.

BLOCKLISTS: Very specific (country targeted, industry targeted, stack targeted, etc.) or AI-enhanced are now nested in a different offer named “Platinum blocklists subscription”. You can subscribe to them, regardless of whether you use the FOSS Security Engine or not. They can be joined, tuned, and injected directly into most firewalls with regular automatic remote updates of their content. As long as you do not resell them (meaning you are the final client), you can use the subscription in any part of your company.

CTI DATA: They can be consumed through API keys with associated quotas. These are affordable and intended for use in tools like OpenCTI, MISP, The Hive, Xsoar, etc. Costs are in the range of hundreds of dollars per month. The Full CTI database can also be locally replicated at your place and constantly synced for deltas. Those are the largest plans we have, and they are usually destined to L/XL enterprises, governmental bodies, OEM & hardware vendors.

Safer together.
From the comments:

ShroomShroomBeepBeep, 1y ago:

Whilst I’m pleased to see it made clearer, £290 a year for each security engine is still far too expensive for me to consider it.

GuitarEven, 1y ago:

We get that £290 is too high for individual home labs. Those offers, and the features oriented toward enterprise clients, are made for companies; the free tier should cover homelabs correctly. If a company cannot invest $300 yearly in its security, no judgment, and the free tier will still be very helpful until it recovers some budget margin to strengthen its security posture.

quick_snail@feddit.nl on 29 Nov 04:49 next collapse

With Varnish and Wazuh, I’ve never had a need for Anubis.

My first recommendation for anyone struggling with bots is to fix their cache.

poVoq@slrpnk.net on 29 Nov 05:42 next collapse

AI scraping is a massive issue for specific types of websites, such as git forges, wikis and to a lesser extent Lemmy etc., that rely on complex database operations that cannot be easily cached. Unless you massively overprovision your infrastructure, these web applications come to a grinding halt by constantly maxing out the available CPU power.

The vast majority of the critical commenters here seem to talk from a point of total ignorance about this, or assume operators of such web applications have time for the hypervigilance needed to constantly monitor and manually block AI scrapers (which do their best to circumvent more basic blocks). The realistic options for such operators right now are: Anubis (or similar), Cloudflare, or shutting down their servers. Of these, Anubis is clearly the least bad option.

chunes@lemmy.world on 29 Nov 10:56 collapse

Sounds like maybe webapps are a bad idea then.

If they need dynamism, how about releasing a desktop application?

Pastime0293@discuss.tchncs.de on 29 Nov 05:43 next collapse

I also used CrowdSec for almost a year, but as AI scrapers became more aggressive, CrowdSec alone wasn’t enough. The scrapers used distributed IP ranges and spoofed user agents, making them hard to detect and costing my Forgejo instance a lot in expensive routes. I tried custom CrowdSec rules but hit its limits.

Then I discovered Anubis. It’s been an excellent complement to CrowdSec — I now run both. In my experience they work very well together, so the question isn’t “A or B?” but rather “How can I combine them, if needed?”

daniskarma@lemmy.dbzer0.com on 29 Nov 07:52 collapse

You are right. For most self-hosting use cases Anubis is not only irrelevant, it actually works against you: a false sense of security, and making your devices do extra work for nothing.

Anubis is meant for public-facing services that may get DDoSed or AI-scraped by some untargeted bot (for a targeted bot it’s trivial to get past Anubis and scrape).

And it’s never a substitute for CrowdSec or fail2ban. Getting an Anubis token is just a matter of executing the PoW challenge. You still need a way to detect and ban malicious attacks.

TerHu@lemmy.dbzer0.com on 29 Nov 04:12 next collapse

Yes, please be mindful when using Cloudflare. With them you’re possibly inviting in a much, much bigger problem:

www.devever.net/~hl/cloudflare

quick_snail@feddit.nl on 29 Nov 04:47 collapse

Great article, but I disagree about WAFs.

Try to secure a nonprofit’s web infrastructure as the lone IT guy with no budget for devs or security.

It would be nice if we could update servers constantly and patch unmaintained code, but sometimes you just need to front it with something that plugs those holes until you have the capacity to do updates.

But 100% the WAF should be run locally, not as a MITM by an evil US corp in bed with DHS.

url@feddit.fr on 29 Nov 04:16 next collapse

Honestly I’m not a big fan of Anubis. It fucks users with slow devices:

https://lock.cmpxchg8b.com/anubis.html

url@feddit.fr on 29 Nov 04:18 collapse

Did I forget to mention it doesn’t work without JS, which I keep disabled?

quick_snail@feddit.nl on 29 Nov 04:21 next collapse

Kinda sucks how it makes websites inaccessible to folks who have to disable JavaScript for security.

poVoq@slrpnk.net on 29 Nov 05:29 next collapse

It kinda sucks how AI scrapers make websites inaccessible to everyone 🙄

quick_snail@feddit.nl on 29 Nov 07:23 next collapse

Not if the admin has a cache. It’s not a difficult problem for most websites.

poVoq@slrpnk.net on 29 Nov 08:26 collapse

You clearly don’t know what you are talking about.

quick_snail@feddit.nl on 29 Nov 08:51 collapse

Lol, I’m the sysadmin for many sites that don’t have these issues, so obviously I do…

If you’re the one who thinks you need this trash PoW fronting a static site, then clearly you’re the one who is ignorant.

poVoq@slrpnk.net on 29 Nov 08:55 collapse

Obviously I don’t think you need Anubis for a static site. And if that is what your admin experience is limited to, then you have a strong case of Dunning-Kruger.

Mwa@thelemmy.club on 29 Nov 11:35 collapse

and they don’t respect robots.txt

WhyJiffie@sh.itjust.works on 29 Nov 07:42 collapse

there’s a fork that has non-js checks. I don’t remember the name but maybe that’s what should be made more known

quick_snail@feddit.nl on 29 Nov 09:01 collapse

Please share if you know.

The only way I know to do this is running a Tor onion service, since the Tor protocol has built-in PoW support (without JS).

WhyJiffie@sh.itjust.works on 29 Nov 10:23 next collapse

It’s this one: git.gammaspectra.live/git/go-away

The project name is a bit unfortunate to show to users; maybe change that if you use it.

Some well-known privacy services use it too, including the Invidious instance at nadeko.net, so you can check there how it works. It’s one of the most popular Invidious servers, so I guess it can’t be bad, and they use multiple kinds of checks for each visitor.

WhyJiffie@sh.itjust.works on 29 Nov 10:36 collapse

PS: I was wrong, it’s not a fork but a different project doing the same and more.

quick_snail@feddit.nl on 29 Nov 04:22 next collapse

getting fail2ban to read caddy logs

You should look into wazuh

victorz@lemmy.world on 29 Nov 05:33 collapse

Seems like they already have a working solution now.

quick_snail@feddit.nl on 29 Nov 07:22 collapse

Sure, but they have to maintain it.

Wazuh ships with rules that are maintained by Wazuh. Less code rot.

victorz@lemmy.world on 29 Nov 13:12 collapse

That’s really good, could be worth looking into in that case. 👍 Thanks for following up!

quick_snail@feddit.nl on 29 Nov 05:02 next collapse

It’s amazing how few people here are familiar with caching

turdas@suppo.fi on 29 Nov 05:40 next collapse

Inspired by this post I spent a couple of hours today trying to set this up on my toy server, only to immediately run into what seems to be a bug: <video> tags loading a simple WebM video from right next to index.html broke, because the media response got Anubis’s HTML bot check instead of the media.

I suppose my use-case was just too complicated.

daniskarma@lemmy.dbzer0.com on 29 Nov 07:40 next collapse

I don’t think you have a use case for Anubis.

Anubis is mainly aimed at bad AI scrapers, plus some DDoS mitigation if you have a heavy service.

You are getting hit exactly the same; Anubis doesn’t put up a block list or anything, it just puts itself in front of the service. The load on your server and the risk you take are very similar with or without Anubis here. Most bots are not AI scrapers, they are just probing. So the hit on your server is the same.

What you want is to properly set up fail2ban or, even better, CrowdSec. That would actually block and ban bots that try to probe your server.

If you are just self-hosting, the only thing Anubis does is divert the log noise into Anubis’s logs and make your devices do a PoW every once in a while when you want to use your services.

Being honest, I don’t know what you are self-hosting. But unless it’s something that’s going to get DDoSed or AI-scraped, there’s not much point to Anubis.

Also, Anubis is not a substitute for fail2ban or CrowdSec. You need something to detect and ban brute-force attacks. If not, the attacker would only need to execute the Anubis challenge, get the token for the week, and then they are free to attack your services as they like.

Appoxo@lemmy.dbzer0.com on 29 Nov 09:16 next collapse

Maybe you know the answer to my question:
If I wanted to use an app that doesn’t run in a web browser (e.g. the native Jellyfin app), how would that work? Does it still work then?

SmokeyDope@piefed.social on 29 Nov 12:08 next collapse

It explicitly checks for web browser properties to decide which challenges to apply, and all its challenges require basic web functionality like page refreshes. Unless the connection to your server comes from something that behaves like a browser (User-Agent string and all), it won’t get through. That’s how I understand it, anyway. Hope this helped.

Appoxo@lemmy.dbzer0.com on 29 Nov 13:23 collapse

Assuming what you said is correct, it wouldn’t help my use case.
I’m not hosting any page meant for public consumption anyway, so it’s not really important.
But thanks for answering :)

chaospatterns@lemmy.world on 29 Nov 14:18 collapse

If the app is just a WebView wrapper around the application, then the challenge page would load and try to be evaluated.

If it’s a native Android/iOS app, then it probably wouldn’t work because the app would try to make HTTP API calls and get back something unexpected.

smh@slrpnk.net on 29 Nov 09:17 next collapse

The creator is active on a professional Slack I’m on and they’re lovely and receptive to user feedback. Their tool is very popular in the online archives/cultural heritage scene (we combine small budgets and juicy, juicy data).

My site has enabled JS-free screening when the site load is low, under the theory that if the site load is too high then no one’s getting in anyway.

sixty@sh.itjust.works on 29 Nov 10:37 next collapse

Yeah, I’m not gonna use this anime stuff.

Mwa@thelemmy.club on 29 Nov 11:35 collapse

it can be removed, btw

ohshit604@sh.itjust.works on 29 Nov 11:46 collapse

Thought you had to pay for that with Anubis? Recently I’ve been eyeing Go Away as a potential alternative.

Mwa@thelemmy.club on 29 Nov 12:49 collapse

I’m not sure if you still need to pay for it

sudoer777@lemmy.ml on 29 Nov 13:45 next collapse

I host my main server on my own hardware, and a VPN on Hetzner because my shitty ISP doesn’t let me port forward. For the past year, bots were hitting my Forgejo instance hard. I forgot to disable registration, and they generated hundreds of accounts with hundreds of repos full of sketchy links, generating terabytes of traffic from my VPS and costing me money. I disabled registration and deleted the spam, but bots still kept hitting my server for several months, causing memory leaks over time that crashed it and consumed CPU, and it still cost me money with terabytes of traffic per month. A few weeks ago, I put Anubis on the VPS. Now zero bots hit my Forgejo instance and I don’t pay for their traffic anymore. Problem solved.

Jason2357@lemmy.ca on 29 Nov 15:46 collapse

It’s always code forges and wikis that are affected by this, because the scrapers spider down into every commit or edit in your entire history, then come back the next day and check every “page” again to see if any changed. Consider just blocking commit-history pages at your reverse proxy.
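That blocking idea can be sketched as a simple path filter in front of the forge. The URL patterns below are Forgejo/Gitea-style guesses; adjust them to whatever your forge actually serves:

```python
import re

# Paths that explode into unbounded deep-history pages on a git forge
# (Forgejo/Gitea-style patterns; check against your own URL scheme).
HISTORY_PATTERNS = [
    re.compile(r"/commits?/"),
    re.compile(r"/blame/"),
    re.compile(r"/compare/"),
]

def should_block(path: str) -> bool:
    """True if the path is deep-history content worth refusing to
    anonymous crawlers at the reverse proxy."""
    return any(p.search(path) for p in HISTORY_PATTERNS)

print(should_block("/user/repo/commits/branch/main"))  # True
print(should_block("/user/repo/src/branch/main"))      # False
```

The same matching logic can usually be expressed directly in your reverse proxy’s config instead of a separate service.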

drkt_@lemmy.dbzer0.com on 29 Nov 14:38 collapse

Stop playing whack-a-mole with these fucking people and build TARPITS!

Make it HURT to crawl your site illegitimately.