Some time ago i was wondering if the common "me at foobar dot com" you still see a lot of people do actually helps at all, especially now with LLMs, so i searched for some common "obfuscation" techniques and found this site (not the 2026 update, but the previous - it was a few months ago). Then i wrote a simple LLM query with a bunch of examples from the site[0] (the tool is just a frontend for a commandline program that uses llama.cpp and Mistral Small 3.1 in Q4_K_M quantization since it loads relatively fast and is fine for simple prompts). AFAICT it could reveal anything that wasn't relying on CSS tricks or JavaScript.
Like others mentioned, though, personally i haven't been bothered by email harvesting for years now since spam filters seem to do a decent job. I have my email posted in plaintext here (which i bet is harvested very often) and in various other places and the occasional spam i get is eclipsed by "spam" from services i've actually signed up for (cough linkedin cough).
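For the plain "me at foobar dot com" style specifically, a regex is already enough, no LLM needed. Below is a minimal sketch; the pattern and the `deobfuscate` name are made up for illustration, it only handles the word/bracket variants shown, and it will produce false positives on ordinary sentences that happen to contain "at" and "dot".

```python
import re

# Hypothetical patterns: the words "at" / "dot", either surrounded by
# whitespace or wrapped in [] or (). Real-world variants are more diverse.
_AT = r"(?:\s+at\s+|\s*[\[(]\s*at\s*[\])]\s*)"
_DOT = r"(?:\s+dot\s+|\s*[\[(]\s*dot\s*[\])]\s*)"
_PATTERN = re.compile(
    rf"([A-Za-z0-9._+-]+){_AT}([A-Za-z0-9-]+(?:{_DOT}[A-Za-z0-9-]+)+)",
    re.IGNORECASE,
)


def deobfuscate(text: str) -> list[str]:
    """Return addresses rebuilt from 'local at domain dot tld' spellings."""
    found = []
    for match in _PATTERN.finditer(text):
        local, domain = match.group(1), match.group(2)
        # Collapse every spelled-out "dot" in the domain into a real dot.
        domain = re.sub(_DOT, ".", domain, flags=re.IGNORECASE)
        found.append(f"{local}@{domain}")
    return found


print(deobfuscate("me at foobar dot com"))
print(deobfuscate("mail me [at] foobar [dot] com please"))
```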
I stopped being concerned about email harvesting years ago; I simply leave the email on my website. Spam handling is okay enough, I guess.
But I like this review of techniques. Even the simplest ones are very effective, which surprised me.
sureglymop 23 minutes ago [-]
What I often see is js that fetches the email from the server separately and inserts it.
fmajid 1 hours ago [-]
I use SVG where I created a text object in Affinity Designer and converted it to curves so the SVG doesn't have text any more, just vectors for the glyphs of it. Seems to work pretty well at keeping spammers at bay.
szszrk 32 minutes ago [-]
It also keeps visually impaired people at bay.
jwr 1 hours ago [-]
This is such a waste of effort. Your E-mail address is not and can't be a secret. It will get into spammer databases eventually, no matter what you do. You will spend a lot of effort doing all these fancy tricks, and eventually you will get spam anyway.
Also, a note to those who make fancy "me+someservice@somedomain.com" addresses: make really sure you are in control and these work. Some services (including mine) will need to E-mail you one day, for example to tell you that your account will be deleted because of inactivity. If you don't receive that E-mail because of your fancy spam defenses, your account will be deleted. I've seen people hurt themselves like this and it makes me sad.
On a constructive note: what works very well is spam filtering using LLMs. We have AI to help us with this problem today. I wrote an LLM despammer tool which processes my inbox via IMAP using a local LLM (for privacy reasons). I see >97% accuracy in my benchmarks on my (very difficult) testing corpus. It's nearly perfect in real life usage. I've tested many local models in the 4-32B range and the top practical choice is gpt-oss:20b (GGUF, I run it from LM Studio, MLX quantizations are worse) — not only does it perform very well, but it's also really fast.
0x3f 42 minutes ago [-]
Plus-addressing is built in to most email services. There's no 'fancy' set up to break; it just works. That is, there's no way me@gmail.com works but me+someservice@gmail.com doesn't, unless you explicitly configure it not to work. Similarly for custom domains on most services.
If you use a catch-all on a domain, i.e. someservice@somedomain.com, I guess in theory that might break. But it seems about as likely as messing up the overall domain setup.
Also, my account on your service is likely much more disposable to me than my email address/domain. Anything I care about, I'd back up. Not just assume some random website is going to preserve it for me forever.
mmsc 11 minutes ago [-]
> Also, a note to those who make fancy "me+someservice@somedomain.com" addresses:
Just wait until one of these companies demands an email from the registered email address of your account!
dandersch 49 minutes ago [-]
The techniques in the article right now have had around 95%-100% success at avoiding spam and take about 5 min. to implement. Your approach of putting an LLM in front of your inbox gives 97% accuracy, may have false positives (so you may not receive that account deletion email after all), requires running inference and, I assume, would take at least an hour to set up.
Also, the two can be complementary, anyways, so I am not sure what your point is.
siruwastaken 36 minutes ago [-]
I'm surprised that HTML entity substitution performs so well. I would have assumed that scrapers could at least speak proper HTML.
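To make the entity-substitution point concrete, here is a rough sketch of the difference between a scraper that greps raw markup and one that decodes entities first. The address, the regex, and the `to_entities` helper are illustrative assumptions, not taken from the article; `html.unescape` is the standard-library decoder.

```python
import html
import re


def to_entities(address: str) -> str:
    """Encode every character as a decimal HTML entity, e.g. 'a' -> '&#97;'."""
    return "".join(f"&#{ord(ch)};" for ch in address)


# Deliberately simple address pattern, good enough for this demonstration.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

address = "me@example.com"
encoded = to_entities(address)
markup = f'<a href="mailto:{encoded}">{encoded}</a>'

# A naive harvester that greps the raw markup sees no literal "@" at all.
raw_hits = EMAIL_RE.findall(markup)

# One that decodes entities first (a single stdlib call) recovers the address.
decoded_hits = EMAIL_RE.findall(html.unescape(markup))

print(raw_hits)
print(decoded_hits)
```

A browser renders the encoded link exactly like the plain one, so the only thing this defeats is a harvester that never decodes entities.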
bit1993 3 hours ago [-]
Good stuff, but I think the title should be Email address obfuscation. Thank you for sharing I guess, but spammers can now learn from this too (:
Contact details: [any mailbox] [at] [the domain name of this web site]. Please don’t ask me to give interviews, sign books, appear on podcasts, attend conferences or conventions, or provide feedback or endorsements for works of fiction, scientific theories, or slabs of text disgorged by chatbots.
I have no idea how to decipher this obfuscation.
0x3f 49 minutes ago [-]
What's difficult about it? You know the domain, gregegan.net. You know the @ symbol, presumably. Then put literally any valid text before the @.
0xEF 29 minutes ago [-]
Completely unrelated to the conversation, but our user names are remarkably similar.
dandersch 2 hours ago [-]
Very interesting. It seems for his own email the author has opted for a combination of the CSS display none technique and a XOR cipher:
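As a rough illustration of what an XOR scheme of this general kind could look like (this is not the author's actual code; the single-byte key, the hex encoding, and the function names are all assumptions), the address is stored in the page only in encoded form and decoded at display time. On a real page the decode step would run in JavaScript; Python is used here just to show the round trip.

```python
def xor_encode(address: str, key: int) -> str:
    """XOR each UTF-8 byte with a single-byte key and return hex text."""
    return bytes(b ^ key for b in address.encode("utf-8")).hex()


def xor_decode(hex_text: str, key: int) -> str:
    """Reverse xor_encode: XOR is its own inverse when the key is the same."""
    return bytes(b ^ key for b in bytes.fromhex(hex_text)).decode("utf-8")


KEY = 0x5A  # hypothetical key; any value from 0 to 255 works

encoded = xor_encode("me@example.com", KEY)
print(encoded)                   # hex string with no "@" or "." in it
print(xor_decode(encoded, KEY))  # round-trips back to the original address
```

This is obfuscation rather than encryption: the key ships with the page, so it only stops harvesters that do not execute or analyze the script.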
> HTML entities are often decoded automatically by server-side libraries, which means that even the most basic harvesters can get your email addresses without any special effort. This technique should be worthless—and, yet, it still stops most harvesters.
Anecdotal, but I’ve used HTML entities on a public static website for a long time, in a link whose href is a mailto, and yet I’ve not seen any spam.
I guess any spammer who uses some level of GenAI to process and extract email addresses would have a lot more success against all the methods listed in this article.
ciroduran 1 hours ago [-]
I wouldn't think it's very cost effective to apply GenAI to extract email addresses
_ache_ 3 hours ago [-]
I'm sorry, but that is not how email addresses get spammed in bulk.
The data source is the enormous data breaches that are more and more frequent.
There is more incentive to collect more information on someone you already know something about than to spam an email address you don't even know is valid.
The spam can also be much more effective when it presents itself with personal information about the target.
curiousObject 3 hours ago [-]
The OP put those addresses on that web page, and only on that web page. Some addresses received spam.
Edit: that’s not to deny that big data leaks are a serious problem
0x3f 46 minutes ago [-]
If you're only passing the address in private to some service, you can just use [some-string-unique-to-that-service]@yourdomain.com. Or, more classically, plus addressing to do the same. Then you just block that recipient.
That solution doesn't apply to the use case in the article.
GCUMstlyHarmls 39 minutes ago [-]
Surely spammers just turn `me+leaked/sold@mail.com` into `me@mail.com` as well as `me+apple@mail.com`, `me+softbank@mail.com` etc. The cost of stripping any `+postfix` must be about zero even at volume.
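The stripping step really is about this cheap. A minimal sketch follows; `strip_plus_tag` is a made-up name, and it assumes the common convention that everything from the first `+` in the local part is the tag.

```python
def strip_plus_tag(address: str) -> str:
    """Drop a '+tag' suffix from the local part, if there is one."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address  # not an address at all; leave it alone
    base = local.split("+", 1)[0]
    if not base:
        return address  # local part starts with '+'; nothing sensible to strip
    return f"{base}@{domain}"


for addr in ["me+apple@mail.com", "me+softbank@mail.com", "me@mail.com"]:
    print(addr, "->", strip_plus_tag(addr))
```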
0x3f 35 minutes ago [-]
Some people block all mail to non-plus-addressed emails on that inbox, so a plus address is required to be received at all. You could say then spammers will just add a random one, but they wouldn't be getting bounces and would have to guess as much. Still, even stripping the +'ed part is beyond what most of them even bother to do. That dropoff plus normal spam filters works well enough.
gfody 3 hours ago [-]
I filter everything that does NOT include “+asdf” in the to:
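A filter like that can be sketched in a few lines. This is only an illustration of the rule, not how any particular mail provider implements it; `should_keep` and the sample messages are hypothetical, while the parsing calls come from Python's standard `email` package.

```python
from email import message_from_string
from email.utils import getaddresses

REQUIRED_TAG = "+asdf"  # the tag from the comment above


def should_keep(raw_message: str, required_tag: str = REQUIRED_TAG) -> bool:
    """Keep a message only if some To: recipient's local part has the tag."""
    msg = message_from_string(raw_message)
    recipients = getaddresses(msg.get_all("To", []))
    for _name, addr in recipients:
        local = addr.rpartition("@")[0]
        if required_tag in local:
            return True
    return False


tagged = "To: me+asdf@example.com\nSubject: hi\n\nbody\n"
plain = "To: me@example.com\nSubject: hi\n\nbody\n"

print(should_keep(tagged))
print(should_keep(plain))
```

Mail that arrives via Bcc or a mailing list will not carry the tag in To:, so a rule this strict would also drop those unless other headers are checked.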
[0] https://i.imgur.com/ytYkyQW.png