Hey! I'm Nick, and I work on Integrity at OpenAI. These checks are part of how we protect our first-party products from abuse like bots, scraping, fraud, and other attempts to misuse the platform.
A big reason we invest in this is because we want to keep free and logged-out access available for more users. My team’s goal is to help make sure the limited GPU resources are going to real users.
We also keep a very close eye on the user impact. We monitor things like page load time, time to first token and payload size, with a focus on reducing the overhead of these protections. For the majority of people, the impact is negligible, and only a very small percentage may see a slight delay from extra checks. We also continuously evaluate precision so we can minimize false positives while still making abuse meaningfully harder.
Imnimo 4 hours ago [-]
It's interesting to me that OpenAI considers scraping to be a form of abuse.
nikitaga 3 hours ago [-]
Scraping static content from a website at near-zero marginal cost to its server, vs scraping an expensive LLM service provided for free, are different things.
The former relies on fairly controversial ideas about copyright and fair use to qualify as abuse, whereas the latter is direct financial damage – by your own direct competitors no less.
It's fun to poke at a seeming hypocrisy of the big bad, but the similarity in this case is quite superficial.
PunchyHamster 58 minutes ago [-]
> Scraping static content from a website at near-zero marginal cost to its server, vs scraping an expensive LLM service provided for free, are different things.
I bet people being fucking DDOSed by AI bots disagree
Also the fucking ignorance assuming it's "static content" and not something needing code running
not2b 1 hours ago [-]
I understand why OpenAI is trying to reduce its costs, but it simply isn't true that AI crawlers aren't creating very significant load, especially those crawlers that ignore robots.txt and hide their identities. This is direct financial damage and it's particularly hard on nonprofit sites that have been around a long time.
razingeden 1 hours ago [-]
It is direct financial damage if my server's not on an unmetered connection. After years of bills coming in around $3/mo I got a surprise >$800 bill on a site nobody on earth appears to care about besides AI scrapers.
It hasn’t even been updated in years so hell if I know why it needs to be fetched constantly and aggressively, but fuck every single one of these companies now whining about bots scraping and victimizing them; here’s my violin.
swagmoney1606 18 minutes ago [-]
And yet I have to pay in my time and cash to handle the constant ddos'es from the constant LLM scraping
bakugo 3 hours ago [-]
The cost is so marginal that many, many websites have been forced to add cloudflare captchas or PoW checks before letting anyone access them, because the server would slow to a crawl from 1000 scrapers hitting it at once otherwise.
AtlasBarfed 1 hours ago [-]
Because you say it is?
I obviously disagree. I mean, on top of this we are talking about not-open OpenAI.
karlshea 1 hours ago [-]
I don’t know what world you live in but it’s not this one.
nslsm 3 hours ago [-]
The issue is that there are so many awful webmasters that have websites that take hundreds of milliseconds to generate and are brought down by a couple requests a second.
bakugo 2 hours ago [-]
OpenAI must be the most awful webmasters of all, then, to need such sophisticated protections.
heyethan 13 minutes ago [-]
I think the distinction is less about scraping itself, and more about marginal cost.
Scraping static pages is cheap for both sides.
Scraping an LLM-backed service effectively externalizes compute costs onto the provider.
Same behavior, very different economics.
ProofHouse 3 hours ago [-]
The irony is thick
sabedevops 4 hours ago [-]
Seriously. The hypocrisy is staggering!
Aurornis 2 hours ago [-]
I interpreted scraping to mean in the context of this:
> we want to keep free and logged-out access available for more users
I have no doubt that many people see the free ChatGPT access as a convenient target for browser automation to get their own free ChatGPT pseudo-API.
zer00eyz 3 hours ago [-]
" Integrity at OpenAI .. protect ... abuse like bots, scraping, fraud "
Did you mean to use the word hypocrisy? If not, I'm happy to have said it.
I just want to note, that it is well covered how good the support is for actual malware...
everdrive 5 hours ago [-]
It's getting to the point where a user needs at minimum two browsers. One to allow all this horrendous client checking so that crucial services work, and another browser to attempt to prevent tracking users across the web.
Nick, I understand the practical realities regarding why you'd need to try to tamp down on some bot traffic, but do you see a world where users are not forced to choose between privacy and functionality?
mememememememo 4 hours ago [-]
Local models for privacy.
You want to go to the world's best hotel? You are gonna be on their CCTV. Staying at home is crappier but private.
Unfortunately, for the first time Moore's law isn't helping (e.g. give a poor person an old laptop, install Linux, and they will be fine). They can do that and all is good, except no LLM.
karlgkk 3 hours ago [-]
> You want to go to the world's best hotel? You are gonna be on their CCTV.
ironically, in high end hotels, there's often a lot less cctv. not none. just less. rich people enjoy privacy
Barbing 2 hours ago [-]
So they’re not just hidden better? Does make sense.
Well, I can use the world‘s best safety deposit box without being on CCTV while I pass secrets in and out of it, right? Just not for free.
Bummer, this sounds like it is about to turn into a Monero ad (“let us pay privately”)
0x3f 5 hours ago [-]
Meet me in a cafe and I will sign a JWT saying you're not a bot. You can submit this to whoever will accept it.
Brilliant! Just the thing we want: more hardware attestation, more deanonymization, less user control, all diligently orchestrated in a repository where the only contributor is Anthropic Claude [0]. Comes complete with a misaligned ASCII diagram in the README to show how much effort the humans behind it put in!
Yes, even their "humanifesto" is LLM output, and is written almost exclusively in the "it's not X <emdash> it's Y" style.
Those are all situationally-valid criticisms, but I've long thought the ability to have smartphones' cameras cryptographically sign photos is good when available. The use case is demonstrating a photo wasn't doctored, and that it came from a device associated with e.g. a journalist, who maintains a public key. Of course, it should be optional.
magicseth 3 hours ago [-]
Yes! That's what I'm getting at. This protocol optionally allows you to sign with your private key, but you don't have to for the protocol to provide utility. It could just be enough to say "if you trust magicseth's binary and apple, then this was typed one letter at a time"
There's nothing stopping folks from typing a message an LLM wrote one at a time, but the idea of increasing the human cost of sending messages is an interesting one, or at least I thought :-(
magicseth 3 hours ago [-]
Hi! I want anonymity! I also want to be able to prove what level of effort has been put in to something. I think there's room for both. This is an encrypted proof that I wrote something on a keyboard that tracks fingers. The protocol allows you to optionally sign it with your identity, but that isn't strictly required.
It is an attempt at putting something into the conversation more than just "OSS is broken because there are too many slop PRs." What if OSS required a human to attest that they actually looked at the code they're submitting? This tool could help with that.
Yes LLMs were used greatly in the production of this prototype!
It doesn't change the goal of the experiment, or its potential utility! Do you see any potential area in your world where some piece of this is valuable?
Arainach 4 hours ago [-]
> Yes, even their "humanifesto" is LLM output, and is written almost exclusively in the "it's not X <emdash> it's Y" style.
There are six emdashes on that page. NONE of them are "it's not X it's Y".
> Emails, messages, essays, code reviews, love letters — all suspect.
> We believe this can be solved — not by detecting AI, but by proving humanity.
> KeyWitness captures cryptographic proof at the point of input — the keyboard.
> When you seal a message, the keyboard builds a W3C Verifiable Credential — a self-contained proof that can be verified by anyone, anywhere, without trusting us or any central authority.
> That's an alphabet of 774 symbols — each carrying log2(774) ≈ 9.6 bits. 27 emoji for 256 bits.
> They're a declaration: this message was written by a person — one of the diverse, imperfect, irreplaceable humans who still choose to type their own words.
Clarifications: 4
Continuation from a list: 1
Could just be a comma: 1
"It's not X -- it's Y": 0.
If you're going to make lazy commentary about good writing being AI, please at least be sure that you're reading the content and saying accurate things.
magicseth 3 hours ago [-]
It is largely written by iteration with an LLM! No need to speculate or analyze em dashes :-)
The emoji idea was mine. I like it :-) unfortunately it doesn't work in places like HN that strip out emoji. So I had to make a base64 encoding option.
The goal was to create an effective encryption key for the url hash (so it doesn't get sent to the server). And encoding skin tone with human emojis allows a super dense bit/visual character encoding that ALSO is a cute reference to the humans I'm trying to center with this project!
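The arithmetic behind the "27 emoji for 256 bits" figure quoted above can be checked with a short sketch. The alphabet size of 774 comes from the quoted page; the function name `symbols_needed` and the comparison against a 64-symbol base64 alphabet are illustrative assumptions, not the project's actual code.

```python
def symbols_needed(alphabet_size: int, key_bits: int) -> int:
    """Smallest n such that alphabet_size**n >= 2**key_bits.

    Uses exact integer arithmetic, so there is no floating-point rounding.
    """
    target = 2 ** key_bits
    capacity = 1  # number of distinct values n symbols can represent
    n = 0
    while capacity < target:
        capacity *= alphabet_size
        n += 1
    return n


# 774-symbol emoji alphabet (figure taken from the quoted page)
emoji_len = symbols_needed(774, 256)   # 27
# 64-symbol base64 alphabet, for comparison
base64_len = symbols_needed(64, 256)   # 43
print(emoji_len, base64_len)
```

Under these assumptions a 256-bit key fits in 27 emoji versus 43 base64 characters, which is the density gain the comment describes.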
josephg 3 hours ago [-]
> We believe this can be solved — not by detecting AI, but by proving humanity
“It's not X -- it's Y": 1
arrowsmith 28 minutes ago [-]
From their “how it works” page:
> The server stores an encrypted blob it can't decrypt. We couldn't read your messages even if we wanted to. That's not a policy — it's math.
If you can’t tell that this is AI slop then maybe KeyWitness does solve a real problem after all.
dandellion 3 hours ago [-]
It's either a bot, or someone who writes exactly like a bot. I don't care which it is, both go to the discard pile.
arrowsmith 26 minutes ago [-]
It’s a product for people who need help telling whether text was written by AI.
Maybe they deliberately write it like that, to filter out people who aren’t the target market?
magicseth 3 hours ago [-]
phew!
Velocifyer 4 hours ago [-]
<redacted because my friend posted it but accidentally used my account>
magicseth 3 hours ago [-]
Oh you think it's stupid? It was an attempt to encode an encryption key that isn't sent to the server in a way that is minimally invasive. The skin tone emojis allow pretty high byte density, and also are cute!
Sorry it doesn't meet your needs.
There is irony in having an ai generated humanifesto. Could it be intentional? hmm?
Is there no irony in deriding a project for being potentially LLM generated, when its goal is to aid people in differentiating?
:shrug:
arrowsmith 16 minutes ago [-]
You’re getting a negative reaction from others but I share this feedback in good faith: I don’t understand what problem your product is supposed to solve.
Yeah I guess the cryptographic stuff sounds vaguely impressive although it’s been a long time since I had to think about cryptography in detail. But what is this _for_? I’m going to buy an expensive keyboard so that I can send messages to someone and they’ll know it’s really me – but it has to be someone who a) doesn’t trust me or any of our existing communication channels and b) cares enough to verify using this weird software? Oh and it’s important they know I sent it from a particular device out of the many I could be using?
Who is that person? What would I be sending them? What is the scenario where we would both need this?
Also the server can’t read the message but the decryption key is in the URL? So anyone with the URL can still read it? Then why even bother encrypting it?
Maybe this is one of those cases where I’m so far outside your target market that it was never supposed to make sense to me but I feel like I’m missing something here. Or maybe you need to work on your elevator pitch.
Just sharing my honest reaction.
Terretta 2 hours ago [-]
The first widely distributed and open source version of this typist timing validation idea I saw (and incorporated into my own software at the time) was released by Michael Crichton as part of a password 2nd-factor checker (1st factor a known phrase or even your name, the 2nd factor being your idiosyncratic typing pattern) in Creative Computing magazine that printed the code.
Somewhere there is someone 3D printing a keyboard cover that an llm can type with.
magicseth 3 hours ago [-]
I'm actually building a physical keyboard for those people who don't have iphones! Though given the reaction I'm seeing here, I probably won't share it with this audience :-P it has capacitive keys, a secure enclave, and a fingerprint sensor.
Velocifyer 4 hours ago [-]
This does not prove anything and it is only available to users with X.com accounts (you need an X.com account to download the app).
magicseth 3 hours ago [-]
Hi! You don't need an x.com account to download, that's just the easiest way to dm me. If you're actually interested, I can let you try it! The source is also available.
It proves 1) that an apple device with a secure enclave signed it. 2) that my app signed it.
If you trust the binary I've distributed is the same as the one on the app store, then it also proves:
3) that it was typed on my keyboard not using automation (though as others have mentioned, you could build a capacitive robot to type on it)
4) that the typer has the same private key as previous messages they've signed (if you have an out of band way to corroborate that's great too)
5) optionally, that the person whose biometrics are associated with the device approved it.
There is also an optional voice to text mode that uses 3d face mesh to attempt to verify the words were spoken live.
Not every level of verification is required by the protocol, so you could attest that it was written on a keyboard, but not who wrote it (not yet implemented in the client app).
The protocol doesn't require you to run my app, if you compile it yourself, you can create your own web of trust around you!
Velocifyer 2 hours ago [-]
>that an apple device with a secure enclave signed it.
What Apple devices are supported? All I have is an iPhone 4 running an old iOS version (pre iOS 7) (which I will not update and I don't think has a secure enclave) and an M1 Mac mini and some Lightning EarPods and an Apple Thunderbolt Display and some USB-A chargers and some old MacBooks.
I think that the concept is stupid because it would require somehow proving that the app is not modified (which is impractical) and that there is no stylus on a motor or fake screen (which is also impractical).
I think that a better approach would be to form a Web Of Trust where only people's (not just humans, this would include all animals and potentially aliens but no clankers) certificates are signed, but with an interface that is friendly to people who are not very into technology and with some sort of way to not have who your friends are revealed, but this would still allow someone to get an attestation for their robot.
toss1 3 hours ago [-]
Oh Gawd, not this idea again!
This idea of capturing the timing of people's keystrokes to identify them, ensure it is them typing their passwords, or even using the timing itself as a password has been recurring every few years for at least three decades.
It is always just as bad. Because there are so many cases where it completely fails.
The first case is a minor injury to either hand — just put a fat bandage on one finger from a minor kitchen accident, and you'll be typing completely differently for a few days.
Or, because I just walked into my office eating a juicy apple with one hand and I'm in a hurry typing my PW with my other hand because someone just called with an urgent issue I've got to fix, aaaaannnd, your software balks because I'm typing with a completely different cadence.
The list of valid reasons for failure is endless wherein a person's usual solid patterns are good 90%+ of the time, but will hard fail the other 10% of the time. And the acceptable error rate would be 2-4 orders of magnitude less.
It's a mystery how people go all the way to building software based on an idea that seems good but is actually bad, without thinking it through, or even checking how often it has been done before and failed?
monocularvision 2 hours ago [-]
You might want to check out “How it Works” on the site as none of what you said applies: https://typed.by/how
josefx 1 hours ago [-]
Then why does your link claim the following?
> While you type, the keyboard quietly records how you type — the rhythm, the pauses between keys, where your finger lands, how hard you press.
> Nobody types the same way. Your pattern is as unique as your handwriting. That's the signal.
arrowsmith 5 minutes ago [-]
I’m sceptical about this idea but, to give it full credit, it’s a custom piece of hardware that would presumably be more accurate than previous software-only attempts. Maybe it will actually work this time, idk, although I still don’t really see the point.
magicseth 3 hours ago [-]
That's not what this is. at all.
jagged-chisel 5 hours ago [-]
Sounds like we’re bringing back the PGP key signing parties
__MatrixMan__ 5 hours ago [-]
The sooner we do the better.
hathawsh 4 hours ago [-]
I wonder what the PGP signing concept does to thwart people who want to profit and don't care about the public good. It seems like anyone who attends a signing party can sell their key to the highest bidder, leading to bots and spammers all over again.
__MatrixMan__ 10 minutes ago [-]
In the flat trust model we currently use most places, it's on each person to block each spammer, bot, etc. The cost of creating a new bot account is low so it's cheap to make them come back.
On a web of trust, if you have a negative interaction with a bot, you revoke trust in one of the humans in the chain of trust that caused you to come in contact with that bot. You've now effectively blocked all bots they've ever made or ever will make... At least until they recycle their identity and come to another key signing party.
Once you have the web in place though, a series of "this key belongs to a human" attestations, then you can layer metadata on top of it like "this human is a skilled biologist" or "this human is a security expert". So if you use those attestations to determine what content your exposed to then a malicious human doesn't merely need to show up at a key signing party to bootstrap a new identity, they also have to rebuild their reputation to a point where you or somebody you trust becomes interested in their content again.
Nothing can be done to prevent bad people from burning their identities for profit, but we can collectively make it not economical to do so by practicing some trust hygiene.
Key signing establishes a graph upon which more effective trust management becomes possible. It on its own is likely insufficient.
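The revocation idea in this comment can be sketched as a reachability check over a trust graph. Everything here is a hypothetical illustration: the names, the `trust` dictionary, and the `reachable` helper are assumptions, and real web-of-trust systems also weigh path length and signature validity.

```python
from collections import deque
from typing import Dict, Optional, Set


def reachable(trust: Dict[str, Set[str]], root: str,
              revoked: Optional[Set[str]] = None) -> Set[str]:
    """Identities reachable from root via 'vouches for' edges.

    Any identity in `revoked` is skipped, which also cuts off
    everything that was only reachable through it.
    """
    revoked = revoked or set()
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for vouched in trust.get(current, set()):
            if vouched not in seen and vouched not in revoked:
                seen.add(vouched)
                queue.append(vouched)
    return seen


# Hypothetical graph: mallory attended a key signing party, then vouched for bots.
trust = {
    "me": {"alice", "bob"},
    "alice": {"carol"},
    "bob": {"mallory"},
    "mallory": {"bot1", "bot2"},
}

print(sorted(reachable(trust, "me")))
print(sorted(reachable(trust, "me", revoked={"mallory"})))
```

Revoking trust in the one human who vouched for the bots removes every bot they created, past or future, without blocking each bot individually.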
0x3f 4 hours ago [-]
You can never prevent things like this, but you can make it expensive enough to effectively solve the problem for almost all use cases.
zar1048576 1 hours ago [-]
Definitely miss those!
tshaddox 4 hours ago [-]
Doesn’t really make sense, because any service can just say “you must paste your human-attestation JWT here to use this service” and plenty of people will.
0x3f 4 hours ago [-]
You can just decay your trust level based on the `iat` value. That way people will need to keep buying me coffee. I can optionally chide them for giving out their token.
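A minimal sketch of decaying trust from the `iat` (issued-at) claim, assuming the token's signature has already been verified and its claims decoded into a dictionary. The exponential curve, the 30-day half-life, and the name `trust_level` are assumptions chosen for illustration; the comment does not specify a decay function.

```python
import time
from typing import Optional

# Assumption: trust halves every 30 days since the token was issued.
HALF_LIFE_SECONDS = 30 * 24 * 3600


def trust_level(claims: dict, now: Optional[float] = None) -> float:
    """Return a trust score in [0, 1] that decays with token age.

    `claims` is an already-verified JWT payload; `iat` is seconds
    since the Unix epoch, as defined for JWTs.
    """
    now = time.time() if now is None else now
    age = now - claims["iat"]
    if age < 0:
        return 0.0  # issued in the future: treat as untrusted
    return 0.5 ** (age / HALF_LIFE_SECONDS)


fresh = trust_level({"iat": 1_700_000_000}, now=1_700_000_000)
month_old = trust_level({"iat": 1_700_000_000},
                        now=1_700_000_000 + HALF_LIFE_SECONDS)
print(fresh, month_old)
```

A verifier would then compare the score against its own threshold, so a stale attestation has to be renewed in person.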
If you're engaging with the idea seriously, I suppose we'd need to build a reputation or trust network or something.
Although if you're talking about replay attacks specifically, there are other crypto based solutions for that.
tshaddox 20 minutes ago [-]
My point is that there probably is no way in principle to distinguish between a human user utilizing automation on their own behalf in good faith (e.g. RSS readers) and bad faith automations.
magicseth 3 hours ago [-]
I am engaging with this seriously! I don't know if there will be any real solution. But I think it's worth exploring.
kevin_thibedeau 3 hours ago [-]
I've been doing that for years. Cloudflare is slowly breaking more and more of the web.
atoav 3 hours ago [-]
What if I run a website and OpenAI produces bot traffic? Do they also consider it abuse when they do it?
madrox 4 hours ago [-]
I am not Nick, but there's a few ways that world happens: the free tier goes away and what people pay for more correctly reflects what they use, this all becomes cheap enough that it doesn't matter, or we come up with an end to end method of determining usage is triggered by a person.
Another way is to just do better isolation as a user. That's probably your best shot without hoping these companies change policies.
gruez 5 hours ago [-]
>It's getting to the point where a user needs at minimum two browsers. One to allow all this horrendous client checking so that crucial services work, and another browser to attempt to prevent tracking users across the web.
What are you talking about? It works fine with firefox with RFP and VPN enabled, which is already more paranoid than the average configuration. There are definitely sites where this configuration would get blocked, but chatgpt isn't one of them, so you're barking up the wrong tree here.
SV_BubbleTime 5 hours ago [-]
Firefox multicontainers are pretty cool. But it’s an advanced process that most people wouldn’t do or do correctly.
Sabinus 4 hours ago [-]
I love the containers too. My current use case is to keep my YouTube account separate from my Google one. Google doesn't need all that behavioural data in one place.
It's a pity Firefox doesn't get the praise it deserves half as much as it cops criticism.
halJordan 4 hours ago [-]
It is absolutely not an advanced process. It's clicking a gui. It's not advanced thinking to understand profiles. It's a basic ability to hold multiple things in your mind at once. Telling people that's difficult only increases the societal problem that being ignorant is ok.
docjay 3 hours ago [-]
“Difficult” is a relative term. They were saying it was a difficult concept for them, not you. In order to save their ego, people often phrase those events to be inclusive of the reader; it doesn’t feel as bad if you imagine everyone else would struggle too. Pay attention and you’ll notice yourself doing it too.
“Ignorant” is also infinite - you’re ignorant of MANY things as well, and I’m sure you would struggle with things I can do with ease. For example, understanding the meaning behind what’s being said so I know not to brow-beat someone over it.
SV_BubbleTime 49 minutes ago [-]
Mostly right; it’s not that it was difficult for me. It’s that normal people are never going to do it.
I’m almost endlessly surprised by the probably-autistic-spectrum responses to tech things from people with no idea how things seem to other people.
Imustaskforhelp 5 hours ago [-]
The possibilities with Firefox multi containers and automation scripts as well are truly endless.
It's also possible to make Firefox route each container through a different proxy, which could even be running locally and can then connect to multiple different VPNs. I haven't tried doing that but it's certainly possible.
It's sort of possible to run different browsers with completely new identities and sometimes IP within the convenience of one. It's really underrated. I don't use the IP part of this that I have mentioned but I use multi containers quite a lot on zen and they are kind of core part of how I browse the web and there are many cool things which can be done/have been done with them.
halflife 5 hours ago [-]
Don’t know if it’s related to the article, but the chats ui performance becomes absolutely horrendous in long chats.
Typing in the chat box is slow, rendering lags and sometimes gets stuck altogether.
I have a research chat that I have to think twice before messaging because the performance is so bad.
Running on iPhone 16 safari, and MacBook Pro m3 chrome.
DenisM 4 hours ago [-]
In the good old days Netflix had "Dynamic HTML" code that would take a DOM element which scrolled out of the viewport and move it to the position where it was about to be scrolled in from the other end. Hence the number of DOM elements stayed constant no matter how far you scroll and the only thing that grows is the Y coordinate.
They did it because a lot of devices running Netflix (TVs, DVD players, etc) were underpowered and Netflix was not keen on writing separate applications. They did, however, invest into a browser engine that would have HW acceleration not just for video playback but also for moving DOM elements. Basically, sprites.
The lost art of writing efficient code...
zdragnar 4 hours ago [-]
> Hence the number of DOM elements stayed constant no matter how far you scroll and the only thing that grows is the Y coordinate.
This is generally called virtual scrolling, and it is not only an option in many common table libraries, but there are plenty of standalone implementations and other libraries (lists and things) that offer it. The technique certainly didn't originate with Netflix.
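The core of virtual scrolling is a small calculation: given the scroll offset, work out which rows intersect the viewport and render only those. The sketch below assumes fixed-height rows and integer pixel values; the function name `visible_range` and the `overscan` parameter are illustrative, not taken from any particular library.

```python
from typing import Tuple


def visible_range(scroll_top: int, viewport_height: int, row_height: int,
                  total_rows: int, overscan: int = 2) -> Tuple[int, int]:
    """Return (first, last) row indices to render, with last exclusive.

    `overscan` renders a few extra rows on each side so that fast
    scrolling does not reveal blank space.
    """
    first = scroll_top // row_height - overscan
    # Ceiling division: index just past the last row touching the viewport.
    bottom = scroll_top + viewport_height
    last = (bottom + row_height - 1) // row_height + overscan
    return max(0, first), min(total_rows, last)


# 10,000 rows of 20px in a 600px viewport
print(visible_range(0, 600, 20, 10_000))        # (0, 32)
print(visible_range(100_000, 600, 20, 10_000))  # (4998, 5032)
```

However far the list is scrolled, only a few dozen rows exist at once; each rendered row is positioned at `index * row_height`, which is the "only the Y coordinate grows" behavior described above.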
tmpz22 3 hours ago [-]
It's been about three years but infinite scroll is nuanced depending on the content that needs to be displayed. It's a tough nut to crack and can require a lot of maintenance to keep stable.
None of which chatgpt can handle presumably.
dotancohen 3 hours ago [-]
And yet ChatGPT does not use it.
GP was mentioning that a solution to the problem exists, not that Netflix specifically invented it. Your quip that the technique is not specific to Netflix bolsters the argument that OpenAI should code that in.
jasonfarnon 3 hours ago [-]
I'm ignorant of the tech here. But I have noticed that ctrl-F search doesn't work for me on these longer chats. Which is what made me think they were doing something like virtual scrolling. I can't understand how the UI can get so slow if a bunch of the page is being swapped out.
dotancohen 2 hours ago [-]
Ctrl-A for select all doesn't work either. I actually wondered how they broke that.
BoorishBears 3 hours ago [-]
They didn't actually name the solution: the solution is virtualization.
They described Netflix's implementation, but if someone actually wanted to follow up on this (even for their own personal interest), Dynamic HTML would not get you there, while virtualization would across all the places it's used: mobile, desktop, web, etc.
groundzeros2015 3 hours ago [-]
This is how every scrolling list has been implemented since the 80s. We actually lost knowledge about how to build UI in the move to web
bloomca 2 hours ago [-]
The biggest issue is that there is no native component support for that. So everyone implements their own and it is both brittle and introduces some issues like:
- "ctrl + f" search stops working as expected
- the scrollbar has wrong dimensions
- sometimes the content might jump (common web issue overall)
The reason why we lost it is because web supports wildly different types of layouts, so it is really hard to optimize the same way it is possible in native apps (they are much less flexible overall).
TeMPOraL 2 hours ago [-]
Right. This is one of my favorite examples of how badly bloated the web is, and how full of stupid decisions. Virtual scrolling means you're maintaining a window into content, not actually showing full content. Web browsers are perfectly fine showing tens of thousands of lines of text, or rows in a table, so if you need virtual scrolling for less, something already went badly wrong, and the product is likely to be a toy, not a tool (working definition: can it handle realistic amount of data people would use for productive work - i.e. 10k rows, not 10 rows).
bschwindHN 1 hours ago [-]
Almost certainly running some sort of O(n^2) algorithm on the chat text every key press. Or maybe just insane hierarchies of HTML.
Either way, pretty wild that you can have billions of dollars at your disposal, your interface is almost purely text, and still manage to be a fuckup at displaying it without performance problems.
qingcharles 1 hours ago [-]
OpenAI sites are the only ones that do this to me. I have to keep a separate browser profile just for my OpenAI login with absolutely nothing installed on it or it'll end up being dogshit slow and unusable.
stacktraceyo 5 hours ago [-]
Same. It’s wild how bad it can get with just like a normal longer running conversation
PunchyHamster 53 minutes ago [-]
That's how eating your own dogshit works, or whatever was that saying
moffkalast 4 hours ago [-]
Yeah just had this earlier today, I had to write my response in vscode and paste it in, there were literal seconds of lag for typing each character. Typical bloated React.
scq 4 hours ago [-]
Just because a web application uses React and is slow, it does not follow that it is slow because of React.
It's perfectly possible to write fast or slow web applications in React, same as any other framework.
Linear is one of the snappiest web applications I've ever used, and it is written in React.
brigandish 3 hours ago [-]
Does not, in the seeming absence of other snappy examples and the overwhelming evidence of many, many slow React apps, the exception prove the rule?
scq 3 hours ago [-]
There are plenty of snappy examples. Off the top of my head: Discord, Netflix, Signal Desktop, WhatsApp Web.
driverdan 2 hours ago [-]
Brand new account with 2 comments in this thread. How can we be sure you're not a bot deployed to defend OpenAI?
Please run Cloudflare's privacy invasive tool and share all the values it generates here so we can determine if you're a real person.
sebmellen 5 hours ago [-]
Great to hear from a first-party source. I'm a Pro subscriber and my team spends well over two thousand dollars per month on OpenAI subscriptions. However, even when I'm logged in with my Pro account, if I'm using a VPN provider like Mullvad, I often have trouble using the chat interface or I get timeout errors.
Is this to be expected? I would presume that if I'm authenticated and paying, VPN use wouldn't be a worry. It would be nice to be able to use the tool whether or not I'm on a VPN.
JumpCrisscross 2 hours ago [-]
> even when I'm logged in with my Pro account, if I'm using a VPN provider like Mullvad, I often have trouble using the chat interface or I get timeout errors
Heard from a founder who recently switched his company to Claude due to OpenAI's lagginess–it's absolutely an OpenAI problem. Not an AI problem in general.
seba_dos1 5 hours ago [-]
Hi! It's all perfectly understandable - after all, we use things like Anubis to protect our services from OpenAI and similar actors and keep them available to the real users for exactly the same reasons.
noosphr 5 hours ago [-]
>These checks are part of how we protect our first-party products from abuse like bots, scraping, fraud, and other attempts to misuse the platform.
Can you share these mitigations so we can mitigate against you?
0x3f 5 hours ago [-]
It's just Cloudflare. Bypassing it is a whole industry.
zenethian 3 hours ago [-]
I read the comment as “use it to mitigate against OpenAI bots scraping the web” and not to mitigate Cloudflare.
0x3f 3 hours ago [-]
Well it's the same answer isn't it... use Cloudflare. And hope OpenAI doesn't have a backroom scraping deal with them, which they might.
dawnerd 4 hours ago [-]
Flaresolverr is one way. Isn’t perfect but bypasses a lot.
c0_0p_ 5 hours ago [-]
Can't have those bots or scrapers running amok can we...
tipiirai 47 minutes ago [-]
I don't trust what OpenAI says. Sam Altman gives shivers, and these kinds of blog posts make things look even worse.
the_gipsy 4 hours ago [-]
But is the title true, is typing specifically blocked? Or does it just block submitting the text?
I ask because I have seen huge variations in load time. Sometimes I had to wait seconds until being able to type. Nowadays it seems better though.
mehov 5 hours ago [-]
> because we want to keep free and logged-out access
But don't you run these checks on logged-in users too?
MyNameIsNickT 5 hours ago [-]
Yep, on logged-in users too. The reason is basically the same: we want scarce compute going to real people, not attackers. Being logged in is one useful signal, but it doesn’t fully prevent automation, account abuse, or other malicious traffic, so we apply protections in both cases.
salawat 4 hours ago [-]
More like "We want your money, but don't want to provide service." Are you sure OpenAI isn't morphing into a finance/insurance company?
pixl97 3 hours ago [-]
While OAI is one of the more hypocritical of the bunch, it is not uncommon for paid services to have some limitations in their terms of service. Like going into a store and buying stuff, it doesn't mean a free-for-all where you do whatever you want.
zamadatix 2 hours ago [-]
Limitations on the ChatGPT subscription should have to do with the usage limits of the tier you paid for (and I don't think anyone has a problem with that). If I'm in the limits of requests I paid for then it's usage rather than abuse.
"Abuse" checks should only come into play when someone tries to leverage the free tier. It reminds me of those cable companies that try to sell "unlimited" plans and then try to say customers who use more than x GB/month are abusing the service rather than just say what the real limits are because "unlimited" sounds better in marketing.
angoragoats 4 hours ago [-]
Nothing you do can fully prevent automation. Someone who wants to automate requests badly enough will be able to do it, especially when the “protections” are as easy to decrypt and analyze as the OP proved.
Meanwhile, the rest of us (well, not me, because I don’t use your garbage product, but lots of others do) have to suffer and have our compute resources used up in the name of “protection.”
3form 4 hours ago [-]
Yeah, that's it. Also, it is a bit amusing to me - "We want to prevent automation", says the employee of Let's Automate Inc.
geetee 4 hours ago [-]
[flagged]
jorvi 3 hours ago [-]
I'm glad you guys at least went with CloudFlare. LMarena went with Google's ReCaptcha, which is plain evil. It'll often gaslight you and pretend you failed a captcha of identifying something as simple as fire hydrants. Another lovely trick is asking you to identify bridges or busses, but in actuality it also wants you to identify viaducts or semi-trucks.
pdntspa 4 hours ago [-]
Y'all just salty that DeepSeek et al are training their LLMs on yours
huertouisj 4 hours ago [-]
sometimes I paste giant texts (think summarization) in the chatgpt (paid) webapp and I noticed that the CPU fans spin up for about 5 seconds after, as if the text is "processed" client side somehow. this is before hitting "submit" to send the prompt to the model.
I assumed it was maybe some tokenization going on client side, but now I realize maybe it's some proof of work related to prompt length?
myHNAccount123 4 hours ago [-]
Can you fix the resizing text box issue on Safari when a new line is inserted? When your question wraps to a newline Safari locks up for a few seconds and it's really annoying. You can test by pasting text too.
JumpCrisscross 3 hours ago [-]
> we want to keep free and logged-out access available for more users
How does this comport with OpenAI's new B2B-first strategy?
> We also keep a very close eye on the user impact
Are paid or logged-in users also penalised?
dev1ycan 5 hours ago [-]
"abuse like bots, scraping, fraud, and other attempts to misuse the platform"
This has to be a joke, right?
pera 4 hours ago [-]
I really can't tell for sure (new user posting a ridiculously hypocritical corporate message on a Sunday) but if GP actually works for OpenAI the lack of self-awareness is seriously striking
singpolyma3 4 hours ago [-]
How?
oblio 3 hours ago [-]
Because OpenAI built their entire business around shamelessly scraping anything that had bits on it.
singpolyma3 2 hours ago [-]
Maybe. But scraping isn't abuse. Seems a bit different?
PunchyHamster 52 minutes ago [-]
Given that the scraping doesn't do any rate limiting and pisses on robots.txt, yes it is abuse
vkou 4 hours ago [-]
> Hey! I'm Nick, and I work on Integrity at OpenAI. These checks are part of how we protect our first-party products from abuse like bots, scraping, fraud, and other attempts to misuse the platform.
How can first-party products protect themselves from abuse by OpenAI's bots and scraping?
mystraline 4 hours ago [-]
This is a completely in-scope question.
How do we defend against your scraping, OpenAI?
I don't want any of my content scraped or seen by you all. Frankly, fuck you all for thinking my content is owned by you.
> OpenAI: These checks are part of how we protect products from abuse like bots, scraping, and other attempts to misuse the platform.
This would be fucking HILARIOUS if it wasn't so tragic.
rchaud 4 hours ago [-]
Manifest destiny for me, border enforcement for thee.
lmz 1 hours ago [-]
This kind of flawed thinking again. Like the natives didn't fight and lose wars against the manifest destiny types.
Chance-Device 5 hours ago [-]
It can be both
piskov 5 hours ago [-]
Tangential question: are there chatgpt app devs on X? There are a few from Codex team but I couldn’t find guys from “ordinary” chatgpt.
Also if you could pass this over: it takes 5 taps to change thinking effort on ios and none (as in completely hidden) on macos.
If I were to guess it seems that you were trying to lower the token usage :-). Why the effort is only nicely available on web and windows is beyond me
rglullis 4 hours ago [-]
I shouldn't be giving ideas to your boss, but I bet he would be interested in making ChatGPT available only to paying customers, or free for those who get their eyes scanned by The Orb. Give 30 days of raised limits and we're all set to live in the dystopia he wants.
user3939382 5 hours ago [-]
Have you given any thought to what we trade when big tech elects one corporation as the gatekeeper for vast swaths of the Internet?
tomalbrc 1 hours ago [-]
Fake Account
crest 3 hours ago [-]
Then make sure they only target the free tier!
0dayman 4 hours ago [-]
Hi Nick, your software is a horrendous encroachment on users' privacy and its quality is subpar to those of us who know what we're working with. We don't use your product here.
quotemstr 4 hours ago [-]
We really need ZKPs of humanity
ctoth 4 hours ago [-]
No, we really don't. We don't need worldcoin, we don't need papers, please. We just don't.
"Prove your humanity/age/other properties" with this mechanism quickly goes places you do not want it to go.
Muromec 3 hours ago [-]
> quickly goes places you do not want it to go.
Which places?
quotemstr 4 hours ago [-]
No, it doesn't go places we "do not want it to go". What part of zero knowledge doesn't make sense? How precisely does a free, unlinkable, multi-vendor, open-source cryptographic attestation of recent humanity create something terrible?
It would behoove people to engage with the substance of attestation proposals. It's lazy to state that any verification scheme whatsoever is equivalent to a panopticon, dystopia as thought-terminating cliche.
We really do have the technology now to attest biographical details in such a way that whoever attests to a fact about you can't learn the use to which you put that attestation and in such a way that the person who verifies your attestation can see it's genuine without learning anything about you except that one bit of information you disclose.
And no, such a ZK scheme does not turn instantly into some megacorp extracting monopoly rents from some kind of internet participation toll booth. Why would this outcome be inevitable? We have plenty of examples of fair and open ecosystems. It's just lazy to assert right out of the gate that any attestation scheme is going to be captured.
So, please, can we stop equating every scheme whatsoever for verifying facts about actors with the East German villain in a Cold War movie? We're talking about something totally different.
ctoth 4 hours ago [-]
The ZK part isn't the problem. The "attestation of recent humanity" part is. Who attests? What happens when someone can't get attested?
You've been to the doctor recently, right? Given them your SSN? Every identity system ever built was going to be scoped || voluntary. None of them stayed that way.
Once you have the identity mechanism, "Oh it's zero knowledge! So let's use it for your age! Have you ever been convicted?" which leads to "mandated by employers" which leads to...
We've seen this goddamn movie before. Let's just skip it this time? Please?
dzikimarian 4 hours ago [-]
The part where FAANG does the usual Embrace, Extend, Extinguish, the masses don't care or understand, and we have yet another "sign in with..." that is neither open source nor zero-knowledge in practice and monetizes your every move. And probably at least one of the vendors has a massive leak that reveals a half-assed or even deliberately flawed implementation.
gzread 1 hours ago [-]
Sure. I'll provide an API to provide mine to your bot for $1 each time.
thegreatpeter 4 hours ago [-]
You’re doing gods work sir, thank you!
nickphx 4 hours ago [-]
the irony of your statement is hilarious, disappointing, and infuriating.
huflungdung 3 hours ago [-]
[dead]
jgalt212 5 hours ago [-]
[dead]
lxgr 5 hours ago [-]
It's absurd how unusable Cloudflare is making the web when using a browser or IP address they consider "suspicious". I've lately been drowning in captchas for the crime of using Firefox. All in the interest of "bot protection", of course.
lucasfin000 5 hours ago [-]
The real frustrating part is that Cloudflare's "definition" of suspicious keeps changing and expanding. VPN users, privacy-first browsers, uncommon IP ranges, they all get flagged. The people most likely to get caught by these systems are exactly the ones who care most about their privacy, and not the bots that they are apparently targeting.
gruez 5 hours ago [-]
>The real frustrating part is that Cloudflare's "definition" of suspicious keeps changing and expanding.
That's... exactly expected? It's a cat and mouse game. People running botnets or AI scrapers aren't diligently setting the evil bit on their packets.
jagged-chisel 5 hours ago [-]
That’s obviously because they’re not being “evil”
lxgr 4 hours ago [-]
So the stable state here is all humans eventually being locked out? (Bots are getting better every day; I doubt the same is true for all humans, including those with weird browsers or networks unwilling to install some dystopian Cloudflare "Internet passport".)
But hey, at least some bots are also not making it past Cloudflare!
WatchDog 2 hours ago [-]
The inevitability is that these kinds of services just won't be offered without identifying yourself.
Claude's free tier requires a phone number just to try it.
small_scombrus 2 hours ago [-]
> So the stable state here is all humans eventually being locked out?
Yep. The easiest-to-implement stable state for any system where you're aiming to prevent misuse is to just prevent use.
Aurornis 2 hours ago [-]
> The people most likely to get caught by these systems are exactly the ones who care most about their privacy, and not the bots that they are apparently targeting.
In my brief experience with abuse mitigation, connections coming from VPNs or unusual IP ranges were very significantly more likely to be associated with abuse.
It depends on your users. VPNs aren’t common at all, even though you hear about them a lot on Hacker News. For types of social sites where people got banned for abuse (forums) the first step to getting back on the forum was always to sign up for a VPN and try to reconnect. It got so bad that almost every new account connecting via VPN would reveal itself as a spammer, a banned member trying to return, or someone trying to sock puppet alternate accounts for some reason.
The worst offenders are Tor IP addresses. Anyone connecting from Tor was basically guaranteed to have bad intentions.
I heard from someone who dealt with a lot of e-mail abuse that the death threats, extortion, and other serious abuse almost always came from Protonmail or one of the other privacy-first providers that I can’t remember right now. He half-jokingly said they could likely block Protonmail entirely without impacting any real users.
It’s tough for people who want these things for privacy, but the sad reality is that these same privacy protections are favored by people who are trying to abuse services.
whatisthiseven 5 hours ago [-]
Which VPNs are people using that actually care about the user's privacy? Most of them don't, sell their home IP to buyers, sell their DNS history to others, etc. Worse, some of them could require invasive MITM cert stuff most users will just click yes through.
I have yet to see a use case for VPNs for the casual internet audience, and for a tech savvy user, they're better off renting through some datacenter or something, which at that point is hardly a VPN and more home IP obfuscation. All the same downsides, and at least you get real privacy.
traceroute66 5 hours ago [-]
> Which VPNs are people using that actually care about the user's privacy?
Mullvad.
It has been proven in a court of law that when Mullvad says "no logging", they mean it.
They also regularly have security audits and publish the results[2][3]
Second for Mullvad. I am quite distrusting in general, but the more I know about Mullvad, the more I am convinced they really are serious about user privacy.
evilduck 5 hours ago [-]
Using any popular datacenter's IP range for a personal VPN is likely to be outright blocked.
Imustaskforhelp 5 hours ago [-]
Also you only get 1 IP so it's not really anonymous and you definitely would have a fingerprint.
thisisnow 4 hours ago [-]
you just rotate it?
lxgr 4 hours ago [-]
I'm forced to use a VPN to occasionally check my US bank account, since a foreign IP address is obviously a harbinger of unspeakable evil (while the friendly Youtube advertised neighborhood VPN is obviously evidence of pure intentions).
gruez 5 hours ago [-]
>Most of them don't, sell their home IP to buyers, sell their DNS history to others, etc. Worse, some of them could require invasive MITM cert stuff most users will just click yes through.
Source? I haven't seen any evidence that the major paid VPN providers engage in any of those things. At best it's vague implications something shady is happening because one of the key people was previously at [shady organization].
Imustaskforhelp 5 hours ago [-]
ProtonVPN with bitcoin which you get from a monero swap is a good idea for complete privacy if you want port forwarding.
MullvadVPN is also another great one.
I have heard some good things about AirVPN, but I can absolutely vouch for Mullvad and, to a degree, ProtonVPN (with Proton, depending on your threat model, do take the necessary precautions, like buying with Monero).
There are others, but mostly it's these 2-3 that I trust.
danielheath 5 hours ago [-]
Maybe check your network isn't sending web traffic you're not aware of?
I'm running firefox and seeing the normal amount.
jychang 5 hours ago [-]
Most people are on a CGNAT these days, drowning in captchas is the new normal. You’re at the mercy of one of your neighbors not hosting a botnet from their home computer.
perching_aix 5 hours ago [-]
For better or for worse, CF's fingerprinting and traffic filtering is a lot more in-depth than just IP trend analysis. Kind of by necessity, exactly because of what you mention. So I'd think that's not as big a worry per se.
lxgr 4 hours ago [-]
Yet here I am drowning in captchas every once in a while, so it's quite a big worry for me.
Maybe I just have to disable all ad blockers and Safari tracking prevention? Or I guess I could send a link to a scan of my photo ID in a custom request header like X-Please-Cloudflare-May-I-Use-Your-Open-Web?
perching_aix 4 hours ago [-]
> Yet here I am drowning in captchas every once in a while, so it's quite a big worry for me.
I think I was sufficiently clear that I was specifically talking about CGNAT-caused IP address tainting being an unreasonably emphasized worry, not the worry about their detections overall misfiring. Though I certainly don't hear much about people having issues with it (but then anecdotes are anecdotal).
> Or I guess I could send a link to a scan of my photo ID in a custom request header like X-Please-Cloudflare-May-I-Use-Your-Open-Web?
Sounds good, have you tried?
Not sure what's the point of these comically asinine rhetoricals.
tokioyoyo 5 hours ago [-]
Not even remotely true, I genuinely have no idea what you're talking about. The only time I get captcha'ed is when I sometimes VPN around, or do some custom browser stuff and etc. I'll even say I get captcha'ed less now than maybe 5 years ago.
cogman10 5 hours ago [-]
Every so often, usually after a firefox update, CF will get into a "I'm convinced you're a bot" mode with me. I can get out of it by solving 20 CAPTCHAs.
hansvm 5 hours ago [-]
It's probably just a higher rate of autonomous vehicles needing stop signs and buses identified at that moment, and cognitive bias causes you to only remember when that happens when you recently performed an update. /s
cogman10 5 hours ago [-]
My assumption is that CF has something like an SVM that it's feeding a bunch of datapoints into for bot detection. Go over some threshold and you end up in the CAPTCHA jail.
I'm certain the User-Agent is part of it. I know that for certain because a very reliable way I can trigger the CF stuff is this plugin with the wrong browser selected [1].
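To make the "score over a threshold" idea concrete, here is a minimal sketch of the decision function a linear SVM evaluates (a weighted sum plus a bias). It only mirrors the guess in this comment; nothing is known here about what Cloudflare actually runs, and every feature name, weight, and the bias below is invented for illustration.

```python
# Hypothetical features and weights, made up for this example only.
WEIGHTS = {"ua_mismatch": 2.0, "vpn_ip": 1.0, "no_js": 3.0}
BIAS = -2.5  # hypothetical offset; the decision boundary sits at score == 0


def decision_score(features):
    """Linear decision function: w . x + b, with missing features treated as 0."""
    return sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items()) + BIAS


def needs_captcha(features):
    """A positive score lands the client in 'CAPTCHA jail' in this toy model."""
    return decision_score(features) > 0


# A mismatched User-Agent alone stays under the threshold in this toy model,
# but combined with a VPN address it crosses it.
print(needs_captcha({"ua_mismatch": 1.0}))
print(needs_captcha({"ua_mismatch": 1.0, "vpn_ip": 1.0}))
```

With weights like these, no single signal is decisive, which would fit the experience of being fine most of the time and then suddenly flagged after one more thing changes.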
>It's probably just a higher rate of autonomous vehicles needing stop signs and buses identified at that moment
I can't tell whether you're serious but in case you are, this theory immediately falls apart when you realize waymo operates at night but there aren't any night photos.
hansvm 5 hours ago [-]
Thanks for the comment. Lack of seriousness is now appropriately indicated.
g-b-r 5 hours ago [-]
Maybe you allow tracking and cookies?
Eji1700 5 hours ago [-]
I don't, and I rarely have issues with firefox. Private + blockers + VPN causes, expected, issues but otherwise i'm usually fine?
girvo 4 hours ago [-]
Surprising really, because I'm a Firefox + Ublock Origin die hard and I never get Cloudflare captchas. Wonder what the difference is? I have CGNAT turned off, if that matters at all (probably not).
lxgr 4 hours ago [-]
I could definitely imagine a public IPv4 with lots of good, logged-in Cloudflare traffic to act as a positive signal for their heuristics, possibly even overriding the Firefox penalty.
ehnto 5 hours ago [-]
I recently had the insane experience of filling out 15 consecutive captchas after I had checked out and entered my payment information into the payment processor widget. I just wanted to submit the order. I was logged in to their website, and the bank even needed a one time code for payment. If the bank is pretty sure I am human, then surely your ecomm site can figure it out.
At least outside the US, there's 3DS as an (admittedly often high friction) high quality cardholder verification method, but in the US, that's of course considered much too consumer-hostile, so "select 87 overpasses" it is.
amatecha 5 hours ago [-]
A while back I was buying tickets for a gondola for a trip in Europe and the checkout process failed during payment because their site didn't load their analytics/tracking stuff with proper error-handling, so when my ad-blocker prevented the tracking stuff, their checkout process failed to handle my CC's 2-factor auth and the checkout would fail. Had to contact my CC company and work with the gondola company to tell them what they're doing wrong so they could fix their website code. Pretty sad to know whoever built their stuff actually shipped a checkout flow (for a VERY popular tourist destination) without testing with ad-blockers enabled.
lxgr 4 hours ago [-]
To be fair, this sometimes seems on the ad blocker. I've definitely seen mine accidentally nuke part of the payment Javascript (or maybe the 3DS iframe?) because some substring of it matched some common ad URL, which is obviously unrecoverable for the site itself.
binaryturtle 4 hours ago [-]
I'm on a slightly older Firefox and can't use many websites at all anymore because of the Cloudflare cancer.
Of course then you've got sites like gnu.org too that block you because of your slightly outdated user agent.
onion2k 5 hours ago [-]
Is that because botnets spoof being Firefox? It's not really fair to blame Cloudflare if it is. That's on the bots.
doctaj 5 hours ago [-]
In what way would that not be fair? Their product giving false positives (unnecessary challenges for a normal browser humans commonly use) to real people is definitely their fault.
gruez 5 hours ago [-]
>Their product giving false positives (unnecessary challenges for a normal browser humans commonly use) to real people is definitely their fault.
Is it TSA's "fault" that non-terrorists are subject to screening?
lxgr 4 hours ago [-]
No, but it's entirely within TSA's hands to make that process as frictionless as possible.
(Whether zero friction is actually desired, or whether some security theater is part of the service being provided, is a different question.)
forkerenok 5 hours ago [-]
We're discussing the quality of screening here, not the act/necessity of screening itself.
gruez 5 hours ago [-]
>We're discussing the quality of screening here
The "quality" of TSA's screening seems to be pretty bad too, given how many people have to go through secondary screening vs how many terrorists they catch (0?)
bdangubic 4 hours ago [-]
they caught 11 million by now (just as arbitrary as your 0 but probably more accurate since we haven’t had a large terrorist attack since they got the gig to serve and protect and before we lost thousands of lives…)
gruez 3 hours ago [-]
>they caught 11 million by now (just as arbitrary as your 0 but probably more accurate
Nice try but I used "caught", not "stopped", which requires they actually apprehended someone, not just prevented some hypothetical attack.
>since they got the gig to serve and protect and before we lost thousands of lives…)
You could easily reuse this argument for cloudflare: "if it wasn't for such invasive browser fingerprinting openai would be drowning in bajillion req/s from bots."
bdangubic 3 hours ago [-]
> “if it wasn't for such invasive browser fingerprinting openai would be drowning in bajillion req/s from bots."
of course they would be drowning! I have no issues with what CF is doing. too funny that people use tools like chatgpt and expect privacy?!
DonHopkins 5 hours ago [-]
They are failing to meet their quotas of shooting innocent people in the face, so ICE is helping out.
lxgr 4 hours ago [-]
No, using a stupid authentication/verification method with lots of false positives is always on whoever deploys it.
Imagine an apartment building with a flimsy front door lock that breaks all the time, and the landlord only telling you that that can't be helped because of all the burglars.
josephcsible 4 hours ago [-]
If it's just as easy to spoof being Chrome as it is to spoof being Firefox, then it is indeed fair to blame Cloudflare if they give Firefox users more CAPTCHAs than Chrome users.
conradkay 5 hours ago [-]
Not really, there's camoufox but the vast majority use modified chrome/chromium
segmondy 2 hours ago [-]
try using firefox and then using a cellphone network for internet. sometimes i can't access a site, because i get infinite captcha. i know what a damn bus, stairwell, stop light or motorcycle looks like.
dawnerd 4 hours ago [-]
I’ve been getting it in safari too. It’s ridiculous frankly. My residential ip must have been flagged or something. The part that’s really annoying is it’s trivial for bots to bypass.
lxgr 4 hours ago [-]
> I’ve been getting it in safari too.
I'm getting it on iCloud Private Relay all the time. It honestly makes it kind of useless.
Maybe that's the point? But then again, doesn't Cloudflare run part of it!? And wasn't there some "privacy-preserving captcha replacement" that iOS devices should already be opting me in to? So many questions, nobody there to answer them, because they can get away with it.
> The part that’s really annoying is it’s trivial for bots to bypass.
Not the ethical bots, though! My GPT-backed Openclaw staunchly refuses to go anywhere near an "I'm not a robot" button.
tshaddox 4 hours ago [-]
Is anyone talking about the fact that this is a fundamental design flaw of the web? Or arguably even the entire Internet?
3form 4 hours ago [-]
It's hard to call something a "fundamental flaw of web" if it wasn't an issue for 30 years. Unless you mean something more general that I'm missing.
pixl97 5 minutes ago [-]
A flaw can be fundamental but not immediate. It's probably better to say it's a fundamental flaw of the open web, that is the system collapses as the number of bad actors increases, and there is no way to prevent bad actors and have the system keep the name as open web.
tshaddox 22 minutes ago [-]
Arguably it didn’t see widespread commercial adoption for 30 years, and you wouldn’t expect fundamental design flaws regarding commercial incentives to manifest before that.
fastball 48 minutes ago [-]
Cloudflare isn't providing Turnstile as a service in a vacuum, this is a direct response to bad actors who can trivially abuse the web.
amatecha 5 hours ago [-]
These days I just close sites that show that "checking if you're a bot" shit. If this is how the web is going to be now, I don't care, I'll just not use it. I didn't need to see that article or post that badly anyways. I'm tired of paying the price for the sociopathic, greedy actions of others. It's especially bad for anyone who uses an open source OS like Linux or *BSD (to the extent many sites just block me automatically with a 403 Forbidden simply for using OpenBSD + Firefox, completely free pass if I try the same site from a Windows or Linux computer).
jgalt212 5 hours ago [-]
We use Cloudflare to protect our content, but at the same time our machines mostly run Linux / Firefox so it really is quite a frustrating relationship. It really bums me out how much of Turnstile boils down to these two questions:
is it Linux (or similar)?
is it Firefox?
If yes, to one or both, you're blocked! Clearly millions of dollars of engineering talent and petabytes of data collection should be able to come up with something more nuanced than this.
dheera 5 hours ago [-]
Exactly. For the most part all this bot protection is only protecting these websites against humans.
I don't do free work. I'm not going to label 50 images of crosswalks and motorcycles for free.
ronbenton 5 hours ago [-]
> For the most part all this bot protection is only protecting these websites against humans.
Curious how do you know this?
EGreg 5 hours ago [-]
Well, that's for the public internet.
I'm building Safebox and Safecloud, where this won't be the case anymore. Not only will you have a decentralized hosting network that can sideload resources (e.g. via a browser extension that looks at your "integrity" attribute on websites) but also the websites will require you to be logged in with a HMAC-signed session ID (which means they don't need to do any I/O to reject your requests, and can do so quickly)... so the whole thing comes down to having a logged in account.
As far as server-to-server requests, they'll be coming from a growing network of cryptographically attested TPMs (Nitro in AWS, also available in GCP, IBM, Azure, Oracle etc.) so they'll just reject based on attestations also.
In short... the cryptographically attested web of trust will mean you won't need cloudflare. What you will need, however, to prevent sybil attacks, is age verification of accounts (e.g. Telegram ID is a proxy for that if you use Telegram for authentication).
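The "no I/O to reject" property of an HMAC-signed session ID can be sketched in a few lines. This is a minimal illustration using Python's standard `hmac` module, not Safebox's actual scheme; the token layout (`<id>.<tag>`), the function names, and the key handling are all assumptions made for the example.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real deployment would load it from config.
SERVER_KEY = secrets.token_bytes(32)


def issue_session_id(key=SERVER_KEY):
    """Return '<random id>.<hex HMAC-SHA256 tag over the id>'."""
    session_id = secrets.token_hex(16)
    tag = hmac.new(key, session_id.encode(), hashlib.sha256).hexdigest()
    return f"{session_id}.{tag}"


def is_authentic(token, key=SERVER_KEY):
    """Verify the tag using only the key: no database or cache lookup needed."""
    session_id, sep, tag = token.partition(".")
    if not sep:
        return False  # malformed token, rejected without any I/O
    expected = hmac.new(key, session_id.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the tag matched.
    return hmac.compare_digest(tag.encode(), expected.encode())


token = issue_session_id()
print(is_authentic(token))                # a token this server issued
print(is_authentic("forged-id.deadbeef")) # a guessed or forged token
```

Note what this does and does not buy: forged or random IDs are rejected with pure computation, but a valid signature only shows the server once issued the ID. Revocation or expiry would still need either stored state or extra signed fields such as a timestamp.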
password4321 5 hours ago [-]
Wow, if Seinfeld can have a soup nazi, I think it's within reason for you to be called the internet nazi.
"No s̶o̶u̶p̶ internet for you!"
Good luck!
ale42 5 hours ago [-]
This was sarcasm, right?
simonw 6 hours ago [-]
Presumably this is all because OpenAI offers free ChatGPT to logged out users and doesn't want that being abused as a free API endpoint.
NotPractical 5 hours ago [-]
But do they do it whether you're logged in or not?
I noticed the ChatGPT app also checks Play Integrity on Android (because GrapheneOS snitches on apps when they do this), probably for the same reason. Claude's app doesn't, by the way, but it also requires a login.
Gander5739 4 hours ago [-]
Because accounts are free, and could still be abused as a free endpoint, with a little trickiness.
appreciatorBus 6 hours ago [-]
Yup.
Coincidentally about an hour ago, I wanted to look something up in ChatGPT and I happened to be in a browser window I don’t normally use, with no logged in accounts. I assumed it wouldn’t work, but to my surprise with no account, no cookies of any kind it took my query and gave me an answer.
gruez 5 hours ago [-]
>I assumed it wouldn’t work, but to my surprise with no account, no cookies of any kind it took my query and gave me an answer.
They allowed anonymous requests for months now, maybe even a year.
aziaziazi 5 hours ago [-]
I used to mostly use chatgpt in an incognito tab, logged out. Until I noticed it seemed to have some context from my logged-in session, and from the logged-out one as well. It may be paranoia or prompt deduction, but that felt strange.
FergusArgyll 4 hours ago [-]
Yeah it works but it's a dumber model. Prob mini
vscode-rest 2 hours ago [-]
[dead]
bredren 3 hours ago [-]
It is also intended to protect the usage patterns of pro subscribers.
As has been amply explained, API pricing per token works out far higher than the equivalent usage on a maxed-out subscription plan.
It isn’t really a massive hurdle to deal with this full SPA load check. If one is even aware it exists they already have the skills to bypass it anyway.
I get why people would “what about” the automation inherent in what OpenAI is doing but that is a separate matter.
Other businesses and applications can put into place their own hurdles and anti-bot practices to protect the models they’ve leaned into, and they have been.
darepublic 4 hours ago [-]
Using 5.2 at 20 a month would also be a steal. Other shoe will drop on codex sooner or later
thisisnow 4 hours ago [-]
It's probably the same for copilot.microsoft.com and their cloudfart usage
petcat 6 hours ago [-]
> These properties only exist if the ChatGPT React application has fully rendered and hydrated. A headless browser that loads the HTML but doesn't execute the JavaScript bundle won't have them. A bot framework that stubs out browser APIs but doesn't actually run React won't have them.
> This is bot detection at the application layer, not the browser layer.
I kind of just assumed that all sophisticated bot-detectors and adblock-detectors do this? Is there something revealing about the finding that ChatGPT/CloudFlare's bot detector triggers on "javascript didn't execute"?
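The distinction in the quoted passage can be illustrated with a toy model. This is only a sketch of the idea, not how ChatGPT or Turnstile implement it: the marker names are hypothetical placeholders, and the real properties from the article are not reproduced here.

```python
# Hypothetical marker names, invented for this example.
BROWSER_MARKERS = {"user_agent", "screen_size"}        # what a stubbed browser API can fake
APP_MARKERS = {"router_state", "hydration_complete"}   # set only once the app has really run


def passes_browser_layer(reported):
    """Browser-layer check: are the usual browser properties present?"""
    return BROWSER_MARKERS <= set(reported)


def passes_application_layer(reported):
    """Application-layer check: the app's own post-hydration markers must exist too."""
    return (BROWSER_MARKERS | APP_MARKERS) <= set(reported)


clients = {
    "real browser, app hydrated": BROWSER_MARKERS | APP_MARKERS,
    "loads HTML, never runs the JS bundle": set(),
    "stubs browser APIs, never runs the app": set(BROWSER_MARKERS),
}

for name, reported in clients.items():
    print(name, passes_browser_layer(reported), passes_application_layer(reported))
```

In this model the stubbed framework is the interesting case: it passes a browser-layer check and fails only the application-layer one, which is the gap the quoted passage describes.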
Chance-Device 6 hours ago [-]
Perhaps the author should have made it clearer why we should care about any of this. OpenAI want you to use their real react app. That’s… ok? I skimmed the article looking for the punchline and there doesn’t seem to be one.
elwebmaster 4 hours ago [-]
That's because the article is AI slop.
technion 4 hours ago [-]
To prompt a discussion that's purely technical: I'm interested in how this was done.
Specifically, Turnstile as far as I'm aware doesn't do anything specifically configurable or site specific. It works on sites that don't run React, and the cookie OpenAI-Sentinel-Turnstile-Token is not a CF cookie.
Did OpenAI somehow do something on their own API that uses data from Turnstile?
londons_explore 6 hours ago [-]
I just don't understand why bot owners can't just run a complete windows 11 VM running Google Chrome complete with graphics acceleration.
You can probably run 50 of those simultaneously if you use memory page deduplication, and with a decent CPU+GPU you ought to be able to render 50 pages a second. That's 1 cent per thousand page loads on AWS. Damn cheap.
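A quick sanity check of that arithmetic, taking the comment's own figures at face value (50 page renders per second on one host, 1 cent per thousand page loads). The hourly host budget below is derived from those two numbers; it is not taken from any AWS price list.

```python
# Back-of-envelope check using only the figures from the comment above.
PAGES_PER_SECOND = 50        # claimed render throughput of one host
COST_PER_1000_PAGES = 0.01   # claimed cost in USD (1 cent)

pages_per_hour = PAGES_PER_SECOND * 3600
# Hourly spend that would yield exactly 1 cent per thousand page loads.
implied_hourly_cost = pages_per_hour / 1000 * COST_PER_1000_PAGES

print(f"{pages_per_hour} pages/hour")
print(f"implied host budget: ${implied_hourly_cost:.2f}/hour")
```

In other words, the claim holds only if a GPU-equipped host that sustains that throughput can be rented for roughly $1.80 per hour or less.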
jaccola 4 hours ago [-]
There are myriad providers competing to offer this, nicely packaged with all the accoutrements (IP rotation, location spoofing, language settings, prebuilt parsers, etc.) behind an easy to use API.
Honestly it is a very healthy competitive market with reasonably low switching costs which drives prices down. These circumstances make rolling your own a tough sell.
arcfour 56 minutes ago [-]
They do, but the fact that they have to do this means there are fewer bots because it's less economical to go to such lengths, compared to something much less complex (which is orders of magnitude cheaper).
huertouisj 4 hours ago [-]
there are scraping subreddits.
if you browse them you will see that bot writers are very annoyed if they can't scrape a site with a headless browser.
you can do what you suggested, but with Linux VMs/containers. windows is too heavy, each VM will cost you 4 GB of RAM
xmcp123 2 hours ago [-]
I’m in those. xvfb and headless=false still works great
poly2it 5 hours ago [-]
If you know of a simple way to run a Windows 11 VM with good graphics acceleration (no GPU passthrough), please contact me.
MarioMan 5 hours ago [-]
I assume your concern with GPU passthrough is that each VM needs a whole GPU?
You can use GPU-PV to split your GPU between VM instances.
Then the main bottleneck becomes how thin you split out your VRAM.
and chatgpt was then used to write this article. at least try to clean it up a bit
hx8 5 hours ago [-]
Ah yes, the timeless hallmark of web blogs: a draft so messy even a language model would ask for a second pass.
NSPG911 3 hours ago [-]
I was using KeepChatGPT[1] for a while back in 2023-2024, pre-Gemini-in-Google era, and I was fascinated as to how it was able to mask being a user without needing any API or help from the end user. I stopped using it after 2024 because 1) Gemini and 2) It breaks quite a lot. I did however, like how you had an option to push the AI panel to the right, if only Google even considers doing so.
Does anyone know how this is integrated on the Cloudflare side and across the app? Is this beyond standard turnstile? Is this custom/enterprise functionality? Something else?
jtbayly 2 hours ago [-]
Others here are asking if this is the cause of slow performance in a long chat.
But it seems clear to me that this is why I can't start typing right away when I first load the page and click to focus in the text field.
bredren 3 hours ago [-]
On a related note, ChatGPT.com changed how it handles large text pastes this past week.
It now behaves like Claude, attaching the paste as a file for upload rather than inlining it.
This changed the page UX somewhat and reduces the cost of the browser tab somewhat.
At some point, maybe still true, very long conversations ~froze/crashed ChatGPT pages.
arcfour 59 minutes ago [-]
> They exist only if the request passed through Cloudflare's network. A bot making direct requests to the origin server or running behind a non-Cloudflare proxy will produce missing or inconsistent values.
...I don't think that's possible even if you are a bot? I would be very surprised if OAI had their origin exposed to the internet. What is a "non-Cloudflare proxy"? Is this AI slop?
It's likely just looking at the CF properties as part of a bot scoring metric (e.g. many users from this ASN or that geoip to this specific city exhibit abusive patterns).
darepublic 5 hours ago [-]
I imagine to stop web automation from getting free API like use of the model
CorneredCoroner 4 hours ago [-]
> A headless browser that loads the HTML but doesn't execute the JavaScript bundle won't have them.
this is meaningless btw. A browser headless or not does execute javascript.
jaccola 4 hours ago [-]
I disagree, a browser can have javascript execution disabled (and this is somewhat common in scraping to save time/resources).
I read it to mean: "A browser that doesn't execute the JavaScript bundle won't have [the rendered React elements]." Which is true.
maxwellg 1 hours ago [-]
Wouldn't a browser that doesn't execute JS also not execute the browser fingerprinting code in the first place?
girvo 4 hours ago [-]
A bunch of the points in this AI generated blog post were like that. Makes me feel dirty when I'm 1/3rd of the way through and I realise how off it is.
thisisnow 4 hours ago [-]
Hah, sure, you just let random JS execute from random sites on your machine...
apsurd 3 hours ago [-]
Haven't read yet but instantly matched with my experience of the chat being unusable at times. The latency and glitch-like feel is unbearable.
refulgentis 5 hours ago [-]
If you have AI write a blog post for ya, when you think it's set, check word count (can c+p to google docs if AI can't pull it off with built in tools), and ask it to identify repetitions if it's over 1000.
Also, you can have it spot-check colors: light orange on a light background is unreadable. Ask it to find the L*[1] of the colors and darken/lighten as necessary if the gap is < 40 (that's the minimum gap for yuge header text on a background, 50 for body text on a background; these have a gap of 25)
I haven't tried this yet, but maybe have it count words per header too. It's got 11 headers for 1000 words currently, which makes reading feel really staccato, and you gotta evaluate "is this a real transition or vibetransition"
[1] L* as in L*a*b*, not L in Oklab
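For anyone who wants to run that lightness check by hand rather than asking a model, here is a minimal sketch of the standard sRGB to CIE L* conversion. The function names are made up for illustration, and the 40/50 thresholds are the ones suggested in the comment above, not an official accessibility standard.

```python
def srgb_to_lstar(hex_color: str) -> float:
    """Return CIE L* (0-100) for an sRGB colour given as '#rrggbb'."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]

    def linearize(c: float) -> float:
        # Undo the sRGB transfer curve to get linear light.
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in channels)
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b  # relative luminance (D65)

    delta = 6 / 29
    f = y ** (1 / 3) if y > delta ** 3 else y / (3 * delta ** 2) + 4 / 29
    return 116 * f - 16


def lightness_gap(fg: str, bg: str) -> float:
    """Absolute L* difference between foreground and background."""
    return abs(srgb_to_lstar(fg) - srgb_to_lstar(bg))


MIN_GAP_HEADER = 40  # threshold from the comment above
MIN_GAP_BODY = 50    # threshold from the comment above

# Hypothetical colours: a light orange on white.
gap = lightness_gap("#ffb347", "#ffffff")
print(f"gap={gap:.1f}, ok for body text: {gap >= MIN_GAP_BODY}")
```

Black on white gives the maximum possible gap of 100, which is a handy way to confirm the conversion is wired up correctly.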
beering 6 hours ago [-]
So are you able to get free inference now that you decrypted this?
superkuh 6 hours ago [-]
It doesn't look like it in the full sense of "free". But part of how one pays these services is by running a permissive modern browser which allows the corporation to spy on you even when you already paid in currency. In a sense, by depriving them of the ability to easily spy on you, this workaround is closer to "free".
gruez 5 hours ago [-]
>My best guess is -- ChatGPT is running something in your browser to try to determine the best things to send down to the model API
There's no way this is worth it unless the models are absolutely tiny, in which case any benefits from offloading to the client is marginal and probably isn't worth the engineering effort.
beering 6 hours ago [-]
They already see everything I’m doing because I send my prompts to them. What “workaround” are you referring to?
superkuh 5 hours ago [-]
They see everything you're doing because you send the text. But this is talking about everything about your computer system. You would not normally be sending this to them or having it involved at all. This workaround allows you to not involve unneeded information about your computer setup. It is not about avoiding sending prompt text.
And as for "but chatgpt isn't paid" (another commenter), well, then yes, that's even closer to free by removing this spying on your computer setup. But they spy on the paid users too.
voxic11 6 hours ago [-]
But isn't ChatGPT access free through the browser? What do you mean already paid in currency?
aslihana 5 hours ago [-]
I mean, I can easily understand them behaving defensively so as not to be abused. But on an MBP with M5 here, my ChatGPT tab always gets stuck when I submit a prompt.
Really, really bad user experience; I wonder when they will abandon this approach.
EGreg 5 hours ago [-]
Why does ChatGPT slow down so much when the conversations get long, while Claude does compaction?
My best guess is -- ChatGPT is running something in your browser to try to determine the best things to send down to the model API -- when it should have been running quantized models on its own server.
yapyap 3 hours ago [-]
wow OpenAI sure doesn't like bots for a company enabling the botification of the world wide web
tripdout 6 hours ago [-]
AI-written article?
avazhi 4 hours ago [-]
Yep. I flag these as spam at this point.
gobdovan 6 hours ago [-]
Imagine if they'd put as much effort into making a decent frontend experience.
blinkbat 5 hours ago [-]
Ok... so... ?
heliumtera 5 hours ago [-]
I am shocked openai collects data about its users before users have the opportunity to send the same data to openai servers!
themafia 5 hours ago [-]
My theory is that "AI" doesn't really have any long term paying customers and the majority of the "users" are people who have cooked up some clever hack to effectively siphon computing power from these providers in an effort to crank out the lowest effort ad supported slop imaginable.
Every provider seems to have been plagued by these freeloaders to such an extent that they've had to develop extreme and onerous countermeasures just to avoid losing their shirts.
What's the word? Schadenfreude?
pencilcode 5 hours ago [-]
ai slop analysis finding CF detects non javascript capable browsers with no punchline
avazhi 5 hours ago [-]
Another AI-slop article.
Sick.
It hasn’t even been updated in years so hell if I know why it needs to be fetched constantly and aggressively, but fuck every single one of these companies now whining about bots scraping and victimizing them, here’s my violin.
I obviously disagree. I mean, on top of this we are talking about not-open OpenAI.
Scraping static pages is cheap for both sides. Scraping an LLM-backed service effectively externalizes compute costs onto the provider.
Same behavior, very different economics.
> we want to keep free and logged-out access available for more users
I have no doubt that many people see the free ChatGPT access as a convenient target for browser automation to get their own free ChatGPT pseudo-API.
Did you mean to use the word hypocrisy? If not, I'm happy to have said it.
I just want to note, that it is well covered how good the support is for actual malware...
Nick, I understand the practical realities regarding why you'd need to try to tamp down on some bot traffic, but do you see a world where users are not forced to choose between privacy and functionality?
You want to go to the world's best hotel? You are gonna be on their CCTV. Staying at home is crappier but private.
Unfortunately, for the first time Moore's law isn't helping (e.g. give a poor person an old laptop, install Linux, and they will be fine). They can do that and all is good, except no LLM.
ironically, in high end hotels, there's often a lot less cctv. not none. just less. rich people enjoy privacy
Well, I can use the world's best safety deposit box without being on CCTV while I pass secrets in and out of it, right? Just not for free.
Bummer, this sounds like it is about to turn into a Monero ad (“let us pay privately”)
Yes, even their "humanifesto" is LLM output, and is written almost exclusively in the "it's not X <emdash> it's Y" style.
[0]: https://github.com/magicseth/keywitness/graphs/contributors
There's nothing stopping folks from typing a message an LLM wrote one at a time, but the idea of increasing the human cost of sending messages is an interesting one, or at least I thought :-(
It is an attempt at putting something into the conversation more than just "OSS is broken because there are too many slop PRs." What if OSS required a human to attest that they actually looked at the code they're submitting? This tool could help with that.
Yes LLMs were used greatly in the production of this prototype!
It doesn't change the goal of the experiment! or its potential utility! Do you see any potential area in your world where some piece of this is valuable?
....no. There's not a single occurrence of that.
https://keywitness.io/manifesto
There are six emdashes on that page. NONE of them are "it's not X it's Y".
> Emails, messages, essays, code reviews, love letters — all suspect.
> We believe this can be solved — not by detecting AI, but by proving humanity.
> KeyWitness captures cryptographic proof at the point of input — the keyboard.
> When you seal a message, the keyboard builds a W3C Verifiable Credential — a self-contained proof that can be verified by anyone, anywhere, without trusting us or any central authority.
> That's an alphabet of 774 symbols — each carrying log2(774) ≈ 9.6 bits. 27 emoji for 256 bits.
> They're a declaration: this message was written by a person — one of the diverse, imperfect, irreplaceable humans who still choose to type their own words.
Clarifications: 4
Continuation from a list: 1
Could just be a comma: 1
"It's not X -- it's Y": 0.
If you're going to make lazy commentary about good writing being AI, please at least be sure that you're reading the content and saying accurate things.
The emoji idea was mine. I like it :-) unfortunately it doesn't work in places like HN that strip out emoji. So I had to make a base64 encoding option.
The goal was to create an effective encryption key for the url hash (so it doesn't get sent to the server). And encoding skin tone with human emojis allows a super dense bit/visual character encoding that ALSO is a cute reference to the humans I'm trying to center with this project!
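The bit-density figures quoted from the manifesto (774 symbols, about 9.6 bits each, 27 symbols for 256 bits) can be checked with a few lines. The sketch below uses plain integer indices in place of the real emoji table, which is not shown in this thread, so `encode` and `decode` are hypothetical helpers and not KeyWitness's actual code.

```python
import math
import secrets

ALPHABET_SIZE = 774  # number of symbols, from the quoted manifesto text
KEY_BITS = 256

bits_per_symbol = math.log2(ALPHABET_SIZE)
symbols_needed = math.ceil(KEY_BITS / bits_per_symbol)


def encode(key: bytes) -> list[int]:
    """Encode a key as a fixed-length list of base-774 symbol indices."""
    n = int.from_bytes(key, "big")
    digits = []
    for _ in range(symbols_needed):
        n, d = divmod(n, ALPHABET_SIZE)
        digits.append(d)
    return digits[::-1]  # most significant symbol first


def decode(digits: list[int]) -> bytes:
    """Inverse of encode(): rebuild the original key bytes."""
    n = 0
    for d in digits:
        n = n * ALPHABET_SIZE + d
    return n.to_bytes(KEY_BITS // 8, "big")


key = secrets.token_bytes(KEY_BITS // 8)
symbols = encode(key)
assert decode(symbols) == key
print(f"{bits_per_symbol:.2f} bits/symbol, {len(symbols)} symbols per key")
```

Mapping each index to an emoji (and back) would then be a simple table lookup; 26 symbols would fall a few bits short of 256, which is why 27 are needed.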
"It's not X -- it's Y": 1
> The server stores an encrypted blob it can't decrypt. We couldn't read your messages even if we wanted to. That's not a policy — it's math.
If you can’t tell that this is AI slop then maybe KeyWitness does solve a real problem after all.
Maybe they deliberately write it like that, to filter out people who aren’t the target market?
Sorry it doesn't meet your needs.
There is irony in having an ai generated humanifesto. Could it be intentional? hmm?
Is there no irony in deriding a project for being potentially LLM generated, when its goal is to aid people in differentiating? :shrug:
Yeah I guess the cryptographic stuff sounds vaguely impressive although it’s been a long time since I had to think about cryptography in detail. But what is this _for_? I’m going to buy an expensive keyboard so that I can send messages to someone and they’ll know it’s really me – but it has to be someone who a) doesn’t trust me or any of our existing communication channels and b) cares enough to verify using this weird software? Oh and it’s important they know I sent it from a particular device out of the many I could be using?
Who is that person? What would I be sending them? What is the scenario where we would both need this?
Also the server can’t read the message but the decryption key is in the URL? So anyone with the URL can still read it? Then why even bother encrypting it?
Maybe this is one of those cases where I’m so far outside your target market that it was never supposed to make sense to me but I feel like I’m missing something here. Or maybe you need to work on your elevator pitch.
Just sharing my honest reaction.
Original here: https://archive.org/details/sim_creative-computing_1984-06_1...
It proves 1) that an apple device with a secure enclave signed it. 2) that my app signed it.
If you trust the binary I've distributed is the same as the one on the app store, then it also proves: 3) that it was typed on my keyboard not using automation (though as others have mentioned, you could build a capacitive robot to type on it) 4) that the typer has the same private key as previous messages they've signed (if you have an out of band way to corroborate that's great too) 5) optionally, that the person whose biometrics are associated with the device approved it.
There is also an optional voice to text mode that uses 3d face mesh to attempt to verify the words were spoken live.
Not every level of verification is required by the protocol, so you could attest that it was written on a keyboard, but not who wrote it (not yet implemented in the client app).
The protocol doesn't require you to run my app, if you compile it yourself, you can create your own web of trust around you!
What Apple devices are supported? All I have is an iPhone 4 running an old iOS version (pre iOS 7) (which I will not update and I don't think has a secure enclave) and an M1 Mac mini and some Lightning EarPods and an Apple Thunderbolt Display and some USB-A chargers and some old MacBooks.
I saw something about Android (https://typed.by/manifesto#:~:text=Android,Integrity) on the website, but it mentioned Play Integrity, which I do not have because I use LineageOS for MicroG.
I think that the concept is stupid because it would require somehow proving that the app is not modified (which is impractical) and that there is no stylus on a motor or fake screen (which is also impractical).
I think that a better approach would be to form a Web Of Trust where only people's (not just humans, this would include all animals and potentially aliens but no clankers) certificates are signed, with an interface that is friendly to people who are not very into technology and some way to avoid revealing who your friends are, but this would still allow someone to get an attestation for their robot.
This idea of capturing the timing of people's keystrokes to identify them, ensure it is them typing their passwords, or even using the timing itself as a password has been recurring every few years for at least three decades.
It is always just as bad. Because there are so many cases where it completely fails.
The first case is a minor injury to either hand — just put a fat bandage on one finger from a minor kitchen accident, and you'll be typing completely differently for a few days.
Or, because I just walked into my office eating a juicy apple with one hand and I'm in a hurry typing my PW with my other hand because someone just called with an urgent issue I've got to fix, aaaaannnd, your software balks because I'm typing with a completely different cadence.
The list of valid reasons for failure is endless wherein a person's usual solid patterns are good 90%+ of the time, but will hard fail the other 10% of the time. And the acceptable error rate would be 2-4 orders of magnitude less.
It's a mystery how people go all the way to building software based on an idea that seems good but is actually bad, without thinking it through, or even checking how often it has been done before and failed.
> While you type, the keyboard quietly records how you type — the rhythm, the pauses between keys, where your finger lands, how hard you press.
> Nobody types the same way. Your pattern is as unique as your handwriting. That's the signal.
On a web of trust, if you have a negative interaction with a bot, you revoke trust in one of the humans in the chain of trust that caused you to come in contact with that bot. You've now effectively blocked all bots they've ever made or ever will make... At least until they recycle their identity and come to another key signing party.
Once you have the web in place though, a series of "this key belongs to a human" attestations, then you can layer metadata on top of it like "this human is a skilled biologist" or "this human is a security expert". So if you use those attestations to determine what content you're exposed to, then a malicious human doesn't merely need to show up at a key signing party to bootstrap a new identity, they also have to rebuild their reputation to a point where you or somebody you trust becomes interested in their content again.
Nothing can be done to prevent bad people from burning their identities for profit, but we can collectively make it not economical to do so by practicing some trust hygiene.
Key signing establishes a graph upon which more effective trust management becomes possible. It on its own is likely insufficient.
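To make the revocation idea concrete, here is a toy sketch of a trust graph, assuming trust simply flows along "A signed B's key" edges. The names, the graph, and the `trusted_keys` helper are all hypothetical; real web-of-trust systems add path-length limits, weights, and expiry.

```python
from collections import deque


def trusted_keys(me, edges, revoked=frozenset()):
    """Return every key reachable from `me` via non-revoked signing edges."""
    seen = {me}
    queue = deque([me])
    while queue:
        signer = queue.popleft()
        for signee in edges.get(signer, ()):
            if (signer, signee) in revoked or signee in seen:
                continue
            seen.add(signee)
            queue.append(signee)
    return seen


# Hypothetical graph: carol vouches for two bots.
edges = {
    "me": ["alice", "bob"],
    "alice": ["carol"],
    "carol": ["bot1", "bot2"],
    "bob": ["dave"],
}

before = trusted_keys("me", edges)
after = trusted_keys("me", edges, revoked={("alice", "carol")})
print(sorted(before - after))  # keys that dropped out of the trusted set
```

Revoking the single edge to carol removes her and everything reachable only through her, which is the "blocked all bots they've ever made" effect described above; any new bot she signs later is excluded as well.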
If you're engaging with the idea seriously, I suppose we'd need to build a reputation or trust network or something.
Although if you're talking about replay attacks specifically, there are other crypto based solutions for that.
Another way is to just do better isolation as a user. That's probably your best shot without hoping these companies change policies.
What are you talking about? It works fine with firefox with RFP and VPN enabled, which is already more paranoid than the average configuration. There are definitely sites where this configuration would get blocked, but chatgpt isn't one of them, so you're barking up the wrong tree here.
It's a pity Firefox doesn't get the praise it deserves half as much as it cops criticism.
“Ignorant” is also infinite - you’re ignorant of MANY things as well, and I’m sure you would struggle with things I can do with ease. For example, understanding the meaning behind what’s being said so I know not to brow-beat someone over it.
I’m almost endlessly surprised by the probably-autistic-spectrum responses to tech things from people with no idea how things seem to other people.
It's also possible to make Firefox route each container through a different proxy, which could even be running locally and can then connect to multiple different VPNs. I haven't tried doing that but it's certainly possible.
It's sort of possible to run different browsers with completely new identities, and sometimes IPs, within the convenience of one. It's really underrated. I don't use the IP part of this that I have mentioned, but I use multi containers quite a lot on Zen; they are kind of a core part of how I browse the web, and there are many cool things which can be done/have been done with them.
Typing in the chat box is slow, rendering lags and sometimes gets stuck altogether.
I have a research chat that I have to think twice before messaging because the performance is so bad.
Running on iPhone 16 Safari, and MacBook Pro M3 Chrome.
They did it because a lot of devices running Netflix (TVs, DVD players, etc) were underpowered and Netflix was not keen on writing separate applications. They did, however, invest into a browser engine that would have HW acceleration not just for video playback but also for moving DOM elements. Basically, sprites.
The lost art of writing efficient code...
This is generally called virtual scrolling, and it is not only an option in many common table libraries, but there are plenty of standalone implementations and other libraries (lists and things) that offer it. The technique certainly didn't originate with Netflix.
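For readers who have not met the technique, the core of virtual scrolling is a small index calculation: render only the rows that intersect the viewport, plus a little overscan. The sketch below assumes fixed-height rows; `visible_range` and its parameters are illustrative names and do not come from any particular library.

```python
import math


def visible_range(scroll_top, viewport_height, row_height, total_rows, overscan=2):
    """Return (first, last) row indices to render; `last` is exclusive."""
    first = max(0, scroll_top // row_height - overscan)
    last = min(
        total_rows,
        math.ceil((scroll_top + viewport_height) / row_height) + overscan,
    )
    return first, last


# A 10,000-row list scrolled 1000px down in a 200px-tall viewport.
first, last = visible_range(
    scroll_top=1000, viewport_height=200, row_height=20, total_rows=10_000
)
print(f"render rows {first}..{last - 1} ({last - first} of 10000)")
```

The rest of the list is represented by spacer elements sized to the rows that are not rendered, so the DOM stays small no matter how long the conversation gets. Variable-height rows (such as chat messages) need measured or estimated heights instead of a single `row_height`.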
None of which chatgpt can handle presumably.
GP was mentioning that a solution to the problem exists, not that Netflix specifically invented it. Your quip that the technique is not specific to Netflix bolsters the argument that OpenAI should code that in.
They described Netflix's implementation, but if someone actually wanted to follow up on this (even for their own personal interest), Dynamic HTML would not get you there, while virtualization would across all the places it's used: mobile, desktop, web, etc.
- "ctrl + f" search stops working as expected
- the scrollbar has wrong dimensions
- sometimes the content might jump (common web issue overall)
The reason why we lost it is because web supports wildly different types of layouts, so it is really hard to optimize the same way it is possible in native apps (they are much less flexible overall).
Either way, pretty wild that you can have billions of dollars at your disposal, your interface is almost purely text, and still manage to be a fuckup at displaying it without performance problems.
It's perfectly possible to write fast or slow web applications in React, same as any other framework.
Linear is one of the snappiest web applications I've ever used, and it is written in React.
Please run Cloudflare's privacy invasive tool and share all the values it generates here so we can determine if you're a real person.
Is this to be expected? I would presume that if I'm authenticated and paying, VPN use wouldn't be a worry. It would be nice to be able to use the tool whether or not I'm on a VPN.
Heard from a founder who recently switched his company to Claude due to OpenAI's lagginess–it's absolutely an OpenAI problem. Not an AI problem in general.
Can you share these mitigations so we can mitigate against you?
I ask because I have seen huge variations in load time. Sometimes I had to wait seconds until being able to type. Nowadays it seems better though.
But don't you run these checks on logged-in users too?
"Abuse" checks should only come into play when someone tries to leverage the free tier. It reminds me of those cable companies that try to sell "unlimited" plans and then try to say customers who use more than x GB/month are abusing the service rather than just say what the real limits are because "unlimited" sounds better in marketing.
Meanwhile, the rest of us (well, not me, because I don’t use your garbage product, but lots of others do) have to suffer and have our compute resources used up in the name of “protection.”
I assumed it was maybe some tokenization going on client side, but now I realize maybe it's some proof of work related to prompt length?
How does this comport with OpenAI's new B2B-first strategy?
> We also keep a very close eye on the user impact
Are paid or logged-in users also penalised?
This has to be a joke, right?
How can first-party products protect themselves from abuse by OpenAI's bots and scraping?
How do we defend against your scraping, OpenAI?
I don't want any of my content scraped or seen by you all. Frankly, fuck you all for thinking my content is owned by you.
This would be fucking HILARIOUS if it wasn't so tragic.
Also, if you could pass this along: it takes 5 taps to change thinking effort on iOS, and the option is completely hidden on macOS.
If I were to guess it seems that you were trying to lower the token usage :-). Why the effort is only nicely available on web and windows is beyond me
"Prove your humanity/age/other properties" with this mechanism quickly goes places you do not want it to go.
Which places?
It would behoove people to engage with the substance of attestation proposals. It's lazy to state that any verification scheme whatsoever is equivalent to a panopticon, dystopia as thought-terminating cliche.
We really do have the technology now to attest biographical details in such a way that whoever attests to a fact about you can't learn the use to which you put that attestation and in such a way that the person who verifies your attestation can see it's genuine without learning anything about you except that one bit of information you disclose.
And no, such a ZK scheme does not turn instantly into some megacorp extracting monopoly rents from some kind of internet participation toll booth. Why would this outcome be inevitable? We have plenty of examples of fair and open ecosystems. It's just lazy to assert right out of the gate that any attestation scheme is going to be captured.
So, please, can we stop matching every scheme whatsoever for verifying facts about actors to the East German villain in a Cold War movie? We're talking about something totally different.
You've been to the doctor recently, right? Given them your SSN? Every identity system ever built was going to be scoped || voluntary. None of them stayed that way.
Once you have the identity mechanism, "Oh it's zero knowledge! So let's use it for your age! Have you ever been convicted?" which leads to "mandated by employers" which leads to...
We've seen this goddamn movie before. Let's just skip it this time? Please?
That's... exactly expected? It's a cat and mouse game. People running botnets or AI scrapers aren't diligently setting the evil bit on their packets.
But hey, at least some bots are also not making it past Cloudflare!
Claude's free tier requires a phone number just to try it.
Yep. The easiest-to-implement stable state for any system where you're aiming to prevent misuse is to just prevent use.
In my brief experience with abuse mitigation, connections coming from VPNs or unusual IP ranges were very significantly more likely to be associated with abuse.
It depends on your users. VPNs aren’t common at all, even though you hear about them a lot on Hacker News. For types of social sites where people got banned for abuse (forums) the first step to getting back on the forum was always to sign up for a VPN and try to reconnect. It got so bad that almost every new account connecting via VPN would reveal itself as a spammer, a banned member trying to return, or someone trying to sock puppet alternate accounts for some reason.
The worst offenders are Tor IP addresses. Anyone connecting from Tor was basically guaranteed to have bad intentions.
I heard from someone who dealt with a lot of e-mail abuse that the death threats, extortion, and other serious abuse almost always came from Protonmail or one of the other privacy-first providers that I can’t remember right now. He half-jokingly said they could likely block Protonmail entirely without impacting any real users.
It’s tough for people who want these things for privacy, but the sad reality is that these same privacy protections are favored by people who are trying to abuse services.
I have yet to see a use case for VPNs for the casual internet audience, and for a tech savvy user, they're better off renting through some datacenter or something, which at that point is hardly a VPN and more home IP obfuscation. All the same downsides, and at least you get real privacy.
Mullvad.
It has been proven in a court of law that when Mullvad says "no logging", they mean it.
They also regularly have security audits and publish the results[2][3]
[1]https://mullvad.net/en/blog/mullvad-vpn-was-subject-to-a-sea... [2]https://mullvad.net/en/blog/new-security-audit-of-account-an... [3]https://mullvad.net/en/blog/successful-security-assessment-o...
https://github.com/mullvad/mullvad-browser/
Source? I haven't seen any evidence that the major paid VPN providers engage in any of those things. At best it's vague implications something shady is happening because one of the key people was previously at [shady organization].
MullvadVPN is also another great one.
I have heard some good things about AirVPN, but I can absolutely vouch for Mullvad and to a degree ProtonVPN (just with Proton, depending upon your threat model, do take the necessary precautions, like buying with Monero for example)
There are others, but mostly it's the 2-3 that I trust.
I'm running firefox and seeing the normal amount.
Maybe I just have to disable all ad blockers and Safari tracking prevention? Or I guess I could send a link to a scan of my photo ID in a custom request header like X-Please-Cloudflare-May-I-Use-Your-Open-Web?
I think I was sufficiently clear that I was specifically talking about CGNAT-caused IP address tainting being an unreasonably emphasized worry, not the worry about their detections overall misfiring. Though I certainly don't hear much about people having issues with it (but then anecdotes are anecdotal).
> Or I guess I could send a link to a scan of my photo ID in a custom request header like X-Please-Cloudflare-May-I-Use-Your-Open-Web?
Sounds good, have you tried?
Not sure what's the point of these comically asinine rhetoricals.
I'm certain the User-Agent is part of it. I know that for certain because a very reliable way I can trigger the CF stuff is this plugin with the wrong browser selected [1].
[1] https://addons.mozilla.org/en-US/firefox/addon/uaswitcher/
I can't tell whether you're serious but in case you are, this theory immediately falls apart when you realize waymo operates at night but there aren't any night photos.
At least outside the US, there's 3DS as an (admittedly often high friction) high quality cardholder verification method, but in the US, that's of course considered much too consumer-hostile, so "select 87 overpasses" it is.
Of course then you've got sites like gnu.org too that block you because of your slightly outdated user agent.
Is it TSA's "fault" that non-terrorists are subject to screening?
(Whether zero friction is actually desired, or whether some security theater is actually part of the service being provided, is a different question.)
The "quality" of TSA's screening seems to be pretty bad too given how many people have to go through secondary screening vs how many terrorists they catch (0?)
Nice try but I used "caught", not "stopped", which requires they actually apprehended someone, not just prevented some hypothetical attack.
>since they got the gig to serve and protect and before we lost thousands of lives…)
You could easily reuse this argument for cloudflare: "if it wasn't for such invasive browser fingerprinting openai would be drowning in bajillion req/s from bots."
of course they would be drowning! I have no issues with what CF is doing. too funny that people use tools like chatgpt and expect privacy?!
Imagine an apartment building with a flimsy front door lock that breaks all the time, and the landlord only telling you that that can't be helped because of all the burglars.
I'm getting it on iCloud Private Relay all the time. It honestly makes it kind of useless.
Maybe that's the point? But then again, doesn't Cloudflare run part of it!? And wasn't there some "privacy-preserving captcha replacement" that iOS devices should already be opting me in to? So many questions, nobody there to answer them, because they can get away with it.
> The part that’s really annoying is its trivial for bots to bypass.
Not the ethical bots, though! My GPT-backed Openclaw staunchly refuses to go anywhere near a "I'm not a robot" button.
is it Linux (or similar)?
is it Firefox?
If yes, to one or both, you're blocked! Clearly millions of dollars of engineering talent and petabytes of data collection should be able to come up with something more nuanced than this.
I don't do free work. I'm not going to label 50 images of crosswalks and motorcycles for free.
Curious how do you know this?
I'm building Safebox and Safecloud, where this won't be the case anymore. Not only will you have a decentralized hosting network that can sideload resources (e.g. via a browser extension that looks at your "integrity" attribute on websites) but also the websites will require you to be logged in with a HMAC-signed session ID (which means they don't need to do any I/O to reject your requests, and can do so quickly)... so the whole thing comes down to having a logged in account.
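For anyone wondering what "HMAC-signed session ID" buys you, here is a minimal sketch of the general technique. This is not Safecloud's actual code: the key, the token format (`id.tag`), and the function names are all assumptions made for illustration. The point is only that the server can recompute the tag from a secret it already holds in memory and reject forged tokens without touching a database.

```python
import base64
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real deployment would load this from config.
SECRET_KEY = secrets.token_bytes(32)


def _b64(data: bytes) -> str:
    """URL-safe base64 without padding, so the tag never contains a '.'."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _tag(session_id: str, key: bytes) -> str:
    """HMAC-SHA256 over the session ID, encoded for use inside a token."""
    return _b64(hmac.new(key, session_id.encode("utf-8"), hashlib.sha256).digest())


def issue_session_id(key: bytes = SECRET_KEY) -> str:
    """Create a random session ID and append its HMAC tag (assumed format: id.tag)."""
    session_id = secrets.token_urlsafe(16)
    return f"{session_id}.{_tag(session_id, key)}"


def is_valid(token: str, key: bytes = SECRET_KEY) -> bool:
    """Check the tag using only the in-memory key: no session-store lookup needed."""
    session_id, sep, tag = token.rpartition(".")
    if not sep or not session_id:
        return False
    expected = _tag(session_id, key)
    # Constant-time comparison to avoid leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), tag.encode("utf-8"))
```

Note that this only proves the server issued the token. Whether the session is still active, revoked, or rate-limited would still need state somewhere.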
https://github.com/Safebots/Safecloud
As far as server-to-server requests, they'll be coming from a growing network of cryptographically attested TPMs (Nitro in AWS, also available in GCP, IBM, Azure, Oracle etc.) so they'll just reject based on attestations also.
In short... the cryptographically attested web of trust will mean you won't need cloudflare. What you will need, however, to prevent sybil attacks, is age verification of accounts (e.g. Telegram ID is a proxy for that if you use Telegram for authentication).
"No s̶o̶u̶p̶ internet for you!"
Good luck!
I noticed the ChatGPT app also checks Play Integrity on Android (because GrapheneOS snitches on apps when they do this), probably for the same reason. Claude's app doesn't, by the way, but it also requires a login.
Coincidentally about an hour ago, I wanted to look something up in ChatGPT and I happened to be in a browser window I don’t normally use, with no logged in accounts. I assumed it wouldn’t work, but to my surprise with no account, no cookies of any kind it took my query and gave me an answer.
They allowed anonymous requests for months now, maybe even a year.
As has been amply explained, API pricing per token is far higher than the effective per-token cost of equivalent use on a maxed-out subscription plan.
It isn’t really a massive hurdle to deal with this full SPA load check. If one is even aware it exists they already have the skills to bypass it anyway.
I get why people would “what about” the automation inherent in what OpenAI is doing but that is a separate matter.
Other businesses and applications can put into place their own hurdles and anti-bot practices to protect the models they’ve leaned into, and they have been.
> This is bot detection at the application layer, not the browser layer.
I kind of just assumed that all sophisticated bot-detectors and adblock-detectors do this? Is there something revealing about the finding that ChatGPT/CloudFlare's bot detector triggers on "javascript didn't execute"?
Specifically, Turnstile as far as I'm aware doesn't do anything specifically configurable or site specific. It works on sites that don't run React, and the cookie OpenAI-Sentinel-Turnstile-Token is not a CF cookie.
Did OpenAI somehow do something on their own API that uses data from Turnstile?
You can probably run 50 of those simultaneously if you use memory page deduplication, and with a decent CPU+GPU you ought to be able to render 50 pages a second. That's 1 cent per thousand page loads on AWS. Damn cheap.
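A quick back-of-the-envelope check on those numbers. The 50 pages/s and 1 cent per thousand page loads are taken from the comment above; the function name is made up for illustration, and nothing here is an actual AWS price.

```python
def implied_hourly_cost_usd(pages_per_second: float, usd_per_thousand_pages: float) -> float:
    """Hourly machine cost implied by a throughput and a per-thousand-pages price."""
    pages_per_hour = pages_per_second * 3600
    return pages_per_hour / 1000 * usd_per_thousand_pages


# Figures from the comment: 50 page loads per second at 1 cent per thousand.
cost = implied_hourly_cost_usd(pages_per_second=50, usd_per_thousand_pages=0.01)
print(f"implied instance cost: ${cost:.2f}/hour")
```

So the claim works out to a CPU+GPU instance costing roughly $1.80 per hour, sustained at full throughput. If the real throughput is lower, the per-page cost scales up proportionally.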
Honestly it is a very healthy competitive market with reasonably low switching costs which drives prices down. These circumstances make rolling your own a tough sell.
if you browse them you will see that bot writers are very annoyed if they can't scrape a site with a headless browser.
you can do what you suggested, but with Linux VMs/containers. windows is too heavy, each VM will cost you 4 GB of RAM
More info here:
https://web.archive.org/web/20231107182321/https://mu0.cc/20...
https://youtu.be/XLLcc29EZ_8?t=570
https://github.com/jamesstringer90/Easy-GPU-PV
[1]: https://github.com/xcanwin/keepchatgpt
But it seems clear to me that this is why I can't start typing right away when I first load the page and click to focus in the text field.
It now behaves like Claude, attaching the paste as a file for upload rather than inlining it.
This affects page UX some and reduces the cost of the browser tab some.
At some point, maybe still true, very long conversations ~froze/crashed ChatGPT pages.
...I don't think that's possible even if you are a bot? I would be very surprised if OAI had their origin exposed to the internet. What is a "non-Cloudflare proxy"? Is this AI slop?
It's likely just looking at the CF properties as part of a bot scoring metric (e.g. many users from this ASN or that geoip to this specific city exhibit abusive patterns).
this is meaningless btw. A browser headless or not does execute javascript.
I read it to mean: "A browser that doesn't execute the JavaScript bundle won't have [the rendered React elements]." Which is true.
Also, you can have it spotcheck colors: light orange on light background is unreadable, ask it to find the L*[1] of colors and dark/lighten as necessary if gap < 40 (that's minimum gap for yuge header text on background, 50 for text on background, these have gap of 25)
I haven't tried this yet, but maybe have it count words per header too. It's got 11 headers for 1000 words currently, which makes reading feel really staccato, and you gotta evaluate "is this a real transition or vibetransition"
[1] L* as in L*a*b*, not L in Oklab
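If you'd rather check the lightness gap yourself than trust the model's arithmetic, here is a rough sketch. It uses the standard sRGB to linear to CIE L* conversion with a D65 white point; the function names and the sample hex colors are mine, and the 40/50 thresholds are just the rule of thumb from the comment above, not a formal accessibility standard.

```python
def lstar(hex_color: str) -> float:
    """CIE L* (0 = black, 100 = white) of an sRGB color given as '#rrggbb'."""
    h = hex_color.lstrip("#")
    channels = [int(h[i:i + 2], 16) / 255 for i in (0, 2, 4)]

    def to_linear(c: float) -> float:
        # Undo the sRGB transfer curve.
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (to_linear(c) for c in channels)
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b  # relative luminance, white = 1.0

    delta = 6 / 29
    f = y ** (1 / 3) if y > delta ** 3 else y / (3 * delta ** 2) + 4 / 29
    return 116 * f - 16


def gap_ok(foreground: str, background: str, min_gap: float) -> bool:
    """True if the L* difference between the two colors is at least min_gap."""
    return abs(lstar(foreground) - lstar(background)) >= min_gap


# Thresholds from the comment: 40 for large header text, 50 for body text.
print(gap_ok("#000000", "#ffffff", 50))  # black on white
print(gap_ok("#ffffff", "#eeeeee", 40))  # white on very light gray
```

Black on white clears the body-text threshold easily, while near-white on light gray fails even the header one.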
There's no way this is worth it unless the models are absolutely tiny, in which case any benefits from offloading to the client is marginal and probably isn't worth the engineering effort.
And as for "but chatgpt isn't paid" (another commenter), well, then yes, that's even closer to free by removing this spying on your computer setup. But they spy on the paid users too.
Really, really bad user experience. I wonder when they will drop this approach.
My best guess is that ChatGPT is running something in your browser to try to determine the best things to send down to the model API, when it should have been running quantized models on its own server.
Every provider seems to have been plagued by these freeloaders to such an extent that they've had to develop extreme and onerous countermeasures just to avoid losing their shirts.
What's the word? Schadenfreude?
Sick.