There are also other documents that appear to simulate a scanned document but completely lack the “real-world noise” expected with physical paper-based workflows. The much crisper images appear almost perfect without random artifacts or background noise, and with the exact same amount of image skew across multiple pages. Thanks to the borders around each page of text, page skew can easily be measured, such as with VOL00007\IMAGES\0001\EFTA00009229.pdf. It is highly likely these PDFs were created by rendering original content (from a digital document) to an image (e.g., via print to image or save to image functionality) and then applying image processing such as skew, downscaling, and color reduction.
tombrossman 3 hours ago [-]
GNOME Desktop users can put this in a Bash script in ~/.local/share/nautilus/ for more convincing looking fake PDF scans, accessible from your right-click menu. I do not recall where I copied it from originally to give credit so thanks, random internet person (probably on Stack Exchange). It works perfectly.
ROTATION=$(shuf -n 1 -e '-' '')$(shuf -n 1 -e $(seq 0.05 .5))
for pdf in "$@";
do magick -density 150 $pdf \
    -linear-stretch '1.5%x2%' \
    -rotate 0.4 \
    -attenuate '0.01' \
    +noise Multiplicative \
    -colorspace 'gray' \
    "${pdf%.*}-fakescan.${pdf##*.}"
done
barrkel 1 hours ago [-]
That seq is probably supposed to be $(seq 0.05 0.05 0.5). Right now it's always 0.05.
Note that you can get random numbers straight from bash with $RANDOM. It's 15 bit (0 to 32767) but good enough here; this would get between 0.05 and 0.5: $(printf "0.%.4d\n" $((500 + RANDOM % 4501)))
streetfighter64 2 hours ago [-]
Shouldn't $ROTATION be set inside the loop and actually used in the magick command?
tombrossman 2 hours ago [-]
You know, now that you point it out that seems obvious. I think maybe I was experimenting with rotation and left that in, unused. I did this years ago. The loop works OK though. Thanks for the feedback (and now I have to finish editing that script ...)
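For anyone finishing that edit, the two corrections from this subthread (pick the rotation inside the loop, and actually pass it to magick) could be folded in roughly as below. This is only a sketch: it assumes ImageMagick 7 (the magick command) is installed, and the random_rotation helper name and the 0.05 to 0.50 degree range are illustrative choices, not part of the original script. It uses $RANDOM as suggested above instead of shuf and seq.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a random angle whose magnitude is between
# 0.05 and 0.50 degrees, with a random sign.
random_rotation() {
    local hundredths=$((5 + RANDOM % 46))   # integer in 5..50
    local sign=''
    (( RANDOM % 2 )) && sign='-'
    printf '%s0.%02d\n' "$sign" "$hundredths"
}

# A fresh angle is drawn for every file; the filename is quoted so that
# paths containing spaces survive.
for pdf in "$@"; do
    magick -density 150 "$pdf" \
        -linear-stretch '1.5%x2%' \
        -rotate "$(random_rotation)" \
        -attenuate '0.01' \
        +noise Multiplicative \
        -colorspace 'gray' \
        "${pdf%.*}-fakescan.${pdf##*.}"
done
```

Note that this still applies one angle to every page of a given PDF, which is exactly the "same skew across multiple pages" tell the article describes.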
streetfighter64 4 hours ago [-]
Very interesting. That document in particular seems to be an interview of A. Acosta by the DoJ from 2019. But what reason would the FBI have for pretending it's a scanned document, if it is genuine? Perhaps there's some aspect of Epstein's deal with Acosta that they'd rather not reveal to the public?
Not that I can speak from personal experience or anything... But somebody on an email chain may have requested a scanned version of the document to ensure there is no metadata and the employee might have found it easier to just flatten the pdf and apply a graphical filter to make the document appear like a scanned document. There might even be a webtool available somewhere to do so, I wouldn't know...
mikkupikku 3 hours ago [-]
> the employee might have found it easier to just flatten the pdf and apply a graphical filter to make the document appear like a scanned document
Is that remotely plausible? I can't imagine faking a scan being easier than just walking down the hall to the copier room.
meinersbur 9 minutes ago [-]
The time advantage of faking a scan becomes better the more pages you have to scan.
It's thousands of pages, surely investing some time in a script is faster. They were in a rush as well.
If they were faking the documents rather than the delivery method they definitely could have invested some time in flawless looks.
salynchnew 2 hours ago [-]
If it's already scanned, then you don't have to leave your desk.
ffsm8 2 hours ago [-]
Depending on their technical capability, yes.
I mean even in this thread you got what are essentially one-liners to do it.
Definitely less hassle than doing it irl
mikkupikku 35 minutes ago [-]
I know I'm not the brightest bulb by any measure, but do some people really take less than at least a few minutes to come up with one-liners for problems as novel as graphical transformations to PDFs? Maybe if the presumed techie hacker / federal worker took it as an amusing challenge I could see this being done, but genuinely out of pure laziness? That's incredible if true.
streetfighter64 2 hours ago [-]
How big a percentage of FBI / DoJ employees are running linux (with imagemagick) as their work computer? I'd be surprised to see a similar oneliner for a stock windows installation.
Yeah they might have used some web converter, but that on the other hand would have been extremely incompetent handling of the secret data.
agopo 2 hours ago [-]
[dead]
ThePowerOfFuet 1 hours ago [-]
Straight to the signup page? A bit blatant, no?
draw_down 3 hours ago [-]
[dead]
hiccuphippo 50 minutes ago [-]
I mean, I do that all the time when they ask me to print something, sign it, and then scan it.
Sign a blank paper, scan it, paste the original doc on it. Then keep the scan for future docs.
ted_bunny 6 hours ago [-]
Has anyone analysed JE's writing style and looked for matches in archived 4chan posts or content from similar platforms? Same with Ghislaine, there should be enough data to identify them atp right? I don't buy the MaxwellHill claims for various reasons but it doesn't mean there's nothing to find.
qoez 5 hours ago [-]
People always claimed this as a data leak vector but I've always been sceptical. Like just writing style and vocabulary is probably extremely shared among too many people to narrow it down much. (How many people that you know could have written this reply?) The counter argument is that he had a very specific style in his mail so maybe this is a special case.
Eisenstein 3 hours ago [-]
If you have a large enough set to test against and a specific person you are looking for, this is totally doable currently.
fluoridation 2 hours ago [-]
Of course it's doable. The question is how reliable the results are.
zxcvasd 4 hours ago [-]
this is a well-studied field (stylometry). when combining writing styles, vocabulary, posting times, etc. you absolutely can narrow it down to specific people.
even when people deliberately try to feign some aspects (e.g. switching writing styles for different pseudonyms), they will almost always slip up and revert to their most comfortable style over time. which is great, because if they aren't also regularly changing pseudonyms (which are also subject to limited stylometry, so pseudonym creation should be somewhat randomized in name, location, etc.), you only need to catch them slipping once to get the whole history of that pseudonym (and potentially others, once that one is confirmed).
ge96 2 hours ago [-]
People do change over time, I used to write "ha" after every sentence for some reason
wholinator2 1 hours ago [-]
You know, i had a particularly cringy period in which i put "la" at the end of sentences.
Exoristos 1 hours ago [-]
You left off something.
zxcvasd 1 hours ago [-]
[dead]
Der_Einzige 5 hours ago [-]
Stylometry is extremely sophisticated even with simple n-gram analysis. There's a demo of this that can easily pick out who you are on HN just based on a few paragraphs of your own writing, based on N-gram analysis.
You can also unironically spot most types of AI writing this way. The approaches based on training another transformer to spot "AI generated" content are wrong.
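To give a feel for what "simple n-gram analysis" means, here is a toy sketch, not the demo mentioned above: it builds a character-trigram count profile for each of two text files and prints their cosine similarity. The function name trigram_cosine is made up for illustration, and real stylometry would use far more text and more features than this.

```shell
#!/usr/bin/env bash
# Hypothetical helper: cosine similarity (0 to 1) between the
# character-trigram count profiles of two text files.
trigram_cosine() {
    awk '
        FNR == 1 { file_index++ }          # first line of each input file
        {
            line = tolower($0)
            n = length(line)
            # Count every overlapping 3-character window on this line.
            for (i = 1; i + 2 <= n; i++) {
                gram = substr(line, i, 3)
                if (file_index == 1) a[gram]++; else b[gram]++
            }
        }
        END {
            for (g in a) {
                if (g in b) dot += a[g] * b[g]
                norm_a += a[g] * a[g]
            }
            for (g in b) norm_b += b[g] * b[g]
            if (norm_a == 0 || norm_b == 0) { print "0.0000"; exit }
            printf "%.4f\n", dot / (sqrt(norm_a) * sqrt(norm_b))
        }
    ' "$1" "$2"
}
```

Usage would be something like trigram_cosine known_author.txt unknown_post.txt (both file names are placeholders). Scores closer to 1 mean more similar trigram usage; on its own that is weak evidence, which is the reliability question raised elsewhere in the thread.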
mrandish 4 hours ago [-]
> You can also unironically spot most types of AI writing this way.
I have no idea if specialized tools can reliably detect AI writing but, as someone whose writing on forums like HN has been accused a couple of times of being AI, I can say that humans aren't very good at it. So far, my limited experience with being falsely accused is it seems to partly just be a bias against being a decent writer with a good vocabulary who sometimes writes longer posts.
As for the reliability of specialized tools in detecting AI writing, I'm skeptical at a conceptual level because an LLM can be reinforcement trained with feedback from such a tool (RLTF instead of RLHF). While they may be somewhat reliable at the moment, it seems unlikely they'll stay that way.
Unfortunately, since there are already companies marketing 'AI detectors' to academic institutions, they won't stop marketing them as their reliability continues to get worse. Which will probably result in an increasing shit show of false accusations against students.
streetfighter64 2 hours ago [-]
Well, humans might be great at detecting AI (few false negatives) but might falsely accuse humans more often (higher false positive rate). You might be among a set of humans being falsely accused a lot, but that's just proof that "heuristic stylometry" is consistent, it doesn't really say anything about the size of that set.
digiown 4 hours ago [-]
Funnily this also implies that laundering your writing through an AI is a good way to defeat stylometry. You add in a strong enough signal, and hopefully smooth out the rest.
mikkupikku 3 hours ago [-]
Hacker News is one of the best places for this, because people write relatively long posts and generally try to have novel ideas. On 4chan, most posts are very short memey quips, so everybody's style is closer to each other's than it is to their normal writing style.
diamondage 5 hours ago [-]
Why are they wrong? Surely it depends on how you train it?
He met with moot ("he is sensitive, be gentile", search on jmail), and within a few days the /pol/ board got created, starting a culture war in the US, leading to Trump getting elected president. Absolutely nuts.
albroland 4 hours ago [-]
Few thoughts: in context it's not nuts at all:
- moot was fundraising for his VC backed startup during the years the emails are in, and he was likely connected via mutuals in USV or other firms. These meetings were clearly around him trying to solicit investment in his canv.as project.
- /pol/ was /new/ being returned; the ethos of the board had already existed for a long time and the decision to undo the deletion of /new/ was entirely unsurprising for denizens at the time, and was consistent with a concerted push moot was making for more transparency in the enforcement of rules on the site and fairness towards users who followed the rules. /pol/ didn't start a culture war at this time any more than /new/ had previously - it just existed as a relatively content-unmoderated platform for people to discuss earnestly what would get them banned elsewhere.
mikkupikku 3 hours ago [-]
Besides /new/ there was also /n/ (not at that time about transportation.) Moot's war with people being racist on 4chan had many back and forths before /pol/ was created.
LAC-Tech 4 minutes ago [-]
be gentile
We're just not going to talk about that one I suppose?
acessoproibido 5 hours ago [-]
I always wondered how much of a cultural etc influence 4Chan actually had (has?) - so much of the mindset and vernacular that was popular there 10+ years ago is now completely mainstream.
jazzyjackson 5 hours ago [-]
Ah, a rare opportunity to share a blog post that had a big effect on my political outlook back in 2016, Meme Magic Is Real, You Guys
Who can say what effect it had on the world, but a presidential candidate reposting himself personified as Pepe the frog was still weird back then, and at least a nod to the trolls doing so much work on his behalf
Summary: Trump used memes not in the sense of pepes but in the original (Dawkins') sense of "earworm" soundbites, along with a torrent of scandals, each making the previous seem like old news, to exploit a public tired of the "status quo" into voting for a zany wildcard pushing for reactionary policy
GaryBluto 5 hours ago [-]
/pol/ in no way started the American culture war. It was brewing for a while.
_--__--__ 5 hours ago [-]
pol was made to contain all posting on the American culture war so it could be banned from the other (more active) boards
WetMinister 4 hours ago [-]
You’re acting as if https://doge.gov does not exist. Ask yourself under which presidency, administration and kind of politics such is allowed to even exist with a straight face.
GaryBluto 2 hours ago [-]
It would've existed regardless of internet memes, just under a different and similarly obnoxious name.
actionfromafar 5 hours ago [-]
Well, broke the levee if you will. Otherwise, explain Pepe.
GaryBluto 5 hours ago [-]
I hardly think an internet image of a cartoon frog heavily influenced American elections, despite a surface-level co-option by various Republican politicians.
4 hours ago [-]
actionfromafar 4 hours ago [-]
I agree completely.
I'm just saying, it's a symptom. The crazy found critical mass, broke containment. From there it was laundered in millions of Facebook groups and here we are.
jahsome 5 hours ago [-]
In no way?
mort96 5 hours ago [-]
Just to substantiate this a bit: I remember a gleeful consensus in certain circles being that /pol/ and /r/the_donald had "memed Trump into the White House". It's much more complicated than that, but there's certainly an element of truth there.
ronnier 3 hours ago [-]
Then Reddit and almost all of social media went on to purge trump and pro trump content. The Donald was banned. Trump deplatformed across social media.
mort96 1 hours ago [-]
That's true, but not really relevant to this discussion. You can't really deplatform a president; yes he was no longer on Twitter, but roughly 8 billion people listen any time he speaks.
kipchak 5 hours ago [-]
Which meeting are you seeing? That search doesn't seem to work for me, I'm only seeing the one Jan 2012.
Thanks, trying to figure out the timeline relative to the board's creation given how close they are. The first email I can find related to a meeting is this one from Boris Nikolic on Oct 20th, with /pol/ on the 23rd.
The reason I don’t agree is that moot banned any Gamergate discussion and those people then went to 8chan, a site which moot had no control over.
And it was Gamergate that put some fuel on the fire which (IMHO) increased support for Trump. The 8chan site grew a great deal from it, then continued from that first initial “win”.
kmeisthax 4 hours ago [-]
From moot's perspective, it can be as simple as being convinced by some rich guy you've never heard of to bring back the politics board. He doesn't need to have an intent to start a fascist coup, that's Epstein's job. GamerGate is just the point at which moot realized he'd fucked up and destroyed 4chan imageboard culture by letting /pol/ fester.
dopa42365 5 hours ago [-]
Given the "nature" of 4chan (only a few hundred posts and a few thousand comments at a time, the vast majority of it shitposts and spam), it just can't do that. The imageboard format and limits basically prevent any scaling and mainstream success. If you follow any of the general threads in pol or sp for a while, you'll spot the same few people all the time, it's a tiny community of active users.
thatguy0900 5 hours ago [-]
I think the logic is Pol didn't need to reach the masses; the masses only consume content, they don't create it. You only need to radicalize the few people who then go on to be the 1% of people commenting and posting.
mort96 4 hours ago [-]
There's an old joke that 9gag* only reposts stuff from Reddit and Reddit only reposts stuff from 4chan and 4chan is the origin of all meme culture. This joke was widespread enough to reach myself and my friend group back in the day, even though none of us used 4chan or Reddit.
If you radicalise the 0.01% of people who are prolific meme creators, you radicalise the masses.
* I did say old...
direwolf20 29 minutes ago [-]
And Facebook repeats stuff from 9gag
acessoproibido 5 hours ago [-]
That is a crazy amount of emails from/about moot...
4 hours ago [-]
yonatan8070 5 hours ago [-]
A bit off-topic, but I find it kinda funny that the "Decline" button on the cookie popup on this page is labeled "Continue without consent".
Paracompact 33 minutes ago [-]
They're really trying to guilt trip you.
direwolf20 29 minutes ago [-]
Damn, so the website about the Epstein is Epstein too
waynenilsen 7 hours ago [-]
> Information leakage may also be occurring via PDF comments or orphaned objects inside compressed object streams, as I discovered above.
hopefully someone is independently archiving all documents
Are they being removed or replaced with more heavily redacted documents? There were definitely some victim names that slipped through the cracks that have since been redacted.
6 hours ago [-]
embedding-shape 7 hours ago [-]
Initially under "Epstein Files Transparency Act (H.R.4405)" on https://www.justice.gov/epstein/doj-disclosures, all datasets had .zip links. I first saw that page when all but dataset 11 (or 10) had a .zip link. At one point this morning, all the .zip links were removed, now it seems like most are back again.
littlecorner 6 hours ago [-]
I think some of the released documents included images of victims, which were redacted. So it's not necessarily malicious removals
dylan604 6 hours ago [-]
That's my understanding too, so archiving the unredacted images could mean holding CSAM.
streetfighter64 4 hours ago [-]
Which is of course very convenient for the government, similar to when wikileaks got prosecuted for holding state secrets.
thatguy0900 4 hours ago [-]
If we're assuming they didn't leave victims unredacted on purpose
embedding-shape 7 hours ago [-]
Re the OCR, I'm currently running allenai/olmocr-2-7b against all the PDFs with text in them, comparing with the OCR DOJ provided, and a lot of it doesn't match, and surprisingly olmocr-2-7b is quite good at this. However, after extracting the pages from the PDFs, I'm currently sitting on ~500K images to OCR, so this is currently taking quite a while to run through.
originalvichy 7 hours ago [-]
Did you take any steps to decrease the dimension size of images, if this increases the performance? I have not tried this as I have not performed an OCR task like this with an LLM. I would be interested to know at what size the vlm cannot make out the details in text reliably.
embedding-shape 7 hours ago [-]
The performance is OK, takes a couple of seconds at most on my GPU, just the amount of documents to get through that takes time, even with parallelism. The dimension seems fine as it is, as far as I can tell.
helterskelter 7 hours ago [-]
[flagged]
embedding-shape 7 hours ago [-]
Haven't seen anything particular about that, but lots of the documents with names that were half-redacted contain OCRd text that is completely garbled, but olmocr-2-7b seems to handle it just fine. Unsure if they just had sucky processes or if there is something else going on.
helterskelter 6 hours ago [-]
Might be a good fit for uploading a git repo and crowdsourcing
direwolf20 28 minutes ago [-]
GitHub would ban you
embedding-shape 6 hours ago [-]
Was my first impulse too but not sure I trust that unless I could gather a bunch of people I trust, which would mean I'd no longer be anonymous. Kind of a catch22.
originalvichy 7 hours ago [-]
Any guesses why some of the newest files seem to have random ”=” characters in the text? My first thought was OCR, but it seemed to not be linked to characters like ”E” that could be mistakenly interpreted by an OCR tool. My second guess is just making it more difficult to produce reliable text searches, but probably 90% of HN readers could find a way to make a search tool that does not fall apart in case a ”=” character is found (although making this work for long search queries would make the search slower).
The equal characters are due to poor handling of quoted-printable in email.
The author of gnus, Lars Ingebrigtsen, wrote a blog post explaining this. His post was on the HN front page today.
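For context, quoted-printable wraps long lines by ending them with a bare "=" (a "soft line break"), which a decoder is supposed to remove together with the line ending that follows. The sketch below shows only that one step; it is not a full decoder (it ignores =XX hex escapes), the function name qp_join_soft_breaks is made up, and I am not claiming this is the exact step that went wrong in the DOJ pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: undo quoted-printable soft line breaks on stdin.
# A line ending in "=" continues on the next line, so the "=" and the
# line ending are both dropped. CRLF line endings are normalised first.
qp_join_soft_breaks() {
    awk '{
        sub(/\r$/, "")                   # drop a trailing carriage return
        if (sub(/=$/, ""))               # soft break: strip "=" and join
            printf "%s", $0
        else
            print
    }'
}
```

For example, printf 'wra=\r\npped\r\n' | qp_join_soft_breaks gives back the single word "wrapped". If a tool skips this step, or mangles the line endings before it runs, the "=" characters stay in the text.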
originalvichy 6 hours ago [-]
He explained the newline thing that confused me. Good read!
_def 6 hours ago [-]
I can't even download the archive, the transmission always terminates just before it's finished. Spooky.
Beijinger 2 hours ago [-]
What would be more interesting: His Bank accounts.
Who paid him?
Who did get paid?
nkozyra 7 hours ago [-]
> DoJ explicitly avoids JPEG images in the PDFs probably because they appreciate that JPEGs often contain identifiable information, such as EXIF, IPTC, or XMP metadata
Maybe I'm underestimating the issue at full, but isn't this a very lightweight problem to solve? Is converting the images to lower DPI formats/versions really any easier than just stripping the metadata? Surely the DOJ and similar justice agencies have been aware of and doing this for decades at this point, right?
DharmaPolice 5 hours ago [-]
This is speculation but generally rules like this follow some sort of incident. e.g. Someone responds to a FOI request and accidentally discloses more information than desired due to metadata. So a blanket rule is instituted not to use a particular format.
originalvichy 7 hours ago [-]
Maybe they know more than we do. It may be possible to tamper with files at a deeper level. I wonder if it is also possible to use some sort of tampered compression algorithm that could mark images much like printers do with paper.
Another guess is that perhaps the step is a part of a multi-step sanitation process, and the last step(s) perform the bitmap operation.
normalaccess 6 hours ago [-]
I'm not sure about computer image generation but you can (relatively) easily fingerprint images generated by digital cameras due to sensor defects. I'll bet there is a similar problem with PC image generation where even without the EXIF data there is probably still too much side channel data leakage.
Eisenstein 3 hours ago [-]
Image metadata is the wild west of structured text. The developer of the foremost tool for dealing with it (exiftool) has made 'remove metadata' feature but still disclaims that it is not able to remove everything.
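For reference, the exiftool feature mentioned here is the -all= option. A minimal sketch of using it on a copy of a file follows; it assumes exiftool is installed, the strip_metadata function name is made up, and, per the disclaimer above, it should not be assumed to remove everything.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy an image and ask exiftool to delete all the
# metadata it knows how to write. The original file is left untouched.
strip_metadata() {
    local src=$1 dst=$2
    cp -- "$src" "$dst" || return 1
    # -all= clears every writable tag; -overwrite_original skips the
    # "_original" backup that exiftool would otherwise leave behind.
    exiftool -all= -overwrite_original "$dst"
}
```

Usage would look like strip_metadata photo.jpg photo-clean.jpg (placeholder file names). Re-rendering to a plain bitmap, as the article says the DoJ appears to do, sidesteps the question of what a tag-by-tag cleaner might miss.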
zahlman 3 hours ago [-]
How could that be possible? Isn't JPEG a fairly straightforward container for JFIF+metadata?
Paracompact 8 minutes ago [-]
"Fairly straightforward" is incorrect. Not an authority to describe in more detail, but the most tricky blocker I'm aware of are these proprietary "MakerNote" tags from camera manufacturers, which are (often undocumented) binary blobs. exiftool might not even know what's in there, let alone how to safely remove it without corrupting the file.
bugeats 7 hours ago [-]
Somebody ought to train an LLM exclusively on this text, just for funsies.
pc86 6 hours ago [-]
DeepSeek-V4-JEE
TheKnownSecret 5 hours ago [-]
It would be funny (and disturbing) to add Jemini to JMail.
corygarms 7 hours ago [-]
These folks must really have their hands full with the 3M+ pages that were recently released. Hoping for an update once they expand this work to those new files.
seydor 2 hours ago [-]
why do we count this in "pages" when it's mostly an email dump
NoToP 5 hours ago [-]
This is so incredibly useful to me right now for incidental reasons I am commenting to make sure I can get back to it.
layer8 4 hours ago [-]
HN lets you mark submissions (and comments) as favorites, no need to spam the thread.
tibbon 7 hours ago [-]
That's a lot of PeDoFiles!
(But seriously, great work here!)
ted_bunny 6 hours ago [-]
Elite PDF File ring
mmooss 5 hours ago [-]
What is the legal basis for releasing someone's private files and communications? If they can do it to Epstein, they can do it to you, to the Washington Post journalist, to former President Clinton, etc.
Is the scope at least limited somehow? Generally I favor transparency, but of course probably the most important parts are withheld.
toast0 5 hours ago [-]
> What is the legal basis for releasing the someone's private files and communications?
An act of congress, for one.
Also, AFAIK, federal privacy generally ends at death, as does criminal liability; so releasing government files from a federal investigation after death of the subject is generally within the realm of acceptable conduct.
mmooss 2 hours ago [-]
Yes, I forgot about that major part of the story! Still, acts of Congress can't violate Constitutional rights.
It seems unlikely you lose all rights when you die or it would be chaos - imagine all the secrets people die with that affect everyone they know. An integral part of every estate plan would be incinerating records. Wills do have real power.
toast0 1 hours ago [-]
Your estate retains many of your rights when you die. However, the federal privacy act explicitly does not apply. Your estate may have privacy rights via the Constitution, although privacy is not specifically enumerated. Your estate may have privacy rights via state law; but that wouldn't bar the federal government from disclosing its investigative materials.
OTOH, there's a 2004 case, National Archives & Records Administration v. Favish[1], which establishes the surviving family's right of privacy to death scene photos, but that's technically not privacy of the deceased.
Given what we've seen so far, there's probably some very interesting stuff in Clinton's private files and communications. Not to mention the stuff in current president Trump's. Some random journalist, probably not. Unless it's a very wealthy and/or connected journalist like David Brooks...
pstuart 5 hours ago [-]
I'd assume it was the nature of the case, and that discovery was done with him being dead.
todfox 4 hours ago [-]
He was a pedophile sex trafficker. Epstein and his clients deserve zero privacy.
mmooss 2 hours ago [-]
Who determines who deserves privacy, and how do they determine it?
meidan_y 8 hours ago [-]
(2025) just follow hn guideline, impressive voter ring though
8 hours ago [-]
alain94040 8 hours ago [-]
We're in early February 2025 [edit:2026] and the article was written on Dec 23, 2025, which makes it less than two months old. I think it's ok not to include a year in the submission title in that case.
I personally understand a year in the submission as a warning that the article may not be up to date.
embedding-shape 7 hours ago [-]
Less about the age, and more about confusing what they are analyzing, for the files that were just released like a week ago.
petepete 8 hours ago [-]
We're in Feb 2026.
I'm not used to typing it yet, either.
GlitchRider47 7 hours ago [-]
Generally, I'd agree with you. However, the recent Epstein file dump was in 2026, not 2025, so I would say it is relevant in this case..
michaelmcdonald 8 hours ago [-]
"We're in early February ~2025~ *2026*"
https://www.justice.gov/epstein/files/DataSet%207/EFTA000092...
https://xkcd.com/1205/
https://news.ycombinator.com/item?id=33755016
https://medium.com/tryangle-magazine/meme-magic-is-real-you-... (dismissable login wall)
https://www.justice.gov/epstein/files/DataSet%2010/EFTA01992...
my understanding is that some are being removed
[1] https://www.justice.gov/archives/oip/blog/foia-post-2004-sup...
(It also surprises me that this passed anyway, given that both sides of the aisle seem to have people with clear reason to keep it covered up... ?)
(Also, Maxwell is specifically named, and is still alive... ?)
https://en.wikipedia.org/wiki/Epstein_Files_Transparency_Act