Datasheet websites do that a lot. If it’s PDF.js, Firefox’s PDF viewer (or a fork of it), I just right-click to “Show only this frame” and it goes fullscreen. It might have shenanigans such as disabled printing, but in Firefox you can press Ctrl+Shift+E to open the Network tab, reload, and check which address the PDF is loaded from, then save that.
The worse ones are PDFs that exist only for SEO and contain nothing but keywords and a link to a paywall.
Wow I had no idea about this. And I was just in the process of trying to download a pdf from one of these websites. Thanks
but if they just gave you pdf, how would they track every mouse movement for their bullshit metrics?
would somebody think of the advertisers?
Advertisers? Think of the managers! Managers are nothing without their metrics.
No, it should obviously take you to “pay us enormous amount of money every month” page first.
Suck my InterLibraryLoan, Pearson.
suck my pear, son
https://…/epdf/… -> https://…/pdf/…
Works for some places at least. Super infuriating though. Why use the fast native PDF viewer in the browser when you could use a bloated and buggy JS app?
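If you want to script the epdf-to-pdf swap, here is a minimal Python sketch. The function name `epdf_to_pdf` and the example.org address are made up for illustration, and as noted above the trick only works on some sites.

```python
from urllib.parse import urlsplit, urlunsplit


def epdf_to_pdf(url: str) -> str:
    """Swap the first '/epdf/' path segment for '/pdf/'.

    Hypothetical helper: URLs without an '/epdf/' segment come back unchanged.
    """
    parts = urlsplit(url)
    # Only touch the path, so the host and query string stay as they were.
    new_path = parts.path.replace("/epdf/", "/pdf/", 1)
    return urlunsplit(parts._replace(path=new_path))


if __name__ == "__main__":
    # example.org is a placeholder, not a real publisher address.
    print(epdf_to_pdf("https://example.org/doi/epdf/10.1000/xyz?x=1"))
```

Whether the rewritten address actually serves a plain PDF depends entirely on the site.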
Very informative, but I’d change one small thing.
Why use the fast native PDF viewer in the browser when you could use a bloated and buggy JS app?
Fair, but certain corporate-mandated client-side PDF viewers are… bloatier. Though, I do like not having another window to manage when I open in browser, particularly when doing web searches. It pairs well with tab grouping extensions, and I generally don’t use markup, so no loss for me there.
That will be $30.12
Just email the author and ask for a copy
Oh boy, I sure am excited to see websites hosting PDFs! I love when the tool that everyone uses for hosting and viewing HTML gets to be blessed with the perfect format that is PDF!
I LOVE PDFS! I love two column PDFs! I love reading like this!
1 3
2 4
5 7
6 8
Instead of like this
1
2
3
4
5
6
7
8
It’s amazing and such a good user experience!
I love that PDFs are so difficult to transform into HTML, too. I would never want to besmirch the publisher’s perfect one approved layout by resizing the window!
I love that PDFs are so difficult to transform into HTML, too
FYI, if that’s relevant to your field, every new article published on arxiv.org now has an HTML render as well.
And on many older publications, transforming “arxiv.org” into “ar5iv.org” leads to an HTML rendering from a best-effort experiment they ran for a while.
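The hostname swap is easy to automate. A minimal Python sketch, assuming the rest of the URL stays the same; `to_ar5iv` is a made-up helper name and the paper ID in the example is a placeholder.

```python
from urllib.parse import urlsplit, urlunsplit


def to_ar5iv(url: str) -> str:
    """Rewrite an arxiv.org link to the same path on ar5iv.org.

    Hypothetical helper: links on any other host are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() in ("arxiv.org", "www.arxiv.org"):
        return urlunsplit(parts._replace(netloc="ar5iv.org"))
    return url


if __name__ == "__main__":
    # 1234.56789 is a placeholder ID, not a specific paper.
    print(to_ar5iv("https://arxiv.org/abs/1234.56789"))
```

Since the conversion was best-effort, some papers may render poorly or not at all.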
That’s really cool! What I really would like is a tool that converts PDFs to semantic HTML files. I took a peek there and it seems easier for them because they have the original LaTeX source.
I think for arbitrary PDF files the information just isn’t there. I’ve looked into it a bit and it’s sort of all over. A tool called pdf2htmlEX is pretty good but it makes the HTML look exactly like the PDF.
Yes, PDFs are much more permissive and may not have any semantic information at all. Hell, some old publications are just scanned images!
PDF -> semantic seems to be a hard problem that basically requires OCR, like these people are doing
Oh nice, thanks for sharing that project. I haven’t heard of it before!
Not just semantics. PDFs don’t even have segmentation like spaces, lines, or paragraphs. It’s just text drawn at whatever locations the text processor or other software placed it. Many PDF editors just look at how close the characters are to group them together.
And one step further, you can convert text to paths, after which there isn’t even glyph (character) or font info; all characters are just geometric shapes. In that case you can’t even copy the text. OCR is your only choice.
PDF is for finalizing something and printing/sharing without the ability to edit.
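To make the “group characters by closeness” idea concrete, here is a toy Python sketch. It is not how any particular PDF editor does it: the `Glyph` class, the single-line assumption, and the `gap_ratio` threshold are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str
    x: float      # left edge of the drawn character
    width: float  # advance width of the character


def group_into_words(glyphs, gap_ratio=0.3):
    """Group glyphs on one line into words by horizontal proximity.

    A new word starts when the gap to the previous glyph exceeds
    gap_ratio times that glyph's width (an arbitrary heuristic).
    """
    words = []
    current = ""
    prev = None
    # A PDF may draw characters in any order, so sort left to right first.
    for g in sorted(glyphs, key=lambda g: g.x):
        if prev is not None:
            gap = g.x - (prev.x + prev.width)
            if gap > gap_ratio * prev.width:
                words.append(current)
                current = ""
        current += g.char
        prev = g
    if current:
        words.append(current)
    return words


if __name__ == "__main__":
    line = [Glyph("y", 15, 5), Glyph("h", 0, 5), Glyph("o", 20, 5), Glyph("i", 5, 5)]
    print(group_into_words(line))
```

Real extractors also have to deal with multiple lines, columns, rotation, and kerning, which is where the guessing gets unreliable.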
I’ve always called Word documents and PDFs “dead-end formats” (DEF). Once you export your data to them, there’s no reliable way to retrieve your data from them for further transformation like you can for YAML, JSON, XML, HTML, Markdown, &c.
Choose your own adventure PDF! 1, 5, 7, 3, 9, 2, 0, 6, 4, 8! What an ending!
At least you can usually print them as PDF easily. My main issue is that the page title becomes “PDF.js Viewer - [Paper title]”.
If it’s PDF.js, it’s just Firefox’s PDF viewer (or a fork of it). I just right-click to “Show only this frame” and it goes fullscreen.
well they have to justify the exorbitant amount of money they charge for publicly funded science articles (apart from the obvious reason of thinking about the shareholders)
I’ll just be happy it doesn’t ask me to make an account
Truly. Also the Springer Nature ones load so slowly for absolutely no reason, and break 10% of the time. I really don’t get what their motivation is. Do they think that after I’ve said no, I don’t want a web version, I will be happy with a different web version?
I use SearXNG and it has some option that automatically replaces links with ones which just give you the PDF based on the DOI or whatever it’s called.
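I don’t know how SearXNG does it internally, but the general idea of pulling a DOI out of a link and handing it to a resolver can be sketched in Python like this. The regex is a common approximation rather than an official DOI grammar, `doi_link` is a made-up name, and the default resolver here is plain doi.org, which goes to the publisher’s landing page rather than straight to a PDF.

```python
import re

# Approximate DOI shape: "10.", a 4-9 digit prefix, "/", then a suffix
# that runs until whitespace, "?" or "#". This is an assumption, not a spec.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s?#]+")


def doi_link(url, resolver="https://doi.org/"):
    """Return resolver + DOI if the URL contains something DOI-shaped, else None."""
    match = DOI_PATTERN.search(url)
    if match is None:
        return None
    return resolver + match.group(0)


if __name__ == "__main__":
    # example.org and the 10.1000 DOI are placeholders.
    print(doi_link("https://example.org/article/10.1000/xyz123?download=true"))
```

Swapping the `resolver` argument for a different service is the part a search-engine option would be doing for you.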
nothing beats having to click the download button twice. it’s my favorite
Or a virus. It could be an exciting virus.
PDF button? Or time to create an account to get a subscription to access that PDF!
Honestly, I prefer a native renderer over an XSS-plagued JS one.