this post was submitted on 23 Jan 2024
532 points (91.7% liked)

Piracy: ꜱᴀɪʟ ᴛʜᴇ ʜɪɢʜ ꜱᴇᴀꜱ


⚓ Dedicated to the discussion of digital piracy, including ethical problems and legal advancements.




not being able to ctrl-F a textbook or use click-to-chapter links sure makes studying harder these days... and any scanning software worth its salt will at least do the bare-minimum OCR automatically...

all 31 comments
[–] Moonrise2473@feddit.it 119 points 10 months ago* (last edited 10 months ago)

I'd much rather do the OCR myself, if it's really needed, than get a half-assed "book" full of typos and broken tables just because someone ran automated OCR but didn't have the 5-6 hours needed to manually clean it up.

Just be thankful that someone took the time to flip through it page by page on their scanner and upload it somewhere.

[–] raven@hexbear.net 87 points 10 months ago

I hope this sentiment never stops someone from uploading a textbook without OCR. Once it's scanned it can always be OCRed at a later time.

[–] RegalPotoo@lemmy.world 78 points 10 months ago (1 children)

Look, it's all about authorial intent - if the author had wanted their book to be easy to reference or accessible to people who use screen readers, they would have published a DRM free PDF in the first place. Gotta respect the artist's vision.

[–] trucy@lemmy.blahaj.zone 10 points 10 months ago (1 children)

...and sometimes the artist turns out to be an idiot :D

[–] nilloc@discuss.tchncs.de 2 points 10 months ago

Or the professor who’s profiting off requiring the latest edition of their own book each year.

[–] flipflop97@feddit.nl 69 points 10 months ago* (last edited 10 months ago) (1 children)
[–] Kingofthezyx@lemm.ee 68 points 10 months ago (2 children)

Bitch you can't ctrl-F or click to chapter in an actual book either.

[–] JackbyDev@programming.dev 60 points 10 months ago

Wait until OP hears about the Index at the back of the book.

[–] empireOfLove2@lemmy.dbzer0.com 18 points 10 months ago

I know, that's my point! PDFs are inherently superior BECAUSE you can usually CTRL-F them.

[–] Anamnesis@lemmy.world 55 points 10 months ago (2 children)

Simple: pirate Adobe Acrobat and OCR them yourself.

[–] empireOfLove2@lemmy.dbzer0.com 16 points 10 months ago

I might just do that and reupload the OCR'd copy. I already have 3 or 4 books set aside to cut the bindings off and scan in, so I'm gonna need OCR for those too.

In my free time, of course. University waits for no student...

[–] antonim@lemmy.dbzer0.com 41 points 10 months ago (1 children)

By the time you finished making this snarky meme, you could've set up a program to OCR a book yourself.

[–] 5714@lemmy.dbzer0.com 2 points 10 months ago

'A' book, yes, but the more scanned pages you get, the more annoyed you get.

[–] Gork@lemm.ee 41 points 10 months ago

OCR'ing a book before uploading saves so many hours on the user end of things. I wish it were done more often so I don't have to leave my computer running overnight to batch OCR stuff.
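For anyone who does end up batch OCR'ing overnight, here's a rough sketch of what that can look like. It assumes you have the open-source `ocrmypdf` command-line tool installed (nobody above named a specific tool, so that choice is mine), and the folder names and the `build_ocr_commands` / `run_batch` helpers are made up for illustration.

```python
from pathlib import Path
import shutil
import subprocess


def build_ocr_commands(src_dir, out_dir, language="eng"):
    """Return one ocrmypdf command (as an argument list) per PDF in src_dir."""
    src, out = Path(src_dir), Path(out_dir)
    commands = []
    for pdf in sorted(src.glob("*.pdf")):
        commands.append([
            "ocrmypdf",
            "--skip-text",        # leave pages that already have a text layer alone
            "-l", language,       # Tesseract language code, e.g. "eng"
            str(pdf),             # input scan
            str(out / pdf.name),  # searchable output with the same file name
        ])
    return commands


def run_batch(src_dir, out_dir, language="eng"):
    """Run the commands one after another; return the inputs that failed."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf was not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for cmd in build_ocr_commands(src_dir, out_dir, language):
        # A non-zero exit code means ocrmypdf could not process that file.
        if subprocess.run(cmd).returncode != 0:
            failed.append(cmd[-2])
    return failed


if __name__ == "__main__":
    # Hypothetical folders: raw scans in, searchable PDFs out.
    for path in run_batch("scans", "scans_ocr"):
        print("failed:", path)
```

Splitting the command building from the running part means you can print the commands first and check them before committing your machine to a night of work.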

[–] DontNoodles@discuss.tchncs.de 36 points 10 months ago

Sites like Anna's library should let users flag books that lack OCR and submit OCR'd versions of them.

[–] unperson@hexbear.net 34 points 10 months ago* (last edited 10 months ago)

Be the change you wish to see in the world.

https://library.bz/main/upload/ anonymous username genesis password upload

[–] lanolinoil@lemmy.world 17 points 10 months ago (1 children)
[–] nameisnotimportant@lemmy.ml 11 points 10 months ago (1 children)

Very impressive! That's a bummer that you need Chrome to make it happen though :/

[–] lanolinoil@lemmy.world 1 points 10 months ago

It will still work on PDFs loaded in Chrome to be fair though.

[–] Creddit@lemmy.world 16 points 10 months ago (2 children)

There are a bunch of online tools that are free and let you upload a PDF to have it go through OCR.

Just Google "Free PDF OCR" and click through all the ads to upload, then give them a temporary email address to get a download link to the finished product.

Hot tip: There are free temporary email address sites too, if you need one to avoid getting on their ad lists.

[–] imkali@lemmy.dbzer0.com 11 points 10 months ago* (last edited 10 months ago)

List of free temporary email solutions:

https://www.guerrillamail.com/

https://10minutemail.com/

https://addy.io/ - this one is slightly different

and about a billion similar ones.

[–] Agent641@lemmy.world 2 points 10 months ago

If you have a JPG or PNG file, you can upload it to Google Drive, then right-click and open it in Google Docs, and it will OCR the text for you.

[–] OgdenTO@hexbear.net 5 points 10 months ago

Use the index

[–] Mango@lemmy.world 4 points 10 months ago (1 children)
[–] Gaspar@lemmy.dbzer0.com 14 points 10 months ago

Optical Character Recognition. Essentially, software that "reads" an image and pulls text out of it.