this post was submitted on 17 Jan 2025

I ripped a lot of XHTML files from a crappy online ebook reader. How do I combine these into a PDF?

top 12 comments
[–] we_avoid_temptation@lemmy.zip 6 points 2 days ago (2 children)

Pretty sure calibre makes this easy if you don't wanna reinvent the wheel

[–] Irelephant@lemm.ee 1 points 1 day ago

Oh, I already have that installed. I'll try it.

[–] sirpuppy@lemmy.dbzer0.com 2 points 2 days ago

came here to say calibre! it works and the converting is super simple. takes a little while for pdf files since it's a big file, but it works
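
For anyone who would rather script the Calibre route than click through the GUI, here is a rough sketch. It assumes Calibre's `ebook-convert` command-line tool is on the PATH and that all the ripped files sit in one folder. The idea is to write a small index page that links to every file in reading order and let Calibre's HTML input follow those links. The names `build_index` and `convert_with_calibre` are made up for this example, and whether the output keeps the original styling depends on the files themselves.

```python
import html
import subprocess
from pathlib import Path
from urllib.parse import quote


def build_index(pages, title="Combined book"):
    """Return an HTML index page that links to each file in the given order."""
    links = "\n".join(
        f'<p><a href="{quote(Path(p).name)}">{html.escape(Path(p).stem)}</a></p>'
        for p in pages
    )
    return (
        '<html><head><meta charset="utf-8">'
        f"<title>{html.escape(title)}</title></head>\n"
        f"<body>\n{links}\n</body></html>\n"
    )


def convert_with_calibre(folder, pages, output="book.pdf"):
    """Write index.html next to the pages, then hand it to ebook-convert.

    Assumes every entry in `pages` lives directly inside `folder`,
    because the links in the index are relative file names.
    """
    index = Path(folder) / "index.html"
    index.write_text(build_index(pages), encoding="utf-8")
    # ebook-convert ships with Calibre: ebook-convert <input> <output>
    subprocess.run(["ebook-convert", str(index), output], check=True)
    return output
```

Something like `convert_with_calibre("ripped_pages", ["ch1.xhtml", "ch2.xhtml"])` would then produce `book.pdf`, assuming those files exist in that folder.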

[–] merde@sh.itjust.works 2 points 2 days ago

Scribus may help you with that

[–] deegeese@sopuli.xyz 2 points 2 days ago (1 children)

There are a ton of options depending on your tech level.

How are you with basic Python scripts?

[–] Irelephant@lemm.ee 1 points 2 days ago (3 children)

I made the script to rip them in bash. I know python, lua, js, bash and powershell, so anything using these works.

[–] danielquinn@lemmy.ca 3 points 2 days ago

I've used pdfkit to considerable success. It has a few system-level dependencies, but the instructions are pretty straightforward:

# apt-get install wkhtmltopdf
$ pip install pdfkit
[–] deegeese@sopuli.xyz 3 points 2 days ago (1 children)

Surely you can figure out how to use existing libraries for this task, or is there something you’re stuck on?

[–] Irelephant@lemm.ee 2 points 2 days ago (1 children)

Can't really find many good ones. Google isn't returning much, just pdfs about python libraries and the odd abandoned github repo

[–] deegeese@sopuli.xyz 2 points 2 days ago

I’d start with wkhtmltopdf/pdfkit

[–] undefined@lemmy.hogru.ch 1 points 2 days ago* (last edited 2 days ago)

In a production web app I use Gotenberg. It’s definitely overkill for the task at hand, but if you find yourself doing this often I would highly recommend it. It’s dead easy to convert HTML (and I imagine XHTML) to PDF.

[–] Moonrise2473@feddit.it 1 points 2 days ago

If they pick up the right stylesheet when opened in a browser, you can pirate m0nkrus' acrobat pro, then select all => right click => convert to pdf