
Saturday, 19 December 2020

pikepdf: a better Python PDF lib


1 Python PDF libs


PyPDF2, PyPDF3, PyPDF4, and pdfrw are the most popular pure Python libs for manipulating PDF files.
Unfortunately, all of them are old and poorly maintained. Obvious bugs have gone unfixed for a long time in the PyPDFx libs.

Fortunately, there is a newer PDF lib that is actively developed. It's pikepdf! Unlike the libs above, it is not pure Python: it is built on the C++ library QPDF.

The documentation is complete and of high quality too.

IMHO, its APIs are much better and easier to use than those of the other libs.

2 Simple example to split a pdf file

Below is a super simple but practical example that splits a PDF file into separate files, one per page.

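import pikepdf

# Open the source file; pdf.pages behaves like a list of pages.
pdf = pikepdf.Pdf.open('ticket.pdf')
for i, page in enumerate(pdf.pages):
    # Create an empty PDF, copy the current page into it, and save it.
    dst = pikepdf.Pdf.new()
    dst.pages.append(page)
    # i starts at 0, so the files are ticket_00.pdf, ticket_01.pdf, ...
    dst.save(f'ticket_{i:02d}.pdf')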
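
If you want to reuse the snippet, the same loop can be wrapped in a function. The version below is only a sketch: the names page_filename and split_pdf are made up for this post and are not part of pikepdf. It also makes two choices the snippet above does not: pages are numbered from 1, and the zero padding grows when the file has more than 99 pages.

```python
from pathlib import Path


def page_filename(stem: str, page_number: int, total_pages: int) -> str:
    """Build a zero-padded name such as 'ticket_03.pdf' (hypothetical helper)."""
    # Pad to at least two digits, more if the page count needs it.
    width = max(2, len(str(total_pages)))
    return f"{stem}_{page_number:0{width}d}.pdf"


def split_pdf(src: str, out_dir: str = ".") -> list[Path]:
    """Write every page of `src` to its own PDF and return the new paths."""
    # Imported here so page_filename() can be used without pikepdf installed.
    import pikepdf

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    stem = Path(src).stem
    written = []
    # The with block closes the source file once all pages are copied.
    with pikepdf.Pdf.open(src) as pdf:
        total = len(pdf.pages)
        for number, page in enumerate(pdf.pages, start=1):
            dst = pikepdf.Pdf.new()
            dst.pages.append(page)
            target = out / page_filename(stem, number, total)
            dst.save(target)
            written.append(target)
    return written
```

Calling split_pdf('ticket.pdf', 'pages') should then leave pages/ticket_01.pdf, pages/ticket_02.pdf and so on, assuming ticket.pdf exists and pikepdf is installed.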


