PDFs

This section covers a few Python libraries that can be used to work with PDF files.

PyPDF

Perform common tasks like merging, splitting and adding watermarks to PDF documents.

PDFMiner.six

Get detailed access to the internal structure of PDF documents, such as extracting text with precise positioning, extracting images, or navigating complex document structures.

fpdf2

The library is specifically designed for creating PDF files from scratch.

Usage

PyPDF

    from pypdf import PdfReader

    reader = PdfReader("example.pdf")
    print(len(reader.pages))

PDFMiner.six

    from pdfminer.high_level import extract_text

    print(extract_text('samples/simple1.pdf'))

fpdf2

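    from fpdf import FPDF

    pdf = FPDF()
    pdf.add_page()
    # "Arial" is only a deprecated alias in fpdf2; Helvetica is the built-in core font
    pdf.set_font("Helvetica", size=25)
    # create a cell, then move to the start of the next line
    # (new_x/new_y replace the deprecated ln=1)
    pdf.cell(200, 10, "Hello World!", new_x="LMARGIN", new_y="NEXT", align="C")
    pdf.output("info.pdf")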

AI tools and LLMs are also quite good at working with PDFs.
