Extract images from PDF

13 Jan 2023, updated 13 May 2023
A while ago, I came across a small graphic novel in a PDF file, and I needed the pages as regular image files.
Here’s a quick way to extract bitmap images from PDFs, using Python.
# Install dependency
pip install pymupdf
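import fitz  # PyMuPDF

doc = fitz.open("/path/to/file.pdf")
for i in range(len(doc)):
    for img in doc.get_page_images(i):
        xref = img[0]  # cross-reference number of the embedded image
        pix = doc.extract_image(xref)  # dict holding the raw bytes and metadata
        # Save with the image's native extension (e.g. png, jpeg) instead of assuming PNG
        with open("p%s-%s.%s" % (i, xref, pix["ext"]), "wb") as imgout:
            imgout.write(pix["image"])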
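If a PDF reuses the same image on several pages (a logo or a background, say), the loop above writes it once per page. Here is a minimal sketch of one way to skip the repeats, assuming it is enough to remember which xrefs were already saved. The names `image_filename` and `extract_unique_images` and the `out_dir` parameter are my own, not part of PyMuPDF; the extension comes from the `"ext"` key of the dictionary that `extract_image` returns.

```python
from pathlib import Path


def image_filename(page_index, xref, ext):
    """Build a name like 'p0-12.jpeg' (hypothetical naming helper)."""
    return "p%s-%s.%s" % (page_index, xref, ext)


def extract_unique_images(pdf_path, out_dir="."):
    """Save each embedded image once, even if several pages reference it."""
    import fitz  # PyMuPDF; imported here so image_filename works without it

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    seen = set()  # xrefs that have already been written
    written = []
    doc = fitz.open(pdf_path)
    try:
        for i in range(len(doc)):
            for img in doc.get_page_images(i):
                xref = img[0]
                if xref in seen:
                    continue  # same image object reused on another page
                seen.add(xref)
                info = doc.extract_image(xref)
                target = out / image_filename(i, xref, info["ext"])
                target.write_bytes(info["image"])
                written.append(target)
    finally:
        doc.close()
    return written
```

Calling `extract_unique_images("/path/to/file.pdf", "pages")` should leave one file per distinct image in a `pages` folder, named after the first page where the image appears.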