Questions tagged [pypdf]

pypdf is a pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files, and it can retrieve text and metadata from PDFs.

1,511 questions

0 votes

0 answers

24 views

Merged .pdf file created on python is damaged

I'm very new to Python. I wanted to merge multiple PDFs - 121 files - in a particular order (since Python merges in alphabetical order). So, I created a filelist.txt file while merging all the .pdf ...

proy

asked yesterday
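
The excerpt above describes merging PDFs in an order taken from a list file. A minimal sketch of that idea follows, assuming filelist.txt holds one PDF path per line in the desired order (the question's actual file format is not shown). The helper names `read_file_order` and `merge_in_order` are hypothetical; `PdfWriter.append` and `PdfWriter.write` are pypdf calls.

```python
from pathlib import Path


def read_file_order(list_path):
    """Return the PDF paths named in a text file, one per line, in file order."""
    lines = Path(list_path).read_text(encoding="utf-8").splitlines()
    # Skip blank lines and trim stray whitespace around each path.
    return [line.strip() for line in lines if line.strip()]


def merge_in_order(list_path, output_path):
    """Append each listed PDF to one writer, preserving the listed order."""
    from pypdf import PdfWriter  # imported lazily; requires pypdf to be installed

    writer = PdfWriter()
    for pdf_path in read_file_order(list_path):
        writer.append(pdf_path)
    # Binary mode matters: writing a PDF in text mode can corrupt it.
    with open(output_path, "wb") as handle:
        writer.write(handle)
    writer.close()
```

Driving the merge from an explicit list keeps the order independent of how the operating system or `glob` happens to sort file names.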

1 vote

2 answers

48 views

When using PyPDF2 for Python, how do I transfer data in CSV format to an existing PDF with blank form fields?

I am currently using the PyPDF2 extension with Python. My data was originally a Google Form, which I downloaded as a CSV file, and I am hoping to copy this data into an existing PDF with ...

Felicia

asked 2 days ago

0 votes

0 answers

36 views

Is there a function in pypdf to get the page number of a field? (Python)

I'm trying to find an attribute or function that will return the page number/index of a field that I pass as an argument. E.g. get_field_page_number(field_name) -> int I want to be able to get a ...

mevans_fsi

asked Jul 3 at 20:45
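
One possible way to approach the lookup described above is to scan each page's `/Annots` array for a widget whose `/T` entry matches the field name. The sketch below is an assumption, not a confirmed pypdf function: the name `get_field_page_number` comes from the question, the extra `pages` parameter and the `_resolve` helper are hypothetical, and fields whose name lives only on a `/Parent` object are not handled.

```python
def _resolve(obj):
    """Follow an indirect reference if the object supports it (hypothetical helper)."""
    return obj.get_object() if hasattr(obj, "get_object") else obj


def get_field_page_number(pages, field_name):
    """Return the 0-based index of the first page with a matching widget, else None."""
    for index, page in enumerate(pages):
        annotations = _resolve(page["/Annots"]) if "/Annots" in page else []
        for entry in annotations:
            annotation = _resolve(entry)
            # "/T" holds the partial field name on a widget annotation.
            if _resolve(annotation.get("/T")) == field_name:
                return index
    return None
```

With pypdf this would be called as `get_field_page_number(reader.pages, "some_field")`, where `reader` is a `PdfReader` and `"some_field"` is a placeholder name.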

0 votes

0 answers

40 views

pypdf: extract_text in extraction_mode="layout" is working if table is on one page but not working if the table goes to 2nd page

I am using pypdf to extract text using the code below. It works if the table is on one page (closing the table), but if the table is extended to another page (partially on one page and the rest ...

Ravi P

asked Jul 1 at 21:09

0 votes

0 answers

44 views

Python PDF page size

I am trying to get the page sizes of the pages in my PDF. I have tried using both PyPDF2 and pdfminer, I get the same results from both - 423.024x639.024 for artbox, cropbox, etc, and 459.048x675.048 ...

calwex718

asked Jul 1 at 18:07
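
The box sizes quoted above are in PDF user-space units, which default to 1/72 inch. A small sketch of the conversion follows, assuming the file does not set a non-default `UserUnit`; the function names are illustrative only.

```python
POINTS_PER_INCH = 72.0  # default PDF user-space unit is 1/72 inch
MM_PER_INCH = 25.4


def points_to_inches(points):
    """Convert a length in PDF points to inches."""
    return points / POINTS_PER_INCH


def points_to_mm(points):
    """Convert a length in PDF points to millimetres."""
    return points_to_inches(points) * MM_PER_INCH


# The two sizes reported in the question, converted for comparison.
for width_pt, height_pt in [(423.024, 639.024), (459.048, 675.048)]:
    print(
        f"{width_pt} x {height_pt} pt = "
        f"{points_to_inches(width_pt):.3f} x {points_to_inches(height_pt):.3f} in = "
        f"{points_to_mm(width_pt):.1f} x {points_to_mm(height_pt):.1f} mm"
    )
```

Under that assumption the first size is roughly 5.88 x 8.88 in, and the second is about half an inch larger in each dimension.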

2 votes

0 answers

44 views

Trying to extract information from pdf files in google colab. It is just repeating most information from the first file into all the others

This is the code: for file in files.get('files', []): # ... (Get file content as before) # Extract data from the PDF pdf_reader = PyPDF2.PdfReader(BytesIO(file_content)) page = ...

Victor Brandao

asked Jun 26 at 17:43

0 votes

0 answers

30 views

Python pypdf watermark position

I am trying to add a watermark to a pdf using pypdf. I have a watermark.pdf file which has 'Confidential' in small red font on the very top left of the page. However when I try to stamp or watermark '...

Novice Python charmer

asked Jun 14 at 15:08

0 votes

1 answer

127 views

I can't get any PDF uploads to read

The app is supposed to read multiple PDFs but I can't get even a single PDF to work because of this issue. Any help is appreciated. I received the error: AttributeError: 'bytes' object has no ...

aria obscura

asked Jun 6 at 23:16
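
The error message above is cut off before the attribute name, so its cause cannot be confirmed from the excerpt. One common situation that produces this kind of error is passing raw `bytes` where a file-like object is expected. A minimal sketch under that assumption wraps the bytes in `io.BytesIO`; `as_stream` is a hypothetical helper name.

```python
import io


def as_stream(data):
    """Wrap raw bytes in a seekable in-memory stream; pass other objects through."""
    if isinstance(data, (bytes, bytearray)):
        return io.BytesIO(data)
    return data
```

The result could then be handed to a reader, for example `PdfReader(as_stream(uploaded_bytes))`, where `uploaded_bytes` stands in for whatever the upload handler returns.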

0 votes

1 answer

91 views

Create a blank page and add text content using PyPDF2: module 'PyPDF2' has no attribute 'pdf'

I am using this method to create a blank page, add text to it and then append the page to a pdf. def add_text_to_blank_page(pdf_writer, text): # Create a new blank page page = PyPDF2._pdf....

Dhruv

asked May 29 at 13:48

0 votes

0 answers

62 views

Add image in Image Field with PDF Forms

I have PDF forms with a text field and an image field. How do I add an image to the image field? For text fields, the pypdf documentation has great information and I succeeded, but I fail to add an image in the image ...

aideed programmer

asked May 29 at 3:40

1 vote

1 answer

92 views

PyPDF does not give me the right image

I am writing a python program to merge multiple PDFs containing images into one PDF, with the option to select specific pages from PDF source files, specify the order and other things. I'm using PyPDF ...

Andreas Kågedal

asked May 20 at 21:12

0 votes

1 answer

117 views

How do I extract tables in the most efficient way using?

I have been using pdfplumber. Is there any other library apart from camelot, which uses PyPDF2 and now raises an error saying: File "C:\Users\USER\AppData\Local\Programs\Python\Python312\...

Mahmud Umar

asked May 15 at 9:20

0 votes

1 answer

109 views

Anaconda 3: nbconvert failed: PdfFileWriter is deprecated and was removed in PyPDF2 3.0.0. Use PdfWriter instead

ANACONDA 3 - Windows 11 Jupyter Lab: File -> Save As -> PDFviaHTML fails with the error below. Anaconda Prompt command Line: jupyter nbconvert -to xxx yyy.ipynb fails with the same error ...

Dave

asked May 9 at 20:52

0 votes

0 answers

38 views

How can I add a sub-bookmark to a PDF with pypdf and keep the existing bookmarks across repeated attempts?

There are 2 questions: I want to add a sub-bookmark to an existing bookmark. All I know is how to add a first-level bookmark. I want to know how to keep the existing bookmarks to avoid deleting an ...

Spring Harbor

asked May 9 at 17:34

0 votes

0 answers

95 views

Python - Extract certain values from PDFs in a folder

I am using the below code to extract text from hundreds of PDF files in a specific folder: from pypdf import PdfReader import os import glob path = input("Enter the file path: ") pattern = ...

Mr Cs

asked May 6 at 15:37
