
Questions tagged [pypdf]

pypdf is a pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files, and it can retrieve text and metadata from them.

pypdf
0 votes
0 answers
24 views

Merged .pdf file created in Python is damaged

I'm very new to Python. I wanted to merge multiple PDFs (121 files) in a particular order, since Python merges in alphabetical order. So, I created a filelist.txt file while merging all the .pdf ...
proy's user avatar
  • 1
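
For the merge-order question above, a minimal sketch of one possible approach: read the desired order from a list file and append each PDF to a single writer. The one-path-per-line layout of `filelist.txt` and the helper names `read_merge_order` and `merge_in_order` are assumptions, not taken from the question. Writing the output in binary mode and closing it properly is worth checking when a merged file comes out damaged, though the excerpt does not show the actual cause.

```python
from pathlib import Path


def read_merge_order(list_path):
    """Return the PDF paths listed in list_path, one per line, in file order."""
    lines = Path(list_path).read_text(encoding="utf-8").splitlines()
    # Skip blank lines; keep the order exactly as written in the list file.
    return [line.strip() for line in lines if line.strip()]


def merge_in_order(list_path, output_path):
    """Append each listed PDF to one writer, then write the merged file."""
    # Imported here so read_merge_order can be used without pypdf installed.
    from pypdf import PdfWriter

    writer = PdfWriter()
    for pdf_path in read_merge_order(list_path):
        writer.append(pdf_path)  # appends every page of that file
    with open(output_path, "wb") as out:  # binary mode, closed on exit
        writer.write(out)
    writer.close()
```

Usage would look like `merge_in_order("filelist.txt", "merged.pdf")`, where both file names are placeholders.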
1 vote
2 answers
48 views

When using PyPDF2 for Python, how do I transfer data in CSV format to an existing PDF with blank form fields?

I am currently using the PyPDF2 package with Python. My data (originally a Google Form) was downloaded as a CSV file, and I am hoping to copy it into an existing PDF with ...
Felicia's user avatar
  • 11
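
For the CSV-to-form question above, a hedged sketch using pypdf (the successor to PyPDF2) rather than PyPDF2 itself. It assumes a recent pypdf release, a CSV whose header names match the PDF field names exactly, and hypothetical helper names `load_rows` and `fill_form`; none of these come from the question.

```python
import csv


def load_rows(csv_path):
    """Read the CSV into a list of dicts keyed by the header row."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def fill_form(template_path, values, output_path):
    """Write a copy of the template with its form fields set from values."""
    # Imported here so load_rows can be used without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    writer.append(PdfReader(template_path))
    for page in writer.pages:
        if "/Annots" in page:  # only pages that actually carry widgets
            # Assumption: the keys of values equal the PDF field names.
            writer.update_page_form_field_values(
                page, values, auto_regenerate=False
            )
    with open(output_path, "wb") as out:
        writer.write(out)
```

One filled PDF per CSV row could then be produced with a loop such as `for i, row in enumerate(load_rows("responses.csv")): fill_form("template.pdf", row, f"filled_{i}.pdf")`, with all three file names being placeholders.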
0 votes
0 answers
36 views

Is there a function in pypdf to get the page number of a field? (Python)

I'm trying to find an attribute or function that will return the page number/index of a field that I pass as an argument. E.g. get_field_page_number(field_name) -> int I want to be able to get a ...
mevans_fsi's user avatar
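
For the field-page question above, one possible workaround is to scan each page's `/Annots` array for a widget whose `/T` entry equals the field name. This is a sketch, not a pypdf built-in: `find_field_pages` is a hypothetical helper, and it assumes the widget itself carries `/T` (widgets that inherit the name from a `/Parent` entry are not handled).

```python
def _resolve(obj):
    """Follow an indirect reference if the object provides get_object()."""
    return obj.get_object() if hasattr(obj, "get_object") else obj


def find_field_pages(pages, field_name):
    """Return zero-based indexes of pages holding a widget named field_name."""
    found = []
    for index, page in enumerate(pages):
        annots = _resolve(page.get("/Annots")) or []
        for ref in annots:
            annot = _resolve(ref)
            # Assumption: the widget stores its own name under /T.
            if annot.get("/T") == field_name:
                found.append(index)
                break  # one hit per page is enough
    return found
```

With pypdf this would be called as `find_field_pages(PdfReader("form.pdf").pages, "some_field")`, where the file and field names are placeholders.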
0 votes
0 answers
40 views

pypdf: extract_text in extraction_mode="layout" is working if table is on one page but not working if the table goes to 2nd page

I am using pypdf to extract text with the code below. It works if the table is on one page (closing the table), but if the table is extended to another page (partially on one page and the rest ...
Ravi P's user avatar
  • 157
0 votes
0 answers
44 views

Python PDF page size

I am trying to get the page sizes of the pages in my PDF. I have tried using both PyPDF2 and pdfminer, I get the same results from both - 423.024x639.024 for artbox, cropbox, etc, and 459.048x675.048 ...
calwex718's user avatar
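
For the page-size question above, it may help to remember that PDF box values are normally expressed in points (1/72 inch) unless the page sets a different user unit. A minimal sketch that reads several boxes with pypdf and converts them to millimetres; `points_to_mm` and `page_boxes_mm` are hypothetical helper names, and the excerpt does not say which result the asker expected.

```python
POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def points_to_mm(points):
    """Convert a length in PDF points to millimetres."""
    return float(points) / POINTS_PER_INCH * MM_PER_INCH


def page_boxes_mm(pdf_path):
    """Return, per page, the (width, height) in mm of each page box."""
    # Imported here so points_to_mm can be used without pypdf installed.
    from pypdf import PdfReader

    sizes = []
    for page in PdfReader(pdf_path).pages:
        boxes = {
            "mediabox": page.mediabox,
            "cropbox": page.cropbox,
            "trimbox": page.trimbox,
            "artbox": page.artbox,
        }
        sizes.append(
            {
                name: (points_to_mm(box.width), points_to_mm(box.height))
                for name, box in boxes.items()
            }
        )
    return sizes
```

Comparing the media box against the other boxes per page shows whether the two sets of numbers in a file differ only by a margin such as a bleed area.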
2 votes
0 answers
44 views

Trying to extract information from pdf files in google colab. It is just repeating most information from the first file into all the others

This is the code: for file in files.get('files', []): # ... (Get file content as before) # Extract data from the PDF pdf_reader = PyPDF2.PdfReader(BytesIO(file_content)) page = ...
Victor Brandao's user avatar
0 votes
0 answers
30 views

Python pypdf watermark position

I am trying to add a watermark to a PDF using pypdf. I have a watermark.pdf file which has 'Confidential' in small red font at the very top left of the page. However, when I try to stamp or watermark '...
Novice Python charmer's user avatar
0 votes
1 answer
127 views

I can't get any PDF uploads to read

The app is supposed to read multiple PDFs but I can't get even a single PDF to work because of this issue. Any help is appreciated. I received the error: AttributeError: 'bytes' object has no ...
aria obscura's user avatar
0 votes
1 answer
91 views

Create a blank page and add text content using PyPDF2: module 'PyPDF2' has no attribute 'pdf'

I am using this method to create a blank page, add text to it, and then append the page to a PDF. def add_text_to_blank_page(pdf_writer, text): # Create a new blank page page = PyPDF2._pdf....
Dhruv's user avatar
  • 645
0 votes
0 answers
62 views

Add image in Image Field with PDF Forms

I have PDF forms with a text field and an image field. How do I add an image to the image field? For text fields, the pypdf documentation is informative and I succeeded, but I fail to add an image in the image ...
aideed programmer's user avatar
1 vote
1 answer
92 views

PyPDF does not give me the right image

I am writing a python program to merge multiple PDFs containing images into one PDF, with the option to select specific pages from PDF source files, specify the order and other things. I'm using PyPDF ...
Andreas Kågedal's user avatar
0 votes
1 answer
117 views

How do I extract tables in the most efficient way using?

I have been using pdfplumber. Is there any other library apart from camelot, which uses PyPDF2 and now there's an error saying: File "C:\Users\USER\AppData\Local\Programs\Python\Python312\...
Mahmud Umar's user avatar
0 votes
1 answer
109 views

Anaconda 3: nbconvert failed: PdfFileWriter is deprecated and was removed in PyPDF2 3.0.0. Use PdfWriter instead

Anaconda 3 - Windows 11 Jupyter Lab: File -> Save As -> PDFviaHTML fails with the error below. Anaconda Prompt command line: jupyter nbconvert --to xxx yyy.ipynb fails with the same error ...
Dave's user avatar
  • 1
0 votes
0 answers
38 views

How can I add a sub-bookmark to a PDF with pypdf and keep the existing bookmarks across repeated attempts?

I have two questions: I want to add a sub-bookmark to an existing bookmark, but I only know how to add a first-level bookmark. I also want to know how to keep the existing bookmarks to avoid deleting an ...
Spring Harbor's user avatar
0 votes
0 answers
95 views

Python - Extract certain values from PDFs in a folder

I am using the below code to extract text from hundreds of PDF files in a specific folder: from pypdf import PdfReader import os import glob path = input("Enter the file path: ") pattern = ...
Mr Cs's user avatar
  • 1
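
For the folder-extraction question above, a hedged sketch of how the text from each PDF could be searched with a regular expression. The helper names `find_values` and `scan_folder` and the example pattern are assumptions for illustration; the question's own pattern is cut off in the excerpt.

```python
import glob
import os
import re


def find_values(text, pattern):
    """Return every match (or captured group) of pattern in text."""
    return re.findall(pattern, text)


def scan_folder(folder, pattern):
    """Map each PDF file name in folder to the values found in its text."""
    # Imported here so find_values can be used without pypdf installed.
    from pypdf import PdfReader

    results = {}
    for pdf_path in sorted(glob.glob(os.path.join(folder, "*.pdf"))):
        reader = PdfReader(pdf_path)
        # Join all pages so values near a page break are still searched.
        text = "\n".join(page.extract_text() or "" for page in reader.pages)
        results[os.path.basename(pdf_path)] = find_values(text, pattern)
    return results
```

A call such as `scan_folder(path, r"Invoice No:\s*(\d+)")` would collect the captured numbers per file; the pattern here is a made-up example and would need to match the real documents.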
