PDFDocument vs. SimplePDFViewer
pdfreader provides 2 different interfaces for PDFs: PDFDocument and SimplePDFViewer.
What is the difference?
PDFDocument:
- knows nothing about interpreting content-level PDF operators
- knows all about PDF file and document structure (types, objects, indirect objects, references, etc.)
- can be used to access any document object: XRef table, DocumentCatalog, page tree nodes (aka Pages), binary streams like Font, CMap, Form, Page, etc.
- can be used to access raw object content (a raw page content stream, for example)
- has no graphical state
SimplePDFViewer:
- uses PDFDocument as its document navigation engine
- can render document content, properly decoding it and interpreting PDF operators
- has graphical state
Use PDFDocument to navigate a document and access raw data.
Use SimplePDFViewer to extract the content you see in your favorite viewer (Adobe Acrobat Reader, hehe :-).
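To make the split concrete, here is a minimal sketch of the same first page seen through both interfaces. The helper names (first_page_raw_stream, first_page_strings, compare) and the file path are hypothetical and exist only for this illustration. The sketch assumes the first page has a Contents entry and that the file contains text.

```python
def first_page_raw_stream(doc):
    """Return the raw content object of the first page.

    `doc` is expected to be a pdfreader.PDFDocument. Nothing is rendered
    here: the PDF operators inside the stream are left uninterpreted.
    """
    page = next(doc.pages())   # walk the page tree and take the first page
    return page.Contents


def first_page_strings(viewer):
    """Return the text strings found on the first page.

    `viewer` is expected to be a pdfreader.SimplePDFViewer. Rendering
    interprets the operators and fills the viewer's canvas.
    """
    viewer.navigate(1)         # page numbers start at 1
    viewer.render()
    return viewer.canvas.strings


def compare(path):
    """Hypothetical helper: open `path` with both interfaces."""
    # Imported here so the helpers above can be read and reused
    # without pdfreader being installed.
    from pdfreader import PDFDocument, SimplePDFViewer

    with open(path, "rb") as fd:
        raw = first_page_raw_stream(PDFDocument(fd))
    with open(path, "rb") as fd:
        strings = first_page_strings(SimplePDFViewer(fd))
    return raw, strings
```

Calling compare("example.pdf") on a file of your own should return the undecoded page content next to the strings a viewer would display.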
Let’s see several use cases.