A PDF document consists of primitive and high-level PDF objects. It can be interpreted as a graph of linked primitive PDF objects, where each object is one of nine types defined in the PDF specification: array, boolean, dictionary, name, null, number, reference, stream, or string.
All high-level PDF objects in the DsPdf object model (such as Page, AnnotationBase, Action, etc.) are implemented as wrappers around primitive PDF objects. A wrapper holds a reference to the underlying primitive PDF type (PdfDict, PdfArray, PdfDictObject, etc.) and provides methods and properties for accessing and manipulating that object. The root class for all high-level objects is PdfWrapperBase; it holds a reference to the underlying primitive PDF object, which is exposed as an IPdfObject.
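The wrapper relationship described above can be pictured with a small self-contained sketch. The types below (IPrimitive, PrimitiveNumber, PrimitiveDict, WrapperBase, PageSketch) are hypothetical stand-ins invented for illustration; they are not DsPdf types and do not reflect how DsPdf is actually implemented. The sketch only shows the general idea of a typed property that reads and writes an entry in a wrapped primitive dictionary.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration only; these are NOT DsPdf types.
interface IPrimitive { }

sealed class PrimitiveNumber : IPrimitive
{
    public double Value { get; }
    public PrimitiveNumber(double value) => Value = value;
}

sealed class PrimitiveDict : IPrimitive
{
    public Dictionary<string, IPrimitive> Entries { get; } =
        new Dictionary<string, IPrimitive>();
}

abstract class WrapperBase
{
    // Reference to the underlying primitive object.
    public IPrimitive Primitive { get; }
    protected WrapperBase(IPrimitive primitive) => Primitive = primitive;
}

// A high-level object: its typed property is backed by the wrapped dictionary.
sealed class PageSketch : WrapperBase
{
    private PrimitiveDict Dict => (PrimitiveDict)Primitive;

    public PageSketch(PrimitiveDict dict) : base(dict) { }

    public double Rotate
    {
        // Falls back to 0 when the entry is missing or is not a number.
        get => Dict.Entries.TryGetValue("Rotate", out var v) && v is PrimitiveNumber n
            ? n.Value
            : 0;
        set => Dict.Entries["Rotate"] = new PrimitiveNumber(value);
    }
}

static class Program
{
    static void Main()
    {
        var dict = new PrimitiveDict();
        var page = new PageSketch(dict);

        page.Rotate = 90;

        // The change made through the wrapper is visible in the primitive dictionary.
        Console.WriteLine(((PrimitiveNumber)dict.Entries["Rotate"]).Value);
    }
}
```

Because the wrapper stores no state of its own, anything written directly into the primitive dictionary is also visible through the wrapper's properties.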
DsPdf lets you work directly with the primitive objects that make up every high-level entity in a PDF document. For example, DocumentInfo is a PDF dictionary. To do so, use the following interfaces and classes, along with their methods and properties, from the GrapeCity.Documents.Pdf.Spec namespace:
Interface/Class | Description |
---|---|
IPdfObject | It is the common interface supported by all PDF objects in a GcPdfDocument that are persisted in a PDF file. The Indirect and ObjID properties let you determine whether a PDF object is indirect and get its object ID. |
IPdfArray | It is the common interface implemented by PdfArray, PdfArrayObject, and PdfArrayWrapper types. |
IPdfArrayExt | It contains extension methods for the IPdfArray interface. |
IPdfDict | It is the common interface implemented by PdfDict, PdfDictObject, and PdfDictWrapper types. |
IPdfDictExt | It contains extension methods for the IPdfDict interface. |
IPdfName | It is the common interface for PdfName and PdfNameObject. |
IPdfNameExt | It contains extension methods for the IPdfName interface. |
IPdfNumber | It is the common interface for PdfNumber and PdfNumberObject. |
IPdfNumberExt | It contains extension methods for the IPdfNumber interface. |
IPdfRef | It is the common interface for PdfRef and PdfRefObject. |
IPdfRefExt | It contains extension methods for the IPdfRef interface. |
IPdfString | It is the common interface for PdfString and PdfStringObject. |
IPdfStringExt | It contains extension methods for the IPdfString interface. |
IPdfBool | It is the common interface for PdfBool and PdfBoolObject. |
IPdfBoolExt | It contains extension methods for the IPdfBool interface. |
IPdfNull | It is the common interface for PdfNull and PdfNullObject. |
IPdfNullExt | It contains extension methods for the IPdfNull interface. |
PdfArray | It represents a direct PDF array object. |
PdfArrayObject | It represents an indirect PDF array object. |
PdfArrayWrapper | It represents an array wrapper object. |
PdfDict | It represents a direct PDF dictionary object. |
PdfDictObject | It represents an indirect PDF dictionary object. |
PdfDictWrapper | It represents a dictionary wrapper object. |
PdfName | It represents a direct PDF name object. This class overrides GetHashCode() and Equals(object) methods and defines the equality and inequality operators. This class is immutable. |
PdfNameObject | It represents an indirect PDF name object. |
PdfNumber | It represents a direct PDF number object. The class overrides GetHashCode() and Equals(object) methods and defines the equality and inequality operators. This class is immutable. |
PdfNumberObject | It represents an indirect PDF number object. |
PdfStreamObjectBase | It represents a PDF stream. It is always an indirect object, as a stream cannot be a direct object in PDF. |
PdfRef | It represents a direct PDF reference object. This class overrides GetHashCode() and Equals(object) methods. The class is immutable. |
PdfRefObject | It represents an indirect PDF reference object. |
PdfString | It represents a direct PDF string object. This class overrides GetHashCode() and Equals(object) methods and defines the equality and inequality operators. The class is immutable. |
PdfStringObject | It represents an indirect PDF string object. |
PdfBool | It represents a direct PDF bool object. You cannot create instances of this class from user code; instead, use the two predefined instances, PdfBool.True and PdfBool.False. This class overrides GetHashCode() and Equals(object) methods and defines the equality and inequality operators. |
PdfBoolObject | It represents an indirect PDF bool object. |
PdfNull | It represents a direct PDF null object. You cannot create instances of this class from user code; instead, use the PdfNull.Instance predefined instance. This class overrides GetHashCode() and Equals(object) methods and defines the equality and inequality operators. This class is immutable. |
PdfNullObject | It represents an indirect PDF null object. |
The PDF specification defines the properties that can be present in the DocumentInfo dictionary (Creator, Author, etc.), but PDF producers can add arbitrary custom properties, such as the SourceModified property, which is found in many real-world PDF files. Types from the GrapeCity.Documents.Pdf.Spec namespace let you read, write, and edit such custom elements.
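As a minimal sketch of how such entries might be inspected, the snippet below lists every key in the DocumentInfo dictionary, standard or custom. It assumes that DocumentInfo exposes its underlying dictionary through PdfDict.Dict in the same way that PdfImage does in the image example in this topic; that assumption, the DocumentInfoDump class name, and the pdfPath parameter are illustrative and should be checked against the API reference.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using GrapeCity.Documents.Pdf;
using GrapeCity.Documents.Pdf.Spec;

// Hypothetical helper class for illustration.
static class DocumentInfoDump
{
    // Prints every entry of the DocumentInfo dictionary, including custom ones.
    // pdfPath is a caller-supplied path to an existing PDF file.
    public static void Dump(string pdfPath)
    {
        using (FileStream fs = File.OpenRead(pdfPath))
        {
            GcPdfDocument doc = new GcPdfDocument();
            doc.Load(fs);

            // ASSUMPTION: DocumentInfo is a dictionary wrapper that exposes
            // PdfDict.Dict, as PdfImage does in the image example.
            foreach (KeyValuePair<PdfName, IPdfObject> kvp in doc.DocumentInfo.PdfDict.Dict)
            {
                Console.WriteLine($"{kvp.Key}: {kvp.Value}");
            }
        }
    }
}
```

If the document contains a custom entry such as SourceModified, it would appear in this listing alongside the standard keys.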
Since most high-level objects in a PDF file are PDF dictionaries, in the DsPdf API, the corresponding objects are derived from the PdfDictWrapper class, which in turn is derived from PdfWrapperBase and uses IPdfDict as the underlying object. The GetPdfStream, GetPdfStreamInfo, and GetPdfStreamData methods of PdfWrapperBase can retrieve data from the PDF stream associated with the PDF dictionary.
Each high-level PDF object implements one of the primitive interfaces, depending on its type, so you can call the extension methods of the GrapeCity.Documents.Pdf.Spec namespace on these high-level objects.
Refer to the following example code to get image properties from a PDF document:
```csharp
// fs is assumed to be an open, readable stream over the source PDF.
// Initialize GcPdfDocument.
GcPdfDocument doc = new GcPdfDocument();
// Load PDF document.
doc.Load(fs);

// Get the first image from the PDF document.
var imgs = doc.GetImages();
var pi = imgs[0].Image;

// Write image ID.
Console.WriteLine($"PdfImage object ID: {pi.ObjID}");

/* PdfImage is a descendant of PdfDictWrapper and has many methods that let you
   get properties and data from the underlying PDF stream object. */
using (PdfStreamInfo psi = pi.GetPdfStreamInfo())
{
    // Get image information such as stream length, filter name, filter decode parameters, etc.
    Console.WriteLine($" Image stream length: {psi.Stream.Length}");
    Console.WriteLine($" ImageFilterName: {psi.ImageFilterName}");
    Console.WriteLine($"ImageFilterDecodeParams: {psi.ImageFilterDecodeParams}");

    // Decode parameters are optional in a PDF stream, so guard against null.
    if (psi.ImageFilterDecodeParams != null)
    {
        // Dump content of ImageFilterDecodeParams.
        foreach (var kvp in psi.ImageFilterDecodeParams.Dict)
        {
            Console.WriteLine($"{kvp.Key}: {kvp.Value}");
        }

        // Get value of BlackIs1.
        var blackIs1 = psi.ImageFilterDecodeParams.GetBool(PdfName.Std.BlackIs1, null);
        Console.WriteLine($"BlackIs1: {blackIs1}");
    }
}

// Dump properties of PdfImage dictionary.
Console.WriteLine();
Console.WriteLine("Properties of PdfImage dictionary:");
foreach (KeyValuePair<PdfName, IPdfObject> kvp in pi.PdfDict.Dict)
{
    Console.WriteLine($"{kvp.Key}: {kvp.Value}");
}

// Get color space and bits per component; either entry may be absent.
var cs = pi.Get<IPdfObject>(PdfName.Std.ColorSpace);
Console.WriteLine($"ColorSpace: {cs?.GetType().Name} {cs}");
var bpc = pi.Get<IPdfObject>(PdfName.Std.BitsPerComponent);
Console.WriteLine($"BitsPerComponent: {bpc?.GetType().Name} {bpc}");
```