PDF Output

PDF output is a standard function of AH Formatter V5.0. The PDF versions that can be output are as follows:

PDFs output by AH Formatter V5.0 have the following features.

See also PDF Output Settings for more details.

PDF/X

PDF/X is defined in ISO 15930 and is a subset of PDF intended for exchanging data for printing. In principle, all the information needed for printing is included in the PDF file. AH Formatter V5.0 can output the following versions of PDF/X. PDF/X cannot be output with AH Formatter V5.0 Lite.

The main features of PDF/X are as follows.

Feature                                                           PDF/X-1a  PDF/X-2  PDF/X-3
All fonts must be embedded.                                       Yes       Yes      Yes
The output intent must be specified.                              Yes       Yes      Yes
Only CMYK and spot colors are supported, including in images.     Yes       No       No
Transparent images must be avoided.                               Yes       Yes      Yes
Passwords and printing or editing restrictions must not be set.   Yes       No       Yes
Must contain neither links nor annotations, etc.                  Yes       Yes      Yes

In PDF/X, all fonts must be embedded. When a font that does not permit embedding is used, PDF/X cannot be generated.

PDF/X applies the appropriate settings and ignores most user-specified settings, such as font embedding. For example, when the version of the output PDF is set to PDF/X, Font Embedding in the PDF Option Setting Dialog is grayed out.

To specify the output intent with an ICC color profile, specify the URL of the ICC color profile in the src property of fo:color-profile. In this case, either omit the color-profile-name property or specify "#CMYK" or "#RGB". If the property is omitted, it is treated as "#CMYK". For example:

<fo:declarations>
 <fo:color-profile
  src="url(file:///C:/WINDOWS/system32/spool/drivers/color/JapanColor2001Coated.icc)"
  color-profile-name="#CMYK"
 />
</fo:declarations>

The output condition identifier can also be specified with the src property of fo:color-profile. It is written as a URI fragment. For example:

<fo:declarations>
 <fo:color-profile
  src="#OutputConditionIdentifier=CGATS TR 001&amp;RegistryName=http://www.color.org"
 />
</fo:declarations>

The first character must be #. After that, the parameters are listed, separated by & (written as &amp; inside an XML attribute value). Each parameter has the form name=value. The parameter names are as follows (they map to the entries of the OutputIntent dictionary for PDF/X):

It is also possible to supply profile information by appending fragment parameters in this format to the URI of the ICC color profile.

<fo:declarations axf:base-uri="url(file:///C:/WINDOWS/system32/spool/drivers/color/)">
 <fo:color-profile src="url('Photoshop5DefaultCMYK.icc#Info=Photoshop5')"/>
</fo:declarations>
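
As a further sketch, several fragment parameters can be appended to an ICC profile URI in the same way, joined by &amp;. The pairing below is only illustrative: it reuses the Info and RegistryName values from the examples above, and it is an assumption that they describe this particular profile.

```xml
<fo:declarations axf:base-uri="url(file:///C:/WINDOWS/system32/spool/drivers/color/)">
 <!-- Illustrative only: parameter values are reused from the examples above -->
 <fo:color-profile
  src="url('Photoshop5DefaultCMYK.icc#Info=Photoshop5&amp;RegistryName=http://www.color.org')"
 />
</fo:declarations>
```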

When the output intent is not specified in FO, default-output-intent in the PDF Output Settings will be adopted.

The standard ICC color profile can be downloaded from Adobe.

http://www.adobe.com/support/downloads/detail.jsp?ftpID=3145

The downloadable profiles may already be bundled with Adobe Acrobat and installed on your system. If your OS is Windows, look in the %windir%\system32\spool\drivers\color directory.

PDF/A

PDF/A-1 is defined by ISO 19005-1:2005 and is a specification intended for long-term preservation of electronic documents, based on the PDF 1.4 specification. AH Formatter V5.0 can output the following versions of PDF/A. PDF/A cannot be output with AH Formatter V5.0 Lite.

The main features of PDF/A are as follows.

Feature                                     PDF/A-1a:2005  PDF/A-1b:2005
All fonts must be embedded.                 Yes            Yes
Files must be tagged.                       Yes            No
Files must include XMP metadata.            Yes            Yes
Files may not include encryption.           Yes            Yes
Files may not include LZW compression.      Yes            Yes
Files may not include transparent images.   Yes            Yes
Files may not refer to external content.    Yes            Yes
Files may not include JavaScript.           Yes            Yes

In PDF/A, as in PDF/X, all fonts must be embedded. PDF/A cannot be generated when a font that does not permit embedding is used.

Most settings, such as font embedding, are set appropriately regardless of the user settings. In PDF/A-1a, tagging is always performed. See also Tagged PDF.

XMP metadata is generated automatically from the document information of the PDF. AH Formatter V5.0 cannot embed any metadata other than this.

CAUTION: In AH Formatter V5.0, you cannot specify PDF/A and PDF/X simultaneously.

Tagged PDF

An ordinary PDF does not carry document structure in its content. For example, sentences are broken at each line. In a multi-column layout, the first line of the right column follows the first line of the left column. Therefore, when a person with a visual impairment, for example, tries to read a PDF with some kind of reading software, it is very difficult to read the text in the correct order. The same applies to extracting text from a PDF.

Tagged PDF gives PDF documents a structure by embedding tags in the PDF. A structured document makes the PDF reusable as information. For this reason, Tagged PDF is indispensable for creating accessible PDF documents. See the site below for more on accessibility.

http://www.adobe.com/enterprise/accessibility/

AH Formatter V5.0 embeds the following tags (StructElem) for each FO element.

FO element                    PDF element  Comment
fo:root                       Document
fo:page-sequence              Part
fo:flow                       Sect
fo:static-content             Sect
fo:block                      P or Div     P when it has inline-level content, otherwise Div
fo:block-container            Div or Sect  Sect when absolute-position="fixed" or "absolute", otherwise Div
fo:inline                     Span
fo:inline-container           Span
fo:leader                     Span
fo:page-number                Span
fo:page-number-citation       Span
fo:page-number-citation-last  Span
fo:scaling-value-citation     Span
fo:index-page-citation-list   Span
fo:bidi-override              Span
fo:footnote                   Note
fo:footnote-body              Sect
fo:float                      Sect
fo:external-graphic           Figure
fo:instream-foreign-object    Figure
fo:basic-link                 Link
fo:list-block                 L
fo:list-item                  LI
fo:list-item-label            Lbl
fo:list-item-body             LBody
fo:table                      Table
fo:table-caption              Caption
fo:table-header               THead
fo:table-footer               TFoot
fo:table-body                 TBody
fo:table-row                  TR
fo:table-cell                 TD
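
As a rough illustration of the mapping above, the following fragment of a page sequence is annotated with the tag each element would be expected to receive. The content and the file name figure.png are hypothetical; the comments simply restate the table and are not verified formatter output.

```xml
<!-- fo:flow: Sect -->
<fo:flow flow-name="xsl-region-body">
 <!-- fo:block with inline-level content: P -->
 <fo:block>Plain paragraph text.</fo:block>
 <!-- fo:block containing only blocks: Div -->
 <fo:block>
  <!-- P -->
  <fo:block>Nested paragraph.</fo:block>
 </fo:block>
 <!-- fo:list-block: L -->
 <fo:list-block>
  <!-- fo:list-item: LI -->
  <fo:list-item>
   <!-- fo:list-item-label: Lbl -->
   <fo:list-item-label end-indent="label-end()">
    <fo:block>1.</fo:block>
   </fo:list-item-label>
   <!-- fo:list-item-body: LBody -->
   <fo:list-item-body start-indent="body-start()">
    <fo:block>First item.</fo:block>
   </fo:list-item-body>
  </fo:list-item>
 </fo:list-block>
 <fo:block>
  <!-- fo:external-graphic: Figure -->
  <fo:external-graphic src="figure.png"/>
 </fo:block>
</fo:flow>
```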

There are some tags which are not structural elements.

PDF element    Comment
Artifact       It is mapped to content that is distinguished from the body text of a page. static-content that is output repeatedly at each page break, table-header (except the one at the beginning of the table), and table-footer (except the one at the end of the table) become Artifact.
ReversedChars  It is mapped to text that runs from right to left, such as Arabic.
Span           It is used to record, as ActualText, the character string as it was before being processed for display. It is different from Span as a structural element. The string before processing means, for example, the string before hyphenation is applied, or the string before complex glyph substitution is performed in Thai, etc.

To create a tagged PDF, check Tagged PDF in the PDF Option Setting Dialog, or specify -taggedpdf with the command-line interface.

AH Formatter V5.0 handles each check item of Adobe Acrobat's Accessibility Full Check (as of Acrobat 7.0) as follows:

Tagged PDF cannot be output with AH Formatter V5.0 Lite.

Digital Signature

With the Windows version of AH Formatter V5.0, a digital signature can be applied to a PDF at output time in an environment where the PDF Digital Signature Module is installed.

Customers must purchase the PDF Digital Signature Module to apply digital signatures. See the Antenna House website for more details.

To apply a digital signature with AH Formatter V5.0, a signature area must be created in FO. A signature area can be created by using the <axf:form-field> extension element.

<axf:form-field
     field-type="signature"
     field-apply-signature="true"
     width="40pt"
     height="50pt"
/>

The field type is specified by field-type="signature". The appearance of the signature cannot be specified here; only properties such as the size of the field can be. The actual signing is performed when the PDF is generated, at which point the signature area is generated in the PDF as a signed signature field. Although any number of signature areas can be defined, AH Formatter V5.0 can sign only one of them in the generated PDF. Specify field-apply-signature="true" on the area you actually want to sign. When field-apply-signature="true" is specified more than once, the signature is applied to the first area specified. Signature areas that are not signed are generated as unsigned signature fields. Such fields can be signed afterwards by using PDF Digital Signature Module.
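
The following sketch shows two signature areas, only one of which is signed at output time. It assumes that axf:form-field can be placed as inline content of an fo:block; the label text and the sizes are hypothetical.

```xml
<fo:block>
 Approved by:
 <!-- Signed when the PDF is generated -->
 <axf:form-field
      field-type="signature"
      field-apply-signature="true"
      width="40pt"
      height="50pt"
 />
</fo:block>
<fo:block>
 Countersigned by:
 <!-- No field-apply-signature: generated as an unsigned signature field -->
 <axf:form-field
      field-type="signature"
      width="40pt"
      height="50pt"
 />
</fo:block>
```

The second field could then be signed afterwards with PDF Digital Signature Module.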

Moreover, to actually sign, you must specify both the signature information, which defines the appearance of the signature and so on, and the certificate to be used. This information is defined and saved by PDF Digital Signature Module. In the GUI, the digital signature can be specified in the Security section of the PDF Option Setting Dialog. On the command line, it can be specified with the -pdss option. For the other interfaces, see their respective setting methods.

When outputting PDF from the GUI, the signature field is not generated in the PDF unless "Apply Digital Signature" is checked. Similarly, in an environment where PDF Digital Signature Module is not installed, the signature field is not generated even if field-type="signature" is specified. PDF Digital Signature Module applies a signature to a signature field in a PDF. Note that even if you intend to sign later rather than with AH Formatter V5.0, you must still enable digital signature processing when formatting. The same applies to the other interfaces.

The following restrictions apply.

See also the manual of PDF Digital Signature Module for more details about digital signatures.

CAUTION: PDF Digital Signature Module is available only in the Windows version, excluding the Windows x64 version. Also, the digital signature function cannot be used with AH Formatter V5.0 Lite.

PDF Embedding

A PDF document can be embedded in another PDF.

This is done with <fo:external-graphic>, in the same way as an image. See Graphics.

<fo:external-graphic src="embedded.pdf#page=3"/>

As shown above, specify the page number to embed in the URI. When no page number is specified, the first page is embedded. When no size is specified, the page is embedded at the page size of the embedded PDF. To scale it, specify content-width or content-height as follows:

<fo:external-graphic src="embedded.pdf#page=3" content-width="50%"/>
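
A variant, as a small sketch: with no #page fragment the first page of the PDF is used, and content-height can be given in the same way as content-width. The value 50% simply mirrors the example above.

```xml
<fo:block>
 <!-- No #page fragment: the first page of embedded.pdf is embedded -->
 <fo:external-graphic src="embedded.pdf" content-height="50%"/>
</fo:block>
```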

When a PDF is specified with the data URI scheme (RFC 2397), the page number can be given as a parameter of the media type, as follows.

<fo:external-graphic src="data:application/pdf;page=3;base64,JVBERi0xLjQKJeLjz9M..."/>

A PDF can also be embedded as a background. This is useful when a pre-designed form serves as the background and only the content data is formatted on top of it. When specifying a PDF as a background, specify axf:background-repeat="no-repeat" on fo:simple-page-master or fo:page-sequence as follows. axf:background-repeat="repeat" cannot be specified.

<fo:simple-page-master axf:background-image="background.pdf"
                       axf:background-repeat="no-repeat" ...>

When embedding a PDF as a background, it is possible to embed not just one page but two or more consecutive pages. To do so, specify axf:background-repeat="paginate" as follows.

<fo:simple-page-master axf:background-image="background.pdf#page=3-5"
                       axf:background-repeat="paginate" ...>

In this example, pages 3 to 5 are embedded as backgrounds. When the number of pages generated from the contents of fo:flow is smaller than the number of embedded PDF pages, pages are added so that all pages of the embedded PDF are output. Therefore, it is not a problem if the contents of fo:flow are empty. When the contents of fo:flow generate more pages than the embedded PDF has, the pages beyond the last embedded PDF page have no background image. Specify the pages in the form #page=<FirstPage>-<LastPage>. When axf:background-repeat="paginate" is not specified, the -<LastPage> portion is ignored.

background.pdf#page=3-5   from the 3rd page to the 5th page
background.pdf#page=3-    from the 3rd page to the last page
background.pdf#page=3     the 3rd page only
background.pdf            all pages
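
Putting the pieces together, a minimal complete document might look like the sketch below. The page size, margin, and master name are hypothetical, and the axf namespace URI is the one commonly used for Antenna House extensions. The flow holds only an empty block, so by the rule described above the formatter would be expected to add pages until all three background pages are output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format"
         xmlns:axf="http://www.antennahouse.com/names/XSL/Extensions">
 <fo:layout-master-set>
  <!-- Pages 3 to 5 of background.pdf become the backgrounds of successive pages -->
  <fo:simple-page-master master-name="form"
                         page-width="210mm" page-height="297mm"
                         axf:background-image="background.pdf#page=3-5"
                         axf:background-repeat="paginate">
   <fo:region-body margin="20mm"/>
  </fo:simple-page-master>
 </fo:layout-master-set>
 <fo:page-sequence master-reference="form">
  <fo:flow flow-name="xsl-region-body">
   <!-- Empty content: pages are added to cover the remaining background pages -->
   <fo:block/>
  </fo:flow>
 </fo:page-sequence>
</fo:root>
```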

When axf:background-image or axf:background-repeat is specified on both fo:page-sequence and fo:simple-page-master, fo:simple-page-master takes priority. It is possible to embed a PDF in fo:region-body/before/after/start/end by specifying it on fo:simple-page-master.

axf:background-repeat="paginate" cannot be specified with AH Formatter V5.0 Lite.

The version of the embedded PDF must be less than or equal to the version of the output PDF. The following table shows the acceptable combinations with PDF/X, etc.
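
A sketch of the priority rule, using the hypothetical file names form.pdf and draft.pdf: both elements specify a background, and according to the rule above the one on fo:simple-page-master would be used.

```xml
<fo:layout-master-set>
 <!-- Specified on the page master: this background takes priority -->
 <fo:simple-page-master master-name="form"
                        page-width="210mm" page-height="297mm"
                        axf:background-image="form.pdf"
                        axf:background-repeat="no-repeat">
  <fo:region-body margin="20mm"/>
 </fo:simple-page-master>
</fo:layout-master-set>
<!-- Also specified on the page sequence: overridden by the page master above -->
<fo:page-sequence master-reference="form"
                  axf:background-image="draft.pdf"
                  axf:background-repeat="no-repeat">
 <fo:flow flow-name="xsl-region-body">
  <fo:block>Content data</fo:block>
 </fo:flow>
</fo:page-sequence>
```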

                 Embedded PDF
                 PDF                      PDF/X                                        PDF/A
Output PDF       1.3  1.4  1.5  1.6  1.7  1a:2001  3:2002   1a:2003  2:2003   3:2003   1a:2005  1b:2005
PDF1.3           Ok   -    -    -    -    Ok       Ok       -        -        -        -        -
PDF1.4           Ok   Ok   -    -    -    Ok       Ok       Ok       Ok       Ok       Ok       Ok
PDF1.5           Ok   Ok   Ok   -    -    Ok       Ok       Ok       Ok       Ok       Ok       Ok
PDF1.6           Ok   Ok   Ok   Ok   -    Ok       Ok       Ok       Ok       Ok       Ok       Ok
PDF1.7           Ok   Ok   Ok   Ok   Ok   Ok       Ok       Ok       Ok       Ok       Ok       Ok
PDF/X-1a:2001    -    -    -    -    -    Ok       -        -        -        -        -        -
PDF/X-3:2002     -    -    -    -    -    Ok       Ok       -        -        -        -        -
PDF/X-1a:2003    -    -    -    -    -    Ok       -        Ok       -        -        -        -
PDF/X-2:2003     -    -    -    -    -    Ok       Ok       Ok       Ok       Ok       -        -
PDF/X-3:2003     -    -    -    -    -    Ok       Ok       Ok       -        Ok       -        -
PDF/A-1a:2005    -    -    -    -    -    -        -        -        -        -        -        -
PDF/A-1b:2005    -    -    -    -    -    -        -        -        -        -        Partial  Partial

When the output PDF is a Tagged PDF, a PDF cannot be embedded. When embedding PDF/A into PDF/A-1b:2005, an error occurs if the OutputIntents are not compatible.

Font Output

Adobe Type1 fonts (including the Adobe Standard 14 fonts), TrueType fonts (including OpenType fonts with TrueType outlines), OpenType fonts (PostScript outlines), and Macintosh TrueType data fork suitcase fonts are supported for PDF output. Other font formats are not supported. For more details, refer to "Fonts".

AH Formatter V5.0 requires the fonts specified in documents to be installed on your system in order to use them correctly. For how to install fonts in the Windows version, refer to Windows help or follow the installation instructions supplied with the fonts. In the Windows version, fonts located outside the font folder can also be output to PDF. In that case, some settings are required in the font configuration file. However, such fonts cannot be displayed in the GUI.

These 14 Adobe Type1 fonts are called Standard 14 Fonts in PDF.

Even when using an Adobe Type1 font other than the Standard 14 Fonts, it is not necessary to prepare an AFM (Adobe Font Metrics) file. The glyph names of Adobe Type1 fonts are mapped to the character codes (Unicode) of the formatting data according to the AGL (Adobe Glyph List) specification. Glyphs whose names are not defined in AGL are not output. See also Unicode and glyph mapping using the .AFM file for more details about the .AFM file.

Character Sets, Encoding

The following character sets are supported:

  • Adobe Standard Latin character set
  • Symbol character set
  • ZapfDingbats character set
  • Japanese character set (Adobe-Japan1-Supplement2)
  • Simplified Chinese character set (Adobe-GB1-Supplement2)
  • Traditional Chinese character set (Adobe-CNS1-Supplement0)
  • Korean character set (Adobe-Korea1-Supplement1)

All characters are processed as Unicode within AH Formatter V5.0. For Chinese, Japanese, and Korean (CJK), AH Formatter V5.0 maps Unicode to the glyphs of each CJK character set by using the following CMaps.

  • Japanese : UniJIS-UCS2-H(V) UniJIS-UCS2-HW-H(V)
  • Simplified Chinese : UniGB-UCS2-H(V)
  • Traditional Chinese : UniCNS-UCS2-H(V)
  • Korean : UniKS-UCS2-H(V)

The characters that do not belong to the above character sets are embedded in the PDF by getting the glyphs from the font files. This process is done only for TrueType fonts.

Font Embedding

By embedding fonts in a PDF, the PDF can be displayed even in an environment where the fonts are not installed.

In the default setting for TrueType fonts, only the outlines of glyphs that are not defined by the CMap are embedded. When a TrueType font whose vendor prohibits embedding is encountered, an error occurs and processing stops. You can avoid this error by choosing to replace the affected characters with white space and continue the PDF output. You can also specify an option to embed all glyphs of a font, whether or not the character is defined by the CMap.

In the default setting for Adobe Type1 fonts, only the outlines of fonts that have a font-specific encoding are embedded. You can also specify an option to embed all glyphs of a font, whether the font has a standard encoding or a font-specific encoding.

You cannot embed a font in a PDF if the font does not permit embedding. See also PDF Output Settings to learn how to specify the fonts you want to embed.

In the following cases, the font is always embedded regardless of the settings. If the font does not permit embedding, it cannot be used.

  • Fonts for the following scripts
    • Arabic
    • Hebrew
    • Thai
    • Devanagari
  • Latin script in ligatured form (axf:ligature-mode)
  • Glyphs of modified Japanese Kanji (axf:japanese-glyph)
  • Unicode characters that cannot be expressed in 16 bits

Image Output

For more information about supported graphic images, refer to the "Graphics".

Vector Images

The following vector images are output to PDF as vector graphics, with their vector primitives replaced by PDF operators.

In the Windows version, vector images other than the above are converted to raster images and output to PDF. The resolution of the rasterized image in the resulting PDF can be set as a dpi value. See rasterize-resolution in PDF Output Settings. In non-Windows versions, vector images that cannot be output to PDF are ignored.

Please refer to EPS in Graphics for details.

CAUTION: MathML can be used only with "AH Formatter MathML Option" with AH Formatter V5.0 Lite.
CAUTION: If AH Formatter CGM Option is not installed, please refer to Graphics for more details.

Raster Images

Raster graphic data is generally stored in the graphic file in compressed form. If both the compression method and the original (uncompressed) image format are compatible with the PDF file format, the compressed data is embedded directly into the PDF file. If either the compression method or the original image format is not compatible with PDF, the graphic data is uncompressed and converted to a PDF-compatible bitmap format for output. If the graphic data cannot be uncompressed, it cannot be processed. The converted bitmap is compressed with JPEG or ZLIB compression and embedded into the PDF file. See the image-compression and jpeg-quality attributes under PDF Output Settings in the option setting file. These attributes do not apply to raster image data that is embedded directly into the PDF file.

The following raster images can be embedded directly in PDF.
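
As a sketch only, the two attributes named above might appear in the option setting file as follows. The element names formatter-config and pdf-settings and the values "jpeg" and "80" are assumptions made for illustration; check PDF Output Settings for the actual structure and permitted values.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed layout of the option setting file; verify against PDF Output Settings -->
<formatter-config>
 <!-- Applies only to images that are re-encoded, not to directly embedded ones -->
 <pdf-settings
  image-compression="jpeg"
  jpeg-quality="80"
 />
</formatter-config>
```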

However, there are the following restrictions.

  • Progressive JPEG and interlaced GIF images are converted to regular JPEG or GIF images.
  • 16-bit color in PNG or TIFF is reduced to 8-bit color.
  • When an alpha channel is attached to a PNG or TIFF image, it is separated from the image.
  • Some TIFF formats are not supported.
  • JPEG2000 is embedded as is only when the output is PDF1.5 or later. For other versions, it is embedded after being converted to JPEG, etc.

Downsampling

In AH Formatter V5.0, raster images embedded in PDF can be downsampled.

How they are downsampled can be specified under Compression in the PDF Option Setting Dialog or in the Option Setting File.