Fine-Tuning Converters – Convert HTML, MHTML, EPUB, and SVG in Java
You can convert HTML to various popular formats in a few ways using Aspose.HTML for Java:
- By using the convertHTML() methods of the Converter class. This is the most common way to convert HTML into various formats.
- By using the renderTo(device) method of the HTMLDocument class or the render() methods of the Renderer class. This alternative way to render HTML documents gives you more control over the rendering process in your Java application.
Rendering Device
The rendering (output) device encapsulates a 2D drawing surface whose API is defined by the IDevice interface. Currently, Aspose.HTML for Java implements a set of rendering devices – PdfDevice, XpsDevice, DocDevice, and ImageDevice – which are used to generate PDF, XPS, DOCX, and image file formats, respectively.
The next example shows how to use PdfDevice to render an HTML document to a PDF file with the default rendering options:
- Load an HTML document.
- Create an instance of the PdfDevice class using one of the PdfDevice() constructors.
- Call the renderTo(device) method of the HTMLDocument class.
// Prepare HTML code
String code = "<span>Hello, World!!</span>";

// Initialize an HTML document from HTML code
HTMLDocument document = new HTMLDocument(code, ".");

// Create an instance of the PdfDevice class and specify the output file to render
PdfDevice device = new PdfDevice("output.pdf");

// Render HTML to PDF
document.renderTo(device);
Rendering Options
Rendering options give you additional control over the output device. Each rendering device (PdfDevice, XpsDevice, DocDevice, and ImageDevice) has its own set of options, implemented by the PdfRenderingOptions, XpsRenderingOptions, DocRenderingOptions, and ImageRenderingOptions classes, respectively. For example, you can change the page size, adjust the margins and background color, reduce the file size by adjusting the image quality and resolution, or set a security password if you use PdfDevice.
Below is a demonstration of how to use PdfRenderingOptions to customize the page size during HTML to PDF rendering:
// Prepare HTML code
String code = "<span>Hello, World!!</span>";

// Initialize an HTML document from the HTML code
HTMLDocument document = new HTMLDocument(code, ".");

// Create an instance of PdfRenderingOptions and set a custom page size
PdfRenderingOptions options = new PdfRenderingOptions();
PageSetup pageSetup = new PageSetup();
Page anyPage = new Page();
anyPage.setSize(
    new Size(
        Length.fromInches(5),
        Length.fromInches(2)
    )
);
pageSetup.setAnyPage(anyPage);
options.setPageSetup(pageSetup);

// Create a PDF Device and specify options and output file
PdfDevice device = new PdfDevice(options, "output.pdf");

// Render HTML to PDF
document.renderTo(device);
General Options
The com.aspose.html.rendering package contains numerous renderer objects and the low-level options classes responsible for rendering documents to an IDevice implementation. The RenderingOptions and CssOptions classes represent the general rendering options.
General options apply to all rendering devices and to every rendering process from HTML to PDF, XPS, DOCX, and images. Let’s look at some of them:
Horizontal and Vertical Resolution
The horizontal and vertical resolution settings are essential for achieving high-quality output when rendering HTML to other formats, for example, HTML to PDF. Both values are read and set in dots per inch (dpi), with a default value of 300 dpi. Higher values give crisper details and smoother rendering of elements such as text, images, and horizontal and vertical lines in the PDF.
The following example renders the same document at 50 dpi and at 300 dpi. Because the style sheet contains a min-resolution: 300dpi media query, the paragraph background should be blue in the first output and green in the second. The resolution ultimately affects the size and quality of the resulting PDF file:
// Prepare HTML code and save it to a file
String code = "<style>\n" +
        "    p\n" +
        "    {\n" +
        "        background: blue;\n" +
        "    }\n" +
        "    @media (min-resolution: 300dpi)\n" +
        "    {\n" +
        "        p\n" +
        "        {\n" +
        "            /* high resolution screen color */\n" +
        "            background: green\n" +
        "        }\n" +
        "    }\n" +
        "</style>\n" +
        "<p>Hello World!!</p>\n";

try (java.io.FileWriter fileWriter = new java.io.FileWriter("document.html")) {
    fileWriter.write(code);
}

// Create an instance of the HTMLDocument class
HTMLDocument document = new HTMLDocument("document.html");

// Create options for low-resolution screens
PdfRenderingOptions options = new PdfRenderingOptions();
options.setHorizontalResolution(Resolution.to_Resolution(50d));
options.setVerticalResolution(Resolution.to_Resolution(50d));

// Create an instance of the PdfDevice
PdfDevice device = new PdfDevice(
    options,
    "output_resolution_50.pdf"
);

// Render HTML to PDF
document.renderTo(device);

// Create options for high-resolution screens
options = new PdfRenderingOptions();
options.setHorizontalResolution(Resolution.to_Resolution(300d));
options.setVerticalResolution(Resolution.to_Resolution(300d));

// Create an instance of PDF device
device = new PdfDevice(
    options,
    "output_resolution_300.pdf"
);

// Render HTML to PDF
document.renderTo(device);
CSS Media Type
The CSS media type is an important feature that specifies how a document is to be presented on different media: on the screen, on paper, with a braille device, etc. There are two ways to specify the media type for a style sheet: via a linked style sheet or via an inline style sheet:
Linked Style Sheet
<link rel="stylesheet" type="text/css" media="print" href="style.css">
Inline Style Sheet
<style type="text/css">
@media print {
    body { color: #000000; }
}
</style>
Aspose.HTML for Java supports this feature, so you can convert HTML documents as they look on screen or in print by applying the corresponding media types and style sheets. The following example shows how to set the media type:
// Prepare HTML code
String code = "<span>Hello, World!!</span>";

// Initialize an HTML document from the HTML code
HTMLDocument document = new HTMLDocument(code, ".");

// Create an instance of the PdfRenderingOptions class
PdfRenderingOptions options = new PdfRenderingOptions();
// Set the 'screen' media-type
options.getCss().setMediaType(MediaType.Screen);

// Create a PDF Device and specify options and output file
PdfDevice device = new PdfDevice(options, "output.pdf");

// Render HTML to PDF
document.renderTo(device);
Please note that the default value of CssOptions.MediaType is Print. This means that the document is converted using the style sheets for print media and looks as it would on paper (you can use your browser's print preview to see the difference). If you want the document to look the way it is rendered on screen, use MediaType.Screen.
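The difference between the two media types is easier to see with a document whose print and screen styles differ visibly. The sketch below uses only the Java standard library to write such a test file. The class name MediaTypeTestDocument, the file name media-test.html, and the chosen colors are illustrative assumptions and are not part of Aspose.HTML.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class MediaTypeTestDocument {

    // Builds an HTML page whose text color depends on the CSS media type
    public static String buildHtml() {
        return "<style>\n"
             + "  @media print  { p { color: black; } }\n"
             + "  @media screen { p { color: green; } }\n"
             + "</style>\n"
             + "<p>Black in print media, green on screen</p>\n";
    }

    // Writes the test page to the given path and returns that path
    public static Path write(Path target) throws IOException {
        Files.write(target, buildHtml().getBytes(StandardCharsets.UTF_8));
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path file = write(Paths.get("media-test.html"));
        System.out.println("Wrote " + file.toAbsolutePath());
    }
}
```

If you load the resulting file with HTMLDocument and render it twice, the default Print setting should produce black text and MediaType.Screen should produce green text.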
Background Color
The Background Color setting specifies the color that fills the background of each page in the output file. By default, this property is set to Transparent, which means that the background has no visible fill unless you set one explicitly. A custom background color can improve the readability of a document, meet branding requirements, or support a particular visual design.
// Prepare HTML code and save it to a file
String code = "<p>Hello, World!!</p>";
try (java.io.FileWriter fileWriter = new java.io.FileWriter("document.html")) {
    fileWriter.write(code);
}

// Create an instance of the HTMLDocument class
HTMLDocument document = new HTMLDocument("document.html");

// Initialize options with 'cyan' as a background-color
PdfRenderingOptions options = new PdfRenderingOptions();
options.setBackgroundColor(Color.getCyan());

// Create an instance of the PdfDevice class
PdfDevice device = new PdfDevice(options, "output.pdf");

// Render HTML to PDF
document.renderTo(device);
Page Setup
The page setup is a set of parameters that determine the layout of a printed page. Those parameters include everything from the page size, margins, and auto-resizing to @page priority rules. Using this set of parameters, you can easily set up an individual layout for every page.
In some cases, the content of the HTML page could be wider than the page size defined in the options. If you don’t want the page content to be cut off, you can use the AdjustToWidestPage property of the PageSetup class. The following example shows how to adjust the page size to the content.
// Prepare HTML code
String code = "<style>\n" +
        "    div {\n" +
        "        page-break-after: always;\n" +
        "    }\n" +
        "</style>\n" +
        "<div style='border: 1px solid red; width: 400px'>First Page</div>\n" +
        "<div style='border: 1px solid red; width: 600px'>Second Page</div>\n";

// Initialize an HTML document from HTML code
HTMLDocument document = new HTMLDocument(code, ".");

// Create an instance of the PdfRenderingOptions class and set a custom page size
PdfRenderingOptions options = new PdfRenderingOptions();
options.getPageSetup().setAnyPage(new Page(new Size(500, 200)));

// Enable auto-adjusting for the page size
options.getPageSetup().setAdjustToWidestPage(true);

// Create an instance of the PdfDevice class and specify options and output file
PdfDevice device = new PdfDevice(options, "output.pdf");

// Render HTML to PDF
document.renderTo(device);
PDF Options
The PdfRenderingOptions class gives developers extensive control over the rendering process when converting HTML to PDF. It supports all general options and adds options specific to PDF output – DocumentInfo, Encryption, FormFieldBehaviour, and JpegQuality.
The following example shows how to set passwords and permissions for a PDF file:
// Prepare HTML code
String code = "<div>Hello, World!!</div>";

// Initialize an HTML document from the HTML code
HTMLDocument document = new HTMLDocument(code, ".");

// Create an instance of the PdfRenderingOptions class
PdfRenderingOptions options = new PdfRenderingOptions();

// Set file permissions
options.setEncryption(
    new PdfEncryptionInfo(
        "user_pwd",
        "owner_pwd",
        PdfPermissions.PrintDocument,
        PdfEncryptionAlgorithm.RC4_128
    )
);

// Create a PDF Device and specify options and output file
PdfDevice device = new PdfDevice(options, "output.pdf");

// Render HTML to PDF
document.renderTo(device);
Image Options
ImageRenderingOptions allows you to customize a wide range of settings, from smoothing (antialiasing), image resolution, and formats to image compression. The following example demonstrates how to change the resolution and antialiasing of the resulting image:
// Prepare HTML code
String code = "<div>Hello, World!!</div>";

// Initialize an instance of the HTMLDocument class based on prepared code
HTMLDocument document = new HTMLDocument(code, ".");

// Create an instance of the ImageRenderingOptions class
ImageRenderingOptions options = new ImageRenderingOptions();
options.setFormat(ImageFormat.Jpeg);

// Disable smoothing mode
options.setSmoothingMode(SmoothingMode.None);

// Set the image resolution as 75 dpi
options.setVerticalResolution(Resolution.fromDotsPerInch(75));
options.setHorizontalResolution(Resolution.fromDotsPerInch(75));

// Create an instance of the ImageDevice class
ImageDevice device = new ImageDevice(options, "output.jpg");

// Render HTML to Image
document.renderTo(device);
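As a rough guide to what a resolution value means for raster output, the pixel dimensions of a page are its physical size in inches multiplied by the resolution in dpi. The following sketch is plain Java arithmetic and does not call Aspose.HTML. The class name PixelSize and the page size of 8.5 x 11 inches are illustrative assumptions, and the exact dimensions produced by ImageDevice may differ because of rounding or page settings.

```java
public class PixelSize {

    // Converts a length in inches to whole pixels at the given resolution
    public static long toPixels(double inches, double dpi) {
        return Math.round(inches * dpi);
    }

    public static void main(String[] args) {
        double widthIn = 8.5;   // assumed page width in inches
        double heightIn = 11.0; // assumed page height in inches

        // Print the pixel dimensions of the same page at three resolutions
        for (double dpi : new double[] {75, 150, 300}) {
            System.out.printf("%.0f dpi: %d x %d px%n",
                    dpi, toPixels(widthIn, dpi), toPixels(heightIn, dpi));
        }
    }
}
```

For this page size, 75 dpi gives 638 x 825 px and 300 dpi gives 2550 x 3300 px, so quadrupling the resolution multiplies the pixel count by roughly sixteen. This is why lower resolutions produce noticeably smaller image files.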
Renderers
While the renderTo(device) method of the Document class lets you send a single document to the output rendering device, using Renderer instances directly lets you send multiple documents at once. Aspose.HTML for Java provides the following renderer implementations: HtmlRenderer, SvgRenderer, MhtmlRenderer, and EpubRenderer, which are used to render HTML, SVG, MHTML, and EPUB documents, respectively.
The next example demonstrates how to use HtmlRenderer to render multiple HTML documents:
// Prepare HTML code
String code1 = "<br><span style='color: green'>Hello, World!!</span>";
String code2 = "<br><span style='color: blue'>Hello, World!!</span>";
String code3 = "<br><span style='color: red'>Hello, World!!</span>";

// Create three HTML documents to merge later
HTMLDocument document1 = new HTMLDocument(code1, ".");
HTMLDocument document2 = new HTMLDocument(code2, ".");
HTMLDocument document3 = new HTMLDocument(code3, ".");

// Create an instance of HTML Renderer
HtmlRenderer renderer = new HtmlRenderer();

// Create an instance of the PdfDevice class
PdfDevice device = new PdfDevice("output.pdf");

// Merge all HTML documents to PDF
renderer.render(device, new HTMLDocument[]{document1, document2, document3});
Set Timeout
One more important feature available for renderers is the timeout setting. You can use it to specify how long you are prepared to wait for all internal processes in the document lifecycle, such as resource loading and active timers, to complete. You can, of course, specify an infinite waiting period. However, if the document contains a script with an endless loop, you will wait indefinitely. The example below demonstrates how to use the render(device, timeout, documents) method with the timeout parameter:
// Prepare HTML code
String code = "<script>\n" +
        "    var count = 0;\n" +
        "    setInterval(function()\n" +
        "    {\n" +
        "        var element = document.createElement('div');\n" +
        "        var message = (++count) + '. ' + 'Hello World!!';\n" +
        "        var text = document.createTextNode(message);\n" +
        "        element.appendChild(text);\n" +
        "        document.body.appendChild(element);\n" +
        "    }, 1000);\n" +
        "</script>\n";

// Initialize an HTML document based on prepared HTML code
HTMLDocument document = new HTMLDocument(code, ".");

// Create an instance of HTML Renderer
HtmlRenderer renderer = new HtmlRenderer();

// Create an instance of the PdfDevice class
PdfDevice device = new PdfDevice("output.pdf");

// Render HTML to PDF
renderer.render(device, 5, document);
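The reason a finite timeout matters can be illustrated without Aspose.HTML at all. The sketch below is a standard-library analogy, not the renderer's actual implementation: a task that never finishes stands in for the setInterval() timer above, and the hypothetical helper finishedWithin() waits for it only for a limited time. The class and method names are assumptions made for this illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWait {

    // Runs the task and reports whether it finished within the given time.
    // A task that is still running at the deadline is interrupted.
    public static boolean finishedWithin(Callable<?> task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true); // never keep the JVM alive for a stuck task
            return t;
        });
        Future<?> future = executor.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker thread
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A task that completes on its own
        Callable<Void> quick = () -> null;

        // A task that, like a repeating timer, never completes on its own
        Callable<Void> endless = () -> {
            while (true) {
                Thread.sleep(100);
            }
        };

        System.out.println("quick task finished:   " + finishedWithin(quick, 1000));
        System.out.println("endless task finished: " + finishedWithin(endless, 500));
    }
}
```

The first call reports true almost immediately, while the second reports false once the 500 ms limit has passed. Without a limit, the second call would never return, which is the situation the renderer timeout is meant to prevent.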
Conclusion
Aspose.HTML for Java is a powerful and flexible library for rendering HTML, MHTML, EPUB, and SVG to various formats, such as PDF, XPS, DOCX, and images. The Converter class is fast and easy to use for simple tasks. However, if you need more control over rendering options, use the renderTo(device) method.
With a wide range of rendering options and configurable features, developers have full control over the output, including resolution, page settings, CSS media types, and device-specific configurations. The flexibility of the API, demonstrated by the ability to use multiple renderers, configure general and format-specific options, and even manage timeouts, makes it a great choice for creating high-quality, customized documents.