Fine-Tuning Converters – Convert HTML, MHTML, EPUB, and SVG in Java

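You can convert HTML to various popular formats in a few ways using Aspose.HTML for Java. This article focuses on the lower-level route: rendering devices, rendering options, and renderers.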

Rendering Device

The rendering (output) device encapsulates a 2D drawing surface whose API is defined by the IDevice interface. Aspose.HTML for Java currently implements a set of rendering devices – PdfDevice, XpsDevice, DocDevice, and ImageDevice – which generate PDF, XPS, DOCX, and image files, respectively.

The next example shows how to use PdfDevice to render an HTML document to a PDF file with the default rendering options:

  1. Load an HTML document.
  2. Create an instance of the PdfDevice class using one of the PdfDevice() constructors.
  3. Call the renderTo(device) method of the HTMLDocument class.

// Prepare HTML code
String code = "<span>Hello, World!!</span>";

// Initialize an HTML document from HTML code
HTMLDocument document = new HTMLDocument(code, ".");

// Create an instance of the PdfDevice class and specify the output file to render
PdfDevice device = new PdfDevice("output.pdf");

// Render HTML to PDF
document.renderTo(device);
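
The relationship between a document and a rendering device can be pictured with a small plain-Java model. This is only an illustrative sketch of the pattern, not the library's implementation: DeviceSketch and everything inside it (SketchDevice, PlainDevice, BracketDevice, SketchDocument) are hypothetical names that do not exist in Aspose.HTML for Java.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of "one document, many output devices". Not Aspose.HTML API.
final class DeviceSketch {

    // Stand-in for a rendering device: a surface that receives drawing calls.
    interface SketchDevice {
        void drawText(String text);
        String result();
    }

    // One device keeps each piece of text on its own line.
    static final class PlainDevice implements SketchDevice {
        private final List<String> lines = new ArrayList<>();
        public void drawText(String text) { lines.add(text); }
        public String result() { return String.join("\n", lines); }
    }

    // Another device turns the same calls into a different output.
    static final class BracketDevice implements SketchDevice {
        private final StringBuilder out = new StringBuilder();
        public void drawText(String text) { out.append('[').append(text).append(']'); }
        public String result() { return out.toString(); }
    }

    // The document knows its content but not the output format.
    static final class SketchDocument {
        private final List<String> content;
        SketchDocument(List<String> content) { this.content = content; }

        // Same shape as renderTo(device): the document drives whichever device it is given.
        void renderTo(SketchDevice device) {
            for (String line : content) {
                device.drawText(line);
            }
        }
    }
}
```

Rendering the same SketchDocument to a PlainDevice and to a BracketDevice yields two different outputs from identical drawing calls, which is the idea behind swapping PdfDevice for another device.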

Rendering Options

Rendering options give you additional control over the output device. Every rendering device (PdfDevice, XpsDevice, DocDevice, and ImageDevice) has its own set of options, implemented by the PdfRenderingOptions, XpsRenderingOptions, DocRenderingOptions, and ImageRenderingOptions classes, respectively. For example, you can change the page size, adjust the margins and background color, reduce the file size by adjusting the image quality and resolution, or set a security password if you use PdfDevice.

Below is a demonstration of how to use PdfRenderingOptions to customize the page size during HTML to PDF rendering:

// Prepare HTML code
String code = "<span>Hello, World!!</span>";

// Initialize an HTML document from the HTML code
HTMLDocument document = new HTMLDocument(code, ".");

// Create an instance of PdfRenderingOptions and set a custom page size
PdfRenderingOptions options = new PdfRenderingOptions();
PageSetup pageSetup = new PageSetup();
Page anyPage = new Page();
anyPage.setSize(
        new Size(
                Length.fromInches(5),
                Length.fromInches(2)
        )
);
pageSetup.setAnyPage(anyPage);
options.setPageSetup(pageSetup);

// Create a PDF Device and specify options and output file
PdfDevice device = new PdfDevice(options, "output.pdf");

// Render HTML to PDF
document.renderTo(device);
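
To get a feel for the 5 × 2 inch page used above, it can help to express it in other common units. The sketch below is plain arithmetic and uses no Aspose.HTML types; PageSizeMath is a hypothetical helper name, and the constants are the standard 72 points per inch and 96 CSS pixels per inch.

```java
// Hypothetical helper for unit arithmetic only; not part of Aspose.HTML for Java.
final class PageSizeMath {
    static final double POINTS_PER_INCH = 72.0;     // typographic points
    static final double CSS_PIXELS_PER_INCH = 96.0; // CSS reference pixels

    // Converts a length in inches to typographic points.
    static double inchesToPoints(double inches) {
        return inches * POINTS_PER_INCH;
    }

    // Converts a length in inches to CSS pixels.
    static double inchesToCssPixels(double inches) {
        return inches * CSS_PIXELS_PER_INCH;
    }
}
```

With these constants, a 5 × 2 inch page is 360 × 144 points, or 480 × 192 CSS pixels.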

General Options

The com.aspose.html.rendering package contains renderer objects and the low-level options classes responsible for rendering documents to an IDevice implementation. The RenderingOptions and CssOptions classes represent the general rendering options.

General options are valid for all rendering devices and all rendering processes from HTML to PDF, XPS, DOCX, and images. Let’s look at some of them:

Horizontal and Vertical Resolution

The horizontal and vertical resolution settings are essential for achieving high-quality output when rendering HTML to other formats, for example, HTML to PDF. Both resolutions are read and set in dots per inch (dpi), with a default value of 300 dpi. This setting affects how crisply elements such as text, images, and horizontal and vertical lines are rendered in the PDF.
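
As a rough illustration of why resolution affects both quality and size, the sketch below computes how many pixels a rasterized area needs at a given dpi. It is plain arithmetic under the assumption that raster content scales linearly with dpi on each axis; ResolutionMath is a hypothetical name and nothing here calls the Aspose.HTML API.

```java
// Hypothetical helper for resolution arithmetic; not part of Aspose.HTML for Java.
final class ResolutionMath {

    // Pixels along one axis when a length in inches is rasterized at the given dpi.
    static long pixelsAlongAxis(double inches, double dpi) {
        return Math.round(inches * dpi);
    }

    // Total pixel count for a rectangular area with separate horizontal and vertical dpi.
    static long pixelCount(double widthInches, double heightInches,
                           double horizontalDpi, double verticalDpi) {
        return pixelsAlongAxis(widthInches, horizontalDpi)
                * pixelsAlongAxis(heightInches, verticalDpi);
    }
}
```

For a 2 × 1 inch area, 50 dpi gives 100 × 50 = 5,000 pixels, while 300 dpi gives 600 × 300 = 180,000 pixels, which is 36 times as many.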

The following example shows how to control the resolution of the resulting PDF file, ultimately affecting its size and quality:

// Prepare HTML code and save it to a file
String code = "<style>\n" +
        "    p {\n" +
        "        background: blue;\n" +
        "    }\n" +
        "    @media (min-resolution: 300dpi) {\n" +
        "        p {\n" +
        "            /* high resolution screen color */\n" +
        "            background: green;\n" +
        "        }\n" +
        "    }\n" +
        "</style>\n" +
        "<p>Hello, World!!</p>\n";

try (java.io.FileWriter fileWriter = new java.io.FileWriter("document.html")) {
    fileWriter.write(code);
}

// Create an instance of the HTMLDocument class
HTMLDocument document = new HTMLDocument("document.html");

// Create options for low-resolution screens
PdfRenderingOptions options = new PdfRenderingOptions();
options.setHorizontalResolution(Resolution.to_Resolution(50d));
options.setVerticalResolution(Resolution.to_Resolution(50d));

// Create an instance of the PdfDevice
PdfDevice device = new PdfDevice(
        options,
        "output_resolution_50.pdf"
);

// Render HTML to PDF
document.renderTo(device);

// Create options for high-resolution screens
options = new PdfRenderingOptions();
options.setHorizontalResolution(Resolution.to_Resolution(300d));
options.setVerticalResolution(Resolution.to_Resolution(300d));

// Create an instance of PDF device
device = new PdfDevice(
        options,
        "output_resolution_300.pdf"
);

// Render HTML to PDF
document.renderTo(device);
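
The style sheet in this example relies on a min-resolution media query, so the two outputs are expected to differ in paragraph background. The sketch below models that decision using standard CSS media-query semantics. It assumes the renderer evaluates the query against the configured resolution, as the example implies; MinResolutionQuery is a hypothetical name, not an Aspose.HTML class.

```java
// Hypothetical model of a "(min-resolution: Ndpi)" media feature; not Aspose.HTML API.
final class MinResolutionQuery {

    // A min-resolution feature matches when the device resolution is at least the threshold.
    static boolean matches(double deviceDpi, double thresholdDpi) {
        return deviceDpi >= thresholdDpi;
    }

    // Background the sample style sheet gives <p>: green at 300 dpi and above, blue otherwise.
    static String paragraphBackground(double deviceDpi) {
        return matches(deviceDpi, 300) ? "green" : "blue";
    }
}
```

Under these assumptions, the 50 dpi rendering keeps the blue background and the 300 dpi rendering switches to green.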

CSS Media Type

The CSS media type is an important feature that specifies how a document is presented on different media: on screen, on paper, with a braille device, etc. There are a few ways to specify the media type for a style sheet, such as a linked style sheet or an inline style sheet:

Linked Style Sheet

<link rel="stylesheet" type="text/css" media="print" href="style.css">

Inline Style Sheet

<style type="text/css">
@media print {
  body { color: #000000; }
}
</style>

Aspose.HTML for Java supports this feature, so you can convert HTML documents as they look on screen or in print by applying the corresponding media types and style sheets. The following example shows how to set the media type:

// Prepare HTML code
String code = "<span>Hello, World!!</span>";

// Initialize an HTML document from the HTML code
HTMLDocument document = new HTMLDocument(code, ".");

// Create an instance of the PdfRenderingOptions class
PdfRenderingOptions options = new PdfRenderingOptions();
// Set the 'screen' media-type
options.getCss().setMediaType(MediaType.Screen);

// Create a PDF Device and specify options and output file
PdfDevice device = new PdfDevice(options, "output.pdf");

// Render HTML to PDF
document.renderTo(device);

Please note that the default value of CssOptions.MediaType is Print. This means that the document is converted using the style sheets for print media and looks as it would on paper (you can use your browser's print preview to see the difference). If you want the document to look the way it is rendered on screen, use MediaType.Screen.
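
The effect of switching between Print and Screen can be pictured as filtering style rules by their media. The following is a simplified sketch of that idea in plain Java, assuming only the standard rule that "all" applies to every media type. MediaTypeSketch and Rule are hypothetical names, and real media queries are considerably richer than this exact-match check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of media-type filtering; not part of Aspose.HTML for Java.
final class MediaTypeSketch {

    // A declaration tagged with the media it targets: "print", "screen", or "all".
    static final class Rule {
        final String media;
        final String declaration;

        Rule(String media, String declaration) {
            this.media = media;
            this.declaration = declaration;
        }
    }

    // Keeps the declarations whose media is "all" or equals the active media type.
    static List<String> applicable(List<Rule> rules, String activeMedia) {
        List<String> result = new ArrayList<>();
        for (Rule rule : rules) {
            if (rule.media.equals("all") || rule.media.equals(activeMedia)) {
                result.add(rule.declaration);
            }
        }
        return result;
    }
}
```

Given one rule each for "all", "print", and "screen", calling applicable(rules, "print") keeps the first two and drops the screen-only rule, which mirrors what the default Print media type does to screen-only styles.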

Background Color

The Background Color setting specifies the color that fills the background of each page in the output file. By default, this property is set to Transparent, which means the background has no visible fill unless one is explicitly specified. Customizing the background color can improve the readability of a document, meet branding requirements, or support a visual design. The following example sets a cyan page background:

// Prepare HTML code and save it to a file
String code = "<p>Hello, World!!</p>";
try (java.io.FileWriter fileWriter = new java.io.FileWriter("document.html")) {
    fileWriter.write(code);
}

// Create an instance of the HTMLDocument class
HTMLDocument document = new HTMLDocument("document.html");

// Initialize options with 'cyan' as a background-color
PdfRenderingOptions options = new PdfRenderingOptions();
options.setBackgroundColor(Color.getCyan());

// Create an instance of the PdfDevice class
PdfDevice device = new PdfDevice(options, "output.pdf");

// Render HTML to PDF
document.renderTo(device);

Page Setup

The page setup is a set of parameters that determine the layout of a printed page. Those parameters include everything from the page size, margins, and auto-resizing to @page priority rules. Using this set of parameters, you can easily set up an individual layout for every page.

In some cases, the content of the HTML page can be wider than the page size defined in the options. If you don’t want the page content to be cut off, you can use the AdjustToWidestPage property of the PageSetup class. The following example shows how to adjust the page size to the content.

// Prepare HTML code
String code = "<style>\n" +
        "    div {\n" +
        "        page-break-after: always;\n" +
        "    }\n" +
        "</style>\n" +
        "<div style='border: 1px solid red; width: 400px'>First Page</div>\n" +
        "<div style='border: 1px solid red; width: 600px'>Second Page</div>\n";

// Initialize an HTML document from HTML code
HTMLDocument document = new HTMLDocument(code, ".");

// Create an instance of the PdfRenderingOptions class and set a custom page size
PdfRenderingOptions options = new PdfRenderingOptions();
options.getPageSetup().setAnyPage(new Page(new Size(500, 200)));

// Enable auto-adjusting for the page size
options.getPageSetup().setAdjustToWidestPage(true);

// Create an instance of the PdfDevice class and specify options and output file
PdfDevice device = new PdfDevice(options, "output.pdf");

// Render HTML to PDF
document.renderTo(device);
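
The idea behind adjusting to the widest page can be sketched as taking the maximum of the configured width and the widest content. This is a deliberately simplified model, not the library's algorithm, which this article does not describe. WidestPageSketch is a hypothetical name, and the sketch assumes that the page only grows and never shrinks, and that all widths share one unit.

```java
// Hypothetical model of "adjust to widest page"; not part of Aspose.HTML for Java.
final class WidestPageSketch {

    // Returns the configured width unless adjustment is enabled and some content is wider.
    static double effectivePageWidth(double configuredWidth,
                                     double[] contentWidths,
                                     boolean adjustToWidestPage) {
        if (!adjustToWidestPage) {
            return configuredWidth; // wider content would be cut off
        }
        double widest = configuredWidth;
        for (double width : contentWidths) {
            widest = Math.max(widest, width);
        }
        return widest;
    }
}
```

For a configured width of 500 and content widths of 400 and 600 (ignoring borders and margins), this model returns 600 with adjustment enabled and 500 without it.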

PDF Options

The PdfRenderingOptions class gives developers extensive control over the rendering process when converting HTML to PDF. It allows customization of all general options and, in addition, it offers options specific to rendering only to PDF format – DocumentInfo, Encryption, FormFieldBehaviour, and JpegQuality.

The following example demonstrates the functionality of setting permissions for a PDF file.

// Prepare HTML code
String code = "<div>Hello, World!!</div>";

// Initialize an HTML document from the HTML code
HTMLDocument document = new HTMLDocument(code, ".");

// Create the instance of the PdfRenderingOptions class
PdfRenderingOptions options = new PdfRenderingOptions();

// Set file permissions
options.setEncryption(
        new PdfEncryptionInfo(
                "user_pwd",
                "owner_pwd",
                PdfPermissions.PrintDocument,
                PdfEncryptionAlgorithm.RC4_128
        )
);

// Create a PDF Device and specify options and output file
PdfDevice device = new PdfDevice(options, "output.pdf");

// Render HTML to PDF
document.renderTo(device);

Image Options

ImageRenderingOptions allows you to customize a wide range of settings, from smoothing (antialiasing), image resolution, and format to image compression. The following example demonstrates how to change the resolution and antialiasing of the resulting image:

// Prepare HTML code
String code = "<div>Hello, World!!</div>";

// Initialize an instance of the HTMLDocument class based on prepared code
HTMLDocument document = new HTMLDocument(code, ".");

// Create an instance of the ImageRenderingOptions class
ImageRenderingOptions options = new ImageRenderingOptions();
options.setFormat(ImageFormat.Jpeg);

// Disable smoothing mode
options.setSmoothingMode(SmoothingMode.None);

// Set the image resolution as 75 dpi
options.setVerticalResolution(Resolution.fromDotsPerInch(75));
options.setHorizontalResolution(Resolution.fromDotsPerInch(75));

// Create an instance of the ImageDevice class
ImageDevice device = new ImageDevice(options, "output.jpg");

// Render HTML to Image
document.renderTo(device);
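
To see what smoothing (antialiasing) changes in a raster image, the sketch below draws a diagonal line with the standard java.awt classes and counts the distinct pixel colors. It uses java.awt only as a stand-in for the general concept and does not call Aspose.HTML; SmoothingSketch is a hypothetical name.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of antialiasing using java.awt; not Aspose.HTML API.
final class SmoothingSketch {

    // Draws a black diagonal line on a white 64x64 image and counts distinct pixel colors.
    static int distinctColors(boolean antialias) {
        BufferedImage image = new BufferedImage(64, 64, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                    antialias ? RenderingHints.VALUE_ANTIALIAS_ON
                              : RenderingHints.VALUE_ANTIALIAS_OFF);
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, 64, 64);
            g.setColor(Color.BLACK);
            g.drawLine(2, 5, 60, 40);
        } finally {
            g.dispose();
        }

        Set<Integer> colors = new HashSet<>();
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                colors.add(image.getRGB(x, y));
            }
        }
        return colors.size();
    }
}
```

With smoothing off, the image contains only the two source colors and the line has hard, stair-stepped edges. With smoothing on, intermediate grays appear along the edge, which looks smoother but tends to compress less tightly.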

Renderers

While the renderTo(device) method of the Document class sends a single document to the output rendering device, using Renderer instances directly lets you send multiple documents at once. Aspose.HTML for Java provides the following renderer implementations: HtmlRenderer, SvgRenderer, MhtmlRenderer, and EpubRenderer, which render HTML, SVG, MHTML, and EPUB documents, respectively.

The next example demonstrates how to use HtmlRenderer to render multiple HTML documents:

// Prepare HTML code
String code1 = "<br><span style='color: green'>Hello, World!!</span>";
String code2 = "<br><span style='color: blue'>Hello, World!!</span>";
String code3 = "<br><span style='color: red'>Hello, World!!</span>";

// Create three HTML documents to merge later
HTMLDocument document1 = new HTMLDocument(code1, ".");
HTMLDocument document2 = new HTMLDocument(code2, ".");
HTMLDocument document3 = new HTMLDocument(code3, ".");

// Create an instance of HTML Renderer
HtmlRenderer renderer = new HtmlRenderer();

// Create an instance of the PdfDevice class
PdfDevice device = new PdfDevice("output.pdf");

// Merge all HTML documents to PDF
renderer.render(device, new HTMLDocument[]{document1, document2, document3});

Set Timeout

Renderers also support a timeout setting. It specifies how long you are willing to wait for the internal processes of the document lifecycle, such as resource loading and active timers, to complete. You can specify an infinite waiting period, but if the document contains a script with an endless loop, rendering will never finish. The example below demonstrates how to use the render(device, timeout, documents) method with the timeout parameter:

// Prepare HTML code
String code = "<script>\n" +
        "    var count = 0;\n" +
        "    setInterval(function()\n" +
        "    {\n" +
        "        var element = document.createElement('div');\n" +
        "        var message = (++count) + '. ' + 'Hello World!!';\n" +
        "        var text = document.createTextNode(message);\n" +
        "        element.appendChild(text);\n" +
        "        document.body.appendChild(element);\n" +
        "    }, 1000);\n" +
        "</script>\n";

// Initialize an HTML document based on prepared HTML code
HTMLDocument document = new HTMLDocument(code, ".");

// Create an instance of HTML Renderer
HtmlRenderer renderer = new HtmlRenderer();

// Create an instance of the PdfDevice class
PdfDevice device = new PdfDevice("output.pdf");

// Render HTML to PDF, waiting no longer than the given timeout
renderer.render(device, 5, document);
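
The general idea of bounding a wait, rather than waiting indefinitely for work that may never finish, can be sketched with the standard java.util.concurrent classes. This is not how the library implements its timeout, which this article does not describe. TimeoutSketch and finishesWithin are hypothetical names used for illustration only.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration of a bounded wait; not part of Aspose.HTML for Java.
final class TimeoutSketch {

    // Runs the task on a worker thread and reports whether it finished in time.
    static boolean finishesWithin(Runnable task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread worker = new Thread(runnable);
            worker.setDaemon(true); // a runaway task must not keep the JVM alive
            return worker;
        });
        Future<?> future = executor.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker and stop waiting
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("Task failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A task that returns immediately makes finishesWithin return true, while a task that loops forever makes it return false once the timeout elapses instead of blocking the caller indefinitely.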

Conclusion

Aspose.HTML for Java is a powerful and flexible library for rendering HTML, MHTML, EPUB, and SVG to various formats, such as PDF, XPS, DOCX, and images. The Converter class is fast and easy to use for simple tasks. However, if you need more control over rendering options, use the renderTo(device) method.

With a wide range of rendering options and configurable features, developers have full control over the output, including resolution, page settings, CSS media types, and device-specific configurations. The flexibility of the API, demonstrated by the ability to use multiple renderers, configure general and format-specific options, and even manage timeouts, makes it a great choice for creating high-quality, customized documents.
