Aspose.Words for Java 22.5 Release Notes
Major Features
There are 130 improvements and fixes in this regular monthly release. The most notable are:
- Added support for loading EPUB documents.
- Added support for loading XML documents.
- Added support for the “Envelope No. 10” page size for printing.
- Implemented rendering of a border box around the MathML formulas and the strike lines.
- Improved font detection when rendering characters in MathML formulas.
- Improved text wrapping for RTL paragraphs with custom left indent.
Full List of Issues Covering all Changes in this Release (Reported by Java Users)
Key | Summary | Category |
---|---|---|
WORDSNET-19386 | Text-shift observed during Word to PDF conversion | New Feature |
WORDSNET-15581 | RTF to PDF conversion issue with table’s cell width | New Feature |
WORDSJAVA-2636 | JsonDataSource can’t properly parse root element of input stream. | Bug |
WORDSJAVA-2641 | Chart Axis formatting for near-zero value. | Bug |
WORDSJAVA-2725 | Incorrect LeftIndent() values for Xml documents | Bug |
WORDSJAVA-2726 | Small files with ambivalent encoding and file format detection. | Bug |
WORDSNET-17061 | Wrong Font for certain Arabic Characters used in PDF | Bug |
WORDSNET-23673 | FileCorruptedException is thrown upon loading DOCX document | Bug |
WORDSNET-23678 | Aspose.Words hangs upon rendering document | Bug |
WORDSNET-19196 | Text position is changed in output PDF | Bug |
WORDSNET-23658 | System.InvalidOperationException: Stack empty. is thrown on Range.Replace | Bug |
WORDSNET-23695 | System.InvalidOperationException: Infinite loop detected. exception thrown | Bug |
WORDSNET-23716 | Images are lost after loading word 2003 XML document | Bug |
WORDSNET-22835 | Unexpected Column Widths after HTML with Merged Cells is Converted to DOCX | Bug |
WORDSNET-23766 | Indent of list item is incorrect after comparing documents | Bug |
WORDSNET-23277 | Axis labels are wrapped improperly | Bug |
WORDSNET-23569 | FileCorruptedException is thrown upon loading HTML document | Bug |
WORDSNET-23571 | Uppercase text is rendered as regular text | Bug |
WORDSNET-23592 | UpdateFields() fails with NPE | Bug |
WORDSNET-21486 | Imported SVG-based 3D Pie Chart Renders Incorrectly in Word | Bug |
WORDSNET-20866 | DOC to HTML conversion throws System.NullReferenceException | Bug |
Full List of Issues Covering all Changes in this Release (Reported by .NET Users)
Key | Summary | Category |
---|---|---|
WORDSNET-9253 | Shaping issues with Telugu, Tamil, and Chinese characters | New Feature |
WORDSNET-8319 | Table column widths are calculated incorrectly during rendering | New Feature |
WORDSNET-8838 | Support loading EPUB file format | New Feature |
WORDSNET-3822 | Table headers are not wrapped properly | New Feature |
WORDSNET-8931 | Tab spacing is not respected in fixed page formats | New Feature |
WORDSNET-14941 | FILLIN fields are lost in output PDF and print | New Feature |
WORDSNET-22284 | Text position is changed after DOC to PDF conversion | New Feature |
WORDSNET-22697 | Add support for loading of XML documents | New Feature |
WORDSNET-22887 | Add loading progress notification | New Feature |
WORDSNET-23577 | Add .NET 6.0 assemblies to the release build | New Feature |
WORDSNET-12720 | Table contents do not render correctly in output PDF | New Feature |
WORDSNET-8487 | Paragraphs followed by Tightly wrapped Shapes render incorrectly in PDF | New Feature |
WORDSNET-10869 | Add feature to format page number | New Feature |
WORDSNET-9075 | Table column widths are calculated incorrectly during rendering | Enhancement |
WORDSNET-7128 | Text wrapping in Cell is not correct in PDF | Enhancement |
WORDSNET-8325 | WordML to PDF conversion issue with table rendering | Enhancement |
WORDSNET-12186 | Picture and Textbox cause Aspose.Words to render content on one additional page | Enhancement |
WORDSNET-13405 | Table width in percent is not honored when converted from DOCX to XPS | Enhancement |
WORDSNET-12750 | Table Cells widths are incorrect in rendered PDF | Bug |
WORDSNET-5460 | Table inside header of RTF was not rendered in PDF | Bug |
WORDSNET-10700 | RTF to PDF conversion issue with table rendering | Bug |
WORDSNET-22733 | Extra vertical spacing added between Rows of a Table with Merged Cells | Bug |
WORDSNET-12381 | Table Cells widths are incorrect in rendered PDF | Bug |
WORDSNET-10410 | Table indentation is not preserved during rendering | Bug |
WORDSNET-8327 | WordML to Pdf conversion issue with shape rendering | Bug |
WORDSNET-10947 | Incorrect tab positioning causes incorrect text wrapping | Bug |
WORDSNET-11641 | Widths of Tables and cells are not preserved during rendering to PDF | Bug |
WORDSNET-9172 | DOCX to PDF conversion issue with table formatting | Bug |
WORDSNET-18524 | Conversion RTF to PDF inconsistent table width | Bug |
WORDSNET-11806 | DOC to PDF conversion issue with table layout | Bug |
WORDSNET-11500 | Incorrect position of wrapped text on conversion to PDF | Bug |
WORDSNET-8037 | WordML to PDF conversion issue with text rendering | Bug |
WORDSNET-5619 | Table widths are disturbed upon rendering to PDF | Bug |
WORDSNET-11123 | Table widths are not calculated correctly during rendering to PDF | Bug |
WORDSNET-12979 | RenderedDocument and lines issue within table cells | Bug |
WORDSNET-22669 | Table Content Pushed Down from its Original Position in PDF | Bug |
WORDSNET-10017 | DrawingML TextBoxes are pushed to the left beyond the left boundary in fixed page formats | Bug |
WORDSNET-12099 | Table layouts are not correct in PDF | Bug |
WORDSNET-23607 | “Unsupported file format: Unknown” on loading TXT file | Bug |
WORDSNET-23332 | Aspose.Words hangs when loading a MOBI document | Bug |
WORDSNET-22023 | Text alignments in narrow cells of PDF differs from Word after conversion | Bug |
WORDSNET-13196 | Thai font is displayed in the wrong way in PDF | Bug |
WORDSNET-19215 | OfficeMath enclosing formula is crushed when outputting PDF | Bug |
WORDSNET-16742 | Arabic text is not rendered correctly in output PDF | Bug |
WORDSNET-23643 | Chart series are lost after DOCX to PDF conversion | Bug |
WORDSNET-23642 | DOCX to PDF conversion causes layout issues in output PDF file | Bug |
WORDSNET-23644 | Bar charts' height decreases after DOCX to PDF conversion | Bug |
WORDSNET-9788 | DOC to PDF conversion issue with text (date) alignment | Bug |
WORDSNET-23661 | ReportingEngine.BuildReport throws an exception on .NET 6 when reflection optimization is on | Bug |
WORDSNET-23665 | Text in category labels is not wrapped | Bug |
WORDSNET-23668 | Extra paragraph in header on WML to DOCX conversion | Bug |
WORDSNET-23667 | Font name and size does not match MS Word on WML to DOCX conversion | Bug |
WORDSNET-22605 | Split string in LINQ Reporting not working as expected | Bug |
WORDSNET-23685 | Document.ExtractPages() causes line numbers restarting | Bug |
WORDSNET-19798 | Cells in Table gets misplaced during open/save a DOC | Bug |
WORDSNET-23698 | DOC to PDF: Text with Shadow effect not correctly converted | Bug |
WORDSNET-23699 | RTL paragraph is positioned incorrectly inside an inline table with different left and right spacings | Bug |
WORDSNET-23660 | AW does not imitate MS Word handling of an unsupported xml element | Bug |
WORDSNET-23703 | Font is changed after appending document with KeepSourceFormatting | Bug |
WORDSNET-23707 | DOC Compare System.InvalidOperationException: Custom XML part is not found. | Bug |
WORDSNET-23693 | InvalidOperationException: Sequence contains more than one matching element | Bug |
WORDSNET-23672 | Incorrect shape positions on WML to DOCX conversion | Bug |
WORDSNET-23696 | TestSaveOdt performance test fails on net5 and net6 CLR | Bug |
WORDSNET-23715 | FileCorruptedException is thrown upon loading DOCX document | Bug |
WORDSNET-23717 | SVG letter-spacing style gets ignored when converting DOCX to PDF | Bug |
WORDSNET-23718 | Document.ExtractPages changes list numbering | Bug |
WORDSNET-23725 | Wrong paragraph format when adding an image after Pdf2Word conversion | Bug |
WORDSNET-23732 | Fix StringComparison warnings | Bug |
WORDSNET-23225 | Aspose.Words hangs on document rendering | Bug |
WORDSNET-22987 | Import differs from what is in browser | Bug |
WORDSNET-23371 | Structured Document Tag gets removed | Bug |
WORDSNET-23743 | Part of content is moved into table upon reading RTF | Bug |
WORDSNET-23745 | Fix StringComparison warnings in fields/mailmerge domain | Bug |
WORDSNET-22843 | Incorrect rendering of Column3D in PDF | Bug |
WORDSNET-23730 | Fix StringComparison warnings | Bug |
WORDSNET-23394 | Document.UpdatePageLayout() throws System.InvalidOperationException : Infinite loop detected | Bug |
WORDSNET-23396 | Text wrapping does not match Word | Bug |
WORDSNET-22736 | Image position is changed after MHTML to PDF Conversion | Bug |
WORDSNET-23757 | Comments anchor is misplaced after the saving | Bug |
WORDSNET-23760 | PDF can’t be loaded because of “Sequence contains more than one matching element” error | Bug |
WORDSNET-23677 | Do not invoke ResourceLoadingCallback for empty URLs | Bug |
WORDSNET-22726 | Exception is thrown while converting from DOCX to HTML | Bug |
WORDSNET-23279 | Horizontal axis labels are wrapped improperly | Bug |
WORDSNET-23330 | Image is not visible after import from AZW3 | Bug |
WORDSNET-16037 | Field.isDirty value always false | Bug |
WORDSNET-23604 | List numbering is wrong for lists from HTML altChunk’s | Bug |
WORDSNET-23735 | Wrong list numbering due to loss and non-use of DurableId attribute values | Bug |
WORDSNET-23791 | Fix customer issues using SonarQube analysis | Bug |
WORDSNET-23370 | UpdatePageLayout throws exception | Bug |
WORDSNET-23025 | ArgumentException: Incorrect hex length | Bug |
WORDSNET-23485 | Tab is lost upon converting document to HTML | Bug |
WORDSNET-23500 | Content is shifted upon rendering document | Bug |
WORDSNET-23504 | Text is wrapped improperly upon rendering | Bug |
WORDSNET-23511 | RemoveEmptyParagraphs cleanup option does not work in case of nested IF fields | Bug |
WORDSNET-23527 | Graphics is lost on PDF import | Bug |
WORDSNET-23531 | Math equations alignment issue | Bug |
WORDSNET-23535 | Consider disabling LoadOptions.ResourceLoadingCallback invocations for data URLs | Bug |
WORDSNET-23536 | FileCorruptedException is thrown upon loading HTML document | Bug |
WORDSNET-23545 | Problem when editing PDF form field in Chrome | Bug |
WORDSNET-23540 | DOCX to PDF: Text overlapping the document layout | Bug |
WORDSNET-23563 | Content is lost upon loading PDF document | Bug |
WORDSNET-23565 | Numbers are rendered as tofu when use NumeralFormat.ArabicIndic | Bug |
WORDSNET-23578 | Inaccurate vertical alignment in equations when saving to PDF | Bug |
WORDSNET-23505 | Aspose.Words improperly selects paper source upon printing. | Bug |
WORDSNET-23588 | ArgumentException is thrown upon loading MHTML document | Bug |
WORDSNET-23596 | Text alignment in table is incorrect | Bug |
WORDSNET-14989 | Thai characters are not preserved when rendered to PDF | Bug |
WORDSNET-23733 | Fix StringComparison warnings | Bug |
WORDSNET-22725 | Table Cut off Issue when converting Html to Word | Bug |
Public API and Backward Incompatible Changes
This section lists public API changes that were introduced in Aspose.Words 22.5. It includes not only new and obsoleted public methods, but also a description of any behind-the-scenes behavior changes in Aspose.Words that may affect existing code. Any change that modifies existing behavior and could be seen as a regression is especially important and is documented here.
Added support for loading EPUB documents
Related issue: WORDSNET-8838
Aspose.Words can now load EPUB 2.0 documents.
EPUB is an e-book file format that uses the “.epub” file extension. An EPUB document is a collection of XHTML documents. Currently, Aspose.Words always loads all XHTML files from an EPUB document in the order in which they appear in the content file (OPF).
The following publicly visible enum values were added:
FileFormat.Epub
LoadFormat.Epub
WarningSource.Epub
The FileFormatUtil class can now be used to determine if a file is an EPUB document. For example, the following call
FileFormatInfo info = FileFormatUtil.DetectFileFormat("book.epub");
will return an info instance with the FileFormatInfo.LoadFormat property set to LoadFormat.Epub. Of all load options, only LoadOptions.ResourceLoadingCallback currently has an effect when working with EPUB documents. It is useful when the customer wants to control how external resources are loaded.
The use cases for loading EPUB documents are as follows:
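// Load an EPUB document with default options.
Document doc = new Document("book.epub");

// Load an EPUB document while controlling how external resources are loaded.
// CustomResourceLoadingCallback is a user-defined IResourceLoadingCallback implementation.
LoadOptions options = new LoadOptions
{
    ResourceLoadingCallback = new CustomResourceLoadingCallback()
};
Document docWithCallback = new Document("book.epub", options);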
Added support for loading XML documents
Related issue: WORDSNET-22697
Aspose.Words can now load XML documents. The Extensible Markup Language (XML) is a simple text-based format for representing structured information: documents, data, configuration, books, transactions, invoices, and much more. Aspose.Words mimics MS Word behavior when importing XML documents.
The following publicly visible enum value was added:
LoadFormat.Xml
The FileFormatUtil class can now be used to determine if a file is an XML document. For example, the following call
FileFormatInfo info = FileFormatUtil.DetectFileFormat("sample.xml");
will return an info instance with the FileFormatInfo.LoadFormat property set to LoadFormat.Xml.
The use cases for loading XML documents are as follows:
Document doc = new Document("sample.xml");
Introduced ChapterPageSeparator enum and added PageSetup.ChapterPageSeparator and PageSetup.HeadingLevelForChapter properties
Related issue: WORDSNET-10869
The ChapterPageSeparator enum is introduced:
/// <summary>
/// Defines the separator character that appears between the chapter and page number.
/// </summary>
/// <seealso cref="PageSetup"/>
/// <seealso cref="PageSetup.ChapterPageSeparator"/>
public enum ChapterPageSeparator
{
/// <summary>
/// A hyphen.
/// </summary>
Hyphen = 0,
/// <summary>
/// A period.
/// </summary>
Period = 1,
/// <summary>
/// A colon.
/// </summary>
Colon = 2,
/// <summary>
/// An em dash.
/// </summary>
EmDash = 3,
/// <summary>
/// An en dash.
/// </summary>
EnDash = 4
}
The following public properties are added to PageSetup class:
/// <summary>
/// Gets or sets the heading level style that is applied to the chapter titles in the document.
/// </summary>
/// <remarks>
/// <p>Can be a number from 0 through 9. 0 means that no chapter number is applied to the page number.</p>
/// <p>Before you can create page numbers that include chapter numbers, the document headings must have a numbered outline format applied.</p>
/// </remarks>
public int HeadingLevelForChapter { get; set; }
/// <summary>
/// Gets or sets the separator character that appears between the chapter number and the page number.
/// </summary>
/// <remarks>
/// <p>Before you can create page numbers that include chapter numbers, the document headings must have a numbered outline format applied.</p>
/// </remarks>
public ChapterPageSeparator ChapterPageSeparator { get; set; }
Use Case:
Document doc = new Document(fileName);
PageSetup pageSetup = doc.FirstSection.PageSetup;
pageSetup.PageNumberStyle = NumberStyle.UppercaseRoman;
pageSetup.ChapterPageSeparator = ChapterPageSeparator.Colon;
pageSetup.HeadingLevelForChapter = 1;
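To make the combined effect of these properties easier to picture, here is a minimal Java sketch of the page number text they are expected to produce: the chapter number, then the separator, then the page number in the selected style. It does not use the Aspose.Words API. The names SeparatorSketch, ChapterPageNumberSketch, toUpperRoman, and format are hypothetical, and the actual rendering is performed by Aspose.Words itself.

```java
/** Hypothetical stand-in for the ChapterPageSeparator enum, for illustration only. */
enum SeparatorSketch {
    HYPHEN("-"), PERIOD("."), COLON(":"), EM_DASH("\u2014"), EN_DASH("\u2013");

    final String text;

    SeparatorSketch(String text) {
        this.text = text;
    }
}

/** Hypothetical helper showing the shape of a chapter-prefixed page number. */
final class ChapterPageNumberSketch {
    /** Converts 1..3999 to an uppercase Roman numeral (the UppercaseRoman style). */
    static String toUpperRoman(int n) {
        if (n < 1 || n > 3999) {
            throw new IllegalArgumentException("Value out of range: " + n);
        }
        int[] values = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
        String[] symbols = {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            while (n >= values[i]) {
                sb.append(symbols[i]);
                n -= values[i];
            }
        }
        return sb.toString();
    }

    /** Joins the chapter number, the separator, and the Roman page number. */
    static String format(int chapter, SeparatorSketch separator, int page) {
        return chapter + separator.text + toUpperRoman(page);
    }

    public static void main(String[] args) {
        // Chapter 2, page 4, colon separator.
        System.out.println(format(2, SeparatorSketch.COLON, 4)); // prints 2:IV
    }
}
```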
LoadOptions.ResourceLoadingCallback is no longer invoked for data URLs
Related issue: WORDSNET-23535
LoadOptions.ResourceLoadingCallback is no longer invoked for resources that are embedded as data URLs (for example, data:image/gif;base64,R0lGODlhEAAQAMQAAORH…). The reason is that these URLs do not reference external resources and are decoded in place.
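The following Java sketch illustrates why no callback is needed for such URLs: the payload is carried inside the URL and can be decoded without touching the network or the file system. It is not Aspose.Words code. DataUrlSketch, isDataUrl, and decodeBase64DataUrl are hypothetical names, and only base64 payloads are handled.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Locale;

/** Hypothetical helper; not part of the Aspose.Words API. */
final class DataUrlSketch {
    /** Returns true when the URL embeds its payload rather than referencing an external resource. */
    static boolean isDataUrl(String url) {
        return url != null && url.regionMatches(true, 0, "data:", 0, 5);
    }

    /** Decodes a base64 data URL in place; no external resource is requested. */
    static byte[] decodeBase64DataUrl(String url) {
        if (!isDataUrl(url)) {
            throw new IllegalArgumentException("Not a data URL");
        }
        int comma = url.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("Data URL has no payload separator");
        }
        // The header sits between "data:" and the comma, e.g. "image/gif;base64".
        String header = url.substring(5, comma).toLowerCase(Locale.ROOT);
        if (!header.endsWith(";base64")) {
            throw new IllegalArgumentException("Only base64 payloads are handled in this sketch");
        }
        return Base64.getDecoder().decode(url.substring(comma + 1));
    }

    public static void main(String[] args) {
        String url = "data:text/plain;base64,SGVsbG8=";
        byte[] payload = decodeBase64DataUrl(url);
        System.out.println(new String(payload, StandardCharsets.UTF_8)); // prints Hello
    }
}
```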
LoadOptions.ResourceLoadingCallback is no longer invoked for empty URLs
Related issue: WORDSNET-23677
LoadOptions.ResourceLoadingCallback is no longer invoked for empty URLs, because empty URLs don’t reference any external resource.
Slight changes in markup nodes typed collection
Related issue: WORDSNET-23774
The default indexer of the markup nodes collection has been changed. It now takes the index number of a structured document tag in the collection.
/// <summary>
/// Returns the structured document tag at the specified index.
/// </summary>
/// <param name="index">An index into the collection.</param>
public IStructuredDocumentTag this[int index] { get; }
Along with this, it has become possible to remove a structured document tag at the specified index number, as well as remove a structured document tag by its identifier.
/// <summary>
/// Removes the structured document tag with the specified identifier.
/// </summary>
/// <param name="id">The structured document tag identifier.</param>
public void Remove(int id)
/// <summary>
/// Removes a structured document tag at the specified index.
/// </summary>
/// <param name="index">An index into the collection.</param>
public void RemoveAt(int index)
Lookup by identifier, which the indexer previously performed, is now available through the GetById() method.
/// <summary>
/// Returns the structured document tag by identifier.
/// </summary>
/// <remarks>
/// <p>Returns null if the structured document tag with the specified identifier cannot be found.</p>
/// </remarks>
/// <param name="id">The structured document tag identifier.</param>
public IStructuredDocumentTag GetById(int id)
Use Case:
StructuredDocumentTagCollection structuredDocumentTags = doc.Range.StructuredDocumentTags;
// We iterate through all collection elements, getting each element by its index number.
for (int i = 0; i < structuredDocumentTags.Count; i++)
{
IStructuredDocumentTag sdt = structuredDocumentTags[i];
Console.WriteLine(sdt.Title);
}
// Get the structured document tag by its Id.
IStructuredDocumentTag tagById = structuredDocumentTags.GetById(1160505028);
if (tagById != null)
    Console.WriteLine(tagById.Title);
// Remove the structured document tag by its Id.
structuredDocumentTags.Remove(1160505028);
// Remove the structured document tag at position 0.
structuredDocumentTags.RemoveAt(0);
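The difference between positional access and lookup by identifier can be sketched in plain Java as follows. This is a hypothetical model rather than the Aspose.Words collection: TagSketch and TagCollectionSketch are invented stand-ins that only mirror the semantics described above (index-based get, getById returning null when nothing matches, remove by identifier, removeAt by position).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a structured document tag. */
final class TagSketch {
    final int id;
    final String title;

    TagSketch(int id, String title) {
        this.id = id;
        this.title = title;
    }
}

/** Hypothetical collection mirroring the indexer, GetById, Remove and RemoveAt semantics. */
final class TagCollectionSketch {
    private final List<TagSketch> tags = new ArrayList<>();

    void add(TagSketch tag) {
        tags.add(tag);
    }

    int count() {
        return tags.size();
    }

    /** Positional access, like the new default indexer. */
    TagSketch get(int index) {
        return tags.get(index);
    }

    /** Lookup by identifier; returns null when no tag has the given id. */
    TagSketch getById(int id) {
        for (TagSketch tag : tags) {
            if (tag.id == id) {
                return tag;
            }
        }
        return null;
    }

    /** Removes the tag with the given identifier, if present. */
    void remove(int id) {
        tags.removeIf(tag -> tag.id == id);
    }

    /** Removes the tag at the given position. */
    void removeAt(int index) {
        tags.remove(index);
    }

    public static void main(String[] args) {
        TagCollectionSketch collection = new TagCollectionSketch();
        collection.add(new TagSketch(1160505028, "Customer"));
        collection.add(new TagSketch(7, "Date"));

        System.out.println(collection.get(1).title);              // positional access
        System.out.println(collection.getById(1160505028).title); // lookup by identifier
        collection.remove(1160505028);                            // remove by identifier
        collection.removeAt(0);                                   // remove by position
        System.out.println(collection.count());
    }
}
```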
Added “Number10Envelope” value to PaperSize enum
Related issue: WORDSNET-23505
Added support for the “Envelope No. 10” page size for printing.
/// <summary>
/// Specifies paper size.
/// </summary>
public enum PaperSize
{
/// <summary>
/// 4.125 x 9.5 inches.
/// </summary>
Number10Envelope
}
Use Case:
// This value is used to set the page size as follows:
Document doc = new Document(fileName);
doc.FirstSection.PageSetup.PaperSize = PaperSize.Number10Envelope;
// Or in a similar way using DocumentBuilder:
DocumentBuilder builder = new DocumentBuilder(doc);
builder.PageSetup.PaperSize = PaperSize.Number10Envelope;
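Page dimensions in Aspose.Words are measured in points (1/72 of an inch), so the 4.125 x 9.5 inch envelope corresponds to 297 x 684 points. A small Java check of that arithmetic, with the hypothetical helper names EnvelopeSizeSketch and inchesToPoints:

```java
/** Hypothetical helper; converts the envelope dimensions from inches to points. */
final class EnvelopeSizeSketch {
    static final double POINTS_PER_INCH = 72.0;

    static double inchesToPoints(double inches) {
        return inches * POINTS_PER_INCH;
    }

    public static void main(String[] args) {
        double width = inchesToPoints(4.125);
        double height = inchesToPoints(9.5);
        System.out.println(width + " x " + height); // prints 297.0 x 684.0
    }
}
```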
HtmlSaveOptions.ExportTextBoxAsSvg was marked as obsolete
Related issue: WORDSNET-23514
The HtmlSaveOptions.ExportTextBoxAsSvg property is now obsolete. Customers should use the HtmlSaveOptions.ExportShapesAsSvg property instead, which affects text boxes as well.