Working with Columns and Rows

Find the Index of Table Elements

Finding the index of any node involves gathering all child nodes of the element’s type from the parent node, then using the NodeCollection.IndexOf method to find the index of the desired node in the collection.

Find the Index of Table in a Document

The code example given below demonstrates how to retrieve the index of a table in the document.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
// Get the first table in the document.
Table table = (Table)doc.GetChild(NodeType.Table, 0, true);
NodeCollection allTables = doc.GetChildNodes(NodeType.Table, true);
int tableIndex = allTables.IndexOf(table);

Find the Index of a Row in a Table

The code example given below demonstrates how to retrieve the index of a row in a table.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
int rowIndex = table.IndexOf((Row)table.LastRow);

Find the Index of a Cell in a Row

The code example given below demonstrates how to retrieve the index of a cell in a row.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
int cellIndex = row.IndexOf(row.Cells[4]);

Work with Columns

In both Word documents and in the Aspose.Words Document Object Model, there is no concept of a column. By design, table rows in Microsoft Word are completely independent and the base properties and operations are only contained on rows and cells of the table. This gives tables the possibility of some interesting attributes:

  • Each row in a table can have a completely different number of cells.
  • Vertically, the cells of each row can have different widths.
  • It is possible to join tables with differing row formats and cell counts.

Any operations that are performed on columns in Microsoft Word are in fact “short-cut methods” which perform the operation by modifying the cells of the rows collectively, in such a way that it appears they are being applied to columns. This structure of rows and cells is the same way that tables are represented in Aspose.Words.

In the Aspose.Words Document Object Model a Table node is made up of Row and then Cell nodes. There is also no native support for columns.

You can still achieve such operations on columns by iterating through the same cell index of the rows of a table. The code below makes such operations easier by providing a facade class which collects the cells which make up a “column” of a table. The example below demonstrates a facade object for working with a column of a table.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
using System;
using System.Collections;
using System.Text;
using Aspose.Words;
using Aspose.Words.Tables;
/// <summary>
/// Represents a facade object for a column of a table in a Microsoft Word document.
/// </summary>
internal class Column
{
    private Column(Table table, int columnIndex)
    {
        if (table == null)
            throw new ArgumentNullException("table");
        mTable = table;
        mColumnIndex = columnIndex;
    }
    /// <summary>
    /// Returns a new column facade from the table and supplied zero-based index.
    /// </summary>
    public static Column FromIndex(Table table, int columnIndex)
    {
        return new Column(table, columnIndex);
    }
    /// <summary>
    /// Returns the cells which make up the column.
    /// </summary>
    public Cell[] Cells
    {
        get
        {
            return (Cell[])GetColumnCells().ToArray(typeof(Cell));
        }
    }
    /// <summary>
    /// Returns the index of the given cell in the column.
    /// </summary>
    public int IndexOf(Cell cell)
    {
        return GetColumnCells().IndexOf(cell);
    }
    /// <summary>
    /// Inserts a brand new column before this column into the table.
    /// </summary>
    public Column InsertColumnBefore()
    {
        Cell[] columnCells = Cells;
        if (columnCells.Length == 0)
            throw new ArgumentException("Column must not be empty");
        // Create a clone of this column.
        foreach (Cell cell in columnCells)
            cell.ParentRow.InsertBefore(cell.Clone(false), cell);
        // This is the new column.
        Column column = new Column(columnCells[0].ParentRow.ParentTable, mColumnIndex);
        // We want to make sure that the cells are all valid to work with (have at least one paragraph).
        foreach (Cell cell in column.Cells)
            cell.EnsureMinimum();
        // Increase the index which this column represents since there is now one extra column in front.
        mColumnIndex++;
        return column;
    }
    /// <summary>
    /// Removes the column from the table.
    /// </summary>
    public void Remove()
    {
        foreach (Cell cell in Cells)
            cell.Remove();
    }
    /// <summary>
    /// Returns the text of the column.
    /// </summary>
    public string ToTxt()
    {
        StringBuilder builder = new StringBuilder();
        foreach (Cell cell in Cells)
            builder.Append(cell.ToString(SaveFormat.Text));
        return builder.ToString();
    }
    /// <summary>
    /// Provides an up-to-date collection of cells which make up the column represented by this facade.
    /// </summary>
    private ArrayList GetColumnCells()
    {
        ArrayList columnCells = new ArrayList();
        foreach (Row row in mTable.Rows)
        {
            // Rows with fewer cells than the column index yield null and are skipped.
            Cell cell = row.Cells[mColumnIndex];
            if (cell != null)
                columnCells.Add(cell);
        }
        return columnCells;
    }
    private int mColumnIndex;
    private Table mTable;
}

The code example given below demonstrates how to insert a blank column into a table.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
// Get the first table in the document.
Table table = (Table)doc.GetChild(NodeType.Table, 0, true);
// Get the first column in the table (the index is zero-based).
Column column = Column.FromIndex(table, 0);
// Print the plain text of the column to the screen.
Console.WriteLine(column.ToTxt());
// Create a new column to the left of this column.
// This is the same as using the "Insert Column Before" command in Microsoft Word.
Column newColumn = column.InsertColumnBefore();
// Add some text to each of the column cells.
foreach (Cell cell in newColumn.Cells)
    cell.FirstParagraph.AppendChild(new Run(doc, "Column Text " + newColumn.IndexOf(cell)));

The example below shows how to remove a column from a table in a document.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
// Get the second table in the document.
Table table = (Table)doc.GetChild(NodeType.Table, 1, true);
// Get the third column from the table and remove it.
Column column = Column.FromIndex(table, 2);
column.Remove();

Specify Rows to Repeat on Subsequent Pages as Header Rows

In Microsoft Word, this option is found under Table Properties as “Repeat row as a header on subsequent pages”. Using this option you can choose to repeat only a single row or many rows in a table.

In the case of a single header row, it must be the first row in the table. In addition, when multiple header rows are used, these rows must be consecutive and must all be on one page. In Aspose.Words you can apply this setting by using the RowFormat.HeadingFormat property.

The code example given below demonstrates how to build a table which includes heading rows that repeat on subsequent pages.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_WorkingWithTables();
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
Table table = builder.StartTable();
builder.RowFormat.HeadingFormat = true;
builder.ParagraphFormat.Alignment = ParagraphAlignment.Center;
builder.CellFormat.Width = 100;
builder.InsertCell();
builder.Writeln("Heading row 1");
builder.EndRow();
builder.InsertCell();
builder.Writeln("Heading row 2");
builder.EndRow();
builder.CellFormat.Width = 50;
builder.ParagraphFormat.ClearFormatting();
// Insert some content so the table is long enough to continue onto the next page.
for (int i = 0; i < 50; i++)
{
    builder.InsertCell();
    builder.RowFormat.HeadingFormat = false;
    builder.Write("Column 1 Text");
    builder.InsertCell();
    builder.Write("Column 2 Text");
    builder.EndRow();
}
dataDir = dataDir + "Table.HeadingRow_out.doc";
// Save the document to disk.
doc.Save(dataDir);

How to Apply Different AutoFit Settings to a Table

When creating a table using a visual agent such as Microsoft Word, you will often find yourself using one of the AutoFit options to automatically size the table to the desired width. For instance, you can use the AutoFit to Window option to fit the table to the width of the page and AutoFit to Contents option to allow each cell to grow or shrink to accommodate its contents.

By default, Aspose.Words inserts a new table using “AutoFit to Window”. The table will size to the available width on the page. To change the sizing behavior of such a table or an existing table, you can call the Table.AutoFit method. This method accepts an AutoFitBehavior enumeration value which defines what type of auto fitting is applied to the table.

As in Microsoft Word, the autofit method is actually a shortcut which applies different properties to the table all at once. These properties are actually what give the table the observed behavior. We will discuss these properties for each autofit option.

AutoFitting a Table to Window

The code example given below demonstrates how to autofit a table to the page width. You can download the template file of this example from here.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_WorkingWithTables();
string fileName = "TestFile.doc";
// Open the document
Document doc = new Document(dataDir + fileName);
Table table = (Table)doc.GetChild(NodeType.Table, 0, true);
// Autofit the first table to the page width.
table.AutoFit(AutoFitBehavior.AutoFitToWindow);
dataDir = dataDir + RunExamples.GetOutputFilePath(fileName);
// Save the document to disk.
doc.Save(dataDir);
Debug.Assert(doc.FirstSection.Body.Tables[0].PreferredWidth.Type == PreferredWidthType.Percent, "PreferredWidth type is not percent");
Debug.Assert(doc.FirstSection.Body.Tables[0].PreferredWidth.Value == 100, "PreferredWidth value is different than 100");

When autofit to the window is applied to a table the following operations are actually being performed behind the scenes:

  1. The Table.AllowAutoFit property is enabled to automatically resize columns to the available content.
  2. A Table.PreferredWidth value of 100% is applied.
  3. The CellFormat.PreferredWidth is removed from all cells in the table. Note that this is slightly different from how Microsoft Word performs this step. In Microsoft Word, the preferred width of each cell is set to a suitable value based on its current size and content. Aspose.Words does not update the preferred widths, so they are simply cleared instead.
  4. The column widths are recalculated for the current content of the table. The end result is a table that occupies all available width. The widths of the columns in the table change automatically as the user edits the text in MS Word.
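
As a rough illustration of the first three steps above, the same properties can also be set by hand. This is only a minimal sketch and not how Table.AutoFit is implemented internally. The class and method names (AutoFitSketch, ApplyWindowProperties) are hypothetical, and the column width recalculation of step 4 is not reproduced here.

```csharp
using Aspose.Words.Tables;

// Hypothetical helper class, introduced only for illustration.
internal static class AutoFitSketch
{
    internal static void ApplyWindowProperties(Table table)
    {
        // Step 1: allow the columns to resize automatically.
        table.AllowAutoFit = true;
        // Step 2: make the table occupy 100% of the available width.
        table.PreferredWidth = PreferredWidth.FromPercent(100);
        // Step 3: clear the preferred width of every cell in the table.
        foreach (Row row in table.Rows)
            foreach (Cell cell in row.Cells)
                cell.CellFormat.PreferredWidth = PreferredWidth.Auto;
    }
}
```

In practice, calling table.AutoFit(AutoFitBehavior.AutoFitToWindow) remains the simpler choice, since it also recalculates the column widths.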

AutoFitting a Table to Contents

The code example given below demonstrates how to autofit a table in the document to its contents. You can download the template file of this example from here.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_WorkingWithTables();
string fileName = "TestFile.doc";
Document doc = new Document(dataDir + fileName);
Table table = (Table)doc.GetChild(NodeType.Table, 0, true);
// Auto fit the table to the cell contents
table.AutoFit(AutoFitBehavior.AutoFitToContents);
dataDir = dataDir + RunExamples.GetOutputFilePath(fileName);
// Save the document to disk.
doc.Save(dataDir);
Debug.Assert(doc.FirstSection.Body.Tables[0].PreferredWidth.Type == PreferredWidthType.Auto, "PreferredWidth type is not auto");
Debug.Assert(doc.FirstSection.Body.Tables[0].FirstRow.FirstCell.CellFormat.PreferredWidth.Type == PreferredWidthType.Auto, "PreferredWidth on cell is not auto");
Debug.Assert(doc.FirstSection.Body.Tables[0].FirstRow.FirstCell.CellFormat.PreferredWidth.Value == 0, "PreferredWidth value is not 0");

Disable AutoFitting on a Table and Use Fixed Column Widths

The code example given below demonstrates how to disable autofitting and enable fixed widths for the specified table. You can download the template file of this example from here.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_WorkingWithTables();
string fileName = "TestFile.doc";
Document doc = new Document(dataDir + fileName);
Table table = (Table)doc.GetChild(NodeType.Table, 0, true);
// Disable autofitting on this table.
table.AutoFit(AutoFitBehavior.FixedColumnWidths);
dataDir = dataDir + RunExamples.GetOutputFilePath(fileName);
// Save the document to disk.
doc.Save(dataDir);
Debug.Assert(doc.FirstSection.Body.Tables[0].PreferredWidth.Type == PreferredWidthType.Auto, "PreferredWidth type is not auto");
Debug.Assert(doc.FirstSection.Body.Tables[0].PreferredWidth.Value == 0, "PreferredWidth value is not 0");
Debug.Assert(doc.FirstSection.Body.Tables[0].FirstRow.FirstCell.CellFormat.Width == 69.2, "Cell width is not correct.");

When a table has autofit disabled and fixed column widths are used instead, the following steps are taken:

  1. The Table.AllowAutoFit property is disabled so columns do not grow or shrink to their contents.
  2. The table-wide preferred width is removed from Table.PreferredWidth.
  3. The CellFormat.PreferredWidth is removed from all cells in the table. The end result is a table whose column widths are defined using the CellFormat.Width property and whose columns do not automatically resize when the user enters text or the page size is modified.
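
The steps above, followed by an explicit CellFormat.Width on each cell, can be sketched as follows. This is an illustration under stated assumptions rather than the library's internal implementation: the FixedWidthSketch class, the ApplyFixedWidths method and the widthInPoints parameter are hypothetical names, and giving every cell the same width is only one possible choice.

```csharp
using Aspose.Words.Tables;

// Hypothetical helper class, introduced only for illustration.
internal static class FixedWidthSketch
{
    internal static void ApplyFixedWidths(Table table, double widthInPoints)
    {
        // Step 1: stop the columns from growing or shrinking to their contents.
        table.AllowAutoFit = false;
        // Step 2: remove the table-wide preferred width.
        table.PreferredWidth = PreferredWidth.Auto;
        foreach (Row row in table.Rows)
        {
            foreach (Cell cell in row.Cells)
            {
                // Step 3: remove the preferred width of the cell.
                cell.CellFormat.PreferredWidth = PreferredWidth.Auto;
                // The column width is now defined by CellFormat.Width alone.
                cell.CellFormat.Width = widthInPoints;
            }
        }
    }
}
```

For example, ApplyFixedWidths(table, 72) would give every cell a width of 72 points, which is one inch.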

Keep Tables and Rows from Breaking Across Pages

There are times where the contents of a table should not be split across a page. For instance, when there is a title above a table, the title and the table should always be kept together on the same page to preserve proper appearance.

There are two separate techniques that are useful to achieve this functionality:

  • Allow Row to Break across Pages which is applied to the rows of a table.
  • Keep with Next which is applied to paragraphs in table cells.

We will use the table below in our example. By default, it has the properties above disabled. Also, notice how the content in the middle row is split across the page.

Keep a Row from Breaking Across Pages

This involves restricting content inside the cells of a row from being split across a page. In Microsoft Word, this can be found under Table Properties as the option “Allow row to break across pages”. In Aspose.Words this is found under the RowFormat object of a Row as the property RowFormat.AllowBreakAcrossPages. The code example given below demonstrates how to disable rows breaking across pages for every row in a table. You can download the template file of this example from here.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
Document doc = new Document(dataDir + "Table.TableAcrossPage.doc");
// Retrieve the first table in the document.
Table table = (Table)doc.GetChild(NodeType.Table, 0, true);
// Disable breaking across pages for all rows in the table.
foreach (Row row in table.Rows)
    row.RowFormat.AllowBreakAcrossPages = false;
dataDir = dataDir + "Table.DisableBreakAcrossPages_out.doc";
doc.Save(dataDir);

Keep a Table from Breaking Across Pages

To stop a table from splitting across the page we need to state that we wish the content contained within the table to stay together. In Microsoft Word, this involves selecting the table and enabling “Keep with Next” under Paragraph Format.

In Aspose.Words the technique is the same. Each paragraph inside the cells of the table should have ParagraphFormat.KeepWithNext set to true. The exception is the last paragraph in the table which should be set to false. The code example given below demonstrates how to set a table to stay together on the same page. You can download the template file of this example from here.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
Document doc = new Document(dataDir + "Table.TableAcrossPage.doc");
// Retrieve the first table in the document.
Table table = (Table)doc.GetChild(NodeType.Table, 0, true);
// To keep a table from breaking across a page we need to enable KeepWithNext
// for every paragraph in the table except for the last paragraphs in the last
// row of the table.
foreach (Cell cell in table.GetChildNodes(NodeType.Cell, true))
{
    // Call this method if the table's cell is created on the fly.
    // A newly created cell does not have a paragraph inside.
    cell.EnsureMinimum();
    foreach (Paragraph para in cell.Paragraphs)
        if (!(cell.ParentRow.IsLastRow && para.IsEndOfCell))
            para.ParagraphFormat.KeepWithNext = true;
}
dataDir = dataDir + "Table.KeepTableTogether_out.doc";
doc.Save(dataDir);

Work with Merged Cells

In a table, several cells can be merged together into a single cell. This is useful when certain rows require a title or large blocks of text which span across the width of the table. This can only be achieved by merging some of the cells in the table into a single cell. Aspose.Words supports merged cells when working with all input formats including when importing HTML content.

Merged Cells in Aspose.Words

In Aspose.Words, merged cells are represented by CellFormat.HorizontalMerge and CellFormat.VerticalMerge. The CellFormat.HorizontalMerge property describes if the cell is part of a horizontal merge of cells. Likewise the CellFormat.VerticalMerge property describes if the cell is a part of a vertical merge of cells.

The values of these properties are what define the merge behavior of cells.

  • The first cell in a sequence of merged cells will have CellMerge.First.
  • Any subsequently merged cells will have CellMerge.Previous.
  • A cell which is not merged will have CellMerge.None.

Sometimes when you load an existing document, cells in a table will appear merged. However, these can in fact be one long cell. Microsoft Word is known to export merged cells in this way at times. This can cause confusion when attempting to work with individual cells. There appears to be no particular pattern as to when this happens.

Check if a Cell is Merged

To check if a cell is part of a sequence of merged cells, we simply check the CellFormat.HorizontalMerge and CellFormat.VerticalMerge properties. The example below prints the horizontal and vertical merge type of each cell. You can download the template file of this example from here.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
Document doc = new Document(dataDir + "Table.MergedCells.doc");
// Retrieve the first table in the document.
Table table = (Table)doc.GetChild(NodeType.Table, 0, true);
foreach (Row row in table.Rows)
{
    foreach (Cell cell in row.Cells)
    {
        Console.WriteLine(PrintCellMergeType(cell));
    }
}
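
The example above calls a PrintCellMergeType helper that is not shown in this section. One possible shape for it is sketched below, assuming it only needs to describe the two merge properties of a cell. The containing class name (MergeTypeSketch) and the exact message wording are assumptions, and the example above would need to live in the same class to call the method unqualified.

```csharp
using Aspose.Words.Tables;

// Hypothetical container class, introduced only for illustration.
internal static class MergeTypeSketch
{
    public static string PrintCellMergeType(Cell cell)
    {
        // A cell takes part in a merge when the property is First or Previous.
        bool isHorizontallyMerged = cell.CellFormat.HorizontalMerge != CellMerge.None;
        bool isVerticallyMerged = cell.CellFormat.VerticalMerge != CellMerge.None;
        // One-based row and cell position, used only to label the output.
        string cellLocation = string.Format("R{0}, C{1}",
            cell.ParentRow.ParentTable.IndexOf(cell.ParentRow) + 1,
            cell.ParentRow.IndexOf(cell) + 1);
        if (isHorizontallyMerged && isVerticallyMerged)
            return string.Format("The cell at {0} is both horizontally and vertically merged", cellLocation);
        if (isHorizontallyMerged)
            return string.Format("The cell at {0} is horizontally merged", cellLocation);
        if (isVerticallyMerged)
            return string.Format("The cell at {0} is vertically merged", cellLocation);
        return string.Format("The cell at {0} is not merged", cellLocation);
    }
}
```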

Merged Cells in a Table

The same technique is used to set the merge behavior on the cells in a table. When building a table with merged cells with DocumentBuilder you need to set the appropriate merge type for each cell. Also, you must remember to clear the merge setting or otherwise all cells in the table will become merged. This can be done by setting the value of the appropriate merge property to CellMerge.None. The code example given below demonstrates how to create a table with two rows with cells in the first row horizontally merged.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
builder.InsertCell();
builder.CellFormat.HorizontalMerge = CellMerge.First;
builder.Write("Text in merged cells.");
builder.InsertCell();
// This cell is merged to the previous and should be empty.
builder.CellFormat.HorizontalMerge = CellMerge.Previous;
builder.EndRow();
builder.InsertCell();
builder.CellFormat.HorizontalMerge = CellMerge.None;
builder.Write("Text in one cell.");
builder.InsertCell();
builder.Write("Text in another cell.");
builder.EndRow();
builder.EndTable();
dataDir = dataDir + "Table.HorizontalMerge_out.doc";
// Save the document to disk.
doc.Save(dataDir);

The code example given below demonstrates how to create a table with two columns with cells merged vertically in the first column.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
builder.InsertCell();
builder.CellFormat.VerticalMerge = CellMerge.First;
builder.Write("Text in merged cells.");
builder.InsertCell();
builder.CellFormat.VerticalMerge = CellMerge.None;
builder.Write("Text in one cell");
builder.EndRow();
builder.InsertCell();
// This cell is vertically merged to the cell above and should be empty.
builder.CellFormat.VerticalMerge = CellMerge.Previous;
builder.InsertCell();
builder.CellFormat.VerticalMerge = CellMerge.None;
builder.Write("Text in another cell");
builder.EndRow();
builder.EndTable();
dataDir = dataDir + "Table.VerticalMerge_out.doc";
// Save the document to disk.
doc.Save(dataDir);

In other situations where a builder is not used, such as in an existing table, merging cells in this way may not be as simple. Instead, we can wrap the base operations which are involved in applying merge properties to cells into a method which makes the task much easier. This method is similar to the automation Merge method which is called to merge a range of cells in a table. The code below will merge the range of cells in the table starting from the given cell to the end cell. This range can span many rows or columns. The following method merges all cells of a table in the specified range of cells.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
// Point and Rectangle come from the System.Drawing namespace.
internal static void MergeCells(Cell startCell, Cell endCell)
{
    Table parentTable = startCell.ParentRow.ParentTable;
    // Find the row and cell indices for the start and end cell.
    Point startCellPos = new Point(startCell.ParentRow.IndexOf(startCell), parentTable.IndexOf(startCell.ParentRow));
    Point endCellPos = new Point(endCell.ParentRow.IndexOf(endCell), parentTable.IndexOf(endCell.ParentRow));
    // Create the range of cells to be merged based off these indices. Inverse each index if the end cell is before the start cell.
    Rectangle mergeRange = new Rectangle(
        System.Math.Min(startCellPos.X, endCellPos.X),
        System.Math.Min(startCellPos.Y, endCellPos.Y),
        System.Math.Abs(endCellPos.X - startCellPos.X) + 1,
        System.Math.Abs(endCellPos.Y - startCellPos.Y) + 1);
    foreach (Row row in parentTable.Rows)
    {
        foreach (Cell cell in row.Cells)
        {
            Point currentPos = new Point(row.IndexOf(cell), parentTable.IndexOf(row));
            // Check if the current cell is inside our merge range then merge it.
            if (mergeRange.Contains(currentPos))
            {
                if (currentPos.X == mergeRange.X)
                    cell.CellFormat.HorizontalMerge = CellMerge.First;
                else
                    cell.CellFormat.HorizontalMerge = CellMerge.Previous;
                if (currentPos.Y == mergeRange.Y)
                    cell.CellFormat.VerticalMerge = CellMerge.First;
                else
                    cell.CellFormat.VerticalMerge = CellMerge.Previous;
            }
        }
    }
}

The code example given below demonstrates how to merge the range of cells between the two specified cells. You can download the template file of this example from here.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
// Open the document
Document doc = new Document(dataDir + "Table.Document.doc");
// Retrieve the first table in the body of the first section.
Table table = doc.FirstSection.Body.Tables[0];
// We want to merge the range of cells found in between these two cells.
Cell cellStartRange = table.Rows[2].Cells[2];
Cell cellEndRange = table.Rows[3].Cells[3];
// Merge all the cells between the two specified cells into one.
MergeCells(cellStartRange, cellEndRange);
dataDir = dataDir + "Table.MergeCellRange_out.doc";
// Save the document.
doc.Save(dataDir);

Depending on the version of the .NET Framework you are using, you may want to further build on this method by turning it into an extension method. In this case, you can then call this method directly on a cell to merge a range of cells, e.g. cell1.Merge(cell2).
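
A minimal sketch of such an extension method is shown below. It assumes a compiler that supports extension methods (C# 3.0 or later) and that the MergeCells method shown earlier is declared in a class named TableHelpers. Both TableHelpers and CellExtensions are hypothetical names, so adjust them to wherever MergeCells lives in your project.

```csharp
using Aspose.Words.Tables;

// Hypothetical extension class, introduced only for illustration.
internal static class CellExtensions
{
    // Merges the range of cells from startCell to endCell, inclusive.
    public static void Merge(this Cell startCell, Cell endCell)
    {
        // TableHelpers is an assumed home for the MergeCells method shown earlier.
        TableHelpers.MergeCells(startCell, endCell);
    }
}
```

With this in place, the earlier example could call cellStartRange.Merge(cellEndRange) instead of MergeCells(cellStartRange, cellEndRange).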

Vertical and Horizontal Merged Cells in a Table

A table in MS Word is a set of independent rows. Each row has a set of cells independent of the cells of other rows. So there is no logical “column” in an MS Word table. “The 1st column” is something like “the set of the 1st cells of each row in a table”. For example, it’s possible to have a table where the 1st row consists of two cells, 2cm and 1cm wide, and the 2nd row consists of two different cells, 1cm and 2cm wide.

A table in HTML has an essentially different structure: each row has the same number of cells and (this is important for the problem) each cell has the width of its column, which is the same for all cells of that column.

Use the following code example if CellFormat.HorizontalMerge and CellFormat.VerticalMerge return incorrect values. The example below prints the horizontal and vertical span of each cell. You can download the template file of this example from here.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
Document doc = new Document(dataDir + "Table.MergedCells.doc");
// Create visitor
SpanVisitor visitor = new SpanVisitor(doc);
// Accept visitor
doc.Accept(visitor);
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
/// <summary>
/// Helper class that contains collection of rowinfo for each row
/// </summary>
public class TableInfo
{
    public List<RowInfo> Rows
    {
        get { return mRows; }
    }
    private List<RowInfo> mRows = new List<RowInfo>();
}
/// <summary>
/// Helper class that contains collection of cellinfo for each cell
/// </summary>
public class RowInfo
{
    public List<CellInfo> Cells
    {
        get { return mCells; }
    }
    private List<CellInfo> mCells = new List<CellInfo>();
}
/// <summary>
/// Helper class that contains info about cell. currently here is only colspan and rowspan
/// </summary>
public class CellInfo
{
    public CellInfo(int colSpan, int rowSpan)
    {
        mColSpan = colSpan;
        mRowSpan = rowSpan;
    }
    public int ColSpan
    {
        get { return mColSpan; }
    }
    public int RowSpan
    {
        get { return mRowSpan; }
    }
    private int mColSpan = 0;
    private int mRowSpan = 0;
}
/// <summary>
/// Visitor that reports the colspan and rowspan of each cell by parsing the document's HTML export.
/// Requires the System.IO, System.Xml, System.Collections.Generic and Aspose.Words.Saving namespaces.
/// </summary>
public class SpanVisitor : DocumentVisitor
{
    /// <summary>
    /// Creates new SpanVisitor instance
    /// </summary>
    /// <param name="doc">Is document which we should parse</param>
    public SpanVisitor(Document doc)
    {
        // Get collection of tables from the document
        mWordTables = doc.GetChildNodes(NodeType.Table, true);
        // Convert document to HTML
        // We will parse HTML to determine rowspan and colspan of each cell
        MemoryStream htmlStream = new MemoryStream();
        HtmlSaveOptions options = new HtmlSaveOptions();
        options.ImagesFolder = Path.GetTempPath();
        doc.Save(htmlStream, options);
        // Load HTML into the XML document
        XmlDocument xmlDoc = new XmlDocument();
        htmlStream.Position = 0;
        xmlDoc.Load(htmlStream);
        // Get collection of tables in the HTML document
        XmlNodeList tables = xmlDoc.DocumentElement.SelectNodes("//table");
        foreach (XmlNode table in tables)
        {
            TableInfo tableInf = new TableInfo();
            // Get collection of rows in the table
            XmlNodeList rows = table.SelectNodes("tr");
            foreach (XmlNode row in rows)
            {
                RowInfo rowInf = new RowInfo();
                // Get collection of cells
                XmlNodeList cells = row.SelectNodes("td");
                foreach (XmlNode cell in cells)
                {
                    // Determine row span and colspan of the current cell
                    XmlAttribute colSpanAttr = cell.Attributes["colspan"];
                    XmlAttribute rowSpanAttr = cell.Attributes["rowspan"];
                    int colSpan = colSpanAttr == null ? 0 : Int32.Parse(colSpanAttr.Value);
                    int rowSpan = rowSpanAttr == null ? 0 : Int32.Parse(rowSpanAttr.Value);
                    CellInfo cellInf = new CellInfo(colSpan, rowSpan);
                    rowInf.Cells.Add(cellInf);
                }
                tableInf.Rows.Add(rowInf);
            }
            mTables.Add(tableInf);
        }
    }
    public override VisitorAction VisitCellStart(Cell cell)
    {
        // Determine index of current table
        int tabIdx = mWordTables.IndexOf(cell.ParentRow.ParentTable);
        // Determine index of current row
        int rowIdx = cell.ParentRow.ParentTable.IndexOf(cell.ParentRow);
        // And determine index of current cell
        int cellIdx = cell.ParentRow.IndexOf(cell);
        // Determine colspan and rowspan of current cell
        int colSpan = 0;
        int rowSpan = 0;
        if (tabIdx < mTables.Count &&
            rowIdx < mTables[tabIdx].Rows.Count &&
            cellIdx < mTables[tabIdx].Rows[rowIdx].Cells.Count)
        {
            colSpan = mTables[tabIdx].Rows[rowIdx].Cells[cellIdx].ColSpan;
            rowSpan = mTables[tabIdx].Rows[rowIdx].Cells[cellIdx].RowSpan;
        }
        Console.WriteLine("{0}.{1}.{2} colspan={3}\t rowspan={4}", tabIdx, rowIdx, cellIdx, colSpan, rowSpan);
        return VisitorAction.Continue;
    }
    private List<TableInfo> mTables = new List<TableInfo>();
    private NodeCollection mWordTables = null;
}

Convert to Horizontally Merged Cells

In recent versions of MS Word, cells are merged horizontally by their width, whereas the older technique used merge flags such as Cell.CellFormat.HorizontalMerge. When cells are merged by width, the merge flags are not set, so it is not possible to detect which cells are merged by inspecting them. Aspose.Words provides the ConvertToHorizontallyMergedCells method to convert cells that are horizontally merged by width into cells that are horizontally merged by flags. It simply transforms the table and adds new cells where needed.

The following code example shows the working of the above-mentioned method.

// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-.NET
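// Load a document that contains a table; a blank document has no tables to convert.
// The file name is an assumption reused from the earlier examples; substitute your own document.
Document doc = new Document(dataDir + "Table.MergedCells.doc");
Table table = doc.FirstSection.Body.Tables[0];
// Convert cells merged by width into cells merged by flags.
table.ConvertToHorizontallyMergedCells();
// Now merged cells have appropriate merge flags.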