python

2024-07-13 23:41| 来源: 网络整理| 查看: 265

由于工作需要提取一个word文档中的表格，及其所在的章节，普通的Document.paragraphs 和Document.tables无法满足需求。所以综合GitHub作者的代码及我自己的需求代码如下：

from docx.document import Document from docx.oxml.table import CT_Tbl from docx.oxml.text.paragraph import CT_P from docx.table import _Cell, Table from docx.text.paragraph import Paragraph import docx import openpyxl import xlsxwriter def iter_block_items(parent): """ Yield each paragraph and table child within *parent*, in document order. Each returned value is an instance of either Table or Paragraph. *parent* would most commonly be a reference to a main Document object, but also works for a _Cell object, which itself can contain paragraphs and tables. """ if isinstance(parent, Document): parent_elm = parent.element.body elif isinstance(parent, _Cell): parent_elm = parent._tc else: raise ValueError("something's not righ

【本文地址】

python

python

今日新闻

推荐新闻