Getting headings from a Word document

How can I get a list of all headings in a Word document using VBA?



1> VonC..:

Do you mean something like this CreateOutline procedure? (It actually copies all headings from the source Word document into a new Word document.)

(I believe the call astrHeadings = docSource.GetCrossReferenceItems(wdRefTypeHeading) is the key part of this program, and it should let you retrieve what you are asking for.)

Public Sub CreateOutline()
    Dim docOutline As Word.Document
    Dim docSource As Word.Document
    Dim rng As Word.Range

    Dim astrHeadings As Variant
    Dim strText As String
    Dim intLevel As Integer
    Dim intItem As Integer

    Set docSource = ActiveDocument
    Set docOutline = Documents.Add

    ' Content returns only the main body of the document, not the headers/footers.
    Set rng = docOutline.Content
    ' GetCrossReferenceItems(wdRefTypeHeading) returns an array with references to all headings in the document
    astrHeadings = docSource.GetCrossReferenceItems(wdRefTypeHeading)

    For intItem = LBound(astrHeadings) To UBound(astrHeadings)
        ' Get the text and the level.
        strText = Trim$(astrHeadings(intItem))
        intLevel = GetLevel(CStr(astrHeadings(intItem)))

        ' Add the text to the document.
        rng.InsertAfter strText & vbNewLine

        ' Set the style of the selected range and
        ' then collapse the range for the next entry.
        rng.Style = "Heading " & intLevel
        rng.Collapse wdCollapseEnd
    Next intItem
End Sub

Private Function GetLevel(strItem As String) As Integer
    ' Return the heading level of a header from the
    ' array returned by Word.

    ' The number of leading spaces indicates the
    ' outline level (2 spaces per level: H1 has
    ' 0 spaces, H2 has 2 spaces, H3 has 4 spaces).

    Dim strTemp As String
    Dim strOriginal As String
    Dim intDiff As Integer

    ' Get rid of all trailing spaces.
    strOriginal = RTrim$(strItem)

    ' Trim leading spaces, and then compare with
    ' the original.
    strTemp = LTrim$(strOriginal)

    ' Subtract to find the number of
    ' leading spaces in the original string.
    intDiff = Len(strOriginal) - Len(strTemp)
    ' Integer division (\) avoids the rounding that / applies when assigning to an Integer.
    GetLevel = (intDiff \ 2) + 1
End Function
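
If all you need is the list itself rather than a new outline document, the same call can be used to print the headings directly. The following is a minimal sketch, not part of the original answer: the name ListHeadings and its variables are made up for illustration, and it assumes the same "two leading spaces per level" convention that GetLevel relies on. It does not try to handle every possible return value for a document without headings.

```vba
Public Sub ListHeadings()
    ' Hypothetical helper: prints "<level><tab><heading text>" for each heading
    ' of the active document to the Immediate window.
    Dim varHeadings As Variant
    Dim strItem As String
    Dim lngItem As Long
    Dim lngLevel As Long

    varHeadings = ActiveDocument.GetCrossReferenceItems(wdRefTypeHeading)
    If Not IsArray(varHeadings) Then Exit Sub

    For lngItem = LBound(varHeadings) To UBound(varHeadings)
        ' Drop trailing spaces but keep the leading indentation for now.
        strItem = RTrim$(CStr(varHeadings(lngItem)))

        ' Assumption (same as GetLevel): two leading spaces per level below the top.
        lngLevel = (Len(strItem) - Len(LTrim$(strItem))) \ 2 + 1

        Debug.Print lngLevel & vbTab & LTrim$(strItem)
    Next lngItem
End Sub
```

Run it from Word's VBA editor with the Immediate window open (Ctrl+G) to see the output.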

Update by @kol, March 6, 2018

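Although astrHeadings is an array (IsArray returns True and TypeName returns String()), I get a type mismatch error when I try to access its elements from VBScript (v5.8.16384 on Windows 10 Pro 1709 16299.248). This must be a VBScript-specific problem, because I can access the elements if I run the same code in Word's VBA editor. I ended up iterating over the lines of the TOC instead, since that works even from VBScript: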

For Each Paragraph In Doc.TablesOfContents(1).Range.Paragraphs
  WScript.Echo Paragraph.Range.Text
Next



2> JonnyGold..:

The simplest way to get a list of headings is to loop through the paragraphs in the document, for example:

Sub ReadPara()
    Dim DocPara As Paragraph

    For Each DocPara In ActiveDocument.Paragraphs
        ' Match any style whose name starts with "Heading" (Heading 1, Heading 2, ...).
        If Left(DocPara.Range.Style, Len("Heading")) = "Heading" Then
            Debug.Print DocPara.Range.Text
        End If
    Next
End Sub

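By the way, I find it a good idea to strip the last character of the paragraph range. Otherwise, if you send the string to a message box or a document, Word displays an extra control character. For example: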

Left(DocPara.Range.Text, Len(DocPara.Range.Text) - 1)
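
The two ideas above (looping over paragraphs and stripping the trailing paragraph mark) can be combined into one routine. The sketch below is an illustration only, and it swaps in a different test from the one in the answer: instead of matching the style name against "Heading", it checks the paragraph's OutlineLevel property, on the assumption that this is preferable when style names are localized. Note that this also picks up any non-heading paragraph that has been given an outline level. The names CollectHeadings and PrintHeadings are hypothetical.

```vba
Public Function CollectHeadings(ByVal doc As Document) As Collection
    ' Hypothetical helper: returns the text of every paragraph that has an
    ' outline level other than body text.
    Dim para As Paragraph
    Dim strText As String
    Dim colHeadings As New Collection

    For Each para In doc.Paragraphs
        ' wdOutlineLevelBodyText marks ordinary (non-outline) paragraphs.
        If para.OutlineLevel <> wdOutlineLevelBodyText Then
            strText = para.Range.Text

            ' Remove the trailing paragraph mark, as suggested above.
            If Len(strText) > 0 Then strText = Left$(strText, Len(strText) - 1)

            ' Skip headings that are empty once the mark is removed.
            If Len(strText) > 0 Then colHeadings.Add strText
        End If
    Next para

    Set CollectHeadings = colHeadings
End Function

Public Sub PrintHeadings()
    ' Usage example: list the headings of the active document.
    Dim varHeading As Variant

    For Each varHeading In CollectHeadings(ActiveDocument)
        Debug.Print varHeading
    Next varHeading
End Sub
```

Returning a Collection rather than printing inside the loop keeps the lookup reusable, for example to fill a list box or write the headings to a file.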
