在Word中使用VBA的文檔中標記特殊字符

以下宏分析文檔中的每個字符，並且如果ASCII值高於255，它將對其應用特殊字符樣式 - 有些針對特定語言，或者只是針對特定語言的「lang」，如果它不是這些字符的一部分語言。

該宏可以很好地工作，但在長文檔中，需要花費很長的時間來處理。例如，我剛剛在每個頁面上處理了一個147頁（單間隔）的文檔，其中包含幾行希臘文，花了40分鐘，在Word 2016 for Windows中（相反，完全相同的文件和相同的代碼需要2分鐘在Mac上）。

有什麼我可以做的下面的代碼來優化這個Windows？

感謝您的任何建議。約翰

Sub CheckSpecialCharacters() 
    'This macro looks for any characters above 255 and tags them with the appropriate existing language character. 

     Dim ch As Range: Set ch = ActiveDocument.Characters(1) 

     Do 

      Counter = Counter + 1 

      ch.Select 

      myValue = AscW(Selection.Text) 
      If myValue > 255 Then 

       If (myValue > 8190 And myValue < 8225) Or (myValue > 288 And myValue < 381) Or (myValue > 701 And myValue < 704) Or myValue = 730 Then 
        'Ignores Curly Quotes and Transliteration punctuation 

       ElseIf (myValue > 7935 And myValue < 8192) Or (myValue > 879 And myValue < 1024) Then 
        'Greek Characters get langgrk applied 
        Selection.Expand unit:=wdWord 
        Selection.Style = "langgrk" 

       ElseIf (myValue > 1423 And myValue < 1535) Then 
        'Hebrew Characters get langheb applied 
        Selection.Expand unit:=wdWord 
        Selection.Style = "langheb" 

       ElseIf myValue > 7679 And myValue < 7830 Then 
        'Extended transliteration characters get langtrans applied //OLD VALUES// (myValue > 288 And myValue < 381) Or (myValue > 701 And myValue < 704) 
        If HCCP = True Then Selection.Expand unit:=wdWord 
        Selection.Style = "langtrans" 

       ElseIf (myValue > 19968 And myValue < 40959) Then 
        'Chinese Characters get langchin applied 
        Selection.Expand unit:=wdWord 
        Selection.Style = "langchin" 

       ElseIf (myValue > 19968 And myValue < 40917) Then 
        'Japanese Characters get langjap applied 
        Selection.Expand unit:=wdWord 
        Selection.Style = "langjap" 

       Else 
        If HCCP = True Then Selection.Expand unit:=wdWord 
        Selection.Style = "lang" 

       End If 

      End If 

DoNext: 


End Sub

來源

2017-02-28 johnwangel

查找並替換格式http://www.excelforum.com/word-programming-vba-macros/997078-best-way-to-find-replace-unicode-characters.html – Slai

鏈接中的方法isn'我需要什麼，因爲它假定你已經知道你正在尋找的角色。我試圖識別文檔中使用的任何特殊字符。如果它落入某些常見範圍，我會標記特定的語言，但否則所標識的任何東西都會得到一個通用的字符樣式。 – johnwangel

出於某種原因Range.DetectLanguage似乎並沒有對我的版本的Word（2007年）的工作，但是這可能是一個尋找到的不是檢查字符編碼。

的一般方法，以加快辦公室VBA宏是禁用屏幕更新：

Application.ScreenUpdating = False 
' some slow code that causes the screen to be updated 
Application.ScreenUpdating = True

這將有助於你的情況一點，因爲你正在使用，而不是Range較慢Selection。

此外，檢查字節值直接似乎有點比AscW快：

Sub test() 
    'Options.DefaultHighlightColorIndex = wdNoHighlight 
    'Range.HighlightColorIndex = wdNoHighlight ' used for testing to clear Highlight 

    Dim r As Range, t As Double: t = Timer 
    Application.ScreenUpdating = False 

    For Each r In Range.Characters ' For Each r In Range.Words is somehow about 2 times slower than .Characters 
     checkRange r 
    Next 

    Application.ScreenUpdating = True 
    Debug.Print Timer - t; Range.Words.Count; Range.Characters.Count; Range.End ' " 3.15625 8801 20601 20601 " 
End Sub 

Sub checkRange(r As Range) 
    Dim b() As Byte, i As Long, a As Long 
    b = r.Text ' converts the string to byte array (2 or 4 bytes per character) 
    'Debug.Print "'" & r & "'"; r.LanguageID; r.LanguageIDFarEast; r.LanguageIDOther 

    For i = 1 To UBound(b) Step 2   ' 2 bytes per Unicode codepoint 
     If b(i) > 0 Then      ' if AscW > 255 
      a = b(i): a = a * 256 + b(i - 1) ' AscW 
      Select Case a 
       Case &H1F00 To &H1FFF: r.HighlightColorIndex = wdBlue: Exit Sub ' Greek Extended 
       Case &H3040 To &H30FF: r.HighlightColorIndex = wdPink: Exit Sub ' Hiragana and Katakana 
       Case &H4E00 To 40959: r.HighlightColorIndex = wdGreen: Exit Sub ' CJK Unified Ideographs 

       Case 55296 To 56319: ' ignore leading High Surrogates ? 
       Case 56320 To 57343: ' ignore trailing Low Surrogates ? 

       Case Else: r.HighlightColorIndex = wdRed: Debug.Print Hex(a), r.End - r.Start ' other 
      End Select 
     End If 
    Next 
End Sub

很少在你的代碼一樣8190 Unicode碼點似乎有點過了，這樣你就可以在 http://www.fileformat.info/info/unicode/block/index.htm

檢查他們

來源

2017-03-01 20:05:38 Slai

謝謝。我會測試這個並讓你知道。我最初確實禁用了屏幕更新，但它並沒有加快速度，當人們長時間看到空白屏幕時，他們認爲它已經崩潰。所以我想讓他們看到發生的事情總比沒有好。 – johnwangel

謝謝@Slai。你在Windows上的方法更加高效 - 從40分鐘降低到4分鐘！（奇怪的是，這個版本在Mac上非常慢，所以我只保留我的舊方法）。 – johnwangel

@johnwangel 4分鐘聽起來有點慢，因爲在我的測試中，它在3秒內完成了10頁。您是否改變了Case語句以外的任何內容？其實，沒關係，因爲我的測試只改變了高光而不是風格 – Slai

在Word中使用VBA的文檔中標記特殊字符

回答

相關問題