2013-03-21 80 views
0

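What I need to do is... read a Word document and, based on the font attributes, add a tag before each piece of text to distinguish it as a heading or ... However, I need to do this in Perl. Is it possible? Any help would be appreciated. Thanks!

Finding the font properties of an MS Word document using Perl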

Answers

2

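I need more information to help you identify the words you need to process. In my example I am just looking for the text "Some". I have this code working with my *.docx file: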

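#!/usr/bin/perl

use Modern::Perl;
use File::Spec;
use Win32::OLE qw(in with);
use Win32::OLE::Variant;
use Win32::OLE::Const 'Microsoft Word';
$Win32::OLE::Warn = 3;    # die on any OLE error

print "Starting Word\n";

# Reuse a running Word instance if there is one, otherwise start a new one.
my $Word = Win32::OLE->GetActiveObject('Word.Application')
    || Win32::OLE->new('Word.Application');
$Word->{Visible}       = 1;
$Word->{DisplayAlerts} = 0;

# Word resolves relative paths against its own working directory,
# so pass an absolute path.
my $File = $Word->Documents->Open(File::Spec->rel2abs('fonts.docx'))
    or die Win32::OLE->LastError;

# Move to the start of the document, then search for the text.
$Word->Selection->HomeKey(wdStory);
$Word->Selection->Find->{Text} = 'Some';

# Execute returns true when a match was found and selected.
if ($Word->Selection->Find->Execute()) {
    say "Font size: [", $Word->Selection->Font->Size(), "]";
    say "Font name: [", $Word->Selection->Font->Name(), "]";
}

$Word->Quit;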
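
Once the size and name have been read as above, the tagging the question asks for can be a small lookup. The following is only a sketch: the helper names `tag_for_font` and `tag_text`, the tag names, and the size thresholds are all assumptions for illustration, not something taken from the question or from Word.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical rules: the thresholds and tag names are placeholders,
# adjust them to the document being processed.
sub tag_for_font {
    my (%font) = @_;
    my $size = $font{size} // 0;
    return 'H1' if $size >= 18;
    return 'H2' if $size >= 14 && $font{bold};
    return 'P';
}

# Prefix a piece of text with the tag chosen for its font.
sub tag_text {
    my ($text, %font) = @_;
    return sprintf '<%s>%s', tag_for_font(%font), $text;
}

print tag_text('Some heading', size => 18, bold => 0), "\n";
print tag_text('Body text',    size => 12, bold => 0), "\n";
```

The values passed as `size` and `bold` would come from `Font->Size()` and `Font->Bold()` in the script above.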
0

Try OLE automation; the Win32::OLE module is helpful here. This approach requires deeper knowledge of the Word OLE API.

4

@Nikita, this will give you a detailed view of how to do it:

#!/usr/bin/perl
use strict;
use warnings;
use Win32::OLE;
use Win32::OLE::Enum;
use Win32::OLE::Const 'Microsoft Word';
#$Win32::OLE::CP = CP_UTF8;
binmode STDOUT, ':encoding(UTF-8)';

# OPEN FILE SPECIFIED AS FIRST ARGUMENT
my $fname = $ARGV[0];
die "Usage: $0 <file>\n" unless defined $fname;
# cygpath converts the Cygwin path to an absolute Windows path for Word.
my $fnameFullPath = `cygpath.exe -wa "$fname"`;
$fnameFullPath =~ s/\\/\\\\/g;
$fnameFullPath =~ s/\s*$//;
unless (-e $fnameFullPath) { print "Error: File does not exist\n"; exit 1; }

# STARTING OLE
my $Word = Win32::OLE->GetActiveObject('Word.Application')
    || Win32::OLE->new('Word.Application', 'Quit')
    or die Win32::OLE->LastError();

$Word->{Visible} = 0;
my $doc        = $Word->Documents->Open($fnameFullPath);
my $paragraphs = $doc->Paragraphs();
my $enumerate  = Win32::OLE::Enum->new($paragraphs);

# PROCESSING PARAGRAPHS
while (defined(my $paragraph = $enumerate->Next())) {

    my $text = $paragraph->{Range}->{Text};
    # NOTE: this is the font at Word's current selection, not the font of
    # $paragraph; $paragraph->{Range}->{Font} gives the paragraph's own font.
    my $sel  = $Word->Selection;
    my $font = $sel->Font;

    if ($font->{Size} == 18) {
        print "Text: ", $text, "\n";
        print "Font Bold: ", $font->{Bold}, "\n";
        print "Font Italic: ", $font->{Italic}, "\n";
        print "Font Name: ", $font->{Name}, "\n";
        print "Font Size: ", $font->{Size}, "\n";
        print "=========\n";
    }
}

# CLOSING OLE
$doc->Close;
$Word->Quit;

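The output will look like this (every entry shows the font at the current selection, which is why all three report the same font):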

Text: This is a doc file containing different fonts and size, document also contain header and footer (Font: TNR, Size: 18) 
Font Bold: 0 
Font Italic: 0 
Font Name: Times New Roman 
Font Size: 18 
========= 
Text: This is a Perl example (Font TNR, Size: 12) 
Font Bold: 0 
Font Italic: 0 
Font Name: Times New Roman 
Font Size: 18 
========= 
Text: This is a Python example(Font: Courier New, Size: 10) 
Font Bold: 0 
Font Italic: 0 
Font Name: Times New Roman 
Font Size: 18 
=========
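
The loop above reads `$Word->Selection->Font`, and the selection does not move as the enumerator advances, so the same font is printed for every paragraph. A minimal sketch that reads each paragraph's own `Range->Font` instead might look like the following. It assumes Windows with Word installed and an absolute Windows path as the first argument. The names `show`, `report_fonts` and `WD_UNDEFINED` are hypothetical; the last one mirrors Word's `wdUndefined` value (9999999), which Word returns for a property such as `Size` when the range mixes several settings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Mirrors Word's wdUndefined: returned when a range mixes several settings.
use constant WD_UNDEFINED => 9999999;

# Turn one raw Font property value into something printable.
sub show {
    my ($value) = @_;
    return 'unknown' unless defined $value;
    return 'mixed' if $value eq WD_UNDEFINED;
    return $value;
}

# Print the font name and size of every paragraph in the document.
sub report_fonts {
    my ($path) = @_;

    # Loaded lazily so show() can be used without Windows or Word.
    require Win32::OLE;
    require Win32::OLE::Enum;

    my $word = Win32::OLE->new('Word.Application', 'Quit')
        or die Win32::OLE->LastError();
    my $doc = $word->Documents->Open($path)
        or die Win32::OLE->LastError();

    my $paragraphs = Win32::OLE::Enum->new($doc->Paragraphs);
    while (defined(my $paragraph = $paragraphs->Next)) {
        my $range = $paragraph->Range;
        my $font  = $range->Font;    # this paragraph's font, not the selection's
        (my $text = $range->Text) =~ s/\r$//;    # drop the paragraph mark
        printf "Text: %s\nFont Name: %s\nFont Size: %s\n=========\n",
            $text, show($font->Name), show($font->Size);
    }

    $doc->Close(0);    # 0 = wdDoNotSaveChanges
    $word->Quit;
}

# Only touch Word when a file name was given on the command line.
report_fonts($ARGV[0]) if @ARGV;
```

A paragraph that reports `mixed` contains runs with different sizes; in that case the individual words or characters of the range would have to be inspected instead.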
+0

But I couldn't find all the paragraph properties. Anyway, thanks! – Nikita 2013-03-22 06:05:22

+0

Thanks for the edit, Sinan – 2013-03-25 10:38:07