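Using iTextSharp (or any C# PDF library), how do I open a PDF, replace some text, and save it again?
Using iTextSharp (or any C# PDF library), I need to open a PDF, replace some placeholder text with actual values, and return the result as a byte[].
Can someone suggest how to do this? I have looked through the iText documentation and can't work out where to start. So far I am stuck on how to get the source PDF from the PdfReader into a Document object, and I suspect I am approaching this the wrong way.
Many thanks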
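In the end, I opened my existing PDF in PDFescape, placed form fields wherever I needed to insert my values, and saved it again as my PDF template.
I then found a blog entry on how to replace form fields (its URL is in the code comment below).
Everything works fine! Here is the code: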
// Requires: using System.IO; using System.Web; using iTextSharp.text.pdf;
public static byte[] Generate()
{
    var templatePath = HttpContext.Current.Server.MapPath("~/my_template.pdf");
    // Based on:
    // http://www.johnnycode.com/blog/2010/03/05/using-a-template-to-programmatically-create-pdfs-with-c-and-itextsharp/
    var reader = new PdfReader(templatePath);
    var outStream = new MemoryStream();
    var stamper = new PdfStamper(reader, outStream);
    var form = stamper.AcroFields;
    var fieldKeys = form.Fields.Keys;
    foreach (string fieldKey in fieldKeys)
    {
        // Match on each field's current (placeholder) value, then overwrite it
        if (form.GetField(fieldKey) == "MyTemplatesOriginalTextFieldA")
            form.SetField(fieldKey, "1234");
        if (form.GetField(fieldKey) == "MyTemplatesOriginalTextFieldB")
            form.SetField(fieldKey, "5678");
    }
    // "Flatten" the form so it won't be editable/usable anymore
    stamper.FormFlattening = true;
    stamper.Close();
    reader.Close();
    return outStream.ToArray();
}
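If you already know the template's field names, a variant of the above is to set each field by name rather than comparing its current value. This is only a minimal sketch, assuming iTextSharp 5.x; `PdfTemplateFiller`, `Fill`, the `flatten` parameter and the field name `CustomerName` are hypothetical names for illustration and do not come from the post above.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text.pdf;

public static class PdfTemplateFiller
{
    // Fills the named AcroForm fields of a template and returns the new PDF bytes.
    // Hypothetical helper: names and signature are illustrative only.
    public static byte[] Fill(byte[] templateBytes,
                              IDictionary<string, string> values,
                              bool flatten = true)
    {
        var output = new MemoryStream();
        var reader = new PdfReader(templateBytes);
        try
        {
            var stamper = new PdfStamper(reader, output);
            foreach (var pair in values)
            {
                // SetField returns false when no field with that name exists
                stamper.AcroFields.SetField(pair.Key, pair.Value);
            }
            // Flattening turns the fields into fixed page content
            stamper.FormFlattening = flatten;
            stamper.Close();
        }
        finally
        {
            reader.Close();
        }
        // ToArray still works after the stamper has closed the stream
        return output.ToArray();
    }
}
```

Usage would look something like `PdfTemplateFiller.Fill(File.ReadAllBytes(templatePath), new Dictionary<string, string> { { "CustomerName", "1234" } })`, where `CustomerName` is whatever name you gave the field in your form editor.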
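Unfortunately, I was looking for something similar and couldn't figure it out. Below is what I ended up with; maybe you can use it as a starting point. The problem is that a PDF doesn't actually store the text as plain text, but uses lookup tables and some other arcane magic. This method reads the page's byte values and tries to convert them to a string, but as far as I can tell it only handles English and misses some special characters, so I abandoned my project and moved on.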
string contents = string.Empty;
Document doc = new Document();
PdfReader reader = new PdfReader("pathToPdf.pdf");
using (MemoryStream memoryStream = new MemoryStream())
{
    PdfWriter writer = PdfWriter.GetInstance(doc, memoryStream);
    doc.Open();
    PdfContentByte cb = writer.DirectContent;
    for (int p = 1; p <= reader.NumberOfPages; p++)
    {
        // add page from reader
        doc.SetPageSize(reader.GetPageSize(p));
        doc.NewPage();
        // pickup here something like this:
        byte[] bt = reader.GetPageContent(p);
        contents = ExtractTextFromPDFBytes(bt);
        if (contents.IndexOf("something") != -1)
        {
            // make your own pdf page and add to cb (contentbyte)
        }
        else
        {
            // copy the original page unchanged, honouring its rotation
            PdfImportedPage page = writer.GetImportedPage(reader, p);
            int rot = reader.GetPageRotation(p);
            if (rot == 90 || rot == 270)
                cb.AddTemplate(page, 0, -1.0F, 1.0F, 0, 0, reader.GetPageSizeWithRotation(p).Height);
            else
                cb.AddTemplate(page, 1.0F, 0, 0, 1.0F, 0, 0);
        }
    }
    reader.Close();
    doc.Close();
    File.WriteAllBytes("pathToOutputOrSamePathToOverwrite.pdf", memoryStream.ToArray());
}
This is taken from this site:
private string ExtractTextFromPDFBytes(byte[] input)
{
    if (input == null || input.Length == 0) return "";
    try
    {
        string resultString = "";
        // Flag showing if we are we currently inside a text object
        bool inTextObject = false;
        // Flag showing if the next character is literal
        // e.g. '\\' to get a '\' character or '\(' to get '('
        bool nextLiteral = false;
        // () Bracket nesting level. Text appears inside ()
        int bracketDepth = 0;
        // Keep previous chars to get extract numbers etc.:
        char[] previousCharacters = new char[_numberOfCharsToKeep];
        for (int j = 0; j < _numberOfCharsToKeep; j++) previousCharacters[j] = ' ';
        for (int i = 0; i < input.Length; i++)
        {
            char c = (char)input[i];
            if (inTextObject)
            {
                // Position the text
                if (bracketDepth == 0)
                {
                    if (CheckToken(new string[] { "TD", "Td" }, previousCharacters))
                    {
                        resultString += "\n\r";
                    }
                    else if (CheckToken(new string[] { "'", "T*", "\"" }, previousCharacters))
                    {
                        resultString += "\n";
                    }
                    else if (CheckToken(new string[] { "Tj" }, previousCharacters))
                    {
                        resultString += " ";
                    }
                }
                // End of a text object, also go to a new line.
                if (bracketDepth == 0 &&
                    CheckToken(new string[] { "ET" }, previousCharacters))
                {
                    inTextObject = false;
                    resultString += " ";
                }
                else if ((c == '(') && (bracketDepth == 0) && (!nextLiteral))
                {
                    // Start outputting text
                    bracketDepth = 1;
                }
                else if ((c == ')') && (bracketDepth == 1) && (!nextLiteral))
                {
                    // Stop outputting text
                    bracketDepth = 0;
                }
                else if (bracketDepth == 1)
                {
                    // Just a normal text character:
                    // Only print out next character no matter what.
                    // Do not interpret.
                    if (c == '\\' && !nextLiteral)
                    {
                        nextLiteral = true;
                    }
                    else
                    {
                        if (((c >= ' ') && (c <= '~')) ||
                            ((c >= 128) && (c < 255)))
                        {
                            resultString += c.ToString();
                        }
                        nextLiteral = false;
                    }
                }
            }
            // Store the recent characters for
            // when we have to go back for a checking
            for (int j = 0; j < _numberOfCharsToKeep - 1; j++)
            {
                previousCharacters[j] = previousCharacters[j + 1];
            }
            previousCharacters[_numberOfCharsToKeep - 1] = c;
            // Start of a text object
            if (!inTextObject && CheckToken(new string[] { "BT" }, previousCharacters))
            {
                inTextObject = true;
            }
        }
        return resultString;
    }
    catch
    {
        return "";
    }
}
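private bool CheckToken(string[] tokens, char[] recent)
{
    // Index of the most recently stored character
    int last = _numberOfCharsToKeep - 1;
    foreach (string token in tokens)
    {
        // The operator must sit just before the last character and be
        // delimited by whitespace on both sides. Indexing by token.Length
        // (instead of a fixed token[1]) keeps one-character operators such
        // as ' and " from throwing an out-of-range exception.
        int start = last - token.Length;
        if (start < 1) continue;
        if (!IsPdfWhitespace(recent[last]) || !IsPdfWhitespace(recent[start - 1])) continue;
        bool matches = true;
        for (int k = 0; k < token.Length && matches; k++)
        {
            matches = recent[start + k] == token[k];
        }
        if (matches) return true;
    }
    return false;
}
private static bool IsPdfWhitespace(char ch)
{
    return ch == ' ' || ch == 0x0d || ch == 0x0a;
}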
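To illustrate the point that a PDF page does not hold its text as one plain run, here is a small sketch in plain C# (no PDF library). The content stream in `Sample` is a made-up example, and `ContentStreamDemo` and `JoinLiteralStrings` are hypothetical names. It shows a phrase split across a `TJ` array by kerning adjustments, so a direct search of the raw bytes misses it, while joining the parenthesised pieces recovers it.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class ContentStreamDemo
{
    // Made-up content stream: "Hello World" is split into three string
    // fragments with kerning numbers in between.
    public const string Sample =
        "BT /F1 12 Tf 72 720 Td [(Hel) -20 (lo Wor) 15 (ld)] TJ ET";

    // Naive search of the raw stream text for a phrase.
    public static bool RawContains(string contentStream, string phrase)
    {
        return contentStream.IndexOf(phrase, StringComparison.Ordinal) != -1;
    }

    // Concatenates the contents of simple (...) literal strings.
    // Deliberately ignores escapes, nested parentheses, hex strings and
    // font encodings, so it is an illustration, not a real extractor.
    public static string JoinLiteralStrings(string contentStream)
    {
        var sb = new StringBuilder();
        foreach (Match m in Regex.Matches(contentStream, @"\(([^()\\]*)\)"))
        {
            sb.Append(m.Groups[1].Value);
        }
        return sb.ToString();
    }
}
```

With this sample, `RawContains(Sample, "Hello World")` is false while `JoinLiteralStrings(Sample)` gives back the phrase. Fonts with custom encodings or hex strings would defeat even the second approach, which fits the "English only" limitation mentioned above.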
What is _numberOfCharsToKeep? Its declaration is missing, so please tell me how to define it. – 2013-09-13 04:57:18
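On the missing `_numberOfCharsToKeep`: the post never shows it, but from the code it is the size of the rolling buffer of recent characters. Since `CheckToken` reads as far back as index `_numberOfCharsToKeep - 4`, any integer of at least 4 should do. The sketch below assumes 15, which is a guess and not a value taken from the post; `TokenBufferDemo` and `EndsWithOperator` are hypothetical names, and the check is a compact two-character version of the one in the answer.

```csharp
using System;

public static class TokenBufferDemo
{
    // Assumption: the value is not given in the post. It must be >= 4
    // because the check looks at index _numberOfCharsToKeep - 4.
    private const int _numberOfCharsToKeep = 15;

    private static bool IsSpace(char ch)
    {
        return ch == ' ' || ch == '\r' || ch == '\n';
    }

    // Feeds the stream through the rolling buffer, then reports whether the
    // last thing seen was the given two-character operator surrounded by
    // whitespace.
    public static bool EndsWithOperator(string stream, string op)
    {
        var recent = new char[_numberOfCharsToKeep];
        for (int j = 0; j < recent.Length; j++) recent[j] = ' ';

        foreach (char c in stream)
        {
            // Shift left by one and append the newest character
            Array.Copy(recent, 1, recent, 0, recent.Length - 1);
            recent[recent.Length - 1] = c;
        }

        int n = _numberOfCharsToKeep;
        return op.Length == 2
            && recent[n - 3] == op[0]
            && recent[n - 2] == op[1]
            && IsSpace(recent[n - 1])
            && IsSpace(recent[n - 4]);
    }
}
```

In the original class the equivalent would presumably be a private field next to the two methods, for example `private const int _numberOfCharsToKeep = 15;`.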
Found this so far: http://www.johnnycode.com/blog/2010/03/05/using-a-template-to-programmatically-create-pdfs-with-c-and-itextsharp/ – Chris 2010-11-19 02:24:19