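C# NEST Bulk API fails with System.IO.IOException
I want to bulk insert data from a SQL database into an Elasticsearch index. Below is the code I am using; the total number of records is about 1.5 million. I think the problem is related to the connection settings, but I can't figure it out. Can anyone help with this code or suggest a better way to do this?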
public void InsertReceipts()
{
    IEnumerable<Receipts> receipts = GetFromDB(); // get receipts from SQL DB
    const string index = "receipts";
    var config = ConfigurationManager.AppSettings["ElasticSearchUri"];
    var node = new Uri(config);
    var settings = new ConnectionSettings(node).RequestTimeout(TimeSpan.FromMinutes(30));
    var client = new ElasticClient(settings);
    var bulkIndexer = new BulkDescriptor();
    foreach (var receiptBatch in receipts.Batch(20000)) // using MoreLinq for Batch
    {
        // Add one index operation per receipt to the current bulk request
        Parallel.ForEach(receiptBatch, (receipt) =>
        {
            bulkIndexer.Index<OfficeReceipt>(i => i
                .Document(receipt)
                .Id(receipt.TransactionGuid)
                .Index(index));
        });
        // Send the whole batch in a single _bulk call
        var response = client.Bulk(bulkIndexer);
        if (!response.IsValid)
        {
            _logger.LogError(response.ServerError.ToString());
        }
        // Start a fresh descriptor for the next batch
        bulkIndexer = new BulkDescriptor();
    }
}
The code works, but it takes about 10 minutes to complete. When I try to increase the batch size, it fails with the following error:
Invalid NEST response built from a unsuccessful low level call on POST: /_bulk
Invalid Bulk items: OriginalException: System.Net.WebException: The underlying connection was closed: An unexpected error occurred on a send. ---> System.IO.IOException: Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
See the answer to the possible duplicate question. NEST 5.x includes a helper for executing bulk requests in parallel, which may help you here.
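Thanks @RussCam. I agree that the root cause of my exception is probably the size of the data I am sending in each bulk request. I will try the Observable pattern from your answer with different settings to see which configuration works best for my scenario.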
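For readers following the comments, here is a rough sketch of what the parallel bulk helper (BulkAll) and its observer might look like for this case. It is not the code from the linked answer. The OfficeReceipt class, its Amount field, the ReceiptIndexer wrapper, and the batch size, parallelism and back-off values are all assumptions chosen for illustration; InferMappingFor is the NEST 5.x/6.x name (NEST 7.x renamed it DefaultMappingFor), so check the calls against the NEST version you have installed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using Nest;

// Hypothetical document type standing in for the question's OfficeReceipt.
public class OfficeReceipt
{
    public Guid TransactionGuid { get; set; }
    public decimal Amount { get; set; } // illustrative field, not from the question
}

public static class ReceiptIndexer
{
    public static void IndexAll(IEnumerable<OfficeReceipt> receipts, Uri node)
    {
        var settings = new ConnectionSettings(node)
            // Use TransactionGuid as the document _id, so no per-document .Id() call is needed.
            .InferMappingFor<OfficeReceipt>(m => m.IdProperty(r => r.TransactionGuid));
        var client = new ElasticClient(settings);

        var done = new ManualResetEventSlim(false);
        Exception failure = null;

        // BulkAll lazily splits the enumerable into pages and sends them concurrently.
        var bulkAll = client.BulkAll(receipts, b => b
            .Index("receipts")
            .Size(1000)                 // documents per _bulk request (assumed value, tune it)
            .MaxDegreeOfParallelism(4)  // concurrent _bulk requests (assumed value, tune it)
            .BackOffRetries(2)          // retry a failed page up to twice
            .BackOffTime("30s"));       // wait between retries

        // Nothing is sent until an observer subscribes.
        bulkAll.Subscribe(new BulkAllObserver(
            onNext: response => Console.WriteLine($"Indexed page {response.Page}"),
            onError: e => { failure = e; done.Set(); },
            onCompleted: () => done.Set()));

        // Block the caller until every page has been sent or an error occurred.
        done.Wait();
        if (failure != null)
        {
            throw new InvalidOperationException("Bulk indexing failed", failure);
        }
    }
}
```

A call such as ReceiptIndexer.IndexAll(GetFromDB(), new Uri(config)) would replace the manual batching loop. Smaller pages sent in parallel keep each request body well below the size at which the server drops the connection, which is why tuning Size and MaxDegreeOfParallelism is usually more productive than raising the batch size of a single request.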