2011-07-14 66 views
23

該程序一個接一個地執行數以萬計的連續插入。我從來沒有用過Hibernate。我得到的速度非常慢(如果我只是手動連接和執行SQL,速度要快10-12倍,根據許多hibernate教程,我的batch_size設置爲50)Hibernate批量大小混淆

這是一個來自單個插入的日誌 - 也許你能幫助我瞭解到底發生了什麼:

START INSERT 
11:02:56.121 [main] DEBUG org.hibernate.impl.SessionImpl - opened session at timestamp: 13106053761                        
11:02:56.121 [main] DEBUG o.h.transaction.JDBCTransaction - begin                                
11:02:56.121 [main] DEBUG org.hibernate.jdbc.ConnectionManager - opening JDBC connection                            
11:02:56.121 [main] TRACE o.h.c.DriverManagerConnectionProvider - total checked-out connections: 0                          
11:02:56.121 [main] TRACE o.h.c.DriverManagerConnectionProvider - using pooled JDBC connection, pool size: 0                       
11:02:56.121 [main] DEBUG o.h.transaction.JDBCTransaction - current autocommit status: false                          
11:02:56.121 [main] TRACE org.hibernate.jdbc.JDBCContext - after transaction begin                            
11:02:56.121 [main] TRACE org.hibernate.impl.SessionImpl - setting flush mode to: MANUAL                          
11:02:56.121 [main] TRACE o.h.e.def.DefaultLoadEventListener - loading entity: [com.xyzcompany.foo.edoi.ejb.msw000.MSW000Rec#component[keyW000]{keyW000=F000 ADSUFC}]       
11:02:56.121 [main] TRACE o.h.e.def.DefaultLoadEventListener - creating new proxy for entity                          
11:02:56.122 [main] TRACE o.h.e.d.DefaultSaveOrUpdateEventListener - saving transient instance                           
11:02:56.122 [main] DEBUG o.h.e.def.AbstractSaveEventListener - generated identifier: component[keyW000]{keyW000=F000 ADSUFC}, using strategy: org.hibernate.id.CompositeNestedGeneratedValueGenerator 
11:02:56.122 [main] TRACE o.h.e.def.AbstractSaveEventListener - saving [com.xyzcompany.foo.edoi.ejb.msw000.MSW000Rec#component[keyW000]{keyW000=F000 ADSUFC}]          
11:02:56.123 [main] TRACE o.h.e.d.AbstractFlushingEventListener - flushing session                              
11:02:56.123 [main] DEBUG o.h.e.d.AbstractFlushingEventListener - processing flush-time cascades                          
11:02:56.123 [main] DEBUG o.h.e.d.AbstractFlushingEventListener - dirty checking collections                           
11:02:56.123 [main] TRACE o.h.e.d.AbstractFlushingEventListener - Flushing entities and processing referenced collections                    
11:02:56.125 [main] TRACE o.h.e.d.AbstractFlushingEventListener - Processing unreferenced collections                         
11:02:56.125 [main] TRACE o.h.e.d.AbstractFlushingEventListener - Scheduling collection removes/(re)creates/updates                     
11:02:56.126 [main] DEBUG o.h.e.d.AbstractFlushingEventListener - Flushed: 1 insertions, 0 updates, 0 deletions to 62 objects                   
11:02:56.126 [main] DEBUG o.h.e.d.AbstractFlushingEventListener - Flushed: 0 (re)creations, 0 updates, 0 removals to 0 collections                                              
11:02:56.132 [main] TRACE o.h.e.d.AbstractFlushingEventListener - executing flush                              
11:02:56.132 [main] TRACE org.hibernate.jdbc.ConnectionManager - registering flush begin                            
11:02:56.132 [main] TRACE o.h.p.entity.AbstractEntityPersister - Inserting entity: [com.xyzcompany.foo.edoi.ejb.msw000.MSW000Rec#component[keyW000]{keyW000=F000 ADSUFC}]       
11:02:56.132 [main] DEBUG org.hibernate.jdbc.AbstractBatcher - about to open PreparedStatement (open PreparedStatements: 0, globally: 0)               
11:02:56.132 [main] DEBUG org.hibernate.SQL - insert into MSW000 (W000_DATA_REC, W000_FILE_FLAGS, KEY_W000) values (?, ?, ?)              
11:02:56.132 [main] TRACE org.hibernate.jdbc.AbstractBatcher - preparing statement                             
11:02:56.132 [main] TRACE o.h.p.entity.AbstractEntityPersister - Dehydrating entity: [com.xyzcompany.foo.edoi.ejb.msw000.MSW000Rec#component[keyW000]{keyW000=F000 ADSUFC}]      
11:02:56.132 [main] TRACE org.hibernate.type.StringType - binding ' ADSUFCA                                                                     ' to parameter: 1 
11:02:56.132 [main] TRACE org.hibernate.type.StringType - binding ' ' to parameter: 2                           
11:02:56.132 [main] TRACE org.hibernate.type.StringType - binding 'F000 ADSUFC' to parameter: 3                        
11:02:56.132 [main] DEBUG org.hibernate.jdbc.AbstractBatcher - Executing batch size: 1                            
11:02:56.133 [main] DEBUG org.hibernate.jdbc.AbstractBatcher - about to close PreparedStatement (open PreparedStatements: 1, globally: 1)               
11:02:56.133 [main] TRACE org.hibernate.jdbc.AbstractBatcher - closing statement                             
11:02:56.133 [main] TRACE org.hibernate.jdbc.ConnectionManager - registering flush end                            
11:02:56.133 [main] TRACE o.h.e.d.AbstractFlushingEventListener - post flush                               
11:02:56.133 [main] DEBUG o.h.transaction.JDBCTransaction - commit                                
11:02:56.133 [main] TRACE org.hibernate.impl.SessionImpl - automatically flushing session                          
11:02:56.133 [main] TRACE org.hibernate.jdbc.JDBCContext - before transaction completion                          
11:02:56.133 [main] TRACE org.hibernate.impl.SessionImpl - before transaction completion                          
11:02:56.133 [main] DEBUG o.h.transaction.JDBCTransaction - committed JDBC Connection                           
11:02:56.133 [main] TRACE org.hibernate.jdbc.JDBCContext - after transaction completion                           
11:02:56.133 [main] DEBUG org.hibernate.jdbc.ConnectionManager - transaction completed on session with on_close connection release mode; be sure to close the session to release JDBC resources!  
11:02:56.133 [main] TRACE org.hibernate.impl.SessionImpl - after transaction completion                           
11:02:56.133 [main] TRACE org.hibernate.impl.SessionImpl - closing session                              
11:02:56.133 [main] TRACE org.hibernate.jdbc.ConnectionManager - performing cleanup                             
11:02:56.133 [main] DEBUG org.hibernate.jdbc.ConnectionManager - releasing JDBC connection [ (open PreparedStatements: 0, globally: 0) (open ResultSets: 0, globally: 0)]        
11:02:56.133 [main] TRACE o.h.c.DriverManagerConnectionProvider - returning connection to pool, pool size: 1                       
11:02:56.133 [main] TRACE org.hibernate.jdbc.JDBCContext - after transaction completion                           
11:02:56.133 [main] DEBUG org.hibernate.jdbc.ConnectionManager - transaction completed on session with on_close connection release mode; be sure to close the session to release JDBC resources!  
11:02:56.134 [main] TRACE org.hibernate.impl.SessionImpl - after transaction completion                           
FINISH INSERT 

回答

75

當你調用session.save(),hibernate會產生一個INSERT SQL實際代碼的建議。這個INSERT SQL將被追加以在沖洗期間發佈給DB(即session.flush())。

在刷新期間,如果hibernate.jdbc.batch_size設置爲某個非零值,Hibernate將使用JDBC2 API中引入的批處理功能向數據庫發出批量插入SQL。

例如,如果您save() 100記錄和您的hibernate.jdbc.batch_size設置爲50.在沖洗,而不是發出以下SQL 100倍:

insert into TableA (id , fields) values (1, 'val1'); 
insert into TableA (id , fields) values (2, 'val2'); 
insert into TableA (id , fields) values (3, 'val3'); 
......................... 
insert into TableA (id , fields) values (100, 'val100'); 

Hiberate將他們的50批次組,只有問題2 SQL的數據庫,就像這樣:

insert into TableA (id , fields) values (1, 'val1') , (2, 'val2') ,(3, 'val3') ,(4, 'val4') ,......,(50, 'val50') 
insert into TableA (id , fields) values (51, 'val51') , (52, 'val52') ,(53, 'val53') ,(54, 'val54'),...... ,(100, 'val100') 

請注意,Hibernate對將在JDBC級別透明的,如果插入表的主鍵是GenerationType.Identity禁用插入配料。

從您的日誌中:您只有一條記錄save(),然後flush(),因此只有一個附加的INSERT SQL將被處理用於每次刷新。這就是爲什麼Hibernate無法幫助您批量插入,因爲只有一個INSERT SQL需要處理。在撥打flush()之前,您應該撥打save(),直到達到一定數量的記錄,而不是每save()撥打flush()

批量插入的最佳做法是這樣的:

Session session = sessionFactory.openSession(); 
Transaction tx = session.beginTransaction(); 
for (int i=0; i<888888; i++) { 
    TableA record = new TableA(); 
    record.setXXXX(); 
    session.save(record) 
    if (i % 50 == 0) { //50, same as the JDBC batch size 
     //flush a batch of inserts and release memory: 
     session.flush(); 
     session.clear(); 
    } 
} 
tx.commit(); 
session.close(); 

保存並通過批量刷新紀錄批次。在每個批處理結束時,您應該清除持久化上下文以釋放一些內存,以防止每個持久對象放入第一級緩存(JVM的內存)時耗盡內存。您也可以禁用二級緩存以減少不必要的開銷。


參考:

+2

夢幻般的答案。如果我能讓你更多,我會。我設法解決了這個問題,不過謝謝! –

+2

我想知道爲什麼在**遞增循環索引'i'之前刷新並清除會話**。我想你應該從'for'循環中刪除'i ++',並且如果條件爲'++ i%50 == 0'則更改。 (在Hibernate文檔中出錯?) – DylanYi

+0

@dylanyi這只是一個例子。也許索引i與掃描源數組有關(請參閱record.setXXX:它從何處獲取數據?)。在處理數百萬條記錄時,在開始時刷新0並不會產生如此差異 –

3
11:02:56.133 [main] DEBUG o.h.transaction.JDBCTransaction - commit  

這是說,數據庫每次插入後提交確保您沒有提交你的事務/關閉插入循環內部會話執行此操作。一次代之以代替。

+0

你應該,但是,沖洗和清除您的會話每隔數1000條記錄左右。 – jtahlborn

+0

@jtahlborn是非常好的一點,否則Hibernate會耗盡內存 – prunge

4

如果您必須使用hibernate進行大批量作業,則需要使用StatelessSession。它將事物分解爲最基本的轉換對象到SQL語句的映射,並消除了在將批量行填充到批量數據庫時不使用的ORM功能的所有開銷。

這也將是更容易做出比日誌:)

+0

我知道會的。該代碼實際上是由naca生成的java(來自COBOL),因此難以置信地進行調試並讓您頭腦發熱。還沒有找到插入嘆息的地方 –