2017-06-21 129 views
1

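Neo4j takes forever to load

I am running Neo4j 3.2.1 on a Linux machine with 16 GB of RAM. I am trying to load a graph from a CSV file with 11M rows; the graph has at most about 150K nodes. The load takes forever. I tried increasing the heap size and using periodic commit with values from 10000 to 100000, but nothing changed, and it has now been hanging for about 2 hours. From what I found online, it should not take anywhere near that long. Here is the configuration file: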

#***************************************************************** 
# Neo4j configuration 
# 
# For more details and a complete list of settings, please see 
# https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/ 
#***************************************************************** 

# The name of the database to mount 
#dbms.active_database=graph.db 

# Paths of directories in the installation. 
#dbms.directories.data=data 
#dbms.directories.plugins=plugins 
#dbms.directories.certificates=certificates 
#dbms.directories.logs=logs 
#dbms.directories.lib=lib 
#dbms.directories.run=run 

# This setting constrains all `LOAD CSV` import files to be under the `import` directory. Remove or comment it out to 
# allow files to be loaded from anywhere in the filesystem; this introduces possible security problems. See the 
# `LOAD CSV` section of the manual for details. 
dbms.directories.import=import 

# Whether requests to Neo4j are authenticated. 
# To disable authentication, uncomment this line 
#dbms.security.auth_enabled=false 

# Enable this to be able to upgrade a store from an older version. 
dbms.allow_format_migration=true 

# Java Heap Size: by default the Java heap size is dynamically 
# calculated based on available system resources. 
# Uncomment these lines to set specific initial and maximum 
# heap size. 
dbms.memory.heap.initial_size=512m 
dbms.memory.heap.max_size=15400m 

# The amount of memory to use for mapping the store files, in bytes (or 
# kilobytes with the 'k' suffix, megabytes with 'm' and gigabytes with 'g'). 
# If Neo4j is running on a dedicated server, then it is generally recommended 
# to leave about 2-4 gigabytes for the operating system, give the JVM enough 
# heap to hold all your transaction state and query context, and then leave the 
# rest for the page cache. 
# The default page cache memory assumes the machine is dedicated to running 
# Neo4j, and is heuristically set to 50% of RAM minus the max Java heap size. 
dbms.memory.pagecache.size=14g 

#***************************************************************** 
# Network connector configuration 
#***************************************************************** 

# With default configuration Neo4j only accepts local connections. 
# To accept non-local connections, uncomment this line: 
dbms.connectors.default_listen_address=0.0.0.0 

# You can also choose a specific network interface, and configure a non-default 
# port for each connector, by setting their individual listen_address. 

# The address at which this server can be reached by its clients. This may be the server's IP address or DNS name, or 
# it may be the address of a reverse proxy which sits in front of the server. This setting may be overridden for 
# individual connectors below. 
#dbms.connectors.default_advertised_address=localhost 

# You can also choose a specific advertised hostname or IP address, and 
# configure an advertised port for each connector, by setting their 
# individual advertised_address. 

# Bolt connector 
dbms.connector.bolt.enabled=true 
#dbms.connector.bolt.tls_level=OPTIONAL 
#dbms.connector.bolt.listen_address=:7687 

# HTTP Connector. There must be exactly one HTTP connector. 
dbms.connector.http.enabled=true 
#dbms.connector.http.listen_address=:7474 

# HTTPS Connector. There can be zero or one HTTPS connectors. 
dbms.connector.https.enabled=true 
#dbms.connector.https.listen_address=:7473 

# Number of Neo4j worker threads. 
#dbms.threads.worker_count= 

#***************************************************************** 
# Logging configuration 
#***************************************************************** 

# To enable HTTP logging, uncomment this line 
#dbms.logs.http.enabled=true 

# Number of HTTP logs to keep. 
#dbms.logs.http.rotation.keep_number=5 

# Size of each HTTP log that is kept. 
#dbms.logs.http.rotation.size=20m 

# To enable GC Logging, uncomment this line 
#dbms.logs.gc.enabled=true 

# GC Logging Options 
# see http://docs.oracle.com/cd/E19957-01/819-0084-10/pt_tuningjava.html#wp57013 for more information. 
#dbms.logs.gc.options=-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -XX:+PrintTenuringDistribution 

# Number of GC logs to keep. 
#dbms.logs.gc.rotation.keep_number=5 

# Size of each GC log that is kept. 
#dbms.logs.gc.rotation.size=20m 

# Size threshold for rotation of the debug log. If set to zero then no rotation will occur. Accepts a binary suffix "k", 
# "m" or "g". 
#dbms.logs.debug.rotation.size=20m 

# Maximum number of history files for the internal log. 
#dbms.logs.debug.rotation.keep_number=7 

#***************************************************************** 
# Miscellaneous configuration 
#***************************************************************** 

# Enable this to specify a parser other than the default one. 
#cypher.default_language_version=3.0 

# Determines if Cypher will allow using file URLs when loading data using 
# `LOAD CSV`. Setting this value to `false` will cause Neo4j to fail `LOAD CSV` 
# clauses that load data from the file system. 
#dbms.security.allow_csv_import_from_file_urls=true 

# Retention policy for transaction logs needed to perform recovery and backups. 
dbms.tx_log.rotation.retention_policy=1 days 

# Enable a remote shell server which Neo4j Shell clients can log in to. 
#dbms.shell.enabled=true 
# The network interface IP the shell will listen on (use 0.0.0.0 for all interfaces). 
#dbms.shell.host=127.0.0.1 
# The port the shell will listen on, default is 1337. 
#dbms.shell.port=1337 

# Only allow read operations from this Neo4j instance. This mode still requires 
# write access to the directory for lock purposes. 
#dbms.read_only=false 

# Comma separated list of JAX-RS packages containing JAX-RS resources, one 
# package name for each mountpoint. The listed package names will be loaded 
# under the mountpoints specified. Uncomment this line to mount the 
# org.neo4j.examples.server.unmanaged.HelloWorldResource.java from 
# neo4j-server-examples under /examples/unmanaged, resulting in a final URL of 
# http://localhost:7474/examples/unmanaged/helloworld/{nodeId} 
#dbms.unmanaged_extension_classes=org.neo4j.examples.server.unmanaged=/examples/unmanaged 

#******************************************************************** 
# JVM Parameters 
#******************************************************************** 

# G1GC generally strikes a good balance between throughput and tail 
# latency, without too much tuning. 
dbms.jvm.additional=-XX:+UseG1GC 

# Have common exceptions keep producing stack traces, so they can be 
# debugged regardless of how often logs are rotated. 
dbms.jvm.additional=-XX:-OmitStackTraceInFastThrow 

# Make sure that `initmemory` is not only allocated, but committed to 
# the process, before starting the database. This reduces memory 
# fragmentation, increasing the effectiveness of transparent huge 
# pages. It also reduces the possibility of seeing performance drop 
# due to heap-growing GC events, where a decrease in available page 
# cache leads to an increase in mean IO response time. 
# Try reducing the heap memory, if this flag degrades performance. 
dbms.jvm.additional=-XX:+AlwaysPreTouch 

# Trust that non-static final fields are really final. 
# This allows more optimizations and improves overall performance. 
# NOTE: Disable this if you use embedded mode, or have extensions or dependencies that may use reflection or 
# serialization to change the value of final fields! 
dbms.jvm.additional=-XX:+UnlockExperimentalVMOptions 
dbms.jvm.additional=-XX:+TrustFinalNonStaticFields 

# Disable explicit garbage collection, which is occasionally invoked by the JDK itself. 
dbms.jvm.additional=-XX:+DisableExplicitGC 

# Remote JMX monitoring, uncomment and adjust the following lines as needed. Absolute paths to jmx.access and 
# jmx.password files are required. 
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords, 
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'. 
# For more details, see: http://download.oracle.com/javase/8/docs/technotes/guides/management/agent.html 
# On Unix based systems the jmx.password file needs to be owned by the user that will run the server, 
# and have permissions set to 0600. 
# For details on setting these file permissions on Windows see: 
#  http://docs.oracle.com/javase/8/docs/technotes/guides/management/security-windows.html 
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.port=3637 
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.authenticate=true 
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.ssl=false 
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.password.file=/absolute/path/to/conf/jmx.password 
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.access.file=/absolute/path/to/conf/jmx.access 

# Some systems cannot discover host name automatically, and need this line configured: 
#dbms.jvm.additional=-Djava.rmi.server.hostname=$THE_NEO4J_SERVER_HOSTNAME 

# Expand Diffie Hellman (DH) key size from default 1024 to 2048 for DH-RSA cipher suites used in server TLS handshakes. 
# This is to protect the server from any potential passive eavesdropping. 
dbms.jvm.additional=-Djdk.tls.ephemeralDHKeySize=2048 

#******************************************************************** 
# Wrapper Windows NT/2000/XP Service Properties 
#******************************************************************** 
# WARNING - Do not modify any of these properties when an application 
# using this configuration file has been installed as a service. 
# Please uninstall the service before modifying this section. The 
# service can then be reinstalled. 

# Name of the service 
dbms.windows_service_name=neo4j 

#******************************************************************** 
# Other Neo4j system properties 
#******************************************************************** 
dbms.jvm.additional=-Dunsupported.dbms.udc.source=tarball 

After the load has been running for a long time, I get this error:

Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x000000072c000000, 497025024, 0) failed; error='Ne peut allouer de la mémoire' (errno=12) 
# 
# There is insufficient memory for the Java Runtime Environment to continue. 
# Native memory allocation (mmap) failed to map 497025024 bytes for committing reserved memory. 

How can I fix this?

Query:

CREATE CONSTRAINT ON (n:Node) ASSERT n.NodeID IS UNIQUE; 
USING PERIODIC COMMIT 100000 

LOAD CSV WITH HEADERS FROM 'file:///file.csv' 
AS line 

MERGE (n1:Node {NodeID: line.p1}) 
MERGE (n2:Node {NodeID: line.p2}) 
MERGE (n1)-[:ACTING_WITH_L {Score: TOFLOAT(line.score)}]->(n2); 
+1

Can you provide your indexes/constraints and your LOAD CSV query? – InverseFalcon

+0

@InverseFalcon I have added the query I was running when I hit this problem to the post –

Answers

2

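When I ran EXPLAIN on your query, I noticed that the plan contains an Eager operation. When Eager is part of the plan for a LOAD CSV query, the query falls back to processing as if PERIODIC COMMIT were not used, and it runs into memory problems.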
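To check this yourself, here is a minimal sketch: the statement from the question prefixed with EXPLAIN, so that it is only planned and no data is loaded. Leaving out the USING PERIODIC COMMIT hint is my own simplification for the plan-only run. If the resulting plan lists an Eager operator, the query is affected.

```cypher
// EXPLAIN only compiles the query and shows its plan; nothing is written.
// Look for an "Eager" operator in the resulting plan.
EXPLAIN
LOAD CSV WITH HEADERS FROM 'file:///file.csv' AS line
MERGE (n1:Node {NodeID: line.p1})
MERGE (n2:Node {NodeID: line.p2})
MERGE (n1)-[:ACTING_WITH_L {Score: toFloat(line.score)}]->(n2);
```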

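Here is a blog post about this, including the solution: make sure the nodes are merged into the graph as a separate operation, and once you are sure the nodes are loaded, handle the relationships separately by matching on the nodes rather than using MERGE.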
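As a rough sketch of the node-only step, assuming the same file, label, and column names as in the question, the nodes could be loaded in two passes, one per column. The batch size of 10000 is an arbitrary value I picked for illustration, not a recommendation. Run each statement on its own, with the uniqueness constraint on :Node(NodeID) already in place.

```cypher
// Pass 1: create the nodes referenced in column p1 (existing ones are reused).
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM 'file:///file.csv' AS line
MERGE (:Node {NodeID: line.p1});

// Pass 2: same for column p2.
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM 'file:///file.csv' AS line
MERGE (:Node {NodeID: line.p2});
```

Each statement contains a single MERGE on a constrained property, so it should be worth confirming with EXPLAIN that neither plan contains Eager.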

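Instead of the MERGE ... MERGE ... MERGE pattern, which merges in the two nodes and then the relationship, use MATCH ... MATCH ... MERGE. Also, if you are sure the graph does not contain these relationships yet, you may want to use CREATE instead of MERGE on the relationship to speed up the load.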
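A sketch of what the relationship step might look like, assuming all nodes are already in the graph and no ACTING_WITH_L relationships exist yet (otherwise swap CREATE for MERGE). The file, label, and property names come from the question; the batch size of 10000 is an assumed value.

```cypher
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM 'file:///file.csv' AS line
// Both endpoints are looked up through the unique constraint on NodeID.
MATCH (n1:Node {NodeID: line.p1})
MATCH (n2:Node {NodeID: line.p2})
// CREATE skips the existence check that MERGE would perform for every row.
CREATE (n1)-[:ACTING_WITH_L {Score: toFloat(line.score)}]->(n2);
```

Note that with CREATE, running the statement twice would duplicate every relationship, so it only fits a one-time load into a graph without these relationships.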

+0

I did this with several LOAD CSV passes. That works well too. Try LOAD ... MERGE, LOAD ... MERGE, and then MATCH ... MATCH ... CREATE –

+0

Thanks, it works –

+0

@JeromeB Are you saying I should run a MERGE for the nodes on each side, and then create the relationship? –