Expert Level – Becoming a Neo4j Power User

When running Neo4j in production (especially on Kubernetes/OpenShift), you’ll want to know where your queries and the database spend their time. That’s where performance profiling comes in.

Here’s a clear guide:


⚡ Performance Profiling in Neo4j (Production)

Neo4j provides multiple tools to monitor, debug, and optimize query/database performance.


1. Use PROFILE and EXPLAIN

For individual queries (usually in dev/staging, but useful for prod debugging):

EXPLAIN MATCH (p:Person)-[:KNOWS]->(f) RETURN p, f;
PROFILE MATCH (p:Person)-[:KNOWS]->(f) RETURN p, f;
  • EXPLAIN → shows query plan without running it.

  • PROFILE → runs query and shows actual db hits, rows, time.

✅ Use this to identify missing indexes, unnecessary joins, or Cartesian products.


2. Enable Query Logging

In neo4j.conf:

dbms.logs.query.enabled=true
dbms.logs.query.threshold=1000ms                  # log slow queries (>1s)
dbms.logs.query.parameter_logging_enabled=true
  • Logs stored in $NEO4J_HOME/logs/query.log.

  • Forward logs to OpenShift’s EFK/Loki stack for monitoring.

👉 Helps you find “slow queries” in production without enabling PROFILE everywhere.
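The slow-query log can also be scanned programmatically before you wire up a full log pipeline. A minimal sketch, assuming a simplified `query.log` line layout where the elapsed time appears as `<n> ms:` before the query text (real entries carry more fields and vary by Neo4j version, so treat the pattern as a placeholder to adapt):

```python
import re

# Hypothetical, simplified pattern: elapsed time as "<n> ms:" followed by the query.
ENTRY = re.compile(r"(\d+) ms: (.+)$")

def slow_queries(lines, threshold_ms=1000):
    """Yield (elapsed_ms, query_text) for log entries at or above threshold_ms."""
    for line in lines:
        m = ENTRY.search(line)
        if m and int(m.group(1)) >= threshold_ms:
            yield int(m.group(1)), m.group(2)

sample = [
    "2025-08-25 10:00:00.000+0000 INFO 250 ms: MATCH (n) RETURN count(n)",
    "2025-08-25 10:00:05.000+0000 INFO 4200 ms: MATCH (u:User)-[:FOLLOWS*]->(p) RETURN p",
]
print(list(slow_queries(sample)))  # only the 4200 ms entry survives
```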


3. Metrics & Monitoring (Prometheus/Grafana)

Neo4j exposes metrics (via JMX and a Prometheus endpoint) → can be scraped by Prometheus.

Enable metrics in neo4j.conf:

metrics.enabled=true
metrics.prometheus.enabled=true
metrics.prometheus.endpoint=0.0.0.0:2004

Then in Kubernetes:

  • Expose port 2004.

  • Configure Prometheus scrape target.

  • Use Grafana dashboards (Neo4j provides prebuilt ones).

📊 Key metrics to monitor:

  • Query execution time

  • Transaction commits/rollbacks

  • Page cache hit ratio

  • JVM heap/memory usage

  • Bolt/HTTP connections
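The page cache hit ratio in particular is a single number worth alerting on: hits / (hits + misses) over the counters Neo4j exposes. A minimal helper (the exact counter names vary by Neo4j version, so the inputs here are placeholders you'd fill from your scrape):

```python
def page_cache_hit_ratio(hits: int, misses: int) -> float:
    """Hit ratio = hits / (hits + misses); 1.0 means the hot graph fits in the page cache."""
    total = hits + misses
    return hits / total if total else 1.0

# Deltas between two scrapes of the (hypothetical) pagecache hit/miss counters:
ratio = page_cache_hit_ratio(hits=9_500_000, misses=500_000)
print(f"{ratio:.2%}")  # 95.00%
```

If this drops well below your baseline, the usual first lever is growing `dbms.memory.pagecache.size` (see the next section).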


4. Page Cache & Memory Tuning

Neo4j performance depends heavily on the page cache (which caches the on-disk graph store in memory).

📌 In neo4j.conf:

dbms.memory.heap.initial_size=2g
dbms.memory.heap.max_size=4g
dbms.memory.pagecache.size=8g    # ~50-70% of RAM (minus heap + OS)

👉 Tune based on:

  • Graph size (on disk).

  • Available memory in the pod.

  • Workload type (OLTP vs analytics).
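The split above can be sketched as a rule-of-thumb calculator (the numbers follow this section's guidance and are starting points only, not a substitute for `neo4j-admin memrec`):

```python
def suggest_memory(pod_ram_gb: float, heap_gb: float, os_reserve_gb: float = 2.0) -> dict:
    """Rule of thumb from this section: reserve a slice for the OS, fix the heap,
    and hand most of the remainder to the page cache."""
    remainder = pod_ram_gb - heap_gb - os_reserve_gb
    if remainder <= 0:
        raise ValueError("pod too small for this heap + OS reserve")
    return {
        "heap_gb": heap_gb,
        "pagecache_gb": round(remainder * 0.9, 1),  # leave ~10% slack
        "os_reserve_gb": os_reserve_gb,
    }

print(suggest_memory(pod_ram_gb=16, heap_gb=4))
# {'heap_gb': 4, 'pagecache_gb': 9.0, 'os_reserve_gb': 2.0}
```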


5. Connection & Thread Tuning

  • Configure Bolt thread pools if handling lots of clients:

dbms.threads.worker_count=8
  • Use connection limits to avoid overload:

dbms.connector.bolt.thread_pool_max_size=400
dbms.connector.bolt.thread_pool_keep_alive=5m

6. Use CALL dbms.listQueries for Live Profiling

In production, you can run:

CALL dbms.listQueries()
YIELD query, elapsedTimeMillis, status
RETURN * ORDER BY elapsedTimeMillis DESC;
  • See which queries are running “right now.”

  • Identify long-running queries.

  • You can even kill them:

CALL dbms.killQuery('<queryId>');
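If you poll `dbms.listQueries()` from an application, the triage step is just a filter over the returned rows. A sketch over plain dicts shaped like the YIELDed columns above, so it can be exercised without a live database (field names mirror the procedure's output; the `"running"` status value is an assumption to check against your version):

```python
def long_running(rows, threshold_ms=10_000):
    """Return still-running queries at or over the threshold, slowest first.
    Each row mimics a dbms.listQueries() record: query, elapsedTimeMillis, status."""
    hot = [r for r in rows
           if r["status"] == "running" and r["elapsedTimeMillis"] >= threshold_ms]
    return sorted(hot, key=lambda r: r["elapsedTimeMillis"], reverse=True)

rows = [
    {"query": "MATCH (n) RETURN count(n)", "elapsedTimeMillis": 120, "status": "running"},
    {"query": "MATCH (u)-[:FOLLOWS*]->(p) RETURN p", "elapsedTimeMillis": 45_000, "status": "running"},
]
for r in long_running(rows):
    print(r["query"])  # candidates for dbms.killQuery
```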

7. APOC & Profiling

APOC has extra procedures:

CALL apoc.monitor.tx();      // transaction stats
CALL apoc.monitor.kernel();  // JVM + thread info
CALL apoc.monitor.store();   // store sizes

Great for runtime diagnostics inside the cluster.


8. OpenShift/Kubernetes Integration

  • Use sidecar containers (like JMX Exporter or Prometheus agent) to collect JVM/Neo4j metrics.

  • Send metrics + logs → OpenShift monitoring stack.

  • Use K8s HPA (Horizontal Pod Autoscaler) if Bolt/HTTP load spikes.

  • Use Resource Limits to prevent JVM OOM kills:

resources:
  requests:
    memory: "8Gi"
    cpu: "2"
  limits:
    memory: "12Gi"
    cpu: "4"

Summary (Best Practices for Profiling in Prod)

  1. Use PROFILE/EXPLAIN in staging/dev → optimize queries.

  2. Enable query logging with slow query threshold.

  3. Monitor with Prometheus + Grafana.

  4. Tune page cache & heap memory.

  5. Use dbms.listQueries + APOC monitoring for live debugging.

  6. Integrate with OpenShift observability stack.

Scaling Neo4j to billions of nodes/relationships is absolutely doable, but you need the right model, ingest pattern, hardware, and cluster topology. Here’s a practical blueprint you can follow.


1) Model for scale (before you buy hardware)

Start from an indexed node.

  • Every heavy query should begin with an index-backed lookup (NodeIndexSeek), not a graph-wide scan.

  • Add unique IDs and composite indexes for your main entry points.

    CREATE CONSTRAINT user_id_unique FOR (u:User) REQUIRE u.id IS UNIQUE;
    CREATE CONSTRAINT order_id_unique FOR (o:Order) REQUIRE o.id IS UNIQUE;
    CREATE INDEX user_name_city FOR (u:User) ON (u.name, u.city);

Avoid supernodes.

  • Break “celebrity” fan-outs into buckets or time partitions.

    (:Post)<-[:IN_BUCKET]-(:LikeBucket {day:"2025-08-25"})<-[:LIKES]-(:User)
  • Or use intermediate entities when relationships carry many attributes (e.g., (:User)-[:MADE]->(:Tx)-[:FOR]->(:Product)).
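The day-bucket pattern above needs a deterministic bucket key on the write path, so every like for the same day lands on the same `LikeBucket` node. A minimal sketch using UTC day keys, matching the `day` property in the example:

```python
from datetime import datetime, timezone

def like_bucket_day(epoch_seconds: int) -> str:
    """Deterministic UTC day-bucket key for the LikeBucket pattern above."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime("%Y-%m-%d")

# Each like MERGEs its bucket by this key, capping fan-out per bucket node:
print(like_bucket_day(1_756_080_000))  # 2025-08-25
```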

Bound your traversals.

  • Prefer [:REL*1..3] to unbounded [:REL*].

  • Use relationship type + direction filters aggressively.

Denormalize judiciously.

  • Add “shortcut” relationships for very common multi-hop questions (e.g., :FRIEND_OF_FRIEND) built by batch jobs.


2) Ingest at scale (millions → billions)

Cold load (fastest): neo4j-admin import into a new DB.

  • Use CSV with :ID, :START_ID, :END_ID, :TYPE.

  • Then create indexes/constraints after the import.

Warm/continuous load: batch writes, not row-by-row.

  • Use parameterized UNWIND + index-backed MERGE.

  • For large flows, use APOC periodic iterate (server-side batching):

    CALL apoc.periodic.iterate(
      "CALL apoc.load.csv('file:///edges.csv') YIELD map RETURN map",
      "MERGE (a:User {id: toInteger(map.from)})
       MERGE (b:User {id: toInteger(map.to)})
       MERGE (a)-[:FOLLOWS]->(b)",
      {batchSize: 20000, parallel: true}
    )
  • From streams (Kafka/etc.), buffer and send bulk batches (hundreds–thousands per tx).

Make MERGE cheap.

  • Always MERGE on indexed keys. Avoid MERGE on non-indexed properties.
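The “batch writes, not row-by-row” rule boils down to chunking the incoming stream into `UNWIND`-sized payloads, one transaction per chunk. A minimal, driver-agnostic sketch of that chunking step:

```python
from itertools import islice

def batches(rows, size=5_000):
    """Chunk an iterable of rows into lists suitable for one
    'UNWIND $batch AS row ...' write transaction each."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk

# e.g. 12,345 incoming rows -> two full batches of 5,000 and one of 2,345,
# each sent as a single parameterized write transaction.
sizes = [len(b) for b in batches(range(12_345))]
print(sizes)  # [5000, 5000, 2345]
```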


3) Cluster topology for billions

Causal Cluster (Enterprise):

  • 1–3 Core members (write quorum, durability).

  • N Read Replicas for scale-out reads & analytics.

  • Run heavy analytics on read replicas (keeps writers snappy).

Fabric (sharding across graphs):

  • Partition by tenant, time (monthly/yearly), or domain.

  • Keep cross-shard queries rare; query the right shard with USE.

    USE customer_2025_08
    MATCH (u:User {id:$id})-[:BOUGHT]->(o:Order)
    RETURN o
  • Good pattern: hot recent data in one shard, historical in time shards.


4) Hardware & storage planning

Disks: NVMe SSDs with high IOPS; XFS/ext4; no network HDDs for production.

RAM sizing (rule of thumb):

  • Page cache ≈ size of the hot portion of the store (try 50–70% of node’s RAM after heap).

  • Heap sized for query concurrency & complexity (e.g., OLTP 2–8 GB; complex aggregations need more).

  • Use the memory advisor:

    neo4j-admin memrec

    (Run against your DB to get starting recommendations.)

CPU: Fewer, faster cores often beat many slow cores for OLTP. Scale reads via replicas.

Kubernetes/OpenShift tips:

  • Use StatefulSets, node/pod anti-affinity, local NVMe where possible.

  • Pin pods to “storage-strong” nodes (node selectors/taints).

  • Request/limit memory carefully to avoid OOM kills.


5) Query patterns that stay fast at billion scale

  • Start from an index, immediately reduce with WHERE, then traverse.

  • Early LIMIT + WITH to cut rows before expanding further.

  • Avoid accidental Cartesian products (watch PROFILE plan).

  • Use EXPLAIN/PROFILE; look for NodeIndexSeek, not AllNodesScan.

  • Aggregate with COLLECT + size() carefully; prefer streaming results when possible.

Examples:

// Good: index seek -> narrow -> expand (bounded)
MATCH (u:User {id:$id})-[:FOLLOWS*1..2]->(p:User)
WHERE p.country = $country
RETURN p.id LIMIT 100;

// Avoid this (scan + unbounded expansion)
MATCH (u:User)-[:FOLLOWS*]->(p:User)
WHERE u.id = $id
RETURN p;

6) Managing dense nodes

Neo4j automatically groups relationships by type+direction once a node becomes dense.
Design to filter by type + direction so the engine can hop buckets efficiently.
(You rarely need to tweak internal density thresholds; fix via modeling/bucketing instead.)


7) Index & constraint strategy

  • Unique ID constraints on all identity nodes (User, Product, Order…).

  • Composite indexes for common multi-key filters.

  • Fulltext indexes for search-like use (names, descriptions) and then anchor from results to graph hops.

Fulltext example:

CREATE FULLTEXT INDEX prod_search FOR (p:Product) ON EACH [p.name, p.brand];

CALL db.index.fulltext.queryNodes('prod_search', 'galaxy~') YIELD node, score
WITH node
MATCH (node)<-[:BOUGHT]-(u:User {id:$uid})
RETURN node LIMIT 10;

8) Analytics at scale

  • Use GDS (Graph Data Science) with graph projections (in-memory) on read replicas.

  • Project only the subgraph you need (labels, rel types, properties) to fit memory.

  • Persist results back as properties/relationships for fast OLTP re-use.


9) Observability & guardrails

  • Slow query log (threshold) to surface hotspots:

    dbms.logs.query.enabled=true
    dbms.logs.query.threshold=1000ms
    dbms.logs.query.parameter_logging_enabled=true
  • Prometheus/Grafana for page cache hit ratio, heap, GC, tx rates, connection counts.

  • Backups (online) and checkpoints tuned for write volume. Keep tx logs on fast disk.


10) Growth playbook (what to do when…)

  • Ingest is the bottleneck: increase batch size, parallel writers (to distinct keys), ensure MERGE keys are indexed, consider cold import for backfills.

  • Reads are hot: add read replicas, add/adjust indexes, denormalize hotspots, cache results in app layer if stable.

  • Queries blow up: reduce traversal depth, add shortcut edges, precompute aggregates into relationship properties/nodes.

  • Store too big for RAM: adopt Fabric time/tenant sharding; move cold data to separate shards; place analytics on replicas.


Quick reference snippets

High-throughput MERGE with parameters

UNWIND $batch AS row
MERGE (u:User {id: row.uid})
  ON CREATE SET u.createdAt = timestamp()
SET u.name = row.name;

Batch creation of relationships

UNWIND $edges AS e
MATCH (a:User {id: e.from}), (b:User {id: e.to})
MERGE (a)-[:FOLLOWS]->(b);

Periodic iterate (from a query)

CALL apoc.periodic.iterate(
  "MATCH (u:User) RETURN u",
  "WITH u MATCH (u)-[:BOUGHT]->(o) SET o.popular = true",
  {batchSize: 5000, parallel: true}
)

Bottom line

  • Model first, keep traversals bounded, and always start from an index.

  • Ingest in batches; use neo4j-admin import for first loads.

  • Scale reads with replicas; shard with Fabric for extreme sizes.

  • Tune memory (page cache vs heap), use fast NVMe, and monitor everything.
