Importing & Exporting Data (Part 1)


📥 Loading Data from CSV in Neo4j

LOAD CSV lets you import external CSV files into Neo4j and turn them into nodes and relationships.


1. Basic Syntax

LOAD CSV FROM 'file:///filename.csv' AS row
RETURN row
  • file:/// → the file must be in Neo4j’s import directory (default: <neo4j_home>/import).

  • row → each CSV row is returned as a list of strings (with WITH HEADERS, it becomes a map keyed by column name).
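
A quick sketch of both access styles (using the users.csv shown below):

LOAD CSV FROM 'file:///users.csv' AS row
RETURN row[1] AS name;   // positional access; the header row comes back as data too

LOAD CSV WITH HEADERS FROM 'file:///users.csv' AS row
RETURN row.name AS name; // named access via the header row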


2. Example CSV

📄 users.csv

id,name,age
1,Alice,30
2,Bob,25
3,Charlie,35

3. Load and Create Nodes

LOAD CSV WITH HEADERS FROM 'file:///users.csv' AS row
CREATE (:User {id: toInteger(row.id), name: row.name, age: toInteger(row.age)});

✅ Creates 3 User nodes:

(:User {id: 1, name: "Alice", age: 30})
(:User {id: 2, name: "Bob", age: 25})
(:User {id: 3, name: "Charlie", age: 35})
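
A quick way to verify the load:

MATCH (u:User) RETURN count(u) AS users;  // expect 3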

4. Load Relationships

📄 friends.csv

from,to
1,2
2,3
1,3

Query:

LOAD CSV WITH HEADERS FROM 'file:///friends.csv' AS row
MATCH (u1:User {id: toInteger(row.from)})
MATCH (u2:User {id: toInteger(row.to)})
CREATE (u1)-[:FRIEND_WITH]->(u2);

✅ Creates relationships:

(Alice)-[:FRIEND_WITH]->(Bob)
(Bob)-[:FRIEND_WITH]->(Charlie)
(Alice)-[:FRIEND_WITH]->(Charlie)
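
To inspect them:

MATCH (u1:User)-[:FRIEND_WITH]->(u2:User)
RETURN u1.name, u2.name;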

5. MERGE Instead of CREATE (avoid duplicates)

LOAD CSV WITH HEADERS FROM 'file:///users.csv' AS row
MERGE (u:User {id: toInteger(row.id)})
SET u.name = row.name, u.age = toInteger(row.age);

👉 Ensures nodes are not duplicated if the CSV is loaded again.
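
MERGE is faster and safer with a uniqueness constraint on the merge key. A minimal sketch (Neo4j 4.4+ syntax; the constraint name user_id is arbitrary):

CREATE CONSTRAINT user_id IF NOT EXISTS
FOR (u:User) REQUIRE u.id IS UNIQUE;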


6. Handling Large CSVs

  • Use USING PERIODIC COMMIT to commit in batches:

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///bigfile.csv' AS row
...

This commits every 1000 rows → prevents memory issues.
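
Note: USING PERIODIC COMMIT was deprecated in Neo4j 4.4 and removed in 5.x. The replacement is CALL { ... } IN TRANSACTIONS; a minimal sketch of the same batching (run as an implicit transaction, e.g. with :auto in Browser):

LOAD CSV WITH HEADERS FROM 'file:///bigfile.csv' AS row
CALL {
  WITH row
  CREATE (:User {id: toInteger(row.id), name: row.name})
} IN TRANSACTIONS OF 1000 ROWS;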


7. Extra Tricks

  • Skip empty values:

WITH row WHERE row.name IS NOT NULL
  • Split list values from CSV:

SET u.skills = split(row.skills, ";")

(CSV: "Python;Neo4j;SQL" → stored as ["Python","Neo4j","SQL"])
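
Putting the tricks together, a sketch that assumes users.csv also carries a skills column:

LOAD CSV WITH HEADERS FROM 'file:///users.csv' AS row
WITH row WHERE row.name IS NOT NULL        // drop rows with no name
MERGE (u:User {id: toInteger(row.id)})
SET u.name   = row.name,
    u.skills = split(row.skills, ';');     // "Python;Neo4j;SQL" → list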


⚡ Summary

  • LOAD CSV FROM ... AS row → read file.

  • Use WITH HEADERS if CSV has headers.

  • Convert data types using toInteger(), toFloat().

  • Use MERGE (not CREATE) to prevent duplicates.

  • For large imports → USING PERIODIC COMMIT (replaced by CALL { ... } IN TRANSACTIONS in Neo4j 5+).

📥 Importing JSON / APIs into Neo4j

1. Using APOC Procedures (apoc.load.json)

👉 APOC (Awesome Procedures on Cypher) is an extension library for Neo4j.
It adds support for loading JSON directly from files or APIs.
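
Note: loading from local files requires enabling file access in the APOC configuration first:

apoc.import.file.enabled=true

(set in apoc.conf; older APOC releases read it from neo4j.conf).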


Example JSON file

📄 users.json

[
  { "id": 1, "name": "Alice", "age": 30 },
  { "id": 2, "name": "Bob", "age": 25 }
]

Load JSON from file

CALL apoc.load.json("file:///users.json") YIELD value
CREATE (:User {id: value.id, name: value.name, age: value.age});

✅ Creates nodes:

(:User {id: 1, name: "Alice", age: 30})
(:User {id: 2, name: "Bob", age: 25})

2. Import from a REST API

If the API returns JSON, you can fetch it directly:

CALL apoc.load.json("https://jsonplaceholder.typicode.com/users") YIELD value
CREATE (:User {id: value.id, name: value.name, email: value.email});

✅ Imports users from a public API.


3. Handling Nested JSON

📄 Example JSON:

{
  "id": 1,
  "name": "Alice",
  "projects": [
    {"title": "Graph DB", "year": 2023},
    {"title": "AI System", "year": 2024}
  ]
}

Cypher:

CALL apoc.load.json("file:///user_projects.json") YIELD value
MERGE (u:User {id: value.id})
SET u.name = value.name
UNWIND value.projects AS proj
MERGE (p:Project {title: proj.title})
MERGE (u)-[:WORKS_ON {year: proj.year}]->(p);

✅ Creates:

(Alice)-[:WORKS_ON {year: 2023}]->(Graph DB)
(Alice)-[:WORKS_ON {year: 2024}]->(AI System)

4. JSON from Parameters

You can also pass JSON into a query as a parameter (e.g., from a Python/Java app):

WITH $json AS data
UNWIND data AS row
CREATE (:User {id: row.id, name: row.name, age: row.age});

And from your driver (Python example):

session.run("""
    WITH $json AS data
    UNWIND data AS row
    CREATE (:User {id: row.id, name: row.name, age: row.age})
""", json=[{"id": 1, "name": "Alice", "age": 30},
           {"id": 2, "name": "Bob", "age": 25}])

5. Best Practices

  • Use MERGE instead of CREATE for re-runs.

  • Use UNWIND to handle arrays.

  • For huge JSON → break into chunks before loading.

  • Secure API calls → pass headers (APOC supports this via apoc.load.jsonParams; see the sketch below).
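
A sketch of an authenticated call with apoc.load.jsonParams (the URL and the $token parameter are placeholders):

CALL apoc.load.jsonParams(
  "https://api.example.com/users",
  {Authorization: "Bearer " + $token},
  null
) YIELD value
CREATE (:User {id: value.id, name: value.name});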


Summary

  • CSV → LOAD CSV

  • JSON/API → apoc.load.json or apoc.load.jsonParams

  • Nested JSON → UNWIND arrays and create relationships

  • From code → pass JSON params into Cypher

📦 Bulk Imports with neo4j-admin import

1. When to use

  • Fresh database (not an existing one).

  • Loading very large datasets.

  • Input data is available in CSV format.
    ❌ Cannot be run against a running database.


2. Basic Command

neo4j-admin import \
  --database=graph.db \
  --nodes=import/users.csv \
  --nodes=import/projects.csv \
  --relationships=import/works_on.csv

✅ This creates a new database named graph.db with data from CSV files.
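
On Neo4j 5.x the admin CLI was restructured; the equivalent command is roughly (database name neo4j used as an example):

neo4j-admin database import full neo4j \
  --nodes=import/users.csv \
  --nodes=import/projects.csv \
  --relationships=import/works_on.csv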


3. CSV Format Requirements

  • Header row defines columns.

  • Use :ID, :LABEL, :START_ID, :END_ID, and :TYPE.

  • Properties are just column names.


Example: Users

📄 users.csv

userId:ID,name,age:int,:LABEL
1,Alice,30,User
2,Bob,25,User

Example: Projects

📄 projects.csv

projectId:ID,title,year:int,:LABEL
10,Graph DB,2023,Project
20,AI System,2024,Project

Example: Relationships

📄 works_on.csv

:START_ID,:END_ID,role,:TYPE
1,10,Developer,WORKS_ON
1,20,Lead,WORKS_ON
2,10,Tester,WORKS_ON

4. Run Import

neo4j-admin import \
  --database=graph.db \
  --nodes=users.csv \
  --nodes=projects.csv \
  --relationships=works_on.csv

✅ Resulting graph:

(:User {userId: "1", name: "Alice", age: 30})-[:WORKS_ON {role: "Developer"}]->(:Project {projectId: "10", title: "Graph DB", year: 2023})
(:User {userId: "1", name: "Alice", age: 30})-[:WORKS_ON {role: "Lead"}]->(:Project {projectId: "20", title: "AI System", year: 2024})
(:User {userId: "2", name: "Bob", age: 25})-[:WORKS_ON {role: "Tester"}]->(:Project {projectId: "10", title: "Graph DB", year: 2023})

(ID columns keep their header names as property names; their values are imported as strings unless --id-type=INTEGER is set.)
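
Once the database is started, a quick sanity check:

MATCH (u:User)-[r:WORKS_ON]->(p:Project)
RETURN u.name, r.role, p.title;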

5. Advanced Options

  • --high-io → optimize for fast disks (more parallel I/O).

  • --delimiter="|" → custom CSV delimiter.

  • --array-delimiter=";" → delimiter for list-valued properties.

  • --skip-bad-relationships=true → skip relationships that reference missing node IDs instead of aborting.


6. Best Practices

  • Always import into an empty database.

  • Prepare clean CSVs (no missing IDs).

  • Use integer IDs for speed.

  • After import, create indexes & constraints for query performance (see the sketch below).
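
A minimal post-import sketch (Neo4j 4.4+ syntax; names mirror the CSVs above):

CREATE CONSTRAINT user_id IF NOT EXISTS
FOR (u:User) REQUIRE u.userId IS UNIQUE;

CREATE INDEX project_title IF NOT EXISTS
FOR (p:Project) ON (p.title);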


Summary

  • neo4j-admin import = fastest way for first-time bulk loading.

  • Needs structured CSV files with :ID, :START_ID, etc.

  • Great for millions or billions of records.

  • Not for updates → only for new databases.
