Advanced Cypher-3

 Neo4j’s Cypher has NULL handling rules that are very important to understand.

Unlike SQL, Cypher does not allow direct comparisons with NULL (like = NULL). Instead, it has special operators.


🔹 1. What is NULL in Cypher?

  • NULL = "unknown" or "missing value".

  • It propagates: if any part of an expression is NULL, the result is usually NULL.

Example:

RETURN 5 + NULL AS result

✅ Output: NULL


🔹 2. Filtering with NULL

You cannot do:

WHERE u.age = NULL // ❌ Wrong

Instead use:

  • IS NULL

MATCH (u:User) WHERE u.age IS NULL RETURN u.name

👉 Finds users with no age property.

  • IS NOT NULL

MATCH (u:User) WHERE u.age IS NOT NULL RETURN u.name, u.age

👉 Finds users where age exists.


🔹 3. NULL in Properties

If a property does not exist → it behaves like NULL.

Example:

MATCH (u:User) RETURN u.name, u.role

If role is not set, it will return NULL.


🔹 4. Functions that handle NULLs

  • coalesce() → returns the first non-NULL value.

MATCH (u:User) RETURN u.name, coalesce(u.role, "Unknown") AS role

👉 If role is missing, shows "Unknown".

  • exists() / u.prop IS NOT NULL → check property existence.

MATCH (u:User) WHERE exists(u.email) // Neo4j < 4.0 RETURN u
MATCH (u:User) WHERE u.email IS NOT NULL // Neo4j 4.0+ RETURN u

🔹 5. NULL in Aggregations

Aggregations ignore NULLs.

Example:

MATCH (u:User) RETURN avg(u.age) AS avg_age

👉 Only considers users with age, ignores NULL.


🔹 6. Example Dataset

Suppose:

CREATE (:User {name:"Alice", age:25}), (:User {name:"Bob"}), (:User {name:"Charlie", age:30})

Queries:

MATCH (u:User) RETURN u.name, u.age

✅ Output:

nameage
Alice25
BobNULL
Charlie30

✅ Key Takeaways

  • Use IS NULL / IS NOT NULL (not = NULL).

  • NULL propagates → most operations with NULL return NULL.

  • Use coalesce() to provide default values.

  • Aggregations ignore NULLs automatically.

 Let’s go into performance optimization in Neo4j using Indexes & Constraints. These are the backbone of making queries fast and consistent.


🔹 1. Why Indexes & Constraints?

  • Without indexes → Neo4j scans all nodes/relationships to find matches.

  • With indexes → Neo4j can quickly lookup nodes/relationships by property values.

  • Constraints ensure data integrity (uniqueness, existence, etc.) and often create backing indexes automatically.


🔹 2. Indexes in Neo4j

a) Property Index

Used to speed up lookups on properties.

// Create index on single property CREATE INDEX user_name_index FOR (u:User) ON (u.name);

👉 Now queries like:

MATCH (u:User {name:"Alice"}) RETURN u;

will be index-backed instead of scanning all User nodes.


b) Composite Index

When you often filter with multiple properties together.

CREATE INDEX user_composite_index FOR (u:User) ON (u.firstName, u.lastName);

👉 Used for queries like:

MATCH (u:User {firstName:"Alice", lastName:"Smith"}) RETURN u;

c) Fulltext Index

For text search (contains, starts with, fuzzy match).

CREATE FULLTEXT INDEX user_fulltext_index FOR (u:User) ON EACH [u.name, u.email];

Query:

CALL db.index.fulltext.queryNodes("user_fulltext_index", "Ali*") YIELD node, score RETURN node.name, score;

🔹 3. Constraints in Neo4j

Constraints help maintain data integrity and often create indexes automatically.

a) Uniqueness Constraint

Ensures property is unique (like a primary key).

CREATE CONSTRAINT user_name_unique FOR (u:User) REQUIRE u.email IS UNIQUE;

👉 Prevents inserting two users with same email.


b) Existence Constraint

Ensures a property must exist.

CREATE CONSTRAINT user_email_exists FOR (u:User) REQUIRE u.email IS NOT NULL;

👉 Every User must have an email.


c) Node Key Constraint

Ensures composite uniqueness.

CREATE CONSTRAINT user_composite_unique FOR (u:User) REQUIRE (u.firstName, u.lastName) IS UNIQUE;

👉 No two users can share the same (firstName, lastName).


🔹 4. Checking & Dropping Indexes/Constraints

  • Check:

SHOW INDEXES; SHOW CONSTRAINTS;
  • Drop:

DROP INDEX user_name_index; DROP CONSTRAINT user_name_unique;

🔹 5. Example Dataset & Performance

Suppose we create 1M users:

UNWIND range(1,1000000) AS id CREATE (:User {id:id, name:"User"+id, email:"user"+id+"@mail.com"});
  • Without index:

MATCH (u:User {email:"user999999@mail.com"}) RETURN u;

👉 Full scan → very slow.

  • With unique index on email:

CREATE CONSTRAINT user_email_unique FOR (u:User) REQUIRE u.email IS UNIQUE;

👉 Same query runs in milliseconds.


🔹 6. Best Practices

  • Always index properties used in MATCH / WHERE.

  • Use composite indexes when filtering on multiple properties together.

  • Use unique constraints for identifiers (like id, email).

  • Don’t over-index → each index slows down CREATE/UPDATE operations slightly.


✅ So, indexes = performance optimization,
constraints = data correctness + sometimes free indexes.

Comments