# Neo4j Tutorial: A Practical Guide to Graph Query Language

Introduction

Welcome to the third installment of our Neo4j tutorial series! In this guide, we'll dive deep into Cypher, Neo4j's powerful query language designed specifically for working with graph databases. If you've been following along with our previous tutorials, you're now familiar with graph database concepts and the Neo4j platform. Now it's time to master the language that brings these graph databases to life.

Cypher is to Neo4j what SQL is to relational databases: a declarative query language in which you describe what you want to retrieve without specifying how to retrieve it. What makes Cypher distinctive is its ASCII-art syntax, which visually resembles the graph patterns it is meant to match.

Let's explore Cypher through practical examples, building on real-world scenarios that demonstrate its power and flexibility.

Getting Started with Cypher Syntax

Cypher's syntax is visual: nodes are written as parentheses () and relationships as arrows -[]->, so a query reads much like the pattern you would sketch on a whiteboard.

Basic Node and Relationship Patterns

// Basic pattern: Find a person named "John"
MATCH (p:Person {name: "John"})
RETURN p

// Pattern with relationship: Find John's friends
MATCH (p:Person {name: "John"})-[:FRIENDS_WITH]->(friend)
RETURN friend.name

In these examples:

() represents a node
:Person is a label that categorizes the node
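{name: "John"} is a property map that restricts the match to nodes whose name property equals "John"
-[:FRIENDS_WITH]-> represents a directed relationship of type FRIENDS_WITH, pointing from the node on the left to the node on the right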
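
As a further illustration of these building blocks, the sketch below drops the arrowhead so that the pattern matches a FRIENDS_WITH relationship in either direction, and binds the relationship to a variable so it can be returned. The variable name r is chosen here for illustration, and the query assumes the same hypothetical "John" node as the examples above.

```cypher
// Undirected match: relationships pointing away from John or toward him both qualify
MATCH (p:Person {name: "John"})-[r:FRIENDS_WITH]-(friend:Person)
RETURN friend.name, type(r)
```

Leaving out the direction is often a reasonable choice for symmetric concepts such as friendship, where the stored direction carries no meaning.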

Creating Data with Cypher

Let's start by creating a small social network dataset to work with throughout this tutorial.

Creating Nodes

// Create Person nodes
CREATE (alice:Person {name: "Alice", age: 32, occupation: "Data Scientist"})
CREATE (bob:Person {name: "Bob", age: 35, occupation: "Software Engineer"})
CREATE (charlie:Person {name: "Charlie", age: 28, occupation: "UX Designer"})
CREATE (diana:Person {name: "Diana", age: 41, occupation: "Project Manager"})
CREATE (edward:Person {name: "Edward", age: 25, occupation: "Data Analyst"})

// Create Interest nodes
CREATE (graphdb:Interest {name: "Graph Databases", category: "Technology"})
CREATE (cycling:Interest {name: "Cycling", category: "Sports"})
CREATE (cooking:Interest {name: "Cooking", category: "Hobby"})
CREATE (photography:Interest {name: "Photography", category: "Arts"})
CREATE (travel:Interest {name: "Travel", category: "Lifestyle"})

Creating Relationships

// Create friendship relationships
// Each statement ends with a semicolon so the block can be run as a multi-statement script
MATCH (alice:Person {name: "Alice"}), (bob:Person {name: "Bob"})
CREATE (alice)-[:FRIENDS_WITH {since: 2018}]->(bob);

MATCH (alice:Person {name: "Alice"}), (charlie:Person {name: "Charlie"})
CREATE (alice)-[:FRIENDS_WITH {since: 2020}]->(charlie);

MATCH (bob:Person {name: "Bob"}), (diana:Person {name: "Diana"})
CREATE (bob)-[:FRIENDS_WITH {since: 2015}]->(diana);

MATCH (charlie:Person {name: "Charlie"}), (diana:Person {name: "Diana"})
CREATE (charlie)-[:FRIENDS_WITH {since: 2019}]->(diana);

MATCH (diana:Person {name: "Diana"}), (edward:Person {name: "Edward"})
CREATE (diana)-[:FRIENDS_WITH {since: 2021}]->(edward);

// Create interest relationships
MATCH (alice:Person {name: "Alice"}), (graphdb:Interest {name: "Graph Databases"})
CREATE (alice)-[:INTERESTED_IN {level: "Expert"}]->(graphdb);

MATCH (alice:Person {name: "Alice"}), (cycling:Interest {name: "Cycling"})
CREATE (alice)-[:INTERESTED_IN {level: "Intermediate"}]->(cycling);

MATCH (bob:Person {name: "Bob"}), (graphdb:Interest {name: "Graph Databases"})
CREATE (bob)-[:INTERESTED_IN {level: "Beginner"}]->(graphdb);

MATCH (bob:Person {name: "Bob"}), (cooking:Interest {name: "Cooking"})
CREATE (bob)-[:INTERESTED_IN {level: "Advanced"}]->(cooking);

MATCH (charlie:Person {name: "Charlie"}), (photography:Interest {name: "Photography"})
CREATE (charlie)-[:INTERESTED_IN {level: "Expert"}]->(photography);

MATCH (diana:Person {name: "Diana"}), (travel:Interest {name: "Travel"})
CREATE (diana)-[:INTERESTED_IN {level: "Advanced"}]->(travel);

MATCH (diana:Person {name: "Diana"}), (cooking:Interest {name: "Cooking"})
CREATE (diana)-[:INTERESTED_IN {level: "Intermediate"}]->(cooking);

MATCH (edward:Person {name: "Edward"}), (graphdb:Interest {name: "Graph Databases"})
CREATE (edward)-[:INTERESTED_IN {level: "Beginner"}]->(graphdb);

MATCH (edward:Person {name: "Edward"}), (cycling:Interest {name: "Cycling"})
CREATE (edward)-[:INTERESTED_IN {level: "Advanced"}]->(cycling);
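
One way to confirm that the dataset loaded as intended is to count nodes per label and relationships per type. This is a minimal sketch that assumes the database was empty before the statements above were run; in that case it should report 5 Person and 5 Interest nodes, along with 5 FRIENDS_WITH and 9 INTERESTED_IN relationships.

```cypher
// Count nodes by their first label
MATCH (n)
RETURN labels(n)[0] AS Label, count(*) AS Nodes;

// Count relationships by type
MATCH ()-[r]->()
RETURN type(r) AS RelationshipType, count(*) AS Relationships;
```

If the counts come out higher, the creation statements were probably run more than once, since CREATE always adds new nodes and relationships.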

Querying Data with Cypher

Now that we have our dataset, let's explore different types of queries to extract valuable insights.

Basic READ Operations

Finding Nodes by Label and Properties

// Find all Person nodes
MATCH (p:Person)
RETURN p

// Find people over 30 years old
MATCH (p:Person)
WHERE p.age > 30
RETURN p.name, p.age, p.occupation

Finding Relationships

// Find who is friends with whom
MATCH (p1:Person)-[r:FRIENDS_WITH]->(p2:Person)
RETURN p1.name, p2.name, r.since

// Find Alice's friends
MATCH (p:Person {name: "Alice"})-[:FRIENDS_WITH]->(friend)
RETURN friend.name

Advanced Pattern Matching

Finding Friends of Friends

// Find friends of Alice's friends (who aren't directly connected to Alice)
MATCH (alice:Person {name: "Alice"})-[:FRIENDS_WITH]->(friend)-[:FRIENDS_WITH]->(fof)
WHERE NOT (alice)-[:FRIENDS_WITH]->(fof) AND alice <> fof
RETURN DISTINCT fof.name as FriendOfFriend

Finding Common Interests

// Find people who share interests with Alice
MATCH (alice:Person {name: "Alice"})-[:INTERESTED_IN]->(interest)<-[:INTERESTED_IN]-(other)
WHERE alice <> other
RETURN other.name as Person, interest.name as SharedInterest
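
A possible extension of the shared-interest query is to aggregate per person, so that each person appears once with a count and a list of the interests they share with Alice. The aliases SharedCount and SharedInterests are names chosen here for illustration.

```cypher
// Rank people by how many interests they share with Alice
MATCH (alice:Person {name: "Alice"})-[:INTERESTED_IN]->(interest)<-[:INTERESTED_IN]-(other)
WHERE alice <> other
RETURN other.name AS Person,
       count(interest) AS SharedCount,
       collect(interest.name) AS SharedInterests
ORDER BY SharedCount DESC
```

With the sample data as created earlier, Edward shares two interests with Alice (Graph Databases and Cycling) and Bob shares one (Graph Databases).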

Aggregations and Sorting

// Count each person's outgoing FRIENDS_WITH relationships (people with none are omitted)
MATCH (p:Person)-[:FRIENDS_WITH]->(friend)
RETURN p.name as Person, COUNT(friend) as NumberOfFriends
ORDER BY NumberOfFriends DESC

// Find the most popular interests
MATCH (i:Interest)<-[:INTERESTED_IN]-(p:Person)
RETURN i.name as Interest, COUNT(p) as Popularity
ORDER BY Popularity DESC
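
A plain MATCH only returns people who have at least one outgoing FRIENDS_WITH relationship. If people without any should be listed with a count of zero, one option is OPTIONAL MATCH, sketched below. It leaves friend as null when nothing matches, and count() ignores null values.

```cypher
// Count outgoing friendships per person, keeping people who have none
MATCH (p:Person)
OPTIONAL MATCH (p)-[:FRIENDS_WITH]->(friend)
RETURN p.name AS Person, count(friend) AS NumberOfFriends
ORDER BY NumberOfFriends DESC
```

In the sample data Edward has no outgoing friendships, so he appears here with a count of 0 rather than being dropped from the result.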

Updating Data with Cypher

Cypher is not just for querying data; it's also used for updating your graph.

Updating Node Properties

// Update Alice's age
MATCH (p:Person {name: "Alice"})
SET p.age = 33
RETURN p

// Add a new property to all Person nodes
MATCH (p:Person)
SET p.active = true
RETURN p

Creating New Relationships

// Create a new friendship between Alice and Diana
MATCH (alice:Person {name: "Alice"}), (diana:Person {name: "Diana"})
WHERE NOT (alice)-[:FRIENDS_WITH]->(diana)
CREATE (alice)-[:FRIENDS_WITH {since: 2022}]->(diana)
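
An alternative to guarding CREATE with a WHERE NOT check is the MERGE clause, which matches the pattern if it already exists and creates it otherwise. The sketch below merges on the relationship type only and sets the since property just when the relationship is newly created; r is a variable name chosen for illustration.

```cypher
// Create the friendship only if Alice is not already friends with Diana
MATCH (alice:Person {name: "Alice"}), (diana:Person {name: "Diana"})
MERGE (alice)-[r:FRIENDS_WITH]->(diana)
ON CREATE SET r.since = 2022
RETURN r
```

Running this statement repeatedly leaves a single relationship in place, which makes it convenient for scripts that may be executed more than once.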

Removing Properties and Relationships

// Remove the 'level' property from an INTERESTED_IN relationship
MATCH (alice:Person {name: "Alice"})-[r:INTERESTED_IN]->(i:Interest {name: "Cycling"})
REMOVE r.level
RETURN alice, r, i

// Delete a friendship
MATCH (bob:Person {name: "Bob"})-[r:FRIENDS_WITH]->(diana:Person {name: "Diana"})
DELETE r
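
Deleting a node works the same way, with one caveat: a plain DELETE fails while the node still has relationships attached. DETACH DELETE removes the node together with its relationships. The sketch below uses a hypothetical person named "Frank" who is not part of the sample data, so running it against the tutorial dataset changes nothing.

```cypher
// "Frank" is a hypothetical node used only for illustration
MATCH (p:Person {name: "Frank"})
DETACH DELETE p
```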

Advanced Cypher Techniques

Path Variables and Functions

// Find the shortest path between Alice and Edward
MATCH path = shortestPath((alice:Person {name: "Alice"})-[:FRIENDS_WITH*]-(edward:Person {name: "Edward"}))
RETURN path

// Return the names of the people along the shortest path and the path length
MATCH path = shortestPath((alice:Person {name: "Alice"})-[:FRIENDS_WITH*]-(edward:Person {name: "Edward"}))
RETURN [node in nodes(path) | node.name] as People, length(path) as PathLength

Working with Collections

// Collect all interests for each person
MATCH (p:Person)-[:INTERESTED_IN]->(i:Interest)
RETURN p.name as Person, COLLECT(i.name) as Interests

// Find people who are interested in ALL of these interests
MATCH (p:Person)
WHERE ALL(interest IN ["Graph Databases", "Cycling"] 
          WHERE (p)-[:INTERESTED_IN]->(:Interest {name: interest}))
RETURN p.name

Using CASE Expressions

// Categorize people by age group
MATCH (p:Person)
RETURN p.name, 
       CASE
         WHEN p.age < 30 THEN "Young Professional"
         WHEN p.age >= 30 AND p.age < 40 THEN "Mid-career"
         ELSE "Senior Professional"
       END AS AgeCategory
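
A CASE expression can also serve as a grouping key. The sketch below counts people per age category; the alias People is chosen here for illustration. Because CASE evaluates its branches in order, the second branch only needs to test the upper bound.

```cypher
// Count people in each age category
MATCH (p:Person)
RETURN CASE
         WHEN p.age < 30 THEN "Young Professional"
         WHEN p.age < 40 THEN "Mid-career"
         ELSE "Senior Professional"
       END AS AgeCategory,
       count(*) AS People
ORDER BY People DESC
```

With the sample ages, this gives two people in "Young Professional" (Charlie and Edward), two in "Mid-career" (Alice and Bob), and one in "Senior Professional" (Diana).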

Query Optimization in Cypher

As your graph grows, optimizing queries becomes important. Here are some techniques:

Using Indexes

// Create an index on Person name
CREATE INDEX person_name FOR (p:Person) ON (p.name)

// Create an index on Interest name
CREATE INDEX interest_name FOR (i:Interest) ON (i.name)
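
Two related commands can be useful when working with indexes. This sketch assumes a reasonably recent Neo4j version (4.x or later) in which both forms are available; older releases use different syntax.

```cypher
// Create the index only if an equivalent one does not already exist
CREATE INDEX person_name IF NOT EXISTS FOR (p:Person) ON (p.name);

// List the indexes defined in the current database
SHOW INDEXES;
```

The IF NOT EXISTS form is handy in setup scripts, since re-running a plain CREATE INDEX for an index that already exists raises an error.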

Query Profiling

// Profile a query to see execution plan
PROFILE MATCH (p:Person {name: "Alice"})-[:FRIENDS_WITH*1..3]-(other)
RETURN other.name
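
PROFILE executes the query in order to collect runtime statistics. To inspect the planned execution without running the query at all, the same statement can be prefixed with EXPLAIN instead, as in this sketch:

```cypher
// Show the execution plan without executing the query
EXPLAIN MATCH (p:Person {name: "Alice"})-[:FRIENDS_WITH*1..3]-(other)
RETURN other.name
```

This is a safer first step for queries that might be expensive, because no data is read or modified.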

Query Optimization Tips

Start with specific nodes: Begin your queries with the most specific node patterns to reduce the initial working set.
Use parameters: Instead of hardcoding values, use parameters for better query plan caching.
Limit relationship depth: Be careful with unbounded variable-length paths, as they can explore large portions of your graph.
Filter early: Apply WHERE clauses as early as possible in your query to reduce the amount of data processed.
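
The tips above can be combined in a single query. The sketch below is one possible arrangement rather than a definitive recipe: it anchors on a specific node, bounds the traversal depth, filters in the WHERE clause directly after the MATCH, and takes its values from parameters. The parameter names $name and $minAge are assumptions and must be supplied by the client.

```cypher
// Anchor on one specific person, supplied as a parameter
MATCH (p:Person {name: $name})-[:FRIENDS_WITH*1..3]-(other:Person)
// Filter immediately after the match, before any further processing
WHERE other.age > $minAge
RETURN DISTINCT other.name
```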

Practical Examples: Solving Real-World Problems

Let's apply what we've learned to solve some common graph problems:

Recommendation System

// Recommend new interests to Alice based on what her friends like
MATCH (alice:Person {name: "Alice"})-[:FRIENDS_WITH]->(friend)-[:INTERESTED_IN]->(interest)
WHERE NOT (alice)-[:INTERESTED_IN]->(interest)
RETURN interest.name as RecommendedInterest, COUNT(friend) as FriendsInterested
ORDER BY FriendsInterested DESC

Network Analysis

// Find the best-connected person (most distinct friends) within 2 steps of Alice
// The second MATCH is separate so the relationship used to reach a person still counts
MATCH (alice:Person {name: "Alice"})-[:FRIENDS_WITH*1..2]-(person)
WITH DISTINCT person
MATCH (person)-[:FRIENDS_WITH]-(connection)
RETURN person.name, COUNT(DISTINCT connection) as Connections
ORDER BY Connections DESC
LIMIT 1

Pattern Detection

// Find triangles of friendship (groups of 3 mutual friends)
MATCH (p1:Person)-[:FRIENDS_WITH]-(p2:Person)-[:FRIENDS_WITH]-(p3:Person)-[:FRIENDS_WITH]-(p1)
WHERE p1.name < p2.name AND p2.name < p3.name  // To avoid duplicate results
RETURN p1.name, p2.name, p3.name

Best Practices for Writing Cypher Queries

Start small and build up: Begin with simple patterns and gradually add complexity.
Use meaningful aliases: Choose descriptive variable names that make your queries readable.
Test with LIMIT: When developing queries for large datasets, use LIMIT to test with a smaller result set first.
Comment your queries: Add comments to explain complex logic, especially for queries that will be reused.
Use parameters: Instead of hardcoding values, use parameters for better security and performance:

// Using parameters
MATCH (p:Person)
WHERE p.age > $minAge
RETURN p.name

Keep queries focused: Each query should have a single, clear purpose rather than trying to do too much at once.
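
To try the parameterized query interactively, the parameter needs a value first. The sketch below assumes Neo4j Browser, where the :param client command sets a parameter for the session; this command is a feature of the client rather than part of Cypher, and drivers and other tools pass parameters in their own way. The value 30 is an arbitrary example.

```cypher
// Run first, as its own command: give $minAge a value for this session
:param minAge => 30

// Then run the query that reads the parameter
MATCH (p:Person)
WHERE p.age > $minAge
RETURN p.name
```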

Conclusion

Cypher is a powerful, expressive language that makes working with graph data intuitive and efficient. By mastering Cypher, you unlock the full potential of Neo4j and graph databases, enabling you to solve complex connected data problems with elegant, readable queries.

This tutorial has covered the fundamentals of Cypher syntax as well as advanced techniques, all through practical examples. As you continue your Neo4j journey, experiment with these patterns on your own data and explore how the expressiveness of Cypher can simplify your most complex data challenges.