Quick Start
Get up and running with Duffy in 5 minutes.
Prerequisites
- Python 3.10+
- PostgreSQL with Apache AGE and pgvector extensions installed
Install
Connect
The graph parameter of connect() sets the default graph and creates it if it doesn't exist. You can also connect without a default graph and set it later:
Create nodes and relationships
# Create nodes
db.cypher("CREATE (:Person {name: 'Alice', age: 30})")
db.cypher("CREATE (:Person {name: 'Bob', age: 25})")
# Create a relationship
db.cypher("""
MATCH (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'})
CREATE (a)-[:KNOWS {since: 2020}]->(b)
""")
db.commit()
Query the graph
# Cypher query → DataFrame
df = db.cypher("MATCH (n:Person) RETURN n.name, n.age").to_df()
print(df)
#   n.name  n.age
# 0  Alice     30
# 1    Bob     25
# Traverse relationships
df = db.cypher("""
MATCH (a:Person)-[r:KNOWS]->(b:Person)
RETURN a.name AS from, b.name AS to, r.since
""").to_df()
Parameterized queries
Use %s placeholders for safe parameterization:
result = db.cypher(
    "MATCH (n:Person {name: %s}) RETURN n.name, n.age",
    params=("Alice",),
)
record = result.single()
print(record["n.name"]) # Alice
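To see why placeholders matter, compare splicing a value into the query text with passing it separately. The sketch below is plain Python with no database involved; naive_query is a hypothetical helper for illustration, not part of Duffy, and it only shows how a quote in user input can change the structure of a query.

```python
def naive_query(name: str) -> str:
    """Unsafe: splices user input directly into the Cypher text."""
    return f"MATCH (n:Person {{name: '{name}'}}) RETURN n.name, n.age"


# A benign value produces the intended query.
benign = naive_query("Alice")
print(benign)

# A crafted value closes the string literal early and appends its own clause.
malicious = "x'}) DETACH DELETE n //"
injected = naive_query(malicious)
print(injected)

# With a placeholder, the query text is fixed and the value travels separately,
# so the driver can treat it strictly as data.
QUERY = "MATCH (n:Person {name: %s}) RETURN n.name, n.age"
params = (malicious,)
print(QUERY, params)
```

The interpolated version ends up containing a DETACH DELETE clause that the author never wrote, while the placeholder version keeps the same query text regardless of the value.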
Vector search
Search for similar vectors using pgvector:
query_vec = [0.1, 0.2, 0.3, ...] # your query embedding
results = db.vector_search(
    "documents",      # table name
    "embedding",      # vector column
    query_vec,
    k=10,             # number of results
    metric="cosine",  # cosine, l2, or inner_product
)
df = results.to_df()
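The three metric names correspond to pgvector's distance operators: cosine distance (<=>), Euclidean distance (<->), and negative inner product (<#>). As a rough illustration of what each one computes and why the choice can change the ranking, here is a pure-Python sketch. The function names and the brute-force scan are assumptions made for illustration, not how Duffy or pgvector execute a search.

```python
import math


def l2(a, b):
    """Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Negative inner product, so that smaller still means more similar."""
    return -sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine distance: 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


METRICS = {"cosine": cosine, "l2": l2, "inner_product": inner_product}


def top_k(rows, query, k, metric="cosine"):
    """Brute-force nearest neighbours over (id, vector) pairs."""
    dist = METRICS[metric]
    ranked = sorted(rows, key=lambda row: dist(row[1], query))
    return [row_id for row_id, _ in ranked[:k]]


docs = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [2.0, 0.1])]
query = [1.0, 0.0]

print(top_k(docs, query, k=2, metric="cosine"))         # ['a', 'c']
print(top_k(docs, query, k=2, metric="l2"))             # ['a', 'c']
print(top_k(docs, query, k=2, metric="inner_product"))  # ['c', 'a']
```

Inner product rewards vector length as well as direction, which is why the longer vector c outranks a under that metric only.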
Create an index for faster searches:
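At the pgvector level, an approximate-nearest-neighbor index is a CREATE INDEX statement whose operator class matches the search metric. A minimal sketch that builds such a statement follows; build_index_sql is a hypothetical helper for illustration, not a Duffy function, and it assumes the table and column names come from trusted code rather than user input.

```python
# pgvector operator classes for each distance metric.
OPCLASS = {
    "cosine": "vector_cosine_ops",
    "l2": "vector_l2_ops",
    "inner_product": "vector_ip_ops",
}


def build_index_sql(table, column, metric="cosine", method="hnsw"):
    """Return pgvector DDL for an HNSW or IVFFlat index (hypothetical helper)."""
    if method not in ("hnsw", "ivfflat"):
        raise ValueError(f"unsupported index method: {method}")
    if metric not in OPCLASS:
        raise ValueError(f"unsupported metric: {metric}")
    # Identifiers are interpolated directly, so they must not be user-supplied.
    return f"CREATE INDEX ON {table} USING {method} ({column} {OPCLASS[metric]})"


sql = build_index_sql("documents", "embedding", metric="cosine")
print(sql)
```

The index only speeds up queries that use the same metric as its operator class, so it should be built for the metric you pass to the search.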
Hybrid search
Combine graph traversal with vector similarity — find nodes via Cypher, then rank by embedding distance:
results = db.hybrid_search(
    cypher="MATCH (p:Paper)-[:CITES]->(cited) RETURN cited",
    vector_table="papers",
    vector_column="abstract_embedding",
    query_vector=query_vec,
    k=10,
)
df = results.to_df()
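Conceptually, the two stages can be pictured as a filter followed by a sort: the Cypher step yields a candidate set, and only those candidates are ranked by distance to the query vector. The sketch below is a pure-Python illustration under that assumption; hybrid_rank, the in-memory embeddings dict, and the use of cosine distance are hypothetical and say nothing about how Duffy implements the call.

```python
import math


def cosine_distance(a, b):
    """1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def hybrid_rank(candidate_ids, embeddings, query_vector, k):
    """Rank only the graph-selected candidates by embedding distance."""
    scored = [
        (cosine_distance(embeddings[node_id], query_vector), node_id)
        for node_id in candidate_ids
        if node_id in embeddings
    ]
    scored.sort()
    return [node_id for _, node_id in scored[:k]]


# Stand-in for the vector table: id -> embedding.
embeddings = {
    "p1": [1.0, 0.0],
    "p2": [0.0, 1.0],
    "p3": [0.9, 0.1],
    "p4": [1.0, 0.0],
}

# Stand-in for the ids returned by the Cypher traversal.
cited = ["p2", "p3"]

ranked = hybrid_rank(cited, embeddings, [1.0, 0.0], k=10)
print(ranked)  # ['p3', 'p2']
```

Note that p1 and p4 match the query vector exactly but never appear in the output, because the graph step did not select them.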
Output formats
Every query returns a Result object with multiple output options:
result = db.cypher("MATCH (n:Person) RETURN n.name, n.age")
result.to_df() # pandas DataFrame
result.to_dicts() # list of dicts
result.to_arrow() # PyArrow Table
result.records # list of Record objects
# Expand Vertex/Edge objects into flat columns
result.to_df(expand=True)
Transactions
Group operations in a transaction that auto-commits on success and rolls back on exception:
with db.transaction():
    db.cypher("CREATE (:Person {name: 'Carol'})")
    db.cypher("CREATE (:Person {name: 'Dave'})")
# auto-committed here
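The commit-on-success, rollback-on-exception behavior is a common context-manager pattern. As a sketch of the pattern only, here it is written against the standard library's sqlite3 module; the transaction function below is a stand-in for illustration and is not Duffy's implementation.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def transaction(conn):
    """Commit if the block finishes normally, roll back if it raises."""
    try:
        yield conn
    except Exception:
        conn.rollback()
        raise
    else:
        conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT)")
conn.commit()

# Succeeds: the insert is committed when the block exits.
with transaction(conn):
    conn.execute("INSERT INTO person VALUES ('Carol')")

# Fails: the insert is rolled back and the exception propagates.
try:
    with transaction(conn):
        conn.execute("INSERT INTO person VALUES ('Dave')")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

names = [row[0] for row in conn.execute("SELECT name FROM person")]
print(names)  # ['Carol']
```

Because the exception is re-raised after the rollback, callers still see the original error and can decide how to handle it.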
Context manager
Use connect() as a context manager for automatic cleanup:
with duffy.connect("postgresql://localhost:5432/mydb", graph="g") as db:
    df = db.cypher("MATCH (n) RETURN n").to_df()
# connection closed automatically
Next steps
- API Reference — full method signatures and parameters
- Integrations — LangChain, LlamaIndex, NetworkX, PyG