Integrations¶
Duffy provides native integrations with LangChain, LlamaIndex, NetworkX, and PyTorch Geometric (PyG).
LangChain¶
DuffyVectorStore¶
LangChain-compatible VectorStore backed by pgvector. Stores documents with text, metadata, and embedding columns in a SQL table.
from duffy._integrations.langchain import DuffyVectorStore
vs = DuffyVectorStore(
driver,
embedding, # LangChain Embeddings instance
table="documents", # table name (default: "langchain_documents")
create_table=True, # auto-create table if missing
embedding_dimensions=768, # vector dimensions (for table creation)
)
Adding documents:
ids = vs.add_texts(
["First document", "Second document"],
metadatas=[{"source": "a"}, {"source": "b"}],
)
Searching:
# By text query (embeds automatically)
docs = vs.similarity_search("search query", k=4)
# With scores
docs_with_scores = vs.similarity_search_with_score("search query", k=4)
# By pre-computed vector
docs = vs.similarity_search_by_vector(embedding_vector, k=4)
Deleting:
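A minimal sketch, assuming the standard LangChain `VectorStore.delete(ids=...)` signature (the IDs are those returned by `add_texts`):

```python
vs.delete(ids=ids)            # delete the documents added above
vs.delete(ids=["custom-id"])  # or pass explicit IDs
```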
Factory method:
vs = DuffyVectorStore.from_texts(
texts=["doc1", "doc2"],
embedding=embedding,
driver=driver,
table="my_docs",
)
DuffyGraphStore (LangChain)¶
LangChain-compatible GraphStore backed by Apache AGE. Use with LangChain's GraphCypherQAChain or knowledge graph pipelines.
from duffy._integrations.langchain import DuffyGraphStore
gs = DuffyGraphStore(driver, graph="my_graph")
Adding graph documents:
Accepts LangChain GraphDocument objects. Nodes are upserted with MERGE (matched on id), and relationships are created between the matched nodes. All labels are validated against [A-Za-z_][A-Za-z0-9_]*, and all values are passed as query parameters.
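A sketch using LangChain's GraphDocument types (the import paths below are LangChain's, not Duffy's, and may vary with your LangChain version):

```python
from langchain_community.graphs.graph_document import (
    GraphDocument,
    Node,
    Relationship,
)
from langchain_core.documents import Document

alice = Node(id="alice", type="Person")
acme = Node(id="acme", type="Company")
doc = GraphDocument(
    nodes=[alice, acme],
    relationships=[Relationship(source=alice, target=acme, type="WORKS_AT")],
    source=Document(page_content="Alice works at Acme."),
)
gs.add_graph_documents([doc])
```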
Querying:
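The LangChain GraphStore interface exposes a `query` method for raw Cypher; a sketch assuming that standard signature:

```python
rows = gs.query("MATCH (p:Person)-[:WORKS_AT]->(c:Company) RETURN p, c")
```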
Schema:
gs.get_schema # string representation
gs.get_structured_schema # dict with node_props, rel_props, relationships
gs.refresh_schema() # refresh cached schema from database
LlamaIndex¶
Duffy provides two LlamaIndex integrations: a simple triplet-based GraphStore and a full-featured PropertyGraphStore.
DuffyGraphStore (LlamaIndex)¶
Simple triplet-based store. Stores knowledge as (:Entity)-[:PREDICATE]->(:Entity) triples.
from duffy._integrations.llamaindex import DuffyGraphStore
store = DuffyGraphStore(driver, graph="my_graph")
Upserting triplets:
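A sketch using the standard LlamaIndex GraphStore triplet signature, (subject, relation, object):

```python
store.upsert_triplet("Alice", "knows", "Bob")
store.upsert_triplet("Alice", "works_at", "Acme")
```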
Querying:
# Get triplets for a subject
triplets = store.get("Alice", depth=2, limit=30)
# [["Alice", "knows", "Bob"], ["Alice", "works_at", "Acme"]]
# Get relationship map for multiple subjects
rel_map = store.get_rel_map(subjs=["Alice", "Bob"])
# {"Alice": [["knows", "Bob"], ["works_at", "Acme"]], "Bob": [...]}
Deleting:
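A sketch, again assuming the standard LlamaIndex `delete(subj, rel, obj)` signature:

```python
store.delete("Alice", "knows", "Bob")  # removes the matching triple
```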
Schema and raw queries:
schema_str = store.get_schema(refresh=True)
result_str = store.query("MATCH (n:Entity) RETURN n.name")
DuffyPropertyGraphStore (LlamaIndex)¶
Modern labeled property graph store. Supports typed nodes, typed relations, structured queries, and vector queries.
Architecture: graph structure lives in AGE (vertices/edges), embeddings live in a companion pgvector table ({graph}_embeddings).
from duffy._integrations.llamaindex_property import DuffyPropertyGraphStore
store = DuffyPropertyGraphStore(driver, graph="my_graph")
Upserting nodes:
from llama_index.core.graph_stores.types import EntityNode, ChunkNode
nodes = [
EntityNode(id_="alice", label="Person", properties={"age": 30}),
ChunkNode(id_="doc1", text="Some document text"),
]
store.upsert_nodes(nodes)
Upserting relations:
from llama_index.core.graph_stores.types import Relation
rels = [
Relation(source_id="alice", target_id="bob", label="KNOWS"),
]
store.upsert_relations(rels)
Querying:
# Get nodes by ID
nodes = store.get(ids=["alice"])
# Get nodes by properties
nodes = store.get(properties={"age": 30})
# Get triplets
triplets = store.get_triplets(entity_names=["Alice"])
# Get relationship map
rel_map = store.get_rel_map(subjs=["Alice"], depth=2)
# Structured Cypher query
nodes = store.structured_query(
"MATCH (n:Person) WHERE n.age > 25 RETURN n",
param_map={}
)
Vector queries:
from llama_index.core.vector_stores.types import VectorStoreQuery
vq = VectorStoreQuery(query_embedding=[0.1, 0.2, ...], similarity_top_k=5)
result = store.vector_query(vq)
# Returns (nodes, scores)
Deleting:
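The LlamaIndex PropertyGraphStore interface deletes by IDs, entity names, relation names, or property match; a hedged sketch assuming that standard signature:

```python
store.delete(ids=["alice"])             # by node ID
store.delete(entity_names=["Bob"])      # by entity name
store.delete(relation_names=["KNOWS"])  # by relation label
store.delete(properties={"age": 30})    # by property match
```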
Schema:
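A sketch assuming the standard PropertyGraphStore schema accessors:

```python
schema = store.get_schema(refresh=True)  # structured schema
schema_str = store.get_schema_str()      # string form, e.g. for LLM prompts
```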
NetworkX¶
Convert AGE graphs to NetworkX for analysis and visualization.
to_networkx¶
G = db.to_networkx() # full graph as a DiGraph
# Filter by label/type
G = db.to_networkx(labels=["Person"], rel_types=["KNOWS"])
# Undirected
G = db.to_networkx(directed=False)
# Without properties (faster for large graphs)
G = db.to_networkx(node_properties=False, edge_properties=False)
Graph algorithms¶
Built-in algorithms use NetworkX under the hood:
# PageRank
pr_df = db.pagerank(label="Person", rel_type="KNOWS", damping=0.85)
# Community detection (Louvain)
comm_df = db.communities(method="louvain")
# Shortest path (returns list of node IDs)
path = db.shortest_path(source_id, target_id, weight="distance")
# Centrality measures
cent_df = db.centrality(measure="degree") # also: betweenness, closeness
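Since these helpers wrap NetworkX, a roughly equivalent manual pipeline looks like the following (a sketch; the parameter mapping inside Duffy is assumed, but the NetworkX calls are its public API):

```python
import networkx as nx

G = db.to_networkx(labels=["Person"], rel_types=["KNOWS"])
pr = nx.pagerank(G, alpha=0.85)  # damping maps to NetworkX's alpha
communities = nx.community.louvain_communities(G.to_undirected())
```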
PyTorch Geometric (PyG)¶
Train GNNs on AGE graph data. Duffy implements PyG's FeatureStore and GraphStore interfaces.
DuffyFeatureStore¶
Stores node feature tensors in PostgreSQL (serialized as bytea).
from duffy._integrations.pyg import DuffyFeatureStore
from torch_geometric.data.feature_store import TensorAttr
fs = DuffyFeatureStore(driver, graph="my_graph")
# Store features
fs._put_tensor(tensor, TensorAttr("Person", "x"))
# Retrieve features
t = fs._get_tensor(TensorAttr("Person", "x"))
# List stored attributes
attrs = fs.get_all_tensor_attrs()
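The bytea round trip can be pictured with a plain-Python sketch. Duffy's actual wire format is not documented here; this illustration assumes a flat little-endian float32 layout:

```python
import struct

def tensor_to_bytes(values):
    """Pack a flat list of floats as little-endian float32 (illustrative)."""
    return struct.pack(f"<{len(values)}f", *values)

def bytes_to_tensor(blob):
    """Unpack little-endian float32 bytes back into a list of floats."""
    n = len(blob) // 4
    return list(struct.unpack(f"<{n}f", blob))

# Round trip: 1.0, 2.5, and -3.0 are exactly representable in float32
blob = tensor_to_bytes([1.0, 2.5, -3.0])
assert bytes_to_tensor(blob) == [1.0, 2.5, -3.0]
```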
DuffyGraphStore (PyG)¶
Extracts edge topology in COO format from AGE. Node IDs are remapped from AGE's 64-bit IDs to contiguous 0-based indices as required by PyG.
from duffy._integrations.pyg import DuffyGraphStore
from torch_geometric.data.graph_store import EdgeAttr
gs = DuffyGraphStore(driver, graph="my_graph")
# Get edge index [2, num_edges] for a specific edge type
edge_index = gs._get_edge_index(
EdgeAttr(edge_type=("Person", "KNOWS", "Person"), layout="coo")
)
# List all edge types
edge_attrs = gs.get_all_edge_attrs()
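The ID remapping described above can be sketched in plain Python (illustrative only; the helper name and internals are hypothetical, not Duffy's implementation):

```python
def remap_to_coo(edges):
    """Map (src, dst) pairs with arbitrary 64-bit IDs to a [2, num_edges]
    COO edge list over contiguous 0-based indices, as PyG expects."""
    index = {}  # original ID -> contiguous index, in first-seen order
    for src, dst in edges:
        for node in (src, dst):
            index.setdefault(node, len(index))
    rows = [index[s] for s, _ in edges]
    cols = [index[d] for _, d in edges]
    return [rows, cols], index

coo, index = remap_to_coo([
    (844424930131969, 844424930131970),
    (844424930131970, 844424930131971),
])
# coo == [[0, 1], [1, 2]]
```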
Typical PyG workflow¶
import duffy
from duffy._integrations.pyg import DuffyFeatureStore, DuffyGraphStore
db = duffy.connect("postgresql://...", graph="social")
# 1. Extract edge topology
gs = DuffyGraphStore(db, graph="social")
edge_index = gs._get_edge_index(
EdgeAttr(edge_type=("Person", "KNOWS", "Person"), layout="coo")
)
# 2. Load/compute node features
fs = DuffyFeatureStore(db, graph="social")
# ... compute embeddings, store with fs._put_tensor(...)
# 3. Build PyG Data object
from torch_geometric.data import Data
data = Data(x=node_features, edge_index=edge_index)
# 4. Train your GNN
# ...