Collections

Collections are organizational containers for your memories. Think of them like projects or workspaces—a way to keep related information grouped together.
Collections are manual organizational units, not automatic similarity groupings. You control what goes where.

Why Use Collections?

  • Separate contexts: Keep work, personal, and project memories distinct
  • Targeted search: Search only within relevant collections
  • Access control: Different API keys can have different collection permissions
  • Multi-tenancy: Isolate data for different users or customers

Creating Collections

from nebula import Nebula

nebula = Nebula(api_key="your-api-key")

# Create a single collection
collection = nebula.create_collection(
    name="Research Papers",
    description="Academic research documents",
    metadata={"category": "academic"}
)
print(f"Created: {collection.name} (ID: {collection.id})")

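# Create multiple collections (kept in named variables for the examples below)
work_collection = nebula.create_collection(name="Work", description="Work-related notes")
personal_collection = nebula.create_collection(name="Personal", description="Personal memories")
collections = [work_collection, personal_collection]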

Storing Memories in Collections

Always specify the collection_id when storing memories:
# Store in specific collection
work_memory = nebula.store_memory({
    "collection_id": work_collection.id,
    "content": "Q4 planning meeting notes",
    "metadata": {"type": "meeting", "date": "2024-01-15"}
})

# Batch store to different collections
memories = [
    {
        "collection_id": work_collection.id,
        "content": "Project deadline is Friday",
        "metadata": {"priority": "high"}
    },
    {
        "collection_id": personal_collection.id,
        "content": "Book recommendation: Clean Code",
        "metadata": {"category": "books"}
    }
]
memory_ids = nebula.store_memories(memories)
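
Since every memory should carry a collection_id, it can help to check a batch before sending it. The following is a minimal sketch, not part of the Nebula SDK: find_missing_collection_ids is a hypothetical helper that returns the indexes of memory dicts lacking a non-empty collection_id.

```python
from typing import Any


def find_missing_collection_ids(memories: list[dict[str, Any]]) -> list[int]:
    """Return the indexes of memories that lack a non-empty collection_id.

    Hypothetical helper, not part of the Nebula SDK.
    """
    return [
        index
        for index, memory in enumerate(memories)
        if not memory.get("collection_id")
    ]


batch = [
    {"collection_id": "work-id", "content": "Project deadline is Friday"},
    {"content": "Book recommendation: Clean Code"},  # collection_id forgotten
]

missing = find_missing_collection_ids(batch)
if missing:
    print(f"Memories missing collection_id at indexes: {missing}")
```

Running a check like this before store_memories catches a forgotten collection_id locally instead of relying on the service's behavior for incomplete records.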

Searching Within Collections

Search within specific collections using IDs or names, or search across all your collections:
# Search by collection name
work_results = nebula.search(
    query="project deadlines",
    collection_ids=["Work"],
    limit=10
)

# Search by collection ID
work_results = nebula.search(
    query="project deadlines",
    collection_ids=[work_collection.id],
    limit=10
)

# Search multiple collections (mix names and IDs)
all_results = nebula.search(
    query="important notes",
    collection_ids=["Work", personal_collection.id],
    limit=20
)

# Search all accessible collections
global_results = nebula.search(
    query="important notes",
    limit=20
)

# Display results
for result in work_results:
    print(f"Score: {result.score:.2f}")
    print(f"Content: {result.content}")
    print(f"Collection: {result.collection_id}")

Use collection names for convenience or UUIDs for precision. Omit collection_ids entirely to search across all your collections.
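
When a search spans several collections, grouping the results by collection_id makes the output easier to read. This is a sketch under assumptions: Result is a stand-in dataclass modeling only the score, content, and collection_id fields used in the display loop, and group_by_collection is a hypothetical helper rather than an SDK function.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Result:
    """Stand-in for a search result; only three fields are modeled."""
    score: float
    content: str
    collection_id: str


def group_by_collection(results):
    """Map each collection_id to its results, highest score first."""
    grouped = defaultdict(list)
    for result in results:
        grouped[result.collection_id].append(result)
    for items in grouped.values():
        items.sort(key=lambda r: r.score, reverse=True)
    return dict(grouped)


sample = [
    Result(0.71, "Project deadline is Friday", "work-id"),
    Result(0.64, "Book recommendation: Clean Code", "personal-id"),
    Result(0.88, "Q4 planning meeting notes", "work-id"),
]

for collection_id, items in group_by_collection(sample).items():
    print(collection_id, [item.content for item in items])
```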

Managing Collections

# List all collections
collections = nebula.list_collections(limit=50)

for collection in collections:
    print(f"{collection.name} - {collection.memory_count} memories")
    print(f"  ID: {collection.id}")
    print(f"  Created: {collection.created_at}")

Use Cases

Multi-Tenancy

# Create collection per customer
customer_id = "cust_123"  # example value; use your own customer identifier
customer_collection = nebula.create_collection(
    name=f"customer_{customer_id}",
    description=f"Data for customer {customer_id}",
    metadata={"customer_id": customer_id, "plan": "enterprise"}
)

# Store customer-specific data
nebula.store_memory({
    "collection_id": customer_collection.id,
    "content": "Customer support conversation...",
    "metadata": {"customer_id": customer_id}
})
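
With one collection per customer, a get-or-create step avoids creating a second collection for a customer who already has one. The sketch below uses only the list_collections and create_collection calls shown on this page, run against an in-memory FakeClient so it works without an API key. FakeClient, FakeCollection, and get_or_create_customer_collection are hypothetical names, and the sketch assumes a single list_collections call returns every collection; how paging works for larger accounts is not covered here.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCollection:
    id: str
    name: str


@dataclass
class FakeClient:
    """In-memory stand-in exposing only the two calls used in this sketch."""
    _collections: list = field(default_factory=list)

    def list_collections(self, limit=50):
        return self._collections[:limit]

    def create_collection(self, name, description=""):
        collection = FakeCollection(id=f"id-{len(self._collections) + 1}", name=name)
        self._collections.append(collection)
        return collection


def get_or_create_customer_collection(client, customer_id):
    """Reuse the customer's collection if one with the expected name exists."""
    name = f"customer_{customer_id}"
    for collection in client.list_collections(limit=50):
        if collection.name == name:
            return collection
    return client.create_collection(
        name=name, description=f"Data for customer {customer_id}"
    )


client = FakeClient()
first = get_or_create_customer_collection(client, "42")
second = get_or_create_customer_collection(client, "42")
print(first.id == second.id)
```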

Project Organization

# Separate collections for each project
projects = ["project_alpha", "project_beta", "project_gamma"]
project_collections = {
    name: nebula.create_collection(name=name, description=f"Project {name} documentation")
    for name in projects
}

# Search within specific project (by name)
results = nebula.search(
    query="API documentation",
    collection_ids=["project_alpha"]
)

Environment Separation

# Development vs Production
dev_collection = nebula.create_collection(name="dev", description="Development environment")
prod_collection = nebula.create_collection(name="prod", description="Production environment")

# Different API keys can have access to different collections
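
One way to keep application code from mixing the two environments is to resolve the collection name in a single place. This is a minimal sketch: APP_ENV is a hypothetical environment variable name, and collection_name_for_environment is an illustrative helper, not an SDK function.

```python
import os


def collection_name_for_environment(env=None):
    """Pick "dev" or "prod" from APP_ENV (a hypothetical variable name)."""
    env = env if env is not None else os.environ.get("APP_ENV", "dev")
    if env not in ("dev", "prod"):
        raise ValueError(f"Unknown environment: {env!r}")
    return env


print(collection_name_for_environment("prod"))
```

The returned name could then be passed in collection_ids when searching, so a development process never queries the prod collection by accident.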

Best Practices

  1. Use descriptive names - customer_support not collection_1
  2. Leverage metadata - Store organizational info in collection metadata
  3. Specify collection_ids for targeted searches - Use collection names or UUIDs for faster, scoped searches
  4. Plan for scale - Decide on a collection strategy early
  5. Monitor collection sizes - Keep collections focused and manageable
  6. Implement soft deletes - Mark collections as archived instead of deleting
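
Practices 5 and 6 can be combined in a small report over the listed collections. This is a sketch under assumptions: CollectionInfo is a stand-in for whatever list_collections returns, the metadata attribute on a listed collection is assumed rather than confirmed, and the "archived" metadata key is a convention chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionInfo:
    """Stand-in for a listed collection; the metadata attribute is assumed."""
    name: str
    memory_count: int
    metadata: dict = field(default_factory=dict)


def active_collections(collections):
    """Drop collections whose metadata marks them as archived (soft delete)."""
    return [c for c in collections if not c.metadata.get("archived", False)]


def oversized(collections, max_memories):
    """Return names of collections holding more than max_memories memories."""
    return [c.name for c in collections if c.memory_count > max_memories]


listed = [
    CollectionInfo("customer_support", 120),
    CollectionInfo("old_campaign", 9000, {"archived": True}),
    CollectionInfo("chat_logs", 15000),
]

active = active_collections(listed)
print([c.name for c in active])
print(oversized(active, max_memories=10000))
```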

Common Patterns

Pattern                        When to Use
One collection per user        User-specific memory/context
One collection per project     Project-scoped documentation
One collection per data type   Different content types (docs, chats, notes)
Hierarchical collections       Use metadata to create virtual hierarchies
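
The hierarchical pattern can be sketched as follows, assuming each collection stores the name of its parent under a "parent" metadata key. That key is a convention invented for this example, and build_hierarchy is a hypothetical helper operating on plain dicts rather than SDK objects.

```python
from collections import defaultdict


def build_hierarchy(collections):
    """Group collection names under the value of their metadata "parent" key.

    Collections without a parent are listed under None (top level).
    """
    tree = defaultdict(list)
    for collection in collections:
        parent = collection["metadata"].get("parent")
        tree[parent].append(collection["name"])
    return dict(tree)


catalog = [
    {"name": "engineering", "metadata": {}},
    {"name": "project_alpha", "metadata": {"parent": "engineering"}},
    {"name": "project_beta", "metadata": {"parent": "engineering"}},
]

print(build_hierarchy(catalog))
```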
