Skip to main content

Search Guide

Nebula provides powerful hybrid search that combines semantic (vector) and full-text search for optimal results.
from nebula import Nebula

nebula = Nebula(api_key="your-api-key")

# Simple search
results = nebula.search(
    query="machine learning algorithms",
    collection_ids=["research-collection"]
)

for result in results:
    print(f"Score: {result.score:.3f}")
    print(f"Content: {result.content}")
    print(f"Metadata: {result.metadata}\n")
Search results include:
  • score: Relevance score (0-1, higher is better)
  • content: The memory content
  • metadata: Custom metadata
  • memory_id: Memory ID for retrieval
  • collection_id: Which collection it belongs to
  • created_at: Timestamp

Search with Metadata Filters

Combine semantic search with metadata filtering:
# Search with filters
results = nebula.search(
    query="project updates",
    collection_ids=["work-collection"],
    filters={
        "priority": "high",
        "status": "active",
        "team": "engineering"
    },
    effort="medium"
)
See Metadata Filtering for advanced filter operators.

Hybrid Search Weights

Control the balance between semantic and full-text search:
results = nebula.search(
    query="neural networks",
    collection_ids=["research-collection"],
    search_settings={
        "semantic_weight": 0.7,    # Semantic search weight
        "fulltext_weight": 0.3  # Keyword search weight
    }
)
Weight Guidelines:
  • semantic_weight: 0.8, fulltext_weight: 0.2 - Prefer semantic meaning (default)
  • semantic_weight: 0.5, fulltext_weight: 0.5 - Balanced
  • semantic_weight: 0.2, fulltext_weight: 0.8 - Prefer exact keyword matches
Default weights (0.8 semantic, 0.2 fulltext) work well for most use cases. Adjust only if needed.

Search Effort

Nebula always performs hybrid search. By default, effort is set to auto which adapts based on query complexity. You can optionally control the computational budget using effort (auto/low/medium/high) - this controls traversal compute and exploration depth, not the exact number of results. Adjust the balance between semantic and keyword search using semantic_weight and fulltext_weight.

Advanced Search Options

Search Multiple Collections

# Search across multiple collections
results = nebula.search(
    query="customer feedback",
    collection_ids=["support-2024", "support-2023", "surveys"],
    effort="medium"
)

Search All Collections

Omit collection_ids to search across all accessible collections:
# Search everything
results = nebula.search(
    query="product roadmap",
    effort="high"
)
Searching all collections is slower than targeting specific collections.

Complex Metadata Filters

# Advanced filtering with multiple operators
results = nebula.search(
    query="bug reports",
    collection_ids=["issues-collection"],
    filters={
        "severity": {"$in": ["critical", "high"]},
        "status": {"$ne": "closed"},
        "created_date": {"$gte": "2024-01-01"}
    }
)
Available operators: $eq, $ne, $in, $nin, $gt, $gte, $lt, $lte, $like, $ilike, $overlap, $contains, $and, $or Full details in Metadata Filtering.

Authority Scores

Control which content should be prioritized in search results using authority scores (0-1).
# High authority for verified content
nebula.store_memory({
    "collection_id": "docs",
    "content": "Official API documentation...",
    "authority": 0.95,
    "metadata": {"verified": True}
})

# Lower authority for user-contributed content
nebula.store_memory({
    "collection_id": "docs",
    "content": "I think you do it this way...",
    "authority": 0.4,
    "metadata": {"verified": False}
})
ScoreUse Case
0.9-1.0Verified facts, official docs
0.7-0.9Reliable sources, quality content
0.5-0.7General content (default)
0.3-0.5Uncertain information
0.0-0.3Low-quality or spam
You can also set authority per message in conversations to prioritize specific chunks:
conv_id = nebula.store_memory({
    "collection_id": "support",
    "content": "Here is the verified answer...",
    "role": "assistant",
    "authority": 0.9
})

nebula.store_memory({
    "memory_id": conv_id,
    "collection_id": "support",
    "content": "I'm not sure, maybe try this?",
    "role": "user",
    "authority": 0.4
})
Default authority is 0.5. Higher authority content ranks higher in search results.

Search Best Practices

  1. Use natural language queries - “How do I reset my password?” works better than “password reset”
  2. Target specific collections - Faster and more relevant results
  3. Start with defaults - Only adjust weights if results aren’t good
  4. Leverage metadata filters - Narrow results by structured fields
  5. Choose appropriate effort - Balance between coverage and performance (auto/low/medium/high)
  6. Cache frequent queries - Store common search results

Next Steps