✅ Overview

Scimax VS Code includes a powerful database and search system that indexes your org, markdown, and Jupyter notebook files. The database provides:

  • Full-text search with FTS5 (SQLite's Full-Text Search) and BM25 ranking

  • Semantic search using vector embeddings for meaning-based queries

  • Hybrid search combining keyword and semantic approaches

  • Advanced search with query expansion, weighted RRF, and LLM reranking (SOTA)

  • Structured queries for headings, TODOs, tags, properties, and links

  • Agenda views for scheduled items and deadlines

  • Code block search filtered by programming language

The database is built on SQLite (via @libsql/client) with support for vector similarity search, making it both fast and capable of sophisticated semantic queries.

✅ What Gets Indexed

✅ File Types

The database automatically indexes three types of files:

  • .org files - Org-mode documents

  • .md files - Markdown documents

  • .ipynb files - Jupyter notebooks

✅ Indexed Content

For each file, the database extracts and indexes:

✅ Headings

  • Heading text and level (*, **, ***, etc.)

  • TODO states (TODO, DONE, IN-PROGRESS, etc.)

  • Priority markers ([#A], [#B], [#C])

  • Tags (both direct and inherited)

  • Properties (CUSTOM_ID, CATEGORY, etc.)

  • Scheduling information (SCHEDULED, DEADLINE, CLOSED)

  • Line numbers for navigation

✅ Source Blocks

  • Programming language

  • Complete code content

  • Header arguments (:results, :exports, etc.)

  • Line numbers

  • For notebooks: cell indices

✅ Hashtags

  • Inline hashtags (e.g., #research, #todo)

  • Associated file paths

✅ Full Text

  • Complete document content for full-text search

  • Indexed with FTS5 virtual tables

  • Porter stemming and Unicode normalization
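As a rough illustration of what an FTS5 index with Porter stemming and BM25 ranking looks like, here is a minimal sketch using Python's built-in sqlite3 module. The table name fts_content, the sample documents, and the search helper are assumptions made for this example, not the extension's actual schema or code; it also assumes your SQLite build includes FTS5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: Porter stemming layered on the unicode61 tokenizer.
conn.execute(
    "CREATE VIRTUAL TABLE fts_content USING fts5("
    "file_path, title, content, tokenize = 'porter unicode61')"
)
conn.executemany(
    "INSERT INTO fts_content VALUES (?, ?, ?)",
    [
        ("notes/a.org", "Indexing", "The indexer watches files and indexes new headings."),
        ("notes/b.md", "Cooking", "A bread recipe with flour and yeast."),
        ("notes/c.org", "Gardening", "Tomatoes need sun and water."),
    ],
)


def search(query, limit=10):
    """Return (file_path, bm25_score) pairs, best match first."""
    # bm25() yields lower (more negative) values for better matches,
    # so ascending order puts the most relevant row first.
    rows = conn.execute(
        "SELECT file_path, bm25(fts_content) FROM fts_content "
        "WHERE fts_content MATCH ? ORDER BY bm25(fts_content) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


results = search("indexed")
```

Because of stemming, the query "indexed" matches a document that only contains "Indexing", "indexer", and "indexes".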

Here is a hashtag #FullTextSearch

✅ File Watching

The database automatically watches for file changes:

  • New files are indexed when created

  • Modified files are re-indexed (debounced with 500ms delay)

  • Deleted files are removed from the index

  • Changes are queued and processed sequentially
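The debounce-then-queue behavior can be sketched as follows. This is a simplified model, not the extension's code: the DebouncedQueue class and its method names are hypothetical, and time is passed in explicitly (in milliseconds) so the logic is easy to follow and test.

```python
class DebouncedQueue:
    """Releases a changed path only after it has been quiet for delay_ms."""

    def __init__(self, delay_ms=500):
        self.delay_ms = delay_ms
        self.pending = {}  # path -> time of most recent change (insertion-ordered)

    def notify(self, path, now_ms):
        # A repeated change restarts the quiet period for that path.
        self.pending.pop(path, None)
        self.pending[path] = now_ms

    def due(self, now_ms):
        """Return paths ready for re-indexing, in the order they should be processed."""
        ready = [p for p, t in self.pending.items() if now_ms - t >= self.delay_ms]
        for p in ready:
            del self.pending[p]
        return ready


q = DebouncedQueue(delay_ms=500)
q.notify("a.org", 0)
q.notify("b.md", 100)
q.notify("a.org", 300)  # a.org was saved again, so its timer restarts
```

With this model, several rapid saves of the same file lead to a single re-index once the file has been quiet for the full delay.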

✅ Ignore Patterns

By default, the following patterns are ignored:

  • **/node_modules/**

  • **/.git/**

  • **/dist/**

  • **/build/**

  • **/.ipynb_checkpoints/**

Configure additional patterns in settings: scimax.db.exclude

Open Scimax Settings ([[cmd:workbench.action.openSettings2]])
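To get a feel for how glob-style exclude patterns filter paths, here is a small sketch built on Python's fnmatch module. The is_excluded helper is hypothetical, and fnmatch only approximates real glob semantics (its * also crosses directory separators), so treat this as an illustration rather than the matcher the extension uses.

```python
from fnmatch import fnmatchcase

DEFAULT_EXCLUDES = [
    "**/node_modules/**",
    "**/.git/**",
    "**/dist/**",
    "**/build/**",
    "**/.ipynb_checkpoints/**",
]


def is_excluded(path, patterns=DEFAULT_EXCLUDES):
    """Return True if a relative path matches any exclude pattern."""
    # Prefix with "/" so "**/dir/**" also matches a top-level "dir/...".
    candidate = "/" + path.replace("\\", "/").lstrip("/")
    return any(fnmatchcase(candidate, pattern) for pattern in patterns)
```

For example, is_excluded("node_modules/pkg/README.md") is true, while is_excluded("docs/build.org") is false because the pattern requires a directory named build.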

✅ Advanced Search (SOTA Pipeline)

Overview

Advanced search implements a state-of-the-art (SOTA) search pipeline inspired by modern search engines like qmd. It combines multiple techniques for maximum recall and precision:

  1. Query Expansion - Generates alternative query formulations

  2. Parallel Retrieval - Runs FTS5 and vector search concurrently

  3. Weighted Reciprocal Rank Fusion (RRF) - Combines results intelligently

  4. LLM Reranking - Uses AI to improve final ranking (optional)

Usage

Command

Run Scimax: Advanced Search ([[cmd:scimax.db.searchAdvanced]])

When to Use

  • Complex research queries

  • When you want maximum recall

  • When hybrid search isn't finding relevant content

  • For important searches where accuracy matters more than speed

Query Expansion

Query expansion improves recall by generating alternative formulations of your query.

Pseudo-Relevance Feedback (PRF)

How it works:

  1. Runs initial search with original query

  2. Extracts key terms from top 5 results

  3. Creates expanded query with additional terms

  4. Searches again with expanded query

Example:

Original: "machine learning"
Expanded: "machine learning neural network training models"
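The PRF steps above can be sketched as a simple term-frequency expansion. This is an assumption about how "key terms" might be chosen (most frequent non-stopword terms in the top results); the expand_query function, the stopword list, and the tie-breaking rule are all hypothetical.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "with", "on"}


def expand_query(query, top_docs, term_count=5):
    """Append the most frequent new terms from the top results to the query."""
    query_terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    counts = Counter()
    for doc in top_docs:
        for term in re.findall(r"[a-z0-9]+", doc.lower()):
            if len(term) > 2 and term not in STOPWORDS and term not in query_terms:
                counts[term] += 1
    # Sort by frequency, then alphabetically, so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    extra = [term for term, _ in ranked[:term_count]]
    return " ".join([query] + extra)


expanded = expand_query(
    "machine learning",
    ["neural network training", "neural network models", "neural nets"],
    term_count=2,
)
```

Here term_count plays the role of scimax.search.queryExpansion.prfTermCount, and the length of top_docs corresponds to prfTopK.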

LLM Query Expansion

How it works:

  1. Sends query to LLM (e.g., qwen3:1.7b)

  2. LLM generates 3 alternative phrasings

  3. All variants searched in parallel

  4. Original query gets 2× weight in ranking

Example:

Original: "database optimization"
Variants:
  - "improve SQL query performance"
  - "speed up database queries"
  - "index tuning for databases"

Weighted RRF

Standard RRF assigns scores based on rank position. Advanced search enhances this with:

Position Bonuses

Top-ranked results get bonuses:

  • Rank 1: +15% bonus

  • Rank 2: +10% bonus

  • Rank 3: +5% bonus

Original Query Weight

The original query's results receive 2× weight compared to expanded query results.
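A minimal sketch of weighted RRF with the bonuses and weights described above. How the bonus is combined with the rank score is an assumption here (the contribution is multiplied by 1 + bonus); the weighted_rrf function and its input shape are hypothetical.

```python
POSITION_BONUS = {1: 0.15, 2: 0.10, 3: 0.05}


def weighted_rrf(ranked_lists, k=60):
    """Fuse ranked result lists.

    ranked_lists: list of (weight, [doc_id, ...]) with documents in rank order.
    Returns (doc_id, score) pairs sorted by descending fused score.
    """
    scores = {}
    for weight, docs in ranked_lists:
        for rank, doc in enumerate(docs, start=1):
            contribution = weight / (k + rank)          # standard RRF term
            contribution *= 1.0 + POSITION_BONUS.get(rank, 0.0)
            scores[doc] = scores.get(doc, 0.0) + contribution
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


fused = weighted_rrf([
    (2.0, ["a", "b", "c"]),  # original query, 2x weight
    (1.0, ["c", "a", "d"]),  # expanded query
])
```

In this example "a" wins because it ranks first for the original query and second for the expanded one; "d", found only by the expanded query, ends up last.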

Score Normalization

Different backends produce different score ranges:

  • BM25: Negative values (normalized to 0-1)

  • Vector: Cosine distance (converted to similarity)

  • All scores normalized before fusion
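One plausible way to bring both backends onto a common 0-1 scale is sketched below. Min-max scaling for BM25 and "1 - distance" for cosine are assumptions for illustration; the function names are hypothetical.

```python
def normalize_bm25(scores):
    """Map raw bm25 values (more negative = better) onto 0..1, best = 1.0."""
    if not scores:
        return []
    flipped = [-s for s in scores]          # make "better" mean "larger"
    lo, hi = min(flipped), max(flipped)
    if hi == lo:
        return [1.0] * len(scores)          # all equally relevant
    return [(s - lo) / (hi - lo) for s in flipped]


def cosine_distance_to_similarity(distance):
    """Convert a cosine distance into a similarity (higher = closer)."""
    return 1.0 - distance


normalized = normalize_bm25([-4.0, -2.0, -1.0])
```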

LLM Reranking

How It Works

  1. Takes top 30 candidates from RRF fusion

  2. LLM scores each document's relevance (0-10)

  3. Scores blended with retrieval scores using position-aware weights

Position-Aware Blending

High-confidence retrieval matches are preserved:

  • Ranks 1-3: 75% retrieval, 25% reranker

  • Ranks 4-10: 60% retrieval, 40% reranker

  • Ranks 11+: 40% retrieval, 60% reranker
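The blending rule can be written as a small function. This sketch assumes the retrieval score is already normalized to 0-1 and that the LLM's 0-10 score is simply divided by 10; the blend function is hypothetical.

```python
def blend(retrieval_rank, retrieval_score, llm_score):
    """Combine a 0-1 retrieval score with a 0-10 LLM relevance score."""
    if retrieval_rank <= 3:
        retrieval_weight = 0.75
    elif retrieval_rank <= 10:
        retrieval_weight = 0.60
    else:
        retrieval_weight = 0.40
    reranker_weight = 1.0 - retrieval_weight
    return retrieval_weight * retrieval_score + reranker_weight * (llm_score / 10.0)
```

A top-3 result therefore keeps most of its retrieval score even if the reranker dislikes it, while a result at rank 11 or lower is decided mostly by the reranker.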

Performance Considerations

Reranking adds latency (~1-2 seconds for 30 documents). Disable for fast searches.

Capabilities Check

Run Scimax: Show Search Capabilities (scimax.db.searchCapabilities) to see:

✓ Full-Text Search (FTS5/BM25) - Available
✓ Semantic/Vector Search - Available (Ollama)
✓ Query Expansion (PRF) - Available (no LLM required)
✗ Query Expansion (LLM) - Unavailable - check Ollama
✗ LLM Reranking - Unavailable - pull qwen3:0.6b

Graceful Degradation

Advanced search works without all features:

| If Unavailable   | Fallback                    |
|------------------+-----------------------------|
| Vector search    | FTS-only                    |
| LLM expansion    | PRF-only                    |
| Reranking        | Skip reranking              |
| All LLM features | Equivalent to hybrid search |

Configuration

Search Mode

{
  "scimax.search.defaultMode": "hybrid",  // or "fast", "semantic", "advanced"
  "scimax.search.defaultLimit": 20
}

Query Expansion

{
  "scimax.search.queryExpansion.enabled": true,
  "scimax.search.queryExpansion.method": "prf",  // or "llm", "both"
  "scimax.search.queryExpansion.prfTopK": 5,
  "scimax.search.queryExpansion.prfTermCount": 5,
  "scimax.search.queryExpansion.llmModel": "qwen3:1.7b"
}

Reranking

{
  "scimax.search.reranking.enabled": false,  // Enable for better accuracy
  "scimax.search.reranking.model": "qwen3:0.6b",
  "scimax.search.reranking.topK": 30,
  "scimax.search.reranking.usePositionBlending": true
}

Hybrid Weights

{
  "scimax.search.hybrid.ftsWeight": 0.5,
  "scimax.search.hybrid.vectorWeight": 0.5,
  "scimax.search.hybrid.usePositionBonus": true,
  "scimax.search.hybrid.k": 60  // RRF constant
}

Caching

{
  "scimax.search.caching.enabled": true,
  "scimax.search.caching.ttlSeconds": 900,  // 15 minutes
  "scimax.search.caching.maxEntries": 500
}
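To show how the TTL and entry-limit settings interact, here is a sketch of a small result cache with least-recently-used eviction. The eviction policy, the SearchCache class, and the injectable clock are assumptions for illustration, not the extension's implementation.

```python
import time
from collections import OrderedDict


class SearchCache:
    """Query-result cache with a time-to-live and a maximum entry count."""

    def __init__(self, ttl_seconds=900, max_entries=500, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.max_entries = max_entries
        self._clock = clock
        self._entries = OrderedDict()  # query -> (stored_at, results)

    def get(self, query):
        entry = self._entries.get(query)
        if entry is None:
            return None
        stored_at, results = entry
        if self._clock() - stored_at >= self.ttl_seconds:
            del self._entries[query]      # expired
            return None
        self._entries.move_to_end(query)  # mark as recently used
        return results

    def put(self, query, results):
        self._entries[query] = (self._clock(), results)
        self._entries.move_to_end(query)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```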

Setting Up LLM Features

To enable query expansion and reranking with Ollama:

  1. Install Ollama: https://ollama.ai

  2. Start Ollama: ollama serve

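  3. Pull models: ollama pull qwen3:1.7b (query expansion) and ollama pull qwen3:0.6b (reranking)

  4. Enable in settings: set scimax.search.queryExpansion.method to "llm" or "both", and scimax.search.reranking.enabled to true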

Performance Comparison

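| Mode     | Speed  | Recall  | Precision | When to Use        |
|----------+--------+---------+-----------+--------------------|
| Fast     | <50ms  | Low     | High      | Exact matches      |
| Semantic | ~200ms | High    | Medium    | Conceptual queries |
| Hybrid   | ~300ms | High    | High      | General purpose    |
| Advanced | 1-3s   | Highest | Highest   | Important searches |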

Structured Queries

Beyond free-text search, the database supports structured queries for specific content types.

✅ File Browser

Browse all indexed files:

✅ Command

Scimax: Browse Indexed Files (scimax.db.browseFiles). [[cmd:scimax.db.browseFiles]]

✅ Features

  • Lists all files in database

  • Shows last indexed date

  • Sorted by most recently indexed

  • Displays file type (org, md, ipynb)

✅ Agenda and Time Management

✅ Agenda View

The agenda shows scheduled items and deadlines:

✅ Command

Scimax: Show Agenda (scimax.db.agenda). [[cmd:scimax.db.agenda]]

✅ Time Periods

  • Next 2 weeks (default)

  • Next month

  • Next 3 months

  • All items (no time limit)

✅ Options

  • Include unscheduled TODOs

  • Filter by time range

  • Sorted by urgency (overdue first)

✅ Item Types

✅ Deadline

Items with DEADLINE timestamps:

* TODO Submit report
DEADLINE: <2026-01-20 Tue>

✅ Scheduled

Items with SCHEDULED timestamps:

* TODO Team meeting
SCHEDULED: <2026-01-15 Thu 14:00>

✅ Unscheduled TODOs

Items with TODO state but no scheduling

✅ Deadline View

Show only upcoming deadlines:

Command

Scimax: Show Deadlines (scimax.db.deadlines). [[cmd:scimax.db.deadlines]]

✅ Features

  • Next 2 weeks of deadlines

  • Overdue items highlighted

  • Shows days until deadline

  • Excludes DONE and CANCELLED

Display Format

⚠️  Overdue: Submit TPS Report (3 days ago)
🔔 Today: Code Review
🔔 Tomorrow: Documentation Update
🔔 In 5 days: Project Demo
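The display strings above can be derived from the day difference between a deadline and today. The following sketch shows one way to do that; the describe_deadline function and the sample items are hypothetical.

```python
from datetime import date


def describe_deadline(title, deadline, today):
    """Format one deadline line relative to today."""
    days = (deadline - today).days
    if days < 0:
        unit = "day" if days == -1 else "days"
        return f"⚠️  Overdue: {title} ({-days} {unit} ago)"
    if days == 0:
        return f"🔔 Today: {title}"
    if days == 1:
        return f"🔔 Tomorrow: {title}"
    return f"🔔 In {days} days: {title}"


today = date(2026, 1, 13)
items = [
    ("Project Demo", date(2026, 1, 18)),
    ("Submit TPS Report", date(2026, 1, 10)),
    ("Code Review", date(2026, 1, 13)),
]
# Sorting by date puts overdue items first, then the nearest deadlines.
lines = [describe_deadline(t, d, today) for t, d in sorted(items, key=lambda it: it[1])]
```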

Date Formats

Scheduling in Org Files

# Simple date
SCHEDULED: <2026-01-20>

# Date with time
SCHEDULED: <2026-01-20 Tue 14:00>

# Date with time range
SCHEDULED: <2026-01-20 Tue 14:00-16:00>

# Deadline with warning period
DEADLINE: <2026-01-20 Tue -3d>

# Closed timestamp
CLOSED: [2026-01-13 Tue 10:30]
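For readers curious how these timestamp forms can be recognized, here is a sketch of a regular-expression parser covering the variants listed above. It is a simplified illustration, not the extension's parser: the pattern ignores repeaters and does not check that the opening and closing brackets match, and the parse_timestamp function and its result keys are hypothetical.

```python
import re
from datetime import datetime

TIMESTAMP = re.compile(
    r"(?P<open>[<\[])(?P<date>\d{4}-\d{2}-\d{2})"
    r"(?: (?P<day>[A-Za-z]{2,3}))?"                                  # optional day name
    r"(?: (?P<start>\d{1,2}:\d{2})(?:-(?P<end>\d{1,2}:\d{2}))?)?"    # optional time or range
    r"(?: (?P<warn>-\d+[dwmy]))?"                                    # optional warning period
    r"[>\]]"
)


def parse_timestamp(text):
    """Extract the first Org-style timestamp from a line, or return None."""
    match = TIMESTAMP.search(text)
    if match is None:
        return None
    start_time = match.group("start") or "00:00"
    return {
        "active": match.group("open") == "<",   # [..] timestamps are inactive
        "start": datetime.strptime(f"{match.group('date')} {start_time}", "%Y-%m-%d %H:%M"),
        "end_time": match.group("end"),
        "warning": match.group("warn"),
    }
```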

Relative Dates

+2w    # 2 weeks from now
+1m    # 1 month from now
+3d    # 3 days from now
+1y    # 1 year from now
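Relative offsets like these can be resolved against today's date as sketched below. Clamping to the last day of the month (for example, January 31 plus one month) is an assumption of this sketch, and the resolve_relative and add_months helpers are hypothetical.

```python
import calendar
import re
from datetime import date, timedelta


def add_months(d, months):
    """Add calendar months, clamping the day to the target month's length."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def resolve_relative(spec, today):
    """Resolve '+3d', '+2w', '+1m', or '+1y' to a concrete date."""
    match = re.fullmatch(r"\+(\d+)([dwmy])", spec)
    if match is None:
        raise ValueError(f"unsupported relative date: {spec!r}")
    amount, unit = int(match.group(1)), match.group(2)
    if unit == "d":
        return today + timedelta(days=amount)
    if unit == "w":
        return today + timedelta(weeks=amount)
    if unit == "m":
        return add_months(today, amount)
    return add_months(today, 12 * amount)  # unit == "y"
```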

Search Scope

Limit searches to specific directories:

Commands

Scimax: Set Search Scope (scimax.db.setScope)

Scope Types

All Files (Default)

  • Searches entire indexed database

  • Includes all workspace folders

  • Includes additional configured directories

Current Directory

  • Limits to active file's directory

  • Includes subdirectories

  • Useful for project-focused searches

Current Scope Indicator

The current scope is shown when setting scope:

Search scope: all
Search scope: directory (my-project)

Database Management

Reindexing

Full Reindex

Command: Scimax: Reindex Files (scimax.db.reindex). [[cmd:scimax.db.reindex]]

  • Scans all workspace folders

  • Checks file modification times

  • Only reindexes changed files

  • Shows progress notification

  • Reports statistics on completion
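The "only reindex changed files" check amounts to comparing each file's current modification time with the one recorded at the last index. A minimal sketch, with a hypothetical files_needing_reindex function and an injectable get_mtime so the logic can be exercised without touching the disk:

```python
import os


def files_needing_reindex(paths, indexed_mtimes, get_mtime=os.path.getmtime):
    """Return paths that are new or modified since they were last indexed.

    indexed_mtimes: dict mapping path -> mtime recorded at the last index.
    """
    stale = []
    for path in paths:
        recorded = indexed_mtimes.get(path)
        if recorded is None or get_mtime(path) > recorded:
            stale.append(path)
    return stale
```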

✅ Auto-Indexing

{
  "scimax.db.autoIndex": true
}

Warning: Disable for very large workspaces (>10,000 files) to prevent memory issues.

✅ Indexing Sources

By default, the database indexes:

  • Journal directory (scimax.db.includeJournal: true)

  • Workspace folders (scimax.db.includeWorkspace: true)

  • Scimax projects (scimax.db.includeProjects: true)

Add additional directories with scimax.db.include:

{
  "scimax.db.include": [
    "/home/user/research",
    "/home/user/notes",
    "~/Documents/org"
  ]
}

Optimization

✅ Command

Scimax: Optimize Database (scimax.db.optimize). [[cmd:scimax.db.optimize]]

Operations

  • Removes entries for deleted files

  • Runs VACUUM to reclaim space

  • Rebuilds indexes for performance

  • Should be run periodically (monthly)

Clearing Database

Command

Scimax: Clear Database (scimax.db.clear)

Warning

This is destructive and requires confirmation:

  • Removes all indexed data

  • Clears embeddings

  • Resets statistics

  • Requires full reindex to restore

When to Clear

  • Database corruption

  • Major schema changes

  • Troubleshooting issues

  • Fresh start needed

Statistics

Command

Scimax: Show Database Stats (scimax.db.stats)

Information Displayed

Scimax DB: 127 files (98 org, 23 md, 6 ipynb),
1,234 headings, 456 code blocks, 789 links.
Semantic search: Enabled (243 chunks).
Last indexed: 2026-01-13 14:30:00

Stats Include

  • File count by type

  • Heading count

  • Code block count

  • Link count

  • Chunk count (for semantic search)

  • Embedding status

  • Last index timestamp

Performance Considerations

Indexing Performance

File Size

  • Small files (<100KB): ~10-50ms

  • Medium files (100KB-1MB): ~50-200ms

  • Large files (>1MB): ~200ms-1s

Batch Indexing

  • 100 small files: ~2-5 seconds

  • 1,000 small files: ~20-60 seconds

  • With embeddings: 2-5x slower

Optimization Tips

  1. Use ignore patterns for large non-content directories

  2. Disable auto-indexing for huge workspaces

  3. Index incrementally (only changed files)

  4. Run optimization monthly

Search Performance

Full-Text Search (FTS5)

  • Query time: 10-50ms (typical)

  • Scales well to 10,000+ files

  • BM25 scoring is highly optimized

  • Results returned in rank order

Database Size

Typical Sizes

100 files:      ~5-10 MB
1,000 files:    ~50-100 MB
10,000 files:   ~500 MB-1 GB

With Embeddings

+50-100% size increase for chunks and vectors
384-dim embeddings: ~1.5 KB per chunk
768-dim embeddings: ~3 KB per chunk
1536-dim embeddings: ~6 KB per chunk
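The per-chunk figures follow from storing each dimension as a 32-bit (4-byte) float, which is an assumption consistent with the F32BLOB column type mentioned under Technical Architecture. A quick check of the arithmetic:

```python
def embedding_bytes(dimensions, bytes_per_float=4):
    """Raw storage for one embedding vector, excluding index overhead."""
    return dimensions * bytes_per_float


sizes_kb = {dims: embedding_bytes(dims) / 1024 for dims in (384, 768, 1536)}
```

Actual on-disk size will be somewhat larger because of the chunk text and the vector index.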

Memory Usage

Indexing

  • Base: ~50-100 MB

  • Peak during large batch: ~200-500 MB

  • Embedding generation: +100-300 MB

Searching

  • FTS5: ~10-50 MB

  • Vector search: ~50-200 MB (loads embeddings)

  • Minimal memory footprint when idle

Scaling Guidelines

Small Workspace (<100 files)

  • Enable auto-indexing

  • Use any embedding provider

  • Full reindex in seconds

Medium Workspace (100-1,000 files)

  • Enable auto-indexing

  • Local or Ollama embeddings recommended

  • Full reindex in under a minute

Large Workspace (1,000-10,000 files)

  • Consider disabling auto-indexing

  • Ollama embeddings recommended

  • Reindex incrementally

Very Large Workspace (>10,000 files)

  • Disable auto-indexing (manual reindex)

  • Use selective directory indexing

  • Consider multiple smaller databases

  • Ollama with a fast model recommended

Configuration Reference

Database Settings

scimax.db.includeJournal

Type: boolean Default: true

Include journal directory in database indexing.

scimax.db.includeWorkspace

Type: boolean Default: true

Include workspace folders in database indexing.

scimax.db.includeProjects

Type: boolean Default: true

Include all scimax projects in database indexing.

scimax.db.include

Type: string[] Default: []

Additional directories or files to index (supports ~ for home directory).

{
  "scimax.db.include": [
    "/home/user/notes",
    "~/Documents/research"
  ]
}

scimax.db.exclude

Type: string[] Default: ["**/node_modules/**", "**/.git/**", "**/dist/**", "**/build/**"]

Patterns or paths to exclude from indexing (globs and absolute paths).

{
  "scimax.db.exclude": [
    "**/node_modules/**",
    "**/.git/**",
    "**/dist/**",
    "**/temp/**",
    "**/*.backup.org",
    "~/notes/scratch.org"
  ]
}

scimax.db.autoIndex

Type: boolean Default: false

Automatically index workspace on activation. Disable for large workspaces.

{
  "scimax.db.autoIndex": true
}

Embedding Settings

scimax.db.embeddingProvider

Type: enum Values: "none" | "ollama" Default: "ollama"

Embedding provider for semantic search.

{
  "scimax.db.embeddingProvider": "ollama"
}

scimax.db.ollamaUrl

Type: string Default: "http://localhost:11434"

Ollama server URL.

{
  "scimax.db.ollamaUrl": "http://localhost:11434"
}

scimax.db.ollamaModel

Type: string Default: "nomic-embed-text"

Ollama embedding model name.

{
  "scimax.db.ollamaModel": "nomic-embed-text"
}

Command Reference

Search Commands

| Command                      | Description                     |
|------------------------------+---------------------------------|
| scimax.db.search             | Full-text search (FTS5)         |
| scimax.db.searchSemantic     | Semantic search (vector)        |
| scimax.db.searchHybrid       | Hybrid search (FTS + vector)    |
| scimax.db.searchAdvanced     | Advanced search (full pipeline) |
| scimax.db.searchCapabilities | Show search capabilities        |
| scimax.db.searchHeadings     | Search headings                 |
| scimax.db.searchByTag        | Search by org tag               |
| scimax.db.searchByProperty   | Search by property value        |
| scimax.db.searchBlocks       | Search code blocks              |
| scimax.db.searchHashtags     | Search by hashtag               |

View Commands

| Command               | Description             |
|-----------------------+-------------------------|
| scimax.db.showTodos   | Show TODO items         |
| scimax.db.agenda      | Show agenda             |
| scimax.db.deadlines   | Show upcoming deadlines |
| scimax.db.browseFiles | Browse indexed files    |

Management Commands

| Command                       | Description                 |
|-------------------------------+-----------------------------|
| scimax.db.reindex             | Reindex all files           |
| scimax.db.optimize            | Optimize database           |
| scimax.db.clear               | Clear database              |
| scimax.db.stats               | Show database statistics    |
| scimax.db.setScope            | Set search scope            |
| scimax.db.configureEmbeddings | Configure embedding service |
| scimax.db.backup              | Backup database to file     |
| scimax.db.restore             | Restore database from file  |
| scimax.db.rebuild             | Rebuild database completely |
| scimax.db.verify              | Verify database integrity   |

✅ Database Maintenance

✅ Backup and Restore

The database can be backed up and restored to prevent data loss and enable migration between machines.

✅ Backup

Command: Scimax: Backup Database (scimax.db.backup)

Creates a portable backup file containing:

  • All indexed file paths (not file contents)

  • Project information

  • Database metadata

Backup is stored in JSON format for portability.

# Example backup location
~/.scimax/backup-2026-01-22.json

✅ Restore

Command: Scimax: Restore Database (scimax.db.restore)

Restores database from a backup file:

  • Imports project list

  • Queues files for reindexing

  • Preserves original creation timestamps

Note: Actual file content must still be reindexed after restore.

✅ Database Rebuild

Command: Scimax: Rebuild Database (scimax.db.rebuild)

Completely rebuilds the database from scratch:

  • Drops and recreates all tables

  • Re-scans all configured directories

  • Regenerates all indexes

  • Regenerates embeddings (if configured)

Use when:

  • Database appears corrupted

  • Major schema changes after update

  • Switching embedding providers

  • Performance issues after many incremental updates

Options

| Option        | Description                   |
|---------------+-------------------------------|
| Full rebuild  | Complete reindex of all files |
| Projects only | Only rebuild project table    |

✅ Database Verification

Command: Scimax: Verify Database (scimax.db.verify)

Checks database integrity and freshness:

Checks Performed

  1. File existence - Verifies indexed files still exist on disk

  2. Modification time - Detects files modified since indexing

  3. Index integrity - Validates FTS5 and vector indexes

  4. Project validity - Checks project directories exist

Result Format

Database Verification Results:
- Total files: 127
- Missing files: 2
- Stale files: 5
- Projects: 8 (7 valid, 1 missing)
- Status: NEEDS_REINDEX

Status Values

| Status        | Meaning                         |
|---------------+---------------------------------|
| OK            | Database is current and valid   |
| NEEDS_REINDEX | Some files are stale or missing |
| CORRUPTED     | Index integrity check failed    |
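The relationship between the checks and the resulting status can be sketched as follows. The verify function, its inputs, and the precedence of CORRUPTED over NEEDS_REINDEX are assumptions made for this illustration.

```python
def verify(indexed, on_disk, integrity_ok=True):
    """Classify indexed files and derive an overall status.

    indexed: dict of path -> mtime recorded at index time.
    on_disk: dict of path -> current mtime for files that still exist.
    """
    missing = [p for p in indexed if p not in on_disk]
    stale = [p for p in indexed if p in on_disk and on_disk[p] > indexed[p]]
    if not integrity_ok:
        status = "CORRUPTED"
    elif missing or stale:
        status = "NEEDS_REINDEX"
    else:
        status = "OK"
    return {"total": len(indexed), "missing": missing, "stale": stale, "status": status}


report = verify(
    {"a.org": 100, "b.org": 100, "c.md": 100},
    {"a.org": 100, "b.org": 150},
)
```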

✅ Project Integration

The database now stores project information, integrated with the Projectile project manager:

Benefits

  • Projects persist across VS Code restarts

  • Shared project list between Projectile and Database

  • Fast project switching using indexed data

  • Projects can be associated with indexed files

Project Commands

Projects are managed through Projectile commands (C-c p), but the database provides the persistence layer.

See Projectile for project management commands.

Troubleshooting

Semantic Search Not Working

Problem

Semantic search returns no results, shows "unavailable", or displays an error.

First: Check Vector Search Support

Run Scimax: Show Database Stats (scimax.db.stats) to see the semantic search status:

| Status Message                         | Meaning                                 |
|----------------------------------------+-----------------------------------------|
| Semantic search: Enabled (N chunks)    | Working, N chunks with embeddings       |
| Semantic search: Ready (no embeddings) | Supported, but no provider configured   |
| Semantic search: Unavailable (error)   | Vector search not supported by database |

Vector Search Unavailable

If you see "Semantic search: Unavailable", the libsql database doesn't support vector operations. This can happen if:

  • The libsql version doesn't include vector support

  • The vector index failed to create

Embedding Provider Issues

If vector search is supported but not working:

  1. Check embedding provider is configured: scimax.db.embeddingProvider

  2. Test connection: Run Scimax: Configure Embedding Service

  3. Ensure files are reindexed after configuring embeddings

  4. Check console for errors (Help: Toggle Developer Tools)

Local Provider Issues

  • First use downloads model (~30MB), wait for completion

  • Check extension cache directory has write permissions

  • Try different model if one fails

Ollama Issues

  • Ensure Ollama is running: ollama serve

  • Pull model: ollama pull nomic-embed-text

  • Check URL is correct in settings

  • Test connection: curl http://localhost:11434/api/tags (lists installed models; /api/embeddings only accepts POST requests)

Search Returns No Results

Problem

Searches return empty results despite having files.

Solutions

  1. Run Scimax: Show Database Stats to check file count

  2. If the file count is 0, run Scimax: Reindex Files

  3. Check file extensions (.org, .md, .ipynb)

  4. Verify files aren't in ignored directories

  5. Check search scope is set to "All files"

Slow Indexing

Problem

Indexing takes very long or appears stuck.

Solutions

  1. Check workspace size (number of files)

  2. Add ignore patterns for large non-content directories

  3. Disable embedding generation if not needed

  4. Index directories incrementally

  5. Check disk I/O and available memory

Database Corruption

Problem

Errors mentioning "database is locked" or "disk I/O error".

Solutions

  1. Close other VS Code windows accessing same workspace

  2. Restart VS Code

  3. Run Scimax: Clear Database and reindex

  4. Check disk space is available

  5. Verify database file permissions

High Memory Usage

Problem

VS Code uses excessive memory during indexing or searching.

Solutions

  1. Disable auto-indexing

  2. Reduce number of indexed directories

  3. Use more aggressive ignore patterns

  4. Clear and reindex database

  5. Restart VS Code between large indexing operations

Examples and Workflows

Research Paper Management

# Organize papers with properties
* TODO Read: Attention Is All You Need
SCHEDULED: <2026-01-15 Thu>
:PROPERTIES:
:CUSTOM_ID: vaswani2017attention
:AUTHOR: Vaswani et al.
:YEAR: 2017
:CATEGORY: research
:END:

#transformers #attention #nlp

# Search by property
Property: CATEGORY
Value: research

# Search by hashtag
#nlp

# Semantic search
transformer architecture papers

Project Todo Management

# Use tags for organization
* TODO Implement login feature :work:backend:
DEADLINE: <2026-01-20 Tue>

* TODO Write API documentation :work:docs:
SCHEDULED: <2026-01-18 Sun>

# Search by tag
:work: -> shows all work items
:backend: -> shows backend tasks

# View agenda
Next 2 weeks -> prioritized by deadline

# Search TODOs
Filter by: TODO (in progress)

Code Snippet Library

# Store reusable code blocks
* Data Processing Utils

#+BEGIN_SRC python
def normalize_data(df):
    """Standardize numeric columns to zero mean and unit variance."""
    return (df - df.mean()) / df.std()
#+END_SRC

# Search code blocks
Language: python
Query: normalize

# Or semantic search
how to standardize dataframe columns

Personal Knowledge Base

# Use hybrid search for discovery
Query: "improve code performance"

Results will include:
- Exact matches: "code performance" articles
- Related concepts: optimization, profiling, caching
- Similar topics: algorithm efficiency, memory management

# Use properties for metadata
:PROPERTIES:
:CREATED: [2026-01-13 Tue 10:00]
:MODIFIED: [2026-01-13 Tue 15:30]
:CATEGORY: programming
:END:

Best Practices

Indexing Strategy

  1. Start selective - Index specific directories first

  2. Use ignore patterns - Exclude build artifacts and dependencies

  3. Index incrementally - Don't reindex everything on changes

  4. Schedule optimization - Run monthly for large databases

  5. Monitor statistics - Check file counts and sizes regularly

Search Strategy

  1. Start broad - Use semantic or hybrid search for exploration

  2. Refine with keywords - Switch to FTS for specific terms

  3. Use structured queries - Filter by tags/properties when possible

  4. Set scope appropriately - Narrow to directories for focused work

  5. Combine approaches - Use multiple search types for thorough research

Organization Tips

  1. Use consistent tags - Establish tag naming conventions

  2. Add properties - Include metadata for filtering

  3. Set schedules - Use SCHEDULED/DEADLINE for time management

  4. Include hashtags - Quick inline categorization

  5. Write descriptive headings - Better search results

Performance Tips

  1. Disable auto-indexing - For large workspaces (manual trigger)

  2. Choose appropriate embeddings - Balance quality vs. speed

  3. Limit result counts - Don't request thousands of results

  4. Use search scope - Narrow searches to relevant directories

  5. Cache frequent queries - Database has 15-minute result cache

Technical Architecture

Database Schema

Files Table

Tracks indexed files with modification tracking:

  • path, file_type, mtime, hash, size, indexed_at

Headings Table

Org/markdown headings with full metadata:

  • level, title, todo_state, priority

  • tags, inherited_tags, properties

  • scheduled, deadline, closed

  • line_number, begin_pos

Source Blocks Table

Code blocks with language and content:

  • language, content, headers

  • line_number, cell_index

Hashtags Table

Inline hashtags:

  • tag, file_path

Chunks Table

Text chunks for semantic search:

  • content, line_start, line_end

  • embedding (F32BLOB vector)

FTS Content (Virtual Table)

Full-text search index:

  • file_path, title, content

  • Porter stemming, Unicode normalization

  • BM25 ranking support

Indexes

Performance indexes on:

  • headings.file_id, headings.todo_state

  • headings.deadline, headings.scheduled

  • source_blocks.language

  • hashtags.tag

  • chunks.embedding (vector index with cosine metric)

  • files.file_type
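To make the schema description more concrete, here is a reduced sketch of the files and headings tables in plain SQLite, with a structured query for upcoming deadlines. Column types, index names, the subset of columns, and the upcoming_deadlines helper are assumptions; the real schema has more columns and tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (
    id INTEGER PRIMARY KEY,
    path TEXT UNIQUE NOT NULL,
    file_type TEXT NOT NULL,
    mtime REAL,
    indexed_at TEXT
);
CREATE TABLE headings (
    id INTEGER PRIMARY KEY,
    file_id INTEGER NOT NULL REFERENCES files(id),
    level INTEGER,
    title TEXT,
    todo_state TEXT,
    deadline TEXT,      -- ISO date, so string comparison follows date order
    line_number INTEGER
);
CREATE INDEX idx_headings_file ON headings(file_id);
CREATE INDEX idx_headings_todo ON headings(todo_state);
CREATE INDEX idx_headings_deadline ON headings(deadline);
""")
conn.execute("INSERT INTO files (id, path, file_type) VALUES (1, 'work.org', 'org')")
conn.executemany(
    "INSERT INTO headings (file_id, level, title, todo_state, deadline, line_number) "
    "VALUES (1, 1, ?, ?, ?, ?)",
    [
        ("Project Demo", "TODO", "2026-01-18", 10),
        ("Submit report", "TODO", "2026-01-15", 3),
        ("Old task", "DONE", "2026-01-14", 20),
        ("Far future", "TODO", "2026-06-01", 30),
    ],
)


def upcoming_deadlines(conn, start, end):
    """Open headings with a deadline in [start, end], earliest first."""
    return conn.execute(
        "SELECT h.title, h.deadline, f.path, h.line_number "
        "FROM headings h JOIN files f ON f.id = h.file_id "
        "WHERE h.deadline BETWEEN ? AND ? "
        "AND (h.todo_state IS NULL OR h.todo_state NOT IN ('DONE', 'CANCELLED')) "
        "ORDER BY h.deadline",
        (start, end),
    ).fetchall()


rows = upcoming_deadlines(conn, "2026-01-13", "2026-01-27")
```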

Parsers

Org Mode

UnifiedParserAdapter - Full AST parser compatible with org-element:

  • Recursive heading parsing with inheritance

  • Property drawer extraction

  • Timestamp parsing (scheduled, deadline, closed)

  • Source block with headers

  • Link extraction

Markdown

Simplified parser:

  • ATX heading syntax (#)

  • Fenced code blocks

  • Inline links

Jupyter Notebooks

ipynbParser - Notebook-specific parser:

  • Markdown cells → headings and links

  • Code cells → source blocks

  • Cell indices tracked for navigation

  • Hashtags from markdown cells
