index_project

Index your codebase for AI-powered semantic search

The index_project tool creates the indexes that power Stellarion's advanced features. It generates vector embeddings for semantic search and builds a dependency graph for structure analysis. Running this tool enables fast, intelligent code search and accurate dependency mapping.

Overview

Stellarion's indexing engine builds a structured representation of your codebase by extracting code elements (files, procedures, functions, classes, and variables) and the relationships between them. AI-assisted development and natural language code exploration both draw on this representation.

What Indexing Captures

When you index your codebase, Stellarion analyzes its structure and content to create a searchable representation. The indexing process captures:

File-level metadata: File paths, imports, dependencies, and module relationships
Code structures: Functions, methods, classes, interfaces, and their hierarchies
Variables and constants: Declarations, types, scopes, and usage patterns
Documentation: Inline comments, docstrings, and annotations
Code relationships: Call graphs, inheritance chains, and data flow patterns

Key Benefits

1. Optimized AI Interactions

When you request AI assistance for code generation, enhancement, or debugging, Stellarion leverages the index to provide only the most relevant context to the AI model. Instead of passing entire files or large code blocks, the index allows Stellarion to:

Reduce token consumption: By sending only pertinent code elements rather than entire files, you lower API costs and stay within context window limits
Improve AI response quality: The AI receives the context it needs (function signatures, type definitions, related dependencies), resulting in more accurate, context-aware suggestions
Speed up response times: Smaller, focused context means faster processing and quicker results
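
The idea of sending only pertinent code elements can be illustrated with a small sketch. It is not Stellarion's actual selection logic, which this page does not describe: the `pack_context` function, the relevance scores, and the whitespace-based token count are all assumptions made for illustration.

```python
def pack_context(scored_chunks, token_budget):
    """Greedily keep the highest-scoring chunks that fit within token_budget.

    scored_chunks is a list of (score, text) pairs. The scoring itself is
    assumed to come from elsewhere (for example, a similarity search).
    """
    selected = []
    used = 0
    for score, text in sorted(scored_chunks, key=lambda c: c[0], reverse=True):
        # Crude stand-in for a real tokenizer: count whitespace-separated words.
        cost = len(text.split())
        if used + cost <= token_budget:
            selected.append(text)
            used += cost
    return selected
```

A caller would pass the chunks returned by a search together with the budget left in the model's context window; chunks that do not fit are simply skipped.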

2. Semantic Code Search

The index powers Stellarion's semantic search capabilities, enabling you to explore your codebase using natural language queries. You can:

Ask questions like "Where is user authentication handled?" or "Show me all database connection functions"
Discover code patterns and implementations without prior knowledge of the codebase
Navigate large, unfamiliar codebases quickly and intuitively
Find relevant code based on intent and functionality, not just keyword matching

This is particularly valuable when:

Onboarding to new projects
Working with legacy code
Collaborating across teams with different codebases
Conducting code reviews or audits

3. Contextual Code Understanding

The semantic understanding provided by indexing allows Stellarion to:

Identify dependencies and potential breaking changes
Suggest refactoring opportunities
Detect code patterns and anti-patterns
Provide intelligent autocomplete and code suggestions

How It Works

Stellarion uses local embedding models to generate vector embeddings of your code. Code is split into 50-line chunks, each converted to a high-dimensional vector that captures semantic meaning. These vectors are stored in a RocksDB database for fast similarity search. Simultaneously, import/export statements are analyzed to build a dependency graph in KuzuDB.
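
The chunking and similarity-search steps described above can be sketched as follows. This is a minimal illustration, not Stellarion's implementation: the function names are hypothetical, cosine similarity is an assumed metric (the page does not name one), and the embedding model itself is left out, so vectors are plain lists of floats.

```python
import math


def chunk_lines(source: str, chunk_size: int = 50) -> list[str]:
    """Split source text into consecutive chunks of at most chunk_size lines."""
    lines = source.splitlines()
    return [
        "\n".join(lines[start:start + chunk_size])
        for start in range(0, len(lines), chunk_size)
    ]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query_vec, chunk_vecs, k=3):
    """Return (index, score) pairs for the k chunks most similar to the query."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

In this sketch a 120-line file yields three chunks (50, 50, and 20 lines), and a query is answered by embedding it with the same model and ranking the stored chunk vectors.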

When to Use

Setting up a new project: First-time indexing to enable semantic search
After major codebase changes: Re-index to capture new code
When search results seem outdated: Refresh the index
Setting up a new development environment: Initialize Stellarion for your project

Parameters

Parameter	Type	Required	Default	Description
`path`	string	No	`.`	Project root to index
`forceReindex`	boolean	No	false	Re-index even if already indexed
`maxFiles`	number	No	1000	Maximum files to index
`fileTypes`	array	No	All code files	Specific extensions to index
`buildGraph`	boolean	No	true	Build dependency graph
`generateEmbeddings`	boolean	No	true	Generate semantic embeddings
`includeStatistics`	boolean	No	false	Include detailed indexing statistics
`useCache`	boolean	No	false	Use cache to skip unchanged files

MCP Command Syntax

mcp__stellarion__index_project path:. forceReindex:true includeStatistics:true
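
If you generate these calls from a script, a small helper can keep the `name:value` formatting consistent. The sketch below is an assumption-laden convenience, not part of Stellarion: `build_index_command` and `format_value` are hypothetical names, and the formatting rules are inferred only from the command strings shown on this page.

```python
import json

# Parameter names taken from the Parameters table above.
KNOWN_PARAMS = {
    "path", "forceReindex", "maxFiles", "fileTypes",
    "buildGraph", "generateEmbeddings", "includeStatistics", "useCache",
}


def format_value(value) -> str:
    """Render one argument value in the name:value style used on this page."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (list, tuple)):
        # Compact JSON, e.g. ["ts","js"], with no spaces between items.
        return json.dumps(list(value), separators=(",", ":"))
    return str(value)


def build_index_command(**params) -> str:
    """Build an index_project call string; omitted parameters keep their defaults."""
    unknown = set(params) - KNOWN_PARAMS
    if unknown:
        raise ValueError(f"Unknown parameter(s): {sorted(unknown)}")
    parts = ["mcp__stellarion__index_project"]
    parts += [f"{name}:{format_value(value)}" for name, value in params.items()]
    return " ".join(parts)


print(build_index_command(path=".", forceReindex=True, includeStatistics=True))
```

Rejecting unknown names up front catches typos such as `forceReIndex` before the call is sent.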

Examples

Basic Indexing

Natural Language:

Index this project for semantic search

Direct MCP Call:

mcp__stellarion__index_project

Returns: Indexing summary with file count and storage location

Force Re-indexing

Natural Language:

Re-index this project from scratch

Direct MCP Call:

mcp__stellarion__index_project forceReindex:true

Returns: Fresh index, ignoring any existing data

Index Specific File Types

Natural Language:

Index only TypeScript and JavaScript files

Direct MCP Call:

mcp__stellarion__index_project fileTypes:["ts","tsx","js","jsx"]

Returns: Index containing only the specified file types

Get Detailed Statistics

Natural Language:

Index this project and show me detailed statistics

Direct MCP Call:

mcp__stellarion__index_project includeStatistics:true

Returns: Comprehensive stats including files processed, chunks created, time taken

Index Only the Dependency Graph

Natural Language:

Just rebuild the dependency graph, don't regenerate embeddings

Direct MCP Call:

mcp__stellarion__index_project buildGraph:true generateEmbeddings:false

Returns: Updated dependency graph for structure analysis

Index Only Embeddings

Natural Language:

Regenerate semantic search embeddings only

Direct MCP Call:

mcp__stellarion__index_project generateEmbeddings:true buildGraph:false

Returns: Updated vector embeddings for semantic search

Limit Large Codebases

Natural Language:

Index this large project but limit to 500 files

Direct MCP Call:

mcp__stellarion__index_project maxFiles:500

Returns: Index covering at most 500 files

Output Format

Results include:

Field	Description
Files indexed	Number of files processed
Chunks created	Number of code chunks generated
Embeddings generated	Number of vector embeddings created
Graph nodes	Number of files in dependency graph
Graph edges	Number of dependency relationships
Time taken	Duration of indexing process
Storage location	Path to .stellarion directory

What Gets Indexed

Supported Languages

Language	Extensions
TypeScript	.ts, .tsx
JavaScript	.js, .jsx, .mjs, .cjs
Python	.py
Rust	.rs
Go	.go
Java	.java
C/C++	.c, .cpp, .h, .hpp
Ruby	.rb

Files Automatically Excluded

node_modules/ and similar dependency directories
dist/, build/, target/ (build artifacts)
.git/ and other VCS directories
.stellarion/ (Stellarion's own data)
Files matching .gitignore patterns
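
To preview which files are likely to be picked up, the rules above can be approximated with a short filter. This is a rough sketch under stated assumptions: `is_indexable` is a hypothetical helper, it covers only the directory names and extensions listed on this page, and it does not evaluate .gitignore patterns or the "similar" directories the list mentions.

```python
from pathlib import PurePosixPath

# Directory names taken from the exclusion list above.
EXCLUDED_DIRS = {"node_modules", "dist", "build", "target", ".git", ".stellarion"}

# Extensions taken from the Supported Languages table.
SUPPORTED_EXTENSIONS = {
    ".ts", ".tsx", ".js", ".jsx", ".mjs", ".cjs",
    ".py", ".rs", ".go", ".java", ".c", ".cpp", ".h", ".hpp", ".rb",
}


def is_indexable(relative_path: str) -> bool:
    """Return True if the path has a supported extension and no excluded parent."""
    path = PurePosixPath(relative_path)
    # Check every directory component, but not the file name itself.
    if any(part in EXCLUDED_DIRS for part in path.parts[:-1]):
        return False
    return path.suffix in SUPPORTED_EXTENSIONS
```

Paths are expected to be relative to the project root and to use forward slashes.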

What's Captured

Functions and methods with their implementations
Classes and interfaces
Type definitions
Documentation comments (JSDoc, docstrings)
Import/export relationships
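
As an illustration of how import relationships become a graph, the sketch below extracts imports from Python sources with the standard `ast` module and keeps only edges between files in the project. It is a simplified stand-in, not Stellarion's analyzer: the function names are hypothetical, only Python is handled, and relative imports such as `from . import x` are ignored.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return modules


def build_dependency_graph(files: dict[str, str]) -> dict[str, set[str]]:
    """Map each module name to the other project modules it imports.

    files maps module name to source text; imports of modules that are
    not in files (standard library, third-party) are dropped.
    """
    graph = {}
    for name, source in files.items():
        graph[name] = {
            module for module in imported_modules(source)
            if module in files and module != name
        }
    return graph
```

In this representation, the number of keys corresponds to "Graph nodes" and the total number of set members to "Graph edges" in the output summary.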

Storage Location

Index data is stored in your project:

.stellarion/
├── vector_store/     # Embeddings (RocksDB)
└── graph_kuzu/       # Dependency graph (KuzuDB)

This directory should be added to .gitignore as it's machine-specific and can be regenerated.
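
If you want to automate that step, a helper along these lines would add the entry only when it is missing. `ensure_gitignored` is a hypothetical name, and the sketch assumes an exact-line match is good enough (it does not interpret existing glob patterns).

```python
from pathlib import Path


def ensure_gitignored(project_root: str, entry: str = ".stellarion/") -> bool:
    """Append entry to .gitignore if missing; return True if the file changed."""
    gitignore = Path(project_root) / ".gitignore"
    existing = gitignore.read_text(encoding="utf-8") if gitignore.exists() else ""
    if entry in (line.strip() for line in existing.splitlines()):
        return False
    # Make sure the new entry starts on its own line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    gitignore.write_text(existing + separator + entry + "\n", encoding="utf-8")
    return True
```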

When to Re-index

Re-index your project when:

Situation	Command
Adding significant new code	`forceReindex:true`
After major refactoring	`forceReindex:true`
Search results seem stale	`forceReindex:true`
Switching branches with different code	`forceReindex:true` (if major differences)
Fixing corrupted index	`forceReindex:true`

Indexing Strategies

Full Codebase Indexing

Recommended for: Most projects, especially when working with AI assistance frequently

Indexing your entire codebase provides:

Complete context for AI-generated code
Comprehensive semantic search across all modules
Better understanding of cross-module dependencies
Maximum benefit from Stellarion's capabilities

Partial Codebase Indexing

Recommended for: Very large monorepos, multi-language projects, or when you want to focus on specific modules

You can selectively index:

Specific directories or modules you're actively working on
Core libraries while excluding test fixtures or build artifacts
Your application code while excluding third-party dependencies
Performance-critical sections that require frequent AI interaction

Best Practices

Index broadly: Start with a full codebase index to maximize benefits
Update regularly: Re-index after significant code changes, merges, or refactoring sessions
Set up automatic updates: Configure Stellarion to automatically refresh indexes on file changes or at scheduled intervals
Exclude generated code: Skip auto-generated files, build outputs, and vendor directories to keep indexes lean and relevant
Version control integration: Index after pulling updates from your repository to stay synchronized with your team

Performance Tips

Initial indexing takes time: Large projects (1000+ files) may take several minutes
Use maxFiles for very large codebases: Cap the number of files indexed, and point path at the directories that matter most
Use useCache:true for incremental updates: Skip files that haven't changed
Index only what you need: Use fileTypes to exclude irrelevant languages
Re-index after major changes: Keep the index fresh for accurate results
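
This page does not say how useCache decides that a file is unchanged. One common approach, shown here purely as an assumption, is to compare a content hash against a manifest saved by the previous run; the function names and the manifest shape are hypothetical.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(data).hexdigest()


def updated_manifest(current: dict[str, bytes]) -> dict[str, str]:
    """Build a path-to-hash manifest for the current set of files."""
    return {path: content_hash(data) for path, data in current.items()}


def files_to_reindex(current: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return paths whose hash differs from, or is missing in, the manifest."""
    return sorted(
        path for path, data in current.items()
        if manifest.get(path) != content_hash(data)
    )
```

Only the paths returned by `files_to_reindex` would need new chunks and embeddings; everything else could be reused from the previous run.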

Troubleshooting

Index seems out of date

mcp__stellarion__index_project forceReindex:true

Semantic search not working

mcp__stellarion__index_project generateEmbeddings:true

Structure analysis not accurate

mcp__stellarion__index_project buildGraph:true forceReindex:true

Index taking too long

mcp__stellarion__index_project maxFiles:500 fileTypes:["ts","js"]
