LlamaIndex — User Guide

LlamaIndex: a data framework that connects your data to LLMs.

Strengths
  • Focus on data connections and RAG, going deeper than LangChain in this area
  • Supports 100+ data sources (PDF, databases, APIs, web pages, etc.)
  • Advanced RAG techniques: HyDE, sentence-window retrieval, parent-child retrieval, etc.
  • LlamaCloud provides managed data processing services
  • Can be used with LangChain
Best for
  • Enterprise knowledge-base Q&A systems
  • Unified retrieval across multiple data sources
  • Natural language queries over structured data (databases, Excel)
  • Document understanding and information extraction
  • Building AI research assistants

Quickly build a RAG system

LlamaIndex provides a very simple API: you can build a working RAG system in just a few lines of code.

Scenario

The simplest documentation Q&A

Prompt example
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI

# Configure the LLM
Settings.llm = OpenAI(model="gpt-4o", api_key="your-key")

# Load documents (supports PDF, Word, TXT, etc.)
documents = SimpleDirectoryReader("./docs").load_data()

# Build the index (automatic chunking and vectorization)
index = VectorStoreIndex.from_documents(documents)

# Create a query engine
query_engine = index.as_query_engine()

# Ask a question
response = query_engine.query("What is the company's core product?")
print(response)
print("\nSource:", response.source_nodes[0].text[:200])
Output / what to expect

LlamaIndex automatically handles:

  • Document loading and parsing
  • Text chunking (default 1024 tokens)
  • Vectorization and storage
  • Retrieval of the most relevant snippets
  • Answer generation with cited sources

Tips

SimpleDirectoryReader supports mixed file types: PDF, Word, and TXT files in the same directory are all loaded together.
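The retrieval steps listed above (chunk, vectorize, rank by similarity) can be sketched in plain Python. This is a toy illustration of the idea, not the LlamaIndex implementation: real embeddings come from a model, while here a bag-of-words count stands in for the vector, and the example texts are invented.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words count vector
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1) "Chunk" the documents (here: one sentence per chunk)
chunks = [
    "Our core product is the Atlas analytics platform.",
    "The company was founded in 2015.",
    "Atlas integrates with over 100 data sources.",
]

# 2) "Index": embed every chunk once, up front
index = [(chunk, embed(chunk)) for chunk in chunks]

# 3) Retrieve: rank chunks by similarity to the query
query = "what is the core product"
best_chunk, _ = max(index, key=lambda pair: cosine(embed(query), pair[1]))
print(best_chunk)  # the snippet an LLM would then answer from
```

In the real pipeline, step 3 returns the top-k chunks, which are stuffed into the LLM prompt together with the question to produce the sourced answer.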

Scenario

Persistent index (avoid repeated construction)

Prompt example
import os
from llama_index.core import (
    VectorStoreIndex, SimpleDirectoryReader,
    StorageContext, load_index_from_storage
)

PERSIST_DIR = "./storage"

if not os.path.exists(PERSIST_DIR):
    # First run: build the index and save it to disk
    documents = SimpleDirectoryReader("./docs").load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=PERSIST_DIR)
    print("Index built and saved")
else:
    # Subsequent runs: load the existing index directly
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)
    print("Index loaded from cache")

query_engine = index.as_query_engine()
response = query_engine.query("your question")
Output / what to expect

The first run builds and saves the index; subsequent runs load it directly from disk. This avoids repeated calls to the embedding API, saving both time and money.

Tips

For large document libraries, a persistent index is essential and can save substantial API costs.
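One caveat the snippet above glosses over: if the source documents change, the cached index goes stale. A minimal stdlib sketch of one way to detect that, by fingerprinting the document directory; the helper name here is invented for illustration and is not part of LlamaIndex:

```python
import hashlib
from pathlib import Path

def docs_fingerprint(docs_dir: str) -> str:
    # Hash every file's name and contents, in a stable order,
    # so that any edit, addition, or rename changes the digest
    h = hashlib.sha256()
    for path in sorted(Path(docs_dir).rglob("*")):
        if path.is_file():
            h.update(path.name.encode())
            h.update(path.read_bytes())
    return h.hexdigest()

# Usage idea: store the digest next to the persisted index, and
# rebuild only when it no longer matches docs_fingerprint("./docs").
```

For very large corpora you would hash file sizes and modification times instead of full contents, trading exactness for speed.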

Advanced RAG techniques

LlamaIndex offers a variety of advanced RAG techniques that can significantly improve retrieval quality.

Scenario

Improve accuracy using sentence window retrieval

Prompt example
from llama_index.core.node_parser import SentenceWindowNodeParser
from llama_index.core.postprocessor import MetadataReplacementPostProcessor

# Sentence window: Retrieve in sentences, but return a larger context window
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,  # include 3 sentences of context on each side
    window_metadata_key="window",
    original_text_metadata_key="original_text"
)

# Build the index (`documents` loaded as in the first example)
index = VectorStoreIndex.from_documents(
    documents,
    transformations=[node_parser]
)

# Replace each matched sentence with its full window at query time
query_engine = index.as_query_engine(
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ]
)
Output / what to expect

Advantages of sentence window retrieval:

  • Higher retrieval accuracy (matching happens at sentence granularity)

  • More complete returned context (the surrounding sentences are included)

  • Less information loss at chunk boundaries

Tips

Sentence window retrieval is effective for scenarios that require precise citations (such as legal documents, technical manuals).
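The mechanics behind this are easy to see in plain Python: match at the granularity of a single sentence, then hand the LLM that sentence plus its neighbors. A toy sketch of the idea, not the LlamaIndex internals; the sentences and the naive keyword match are invented for illustration:

```python
sentences = [
    "The warranty lasts two years.",
    "Claims must be filed in writing.",
    "Refunds are issued within 30 days.",
    "Shipping costs are not refundable.",
    "Support is available on weekdays.",
]

def retrieve_with_window(query_word: str, window_size: int = 1) -> str:
    # 1) Match at sentence granularity (here: naive keyword match;
    #    LlamaIndex uses embedding similarity instead)
    hit = next(i for i, s in enumerate(sentences) if query_word in s.lower())
    # 2) Return the hit plus `window_size` sentences on each side
    lo = max(0, hit - window_size)
    hi = min(len(sentences), hit + window_size + 1)
    return " ".join(sentences[lo:hi])

print(retrieve_with_window("refunds"))
```

The precise sentence anchors the match, while the window restores enough context for the LLM to answer accurately.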

Structured data query

LlamaIndex allows AI to query databases and Excel files using natural language.

Scenario

Querying a SQL database in natural language

Prompt example
from llama_index.core import SQLDatabase
from llama_index.core.query_engine import NLSQLTableQueryEngine
from sqlalchemy import create_engine

# Connect to database
engine = create_engine("sqlite:///sales.db")
sql_database = SQLDatabase(engine, include_tables=["orders", "customers"])

# Create a natural language query engine
query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["orders", "customers"]
)

# Query using natural language
response = query_engine.query(
    "Which customer had the highest order value in the past 30 days?"
)
print(response)
print("Executed SQL:", response.metadata["sql_query"])
Output / what to expect

LlamaIndex automatically converts the natural language question into SQL, executes the query, and returns the result. It also exposes the generated SQL statement, making it easy to verify that the query is correct.

Tips

Providing a detailed description of the table structure (via the table_info parameter) can improve the accuracy of SQL generation.
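Since the engine exposes the SQL it generated, you can always re-run that SQL yourself to verify the answer. A stdlib `sqlite3` sketch with a throwaway in-memory table; the schema, rows, and query are invented to illustrate the kind of SQL an NL-to-SQL engine might emit:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Acme", 1200.0), ("Beta", 450.0), ("Acme", 300.0)],
)

# Roughly what an engine might generate for
# "Which customer had the highest total order value?"
sql = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    ORDER BY total DESC
    LIMIT 1
"""
print(conn.execute(sql).fetchone())  # -> ('Acme', 1500.0)
```

Checking the generated SQL against a hand-written query like this is a cheap guard against the engine silently misinterpreting the question.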

Compared with similar tools

Tool | Strength | Best for | Pricing
LlamaIndex (this tool) | Most focused and in-depth RAG features; richest set of data connectors | Applications centered on knowledge bases and data retrieval | Open source, free / LlamaCloud paid
LangChain | Broader feature set, stronger Agent capabilities | Full applications needing RAG + Agent + workflows | Open source, free
Haystack | Enterprise-grade NLP pipelines, stable in production | Enterprise-level search and question-answering systems | Open source, free
Weaviate | Dedicated vector database with strong retrieval performance | Large-scale vector retrieval, production deployments | Open source, free / paid cloud version
