
Document Search in .NET with Kernel Memory
I recently discovered the Kernel Memory library for document indexing, web scraping, semantic search, and LLM-based question answering. The capabilities, flexibility, and simplicity of this library are so fantastic that it has quickly ascended my list of favorite AI libraries for RAG search, document search, and AI-based question answering.
In this article I'll walk you through what Kernel Memory is and how you can use the C# version of this library to quickly index, search, and chat with knowledge stored in documents or web pages.
Kernel Memory, a flexible document indexing and RAG search library
At its core, Kernel Memory is all about ingesting information from various sources, indexing it, storing it in a vector storage solution, and providing a means for searching and question answering with this indexed knowledge.
We'll walk through a full small application in this article, but here's a simple implementation to help orient you:
IKernelMemory memory = new KernelMemoryBuilder()
    .WithOpenAI(openAiConfig)
    .Build();

await memory.ImportDocumentAsync("TheGuide.pdf");

string question = "What is the answer to the question of life, the universe, and everything?";
MemoryAnswer answer = await memory.AskAsync(question);

string reply = answer.Result;
Console.WriteLine(reply);
In this snippet we see that:
- Kernel Memory uses a standard builder API allowing you to add in the services relevant to you (here an OpenAI text and embedding model)
- Kernel Memory provides Import methods allowing you to index documents, text, and web pages and store them in its current vector store
- Kernel Memory provides a convenient way of asking questions to an LLM and providing your information as a RAG data source
In this short example we're using the default volatile memory vector store which is built into Kernel Memory for demonstration purposes, but you could just as easily use an existing vector storage provider such as Qdrant, Azure AI Search, Postgres, Redis, or others.
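As a sketch of what that swap might look like, here's how you could point the builder at a local Qdrant instance instead. This assumes the Microsoft.KernelMemory.MemoryDb.Qdrant package and its WithQdrantMemoryDb extension, with Qdrant listening on its default local port:
// Sketch: replace the default volatile store with a local Qdrant instance.
// Assumes the Microsoft.KernelMemory.MemoryDb.Qdrant package.
IKernelMemory qdrantMemory = new KernelMemoryBuilder()
    .WithOpenAI(openAiConfig)
    .WithQdrantMemoryDb(new QdrantConfig
    {
        Endpoint = "http://127.0.0.1:6333",
        APIKey = string.Empty, // not needed for an unsecured local instance
    })
    .Build();
The rest of your code stays the same; only the storage registration changes.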
Likewise, Kernel Memory supports a wide range of LLM and embedding providers, including OpenAI, Anthropic, ONNX, and even locally-running Ollama models.
This last point has me particularly excited because I can now use locally hosted LLMs and on-network vector storage solutions to ingest and search documents without needing to worry about data leaving my network or per-usage cloud hosting costs. This opens up new scenarios for experimentation, workshops at conferences, and business use.
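For a rough sketch of what a fully local setup might look like, here's one possibility assuming the Microsoft.KernelMemory.AI.Ollama package; the model names are examples, and the exact configuration shape may vary between versions:
// Sketch of a fully local Kernel Memory setup using Ollama.
// Assumes the Microsoft.KernelMemory.AI.Ollama package and an Ollama
// server running locally with these example models pulled.
OllamaConfig ollamaConfig = new()
{
    Endpoint = "http://localhost:11434",
    TextModel = new OllamaModelConfig("phi3"),
    EmbeddingModel = new OllamaModelConfig("nomic-embed-text"),
};

IKernelMemory localMemory = new KernelMemoryBuilder()
    .WithOllamaTextGeneration(ollamaConfig)
    .WithOllamaTextEmbeddingGeneration(ollamaConfig)
    .Build();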
Let's drill into a larger Kernel Memory app and see how it flows.
Creating a Kernel Memory instance in C#
Throughout the rest of this article we'll walk through a small C# console application from start to finish. This project is available on GitHub if you'd like to clone it locally and experiment with it as well, though you'll need to provide your own API keys.
First, we have some fairly ordinary C# code that uses the Microsoft.Extensions.Configuration mechanism to read settings:
IConfiguration config = new ConfigurationBuilder()
    .AddJsonFile("appsettings.json", optional: true, reloadOnChange: false)
    .AddEnvironmentVariables()
    .AddUserSecrets<Program>()
    .AddCommandLine(args)
    .Build();
DocSearchDemoSettings settings = config.Get<DocSearchDemoSettings>()!;
This reads information from a local JSON file, environment variables, user secrets, and command-line arguments and stores it in a settings object that looks like this:
public class DocSearchDemoSettings
{
    public required string OpenAIEndpoint { get; init; }
    public required string OpenAIKey { get; init; }
    public required string TextModelName { get; init; }
    public required string EmbeddingModelName { get; init; }
}
Most of these settings are set in the appsettings.json file, though you should store your endpoint and key in user secrets or environment variables if you plan on working with your own keys and managing things in source control.
{
  "OpenAIEndpoint": "YourEndpoint",
  "OpenAIKey": "YourApiKey",
  "TextModelName": "gpt-4o-mini",
  "EmbeddingModelName": "text-embedding-3-small"
}
With our configuration loaded, we now jump into creating our IKernelMemory instance, where we'll need to provide information on which models, endpoints, and keys to use:
OpenAIConfig openAiConfig = new()
{
    APIKey = settings.OpenAIKey,
    Endpoint = settings.OpenAIEndpoint,
    EmbeddingModel = settings.EmbeddingModelName,
    TextModel = settings.TextModelName,
};
IKernelMemory memory = new KernelMemoryBuilder()
    .WithOpenAI(openAiConfig)
    .Build();

IAnsiConsole console = AnsiConsole.Console;
console.MarkupLine("[green]KernelMemory initialized.[/]");
This creates and configures our Kernel Memory instance using an OpenAI text and embeddings model. The text completion model will be used for conversations with our data via the AskAsync method, while the embeddings model is used to generate vectors representing the chunks of documents that are indexed, as well as the search queries when the memory instance is searched.
By default Kernel Memory uses a volatile in-memory vector store that gets completely discarded and recreated every time the application runs. This is not a production-level solution, but it is fine for quick demonstrations on low volumes of data. For larger-scale scenarios or production usage you would use a dedicated vector storage solution and connect it to Kernel Memory when building your IKernelMemory instance.
Also note the use of AnsiConsole. This is part of Spectre.Console, a library I frequently use alongside .NET console apps for enhanced input and output capabilities. We'll see more of this later.
Indexing Documents and Web Scraping with Kernel Memory
With our Kernel Memory instance set up and an empty vector store running, we should ingest some data before we continue.
Kernel Memory supports importing data in the following formats:
- Raw strings
- Web pages via web scraping
- Documents in a supported format (PDF, images, Word, PowerPoint, Excel, Markdown, text files, and JSON)
The API for importing each of these sources is exceptionally simple as well:
// Index documents and web content
console.MarkupLine("[yellow]Importing documents...[/]");

await memory.ImportTextAsync("KernelMemory allows you to import web pages, documents, and text");
await memory.ImportTextAsync("KernelMemory supports PDF, md, txt, docx, pptx, xlsx, and other formats", "Doc-Id");

await memory.ImportDocumentAsync("Facts.txt", "Repository-Facts");

await memory.ImportWebPageAsync("https://LeadingEDJE.com", "Leading-EDJE-Web-Page");
await memory.ImportWebPageAsync("https://microsoft.github.io/kernel-memory/",
    "KernelMemory-Web-Page",
    new TagCollection { "GitHub" });

console.MarkupLine("[green]Documents imported.[/]");
This code indexes a pair of raw strings, a Facts.txt file included with the repository, and a pair of web pages: Leading EDJE's web site (my employer, an IT services consultancy in Columbus, Ohio) and the GitHub repository for Kernel Memory.
Note that when we index something, we can provide just a data source, a data source plus a document Id, or additional tag or index metadata as well.
Using tags and indexes you can annotate the documents you insert as belonging to certain collections, as shown in the sketch below. This allows you to filter down to certain groups of documents later on when searching or asking questions, which supports critical scenarios such as restricting the information available to different users based on which organization they're in or their security role.
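Here's a hedged sketch of what that might look like; the document, file, and tag names are hypothetical, and this assumes the Document fluent API and the MemoryFilters helper from the core package:
// Hypothetical example: tag a document as belonging to the HR department,
// then restrict a later question to documents carrying that tag.
await memory.ImportDocumentAsync(new Document("hr-policies")
    .AddFile("Policies.pdf")
    .AddTag("dept", "HR"));

MemoryAnswer hrAnswer = await memory.AskAsync(
    "How much vacation time do new employees receive?",
    filter: MemoryFilters.ByTag("dept", "HR"));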
Nearly everything about Kernel Memory is customizable as well, so you can change how documents are decoded and partitioned, and you can substitute in your own web scraping provider in lieu of Kernel Memory's default one, for example.
These Import calls will take a few seconds to complete, depending on the size of the data, your text embeddings model, and your choice of vector storage solution. Once they complete, your data will be available for search.
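If you want to confirm that a specific document has finished processing before querying it (particularly relevant with asynchronous deployments), here's a small sketch using the IsDocumentReadyAsync method on IKernelMemory:
// Sketch: poll until the Facts.txt document imported above is ready.
while (!await memory.IsDocumentReadyAsync("Repository-Facts"))
{
    await Task.Delay(TimeSpan.FromMilliseconds(250));
}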
Searching Documents with Kernel Memory and Text Embeddings
With our data ingested, we can now query Kernel Memory with specific questions. This can happen in one of two ways:
- SearchAsync, which provides raw search results to be handled programmatically
- AskAsync, which performs a search and then has an LLM respond to the question asked given the search results
While the search results are more complex than the ask results, we should start by exploring search as this helps us understand what Kernel Memory is doing under the hood.
The code to conduct the search itself is straightforward and intuitive:
string search = console.Ask<string>("What do you want to search for?");
console.MarkupLineInterpolated($"[yellow]Searching for '{search}'...[/]");

SearchResult results = await memory.SearchAsync(search);
The SearchResult object organizes its results into citations representing the different documents matched. Within each citation are partitions representing the chunks of the document that were indexed and stored for retrieval. This is important because documents and web pages can be very long, and you want to match only the most relevant portions of a document when performing a search.
Each partition has a relevance score stored as a decimal percentage value ranging from 0 to 1.
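If you only want strong matches back, SearchAsync also accepts optional arguments to trim its results; a quick sketch (the threshold here is an arbitrary example value):
// Keep only partitions scoring at least 0.7 and cap the result count at 3.
SearchResult strongMatches = await memory.SearchAsync(search, minRelevance: 0.7, limit: 3);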
Using Spectre.Console, you can loop over these citations and partitions and create a display table using the following code:
Table table = new Table()
    .AddColumns("Document", "Partition", "Section", "Score", "Text");

foreach (var citation in results.Results)
{
    foreach (var part in citation.Partitions)
    {
        string snippet = part.Text;
        if (part.Text.Length > 100)
        {
            snippet = part.Text[..100] + "...";
        }

        table.AddRow(citation.DocumentId, part.PartitionNumber.ToString(), part.SectionNumber.ToString(), part.Relevance.ToString("P2"), snippet);
    }
}

table.Expand();
console.Write(table);
console.WriteLine();
This produces a formatted table resembling the following image:
As you can see, each match has a document, partition, section within that partition, relevance score, and some associated text. Individual tags and source URLs are also available. Note that document Ids are not mandatory; Kernel Memory generates its own random Ids if you don't provide one yourself.
You can use SearchAsync to manually identify the most relevant documents and pieces of documents in your vector store. This can be useful for providing semantic search capabilities across your site, or for identifying text to inject into prompts as a form of Retrieval Augmented Generation (RAG). However, if you're working with RAG, there's a chance you might be better off using the AskAsync method instead, as we'll see next.
Question answering with Kernel Memory and an LLM
If your end goal is to provide a reply to the user from a query they sent you, you should consider using Kernel Memory's AskAsync method. AskAsync uses the text model to summarize the results of a search and provide that string back to the user.
The code for this is extremely straightforward:
string question = console.Ask<string>("What do you want to ask?");
console.MarkupLineInterpolated($"[yellow]Asking '{question}'...[/]");

MemoryAnswer answer = await memory.AskAsync(question);

console.WriteLine(answer.Result);
This provides a text output from your LLM as you would expect:
As you might imagine, the AskAsync method takes significantly longer than SearchAsync because it effectively performs the search as a RAG search and then uses the results to chat with the underlying LLM.
If you need information about the sources cited, those are also available in the MemoryAnswer, which can be helpful for diagnostic or logging purposes, or simply to let the user know what was used in answering their question - or to give them additional links to investigate.
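A small sketch of surfacing those citations, assuming the RelevantSources collection on MemoryAnswer:
// List the documents and links the answer drew from.
foreach (var source in answer.RelevantSources)
{
    console.MarkupLineInterpolated($"[grey]{source.DocumentId}: {source.SourceUrl}[/]");
}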
Kernel Memory Extensions and Integrations
Kernel Memory is a very powerful and flexible library with a simple API and good behaviors out of the box. It also offers a high degree of customizability over those default behaviors for the times when you need additional control.
For example, you can customize the prompts Kernel Memory uses for fact extraction and summarization, giving you more control of its behavior as a chat partner.
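Here's a sketch of that idea, assuming the IPromptProvider interface, the EmbeddedPromptProvider fallback, and the WithCustomPromptProvider builder extension; the prompt text itself is just an example:
// Sketch of a custom prompt provider. The overridden prompt below is an
// example; unrecognized prompt names fall back to the built-in defaults.
public class MyPromptProvider : IPromptProvider
{
    private readonly EmbeddedPromptProvider _fallback = new();

    public string ReadPrompt(string promptName)
    {
        if (promptName == Constants.PromptNamesAnswerWithFacts)
        {
            return "Facts:\n{{$facts}}\nAnswer the question using only these facts.\nQuestion: {{$input}}\nAnswer: ";
        }

        return _fallback.ReadPrompt(promptName);
    }
}

// Registering it when building the memory instance:
IKernelMemory customizedMemory = new KernelMemoryBuilder()
    .WithOpenAI(openAiConfig)
    .WithCustomPromptProvider(new MyPromptProvider())
    .Build();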
Additionally, Kernel Memory has a number of different deployment models, ranging from in-process MemoryServerless implementations like the one described in this article, to pre-built Docker containers, to web services.
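Switching between these models mostly means changing how you construct the memory instance. For example, here's a sketch of pointing at a remote Kernel Memory service, assuming the Microsoft.KernelMemory.WebClient package and a service running at this example address:
// Sketch: talk to a Kernel Memory service instead of running in-process.
// Assumes the Microsoft.KernelMemory.WebClient package.
IKernelMemory remoteMemory = new MemoryWebClient("http://127.0.0.1:9001/");
Because MemoryWebClient implements the same IKernelMemory interface, the import, search, and ask code shown earlier works unchanged against it.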
Kernel Memory was also built with Semantic Kernel at least partially in mind. While Semantic Kernel has its own built-in vector store capabilities, they're harder to use than Kernel Memory's and don't have as many ingestion options. As a result, you can connect Kernel Memory into a Semantic Kernel instance, providing a RAG data source for your AI orchestration solution. In fact, there's even a SemanticKernelPlugin NuGet package built just for this purpose.
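A sketch of that hookup, assuming the MemoryPlugin class from that package and an already-built Semantic Kernel instance named kernel:
// Sketch: expose Kernel Memory to Semantic Kernel as a plugin.
// Assumes the Microsoft.KernelMemory.SemanticKernelPlugin package and a
// Semantic Kernel instance named kernel built elsewhere in your app.
var memoryPlugin = kernel.ImportPluginFromObject(new MemoryPlugin(memory), "memory");
From there, your Semantic Kernel prompts and planners can call into the indexed memory as just another plugin function.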
Conclusion
I'm absolutely enamored with the Kernel Memory library and see a lot of uses for this technology including:
- Simple RAG search and question answering for web applications
- Indexing existing knowledge sources like Confluence or Obsidian vaults
- Providing a cost-effective and secure option for document ingestion, ensuring document data never leaves the network
If you're curious about Kernel Memory, I'd encourage you to take a look at the GitHub Repository containing this article's code and try things yourself.
I'll be writing more about Kernel Memory in a chapter of my next technical book, which releases in Q3 2025, and I'm looking at revising my Digital Dungeon Master solution to take advantage of the Kernel Memory / Semantic Kernel integration. I can also see some very real ways Kernel Memory can offer Leading EDJE's current and prospective clients additional value, capabilities, and cost savings, so I'm excited to share this technology with the broader technical community.
It's a great time to be doing AI and ML in .NET and I'm elated to have Kernel Memory as a tool in my toolbox.