Markdown Generation for LLM Integration #2098

Closed
@andyl

Description

LLMs perform well on languages like Python and JavaScript, where there is a great deal of training data, and less well on Elixir, where there is much less. Months can pass between LLM releases, so new and rapidly changing Elixir libraries like Ash are sometimes not properly represented in the training data.

Feature request: for LLM integration, it would be nice if ExDoc generated markdown-formatted variants (*.md) for each page.

These markdown files can be ingested at 'run time' by LLMs and AI Development tools.
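
As a rough illustration of what "run time" ingestion could look like, here is a minimal Python sketch that concatenates a directory of generated Markdown pages into a single context string and prepends it to a question. The function names (`build_context`, `build_prompt`), the `max_chars` budget, and the `<!-- source: ... -->` marker are all assumptions made for this example; they are not part of ExDoc or of any particular tool.

```python
from pathlib import Path


def build_context(doc_dir: str, max_chars: int = 20_000) -> str:
    """Concatenate *.md pages (sorted by name) until a size budget is reached."""
    parts = []
    used = 0
    for path in sorted(Path(doc_dir).glob("*.md")):
        text = path.read_text(encoding="utf-8")
        # Tag each page with its file name so the model can tell pages apart.
        section = f"<!-- source: {path.name} -->\n{text.strip()}\n"
        if used + len(section) > max_chars:
            break  # stop before exceeding the (hypothetical) budget
        parts.append(section)
        used += len(section)
    return "\n".join(parts)


def build_prompt(question: str, context: str) -> str:
    """Put the documentation ahead of the user's question."""
    return (
        "Use the documentation below to answer.\n\n"
        f"{context}\n\nQuestion: {question}\n"
    )
```

A tool would call something like `build_prompt("How do I define a resource?", build_context("doc/"))`, where `doc/` is a hypothetical output directory holding the generated `*.md` files.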

The potential benefit for Elixir developers is that Hex packages become first-class LLM citizens, even when a library is very new or changing rapidly.

Comparables and Reference Information

Overview: LLM integration with the llms.txt protocol

  1. Claude (Anthropic)

    • Description: Anthropic’s Claude, a conversational AI model, supports llms.txt files by allowing users to upload or paste their contents directly into the chat interface. This enables Claude to utilize the concise, LLM-friendly context provided by llms.txt for more accurate responses about a website or project.
    • How It Uses llms.txt: Users can manually provide the llms.txt file content (e.g., via copy-paste or file upload) to give Claude up-to-date, structured context, especially useful for technical documentation or APIs. Anthropic has implemented basic support, as seen with their own /llms.txt file.
    • Purpose: Enhances Claude’s ability to reason over specific site content without parsing raw HTML or dealing with context window limitations.
  2. ChatGPT (OpenAI)

    • Description: ChatGPT can process llms.txt files if users input the file content or a URL pointing to it (where supported by browsing capabilities in certain versions).
    • How It Uses llms.txt: By pasting the contents of an llms.txt file into the prompt, ChatGPT can use the markdown-structured data to answer questions or perform tasks related to the site or project described. It doesn’t natively crawl for llms.txt, but it leverages the format when provided.
    • Purpose: Allows users to give ChatGPT a distilled version of a website’s key information, improving response relevance for specific queries.

Overview: AI development tool integration with the llms.txt protocol

  1. Cursor

    • Description: An AI-first code editor built on VS Code, designed for efficient coding with LLM integration.
    • How It Uses llms.txt: Supports adding llms.txt files via the @Docs feature. Users can manually input or link to an llms.txt file, indexing it as context for chats and code generation, enhancing project-specific assistance.
    • Purpose: Improves contextual coding by referencing structured, LLM-friendly documentation without parsing raw HTML.
  2. Windsurf

    • Description: An agentic IDE by Codeium, combining copilot and agent features for multi-file editing and contextual awareness.
    • How It Uses llms.txt: Windsurf does not natively fetch llms.txt, but users can paste its contents into the Cascade agent or chat interface, leveraging the markdown format for precise, project-specific responses.
    • Purpose: Enhances agent-driven coding by utilizing concise, structured data from llms.txt for better codebase understanding.
  3. Aider

    • Description: A terminal-based AI pair programming tool that edits code in local Git repositories using LLMs.
    • How It Uses llms.txt: Users can manually provide llms.txt content via natural language prompts, allowing Aider to incorporate project documentation or metadata into its multi-file editing process.
    • Purpose: Boosts efficiency in refactoring and feature implementation by grounding LLM actions in llms.txt-provided context.
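
For reference, the tools above all consume the same kind of file: a short markdown index. The Python sketch below renders one in the layout described by the llms.txt proposal as I understand it (an H1 title, a blockquote summary, then H2 sections containing link lists). The function name `render_llms_txt`, the package name, and the file names in the usage note are hypothetical, and nothing here reflects how ExDoc would actually implement it.

```python
def render_llms_txt(name, summary, sections):
    """Render an llms.txt-style index.

    sections maps a heading to a list of (title, url, note) tuples;
    note may be an empty string.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            entry = f"- [{title}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"
```

For example, `render_llms_txt("my_package", "Example summary.", {"Modules": [("MyPackage", "MyPackage.md", "Main module")]})` would produce an index pointing at a per-module markdown page of the kind this issue asks ExDoc to generate.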
