API docs are not being properly indexed for search #1432

Open
@waylan

The default behavior is for search to use the heading of the relevant section as the title of the search result. However, the code spans within headings of the API docs are being HTML escaped in the results, and many of the obvious headings are not being returned as results at all. I'm assuming the second issue is related to the first, in that the correct text is not getting indexed. In other words, if only the plain text content of a heading were indexed, the search results would be better.

Consider the following example. `ESCAPED_CHARS` is an instance attribute of the `Markdown` class. One might expect its section (`ESCAPED_CHARS`) to be returned in a search for the string `ESCAPED_CHARS`, but it is not. However, the method `markdown.inlinepatterns.EscapeInlineProcessor.handleMatch` is in the results because it mentions `ESCAPED_CHARS` in the body of its documentation. Yet, in that search result, the title is the text `<code class="doc-symbol doc-symbol-toc doc-symbol-method"></code>&nbsp;handleMatch`, not `handleMatch` or `markdown.inlinepatterns.EscapeInlineProcessor.handleMatch` as one might expect. Frustratingly, the search term `handleMatch` does not return that result at all. Yet it is in the results for the search term `doc-symbol-method`, which shouldn't even be indexed, as it is an HTML class assigned to the code span, not text.
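
To make "only the plain text content of a heading" concrete, here is a minimal sketch using only the standard library. It is not how the search plugin or mkdocstrings actually builds the index (I have not looked at that code); `HeadingTextExtractor` and `heading_plain_text` are hypothetical names used purely for illustration.

```python
from html.parser import HTMLParser


class HeadingTextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML fragment (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &nbsp; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Tags and their attributes (e.g. class names) never reach this method.
        self.parts.append(data)


def heading_plain_text(fragment):
    """Return the text of a heading with tags dropped and whitespace normalized."""
    parser = HeadingTextExtractor()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered at the end of the fragment
    # str.split() treats the non-breaking space from &nbsp; as whitespace too.
    return " ".join("".join(parser.parts).split())


# The title string reported in the search result above.
title = '<code class="doc-symbol doc-symbol-toc doc-symbol-method"></code>&nbsp;handleMatch'
print(heading_plain_text(title))  # handleMatch
```

With text extracted this way, `handleMatch` would be the indexed title and `doc-symbol-method` would not appear in the index at all.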

@pawamoy do you have any insight into this? I have not yet looked at the code and am not sure how the mkdocstrings extension passes its generated pages to search for indexing.


Labels: docs (Related to the project documentation.)
