Skip to content

Use weak symbol create_fallback_regex to separate the implementation using PCRE2 and std::regex #77

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 22, 2025

Conversation

larryliu0820
Copy link
Contributor

@larryliu0820 larryliu0820 commented May 21, 2025

Declare and define a weak symbol create_fallback_regex, the default implementation is returning an error.

We then build a separate target regex_lookahead for PCRE2 and std::regex, to support lookahead feature. This library can be optionally linked into the binary.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label May 21, 2025
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D75173102

@larryliu0820 larryliu0820 changed the title Fix app binary size by reverting tiktoken changes made to Cria Use weak symbol create_fallback_regex to separate the implementation using PCRE2 and std::regex May 21, 2025
facebook-github-bot pushed a commit that referenced this pull request May 22, 2025
Summary:

As titled, in D74864339 I changed the tiktoken dependency of `//xplat/cria/utils:prompt` to be `tiktoken_lookahead` which increased the binary size.

This change can be reverted since Cria is not using this feature yet.

Reviewed By: derekxu

Differential Revision: D75173102
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D75173102

Summary:

As titled, in D74864339 I changed the tiktoken dependency of `//xplat/cria/utils:prompt` to be `tiktoken_lookahead` which increased the binary size.

This change can be reverted since Cria is not using this feature yet.

Reviewed By: derekxu

Differential Revision: D75173102
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D75173102

* @brief Creates a regex instance. Tries RE2 first, falls back to std::regex.
* @brief Creates a regex instance. If no strong symbol defined, only
* uses RE2. This is a weak symbol to allow other regex libraries to be
* used.
*
* @param pattern The regex pattern to compile.
* @return A unique pointer to an IRegex-compatible object.
*/
Result<std::unique_ptr<IRegex>> create_regex(const std::string& pattern);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just make this the weak symbol?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we can't have RE2 in both weak implementation and strong implementation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean just have create_regex weak implementation do re2, then strong implementation does re2 + fallback. Basically what you are doing now, seems a bit clearer to me

*
* @param pattern The regex pattern to compile.
* @return A unique pointer to an IRegex-compatible object.
*/
Result<std::unique_ptr<IRegex>> create_regex(const std::string& pattern);

/**
* @brief Creates a fallback regex instance. If no strong symbol defined,
* returns Error, otherwise uses PCRE2 and std::regex.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is meant to be extensible to other regex libs, would make sense to me to remove PCRE2 from the comment and rename regex_lookahead.cpp to regex_pcre2.cpp or something like that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lookahead = pcre2 + std::regex, right?

@jackzhxng jackzhxng requested a review from SS-JIA May 22, 2025 18:51
Copy link
Contributor

@jackzhxng jackzhxng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to unblock internal regression

@facebook-github-bot facebook-github-bot merged commit be07807 into main May 22, 2025
4 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot. fb-exported
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants