Skip to content

CDRIVER-4679 Revert BSON_HAVE_XSI_STRERROR_R and use strerror_l instead #1370

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Aug 3, 2023

Conversation

eramongodb
Copy link
Contributor

This PR reverts #1350 and re-addresses CDRIVER-4679 using a more nuanced solution. Given the information that strerror_r is scheduled to be marked as obsolete in the upcoming POSIX Issue, this PR favors using strerror_l instead of strerror_r, which is not susceptible to the GNU extension and musl library compatibility problems documented in #1350. Per the provided link:

The strerror_r function is problematic because there are historically at least two versions, and at least one major implementation, the GNU C Library, seems committed to providing a version incompatible with the version specified in the standard. While an attempt is made in glibc to provide a conforming version when the proper feature test macros are defined, the workaround is header-based and is in fact non-conforming since it does not support applications that omit inclusion of string.h and declare the function manually.

The strerror_l function is mandatory beginning with issue 7, provides a much easier-to-use interface to the same functionality, and does not have incompatible historical versions of the interface that can interfere with application compatibility.

Unfortunately, the Apple platform does not seem to yet provide strerror_l. This PR proposes treating the Apple platform specially, as is already being done for the Windows platform, by unconditionally using strerror_r (as strerror_s on Windows). Furthermore, given the possibility of the C Driver being compiled with GCC on Apple, it is still susceptible to the strerror_r problems that are documented in #1350. This PR proposes a compromise on the Apple platform only that deliberately ignores checking the return value for errors by depending on the quality of implementation to avoid undefined behavior, per the POSIX Specification:

If an invalid error number is supplied as the value of errnum, applications should be prepared to handle any of the following: [...] Since applications frequently use the return value of strerror() as an argument to functions like fprintf() (without checking the return value) and since applications have no way to parse an error message string to determine whether errnum represents a valid error number, implementations are encouraged to implement [the behavior where errno is set to [EINVAL] and the return value points to a string like "unknown error" or "error number xxx" (where xxx is the value of errnum)]. Similarly, implementations are encouraged to have strerror_r() return [EINVAL] and put a string like "unknown error" or "error number xxx" in the buffer pointed to by strerrbuf when the value of errnum is not a valid error number.

We will assume similar QoI for potential [ERANGE] errors.

Although we do not expect anyone to be compiling the C Driver without _XOPEN_SOURCE=700 on non-Windows or non-Apple platforms (per default), this PR leaves the _GNU_SOURCE condition in out of abundance of caution.

@eramongodb eramongodb requested review from jmikola and kevinAlbs August 3, 2023 15:20
@eramongodb eramongodb self-assigned this Aug 3, 2023
// Apple. Avoid the compatibility headaches with GNU extensions and the musl
// library by assuming QoI will not cause UB when reading the error message
// string even when `strerror_r` fails.
(void) strerror_r (err_code, buf, buflen);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider using this code path for the final #else case instead of raising a build error of "Unable to find a supported strerror_r candidate"?

I'll note that PHP core does that (see: php/php-src#11624 (comment)). IMO it's not the ideal approach (I disagree with PHP doing it for all cases, since they never bother to detect the XSI variant), but it generally works. And given that errno translation probably isn't a critical component of libbson, I think a user would be more inclined to receive "Unknown error" at runtime than a build error.

Looking at uses of bson_strerror_r() throughout libbson and libmongoc, I see that the err_code param is only sometimes included in the logged error message. That seems like something that could be addressed independently. Rather than hunt through all of those occurrences, a compromise might be to incorporate it into the returned string for !ret at the end of this function (e.g. "Unknown error (errno: %d)").

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider using this code path for the final #else case instead of raising a build error of "Unable to find a supported strerror_r candidate"?

I elected to have the #else -> #error branch for more informative error messages should any of the expectations accounted for by the prior conditions are not met (even if that shouldn't be the case). The explicit error message would be more informative than a compilation error should these expectations ever be broken (i.e. future maintenance changing the build configuration).

And given that errno translation probably isn't a critical component of libbson

The concern isn't with the criticality of bson_strerror_r usage, but the possibility of undefined behavior when the caller reads the error message string when strerror_r fails and strerror_r does not write to the buffer on failure (implementations are encouraged, but not required to, write to the buffer on failure).

For example, bson_json_reader_new_from_file's errmsg_buf buffer is uninitialized. If strerror_r fails internally and does not write a failure message to the buffer, the successful result of bson_strerror_r would point to the uninitialized buffer as the error message string, and any subsequent reads of that error message string would be undefined behavior.

The reason for including the note about "assuming QoI will not cause UB" is to document this potential scenario.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, bson_json_reader_new_from_file's errmsg_buf buffer is uninitialized. If strerror_r fails internally and does not write a failure message to the buffer, the successful result of bson_strerror_r would point to the uninitialized buffer as the error message string, and any subsequent reads of that error message string would be undefined behavior.

Ah, very good point. I was confused by jumping back and forth between libbson and PHP, which allocates the buffer in https://github.com/php/php-src/blob/3d832da8f53d2c2f93d4ffbd1c807d9a5d66536b/Zend/zend.c#L1631-L1641 (where the worst case scenario there is just an empty string).

SGTM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where the worst case scenario there is just an empty string

FYI the linked function is stil susceptible to the same potential undefined behavior as described for bson_json_reader_new_from_file: the buf array is allocated but its elements are left uninitialized. Subsequent read of the buf elements in zend_error_noreturn may be undefined behavior if strerror_r failed and did not write a message to buf. The risk could be reduced by initializing the elements of the buf array, but it would still be a risk nevertheless (reliance on QoI to always write something to buf even on failure).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right you are again. The buf is declared within the function and unallocated. I was thinking of a static/global decl being allocated with zeroes.

// errors, an application wishing to check for all error situations should
// set errno to 0, then call strerror_l(), then check errno.
errno = 0;
ret = strerror_l (err_code, uselocale ((locale_t) 0));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strerror(3) states:

The behavior of strerror_l() is undefined if locale is the special locale object LC_GLOBAL_LOCALE or is not a valid locale object handle.

Quoting uselocale(3):

On success, uselocale() returns the locale handle that was set by the previous call to uselocale() in this thread, or LC_GLOBAL_LOCALE if there was no such previous call. On error, it returns (locale_t) 0, and sets errno to indicate the error.

I don't see any other calls to uselocale() throughout libbson and libmongoc. Do we have to worry about LC_GLOBAL_LOCALE being returned? If the intention is to disregard a locale altogether, then would it make more sense to just pass (locale_t) 0 directly to strerror_l instead?

I don't have much experience with uselocale(), but a related discussion that came up while searching for this LC_GLOBAL_LOCALE edge case was: https://austin-group-l.opengroup.narkive.com/MFGftIBP/use-of-lc-global-locale-with-l-functions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Implemented the missing locale handling to ensure we always pass a valid locale.

Copy link
Member

@jmikola jmikola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems OK to me, so over to @kevinAlbs.

errno = 0;
ret = strerror_l (err_code, locale);

if (errno != 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a fleeting thought of "what if we wanted to translate errno here? We can't possibly use strerror_l()!"

:laugh-cry:

@eramongodb eramongodb merged commit d933a8c into mongodb:master Aug 3, 2023
@eramongodb eramongodb deleted the cdriver-4679 branch August 3, 2023 21:32
eramongodb added a commit that referenced this pull request Aug 3, 2023
…ad (#1370)

* Revert commit ac436db.

* Depend on XSI-compliant strerror_l instead of strerror_r
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants