Improper truncation when reading from UTF-8 CHAR(n) fields containing characters outside of the Basic Multilingual Plane #1213

When using the ADO.NET provider to read a UTF-8 CHAR(n) field containing at least one character outside of the Basic Multilingual Plane (e.g. any emoji), the result is improperly truncated. For example, reading a CHAR(1) field containing the character '😊' (code point U+1F60A) returns a string containing only the high surrogate (0xD83D), which is not a valid character on its own. If the same character is stored in a VARCHAR(1) field, reading it works as expected.
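The symptom can be reproduced without a database. The following is a minimal sketch, not provider code: `TruncateByCodeUnits` is a hypothetical helper that only mimics a truncation based on `string.Length` and `string.Substring`, applied to the value a CHAR(1) field would hold.

```csharp
using System;

static class SurrogateDemo
{
    // Hypothetical helper (not part of the provider): truncates by
    // UTF-16 code units, the same way string.Substring counts.
    public static string TruncateByCodeUnits(string s, int charCount) =>
        s.Length > charCount ? s.Substring(0, charCount) : s;

    static void Main()
    {
        var value = "\U0001F60A"; // one code point, two UTF-16 code units
        var result = TruncateByCodeUnits(value, 1);

        Console.WriteLine(value.Length);                     // 2
        Console.WriteLine(result.Length);                    // 1
        Console.WriteLine(((int)result[0]).ToString("X4"));  // D83D
        Console.WriteLine(char.IsHighSurrogate(result[0]));  // True
    }
}
```

The single code point U+1F60A has a `Length` of 2, so a limit of one "character" cuts the surrogate pair in half.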

I believe the cause of this issue can be found in `GdsStatement.ReadRawValue`:

https://github.com/FirebirdSQL/NETProvider/blob/4230c1e2985ad91442145f743f0c9033cfdfbe5e/src/FirebirdSql.Data.FirebirdClient/Client/Managed/Version10/GdsStatement.cs#L1534-L1539

After the string value is read from the `IXdrReader`, it is truncated to remove the extra characters that were present in the buffer as padding. However, this truncation combines the `DbField.CharCount` property (the number of Unicode code points stored in the field) with the .NET `string.Length` property and `string.Substring` method (both of which count UTF-16 code units). This leads to incorrect behavior whenever a single code point is encoded using multiple code units.

	var s = xdr.ReadString(innerCharset, field.Length);
	if ((field.Length % field.Charset.BytesPerCharacter) == 0 &&
	    s.Length > field.CharCount)
	{
	    return s.Substring(0, field.CharCount);
	}
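One possible direction for a fix is to truncate by code points rather than by UTF-16 code units. The sketch below is only an illustration under that assumption: `TruncateToCodePoints` is a hypothetical helper that does not exist in the provider, and it has not been tested against the provider's code paths.

```csharp
using System;

static class CodePointTruncation
{
    // Hypothetical helper (not part of the provider): keeps at most
    // maxCodePoints Unicode code points from the start of s.
    public static string TruncateToCodePoints(string s, int maxCodePoints)
    {
        var index = 0; // position in UTF-16 code units
        var count = 0; // code points consumed so far
        while (index < s.Length && count < maxCodePoints)
        {
            // A valid surrogate pair is two code units but one code point.
            index += char.IsSurrogatePair(s, index) ? 2 : 1;
            count++;
        }
        return s.Substring(0, index);
    }

    static void Main()
    {
        // The emoji survives intact when the limit is one code point.
        Console.WriteLine(TruncateToCodePoints("\U0001F60A", 1) == "\U0001F60A"); // True
        // Padding after two code points is still removed.
        Console.WriteLine(TruncateToCodePoints("a\U0001F60A  ", 2).Length);       // 3
        Console.WriteLine(TruncateToCodePoints("ab   ", 2));                      // ab
    }
}
```

With a helper along these lines, the comparison `s.Length > field.CharCount` would no longer be needed as a separate guard, since a string that already fits within the limit is returned unchanged.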
