I'm having an issue converting a string to Base64 in T-SQL. I searched with ChatGPT and found functions such as:

SELECT BASE64_ENCODE (CAST ('hello world' AS varbinary))

SELECT CAST('string' AS varbinary(max)) 
FOR XML PATH(''), BINARY BASE64

I then test the result by decoding it in C#:

Encoding.UTF8.GetString(Convert.FromBase64String(base64Encoded));

However, the result is not what I expect. For example, when I convert the string ABCD to Base64 in SQL using the methods above, I get QQBCAEMARAA=. Decoding that with the C# function returns A�B�C�D� when I test with Postman.

I want the string ABCD to be converted into QUJDRAo=, as produced by https://www.base64encode.org/. Does anyone know how to deal with this issue?

  • It helps if you use the right data type, and encoding, for your string. Windows-1252 <> UTF-16: db<>fiddle
    – Thom A
    Commented Jun 8 at 11:05
  • I also assume CrapGPT was not kind enough to vomit out words to tell you BASE64_ENCODE is currently only available in Azure solutions, not on premises.
    – Thom A
    Commented Jun 8 at 11:09
  • Why do you need to do this in SQL though, especially since you mention Postman, which I'm guessing doesn't call the database directly?
    Commented Jun 8 at 11:29
  • Also, varbinary or varchar without a length: don't ever do this.
    Commented Jun 8 at 23:16

2 Answers

It looks like you currently have an nvarchar string, so converting it to binary yields UCS-2/UTF-16 bytes (two bytes per character), whereas the C# code decodes them as UTF-8.

To get your expected result, cast the string to varchar with a UTF-8 collation before casting it to binary. UTF-8 collations are available from SQL Server 2019 onward.

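-- Sample input; nvarchar stores UTF-16, two bytes per character
DECLARE @string NVARCHAR(MAX) = N'ABCD';

-- Re-encode as UTF-8 via a UTF-8 collation, then Base64-encode the bytes
SELECT CONVERT(VARBINARY(MAX),
               CONVERT(VARCHAR(MAX), @string COLLATE Latin1_General_100_BIN2_UTF8)
              )
FOR XML PATH(''), BINARY BASE64;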

Your issue is not about Base64 conversion but about how the string is represented as bytes. Your string is created as Unicode (nvarchar), so when it is passed to the Base64 encoder it is still a sequence of 16-bit code units. The C# decode restores those bytes correctly, and they would be valid if interpreted as 16-bit characters, but UTF8.GetString interprets the byte array as a sequence of 8-bit characters.

It appears that you want the Base64 to represent a UTF-8 string, so you need to ensure that your string is UTF-8 before you encode it to Base64.
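
To see the mismatch without involving the database, here is a minimal C# sketch that reproduces both byte layouts for ABCD. The class and variable names are made up for illustration, and it assumes the SQL side produced little-endian UTF-16 bytes, which is what Encoding.Unicode uses.

```csharp
using System;
using System.Text;

class Base64EncodingDemo
{
    static void Main()
    {
        const string text = "ABCD";

        // UTF-16 (little-endian): two bytes per character, like nvarchar cast to varbinary.
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(text);
        string utf16Base64 = Convert.ToBase64String(utf16Bytes);
        Console.WriteLine(utf16Base64); // QQBCAEMARAA=

        // UTF-8: one byte per character for plain ASCII text.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        Console.WriteLine(Convert.ToBase64String(utf8Bytes)); // QUJDRA==

        // Decoding only round-trips when it uses the encoding that produced the bytes.
        byte[] decoded = Convert.FromBase64String(utf16Base64);
        Console.WriteLine(Encoding.Unicode.GetString(decoded)); // ABCD

        // Decoding UTF-16 bytes as UTF-8 turns every 0x00 byte into a NUL character.
        string mismatched = Encoding.UTF8.GetString(decoded);
        Console.WriteLine(mismatched.Length); // 8
    }
}
```

So there are two ways out: produce UTF-8 bytes on the SQL side, or keep the SQL as it is and decode with Encoding.Unicode in C#. Note also that QUJDRAo= is the Base64 of ABCD followed by a line feed (byte 0x0A); ABCD on its own encodes to QUJDRA==.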