How to see Unicode Vietnamese in human-readable?

How to see Unicode Vietnamese in human-readable?

This is my file Microsoft OneDrive - Access files anywhere. Create docs with free Office Online. , I used Notepad++ to see. I want to see Unicode in Vietnamese human-readable for editing few item. Please help me.

I could be mistaken, but I believe you're experiencing an issue related to Notepad or perhaps encoding standards in Windows. If you were to load these in Python you should be able to resemble them properly.

For example. If I take this string.

"bơ đậu phộng"

Then I can choose to encode it via;

"bơ đậu phộng".encode()
# b'b\xc6\xa1 \xc4\x91\xe1\xba\xadu ph\xe1\xbb\x99ng'

This is likely what you're seeing in Notepad, it's a representation of the text on disk. But you can turn this into an encoding that prints nicely using .decode();

"bơ đậu phộng".encode().decode('utf-8')
# "bơ đậu phộng"

If I were to take the text from your example then:

# 'Công'

You might also find this thread insightful. It discusses a similar phenomenon but for Arabic.

Thanks Vincent. Where exactly doe you specify/execute the .encode()/.decode() argument - in Notepad you mean?

The .enocde()/.decode() is Python code, which is run outside of the Notepad. The idea is to load in the file with Python, make some minor edits with Python and save the new file on disk.