I'm learning how to read text files. I used this approach:
f = open("sample.txt")
print(f.read())
It worked fine when I typed the txt file myself. But when I copied text from a news article on the web, it produced the following error:
UnicodeEncodeError: 'charmap' codec can't encode character '\u2014' in position 738: character maps to <undefined>
I tried changing the encoding setting in Notepad++ to UTF-8, as I read somewhere that the error was due to that.
I also tried using:
f=open("sample.txt",encoding='utf-8')
(a suggestion I found elsewhere), but it still didn't work.
You're on Windows, and you're trying to print to the console. It is print() that is throwing the exception, not open() or read().
The Windows console natively supports only 8-bit code pages, so any character outside your region's code page will break (despite what people say about chcp 65001).
You need to install and use https://github.com/drekin/win-unicode-console. This module talks to the console API at a low level, giving support for multi-byte characters for both input and output.
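To see that the failure is on the output side, here is a small sketch that reproduces it without involving the console at all. It assumes the console code page is cp437, which is only a guess; the helper name can_encode is made up for this illustration.

```python
# Assumption: the console uses code page 437 (cp437). Substitute your own,
# e.g. the value reported by the chcp command.
text = "before \u2014 after"  # contains an em dash, the character from the error


def can_encode(s, codec):
    """Return True if every character in s can be represented in codec."""
    try:
        s.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Printing plain booleans is safe on any console.
print(can_encode(text, "cp437"))
print(can_encode(text, "utf-8"))
```

If the first check fails for your code page, print() will fail on the same text for the same reason, no matter how the file was opened.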
Alternatively, don't print to the console; write your output to a file opened with an explicit encoding. For example:
with open("myoutput.log", "w", encoding="utf-8") as my_log:
    my_log.write(body)
Just ensure you open the file with the correct encoding.
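Putting the read and the write together, a minimal sketch might look like the following. It assumes the input really was saved as UTF-8 (for example via the Notepad++ encoding setting); the function name copy_text is hypothetical.

```python
def copy_text(src_path, dst_path, encoding="utf-8"):
    """Read src_path and write its text to dst_path, both with an explicit encoding.

    Nothing is printed, so the console code page never comes into play.
    """
    with open(src_path, encoding=encoding) as src:
        body = src.read()
    with open(dst_path, "w", encoding=encoding) as dst:
        dst.write(body)
    return body


# Example call, assuming sample.txt exists and was saved as UTF-8:
# copy_text("sample.txt", "myoutput.log")
```

You can then open myoutput.log in an editor that understands UTF-8 to check the result.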