[Dwarf-discuss] Do Dwarf symbols only use ascii?

David Anderson davea42@gmail.com
Thu Nov 2 17:06:27 GMT 2023


On 11/2/23 03:29, Roger Phillips via Dwarf-discuss wrote:
> I'm currently trying to debug a problem in the dynamorio system where 
> the isdigit function crashes in elftoolchain while trying to parse 
> symbols from dwarf info:
> 
> https://github.com/DynamoRIO/dynamorio/issues/6161 
> 
> My question is whether these symbols really need the locale 
> functionality of libc's isdigit function or if the symbols in Dwarf are 
> just standard ascii and could be parsed in a portable way with the 
> simple method mentioned there.

In practice, UTF-8 can often be processed with library functions that
are not locale-aware, though printing such strings without regard to
the locale can look odd.
isdigit() seems to me to be safe to use in any context. Seems.
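
For the narrow case of recognizing digits while parsing symbol text, a
plain range check avoids libc's locale machinery altogether. The
following is only a sketch: is_ascii_digit is a hypothetical helper name,
not something taken from elftoolchain or DynamoRIO, and whether it matches
the method proposed in the linked issue is an assumption. It relies on the
C guarantee that the characters '0' through '9' have consecutive values.

```c
#include <stdbool.h>

/* Hypothetical helper: true only for the ASCII digits '0'..'9'.
 * No locale data and no lookup table is consulted, so the result is
 * the same regardless of the process locale. */
static bool is_ascii_digit(char c)
{
    return c >= '0' && c <= '9';
}
```

Bytes with the high bit set (as found in multibyte UTF-8 sequences) fall
outside the range and are simply reported as non-digits.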

Any library table of values *should* be large enough to be indexable
with an 8-bit value (256 entries), right? Who would make such a table
smaller than 256 entries these days? When dealing with UTF-8-aware
functions things are...different.
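
To illustrate the table-indexing point, here is a minimal sketch of a
256-entry classification table. The names digit_table and table_is_digit
are hypothetical and do not describe how any particular libc implements
isdigit(). One caution that applies here: where plain char is signed, a
byte such as 0xC3 is a negative value, so it should be converted to
unsigned char before being used as an index. The C standard places the
same requirement on arguments passed to the <ctype.h> functions.

```c
#include <stdbool.h>

/* Hypothetical table with one slot per possible byte value (0..255).
 * Only the ASCII digits are marked; every other slot is zero. */
static const unsigned char digit_table[256] = {
    ['0'] = 1, ['1'] = 1, ['2'] = 1, ['3'] = 1, ['4'] = 1,
    ['5'] = 1, ['6'] = 1, ['7'] = 1, ['8'] = 1, ['9'] = 1
};

static bool table_is_digit(char c)
{
    /* Convert to unsigned char first so the index is always 0..255,
     * even on platforms where plain char is signed. */
    return digit_table[(unsigned char)c] != 0;
}
```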

Since the bytes in a multibyte (non-ASCII) UTF-8 code point
are never (ASCII) control characters, even if your system environment
is not set to UTF-8 it is hard to see how the bytes would
be a danger to anything (?), even if printed to a terminal or device.

See en.wikipedia.org/wiki/UTF-8
for details on the encoding.
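
The property relied on above is that every byte of a multibyte UTF-8
sequence, lead or continuation, has its high bit set (0x80..0xFF), so
none of them can collide with an ASCII control character or an ASCII
digit. A small sketch for checking this on a given name follows; both
function names are hypothetical helpers made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if the byte is part of a multibyte UTF-8
 * sequence (lead or continuation byte), i.e. its high bit is set. */
static bool is_utf8_multibyte_byte(unsigned char b)
{
    return (b & 0x80) != 0;
}

/* Hypothetical helper: counts the bytes in s[0..len) that are ASCII
 * control characters (0x00..0x1F or 0x7F). */
static size_t count_ascii_controls(const char *s, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char b = (unsigned char)s[i];
        if (b < 0x20 || b == 0x7F)
            n++;
    }
    return n;
}
```

For example, the five bytes of "caf\xC3\xA9" (the last two encode U+00E9)
contain no ASCII control characters, while "a\tb" contains one.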

David Anderson
-- 
Although it is generally known, I think it's about
time to announce that I was born at a very early age.

-- Groucho Marx


