[Novalug] question about linux file names

Bryan J. Smith b.j.smith@ieee.org
Sun Dec 13 20:11:39 EST 2009


Do you mean ASCII?  Or "DOS ASCII" (which is not actually ASCII)?
There were also DOS-V and other extensions.


The "A" in ASCII stands for "American," first and foremost.  Vendors
who played games with reversed characters deserve what they get.  The
various code pages in use are still not ASCII.  People use "ASCII" to
describe all sorts of things that were _never_ ASCII as standardized
by ANSI.

ASCII defines 128 codes, 0-127 (00-7Fh), of which 32-126 (20-7Eh) are
printable; 127 (7Fh) is the DEL control character.  These map into
UTF-8 unchanged, by design.  Anything outside this range is not ASCII,
but a non-standard extension.  That's why Linux "has issues" when you
write data in non-ASCII encodings.
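
A minimal Python sketch of that mapping.  The file name and the choice
of cp437 as the DOS code page are assumptions for illustration only:

```python
# Pure ASCII text produces byte-for-byte identical output in UTF-8.
ascii_name = "pedigree_01.txt"   # hypothetical file name
ascii_bytes = ascii_name.encode("ascii")
utf8_bytes = ascii_name.encode("utf-8")
print(ascii_bytes == utf8_bytes)   # True

# A character outside ASCII is a multi-byte sequence in UTF-8 ...
umlaut = "\u00fc"                  # u with umlaut
umlaut_utf8 = umlaut.encode("utf-8")
print(umlaut_utf8)                 # b'\xc3\xbc'

# ... but a single, different byte in a DOS code page (cp437 assumed).
umlaut_dos = umlaut.encode("cp437")
print(umlaut_dos)                  # b'\x81'
```

The byte 81h on its own is not valid UTF-8, so a name written under
DOS does not match what a UTF-8 system expects.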

More specifically, Red Hat Linux 7 was the first distribution to pull a
ton of "radical" changes, to force developers to consider both forward
and backward ABI/API compatibility as well as internationalization.
Two of the biggest PITAs with RHL7 were forced ANSI C++ compliance
and across-the-board UTF-8 with strict ASCII, including in the Ext2 (and
Ext3) filesystems.  I know some developers -- especially Perl developers
(because Perl had not yet tackled such hard issues) -- absolutely took
issue with this.

Most other distros followed suit once developers had accommodated the
changes.  It wasn't the first time Red Hat did this, and it wasn't the
last.  But it became the standard, and no one argues today that it was
a bad move (although some still complain, in complete folly, about how
it went back when the move was made, not realizing we all benefit
today).



----- Original Message ----
From: Bonnie Dalzell <bdalzell@qis.net>

The old DOS low ASCII set does not map one to one to Linux. I have some old data in 1990s-type ASCII, and the umlaut characters, etc. are not the same as their representation under my Linux system, so if I try to load a file with one of these names I can get errors trying to open and read the files.

I have written routines for my pedigree program to change the foreign low ASCII characters into "English equivalent letters" for the file names, but I also want to experiment with going from the low ASCII foreign letters to UTF encoding.
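
A rough Python sketch of both approaches.  The code page (cp437), the
sample file name, and the helper names dos_to_unicode and
to_ascii_fallback are all assumptions for illustration, not anything
taken from the pedigree program:

```python
import unicodedata

def dos_to_unicode(raw: bytes, codepage: str = "cp437") -> str:
    """Decode bytes written under DOS; the code page is an assumption."""
    return raw.decode(codepage)

def to_ascii_fallback(text: str) -> str:
    """Strip accents to get an "English equivalent" spelling.

    Letters with no decomposition (for example the German sharp s)
    are dropped, so real code would need an explicit table for them.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")

raw_name = b"M\x81ller.ped"          # hypothetical name as DOS stored it
name = dos_to_unicode(raw_name)      # Unicode string with a real umlaut
utf8_name = name.encode("utf-8")     # bytes a UTF-8 filesystem expects
ascii_name = to_ascii_fallback(name)

print(utf8_name)    # b'M\xc3\xbcller.ped'
print(ascii_name)   # Muller.ped
```

If the old files came from a Western European DOS setup, cp850 may be
the better guess; the accented letters sit at the same byte values in
both code pages, but the line-drawing and symbol positions differ.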


