[Novalug] [OT] You all probably know that MACs are Unix machines
James Ewing Cottrell III
JECottrell3@Comcast.NET
Thu Jan 1 14:34:09 EST 2015
On 12/26/2014 5:25 PM, Jon LaBadie via Novalug wrote:
> On Fri, Dec 26, 2014 at 11:05:27AM -0500, shawn wilson wrote:
>> On Dec 26, 2014 10:29 AM, "Derek LaHousse" <dlahouss@mtu.edu> wrote:
>>>
>>
>>> And on the topic of all lowercase names: In the "c" locale, Uppercase
>>> comes before lowercase.
>>>
>>
>> Historically Unix has been C or POSIX locale but Debian and the likes use
>> UTF by default. What this means is that commands like ls output listing
>> reverse upper/lower order and regex sets (and probably globs) [a-Z] vs
>> [A-z] breaks (maybe other issues / unexpected things?) IIRC OSX is C locale
>> (which I think is the right way).
>>
>> Ps - I find it a bug that you can have a range that goes outside of a
>> group [ -z] and [ -Z] should both error no matter the locale (I've got a
>> perl feature request with this filed).
>
> What is your arbitrary definition of "a group"? The group of printable
> ASCII characters is [!-~] (or [ -~] with a tab and space instead of !).
> I'd hate to have that be an error.
>
> jl
>
Aw, c'mon, JL, you know what he means. Letters and Numbers. And while any
particular group might be discontiguous, such as accents, he means "any
span that includes characters in different classes". Of course, EBCDIC
has disjointed Letters, but I'm betting that IBM implementations using
EBCDIC interpret [A-Z] as [A-IJ-RS-Z].
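To make the EBCDIC point concrete, here is a minimal Python sketch that groups the uppercase letters into runs of consecutive code points. It assumes Python's built-in cp037 codec as a stand-in for "EBCDIC" (there are several EBCDIC code pages), and the helper name ebcdic_runs is made up for illustration. It says nothing about how any IBM regex implementation actually treats [A-Z].

```python
import string

def ebcdic_runs(letters, codec="cp037"):
    """Group letters into runs whose code points in `codec` are consecutive.

    cp037 is one EBCDIC code page, chosen here as an assumption.
    """
    runs = []
    prev = None
    for ch in letters:
        code = ch.encode(codec)[0]      # single-byte code point in EBCDIC
        if prev is not None and code == prev + 1:
            runs[-1] += ch              # still contiguous: extend current run
        else:
            runs.append(ch)             # gap in the code page: start a new run
        prev = code
    return runs

runs = ebcdic_runs(string.ascii_uppercase)
print(runs)  # ['ABCDEFGHI', 'JKLMNOPQR', 'STUVWXYZ']
```

The three runs correspond to the A-I, J-R and S-Z pieces of the bracket expression above.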
Still, I agree with you that he shouldn't want that, at least not as a
default; perhaps as an opt-in flag after the RE, so that /[ -Z]/r ('r' for
'range'; 'g' is taken) would fail.
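A rough Python sketch of what such an opt-in check might look like, assuming plain ASCII and only three classes (uppercase, lowercase, digits). The names char_class and range_is_strict are hypothetical; this is not a feature of Perl or of any regex engine.

```python
def char_class(ch):
    """Classify a single ASCII character (illustrative classes only)."""
    if "A" <= ch <= "Z":
        return "upper"
    if "a" <= ch <= "z":
        return "lower"
    if "0" <= ch <= "9":
        return "digit"
    return "other"

def range_is_strict(lo, hi):
    """True if the range lo-hi starts and ends in the same letter/digit class."""
    cls = char_class(lo)
    return cls != "other" and cls == char_class(hi) and lo <= hi

for lo, hi in [("A", "Z"), ("0", "9"), (" ", "Z"), ("A", "z"), ("!", "~")]:
    print(f"[{lo}-{hi}]", "ok" if range_is_strict(lo, hi) else "rejected")
```

Note that this strict mode also rejects [!-~], which is exactly the kind of range JL would hate to lose, hence opt-in rather than default.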
I often use "/[^ -~]" to go to lines with "bogus" characters in them.
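The same search can be sketched in Python. A negated class, [^ -~], is what matches anything outside the printable ASCII range from space to tilde, so tabs, other control characters and non-ASCII all count as "bogus" here. The function name bogus_lines and the sample text are made up for illustration.

```python
import re

# Anything outside space..tilde: control characters, tabs, non-ASCII.
BOGUS = re.compile(r"[^ -~]")

def bogus_lines(text):
    """Return 1-based numbers of lines containing a non-printable-ASCII char."""
    # Split on "\n" only, so other control characters stay inside their line.
    return [n for n, line in enumerate(text.split("\n"), 1)
            if BOGUS.search(line)]

sample = "plain text\nhas\ta tab\ncaf\u00e9\nall good"
print(bogus_lines(sample))  # [2, 3]
```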
JIM