[Novalug] [OT] You all probably know that MACs are Unix machines

James Ewing Cottrell III JECottrell3@Comcast.NET
Thu Jan 1 14:34:09 EST 2015


On 12/26/2014 5:25 PM, Jon LaBadie via Novalug wrote:
> On Fri, Dec 26, 2014 at 11:05:27AM -0500, shawn wilson wrote:
>> On Dec 26, 2014 10:29 AM, "Derek LaHousse" <dlahouss@mtu.edu> wrote:
>>>
>>
>>> And on the topic of all lowercase names: In the "c" locale, Uppercase
>>> comes before lowercase.
>>>
>>
>> Historically Unix has been C or POSIX locale but Debian and the likes use
>> UTF by default. What this means is that commands like ls output listing
>> reverse upper/lower order and regex sets (and probably globs) [a-Z] vs
>> [A-z] breaks (maybe other issues / unexpected things?) IIRC OSX is C locale
>> (which I think is the right way).
>>
>> Ps -  I find it a bug that you can have a range that goes outside of a
>> group [ -z] and [ -Z] should both error no matter the locale (I've got a
>> perl feature request with this filed).
>
> What is your arbitrary definition of "a group"?  The group of printable
> ASCII characters is [!-~] (or [  -~] with a tab and space instead of !).
> I'd hate to have that be an error.
>
> jl
>

Aw, c'mon, JL, you know what he means. Letters and Numbers. And while any 
particular group might be discontiguous, such as accents, he means "any 
span that includes characters in different classes". Of course, EBCDIC 
has disjoint Letters, but I'm betting that IBM implementations using 
EBCDIC interpret [A-Z] as [A-IJ-RS-Z].
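
For anyone who wants to see the EBCDIC gaps for themselves, here is a minimal Python sketch. It assumes Python's "cp037" codec (EBCDIC, US/Canada) is a fair stand-in for "EBCDIC" in general; the function name contiguous_runs is made up for illustration. It says nothing about how any IBM regex implementation actually treats [A-Z]; it only shows where the letter codes stop being consecutive.

```python
import string


def contiguous_runs(letters=string.ascii_uppercase, codec="cp037"):
    """Group letters into runs whose byte codes in `codec` are consecutive."""
    runs = []          # each entry is [first_letter, last_letter]
    prev_code = None
    for ch in letters:
        code = ch.encode(codec)[0]   # single-byte code of this letter
        if prev_code is not None and code == prev_code + 1:
            runs[-1][1] = ch         # still consecutive: extend the current run
        else:
            runs.append([ch, ch])    # gap found: start a new run
        prev_code = code
    return [f"{first}-{last}" for first, last in runs]


if __name__ == "__main__":
    print(contiguous_runs())                # EBCDIC, code page 037
    print(contiguous_runs(codec="ascii"))   # ASCII, for comparison
```

Under cp037 the uppercase letters fall into three runs, A-I, J-R and S-Z, while under ASCII they form the single run A-Z.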

Still, I agree with you that he shouldn't want that, at least not as a 
default; perhaps as a trailing flag after an RE: with it, /[ -Z]/r ('r' 
for 'range'; 'g' is taken) would fail.
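
A rough sketch of what such a strict-range check might look like, in Python. This is purely hypothetical: no regex engine I know of has an 'r' flag like this, the four class names (digit, upper, lower, other) are my own arbitrary choice, and it only handles ASCII.

```python
def char_class(ch):
    """Classify one ASCII character; the four class names are illustrative."""
    if "0" <= ch <= "9":
        return "digit"
    if "A" <= ch <= "Z":
        return "upper"
    if "a" <= ch <= "z":
        return "lower"
    return "other"


def range_crosses_classes(lo, hi):
    """Return True if the range lo-hi holds characters of more than one class."""
    if ord(lo) > ord(hi):
        raise ValueError("range endpoints are out of order")
    classes = {char_class(chr(c)) for c in range(ord(lo), ord(hi) + 1)}
    return len(classes) > 1


if __name__ == "__main__":
    for lo, hi in [(" ", "Z"), ("A", "Z"), ("A", "z"), ("!", "~")]:
        verdict = "reject" if range_crosses_classes(lo, hi) else "accept"
        print(f"[{lo}-{hi}] -> {verdict}")
```

Note that this check would also reject [!-~], which is exactly the case JL does not want to lose, so it only makes sense as an opt-in.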

I often use "/[^ -~]" to go to lines with "bogus" (non-printable or 
non-ASCII) characters in them.
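
The same search outside an editor, as a small Python sketch. It assumes the goal is to find characters outside the space-through-tilde range, which takes a negated class; the name bogus_lines is made up for illustration. A tab counts as "bogus" here, since it sorts below space.

```python
import re

# Anything that is NOT in the printable ASCII range, space through tilde.
NON_PRINTABLE = re.compile(r"[^ -~]")


def bogus_lines(text):
    """Return the 1-based numbers of lines holding a non-printable-ASCII char."""
    return [
        number
        # Split on "\n" only, so other control characters stay in their line.
        for number, line in enumerate(text.split("\n"), start=1)
        if NON_PRINTABLE.search(line)
    ]


if __name__ == "__main__":
    sample = "plain\nna\u00efve\nwith\ttab\nok"
    print(bogus_lines(sample))
```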

JIM


