[Novalug] question about grep in perl

John Holland jholland@vin-dit.org
Tue Mar 28 17:08:04 EDT 2017


Well, the problem as given was to process the whole file; there's no getting around reading the entire file at least once.

I think the questions to be asked are:
How many comparisons are being done?
How much memory is used?
How concise is the code?
Can it be plugged into something as it is? (Does it fit Bonnie's codebase, or is it generic enough to fit just about any codebase?)
Does it use grep? (This was the original question: whether grep would be useful. See the sketch just below.)
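
To make that concrete, here's a minimal Perl sketch (the file name and pattern are hypothetical, not from Bonnie's code) contrasting the grep approach with a line-at-a-time loop. Both do the same number of comparisons; the difference is that the first holds the whole file in memory while the second holds one line at a time:

#!/usr/bin/perl
use strict;
use warnings;

my $file    = 'pedigree.txt';   # hypothetical input file
my $pattern = qr/\bsire\b/;     # hypothetical pattern

# Approach 1: slurp every line, then filter with grep.
# Concise, but memory use grows with the file size.
open my $fh, '<', $file or die "open $file: $!";
my @matches = grep { /$pattern/ } <$fh>;
close $fh;

# Approach 2: read one line at a time.
# Same comparisons, but only one input line in memory at once.
open $fh, '<', $file or die "open $file: $!";
my @matches2;
while (my $line = <$fh>) {
    push @matches2, $line if $line =~ /$pattern/;
}
close $fh;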

Re the examples you give:
The first one, which rebuilds the tree every time the page loads, fails by repeating a very expensive computation whose result could be computed once and saved.
The second one, I'm not sure is the same situation. Does it RELOAD the mappings for every login?
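
The first case has a standard fix: compute the tree once and reuse it. Here's a minimal Perl sketch of that idea; build_tag_tree() is hypothetical and stands in for whatever expensive work the page currently repeats on every load:

#!/usr/bin/perl
use strict;
use warnings;

sub build_tag_tree {
    # Hypothetical stand-in for the expensive step (e.g. collecting
    # 22,000 tags across several million documents).
    return { root => [] };
}

my $cached_tree;   # persists for the life of the process

sub tag_tree {
    # Build on first use only; every later call reuses the result.
    $cached_tree //= build_tag_tree();
    return $cached_tree;
}

The cache would of course need invalidating when tags change, but that's cheap next to rebuilding a 22,000-tag tree on every page render.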

John
> On Mar 28, 2017, at 5:02 PM, William Sutton via Novalug <novalug@firemountain.net> wrote:
> 
> My concern is that someone starts with a test case (maybe not Bonnie for her pedigree application; maybe some business that wants to be "enterprisey" when it grows up) and does it in some manner that is fine for small data sets but breaks badly when it scales up to millions, tens of millions, or billions of items.
> 
> For small throwaway stuff, no big deal.  But I think one should be aware of the ramifications of such a decision and knowingly accept them...rather than do it blindly and then have to deal with a problem.
> 
> Two examples on point:
> 
> The iConect litigation review tool pulls all of your tagged (think bookmarked, if you like) items and dynamically builds an expanding/collapsing tree of them every time it renders a page.  That's fine if you have 100 tags.  I once imported a data set that had 22,000 tags across several million documents.  The result was that it took 15+ minutes for their page to load.  Internally.  On 10GE.
> 
> The kCura Relativity litigation review tool loads all user and case mappings into the login page prior to login.  That's great if you have a small law firm with a few dozen cases.  If you have 1,000+, it slows to a crawl.
> 
> I could go on.
> 
> William Sutton
> 
> On Tue, 28 Mar 2017, Howard L via Novalug wrote:
> 
>> Seems like a lot less processing to allocate and write to memory once than to do it 1,000 times.
>> Even for a 1 MB file.
>> 
>> On 03/28/2017 03:23 PM, John Holland via Novalug wrote:
>>> The only beef I could see with it is that it loads the whole file into memory at once; you can do it with only a line at a time in memory.
>>> >
>> 