grep 2 at signs - SCO
grep 2 at signs
Hello all,
bob@vodka.com
jpr@jane.com
jeff@needshelp.net
joey@test.net
jeff@waylon@bank.com
^------^------------ 2 @ signs
I want to locate any email with 2 "at" signs inside file "file_list".
Keep in mind there is only ONE email per line and there are
10's of thousands of email addresses in "file_list".
# grep "*@*@" file_list Displays every line
# grep "*\@*\@" file_list Displays every line
Thanks in advance,
Jeff H
-
Re: grep 2 at signs
Jeff Hyman wrote (on Mon, Jul 10, 2006 at 04:46:54PM -0400):
| bob@vodka.com
| jpr@jane.com
| jeff@needshelp.net
| joey@test.net
| jeff@waylon@bank.com
| ^------^------------ 2 @ signs
|
| I want to locate any email with 2 "at" signs inside file "file_list".
| Keep in mind there is only ONE email per line and there are
| 10's of thousands of email addresses in "file_list".
|
| # grep "*@*@" file_list Displays every line
| # grep "*\@*\@" file_list Displays every line
How about:
egrep '.*@.*@' file_list
or
awk '/.*@.*@/ { print }'file_list
Bob (who would use mawk 'cause it'd be fastest)
--
Bob Stockler +-+ bob@trebor.iglou.com +-+ http://members.iglou.com/trebor
-
Re: grep 2 at signs
Jeff Hyman typed (on Mon, Jul 10, 2006 at 04:46:54PM -0400):
|
| Hello all,
|
| bob@vodka.com
| jpr@jane.com
| jeff@needshelp.net
| joey@test.net
| jeff@waylon@bank.com
| ^------^------------ 2 @ signs
|
| I want to locate any email with 2 "at" signs inside file "file_list".
| Keep in mind there is only ONE email per line and there are
| 10's of thousands of email addresses in "file_list".
|
| # grep "*@*@" file_list Displays every line
| # grep "*\@*\@" file_list Displays every line
Really?? Neither of those commands should find any match whatsoever in
that list; have you some other odd filename in your current directory
which the shell is expanding before passing grep its argument?
In a regular expression, '*' is not a wildcard. It indicates any
number of matches (including none) of what precedes it. In a regular
expression, "." stands for any single character, and ".*" for any number
of any characters, including none.
Add 'two@@any.org' to your list, just for kicks. Then try:
grep "@.*@" file_list
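As a rough illustration of the point above, here is a minimal Python sketch. It uses Python's `re` module rather than grep, so the syntax details are not identical to grep's basic regular expressions. The list `addresses` is simply the sample data from the thread plus the suggested 'two@@any.org', and the variable names are made up for the example. For contrast it also runs a shell-style glob through `fnmatch`, where '*' really is a wildcard.

```python
import fnmatch
import re

addresses = [
    "bob@vodka.com",
    "jpr@jane.com",
    "jeff@needshelp.net",
    "joey@test.net",
    "jeff@waylon@bank.com",
    "two@@any.org",
]

# Regex: an '@', then any number of any characters (including none), then '@'.
two_ats = re.compile(r"@.*@")
regex_matches = [a for a in addresses if two_ats.search(a)]

# Glob: here '*' by itself means "any run of characters".
glob_matches = [a for a in addresses if fnmatch.fnmatchcase(a, "*@*@*")]

print(regex_matches)  # ['jeff@waylon@bank.com', 'two@@any.org']
print(glob_matches)   # ['jeff@waylon@bank.com', 'two@@any.org']
```

Note that both forms select lines with two or more at signs, not exactly two.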
--
JP
==> http://www.frappr.com/cusm <==
-
Re: grep 2 at signs
Bob Stockler typed (on Mon, Jul 10, 2006 at 05:16:48PM -0400):
| Jeff Hyman wrote (on Mon, Jul 10, 2006 at 04:46:54PM -0400):
|
| | bob@vodka.com
| | jpr@jane.com
| | jeff@needshelp.net
| | joey@test.net
| | jeff@waylon@bank.com
| | ^------^------------ 2 @ signs
| |
| | I want to locate any email with 2 "at" signs inside file "file_list".
| | Keep in mind there is only ONE email per line and there are
| | 10's of thousands of email addresses in "file_list".
| |
| | # grep "*@*@" file_list Displays every line
| | # grep "*\@*\@" file_list Displays every line
|
| How about:
|
| egrep '.*@.*@' file_list
| or
| awk '/.*@.*@/ { print }'file_list
|
| Bob (who would use mawk 'cause it'd be fastest)
When it comes to commands, JP goes by the shibboleth that
'shorter is better', so he'd scratch one each of ".", "*",
" ", "{", " ", "p", "r", "i", "n", "t", " ", "}" and just type:
mawk '/@.*@/' file_list
--
JP
==> http://www.frappr.com/cusm <==
-
Re: grep 2 at signs
Jean-Pierre Radley wrote (on Mon, Jul 10, 2006 at 06:05:44PM -0400):
| Bob Stockler typed (on Mon, Jul 10, 2006 at 05:16:48PM -0400):
| | Jeff Hyman wrote (on Mon, Jul 10, 2006 at 04:46:54PM -0400):
| |
| | | bob@vodka.com
| | | jpr@jane.com
| | | jeff@needshelp.net
| | | joey@test.net
| | | jeff@waylon@bank.com
| | | ^------^------------ 2 @ signs
| | |
| | | I want to locate any email with 2 "at" signs inside file "file_list".
| | | Keep in mind there is only ONE email per line and there are
| | | 10's of thousands of email addresses in "file_list".
| | |
| | | # grep "*@*@" file_list Displays every line
| | | # grep "*\@*\@" file_list Displays every line
| |
| | How about:
| |
| | egrep '.*@.*@' file_list
| | or
| | awk '/.*@.*@/ { print }'file_list
| |
| | Bob (who would use mawk 'cause it'd be fastest)
|
|
| When it comes to commands, JP goes by the shibboleth that
| 'shorter is better', so he'd scratch one each of ".", "*",
| " ", "{", " ", "p", "r", "i", "n", "t", " ", "}" and just type:
|
| mawk '/@.*@/' file_list
More elegant . . . less informative to those less knowledgeable.
OTOH it does, in fact, inform (or remind) us of the subtleties
of AWK, which by default prints any matched line.
Bob
--
Bob Stockler +-+ bob@trebor.iglou.com +-+ http://members.iglou.com/trebor
-
Re: grep 2 at signs
Bob Stockler typed (on Mon, Jul 10, 2006 at 06:22:12PM -0400):
| Jean-Pierre Radley wrote (on Mon, Jul 10, 2006 at 06:05:44PM -0400):
|
| | Bob Stockler typed (on Mon, Jul 10, 2006 at 05:16:48PM -0400):
| | | Jeff Hyman wrote (on Mon, Jul 10, 2006 at 04:46:54PM -0400):
| | |
| | | | bob@vodka.com
| | | | jpr@jane.com
| | | | jeff@needshelp.net
| | | | joey@test.net
| | | | jeff@waylon@bank.com
| | | | ^------^------------ 2 @ signs
| | | |
| | | | I want to locate any email with 2 "at" signs inside file "file_list".
| | | | Keep in mind there is only ONE email per line and there are
| | | | 10's of thousands of email addresses in "file_list".
| | | |
| | | | # grep "*@*@" file_list Displays every line
| | | | # grep "*\@*\@" file_list Displays every line
| | |
| | | How about:
| | |
| | | egrep '.*@.*@' file_list
| | | or
| | | awk '/.*@.*@/ { print }'file_list
| | |
| | | Bob (who would use mawk 'cause it'd be fastest)
| |
| |
| | When it comes to commands, JP goes by the shibboleth that
| | 'shorter is better', so he'd scratch one each of ".", "*",
| | " ", "{", " ", "p", "r", "i", "n", "t", " ", "}" and just type:
| |
| | mawk '/@.*@/' file_list
|
| More elegant . . . less informative to those less knowledgeable.
|
| OTOH it does, in fact, inform (or remind) us of the subtleties
| of AWK, which by default prints any matched line.
|
| Bob
|
| --
| Bob Stockler +-+ bob@trebor.iglou.com +-+ http://members.iglou.com/trebor
Well guys, here are the results:
egrep '.*@.*@' list # Works, but slow and prints twice
# real 0m1.19s
# user 0m1.16s
# sys 0m0.02s
awk '/.*@.*@/ { print }'list # does not work... just hangs
# ps shows it's doing something
grep "@.*@" list # Works & fast
# real 0m0.07s
# user 0m0.07s
# sys 0m0.01s
mawk '/@.*@/' list # Works & Fastest
# real 0m0.01s
# user 0m0.01s
# sys 0m0.01s
You guys (as always) have been a great help and I thank you!
Jeff H
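One way to see why the leading '.*' is dead weight: an unanchored search already matches anywhere in the line, so '.*@.*@' and '@.*@' select the same lines and the prefix only gives the engine more to try. The sketch below is a hedged Python illustration with synthetic data standing in for the real list. Python's regex engine is not the one inside egrep, grep, or mawk, so any timings it prints are not comparable to the figures above.

```python
import re
import timeit

# Synthetic stand-in for the real list: mostly ordinary addresses,
# plus two lines containing more than one '@'.
lines = [f"user{i}@example.com" for i in range(20000)]
lines[5000] = "jeff@waylon@bank.com"
lines[15000] = "two@@any.org"

with_prefix = re.compile(r".*@.*@")
without_prefix = re.compile(r"@.*@")


def select(pattern):
    """Return the lines in which the pattern is found anywhere."""
    return [line for line in lines if pattern.search(line)]


# Both patterns pick out exactly the same lines.
same = select(with_prefix) == select(without_prefix)
print("same selection:", same)

for name, pattern in (("'.*@.*@'", with_prefix), ("'@.*@'", without_prefix)):
    seconds = timeit.timeit(lambda: select(pattern), number=5)
    print(f"{name}: {seconds:.3f}s for 5 passes")
```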
-
Re: grep 2 at signs
On Tue, Jul 11, 2006, Jeff Hyman wrote:
>Bob Stockler typed (on Mon, Jul 10, 2006 at 06:22:12PM -0400):
>| Jean-Pierre Radley wrote (on Mon, Jul 10, 2006 at 06:05:44PM -0400):
>|
....
>| | | | I want to locate any email with 2 "at" signs inside file "file_list".
>| | | | Keep in mind there is only ONE email per line and there are
>| | | | 10's of thousands of email addresses in "file_list".
....
>Well guys, here's the results:
>
I would think these might have problems given the greedy nature
of regular expressions (e.g. .*@ matches the longest string
ending in @).
This might be a better solution:
    #!/usr/local/bin/python
    # Written for Python 3; the print call differs from Python 2's print statement.
    import re

    pattern = re.compile(r'.*?@.*@')  # the '?' makes the first '.*' non-greedy
    with open('file_list') as fh:
        for line in fh:
            if pattern.search(line):
                print(line, end='')  # line already ends with a newline
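If the distinction between "exactly two" and "two or more" at signs matters, counting characters may be simpler than a regular expression. The following is only a sketch: `count_ats` and the `sample` list are hypothetical names and data invented for illustration, not anything from the thread.

```python
def count_ats(address):
    """Return how many '@' characters the address contains."""
    return address.count("@")


sample = [
    "bob@vodka.com",
    "jeff@waylon@bank.com",
    "a@b@c@d.org",
]

two_or_more = [a for a in sample if count_ats(a) >= 2]
exactly_two = [a for a in sample if count_ats(a) == 2]

print(two_or_more)  # ['jeff@waylon@bank.com', 'a@b@c@d.org']
print(exactly_two)  # ['jeff@waylon@bank.com']
```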
> egrep '.*@.*@' list # Works, but slow and prints twice
> # real 0m1.19s
> # user 0m1.16s
> # sys 0m0.02
>
> awk '/.*@.*@/ { print }'list # does not work... just hangs
> # ps shows its doing something
>
>
> grep "@.*@" list # Works & fast
> # real 0m0.07s
> # user 0m0.07s
> # sys 0m0.01
>
> mawk '/@.*@/' list # Works & Fastest
> # real 0m0.01s
> # user 0m0.01s
> # sys 0m0.01
>
>You guys (as always) have been a great help and I thank you!
>
>Jeff H
>
--
Bill
--
INTERNET: bill@Celestial.COM Bill Campbell; Celestial Software LLC
URL: http://www.celestial.com/ PO Box 820; 6641 E. Mercer Way
FAX: (206) 232-9186 Mercer Island, WA 98040-0820; (206) 236-1676
``If ye love wealth greater than liberty, the tranquillity of servitude
greater than the animating contest for freedom, go home from us in peace.
We seek not your consul, nor your arms. Crouch down and lick the hand that
feeds you. May your chains set lightly upon you; and may posterity forget
ye were our countrymen.'' -- Samuel Adams (American Patriot)
-
Re: grep 2 at signs
Bill Campbell wrote:
>...
>>| | | | I want to locate any email with 2 "at" signs inside file "file_list".
>>| | | | Keep in mind there is only ONE email per line and there are
>>| | | | 10's of thousands of email addresses in "file_list".
>...
>
>I would think these might have problems given the greedy nature
>of regular expressions (e.g. .*@ matches the longest string
>ending in @).
Greedy expressions will never prevent a match; "greedy" affects only how much
is matched.
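A small Python sketch of that distinction, with the caveat that it uses Python's `re` module (which supports the non-greedy '?' modifier) rather than grep or awk. The test string is made up for the example. Greedy and non-greedy patterns agree on whether a line matches; they differ only in which part of the line the match covers.

```python
import re

line = "a@b@c@d"

greedy = re.search(r".*@.*@", line)
lazy = re.search(r".*?@.*?@", line)

# Both find a match, so both would select this line...
print(greedy is not None, lazy is not None)  # True True

# ...but the matched text differs.
print(greedy.group())  # a@b@c@
print(lazy.group())    # a@b@

# A line with a single '@' is rejected either way.
single = "bob@vodka.com"
print(re.search(r".*@.*@", single), re.search(r".*?@.*?@", single))  # None None
```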
Another solution, about as fast as mawk, is to use 'pgrep' (supplied with
gwxlibs).
John
--
John DuBois spcecdt@armory.com KC6QKZ/AE http://www.armory.com/~spcecdt/
-
Re: grep 2 at signs
John DuBois wrote:
> Another solution, about as fast as mawk, is to use 'pgrep' (supplied
> with gwxlibs).
That's actually `pcregrep`. gwxlibs is unwise (at best) to provide the
`pgrep` alias link. On Linux systems:
$ apropos pgrep; apropos pcregrep
pgrep, pkill - look up or signal processes based on name and other attributes
pcregrep - a grep with Perl-compatible regular expressions
>Bela<
-
Re: grep 2 at signs
Jeff Hyman wrote:
> awk '/.*@.*@/ { print }'list # does not work... just hangs
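Typo: you need a blank before the filename. Without it, the shell joins
the quoted program and "list" into a single argument, so awk gets no file
operand and sits waiting for input on stdin.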
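To see how the command line is split into arguments, Python's `shlex.split` can serve as an approximation of POSIX shell word splitting. This is only a sketch; it does not run awk, it just shows the argument lists that would result.

```python
import shlex

typo = shlex.split("awk '/.*@.*@/ { print }'list")
fixed = shlex.split("awk '/.*@.*@/ { print }' list")

# Without the blank, the file name is glued onto the awk program text.
print(typo)   # ['awk', '/.*@.*@/ { print }list']

# With the blank, awk receives the program and the file as separate arguments.
print(fixed)  # ['awk', '/.*@.*@/ { print }', 'list']
```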