regex - accessible longest match (from the beginning) without a substring in the replacement
I wondered whether it is possible, using sed, to match the longest string (from the beginning) that does not contain a given substring, and to make that match accessible afterwards through sed's replacement back-references \n (that is, \1, \2, and so on).
Consider the following snippet:
echo "blabla/a/b/dee/per" | sed -r -e 's:([^/a]*):\1:g'
I am trying to print the longest match, indicated by the *, that does not include the substring /a, so that the snippet above prints
blabla
For an input in which /a has been deleted or replaced,
echo "blabla/b/b/dee/per" | sed -r -e 's:([^/a]*):\1:g'
I am expecting
blabla/b/b/dee/per
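as output: because the substring /a is not present, the longest match extends to the end of the string. I am stuck at describing the substring /a.
Caution: [^/a] is only a placeholder to describe the problem. As a bracket expression it excludes the single characters / and a, not the substring /a, so in my opinion it needs to be replaced by a correct description of the substring. Is this possible using sed?
Thanks in advance.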
Edit: john1024's third answer completes the question. The following snippet was used:
sed -r -e 's:(/a|$):\x00:;s:^(.*)\x00(.*):\1:g'
Edit: to fulfil the original task of prepending a value to different path prefixes that contain the substring surrounded by other characters, I came up with:
$ echo -ne "blabla/a/b/dee/per\nblabla/b/dee/per" | \
  sed -r -e 's:(.*)/a/b:\1\x00:;s:(.*)/b:\1\x01:;s:^(.*)\x00(.*):\1/foo/a/b\2:g;s:^(.*)\x01(.*):\1/foo/b\2:g'
blabla/foo/a/b/dee/per
blabla/foo/b/dee/per
This first replaces the path prefixes /a/b or /b with \x00 or \x01 respectively, which makes the sed groups (the prefix and the suffix of the path) accessible through \n, as described in the answer.
Note: an additional trick is used here to keep (.*)/b from matching where (.*)/a/b should match: the longest path prefixes are replaced first. Thanks again, @john1024.
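The same ordering idea can also be sketched without sentinel bytes. This is a different technique from the snippet above: it relies on sed's `t` command, which jumps to the end of the script once a substitution has succeeded, so the shorter /b rule never runs on a line that already matched /a/b. The function name `insert_foo` is made up for this illustration, and `-E` is used as the equivalent of `-r`.

```shell
# Hypothetical helper: insert "/foo" before the last "/a/b" on each line,
# or, if there is no "/a/b", before the last "/b".
insert_foo() {
  sed -E -e 's:^(.*)/a/b:\1/foo/a/b:' \
         -e t \
         -e 's:^(.*)/b:\1/foo/b:'
  # "-e t" skips the second rule whenever the first one matched.
}

printf 'blabla/a/b/dee/per\nblabla/b/dee/per\n' | insert_foo
# prints:
# blabla/foo/a/b/dee/per
# blabla/foo/b/dee/per
```

Because no placeholder byte is written into the line, this variant does not depend on `\x00` or `\x01` support.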
Find the string from the beginning until the first occurrence of /a (2nd version of the question)
$ echo "blabla/a/b/dee/per" | sed 's|/a.*||'
blabla
$ echo "blabla/b/b/dee/per" | sed 's|/a.*||'
blabla/b/b/dee/per
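If the string is already in a shell variable, the same cut can be sketched with POSIX parameter expansion instead of sed. This is an alternative, not part of the answer above, and the function name `before_first_a` is made up for the example.

```shell
# Hypothetical helper: print everything before the first "/a" in $1.
before_first_a() {
  # ${1%%/a*} removes the longest suffix matching the glob "/a*",
  # i.e. everything from the first "/a" to the end of the string.
  # If "/a" does not occur, the string is left unchanged.
  printf '%s\n' "${1%%/a*}"
}

before_first_a "blabla/a/b/dee/per"   # prints: blabla
before_first_a "blabla/b/b/dee/per"   # prints: blabla/b/b/dee/per
```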
Find the longest string not containing /a (original question)
This problem is a more natural match for awk:
$ echo "blabla/a/b/dee/per" | awk -v RS='/a' 'length($0)>max{longest=$0; max=length(longest);} END{print longest;}'
/b/dee/per
$ echo "blabla/b/b/dee/per" | awk -v RS='/a' 'length($0)>max{longest=$0; max=length(longest);} END{print longest;}'
blabla/b/b/dee/per
How it works:
-v RS='/a'
This sets the record separator to /a. It divides the input at every occurrence of /a.
length($0)>max{longest=$0; max=length(longest);}
If the current record, $0, is longer than the previous longest record, update longest and max with the new record.
END{print longest;}
When we reach the end of the input, print the longest record that we saw.
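A multi-character RS is handled as a separator pattern by gawk and mawk, but POSIX leaves its behaviour unspecified. As a rough sketch under that assumption, the same "longest piece" logic can be written with split(), which accepts a multi-character separator in any POSIX awk. The function name `longest_without_a` is made up for this illustration, and it works line by line rather than on the whole input.

```shell
# Hypothetical helper: print the longest piece of $1 after cutting it at every "/a".
longest_without_a() {
  printf '%s\n' "$1" | awk '{
    n = split($0, parts, "/a")      # cut the line at each occurrence of "/a"
    best = ""
    for (i = 1; i <= n; i++)        # keep the first piece that is strictly longest
      if (length(parts[i]) > length(best)) best = parts[i]
    print best
  }'
}

longest_without_a "blabla/a/b/dee/per"   # prints: /b/dee/per
longest_without_a "blabla/b/b/dee/per"   # prints: blabla/b/b/dee/per
```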
Capture the string from the beginning to the first /a in a sed group (3rd version of the question)
$ echo "blabla/a/b/dee/per" | sed -r 's!(/a|$)!\x00!; s|^(.*)\x00.*|I found "\1".|'
I found "blabla".
$ echo "blabla/b/b/dee/per" | sed -r 's!(/a|$)!\x00!; s|^(.*)\x00.*|I found "\1".|'
I found "blabla/b/b/dee/per".
How it works:
s!(/a|$)!\x00!
This replaces the first occurrence of /a with the NUL character, \x00. If no occurrence of /a is found, the NUL character is placed at the end of the string (signified in the regex by $). (The NUL character was chosen because it can never be held in a bash variable and is thus extremely unlikely to be in the input string.)
s|^(.*)\x00.*|I found "\1".|
This saves in group 1 all the characters up to the location where the first /a used to be. You can use \1 in the replacement however you please.
As written, this requires a sed, such as GNU sed, that supports the NUL character, hex 00. If your sed does not support NUL, replace \x00 with a character that won't be in your input string and that your sed does support. \x01 might be a good second choice.
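If the sed at hand does not understand \xHH escapes at all, one possible workaround is to let printf produce the sentinel byte and paste it into the sed script. The sketch below assumes that byte 0x01 never occurs in the input. The function name `split_at_first_a` and the `prefix=[...] suffix=[...]` output format are made up for the example, and `-E` is used as the equivalent of `-r`.

```shell
# Hypothetical helper: show the text before and after the first "/a" in $1.
split_at_first_a() {
  sentinel=$(printf '\001')   # a literal 0x01 byte, assumed absent from the input
  printf '%s\n' "$1" |
    sed -E -e "s:(/a|\$):${sentinel}:" \
           -e "s:^(.*)${sentinel}(.*)\$:prefix=[\\1] suffix=[\\2]:"
  # 1st expression: mark the first "/a" (or the end of the line) with the sentinel.
  # 2nd expression: group 1 is the text before the mark, group 2 the text after it.
}

split_at_first_a "blabla/a/b/dee/per"   # prints: prefix=[blabla] suffix=[/b/dee/per]
split_at_first_a "blabla/b/b/dee/per"   # prints: prefix=[blabla/b/b/dee/per] suffix=[]
```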