regex - Make the longest match (from the beginning) that does not contain a substring accessible in the replacement
I wondered whether it is possible, using sed, to match the longest string (from the beginning) that does not contain a given substring, and to make that match accessible afterwards through sed's regex replacement variables \n.
Regarding the following snippet
    echo "blabla/a/b/dee/per" | sed -r -e 's:([^/a]*):\1:g'

I am trying to print out the longest match, containing any character as indicated by *, but not including the substring /a, in such a way that the above snippet prints out

    blabla

Regarding the following input (with /a deleted/replaced)

    echo "blabla/b/b/dee/per" | sed -r -e 's:([^/a]*):\1:g'

I am expecting

    blabla/b/b/dee/per

as output, because the substring /a is not present and so the longest match extends to the end of the string. I am stuck at describing the substring /a.
Caution: [^/a] is only a placeholder to describe the problem. In my opinion it needs to be replaced by a correct description of the substring. Is this possible using sed?
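To illustrate why [^/a] can only be a placeholder: a bracket expression excludes the single characters / and a, not the two-character substring /a. The following is a minimal sketch, assuming a sed that accepts -E (GNU sed treats it the same as -r); the function name is made up for illustration. The match already stops at the first a in "blabla":

```shell
# Hypothetical helper name, used only for this illustration.
first_run_without_slash_or_a() {
    # [^/a]* matches the longest run of characters that are neither "/" nor "a".
    sed -E 's:^([^/a]*).*:\1:'
}

printf '%s\n' 'blabla/a/b/dee/per' | first_run_without_slash_or_a   # prints: bl
```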
Thanks in advance.
Edit: john1024's third answer completes the question. The following snippet can be used:

    sed -r -e 's:(/a|$):\x00:;s:^(.*)\x00(.*):\1:g'

Edit: To fulfill my original task, which was to prepend values to paths with different prefixes containing a substring surrounded by other characters, I came up with
    $ echo -ne "blabla/a/b/dee/per\nblabla/b/dee/per" | \
      sed -r -e 's:(.*)/a/b:\1\x00:;s:(.*)/b:\1\x01:;s:^(.*)\x00(.*):\1/foo/a/b\2:g;s:^(.*)\x01(.*):\1/foo/b\2:g'
    blabla/foo/a/b/dee/per
    blabla/foo/b/dee/per

which first replaces the path components /a/b or /b with \x00 or \x01 respectively, making the sed groups, i.e. the prefix and suffix paths, accessible through \n as described below.
Note: the additional trick used here, to keep (.*)/b from matching where (.*)/a/b should match, is to replace the longest path prefixes first. Thanks again, @john1024.
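The ordering trick from the note can be checked with a small sketch. It assumes a sed that understands \x00 and \x01 (GNU sed does); both function names are hypothetical. The second function swaps the first two substitutions, so /b is marked before /a/b and the value ends up in the wrong place:

```shell
# Longest prefix first: /a/b is marked before /b gets a chance to match.
prepend_foo() {
    sed -E 's:(.*)/a/b:\1\x00:; s:(.*)/b:\1\x01:; s:^(.*)\x00(.*):\1/foo/a/b\2:; s:^(.*)\x01(.*):\1/foo/b\2:'
}

# Same substitutions with the first two swapped: /b now matches inside /a/b.
prepend_foo_wrong_order() {
    sed -E 's:(.*)/b:\1\x01:; s:(.*)/a/b:\1\x00:; s:^(.*)\x00(.*):\1/foo/a/b\2:; s:^(.*)\x01(.*):\1/foo/b\2:'
}

printf '%s\n' 'blabla/a/b/dee/per' | prepend_foo               # blabla/foo/a/b/dee/per
printf '%s\n' 'blabla/a/b/dee/per' | prepend_foo_wrong_order   # blabla/a/foo/b/dee/per
```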
Find the string from the beginning until the first occurrence of /a (2nd version of question)

    $ echo "blabla/a/b/dee/per" | sed 's|/a.*||'
    blabla
    $ echo "blabla/b/b/dee/per" | sed 's|/a.*||'
    blabla/b/b/dee/per

Find the longest string not containing /a (original question)
This problem is a more natural match for awk:

    $ echo "blabla/a/b/dee/per" | awk -v RS='/a' 'length($0)>max{longest=$0; max=length(longest);} END{print longest;}'
    /b/dee/per
    $ echo "blabla/b/b/dee/per" | awk -v RS='/a' 'length($0)>max{longest=$0; max=length(longest);} END{print longest;}'
    blabla/b/b/dee/per

How it works:
* -v RS='/a'
  This sets the record separator to /a. It divides the input upon every occurrence of /a.
* length($0)>max{longest=$0; max=length(longest);}
  If the current record, $0, is longer than the previous longest record, update longest and max to the new record.
* END{print longest;}
  When we reach the end of the input, print out the longest record that we saw.
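The steps above can be wrapped in a small function. This is a sketch, not the answer's exact command: it assumes an awk that accepts a multi-character RS (gawk and mawk do; POSIX leaves that case unspecified), the function name is hypothetical, and the sub() call is an addition that drops the trailing newline the last record otherwise keeps:

```shell
# Hypothetical helper: print the longest piece of the input that contains no /a.
longest_without_a() {
    awk -v RS='/a' '
        { sub(/\n$/, "") }                                     # last record still ends in a newline
        length($0) > max { longest = $0; max = length($0) }    # remember the longest record so far
        END { print longest }
    '
}

echo "blabla/a/b/dee/per" | longest_without_a   # /b/dee/per
echo "x/a/bb/a/c"         | longest_without_a   # /bb
```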
Capture the string from the beginning to the first /a in a sed group (3rd version of question)

    $ echo "blabla/a/b/dee/per" | sed -r 's!(/a|$)!\x00!; s|^(.*)\x00.*|I found "\1".|'
    I found "blabla".
    $ echo "blabla/b/b/dee/per" | sed -r 's!(/a|$)!\x00!; s|^(.*)\x00.*|I found "\1".|'
    I found "blabla/b/b/dee/per".

How it works:
* s!(/a|$)!\x00!
  This replaces the first occurrence of /a with a NUL character, \x00. If no occurrence of /a is found, the NUL character is placed at the end of the string (signified in the regex by $). (The NUL character was chosen because it can never be held in a bash variable and is thus extremely unlikely to be in an input string.)
* s|^(.*)\x00.*|I found "\1".|
  This saves in group 1 all the characters up to the location where the first /a used to be. You can use \1 in the replacement as you please.
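To see what the first substitution leaves behind, the NUL marker can be made visible. A minimal sketch, assuming a sed that understands \x00 (such as GNU sed); the function name is hypothetical and tr is used only for display:

```shell
# Hypothetical helper: run only the first of the two substitutions.
mark_first_a() {
    sed -E 's!(/a|$)!\x00!'
}

# tr turns the otherwise invisible NUL into "#" so the marker position shows up.
printf '%s\n' 'blabla/a/b/dee/per' | mark_first_a | tr '\000' '#'   # blabla#/b/dee/per
printf '%s\n' 'blabla/b/b/dee/per' | mark_first_a | tr '\000' '#'   # blabla/b/b/dee/per#
```

Everything to the left of the marker is what the second substitution then captures as group 1.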
As written, this requires a sed, such as GNU sed, that supports the NUL character, hex 00. If your sed does not support NUL, replace \x00 with a character that won't appear in your input string and that your sed does support. \x01 might be a good second choice.
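One way to swap the marker without relying on \x escapes in sed is to let the shell produce the byte and embed it literally in the script. This is only a sketch under assumptions: SEP and found_prefix are hypothetical names, byte 0x01 is assumed never to occur in the input, and : is used as the delimiter so that | stays free for the alternation:

```shell
# Assumed marker: byte 0x01, generated by printf rather than by sed.
SEP=$(printf '\001')

# Hypothetical helper: same two substitutions, with the literal marker byte.
found_prefix() {
    sed -E "s:(/a|\$):${SEP}:; s:^(.*)${SEP}.*:I found \"\\1\".:"
}

echo "blabla/a/b/dee/per" | found_prefix   # I found "blabla".
echo "blabla/b/b/dee/per" | found_prefix   # I found "blabla/b/b/dee/per".
```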