regex - Making the longest match (from the beginning) that does not contain a substring accessible in the replacement


I wondered whether it is possible, using sed, to match the longest string (from the beginning) that does not contain a given substring, and to make that match accessible afterwards through sed's replacement back-references \n (\1, \2, and so on).

Consider the following snippet:

echo "blabla/a/b/dee/per" | sed -r -e 's:([^/a]*):\1:g' 

I am trying to print the longest match, as indicated by the *, that does not include the substring /a, so that the snippet above prints

blabla 

For the following input (with /a deleted/replaced)

echo "blabla/b/b/dee/per" | sed -r -e 's:([^/a]*):\1:g' 

I am expecting

blabla/b/b/dee/per 

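as output, because the substring /a is not present, so the longest match extends to the end of the string. I am stuck at describing the substring /a.

Caution: [^/a] is only a placeholder to describe the problem (as a bracket expression it excludes the single characters / and a, not the substring /a). In my opinion it needs to be replaced with a correct description of the substring. Is this possible using sed?

Thanks in advance.

Edit: john1024's third answer completes the question. The following snippet was used: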

 sed -r -e 's:(/a|$):\x00:;s:^(.*)\x00(.*):\1:g' 

Edit: To fulfil the original task of prepending values to paths with different prefixes, where the substring is surrounded by other characters, I came up with:

 $ echo -ne "blabla/a/b/dee/per\nblabla/b/dee/per" | \
     sed -r -e 's:(.*)/a/b:\1\x00:;s:(.*)/b:\1\x01:;s:^(.*)\x00(.*):\1/foo/a/b\2:g;s:^(.*)\x01(.*):\1/foo/b\2:g'
 blabla/foo/a/b/dee/per
 blabla/foo/b/dee/per

This first replaces the path fragment /a/b or /b with \x00 or \x01 respectively, which makes the sed groups, i.e. the prefix and suffix paths, accessible through \n as described below.

Note: an additional trick is used here to keep (.*)/b from matching where (.*)/a/b should: the longest path fragments are replaced first. Thanks again, @john1024.
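To illustrate why the order of the substitutions matters, here is a minimal sketch that wraps the command above in a function and compares it with a reversed variant. The function names are hypothetical and the reversed order is included only for comparison; both assume GNU sed for -r and the \x00 / \x01 escapes.

```shell
# Longest fragment first (the order used above): /a/b is tagged before /b.
prepend_longest_first() {
    sed -r 's:(.*)/a/b:\1\x00:; s:(.*)/b:\1\x01:; s:^(.*)\x00(.*):\1/foo/a/b\2:; s:^(.*)\x01(.*):\1/foo/b\2:'
}

# Hypothetical reversed order, for comparison only: /b is tagged first,
# so it also claims the /b that is part of /a/b.
prepend_shortest_first() {
    sed -r 's:(.*)/b:\1\x01:; s:(.*)/a/b:\1\x00:; s:^(.*)\x00(.*):\1/foo/a/b\2:; s:^(.*)\x01(.*):\1/foo/b\2:'
}

echo "blabla/a/b/dee/per" | prepend_longest_first    # blabla/foo/a/b/dee/per
echo "blabla/a/b/dee/per" | prepend_shortest_first   # blabla/a/foo/b/dee/per
```

With the reversed order, /foo ends up after /a instead of before it, which is the mismatch the note describes.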

Find the string from the beginning until the first occurrence of /a (2nd version of the question)

$ echo "blabla/a/b/dee/per" | sed 's|/a.*||'
blabla
$ echo "blabla/b/b/dee/per" | sed 's|/a.*||'
blabla/b/b/dee/per
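As a sketch of how this could be reused in a script, the sed command can be wrapped in a function. The second function is a pure-shell alternative using parameter expansion; it is not part of the answer and is shown only as an assumption about what may be convenient when the string is already in a variable. Both function names are hypothetical.

```shell
# sed version: delete everything from the first /a to the end of the line.
before_first_a_sed() {
    printf '%s\n' "$1" | sed 's|/a.*||'
}

# Pure-shell alternative (not from the answer above):
# %% strips the longest suffix matching the glob pattern "/a*",
# i.e. everything from the first /a onwards.
before_first_a_sh() {
    printf '%s\n' "${1%%/a*}"
}

before_first_a_sed "blabla/a/b/dee/per"   # blabla
before_first_a_sh  "blabla/b/b/dee/per"   # blabla/b/b/dee/per
```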

Find the longest string not containing /a (original question)

This problem is a more natural match for awk:

$ echo "blabla/a/b/dee/per" | awk -v RS='/a' 'length($0)>max{longest=$0; max=length(longest);} END{print longest;}'
/b/dee/per
$ echo "blabla/b/b/dee/per" | awk -v RS='/a' 'length($0)>max{longest=$0; max=length(longest);} END{print longest;}'
blabla/b/b/dee/per

How it works:

  • -v RS='/a'

    This sets the record separator to /a, which divides the input at every occurrence of /a. A multi-character RS needs an awk that supports it, such as GNU awk.

  • length($0)>max{longest=$0; max=length(longest);}

    If the current record, $0, is longer than the previous longest record, update longest and max with the new record.

  • END{print longest;}

    When we reach the end of the input, print out the longest record that we saw.
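The steps above can be sketched as a small reusable function. The function name and its two-argument interface are hypothetical; the sketch assumes an awk that accepts a multi-character RS (GNU awk or mawk, for example) and a separator without regex metacharacters, since such an RS is treated as a regular expression.

```shell
# longest_without INPUT SEP
# Split INPUT at every occurrence of SEP and print the longest piece.
longest_without() {
    # printf '%s' avoids a trailing newline, which would otherwise be
    # counted as part of the last record once RS is no longer "\n".
    printf '%s' "$1" |
        awk -v RS="$2" '
            length($0) > max { longest = $0; max = length(longest) }
            END { print longest }
        '
}

longest_without "blabla/a/b/dee/per" "/a"   # /b/dee/per
longest_without "blabla/b/b/dee/per" "/a"   # blabla/b/b/dee/per
```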

Capture the string from the beginning to the first /a in a sed group (3rd version of the question)

$ echo "blabla/a/b/dee/per" | sed -r 's!(/a|$)!\x00!; s|^(.*)\x00.*|I found "\1".|'
I found "blabla".
$ echo "blabla/b/b/dee/per" | sed -r 's!(/a|$)!\x00!; s|^(.*)\x00.*|I found "\1".|'
I found "blabla/b/b/dee/per".

How it works:

  • s!(/a|$)!\x00!

    This replaces the first occurrence of /a with the NUL character, \x00. If no occurrence of /a is found, the NUL character is placed at the end of the string (signified in the regex by $). (The NUL character was chosen because it can never be held in a bash variable and is thus extremely unlikely to be in the input string.)

  • s|^(.*)\x00.*|I found "\1".|

    This saves in group 1 all the characters up to the location where the first /a used to be. You can use \1 in the replacement as you please.

As written, this requires a sed, such as GNU sed, that supports the NUL character, hex 00. If your sed does not support NUL, replace \x00 with some character that won't be in your input string and that your sed does support. \x01 might be a second choice.
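As a sketch of the fallback mentioned above, the same two-step substitution can use \x01 as the sentinel and capture the remainder in a second group as well. The function name and the output format are hypothetical; the \x01 escape and the -r option are still assumed to be available, as in GNU sed.

```shell
# Tag the first /a (or the end of the line) with \x01, then expose the text
# before and after the tag as groups \1 and \2.
split_at_first_a() {
    printf '%s\n' "$1" |
        sed -r 's!(/a|$)!\x01!; s|^(.*)\x01(.*)|prefix="\1" rest="\2"|'
}

split_at_first_a "blabla/a/b/dee/per"   # prefix="blabla" rest="/b/dee/per"
split_at_first_a "blabla/b/b/dee/per"   # prefix="blabla/b/b/dee/per" rest=""
```

Note that the /a itself is consumed by the sentinel, so it has to be written back explicitly if it is needed in the output.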

