php - Regex lookahead complex pattern
This code is replacing too many single pipes with double pipes. To keep the change small, I'd prefer to correct the second regex so that it allows spaces between the "|" and the ",".
So the question is: how do I modify the second regex so that it does not match \|[[:blank:]]*[^,\r\n]?
Code:
$patterns = array (
    '/\\\\\|,/',
    "/(?<=[^,])\|(?=[^,\n\r])/"
);
$replacements = array (
    '|,',
    '||'
);
$line = preg_replace ($patterns, $replacements, $line);
Example:
For the string: "|di|,|15| ,|c00413914|,|| ,|f|"
Expected / desired result:
"|di|,|15| ,|c00413914|,|| ,|f|"
Actual result:
"|di|,|15|| ,|c00413914|,||| ,|f|"
I've tried this, but it didn't work:
"/(?<=[^,])\|(?=[[:blank:]]*[^,\n\r])/"
Please note:
This question is about fixing a bug with the smallest change possible. The current regex may be suboptimal (for example, using negated character classes instead of negative lookarounds), but the first priority is to minimize the changes, not to optimize the regex.
Update:
In other words, based on my interpretation of the original regex, the revised regex should match a single | followed by 0 or more spaces that is not at the beginning or end of a line, is not preceded by a comma, and is not followed by a comma, \r, or \n.
More examples:
5|foo should match
5| foo should match
5|, should not match
5| , should not match
5|\r should not match
5| \r should not match
,||, should not match
,|| , should not match
I discovered the following while applying the suggestions to real data. The original regex appears to observe this behavior:
|foo|, should not match. The pipe is the first character on the line.
|foo| , should not match. The pipe is the first character on the line.
,|foo| should not match. The pipe is the last character on the line; a newline may not exist (such as at EOF).
,|foo| should not match. The pipe plus trailing whitespace are the last characters on the line; a newline may not exist (such as at EOF).
The regex you were looking for in the beginning can be written as
(?<!,)\|(?![[:blank:]]*[,\n\r])
It matches a pipe only if it is not preceded by a comma and is not followed by optional whitespace and then a comma or line break.
Note that this regex does not need a possessive quantifier: PCRE's internal optimizations already make [[:blank:]]* behave possessively here, because a blank can never match the [,\n\r] that follows it.
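As a quick check, here is a minimal sketch that runs this pattern against the "more examples" list from the question. The helper name pipe_matches is made up for illustration; it is not part of PHP or of the original code.

```php
<?php
// Lookaround-only pattern from above. Single quotes leave \| , \n and \r
// untouched so that PCRE interprets them itself.
$pattern = '/(?<!,)\|(?![[:blank:]]*[,\n\r])/';

// Hypothetical helper (not from the question): true if any pipe in
// $subject is matched by $pattern.
function pipe_matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

// Subject => expected outcome, taken from the "more examples" list.
$cases = [
    '5|foo'  => true,
    '5| foo' => true,
    '5|,'    => false,
    '5| ,'   => false,
    "5|\r"   => false,
    "5| \r"  => false,
    ',||,'   => false,
    ',|| ,'  => false,
];

foreach ($cases as $subject => $expected) {
    $actual = pipe_matches($pattern, $subject);
    // addcslashes() makes the carriage return visible in the printed subject.
    printf(
        "%-10s %-9s %s\n",
        addcslashes($subject, "\r\n"),
        $actual ? 'match' : 'no match',
        $actual === $expected ? 'ok' : 'UNEXPECTED'
    );
}

// A pipe at the very start of a line is not excluded by this pattern.
var_dump(pipe_matches($pattern, '|foo|,'));
```

Note the last check: this pattern does not require a character before the pipe, so a pipe that is the first character on a line still matches.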
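Your final regex can look like
(?<=[^\r\n,])\|(?=[[:blank:]]*+[^,\n\r])
It checks that the pipe is preceded by a character other than a comma or line break, and is followed by 0 or more blanks and then a character other than a comma or line break. Here the possessive behavior is forced with *+ so that it does not depend on how the PCRE library was compiled: a blank also matches [^,\n\r], so a plain * could give a blank back and let the lookahead succeed on it.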
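To see the effect on the string from the question, here is a small sketch that plugs a pattern into the original preg_replace() call. The wrapper double_lone_pipes and the variable names are hypothetical and only there to compare the possessive pattern with the non-possessive attempt from the question.

```php
<?php
// Hypothetical wrapper around the question's preg_replace() call; only the
// second pattern is swapped out.
function double_lone_pipes(string $line, string $pipePattern): string
{
    $patterns     = ['/\\\\\|,/', $pipePattern];
    $replacements = ['|,', '||'];
    return preg_replace($patterns, $replacements, $line);
}

// Final pattern from this answer (possessive *+).
$possessive = '/(?<=[^\r\n,])\|(?=[[:blank:]]*+[^,\n\r])/';
// Attempt from the question (plain *), kept for comparison.
$greedy     = '/(?<=[^,])\|(?=[[:blank:]]*[^,\n\r])/';

$line = '|di|,|15| ,|c00413914|,|| ,|f|';

// Possessive: the line comes back unchanged.
echo double_lone_pipes($line, $possessive), "\n";
// prints |di|,|15| ,|c00413914|,|| ,|f|

// Plain *: the blank is given back, so "| ," is still doubled.
echo double_lone_pipes($line, $greedy), "\n";
// prints |di|,|15|| ,|c00413914|,||| ,|f|

// A pipe followed by a blank and ordinary text is still doubled.
echo double_lone_pipes('5| foo', $possessive), "\n";
// prints 5|| foo
```

The lookbehind (?<=[^\r\n,]) needs an actual character before the pipe and the lookahead needs one after the blanks, which is why pipes at the start or end of the line are left alone.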