Regex flavors
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Regex really isn't that bad when using named capture groups.
Oh yeah, they definitely have uses, but there's a real tendency for people to go a bit crazy with them. Complex regexen aren't exactly readable, there are all kinds of fun performance gotchas, there are sometimes other tools or algorithms better suited to the task, and sometimes people try to use them to, e.g., parse HTML because they don't know that it is literally impossible to use regular expressions to parse languages that aren't regular.
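To make the "not regular" point concrete, here is a minimal sketch using balanced parentheses as a stand-in for nested markup. The names `flatParens` and `isBalanced` are made up for illustration. A fixed pattern can handle a bounded nesting depth (here, one level), while arbitrary nesting needs a counter or a stack. This assumes a regex engine without recursion extensions, which is the case for JavaScript.

```javascript
// Hypothetical pattern: accepts text whose parentheses are never nested.
// Each step consumes either a non-paren character or one flat "(...)" pair.
const flatParens = /^(?:[^()]|\([^()]*\))*$/;

// Hypothetical helper: checks balance at any depth by counting, which is
// exactly the unbounded memory a finite state machine does not have.
function isBalanced(text) {
  let depth = 0;
  for (const ch of text) {
    if (ch === '(') {
      depth += 1;
    } else if (ch === ')') {
      depth -= 1;
      if (depth < 0) return false; // closing paren with nothing open
    }
  }
  return depth === 0;
}

console.log(flatParens.test('a(b)c'));   // true
console.log(flatParens.test('a(b(c))')); // false: nested one level too deep
console.log(isBalanced('a(b(c))'));      // true
console.log(isBalanced('(()'));          // false
```

You can extend the pattern to two or three levels by hand, but each extra level needs a longer pattern, and no finite pattern of this kind covers every depth.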
Jwz’s 2nd law!
I learned Regex once and now it just works. Only problem for me is using MacOS so the Regex flavors aren't consistent. But once I sort that, it's smooth sailing.
Regex feels distinctly eldritch to me. Like, a lot of computing knowledge feels like magic, but regex feels like the kind of magic you get by consorting with dark forces
regex feels like the kind of magic you get by consorting with dark forces
AKA reading the manual.
Or studying computer science and learning about finite state machines
I'm a good Christian boy, that's why I refuse to read the manual.
I really like this approach for doing non-trivial regex: https://github.com/VerbalExpressions
const tester = VerEx()
    .startOfLine()
    .then('http')
    .maybe('s')
    .then('://')
    .maybe('www.')
    .anythingBut(' ')
    .endOfLine();
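For comparison, here is roughly what that chain boils down to as a plain regex literal. This is a hand-written equivalent, not the library's literal output (which may wrap each step in its own group), and `urlPattern` is just a name picked for the sketch.

```javascript
// Hand-written approximation of the VerEx chain above (an assumption about
// its meaning, not the string the library actually generates):
//   ^            start of line
//   https?       'http' then maybe 's'
//   :\/\/        '://'
//   (?:www\.)?   maybe 'www.'
//   [^ ]*        anything but a space
//   $            end of line
const urlPattern = /^https?:\/\/(?:www\.)?[^ ]*$/;

console.log(urlPattern.test('https://www.example.com')); // true
console.log(urlPattern.test('http://example.com/path')); // true
console.log(urlPattern.test('ftp://example.com'));       // false
console.log(urlPattern.test('https://has space.com'));   // false
```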
I don't. It may look less like line noise, but it doesn't unravel the underlying complexity of what it does. It's just wordier without being helpful.
Edit: also, these alternative syntaxes tend to make some easy cases easy, but they have no idea what to do with more complicated cases. Try making nested capture groups with these, for instance. It gets messy fast.
it doesn't unravel the underlying complexity of what it does... these alternative syntaxes tend to make some easy cases easy, but they have no idea what to do with more complicated cases
This can be said of any higher-level language or API. There is always a cost to abstraction. Binary -> Assembly -> C -> Python. As you go up that chain, many things get easier, but some things become impossible. You always have the option to drop down, though, and these regex tools are no different. Software development, sysops, devops, etc. are full of compromises like this.
Named groups are nice, but can I please define a group more than once? Maybe I want to group my data and consolidate values in a logical way without you complaining that I have already used a group name. I know I did, I'm the one telling you, now capture it twice!
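One possible workaround when the engine rejects a repeated name: give each alternative its own group names and consolidate them afterwards. This is only a sketch. The pattern, the group names, and `parseYearMonth` are invented for illustration, and some engines do accept duplicate names in separate alternatives, in which case none of this is needed.

```javascript
// Two alternatives capture the same logical fields under different names.
// All names here are hypothetical.
const datePattern =
  /(?<isoYear>\d{4})-(?<isoMonth>\d{2})|(?<usMonth>\d{2})\/(?<usYear>\d{4})/;

function parseYearMonth(text) {
  const match = datePattern.exec(text);
  if (match === null) return null;
  const g = match.groups;
  // Groups from the alternative that did not match are undefined,
  // so ?? picks whichever side actually captured something.
  return {
    year: g.isoYear ?? g.usYear,
    month: g.isoMonth ?? g.usMonth,
  };
}

console.log(parseYearMonth('2024-03')); // { year: '2024', month: '03' }
console.log(parseYearMonth('03/2024')); // { year: '2024', month: '03' }
```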
Can you actually name capture groups, or does this just mean referring to them by number?
You can use backreferences (\1, \2, etc.), but you can also give groups names explicitly.
It looks like this: (?<name>inner-regex)
Some flavors support it; Kotlin's apparently doesn't.
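A quick sketch of both styles in JavaScript, which is one flavor that supports them. The sample strings and variable names are made up for the example.

```javascript
// Numbered backreference: \1 must repeat whatever group 1 captured.
const doubledWord = /\b(\w+) \1\b/;
console.log(doubledWord.test('this is is a typo')); // true

// Named capture groups: read the captures by name instead of by position.
const isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const match = isoDate.exec('Released on 2024-03-15.');
console.log(match.groups.year, match.groups.month, match.groups.day); // 2024 03 15

// Named backreference: \k<quote> must match the same quote character
// that the group named "quote" captured.
const quoted = /(?<quote>['"]).*?\k<quote>/;
console.log(quoted.test(`say 'hi'`));  // true
console.log(quoted.test(`say "hi'`));  // false: opening and closing quotes differ
```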
TIL thanks!
In modern languages you can name them with labels as well, yes. Not sure about the syntax right now. Something like (?label:...), I think.
It's (?<NAME>...), and those are the named capture groups referred to in the post.
I don't see the problem. But that's probably because my go-to language is Perl.