Watched this talk on regular expressions by Damian Conway today. He discusses how an RE engine is actually a virtual machine, and how regular expressions are source code that is executed inside that VM. He goes on to discuss NFA and DFA engines, shows how regular expressions work, and gives some tips for using them.
- The RE engine is a VM, and the regular expression is the code it executes
- Lack of whitespace in REs is a real problem.
- In addition to Perl’s /x switch to enable extended formatting (which I know about but don’t use as much as I should), comments and whitespace can be added to PCREs by starting them with (?x)
- Editor REs typically require metacharacters to be escaped with backslashes, e.g. /a|b/ becomes /a\|b/ in Emacs
- To improve performance, try to reduce backtracking, think about whether to use maximal (greedy) or minimal (non-greedy) loops, and try to write REs that will succeed rather than fail
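
To make a couple of these notes concrete, here is a minimal sketch using Python's `re` module rather than Perl or PCRE. That choice is my assumption, not something from the talk, though Python accepts the same leading `(?x)` flag. The date pattern and the sample strings are made up for illustration. It shows the same pattern written compactly and in extended form, then a maximal and a minimal loop run over the same input.

```python
import re

# Compact form: correct, but hard to read at a glance.
compact = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# The same pattern with the inline (?x) flag: unescaped whitespace is
# ignored and "#" starts a comment that runs to the end of the line.
verbose = re.compile(r"""(?x)
    (\d{4})  # year
    -
    (\d{2})  # month
    -
    (\d{2})  # day
""")

sample = "released 2012-03-14"  # hypothetical input
compact_groups = compact.search(sample).groups()
verbose_groups = verbose.search(sample).groups()

# Maximal (greedy) versus minimal (non-greedy) loops on the same input.
html = "<b>one</b> <b>two</b>"  # hypothetical input
greedy = re.search(r"<b>.*</b>", html).group()   # runs to the last </b>
lazy = re.search(r"<b>.*?</b>", html).group()    # stops at the first </b>

print(compact_groups == verbose_groups)
print(greedy)
print(lazy)
```

Both compiled patterns capture the same groups, so the extended layout only affects readability. The greedy loop first consumes the rest of the string and backtracks to the last `</b>`, while the minimal loop extends one character at a time and stops at the first `</b>`. Which of the two backtracks less depends on where the match ends in the actual data.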