New Code!
After an intermission of a few weeks, Discount has been updated to version 1.2.1 with the addition of a completely reworked emphasis parser. Previously, it did a naive “turn all *’s and _’s into <em>; turn all ** and __ into <strong>”, which led to incorrect XML when I interleaved * and **, but it now attempts to pair matching tokens before it starts spitting out emphasis.
The way I did this was to split the existing second pass (Discount has a first pass that breaks the input into blocks, and a second pass that does text substitutions on the contents of those blocks) into two passes: the second pass now converts runs of * into emphasis tokens, interleaved with fully-processed other stuff, and the third pass concatenates them together, matching open and close emphasis together as it goes.
For example, the four variants of **A*B*** produce correct XML now:
***A*B** -> <p><strong><em>A</em>B</strong></p>
***A**B* -> <p><em><strong>A</strong>B</em></p>
**A*B*** -> <p><strong>A<em>B</em></strong></p>
*A**B*** -> <p><em>A<strong>B</strong></em></p>
which is much better than the status quo ante.
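To make the tokenize-then-pair idea concrete, here is a minimal sketch in C. It is not Discount's source: the names (Token, tokenize, find_closer, pair_run, pair_block, render) are made up for illustration, it only looks at *, it does no HTML escaping, and the pairing heuristic (a closing run matches if its count is exactly what is needed, or if it has more than two stars to spare) is an assumption chosen because it reproduces the four results above.

```c
#include <stdio.h>
#include <string.h>

#define MAXTOK 32
#define TAGSZ  128

/* One token per run of '*', plus the plain text that follows the run. */
typedef struct {
    int  count;         /* stars in this run still waiting for a partner */
    int  literal;       /* stars we gave up on; printed as-is            */
    char open[TAGSZ];   /* opening tags at this run, outermost first     */
    char close[TAGSZ];  /* closing tags at this run, innermost first     */
    char text[64];      /* text after the run (not HTML-escaped here)    */
} Token;

static const char *OPEN[]  = { "<em>",  "<strong>"  };
static const char *CLOSE[] = { "</em>", "</strong>" };

/* Bounded append: silently truncates instead of overflowing. */
static void emit(char *dst, size_t cap, const char *s)
{
    strncat(dst, s, cap - strlen(dst) - 1);
}

static void prepend(char *dst, const char *tag)
{
    char tmp[TAGSZ];
    snprintf(tmp, sizeof tmp, "%s%s", tag, dst);
    strcpy(dst, tmp);
}

/* Tokenizing pass: split input into star runs and text.
 * Anything past MAXTOK tokens is dropped in this sketch. */
static int tokenize(const char *s, Token *t)
{
    int n = 0;
    while (*s && n < MAXTOK) {
        Token *p = &t[n++];
        memset(p, 0, sizeof *p);
        while (*s == '*') { p->count++; s++; }
        size_t len = strcspn(s, "*");
        if (len >= sizeof p->text) len = sizeof p->text - 1;
        memcpy(p->text, s, len);
        s += len;
    }
    return n;
}

/* Assumed heuristic: exact count, or more than two stars to spare.
 * Returns 0 for "not found" (a closer is always past index 0). */
static int find_closer(const Token *t, int first, int last, int match)
{
    for (int i = first + 1; i <= last; i++)
        if (t[i].count == match || t[i].count > 2)
            return i;
    return 0;
}

static void pair_block(Token *t, int first, int last);

/* Pairing pass: match the run at `first` against a later run. */
static void pair_run(Token *t, int first, int last)
{
    int n = t[first].count, e = 0, match = 0;

    if (n == 0) return;
    if (n == 1) {
        match = 1; e = find_closer(t, first, last, 1);
    } else if (n == 2) {
        match = 2; e = find_closer(t, first, last, 2);
        if (!e) { match = 1; e = find_closer(t, first, last, 1); }
    } else {            /* 3 or more: the farther closer becomes the outer pair */
        int e1 = find_closer(t, first, last, 1);
        int e2 = find_closer(t, first, last, 2);
        if (e2 >= e1) { match = 2; e = e2; } else { match = 1; e = e1; }
    }
    if (!e) return;

    t[first].count -= match;
    t[e].count     -= match;
    pair_block(t, first, e);                /* pair everything inside first */
    for (int i = first + 1; i < e; i++) {   /* leftovers inside stay literal, */
        t[i].literal += t[i].count;         /* so tags can never cross        */
        t[i].count = 0;
    }
    prepend(t[first].open, OPEN[match - 1]);
    emit(t[e].close, TAGSZ, CLOSE[match - 1]);
    pair_run(t, first, last);               /* any stars left on this run? */
}

static void pair_block(Token *t, int first, int last)
{
    for (int i = first; i <= last; i++)
        pair_run(t, i, last);
}

/* Tokenize, pair, then concatenate into out (cap must be at least 1). */
void render(const char *src, char *out, size_t cap)
{
    Token t[MAXTOK];
    int n = tokenize(src, t);

    pair_block(t, 0, n - 1);                /* no-op when n == 0 */
    out[0] = '\0';
    emit(out, cap, "<p>");
    for (int i = 0; i < n; i++) {
        emit(out, cap, t[i].close);
        for (int k = t[i].literal + t[i].count; k > 0; k--)
            emit(out, cap, "*");
        emit(out, cap, t[i].open);
        emit(out, cap, t[i].text);
    }
    emit(out, cap, "</p>");
}
```

Calling render("**A*B***", buf, sizeof buf) with a char buf[256] should leave <p><strong>A<em>B</em></strong></p> in buf. Because the tags are attached to tokens during pairing and only written out afterwards, an opening run of three stars can wait to see which closer arrives first before deciding whether <em> or <strong> goes on the outside.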
So it doesn’t dump core (as the presence of this weblog post shows), it fixes a few memory leaks, and it produces better XML on pathological emphasis cases. And that’s good enough to be New Code! to amaze and educate your friends and family.