Some notes I wrote in January 2008. Put into blog for storage. Of course, this idea is not practical.
Author’s note. I came up with this while staring at an XML document, a Spring context file to be exact. I was calling the idea contextual or anonymous end-tags. Later, of course, I learned that it is not a new idea. XML’s predecessor, SGML, had the “empty end tag” and many other minimization techniques, and XML did not adopt SGML’s empty end-tag minimization syntax. Are the reasons against empty end-tag minimization still valid today?
When XML was being designed there were some goals set out.
In XML an end tag is defined as: The end of every element that begins with a start-tag must be marked by an end-tag containing a name that echoes the element’s type as given in the start-tag:
 ETag ::= '</' Name S? '>'
There is also a Well-Formedness Constraint:
Element Type Match
The Name in an element’s end-tag must match the element type in the start-tag.
An example of an end-tag:
Proposed change to the ETag:
Element Type Match
The Name in an element’s end-tag, if present, must match the element type in the start-tag.
In other words, the tag name is optional. This works because XML must be well-formed, so every empty end-tag nests properly with its associated start-tag. It is also similar to the existing shortcut for elements that contain no content, for example, <img src='madonna.gif'/>. Supporting both types of end-tags helps with compatibility with older applications of XML. Many other proposals for XML alternatives are, of course, more radical, but this proposal is meant to continue the XML syntax design.
|With no content|With content|
|---|---|
|<img src='madonna.gif'></img>|<img src='madonna.gif'>some content</img>|
|<img src='madonna.gif'/>|<img src='madonna.gif'>some content</>|
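The nesting argument can be sketched in a few lines of Python. This is only an illustration, not a real XML parser: the function name `expand_empty_end_tags` is made up for this example, and the scanner assumes there are no comments, CDATA sections, processing instructions, or '>' characters inside attribute values. It keeps a stack of open elements and rewrites each `</>` as the named end-tag it must correspond to, so the result could be handed to an existing parser.

```python
import re

# One match per tag. Simplifying assumptions: no comments, CDATA,
# processing instructions, or '>' inside attribute values.
TAG = re.compile(r"<(/?)([^\s/>]*)([^>]*?)(/?)>")


def expand_empty_end_tags(text):
    """Rewrite every '</>' as a named end-tag using a stack of open elements."""
    out = []
    stack = []  # names of elements that are currently open
    pos = 0
    for m in TAG.finditer(text):
        out.append(text[pos:m.start()])  # copy character data unchanged
        pos = m.end()
        closing, name, _rest, self_closing = m.groups()
        if closing:
            if not stack:
                raise ValueError("end-tag with no open element")
            expected = stack.pop()
            # Element Type Match: the name, if present, must match.
            if name and name != expected:
                raise ValueError(f"</{name}> does not match <{expected}>")
            out.append(f"</{expected}>")
        else:
            if not self_closing:
                stack.append(name)
            out.append(m.group(0))
    if stack:
        raise ValueError(f"unclosed element <{stack[-1]}>")
    out.append(text[pos:])
    return "".join(out)
```

For instance, `expand_empty_end_tags("<img src='madonna.gif'>some content</>")` gives back the same element closed with `</img>`. Named end-tags pass through untouched, which is the compatibility property the proposal relies on.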
It is suggested that non-empty end-tags still be used where there is a need for human-readable markup, such as in small display-oriented ‘pages’.
Todo: Derive a real mathematical expression for the size savings. This is something like Size = Total - Σ n(i)e(i), then savings = size * cost/unit.
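Until that expression exists, the sum can at least be measured on a sample document. The sketch below makes an assumption the Todo does not spell out: n(i) is taken to be the length of the name of element type i, and e(i) the number of named end-tags of that type. Sizes are counted in characters, and the function name `end_tag_savings` is invented for this example.

```python
import re
from collections import Counter

# A named end-tag: '</' Name S? '>'
END_TAG = re.compile(r"</([^\s>]+)\s*>")


def end_tag_savings(text):
    """Return (total, minimized_size, saved), all in characters."""
    counts = Counter(END_TAG.findall(text))  # e(i) for each element name
    saved = sum(len(name) * e for name, e in counts.items())  # sum of n(i)*e(i)
    total = len(text)
    return total, total - saved, saved


total, size, saved = end_tag_savings("<a><bb>x</bb><bb>y</bb></a>")
print(total, size, saved)  # 27 22 5
```

Multiplying the saved characters by a cost per unit (storage or bandwidth) would then give the savings figure the Todo is after.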
Undoubtedly there were good reasons for dropping empty end-tags, such as parser complexity and human readability.
Appendix A. XML Comments
Since we’re on the topic of changes that would break compatibility with SGML, another important change would be to XML comments. The XML production is:
 Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'
For compatibility, the string “--” must not occur within comments.
I propose that instead the production be:
And that the string “--” be allowed anywhere in comments.
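To make the difference concrete, here is a small Python sketch. The strict check follows the XML 1.0 rule; the relaxed reader is only one possible reading of the proposal, namely that a comment runs from '<!--' to the first '-->' with anything in between. Both function names are invented for this example.

```python
def strict_comment_ok(body):
    """XML 1.0 rule: no '--' inside the body, and the body may not end in '-'."""
    return "--" not in body and not body.endswith("-")


def read_relaxed_comment(text, start=0):
    """Assumed relaxed rule: the comment ends at the first '-->'.

    Returns (body, index just past the comment).
    """
    if not text.startswith("<!--", start):
        raise ValueError("not at a comment")
    end = text.find("-->", start + 4)
    if end == -1:
        raise ValueError("unterminated comment")
    return text[start + 4:end], end + 3


body, nxt = read_relaxed_comment("<!-- a -- b --><x/>")
print(repr(body), strict_comment_ok(body))  # ' a -- b ' False
```

Under the relaxed rule the only sequence a comment body cannot contain is '-->' itself, which is what makes the scanner a single search.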