smolweb-validator is a new CLI tool that checks whether a web page respects the smolweb HTML subset.
This tool uses the jsoup HTML5 parser, a Java library.
Example:
$ ./smolweb-validate https://adele.pages.casa/md/
Validating: https://adele.pages.casa/md/
━━━━━━━━━━━━━━
VALIDATION SUMMARY
━━━━━━━━━━━━━━
VALID: This page conforms to the smolweb HTML subset!
Statistics:
Total elements: 71
Valid elements: 71
Invalid elements: 0
━━━━━━━━━━━━━━
Validation complete.
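To give an idea of what an element count like the one above involves, here is a minimal sketch in Python using only the standard library's html.parser. It is not how smolweb-validator is implemented (the tool uses jsoup in Java), and the `ALLOWED` set is a tiny hypothetical allowlist chosen for illustration, not the real smolweb subset. The names `ElementCounter` and `summarize` are also made up for this sketch.

```python
from html.parser import HTMLParser

# Hypothetical allowlist for illustration only; NOT the real smolweb subset.
ALLOWED = {"html", "head", "title", "body", "h1", "p", "a", "ul", "li", "img", "br"}


class ElementCounter(HTMLParser):
    """Counts start tags and records those that are not in ALLOWED."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.invalid = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        self.total += 1
        if tag not in ALLOWED:
            self.invalid.append(tag)


def summarize(html):
    """Return total / valid / invalid element counts for an HTML string."""
    counter = ElementCounter()
    counter.feed(html)
    counter.close()
    return {
        "total": counter.total,
        "valid": counter.total - len(counter.invalid),
        "invalid": len(counter.invalid),
    }


if __name__ == "__main__":
    page = "<html><body><p>Hi<br><marquee>x</marquee></p></body></html>"
    print(summarize(page))
```

A page is then "valid" in this simplified sense when the invalid count is 0. A real validator would presumably also look at attributes and nesting, which this sketch ignores.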
@adele thanks, this is great! :)
Do you think it would be useful to specify a custom DTD for validation? https://en.wikipedia.org/wiki/Document_type_definition
Not sure if the W3C validator can be used with a custom DTD pointing to a smolweb URL. Here is the one for XHTML 1.0 Strict as an example: https://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
I checked a simple case with an XML file using an embedded DTD with xmllint (from the libxml2 package), and it detects errors as expected.
@dillo the problem with a DTD is that the document has to be well-formed XML. This constraint is too heavy for handwritten HTML.
@adele you are referring to leaving some tags open? If so, I believe that would require having a complicated parsing/recovery strategy:
https://html.spec.whatwg.org/multipage/parsing.html#misnested-tags:-b-i-/b-/i
https://html.spec.whatwg.org/multipage/parsing.html#adoptionAgency
https://html.spec.whatwg.org/multipage/parsing.html#reconstruct-the-active-formatting-elements
@dillo I think that XHTML was a wrong turn in the history of HTML. Before and after the XHTML period, «IMG», «HR», «BR», «INPUT»… without a closing / are simpler to write and more logical (these tags never enclose anything, so why would it be necessary to close them?). All browsers are already able to understand them.
@adele I think the HTML tags got filtered in your toot. I mean not closing tags, as in this case (I replaced "<>" with "()"):
(ul)
(li)Item one
(li)Item two
(/ul)
One of the complexities of the parser in Dillo (and other HTML 5 parsers) is that it needs to determine which tags close which other previous tags. Sometimes this is the result of a typo, not intentional.
I would like to get rid of such complexities at parsing, but maybe that would be outside the scope of the smolweb goal.
@dillo sorry, I have corrected my previous toot and used « » chars.
OK, I see your problem. This kind of syntax should not be used. I will add this to the smolweb rules and the validator. Same thing for the other problem you mentioned (such as «B»«I»text«/B»«/I»).
@dillo now smolweb-validator checks for misnested and unclosed tags
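For readers wondering what a check for misnested and unclosed tags could look like, here is a rough sketch with a simple stack, again in Python with the standard library's html.parser. It is an assumption about the general technique, not the code used in smolweb-validator. The `VOID` set lists a few elements that never take a closing tag and is illustrative rather than complete; `NestingChecker` and `check` are hypothetical names.

```python
from html.parser import HTMLParser

# Elements that never enclose content; illustrative, not a complete list.
VOID = {"br", "hr", "img", "input", "meta", "link"}


class NestingChecker(HTMLParser):
    """Tracks open elements on a stack and reports nesting problems."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        elif tag in self.stack:
            # Everything opened after `tag` was left open when `tag` closed.
            while self.stack[-1] != tag:
                self.problems.append(
                    f"<{self.stack.pop()}> not closed before </{tag}>"
                )
            self.stack.pop()
        else:
            self.problems.append(f"</{tag}> has no matching open tag")


def check(html):
    """Return a list of human-readable nesting problems (empty if none)."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    for tag in reversed(checker.stack):
        checker.problems.append(f"<{tag}> never closed")
    return checker.problems


if __name__ == "__main__":
    for sample in ("<ul><li>Item one<li>Item two</ul>", "<b><i>text</b></i>"):
        print(sample, "->", check(sample))
```

With this approach the unclosed list items and the «B»«I»text«/B»«/I» case are both reported, while «BR» or «IMG» without a closing / pass silently, which matches the position taken earlier in the thread.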
@dillo v1.2.1 checks for more parsing problems
@adele sorry for the delay, thanks for the update!
The s tag is not part of W3C XHTML Basic, unlike b and i.
As explained in the smolweb HTML subset guide, this subset is inspired by earlier work from the W3C: XHTML Basic.
The XHTML Basic document type includes the minimal set of modules required to be an XHTML host language document type, and in addition it includes images, forms, basic tables, and object support. It is designed for Web clients that do not support the full set of XHTML features; for example, Web clients such as mobile phones, PDAs, pagers, and set top boxes. The document type is rich enough for content authoring.
For better compatibility, it is not a good idea to specify the XHTML Basic 1.1 DTD in the doctype for smolwebsites.
Some deprecated tags (acronym, big, tt) have been removed from this list. The object and param tags have been banned to avoid the inclusion of specific code such as Java applets.
As specified in the guidelines, semantic tags introduced in more recent HTML versions have been added to improve accessibility.
@vulonkaaz However, adding s tag could be a good idea.
I will check whether it is well supported by old and tiny/basic browsers. If the tag is ignored, it would be dangerous for the reader not to see that the text is struck through.
think it would make sense to add it