smolweb-validator is a new CLI tool that checks whether a web page conforms to the smolweb HTML subset.

It uses jsoup, an HTML5 parser library for Java.

Example:

$ ./smolweb-validate https://adele.pages.casa/md/

🔍 Validating: https://adele.pages.casa/md/
━━━━━━━━━━━━━━

📊 VALIDATION SUMMARY
━━━━━━━━━━━━━━
✅ VALID: This page conforms to the smolweb HTML subset!

📈 Statistics:
  Total elements: 71
  Valid elements: 71
  Invalid elements: 0

━━━━━━━━━━━━━━
Validation complete.
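
The statistics above boil down to counting elements against an allow-list. Here is a minimal sketch of that idea in Python, using only the standard library. It is not the validator's actual code (which is Java with jsoup), and the `ALLOWED_TAGS` set, the `SubsetCounter` class and the `count_elements` function are hypothetical names with a placeholder tag list, not the real smolweb subset.

```python
from html.parser import HTMLParser

# Placeholder allow-list for illustration only; NOT the real smolweb subset.
ALLOWED_TAGS = {
    "html", "head", "title", "body", "h1", "h2", "p", "a",
    "ul", "ol", "li", "img", "br", "hr",
}


class SubsetCounter(HTMLParser):
    """Counts start tags and records those outside the allow-list."""

    def __init__(self, allowed):
        super().__init__()
        self.allowed = allowed
        self.total = 0
        self.invalid = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag names in lower case.
        self.total += 1
        if tag not in self.allowed:
            self.invalid.append(tag)


def count_elements(html, allowed=ALLOWED_TAGS):
    """Return (total, valid, list_of_invalid_tag_names) for an HTML string."""
    parser = SubsetCounter(allowed)
    parser.feed(html)
    parser.close()
    return parser.total, parser.total - len(parser.invalid), parser.invalid


if __name__ == "__main__":
    page = "<html><body><p>Hi<br><marquee>x</marquee></p></body></html>"
    total, valid, invalid = count_elements(page)
    print(f"Total elements: {total}")
    print(f"Valid elements: {valid}")
    print(f"Invalid elements: {len(invalid)} {invalid}")
```

A page would count as valid in this sketch when the list of invalid tags is empty.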

#smolweb #smallweb

8 replies

Dillo browser , @dillo@fosstodon.org

@adele thanks, this is great! :)

Do you think it would be useful to specify a custom DTD for validation? en.wikipedia.org/wiki/Document

Not sure if the W3C validator can be used with a custom DTD pointing to a smolweb URL. Here is the one for XHTML 1.0 Strict as an example: w3.org/TR/xhtml1/DTD/xhtml1-st

I checked a simple case with an XML file using an embedded DTD with xmllint (in the libxml2 package) and it detects errors as expected.

Dillo browser , @dillo@fosstodon.org

@dillo I think XHTML was a wrong turn in the history of HTML. Before and after the XHTML period, «IMG», «HR», «BR», «INPUT»… without a closing / are simpler to write and more logical (these tags never enclose anything, so why would it be necessary to close them?). All browsers already understand them.

Dillo browser , @dillo@fosstodon.org

@adele I think the HTML tags got filtered in your toot. I mean leaving tags unclosed, as in this case (with "<>" replaced by "()"):

(ul)
(li)Item one
(li)Item two
(/ul)

One of the complexities of the parser in Dillo (and other HTML 5 parsers) is that it needs to determine which tags implicitly close which previously opened tags. Sometimes an unclosed tag is the result of a typo rather than intentional.

I would like to get rid of such complexity at parse time, but maybe that would be outside the scope of the smolweb goal.

@dillo sorry, I have corrected my previous toot and used « » chars.

OK, I see your problem. This kind of syntax should not be used. I will add this to the smolweb rules and the validator. The same goes for the other problem you mentioned (such as «B»«I»text«/B»«/I»).
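
Both cases discussed in this thread (an «LI» left unclosed, and «B»«I» closed in the wrong order) can be caught with a simple stack of open tags. The following is only a rough sketch in Python under my own assumptions, not the rule as it will be implemented in the validator: `StrictNestingChecker` and `check_nesting` are hypothetical names, and void elements such as «BR» or «IMG» are deliberately exempt from needing a closing tag.

```python
from html.parser import HTMLParser

# HTML void elements: they never enclose content and take no closing tag.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class StrictNestingChecker(HTMLParser):
    """Reports tags that are closed implicitly or out of order."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open non-void elements
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # "<br/>"-style syntax: treated here as opened and closed at once.
        pass

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag not in self.stack:
            self.errors.append(f"</{tag}> has no matching open tag")
            return
        # Anything opened after `tag` and still open was never closed explicitly.
        while self.stack[-1] != tag:
            self.errors.append(f"<{self.stack.pop()}> not closed before </{tag}>")
        self.stack.pop()


def check_nesting(html):
    """Return a list of nesting errors; an empty list means strictly nested."""
    checker = StrictNestingChecker()
    checker.feed(html)
    checker.close()
    return checker.errors + [f"<{tag}> never closed" for tag in checker.stack]


if __name__ == "__main__":
    for sample in (
        "<ul><li>Item one<li>Item two</ul>",
        "<b><i>text</b></i>",
        "<ul><li>Item one</li><li>Item two</li></ul><hr>",
    ):
        print(sample, "->", check_nesting(sample))
```

With a check like this, a parser that only accepts pages passing it would no longer need to guess which tag closes which.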