smolweb-validator is a new CLI tool that checks whether a web page respects the smolweb HTML subset.

This tool uses the jsoup HTML5 parser, a Java library.

Example:

$ ./smolweb-validate https://adele.pages.casa/md/

🔍 Validating: https://adele.pages.casa/md/
━━━━━━━━━━━━━━

📊 VALIDATION SUMMARY
━━━━━━━━━━━━━━
✅ VALID: This page conforms to the smolweb HTML subset!

📈 Statistics:
  Total elements: 71
  Valid elements: 71
  Invalid elements: 0

━━━━━━━━━━━━━━
Validation complete.
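
To make the statistics above concrete, here is a rough sketch of how such a check could count elements against an allow-list. It is not the tool's code: the real validator is built on jsoup in Java, while this uses Python's standard `html.parser`. The `ALLOWED_TAGS` set, `SubsetCounter`, and `summarize` are hypothetical names, and the allow-list is deliberately incomplete; the actual subset is defined in the smolweb specs.

```python
from html.parser import HTMLParser

# Illustrative allow-list only (an assumption for this sketch).
# The real smolweb subset is defined at https://smolweb.org/specs/index.html.
ALLOWED_TAGS = {
    "html", "head", "title", "meta", "link", "body",
    "h1", "h2", "p", "a", "ul", "ol", "li", "img", "br", "hr",
    "b", "i", "em", "strong",
}


class SubsetCounter(HTMLParser):
    """Counts start tags and records those outside ALLOWED_TAGS."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.invalid = []

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag names before calling this method.
        self.total += 1
        if tag not in ALLOWED_TAGS:
            self.invalid.append(tag)


def summarize(html):
    """Return element counts in the same shape as the summary above."""
    counter = SubsetCounter()
    counter.feed(html)
    counter.close()
    return {
        "total": counter.total,
        "valid": counter.total - len(counter.invalid),
        "invalid": len(counter.invalid),
    }


page = "<html><body><p>Hi <b>there</b></p><object></object></body></html>"
print(summarize(page))  # {'total': 5, 'valid': 4, 'invalid': 1}
```

Here the single `object` element is the one counted as invalid, because it is missing from the illustrative allow-list.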

#smolweb #smallweb

Dillo browser , @dillo@fosstodon.org

@adele thanks, this is great! :)

Do you think it would be useful to specify a custom DTD for validation? en.wikipedia.org/wiki/Document

I'm not sure whether the W3C validator can be used with a custom DTD pointing to a smolweb URL. Here is the one for XHTML 1.0 Strict as an example: w3.org/TR/xhtml1/DTD/xhtml1-st

I checked a simple case with an XML file using an embedded DTD and xmllint (in the libxml2 package), and it detects errors as expected.
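
A check along these lines could look roughly like the sketch below. The tiny DTD and the `page`, `title`, `para` element names are made up for illustration and are not a smolweb DTD; the sketch assumes `xmllint` is on the PATH and returns `None` when it is not.

```python
import os
import shutil
import subprocess
import tempfile

# Hypothetical miniature DTD embedded in the document itself.
# The <blink> element is not declared, so validation should fail.
INVALID_DOC = """<?xml version="1.0"?>
<!DOCTYPE page [
  <!ELEMENT page (title, para+)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT para (#PCDATA)>
]>
<page>
  <title>Hello</title>
  <blink>not declared in the DTD</blink>
</page>
"""


def is_valid(xml_text):
    """Return True or False from xmllint, or None if xmllint is missing."""
    if shutil.which("xmllint") is None:
        return None
    with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as handle:
        handle.write(xml_text)
        path = handle.name
    try:
        # --valid validates against the DTD, --noout suppresses the document dump.
        result = subprocess.run(
            ["xmllint", "--valid", "--noout", path],
            capture_output=True,
            text=True,
        )
        return result.returncode == 0
    finally:
        os.unlink(path)


print(is_valid(INVALID_DOC))
```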

Dillo browser , @dillo@fosstodon.org

@dillo I think XHTML was a wrong turn in the history of HTML. Before and after the XHTML period, «IMG», «HR», «BR», «INPUT»… without a closing / are simpler to write and more logical (these tags never enclose anything, so why would it be necessary to close them?). All browsers are already able to understand them.

Dillo browser , @dillo@fosstodon.org

@adele I think the HTML tags got filtered out of your toot. I mean unclosed tags, as in this case (with "<>" replaced by "()"):

(ul)
(li)Item one
(li)Item two
(/ul)

One of the complexities of the parser in Dillo (and other HTML 5 parsers) is that it needs to determine which tags close which other previous tags. Sometimes this is the result of a typo, not intentional.

I would like to get rid of that complexity in the parser, but maybe that is outside the scope of the smolweb goal.

@dillo sorry, I have corrected my previous toot and used « » chars.

OK, I see your problem. This kind of syntax should not be used. I will add this to the smolweb rules and the validator. Same thing for the other problem you mentioned (such as «B»«I»text«/B»«/I»).
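
As a rough idea of what such a rule could look like, here is a minimal sketch of a strict nesting check: every non-void tag must be closed explicitly and in order, while void tags such as img, hr, br and input need no closing tag. It is written in Python with the standard `html.parser`, not with jsoup, and `NestingChecker`, `check_nesting`, and the `VOID_TAGS` list are hypothetical names chosen for this example, not part of the validator.

```python
from html.parser import HTMLParser

# Void elements never enclose content, so they need no closing tag.
VOID_TAGS = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta"}


class NestingChecker(HTMLParser):
    """Reports tags that are left open or closed out of order."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open non-void tags, innermost last
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag not in self.stack:
            self.errors.append(f"</{tag}> has no matching opening tag")
            return
        # Pop back to the matching opening tag; anything popped on the
        # way was still open and therefore never closed explicitly.
        while self.stack:
            top = self.stack.pop()
            if top == tag:
                break
            self.errors.append(f"<{top}> not closed before </{tag}>")


def check_nesting(html):
    """Return a list of nesting errors (empty when nesting is strict)."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    for tag in reversed(checker.stack):
        checker.errors.append(f"<{tag}> is never closed")
    return checker.errors


print(check_nesting("<ul><li>Item one<li>Item two</ul>"))
print(check_nesting("<b><i>text</b></i>"))
print(check_nesting("<p>line<br>break</p>"))  # []
```

The first call reports both unclosed li elements, and the second reports the overlapping b and i tags. With a rule like this, a parser never has to guess which tag closes which.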

@vulonkaaz

The s tag is not part of W3C XHTML Basic, unlike b and i.

As explained in the smolweb HTML subset guide, this subset is inspired by earlier work from the W3C: XHTML Basic.

The XHTML Basic document type includes the minimal set of modules required to be an XHTML host language document type, and in addition it includes images, forms, basic tables, and object support. It is designed for Web clients that do not support the full set of XHTML features; for example, Web clients such as mobile phones, PDAs, pagers, and set top boxes. The document type is rich enough for content authoring.

For better compatibility, it is not a good idea to specify the XHTML Basic 1.1 DTD in the doctype for smolwebsites.

Some deprecated tags (acronym, big, tt) have been removed from this list. The object and param tags have been banned to avoid the inclusion of specific code such as Java applets.

As specified in the Guidelines, semantic tags introduced in more recent HTML versions have been added to improve accessibility.

https://smolweb.org/specs/index.html

https://www.w3.org/TR/xhtml-basic/#s_xhtmlmodules