htmlc – compile HTML into C code


This utility has been deprecated by the hcache functionality of the libhtml library. It will, however, still be maintained.

htmlc is a utility transforming HTML-4.01 strict files into equivalent DOM-building invocations in C. It uses the libhtml library. htmlc was inspired by xmlc, an HTML to Java compiler for Enhydra.

Why? This eliminates the gap between presentation data (HTML pages) and compiled code: the accessors for HTML elements carrying an ID attribute become auto-generated C functions. Thus, a tag omitted from the mark-up is caught when the application is compiled instead of causing unexpected errors during the life-time of, say, a web application.

Consider the following example. A web application needs to modify an error page error.html and write its serialised DOM tree to the client. It does so by looking up the tag <span id="ErrorMsg"></span> and replacing its text child with a formatted string. Usually, a template page is parsed on initialisation, with the identifier “ErrorMsg” looked up at run-time and replaced. If, however, a designer has forgotten to include the tag, the application must either return the page unmodified or throw an error; either way, an unacceptable situation. With htmlc, the existence of this tag is known at compile-time. Furthermore, the DOM tree need not be parsed at initialisation, a step that, in the case of bad mark-up, may result in last-minute debugging sessions of presentation logic.
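The difference between the two approaches can be sketched in plain C. This is an illustration only: the types and functions below (struct node, lookup_by_id, struct error_page, error_page_ErrorMsg, set_text) are hypothetical stand-ins and are neither the libhtml API nor the code htmlc actually emits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a DOM node; NOT the libhtml type. */
struct node {
	const char	*id;	/* value of the id attribute, or NULL */
	const char	*text;	/* text child */
	struct node	*next;	/* next node (the tree is flattened to a list) */
};

/*
 * Run-time approach: search the parsed template for an identifier.
 * Returns NULL if the designer omitted the tag, so every caller must
 * handle a failure that only shows up while the application runs.
 */
static struct node *
lookup_by_id(struct node *head, const char *id)
{
	struct node	*n;

	for (n = head; n != NULL; n = n->next)
		if (n->id != NULL && strcmp(n->id, id) == 0)
			return n;
	return NULL;
}

/*
 * Compile-time approach: an assumed shape for generated code.  The
 * page is a structure with one member per identified tag.  If the tag
 * is missing from the HTML, the member and its accessor are never
 * generated and the application fails to compile.
 */
struct error_page {
	struct node	 ErrorMsg;
};

static struct node *
error_page_ErrorMsg(struct error_page *p)
{
	return &p->ErrorMsg;	/* cannot be NULL for a valid page */
}

/* Replace the text child of a node. */
static void
set_text(struct node *n, const char *text)
{
	n->text = text;
}
```

With the first form, a call such as lookup_by_id(page, "ErrorMsg") must be checked for NULL on every request. With the second, set_text(error_page_ErrorMsg(&page), msg) needs no check, since a missing tag would already have stopped the build.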


% htmlc -h test.html >test.h
% htmlc -s test.html >test.c

The htmlc utility is a Project member.


Sources correctly build and install on OpenBSD, NetBSD, and GNU/Linux operating systems, tested variously on generic i386, AMD64, and DEC Alpha. The current version is 0.1.15. The libhtml library is the only dependency.


Source archive htmlc.tar.gz (md5)
Archived source archive/
On-line source cvsweb


The manual is generated automatically and refers to the current snapshot.

htmlc(1) compile HTML into C code


For all issues related to htmlc, contact Kristaps Dzonsons.


24-06-2010: version 0.1.15

Synchronised with libhtml-0.3.0. This utility has been functionally deprecated, but will continue being maintained.

22-04-2010: version 0.1.14

Synchronised with libhtml-0.2.14.

12-03-2010: version 0.1.13

Directly use PIC enumeration names with hproc_enumname (no more compiler warnings!). The utility now accepts standard input as well as the existing file argument (for easier interoperability with mutf8 for UTF-8 validation). Synchronised with libhtml-0.2.12.

21-02-2010: version 0.1.12

Directly use element and declaration enumeration names with helem_enumname and hdecl_enumname instead of looking them up. Synchronised with libhtml-0.2.8.

19-02-2010: version 0.1.11

Directly use attribute enumeration names with hattr_enumname instead of looking them up. Synchronised with libhtml-0.2.7.

See cvsweb for historical notes.

Copyright © 2009, 2010 Kristaps Dzonsons, $Date: 2010/06/24 13:01:14 $