Sorry for the late reply.

On Sunday 11 September 2005 20:27, Adriaan de Groot wrote:
> On Sunday 11 September 2005 17:49, Frans Englich wrote:
> > ( Background: CC'ing kde-devel for larger exposure and archiving;
> > Benjamin

> About bloody time, if you ask me.
> > Meyer because we've discussed doing large-scale testing in KDE before, as
> > well as he's maintaining a current testing framework. Christopher, who is
> > a usability researcher(among other things, of course), and I were in
> > private discussing ways of improving projects such as KDE, from a
> > management/usability perspective(in a nutshell), this time focusing on
> >
> > > * It seems that something like Defense can only succeed if there is
> > > a widespread generally perceived need to solve a problem or
> > > consciousness for the problem itself. If KDE would gets burned several
> > > times by releases that turn out to unstable, then a change might occur.

> A truism about testing in general; you could say the same thing about
> writing specifications or thinking about use-cases as well.
> > > * It might be that the wrong people are hurting (power-users users
> > > instead of developers) and that the process (with feature freezes)
> > > prevents larger slips.

> ?
> > > * Do you know anybody who could champion an effort to introduce CI?

> What is CI? If you're talking about introducing a formalized quality
> architecture, source code verification (a la ESC Java, or what
> Reasoning.com does with C code, or see what Joe Kiniry is doing in
> general), about tinderboxing, about working on good coverage of functional
> tests, use tests, about introducing regular quality metric reports (like
> the apidox which is already run), then one person who might like to
> champion that is me. I do this stuff - have done this stuff - as research
> for about five years, and I've been trying to champion it through other
> channels in KDE as well. But I didn't know about y'all.
> > > * Is there evidence in any large OS project that CI works except
> > > Mozilla?

> Assuming my assumptions about CI as stated above are correct, then 'yes'.
> There is some, either in the BSD world or in smaller verification examples
> (again, Joe Kiniry is a champion of this stuff in general).
> > > * What would you think could be gained? How much quality, developer
> > > comfort, etc.?

> One rallying cry is 'lightweight formal methods' in the industry; I'm not
> sure I really believe in them, but as a place to introduce some better
> forms of specification and checking, it's worth a start. I believe that
> reporting on pointer errors and dubious constructs automatically would
> already help the quality of KDE considerably. Maybe not so much in libs,
> where a lot of eyeballing has already been done, but more so in newer
> applications, all over KDE PIM. If you start programming and right away
> reports start rolling in telling you how your code might be better, you
> might learn faster.
> Look at the apidox tinderbox -- that warns about bad dox 4 times a day, but
> requires people to look at the warnings for the code they are responsible
> for. That doesn't really work very well, so there needs to be a response to
> bad dox.

I can understand that. What's crucial is to _expose_ errors. KDE developers
who've been here for a while know about plenty of common errors, and that
they're duplicated all over the place. We need to throw them in people's
faces, educate people about them, make them visible. That's the thought
behind Defense.

> For now, that's done by hand when I commit dox fixes: I usually CC
> the maintainer to point out how to do things better in future. I hope that
> has an effect eventuially.

I think so too, although it's a tedious job on your side.

[snip build verification and generic QA discussion; I'm focusing here on
source verification/Defense.]

> > power. Build verification and source verification are different things,
> > and implementing the two can be done separately, and dealt with
> > separately(afaict, imho, etc). Defense's aim is solely on source
> > verification.

> Good.
> > Defense is by the same principle of kdetestscripts, it's framework is
> > just tons of possible errors to detected; finding invalid XML, invalid
> > PHP code,

> How are those two relevant for KDE?

I gave them as examples, but I think one can argue for them as well. You gave
Frerich's Docbook checker as an example. Imagine that all our web pages were
well-formed XML/HTML and stayed that way, or that silly syntax errors in the
media framework were fixed before someone on kde-www reported them because a
subsite broke (yes, there are examples of that). When validation comes into
the picture it gets even better. There are plenty of KConfigXT and XMLGUI
files that have invalid XML -- that can be detected.

(An important point about testing is that it's not the individual tests that
bring the improvement; it's that together, over time, they are powerful, and
that one adopts testing as a methodology.)

> > faulty include guards, etc, etc.

> This hardly seems like stuff I'd call 'verification'; more a 'spotting of
> obvious errors'. Useful in its own right, for sure.
> > I think generic source verification has a future in KDE. One important
> > aspect for that, is it is attractive and practical, such that developers
> > wants to look up errors, and don't find it cumbersome. Hence, the

> Noone wants to look stuff up. You have to push error reports to developers.

Right. I think that's the crucial point in this matter. We don't have a
technical problem; we have a communication problem among our developers.
I see Defense as working in a media dimension, and think in terms of that. We
have tons of developers who don't know certain details -- const-correctness,
no semicolons after function definitions, and the endless other things we
usually yell about on kde-commits -- and we need to get better at pushing
that knowledge around. (That's how I see it.)

You mention Doxygen as an example above and I think it's mentioned as a
possible test in Defense's documentation. People don't care about Doxygen
because they don't "see" the errors; they are not aware of them. The errors
in their code need to be exposed so that they start to care.

I want to integrate Defense into KDE's web framework. I want an error summary
published as RSS, possibly published on the Dot and DKO.

> That would require knowing who to talk to for a given piece of code; having
> module / directory / application maintainers (also in libs) was discussed
> recently and written off as infeasible. So you don't know who to talk to
> for a problem in code piece X.
> > ...the practical problem is man power, as Christopher states. I myself is
> > more than busy implementing XPath 2.0..

> I'm trying to get that manpower.

I'm such a whiner. Of course I have time.

> > One aspect is described like this: "X hundred developers all make the
> > same mistake but the one which knows that particular detail and can do it
> > right." There's many practical examples of this, syntax details for
> > different versions of compilers; corner cases in build systems, etc, etc.

> Consider Binner, the human lint. He eventually started explaining _what_ he
> was fixing and _why_ instead of just using commit -m "Fixes", and that
> helped awareness of the problems that previously only he could spot.

Yupp, Binner, Dirk, you, and others make massive investments in fixing common
errors, and the common thread, as I see it, is that you know/care about
something that others don't. And that's my intention with Defense: to bring
that to a scalable level. I don't want Binner to manually go fixing
3 million lines of code; I want to see an XSL-T script that extracts relevant
title elements from Designer files, runs them through a specialized spell
checker, and then publishes the result on a website.
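
A minimal sketch of what such a test could look like, written in Python
rather than XSL-T for brevity. It assumes Designer files store titles as
<string> children of <property> elements named caption, title, or
windowTitle. The KNOWN_WORDS set is a hypothetical stand-in for a real spell
checker, and none of this is actual Defense code.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a real dictionary or spell checker.
KNOWN_WORDS = {"open", "file", "settings", "configure", "shortcuts"}


def extract_titles(ui_xml, property_names=("caption", "title", "windowTitle")):
    """Return the <string> texts of the named properties in a .ui document."""
    root = ET.fromstring(ui_xml)
    titles = []
    for prop in root.iter("property"):
        if prop.get("name") in property_names:
            string = prop.find("string")
            if string is not None and string.text:
                titles.append(string.text)
    return titles


def misspelled(text, known_words=KNOWN_WORDS):
    """Return the words in text that are not in known_words (case-insensitive)."""
    words = re.findall(r"[A-Za-z]+", text)
    return [w for w in words if w.lower() not in known_words]


# Made-up sample input with a deliberate typo in the window title.
SAMPLE_UI = """<ui version="4.0">
  <widget class="QDialog" name="Dialog">
    <property name="windowTitle">
      <string>Configure Shortcutts</string>
    </property>
  </widget>
</ui>"""

for title in extract_titles(SAMPLE_UI):
    print(title, "->", misspelled(title))
```

A real test would swap the word set for a proper dictionary and hand the
findings to the publishing side rather than printing them.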

> > And that's what Defense is. Defense is a web application, whose core is
> > written in Python and W3C XML Schema/XML. It produces a website
> > consisting of XHTML, XML+XSL, CSS and ECMAScript. From an architectural
> > perspective, Defense consists of a core which schedules tests(which there
> > can be hundreds of) that are run out-of-process and can be written in any
> > language. The tests are run on source code(such as kdelibs) and they
> > produce an XML file containing the test result, what file that was
> > tested, and other meta data. Among other things, this is what's published
> > on the website.

> While I'm happy that Defence is buzzword-compliant,

That's because I'm a buzz person (that skill of mine does of course Scale to
Enterprise Level). What's vaporware is relative, but Defense surely is
something: sloccount reports about 1000 loc of Python, 3700 lines of
WXS (schemata), 800 loc of XSL-T, plus Docbook/CSS/XHTML/ECMAScript. What it
does follows below.

> this doesn't tell me
> much about what it's testing or how. From the following description I
> gather I could do just about anything as a test, including dumping
> everything to UPPAAL and checking reachability, modeling the machine in Coq
> or ACL2 and verifying properties of the machine model loaded with KDE code,
> and whatever else I feel like -- but that doesn't tell me what Defense can
> do _now_.

What exists is fully functioning framework code, and two tests (proof of
concept). What's missing is more tests, and to start using it in order to
identify what's lacking -- I built the thing to a beta level and then left it.

It takes a test, runs it in a safe way, and after that focuses solely on
publishing the results on a website. One writes an XML file listing the MIME
types the test applies to, adds some XHTML docs, and the core takes care of the

The interesting part is how Defense adds value /to/ tests. Results are not
dumped in a big list, but saved in a structured way, knowing what file and
what test a particular result was created for/by (I've attached the descriptor
file and executable for the XML-wellformedness test as an example). The
website uses this; it builds a tree view over tested files, adds tooltips,
links to the docs, etc. And that's the whole point, to expose, publish, bring

And that's only a start. Once (if..) Defense is brought into production this
can be much improved: not only Defense itself, but also its promotion. For
example, you, Adriaan, mention the "apidox tinderbox", which I don't have a
clue about -- because there hasn't been an article on the Dot trumpeting what
a mind-blowing idea it is (it doesn't matter whether it is). Bombs like that are

I want to make Defense itch for developers. I want developers to be able to
click on their project, see "red" failures, and take pleasure in having no
errors, so that "no Defense tests are flagged for my app" is an indicator
of quality. I also want to heavily brand the Defense website, with the Crystal
icons for example, so that it feels like home and is appealing.

> (All told, this mail can be summed up as: (1) I like Formal Methods and
> formal QA and well-prganized testing (2) I'm trying to work on it in a
> full-time way but not being very successful at it yet (3) I'd like to
> cooperate with what you already have.)

I would also like to work on this. This is my suggestion: a second
developer (you, Adriaan, if you've got time and all that) teams up and gets
Defense running locally on his setup (I think that's tricky), and the usual
twenty what's-this & what's-that questions get answered, so two people
have insight and can comment deeply. After that one can combine forces. I'm
in particular interested in how different views on problems, and knowledge of
different resources (websites, servers, etc.), can result in interesting
One can team up on IRC, if of interest. I'm FransE, usually on #ksvg &



Content-Type: text/xml;
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;



XML Well-formedness

When a document meets the requirements of the XML syntax, it is
said to be well-formed. Hence, a not well-formed document isn't
an XML document, strictly speaking. Well-formedness is a
basic, fundamental requirement, and is a necessity for anything
that involves XML.

These are the requirements of well-formedness, roughly

  • The document may only have one top element, a so-called
    document element. For example, in XHTML
    the document element is called html.

  • Each start tag must have a corresponding end tag, or be
    an empty element, and they must nest properly. For



  • Since certain characters are used to signal elements and
    attributes, they must be escaped by character
    references when they should be interpreted as
    ordinary characters. These are the predefined entity
    references:
    &lt; < less than
    &gt; > greater than
    &amp; & ampersand
    &apos; ' apostrophe
    &quot; " quotation mark

    Forgetting to escape characters is a
    common mistake. They often hide inside URLs
    (http://www.google.dk/search?&q=define%3AURL)
    and attributes. Some processors report entity errors as
    "invalid tokens".

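The escaping rule above can be sketched with Python's standard library. This
is only an illustration: the sample text and the example.org URL are made up,
and the choice of xml.sax.saxutils is an assumption, not something this test
itself relies on.

```python
from xml.sax.saxutils import escape, quoteattr

# Character data: &, < and > are replaced by their predefined entities.
text = "4 < 5 & 6 > 5"
escaped_text = escape(text)
print(escaped_text)

# Attribute values: quoteattr() escapes the value and adds the quotes.
# URLs with query strings are a typical place for an unescaped "&".
url = "http://www.example.org/search?a=1&b=2"
link = "<a href=%s>link</a>" % quoteattr(url)
print(link)
```

Escaping at the point where text is written into markup, rather than by hand,
is what usually keeps stray ampersands out of URLs and attributes.
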
An XML document should also have an XML declaration at the very
beginning of the file. It typically looks like this:

<?xml version="1.0" encoding="UTF-8" ?>

A document may in addition to being well-formed XML, also be valid
XML, meaning that it conforms to a document definition, such as
XHTML 1.0, or Docbook XML 4.3.

This is an example of a well-formed XML document, accompanied
by the recommended XML declaration:

An element containing an attribute and

Five is less than six, but greater than four: 4 < 5 > 6
"I hate quotations." - Ralph Waldo Emerson


With PHP In the Picture

When using PHP or another dynamic language to generate XML,
the source document may by nature not be well-formed, even
though the output is. For example, the following is not
well-formed because it has multiple top elements:

$title = "KDE - Personal Information Management";
$location = "/ KDE PIM";
$dir = "./";
include "$dir/inc/header.inc";

KDE-PIM is an application suite for

The KDE-PIM applications help you to
organize your mail, addresses, todo's,
appointments, and so on.

include "$dir/inc/footer.inc";

This can be solved by wrapping all elements in a trivial
element, such as div or span:

$title = "KDE - Personal Information Management";
$location = "/ KDE PIM";
$dir = "./";
include "$dir/inc/header.inc";

KDE-PIM is an application suite for

The KDE-PIM applications help you to
organize your mail, addresses, todo's,
appointments, and so on.

include "$dir/inc/footer.inc";
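
The effect of the wrapper element can be checked with a short Python sketch.
The h2 and p element names and the shortened PHP blocks below are
assumptions made for illustration, not the markup of the page above. PHP
blocks are written as processing instructions, which XML allows before and
after the document element.

```python
import xml.etree.ElementTree as ET


def is_well_formed(source):
    """Return True if source parses as a well-formed XML document."""
    try:
        ET.fromstring(source)
    except ET.ParseError:
        return False
    return True


# Two top elements between the PHP blocks: not well-formed.
unwrapped = """<?php include "header.inc"; ?>
<h2>KDE-PIM</h2>
<p>The KDE-PIM applications help you to organize your mail.</p>
<?php include "footer.inc"; ?>"""

# The same content inside one trivial element: well-formed.
wrapped = """<?php include "header.inc"; ?>
<div>
<h2>KDE-PIM</h2>
<p>The KDE-PIM applications help you to organize your mail.</p>
</div>
<?php include "footer.inc"; ?>"""

print(is_well_formed(unwrapped))
print(is_well_formed(wrapped))
```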

Verifying well-formedness

Applications which read XML documents are required by the XML
specification to stop processing if a non-well-formed document
is encountered, and it is hence easy to find tools for
testing well-formedness. One method is to use xmllint,
part of libxml2 and
installed on practically every Linux distribution:
run xmllint --noout yourDocument.xml, and see if any
error message is printed. Another approach is to (try to) open
the XML document in a web browser from, say, the Mozilla family,
such as Firefox.
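
Where neither tool is at hand, a comparable check can be sketched with the
expat parser shipped with Python. This is an illustrative alternative, not
the method this test uses, and the function name is hypothetical.

```python
import xml.parsers.expat


def well_formedness_error(xml_text):
    """Return None if xml_text is well-formed, else the parser's message."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input.
        parser.Parse(xml_text, True)
    except xml.parsers.expat.ExpatError as err:
        # The message includes the line and column of the first error.
        return str(err)
    return None


for sample in ("<a><b/></a>", "<a><b></a>", "<a/><b/>", "<a>Q & A</a>"):
    print(repr(sample), "->", well_formedness_error(sample))
```

Like any conforming XML parser, expat stops at the first error, so a file
with several problems has to be fixed and rechecked one error at a time.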

See the W3C Recommendation Extensible
Markup Language (XML) 1.0 (Third Edition),
http://www.w3.org/TR/2004/REC-xml-20040204/,
to read more about the XML format.


Content-Type: application/x-python;
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;

#!/usr/bin/env python
# Copyright (C) 2004 - 2005 Frans Englich
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License as
# published by the Free Software Foundation; either version 2 of
# the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public
# License along with this program; if not, write to the Free
# Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
# MA 02110-1301, USA.

import defense
import libxml2

# Fixed, but see nevertheless:
# http://bugzilla.gnome.org/show_bug.cgi?id=167134

class XMLWellFormed(defense.Test):
    """Tests XML files for well-formedness with libxml2, via the Python bindings.

    @author Frans Englich
    """

    def start( self ):

        errors = []

        # libxml2 error handler; libxml2 calls it as f(ctx, message). See:
        # * libxml2.registerErrorHandler
        # * http://mail.gnome.org/archives/xml/2.../msg00008.html
        def errorHandler(_, message):
            # Collect each message so it can be shown in the description.
            errors.append( message )

        libxml2.registerErrorHandler( errorHandler, None )

        ctxt = libxml2.createFileParserCtxt( self.filename )

        # parseDocument() returns 0 on success, non-zero on a parse error.
        if ctxt.parseDocument():

            self.setFailed( True )
            self.setTitle( "XML file not well-formed" )
            self.setDescription( "\n" + "".join(errors) + "\n" )

if __name__ == "__main__":
    defense.runTest( XMLWellFormed )

# vim: set et tw=80 ts=4 sw=4 sts=4:
