cant

Language: en

Version: April 2009 (debian - 07/07/09)

Section: 1 (User commands)

NAME

cant - CAnonicalize N-Triples

DESCRIPTION

CAnonicalize N-Triples

OPTIONS

--verbose     -v       Print what you are doing as you go
--help        -h       Print this message and exit
--from=uri    -f uri   Specify an input file (or web resource)
--diff=uri    -d uri   Specify a difference file

Any number of --from <file> parameters may be given, in which case the files are merged. If none are given, /dev/stdin is used.

If any diff files are given, the diff files are read and merged separately, then compared with the input files. The result is a list of differences instead of the canonicalized graph. This is NOT a minimal diff. Exits with nonzero system status if the graphs do not match.
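The merge-and-compare behaviour described above can be sketched in a few lines. This is only an illustration, not the code in cant.py: the names merge and diff are hypothetical, each source is assumed to be a list of N-Triples lines, and bnode labels are assumed to have been canonicalized already, so a plain set comparison is enough.

```python
def merge(sources):
    """Union the statements from several sources (hypothetical helper).

    Each source is an iterable of N-Triples lines; blank lines and
    comment lines are skipped.
    """
    merged = set()
    for lines in sources:
        for line in lines:
            line = line.strip()
            if line and not line.startswith("#"):
                merged.add(line)
    return merged


def diff(inputs, references):
    """Return (only_in_inputs, only_in_references) as sorted lists."""
    a, b = merge(inputs), merge(references)
    return sorted(a - b), sorted(b - a)


# Two input sources are merged and compared with one reference source.
left = [
    ['<http://example.org/a> <http://example.org/p> "1" .'],
    ['<http://example.org/a> <http://example.org/q> "2" .'],
]
right = [[
    '<http://example.org/a> <http://example.org/p> "1" .',
    '<http://example.org/a> <http://example.org/q> "3" .',
]]

only_left, only_right = diff(left, right)
# Mirror the documented behaviour: nonzero status when the graphs differ.
exit_status = 1 if (only_left or only_right) else 0
```

Because the comparison works on whole statements, a single changed literal shows up as one removal and one addition, which is consistent with the remark that the result is not a minimal diff.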

This is an independent N-Triples canonicalizer. It uses heuristics, and will not terminate on all graphs. It is designed for testing: the output and the reference output are both canonicalized and compared.

It uses the very simple NTriples format. It is designed to be independent of the SWAP code so that it can be used to test the SWAP code. It doesn't boast any fancy algorithms - just tries to get the job done for the small files in the test datasets.

The algorithm is to generate a "signature" for each bnode. This is found simply by looking in its immediate vicinity, treating any local bnode as a blank. Bnodes which have signatures unique within the graph can be allocated canonical identifiers as a function of the ordering of the signatures. These are then treated as fixed nodes. If another pass is done over the new graph, the signatures are more distinct.
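The signature passes described above might look roughly like the following. This is a minimal sketch written for this page, not the code in cant.py: triples are assumed to be (subject, predicate, object) tuples of strings, bnodes are assumed to start with "_:", and the names signature and canonicalize and the "_:c0" label scheme are illustrative choices. Unlike the real program, the sketch raises an error rather than looping when no bnode can be distinguished.

```python
def is_bnode(term):
    return term.startswith("_:")


def signature(node, triples, fixed):
    """Sorted triples touching `node`: the node itself appears as '*',
    bnodes that are not yet fixed are blanked to '_', and fixed bnodes
    appear under their canonical label."""
    def mask(term):
        if term == node:
            return "*"
        if is_bnode(term) and term not in fixed:
            return "_"
        return fixed.get(term, term)

    return tuple(sorted(
        tuple(mask(t) for t in triple)
        for triple in triples
        if node in (triple[0], triple[2])
    ))


def canonicalize(triples):
    fixed = {}  # original bnode label -> canonical label
    while True:
        pending = sorted({t for tr in triples for t in (tr[0], tr[2])
                          if is_bnode(t) and t not in fixed})
        if not pending:
            break
        by_sig = {}
        for node in pending:
            by_sig.setdefault(signature(node, triples, fixed), []).append(node)
        unique = sorted(sig for sig, nodes in by_sig.items() if len(nodes) == 1)
        if not unique:
            # Assumption of this sketch: give up instead of not terminating.
            raise ValueError("no bnode has a unique signature")
        # Canonical labels follow the ordering of the unique signatures.
        for sig in unique:
            fixed[by_sig[sig][0]] = "_:c%d" % len(fixed)
    return sorted(tuple(fixed.get(t, t) for t in tr) for tr in triples)


# A chain of four bnodes: the two middle nodes tie on the first pass and
# only become distinct once their neighbours have been fixed.
chain_a = [("_:a", "<p>", "_:b"), ("_:b", "<p>", "_:c"),
           ("_:c", "<p>", "_:d"), ("_:d", "<p>", "<end>")]
chain_b = [("_:y", "<p>", "_:z"), ("_:z", "<p>", "<end>"),
           ("_:w", "<p>", "_:x"), ("_:x", "<p>", "_:y")]
same = canonicalize(chain_a) == canonicalize(chain_b)
```

The two chains differ only in bnode labels and statement order, so a canonical form should make them compare equal; the second pass is what separates the two middle nodes.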

This works for well-labelled graphs, and for graphs which don't have large areas of interconnected bnodes or large duplicate areas. A particular failing is the complete lack of treatment of symmetry between bnodes.
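The symmetry problem can be seen on the smallest possible case. The sketch below is illustrative only: local_signature is a hypothetical helper, and the example.org predicate is made up. Two bnodes that refer to each other in the same way get identical local signatures, so neither is ever unique and no further pass can separate them.

```python
from collections import Counter


def local_signature(node, triples):
    """Triples touching `node`, with `node` shown as '*' and every
    other bnode blanked to '_' (hypothetical helper for illustration)."""
    def mask(term):
        if term == node:
            return "*"
        return "_" if term.startswith("_:") else term

    return tuple(sorted(
        tuple(mask(t) for t in tr)
        for tr in triples
        if node in (tr[0], tr[2])
    ))


# Two bnodes that each point at the other with the same predicate.
symmetric = [
    ("_:a", "<http://example.org/knows>", "_:b"),
    ("_:b", "<http://example.org/knows>", "_:a"),
]

counts = Counter(local_signature(n, symmetric) for n in ("_:a", "_:b"))
# Signatures shared by more than one bnode cannot be used to label them.
tied = [sig for sig, n in counts.items() if n > 1]
```

Here tied holds the single signature shared by both nodes; a canonicalizer that relies only on unique signatures has nothing to work with on such a graph.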

References:

.google graph isomorphism
See also e.g. http://www.w3.org/2000/10/rdf-tests/rdfcore/utils/ntc/compare.cc
NTriples: see http://www.w3.org/TR/rdf-testcases/#ntriples
Not to mention, published this month by coincidence:
Kelly, Brian, [Whitehead Institute] "Graph canonicalization", Dr Dobb's Journal, May 2003.
$Id: cant.py,v 1.15 2007/06/26 02:36:15 syosi Exp $

This is or was http://www.w3.org/2000/10/swap/cant.py W3C open source licence <http://www.w3.org/Consortium/Legal/copyright-software.html>.

2004-02-31 Serious bug fixed. This is a test program, and should itself be tested.

Quis custodiet ipsos custodes?