lshtmlref

Language: en

Version: Jan 2000 (mandriva - 01/05/08)

Section: 1 (User commands)

NAME

lshtmlref - list all relative links in html files

SYNOPSIS

lshtmlref [-ahdAWL] [-i filelist] html-files

DESCRIPTION

lshtmlref searches HTML files for relative links and prints the path names of the linked files. This can be used to build consistent tar archives from a number of HTML pages: lshtmlref helps to include in these archives all the web pages, images, text files, and so on that the pages reference.
Note: lshtmlref is not recursive. It only lists the links in the files given on the command line.
lshtmlref expands a relative file path into a direct path by removing each .. together with the path component that precedes it. Each linked file is listed only once, regardless of how often it is referenced from the html-files.
lshtmlref uses stat(2) to find out whether a link points to a directory. It can therefore only conclude that an index file name must be appended to the path if the directory really exists. See option -i for how to specify the index file names.
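
The path expansion described above can be sketched in POSIX shell. This is only an illustration of the idea, not lshtmlref's actual code: the function name normalize_path is hypothetical, and the handling of a leading .. (it is kept, because there is no earlier component to cancel) is an assumption.

```shell
# normalize_path: hypothetical helper, not part of lshtmlref.
# Collapses "." and ".." components of a RELATIVE path given as $1.
# The body runs in a subshell ( ... ) so IFS and "set -f" do not leak out.
normalize_path() (
    result=
    IFS=/
    set -f                                  # no globbing while splitting
    for part in $1; do
        case $part in
            ''|.) ;;                        # skip empty and "." components
            ..)
                case $result in
                    ''|..|*/..)             # nothing to cancel: keep the ".."
                        result=${result:+$result/}.. ;;
                    */*)                    # drop the last component
                        result=${result%/*} ;;
                    *)                      # single component left: drop it
                        result= ;;
                esac ;;
            *)  result=${result:+$result/}$part ;;
        esac
    done
    printf '%s\n' "${result:-.}"
)
```

For instance, normalize_path docs/../img/logo.png prints img/logo.png, while normalize_path ../a.html is returned unchanged.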

To avoid errors from the tar program, lshtmlref does not include broken links. Instead it warns about nonexistent files on stderr unless the -W option is given.
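
The broken-link behaviour can be pictured as a small filter: existing paths go to stdout, missing ones produce a warning on stderr. The sketch below is a stand-in written for illustration only. The function name filter_existing, the QUIET variable (playing the role of -W), and the wording of the warning are all assumptions, not lshtmlref's real output.

```shell
# filter_existing: hypothetical helper, not part of lshtmlref.
# Reads one path per line on stdin. Paths that exist are printed on stdout;
# for the others a warning goes to stderr, unless QUIET=1 is set
# (QUIET is an invented stand-in for the -W option).
filter_existing() {
    while IFS= read -r file; do
        if [ -e "$file" ]; then
            printf '%s\n' "$file"
        elif [ "${QUIET:-0}" != 1 ]; then
            # The message text is made up for this sketch.
            printf 'warning: %s does not exist\n' "$file" >&2
        fi
    done
}
```

Because the warnings go to stderr, they do not end up in a file list that is piped on to tar.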

OPTIONS

-a
Print all relative links regardless of whether the files exist.
-d
Print all links from the web pages in debug format, with line number and html-file name.
-h
Print brief help/usage information.
-i
Index file names to use when a URL points to a directory. This is a comma-separated list. The default value is:
index.html,index.htm,index.shtml,index.phtml
-A
List all links (absolute, relative, mailto, ...) in the files. This option may be used to get an overview of the content of an HTML file. It must not be used when building a tar ball.
-L
Do not list the file names that were provided on the command line.
-W
Do not warn about broken links on stderr. Normally lshtmlref will check the existence of the referenced file and print an error message if it does not exist.
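
As an illustration of how the -i list could be applied, the sketch below appends an index file name to a link target that turns out to be a directory. It is not taken from lshtmlref's source. The function name resolve_index is hypothetical, and two details are assumptions: that the first name in the list that exists in the directory is chosen, and that the target is printed unchanged if none exists.

```shell
# resolve_index: hypothetical helper, not part of lshtmlref.
# $1: link target. $2: optional comma-separated list of index file names
# (the default is the one documented for -i).
# The body runs in a subshell so IFS and "set -f" do not leak out.
resolve_index() (
    target=$1
    names=${2:-index.html,index.htm,index.shtml,index.phtml}
    if [ -d "$target" ]; then               # directory test, as with stat(2)
        IFS=,
        set -f
        for name in $names; do
            # Assumption: the first existing index file wins.
            if [ -f "${target%/}/$name" ]; then
                printf '%s\n' "${target%/}/$name"
                exit 0                      # leaves only the subshell
            fi
        done
    fi
    printf '%s\n' "$target"                 # not a directory, or no index found
)
```

With a directory docs that contains index.htm, resolve_index docs/ prints docs/index.htm.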

EXAMPLE

Build a tar archive that includes all text files, images, etc. that are referenced with relative links (the quotes below are backquotes):
tar cvf web.tar `lshtmlref *.html */*.html`

Build a tar ball from all the images, HTML files, etc. that are referenced by index.html with relative links:
lshtmlref index.html | xargs tar cvf ball.tar
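
Both the backquote form and plain xargs split their input on whitespace, so a path such as "my page.html" would be passed to tar as two arguments. A possible workaround, sketched below, is to let tar read the list itself. This assumes that lshtmlref prints one path per line and that the installed tar accepts "-T -" (GNU tar and bsdtar do). The function name archive_listed is hypothetical.

```shell
# archive_listed: hypothetical helper, not part of lshtmlref.
# Reads one path per line on stdin and stores those files in the
# tar archive named by $1. Relies on tar's "-T -" (read names from stdin),
# which GNU tar and bsdtar provide.
archive_listed() {
    tar -cf "$1" -T -
}

# Intended use (assumes lshtmlref prints one path per line):
#   lshtmlref index.html | archive_listed ball.tar
```

File names that contain a newline are still not handled by this approach.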

Check that relative links from the files a.html and b.html do not point to any HTML files other than a.html and b.html. This check can be useful before you build a complete tar archive:
lshtmlref -L a.html b.html | fgrep .html

BUGS

No known bugs.

AUTHOR

Guido Socher (guido.socher@linuxfocus.org)

SEE ALSO

hrefgrep(1), srcgrep(1), webfgrep(1), httpcheck(1), blnkcheck(1)