taggrep

Language: en

Version: May 1999 (mandriva - 01/05/08)

Section: 1 (User commands)

NAME

taggrep - search for a given html tag and display its content

SYNOPSIS

taggrep [-hs] [-c closetag1,closetag2,...] starttag html-files

DESCRIPTION

taggrep is a grep-like program that lists all HTML tags of type starttag in a given list of HTML pages. It can also list the content enclosed between tags. To do this, specify a comma-separated list of tags that should end the starttag (see the -c option).

Note: To cope with broken HTML, or with close tags missing from the command line, the search ends automatically if the starttag is not terminated after 200 characters. In this case, ...... is printed to indicate that the line continues.

taggrep produces one line of output per starttag; that is, it removes repeated white space and newlines when printing the result.
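
The matching rules described above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not the actual taggrep source: the names tag_grep and MAX_LEN are hypothetical, the 200-character limit is assumed to be counted from the beginning of the starttag, and a terminator is assumed to be either the opening or the closing form of a listed tag (as the EXAMPLE section suggests). File names and line numbers are left out.

```python
import re

MAX_LEN = 200  # cutoff taken from the description; counting from the tag start is an assumption


def tag_grep(html, start_tag, close_tags=()):
    """Return one whitespace-normalised string per occurrence of start_tag."""
    start_re = re.compile(r"<%s\b[^>]*>" % re.escape(start_tag), re.IGNORECASE)
    results = []
    for m in start_re.finditer(html):
        if not close_tags:
            # No -c list: report only the tag itself.
            text = m.group(0)
        else:
            # Assumption: an opening or closing form of any listed tag ends the match.
            names = "|".join(re.escape(t) for t in close_tags)
            end_re = re.compile(r"</?(?:%s)\b[^>]*>" % names, re.IGNORECASE)
            end = end_re.search(html, m.end())
            if end and end.end() - m.start() <= MAX_LEN:
                text = html[m.start():end.end()]
            else:
                # Not terminated in time: cut off and mark the continuation.
                text = html[m.start():m.start() + MAX_LEN] + "......"
        # Collapse newlines and repeated white space into single spaces.
        results.append(" ".join(text.split()))
    return results
```

With this sketch, tag_grep('<meta name="a"\n  content="b">', "meta") returns a single entry, '<meta name="a" content="b">', and an unterminated <title> followed by 300 characters is cut after 200 characters and suffixed with "......".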

OPTIONS

-c closetag1,closetag2,...
Specify a comma-separated list of HTML tags that should end the starttag.
-h
Print brief help/usage information.
-s
Short listing. Do not print the name of the html file and the line number of the starttag.
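
The effect of -s on a result line can be illustrated with a small Python sketch. It is not taken from taggrep itself: format_hit and its parameters are hypothetical names, and the "file:line: text" layout is inferred from the output shown in the EXAMPLE section.

```python
def format_hit(filename, html, offset, text, short=False):
    """Format one result line; offset is the index of the starttag in html."""
    if short:
        # Short listing (-s): only the matched text.
        return text
    # 1-based line number of the starttag within the file contents.
    line_no = html.count("\n", 0, offset) + 1
    return "%s:%d: %s" % (filename, line_no, text)
```

For a file whose third line holds <title>Hi</title>, the default form yields "doc.html:3: <title>Hi</title>" and the short form yields just "<title>Hi</title>".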

EXAMPLE

Show the META tags of all html pages in the current directory:
taggrep meta *.html

Show the document title of all html pages in the current directory:
taggrep -c title title *.html
This lists everything from <title> to </title>.

To show all list items (LI) in a given HTML document, you may type:
taggrep -c li,ol,ul li doc.html

For a document that looks as follows:
<ol>
<li>item 1</li>
<li>item 2</li>
<li type=disc> three
<li type=square>four</li>
</ol>

this will produce:
doc.html:12: <li>item 1</li>
doc.html:13: <li>item 2</li>
doc.html:14: <li type=disc> three <li type=square>
doc.html:15: <li type=square> four</li>

Note that the tag <li type=square> both terminates the list item on line 14 and starts a new list item on line 15. taggrep therefore prints <li type=square> twice, although it occurs only once in the document.

BUGS

No known bugs.

AUTHOR

Guido Socher (guido.socher@linuxfocus.org)

SEE ALSO

hrefgrep(1), srcgrep(1), taggrep(1), lshtmlref(1), blnkcheck(1)