NAME

Apache::Tidy - HTML Tidy as an Apache filter


SYNOPSIS

  PerlModule Apache::Filter
  PerlModule Apache::Tidy
  <Location /filtered/*.html>
     SetHandler perl-script
     PerlHandler Apache::Tidy
  </Location>


ABSTRACT

  Cleans up and fixes invalid HTML on the fly.


DESCRIPTION

Apache::Tidy is a wrapper for the htmltidy program (http://tidy.sourceforge.net/) built on the Apache::Filter framework. It fixes HTML/XHTML validation issues on the fly.

Dave Raggett's HTML Tidy is a free command-line utility for cleaning up messy and invalid HTML or XHTML code. It will correct missing or mismatched end tags, clean up Microsoft Word generated HTML, convert pages to XHTML, and format markup for easier reading.

Apache::Tidy uses the Apache::Filter framework to allow you to automatically run tidy over web content as it is being served. This can be very useful if you have editors or CMSes that produce invalid markup.

To filter static content, add the following to your httpd.conf:

  PerlModule Apache::Tidy
  <Location /directory/to/filter/>
     SetHandler perl-script
     PerlHandler Apache::Tidy
  </Location>

Apache::Tidy can also work as part of an Apache::Filter chain:

  PerlModule Apache::Filter
  PerlModule Apache::RegistryFilter
  PerlModule Apache::Tidy
  <Location /perl/*.pl>
     PerlSetVar Filter On
     SetHandler perl-script
     PerlHandler Apache::RegistryFilter Apache::Tidy
  </Location>

You can pass any of htmltidy's command-line options to Apache::Tidy by setting TidyOptions:

  <Location /filtered/>
    SetHandler perl-script
    PerlHandler Apache::Tidy
    PerlSetVar TidyOptions '-wrap 60'
    PerlSetVar TidyOptions -clean
    PerlSetVar TidyOptions -asxhtml
  </Location>

It defaults to '-q -asxhtml' if no options are explicitly set.

You can also specify a different path to the tidy executable (necessary if you've installed it anywhere but in /usr/bin/) and a different temp directory (defaults to /tmp):

  <Location /filtered/>
    SetHandler perl-script
    PerlHandler Apache::Tidy
    PerlSetVar TidyPath /opt/local/bin/tidy
    PerlSetVar TidyTempDir /some/other/temp/dir
  </Location>


NOTES

You must have htmltidy installed on your system. If it is installed anywhere other than in /usr/bin/, you'll have to specify the full path with:

  PerlSetVar TidyPath /path/to/tidy

I've only tested Apache::Tidy on Unix systems. It may run on other platforms, but you will probably have to change the path, temp directory, and options.

Since Apache::Tidy just jumps out to the shell to call the external tidy program, it probably isn't very efficient. I'd like to reimplement this someday with an XS or SWIG-wrapped tidylib.
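
To make that cost concrete, the round trip through an external tidy process might look roughly like the following. This is a minimal sketch and not Apache::Tidy's actual source: the tidy_markup function and its path, tempdir, and options arguments are hypothetical names chosen to mirror TidyPath, TidyTempDir, and TidyOptions, and it uses a list-form pipe open rather than building a shell command line.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (an assumption, not the module's real code): write the
# markup to a temp file, run the tidy executable on it, return what it prints.
sub tidy_markup {
    my ($html, %opt) = @_;
    my $tidy    = $opt{path}    || '/usr/bin/tidy';                # mirrors TidyPath
    my $tmpdir  = $opt{tempdir} || '/tmp';                         # mirrors TidyTempDir
    my @options = @{ $opt{options} || [ '-q', '-asxhtml' ] };      # mirrors TidyOptions

    # The temp file is removed automatically when the program exits.
    my ($fh, $filename) = tempfile(DIR => $tmpdir, UNLINK => 1);
    print {$fh} $html;
    close $fh or die "cannot write $filename: $!";

    # List-form pipe open runs the program directly, without a shell.
    open my $out, '-|', $tidy, @options, $filename
        or die "cannot run $tidy: $!";
    my $tidied = do { local $/; <$out> };   # slurp all of the program's output

    # tidy exits non-zero when it reports warnings or errors, so the close
    # status is deliberately not treated as fatal in this sketch.
    close $out;

    return $tidied;
}
```

Every request pays for a temp file write and a process launch, which is where the inefficiency comes from. A call such as tidy_markup($html, path => '/opt/local/bin/tidy', options => ['-clean', '-asxhtml']) would correspond to the TidyPath and TidyOptions settings shown earlier.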


SEE ALSO

the Apache::Filter manpage, http://tidy.sourceforge.net/, the Apache::RegistryFilter manpage


AUTHOR

Anders Pearson, <anders@columbia.edu>


COPYRIGHT AND LICENSE

Copyright 2003 by Anders Pearson

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.