--9dhkaBkwM1QzkQhP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Jul 12, 2007 at 04:00:33AM +0200, arni wrote:
> I put this together in around an hour to show how it's possible to
> cope with PDF spam - the script completely decodes the PDF attachment
> into text and images and reattaches them. That way the text is fully
> available to all means of SA processing, as well as the images to
> FuzzyOCR, if installed.


Please don't do that (adding new message parts), btw. There's a 3.2
plugin call (post_message_parse, per bug 5069) that was added
specifically so that plugins can manipulate messages after the initial
parse has completed. This allows for things like OCR of images and
PDF->text: the rendered text can go right in the existing message part,
where SA automatically includes it as body text, so it is available for
body rules, URI parsing, etc.
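
To make the distinction concrete, here is a rough sketch of the idea in
Python. It is only a toy model and not SpamAssassin's actual Perl plugin
API: the Part and Message classes, render_pdf_parts() and
extract_pdf_text() are all hypothetical names invented for illustration,
and the "PDF extractor" is a placeholder. The point it tries to show is
that the hook fills in rendered text on the part that is already there,
rather than appending new parts to the message.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Part:
    """Toy stand-in for one parsed MIME part (hypothetical, not SA's class)."""
    content_type: str
    raw: bytes
    rendered: Optional[str] = None  # text exposed to body rules, if set


@dataclass
class Message:
    """Toy stand-in for a parsed message (hypothetical)."""
    parts: List[Part] = field(default_factory=list)

    def body_text(self) -> str:
        # Prefer rendered text when a hook supplied it; otherwise fall
        # back to the raw content of text/* parts only.
        chunks = []
        for part in self.parts:
            if part.rendered is not None:
                chunks.append(part.rendered)
            elif part.content_type.startswith("text/"):
                chunks.append(part.raw.decode("ascii", errors="replace"))
        return "\n".join(chunks)


def extract_pdf_text(raw: bytes) -> str:
    # Placeholder extractor (assumption): treats everything after the
    # first line as the document text. A real one would parse the PDF.
    _, _, rest = raw.partition(b"\n")
    return rest.decode("ascii", errors="replace").strip()


def render_pdf_parts(msg: Message) -> int:
    """Post-parse hook: set rendered text on PDF parts in place.

    Returns the number of parts that were rendered. No parts are added
    to or removed from the message.
    """
    count = 0
    for part in msg.parts:
        if part.content_type == "application/pdf":
            part.rendered = extract_pdf_text(part.raw)
            count += 1
    return count
```

With something like this, a message holding one text/plain part and one
application/pdf part still has exactly two parts after the hook runs,
but body_text() now also contains the text pulled out of the PDF.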


--=20
Randomly Selected Tagline:
"Never go off on tangents, which are lines that intersect a curve at only
one point and were discovered by Euclid, who live in the 6th century,
which was an era dominated by the Goths, who lived in what we now know
as Poland." - Unknown from Nov. 1998 issue of Infosystems Executive.

--9dhkaBkwM1QzkQhP
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)

iD8DBQFGlZRuamwUIkXWD1cRAhn9AJ4yE3c+XpKWN4tKoxmyVl urLqEzXACfVH7Z
ALkjYcp2SACbeSBTNqx7c3U=
=YjUF
-----END PGP SIGNATURE-----

--9dhkaBkwM1QzkQhP--