TMDA Filter Sources
In the following list of sources, the expected match field is
documented as well as any optional or required arguments. Square
brackets ([]) indicate that the argument is optional. Words in
chevrons (<>) should be replaced by the appropriate
option, without the chevrons.
NOTE:
The from* and to* sources match against
different addresses or sets of addresses depending on whether they are
used in an incoming filter or an outgoing filter.
Source | Incoming Filter | Outgoing Filter |
from* |
- envelope sender
- From: header field
- Reply-To: header field
- X-Primary-Address header field if the PRIMARY_ADDRESS_MATCH
setting can be satisfied
|
- From: header field
set by user's MUA
|
to* |
|
One of:
- recipients on tmda-sendmail's command line
- To: header field
|
Domain Search
Domain names of the addresses given above are also used in the search.
The portion after the first @ in an e-mail address is
considered the "domain". For example:
jason@mastaler.com -> mastaler.com
mastaler@cs.yale.edu -> cs.yale.edu
Domains must be listed one per line when used in a file. For example:
wingnet.net
mastaler.com
tmda.net
cs.yale.edu
The matching is exact. This isn't wildcarding, so with the above list,
mastaler@cs.yale.edu would match, but
mastaler@wopr.cs.yale.edu would not match.
You'd have to add wopr.cs.yale.edu to the list first.
This feature may be useful for sites that wish to check a large number
of domain names, but don't want the overhead of the wildcard
code. This feature is less flexible than wildcarding, but is much
faster since the list of domains can be stored in a CDB or DBM (either directly,
or by using -autocdb / -autodbm ).
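The exact-match behaviour described above can be sketched in a few lines of Python. This is only an illustration, not TMDA's own code: the function names `domain_of` and `matches_domain_list` are hypothetical, and the set stands in for a one-per-line domain file.

```python
def domain_of(address):
    """Return the portion after the first '@' in an address."""
    return address.split("@", 1)[1]

def matches_domain_list(address, domains):
    """Exact membership test; a subdomain does not match its parent domain."""
    return domain_of(address) in domains

# Hypothetical list, as it might be loaded from a one-per-line file.
domains = {"wingnet.net", "mastaler.com", "tmda.net", "cs.yale.edu"}

print(matches_domain_list("mastaler@cs.yale.edu", domains))       # True
print(matches_domain_list("mastaler@wopr.cs.yale.edu", domains))  # False
```

A set lookup like this is a constant-time hash probe, which is the same reason a CDB or DBM lookup is cheaper than running every wildcard pattern in turn.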
Sources
This group of sources may be used in either incoming or outgoing
filter files.
-
from <email_address>
to <email_address>
- The
from and to sources expect a match field
of either an explicit email address or a wildcarded email address. The format
of the email address is documented below.
-
from-file [ -autocdb | -autodbm ] [ -optional ] <textfile>
to-file [ -autocdb | -autodbm ] [ -optional ] <textfile>
- The
from-file and to-file sources expect the
name of a textfile as the match field. You can specify the entire path explicitly
or use a leading '~' to represent the user's home directory, like the shell
does. The match field is always the name of the textfile. You do not need
to add '.cdb' or '.db' if you use the -auto* flags; the extension is
appended to the filename automatically. The format of the textfile is documented
below.
Both from-file and to-file can take one of the -autocdb
or -autodbm flags. The -autocdb and -autodbm flags are documented
below.
If the -optional flag is given, the non-existence of the file
is not an error. If the file should exist, don't specify this flag; the parser
will log an error and will defer the mail so that you have a chance to fix
the problem.
-
from-cdb [ -optional ] <database.cdb>
to-cdb [ -optional ] <database.cdb>
- The
from-cdb and to-cdb sources expect a match
field of a CDB database filename. You can specify the entire path or use a
leading '~' to represent the user's home directory. You should specify the
.cdb extension as part of the filename. The CDB format and expected
contents are documented below.
If the -optional flag is given, the non-existence of the file
is not an error. If the file should exist, don't specify this flag; the parser
will log an error and will defer the mail so that you have a chance to fix
the problem.
-
from-dbm [ -optional ] <database.db>
to-dbm [ -optional ] <database.db>
- The
from-dbm and to-dbm sources expect the name
of a DBM database in the match field. You can specify the entire path or use
a leading '~' to represent the user's home directory. You should specify the
.db extension as part of the filename. The DBM format and expected
contents are documented below.
If the -optional flag is given, the non-existence of the file
is not an error. If the file should exist, don't specify this flag; the parser
will log an error and will defer the mail so that you have a chance to fix
the problem.
-
from-ezmlm [ -optional ] <path_to_subscribers_parent_dir>
to-ezmlm [ -optional ] <path_to_subscribers_parent_dir>
- The
from-ezmlm and to-ezmlm sources match against
the subscriber list of an ezmlm mailing list. They expect the match field to
be the full path of the parent directory of an ezmlm `subscribers'
directory. You should not include the `subscribers' portion of the path.
If the -optional flag is given, the non-existence of the file
is not an error. If the file should exist, don't specify this flag; the parser
will log an error and will defer the mail so that you have a chance to fix
the problem.
-
from-mailman -attr=<attribute> [ -optional ] <path_to_list_dir>
to-mailman -attr=<attribute> [ -optional ] <path_to_list_dir>
- The
from-mailman and to-mailman sources match
against addresses contained in a Mailman configuration database. The match field should
be the full path to the list directory. Both Mailman 2.0 and 2.1-style configuration
databases are supported.
The -mailman sources require you to specify an `attribute' to
search. Use the -attr=attribute flag to specify the name of an
attribute contained in the database. For example, `members' (subscriber addresses),
`digest_members' (digest subscriber addresses), or `owner' (list owner's address).
If the -optional flag is given, the non-existence of the file
is not an error. If the file should exist, don't specify this flag; the parser
will log an error and will defer the mail so that you have a chance to fix
the problem.
-
from-sql -wildcards | -addr_column=<column_name> [ -action_column=<column_name>] <SQL_query>
to-sql -wildcards | -addr_column=<column_name> [ -action_column=<column_name> ] <SQL_query>
- The
from-sql and to-sql sources match
against addresses stored in a SQL database. The <SQL_query>
is a SQL SELECT statement that should retrieve the appropriate
data. Just what that data needs to be will vary, depending on how
you choose to use the from-sql and
to-sql rules. Remember to enclose the
<SQL_query> in quotes, since it will contain spaces. Double
quotes are recommended, to avoid clashing with the single quotes
used by SQL.
For a simple SELECT statement, you can put it directly in your
from-sql or to-sql rule. For more
complex statements, you can define a macro in the filter
file or you can define a variable in either /etc/tmdarc or
~/.tmda/config and use the macro or variable name in the filter
rule. Defining a variable can be more flexible, especially for
multiple users at a single site who need the same SELECT statement
with minor modifications. For example, you could define this
SELECT statement in your /etc/tmdarc:
SQL_WHITELIST = """
SELECT wl.address
FROM whitelist AS wl, users AS u
WHERE u.uid = wl.uid
AND u.address = %(recipient)s
AND %(criteria)s
LIMIT 1"""
This assumes a 'users' table with a unique ID (uid)
and the user's email address (address). It joins the
'users' and 'whitelist' tables based on the
uid. Details on '%(recipient)s' and '%(criteria)s'
can be found below.
A rule in your incoming filter using this variable would look like
this:
from-sql -addr_column=wl.address "${SQL_WHITELIST}" accept
Note that the variable is quoted because it is a string that
contains spaces.
Pre-requisite: In either /etc/tmdarc or ~/.tmda/config you
must set the DB_CONNECTION variable
so that TMDA can talk to the database. How this is done depends on
the database module used. Examples:
- MySQLdb
DB_CONNECTION = MySQLdb.connect(db='<dbname>',
host='<dbhost>',
user='<username>',
passwd='<password>')
- PyGreSQL
DB_CONNECTION = pgdb.connect('<dbhost>:<dbname>:<username>:<password>')
Wildcard Searches
The *-sql rules can be used in two scenarios. If the
-wildcards argument is given, the <SQL_query>
is run and the resulting data set is read, in its entirety, from
the database. The first column should be the addresses to match
against. The second column is optional, but if it is present, it
should be the overriding action or NULL. The returned data is
searched in exactly the same way as text files containing
wildcards. See Email Addresses
below.
Any columns beyond the second will be ignored. This can come in
handy if you need a column in the SELECT list for an ORDER BY
clause. Because the search code stops at the first match, unsorted
data could cause an incorrect match and the overriding action
might not be what you want. If you use wildcards in the address
column and you allow an overriding action, you should sort the
returned values using an ORDER BY clause.
Exact Match Searches
If an exact match of the sender or recipient is all you need,
i.e. you don't need wildcards, then you can use the *-sql rules to
have the database perform the search for you, returning only the
rows that exactly matched. You should specify the
-addr_column argument and provide the name of the
column that contains the addresses to search. You do not
need to include this column in the SELECT list.
If you have an overriding action column, you should give its name
using the -action_column argument. If you use the
-action_column argument, you must include that
column in the SELECT list.
Caveat: When the exact-match form of the
from-sql rule is used, TMDA can search for more than
one sender at once. If the SELECT statement returns more than one
row, TMDA will use the overriding action from the first row, since
it has no way of knowing which sender (the From:, the Reply-To: or
the envelope sender address) you care most about. Instead of
using overriding actions, consider using separate blacklists and
whitelists.
Your SELECT statement can be as complex as you care to make it,
including joins, an ORDER BY clause, a LIMIT clause, etc. TMDA
must know where to place the search conditions ("<sender1> =
<addr_column> OR <sender2> = <addr_column>",
etc.). You should include the string "%(criteria)s" in your SELECT
statement at the appropriate location. TMDA will build the list of
conditions based on the addresses to be matched and will replace
"%(criteria)s" with that list. Here's an example to make this
clearer.
Assume you have the following rule in your incoming filter.
from-sql -addr_column=address <SQL_query> ok
An email arrives with a From: header of "friend@example.com" and
a Reply-To: header of "friend@another.com". TMDA will generate the
following criteria string:
(address = 'friend@example.com' OR address = 'friend@another.com')
Your SELECT statement (<SQL_query>) might look something like this:
SELECT address FROM addr_list WHERE %(criteria)s
The SQL code that TMDA actually sends to the database will look
like this (reformatted for easier readability):
SELECT address
FROM addr_list
WHERE (address = 'friend@example.com' OR
address = 'friend@another.com')
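To experiment with this substitution without a database server, the idea can be imitated with Python's built-in sqlite3 module. This is a sketch under stated assumptions rather than TMDA's actual code: the helper `build_criteria` is hypothetical, the `addr_list` table is created only for the demonstration, and sqlite3's `?` placeholders are used so that the driver, not the script, quotes the sender addresses.

```python
import sqlite3

def build_criteria(addr_column, senders):
    """Return an OR-ed condition with one placeholder per sender."""
    conditions = " OR ".join("%s = ?" % addr_column for _ in senders)
    return "(%s)" % conditions

query_template = "SELECT address FROM addr_list WHERE %(criteria)s"
senders = ["friend@example.com", "friend@another.com"]

# Substitute the generated conditions into the template.
query = query_template % {"criteria": build_criteria("address", senders)}

# Throwaway in-memory table, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE addr_list (address TEXT)")
conn.execute("INSERT INTO addr_list VALUES ('friend@another.com')")

# The sender values are passed separately and quoted by the driver.
rows = conn.execute(query, senders).fetchall()
print(query)
print(rows)
```

Here the query returns one row because only the Reply-To: address is stored in the table, so the rule would count as a match.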
If you store all of your users' whitelists in a single table (a
good schema design), you will need some way to restrict your
search to a single user's list; the user whose copy of TMDA is
querying the database. In order to facilitate that, the
from-sql and to-sql rules provide three
strings that can be used anywhere in your SELECT statement.
- username -- the value of the USERNAME variable from ~/.tmda/config
- hostname -- the value of the HOSTNAME variable from ~/.tmda/config
- recipient -- username@hostname
You can place these in your SELECT statement by using
"%(username)s", "%(hostname)s" and/or "%(recipient)s", as needed.
TMDA will substitute the appropriate values into the SELECT at the
time of the search. Do not put quotes around the above
variables. The Python DB API takes care of that for you in a
manner appropriate for the database you are using.
The following group of sources may be used only in incoming filter files.
-
body [ -case ] <regular_expression>
headers [ -case ] <regular_expression>
-
The
body and headers sources expect a match
field that is a regular expression as defined in Python's re module. The body source
matches against the body of the message while the headers source
matches against the header fields.
Because regular expressions may include spaces, you must surround the
regular expressions with quotation marks. You may use either single
quotes (') or double quotes
(") as long as you use the same one at both
the beginning and the end.
If you need to match a quote in your regular expression, simply use
the other style of quotes to surround the expression or escape the
embedded quote with a backslash (\).
The regular expression match is case-insensitive by default. If you
want a case-sensitive match, specify the -case flag.
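The effect of the -case flag can be illustrated with Python's re module directly. The helper `source_matches` below is hypothetical, and using `re.search` (a match anywhere in the text) is an assumption; the source only states that the expressions follow the re module's syntax and are case-insensitive by default.

```python
import re

def source_matches(pattern, text, case=False):
    """Search text for pattern; ignore case unless case is True."""
    flags = 0 if case else re.IGNORECASE
    return re.search(pattern, text, flags) is not None

body = "Limited time OFFER, act now"
print(source_matches(r"limited time offer", body))             # True
print(source_matches(r"limited time offer", body, case=True))  # False
```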
-
body-file [ -case ] [ -optional ] <regexp_file>
headers-file [ -case ] [ -optional ] <regexp_file>
-
The
body-file and headers-file sources
expect the match field to contain the filename of a textfile
containing one or more regular expressions as defined in Python's re module. The body-file
source matches against the body of the message while the
headers-file source matches against the header fields. The
format of the regular expression file is documented below.
The regular expression match is case-insensitive by default. If you
want a case-sensitive match, specify the -case flag.
If the -optional flag is given, the non-existence of the
file is not an error. If the file should exist, don't specify this
flag; the parser will log an error and will defer the mail so that you
have a chance to fix the problem.
-
size <<size_in_bytes> | ><size_in_bytes>
-
The
size source expects a comparison operator and a
number of bytes to compare to the size of the message. Only the
`<' and `>' operators are supported. There must not be any
whitespace between the operator and the number.
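As a rough sketch of how such a match field could be interpreted, the following Python splits the operator from the number and rejects any whitespace between them. The function names are hypothetical and this is not taken from TMDA's parser.

```python
def parse_size_field(field):
    """Split a match field such as '<5000' into (operator, bytes)."""
    operator, number = field[:1], field[1:]
    if operator not in ("<", ">") or not number.isdigit():
        raise ValueError("expected '<N' or '>N' with no whitespace: %r" % field)
    return operator, int(number)

def size_matches(field, message_size):
    """Compare the message size in bytes against the parsed limit."""
    operator, limit = parse_size_field(field)
    return message_size < limit if operator == "<" else message_size > limit

print(size_matches("<5000", 1200))  # True
print(size_matches(">5000", 1200))  # False
```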
-
pipe <command_string>
-
The
pipe source expects a shell command string to which the
full contents of the incoming message will be piped. A match is found
if command_string returns a zero exit status. Any non-zero
exit status is interpreted as a non-match. If command_string
contains whitespace, it should be quoted.
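A command can be tried against a saved message before it goes into a filter. The sketch below, using Python's subprocess module, mimics the documented behaviour: the message is written to the command's standard input and a zero exit status counts as a match. The helper name `pipe_matches` is hypothetical, and the example commands assume a system where grep is installed.

```python
import subprocess

def pipe_matches(command_string, message_bytes):
    """Feed the message to the command's stdin; exit status 0 is a match."""
    result = subprocess.run(
        command_string,
        shell=True,
        input=message_bytes,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

message = b"Subject: hello\n\nJust a test.\n"
print(pipe_matches("grep -q hello", message))
print(pipe_matches("grep -q goodbye", message))
```

With grep available, the first call should report a match and the second should not.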
Miscellaneous Notes
- Email Addresses
- In addition to explicit email addresses, you can use expressions based
on UNIX shell-style wildcard characters anywhere an email address is expected.
NOTE: Wildcard characters are not recognized in a CDB or
DBM file and are only recognized in SQL databases if you specify
the -wildcards argument to the rule.
The special characters are:
Character(s) Description
------------- -----------
* Matches everything.
? Matches any single character.
[seq] Matches any character in seq.
[!seq] Matches any character not in seq.
In addition, `@=' (a custom rule) will expand to match both @
and @*.
Here are some common examples:
# match only jdoe@domain.dom
jdoe@domain.dom
# match anyone@domain.dom, but not anyone@sub.domain.dom
*@domain.dom
# match anyone@sub.domain.dom, but not anyone@domain.dom
*@*.domain.dom
# match both anyone@domain.dom, and anyone@sub.domain.dom
*@=domain.dom
NOTE: To match the empty envelope sender, which bounce messages
are sent with, use <> as the expression.
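The wildcard characters listed above are the same ones understood by Python's fnmatch module, so the examples can be tried out with a short sketch. This is an illustration, not TMDA's implementation: the helpers `expand_pattern` and `address_matches` are hypothetical, and lowercasing both sides before comparing is an assumption.

```python
from fnmatch import fnmatchcase

def expand_pattern(pattern):
    """Expand the custom '@=' token into its two plain wildcard forms."""
    if "@=" in pattern:
        return [pattern.replace("@=", "@"), pattern.replace("@=", "@*.")]
    return [pattern]

def address_matches(pattern, address):
    """Shell-style match; folding case on both sides is an assumption."""
    return any(
        fnmatchcase(address.lower(), candidate.lower())
        for candidate in expand_pattern(pattern)
    )

print(address_matches("*@domain.dom", "anyone@sub.domain.dom"))   # False
print(address_matches("*@*.domain.dom", "anyone@domain.dom"))     # False
print(address_matches("*@=domain.dom", "anyone@domain.dom"))      # True
print(address_matches("*@=domain.dom", "anyone@sub.domain.dom"))  # True
```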
- Email Address Files
- Email address files are textfiles containing an email address, domain, or
wildcarded email address on each line.
When using the
from-file and to-file sources, the
textfile is searched sequentially, with the first match terminating the search.
Address files may contain an optional second field on each line that specifies
an action (ok, drop, bounce, etc.). If the action is specified,
it overrides the action given in the filter rule.
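The sequential search and the optional overriding action can be sketched as follows. This is a simplified illustration with assumptions: the function `lookup_action` is hypothetical, lines starting with # are treated as comments, bare domain lines and the `@=` expansion are not handled, and matching is delegated to Python's fnmatch module.

```python
from fnmatch import fnmatchcase

def lookup_action(lines, address, rule_action):
    """Return the action of the first matching line, or None if none match."""
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        if fnmatchcase(address, fields[0]):
            # An optional second field overrides the rule's action.
            return fields[1] if len(fields) > 1 else rule_action
    return None

# Hypothetical file contents; order matters because the first match wins.
lines = [
    "spammer@example.com drop",
    "*@example.com",
    "*@example.org bounce",
]

print(lookup_action(lines, "spammer@example.com", "ok"))  # drop
print(lookup_action(lines, "friend@example.com", "ok"))   # ok
```

If the first two lines were swapped, spammer@example.com would match the wildcard first and receive the rule's own action, which is why specific entries belong above general ones.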
- Auto-Database Flags
- If you have lengthy email address textfiles, you might want to consider
using the much faster hashed databases instead. The address files used by
the auto-building hashed database feature are the same email address textfiles
documented above with the sole exception that wildcards are not supported.
The -autocdb and -autodbm arguments are intended
to ease the use of CDB/DBM lists in TMDA by automatically rebuilding the CDB
or DBM file as necessary. This gives you the performance advantages of hashed
databases without the hassle of having to manually maintain them. With the
-auto* arguments, TMDA will rebuild the database if it doesn't
exist or if its timestamp is older than its source file. If the rebuild fails
for some reason, TMDA will fall back to matching from the textfile instead.
Before you try the CDB version of this feature, make sure you have the python-cdb extension module installed.
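The rebuild condition described above (database missing, or older than its source file) amounts to a timestamp comparison. A minimal sketch, assuming a hypothetical helper named `needs_rebuild` and modification times as the timestamps being compared:

```python
import os

def needs_rebuild(textfile, database):
    """True if the database is missing or older than its source textfile."""
    if not os.path.exists(database):
        return True
    return os.path.getmtime(database) < os.path.getmtime(textfile)
```

Editing the textfile updates its modification time, so the next lookup would find the database stale and trigger a rebuild without any manual step.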
- Database Files
- CDB and DBM files are hashed databases. TMDA can look up email addresses
or domains in these files. Lookup in these files is much faster than in a textfile. On
the other hand, wildcards are not supported in database files -- only in textfiles.
In a CDB or DBM, the keys should be the email addresses or domains to match, and their
corresponding values (or records) should be empty unless you want to override
the action specified in the filter file.
CDB or DBM files can be created outside of TMDA and merely referenced by your
filter files (use the *-cdb and *-dbm filter rules)
or can be automatically created by TMDA if you use the -autocdb
or -autodbm flags and the *-file rules.
If you wish to explore CDB databases, make sure you have the python-cdb extension module installed.
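The key and value layout described above can be tried out with Python's standard dbm package. In this sketch `dbm.dumb` is only a portable stand-in for whichever DBM format your TMDA installation reads, and the `lookup` helper and the sample entries are hypothetical.

```python
import dbm.dumb
import os
import tempfile

def lookup(db, key, rule_action):
    """Return the overriding action, the rule's action, or None if absent."""
    raw = key.encode()
    if raw not in db:
        return None
    # An empty value means "no override": fall back to the rule's action.
    return db[raw].decode() or rule_action

# Throwaway database in a temporary directory, for demonstration only.
path = os.path.join(tempfile.mkdtemp(), "whitelist")
db = dbm.dumb.open(path, "c")
db[b"jdoe@domain.dom"] = b""   # empty value: keep the rule's action
db[b"mastaler.com"] = b"drop"  # non-empty value: override the action

print(lookup(db, "jdoe@domain.dom", "ok"))     # ok
print(lookup(db, "mastaler.com", "ok"))        # drop
print(lookup(db, "nobody@example.com", "ok"))  # None
```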
- Regular Expression Files
- A regular expression textfile is simply a text file with a regular expression
on each line. The file is read sequentially and each regular expression is
used to attempt a match. As soon as there is a match, the search stops.
Because regular expressions may include spaces, you must surround the regular
expressions with quotation marks. You may use either single quotes (')
or double quotes (") as long as you use the same one
at both the beginning and the end.
If you need to match a quote in your regular expression, simply use the other
style of quotes to surround the expression or escape the embedded quote with
a backslash (\).