<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
	  <!ENTITY RFC1034 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1034.xml">
	  <!ENTITY RFC1035 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1035.xml">
	  <!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
	  <!ENTITY RFC2606 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2606.xml">
	  <!ENTITY RFC3596 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3596.xml">
	  <!ENTITY RFC3833 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3833.xml">
	  <!ENTITY RFC4291 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4291.xml">
	  ]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc strict="yes" ?>
<?rfc toc="yes"?>
<?rfc tocdepth="2"?>
<?rfc symrefs="yes"?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<rfc category="std" docName="draft-irtf-asrg-dnsbl-07" ipr="full3978">
 <front>
    <title>DNS Blacklists and Whitelists</title>
      <author fullname="John Levine" initials="J." surname="Levine">
        <organization>Taughannock Networks</organization>
        <address>
          <postal>
            <street>PO Box 727</street>
            <city>Trumansburg</city>
            <code>14886</code>
            <region>NY</region>
          </postal>
           <phone>+1 831 480 2300</phone>
          <email>standards@taugh.com</email>
	  
	  <uri>http://www.taugh.com</uri>
        </address>
      </author>
      <date month="October" year="2008" />
      <area>Applications</area>
      <workgroup>Anti-Spam Research Group</workgroup>
      <keyword>mail, electronic mail, DNS, spam, blacklist, whitelist</keyword>

      <abstract>
      <t>The rise of spam and other anti-social behavior on the Internet has led
	to the creation of shared blacklists and whitelists of IP addresses
	or domains.
	The DNS has become a de-facto standard method of distributing these
	blacklists and whitelists.
	This memo documents the structure and usage of DNS based blacklists and
	whitelists, and the protocol used to query them.
      </t>
      </abstract>

      <note title='IRTF Notice'>
      <t>This document is a product of the Anti-Spam Research Group (ASRG).
	Comments and discussion may be directed to the ASRG mailing list,
	asrg@irtf.org.</t>

      <t>This document represents
	the consensus of the Anti-Spam Research Group of the Internet Research
	Task Force.</t>
      </note>

 </front>

 <middle>
   <section title='Introduction'>
     <t>In 1997, Dave Rand and Paul Vixie,
       well known Internet software engineers, started
       keeping a list of IP addresses that had sent them spam or engaged in
       other behavior that they found objectionable.
       Word of the list quickly spread, and they started distributing it as a
       BGP feed for people who wanted to block all traffic from listed IP
       addresses at their routers.
       The list became known as the Real-time Blackhole List (RBL).</t>

     <t>Many network managers wanted to use the RBL to block unwanted e-mail,
       but weren't prepared to use a BGP feed.
       Rand and Vixie created a DNS-based distribution scheme that quickly became
       more popular than the original BGP distribution.
       Other people created other DNS-based blacklists either to compete
       with the RBL or to complement it by listing different categories of
       IP addresses.
       Although some people refer to all DNS-based blacklists as ``RBLs'',
       the term properly is used for the MAPS RBL, the descendant of the
       original list.
       (In the United States, the term RBL is a registered service mark of Trend
       Micro<xref target='MAPSRBL' />.)</t>

     <t>The conventional term is now DNS Blacklist or Blocklist, or DNSBL.
       Some people also publish DNS-based whitelists or DNSWLs.
       Network managers typically use DNSBLs to block traffic and
       DNSWLs to preferentially accept traffic.
       The structure of a DNSBL and DNSWL are the same,
       so in the subsequent discussion
       we use the abbreviation DNSxL to mean either.</t>

     <t>This document defines the structure of DNSBLs and DNSWLs.
       It describes the structure, operation, and use of DNSBLs and
       DNSWLs but does not describe or recommend policies for adding or
       removing addresses to and from DNSBLs and DNSWLs, nor does it recommend policies
       for using them.
       We anticipate that management policies will be addressed in a companion
       document.</t>
      <t>
        <list style="hanging">
          <t hangText="Requirements Notation:  "> The key words "MUST", "MUST
            NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT",
            "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
            interpreted as described in <xref target="RFC2119" />.
          </t>
        </list>
      </t>
   </section>

   <section title='Structure of an IP address DNSBL or DNSWL'>
     
     <t>A DNSxL is a zone in the
       DNS<xref target="RFC1034" /><xref target="RFC1035" />.
       The zone containing
       resource records identifies hosts present in a blacklist or whitelist.
       Hosts were originally encoded into DNSxL zones
       using a transformation of their IP addresses,
       but now host names are sometimes encoded as well.
       Most DNSxLs still use IP addresses.</t>

     <section title='IP address DNSxL'>

     <t>An IPv4 address DNSxL has a structure adapted from that of the rDNS.
       (The rDNS, reverse DNS, is the IN-ADDR.ARPA<xref target="RFC1034" /> 
       and IP6.ARPA<xref target="RFC3596" /> domains
       used to map IP addresses to domain names.)
       Each IPv4 address listed in the DNSxL has a corresponding DNS entry
       created by reversing the order of the octets of the text representation
       of the IP address, and appending the domain name of the DNSxL.</t>

     <t>If, for example, the DNSxL is called bad.example.com, and the IPv4
       address to be listed is 192.0.2.99, the name of the DNS entry would
       be 99.2.0.192.bad.example.com.
       Each entry in the DNSxL MUST have an A record.
       DNSBLs SHOULD have a TXT record that describes the reason for
       the entry.
       DNSWLs MAY have a TXT record that describes the reason for
       the entry.
       The A record conventionally has the value 127.0.0.2, but MAY have
       other values as described below in Combined IP address DNSxLs.
       The TXT record describes the reason that the IP is listed in the DNSxL,
       and is often used as the text of an SMTP error response when an SMTP
       client attempts to send mail to a server using the list as a DNSBL,
       or as explanatory text when the DNSBL is used in a scoring spam filter.
       The DNS records for this entry might be:</t>

     <figure>
       <artwork>
99.2.0.192.bad.example.com    A      127.0.0.2
99.2.0.192.bad.example.com    TXT
         "Dynamic address, see http://bad.example.com?192.0.2.99"
       </artwork>
     </figure>

     <t>Some DNSxLs use the same TXT record for all entries,
       while others provide a different TXT record for each entry or range
       of entries that describes the reason that entry or range is listed.
       The reason often includes the URL of a web page where more information
       is available.
       Client software MUST check the A record and MAY check the TXT record.</t>

     <t>If a range of addresses is listed in the DNSxL, 
       the DNSxL MUST contain an A record
       (or a pair of A and TXT records) for every address in the DNSxL.
       Conversely,
       if an IP address is not listed in the DNSxL, there MUST NOT be any records for
       the address.</t>
     </section>

     <section title='IP address DNSWL'>

     <t>Since SMTP has no standard way for a server to advise a client why a request
       was accepted, TXT records in DNSWLs are not very useful.
       Some DNSWLs contain TXT records anyway to document the
       reasons that entries are present.</t>

     <t>It is possible and occasionally useful for a DNSxL to be used as a DNSBL in one
       context and a DNSWL in another.
       For example, a DNSxL that lists the IP addresses assigned to dynamically
       assigned addresses on a particular network might be used as a DNSWL on that network's
       outgoing mail server or intranet web server, and used as a DNSBL for mail servers
       on other networks.</t>
     </section>

     <section title='Combined IP address DNSxL'>

     <t>In many cases, an organization maintains a DNSxL that contains
       multiple entry types, with the entries of each type constituting a sublist.
       For example, an organization that publishes a DNSBL listing
       sources of unwanted e-mail might wish to indicate why various addresses are
       included in the list, with one sublist for addresses listed due to sender
       policy, a second list for addresses of open relays, a third list for hosts
       compromised by malware, and so forth.
       (At this point all of the DNSxLs with sublists of which we are aware are
       intended for use as DNSBLs, but the sublist techniques are equally usable
       for DNSWLs.)</t>

     <t>There are three common methods of representing a DNSxL with multiple sublists:
       subdomains, multiple A
       records, and bit encoded entries.
       DNSxLs with sublists SHOULD use both subdomains and one of the other methods.</t>

     <t>Sublist subdomains are merely subdomains of the main DNSxL domain.
       If for example, bad.example.com had two sublists relay and malware,
       entries for 
       192.0.2.99 would be 99.2.0.192.relay.bad.example.com
       or 99.2.0.192.malware.bad.example.com.
       If a DNSxL contains both entries for a main domain and for sublists,
       sublist names MUST be at least two characters and
       contain non-digits, so there is no problem of name collisions with
       entries in the main domain, where the IP addresses consist of digits or single
       hex characters.</t>

     <t>To minimize the number of DNS lookups, multiple sublists can also be encoded as bit masks or multiple
       A records.
       With bit masks, the A record entry for each IP is the logical OR of the bit masks for all of the lists on
       which the IP appears.
       For example, the bit masks for the two sublists might be 127.0.0.2 and 127.0.0.4, in which case
       an entry for an IP on both lists would be 127.0.0.6:</t>

     <figure>
       <artwork>
99.2.0.192.bad.example.com    A      127.0.0.6
       </artwork>
     </figure>

     <t>With multiple A records, each sublist has a different assigned value
       such as 127.0.1.1, 127.0.1.2, and so forth,
       with an A record for each sublist on which the IP appears:</t>

     <figure>
       <artwork>
99.2.0.192.bad.example.com    A      127.0.1.1
99.2.0.192.bad.example.com    A      127.0.1.2
       </artwork>
     </figure>

     <t>There is no widely used convention for mapping sublist names to bits
       or values, beyond the convention that all A values SHOULD be in the
       127.0.0.0/8 range to prevent unwanted network traffic if the value is
       accidentally used as an IP address.</t>

     <t>DNSxLs that return multiple A records sometimes return multiple TXT
       records as well, although the lack of any way to match the TXT records
       to the A records limits the usefulness of those TXT records.  Other
       combined DNSxLs return a single TXT record.</t>
     </section>
   
     <section title='IPv6 DNSxLs'>

     <t>The structure of DNSxLs based on IPv6 addresses is adapted from
       that of the IP6.ARPA domain defined in <xref target="RFC3596" />.
       Each entry MUST be a 32-component hex nibble-reversed IPv6
       addresses suffixed by the the DNSxL domain.
       For example, to represent the address:</t>
     <figure>
       <artwork>
  2001:db8:1:2:3:4:567:89ab
       </artwork>
     </figure>

     <t>in the DNSxL ugly.example.com, the entry might be:</t>

     <figure>
       <artwork>
  b.a.9.8.7.6.5.0.4.0.0.0.3.0.0.0.2.0.0.0.1.0.0.0.8.b.d.0.1.0.0.2.
               ugly.example.com. A 127.0.0.2
                                 TXT "Spam received."  
       </artwork>
     </figure>

     <t>Combined IPv6 sublist DNSxLs are represented the same way as IPv4 DNSxLs,
       replacing the four octets of IPv4 address with the 32 nibbles of IPv6 address.</t>

     <t>A single DNSxL could in principle contain both IPv4 and IPv6 addresses,
       since the different lengths prevent any ambiguity.
       If a DNSxL is represented using traditional zone files and wildcards,
       there is no way to specify the length of the name that a wildcard matches,
       so wildcard names would indeed be ambiguous for DNSxLs served in that fashion.</t>
     </section>
   </section>

   <section title='Domain name DNSxLs'>

     <t>A few DNSxLs list domain names rather than IP addresses.
       They are sometimes called RHSBLs, for right hand side blacklists.
       The names of their entries MUST contain the listed domain name followed by the name of the DNSxL.
       If the DNSxL were called doms.example.net, and the domain invalid.edu were to be listed, the
       entry would be named invalid.edu.doms.example.net:</t>

     <figure>
       <artwork>
invalid.edu.doms.example.net    A      127.0.0.2
invalid.edu.doms.example.net    TXT    "Host name used in phish"
       </artwork>
     </figure>

    <!-- <t>A few name-based DNSBLs encode e-mail addresses using a convention
adapted from DNS SOA records, with the mailbox name encoded as the first
       component of the domain name, so an entry for fred@invalid.edu would
have the name fred.invalid.edu.doms.example.net:</t>-->

    <!--<figure>
       <artwork>
fred.invalid.edu.doms.example.net    A      127.0.0.2
       </artwork>
     </figure>-->

     <t>Name-based DNSBLs are far less common than IP based DNSBLs.
       There is no agreed convention for wildcards.</t>

     <t>Name-based DNSWLs can be created in the same manner as DNSBLs,
and have been used as simple reputation systems with the values of octets
       in the A record representing reputation scores and confidence values,
       typically on a 0-100 or 0-255 scale.</t>
   </section>

     <section title='DNSxL cache behavior'>

     <t>The per-record time-to-live and zone refresh intervals of
       DNSBLs and DNSWLs vary greatly depending on the management policy of
       the list.  The TTL and refresh times SHOULD be chosen to reflect the
       expected rate of change of the DNSxL.
       A list of IP addresses assigned to dynamically allocated
       dialup and DHCP users could be expected to change slowly, so the TTL
       might be several days and the zone refreshed once a day.  On the other
       hand, a list of IP addresses that had been observed sending spam might
       change every few minutes, with comparably short TTL and refresh
       intervals.</t>
     </section>

     <section title='Test and contact addresses'>

     <t>IPv4 based DNSxLs MUST contain an entry for 127.0.0.2 for testing purposes.
       IPv4 based DNSxLs MUST NOT contain an entry for 127.0.0.1.</t>

     <t>DNSBLs that return multiple values SHOULD have multiple test addresses so that,
       for example, a DNSBL that can return 127.0.0.5 would have a test
       record for 127.0.0.5 that returns an A record
       with the value 127.0.0.5, and a corresponding TXT record.</t>

     <t>IPv6 based DNSxLs MUST contain an entry for ::FFFF:7F00:2
       (::FFFF:127.0.0.2), and MUST NOT contain
       an entry for ::FFFF:7F00:1 (::FFFF:127.0.0.1), the
       <xref target="RFC4291">IPv4-Mapped IPv6 Address</xref>
       equivalents of the IPv4 test addresses.</t>

     <t>Domain name based DNSxLs MUST contain an entry for the
       <xref target="RFC2606" /> reserved domain name
       "TEST" and MUST NOT contain an entry for the reserved
       domain name "INVALID".</t>

     <t>DNSxLs also MAY contain an A record at the apex of the DNSxL zone
       that points to a web server, so that
       anyone wishing to learn about the bad.example.net DNSBL can check http://bad.example.net.</t>

     <t>The combination of a test address that MUST exist and an address that
       MUST NOT exist allows a client system to defend against DNSxLs which
       deliberately
       or by accident install a wildcard that returns an A record for all
       queries.
       DNSxL clients SHOULD periodically check appropriate test entries to
       ensure that the DNSxLs they are using are still operating.</t>
     </section>

   <section title='Typical usage of DNSBLs and DNSWLs'>

   <t>DNSxLs can be served either from standard DNS servers, or from specialized
servers like <xref target='RBLDNS'>rbldns</xref>
and <xref target='RBLDNSD'>rbldnsd</xref>
that accept lists of IP addresses and CIDR ranges
and synthesize the appropriate DNS records on the fly.
Organizations that make heavy use of a DNSxL usually arrange for a private
mirror of the DNSxL, either using the standard AXFR and IXFR or by fetching
a file containing addresses and CIDR ranges for the specialized servers.
If a /24 or larger range of addresses is listed, and the
zone's server uses traditional zone files to represent the DNSxL, the DNSxL
MAY use wildcards to limit the size of the zone file.  If for example,
the entire range of 192.0.2.0/24 were listed, the DNSxL's zone could
contain a single wildcard for *.2.0.192.bad.example.com.</t>

   <t>DNSBL clients are most often mail servers or spam filters called from mail servers.
There's no requirement that DNSBLs be used only for mail, and other services such as IRC use
them to check client hosts that attempt to connect to a server.</t>

   <t>A client MUST interpret any returned A record as meaning that an address
     or domain is listed in a DNSxL.
Mail servers that test combined lists most often handle them the same as single lists
and treat any A record as meaning that an IP is listed without distinguishing among the
various reasons it might have been listed.
   DNSxL clients SHOULD be able to use bit masks and value range tests on
   returned A record values in order to select particular sublists of a combined list.</t>

   <t>Mail servers typically check a list of DNSxLs on every incoming SMTP connection, with the
names of the DNSxLs set in the server's configuration.
A common usage pattern is for
the server to check each list in turn until it finds one with a DNSBL entry, in which case it
rejects the connection, or a DNSWL entry in which case it accepts the connection.
If the address appears on no list at all (the usual case for legitimate mail),
the mail server accepts the connection.
In another approach, DNSxL entries are used as inputs to a weighting function
that computes an overall score for each message.</t>

   <t>The mail server uses its normal local DNS cache to limit traffic to the DNSxL servers and
to speed up retests of IP addresses recently seen.
Long-running mail servers MAY cache DNSxL data internally,
but MUST respect the TTL values and discard expired records.</t>

   <t>An alternate approach is to check DNSxLs in a spam filtering package after a message
has been received.
In that case, the IP(s) to test are usually extracted from "Received:" header fields
or URIs in the body of the message.  The DNSxL
results can be used to make a binary accept/reject decision, or in a scoring system.</t>

   <t>Packages that test multiple header fields MUST be able to distinguish among values in lists with sublists
since, for example, an entry indicating that an IP is assigned to dialup users might be treated
as a strong indication that a message would be rejected if the IP sends mail directly to the recipient
system, but not if the message were relayed through an ISP's mail server.</t>

   <t>Name-based DNSBLs have been used both to check domain names of e-mail addresses
and host names found in mail headers, and to check the domains found in URLs
in message bodies.</t>
   </section>

   <section title='Security Considerations'>

  <t>Any system manager that uses DNSxLs is entrusting part of his or her server management
to the parties that run the lists.
A DNSBL manager that decided to list 0/0 (which has actually happened) could cause every server
that uses the DNSBL to reject all mail.
Conversely, if a DNSBL manager removes all of the entries (which has also happened), systems that
depend on the DNSBL will find that their filtering doesn't work as they want it to.</t>

  <t>Since DNSxL users usually make a query for every incoming e-mail message,
the operator of a DNSxL can extract approximate mail volume statistics
from the DNS server logs.
This has been used in a few instances to estimate the amount of mail
individual IPs or IP blocks send<xref target='SENDERBASE' />
<xref target='KSN' />.</t>

  <t>As with any other DNS based services, DNSBLs and DNSWLs are subject to 
various types of DNS attacks which are described in <xref target="RFC3833"/>.</t>
   </section>

   <section anchor="IANA" title="IANA Considerations">
  <t>This memo includes no request to IANA.</t>
   </section>
 </middle>

 <back>

   <references title="Normative References" >
     &RFC1034;
     &RFC1035;
     &RFC2119;
     &RFC2606;
     &RFC3596;
     &RFC4291;
   </references>

   <references title="Informative References" >
     &RFC3833;

     <reference anchor='RBLDNS'>
       <front>
	 <title>rbldns, in 'djbdns'</title>
	 <author initials='D. J.' surname='Bernstein'
		 fullname='D. J. Bernstein'>
	   <organization />
	 </author>
       </front>
       <format type="HTML" target='http://cr.yp.to/djbdns.html' />
     </reference>

     <reference anchor='MAPSRBL'>
       <front>
	 <title>MAPS RBL+</title>
	 <author fullname='Mail Abuse Prevention System'>
	   <organization />
	 </author>
       </front>
       <format type="HTML" target='http://mail-abuse.com/' />
     </reference>

     <reference anchor='RBLDNSD'>
       <front>
	 <title>rbldnsd: Small Daemon for DNSBLs</title>
	 <author initials='M.' surname='Tokarev' fullname='Michael Tokarev'>
	   <organization />
	 </author>
       </front>
       <format type="HTML" target='http://www.corpit.ru/mjt/rbldnsd.html' />
     </reference>

     <reference anchor='SENDERBASE'>
       <front>
	 <title>Senderbase</title>
	 <author initials='' surname='Ironport Systems'>
	   <organization />
	 </author>
       </front>
       <format type="HTML" target='http://www.senderbase.org' />
     </reference>

     <reference anchor='KSN'>
       <front>
	 <title>The South Korean Network Blocking List</title>
	 <author fullname="John Levine" initials="J." surname="Levine">
           <organization>Taughannock Networks</organization>
         </author>
       </front>
       <format type="HTML" target='http://korea.services.net' />
     </reference>

   </references>
    <section title="Change Log">
      <t>
        <spanx style="strong">NOTE TO RFC EDITOR: This section may be removed
          upon publication of this document as an RFC.</spanx>
      </t>
      <section title="Changes since -asrg-dnsbl-06">
	<t>Change forbidden example from EXAMPLE to INVALID.</t>
	<t>Remove SOA encoded email addresses.</t>
	<t>Change IPv6 test addresses.</t>
      </section>
      <section title="Changes since -asrg-dnsbl-05">
	<t>Pervasive edits to standard language, including RFC2119 terms.</t>
	<t>Test entries clarified for IPv4, invented for IPv6 and domains.</t>
      </section>
    </section>
 </back>
</rfc>
