<?xml version="1.0"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>

<!--
     ASCII to XML transformation by Invisible Worlds, Inc.
     http://invisible.net/
     Last transformation: 03-Feb-1999, 02:08:04

     The canonical version of this document is at:
     http://info.internet.isi.edu/in-notes/rfc/files/rfc2017.txt

     Implementors should verify all content against the
     canonical version.  Failure to do so may result in
     protocol failures.
-->

<rfc number="2017"
     category="std">
<front>
<title abbrev="URL Access-Type">Definition of the URL MIME External-Body Access-Type</title>
<author initials="N." surname="Freed" fullname="Ned Freed">
<organization>Innosoft International, Inc.</organization>
<address>
<postal>
<street>1050 East Garvey Avenue South</street>
<street>West Covina</street>
<street>CA 91790</street>
<country>USA</country>
</postal>
<phone>+1 818 919 3600</phone>
<facsimile>+1 818 919 3614</facsimile>
<email>ned@innosoft.com</email>
</address>
</author>
<author initials="K." surname="Moore" fullname="Keith Moore">
<organization>Computer Science Dept.</organization>
<address>
<postal>
<street>University of Tennessee</street>
<street>107 Ayres Hall</street>
<street>Knoxville</street>
<street>TN 37996-1301</street>
<country>USA</country>
</postal>
<email>moore@cs.utk.edu</email>
</address>
</author>
<date month="October" year="1996"/>
<area>Applications</area>
<keyword>multipurpose internet mail extensions</keyword>
<keyword>MIME</keyword>
<keyword>uniform resource</keyword>
</front>
<middle>
<!-- RFC original section: (1.) -->
<section title="Abstract">
<t>
   This memo defines a new access-type for message/external-body MIME
   parts for Uniform Resource Locators (URLs).  URLs provide schemes to
   access external objects via a growing number of protocols, including
   HTTP, Gopher, and TELNET.  An initial set of URL schemes is defined
   in RFC 1738.
</t>
</section>
<!-- RFC original section: (2.) -->
<section title="Introduction">
<t>
   The Multipurpose Internet Message Extensions (MIME) define a facility
   whereby an object can contain a reference or pointer to some form of
   data rather than the actual data itself. This facility is embodied in
   the message/external-body media type defined in RFC 1521.  Use of
   this facility is growing as a means of conserving bandwidth when
   large objects are sent to large mailing lists.
</t>
<t>
   Each message/external-body reference must specify a mechanism whereby
   the actual data can be retrieved.  These mechanisms are called access
   types, and RFC 1521 defines an initial set of access types: &quot;FTP&quot;,
   &quot;ANON-FTP&quot;, &quot;TFTP&quot;, &quot;LOCAL-FILE&quot;, and &quot;MAIL-SERVER&quot;.
   Uniform Resource Locators, or URLs, also provide a means by which
   remote data can be retrieved automatically.  Each URL string begins
   with a scheme specification, which in turn specifies how the
   remaining string is to be used in conjunction with some protocol to
   retrieve the data. However, URL schemes exist for protocol operations
   that have no corresponding MIME message/external-body access type.
   Registering an access type for URLs therefore provides
   message/external-body with access to the retrieval mechanisms of URLs
   that are not currently available as access types.  It also provides
   access to any future mechanisms for which URL schemes are developed.
</t>
<t>
   This access type is only intended for use with URLs that actually
   retrieve something. Other URL mechanisms, e.g. mailto, may not be
   used in this context.
</t>
</section>
<!-- RFC original section: (3.) -->
<section title="Definition of the URL Access-Type">
<t>
   The URL access-type is defined as follows:
<list>
<t>
    (1)   The name of the access-type is URL.
</t>
<t>
    (2)   A new message/external-body content-type parameter is
          used to actually store the URL string. The name of the
          parameter is also &quot;URL&quot;, and this parameter is
          mandatory for this access-type. The syntax and use of
          this parameter is specified in the next section.
</t>
<t>
    (3)   The phantom body area of the message/external-body is
          not used and should be left blank.
</t></list>
</t>
<t>
   For example, the following message illustrates how the URL
   access-type is used:
</t>
<figure><artwork>
    Content-type: message/external-body; access-type=URL;
                  URL=&quot;http://www.foo.com/file&quot;

    Content-type: text/html
    Content-Transfer-Encoding: binary

    THIS IS NOT REALLY THE BODY!
</artwork></figure>
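<t>
   As a non-normative illustration, the following Python sketch shows
   how a receiving agent might read the example message above using the
   standard email package. The names EXAMPLE and read_url_reference are
   hypothetical and are not defined by this memo; the sketch assumes the
   header lines start in column 0, as they would on the wire.
</t>
<figure><artwork><![CDATA[
```python
from email import message_from_string

# The example message from this section (hypothetical constant name).
EXAMPLE = (
    'Content-type: message/external-body; access-type=URL;\n'
    '              URL="http://www.foo.com/file"\n'
    '\n'
    'Content-type: text/html\n'
    'Content-Transfer-Encoding: binary\n'
    '\n'
    'THIS IS NOT REALLY THE BODY!\n'
)


def read_url_reference(text):
    """Return (access_type, url_parameter, inner_content_type).

    Hypothetical helper: checks the two things this memo requires of
    the outer header, namely access-type=URL and a URL parameter.
    """
    outer = message_from_string(text)
    if outer.get_content_type() != "message/external-body":
        raise ValueError("not a message/external-body part")
    access_type = outer.get_param("access-type")
    url_parameter = outer.get_param("url")
    if access_type is None or access_type.upper() != "URL":
        raise ValueError("access-type is not URL")
    if url_parameter is None:
        raise ValueError("the URL parameter is mandatory")
    # The body of a message/* part is parsed as a nested message whose
    # headers describe the referenced object.
    inner = outer.get_payload(0)
    return access_type, url_parameter, inner.get_content_type()
```
]]></artwork></figure>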
<!-- RFC original section: (3.1.) -->
<section title="Syntax and Use of the URL parameter">
<t>
   Using the ABNF notations and definitions of RFC 822 and RFC 1521, the
   syntax of the URL parameter is as follows:
</t>
<figure><artwork>
     URL-parameter := &lt;&quot;&gt; URL-word *(*LWSP-char URL-word) &lt;&quot;&gt;

     URL-word := token
                 ; Must not exceed 40 characters in length
</artwork></figure>
<t>
   The syntax of an actual URL string is given in RFC 1738.  URL strings
   can be of any length and can contain arbitrary character content.
   This presents problems when URLs are embedded in MIME body part
   headers that are wrapped according to RFC 822 rules. For this reason
   they are transformed into a URL-parameter for inclusion in a
   message/external-body content-type specification as follows:
<list>
<t>
    (1)   A check is made to make sure that all occurrences of
          SPACE, CTLs, double quotes, backslashes, and 8-bit
          characters in the URL string are already encoded using
          the URL encoding scheme specified in RFC 1738. Any
          unencoded occurrences of these characters must be
          encoded.  Note that the result of this operation is
          nothing more than a different representation of the
          original URL.
</t>
<t>
    (2)   The resulting URL string is broken up into substrings
          of 40 characters or less.
</t>
<t>
    (3)   Each substring is placed in a URL-parameter string as a
          URL-word, separated by one or more spaces.  Note that
          the enclosing quotes are always required since all URLs
          contain one or more colons, and colons are tspecial
          characters [RFC 1521].
</t></list>
</t>
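<t>
   The three steps above can be sketched in Python as follows. This is
   an illustrative sketch only, not a normative implementation. The
   function name encode_url_parameter and its width argument are
   hypothetical, and the use of UTF-8 to turn non-ASCII characters into
   octets before encoding is an assumption of the sketch; this memo
   does not specify a character set.
</t>
<figure><artwork><![CDATA[
```python
def encode_url_parameter(url, width=40):
    """Turn a URL string into a quoted URL-parameter value.

    Hypothetical helper; 'width' is the 40-character limit on each
    URL-word.
    """
    # Step 1: encode SPACE, CTLs, double quotes, backslashes, and
    # 8-bit characters as %XX.  Existing %XX escapes are untouched.
    # Assumption: non-ASCII characters are first encoded as UTF-8.
    pieces = []
    for byte in url.encode("utf-8"):
        ch = chr(byte)
        if byte <= 0x20 or byte >= 0x7F or ch in '"\\':
            pieces.append("%%%02X" % byte)
        else:
            pieces.append(ch)
    encoded = "".join(pieces)

    # Step 2: break the string into substrings of at most 'width'
    # characters.
    words = [encoded[i:i + width]
             for i in range(0, len(encoded), width)]

    # Step 3: join the substrings with single spaces and add the
    # enclosing double quotes.
    return '"' + " ".join(words) + '"'
```
]]></artwork></figure>
<t>
   A mail generator would then fold the header at the spaces between
   URL-words when wrapping the Content-type field.
</t>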
<t>
   Extraction of the URL string from the URL-parameter is even simpler:
   the enclosing quotes and any linear whitespace are removed, and the
   remaining material is the URL string.
   The following example shows how a long URL is handled:
</t>
<figure><artwork>
     Content-type: message/external-body; access-type=URL;
                   URL=&quot;ftp://ftp.deepdirs.org/1/2/3/4/5/6/7/
                        8/9/10/11/12/13/14/15/16/17/18/20/21/
                        file.html&quot;

     Content-type: text/html
     Content-Transfer-Encoding: binary

     THIS IS NOT REALLY THE BODY!
</artwork></figure>
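<t>
   The extraction rule described above can be sketched in Python as
   follows. The function name extract_url is hypothetical, and the
   sketch assumes it is handed the raw parameter value, including the
   enclosing quotes and any header folding.
</t>
<figure><artwork><![CDATA[
```python
def extract_url(url_parameter):
    """Recover the URL string from a URL-parameter value.

    Hypothetical helper: removes the enclosing double quotes and all
    linear whitespace (spaces, tabs, and folded line breaks).
    """
    value = url_parameter.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        value = value[1:-1]
    # str.split() with no argument splits on any run of whitespace,
    # so joining the pieces drops every space, tab, CR, and LF.
    return "".join(value.split())
```
]]></artwork></figure>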
<t>
   Some URLs may provide access to multiple versions of the same object
   in different formats. The HTTP URL mechanism has this capability, for
   example.  However, applications may not expect to receive something
   whose type doesn&apos;t agree with that expressed in the
   message/external-body, and may in fact have already made irrevocable
   choices based on this information.
</t>
<t>
   Due to these considerations, the following restriction is imposed:
   When URLs are used in the context of an access-type, only those
   versions of an object whose content-type agrees with that specified
   by the inner message/external-body header can be retrieved and used.
</t>
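<t>
   One way a retrieving agent might enforce this restriction is
   sketched below in Python. The function name content_type_agrees is
   hypothetical. Comparing only the type and subtype, case-insensitively
   and ignoring parameters such as charset, is an assumption of the
   sketch; this memo does not define what agreement means in more
   detail.
</t>
<figure><artwork><![CDATA[
```python
def content_type_agrees(declared, retrieved):
    """Return True if two Content-Type values name the same media type.

    Hypothetical helper.  'declared' is the value from the inner
    message/external-body header; 'retrieved' is the type reported
    for the object actually fetched through the URL.
    """
    def media_type(value):
        # Assumption: keep only "type/subtype" and drop parameters.
        return value.split(";", 1)[0].strip().lower()

    return media_type(declared) == media_type(retrieved)
```
]]></artwork></figure>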
</section>
</section>
<!-- RFC original section: (4.) -->
<section title="Security Considerations">
<t>
   The security considerations of using URLs in the context of a MIME
   access-type are no different from the concerns that arise from their
   use in other contexts. The specific security considerations
   associated with each type of URL are discussed in the URL&apos;s defining
   document.
</t>
<t>
   Note that the Content-MD5 field can be used in conjunction with any
   message/external-body access-type to provide an integrity check. This
   ensures that the referenced object really is what the message
   originator intended it to be. This is not a signature service and
   should not be confused with one, but nevertheless is quite useful in
   many situations.
</t>
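<t>
   The integrity check mentioned above can be sketched in Python as
   follows. The Content-MD5 value is the base64 encoding of the MD5
   digest of the data, as specified in RFC 1864. The function names
   are hypothetical, and the sketch assumes the retrieved octets are
   already in the canonical form over which the digest was computed.
</t>
<figure><artwork><![CDATA[
```python
import base64
import hashlib


def content_md5(data):
    """Return the Content-MD5 value (base64 of the MD5 digest)."""
    digest = hashlib.md5(data).digest()
    return base64.b64encode(digest).decode("ascii")


def matches_content_md5(data, header_value):
    """Check retrieved octets against a Content-MD5 field value.

    Hypothetical helper.  Assumes 'data' is already in canonical form.
    """
    return content_md5(data) == header_value.strip()
```
]]></artwork></figure>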
</section>
<!-- RFC original section: (5.) -->
<section title="Acknowledgements">
<t>
   The authors are grateful for the feedback and review provided by John
   Beck and John Klensin.
</t>
</section>
<!-- RFC original section: (6.) -->
<section title="References (BOILERPLATE)">
<t>
This RFC contained boilerplate in this section which has been moved
to the RFC2223-compliant unnumbered section &quot;References.&quot;
</t>
</section>
<!-- RFC original section: (7.) -->
<section title="Authors&apos; Addresses (BOILERPLATE)">
<t>
This RFC contained boilerplate in this section which has been moved
to the RFC2223-compliant unnumbered section &quot;Author&apos;s Address.&quot;
</t>
</section>
</middle>
<back>
</back>
</rfc>
