One of the possible forms of a URL is an FTP URL, which begins with ftp: as the protocol part. Example:
ftp://ftp.funet.fi/pub/standards/RFC/rfc959.txt
An FTP URL designates a file or a directory on an Internet host accessible using the FTP protocol. (The sample URL above refers to a copy of the FTP protocol specification, RFC 959, in one repository of RFCs. Note that this specification has been partly updated by later RFCs.)
Such URLs are often used in Web documents to refer to a resource which is downloadable from a public FTP server (as in the example above). Less importantly, they can also be used to refer to a resource in a non-public area, in which case a password is required.
According to the specification of URL formats, RFC 1738, an FTP URL is of the form
ftp://user:password@host:port/path
so that some or all of the parts user:password@, :password, :port and /path may be excluded. Although RFC 1738 has been obsoleted as regards generic URL syntax (now defined in RFC 3986), some of its specific parts, like FTP URL syntax, are still in force.
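The general form above can be taken apart with a standard URL parser. The following is a minimal sketch in Python, assuming that the standard library function urllib.parse.urlsplit is an acceptable parser for this purpose; the function name split_ftp_url and the sample credentials are made up for illustration, and the fallback to port 21 is the conventional FTP default rather than something stated in the text above.

```python
from urllib.parse import urlsplit, unquote


def split_ftp_url(url):
    """Split an FTP URL into its user, password, host, port and path parts.

    Illustrative helper only; parts omitted from the URL come back as None,
    except the port, which falls back to the conventional FTP port 21.
    """
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an FTP URL: " + url)
    return {
        # user and password may contain encoded characters, so decode them
        "user": None if parts.username is None else unquote(parts.username),
        "password": None if parts.password is None else unquote(parts.password),
        "host": parts.hostname,
        "port": 21 if parts.port is None else parts.port,
        # the path is left encoded, since %2F and / mean different things
        "path": parts.path,
    }


print(split_ftp_url("ftp://ftp.funet.fi/pub/standards/RFC/rfc959.txt"))
```

The path is deliberately returned undecoded, because decoding it too early would erase the difference between an encoded slash and a separator between path components.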
The components obey the following rules:
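Within the user and password parts, any occurrence of the character :, @ or / must be encoded.
The path has the form cwd1/cwd2/.../cwdN/name (where any occurrence of / or ; within a cwdi or the name must be encoded), optionally followed by ;type=typecode, where typecode is one of the characters a, i, d.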
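The encoding rules can be illustrated by composing an FTP URL from its parts. This is only a sketch: the function build_ftp_url and the sample values are hypothetical, and it assumes that percent-encoding with urllib.parse.quote (with no characters marked as safe) is a suitable way to encode the characters :, @, / and ; where they are not allowed literally.

```python
from urllib.parse import quote


def build_ftp_url(host, segments, user=None, password=None, typecode=None):
    """Compose an FTP URL, encoding reserved characters in each component.

    segments is the list [cwd1, ..., cwdN, name]. Illustrative helper only.
    """
    auth = ""
    if user is not None:
        # safe="" makes quote() encode ":", "@" and "/" as well
        auth = quote(user, safe="")
        if password is not None:
            auth += ":" + quote(password, safe="")
        auth += "@"
    # each cwd and the name are encoded separately, then joined with "/"
    path = "/".join(quote(segment, safe="") for segment in segments)
    url = "ftp://" + auth + host + "/" + path
    if typecode is not None:
        if typecode not in ("a", "i", "d"):
            raise ValueError("typecode must be a, i or d")
        url += ";type=" + typecode
    return url


print(build_ftp_url("host.dom", ["/etc", "motd"], user="myname"))
```

Encoding each component separately is what keeps a slash inside a directory name distinct from the slashes that separate the components.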
Effectively, ;type=a means "Ascii mode" (plain text mode) of transfer whereas ;type=i means image (binary) mode.
If the ;type=typecode part of an FTP URL is omitted, the client program interpreting the URL must guess the appropriate mode to use. In general, the data content type of a file can only be guessed from the name, e.g., from the suffix of the name; the appropriate type code to be used for transfer of the file can then be deduced from the data content of the file.
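The guessing described above might look roughly like this in a client. The sketch assumes one possible heuristic, which is not prescribed by any specification: use the suffix-based guess of Python's standard mimetypes module and pick Ascii mode only for uncompressed text types. The function name guess_typecode is hypothetical.

```python
import mimetypes


def guess_typecode(name):
    """Guess a transfer type code ("a" or "i") from a file name.

    Heuristic assumed for illustration: text content types are transferred
    in Ascii mode, everything else (including unknown suffixes and
    compressed files) in image mode.
    """
    content_type, encoding = mimetypes.guess_type(name)
    if content_type is not None and content_type.startswith("text/") and encoding is None:
        return "a"
    return "i"


print(guess_typecode("rfc959.txt"))
```

Falling back to image mode for anything unrecognized is the cautious choice, since a binary file transferred in Ascii mode may be corrupted by line-end conversion.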
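If the URL contains user and password, they are supplied to the server with the USER and PASS commands after making the connection to the FTP server. Otherwise the conventions for "anonymous" FTP are used: the user name anonymous is supplied, and the password is supplied as the Internet e-mail address of the end user.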
However, browsers often fail to conform to these requirements. Generally, they cannot have access to the user's correct E-mail address. In practice, browsers may send addresses with invented user name parts like mozilla or IE30user. Such "addresses" are syntactically legal in the sense of passing some tests made by an FTP server (such as checking that there is an @ somewhere) but, being invented, nonexistent addresses, fail to serve the purpose for which FTP servers like to get E-mail addresses. (Such purposes may include statistics collection or informing users about errors detected in files which they have fetched.) In some cases, the address passed is a valid address but the address of a proxy or gateway, not the address of the user.
If the URL supplies user but no password and the FTP server requests a password, the program interpreting the FTP URL (usually, a Web browser) should request a password from the user. Typically this takes place in a dialog box in which the password will not be visible as you type it (i.e., no echoing). However, some browsers (e.g., old versions of IE) do not request a password from the user; instead, the connection fails.
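The choice of login data described above can be sketched as follows. The function login_credentials, its prompt_password callback and the placeholder address are all hypothetical names introduced for illustration. The sketch also simplifies one point: it calls the callback as soon as a user name without a password is seen, whereas a real client would ask only once the server actually requests a password.

```python
from urllib.parse import urlsplit, unquote


def login_credentials(url, prompt_password, email="user@example.invalid"):
    """Return the (user, password) pair to send with USER and PASS.

    prompt_password is a callback taking the user name and returning a
    password, e.g. getpass.getpass wrapped in a small function.
    """
    parts = urlsplit(url)
    if parts.username is None:
        # no user in the URL: anonymous FTP conventions apply
        return ("anonymous", email)
    user = unquote(parts.username)
    if parts.password is not None:
        return (user, unquote(parts.password))
    # user given without password: ask the person at the keyboard
    return (user, prompt_password(user))


print(login_credentials("ftp://ftp.funet.fi/pub/", lambda user: ""))
```

Passing the prompt in as a callback keeps the decision logic separate from the user interface, so that the same function works with a dialog box or with a non-echoing terminal prompt.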
There is no way in the FTP URL syntax to ask the user agent to prompt for a user name. Either you provide a fixed user name (with or without a password) in the URL or no user name, in which case the user name anonymous is used.
(Toby Speight has suggested the following way of circumventing this restriction: Instead of directly providing a link with an FTP URL, present an HTML form which asks for the username; make the script which processes the form send a redirection response (such as 302, "moved temporarily"), substituting the received username into the URL in the Location header. Cf. Get RFC by number, a demonstration of combined client and server side scripting.)
However, IE 7 (where processing of FTP URLs is rather peculiar in many ways) ignores the user and password parts if present. It initiates a dialogue where the user is prompted for them. Any user:password data seems to trigger this; the data is otherwise discarded but taken as a request to use non-anonymous FTP. This can be regarded as added security, since passing passwords in URLs is risky, as URL specifications have always said.
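The form-based workaround attributed to Toby Speight above could be sketched on the server side as follows. Everything named here is an assumption made for illustration: the function redirect_for, the host ftp.example.org and the path are invented, and the sketch uses status 302 as the temporary redirect. It shows only how the response is composed, not a complete form-processing script.

```python
from urllib.parse import quote


def redirect_for(username, host="ftp.example.org", path="/"):
    """Build the status code and headers of a temporary redirect that
    sends the browser to an FTP URL containing the given user name."""
    # encode the user name so that ":", "@" or "/" in it cannot
    # change the structure of the resulting URL
    location = "ftp://" + quote(username, safe="") + "@" + host + path
    return 302, {"Location": location}


print(redirect_for("jkorpela"))
```

Encoding the submitted name matters here, since the value comes straight from a form and might otherwise smuggle in a different host or a password part.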
The path is effectively a pathname of a resource and corresponds to a series of FTP commands as follows:
Each cwdi is supplied, in sequence, as the argument to a CWD (change working directory) FTP command. (Some Web browsers do a single combined CWD instead. It is unclear whether this deviation has practical implications.)
If typecode is d, a NLST (name list) FTP command is performed with name as argument, and the result is interpreted as a file directory listing. (On a typical Web browser, the listing is presented so that entries act as links to files or subdirectories, allowing the user to navigate in a directory tree.)
Otherwise, a TYPE FTP command is performed with typecode as the argument, and then the file whose name is name is accessed (for example, using the RETR (retrieve) FTP command).
For example, the URL ftp://myname@host.dom/%2Fetc/motd is interpreted by FTP-ing to host.dom, logging in as myname (prompting for a password if it is asked for), and then executing CWD /etc and then RETR motd. This has a different meaning from ftp://myname@host.dom/etc/motd which would CWD etc and then RETR motd; the initial CWD might be executed relative to the default directory for myname. On the other hand, ftp://myname@host.dom//etc/motd would CWD with a null argument, then CWD etc, and then RETR motd.
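The mapping from a path to FTP commands, and the three sample URLs above, can be checked with a small sketch. The function ftp_commands is a hypothetical name; it only produces the command strings and does not talk to any server. It assumes that urllib.parse.urlsplit leaves the path undecoded, and it sends no TYPE command when the URL has no ;type= part, leaving the guess to the client. A URL ending in / (with an empty name) is not treated specially.

```python
from urllib.parse import urlsplit, unquote


def ftp_commands(url):
    """List the FTP commands that the path of an FTP URL corresponds to."""
    path = urlsplit(url).path
    # drop only the single "/" that separates the host from the path
    if path.startswith("/"):
        path = path[1:]
    path, _, typecode = path.partition(";type=")
    # split on literal "/" first, decode afterwards, so %2F stays in place
    *cwds, name = path.split("/")
    commands = ["CWD " + unquote(cwd) for cwd in cwds]
    if typecode == "d":
        commands.append("NLST " + unquote(name))
    else:
        if typecode:
            # the TYPE command is written here with an upper-case code
            commands.append("TYPE " + typecode.upper())
        commands.append("RETR " + unquote(name))
    return commands


for sample in ("ftp://myname@host.dom/%2Fetc/motd",
               "ftp://myname@host.dom/etc/motd",
               "ftp://myname@host.dom//etc/motd"):
    print(sample, ftp_commands(sample))
```

The order of operations is the point of the sketch: splitting before decoding is what makes %2Fetc a single directory name /etc, while a literal double slash yields a CWD with a null argument.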
Browsers often seem to violate this: they effectively start from the root of the file system of the FTP host. (I first observed this in IE 4.0 under Win95 and Lynx 2.7.1 under Unix.) Thus, in order to access file .plan in user jkorpela's home directory on alfa.hut.fi, a correct URL would be
ftp://jkorpela@alfa.hut.fi/.plan
but this does not work on IE, which requires something like
ftp://jkorpela@alfa.hut.fi/m/fs/lai/lai/LK/lk/jkorpela/.plan
(which does not work on other browsers). Moreover, as mentioned above, some versions of IE are unable to prompt for a password, so even that URL might not work unless you put the password there.
If you refer to resources which require a password (in the genuine sense), you would have to include the password in the URL or accept the fact that (due to the IE bugs mentioned above) a large portion of users will be unable to access the resource even if they know the password.
Moreover, when you tell the password for a user name to some people (or perhaps even to the public), you are effectively giving them full access (read and write) to all files under the user name. If the host accepts remote logins, which is pretty common, they can also log in under that user name.
Even if the URL is not in an HTML file (e.g. you type the URL directly as input to a Web browser), including a password is a security risk. On most browsers, it will be visible as you type it, and it will be retained by your history file.
Jukka Korpela.