[//000000001]: # (S3 \- Amazon S3 Web Service Utilities)
[//000000002]: # (Generated from file 'S3\.man' by tcllib/doctools with format 'markdown')
[//000000003]: # (2006,2008 Darren New\. All Rights Reserved\. See LICENSE\.TXT for terms\.)
[//000000004]: # (S3\(n\) 1\.0\.3 tcllib "Amazon S3 Web Service Utilities")
# NAME
S3 \- Amazon S3 Web Service Interface
# Table Of Contents
- [Table Of Contents](#toc)
- [Synopsis](#synopsis)
- [Description](#section1)
- [ERROR REPORTING](#section2)
- [COMMANDS](#section3)
- [LOW LEVEL COMMANDS](#section4)
- [HIGH LEVEL COMMANDS](#section5)
- [LIMITATIONS](#section6)
- [USAGE SUGGESTIONS](#section7)
- [FUTURE DEVELOPMENTS](#section8)
- [TLS Security Considerations](#section9)
- [Bugs, Ideas, Feedback](#section10)
- [Keywords](#keywords)
- [Category](#category)
- [Copyright](#copyright)
# SYNOPSIS
package require Tcl 8\.5
package require S3 ?1\.0\.3?
package require sha1 1\.0
package require md5 2\.0
package require base64 2\.3
package require xsxp 1\.0
[__S3::Configure__ ?__\-reset__ *boolean*? ?__\-retries__ *integer*? ?__\-accesskeyid__ *idstring*? ?__\-secretaccesskey__ *idstring*? ?__\-service\-access\-point__ *FQDN*? ?__\-use\-tls__ *boolean*? ?__\-default\-compare__ *always|never|exists|missing|newer|date|checksum|different*? ?__\-default\-separator__ *string*? ?__\-default\-acl__ *private|public\-read|public\-read\-write|authenticated\-read|keep|calc*? ?__\-default\-bucket__ *bucketname*?](#1)
[__S3::SuggestBucket__ ?*name*?](#2)
[__S3::REST__ *dict*](#3)
[__S3::ListAllMyBuckets__ ?__\-blocking__ *boolean*? ?__\-parse\-xml__ *xmlstring*? ?__\-result\-type__ *REST|xml|pxml|dict|names|owner*?](#4)
[__S3::PutBucket__ ?__\-bucket__ *bucketname*? ?__\-blocking__ *boolean*? ?__\-acl__ *\{\}|private|public\-read|public\-read\-write|authenticated\-read*?](#5)
[__S3::DeleteBucket__ ?__\-bucket__ *bucketname*? ?__\-blocking__ *boolean*?](#6)
[__S3::GetBucket__ ?__\-bucket__ *bucketname*? ?__\-blocking__ *boolean*? ?__\-parse\-xml__ *xmlstring*? ?__\-max\-count__ *integer*? ?__\-prefix__ *prefixstring*? ?__\-delimiter__ *delimiterstring*? ?__\-result\-type__ *REST|xml|pxml|names|dict*?](#7)
[__S3::Put__ ?__\-bucket__ *bucketname*? __\-resource__ *resourcename* ?__\-blocking__ *boolean*? ?__\-file__ *filename*? ?__\-content__ *contentstring*? ?__\-acl__ *private|public\-read|public\-read\-write|authenticated\-read|calc|keep*? ?__\-content\-type__ *contenttypestring*? ?__\-x\-amz\-meta\-\*__ *metadatatext*? ?__\-compare__ *comparemode*?](#8)
[__S3::Get__ ?__\-bucket__ *bucketname*? __\-resource__ *resourcename* ?__\-blocking__ *boolean*? ?__\-compare__ *comparemode*? ?__\-file__ *filename*? ?__\-content__ *contentvarname*? ?__\-timestamp__ *aws|now*? ?__\-headers__ *headervarname*?](#9)
[__S3::Head__ ?__\-bucket__ *bucketname*? __\-resource__ *resourcename* ?__\-blocking__ *boolean*? ?__\-dict__ *dictvarname*? ?__\-headers__ *headersvarname*? ?__\-status__ *statusvarname*?](#10)
[__S3::GetAcl__ ?__\-blocking__ *boolean*? ?__\-bucket__ *bucketname*? __\-resource__ *resourcename* ?__\-result\-type__ *REST|xml|pxml*?](#11)
[__S3::PutAcl__ ?__\-blocking__ *boolean*? ?__\-bucket__ *bucketname*? __\-resource__ *resourcename* ?__\-acl__ *new\-acl*?](#12)
[__S3::Delete__ ?__\-bucket__ *bucketname*? __\-resource__ *resourcename* ?__\-blocking__ *boolean*? ?__\-status__ *statusvar*?](#13)
[__S3::Push__ ?__\-bucket__ *bucketname*? __\-directory__ *directoryname* ?__\-prefix__ *prefixstring*? ?__\-compare__ *comparemode*? ?__\-x\-amz\-meta\-\*__ *metastring*? ?__\-acl__ *aclcode*? ?__\-delete__ *boolean*? ?__\-error__ *throw|break|continue*? ?__\-progress__ *scriptprefix*?](#14)
[__S3::Pull__ ?__\-bucket__ *bucketname*? __\-directory__ *directoryname* ?__\-prefix__ *prefixstring*? ?__\-blocking__ *boolean*? ?__\-compare__ *comparemode*? ?__\-delete__ *boolean*? ?__\-timestamp__ *aws|now*? ?__\-error__ *throw|break|continue*? ?__\-progress__ *scriptprefix*?](#15)
[__S3::Toss__ ?__\-bucket__ *bucketname*? __\-prefix__ *prefixstring* ?__\-blocking__ *boolean*? ?__\-error__ *throw|break|continue*? ?__\-progress__ *scriptprefix*?](#16)
# DESCRIPTION
This package provides access to Amazon's Simple Storage Service \(S3\) web
service\. As a quick summary, Amazon Simple Storage Service provides a for\-fee web
service allowing the storage of arbitrary data as "resources" within "buckets"
online\. See [http://www\.amazonaws\.com/](http://www\.amazonaws\.com/) for
details on that system\. Access to the service is via HTTP \(SOAP or REST\)\. Much
of this documentation will not make sense if you're not familiar with the terms
and functionality of the Amazon S3 service\.
This package provides services for reading and writing the data items via the
REST interface\. It also provides some higher\-level operations\. Other packages in
the same distribution provide for even more functionality\.
Copyright 2006 Darren New\. All Rights Reserved\. NO WARRANTIES OF ANY TYPE ARE
PROVIDED\. COPYING OR USE INDEMNIFIES THE AUTHOR IN ALL WAYS\. This software is
licensed under essentially the same terms as Tcl\. See LICENSE\.txt for the terms\.
# ERROR REPORTING
The error reporting from this package makes use of $errorCode to provide more
details on what happened than simply throwing an error\. Any error caught by the
S3 package \(and we try to catch them all\) will return with an $errorCode being a
list having at least three elements\. In all cases, the first element will be
"S3"\. The second element will take on one of six values, with that element
defining the value of the third and subsequent elements\. S3::REST does not throw
an error, but rather returns a dictionary with the keys "error", "errorInfo",
and "errorCode" set\. This allows for reliable background use\. The possible
second elements are these:
- usage
The usage of the package is incorrect\. For example, a command has been
invoked which requires the library to be configured before the library has
been configured, or an invalid combination of options has been specified\.
The third element of $errorCode supplies the name of the parameter that was
wrong\. The fourth usually provides the arguments that were actually supplied
to the throwing proc, unless the usage error isn't confined to a single
proc\.
- local
Something happened on the local system which threw an error\. For example, a
request to upload or download a file was made and the file permissions
denied that sort of access\. The third element of $errorCode is the original
$errorCode\.
- socket
Something happened with the socket\. It closed prematurely, or some other
condition of failure\-to\-communicate\-with\-Amazon was detected\. The third
element of $errorCode is the original $errorCode, or sometimes the message
from fcopy, or \.\.\.?
- remote
The Amazon web service returned an error code outside the 2xx range in the
HTTP header\. In other words, everything went as documented, except this
particular case was documented not to work\. The third element is the
dictionary returned from __::S3::REST__\. Note that S3::REST itself never
throws this error, but just returns the dictionary\. Most of the higher\-level
commands throw for convenience, unless an argument indicates they should
not\. If something is documented as "not throwing an S3 remote error", it
means a status return is set rather than throwing an error if Amazon returns
a non\-2XX HTTP result code\.
- notyet
The user obeyed the documentation, but the author has not yet gotten around
to implementing this feature\. \(Right now, only TLS support and sophisticated
permissions fall into this category, as well as the S3::Acl command\.\)
- xml
The service has returned invalid XML, or XML whose schema is unexpected\. For
the high\-level commands that accept service XML as input for parsing, this
may also be thrown\.
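The categories above can be told apart by inspecting the second element of
$errorCode\. The following is a minimal sketch, not part of the package: the
helper name describeS3Error is hypothetical, and the error is simulated with
\[error\] rather than produced by a real S3 call\. It assumes only the errorCode
layout described in this section\.

```tcl
# Hypothetical helper: map an S3-style errorCode list to a short description.
proc describeS3Error {errorCode} {
    if {[lindex $errorCode 0] ne "S3"} {
        return "not an S3 error"
    }
    switch -- [lindex $errorCode 1] {
        usage   { return "usage error in parameter [lindex $errorCode 2]" }
        local   { return "local error: [lindex $errorCode 2]" }
        socket  { return "socket error: [lindex $errorCode 2]" }
        remote  {
            # The third element is the dictionary returned by S3::REST.
            set rest [lindex $errorCode 2]
            if {[dict exists $rest httpstatus]} {
                return "remote error, HTTP [dict get $rest httpstatus]"
            }
            return "remote error"
        }
        notyet  { return "feature not yet implemented" }
        xml     { return "invalid or unexpected XML" }
        default { return "unknown S3 error category" }
    }
}

# Simulate an error shaped like a "remote" error, then classify it.
set simulated [list S3 remote [dict create httpstatus 404 httpmessage "Not Found"]]
catch {error "simulated failure" "" $simulated} message options
puts [describeS3Error [dict get $options -errorcode]]
```

In real use, the \[catch\] would wrap a high\-level command such as
__S3::Put__ instead of the simulated \[error\]\.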
# COMMANDS
This package provides several separate levels of complexity\.
- The lowest level simply takes arguments to be sent to the service, sends
them, retrieves the result, and provides it to the caller\. *Note:* This
layer allows both synchronous and event\-driven processing\. It depends on the
MD5 and SHA1 and base64 packages from Tcllib \(available at
[http://core\.tcl\.tk/tcllib/](http://core\.tcl\.tk/tcllib/)\)\. Note that
__S3::Configure__ is required for __S3::REST__ to work due to the
authentication portion, so we put that in the "lowest level\."
- The next layer parses the results of calls, allowing for functionality such
as uploading only changed files, synchronizing directories, and so on\. This
layer depends on the __TclXML__ package as well as the included
__[xsxp](xsxp\.md)__ package\. These packages are package required
when these more\-sophisticated routines are called, so nothing breaks if they
are not correctly installed\.
- Also included is a separate program that uses the library\. It provides code
to parse $argv0 and $argv from the command line, allowing invocation as a
tclkit, etc\. \(Not yet implemented\.\)
- Another separate program provides a GUI interface allowing drag\-and\-drop and
other such functionality\. \(Not yet implemented\.\)
- Also built on this package is the OddJob program\. It is a separate program
designed to allow distribution of computational work units over Amazon's
Elastic Compute Cloud web service\.
The goal is to have at least the bottom\-most layers implemented in pure Tcl
using only that which comes from widely\-available sources, such as Tcllib\.
# LOW LEVEL COMMANDS
These commands do not require any packages not listed above\. They talk directly
to the service, or they are utility or configuration routines\. Note that the
"xsxp" package was written to support this package, so it should be available
wherever you got this package\.
- __S3::Configure__ ?__\-reset__ *boolean*? ?__\-retries__ *integer*? ?__\-accesskeyid__ *idstring*? ?__\-secretaccesskey__ *idstring*? ?__\-service\-access\-point__ *FQDN*? ?__\-use\-tls__ *boolean*? ?__\-default\-compare__ *always|never|exists|missing|newer|date|checksum|different*? ?__\-default\-separator__ *string*? ?__\-default\-acl__ *private|public\-read|public\-read\-write|authenticated\-read|keep|calc*? ?__\-default\-bucket__ *bucketname*?
There is one command for configuration, and that is __S3::Configure__\.
If called with no arguments, it returns a dictionary of key/value pairs
listing all current settings\. If called with one argument, it returns the
value of that single argument\. If called with two or more arguments, it must
be called with pairs of arguments, and it applies the changes in order\.
There is only one set of configuration information per interpreter\.
The following options are accepted:
* __\-reset__ *boolean*
By default, false\. If true, any previous changes and any changes on the
same call before the reset option will be returned to default values\.
* __\-retries__ *integer*
Default value is 3\. If Amazon returns a 500 error, a retry after an
exponential backoff delay will be tried this many times before finally
throwing the 500 error\. This applies to each call to __S3::REST__
from the higher\-level commands, but not to __S3::REST__ itself\. That
is, __S3::REST__ will always return httpstatus 500 if that's what it
receives\. Functions like __S3::Put__ will retry the PUT call, and
will also retry the GET and HEAD calls used to do content comparison\.
Changing this to 0 will prevent retries and their associated delays\. In
addition, socket errors \(i\.e\., errors whose errorCode starts with "S3
socket"\) will be similarly retried after backoffs\.
* __\-accesskeyid__ *idstring*
* __\-secretaccesskey__ *idstring*
Each defaults to an empty string\. These must be set before any calls are
made\. This is your S3 ID\. Once you sign up for an account, go to
[http://www\.amazonaws\.com/](http://www\.amazonaws\.com/), sign in, go
to the "Your Web Services Account" button, pick "AWS Access
Identifiers", and your access key ID and secret access keys will be
available\. All __S3::REST__ calls are authenticated\. Blame Amazon
for the poor choice of names\.
* __\-service\-access\-point__ *FQDN*
Defaults to "s3\.amazonaws\.com"\. This is the fully\-qualified domain name
of the server to contact for __S3::REST__ calls\. You should probably
never need to touch this, unless someone else implements a compatible
service, or you wish to test something by pointing the library at your
own service\.
* __\-slop\-seconds__ *integer*
When comparing dates between Amazon and the local machine, two dates
within this many seconds of each other are considered the same\. Useful
for clock drift correction, processing overhead time, and so on\.
* __\-use\-tls__ *boolean*
Defaults to false\. This is not yet implemented\. If true,
__S3::REST__ will negotiate a TLS connection to Amazon\. If false,
unencrypted connections are used\.
* __\-bucket\-prefix__ *string*
Defaults to "TclS3"\. This string is used by
__S3::SuggestBucket__ if that command is passed an empty string
as an argument\. It is used to distinguish different applications using
the Amazon service\. Your application should always set this to keep from
interfering with the buckets of other users of Amazon S3 or with other
buckets of the same user\.
* __\-default\-compare__ *always|never|exists|missing|newer|date|checksum|different*
Defaults to "always\." If no \-compare is specified on __S3::Put__,
__S3::Get__, or __S3::Delete__, this comparison is used\. See
those commands for a description of the meaning\.
* __\-default\-separator__ *string*
Defaults to "/"\. This is currently unused\. It might make sense to use
this for __S3::Push__ and __S3::Pull__, but allowing resources
to have slashes in their names that aren't marking directories would be
problematic\. Hence, this currently does nothing\.
* __\-default\-acl__ *private|public\-read|public\-read\-write|authenticated\-read|keep|calc*
Defaults to an empty string\. If no \-acl argument is provided to
__S3::Put__ or __S3::Push__, this string is used \(given as the
x\-amz\-acl header if not keep or calc\)\. If this is also empty, no
x\-amz\-acl header is generated\. This is *not* used by __S3::REST__\.
* __\-default\-bucket__ *bucketname*
If no bucket is given to __S3::GetBucket__, __S3::PutBucket__,
__S3::Get__, __S3::Put__, __S3::Head__, __S3::Acl__,
__S3::Delete__, __S3::Push__, __S3::Pull__, or
__S3::Toss__, and if this configuration variable is not an empty
string \(and not simply "/"\), then this value will be used for the
bucket\. This is useful if one program does a large amount of resource
manipulation within a single bucket\.
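A typical configuration call might look like the sketch below\. The credential
strings and the bucket name are placeholders, and the loop assumes that the
dictionary returned by a no\-argument call uses the option names \(with their
leading hyphen\) as keys\.

```tcl
package require S3

# Placeholder credentials and bucket name; substitute your own values.
S3::Configure \
    -accesskeyid     "EXAMPLE-ACCESS-KEY-ID" \
    -secretaccesskey "EXAMPLE-SECRET-ACCESS-KEY" \
    -retries         5 \
    -default-bucket  "example-bucket"

# One argument: read back a single setting.
puts "retries: [S3::Configure -retries]"

# No arguments: list every current setting, hiding the secret key.
# Assumption: the keys are the option names, including the leading hyphen.
dict for {option value} [S3::Configure] {
    if {$option eq "-secretaccesskey"} { set value "(hidden)" }
    puts [format "%-24s %s" $option $value]
}
```

Because there is only one set of configuration information per interpreter,
these settings apply to every later S3 call made from the same interpreter\.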
- __S3::SuggestBucket__ ?*name*?
The __S3::SuggestBucket__ command accepts an optional string as a prefix
and returns a valid bucket name containing the *name* argument and the Access
Key ID\. This makes the name unique to the owner and to the application
\(assuming the application picks a good *name* argument\)\. If no name is
provided, the name from __S3::Configure__ *\-bucket\-prefix* is used\. If
that too is empty \(which is not the default\), an error is thrown\.
- __S3::REST__ *dict*
The __S3::REST__ command takes as an argument a dictionary and returns a
dictionary\. The return dictionary has the same keys as the input dictionary,
and includes additional keys as the result\. The presence or absence of keys
in the input dictionary can control the behavior of the routine\. It never
throws an error directly, but includes keys "error", "errorInfo", and
"errorCode" if necessary\. Some keys are required, some optional\. The routine
can run either in blocking or non\-blocking mode, based on the presence of
__resultvar__ in the input dictionary\. This requires the
*\-accesskeyid* and *\-secretaccesskey* to be configured via
__S3::Configure__ before being called\.
The possible input keys are these:
* __verb__ *GET|PUT|DELETE|HEAD*
This required item indicates the verb to be used\.
* __resource__ *string*
This required item indicates the resource to be accessed\. A leading / is
added if not there already\. It will be URL\-encoded for you if necessary\.
Do not supply a resource name that is already URL\-encoded\.
* ?__rtype__ *torrent|acl*?
This indicates a torrent or acl resource is being manipulated\. Do not
include this in the __resource__ key, or the "?" separator will get
URL\-encoded\.
* ?__parameters__ *dict*?
This optional dictionary provides parameters added to the URL for the
transaction\. The keys must be in the correct case \(which is confusing in
the Amazon documentation\) and the values must be valid\. This can be an
empty dictionary or omitted entirely if no parameters are desired\. No
other error checking on parameters is performed\.
* ?__headers__ *dict*?
This optional dictionary provides headers to be added to the HTTP
request\. The keys must be in *lower case* for the authentication to
work\. The values must not contain embedded newlines or carriage returns\.
This is primarily useful for adding x\-amz\-\* headers\. Since
authentication is calculated by __S3::REST__, do not add that header
here\. Since content\-type gets its own key, also do not add that header
here\.
* ?__inbody__ *contentstring*?
This optional item, if provided, gives the content that will be sent\. It
is sent with a transfer encoding of binary, and only the low bytes are
used, so use \[encoding convertto utf\-8\] if the string is a utf\-8 string\.
This is written all in one blast, so if you are using non\-blocking mode
and the __inbody__ is especially large, you may wind up blocking on
the write socket\.
* ?__infile__ *filename*?
This optional item, if provided, and if __inbody__ is not provided,
names the file from which the body of the HTTP message will be
constructed\. The file is opened for reading and sent progressively by
\[fcopy\], so it should not block in non\-blocking mode even if the file is
very large\. The file is transferred in binary mode, so the bytes on your
disk will match the bytes in your resource\. Due to HTTP restrictions, it
must be possible to use \[file size\] on this file to determine the size
at the start of the transaction\.
* ?__S3chan__ *channel*?
This optional item, if provided, indicates the already\-open socket over
which the transaction should be conducted\. If not provided, a connection
is made to the service access point specified via __S3::Configure__,
which is normally s3\.amazonaws\.com\. If this is provided, the channel is
not closed at the end of the transaction\.
* ?__outchan__ *channel*?
This optional item, if provided, indicates the already\-open channel to
which the body returned from S3 should be written\. That is, to retrieve
a large resource, open a file, set the translation mode, and pass the
channel as the value of the key outchan\. Output will be written to the
channel in pieces so memory does not fill up unnecessarily\. The channel
is not closed at the end of the transaction\.
* ?__resultvar__ *varname*?
This optional item, if provided, indicates that __S3::REST__ should
run in non\-blocking mode\. The *varname* should be fully qualified with
respect to namespaces and cannot be local to a proc\. If provided, the
result of the __S3::REST__ call is assigned to this variable once
everything has completed; use trace or vwait to know when this has
happened\. If this key is not provided, the result is simply returned
from the call to __S3::REST__ and no calls to the eventloop are
invoked from within this call\.
* ?__throwsocket__ *throw|return*?
This optional item controls how socket errors \(i\.e\., errors whose error
code starts with "S3 socket"\) are reported\. If __throwsocket__ is set
to *return*, or if the call is not blocking, a socket error is returned
in the dictionary as __error__, __errorInfo__, and
__errorCode__\. If a foreground call is made \(i\.e\., __resultvar__
is not provided\), and this option is not provided or is set to *throw*,
then __[error](\.\./\.\./\.\./\.\./index\.md\#error)__ will be invoked instead\.
Once the call to __S3::REST__ completes, a new dict is returned, either
in the *resultvar* or as the result of execution\. This dict is a copy of
the original dict with the results added as new keys\. The possible new keys
are these:
* __error__ *errorstring*
* __errorInfo__ *errorstring*
* __errorCode__ *errorstring*
If an error is caught, these three keys will be set in the result\. Note
that __S3::REST__ does *not* consider a non\-2XX HTTP return code
as an error\. The __errorCode__ value will be formatted according to
the [ERROR REPORTING](#section2) description\. If these are present,
other keys described here might not be\.
* __httpstatus__ *threedigits*
The three\-digit code from the HTTP transaction\. 2XX for good, 5XX for
server error, etc\.
* __httpmessage__ *text*
The textual result after the status code, such as "OK" or "Forbidden"\.
* __outbody__ *contentstring*
If *outchan* was not specified, this key will hold a reference to the
\(unencoded\) contents of the body returned\. If Amazon returned an error
\(that is, the httpstatus is not a 2XX value\), the error message will be
in __outbody__ or written to __outchan__ as appropriate\.
* __outheaders__ *dict*
This contains a dictionary of headers returned by Amazon\. The keys are
always lower case\. It's mainly useful for finding the x\-amz\-meta\-\*
headers, if any, although things like last\-modified and content\-type are
also useful\. Both keys and values are trimmed of extraneous whitespace\.
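As an illustration of the input and output dictionaries, the sketch below builds
a request for a blocking PUT and interprets a result\. It makes several
assumptions: the resource is written as "/bucket/resource", the bucket and
resource names are invented, the helper summarizeRestResult is hypothetical, and
the result is simulated by hand so that the example runs without contacting
Amazon\.

```tcl
# Request dictionary for a blocking PUT of a small string.
# Assumption: the bucket is given as the first component of the resource path.
set request [dict create \
    verb        PUT \
    resource    "/example-bucket/notes/hello.txt" \
    headers     [dict create x-amz-meta-origin "demo"] \
    inbody      [encoding convertto utf-8 "hello, world"] \
    throwsocket return]

# Hypothetical helper: turn a result dictionary into a one-line summary.
proc summarizeRestResult {result} {
    if {[dict exists $result error]} {
        return "failed: [dict get $result error]"
    }
    set status [dict get $result httpstatus]
    if {[string match "2??" $status]} {
        return "ok: HTTP $status"
    }
    return "refused: HTTP $status [dict get $result httpmessage]"
}

# A real blocking call would be:   set result [S3::REST $request]
# A non-blocking call would add:   dict set request resultvar ::myResult
# and then wait for the variable:  vwait ::myResult
# Here the result is simulated by adding the documented output keys by hand.
set result [dict merge $request [dict create \
    httpstatus 403 httpmessage Forbidden outbody "" outheaders {}]]
puts [summarizeRestResult $result]
```

Checking for the __error__ key first matters because, when it is present,
keys such as __httpstatus__ might not be\.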
# HIGH LEVEL COMMANDS
The routines in this section all make use of one or more calls to
__S3::REST__ to do their work, then parse and manage the data in a
convenient way\. All these commands throw errors as described in [ERROR
REPORTING](#section2) unless otherwise noted\.
In all these commands, all arguments are presented as name/value pairs, in any
order\. All the argument names start with a hyphen\.
There are a few options that are common to many of the commands, and those
common options are documented here\.
- __\-blocking__ *boolean*
If provided and specified as false, then any calls to __S3::REST__ will
be non\-blocking, and internally these routines will call \[vwait\] to get the
results\. In other words, these routines will return the same value, but
they'll have event loops running while waiting for Amazon\.
- __\-parse\-xml__ *xmlstring*
If provided, the routine skips actually communicating with Amazon, and
instead behaves as if the XML string provided was returned as the body of
the call\. Since several of these routines allow the return of data in
various formats, this argument can be used to parse existing XML to extract
the bits of information that are needed\. It's also helpful for testing\.
- __\-bucket__ *bucketname*
Almost every high\-level command needs to know what bucket the resources are
in\. This option specifies that\. \(Only the command to list available buckets
does not require this parameter\.\) This does not need to be URL\-encoded, even
if it contains special or non\-ASCII characters\. May or may not contain
leading or trailing spaces \- commands normalize the bucket\. If this is not
supplied, the value is taken from __S3::Configure \-default\-bucket__ if
that string isn't empty\. Note that spaces and slashes are always trimmed
from both ends and the rest must leave a valid bucket\.
- __\-resource__ *resourcename*
This specifies the resource of interest within the bucket\. It may or may not
start with a slash \- both cases are handled\. This does not need to be
URL\-encoded, even if it contains special or non\-ASCII characters\.
- __\-compare__ *always|never|exists|missing|newer|date|checksum|different*
When commands copy resources to files or files to resources, the caller may
specify that the copy should be skipped if the contents are the same\. This
argument specifies the conditions under which the files should be copied\. If
it is not passed, the result of __S3::Configure \-default\-compare__ is
used, which in turn defaults to "always\." The meanings of the various values
are these:
* *always*
Always copy the data\. This is the default\.
* *never*
Never copy the data\. This is essentially a no\-op, except in
__S3::Push__ and __S3::Pull__ where the \-delete flag might make
a difference\.
* *exists*
Copy the data only if the destination already exists\.
* *missing*
Copy the data only if the destination does not already exist\.
* *newer*
Copy the data if the destination is missing, or if the date on the
source is newer than the date on the destination by at least
__S3::Configure \-slop\-seconds__ seconds\. If the source is Amazon,
the date is taken from the Last\-Modified header\. If the source is local,
it is taken as the mtime of the file\. If the source data is specified in
a string rather than a file, it is taken as right now, via \[clock
seconds\]\.
* *date*
Like *newer*, except copy if the date is newer *or* older\.
* *checksum*
Calculate the MD5 checksum on the local file or string, ask Amazon for
the eTag of the resource, and copy the data if they're different\. Copy
the data also if the destination is missing\. Note that this can be slow
with large local files unless the C version of the MD5 support is
available\.
* *different*
Copy the data if the destination does not exist\. If the destination
exists and an actual file name was specified \(rather than a content
string\), and the date on the file differs from the date on the resource,
copy the data\. If the data is provided as a content string, the "date"
is treated as "right now", so it will likely always differ unless
slop\-seconds is large\. If the dates are the same, the MD5 checksums are
compared, and the data is copied if the checksums differ\.
Note that "newer" and "date" don't care about the contents, and "checksum"
doesn't care about the dates, but "different" checks both\.
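The comparison rules can be summarized as a small decision procedure\. The sketch
below is one reading of the descriptions above, not the package's own code: the
proc name shouldCopy and its arguments are hypothetical, and it assumes that two
dates count as different only when they are more than the slop value apart\.

```tcl
# Sketch of the -compare rules as described above.
# destExists: boolean; srcTime/dstTime: seconds since the epoch;
# srcMd5/dstMd5: checksum strings; slop: tolerance in seconds.
proc shouldCopy {mode destExists srcTime dstTime srcMd5 dstMd5 {slop 0}} {
    switch -- $mode {
        always  { return 1 }
        never   { return 0 }
        exists  { return [expr {$destExists ? 1 : 0}] }
        missing { return [expr {$destExists ? 0 : 1}] }
    }
    # Every remaining mode copies when the destination is missing.
    if {!$destExists} { return 1 }
    set datesDiffer [expr {abs($srcTime - $dstTime) > $slop}]
    switch -- $mode {
        newer     { return [expr {($srcTime - $dstTime) > $slop}] }
        date      { return $datesDiffer }
        checksum  { return [expr {$srcMd5 ne $dstMd5}] }
        different { return [expr {$datesDiffer || $srcMd5 ne $dstMd5}] }
        default   { error "unknown compare mode \"$mode\"" }
    }
}

puts [shouldCopy newer 1 1000 900 aaa aaa 60]  ;# prints 1: source is 100 s newer
puts [shouldCopy checksum 1 1000 900 aaa aaa]  ;# prints 0: checksums are identical
```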
- __S3::ListAllMyBuckets__ ?__\-blocking__ *boolean*? ?__\-parse\-xml__ *xmlstring*? ?__\-result\-type__ *REST|xml|pxml|dict|names|owner*?
This routine performs a GET on the Amazon S3 service, which is defined to
return a list of buckets owned by the account identified by the
authorization header\. \(Blame Amazon for the dumb names\.\)
* __\-blocking__ *boolean*
See above for standard definition\.
* __\-parse\-xml__ *xmlstring*
See above for standard definition\.
* __\-result\-type__ *REST*
The dictionary returned by __S3::REST__ is the return value of
__S3::ListAllMyBuckets__\. In this case, a non\-2XX httpstatus will
not throw an error\. You may not combine this with *\-parse\-xml*\.
* __\-result\-type__ *xml*
The raw XML of the body is returned as the result \(with no encoding
applied\)\.
* __\-result\-type__ *pxml*
The XML of the body as parsed by __xsxp::parse__ is returned\.
* __\-result\-type__ *dict*
A dictionary of interesting portions of the XML is returned\. The
dictionary contains the following keys:
+ Owner/ID
The Amazon AWS ID \(in hex\) of the owner of the bucket\.
+ Owner/DisplayName
The Amazon AWS ID's Display Name\.
+ Bucket/Name
A list of names, one for each bucket\.
+ Bucket/CreationDate
A list of dates, one for each bucket, in the same order as
Bucket/Name, in ISO format \(as returned by Amazon\)\.
* __\-result\-type__ *names*
A list of bucket names is returned with all other information stripped
out\. This is the default result type for this command\.
* __\-result\-type__ *owner*
A list containing two elements is returned\. The first element is the
owner's ID, and the second is the owner's display name\.
- __S3::PutBucket__ ?__\-bucket__ *bucketname*? ?__\-blocking__ *boolean*? ?__\-acl__ *\{\}|private|public\-read|public\-read\-write|authenticated\-read*?
This command creates a bucket if it does not already exist\. Bucket names are
globally unique, so you may get a "Forbidden" error from Amazon even if you
cannot see the bucket in __S3::ListAllMyBuckets__\. See
__S3::SuggestBucket__ for ways to minimize this risk\. The x\-amz\-acl
header comes from the __\-acl__ option, or from __S3::Configure
\-default\-acl__ if not specified\.
- __S3::DeleteBucket__ ?__\-bucket__ *bucketname*? ?__\-blocking__ *boolean*?
This command deletes a bucket if it is empty and you have such permission\.
Note that Amazon's list of buckets is a global resource, requiring far\-flung
synchronization\. If you delete a bucket, it may be quite a few minutes \(or
hours\) before you can recreate it, yielding "Conflict" errors until then\.
- __S3::GetBucket__ ?__\-bucket__ *bucketname*? ?__\-blocking__ *boolean*? ?__\-parse\-xml__ *xmlstring*? ?__\-max\-count__ *integer*? ?__\-prefix__ *prefixstring*? ?__\-delimiter__ *delimiterstring*? ?__\-result\-type__ *REST|xml|pxml|names|dict*?
This lists the contents of a bucket\. That is, it returns a directory listing
of resources within a bucket, rather than transferring any user data\.
* __\-bucket__ *bucketname*
The standard bucket argument\.
* __\-blocking__ *boolean*
The standard blocking argument\.
* __\-parse\-xml__ *xmlstring*
The standard parse\-xml argument\.
* __\-max\-count__ *integer*
If supplied, this is the maximum number of records to be returned\. If not
supplied, the code will iterate until all records have been found\. Not
compatible with \-parse\-xml\. Note that if this is supplied, only one call
to __S3::REST__ will be made\. Otherwise, enough calls will be made
to exhaust the listing, buffering results in memory, so take care if you
may have huge buckets\.
* __\-prefix__ *prefixstring*
If present, restricts listing to resources with a particular prefix\. One
leading / is stripped if present\.
* __\-delimiter__ *delimiterstring*
If present, specifies a delimiter for the listing\. The presence of this
will summarize multiple resources into one entry, as if S3 supported
directories\. See the Amazon documentation for details\.
* __\-result\-type__ *REST|xml|pxml|names|dict*
This indicates the format of the return result of the command\.
+ REST
If *\-max\-count* is specified, the dictionary returned from
__S3::REST__ is returned\. If *\-max\-count* is not specified, a
list of all the dictionaries returned from the one or more calls to
__S3::REST__ is returned\.
+ xml
If *\-max\-count* is specified, the body returned from
__S3::REST__ is returned\. If *\-max\-count* is not specified, a
list of all the bodies returned from the one or more calls to
__S3::REST__ is returned\.
+ pxml
If *\-max\-count* is specified, the body returned from
__S3::REST__ is passed through __xsxp::parse__ and then
returned\. If *\-max\-count* is not specified, each body returned from
the one or more calls to __S3::REST__ is passed through
__xsxp::parse__, and a list of the parsed results is returned\.
+ names
Returns a list of all names found in either the Contents/Key fields
or the CommonPrefixes/Prefix fields\. If no *\-delimiter* is
specified and no *\-max\-count* is specified, this returns a list of
all resources with the specified *\-prefix*\.
+ dict
Returns a dictionary\. \(Returns only one dictionary even if
*\-max\-count* wasn't specified\.\) The keys of the dictionary are as
follows:
- Name
The name of the bucket \(from the final call to
__S3::REST__\)\.
- Prefix
From the final call to __S3::REST__\.
- Marker
From the final call to __S3::REST__\.
- MaxKeys
From the final call to __S3::REST__\.
- IsTruncated
From the final call to __S3::REST__, so always false if
*\-max\-count* is not specified\.
- NextMarker
Always provided if IsTruncated is true, and calculated if Amazon
does not provide it\. May be empty if IsTruncated is false\.
- Key
A list of names of resources in the bucket matching the
*\-prefix* and *\-delimiter* restrictions\.
- LastModified
A list of times of resources in the bucket, in the same order as
Key, in the format returned by Amazon\. \(I\.e\., it is not parsed
into a seconds\-from\-epoch\.\)
- ETag
A list of entity tags \(a\.k\.a\. MD5 checksums\) in the same order
as Key\.
- Size
A list of sizes in bytes of the resources, in the same order as
Key\.
- Owner/ID
A list of owners of the resources in the bucket, in the same
order as Key\.
- Owner/DisplayName
A list of owners of the resources in the bucket, in the same
order as Key\. These are the display names\.
- CommonPrefixes/Prefix
A list of prefixes common to multiple entities\. This is present
only if *\-delimiter* was supplied\.
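The *dict* result type returns Key, Size, and the other per\-resource fields as parallel lists in the same order\. The following is a minimal sketch of walking two of those lists together; the helper name `listing_sizes` and the sample dictionary are hypothetical, hand\-built in the documented shape rather than taken from real S3 output\.

```tcl
package require Tcl 8.5

# Hypothetical helper: pair each Key with its Size from a
# "-result-type dict" listing. Returns a two-element list:
# a {key size ...} dictionary and the total number of bytes.
proc listing_sizes {listing} {
    set sizes {}
    set total 0
    foreach key [dict get $listing Key] size [dict get $listing Size] {
        dict set sizes $key $size
        incr total $size
    }
    return [list $sizes $total]
}

# Hand-built sample in the documented shape (an assumption, not real output).
set sample [dict create \
    Name sample Prefix of/interest IsTruncated false \
    Key  {of/interest/a.txt of/interest/b.txt} \
    Size {120 4096}]

lassign [listing_sizes $sample] sizes total
puts "total bytes: $total"
```

The same pattern extends to LastModified, ETag, and the Owner fields, since they share the ordering of Key\.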
- __S3::Put__ ?__\-bucket__ *bucketname*? __\-resource__ *resourcename* ?__\-blocking__ *boolean*? ?__\-file__ *filename*? ?__\-content__ *contentstring*? ?__\-acl__ *private|public\-read|public\-read\-write|authenticated\-read|calc|keep*? ?__\-content\-type__ *contenttypestring*? ?__\-x\-amz\-meta\-\*__ *metadatatext*? ?__\-compare__ *comparemode*?
This command sends data to a resource on Amazon's servers for storage, using
the HTTP PUT command\. It returns 0 if the __\-compare__ mode prevented
the transfer, 1 if the transfer worked, or throws an error if the transfer
was attempted but failed\. Server 5XX errors and S3 socket errors are retried
according to __S3::Configure \-retries__ settings before throwing an
error; other errors throw immediately\.
* __\-bucket__
This specifies the bucket into which the resource will be written\.
Leading and/or trailing slashes are removed for you, as are spaces\.
* __\-resource__
This is the full name of the resource within the bucket\. A single
leading slash is removed, but not a trailing slash\. Spaces are not
trimmed\.
* __\-blocking__
The standard blocking flag\.
* __\-file__
If this is specified, the *filename* must exist, must be readable, and
must not be a special or directory file\. \[file size\] must apply to it
and must not change for the lifetime of the call\. The default
content\-type is calculated based on the name and/or contents of the
file\. Specifying this is an error if __\-content__ is also specified,
but at least one of __\-file__ or __\-content__ must be specified\.
\(The file is allowed to not exist or not be readable if __\-compare__
*never* is specified\.\)
* __\-content__
If this is specified, the *contentstring* is sent as the body of the
resource\. The content\-type defaults to "application/octet\-stream"\. Only
the low bytes are sent, so non\-ASCII should use the appropriate encoding
\(such as \[encoding convertto utf\-8\]\) before passing it to this routine,
if necessary\. Specifying this is an error if __\-file__ is also
specified, but at least one of __\-file__ or __\-content__ must be
specified\.
* __\-acl__
This defaults to __S3::Configure \-default\-acl__ if not specified\. It
sets the x\-amz\-acl header on the PUT operation\. If the value provided is
*calc*, the x\-amz\-acl header is calculated based on the I/O
permissions of the file to be uploaded; it is an error to specify
*calc* and __\-content__\. If the value provided is *keep*, the
acl of the resource is read before the PUT \(or the default is used if
the resource does not exist\), then set back to what it was after the PUT
\(if it existed\)\. An error will occur if the resource is successfully
written but the kept ACL cannot be then applied\. This should never
happen\. *Note:* *calc* is not currently fully implemented\.
* __\-x\-amz\-meta\-\*__
If any header starts with "\-x\-amz\-meta\-", its contents are added to the
PUT command to be stored as metadata with the resource\. Again, no
encoding is performed, and the metadata should not contain characters
like newlines, carriage returns, and so on\. It is best to stick with
simple ASCII strings, or to fix the library in several places\.
* __\-content\-type__
This overrides the content\-type calculated by __\-file__ or sets the
content\-type for __\-content__\.
* __\-compare__
This is the standard compare mode argument\. __S3::Put__ returns 1 if
the data was copied or 0 if the data was skipped due to the comparison
mode so indicating it should be skipped\.
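Because __\-content__ sends only the low byte of each character, text should be encoded before it is handed to __S3::Put__\. The sketch below shows one way to do that; the helper name `put_text` and the chosen content\-type are illustrative assumptions, and the call needs configured credentials and network access to do anything useful\.

```tcl
package require Tcl 8.5
package require S3

# Hypothetical helper: store a Tcl string as a UTF-8 text resource.
# Returns what S3::Put returns (1 if sent, 0 if the compare mode
# skipped the transfer); errors thrown by S3::Put propagate.
proc put_text {bucket resource text} {
    # Encode explicitly: -content performs no encoding of its own.
    set bytes [encoding convertto utf-8 $text]
    return [S3::Put -bucket $bucket -resource $resource \
        -content $bytes \
        -content-type "text/plain; charset=utf-8"]
}
```

After __S3::Configure__ has been given an access key ID and secret access key, a call such as `put_text sample notes/hello.txt $message` would upload the encoded text\.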
- __S3::Get__ ?__\-bucket__ *bucketname*? __\-resource__ *resourcename* ?__\-blocking__ *boolean*? ?__\-compare__ *comparemode*? ?__\-file__ *filename*? ?__\-content__ *contentvarname*? ?__\-timestamp__ *aws|now*? ?__\-headers__ *headervarname*?
This command retrieves data from a resource on Amazon's S3 servers, using
the HTTP GET command\. It returns 0 if the __\-compare__ mode prevented
the transfer, 1 if the transfer worked, or throws an error if the transfer
was attempted but failed\. Server 5XX errors and S3 socket errors are
retried according to __S3::Configure__ settings before throwing an error;
other errors throw immediately\. Note that this is always authenticated as
the user configured via __S3::Configure \-accesskeyid__\. Use the
Tcl http package for unauthenticated GETs\.
* __\-bucket__
This specifies the bucket from which the resource will be read\. Leading
and/or trailing slashes are removed for you, as are spaces\.
* __\-resource__
This is the full name of the resource within the bucket\. A single
leading slash is removed, but not a trailing slash\. Spaces are not
trimmed\.
* __\-blocking__
The standard blocking flag\.
* __\-file__
If this is specified, the body of the resource will be read into this
file, incrementally without pulling it entirely into memory first\. The
parent directory must already exist\. If the file already exists, it must
be writable\. If an error is thrown part\-way through the process and the
file already existed, it may be clobbered\. If an error is thrown
part\-way through the process and the file did not already exist, any
partial bits will be deleted\. Specifying this is an error if
__\-content__ is also specified, but at least one of __\-file__ or
__\-content__ must be specified\.
* __\-timestamp__
This is only valid in conjunction with __\-file__\. It may be
specified as *now* or *aws*\. The default is *now*\. If *now*, the
file's modification date is left up to the system\. If *aws*, the
file's mtime is set to match the Last\-Modified header on the resource,
synchronizing the two appropriately for __\-compare__ *date* or
__\-compare__ *newer*\.
* __\-content__
If this is specified, the *contentvarname* is a variable in the
caller's scope \(not necessarily global\) that receives the value of the
body of the resource\. No encoding is done, so if the resource \(for
example\) represents a UTF\-8 byte sequence, use \[encoding convertfrom
utf\-8\] to get a valid UTF\-8 string\. If this is specified, the
__\-compare__ is ignored unless it is *never*, in which case no
assignment to *contentvarname* is performed\. Specifying this is an
error if __\-file__ is also specified, but at least one of
__\-file__ or __\-content__ must be specified\.
* __\-compare__
This is the standard compare mode argument\. __S3::Get__ returns 1 if
the data was copied or 0 if the data was skipped due to the comparison
mode so indicating it should be skipped\.
* __\-headers__
If this is specified, the headers resulting from the fetch are stored in
the provided variable, as a dictionary\. This will include content\-type
and x\-amz\-meta\-\* headers, as well as the usual HTTP headers, the
x\-amz\-id debugging headers, and so on\. If no file is fetched \(due to
__\-compare__ or other errors\), no assignment to this variable is
performed\.
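The __\-content__ and __\-headers__ options both name variables in the caller's scope\. A minimal sketch of fetching a resource into memory and decoding it follows; the helper name `get_text` is hypothetical, and it assumes the resource holds UTF\-8 text\.

```tcl
package require Tcl 8.5
package require S3

# Hypothetical helper: fetch a resource into memory and decode it as UTF-8.
# Returns a two-element list: the decoded text and the headers dictionary.
proc get_text {bucket resource} {
    set body {}
    set hdrs {}
    # -compare always is passed explicitly so that a configured default
    # of "never" cannot suppress the assignment to body.
    S3::Get -bucket $bucket -resource $resource \
        -content body -headers hdrs -compare always
    # The body arrives as raw bytes; convert it to a Tcl string.
    return [list [encoding convertfrom utf-8 $body] $hdrs]
}
```

Any error thrown by __S3::Get__ propagates to the caller of the helper unchanged\.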
- __S3::Head__ ?__\-bucket__ *bucketname*? __\-resource__ *resourcename* ?__\-blocking__ *boolean*? ?__\-dict__ *dictvarname*? ?__\-headers__ *headersvarname*? ?__\-status__ *statusvarname*?
This command requests HEAD from the resource\. It returns whether a 2XX code
was returned as a result of the request, never throwing an S3 remote error\.
That is, if this returns 1, the resource exists and is accessible\. If this
returns 0, something went wrong, and the __\-status__ result can be
consulted for details\.
* __\-bucket__
This specifies the bucket from which the resource will be read\. Leading
and/or trailing slashes are removed for you, as are spaces\.
* __\-resource__
This is the full name of the resource within the bucket\. A single
leading slash is removed, but not a trailing slash\. Spaces are not
trimmed\.
* __\-blocking__
The standard blocking flag\.
* __\-dict__
If specified, the resulting dictionary from the __S3::REST__ call is
assigned to the indicated \(not necessarily global\) variable in the
caller's scope\.
* __\-headers__
If specified, the dictionary of headers from the result is assigned to
the indicated \(not necessarily global\) variable in the caller's scope\.
* __\-status__
If specified, the indicated \(not necessarily global\) variable in the
caller's scope is assigned a 2\-element list\. The first element is the
3\-digit HTTP status code, while the second element is the HTTP message
\(such as "OK" or "Forbidden"\)\.
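Since __S3::Head__ returns 0 instead of throwing on a remote error, the __\-status__ variable is the place to look for the reason\. The following sketch wraps that check; the helper name `resource_status` and the shape of its return value are assumptions made for illustration\.

```tcl
package require Tcl 8.5
package require S3

# Hypothetical helper: report whether a resource is reachable, together
# with the HTTP status that S3::Head stored in the -status variable.
proc resource_status {bucket resource} {
    set status {}
    set ok [S3::Head -bucket $bucket -resource $resource -status status]
    # status is a two-element list: the HTTP code and the HTTP message.
    lassign $status code message
    return [dict create exists $ok code $code message $message]
}
```

A caller can then branch on the `exists` entry and log `code` and `message` when it is 0\.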
- __S3::GetAcl__ ?__\-blocking__ *boolean*? ?__\-bucket__ *bucketname*? __\-resource__ *resourcename* ?__\-result\-type__ *REST|xml|pxml*?
This command gets the ACL of the indicated resource or throws an error if it
is unavailable\.
* __\-blocking__ *boolean*
See above for standard definition\.
* __\-bucket__
This specifies the bucket from which the resource will be read\. Leading
and/or trailing slashes are removed for you, as are spaces\.
* __\-resource__
This is the full name of the resource within the bucket\. A single
leading slash is removed, but not a trailing slash\. Spaces are not
trimmed\.
* __\-parse\-xml__ *xml*
The XML from a previous __S3::GetAcl__ can be passed in to be parsed into
dictionary form\. In this case, \-result\-type must be pxml or dict\.
* __\-result\-type__ *REST*
The dictionary returned by __S3::REST__ is the return value of
__S3::GetAcl__\. In this case, a non\-2XX httpstatus will not throw an
error\.
* __\-result\-type__ *xml*
The raw XML of the body is returned as the result \(with no encoding
applied\)\.
* __\-result\-type__ *pxml*
The XML of the body as parsed by __xsxp::parse__ is returned\.
* __\-result\-type__ *dict*
This fetches the ACL, parses it, and returns a dictionary of two
elements\.
The first element has the key "owner" whose value is the canonical ID of
the owner of the resource\.
The second element has the key "acl" whose value is a dictionary\. Each
key in the dictionary is one of Amazon's permissions, namely "READ",
"WRITE", "READ\_ACP", "WRITE\_ACP", or "FULL\_CONTROL"\. Each value of each
key is a list of canonical IDs or group URLs that have that permission\.
Elements are not in the list in any particular order, and not all keys
are necessarily present\. Display names are not returned, as they are not
especially useful; use pxml to obtain them if necessary\.
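The *dict* result maps each permission to a list of grantees, so answering "what can this grantee do?" means scanning every key\. A minimal sketch follows; the helper name `permissions_for` and the sample dictionary \(including the placeholder owner ID\) are hypothetical and hand\-built in the documented shape\.

```tcl
package require Tcl 8.5

# Hypothetical helper: list the permissions that a grantee holds in a
# dictionary shaped like the result of S3::GetAcl -result-type dict.
proc permissions_for {aclinfo grantee} {
    set found {}
    dict for {permission grantees} [dict get $aclinfo acl] {
        if {$grantee in $grantees} {
            lappend found $permission
        }
    }
    # Sort so the result does not depend on dictionary order.
    return [lsort $found]
}

# Hand-built sample; OWNERID stands in for a real canonical ID.
set sample [dict create owner OWNERID acl [dict create \
    FULL_CONTROL {OWNERID} \
    READ {OWNERID http://acs.amazonaws.com/groups/global/AllUsers}]]

puts [permissions_for $sample OWNERID]
```

Not every permission key is necessarily present, which is why the helper iterates over whatever keys exist rather than probing a fixed list\.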
- __S3::PutAcl__ ?__\-blocking__ *boolean*? ?__\-bucket__ *bucketname*? __\-resource__ *resourcename* ?__\-acl__ *new\-acl*?
This sets the ACL on the indicated resource\. It returns the XML written to
the ACL, or throws an error if anything went wrong\.
* __\-blocking__ *boolean*
See above for standard definition\.
* __\-bucket__
This specifies the bucket containing the resource whose ACL will be set\.
Leading and/or trailing slashes are removed for you, as are spaces\.
* __\-resource__
This is the full name of the resource within the bucket\. A single
leading slash is removed, but not a trailing slash\. Spaces are not
trimmed\.
* __\-owner__
If this is provided, it is assumed to match the owner of the resource\.
Otherwise, a GET may need to be issued against the resource to find the
owner\. If you already have the owner \(such as from a call to
__S3::GetAcl__\), you can pass the value of the "owner" key as the
value of this option, and it will be used in the construction of the
XML\.
* __\-acl__
If this option is specified, it provides the ACL the caller wishes to
write to the resource\. If this is not supplied or is empty, the value is
taken from __S3::Configure \-default\-acl__\. The ACL is written with a
PUT to the ?acl resource\.
If the value passed to this option starts with "<", it is taken to be a
body to be PUT to the ACL resource\.
If the value matches one of the standard Amazon x\-amz\-acl headers \(i\.e\.,
a canned access policy\), that header is translated to XML and then
applied\. The canned access policies are private, public\-read,
public\-read\-write, and authenticated\-read \(in lower case\)\.
Otherwise, the value is assumed to be a dictionary formatted as the
"acl" sub\-entry within the dict returned by __S3::GetAcl \-result\-type
dict__\. The proper XML is generated and applied to the resource\. Note
that a value containing "//" is assumed to be a group, a value
containing "@" is assumed to be an AmazonCustomerByEmail, and otherwise
the value is assumed to be a canonical Amazon ID\.
Note that you cannot change the owner, so calling GetAcl on a resource
owned by one user and applying it via PutAcl on a resource owned by
another user may not do exactly what you expect\.
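The dictionary form of __\-acl__ makes it possible to read an ACL, adjust it, and write it back\. The sketch below adds a grant while keeping the existing ones; the helper names `acl_grant` and `make_public` are hypothetical, and the AllUsers group URL is used because a value containing "//" is treated as a group\.

```tcl
package require Tcl 8.5
package require S3

# Hypothetical helper: return a copy of an "acl" dictionary with
# grantee added to permission, avoiding duplicate entries.
proc acl_grant {acl permission grantee} {
    if {![dict exists $acl $permission]
        || $grantee ni [dict get $acl $permission]} {
        dict lappend acl $permission $grantee
    }
    return $acl
}

# Sketch: make a resource world-readable while keeping existing grants.
# Passing -owner reuses the owner already returned by S3::GetAcl.
proc make_public {bucket resource} {
    set current [S3::GetAcl -bucket $bucket -resource $resource \
        -result-type dict]
    set acl [acl_grant [dict get $current acl] READ \
        http://acs.amazonaws.com/groups/global/AllUsers]
    return [S3::PutAcl -bucket $bucket -resource $resource \
        -owner [dict get $current owner] -acl $acl]
}
```

Only `acl_grant` is pure; `make_public` needs configured credentials and network access\.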
- __S3::Delete__ ?__\-bucket__ *bucketname*? __\-resource__ *resourcename* ?__\-blocking__ *boolean*? ?__\-status__ *statusvar*?
This command deletes the specified resource from the specified bucket\. It
returns 1 if the resource was deleted successfully, 0 otherwise\. It returns
0 rather than throwing an S3 remote error\.
* __\-bucket__
This specifies the bucket from which the resource will be deleted\.
Leading and/or trailing slashes are removed for you, as are spaces\.
* __\-resource__
This is the full name of the resource within the bucket\. A single
leading slash is removed, but not a trailing slash\. Spaces are not
trimmed\.
* __\-blocking__
The standard blocking flag\.
* __\-status__
If specified, the indicated \(not necessarily global\) variable in the
caller's scope is set to a two\-element list\. The first element is the
3\-digit HTTP status code\. The second element is the HTTP message \(such
as "OK" or "Forbidden"\)\. Note that Amazon's DELETE result is 204 on
success, that being the code indicating no content in the returned body\.
- __S3::Push__ ?__\-bucket__ *bucketname*? __\-directory__ *directoryname* ?__\-prefix__ *prefixstring*? ?__\-compare__ *comparemode*? ?__\-x\-amz\-meta\-\*__ *metastring*? ?__\-acl__ *aclcode*? ?__\-delete__ *boolean*? ?__\-error__ *throw|break|continue*? ?__\-progress__ *scriptprefix*?
This synchronises a local directory with a remote bucket by pushing the
differences using __S3::Put__\. Note that if something has changed in the
bucket but not locally, those changes could be lost\. Thus, this is not a
general two\-way synchronization primitive\. \(See __S3::Sync__ for that\.\)
Note too that resource names are case sensitive, so changing the case of a
file on a Windows machine may lead to otherwise\-unnecessary transfers\. Note
that only regular files are considered, so devices, pipes, symlinks, and
directories are not copied\.
* __\-bucket__
This names the bucket into which data will be pushed\.
* __\-directory__
This names the local directory from which files will be taken\. It must
exist, be readable via \[glob\] and so on\. If only some of the files
therein are readable, __S3::Push__ will PUT those files that are
readable and return in its results the list of files that could not be
opened\.
* __\-prefix__
This names the prefix that will be added to all resources\. That is, it
is the remote equivalent of __\-directory__\. If it is not specified,
the root of the bucket will be treated as the remote directory\. An
example may clarify\.

    S3::Push -bucket test -directory /tmp/xyz -prefix hello/world

In this example, /tmp/xyz/pdq\.html will be stored as
http://s3\.amazonaws\.com/test/hello/world/pdq\.html in Amazon's servers\.
Also, /tmp/xyz/abc/def/Hello will be stored as
http://s3\.amazonaws\.com/test/hello/world/abc/def/Hello in Amazon's
servers\. Without the __\-prefix__ option, /tmp/xyz/pdq\.html would be
stored as http://s3\.amazonaws\.com/test/pdq\.html\.
* __\-blocking__
This is the standard blocking option\.
* __\-compare__
If present, this is passed to each invocation of __S3::Put__\.
Naturally, __S3::Configure \-default\-compare__ is used if this is not
specified\.
* __\-x\-amz\-meta\-\*__
If present, this is passed to each invocation of __S3::Put__\. All
copied files will have the same metadata\.
* __\-acl__
If present, this is passed to each invocation of __S3::Put__\.
* __\-delete__
This defaults to false\. If true, resources in the destination that are
not in the source directory are deleted with __S3::Delete__\. Since
only regular files are considered, the existence of a symlink, pipe,
device, or directory in the local source will *not* prevent the
deletion of a remote resource with a corresponding name\.
* __\-error__
This controls the behavior of __S3::Push__ in the event that
__S3::Put__ throws an error\. Note that errors encountered on the
local file system or in reading the list of resources in the remote
bucket always throw errors\. This option allows control over "partial"
errors, when some files were copied and some were not\.
__S3::Delete__ is always finished up, with errors simply recorded in
the return result\.
+ throw
The error is rethrown with the same errorCode\.
+ break
Processing stops without throwing an error, the error is recorded in
the return value, and the command returns with a normal return\. The
calls to __S3::Delete__ are not started\.
+ continue
This is the default\. Processing continues without throwing,
recording the error in the return result, and resuming with the next
file in the local directory to be copied\.
* __\-progress__
If this is specified and the indicated script prefix is not empty, the
indicated script prefix will be invoked several times in the caller's
context with additional arguments at various points in the processing\.
This allows progress reporting without backgrounding\. The provided
prefix will be invoked with additional arguments, with the first
additional argument indicating what part of the process is being
reported on\. The prefix is initially invoked with *args* as the first
additional argument and a dictionary representing the normalized
arguments to the __S3::Push__ call as the second additional
argument\. Then the prefix is invoked with *local* as the first
additional argument and a list of suffixes of the files to be considered
as the second argument\. Then the prefix is invoked with *remote* as
the first additional argument and a list of suffixes existing in the
remote bucket as the second additional argument\. Then, for each file in
the local list, the prefix will be invoked with *start* as the first
additional argument and the common suffix as the second additional
argument\. When __S3::Put__ returns for that file, the prefix will be
invoked with *copy* as the first additional argument, the common
suffix as the second additional argument, and a third argument that will
be "copied" \(if __S3::Put__ sent the resource\), "skipped" \(if
__S3::Put__ decided not to based on __\-compare__\), or the
errorCode that __S3::Put__ threw due to unexpected errors \(in which
case the third argument is a list that starts with "S3"\)\. When all files
have been transferred, the prefix may be invoked zero or more times with
*delete* as the first additional argument and the suffix of the
resource being deleted as the second additional argument, with a third
argument being either an empty string \(if the delete worked\) or the
errorCode from __S3::Delete__ if it failed\. Finally, the prefix will
be invoked with *finished* as the first additional argument and the
return value as the second additional argument\.
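The sequence of __\-progress__ invocations described above can be handled by a single procedure that switches on the first additional argument\. This is a minimal sketch; the procedure name `report_progress`, its leading `label` argument, and the message wording are all assumptions\.

```tcl
package require Tcl 8.5

# Hypothetical progress reporter. $label is our own fixed argument taken
# from the script prefix; $phase and $args are appended by S3::Push.
proc report_progress {label phase args} {
    switch -- $phase {
        args     { set msg "starting" }
        local    { set msg "[llength [lindex $args 0]] local files" }
        remote   { set msg "[llength [lindex $args 0]] remote resources" }
        start    { set msg "sending [lindex $args 0]" }
        copy {
            # outcome is "copied", "skipped", or an errorCode list.
            lassign $args suffix outcome
            set msg "$suffix: $outcome"
        }
        delete {
            # err is empty on success, otherwise the S3::Delete errorCode.
            lassign $args suffix err
            if {$err eq ""} {
                set msg "$suffix: deleted"
            } else {
                set msg "$suffix: delete failed"
            }
        }
        finished { set msg "done" }
        default  { set msg "unknown phase $phase" }
    }
    puts "$label $msg"
    return $msg
}
```

It would be passed as a script prefix, for example `-progress [list report_progress push:]`, so that each invocation receives `push:` followed by the arguments __S3::Push__ appends\.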
The return result from this command is a dictionary\. The keys are the
suffixes \(i\.e\., the common portion of the path after the __\-directory__
and __\-prefix__\), while the values are either "copied", "skipped" \(if
__\-compare__ indicated not to copy the file\), or the errorCode thrown by
__S3::Put__, as appropriate\. If __\-delete__ was true, there may also
be entries for suffixes with the value "deleted" or "notdeleted", indicating
whether the attempted __S3::Delete__ worked or not, respectively\. There
is one additional pair in the return result, whose key is the empty string
and whose value is a nested dictionary\. The keys of this nested dictionary
include "filescopied" \(the number of files successfully copied\),
"bytescopied" \(the number of data bytes in the files copied, excluding
headers, metadata, etc\), "compareskipped" \(the number of files not copied
due to __\-compare__ mode\), "errorskipped" \(the number of files not
copied due to thrown errors\), "filesdeleted" \(the number of resources
deleted due to not having corresponding files locally, or 0 if
__\-delete__ is false\), and "filesnotdeleted" \(the number of resources
whose deletion was attempted but failed\)\.
Note that this is currently implemented somewhat inefficiently\. It fetches
the bucket listing \(including timestamps and eTags\), then calls
__S3::Put__, which uses HEAD to find the timestamps and eTags again\.
Correcting this with no API change is planned for a future upgrade\.
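The result dictionary mixes per\-file outcomes with one statistics entry stored under the empty\-string key\. A sketch of separating the two follows; the helper name `push_failures` and the sample result \(including its placeholder errorCode\) are hypothetical\.

```tcl
package require Tcl 8.5

# Hypothetical helper: return only the entries of an S3::Push result that
# did not end in "copied", "skipped", or "deleted".
proc push_failures {result} {
    set failed {}
    dict for {suffix outcome} $result {
        # The empty-string key holds the nested statistics dictionary.
        if {$suffix eq ""} continue
        if {$outcome ni {copied skipped deleted}} {
            dict set failed $suffix $outcome
        }
    }
    return $failed
}

# Hand-built sample in the documented shape; the errorCode is a placeholder.
set sample [dict create \
    a.txt copied \
    b.txt skipped \
    c.txt {S3 example-error} \
    "" [dict create filescopied 1 bytescopied 120 compareskipped 1 \
                    errorskipped 1 filesdeleted 0 filesnotdeleted 0]]

set failures [push_failures $sample]
puts "failed: [dict keys $failures]"
puts "errors skipped: [dict get $sample {} errorskipped]"
```

Since __S3::Pull__ returns a dictionary of the same form, the same helper should apply to its result as well\.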
- __S3::Pull__ ?__\-bucket__ *bucketname*? __\-directory__ *directoryname* ?__\-prefix__ *prefixstring*? ?__\-blocking__ *boolean*? ?__\-compare__ *comparemode*? ?__\-delete__ *boolean*? ?__\-timestamp__ *aws|now*? ?__\-error__ *throw|break|continue*? ?__\-progress__ *scriptprefix*?
This synchronises a remote bucket with a local directory by pulling the
differences using __S3::Get__\. If something has been changed locally but
not in the bucket, those differences may be lost\. This is not a general
two\-way synchronization mechanism\. \(See __S3::Sync__ for that\.\) This
creates directories if needed; new directories are created with default
permissions\. Note that resource names are case sensitive, so changing the
case of a file on a Windows machine may lead to otherwise\-unnecessary
transfers\. Also, try not to store data in resources that end with a slash,
or which are prefixes of resources that otherwise would start with a slash;
i\.e\., don't use this if you store data in resources whose names have to be
directories locally\.
Note that this is currently implemented somewhat inefficiently\. It fetches
the bucket listing \(including timestamps and eTags\), then calls
__S3::Get__, which uses HEAD to find the timestamps and eTags again\.
Correcting this with no API change is planned for a future upgrade\.
* __\-bucket__
This names the bucket from which data will be pulled\.
* __\-directory__
This names the local directory into which files will be written\. It must
exist, be readable via \[glob\], writable for file creation, and so on\. If
only some of the files therein are writable, __S3::Pull__ will GET
those files that are writable and return in its results the list of
files that could not be opened\.
* __\-prefix__
The prefix of resources that will be considered for retrieval\. See
__S3::Push__ for more details, examples, etc\. \(Of course,
__S3::Pull__ reads rather than writes, but the prefix is treated
similarly\.\)
* __\-blocking__
This is the standard blocking option\.
* __\-compare__
This is passed to each invocation of __S3::Get__ if provided\.
Naturally, __S3::Configure \-default\-compare__ is used if this is not
provided\.
* __\-timestamp__
This is passed to each invocation of __S3::Get__ if provided\.
* __\-delete__
If this is specified and true, files that exist in the
__\-directory__ that are not in the __\-prefix__ will be deleted
after all resources have been copied\. In addition, empty directories
\(other than the top\-level __\-directory__\) will be deleted, as Amazon
S3 has no concept of an empty directory\.
* __\-error__
See __S3::Push__ for a description of this option\.
* __\-progress__
See __S3::Push__ for a description of this option\. It differs
slightly in that local directories may be included with a trailing slash
to indicate they are directories\.
The return value from this command is a dictionary\. It is identical in form
and meaning to the description of the return result of __S3::Push__\. It
differs only in that directories may be included, with a trailing slash in
their name, if they are empty and get deleted\.
- __S3::Toss__ ?__\-bucket__ *bucketname*? __\-prefix__ *prefixstring* ?__\-blocking__ *boolean*? ?__\-error__ *throw|break|continue*? ?__\-progress__ *scriptprefix*?
This deletes some or all resources within a bucket\. It would be considered a
"recursive delete" had Amazon implemented actual directories\.
* __\-bucket__
The bucket from which resources will be deleted\.
* __\-blocking__
The standard blocking option\.
* __\-prefix__
The prefix for resources to be deleted\. Any resource that starts with
this string will be deleted\. This is required\. To delete everything in
the bucket, pass an empty string for the prefix\.
* __\-error__
If this is "throw", __S3::Toss__ rethrows any errors it encounters\.
If this is "break", __S3::Toss__ returns with a normal return after
the first error, recording that error in the return result\. If this is
"continue", which is the default, __S3::Toss__ continues on and
lists all errors in the return result\.
* __\-progress__
If this is specified and not an empty string, the script prefix will be
invoked several times in the context of the caller with additional
arguments appended\. Initially, it will be invoked with the first
additional argument being *args* and the second being the processed
list of arguments to __S3::Toss__\. Then it is invoked with
*remote* as the first additional argument and the list of suffixes in
the bucket to be deleted as the second additional argument\. Then it is
invoked with the first additional argument being *delete* and the
second additional argument being the suffix deleted and the third
additional argument being "deleted" or "notdeleted" depending on whether
__S3::Delete__ threw an error\. Finally, the script prefix is invoked
with a first additional argument of "finished" and a second additional
argument of the return value\.
The return value is a dictionary\. The keys are the suffixes of files that
__S3::Toss__ attempted to delete, and the values are either the string
"deleted" or "notdeleted"\. There is also one additional pair, whose key is
the empty string and whose value is an embedded dictionary\. The keys of this
embedded dictionary include "filesdeleted" and "filesnotdeleted", each of
which has integer values\.
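Because an empty __\-prefix__ deletes everything in the bucket, a wrapper that refuses it can be a useful safeguard\. The following is a minimal sketch under that assumption; the helper name `purge_prefix` and its refusal policy are not part of the package\.

```tcl
package require Tcl 8.5
package require S3

# Hypothetical wrapper: delete every resource under a non-empty prefix and
# return a two-element list of the deleted and not-deleted counts.
proc purge_prefix {bucket prefix} {
    if {$prefix eq ""} {
        error "refusing to delete the entire bucket"
    }
    set result [S3::Toss -bucket $bucket -prefix $prefix -error continue]
    # The statistics live under the empty-string key of the result.
    set stats [dict get $result ""]
    return [list [dict get $stats filesdeleted] \
                 [dict get $stats filesnotdeleted]]
}
```

A non\-zero second element signals that some deletions failed; the per\-suffix entries of the full result identify which ones\.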
# LIMITATIONS
- The pure\-Tcl MD5 checking is slow\. If you are processing files in the
megabyte range, consider ensuring binary support is available\.
- The commands __S3::Pull__ and __S3::Push__ fetch a directory listing
which includes timestamps and MD5 hashes, then invoke __S3::Get__ and
__S3::Put__\. If a complex __\-compare__ mode is specified,
__S3::Get__ and __S3::Put__ will invoke a HEAD operation for each
file to fetch timestamps and MD5 hashes of each resource again\. It is
expected that a future release of this package will solve this without any
API changes\.
- The commands __S3::Pull__ and __S3::Push__ fetch a directory listing
without using __\-max\-count__\. The entire directory is pulled into memory
at once\. For very large buckets, this could be a performance problem\. The
author, at this time, does not plan to change this behavior\. Welcome to Open
Source\.
- __S3::Sync__ is neither designed nor implemented yet\. The intention
would be to keep changes synchronised, so changes could be made to both the
bucket and the local directory and be merged by __S3::Sync__\.
- Nor is __\-acl__ *calc* fully implemented\. This is primarily due to
Windows not providing a convenient method for distinguishing between local
files that are "public\-read" or "public\-read\-write"\. Assistance figuring out
TWAPI for this would be appreciated\. The U\*\*X semantics are difficult to map
directly as well\. See the source for details\. Note that there are no tests
for calc, since it isn't done yet\.
- The HTTP processing is implemented within the library, rather than using a
"real" HTTP package\. Hence, multi\-line headers are not \(yet\) handled
correctly\. Do not include carriage returns or linefeeds in x\-amz\-meta\-\*
headers, content\-type values, and so on\. The author does not at this time
expect to improve this\.
- Internally, __S3::Push__ and __S3::Pull__ and __S3::Toss__ are
all very similar and should be refactored\.
- The idea of using __\-compare__ *never* __\-delete__ *true* to
delete files that have been deleted from one place but not the other yet not
copying changed files is untested\.
# USAGE SUGGESTIONS
To fetch a "directory" out of a bucket, make changes, and store it back:

    file mkdir ./tempfiles
    S3::Pull -bucket sample -prefix of/interest -directory ./tempfiles \
        -timestamp aws
    do_my_process ./tempfiles other arguments
    S3::Push -bucket sample -prefix of/interest -directory ./tempfiles \
        -compare newer -delete true

To delete files locally that were deleted off of S3 but not otherwise update
files:

    S3::Pull -bucket sample -prefix of/interest -directory ./myfiles \
        -compare never -delete true

# FUTURE DEVELOPMENTS
The author intends to work on several additional projects related to this
package, in addition to finishing the unfinished features\.
First, a command\-line program allowing browsing of buckets and transfer of files
from shell scripts and command prompts would be useful\.
Second, a GUI\-based program allowing visual manipulation of bucket and resource
trees not unlike Windows Explorer would be useful\.
Third, a command\-line \(and perhaps a GUI\-based\) program called "OddJob" would
use S3 to synchronize computation amongst multiple servers running OddJob\.
An S3 bucket will be set up with a number of scripts to run, and the OddJob
program can be invoked on multiple machines to run scripts on all the machines,
each moving on to the next unstarted task as it finishes each\. This is still
being designed, and it is intended primarily to be run on Amazon's Elastic
Compute Cloud\.
# TLS Security Considerations
This package uses the __[TLS](\.\./\.\./\.\./\.\./index\.md\#tls)__ package to
handle the security for __https__ urls and other socket connections\.
Policy decisions like the set of protocols to support and what ciphers to use
are not the responsibility of __[TLS](\.\./\.\./\.\./\.\./index\.md\#tls)__, nor
of this package itself, however\. Such decisions are the responsibility of
whichever application is using the package, and are likely influenced by the set
of servers the application will talk to as well\.
For example, in light of the recent [POODLE
attack](http://googleonlinesecurity\.blogspot\.co\.uk/2014/10/this\-poodle\-bites\-exploiting\-ssl\-30\.html)
discovered by Google, many servers will disable support for the SSLv3 protocol\.
To handle this change the applications using
__[TLS](\.\./\.\./\.\./\.\./index\.md\#tls)__ must be patched, and not this
package, nor __[TLS](\.\./\.\./\.\./\.\./index\.md\#tls)__ itself\. Such a patch
may be as simple as generally activating __tls1__ support, as shown in the
example below\.

    package require tls
    tls::init -tls1 1 ;# forcibly activate support for the TLS1 protocol
    ... your own application code ...

# Bugs, Ideas, Feedback
This document, and the package it describes, will undoubtedly contain bugs and
other problems\. Please report such in the category *amazon\-s3* of the [Tcllib
Trackers](http://core\.tcl\.tk/tcllib/reportlist)\. Please also report any ideas
for enhancements you may have for either package and/or documentation\.
When proposing code changes, please provide *unified diffs*, i\.e\. the output of
__diff \-u__\.
Note further that *attachments* are strongly preferred over inlined patches\.
Attachments can be made by going to the __Edit__ form of the ticket
immediately after its creation, and then using the left\-most button in the
secondary navigation bar\.
# KEYWORDS
[amazon](\.\./\.\./\.\./\.\./index\.md\#amazon),
[cloud](\.\./\.\./\.\./\.\./index\.md\#cloud), [s3](\.\./\.\./\.\./\.\./index\.md\#s3)
# CATEGORY
Networking
# COPYRIGHT
2006,2008 Darren New\. All Rights Reserved\. See LICENSE\.TXT for terms\.