Configuration Guide

JetS3t includes a number of configuration mechanisms that allow users with specialised requirements to control how the toolkit and applications will operate.

Items marked with a star (*) are new or changed in JetS3t version 0.8.0.

JetS3t Properties

The main configuration properties for the JetS3t toolkit and applications are stored in a file called jets3t.properties. This file specifies advanced communication properties. A properties file with default settings is included in the configs directory of the JetS3t release.

The jets3t.properties file is a standard text document that lists properties in the standard Java Properties file format, containing name value pairs and comments:

   # Comments start with a hash character.
   # Empty lines are ignored...

   # Properties are set as name=value, for example:
   propertyName=propertyValue
   my.favourite.colour=blue
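
Because this is the standard Java Properties format, text like the above can be read with java.util.Properties. The following is a minimal sketch of that, not a description of how JetS3t itself loads the file; the class name PropertiesFormatDemo and the inline sample text are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class PropertiesFormatDemo {

    // Parses text written in the standard Java Properties format.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String sample =
            "# Comments start with a hash character.\n" +
            "\n" +
            "propertyName=propertyValue\n" +
            "my.favourite.colour=blue\n";
        Properties props = parse(sample);
        System.out.println(props.getProperty("my.favourite.colour"));
        // A property that is absent falls back to the supplied default value.
        System.out.println(props.getProperty("httpclient.max-connections", "20"));
    }
}
```

Comment lines and empty lines produce no entries, so only the two name/value pairs end up in the Properties object.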

Overview of jets3t.properties

StorageService (apply to both S3 and Google Storage services)
storage-service.internal-error-retry-max The maximum number of retries that will be attempted when a service connection fails with an InternalServer error. To disable retries of InternalError failures, set this to 0.
Note: After each failure, the service pauses before retrying. The time to wait is calculated with the formula:
50 * (internalErrorCount ^ 2)
Default: 5
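
To make the retry pause formula concrete, here is a small sketch that evaluates it for each retry. It assumes that "^" denotes exponentiation and that the result is a number of milliseconds; the guide states neither explicitly. The class and method names are hypothetical.

```java
final class InternalErrorBackoff {

    // Pause before a retry, following the formula 50 * (internalErrorCount ^ 2).
    // Assumptions: "^" means "to the power of", and the unit is milliseconds.
    static long delayMs(int internalErrorCount) {
        return 50L * internalErrorCount * internalErrorCount;
    }

    public static void main(String[] args) {
        int retryMax = 5; // default value of storage-service.internal-error-retry-max
        long total = 0;
        for (int count = 1; count <= retryMax; count++) {
            long delay = delayMs(count);
            total += delay;
            System.out.println("failure " + count + ": pause " + delay);
        }
        System.out.println("total pause: " + total);
    }
}
```

With the default of 5 retries the pauses are 50, 200, 450, 800 and 1250, which adds up to 2750 in total.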
ThreadedStorageService (multi-threaded wrapper for S3 or Google services)*
threaded-service.max-thread-count* The maximum number of concurrent communication threads that will be started by the ThreadedStorageService/SimpleThreadedStorageService services for upload and download operations. This value should not be too high; otherwise you risk I/O errors due to bandwidth starvation when transferring many large files.
Default: 2
Note: This value must not exceed the maximum number of HTTP connections available to JetS3t, as set by the property httpclient.max-connections
threaded-service.admin-max-thread-count* The maximum number of concurrent communication threads that will be started by the ThreadedStorageService/SimpleThreadedStorageService services for administrative operations such as DELETE and HEAD requests. Because these operations are short-lived and incur little bandwidth, most systems can reliably use many more administrative threads than upload/download threads.
Default: 20
Note: This value must not exceed the maximum number of HTTP connections available to JetS3t, as set by the property httpclient.max-connections
threaded-service.ignore-exceptions-in-multi If this value is set to true, JetS3t will ignore exceptions that occur during multi-threaded operations performed by the ThreadedStorageService/SimpleThreadedStorageService classes. This option should only be used as a last resort when you need to complete an upload or download operation despite many communication errors.
Default: false
S3ServiceMulti (multi-threaded wrapper for S3 service)
s3service.max-thread-count The maximum number of concurrent communication threads that will be started by the S3ServiceMulti/S3ServiceSimpleMulti multi-threaded services for upload and download operations. This value should not be too high; otherwise you risk I/O errors due to bandwidth starvation when transferring many large files.
Default: 2
Note: This value must not exceed the maximum number of HTTP connections available to JetS3t, as set by the property httpclient.max-connections
s3service.admin-max-thread-count The maximum number of concurrent communication threads that will be started by the S3ServiceMulti/S3ServiceSimpleMulti multi-threaded service for administrative operations such as DELETE and HEAD requests. Because these operations are short-lived and incur little bandwidth, most systems can reliably use many more administrative threads than upload/download threads.
Default: 20
Note: This value must not exceed the maximum number of HTTP connections available to JetS3t, as set by the property httpclient.max-connections
s3service.ignore-exceptions-in-multi If this value is set to true, JetS3t will ignore exceptions that occur during multi-threaded operations performed by the S3ServiceMulti/S3ServiceSimpleMulti classes. This option should only be used as a last resort when you need to complete an upload or download operation despite many communication errors.
Default: false
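
The notes on the thread-count properties above require that none of them exceed httpclient.max-connections. The following sketch checks a set of properties against that rule. ThreadCountCheck and findViolations are hypothetical helpers rather than part of JetS3t, and the fallback values are the defaults listed in this guide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class ThreadCountCheck {

    // Thread-count property names paired with the defaults given in this guide.
    private static final String[][] THREAD_PROPS = {
        {"threaded-service.max-thread-count", "2"},
        {"threaded-service.admin-max-thread-count", "20"},
        {"s3service.max-thread-count", "2"},
        {"s3service.admin-max-thread-count", "20"},
    };

    // Returns the names of thread-count properties that exceed httpclient.max-connections.
    static List<String> findViolations(Properties props) {
        int maxConnections =
            Integer.parseInt(props.getProperty("httpclient.max-connections", "20"));
        List<String> violations = new ArrayList<>();
        for (String[] entry : THREAD_PROPS) {
            int threads = Integer.parseInt(props.getProperty(entry[0], entry[1]));
            if (threads > maxConnections) {
                violations.add(entry[0]);
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("httpclient.max-connections", "20");
        props.setProperty("s3service.admin-max-thread-count", "50");
        // Reports the one property that is larger than the connection limit.
        System.out.println(findViolations(props));
    }
}
```

An empty result means the configured (or default) thread counts all fit within the connection limit.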
RestS3Service
s3service.https-only If true, all communication with S3 will be via encrypted HTTPS connections; otherwise communications will be sent unencrypted via HTTP.
Default: true
s3service.s3-endpoint The host name of the S3 service. You should only ever change this value from the default if you need to contact an alternative S3 endpoint for testing purposes.
Default: s3.amazonaws.com
s3service.s3-endpoint-http-port The HTTP port number of the S3 service. You should only ever change this value from the default if you need to contact an alternative S3 endpoint for testing purposes.
Default: 80
s3service.s3-endpoint-https-port The HTTPS (secure HTTP) port number of the S3 service. You should only ever change this value from the default if you need to contact an alternative S3 endpoint for testing purposes.
Default: 443
s3service.disable-dns-buckets By default, JetS3t will always incorporate bucket names that are DNS-compatible into the host name of its requests. For example, a request directed at the bucket named "mybucket" will be sent to the host name "mybucket.s3.amazonaws.com". If you set this property to true, JetS3t will specify bucket names in the request path of the HTTP message instead of the Host header, for example: "http://s3.amazonaws.com/mybucket". This behaviour may be useful for testing purposes, or if DNS problems prevent the default host name addressing format from working correctly.
Note: If you set this property to true, you will be unable to access buckets located in the EU.
Default: false
s3service.default-bucket-location The default geographical location in which S3 buckets will be created. This value may be US or EU.
Default: US
s3service.enable-storage-classes* If true, S3 services will set the storage class property for newly-created objects to either the default value (see s3service.default-storage-class) or the REDUCED_REDUNDANCY value if specified. This setting should only be disabled if you are talking to a service (like Google Storage) that offers an S3-like API but doesn't support the storage class feature.
Default: true
s3service.default-storage-class* The default storage class value to use for newly-created objects in S3.
Default: STANDARD
s3service.s3-endpoint-virtual-path A non-standard URL to use when talking to services that use an S3-like API but expose them in a non-root URL location, e.g. the Eucalyptus Walrus project.
Default: STANDARD
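
The two addressing styles selected by s3service.disable-dns-buckets can be illustrated with a short sketch. This is not JetS3t's own code: the class BucketAddressing and its parameters are hypothetical, and the decision about whether a bucket name is DNS-compatible is assumed to have been made by the caller.

```java
final class BucketAddressing {

    // Builds the base URL for a request to a bucket.
    // Assumption: dnsCompatible has already been worked out by the caller.
    static String bucketUrl(String endpoint, String bucket, boolean dnsCompatible,
                            boolean disableDnsBuckets, boolean httpsOnly) {
        String scheme = httpsOnly ? "https" : "http";
        if (dnsCompatible && !disableDnsBuckets) {
            // Default style: the bucket name becomes part of the host name.
            return scheme + "://" + bucket + "." + endpoint + "/";
        }
        // Path style: the bucket name is placed in the request path instead.
        return scheme + "://" + endpoint + "/" + bucket;
    }

    public static void main(String[] args) {
        String endpoint = "s3.amazonaws.com"; // default s3service.s3-endpoint
        System.out.println(bucketUrl(endpoint, "mybucket", true, false, false));
        System.out.println(bucketUrl(endpoint, "mybucket", true, true, false));
    }
}
```

The first call yields the host-name form (mybucket.s3.amazonaws.com) and the second the request-path form (s3.amazonaws.com/mybucket) described for s3service.disable-dns-buckets.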
GoogleStorageService
gsservice.https-only If true, all communication with Google Storage will be via encrypted HTTPS connections; otherwise communications will be sent unencrypted via HTTP.
Default: true
gsservice.gs-endpoint The host name of the Google Storage service. You should only ever change this value from the default if you need to contact an alternative endpoint for testing purposes.
Default: commondatastorage.googleapis.com
gsservice.gs-endpoint-http-port The HTTP port number of the Google Storage service. You should only ever change this value from the default if you need to contact an alternative endpoint for testing purposes.
Default: 80
gsservice.gs-endpoint-https-port The HTTPS (secure HTTP) port number of the Google Storage service. You should only ever change this value from the default if you need to contact an alternative endpoint for testing purposes.
Default: 443
CloudFrontService
cloudfront-service.internal-error-retry-max The maximum number of retries that will be attempted when a CloudFront connection fails with an InternalServer error. To disable retries of InternalError failures, set this to 0.
Default: 5
REST/HTTP HttpClient properties
httpclient.max-connections The maximum number of simultaneous connections to allow globally.
Default: 20
Note: If you have a fast Internet connection, you can improve the performance of your S3 client by increasing this setting and the corresponding S3 Service properties s3service.max-thread-count and s3service.admin-max-thread-count. However, be careful because if you increase this value too much for your connection you may exceed your available bandwidth and cause communications errors.
httpclient.max-connections-per-host The maximum number of simultaneous connections to allow for a single host, such as an S3 bucket accessed through a virtual host name (e.g. jets3t.s3.amazonaws.com). This property is only useful to advanced users.
Default: The value of the property httpclient.max-connections
This property will only be of interest to people who use a single RestS3Service class to interact with multiple S3 buckets at the same time. If you need to do this, and if your buckets are accessed using DNS-compatible virtual host names like "jets3t.s3.amazonaws.com", you may wish to configure the maximum number of simultaneous connections to each bucket you are interacting with. For example, you could ensure that connections are spread evenly between two buckets by setting the value of this property to be half the value of httpclient.max-connections.
httpclient.retry-max How many times to retry connections when they fail with IO errors. Set this to 0 to disable retries.
Default: 5
httpclient.connection-timeout-ms How many milliseconds to wait before a connection times out. 0 means infinity.
Default: 60000
httpclient.socket-timeout-ms How many milliseconds to wait before a socket connection times out. 0 means infinity.
Default: 60000
httpclient.stale-checking-enabled "Determines whether stale connection checking is to be used. Disabling stale connection check may result in slight performance improvement at the risk of getting an I/O error when executing a request over a connection that has been closed at the server side." (from HttpClient documentation)
Default: true
httpclient.useragent The user agent string sent with HTTP requests.
Default: Application-specific
httpclient.read-throttle Imposes a rudimentary limit on the bandwidth used for uploads, by throttling the speed at which data will be sent to S3. This property specifies the limit in KB/s, expressed as an integer.
Default: N/A (This property is commented-out by default)
httpclient.authentication-preemptive Instructs the HttpClient library whether to pre-emptively authenticate HTTP connections. To improve compatibility with NTLM proxies, pre-emptive authentication is disabled by default.
Default: false
httpclient.proxy-autodetect Indicates whether JetS3t should auto-detect the HTTP proxy settings appropriate for the local machine.
Default: true
httpclient.proxy-host Explicitly sets the host name of a HTTP proxy server. To apply this setting, proxy autodetection should be disabled by setting httpclient.proxy-autodetect to false.
Default: N/A (httpclient.proxy-autodetect=true is used by default)
httpclient.proxy-port Explicitly sets the port number of a HTTP proxy server. To apply this setting, proxy autodetection should be disabled by setting httpclient.proxy-autodetect to false.
Default: N/A (httpclient.proxy-autodetect=true is used by default)
httpclient.proxy-user Explicitly sets the user name credential for proxy authentication. To apply this setting, proxy autodetection should be disabled by setting httpclient.proxy-autodetect to false.
Default: N/A (httpclient.proxy-autodetect=true is used by default)
httpclient.proxy-password Explicitly sets the password credential for proxy authentication. To apply this setting, proxy autodetection should be disabled by setting httpclient.proxy-autodetect to false.
Default: N/A (httpclient.proxy-autodetect=true is used by default)
httpclient.proxy-domain Explicitly sets the domain credential for proxy authentication. To apply this setting, proxy autodetection should be disabled by setting httpclient.proxy-autodetect to false.
Default: N/A (httpclient.proxy-autodetect=true is used by default)
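
As an illustration of how the explicit proxy properties depend on httpclient.proxy-autodetect, the sketch below parses a sample configuration fragment and reports whether the explicit settings would be expected to apply. The helper ProxySettingsSketch, the host proxy.example.com and the port 8080 are made up for this example, and the check reflects one reading of the descriptions above rather than JetS3t's actual logic.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class ProxySettingsSketch {

    // Parses a fragment in jets3t.properties format.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // True when autodetection is switched off and a proxy host has been given.
    static boolean usesExplicitProxy(Properties props) {
        boolean autodetect = Boolean.parseBoolean(
            props.getProperty("httpclient.proxy-autodetect", "true"));
        String host = props.getProperty("httpclient.proxy-host");
        return !autodetect && host != null && !host.trim().isEmpty();
    }

    public static void main(String[] args) {
        // Hypothetical proxy host and port, for illustration only.
        Properties props = parse(
            "httpclient.proxy-autodetect=false\n" +
            "httpclient.proxy-host=proxy.example.com\n" +
            "httpclient.proxy-port=8080\n");
        System.out.println("explicit proxy in use: " + usesExplicitProxy(props));
    }
}
```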
Requester Pays Settings
httpclient.requester-pays-buckets-enabled When set to true, JetS3t will be able to access Requester Pays buckets and the library user will be liable for any subsequent S3 request and bandwidth fees.
Note: This option can also be set or overridden programmatically via the method S3Service#setRequesterPaysEnabled.
Default: false
TCP window size hints for kernel
httpclient.socket-receive-buffer Integer value for the TCP receive window size in bytes, which is provided as a hint to the kernel by the HttpClient library before HTTP socket connections are established. This value will not necessarily override a kernel's default TCP window size settings because the kernel is free to ignore the hint.
Default: N/A (This property is commented-out by default, in which case the default TCP window size will be used)
httpclient.socket-send-buffer Integer value for the TCP send window size in bytes, which is provided as a hint to the kernel by the HttpClient library before HTTP socket connections are established. This value will not necessarily override a kernel's default TCP window size settings because the kernel is free to ignore the hint.
Default: N/A (This property is commented-out by default, in which case the default TCP window size will be used)
httpclient.connection-manager-timeout Specify a timeout for how long an S3 operation will wait for an HTTP connection to be made available by the HttpClient connection pool. The default setting (0) means wait indefinitely for a connection to become available.
Default: 0
Upload properties
uploads.stream-retry-buffer-size* How many bytes to buffer for use when retrying failed transmissions. This value must be small enough that applications using multiple upload threads will not exceed their available memory by buffering data.
Note: Applies only to uploads that obtain data from a non-markable InputStream. Uploads of data in File or String objects do not need to be buffered.
Default: 131072
uploads.storeEmptyDirectories Boolean value that indicates whether JetS3t applications that upload files to S3 should create place-holder objects to represent directories. Place-holder objects are named after a directory, have no data content, and have the mimetype application/x-directory. These place-holder objects are useful because they allow JetS3t to store empty directories in S3, but they may cause conflicts with other S3 clients that use a different technique to store directory names. If this option is set to false, synchronization from a local filesystem to S3 will not include empty directories.
Default: true
Download properties
downloads.restoreLastModifiedDate Boolean value that indicates whether JetS3t applications that download files from S3 should restore the original last modified date of the file, according to the value of the metadata item "jets3t-original-file-date-iso8601" created by JetS3t tools that upload files. If this property is set to true, it will be possible to retain a file's last modified date when that file is downloaded or restored from S3.
Default: true
File/Object comparison properties
filecomparer.ignore-panic-dir-placeholders Boolean value that indicates whether JetS3t applications that synchronize files with S3 should ignore directory place-holder objects created by the Transmit application (as sold by Panic). If you use both JetS3t and Transmit programs to interact with your S3 account, setting this option to true should minimize naming conflicts.
Default: true
filecomparer.ignore-s3organizer-dir-placeholders Boolean value that indicates whether JetS3t applications that synchronize files with S3 should ignore directory place-holder objects created by the S3 Organizer Firefox add-on. If you use both JetS3t and S3 Organizer to interact with your S3 account, setting this option to true should minimize naming conflicts.
Default: true
filecomparer.skip-symlinks Boolean value that indicates whether JetS3t should ignore symlink/alias files when deciding which files to send to S3. Note that although "soft" links can be detected and ignored, "hard" links cannot be detected and will be treated as ordinary files.
Default: false (This property is commented-out by default)
filecomparer.use-md5-files Boolean value that indicates whether JetS3t applications that synchronize files with S3 should look for files containing pre-generated MD5 hash values. These MD5 files must be named after the file they refer to, e.g. <filename>.md5, and must contain a hex-encoded MD5 value. If one of these files is present, JetS3t will use the pre-generated hash value rather than calculating a new hash, potentially saving a great deal of time when you are synchronizing large files. If the pre-generated MD5 file is older than the file it refers to, a new hash value will be calculated.
Default: false (This property is commented-out by default)
filecomparer.generate-md5-files Boolean value that indicates whether JetS3t applications that synchronize files with S3 should create <filename>.md5 files to store the MD5 hash values they calculate. If these values are stored, they need not be recalculated in future synchronizations. This property is intended to be used in conjunction with filecomparer.use-md5-files.
Default: false (This property is commented-out by default)
filecomparer.md5-files-root-dir The directory path in which <filename>.md5 files generated for file comparison purposes will be stored. By default MD5 files are saved in the same directories as the data files that are being compared but this can cause clutter. This property allows you to store these files in a parallel directory structure. This property is intended to be used in conjunction with filecomparer.use-md5-files.
Default: N/A (This property is commented-out by default)
filecomparer.skip-upload-of-md5-files Boolean value that indicates whether JetS3t applications that synchronize files with S3 should avoid uploading <filename>.md5 files to S3. This property is intended to be used in conjunction with filecomparer.use-md5-files.
Default: false (This property is commented-out by default)
filecomparer.assume-local-latest-in-mismatch Boolean value that indicates whether JetS3t applications that synchronize files with S3 should assume that the local file version is the most recent when there is a clash between the last update time as recorded locally and in S3. This option should only be enabled by users who experience errors synchronizing Microsoft Excel files to S3, due to Excel's observed habit of altering a file's contents without updating its last-modified timestamp.
Default: false (This property is commented-out by default)
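
For readers who want to pre-generate <filename>.md5 files themselves, the sketch below shows one way to produce a hex-encoded MD5 value and to apply the "older than the data file" rule described for filecomparer.use-md5-files. The class Md5FileSketch is hypothetical, and the use of lowercase hex digits is an assumption; the guide only says the value must be hex-encoded.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5FileSketch {

    // Returns the hex-encoded MD5 hash of the data (lowercase digits, by assumption).
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // A pre-generated hash is usable only if the .md5 file is not older than the data file.
    static boolean canUsePregeneratedHash(long dataFileModifiedMs, long md5FileModifiedMs) {
        return md5FileModifiedMs >= dataFileModifiedMs;
    }

    public static void main(String[] args) {
        byte[] content = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(md5Hex(content));
        // Data file changed after the .md5 file was written: the hash must be recalculated.
        System.out.println(canUsePregeneratedHash(2000L, 1000L));
    }
}
```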
Encryption properties
crypto.algorithm Name of the PBE cryptographic algorithm to be used by JetS3t applications that encrypt items uploaded to S3.
Note: The default algorithm isn't a strong one, and should be changed to a stronger one if security is important.
Note 2: The algorithms available will depend on the cryptography provider libraries installed on a system, and the Java policy file settings. This is a complicated topic that is not specific to JetS3t; please refer to documentation elsewhere.
Default: PBEWithMD5AndDES
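
Since the available algorithms depend on the installed providers, it can help to test a candidate name before putting it in crypto.algorithm. The sketch below is a partial check only: it asks the standard javax.crypto API whether a SecretKeyFactory exists for the name, which does not guarantee that JetS3t can use the algorithm for encryption. The class name PbeAlgorithmCheck is hypothetical.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.SecretKeyFactory;

final class PbeAlgorithmCheck {

    // True if an installed provider offers a SecretKeyFactory for the named algorithm.
    static boolean isAvailable(String algorithm) {
        try {
            SecretKeyFactory.getInstance(algorithm);
            return true;
        } catch (NoSuchAlgorithmException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Always check the default from this guide, then any names given on the command line.
        System.out.println("PBEWithMD5AndDES: " + isAvailable("PBEWithMD5AndDES"));
        for (String name : args) {
            System.out.println(name + ": " + isAvailable(name));
        }
    }
}
```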
XML Parsing properties
xmlparser.sanitize-listings Boolean value that indicates whether JetS3t should apply a work-around to properly interpret S3 object names that contain carriage return characters. Because S3 returns object names in XML documents that can be misinterpreted by standard XML parsing tools, JetS3t performs an extra translation step to convert troublesome characters into an unambiguous format before the XML parsing takes place. This step may consume more memory and processing time, so you may want to disable it if you are sure that none of your object names contain problematic characters.
Default: true
Amazon DevPay Settings
As of version 0.7.0, you can use AWS DevPay S3 account credentials programmatically with the AWSDevPayCredentials class.
devpay.user-token The user token of an S3 account that is part of an Amazon DevPay product. You should only use this setting when you are interacting with a DevPay S3 account.
Default: N/A (This property is commented-out by default)
devpay.product-token The product token of an S3 account that is part of an Amazon DevPay product. You should only use this setting when you are interacting with a DevPay S3 account.
Default: N/A (This property is commented-out by default)
GUI properties
gui.verboseErrorDialog If true, JetS3t applications that display error dialogs will include more information about errors.
Default: true

Mimetypes

JetS3t applications, and any toolkit-based application that uses the Mimetypes utility class, can recognise the mime/content type of a file based on the file's extension. For example, .html files are recognised as text/html. A mime.types file with default settings is included in the configs directory of the JetS3t release.

Mimetype recognition is configured in the mime.types configuration file, which is exactly the same format as is used by the Apache Web Server (in fact, we pinched the whole mimetypes file from Apache).

The mime.types file is a standard text document that lists the recognised mime types, each on a new line. The mimetype is listed, followed by one or more tabs, then one or more space-separated file extensions that map to the mimetype. The best way to understand the file format is to look at some examples:

text/html               html htm
text/plain              asc txt
application/pdf         pdf
audio/mpeg              mpga mp2 mp3

In addition to explicit extension-to-mimetype mappings, you can specify a default mimetype to use for uploaded files that have an unrecognized or missing extension by defining a '*' extension in the mime.types config file. The default mimetype used by JetS3t is application/octet-stream but you can override this setting if you need to (though we do not recommend you do this).
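
The file format and the '*' fallback described above can be sketched as a small parser and lookup. This is an illustration, not the Mimetypes class itself: the class MimeTypesSketch is hypothetical, and skipping lines that start with '#' and lower-casing extensions before lookup are assumptions.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class MimeTypesSketch {

    // Parses mime.types text into a map from file extension to mimetype.
    static Map<String, String> parse(String text) {
        Map<String, String> byExtension = new HashMap<>();
        for (String line : text.split("\\R")) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // assumption: blank lines and '#' comments are skipped
            }
            String[] tokens = line.split("\\s+");
            // tokens[0] is the mimetype, the remaining tokens are its extensions.
            for (int i = 1; i < tokens.length; i++) {
                byExtension.put(tokens[i], tokens[0]);
            }
        }
        return byExtension;
    }

    // Looks up a file name, falling back to the '*' entry, then to application/octet-stream.
    static String lookup(Map<String, String> byExtension, String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot >= 0 ? fileName.substring(dot + 1).toLowerCase(Locale.ROOT) : "";
        String type = byExtension.get(ext);
        if (type != null) {
            return type;
        }
        return byExtension.getOrDefault("*", "application/octet-stream");
    }

    public static void main(String[] args) {
        Map<String, String> types = parse(
            "text/html\t\thtml htm\n" +
            "application/pdf\t\tpdf\n");
        System.out.println(lookup(types, "index.htm"));
        System.out.println(lookup(types, "archive.xyz"));
    }
}
```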

Ignore Files on Upload

JetS3t applications, and any toolkit-based application that uses the FileComparer utility class, will look for special ignore files named .jets3t-ignore when generating listings of local files to upload. These ignore files are plain text documents that list the names of files or directories, located in the same directory as the ignore file, that should not be uploaded. Any file or directory that matches a name or a pattern in this file is completely ignored, as if it did not exist.
Users of CVS will recognise this concept, as it copies the idea of .cvsignore files.

Ignore files are mainly useful when managing the backup of large file sets. For example, say you want to back up a folder called Documents but you don't want to back up a sub-folder in Documents called Private, nor do you want to back up all the .tmp files your word processor creates when you open a document. You could achieve this by creating a .jets3t-ignore file in the Documents folder with the contents:

Private
*.tmp

The .jets3t-ignore file format allows basic wild-card substitution:

  • Asterisk (*): matches zero or more characters
  • Question mark (?): matches exactly one character
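
The two wildcards can be illustrated by translating each ignore pattern into a regular expression. The sketch below handles single file or directory names only, and it is not the FileComparer implementation; the class IgnorePatternSketch and its methods are hypothetical.

```java
import java.util.regex.Pattern;

final class IgnorePatternSketch {

    // Converts an ignore pattern to a regex: '*' becomes ".*", '?' becomes ".",
    // and every other character is matched literally.
    static Pattern toRegex(String ignorePattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : ignorePattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // True if the name matches any of the given ignore patterns in full.
    static boolean isIgnored(String name, String... ignorePatterns) {
        for (String pattern : ignorePatterns) {
            if (toRegex(pattern).matcher(name).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] patterns = {"Private", "*.tmp"};
        for (String name : new String[] {"Private", "letter.tmp", "letter.doc"}) {
            System.out.println(name + " ignored: " + isIgnored(name, patterns));
        }
    }
}
```

With the example ignore file shown earlier, Private and letter.tmp are ignored while letter.doc is not.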

Ignoring nested paths

As of version 0.7.0, .jets3t-ignore files can contain paths that refer to deeply nested files or folders. These nested paths can also contain wildcards.

For example, if you have a private Finances subdirectory in year-based parent directories (e.g. 2007/Finances, 2008/Finances) you could ignore these with the following ignore file path:

*/Finances

There is also a special path wildcard (**) that allows you to ignore specific files or folders no matter how deeply they are nested. For example, to ignore all the CVS directories no matter where they occur you would use the following ignore path:

**/CVS

Prior versions

Note that in versions of JetS3t prior to 0.7.0, the contents of the ignore file applied only to the immediate directory containing the .jets3t-ignore file. For example, the wildcard string *.tmp would only ignore .tmp files in the current directory.

JMX Instrumentation

As of version 0.7.1 JetS3t includes a JMX Instrumentation module (contribs.mx) that makes it possible to track important events that occur in your S3 service. To enable JMX event tracking you need to set a system property prior to running your applications. For Java 1.6 set the system property jets3t.mx, or on earlier versions of Java set com.sun.management.jmxremote.

Instrumentation for S3Service events such as PUT, GET, HEAD, DELETE is enabled by default when JMX is turned on. Instrumentation is also available for tracking the S3Bucket and S3Object classes, but this is disabled by default. To enable this low-level reporting, set one or both of the following system properties to true: jets3t.bucket.mx, jets3t.object.mx.

For example, to run a JetS3t application with full JMX instrumentation enabled specify the following Java system property definitions on the application's command line:

  • Java 1.6: -Djets3t.mx -Djets3t.bucket.mx=true -Djets3t.object.mx=true
  • Prior versions: -Dcom.sun.management.jmxremote -Djets3t.bucket.mx=true -Djets3t.object.mx=true
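
The same system properties can in principle be set from code instead of the command line. The sketch below does so for the Java 1.6 case; it assumes, without confirmation from this guide, that JetS3t reads the properties only after they have been set, so the call would need to happen before any JetS3t class is used. The class JmxFlagsSketch is hypothetical.

```java
final class JmxFlagsSketch {

    // Sets the properties that the guide passes as -D options for Java 1.6.
    // Assumption: doing this before JetS3t is first used has the same effect
    // as supplying the options on the command line.
    static void enableFullInstrumentation() {
        System.setProperty("jets3t.mx", "");            // equivalent of -Djets3t.mx
        System.setProperty("jets3t.bucket.mx", "true"); // equivalent of -Djets3t.bucket.mx=true
        System.setProperty("jets3t.object.mx", "true"); // equivalent of -Djets3t.object.mx=true
    }

    public static void main(String[] args) {
        enableFullInstrumentation();
        System.out.println("jets3t.mx set: " + (System.getProperty("jets3t.mx") != null));
        System.out.println("jets3t.bucket.mx=" + System.getProperty("jets3t.bucket.mx"));
        System.out.println("jets3t.object.mx=" + System.getProperty("jets3t.object.mx"));
    }
}
```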