public class FileComparer
extends java.lang.Object
File comparisons are based primarily on MD5 hashes of the files' contents. If a local file does not match an object in the service with the same name, this utility determines which of the two items is newer by comparing their last modified dates.
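The comparison rule above can be sketched in plain Java. This is a minimal illustration, not JetS3t's implementation: the `Verdict` enum and the `compare` helper are hypothetical names, the tie-breaking rule is an assumption, and the hashes and timestamps are passed in directly rather than read from files or a service.

```java
import java.util.Arrays;

class CompareSketch {
    // Hypothetical outcome names, chosen for this illustration only.
    enum Verdict { IN_SYNC, LOCAL_NEWER, REMOTE_NEWER }

    // Hashes are compared first; dates are consulted only on a mismatch.
    static Verdict compare(byte[] localMd5, long localModifiedMillis,
                           byte[] remoteMd5, long remoteModifiedMillis) {
        if (Arrays.equals(localMd5, remoteMd5)) {
            return Verdict.IN_SYNC;
        }
        // Assumption: equal timestamps are treated as the local copy being newer.
        return localModifiedMillis >= remoteModifiedMillis
                ? Verdict.LOCAL_NEWER
                : Verdict.REMOTE_NEWER;
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {1, 2, 4};
        System.out.println(compare(a, 1000L, a.clone(), 5000L)); // prints IN_SYNC
        System.out.println(compare(a, 1000L, b, 5000L));         // prints REMOTE_NEWER
    }
}
```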
Modifier and Type | Class and Description |
---|---|
class | FileComparer.PartialObjectListing |
Constructor and Description |
---|
FileComparer(Jets3tProperties jets3tProperties): Constructs the class. |
Modifier and Type | Method and Description |
---|---|
FileComparerResults | buildDiscrepancyLists(java.util.Map<java.lang.String,java.lang.String> objectKeyToFilepathMap, java.util.Map<java.lang.String,StorageObject> objectsMap): Compares the contents of a directory on the local file system with the contents of a service resource. |
FileComparerResults | buildDiscrepancyLists(java.util.Map<java.lang.String,java.lang.String> objectKeyToFilepathMap, java.util.Map<java.lang.String,StorageObject> objectsMap, BytesProgressWatcher progressWatcher): Compares the contents of a directory on the local file system with the contents of a service resource. |
FileComparerResults | buildDiscrepancyLists(java.util.Map<java.lang.String,java.lang.String> objectKeyToFilepathMap, java.util.Map<java.lang.String,StorageObject> objectsMap, BytesProgressWatcher progressWatcher, boolean isForceUpload): Compares the contents of a directory on the local file system with the contents of a service resource. |
protected java.util.List<java.util.regex.Pattern> | buildIgnoreRegexpList(java.io.File directory, java.util.List<java.util.regex.Pattern> parentIgnorePatternList): If a .jets3t-ignore file is present in the given directory, the file is read and all the paths contained in it are converted to regular expression Pattern objects. |
java.util.Map<java.lang.String,java.lang.String> | buildObjectKeyToFilepathMap(java.io.File[] fileList, java.lang.String fileKeyPrefix, boolean includeDirectories): Builds a map of files and directories that exist on the local system, where the map keys are the object key names that will be used for the files in a remote storage service, and the map values are absolute paths (Strings) to those files in the local file system. |
protected void | buildObjectKeyToFilepathMapForDirectory(java.io.File directory, java.lang.String fileKeyPrefix, java.util.Map<java.lang.String,java.lang.String> objectKeyToFilepathMap, boolean includeDirectories, java.util.List<java.util.regex.Pattern> parentIgnorePatternList): Recursively builds a map of object key names to file paths that contains all the files and directories inside the given directory. |
java.util.Map<java.lang.String,StorageObject> | buildObjectMap(StorageService service, java.lang.String bucketName, java.lang.String targetPath, java.util.Map<java.lang.String,java.lang.String> objectKeyToFilepathMap, boolean forceMetadataDownload, boolean isForceUpload, BytesProgressWatcher progressWatcher, StorageServiceEventListener eventListener): Builds a service Object Map containing all the objects within the given target path, where the map's key for each object is the relative path to the object. |
FileComparer.PartialObjectListing | buildObjectMapPartial(StorageService service, java.lang.String bucketName, java.lang.String targetPath, java.lang.String priorLastKey, java.util.Map<java.lang.String,java.lang.String> objectKeyToFilepathMap, boolean completeListing, boolean forceMetadataDownload, boolean isForceUpload, BytesProgressWatcher progressWatcher, StorageServiceEventListener eventListener): Builds a service Object Map containing a partial set of objects within the given target path, where the map's key for each object is the relative path to the object. |
byte[] | generateFileMD5Hash(java.io.File file, java.lang.String relativeFilePath, BytesProgressWatcher progressWatcher) |
static FileComparer | getInstance() |
static FileComparer | getInstance(Jets3tProperties jets3tProperties) |
java.io.File | getMd5FilesRootDirectoryFile() |
protected java.io.File | getPreComputedHashFile(java.io.File file, java.lang.String relativeFilePath) |
boolean | isAssumeLocalLatestInMismatch() |
boolean | isGenerateMd5Files() |
protected boolean | isIgnored(java.util.List<java.util.regex.Pattern> ignorePatternList, java.io.File file): Determines whether a file should be ignored when building a file map. |
boolean | isSkipMd5FileUpload() |
boolean | isSkipSymlinks() |
boolean | isUseMd5Files() |
StorageObject[] | listObjectsThreaded(StorageService service, java.lang.String bucketName, java.lang.String targetPath): Lists the objects in a bucket using a partitioning technique to divide the object namespace into separate partitions that can be listed by multiple simultaneous threads. |
StorageObject[] | listObjectsThreaded(StorageService service, java.lang.String bucketName, java.lang.String targetPath, java.lang.String delimiter, int toDepth): Lists the objects in a bucket using a partitioning technique to divide the object namespace into separate partitions that can be listed by multiple simultaneous threads. |
byte[] | lookupFileMD5Hash(java.io.File file, java.lang.String relativeFilePath): Returns the pre-generated MD5 hash value of a file, as previously stored by JetS3t (or another program) in an .md5 file corresponding to the given file. |
java.util.Map<java.lang.String,StorageObject> | lookupObjectMetadataForPotentialClashes(StorageService service, java.lang.String bucketName, java.lang.String targetPath, StorageObject[] objectsWithoutMetadata, java.util.Map<java.lang.String,java.lang.String> objectKeyToFilepathMap, boolean forceMetadataDownload, boolean isForceUpload, BytesProgressWatcher progressWatcher, StorageServiceEventListener eventListener): Given a set of storage objects for which only minimal information is available, retrieves metadata information for any objects that potentially clash with local files. |
protected java.lang.String | normalizeUnicode(java.lang.String str): Normalizes a string into "Normalization Form Canonical Decomposition" (NFD). |
java.util.Map<java.lang.String,StorageObject> | populateObjectMap(java.lang.String targetPath, StorageObject[] objects): Builds a map of key/object pairs, where each object is associated with a key based on its location in the service target path. |
public FileComparer(Jets3tProperties jets3tProperties)
jets3tProperties
- the object containing the properties that will be applied in this class.
public static FileComparer getInstance(Jets3tProperties jets3tProperties)
jets3tProperties
- the object containing the properties that will be applied in the instance.
public static FileComparer getInstance()
protected java.util.List<java.util.regex.Pattern> buildIgnoreRegexpList(java.io.File directory, java.util.List<java.util.regex.Pattern> parentIgnorePatternList)
If a .jets3t-ignore file is present in the given directory, the file is read
and all the paths contained in it are converted to regular expression Pattern objects.
If the parent directory's list of patterns is provided, any relevant patterns are also
added to the ignore listing. Relevant parent patterns are those with a directory prefix
that matches the current directory, or with the wildcard depth pattern (*.*./).
directory
- a directory that may contain a .jets3t-ignore file. If this parameter is null
or is actually a file and not a directory, an empty list will be returned.
parentIgnorePatternList
- a list of Patterns that were applied to the parent directory of the given directory. If this
parameter is null, no parent ignore patterns are applied.
protected boolean isIgnored(java.util.List<java.util.regex.Pattern> ignorePatternList, java.io.File file)
ignorePatternList
- a list of Pattern objects representing the file names to ignore.
file
- a file that will either be ignored or not, depending on whether it matches an ignore Pattern
or is a symlink/alias.
protected java.lang.String normalizeUnicode(java.lang.String str)
str
-
public java.util.Map<java.lang.String,java.lang.String> buildObjectKeyToFilepathMap(java.io.File[] fileList, java.lang.String fileKeyPrefix, boolean includeDirectories)
A file/directory hierarchy is represented using '/' delimiter characters in object key names.
Any file or directory matching a path in a .jets3t-ignore
file will be ignored.
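The key naming used by this method can be illustrated with a small standalone sketch. It is not the JetS3t code: `KeyMapSketch`, `buildKeyMap` and `makeSampleTree` are hypothetical names, ignore files and symlinks are left out, and leaving a prefix that already ends in '/' unchanged is an assumption.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.Map;
import java.util.TreeMap;

class KeyMapSketch {
    // Maps '/'-delimited key names to absolute local paths for the given files.
    static Map<String, String> buildKeyMap(File[] files, String prefix, boolean includeDirectories) {
        Map<String, String> map = new TreeMap<>();
        // A non-empty prefix gets a '/' suffix; null or empty means no prefix.
        String base = (prefix == null || prefix.isEmpty())
                ? ""
                : (prefix.endsWith("/") ? prefix : prefix + "/");
        for (File f : files) {
            add(f, base, map, includeDirectories);
        }
        return map;
    }

    private static void add(File f, String base, Map<String, String> map, boolean includeDirectories) {
        if (f.isDirectory()) {
            String dirKey = base + f.getName() + "/";
            if (includeDirectories) {
                map.put(dirKey, f.getAbsolutePath()); // place-holder entry with a trailing slash
            }
            File[] children = f.listFiles();
            if (children != null) {
                for (File child : children) {
                    add(child, dirKey, map, includeDirectories);
                }
            }
        } else {
            map.put(base + f.getName(), f.getAbsolutePath());
        }
    }

    // Creates a small temporary tree: docs/a.txt plus an empty directory named "empty".
    static File makeSampleTree() {
        try {
            File root = Files.createTempDirectory("keymap").toFile();
            File docs = new File(root, "docs");
            docs.mkdir();
            new File(docs, "a.txt").createNewFile();
            new File(root, "empty").mkdir();
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> map = buildKeyMap(makeSampleTree().listFiles(), "backup", true);
        for (String key : map.keySet()) {
            System.out.println(key);
        }
    }
}
```

With `includeDirectories` set to false, the empty directory produces no entry at all, which is why empty directories cannot then be stored in the service.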
fileList
- the set of files and directories to include in the file map.
fileKeyPrefix
- A prefix added to each file path key in the map, e.g. the name of the root directory the
files belong to. If provided, a '/' suffix is always added to the end of the prefix. If null
or empty, no prefix is used.
includeDirectories
- If true, all directories, including empty ones, will be included in the Map. These directories
will be mere place-holder objects with a trailing slash (/) character in the name and the
content type Mimetypes.MIMETYPE_BINARY_OCTET_STREAM.
If this variable is false, directory objects will not be included in the Map, and it will not
be possible to store empty directories in the service.
protected void buildObjectKeyToFilepathMapForDirectory(java.io.File directory, java.lang.String fileKeyPrefix, java.util.Map<java.lang.String,java.lang.String> objectKeyToFilepathMap, boolean includeDirectories, java.util.List<java.util.regex.Pattern> parentIgnorePatternList)
A file/directory hierarchy is represented using '/' delimiter characters in object key names.
Any file or directory matching a path in a .jets3t-ignore
file will be ignored.
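How paths from a .jets3t-ignore file might become regular expressions can be sketched as follows. The wildcard rules here ('*' for any run of characters, '?' for a single character, everything else literal) are an assumption made for illustration, and `IgnoreSketch` with its methods is hypothetical rather than part of JetS3t.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class IgnoreSketch {
    // Converts one ignore-file line into a regular expression.
    // Assumed rules: '*' matches any run of characters, '?' matches exactly one.
    static Pattern toPattern(String ignorePath) {
        StringBuilder regex = new StringBuilder();
        for (char c : ignorePath.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c))); // literal character
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Converts every non-blank line into a Pattern.
    static List<Pattern> toPatterns(List<String> lines) {
        List<Pattern> patterns = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                patterns.add(toPattern(trimmed));
            }
        }
        return patterns;
    }

    // A name is ignored when any pattern matches it in full.
    static boolean isIgnored(List<Pattern> patterns, String fileName) {
        for (Pattern p : patterns) {
            if (p.matcher(fileName).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Pattern> patterns = toPatterns(Arrays.asList("*.log", "tmp?"));
        System.out.println(isIgnored(patterns, "build.log")); // prints true
        System.out.println(isIgnored(patterns, "notes.txt")); // prints false
    }
}
```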
directory
- The directory containing the files/directories of interest. The directory is not
included in the result map.
fileKeyPrefix
- A prefix added to each file path key in the map, e.g. the name of the root directory the
files belong to. This prefix must end with a '/' character.
objectKeyToFilepathMap
- map of '/'-delimited object key names to local file absolute paths, to which this method adds items.
includeDirectories
- If true, all directories, including empty ones, will be included in the Map. These directories
will be mere place-holder objects with a trailing slash (/) character in the name and the
content type Mimetypes.MIMETYPE_BINARY_OCTET_STREAM.
If this variable is false, directory objects will not be included in the Map, and it will not
be possible to store empty directories in the service.
parentIgnorePatternList
- a list of Patterns that were applied to the parent directory of the given directory. This list
will be checked to see if any of the parent's patterns should apply to the current directory.
See buildIgnoreRegexpList(File, List) for more information.
If this parameter is null, no parent ignore patterns are applied.
public StorageObject[] listObjectsThreaded(StorageService service, java.lang.String bucketName, java.lang.String targetPath, java.lang.String delimiter, int toDepth) throws ServiceException
This partitioning technique will work best for buckets with many objects that are divided into a number of virtual subdirectories of roughly equal size.
service
- the service object that will be used to perform listing requests.
bucketName
- the name of the bucket whose contents will be listed.
targetPath
- a root path within the bucket to be listed. If this parameter is null, all
the bucket's objects will be listed. Otherwise, only the objects below the
virtual path specified will be listed.
delimiter
- the delimiter string used to identify virtual subdirectory partitions
in a bucket. If this parameter is null, or it has a value that is not
present in your object names, no partitioning will take place.
toDepth
- the number of delimiter levels this method will traverse to identify
subdirectory partitions. If this value is zero, no partitioning will take
place.
ServiceException
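The partitioning idea can be sketched without a storage service by treating an in-memory list of keys as the bucket. Everything below is illustrative: `PartitionListingSketch`, `partitionOf` and `listThreaded` are hypothetical names, the pool size of 4 is arbitrary, and a real listing would discover partitions through delimiter-based listing requests rather than by scanning every key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PartitionListingSketch {
    // Returns the prefix of the key spanning at most toDepth delimiter levels.
    // A null delimiter or a depth of zero puts every key in the single partition "".
    static String partitionOf(String key, String delimiter, int toDepth) {
        if (delimiter == null || delimiter.isEmpty()) {
            return "";
        }
        int end = 0;
        for (int depth = 0; depth < toDepth; depth++) {
            int idx = key.indexOf(delimiter, end);
            if (idx < 0) {
                break;
            }
            end = idx + delimiter.length();
        }
        return key.substring(0, end);
    }

    // Lists each partition on its own worker task, then merges the results in sorted order.
    static List<String> listThreaded(List<String> bucketKeys, String delimiter, int toDepth) {
        Set<String> partitions = new TreeSet<>();
        for (String key : bucketKeys) {
            partitions.add(partitionOf(key, delimiter, toDepth));
        }
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (String partition : partitions) {
                // Stand-in for a service listing request restricted to one partition.
                Callable<List<String>> task = () -> {
                    List<String> found = new ArrayList<>();
                    for (String key : bucketKeys) {
                        if (partitionOf(key, delimiter, toDepth).equals(partition)) {
                            found.add(key);
                        }
                    }
                    return found;
                };
                futures.add(pool.submit(task));
            }
            Set<String> merged = new TreeSet<>();
            for (Future<List<String>> future : futures) {
                merged.addAll(future.get());
            }
            return new ArrayList<>(merged);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a/b/c.txt", "a/x.txt", "top.txt", "d/e/f.txt");
        System.out.println(listThreaded(keys, "/", 2));
    }
}
```

Partitions of roughly equal size keep all worker threads busy for about the same time, which is why the technique suits evenly divided buckets.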
public StorageObject[] listObjectsThreaded(StorageService service, java.lang.String bucketName, java.lang.String targetPath) throws ServiceException
This partitioning technique will work best for buckets with many objects that are divided into a number of virtual subdirectories of roughly equal size.
The delimiter and depth properties that define how this method will
partition the bucket's namespace are set in the jets3t.properties file
with the setting:
filecomparer.bucket-listing.<bucketname>=<delim>,<depth>
For example: filecomparer.bucket-listing.my-bucket=/,2
service
- the service object that will be used to perform listing requests.
bucketName
- the name of the bucket whose contents will be listed.
targetPath
- a root path within the bucket to be listed. If this parameter is null, all
the bucket's objects will be listed. Otherwise, only the objects below the
virtual path specified will be listed.
ServiceException
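As an illustration of the filecomparer.bucket-listing setting format, the sketch below reads the delimiter and depth for a bucket from a `java.util.Properties` object. The `BucketListingConfigSketch` class and its parsing rule (split at the last comma) are assumptions for this example and are not taken from JetS3t.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class BucketListingConfigSketch {
    // Hypothetical holder for the two parts of the setting.
    static final class Partitioning {
        final String delimiter;
        final int depth;

        Partitioning(String delimiter, int depth) {
            this.delimiter = delimiter;
            this.depth = depth;
        }
    }

    // Returns null when no setting exists for the bucket.
    static Partitioning lookup(Properties props, String bucketName) {
        String value = props.getProperty("filecomparer.bucket-listing." + bucketName);
        if (value == null) {
            return null;
        }
        // Assumption: the depth follows the last comma, so a delimiter may itself contain commas.
        int comma = value.lastIndexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("Expected <delim>,<depth> but found: " + value);
        }
        String delimiter = value.substring(0, comma);
        int depth = Integer.parseInt(value.substring(comma + 1).trim());
        return new Partitioning(delimiter, depth);
    }

    // Loads properties from text, as they would appear in a properties file.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse("filecomparer.bucket-listing.my-bucket=/,2\n");
        Partitioning p = lookup(props, "my-bucket");
        System.out.println(p.delimiter + " " + p.depth); // prints "/ 2"
    }
}
```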
public java.util.Map<java.lang.String,StorageObject> buildObjectMap(StorageService service, java.lang.String bucketName, java.lang.String targetPath, java.util.Map<java.lang.String,java.lang.String> objectKeyToFilepathMap, boolean forceMetadataDownload, boolean isForceUpload, BytesProgressWatcher progressWatcher, StorageServiceEventListener eventListener) throws ServiceException
service
-
bucketName
-
targetPath
-
objectKeyToFilepathMap
- map of '/'-delimited object key names to local file absolute paths.
forceMetadataDownload
- if true, metadata is always downloaded for objects in the storage service. If false,
metadata is only downloaded if deemed necessary. This flag should be set to true when
data for any objects in the storage service has been transformed, such as by
encryption or compression during upload.
isForceUpload
- set to true if the calling tool will upload files regardless of the comparison, so this
method will avoid any unnecessary and potentially expensive data/date comparison checks.
progressWatcher
- watcher to monitor bytes read during comparison operations, may be null.
eventListener
-
ServiceException
lookupObjectMetadataForPotentialClashes(StorageService, String, String, StorageObject[], Map, boolean, boolean, BytesProgressWatcher, StorageServiceEventListener)
public FileComparer.PartialObjectListing buildObjectMapPartial(StorageService service, java.lang.String bucketName, java.lang.String targetPath, java.lang.String priorLastKey, java.util.Map<java.lang.String,java.lang.String> objectKeyToFilepathMap, boolean completeListing, boolean forceMetadataDownload, boolean isForceUpload, BytesProgressWatcher progressWatcher, StorageServiceEventListener eventListener) throws ServiceException
If the method is asked to perform a complete listing, it will use the
listObjectsThreaded(StorageService, String, String)
method to list the objects
in the bucket, potentially taking advantage of any bucket name partitioning
settings you have applied.
If the method is asked to perform only a partial listing, no bucket name partitioning will be applied.
service
-
bucketName
-
targetPath
-
priorLastKey
- the prior last key value returned by a prior invocation of this method, if any.
objectKeyToFilepathMap
- map of '/'-delimited object key names to local file absolute paths.
forceMetadataDownload
- if true, metadata is always downloaded for objects in the storage service. If false,
metadata is only downloaded if deemed necessary. This flag should be set to true when
data for any objects in the storage service has been transformed, such as by
encryption or compression during upload.
isForceUpload
- set to true if the calling tool will upload files regardless of the comparison, so this
method will avoid any unnecessary and potentially expensive data/date comparison checks.
completeListing
- if true, this method will perform a complete listing of a service target.
If false, the method will list a partial set of objects commencing from the
given prior last key.
progressWatcher
- watcher to monitor bytes read during comparison operations, may be null.
eventListener
-
ServiceException
lookupObjectMetadataForPotentialClashes(StorageService, String, String, StorageObject[], Map, boolean, boolean, BytesProgressWatcher, StorageServiceEventListener)
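The partial listing mode lends itself to a loop that feeds each result's prior last key into the next call. The sketch below shows that loop against an in-memory sorted key list. `PartialListingSketch`, `Page` and `listPage` are hypothetical stand-ins, and treating a null prior last key as "listing complete" is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PartialListingSketch {
    // Hypothetical result of one partial listing.
    static final class Page {
        final List<String> keys;
        final String priorLastKey; // assumed to be null once the listing is complete

        Page(List<String> keys, String priorLastKey) {
            this.keys = keys;
            this.priorLastKey = priorLastKey;
        }
    }

    // Returns up to pageSize keys that sort after priorLastKey (or from the start if it is null).
    static Page listPage(List<String> sortedKeys, String priorLastKey, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be at least 1");
        }
        int i = 0;
        // Skip everything up to and including the prior last key.
        while (priorLastKey != null && i < sortedKeys.size()
                && sortedKeys.get(i).compareTo(priorLastKey) <= 0) {
            i++;
        }
        List<String> keys = new ArrayList<>();
        while (i < sortedKeys.size() && keys.size() < pageSize) {
            keys.add(sortedKeys.get(i++));
        }
        boolean more = i < sortedKeys.size();
        return new Page(keys, more ? keys.get(keys.size() - 1) : null);
    }

    // Repeats partial listings until no prior last key is returned.
    static List<String> listAll(List<String> sortedKeys, int pageSize) {
        List<String> all = new ArrayList<>();
        String prior = null;
        do {
            Page page = listPage(sortedKeys, prior, pageSize);
            all.addAll(page.keys);
            prior = page.priorLastKey;
        } while (prior != null);
        return all;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(listAll(keys, 2)); // prints [a, b, c, d, e]
    }
}
```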
public java.util.Map<java.lang.String,StorageObject> lookupObjectMetadataForPotentialClashes(StorageService service, java.lang.String bucketName, java.lang.String targetPath, StorageObject[] objectsWithoutMetadata, java.util.Map<java.lang.String,java.lang.String> objectKeyToFilepathMap, boolean forceMetadataDownload, boolean isForceUpload, BytesProgressWatcher progressWatcher, StorageServiceEventListener eventListener) throws ServiceException
service
-
bucketName
-
targetPath
-
objectsWithoutMetadata
-
objectKeyToFilepathMap
-
forceMetadataDownload
- if true, metadata is always downloaded for objects in the storage service. If false,
metadata is only downloaded if deemed necessary. This flag should be set to true when
data for any objects in the storage service has been transformed, such as by
encryption or compression during upload.
isForceUpload
- set to true if the calling tool will upload files regardless of the comparison, so this
method will avoid any unnecessary and potentially expensive data/date comparison checks.
progressWatcher
- watcher to monitor bytes read during comparison operations, may be null.
eventListener
-
ServiceException
populateObjectMap(String, StorageObject[])
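One way to picture a "potential clash" is an object whose key, taken relative to the target path, also appears among the local keys. The sketch below applies that reading. It is a simplified assumption rather than the library's actual rule, and `ClashSketch`, `relativeKey` and `keysNeedingMetadata` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ClashSketch {
    // Strips the target path (plus a '/' separator) from the front of an object key.
    static String relativeKey(String targetPath, String objectKey) {
        if (targetPath == null || targetPath.isEmpty()) {
            return objectKey;
        }
        String prefix = targetPath.endsWith("/") ? targetPath : targetPath + "/";
        return objectKey.startsWith(prefix) ? objectKey.substring(prefix.length()) : objectKey;
    }

    // Selects the object keys whose metadata would be fetched under the assumed rule:
    // always when forced, otherwise only when a local file shares the relative key.
    static List<String> keysNeedingMetadata(String targetPath, List<String> objectKeys,
                                            Set<String> localKeys, boolean forceMetadataDownload) {
        List<String> result = new ArrayList<>();
        for (String objectKey : objectKeys) {
            if (forceMetadataDownload || localKeys.contains(relativeKey(targetPath, objectKey))) {
                result.add(objectKey);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> remote = Arrays.asList("backup/a.txt", "backup/b.txt");
        Set<String> local = new HashSet<>(Arrays.asList("a.txt", "c.txt"));
        System.out.println(keysNeedingMetadata("backup", remote, local, false)); // prints [backup/a.txt]
    }
}
```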
public java.util.Map<java.lang.String,StorageObject> populateObjectMap(java.lang.String targetPath, StorageObject[] objects)
targetPath
-
objects
-
protected java.io.File getPreComputedHashFile(java.io.File file, java.lang.String relativeFilePath) throws java.io.IOException
java.io.IOException
public byte[] lookupFileMD5Hash(java.io.File file, java.lang.String relativeFilePath) throws java.io.IOException
file
-
relativeFilePath
-
java.io.IOException
public byte[] generateFileMD5Hash(java.io.File file, java.lang.String relativeFilePath, BytesProgressWatcher progressWatcher) throws java.io.IOException, java.security.NoSuchAlgorithmException
file
-
relativeFilePath
-
progressWatcher
- watcher to monitor bytes read during comparison operations, may be null.
java.io.IOException
java.security.NoSuchAlgorithmException
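For readers unfamiliar with how an MD5 hash is computed while reporting progress, here is a minimal sketch using only the JDK. It is not the body of generateFileMD5Hash: `Md5Sketch` is a hypothetical class, and a `LongConsumer` callback stands in for BytesProgressWatcher, whose API is not shown on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.function.LongConsumer;

class Md5Sketch {
    // Streams the input through an MD5 digest, reporting each chunk size to the callback.
    static byte[] md5(InputStream in, LongConsumer progress)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
            if (progress != null) { // the callback is optional, like the watcher parameter
                progress.accept(read);
            }
        }
        return digest.digest();
    }

    // Formats the hash as lowercase hexadecimal.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Convenience wrapper for in-memory data that converts checked exceptions.
    static String md5Hex(byte[] data, LongConsumer progress) {
        try (InputStream in = new ByteArrayInputStream(data)) {
            return toHex(md5(in, progress));
        } catch (IOException | NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(md5Hex(data, n -> System.out.println("read " + n + " bytes")));
        // The final line printed is 900150983cd24fb0d6963f7d28e17f72
    }
}
```

To hash a real file, the same `md5` method can be given a `java.io.FileInputStream` opened on that file.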
public FileComparerResults buildDiscrepancyLists(java.util.Map<java.lang.String,java.lang.String> objectKeyToFilepathMap, java.util.Map<java.lang.String,StorageObject> objectsMap) throws java.security.NoSuchAlgorithmException, java.io.FileNotFoundException, java.io.IOException, java.text.ParseException
objectKeyToFilepathMap
- map of '/'-delimited object key names to local file absolute paths.
objectsMap
- a map of keys to StorageObjects.
java.security.NoSuchAlgorithmException
java.io.FileNotFoundException
java.io.IOException
java.text.ParseException
public FileComparerResults buildDiscrepancyLists(java.util.Map<java.lang.String,java.lang.String> objectKeyToFilepathMap, java.util.Map<java.lang.String,StorageObject> objectsMap, BytesProgressWatcher progressWatcher) throws java.security.NoSuchAlgorithmException, java.io.FileNotFoundException, java.io.IOException, java.text.ParseException
objectKeyToFilepathMap
- map of '/'-delimited object key names to local file absolute paths.
objectsMap
- a map of keys to StorageObjects.
progressWatcher
- watcher to monitor bytes read during comparison operations, may be null.
java.security.NoSuchAlgorithmException
java.io.FileNotFoundException
java.io.IOException
java.text.ParseException
public FileComparerResults buildDiscrepancyLists(java.util.Map<java.lang.String,java.lang.String> objectKeyToFilepathMap, java.util.Map<java.lang.String,StorageObject> objectsMap, BytesProgressWatcher progressWatcher, boolean isForceUpload) throws java.security.NoSuchAlgorithmException, java.io.FileNotFoundException, java.io.IOException, java.text.ParseException
objectKeyToFilepathMap
- map of '/'-delimited object key names to local file absolute paths.
objectsMap
- a map of keys to StorageObjects.
progressWatcher
- watcher to monitor bytes read during comparison operations, may be null.
isForceUpload
- set to true if the calling tool will upload files regardless of the comparison, so this
method will avoid any unnecessary and potentially expensive data/date comparison checks.
java.security.NoSuchAlgorithmException
java.io.FileNotFoundException
java.io.IOException
java.text.ParseException
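Building discrepancy lists amounts to sorting keys into groups according to where they exist and whether their hashes agree. The sketch below shows that grouping over two plain maps of key to hash string. The `Result` class and its field names are hypothetical and do not describe the fields of FileComparerResults.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class DiscrepancySketch {
    // Hypothetical grouping of keys produced by the comparison.
    static final class Result {
        final Set<String> onlyLocal = new TreeSet<>();
        final Set<String> onlyRemote = new TreeSet<>();
        final Set<String> inSync = new TreeSet<>();
        final Set<String> mismatched = new TreeSet<>();
    }

    // Classifies every key from either side by presence and hash equality.
    static Result compare(Map<String, String> localHashes, Map<String, String> remoteHashes) {
        Result result = new Result();
        for (Map.Entry<String, String> entry : localHashes.entrySet()) {
            String remoteHash = remoteHashes.get(entry.getKey());
            if (remoteHash == null) {
                result.onlyLocal.add(entry.getKey());
            } else if (remoteHash.equals(entry.getValue())) {
                result.inSync.add(entry.getKey());
            } else {
                result.mismatched.add(entry.getKey()); // dates would decide which side is newer
            }
        }
        for (String key : remoteHashes.keySet()) {
            if (!localHashes.containsKey(key)) {
                result.onlyRemote.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> local = new HashMap<>();
        local.put("a.txt", "h1");
        local.put("b.txt", "h2");
        local.put("c.txt", "h3");
        Map<String, String> remote = new HashMap<>();
        remote.put("a.txt", "h1");
        remote.put("b.txt", "changed");
        remote.put("d.txt", "h4");
        Result r = compare(local, remote);
        System.out.println(r.inSync + " " + r.mismatched + " " + r.onlyLocal + " " + r.onlyRemote);
        // prints [a.txt] [b.txt] [c.txt] [d.txt]
    }
}
```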
public boolean isSkipSymlinks()
public boolean isUseMd5Files()
public boolean isGenerateMd5Files()
public boolean isSkipMd5FileUpload()
public boolean isAssumeLocalLatestInMismatch()
public java.io.File getMd5FilesRootDirectoryFile() throws java.io.FileNotFoundException
java.io.FileNotFoundException