I did a heap of aws s3 sync ./Documents s3://myBucket/Documents
commands on a MacBook to upload files to S3.
When I went to download the files from S3 to Windows, I found that there were many characters that the MacBook and S3 would happily accept but Windows wouldn't allow. Some examples I can think of:
- A folder on the Mac with a trailing space in its name:
/Path To/File with trailing space /filename here.jpeg
- A folder with trailing dots:
/Path To/Folder with trailing dots.../filename here2.jpeg
- macOS will happily allow double quotes ", asterisks *, greater-than > and less-than < symbols, the pipe character | and question marks ? in a file or folder name.
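The rules in the list above can be sketched as a small shell check. This is only an illustration: `is_windows_safe` is a hypothetical helper name, the sample keys are made up from the examples above, and the check is not exhaustive (it ignores control characters and reserved device names such as CON).

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit 0) only when the key contains none of the
# characters Windows rejects, and no path segment ends in a space or a dot.
is_windows_safe() {
  # First alternative: a reserved character anywhere in the key.
  # Second alternative: a space or dot right before a "/" or at the end of the key.
  ! printf '%s\n' "$1" | grep -Eq '[<>:"\\|?*]|[ .](/|$)'
}

# Made-up sample keys based on the examples above.
for key in \
  'Documents/Path To/filename here.jpeg' \
  'Documents/File with trailing space /filename here.jpeg' \
  'Documents/Folder with trailing dots.../filename here2.jpeg' \
  'Documents/what?.jpeg'
do
  if is_windows_safe "$key"; then
    printf 'ok      %s\n' "$key"
  else
    printf 'rename  %s\n' "$key"
  fi
done
```

Only the first sample key is reported as ok; the other three each break one of the rules.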
So to remedy this after the files had been uploaded, I did the following.
Get a list of your S3 objects:
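```bash
#!/bin/bash
# awscommands.sh
# usage: ./awscommands.sh docs   (writes the full list to "docs" and the problem keys to "docsout")

# get a list of objects in your bucket, one key per line
aws s3api list-objects --bucket myBucket --prefix Documents --query 'Contents[].[Key]' --output text > "$1"

# check the list for the illegal characters, plus a space next to a slash
# (leading or trailing space in a folder name), and write the matches to a file
grep -e '*' -e '|' -e '<' -e '>' -e '\\' -e '?' -e '"' -e ':' -e ' /' -e '/ ' "$1" > "${1}out"
```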
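Loop through the filenames and run aws s3 mv against each of them with the following script, run as php fixfilenames.php. It reads the docsout file, so the listing script above needs to have been run with docs as its argument.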
```php
<?php
# fixfilenames.php

// read the keys without their line endings; trim() would also eat spaces that belong to a key
$lines = file('docsout', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$s3 = 's3://myBucket/';

foreach ($lines as $key) {
    $fixed = filterFilename($key);
    if ($fixed === $key) {
        continue; // nothing to rename
    }
    $arg1 = escapeshellarg($s3 . $key);
    $arg2 = escapeshellarg($s3 . $fixed);
    $cmd = 'aws s3 mv '; // add --dryrun here to preview the moves first
    $fullCommand = $cmd . $arg1 . ' ' . $arg2;
    echo $fullCommand . "\n";
    $ret = shell_exec($fullCommand);
    echo $ret . "\n";
}

/**
 * filterFilename
 * @param string $name the file name string
 * @return string
 */
function filterFilename($name)
{
    // remove illegal file system characters https://en.wikipedia.org/wiki/Filename#Reserved_characters_and_words
    $name = str_replace(
        array_merge(
            array_map('chr', range(0, 31)),
            array('<', '>', ':', '"', '\\', '|', '?', '*')
        ),
        '',
        $name
    );
    // maximise filename length to 255 bytes http://serverfault.com/a/9548/44086
    // $ext = pathinfo($name, PATHINFO_EXTENSION);
    // $name = mb_strcut(pathinfo($name, PATHINFO_FILENAME), 0, 255 - ($ext ? strlen($ext) + 1 : 0), mb_detect_encoding($name)) . ($ext ? '.' . $ext : '');
    $name = preg_replace('/\s+\//', '/', $name); # this strips out trailing spaces in folder names
    $name = preg_replace('/\/\s+/', '/', $name); # this strips out leading spaces in folder names
    return $name;
}
```
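Stripping characters can make two different keys collapse to the same cleaned name, and moving the second one onto the first would overwrite it. Before running the renames for real, a check along these lines may help. It is a rough sketch: `sanitize_key` and `find_collisions` are hypothetical names, the shell filter only approximates filterFilename() (it does not remove control characters), and the sample keys are invented.

```shell
#!/bin/bash
# Hypothetical shell approximation of filterFilename():
# delete the reserved characters, then drop whitespace on either side of a "/".
sanitize_key() {
  printf '%s\n' "$1" |
    tr -d '<>:"\\|?*' |
    sed -E -e 's,[[:space:]]+/,/,g' -e 's,/[[:space:]]+,/,g'
}

# Read keys on stdin (one per line) and print every cleaned name
# that more than one key would be renamed to.
find_collisions() {
  while IFS= read -r key; do
    sanitize_key "$key"
  done | sort | uniq -d
}

# Invented sample: the first two keys both clean up to Documents/a.txt.
printf '%s\n' 'Documents/a?.txt' 'Documents/a*.txt' 'Documents/b.txt' | find_collisions
```

Against the real list this would be run as `find_collisions < docsout`; any name it prints needs a manual rename before the PHP script is run.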