I was at an Amazon event recently and the presenter said S3 should really be called S4: Subtle and Sophisticated Storage Service.
I can’t help but agree. A plethora of features have been added since its first introduction, covering replication, storage tiers, encryption, versioning, policies and more. While it is still an order of magnitude simpler than building and operating a similar service yourself, there are gotchas to avoid. I wasn’t aware of one of them until recently, so I thought I’d share it.
When a multi-part upload doesn’t complete, for whatever reason, the parts that did make it are kept, but the object won’t be listed in the interface or by many third-party S3 management tools. You still pay for the storage though! If you have multiple large multi-part files (like backups) that never finished uploading, over time you might find your bill rising, and the difference between what you see and what you pay for becomes noticeable.
To find the incomplete uploads, run this command:
aws s3api list-multipart-uploads --bucket <mybucketname>
You can then pick out the uploads that are no longer needed and remove their parts to save on your bill.
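Removal happens one upload at a time, using the Key and UploadId values shown in the listing. Here is a minimal sketch, assuming the AWS CLI is configured with permission to abort uploads on the bucket; the wrapper name abort_upload is just something I made up for illustration.

```shell
# Hypothetical helper: abort one incomplete multi-part upload and free its parts.
# Arguments: bucket name, object key, upload ID (the last two come from the listing).
abort_upload() {
  aws s3api abort-multipart-upload \
    --bucket "$1" \
    --key "$2" \
    --upload-id "$3"
}
```

You would call it as abort_upload <mybucketname> <key> <upload-id>. Re-running the list command afterwards is a simple way to confirm the upload is gone.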
For our customers, we run automated scripts that identify or remove part-files across all buckets on agreed timelines.
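Those scripts aren't reproduced here, but the "identify" half could be sketched roughly as below. This is an illustration under assumptions, not what we actually run: the function name report_incomplete_uploads is invented, and it assumes the credentials in use can list every bucket and its multi-part uploads.

```shell
# Hypothetical sketch: print one line per incomplete multi-part upload,
# across every bucket the current credentials can see.
# Output columns (tab-separated): bucket, key, upload ID, initiation time.
report_incomplete_uploads() {
  for bucket in $(aws s3api list-buckets --query 'Buckets[].Name' --output text); do
    aws s3api list-multipart-uploads --bucket "$bucket" \
      --query 'Uploads[].[Key,UploadId,Initiated]' --output text |
    while IFS="$(printf '\t')" read -r key upload_id initiated; do
      # A bucket with nothing pending yields a line with no upload ID; skip it.
      [ -z "$upload_id" ] && continue
      printf '%s\t%s\t%s\t%s\n' "$bucket" "$key" "$upload_id" "$initiated"
    done
  done
}
```

The initiation time in the last column is what you would compare against whatever timeline has been agreed before deciding which uploads to remove.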