S3 – AWS Developer Certified Exam Notes

  • S3 provides developers and IT teams with secure, durable, highly scalable object storage. Amazon S3 is easy to use, with a simple web services interface to store and retrieve any amount of data from anywhere on the web
  • S3 is object-based storage where you can store your files; it is not a place to install an OS or a database, for that you need block-based storage
  • Files can be from 0 bytes to 5 TB; total storage is unlimited
  • Files are stored in buckets (similar to folders)
  • S3 uses a universal namespace, so bucket names must be globally unique
  • Data consistency model for S3
    • Read-after-write consistency for PUTS of new objects (immediate consistency)
    • Eventual consistency for overwrite PUTS and DELETES (changes can take some time to propagate)
      • Updates to S3 are atomic: an immediate READ might return new data or old data, but never a partial mix of the two
  • S3 is a simple Key, value store (Key, value, Version ID, Metadata)
  • S3 stores data in alphabetical (lexicographical) key order (think of adding a random prefix to file names to spread the load)
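The prefix trick above can be sketched in a few lines; this is a minimal illustration (the function name and prefix length are arbitrary choices, not an AWS API):

```python
import hashlib

def prefixed_key(original_key: str, prefix_len: int = 4) -> str:
    """Prepend a short hash-derived prefix so keys spread across the
    S3 key space instead of clustering alphabetically."""
    digest = hashlib.md5(original_key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{original_key}"

# e.g. "2018-01-01/report.csv" becomes "<4 hex chars>/2018-01-01/report.csv"
print(prefixed_key("2018-01-01/report.csv"))
```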
  • Amazon guarantees 99.99% availability for S3
  • Amazon guarantees 99.999999999% durability for S3 data (11 9's)
  • S3 supports versioning and encryption
  • Data can be secured (access control lists, bucket policies)
  • S3 – IA (Infrequently Accessed) for data that is accessed less frequently, but requires rapid access when needed. Lower fee than S3, but you are charged a retrieval fee
  • Reduced Redundancy Storage – Designed to provide 99.99% durability and 99.99% availability of objects over a given year
  • Glacier – Very cheap, but used for archival only. It takes 3 to 5 hours to restore from Glacier. $0.01 per gigabyte.
    No SLA for availability
  • S3 charged for:
    • Storage
    • Requests
    • Storage management pricing
    • Data transfer pricing (data transferred into S3 is free, but moving data from one region to another is charged)
    • S3 Transfer Acceleration (for example, to upload to a bucket in London, users upload to their closest edge location, and the transfer from the edge location to London is accelerated)
  • By default all buckets are private and ALL OBJECTS STORED INSIDE THEM ARE PRIVATE
  • Objects do not inherit their bucket's tags
  • Encryption:
    • Client side encryption
    • Server side encryption
      • Server side encryption with Amazon S3 Managed keys (SSE-S3)
      • Server side encryption with KMS (SSE-KMS)
      • Server side encryption with customer provided keys (SSE-C)
  • Control access to buckets using either ACLs or Bucket Policies
  • URL of static web site hosting in S3: http://mhayani.s3-website.eu-west-2.amazonaws.com
  • Cross Origin Resource Sharing (CORS) : A way of allowing one resource to access another one
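As an illustration, a CORS rule set can be expressed as the dict shape that boto3's `put_bucket_cors` accepts; the origin below is a placeholder:

```python
# Hypothetical CORS configuration allowing one web origin to GET and
# PUT objects; pass it to put_bucket_cors as CORSConfiguration.
cors_configuration = {
    "CORSRules": [
        {
            "AllowedOrigins": ["https://www.example.com"],
            "AllowedMethods": ["GET", "PUT"],
            "AllowedHeaders": ["*"],
            "MaxAgeSeconds": 3000,
        }
    ]
}
```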
  • In a bucket with versioning enabled, storage size is the sum of all versions
  • In S3, to restore a deleted object in a versioned bucket, remove its delete marker (in the old console UI, delete the “delete” item)
  • Once enabled, versioning cannot be disabled; it can only be suspended
  • Cross-region replication requires versioning to be enabled on both the source and destination buckets
  • Cross-region Replication by default copies only updated or newly added objects. To copy existing content, use the command line:
    aws s3 cp s3://firstBucket s3://secondBucket --recursive
  • The largest object that can be uploaded in a single PUT is 5 gigabytes. For objects larger than 100 megabytes, customers should consider using the Multipart Upload capability.
  • You can use Multi-Object Delete to delete large numbers of objects from Amazon S3. This feature allows you to send multiple object keys in a single request to speed up your deletes. Amazon does not charge you for using Multi-Object Delete.
  • By default, customers can provision up to 100 buckets per AWS account
  • With Amazon S3, you pay only for what you use. There is no minimum fee.
  • AWS charges less where their costs are less
  • Upon sign-up, new AWS customers receive 5 GB of Amazon S3 standard storage, 20,000 Get Requests, 2,000 Put Requests, 15GB of data transfer in, and 15GB of data transfer out each month for one year
  • Customers may use four mechanisms for controlling access to Amazon S3 resources: Identity and Access Management (IAM) policies, bucket policies, Access Control Lists (ACLs) and query string authentication
  • SSE-C enables you to leverage Amazon S3 to perform the encryption and decryption of your objects while retaining control of the keys used to encrypt objects.
  • SSE-KMS enables you to use AWS Key Management Service (AWS KMS) to manage your encryption keys.
  • Amazon Macie is an AI-powered security service that helps you prevent data loss by automatically discovering, classifying, and protecting sensitive data stored in Amazon S3
  • If you enable Versioning with MFA Delete on your Amazon S3 bucket, two forms of authentication are required to permanently delete a version of an object: your AWS account credentials and a valid six-digit code and serial number from an authentication device in your physical possession
  • Standard – IA is designed for long-lived, but infrequently accessed data that is retained for months or years. Data that is deleted from Standard – IA within 30 days will be charged for a full 30 days
  • When processing a retrieval job, Amazon S3 first retrieves the requested data from Amazon Glacier, and then creates a temporary copy of the requested data in RRS (which typically takes on the order of a few minutes). The access time of your request depends on the retrieval option you choose: Expedited, Standard, or Bulk retrievals
  • Objects that are archived to Glacier have a minimum of 90 days of storage, and objects deleted before 90 days incur a pro-rated charge equal to the storage charge for the remaining days.
  • You can retrieve 10 GB of your Amazon Glacier data per month for free.
  • There are no additional charges from Amazon S3 for event notifications. You pay only for use of Amazon SNS or Amazon SQS to deliver event notifications, or for the cost of running the AWS Lambda function
  • Up to ten tags can be added to each S3 object
  • Object Tags can be replicated across regions using Cross-Region Replication
  • With storage class analysis, you can analyze storage access patterns and transition the right data to the right storage class
  • S3 Inventory provides a CSV (Comma Separated Values) flat-file output of your objects and their corresponding metadata
  • As data matures, it can become less critical, less valuable, S3 Lifecycle management provides the ability to define the lifecycle of your object with a predefined policy and reduce your cost of storage. You can set lifecycle transition policy to automatically migrate Amazon S3 objects to Standard – Infrequent Access (Standard – IA) and/or Amazon Glacier based on the age of the data. You can also set lifecycle expiration policies to automatically remove objects based on the age of the object
  • The lifecycle policy that expires incomplete multipart uploads allows you to save on costs by limiting the time non-completed multipart uploads are stored
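A lifecycle rule combining the transitions, expiration, and incomplete-multipart cleanup described above can be written as the dict shape boto3's `put_bucket_lifecycle_configuration` expects; the rule ID, prefix, and day counts below are placeholder choices:

```python
# Hypothetical lifecycle rule: move logs/ to Standard-IA after 30 days,
# to Glacier after 90, expire after a year, and clean up multipart
# uploads abandoned for more than 7 days.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        }
    ]
}
```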
  • Cross-Region Replication CRR is an Amazon S3 feature that automatically replicates data across AWS regions
  • You can configure separate lifecycle rules on the source and destination bucket
  • Transfer Acceleration provides the same security as regular transfers to Amazon S3
  • If we determine that Transfer Acceleration is not likely to be faster than a regular Amazon S3 transfer of the same object to the same destination AWS region, we will not charge for that use of Transfer Acceleration for that transfer, and may bypass the Transfer Acceleration system for that upload
  • If you have objects that are smaller than 1GB or if the data set is less than 1GB in size, you should consider using Amazon CloudFront’s PUT/POST commands for optimal performance rather than Transfer Acceleration
  • S3 error code MissingSecurityHeader has a 400 HTTP status code
  • S3 error code NoSuchBucketPolicy has a 404 HTTP status code
  • S3 bucket names may contain only lowercase letters, numbers, periods, and dashes. Bucket names must not be formatted as an IP address and may not begin or end with a period
  • It is possible to pause a multi-part upload. Once paused, the upload may be aborted or resumed
  • A multi-part upload can be executed while the file is still being created
  • ReadObject is not an S3 API call
  • DownloadBucket, CompleteMultipartUpload, and UploadPart are S3 API calls
  • Multipart Upload is recommended for files larger than 100 MB and required for files larger than 5 GB
  • In order to request SSE when using REST API to upload files to an S3 bucket you need to use: “x-amz-server-side-encryption” header
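For reference, the header values for the two server-managed SSE variants look like this (the KMS key ID is a placeholder, not a real key):

```python
# Headers for a REST PUT requesting server-side encryption.
# SSE-S3: S3 manages the keys; the value must be "AES256".
sse_s3_headers = {"x-amz-server-side-encryption": "AES256"}

# SSE-KMS: KMS manages the keys; optionally name a specific key.
sse_kms_headers = {
    "x-amz-server-side-encryption": "aws:kms",
    "x-amz-server-side-encryption-aws-kms-key-id": "<your-kms-key-id>",
}
```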
  • x-amz-delete-marker, x-amz-id-2, and x-amz-request-id  are all common S3 response headers
  • Amazon S3 handles error codes with the customary HTTP response codes
  • The maximum size of an S3 object is 5 TB; this limit cannot be extended, not even by contacting AWS support
  • A 409 HTTP Status Code can indicate a BucketNotEmpty error code. This error will arise if you try to delete a bucket that’s not empty
  • UNLESS Amazon explicitly mentions an “eventually” consistent read, they are referring to a strongly consistent read
  • S3 does support website redirect
  • IPv6 can be used with Amazon S3
  • S3 object keys are stored in lexicographic order
  • There is no cost associated with moving data from S3 to EC2 if both are in the same region
  • SNS notifications supported by S3 are: HTTP PUT and POST, S3 copy actions, S3 CompleteMultipartUpload, and ReducedRedundancyLostObject
  • S3 bucket names are not transferable
  • Maximize upload performance:
    • Transfer Acceleration
    • Multipart Upload
  • 409 on S3 means there is a conflict issue
  • S3 IncompleteBody, InvalidBucketName, InvalidDigest correspond to HTTP status code 400
  • Parts of multi-part upload will not be completed until the “complete” request has been called which puts all the parts of the file together
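The part/complete mechanics can be illustrated without the AWS SDK: split the payload into parts (every part except the last must be at least 5 MB), then "complete" by joining them back together. The function name below is made up for illustration:

```python
def split_into_parts(data: bytes, part_size: int = 5 * 1024 * 1024):
    """Split a payload into multipart-upload-sized parts; S3 requires
    every part except the last to be at least 5 MB."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]

payload = b"x" * (12 * 1024 * 1024)   # 12 MB of dummy data
parts = split_into_parts(payload)     # two 5 MB parts plus a 2 MB tail
assert b"".join(parts) == payload     # the "complete" step reassembles them
```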
  • Bucket names must be between 3 and 63 characters long
  • A pre-signed URL gives you access to the object identified in the URL, provided that the creator of the pre-signed URL has permissions to access that object
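The idea behind query-string authentication can be sketched as follows. This is a deliberately simplified illustration: real S3 pre-signed URLs use the full Signature Version 4 algorithm, and the function and parameter names here are made up:

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

def presign(bucket, key, secret_key, access_key, expires_in=3600):
    """Simplified sketch of a pre-signed URL: sign the request details
    with the creator's secret key and append the signature and expiry
    as query parameters. Not the real SigV4 algorithm."""
    expires = int(time.time()) + expires_in
    string_to_sign = f"GET\n{bucket}\n{key}\n{expires}"
    signature = hmac.new(
        secret_key.encode(), string_to_sign.encode(), hashlib.sha256
    ).hexdigest()
    query = urlencode(
        {"AccessKeyId": access_key, "Expires": expires, "Signature": signature}
    )
    return f"https://{bucket}.s3.amazonaws.com/{key}?{query}"
```

Anyone holding the URL can GET the object until `Expires` passes; the server recomputes the signature with the creator's secret key to validate it, which is why the creator must have permission on the object.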
  • Optimal performance on S3: the best approach is to include a hash prefix in key names that keeps changing
  • SSE-S3: employs strong multi-factor encryption. Amazon S3 encrypts each object with a unique key. As an additional safeguard, it encrypts the key itself with a master key that it regularly rotates
  • Objects uploaded before versioning was enabled have a version ID of null
  • When objects are uploaded to S3, they can either be encrypted or non-encrypted
  • Determining whether a Request is Allowed or Denied: http://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_evaluation-logic.html
  • An S3 policy contains the following elements: Resources, Actions, Effect, Principal (the account or user that is allowed access to the actions)
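Those four elements fit together like this; the bucket name and account ID below are placeholders:

```python
import json

# Minimal example of a bucket policy: the Principal (an account) is
# Allowed (Effect) to call s3:GetObject (Action) on objects in the
# bucket (Resource).
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/*",
        }
    ],
}
policy_json = json.dumps(bucket_policy)  # bucket policies are set as JSON text
```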
