Amazon Simple Storage Service (S3) is AWS's object storage platform. It's designed for storing massive amounts of unstructured data: images, logs, videos, backups, datasets, and more. No file systems. No mounts. Just buckets and objects.

You upload an object, it gives you a URL or object key, and that's it. You can store petabytes if you want; S3 just scales.


Buckets and Objects

S3 organizes data using two core components:

  • Bucket: A container for objects with a globally unique name. Think of it as a top-level folder.
  • Object: The actual data (file) you store, plus its metadata and a uniquely identifying key.

Each object lives inside a bucket and is identified by a key (like images/2025/08/logo.png). Keys can contain slashes so they look nested, but S3 is flat: there's no real directory tree.


Storage Classes

S3 offers multiple storage classes depending on access frequency and durability needs:

  • S3 Standard: The default. Frequent access, low latency.
  • S3 Intelligent-Tiering: Moves objects between tiers automatically based on access patterns.
  • S3 Standard-IA: Infrequent access. Cheaper storage, but you pay a retrieval fee.
  • S3 One Zone-IA: Like Standard-IA, but stored in a single Availability Zone.
  • S3 Glacier (Instant Retrieval, Flexible Retrieval, Deep Archive): Archival tiers with progressively lower cost and longer retrieval times.


Versioning

You can enable versioning per bucket to keep track of changes:

  • Enabled: Each time you overwrite or delete an object, S3 keeps the old version.
  • Useful for backups, rollbacks, and audits.
  • You still pay for every version, so use lifecycle rules for cleanup.

aws s3api put-bucket-versioning --bucket my-bucket --versioning-configuration Status=Enabled


Lifecycle Policies

Lifecycle policies help automate transitions and deletions.

  • Transition: Move objects to a cheaper storage class (e.g., Standard → IA → Glacier).
  • Expiration: Delete objects or previous versions after X days.

Real-world usage: for a logging bucket, I set logs to move to Glacier after 30 days and delete after 365.

Example (JSON):

{
  "Rules": [{
    "ID": "ArchiveLogs",
    "Prefix": "logs/",
    "Status": "Enabled",
    "Transitions": [{
      "Days": 30,
      "StorageClass": "GLACIER"
    }],
    "Expiration": {
      "Days": 365
    }
  }]
}
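If versioning is enabled, old versions keep accruing cost unless a rule expires them too. A hedged sketch of such a cleanup rule (the rule ID and the 90-day window are illustrative assumptions, not from the original):

```json
{
  "Rules": [{
    "ID": "CleanupOldVersions",
    "Prefix": "",
    "Status": "Enabled",
    "NoncurrentVersionExpiration": {
      "NoncurrentDays": 90
    }
  }]
}
```

NoncurrentVersionExpiration only removes superseded versions; the current version of each object is untouched.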



Access Control: IAM, Bucket Policies, ACLs

S3 supports several layers of access control:

  • IAM Policies: Control who can access S3 using IAM roles or users.
  • Bucket Policies: Resource-level JSON policies attached to a bucket.
  • ACLs (Access Control Lists): Legacy. Avoid unless absolutely needed.

I usually stick to IAM and bucket policies. For public hosting (e.g., a static website), you need to set public read access via a policy.

Example policy for public read:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-bucket/*"
  }]
}



Replication (Cross-Region and Same-Region)

S3 supports Cross-Region Replication (CRR) and Same-Region Replication (SRR):

  • Keeps data in sync across buckets
  • Helpful for DR (disaster recovery), data residency, or latency optimization
  • Requires versioning to be enabled
aws s3api put-bucket-replication \
  --bucket source-bucket \
  --replication-configuration file://replication.json
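The command expects a replication configuration file. A minimal sketch of what replication.json might look like (the IAM role ARN and destination bucket are placeholders, not from the original):

```json
{
  "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
  "Rules": [{
    "ID": "ReplicateAll",
    "Status": "Enabled",
    "Priority": 1,
    "Filter": {},
    "DeleteMarkerReplication": { "Status": "Disabled" },
    "Destination": {
      "Bucket": "arn:aws:s3:::destination-bucket"
    }
  }]
}
```

The Role is an IAM role S3 assumes to copy objects on your behalf, and both the source and destination buckets must have versioning enabled.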



Server-Side Encryption (SSE)

Encryption is simple with S3:

  • SSE-S3: Managed by AWS
  • SSE-KMS: Use AWS Key Management Service
  • SSE-C: You provide your own key (less common)

KMS gives better control and audit logs. I use it when dealing with sensitive logs or customer data.

Example:

aws s3 cp myfile.txt s3://secure-bucket/ --sse aws:kms
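You can also set default encryption at the bucket level so every new object is encrypted without per-upload flags. A sketch of the configuration passed to aws s3api put-bucket-encryption (the KMS key ARN is a placeholder):

```json
{
  "Rules": [{
    "ApplyServerSideEncryptionByDefault": {
      "SSEAlgorithm": "aws:kms",
      "KMSMasterKeyID": "arn:aws:kms:us-east-1:123456789012:key/example-key-id"
    }
  }]
}
```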



Multipart Upload

For files larger than 100 MB, AWS recommends multipart uploads:

  • Breaks big files into parts and uploads them in parallel
  • Increases upload speed and reliability

Helpful for backups, large media files, or data exports. The AWS CLI performs multipart uploads automatically once a file crosses its multipart threshold:

aws s3 cp largefile.iso s3://my-bucket/ --storage-class STANDARD


S3 Transfer Acceleration

Improves upload/download speed using Amazon's edge locations.

  • Just enable it on the bucket
  • Access via bucketname.s3-accelerate.amazonaws.com

It helps when users are uploading from remote locations (e.g., India → US).
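Enabling it is a one-time bucket configuration; a sketch with the s3api CLI (the bucket name is a placeholder):

```shell
aws s3api put-bucket-accelerate-configuration \
  --bucket my-bucket \
  --accelerate-configuration Status=Enabled
```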


Logging and Monitoring

Always turn on logging and monitoring:

  • Server Access Logs - record every request to your bucket.
  • CloudTrail - records API-level events (e.g., who deleted a file).
  • CloudWatch Metrics - track storage size, request counts, and errors.

Example (enabling logging to another bucket):

aws s3api put-bucket-logging \
  --bucket my-bucket \
  --bucket-logging-status file://logging.json
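The logging.json file tells S3 where to deliver access logs. A minimal sketch (the target bucket and prefix are placeholders, not from the original):

```json
{
  "LoggingEnabled": {
    "TargetBucket": "my-log-bucket",
    "TargetPrefix": "my-bucket-access-logs/"
  }
}
```

The target bucket needs permissions that allow the S3 logging service to write into it, and it should be a different bucket than the one being logged.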



Hosting Static Websites

S3 can host static HTML/CSS/JS websites:

  • Enable static website hosting
  • Set index and error documents
  • Make objects public
  • Optional: Use CloudFront for CDN + HTTPS

aws s3 website s3://my-site/ --index-document index.html --error-document error.html



Common CLI Commands

# List all buckets
aws s3 ls

# Sync a local folder to a bucket
aws s3 sync ./website s3://my-bucket/

# Download a specific file
aws s3 cp s3://my-bucket/file.txt .

# Remove all objects from a bucket
aws s3 rm s3://my-bucket/ --recursive


Best Practices

  • Enable versioning + lifecycle rules for backups
  • Use SSE-KMS for sensitive data
  • Avoid public buckets unless necessary
  • Tag data for cost tracking
  • Monitor with CloudWatch
  • Use aws s3 sync for backups
  • Never use S3 as a database; it's object storage, not a key-value store



Conclusion

S3 is one of the most reliable and scalable storage services available. It suits storage layers of all kinds: a static site, application logs, or terabytes of backups. As with any AWS service, though, you need to configure it correctly, and there is plenty to configure: access control, lifecycle management, encryption, and monitoring. Get those right, and S3 will serve as your storage layer for years without you having to touch it.