Day 6: Linux File Compression & Transfer for DevOps & Cloud

File compression and transfer are essential for optimizing storage, performance, and data movement in DevOps & Cloud environments. This guide covers compression commands, file transfer methods, usage, and alternatives.

Why Do We Need to Compress Files or Folders?

Reduce Storage Usage – Saves disk space in cloud environments.
Improve Transfer Speed – Smaller files upload/download faster.
Optimize Backup & Archiving – Efficient data storage and recovery.
Save Bandwidth – Reduces data transfer costs in cloud services.

🔹 Alternatives to Compression:

  • Deduplication – Removes duplicate data instead of compressing it.
  • Sparse Files – Stores only the actual data, skipping empty space.
  • Cloud Storage Optimization – Use storage features such as S3 Intelligent-Tiering or block storage snapshots.
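
To make the sparse-file idea concrete, here is a minimal sketch. It assumes GNU coreutils (truncate, du) and a filesystem that supports sparse files; the file name sparse.img is purely illustrative.

```shell
# Work in a throwaway directory so nothing real is touched
workdir=$(mktemp -d)
cd "$workdir"

# Create a file with an apparent size of 100 MiB without writing any data
truncate -s 100M sparse.img

# Logical (apparent) size in bytes
apparent_bytes=$(wc -c < sparse.img)

# Space actually allocated on disk, in KiB
allocated_kb=$(du -k sparse.img | cut -f1)

echo "apparent: $apparent_bytes bytes, allocated: $allocated_kb KiB"
```

On most Linux filesystems the allocated figure should be far below the apparent size, since the empty regions are never written to disk.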

Compression Commands in Linux

zip – Compress Files into .zip Archive

Usage: Creates compressed ZIP archives.
How It Works:

  • Uses the Deflate algorithm to reduce file size.
  • Archives multiple files into one .zip file.

Example:

zip archive.zip file1.txt file2.txt

Creates archive.zip containing file1.txt and file2.txt.

Compress a Directory:

zip -r backup.zip myfolder/

📌 Best Practice: Use -r to recursively zip directories.

unzip – Extract .zip Files

Usage: Extracts .zip archives.
How It Works:

  • Restores files to their original state.

Example:

unzip archive.zip

Extracts all files from archive.zip into the current directory.

Extract to a Specific Directory:

unzip archive.zip -d /path/to/destination

📌 Best Practice: Use -d to specify an extraction folder.

gzip – Compress a Single File

Usage: Compresses individual files with GNU zip (gzip), which uses the Deflate algorithm.
How It Works:

  • Replaces original files with .gz compressed versions.

Example:

gzip file.txt

Compresses file.txt to file.txt.gz.

Compress Multiple Files:

gzip file1.txt file2.txt

📌 Best Practice: gzip compresses each file separately (file1.txt.gz, file2.txt.gz) and does not bundle them into one archive (use tar for that).
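
The following sketch shows that gzip handles each file on its own. It assumes a gzip version that supports -k (keep the original), and the sample files are generated only for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Generate two small, repetitive (and therefore compressible) sample files
for i in $(seq 1 1000); do echo "sample log line $i"; done > file1.txt
for i in $(seq 1 1000); do echo "another log line $i"; done > file2.txt

# -k keeps the originals; each input gets its own .gz file
gzip -k file1.txt file2.txt

# Compare original and compressed sizes
ls -l file1.txt file1.txt.gz file2.txt file2.txt.gz

# -l reports compressed size, uncompressed size, and ratio
gzip -l file1.txt.gz
```

Without -k, gzip removes file1.txt and file2.txt and leaves only the .gz versions.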

gunzip – Extract .gz Files

Usage: Decompresses .gz files.
How It Works:

  • Restores the original file.

Example:

gunzip file.txt.gz

Extracts file.txt.gz back to file.txt.

📌 Best Practice: Use zcat to view compressed files without extracting:

zcat file.txt.gz
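
A short round trip, sketched under the assumption of a standard Linux gzip install, shows how these commands fit together. The file app.log and its contents are made up for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical log file with three lines
printf 'INFO start\nERROR disk full\nINFO done\n' > app.log

gzip app.log          # replaces app.log with app.log.gz
gzip -t app.log.gz    # verify the compressed file is intact

# Search the compressed file without extracting it to disk
error_count=$(zcat app.log.gz | grep -c ERROR)
echo "errors: $error_count"

gunzip app.log.gz     # restores app.log and removes app.log.gz
```

Piping zcat into grep is a common way to inspect rotated logs while leaving them compressed.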

tar – Archive Files (Without Compression)

Usage: Packages multiple files into a single archive.
How It Works:

  • Does NOT compress by default (only archives).
  • Use tar with gzip or bzip2 for compression.

Example:

tar -cvf archive.tar file1.txt file2.txt

Creates archive.tar containing file1.txt and file2.txt.

Extract .tar Archive:

tar -xvf archive.tar

📌 Best Practice: Use tar with compression (tar.gz or tar.bz2).
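
Before extracting an archive, it can help to list its contents and unpack into a separate directory. This is a minimal sketch; the file names and the restore directory are illustrative.

```shell
workdir=$(mktemp -d)
cd "$workdir"

echo "one" > file1.txt
echo "two" > file2.txt

# Create the archive (no compression)
tar -cvf archive.tar file1.txt file2.txt

# -t lists the contents without extracting anything
tar -tf archive.tar

# -C extracts into another directory instead of the current one
mkdir restore
tar -xvf archive.tar -C restore
```

Listing with -t first is a cheap way to check what an unfamiliar archive will write before running -x.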

tar + Compression (.tar.gz or .tgz)

Usage: Creates a compressed .tar.gz archive.

Example (Compress with gzip):

tar -czvf archive.tar.gz myfolder/

Creates a compressed archive archive.tar.gz.

Extract .tar.gz Archive:

tar -xzvf archive.tar.gz

📌 Best Practice: Use -z for gzip, -j for bzip2.
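
To see what -z adds, the sketch below builds the same folder as a plain .tar and as a .tar.gz and compares them. The folder contents are generated sample data, not taken from any real project.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Sample directory with one repetitive text file
mkdir myfolder
for i in $(seq 1 500); do echo "record $i: status=ok"; done > myfolder/data.txt

tar -cf plain.tar myfolder/          # archive only
tar -czf archive.tar.gz myfolder/    # archive + gzip compression

# List the compressed archive's contents without extracting
tar -tzf archive.tar.gz

# Compare the two sizes
ls -l plain.tar archive.tar.gz
```

For text-heavy data like logs, the .tar.gz is usually much smaller; already-compressed data shrinks far less.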

tar Flags Table

| Flag     | Description                  |
|----------|------------------------------|
| -c       | Create an archive            |
| -x       | Extract an archive           |
| -v       | Verbose mode (show progress) |
| -f       | Specify archive file name    |
| -z       | Compress with gzip           |
| -j       | Compress with bzip2          |
| -r       | Append files to an archive   |
| --delete | Remove files from an archive |
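
The -r and --delete flags are less familiar than the others, so here is a small sketch of both. It assumes GNU tar; both operations work on uncompressed .tar archives, not on .tar.gz files. The file names are placeholders.

```shell
workdir=$(mktemp -d)
cd "$workdir"

echo a > a.txt
echo b > b.txt
echo c > c.txt

tar -cf archive.tar a.txt b.txt      # start with two members
tar -rf archive.tar c.txt            # -r appends a third
tar --delete -f archive.tar b.txt    # --delete removes one member

tar -tf archive.tar                  # list what remains
```

To change a compressed archive, the usual approach is to decompress it, modify the .tar, and compress it again.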

File Transfer in Linux

SCP (Secure Copy)

Usage: Transfers files securely using SSH.
How It Works:

  • Uses SSH for encryption.
  • Works for Local → Remote & Remote → Local transfers.

Example (Copy from Local → Remote Server)

scp file.txt user@remote:/home/user/

Transfers file.txt to /home/user/ on the remote server.

Explanation:

  • scp → Securely copy files
  • file.txt → Source file
  • user@remote:/home/user/ → Destination

Copy a Folder (-r for Recursive)

scp -r myfolder user@remote:/home/user/

Transfer File from Remote → Local

scp user@remote:/home/user/file.txt .

Copies file.txt from the remote server to the current local directory.

📌 Best Practice: Use -C to compress data in transit (most helpful on slow links with compressible files):

scp -C file.txt user@remote:/home/user/

Rsync (Efficient File Synchronization)

Usage: Transfers files incrementally (only modified files).
How It Works:

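  • Faster than scp for repeated transfers, because only new or changed files (and, over the network, only the changed parts of a file) are sent.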
  • Synchronizes local to remote, or remote to local.

Example (Local → Remote)

rsync -avz myfolder/ user@remote:/home/user/

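Transfers the contents of myfolder into /home/user/ on the remote server, copying only what has changed. The trailing slash on myfolder/ means "the contents of myfolder"; without it, rsync creates /home/user/myfolder/ on the remote side.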

Sync Remote → Local

rsync -avz user@remote:/home/user/myfolder/ .

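📌 Best Practice: Use --delete to remove files from the destination that no longer exist in the source. Point it at a dedicated directory, because anything in the destination that is missing from the source is deleted, and add --dry-run first to preview the changes:

rsync -avz --delete myfolder/ user@remote:/home/user/myfolder/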

Comparison of SCP vs Rsync

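| Feature                   | SCP                        | Rsync                          |
|---------------------------|----------------------------|--------------------------------|
| Encryption                | ✅ Yes (SSH)               | ✅ Yes (SSH)                   |
| Incremental Sync          | ❌ No                      | ✅ Yes                         |
| Speed on Repeat Transfers | ❌ Slow (re-copies everything) | ✅ Faster (only syncs changes) |
| Compression               | ✅ Yes (-C)                | ✅ Yes (-z)                    |
| Deletes Extra Files       | ❌ No                      | ✅ Yes (--delete)              |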

📌 Best Practice:

  • Use scp for one-time transfers.
  • Use rsync for frequent sync operations.

What is file compression in Linux, and why is it important?

Answer:
File compression reduces the size of files or directories to save disk space, speed up transfers, and optimize backups. It is especially useful in DevOps and Cloud environments where data needs to be stored efficiently and transferred quickly.

Example:

gzip large_log_file.log

📌 Best Practice: Compress logs and backups before transferring them; files that are already compressed (images, video, existing archives) gain little.

What are the differences between gzip, bzip2, and zip?

Answer:

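| Command | Compression Type | Speed  | Compression Ratio                   | Multiple Files |
|---------|------------------|--------|-------------------------------------|----------------|
| gzip    | Lossless         | Fast   | Moderate                            | No             |
| bzip2   | Lossless         | Slower | High                                | No             |
| zip     | Lossless         | Fast   | Moderate (Deflate, similar to gzip) | Yes            |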

📌 Best Practice: Use gzip for speed, bzip2 for higher compression, and zip for Windows compatibility.

How do you compress multiple files in Linux?

Answer:
Use tar with gzip (.tar.gz):

tar -czvf backup.tar.gz file1.txt file2.txt

Explanation:

  • -c → Create an archive
  • -z → Compress with gzip
  • -v → Verbose (show progress)
  • -f → File name

How can you monitor file transfer progress in scp?

Answer:
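scp shows a progress meter (percentage, speed, and estimated time) by default when run in a terminal. For connection-level detail, add -v:

scp -v file.txt user@remote:/home/user/

-v (Verbose) prints debugging messages from scp and ssh, which helps diagnose connection or authentication problems. Avoid -q if you want progress output, since it disables the progress meter.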
