Reducing Docker Image Size: A Journey of Discovery

BHARGAB KALITA - Oct 26 - Dev Community

In the world of software development, I’m constantly trying to improve my applications. Recently, I faced a challenge: my Docker images for a Go application were bloated and unwieldy. This slowed down deployments and put unnecessary strain on my resources. In particular, the images were filling the disk of my EC2 instance, which had only 1GB available. After optimization, the image size dropped from approximately 800MB (compressed) to just 7.95MB (compressed). Join me as I share my journey to trim down those images, the culprits I identified, and the valuable lessons I learned along the way.

The Initial Dockerfile: The Culprit Unveiled

At first glance, my Dockerfile seemed straightforward. Here’s how it looked:

# Use the official golang base image, pinned to 1.21.6
FROM golang:1.21.6

# Set the working directory
WORKDIR /app

# Copy go.mod and go.sum
COPY go.mod go.sum ./
RUN go mod download

# Copy the entire application code
COPY . .

# Build the Go application
RUN go build -o main .

# Expose port 4040
EXPOSE 4040

# Command to run the executable
CMD ["./main"]

While it worked, it became painfully clear that this approach led to massive image sizes. The inclusion of unnecessary files and the use of a bulky base image were the main offenders.

The Unveiling: What Went Wrong?

During my investigation, I identified several culprits contributing to my oversized Docker images:

  1. All Files, No Filters: My initial setup copied everything from the project directory, including test files, documentation, and any stray files that had no business being in a production environment.

  2. Heavyweight Base Image: I was using the full Go image for both building and running my application. This included a ton of tools and libraries that I simply didn’t need in production.

  3. Missing Multi-Stage Builds: My Dockerfile didn’t leverage multi-stage builds, which allow you to build an application in one image and then use a minimal image for running it. This meant my final image included build dependencies and tools that were unnecessary for execution.

The Transformation: Streamlining the Dockerfile

To address these issues, I rolled up my sleeves and made some significant changes. Here’s the revamped Dockerfile that emerged from my efforts:

# Stage 1: Build the Go application
FROM golang:1.21.6 AS builder

# Set the working directory
WORKDIR /app

# Copy go.mod and go.sum files
COPY go.mod go.sum ./
RUN go mod download

# Copy the source code
COPY . .

# Build the Go application with optimizations
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -ldflags="-w -s" -o main .

# Stage 2: Create a minimal runtime image with CA certificates
FROM alpine:latest

# Install CA certificates
RUN apk add --no-cache ca-certificates

WORKDIR /app

# Copy the binary from the builder stage
COPY --from=builder /app/main .

EXPOSE 4040

# Command to run the executable
CMD ["./main"]

The Key Changes

  • Multi-Stage Builds: I split my Dockerfile into two stages. The first stage is dedicated to building the application, while the second stage uses a minimal base image (in this case, Alpine) to run it. This separation allowed me to leave behind all the unnecessary build tools and dependencies.

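  • Selective Copying: I started being selective about what files I copied into the image. This is where using a .dockerignore file became invaluable. It keeps unneeded files out of the build context, which speeds up builds and keeps the builder stage small. With the multi-stage setup, the final image only receives the compiled binary.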

Example of a .dockerignore file:

# Ignore compiled test binaries and test fixture directories
*.test
testdata/

# Ignore version control directories
.git

  • Optimizing the Go Binary: By setting CGO_ENABLED=0 and using build flags -ldflags="-w -s", I produced a smaller, statically linked binary.

Why I Added CA Certificates

In my final image, I installed CA certificates using the command RUN apk add --no-cache ca-certificates. This step was crucial for several reasons:

  • Secure Connections: My application communicates with external services, including databases and APIs, over TLS (for example, HTTPS). Installing CA certificates gives the Go application a set of trusted root certificates, so it can verify the TLS certificates presented by these services. Without a CA bundle in the image, such connections typically fail with an error like "x509: certificate signed by unknown authority".

  • Minimized Vulnerabilities: Certificate validation helps prevent man-in-the-middle attacks. Because the application can check that it is talking to the genuine service, there is no need to disable verification just to make connections work in a minimal image.

Understanding CGO_ENABLED=0 and Build Flags

I included CGO_ENABLED=0 and the flags -ldflags="-w -s" for several reasons:

  1. Static Binary Creation: Setting CGO_ENABLED=0 disables cgo, so the Go toolchain uses its pure-Go implementations (for example, for DNS resolution) and produces a statically linked binary. The resulting executable does not rely on shared libraries such as libc being present in the runtime image. This matters here because the golang:1.21.6 builder image is Debian-based and uses glibc, while Alpine uses musl; a dynamically linked binary built in the first stage might fail to start in the second.

  2. Reduced Image Size: A statically linked binary is not necessarily smaller than a dynamically linked one, but it is self-contained, so the runtime image needs no C library or other shared libraries for it. That is what allows the binary to run on a minimal base image such as Alpine, and it is where most of the size reduction comes from.

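  3. Smaller Binary: The -ldflags="-w -s" flags tell the linker to omit DWARF debug information (-w) and the symbol table (-s). This reduces the binary size further. The trade-off is that debuggers have less to work with, although panic stack traces still show function names and line numbers. I would not expect a meaningful change in the program's own startup time, but a smaller image does pull faster, which helps when containers start on a fresh host.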

Lessons Learned: More Than Just Size

  1. The Right Base Image Matters: The choice of base image can make or break your Docker image size. Going with minimal images not only helps in reducing size but also enhances security by limiting the attack surface.

  2. Power of Multi-Stage Builds: These builds are a game-changer. They help streamline the build process and keep the final image clean and lean.

  3. Be Mindful of Your Copying: Always scrutinize what you include in your Docker images. Using .dockerignore helps keep your builds clean.

  4. Continuous Optimization is Key: Image optimization isn’t a one-time task. Regular reviews and improvements will help maintain a lean codebase and efficient images.

Conclusion

Through this journey, I successfully reduced my Docker image sizes, leading to faster deployments and improved resource efficiency. This experience taught me not just about the technical aspects of Docker but also reinforced the importance of mindful practices in software development.

In the world of DevOps, every byte counts, and I'm excited to continue this journey of optimization and discovery!
