Understanding Kubernetes Multi-Tenancy: Models, Challenges, and Solutions

https://www.loft.sh/blog/understanding-kubernetes-multi-tenancy-models-challenges-and-solutions
We Threw Away 13 Years of Work for EKS

Thirteen years of running in EC2.

Thirteen years of custom AMIs. Thirteen years of deployment pipelines put together with toothpicks and bubblegum. Thirteen years of launch scripts that really-do-seem-to-be-an-anti-pattern-but-hey-at-least-they-work.

And we threw it all away to run in EKS.

This is the choice we made at GumGum in early 2023, and this blog post covers the problems that led to this insane idea, and why this idea wasn’t so insane after all.


https://medium.com/gumgum-tech/we-threw-away-13-years-of-work-for-eks-b0fd8f53917c
How we avoided an outage caused by running out of IPs in EKS

Solving IP exhaustion in EKS: Avoiding a network outage by implementing custom networking
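As a rough illustration of how quickly a subnet runs dry, here is a minimal Python sketch that counts usable addresses. It relies on two facts: AWS reserves five addresses in every subnet, and with the default VPC CNI each pod consumes a VPC IP. The CIDR and the per-node figure are made-up illustration values, and the helper names are hypothetical; none of this is taken from the linked post.

```python
import ipaddress

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_PER_SUBNET = 5


def usable_ips(cidr: str) -> int:
    """Addresses left for nodes and pods in one AWS subnet."""
    net = ipaddress.ip_network(cidr)
    return max(net.num_addresses - AWS_RESERVED_PER_SUBNET, 0)


def nodes_supported(cidr: str, ips_per_node: int) -> int:
    """How many nodes fit if each node (with its pods) needs ips_per_node IPs.

    ips_per_node is an assumed figure for illustration, e.g. 1 node IP
    plus 29 pod IPs.
    """
    return usable_ips(cidr) // ips_per_node


if __name__ == "__main__":
    print(usable_ips("10.0.0.0/24"))           # 256 - 5 = 251
    print(nodes_supported("10.0.0.0/24", 30))  # 251 // 30 = 8
```

Under these assumptions a /24 holds only eight such nodes, which is why moving pods to a separate address range is attractive.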


https://medium.com/adevinta-tech-blog/how-we-avoided-an-outage-caused-by-running-out-of-ips-in-eks-c831ab97d0e4
updatecli

Automatically open a PR on your GitOps repository when a third-party service publishes an update


https://github.com/updatecli/updatecli
pepr

Pepr is on a mission to save Kubernetes from the tyranny of YAML, intimidating glue code, bash scripts, and other makeshift solutions.


https://github.com/defenseunicorns/pepr
ClusterSecret

The ClusterSecret operator makes sure all the matching namespaces have the secret available and up to date.


https://github.com/zakkg3/ClusterSecret
ls-lint

An extremely fast directory and filename linter - Bring some structure to your project filesystem


https://github.com/loeffel-io/ls-lint
zxc

Terminal-based intercepting proxy written in Rust, with tmux and vim as the user interface.


https://github.com/hail-hydrant/zxc
graft

Transactional page storage engine supporting lazy partial replication to the edge. Optimized for scale and cost over latency. Leverages object storage for durability.


https://github.com/orbitinghail/graft
liam

Automatically generates beautiful and easy-to-read ER diagrams from your database.


https://github.com/liam-hq/liam
How We Run Terraform At Scale

Managing over 165k cloud resources across hundreds of workspaces could seem daunting. But for us, it’s just another day at Benchling. Here’s how we do it.

We currently have:

- 165k cloud resources under management
- 625 Terraform workspaces
- 38 AWS accounts
- 170 engineers (40 of whom are infra specialists)

We perform:

- 225 infrastructure releases daily (terraform apply operations)
- 723 plans daily (terraform plan operations)

We’ve been successfully operating Benchling’s infrastructure release system for the past two years (spoiler, it’s Terraform Cloud), over which time we’ve doubled our infrastructure footprint with minimal additional release overhead.


https://benchling.engineering/how-we-run-terraform-at-scale-da7bb75dc394
openinfraquote

OpenInfraQuote is a lightweight, open-source CLI tool for estimating infrastructure costs using Terraform plan and state files. It runs locally or in CI/CD. No backend, no API keys, no external services.


https://github.com/terrateamio/openinfraquote
Things that go wrong with disk IO

There are a few interesting scenarios to keep in mind when writing applications (not just databases!) that read and write files, particularly in transactional contexts where you actually care about the integrity of the data and when you are editing data in place (versus copy-on-write for example).
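To make the in-place editing case concrete, here is a minimal Python sketch of an overwrite that guards against two common pitfalls: a write that accepts fewer bytes than requested, and data that sits in the OS cache without reaching the disk. The function names are hypothetical, and the example is not taken from the linked post; it assumes a POSIX-like system where `os.fsync` flushes the file to stable storage.

```python
import os


def write_all(fd: int, data: bytes) -> int:
    """Keep calling os.write until every byte has been accepted.

    A single write call may write fewer bytes than requested.
    """
    view = memoryview(data)
    total = 0
    while total < len(view):
        total += os.write(fd, view[total:])
    return total


def durable_overwrite(path: str, offset: int, data: bytes) -> None:
    """Overwrite bytes in place at offset, then flush them to disk."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        write_all(fd, data)
        # If fsync raises, the data cannot be assumed to be on disk.
        os.fsync(fd)
    finally:
        os.close(fd)
```

An in-place overwrite like this is still not atomic: a crash midway can leave a mix of old and new bytes, which is the trade-off against copy-on-write that the blurb alludes to.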


https://notes.eatonphil.com/2025-03-27-things-that-go-wrong-with-disk-io.html
Hot Take: I Want Execs Closer to Incidents, Not Farther

https://uptimelabs.io/hot-take-i-want-execs-closer-to-incidents-not-farther
Improving Kubernetes-Mixin API Server Rules Consistency

A journey into troubleshooting an insidious and subtle issue that may occur with Prometheus recording rules


https://medium.com/codex/improving-kubernetes-mixin-api-server-rules-consistency-1c0d727e8160
2025/07/08 18:10:22