Kubernetes configuration linting tools
https://itnext.io/kubernetes-configuration-linting-tools-699ddeedaeec
Git Happens: How Argo CD took over our deployments
https://mirakl.tech/git-happens-how-argo-cd-took-over-our-deployments-e214343e1532
Patroni Backups: When pgBackRest and ArgoCD Have Your Back (Literally)
https://medium.com/@yatzikziv/patroni-backups-when-pgbackrest-and-argocd-have-your-back-literally-091afa98be50
(Yet) Another Take on Integrating Terraform with Argo CD
https://akuity.io/blog/yet-another-take-on-integrating-terraform-with-argo-cd
DBaaS in 2024: Which PostgreSQL operator for Kubernetes to select for your platform? Part 4
https://medium.com/@davidpech_39825/dbaas-in-2024-which-kubernetes-postgresql-operator-part-4-crunchys-pgo-9225d518c71d
300,000+ Prometheus Servers and Exporters Exposed to DoS Attacks
https://www.aquasec.com/blog/300000-prometheus-servers-and-exporters-exposed-to-dos-attacks
Connecting Kubernetes K3s cluster to external router using BGP with MetalLB and Nginx Ingress
https://medium.com/@nikoolayy1/connecting-kubernetes-k3s-cluster-to-external-router-using-bgp-with-metallb-bgp-nginx-as-ingress-9bb767dcecd2
silver-surfer
https://github.com/devtron-labs/silver-surfer
API-version compatibility checker that provides a migration path for Kubernetes objects.
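The core idea behind such a checker can be sketched in a few lines. This is an illustrative Python sketch, not silver-surfer's actual Go implementation; the deprecation entries shown are real Kubernetes API migrations, but the lookup-table approach is an assumption for the example.

```python
# Illustrative sketch of an apiVersion compatibility check: look a
# (apiVersion, kind) pair up in a deprecation map and report the
# migration target, if any.
DEPRECATED = {
    ("extensions/v1beta1", "Ingress"): "networking.k8s.io/v1",
    ("networking.k8s.io/v1beta1", "Ingress"): "networking.k8s.io/v1",
    ("policy/v1beta1", "PodSecurityPolicy"): None,  # removed, no direct path
}

def migration_path(api_version: str, kind: str) -> str:
    """Return a human-readable verdict for one manifest's apiVersion."""
    key = (api_version, kind)
    if key not in DEPRECATED:
        return "compatible"
    target = DEPRECATED[key]
    return f"migrate to {target}" if target else "removed; no replacement"
```

The real tool does this across every object in a cluster or manifest set, against the API schema of the target Kubernetes version.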
kubectl-klock
https://github.com/applejag/kubectl-klock
A kubectl plugin to render the kubectl get pods --watch output in a much more readable fashion.
Think of it as running watch kubectl get pods, but instead of polling, it uses the Kubernetes watch API to stream updates as soon as they occur.
https://github.com/applejag/kubectl-klock
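The difference between the two approaches can be shown offline. This sketch uses an in-memory list of events as a stand-in for the Kubernetes watch API: instead of re-fetching the whole pod list on a timer, the client maintains its view by applying incremental ADDED/MODIFIED/DELETED events as they arrive.

```python
# Offline sketch of watch semantics: keep a pod table up to date from a
# stream of incremental events rather than by polling the full list.
def apply_events(initial_pods, events):
    """initial_pods: {name: phase}; events: (action, name, phase) tuples.
    Returns the resulting pod table."""
    pods = dict(initial_pods)
    for action, name, phase in events:
        if action == "DELETED":
            pods.pop(name, None)
        else:  # ADDED or MODIFIED
            pods[name] = phase
    return pods
```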
kubepfm
https://github.com/flowerinthenight/kubepfm
kubepfm is a simple wrapper around the kubectl port-forward command for multiple pods, deployments, and services: it starts one kubectl port-forward process per input target, and terminating the tool (Ctrl-C) also terminates all running kubectl sub-processes.
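The pattern the tool wraps is straightforward. This is a hypothetical Python sketch (kubepfm itself is written in Go): spawn one kubectl port-forward child per target, and make sure every child is terminated when the wrapper shuts down.

```python
# Sketch of multi-target port-forwarding: one subprocess per target,
# all torn down together on exit.
import subprocess

def start_forwards(targets):
    """targets: list of (resource, 'local:remote') pairs,
    e.g. ('deployment/web', '8080:80'). Returns the child processes."""
    return [
        subprocess.Popen(["kubectl", "port-forward", resource, ports])
        for resource, ports in targets
    ]

def stop_all(procs):
    """Terminate every child, mirroring the tool's Ctrl-C behavior."""
    for p in procs:
        p.terminate()
    for p in procs:
        p.wait()
```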
oomd
https://github.com/facebookincubator/oomd
oomd is a userspace Out-Of-Memory (OOM) killer for Linux systems.
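To make the idea concrete, here is a deliberately simplified Python sketch of a userspace memory-pressure check. This is not oomd's actual algorithm (oomd uses cgroup2 and PSI pressure metrics, among other signals); it only illustrates the basic loop of reading kernel memory state and deciding whether a kill threshold has been crossed.

```python
# Simplified userspace OOM check: parse MemAvailable from /proc/meminfo
# text and compare it against a free-memory floor.
def parse_mem_available(meminfo: str) -> int:
    """Return MemAvailable in kB from /proc/meminfo contents."""
    for line in meminfo.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    raise ValueError("MemAvailable not found")

def should_kill(meminfo: str, min_free_kb: int = 200_000) -> bool:
    """True when available memory has dropped below the floor."""
    return parse_mem_available(meminfo) < min_free_kb
```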
cloud-snitch
https://github.com/ccbrown/cloud-snitch
Map visualization and firewall for AWS activity, inspired by Little Snitch for macOS.
arkflow
https://github.com/arkflow-rs/arkflow
A high-performance stream-processing engine written in Rust, supporting multiple input/output sources and processors.
brush
https://github.com/reubeno/brush
brush (Bo(u)rn(e) RUsty SHell) is a POSIX- and bash-compatible shell, implemented in Rust. It's built and tested on Linux and macOS, with experimental support on Windows. (Its Linux build is fully supported running on Windows via WSL.)
outpost
https://github.com/hookdeck/outpost
Outpost is self-hosted, open-source infrastructure that lets event producers add outbound webhooks and event destinations to their platform, with support for destination types such as Webhooks, Hookdeck Event Gateway, Amazon EventBridge, AWS SQS, AWS SNS, GCP Pub/Sub, RabbitMQ, and Kafka.
tilt
https://github.com/tilt-dev/tilt
Define your dev environment as code. For microservice apps on Kubernetes.
Anomaly Detection in Time Series Using Statistical Analysis
https://medium.com/booking-com-development/anomaly-detection-in-time-series-using-statistical-analysis-cc587b21d008
Setting up alerts for metrics isn’t always straightforward. In some cases, a simple threshold works just fine — for example, monitoring disk space on a device. You can just set an alert at 10% remaining, and you’re covered. The same goes for tracking available memory on a server.
But what if we need to monitor something like user behavior on a website? Imagine running a web store where you sell products. One approach might be to set a minimum threshold for daily sales and check it once a day. But what if something goes wrong, and you need to catch the issue much sooner — within hours or even minutes? In that case, a static threshold won’t cut it because user activity fluctuates throughout the day. This is where anomaly detection comes in.
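One of the simplest statistical techniques in this family, and a useful baseline before reaching for anything fancier, is a rolling z-score: flag a point when it deviates from the recent mean by more than a few standard deviations. This is a minimal sketch of the idea, not the article's exact method; the window size and threshold are illustrative.

```python
# Rolling z-score anomaly detection: compare each point against the
# mean and standard deviation of the preceding window.
import statistics

def rolling_zscore_anomalies(series, window=24, z_max=3.0):
    """Return indices of points more than z_max standard deviations
    away from the mean of the preceding `window` points."""
    anomalies = []
    for i in range(window, len(series)):
        past = series[i - window:i]
        mean = statistics.fmean(past)
        stdev = statistics.pstdev(past)
        if stdev == 0:  # flat history: no baseline variance to compare against
            continue
        if abs(series[i] - mean) / stdev > z_max:
            anomalies.append(i)
    return anomalies
```

Unlike a static threshold, the baseline here adapts as the series fluctuates through the day, which is exactly why this class of method catches issues a fixed daily-sales floor would miss.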
Incident SEV scales are a waste of time
https://blog.danslimmon.com/2025/01/29/incident-sev-scales-are-a-waste-of-time/
Ask an engineering leader about their incident response protocol and they’ll tell you about their severity scale. “The first thing we do is we assign a severity to the incident,” they’ll say, “so the right people will get notified.”
And this is sensible. In order to figure out whom to get involved, decision makers need to know how bad the problem is. If the problem is trivial, a small response will do, and most people can get on with their day. If it’s severe, it’s all hands on deck.
Severity correlates (or at least, it’s easy to imagine it correlating) to financial impact. This makes a SEV scale appealing to management: it takes production incidents, which are so complex as to defy tidy categorization on any dimension, and helps make them legible.
A typical SEV scale looks like this:
- SEV-3: Impact limited to internal systems.
- SEV-2: Non-customer-facing problem in production.
- SEV-1: Service degradation with limited impact in production.
- SEV-0: Widespread production outage. All hands on deck!
But when you’re organizing an incident response, is severity really what matters?