isotopp,
@isotopp@chaos.social

This is about the third story I have heard about an instance losing all its data because of a CI/CD mistake.

https://firefish.social/notes/9iqefgi8rzfksnqc

Hugops, but also the usual grizzled old sysadmin advice:

  1. Never say backup. Always say restore. This changes your mindset.

A backup is a cost center. It has no value, it has only cost. Only a restore has a proven value, and comes with knowledge:

  • You know you actually can restore: the backup was complete and the restored instance accepts connections.
  • You know how long the restore took, so you know the time to restore when asked. Not an estimate. The actual time.
  • You know the restore procedure.

Restore EVERY backup ALL the time, then throw the recovered instance away. Keep the metrics, keep the backup.
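
A minimal sketch of such a restore drill, assuming you supply three hooks of your own (`restore`, `verify` and `discard` are hypothetical callables, not part of any real tool). It restores one backup, checks the result, records the measured time, and throws the recovered instance away:

```python
import time
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class RestoreResult:
    backup_id: str
    ok: bool
    seconds: float  # measured time to restore and verify, not an estimate
    error: str = ""


def restore_drill(
    backup_id: str,
    restore: Callable[[str], Any],   # hypothetical: restores the backup, returns an instance handle
    verify: Callable[[Any], bool],   # hypothetical: e.g. connect and run a sanity query
    discard: Callable[[Any], None],  # hypothetical: tears the recovered instance down again
    clock: Callable[[], float] = time.monotonic,
) -> RestoreResult:
    start = clock()
    try:
        instance = restore(backup_id)
    except Exception as exc:
        # A backup that cannot be restored is the finding you are looking for.
        return RestoreResult(backup_id, False, clock() - start, f"restore failed: {exc}")
    try:
        ok = bool(verify(instance))
        seconds = clock() - start
        error = "" if ok else "verification failed"
    except Exception as exc:
        ok, seconds, error = False, clock() - start, f"verify failed: {exc}"
    finally:
        # Throw the recovered instance away; the backup itself is untouched.
        discard(instance)
    return RestoreResult(backup_id, ok, seconds, error)
```

Run this for every backup and ship the `RestoreResult` values to whatever metrics system you already use. The `seconds` field is the number to quote when somebody asks how long a restore takes.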

  2. There is no such thing as immutability, statelessness or whatever.

Parts of your setup may be stateless deployments with immutable images. That is because you collected all state and put it into one or two selected locations.

You can redeploy everything but these selected locations.

If you drop them, if you make a config mistake, these things are gone gone. They cannot be redeployed, unless you have taken measures to do so. See 1.

  3. Devops is easy except for the stateful parts.

That is why the storage people and the database people all look down on you hipster devops people and make condescending remarks. 🙂

Yah, ok, they are nicer than you probably think they are, but they DO have a completely different outlook on operations.

Listen and learn. Also, restore test.

Also,

https://argo-cd.readthedocs.io/en/stable/user-guide/sync-options/#no-prune-resources

and

https://kubernetes.io/docs/tasks/administer-cluster/change-pv-reclaim-policy/
Retain, not Delete.

There are people who have taken steps to prevent their CI/CD from messing with EBS volumes, S3 buckets or K8s PVs and there are people who will lose data in the future.

Don't be in the second group.
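
A rough sketch of how one might audit for this, assuming the input is the parsed JSON from `kubectl get pv -o json` and, for the second check, a single parsed resource manifest. The function names are made up for illustration. The field `spec.persistentVolumeReclaimPolicy` and the annotation `argocd.argoproj.io/sync-options: Prune=false` are the ones described on the two pages linked above:

```python
def volumes_not_retained(pv_list: dict) -> list[tuple[str, str]]:
    """Return (name, policy) for every PV whose reclaim policy is not Retain."""
    risky = []
    for item in pv_list.get("items", []):
        name = (item.get("metadata") or {}).get("name", "<unnamed>")
        # A missing field is reported as "unknown": not proven safe, so flag it.
        policy = (item.get("spec") or {}).get("persistentVolumeReclaimPolicy", "unknown")
        if policy != "Retain":
            risky.append((name, policy))
    return risky


def prune_disabled(manifest: dict) -> bool:
    """True if the manifest carries the Argo CD sync option Prune=false."""
    annotations = (manifest.get("metadata") or {}).get("annotations") or {}
    options = annotations.get("argocd.argoproj.io/sync-options", "")
    # The annotation may hold several comma-separated options.
    return "Prune=false" in [opt.strip() for opt in options.split(",")]
```

Load the `kubectl` output with `json.load` and pass it to `volumes_not_retained`. Anything it returns is a volume that may disappear together with its claim.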

leah,
@leah@chaos.social

@isotopp first rule for me in ops automation: never ever delete automatically if you are not totally sure what you are doing.

jzohren,
@jzohren@inductive.space

@leah how do you handle that at @ubernauten?

Go through the list of deleted accounts once a month and click delete manually?

leah,
@leah@chaos.social

@jzohren @ubernauten that is part of a product, not of ops automation. Deletion does happen automatically there, but a) there are backups and b) it is the customers who trigger it manually, not us.

ubernauten,
@ubernauten@uberspace.social

@leah @jzohren Yes and no. For unpaid accounts we do trigger it ourselves, and quite automatically. For deletions triggered by customers, though, until about 5 years ago there really was only an automatic email to support, who then carried out the deletion manually.
