Site Status

By awa

Oct 10, 2023

The site has not been as stable as I want it to be. We are experiencing a failure about once every 48-72 hours. The outage normally lasts less than 5 minutes. Today it exceeded 5 minutes.

I know what the issue is. K8S is killing off parts of the infrastructure. Normally, it is the database engine.

When the database goes down, the site tells K8S that it is sick. This results in the 503 errors you might have seen.
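As a rough illustration of that health-check behavior (a sketch, not this site's actual code), here is a minimal Python endpoint that reports 503 when the database check fails. The `/healthz` path, port 8080, and the `database_is_up` helper are all assumptions made up for the example; a real probe would run a cheap query against the RDBMS.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def database_is_up() -> bool:
    """Hypothetical placeholder; a real check might run `SELECT 1`."""
    return True


def health_response(db_check=database_is_up):
    """Return (status_code, body) for a health probe."""
    try:
        healthy = bool(db_check())
    except Exception:
        # Any error while talking to the database counts as unhealthy.
        healthy = False
    if healthy:
        return 200, b"ok\n"
    return 503, b"database unavailable\n"


class HealthHandler(BaseHTTPRequestHandler):
    """Serves the assumed /healthz path that a K8S probe could poll."""

    def do_GET(self):
        if self.path != "/healthz":
            self.send_error(404)
            return
        status, body = health_response()
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Start the probe server; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Calling `serve()` starts the listener. Keeping the decision in `health_response` makes it easy to exercise the "database is down" path without a real database.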

The root cause is that K8S doesn’t think there are enough resources available and “reaps” something, normally the RDBMS.

The fix for this is to move from rook-ceph with an internal cluster to rook-ceph with an external cluster. The advantage of an external cluster is that it requires fewer resources within K8S, and I have better control over it.

I have created an external cluster within my own K8S test system. I’m in the process of documenting how to bring up a K8S external cluster. It isn’t working yet. I’ll get there.


2 thoughts on “Site Status”

it's just Boris says:

October 11, 2023 at 6:39 am

Thanks, both for the effort and the explanation.

Bad Dancer says:

October 11, 2023 at 8:25 am

I feel like a bullfrog someone is trying to explain the finer points of organic chemistry to but I appreciate the work ya do keeping this place of respite up.


