Howdy everyone,
I'm here to explain the ups and downs of last week to you: what we did, and what happened by accident. So let's dive right into that...
The server system that broarmy.net runs on was built months ago, without any user numbers or similar figures to work with. We ran automated stress tests, but those just aren't the same as real people flooding our gates. So it had to happen: the servers were overloaded, and only half of the people or fewer actually got through. That sparked an emergency plan: plan UPGRADE SERVERS BEFORE WE GO UNDER. I know, our plans have the best names.
This plan involved renting new infrastructure and rebuilding our system to work in such an environment. The servers we had purchased arrived a little late due to complicated circumstances, and then we got to work. "Who's we?" I hear you asking. As project lead it's usually just me who deals with all the servers, but for these big projects I like to get my buddy @SolSoCoG on board. He has a lot of experience with Linux servers, web servers and so on. We rebuilt the system step by step, hence all the smaller maintenance windows we had. Bit by bit, the server cluster got more powerful.
And then, this weekend, the site suddenly went down for a long time and data disappeared - it's like this weekend didn't even exist on here. What's that you ask? "Why?" It all comes down to a series of unfortunate events that started with a normal maintenance, continued with hardware failing and being reset, and ended with new software suddenly clogging up the system. Due to these system failures we had to roll back to our backups twice, making it look like the posts you had written never existed. We're terribly sorry about that. This will not happen again. We promise. Pinky swear.
So what does that leave us with now? We finished the maintenance, which gives us more than triple the power we had just a week ago. We have new mechanisms in place to back up and secure data. We use new technology to ensure fast and reliable access to the site. We worked hard, and we improved a lot.
I sincerely apologize for the downtime and the issues that came up during the last week. We hope they didn't cause you too much inconvenience. It's only going up from now on.
Heiko 'mKeRix' Rothe