Server Beach Screwed Up Our Server!

Discussion in 'News' started by Adrian Wong, Sep 20, 2010.

  1. Adrian Wong

    Adrian Wong Da Boss Staff Member

    It was supposed to be an easy job to replace a malfunctioning RAID card. Shouldn't take more than 15 minutes, tops.

    Well, it took them over 3 hours to finish, and they somehow managed to corrupt some files on the main hard disk drive. Worst of all, it didn't solve our problem. It took us hours to replace the corrupted files and get the sites up and running again.

    They then told us that our remaining hard disk drive might be the cause. It wasn't failing - the RAID controller just wouldn't mirror it. So they suggested we move our data to another drive and swap them to get RAID1 working again. We asked them to help us clone our drive onto another drive. Their curt answer - no, we don't do cloning. You will have to do it yourselves.

    Fine, we instructed them to attach another drive so we could do it ourselves. That's when they suggested we allow them to take the server down for 30 minutes for a quick check, to see if they could solve the problem without any further hassle to us. Since it was already late at night, we agreed.

    In the morning, we woke up to a DISASTER. They didn't just check the drive, they attempted to rebuild the array and somehow, they managed to badly corrupt the SINGLE good hard disk drive our server had. Don't ask us how they did it. They just did it.

    It was corrupted so badly, it wouldn't even boot up after that. Screw-up after screw-up occurred, and our server remained offline for the better part of two days. In the end, they tossed us the badly corrupted drive to rebuild on a new server.

    We have to thank Ken and Chai for their help and sacrifice. They hunkered down and brought everything up in record time! Thanks a million times, guys!!! :hug: :hug:

    Unfortunately, we lost lots of data. You will have noticed by now that we seem to have lost about 5-6 days' worth of posts on the main site and forums, with the latest threads and posts dated September 14th. We are sorry, but there's nothing we can do about the lost threads and posts. We are just thankful that Chai managed to recover this much data.

    You may also encounter articles that display nothing but gibberish. We have tried our best to replace those corrupted files, but if you ever find anything out of the ordinary, please let us know. We would appreciate your help in finding these orphaned or corrupted files.

    As for Server Beach, well, all we can say is that karma will pay them back for what they did to us. This isn't the first time they "screwed us". Not long after we migrated to our first server at Server Beach, it started developing problems with one of the hard disk drives. Then the main hard disk drive started to fail! We had to quickly migrate to another server, but we still lost some data in the process. :(

    In just six months, this NEW server started to develop problems with, you guessed it - one of the hard disk drives! Server Beach replaced the malfunctioning drive TWICE but it didn't work, which led them to believe it was the RAID controller... and that developed from a minor maintenance issue into a MAJOR DISASTER.

    Thank you, Server Beach. We will be sure to "recommend" your "expert" services and "superb" reliability to our friends in the industry. You can count on it!
     
  2. peaz

    peaz ARP Webmaster Staff Member

    Time to move, perhaps :D Does anyone have good experiences with Virtual Private Servers? We are thinking of going cloud this time, as it should be a much more reliable infrastructure with higher redundancy, and with a good backup plan, we should be good.

    Share your thoughts. :)
     
  3. zy

    zy zynine.com Staff Member

    :shifty: No wonder the site was down for some time :shifty:

    That must have sucked big time. :wall:
     
  4. karhoe

    karhoe Newbie

    So that means you guys are currently using a DEDICATED server and now want to shift to a VPS? Isn't that some sort of 'downgrade'?
     
  5. peaz

    peaz ARP Webmaster Staff Member

    In some sense, yes. But a VPS gives us MUCH more flexibility in scaling the server and, heck, we no longer need to worry about managing the health of the hardware itself. That's a nice benefit of running on top of a virtual environment. It's the datacenter guys who would worry about maintaining the availability of the server cluster.

    With a dedicated server, you are basically left with just one node. So if you look at it from the MTBF aspect, a scaled-out, multi-node environment is always better than just one node.

    Oh, and BTW, a VPS still provides us with guaranteed server resources - RAM, CPU cores, HDD and network bandwidth - unlike shared hosting, where we have no such guarantees.
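
    To put some rough numbers on the redundancy argument - the 99% per-node uptime figure below is just an assumption for illustration, not a measurement from any host:

    Code:
    # Rough illustration of why redundant nodes beat a single box.
    # Assumes nodes fail independently and any one node can keep the site up.

    def cluster_availability(node_availability: float, nodes: int) -> float:
        """Availability of a cluster that is up as long as at least one node is up."""
        return 1 - (1 - node_availability) ** nodes

    for n in (1, 2, 3):
        print(f"{n} node(s): {cluster_availability(0.99, n):.6f}")

    # 1 node(s): 0.990000
    # 2 node(s): 0.999900
    # 3 node(s): 0.999999

    The real numbers depend entirely on the provider, of course, but that's the basic idea behind preferring a redundant cluster over a single dedicated box.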
     
  6. karhoe

    karhoe Newbie

    It depends on the host - you never know, they could screw up the entire server again. And on a VPS, other users could consume huge resources, which I believe could affect the performance of TechArp as well?
     
  7. roguetech

    roguetech Newbie

    DATA BACKUPS

    IF YOUR ADMINISTRATORS KNEW WHAT THEY WERE DOING THEY WOULD HAVE DATA BACKUPS....

    SERVERBEACH IS A SELF MANAGED SERVER SOLUTION

    MEANING YOU ARE RESPONSIBLE FOR YOUR DATA NOT SERVERBEACH!

    SHAME ON YOU FOR PUTTING THE BLAME ON THEM

    YOU SHOULD BLAME YOURSELF AND LEARN A LESSON FROM YOUR MISTAKE!

    AS FAR AS KARMA IS CONCERNED MAYBE YOUR KARMA CAME BACK TO YOU...

    MAYBE SELF MANAGED IS NOT A SOLUTION FOR YOUR COMPANY

    OBVIOUSLY YOU NEED SOME GUIDANCE WHEN IT COMES TO BACKING UP DATA...

    LOOK INTO IT....:nuts:
     
  8. zy

    zy zynine.com Staff Member

    WHAT IS WITH THE CAPS? :whistle: :snooty: :nuts:
     
  9. terry99

    terry99 Newbie

    roguetech, read the original post again. It was a RAID array = self-backup. Then a backup was attempted and got borked. Yes, in an ideal world we would always have an infinite number of backups, but I'm usually guilty of only having half a backup - just like most people, I guess. In this case, it seems that the server owners could have communicated better.
     
  10. Adrian Wong

    Adrian Wong Da Boss Staff Member

    See, the thing is, this isn't the first time it happened to us. This is actually the THIRD time they corrupted our data while attempting to fix a hardware issue and the SECOND time they corrupted our drives so badly, they had to reload the server with new drives and OS.

    We have two drives in a RAID1 array, but in our first server, one of the hard disk drives started failing right from the beginning... and the second one failed before the bad drive could be replaced. Bad luck? Perhaps. We didn't get that pissed with them the first time.

    However, this time, they screwed up. No doubt about it. One of the mistakes they admitted to making - they took out the WRONG drive (the good one) when they attempted to replace the malfunctioning HDD. And we all know it's not possible to "corrupt" the remaining drive by rebuilding the array, but that's what they are claiming.

    As far as backup is concerned, we do have back-ups. Is it as recent as we would have liked? No. But that does NOT absolve them of the mistakes they made.

    Make no mistake - the good drive was still in good condition. The data was still intact. We wanted to clone it. They asked our permission to take down the server for just 30 minutes to "check it out". We didn't ask them to rebuild the array or trash the hard disk drive - we had already planned to clone the drive in the morning so we could swap the drives.

    Yet, when we woke up in the morning to start cloning the drive, we found that they did not do what they said they would do - just check. They went and did whatever they felt like doing and somehow, corrupted our data.

    Did we have a chance to solve the problem with our data intact? We sure did, if we had been allowed to clone our drive as we wanted to. Would our server have been up much earlier and with far less work? Of course! We could have cloned the drives, swapped them and been on our way.
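
    For anyone wondering what "clone it ourselves" would actually have involved: on a Linux box you would normally reach for dd or ddrescue, but conceptually it's just a block-for-block copy of the good drive onto a spare of at least the same size. Here's a minimal sketch of the idea in Python - the device paths are made-up examples, so anyone trying this would need to check their own with lsblk first:

    Code:
    # Minimal sketch of block-level drive cloning (in practice, use dd/ddrescue).
    # SOURCE and TARGET are example paths, not our actual devices.

    SOURCE = "/dev/sda"        # the good RAID1 member (example path)
    TARGET = "/dev/sdb"        # the freshly attached spare drive (example path)
    CHUNK = 4 * 1024 * 1024    # copy 4 MB at a time

    def clone(source: str, target: str, chunk: int = CHUNK) -> int:
        """Copy the source block device onto the target, returning bytes copied."""
        copied = 0
        with open(source, "rb") as src, open(target, "wb") as dst:
            while True:
                buf = src.read(chunk)
                if not buf:
                    break
                dst.write(buf)
                copied += len(buf)
        return copied

    if __name__ == "__main__":
        print(f"Cloned {clone(SOURCE, TARGET):,} bytes from {SOURCE} to {TARGET}")

    With a spare drive attached, that's all the "cloning" we were asking for - copy the good drive over, swap the drives, and get the RAID1 mirror going again.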

    I'm not going to argue with you about karma, roguetech, because we all get the karma we DESERVE. It doesn't matter what we think we deserve.

    PS. Seriously, dude, you need to get your keyboard fixed.
     
  11. Falcone

    Falcone Official Mascot Creator

    Are you working for Server Beach?

    Seriously, your first post and this?
     
  12. Chai

    Chai Administrator Staff Member

    Obviously, he is. :haha:
     
  13. zy

    zy zynine.com Staff Member

    Maybe he is the one that screwed up the server :haha:
     
  14. PsYkHoTiK

    PsYkHoTiK Admin nerd

    I was just about to ask if our box is in San Antonio. :angel:
     
  15. zy

    zy zynine.com Staff Member

    We already know the answer :angel:
     
  16. Adrian Wong

    Adrian Wong Da Boss Staff Member

    After replacing the RAID controller and HDDs, and corrupting our data... the problem was NOT solved. Soon enough, we received this error message:

    Looks like the second HDD in the RAID1 array was not working! :wall: :wall:

    They finally decided that it could be the motherboard controller to which the drives were connected that might be causing the problem. They recommended that we allow them to swap chassis - basically, take out our drives and move them into a different server with the same specs.

    However, I honestly don't know how that's possible since the RAID1 drives are connected to the 3ware RAID controller! Can someone tell me how the motherboard's SATA controller can affect the RAID1 array on a separate RAID card?

    Anyway, we decided that the motherboard itself might be the problem and that a chassis swap would really be a good idea. We made a backup of all our stuff and then scheduled a complete chassis swap on Monday night. It took them a good 2 hours to complete, but thankfully, NO data loss this time. :pray: :pray:
     
