#1
My Wednesday
Hi All,
Windows 7, SP1, x64

Wednesday was a fun day. Well, sort of…

A customer with an RSTe RAID1 pair had one of the drives completely fail. They were Intel SSDs. Quelle surprise! And yes, they were the ones Intel tech support told me to use.

Now ordinarily, this is a pretty easy fix. You power down, remove and replace the bad drive (Samsung this time), power back up, and use the Intel RST software utility to rebuild to the new drive. Done it several times. It is usually boring.

NOT THIS TIME. The rebuild ran for about two minutes and then the computer rebooted and Windows told me I could only boot into Safe Mode. Okay, so I booted into Safe Mode, set up a `chkdsk c: /f` and rebooted. This time Windows booted up okay. Rinse, lather, repeat! AAAAAHHHHHH.

Now no drives register in the second slot. Hmmm. I knocked off a power connector. Sheesh.

Okay, so I rebooted, went into the RSTe BIOS and broke the RAID container. Now I am booting up as a single drive.

So, since the RSTe software did not work, I used the Samsung Data Migration utility (it doesn’t work on RAID pairs). It ran to about 6% and froze. What the hell???? Rebooted and tried again. Same result. This is the FIRST time this utility has failed. It is usually totally boring to run.

So, not to be defeated, I boot into Clone Zilla. It starts to clone and Clone Zilla blows a cork on me. This time it tells me what is wrong. Tons of bad sectors. Intel strikes again!!!!

Okay, so I rerun the clone and go into “Expert” mode and select “rescue” (got to LOVE Clone Zilla). Rescue mode just skips over the bad spots. This time I get a successful clone.

I run gparted from Fedora Xfce Live and expand the partition. And Windows likes it too after a forced chkdsk. (Clone Zilla deliberately throws the NTFS dirty flag.)

So, I go into the BIOS RSTe and set up the RAID1 container. Ooops, I forgot the stinking utility does not copy the existing drive. It should, but don’t get me started. Bad word. Bad word. Bad word.

So, I had to clone everything over again. This time I use the Windows RSTe utility to create the container. And, FINALLY, everything is working again.

I arrived at 15:39 and left at 23:38.

That was my Wednesday. How was yours?

-T
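For anyone wondering what "skips over the bad spots" amounts to, here is a rough sketch in Python. To be clear, this is not how Clone Zilla does it internally; it only illustrates the general idea of copying block by block and zero-filling whatever cannot be read. The names `rescue_copy` and `FlakyDisk` are made up for the example, and `FlakyDisk` is a simulated drive, not a real device.

```python
import io


class FlakyDisk:
    """Simulated source drive (hypothetical): reads touching a bad offset fail."""

    def __init__(self, data, bad_offsets):
        self._buf = io.BytesIO(data)
        self._bad = set(bad_offsets)

    def seek(self, offset):
        return self._buf.seek(offset)

    def read(self, size):
        start = self._buf.tell()
        # Any read whose range covers a bad offset raises, like a failed sector read.
        if any(start <= bad < start + size for bad in self._bad):
            raise OSError("simulated unreadable sector")
        return self._buf.read(size)


def rescue_copy(src, dst, total_size, block_size=4096):
    """Copy src to dst block by block, zero-filling blocks that cannot be read.

    Returns the list of block start offsets that were skipped.
    """
    skipped = []
    offset = 0
    while offset < total_size:
        want = min(block_size, total_size - offset)
        try:
            src.seek(offset)
            data = src.read(want)
        except OSError:
            # Unreadable block: note it and carry on instead of aborting.
            data = b""
            skipped.append(offset)
        # Pad failed or short reads with zeros so later offsets stay aligned.
        if len(data) < want:
            data += b"\x00" * (want - len(data))
        dst.seek(offset)
        dst.write(data)
        offset += want
    return skipped
```

The point of the design is that the target keeps the same layout as the source, so the file system on the clone can still be repaired afterwards (hence the forced chkdsk). Whatever lived in the skipped blocks is lost.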
#2
T wrote:
> Hi All,
> Windows 7, sp1, x64
> Wednesday was a fun day. Well, sort of…

[snipped the personal diary entry]

> That was my Wednesday. How was yours?

This is Usenet. Not Facebook, Twitter, CafeMom, nor a logbook to store your personal diary entries.

So does your *customer* paying you know about your shotgun repair?
#3
On 2018-03-16 22:09, T wrote:
> Hi All,
> Windows 7, sp1, x64
> Wednesday was a fun day. Well, sort of…
> A customer with RSTe RAID1 pair had one of the drives completely fail. They were Intel SSDs. Quelle surprise! And yes they were the ones Intel Tech support told me to use.

Intels are bad, they deliberately go into DOA mode when they decide there are too many bad blocks.

> Now ordinarily, this is a pretty easy fix. You power down, remove and replace the bad hard drive (Samsung this time), power back up and use the Intel RST software utility to rebuild to a new drive. Done it several times. It is usually boring.

Doesn't it rebuild by itself? It did here on my RAID5 when I removed a drive, booted into BIOS, shut down, reconnected the drive and rebooted into Windows. I never had to tell it to, it started to rebuild all by itself.

> NOT THIS TIME. The rebuild ran for about two minutes and then the computer rebooted and Windows told me I could only boot into Safe Mode. Okay, so I booted into safe mode, set up a `chkdsk c: /f` and rebooted. This time Windows booted up okay.
> Okay, so I rebooted, went into RSTe BIOS and broke the RAID container. Now I am booting up as a single drive.

You're lucky breaking the RAID left the drives bootable at all.

> So, since the RSTe software did not work, I used the Samsung Data Migration utility (doesn’t work on RAID pairs). It ran to about 6% and froze. What the hell???? Rebooted and tried again. Same result. This is the FIRST time this utility has failed. It is usually totally boring to run.
> So, not to be defeated, I boot into Clone Zilla. It starts to clone and Clone Zilla blows a cork on me. This time it tells me what is wrong. Tons of bad sectors. Intel strikes again!!!!

Never used CloneZilla, but I added it to my "Get This" list.

> Okay, so I rerun the clone and go into “Expert” mode and select “rescue” (got to LOVE Clone Zilla). Rescue mode just skips over the bad spots. This time I get a successful clone.
> I run gparted from Fedora Xfce Live and expand the partition. And Windows likes it too after a forced chkdsk. (Clone Zilla deliberately throws the NTFS dirty flag.)
> So, I go into the BIOS RSTe and set up the RAID1 container. Ooops, I forgot the stinking utility does not copy the existing drive. It should, but don’t get me started. Bad word. Bad word. Bad word.

It's a BIOS extension, don't expect too much...

> So, I had to clone everything over again. This time I use the Windows RSTe utility to create the container. And, FINALLY, everything is working again.
> I arrived at 15:39 and left at 23:38.
> That was my Wednesday. How was yours?

Better than yours, but my Friday was just as bad as your Wednesday.

Regards,

--
! _\|/_   Sylvain /
! (o o)   Member: David-Suzuki-Fdn/EFF/Red+Cross/SPCA/Planetary-Society
oO-( )-Oo "Life," said Marvin, "Don't talk to ME about life!"
#4
T wrote:
> Hi All,
> Windows 7, sp1, x64
> Wednesday was a fun day. Well, sort of…

Interesting post. Thanks!
#5
T wrote:
> Hi All,
> Windows 7, sp1, x64
> Wednesday was a fun day. Well, sort of…
> A customer with RSTe RAID1 pair had one of the drives completely fail. They were Intel SSDs.

I hope you're learning something from all this.

The very first error is using Intel drives with the "fold up shop" behavior.

If you were going to RAID something, you should have RAIDed an Intel and a Sammy. Not two Intels. Not two *identical* Intels. The possibility of correlated failures (both exceed wear life in the same microsecond) should have entered your mind.

Each device in the RAID does the same write at the same time. Each wear counter wears down at exactly the *same* rate. BOOM! BOOM! And out go the lights.

You should not RAID two "fold up shop" drives which are identical, because they will do the identical math and fold up shop... together :-/ The Sammy probably doesn't fold up shop.

And the standard lessons have been taught here about SMART passthru. You mean to tell me that, in addition to "degrade" and "fail", there's no "SMART indicates imminent failure on Drive X" message? The array driver must be able to read the SMART table and pass a warning to the OS in some way. If there is array management software, such as a tray application, it should be installed and running. (With some Promise installs, you could install the driver and not bother with the tray. Driving blind.)

We already learned enough from the SIL3112 failure modes to trust soft (driver) RAID1 about as far as it can be thrown.

*******

Did you compare the checksums on the unchanged files in the most recent backup against your clone attempt? You can use hashdeep to traverse a tree and compute checksums.

Paul
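The checksum comparison suggested above can be sketched in a few lines of Python if hashdeep is not at hand. This is only a minimal illustration and not a replacement for hashdeep; the function names `hash_tree` and `compare_trees` are invented for the example, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import os


def hash_tree(root, chunk_size=1 << 20):
    """Return {relative_path: sha256_hex} for every file found under root."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            h = hashlib.sha256()
            with open(full, "rb") as fh:
                # Read in chunks so large files do not have to fit in memory.
                for chunk in iter(lambda: fh.read(chunk_size), b""):
                    h.update(chunk)
            digests[rel] = h.hexdigest()
    return digests


def compare_trees(backup_root, clone_root):
    """Return (mismatched, missing_in_clone) as sorted lists of relative paths."""
    backup = hash_tree(backup_root)
    clone = hash_tree(clone_root)
    mismatched = sorted(
        path for path in backup if path in clone and backup[path] != clone[path]
    )
    missing = sorted(path for path in backup if path not in clone)
    return mismatched, missing
```

Files that legitimately changed since the backup will show up as mismatches too, so the interesting entries are the ones that should not have changed. After a rescue-mode clone, those are the likely victims of the skipped sectors.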
#6
On 03/16/2018 07:50 PM, VanguardLH wrote:
> T wrote:
>> Hi All,
>> Windows 7, sp1, x64
>> Wednesday was a fun day. Well, sort of…
>
> [snipped the personal diary entry]
>
>> That was my Wednesday. How was yours?
>
> This is Usenet. Not Facebook, Twitter, CafeMom, nor a logbook to store your personal diary entries.
>
> So does your *customer* paying you know about your shotgun repair?

Are you having a bad day?
#7
On 03/16/2018 08:42 PM, B00ze wrote:
>> Now ordinarily, this is a pretty easy fix. You power down, remove and replace the bad hard drive (Samsung this time), power back up and use the Intel RST software utility to rebuild to a new drive. Done it several times. It is usually boring.
>
> Doesn't it rebuild by itself? It did here on my RAID5 when I removed a drive, booted into BIOS, shut down, reconnected the drive and rebooted into Windows. I never had to tell it to, it started to rebuild all by itself.

With RSTe and a new drive it does not recognize, you have to press the "rebuild to another drive" button. It is pretty simple.
#8
On 03/17/2018 12:40 AM, Paul wrote:
> T wrote:
>> Hi All,
>> Windows 7, sp1, x64
>> Wednesday was a fun day. Well, sort of…
>> A customer with RSTe RAID1 pair had one of the drives completely fail. They were Intel SSDs.
>
> I hope you're learning something from all this.
>
> The very first error, is Intel drives with the "fold up shop" behavior.
>
> If you were going to RAID something, should have RAIDed an Intel and a Sammy. Not two Intels. Not two *identical* Intels.

Intel SSDs are sh*t. I have taken it in the shorts over them before. I will be glad when they are all gone. I have said this before in these parts and some have argued with me.

> Did you compare the checksums on the unchanged files in the most recent backup, against your clone attempt ? You can use hashdeep to traverse a tree and compute checksums.

No. I just tried everything out afterwards and it worked. The user is very studious and complains a lot if anything is out of place. And she was quiet for two days.
#9
On 03/17/2018 01:55 AM, T wrote:
> On 03/16/2018 07:50 PM, VanguardLH wrote:
>> T wrote:
>>> Hi All,
>>> Windows 7, sp1, x64
>>> Wednesday was a fun day. Well, sort of…
>>
>> [snipped the personal diary entry]
>>
>>> That was my Wednesday. How was yours?
>>
>> This is Usenet. Not Facebook, Twitter, CafeMom, nor a logbook to store your personal diary entries.
>>
>> So does your *customer* paying you know about your shotgun repair?
>
> Are you having a bad day?

Maybe worse than my Wednesday?

Today I am recoding a 6282-line Perl 5 program into Perl 6. It is very slow going. Perl 5's sub declarations are sh*t.
#10
On Sat, 17 Mar 2018 03:40:48 -0400, Paul wrote:
> I hope you're learning something from all this.
>
> The very first error, is Intel drives with the "fold up shop" behavior.

It drives me up the wall -- customers get a sequence of error messages, and they tell me only about the last one. I don't know how many times I've told someone, "When you get multiple error messages, it's almost always the first one that gives the clue to what is wrong."

--
Stan Brown, Oak Road Systems, Tompkins County, New York, USA
http://BrownMath.com/
http://OakRoadSystems.com/
Shikata ga nai...
#11
On 17/3/2018 10:09, T wrote:
> .... And, FINALLY, everything is working again. I arrived at 15:39 and left at 23:38.
> That was my Wednesday. How was yours?

Did you get paid?

--
  @~@   Remain silent! Drink, Blink, Stretch! Live long and prosper!!
 / v \  Simplicity is Beauty!
/( _ )\ May the Force and farces be with you!
  ^ ^   (x86_64 Ubuntu 9.10) Linux 2.6.39.3
No borrowing! No fraud! No * money! No compensated dating! No fighting! No robbery! No suicide! No praying to gods! Please consider Comprehensive Social Security Assistance (CSSA):
http://www.swd.gov.hk/tc/index/site_...sub_addressesa
#12
On 03/17/2018 04:44 AM, Stan Brown wrote:
> On Sat, 17 Mar 2018 03:40:48 -0400, Paul wrote:
>> I hope you're learning something from all this.
>>
>> The very first error, is Intel drives with the "fold up shop" behavior.
>
> It drives me up the wall -- customers get a sequence of error messages, and they tell me only about the last one. I don't know how many times I've told someone, "When you get multiple error messages, it's almost always the first one that gives the clue to what is wrong."

The first sign of trouble was when the second drive in the pair became a brick. It showed as being there but was unable to respond to anything.

The first drive stayed working, but was full of sector errors: about 20 of them. I bet the wear leveling had a joyful time of that!

Intel drives are sh*t. I wish I'd never got involved with them in the first place. I only sell Samsung now and have had ZERO issues with them. (It mystifies me that some argue about this with me!)

If you do any cloning, CloneZilla is a must in your arsenal.

http://clonezilla.org/downloads.php

Expert mode has a "rescue" switch that will clone past errors. Oftentimes, I am cloning to save data from a dying drive.
#13
On 03/17/2018 04:50 AM, Mr. Man-wai Chang wrote:
> On 17/3/2018 10:09, T wrote:
>> .... And, FINALLY, everything is working again. I arrived at 15:39 and left at 23:38.
>> That was my Wednesday. How was yours?
>
> Did you get paid?

Not yet. I do my billing on Sundays usually.

This was a "mission critical" computer. A crash would have been an absolute disaster for the company. (Yes, it is backed up. The downtime would have killed them.) That was why the RAID1 was there in the first place. I saved their asses.

I was glad I did not have to reinstall Windows and all their apps. I would have been there all night!

Intel SSDs are sh*t.
#14
On 03/17/2018 7:10 AM, T wrote:
> On 03/17/2018 04:50 AM, Mr. Man-wai Chang wrote:
>> On 17/3/2018 10:09, T wrote:
>>> .... And, FINALLY, everything is working again. I arrived at 15:39 and left at 23:38.
>>> That was my Wednesday. How was yours?
>>
>> Did you get paid?
>
> Not yet. I do my billing on Sundays usually.
>
> This was a "mission critical" computer. A crash would have been an absolute disaster for the company. (Yes, it is backed up. The downtime would have killed them.) That was why the RAID1 was there in the first place. I saved their asses.
>
> I was glad I did not have to reinstall Windows and all their apps. I would have been there all night!
>
> Intel SSDs are sh*t.

Did you mention Intel drives are sh*t? :-)

Rene
#15
Rene Lamontagne wrote:
> On 03/17/2018 7:10 AM, T wrote:
>> On 03/17/2018 04:50 AM, Mr. Man-wai Chang wrote:
>>> On 17/3/2018 10:09, T wrote:
>>>> .... And, FINALLY, everything is working again. I arrived at 15:39 and left at 23:38.
>>>> That was my Wednesday. How was yours?
>>>
>>> Did you get paid?
>>
>> Not yet. I do my billing on Sundays usually.
>>
>> This was a "mission critical" computer. A crash would have been an absolute disaster for the company. (Yes, it is backed up. The downtime would have killed them.) That was why the RAID1 was there in the first place. I saved their asses.
>>
>> I was glad I did not have to reinstall Windows and all their apps. I would have been there all night!
>>
>> Intel SSDs are sh*t.
>
> Did you mention Intel drives are sh*t? :-)
>
> Rene

It's one particular policy they have for device operation that makes an Intel SSD a consumer-antagonistic device.

In a corporate environment, where fancy backup systems do incremental backups on every disk in the company once a day, the Intel policy doesn't matter. But for a home user who never makes backups... the Intel approach is deadly.

When a device is near dead, it's OK for it to stop writing, causing Windows to freak out. As long as it allows reading, it allows a user to run a backup of the files. Then, once a new SSD is installed, you can restore from backup. If the device completely disappears without saying a word, well, that isn't a very good policy.

Lazy or unmotivated users, who have never made a backup in their life, will learn a very sharp lesson with their Intel SSD.

This is why, with SSDs, you really need to check each brand to see how it handles the "Intel issue". On the blue line here, the drive "pulled the trigger on itself" when no bad sectors were present. The other three brands continued to run.

https://techreport.com/review/27909/...heyre-all-dead

That article covers a number of issues. It shows drives having uncorrectable errors. It shows drives operating past their calculated lifespan. And it mentions that once the lifespan was surpassed, the drive presented some notification that it was at end-of-life conditions (wear limit).

You'll notice there's very little they could write about the Intel drive, because it packed up before showing any reallocations at all.

Paul