October 29th 17, 12:24 AM posted to alt.comp.os.windows-8
Paul[_32_]
Asus X550J laptop

Mayayana wrote:
"Neil" wrote

| | I discovered the hard way that Win8.x disregards CHKDSK (!), and SMART
| | eventually killed the drive's boot sectors
| |

I don't understand. I very, very rarely run CHKDSK, so
I don't see why it should have a big effect.

| The issue I ran into is that bad sectors that should have been isolated
| by CHKDSK was disregarded by Win8.1 and kept writing to those sectors
| until it exceeded the SMART's track allocation space.
|
| Do
| you have any links to info that might be relevant?
|
| I'd suggest doing a search to find info that best fits your level of
| understanding of the hardware.

You have no corroborating docs for your
theory? Then how do you know that's what
happened?


There are two levels of activity.

1) If the disk detects trouble, it queues a sector for "evaluation"
on the next write. If the write is bad, the sector is spared out.
In neither the read case nor the write case does the drive throw
a CRC error. Whether it's error correction on a read, or sparing
on a write, the code returned is "success". Only the time delay
before it says "success" hints at the trouble it's having.

On a write, if the drive runs out of spares, it can report an
error (failure) on the write. On a read, it can report a CRC
error after retrying the sector for about 15 seconds (at roughly
120 tries per second). If you have a drive with TLER, the time
it's willing to retry is cut by more than 50%, so that the
RAID controller won't force a rebuild because the RAID driver
timed out before the 15 seconds were up.
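
Since the only outward sign is that delay, one crude check is to time
reads at sampled offsets across the raw device. A rough sketch in Python
(Linux, run as root; the device path and the 100 ms "slow" threshold are
placeholders I've picked for illustration, nothing official):

import os, time

DEV = "/dev/sdb"          # placeholder: the disk under suspicion
SECTOR = 4096             # bytes read per probe
SLOW_MS = 100.0           # illustrative "took too long" threshold

fd = os.open(DEV, os.O_RDONLY)
os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)   # discourage readahead
disk_size = os.lseek(fd, 0, os.SEEK_END)

for i in range(1000):                               # 1000 evenly spaced samples
    offset = (disk_size // 1000) * i
    offset -= offset % SECTOR                       # keep probes aligned
    os.lseek(fd, offset, os.SEEK_SET)
    t0 = time.perf_counter()
    os.read(fd, SECTOR)
    ms = (time.perf_counter() - t0) * 1000.0
    if ms > SLOW_MS:
        # A healthy read takes a few milliseconds; a long stall is the
        # "hint" above - the drive retrying/correcting before saying success.
        print(f"slow read at byte offset {offset}: {ms:.1f} ms")

os.close(fd)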

2) Only if the OS gets a fatal report does $BADCLUS get involved.
NTFS can disable all the sectors in a single cluster
at one time, by marking that cluster as bad. $BADCLUS is a "sparse" file
whose logical size covers the entire volume, and the only clusters
actually allocated to it are the bad ones. So the bad clusters are marked as
unusable. But that only happens if the hard drive "gives up"
on its little dance routine. If all the clusters on the partition
were bad, the $BADCLUS file would be fully populated, and
the drive would be "empty of usable clusters".
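
The idea is easier to see as a data structure. A conceptual sketch only -
this is not the on-disk NTFS metafile layout, just the shape of a sparse
map that "covers" the whole volume while only storing the bad spots:

class BadClusterMap:
    def __init__(self, total_clusters):
        self.total_clusters = total_clusters   # logical size spans the volume
        self.bad = set()                       # only bad clusters are stored

    def mark_bad(self, cluster):
        # Called when the file system gets a fatal error back from the drive.
        self.bad.add(cluster)

    def is_usable(self, cluster):
        return cluster not in self.bad

    def usable_count(self):
        # If every cluster were marked, the volume would be
        # "empty of usable clusters", as described above.
        return self.total_clusters - len(self.bad)

# Example: a 1 TB volume with 4 KiB clusters
volmap = BadClusterMap(1_000_000_000_000 // 4096)
volmap.mark_bad(123456)
print(volmap.is_usable(123456), volmap.usable_count())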

The sparing in (1) is automatic and non-reversible. Even if
you do an Enhanced Secure Erase, it shouldn't change the
status of which sectors got mapped out. An Enhanced Secure Erase
will try to zero out *all* sectors, both the working sectors
and the ones that are no longer accessible. The drive doesn't
wait around to find out how those writes went. Enhanced Secure
Erase is a "best effort" command in that sense. Only if the heads
fell off would the Enhanced Secure Erase stop.
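
For reference, the usual way to issue that from Linux is hdparm's ATA
security commands. A sketch only, and destructive - it wipes the whole
drive; the device name and password here are placeholders, and you'd want
to confirm the drive shows "not frozen" in the -I output before trying it:

import subprocess

DEV = "/dev/sdX"   # placeholder - triple-check this before running anything
PWD = "p"          # temporary ATA security password; cleared by the erase

def run(*args):
    print("+", " ".join(args))
    subprocess.run(args, check=True)

# 1) Inspect the security section: look for "not frozen" and the
#    enhanced-erase time estimate.
run("hdparm", "-I", DEV)

# 2) Set a temporary user password (required before an erase can be issued).
run("hdparm", "--user-master", "u", "--security-set-pass", PWD, DEV)

# 3) Issue the enhanced erase; the drive works through *all* sectors,
#    spared-out ones included, on a best-effort basis as described above.
run("hdparm", "--user-master", "u", "--security-erase-enhanced", PWD, DEV)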

The old SCSI drives, on the other hand, let you reverse the
sparing process: copy the factory defect list over top
of the grown list, and allow the drive to (in the fullness of
time) re-evaluate any dodgy sectors. SATA/IDE doesn't
allow such interference.

In terms of SMART, there are two parameters of interest, with
regard to (1).

Current Pending Sector is supposed to be a count of sectors
waiting for "write evaluation" the next time the drive goes
to write those sectors. A sector could wait for a couple
years, before a chance comes up, or it could get evaluated
a second from now - it all depends on when and where the
disk gets written. If you wanted to drop the CPS to zero
in a hurry, simply re-writing the entire drive with the
info you read off it should be enough.
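
One way to force that kind of full re-write pass on Linux, without
restoring from a backup, is badblocks in non-destructive read-write mode:
it reads each block, writes test patterns, then puts the original data
back. A sketch, with a placeholder device name (nothing on the disk should
be mounted, and have a backup anyway):

import subprocess

DEV = "/dev/sdX"   # placeholder: make very sure this is the right disk

# -n  non-destructive read-write test (original data is restored)
# -s  show progress, -v verbose
subprocess.run(["badblocks", "-nsv", DEV], check=True)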

Now, what's wrong with that "theory"? Well, on the Seagate
drives I've got, I've *never* seen Current Pending Sector
go non-zero, even when other activity indicates the drive
is sick and Current Pending should be growing. Some brands
of drive probably do use Current Pending Sector, but
not the Seagates I've owned.

Current Pending returns to zero, if an opportunity comes
along to write the entire drive.

Reallocated Sector Count is a measure of how many spares
have been used up. It's thresholded, so the count value only goes
non-zero after a large number of sectors have been spared.
The result is that the user is unaware of exactly how large the
spared sector count is. Generally, a spare sector should be
in the same track or cylinder, because you don't want to
"pay a seek time" to change cylinders to get a spare.
A head switch still costs around 1 millisecond and is
expensive. If a spare was staged in the same track, the
track cache could make it available almost immediately.
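
To see where a given drive stands on those two attributes, smartmontools
will dump the SMART table. A small sketch, assuming smartctl is installed,
the device name is a placeholder, and the drive reports the conventional
attribute names:

import subprocess

DEV = "/dev/sda"   # placeholder device

out = subprocess.run(["smartctl", "-A", DEV],
                     capture_output=True, text=True, check=True).stdout

for line in out.splitlines():
    if "Current_Pending_Sector" in line or "Reallocated_Sector_Ct" in line:
        # Each row shows the normalized (thresholded) VALUE/WORST/THRESH
        # figures plus a vendor-specific RAW_VALUE at the end.
        print(line)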

On IBM drives, one eighth of the cache RAM chip was
used for the spare sector map. That meant no extra
accesses were needed to figure out where the spare is.
If you read 5, and the table says 5 is now at 12, then
the controller can immediately transfer 12 in place of
5 on a read (a toy sketch of that lookup follows below).
Other brands of drive don't offer
public info like that. First it was IBM, then when IBM
research moved to HGST, the HGST web site had tech info
on how drives work. No other manufacturer really goes
into the level of detail that HGST did. Now that HGST
has been sold again, AFAIK the info has gone underground.
So no more tidbits on how drives work... Or what kinda
Carnauba wax they're using on the platters this week
(I'm making fun of their choice of platter lubricants).
The lubricant is bonded to the platter, so it won't
get away. One or two molecules are bonded, and a
"movable" molecule sits on top. You won't need a greasy
rag to wipe your hands, if you touch one.
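
About that remap lookup a few paragraphs up: it amounts to a small table
consulted on every access. A toy illustration only, not any manufacturer's
actual firmware structure:

remap = {5: 12}        # sector 5 has been spared out to spare sector 12

def resolve(lba):
    # Return the physical sector actually used for the requested one.
    return remap.get(lba, lba)

print(resolve(5))      # -> 12, served transparently from the spare
print(resolve(6))      # -> 6, a healthy, untouched sector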

Paul