Store 2.0

From the Store2 wiki

STABILITY Store 2.0

Hypothesis: the three SATA expansion cards occasionally crash the entire system.

Cards:

http://www.sybausa.com/productInfo.php?iid=537 

Syba SY-PEX40008 4-port SATA II PCI-e Software RAID Controller Card--Bundle with Low Profile Bracket, SIL3124 Chipset

These are the same cards that Backblaze still installs (Pod 2.0 AND Pod 3.0!). They sit in three PCI-E x1 slots (the small port).

The RAID stays intact (luckily!), because the cards then block all access completely.

The Arch log files contain no entries at all about the crashes!

The remote log via syslogd (via Myth) shows these as its last entries:

 mdadm: sending ioctl 1261 to a partition (spurious entry, but harmless)
 sata_sil24: IRQ status == 0xffffffff, PCI fault or device removal
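
To check a saved copy of the remote syslog for this fault, a small helper like the following could be used. This is only a sketch: the function name and the log file path passed to it are assumptions, not part of the original setup.

```shell
# Print every line (with its line number) that reports the sata_sil24
# PCI fault seen right before the crashes.
# $1 is the path to a saved syslog file (hypothetical, adjust as needed).
find_sil24_faults() {
    grep -n 'sata_sil24: IRQ status == 0xffffffff' "$1"
}

# Usage (path is an assumption): find_sil24_faults /var/log/remote/store2.log
```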

sata-sil24:

https://ata.wiki.kernel.org/index.php/Sata_sil24 

Spurious interrupts are expected on SiI3124 suffering from IRQ loss erratum on PCI-X


PATCH?

http://old.nabble.com/-PATCH-06-13--sata_sil24%3A-implement-loss-of-completion-interrupt-on-PCI-X-errta-fix-p3799674.html

Thread about accessing drives attached to the SiI3124 chip

http://www.linuxquestions.org/questions/linux-kernel-70/how-to-access-sata-drives-attached-to-sii3124-719408/

Test?

http://marc.info/?l=linux-ide&m=127228317404771&w=2

Opening and mounting the RAID after boot

To prevent auto-assembly at boot, the config file /etc/mdadm.conf must be empty (or at least completely commented out) and /etc/sysconfig/mdadm must contain "MDADM_SCAN=no".
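
Whether the config file really has no active lines left can be checked with a small test. This is a minimal sketch; the function name is made up for illustration.

```shell
# Succeeds (exit 0) when the given config file has no active lines,
# i.e. it is empty or every non-blank line is a comment.
conf_is_inactive() {
    ! grep -Eqv '^[[:space:]]*(#|$)' "$1"
}

# Usage: conf_is_inactive /etc/mdadm.conf && echo "no auto-assembly entries"
```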

1.) Check that all disks are present:

/root/bin/diskserial_sort2.sh 

There must currently be 17 disks. The check is based on the file disknum.txt in /root/bin.

2.) Find and assemble the RAIDs (no autostart):

mdadm --assemble --scan 

3.) Cryptsetup:

cryptsetup luksOpen /dev/md125 cr_md125 

4.) Mount:

mount /dev/mapper/cr_md125 /data

Closing (after umount /data) would be:

cryptsetup luksClose cr_md125
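
Steps 2 to 4 can be chained so that a failure stops the sequence before anything is mounted. The sketch below assumes a hypothetical RUN variable for a dry run; the disk check from step 1 is left out because the exit status of that script is not documented here.

```shell
# Run assemble, luksOpen and mount in order and stop at the first failure.
# RUN is a hypothetical switch: with RUN=echo the commands are only printed.
open_data_raid() {
    ${RUN:-} mdadm --assemble --scan &&
    ${RUN:-} cryptsetup luksOpen /dev/md125 cr_md125 &&
    ${RUN:-} mount /dev/mapper/cr_md125 /data
}
```

Setting RUN=echo before calling open_data_raid prints the three commands without touching any device, which is a cheap way to review the sequence first.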

JD2

java -Xmx512m -jar /home/gagi/jd2/JDownloader.jar

VNC

dergagi.selfhost.bz:5901

Hard disk layout

3000GB Hitachi Deskstar 5K3000 HDS5C3030ALA630 CoolSpin 32MB 3.5" (8.9cm) SATA 6Gb/s

3000GB Western Digital WD30EZRX 3TB internal hard disk (8.9 cm (3.5 inch), 5400 rpm, 2ms, 64MB cache, SATA III)

Problem with WD disks and LCC (load cycle count)

http://idle3-tools.sourceforge.net/ 
http://koitsu.wordpress.com/2012/05/30/wd30ezrx-and-aggressive-head-parking/ 

Get idle3 timer raw value

idle3ctl -g /dev/sdh 

Disable idle3 timer:

idle3ctl -d /dev/sdh 

Read the serial number with:

udevadm info --query=all --name=/dev/sdi | grep ID_SERIAL_SHORT 
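
If only the bare serial is wanted (for example to compare it against the list of data disks), the value can be cut out of the udevadm output. A minimal sketch with a made-up function name:

```shell
# Read `udevadm info --query=all` output on stdin and print only the
# value of ID_SERIAL_SHORT.
serial_from_udev() {
    sed -n 's/^E: ID_SERIAL_SHORT=//p'
}

# Usage: udevadm info --query=all --name=/dev/sdi | serial_from_udev
```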

Serial of the 160GB system disk:

JC0150HT0J7TPC

Serials of the data disks

00  : 00000000000000 (1TB System, Samsung)

works

01  : 
02  : 
03  : 
04  :
05  : 

now works too, Molex contact problem fixed

06  : 
07  : MJ1311YNG4J48A (3TB)
08  : WD-WCC070299387 (3TB WD)
09  : MJ1311YNG3UUPA (3TB)
10  : 

works

11  : MJ1311YNG3SAMA (3TB)
12  : 13V9WK9AS NEW, hot spare data
13  : MJ1311YNG09EDA (3TB) warranty replacement, HOT SPARE data2
14  : 
15  : MCE9215Q0AUYTW (3TB Toshiba, new)

works

16  : MJ0351YNGA02YA (3TB)
17  : 
18  : 
19  : 
20  : WD-WCAWZ1881335 (3TB WD)

now works too, Molex contact problem fixed

21  : 
22  : WD-WCAWZ2279670 (3TB WD)
23  : MJ1311YNG3SSLA (3TB)
24  : MJ1311YNG25Z6A (3TB)
25  : 

works

26  : MJ1311YNG3RM5A (3TB)
27  : MJ1311YNG3NZ3A (3TB)
28  : MJ1311YNG3NT5A (3TB)
29  : 
30  : MCM9215Q0B9LSY (3TB Toshiba, new)

works

31  : 234BGY0GS (3TB Toshiba, new)
32  : 
33  : MJ1311YNG3WZVA (3TB)
34  : MJ1311YNG3Y4SA (3TB)
35  : MJ1311YNG3SYKA (3TB)

works

36  : 
37  : WD-WCC070198169 (3TB WD)
38  : 
39  : MJ1311YNG3RZTA (3TB)
40  : 

works

41  : MJ1311YNG3LTRA (3TB)
42  : 
43  : MJ1311YNG38VGA (3TB)
44  :
45  :

TOTAL: 25 (18 + 4 + 2 hot spares + one system disk) of 46 possible
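
The total can be re-counted from the slot list itself instead of by hand. The sketch below assumes the list is saved in a text file in the same "NN  : SERIAL" layout as above; the function name and the file name in the usage line are hypothetical.

```shell
# Count the slots that have a serial entered after the colon.
# Lines that are not slot lines (status notes, blanks) are ignored.
count_populated_slots() {
    awk -F: '/^[0-9][0-9][ \t]*:/ && $2 ~ /[^ \t]/ { n++ } END { print n + 0 }'
}

# Usage (file name is an assumption): count_populated_slots < slots.txt
```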

RAID build command

in the screen session mdadm

mdadm --create /dev/md125 --chunk=64 --level=raid6 --layout=ls --raid-devices=15 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1 /dev/sdj1 /dev/sdk1 /dev/sdl1 /dev/sdm1 /dev/sdn1 /dev/sdo1 /dev/sdp1


Re-create 2014-01-31: NEW, CORRECT RE-CREATE COMMAND from mdadm-git:

./mdadm --create --assume-clean /dev/md125 --chunk=64 --level=raid6 --layout=ls --raid-devices=18 /dev/sdb1:1024 /dev/sdd1:1024 /dev/sdf1:1024 /dev/sdg1:1024 /dev/sdh1:1024 /dev/sdc1:1024 /dev/sdt1:1024 /dev/sdn1:1024 /dev/sdo1:1024 /dev/sdq1:1024 /dev/sdm1:1024 /dev/sdp1:1024 /dev/sdu1:1024 /dev/sdv1:1024 /dev/sda1:1024 /dev/sds1:1024 /dev/sdl1:1024 /dev/sdw1:1024

Build command for the second RAID

in the screen session mdadm

mdadm --create /dev/md126 --chunk=64 --level=raid6 --layout=ls --raid-devices=4 /dev/sdv1 /dev/sdw1 /dev/sdj1 /dev/sdh1

Encrypt with special parameters for hardware-accelerated encryption:

cryptsetup -v luksFormat --cipher aes-cbc-essiv:sha256 --key-size 256 /dev/md126

Open:

cryptsetup luksOpen /dev/md126 cr_md126 

Create an XFS filesystem on it:

mkfs.xfs /dev/mapper/cr_md126

Setting up a spare group

Write the current config to mdadm.conf

mdadm -D -s >> /etc/mdadm.conf 

Add the spare-group

nano /etc/mdadm.conf

At the very bottom, append spare-group=shared to the end of the ARRAY line:

ARRAY /dev/md/126 metadata=1.2 spares=1 name=store2:126 UUID=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx spare-group=shared
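
Instead of editing by hand, the suffix can be added with a filter. This is a sketch under the assumption that every ARRAY line should join the same group named shared; it writes to stdout so the result can be reviewed before replacing the real file.

```shell
# Append "spare-group=shared" to every ARRAY line that does not have a
# spare-group yet. Reads a config on stdin, writes the result to stdout.
add_spare_group() {
    sed -e '/^ARRAY /{' \
        -e '/spare-group=/!s/[[:space:]]*$/ spare-group=shared/' \
        -e '}'
}

# Usage: add_spare_group < /etc/mdadm.conf > /tmp/mdadm.conf.new
```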

RAID build status

cat /proc/mdstat

refreshed automatically every second

watch -n 1 cat /proc/mdstat
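
For scripts or notifications, the progress percentage alone can be pulled out of /proc/mdstat. A minimal sketch, assuming the usual progress line format with "resync", "recovery", "reshape" or "check" followed by a percentage; the function name is made up.

```shell
# Print the running operation and its percentage, one per array in progress.
# Reads /proc/mdstat content on stdin; prints nothing when no array is busy.
mdstat_progress() {
    sed -nE 's/.*(resync|recovery|reshape|check)[[:space:]]*=[[:space:]]*([0-9.]+%).*/\1 \2/p'
}

# Usage: mdstat_progress < /proc/mdstat
```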

Encrypting by hand (without Yast2)

Encrypt:

cryptsetup -v --key-size 256 luksFormat /dev/md125 

With special parameters for hardware-accelerated encryption:

cryptsetup -v luksFormat --cipher aes-cbc-essiv:sha256 --key-size 256 /dev/md125

Open:

cryptsetup luksOpen /dev/md125 cr_md125 

Create the filesystem on it:

mkfs.xfs /dev/mapper/cr_md125 
store2:~ # mkfs.xfs /dev/mapper/cr_md125
meta-data=/dev/mapper/cr_md125   isize=256    agcount=36, agsize=268435424 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=9523357168, imaxpct=5
         =                       sunit=16     swidth=208 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
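
The numbers in this mkfs.xfs output can be cross-checked against the array geometry. The sketch below only does the arithmetic on values copied from the output above: the stripe unit should equal the mdadm chunk size (--chunk=64), and swidth divided by sunit should equal the number of data disks, which for a 15-device RAID6 is 15 minus 2 parity. The function name is an assumption.

```shell
# Values copied from the mkfs.xfs output (bsize is in bytes, the rest in blocks).
bsize=4096
blocks=9523357168
sunit=16
swidth=208

xfs_geometry() {
    echo "stripe unit KiB: $(( sunit * bsize / 1024 ))"
    echo "data disks: $(( swidth / sunit ))"
    echo "filesystem bytes: $(( blocks * bsize ))"
}

xfs_geometry
```

The result is a 64 KiB stripe unit, 13 data disks, and roughly 39 TB of filesystem, which fits 13 data disks of 3 TB each.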


Status:

cryptsetup luksDump /dev/md125

Growing the RAID

Preparing the hard disk

Open gdisk with the first hard drive:
$ gdisk /dev/sda 

and type the following commands at the prompt:

   Add a new partition: n
   Select the default partition number: Enter
   Use the default for the first sector: Enter
   For sda1 and sda2 type the appropriate size (e.g. +100M and +2048M). For sda3 just hit Enter to select the remainder of the disk.
   Select Linux RAID as the partition type: fd00
   Write the table to disk and exit: w 

Bad Blocks

Start a screen session

screen -S bb
badblocks -vs -o sdy-badblock-test /dev/sdy

-v is verbose, -s shows progress, -o names the output file (log). Run this way, badblocks only searches for bad blocks and does not destroy any data.
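
The output file written with -o contains one bad block number per line, so the result of a finished run can be summarised by counting lines. A minimal sketch with a made-up function name:

```shell
# Print how many bad blocks were logged in a badblocks output file.
# An empty file means the run found none.
bad_block_count() {
    wc -l < "$1" | tr -d ' '
}

# Usage: bad_block_count sdy-badblock-test
```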


Detach

Ctrl-a d = detach

Reattach

screen -r bb

OR attach again (also works while the session is still attached elsewhere)

screen -x bb

Adding a device to the RAID

mdadm --add /dev/md125 /dev/sdc1

Reshape the RAID with the additional device (takes about 3 full days)

mdadm --grow --raid-devices=18 /dev/md125 --backup-file=/home/gagi/mda125backup

To see who or what is currently accessing the filesystem:

lsof /data 

Stop the Samba service:

rcsmb stop 
systemctl stop smbd 

Unmount:

umount /data 


XFS (Data)

Check XFS (unmounted)

xfs_repair -n -o bhash=1024 /dev/mapper/cr_md125 

Grow the crypt container

cryptsetup --verbose resize cr_md125 

Mount:

mount /dev/mapper/cr_md125 /data

Grow XFS

xfs_growfs /data 

Check XFS (unmounted)

xfs_repair -n -o bhash=1024 /dev/mapper/cr_md125 


Read-only mount:

mount -o ro /dev/mapper/cr_md125 /data

Samba share

There's a bug in Samba in openSuse 11.4. Here's the workaround:

   go to Yast --> AppArmor --> Control Panel (on) --> Configure Profile Modes --> usr.sbin.smbd = complain
   go to Yast --> system --> runlevels --> smb=on + nmb=on
   reboot

Direct network connection Store1 <-> Store 2.0

You can also take a look at what is in /etc/udev/rules.d/70-persistent-net... (or whatever the file is called).

It assigns each MAC address to a specific network interface name (eth0, eth1, ...).

You can also delete or move the file; it is recreated on the next reboot.

Things sometimes get mixed up there, e.g. after a kernel update or a BIOS update.

WORKS! Separate subnet (192.168.2.100 and 192.168.2.102)

Fast-Copy

1.) Receiver (Store2.0)

cd <target directory> 
netcat -l -p 4323 | gunzip | cpio -i -d -m 

2.) Sender (Store)

cd <source directory>
find . -type f | cpio -o | gzip -1 | netcat 192.168.2.102 4323





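1.) Receiver (Store2.0)

socat tcp4-listen:4323 stdout | tar xvpf - -C /

GNU tar strips the leading / when archiving, so the receiver extracts relative to / to end up in /data/eBooks again.

2.) Sender (Store)

tar cvf - /data/eBooks | socat stdin tcp4:192.168.2.102:4323 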


Test with a progress display when the data size is known:

1.) Receiver (Store2.0)

cd <target directory> 
socat tcp4-listen:4323 stdout | pv -s 93G | tar xvpf -

2.) Sender (Store)

cd <source directory> 
tar cvf - * | pv -s 93G | socat stdin tcp4:192.168.2.102:4323
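
The size passed to pv -s does not have to be typed in by hand; it can be measured on the sender first. A minimal sketch, assuming GNU du (the -b option) and a made-up function name. The tar stream is slightly larger than the measured payload because of its headers, so the progress display is an approximation.

```shell
# Print the apparent size of a directory tree in bytes (GNU du).
dir_size_bytes() {
    du -sb "$1" | cut -f1
}

# Usage on the sender:
#   tar cf - . | pv -s "$(dir_size_bytes .)" | socat stdin tcp4:192.168.2.102:4323
```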


dd if=/dev/sdl | bar -s 1.5T | dd of=/dev/sdw

FileBot Renamer Linux

filebot -rename -get-subtitles -non-strict /data/Downloads/Fertig/ --output /data/Downloads/Fertig/FileBot/ --format "{n}/Season {s}/{n}.{s00e00}.{t}" --db TheTVDB 
filebot -get-missing-subtitles -non-strict -r --lang en /data/Downloads/Fertig/FileBot/ 
filebot -script fn:replace --conflict override --def "e=.eng.srt" "r=.srt" /data/Downloads/Fertig/FileBot/

RemoteDesktop ArchLinux Client

e.g. to Busfahrer: 
rdesktop -g 1440x900 -P -z  -x l -r sound:off -u gagi 192.168.1.149

Backplane rotation for fault diagnosis

Original state with 24 disks, 2013-11-12

/dev/sdx -> 00 : 00000000000000 (1TB) 31°C
/dev/sdp -> 01 : WD-WCC070299387 (3TB WD) 31°C
/dev/sdq -> 03 : MJ1311YNG3SSLA (3TB) 33°C
/dev/sdr -> 05 : MJ1311YNG3NZ3A (3TB) 32°C
/dev/sds -> 07 : MJ1311YNG4J48A (3TB) 32°C
/dev/sdt -> 09 : MJ1311YNG3UUPA (3TB) 33°C
/dev/sdu -> 11 : MJ1311YNG3SAMA (3TB) 32°C
/dev/sdv -> 13 : MJ1311YNG3SU1A (3TB) 34°C
/dev/sdw -> 15 : MCE9215Q0AUYTW (3TB Toshiba, new) 31°C
/dev/sdh -> 16 : MJ0351YNGA02YA (3TB) not in use, bb-check 2013-08-28 37°C
/dev/sdi -> 18 : MJ1311YNG3Y4SA (3TB) 40°C
/dev/sdj -> 20 : WD-WCAWZ1881335 (3TB WD) hot spare 38°C
/dev/sdk -> 22 : WD-WCAWZ2279670 (3TB WD) 41°C
/dev/sdl -> 24 : MJ1311YNG25Z6A (3TB) 39°C
/dev/sdm -> 26 : MJ1311YNG3RM5A (3TB) 39°C
/dev/sdn -> 28 : MJ1311YNG3NT5A (3TB) 40°C
/dev/sdo -> 30 : MCM9215Q0B9LSY (3TB Toshiba, new) 38°C
/dev/sda -> 31 : 234BGY0GS (3TB Toshiba, new) 40°C
/dev/sdb -> 33 : MJ1311YNG3WZVA (3TB) 43°C
/dev/sdc -> 35 : MJ1311YNG3SYKA (3TB) 42°C
/dev/sdd -> 37 : WD-WCC070198169 (3TB WD) 41°C
/dev/sde -> 39 : MJ1311YNG3RZTA (3TB) 39°C
/dev/sdf -> 41 : MJ1311YNG3LTRA (3TB) 39°C
/dev/sdg -> 43 : MJ1311YNG38VGA (3TB) 39°C
24 disks found in total.
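
When rotating disks between backplanes it helps to see at a glance which slot runs hottest. The sketch below reads a listing in the format shown above (device, slot, serial, temperature as the last field) and prints the hottest entry; the function name and the file name in the usage line are assumptions.

```shell
# Print device, slot and temperature of the hottest disk in the listing.
# The temperature is taken from the last field; everything except digits
# (the degree sign and the C) is stripped before comparing.
hottest_disk() {
    awk '/^\/dev\// {
             t = $NF
             gsub(/[^0-9]/, "", t)
             if (t + 0 > max) { max = t + 0; dev = $1; slot = $3 }
         }
         END { print dev, "slot", slot, max "C" }'
}

# Usage (file name is an assumption): hottest_disk < backplane-2013-11-12.txt
```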

Crashes

ArchLinux 2011-09-09

Sep  9 18:20:04 localhost kernel: [156439.479947] program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Sep  9 18:20:04 localhost kernel: [156439.480035] program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Sep  9 18:20:04 localhost kernel: [156439.486612] program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Sep  9 18:20:04 localhost kernel: [156439.503656] program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Sep  9 18:20:04 localhost kernel: [156439.504562] program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Sep  9 18:34:11 localhost -- MARK --
Sep  9 18:42:42 localhost kernel: [157797.911330] r8169: eth0: link up
Sep  9 18:54:11 localhost -- MARK --
Sep  9 19:14:11 localhost -- MARK --
Sep  9 19:34:11 localhost -- MARK --
Sep  9 19:54:11 localhost -- MARK --
Sep  9 20:14:11 localhost -- MARK --
Sep  9 20:27:32 localhost kernel: [164086.971566] r8169: eth0: link up
Sep  9 20:27:42 localhost kernel: [164097.580071] r8169: eth0: link up
Sep  9 20:27:50 localhost kernel: [164105.391755] r8169: eth0: link up
Sep  9 20:27:51 localhost kernel: [164106.272019] r8169: eth0: link up
Sep  9 20:28:12 localhost kernel: [164127.150062] r8169: eth0: link up
Sep  9 20:28:22 localhost kernel: [164137.941304] r8169: eth0: link up
Sep  9 20:28:33 localhost kernel: [164148.890097] r8169: eth0: link up
Sep  9 20:28:38 localhost kernel: [164153.080536] r8169: eth0: link up
Sep  9 20:28:58 localhost kernel: [164173.790064] r8169: eth0: link up
Sep  9 20:42:19 localhost kernel: [    0.000000] Initializing cgroup subsys cpuset
Sep  9 20:42:19 localhost kernel: [    0.000000] Initializing cgroup subsys cpu
Sep  9 20:42:19 localhost kernel: [    0.000000] Linux version 2.6.32-lts (tobias@T-POWA-LX) (gcc version 4.6.1 20110819    (prerelease) (GCC) ) #1 SMP Tue Aug 30 08:59:44 CEST 2011
Sep  9 20:42:19 localhost kernel: [    0.000000] Command line: root=/dev/disk/by-uuid/ba47ea9a-c24c-4dc6-a9a2-ca3b442bdbfc ro vga=0x31B
Sep  9 20:42:19 localhost kernel: [    0.000000] KERNEL supported cpus:
Sep  9 20:42:19 localhost kernel: [    0.000000]   Intel GenuineIntel
Sep  9 20:42:19 localhost kernel: [    0.000000]   AMD AuthenticAMD
Sep  9 20:42:19 localhost kernel: [    0.000000]   Centaur CentaurHauls

OpenSuse 2011-09-26

Sep 26 23:15:59 store2 su: (to nobody) root on none
Sep 26 23:17:17  su: last message repeated 2 times
Sep 26 23:25:23 store2 smartd[4617]: Device: /dev/sdc [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 166 to 162
Sep 26 23:25:26 store2 smartd[4617]: Device: /dev/sde [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 171 to 166
Sep 26 23:25:29 store2 smartd[4617]: Device: /dev/sdg [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 171 to 166
Sep 26 23:25:36 store2 smartd[4617]: Device: /dev/sdk [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 171 to 166
Sep 26 23:25:37 store2 smartd[4617]: Device: /dev/sdl [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 181 to 187
Sep 26 23:55:22 store2 smartd[4617]: Device: /dev/sdb [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 100 to 96
Sep 26 23:55:23 store2 smartd[4617]: Device: /dev/sdc [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 162 to 166
Sep 26 23:55:26 store2 smartd[4617]: Device: /dev/sde [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 100 to 99
Sep 26 23:55:26 store2 smartd[4617]: Device: /dev/sde [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 166 to 171
Sep 26 23:55:29 store2 smartd[4617]: Device: /dev/sdg [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 166 to 171
Sep 26 23:55:32 store2 smartd[4617]: Device: /dev/sdi [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 100 to 98
Sep 27 00:55:26 store2 kernel: imklog 5.6.5, log source = /proc/kmsg started.

OpenSuse 2011-09-27

Sep 27 16:35:17 store2 smbd[29588]: [2011/09/27 16:35:17.391212,  0] param/loadparm.c:8445(check_usershare_stat)
Sep 27 16:35:17 store2 smbd[29588]:   check_usershare_stat: file /var/lib/samba/usershares/ owned by uid 0 is not a regular file
Sep 27 16:44:06 store2 smbd[29163]: [2011/09/27 16:44:06.795153,  0] lib/util_sock.c:474(read_fd_with_timeout)
Sep 27 16:44:06 store2 smbd[29163]: [2011/09/27 16:44:06.795341,  0] lib/util_sock.c:1441(get_peer_addr_internal)
Sep 27 16:44:06 store2 smbd[29597]: [2011/09/27 16:44:06.795323,  0] lib/util_sock.c:474(read_fd_with_timeout)
Sep 27 16:44:06 store2 smbd[29163]:   getpeername failed. Error was Der Socket ist nicht verbunden
Sep 27 16:44:06 store2 smbd[29592]: [2011/09/27 16:44:06.795368,  0] lib/util_sock.c:474(read_fd_with_timeout)
Sep 27 16:44:06 store2 smbd[29163]:   read_fd_with_timeout: client 0.0.0.0 read error = Die Verbindung wurde vom Kommunikationspartner zurückgesetzt.
Sep 27 16:44:06 store2 smbd[29597]: [2011/09/27 16:44:06.795422,  0] lib/util_sock.c:1441(get_peer_addr_internal)
Sep 27 16:44:06 store2 smbd[29597]:   getpeername failed. Error was Der Socket ist nicht verbunden
Sep 27 16:44:06 store2 smbd[29597]:   read_fd_with_timeout: client 0.0.0.0 read error = Die Verbindung wurde vom Kommunikationspartner zurückgesetzt.
Sep 27 16:44:06 store2 smbd[29592]: [2011/09/27 16:44:06.795468,  0] lib/util_sock.c:1441(get_peer_addr_internal)
Sep 27 16:44:06 store2 smbd[29592]:   getpeername failed. Error was Der Socket ist nicht verbunden
Sep 27 16:44:06 store2 smbd[29592]:   read_fd_with_timeout: client 0.0.0.0 read error = Die Verbindung wurde vom Kommunikationspartner zurückgesetzt.
Sep 27 16:45:42 store2 smbd[29585]: [2011/09/27 16:45:42.499038,  0] lib/util_sock.c:474(read_fd_with_timeout)
Sep 27 16:45:42 store2 smbd[29593]: [2011/09/27 16:45:42.499082,  0] lib/util_sock.c:474(read_fd_with_timeout)
Sep 27 16:45:42 store2 smbd[29593]: [2011/09/27 16:45:42.499174,  0] lib/util_sock.c:1441(get_peer_addr_internal)
Sep 27 16:45:42 store2 smbd[29585]: [2011/09/27 16:45:42.499174,  0] lib/util_sock.c:1441(get_peer_addr_internal)
Sep 27 16:45:42 store2 smbd[29593]:   getpeername failed. Error was Der Socket ist nicht verbunden
Sep 27 16:45:42 store2 smbd[29585]:   getpeername failed. Error was Der Socket ist nicht verbunden
Sep 27 16:45:42 store2 smbd[29593]:   read_fd_with_timeout: client 0.0.0.0 read error = Die Verbindung wurde vom Kommunikationspartner zurückgesetzt.
Sep 27 16:45:42 store2 smbd[29585]:   read_fd_with_timeout: client 0.0.0.0 read error = Die Verbindung wurde vom Kommunikationspartner zurückgesetzt.
Sep 27 19:35:14 store2 kernel: imklog 5.6.5, log source = /proc/kmsg started.

OpenSuse 2011-09-29

during a heavy copy job from Store

Sep 29 23:16:19  su: last message repeated 2 times
Sep 29 23:28:41 store2 smartd[4624]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 157 to 162
Sep 29 23:28:44 store2 smartd[4624]: Device: /dev/sdd [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 153 to 157
Sep 29 23:28:49 store2 smartd[4624]: Device: /dev/sdh [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 100 to 99
Sep 29 23:28:53 store2 smartd[4624]: Device: /dev/sdk [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 162 to 157
Sep 29 23:28:57 store2 smartd[4624]: Device: /dev/sdo [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 176 to 181
Sep 29 23:58:44 store2 smartd[4624]: Device: /dev/sdd [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 100 to 99
Sep 29 23:58:49 store2 smartd[4624]: Device: /dev/sdh [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 99 to 100
Sep 29 23:58:53 store2 smartd[4624]: Device: /dev/sdk [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 157 to 162
Sep 29 23:58:57 store2 smartd[4624]: Device: /dev/sdn [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 187 to 193
Sep 29 23:58:58 store2 smartd[4624]: Device: /dev/sdo [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 181 to 176
Sep 29 23:59:02 store2 smartd[4624]: Device: /dev/sdq [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 176 to 181
Sep 30 00:28:41 store2 smartd[4624]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 187 to 193
Sep 30 00:28:43 store2 smartd[4624]: Device: /dev/sdd [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 99 to 100
Sep 30 00:28:49 store2 smartd[4624]: Device: /dev/sdh [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 100 to 99
Sep 30 00:28:58 store2 smartd[4624]: Device: /dev/sdo [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 176 to 181
Sep 30 00:58:47 store2 smartd[4624]: Device: /dev/sdf [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 100 to 99
Sep 30 00:58:49 store2 smartd[4624]: Device: /dev/sdh [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 99 to 100
Sep 30 00:58:59 store2 smartd[4624]: Device: /dev/sdp [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 100 to 99
Sep 30 01:28:47 store2 smartd[4624]: Device: /dev/sdf [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 99 to 100
Sep 30 01:28:47 store2 smartd[4624]: Device: /dev/sdf [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 157 to 162
Sep 30 01:28:50 store2 smartd[4624]: Device: /dev/sdi [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 100 to 99
Sep 30 01:58:47 store2 smartd[4624]: Device: /dev/sdf [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 162 to 157
Sep 30 01:59:00 store2 smartd[4624]: Device: /dev/sdp [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 99 to 100
Sep 30 02:28:45 store2 smartd[4624]: Device: /dev/sdd [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 100 to 99
Sep 30 02:28:46 store2 smartd[4624]: Device: /dev/sde [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 100 to 99
Sep 30 02:28:46 store2 smartd[4624]: Device: /dev/sde [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 157 to 162
Sep 30 02:28:48 store2 smartd[4624]: Device: /dev/sdf [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 100 to 99
Sep 30 02:28:52 store2 smartd[4624]: Device: /dev/sdi [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 99 to 100
Sep 30 02:58:45 store2 smartd[4624]: Device: /dev/sdd [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 99 to 100
Sep 30 02:58:46 store2 smartd[4624]: Device: /dev/sde [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 99 to 100
Sep 30 02:58:46 store2 smartd[4624]: Device: /dev/sde [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 162 to 157 
Sep 30 02:58:47 store2 smartd[4624]: Device: /dev/sdf [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 99 to 100
Sep 30 02:58:49 store2 smartd[4624]: Device: /dev/sdh [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 100 to 99
Sep 30 09:39:22 store2 kernel: imklog 5.6.5, log source = /proc/kmsg started.

What you are seeing are the Normalized Attribute values changing.

For example when the Raw_Read_Error_Rate changed from 99 to 100, the increase in Normalized value from 99 to 100 means that the disk now thinks it is a bit LESS likely to fail than before, because this Normalized value is moving further above the (low) Threshold value.
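
To read long smartd excerpts like the ones above more quickly, the "changed from X to Y" lines can be reduced to device, attribute and direction. This is only a sketch with a made-up function name; "up" means the normalized value moved further above the threshold, as explained above.

```shell
# Read syslog lines on stdin and summarise each smartd attribute change as
# "<device> <attribute> <old>-><new> normalized up|down|same".
smart_trend() {
    sed -nE 's/.*Device: ([^ ]+) .*Attribute: ([0-9]+) ([A-Za-z_]+) changed from ([0-9]+) to ([0-9]+).*/\1 \3 \4 \5/p' |
    while read -r dev attr old new; do
        if [ "$new" -gt "$old" ]; then
            dir=up
        elif [ "$new" -lt "$old" ]; then
            dir=down
        else
            dir=same
        fi
        echo "$dev $attr $old->$new normalized $dir"
    done
}

# Usage (log path is an assumption): smart_trend < /var/log/messages
```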

ArchLinux 2011-10-17

Oct 17 21:21:35 localhost smartd[1941]: Device: /dev/sdc [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 181 to 176
Oct 17 21:21:37 localhost smartd[1941]: Device: /dev/sde [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 181 to 176
Oct 17 21:21:45 localhost smartd[1941]: Device: /dev/sdm [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 200 to 206
Oct 17 21:30:03 localhost -- MARK --
Oct 17 21:50:03 localhost -- MARK --
Oct 17 21:51:37 localhost smartd[1941]: Device: /dev/sde [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 176 to 181
Oct 17 21:51:41 localhost smartd[1941]: Device: /dev/sdi [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 181 to 187
Oct 17 22:10:03 localhost -- MARK --
Oct 17 22:21:34 localhost smartd[1941]: Device: /dev/sdc [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 176 to 181
Oct 17 22:21:47 localhost smartd[1941]: Device: /dev/sdo [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 206 to 214
Oct 17 22:21:49 localhost smartd[1941]: Device: /dev/sdq [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 206 to 214
Oct 17 22:30:03 localhost -- MARK --
Oct 17 22:50:03 localhost -- MARK --
Oct 17 23:11:18 localhost kernel: [    0.000000] Initializing cgroup subsys cpuset
Oct 17 23:11:18 localhost kernel: [    0.000000] Initializing cgroup subsys cpu

ArchLinux 2011-11-06

Nov  6 12:39:05 localhost -- MARK --
Nov  6 12:42:18 localhost smartd[1927]: Device: /dev/sdi [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 193 to 200
Nov  6 12:42:20 localhost smartd[1927]: Device: /dev/sdj [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 193 to 187
Nov  6 12:42:24 localhost smartd[1927]: Device: /dev/sdn [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 222 to 214
Nov  6 12:42:25 localhost smartd[1927]: Device: /dev/sdo [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 230 to 222
Nov  6 12:42:26 localhost smartd[1927]: Device: /dev/sdp [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 214 to 222
Nov  6 12:59:05 localhost -- MARK --
Nov  6 14:29:21 localhost kernel: [    0.000000] Initializing cgroup subsys cpuset
Nov  6 14:29:21 localhost kernel: [    0.000000] Initializing cgroup subsys cpu
Nov  6 14:29:21 localhost kernel: [    0.000000] Linux version 3.0-ARCH (tobias@T-POWA-LX) (gcc version 4.6.1 20110819 (prerelease) (GCC) ) #1 SMP PREEMPT Wed Oct$
Nov  6 14:29:21 localhost kernel: [    0.000000] Command line: root=/dev/disk/by-id/ata-Hitachi_HCS5C1016CLA382_JC0150HT0J7TPC-part3 ro
Nov  6 14:29:21 localhost kernel: [    0.000000] BIOS-provided physical RAM map: 

JUST LIKE THAT!

ArchLinux 2011-11-21

A system update (including a new kernel) had been run beforehand, but the machine had not yet been rebooted.

Nov 21 09:30:27 localhost smartd[2208]: Device: /dev/sdj [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 200 to 206
Nov 21 09:30:30 localhost smartd[2208]: Device: /dev/sdl [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 240 to 250
Nov 21 09:30:31 localhost smartd[2208]: Device: /dev/sdm [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 250 to 240
Nov 21 09:30:35 localhost smartd[2208]: Device: /dev/sdp [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 230 to 240
Nov 21 09:43:32 localhost kernel: [1280595.864622] ------------[ cut here ]------------
Nov 21 09:43:32 localhost kernel: [1280595.864636] WARNING: at drivers/gpu/drm/i915/i915_irq.c:649 ironlake_irq_handler+0x1102/0x1110 [i915]()
Nov 21 09:43:32 localhost kernel: [1280595.864638] Hardware name: H61M-S2V-B3
Nov 21 09:43:32 localhost kernel: [1280595.864639] Missed a PM interrupt
Nov 21 09:43:32 localhost kernel: [1280595.864640] Modules linked in: xfs sha256_generic dm_crypt dm_mod raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx md_mod coretemp nfsd exportfs nfs lockd fscache  auth_rpcgss nfs_acl sunrpc ipv6 ext2 sr_mod cdrom snd_hda_codec_realtek usb_storage uas sg evdev snd_hda_intel snd_hda_codec iTCO_wdt snd_hwdep snd_pcm snd_timer i915 snd drm_kms_helper drm pcspkr i2c_algo_bit r8169 ppdev i2c_i801 shp chp parport_pc intel_agp i2c_core pci_hotplug parport intel_gtt mei(C) soundcore snd_page_alloc processor button mii iTCO_vendor_support video aesni_intel cryptd aes_x86_64 aes_generic ext4 mbcache jbd2 crc16 usbhid hid sd_mod sata_sil24 ahci libahci libata scsi_mod ehci_hcd usbcore
Nov 21 09:43:32 localhost kernel: [1280595.864674] Pid: 0, comm: swapper Tainted: G         C  3.0-ARCH #1
Nov 21 09:43:32 localhost kernel: [1280595.864675] Call Trace:
Nov 21 09:43:32 localhost kernel: [1280595.864676]  <IRQ>  [<ffffffff8105c76f>] warn_slowpath_common+0x7f/0xc0
Nov 21 09:43:32 localhost kernel: [1280595.864684]  [<ffffffff8105c866>] warn_slowpath_fmt+0x46/0x50
Nov 21 09:43:32 localhost kernel: [1280595.864688]  [<ffffffff81078f7d>] ? queue_work+0x5d/0x70
Nov 21 09:43:32 localhost kernel: [1280595.864693]  [<ffffffffa0235a22>] ironlake_irq_handler+0x1102/0x1110 [i915]
Nov 21 09:43:32 localhost kernel: [1280595.864696]  [<ffffffff812a4bc5>] ? dma_issue_pending_all+0x95/0xa0
Nov 21 09:43:32 localhost kernel: [1280595.864699]  [<ffffffff81333db1>] ? net_rx_action+0x131/0x300
Nov 21 09:43:32 localhost kernel: [1280595.864702]  [<ffffffff810bf835>] handle_irq_event_percpu+0x75/0x2a0
Nov 21 09:43:32 localhost kernel: [1280595.864705]  [<ffffffff810bfaa5>] handle_irq_event+0x45/0x70
Nov 21 09:43:32 localhost kernel: [1280595.864707]  [<ffffffff810c21af>] handle_edge_irq+0x6f/0x120
Nov 21 09:43:32 localhost kernel: [1280595.864710]  [<ffffffff8100d9f2>] handle_irq+0x22/0x40
Nov 21 09:43:32 localhost kernel: [1280595.864712]  [<ffffffff813f66aa>] do_IRQ+0x5a/0xe0
Nov 21 09:43:32 localhost kernel: [1280595.864715]  [<ffffffff813f4393>] common_interrupt+0x13/0x13
Nov 21 09:43:32 localhost kernel: [1280595.864716]  <EOI>  [<ffffffff81273cdb>] ? intel_idle+0xcb/0x120
Nov 21 09:43:32 localhost kernel: [1280595.864720]  [<ffffffff81273cbd>] ? intel_idle+0xad/0x120
Nov 21 09:43:32 localhost kernel: [1280595.864723]  [<ffffffff81313d9d>] cpuidle_idle_call+0x9d/0x350
Nov 21 09:43:32 localhost kernel: [1280595.864726]  [<ffffffff8100a21a>] cpu_idle+0xba/0x100
Nov 21 09:43:32 localhost kernel: [1280595.864729]  [<ffffffff813d1eb2>] rest_init+0x96/0xa4
Nov 21 09:43:32 localhost kernel: [1280595.864731]  [<ffffffff81748c23>] start_kernel+0x3de/0x3eb
Nov 21 09:43:32 localhost kernel: [1280595.864733]  [<ffffffff81748347>] x86_64_start_reservations+0x132/0x136
Nov 21 09:43:32 localhost kernel: [1280595.864735]  [<ffffffff81748140>] ? early_idt_handlers+0x140/0x140
Nov 21 09:43:32 localhost kernel: [1280595.864737]  [<ffffffff8174844d>] x86_64_start_kernel+0x102/0x111
Nov 21 09:43:32 localhost kernel: [1280595.864738] ---[ end trace 01037f4ec3ec4ee5 ]---
Nov 21 10:00:16 localhost smartd[2208]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 206 to 200
Nov 21 10:00:18 localhost smartd[2208]: Device: /dev/sdc [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 200 to 193
Nov 21 10:00:19 localhost smartd[2208]: Device: /dev/sdd [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 206 to 200
Nov 21 10:00:23 localhost smartd[2208]: Device: /dev/sdg [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 206 to 200
Nov 21 10:00:29 localhost smartd[2208]: Device: /dev/sdl [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 250 to 240
Nov 21 10:00:30 localhost smartd[2208]: Device: /dev/sdm [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 240 to 250
Nov 21 10:00:33 localhost smartd[2208]: Device: /dev/sdp [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 240 to 230
Nov 21 10:00:34 localhost smartd[2208]: Device: /dev/sdq [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 240 to 250
Nov 21 11:52:01 localhost kernel: [    0.000000] Initializing cgroup subsys cpuset
Nov 21 11:52:01 localhost kernel: [    0.000000] Initializing cgroup subsys cpu