Cannot get disk geometry: A troublesome disk replacement on Solaris
The following case was opened in MOS (My Oracle Support) to request a disk replacement:
////// metastat -p shows the SVM configuration: a single volume with 2 mirror sides and 6 disks on each side.
d10 -m d30 d40 1
d30 1 6 c1t3d0s0 c1t5d0s0 c1t7d0s0 c1t9d0s0 c1t11d0s0 c1t13d0s0 -i 32b
d40 1 6 c1t2d0s0 c1t4d0s0 c1t6d0s0 c1t8d0s0 c1t10d0s0 c1t12d0s0 -i 32b
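As a quick cross-check of that layout, the slices in each stripe can be counted straight from the `metastat -p` output. This is only a minimal sketch, assuming the one-line stripe format shown above (volume name, number of stripes, slices per stripe, then the slice names); the function name `count_stripe_slices` is made up for illustration.

```shell
# count_stripe_slices: read `metastat -p` output on stdin and print, for each
# stripe line, the volume name and how many cXtYdZsN slices it lists.
# Hypothetical helper; mirror definitions (lines with -m) are skipped.
count_stripe_slices() {
    awk '
        $2 == "-m" { next }                      # mirror line, lists no slices
        {
            n = 0
            for (i = 2; i <= NF; i++)
                if ($i ~ /^c[0-9]+t[0-9]+d[0-9]+s[0-9]+$/) n++
            if (n > 0) print $1, n
        }
    '
}
```

It would be used as `metastat -p | count_stripe_slices`. For the two stripe lines above it reports six slices each for d30 and d40.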
///// WARNING: ALL THE STATE DATABASE REPLICAS ARE ON A SINGLE DISK, A SINGLE POINT OF FAILURE, SO THIS IS NOT RECOMMENDED.
metadb -i
----------------------------------------------------
flags first blk block count
a m pc luo 16 8192 /dev/dsk/c1t2d0s7
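To make that single point of failure easy to spot, the replicas can be grouped by disk. The sketch below is an illustration, not part of the original case; it assumes the device path is the last field of each `metadb` row, as in the output above, and `replica_disks` is a hypothetical name.

```shell
# replica_disks: read `metadb` output on stdin and print every disk that
# holds at least one state database replica, followed by its replica count.
# Hypothetical helper; header and separator lines are ignored because their
# last field is not a /dev/dsk/ path.
replica_disks() {
    awk '
        $NF ~ /^\/dev\/dsk\// {
            disk = $NF
            sub(/^\/dev\/dsk\//, "", disk)   # /dev/dsk/c1t2d0s7 -> c1t2d0s7
            sub(/s[0-9]+$/, "", disk)        # c1t2d0s7 -> c1t2d0
            count[disk]++
        }
        END { for (d in count) print d, count[d] }
    '
}
```

Run as `metadb | replica_disks`. If it prints a single line, every replica lives on the same disk.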
///// On April 19, read and write problems began on disk sd7, which is c1t7d0
Apr 19 07:42:27 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0 (mpt0):
Apr 19 07:42:27 serverpi Disconnected command timeout for Target 7
Apr 19 07:42:28 serverpi scsi: [ID 365881 kern.info] /pci@400/pci@0/pci@8/scsi@0 (mpt0):
Apr 19 07:42:28 serverpi Log info 31140000 received for target 7.
Apr 19 07:42:28 serverpi scsi_status=0, ioc_status=8048, scsi_state=c
Apr 19 07:43:22 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0/sd@7,0 (sd7):
Apr 19 07:43:22 serverpi Error for Command: write(10) Error Level: Retryable
Apr 19 07:43:22 serverpi scsi: [ID 107833 kern.notice] Requested Block: 2546032 Error Block: 2546032
Apr 19 07:43:22 serverpi scsi: [ID 107833 kern.notice] Vendor: SEAGATE Serial Number: XxXxXxXxXx
Apr 19 07:43:22 serverpi scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Apr 19 07:43:22 serverpi scsi: [ID 107833 kern.notice] ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x2
Apr 19 07:45:21 serverpi scsi: [ID 365881 kern.info] /pci@400/pci@0/pci@8/scsi@0 (mpt0):
Apr 19 07:45:21 serverpi Log info 3112011a received for target 7.
Apr 19 07:45:21 serverpi scsi_status=0, ioc_status=804b, scsi_state=c
Apr 19 07:45:37 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0/sd@7,0 (sd7):
Apr 19 07:45:37 serverpi Error for Command: write Error Level: Retryable
Apr 19 07:45:37 serverpi scsi: [ID 107833 kern.notice] Requested Block: 122208 Error Block: 122208
Apr 19 07:45:37 serverpi scsi: [ID 107833 kern.notice] Vendor: SEAGATE Serial Number: XxXxXxXxXx
Apr 19 07:45:37 serverpi scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Apr 19 07:45:37 serverpi scsi: [ID 107833 kern.notice] ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x2
Apr 19 07:46:38 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0 (mpt0):
Apr 19 07:46:38 serverpi Disconnected command timeout for Target 7
Apr 19 07:46:40 serverpi scsi: [ID 365881 kern.info] /pci@400/pci@0/pci@8/scsi@0 (mpt0):
Apr 19 07:46:40 serverpi Log info 31140000 received for target 7.
Apr 19 07:46:40 serverpi scsi_status=0, ioc_status=8048, scsi_state=c
Apr 19 07:47:50 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0 (mpt0):
Apr 19 07:47:50 serverpi Disconnected command timeout for Target 7
Apr 19 07:47:51 serverpi scsi: [ID 365881 kern.info] /pci@400/pci@0/pci@8/scsi@0 (mpt0):
Apr 19 07:47:51 serverpi Log info 31140000 received for target 7.
Apr 19 07:47:51 serverpi scsi_status=0, ioc_status=8048, scsi_state=c
Apr 19 07:48:25 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0/sd@7,0 (sd7):
Apr 19 07:48:25 serverpi Error for Command: write Error Level: Retryable
Apr 19 07:48:25 serverpi scsi: [ID 107833 kern.notice] Requested Block: 122208 Error Block: 122208
Apr 19 07:48:25 serverpi scsi: [ID 107833 kern.notice] Vendor: SEAGATE Serial Number: XxXxXxXxXx
Apr 19 07:48:25 serverpi scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Apr 19 07:48:25 serverpi scsi: [ID 107833 kern.notice] ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x2
Apr 19 07:51:21 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0 (mpt0):
Apr 19 07:51:21 serverpi Disconnected command timeout for Target 7
Apr 19 07:51:23 serverpi scsi: [ID 365881 kern.info] /pci@400/pci@0/pci@8/scsi@0 (mpt0):
Apr 19 07:51:23 serverpi Log info 31140000 received for target 7.
Apr 19 07:51:23 serverpi scsi_status=0, ioc_status=8048, scsi_state=c
Apr 19 07:51:49 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0/sd@7,0 (sd7):
Apr 19 07:51:49 serverpi Error for Command: write Error Level: Retryable
Apr 19 07:51:49 serverpi scsi: [ID 107833 kern.notice] Requested Block: 122208 Error Block: 122208
Apr 19 07:51:49 serverpi scsi: [ID 107833 kern.notice] Vendor: SEAGATE Serial Number: XxXxXxXxXx
Apr 19 07:51:49 serverpi scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Apr 19 07:51:49 serverpi scsi: [ID 107833 kern.notice] ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x2
Apr 19 07:53:41 serverpi scsi: [ID 365881 kern.info] /pci@400/pci@0/pci@8/scsi@0 (mpt0):
Apr 19 07:53:41 serverpi Log info 3112011a received for target 7.
Apr 19 07:53:41 serverpi scsi_status=0, ioc_status=804b, scsi_state=c
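With a log this repetitive, a per-target tally of the command timeouts helps confirm that only one disk is involved. The following is a rough sketch that assumes the message wording shown above; the function name `timeouts_per_target` is hypothetical, and the log path in the usage note is the usual Solaris default rather than something stated in the case.

```shell
# timeouts_per_target: read syslog lines on stdin and count the
# "Disconnected command timeout" messages for each SCSI target number.
# Hypothetical helper; relies on the target number being the last field.
timeouts_per_target() {
    awk '
        /Disconnected command timeout for Target/ { hits[$NF]++ }
        END { for (t in hits) print "target " t ": " hits[t] " timeouts" }
    '
}
```

A typical call would be `timeouts_per_target < /var/adm/messages`. When more than one target is affected, the order of the output lines is not defined.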
///// Finally, because of the disk errors, SVM flagged an error on volume d30
Apr 19 10:19:31 serverpi md_stripe: [ID 641072 kern.warning] WARNING: md: d30: write error on /dev/dsk/c1t7d0s0
///// The mirror needs maintenance. The only disk with errors is c1t7d0
d10: Mirror
Submirror 0: d30
State: Needs maintenance
Submirror 1: d40
State: Okay
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 1718726400 blocks (819 GB)
d30: Submirror of d10
State: Needs maintenance
Invoke: metareplace d10 c1t7d0s0 <new device>
Size: 1718726400 blocks (819 GB)
Stripe 0: (interlace: 32 blocks)
Device Start Block Dbase State Reloc Hot Spare
c1t3d0s0 0 No Okay Yes
c1t5d0s0 20352 No Okay Yes
c1t7d0s0 20352 No Maintenance Yes
c1t9d0s0 20352 No Okay Yes
c1t11d0s0 20352 No Okay Yes
c1t13d0s0 20352 No Okay Yes
d40: Submirror of d10
State: Okay
Size: 1718726400 blocks (819 GB)
Stripe 0: (interlace: 32 blocks)
Device Start Block Dbase State Reloc Hot Spare
c1t2d0s0 0 No Okay Yes
c1t4d0s0 20352 No Okay Yes
c1t6d0s0 20352 No Okay Yes
c1t8d0s0 20352 No Okay Yes
c1t10d0s0 20352 No Okay Yes
c1t12d0s0 20352 No Okay Yes
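On a volume with this many components, the slice that needs attention can be pulled out of the `metastat` output automatically. This is a minimal sketch that assumes the column order shown above (device, start block, dbase, state); `failed_components` is a hypothetical name, and a state made of two words would be reported by its first word only.

```shell
# failed_components: read `metastat` output on stdin and print each component
# row whose State column is anything other than "Okay".
# Hypothetical helper; only rows starting with a cXtYdZsN slice are examined.
failed_components() {
    awk '
        $1 ~ /^c[0-9]+t[0-9]+d[0-9]+s[0-9]+$/ && $4 != "Okay" { print $1, $4 }
    '
}
```

Used as `metastat d10 | failed_components`, it prints `c1t7d0s0 Maintenance` for the output above.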
///// Total error counts on the disk
[root@serverpi:] iostat -En | grep c1t7d0
c1t7d0 Soft Errors: 0 Hard Errors: 32 Transport Errors: 146
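The same counters can be checked for every disk at once rather than one `grep` at a time. The sketch below assumes the summary line format shown above; `disks_with_errors` is a hypothetical name.

```shell
# disks_with_errors: read `iostat -En` output on stdin and print every device
# whose hard or transport error counter is greater than zero.
# Hypothetical helper; only the "Soft Errors: ..." summary lines are parsed.
disks_with_errors() {
    awk '
        $2 == "Soft" && $3 == "Errors:" && ($7 + 0 > 0 || $10 + 0 > 0) {
            print $1, "hard=" $7, "transport=" $10
        }
    '
}
```

Run as `iostat -En | disks_with_errors`. A healthy disk produces no output.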
///// Disk target c1t7d0, part number 540-7355 [C] 146GB - 10000 RPM SAS SFF Disk
c1::dsk/c1t7d0 connected configured unknown SEAGATE ST914602SSUN146G
unavailable disk n /devices/pci@400/pci@0/pci@8/scsi@0:scsi::dsk/c1t7d0
To resolve the problem, the following steps were carried out:
Replace disk HDD7 with P/N: 540-7355 [C] 146GB - 10000 RPM SAS SFF Disk, or an alternate part
Approximate duration: 3 hrs (the time the disks would take to resynchronize)
1.- Add state database replicas. There is currently only 1 replica, on a single disk; if that disk fails, access to the SVM volumes will be lost:
# metadb -a -c 3 c1t3d0s7
NOTE: the command above adds 3 replicas to one other disk; I recommend adding at least 3 replicas spread across 3 different disks.
2.- Save the partition table of the mirror disk:
# prtvtoc /dev/rdsk/c1t6d0s2 > /var/tmp/file
3.- Remove the disk from operating system control:
# cfgadm -c unconfigure c1::dsk/c1t7d0
4.- Replace disk HDD7; see the link:
https://support.oracle.com/handbook_private/Systems/SE_T5240/component.front.html
5.- Have the operating system recognize the disk:
# cfgadm -c configure c1::dsk/c1t7d0
6.- Copy the partition table onto the new disk:
# fmthard -s /var/tmp/file /dev/rdsk/c1t7d0s2
['file' is the prtvtoc output saved in step 2]
7.- Re-enable the volume in SVM:
# metadevadm -u c1t7d0
# metareplace -e d10 c1t7d0s0
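The per-disk steps above can be sketched as a small helper that only prints the commands, so they can be reviewed before anything is run. This is an illustration under stated assumptions, not the procedure Oracle supplied: `print_replace_plan` is a made-up name, the slice numbers (s2 for the whole disk, s0 for the SVM component) are hard-coded to match this case, and the replica step is left out because the slice to use depends on the site.

```shell
# print_replace_plan: echo (never execute) the replacement commands for a
# failed disk. Hypothetical helper; the body runs in a subshell so its
# variables do not leak into the calling shell.
print_replace_plan() (
    failed=$1      # disk being replaced, e.g. c1t7d0
    peer=$2        # healthy disk with the same partition layout, e.g. c1t6d0
    mirror=$3      # SVM mirror containing the failed slice, e.g. d10
    ctrl=${failed%%t*}                      # controller part, e.g. c1
    vtoc=/var/tmp/prtvtoc_${peer}s2.txt     # where the saved VTOC goes

    echo "prtvtoc /dev/rdsk/${peer}s2 > $vtoc"
    echo "cfgadm -c unconfigure ${ctrl}::dsk/${failed}"
    echo "# physically swap the disk here"
    echo "cfgadm -c configure ${ctrl}::dsk/${failed}"
    echo "fmthard -s $vtoc /dev/rdsk/${failed}s2"
    echo "metadevadm -u ${failed}"
    echo "metareplace -e ${mirror} ${failed}s0"
)
```

For this case the call would be `print_replace_plan c1t7d0 c1t6d0 d10`. Printing instead of executing is deliberate, since the physical swap sits in the middle of the sequence.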
While carrying out the replacement, I ran into the following error:
[root@serverpi:] fmthard -s /var/tmp/prtvtoc_c1t6d0s2.txt /dev/rdsk/c1t7d0s2
/dev/rdsk/c1t7d0s2: Cannot get disk geometry
[root@serverpi:]
It was resolved as follows:
To fix this issue use format with the -e option to update the label. In format select the disk -> partition -> label
Once in the label menu the option between SMI and EFI is given. Select SMI.
After putting a new label on with the 'format -e' command run the fmthard command again.
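If the replacement is scripted, this failure can be recognized from the message itself. The sketch below is just one way to do it; `fmthard_needs_label` is a hypothetical name, and it assumes the error text is exactly the one shown above.

```shell
# fmthard_needs_label: read fmthard output on stdin and succeed (exit 0) only
# if it contains the "Cannot get disk geometry" error.
# Hypothetical helper; it does not label the disk, it only detects the case.
fmthard_needs_label() {
    grep -q 'Cannot get disk geometry'
}
```

A possible use is `fmthard -s /var/tmp/file /dev/rdsk/c1t7d0s2 2>&1 | fmthard_needs_label && echo "label the disk with format -e, then retry"`. The `2>&1` is there because the sketch makes no assumption about which stream carries the message.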
[root@serverpi:] format -e
Searching for disks...done
c1t7d0: configured with capacity of 136.71GB
AVAILABLE DISK SELECTIONS:
0. c1t0d0 <LSILOGIC-LogicalVolume-3000 cyl 65533 alt 2 hd 16 sec 273>
/pci@400/pci@0/pci@8/scsi@0/sd@0,0
1. c1t2d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@400/pci@0/pci@8/scsi@0/sd@2,0
2. c1t3d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@400/pci@0/pci@8/scsi@0/sd@3,0
3. c1t4d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@400/pci@0/pci@8/scsi@0/sd@4,0
4. c1t5d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@400/pci@0/pci@8/scsi@0/sd@5,0
5. c1t6d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@400/pci@0/pci@8/scsi@0/sd@6,0
6. c1t7d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@400/pci@0/pci@8/scsi@0/sd@7,0
7. c1t8d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@400/pci@0/pci@8/scsi@0/sd@8,0
8. c1t9d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@400/pci@0/pci@8/scsi@0/sd@9,0
9. c1t10d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@400/pci@0/pci@8/scsi@0/sd@a,0
10. c1t11d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@400/pci@0/pci@8/scsi@0/sd@b,0
11. c1t12d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@400/pci@0/pci@8/scsi@0/sd@c,0
12. c1t13d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@400/pci@0/pci@8/scsi@0/sd@d,0
13. c1t14d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@400/pci@0/pci@8/scsi@0/sd@e,0
14. c1t15d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
/pci@400/pci@0/pci@8/scsi@0/sd@f,0
Specify disk (enter its number): 6
selecting c1t7d0
[disk formatted]
Disk not labeled. Label it now? y
FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
scsi - independent SCSI mode selects
cache - enable, disable or query SCSI disk cache
volname - set 8-character volume name
!<cmd> - execute <cmd>, then return
quit
format> part
PARTITION MENU:
0 - change `0' partition
1 - change `1' partition
2 - change `2' partition
3 - change `3' partition
4 - change `4' partition
5 - change `5' partition
6 - change `6' partition
7 - change `7' partition
select - select a predefined table
modify - modify a predefined partition table
name - name the current table
print - display the current table
label - write partition map and label to the disk
!<cmd> - execute <cmd>, then return
quit
partition> quit
FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
scsi - independent SCSI mode selects
cache - enable, disable or query SCSI disk cache
volname - set 8-character volume name
!<cmd> - execute <cmd>, then return
quit
format> parti
PARTITION MENU:
0 - change `0' partition
1 - change `1' partition
2 - change `2' partition
3 - change `3' partition
4 - change `4' partition
5 - change `5' partition
6 - change `6' partition
7 - change `7' partition
select - select a predefined table
modify - modify a predefined partition table
name - name the current table
print - display the current table
label - write partition map and label to the disk
!<cmd> - execute <cmd>, then return
quit
partition> label
[0] SMI Label
[1] EFI Label
Specify Label type[0]: 0
Ready to label disk, continue? y
partition> exit
`exit' is not expected.
partition> quit
FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
scsi - independent SCSI mode selects
cache - enable, disable or query SCSI disk cache
volname - set 8-character volume name
!<cmd> - execute <cmd>, then return
quit
format> quit
[root@serverpi:] fmthard -s /var/tmp/prtvtoc_c1t6d0s2.txt /dev/rdsk/c1t7d0s2
fmthard: New volume table of contents now in place.
[root@serverpi:] metadevadm -u c1t7d0
Updating Solaris Volume Manager device relocation information for c1t7d0
Old device reloc information:
id1,sd@n5000c5000b16d247
New device reloc information:
id1,sd@n5000cca0002ce768
[root@serverpi:] metareplace -e d10 c1t7d0s0
d10: device c1t7d0s0 is enabled
[root@serverpi:]
So, if we run into this problem, it is because the SMI label was not selected.
Basically, an EFI label is required for disks larger than 2 terabytes, while SMI is used for disks below that size.
If you want more information about SMI and EFI labels, see the links below:
1- http://docs.oracle.com/cd/E19253-01/817-5093/disksconcepts-14/
2- http://unixadminschool.com/blog/2012/01/disk-initialisation-and-labelling-solaris-smi-label-vx-efi-label/