Friday, May 6, 2016

Cannot get disk geometry: A problematic disk replacement on Solaris



The following case was opened in MOS (My Oracle Support) to request a disk replacement:

////// metastat -p shows the SVM configuration: a single volume with 2 mirror sides (submirrors) and 6 disks on each side.

d10 -m d30 d40 1 
d30 1 6 c1t3d0s0 c1t5d0s0 c1t7d0s0 c1t9d0s0 c1t11d0s0 c1t13d0s0 -i 32b 
d40 1 6 c1t2d0s0 c1t4d0s0 c1t6d0s0 c1t8d0s0 c1t10d0s0 c1t12d0s0 -i 32b 
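
When a disk starts failing, the first question is which metadevice it belongs to. The following is a minimal sketch, not part of the original case, that searches `metastat -p` output for a given disk. The function name `submirror_of` and the `AWK` override variable are assumptions made for illustration; on Solaris 10 the default `/usr/bin/awk` is likely too old for `-v`, so `AWK=nawk` would probably be needed.

```shell
# Print "<metadevice> <slice>" for every metadevice in `metastat -p`
# output (read on stdin) that contains a slice of the given disk.
# AWK is a hypothetical override; e.g. AWK=nawk on Solaris.
submirror_of() {
    "${AWK:-awk}" -v disk="$1" '
        {
            # Field 1 is the metadevice name; scan the remaining fields
            # for a slice of the requested disk (c1t7d0 -> c1t7d0s0).
            for (i = 2; i <= NF; i++)
                if ($i ~ ("^" disk "s[0-9]+$"))
                    print $1, $i
        }'
}

# Sample run against the configuration shown above.
submirror_of c1t7d0 <<'EOF'
d10 -m d30 d40 1
d30 1 6 c1t3d0s0 c1t5d0s0 c1t7d0s0 c1t9d0s0 c1t11d0s0 c1t13d0s0 -i 32b
d40 1 6 c1t2d0s0 c1t4d0s0 c1t6d0s0 c1t8d0s0 c1t10d0s0 c1t12d0s0 -i 32b
EOF
```

On a live system the input would come from `metastat -p | submirror_of c1t7d0`.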

///// WARNING: ALL THE REPLICAS ARE ON 1 SINGLE DISK, A SINGLE POINT OF FAILURE, WHICH IS NOT RECOMMENDED.

metadb -i 
---------------------------------------------------- 
flags first blk block count 
a m pc luo 16 8192 /dev/dsk/c1t2d0s7 
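
The single-point-of-failure condition can be checked mechanically. This is a rough sketch, assuming `metadb` output in the layout shown above (device path in the last column); the function name `replicas_per_disk` and the `AWK` override are hypothetical, and on Solaris 10 `AWK=nawk` would probably be required because the script uses `sub()`.

```shell
# Count state database replicas per physical disk from `metadb` output
# read on stdin, and warn when every replica sits on the same disk.
# AWK is a hypothetical override; e.g. AWK=nawk on Solaris.
replicas_per_disk() {
    "${AWK:-awk}" '
        $NF ~ /^\/dev\/dsk\// {
            disk = $NF
            sub(/^\/dev\/dsk\//, "", disk)   # /dev/dsk/c1t2d0s7 -> c1t2d0s7
            sub(/s[0-9]+$/, "", disk)        # c1t2d0s7 -> c1t2d0
            count[disk]++
        }
        END {
            n = 0
            for (d in count) { print d, count[d]; n++ }
            if (n == 1)
                print "WARNING: all replicas are on a single disk"
        }'
}

# Sample run against the output shown above.
replicas_per_disk <<'EOF'
flags first blk block count
a m pc luo 16 8192 /dev/dsk/c1t2d0s7
EOF
```

On a live system the input would come from `metadb | replicas_per_disk`.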

///// On April 19, write and read problems began on disk sd7, which is c1t7d0

Apr 19 07:42:27 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0 (mpt0): 
Apr 19 07:42:27 serverpi Disconnected command timeout for Target 7 
Apr 19 07:42:28 serverpi scsi: [ID 365881 kern.info] /pci@400/pci@0/pci@8/scsi@0 (mpt0): 
Apr 19 07:42:28 serverpi Log info 31140000 received for target 7. 
Apr 19 07:42:28 serverpi scsi_status=0, ioc_status=8048, scsi_state=c 
Apr 19 07:43:22 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0/sd@7,0 (sd7): 
Apr 19 07:43:22 serverpi Error for Command: write(10) Error Level: Retryable 
Apr 19 07:43:22 serverpi scsi: [ID 107833 kern.notice] Requested Block: 2546032 Error Block: 2546032 
Apr 19 07:43:22 serverpi scsi: [ID 107833 kern.notice] Vendor: SEAGATE Serial Number: XxXxXxXxXx 
Apr 19 07:43:22 serverpi scsi: [ID 107833 kern.notice] Sense Key: Unit Attention 
Apr 19 07:43:22 serverpi scsi: [ID 107833 kern.notice] ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x2 
Apr 19 07:45:21 serverpi scsi: [ID 365881 kern.info] /pci@400/pci@0/pci@8/scsi@0 (mpt0): 
Apr 19 07:45:21 serverpi Log info 3112011a received for target 7. 
Apr 19 07:45:21 serverpi scsi_status=0, ioc_status=804b, scsi_state=c 
Apr 19 07:45:37 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0/sd@7,0 (sd7): 
Apr 19 07:45:37 serverpi Error for Command: write Error Level: Retryable 
Apr 19 07:45:37 serverpi scsi: [ID 107833 kern.notice] Requested Block: 122208 Error Block: 122208 
Apr 19 07:45:37 serverpi scsi: [ID 107833 kern.notice] Vendor: SEAGATE Serial Number: XxXxXxXxXx 
Apr 19 07:45:37 serverpi scsi: [ID 107833 kern.notice] Sense Key: Unit Attention 
Apr 19 07:45:37 serverpi scsi: [ID 107833 kern.notice] ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x2 
Apr 19 07:46:38 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0 (mpt0): 
Apr 19 07:46:38 serverpi Disconnected command timeout for Target 7 
Apr 19 07:46:40 serverpi scsi: [ID 365881 kern.info] /pci@400/pci@0/pci@8/scsi@0 (mpt0): 
Apr 19 07:46:40 serverpi Log info 31140000 received for target 7. 
Apr 19 07:46:40 serverpi scsi_status=0, ioc_status=8048, scsi_state=c 
Apr 19 07:47:50 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0 (mpt0): 
Apr 19 07:47:50 serverpi Disconnected command timeout for Target 7 
Apr 19 07:47:51 serverpi scsi: [ID 365881 kern.info] /pci@400/pci@0/pci@8/scsi@0 (mpt0): 
Apr 19 07:47:51 serverpi Log info 31140000 received for target 7. 
Apr 19 07:47:51 serverpi scsi_status=0, ioc_status=8048, scsi_state=c 
Apr 19 07:48:25 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0/sd@7,0 (sd7): 
Apr 19 07:48:25 serverpi Error for Command: write Error Level: Retryable 
Apr 19 07:48:25 serverpi scsi: [ID 107833 kern.notice] Requested Block: 122208 Error Block: 122208 
Apr 19 07:48:25 serverpi scsi: [ID 107833 kern.notice] Vendor: SEAGATE Serial Number: XxXxXxXxXx 
Apr 19 07:48:25 serverpi scsi: [ID 107833 kern.notice] Sense Key: Unit Attention 
Apr 19 07:48:25 serverpi scsi: [ID 107833 kern.notice] ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x2 
Apr 19 07:51:21 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0 (mpt0): 
Apr 19 07:51:21 serverpi Disconnected command timeout for Target 7 
Apr 19 07:51:23 serverpi scsi: [ID 365881 kern.info] /pci@400/pci@0/pci@8/scsi@0 (mpt0): 
Apr 19 07:51:23 serverpi Log info 31140000 received for target 7. 
Apr 19 07:51:23 serverpi scsi_status=0, ioc_status=8048, scsi_state=c 
Apr 19 07:51:49 serverpi scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@8/scsi@0/sd@7,0 (sd7): 
Apr 19 07:51:49 serverpi Error for Command: write Error Level: Retryable 
Apr 19 07:51:49 serverpi scsi: [ID 107833 kern.notice] Requested Block: 122208 Error Block: 122208 
Apr 19 07:51:49 serverpi scsi: [ID 107833 kern.notice] Vendor: SEAGATE Serial Number: XxXxXxXxXx 
Apr 19 07:51:49 serverpi scsi: [ID 107833 kern.notice] Sense Key: Unit Attention 
Apr 19 07:51:49 serverpi scsi: [ID 107833 kern.notice] ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x2 
Apr 19 07:53:41 serverpi scsi: [ID 365881 kern.info] /pci@400/pci@0/pci@8/scsi@0 (mpt0): 
Apr 19 07:53:41 serverpi Log info 3112011a received for target 7. 
Apr 19 07:53:41 serverpi scsi_status=0, ioc_status=804b, scsi_state=c 
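
A log this repetitive is easier to judge when summarized. The sketch below, which is an illustration and not something run in the original case, counts the `Disconnected command timeout` messages per SCSI target. It assumes the messages are formatted as in the excerpt above (target number as the last field); the function name `timeouts_per_target` and the `AWK` override are hypothetical.

```shell
# Count "Disconnected command timeout" messages per SCSI target in
# syslog text read on stdin. Output lines look like: target <n> <count>
# AWK is a hypothetical override; e.g. AWK=nawk on Solaris.
timeouts_per_target() {
    "${AWK:-awk}" '
        /Disconnected command timeout for Target/ { count[$NF]++ }
        END { for (t in count) print "target", t, count[t] }'
}

# Sample run on two of the lines from the excerpt above.
timeouts_per_target <<'EOF'
Apr 19 07:42:27 serverpi Disconnected command timeout for Target 7
Apr 19 07:46:38 serverpi Disconnected command timeout for Target 7
EOF
```

On a live system the input would typically be `/var/adm/messages`. Fed the full excerpt above, the function would report four timeouts, all for target 7.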


///// Finally, because of the disk errors, SVM flagged an error on volume d30

Apr 19 10:19:31 serverpi md_stripe: [ID 641072 kern.warning] WARNING: md: d30: write error on /dev/dsk/c1t7d0s0 


///// The mirror is in maintenance state. The only disk with errors is c1t7d0

d10: Mirror 
Submirror 0: d30 
State: Needs maintenance 
Submirror 1: d40 
State: Okay 
Pass: 1 
Read option: roundrobin (default) 
Write option: parallel (default) 
Size: 1718726400 blocks (819 GB) 

d30: Submirror of d10 
State: Needs maintenance 
Invoke: metareplace d10 c1t7d0s0 <new device> 
Size: 1718726400 blocks (819 GB) 
Stripe 0: (interlace: 32 blocks) 
Device Start Block Dbase State Reloc Hot Spare 
c1t3d0s0 0 No Okay Yes 
c1t5d0s0 20352 No Okay Yes 
c1t7d0s0 20352 No Maintenance Yes 
c1t9d0s0 20352 No Okay Yes 
c1t11d0s0 20352 No Okay Yes 
c1t13d0s0 20352 No Okay Yes 


d40: Submirror of d10 
State: Okay 
Size: 1718726400 blocks (819 GB) 
Stripe 0: (interlace: 32 blocks) 
Device Start Block Dbase State Reloc Hot Spare 
c1t2d0s0 0 No Okay Yes 
c1t4d0s0 20352 No Okay Yes 
c1t6d0s0 20352 No Okay Yes 
c1t8d0s0 20352 No Okay Yes 
c1t10d0s0 20352 No Okay Yes 
c1t12d0s0 20352 No Okay Yes 

///// Total error count on the disk

[root@serverpi:] iostat -En | grep c1t7d0 
c1t7d0 Soft Errors: 0 Hard Errors: 32 Transport Errors: 146 
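
To check every disk at once rather than grepping for one, the summary lines of `iostat -En` can be filtered. This is a minimal sketch that assumes the summary line layout shown above; the function name `failing_disks` and the `AWK` override are hypothetical.

```shell
# List disks whose `iostat -En` summary line (read on stdin) reports
# at least one hard or transport error.
# AWK is a hypothetical override; e.g. AWK=nawk on Solaris.
failing_disks() {
    "${AWK:-awk}" '
        # Summary line layout:
        # <disk> Soft Errors: <n> Hard Errors: <n> Transport Errors: <n>
        $2 == "Soft" && $3 == "Errors:" && ($7 + $10) > 0 {
            print $1, "hard=" $7, "transport=" $10
        }'
}

# Sample run: one failing disk and one clean (illustrative) disk.
failing_disks <<'EOF'
c1t7d0 Soft Errors: 0 Hard Errors: 32 Transport Errors: 146
c1t6d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
EOF
```

On a live system the input would come from `iostat -En | failing_disks`.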

///// Disk target c1t7d0, part number 540-7355 [C] 146GB - 10000 RPM SAS SFF Disk

c1::dsk/c1t7d0 connected configured unknown SEAGATE ST914602SSUN146G 
unavailable disk n /devices/pci@400/pci@0/pci@8/scsi@0:scsi::dsk/c1t7d0 


To resolve the problem, the following steps were carried out:

Replace disk HDD7 with P/N: 540-7355 [C] 146GB - 10000 RPM SAS SFF Disk, or an alternate part
Approximate duration: 3 hrs (the time the disks would take to resynchronize)

1.- Add state database replicas. There is currently only 1 replica on 1 single disk; if that disk fails, access to the SVM volumes will be lost:
# metadb -a -c 3 c1t3d0s7 

NOTE: the command above adds 3 replicas to one other disk. I recommend adding at least 3 replicas, on 3 different disks.

2.- Save the partition table of the mirror disk:
# prtvtoc /dev/rdsk/c1t6d0s2 > /var/tmp/file 

3.- Remove the disk from operating system control:

# cfgadm -c unconfigure c1::dsk/c1t7d0 

4.- Replace disk HDD7; see the link:
https://support.oracle.com/handbook_private/Systems/SE_T5240/component.front.html 

5.- Make the operating system recognize the new disk:
# cfgadm -c configure c1::dsk/c1t7d0 


6.- Copy the partition table to the new disk:

# fmthard -s /var/tmp/file /dev/rdsk/c1t7d0s2 

['file' is the prtvtoc output saved in step 2]

7.- Re-enable the component in the SVM volume:

# metadevadm -u c1t7d0 

# metareplace -e d10 c1t7d0s0 
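
Steps 2 through 7 can be gathered into a dry-run helper that only prints the commands, so they can be reviewed before anything touches the system. This is a sketch under stated assumptions: the function name `replace_plan` is hypothetical, the data slice is assumed to be `s0` as in this case, the controller name is derived from the disk name, and a shell with POSIX parameter expansion (ksh or bash rather than the old Solaris `/bin/sh`) is assumed. The saved VTOC file name follows the one used in the actual run further down.

```shell
# Print (do not execute) the replacement commands for one failed SVM disk.
# Usage: replace_plan <failed-disk> <healthy-mirror-disk> <mirror>
replace_plan() {
    disk=$1; peer=$2; mirror=$3
    ctrl=${disk%%t*}                        # c1t7d0 -> c1
    vtoc=/var/tmp/prtvtoc_${peer}s2.txt     # saved partition table
    cat <<EOF
prtvtoc /dev/rdsk/${peer}s2 > $vtoc
cfgadm -c unconfigure ${ctrl}::dsk/$disk
# physically swap the drive here
cfgadm -c configure ${ctrl}::dsk/$disk
fmthard -s $vtoc /dev/rdsk/${disk}s2
metadevadm -u $disk
metareplace -e $mirror ${disk}s0
EOF
}

# Plan for the disk replaced in this case.
replace_plan c1t7d0 c1t6d0 d10
```

Printing instead of executing is deliberate: the physical swap happens in the middle of the sequence, and each command's result should be checked before moving on to the next.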


While carrying out the replacement I ran into the following error:
[root@serverpi:] fmthard -s /var/tmp/prtvtoc_c1t6d0s2.txt /dev/rdsk/c1t7d0s2 
/dev/rdsk/c1t7d0s2: Cannot get disk geometry 

[root@serverpi:] 

It was resolved as follows:
To fix this issue use format with the -e option to update the label. In format select the disk -> partition -> label 

Once in the label menu the option between SMI and EFI is given. Select SMI. 


After putting a new label on with the 'format -e' command run the fmthard command again. 

[root@serverpi:] format -e
Searching for disks...done

c1t7d0: configured with capacity of 136.71GB


AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <LSILOGIC-LogicalVolume-3000 cyl 65533 alt 2 hd 16 sec 273>
          /pci@400/pci@0/pci@8/scsi@0/sd@0,0
       1. c1t2d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@2,0
       2. c1t3d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@3,0
       3. c1t4d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@4,0
       4. c1t5d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@5,0
       5. c1t6d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@6,0
       6. c1t7d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@7,0
       7. c1t8d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@8,0
       8. c1t9d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@9,0
       9. c1t10d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@a,0
      10. c1t11d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@b,0
      11. c1t12d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@c,0
      12. c1t13d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@d,0
      13. c1t14d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@e,0
      14. c1t15d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@400/pci@0/pci@8/scsi@0/sd@f,0
Specify disk (enter its number): 6
selecting c1t7d0
[disk formatted]
Disk not labeled.  Label it now? y


FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        scsi       - independent SCSI mode selects
        cache      - enable, disable or query SCSI disk cache
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> part


PARTITION MENU:
        0      - change `0' partition
        1      - change `1' partition
        2      - change `2' partition
        3      - change `3' partition
        4      - change `4' partition
        5      - change `5' partition
        6      - change `6' partition
        7      - change `7' partition
        select - select a predefined table
        modify - modify a predefined partition table
        name   - name the current table
        print  - display the current table
        label  - write partition map and label to the disk
        !<cmd> - execute <cmd>, then return
        quit
partition> quit


FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        scsi       - independent SCSI mode selects
        cache      - enable, disable or query SCSI disk cache
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> parti


PARTITION MENU:
        0      - change `0' partition
        1      - change `1' partition
        2      - change `2' partition
        3      - change `3' partition
        4      - change `4' partition
        5      - change `5' partition
        6      - change `6' partition
        7      - change `7' partition
        select - select a predefined table
        modify - modify a predefined partition table
        name   - name the current table
        print  - display the current table
        label  - write partition map and label to the disk
        !<cmd> - execute <cmd>, then return
        quit
partition> label
[0] SMI Label
[1] EFI Label
Specify Label type[0]: 0
Ready to label disk, continue? y

partition> exit
`exit' is not expected.
partition> quit


FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        scsi       - independent SCSI mode selects
        cache      - enable, disable or query SCSI disk cache
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> quit
[root@serverpi:] fmthard -s /var/tmp/prtvtoc_c1t6d0s2.txt /dev/rdsk/c1t7d0s2
fmthard:  New volume table of contents now in place.

[root@serverpi:] metadevadm -u c1t7d0
Updating Solaris Volume Manager device relocation information for c1t7d0
Old device reloc information:
        id1,sd@n5000c5000b16d247
New device reloc information:
        id1,sd@n5000cca0002ce768
[root@serverpi:] metareplace -e d10 c1t7d0s0
d10: device c1t7d0s0 is enabled

[root@serverpi:]


So, if this problem occurs, it is because the disk was not given an SMI label.
Basically, an EFI label is needed for disks larger than 2 terabytes, while SMI labels are used for disks below that size.


If you want more information about SMI and EFI labels, you can find it below:
1- http://docs.oracle.com/cd/E19253-01/817-5093/disksconcepts-14/
2- http://unixadminschool.com/blog/2012/01/disk-initialisation-and-labelling-solaris-smi-label-vx-efi-label/


