[Nutanix] Checking RF2 and data locality

Chang, Hee Sung
Oct 23, 2020

checking data locality and replication factor

(How to verify that data locality and replication are working correctly for each VM)

1. Check vdisk information

$ ncli vdisk list

Name : 00051c8a-5061-ff4d-0000-00000000327a::NFS:276683
Container ID : 00051c8a-5061-ff4d-0000-00000000327a::1008
Max Capacity : 4 TiB (4,398,046,511,104 bytes)
Reserved Capacity : 50 GiB (53,687,091,200 bytes)
Read-only : false
NFS File Name : Ubuntu-test-flat.vmdk
NFS Parent File Name (… :
Fingerprint On Write : none
On-Disk Dedup : none

Name : 00051c8a-5061-ff4d-0000-00000000327a::NFS:8397
Container ID : 00051c8a-5061-ff4d-0000-00000000327a::1008
Max Capacity : 4 TiB (4,398,046,511,104 bytes)
Reserved Capacity : -
Read-only : false
NFS File Name : network_222-flat.vmdk
NFS Parent File Name (… : controller_221-flat.vmdk
Fingerprint On Write : none
On-Disk Dedup : none
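If the cluster has many vdisks, the list can be narrowed down to a single VM disk by grepping on its NFS file name; a minimal sketch, assuming GNU grep on the CVM and using the file name from this example:

$ ncli vdisk list | grep -B 5 "Ubuntu-test-flat.vmdk"

The Name field a few lines above the match shows the NFS ID (NFS:276683 here), which is what gets looked up in the vdisk configuration in the next step.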

2. Check the vdisk configuration

$ vdisk_config_printer

vdisk_id: 276683
vdisk_name: "NFS:276683"
==> vdisk_id matches the NFS ID: created by the hypervisor, i.e. the originally created vdisk
vdisk_size: 4398046511104
container_id: 1008
params {
  total_reserved_capacity: 53687091200
}
creation_time_usecs: 1439444440113799
vdisk_creator_loc: 6
vdisk_creator_loc: 276635
vdisk_creator_loc: 136617
nfs_file_name: "Ubuntu-test-flat.vmdk"
last_modification_time_usecs: 1439444440142052

vdisk_id: 233626
vdisk_name: "NFS:8397"
===> vdisk_id differs from the NFS ID: a snapshot, clone, …
parent_vdisk_id: 233218
vdisk_size: 4398046511104
container_id: 1008
params {
}
creation_time_usecs: 1438822662173389
closest_named_ancestor: "NFS:233589"
vdisk_creator_loc: 5
vdisk_creator_loc: 181
vdisk_creator_loc: 13144306
nfs_file_name: "network_222-flat.vmdk"
parent_nfs_file_name_hint: "controller_221-flat.vmdk"
last_modification_time_usecs: 1439428801701538
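The full vdisk_config_printer dump covers every vdisk in the cluster; one way to pull out just the record of interest is to grep around its nfs_file_name. A sketch only; the context sizes are guesses and may need adjusting to cover the whole record:

$ vdisk_config_printer | grep -B 10 -A 5 'nfs_file_name: "network_222-flat.vmdk"'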

3. Check vdisk usage

$ vdisk_usage_printer -vdisk_id=276683

Egid # eids UT Size T Size Ratio Garbage Orphans T Type Replicas(disk/svm/rack)
282092 8 2.75 MB 488.00 KB 17% 0.00 KB 0 C,[37*/6/18][42*/5/18]
281410 12 11.56 MB 5.69 MB 49% 24.00 KB 1 C,[29*/4/18][37*/6/18]
281411 13 12.75 MB 5.29 MB 41% 300.00 KB 31 C,[29*/4/18][37*/6/18]
281412 10 9.62 MB 4.82 MB 50% 408.00 KB 49 C,[29*/4/18][37*/6/18]
281413 11 10.50 MB 4.71 MB 44% 212.00 KB 23 C,[37*/6/18][42*/5/18]
281414 10 8.81 MB 5.05 MB 57% 156.00 KB 16 C,[37*/6/18][42*/5/18]
280877 8 4.56 MB 712.00 KB 15% 0.00 KB 0 C,[37*/6/18][42*/5/18]
280876 8 3.69 MB 524.00 KB 13% 0.00 KB 0 C,[37*/6/18][42*/5/18]
281429 11 10.56 MB 5.14 MB 48% 212.00 KB 22 C,[37*/6/18][42*/5/18]

* Replicas : [disk_id/svm_id/rack_id] pair

===> Local SVM ID is 6; the remote SVMs are 4 and 5
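To check RF2 and locality across every extent group of the vdisk in one go, the svm_id portion of each replica can be tallied. A rough one-liner sketch, assuming GNU grep and awk on the CVM and the [disk*/svm/rack] replica format shown above:

$ vdisk_usage_printer -vdisk_id=276683 | grep -oE '\[[0-9]+\*?/[0-9]+/[0-9]+\]' | awk -F/ '{print $2}' | sort | uniq -c

With RF2 each extent group should contribute two replicas, and for good locality the local SVM ID (6 in this example) should appear in nearly every extent group's replica pair.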

$ vdisk_usage_printer -vdisk_id=233626 <== first check, using the vdisk_id

$ vdisk_usage_printer -vdisk_id=233218 <== second check, using the parent_vdisk_id; keep following each parent_vdisk_id in turn
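To find which parent to check next without scrolling through the whole dump, the record can be pulled out of vdisk_config_printer; a sketch, assuming parent_vdisk_id appears within the first few lines of the record as in step 2:

$ vdisk_config_printer | grep -w -A 3 'vdisk_id: 233626' | grep parent_vdisk_id
parent_vdisk_id: 233218

Repeat with 233218, and so on, until a record with no parent_vdisk_id is reached.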

4. Checking the snapshot chain: listing the parent vdisk_ids in one go

$ snapshot_tree_printer

Starting to process 517 vdisks
Found 10 chains with live leaves 435 chains with removable leaves and 7 singletons
[5][264759(1,5,1)][264724(2,4,1,cb,an,ep=8305,is)][255902(3,3,1,an,is)][253696(4,2,1,an,is)][8305(5,1,4,an,im)]
[4][237216(1,4,1)][233625(2,3,1,rm,ac,is)][232590(3,2,1,rm,is)][8305(5,1,4,an,im)]
[4][265059(1,4,1)][265034(2,3,1,cb,an,is)][263092(3,2,1,an,is)][262186(4,1,1,an,is)]
[3][233626(1,3,1)][233218(2,2,1,rm,is)][8305(5,1,4,an,im)]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[3][260138(1,3,1)][250727(2,2,1,an,is)][8305(5,1,4,an,im)]
[3][263268(1,3,1)][233651(2,2,1,rm,is)][122504(3,1,1,rm,is)]
[3][263270(1,3,1)][233650(2,2,1,rm,is)][122367(3,1,1,rm,is)]
[3][263271(1,3,1)][233654(2,2,1,rm,is)][122366(3,1,1,rm,is)]
[3][263272(1,3,1)][233653(2,2,1,rm,is)][122237(3,1,1,rm,is)]
[3][264942(1,3,1)][233652(2,2,1,rm,is)][122238(3,1,1,rm,is)]

vdisk_id 233626 belongs to a chain of length [3], linked in the order 233626 -> 233218 -> 8305.

Therefore, to see the VM's actual data you have to walk through the contents of all three vdisk_ids: 233626, 233218, and 8305.
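Putting this together, the usage of the whole chain can be dumped in a single pass with a simple shell loop (a sketch using the chain IDs found above):

$ for id in 233626 233218 8305; do echo "===== vdisk_id $id ====="; vdisk_usage_printer -vdisk_id=$id; done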

5. Running Curator manually: a full scan runs automatically every 6 hours, but a long-tailed snapshot tree sometimes needs to be cleaned up sooner

$ links http://<cvm-ip>:2010/ ==> find the Curator master

==> If a full or partial scan is already running, it may be forcibly stopped and consistency could be broken, so be sure to check first, or consult Nutanix Support, before proceeding.

$ links http://<curator-master>:2010/master/api/client/StartCuratorTasks?task_type=2

==> manually trigger a Curator full scan

==> this does not seem to work from a command-line web browser (needs verification); run the URL from a GUI browser instead

When running it from a GUI browser, no result is shown on screen, so do not press "Enter" repeatedly or keep refreshing the page.

$ links http://<curator-master>:2010 ==> the progress of the "Full Scan" can be monitored under Curator Jobs
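To keep an eye on the scan without sitting in the interactive browser, the page can also be polled from the shell. A rough sketch, assuming links supports -dump, watch is available on the CVM, and the page text contains the Curator Jobs table mentioned above:

$ watch -n 30 'links -dump http://<curator-master>:2010 | grep -A 5 "Curator Jobs"'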

========================================

Scheduled Scans

Scheduled scans run periodically. Their goal is general maintenance of the system. There are two types of scheduled scans: partial and full. Partial scans do the following:

- ILM (hot -> cold tier)

- Snapshot chain severing

- Delete data that was marked to_remove but not deduped

- Correct Block Awareness

- Delete over-replicated data and re-replicate under-replicated data

Full scans do all of this, but they also do on-disk dedupe. Because only full scans do on-disk dedupe, multiple scans might need to happen before deduped data is actually deleted.

By default, partial scans run 1 hour (3600 seconds) after the last partial scan completed. Full scans run on their own timer and run 6 hours (21600 seconds) after the last full scan completed.

When this timer fires, if another scan is already running, the new scan waits for it to complete before starting.

Triggered Scans

Triggered scans are used to respond to a situation in the cluster where Curator is needed right away. The most common triggered scans are:

1. ILM. If the hot tier gets too full, an ILM scan is triggered to drain some of the data to the cold tier.

2. Disk failure. If a disk fails, a scan is triggered to re-replicate the missing data.

3. Node failure. Like a disk-failure scan, but with more data to replicate.

4. User. Manually triggered by the SRE team in certain situations. This can be a full or partial scan.

These scans are all partial scans (unless the user manually triggers a full scan). This means they reset the clock on the next periodic partial scan (or on the next full scan, if the user triggered a full one).
