Vertically scaling disks

With the growing amount of data, it might be necessary to add, delete, or modify disks attached to Data Lake or Data Hub clusters in AWS.

Many clusters are deployed with standard magnetic storage. As lineage data grows, the disks on the core nodes of these clusters run out of space. These disks need to be moved to General Purpose SSDs (gp2 or gp3 on AWS), resized to a larger capacity, or both.

The disks attached to the Data Lake and Data Hub clusters can be changed or resized in AWS without downtime.

Limitations

When using this preview feature, be aware of the following limitations:
  • This feature is only available for AWS.
  • Disks can only be resized up; you cannot reduce the size of an attached block storage volume. If the instances in a group have disks of different sizes, every attached disk that is smaller than the requested size is increased to the requested size.
  • This feature resizes only the additional block storage volumes attached to an instance, not the root volume.
  • Clusters and cluster services must be in the running state before disk vertical scaling is performed.
  • The current implementation does not support this feature through the CDP UI. It is available only through the Beta CDP CLI. To install the Beta CDP CLI, refer to Installing Beta CDP CLI.
  • The disk modification feature on AWS can be used only once every 6 hours. This is a limitation on the AWS side.
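
The resize-up rule above can be illustrated with a minimal Python sketch. The function name plan_resize and the sample disk sizes are hypothetical and are not part of CDP or its CLI. The sketch also assumes that disks already at or above the requested size are simply left unchanged, which follows from the rule that disks cannot be shrunk.

```python
def plan_resize(current_sizes_gib, requested_gib):
    """Return the size each disk would have after a resize request.

    Disks smaller than the requested size grow to the requested size.
    Disks that are already at least that large keep their size, because
    attached block storage cannot be shrunk.
    """
    return [max(size, requested_gib) for size in current_sizes_gib]


# Hypothetical instance group with three additional disks (sizes in GiB).
disks = [100, 200, 500]
print(plan_resize(disks, 300))  # → [300, 300, 500]
```

In this example the 100 GiB and 200 GiB disks grow to 300 GiB, while the 500 GiB disk is untouched.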

Permissions

This feature requires the following permissions to be added to the cross-account policy described in Cross-account access IAM role.

  • ec2:ModifyVolume
  • ec2:DescribeVolumesModifications
  • ec2:DescribeVolumeStatus

The following table explains why CDP needs these permissions:

Permission | Description
ec2:ModifyVolume | Required to modify volume attributes such as type, size, and IOPS capacity. Without this permission, CDP cannot perform volume modifications.
ec2:DescribeVolumesModifications | Required to verify whether the volume modifications performed by CDP were successful. Subsequent steps, such as resizing, run only after the modification succeeds.
ec2:DescribeVolumeStatus | Required to make sure that the volume being modified is attached to an instance and is not an orphaned volume.
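
As a rough illustration, the three permissions could be grouped into a single IAM policy statement like the one the Python sketch below prints. The Sid value and the "*" resource are assumptions made for this example and do not come from the CDP documentation. Adapt them to match your existing cross-account policy.

```python
import json

# The three actions listed in this section.
DISK_SCALING_ACTIONS = [
    "ec2:ModifyVolume",
    "ec2:DescribeVolumesModifications",
    "ec2:DescribeVolumeStatus",
]


def disk_scaling_statement():
    """Build one IAM policy statement that allows the disk scaling actions."""
    return {
        "Sid": "CdpDiskVerticalScaling",  # illustrative name (assumption)
        "Effect": "Allow",
        "Action": DISK_SCALING_ACTIONS,
        "Resource": "*",  # assumption; narrow this if your policy requires it
    }


# Print the statement as JSON so it can be reviewed before it is added
# to the "Statement" array of the cross-account policy.
print(json.dumps(disk_scaling_statement(), indent=2))
```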