Refer to Introduction to learn more about Shared Storage configuration on IDEA.
Apps and Data Storage (Required)
For the IDEA Cluster to function, the shared storage configuration must include both Apps and Data storage. Apps and Data are cluster-scoped file systems and are mounted automatically on all applicable infrastructure hosts, eVDI Linux sessions, and SOCA Compute Nodes.
Apps
Apps shared-storage is used to save critical cluster configuration scripts, files and logs.
For Scale-Out Computing workloads, additional applications (e.g. OpenMPI or IntelMPI, Python, solvers) can be installed on Apps shared storage and used by Compute Nodes.
Default Configuration:
Apps storage is mounted at /apps by default; the mount path is configurable.
Amazon EFS is used as the default storage provider for Apps storage.
Custom CloudWatch monitoring rules and a Lambda function are deployed for EFS Apps storage volumes. They monitor the throughput of the file system and dynamically switch the throughput mode between provisioned and bursting.
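The switching behaviour described above can be pictured with a small sketch. This is not IDEA's Lambda code, which this page does not show. It assumes the decision is driven by the EFS BurstCreditBalance CloudWatch metric; the function name and both watermark parameters are hypothetical.

```python
def choose_throughput_mode(burst_credit_balance: float,
                           current_mode: str,
                           low_watermark: float,
                           high_watermark: float) -> str:
    """Return the throughput mode ('bursting' or 'provisioned') to apply next.

    Assumption: low_watermark and high_watermark are illustrative thresholds,
    not values taken from IDEA.
    """
    # Burst credits are nearly exhausted: move to provisioned throughput.
    if current_mode == "bursting" and burst_credit_balance < low_watermark:
        return "provisioned"
    # Credits have recovered: return to bursting to avoid provisioned charges.
    if current_mode == "provisioned" and burst_credit_balance > high_watermark:
        return "bursting"
    # Between the two watermarks, keep the current mode to avoid flapping.
    return current_mode
```

Using two separate watermarks rather than one threshold keeps the mode from toggling back and forth when the metric hovers near a single value.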
Data
Data storage is primarily used to store User Home Directories.
Additional directories for project/group level file shares can be created on Data Storage.
Default Configuration:
Data storage is mounted at /data by default; the mount path is configurable.
Amazon EFS is used as the default storage provider for Data storage.
To save cost, the EFS lifecycle policy is set to move data to the Infrequent Access (IA) storage class after 30 days.
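For reference, a 30-day transition corresponds to the value AFTER_30_DAYS in the EFS lifecycle configuration API. The sketch below only builds the payload shape that boto3's put_lifecycle_configuration accepts and does not call AWS. The helper name is hypothetical, and IDEA may apply the policy through a different mechanism.

```python
def lifecycle_policies(transition_to_ia):
    """Build the LifecyclePolicies list for an EFS file system.

    transition_to_ia: an EFS value such as "AFTER_30_DAYS", or None to
    disable the transition to Infrequent Access.
    """
    if transition_to_ia is None:
        # An empty list turns lifecycle management off.
        return []
    return [{"TransitionToIA": transition_to_ia}]


# The resulting list would be passed as the LifecyclePolicies argument of
# boto3.client("efs").put_lifecycle_configuration(FileSystemId=...).
print(lifecycle_policies("AFTER_30_DAYS"))
```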
Scope
IDEA introduces the notion of scope so that cluster administrators can manage multiple file systems and specify mount criteria based on access, use case, and workload needs. Shared Storage mounts can be scoped based on:
Cluster
Cluster scoped shared storage mounts are applied across all nodes in the cluster. These include applicable infrastructure nodes, SOCA Compute Nodes and eVDI Hosts.
Module
Module scoped shared storage mounts are applicable across all nodes for the module, including applicable infrastructure nodes and eVDI or Compute Nodes.
Project
Project scoped shared storage mounts are applicable for Compute Nodes or eVDI Hosts, if they are launched for an applicable Project.
Scale-Out Computing: Queue Profiles
Queue Profile scoped shared storage mounts are applicable for all Compute Nodes launched for Jobs submitted to the queues configured under a Queue Profile.
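A minimal sketch of how the scopes above could be evaluated for a node. The class names, field names, and scope identifiers other than cluster are assumptions made for illustration and are not IDEA's internal model. Combined scopes are treated as an AND condition, as this page states for Project with Module and for Queue Profile with Project.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FileSystem:
    name: str
    scope: List[str]  # e.g. ["cluster"] or ["module", "project"] (hypothetical names)
    modules: List[str] = field(default_factory=list)
    projects: List[str] = field(default_factory=list)
    queue_profiles: List[str] = field(default_factory=list)


@dataclass
class Node:
    module: str
    project: Optional[str] = None
    queue_profile: Optional[str] = None


def applies_to(fs: FileSystem, node: Node) -> bool:
    """Return True if the file system should be mounted on the node."""
    if "cluster" in fs.scope:
        return True  # cluster scope applies to every node
    checks = []
    if "module" in fs.scope:
        checks.append(node.module in fs.modules)
    if "project" in fs.scope:
        checks.append(node.project in fs.projects)
    if "queue-profile" in fs.scope:
        checks.append(node.queue_profile in fs.queue_profiles)
    # Every configured scope must match (AND); no scopes means no mount.
    return bool(checks) and all(checks)
```

For example, a file system scoped to both a module and a project is mounted only on nodes that belong to that module and were launched for that project.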
Add or Attach Shared Storage to Cluster
The idea-admin.sh shared-storage utility enables admins to generate configurations for:
Provisioning new file systems
Reusing existing file systems
Either of the use-cases can be executed prior to initial cluster deployment OR after cluster deployment.
If shared storage configurations are updated after an IDEA Cluster is deployed, manual actions are required to mount the file system on existing cluster nodes; which nodes are affected depends on the Scope. All new hosts launched after the configuration update will automatically mount the configured file systems. See below for examples.
Provision new File System
Shared Storage config generation for provisioning new file systems is only supported for Amazon EFS at the moment.
To generate configurations for provisioning new file systems you can use the idea-admin.sh shared-storage add-file-system command as below:
Example
The idea-admin.sh utility automatically updates your IDEA cluster environment if you select "Update Cluster Settings and Exit". You can also choose "Deploy", which automates the steps described below. For this demo, we only update the cluster settings and then proceed with a manual deployment.
Once done, you can validate your new mount point in the web interface via "Cluster Management" > "Settings" > "Shared Storage"
At this point, the FileSystem ID is empty because you asked to provision a brand new EFS file system. To update the backend infrastructure and trigger the EFS creation, you must run the deploy command (see this page for more details about the deploy utility).
First, run idea-admin.sh cdk diff to confirm that the new EFS file system will be created:
Once the deployment command has completed, go back to the web interface and validate that the new EFS file system has been created and now has a valid FileSystem ID assigned.
To further validate the new mount point, submit a test job that prints the output of the df command:
qsub -- /bin/df -h
The job output should list the mount point of your new file system (/custom_path in this example).
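If you prefer to check the job output programmatically, the sketch below extracts the "Mounted on" column from df -h output. The sample text is made up for illustration (including the file system ID fs-EXAMPLE), and the parser assumes mount paths contain no spaces.

```python
def mount_points(df_output: str) -> list:
    """Return the 'Mounted on' column of `df -h` output as a list of paths."""
    lines = df_output.strip().splitlines()
    # Skip the header row; the mount point is the last whitespace-separated field.
    return [line.split()[-1] for line in lines[1:] if line.strip()]


# Hypothetical job output, not captured from a real cluster.
SAMPLE = """Filesystem                               Size  Used Avail Use% Mounted on
/dev/nvme0n1p1                            40G   12G   29G  30% /
fs-EXAMPLE.efs.us-east-1.amazonaws.com:/ 8.0E     0  8.0E   0% /custom_path
"""

if "/custom_path" in mount_points(SAMPLE):
    print("new file system is mounted")
```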
To generate configurations for attaching an existing file system, you can use the idea-admin.sh shared-storage attach-file-system command as below. This utility automatically searches for existing storage backends (FSx for Lustre/NetApp/OpenZFS/Windows, EFS) running in your VPC.
Example
Remove a File System
Run ./idea-admin.sh config delete shared-storage.<filesystem_name> to remove a shared filesystem from IDEA.
Removing a file system from the IDEA configuration does not delete the file system itself. If the file system was created by IDEA and you want to delete it, make sure to re-deploy the shared-storage module.
Shared Storage Providers
Amazon EFS
New EFS Configuration
Existing EFS Configuration
Amazon FSx for Lustre
Existing FSx for Lustre Configuration
Amazon FSx for NetApp ONTAP
Existing FSx for NetApp ONTAP
Amazon FSx for OpenZFS
Existing FSx for OpenZFS
Amazon FSx for Windows File Server
Existing FSx for Windows File Server
Visualize Cluster Settings
Shared storage settings can be viewed through the Web Portal and the IDEA CLI.
Web Portal
Navigate to "Cluster Management" > "Settings" > "Shared Storage"
Project and Module scopes can be combined to create an AND condition.
Queue Profile and Project scopes can be combined to create an AND condition.
$ ./idea-admin.sh shared-storage add-file-system --help
Usage: idea-admin shared-storage add-file-system [OPTIONS]
add new shared-storage file-system
Options:
--cluster-name TEXT Cluster Name
--aws-region TEXT AWS Region [required]
--aws-profile TEXT AWS Profile Name
--kms-key-id TEXT KMS Key ID
-h, --help Show this message and exit.
./idea-admin.sh shared-storage add-file-system \
  --aws-region <REGION> \
  --cluster-name <CLUSTER_NAME>

Add Shared Storage to an IDEA Cluster

Shared Storage Settings
? [Name] Enter the name of the shared storage file system (Must be all lower case, no spaces or special characters) testefs
? [Title] Enter a friendly title for the file system "New Shared EFS for Project A"
? [Shared Storage Provider] Select a provider for the shared storage file system Amazon EFS
? [Mount Directory] Location of the mount directory. eg. /my-mount-dir /custom_path
? [Mount Scopes] Select the mount scope for file system Cluster

New Amazon EFS Settings
? [Throughput Mode] Select the throughput mode Bursting
? [Performance Mode] Select the performance mode General Purpose
? [Enable CloudWatch Monitoring] Enable cloudwatch monitoring to manage throughput? No
? [Lifecycle Policy] Transition to infrequent access (IA) storage? Transition to IA Disabled
? [EFS Mount Options] Enter mount options nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport 0 0

Shared Storage Config
--------------------------------------------------------------------------------
testefs:
  title: '"New Shared EFS for Project A"'
  provider: efs
  scope:
  - cluster
  mount_dir: /custom_path
  mount_options: nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport 0 0
  efs:
    kms_key_id: ~
    encrypted: true
    throughput_mode: bursting
    performance_mode: generalPurpose
    removal_policy: DESTROY
    cloudwatch_monitoring: false
    transition_to_ia: ~
--------------------------------------------------------------------------------
? How do you want to proceed further? Update Cluster Settings and Exit
sync config entries to db. overwrite: True
updating config: shared-storage.testefs.title = "New Shared EFS for Project A"
updating config: shared-storage.testefs.provider = efs
updating config: shared-storage.testefs.scope = ['cluster']
updating config: shared-storage.testefs.mount_dir = /custom_path
updating config: shared-storage.testefs.mount_options = nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport 0 0
updating config: shared-storage.testefs.efs.kms_key_id = None
updating config: shared-storage.testefs.efs.encrypted = True
updating config: shared-storage.testefs.efs.throughput_mode = bursting
updating config: shared-storage.testefs.efs.performance_mode = generalPurpose
updating config: shared-storage.testefs.efs.removal_policy = DESTROY
updating config: shared-storage.testefs.efs.cloudwatch_monitoring = False
updating config: shared-storage.testefs.efs.transition_to_ia = None
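The "updating config" lines in the transcript above show the nested file system settings stored as dotted keys under the shared-storage prefix. The sketch below illustrates that mapping with a generic flatten helper. The helper is hypothetical and is not taken from the IDEA code base; only a subset of the transcript's settings is used.

```python
def flatten(config: dict, prefix: str = "") -> dict:
    """Flatten a nested dict into {dotted.key: value} pairs."""
    flat = {}
    for key, value in config.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested sections such as the provider-specific block.
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


config = {
    "testefs": {
        "provider": "efs",
        "scope": ["cluster"],
        "mount_dir": "/custom_path",
        "efs": {"encrypted": True, "throughput_mode": "bursting"},
    }
}

for key, value in flatten(config, "shared-storage").items():
    print(f"updating config: {key} = {value}")
```

The same dotted form is what the config delete command on this page addresses, for example shared-storage.<filesystem_name>.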
./idea-admin.sh shared-storage attach-file-system --help
Usage: idea-admin shared-storage attach-file-system [OPTIONS]
attach existing shared-storage file-system
Options:
--cluster-name TEXT Cluster Name
--aws-region TEXT AWS Region [required]
--aws-profile TEXT AWS Profile Name
--kms-key-id TEXT KMS Key ID
-h, --help Show this message and exit.
$ ./idea-admin.sh shared-storage attach-file-system \
  --aws-region <REGION> \
  --cluster-name <CLUSTER_NAME>

Add Shared Storage to an IDEA Cluster

Shared Storage Settings
? [Name] Enter the name of the shared storage file system (Must be all lower case, no spaces or special characters) demo
? [Title] Enter a friendly title for the file system Demo FS
? [VPC] Select the VPC from which an existing file system can be used vpc-0cb462f0bfc14526b (10.0.0.0/16) [<CLUSTER_NAME>-vpc]
? [Shared Storage Provider] Select a provider for the shared storage file system Amazon FSx for Lustre
? [Mount Directory] Location of the mount directory. eg. /my-mount-dir /demo
? [Mount Scopes] Select the mount scope for file system Cluster

Existing FSx for Lustre Settings
? [Existing FSx for Lustre] Select an existing Lustre file system fsx-lustre (FileSystemId: fs-01a2ccc035f0f007c, Provider: fsx_lustre)
? [Mount Options] Enter /etc/fstab mount options lustre defaults,noatime,flock,_netdev 0 0

Shared Storage Config
--------------------------------------------------------------------------------
demo:
  title: Demo FS
  provider: fsx_lustre
  scope:
  - cluster
  mount_dir: /demo
  mount_options: lustre defaults,noatime,flock,_netdev 0 0
  fsx_lustre:
    use_existing_fs: true
    file_system_id: fs-01a2ccc035f0f007c
    dns: fs-01a2ccc035f0f007c.fsx.us-east-1.amazonaws.com
    mount_name: drohpbev
--------------------------------------------------------------------------------
? How do you want to proceed further?
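The mount_options value in the generated config holds the trailing fields of an /etc/fstab entry (file system type, options, dump, and pass). As a rough sketch, the function below combines those fields with the device string conventionally used for FSx for Lustre and for EFS over NFS. The function itself is hypothetical, and how IDEA actually writes the entry on each node may differ.

```python
from typing import Optional


def fstab_entry(provider: str, dns: str, mount_dir: str,
                mount_options: str, mount_name: Optional[str] = None) -> str:
    """Compose one /etc/fstab line from shared storage settings."""
    if provider == "fsx_lustre":
        # Lustre devices use the form <dns>@tcp:/<mount_name>.
        if not mount_name:
            raise ValueError("fsx_lustre requires mount_name")
        device = f"{dns}@tcp:/{mount_name}"
    elif provider == "efs":
        # NFS devices use the form <dns>:/ for the file system root.
        device = f"{dns}:/"
    else:
        raise ValueError(f"provider not covered by this sketch: {provider}")
    return f"{device} {mount_dir} {mount_options}"


# Values taken from the attach-file-system example on this page.
print(fstab_entry(
    provider="fsx_lustre",
    dns="fs-01a2ccc035f0f007c.fsx.us-east-1.amazonaws.com",
    mount_dir="/demo",
    mount_options="lustre defaults,noatime,flock,_netdev 0 0",
    mount_name="drohpbev",
))
```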