Benchmarking the Amazon cc1.4xlarge EC2 instance.
These are the questions they aimed to answer:
We are asking very broad questions and testing assumptions along the lines of:
- Does the hot new 10 Gigabit non-blocking networking fabric backing the new instance types really make “legacy” compute farm and HPC cluster architectures, which make heavy use of network filesharing, possible?
- How does filesharing between nodes look and feel on the new network and instance types?
- Are the speedy ephemeral disks on the new instance types suitable for bundling into NFS shares or aggregating into parallel or clustered distributed filesystems?
- Can we use the replication features in GlusterFS to mitigate some of the risks of using ephemeral disk for storage?
- Should the shared storage built from ephemeral disk be assigned to “/scratch” or other non-critical duties due to the risks involved? What can we do to mitigate the risks?
- At what scale is NFS the easiest and most suitable sharing option? What are the best NFS server and client tuning parameters to use?
- When using parallel or cluster filesystems like GlusterFS, what rough metrics can we use to figure out how many data servers to dedicate to a particular cluster size or workflow profile?