Amazon Web Services (AWS) has launched EC2 instances that it says are optimized specifically for deep learning training.
The new Amazon EC2 Trn1 instances are powered by AWS Trainium chips, the second-generation ML chip designed by AWS, following its AWS Inferentia chips.
The cloud giant claims that these new instances are well suited to large-scale distributed training of complex deep learning models, such as those used for natural language processing and image recognition.
What do users get?
Trn1 instances are available in two configurations and are powered by up to 16 AWS Trainium chips and up to 128 vCPUs.
According to AWS, the instances offer up to 512GB of high-bandwidth memory and up to 3.4 petaflops of TF32/FP16/BF16 compute power, and feature a NeuronLink interconnect between chips. NeuronLink helps avoid communication bottlenecks when scaling workloads across multiple Trainium chips.
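To put the headline numbers in perspective, the short Python sketch below divides the quoted totals across the 16 chips of the largest configuration. It assumes memory and compute are split evenly between chips, which the announcement does not spell out; the function name `per_chip` is purely illustrative.

```python
# Aggregate figures quoted for the largest Trn1 configuration.
CHIPS = 16
TOTAL_HBM_GB = 512
TOTAL_PFLOPS = 3.4  # TF32/FP16/BF16


def per_chip(total, chips=CHIPS):
    """Split an instance-wide total evenly across chips (an assumption)."""
    return total / chips


hbm_per_chip_gb = per_chip(TOTAL_HBM_GB)
tflops_per_chip = per_chip(TOTAL_PFLOPS * 1000)  # 1 petaflop = 1000 teraflops

print(f"HBM per chip: {hbm_per_chip_gb:.0f} GB")
print(f"Compute per chip: {tflops_per_chip:.1f} TFLOPS")
```

Under that even-split assumption, each Trainium chip works out to roughly 32GB of high-bandwidth memory and a little over 210 teraflops.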
Additionally, Amazon says Trn1 instances are the first EC2 instances to enable up to 800Gbps of Elastic Fabric Adapter (EFA) network bandwidth for high-throughput network connections. Trn1 instances come with up to 8TB of local NVMe SSD storage for ultra-fast access to large data sets.
AWS also states that its Trainium chips include scalar, vector, and tensor engines purpose-built for deep learning algorithms.
Other new features of the Trainium chips include support for a wide range of data types (FP32, TF32, BF16, FP16, and UINT8), stochastic rounding, custom operators written in C++, and dynamic tensor shapes.
AWS Trainium shares the same AWS Neuron SDK as AWS Inferentia, which makes the transition to AWS Trainium easier for existing Inferentia users.
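For readers unfamiliar with the rounding mode mentioned above (usually called stochastic rounding), here is a minimal Python sketch of the general idea: a value is rounded up or down at random, with the odds weighted by how close it sits to each neighbour, so that rounding errors average out over many operations. This illustrates the textbook technique only; it is not a description of how Trainium implements it in hardware, and the function name `stochastic_round` is hypothetical.

```python
import math
import random


def stochastic_round(x, step=1.0, rng=random):
    """Round x to a multiple of `step`, choosing the upper neighbour with
    probability equal to x's fractional position between the two neighbours."""
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower  # in [0, 1)
    # Round up with probability `frac`, down otherwise.
    rounded = lower + (1 if rng.random() < frac else 0)
    return rounded * step


rng = random.Random(0)  # fixed seed so the demo is repeatable
x = 0.3
samples = [stochastic_round(x, step=1.0, rng=rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(f"round-to-nearest: {round(x)}, stochastic mean: {mean:.3f}")
```

Round-to-nearest always turns 0.3 into 0, losing the value entirely, while the average of many stochastically rounded samples stays close to 0.3. That property is why the technique is attractive for low-precision training.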
Where can I register?
Trn1 instances can be launched today in select AWS regions, including US East (N. Virginia) and US West (Oregon).
These Trn1 instances can be deployed using AWS Deep Learning AMIs, and container images are available through managed services such as Amazon SageMaker, Amazon Elastic Kubernetes Service (Amazon EKS), Amazon Elastic Container Service (Amazon ECS), and AWS ParallelCluster.
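As a rough illustration of what launching one of these instances might look like from code, the Python sketch below prepares a request for boto3's EC2 `run_instances` call. Several things here are assumptions not taken from the announcement: the instance type names `trn1.2xlarge` and `trn1.32xlarge`, the region codes `us-east-1` and `us-west-2`, the helper names, and the placeholder AMI ID, which must be replaced with a real Deep Learning AMI ID from your own account and region.

```python
# Region names from the announcement mapped to AWS region codes (assumed).
TRN1_REGIONS = {
    "US East (N. Virginia)": "us-east-1",
    "US West (Oregon)": "us-west-2",
}


def build_launch_request(image_id, instance_type="trn1.2xlarge"):
    """Build keyword arguments for boto3's EC2 run_instances call."""
    if not instance_type.startswith("trn1."):
        raise ValueError("expected a Trn1 instance type")
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
    }


def launch(region_name, image_id, instance_type="trn1.2xlarge"):
    """Launch one instance. Needs boto3, AWS credentials, and a real AMI ID."""
    import boto3  # imported lazily so the helpers above work without it

    ec2 = boto3.client("ec2", region_name=region_name)
    return ec2.run_instances(**build_launch_request(image_id, instance_type))


# "ami-EXAMPLE" is a placeholder, not a real Deep Learning AMI ID.
request = build_launch_request("ami-EXAMPLE", "trn1.32xlarge")
print(TRN1_REGIONS["US West (Oregon)"], request)
```

Only `build_launch_request` runs here; calling `launch` would start a billable instance, so it is left for readers to invoke deliberately.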
To find out more, head over to the Trn1 instances page on the Amazon EC2 site.