Announcing Enhancements to the Dell AI Platform with AMD

Technology

You have likely seen how promising artificial intelligence can be in a controlled environment. Across industries, many enterprises have successfully launched AI pilots and proof-of-concept (POC) projects. Yet, they often hit a wall when trying to transition these early successes into full-scale production to drive real business impact. As you move from a pilot to production, complexity skyrockets. You face steep challenges bridging the infrastructure gap, managing seamless data integration and migrating workloads effectively.

We understand these hurdles. The upcoming enhancements to the Dell AI Platform with AMD, a part of the Dell AI Factory, provide a secure, flexible, scalable, on-premises solution to help you confidently operationalize AI. Instead of piecing together disparate parts, this solution takes an end-to-end approach. We integrate compute, storage, network infrastructure and full AI and MLOps software stacks with your enterprise data sources, key software partner solutions and flexible professional services. The result is a fully validated, scalable solution for training, fine-tuning and high-throughput inference, eliminating the need for separate, siloed AI stacks.

The new Dell AI Platform with AMD

To help you deploy AI precisely and predictably, we are introducing two major enhancements to the Dell AI Platform with AMD.

First, we are introducing a new large-scale configuration designed for rigorous model pre-training, training and inferencing. It uses Dell PowerEdge XE9785 server nodes equipped with AMD Instinct MI355X GPUs and AMD EPYC CPUs, backed by Dell PowerSwitch networking and Dell PowerScale storage. The MI355X GPUs give the platform industry-leading memory capacity, enabling larger models and more efficient scaling. This is particularly valuable for large enterprises and service providers managing massive, round-the-clock AI demands.

Second, we are extending the Dell AI Factory modular architecture to include AMD Instinct MI350P PCIe GPUs and AMD EPYC CPUs. This new configuration utilizes Dell PowerEdge XE7745 and R7725 server nodes, Dell PowerSwitch networking and Dell PowerScale storage alongside the Dell AI Data Platform. If you want to transition from a pilot to production on a predictable, common foundation, this solution allows you to scale cost-effectively and address specific resource bottlenecks.

Both configurations use AMD ROCm and the AMD enterprise AI software stack, with native integration of open frameworks such as PyTorch and vLLM. The solution will also be available through the Dell Automation Platform, which simplifies initial deployments and ongoing scaling.
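For teams validating a node, a quick check of which backend a PyTorch install is targeting can be useful. The snippet below is a minimal sketch, not part of the Dell or AMD tooling: the function name `describe_accelerator_backend` is hypothetical, and it assumes only that ROCm builds of PyTorch reuse the `torch.cuda` API and set `torch.version.hip`, while other builds leave it unset.

```python
import importlib.util


def describe_accelerator_backend() -> str:
    """Return 'rocm', 'cuda', 'cpu' or 'torch-not-installed'.

    Hypothetical helper for illustration only.
    """
    # Avoid an ImportError on machines where PyTorch is not installed.
    if importlib.util.find_spec("torch") is None:
        return "torch-not-installed"

    import torch

    # ROCm builds of PyTorch expose GPUs through the torch.cuda namespace.
    if not torch.cuda.is_available():
        return "cpu"

    # torch.version.hip is a version string on ROCm builds and None otherwise.
    return "rocm" if getattr(torch.version, "hip", None) else "cuda"


if __name__ == "__main__":
    print(describe_accelerator_backend())
```

On a node with AMD Instinct GPUs and a ROCm build of PyTorch, this would be expected to report `rocm`; any other result suggests the installed build or drivers are worth a second look.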

Furthermore, a recent Omdia study commissioned by Dell Technologies validated that the Dell and AMD on-premises solution utilizing Dell PowerEdge XE9785 compute nodes with AMD Instinct MI355X GPUs delivers up to 65% lower total cost of ownership (TCO) than enterprise cloud alternatives. Powered by AMD Instinct GPUs and the open ROCm ecosystem, the solution delivers the performance, flexibility and cost efficiency enterprises need to move beyond pilots. You can explore the executive summary or read the full report for detailed insights.
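To put the "up to 65%" figure in concrete terms, the arithmetic can be sketched as follows. This is an illustration only: the function name and the cloud cost used in the example are hypothetical and are not taken from the Omdia study. Because the claim is "up to," the result is a best-case floor for on-premises TCO, not a prediction.

```python
def best_case_on_prem_tco(cloud_tco: float, max_reduction_pct: int = 65) -> float:
    """Lowest on-premises TCO implied by an 'up to N% lower' claim.

    cloud_tco: total cost of the cloud alternative (any currency).
    max_reduction_pct: headline maximum reduction, as a whole percent.
    """
    if cloud_tco < 0:
        raise ValueError("cloud_tco must be non-negative")
    if not 0 <= max_reduction_pct <= 100:
        raise ValueError("max_reduction_pct must be between 0 and 100")
    # An N% reduction leaves (100 - N)% of the original cost.
    return cloud_tco * (100 - max_reduction_pct) / 100


if __name__ == "__main__":
    hypothetical_cloud_tco = 1_000_000  # made-up figure for illustration
    print(best_case_on_prem_tco(hypothetical_cloud_tco))  # 350000.0
```

Actual savings depend on utilization, workload mix and the assumptions behind the study, so the full report is the place to check which scenario produced the headline number.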

Confidently operationalize AI with Dell and AMD

The core value of the Dell AI Platform with AMD is simple: it empowers organizations of all sizes to operationalize AI confidently across their functions to drive transformational business outcomes. It achieves this through three key capabilities:

Modularity
The new modular architecture provides a common path from small pilots to full-scale enterprise production. You can start with a single compute node with 2 GPUs for early testing. As you move into production, you can build on this infrastructure by adding GPU or compute nodes, storage and networking. As your user base grows, you can add exactly the resources you need, whether GPUs, CPUs, memory, network bandwidth or storage capacity, in small increments.

Versatility
You need an infrastructure that adapts to your workloads, not the other way around. Optimized for the AMD enterprise AI software and key software partner solutions, the platform handles a wide range of use cases. It serves as a common foundation for all your AI initiatives, with seamless model portability on open software stacks. Now you never have to worry about getting locked into inefficient, one-off silos.

Security
As AI usage spreads across your organization, maintaining strict security and governance is critical. The platform is built on Dell’s trusted on-premises foundation, which inherently minimizes external security risks. Additionally, the deep governance and controls built directly into the AMD enterprise AI Resource Manager help you put the right access and data management procedures in place to protect your business.

Turn your AI vision into reality

Bridging the gap between a successful pilot and a thriving production environment does not have to be a struggle. With new modular and large-scale configurations, you gain the flexibility, scalability and security needed to move AI into production confidently.

Ready to transform your business outcomes? Learn more about Dell and AMD AI solutions here.

Dell reported this
Source: www.dell.com