
The Ultimate Deep Learning server with up to 8 PCIe Gen 3 GPU accelerators on a Single Root Complex including the NVIDIA® Tesla® P40.

A lot has changed since Alexey Grigorevich Ivakhnenko and V. G. Lapa published the first functional Deep Learning algorithm in 1965. Today’s technology allows for unsupervised learning of multiple levels of features or representations of data, building on the output from the previous layer as input, at speeds that were completely unimaginable 50 years ago.

BOXX GX8 Rackmount Server Overview


The BOXX GX8 Rackmount Series supports the Intel® Xeon® processor E5-2600 v4 product family and up to 10 PCIe Gen 3.0 x16 compatible devices, such as NVIDIA® Tesla® P40 GPU cards, with multi-device peering on a single PCIe root complex. The Tesla P40 is powered by the NVIDIA Pascal™ architecture, which is driving the AI revolution and enabling HPC breakthroughs, so you get only the best for your accelerated applications. Multi-device peering on a single PCIe root complex makes the GX8 a perfect solution for GPU-accelerated applications and libraries used for deep learning, data analytics, and molecular dynamics, including Torch 7, Theano, Caffe, and TensorFlow.

The GX8 Series rackmount servers are different from other GPU-supporting hardware implementations. Most hardware configurations available today provide maximum performance only between specific pairs of GPUs. Because the GPUs are paired up, jobs that require communication between arbitrary GPUs take a performance hit. Additionally, there can be significant performance impacts when trying to scale beyond four GPUs on multi-socket systems. These have been persistent problems for customers who are pushing the limits of GPUs with large, complex datasets and calculations, or where data must be streamed between GPUs. Cirrascale has been able to overcome these issues and achieve near-linear performance scaling with its design.
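To make the pairing problem concrete, here is a rough count under a deliberately simplified model. The assumption (ours, not a measured result) is that two GPUs can peer directly only when they sit under the same PCIe root complex. The function name and the 4 + 4 dual-socket layout are hypothetical and chosen only for illustration.

```python
import math


def direct_peer_pairs(gpus_per_root):
    """Count GPU pairs that share a root complex.

    gpus_per_root: list holding the number of GPUs under each root complex.
    Simplifying assumption: two GPUs can peer directly only if they sit
    under the same root complex.
    Returns (pairs that can peer directly, all possible GPU pairs).
    """
    total_gpus = sum(gpus_per_root)
    all_pairs = math.comb(total_gpus, 2)
    direct = sum(math.comb(n, 2) for n in gpus_per_root)
    return direct, all_pairs


# Hypothetical dual-socket layout: 4 GPUs under each CPU's root complex.
print(direct_peer_pairs([4, 4]))  # (12, 28)

# Single root complex: all 8 GPUs behind one root.
print(direct_peer_pairs([8]))     # (28, 28)
```

In this model, fewer than half of the GPU pairs in the split layout can talk directly, while every pair can on a single root complex.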

By utilizing the Cirrascale SR3615 PCIe 96-lane switch riser, the BOXX GX8 supports up to eight NVIDIA Tesla P40 GPU cards and provides room for additional InfiniBand® or NVMe storage devices, while enabling higher bandwidth and lower latency between PCIe Gen3 devices than is possible in traditional systems. With up to eight discrete GPU accelerators communicating directly with each other over the PCIe bus, free of the need for host CPU intervention, the GPUs can form a "micro-cluster" that shares a single memory address space.

Product Features and Benefits

  • Supports the Cirrascale SR3615 PCIe switch riser enabling the peering of up to 10 PCIe Gen 3.0 x16 devices on a single root complex.
  • Supports dual Intel® Xeon® Processors E5-2600 v4 family.
  • Supports the NVIDIA® Tesla® P40 and P100 GPU Accelerators, powered by NVIDIA Pascal™ architecture, which is driving the AI revolution and enabling HPC breakthroughs, so you get only the best for your accelerated applications.
  • A perfect solution for highly parallel applications used for deep learning, data analytics, and molecular dynamics, such as Torch 7, Theano, Caffe, and TensorFlow.


Maximize PCIe Bandwidth

BOXX is a strong believer in using a technology to its fullest potential whenever possible, and GPUs and GPU accelerators are no exception. If a GPU has a PCIe Gen3 x16 link, it should use that link when communicating with other GPUs, any other GPUs. Our switch riser technology allows us to scale and peer multiple PCIe Gen3 x16 cards on a single root complex, ensuring that the maximum PCIe bandwidth is available for inter-card communication.
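For a sense of what "maximum PCIe bandwidth" means here, the short sketch below works out the theoretical one-direction figure for a PCIe Gen3 link from the published signaling rate (8 GT/s per lane) and 128b/130b encoding. The function name is our own, and packet and protocol overhead are ignored, so real-world throughput will be lower.

```python
def pcie_gen3_gigabytes_per_second(lanes=16):
    """Theoretical one-direction PCIe Gen3 bandwidth in GB/s.

    PCIe Gen3 signals at 8 GT/s per lane and uses 128b/130b encoding.
    Packet headers and flow-control overhead are ignored, so achievable
    throughput is lower than this figure.
    """
    transfers_per_second = 8e9          # 8 GT/s per lane
    encoding_efficiency = 128 / 130     # 128 payload bits per 130 bits sent
    bits_per_second = transfers_per_second * encoding_efficiency * lanes
    return bits_per_second / 8 / 1e9    # bits -> bytes -> GB


print(round(pcie_gen3_gigabytes_per_second(16), 2))  # 15.75
print(round(pcie_gen3_gigabytes_per_second(8), 2))   # 7.88
```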

Minimize Intercard Latency and Obtain Consistent Performance Between GPUs

Our switch riser allows GPUs to communicate as if they are all on the same bus... because they are. Gone are the days of needing a bounce-buffer in host memory, or leaving GPU DMA engines unused because they couldn't address other devices in the system. This reduces intercard latency while helping to maintain a consistent performance level between GPUs.
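As a back-of-the-envelope illustration of why removing the bounce buffer matters, the sketch below compares a direct GPU-to-GPU copy (one PCIe hop) with a copy staged through host memory (two hops). It is a simplified model, not a benchmark: it assumes each hop runs at the full theoretical Gen3 x16 rate of roughly 15.75 GB/s, that hops happen one after another, and that fixed latency is zero. The function name and the 256 MiB payload are arbitrary choices for the example.

```python
def transfer_time_ms(size_bytes, link_gb_per_s, hops):
    """Estimated copy time in milliseconds for a given number of PCIe hops.

    Simplifying assumptions: every hop runs at the full link rate, hops
    run sequentially, and fixed per-transfer latency is ignored.
    """
    seconds_per_hop = size_bytes / (link_gb_per_s * 1e9)
    return hops * seconds_per_hop * 1e3


size = 256 * 1024 ** 2   # 256 MiB payload (arbitrary example size)
link = 15.75             # approximate theoretical PCIe Gen3 x16 rate, GB/s

direct = transfer_time_ms(size, link, hops=1)  # GPU -> GPU
staged = transfer_time_ms(size, link, hops=2)  # GPU -> host memory -> GPU
print(f"direct: {direct:.1f} ms, staged: {staged:.1f} ms")
```

Under these assumptions the staged copy takes exactly twice as long as the direct one, before counting any host CPU or memory contention.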

Enable GPU-Centric Development and Usage

Since nearly all of the GPU traffic passes directly between the GPUs via the Cirrascale SR3615 switch riser, only a negligible amount of host resources is needed to perform GPU work. Additionally, with a single address space and simultaneous inter-card communication at full PCIe Gen3 x16 speeds, software can spend more time doing work than thinking about when to schedule data copies.

Supports the Largest Number of GPU Offerings

We work closely with our technology partners to ensure you're given the broadest offerings for your application. The BOXX GX8 Series supports both professional and consumer cards from the leading manufacturers including ground-breaking GPU accelerators, such as the NVIDIA® Tesla® P100 or P40 Accelerators designed specifically for deep learning applications.





The ideal platform for deep learning, the all-new BOXX GX8 is configurable with up to eight NVIDIA® Tesla™ or NVIDIA® Quadro™ GPU accelerators.

Up to 3.5 GHz
Up to 44 cores

Typically ships in 13 - 15 business days. Please refer to the Ship Date on the emailed receipt upon order completion.

Basic Configuration Specs

  • Configurations will vary greatly based on specific needs. Please contact us for a quote.



  • Dual Intel® Xeon® processors w/ up to 22 cores each (44 total cores)
  • Up to 1TB DDR4-2400MHz Memory
  • Up to eight NVIDIA® Quadro™ or NVIDIA® Tesla™ professional graphics cards by utilizing PCIe Gen 3 Switch Risers
  • Up to 4 x 1600W or 2000W Redundant Power Supplies
  • 8 x PCIe x16 (Gen3 x16 bus) slots
    2 x PCIe x16 (Gen3 x8 bus) slots
  • IPMI 2.0-compliant ASMB8-iKVM module and ASWM Enterprise
    WfM 2.0, DMI 2.0, WOL by PME, PXE


By utilizing PCIe 96-lane switch risers, the BOXX GX8 supports up to eight NVIDIA® GPU accelerators like the Tesla® P40, Tesla® P100, Quadro® GP100, and Quadro® P6000 while providing room for additional networking, storage adapters, and NVMe storage.



This revolutionary technology enables multi-device peering on a single PCIe root complex, ideal for deep learning inference, data analytics, and molecular dynamics applications like Caffe, Torch 7, Theano, TensorFlow, Neon, and AMBER.



What’s the Difference Between Artificial Intelligence, Machine Learning, and Deep Learning?

Artificial intelligence is the future. Artificial intelligence is science fiction. Artificial intelligence is already part of our everyday lives. All those statements are true; it just depends on what flavor of AI you are referring to. In this multi-part series, long-time tech journalist Michael Copeland explains the fundamentals of deep learning.



BOXX supports multiple configurations of its products and prefers to work closely with its customers and partners to determine the best fit for their company's needs. We work hard to listen and understand, and we can tailor any of our products to your specific requirements. If you're looking to accelerate training and inference of deep neural networks using applications like TensorFlow, Caffe, Torch 7, Theano, Neon, and AMBER, one of our performance specialists can guide you to the appropriate solution and configuration. Click below to connect with us.

Manufactured in the USA

At BOXX, we’re engineers and creative professionals too. In fact, we rely on SolidWorks, 3ds Max, and other applications every day. Our chassis are designed by BOXX engineers and proudly manufactured in the USA, but they aren’t built for sending emails or gaming. They’re crafted out of aircraft-quality aluminum and steel strengthening components. That means maximum airflow and cool, quiet operation, even with the most demanding hardware configurations.

Tech Support

At BOXX, we understand that you need to be back at work as soon as possible when something goes wrong. That's why YOUR productivity is always our top priority. Our in-house technical support operatives will attempt to reproduce any issue you report, even the most obscure one. We'll even overnight parts when necessary during your premium warranty period.


The BOXX Workflow

Keep working while you render! BOXX offers unique hardware packages specifically designed to reduce the bottlenecks that plague professional software applications. By offloading your rendering, simulation, or other multi-threaded tasks, creativity never has to be put on hold by your hardware. That's the philosophy behind The BOXX Workflow.



We understand that it's important to know where your money goes when purchasing a premium workstation. BOXX offers services and solutions that go far beyond what you'll find at Dell, HP, or Apple.