Run:AI brings virtualization to GPUs running Kubernetes workloads

In the early 2000s, VMware introduced the world to virtual servers that allowed IT to make more efficient use of idle server capacity. Today, Run:AI is introducing that same concept to GPUs running containerized machine learning projects on Kubernetes.

This should give data science teams access to more resources than they would normally get if they were simply allocated a fixed number of GPUs. Company CEO and co-founder Omri Geller says his company believes that part of what keeps AI projects from getting to market is static resource allocation holding back data science teams.

“There are many times when those important and expensive compute resources are sitting idle, while at the same time, other users that might need more compute power since they need to run more experiments don’t have access to available resources because they are part of a static assignment,” Geller explained.

To solve that issue of static resource allocation, Run:AI came up with a solution to virtualize those GPU resources, whether on prem or in the cloud, and let IT define by policy how those resources should be divided.

“There is a need for a specific virtualization approach for AI and actively managed orchestration and scheduling of those GPU resources, while providing the visibility and control over those compute resources to IT organizations and AI administrators,” he said.

Run:AI creates a resource pool and allocates from it based on need. Image Credits: Run:AI

Run:AI built a solution to bridge this gap between the resources IT is providing to data science teams and what they require to run a given job, while still giving IT some control over defining how that works.

“We really help companies get much more out of their infrastructure, and we do it by really abstracting the hardware from the data science, meaning you can simply run your experiment without thinking about the underlying hardware, and at any moment in time you can consume as much compute power as you need,” he said.
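To make that concrete, here is a toy Python sketch of the idea — not Run:AI’s actual code; every name and number below is a hypothetical illustration. IT sets a guaranteed quota per team, and a team can burst past its static share whenever other teams are leaving GPUs idle:

POOL_SIZE = 16                      # total GPUs in the shared pool
QUOTAS = {"vision": 8, "nlp": 8}    # guaranteed share per team, set by IT policy

allocated = {team: 0 for team in QUOTAS}

def request_gpus(team, count):
    """Grant up to `count` GPUs: a team can burst past its quota
    whenever the rest of the pool is sitting idle."""
    idle = POOL_SIZE - sum(allocated.values())
    grant = min(count, idle)
    allocated[team] += grant
    return grant

# The vision team bursts to 12 GPUs because nlp is idle; a real
# scheduler would also preempt the 4 borrowed GPUs if nlp came
# back to claim its guaranteed quota of 8.
print(request_gpus("vision", 12))   # -> 12, not capped at the static 8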

While the company is still in its early stages and the current economic situation is hitting everyone hard, Geller sees a place for a solution like Run:AI because it gives customers the ability to make the most of existing resources while helping data science teams run more efficiently.

He also is taking a realistic long view when it comes to customer acquisition during this time. “These are challenging times for everyone,” he says. “We have plans for longer time partnerships with our customers that are not optimized for short term revenues.”

Run:AI was founded in 2018. It has raised $13 million, according to Geller. The company is based in Israel with offices in the United States. It currently has 25 employees and a few dozen customers.


By Ron Miller

Nvidia launches colossal HGX-2 cloud server to power HPC and AI

Nvidia launched a monster box yesterday called the HGX-2, and it’s the stuff that geek dreams are made of. It’s a cloud server purported to be so powerful that it combines high performance computing and artificial intelligence requirements in one exceptionally compelling package.

You know you want the specs, so let’s get to them: It starts with 16 Nvidia Tesla V100 GPUs. That’s good for 2 petaFLOPS for AI at low precision, 250 teraFLOPS at medium precision and 125 teraFLOPS for those times when you need the highest precision. It comes standard with half a terabyte of memory and 12 Nvidia NVSwitches, which enable GPU-to-GPU communication at 300 GB per second. That doubles the capacity of the HGX-1, released last year.
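Those aggregates line up with Nvidia’s published per-V100 peak figures; here’s a quick back-of-the-envelope check in Python (the per-GPU numbers are Nvidia’s spec-sheet values for the 32 GB V100, not figures from this announcement):

# Published peak figures for a single Tesla V100 (32 GB HBM2 variant)
V100_TENSOR_TFLOPS = 125   # low/mixed precision on tensor cores
V100_FP32_TFLOPS = 15.7    # single ("medium") precision
V100_FP64_TFLOPS = 7.8     # double ("highest") precision
V100_MEMORY_GB = 32

GPUS = 16
print(GPUS * V100_TENSOR_TFLOPS / 1000)  # 2.0 petaFLOPS, low precision
print(GPUS * V100_FP32_TFLOPS)           # 251.2 ~= 250 teraFLOPS
print(GPUS * V100_FP64_TFLOPS)           # 124.8 ~= 125 teraFLOPS
print(GPUS * V100_MEMORY_GB)             # 512 GB -- half a terabyte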

Chart: Nvidia

Paresh Kharya, group product marketing manager for Nvidia’s Tesla data center products, says this communication speed enables them to treat the GPUs essentially as one giant, single GPU. “And what that allows [developers] to do is not just access that massive compute power, but also access that half a terabyte of GPU memory as a single memory block in their programs,” he explained.

Graphic: Nvidia

Unfortunately, you won’t be able to buy one of these boxes. In fact, Nvidia is distributing them strictly to resellers, who will likely package these babies up and sell them to hyperscale datacenters and cloud providers. The beauty of this approach for cloud resellers is that when they buy it, they have the entire range of precision in a single box, Kharya said.

“The benefit of the unified platform is as companies and cloud providers are building out their infrastructure, they can standardize on a single unified architecture that supports the entire range of high performance workloads. So whether it’s AI, or whether it’s high performance simulations, the entire range of workloads is now possible in just a single platform,” Kharya explained.

He points out this is particularly important in large-scale datacenters. “In hyperscale companies or cloud providers, the main benefit that they’re providing is the economies of scale. If they can standardize on the fewest possible architectures, they can really maximize the operational efficiency. And what HGX allows them to do is to standardize on that single unified platform,” he added.

As for developers, they can write programs that take advantage of the underlying technologies and program at the exact level of precision they require from a single box.
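As a rough illustration of what that looks like in practice — assuming a PyTorch environment on a CUDA-capable GPU, which is my choice for the example rather than anything Nvidia specified — picking a precision is just a matter of choosing a dtype, with the tensor cores engaged via mixed-precision autocasting:

import torch

# Double precision for HPC-style numerics (the 125-teraFLOPS path).
a = torch.randn(1024, 1024, dtype=torch.float64, device="cuda")
c64 = a @ a.t()

# Single precision (the 250-teraFLOPS path).
a32 = a.float()
c32 = a32 @ a32.t()

# Mixed precision: matmuls run in FP16 on the V100's tensor cores
# (the 2-petaFLOPS path).
with torch.cuda.amp.autocast():
    c16 = a32 @ a32.t()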

The HGX-2 powered servers will be available later this year from partner resellers including Lenovo, QCT, Supermicro and Wiwynn.


By Ron Miller

Pure Storage teams with Nvidia on GPU-fueled Flash storage solution for AI

As companies gather increasing amounts of data, they face a choice about where the bottleneck will live: in the storage component or in the backend compute system. Some companies have attacked the problem with GPUs to streamline the compute side, or with Flash storage to speed up the storage side. Pure Storage wants to give customers the best of both worlds.

Today it announced AIRI, a complete data storage solution for AI workloads in a box.

Under the hood, AIRI starts with a Pure Storage FlashBlade, a storage solution that Pure created specifically with AI and machine learning processing in mind. Nvidia contributes the raw power with four Nvidia DGX-1 supercomputers, delivering four petaFLOPS of performance with Nvidia Tesla V100 GPUs (each DGX-1 packs eight V100s, good for roughly one petaFLOPS of tensor performance). Arista provides the networking hardware to make it all work together with Arista 100GbE switches. The software glue layer comes from the Nvidia GPU Cloud deep learning stack and the Pure Storage AIRI Scaling Toolkit.

Photo: Pure Storage

One interesting aspect of this deal is that FlashBlade operates as a separate product line inside the Pure Storage organization. The company has put together a team of engineers with AI and data-pipeline expertise, focused on finding ways to move beyond the traditional storage market toward wherever the market is going.

This approach certainly does that, but the question is whether companies want to chase the on-prem hardware approach or take this kind of data to the cloud. Pure would argue that the data gravity of AI workloads makes this difficult to achieve with a cloud solution, but we are seeing increasingly large amounts of data moving to the cloud, with the cloud vendors providing tools for data scientists to process that data.

If companies choose to go the hardware route over the cloud, each vendor in this equation — whether Nvidia, Pure Storage or Arista — should benefit from a multi-vendor sale. The idea ultimately is to provide customers with a one-stop solution they can install quickly inside a data center if that’s the approach they want to take.