Elon Musk's xAI Supercomputer "Colossus" Goes Live with 100,000 Nvidia GPUs

Key points

  • xAI, owned by Elon Musk, has launched its supercomputer "Colossus," equipped with 100,000 Nvidia GPUs.
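  • The supercomputer was built in just four months, and xAI plans to double its GPU count to 200,000 in the coming months to speed up AI model training.
  • Colossus will be used to train the next Grok AI models and is expected to support other projects, including the AI behind Tesla's Optimus robot.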

    Elon Musk’s xAI has officially brought its new supercomputer, named "Colossus," online at a facility in Memphis, Tennessee. The massive machine, built to train AI models, consists of 100,000 Nvidia H100 GPUs and was set up in a remarkably short period of just four months.

    The supercomputer is part of xAI's broader plan to scale up its computing capacity: the company intends to double the GPU count to 200,000 in the coming months. The expansion includes 50,000 of Nvidia's newer H200 GPUs, which offer more memory capacity and memory bandwidth than the H100. Musk has committed billions of dollars to sourcing these GPUs, which are currently in high demand among major tech players like Meta, Google, Amazon, and Microsoft.

    Despite the fierce competition for Nvidia's GPUs, xAI managed to secure its initial batch by redirecting units originally intended for Tesla. Colossus is believed to be one of the largest GPU clusters in existence and will play a critical role in training the next version of Grok, xAI's AI chatbot. Grok-2 was released in beta earlier in 2024, and Grok-3, set to launch by the end of 2024, will be trained on Colossus.

    While the new supercomputer will significantly increase the compute available for training Grok and other AI projects, it remains to be seen whether xAI can outpace competitors like Meta, which is also rapidly expanding its GPU infrastructure. In addition to Grok, Colossus is expected to support the AI models behind Tesla's Optimus robot, further showcasing Musk's ambitious plans in the AI space.