Mon, 19 Aug 2024
Latitude.sh started offering bare-metal A100 and H100 GPUs in 2023, and so far in 2024 has already added L40S GPUs and containerized GPUs through Launchpad. What features and products are your customers most asking for next, and what do you think your ML offering will look like in a year’s time?
When we talk about AI clusters, on the one hand, we have noticed a growing demand for raw GPU processing power for training large language models, as model sizes are scaling quickly and already reaching hundreds of billions of parameters.
Training these models requires terabytes of memory, which can only be achieved by linking many latest-generation GPUs, so customers have already started requesting larger instances with those specs.
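As a rough illustration of why training at this scale spans many GPUs, the sketch below estimates the memory needed for model state alone. It rests on assumptions that are not from the interview: mixed-precision training with the Adam optimizer at roughly 16 bytes per parameter (a common rule of thumb), a hypothetical 175-billion-parameter model, and 80 GB of memory per GPU (the larger A100 and H100 configurations). Activations and framework overhead are ignored, so real requirements would be higher.

```python
import math

# Assumption: mixed-precision Adam keeps about 16 bytes of state per parameter
# (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 master weights
# and the two Adam moment estimates).
BYTES_PER_PARAM = 16

# Assumption: each GPU has 80 GB of memory (larger A100/H100 configurations).
GPU_MEMORY_GB = 80


def training_state_gb(num_params: int, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Memory for weights, gradients and optimizer state, in gigabytes (1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def min_gpus(num_params: int,
             gpu_memory_gb: float = GPU_MEMORY_GB,
             bytes_per_param: int = BYTES_PER_PARAM) -> int:
    """Lower bound on the GPU count needed just to hold the model state."""
    return math.ceil(training_state_gb(num_params, bytes_per_param) / gpu_memory_gb)


if __name__ == "__main__":
    params = 175_000_000_000  # hypothetical model size
    print(f"Model state: {training_state_gb(params) / 1000:.1f} TB")
    print(f"GPUs needed (lower bound): {min_gpus(params)}")
```

Under these assumptions the model state alone comes to a few terabytes, which is the order of magnitude described above and far more than any single GPU can hold.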
On the other hand, fine-tuning and inference workloads are becoming more efficient and require less compute power compared to training from scratch. This makes the L40S GPU a more cost-effective option for these tasks than the higher-end H100s and A100s.
Instead of aiming to become an ‘AI farm’, our goal is to continue developing solutions tailored to our clients’ needs. We are focused on creating APIs that provide practical, applied functionalities and address the challenges users face, rather than simply expanding our portfolio.
The cloud GPU market is not as big or mature as what we see for CPU servers and there’s plenty we need to test and learn before committing to growing our GPU portfolio. However, our team is very committed to conducting research and performing tests to better understand the best ways we can address the growing demand for GPU compute, which is why we released Launchpad earlier this year.
There are specific workloads that can greatly benefit from a containerized compute experience and our goal is to enhance our product to address that demand and help users unlock new use cases as well.
Since Launchpad was released, it has evolved based on the feedback from the Machine Learning community, and now has many new features available to improve the container experience, such as SSH support, persistent storage, per-minute billing, and access to a wide blueprint library created by the users themselves.
Our commitment is to make AI infrastructure easier to use so that the users can focus on their applications. In the next year, we plan to keep working closely with the AI community to make our products even better suited to a wide range of AI tasks.
When you announced Launchpad, you mentioned your commitment to the open development of AI, and democratizing AI training by providing developers with the tools they need to scale up their training. How does your support of ML competitions contribute to this vision?
We firmly believe that Artificial Intelligence is the most powerful technology ever built, and its impact and reach will only keep growing until it becomes constantly present for everyone, everywhere.
The problem is that the most powerful ML models widely available today are controlled by just a handful of companies fighting for dibs on AI, and no single company should hold that much power. We believe AI should be open, accessible, and transparent.
With that in mind, our dedication to open development and making AI accessible to everyone must go beyond just providing hardware solutions to potential customers, which is why we decided to start providing GPU grants to researchers and, more recently, supporting ML competitions.
By actively supporting ML competitions we can encourage collaboration and knowledge sharing within the AI community. These contests provide a space for developers to showcase their skills, test new ideas, and contribute to the overall advancement of AI.
Moreover, by equipping them with the necessary tools to scale up their training efforts, we are empowering a new generation of AI practitioners to drive innovation further and make significant contributions to the field so open-source AI can grow to become the standard in the industry.
With its 23-year history, Latitude.sh predates most companies in the space. It was around even before the term “cloud computing” became widely used, and has experienced multiple waves of innovation in the industry. From that vantage point, what can you tell us about the current machine-learning revolution?
The tech infrastructure industry has evolved considerably over the past decades, and the amount of compute power and network bandwidth available in servers today is just amazing. That is one of the main reasons why Machine Learning projects have been scaling so fast.
In the past, there just wasn’t enough computing power available to the market at scale, and most importantly, at affordable prices. You would only hear about machine learning projects carried out on supercomputers located at huge tech companies or in renowned universities with access to rich sponsors.
However, as companies increased investment in chip R&D and top-tier hardware became widely available on cloud computing platforms, that situation started to shift. A growing number of organizations, teams, and individuals gained access to the compute needed to enter the Machine Learning landscape, which greatly contributed to the growth and evolution of this market.
The Machine Learning we see today represents a significant shift in how data is processed and utilized. We are currently experiencing an unprecedented convergence of technology and innovation thanks to cutting-edge advancements in hardware, but most importantly due to the growing number of people working on AI.
At Latitude.sh, we are proud to be at the forefront of this revolution, empowering organizations worldwide to unleash the full potential of AI and shape the future of technology. With our extensive experience, we are well-positioned to support this transformation by providing the necessary infrastructure to facilitate the deployment of open-source AI and offering powerful tools to everyone who needs them.
Given Latitude.sh’s long history of providing bare-metal hardware and hosting solutions, how has that given you an edge in building out your new ML training and inference products?
Our expertise in providing bare-metal and hosting solutions has given us a significant advantage in decreasing the time to launch our GPU cloud infrastructure.
Throughout the years, we have been expanding and improving our platform’s backbone by opening new sites and upgrading the network. Once we identified the opportunity to support machine learning workloads with GPU servers, it was just a matter of understanding the main hardware requirements they had and then adapting the specs of our racks to add these machines.
Everything else needed to provide these new types of instances to developers, like automation, was already in place and our GPU offering benefited from our dev-first platform from the day it was first installed in our data centers.
Additionally, our extensive experience in managing complex hardware infrastructure, optimizing servers for performance, and ensuring platform reliability has been honed over the years. This expertise has proven to be a vital asset as we’ve ventured into the realm of AI and Machine Learning, where efficient resource utilization and computational power are crucial.
Could you tell us a bit about the company’s history, and your approach to scaling up to the global footprint you have now, with data centers in 22 locations?
The journey of Latitude.sh, formerly known as Maxihost, began over two decades ago, when we saw an opportunity to meet demand in the early stages of the commercial internet by hosting websites.
Since then, we’ve undergone a significant transformation, evolving our core business and expanding our global footprint to meet the needs of our users.
From hosting website servers in the early 2000s, we grew to become a global cloud computing platform specialized in offering bare metal servers across different verticals, and our approach to scaling has been guided by a commitment to platform reliability, server scalability, and customer satisfaction.
This commitment was the key to our global expansion, as our customers were the ones asking us to open new locations so they could use our servers to meet their own demand on multiple continents.
From the Americas, we expanded to Europe, then to APAC, and now our infrastructure backbone enables us to deliver world-class services to customers all around the globe from 22 locations.
Latitude.sh is the cloud that powers innovation. From demanding applications to machine learning workflows, provision scalable, high-performance, and cost-effective cloud infrastructure on an easy-to-use and modern platform that puts developers first.
With both GPU containers and fully dedicated clusters available, choose the option that best fits your compute needs to effortlessly train, fine-tune and run inference on your Machine Learning models.
Start training today! Use the code MLC24 to get 10% off during your first 3 months.
Latitude.sh sponsored ML Contests’ State of Competitive ML (2023) report.