{"id":33076,"date":"2025-09-24T16:41:36","date_gmt":"2025-09-24T16:41:36","guid":{"rendered":"https:\/\/agooka.com\/news\/business\/a-former-apple-luminary-sets-out-to-create-the-ultimate-gpu-software\/"},"modified":"2025-09-24T16:41:36","modified_gmt":"2025-09-24T16:41:36","slug":"a-former-apple-luminary-sets-out-to-create-the-ultimate-gpu-software","status":"publish","type":"post","link":"https:\/\/agooka.com\/news\/business\/a-former-apple-luminary-sets-out-to-create-the-ultimate-gpu-software\/","title":{"rendered":"A Former Apple Luminary Sets Out to Create the Ultimate GPU Software"},"content":{"rendered":"<p>At a certain point between building Apple\u2019s developer tools, leading a core part of Google\u2019s AI infrastructure team, and clashing with Elon Musk during a stint as Tesla\u2019s Autopilot chief, Chris Lattner\u2019s vision for his life\u2019s work started to come into focus. AI was taking over the world, and demand was growing for the chips that powered it. But the software stack for those chips was dominated by just a few big companies. Would developers be able to easily run their code across all the different chips dotting the AI landscape?<\/p>\n<p>Lattner\u2019s answer to that question is Modular, a software startup he founded in 2022 with his former Google colleague Tim Davis. Modular makes a unifying software layer that helps cloud businesses squeeze as much juice as possible out of GPUs and CPUs\u2014the high-powered chips that underpin generative AI. The startup has also built a new coding language, based on Python, that lets developers use a single language to build AI apps that run across multiple GPUs and CPUs. 
Modular\u2019s basic premise is that if a developer builds an app for one chip, they shouldn\u2019t have to jump through hoops to run it on another vendor\u2019s chip.<\/p>\n<p>But Modular\u2019s long-term goal is even more ambitious: to loosen the software choke hold that companies like Nvidia and AMD have on the industry, and become the de facto software for AI chips.<\/p>\n<p>\u201cOur thesis is that the need for compute power is just exploding, but there is no unified compute platform,\u201d Lattner says. \u201cSovereign AI will be everywhere. There will be many Stargates. But there will be different types of chips optimized for different use cases, and there needs to be a unified layer for that.\u201d<\/p>\n<p>There are early signs that Modular\u2019s thesis is bearing out. AI giants like Nvidia, AMD, and Amazon have partnered with the startup to test the waters. The GPU cluster company SF Compute also worked with Modular to build what they claim is the world\u2019s cheapest API for large AI models. As of this week, Modular\u2019s developer platform now supports Apple Silicon GPUs, in addition to Nvidia and AMD chips.<\/p>\n<p>Building on this momentum, Modular just raised $250 million in venture capital funding, its third round of financing in three years, bringing its valuation to $1.6 billion. The round was led by the Pittsburgh-based US Innovative Technology Fund. DFJ Growth also invested, along with existing investors General Catalyst, Greylock, and GV (formerly known as Google Ventures).<\/p>\n<p>\u201cWe\u2019ve spent a bunch of time and energy trying to figure out what makes a startup in this space interesting, and with every company that has tried to build their own chip\u2014and even the big players, like AMD and Nvidia\u2014it all comes back to the software,\u201d says Dave Munichiello, managing partner at GV. 
\u201cChris convinced me that the software was the most interesting and valuable problem to address.\u201d<\/p>\n<p>It might be valuable\u2014but it\u2019s also extremely complicated. Part of that complication stems from Nvidia\u2019s closed ecosystem. Nvidia\u2019s chips make up the vast majority of the GPU market, but the company\u2019s 20-year-old proprietary software platform, CUDA, keeps developers locked in. AMD\u2019s software platform for high-performance computing, called ROCm, differs in that it\u2019s open source, which allows developers to more easily move code to different chips.<\/p>\n<p>Still, developers say that bringing code from Nvidia\u2019s CUDA to ROCm isn\u2019t a smooth process, which means they typically focus on building for just one chip vendor.<\/p>\n<p>\u201cROCm is amazing, it\u2019s open source, but it runs on one vendor\u2019s hardware,\u201d Lattner told the crowd at AMD\u2019s Advancing AI event in June. Then he made his pitch for why Modular\u2019s software is more portable and can make GPUs run even faster.<\/p>\n<p>Lattner\u2019s talk at AMD is representative of the kind of dance that he and Davis need to do as they spread the Modular gospel. Today, Nvidia and AMD are both crucial partners for the firm. In a future universe, they\u2019re also direct competitors. Part of Modular\u2019s value proposition is that it can ship software for optimizing GPUs even faster than Nvidia can, as there might be a months-long gap between when Nvidia ships a new GPU and when it releases an \u201cattention kernel\u201d\u2014a critical piece of the GPU software stack.<\/p>\n<p>\u201cRight now Modular is complementary to AMD and Nvidia, but over time you could see both of those companies feeling threatened by ROCm or CUDA not being the best software that sits on top of their chips,\u201d says Munichiello. 
He also worries that potential cloud customers may balk at having to pay for an additional software layer like Modular\u2019s.<\/p>\n<p>Writing software for GPUs is also something of a \u201cdark art,\u201d says Waleed Atallah, the cofounder and CEO of Mako, a GPU kernel optimization company. \u201cMapping an algorithm to a GPU is an insanely difficult thing to do. There are a hundred million software devs, 10,000 who write GPU kernels, and maybe a hundred who can do it well.\u201d<\/p>\n<p>Mako is building AI agents to optimize coding for GPUs. Some developers think that\u2019s the future for the industry, rather than building a universal compiler or a new programming language, as Modular is doing. Mako just raised $8.5 million in seed funding from Flybridge Capital and the startup accelerator Neo.<\/p>\n<p>\u201cWe\u2019re trying to take an iterative approach to coding and automate it with AI,\u201d Atallah says. \u201cBy making it easier to write the code, you exponentially grow the number of people who can do that. Making another compiler is more of a fixed solution.\u201d<\/p>\n<p>Lattner notes that Modular also uses AI coding tools. But the company is intent on addressing the whole coding stack, not just kernels.<\/p>\n<p>There are roughly 250 million reasons why investors think this approach is viable. Lattner is something of a luminary in the coding world, having previously built the open source compiler infrastructure project LLVM, as well as Apple\u2019s Swift programming language. He and Davis are both convinced that this is a software problem that must be solved outside of a Big Tech environment, where most companies focus on building software for their own technology stack.<\/p>\n<p>\u201cWhen I left Google I was a little bit depressed, because I really wanted to solve this,\u201d Lattner says. \u201cWhat we realized is that it\u2019s not about smart people, it\u2019s not about money, it\u2019s not about capability. 
It\u2019s a structural problem.\u201d<\/p>\n<p>Munichiello shared a mantra common in the tech investing world: He says he\u2019s betting on the founders themselves as much as their product. \u201cHe\u2019s highly opinionated and impatient, and also right a lot of the time,\u201d Munichiello says of Lattner. \u201cSteve Jobs was also like that\u2014he didn\u2019t make decisions based on consensus, but he was often right.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"<p>At a certain point between building Apple\u2019s developer tools, leading a core part of Google\u2019s AI infrastructure team, and clashing with Elon Musk during a stint as Tesla\u2019s Autopilot chief, Chris Lattner\u2019s vision for his life\u2019s work started to come into focus. AI was taking over the world, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":33077,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[36],"tags":[],"class_list":{"0":"post-33076","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-business"},"_links":{"self":[{"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/posts\/33076","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/comments?post=33076"}],"version-history":[{"count":0,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/posts\/33076\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/media\/33077"}],"wp:attachment":[{"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/media?parent=33
076"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/categories?post=33076"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/agooka.com\/news\/wp-json\/wp\/v2\/tags?post=33076"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}