Google Cloud now has a dedicated cluster of Nvidia GPUs for YC startups

zhyder • 11 months ago

This link doesn't indicate the credits are exclusive to YC, seems they're open to all AI startups: https://cloud.google.com/startup/apply

Astroboy007 • 11 months ago

While that is true, they state that they are specifically giving every YC company this deal. So you won’t have to hope that they accept you into their program; once you’re in YC, you’re guaranteed it.

sieabahlpark • 11 months ago

Honestly I see this as anti-competitive of Google to award only YC companies this benefit. All, specific, or none.

eru • 11 months ago

Why? Google doesn't have any monopoly on compute clusters, in fact they aren't even the market leader.

(And for all we know, Google is willing to offer similar deals to other accelerators in return for some compensation.)

sokoloff • 11 months ago

I see this as competitive rather than anti-competitive. Google does not have a monopoly position here and is competing to become the supplier to these startups.

That’s a well-functioning market, I think.

dartos • 11 months ago

It’s specifically YC companies.

It’s likely so that GCP can get its hooks in these startups early.

boringg • 11 months ago

It's not likely. It is why they are doing it.

chrisandchris • 11 months ago

I see it as an advantage for their competitors: Google will waste so much money on hyped AI startups that others can invest in other things.

blitzar • 11 months ago

YC startups given startup status and preapproved for "Google for Startups Cloud Program" didn't sound as good.

YetAnotherNick • 11 months ago

> Google will provide a dedicated cluster with priority access

This is the key part. Even with the credits there is a GPU shortage.

sashank_1509 • 11 months ago

At a startup I worked at, we received 230k in Google Cloud credits and 180k in AWS credits, no questions asked. Especially when your company is pre-product and not scaling, these cloud credits allow you to iterate rapidly.

PJones2000 • 11 months ago

We got AWS credits too, but when we wanted to do some serious ML there were never any machines available. Our contact person at AWS just stopped replying to emails when we raised this. While one shouldn't look a gift horse in the mouth, it was just a waste of time and we had more success elsewhere.

RileyJames • 11 months ago

Our experience exactly. AWS credits are advertised everywhere, but seem difficult to utilise.

93po • 11 months ago

https://scientistseessquirrel.wordpress.com/2019/04/16/for-t...

sashank_1509 • 11 months ago

We’ve not had issues getting T4 GPU instances from AWS. We faced difficulties provisioning A100s, and it's annoying that AWS's two tiers are either T4 or straight to A100s, I think.

We use AWS GPU machines for our CI, but for serious ML training workloads we use GCP L4 GPU instances. Even in GCP we couldn’t provision A100s or H100s (our quota itself is just 1 GPU of these instances), but we’ve never had issues provisioning L4 GPUs, and I think that’s enough for smaller, non-LLM-scale models. For LLM-scale startups, it’s tough provisioning GPUs even if you have money.

(We’re based in Bay Area btw)

playacools • 11 months ago

Did you try this? We've had good luck using capacity blocks, they always seem to have some A100s and H100s available in the next few days. You can just get them yourself, don't have to wait for someone from AWS to help you.

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-capa...

PJones2000 • 11 months ago

This seems promising, I'll be trying this out. Thank you for the pointer.

Lucasoato • 11 months ago

What kind of machine weren't you able to find? Were they specific H100 or A100 setups? Were your credits bound to one particular region?

PJones2000 • 11 months ago

No, credits weren't tied to a region but even switching to a few other regions we couldn't get hold of machines. We asked if there was a way to search across all regions so we could switch to where the machines were located. No reply. No A100s or H100s at all. We were just looking for more than 24GB VRAM which is not a big ask. I don't understand the AWS credits; however I noticed they asked a lot of questions about our business model, technology and customers. Draw from that what you will.
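One rough way to approximate a cross-region search is to ask the EC2 API, region by region, whether an instance type is listed there at all. Below is a minimal sketch in Python, assuming boto3 and configured AWS credentials. `regions_offering` and `boto3_factory` are hypothetical helper names, and `p4d.24xlarge` is only an example of an A100 instance type. Note the limitation: this reports where a type is offered in the catalogue, not whether capacity is actually free at that moment.

```python
from typing import Callable, Iterable, List


def regions_offering(
    instance_type: str,
    regions: Iterable[str],
    client_factory: Callable[[str], object],
) -> List[str]:
    """Return the regions whose EC2 catalogue lists `instance_type`.

    `client_factory` builds an EC2 client for a region; it is injected so the
    function can be exercised without touching AWS.
    """
    found = []
    for region in regions:
        ec2 = client_factory(region)
        resp = ec2.describe_instance_type_offerings(
            LocationType="region",
            Filters=[{"Name": "instance-type", "Values": [instance_type]}],
        )
        # An empty list means the type is not offered in this region at all.
        if resp["InstanceTypeOfferings"]:
            found.append(region)
    return found


def boto3_factory(region: str):
    """Real EC2 client; boto3 is imported lazily so it is only needed here."""
    import boto3

    return boto3.client("ec2", region_name=region)
```

Usage would look something like `regions_offering("p4d.24xlarge", ["us-east-1", "us-west-2"], boto3_factory)`. A region appearing in the result still says nothing about quota or on-demand availability, which was the actual problem described above.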

ZeroCool2u • 11 months ago

I'm forced to operate in AWS GovCloud for my work and it's the same thing, but even worse. Old instance types and there's barely any of them in there. Mind you, there's two GovCloud regions and the East one is even worse! There's basically nothing outside general compute instance types, so you're really stuck with just the West region. P4's are officially available in only 1 AZ, come in exactly 1 size, and are one of the most expensive instances in the region. P3's (initially released in 2018!) are so hard to come by it's infuriating. Meanwhile we have a horde of AWS reps and they all claim there's availability.

Really feels like if you need accelerated compute GCP is the better option these days. At least there you can rewrite in Jax if it comes down to it and opt for TPU's.

authorfly • 11 months ago

Same. But in time, I realise, you don't see 90% of the things that block companies. Startups lose their main marketing channel, their plan for scaling is thwarted by quotas or random API limits that were unexpected, core team members leave.

The trick is your strategy and what's in place to survive and turn into those waves when they come in the end. Seems like you did this as you had more success elsewhere.

wil421 • 11 months ago

If your business is dependent on big tech credits and quotas, you’re going to have a bad time. Maybe even lose your business entirely due to a EULA change.

authorfly • 11 months ago

The problem is, what other option is there sometimes?

Quotas in this case refers to needing to request access to GPUs. Those reviews take minutes or days depending on your relationship, and are often final.

It's possible that none of the cloud providers offering GPUs will ever allow you quotas high enough to scale to profitable margins, and there's nothing you can do about it except try to host in-house (which would be mad). But the GPUs being on cloud makes that very uneconomical and they are in high demand.

This was all as true in 2019 or 2022 (more so with TPUs in 2020 I should say) as it is now. It's not a ChatGPT thing.

2013 is approximately when university and national computing units became relatively useless compared to GPU cloud compute. This stopped a wave of would-be university spin offs from having access to sufficient compute to compete.

moralestapia • 11 months ago

>elsewhere

Where?

bankcust08385 • 11 months ago

The crack dealers want startups to get hooked so they can't or won't consider scaling to real, cheaper metal when/if their usage stabilizes.

Disclaimer: Former AWS consultant

snihalani • 11 months ago

private equity + public equity = unicorns. Thanks openai for the play.

next idea I'd love to see: professors getting grants/cloud credits to teach classes on the gcloud

fragmede • 11 months ago

> If you're a faculty member at an eligible institution, you can apply for Google Cloud education credits to teach using Google Cloud. The credits grant you a spending allowance as a Cloud Billing account credit, and can be used for all Google Cloud services

https://cloud.google.com/billing/docs/how-to/edu-grants#:~:t....

bachmeier • 11 months ago

When I looked into this before, it had the problem that the student/faculty member had to accept unlimited liability if something goes wrong - and when you're learning, things do go wrong. If a student gets a bill for $50,000, what do you do?

bn-l • 11 months ago

Also no way to set a hard limit in gcp

scandox • 11 months ago

It seems your dystopian nightmares are rather passé

sashank_1509 • 11 months ago

During grad school, quite a few professors had access to TPU’s from Google Cloud Research Program. I imagine with LLM scale now, it would be much harder to get access, but still possible if you’re from a big name institution.

MattGaiser • 11 months ago

Isn't this pretty standard? As a student I got plenty of cloud credit.

chazeon • 11 months ago

This is already happening.

aussieguy1234 • 11 months ago

how common is it for venture funds to buy bulk cloud services for use in startups they own/invest in?

Eridrus • 11 months ago

I have no idea why people are saying this is common. There are plenty of cloud credits programs for startups from GCP, etc, but they are not funded by VCs, they are sales & marketing programs funded by the cloud vendors.

A small minority of VCs (AI Grant, a16z, and now YC) have been using their funds to help startups get access to GPUs specifically for the last year and a bit, but there's no need to do a similar thing for general cloud services where there is no shortage.

ignoramous • 11 months ago

> sales & marketing programs funded by the cloud vendors...

The more serious ones have their CEOs do the selling: https://www.youtube.com/watch?v=6nKfFHuouzA / https://ghostarchive.org/varchive/6nKfFHuouzA

hodgesrm • 11 months ago

> how common is it for venture funds to buy bulk cloud services for use in startups they own/invest in?

Outside of this article, I've never heard of it. In fact it seems kind of illogical because VCs don't necessarily know which infra tech to invest in. (Not their job.)

What VCs will do is connect you with public cloud vendor programs for startups or get you access to favorable discounts that are not generally available for small companies. My company benefited from both of these.

Edit: clarity

wmf • 11 months ago

One upside of a monopoly is that it's obvious which tech to use.

warkdarrior • 11 months ago

Google is not a monopoly in Cloud, nor in AI, nor in cloud AI.

aleph_minus_one • 11 months ago

> One upside of a monopoly is that it's obvious which tech to use.

Not in the tech of the monopoly, since disrupting the monopoly enables huge cost savings (example: in its formative years, Google used off-the-shelf computers held together by Velcro tape instead of expensive servers from the big vendors).

richardw • 11 months ago

Here’s another:

A16z building a stash of GPU’s:

https://www.theinformation.com/articles/andreessen-horowitz-...

jedberg • 11 months ago

It's not common, but not unheard of. Nat Friedman and Daniel Gross own a cluster of GPUs that they rent back to their startups for below market rates. They essentially trade GPU time for equity.

Index Ventures has a deal with Oracle to provide GPUs at no cost to their startups (they pay the bill).

morgante • 11 months ago

> Nat Friedman and Daniel Gross own a cluster of GPUs that they rent back to their startups for below market rates.

Not just their startups btw.

leohonexus • 11 months ago

Link to the cluster's specs: https://andromeda.ai/

alephnerd • 11 months ago

Very common. All the major CSPs have dedicated Startup and VC GTM teams for this reason.

For example, Thoma Bravo's VC fund would give a 10-20% discount on a certain major CSP's compute because of the parent fund's significant stake in that company.

stingraycharles • 11 months ago

Extremely common, to the point that all major cloud vendors have special VC programs.

adelpozo • 11 months ago

A16z seems to have bought 20K GPUs to be used by the startups they fund. https://www.theinformation.com/articles/andreessen-horowitz-... (July 9)

Sorry for the direct link, couldn’t get archive.is to work with this one.

Astroboy007 • 11 months ago

i think A16z has a similar model but I believe they own the servers themselves

foolfoolz • 11 months ago

very common. we had many thousands of dollars in cloud provider credits through our investors

Narkov • 11 months ago

Not sure many VCs are actually spending money to get those credits versus being given them for free as a promotional tool.

p1esk • 11 months ago

Interesting, I’d expect Google to offer TPUs, not GPUs.

ipsum2 • 11 months ago

> Google Cloud is giving Y Combinator startups access to a dedicated, subsidized cluster of Nvidia graphics processing units and Google tensor processing units to build AI models.

It's both.

moneywoes • 11 months ago

what’s the difference in practice

michaelt • 11 months ago

At cloud prices, TPUs are cheaper per FLOP but have much worse library support, leading to much higher upfront engineering costs - and you're locked into Google's cloud.

On the other hand, essentially every ML project works out-the-box with nvidia GPUs. There's still vendor lock-in to nvidia, but it's more palatable.

If you spend $100k of an ML engineer's time to get FooNet to work on TPU, then the cutting edge advances or you pivot and instead you need BarNet support - you might wish you'd spent that $100k just buying a stack of nvidia GPUs.
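To put a rough number on that trade-off, here is a small break-even sketch in Python. The $100k porting cost comes from the example above; both hourly rates are invented for illustration, and the calculation assumes one TPU-hour does the same work as one GPU-hour, which will not hold in general. `break_even_hours` is a hypothetical helper.

```python
def break_even_hours(porting_cost: float, gpu_rate: float, tpu_rate: float) -> float:
    """Accelerator-hours needed before the cheaper TPU rate repays the porting cost.

    Assumes equal throughput per hour on both accelerators (a simplification).
    """
    saving_per_hour = gpu_rate - tpu_rate
    if saving_per_hour <= 0:
        raise ValueError("TPU rate must be lower than GPU rate for a break-even to exist")
    return porting_cost / saving_per_hour


# $100k of engineering time; the two hourly rates below are made-up placeholders.
hours = break_even_hours(porting_cost=100_000, gpu_rate=4.00, tpu_rate=3.00)
print(f"break-even after {hours:,.0f} accelerator-hours")
```

If the model or the plan changes before that many hours have been consumed, the porting effort never pays for itself, which is the risk described above.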

martinald • 11 months ago

But also the cost per FLOP will/has/should come down aggressively over time for nvidia, whereas I doubt Google will do the same for TPUs (as they have lock in).

Also the hyperscalers as per usual are far more expensive than others - this is an incomplete list https://getdeploying.com/reference/cloud-gpu/nvidia-h100 - GCP seems to be around the $100/hour for the 8xH100 config (similar to AWS).
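For scale, the quoted node price can be normalised per GPU and per month. A quick sketch in Python, taking the roughly $100/hour 8xH100 figure above at face value; the 30-day month and the round-the-clock usage are assumptions, and the helper names are hypothetical.

```python
NODE_PRICE_PER_HOUR = 100.0       # approximate 8xH100 price quoted above
GPUS_PER_NODE = 8
HOURS_PER_30_DAY_MONTH = 24 * 30  # assumes the node runs around the clock


def per_gpu_hour(node_price: float, gpus: int) -> float:
    """Hourly price of a single GPU within the node."""
    return node_price / gpus


def monthly_cost(node_price: float, hours: int = HOURS_PER_30_DAY_MONTH) -> float:
    """Cost of keeping the whole node running for `hours` hours."""
    return node_price * hours


print(per_gpu_hour(NODE_PRICE_PER_HOUR, GPUS_PER_NODE))  # 12.5 dollars per GPU-hour
print(monthly_cost(NODE_PRICE_PER_HOUR))                 # 72000.0 dollars per 30-day month
```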

latchkey • 11 months ago

This is fun validation for what I've been working on for the last year!

My startup (Hot Aisle) is all about building, managing and deploying dedicated compute clusters for businesses. At the enterprise level of compute, there is a lot that goes into making this happen, so we are effectively the capex / opex for businesses that don't want to do this themselves, but want to have a lot more control over the compute they are running on.

The twist is that while we can deploy any compute that our customers want, we are starting with AMD instead of Nvidia. The goal is to work towards offering alternatives to a single provider of compute for all of AI.

You can't do this for others unless you also do it for yourself. As such, we're building our own first cluster of 16x Dell chassis with 128 MI300x GPUs deployed into a Tier 5 data center as our initial rollout. Full technical details are on our website. It has been a long road to get here and we hope to be online and available for rental at the end of this month.

One of my goals has also been to get Dell / AMD / Advizex (our var) to offer compute credits on our cluster. Those credits would then get turned around into future purchases to grow into more clusters. It becomes a developer flywheel... the more developers on the hardware, the more hardware needed, the more we buy. This is something unfamiliar to their existing models, so wish me luck in convincing them. Hopefully this announcement helps my story. =)

Edit: Getting downvoted. Would love to hear some dialog for why. I don't really consider this an advertisement, so apologies if you're clicking that button for that reason. I'm really just excited about learning about validation of my business model and explaining why.

reaperman • 11 months ago

Even if it did feel like an advertisement, it's on topic for this discussion and adds something. That said, maybe it's a bit hard to read? Something about the flow throws me off - that could contribute to downvotes. I don't know.

latchkey • 11 months ago

That's how you know it wasn't written by AI. =)

fragmede • 11 months ago

nah, for some reason voters here have been getting very anti-advertisement lately, as if this site wasn't somehow a giant ad in and of itself.

npinsker • 11 months ago

I did not downvote (I upvoted), but I think I can see why others did.

Mentioning their startup’s name, right at the top, isn’t a great start for me. OP didn’t need to do that, it means nothing to me, and it doesn’t add anything other than advertising. A few other phrasing choices like “full technical details on our website” evoke a recruiter spiel a little bit, because of wrong time and place — this is a comments section, and to me the writeup is a bit too detailed and overly confident in how interested I am. If I care, I’ll ask, or I’ll go to your profile.

Sounds like OP was just excited; unfortunately, the practice of using HN to “organically” advertise startups (particularly through blog posts) is quite common nowadays, and I can’t help being sensitive to it as I feel it doesn’t help discourse. This post was relevant and interesting though; thanks for flagging it as such.

latchkey • 11 months ago

I've been chastised before for not being explicit about my involvement in my own company.

anxman • 11 months ago

TIL there’s a downvote on here

latchkey • 11 months ago

From the FAQ (linked in the footer): https://news.ycombinator.com/newsfaq.html

Why don't I see down arrows? There are no down arrows on stories. They appear on comments after users reach a certain karma threshold, but never on direct replies.

Also explained here:

https://github.com/minimaxir/hacker-news-undocumented/blob/m...

I just voted you up to get one more point closer!

jonathanyc • 11 months ago

You asked for an explanation of why you're being downvoted. Here is another comment that is effectively an advertisement, but one that adds value: https://news.ycombinator.com/item?id=41113750

More than anything, your comment just feels like you just copy-pasted it from the blurb you send to investors? For example you say:

> ... while we can deploy any compute that our customers want, we are starting with AMD instead of Nvidia...

> One of my goals has also been to get Dell ... to offer compute credits on our cluster. ... It becomes a developer flywheel...

I mean, OK? The second part in particular ("It becomes a developer flywheel") seems totally irrelevant to anyone except a potential investor. Why would I as a customer care about your product being a flywheel??

I can hardly speak for everyone on this website, but I know I'm here first and foremost because I love to learn (c.f. "If you had to reduce it to a sentence, the answer might be: anything that gratifies one's intellectual curiosity."). Even if I were working on a startup similar to yours, your comment doesn't really teach me anything ("we're building our own first cluster of 16x Dell chassis..." OK?)

Also, quite frankly, I just went to your website and it reads like it was generated by ChatGPT.

> In essence, Hot Aisle is not merely a cloud compute provider but a dedicated partner that accelerates the journey of businesses towards HPC advancements, ensuring they navigate the digital transformation landscape with assured resource scalability, enhanced security, and unwavering support.

latchkey • 11 months ago

Great feedback! I definitely didn't copy/paste it, I wrote that whole comment for this post.

The developer flywheel was in response to the mentions of credits in the Google announcement. It isn't my product that is the flywheel, it is the imho smart concept of attracting developers to a solution by giving them credits. "free drugs", if you will. This is, in my eyes, a big reason why Nvidia is so popular today and what made the cloud in general, so successful.

I'm trying to bring that concept to the AMD ecosystem. Previously, you could only get access to AMD MI (enterprise) class compute, if you had access to super computers like El Capitan and Frontier. I'd like to bring these things to the masses and a big part of that is lowering the barriers as far as possible.

drivebycomment • 11 months ago

Honestly, even with this reply, it's not at all clear what you think this YC-Google deal is validating, and how it's directly relevant to what your startup is doing.

It seems a stretch to call the announcement any kind of validation for your startup, whatever your exact logic is here.

Your reply above makes no sense to me. NVidia isn't popular because of any "free drugs". Cloud did not become popular simply by giving free credit.

It's not at all clear what value your startup is bringing to the world. El Capitan is 40MW of compute. You can get that much compute from top cloud providers (with money of course) - their combined compute is in the tens of GW range, an estimate based on their renewable power portfolio. The barrier nowadays is roughly only money, and unless you have magic to lower the price of power and machines, you are not lowering any barrier vs top providers. If you have the magic sauce, it's not clear what that is, at least in the post.

latchkey • 11 months ago

I like this response a lot. I'm not sure what your experience is in the field, so it is a little hard for me to figure out how to reply to you. I'll do my best, but apologies if I get it wrong.

The validation is clear to me, but this is my field to recognize that. My business is about building super computers and either renting them piecemeal or whole to people and businesses. This is exactly what Google is doing in this announcement, which I take as validation because I started working on this similar thing, about a year ago now.

Nvidia built software and hardware (s/h) before AI. Nvidia ensured that all of their s/h solutions were easily available to developers. AI recognized that the s/h was useful and took advantage of that. An example of "free drugs" was making large gifts of the s/h to colleges [0].

I'm not saying that cloud only got popular with free credit, but it definitely was a contributing factor (just an example: [1]) in the building of many startups.

The value of my startup is something I describe above in my original comment. You're dead wrong that the only barrier is money. It is the experience and relationships that we have in building, deploying and running large scale compute. Think of us as a consultancy for super computers. We also have the backing to fund the capex so that businesses don't have to put out millions up front, on rather finicky cutting edge hardware.

Not everyone needs 40MW of El Cap, all the time. Not everyone wants to deploy into a cloud, many want to have more control over where their compute and data is located. We work with Dell, AMD and data centers directly to build and deploy these systems. I won't talk about pricing other than to say that both companies are highly incentivized to work together to deploy as much compute as they can, and I'm the one that has joined with them to make it happen. I'd say that there are about 25-30 people involved with us, just to deploy our single first cluster. It is a massive amount of coordination.

It takes years of relationship building to even get your foot into the door on this. It is far more complicated than just racking boxes and we already have put the time and effort in to create the blueprint designs for best in class compute. We help companies that want this compute deployed yesterday, to speed up the whole process.

I'm sorry if that is not valuable to you personally, but it is to others.

[0] https://developer.nvidia.com/higher-education-and-research

[1] https://news.ycombinator.com/item?id=39117292

Catenu • 11 months ago

I believe these deals are crucial for attracting people. In my experience, they play a significant role in helping early-stage startups get off the ground.

bushbaba • 11 months ago

Does this mean Google doesn't have enough external customer demand to fill its GPU and TPU resources?

mirashii • 11 months ago

On the contrary, it’s very difficult to get GPUs in GCP and has been for a while.

outside1234 • 11 months ago

There is no way Google would do this if that was the case.

This is a classic cloud vendor move to get someone, anyone, using the fixed asset.

mirashii • 11 months ago

> Google Cloud is giving Y Combinator startups access to a dedicated, subsidized cluster of Nvidia graphics

They're being paid, so of course they would do it. It's not much different than the reservations that GCP allows you to purchase today.

holoduke • 11 months ago

Really? Any source on this? I just started one. Yes i know just one.

mirashii • 11 months ago

A quick search on Google, or r/googlecloud, or one of any number of communities will give some sense.

Alternatively, from the article: > For early-stage AI startups, Hu says one of the most common issues she hears is that startups are compute-restrained. Large enterprises are able to strike multi-year, massive deals with cloud providers for GPU access, but small startups are often left out to dry.

ai4ever • 11 months ago

i had the same reaction.

the clouds are unable to sell gpus/tpus for premium rates, and now are forced to give them as incentives and credits to customers

CheekyBlunders • 11 months ago

Any interest in SOCOM (DoD)?

LarsDu88 • 11 months ago

This is a great deal for Google and YC startups.

NKosmatos • 11 months ago

I know I’m going to be heavily downvoted because not all people have the same sense of humor, but I’m going to make the comment… Does this mean we’ll have a new server to host HN and also get native dark mode?

P.S. Yes I know that the server requirements are very low (explained by dang and others) and I also know there are many plugins and hacks to get dark mode ;-)

adin8mon • 11 months ago

[flagged]

kyrra • 11 months ago

Googler, opinions are my own. I don't work on cloud.

While AWS is more popular and has more public documentation and blogs, one thing I've heard people love about GCP is its more consistent UI and CLI. Features also tend to interop better.

But when you have those as some of your driving goals, it can slow down the rollout of new features. It's also very possible to use GCP without lock-in, if you stay away from specialized services like Spanner.

yas_hmaheshwari • 11 months ago

> More consistent UI and CLI. Features also tend to interop better

I agree with this part of the statement. Having used both AWS and GCP, GCP seems a bit easier to navigate

> if you stay away from specialized services like spanner

Don't agree with this one. Once you are in one cloud, you are locked-in. Using services like BigQuery, Google pub-sub would eventually lock you in

scarface_74 • 11 months ago

Former AWS ProServe employee - I did work in cloud and had some part in quite a few migrations. I did my bid at BigTech and I have no love lost for Amazon.

You cannot use any service at scale and avoid “lock in”. “Infrastructure has weight”. I’ve seen it take over a year to migrate an organization’s VMware-hosted VMs to AWS.

No “just use Kubernetes” is not the answer. Every hosted Kubernetes provider has some type of custom metadata you need to add to your configuration. It can be leaky.

No “just use Terraform” is not the answer. Providers are unique for each cloud platform and you still have to rewrite everything.

Not to mention at scale, you have to deal with the PMO, compliance, and training; if you are large enough you might have a physical connection from your colo to the provider, etc.

influx • 11 months ago

A Googler who benefits from GOOG RSU going brrr says Google is better than a competitor. OK? Are you authorized to speak on Google's behalf publicly?

kyrra • 11 months ago

Our market share in cloud speaks volumes. We had a huge start with AppEngine. But for some reason we failed to launch. I'd guess it was a lack of support and enterprise hand holding. The cloud team has been trying to play catch-up there, and it's just taking a lot of work.

kyruzic • 11 months ago

He even started his comment like it had been approved by a lawyer first lol

kyrra • 11 months ago

Ha. Nope. I post a bunch of nonsense on the Internet, but if I'm writing about anything Google, I will always start with that message. I am definitely skewed by being paid to work there. Also, I've never run anything I've written about Google online by anyone internally.

scarface_74 • 11 months ago

Yes because out of all the things that a startup should be worried about, thinking that one day they might want to host their own cluster of GPUs should be top of mind…

ruined • 11 months ago

cloud-to-butt extension really spices up this concept

outside1234 • 11 months ago

So Google can't sell the GPUs to actually paying customers is what this tells me.

In contrast with Microsoft where they are GPU limited (don't have enough to sell).