James Garcia at Runpod - Episode 164 - The Route to Networking


By James Garcia

From Speedtest to Superclusters: James Garcia on Building AI-Ready Networks

In this episode of The Route to Networking, Ben Davies sits down with James Garcia, Director of Data Centre Management at RunPod, one of the fastest-growing names in AI infrastructure. James's career story blends curiosity, unexpected turns and huge engineering challenges, making this a fascinating look into how modern data centres and AI workloads are being built at scale.

Growing Up With Networking in the Garage

James’s introduction to networking came early. His dad was a network sales engineer throughout the 1990s, working at companies like Cabletron and Enterasys. That meant their garage was less about bikes and footballs and more about racks, switches and routers.

“We always had a networking rack in the garage. I grew up playing around with switches and routers. My dad would show me IP tables, TCP/IP, and UDP. I never thought I would be a network engineer myself.”

Even with all that early exposure, he did not set out to join the same industry. He studied Information Systems at university and leaned towards database management. A career in tech felt likely, but networking was not on the plan. Then, as often happens, one opportunity led straight into the next.


Falling into Networking and Scaling Infrastructure at Speedtest

His first role after university came at Riverbed, where he worked on SD-WAN technology and helped teach customers how to configure and troubleshoot it. This experience led directly to his years at Ookla, the company behind Speedtest, where his career accelerated rapidly.

With managers moving on and gaps opening up, James found himself stepping into bigger responsibilities far sooner than expected.

One of his standout achievements was helping scale Speedtest's infrastructure from around 3,000 servers to between 16,000 and 17,000, spanning more than 30 new countries.

“You see the map when you start, and then you see the map after, and it covers the globe.”

This expansion also supported rising global demands such as 5G testing, 4K video streaming and huge increases in bandwidth usage. His team played a major role in improving latency, increasing resilience and pushing a dramatic shift towards IPv6.

“We went from 3% IPv6 traffic to 60% in about eight months.”

And of course, there were endless hours of deep technical work.

“Hundreds of hours looking at TCP dumps. You can look at two identical captures, and one is good and one is bad, and you do not know which is which.”


Moving into AI Infrastructure at RunPod

When RunPod approached him, the decision was surprisingly easy. James had worked for years with Brennan Smith, now Head of Engineering at RunPod, and trusted him deeply.

“He asked if I wanted to come work with him at RunPod. He is one of the smartest people I know. On top of that, more pay, a title shift upward, and I could stay in Spain. It was an easy answer.”

RunPod operates distributed data centres designed for AI workloads, which felt familiar to James but demanded a much deeper dive into hardware and large-scale systems.

The founding team includes a CEO with a PhD in quantum computing and a CTO whom James calls one of the smartest people he has ever worked with. The culture is fast paced, deeply technical and extremely collaborative.

Life at RunPod is dominated by one word: scale.

RunPod is not just adding GPUs and expanding its footprint. It is rebuilding networking architectures, solving multi-million-pound infrastructure challenges and keeping prices competitive while customer demand grows at extraordinary speed.

One of RunPod’s biggest breakthroughs is its Instant Clusters platform, which allows customers to spin up 32 to 64 GPU nodes in under two minutes.

“We are getting performance within 3-4% of bare metal. It is crazy.”

They also operate a serverless inference platform where 150,000 to 180,000 pods are created every single day. Some user models are several terabytes in size, yet customers still expect sub-second responsiveness.

“We do not have time to download one to four terabytes before inference starts. It has to be instant.”

At RunPod, solving a problem once often means solving it again on a much larger scale. James seems energised by that constant escalation.


What Today’s Data Centre Engineers Need

When asked what skills new engineers should focus on, James explains that the industry is too broad for a single path.

“It depends on where in the stack you want to live. But knowing any part gives you a foot into every other part.”

Hardware-focused engineers have an advantage in understanding system behaviour. Cloud engineers excel in orchestration. Younger engineers often come in with strong computer science and ML backgrounds.

He also believes AI fluency will become standard for the next generation.

“Kids growing up with AI will have a natural fluency with it. That gives them an edge.”


The Future of AI Infrastructure

James is particularly excited about agentic workflows. These are AI-driven systems that can monitor logs, detect anomalies and even take automated actions.

“I still think it is cron jobs with LLMs, but even if that is true, it is amazing.”

He also anticipates a major shift towards GPU compute at the edge, bringing AI inferencing closer to end users.

Right now inference time dominates total latency, so location barely matters. James believes that will change as models become faster and expectations tighten.

“People do not care where inferencing happens at the moment, but that will change.”


Leadership with Trust, Clarity and Simplicity

Despite leading high-pressure, high-complexity teams, James’s leadership approach is remarkably grounded.

He focuses on:

  • bringing solutions rather than just problems
  • removing blockers
  • being honest and direct
  • trusting people to own their work
  • keeping things simple

“If I assign something, I forget about it because I know it will get done.”

He values effort and growth above perfection.

“If you show effort and growth, in six months you are a completely different person.”


Advice for Young Engineers

James's advice is straightforward but powerful:

“Shadow people who are doing good work. Follow the founding engineer. Watch how they solve problems.”

He encourages newer engineers to ask questions, dive into difficult problems and use AI tools to accelerate learning.

“Now you have an expert to talk to all the time.”


Quick Fire Round

James wraps up with our fun, quick-fire round, where Ben asks him things like:

  • A book or podcast he recommends
  • A piece of tech he can’t live without
  • One thing he wishes he’d learned sooner
  • The biggest misconception about AI infrastructure
  • His dream location for a data centre

Let’s just say… his answers range from philosophical sci-fi to unexpectedly practical engineering takes.


Listen to the Full Episode

This episode is packed with real-world insight into the challenges and opportunities shaping modern AI infrastructure. James shares honest reflections from his journey through networking, large-scale system design and the rapid expansion of AI-driven data centres.

Whether you are an engineer, a leader or someone curious about how high-performance compute actually works behind the scenes, this conversation offers clear, practical takeaways and a grounded look at where the industry is heading.

For anyone interested in networking, distributed systems or the future of AI infrastructure, James's perspective is a must-listen.

If you want to discover more about James, connect with him on LinkedIn here.