Most conversations about AI start in the wrong place. They start with the model, the vendor, or the use case, and skip straight past “Does any of it work?” and “Is your data good enough?”
Jim Markunas has spent 20 years running complex enterprise programs at companies like DIRECTV, BCG, and Publicis Sapient, and more recently leading a citywide smart infrastructure overhaul at CPS Energy that touched 200,000 streetlights, SAP core systems, and real-time field operations. He’s seen what happens when organizations rush to deploy AI on top of foundations that can’t support it. We caught up with Jim at LLM Day in Austin, where he was a keynote speaker, to discuss data architecture, the agentic AI hype cycle, and why the gap between demo and production is wider than most technology leaders want to admit.
Q1. You’ve worked across utilities, fintech, streaming, eCommerce, a lot of very different environments. Is there a data mistake that shows up everywhere, regardless of industry?
Several. The most universal one: data is always treated as a tail-end problem. Teams want to build first and clean later. That’s how you get to launch day and discover your product catalog is a disaster, your customer records are incomplete, and your AI has been trained on garbage.
In my Boehringer Ingelheim project, engineering was building a global Adobe Commerce ecosystem off a small sample of test data. The production environment had product, customer, and order data at a completely different scale and complexity. We made the call early to engineer the system off the full production data set. That single decision saved the launch.
The second mistake I see everywhere: confusing data volume with data readiness. Companies invest heavily in data lake infrastructure and walk into AI projects feeling prepared. They’re not. A full lake with no taxonomy, no ownership, and no access standards is not an asset. It’s a swamp. AI doesn’t fix that. It just moves faster through it and gets lost faster.
The third one is what I’d call the ‘junk migration problem.’ In enterprise system re-platforms (and I’ve led a lot of them, across BigCommerce, Shopify, Adobe Commerce), teams assume that moving to a better platform cleans the data. It doesn’t. Bad product attributes, missing pricing logic, duplicate customer records: all of it travels. The new platform just surfaces the mess at higher speed.
The fourth mistake is the ‘siloed ownership problem.’ No single system of record, no agreed data owner, multiple teams writing to the same fields with different logic. I see this constantly in multi-vendor program environments. Feed an AI into that kind of fragmented data ownership and it doesn’t fail loudly. It just produces confident answers that are subtly wrong, because it’s averaging across sources that don’t agree with each other. Nobody notices until a customer gets burned.
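To make the siloed ownership problem concrete, here is a minimal Python sketch of one way to surface fields where systems disagree before an AI ever reads them. It is an illustration only, not something described in the interview: the `find_conflicts` helper, the record layout, and the source names (`crm`, `billing`) are all hypothetical.

```python
from collections import defaultdict


def find_conflicts(records):
    """Return {(entity_id, field): {source: value}} for fields where sources disagree.

    `records` is an iterable of (source, entity_id, field, value) tuples.
    This layout is a hypothetical one chosen for the example.
    """
    seen = defaultdict(dict)
    for source, entity_id, field, value in records:
        seen[(entity_id, field)][source] = value
    # Keep only the fields that carry more than one distinct value.
    return {
        key: by_source
        for key, by_source in seen.items()
        if len(set(by_source.values())) > 1
    }


# Hypothetical sample: two systems writing the same customer field.
records = [
    ("crm", "cust-1", "email", "a@example.com"),
    ("billing", "cust-1", "email", "a@example.org"),
    ("crm", "cust-2", "email", "b@example.com"),
    ("billing", "cust-2", "email", "b@example.com"),
]

conflicts = find_conflicts(records)
for (entity_id, field), by_source in conflicts.items():
    print(entity_id, field, by_source)
```

A report like this does not decide which source is right. It only makes the disagreement visible so that a named data owner can resolve it.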
Q2. At CPS Energy you had to connect SAP, field tools, resident-facing apps, and real-time operational data into something that actually worked. How do you even begin untangling that kind of legacy infrastructure?
APIs are the magic bullet in any ecosystem. This is how systems talk to each other. Before AI, you had to sit down in a comfortable chair with a cup of coffee and physically review API documentation. You had to make educated guesses and then test your hypotheses in Postman. Nowadays, I just use Claude.
For CPS specifically, it was a two-phase project. I came on at phase two. They had already spent six months replacing all the streetlights throughout the city with enterprise versions of the Philips Hue (smart LED bulbs that broadcast via Bluetooth/chipset). They also implemented a telematics infrastructure to tie together the chipset and the iPads that Dalkia (the company performing the streetlight repairs) was using out in the field.
My job was to own the customer-facing design: a web page that took a live feed of each streetlight’s location and on/off status, where customers could report outages using a self-service, automated system.
In parallel, we had to string all of the back office systems (like SAP) together with APIs. It was a very manual process, and I spent a lot of late hours on video chat with my senior solution architect to diagram out all of the systems. It wasn’t easy, and as a quasi-government organization, CPS Energy wasn’t warm to the idea of using ChatGPT to analyze the API documentation. Here’s the system we built at a high level:

Q3. Everyone is talking about agentic AI right now. From where you sit, what’s the thing that breaks first when a company tries to deploy agents before their data house is in order?
Rule #1 is don’t be like Quicken Simplifi! When Intuit closed Mint, a universally loved product, users were forced to go with either Monarch Money or Quicken Simplifi. The only thing elegant about Simplifi’s product was its live/human customer support… which needed to be contacted quite often, because the Simplifi product is a complete mess. A few months ago, they attempted to replace their human customer support with an agentic chat, and it was like when Beyoncé made a country album; bad beyond words!
A critical step should always be testing your agent before deploying it.
The issue with agentic AI is that context is hard… almost to the point of being an uncanny valley. Before we discuss how to fix this at the enterprise level, let’s take building a Claude project at the consumer level as an example. To get any usable context out of Claude, you need to have a large data pool for it to pull from, and then you need to actively train it through interviews (think of training a pet, same concept).
At the enterprise level, this gets even more complex. Most casual users of AI don’t get the distinction between a co-pilot (a chat that you drive) and an agent (an autonomous AI that drives and reports back to the user). I’ll spare you the boring details, but the short answer is:
1. Have an end goal in mind: what do you want to accomplish with your AI (increase order throughput, remove buy-funnel friction, alert IT staff to critical outages and zero-day risks, etc.)?
2. Develop a detailed, step-by-step operational process, and perfect it. How are you currently solving for the goal defined in #1?
3. Determine what can be automated and what needs human intervention.
4. Develop AI guardrails. Will it keep you in a frustrating loop like Simplifi’s agentic chat, or did you build in a rule that says, “After the user starts vehemently cursing at you, direct the user to a live customer support agent”?
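The guardrail rule in the last step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Simplifi, Hostinger, or any other vendor implements escalation: the `should_escalate` function, the `FRUSTRATION_WORDS` list, and the `MAX_FAILED_TURNS` threshold are all hypothetical.

```python
import re

# Hypothetical values chosen for the example; a real deployment would tune these.
FRUSTRATION_WORDS = {"damn", "hell", "useless"}
MAX_FAILED_TURNS = 3


def should_escalate(user_message, failed_turns):
    """Decide whether to hand the conversation to a live support agent.

    Escalates when the user asks for a human, sounds frustrated, or the
    agent has already failed to resolve the issue MAX_FAILED_TURNS times.
    """
    words = set(re.findall(r"[a-z']+", user_message.lower()))
    asked_for_human = "human" in words
    sounds_frustrated = bool(words & FRUSTRATION_WORDS)
    stuck_in_loop = failed_turns >= MAX_FAILED_TURNS
    return asked_for_human or sounds_frustrated or stuck_in_loop


print(should_escalate("How do I reset my password?", failed_turns=1))
print(should_escalate("This is useless!", failed_turns=1))
```

A keyword list is a crude frustration signal. The point of the sketch is that the exit to a human is an explicit, testable rule rather than something the agent is trusted to work out on its own.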
Just so you don’t think I’m an iconoclast, an example of a company using agentic AI elegantly is Hostinger, a GoDaddy competitor. They have 24/7 live support just like GoDaddy, but their agentic layer is incredible! Their agent can advise with context and safely make changes to your web infrastructure without breaking it. It knows exactly when to bring a 24/7 customer support agent into the chat.
Q4. Regulated industries like fintech, utilities, pharma have to move carefully. But AI programs that move too carefully tend to die in pilot. How do you square that?
The programs that die in pilot aren’t dying because they moved too carefully. They’re dying because they treated compliance as a full stop instead of a design constraint.
There’s a meaningful difference between those two things. A full stop means you don’t ship until legal blesses every line. A design constraint means you build compliance into the delivery process from day one, the same way you’d build in QA gates or UAT sign-off.
I ran the CMS platform delivery for New York Life through Fusion92. 12,000 insurance agents, everything they touch subject to brand and compliance controls. The previous approach was to review everything at the end, which predictably created bottlenecks that killed velocity. What we built instead was a system where compliance requirements were translated into acceptance criteria at the story level. Approval workflows were baked into the platform, not bolted on afterward. The result was about 40% faster time-to-value and roughly 30% fewer defects, in a regulated environment that should have been a slowdown.
The other piece that kills regulated AI programs is scope. Teams try to automate everything in one shot. The right move is to identify the one workflow where the risk is lowest, the data is cleanest, and the ROI is clearest, prove it in 60 days, and use that win to build organizational trust for the next phase. Regulated organizations aren’t actually risk-averse. They’re proof-averse. Show executives a working system with an audit trail and they’ll move.
Q5. You talk about AI producing “confident but wrong” outputs. Is that a data problem, a model problem, or something else entirely?
Ha! Great question! The short answer is… If you’ve dated a narcissist and have been gaslit, you’ll notice similar behavioral patterns in AI. While not ill-intentioned (unlike the narcissist), AI loves to gaslight. Imagine calling out your husband or wife, and them responding with, “You’re absolutely right, you did ask for this code change, and I completely ignored you. My bad, I’ll do better next time,” followed by the exact same behavior. Infuriating!
I’m not sure what the root cause is, but there are a few fixes & life hacks I’ve discovered in my AI journey:
- Have one AI talk to another AI. When Claude Code torpedoes the front end of my website, or deletes all of my uncommitted GitHub pipeline changes, I run crying to Codex, and vice versa. Same goes for data: use multiple AIs to triple-check your work. My website greatestpmever.com was built using 5 AI co-pilots, Google AI Studio, and 1 autonomous agent.
- Break complex tasks into chunks.
  - “Build me a homepage” = bad
  - “As a User, I want a homepage hero based off the attached Finox design system and design, so that I can present content to website visitors” = good
- Know which AI is good at what. When I was in a band, I HATED when the bass player tried to sing, or when the drummer tried to write lyrics. People have lanes and natural gifts within their lanes… And so it goes with AI. Think of yourself as a producer: know each AI’s strong points and weak points, and make each one stay in its best lane.
- MD Files are your friend. I use them any chance I get, and they’re an excellent way to introduce guardrails to AI.
- There’s no substitute for hard work. To use Claude Code, you need basic coding knowledge; to use AI to write your LinkedIn posts, you need to know how to write; to use Agentforce to build you a warm sales pipeline, you need to know your ideal customer profile. AI is a force multiplier, not a replacement for humans.
Q6. If you had to describe what a genuinely production-ready AI system looks like, what would you say? And how many enterprise companies are actually there?
I have yet to find one. If a company invents a truly out-of-the-box, production-ready, back-office agentic system, they will become overnight billionaires. At the enterprise level, I’m impressed by Hostinger, and I’d give Salesforce’s Agentforce a B+. Kubernetes is super powerful if you have an experienced data scientist (a human) to pair with it.
I’m known to patchwork systems together as a profession. The current state of AI requires multiple systems working in concert to truly add value at the enterprise level.





