How Flexbe's CTO Scaled A Platform To 35,000 Sites And Rebuilt Its Core With AI

Maksim Kachurin has been with Flexbe since its founding, responsible for the platform’s architecture, engineering, and team from day one. Over that time, he transformed the product from a simple landing page builder into a professional website platform now serving more than 35,000 active sites across multiple markets. Along the way, he built an AI page generator at a time when the tools to do it properly didn’t yet exist, and is now developing Reforma, a platform that brings together AI-assisted code generation, a visual editor, and real code — allowing users to build and evolve applications directly in the browser. In this interview, Kachurin talks about how the platform’s reliability is architected, why standard approaches to AI generation didn’t work, and what actually made a difference when it came to speeding up development.

Reliability at 35,000 sites

Flexbe hosts more than 35,000 published client sites on a single platform, meaning that with the wrong architecture, a traffic spike on one site or a failure in one service can theoretically affect everyone else. The platform architecture designed and developed by Kachurin addresses this issue through isolation and the prevention of cascading failures.

Kachurin draws a clear line between two types of failures with different acceptable thresholds. The platform can degrade briefly — unpleasant, but tolerable. A client’s site going down is not. Based on this, the part of the system responsible for serving client sites is isolated from the rest of the platform code. “If part of the editor goes down, published sites keep running independently — a user would only notice if they tried to edit their site at the exact moment the problem occurred,” he says. Client sites are distributed across separate backends, so a failure in one backend affects only the sites running on it — the rest of the platform continues operating normally.
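The blast-radius idea can be illustrated with a minimal sketch. It assumes sites are assigned to backends by a stable hash; the article does not say how Flexbe actually places sites, and the backend names and site IDs below are hypothetical.

```python
import hashlib
from collections import defaultdict

# Hypothetical backend names; the article does not describe the real topology.
BACKENDS = ["backend-a", "backend-b", "backend-c", "backend-d"]


def backend_for(site_id: str, backends: list[str] = BACKENDS) -> str:
    """Deterministically map a site to one backend using a stable hash."""
    digest = hashlib.sha256(site_id.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def affected_sites(failed_backend: str, site_ids: list[str]) -> list[str]:
    """Return only the sites placed on the failed backend; all others keep serving."""
    return [s for s in site_ids if backend_for(s) == failed_backend]


sites = [f"site-{n}" for n in range(1000)]
placement = defaultdict(list)
for site in sites:
    placement[backend_for(site)].append(site)

down = affected_sites("backend-b", sites)
print(f"{len(down)} of {len(sites)} sites affected by losing backend-b")
```

With this kind of placement, losing one backend touches only the sites mapped to it, which is the property the paragraph above describes.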

Kachurin designed the platform’s infrastructure around leased dedicated physical servers across multiple regions, with no cloud and no VPS, because cloud environments offer no guarantee that server resources aren’t shared with other tenants. Every server is mirrored in a separate data center; if one fails, traffic shifts to the replica, and if that fails too, to the next data center. The same applies at the service level: multiple instances run in parallel, and traffic is automatically redistributed if one degrades. This also enables multiple releases per day with no downtime: a new version is brought up alongside the old one, and traffic switches over only after a successful start.
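As a rough illustration of the two mechanisms described here, the sketch below models a failover chain (primary, replica, next data center) and a release switch that moves traffic only after the new version starts successfully. The class, function, and server names are hypothetical, and health detection is reduced to a boolean flag.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    data_center: str
    healthy: bool = True


def pick_target(chain: list[Server]) -> Server:
    """Return the first healthy server in priority order."""
    for server in chain:
        if server.healthy:
            return server
    raise RuntimeError("no healthy server left in the failover chain")


def release(current: str, candidate: str, started_ok: bool) -> str:
    """Switch traffic to the new version only if it started successfully."""
    return candidate if started_ok else current


# Assumed three-step chain: primary, its mirror, then another data center.
chain = [
    Server("primary", "dc-1"),
    Server("replica", "dc-2"),
    Server("fallback", "dc-3"),
]
chain[0].healthy = False          # simulate the primary going down
target = pick_target(chain)       # traffic moves to the replica
live_version = release("v41", "v42", started_ok=True)
```

A failed start simply leaves the old version serving traffic, which is why this style of release does not cause downtime.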

Kachurin designed the system to isolate DDoS attacks at the client level. When a specific client site comes under attack, its traffic is routed to a dedicated filtering server. Other clients are unaffected.
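A minimal sketch of per-client isolation follows. The article does not explain how an attack is detected, so a simple request-count threshold stands in for that step; the threshold value and the upstream names are assumptions made for illustration.

```python
from collections import Counter

# Assumed values, chosen only for illustration.
REQUESTS_PER_WINDOW_LIMIT = 1000
NORMAL_UPSTREAM = "site-backend"
FILTERING_UPSTREAM = "ddos-filter"


class Router:
    """Routes each site independently, so one attacked site does not affect others."""

    def __init__(self, limit: int = REQUESTS_PER_WINDOW_LIMIT):
        self.limit = limit
        self.counts: Counter = Counter()
        self.under_attack: set[str] = set()

    def record(self, site_id: str, requests: int = 1) -> None:
        """Count traffic per site and flag the site once it exceeds the limit."""
        self.counts[site_id] += requests
        if self.counts[site_id] > self.limit:
            self.under_attack.add(site_id)

    def upstream_for(self, site_id: str) -> str:
        """Flagged sites go to the filtering server; everyone else is untouched."""
        return FILTERING_UPSTREAM if site_id in self.under_attack else NORMAL_UPSTREAM


router = Router()
router.record("shop-a", 5000)   # traffic spike on one client site
router.record("shop-b", 20)     # ordinary traffic on another
```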

Most pages on Flexbe carry dynamic content (A/B tests, geo-targeted landing pages, product catalogs), so full-page caching isn’t an option. Instead, Kachurin implemented partial caching of the stable parts of each page, which allows the platform to maintain performance. Latency is monitored at the service level; in critical situations, such as an entire server going down, the system automatically places a phone call to the responsible engineers.
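Partial caching can be sketched as fragment-level caching: stable fragments are rendered once and reused, while visitor-dependent fragments are rendered on every request. The fragment names, cache keys, and in-memory dictionary below are assumptions for illustration, not a description of Flexbe's cache.

```python
from typing import Callable

fragment_cache: dict[str, str] = {}
render_calls = {"count": 0}  # counts how often a stable fragment is actually rendered


def render_stable(name: str) -> str:
    """Stand-in for an expensive render of a fragment that rarely changes."""
    render_calls["count"] += 1
    return f"<section id='{name}'>static {name}</section>"


def cached_fragment(key: str, render: Callable[[], str]) -> str:
    """Render a fragment once per key, then serve it from the cache."""
    if key not in fragment_cache:
        fragment_cache[key] = render()
    return fragment_cache[key]


def render_page(site_id: str, country: str, variant: str) -> str:
    header = cached_fragment(f"{site_id}:header", lambda: render_stable("header"))
    footer = cached_fragment(f"{site_id}:footer", lambda: render_stable("footer"))
    # Dynamic part: depends on the visitor, so it is rendered on every request.
    hero = f"<section id='hero'>offer for {country}, variant {variant}</section>"
    return header + hero + footer


page_a = render_page("shop", "DE", "A")
page_b = render_page("shop", "FR", "B")
```

Both pages share the cached header and footer, while the geo-targeted, A/B-tested section differs per request.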

“None of this came together at once; it evolved as the client base grew, loads increased, and new failure modes appeared. In the early days we were fighting fires constantly. Now it barely takes any of the team’s time,” says Kachurin.

How Kachurin rebuilt Flexbe around AI — and what he learned along the way

As the AI boom reshaped user expectations, Kachurin decided to rethink how Flexbe works at its core, moving the product from a traditional site builder toward an AI-driven platform. The AI page generator was one of the first major outcomes of that shift. It was built at a time when LLMs were still weak and expensive, context windows were small, and tooling was severely limited. “We were literally inventing what is now called agentic AI,” Kachurin says.

The original idea seemed workable: take the pre-built sections that make up a site and ask the LLM to select and fill them. But the model had three consistent failure modes. It couldn’t hold the full site in context — pages came out thin. It had no concept of layout and picked sections based on content rather than visual logic, producing something closer to a newspaper than a website. And it wouldn’t stay within the data format — it changed the structure, added and removed fields, and generated content ended up in places the system simply ignored.
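The third failure mode, structural drift, is the kind of problem a strict validator makes visible. The sketch below is one common guard, not something the article attributes to Flexbe: it rejects model output that adds, drops, or mistypes fields. The schema and field names are hypothetical.

```python
import json

# Hypothetical section schema: field name -> expected Python type.
HERO_SCHEMA = {"title": str, "subtitle": str, "button_text": str}


def validate_section(raw: str, schema: dict[str, type]) -> dict:
    """Parse model output and reject any structural drift from the schema."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("section must be a JSON object")
    missing = schema.keys() - data.keys()
    extra = data.keys() - schema.keys()
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    for field, expected in schema.items():
        if not isinstance(data[field], expected):
            raise ValueError(f"{field} must be {expected.__name__}")
    return data


good = '{"title": "Fresh bread", "subtitle": "Baked daily", "button_text": "Order"}'
section = validate_section(good, HERO_SCHEMA)
```

Without a check like this, a renamed or invented field is silently dropped downstream, which matches the symptom described above: generated content ends up where the system ignores it.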

Kachurin evaluated the standard approaches and ruled them out one by one. Templates filled by an LLM meant every user got the same skeleton with slightly different text. Generating sections in parallel meant each block duplicated the one next to it, because the model had no awareness of the rest of the page. Many major site builders have adopted the template-based approach; in Kachurin’s assessment, it tends to produce repetitive, low-quality results that fall short of what users actually need.

To implement a generation process that would meet quality standards, Kachurin decomposed it into an agent pipeline combining sequential and parallel stages. The system first generates a global page layout — defining structure, section categories, and metadata — independent of the platform’s internal representation.

Visual style is then applied through a curated system of tagged components, color palettes, and typography, with constraints that prevent invalid or low-quality combinations.

Only after structure and style are fixed is content generated for individual sections in parallel, with images selected based on context from a large dataset.

By separating structure, style, and content, the system avoids the typical pitfalls of LLM-based builders — inconsistent layouts, duplicated sections, and corrupted data — while still leveraging the model where it performs best. “We broke the generation process into stages handled by different agents — some running sequentially, some in parallel,” Kachurin says. The average cost of generating a page remains under a few cents, and the approach delivered strong results even on the models available at the time.
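The shape of such a pipeline can be sketched with asyncio: layout and style run sequentially, then per-section content runs in parallel with the full page outline available to every section. The stage functions below are stand-ins for model calls, and their names, the section categories, and the style values are all assumptions for illustration.

```python
import asyncio


async def generate_layout(brief: str) -> list[dict]:
    """Stage 1 (sequential): decide page structure before anything else."""
    # Stand-in for a model call; a real system would prompt an LLM here.
    categories = ("hero", "features", "pricing", "contact")
    return [{"category": c, "brief": brief} for c in categories]


async def apply_style(layout: list[dict]) -> list[dict]:
    """Stage 2 (sequential): attach one shared style to every section."""
    style = {"palette": "warm", "font": "serif"}  # assumed curated choice
    return [{**section, "style": style} for section in layout]


async def generate_content(section: dict, outline: list[str]) -> dict:
    """Stage 3 (parallel): fill one section, knowing the whole page outline."""
    await asyncio.sleep(0)  # placeholder for model latency
    text = f"{section['category']} copy for {section['brief']}"
    return {**section, "outline": outline, "text": text}


async def generate_page(brief: str) -> list[dict]:
    layout = await generate_layout(brief)
    styled = await apply_style(layout)
    outline = [s["category"] for s in styled]
    # Sections run concurrently; gather preserves the layout order.
    return await asyncio.gather(*(generate_content(s, outline) for s in styled))


page = asyncio.run(generate_page("bakery in Lisbon"))
```

Fixing the outline before content generation is what lets parallel sections avoid duplicating one another, since each one can see what the rest of the page already covers.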

How the team fixed its development process as it grew

When Flexbe was starting out, development moved quickly — small team, no legacy code, every new feature made a visible difference. Then came growth, and the processes stalled. Covid locked it in: the team went remote and the established processes broke down.

The most costly problem turned out to be not technical but one of communication. Once the team was fully remote, ownership blurred, communication became inconsistent, and decisions often required rework.

To fix this, Kachurin restructured the process and introduced a single point of entry for all questions. A systems analyst was brought in to act as a coordinator within this system — collecting context, aligning stakeholders, and ensuring that any solution is approved before implementation. “If anyone has a question, you go to him — and he’ll pull in whoever is needed, get it approved, and tell you exactly how it should be done.” He describes this as one of the most effective process changes introduced during that period.

Another issue was building in parallel: at one point, three different versions of the interface coexisted across the product. The lack of a shared source of truth made it unclear which components and states already existed, leading to frequent duplication instead of reuse. To solve this, Kachurin restructured the way UI was developed, creating a unified design system based on design tokens and Storybook as a single source of truth. This eliminated duplication and aligned design with implementation.
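To show what design tokens as a single source of truth can look like, here is a minimal sketch that flattens one token definition into CSS custom properties. The token names and values are invented for illustration; the article does not describe Flexbe's actual token format or build tooling.

```python
# Hypothetical tokens; a real design system would define many more.
TOKENS = {
    "color": {"primary": "#1a73e8", "surface": "#ffffff"},
    "space": {"sm": "4px", "md": "8px"},
}


def to_css_variables(tokens: dict, prefix: str = "") -> list[str]:
    """Flatten nested tokens into CSS custom property declarations."""
    lines = []
    for key, value in tokens.items():
        name = f"{prefix}-{key}" if prefix else key
        if isinstance(value, dict):
            lines.extend(to_css_variables(value, name))
        else:
            lines.append(f"--{name}: {value};")
    return lines


declarations = to_css_variables(TOKENS)
css = ":root {\n" + "\n".join(f"  {line}" for line in declarations) + "\n}"
print(css)
```

Because every component reads the same generated variables, a change to one token propagates everywhere instead of being re-implemented in each version of the interface.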

Bug reports had their own dynamic. Everything coming through support landed directly in a shared development channel. This led to frequent context-switching within the development team and inconsistent handling of incoming requests. Kachurin identified this as a focus problem: the team was drowning in noise, with no way to separate real issues from support requests that needed a different kind of response.

To fix this, Kachurin placed a QA layer between support and the development team. A tester either confirms the bug or closes the ticket — half of all support reports turn out not to be bugs at all. Only then does a task reach a specific developer. This allowed him to protect the focus of developers and speed up the resolution of user issues. “It wasn’t a standard QA responsibility, but it turned out to be one of the most simple and effective changes we made.” That role required deep familiarity with user-reported issues — which is why, he adds, the best QA engineers often came from the support team.

On legacy code — refactoring for its own sake isn’t his approach. Instead, he treats it as a pragmatic trade-off: two days spent improving the architecture before starting a new feature are often more efficient than getting stuck for a week in the middle of building it. Sometimes a task has to be shelved entirely — because the cascade of changes required is too large. But a few months later it turns out the necessary groundwork has been laid, and the same task gets done in a couple of days. This approach became part of how the team makes engineering decisions.

The result of these changes — from the systems analyst to the design system to the QA layer — was a development process that Kachurin brought from reactive to predictable. Process improvement hasn’t stopped there: smaller adjustments come out of regular retrospectives, and most of the ongoing work happens at that level rather than through sweeping structural changes.

Reforma and the browser-based IDE

Reforma is a new kind of development platform that Kachurin is building as co-founder of a new startup alongside his work at Flexbe. The core idea: combine AI-assisted code generation with a visual editor — a direct response to what he sees as a fundamental limitation of existing AI tools. “Say you’ve built a simple landing page and want to change a color. Instead of just picking it with an eyedropper, you have to write a prompt and hope the model understands what you mean,” Kachurin says. Reforma is built to eliminate that friction: users can click, drag, and edit directly, switching between code, visual editing, and AI depending on what the task requires. The product isn’t launched yet, but the technical problem it required solving is instructive in itself.

Writing code in a browser is easy — a code editor can be implemented anywhere. The real challenge is different: for development to be fully functional, the editor needs access to all project files, dependencies, and modules — that’s thousands of files. None of that normally lives in the browser. The code also needs to be built and run: files bundled, unused parts removed, TypeScript and React compiled into plain JavaScript, HTML, and CSS. In local development, a dev server on the developer’s machine handles all of this. Moving this entire process into the browser is a separate engineering problem.

Existing proprietary solutions formally address it and allow running backend code right in the browser, but each comes with constraints: licenses can be revoked, costs are high, and browser support is limited. Storing thousands of module files in browser storage isn’t viable either — none of the available options are reliable and fast enough at that scale.

Kachurin took a different architectural approach. For every user who opens the editor, a separate temporary virtual machine is spun up in the cloud; the project is deployed on it, dependencies are installed, and a dev server is started. The process is identical to local development, except that it runs in the cloud.
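The per-user lifecycle can be sketched as a small session manager. This is an illustration under stated assumptions: the startup steps mirror the ones named above, but the idle timeout, the class names, and the way steps are recorded are hypothetical, and no real cloud API is called.

```python
from dataclasses import dataclass, field

STARTUP_STEPS = ("provision_vm", "deploy_project", "install_dependencies", "start_dev_server")


@dataclass
class EditorSession:
    user_id: str
    last_active: float
    completed: list[str] = field(default_factory=list)

    @property
    def ready(self) -> bool:
        """The editor is usable once every startup step has run, in order."""
        return tuple(self.completed) == STARTUP_STEPS


class SessionManager:
    def __init__(self, idle_timeout: float = 900.0):
        self.idle_timeout = idle_timeout  # assumed value, in seconds
        self.sessions: dict[str, EditorSession] = {}

    def open_editor(self, user_id: str, now: float) -> EditorSession:
        """Create one VM session per user, or reuse the existing one."""
        session = self.sessions.get(user_id)
        if session is None:
            session = EditorSession(user_id, last_active=now)
            for step in STARTUP_STEPS:
                session.completed.append(step)  # real code would call a cloud API here
            self.sessions[user_id] = session
        session.last_active = now
        return session

    def reap_idle(self, now: float) -> list[str]:
        """Tear down sessions that have been idle longer than the timeout."""
        expired = [u for u, s in self.sessions.items()
                   if now - s.last_active > self.idle_timeout]
        for user_id in expired:
            del self.sessions[user_id]
        return expired


manager = SessionManager()
session = manager.open_editor("user-1", now=0.0)
```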

To let users view and edit files in the browser, Kachurin mirrors the project’s file system into OPFS (the Origin Private File System), a low-level browser file system API. OPFS has compatibility issues, particularly in Safari, and as a low-level API it is not practical to use directly. Kachurin therefore built an abstraction layer on top of it, opfs-worker, and published it as open source, where it has since gained significant adoption, with tens of thousands of downloads each month.

For the code editor layer, he built on top of Eclipse Theia IDE, implementing a browser-only mode that the Theia community had long requested and contributing it upstream, which eliminates the need for a backend. Files stay in sync between the browser and the cloud machine, and the user gets a fully functional IDE right in their browser.
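One generic way to keep a mirror in sync is to compare content hashes and transfer only what changed. The sketch below shows that idea in Python for consistency with the other examples; it does not use or describe the opfs-worker API, and the file paths and function names are hypothetical.

```python
import hashlib


def manifest(files: dict[str, bytes]) -> dict[str, str]:
    """Map each path to a hash of its content."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def sync_plan(source: dict[str, str], mirror: dict[str, str]) -> dict[str, list[str]]:
    """Compute what the mirror must write and delete to match the source."""
    return {
        "write": sorted(p for p, h in source.items() if mirror.get(p) != h),
        "delete": sorted(p for p in mirror if p not in source),
    }


# Source of truth: the project on the cloud machine. Mirror: the browser copy.
cloud = manifest({"src/App.tsx": b"v2", "package.json": b"{}", "src/new.ts": b""})
browser = manifest({"src/App.tsx": b"v1", "package.json": b"{}", "old.css": b"x"})
plan = sync_plan(cloud, browser)
```

Only the changed and new files are transferred, which matters when a project contains thousands of module files.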

The result is a development environment that runs entirely in the browser — where users can switch between writing code, using the visual editor, or prompting AI depending on what the task requires.

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.