Avoid the autonomy trap

Dec 10, 2024

I have worked at startups before, during, and after hypergrowth. Each of them used a version of the same strategy to unlock organizational scaling: autonomous teams. Autonomous teams are vertically integrated, business-function oriented, and small. They can build, ship, and maintain their software more or less independently. More than one of these organizations cited “The Spotify Model” as inspiration.

The insight most people seem to take away from The Spotify Model is that it’s better to focus on autonomy than alignment early on. The reasoning goes that overindexing on alignment too early risks blocking progress and delivery, and an early-stage company can’t afford that. Getting out of the way of teams lets them build what they want, and you do your best to tie that work to business goals through KPIs and through stakeholders to whom the teams are accountable. You can accept the risk that teams may be duplicating work, not building a cohesive product or system, and generally optimizing for local rather than global maxima. What you can’t accept is the risk that nobody can get anything done without some centralized authority rubber-stamping a plan, or without depending on other teams that have their own priorities.

I think all of this is true. In practice, however, what I’ve seen is a near-total disregard for alignment whenever it comes at any cost to autonomy (and it always does). Very early in a company’s lifecycle, I have seen an extreme focus on autonomy help the company cross the chasm into product-market fit. A small group of people working independently can build a lot of features, and if those features are low-hanging fruit that everyone is pretty confident will add a lot of value to the product, then it’s smart to get out of the builders’ way as much as possible. Being late to market is a failure mode.

However, the technical debt incurred in this stage adds up pretty darn quickly. With no coordination and little collaboration, independent builders will make systems that solve immediate problems, and the system as a whole will be very poorly designed (because it wasn’t designed at all). This is Conway’s Law taken to an extreme.

Many people think that technical debt is mostly about cutting corners to move faster, as if writing low-quality code takes less time than writing high-quality code (it doesn’t, provided the developer knows how to write high-quality code). That isn’t what I mean. Technical debt really describes the fact that the needs of the business and (more commonly) your understanding of those needs change faster than you can change your technical systems.

When a group of independent builders creates an emergent, distributed system (as will happen according to Conway’s Law), the business incurs massive technical debt related to the difficult problems of distributed systems. These problems are usually completely orthogonal to the real needs the business has for its technical systems. Most companies never reach the scale where they have to solve hard distributed systems problems just to meet the traffic demands of their users. Despite what microservices evangelists say, you can scale a monolith to very high demand, even in a mature organization: Shopify still runs its business on a monolith.

Here’s the real kicker: distributed systems problems require alignment to solve. The canonical problem in distributed systems is summed up by the CAP Theorem, which describes a tradeoff between Consistency (basically data correctness) and Availability when data is writable in multiple places and the network between those places can fail. There are myriad ways to navigate this tradeoff, and each of them requires a degree of coordination between the data stores. You may have to implement specific APIs, track metadata related to the order of operations, or adopt an asynchronous communication protocol. No matter what, you won’t solve this problem without communication and coordination between the data stores, and therefore between the teams building them. This requires some degree of technical alignment, which by definition comes at a cost to team autonomy.
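To make "track metadata related to the order of operations" a bit more concrete, here is a minimal sketch of one common approach, a version vector, in Python. It is an illustration under assumptions rather than a recommendation: the store names ("orders", "billing") and the helpers `write` and `compare` are hypothetical, and a real system would also need a policy for resolving the conflicts this detects. The point is that the scheme only works if every team's data store records and exchanges the same metadata in the same way.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedValue:
    """A value plus a version vector: one write counter per data store."""
    value: str
    clock: dict[str, int] = field(default_factory=dict)


def write(store_id: str, seen_clock: dict[str, int], value: str) -> VersionedValue:
    """Record a write made by `store_id` after it has seen `seen_clock`."""
    clock = dict(seen_clock)  # copy so the caller's clock is not mutated
    clock[store_id] = clock.get(store_id, 0) + 1
    return VersionedValue(value, clock)


def compare(a: dict[str, int], b: dict[str, int]) -> str:
    """Return how clock `a` relates to clock `b`."""
    keys = a.keys() | b.keys()
    a_le_b = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    b_le_a = all(b.get(k, 0) <= a.get(k, 0) for k in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    # Neither write saw the other: a conflict someone has to resolve.
    return "concurrent"


# Hypothetical scenario: two teams' stores both update the same record.
base = write("orders", {}, "pending")
paid = write("orders", base.clock, "paid")            # builds on base
refunded = write("billing", base.clock, "refunded")   # also builds on base

print(compare(base.clock, paid.clock))      # before
print(compare(paid.clock, refunded.clock))  # concurrent
```

Detecting the "concurrent" case is the easy part. Deciding what to do about it (keep both, pick a winner, ask a human) is a decision the teams have to make together, which is exactly the alignment cost described above.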

You can solve these problems ad hoc as data becomes untrustworthy or performance suffers, but they are common problems that you will face time and again across the organization, and they are unrelated to the specifics of your business. This type of problem screams out for a reusable abstraction, so that it is solved correctly, consistently, and efficiently. And getting teams on the same page about a reusable abstraction, communication protocols, shared patterns, and so on requires alignment. It will cost something upfront, but it will very likely pay off not far down the road.

What doesn’t work is a culture where alignment is always sacrificed for the sake of autonomy, with no exceptions. That culture ignores the nuanced cases where the technical realities of software engineering push organizations to need common solutions to common, difficult problems. In those cases, a coordinated effort can avoid the unnecessary complexity sinks you will create if you don’t give proper consideration to distributed system design.

Systemicity
