A Primer on the Cloud
This is the fifth primer in a series of articles I’ll be writing over the next 6-12 months with the goal of covering every major industry in tech (and the broader goal of having a foundation to study tech companies across industries quickly). You can find the other articles on the Public Comps Blog or my Substack.
Admittedly, I was tempted to skip the cloud primer for now because the industry is so heavily covered. However, I decided it’s too important an industry to skip. To counteract the over-coverage, I’ll spend less time talking about cloud software and more time discussing the foundational elements of the cloud.
Here’s how I’ll be structuring the article:
- Why the Cloud was so Important
- History of the Cloud
- Overview of Cloud Technology
- Overview of Cloud Market - Hyperscaler Stats & Cloud Ecosystem
- What the Future of the Cloud Could Look Like
Important Disclaimer: As always, I like to clarify that I’m an investor studying the space, not an expert on the industry. I look for opportunities in both private and public markets that provide value to investors, and I hope this research provides value to investors doing the same. That said, none of this is investment advice and should not be used as a guide to investing in the industry.
1. An Intro to the Cloud and Why it Was So Important
The cloud is a simple concept: one company rents out its hardware to other companies as software that is delivered over the internet. At the base level, cloud infrastructure virtualizes (represents hardware as software) building blocks of computing like compute power, storage, and networking. On top of that, cloud companies provide increasing levels of management of that software. Cloud SaaS offers the most managed service where a company only needs to configure settings for that software and use the tool. We can visualize the value chain like this:
The business model is simple, so I want to start by describing why the cloud was so impactful. If you’re a fan of Clayton Christensen’s work, the cloud is a perfect example of disruption. A key idea of disruption is that it’s not based on new technologies, it’s based on old technologies delivered in a cheaper way, targeted at a new set of customers. The cloud took an existing industry (enterprise-level hardware and software), created a new business model that lowered the barrier of entry to that technology, and sold it to a new market of companies (mostly technology startups).
The cloud's importance came from the lowering of the barrier to entry. Prior to the cloud, companies had high upfront costs for computing power and lower maintenance costs over time. Lowering this upfront cost significantly expanded the market for computing power. This was most evident in startups, who could much more efficiently access computing power and build software. Those startups then delivered a funnel effect to the cloud providers. As cloud software generated billions of dollars in revenue, the hyperscalers collected a large % of that revenue. Over time, enterprises moved to the cloud as well. We’re still seeing the long tail of cloud migrations, nearly 20 years after AWS’ launch.
AWS defined the early years of the cloud, but many companies helped lay the foundation for AWS’ rise.
2. The History of the Cloud
I’ll break down the history of the cloud into three segments: early history (everything pre-AWS), the rise of AWS, and the evolution of the cloud landscape.
Early History (1960-2003)
Like many overnight successes, the cloud had a long history of technologies that led to its eventual emergence. The first mention of the term “cloud computing” was in 1994 by General Magic; however, the term became more widely known in 1996 when Compaq wrote an internal document about the future of computing and the internet. The more popular term at the time was grid computing - mentioned in academia and corporate 10-Ks like IBM’s. The basic idea of grid computing was a shared network of computing resources that would “turn the entire internet” into a computer.
Before that (in 1961), John McCarthy, who coined the term ‘Artificial Intelligence’, suggested that computing would be sold as a utility one day. In 1967, IBM virtualized operating systems to create ‘timesharing’, where multiple users could use one machine. ARPANET, an early version of the Internet, was also invented in 1969.
Over the following decades, hardware and software would continue to advance, improving data centers. VMware, founded in 1998, would become the leading company in server virtualization. Salesforce, founded in 1999, helped pioneer internet-delivered software. Around this same time, startups started to realize the potential for computing delivered over the Internet.
In 1999, Marc Andreessen and Ben Horowitz founded Loudcloud, a “software infrastructure service provider.” Loudcloud raised hundreds of millions in venture funding before going public in 2001. However, the huge costs of building data centers combined with a market not ready for the cloud led to Loudcloud selling its hosting business for $63.5M in 2002. They retained the software portion of the business and sold it to HP for $1.6B in 2007.
All this to say: people knew the cloud was coming before it arrived, yet few predicted which company would lead its emergence.
The Rise of AWS
Many popular accounts of the origin of AWS go like this: “The original idea for the cloud came from companies renting out spare data center capacity. The most famous example of this brings us to the advent of the cloud. Amazon.com had to build out enough capacity to account for seasonal swings. Because of this, they decided to rent out spare compute capacity…and the rest is history.” Unfortunately, this doesn’t seem to be true. You can read an excellent article on the origins of AWS here.
The actual origin of AWS came as a response to two problems:
- Engineers spent 70% of their time building the basic elements any project would require: a storage system and compute architecture.
Engineering teams had to do this every time a new feature was released. Bezos and other managers started calling it “undifferentiated heavy lifting.” In response, the company began to think, “Let’s build a shared layer of infrastructure that all these teams can rely on, and none of them have to build out storage, compute, or databases.”
- Other companies wanted to embed Amazon product links on their web pages.
For example, a cooking website wanted to embed a link to a new stand mixer. Amazon was all for it and would send them a bit of code they could plug into their website. Over time, Amazon released scalable software, allowing everyone to incorporate Amazon features into their sites. The surprise was that much of Amazon’s usage came from its internal software engineers. Put simply, engineers wanted to be able to easily integrate software features together via APIs.
At a team offsite in 2003, managers decided Amazon Web Services could be a real business. It would take three years for AWS to launch its first product, Amazon Simple Storage Service (S3). Five months later, Elastic Compute Cloud (EC2) was released.
The cloud wave had started.
As Fortune puts it, “Instead of raising millions of dollars to buy servers and build data centers, startups could now get online with a credit card, and pay a monthly bill for just the computing power and storage they used. If their new app was a hit, they could immediately engage all the needed cloud services. If it bombed, they weren’t stuck with rooms of junk equipment.”
From 2006-2010, AWS released relational database services, NoSQL databases, app development platforms, and monitoring services.
Incredibly, no large competitors entered the cloud market for years.
Jeff Bezos referenced this at a conference in 2018, “A business miracle happened…This is the greatest piece of business luck in the history of business so far as I know. We faced no like-minded competition for seven years. I think the big established enterprise software companies did not see Amazon as a credible enterprise software company, so we had this incredible runway.”
AWS had years to extend their lead before they faced significant competition. Microsoft announced Windows Azure in 2008 and officially launched it in February 2010. Google introduced App Engine, its first cloud service, in 2008, but it wouldn’t become generally available until 2011. IBM had cloud services in the early 2000s but abandoned them, and wouldn’t launch IBM Cloud until 2013. Oracle’s Cloud Infrastructure wouldn’t go GA until 2016.
The Cloud Landscape Evolves
The early cloud years were defined by infrastructure; instead of buying data centers, companies used an API to get computing and storage on demand. AWS dominated those early years; to a lesser extent, they still dominate core infrastructure today.
The next wave of the cloud was characterized by managed services through PaaS and SaaS models. These service types aimed to further reduce customers' management workloads. Customers no longer had to manage the underlying infrastructure, just the application (PaaS) or data and configurations (SaaS).
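One way to make the IaaS -> PaaS -> SaaS progression concrete is to list which layers of the stack the customer still manages under each model. The sketch below is a simplified illustration: the layer names and the exact cut-off points are my assumptions, and real providers draw these lines differently from service to service.

```python
# Layers of a typical stack, from hardware up to the customer's own settings.
# The layer names and cut-off points below are illustrative assumptions.
STACK = [
    "hardware",
    "virtualization",
    "operating system",
    "runtime",
    "application",
    "data and configuration",
]

# Index of the lowest layer the customer still manages under each model.
CUSTOMER_STARTS_AT = {
    "on-premises": 0,  # customer manages everything
    "IaaS": 2,         # provider runs hardware and virtualization
    "PaaS": 4,         # provider also runs the operating system and runtime
    "SaaS": 5,         # customer only manages data and configuration
}


def customer_managed(model: str) -> list[str]:
    """Return the layers the customer is responsible for under a model."""
    return STACK[CUSTOMER_STARTS_AT[model]:]


for model in CUSTOMER_STARTS_AT:
    print(f"{model:12s} customer manages: {', '.join(customer_managed(model))}")
```

Reading the output top to bottom shows the customer’s list shrinking as the provider takes on more of the stack.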
Other trends like Containers/Kubernetes and Serverless computing helped define the 2010s as well. Containers provided a way to build more scalable applications with a microservices architecture. Kubernetes provided a way to manage those containers. Serverless computing abstracted away the need to manage underlying hardware. Companies use cloud software and pay for whatever cloud infrastructure they use.
Throughout the 2010s and early 2020s, hyperscalers continued to use their massive resources to scale data centers and cloud services. Incredibly, in an industry with over $200B in annual revenue, they are still capacity-constrained in many regions across the world - meaning demand still outpaces supply.
Summarizing the current state of the cloud: it’s dominated by big tech’s scale and resources.
So where do we go from here? All eyes are on AI, and the cloud providers each want to profit off the computing necessary for AI workloads. It seems the hyperscalers are in prime position to lead the next generation of computing. With that being said, the industry could see new segments that provide competition to the cloud providers: specialized clouds, edge computing, distributed computing. More on this later.
3. Overview of Cloud Technology
We can break down the cloud value chain into three segments:
- Hardware - Semiconductors & Data Centers that build the foundation for the cloud.
- Cloud Infrastructure & Platforms - Virtualized Infrastructure & Platforms to develop apps on the cloud.
- Cloud Software - The applications developed to solve specific needs for customers.
As always, I like to talk about the limitations of these graphics. First, I’m not a huge fan of the IaaS, PaaS, and SaaS breakdown; the important point is that the cloud service providers manage increasing workloads from IaaS -> PaaS -> SaaS.
Secondly, the customer is not listed in this graph. The customer can tap into just the infrastructure, just the platforms, or just software. The important point is that the cloud is just managing the backend hardware to provide compute resources to customers.
Finally, this is not an exhaustive list of companies or segments. Hundreds of companies could be added here, and areas like data center foundations (real estate, power, cooling, energy) are left out. My goal is to provide a high-level way to think about the industry. Each of these boxes could be expanded into their own market map.
For the sake of brevity (and discussing market dynamics), I’m providing a simplified discussion of technology. I’ll include links for diving deeper into the tech. I don’t discuss hardware as I’ve covered it in previous semiconductor and data center articles.
1. Infrastructure
Infrastructure refers to the simplest form of cloud services, typically bucketed into compute, storage, and networking. Put simply, cloud infrastructure provides virtual representations of physical hardware and rents out those resources to customers.
Virtualization
Virtualization is the technology that allows physical hardware to be represented as software. The core idea is that virtualization abstracts the physical resources into a shared software layer. This is done for computing, storage, and networking.
The most popular example is the virtual machine, enabled by hypervisors. A hypervisor is software installed on a physical server that creates multiple virtual machines: virtual computers that can be rented out to customers.
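To make the hypervisor idea concrete, here is a toy model of one physical server being carved into virtual machines. It is a sketch under simplifying assumptions, not how any real hypervisor works: the class and method names are hypothetical, and real hypervisors also handle scheduling, isolation, and often oversubscribe CPU rather than enforcing a hard cap.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    name: str
    vcpus: int
    memory_gb: int


@dataclass
class Hypervisor:
    """Toy model of a hypervisor managing one physical server (hypothetical)."""
    total_vcpus: int
    total_memory_gb: int
    vms: list[VirtualMachine] = field(default_factory=list)

    def free_vcpus(self) -> int:
        return self.total_vcpus - sum(vm.vcpus for vm in self.vms)

    def free_memory_gb(self) -> int:
        return self.total_memory_gb - sum(vm.memory_gb for vm in self.vms)

    def create_vm(self, name: str, vcpus: int, memory_gb: int) -> bool:
        """Create a VM if the server has spare capacity; return success."""
        if vcpus > self.free_vcpus() or memory_gb > self.free_memory_gb():
            return False
        self.vms.append(VirtualMachine(name, vcpus, memory_gb))
        return True


# One physical server, several rented-out virtual machines.
host = Hypervisor(total_vcpus=64, total_memory_gb=256)
print(host.create_vm("customer-a", vcpus=16, memory_gb=64))
print(host.create_vm("customer-b", vcpus=32, memory_gb=128))
print(host.create_vm("customer-c", vcpus=32, memory_gb=64))  # too few vCPUs left
print(host.free_vcpus(), host.free_memory_gb())
```

The third request fails because the first two VMs have already claimed 48 of the 64 vCPUs, which is the same capacity constraint a cloud provider manages across entire data centers.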
Virtualization providers include VMware and Nutanix.
Compute - VMs & Containers
Compute refers to the cloud processing power rented to customers. The core unit of compute power is the virtual machine. It is essentially a virtualized computer (with CPU, memory, and storage) rented out to customers. One physical server can have many virtual machines running on the same server.
Containers offer an alternative to Virtual Machines. Containers essentially package an application, its dependencies (such as software libraries), and its configurations (network settings) into one unit. I find it easiest to think of containers as a virtualized operating system.
Containers have become increasingly popular as the way to build applications in the cloud. They use fewer resources than VMs, offer better flexibility (can more easily move applications between clouds or physical data centers if necessary), and provide a simpler way to add new features/applications to software. They’re core to the microservices movement.
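A rough way to see why containers “use fewer resources” is to compare how many copies of an application fit on one server when each copy carries its own guest operating system (a VM) versus sharing the host’s kernel (a container). Every number below is a made-up assumption for illustration, not a benchmark.

```python
def max_instances(host_memory_mb: int, app_memory_mb: int, overhead_mb: int) -> int:
    """How many app instances fit in host memory given per-instance overhead."""
    return host_memory_mb // (app_memory_mb + overhead_mb)


# All figures are hypothetical, chosen only to illustrate the comparison.
HOST_MEMORY_MB = 256_000      # memory on one physical server
APP_MEMORY_MB = 1_000         # memory the application itself needs
VM_OVERHEAD_MB = 1_000        # assumed cost of a full guest OS per VM
CONTAINER_OVERHEAD_MB = 50    # assumed cost of a container sharing the host kernel

vm_count = max_instances(HOST_MEMORY_MB, APP_MEMORY_MB, VM_OVERHEAD_MB)
container_count = max_instances(HOST_MEMORY_MB, APP_MEMORY_MB, CONTAINER_OVERHEAD_MB)

print(f"VMs per server:        {vm_count}")
print(f"Containers per server: {container_count}")
```

Under these assumed numbers, the same server fits roughly twice as many containers as VMs; the real ratio depends heavily on the workload.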
Note: cloud-based containers are frequently called containers-as-a-service. The hyperscalers and others like Docker offer Kubernetes management platforms.
Storage
Storage refers to cloud data storage, which typically comes in three types: object storage, block storage, and file storage.
Quick visualization of the differences:
Object storage stores large amounts of unstructured data, making it great for big data processing. It designates each piece of data as an object, and assigns metadata (data descriptors) to the object for retrieval. Because of its low cost and ability to store unstructured data, cloud object storage has become popular for large data stores like data lakes.
File storage stores data in a hierarchical format, like files on a PC. It’s about storing data in a manner that’s intuitive for human use. A common use is web serving, where sites need hierarchical access to data. File storage is built on top of block storage, the oldest of the three.
Block storage breaks up the data into evenly sized blocks. It has little metadata and provides low-latency (high-speed) transactions. It’s typically attached to virtual machines or databases.
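The three storage models differ mainly in how data is addressed. The sketch below mimics each one with plain Python data structures; the function names, keys, and tiny block size are hypothetical, and it says nothing about how real storage services are implemented.

```python
# Object storage: a flat namespace of keys, each holding data plus metadata.
object_store: dict[str, dict] = {}


def put_object(key: str, data: bytes, metadata: dict) -> None:
    object_store[key] = {"data": data, "metadata": metadata}


# File storage: data addressed by a hierarchical path, like folders on a PC.
file_system = {"site": {"index.html": b"<html></html>", "images": {"logo.png": b"PNG"}}}


def read_file(path: str) -> bytes:
    node = file_system
    for part in path.strip("/").split("/"):
        node = node[part]  # walk down one directory level per path segment
    return node


# Block storage: data split into evenly sized blocks addressed by number.
BLOCK_SIZE = 4  # bytes; unrealistically small so the output is easy to read


def to_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    padded_len = -(-len(data) // block_size) * block_size  # round up to a full block
    padded = data.ljust(padded_len, b"\x00")
    return [padded[i:i + block_size] for i in range(0, padded_len, block_size)]


put_object("logs/2024-01-01.json", b"{}", {"content-type": "application/json"})
print(object_store["logs/2024-01-01.json"]["metadata"])
print(read_file("/site/images/logo.png"))
print(to_blocks(b"hello world"))
```

Note that the object key only looks like a path: the store is flat, and retrieval relies on the key and metadata rather than on walking a directory tree.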
The hyperscalers lead in cloud storage as well, and there are storage-focused clouds like Wasabi and Contabo.
Other Infrastructure
Compute and storage are the largest categories in cloud infrastructure. Other categories include:
- Cloud Networking - Creating virtual networks to connect company resources.
- Infrastructure Security - Securing cloud infrastructure.
- Infrastructure Management - Managing the various infrastructure services.
- Virtual Desktops - Remotely accessing a shared computer.
Examples of other cloud infra companies include networking providers like Cloudflare and Akamai.
2. Platforms
The core idea of platforms is that they manage the infrastructure and provide platforms for deploying applications. Developers don’t need to deploy VMs or storage; they can focus on developing applications or managing databases.
The original platform services were app development platforms like Google App Engine and AWS Elastic Beanstalk. They don’t get as much attention as other platforms like databases and AI platforms. (These are sometimes delivered as SaaS solutions but I’m including them in the platform section because companies build their applications on top of these services).
Data as a Service platforms provide databases, database management systems, and data services for companies to build applications on. This includes data platforms like Snowflake and Databricks, hyperscaler data services like BigQuery, Fabric, and Redshift, and data infrastructure providers like Fivetran and dbt.
AI Platforms are cloud platforms on which companies deploy AI/ML workloads. They generally fall into two buckets: MLOps and AI deployment. MLOps platforms like Databricks provide the tooling necessary for companies to build custom ML models. AI deployment platforms like Amazon Bedrock integrate foundation models, databases, and infrastructure to deploy personalized applications based on others’ models.
3. Cloud Software
Finally, we have cloud software and SaaS offerings. I’m not going to dive into these as it’s so broad and I’ll likely cover individual SaaS markets in the future. The important note I’ll make is that cloud software is built on cloud infrastructure and platforms. This falls into three buckets:
- SaaS providers like Salesforce provide cloud software for their customers.
- Cloud customers build custom applications for their customers.
- Hyperscalers offer SaaS products in addition to IaaS and PaaS.
SaaS provides the broadest array of options for customers. Most software built today uses the SaaS model. AWS offers 200+ cloud services and many of those are SaaS offerings (although I don’t know the exact number, I speculate over half are). Serverless offerings also fall into this category; generally this includes managed offerings where customers don’t need to worry about underlying infrastructure.
SaaS offerings are also the most competitive because the barrier to entry is so low for developing software today.
4. Cloud Markets
My goal for this section is to provide an overview of market dynamics between the big 3 cloud providers as well as an overview of alternative cloud providers. I’ll break this section down into two parts: hyperscaler breakdown and cloud infrastructure markets. As mentioned, I don’t touch on PaaS and SaaS as the article would become too long quickly.
Hyperscaler (AWS, Azure, GCP) Statistics:
Across reported revenue, AWS has ~47% market share, Azure ~35% market share, and GCP has ~18% market share. Note: Azure includes estimates based on earnings commentary and growth rates; Microsoft doesn’t specifically disclose Azure revenue.
We can get an idea of market share momentum by tracking market share by net new revenue (each provider’s share of the combined revenue the three added over the last year). Azure generated 43% of net new revenue, AWS 36%, and GCP 21%. In other words, of every new dollar of revenue the three providers added over the last year, 43 cents went to Azure.
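For readers who want to reproduce both metrics, the arithmetic is simple. The sketch below uses made-up revenue figures for three placeholder providers (they are not the hyperscalers’ actual numbers) to show how share of total revenue and share of net new revenue can tell different stories.

```python
def share(values: dict[str, float]) -> dict[str, float]:
    """Each entry's fraction of the total."""
    total = sum(values.values())
    return {name: value / total for name, value in values.items()}


def net_new_share(prior: dict[str, float], current: dict[str, float]) -> dict[str, float]:
    """Each provider's fraction of the revenue added between two periods."""
    net_new = {name: current[name] - prior[name] for name in current}
    return share(net_new)


# Hypothetical trailing-twelve-month revenue in $B; placeholders, not real data.
prior = {"Provider A": 80.0, "Provider B": 50.0, "Provider C": 30.0}
current = {"Provider A": 90.0, "Provider B": 65.0, "Provider C": 35.0}

for name, value in share(current).items():
    print(f"{name}: {value:.1%} of total revenue")

for name, value in net_new_share(prior, current).items():
    print(f"{name}: {value:.1%} of net new revenue")
```

In this example Provider A still has the largest share of total revenue, but Provider B captures half of the net new revenue, which is the kind of convergence the momentum metric is meant to surface.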
Finally, we have seen significant improvements in the overall cloud market for the last three quarters. This is driven by fewer cost optimizations and AI revenue.
Further looking at the cloud’s recovery, we see rebounding growth rates from the cloud providers (especially AWS).
Summarizing this section, we’ve seen the overall market share of AWS and Microsoft converging, with GCP in a clear third place. Over the last 4 quarters, we’ve seen continued recoveries in growth rates. We have three companies growing revenue at double-digit rates while generating ~$200B in combined annual revenue. I’m not being facetious when I say these are some of the best businesses to ever exist.
Cloud Infrastructure Ecosystem
Finally, I want to note the “alternatives” to the cloud providers. I won’t dive deeply into any of these companies in this article (perhaps in future articles). My goal for this section is for readers to have a “mostly complete” view of companies in the ecosystem.
When companies are choosing cloud infrastructure services, they have the following options:
- Hyperscalers - the tech giants that dominate the cloud infrastructure industry.
- Storage Clouds - cloud providers who provide storage-focused products.
- GPU Clouds - companies renting out GPUs and other AI-focused cloud infra.
- Alternative Clouds - cloud infrastructure companies who don’t get the same coverage as the hyperscalers but offer similar services.
- Edge Compute Clouds - companies who offer networking or edge compute services that put compute closer to the end user.
- Decentralized Clouds - providers using blockchain networks to pool many disparate computing resources and rent them out as cloud services.
Now, the vast majority of revenue flows to the hyperscalers, but it’s important to note that other models provide cloud infrastructure.
5. Cloud Data from Public Comps
With Public Comps, we can look at industry comps or individual company comps. Using the BVP Cloud 100 as an example, we can see data like FCF, EV/EBITDA, and EV/NTM over time:
For individual companies like Cloudflare, we can see data like revenue, margins, and NRR:
As always, thanks for reading!
Disclaimer: The information contained in this article is not investment advice and should not be used as such. Investors should do their own due diligence before investing in any securities discussed in this article. While I strive for accuracy, I can’t guarantee the accuracy or reliability of this information. This article is based on my opinions and should be considered as such, not a point of fact. Views expressed are solely my own, not those of Public Comps or other employers.