What Is an AI Cloud Stack

November 25, 2024

Enterprise AI adoption has accelerated at a rapid pace in the past few years. AI applications, models, cloud platforms, and hardware have emerged to form an AI cloud stack to support this growth.

Last year, 42% of enterprises were actively using AI and another 40% were actively exploring its potential use. The global enterprise artificial intelligence market reached $23.95 billion in 2024.

Enterprises looking to adopt AI need to understand the AI cloud stack, how the different components interact, and how this stack can impact their enterprise AI applications.

What Is the AI Cloud Stack?

The AI cloud stack refers to the connected layers, services, and tools that enable the development and deployment of AI applications in the cloud. It includes everything from the hardware and infrastructure to the application your end user interacts with.

The AI cloud stack can be split into three layers:

  1. Application Layer: This layer consists of the end-user applications that make use of AI models. These include chatbots, recommendation engines, and predictive analytics tools.
  2. Model Layer: This layer includes the underlying foundation models that support AI-powered applications. This may include closed-source foundation models or open-source models supported by a model hub.
  3. Infrastructure Layer: This layer includes the physical hardware and cloud infrastructure required to run AI applications. It includes storage, networking, computing, and cloud platforms.
Image: Preliminary Generative AI Tech Stack (credit: a16z.com)

Each component of the AI cloud stack works with the others to support a seamless AI development process. Here are the key components of the AI stack, the role each plays, and how they interact with each other:

AI Applications

AI applications are the end-user tools and services that are powered by AI. Users interact directly with this layer of the AI cloud stack to receive AI-driven insights, generate content, or get a better user experience through chatbots and recommendation engines.

ChatGPT and Jasper are two examples of user-facing applications powered by artificial intelligence.

These applications rely on foundation models for their intelligence. For example, ChatGPT runs on OpenAI's GPT series of models, such as GPT-3.5 and GPT-4.
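The way an application calls a hosted foundation model can be sketched in Python. The payload below follows the general shape of OpenAI-style chat APIs, but the helper function, its defaults, and the system prompt are illustrative assumptions rather than a specific vendor's SDK:

```python
def build_chat_request(prompt, model="gpt-4", temperature=0.7):
    """Build a chat-completion request payload in the style of a hosted
    LLM API. Illustrative only -- a real application would send this
    JSON over HTTPS with its API key attached."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": temperature,
    }

payload = build_chat_request("Summarize our Q3 sales data.")
print(payload["model"])  # → gpt-4
```

The application never touches the model's weights; it only exchanges JSON with the provider, which is why closed-source models remain easy to integrate but hard to customize.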

Closed-Source Foundation Models

AI applications are powered by foundation models: large-scale AI models trained on broad datasets to perform a wide range of tasks. Applications can use either closed-source or open-source foundation models.

Closed-source models are proprietary technology controlled by their owners. OpenAI’s GPT series, Google’s PaLM 2, and Amazon Titan are among the most popular examples. AI application developers can integrate these models into their apps by paying for API access.

With closed-source foundation models, AI developers do not have access to the model’s underlying code or architecture, making it less customizable. However, they do offer a higher level of support and require less setup, especially for common tasks the model is already optimized for.

These models are deployed on cloud platforms and are usually platform-dependent. For example, PaLM 2 is only available on Google Cloud and Amazon Titan is only available on AWS.

Open-Source Foundation Models

Open-source foundation models are publicly available and can be freely trained, modified, and used. This allows developers to customize and retrain the model on new data to meet specific use cases. Open-source models are primarily supported and improved by developer communities, offering more innovative features but less available support.

Some popular open-source models include Meta’s LLaMA, BERT (Bidirectional Encoder Representations from Transformers), and GPT-Neo.

Like closed-source models, these foundation models are used to power AI applications and they are deployed on a cloud platform. However, open-source models make use of a model hub, an additional component between the model and the application layer.

Model Hubs

Model hubs are repositories for pre-trained AI models that allow developers to download, fine-tune, or deploy models without starting from scratch. Model hubs such as Hugging Face serve as a bridge between open-source foundation models and AI applications.

Model hubs accelerate development by providing streamlined access to pre-trained models. Without a model hub, developers would need to set up their own infrastructure and manually configure, train, and deploy the model. Model hubs also manage version control to ensure your application uses the intended version of the foundation model.
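The hub workflow described above, resolving a model name to its latest version and caching the download, can be illustrated with a toy in-memory class. This is not the Hugging Face API; it is a minimal sketch of the version-resolution and caching behavior a real hub provides:

```python
class ModelHub:
    """Toy stand-in for a model hub such as Hugging Face. Real hubs
    store weights remotely; this sketch only shows version resolution
    and download-once caching."""

    def __init__(self):
        self._versions = {}  # model name -> list of published versions
        self._cache = {}     # (name, version) -> locally "downloaded" weights

    def publish(self, name, version):
        self._versions.setdefault(name, []).append(version)

    def latest(self, name):
        return max(self._versions[name])

    def pull(self, name, version=None):
        # Resolve "no version given" to the latest published release.
        version = version or self.latest(name)
        key = (name, version)
        if key not in self._cache:  # download once, reuse afterwards
            self._cache[key] = f"weights for {name} v{version}"
        return self._cache[key]

hub = ModelHub()
hub.publish("llama", 2)
hub.publish("llama", 3)
print(hub.pull("llama"))  # → weights for llama v3
```

The same pattern is why hubs save so much setup: the developer asks for a model by name, and resolution, download, and caching happen behind one call.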

Model hubs integrate directly with development frameworks and cloud platforms, making them easy to build into your AI cloud stack.

End-to-end AI Applications

End-to-end AI applications combine the AI application, model hub, and foundation model in one tool. Tools like Google Vertex AI and AWS SageMaker simplify the AI cloud stack by providing complete environments for building, training, and deploying AI models and applications.

Because they serve as the complete application and model layers, AI development platforms like Vertex AI or SageMaker can only deploy specific, supported foundation models. For example, you can deploy closed-source models like Gemini and PaLM 2 and open-source models like LLaMA on Google Vertex AI. These applications are hosted on cloud platforms and are usually vendor-specific.

Is your organization ready for AI?

Complete our AI Readiness Assessment to better understand your organization's readiness for AI implementation and provide actionable insights for your AI journey.

Cloud Platform

Cloud platforms such as AWS, Microsoft Azure, and Google Cloud provide the infrastructure and services required to develop, deploy, and scale AI applications. You can use a cloud platform to host AI models and applications.

Cloud platforms provide access to their own proprietary, closed-source models and also integrate with model hubs to make it easy to use open-source models. These platforms also handle the allocation of compute resources and hardware management.

Choosing a cloud provider to host your AI model is often the most important decision when it comes to building an AI cloud stack. Providers like AWS, Google Cloud, and Azure support specific foundation models and ecosystems, limiting your options.

Compute Hardware

This component of the infrastructure layer covers the GPUs and TPUs that supply the processing power to train and run AI models. Cloud platforms generally handle the allocation of hardware resources to optimize app performance.

NVIDIA GPUs and Google’s TPUs are examples of compute hardware used to run foundation models and AI applications.

Cloud providers may also offer different options for resource allocation based on your budget and compute needs. If you’re worried about compute costs, spot instances, autoscaling, and continuous cost management are good ways to approach artificial intelligence software development on a budget.
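Autoscaling, one of the cost controls mentioned above, boils down to a simple decision rule: track average utilization against a target and resize the instance fleet accordingly. The function below is a hedged sketch of that target-tracking logic; the thresholds and bounds are illustrative assumptions, not any provider's actual policy:

```python
def scale_decision(utilization, instances, target=0.6, min_inst=1, max_inst=8):
    """Return a new GPU instance count from average fleet utilization.

    Target-tracking rule, similar in spirit to cloud autoscalers:
    desired = current * (actual utilization / target utilization),
    clamped to the configured fleet bounds. All parameters here are
    illustrative assumptions."""
    desired = round(instances * utilization / target)
    return max(min_inst, min(max_inst, desired))

print(scale_decision(0.9, instances=4))  # high load -> scale out to 6
print(scale_decision(0.3, instances=4))  # low load -> scale in to 2
```

Combined with spot instances for interruptible training jobs, a rule like this keeps the fleet sized to the workload instead of paying for peak capacity around the clock.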

While the AI cloud stack may seem complicated, many of the decisions are streamlined by choosing the right cloud platform and working with the right AI development team. Gigster can help evaluate your AI application and build the cloud stack that’s right for your needs. Want to learn more about your organization’s current level of AI readiness? Take our 10-minute AI Readiness Assessment.


Let's Build the Future of Technology Together

Let our team provide you with a no-cost, no-commitment technical proposal for your next development project:
Get a free Technical Proposal

© 2024, Gigster LLC. Terms & Privacy

