Let's take a deep dive into the best approach today for platform engineers to catalog assets and services: the software catalog. In this article we'll discuss what a software catalog is and why it is the best approach for internal developer platforms, regardless of whether you build or buy it.
A centralized service catalog, a resource catalog, a single pane of glass: everyone is talking about catalogs as the first building block for internal developer portals and a better developer experience.
Is there a better approach to cataloging infrastructure assets, services and anything in between for platform engineering? We believe there is one: the software catalog. In this blog post we’ll make a technical argument to explain exactly what a software catalog is and why it's the best approach for internal developer platforms, regardless of whether you build or buy.
Introduction: the rise of developer portals
To become agile, engineering organizations have come a long way. The rise of DevOps has driven a lot of innovation, and one of the changes is that software monoliths are now broken into many small pieces: microservices, micro frontends, mono-repo, multi-repo and more. IaC is used to interact with cloud resources to realize the benefits of git, and third-party software of all kinds (SaaS, OSS, cloud services, etc.) is leveraged to write less code and focus on business logic.
Yet today almost everyone agrees that when developers have to interact with all these new tools, technologies, methodologies and processes, the result is a heavy cognitive load. This load has to be reduced so that engineering teams can stay productive. A shift-left culture of “you build it, you own it” requires internal developer portals. The underlying goal is to abstract away infrastructure and make life easier for DevOps engineers, while making developer self-service simple, with the proper guardrails in place.
Enter catalogs: software catalogs, microservice catalogs, resource catalogs
At the core of any developer portal is a catalog. The catalog is there to abstract away the complexity of DevOps technicalities, and provide developers a single pane of glass that represents all the software and infrastructure they need to interact with.
Ask any platform engineering team what their use cases for internal developer portals or platforms are, and you will receive many answers: self-service, dependencies, microservices, orphan environments, lack of ownership, lack of SDLC understanding and control, Jenkins self-service run amok and more.
Software catalog benefits
Is there an optimal architecture for a catalog - what I call a Software Catalog - that will cover all the use cases for an internal developer platform or a productized developer portal? There is, and its goal is to provide developers with a solid understanding and control of the Software Development Life Cycle across any DevOps architecture.
At the core of this approach is the unified catalog of (1) Service, (2) Environment, (3) Deployed Service and (4) Deployment. Only this unified approach can achieve the following:
- Eliminating developer cognitive load when creating (time to “Hello World!”) and deploying a service or resolving an issue.
- Eliminating orphan resources, from environments to cloud resources and permissions.
- A clear understanding of each service’s maturity and readiness.
- A detailed view of every service’s lifecycle from the first commit to many deployments running across different environments.
- Reducing the number of recurring tickets or Slack messages that developers receive with questions and tasks that can easily be automated through the software catalog.
- Alignment of work within the engineering organization by assigning an owner for each catalog entity, so that SREs, DevOps, developers and platform engineering all have a shared language.
- Reducing the time it takes a new developer to perform their first commit by providing clarity and faster onboarding.
Software Catalog: The Basic Model (0-1)
This model covers the main SDLC intersections: from Services, through Environments and Deployments, all the way up to the cloud.
Before I dive into the details of each component in the software catalog, here’s a brief explanation of the ontology diagrammed here:
- Service. A service can be a microservice, software monolith or any other software architecture.
- Environment. An environment is any production, staging, QA, DevEnv, on-demand or any other environment type.
- Deployed Service. A deployed service is a representation of the current “live” version of a service running in a specific environment. It will include references to the service, environment and deployment, as well as real-time information such as status, uptime and any other relevant metadata.
- Deployment. A deployment could be described as an object representing a CD job. It includes the version of the deployed service and a link to the job itself. Unlike other objects, the deployment is an immutable item in the software catalog. It is important to keep it immutable to ensure the catalog remains a consistent source of truth.
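The four entities and the references between them can be sketched as plain Python dataclasses. This is a minimal illustration rather than a reference schema: the field names, the `identifier` helper and the example values (including the `ci.example.com` URL) are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    owner_team: str


@dataclass
class Environment:
    name: str
    env_type: str  # e.g. "dev", "staging", "prod"


@dataclass(frozen=True)  # frozen: a recorded CD job is never edited afterwards
class Deployment:
    service: str
    environment: str
    version: str
    job_url: str
    status: str


@dataclass
class DeployedService:
    service: str
    environment: str
    deployment: Deployment  # the deployment that produced the live version
    status: str = "unknown"

    @property
    def identifier(self) -> str:
        # One live record per service/environment pair
        return f"{self.service}@{self.environment}"


checkout = Service(name="checkout", owner_team="payments")
prod = Environment(name="production", env_type="prod")
deploy = Deployment(
    service=checkout.name,
    environment=prod.name,
    version="1.4.2",
    job_url="https://ci.example.com/job/42",  # hypothetical CI link
    status="success",
)
live = DeployedService(checkout.name, prod.name, deploy, status="healthy")
print(live.identifier, live.deployment.version)
```

Only `Deployment` is frozen here, which mirrors the distinction above: the Deployed Service is updated as the live state changes, while each Deployment stays a fixed historical record.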
The data model should also show dependencies between each component, essentially allowing developers to get answers to questions such as:
- What is the Datadog link to service x that is running in production?
- Who owns service y?
- What is the branch name of the running version in staging?
- What is the K8S dashboard link to production?
- What are the versions of the services that are deployed in the production environment?
- What is the deployment frequency of a specific service?
- What is the percentage of successful deployments?
- In which cloud providers are my services provisioned?
- Which programming languages are the most common in my organization?
- Who do I need to interact with when there is a bug in the production environment?
- Who do I need to interact with when I have a question about an API of a specific service?
- What are the relevant DORA Metrics for a team/service?
- And a lot more.
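To make a few of these questions concrete, here is a small sketch that answers them over in-memory records. The record shapes, field names and sample data are hypothetical; a real catalog would serve the same lookups through its API or query layer.

```python
from collections import Counter

# Hypothetical catalog records, kept as plain dictionaries for brevity
services = [
    {"name": "checkout", "owner": "payments", "language": "Go"},
    {"name": "billing", "owner": "payments", "language": "Go"},
    {"name": "search", "owner": "discovery", "language": "Python"},
]
deployed_services = [
    {"service": "checkout", "environment": "production", "version": "1.4.2"},
    {"service": "search", "environment": "production", "version": "0.9.0"},
    {"service": "checkout", "environment": "staging", "version": "1.5.0-rc1"},
]
deployments = [
    {"service": "checkout", "environment": "production", "status": "success"},
    {"service": "checkout", "environment": "production", "status": "failed"},
    {"service": "checkout", "environment": "staging", "status": "success"},
    {"service": "search", "environment": "production", "status": "success"},
]


def owner_of(service):
    """Who owns service y?"""
    return next((s["owner"] for s in services if s["name"] == service), None)


def versions_in(environment):
    """Which versions are deployed in a given environment?"""
    return {
        d["service"]: d["version"]
        for d in deployed_services
        if d["environment"] == environment
    }


def success_rate(service):
    """Percentage of successful deployments, or None if there were none."""
    runs = [d for d in deployments if d["service"] == service]
    if not runs:
        return None
    succeeded = sum(1 for d in runs if d["status"] == "success")
    return 100.0 * succeeded / len(runs)


def language_counts():
    """Which programming languages are the most common?"""
    return Counter(s["language"] for s in services)


print(owner_of("search"))
print(versions_in("production"))
print(language_counts().most_common(1))
```

Each answer comes from one entity type or from a join between two of them, which is why the relations in the model matter as much as the properties.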
Let’s dive into each component type in the catalog.
The service profile should include everything about the service that is not related to the CI/CD pipeline itself. It is advisable to keep the profile clean and focused on the service itself. The relations from the service to the environment and the deployed service will provide the relevant SDLC information about the service’s life down the road.
Relevant data includes:
- Git repository URL
- Responsible team (fetched from the identity provider)
- Who is on-call
- The main language and version the service is written in (Python, Node, Go, Java, etc.)
- The business domain the service belongs to (if applicable)
- The Slack channel for getting notifications about the service
- Link to docs
- Hosting Infra (deployed as a serverless function or on Kubernetes)
- The helm chart used to deploy the service
- Services which the service is dependent upon
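As an illustration of why the dependency property is worth capturing, the sketch below stores a few service profiles and walks their declared dependencies. The profile keys (`team`, `language`, `depends_on`) and the service names are assumptions for this example, not a prescribed schema.

```python
# Hypothetical service profiles keyed by service name
profiles = {
    "checkout": {"team": "payments", "language": "Go", "depends_on": ["billing", "inventory"]},
    "billing": {"team": "payments", "language": "Java", "depends_on": ["ledger"]},
    "inventory": {"team": "logistics", "language": "Python", "depends_on": []},
    "ledger": {"team": "finance", "language": "Go", "depends_on": []},
}


def transitive_dependencies(name):
    """Return every service that `name` depends on, directly or indirectly."""
    seen = set()
    stack = list(profiles[name]["depends_on"])
    while stack:
        dependency = stack.pop()
        if dependency in seen:
            continue  # already visited; this also guards against cycles
        seen.add(dependency)
        stack.extend(profiles.get(dependency, {}).get("depends_on", []))
    return seen


def teams_to_notify(name):
    """Teams owning anything that `name` relies on."""
    return {profiles[dep]["team"] for dep in transitive_dependencies(name) if dep in profiles}


print(sorted(transitive_dependencies("checkout")))
print(sorted(teams_to_notify("checkout")))
```

Because each profile carries its owning team, the same traversal also answers who to talk to when an upstream service misbehaves.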
Environments are represented very differently across engineering organizations. Some deploy services as Lambda functions, others on Kubernetes clusters. Even when two companies use Kubernetes, they may have little in common with regard to what constitutes an environment. Some organizations represent an environment at namespace resolution, while others treat an entire cluster as the environment that services are deployed to. There is no single right way to do this.
Let's assume, as an example, that you want to represent an environment as a Kubernetes cluster. You could then store these properties for the cluster:
- Environment Type (for example dev, staging and prod)
- Cluster Type (AKS, EKS, GKE), if you operate across multiple clouds
- Deep-Links to logs & observability tools for the cluster - these are relevant for the DevOps Engineer (New Relic, Kiali, Prometheus, and more)
- Slack channel to get notifications about the cluster
- Link to Runbooks
- Cluster Configuration
We want to enable developers to get information about how a specific service version behaves in a particular environment. The “Deployed Service” represents the runtime of this service@env.
The Deployed Service is related to the Service on one side and to the Environment on the other.
Here are the properties we want to store on “Deployed Service”:
- Related service
- Health status
- Related environment
- App URL
- Chart version
- Links to logs (New Relic, Sentry, Prometheus, and more) about the deployed service (mostly the ones that are relevant for a developer)
- CPU limit
- Memory limit
- CPU requests
- Memory requests
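One way to picture the Deployed Service is as a single mutable record per service and environment pair that is updated in place as the runtime state changes. The sketch below assumes an in-memory store and a hypothetical `upsert_deployed_service` helper; the property names follow the list above but are otherwise illustrative.

```python
# In-memory stand-in for the catalog, keyed by "service@environment"
deployed_services = {}


def upsert_deployed_service(service, environment, **properties):
    """Create the service@environment record if needed, then merge new properties."""
    key = f"{service}@{environment}"
    record = deployed_services.setdefault(
        key, {"service": service, "environment": environment}
    )
    record.update(properties)
    return key


# A deployment finishes and reports the runtime configuration
upsert_deployed_service(
    "checkout",
    "production",
    health="healthy",
    chart_version="2.1.0",
    cpu_limit="500m",
    memory_limit="512Mi",
)

# Later, a health check reports a change; earlier properties are kept
key = upsert_deployed_service("checkout", "production", health="degraded")

print(key, deployed_services[key]["health"], deployed_services[key]["chart_version"])
```

Keying on the pair keeps exactly one "live" record per service in each environment, however many deployments have produced it.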
The Deployment is an immutable entity within the catalog. It represents a CD job, including the relevant data about the service’s version. Its properties include:
- Job duration
- The user who triggered the job
- The deployed service
- The environments the service was deployed to
- The deployed image tag
- The status of the deployment
- Link to the job (Jenkins job, GitHub Actions workflow, Azure DevOps pipeline, etc.)
- Link to deployment logs
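Immutability can be enforced in code as well as by convention. The following sketch uses a frozen dataclass and an append-only log; the class names, fields and sample values are assumptions made for illustration, and a real catalog would more likely enforce this in its API or storage layer.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)  # attribute assignment raises after construction
class Deployment:
    service: str
    environments: Tuple[str, ...]
    image_tag: str
    status: str
    triggered_by: str
    duration_seconds: int
    job_url: str


class DeploymentLog:
    """Append-only store: deployments can be recorded and read, never edited."""

    def __init__(self) -> None:
        self._items = []

    def record(self, deployment: Deployment) -> None:
        self._items.append(deployment)

    def for_service(self, service: str) -> Tuple[Deployment, ...]:
        # Return a tuple so callers cannot reorder or drop history
        return tuple(d for d in self._items if d.service == service)


log = DeploymentLog()
log.record(Deployment("checkout", ("staging",), "1.5.0-rc1", "success",
                      "dana", 212, "https://ci.example.com/job/41"))
log.record(Deployment("checkout", ("production",), "1.4.2", "failed",
                      "dana", 95, "https://ci.example.com/job/42"))

try:
    log.for_service("checkout")[0].status = "failed"
except AttributeError as err:  # FrozenInstanceError is a subclass of AttributeError
    print("rejected:", err)
```

A failed job followed by a retry therefore shows up as two separate Deployment records, which is what keeps metrics such as deployment frequency and success percentage trustworthy.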
Some notes on data ingestion when building the catalog (vs. buying a product)
- Ingesting data into the software catalog isn’t trivial. I recommend saving services as YAML files inside your git repository. This transfers the ownership of these files to the service owners.
- Environments will be ingested through Terraform.
- Deployed Services and Deployments will be ingested using an API.
Make sure your software catalog is API-first so you can pipe in data from different sources and implement “exporters” that fetch data from meaningful points in your pipeline.
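An exporter at the end of a CD job can be as small as a function that turns pipeline variables into an API request. The sketch below only builds the request and does not send it. The endpoint URL, the payload fields and the variable names (`SERVICE_NAME`, `TARGET_ENV`, `IMAGE_TAG`, `JOB_URL`, `JOB_STATUS`) are all hypothetical and would need to match your own catalog API and CI system.

```python
import json
import urllib.request

CATALOG_API = "https://catalog.example.com/api/deployments"  # hypothetical endpoint


def build_deployment_request(ci_env: dict) -> urllib.request.Request:
    """Build a POST request that reports one deployment to the catalog."""
    payload = {
        "service": ci_env["SERVICE_NAME"],
        "environment": ci_env["TARGET_ENV"],
        "imageTag": ci_env["IMAGE_TAG"],
        "jobUrl": ci_env["JOB_URL"],
        "status": ci_env.get("JOB_STATUS", "unknown"),
    }
    return urllib.request.Request(
        CATALOG_API,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# In a pipeline this dictionary would typically be os.environ
request = build_deployment_request({
    "SERVICE_NAME": "checkout",
    "TARGET_ENV": "production",
    "IMAGE_TAG": "1.4.2",
    "JOB_URL": "https://ci.example.com/job/42",
    "JOB_STATUS": "success",
})
print(request.get_method(), request.full_url)
# Sending is left out on purpose; a pipeline step would call
# urllib.request.urlopen(request) and handle errors and authentication.
```

Keeping the payload construction separate from the network call makes the exporter easy to test and to reuse across Jenkins, GitHub Actions or any other CD tool.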
We have started with four basic components: Service, Environment, DeployedService, and Deployment.
Once the basic structure of the software catalog is established, we will want to add more data to it: which cloud resources services use in a specific environment, which tests are performed on each service and the resulting service maturity, which secrets services use, and what their current on-demand environments and costs are. To do this, we will gradually add entities that enrich visibility in the catalog.
This model is a great beginning that gives developers a good understanding and control of the Software Development Lifecycle, a first step towards a good developer experience.