Packages, just like microservices, are modular units of code designed to be used and reused by different software elements.
In the discussion around internal developer portals, one of the main drivers is the difficulty of managing microservices; package management in a microservices architecture presents similar challenges. Interestingly, in the case of package security vulnerabilities, internal developer portals may provide a solution that is superior to some of the best security tools out there. We’ll cover that case later in this post, but we’ll begin with package visibility and control in a microservices environment.
Just like with microservices, packages are associated with the same questions: ownership, reusability, documentation, contribution, and version visibility across different environments, microservices, and infrastructure. Ensuring package readiness and quality standards is a challenge, as is getting package information for root cause analysis (when needed).
The 7 challenges of package management in a microservice architecture
1. The visibility challenge: understanding which packages are used and where
At any given point in time, your systems may contain multiple package versions, in different microservices, running on different environments (development, staging, and production). These environments, which are often represented as a k8s namespace, are managed in different cloud regions or accounts. Telling which package version is used and where is almost impossible.
Since most organizations don’t have internal developer portals with software catalogs (that can track the relevant package information), package information is taken from the runtime itself, by integrating into the different cloud environments, clusters and the specific workload or running container, analyzing the packages loaded into the processes.
What we’d really like is visibility of package versions and their dependent components (microservices deployed across different environments, clusters, lambdas, etc.) within the software catalog.
2. The migration challenge: aligning package versions in a microservice world
The visibility challenge soon becomes a migration challenge when packages need to change, whether because the language runtime is being upgraded or because a package version must be updated for some other reason. Without knowing which version is used where, this is difficult to do at scale, especially as the number of microservices, teams, and developers grows.
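To make the migration problem concrete, here is a minimal sketch of finding the services that lag behind a target package version, assuming the catalog data can be exported as simple records. The record shape, the service names, and the `parse_version` helper are hypothetical illustrations, not part of any specific portal's API.

```python
from typing import NamedTuple


class Usage(NamedTuple):
    """One hypothetical catalog record: a service running a package version in an environment."""
    service: str
    environment: str
    package: str
    version: str


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '2.10.1' into (2, 10, 1) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in version.split("."))


def services_behind(usages: list[Usage], package: str, target: str) -> list[Usage]:
    """Return the usages of `package` whose version is older than `target`."""
    wanted = parse_version(target)
    return [
        u for u in usages
        if u.package == package and parse_version(u.version) < wanted
    ]


# Illustrative data only.
usages = [
    Usage("checkout", "production", "lodash", "2.2"),
    Usage("checkout", "staging", "lodash", "4.17.21"),
    Usage("billing", "production", "lodash", "4.17.20"),
    Usage("billing", "production", "axios", "0.21.1"),
]

for u in services_behind(usages, "lodash", "4.17.21"):
    print(f"{u.service} ({u.environment}) still runs {u.package} {u.version}")
```

The numeric comparison only handles plain dotted versions; pre-release tags or ranges would need a proper version parser.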
3. The impact challenge: is it even possible to do a security impact analysis for vulnerable packages?
It is not uncommon for security incidents to occur on packages such as log4j. Once such issues come to attention, security researchers need to understand the impact these vulnerabilities have. They need to determine where the "infected version" of the package is located, which parts of the infrastructure it is used on, and which microservices utilize the affected version.
Current tools try to solve this problem, typically by mapping the packages in a container or by analyzing process memory to see which packages are loaded. Neither approach is simple, and both are mostly used for security purposes. They are difficult to run on K8s, memory analysis is hard, and in practice these solutions are often not feasible: they tend to be intrusive, produce many false positives, and require a deep DevOps implementation, which is often a barrier to adoption.
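By contrast, if package versions were already recorded for each running service, the impact question becomes a lookup rather than a runtime scan. The sketch below assumes hypothetical records and a made-up package name and affected range; it does not reflect real advisory data.

```python
from collections import defaultdict


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '2.14.0' into (2, 14, 0) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def impacted(running_services, package, first_bad, first_fixed):
    """Group running services by environment where `package` is in [first_bad, first_fixed)."""
    low, high = parse_version(first_bad), parse_version(first_fixed)
    result = defaultdict(list)
    for rs in running_services:
        version = rs["packages"].get(package)
        if version is not None and low <= parse_version(version) < high:
            result[rs["environment"]].append((rs["service"], version))
    return dict(result)


# Illustrative records; "example-logger" and its versions are placeholders.
running_services = [
    {"service": "orders", "environment": "production", "packages": {"example-logger": "2.14.0"}},
    {"service": "orders", "environment": "staging", "packages": {"example-logger": "2.17.1"}},
    {"service": "search", "environment": "production", "packages": {"example-logger": "2.3.0"}},
    {"service": "emails", "environment": "production", "packages": {"other-lib": "1.0.0"}},
]

for env, hits in impacted(running_services, "example-logger", "2.0.0", "2.15.0").items():
    for service, version in hits:
        print(f"{env}: {service} runs example-logger {version}")
```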
4. The quality challenge: can you score the quality of deployed packages?
The quality of a package is determined by several different factors (this is a partial list):
Common vulnerabilities and exposures alerts, provided by tools that detect CVEs in packages.
Alerts from Static and Dynamic Application Security Testing tools (SAST & DAST, such as SonarQube). These tools continuously inspect code quality and run fully automated code reviews and analysis to detect code smells, bugs, possible performance improvements, and security vulnerabilities.
Open bugs or issues on a package. For open-source packages managed in GitHub, issues, bugs, and breaking changes are tagged and managed inconsistently between packages. This makes change tracking or dealing with major releases a problem.
Alerts from exception and error tracking tools such as Sentry or Bugsnag.
For in-house packages, there is also the question of setting quality and reliability standards, such as ensuring a package has a specified owner and that tests exist and were run (for example, by verifying that the file Test.java exists in the git repo).
Issues with packages are tracked by different tools, and there is no unified quality metric for a given package version. Alerts appear in different tools and it isn’t simple to tell which packages need immediate attention and which don’t.
5. The ownership and documentation challenge: in-house developed packages
In many cases, it is difficult to tell who owns a certain package, making it difficult for developers to understand it, implement it, or add capabilities to it. Ownership information is usually replaced by inaccurate tribal knowledge about the package, a problem made worse when the data lives in shared spreadsheets, Confluence documentation, or README files in version control, which are often stale because they aren’t maintained regularly.
6. The code duplication and shadow package challenge
When multiple teams work on projects with similar needs, code duplication happens, resulting in several packages essentially performing the same business logic. This wastes team efforts, creates a maintenance issue and results in different implementations of the same package being used in different sections of the codebase.
7. The “locked” package challenge
A package owned by a specific team and not open for contribution can create a bottleneck because that team owns the code, the package’s CI, and any contributions to the package. This means that other groups depend on the availability and priorities of the team that owns the code to make any changes or improvements, leading to potential delays and a lack of collaboration.
Managing packages as part of the Internal Developer Portal
Let’s see how these challenges are overcome through the software catalog in the Internal Developer Portal. The software catalog lies at the heart of any internal developer portal, as we’ve written here:
“The software catalog in the internal developer portal is much more than a microservices catalog. It is both services and infrastructure aware and shows all information in context. By combining information about clouds, environments, software, engineering tools as well as ownerships, docs, APIs and more, it provides the ability to really understand anything software in real time”.
This same catalog also contains valuable data about packages. It shows packages, and their dependent components and is easily searchable.
Using the software catalog you can get visibility into packages, their versions, owners, documentation, etc.
Within the software catalog you can visualize dependencies of package-dependent components such as microservices, environments, cloud resources, regions, cloud accounts etc.
Using the scorecards in the software catalog, you can determine unified scorecards for each package, in context, based on information coming in from several different tools.
Using the search capability in the software catalog, you can drive migrations and initiatives, from updating package versions to package migrations and the like.
Building the software catalog with a data model that supports package management
As we’ve explained elsewhere, one of the first things to determine is the right taxonomy for the software catalog in the internal developer portal. Let’s examine the taxonomy of a software catalog built for package management.
This schema represents a software catalog data model that supports package management. Let’s look at what it’s made of:
Service: a microservice or any other software, regardless of architecture (including a monolith).
Environment: any production, staging, QA, DevEnv, on-demand, or any other environment type.
Running Service: this is a service instance. This provides information about how the service is running now, in a specific environment. This entity references the service, environment and all the associated package versions that are part of the running service version.
Package: this represents a package with the relevant information about the package.
Package version: represents the package version (every package can have multiple versions).
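The entities above can be sketched as plain Python dataclasses. This is only one possible reading of the model: the field names (`owner`, `service_version`, and so on) are assumptions added for illustration, and a real catalog would define these as blueprints or schemas in its own format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Package:
    name: str
    owner: str = "unknown"  # hypothetical field


@dataclass(frozen=True)
class PackageVersion:
    package: Package
    version: str


@dataclass(frozen=True)
class Service:
    name: str


@dataclass(frozen=True)
class Environment:
    name: str  # e.g. "production", "staging", "QA"


@dataclass
class RunningService:
    """A service instance: which service version runs where, and with which package versions."""
    service: Service
    environment: Environment
    service_version: str  # hypothetical field
    package_versions: list[PackageVersion] = field(default_factory=list)


# Illustrative usage.
lodash = Package(name="lodash", owner="frontend-platform")
running = RunningService(
    service=Service("checkout"),
    environment=Environment("production"),
    service_version="1.4.0",
    package_versions=[PackageVersion(package=lodash, version="2.2")],
)

for pv in running.package_versions:
    print(f"{running.service.name} in {running.environment.name} uses {pv.package.name} {pv.version}")
```

Keeping Package version as its own entity, rather than a string on the running service, is what lets the catalog answer "who uses version X" by following relations in either direction.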
Applying this model into real data will allow you to create the following level of visualization on packages and their entire dependency model:
For example, you can see the package lodash and all its associated versions, then drill into version 2.2 to see every service that uses it, grouped by the environment it is deployed in.
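The same view can be approximated in a few lines of Python over flat records, grouping first by version and then by environment. The records and service names here are invented for illustration.

```python
from collections import defaultdict

# Hypothetical flat records: (service, environment, package, version).
records = [
    ("checkout", "production", "lodash", "2.2"),
    ("search", "production", "lodash", "2.2"),
    ("checkout", "staging", "lodash", "4.17.21"),
    ("billing", "staging", "lodash", "2.2"),
    ("billing", "production", "axios", "1.6.0"),
]


def dependency_view(records, package):
    """Return {version: {environment: [services]}} for one package."""
    view = defaultdict(lambda: defaultdict(list))
    for service, environment, pkg, version in records:
        if pkg == package:
            view[version][environment].append(service)
    # Convert nested defaultdicts to plain dicts for predictable printing and comparison.
    return {version: dict(envs) for version, envs in view.items()}


for version, envs in dependency_view(records, "lodash").items():
    print(f"lodash {version}")
    for environment, services in envs.items():
        print(f"  {environment}: {', '.join(services)}")
```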
Bringing package data into the Internal Developer Portal
How does package data become populated in the internal developer portal?
Next, during the CD process, when pulling the image of the desired version to be deployed, use the API to report the Running-Service object to the software catalog. This object should include the Service's version ID (previously created during the CI process) and be associated with the environment to which the CD process is deploying it.
By following these steps, you can maintain a live representation of all your services and their associated package versions across different environments. This can be helpful for tracking dependencies and ensuring that the correct versions are being used in each environment.
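As a rough sketch of the CD-side reporting step, the code below builds a Running-Service payload and the HTTP request that would carry it. The endpoint URL, the payload field names, and the bearer-token auth scheme are all assumptions; a real portal's API will define its own.

```python
import json
import urllib.request

# Hypothetical endpoint; replace with your portal's actual API.
CATALOG_URL = "https://catalog.example.internal/api/running-services"


def build_running_service(service_version_id, environment, package_versions):
    """Build the Running-Service object: service version, environment, and package versions."""
    return {
        "serviceVersionId": service_version_id,
        "environment": environment,
        "packageVersions": [
            {"package": name, "version": version}
            for name, version in sorted(package_versions.items())
        ],
    }


def build_request(payload, token, url=CATALOG_URL):
    """Wrap the payload in a POST request (not sent here)."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {token}"},
        method="POST",
    )


if __name__ == "__main__":
    payload = build_running_service(
        "checkout@1.4.0", "staging", {"lodash": "2.2", "axios": "1.6.0"}
    )
    request = build_request(payload, token="dummy-token")
    print(request.get_method(), request.full_url)
    print(json.dumps(payload, indent=2))
    # In a real pipeline, urllib.request.urlopen(request) would send it.
```

Separating payload construction from sending keeps the step easy to test in CI without network access.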
Creating quality scorecards for managed packages
To set up a scorecard, all the relevant information about the package version needs to be brought into the internal developer portal. For example, alerts from tools such as Sentry, Datadog, Snyk and SonarQube, as well as the issues opened for the particular package. We can then define standards for quality and set rules for scoring.
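To illustrate what "rules for scoring" might look like once that data is in one place, here is a minimal sketch. The rule names, thresholds, and input fields are hypothetical placeholders, not the rules of any particular portal or tool.

```python
from typing import Callable

Rule = tuple[str, Callable[[dict], bool]]

# Hypothetical rules over data aggregated from several tools.
RULES: list[Rule] = [
    ("has an owner", lambda p: bool(p.get("owner"))),
    ("no open critical vulnerabilities", lambda p: p.get("critical_vulnerabilities", 0) == 0),
    ("fewer than 5 open issues", lambda p: p.get("open_issues", 0) < 5),
    ("no unresolved production errors", lambda p: p.get("unresolved_errors", 0) == 0),
]


def score(package_version: dict) -> dict:
    """Evaluate every rule and report how many passed and which failed."""
    failed = [name for name, check in RULES if not check(package_version)]
    return {"passed": len(RULES) - len(failed), "total": len(RULES), "failed": failed}


# Illustrative input for one package version.
lodash_2_2 = {
    "owner": "frontend-platform",
    "critical_vulnerabilities": 1,
    "open_issues": 3,
    "unresolved_errors": 0,
}

result = score(lodash_2_2)
print(f"{result['passed']}/{result['total']} rules passed; failed: {result['failed']}")
```

Reporting the failed rule names alongside the score makes it clear which packages need immediate attention and why.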
For this example below, we want to verify these three minimum requirements:
Scaffolding new in-house packages
One of the main uses of internal developer portals is developer self-service actions. One of the prime examples is scaffolding a new microservice through the internal developer platform, ensuring that quality and readiness standards are met.
Why might a company want to allow developers to scaffold new in-house packages using an internal developer portal rather than creating them independently? One reason is that it helps to ensure consistency and adherence to DevOps best practices across all in-house packages. By providing everything developers need, such as code boilerplates, CI configuration, and tests, as part of the developer portal, the package scaffolding process can be streamlined. This allows developers to start coding right away and own the package, while still ensuring that packages are developed according to best practice requirements.
The streamlined process can also save time and effort for developers, allowing them to focus on other tasks. An internal developer portal can facilitate collaboration between developers, ensuring that in-house packages are developed in a coordinated manner.
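Behind a scaffolding self-service action, the core step is usually rendering a template into a new repository. A minimal sketch, assuming a hypothetical template with a README, a source folder, a smoke test, and a placeholder CI file:

```python
import tempfile
from pathlib import Path

# Hypothetical template: relative path -> file content, both with {placeholders}.
TEMPLATE = {
    "README.md": "# {name}\n\nOwner: {owner}\n",
    "src/{name}/__init__.py": '"""{name} package."""\n',
    "tests/test_smoke.py": "def test_smoke():\n    assert True\n",
    "ci.yml": "# Placeholder CI config for {name}; replace with your CI system's syntax.\n",
}


def scaffold_package(root: Path, name: str, owner: str) -> list[Path]:
    """Create the template files under `root` and return the paths written."""
    created = []
    for relative, content in TEMPLATE.items():
        path = root / relative.format(name=name)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content.format(name=name, owner=owner))
        created.append(path)
    return created


if __name__ == "__main__":
    # Write into a temporary directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold_package(Path(tmp), "date_utils", "platform-team"):
            print(path.relative_to(tmp))
```

Because the owner is written at creation time, every scaffolded package starts with the ownership information the catalog needs.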
Top Benefits of Internal Developer Portal for Package Management
Managing packages within an organization can be a complex task, but a developer portal can make it easier and more efficient. By centralizing and streamlining processes such as tracking package versions across the software development lifecycle, aligning package versions across microservices, and conducting security analyses of vulnerable packages, a developer portal can help organizations save time and resources while improving security and code quality.
Additionally, a developer portal can facilitate open collaboration on in-house package codebases, helping to prevent issues such as code duplication and shadow packages. All of these benefits can lead to cost savings, increased productivity, improved developer efficiency, and business growth.