Archipelo System Architecture
The following diagram illustrates Archipelo's high-level system architecture:
Click on the image to open the original Miro board and explore it in higher resolution.
- Archipelo System Architecture
Main Elements
The main elements of the system include:
Web Application
The web application is the main entry point for users of the system. It exposes the main Archipelo features for users to interact with. The web application is built with the Svelte framework.
Unleash Proxy and Server
Archipelo uses feature flags to control which features are available in each target environment. Feature flags are managed with Unleash, an open source feature management tool with support for frontend, backend and extensions.
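The idea of environment-scoped flags can be sketched in a few lines. This is a minimal in-memory illustration only: `FlagStore`, the `sbom-export` flag and the environment names are hypothetical, and the code does not use or mirror the Unleash client API.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class FlagStore:
    """Hypothetical in-memory stand-in for a feature flag service."""

    # Maps a flag name to the set of environments where it is enabled.
    enabled_in: Dict[str, Set[str]] = field(default_factory=dict)

    def is_enabled(self, flag: str, environment: str) -> bool:
        # Unknown flags default to disabled, so unreleased features stay hidden.
        return environment in self.enabled_in.get(flag, set())


# Example: a feature switched on in staging only (names are assumptions).
flags = FlagStore({"sbom-export": {"staging"}})
```

With this setup, `flags.is_enabled("sbom-export", "staging")` is true while the same check for `"production"` is false, which is the behaviour a per-environment rollout relies on.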
API Cloud Run Service
The API Cloud Run service is the main entry point for the Archipelo system. It provides APIs and services that the web application, extensions and other components of the system interact with. The service is implemented in Go.
Webhooks
Webhooks are system integration points that connect Archipelo with external systems, e.g. for data exchange.
Cloud Run Services
The following services run on GCP Cloud Run.
ARSBom
ARSBom is a Cloud Run service providing SBOM-related functionality for the Archipelo system using the open source Syft tool deployed as a service.
ARSemgrep
ARSemgrep is a Cloud Run service providing static code analysis functionality for the Archipelo system using the open source Semgrep tool deployed as a service.
ARGitleaks
ARGitleaks is a Cloud Run service providing secret scanning functionality for the Archipelo system using the open source Gitleaks tool deployed as a service.
ARTrivy
ARTrivy is a Cloud Run service providing vulnerability scanning functionality for the Archipelo system using the open source Trivy tool deployed as a service.
ARCheckov
ARCheckov is a Cloud Run service providing infrastructure as code scanning functionality for the Archipelo system using the open source Checkov tool deployed as a service.
Dataflow pipelines (ETL)
The data layer of Archipelo is built on top of Apache Beam. The pipelines currently run on Dataflow.
GKE Cluster (Kubernetes)
Managed Kubernetes cluster used to host the Archipelo system components.
Control Plane and API
Part of the Archipelo Attestation Engine, used to manage the attestation process and built using the Chainloop open source project.
Content Addressable Storage and API
Part of the Archipelo Attestation Engine, used to store the attestation results and built using the Chainloop open source project.
SQL Proxy
Part of the Archipelo Attestation Engine, used to provide the persistence layer for the attestation results.
CERT Manager
Archipelo uses cert-manager to automatically obtain certificates, keep them valid and renew them before they expire.
NGINX Ingress
Archipelo uses the NGINX Ingress controller, which monitors Ingress resources to discover services that require ingress load balancing inside the managed GKE cluster.
GCP Task Queue
The Archipelo System uses Google Cloud Tasks to run time-consuming, resource-intensive and bandwidth-limited tasks asynchronously, outside of the main application flow.
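As a rough sketch of how work can be handed off to a queue, the helper below builds a request body for an HTTP-target task in the shape used by the Cloud Tasks REST API (`httpRequest` with a base64-encoded `body`). The helper name, the worker URL and the payload fields are assumptions for illustration; the source does not describe how Archipelo actually enqueues tasks.

```python
import base64
import json


def build_http_task(target_url: str, payload: dict) -> dict:
    """Build a create-task request body for an HTTP-target task (hypothetical helper)."""
    # The REST API expects the request body as a base64-encoded string.
    body = json.dumps(payload).encode("utf-8")
    return {
        "task": {
            "httpRequest": {
                "httpMethod": "POST",
                "url": target_url,
                "headers": {"Content-Type": "application/json"},
                "body": base64.b64encode(body).decode("ascii"),
            }
        }
    }


# Hypothetical worker endpoint and payload, used only for illustration.
task = build_http_task("https://worker.example.com/scan", {"repository_id": 42})
```

The caller returns as soon as the task is accepted by the queue; the worker behind `target_url` receives the decoded JSON payload later, which keeps long-running work out of the request path.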
PostgreSQL Databases
The main relational databases used by the Archipelo System to store user and business data. Archipelo uses two separate PostgreSQL databases:
- Identity DB - used to store user and identity related data,
- ARDB DB - used to store business data.
GCP Buckets
The S3-compatible object storage used by the Archipelo System to store data including, but not limited to:
- ETL generated data
- Labelling data
- Machine learning models
GCP Firestore
A GCP-managed data store used by the ETL pipelines to store their data.
Flows
Operations inside the system span multiple components and depend on the availability of various data. The following section describes some of the flows that touch a number of platform elements.
License Retrieval
The following flow diagram shows a simplified view of the license retrieval process, illustrating its dependency on the availability of the SBOM data provided by the Syft Cloud Run service.
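The dependency on SBOM availability can be sketched as a function that yields no result until an SBOM exists. This is an illustration under assumptions, not the actual implementation: `retrieve_licenses` is a hypothetical name, and the `artifacts` / `licenses` field names are modeled loosely on Syft's JSON output, whose exact shape varies between versions.

```python
from typing import Dict, List, Optional


def retrieve_licenses(sbom: Optional[dict]) -> Optional[Dict[str, List[str]]]:
    """Map package name to its licenses, or return None if no SBOM is available yet."""
    if sbom is None:
        # License data cannot be derived until the SBOM has been generated.
        return None
    result: Dict[str, List[str]] = {}
    for artifact in sbom.get("artifacts", []):
        names = set()
        for lic in artifact.get("licenses", []):
            # Assumed shapes: a plain string, or an object with a "value" field.
            value = lic if isinstance(lic, str) else lic.get("value")
            if value:
                names.add(value)
        result[artifact.get("name", "<unknown>")] = sorted(names)
    return result


# Hypothetical SBOM fragment used only for illustration.
sample_sbom = {
    "artifacts": [
        {"name": "requests", "licenses": [{"value": "Apache-2.0"}]},
        {"name": "left-pad", "licenses": ["MIT", "MIT"]},
        {"name": "internal-lib", "licenses": []},
    ]
}
licenses = retrieve_licenses(sample_sbom)
```

Returning `None` for a missing SBOM, rather than an empty mapping, lets callers distinguish "not scanned yet" from "scanned, no licenses found".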