Package Cache
Caching layer for repository and package data.
What is the Package Cache?
The Package Cache is the intermediary layer between the CaD(configuration as data) Engine and external Git repositories. It maintains in-memory or database-backed representations of repositories, packages, and package revisions to improve performance and reduce load on external repository systems.
The cache is responsible for:
- Repository Management: Opening, closing, and tracking repository connections
- Data Caching: Storing repository metadata, package revisions, and package revision resources
- Background Synchronization: Periodically refreshing cached data from external repositories
- Repository Abstraction: Providing a unified interface regardless of storage backend (Git)
- Connection Pooling: Managing connections to external repositories efficiently
- Change Detection: Tracking repository versions to detect when updates are needed
Role in the Architecture
The Package Cache sits between the CaD Engine and Repository Adapters:
CaDEngine
↓
Package Cache
↓
┌──┴────┐
↓ ↓
CR Cache DB Cache
↓ ↓
Repository Adapter
↓
External Repositories (Git)
Key architectural responsibilities:
-
Performance Optimization: Reduces latency by caching repository data in memory or a database rather than fetching from Git on every request
-
Repository Lifecycle Management:
- Opens repositories on first access
- Maintains repository connections while in use
- Closes repositories when no longer needed
- Handles repository sharing when multiple Repository CRs point to the same Git repository
-
Abstraction Layer: Provides a consistent Cache interface with two implementations:
- CR Cache: Caches package data in-memory, stores PackageRev CR metadata in Kubernetes
- DB Cache: Stores all package data and metadata in a PostgreSQL database
-
Background Synchronization:
- Periodically refreshes repository state from Git repositories to maintain cache consistency
- Detects new, modified, and deleted package revisions
- Sends watch events (Added/Modified/Deleted) to notify watchers
- Implements configurable per-repository synchronization frequency as specified in the spec of the Repository CRD
-
Repository Adapter Integration:
- Creates repository adapter instances (Git adapter)
- Wraps adapters with caching logic
- Delegates actual Git operations to adapters
- Caches adapter responses
-
Change Notification:
- Sends watch events (Added/Modified/Deleted) when package revisions change
- Enables real-time watch streams for API clients through the CaDEngine’s WatcherManager
- Propagates changes from direct operations and background synchronizations to all active watchers
Cache selection:
The cache implementation is selected at Porch server startup based on configuration:
- CR Cache: Default implementation, stores metadata as Kubernetes resources
- DB Cache: Alternative implementation for larger deployments, stores metadata in PostgreSQL
Both implementations provide the same interface and functionality, differing only in storage mechanism and scalability characteristics.
Singleton pattern:
The cache is instantiated once during Porch server initialization and shared across all operations. This ensures:
- Consistent view of repository state
- Efficient resource usage
- Centralized synchronization control
Cache architecture and implementation patterns.
CR-based cache implementation details.
Database-based cache implementation details.
Overview of cache functionality and detailed documentation pages.
How the cache integrates with repositories and adapters.