ALTRepo Uploader

ALTRepo Uploader (a.k.a. ALTRepoDB) is a set of tools used to upload data about ALT Linux distributions to a ClickHouse database.

The database contents are used to support ALT Linux development and analytics through ALTRepo API.

License

GNU GPLv3

Dependencies

ALTRepo Uploader requires Python version 3.9 or higher.

ALTRepo Uploader requires the following packages to be installed for the tools to be fully functional.

Note: some package names are ALT Linux specific

System packages

  • xz
  • git
  • fuseiso
  • gostsum
  • squashfuse
  • cdrkit-utils
  • libvirt
  • qemu-img
  • qemu-kvm
  • libguestfs
  • guestfs-data
  • rabbitmq-c
  • librpm7
  • libclickhouse-cpp

Python packages

  • python3-module-rpm
  • python3-module-requests
  • python3-module-zstandard
  • python3-module-libarchive-c
  • python3-module-setproctitle
  • python3-module-beautifulsoup4
  • python3-module-clickhouse-driver

Database structure

Project overview

Project summary and purpose

ALTRepo Uploader collects repository snapshots, build tasks, distribution images, and QA and security feeds from the ALT Linux ecosystem and stores them in ClickHouse. The data powers ALTRepo API and analytics for package history, build state, and vulnerability tracking.

Example: a daily branch snapshot is loaded into a package set so the API can answer "what packages were in p10 on a given date."

Architecture overview

  • CLI loaders parse local artifacts (repo snapshots, task trees, images) and write normalized data to ClickHouse.
  • The altrepod daemon runs service instances that consume AMQP messages and invoke the same loaders.
  • Scheduler tasks periodically pull external feeds and refresh enrichment tables.
  • Errata Server integration performs CVE matching and errata creation during repo and task processing.
  • ALTRepo API reads from ClickHouse for analytics and reporting.

Example: an AMQP message for a repo update triggers repo_loader through altrepod.
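The routing step described above can be pictured with a small sketch. This is only an illustration: the handler functions, the `HANDLERS` table, and the payload keys are hypothetical and do not reflect altrepod's actual code or message schema.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for the real loaders; they only describe
# what would be invoked instead of touching a database.
def load_repo(payload: dict) -> str:
    return f"repo_loader {payload['branch']} {payload['date']}"

def load_task(payload: dict) -> str:
    return f"task_loader {payload['task_id']}"

# Hypothetical routing table: service name -> handler.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "repo": load_repo,
    "task": load_task,
}

def dispatch(service: str, payload: dict) -> str:
    """Pick the handler for a service name and run it on the message payload."""
    try:
        handler = HANDLERS[service]
    except KeyError:
        raise ValueError(f"no service instance configured for {service!r}") from None
    return handler(payload)
```

For instance, `dispatch("repo", {"branch": "p10", "date": "2022-06-22"})` selects the repository handler, mirroring the repo update example above.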

Business logic description

  • Repository ingestion: parse a branch snapshot, compute package and file metadata, and create a package set record.
  • Task ingestion: parse build tasks, store task state, subtasks, logs, approvals, and build iterations for traceability.
  • Image ingestion: extract package lists from ISO/IMG/QCOW or container images and store image package sets with edition and variant metadata.
  • Enrichment: load ACL ownership, Bugzilla issues, Beehive build status, Repocop checks, Watch updates, SPDX licenses, and Repology versions.
  • Vulnerability and errata processing: update CVE and CPE data, map packages to CPEs by branch, match version ranges, compute vulnerability status, and generate errata entries from changelogs.

Example: when a package version moves outside a vulnerable range, it is marked fixed and linked to an errata record.
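A minimal sketch of the range check behind that example is shown below. It assumes plain dotted numeric versions and a half-open range (start inclusive, end exclusive); real RPM versions carry epochs, releases, and letters, and the project's actual matching logic may differ. The function names are illustrative only.

```python
from typing import Optional, Tuple

Version = Tuple[int, ...]

def parse_version(text: str) -> Version:
    """Parse a dotted numeric version such as '1.2.10'.

    Simplification: no epoch, release, or alphabetic parts are handled.
    """
    return tuple(int(part) for part in text.split("."))

def is_vulnerable(version: str,
                  start_incl: Optional[str],
                  end_excl: Optional[str]) -> bool:
    """Return True if version lies in [start_incl, end_excl)."""
    v = parse_version(version)
    if start_incl is not None and v < parse_version(start_incl):
        return False
    if end_excl is not None and v >= parse_version(end_excl):
        return False
    return True
```

Comparing integer tuples rather than strings matters here: as strings, "1.2.9" would sort after "1.2.10".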

Database structure description

  • Package inventory: Packages, PackageHash, Files, FileNames, and Changelog.
  • Package sets and repository state: PackageSetName, PackageSet, and RepositoryStatus.
  • Build and task history: Tasks, TaskStates, TaskIterations, TaskLogs, TaskProgress, and TaskApprovals.
  • Image metadata: ImagePackageSetName, ImageStatus, and ImageTagStatus.
  • QA and external feeds: Bugzilla, BeehiveStatus, PackagesRepocop, PackagesWatch, SPDXLicenses, and RepologyLatestVersions.
  • Vulnerability and errata: Vulnerabilities, CpeDictionary, CpeMatch, PackagesCveMatch, PackagesVulnerabilityStatus, ErrataHistory, ErrataChangeHistory, and ErrataID.
  • Ingestion helpers: buffer tables and materialized views that batch loads and keep "latest" datasets up to date.

Example: PackageSetName links a branch snapshot to the package hashes it contains.

ALTRepo Uploader uses ClickHouse as its DBMS because of its high performance and convenience for analytics.

Database structure initialization

The initial database structure is stored in the sql/0000-initial.sql file and can be deployed to a ClickHouse server with the following command:

[user@host]$ cat sql/0000_initial.sql | clickhouse-client -h %SERVER_IP_OR_DNS_NAME% -d %DATABASE_NAME% -n

Database contents initialization

Some additional initialization data is included as well. For example, license name aliases can be uploaded with:

[user@host]$ cat sql/license_aliases.json | clickhouse-client -h %SERVER_IP_OR_DNS_NAME% -d %DATABASE_NAME% --query="INSERT INTO LicenseAliases FORMAT JSONEachRow"
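The JSONEachRow format used by this command expects one JSON object per line. The sketch below shows how such input could be produced from Python; the column names `alias` and `spdx_id` are hypothetical placeholders, not the actual LicenseAliases schema.

```python
import json
from typing import Iterable, Mapping

def to_json_each_row(rows: Iterable[Mapping[str, str]]) -> str:
    """Serialize rows as newline-delimited JSON objects (JSONEachRow input)."""
    return "".join(json.dumps(row, ensure_ascii=False) + "\n" for row in rows)

# Hypothetical rows; the real column names come from the table definition.
sample_rows = [
    {"alias": "GPLv3", "spdx_id": "GPL-3.0-only"},
    {"alias": "MIT/X11", "spdx_id": "MIT"},
]
payload = to_json_each_row(sample_rows)
```

The resulting text could be piped to clickhouse-client in the same way as the JSON file above.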

Database permissions

The database user that the utilities connect as must have proper permissions. At a minimum, grant read and write permissions on all created tables and full permissions on temporary tables.
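As a rough illustration, the helper below builds ClickHouse GRANT statements that approximate these permissions. It is a sketch under assumptions: the user name is hypothetical, and whether SELECT and INSERT alone are sufficient for every tool is not confirmed by this document, so treat the output as a starting point to review.

```python
import re
from typing import List

# Accept only simple identifiers so names can be interpolated safely.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def grant_statements(database: str, user: str) -> List[str]:
    """Build GRANT statements for a loader account (assumed minimal set)."""
    for name in (database, user):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return [
        # Read and write access to every table in the database.
        f"GRANT SELECT, INSERT ON {database}.* TO {user}",
        # Temporary tables are a global-level privilege in ClickHouse.
        f"GRANT CREATE TEMPORARY TABLE ON *.* TO {user}",
    ]
```

Calling `grant_statements("repodb", "altrepo_loader")` returns two statements that an administrator could run through clickhouse-client.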

ALTRepo Uploader service

ALTRepo Uploader provides altrepod, a systemd daemon that uploads data in response to AMQP messages received from a RabbitMQ broker.

Altrepod uses service instances, each with its own configuration, to handle particular AMQP messages.

Configuration files

When installed from the RPM package, the systemd unit file is ready to be enabled in the usual way once the appropriate configuration files have been added: /etc/altrepod/config.json for altrepod itself and /etc/altrepod/services.d/%service_name%.json for each enabled service instance.

Configuration templates can be found in the /etc/altrepod directory.

Each service configuration file consists of 3 sections:

  1. Service behaviour configuration
  2. Database connection configuration
  3. RabbitMQ connection configuration

Secure connection to RabbitMQ

When connecting to RabbitMQ over SSL/TLS, a certificate file must be present on the host, and its path must be set in the configuration files accordingly.

The amqpfire utility

The repodb_amqpfire utility was added as a way to 'fire' a specific altrepod service. It sends AMQP messages with the appropriate payload using its own configuration file.

The list of supported services and options can be obtained by running the utility with the -h argument.

[user@host]$ repodb_amqpfire -h
[user@host]$ repodb_amqpfire -c amqpfire_config.json -s repo -p p10 2022-06-22

A configuration example can be found in the /usr/share/doc/altrepodb-%version%/ directory.

ALTRepo Uploader utilities

Most of the provided CLI tools share a common set of arguments. All of them have at least the -h option, which displays the usage message.

Configuration file

All CLI tools support configuration from a file passed with the -c, --config option. An example configuration file is config.ini.example.

[DEFAULT]
workers=10              # number of threads (if used by utility)

[LOGGING]
log_to_file=no          # controls logging to file
log_to_syslog=no        # controls logging to syslog
log_to_console=yes      # controls logging to console [stderr]
syslog_ident=altrepodb  # controls syslog identity

[DATABASE]
dbname=repodb           # database name
host=localhost          # ClickHouse server IP address
port=9000               # ClickHouse server port
user=default            # database user name
password=               # database user password

Note: Only the logging level can be managed by CLI options. Logging handlers are controlled only by the configuration file.
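The example file puts comments on the same line as values. As an illustration of how such a file can be read, the sketch below parses an excerpt with Python's standard configparser. It is an assumption that the tools parse the file this way; the point shown is that configparser strips trailing `#` comments only when `inline_comment_prefixes` is set.

```python
import configparser

# Excerpt of the example configuration, including inline comments.
SAMPLE = """\
[DEFAULT]
workers=10              # number of threads (if used by utility)

[LOGGING]
log_to_console=yes      # controls logging to console [stderr]

[DATABASE]
dbname=repodb           # database name
host=localhost
port=9000
password=               # database user password
"""

def read_config(text: str) -> configparser.ConfigParser:
    """Parse INI text, treating ' # ...' after a value as a comment."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(text)
    return parser

cfg = read_config(SAMPLE)
db_port = cfg.getint("DATABASE", "port")
workers = cfg.getint("DATABASE", "workers")  # inherited from [DEFAULT]
log_console = cfg.getboolean("LOGGING", "log_to_console")
```

Without `inline_comment_prefixes`, the comment text would become part of each value and calls such as `getint` would fail on lines like `workers`.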

Command line tools

repo_loader

The utility uploads the state of a branch repository from the file system to the database. Check the usage message with the command:

[user@host]$ repo_loader -h

Usage example:

[user@host]$ repo_loader sisyphus /archive/repo/sisyphus/date/2021/08/18 --date 2021-08-18 -c config.ini --tag test_load -v
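When loading many snapshots, the command line can be assembled programmatically. The sketch below assumes the archive layout seen in the example above (`<root>/<branch>/date/YYYY/MM/DD`) and uses only the options shown there; the helper names are illustrative and not part of the project.

```python
import datetime
from pathlib import PurePosixPath
from typing import List

def snapshot_path(archive_root: str, branch: str, day: datetime.date) -> PurePosixPath:
    """Build the snapshot directory for a branch and date (assumed layout)."""
    return (PurePosixPath(archive_root) / branch / "date"
            / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}")

def repo_loader_argv(branch: str, day: datetime.date, config: str,
                     archive_root: str = "/archive/repo") -> List[str]:
    """Assemble a repo_loader command line as an argument list."""
    return [
        "repo_loader", branch,
        str(snapshot_path(archive_root, branch, day)),
        "--date", day.isoformat(),
        "-c", config,
    ]

argv = repo_loader_argv("sisyphus", datetime.date(2021, 8, 18), "config.ini")
```

The list can be passed to `subprocess.run(argv, check=True)` on a host where repo_loader is installed.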

task_loader

The utility uploads the state of a build task from the file system to the database. Check the usage message with the command:

[user@host]$ task_loader -h

Usage example:

[user@host]$ task_loader /archive/tasks/done/_276/283337 -c config.ini -f -D
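In the example path, the directory `_276` is consistent with the task ID divided by 1024 (283337 // 1024 = 276). The sketch below builds a task directory under that assumption; the rule is inferred from this single example rather than stated by the project, so verify it against the actual archive before relying on it.

```python
from pathlib import PurePosixPath

def task_dir(root: str, task_id: int) -> PurePosixPath:
    """Build a task directory path, assuming buckets of 1024 task IDs."""
    if task_id < 0:
        raise ValueError("task_id must be non-negative")
    bucket = task_id // 1024  # assumed bucketing rule, inferred from the example
    return PurePosixPath(root) / f"_{bucket}" / str(task_id)

example = task_dir("/archive/tasks/done", 283337)
```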

image_loader

The utility uploads the contents of an ALT Linux distribution image in ISO, TAR, IMG, or QCOW2 format to the database. Check the usage message with the command:

[user@host]$ image_loader -h

Usage example:

[user@host]$ image_loader alt-p10-opennebula-x86_64.qcow2 --branch p10 --edition cloud --version 10.0.0 --release release --platform "" --variant install --flavor opennebula --arch x86_64 --date 2022-02-10 --url "http://ftp.altlinux.org/%PATH_TO_IMAGE%" --type qcow -c config.ini --debug

acl_loader

The utility uploads ALT Linux maintainers' ACLs to the database. Check the usage message with the command:

[user@host]$ acl_loader -h

beehive_loader

The utility uploads Beehive package build results to the database. Check the usage message with the command:

[user@host]$ beehive_loader -h

bugzilla

The utility uploads Bugzilla issues to the database. Check the usage message with the command:

[user@host]$ bugzilla -h

repocop_loader

The utility uploads Repocop package inspection results to the database. Check the usage message with the command:

[user@host]$ repocop_loader -h

watch_loader

The utility uploads package version updates from Watch to the database. Check the usage message with the command:

[user@host]$ watch_loader -h

spdx_loader

The utility uploads license information from the SPDX Git repository to the database. Check the usage message with the command:

[user@host]$ spdx_loader -h

Code style

The project currently uses black for code formatting and flake8 as a linter, with configuration defined in the setup.cfg file.

Afternote

ALTRepo Uploader is under continuous development.

Functionality, database structure, and code structure change rapidly.

Check the changelog and Git history for details.