GCP Data Engineering VM with Forgejo
This project sets up a reusable Google Cloud Platform (GCP) virtual machine for data engineering and machine learning projects, with Forgejo as the first installed module.
Technologies Used
- Terraform: For infrastructure provisioning and state management
- Cloud-init: For initial setup and configuration of the Debian VM
- Ansible: For configuration management and application deployment
- Docker & Docker Compose: For containerized applications
- Forgejo: Open-source, self-hosted Git server, reachable through a DuckDNS subdomain
Project Structure
.
├── ansible/                          # Ansible playbooks and configurations
│   ├── forgejo.yml                   # Forgejo installation playbook
│   ├── inventory.yml                 # Ansible inventory file
│   └── templates/                    # Jinja2 templates for configurations
│       ├── app.ini.j2                # Forgejo app configuration
│       ├── docker-compose.yml.j2     # Docker Compose template
│       ├── nginx.conf.j2             # Nginx configuration with SSL support
│       └── certbot-renew.j2          # SSL certificate renewal script
├── cloud-init/                       # Cloud-init scripts
│   └── cloud-init.sh                 # Initial VM setup script
├── terraform/                        # Terraform configurations
│   ├── main.tf                       # Main Terraform configuration
│   ├── variables.tf                  # Variable definitions
│   └── terraform.tfvars.example      # Example variable values
├── deploy.sh                         # Deployment automation script
└── README.md                         # This file
Prerequisites
- Google Cloud Platform account with a project
- Terraform installed locally
- Ansible installed locally
- SSH key pair for VM access
- DuckDNS subdomain for Forgejo access
Setup Instructions
1. Configure Terraform Variables
Copy the example variables file and edit it with your GCP project details:
cp terraform/terraform.tfvars.example terraform/terraform.tfvars
Edit terraform/terraform.tfvars with your GCP project ID, preferred region/zone, and SSH key path.
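Before applying, it can help to confirm that the variables file actually assigns everything it should. The helper below is a minimal sketch, not part of this repository: `missing_tfvars` is a hypothetical name, and the variable names in the usage line (`project_id`, `region`, `zone`, `ssh_public_key_path`) are assumptions, so check `terraform/variables.tf` for the real ones.

```shell
# Print every variable name that has no "name = ..." assignment in a tfvars file.
# $1 = path to the tfvars file; remaining arguments = variable names to look for.
missing_tfvars() {
  file=$1
  shift
  for name in "$@"; do
    # A variable counts as set when a line starts with: name =
    if ! grep -Eq "^[[:space:]]*${name}[[:space:]]*=" "$file"; then
      echo "$name"
    fi
  done
}

# Usage (variable names are assumed, adjust them to variables.tf):
#   missing_tfvars terraform/terraform.tfvars project_id region zone ssh_public_key_path
```

An empty result means every listed name has an assignment; any printed name still needs a value.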
2. Configure Ansible Inventory
Edit ansible/inventory.yml to set your DuckDNS subdomain and admin email for Let's Encrypt SSL certificates.
3. Run the Deployment Script
./deploy.sh
This script will:
- Initialize and apply Terraform configuration to create the VM
- Update the Ansible inventory with the VM's IP address
- Wait for the VM to be ready
- Run the Ansible playbook to install and configure Forgejo
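For orientation, the four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of deploy.sh: the Terraform output name `vm_ip` is assumed, `retry` and `deploy` are hypothetical helper names, and the sketch passes the address to Ansible with `-e ansible_host=...` instead of rewriting ansible/inventory.yml as the real script does.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND...
# Runs COMMAND until it succeeds, giving up after ATTEMPTS tries.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

deploy() {
  # 1. Create the VM.
  terraform -chdir=terraform init || return 1
  terraform -chdir=terraform apply -auto-approve || return 1

  # 2. Read the VM address ("vm_ip" is an assumed output name).
  vm_ip=$(terraform -chdir=terraform output -raw vm_ip) || return 1

  # 3. Wait until SSH answers: up to 30 tries, 10 seconds apart.
  retry 30 10 ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=accept-new \
    "debian@$vm_ip" true || return 1

  # 4. Run the playbook, overriding the host address on the command line.
  ansible-playbook -i ansible/inventory.yml -e "ansible_host=$vm_ip" ansible/forgejo.yml
}

# Usage: call deploy from the repository root.
```

The retry loop matters because a freshly created VM usually accepts SSH connections only some time after Terraform reports it as created.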
Accessing Forgejo
Once deployment is complete, Forgejo will be accessible at:
https://your-duckdns-subdomain.duckdns.org
The deployment automatically configures:
- HTTPS with Let's Encrypt SSL certificates
- Automatic HTTP to HTTPS redirection
- Weekly SSL certificate renewal
- SQLite with WAL mode for improved performance
SSH access to the Forgejo Git server will be available on port 222:
ssh -p 222 git@your-duckdns-subdomain.duckdns.org
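Because Git's short `git@host:owner/repo` syntax has no place for a port, repositories on port 222 are usually cloned with an `ssh://` URL. The helper below is a small sketch of that URL shape; `forgejo_ssh_url` is a hypothetical name, and the owner and repository in the usage line are placeholders.

```shell
# Build an ssh:// clone URL for a repository served on port 222.
# $1 = host, $2 = owner, $3 = repository name (without .git)
forgejo_ssh_url() {
  printf 'ssh://git@%s:222/%s/%s.git\n' "$1" "$2" "$3"
}

# Usage (owner and repository are placeholders):
#   git clone "$(forgejo_ssh_url your-duckdns-subdomain.duckdns.org alice demo)"
```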
Performance Optimizations
SQLite WAL Mode
The deployment configures SQLite to use Write-Ahead Logging (WAL) mode for better performance. This provides:
- Improved write performance with concurrent operations
- Better read concurrency (readers and the writer don't block each other)
- Reduced disk I/O
- Improved durability and crash recovery
To modify this setting, edit the ansible/templates/app.ini.j2 file:
[database]
SQLITE_JOURNAL_MODE = WAL
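To confirm which journal mode a rendered configuration actually carries, the value can be read back from the INI file. The function below is a minimal sketch: `ini_get` is a hypothetical helper, and the app.ini path in the usage line is an assumption about where the rendered template ends up on the VM.

```shell
# Print the value of KEY inside [SECTION] of an INI file.
# $1 = file, $2 = section name, $3 = key
ini_get() {
  awk -v section="$2" -v key="$3" '
    # Section header: remember the name between the brackets.
    /^[ \t]*\[/ {
      gsub(/^[ \t]*\[|\][ \t]*$/, "")
      current = $0
      next
    }
    # Inside the wanted section: split each line at the first "=".
    current == section {
      pos = index($0, "=")
      if (pos == 0) next
      k = substr($0, 1, pos - 1)
      v = substr($0, pos + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      if (k == key) { print v; exit }
    }
  ' "$1"
}

# Usage on the VM (the path is an assumption):
#   sudo sh -c '. ./ini_get.sh; ini_get /opt/forgejo/data/gitea/conf/app.ini database SQLITE_JOURNAL_MODE'
```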
Customization
- Machine Type: Edit terraform/terraform.tfvars to change the VM size
- Forgejo Configuration: Modify ansible/templates/app.ini.j2 to customize Forgejo settings
- Additional Applications: Add new Ansible playbooks in the ansible directory
Maintenance
- Terraform State: The Terraform state is stored locally in terraform/terraform.tfstate
- VM Updates: SSH into the VM and run standard Debian update commands
- Forgejo Updates:
  - Automatic updates: Forgejo is configured to automatically check for and install updates weekly
  - Manual updates: Update the Docker image version in ansible/templates/docker-compose.yml.j2
- SSL Certificate Renewal: Certificates are automatically renewed weekly via a cron job
- Database Backups: To back up the Forgejo database, copy /opt/forgejo/data/gitea/gitea.db* from the VM
- Monitoring: Check the status of services with:
ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo systemctl status nginx"
ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo docker ps"
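The database backup item above can be turned into a small script that bundles gitea.db together with its -wal and -shm sidecar files. This is a sketch under assumptions: `backup_forgejo_db` is a hypothetical helper, the destination directory is a placeholder, and copying a live SQLite database may capture an in-between state, so stopping the forgejo container first is the safer route.

```shell
# Archive gitea.db and its sidecar files into a timestamped tarball.
# $1 = directory containing gitea.db*, $2 = directory to write the archive into
# Prints the path of the archive on success.
backup_forgejo_db() {
  src=$1
  dest=$2
  stamp=$(date +%Y%m%d-%H%M%S)
  archive="$dest/forgejo-db-$stamp.tar.gz"
  mkdir -p "$dest" || return 1
  # The glob is expanded inside the source directory, so the archive
  # stores bare file names rather than full paths.
  if ! (cd "$src" && tar -czf - gitea.db*) > "$archive"; then
    rm -f "$archive"
    return 1
  fi
  echo "$archive"
}

# Usage on the VM (destination directory is a placeholder):
#   sudo docker stop forgejo
#   backup_forgejo_db /opt/forgejo/data/gitea /opt/forgejo/backups
#   sudo docker start forgejo
```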
Auto-Update System
The deployment includes an automatic update system for Forgejo with the following features:
- Weekly Updates: Checks for and installs new Forgejo versions every Sunday at 3:00 AM
- Pre-Update Backups: Creates a backup of critical data before performing updates
- Rollback Capability: Maintains backups to allow manual rollback if needed
- Update Logs: Detailed logs of update operations at /opt/forgejo/logs/update.log
- Failure Handling: Proper error detection and reporting
To manually trigger an update check:
ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo /opt/forgejo/scripts/update-forgejo.sh"
To view update logs:
ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo cat /opt/forgejo/logs/update.log"
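For reference, "every Sunday at 3:00 AM" corresponds to the cron schedule `0 3 * * 0`. The sketch below prints an entry in /etc/cron.d format for that schedule. It assumes the update is driven by cron and runs as root, which this README does not state; `update_cron_line` is a hypothetical name, while the script and log paths are the ones given above.

```shell
# Print a cron.d-style entry for the weekly update.
# Fields: minute hour day-of-month month day-of-week user command
update_cron_line() {
  printf '0 3 * * 0 root %s >> %s 2>&1\n' \
    /opt/forgejo/scripts/update-forgejo.sh \
    /opt/forgejo/logs/update.log
}

# Usage: compare the output with what is installed on the VM.
#   update_cron_line
```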
Troubleshooting
- SSH Access: ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS
- Logs:
  - Forgejo logs: ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo docker logs forgejo"
  - Nginx logs: ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo tail -f /var/log/nginx/error.log"
  - SSL certificate logs: ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo tail -f /var/log/letsencrypt/letsencrypt.log"
- Connectivity: Ensure GCP firewall rules allow traffic on ports 80, 443, 22, 3000, and 222
- SSL Issues:
  - Check certificate status: ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo certbot certificates"
  - Force certificate renewal: ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo /etc/cron.weekly/certbot-renew"
- Database Performance: If experiencing slowdowns, check SQLite WAL mode is working:
ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo ls -la /opt/forgejo/data/gitea/gitea.db*"
You should see additional files like gitea.db-wal and gitea.db-shm if WAL mode is active.
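For the connectivity item, a quick probe from your own machine shows which of the listed ports accept TCP connections. This is a minimal sketch that relies on `bash` and the coreutils `timeout` command being installed; `port_open` and `check_ports` are hypothetical helper names. A port reported as closed or filtered may be blocked by a firewall rule or may simply have no service listening.

```shell
# Succeeds if a TCP connection to host:port can be opened within 3 seconds.
# $1 = host, $2 = port
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Report the state of each port on a host.
# $1 = host; remaining arguments = ports to probe
check_ports() {
  host=$1
  shift
  for port in "$@"; do
    if port_open "$host" "$port"; then
      echo "$port open"
    else
      echo "$port closed-or-filtered"
    fi
  done
}

# Usage:
#   check_ports VM_IP_ADDRESS 80 443 22 3000 222
```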