# GCP Data Engineering VM with Forgejo

This project sets up a reusable Google Cloud Platform (GCP) virtual machine for data engineering and machine learning projects, with Forgejo as the first installed module.

## Technologies Used

- **Terraform**: For infrastructure provisioning and state management
- **Cloud-init**: For initial VM setup and configuration with Debian
- **Ansible**: For configuration management and application deployment
- **Docker & Docker Compose**: For containerized applications
- **Forgejo**: Open-source Git forge, accessible through a DuckDNS subdomain

## Project Structure

```
.
├── ansible/                          # Ansible playbooks and configurations
│   ├── forgejo.yml                   # Forgejo installation playbook
│   ├── inventory.yml                 # Ansible inventory file
│   └── templates/                    # Jinja2 templates for configurations
│       ├── app.ini.j2                # Forgejo app configuration
│       ├── docker-compose.yml.j2     # Docker Compose template
│       ├── nginx.conf.j2             # Nginx configuration with SSL support
│       └── certbot-renew.j2          # SSL certificate renewal script
├── cloud-init/                       # Cloud-init scripts
│   └── cloud-init.sh                 # Initial VM setup script
├── terraform/                        # Terraform configurations
│   ├── main.tf                       # Main Terraform configuration
│   ├── variables.tf                  # Variable definitions
│   └── terraform.tfvars.example      # Example variable values
├── deploy.sh                         # Deployment automation script
└── README.md                         # This file
```

## Prerequisites

1. Google Cloud Platform account with a project
2. Terraform installed locally
3. Ansible installed locally
4. SSH key pair for VM access
5. DuckDNS subdomain for Forgejo access

## Setup Instructions

### 1. Configure Terraform Variables

Copy the example variables file and edit it with your GCP project details:

```bash
cp terraform/terraform.tfvars.example terraform/terraform.tfvars
```

Edit `terraform/terraform.tfvars` with your GCP project ID, preferred region/zone, and SSH key path.

### 2. Configure Ansible Inventory

Edit `ansible/inventory.yml` to set your DuckDNS subdomain and the admin email used for Let's Encrypt SSL certificates.

### 3. Run the Deployment Script

```bash
./deploy.sh
```

This script will:

- Initialize and apply the Terraform configuration to create the VM
- Update the Ansible inventory with the VM's IP address
- Wait for the VM to be ready
- Run the Ansible playbook to install and configure Forgejo

## Accessing Forgejo

Once deployment is complete, Forgejo will be accessible at:

```
https://your-duckdns-subdomain.duckdns.org
```

The deployment automatically configures:

- HTTPS with Let's Encrypt SSL certificates
- Automatic HTTP to HTTPS redirection
- Weekly SSL certificate renewal
- SQLite with WAL mode for improved performance

SSH access to the Forgejo Git server will be available on port 222:

```bash
ssh -p 222 git@your-duckdns-subdomain.duckdns.org
```

## Performance Optimizations

### SQLite WAL Mode

The deployment configures SQLite to use Write-Ahead Logging (WAL) mode for better performance.
This provides:

- Improved write performance with concurrent operations
- Better read concurrency (readers don't block writers)
- Reduced disk I/O
- Improved durability and crash recovery

To modify this setting, edit the `ansible/templates/app.ini.j2` file:

```ini
[database]
SQLITE_JOURNAL_MODE = WAL
```

## Customization

- **Machine Type**: Edit `terraform/terraform.tfvars` to change the VM size
- **Forgejo Configuration**: Modify `ansible/templates/app.ini.j2` to customize Forgejo settings
- **Additional Applications**: Add new Ansible playbooks in the `ansible` directory

## Maintenance

- **Terraform State**: The Terraform state is stored locally in `terraform/terraform.tfstate`
- **VM Updates**: SSH into the VM and run standard Debian update commands
- **Forgejo Updates**:
  - Automatic updates: Forgejo is configured to automatically check for and install updates weekly
  - Manual updates: Update the Docker image version in `ansible/templates/docker-compose.yml.j2`
- **SSL Certificate Renewal**: Certificates are automatically renewed weekly via a cron job
- **Database Backups**: To back up the Forgejo database, copy `/opt/forgejo/data/gitea/gitea.db*` from the VM
- **Monitoring**: Check the status of services with:

  ```bash
  ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo systemctl status nginx"
  ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo docker ps"
  ```

## Auto-Update System

The deployment includes an automatic update system for Forgejo with the following features:

- **Weekly Updates**: Checks for and installs new Forgejo versions every Sunday at 3:00 AM
- **Pre-Update Backups**: Creates a backup of critical data before performing updates
- **Rollback Capability**: Maintains backups to allow manual rollback if needed
- **Update Logs**: Detailed logs of update operations at `/opt/forgejo/logs/update.log`
- **Failure Handling**: Proper error detection and reporting

To manually trigger an update check:

```bash
ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo /opt/forgejo/scripts/update-forgejo.sh"
```

To view update logs:

```bash
ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo cat /opt/forgejo/logs/update.log"
```

## Troubleshooting

- **SSH Access**: `ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS`
- **Logs**:
  - Forgejo logs: `ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo docker logs forgejo"`
  - Nginx logs: `ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo tail -f /var/log/nginx/error.log"`
  - SSL certificate logs: `ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo tail -f /var/log/letsencrypt/letsencrypt.log"`
- **Connectivity**: Ensure GCP firewall rules allow traffic on ports 80, 443, 22, 3000, and 222
- **SSL Issues**:
  - Check certificate status: `ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo certbot certificates"`
  - Force certificate renewal: `ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo /etc/cron.weekly/certbot-renew"`
- **Database Performance**: If experiencing slowdowns, check SQLite WAL mode is working:

  ```bash
  ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo ls -la /opt/forgejo/data/gitea/gitea.db*"
  ```

  You should see additional files like `gitea.db-wal` and `gitea.db-shm` if WAL mode is active.
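
The WAL check above relies on reading an `ls` listing by eye. As a minimal sketch, the same test can be wrapped in a small shell function that reports the result explicitly. The function name `check_wal` and its status strings are hypothetical and not part of this repository. Only the database path comes from this README. The sidecar files are typically present only while a process (here, the running Forgejo container) has the database open, so a negative result on a stopped instance is not conclusive.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): report whether the
# SQLite WAL sidecar files exist next to a given database file.
# Usage: check_wal /opt/forgejo/data/gitea/gitea.db
check_wal() {
    db="$1"

    # The database itself must exist before the sidecar check means anything.
    if [ ! -f "$db" ]; then
        echo "missing: $db"
        return 2
    fi

    # In WAL mode SQLite keeps a write-ahead log (-wal) and a
    # shared-memory index (-shm) alongside the main database file.
    if [ -f "${db}-wal" ] && [ -f "${db}-shm" ]; then
        echo "wal-active"
        return 0
    fi

    echo "wal-not-detected"
    return 1
}
```

To use it, define the function in a root shell on the VM and run `check_wal /opt/forgejo/data/gitea/gitea.db`. The exit status (0, 1, or 2) makes it usable from a cron job or monitoring script as well.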