# GCP Data Engineering VM with Forgejo

This project sets up a reusable Google Cloud Platform (GCP) virtual machine for data engineering and machine learning projects, with Forgejo as the first installed module.

## Technologies Used

- **Terraform**: For infrastructure provisioning and state management
- **Cloud-init**: For initial setup and configuration of the Debian VM
- **Ansible**: For configuration management and application deployment
- **Docker & Docker Compose**: For containerized applications
- **Forgejo**: Open-source, self-hosted Git service, reachable through a DuckDNS subdomain

## Project Structure

```
.
├── ansible/                      # Ansible playbooks and configurations
│   ├── forgejo.yml               # Forgejo installation playbook
│   ├── inventory.yml             # Ansible inventory file
│   └── templates/                # Jinja2 templates for configurations
│       ├── app.ini.j2            # Forgejo app configuration
│       ├── docker-compose.yml.j2 # Docker Compose template
│       ├── nginx.conf.j2         # Nginx configuration with SSL support
│       └── certbot-renew.j2      # SSL certificate renewal script
├── cloud-init/                   # Cloud-init scripts
│   └── cloud-init.sh             # Initial VM setup script
├── terraform/                    # Terraform configurations
│   ├── main.tf                   # Main Terraform configuration
│   ├── variables.tf              # Variable definitions
│   └── terraform.tfvars.example  # Example variable values
├── deploy.sh                     # Deployment automation script
└── README.md                     # This file
```

## Prerequisites

1. Google Cloud Platform account with a project
2. Terraform installed locally
3. Ansible installed locally
4. SSH key pair for VM access
5. DuckDNS subdomain for Forgejo access

## Setup Instructions

### 1. Configure Terraform Variables

Copy the example variables file:

```bash
cp terraform/terraform.tfvars.example terraform/terraform.tfvars
```

Then edit `terraform/terraform.tfvars` with your GCP project ID, preferred region/zone, and SSH key path.

### 2. Configure Ansible Inventory

Edit `ansible/inventory.yml` to set your DuckDNS subdomain and the admin email used for Let's Encrypt SSL certificates.

### 3. Run the Deployment Script

```bash
./deploy.sh
```

This script will:

- Initialize and apply the Terraform configuration to create the VM
- Update the Ansible inventory with the VM's IP address
- Wait for the VM to be ready
- Run the Ansible playbook to install and configure Forgejo
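
The steps above could be sketched roughly as follows. This is an illustration, not the contents of the project's `deploy.sh`: the Terraform output name `vm_ip`, the `wait_until` helper, and the `--run` flag are hypothetical. The sketch also passes the address to Ansible as an extra variable instead of rewriting `ansible/inventory.yml`, which is a simplification of the inventory update step.

```bash
#!/usr/bin/env bash
# Sketch of the deployment flow; not the project's actual deploy.sh.
set -euo pipefail

# Retry a command until it succeeds or the attempt budget is used up.
# Usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
wait_until() {
  local attempts=$1 delay=$2
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

deploy() {
  # 1. Create the VM.
  terraform -chdir=terraform init
  terraform -chdir=terraform apply -auto-approve

  # 2. Read the VM address. "vm_ip" is a hypothetical output name.
  local vm_ip
  vm_ip=$(terraform -chdir=terraform output -raw vm_ip)

  # 3. Wait until SSH answers (up to 30 attempts, 10 seconds apart).
  #    accept-new trusts the host key of the freshly created VM on first contact.
  wait_until 30 10 ssh -o BatchMode=yes -o ConnectTimeout=5 \
    -o StrictHostKeyChecking=accept-new "debian@${vm_ip}" true

  # 4. Install and configure Forgejo, overriding the host address.
  ansible-playbook -i ansible/inventory.yml ansible/forgejo.yml \
    -e "ansible_host=${vm_ip}"
}

# Run the deployment only when explicitly requested.
if [[ "${1:-}" == "--run" ]]; then
  deploy
fi
```

Keeping the retry loop in its own function makes the "wait for the VM" step easy to tune: the attempt count and delay are plain arguments.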

## Accessing Forgejo

Once deployment is complete, Forgejo is accessible at:

```
https://your-duckdns-subdomain.duckdns.org
```

The deployment automatically configures:

- HTTPS with Let's Encrypt SSL certificates
- Automatic HTTP to HTTPS redirection
- Weekly SSL certificate renewal
- SQLite with WAL mode for improved performance
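
One way to confirm the HTTP to HTTPS redirection from your own machine is to request the plain HTTP URL and inspect the response without following it. The sketch below assumes `curl` is installed; the function names `redirects_to_https` and `check_https_redirect` are hypothetical helpers, not part of this project.

```bash
#!/usr/bin/env bash
# Sketch: verify that plain HTTP redirects to HTTPS.

# Succeeds when a "<status> <redirect_url>" pair describes a redirect
# (301, 302, 307, or 308) whose target is an https:// URL.
redirects_to_https() {
  local status="${1%% *}" target="${1#* }"
  case "$status" in
    301|302|307|308) [[ "$target" == https://* ]] ;;
    *) return 1 ;;
  esac
}

# Ask curl for the status code and redirect target of http://<host>/.
check_https_redirect() {
  local host="$1" result
  result=$(curl -sS -o /dev/null --max-time 10 \
    -w '%{http_code} %{redirect_url}' "http://${host}/") || return 1
  redirects_to_https "$result"
}

# Example:
# check_https_redirect your-duckdns-subdomain.duckdns.org && echo "redirect OK"
```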

SSH access to the Forgejo Git server is available on port 222:

```bash
ssh -p 222 git@your-duckdns-subdomain.duckdns.org
```
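
Because Git over SSH uses port 222 rather than 22, clone URLs need the `ssh://` form, which can carry a port; the shorter `git@host:owner/repo.git` form cannot. A small helper along these lines could build such URLs. The function name `forgejo_clone_url` and the owner and repository names `alice` and `demo` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: build an SSH clone URL for a Git server on a non-default port.
# Usage: forgejo_clone_url <host> <owner> <repo> [port]
forgejo_clone_url() {
  local host="$1" owner="$2" repo="$3" port="${4:-222}"
  printf 'ssh://git@%s:%s/%s/%s.git\n' "$host" "$port" "$owner" "$repo"
}

# Example with hypothetical owner and repository names:
# git clone "$(forgejo_clone_url your-duckdns-subdomain.duckdns.org alice demo)"
```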

## Performance Optimizations

### SQLite WAL Mode

The deployment configures SQLite to use Write-Ahead Logging (WAL) mode for better performance. This provides:

- Faster writes in most workloads, because changes are appended sequentially to a log file
- Better concurrency: readers do not block the writer, and the writer does not block readers (SQLite still allows only one writer at a time)
- Reduced disk I/O, since fewer sync operations are needed
- Automatic crash recovery: SQLite replays the log the next time the database is opened

To modify this setting, edit the `ansible/templates/app.ini.j2` file:

```ini
[database]
SQLITE_JOURNAL_MODE = WAL
```

## Customization

- **Machine Type**: Edit `terraform/terraform.tfvars` to change the VM size
- **Forgejo Configuration**: Modify `ansible/templates/app.ini.j2` to customize Forgejo settings
- **Additional Applications**: Add new Ansible playbooks in the `ansible` directory

## Maintenance

- **Terraform State**: The Terraform state is stored locally in `terraform/terraform.tfstate`
- **VM Updates**: SSH into the VM and run the standard Debian update commands (`sudo apt update && sudo apt upgrade`)
- **Forgejo Updates**: Update the Docker image version in `ansible/templates/docker-compose.yml.j2`
- **SSL Certificate Renewal**: Certificates are automatically renewed weekly via a cron job
- **Database Backups**: To back up the Forgejo database, copy `/opt/forgejo/data/gitea/gitea.db*` from the VM
- **Monitoring**: Check the status of services with:

  ```bash
  ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo systemctl status nginx"
  ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo docker ps"
  ```
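
The database backup mentioned above could look roughly like this. Copying a SQLite database while it is being written can produce an inconsistent snapshot, so this sketch stops the container first. It is not part of the project's scripts: the helper names and the backup directory naming are hypothetical, the container name `forgejo` is taken from the `docker logs forgejo` command in this README, and `sudo` is assumed to be needed to read the files.

```bash
#!/usr/bin/env bash
# Sketch of a database backup; not part of the project's scripts.
VM_IP="${VM_IP:-VM_IP_ADDRESS}"          # placeholder used in this README
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_rsa}"

# Build a backup directory name from a timestamp (hypothetical naming scheme).
backup_dir_name() {
  printf 'forgejo-backup-%s\n' "$1"
}

backup_forgejo_db() {
  local dest status=0
  dest=$(backup_dir_name "$(date +%Y%m%d-%H%M%S)")
  mkdir -p "$dest"

  # Stop Forgejo so nothing writes to the database during the copy.
  ssh -i "$SSH_KEY" "debian@${VM_IP}" "sudo docker stop forgejo" || return 1

  # Stream gitea.db* (including any -wal and -shm files) through tar.
  ssh -i "$SSH_KEY" "debian@${VM_IP}" \
    "sudo sh -c 'cd /opt/forgejo/data/gitea && tar -cf - gitea.db*'" \
    | tar -C "$dest" -xf - || status=$?

  # Start Forgejo again even if the copy failed.
  ssh -i "$SSH_KEY" "debian@${VM_IP}" "sudo docker start forgejo" || status=$?
  return "$status"
}

# Example: VM_IP=203.0.113.10 backup_forgejo_db
```

The short downtime is the price of a consistent copy; for a service that must stay up, a SQLite-level online backup would be the alternative to investigate.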

## Troubleshooting

- **SSH Access**: `ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS`
- **Logs**:
  - Forgejo logs: `ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo docker logs forgejo"`
  - Nginx logs: `ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo tail -f /var/log/nginx/error.log"`
  - SSL certificate logs: `ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo tail -f /var/log/letsencrypt/letsencrypt.log"`
- **Connectivity**: Ensure GCP firewall rules allow traffic on ports 80 (HTTP), 443 (HTTPS), 22 (VM SSH), 3000 (Forgejo web port), and 222 (Forgejo Git SSH)
- **SSL Issues**:
  - Check certificate status: `ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo certbot certificates"`
  - Force certificate renewal: `ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo /etc/cron.weekly/certbot-renew"`
- **Database Performance**: If you experience slowdowns, check that SQLite WAL mode is active:

  ```bash
  ssh -i ~/.ssh/id_rsa debian@VM_IP_ADDRESS "sudo ls -la /opt/forgejo/data/gitea/gitea.db*"
  ```

  You should see additional files such as `gitea.db-wal` and `gitea.db-shm` if WAL mode is active.
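
The WAL check above can also be scripted on the VM itself. The helper below is a hypothetical sketch that only tests for the two sidecar files. Their absence is not conclusive, because SQLite normally removes them once the last connection closes cleanly. If the `sqlite3` command-line tool happens to be installed, `PRAGMA journal_mode;` reports the mode directly.

```bash
#!/usr/bin/env bash
# Sketch: detect the WAL sidecar files next to a SQLite database file.
# Succeeds only when both <db>-wal and <db>-shm exist.
wal_files_present() {
  local db="$1"
  [[ -f "${db}-wal" && -f "${db}-shm" ]]
}

# Example, run on the VM (sudo is assumed to be needed to read the directory):
# sudo bash -c "$(declare -f wal_files_present); wal_files_present /opt/forgejo/data/gitea/gitea.db && echo 'WAL files present'"
#
# Direct check, assuming sqlite3 is installed on the VM:
# sudo sqlite3 /opt/forgejo/data/gitea/gitea.db 'PRAGMA journal_mode;'
```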