5 // Cloud Deployment
Once the services are created and configured in AWS, the virtual machine can be set up with the needed tools/libraries to run the data pipelines in Dagster.
Installing Tools and Libraries
Connect to the virtual machine and run the following commands to get everything set up:
- Install AWS CLI
  - Download via curl:
  - Install the unzip program:
  - Unzip the compressed folder:
  - Run the installer:
  - Set the default region:
  - Run an aws command such as secretsmanager to verify AWS connectivity:
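The AWS CLI steps above can be sketched roughly as follows. This assumes an x86_64 Ubuntu VM and the us-west-2 region used elsewhere in this guide. The function name install_aws_cli is only illustrative, and the download URL is the one published in the AWS CLI v2 install instructions; adjust both architecture and region to your environment.

```shell
#!/bin/bash
# Sketch only: install AWS CLI v2 on an x86_64 Ubuntu VM.
# The architecture and the default region are assumptions.
install_aws_cli() {
    region="${1:-us-west-2}"
    # Download the installer archive
    curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" || return 1
    # Install unzip, extract the archive, and run the installer
    sudo apt-get install -y unzip || return 1
    unzip -q awscliv2.zip || return 1
    sudo ./aws/install || return 1
    # Set the default region for later aws commands
    aws configure set region "$region" || return 1
    # Any read-only call works as a connectivity check
    aws secretsmanager list-secrets --max-results 1
}
# Usage: install_aws_cli us-west-2
```

The final call only succeeds if the instance has credentials (for example an attached IAM role) that allow listing secrets.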
- Clone Repository
  - Create a new directory:
  - Change into new directory:
  - Add the remote repository:
  - Edit the git config file to turn on sparse checkout:
  - Tell git which directory to check out. Then, pull that directory.
  - Pull the repo into the local directory
  - Verify that the card_data/ directory was created.
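The sparse checkout steps above can be sketched as a small helper. The function name sparse_checkout, its arguments, and the default branch name main are assumptions for illustration; the repository URL is deliberately left as a parameter.

```shell
#!/bin/bash
# Sketch only: check out a single directory of a remote repository.
# Arguments (all illustrative): <repo-url> <local-dir> <subdirectory> [branch]
sparse_checkout() (
    repo_url="$1"; local_dir="$2"; subdir="$3"; branch="${4:-main}"
    mkdir -p "$local_dir" || exit 1
    cd "$local_dir" || exit 1
    git init -q || exit 1
    git remote add origin "$repo_url" || exit 1
    # Turn on sparse checkout in the repository's git config
    git config core.sparseCheckout true || exit 1
    # Tell git which directory to check out
    mkdir -p .git/info
    echo "$subdir/" >> .git/info/sparse-checkout
    # Pull only that directory into the working tree
    git pull -q origin "$branch" || exit 1
    # Verify the directory was created
    [ -d "$subdir" ] && echo "$subdir/ checked out"
)
# Usage: sparse_checkout <repo-url> card_data card_data main
```

The body runs in a subshell, so the caller's working directory is left unchanged.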
- Install Tools
  - Install uv for Python:
  - Add to PATH:
  - Install libraries from pyproject.toml file:
  - Activate virtual environment:
  - Create dagster.yaml file (replace with correct password and hostname):
  - Set environment variables:
    echo 'export DAGSTER_HOME="$HOME/.dagster"' >> ~/.bashrc
    echo 'export SUPABASE_USER="<supabase_user>"' >> ~/.bashrc
    echo 'export SUPABASE_PASSWORD="<supabase_password>"' >> ~/.bashrc
  - Run source ~/.bashrc to load the variables in the current session.
- Verify Dagster and Connectivity
  - Run dg dev --host 0.0.0.0 --port 3000
  - In the browser, visit http://<ip-address-of-vm>:3000
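Before starting Dagster, it can help to confirm that the exported variables are actually visible in the current session. The helper below is a minimal sketch; the name require_env is hypothetical, and it relies on printenv, so it only sees variables that have been exported.

```shell
#!/bin/bash
# Sketch only: report required environment variables that are unset or empty.
# Exits non-zero if any are missing.
require_env() (
    missing=0
    for name in "$@"; do
        # printenv only reports exported variables, which is what Dagster will see
        if [ -z "$(printenv "$name")" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    exit "$missing"
)
# Usage: require_env DAGSTER_HOME SUPABASE_USER SUPABASE_PASSWORD && dg dev --host 0.0.0.0 --port 3000
```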
Automating Startup with systemd
Optional
To save on costs, the EC2 and RDS instances are scheduled to start and
stop once each day with AWS EventBridge. So that the Dagster web service comes back up after each scheduled start,
systemd, along with a couple of shell scripts, is used to launch it automatically.
Service Files
The card_data/infrastructure/ directory has the following files:
- dagster.service - the main systemd unit file, defining the Dagster service and its environment.
- wait-for-rds.sh - set as ExecStartPre in dagster.service to check whether the RDS instance is available.
- start-dagster.sh - runs once the RDS instance is ready and starts the Dagster web service.
Although the files are included in this repository, they need to be moved or created in a specific directory on the Linux virtual machine.
Prerequisites
Before copying or creating the scripts, ensure the following system tools are installed. These are required by the shell scripts:
- netcat (nc): Used by wait-for-rds.sh to check RDS availability
- jq: Used by start-dagster.sh to parse JSON from AWS Secrets Manager
For Debian/Ubuntu systems, install with:
For other platforms, use the appropriate package manager:
- RHEL/CentOS/Amazon Linux: sudo yum install -y nc jq or sudo dnf install -y nc jq
- macOS: brew install netcat jq
- Alpine Linux: apk add netcat-openbsd jq
Warning
Without these tools installed, the scripts will fail with errors like "command not found" when systemd attempts to run them.
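A quick way to catch this before systemd does is to check that both commands resolve on the PATH. The sketch below uses a hypothetical helper named missing_tools. The Debian/Ubuntu package names in the comment are an assumption and may differ by release.

```shell
#!/bin/bash
# Sketch only: print the name of each required tool that is not on the PATH.
missing_tools() (
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
)

# On Debian/Ubuntu, missing packages can likely be installed with
# (package names are an assumption):
#   sudo apt-get update && sudo apt-get install -y netcat-openbsd jq
missing="$(missing_tools nc jq)"
if [ -n "$missing" ]; then
    echo "Not installed: $missing"
else
    echo "All required tools found"
fi
```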
Copy Files
Copy or move the files from the checked-out repository to the proper directories on the Linux machine. The files must first
be edited to match project-specific configuration, such as the proper RDS instance name in wait-for-rds.sh:
cp card_data/card_data/infrastructure/wait-for-rds.sh /home/ubuntu/
cp card_data/card_data/infrastructure/start-dagster.sh /home/ubuntu/
sudo cp card_data/card_data/infrastructure/dagster.service /etc/systemd/system/
Create Files
The files can also be recreated by hand. Update the contents below with project-specific configuration, then
run the cat or tee commands listed below.
First, create dagster.service
- Run the following shell command to create the file (edit any differing details such as AWS region):
dagster.service
sudo tee /etc/systemd/system/dagster.service > /dev/null << 'EOF'
[Unit]
Description=Dagster Development Server
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=ubuntu
WorkingDirectory=/home/ubuntu/card_data/card_data
Environment="AWS_DEFAULT_REGION=us-west-2"
Environment="PATH=/home/ubuntu/card_data/card_data/.venv/bin:/usr/local/bin:/usr/bin:/bin"
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=read-only
ExecStartPre=/home/ubuntu/wait-for-rds.sh
ExecStart=/home/ubuntu/start-dagster.sh
Restart=on-failure
RestartSec=10
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
EOF
test -f /etc/systemd/system/dagster.service && echo "File created successfully"
Second, create wait-for-rds.sh
- Retrieve RDS instance name:
- Run the following shell command to create the file (replace with correct instance id):
wait-for-rds.sh
cat > /home/ubuntu/wait-for-rds.sh << 'EOF'
#!/bin/bash
MAX_TRIES=20
COUNT=0
RDS_HOST="<rds-instance-id>.<region>.rds.amazonaws.com"
RDS_PORT=5432
echo "Checking if RDS is available..."
while [ $COUNT -lt $MAX_TRIES ]; do
    if nc -z -w5 "$RDS_HOST" "$RDS_PORT" 2>/dev/null; then
        echo "RDS is available!"
        exit 0
    fi
    COUNT=$((COUNT + 1))
    echo "Attempt $COUNT/$MAX_TRIES - RDS not ready yet..."
    sleep 10
done
echo "RDS did not become available in time"
exit 1
EOF
Last, create start-dagster.sh
- Retrieve RDS secret name from Secrets Manager. AWS auto-creates a secret for RDS.
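One way to look up that secret name is to list all secret names and filter for the RDS-managed one. This is a sketch under two assumptions: that the auto-created secret name starts with the prefix rds! (the convention for RDS-managed secrets), and that the region is us-west-2 as elsewhere in this guide. The function name find_rds_secret is illustrative.

```shell
#!/bin/bash
# Sketch only: print the names of secrets that look RDS-managed.
# Assumes such secrets are named with an "rds!" prefix.
find_rds_secret() {
    aws secretsmanager list-secrets \
        --region "${1:-us-west-2}" \
        --query 'SecretList[].Name' \
        --output text | tr '\t' '\n' | grep '^rds!'
}
# Usage: find_rds_secret us-west-2
```

The printed name is what goes in place of the <correct-rds-secret> placeholder in the script.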
start-dagster.sh
cat > /home/ubuntu/start-dagster.sh << 'EOF'
#!/bin/bash
# Fetch secrets from AWS Secrets Manager
SUPABASE_SECRETS=$(aws secretsmanager get-secret-value \
    --secret-id supabase \
    --region us-west-2 \
    --query SecretString \
    --output text)
AWS_RDS_SECRETS_PW=$(aws secretsmanager get-secret-value \
    --secret-id '<correct-rds-secret>' \
    --region us-west-2 \
    --query SecretString \
    --output text)
AWS_RDS_SECRETS_HN=$(aws secretsmanager get-secret-value \
    --secret-id rds-hostname \
    --region us-west-2 \
    --query SecretString \
    --output text)
# Extract values
SUPABASE_PASSWORD=$(echo "$SUPABASE_SECRETS" | jq -r '.password')
if [ -z "$SUPABASE_PASSWORD" ] || [ "$SUPABASE_PASSWORD" = "null" ]; then
    echo "ERROR: missing SUPABASE_PASSWORD from supabase secret" >&2
    exit 1
fi
export SUPABASE_PASSWORD
SUPABASE_USER=$(echo "$SUPABASE_SECRETS" | jq -r '.user')
if [ -z "$SUPABASE_USER" ] || [ "$SUPABASE_USER" = "null" ]; then
    echo "ERROR: missing SUPABASE_USER from supabase secret" >&2
    exit 1
fi
export SUPABASE_USER
AWS_RDS_PASSWORD=$(echo "$AWS_RDS_SECRETS_PW" | jq -r '.password')
if [ -z "$AWS_RDS_PASSWORD" ] || [ "$AWS_RDS_PASSWORD" = "null" ]; then
    echo "ERROR: missing AWS_RDS_PASSWORD from RDS secret" >&2
    exit 1
fi
export AWS_RDS_PASSWORD
AWS_RDS_HOSTNAME=$(echo "$AWS_RDS_SECRETS_HN" | jq -r '.hostname')
if [ -z "$AWS_RDS_HOSTNAME" ] || [ "$AWS_RDS_HOSTNAME" = "null" ]; then
    echo "ERROR: missing AWS_RDS_HOSTNAME from rds-hostname secret" >&2
    exit 1
fi
export AWS_RDS_HOSTNAME
DAGSTER_HOME=/home/ubuntu/card_data/card_data/
export DAGSTER_HOME
# Activate the virtual environment
source /home/ubuntu/card_data/card_data/.venv/bin/activate
# Start Dagster
exec dg dev --host 0.0.0.0 --port 3000
EOF
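Files created with cat are not executable by default, and systemd cannot run a script referenced by ExecStartPre or ExecStart unless its executable bit is set. A minimal sketch of that step follows; the helper name make_executable is hypothetical, and the paths in the usage line match the ones used in this guide.

```shell
#!/bin/bash
# Sketch only: set the executable bit on each script that exists,
# and report any that are missing.
make_executable() (
    for script in "$@"; do
        if [ -f "$script" ]; then
            chmod +x "$script" && echo "executable: $script"
        else
            echo "not found: $script" >&2
        fi
    done
)
# Usage: make_executable /home/ubuntu/wait-for-rds.sh /home/ubuntu/start-dagster.sh
```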
Start Service
Apply, enable on boot, and start the service:
# Reload systemd to recognize the new service
sudo systemctl daemon-reload
# Enable it to start on boot
sudo systemctl enable dagster.service
# Start the service
sudo systemctl start dagster.service
Show the status of service running:
View live logs:
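Both checks can be done with the standard systemd tools. The wrappers below are a sketch; the function names dagster_status and dagster_logs are only illustrative, and the unit name dagster.service matches the file created earlier in this guide.

```shell
#!/bin/bash
# Sketch only: thin wrappers around systemctl and journalctl.

# Show whether the service is active, plus its most recent log lines
dagster_status() {
    sudo systemctl status dagster.service --no-pager
}

# Follow the service's journal output live (Ctrl+C to stop)
dagster_logs() {
    sudo journalctl -u dagster.service -f
}
# Usage: dagster_status
#        dagster_logs
```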