Problem: An Airflow task uses a Docker container to run Redshift queries. The task fails even though the query executes on Redshift.
Solution:
Make sure the Redshift connection is configured correctly in Airflow (connection string, username, password, and so on).
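As a minimal sketch of this check, assuming the connection is supplied to Airflow as a URI through an `AIRFLOW_CONN_<CONN_ID>` environment variable and read by a Postgres-type hook, the Python below builds such a URI and reports any missing parts. The helper names, the connection ID `redshift_default`, and all host and credential values are hypothetical and only illustrate the idea; special characters in the credentials are percent-encoded so the URI parses cleanly.

```python
from urllib.parse import quote, unquote, urlsplit


def build_conn_uri(user, password, host, database, port=5439, scheme="postgres"):
    """Build a connection URI, percent-encoding user, password and database."""
    return (
        f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


def check_conn_uri(uri):
    """Return a list of problems in the URI; an empty list means no part is missing."""
    parts = urlsplit(uri)
    problems = []
    if not parts.hostname:
        problems.append("missing host")
    if parts.port is None:
        problems.append("missing port")
    if not parts.username:
        problems.append("missing user")
    if not parts.password:
        problems.append("missing password")
    if not parts.path.strip("/"):
        problems.append("missing database")
    return problems


# Hypothetical values for illustration only; replace with your own settings.
uri = build_conn_uri(
    user="airflow_user",
    password="p@ss/word",
    host="example-cluster.example.com",
    database="dev",
)

# The variable name assumes a connection ID of "redshift_default".
print(f"AIRFLOW_CONN_REDSHIFT_DEFAULT={uri}")
print("problems:", check_conn_uri(uri))
```

The printed line could then be exported in the environment that runs the Airflow task. This only validates the shape of the URI; it does not confirm that the cluster is reachable or that the credentials are accepted.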
Make sure the Docker container is set up correctly to connect to the Redshift cluster. The following example Dockerfile can be used:
# Dockerfile
FROM python:3.6.8-slim-stretch
# Install basic tools and the PostgreSQL client
RUN apt-get update -y && \
    apt-get install -y wget && \
    apt-get install -y ca-certificates && \
    apt-get install -y postgresql-client
# Install the AWS CLI
RUN apt-get update && \
    apt-get install -y awscli
# Install Redshift ODBC Driver
RUN apt-get install -y libodbc1 odbc-postgresql unixodbc-dev
RUN wget https://s3.amazonaws.com/redshift-downloads/drivers/odbc/1.4.11.1000/AmazonRedshiftODBC-64-bit-1.4.11.1000-1.x86_64.deb && \
    dpkg -i AmazonRedshiftODBC-64-bit-1.4.11.1000-1.x86_64.deb
# Add DSN entry
ENV ODBCINI /etc/odbc.ini
ENV ODBCSYSINI /etc/
RUN echo "[My Redshift]" >> $ODBCINI && \
    echo "Driver=/opt/amazon/redshiftodbc/lib/64/libamazonredshiftodbc64.so" >> $ODBCINI && \
    echo "Description=Redshift" >> $ODBCINI && \
echo "Servername=
echo "Database=
    echo "PortNumber=5439" >> $ODBCINI && \
echo "UID=
echo "PWD=