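The fix is to use the BigQuery API: define the schema explicitly in the load job configuration, then load the data column by column against it. Example code: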
from google.cloud import bigquery  # BigQuery Python client library

client = bigquery.Client()  # create the BigQuery client

job_config = bigquery.LoadJobConfig(
    schema=[
        bigquery.SchemaField("column_name1", "STRING"),
        bigquery.SchemaField("column_name2", "INTEGER"),
    ],
    skip_leading_rows=1,  # skip the header row
    source_format=bigquery.SourceFormat.CSV,  # the source is a CSV file
)
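# Open in binary mode, as load_table_from_file expects a binary file object.
with open("path/to/file.csv", "rb") as source_file:
    job = client.load_table_from_file(
        source_file,
        "project_id.dataset.table_name",
        job_config=job_config,
    )

job.result()  # wait for the load job to finish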
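
One thing to watch for: with skip_leading_rows=1 the header row is discarded, and as far as I understand the CSV columns are then matched to the schema by position rather than by name. A quick local check before loading can catch a reordered or renamed column. The following is only a minimal sketch using the standard library; header_matches and EXPECTED_COLUMNS are hypothetical names introduced here, and the column names simply mirror the placeholders in the schema above.

```python
import csv
import io

# Hypothetical: the column names from the schema, in schema order.
EXPECTED_COLUMNS = ["column_name1", "column_name2"]


def header_matches(csv_text_stream, expected_columns):
    """Return True if the first CSV row equals the expected column names, in order."""
    reader = csv.reader(csv_text_stream)
    header = next(reader, None)  # None when the stream is empty
    if header is None:
        return False
    return [name.strip() for name in header] == list(expected_columns)


# In-memory sample standing in for the real file.
sample = io.StringIO("column_name1,column_name2\nfoo,1\n")
print(header_matches(sample, EXPECTED_COLUMNS))
```

For a real file, open it in text mode with newline="" (as the csv module recommends) and pass the file object in place of the StringIO sample.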