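AWS S3 is a cloud storage service. It stores objects as raw bytes, so it can hold files in any format. Below are examples of some file formats that might be used with S3: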
JSON, for example a single user record:
{ "name": "John Smith", "email": "john.smith@email.com", "phone": "123-456-7890" }
CSV, with a header row followed by one record per line:
Name,Email,Phone
John Smith,john.smith@email.com,123-456-7890
Jane Doe,jane.doe@email.com,987-654-3210
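The JSON object and the CSV rows shown above describe the same kind of record, so converting between the two is mostly a matter of mapping the header row to keys. A minimal sketch using only the Python standard library; the variable names are illustrative, and the data is held in a string here rather than fetched from S3:

```python
import csv
import io
import json

# The CSV sample as it would appear in a file: header row, then one record per line.
csv_text = """Name,Email,Phone
John Smith,john.smith@email.com,123-456-7890
Jane Doe,jane.doe@email.com,987-654-3210
"""

# DictReader uses the header row as keys; lower-case them to match the JSON sample.
records = [
    {key.lower(): value for key, value in row.items()}
    for row in csv.DictReader(io.StringIO(csv_text))
]

# Serialize each record as one JSON object per line.
json_lines = "\n".join(json.dumps(record) for record in records)
print(json_lines)
```

Writing one JSON object per line keeps each record independently parseable, which is convenient when a file is processed line by line.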
Parquet is an efficient columnar data storage format. The following example code reads a Parquet file from S3 into a pandas DataFrame:
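import pyarrow.parquet as pq

# Read the Parquet object from S3 into an Arrow table
# (requires AWS credentials and a pyarrow build with S3 support).
table = pq.read_table('s3://my-bucket/my-file.parquet')
# Convert the Arrow table to a pandas DataFrame.
df = table.to_pandas()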
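To make "columnar" concrete, the sketch below pivots the sample user records from a row-oriented layout into a column-oriented one in plain Python. It only illustrates the idea of grouping values by field; it is not Parquet's actual on-disk encoding, and the variable names are illustrative:

```python
# Row-oriented layout: one dict per record, as in JSON, CSV, or Avro.
rows = [
    {"name": "John Smith", "email": "john.smith@email.com", "phone": "123-456-7890"},
    {"name": "Jane Doe", "email": "jane.doe@email.com", "phone": "987-654-3210"},
]

# Column-oriented layout: one list per field, holding that field's value for every record.
columns = {field: [row[field] for row in rows] for field in rows[0]}

# Reading a single field now touches one list instead of every record.
emails = columns["email"]
print(emails)
```

Grouping values by field is what lets a columnar reader load only the columns a query needs.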
The following example code writes some user data to an Avro file and then reads it back:
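import avro.schema
from avro.datafile import DataFileReader, DataFileWriter
from avro.io import DatumReader, DatumWriter

# Load the schema. Current releases of the avro package name this function
# parse(); the older avro-python3 package spelled it Parse().
with open("user.avsc") as schema_file:
    schema = avro.schema.parse(schema_file.read())

# Write two user records; closing the writer also closes the underlying file.
writer = DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema)
writer.append({"name": "John Smith", "email": "john.smith@email.com", "phone": "123-456-7890"})
writer.append({"name": "Jane Doe", "email": "jane.doe@email.com", "phone": "987-654-3210"})
writer.close()

# Read the records back; the schema is stored in the Avro file itself.
reader = DataFileReader(open("users.avro", "rb"), DatumReader())
for user in reader:
    print(user)
reader.close()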
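The Avro example reads its schema from user.avsc, which is not shown. A minimal sketch of a schema that would fit the records being written, assuming all three fields are plain strings; the record name User and the namespace example.avro are placeholders, not taken from the original:

```python
import json
from pathlib import Path

# Hypothetical schema matching the name/email/phone records in the Avro example.
user_schema = {
    "type": "record",
    "name": "User",               # placeholder record name
    "namespace": "example.avro",  # placeholder namespace
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "email", "type": "string"},
        {"name": "phone", "type": "string"},
    ],
}

# Avro schema files are plain JSON, so the standard json module can write one.
Path("user.avsc").write_text(json.dumps(user_schema, indent=2))
```

Running this once before the Avro example creates the user.avsc file that the example opens.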