To compare the columns of two pandas DataFrames within a given tolerance, you can proceed as follows:
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [1.1, 2.2, 3.3], 'B': [3.9, 5.1, 6.3]})

def compare_dataframes(df1, df2, tolerance):
    # Element-wise difference; assumes both frames share the same index and columns
    diff = df1 - df2
    diff_abs = diff.abs()
    # True where the absolute difference is within the tolerance
    equal = diff_abs <= tolerance
    # Place the inputs and the derived frames side by side under labelled column groups
    comparison_df = pd.concat(
        [df1, df2, diff, diff_abs, equal],
        axis=1,
        keys=['df1', 'df2', 'Diff', 'Diff_Abs', 'Equal'],
    )
    return comparison_df

tolerance = 0.1
comparison_result = compare_dataframes(df1, df2, tolerance)
print(comparison_result)

The output shows the two inputs side by side, followed by the difference, the absolute difference, and the equality flag for each column (exact spacing may vary with the pandas version and display settings):

  df1     df2      Diff      Diff_Abs       Equal
    A  B    A    B    A    B        A    B      A      B
0   1  4  1.1  3.9 -0.1  0.1      0.1  0.1  False  False
1   2  5  2.2  5.1 -0.2 -0.1      0.2  0.1  False   True
2   3  6  3.3  6.3 -0.3 -0.3      0.3  0.3  False  False

In this example, every row of column 'A' and rows 0 and 2 of column 'B' differ by more than the tolerance of 0.1, so they are marked False. Only row 1 of 'B' (5 vs. 5.1) is within the tolerance and is marked True. Row 0 differs by a nominal 0.1 in both columns, but floating-point rounding puts the computed difference slightly above 0.1 (about 0.10000000000000009), so the <= test fails there.
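As an alternative, here is a minimal sketch of the same tolerance check using NumPy's np.isclose, which tests |left - right| <= atol + rtol * |right| element by element. The helper name within_tolerance and its parameters are hypothetical, chosen only for illustration. Because values such as 1.1 are not exactly representable in binary floating point, a difference that is nominally equal to the tolerance can land just above it; a small relative tolerance is one way to absorb that, assuming such slack is acceptable for your data.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'A': [1.1, 2.2, 3.3], 'B': [3.9, 5.1, 6.3]})

def within_tolerance(left, right, atol, rtol=0.0):
    """Hypothetical helper: boolean frame, True where values agree within tolerance."""
    # Keep only the row labels and columns that both frames share
    left, right = left.align(right, join='inner')
    # np.isclose checks |left - right| <= atol + rtol * |right| element-wise
    mask = np.isclose(
        left.to_numpy(dtype=float),
        right.to_numpy(dtype=float),
        rtol=rtol,
        atol=atol,
    )
    return pd.DataFrame(mask, index=left.index, columns=left.columns)

# Purely absolute tolerance: nominal 0.1 differences fail due to rounding
strict = within_tolerance(df1, df2, atol=0.1)
# A tiny relative tolerance lets the nominal 0.1 differences pass
relaxed = within_tolerance(df1, df2, atol=0.1, rtol=1e-9)

print(strict)
print(relaxed)
# Per-column summary: True only if every row of the column is within tolerance
print(relaxed.all())
```

With atol=0.1 alone, only row 1 of 'B' passes. Adding rtol=1e-9 also accepts row 0 in both columns, where the difference is nominally exactly 0.1.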