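Use dynamic programming to compute the "edit distance" between every pair of rows, then apply a "longest common subsequence" (LCS) algorithm to rearrange the table so that the number of changes is minimal.
Code example: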
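As a hedged aside: a full edit distance between two rows allows cells to be inserted or deleted as well as substituted, whereas a plain position-by-position comparison only counts substitutions. The sketch below treats each cell as one symbol and computes the Levenshtein distance with a rolling row. The names `row_edit_distance` and `pairwise_distances` are illustrative assumptions, not part of the original answer.

```python
def row_edit_distance(a, b):
    """Levenshtein distance between two rows, one cell = one symbol."""
    # prev[j]: distance between the cells of `a` seen so far and the first j cells of `b`.
    prev = list(range(len(b) + 1))
    for i, cell_a in enumerate(a, start=1):
        curr = [i]  # reducing i cells to an empty row takes i deletions
        for j, cell_b in enumerate(b, start=1):
            cost = 0 if cell_a == cell_b else 1
            curr.append(min(prev[j] + 1,          # delete cell_a
                            curr[j - 1] + 1,      # insert cell_b
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def pairwise_distances(table):
    """Symmetric matrix of edit distances between every pair of rows."""
    m = len(table)
    dist = [[0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            dist[i][j] = dist[j][i] = row_edit_distance(table[i], table[j])
    return dist
```

For rows of equal length that differ only by substitutions, this gives the same number as counting mismatched columns; the two measures diverge once a cell is missing or shifted.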
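def minChanges(table):
    m, n = len(table), len(table[0])
    # dp[i][j]: number of columns in which rows i and j differ, position by position.
    # Note: this matrix is computed but not used by the rest of the function.
    dp = [[0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            dp[i][j] = sum(table[i][k] != table[j][k] for k in range(n))
            dp[j][i] = dp[i][j]
    # lcs[i][j]: LCS length of the first i rows and the first j rows of the table.
    lcs = [[0] * (m + 1) for _ in range(m + 1)]
    for i in range(m + 1):  # run up to m inclusive so that lcs[m][m] is filled in
        for j in range(m + 1):
            if i == 0 or j == 0:
                lcs[i][j] = 0
            elif table[i - 1] == table[j - 1]:
                lcs[i][j] = lcs[i - 1][j - 1] + 1
            else:
                lcs[i][j] = max(lcs[i - 1][j], lcs[i][j - 1])
    # Walk back from lcs[m][m] and collect the indices of the rows on the LCS.
    # Because the table is compared with itself, every row ends up in seq.
    seq = []
    i, j = m, m
    while i > 0 and j > 0:
        if table[i - 1] == table[j - 1]:
            seq.append(i - 1)
            i -= 1
            j -= 1
        elif lcs[i - 1][j] > lcs[i][j - 1]:
            i -= 1
        else:
            j -= 1
    seq = set(seq)
    # Return the rows that are not on the LCS.
    res = []
    for i in range(m):
        if i not in seq:
            res.append(table[i])
    return res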
Usage example:
table = [
    ["name", "age", "gender"],
    ["John", "24", "M"],
    ["Jane", "22", "F"],
    ["Bob", "28", "M"],
    ["Alice", "26", "F"]
]
min_changes_table = minChanges(table)
print(min_changes_table)
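Output:
[]
Every row matches itself when the table is compared against itself, so the LCS covers the whole table and no rows are left over to return. The function neither reorders rows nor swaps columns; to obtain a meaningful set of changes, the LCS has to be run against a second table, for example an earlier version of the same data.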
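One way to make the LCS step informative, assuming a second version of the table is available, is to run it between the two versions: rows on the LCS are unchanged, and the remaining rows are the ones removed or added. The following is a minimal sketch under that assumption. The helpers `lcs_table` and `diff_rows` and the `Carol` row are hypothetical and do not come from the original answer.

```python
def lcs_table(old, new):
    """lcs[i][j] = LCS length of the first i rows of old and first j rows of new."""
    m, n = len(old), len(new)
    lcs = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if old[i - 1] == new[j - 1]:
                lcs[i][j] = lcs[i - 1][j - 1] + 1
            else:
                lcs[i][j] = max(lcs[i - 1][j], lcs[i][j - 1])
    return lcs


def diff_rows(old, new):
    """Return (removed, added): rows of old and of new that are not on the LCS."""
    lcs = lcs_table(old, new)
    kept_old, kept_new = set(), set()
    i, j = len(old), len(new)
    # Backtrack from the bottom-right corner, marking matched rows as kept.
    while i > 0 and j > 0:
        if old[i - 1] == new[j - 1]:
            kept_old.add(i - 1)
            kept_new.add(j - 1)
            i -= 1
            j -= 1
        elif lcs[i - 1][j] >= lcs[i][j - 1]:
            i -= 1
        else:
            j -= 1
    removed = [row for k, row in enumerate(old) if k not in kept_old]
    added = [row for k, row in enumerate(new) if k not in kept_new]
    return removed, added


# Hypothetical data: "Jane" is dropped and "Carol" is appended in the new version.
old = [
    ["name", "age", "gender"],
    ["John", "24", "M"],
    ["Jane", "22", "F"],
    ["Bob", "28", "M"],
    ["Alice", "26", "F"],
]
new = [
    ["name", "age", "gender"],
    ["John", "24", "M"],
    ["Bob", "28", "M"],
    ["Alice", "26", "F"],
    ["Carol", "30", "F"],
]
removed, added = diff_rows(old, new)
print(removed)  # [['Jane', '22', 'F']]
print(added)    # [['Carol', '30', 'F']]
```

Note that a row moved to a different position shows up once in `removed` and once in `added`, since an LCS only preserves rows whose relative order is unchanged.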