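To group the features in an sf object by distance, proceed as follows (in this example, sf is a pandas DataFrame with one row per observation and one column per feature):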
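import pandas as pd
from sklearn.cluster import AgglomerativeClustering
from scipy.spatial.distance import pdist, squareform

# Step 1: build the example data set.
sf = pd.DataFrame({
    'feature1': [1, 2, 3, 4, 5],
    'feature2': [2, 4, 6, 8, 10]
})

# Step 2: compute pairwise Euclidean distances between rows,
# then expand the condensed vector into a square distance matrix.
distances = pdist(sf.values, metric='euclidean')
dist_matrix = squareform(distances)

# Step 3: run agglomerative clustering on the precomputed distances.
# scikit-learn 1.2 renamed `affinity` to `metric` and 1.4 removed `affinity`;
# on releases older than 1.2, pass affinity='precomputed' instead.
agglomerative_clustering = AgglomerativeClustering(n_clusters=2, metric='precomputed', linkage='average')
labels = agglomerative_clustering.fit_predict(dist_matrix)

# Step 4: attach the cluster label of each row to the original data.
sf['cluster'] = labels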
print(sf)
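This groups the rows of sf by distance and stores each row's cluster label in a new cluster column of the original data set. Adjust the clustering parameters to your needs, for example the number of clusters (n_clusters) or the distance metric passed to pdist.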
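If the number of clusters is not known in advance, one alternative is to cut the hierarchy at a distance threshold instead of fixing n_clusters. The sketch below is only an illustration of that idea using SciPy's linkage and fcluster rather than scikit-learn; the helper name cluster_by_distance and the threshold values are assumptions made for this example, not part of the original code.

```python
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def cluster_by_distance(df, threshold, metric="euclidean", method="average"):
    """Hypothetical helper: cut an agglomerative hierarchy at `threshold`.

    Returns one integer label per row of `df`.
    """
    # Condensed pairwise distances between rows (same call as in the text above).
    condensed = pdist(df.to_numpy(), metric=metric)
    # Build the merge tree with the chosen linkage method.
    tree = linkage(condensed, method=method)
    # Rows whose cophenetic distance is at most `threshold` share a label.
    return fcluster(tree, t=threshold, criterion="distance")


sf = pd.DataFrame({
    "feature1": [1, 2, 3, 4, 5],
    "feature2": [2, 4, 6, 8, 10],
})

# Illustrative thresholds only; pick values that match the scale of your data.
for threshold in (0.1, 3.0, 100.0):
    labels = cluster_by_distance(sf, threshold)
    print(threshold, len(set(labels)), list(labels))
```

A small threshold leaves every row in its own cluster, while a very large one merges all rows into a single cluster, so the threshold plays the role that n_clusters plays in the example above.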