I am using Surprise to perform cross-validation:
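from surprise import Dataset, SVD, KNNBasic, KNNWithMeans, NormalPredictor

def multiple_cv(data, folds=5):
    algorithms = (SVD, KNNBasic, KNNWithMeans, NormalPredictor)
    measures = ['RMSE', 'MAE']
    for a in algorithms:
        data.split(folds)
        algo = a()
        algo.fit(data)  # this is the line that raises the error shown below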
I call the function like this:
data = Dataset.load_builtin('ml-100k')
multiple_cv(data)
I get this error:
Traceback (most recent call last):
  File "/home/user/PycharmProjects/pac1/prueba.py", line 30, in <module>
    multiple_cv(data)
  File "/home/user/PycharmProjects/pac1/prueba.py", line 19, in multiple_cv
    algo.fit(data)
  File "surprise/prediction_algorithms/matrix_factorization.pyx", line 155, in surprise.prediction_algorithms.matrix_factorization.SVD.fit
  File "surprise/prediction_algorithms/matrix_factorization.pyx", line 204, in surprise.prediction_algorithms.matrix_factorization.SVD.sgd
AttributeError: 'DatasetAutoFolds' object has no attribute 'global_mean'
What am I missing?
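According to the docs, the input to the fit method must be a Trainset, which is not the same thing as the Dataset you are passing in. That is why the error mentions 'global_mean': it is an attribute of a Trainset, and a Dataset object does not have it. You can use the output of the split method described in the docs to divide the Dataset into a Trainset (and a Testset).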
In your example:
data = Dataset.load_builtin('ml-100k')
trainset = data.build_full_trainset()
Then you can use:
algo.fit(trainset)
A Trainset and Testset obtained this way can then be passed to the fit and test methods, respectively.