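Good morning, everyone. I'm trying to implement this LSTM algorithm with Keras and pandas, reading the data in from a CSV file. The backend I'm using is TensorFlow. I'm running into a problem when I invert my results after predicting on the training set. Here is my code: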
    import numpy
    import matplotlib.pyplot as plt
    import pandas
    import math
    from keras.models import Sequential
    from keras.layers import Dense
    from keras.layers import LSTM
    from sklearn.preprocessing import MinMaxScaler
    from sklearn.metrics import mean_squared_error

    #plt.plot(dataset)
    #plt.show()

    #fix random seed for reproducibility
    numpy.random.seed(7)

    #Load dataset
    col_names = ['UserID', 'SysTouchTime', 'EventTime', 'ActivityTouchID',
                 'Pointer_count', 'PointerID', 'ActionID', 'Touch_X', 'Touch_Y',
                 'Touch_Pressure', 'Contact_Size', 'Phone_Orientation']
    dataframe = pandas.read_csv('touchEventsFor5Users.csv', engine='python',
                                header=None, names=col_names, skiprows=1)
    #print(dataset.head())
    #print(dataset.shape)
    dataset = dataframe.values
    dataset = dataframe.astype('float32')
    print(dataset.isnull().any())
    dataset = dataset.fillna(method='ffill')
    feature_cols = ['SysTouchTime', 'EventTime', 'ActivityTouchID',
                    'Pointer_count', 'PointerID', 'ActionID', 'Touch_X', 'Touch_Y',
                    'Touch_Pressure', 'Contact_Size', 'Phone_Orientation']
    X = dataset[feature_cols]
    y = dataset['UserID']
    print(y.head())

    #normalize the dataset
    scaler = MinMaxScaler(feature_range=(0, 1))
    dataset = scaler.fit_transform(dataset)

    # split into train and test sets
    train_size = int(len(dataset) * 0.67)
    test_size = len(dataset) - train_size
    train, test = dataset[0:train_size, :], dataset[train_size:len(dataset), :]
    print(len(train), len(test))

    # convert an array of values into a dataset matrix
    def create_dataset(dataset, look_back=1):
        dataX, dataY = [], []
        for i in range(len(dataset)-look_back-1):
            a = dataset[i:(i+look_back), 0]
            dataX.append(a)
            dataY.append(dataset[i + look_back, 0])
        return numpy.array(dataX), numpy.array(dataY)

    # reshape into X=t and Y=t+1
    look_back = 1
    trainX, trainY = create_dataset(train, look_back)
    testX, testY = create_dataset(test, look_back)

    #reshape input to be [samples, time steps, features]
    trainX = numpy.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
    testX = numpy.reshape(testX, (testX.shape[0], 1, testX.shape[1]))

    #create and fit the LSTM network
    model = Sequential()
    model.add(LSTM(4, input_dim=look_back))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
    model.fit(trainX, trainY, epochs=1, batch_size=32, verbose=2)

    # make predictions
    trainPredict = model.predict(trainX)
    testPredict = model.predict(testX)

    # invert predictions
    import gc
    gc.collect()
    #####problem occurs with the following line of code#############
    trainPredict = scaler.inverse_transform(trainPredict)
    trainY = scaler.inverse_transform([trainY])
    testPredict = scaler.inverse_transform(testPredict)
    testY = scaler.inverse_transform([testY])

    # calculate root mean squared error
    trainScore = math.sqrt(mean_squared_error(trainY[0], trainPredict[:,0]))
    print('Train Score: %.2f RMSE' % (trainScore))
    testScore = math.sqrt(mean_squared_error(testY[0], testPredict[:,0]))
    print('Test Score: %.2f RMSE' % (testScore))

    #shift train predictions for plotting
    trainPredictPlot = numpy.empty_like(dataset)
    trainPredictPlot[:, :] = numpy.nan
    trainPredictPlot[look_back:len(trainPredict)+look_back, :] = trainPredict

    # shift test predictions for plotting
    testPredictPlot = numpy.empty_like(dataset)
    testPredictPlot[:, :] = numpy.nan
    testPredictPlot[len(trainPredict)+(look_back*2)+1:len(dataset)-1, :] = testPredict

    # plot baseline and predictions
    plt.plot(scaler.inverse_transform(dataset))
    plt.plot(trainPredictPlot)
    plt.plot(testPredictPlot)
    plt.show()
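The error I get is:
ValueError: non-broadcastable output operand with shape (67704,1) doesn't match the broadcast shape (67704,12)
Do you think you could help me solve this? I'm very new to this but want to learn it, and this error is making me suffer! Thanks for any help you can provide.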
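When you scale the data, each of the 12 fields is scaled differently. The scaler takes the minimum and maximum of each field and uses them to map that field's values into the range 0 to 1.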
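To make the per-field scaling concrete, here is a minimal sketch on made-up data (the three-column array is purely illustrative and is not the touch-event dataset). It reads the fitted `data_min_` and `data_max_` attributes of `MinMaxScaler` and reproduces the scaling by hand, assuming the default `feature_range=(0, 1)`.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data: 3 columns on very different scales.
data = np.array([[1.0, 100.0, 0.5],
                 [2.0, 300.0, 0.7],
                 [3.0, 500.0, 0.9]])

scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(data)

# The scaler stores one minimum and one maximum per column.
print(scaler.data_min_)
print(scaler.data_max_)

# The same scaling computed by hand, column by column.
manual = (data - scaler.data_min_) / (scaler.data_max_ - scaler.data_min_)
print(np.allclose(scaled, manual))
```

Because every column has its own minimum and maximum, undoing the scaling also needs to know which column a value came from.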
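When you call inverse_transform, the input makes no sense to the function, because you are giving it only one field: it doesn't know which field this is, or which minimum and maximum to use. You need to pass it a dataset with all 12 fields, with the predicted values placed in the correct field.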
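The shape mismatch can be reproduced in a few lines. This is a sketch on random stand-in data, not the original CSV; the 5-row arrays are assumptions chosen only to keep the example small.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
data = rng.random((5, 12))        # hypothetical stand-in for the 12-field dataset
scaler = MinMaxScaler().fit(data)

one_column = np.zeros((5, 1))     # same layout as a Dense(1) prediction

# A single column cannot be matched against 12 per-field minima and maxima.
try:
    scaler.inverse_transform(one_column)
    failed = False
except ValueError as err:
    failed = True
    print(err)

# Padding the values into a 12-field array gives the scaler the shape it expects.
padded = np.zeros((5, 12))
padded[:, 0] = one_column[:, 0]
restored = scaler.inverse_transform(padded)[:, 0]
print(restored.shape)
```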
Try adding this before the problematic line:
    # create empty table with 12 fields
    trainPredict_dataset_like = numpy.zeros(shape=(len(trainPredict), 12))
    # put the predicted values in the right field
    trainPredict_dataset_like[:,0] = trainPredict[:,0]
    # inverse transform and then select the right field
    trainPredict = scaler.inverse_transform(trainPredict_dataset_like)[:,0]
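The same padding step would be needed for `trainY`, `testPredict`, and `testY`, so it may be convenient to wrap it in a function. The sketch below uses a hypothetical helper named `invert_column` (not part of scikit-learn) and checks it with a round trip on random stand-in data. It assumes the target is field 0, which matches `create_dataset` in the question.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

def invert_column(scaler, values, column=0):
    """Undo MinMaxScaler scaling for one field (hypothetical helper)."""
    values = np.asarray(values, dtype=float).reshape(-1)
    # Build a zero-filled array with as many fields as the scaler was fitted on.
    padded = np.zeros((len(values), scaler.scale_.shape[0]))
    padded[:, column] = values
    # Invert all fields, then keep only the one we care about.
    return scaler.inverse_transform(padded)[:, column]

# Round-trip check on random stand-in data (not the original dataset).
rng = np.random.default_rng(0)
data = rng.random((6, 12))
scaler = MinMaxScaler().fit(data)
scaled_first = scaler.transform(data)[:, 0]
restored = invert_column(scaler, scaled_first.reshape(-1, 1))
print(np.allclose(restored, data[:, 0]))
```

With a helper like this, the four inversion lines could become calls such as `trainPredict = invert_column(scaler, trainPredict)`. The results are 1-D arrays, so the later RMSE lines would pass them directly rather than indexing with `trainY[0]` or `trainPredict[:,0]`.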
Does this help? :)