ValueError: Found input variables with inconsistent numbers of samples: [7000, 3000]
Answers
Answer:
You are running into that error because your X and Y don't have the same number of samples along the first axis (which is what train_test_split requires), i.e., X.shape[0] != Y.shape[0]. Given your current code:
>>> X.shape
(1, 6, 29)
>>> Y.shape
(29,)
To fix this error:
1. Remove the extra list from inside np.array() when defining X, or remove the extra dimension afterwards with X = X.reshape(X.shape[1:]). The shape of X will then be (6, 29).
2. Transpose X by running X = X.transpose() so that X and Y have an equal number of samples. The shape of X will then be (29, 6) and the shape of Y will be (29,).
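The two steps above can be sketched as follows. The data here is made up purely for illustration (the names features, X and Y and the values are assumptions, not the asker's real data); only the shapes match the ones shown above.

```python
import numpy as np

# Hypothetical data: 6 features measured on 29 samples.
features = np.arange(6 * 29, dtype=float).reshape(6, 29)

# Wrapping the array in an extra list adds a leading dimension of size 1.
X = np.array([features])      # shape (1, 6, 29)
Y = np.zeros(29)              # shape (29,)

# Step 1: drop the extra leading dimension.
X = X.reshape(X.shape[1:])    # shape (6, 29)

# Step 2: transpose so that samples lie along the first axis, like Y.
X = X.transpose()             # shape (29, 6)

print(X.shape, Y.shape)
```

Once X.shape[0] == Y.shape[0], the two arrays have a consistent number of samples and can be passed to train_test_split together.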
Answer:
Sounds like the shapes of your labels and predictions are not in alignment.
Step-by-step explanation:
I faced a similar problem while fitting a regression model. In my case, the number of rows in X was not equal to the number of rows in y. In most cases, x holds your features and y is the target you want to predict. The feature array should not be 1D, so check the shape of x and, if it is 1D, convert it to 2D:
x = x.reshape(-1, 1)
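A minimal sketch of that conversion, using a small made-up array (the values and the name x_2d are assumptions for illustration). Note that reshape returns a new array rather than changing x in place, so the result has to be assigned to a name.

```python
import numpy as np

# Hypothetical 1D feature array with 4 samples.
x = np.array([1.0, 2.0, 3.0, 4.0])
print(x.shape)        # 1D: a single axis of length 4

# -1 lets NumPy infer the number of rows; 1 means a single feature column.
x_2d = x.reshape(-1, 1)
print(x_2d.shape)     # 2D: 4 rows (samples), 1 column (feature)

# The original array is untouched; reshape did not modify it in place.
print(x.shape)
```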
Also, you will likely run into problems if you remove rows containing nulls from X_train and y_train independently of each other. y_train probably has few or no nulls, while X_train probably has some. When a row is removed from X_train but the same row is not removed from y_train, the two fall out of sync and end up with different lengths. Instead, remove nulls before you separate X and y.
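To illustrate the difference, here is a small sketch assuming the data lives in a pandas DataFrame. The column names feature_a, feature_b and target are hypothetical, and the values are invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset: two feature columns with some nulls, one target column.
df = pd.DataFrame({
    "feature_a": [1.0, np.nan, 3.0, 4.0, 5.0],
    "feature_b": [10.0, 20.0, np.nan, 40.0, 50.0],
    "target": [0, 1, 0, 1, 0],
})

# Problematic: dropping nulls from X and y separately.
X_bad = df[["feature_a", "feature_b"]].dropna()  # loses the two rows with nulls
y_bad = df["target"].dropna()                    # loses nothing
print(len(X_bad), len(y_bad))                    # lengths no longer match

# Safer: drop incomplete rows first, then separate features and target.
clean = df.dropna()
X = clean[["feature_a", "feature_b"]]
y = clean["target"]
print(len(X), len(y))                            # lengths match
```

Because both X and y are taken from the same cleaned frame, they keep the same rows and the same index.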