Jacob & Sebastian,
I think the best way to find out whether my modeling approach works is to find a larger dataset and split it into two parts: the first to be used as the training/cross-validation set and the second as a held-out test set, as in a real-case scenario.
Regarding the MLPRegressor regularization, below is my optimal setup:
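To make the idea concrete, here is a minimal sketch of such a split. The synthetic data from make_regression, the 50/50 split, the 5-fold CV and random_state = 0 are all placeholders chosen for illustration, not my real data or settings:

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neural_network import MLPRegressor

random_state = 0  # assumption: any fixed seed works here

# Assumption: synthetic data stands in for the "larger dataset".
X, y = make_regression(n_samples=600, n_features=20, noise=10.0,
                       random_state=random_state)

# Part 1 is used for training/cross-validation, part 2 is kept aside as the test set.
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.5, random_state=random_state)

model = MLPRegressor(random_state=random_state, max_iter=400,
                     early_stopping=True, validation_fraction=0.2,
                     alpha=10, hidden_layer_sizes=(10,))

# Model selection only ever sees the first part.
cv_scores = cross_val_score(model, X_dev, y_dev, cv=5, scoring="r2")

# The final check on the untouched second part is done once, at the very end.
model.fit(X_dev, y_dev)
test_score = model.score(X_test, y_test)
print("CV R^2:", cv_scores.mean(), "test R^2:", test_score)
```

If the cross-validation score and the test score differ a lot, that would suggest the cross-validation estimate is too optimistic.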
MLPRegressor(random_state=random_state, max_iter=400, early_stopping=True, validation_fraction=0.2, alpha=10, hidden_layer_sizes=(10,))
This means a single hidden layer with 10 neurons, alpha=10 for L2 regularization, and early stopping to terminate training when the validation score stops improving (20% of the training data is held out for that check). I think this is quite a simple model. My final predictor is an SVR that combines 2 MLPRegressors, each one trained on a different type of input data.
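For illustration only, this kind of combination could be wired up roughly as below. The synthetic data, the split of the columns into two "input types" (X_a, X_b), the use of out-of-fold predictions via cross_val_predict, and the default SVR settings are all assumptions made for the sketch, not a description of the exact pipeline:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_predict
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

random_state = 0  # assumption: any fixed seed

# Assumption: synthetic data; the two column blocks play the role of
# two different types of input data.
X, y = make_regression(n_samples=300, n_features=12, noise=5.0,
                       random_state=random_state)
X_a, X_b = X[:, :6], X[:, 6:]


def make_mlp():
    # Same regularized setup as quoted above.
    return MLPRegressor(random_state=random_state, max_iter=400,
                        early_stopping=True, validation_fraction=0.2,
                        alpha=10, hidden_layer_sizes=(10,))


# Out-of-fold predictions, so the SVR is not trained on outputs the
# MLPs produced for samples they were fitted on.
pred_a = cross_val_predict(make_mlp(), X_a, y, cv=5)
pred_b = cross_val_predict(make_mlp(), X_b, y, cv=5)

# The two MLP outputs become the two input features of the SVR.
meta_X = np.column_stack([pred_a, pred_b])
svr = SVR().fit(meta_X, y)
final_pred = svr.predict(meta_X)
```

An honest estimate of the combined model would still require scoring it on data that neither the MLPs nor the SVR have seen.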
@Sebastian
You have mentioned dropout again but I could not find it in the docs:
Maybe you are referring to another MLPRegressor implementation? A while ago I saw another implementation of yours on GitHub. Can you clarify which one you recommend and why?
Thank you both for your hints!
best
Thomas