score:2

Accepted answer

As pointed out in the comments, the smaller the batch size, the higher the variance of the per-batch mean, which shows up as more fluctuation in the loss. I typically use a batch size of 80 since I have a fairly large memory capacity. You are using the ModelCheckpoint callback and saving the model with the best validation accuracy; it is better to save the model with the lowest validation loss. You say that increasing the number of samples leads to underfitting. That seems rather strange: usually more samples result in better accuracy.
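
To illustrate the batch-size point, here is a rough sketch using only the Python standard library. It is not taken from the question's code: the per-sample losses are synthetic (drawn from an exponential distribution purely as an assumption), and `batch_mean_spread` is a hypothetical helper name. It draws many random batches of a given size and measures how much the batch mean loss varies from batch to batch.

```python
import random
import statistics


def batch_mean_spread(per_sample_losses, batch_size, num_batches, rng):
    """Return the standard deviation of the mean loss over random batches."""
    batch_means = [
        # Each batch is a random draw (without replacement) from the pool.
        statistics.fmean(rng.sample(per_sample_losses, batch_size))
        for _ in range(num_batches)
    ]
    return statistics.stdev(batch_means)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumption: synthetic per-sample losses standing in for a real model's losses.
losses = [rng.expovariate(1.0) for _ in range(10_000)]

# Compare a few batch sizes, including the 80 mentioned above.
spreads = {bs: batch_mean_spread(losses, bs, 500, rng) for bs in (8, 32, 80)}

for bs, spread in spreads.items():
    print(f"batch_size={bs:3d}  std of batch mean loss={spread:.3f}")
```

The spread should shrink roughly in proportion to one over the square root of the batch size, which is why the loss curve tends to look noisier with small batches even when the model itself is training fine.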