If you want to evaluate against the new gold-standard evaluation set during training, you can pass the name of that dataset as the --eval-id argument to ner.batch-train.
If you only want to evaluate an already trained model, you could use a custom recipe like this:
The recipe above takes two arguments: the name of the dataset containing your evaluation examples, and the model you trained on your training examples. It then runs the model over the evaluation examples and outputs the results.
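To make the evaluation step more concrete, here is a rough, library-independent sketch of the kind of scoring such a recipe performs. It is not Prodigy's own implementation: `score_spans`, `toy_predict` and the sample data are hypothetical names made up for illustration, and the sketch assumes evaluation examples shaped like Prodigy NER tasks (a "text" plus a list of "spans" with "start", "end" and "label"), counting a prediction as correct only on an exact match.

```python
def score_spans(examples, predict):
    """Score predicted entity spans against gold spans (exact match only).

    examples: list of dicts with "text" and "spans" (assumed Prodigy-style).
    predict:  hypothetical callable, text -> iterable of (start, end, label).
    """
    tp = fp = fn = 0
    for eg in examples:
        gold = {(s["start"], s["end"], s["label"]) for s in eg.get("spans", [])}
        pred = set(predict(eg["text"]))
        tp += len(gold & pred)   # predicted and in the gold set
        fp += len(pred - gold)   # predicted but not in the gold set
        fn += len(gold - pred)   # in the gold set but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f_score": f_score}


# Made-up gold examples standing in for the evaluation dataset.
examples = [
    {"text": "Apple hired Tim Cook",
     "spans": [{"start": 0, "end": 5, "label": "ORG"},
               {"start": 12, "end": 20, "label": "PERSON"}]},
    {"text": "Berlin is nice",
     "spans": [{"start": 0, "end": 6, "label": "GPE"}]},
]


def toy_predict(text):
    """Stand-in for a trained model; gets one span boundary wrong on purpose."""
    canned = {
        "Apple hired Tim Cook": [(0, 5, "ORG"), (12, 15, "PERSON")],
        "Berlin is nice": [(0, 6, "GPE")],
    }
    return canned.get(text, [])


scores = score_spans(examples, toy_predict)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

With a real spaCy model loaded as `nlp`, the `predict` argument could be something like `lambda text: [(e.start_char, e.end_char, e.label_) for e in nlp(text).ents]`, and the examples would come from your evaluation dataset rather than a hard-coded list.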