| Chilepluto

Posted Jun 12, 2026

By Sean

1 min read

This article is part of the Deep Learning hub.

Training a deep network is more than choosing an architecture. The result depends on the data, the loss function, the optimiser, the regularisation strategy, and the evaluation process.

Optimisation

Optimisation adjusts model parameters to reduce the loss function.

Important concepts include:

Gradient descent
Mini-batch training
Learning rate
Momentum
Adaptive optimisers
Training schedules

The learning rate is often the most important setting. If it is too high, training can become unstable. If it is too low, training may make little progress.

Mini-batches

Deep networks are usually trained with mini-batches.

Mini-batches help with:

Hardware efficiency
Memory use
More frequent updates
Practical training time

Batch size affects stability and speed. Larger batches are not always better.

Regularisation

Regularisation helps reduce overfitting.

Common approaches include:

Weight decay
Dropout
Early stopping
Data augmentation
Model size control

The aim is to improve performance on unseen examples, not just reduce training loss.

Early stopping

Early stopping watches validation performance and stops training when improvement stalls.

It is simple and useful, but it depends on a reliable validation set.

Evaluation

Evaluation should compare the model against a baseline.

Track:

Training loss
Validation loss
Main task metric
Error examples
Data split version
Training configuration
Random seed where practical

Without reproducible experiments, model improvement becomes guesswork.

Troubleshooting checklist

Ask:

Is the data pipeline correct?
Are labels reliable?
Is the loss function appropriate?
Is the learning rate reasonable?
Is the model overfitting?
Is the validation split representative?
Is there a simple baseline?
Can the result be reproduced?

Engineering takeaway

Good training practice is disciplined engineering. Start simple, measure carefully, change one thing at a time, and keep enough records to reproduce useful results.

Last reviewed: June 2026. Validate optimiser defaults, framework behaviour, and evaluation methods against current library documentation before production use.

This post is licensed under CC BY 4.0 by the author.