7.1. NumPy Tips and Tricks

7.1.1. Achieve Reproducibility with np.random.RandomState()

Reproducibility in Data Science projects is key.

For larger projects, use numpy.random.RandomState() to construct a random number generator.

Using numpy.random.seed() sets the global random seed, which affects every call to the numpy.random.* functions.

Imported packages or other modules can reseed the global generator with a different value.

This can result in undesirable and unreproducible results across your project.
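A minimal sketch of the problem described above. The function third_party_helper is hypothetical and only stands in for imported code that happens to call numpy.random.seed() internally.

```python
import numpy as np


def third_party_helper():
    # Hypothetical stand-in for imported code that reseeds the global state.
    np.random.seed(0)


# Draws you get when nothing interferes with your seed.
np.random.seed(1234)
expected = np.random.rand(3)

# Same seed, but another piece of code reseeds the global generator
# before you draw your numbers.
np.random.seed(1234)
third_party_helper()
actual = np.random.rand(3)

# The two arrays differ, because seed 1234 no longer applies.
print(np.allclose(expected, actual))
```

The script sets the same seed both times, yet the draws change as soon as some other code touches the shared global state.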

With numpy.random.RandomState(), you no longer rely on the global random state (which could be reset).

It’s a subtle but important step to achieve reproducibility.

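import numpy as np

# Create a generator with its own state, independent of the global seed
rng = np.random.RandomState(1234)

# Draw three uniform samples from [0, 1)
print(rng.rand(3))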
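As a rough illustration of why the local generator helps, the sketch below passes a RandomState instance into a function explicitly and reseeds the global state in between. The helper sample_noise is a hypothetical name chosen for this example, not part of NumPy.

```python
import numpy as np


def sample_noise(rng, size):
    # Hypothetical helper: draws only from the generator it is given,
    # never from the global numpy.random state.
    return rng.rand(size)


rng_a = np.random.RandomState(1234)
first = sample_noise(rng_a, 3)

# Simulate some other code reseeding the global generator.
np.random.seed(0)

rng_b = np.random.RandomState(1234)
second = sample_noise(rng_b, 3)

# Both generators were built from the same seed, so the draws match
# regardless of what happened to the global seed in between.
print(np.array_equal(first, second))
```

Passing the generator as an argument keeps the source of randomness visible in the function signature, which makes it easier to see which parts of a project depend on which seed.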