7.1. NumPy Tips and Tricks
7.1.1. Achieve Reproducibility with np.random.RandomState()
Reproducibility is key in data science projects.
For larger projects, use numpy.random.RandomState() to construct a dedicated random number generator.
Calling numpy.random.seed() sets the global random seed, which affects every use of the numpy.random.* module.
Imported packages or other modules can reset the global seed to a different value, which can lead to unreproducible results across your project.
With numpy.random.RandomState(), you no longer rely on the global random state (which could be reset).
It's a subtle but important step toward reproducibility.
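import numpy as np
# Create a local generator with a fixed seed, independent of the global state
rng = np.random.RandomState(1234)
# Draw three uniform samples in [0, 1) from the local generator
print(rng.rand(3))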
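To make the difference concrete, here is a minimal sketch of what happens when other code reseeds the global state. The function third_party_helper is hypothetical: it stands in for any imported package that calls numpy.random.seed() internally. The names draw_global and draw_local are also illustrative and are not part of NumPy.

```python
import numpy as np


def third_party_helper():
    # Hypothetical stand-in for an imported package that reseeds the global state.
    np.random.seed(0)


def draw_global():
    # Relies on the global seed, which the helper overwrites before we sample.
    np.random.seed(1234)
    third_party_helper()
    return np.random.rand(3)


def draw_local():
    # Uses a private generator, which the helper's reseeding cannot touch.
    rng = np.random.RandomState(1234)
    third_party_helper()
    return rng.rand(3)


# Reference draw: what seed 1234 should produce when nothing interferes.
expected = np.random.RandomState(1234).rand(3)

global_draw = draw_global()
local_draw = draw_local()

print(np.allclose(global_draw, expected))  # False: the global seed was replaced
print(np.allclose(local_draw, expected))   # True: the local generator is isolated
```

The local generator reproduces the expected draw even though the global seed was changed in between. The NumPy documentation describes RandomState as a legacy interface and recommends np.random.default_rng() for new code; the same isolation argument applies to a generator created that way.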