Aleatoric example
An ideal physical system for calibrating aleatoric uncertainty can be reproduced at home. You need only several dice and one coin.
Roll one die twice to generate two random numbers, say 3 and 5: these are the features. Then flip the coin. On heads, roll 3 dice, add the outcomes, and record
the sum as the target. On tails, roll 5 dice, add the outcomes, and record the sum as the target. The dataset may look as follows:
features    target
3  5        12
6  5        16
...
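A minimal sketch of one record of this first experiment, in Python (the function name is ours, not part of the original description):

```python
import random

def make_record(rng=random):
    # Roll one die twice: the two outcomes are the features, and they
    # also give the sizes of the two candidate dice sets.
    f1, f2 = rng.randint(1, 6), rng.randint(1, 6)
    # Flip the coin: heads -> sum f1 dice, tails -> sum f2 dice.
    n_dice = f1 if rng.random() < 0.5 else f2
    # The target is the sum of the selected dice set.
    target = sum(rng.randint(1, 6) for _ in range(n_dice))
    return (f1, f2), target

features, target = make_record()
```

Note that the features alone do not determine the target: the hidden coin flip injects irreducible (aleatoric) randomness.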
We now add one more input, the probability of selecting a dice set, and make the dice virtual, with ten faces each. In our
thought experiment, we roll one die three times and record the features, say 4, 7, and 8. We then roll a single die one more time. If the outcome is less than or equal to
4, we roll the first set of 7 dice; otherwise we roll the other set of 8 dice. We add the outcomes and record the sum as the target, so our dataset starts looking
like this:
features      target
4  7  8       40
8  1  9       7
...
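One record of this second experiment can be sketched as follows, assuming ten-sided dice numbered 1 to 10 (the function name is ours):

```python
import random

def make_record_10(rng=random):
    # Roll a ten-sided die three times: the outcomes are the features.
    # The first acts as the selection threshold, the other two as the
    # sizes of the two candidate dice sets.
    p, n1, n2 = (rng.randint(1, 10) for _ in range(3))
    # Roll once more: an outcome <= p selects the first set.
    n_dice = n1 if rng.randint(1, 10) <= p else n2
    # The target is the sum of the selected set of ten-sided dice.
    target = sum(rng.randint(1, 10) for _ in range(n_dice))
    return (p, n1, n2), target
```

Here the first feature controls the mixing probability (p out of 10), so the features now shape the target distribution without determining the target itself.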
The probability density of the target for the same input, built by repeating the experiment, is, in general, bimodal.
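This bimodality can be checked empirically by fixing one feature vector and repeating the experiment many times; in the sketch below the set sizes (2 and 9 dice) are chosen far apart purely to make the two modes obvious:

```python
import random
from collections import Counter

def sample_target(p, n1, n2, rng):
    # One repetition of the experiment for a fixed feature vector.
    n_dice = n1 if rng.randint(1, 10) <= p else n2
    return sum(rng.randint(1, 10) for _ in range(n_dice))

rng = random.Random(0)
# Repeat the experiment for features (5, 2, 9): the target density
# mixes sums of 2 ten-sided dice with sums of 9 ten-sided dice,
# producing one mode near 11 and another near 50.
samples = [sample_target(5, 2, 9, rng) for _ in range(10000)]
hist = Counter(samples)  # empirical target distribution
```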
Obviously, with three ten-sided rolls there are 10^3 = 1000 possible distinct feature vectors. Let us make 1000 random records and use them as a dataset with aleatoric
uncertainty. Now, having this dataset with a single observed target for each feature vector, we need to build a probabilistic
model that captures the bimodality of the target.
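Generating such a 1000-record dataset can be sketched as follows (names and the seed are ours; each record carries a single observed target):

```python
import random

def make_dataset(n=1000, seed=0):
    # Each record: three features from a ten-sided die and one target
    # drawn from the mixture the features define.
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        p, n1, n2 = (rng.randint(1, 10) for _ in range(3))
        n_dice = n1 if rng.randint(1, 10) <= p else n2
        target = sum(rng.randint(1, 10) for _ in range(n_dice))
        records.append(((p, n1, n2), target))
    return records

dataset = make_dataset()
```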
We say that it is possible and recommend reading the preprint and watching the
video.
The algorithm is called Divisive Data Resorting. We will add its detailed description to this site later.

