Skip to content

Triangular distribution random sampling is incorrectly implemented #3225

Closed
@lbignell

Description

@lbignell

There seems to be a problem with the way scipy.stats.triang is called by Triangular.random.
An error is thrown even if you use a combination of 'upper', 'lower', and 'c' arguments found in the documentation, see below.
The fact that I can recover the correct parameters in the model below if I catch the error suggests that the issue is limited to when random is sampled.
Calling sample_ppc(dummytrace, model=dummymodel) throws the same error, presumably because it calls the random method.

However, when I change the way scipy.stats.triang is called in the source code to be analogous to that below, it doesn't fix the problem, so maybe I'm missing something.

Description of your problem

Please provide a minimal, self-contained, and reproducible example.

theupper = 8
thelower = 2
thec = 6.5
numobs = 100000
xvals = np.linspace(thelower,theupper,1000)
#This is how I think stats.triang should be called, which differs from Triangular.random.
dummyPDF = stats.triang(loc=thelower, scale=theupper - thelower, c=(thec - thelower)/(theupper-thelower))
yvals = dummyPDF.pdf(xvals)
thedist = pymc3.Triangular.dist(lower=thelower, upper=theupper, c=thec)
allOK = True
dummydata = dummyPDF.rvs(numobs)
try:
    thesamples = thedist.random(size=numobs)
except ValueError:
    #This always gets thrown. I've pasted the error traceback for if it's raised below.
    allOK = False
fig = plt.figure()
plt.plot(xvals, yvals, 'r')
if allOK:
    plt.hist(thesamples, histtype='step', bins=100, range=(thelower,theupper), normed=True);

plt.hist(dummydata, histtype='step', bins=100, range=(thelower,theupper), normed=True);
dummymodel = pymc3.Model()
with dummymodel:
    varlower = pymc3.Normal('varlower', mu=thelower, sd=5)
    varupper = pymc3.Normal('varupper', mu=theupper, sd=5)
    varc = pymc3.Normal('varc', mu=thec, sd=5)
    dummyObs = pymc3.Triangular('dummyObs', lower=varlower, upper=varupper, c=varc, observed=dummydata)
    start = pymc3.find_MAP()
    step = pymc3.Metropolis()
    dummytrace = pymc3.sample(draws=500, step=step, start=start)

pymc3.summary(dummytrace)

Please provide the full traceback.
If the error is caught:

varlower:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  1.993            0.005            0.000            [1.982, 2.002]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  1.982          1.990          1.993          1.997          2.002


varupper:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  8.006            0.003            0.000            [8.001, 8.012]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  8.001          8.004          8.005          8.006          8.014


varc:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  6.502            0.007            0.001            [6.488, 6.512]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  6.488          6.496          6.502          6.505          6.519

image

If the error is raised:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-198-9f2618c889b8> in <module>()
     14 dummydata = dummyPDF.rvs(numobs)
     15 try:
---> 16     thesamples = thedist.random(size=numobs)
     17 except ValueError:
     18     allOK = False

/home/lbignell/anaconda3/lib/python3.5/site-packages/pymc3/distributions/continuous.py in random(self, point, size)
   1860                                       point=point)
   1861         return generate_samples(stats.triang.rvs, c=c-lower, loc=lower, scale=upper-lower,
-> 1862                                 size=size, dist_shape=self.shape, random_state=None)
   1863 
   1864     def logp(self, value):

/home/lbignell/anaconda3/lib/python3.5/site-packages/pymc3/distributions/distribution.py in generate_samples(generator, *args, **kwargs)
    399     if broadcast_shape == (1,) and prefix_shape == ():
    400         if size is not None:
--> 401             samples = generator(size=size, *args, **kwargs)
    402         else:
    403             samples = generator(size=1, *args, **kwargs)

/home/lbignell/anaconda3/lib/python3.5/site-packages/scipy/stats/_distn_infrastructure.py in rvs(self, *args, **kwds)
    938         cond = logical_and(self._argcheck(*args), (scale >= 0))
    939         if not np.all(cond):
--> 940             raise ValueError("Domain error in arguments.")
    941 
    942         if np.all(scale == 0):

ValueError: Domain error in arguments.

Please provide any additional information below.

Versions and main components

  • PyMC3 Version: 3.2
  • Theano Version: 0.9.0
  • Python Version: 3.5.3
  • Operating system: Ubuntu 16.04
  • How did you install PyMC3: conda

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions