Skip to content

Masking invalid x and/or weights in hist#9706

Closed
dstansby wants to merge 4 commits into
matplotlib:masterfrom
dstansby:hist-masking
Closed

Masking invalid x and/or weights in hist#9706
dstansby wants to merge 4 commits into
matplotlib:masterfrom
dstansby:hist-masking

Conversation

@dstansby

@dstansby dstansby commented Nov 6, 2017

Copy link
Copy Markdown
Member

A rebase of #7133, fixes #6483.

@dstansby dstansby added this to the v2.2 milestone Nov 6, 2017
@@ -6061,6 +6061,46 @@ def hist(self, x, bins=None, range=None, density=None, weights=None,
bin_range = range
del range

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a namespace conflict here. range is both the range of histogram and the python function. If I do just do a quick and dirty example:

del range
range(np.arange(100))

I get the below error message:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-21-0eb074616bf8> in <module>()
----> 1 del range
      2 range(np.arange(100))
NameError: name 'range' is not defined

This seems similar to the errors generated in the tests, which comes from using range to refer to the built-in python function on line 6171.

Comment thread lib/matplotlib/axes/_axes.py Outdated
bin_range = range
del range

def _normalize_input(inp, ename='input'):

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem is that this function has disappeared on the master branch, so I think another approach to the problem is needed.

@dstansby

dstansby commented Nov 7, 2017

Copy link
Copy Markdown
Member Author

I've tidied this up a bit - using masked arrays is not working at all with the rest of the ax.hist function though... should I be converting them back to normal arrays with the masked values just removed?

@dopplershift

Copy link
Copy Markdown
Contributor

Yeah, I think cbook.delete_masked_points is the usual way to go here.

@dstansby

dstansby commented Nov 8, 2017

Copy link
Copy Markdown
Member Author

🤦‍♀️ turns out the issue was PEBKAC and not the masked arrays... should be fine now

@dstansby

dstansby commented Nov 8, 2017

Copy link
Copy Markdown
Member Author

(this still needs a test and a rebase, but I think the code is fine now)

@efiring

efiring commented Nov 11, 2017

Copy link
Copy Markdown
Member

Looks like you also need to modify the docstring to note the support for masked arrays and invalid values, and also note the support for masked arrays in the whats_new entry.

TrishGillett and others added 4 commits November 14, 2017 20:10
Tidying up the data prep work in hist.

Mask invalid entries in x and weights before plotting histograms.

Clean up hist masking code

Fix masking
jklymak
jklymak previously approved these changes Dec 2, 2017

@jklymak jklymak left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. I'm a little curious what masked values do to density, but I guess thats really in np.histogram, not here.

@dstansby dstansby added the Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. label Jan 22, 2018
efiring
efiring previously approved these changes Jan 29, 2018
@tacaswell tacaswell modified the milestones: v2.2, v3.0 Jan 29, 2018
@tacaswell tacaswell removed the Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. label Jan 29, 2018
@efiring efiring dismissed their stale review January 29, 2018 20:57

It looks like this is actually not supporting masked arrays as it says it is.

@tacaswell

Copy link
Copy Markdown
Member

Due to some on-going discussion about what to do in the case of missing data (and how the handle the averaging).

@jklymak jklymak dismissed their stale review January 29, 2018 20:58

As above - this needs a bit more discussion

@efiring efiring left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. the docstring is contradictory.
  2. masked arrays and nans are being passed to np.histogram
  3. there are arguments as to how bad values should be handles

@dstansby

Copy link
Copy Markdown
Member Author

Going to close this since it's been a while - I'll leave my branch around for a bit if anyone thinks it's worthwhile picking up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Range determination for data with NaNs

7 participants