Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Return the right class, when subclassing a DataFrame with multi-level index #9632

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 4, 2015

Conversation

m3nu
Copy link

@m3nu m3nu commented Mar 11, 2015

Subclassing pandas.DataFrame had some issues, when a multi-level index was used. This PR should fix that.

@jreback
Copy link
Contributor

jreback commented Mar 13, 2015

needs tests

@jreback jreback added this to the 0.16.1 milestone Mar 13, 2015
@jreback jreback added API Design Compat pandas objects compatability with Numpy or Python functions labels Mar 13, 2015
@m3nu
Copy link
Author

m3nu commented Mar 15, 2015

What to test for, @jreback ? I can test, if subclassing works correctly, but there are other open issues on this topic and I understand it's work in progress.

@jreback
Copy link
Contributor

jreback commented Mar 15, 2015

well you need to assert that the correct class is returned

you create a dummy class do the operation and assert

not sure what you mean by the rest of your statement - exactly what doesn't work?

@m3nu
Copy link
Author

m3nu commented Mar 16, 2015

You're right. That's something that can be tested. I'll just add the use cases that were broken before. Thanks for the pointer.

@jreback
Copy link
Contributor

jreback commented Mar 20, 2015

whats new is now available for 0.16.1 5ebf521

@m3nu
Copy link
Author

m3nu commented Mar 24, 2015

Great. Now it pulled in some other commits as well. Is that an issue?

@jorisvandenbossche
Copy link
Member

@manuelRiel If you do:

git fetch upstream
git rebase upstream/master

this should solve the extra commits

@m3nu
Copy link
Author

m3nu commented Mar 24, 2015

Yes. That did the trick. Usually I do a feature branch, but I forgot this time. So the PR is ok now?

@jreback
Copy link
Contributor

jreback commented Mar 24, 2015

@manuelRiel you need to git push gitusername branchname -f
in order to reflect these changes here.

@m3nu
Copy link
Author

m3nu commented Mar 24, 2015

Cleaned out the extra commits. We're good now?

@@ -2751,6 +2751,52 @@ def test_insert_error_msmgs(self):
with assertRaisesRegexp(TypeError, msg):
df['gr'] = df.groupby(['b', 'c']).count()

def test_frame_subclassing_and_slicing(self):
# Subclass frame and ensure it returns the right class on slicing it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the issue (this pr number is ok) as a comment

@jreback
Copy link
Contributor

jreback commented Apr 28, 2015

pls add a release note in v0.16.1 (you can put in API changes).

@sinhrks can you take a look

@m3nu
Copy link
Author

m3nu commented Apr 28, 2015

Thanks for the heads up, guys. I have added the reference to the PR, as well as the release note. Let me know, if there is anything else to change.

@@ -27,7 +27,7 @@ Enhancements
API changes
~~~~~~~~~~~


- When subclassing ``DataFrame``, the correct class is returned, as specified in ``_constructor_sliced``
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add this the PR number as well (as there isn't an issue number)

@jreback
Copy link
Contributor

jreback commented Apr 28, 2015

pls rebase / squash, see here for help

@m3nu
Copy link
Author

m3nu commented Apr 28, 2015

Done @jreback

@sinhrks
Copy link
Member

sinhrks commented Apr 28, 2015

@manuelRiel Thanks for the fix. Can you prepare test cases when the column is MultiIndex, because you're fixing _getitem_multilevel? Pls check followings can return correct classes.

# comments are results of current master
mcol = pd.MultiIndex.from_tuples([('A', 'A'), ('A', 'B')])
df = CustomDataFrame([[0, 1], [2, 3]], columns=mcol)

type(df['A'])
# <class 'pandas.core.frame.DataFrame'>

mcol = pd.MultiIndex.from_tuples([('A', ''), ('B', '')])
df = CustomDataFrame([[0, 1], [2, 3]], columns=mcol)
type(df['A'])
# <class 'pandas.core.series.Series'>

@m3nu
Copy link
Author

m3nu commented Apr 28, 2015

Sure. I'll look into that. We're using multi-levels extensively and didn't run into issues. But it's possible I overrode this function in the subclass. I'll make sure it works properly.

@m3nu
Copy link
Author

m3nu commented Apr 28, 2015

Added the tests, @sinhrks . Returns custom class as expected.

@@ -27,7 +27,7 @@ Enhancements
API changes
~~~~~~~~~~~


- When subclassing ``DataFrame``, the correct class is returned, as specified in ``_constructor_sliced`` (PR 9632)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test lgtm. I think the issue is classified as a bug rather than API change. Can you describe a occurrence condition briefly, and refer to the issue as the same style as others (pls correct my English...)?

- Bug in subclassed ``DataFrame`` with ``MultiIndex`` columns may not return correct class (:issue:`9632`)

Copy link
Author

@m3nu m3nu Apr 29, 2015 via email

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the wording below acceptable, @sinhrks ? I'll put that under Bug fixes

  • Bug in subclassed DataFrame with our without MultiIndex may not return the correct class, when slicing or subsetting it. (:issue:2859, :issue:9632)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I think only referring to #9632 is enough (as 2859 has been closed already).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Committed as discussed.

@jreback
Copy link
Contributor

jreback commented May 4, 2015

@manuelRiel this looks fine. Can you rebase and ping on green.

@m3nu m3nu closed this May 4, 2015
@m3nu m3nu reopened this May 4, 2015
@m3nu
Copy link
Author

m3nu commented May 4, 2015

Rebased and passing, @jreback

jreback added a commit that referenced this pull request May 4, 2015
Return the right class, when subclassing a DataFrame with multi-level index
@jreback jreback merged commit 43b6c3f into pandas-dev:master May 4, 2015
@jreback
Copy link
Contributor

jreback commented May 4, 2015

@manuelRiel thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design Compat pandas objects compatability with Numpy or Python functions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants