feat: add subset parameter to DataFrame.dropna to select which columns to consider#981
feat: add subset parameter to DataFrame.dropna to select which columns to consider#981
subset parameter to DataFrame.dropna to select which columns to consider#981Conversation
…lumns to consider
|
Looks like some real test failures. I'll look at these later today. |
Looks like there was a bug with |
| <BLANKLINE> | ||
| [3 rows x 3 columns] | ||
|
|
||
| Define in which columns to look for missing values. |
There was a problem hiding this comment.
From doctest.
@@ -1,5 +1,5 @@
- name toy born
-1 Batman Batmobile 1940-04-25
-2 Catwoman Bullwhip NaT
+ name toy born
+1 Batman Batmobile 1940-04-25
+2 Catwoman Bullwhip <NA>
[2 rows x 3 columns]
There was a problem hiding this comment.
Thanks, fixed! The diff got a little big because I pulled in #987. It should be easier to review again once that is merged.
With this change I can once again run ``` pytest --doctest-modules third_party/bigframes_vendored/pandas/core/frame.py ``` Note: having multiple `version.py` files should be fine. release-please will update all such files it finds.
…-subset" This reverts commit 0f18294.
bigframes/dataframe.py
Outdated
| if subset is None: | ||
| subset_ids = None | ||
| elif not utils.is_list_like(subset): | ||
| subset_ids = [self._block.label_to_col_id[subset]] |
There was a problem hiding this comment.
The result of this seems to be a list of tuple, and don't work. May need a loop like in else?
There was a problem hiding this comment.
This is supposed to cover the single column string case. A loop would give one character per iteration. I'll take a look at the error.
There was a problem hiding this comment.
From the defination def label_to_col_id(self) -> typing.Mapping[Label, typing.Sequence[str]], I think self._block.label_to_col_id[subset] will be a sequence/tuple? So I think a loop like for id in self._block.label_to_col_id[label] should be included?
There was a problem hiding this comment.
Chatted offline. self._block.label_to_col_id[subset] is a tuple. I've corrected.
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
Fixes internal issue 366248570
🦕