feat: support describe for non-numerical type string#973
Conversation
|
🤖 I detect that the PR title and the commit message differ and there's only one commit. To use the PR title for the commit history, you can use Github's automerge feature with squashing, or use -- conventional-commit-lint bot |
| new_column_index, indexer = self.columns.reindex(columns) | ||
|
|
||
| if indexer is None: | ||
| # The new index is the same as the old one. Do nothing. |
There was a problem hiding this comment.
I'm curious how you discovered this. Worth separating out into it's own PR (with tests).
There was a problem hiding this comment.
This was discovered when I wrote an ealier version of the "describe()":
I didn't check the emptiness of aggregated dataframes, and always concated them in the end. Then there was an edge case where the original dataframe contains only numerical columns, and as a result I was basically reindexing the columns with their original order, which triggered this bug.
60c25e8 to
2c73313
Compare
This change adds the "include" param in the describe method. It can only be set as None(default) or 'all'
I only enabled "count" and "nunique" for string types. More non-numerical types can be added later.