
Better error messages #497


Closed
jankatins opened this issue Dec 16, 2011 · 3 comments

@jankatins
Contributor

I'm trying to build a DataFrame from a list of lists and I get an "AssertionError". I have no idea what I did wrong or what I should do differently. A shortened version of the data and code:

pprint(_fields)
[ 'name',
'timescited',
'weight',
'closeness',
'betweenness',
'degree',
'numberofworks',
'pagerank',
'pages',
'constraint',
'citationaverage',
'eigenvector']

pprint(_data2)
[ ['Huselid, Ma', 'Gulati, R', 'Damanpour, F', 'Mcallister, Dj', 'Tsai, Wp'],
[2721, 5251, 1269, 1287, 2834],
[6, 17, 6, 3, 6],
[ 0.0002510562854116362,
0.00025108339541739277,
0.00025105435150127433,
0.00025106898828104144,
0.0002510723039607744],
[ 23311.0,
173596.65383408728,
79279.82282582425,
18701.108026093425,
57261.74716265692],
[4, 11, 9, 3, 7],
[6, 17, 6, 3, 6],
[ 0.0001438440250079284,
0.0003063098736173118,
0.00027071793856870986,
7.839463995668608e-05,
0.00020047003509480145],
[130, 387, 148, 68, 88],
[ 0.28573421556122447,
0.12411566040831735,
0.19959178902739988,
0.3703749360800643,
0.200788126303937],
[453, 308, 211, 429, 472],
[ 5.889013732612275e-08,
0.0005066654664776551,
4.567816673280219e-07,
2.4226314239685797e-06,
4.069482619394397e-07]]

df = DataFrame(_data2, columns=_fields )
Traceback (most recent call last):
  File "", line 1, in
  File "C:\portabel\Python27\lib\site-packages\pandas\core\frame.py", line 208, in __init__
    copy=copy)
  File "C:\portabel\Python27\lib\site-packages\pandas\core\frame.py", line 273, in _init_ndarray
    block = make_block(values.T, columns, columns)
  File "C:\portabel\Python27\lib\site-packages\pandas\core\internals.py", line 211, in make_block
    do_integrity_check=do_integrity_check)
  File "C:\portabel\Python27\lib\site-packages\pandas\core\internals.py", line 26, in __init__
    assert(len(items) == len(values))
AssertionError

It would be nice if pandas could output a more descriptive error message in case some inputs are not what pandas expects.

Thanks a lot for the lib!

Jan

@jankatins
Contributor Author

I found the problem after some trial and error: _data2 needs to be a dict... The problem of the unhelpful error message remains :-)
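For illustration, here is a minimal sketch of how such a dict could be built from the two lists, using a cut-down version of the data above. The names `columns_data` and `data_by_name` are made up for this example. DataFrame accepts a dict that maps column names to lists of values, but the sketch stops short of calling pandas so that it runs on its own.

```python
# Cut-down version of the data from the report: one inner list per field.
fields = ["name", "timescited", "weight"]
columns_data = [
    ["Huselid, Ma", "Gulati, R"],
    [2721, 5251],
    [6, 17],
]

# Pair each field name with the list holding that field's values.
# `data_by_name` is a hypothetical name chosen for this sketch.
data_by_name = dict(zip(fields, columns_data))

print(data_by_name["timescited"])
```

A dict of this shape could then be passed as the data argument in place of the list of lists.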

@wesm
Member

wesm commented Dec 18, 2011

OK, here's the new exception:

In [3]: DataFrame(data2, columns=fields)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
/home/wesm/code/pandas/<ipython-input-3-7de2f1fbdd32> in <module>()
----> 1 DataFrame(data2, columns=fields)

/home/wesm/code/pandas/pandas/core/frame.pyc in __init__(self, data, index, columns, dtype, copy)
    209         elif isinstance(data, list):
    210             if len(data) > 0 and isinstance(data[0], (list, tuple)):
--> 211                 data, columns = _list_to_sdict(data, columns)
    212                 mgr = self._init_dict(data, index, columns, dtype=dtype)
    213             else:

/home/wesm/code/pandas/pandas/core/frame.pyc in _list_to_sdict(data, columns)
   3557         if len(columns) != len(content):
   3558             raise AssertionError('%d columns passed, passed data had %s '
-> 3559                                  'columns' % (len(columns), len(content)))
   3560 
   3561     sdict = dict((c, lib.maybe_convert_objects(vals))

AssertionError: 12 columns passed, passed data had 5 columns

For this data you'd want to do:

In [7]: DataFrame(zip(*data2), columns=fields)
Out[7]: 
<class 'pandas.core.frame.DataFrame'>
Int64Index: 5 entries, 0 to 4
Data columns:
name               5  non-null values
timescited         5  non-null values
weight             5  non-null values
closeness          5  non-null values
betweenness        5  non-null values
degree             5  non-null values
numberofworks      5  non-null values
pagerank           5  non-null values
pages              5  non-null values
constraint         5  non-null values
citationaverage    5  non-null values
eigenvector        5  non-null values
dtypes: int64(6), float64(5), object(1)

This works because your input stores each column as a row: zip(*data) transposes a list of lists.
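The transposition can be seen without pandas. The following is a small sketch on a cut-down version of the data, with `columns_data` and `rows` as made-up names. One assumption worth flagging: on Python 3, zip returns an iterator rather than a list, so the sketch wraps it in list() before using the result.

```python
# Column-oriented input: each inner list holds all values of one field.
fields = ["name", "timescited", "weight"]
columns_data = [
    ["Huselid, Ma", "Gulati, R"],
    [2721, 5251],
    [6, 17],
]

# zip(*columns_data) groups the i-th element of every inner list,
# which yields one tuple per row.
# On Python 3, zip is lazy, so materialize it with list().
rows = list(zip(*columns_data))

print(rows)
# [('Huselid, Ma', 2721, 6), ('Gulati, R', 5251, 17)]
```

Each tuple now has as many entries as there are field names, which is the shape the columns argument expects.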

@wesm wesm closed this as completed Dec 18, 2011
@jankatins
Contributor Author

Thanks a lot for this change!

I'm still not sure I would have understood the error. How about something like "12 column name(s) passed, but passed data had 5 columns." or even an added "... If you passed in a list of lists, maybe transposing the data with zip(*data) helps."?
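To make the suggestion concrete, here is a rough sketch of a check that produces such a message. `check_column_count` is a hypothetical helper written for this comment, not a pandas function, and it raises ValueError rather than the AssertionError that pandas raises above.

```python
def check_column_count(column_names, rows):
    """Hypothetical helper (not part of pandas): compare the number of
    column names against the number of values in each row."""
    n_names = len(column_names)
    # Use the length of the first row as the number of data columns.
    n_data = len(rows[0]) if rows else 0
    if n_names != n_data:
        raise ValueError(
            "%d column name(s) passed, but passed data had %d columns. "
            "If you passed in a list of lists, maybe transposing the data "
            "with zip(*data) helps." % (n_names, n_data)
        )


# Three names, but each inner list has only two values, as in the report.
names = ["name", "timescited", "weight"]
data = [["Huselid, Ma", "Gulati, R"], [2721, 5251], [6, 17]]

try:
    check_column_count(names, data)
    message = None
except ValueError as exc:
    message = str(exc)

print(message)

# After transposing, the counts match and the check passes silently.
check_column_count(names, list(zip(*data)))
```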

dan-nadler pushed a commit to dan-nadler/pandas that referenced this issue Sep 23, 2019