python - Statsmodels Categorical Data from Formula (using pandas)
I am trying to finish a homework assignment and need to use categorical variables in statsmodels (due to a refusal to conform to using Stata like everyone else). I have spent some time reading through the documentation for both patsy and statsmodels, and I can't quite figure out why this snippet of code isn't working. I have tried breaking the pieces down and creating them with the patsy commands directly, but I come up with the same error.
I currently have:
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as sm

    # Here I'm getting the data
    data = pd.read_csv("http://people.stern.nyu.edu/wgreene/econometrics/bankdata.csv")

    # I want to use this form for my regression
    form = "C ~ Q1 + Q2 + Q3 + Q4 + Q5 + C(BANK)"

    # Do the regression
    mod = sm.ols(form, data=data)
    reg = mod.fit()
    print(reg.summary2())
This code raises an error that says: TypeError: 'Series' object is not callable. There is a similar example on the statsmodels website that seems to work fine, and I'm not sure what the difference is between what I'm doing and what they're doing.
Any help is appreciated.
Cheers
The problem is that C is the name of one of the columns in your DataFrame as well as patsy's way of denoting that you want a categorical variable. The easiest fix is to rename the column, like so:

    data = data.rename(columns={'C': 'C_data'})
    form = "C_data ~ Q1 + Q2 + Q3 + Q4 + Q5 + C(BANK)"

Then the call to sm.ols should just work.
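To make the rename fix concrete without depending on the remote CSV, here is a minimal sketch on a synthetic DataFrame. The data values, the reduced formula (only Q1 instead of Q1 through Q5), and the `smf` alias are assumptions made for illustration; only the column names C and BANK mirror the question.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the bank data (made-up values, not the real file).
rng = np.random.default_rng(0)
n = 60
data = pd.DataFrame({
    "C": rng.normal(size=n),               # response column whose name clashes with patsy's C()
    "Q1": rng.normal(size=n),              # one numeric regressor, for brevity
    "BANK": np.repeat([1, 2, 3], n // 3),  # integer codes to be treated as categories
})

# Rename the clashing column so that C(...) in the formula refers to
# patsy's categorical helper again instead of the data column.
data = data.rename(columns={"C": "C_data"})

reg = smf.ols("C_data ~ Q1 + C(BANK)", data=data).fit()
print(reg.params)
```

With three bank codes, the fit should report an intercept, two dummy coefficients for BANK (one level is absorbed as the reference), and one slope for Q1.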
The error message TypeError: 'Series' object is not callable can be interpreted as follows:
- patsy interprets C as a column of the data frame. In this case that is the Series data['C'].
- Because the name is followed by parentheses, evaluating the formula tries to call data['C'] as a function with the argument BANK. Series objects don't implement a __call__ method, hence the error message that the 'Series' object is not callable.
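The two steps above can be mimicked with a toy model in plain Python. This is only a sketch of the name-shadowing idea, not patsy's actual implementation: `FakeSeries`, this stand-in `C`, and `evaluate` are hypothetical names, and the "data columns win over formula helpers" lookup order is written in by hand.

```python
class FakeSeries:
    """Minimal stand-in for a pandas Series: holds values, defines no __call__."""

    def __init__(self, values):
        self.values = list(values)


def C(values):
    """Stand-in for a categorical marker; just tags its argument."""
    return ("categorical", values)


def evaluate(expr, data):
    # Assumed lookup order: formula helpers first, then data columns on top,
    # so a column named "C" replaces the helper of the same name.
    namespace = {"C": C, **data}
    return eval(expr, {}, namespace)


bank = FakeSeries([1, 2, 3])
clashing = {"C": FakeSeries([0.5, 0.7, 0.9]), "BANK": bank}
renamed = {"C_data": FakeSeries([0.5, 0.7, 0.9]), "BANK": bank}

# With a column named "C", the expression calls the column object itself.
try:
    evaluate("C(BANK)", clashing)
    error = None
except TypeError as exc:
    error = str(exc)
print(error)  # 'FakeSeries' object is not callable

# After renaming the column, "C" resolves to the helper function again.
result = evaluate("C(BANK)", renamed)
```

The same expression string succeeds or fails depending only on which names the data brings into scope, which is why renaming the column is enough.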
Good luck!