python - Statsmodels Categorical Data from Formula (using pandas)
I am trying to finish a homework assignment and need to use categorical variables in statsmodels (due to a refusal to conform to using Stata like everyone else). I have spent some time reading through the documentation for both patsy and statsmodels, and I can't quite figure out why this snippet of code isn't working. I have tried breaking the pieces down and creating them with the patsy commands directly, but I come up with the same error.
I have:
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as sm

    # I'm getting the data
    data = pd.read_csv("http://people.stern.nyu.edu/wgreene/econometrics/bankdata.csv")

    # I want to use this form for the regression
    form = "C ~ Q1 + Q2 + Q3 + Q4 + Q5 + C(BANK)"

    # The regression
    mod = sm.ols(form, data=data)
    reg = mod.fit()
    print(reg.summary2())

This code raises an error that says: TypeError: 'Series' object is not callable. There is a similar example on the statsmodels website that seems to work fine, and I'm not sure what the difference is between what I'm doing and what they're doing.
Any help is appreciated.
cheers
The problem is that C is the name of one of the columns in your DataFrame as well as the patsy way of denoting that you want a categorical variable. The easiest fix is to rename the column, like this:
    # Rename the response column so it no longer shadows patsy's C()
    data = data.rename(columns={'C': 'C_data'})
    form = "C_data ~ Q1 + Q2 + Q3 + Q4 + Q5 + C(BANK)"
Then the call to sm.ols will work.
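If you want to check the rename fix without downloading the file, here is a minimal sketch. It assumes a small randomly generated DataFrame as a stand-in for the bank data (the values are made up, and only Q1 is included to keep it short), and it uses `DataFrame.rename(columns=...)` for the rename.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as sm

# Stand-in data (hypothetical values, NOT the real bank dataset)
rng = np.random.default_rng(0)
n = 60
data = pd.DataFrame({
    "C": rng.normal(size=n),                  # response column named C, as in the question
    "Q1": rng.normal(size=n),                 # one numeric regressor
    "BANK": np.repeat([1, 2, 3], n // 3),     # group identifier to treat as categorical
})

# Rename the column so the name C is free for patsy's categorical marker
data = data.rename(columns={"C": "C_data"})

reg = sm.ols("C_data ~ Q1 + C(BANK)", data=data).fit()

# The categorical term is expanded into one dummy per non-reference level
param_names = list(reg.params.index)
print(param_names)
```

The printed parameter names should include the intercept, Q1, and dummy terms whose names begin with C(BANK), which shows that BANK was treated as categorical.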
The error message TypeError: 'Series' object is not callable can be interpreted as follows:
- patsy interprets C as the column of the data frame. In this case that is the Series data['C'].
- Then the fact that it is followed by parentheses made statsmodels try to call data['C'] as a function with the argument BANK. Series objects don't implement a __call__ method, hence the error message 'Series' object is not callable.
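The two steps above can be reproduced with plain pandas, no formula involved. This is only a sketch of the mechanism, assuming a tiny made-up DataFrame; it is not what patsy does internally line for line, just the same "look up the name, then call it" sequence.

```python
import pandas as pd

# Tiny made-up frame with a column named C, as in the question
data = pd.DataFrame({"C": [1.0, 2.0, 3.0], "BANK": [1, 1, 2]})

# Step 1: the name C resolves to the column, which is a Series
c_column = data["C"]

# Step 2: the parentheses in C(BANK) turn that into a call on the Series
try:
    c_column(data["BANK"])
except TypeError as exc:
    message = str(exc)
    print(message)
```

The message is the same one reported in the question, because a Series has no __call__ method.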
good luck!