Error "shapes not aligned" when creating Instance with numpy array #19

Open
KamodaP opened this issue Dec 15, 2016 · 1 comment


@KamodaP

KamodaP commented Dec 15, 2016

NumPy arrays and pandas DataFrames are very common in data science applications. Unfortunately, when you create an Instance from a NumPy array of shape (1, x), the code at one point builds an array of shape (nobs, x, 1), which fails in dot when multiplied by an array of shape (x, nlayers). This could easily be checked in verify_dataset_shape_and_modify. Here's a snippet:

def mult(x):
    # Product of all entries of a shape tuple, i.e. the total element count.
    res = x[0]
    for a in x[1:]:
        res *= a
    return res
[...]
def verify_dataset_shape_and_modify(...):
    [...]
    # Flatten ndarray features to 1-D; leave anything else (e.g. lists) untouched.
    data = np.array([
        instance.features.reshape((mult(instance.features.shape),))
        if isinstance(instance.features, np.ndarray)  # more generalization perhaps?
        else instance.features
        for instance in dataset
    ])

With this snippet, even accidentally passing arrays of shape (x, 1) will work.
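
The misalignment and the effect of flattening can be reproduced with NumPy alone. The following is a minimal sketch, not the library's code: `flatten_features` is a hypothetical helper, and the feature count of 3 and the (3, 4) weight matrix are assumptions picked for illustration. It uses column-shaped (x, 1) inputs, which stack into the (nobs, x, 1) array described above.

```python
import numpy as np

def flatten_features(features):
    """Return features as a 1-D array, whatever shape they arrived in."""
    # np.asarray accepts lists and arrays alike; reshape(-1) flattens to 1-D.
    return np.asarray(features).reshape(-1)

# Two instances whose features arrive as (3, 1) column arrays (assumed example data).
raw = [np.array([[0.0], [1.0], [2.0]]) for _ in range(2)]
weights = np.ones((3, 4))  # hypothetical (x, nlayers) weight matrix

# Stacking the columns directly yields shape (2, 3, 1), which dot cannot align
# with a (3, 4) matrix because the last axis has length 1 instead of 3.
stacked = np.array(raw)
try:
    stacked.dot(weights)
    aligned = True
except ValueError:
    aligned = False

# Flattening each instance first yields shape (2, 3), and the product works.
fixed = np.array([flatten_features(f) for f in raw])
output = fixed.dot(weights)  # shape (2, 4)
```

Using `reshape(-1)` lets NumPy infer the flattened length, so a separate product-of-shape helper is not strictly needed in this variant.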

@jorgenkg
Owner

You are correct. The documentation states that an instance should be initialized with a built-in, one-dimensional Python list, e.g. Instance( [0, 1] ). However, as you point out, there are currently no checks that verify whether the argument is indeed a Python list.

I'll add a validation check to my to-do list. Thank you for pointing it out.
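
One possible shape for such a check, sketched under assumptions: `validate_features` is a hypothetical helper name, not an existing function in this project, and the choice to reject anything other than a flat list of numbers is just one reading of the documented contract.

```python
def validate_features(features):
    """Raise if features is not a flat built-in list of numbers (hypothetical check)."""
    if not isinstance(features, list):
        raise TypeError(
            "features must be a built-in list, got %s" % type(features).__name__
        )
    for value in features:
        # Nested lists or non-numeric entries mean the input is not one-dimensional.
        # Note that bool is a subclass of int, so True/False also pass this check.
        if not isinstance(value, (int, float)):
            raise ValueError("features must be one-dimensional and numeric")
    return features
```

Calling `validate_features([0, 1])` returns the list unchanged, while a nested list such as `[[0, 1]]` raises a `ValueError` and a non-list argument raises a `TypeError`. A more lenient alternative would be to flatten array inputs instead of rejecting them.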
