PairFeatureExtractor.fit_transform() throws exception #5

Open
delip opened this issue Aug 7, 2016 · 0 comments

I am working through the example in the "highered dataset" notebook, and I'm particularly interested in token-level features. But when I run this part of the code:

real = [
    lambda i, j, s1, s2: 1.0,
    lambda i, j, s1, s2: 1.0 if s1[i] == s2[j] else 0.0,
    lambda i, j, s1, s2: 1.0 if s1[i] == s2[j] and len(s1[i]) >= 6 else 0.0,
    lambda i, j, s1, s2: 1.0 if s1[i].isdigit() and s2[j].isdigit() and s1[i] == s2[j] else 0.0,
    lambda i, j, s1, s2: 1.0 if s1[i].isalpha() and s2[j].isalpha() and s1[i] == s2[j] else 0.0,
    lambda i, j, s1, s2: 1.0 if not s1[i].isalpha() and not s2[j].isalpha() else 0.0
]
# Other ideas are:
#  to look up whether words are dictionary words,
#  longest common subsequence,
#  standard edit distance
feature_extractor = PairFeatureExtractor(real=real)
X_extracted = feature_extractor.fit_transform(tokX)

I get the following exception:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-04ddbdf4798d> in <module>()
      1 feature_extractor = PairFeatureExtractor(real=real)
----> 2 X_extracted = feature_extractor.fit_transform(tokX)

/home/delip/anaconda2/envs/tensorflow/lib/python2.7/site-packages/pyhacrf/feature_extraction.pyc in fit_transform(self, raw_X, y)
    108             Feature matrix list, for use with estimators or further transformers.
    109         """
--> 110         return self.transform(raw_X)
    111 
    112     def transform(self, raw_X, y=None):

/home/delip/anaconda2/envs/tensorflow/lib/python2.7/site-packages/pyhacrf/feature_extraction.pyc in transform(self, raw_X, y)
    124             Feature matrix list, for use with estimators or further transformers.
    125         """
--> 126         return [self._extract_features(sequence1, sequence2) for sequence1, sequence2 in raw_X]
    127 
    128     def _extract_features(self, sequence1, sequence2):

/home/delip/anaconda2/envs/tensorflow/lib/python2.7/site-packages/pyhacrf/feature_extraction.pyc in _extract_features(self, sequence1, sequence2)
    138 
    139         for k, feature_function in enumerate(self._binary_features):
--> 140             feature_array[..., k] = feature_function(array1, array2)
    141 
    142         if self._sparse_features:

TypeError: <lambda>() takes exactly 4 arguments (2 given)

Any suggestions on what I should be doing differently? I'm running the notebook code as is.
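For reference, the traceback shows the installed `_extract_features` calling each feature as `feature_function(array1, array2)`, with two arguments, while the lambdas in `real` take four (`i, j, s1, s2`). One possible workaround is to wrap each per-position lambda in a two-argument adapter. The sketch below is an untested assumption about what the installed version expects: `adapt_pairwise` is a hypothetical helper (not part of pyhacrf), and it assumes the two arguments are array-likes holding the tokens of each sequence and that the result should be a `(len(s1), len(s2))` matrix, as the assignment `feature_array[..., k] = ...` suggests.

```python
import numpy as np


def adapt_pairwise(feature):
    """Wrap a per-position feature f(i, j, s1, s2) as a two-argument function.

    Hypothetical helper, not part of pyhacrf. Assumes the extractor passes two
    array-likes of tokens and expects a (len(s1), len(s2)) matrix back.
    """
    def wrapped(array1, array2):
        # Flatten whatever shape was passed in back to plain Python lists,
        # so that string methods such as .isdigit() work on each token.
        s1 = np.ravel(array1).tolist()
        s2 = np.ravel(array2).tolist()
        out = np.zeros((len(s1), len(s2)), dtype="float64")
        for i in range(len(s1)):
            for j in range(len(s2)):
                out[i, j] = feature(i, j, s1, s2)
        return out
    return wrapped


# Two of the features from the snippet above, adapted to the two-argument form.
real = [
    lambda i, j, s1, s2: 1.0,
    lambda i, j, s1, s2: 1.0 if s1[i] == s2[j] else 0.0,
]
adapted_real = [adapt_pairwise(f) for f in real]

# Stand-in inputs: a column of tokens and a row of tokens (assumed shapes).
tokens1 = np.array(("12", "main", "street"), ndmin=2).T
tokens2 = np.array(("main", "st"), ndmin=2)
match_matrix = adapted_real[1](tokens1, tokens2)
```

If these assumptions hold, the adapted list could then be passed as `PairFeatureExtractor(real=adapted_real)`. The nested loop is slow for long sequences; it only serves to keep the original four-argument lambdas unchanged.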
