
AttributeError #4

Open
Jarvx opened this issue Aug 5, 2021 · 5 comments

Jarvx commented Aug 5, 2021

It looks like the code does not load the directory of Python projects. Could you please take a look?
The command I used is
python TW_extractor.py --o $OUTPUT_FOLDER --d $REPOS --w $THREADS

with the parameters replaced by the actual values. Oddly, the output says "Found 0 processed projects".

Number of selected Python projects: 0
data_out/funcs
Found 0 processed projects
Traceback (most recent call last):
  File "TW_extractor.py", line 131, in <module>
    df = parse_df(DATA_FILES, batch_size=128)
  File "typewriter/dl-type-python/dltpy/input_preparation/generate_df.py", line 77, in parse_df
    df_merged = df_merged.reset_index(drop=True)
AttributeError: 'NoneType' object has no attribute 'reset_index'

Member

mir-am commented Aug 5, 2021

Thanks for submitting an issue.
Indeed, there is a problem with the code, as it has not been maintained for a while.
As the log above shows, no Python projects were selected.
For now, the workaround is to use Python projects that appear in this JSON list here. I know this is odd, since you want to use your own dataset.
To get around this limitation, you can patch this function here to fit your needs and load repositories without using the JSON list.
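
A patch along those lines might look roughly like the sketch below. Instead of filtering repositories through the JSON list, it treats every subdirectory of the repository folder that contains at least one .py file as a project, and it fails early with a clear message rather than letting a later step hit the 'NoneType' error from the log. The name `list_local_projects` and the one-folder-per-project layout are assumptions made for illustration; this is not the function that TW_extractor.py actually uses.

```python
from pathlib import Path


def list_local_projects(repos_dir):
    """Return sorted paths of subdirectories that contain at least one .py file.

    Hypothetical helper for illustration; not part of TW_extractor.py.
    """
    root = Path(repos_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Repository folder not found: {root}")

    projects = [
        p for p in sorted(root.iterdir())
        # A folder counts as a project if any .py file exists anywhere below it.
        if p.is_dir() and next(p.rglob("*.py"), None) is not None
    ]
    if not projects:
        # Stop here instead of passing an empty selection to later steps.
        raise ValueError(f"No Python projects found under {root}")
    return projects
```

Calling `list_local_projects("repos")` on a folder with no Python code raises `ValueError` immediately, which is easier to diagnose than an `AttributeError` several steps later.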

By the way, if you are interested in DL-based type prediction, check out our Type4Py model and its VSCode extension. Type4Py performs better than TypeWriter!

Author

Jarvx commented Aug 6, 2021

Thanks a lot. But I am still a bit confused about how to patch the function. I am looking for a tool that infers types for given Python projects. Do you think Type4Py is suitable? The documentation goes into detail about data preprocessing and model training.
It says

Skip this step if you're using the ManyTypes4Py dataset.

However, in the second step, type4py preprocess --o $OUTPUT_DIR --l $LIMIT, the command does not show how I can preprocess my own data (a few Python projects).

I believe the VS Code extension would be a good option, but since I hope to collect type inference results for quite a few Python scripts, it is hard to use VS Code for this purpose.

Can you please let me know how I can use my own data with the pre-trained model to produce types?

Member

mir-am commented Aug 6, 2021

Can you please let me know how I can use my own data with the pre-trained model to produce types?

For a few Python projects, you can use Type4Py's API to get type information.
Here is a minimal example of getting type information for one Python file:

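import requests

# Read one Python file and send its source to the Type4Py prediction endpoint.
with open('example.py') as f:
    source = f.read()

r = requests.post("https://type4py.com/api/predict?tc=0", source)
r.raise_for_status()  # stop on HTTP errors instead of parsing an error page
print(r.json())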

This gives you type information for the given Python file in JSON format. Replace example.py with your file(s).
As an example, see this function here on how to retrieve type information for parameters, return types, and variables.
Other fields of the JSON response are documented here.
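
For more than a handful of files, the same request can be wrapped in a loop. The following is a minimal sketch, not an official Type4Py client: `predict_project`, its `post` parameter, and the output layout (one .json file per source file) are assumptions made for illustration, and only the endpoint URL comes from the example above. The raw JSON reply is stored unchanged, so no assumption is made about its fields.

```python
import json
from pathlib import Path

API_URL = "https://type4py.com/api/predict?tc=0"  # endpoint from the example above


def predict_project(project_dir, out_dir, post=None):
    """Send every .py file under project_dir to the API and save each JSON reply.

    `post` defaults to requests.post; pass a stand-in to try the loop offline.
    Returns the list of source files whose request failed.
    """
    if post is None:
        import requests  # imported lazily so a stand-in works without it
        post = requests.post

    project_dir, out_dir = Path(project_dir), Path(out_dir)
    failed = []
    for src in sorted(project_dir.rglob("*.py")):
        try:
            reply = post(API_URL, src.read_bytes())
            reply.raise_for_status()
            result = reply.json()
        except Exception as exc:  # one bad file should not stop the batch
            print(f"skipped {src}: {exc}")
            failed.append(src)
            continue
        # Mirror the project layout: pkg/mod.py -> out_dir/pkg/mod.json
        target = out_dir / src.relative_to(project_dir).with_suffix(".json")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(json.dumps(result, indent=2))
    return failed
```

A call such as `predict_project("my_project", "predictions")` would then leave one JSON file per Python file under `predictions/`, and the returned list shows which files need a retry.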

Let me know if there are questions or issues.

Author

Jarvx commented Aug 8, 2021

Hi Mir,

I really appreciate your patience and help. The API worked perfectly. I am just wondering how to process projects in batches. Do you recommend this project, saltudelft/dl-type-python, or the other one? I was hoping to learn more about the implementation of your algorithms, but I got stuck at the first step, data preprocessing.

You said I can patch the function so that the code loads my own dataset, but I noticed that the JSON file lists projects from GitHub.

Once again, I really appreciate your help!

Member

mir-am commented Aug 10, 2021

To process your own projects and/or train a type prediction model, I highly suggest using Type4Py. It's currently under active development and research.

You can start with step 1 of Type4Py, i.e., processing your own dataset.

Let me know if there are issues or questions.
