v1.0
Past due by over 9 years
50% complete
Capabilities:
- retrieve data from AMiner source
- transform to relational representation (CSV)
- build citation graphs for both papers and authors, including the largest connected component (LCC) for authors
- construct representative documents (repdocs) for both papers and authors
- use repdocs to build a TF corpus for papers and TF/TF-IDF corpora for authors (used as node attributes)
- allow filtering of the dataset based on publication years of papers
- generate stats about each data representation produced
- incorporate robust dependency management to optimize processing and organization
Also document each data file produced and each task (data transformation) involved.
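
To make the capabilities above concrete, here is a minimal sketch of the year filter, the author citation graph with its LCC, and TF-IDF over author repdocs. It is an illustration only, not the project's implementation: the toy records, all function names, the rule that an author repdoc is the concatenation of that author's paper texts, the undirected treatment of edges when computing the LCC, and the raw-count TF with `log(N / df)` IDF are all assumptions.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy records: (paper_id, year, authors, cited_paper_ids, text).
PAPERS = [
    ("p1", 2001, ["alice"], [], "graph mining of citation networks"),
    ("p2", 2003, ["bob"], ["p1"], "citation graph clustering"),
    ("p3", 2005, ["carol", "bob"], ["p1", "p2"], "clustering author networks"),
    ("p4", 2010, ["dave"], [], "topic models for text"),
]


def filter_by_year(papers, start, end):
    """Keep papers published in the inclusive range [start, end]."""
    return [p for p in papers if start <= p[1] <= end]


def paper_citation_edges(papers):
    """Directed (citing, cited) pairs, restricted to papers in the dataset."""
    ids = {p[0] for p in papers}
    return [(p[0], cited) for p in papers for cited in p[3] if cited in ids]


def author_citation_edges(papers):
    """Author A cites author B if any paper by A cites any paper by B."""
    authors_of = {p[0]: p[2] for p in papers}
    edges = set()
    for citing, cited in paper_citation_edges(papers):
        for a in authors_of[citing]:
            for b in authors_of[cited]:
                if a != b:  # assumption: self-citations are dropped
                    edges.add((a, b))
    return edges


def largest_connected_component(nodes, edges):
    """Largest component, treating edges as undirected (an assumption)."""
    adjacent = defaultdict(set)
    for a, b in edges:
        adjacent[a].add(b)
        adjacent[b].add(a)
    seen, best = set(), set()
    for start in sorted(nodes):
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacent[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    stack.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def author_repdocs(papers):
    """Assumed repdoc rule: concatenate the tokens of each author's papers."""
    docs = defaultdict(list)
    for _, _, authors, _, text in papers:
        for author in authors:
            docs[author].extend(text.split())
    return dict(docs)


def tfidf(repdocs):
    """Raw term count times log(N / document frequency), per repdoc."""
    n_docs = len(repdocs)
    doc_freq = Counter(t for tokens in repdocs.values() for t in set(tokens))
    return {
        doc: {t: c * math.log(n_docs / doc_freq[t])
              for t, c in Counter(tokens).items()}
        for doc, tokens in repdocs.items()
    }


authors = {a for p in PAPERS for a in p[2]}
lcc = largest_connected_component(authors, author_citation_edges(PAPERS))
weights = tfidf(author_repdocs(PAPERS))
```

On this toy data the author `dave` neither cites nor is cited, so he falls outside the LCC; in a real run the same step would prune isolated authors before node attributes are attached.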