v1.0

Past due by over 9 years, 50% complete

Capabilities:

  1. retrieve data from AMiner source
  2. transform to relational representation (csv)
  3. build citation graphs for both papers and authors, including the largest connected component (LCC) for authors
  4. construct representative documents (repdocs) for both papers and authors
  5. use repdocs to build a term-frequency (tf) corpus for papers and a tf/tf-idf corpus for authors (stored as node attributes)
  6. allow filtering of the dataset based on publication years of papers
  7. generate stats about each data representation produced
  8. incorporate robust dependency management to optimize processing and organization

Also provide documentation for each data file produced and for each task (data transformation) involved.
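
As a rough illustration of capabilities 3 and 6, the sketch below filters papers by publication year, derives paper and author citation edges, and extracts the largest connected component of the author graph. It is not the project's implementation: the record layout, the sample data, and all function names are hypothetical, and it assumes that author self-citations are dropped and that connectivity is weak (edge direction ignored).

```python
from collections import defaultdict, deque

# Hypothetical records: (paper_id, year, author_ids, cited_paper_ids).
# The real AMiner-derived CSV layout may differ.
papers = [
    ("p1", 2001, ["a1", "a2"], []),
    ("p2", 2003, ["a2"], ["p1"]),
    ("p3", 2005, ["a3"], ["p1", "p2"]),
    ("p4", 2010, ["a4"], []),
    ("p5", 2012, ["a5"], ["p4", "p9"]),  # p9 is not in the dataset
]


def filter_by_year(papers, start, end):
    """Keep papers whose publication year lies in [start, end]."""
    return [p for p in papers if start <= p[1] <= end]


def paper_citation_edges(papers):
    """Directed (citing, cited) pairs, restricted to papers in the dataset."""
    ids = {pid for pid, _, _, _ in papers}
    return {
        (pid, cited)
        for pid, _, _, refs in papers
        for cited in refs
        if cited in ids
    }


def author_citation_edges(papers, paper_edges):
    """Lift paper citations to authors; self-citations are dropped (assumption)."""
    authors_of = {pid: authors for pid, _, authors, _ in papers}
    edges = set()
    for citing, cited in paper_edges:
        for a in authors_of[citing]:
            for b in authors_of[cited]:
                if a != b:
                    edges.add((a, b))
    return edges


def largest_connected_component(nodes, edges):
    """Largest weakly connected component, found with breadth-first search."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, best = set(), set()
    for start in sorted(nodes):
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    component.add(nxt)
                    queue.append(nxt)
        if len(component) > len(best):
            best = component
    return best


paper_edges = paper_citation_edges(papers)
author_edges = author_citation_edges(papers, paper_edges)
all_authors = {a for _, _, authors, _ in papers for a in authors}
author_lcc = largest_connected_component(all_authors, author_edges)
```

Filtering before building the graphs (for example `paper_citation_edges(filter_by_year(papers, 2000, 2005))`) keeps citations to out-of-range papers from appearing as dangling edges.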
