This repository contains the source code of most of the experiments developed in my research. The LaTeX and Python code for generating the paper, experiments' results and visualizations reported in each paper is available (whenever possible) in the paper's directory.
Additionally, contributions at the algorithm level are available in the package ml-research, which can be found here.
- Fonseca, J., Bell, A., Abrate, C., Bonchi, F., Stoyanovich, J. (-). Multi Agent Dynamic Counterfactual Recourse. Working paper.
- Fonseca, J., Bacao, F. (2023). Synthetic Data Generation: A Literature Review. Submitted to Journal of Big Data.
- Fonseca, J., & Bacao, F. (2023). Geometric SMOTE for Imbalanced Datasets with Nominal and Continuous Features. Submitted to Expert Systems with Applications.
- Fonseca, J., & Bacao, F. (2023). Improving Active Learning Performance through the Use of Data Augmentation. International Journal of Intelligent Systems, 2023. https://doi.org/10.1155/2023/7941878
- Fonseca, J., & Bacao, F. (2022). Research trends and applications of data augmentation algorithms. arXiv preprint arXiv:2207.08817. https://arxiv.org/abs/2207.08817
- Fonseca, J., Douzas, G., Bacao, F. (2021). Increasing the Effectiveness of Active Learning: Introducing Artificial Data Generation in Active Learning for Land Use/Land Cover Classification. Remote Sensing, 13(13), 2619. https://doi.org/10.3390/rs13132619
- Fonseca, J., Douzas, G., Bacao, F. (2021). Improving Imbalanced Land Cover Classification with K-Means SMOTE: Detecting and Oversampling Distinctive Minority Spectral Signatures. Information, 12(7), 266. https://doi.org/10.3390/info12070266
- Crayton, A., Fonseca, J., Mehra, K., Ng, M., Ross, J., Sandoval-Castañeda, M., & von Gnecht, R. (2021). Narratives and Needs: Analyzing Experiences of Cyclone Amphan Using Twitter Discourse. In IJCAI 2021 Workshop on AI for Social Good. https://crcs.seas.harvard.edu/publications/narratives-and-needs-analyzing-experiences-cyclone-amphan-using-twitter-discourse
- Douzas, G., Bacao, F., Fonseca, J., & Khudinyan, M. (2019). Imbalanced Learning in Land Cover Classification: Improving Minority Classes’ Prediction Accuracy Using the Geometric SMOTE Algorithm. Remote Sensing, 11(24), 3040. https://doi.org/10.3390/rs11243040
The typical project structure contains the scripts, data, results, analysis and content directories. Each of these is used as described below.
Installing the required packages is essential to reproduce any project. The requirements file may be located either in the project root or in the scripts directory. To install the required dependencies, run the command:
pip install -r requirements.txt
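When automating the setup, the two possible locations of the requirements file can be checked in turn. The following is a minimal sketch, not part of the repository: the helper name `find_requirements` and its `project_dir` argument are hypothetical, and it assumes the file is named requirements.txt and the scripts directory is named scripts, as described above.

```python
from pathlib import Path


def find_requirements(project_dir):
    """Return the path to requirements.txt, or None if it is not found.

    Hypothetical helper: checks the project root first, then the
    scripts directory, which are the two locations mentioned above.
    """
    project_dir = Path(project_dir)
    candidates = (
        project_dir / "requirements.txt",
        project_dir / "scripts" / "requirements.txt",
    )
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

The returned path can then be passed to `pip install -r`.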
Several scripts are provided to generate the content of each publication in a reproducible way.
data.py
Download, preprocess and save the datasets used for the experiments:
python data.py
results.py
Run the experiments and get the results:
python results.py
analysis.py
Analyze the results of experiments:
python analysis.py
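The three scripts are meant to be run in the order listed above, since each step produces the input of the next. As a convenience, they could be chained from Python. This is only a sketch under assumptions: the names `PIPELINE`, `build_commands` and `run_pipeline` are hypothetical, and it assumes the scripts take no command-line arguments and are executed from within the scripts directory.

```python
import subprocess
import sys
from pathlib import Path

# Order taken from the description above: data, then results, then analysis.
PIPELINE = ("data.py", "results.py", "analysis.py")


def build_commands(scripts_dir, scripts=PIPELINE):
    """Return one interpreter command per script, in pipeline order."""
    # Resolve to an absolute path so the commands work from any working directory.
    scripts_dir = Path(scripts_dir).resolve()
    return [[sys.executable, str(scripts_dir / name)] for name in scripts]


def run_pipeline(scripts_dir="."):
    """Run each script in order, stopping at the first failure."""
    scripts_dir = Path(scripts_dir).resolve()
    for command in build_commands(scripts_dir):
        # check=True raises CalledProcessError on a non-zero exit code,
        # so a failed download or experiment halts the later steps.
        subprocess.run(command, check=True, cwd=scripts_dir)
```

Calling `run_pipeline("scripts")` from a project root would then run the three steps in sequence, provided that the project follows the typical structure.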
The data directory contains the experimental data, which is downloaded and saved by the data.py script.
The results directory contains the results of the experiments as pickled pandas dataframes, which are generated by the results.py script.
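Because the results are stored as pickled pandas dataframes, they can be inspected directly with `pandas.read_pickle`. The sketch below is illustrative only: the helper name `load_results`, the default directory name and the `*.pkl` file pattern are assumptions, since the actual file names and extensions may differ between projects.

```python
from pathlib import Path

import pandas as pd


def load_results(results_dir="results", pattern="*.pkl"):
    """Map each pickle file's stem to the dataframe it stores.

    The "*.pkl" pattern is an assumption; adjust it to the file
    names actually written by results.py in a given project.
    """
    return {
        path.stem: pd.read_pickle(path)
        for path in sorted(Path(results_dir).glob(pattern))
    }
```

Unpickling executes code embedded in the file, so only load results that were generated locally or that come from a trusted source.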
The analysis directory contains the analysis of the experiments' results in various formats, generated by the analysis.py script.
The content directory contains the LaTeX source files of the project.