Data science laboratory - dsdatascif17lm, FIZ/3/087

This repository is for internal administration of the course

Course Administrator: David Visontai

The List of projects and applicants is here

Location and time of meetings: The meetings will be held in the 5.56 Information technology laboratory on Wednesdays, starting at 12:00 and ending at 14:00 at the latest. Most of the time students will give presentations and report on their progress, so attendance for the whole session is not obligatory, but it is highly recommended.

The goal of the course is to instil practical skills needed for exploratory data analysis. With the acquired knowledge, students should be able to perform independent research that requires handling big data. To this end, students will work on a couple of longer-running projects inspired by data-intensive problems drawn from multiple fields such as astronomy, genomics and social networks. Students will familiarize themselves with a wide skill set, from various software engineering techniques to presenting their well-distilled research in a manner that is accessible to the general public.

Timetable

11/09/2024 - (Meeting) Choosing a project
25/09/2024 - (Meeting) Presentation I: Description of the chosen topic and plan of action (10 minutes maximum for each presentation)
09/10/2024 - (Meeting and progress report submission) Presentation II.
04/11/2024 - Submission of presentation and progress report
13/11/2024 - (Meeting and progress report submission) Presentation III.
27/11/2024 - (Meeting and report submission deadline) Final Presentations.
11/12/2024 - (Meeting) Presentation of the reproduced works  

Grading

  • Quality of the presentations (mainly the final presentation) - 10 points

There will be a maximum of 15 minutes allocated for each presenter and 5 minutes for further discussions.

Data for the projects

  • All the data and other necessary files will be accessible in the Kooplex system, in the /v/courses/datascilab_2024.public/ directories
  • If you need access to a large file that is not there yet, please notify the administrator

Submission of the reports and presentations

  • All material should be uploaded to the Kooplex system. If you use another platform for your presentation, put all the information needed to access it into a file and submit that file.
  • Large data files (> 100MB) that are produced during the workflow and are necessary for obtaining the final results should also be kept in the /v/courses/datascilab_2024.public/ directory. Before submitting your work, please ask the administrator (David Visontai in this case) to make a copy of it in the right directory
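
To check which workflow outputs exceed the 100MB threshold, a short script along the following lines may help. It is only a sketch: the function name `find_large_files` is hypothetical, and reading "100MB" as 100 * 10**6 bytes is an assumption rather than a course rule.

```python
from pathlib import Path

# Assumption: "100MB" is interpreted as 100 * 10**6 bytes.
SIZE_LIMIT_BYTES = 100 * 10**6


def find_large_files(root, limit=SIZE_LIMIT_BYTES):
    """Return (path, size_in_bytes) pairs for files under `root` larger than `limit`.

    Subdirectories are searched recursively. The result is sorted by size,
    largest first.
    """
    large = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            size = path.stat().st_size
            if size > limit:
                large.append((path, size))
    return sorted(large, key=lambda item: item[1], reverse=True)
```

Calling `find_large_files(".")` from your own project directory would list the candidate files to mention to the administrator.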

Upload notebooks and code

Please upload all the notebooks and scripts that are needed to reproduce the results!

Please comment all necessary steps, functions, etc. Think from the perspective of another person who will try to read your code:

  • what questions will they ask
  • what steps are not obvious
  • The notebooks and scripts must have comments, help text and docstrings, bearing in mind that someone in the future might want to reproduce the results
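
As an illustration of the level of documentation meant here, a helper function might be written as follows. The function `normalize_counts` and its behaviour are invented for this example and are not taken from any course project; the docstring layout is one common convention, not a requirement.

```python
def normalize_counts(counts):
    """Convert raw counts to relative frequencies.

    Parameters
    ----------
    counts : dict
        Maps each category label to a non-negative count.

    Returns
    -------
    dict
        Maps each label to its share of the total, so the values sum to 1.

    Raises
    ------
    ValueError
        If the total count is zero, since the shares would be undefined.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot normalize: total count is zero")
    # Divide each count by the total; the order of labels is preserved.
    return {label: value / total for label, value in counts.items()}
```

A reader who has never seen the project can tell from the docstring what goes in, what comes out, and when the function fails, without running it.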

Communication

  • If you have any technical problem or question, please feel free to file an issue in this GitHub repository so that anyone can answer it and everyone can see the answer.