Data Science Projects (Q1-2022)

In this first quarter of the 2021-2022 academic year, I am supervising five final-year projects in the Data Science Master's program at the UOC. Three are in “general” domains: 1) argument mining, 2) mining of encyclopedic knowledge from Wikipedia, and 3) text anonymization. Two are in applied domains: 4) use of NLP in an online medical consultation application, and 5) detection of recurrent defects in aircraft safety reports. Below is a selection of the resources and references I give the students to get them started (sorry, the bibliographical citations are a bit sloppy).


Anonymization

What is anonymization?

Anonymization means processing personal data so that identification is irreversibly prevented. If the personal information is recoverable, say because it is held in a separate resource, or the information-hiding techniques employed make it recoverable, or the data still contains information that makes the person identifiable or partially identifiable, or different pieces of information can be cross-referenced within the dataset or across datasets to infer the person's identity, ...