Doctoral research project number: 6437


Submission date: 9 October 2019
Title: Scalable Analytics Framework for Context Driven Forensics Linguistic Investigation on Unstructured Data
Thesis director: Salima BENBERNOU (LIPADE)
Supervisor: Yehia TAHER (DAVID)
Scientific field: Information and communication sciences and technologies
CNRS theme: Not defined

Abstract: Unstructured data is information that either does not have a predefined data model or is not organized in a predefined manner. Today, 80% of the data in most organizations is unstructured. Organizations need to understand the types of unstructured data they are accumulating and the best ways to process and store this data for complex analysis in order to gain an advantage. Without effective data management strategies and guidance, organizations run the risk of failing to capitalize on unstructured data. Typically, unstructured information is text-heavy and expresses subjective opinions; however, it can also be non-textual, and it can be human- or machine-generated. Unstructured information contains critical data such as names, dates, locations, entities, actions, behaviors, and facts. These data are very often suggestive, sensitive, and influential ingredients of mission-critical tasks. Digital forensics investigation is one example of such a crucial task: it infers evidence related to illicit events or actions. Traditionally, digital forensics investigation is carried out by examining digital devices such as computers and cell phones to extract knowledge or evidence. However, forensic analysis of the data itself, which could be highly effective in extracting intelligence or evidence, has been entirely overlooked until now. This type of analysis involves large volumes of unstructured text stored in organizational repositories, as well as data from the Web and social media, which likewise have no predefined structure.

Doctoral student: Abdallah Raed