Doctoral research project number: 5496

Description

Submission date: 29 November 2018
Title: From Data to Information Management Framework for Reliable Decision-aid Systems
Thesis supervisor: Marie-Jeanne LESOT (LIP6)
Scientific field: Information and communication sciences and technologies
CNRS theme: Not defined

Abstract: Information technology is now heavily subject to the so-called Big Data constraints: variety, velocity, volume, veracity, etc. The amount of information a human operator must process is becoming so large and heterogeneous that managing it manually is rarely an option. As diverse sensors become more widely available, most systems have to integrate a large number of sensor outputs, combined with domain-specific knowledge provided by human experts, into complex databases (e.g. for maintenance programs in transportation systems, operations management, etc.). Making sense of all this and deriving appropriate decisions from it is already an analytics challenge, but it is not the only problem. Appropriate decision-making relies heavily on the quality of the notions involved: the quality of data, where data roughly means the output of a given sensor, and the quality of information, where information means the context associated with some data, its interpretations, a given structure, etc., according to the chosen granularity of analysis. However, since a human contribution may in some cases be regarded as just another sensor output, the distinction between data quality and information quality is somewhat vague. Both suffer from incomplete responses or entirely missing points, inaccuracy, reliability issues, mismatches and conflicting points, to which one can add the more complex and ambiguous semantics of human-provided “data” and the defects that come with it (imprecision, subjectivity, variability, etc.). When confronted with either data or information, decision-making usually relies on a process called fusion, or integration, of multiple sources [1] in order to give a single pertinent answer. A distinction is made between low-level fusion (dealing directly with sensor outputs) and high-level fusion (dealing with structured information already derived in some way from the raw outputs).
In modern databases, however, the two are intertwined, and one could perhaps benefit from considering these heterogeneous notions together. Integrating and analysing multiple sources of real-world data remains a challenge today, despite existing theoretical frameworks and tools. On the theoretical side, this thesis aims to define a common framework for analytics and decision-making based on data quality and information quality characteristics and metrics. The starting point could be characteristics already defined in previous extensive studies [2], adapted to take into account the heterogeneity of the inputs in terms of the granularity level of data/information, and not only in terms of the nature of the source. On the practical side, a real-life maintenance database could be considered in order to devise and validate heterogeneous fusion algorithms and/or data comprehension approaches for better decision systems.

Doctoral candidate: Lenart Marcin