Daisy Seminar

Department news:

CALL FOR PARTICIPATION – 2nd International Beer Tasting for DPTers
VENUE: The Wharf, Borgergade 16, Aalborg
WHEN: November 16th 2011 from 16.00 to 18.30 (or later)
KEYNOTE SPEAKER: Ian Russell, the Wharf
REGISTRATION: closed.

Presentation (discussion):

Title
An overview of my past and on-going research activities.

During this talk I (Yoann) will present a broad overview of the research I conducted while I was a member of the data mining group at Montpellier (France). Of course, most of the topics I addressed are related to my PhD thesis (the abstract is given below), but I have also investigated other fields such as landscape classification, flexible queries, and the automatic construction of hierarchies. I will therefore start my talk by presenting some of the slides I used during my PhD defense, and I will finish by briefly introducing the approaches I proposed in parallel with my thesis.

Abstract of my PhD thesis
Due to the rapid growth of information and communication technologies, the amount of generated and available data has exploded, and a new kind of data, stream data, has appeared. One possible and common definition of a data stream is an unbounded sequence of very precise data arriving at a high rate. It is therefore impossible to store such a stream for a posteriori analysis. Moreover, more and more streams carry multidimensional and multilevel data, and very few approaches take these specificities into account. We thus propose some practical and efficient solutions to deal with such data in a dynamic context. More specifically, we are interested in adapting OLAP (On-Line Analytical Processing) techniques to build relevant summaries of the data. First, after describing and discussing existing similar approaches, we propose two solutions for building a data cube on stream data. Second, we investigate the combination of frequent patterns and hierarchies to build a summary based on new generalized sequences. Third, although many types of hierarchies exist in the literature, none of them integrates expert knowledge during the generalization phase, even though such an integration could be very relevant for building semantically richer summaries. We tackle this issue by proposing a new type of hierarchy, the contextual hierarchy. Building on it, we propose a new conceptual, graphical and logical data warehouse model, the contextual data warehouse. Finally, since this work is funded by the ANR through the MIDAS project, we evaluate our approaches on real datasets provided by the industrial partners of this project (e.g., Orange Labs or EDF R&D).

Keywords: Data stream, Hierarchy, Frequent pattern, Data warehouse

Attendance: