Today, organisations gather data from several transactional databases into very large databases designed for analysis, called data warehouses. As more and more data is digitised and the need for real-time information grows, the update frequency of transactional databases is increasing significantly. These updates must be transmitted rapidly to the data warehouse. Propagating updates is already a complex process in transactional systems, and it is even more complex in data warehouses because of their multidimensional structure. Indeed, a data warehouse denormalises the data structure by adding redundant data to speed up analysis. Star and snowflake schemas are examples of denormalised structures; they contain dimensions, and measures gathered in facts. Fact tables, mostly composed of aggregated data, make up the largest part of the cube (about 80%). Today, updating a data warehouse implies a complete rebuild from scratch, often performed in nightly batches.

The research project I am in charge of will try to address this time problem by defining a method for an automatic, incremental and consistent updating process that propagates updates only to the data that require them. Only some parts of the cube will then be rebuilt. The time required to update the cube completely will decrease, which may open the door to real-time applications. We will also make the process interoperable and open by deploying it as a web service.

This research is part of a GEOIDE project entitled "Development of an interactive web tool to better understand climate-related health vulnerabilities", co-directed by Dr P. Gosselin and Dr T. Badard from Laval University. It mainly involves members from the Centre for Research in Geomatics (CRG), INSPQ, Health Canada, Statistics Canada, Ouranos, INRS, Quebec Civil Protection and the cities of Québec and Lévis.

Charlotte Declercq, MSc. Student