Tuesday, 17 April 2012
What is the difference between relational and multidimensional database implementations?
Michel Voogd :
“The difference in implementation is that a multidimensional database includes pre-packaging subsets of data into small objects that are usable for fast online browsing, usually in a BI portal environment such as Cognos or Business Objects.
A relational database in itself doesn't include those packages but it would allow querying larger datasets.”
Bala Seetharaman :
“Relational DB - ER modeling; it has to comply with Codd's 12 rules. Here you can store data only in the way supported by the DB engine (you can partition or use multiple filegroups).
Multidimensional DB - Dimensional model - stores the pre-aggregated data in multidimensional form; the data is still sourced from a relational DB or flat files. (Here you can store it in the form of MOLAP, ROLAP, HOLAP and DOLAP too.)
SQL - Query language used to search and manipulate the data in a relational DB.
MDX - Multidimensional Expressions - used to search and retrieve the data from a cube or MDB (multidimensional) store.
Siddharth: To answer your question "is there any different tool or language to query the multidimensional database (cube)",
MDX is the query language used to query the cube, much as SQL is used for a relational DB. However, it is not like ANSI-standard SQL; we need to write queries in the form of 3D axes.
Calculations are easier in an RDB than in an MDB; here, if we don't understand the dimension and hierarchy members, we can't easily get the result from the cube.”
John McKenna :
“….In relational databases data is organized by tables and columns (tuples) and records are grouped into blocks for storage and access. Querying is performed based upon relational algebra (SQL). In multi-dimensional database implementations (most no longer exist), data is organized into multi-dimensional cubes (think multi-dimensional arrays), and queried based on a language suitable for navigating cubes (I am not aware of a standard although one may exist). To further muddy the waters you have columnar databases that group column data into blocks (efficient for ROLAP applications where few columns are in the result set, therefore fewer blocks traversed).
In addition to the database implementations, many reporting tools have (OLAP/cube) functionality built in, but many of these are not full-blown multi-dimensional databases but scaled-down persistence engines that store all cube values together. Most full-blown multi-dimensional databases have faded away due to performance issues (due to sparsity issues, etc.), the need to learn new query languages, the need to support multiple database platforms, and people finding that it was relatively easy to implement cubes in relational databases (ROLAP) by using dimensional database design (Ralph Kimball). …….”
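To make the contrast in the comments above concrete, here is a minimal Python sketch of the two storage ideas. It is not how any real engine works internally: the sample rows, the `ALL` marker and the function names are all made up for illustration. The relational version scans the rows at query time, while the "cube" version pre-aggregates every combination of dimension members up front so that a lookup is a single dictionary access.

```python
from collections import defaultdict

# Relational form: one row (tuple) per fact. Hypothetical sample data:
# (quarter, region, product, amount)
sales_rows = [
    ("2012-Q1", "North", "Laptop", 100),
    ("2012-Q1", "South", "Laptop", 150),
    ("2012-Q1", "North", "Phone", 80),
    ("2012-Q2", "North", "Laptop", 120),
]

def relational_total(rows, quarter=None, region=None, product=None):
    """Scan all rows and filter at query time, roughly like SELECT SUM(...) WHERE ..."""
    return sum(
        amount
        for q, r, p, amount in rows
        if quarter in (None, q) and region in (None, r) and product in (None, p)
    )

ALL = "*"  # hypothetical marker meaning "all members of this dimension"

def build_cube(rows):
    """Pre-aggregate totals for every (quarter, region, product) combination,
    including rolled-up combinations where a dimension is replaced by ALL."""
    cube = defaultdict(int)
    for q, r, p, amount in rows:
        for q_key in (q, ALL):
            for r_key in (r, ALL):
                for p_key in (p, ALL):
                    cube[(q_key, r_key, p_key)] += amount
    return dict(cube)

cube = build_cube(sales_rows)

# Same question answered two ways: total sales for 2012-Q1.
print(relational_total(sales_rows, quarter="2012-Q1"))  # 330, computed by scanning rows
print(cube[("2012-Q1", ALL, ALL)])                      # 330, read from a pre-computed cell
```

The cube answers faster but stores many more cells than there are rows, and most possible cells are empty. That is the sparsity problem John McKenna mentions.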
The next question was the query methodology for both types of database implementation. Luckily, I came to know that Oracle also offers a multidimensional database called Essbase, and that for SQL Server the counterpart is SSAS (with SSRS as the reporting tool).
There is still a lot of information on my profile if you want to have a look. Compiling all the notes is a tedious job, so I have tried to aggregate some of the most valuable comments.
For detailed discussion please follow the link
Sunday, 26 February 2012
BI Second Project - Twitter API programming
Sunday, 29 January 2012
Chronological Definition Business Intelligence
Thursday, 12 January 2012
Oracle Golden Gate for Data Synchronization
Different databases are synchronized with the help of Change Data Capture, also called the CDC process, which captures any change in data (the delta) at the source and replicates it on the target. Earlier, Oracle relied on Oracle Streams, triggers and other methods to capture these changes. The problem with these methods was that they depend on the Oracle environment and, biggest of all, that they interact directly with the live server, also called the OLTP system. An OLTP system is designed for application access, so constant polling by such a CDC process creates performance issues for the database and the application, such as longer I/O and high CPU utilization, which is certainly not desirable for end users.
Broadly we can define the CDC process in 2 different categories:
1) Non-log-based category
2) Log-based category
The above diagram shows data being pulled from the OLTP system by capturing it from the logs, without touching the OLTP database. The data is read by OGG, which at this point is said to be operating as an Extract, and written to files called “trail files”. The data in these trail files is in a format proprietary to Oracle and can be read by OGG processes. The same trail files are read by OGG at the target end, and the CDC delta captured at the source side is written to the target DB. The above data flow is for real-time BI reporting, but by changing the architecture OGG can also be configured for disaster recovery. Real-time integration means that captured changes can be sent across the network with minimal delay (less than 10 seconds).
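The log-based flow described above can be sketched in a few lines of Python. This is only an illustration of the idea and has nothing to do with OGG's real interfaces or trail file format: the log records, `extract` and `apply_trail` are all hypothetical names, and the "trail" is just an in-memory list.

```python
# Hypothetical transaction log of the source database.
# Each record: (sequence number, operation, primary key, new value)
source_log = [
    (1, "INSERT", "cust-1", {"name": "Asha"}),
    (2, "INSERT", "cust-2", {"name": "Ben"}),
    (3, "UPDATE", "cust-1", {"name": "Asha K"}),
    (4, "DELETE", "cust-2", None),
]

def extract(log, checkpoint):
    """Capture side: read only log records newer than the checkpoint.
    The source tables themselves are never queried."""
    return [record for record in log if record[0] > checkpoint]

def apply_trail(trail, target, checkpoint=0):
    """Delivery side: replay the captured changes on the target in order
    and return the sequence number of the last change applied."""
    for seq, op, key, value in trail:
        if op in ("INSERT", "UPDATE"):
            target[key] = value
        elif op == "DELETE":
            target.pop(key, None)
        checkpoint = seq
    return checkpoint

target_db = {}

# First run: everything in the log is new.
trail = extract(source_log, checkpoint=0)
checkpoint = apply_trail(trail, target_db)

# Later, one more change lands in the source log; only the delta is shipped.
source_log.append((5, "INSERT", "cust-3", {"name": "Chen"}))
delta = extract(source_log, checkpoint)
checkpoint = apply_trail(delta, target_db, checkpoint)

print(target_db)
print(checkpoint)
```

The checkpoint is what keeps the process incremental: each run picks up where the previous one stopped, so the target stays in step with the source without rereading old changes.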
With the data flow outlined above, OGG can be configured to capture data at the schema, table and column level across different platforms and migrate it to the target system. OGG can also filter data on the source side itself; for example, the business might be interested in capturing only the data for a particular timestamp, filtering out the unnecessary data.
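As a rough illustration of source-side filtering, the sketch below keeps only changes from chosen tables, only chosen columns, and only changes at or after a given timestamp. The change records, table names and the `filter_changes` function are assumptions made up for this example; in OGG itself this kind of selection is set up in the process configuration, not in Python.

```python
from datetime import datetime

# Hypothetical captured changes, one dict per changed row.
changes = [
    {"table": "ORDERS", "ts": datetime(2012, 1, 10, 9, 0),
     "row": {"id": 1, "amount": 250, "card_no": "xxxx"}},
    {"table": "AUDIT", "ts": datetime(2012, 1, 10, 9, 5),
     "row": {"id": 7, "msg": "login"}},
    {"table": "ORDERS", "ts": datetime(2012, 1, 12, 14, 30),
     "row": {"id": 2, "amount": 90, "card_no": "yyyy"}},
]

def filter_changes(changes, tables, columns, since):
    """Keep changes for the given tables made at or after `since`,
    and strip every column that is not in `columns`."""
    selected = []
    for change in changes:
        if change["table"] not in tables or change["ts"] < since:
            continue  # table-level and timestamp-level filter
        row = {col: val for col, val in change["row"].items() if col in columns}
        selected.append({"table": change["table"], "ts": change["ts"], "row": row})
    return selected

filtered = filter_changes(
    changes,
    tables={"ORDERS"},
    columns={"id", "amount"},
    since=datetime(2012, 1, 11),
)
print(filtered)
```

Filtering before the data leaves the source means less data crosses the network and the target never sees columns it does not need.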
With this short document on OGG I have tried to show that data synchronization can be done across different platforms with zero downtime. I have not gone into detail on the components of Oracle Golden Gate, such as Extract, Data Pump, trail files, Replicat and definition files. I attended a three-day session on Oracle Golden Gate as part of my training and handled this application for over a year.
This part of my implementation may not fall exactly under the umbrella of BI, but firms looking for smart BI solutions may still want to implement the idea.
For readers, I have also created a small prototype of the above architecture on my machine. If anyone is interested, please let me know; I will be more than happy to share it.
Regards,
Siddharth Gupta