Geeks Talk

Prepare for your Next Interview




Informatica

This is a discussion on Informatica within the Data Warehousing forums, part of the Databases category; what is the use of starschema and snowflake schema. when u use snowflake schema? Diffrence between scd and incremental aggregation?...


Go Back   Geeks Talk > Databases > Data Warehousing

Register FAQ Members List Calendar Mark Forums Read
  #1 (permalink)  
Old 01-09-2007
Junior Member
 
Join Date: Jan 2007
Location: banglore
Posts: 4
Thanks: 0
Thanked 3 Times in 2 Posts
sreedhark26 is on a distinguished road
Informatica

what is the use of starschema and snowflake schema.

when u use snowflake schema?

Diffrence between scd and incremental aggregation?
Reply With Quote
The Following 2 Users Say Thank You to sreedhark26 For This Useful Post:
Sponsored Links
  #2 (permalink)  
Old 05-14-2007
Contributing Member
 
Join Date: Apr 2007
Location: bangalore
Posts: 46
Thanks: 0
Thanked 9 Times in 9 Posts
reetasharma108 is on a distinguished road
Re: Informatica

Quote:
Originally Posted by sreedhark26 View Post
what is the use of starschema and snowflake schema.

when u use snowflake schema?

Diffrence between scd and incremental aggregation?
star schema:-

The star schema (sometimes referenced as star join schema) is the simplest data warehouse schema, consisting of a single "fact table" with a compound primary key,with one segment for each "dimension" and with additional columns of additive, numeric facts. The name star schema is derived from the fact that the schema diagram is shaped like a star.
The star schema makes multi-dimensional database (MDDB) functionality possible using a traditional relational database. Because relational databases are the most common data management system in organizations today, implementing multi-dimensional views of data using a relational database is very appealing. Even if a specific MDDB solution is used, its sources likely are relational databases.
Another reason for using star schema is its ease of understanding. Fact tables in star schema are mostly in third normal form (3NF), but dimensional tables are in de-normalized second normal form (2NF). If you want to normalize dimensional tables, they look like snowflakes (see snowflake schema) and the same problems of 3NF databases arise - you need complex queries and
business users cannot easily understand the meaning of data. Although query performance may
be improved by advanced DBMS technology and hardware, highly normalized tables make reporting difficult and applications complex.

SELECT
sum (f_sales.units_sold)
FROM
f_sales, d_customer, d_time, d_store, d_product
WHERE
f_sales.customer_id = d_customer.customer_id AND
f_sales.date_id = d_time.date_id AND
f_sales.store_id = d_store.store_id AND
f_sales.product_id = d_product.product_id AND
d_time.year_id = 1997 AND
d_product.category_id = "tv"
GROUP BY
d_product.brand, d_store.country_iso_id

snow flake:-
The snowflake schema is a variation of the star schema used in a data warehouse.
The snowflake and star schema are methods of storing data which are multidimensional in nature (i.e. which can be analysed by any or all of a number of independent factors) in a relational database.
The snowflake schema (sometimes called snowflake join schema) is a more complex schema than the star schema because the tables which describe the dimensions are normalized.
Snowflake schema is nothing but one Fact table which is connected to a number of dimension tables, and the dimension tables in turn connected to other dimension tables.
As the granularity level is more in this case,it more complex query will be there as no of joins will increase and hence tresponse time of this schema will increase.

SCD:-SCD means slowly changing dimensions.Dimension which changes slowly are called SCD.there are three types of the SCD:-
1.Type 1:-these types of dimension are hold only current data.
2.Type 2:-it is used to hold both current and historical data,it has three types:-
a.version type: here we assign a version number corresponding to each record.
b.flag current data type:-a flag is assigned to new and old records.
c.data type:-a date is mention with each records which help in identifying the new and old recods.
type3:-this SCD type used to hold the current as well as one time historical data.

Incremental aggregation:-in some requirements we require to aggregate the data after 15 days one month etc.In this we have to do all calculation on the new data and add the result data with ols result data.this is called incremental aggregation.
for example:-we have 500 records in a table ,300 new records are coming to this table so we will perform all the transformation process on the 300 new recods rather than on 500 old+300 new records.hence processing tim,e is reduced.

Reeta




Star schema has one fact coressponding to that all dimesion distributed in such a way that it create a star.there dimension table are not normalized.Dimension are also not further granularized.
Reply With Quote
The Following User Says Thank You to reetasharma108 For This Useful Post:
  #3 (permalink)  
Old 05-17-2007
I am the DON
 
Join Date: Apr 2006
Location: Just on Earth with all....
Posts: 102
Thanks: 1
Thanked 32 Times in 25 Posts
sanghala is on a distinguished road
Re: Informatica

Quote:
Originally Posted by sreedhark26 View Post
what is the use of starschema and snowflake schema.
For these please check the thread named 'schema' posted by raghuladdu user or else follow this link..
http://www.geekinterview.com/talk/4045-schema.html (schema)

Quote:
Originally Posted by sreedhark26 View Post
when u use snowflake schema?
You can go for Snow-Flake schema when you are tables are loading the data from the succesive tables., i.e normalized in nature.

Definition - "Snowflake schema is nothing but one Fact table which is connected to a number of dimension tables, and the dimension tables in turn connected to other dimension tables"

Why Snow-flake is used ?
  1. If a dimension is very sparse (i.e. most of the possible values for the dimension have no data) and/or a dimension has a very long list of attributes which may be used in a query, the dimension table may occupy a significant proportion of the database and snowflaking may be appropriate.
  2. A multidimensional view is sometimes added to an existing transactional database to aid reporting. In this case, the tables which describe the dimensions will already exist and will typically be normalised. A snowflake schema will hence be easier to implement.
  3. A snowflake schema can sometimes reflect the way in which users think about data. Users may prefer to generate queries using a star schema in some cases, although this may or may not be reflected in the underlying organisation of the database.

Quote:
Originally Posted by sreedhark26 View Post
Diffrence between scd and incremental aggregation?
SCD - Slowly changing Dimensions
A dimension that has levels or attributes that are changing on an occasional basis.
Type #1. Change the data in the dimension table.
Type #2. Add new records to the dimension table that contain the new data.
Type #3. Add new fields to the dimension table to contain the values before and after the change.
Type 2 SCD inserts both the new & changed dimension into target.
This will maintain the complete history of the dimension.

Incremental Aggregation :

In Aggregate Transformation properties (session), we can enable this option, if the data is incrementally aggregating.

There is no difference related to SCD & Incremental Aggregation.

NOTE:

Sreedhar,
Please donot post multiple threads in a single post.
__________________
Sanghala
www.dwforum.net - A Great resource for Data Warehousing Professionals

If you are satisfied with my reply. Please click “Thanks” button which increases the reputation and can serve you better.
Reply With Quote
Reply

  Geeks Talk > Databases > Data Warehousing


Thread Tools
Display Modes


Similar Threads

Thread Thread Starter Forum Replies Last Post
For Informatica Users Lokesh M Geeks Lounge 3 10-06-2008 07:38 AM
description of informatica power center 6 architecture satyajitchowdhury Data Warehousing 1 12-10-2007 04:45 AM
Typical Informatica Scenarios Lokesh M Data Warehousing 5 08-16-2007 09:18 AM
Real time scheduling in Informatica JobHelper Data Warehousing 0 01-07-2007 05:20 AM
Informatica Certification manisha.sinha Certification 0 01-03-2007 09:57 AM


All times are GMT -4. The time now is 12:49 PM.


Powered by vBulletin® Version 3.6.8
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.1.0
Copyright © 2008 GeekInterview.com. All Rights Reserved