Answered Questions

  • Memory allocation while using the Lookup stage

    Hi friends, I know that when using the Lookup stage, if the lookup table holds a huge amount of data we go for a sparse lookup, and otherwise we use a normal lookup. Could anybody explain how memory is allocated for both types of lookup?

    harishsj

    • Jan 22nd, 2008

    "Hi friends, I know that when using the Lookup stage, if the lookup table holds a huge amount of data we go for a sparse lookup, and otherwise we use a normal lookup. Could anybody explain how the memory will be ...

  • What is the maximum size of a Data Set stage?

    Venkat Duvvuri

    • Sep 9th, 2011

    Yes, I agree with Hari's answer. The maximum size of a Data Set stage depends entirely on the resource disk space that we have specified in the configuration file.

    Regards,

    Venkat Duvvuri
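
To make the point above concrete, here is a minimal sketch that adds up the free space on a list of directories. It assumes the directories passed in are the `resource disk` paths from your configuration file; the function name and the example path are hypothetical, and the sketch does not read the configuration file itself.

```python
import os
import shutil


def free_bytes_on_resource_disks(paths):
    """Sum the free bytes of the filesystems behind the given directories.

    Assumption: `paths` are the resource disk directories listed in the
    parallel configuration file. Directories that live on the same
    filesystem are counted only once.
    """
    seen_devices = set()
    total = 0
    for path in paths:
        device = os.stat(path).st_dev  # identifies the underlying filesystem
        if device in seen_devices:
            continue
        seen_devices.add(device)
        total += shutil.disk_usage(path).free
    return total


# "." is only a placeholder so the sketch runs anywhere; substitute the
# real resource disk directories from your own configuration file.
available = free_bytes_on_resource_disks(["."])
print(f"Approximate space available for data sets: {available} bytes")
```

The number is only an upper bound at that moment, since other files and jobs share the same disks.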

  • Which partitioning do we have to use for the Aggregator stage in parallel jobs?

    Anjaneyulu Pagadala

    • Mar 15th, 2018

    Hash partitioning with in-link sorting on the grouping keys gives better performance and correct results when the stage runs in parallel mode. Auto partitioning will give correct results only if the previous stage did not sort the data on just one of the keys we are grouping by.
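
The reason hash partitioning on the grouping keys matters can be shown with a small Python simulation. This is only a sketch of the idea, not how the parallel engine is implemented: the helper names, the sample rows, and the use of `zlib.crc32` as the hash function are all assumptions made for illustration.

```python
import zlib


def hash_partition(rows, key, num_partitions):
    """Send every row with the same grouping-key value to the same partition."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # crc32 is stable across runs, unlike the built-in hash() for strings.
        index = zlib.crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[index].append(row)
    return partitions


def round_robin_partition(rows, num_partitions):
    """Deal rows out in turn, ignoring the grouping key."""
    partitions = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        partitions[i % num_partitions].append(row)
    return partitions


def aggregate_partition(rows, key, value):
    """Sort one partition on the grouping key, then sum the value per group."""
    totals = {}
    for row in sorted(rows, key=lambda r: r[key]):
        totals[row[key]] = totals.get(row[key], 0) + row[value]
    return totals


def aggregate_all(partitions, key, value):
    """Aggregate each partition independently and collect the output rows."""
    out = []
    for part in partitions:
        out.extend(aggregate_partition(part, key, value).items())
    return sorted(out)


rows = [
    {"dept": "sales", "amount": 10},
    {"dept": "sales", "amount": 20},
    {"dept": "hr", "amount": 5},
    {"dept": "hr", "amount": 7},
    {"dept": "sales", "amount": 30},
]

hashed = aggregate_all(hash_partition(rows, "dept", 2), "dept", "amount")
spread = aggregate_all(round_robin_partition(rows, 2), "dept", "amount")
print("hash partitioned:", hashed)
print("round robin:     ", spread)
```

With hash partitioning each group lands in exactly one partition, so the output has one row per department with the full total. With round robin the same department is split across partitions, so the output contains several partial rows per group.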

    yassine

    • Jul 12th, 2017

    Hello Harish, I would like to ask you a question: how can I choose the appropriate partitioning for each stage and job, and how can I analyse the situation?
    Thank you.

  • What is the difference between OLAP and a data warehouse?

    harishsj

    • Jul 24th, 2007

    Data warehouse vs. OLAP: Both terms are interchangeable. DW: Data from different source systems is stored in a relational database for end-use analysis. Data is organized in summarized, aggregated, subject...

    Guest

    • Jan 4th, 2007

    Don't get confused: OLAP and data warehouse are terms that can be used interchangeably in some places. Online Analytical Processing is a concept, whereas a data warehouse is a collection of data in denormalized form that supports fast and frequent analysis.

  • What are Data Marts?

    Data Mart is a segment of a data warehouse that can provide data for reporting and analysis on a section, unit, department or operation in the company, e.g. sales, payroll, production. Data marts are sometimes complete individual data warehouses which are usually smaller than the corporate data warehouse.

    spatnam

    • Dec 9th, 2008

    A data mart is a subset of a data warehouse. A data mart deals with a single line of business, such as Sales or Purchase. Its data volume is smaller than that of the data warehouse. Some of the different types of data mart are dependent, independent, and hybrid data marts.