Answered Questions

  • Join and Lookup Stage

    If you have a huge volume of data to be referenced, which stage will you use: Join or Lookup? Why?

    Gopi N

    • Jul 11th, 2012

    If the reference data is huge, we should definitely use the Join stage. The Lookup stage takes a long time to process a large reference set, and it typically uses Entire partitioning on the reference link. The Join stage is given sorted data, so it handles large volumes better than Lookup.
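
    The trade-off described above can be sketched outside DataStage. The Python below is only an analogy, not DataStage code: the function names, the (key, value) row shape, and the assumption of unique keys on the reference side are all hypothetical. It contrasts a lookup that holds the whole reference set in memory with a merge join that streams over two inputs already sorted on the key.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Tuple[int, str]  # hypothetical row shape: (key, value)


def lookup_join(stream: Iterable[Row], reference: Iterable[Row]) -> List[Tuple[int, str, str]]:
    """Lookup-style: load the whole reference set into memory, then probe it."""
    ref_table: Dict[int, str] = dict(reference)  # memory grows with reference size
    return [(k, v, ref_table[k]) for k, v in stream if k in ref_table]


def merge_join(left: Iterable[Row], right: Iterable[Row]) -> Iterator[Tuple[int, str, str]]:
    """Join-style: both inputs must be sorted on the key.

    Only the current row of each input is held in memory.
    Assumes keys are unique on the right (reference) side.
    """
    right_iter = iter(right)
    r = next(right_iter, None)
    for k, v in left:
        # Skip reference rows whose key is smaller than the current key.
        while r is not None and r[0] < k:
            r = next(right_iter, None)
        if r is not None and r[0] == k:
            yield (k, v, r[1])


stream = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
reference = [(2, "X"), (3, "Y"), (4, "Z")]

by_lookup = lookup_join(stream, reference)
by_merge = list(merge_join(sorted(stream), sorted(reference)))
```

    Both functions return the same matches here; the difference is that the lookup version needs memory proportional to the reference data, while the merge version needs sorted inputs instead.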

    arjunreddy

    • Aug 24th, 2011

    Lookup stage.

  • Which partitioning method should we use for the Aggregator stage in parallel jobs?

    Anjaneyulu Pagadala

    • Mar 15th, 2018

    Hash partitioning with in-link sorting on the grouping keys gives better performance and correct results when the stage runs in parallel mode. Auto partitioning will also give correct results, as long as the previous stage has not sorted the data on only one of the grouping keys.
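
    Why hash partitioning on the grouping keys matters for correctness can be illustrated with a small Python sketch. This is an analogy under stated assumptions, not DataStage code: the (dept, amount) rows, the two-partition setup, and all function names are hypothetical. Each partition is sorted and aggregated independently, as a parallel stage would do on each node.

```python
from itertools import groupby
from zlib import crc32


def dept_key(row):
    """Grouping key: the first field of a hypothetical (dept, amount) row."""
    return row[0]


def hash_partition(rows, key, n):
    """Send every row with the same key to the same partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[crc32(repr(key(row)).encode()) % n].append(row)
    return parts


def round_robin_partition(rows, n):
    """Deal rows out in turn, ignoring the key."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def aggregate_parallel(parts, key):
    """Sort and sum each partition on its own, then collect the outputs."""
    out = []
    for part in parts:
        ordered = sorted(part, key=key)  # stands in for the in-link sort
        for k, grp in groupby(ordered, key=key):
            out.append((k, sum(amount for _, amount in grp)))
    return out


rows = [("A", 10), ("A", 20), ("B", 30), ("B", 40)]

hashed = sorted(aggregate_parallel(hash_partition(rows, dept_key, 2), dept_key))
dealt = aggregate_parallel(round_robin_partition(rows, 2), dept_key)
```

    With hash partitioning, each group lands in one partition and is summed once. With round-robin partitioning, the same group is split across partitions and shows up as several partial totals.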

    yassine

    • Jul 12th, 2017

    Hello Harish, I would like to ask you a question: how can I choose the appropriate partitioning for each stage and job, and how can I analyse the situation?
    Thank you.