RE: Under what circumstances join should be used inste...
When we have a small file, we can go for a join instead of a lookup. With a small file the records are compared directly, whereas going through a lookup would take more time.
RE: Under what circumstances join should be used inste...
As a correction to my previous answer: we should use a join instead of a lookup when one of the input files to the join has a large number of records with a long record length.
RE: Under what circumstances join should be used inste...
In the case of a join, the memory issue can be addressed by changing the maxcore limit; if that limit is exceeded, the component handles it by writing files to disk. While this may be costly, it makes the graph fairly robust. In the case of a lookup, if the available memory is exceeded, the graph fails due to lack of swap space.
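To illustrate the difference in failure behaviour, here is a rough Python sketch (not Ab Initio code). The names `join_with_spill`, `build_lookup` and the `max_rows` parameter are made up for this example; `max_rows` just stands in for a memory limit such as maxcore. The join falls back to temporary files on disk when the limit is exceeded, while the lookup simply fails.

```python
import json
import tempfile


def _hash_join(left_rows, right_rows, key):
    # Plain in-memory inner join: index the right side, probe with the left.
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)
    out = []
    for row in left_rows:
        for match in index.get(row[key], []):
            out.append({**match, **row})
    return out


def _partition_to_disk(rows, key, n_parts):
    # Write each row to one of n_parts temp files, chosen by hashing the key.
    files = [tempfile.TemporaryFile(mode="w+") for _ in range(n_parts)]
    for row in rows:
        files[hash(row[key]) % n_parts].write(json.dumps(row) + "\n")
    return files


def _read_back(f):
    # Read one partition back from disk and release the temp file.
    f.seek(0)
    rows = [json.loads(line) for line in f]
    f.close()
    return rows


def join_with_spill(left, right, key, max_rows, n_parts=4):
    # max_rows is a stand-in (assumption) for a memory limit like maxcore.
    if len(right) <= max_rows:
        return _hash_join(left, right, key)
    # Limit exceeded: slower, but the join still completes via disk.
    left_files = _partition_to_disk(left, key, n_parts)
    right_files = _partition_to_disk(right, key, n_parts)
    out = []
    for lf, rf in zip(left_files, right_files):
        out.extend(_hash_join(_read_back(lf), _read_back(rf), key))
    return out


def build_lookup(rows, key, max_rows):
    # A lookup has no disk fallback: it must fit in memory or it fails.
    if len(rows) > max_rows:
        raise MemoryError("lookup data does not fit in memory")
    return {row[key]: row for row in rows}
```

With a generous `max_rows` both approaches work; with a tight one, `join_with_spill` still returns the same matches (in a different order), while `build_lookup` raises an error.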
RE: Under what circumstances join should be used instead on lookup.
A lookup should be used only to speed up graph execution, but there is a catch: we should use a lookup only if the lookup file is small. When the graph starts executing, it loads the whole lookup file into memory, which is what speeds up execution. If the lookup file is huge, it will hurt the ETL environment, because huge data takes up a lot of memory there.
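As a rule of thumb, the points in this thread boil down to estimating how much memory the lookup data would need (number of records times record length) and comparing that with what the environment can spare. A small Python sketch, where `choose_strategy` and the memory budget are hypothetical and only for illustration:

```python
def choose_strategy(record_count, record_length_bytes, memory_budget_bytes):
    # Rough size of the data if it were loaded fully into memory as a lookup.
    estimated_bytes = record_count * record_length_bytes
    # Small enough to hold in memory: lookup. Otherwise: join.
    return "lookup" if estimated_bytes <= memory_budget_bytes else "join"


budget = 64 * 1024 * 1024  # assumed budget of 64 MB, purely illustrative

small_file = choose_strategy(1_000, 100, budget)
large_file = choose_strategy(50_000_000, 500, budget)
```

Here `small_file` comes out as `"lookup"` and `large_file` as `"join"`. The real threshold depends on the memory actually available to the graph.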