The three stages differ mainly in the memory they use
DataStage doesn't know how large your data is, so cannot make an informed choice whether to combine data using a join stage or a lookup stage. Here's how to decide which to use:
if the reference datasets are big enough to cause trouble, use a join. A join does a high-speed sort on the driving and reference datasets. This can involve I/O if the data is big enough, but the I/O is all highly optimized and sequential. Once the sort is over the join processing is very fast and never involves paging or other I/O
Unlike Join stages and Lookup stages, the Merge stage allows you to specify several reject links as many as input links.