For data parallelism, we can use partition components. For component parallelism, we can use the Replicate component. Similarly, which component(s) can we use for pipeline parallelism?

prabhupurna
Jul 9th, 2006



Abhisek Basu Mullick

Jul 19th, 2006

When a connected sequence of components on the same branch of a graph executes concurrently, it is called pipeline parallelism.
Components like Reformat, where we distribute the input flow to multiple output flows using the output index (depending on some selection criteria) and process those output flows simultaneously, create pipeline parallelism.
But components like Sort, where the entire input must be read before a single record is written to the output, cannot achieve pipeline parallelism.
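The difference between a streaming component and a blocking one can be sketched outside Ab Initio. The following is a minimal illustration in Python, not Ab Initio code: the names `source`, `reformat`, `sort_stage`, and `reads_before_first_output` are hypothetical, and generators only model the record-at-a-time flow, not real concurrency. It counts how many input records each stage must read before it can emit its first output record.

```python
def source(records, log):
    # Hypothetical input file: logs every record as it is read.
    for r in records:
        log.append(("read", r))
        yield r

def reformat(stream):
    # Streaming stage: emits one output per input record as soon as it arrives.
    for r in stream:
        yield r * 10

def sort_stage(stream):
    # Blocking stage: sorted() must consume the whole input before yielding anything.
    yield from sorted(stream)

def reads_before_first_output(stage, records):
    # Pull exactly one output record and report how many inputs were read to get it.
    log = []
    out = stage(source(records, log))
    next(out)
    return len(log)

streaming_reads = reads_before_first_output(reformat, [3, 1, 2])
blocking_reads = reads_before_first_output(sort_stage, [3, 1, 2])
print(streaming_reads, blocking_reads)  # prints: 1 3
```

The streaming stage can hand its first record downstream after a single read, so a downstream component could start working immediately; the sort stage has to read all three records first, which is why it stalls the pipeline.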

mukund

Dec 22nd, 2006

Guys,
this was a very good question.
Before learning Ab Initio or building any graph, you need to know the concepts of parallelism. They are: 1) data parallelism, 2) pipeline parallelism, 3) component parallelism.
Generally, pipeline parallelism means the data is being processed by different components at the same time.
Let me give you a flow:
input file ------>reformat----->rollup------>filter by expression----->o/p file
50th record 25 records 10 records
Put simply, whenever you run a graph you can watch the number of records processed on each flow; this is the best example of pipeline parallelism.
Hope this suffices.
mukund

Vamsi

Feb 18th, 2016

Filter_By_expression is the component which supports Pipeline Parallelism.

sudarshan

Mar 2nd, 2016

You can use components that do not require sorted data (explicit or in-memory sort) to get pipeline parallelism. Components that need sorted data, such as join, rollup, merge, sort, and partition by key and sort, break pipeline parallelism.

mohankrishna

Apr 8th, 2016

Any component in a flow that has no SORT component will do pipeline parallelism. It is the Ab Initio architecture that does it. For example, say there are 10 records in a file and you need to reformat them and filter the flow. Reformat picks up the first record of the file, transforms it, and feeds it to filter; while filter is applying its specified condition, reformat picks up the second record from the input file, transforms it, and keeps it ready to feed to filter.
However, if you are using component folding, pipeline parallelism won't work. Folding eliminates this concept.
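The record-by-record hand-off between reformat and filter can be sketched as two threads joined by a queue. This is only an illustration in Python, not how Ab Initio is implemented: `reformat_stage`, `filter_stage`, `run_pipeline`, the upper-casing transform, and the "starts with A" condition are all assumptions made for the example.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the flow

def reformat_stage(records, out_q):
    # Transform each record and pass it on immediately, then move to the next one.
    for r in records:
        out_q.put(r.upper())
    out_q.put(DONE)

def filter_stage(in_q, results):
    # Apply the condition to each record as it arrives from the upstream stage.
    while True:
        r = in_q.get()
        if r is DONE:
            break
        if r.startswith("A"):
            results.append(r)

def run_pipeline(records):
    # The bounded queue plays the role of the flow between the two components.
    flow = queue.Queue(maxsize=1)
    results = []
    t1 = threading.Thread(target=reformat_stage, args=(records, flow))
    t2 = threading.Thread(target=filter_stage, args=(flow, results))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results

selected = run_pipeline(["apple", "banana", "avocado", "cherry"])
print(selected)  # prints: ['APPLE', 'AVOCADO']
```

Both stages are alive at the same time: while the filter thread is checking one record, the reformat thread is already preparing the next. Running the two stages one after the other in a single thread would remove that overlap.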

Gouse

May 26th, 2016

Any component that does not need sorted input, because a sort component breaks pipeline parallelism.

Thanks
Gouse

Mahesh

May 26th, 2021

For data parallelism, we can use partition components.
Component parallelism: this is when you have input file ---> rollup (MFS layout) ---> output.
Here, in practice, there are 4 copies of the component running in parallel on a 4-path server, crunching away at the same time.
Pipeline parallelism: every component runs in parallel by default until it has to store data in order to work on it.
E.g. sort, rollup, scan, and dedup may inhibit pipeline parallelism. But imagine a sort component without a key...
