DagStream

DagStream is the Python package in order to manage relationship between functions, especially for data-preprocessing process for machine learning applications.

Key Features

  • Simple method to define dag relationship

  • Quick Visualization by mermaid

Definition of Dag

DagStream class convert your functions into dag nodes.

import dagstream

def funcA():
   print("funcA")
   return 10

def funcB(val_from_A: int):
   print(f"funcB, received {val_from_A} from A")

def funcC(val_from_A: int):
   print(f"funcC, received {val_from_A} from A")
   return 20

def funcD(val_from_C: int):
   print(f"funcD, received {val_from_C} from C")

def funcE():
   print("funcE")

def funcF():
   print("funcF")


stream = dagstream.DagStream()
# convert to functional nodes
A, B, C, D, E, F = stream.emplace(funcA, funcB, funcC, funcD, funcE, funcF)

# define relationship betweeen functional nodes

# A executes before B and C
# output of A passed to B and C
A.precede(B, C, pipe=True)

# E executes after B, C and D
E.succeed(B, C, D)

# D executes after C
# D receives output of C
D.succeed(C, pipe=True)

F.succeed(E)

Relationship between functions are like below.

stateDiagram direction LR state "funcA" as state_0 state "funcB" as state_1 state "funcC" as state_2 state "funcD" as state_3 state "funcE" as state_4 state "funcF" as state_5 [*] --> state_0 state_0 --> state_1: Pipe state_0 --> state_2: Pipe state_1 --> state_4 state_2 --> state_4 state_2 --> state_3: Pipe state_3 --> state_4 state_4 --> state_5 state_5 --> [*]

License

the Apache License, Version 2.0 (the “License”)

Indices and tables