Core Concepts

NPipeline's architecture is built on three fundamental concepts: graph-based data representation, distinct node types, and a streaming data model.

Graph-Based Architecture

NPipeline represents pipelines as directed acyclic graphs (DAGs) where:

Nodes are processing units (sources, transforms, sinks)
Edges are connections between nodes representing data flow
Data flows through the graph from sources to sinks

Source Node
    ↓
Transform Node 1
    ↓
Transform Node 2
    ↓
Sink Node

This approach provides several advantages:

Clarity: Visual representation of data flow
Flexibility: Easy to add, remove, or reroute nodes
Type Safety: Compile-time validation of connections
Composability: Nodes can be tested and reused independently

Node Architecture

From an architectural perspective, nodes are the fundamental processing units that form the vertices of the pipeline graph. NPipeline defines three primary node types:

Source Nodes: Initiate data flow by producing data streams
Transform Nodes: Process and transform data as it flows through the pipeline
Sink Nodes: Consume data at the terminal point of the pipeline

These nodes implement a unified interface hierarchy that enables type-safe connections and consistent execution patterns. The architectural design emphasizes:

Separation of Concerns: Each node type has a distinct responsibility
Type Safety: Compile-time validation of node connections
Composability: Nodes can be combined in various configurations
Testability: Each node can be tested in isolation

For detailed information about node implementations, interfaces, and examples, see the Nodes documentation.

Streaming Data Model

NPipeline uses IAsyncEnumerable<T> for lazy, streaming data flow:

public interface IDataPipe<T> : IAsyncEnumerable<T>
{
    // IDataPipe<T> implements IAsyncEnumerable<T> directly
    // Iterate using: await foreach (var item in dataPipe)
}

Key Characteristics

Lazy Evaluation

Data is only produced when consumed
If pipeline is cancelled, source never reads unused data

Memory Efficient

Only active items are in memory at any time
No need to load entire datasets upfront

Responsive

Processing begins immediately upon pipeline start
Results are available as soon as items flow through

Composable

Each transform creates a new data pipe
Pipes can be layered and composed

Example: How Streaming Works

// Step 1: Source creates pipe (but doesn't read yet)
var sourcePipe = await sourceNode.ExecuteAsync(context, cancellationToken);

// Step 2: Transform wraps pipe (but doesn't process yet)
var transformPipe = new TransformPipe(sourcePipe, transformNode);

// Step 3: Sink actually triggers execution
await foreach (var item in transformPipe.WithCancellation(cancellationToken))
{
    // Data is produced, transformed, and consumed
    // Only ONE item is in memory at a time
}

Benefits of This Design

✅ Memory Efficient - Only active items in memory ✅ Responsive - Processing starts immediately ✅ Cancellable - Can stop at any time ✅ Type Safe - Compile-time validation of node connections ✅ Testable - Each node can be tested in isolation

Next Steps

Component Architecture - Learn about the major system components
Execution Flow - Understand how data flows through pipelines
Nodes - Explore detailed node implementations and examples

Graph-Based Architecture​

Node Architecture​

Streaming Data Model​

Key Characteristics​

Example: How Streaming Works​

Benefits of This Design​

Next Steps​