Azure Data Lake Storage Gen2 Provider

The ADLS Gen2 provider enables NPipeline applications to read, write, list, move, and delete data in Azure Data Lake Storage Gen2 using the adls:// URI scheme. Unlike the Azure Blob Storage provider, this provider targets the Azure.Storage.Files.DataLake SDK and exposes ADLS Gen2's true hierarchical namespace and O(1) atomic rename operations.

When to Use This Provider vs Azure Blob Storage

| Concern | Azure Blob (azure://) | ADLS Gen2 (adls://) |
| --- | --- | --- |
| SDK package | Azure.Storage.Blobs | Azure.Storage.Files.DataLake |
| Directory model | Flat (virtual / delimiter) | True POSIX-like directory tree |
| Atomic move/rename | Not natively supported | RenameAsync (O(1), atomic) |
| Per-path ACLs | RBAC / container-level only | POSIX-style per-file and per-directory |
| SupportsHierarchy | false | true |

Use this provider when you need a true hierarchical namespace, atomic rename/move, or POSIX ACLs. ACL support is not available yet; it is planned for a future release.

Dependencies

<PackageReference Include="NPipeline.StorageProviders.Adls" Version="*" />

Transitive dependencies pulled in automatically:

| Package | Purpose |
| --- | --- |
| Azure.Storage.Files.DataLake | ADLS Gen2 SDK |
| Azure.Identity | DefaultAzureCredential and token credential support |
| NPipeline.StorageProviders | Core storage abstractions |

URI Format

adls://<filesystem>/<path/to/file.ext>[?param=value&...]

| URI component | Maps to |
| --- | --- |
| Host | Data Lake filesystem name (equivalent of a Blob container) |
| Path | File or directory path within the filesystem |

Naming constraints

  • Filesystem name: 3–63 characters; lowercase letters, digits, and hyphens; no leading or trailing hyphen.
  • Path: 1–2,048 characters; no backslash (\); no ?.
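
The constraints above can be checked before a URI ever reaches the provider. The following is a minimal sketch, not NPipeline code: `IsValidFilesystemName` and `IsValidPath` are hypothetical helper names, and they encode only the rules listed in this section.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helpers (not part of NPipeline) that mirror the constraints listed above.

// 3-63 characters; lowercase letters, digits, and hyphens; no leading or trailing hyphen.
static bool IsValidFilesystemName(string name) =>
    Regex.IsMatch(name, @"^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]\z");

// 1-2,048 characters; no backslash; no question mark.
static bool IsValidPath(string path) =>
    path.Length is >= 1 and <= 2048
    && !path.Contains('\\')
    && !path.Contains('?');

Console.WriteLine(IsValidFilesystemName("my-filesystem")); // True
Console.WriteLine(IsValidFilesystemName("-staging"));      // False (leading hyphen)
Console.WriteLine(IsValidPath("data/records.csv"));        // True
Console.WriteLine(IsValidPath(@"data\records.csv"));       // False (backslash)
```

The Azure service may enforce further rules than the ones stated here, so treat a passing check as necessary rather than sufficient.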

Supported query parameters

| Parameter | Description |
| --- | --- |
| accountName | Storage account name (overrides default) |
| accountKey | Shared-key credential (base64-encoded) |
| sasToken | SAS token |
| connectionString | Full connection string |
| contentType | MIME type set on write |
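
To make the mapping from URI to filesystem, path, and parameters concrete, here is a small sketch that splits an adls:// URI using only System.Uri. `ParseAdlsUri` is a hypothetical function written for illustration; NPipeline's own StorageUri.Parse may normalize or validate differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical parser for illustration only; it is not NPipeline's StorageUri.Parse.
static (string Filesystem, string Path, Dictionary<string, string> Query) ParseAdlsUri(string text)
{
    var uri = new Uri(text);
    if (uri.Scheme != "adls")
        throw new ArgumentException($"Expected the adls scheme, got '{uri.Scheme}'.");

    var query = new Dictionary<string, string>();
    foreach (var pair in uri.Query.TrimStart('?').Split('&', StringSplitOptions.RemoveEmptyEntries))
    {
        // Split on the first '=' only: base64 account keys usually end with '=' padding.
        var parts = pair.Split('=', 2);
        var value = parts.Length > 1 ? Uri.UnescapeDataString(parts[1]) : "";
        query[Uri.UnescapeDataString(parts[0])] = value;
    }

    // Host is the filesystem; the path is everything after the leading slash.
    return (uri.Host, Uri.UnescapeDataString(uri.AbsolutePath.TrimStart('/')), query);
}

var parsed = ParseAdlsUri("adls://my-filesystem/data/output.csv?contentType=text/csv&accountName=myaccount");
Console.WriteLine(parsed.Filesystem);           // my-filesystem
Console.WriteLine(parsed.Path);                 // data/output.csv
Console.WriteLine(parsed.Query["contentType"]); // text/csv
```

Values such as accountKey and sasToken should be percent-encoded when they are placed in the query string, since characters like & and + have special meaning there.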

Authentication

Credential resolution follows this priority order (first match wins):

  1. Per-URI connectionString query parameter
  2. Per-URI accountKey query parameter → StorageSharedKeyCredential
  3. Per-URI sasToken query parameter → AzureSasCredential
  4. AdlsGen2StorageProviderOptions.DefaultConnectionString
  5. AdlsGen2StorageProviderOptions.DefaultCredential
  6. The default credential chain (a lazily created DefaultAzureCredential) when AdlsGen2StorageProviderOptions.UseDefaultCredentialChain = true
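
The first-match-wins order above can be sketched as a simple chain of checks. This is an illustration under stated assumptions, not the provider's implementation: `ResolveCredentialSource` is a hypothetical function, and it returns a label naming the winning source instead of building a real Azure credential.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the provider's internal resolution logic.
// Each check corresponds to one step of the priority list, in order.
static string ResolveCredentialSource(
    IReadOnlyDictionary<string, string> query,
    string? defaultConnectionString,
    bool hasDefaultCredential,
    bool useDefaultCredentialChain)
{
    if (query.ContainsKey("connectionString")) return "uri:connectionString";
    if (query.ContainsKey("accountKey")) return "uri:accountKey";
    if (query.ContainsKey("sasToken")) return "uri:sasToken";
    if (!string.IsNullOrEmpty(defaultConnectionString)) return "options:DefaultConnectionString";
    if (hasDefaultCredential) return "options:DefaultCredential";
    if (useDefaultCredentialChain) return "options:DefaultAzureCredential";
    throw new InvalidOperationException("No credential source is configured.");
}

// A per-URI accountKey outranks both a per-URI sasToken and any provider-level default.
var query = new Dictionary<string, string> { ["sasToken"] = "<sas>", ["accountKey"] = "<key>" };
Console.WriteLine(ResolveCredentialSource(query, "<connection-string>", true, true));
// uri:accountKey
```

One practical consequence: a credential embedded in a URI always overrides the provider-level defaults, so a stale per-URI key can mask a correctly configured managed identity.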

Registration

With a configuration delegate

services.AddAdlsGen2StorageProvider(options =>
{
    // Managed identity / DefaultAzureCredential (recommended for production)
    options.ServiceUrl = new Uri("https://<account>.dfs.core.windows.net/");
    options.UseDefaultCredentialChain = true;

    // OR explicit connection string
    // options.DefaultConnectionString = "<connection-string>";
});

With a pre-built options instance

var options = new AdlsGen2StorageProviderOptions
{
    ServiceUrl = new Uri("https://<account>.dfs.core.windows.net/"),
    UseDefaultCredentialChain = true,
    UploadThresholdBytes = 128 * 1024 * 1024,
};

services.AddAdlsGen2StorageProvider(options);

Azurite (local development)

services.AddAdlsGen2StorageProvider(options =>
{
    options.DefaultConnectionString = "UseDevelopmentStorage=true";
    options.ServiceUrl = new Uri("http://127.0.0.1:10000/devstoreaccount1/");
});

# Start Azurite with ADLS Gen2 support
docker run -p 10000:10000 \
    mcr.microsoft.com/azure-storage/azurite \
    azurite --blobHost 0.0.0.0 --skipApiVersionCheck --inMemoryPersistence

Configuration Reference

All properties are on AdlsGen2StorageProviderOptions:

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| DefaultCredential | TokenCredential? | null | Explicit token credential |
| DefaultConnectionString | string? | null | Connection string (takes priority over token credentials) |
| UseDefaultCredentialChain | bool | true | Fall back to DefaultAzureCredential when no other credential is configured |
| ServiceUrl | Uri? | null | Custom DFS service URL (e.g., Azurite or sovereign clouds) |
| ServiceVersion | DataLakeClientOptions.ServiceVersion? | null | REST API version override |
| UploadThresholdBytes | long | 67,108,864 (64 MiB) | Files at or above this size use chunked transfer options |
| UploadMaximumConcurrency | int? | null | Max parallel upload connections for large files |
| UploadMaximumTransferSizeBytes | int? | null | Max bytes per upload chunk |
| ClientCacheSizeLimit | int | 100 | Max cached DataLakeServiceClient instances |
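
The upload settings interact through simple arithmetic: a file takes the chunked path when its size is at or above UploadThresholdBytes, and a configured UploadMaximumTransferSizeBytes then bounds the size of each chunk. The sketch below works that out for one case. `PlanUpload` is a hypothetical helper, and it assumes the chunk count is a plain ceiling division, which the provider documentation does not spell out.

```csharp
#nullable enable
using System;

// Hypothetical helper: reports whether a file would use the chunked transfer options and,
// when a maximum chunk size is configured, how many chunks that implies.
static (bool Chunked, long? ChunkCount) PlanUpload(long fileSize, long thresholdBytes, int? maxTransferSizeBytes)
{
    if (fileSize < thresholdBytes) return (false, null);            // below threshold: chunked options not applied
    if (maxTransferSizeBytes is not int chunk) return (true, null); // chunk size left to the SDK default
    return (true, (fileSize + chunk - 1) / chunk);                  // ceiling division
}

const long MiB = 1024 * 1024;

// A 200 MiB file with the default 64 MiB threshold and 8 MiB chunks.
var (chunked, chunks) = PlanUpload(200 * MiB, 64 * MiB, 8 * 1024 * 1024);
Console.WriteLine($"{chunked} {chunks}"); // True 25
```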

Usage Examples

Reading a file

var uri = StorageUri.Parse("adls://my-filesystem/data/records.csv");
await using var stream = await provider.OpenReadAsync(uri);

Writing a file

var uri = StorageUri.Parse("adls://my-filesystem/data/output.csv?contentType=text/csv");
await using var stream = await provider.OpenWriteAsync(uri);
// csvWriter stands for whatever serializer your application uses to fill the stream.
await csvWriter.WriteToAsync(stream);

Data is buffered to a local temporary file and uploaded atomically when the stream is disposed.
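
The buffer-then-upload pattern described above can be illustrated without any Azure dependency. This is a simplified sketch, not the provider's code: `WriteBufferedAsync` is a hypothetical helper, the upload step is a caller-supplied delegate, and the upload is triggered when the write delegate returns rather than on stream disposal.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper: spool all writes to a temporary file, then hand the complete
// payload to a single upload call so the destination never sees a partial file.
static async Task WriteBufferedAsync(Func<Stream, Task> write, Func<Stream, Task> upload)
{
    var tempPath = Path.GetTempFileName();
    try
    {
        await using var temp = new FileStream(tempPath, FileMode.Create, FileAccess.ReadWrite);
        await write(temp);       // the caller writes at its own pace
        await temp.FlushAsync();
        temp.Position = 0;       // rewind so the upload sees the whole payload
        await upload(temp);      // one upload of the finished file
    }
    finally
    {
        File.Delete(tempPath);   // the temp stream is already disposed at this point
    }
}

// Usage: a MemoryStream stands in for the remote destination.
var uploaded = new MemoryStream();
await WriteBufferedAsync(
    async s => await s.WriteAsync(Encoding.UTF8.GetBytes("id,name\n1,Ada\n")),
    s => s.CopyToAsync(uploaded));
Console.WriteLine(uploaded.Length); // 14
```

The trade-off of this design is local disk usage proportional to the file size, in exchange for never exposing a half-written file at the destination path.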

Checking existence

var exists = await provider.ExistsAsync(uri);

Listing (non-recursive)

var dirUri = StorageUri.Parse("adls://my-filesystem/data/");
await foreach (var item in provider.ListAsync(dirUri, recursive: false))
{
    var type = item.IsDirectory ? "[dir]" : $"{item.Size,12} bytes";
    Console.WriteLine($" {type} {item.Uri}");
}

Listing (recursive)

await foreach (var item in provider.ListAsync(dirUri, recursive: true))
    Console.WriteLine(item.Uri);

Retrieving metadata

var metadata = await provider.GetMetadataAsync(uri);
if (metadata is not null)
{
    Console.WriteLine($"Size: {metadata.Size}");
    Console.WriteLine($"ContentType: {metadata.ContentType}");
    Console.WriteLine($"IsDirectory: {metadata.IsDirectory}");
}

Deleting a file (idempotent)

if (provider is IDeletableStorageProvider del)
    await del.DeleteAsync(uri); // silently succeeds even if the path does not exist

Moving / renaming a file (atomic)

ADLS Gen2's O(1) server-side rename is the primary differentiator over Azure Blob Storage:

if (provider is IMoveableStorageProvider mov)
{
    var src = StorageUri.Parse("adls://my-filesystem/staging/records.csv");
    var dest = StorageUri.Parse("adls://my-filesystem/processed/records.csv");
    await mov.MoveAsync(src, dest);
}

Note: Cross-account moves are not supported in v1. Both source and destination must be within the same storage account. A NotSupportedException is thrown otherwise.

Exception Handling

RequestFailedException errors from the Azure SDK are translated to standard .NET exceptions:

| HTTP status / error code | Thrown exception |
| --- | --- |
| AuthenticationFailed, AuthorizationFailed, 401, 403 | UnauthorizedAccessException |
| FilesystemNotFound, PathNotFound, 404 | FileNotFoundException |
| InvalidResourceName, 400 | ArgumentException |
| PathAlreadyExists, 409 | IOException |
| 429 / 5xx (transient) | IOException (preserves retryable context) |
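
The translation table can be read as a lookup on the service error code first and the HTTP status second. The sketch below expresses that reading with plain .NET types. `Translate` is a hypothetical function, the precedence of error code over status is an assumption, and so is the fallback to IOException for statuses the table does not list.

```csharp
#nullable enable
using System;
using System.IO;

// Hypothetical translation function mirroring the table above; it takes the status and
// error code as plain values so it runs without the Azure SDK.
static Exception Translate(int status, string? errorCode, string message)
{
    // Named service error codes are checked first (assumed precedence).
    switch (errorCode)
    {
        case "AuthenticationFailed":
        case "AuthorizationFailed":
            return new UnauthorizedAccessException(message);
        case "FilesystemNotFound":
        case "PathNotFound":
            return new FileNotFoundException(message);
        case "InvalidResourceName":
            return new ArgumentException(message);
        case "PathAlreadyExists":
            return new IOException(message);
    }

    if (status is 401 or 403) return new UnauthorizedAccessException(message);
    if (status == 404) return new FileNotFoundException(message);
    if (status == 400) return new ArgumentException(message);

    // 409 conflicts and transient 429/5xx failures surface as IOException;
    // unlisted statuses fall through to the same type (assumption).
    return new IOException($"Storage request failed with status {status}: {message}");
}

Console.WriteLine(Translate(404, "PathNotFound", "missing").GetType().Name); // FileNotFoundException
Console.WriteLine(Translate(503, null, "busy").GetType().Name);              // IOException
```

Because FileNotFoundException derives from IOException, callers that handle both should catch FileNotFoundException first.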

AdlsStorageException (inherits ConnectorException) carries Filesystem and Path properties for structured diagnostics.

Provider Capabilities

This provider reports the following capabilities via IStorageProviderMetadataProvider.GetMetadata():

new StorageProviderMetadata
{
    Name = "Azure Data Lake Storage Gen2",
    SupportedSchemes = ["adls"],
    SupportsRead = true,
    SupportsWrite = true,
    SupportsListing = true,
    SupportsMetadata = true,
    SupportsHierarchy = true, // ← key differentiator from Azure Blob
    Capabilities = {
        ["supportsAtomicMove"] = true,
        ["supportsNativeDelete"] = true,
        ["supportsHierarchicalListing"] = true,
        ["supportsServiceUrl"] = true,
        ["supportsConnectionString"] = true,
        ["supportsSasToken"] = true,
        ["supportsAccountKey"] = true,
        ["supportsDefaultCredentialChain"] = true,
    }
}
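
Pipeline code can branch on these flags instead of hard-coding provider types. The following sketch shows one defensive way to read a flag. It uses a plain dictionary as a stand-in for the Capabilities property; the value type (object) and the `HasCapability` helper are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for StorageProviderMetadata.Capabilities; the value type is an assumption.
var capabilities = new Dictionary<string, object>
{
    ["supportsAtomicMove"] = true,
    ["supportsNativeDelete"] = true,
};

// Hypothetical helper: a flag counts as set only when the key is present and the value is true.
static bool HasCapability(IReadOnlyDictionary<string, object> caps, string name) =>
    caps.TryGetValue(name, out var value) && value is true;

Console.WriteLine(HasCapability(capabilities, "supportsAtomicMove")); // True
Console.WriteLine(HasCapability(capabilities, "supportsAcl"));        // False (key absent)
```

Treating a missing key as false keeps the check safe against providers that do not report a given capability at all.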

Troubleshooting

InvalidOperationException: Account name must be provided

No credential resolved to an account. Provide one of: connectionString, accountName + accountKey, accountName + sasToken, or DefaultConnectionString.

FileNotFoundException on write

Ensure the filesystem exists or that the provider has permissions to create it. UploadAsync calls CreateIfNotExistsAsync on the target filesystem automatically.

UnauthorizedAccessException

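Verify that the identity has at least the Storage Blob Data Contributor role (or an equivalent ADLS Gen2 role) on the storage account. For Azurite, use the default development credentials, for example the UseDevelopmentStorage=true connection string shown under Registration.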

Azurite partial ADLS Gen2 fidelity

Azurite supports the ADLS Gen2 Data Lake API at partial fidelity. ACL operations and some advanced HNS behaviors may behave differently from the real service. Run nightly tests against a real ADLS Gen2 account for full validation.

ACL Support (future roadmap)

POSIX ACLs (SetAccessControlListAsync, GetAccessControlAsync) are not in scope for v1. They will be exposed via a dedicated IAdlsAclProvider interface extension in a future release.