Choosing a data reading architecture in .NET: Four approaches

Imagine you are working on a large financial application that processes thousands of transactions per minute. At some point, performance declines sharply and users begin to complain about delays. Analysis shows that the cause is an inefficient data reading mechanism that cannot cope with the current volume of information. This example shows why the choice of data reading architecture in .NET matters: it affects performance, scalability, and maintainability. In this article, we will consider four main approaches so that you can choose the one that fits the specifics of your project.

Let's look at the approaches to reading data in .NET, starting with IQueryable<T>.

IQueryable<T>: Deferred execution and query flexibility

In this approach, repository methods return an IQueryable<T>, deferring execution of the query until its results are enumerated. This enables dynamic composition of complex queries but can cause performance issues and break encapsulation.

Advantages:

Deferred execution: A query is built up as an expression tree and is translated and executed by the provider only when its results are enumerated (for example, by ToList() or a foreach loop).
Query flexibility: Allows dynamic creation of complex queries based on existing ones.
Efficient resource usage: Using IQueryable<T>, database queries are optimized to retrieve only necessary data. This reduces network and memory load as it avoids fetching unnecessary data.
Better integration with LINQ: IQueryable<T> allows integration with LINQ (Language Integrated Query), providing a robust and convenient syntax for forming queries. This simplifies writing complex queries and improves code readability.

Disadvantages:

Complexity: Complex queries can reduce performance, especially with multiple joins and aggregations.
Risk of breaching domain logic encapsulation: Because callers can keep extending the returned query, business logic can leak outside the domain model.
Provider dependency: IQueryable<T> implementation heavily depends on the database provider, which becomes especially important when using specific SQL functions not supported by all providers. Providers might interpret and optimize queries differently, leading to unexpected results or performance issues.

Let's consider an example of implementing this approach in investment portfolio management.

Example #1

Our domain has a Portfolio entity, an aggregate root that contains an InvestmentTarget entity. We were asked to build a query that returns all active portfolios, including their InvestmentTarget data.

public class Portfolio : AggregateRoot
{
    public Guid Id { get; private set; }
    public bool IsActive { get; set; }
    public InvestmentTarget InvestmentTarget { get; set; }
}

public class InvestmentTarget : Entity
{
    public Guid Id { get; private set; }
    public string Name { get; set; }
}

In the PortfolioRepository class, the GetActivePortfolios() method forms and returns a query of type IQueryable<Portfolio>, allowing filtering and sorting conditions to be added before executing the query.

using Microsoft.EntityFrameworkCore; // provides DbContext and the Include() extension

public interface IPortfolioRepository
{
    IQueryable<Portfolio> GetActivePortfolios();
}

public class PortfolioRepository : IPortfolioRepository
{
    private readonly DbContext dbContext;

    public PortfolioRepository(DbContext dbContext)
    {
        this.dbContext = dbContext;
    }

    public IQueryable<Portfolio> GetActivePortfolios()
    {
        // Only builds the query; nothing is sent to the database here.
        return dbContext.Set<Portfolio>()
            .Where(p => p.IsActive)
            .Include(p => p.InvestmentTarget);
    }
}

// consumer side
...
// The query executes only when ToList() enumerates it.
var activePortfolios = portfolioRepository.GetActivePortfolios().ToList();
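
Because the repository returns an IQueryable<Portfolio>, the consumer can keep composing the query before it runs. The sketch below illustrates this with an in-memory list and AsQueryable() standing in for the EF Core DbSet, so it runs without a database. The simplified Portfolio class with a Name property, the DeferredExecutionDemo class, and the sample data are assumptions made for illustration; with a real provider, the composed expression would instead be translated to SQL at the point of enumeration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the article's Portfolio entity (Name is an assumption for the demo).
public class Portfolio
{
    public string Name { get; set; } = "";
    public bool IsActive { get; set; }
}

public static class DeferredExecutionDemo
{
    // Mimics GetActivePortfolios(): builds a query but does not run it.
    public static IQueryable<Portfolio> GetActivePortfolios(List<Portfolio> source) =>
        source.AsQueryable().Where(p => p.IsActive);

    public static List<string> Run()
    {
        var source = new List<Portfolio>
        {
            new Portfolio { Name = "Growth", IsActive = true },
            new Portfolio { Name = "Archive", IsActive = false },
            new Portfolio { Name = "Bonds", IsActive = true },
        };

        // The consumer adds sorting and projection; nothing has executed yet.
        IQueryable<string> query = GetActivePortfolios(source)
            .OrderBy(p => p.Name)
            .Select(p => p.Name);

        // Data added after the query was built is still picked up,
        // because execution happens only when the query is enumerated.
        source.Add(new Portfolio { Name = "Alpha", IsActive = true });

        return query.ToList(); // the query executes here
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // Alpha, Bonds, Growth
    }
}
```

The same property that makes this flexible is what lets query logic spread into calling code, which is the encapsulation risk listed among the disadvantages.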

Now, let's move on to the List<T> approach.

List<T>: Simplicity and immediate execution

In this approach, repository methods return a List<T>, so the query is executed immediately, inside the repository method. This simplifies debugging and testing because behavior is more predictable, but it limits flexibility: callers cannot modify the query dynamically, and more data than necessary may be loaded.

Advantages:

Full upfront loading: Using List<T> allows loading all data at once, which can be helpful in scenarios where all data is needed for further processing without additional database queries.
Simplicity: Easy to use and understand, implying less complexity in query building. List<T> also provides convenience in working with in-memory data, as it represents a concrete collection of objects.

Disadvantages:

Limited flexibility: Each distinct query needs its own explicitly defined repository method, which increases code volume and makes the repository harder to scale.
Large data volume: The entire result set is loaded into memory at once, which becomes costly with large result sets or when each item occupies significant memory.
No deferred execution: Unlike IQueryable<T>, a List<T> is already materialized, meaning all data is loaded immediately, even if not needed, which can increase response time and system load.
Performance issues with updates: If data in List<T> is frequently updated, it may require re-executing the query and reloading the entire collection, which can negatively impact application performance.

To demonstrate, let's consider the use of List<T> in implementing PortfolioRepository, where immediate access to data is required without additional processing on the client side.

Example #2

PortfolioRepository uses List<Portfolio> for the GetActivePortfolios() method in scenarios requiring immediate access to already filtered and prepared data.

using Microsoft.EntityFrameworkCore; // provides DbContext and the Include() extension

public interface IPortfolioRepository
{
    List<Portfolio> GetActivePortfolios();
}

public class PortfolioRepository : IPortfolioRepository
{
    private readonly DbContext dbContext;

    public PortfolioRepository(DbContext dbContext)
    {
        this.dbContext = dbContext;
    }

    public List<Portfolio> GetActivePortfolios()
    {
        // ToList() executes the query here, inside the repository.
        return dbContext.Set<Portfolio>()
            .Where(p => p.IsActive)
            .Include(p => p.InvestmentTarget)
            .ToList();
    }
}

// consumer side
...
var activePortfolios = portfolioRepository.GetActivePortfolios();
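
The trade-off of immediate execution can be seen in a small in-memory sketch: once ToList() has run, the result is a snapshot, and later changes to the underlying data become visible only after the query is executed again. The source list stands in for the database table; the simplified Portfolio class and the ImmediateExecutionDemo class are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the article's Portfolio entity (Name is an assumption for the demo).
public class Portfolio
{
    public string Name { get; set; } = "";
    public bool IsActive { get; set; }
}

public static class ImmediateExecutionDemo
{
    // Mimics the List<T>-returning repository method: the query runs before returning.
    public static List<Portfolio> GetActivePortfolios(List<Portfolio> source) =>
        source.Where(p => p.IsActive).ToList();

    public static (int before, int after) Run()
    {
        var source = new List<Portfolio>
        {
            new Portfolio { Name = "Growth", IsActive = true },
            new Portfolio { Name = "Archive", IsActive = false },
        };

        List<Portfolio> snapshot = GetActivePortfolios(source);

        // A new active portfolio appears in the "table" after the call.
        source.Add(new Portfolio { Name = "Bonds", IsActive = true });

        // The snapshot is unchanged; a second call is needed to see the new row.
        List<Portfolio> refreshed = GetActivePortfolios(source);
        return (snapshot.Count, refreshed.Count);
    }

    public static void Main()
    {
        var (before, after) = Run();
        Console.WriteLine($"{before} -> {after}"); // 1 -> 2
    }
}
```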

Let's now move on to the next approach: the Specification pattern.

Specification pattern: Clean code and reusability

Using the Specification Pattern, queries are defined as separate classes, improving the domain model's description, code testability, and reuse of queries. However, this adds architectural complexity and can be excessive for simple tasks or applications.

Advantages:

Flexibility and scalability: The Specification Pattern allows flexible and scalable management of queries, adapting them to different needs and changing conditions without altering the repository.
Reusability and testability: Query logic lives in small classes that can be unit tested and reused. The pattern is also well suited to complex business rules where different specifications must be combined dynamically depending on the context.
Better code organization: According to DDD, the pattern promotes better code organization, as the query logic is part of the domain logic.

Disadvantages:

Overcomplication for small projects: In small projects or when working with simple data, the Specification Pattern may lead to unnecessary complexity in the architecture and codebase.
Learning curve: Developers encountering this pattern for the first time need additional time to learn and understand its concepts.
Complexity in query optimization: Due to the complexity and abstraction introduced by the pattern, query optimization can become more challenging, especially when a deep understanding of database interactions or performance optimization is required.

In the following example, the Specification Pattern is used for flexible query composition in the portfolio repository, allowing easy adaptation of data selection logic based on business requirements.

Example #3

In the example of ActivePortfoliosWithDependenciesSpec, the Specification Pattern is applied using the Specification library by Steve Smith (a.k.a. Ardalis), centralizing the query logic in the domain project.

using Ardalis.Specification;

public class ActivePortfoliosWithDependenciesSpec : Specification<Portfolio>
{
    public ActivePortfoliosWithDependenciesSpec()
    {
        Query.Where(p => p.IsActive)
            .Include(p => p.InvestmentTarget);
    }
}

public interface IRepository<T> where T : class, IAggregateRoot
{
    // Returns every entity that matches the specification.
    Task<List<T>> ListAsync(ISpecification<T> spec, CancellationToken cancellationToken = default);
}

public class Repository<T> : IRepository<T> where T : class, IAggregateRoot
{
    // For detailed implementation of the Specification repository, look here: 
    // https://github.com/ardalis/Specification
}

// consumer side
...
var spec = new ActivePortfoliosWithDependenciesSpec();
// ListAsync is used because the specification matches many portfolios;
// SingleOrDefaultAsync would return at most one.
var activePortfolios = await portfolioRepository.ListAsync(spec);
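
To make the idea of combining specifications concrete without depending on any library, here is a minimal hand-rolled sketch. It is not the Ardalis.Specification API: the Spec<T> base class, ActiveSpec, TargetSpec, the ListBySpecs helper, and the flattened Portfolio class with a TargetName property are all hypothetical names introduced for illustration. Each specification wraps a filter expression, and applying several of them in sequence chains Where calls, which amounts to a logical AND.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Simplified stand-in for the article's entities (TargetName is an assumption for the demo).
public class Portfolio
{
    public string Name { get; set; } = "";
    public bool IsActive { get; set; }
    public string TargetName { get; set; } = "";
}

// Hand-rolled base class for illustration; not the Ardalis.Specification API.
public abstract class Spec<T>
{
    public abstract Expression<Func<T, bool>> Criteria { get; }

    public IQueryable<T> Apply(IQueryable<T> query) => query.Where(Criteria);
}

public sealed class ActiveSpec : Spec<Portfolio>
{
    public override Expression<Func<Portfolio, bool>> Criteria => p => p.IsActive;
}

public sealed class TargetSpec : Spec<Portfolio>
{
    private readonly string targetName;

    public TargetSpec(string targetName) => this.targetName = targetName;

    public override Expression<Func<Portfolio, bool>> Criteria =>
        p => p.TargetName == targetName;
}

public static class SpecificationDemo
{
    // Applying specifications one after another chains Where calls (logical AND).
    public static List<Portfolio> ListBySpecs(
        IQueryable<Portfolio> source, params Spec<Portfolio>[] specs) =>
        specs.Aggregate(source, (query, spec) => spec.Apply(query)).ToList();

    public static List<string> Run()
    {
        var data = new List<Portfolio>
        {
            new Portfolio { Name = "Growth", IsActive = true, TargetName = "Retirement" },
            new Portfolio { Name = "Archive", IsActive = false, TargetName = "Retirement" },
            new Portfolio { Name = "Bonds", IsActive = true, TargetName = "Education" },
        }.AsQueryable();

        return ListBySpecs(data, new ActiveSpec(), new TargetSpec("Retirement"))
            .Select(p => p.Name)
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // Growth
    }
}
```

Because the repository only applies whatever specifications it receives, new query rules are added as new classes rather than as new repository methods.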

Let's move on to the last approach: MediatR's Query Request.

MediatR's query request: Clear separation and scalability

The application of MediatR with the CQRS pattern separates queries and their handling, which combines well with other design patterns. However, it requires a deep understanding of CQRS and can lead to an additional increase in the codebase.

Advantages:

Following DDD and CQRS: A clear separation of responsibilities between forming and handling queries simplifies the application's scalability.
Improved application responsiveness: Using MediatR in combination with CQRS can lead to more responsive user interfaces, as read queries and write commands are processed separately, reducing delays.
Better support for asynchrony: MediatR request handlers are asynchronous (they return a Task), so I/O-bound work such as database calls does not block threads, which can improve throughput in networked or high-load systems.

Disadvantages:

Requirement for team organization: Effective use of MediatR and CQRS often requires a well-organized team of developers and a clear separation of responsibilities, which can be problematic in small or less organized teams.
Complication of architecture: The increase in the amount of code and classes can lead to project maintenance complexity, especially in smaller teams, and can be excessive for simple projects.

In the following example, MediatR separates calls according to CQRS, which was discussed in the article "CQRS, Repository, and MediatR".

Example #4

In GetActivePortfoliosQueryHandler, MediatR handles queries, clearly separating responsibilities between components.

using MediatR;

public class GetActivePortfoliosQuery : IRequest<IEnumerable<Portfolio>> { }

public class GetActivePortfoliosQueryHandler : IRequestHandler<GetActivePortfoliosQuery, IEnumerable<Portfolio>>
{
    private readonly IPortfolioRepository portfolioRepository;

    public GetActivePortfoliosQueryHandler(IPortfolioRepository portfolioRepository)
    {
        this.portfolioRepository = portfolioRepository;
    }

    public Task<IEnumerable<Portfolio>> Handle(GetActivePortfoliosQuery request, CancellationToken cancellationToken)
    {
        // The repository method from the earlier examples is synchronous,
        // so its result is wrapped in a completed task rather than awaited.
        IEnumerable<Portfolio> portfolios = portfolioRepository.GetActivePortfolios();
        return Task.FromResult(portfolios);
    }
}

// consumer side
...
var activePortfolios = await mediator.Send(new GetActivePortfoliosQuery());
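
The separation that a mediator provides can be sketched without the library: the caller knows only the query object, and a dispatcher routes it to whichever handler was registered for that query type. The code below is a deliberately simplified, hypothetical illustration of that idea, not MediatR's implementation or API; IQuery<TResult>, IQueryHandler<TQuery, TResult>, SimpleDispatcher, and the sample query and handler are all names invented for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical abstractions mirroring the idea behind IRequest/IRequestHandler.
public interface IQuery<TResult> { }

public interface IQueryHandler<TQuery, TResult> where TQuery : IQuery<TResult>
{
    Task<TResult> Handle(TQuery query);
}

public sealed class SimpleDispatcher
{
    // Maps a query type to a delegate that invokes its registered handler.
    private readonly Dictionary<Type, Func<object, Task<object>>> handlers =
        new Dictionary<Type, Func<object, Task<object>>>();

    public void Register<TQuery, TResult>(IQueryHandler<TQuery, TResult> handler)
        where TQuery : IQuery<TResult>
    {
        handlers[typeof(TQuery)] = async query => await handler.Handle((TQuery)query);
    }

    public async Task<TResult> Send<TResult>(IQuery<TResult> query)
    {
        // Throws KeyNotFoundException if no handler was registered for this query type.
        Func<object, Task<object>> handle = handlers[query.GetType()];
        return (TResult)await handle(query);
    }
}

public sealed class GetActivePortfolioNamesQuery : IQuery<IReadOnlyList<string>> { }

public sealed class GetActivePortfolioNamesHandler
    : IQueryHandler<GetActivePortfolioNamesQuery, IReadOnlyList<string>>
{
    public Task<IReadOnlyList<string>> Handle(GetActivePortfolioNamesQuery query)
    {
        // Hard-coded data stands in for a repository call.
        IReadOnlyList<string> names = new[] { "Growth", "Bonds" };
        return Task.FromResult(names);
    }
}

public static class DispatchDemo
{
    public static async Task<IReadOnlyList<string>> RunAsync()
    {
        var dispatcher = new SimpleDispatcher();
        dispatcher.Register<GetActivePortfolioNamesQuery, IReadOnlyList<string>>(
            new GetActivePortfolioNamesHandler());

        // The caller depends only on the query type, not on the handler.
        return await dispatcher.Send(new GetActivePortfolioNamesQuery());
    }

    public static async Task Main()
    {
        Console.WriteLine(string.Join(", ", await RunAsync())); // Growth, Bonds
    }
}
```

In a real project the library takes care of handler discovery and dependency injection; the sketch only shows why the calling code stays unaware of how a query is handled.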

Practical example

While developing the individual investment tracking application InWestMan, I chose an approach based on using IQueryable<T> in the first iteration. This choice seemed convenient to me and provided flexibility with the ability to extract data from the database as needed and add additional sorting and filtering on the client side.

Initially, I did not consider the violation of CQRS principles to be a severe problem. However, when testing asynchronous changes of records from the client side, difficulties arose with organizing their integration. Over time, I discovered that the business logic for data extraction from the database was scattered throughout the project, contradicting the DDD architecture I was trying to follow. Some logic was embedded directly in the repository implementation, some was at the application level, and some was even implemented in the web client. The only area where there was virtually no logic was the domain model.

In the current version of the project, the Specification Pattern is used for queries and MediatR for commands, so that all business logic is concentrated in the domain project. Each Specification class is responsible for constructing its own database query. As a result, this approach not only simplified the original structure of the project but also enriched the domain model, as all necessary specifications are now concentrated in one place.

Conclusion

The choice of approach for reading data from the database is a crucial decision in designing an application's architecture. It should be based on an analysis of the specific project's requirements. Each of the presented methods (IQueryable<T>, List<T>, the Specification Pattern, and MediatR) has unique advantages, risks, and limitations that must be considered when solving specific tasks.

IQueryable<T> is appropriate for scenarios requiring flexibility in forming queries on the client side, while List<T> suits scenarios where simplicity and immediate execution of queries are essential. The Specification Pattern is a powerful tool for creating clean and testable code, adhering to DDD principles, and is ideal for situations where business logic needs to be separated from data storage mechanisms. MediatR and the CQRS pattern provide a high degree of abstraction and separation of responsibilities, making them suitable for complex applications with advanced business logic and the need for scalability.

It's important to understand that the choice is not limited to a single approach and can combine several strategies depending on the development and expansion of the project. Experience from personal practice, as in the example with the InWestMan application, underscores the importance of flexibility and adaptability in choosing an approach. It also shows how the evolution of requirements can influence architectural decisions and their changes.