2016-2023 | EMAsphere

Scalable Multi-Tenant IAM Service for a SaaS

1. Project Overview

1.1. The Challenge

A client developing a complex, multi-tenant Software-as-a-Service (SaaS) platform required a centralized, robust, and highly scalable solution to manage identity, tenancy, and permissions. The core challenge was to build a foundational microservice that could handle a sophisticated security model while remaining performant and maintainable. The service needed to support a wide range of requirements, from isolating the data of thousands of customer tenants to providing granular, object-level permissions and enabling feature entitlement based on commercial subscription plans.

1.2. The Solution

The project involved the design and architecture of a comprehensive Identity and Access Management (IAM) microservice, referred to as the "Directory Service." This service was built to act as the single source of truth for all security and tenancy-related data across the platform. The final architecture implemented a powerful hybrid authorization model, combining the scalability of Role-Based Access Control (RBAC) with the granularity of Access Control Lists (ACLs). It also featured a sophisticated hierarchical tenancy model and a high-performance caching layer to ensure minimal latency for authorization checks.

2. Core Architectural Pillars

The solution was built on a set of core architectural concepts designed for flexibility and scale.

  • Hierarchical Tenancy: The foundation of the design is a hierarchical model for tenants (customers). Instead of a flat list, tenants are nodes in a tree structure. This allows customers to model complex organizational structures (e.g., a parent company with multiple subsidiaries) and enables the powerful inheritance of permissions and settings down the hierarchy.

  • Hybrid Authorization (RBAC + ACL): Because a single authorization model is often insufficient, a hybrid approach was chosen:

    • RBAC is used for broad, role-based permissions (e.g., an "Accountant" role). This is easy to manage at scale.
    • ACLs are used for fine-grained permissions on specific data objects (e.g., granting a specific user "read-only" access to a single invoice). This provides maximum flexibility.
  • Subscription-Based Entitlement: The architecture decouples functional permissions from commercial offerings. A tenant's available features are controlled by a "Subscription Plan," which acts as a master filter on the permissions their users can have. This allows the business to easily create different product tiers (e.g., Basic, Pro, Enterprise) without changing the core permission logic.

  • Performance via Denormalization: Because authorization checks sit on the critical path of almost every user action, the architecture prioritizes read performance. A denormalized caching strategy (with Redis) pre-calculates user permissions, so that a runtime authorization check reduces to a fast cache read.
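
As an illustration of the hierarchical tenancy pillar, the sketch below models tenants as nodes in a tree and resolves a setting by walking up toward the root. The class name `TenantNode`, its fields, and the "nearest definition wins" rule are assumptions made for this example; the case study does not specify how overrides are resolved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical tenant node; names and fields are illustrative only.
final class TenantNode {
    final String id;
    final TenantNode parent; // null for a root tenant
    final Map<String, String> settings = new HashMap<>();

    TenantNode(String id, TenantNode parent) {
        this.id = id;
        this.parent = parent;
    }

    // Ancestors ordered from the nearest parent up to the root.
    List<TenantNode> ancestors() {
        List<TenantNode> result = new ArrayList<>();
        for (TenantNode t = parent; t != null; t = t.parent) {
            result.add(t);
        }
        return result;
    }

    // Assumed rule: a value set on this tenant wins; otherwise the
    // nearest ancestor that defines the key supplies the value.
    Optional<String> effectiveSetting(String key) {
        for (TenantNode t = this; t != null; t = t.parent) {
            String value = t.settings.get(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }
}
```

With a parent company and one subsidiary, a setting defined only on the parent is visible from the subsidiary until the subsidiary defines its own value.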

3. Key Features & Technical Deep Dive

3.1. Hierarchical Group and Role Management

To make the RBAC model work across the tenant tree, groups and roles were extended to be hierarchy-aware.

  • Inheritable Groups: An administrator can create a group at a high level in the tenant hierarchy (e.g., at the parent company level) and mark it as "inheritable." Members of this group automatically inherit the group's roles and permissions in all descendant tenants (e.g., all subsidiaries). This drastically simplifies managing cross-company roles like auditors or executives.

  • Role Resolution Algorithm: When a user logs into a specific tenant, the service calculates their roles by combining:

    1. Roles assigned directly to the user in the current tenant.
    2. Roles assigned to the groups they are a member of in the current tenant.
    3. Roles assigned to any "inheritable" groups they are a member of in any parent or ancestor tenant.
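
The three sources of roles listed above can be sketched as a single union. This is a minimal in-memory illustration, not the service's actual code: `RoleResolver`, `Group`, the `parentOf` map, and the `"userId@tenantId"` key format are all hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RoleResolver {
    // Hypothetical group model: a group belongs to one tenant and carries roles.
    static final class Group {
        final String tenantId;
        final boolean inheritable;
        final Set<String> members;
        final Set<String> roles;

        Group(String tenantId, boolean inheritable, Set<String> members, Set<String> roles) {
            this.tenantId = tenantId;
            this.inheritable = inheritable;
            this.members = members;
            this.roles = roles;
        }
    }

    final Map<String, String> parentOf = new HashMap<>();         // tenantId -> parent tenantId
    final Map<String, Set<String>> directRoles = new HashMap<>(); // "userId@tenantId" -> roles
    final List<Group> groups = new ArrayList<>();

    Set<String> resolve(String userId, String tenantId) {
        // 1. Roles assigned directly to the user in the current tenant.
        Set<String> roles = new HashSet<>(
                directRoles.getOrDefault(userId + "@" + tenantId, Set.of()));
        for (Group g : groups) {
            if (!g.members.contains(userId)) {
                continue;
            }
            // 2. Groups in the current tenant.
            boolean sameTenant = g.tenantId.equals(tenantId);
            // 3. Inheritable groups defined in any ancestor tenant.
            boolean fromAncestor = g.inheritable && isAncestor(g.tenantId, tenantId);
            if (sameTenant || fromAncestor) {
                roles.addAll(g.roles);
            }
        }
        return roles;
    }

    // True if 'candidate' lies on the path from tenantId's parent up to the root.
    boolean isAncestor(String candidate, String tenantId) {
        for (String t = parentOf.get(tenantId); t != null; t = parentOf.get(t)) {
            if (t.equals(candidate)) {
                return true;
            }
        }
        return false;
    }
}
```

Note that a non-inheritable group at the parent level contributes nothing in a subsidiary, which is what keeps parent-only roles from leaking down the tree.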

3.2. The Global Entitlement Resolution Pipeline

The final set of a user's entitlements for a session is determined by a multi-stage pipeline that runs at login and applies both security rules and business rules.

  1. Role Resolution: The process starts by gathering all of the user's effective roles using the hierarchical algorithm described above.
  2. Entitlement Expansion: Each role is expanded into a set of granular, functional permissions, called "entitlements." The expansion is driven by a configurable "Role-to-Entitlement" map, so the meaning of a role can be customized per tenant.
  3. Subscription Filtering: This is a critical step where business logic is applied. The expanded list of entitlements is filtered against the tenant's active Subscription Plan. Any entitlement not included in the plan is removed, effectively disabling features the customer has not paid for.
  4. Final Entitlement Set: The resulting set of roles and filtered entitlements is attached to the user's session token and used by all other microservices to enforce access control.
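
Stages 2 and 3 of the pipeline amount to a set union followed by a set intersection. The sketch below shows that shape under assumed names: `EntitlementPipeline` and the entitlement strings such as `invoice.read` are invented for illustration, and roles are assumed to have been resolved already.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class EntitlementPipeline {
    // Stage 2: expand each role through the tenant's role-to-entitlement map.
    // Roles missing from the map contribute nothing.
    static Set<String> expand(Set<String> roles, Map<String, Set<String>> roleToEntitlements) {
        Set<String> out = new TreeSet<>();
        for (String role : roles) {
            out.addAll(roleToEntitlements.getOrDefault(role, Set.of()));
        }
        return out;
    }

    // Stage 3: keep only entitlements included in the tenant's subscription plan.
    static Set<String> filterByPlan(Set<String> entitlements, Set<String> planEntitlements) {
        Set<String> out = new TreeSet<>(entitlements);
        out.retainAll(planEntitlements);
        return out;
    }

    static Set<String> resolve(Set<String> roles,
                               Map<String, Set<String>> roleToEntitlements,
                               Set<String> planEntitlements) {
        return filterByPlan(expand(roles, roleToEntitlements), planEntitlements);
    }
}
```

Because the plan is applied as a filter after expansion, adding a product tier only means defining a new plan set; the role map itself stays untouched.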

3.3. High-Performance ACLs with a Denormalized Cache

The ACL system was designed to make permission checks as cheap as possible at read time.

  • The Challenge: A traditional ACL check would require multiple database queries: find the user, find their groups, find the resource, check for a user access control entry (ACE), then check for a group ACE for each group. This is too slow for real-time enforcement.
  • The Solution: A denormalized Redis cache. When a permission is granted to a group, the service does the hard work upfront: it resolves every single user in that group (and any nested groups) and writes a direct, user-specific permission entry into the cache.
  • The Result: At runtime, checking if a user can perform an action on a resource becomes a single, constant-time cache lookup. This eliminates database hotspots and ensures that security checks have a negligible impact on application performance.
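
The write-time fan-out described above can be sketched as follows. A plain `HashMap` stands in for Redis so that the example runs on its own; the class name `DenormalizedAcl`, the `acl:{userId}:{resourceId}` key format, and the cycle guard for nested groups are assumptions made for this example, not details taken from the Directory Service.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class DenormalizedAcl {
    final Map<String, Set<String>> groupUsers = new HashMap<>();   // groupId -> direct members
    final Map<String, Set<String>> nestedGroups = new HashMap<>(); // groupId -> child groups
    final Map<String, Set<String>> cache = new HashMap<>();        // stand-in for Redis

    // Write path: resolve every member up front and store one entry per user.
    void grantToGroup(String groupId, String resourceId, String action) {
        for (String userId : flattenMembers(groupId, new HashSet<>())) {
            cache.computeIfAbsent(key(userId, resourceId), k -> new HashSet<>()).add(action);
        }
    }

    // Collects members of the group and of all nested groups.
    // 'visited' guards against cyclic nesting.
    Set<String> flattenMembers(String groupId, Set<String> visited) {
        Set<String> users = new HashSet<>();
        if (!visited.add(groupId)) {
            return users;
        }
        users.addAll(groupUsers.getOrDefault(groupId, Set.of()));
        for (String child : nestedGroups.getOrDefault(groupId, Set.of())) {
            users.addAll(flattenMembers(child, visited));
        }
        return users;
    }

    // Read path: a single keyed lookup, with no group traversal.
    boolean isAllowed(String userId, String resourceId, String action) {
        return cache.getOrDefault(key(userId, resourceId), Set.of()).contains(action);
    }

    static String key(String userId, String resourceId) {
        return "acl:" + userId + ":" + resourceId;
    }
}
```

The trade-off of this design is that group membership changes must also be propagated to the per-user entries. The case study does not describe how that is handled, so the sketch leaves it out.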

4. Technology Stack

  • Backend: Java, Spring Boot, Spring Security
  • Data Persistence: PostgreSQL
  • Caching: Redis
  • Architecture: Microservices

5. Architectural Outcomes

The architecture of the Directory Service delivered significant value to both the client's business and their engineering teams by providing a secure, scalable, and highly performant foundation for the entire SaaS platform.

5.1. Business Outcomes

  • Enhanced Business Agility: The subscription-based entitlement system provided a powerful tool for the product and sales teams. They could now design, launch, and manage different commercial offerings and feature tiers without needing to file engineering tickets, enabling a much faster response to market demands.

  • Flexible Organizational Modeling for Diverse Markets: The hierarchical tenancy model proved to be a key strategic advantage, allowing the platform to serve a wide spectrum of customer segments within a single architecture. The system seamlessly accommodates simple SMEs, large multi-entity corporations, and fiduciary firms managing thousands of their own client tenants. Crucially, this model empowers partners like fiduciaries with full self-service capabilities to create and manage their client portfolios directly on the platform, eliminating operational bottlenecks and enabling scalable growth.

  • Improved Security and Compliance: By centralizing all IAM logic, the architecture established a single, auditable source of truth for security policy, reducing the risk of inconsistent or incorrect permission enforcement. The powerful group and hierarchy model also made it easier for customers to comply with their own internal security policies.

5.2. Technical Outcomes

  • High Scalability and Performance: The hierarchical model and the denormalized caching strategy ensured that the platform could scale to support millions of users across thousands of tenants without performance degradation, providing a solid foundation for future growth.

  • Improved Developer Productivity (The "Complexity Shield"): Perhaps the most significant technical outcome was the abstraction of complexity. The Directory Service acts as a "complexity shield," providing a simple contract to the rest of the microservices ecosystem. This dramatically reduced cognitive load and accelerated feature development across the entire platform. Key aspects of this abstraction include:

    • Self-Contained Authorization Payloads: The service produces a rich authorization payload for each user session, which is packaged into a secure token (e.g., a JWT) by the platform's authentication entrypoint. Business-logic services are stateless and do not need to call the Directory Service on every request.
    • Declarative Security: Developers in other services enforce permissions with simple annotations. A shared client library handles the complexity of performing a high-speed cache lookup, hiding it from the developer.
    • Simplified Data Access: Downstream services access essential, flattened data (like tenant metadata) from a shared key-value cache, decoupling them from the Directory's internal database. For their own persistence, multi-tenancy is reduced to a simple tenant_id column, as the Directory Service has already resolved all complex hierarchical logic.
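
To make the "declarative security" idea concrete, here is a minimal sketch of an annotation-driven check in plain Java. The annotation `@RequiresEntitlement`, the `EntitlementGuard` helper, and the sample `InvoiceService` are hypothetical; the case study does not name the shared client library's API. In a Spring application such a check would more likely run inside an interceptor or rely on Spring Security's method security rather than being called by hand.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Set;

// Hypothetical annotation naming the entitlement a method requires.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface RequiresEntitlement {
    String value();
}

// Example downstream service: the developer only declares the requirement.
final class InvoiceService {
    @RequiresEntitlement("invoice.export")
    public String exportInvoice(String invoiceId) {
        return "exported " + invoiceId;
    }
}

// Stand-in for the shared client library: compares the declared requirement
// with the entitlements carried in the caller's session.
final class EntitlementGuard {
    static void check(Class<?> type, String methodName, Set<String> sessionEntitlements) {
        for (Method m : type.getDeclaredMethods()) {
            RequiresEntitlement required = m.getAnnotation(RequiresEntitlement.class);
            if (m.getName().equals(methodName)
                    && required != null
                    && !sessionEntitlements.contains(required.value())) {
                throw new SecurityException("Missing entitlement: " + required.value());
            }
        }
    }
}
```

Calling `EntitlementGuard.check(InvoiceService.class, "exportInvoice", entitlements)` before the method runs throws a `SecurityException` when the session lacks `invoice.export` and returns normally otherwise.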