:::warning
Codimd should contain presentations only. Use [authzsvc-docs](https://gitlab.cern.ch/authzsvc/authzsvc-docs/blob/master/docs/index.md) for documentation.
:::

# CERN Authentication and Authorization Infrastructure Design

## Overview

### Current Infrastructure

CERN's Authentication and Authorisation Infrastructure (AAI) grants access to scientific and enterprise services for CERN employees and visiting researchers. CERN's AAI is currently based on several core services, namely:

- An Oracle database (Foundation) where the HR department enters Users into the system
- Proprietary software called e-groups for Authorisation Groups
- Active Directory as the LDAP User Database
- ADFS for Single Sign On
    - supporting Certificates, Kerberos, SAML and OIDC user authentication
    - supporting SAML and OAuth2 service authentication/authorisation
- Forefront Identity Manager (FIM) for:
    - synchronisation between Foundation and LDAP
    - managing the lifecycle of computing resources (computing accounts, websites, mailboxes etc)

![Current Infrastructure](https://codimd.web.cern.ch/uploads/upload_d3409f7db221937a2934fe0f11573544.png)

| Accounts | Count |
| --- | --- |
| Primary (main account for a person, created automatically) | 40k |
| Secondary (personal account, created on demand) | 12k |
| Service (account for service credentials or official use) | 6k |
| **Total** | **58k** |

| Applications | Count |
| --- | --- |
| SAML Applications | 15,800 |
| OIDC Applications | 300 |
| **Total** | **16,100** |

#### High Energy Physics

It is worth noting that the High Energy Physics workflows, through the Worldwide Large Hadron Collider Computing Grid ([WLCG](http://wlcg-public.web.cern.ch)), currently operate under an entirely separate authentication and authorisation model. Researchers use X.509 certificates provided by Interoperable Global Trust Federation ([IGTF](https://www.igtf.net)) Certificate Authorities.

### Motivation for Change

#### Microsoft license costs

Until recently, CERN had been considered eligible for academic prices of Microsoft products. Along with many other research institutes, CERN has been disqualified because it does not grant degrees, and it now faces a 20-fold increase in license costs. An appropriate alternative should be found for all high-cost components, such as AD and ADFS.

#### Data protection

The current infrastructure shares all Authorisation information with all applications. To support Data Protection initiatives, only relevant data should be sent to a service, since even the title of an Authorisation group may expose sensitive data about an individual.

#### Full support for Federated Users

Accessing CERN with a federated identity, e.g. from eduGAIN, currently allows only limited access to services. It is not possible to assign federated users to authorisation groups. Additionally, certain services, such as Infrastructure as a Service (e.g. Jupyter Notebooks, OpenStack, CERNBox), require a full LDAP account for access.

#### Identity Lifecycle

A lightweight CERN account is currently maintained for individuals who have left CERN, so that they can access their data. The lifecycle management for these accounts brings overhead, and certain functionality is lost for the users.

#### Ad-hoc Token Translation

Certain Infrastructure as a Service resources (e.g. Jupyter Notebooks, OpenStack, CERNBox) require Kerberos authentication. It is increasingly common for such services to have a web front end for user access. Today, there is no centrally managed token translation from web login tokens (SAML or OAuth2) to Kerberos, so services have implemented their own, including the management of administrative credentials that are able to impersonate any user. Removing such workflows and providing a central token translation facility for CERN that conforms with best security practices would provide significant advantages.
It remains to be seen exactly how this can be achieved.

## Goals

A new authentication and authorization system that supports workflows expected by CERN employees and researchers:

- Physicists access CERN resources with home institute credentials
- Retirees access health insurance and pension data with a social account after leaving
- Access Control by design for all applications
- Policies and resource lifecycles (e.g. websites, VMs) for CERN, federated and social users (federated authorization)

### Required Features

A selection of features is included here to provide background for the proposed architectural decisions. This does not constitute a full list.

| Feature | Description |
| --- | --- |
| Automatic application registration | The high number of applications requires an automatic registration process where trusted application owners can provide information through a web interface. |
| High availability | The number and distribution of users require a high-availability setup around the clock. |
| Variety of User Authentication Options | LDAP, SAML Identity Federation, OIDC (ORCID and GitHub in particular), Certificates (IGTF X.509), Kerberos. |
| Application specific Authorisation | A token for a specific service should only contain roles relevant for the service. Authorisation should be defined upon application registration and can include groups, LoA and multifactor requirements. |
| Account linkage | Users should be able to link multiple credentials to a single Identity. |
| LoA visibility | The Level of Assurance should be calculated based on login credential and additional attributes (e.g. Identity Verification), and made available to services. |
| Group management by users | Users should be able to create, populate and maintain authorisation groups themselves. |

## Conceptual Design

With the Authorization Service, application owners will be able to define roles for their application in order to grant user permissions.
Roles will allow the application owner to specify requirements for Levels of Assurance and Multifactor Authentication. Each role will be application-specific, i.e. role membership will not be visible to other applications.

Roles are not directly assigned to accounts, but to identities. An identity, in turn, is mapped to one or more accounts. This decouples permissions from authentication, and lets users manage their own identity mappings.

![Authorization Service Model](https://codimd.web.cern.ch/uploads/upload_649e8c9a9b3a49716109585ae0ceeb48.png)

The following example illustrates these concepts applied to Indico (CERN's event management system). Indico administrators could define two roles:

- Indico User: everybody with a high enough LoA, e.g. eduGAIN and CERN users.
- Indico Admin: a fixed set of identities, requiring MFA.

The runtime mapping of accounts to roles is performed by the Single Sign-On service, which also enforces the LoA and MFA requirements: a role is not assigned if the account the user authenticated with does not satisfy them.

From the user's point of view, after leaving CERN they can associate an account from another institution with their previous identity, thus maintaining access to their own documents.

With this model, authorization is decoupled from the authentication provider. It does not matter whether a user is authenticating with a CERN account, a federated account or a social account (with the exception of LoA requirements). This allows granting privileges to non-CERN accounts as well, and providing authorization for federated accounts.

![Example with Indico](https://codimd.web.cern.ch/uploads/upload_a4cb8d7d3d66aea04503f5a5753c4248.png)

## Proposed Architecture

This diagram illustrates the proposed architecture for the CERN authentication and authorization infrastructure. Red blocks represent applications and services that the infrastructure interacts with.
Green blocks represent standard (commercial or open source) products and services that form an essential part of the infrastructure. Blue blocks represent custom developments.

![What it should look like](https://codimd.web.cern.ch/uploads/upload_fa67fca6375b68abee23314c801c0b8c.png)

### Authorization Service API and DB

The central part of the whole system is the Authorization Service API, which manages the Authorization Service Database. The database stores:

- users' identities and accounts
- applications and their permissions (roles)
- groups
- policies
- computing resources for which we want to provide a lifecycle

The API is used by the other components of the service and also by external CERN applications. External applications can use the API to:

- query for user rights (e.g. in case the application needs to query the rights of a user that is not the currently logged-in user)
- define roles dynamically
- manage groups based on custom criteria

### Computing Resources Management

The Computing Resources Management module is tightly integrated with the Authorization Service API, and enforces the authorization and lifecycle policies for CERN accounts and other resources.

**Authorization policies** define the operations that users are allowed to perform, e.g.:

- CERN users can create new CERN accounts for themselves
- CERN and eduGAIN users can create OpenStack projects

**Lifecycle policies** define the lifecycle of the resources managed by the system, e.g.:

- When a new user joins CERN, they automatically get a new CERN account
- 60 days after a user leaves CERN, their personal accounts are blocked
- 180 days after a user leaves CERN, their personal accounts are deleted
- When a user leaves CERN, their official websites are transferred to their supervisor

Note that the policies of the Computing Resources Management component are defined on identities, so a restriction like "eduGAIN users" should be defined either through a group/role or through a LoA restriction.

### Applications Portal

Allows users to:

- Register applications
- Define roles for applications, specifying LoA and MFA requirements
- Define the authentication schemes used by the application (OIDC, SAML, Kerberos)

### Users Portal

Allows users to:

- Manage their own identities
- Associate additional accounts to an existing identity

### Groups Portal

Allows users to:

- Define static groups of identities
- Define dynamic groups of identities for CERN users (limited to CERN identities, as meaningful criteria can only be defined for those identities)

### Identities and Resources synchronization

These subsystems, based on Microsoft Identity Manager (MIM), synchronize data across different systems. MIM is the successor of FIM and remains a Microsoft product - however, the license costs for this service are deemed reasonable and at this time there is no plan to identify an alternative.

The **identities synchronization** instance synchronizes *only identities* (i.e. it does not care about accounts) from Foundation. Foundation is the authoritative data source for CERN identities.
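
The lifecycle policies listed under Computing Resources Management could be evaluated along the following lines. This is only a minimal sketch: the `Identity` record, its field names and the `personal_account_state` function are hypothetical, and the assumption that the 60 and 180 day thresholds are inclusive is ours, not part of the design.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Thresholds taken from the example lifecycle policies in this document.
BLOCK_AFTER = timedelta(days=60)
DELETE_AFTER = timedelta(days=180)


@dataclass
class Identity:
    """Hypothetical identity record; the field names are illustrative only."""
    name: str
    left_cern_on: Optional[date] = None  # None while the person is still at CERN


def personal_account_state(identity: Identity, today: date) -> str:
    """Return 'active', 'blocked' or 'deleted' for the identity's personal accounts.

    Assumption: a threshold applies from the day it is reached (inclusive).
    """
    if identity.left_cern_on is None:
        return "active"
    elapsed = today - identity.left_cern_on
    if elapsed >= DELETE_AFTER:
        return "deleted"
    if elapsed >= BLOCK_AFTER:
        return "blocked"
    return "active"
```

Because the policy is evaluated on the identity rather than on an individual account, the same check would apply to every personal account mapped to that identity.
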

Note: the automatic creation of a CERN account for newcomers will be triggered by the Computing Resources Management module, as automatic account creation will be defined as a policy.

The **resources synchronization** instances will synchronize computing resources (accounts, websites, OpenStack projects etc) to the external systems. If an external system is interested in getting identity information (to know the owner of a resource), it will also be possible to synchronize identities.

![FIM servers](https://codimd.web.cern.ch/uploads/upload_4003e5972b3ce4868d3f48276862d615.png)

This setup offers several advantages:

- Simpler MIM setups: only one source is authoritative for accounts or identities in each setup.
- We can keep synchronization from Foundation reasonably fast thanks to the dedicated MIM instance. Ideally, from the moment a user is registered in Foundation, their account should be created within 30 minutes.
- Scalability and modularity.

### LDAP / Kerberos

FreeIPA instances will provide LDAP and Kerberos authentication.

### SSO service and SSO extension

Keycloak provides the Single Sign-On service, offering OIDC and SAML authentication to external applications.

## Service Access

The following diagram visualises credential flow within the new system. Some key points are highlighted below.

- WLCG is currently designing a web portal for managing access to grid services. The portal manages token translation to OAuth2 (following the WLCG schema) and X.509 for legacy grid services. It is likely that this service will be connected to CERN's AAI as a web client.
- Infrastructure portals will also be registered as web clients and require token translation to Kerberos. It remains to be seen how/where this functionality will be provided.
- Simple Web Services will be registered as web clients to Keycloak
- Kerberos/LDAP based services will be able to authenticate users via FreeIPA

![](https://codimd.web.cern.ch/uploads/upload_eeab757f5032d14a180384eb1fb4cd92.png)

## Comparison with AARC

To aid comparison with AARC, a diagram of the proposed design has been reformulated in line with the AARC Blueprint Architecture (BPA). This only covers applications registered with SSO (Keycloak) - Kerberos based applications may interact directly with LDAP (FreeIPA).

https://aarc-project.eu/architecture/

![](https://codimd.web.cern.ch/uploads/upload_26295a643b2cd215ca3b303301886525.png)
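
As a closing illustration, the runtime mapping of accounts to roles described under Conceptual Design could look roughly like the sketch below. Everything here is an assumption made for illustration: the `Account` and `Role` types, the numeric LoA scale, the identifiers and the `roles_for_login` function are hypothetical and do not describe Keycloak's or the Authorization Service's actual API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Account:
    """Hypothetical account used to authenticate (CERN, federated or social)."""
    account_id: str
    loa: int  # assumed numeric Level of Assurance, higher is stronger


@dataclass
class Role:
    """Hypothetical application-specific role, assigned to identities."""
    name: str
    application: str
    min_loa: int = 0
    requires_mfa: bool = False
    members: Optional[Set[str]] = None  # identity ids; None means any identity


def roles_for_login(
    account: Account,
    used_mfa: bool,
    application: str,
    account_to_identity: Dict[str, str],
    roles: List[Role],
) -> List[str]:
    """Return the role names to place in the token issued for one application."""
    identity = account_to_identity.get(account.account_id)
    if identity is None:
        return []  # the account is not linked to any identity
    granted = []
    for role in roles:
        if role.application != application:
            continue  # roles of other applications are never exposed
        if role.members is not None and identity not in role.members:
            continue
        if account.loa < role.min_loa:
            continue  # the account used for this login is not strong enough
        if role.requires_mfa and not used_mfa:
            continue
        granted.append(role.name)
    return granted
```

With the Indico example, an "Indico Admin" role restricted to a fixed set of identities and requiring MFA would be granted only when one of those identities logs in with a sufficiently strong account and a second factor. The same identity logging in through a low-LoA social account would receive no roles, even though the identity itself is a member.
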