There are three means of communicating with a Tator deployment: REST requests, object storage requests, and the Kubernetes API. REST requests are the primary means of communicating with Tator; they are load balanced and proxied by NGINX. Object storage requests are made using presigned URLs provided by REST requests. Those requests may be made directly to the object storage service, or the object storage service may be proxied by NGINX. Finally, the Kubernetes API gives administrative access to a Tator deployment. The Kubernetes API is protected by SSL encryption and can typically be accessed only by the maintainer of a Tator deployment.
Tator is deployed using Helm, a package manager for Kubernetes. Creating or updating a Tator deployment starts with building the necessary Docker images, which include a backend image and a client image for performing asynchronous work such as transcodes. The images are pushed to a Docker container registry. Helm then reads a local configuration file in YAML format and uses it to create Kubernetes objects. The configuration file includes sensitive data for accessing services such as the container registry, databases, and object storage. This data is stored in Kubernetes using “Secret” objects and is made available to containers as necessary using environment variables or mounted files. These values are only accessible via the Kubernetes API.
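As an illustration of how a container might consume such a value, the following is a minimal Python sketch that looks for a secret first in an environment variable and then in a mounted file. The function name read_secret, the mount directory /etc/secrets, and the key names are hypothetical and are not taken from the Tator code base; the sketch only reflects the general Kubernetes convention that each key of a mounted Secret appears as a file.

```python
import os
from pathlib import Path
from typing import Optional


def read_secret(name: str, mount_dir: str = "/etc/secrets") -> Optional[str]:
    """Return a secret from an environment variable, else from a mounted file.

    Both `name` and `mount_dir` are illustrative; a real deployment defines
    its own variable names and mount paths in its Kubernetes manifests.
    """
    # Case 1: the Secret key was injected as an environment variable.
    value = os.environ.get(name)
    if value is not None:
        return value
    # Case 2: the Secret was mounted as a volume, one file per key.
    path = Path(mount_dir) / name
    if path.is_file():
        return path.read_text().strip()
    # The secret was not provided to this container.
    return None
```

Either way, the value only exists inside the container that was granted it, so application code never needs to call the Kubernetes API itself.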
Leveraging cloud service security
Tator can run all services in the same Kubernetes cluster using Helm dependencies. This is a suitable configuration for development, in which all Tator services run on a single node or virtual machine and storage is local. For production deployments, Tator allows use of managed services. The following services may be replaced with managed services: Kubernetes, MetalLB, MinIO, PostgreSQL, and Redis. All database services must be part of the same virtual private cloud (VPC) and security group as the web-serving Kubernetes cluster. Object storage and the asynchronous work Kubernetes cluster can be external to the VPC, allowing for configurations such as using a different cloud service provider or on-premises resources for storage or processing. Both are accessed using HTTPS with credentials that are either defined in the Helm configuration file (if the external resources are used globally in the deployment) or stored in the PostgreSQL database (if the external resources are defined and used by specific organizations).
User accounts are managed within Tator and can be set up with normal or superuser privileges, with finer access controls for specific database resources. Superuser accounts are meant only for administrators of Tator deployments. A superuser account can access multiple administrator tools that run in Tator, including Grafana (used for monitoring cluster health, scaling, latency, and other metrics) and the Django admin console. Superuser status can only be granted via the Django shell (not via REST requests). The Django shell can only be accessed through the Kubernetes API, and remote Kubernetes API access to the deployment cluster is not permitted without security keys generated from a pre-existing administrative context.
Currently our REST API supports session and token user authentication methods. Our front end uses session cookies provided through Django’s authentication system via traditional username and password, while access outside of the browser, using tools such as cURL or our Python client library tator-py, uses token authentication. A token for a user can be revoked and regenerated by administrators or by the user. A user can only have one token at a time. Tator also accepts authentication and identification information from third-party identity providers, such as Okta via SAML2. Currently, when configured for external authentication, Tator generates local accounts based on the identity provided, such that the local Tator account is permanently associated with the unique identity provided by the trusted third party. Under this scheme, Tator is a consumer of identity, not a provider. There is planned work to integrate with a backend service called Keycloak that will allow us to replace authentication via session cookies with OpenID Connect (OIDC). Keycloak supports third-party authentication via many identity providers. By delegating authentication to Keycloak, Tator will be more portable to various deployment environments while simplifying its code base.
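To make token authentication concrete, here is a minimal Python sketch that builds (but does not send) a request carrying a token. It assumes the "Token <value>" header scheme used by Django REST Framework's TokenAuthentication class; the host, the path, the token value, and the helper name build_token_request are placeholders chosen for illustration.

```python
import urllib.request


def build_token_request(host: str, path: str, token: str) -> urllib.request.Request:
    """Build a GET request that carries an API token; nothing is sent here."""
    url = host.rstrip("/") + path
    # Assumption: the "Token <value>" scheme of Django REST Framework's
    # TokenAuthentication. The token should be treated like a password.
    headers = {"Authorization": f"Token {token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


# Hypothetical host, path, and token, for illustration only.
request = build_token_request(
    "https://tator.example.com", "/rest/Projects", "0123456789abcdef"
)
```

Passing the resulting object to urllib.request.urlopen would perform the call. Because a user has only one token at a time, regenerating the token invalidates any script still holding the old value.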
When using the internal Tator username and password service, brute force detection and prevention is enabled for failed login attempts. Under this scheme, three failed login attempts cause an account lockout for 10 minutes, during which logins are rejected even if the correct password is entered. Attempts to log in during this period reset the 10-minute timer. Each time the timer is set for a user, site administrators are notified.
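The lockout rules above can be sketched as a small in-memory state machine. This is an illustrative model only, not Tator's implementation: the class name LockoutTracker, the injectable clock, the notify callback, and the decision to clear the failure counter when a lockout starts are all assumptions made for the sketch.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict

MAX_FAILURES = 3           # failed attempts that trigger a lockout
LOCKOUT_SECONDS = 10 * 60  # lockout duration: 10 minutes


@dataclass
class LockoutTracker:
    notify: Callable[[str], None]                 # called each time the timer is set
    clock: Callable[[], float] = time.monotonic   # injectable for testing
    failures: Dict[str, int] = field(default_factory=dict)
    locked_until: Dict[str, float] = field(default_factory=dict)

    def attempt(self, username: str, password_ok: bool) -> bool:
        """Record a login attempt and return True only if it succeeds."""
        now = self.clock()
        if now < self.locked_until.get(username, 0.0):
            # Locked out: even a correct password is rejected, and the
            # attempt restarts the 10-minute timer.
            self._lock(username, now)
            return False
        if password_ok:
            self.failures.pop(username, None)
            return True
        self.failures[username] = self.failures.get(username, 0) + 1
        if self.failures[username] >= MAX_FAILURES:
            self._lock(username, now)
        return False

    def _lock(self, username: str, now: float) -> None:
        self.locked_until[username] = now + LOCKOUT_SECONDS
        # Assumption: the failure counter starts over once a lockout begins.
        self.failures[username] = 0
        self.notify(username)
```

Injecting the clock keeps the policy testable without waiting ten minutes; a production system would also need to persist this state across server processes.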
There are two main scopes for controlling access to database resources: organizations and projects. Nearly all REST requests check access to the associated object or table for one of these scopes.
Access to organizations is controlled by an associative table called affiliations. Currently there are two levels of access for organizations: member and admin. Admin users can create, edit, and delete resources associated with organizations, such as algorithm cluster definitions, object storage buckets, and invitations to the organization. Tator can be configured to allow access only via organizational invitation (the initial organization is set up by Tator administrators). Projects must be owned by an organization, and only an admin affiliate can create projects under an organization. Users may be affiliated with multiple organizations and may have admin access to multiple organizations. The member level currently does not grant any privileges for the organization.
Access to projects is controlled by an associative table called memberships. Note that memberships exist independently from affiliations, so a user is not required to be affiliated with the owning organization of a project to be a member of that project. This is a design decision to foster cross-organization cooperation. Currently there are five levels of access to a project: view only, can edit, can transfer, can execute, and full control. The levels are currently cumulative, so for example “can execute” has all the permissions of the lower access levels. View only allows users to view media and annotations. Edit allows users to change annotations. Transfer allows users to upload new data and perform bulk downloads through the web interface. Execute allows users to execute workflows that are already registered with the project. Finally, full control allows users to register algorithms, define annotation metadata, edit project info, and add new project members. The current access control scheme is high level, and there are plans to introduce finer-grained access controls.
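One way to picture cumulative levels is as an ordered enumeration, where a single comparison decides whether a membership is sufficient. The sketch below is illustrative: the names Permission and allows, and the numeric values, are assumptions and do not reflect how Tator stores these levels.

```python
from enum import IntEnum


class Permission(IntEnum):
    """Project access levels in increasing order (values are illustrative)."""
    VIEW_ONLY = 1
    CAN_EDIT = 2
    CAN_TRANSFER = 3
    CAN_EXECUTE = 4
    FULL_CONTROL = 5


def allows(granted: Permission, required: Permission) -> bool:
    """Levels are cumulative, so a higher level implies every lower one."""
    return granted >= required


# A "can execute" member may also upload data, which needs "can transfer".
can_upload = allows(Permission.CAN_EXECUTE, Permission.CAN_TRANSFER)
# A "can edit" member may not.
cannot_upload = allows(Permission.CAN_EDIT, Permission.CAN_TRANSFER)
```

A finer-grained scheme, as mentioned above, would likely replace the single ordering with independent permission flags.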
The Tator REST service is proxied by NGINX, which ensures that HTTPS is required for all requests when it is configured (HTTP is still supported for development purposes). NGINX forwards requests to the REST service, which is implemented using Django REST Framework and served internal to the Kubernetes cluster with Gunicorn. Once the request is authenticated by either token or session, the project or organization associated with the request is determined. The service then checks that the user has a membership or affiliation, and finally that the access level is sufficient for the specific operation. If any of these checks fail, a 403 response is returned.
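The order of these checks can be sketched for the project scope as follows. This is a simplified model, not the Django REST Framework permission classes that Tator uses: the type IncomingRequest, the function check_request, the dictionary of memberships, and the integer access levels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class IncomingRequest:
    user: Optional[str]   # None if neither token nor session identified a user
    project: int          # project associated with the requested resource
    required_level: int   # access level the operation needs


def check_request(
    request: IncomingRequest,
    memberships: Dict[Tuple[str, int], int],
) -> int:
    """Return an HTTP status code after running the checks in order."""
    # 1. The request must be authenticated by token or session.
    if request.user is None:
        return 403
    # 2. A membership must exist for this user and project.
    level = memberships.get((request.user, request.project))
    if level is None:
        return 403
    # 3. The membership level must be sufficient for the operation.
    #    Assumption: levels are integers, and larger means more access.
    if level < request.required_level:
        return 403
    return 200
```

Organization-scoped requests would follow the same shape, with affiliations in place of memberships.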
Object storage is used for all media files in Tator. Both file uploads and downloads can only be accomplished using presigned URLs, which allow for controlling access by users who do not have credentials for accessing a bucket directly. The URLs themselves can be thought of as temporary tokens that allow for a download or upload. Currently the maximum time to expiration for presigned URLs is 24 hours. Presigned URLs are provided using REST requests to the DownloadInfo or UploadInfo endpoints, or by passing the presigned parameter to the Media endpoint. User permissions are checked in these endpoints as they are for other endpoints, as described above. Presigned URLs are generated by the REST service without needing to make a request to the bucket, and they refer directly to the bucket domain. This means that file operations do not need to be routed through the primary web-serving domain; clients can access the bucket directly, which makes external buckets efficient.
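The reason a presigned URL can be issued without contacting the bucket is that it is simply a signature computed from a shared secret. The following Python sketch shows the idea with a deliberately simplified HMAC scheme. It is not the signing algorithm of any real object storage service (for example, it is not AWS Signature Version 4), and the names presign and verify, the query parameters, and the bucket domain are assumptions for illustration. The bucket URL is assumed to be a bare domain with no path.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

MAX_EXPIRATION_SECONDS = 24 * 60 * 60  # the 24-hour cap described above


def _sign(secret: bytes, key: str, expires_at: int) -> str:
    message = f"GET\n{key}\n{expires_at}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def presign(bucket_url: str, key: str, secret: bytes, expires_in: int,
            now: Optional[float] = None) -> str:
    """Create a download URL locally; no request to the bucket is made."""
    if not 0 < expires_in <= MAX_EXPIRATION_SECONDS:
        raise ValueError("expiration must be between 1 second and 24 hours")
    issued = int(time.time() if now is None else now)
    expires_at = issued + expires_in
    query = urlencode({"expires": expires_at,
                       "signature": _sign(secret, key, expires_at)})
    return f"{bucket_url.rstrip('/')}/{key}?{query}"


def verify(url: str, secret: bytes, now: Optional[float] = None) -> bool:
    """What the storage side would do: recompute the signature, check expiry."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        expires_at = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False
    expected = _sign(secret, parts.path.lstrip("/"), expires_at)
    current = time.time() if now is None else now
    return hmac.compare_digest(signature, expected) and current < expires_at
```

Because the signature covers the object key and the expiration time, a client cannot extend the lifetime of a URL or point it at a different object without invalidating it.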