# GitLab@CERN day - Open discussion

## Bug fixes, feature requests and MRs to GitLab: how to make the best of GitLab's development workflow

Background: GitLab admins at CERN regularly receive requests from CERN users for bug fixes, improvements or changes in GitLab. These can be challenging to address:

* issues often already exist for the bug/feature request, but are sometimes stalled even though they may be popular and/or have been open for many months.
* our user community seems to have a higher level of expectation that CERN's GitLab admins can influence the outcome (compared to other commercial software). In part because:
    * GitLab puts its open-source core forward
    * GitLab's issue tracker is public
    * CERN is contributing to GitLab to some extent

In practice, nowadays we can do little more than open a feature proposal or upvote existing ones and wait. It is very hard to give our users an accurate response about the likely outcome, let alone estimate when a definitive answer can be given. While the availability of the roadmap and high-level direction is fantastic, visibility on GitLab's decision process or direction for specific items is limited.

Some examples of difficult cases:

* (_feature request_) ability to request code review without MR: https://gitlab.com/gitlab-org/gitlab-ce/issues/19976 ("popular proposal" for 2 years before reaching "Product Vision 2019" just a few days ago)
* (_bug_) crippling regression for our larger repositories: https://gitlab.com/gitlab-org/gitlab-ee/issues/6780 (affected 10.8, 11.0, 11.1 - for some time no supported version existed without the problem)

We would like to discuss the following:

1. How does GitLab review/prioritize bug reports and feature proposals internally?
1. How could we improve on the feeling of helplessness?
1. What message should we pass to our users regarding handling of their bug reports and feature requests?

## GitLab performance

Background: Performance has been a challenge since the early days of GitLab at CERN. Despite significant efforts in this area, the GitLab UI feels sluggish and GitLab can easily be brought down by a few expensive requests on large projects. This also appears to be true for our smaller and virtually unused playground instance "gitlab-dev" and for gitlab.com.

Recent features like the Cloud Native (CN) chart and gitaly (without shared storage) might help to some extent. We have already started the transition to object storage and expect to eventually switch from our custom container-based deployment to the CN chart.

We also have to deal with the constraints of our on-premise cloud environment. In particular, while gitaly offers new options, moving away from NFS shared storage is not trivial:

* high level of service for NFS filers (snapshots, off-site replica, backups, expertise of the storage team...)
* using local storage + gitaly increases operational effort for GitLab admins, and probably also the risk of downtime

We would like to discuss the following:

1. How large is our instance compared to other GitLab customers?
1. How beneficial were CN charts and no-shared-storage gitaly to gitlab.com or other customers? Worth the effort?
1. Any suggestion or reference deployment for a fast GitLab with our deployment size?

## Global search

Background: Some developer teams see Global Search as an absolute requirement. However, deploying it has proved difficult and required a lot of effort (despite the ESaaS and expertise available in CERN IT). The final result is underperforming, difficult to scale and affected by many bugs.

There are also concerns about long-term supportability, as GitLab uses a single index and thus we can only set the number of shards once and for all at index creation.
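
To illustrate why the shard count is effectively frozen: as we understand it, Elasticsearch picks a document's shard from a hash of its routing key modulo the number of primary shards, so changing that number would send existing documents to different shards. The following is only a minimal sketch of that idea. `zlib.crc32` is a stand-in hash (not the one Elasticsearch uses), and the document ids and shard counts are hypothetical.

```python
import zlib


def shard_for(doc_id: str, num_primary_shards: int) -> int:
    """Pick a shard as hash(routing key) modulo the number of primary shards.

    zlib.crc32 is a stand-in hash chosen for this sketch; it is not the
    hash function Elasticsearch uses internally.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


def fraction_moved(doc_ids, old_shards: int, new_shards: int) -> float:
    """Fraction of documents whose shard would differ under a new shard count."""
    doc_ids = list(doc_ids)
    moved = sum(
        1 for d in doc_ids
        if shard_for(d, old_shards) != shard_for(d, new_shards)
    )
    return moved / len(doc_ids)


if __name__ == "__main__":
    # Hypothetical document ids and shard counts, for illustration only.
    ids = [f"project-{i}" for i in range(10_000)]
    print(f"{fraction_moved(ids, 5, 10):.0%} of documents would change shard")
```

Any document that would land on a different shard can no longer be found where it was written, which is why a change of shard count in practice means building a new index and re-indexing into it.
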
Should the number of shards prove insufficient at some point, there seems to be little we could do to recover: the initial indexing of our instance in the background took 7 weeks, making any change that requires re-indexing highly challenging. We expect no significant improvement until gitlab.com itself deploys Global Search and improves support for large instances; it seems this was an ongoing project at some point but was cancelled.

We would like to discuss the following:

1. What are GitLab's plans to deploy global search on gitlab.com?
1. In practice we're unable to re-index our instance from scratch. Is that taken into account by GitLab when considering significant changes to Global Search? Some planned changes seem likely to require re-indexing (e.g. https://gitlab.com/gitlab-org/gitlab-ee/issues/3217)
1. Any detailed guidance on deploying global search for large instances?

## CI/CD design limits

Background: GitLab CI has been very widely adopted at CERN; we run 5K jobs/day on average nowadays. This has been a game-changing feature that made it really easy to add CI/CD to projects.

We still need to keep operating several dozen Jenkins CI instances next to GitLab CI for various reasons, some of which stem from limits in the GitLab CI design. Examples:

* Run CI in target project for MRs from forks: https://gitlab.com/gitlab-org/gitlab-ce/issues/23902 (especially the ability to test the _merged_ code rather than the branch to merge)
* Protect CI configuration from changes: https://gitlab.com/gitlab-org/gitlab-ce/issues/20826 (another 2-year-old "popular proposal")

In addition, it seems gitlab-runner does not get the same level of attention as GitLab itself regarding contributions. For instance, we gave up deploying shared runners on Kubernetes in part due to missing restrictions on images and services. An MR adding this has been open for 7 months: https://gitlab.com/gitlab-org/gitlab-runner/merge_requests/840.
We're also facing challenges in providing shared runners for privileged builds (Docker-in-Docker), as the standard docker-machine setup with single-use VMs doesn't play well with our private cloud environment. The auto-devops stack, however, requires runners that allow privileged containers.

We would like to discuss the following:

1. What are GitLab's plans regarding the evolution of GitLab CI?
2. How likely is it that GitLab CI closes the feature gap and allows us to retire Jenkins?
3. How do other organizations provide privileged shared runners?

## Licensing and Premium/Ultimate features

Background: We currently have to pay for every GitLab user account. This creates a few complications:

* We're trying to make it easy for CERN users to sign up with our GitLab instance, to make it the one-stop solution for all development activity (no registration, no limit on groups and projects etc). But this also means our user count is high and of the same order of magnitude as our total count of CERN user accounts, while not all users need the same level of functionality.
* Inviting external developers means paying for licenses for them. We currently direct teams requiring collaboration with external developers to external services like gitlab.com or github.com
* Some of our users are interested in Premium features.
* Ultimate features regarding security are potentially interesting to all users, but prohibitively expensive.

Enabling Premium features globally also raises concerns about vendor/feature lock-in. Following experience with other software suppliers, we find it important to keep an exit strategy should GitLab license costs increase beyond reason.

We would like to discuss the following:

1. It seems that supporting multiple user "plans" on on-premise instances (as is the case on gitlab.com) could address some of these concerns, for instance by limiting external users to the "free" plan to avoid paying for them, or by paying for Premium license seats only for those users who need them. How open is GitLab to this?
2. Other suggestions regarding Premium features for a limited set of users? Beyond a separate instance, which goes against a clear "one-stop" approach.

## Challenges regarding the container registry

Background: Container adoption is growing, with increasing use of container orchestration for application deployment. GitLab Registry is the primary solution to store Docker images, with GitLab CI the most common method of building them (using specific shared runners due to the difficulty of offering privileged builds).

The container registry keeps growing without any cleanup ever being done. This is a long-standing issue: https://gitlab.com/gitlab-org/gitlab-ce/issues/25322. There is work in progress in this area, but it seems that deleting the old tag versions ignores the fact that images (typically older tag versions) may still be referenced by SHA rather than by tag name (`myregistry.com/my/image@sha256:12345...def`). This is for instance used extensively by [Openshift's ImageStreams](https://docs.okd.io/latest/architecture/core_concepts/builds_and_image_streams.html#image-streams) so deployments can be rolled back after a tag is updated.

We would like to discuss the following:

1. How is registry pruning handled on gitlab.com?
2. How to prune while still supporting SHA references? (E.g. list of images to exclude from pruning, or keep N tag versions)

## Permission system

Background: Looking at https://gitlab.cern.ch/explore/projects?utf8=%E2%9C%93&name=note&sort=latest_activity_desc&visibility_level=10 (cern login required) we easily find documents on CERN gitlab like https://gitlab.cern.ch/htx/fourtop-int-note/blob/master/INT/fourtops.pdf (cern login required) that are clearly marked as internal to individual experimental collaborations only, but are accessible to all of CERN.

Our best guess so far as to why this happens is that "internal" is too easily misread as group/project internal rather than instance/CERN internal. Improvements to the documentation won't help in this case, as it is not at all obvious to users that they may have misinterpreted the interface, so they won't look at the docs in the first place. (Also, it would raise the entry barrier to using gitlab if something as common as "I need an access restricted repo with access for all of my colleagues and nobody else." required a documentation lookup, especially since the entire rest of the process is "click the obvious button and do what the screen says.")

Is there:

1. Any experience from other gitlab installations of how to tackle this? (Or, more generally, with the fact that gitlab has many configuration mechanisms that - while all justified - make it very easy to set up a repository wrongly, while our old svn setup - one world readable, one collaboration readable - was good enough to cover 99% of the use cases without any need for configuration by the end user.)

## Overleaf integration

Editing of LaTeX documents was listed as a use case of gitlab at CERN, while there seems to be a push towards using the Overleaf cloud service ([link to CHEP poster here](https://indico.cern.ch/event/587955/contributions/2935957/)).
Is there a plan to have Overleaf integration with gitlab like [with github](https://www.overleaf.com/learn/how-to/How_do_I_connect_an_Overleaf_project_with_a_repo_on_GitHub,_GitLab_or_BitBucket%3F?nocdn=true)? Not only with gitlab.com but with any gitlab instance (like the one at CERN).