SCM API

Consumer guide

This document provides the guidelines for using the SCM API.

This document is structured as follows:

  • The first section provides some background into the use cases driving the development of the SCM API.

  • The second section is an overview of all the functionality provided by the SCM API.

  • The subsequent sections consider each of the API extension points in turn and the recommended usage patterns.

Background

The initial core use case for Jenkins is that of monitoring a project in source control and when the project changes, checking out the modified source, building the source and reporting the results of that build to the people who made the changes.

As Jenkins usage has grown, the use cases have also grown.

The initial design of Jenkins was focused on creating Jenkins jobs for each project in source control.

For projects that have a single mainline of development, there would typically be a 1:1 relationship between projects in source control and jobs in Jenkins.

For projects that have multiple mainlines of development, for example where there may be multiple sustaining branches for older releases, there would be multiple Jenkins jobs for each project: one for each branch.

Initially, creating one job for each development branch seems reasonable:

  • The older branches may require older versions of the build toolchain.

  • The job changelog for each branch reflects the changes in that branch.

  • The build break / fix status for each branch is correct.

  • The creation of the branch job is just a copy operation within Jenkins - or a rename and copy if you want to move the build history.

There are some issues however:

  • When a sustaining branch is removed, we need to remember to delete the corresponding sustaining job.

  • If we need to make a configuration change that affects multiple branches, we have to configure the Jenkins jobs in the UI in multiple places.

  • These are potentially heavyweight copy operations: the copy has to be performed within Jenkins itself - often times the Jenkins administrator will have restricted permissions for job creation and thus the creation of the branch job may require internal support tickets, etc.

  • Developers do not get the CI feedback support when working on short-lived feature branches, thus the benefits of CI are only seen when merging the feature branches back to a mainline which increases the risk of a broken mainline build.

  • Inter project job chain dependencies can become incredibly difficult to manage.

    If project A depends on project B then we may have complex triggers between each projects different branch sustaining lines.

  • We you have even more projects, most jobs are doing mostly the same thing and reuse of these patterns and/or implementation of organizational best practices becomes hard. This is especially a concern when those organizational best practices change and you have thousands of jobs that need to be updated.

One of the first attempts to solve this problem was introduced by the Git plugin:

  • Rather than having a job track a single branch, a job could be configured that tracked multiple branches.

The side-effect of this, is that:

  • The job changelog is now meaningless.

    In some cases there is no relationship between the revision of the previous build and the revision of the current build, so asking for a list of changes between them is semi-random.

  • The build break / fix status is now meaningless.

    The previous build may be from a different feature branch and may be broken. Your build is on your feature branch and is not broken.

    Jenkins then notifies you and the developer of the other feature branch that the build is now "Fixed".

  • The trend lines for test results are now meaningless.

  • The trend lines for code coverage are now meaningless.

  • The trend lines for static analysis are now meaningless.

  • Forget about triggering downstream jobs of other projects as a result of a build as there is no guarantee that the upstream job build even warrants the downstream job to be triggered.

In order to allow for some rollback of some of these side-effects, typically one would:

  • Create a job that just builds the main branch.

  • Create one job each for each of the sustaining branches that need to be tracked.

  • Create one job that builds everything else (including for example pull requests).

This leaves only one job with meaningless change logs / build status history / trend charts / etc, but we still have the issue of needing to maintain all the sustaining branches.

The Branch API plugin was developed as a second alternative solution to this problem space. The idea behind the Branch API plugin is to have a special job type which will follow all the branches in a repository and create sub-jobs for each branch. These branch specific jobs can then be managed automatically by the multi-branch project such that they can be removed automatically after the corresponding branch has been removed.

Each branch specific sub-job has:

  • a valid changelog

  • a valid build break / fix status

  • valid trendlines

Other benefits include:

  • Developers can create feature branches and get the CI benefits for that feature branch automatically without needing to worry about requesting creation of corresponding feature branch jobs.

  • The Jenkins administrator does not need to worry about having to clean up old feature branch jobs that are no longer required.

There still remain other issues that the Branch API plugin cannot solve on its own:

  • Each branch specific job needs to be configured.

    If different branches need different job configuration — which may be highly likely if a feature branch is refactoring the build process — we need to be able to get the automatically configured jobs to have the correct configuration.

  • For inter-project dependencies, it can be unclear how to establish branch relationships.

    Should branch feature-23 of project A be using the artifacts from branch master of project B or from branch sustaining-1.x.

Efforts to solve these issues are a responsibility of the plugins that implement multi-branch project types.

  • The Pipeline Multibranch plugin also stores the job configuration in a file (Jenkinsfile) in source control directly in the branch.

    That file can reference lightweight "plugins" in the form of shared groovy libraries. This allows for a Jenkins administrator to split some reusable best practices into a shared library. Changing the best practice in the global shared library can then affect all the branches simultaneously — or if the shared library version was pinned you would need to push the library version update to the required branches.

The SCM API within Jenkins, which is based on hudson.scm.SCM did not provide the required functionality.

In order to allow the missing functionality to be more of more general utility, the decision was taken to put the SCM functionality into a separate API plugin, namely this SCM API plugin.

The initial SCM API extension points were driven by a lot of the use cases of the Branch API plugin, but the design has been tempered to try and allow for other unanticipated use cases.

The primary requirements of the Branch API are:

  • provide a means to enumerate all the "branches" of a "repository".

  • provide a means to identify interesting "branches", by checking whether the branch has specific files or perhaps even checking the contents of specific files within the branch.

  • provide a means to enumerate all the "repositories" of a "source control server".

Overview

The SCM API exposes four main extension points for consumers:

  • The jenkins.scm.api.SCMSource extension point is designed to solve the use case of iterating all the "branches" / "tags" / "change requests" in a "project / repository".

  • The jenkins.scm.api.SCMNavigator extension point is designed to solve the use case of iterating all the "projects / repositories" in a "source control system / server / team / organization".

  • The jenkins.scm.api.SCMFileSystem extension point is designed to solve the use cases of browsing the content of a specific "branches" / "tags" / "change requests" and optionally retrieving specific files.

Consumers will likely be interested in implementing the following extension points:

  • The jenkins.scm.api.SCMEventListener extension point is designed to solve the use case of receiving notification of events from the source control systems.

Consumers will be required to implement the following contract interfaces depending on their use of the extension points of the SCM API:

  • Any hudson.model.Item that owns some jenkins.scm.api.SCMSource instances must implement jenkins.scm.api.SCMSourceOwner.

    Note
    SCMSourceOwner implementation discovery

    In the general case, the hudson.model.Item above will be TopLevelItem instances within the standard Jenkins item hierarchy.

    If your object is not discoverable through Jenkins.getInstance().getAllItems(SCMSourceOwner.class) then you will need to provide an implementation of the SCMSourceOwners.Enumerator extension point that can find your object.

  • Any hudson.model.Item that owns some jenkins.scm.api.SCMNavigator instances must implement jenkins.scm.api.SCMNavigatorOwner.

    Note
    SCMNavigatorOwner implementation discovery

    In the general case, the hudson.model.Item above will be TopLevelItem instances within the standard Jenkins item hierarchy.

    If your object is not discoverable through Jenkins.getInstance().getAllItems(SCMNavigator.class) then you will need to provide an implementation of the SCMNavigatorOwners.Enumerator extension point that can find your object.

Using SCMSource instances

Each SCMSource assumes that it is owned by a SCMSourceOwner object. The owner is responsible for:

  • providing a context from which the SCMSource can resolve any required Credentials via the Credentials plugin.

  • providing any SCMSourceCriteria that would be required by the SCMSource in order to determine if a candidate SCMHead is actually a SCMHead that the owner is interested in.

While it is possible to use a detached SCMSource without an owner, when operated in such a fashion, it is exceedingly likely that any required credentials will be unresolved and thus the usage may fail.

When loading SCMHead or SCMRevision instances from persistence on disk, a consumer is recommended to pass the objects through SCMHeadMigration.readResolveSCMHead(SCMSource,SCMHead) or SCMHeadMigration.readResolveSCMRevision(SCMSource,SCMRevision).

SCMSourceOwner contract

If you implement jenkins.scm.api.SCMSourceOwner your implementation must:

  • Be discoverable through jenkins.scm.api.SCMSourceOwners.

    This is the normal expected situation and will be the case if your implementation is part of the standard Jenkins.getInstance().getAllItems() tree.

    Note

    The contract for hudson.model.Item does not mandate that instances be discoverable through Jenkins.getInstance().getAllItems().

    For example, it is conceivable that a plugin might decide to store hudson.model.User specific hudson.model.Item through a custom hudson.model.ItemGroup attached to the owning user.

    If your plugin has decided to wander off the well established idiom, you will have to pay the cost of providing an implementation of SCMSourceOwners.Enumerator that can discover your jenkins.scm.api.SCMSourceOwner instances in order to ensure that events are delivered correctly.

  • Ensure that SCMSource.setOwner(owner) has been called before any SCMSource instance is returned from either SCMSourceOwner.getSCMSources() or SCMSourceOwner.getSCMSource_id).

    Normally this is achieved by setting the owner on creation, load and reconfiguration. Lazy owner setting immediately before first access of any specific SCMSource is also a valid solution.

  • Trigger a reindex of at least the specified source on receipt of a SCMSourceOwner.onSCMSourceUpdated(source) notification.

  • Call SCMSource.afterSave() on all the SCMSourceOwner.getSCMSources() instances after every save of the SCMSourceOwner.

Your implementation of jenkins.scm.api.SCMSourceOwner should:

  • Assume that SCMSource.fetch(observer, listener) may ignore your SCMSourceOwner.getSCMSourceCriteria — not all source control systems will have the ability to perform either shallow probes (via jenkins.scm.SCMProbe) let alone deep probes (via SCMProbe.getRoot() or jenkins.scm.SCMFileSystem.of(…​)).

  • Persist the results of any discovery of any interesting SCMHead instances resulting from calls to SCMSource.fetch(…​) in some form or other.

    For example the Branch API plugin creates sub-jobs for each SCMHead that it is interested in. The SCMHead instances are persisted with their corresponding sub-jobs and the SCMRevision instances are persisted with the builds that were triggered for each discovered SCMRevision.

  • Have a jenkins.scm.api.SCMEventListener implementation to respond to events.

    The listener will need to update the required SCMSourceOwner instances on the basis of the event and should minimize full indexing, e.g. by using the SCMSource.fetch(…​, event, listener) variants.

  • Persist the results of SCMSource.fetchActions(listener) as part of each index and on receipt of any SCMSourceEvent events of type CREATED or UPDATED.

    Tip

    It may not always be either possible or preferred to attach the actions directly to the owner.

    For example, hudson.model.AbstractProject subclasses do not allow modification of the project’s actions (because AbstractProject.getActions() returns a read-only list and Actionable.addAction(action) tries to add the action to the list returned by getActions()).

    An alternative is to persist the actions through some other mechanism (in the case of the AbstractProject subclass, we could store them in a JobProperty) and then populate at runtime them using a TransientItemActionFactory.

The primary purpose of the SCMSource API is to allow the enumeration of SCMHead instances (and their current corresponding SCMRevision) from the source, so presumably you have implemented SCMSourceOwner because you want to do something with the SCMHead instances.

  • If the thing you are doing with SCMHead instances is creating Actionable objects [1] then you shall call SCMSource.fetchActions(head,listener) whenever either performing a full index or when an event indicates a change for the corresponding SCMHead.

    Any returned Action instances must be persisted with the Actionable object.

    Tip
    In plain english, when you create a job from a SCMHead you shall call SCMSource.fetchActions(head,listener) and if that method returns any actions you must add the actions with the job and save the job.
  • If the thing you are doing with SCMHead instances results in creating Actionable objects associated with specific the SCMRevision for each SCMHead [2] then you shall call SCMSource.fetchActions(head,revision,listener) whenever you are creating your Actionable object for a specific SCMRevision.

    Any returned Action instances must be persisted with the Actionable object.

    Tip
    In plain english, when you trigger a build of a job for a SCMRevision you shall call SCMSource.fetchActions(head,revision,listener) and if that method returns any actions you must add the actions to the build before the build is saved.
  • If you need to partition your SCMHead things, you can use SCMSource.getCategories() to obtain the categorization of SCMHead instances.

  • The idiomatic name [3] for the kind of thing that a SCMSource represents is provided by SCMSource.getPronoun().

    For example, a Git SCMSource might return Repository, an Accurev SCMSource might return Depot, a CVS SCMSource might return Module while a Subversion SCMSource could return Repository or perhaps even something more generic like Project depending on the way in which the Subversion server is being used [4].

Using SCMNavigator instances

If you implement jenkins.scm.api.SCMNavigatorOwner your implementation must:

  • Be discoverable through jenkins.scm.api.SCMNavigatorOwners.

    This is the normal expected situation and will be the case if your implementation is part of the standard Jenkins.getInstance().getAllItems() tree.

    Note

    The contract for hudson.model.Item does not mandate that instances be discoverable through Jenkins.getInstance().getAllItems().

    For example, it is conceivable that a plugin might decide to store hudson.model.User specific hudson.model.Item through a custom hudson.model.ItemGroup attached to the owning user.

    If your plugin has decided to wander off the well established idiom, you will have to pay the cost of providing an implementation of SCMNavigatorOwners.Enumerator that can discover your jenkins.scm.api.SCMNavigatorOwner instances in order to ensure that events are delivered correctly.

  • Call SCMNavigator.afterSave(owner) on all the SCMNavigatorOwner.getSCMNavigators() instances after every save of the SCMNavigatorOwner.

Your implementation of jenkins.scm.api.SCMSourceOwner should:

  • Persist the results of any discovery of any interesting project instances resulting from calls to SCMNavigator.visitSources(…​) in some form or other.

    For example the Branch API plugin creates sub-jobs in an organizational folder for each project that it is interested in. The project name and attributes are persisted with their corresponding sub-jobs.

  • Have a jenkins.scm.api.SCMEventListener implementation to respond to events.

    The listener will need to update the required SCMNavigatorOwner instances on the basis of the event and should minimize full indexing, e.g. by using the SCMNavigator.visitSources(observer, event) variants.

    Note
    SCMSourceEvent and SCMHeadEvent events can transition a project from non-interesting to interesting, so a SCMNavigatorOwner will need to listen out for these events also. For example, an Organization Folder for Workflow jobs would need to see if any SCMHeadEvent for a project name that has not been created yet leads to a verification as to whether the named head actually contains a Jenkinsfile. If a Jenkinsfile were present then the project would need to be created in order to enable the branch specific grandchild job to be created.
  • Persist the results of SCMNavigator.fetchActions(owner, listener) as part of each index and on receipt of any SCMNavigatorEvent events of type UPDATED.

    If two SCMNavigator instances have the same SCMNavigator.getId() then the SCMNavigatorOwner can use a shared set of actions for these instances as they are both navigating the same thing (although with different selection criteria).

    Tip

    It may not always be either possible or preferred to attach the actions directly to the owner.

    For example, hudson.model.AbstractProject subclasses do not allow modification of the project’s actions (because AbstractProject.getActions() returns a read-only list and Actionable.addAction(action) tries to add the action to the list returned by getActions()).

    An alternative is to persist the actions through some other mechanism (in the case of the AbstractProject subclass, we could store them in a JobProperty) and then populate at runtime them using a TransientItemActionFactory.

The primary purpose of the SCMNavigator API is to allow the enumeration of named projects which each have a corresponding SCMSource instances, so presumably you have implemented SCMSourceOwner because you want to do something with the SCMSource instances.

  • Override SCMSourceObserver.isObserving() and SCMSourceObserver.getIncludes() if you are only interested in a subset so that implementations can minimize the amount of work that they do.

  • If your observer is not interested in any specific project, just return NoOpProjectObserver.instance() for those instances.

  • If you need to partition your SCMSource things, you can use SCMNavigator.getCategories() to obtain the categorization of SCMSource instances.

  • The idiomatic name for the kind of thing that a SCMNavigator represents is provided by SCMNavigator.getPronoun().

    For example, a GitHub SCMNavigator might return Organization, an Accurev SCMSource might return Repository, a CVS SCMSource might return Server while a Subversion SCMSource could return Repository or Server depending on the way in which the Subversion server is being used.


1. It is conceptually easier to think of these as Item instances or even Job instances, but perhaps you have some use case that we have not anticipated, so we use the most generic term: Actionable objects. If your use case is even more generic that you do not even need the Actionable contract then this you can ignore SCMSource.fetchActions(head,listener)
2. It is conceptually easier to think of these as Run instances or even Build instances. You, however, may have some completely different use case that requires using the most general base class. If your use case is even more generic that you do not even need the Actionable contract then this you can ignore SCMSource.fetchActions(head,revision,listener)
3. If only all source control systems could agree a consistent set of names of things
4. Do you create one trunk|branches|tags per repository or do you use a single big repository and create many project/(trunk|branches|tags) in that single repository?