Skip to main content

Contribution Guidelines

These guidelines instruct how to submit issues and contribute to code or documentation to Archipelo/top project. Another way to contribute includes participating in discussion and answering questions on our Archipelo Slack workspace.

Code contributions

If you have fixed a bug or implemented an enhancement, you can contribute your changes via GitHub's pull requests. This is not restricted to code, on the contrary, fixes and enhancements to documentation and tests alone are also very valuable.

Preconditions for code contribution are having a GitHub account, installing Git and having write access to the project.

It is suggested for each engineer to use a GPG key to sign their commits for higher security. Learn how to set it up.

Branching

Whenever you're going to start a new piece of work, you need to create a new branch from the main one. To do so, please follow these steps:

  • make sure that you are synchronized with the main branch by executing git pull
  • create new branch with git branch ticket_number-username/branch-name or create and checkout to it with git checkout -b ticket_number-username/branch-name

Branch naming

To keep the repository clean, everyone needs to adhere to the following rules of branch naming:

  • branch name must start with the Jira ticket number
  • the Jira ticket number must be followed with a username of its creator
  • the second part is a title of the branch that shortly explains what it contains
  • username and title must be separated with a slash (/)
  • the title must stay short but descriptive
  • split names in title using hyphen (-)

Good examples: add-118-mnojek/add-contribution-guide, add-118-rkuc/fix-maven-repos-likes-count

Bad examples: data-branch, rkuc/bugfix1

Commits

Commit message is a short description of the work done in a single commit. It's important to keep the style of the commit messages consistent across the whole repository, especially when the repository is public or the messages are used for automation (e.g. generation of release notes).

A commit message should reflect the content of the commit.

Commit messages on a branch under the review are mostly used for the development purposes and will not be a part of the main branch history. It's up to the developer how they are going to be named, but it's recommended to use the same rules consistently. The most important one is the commit message related to the merged commit.

Commit Message Format

It's important to differentiate between commits on the main branch, and all other commits done on other branches.

Commits on the main branch

Our main branch history is used to generate release notes and it must comply with the Conventional Commits specification.

It is very important to always make sure, that the PR being merged to the main branch, follows the convention.

Commits will be validated against the convention during PR creation (or any operation around the PR), and those that do not comply will fail the check, making it impossible to merge the PR.

Commits on other branches

Commits on other branches can be created in any way, but it is recommended to follow the Conventional Commits approach here as well, just as a training to learn the rules.

These commits are always going to be squashed into one, so the information from every commit will be lost. Make sure to always give a meaningful message of the squashed commit based on the whole scope of the PR (see rules below).

Conventional Commits specification

Full documentation is available here, but some main rules are described below.

Each commit message contains a header, a body and a footer:

<header>
<BLANK LINE>
<body>
<BLANK LINE>
<footer>

The header is mandatory and must conform to the Commit Message Header format.

The body is optional, but highly recommended. When the body is present it must conform to the Commit Message Body format.

The footer is optional. The Commit Message Footer format describes what the footer is used for and the structure it must have.

Commit Message Header

<type>(<scope>): <short summary>
│ │ │
│ │ └─⫸ Summary in present tense. Not capitalized. No period at the end.
│ │
│ └─⫸ Commit Scope: api|data|handbook|infra|ml|onebox|process|spider|vscode|webapp|*

└─⫸ Commit Type: build|ci|docs|feat|fix|perf|refactor|revert|style|test

The <type> and <summary> fields are mandatory, the (<scope>) field is optional.

Type

The type specifies the character of the changes from the commit.

Must be one of the following:

  • build: Changes that affect the build system or external dependencies
  • ci: Changes to our CI configuration files and scripts (examples: GitHub, docker)
  • docs: Documentation only changes
  • feat: A new feature
  • fix: A bug fix
  • perf: A code change that improves performance
  • refactor: A code change that neither fixes a bug nor adds a feature
  • revert: Any operation that reverts the previous commit followed by the header of the reverted commit
  • style: Linting changes like: indentation, trailing commas, semi-colons
  • test: Adding missing tests or correcting existing tests

Most important types are feat and fix because they are marking the commits that contain production changes affecting our users. These commits will be a part of the generated release notes.

Scope

The scope specifies the area which is affected by the commit.

The following is the list of supported scopes:

  • api
  • data
  • handbook
  • infra
  • ml
  • onebox
  • process
  • spider
  • vscode
  • webapp
  • *(anything else that follows a kebab-case)
Summary

A brief but meaningful description of the change. Here are recommendations for writing your subject:

  • use the imperative, present tense: "change" (not "changed" or "changes")
  • don't capitalize the first letter
  • no "." (dot) at the end

Commit Message Body

Explain the motivation for the change in the commit message body. This commit message should explain why you are making the change. You can include a comparison of the previous behavior with the new behavior in order to illustrate the impact of the change.

If your commit contains any UI-related changes, a video or screenshots should be added to the PR

The footer can contain information about breaking changes and deprecations and is also the place to reference GitHub issues, and other PRs that this commit is related to.

Breaking changes

There are 2 ways of marking the commit containing breaking changes:

  • as an entry in the footer

    If included as a footer, a breaking change MUST consist of the uppercase text BREAKING CHANGE, followed by a colon, space, and description, e.g. BREAKING CHANGE: environment variables now take precedence over config files.

  • using ! character after the scope of the commit message header, right before the :

    If included in the type/scope prefix, breaking changes MUST be indicated by a ! immediately before the :. If ! is used, BREAKING CHANGE: may be omitted from the footer section, but it is recommended to have it, and the commit description shall be used to describe the breaking change.

Examples

Small changes to styling in OneBox codebase:

style(onebox): run prettier on the codebase and add unify string quotes

Changes and improvements in documentation:

docs(handbook): update dark theme

Changes:
- Added images variants that look good on dark background
- Updated colors of the fonts in CSS styles
- Set dark theme as default

A feature to VS Code Extension:

feat(vscode): display commit digest summary after committing changes

New view in the extension that:
- adds new view in the extension
- displays the commit digest summary in new view
- creates a link to the commit digest that leads to WebApp

Ref: #12345

A bugfix in infrastructure:

fix(infra): fix terraform resource names

We had a bug in the TF code that resulted in some resources
not being properly configured in file `some_file.tf` at line 123.
Typos were fixed and verified.

A feature with breaking changes to API:

feat(api)!: add new v3 version of AR API

This introduces new V3 version of our internal API that is
backward-incompatible with V2 version. New API is available
to try on our Swagger page.

Link to Swagger page: https://swagger.archipelo.com/v3

New E2E test definitions for WebApp:

test(webapp): add 3 new E2E test definitions to commit digest page

Added tests: A3, B4, C1

Ref: #54321, #65432

Upgrade of external dependencies:

build(jsapps): bump version of external npm dependencies

Pre-commit checks

Before a commit is created, but after it is triggered from command line, pre-commit checks are being run. Their role is to make sure that the code is ready being committed, i.e. it meets code quality standards. It runs only for the parts of the code that were modified and rejects the commit if any check fails.

To install pre-commit tool, you need to have Python installed and run:

pip install pre-commit

After that, install the pre-commit scripts for the repository by running:

pre-commit install

Now, all commits that you initiate will trigger pre-commit checks and test your code against the rules defined in .pre-commit-config.yaml file.

Note See the full documentation for pre-commit tool here.

Troubleshooting

Common problems that you can hit when setting up pre-commits.

SSL Certificate Error

While setting up a new machine you may hit an issue with SSL certificate which will result in an error during installation similar to the following one:

An unexpected error has occurred: CalledProcessError: command: ('/Library/Frameworks/Python.framework/Versions/3.10/bin/python3.10', '-mnodeenv', '--prebuilt', '--clean-src', '/Users/gro/.cache/pre-commit/repoeefpnd11/node_env-default')
return code: 1
stdout: (none)
stderr:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1348, in do_open
h.request(req.get_method(), req.selector, req.data, headers,
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1283, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1329, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1278, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1038, in _send_output
self.send(msg)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 976, in send
self.connect()
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1455, in connect
self.sock = self._context.wrap_socket(self.sock,
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 513, in wrap_socket
return self.sslsocket_class._create(
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1071, in _create
self.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1342, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)

In order to mitigate it, go to your Python installation directory, for example to /Applications/Python 3.10/ and run the Install Certificates command.

Pull requests

GitHub pull requests (PRs) are the main mechanism to contribute code. When you want your work to be submitted to the main branch, you need to push your changes to remote and create a pull request.

If you go to Pull requests tab in GitHub project after you pushed your changes, you should see a notification about your branch waiting to be created as pull request. Otherwise, you can click on "New pull request" button.

PR Title

A PR title must comply with the commit message header rules. PRs with invalid titles will be marked with "Invalid PR title" label.

Content

We are using a PR template to align with the Conventional Commits specification.

It's recommended to create PRs that have one purpose, otherwise it will be very difficult to define the type and scope for the changes.

The PR title is the commit message header, while the PR description serves as a commit body and commit footer.

Consider adding a description with a list of main changes introduced by your pull request. This will help the reviewers quickly assess the content of the PR without the need of reading individual commits.

Feel free to also add labels to allow for simple and fast search of PRs related to a specific area.

Avoid creating big PRs - they unnecessarily hinder the code review process and make the PR more liable to errors.

If the work done in a pull request is connected to specific task or epic, please connect it to Jira by using the correct prefix for the branch name. Try to always create the pull requests that are linked to some issue, to avoid submitting work that doesn't bring us closer towards the goals.

Do not use GitHub closing keywords as they change the linked issue state automatically to Done, when the PR is merged. This breaks our Task Management flow.

Code review

Each pull request needs to be approved by at least 1 reviewer upon merging.

If you're a reviewer:

  • try to provide a review as soon as it was requested from you,
  • focus on the code being changed, but also think about the context,
  • use "Request changes" feature when you feel some change is required,
  • notify the author if UI changes were not proven by related screenshots or a video,
  • comment when you have concerns or questions without a specific solution,
  • check your previous comments and resolve them to unblock the PR from being merged,
  • be polite - review the code, not the coder.

If your work is being reviewed:

  • address all incoming comments and questions
  • preferably, let the comment's author resolve the conversation (he's the one who had concerns),
  • don't hesitate to update code if changes are requested,
  • ping people when the PR is not being reviewed for a long time,
  • check status of automatic checks and resolve or raise issues when a problem was found.

Merging

Currently, the only possible strategy of merging a pull request is "Squash and merge". That means, that all pull request's commits will be squashed into a single commit and merged to the base (usually main) branch. This strategy helps to keep a clean history on the main branch. No matter how many commits you create on your working branch, the merged work will be reflected by one commit.

By default, the squashed merge commit will use the PR title as its header, and the PR description as its body and footer.

Auto-merge can be enabled to automatically merge the changes from the reviewed pull request into the base branch. When it is enabled, your branch will be automatically updated by the Auto Update job that is configured in /.github/workflows/autoupdate.yml. This job will be triggered every time there is a push to the main branch. During its run, it checks if there are PRs with enabled auto-merge and updates them with the latest code from main.

Merge is enabled only if:

  • all conversations were resolved,
  • there is at least 1 approval from the reviewers,
  • (if any changes were requested) a re-review from the person requesting changes is done,
  • all required checks executed with a success state,
  • the branch is up-to-date with the base branch,
  • the PR title comply with convention.

Code checks

Code checks are being executed using GitHub Actions, and they are defined in .github directory. Depending on the content of the PR, there will be some automatic build checks triggered when the PR is created. Their purpose is to verify if the proposed changes are:

  • compliant with our coding guidelines,
  • working correctly without errors,
  • do not decrease the code coverage,
  • passing against related test suites.

Some checks are marked as required, and they will block the PR from merging if they fail.

Security

We have a Dependabot feature enabled for the repository which scans our codebase for possible vulnerabilities. List of found security alerts can be found here (not everyone has access to it - for more info reach out to @dom). Also, Dependabot occasionally creates PRs with updated versions of dependencies to mitigate the risk of using outdated packages.

Code coverage

Currently, we calculate and measure code coverage for JavaScript, Golang and Python subprojects. We use Codecov tool integrated with our GitHub Archipelo/top repository for calculating, measuring and storing code coverage. Codecov also creates nice reports accessible here.

Current configuration for codecov is stored in codecov.yml file. We have a 1% threshold flags which means that no more than 1% drop in coverage is allowed for 1 particular Pull Request. Codecov report with coverage difference is also shown under each Pull Request and an inline warnings are applied for lines that are added on a commit without coverage as annotations (they can be easily hidden by pressing A character on a keyboard).

Backend convention

Pagination

When implementing pagination for our backend API, we follow these conventions.

  • Pagination argument in requests are passed as query parameters and called page and perPage
  • In the response header, we set the total of items present in the requests in a header called x-archipelo-paginate-total

The rpc package contains all the utilities to makes pagination easy. Here is an example of usage:

// define your request parameters and embed the PaginationQuery struct.
// It will define the page and perPage query argument
type ReqParam struct {
rpc.PaginationQuery
// ... other field
}

// in your HTTP handler
func myHandler(ctx iris.Context) {
q := ReqParam{}
if err := ctx.ReadQuery(&q); err != nil {
ctx.StopWithError(http.StatusBadRequest, fmt.Errorf("failed to decode query parameters: %v", err))
return
}
//... process your request
resp, total, err := service.Process(q)
if err != nil {
ctx.StopWithError(http.StatusInternalServerError, err)
return
}

// write the response
rpc.PaginationResponseHeader(ctx, total)
ctx.StatusCode(http.StatusOK)
apiv2.WriteResponse(ctx, resp)
}

API response format

When sending data from the API to the frontend, we define the response struct in the rpc package On the struct, you define your field and json tags.

  • we use camelCase for the field name
  • We do NOT use the omitempty tags in order to always have a consistent response format from the API.
  • Any slice in your response must never be nil, even if they are empty. So make sure to always send an empty slice in your response.

Example:

type MyReponse struct {
Foo string `json:"foo"`
Bar int `json:bar"`
MultiWord string `json:"multiWord"`
List []string `json:"list"` // this field must never be nil
}