Three Components for Reviewing a Pull Request

Thomas J. Fan

This talk on Github: thomasjpfan/data-umbrella-2021-reviewing-prs

Three Components for Reviewing a Pull Request


  1. Mechanics
  2. Social Aspects
  3. Technical Aspects


  • PR: Pull request
  • Reviewer: Person reviewing the pull request
  • Contributor: Person who create the pull request
  • Merged: Pull request is included into the main branch

Why code review? ๐Ÿ”Ž ๐Ÿ’ป

  • Correctness
  • Maintainability
  • Knowledge sharing

1. Mechanics

Mechanics of Reviewing a Pull Request on GitHub ๐Ÿ› 


Browsing code in GitHub

  • t: Activate the file finder
  • l: Jump to a line in your code
  • b: Open blame view

GitHub CLI

gh pr checkout 20165


Overview of Mechanics

  • Markdown with some extras
  • Use suggestions when possible
  • Check and Navigate the CI
  • GitHub Keyboard shortcuts for browsing code

2. Social Aspects

  • Collaboration
  • Conversational
  • Contributor In the Loop

Ask, Don’t Tell


This would cause a performance regression. We can cache the results and use it again for sequential calls. What do you think?


Never do this. It is very slow.

Offer Alternative Implementations


This leads to a memory regression. May we try using batched calls to reduce the memory overhead? Something like this:

Insert code snippet


This leads to a memory regression.

Explain why code should be changed


We have similar logic in multiple places in this PR. Can we refactor this into its own function to be more DRY?


Please refactor this into its own function.

Avoid using derogatory terms


We can improve this function by adding more input validation and tests on the following edge cases:

List edge cases


This whole function is stupid.

Be humble

Assume everyone is intelligent and well-meaning


I checked the reference and it uses the following equation:

Insert Equation

I do not think this matches our implementation. What do you think?


I have 20 years of experience and this is clearly wrong.

Ask for clarification


This is part of the code is unclear. What was your reasoning behind using a queue here?


This is confusing.

Communicate which ideas you do not feel strongly about

Communicate which ideas you feel strongly about

Try to leave a positive comment


This is a very clever usage of defaultdict!

Thank contributors for their work


Thank you for the PR @thomasjpfan!

General Social Aspects ๐ŸŽ™

  • Be explicit
  • Do not be sarcastic
  • Keep the discussion around the code
  • Self-directed commentary: “I feel”, and " I think"


  • Inevitable and good: People are engaged
  • Keep it positive
  • When code gets to a certain level of quality, disagreement is more about tradeoffs.
  • Get more opinions: Ping other contributors that are familiar with the code or the subject

Talk synchronously

  • If there are too many “I do not understand” or “alternative solutions”
  • Keep meeting notes and comment on the PR about the discussion

2. Social Aspects

  • Collaboration
  • Conversational
  • Contributor In the Loop

3. Technical aspects

  • Presentation
  • Engineering
  • Testing


3. Technical aspects

Does the description describe why the change is being made?

  • Description matches the contents of the PR
  • Links to information issues related to PR

Familiarize yourself with the context of the issue and why the Pull request exists.

Do we want this new feature? ๐Ÿค”

  • Make sure that the team and the community wants to build the feature
  • Every line of code we add means more code size and more maintenance

Was unrelated code changed? ๐Ÿ› 

  • Only include files that are directly related to the change
  • If you use VSCode:
    "editor.formatOnSave": false,

Is the PR easy to review?

  • Difficulty of reviewing a PR scales non-linearly with size
  • Suggest how to split the PR into smaller PRs

Is the PR easy to review?

Some PRs must be large because they require major design changes


3. Technical aspects

Every change should be intentional

Offer ways to simplify or improve code

Was a core component changed?

  • Needs to be discussed
  • Regressions?

Does the change maintain backwards compatibility?

  • Deprecation?

Could the change be more generally applicable elsewhere in the codebase?

  • Small changes: encourage the author to make the changes
  • Big changes: open another issue to track it

Does the change use idiomatic or existing functionally in the codebase?

There could be helper functions already implemented

Will the change break something planned in the future?

It may directly contradict planned future work

Performance optimizations

Require Benchmarks

Is there documentation?

  • User guide and documentation
  • Audience

Application (Examples)

Understanding (User Guide)

Details (API docs)

Code comments

  • Subtle changes
  • Are comments out of date?

Was there a hack?

  • Needs to be discussed
  • Code comments


3. Technical aspects

Are there tests?

  • Prevents breaking something in the future
  • Checks that the PR is correct
  • Bug fixes require non-regression tests
    • Failed on main branch
    • Does not fail with the PR

When is it okay not have test? ๐Ÿงช

  • Documentation
  • Optimizations
    • Performance
    • Memory
    • Make sure existing test pass

Are the tests comprehensive?

Test the different options or input types of a function

CI test fails what does it mean?

  • Can be unrelated
  • Can be linting error
  • Can be an actual error

3. Technical aspects

  • Presentation
  • Engineering
  • Testing

It is okay to make mistakes

Three Components for Reviewing a Pull Request


  1. Mechanics
  2. Social Aspects
  3. Technical Aspects