[NSWI004] Several notes on Git, GitLab and CI

Mon Nov 9 19:23:55 CET 2020

Hello,

One way to limit the workload on the GitLab machines could be to exclude
jobs that test earlier assignments from the CI pipeline, at least when it
runs on a feature branch (using
https://docs.gitlab.com/ce/ci/yaml/README.html#onlyexcept-basic).

For example, `suite_a02_base` takes quite a long time to execute, not
to mention `suite_a02_fuzzy`, which runs much longer still.
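
A minimal sketch of such a restriction, assuming the two jobs are defined
directly in `.gitlab-ci.yml`. The `script` lines are placeholders, not the
course's actual commands:

```yaml
# Hypothetical fragment of .gitlab-ci.yml.
# Only the job names come from the course pipeline; the scripts are made up.
suite_a02_base:
  script:
    - echo "run the a02 base suite here"    # placeholder command
  only:
    - master                                # skip this job on feature branches

suite_a02_fuzzy:
  script:
    - echo "run the a02 fuzzy suite here"   # placeholder command
  only:
    - master
```

With `only: - master`, a job is created only for pipelines on the `master`
ref, so pushes to feature branches would skip both suites while merges to
master would still run them.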

KV

On Mon, Nov 9, 2020 at 11:20, Ondřej Roztočil <roztocil at outlook.com> wrote:
>
> Hi,
>
> thank you for the interesting discussion. I just wanted to follow up on the comment about VS Code. It is really easy to set it up to work over SSH as seamlessly as on a local computer. You can follow the guide here: https://code.visualstudio.com/docs/remote/ssh. You can also use the official extension for C/C++ to get auto-completion, IntelliSense squiggles, etc. If you are on Windows and want to use your local PC, developing in WSL has also worked perfectly for me (after building the toolchain from source).
>
> Regards,
>
> OR
> ________________________________
> From: NSWI004 <nswi004-bounces at d3s.mff.cuni.cz> on behalf of Vojtech Horky <horky at d3s.mff.cuni.cz>
> Sent: Monday, November 9, 2020 10:32 AM
> To: nswi004 at d3s.mff.cuni.cz <nswi004 at d3s.mff.cuni.cz>
> Subject: Re: [NSWI004] Several notes on Git, GitLab and CI
>
> Hello.
>
> On 09. 11. 20 at 9:51, Lukáš Bastián wrote:
> > Hi,
> >
> > I have some additional questions and remarks regarding the Git
> > workflow that you want us to use:
> >
> > 1. One way to develop is to use so-called feature branches. The
> > problem is that commits add up, and to keep master clean after merging
> > the pull request (or merge request on GitLab) you should squash the
> > commits. But squashing messes with the history if multiple people
> > committed to the feature branch (I don't think it does on GitHub, but I
> > think I saw it happening on GitLab). How should we deal with that?
> > Should we use squash, or does it impact the grading script?
> > (Depending on the answer I might need to contact you separately and
> > solve some issues with points).
>
> Personally, I am not a big fan of commit squashing - you should keep
> even your feature branch in such a state that merging it as is (with
> --no-ff) does not break things. I have no problem with history changes
> in not-yet-merged branches.
>
> We definitely do not enforce that you have to squash/rebase/... in your
> team or even use branches. Use what you like, what you are comfortable with.
>
> However, our activity points scripts are counting only commits that made
> it into master. Hence, if you squash your branch, you will have only one
> commit. I do not think it is possible to count it in any other way (i.e.
> no way to see the commits before the squash as the branch would seem
> dead or would be GCed anyway).
>
> If squashing the commits is a big deal and you want to have it, let's
> see how the points look at the end of the semester. If you are (by
> then) missing some points, we can look at the merge requests and
> recompute them manually.
>
>
> >     It is a good practice to always commit code that compiles. For
> >     master/main branch, the rule is usually to commit code that compiles and
> >     passes the tests.
> >
> > In this case, this means completing the task - which doesn't go hand in
> > hand with the fact that you are trying to get us to collaborate and
> > "work as a team in a real company" (or at least that's what I felt like)
> > - I wouldn't be able to reasonably share code with my colleagues - there
> > are ways to get around this by having let's say a02-master and we all
> > branch from there but that might be problematic again when it comes to
> > what I mentioned above - squash on merge and the change of author when used.
>
> The word "usually" applies here. Note that some of your colleagues are
> seeing Git for the first time in this course, hence I decided to add what
> is the usual rule in long-running projects. I.e. there is a difference
> between a project that is being born, with thousands of lines added,
> and the maintenance of a mature project. Sorry, I should have made
> this point more clear.
>
>
> >     As a matter of fact, if you commit/push every minute to debug your
> >     knowledge of C syntax and/or for every word in one code comment, two
> >     things may happen. We may consider this as gaming of the activity
> >     points. And we might be forced to set up some type of accounting for the
> >     use of CI to ensure fairness (I already received a complaint from the GitLab
> >     administrator about overloading the CI machines).
> >
> > Is there a way to set up the CI so that it doesn't build before opening a
> > PR or that it cancels the previous pipeline if a new commit is made?
> > That is the approach that is used in the company I work for to prevent
> > what we are encountering here. At least on the feature branches - master
> > should always run with each commit/merge given the small team size -
> > there is also the option to make the build periodical if there were
> > changes in the last X amount of time.
>
> The pipelines are configured so that redundant jobs are canceled
> automatically. You can also add the interruptible [1] flag to the CI
> configuration manually.
>
> [1] https://gitlab.mff.cuni.cz/help/ci/yaml/README.md#interruptible
>
> But I do not think it should be necessary. If you work normally and
> simply do not push every commit (if you are in a private feature branch,
> there really is no reason to do so) all should be fine.
>
>
> > Mostly my comments come from a place of
> > a) interest and previous experience with some GitHub workflows - from
> > what I see the unit of work in your opinion should be a commit which in
> > my opinion is not easily accessible given the nature of the task and the
> > need to collaborate - I am used to using a PR as a unit of work which
> > would probably require some changes in the CI pipeline.
>
> I am not sure I follow what you mean by your understanding of PR.
>
> However, I strongly believe that a commit should represent a logical unit
> of change. In this sense: if I add code that detects available memory,
> it is a logical unit even if all tests still fail.
>
> And personally I would add it directly to master as it will be needed by
> my teammates very soon. But I would create a PR later on when
> refactoring the code.
>
>
> > And also b) slight frustration because I am one of the ones that were
> > probably overloading the GitLab machines (although I tried to minimize
> > it once I had a look at the queue, saw what was happening, and was
> > canceling some runs manually) - but the reason for it was that the
> > instructions on how to reproduce the dev environment on a local Linux VM
> > were not enough for me to successfully do it so I develop on a feature
> > branch in the mentioned VM which gives me more freedom when it comes to
> > reasonable IDE etc, push whenever I want to because I will open a PR and
> > squash anyway, and then pull the branch on a Rotunda machine and debug
> > and test there. The CI run results are not really what interests me at
> > that point but they run every time (hence my comments about automatic
> > cancellation on a new commit) and when I included some printing the logs
> > were getting out of hand (which is probably where most of the pressure
> > on the GitLab CI machine comes from - the amount of logs generated).
>
> I am sorry to hear that the instructions were not clear enough. Could
> you perhaps elaborate on this a bit more?
>
> You can also disable CI for a specific commit, but I believe that if you
> are using Git only to synchronize code between two machines so that you
> can code on one and test on the other (if I understand the issue
> correctly), then your setup should be fixed first. Just to make it more
> comfortable for you.
>
> Note that you can also mount the remote disks via SSHFS, and VS Code can
> also work in remote mode somehow. Perhaps your colleagues who are using
> it that way (I see several .vscode directories on lab.d3s) can share
> some links and comments about this.
>
> Hope this explains things a bit more.
>
> Cheers,
> - VH
>
>
> >
> > Looking forward to your reply hoping it will bring more clarity so I can
> > adjust and prevent future problems of this nature.
> >
> > Regards
> > Lukáš Bastián
> >
> >
> > On Mon, Nov 9, 2020 at 7:07 AM Vojtech Horky <horky at d3s.mff.cuni.cz> wrote:
> >
> >     Hello,
> >
> >     just a few notes in no particular order.
> >
> >     It is possible to set up your own Git commit identity (e-mail) in
> >     GitLab instead of the default one.
> >
> >     When you are using Git for the first time on a given machine (e.g.
> >     lab.d3s.mff.cuni.cz or the Rotunda servers), you should set your Git
> >     name and e-mail via the "git config" command (probably with the
> >     --global switch). Note that until you do, Git will warn you during
> >     every commit.
> >
> >     It is a good practice to always commit code that compiles. For
> >     master/main branch, the rule is usually to commit code that compiles
> >     and passes the tests.
> >
> >     GitLab CI is not your development environment. You are supposed to
> >     develop, debug and test on your machine and push to GitLab only
> >     reasonable commits. There is no reason to push every single commit to
> >     GitLab to see whether it compiles.
> >
> >     As a matter of fact, if you commit/push every minute to debug your
> >     knowledge of C syntax and/or for every word in one code comment, two
> >     things may happen. We may consider this as gaming of the activity
> >     points. And we might be forced to set up some type of accounting for
> >     the use of CI to ensure fairness (I already received a complaint
> >     from the GitLab administrator about overloading the CI machines).
> >
> >     Note that it is completely fine and highly recommended to create very
> >     small, focused commits but each commit should represent a
> >     compact/logical/atomic change of your project. Think always about a
> >     reviewer that clicks on your commit - are there only changes related to
> >     the topic and are there all the changes related to the topic?
> >
> >     Note that "git add -p" allows you to split a big change into multiple
> >     commits quite easily.
> >
> >     As a further reading, I would recommend [1] about good commit messages
> >     and perhaps [2] as well as it reiterates some notes about committing in
> >     general.
> >
> >     [1] https://chris.beams.io/posts/git-commit/
> >     [2]
> >     https://koukia.ca/git-some-commit-best-practices-and-how-to-undo-your-recent-commits-d13c9dc3144f
> >
> >     Hope this helps,
> >     - VH
> >     _______________________________________________
> >     NSWI004 mailing list
> >     NSWI004 at d3s.mff.cuni.cz
> >     https://d3s.mff.cuni.cz/mailman/listinfo/nswi004
> >
> >