There are only two code files in the http-parser dependency, so using
globs to find them is excessive. Fix this by explicitly naming both
files.
We currently set up all of our warnings in the global scope, which is
quite suboptimal considering that we're also bundling a set of
third-party dependencies which inherit the same set of warnings.
Fix this by converting our warning-macros to target-scoped functions.
Per RFC 7230 section 6.7, an Upgrade header in a normal response merely
informs the client that the server supports upgrading to other
protocols; the client can then ask for such an upgrade in a later
request. A server requires an upgrade by sending the 426 Upgrade
Required response code, not by the mere presence of the Upgrade
response header.
(closes issue #5573)
Signed-off-by: Sven Strickroth <email@cs-ware.de>
`default_port_for_scheme` returns NULL if the scheme is not one of the
builtin ones. This may cause a segmentation fault if a custom transport
URL happens to contain a port number, and this code path is triggered
(e.g. by setting git_fetch_options->update_fetchhead to 1).
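A minimal sketch of the kind of guard this calls for, not the actual
patch: `example_default_port` and `example_is_default_port` are
hypothetical stand-ins, and the list of builtin schemes is an assumption
made for illustration. The point is only that a NULL lookup result must
be checked before it reaches `strcmp`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the lookup described above: returns the
 * default port for a builtin scheme, or NULL for anything else. */
static const char *example_default_port(const char *scheme)
{
	if (strcmp(scheme, "http") == 0)
		return "80";
	if (strcmp(scheme, "https") == 0)
		return "443";
	if (strcmp(scheme, "git") == 0)
		return "9418";
	if (strcmp(scheme, "ssh") == 0)
		return "22";
	return NULL;
}

/* Returns 1 if `port` is the default port for `scheme`, 0 otherwise.
 * A scheme without a known default never matches, so the NULL result
 * is never handed to strcmp(). */
static int example_is_default_port(const char *scheme, const char *port)
{
	const char *default_port;

	assert(scheme != NULL && port != NULL);

	default_port = example_default_port(scheme);
	if (default_port == NULL)
		return 0;

	return strcmp(default_port, port) == 0;
}
```

With this shape, a custom transport URL that carries a port number is
simply treated as using a non-default port.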
With the recent addition of GitHub Actions to our CI infrastructure, we
now have two jobs which generate documentation: once in GHA, once in
Azure. Naturally, as they both want to update the same branch, they race
against each other and one of the two jobs will fail.
Fix this by removing the documentation job from Azure.
Our processing loop in git_zstream_get_output_chunk does not handle
`Z_BUF_ERROR` appropriately at the end of a compressed window.
From the zlib manual, inflate will return:
> Z_BUF_ERROR if no progress was possible or if there was not enough
> room in the output buffer when Z_FINISH is used. Note that Z_BUF_ERROR
> is not fatal, and inflate() can be called again with more input and
> more output space to continue decompressing.
In our loop, we were waiting until we got the expected size, then
ensuring that we were at `Z_STREAM_END`. We are not guaranteed to be,
since zlib may be in the `Z_BUF_ERROR` state where it has consumed a
full window's worth of data, but it doesn't know that it's really at the
end of the stream. There _could_ be more compressed data, and zlib
cannot _know_ that there is none until we make a subsequent call.
We can change the loop to look for the end of stream instead of our
expected size. This allows us to call inflate one last time when we are
at the end of a window (and in the `Z_BUF_ERROR` state), allowing it to
recognize the end of the stream, and move from the `Z_BUF_ERROR` state
to the `Z_STREAM_END` state.
If we do this, we need another exit condition: when `bytes == 0`, then
no progress could be made and we should stop trying to inflate. This
will be an error case, caught by the size and/or end-of-stream test.
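The loop shape described above can be sketched with a toy decoder. This
is not zlib and not the libgit2 code: `toy_step`, `toy_inflate_all` and
the `TOY_*` status codes are hypothetical names, and the decoder is
deliberately simplified so that it only reports the end of the stream on
a call made after all payload has been consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_status { TOY_OK, TOY_BUF_ERROR, TOY_STREAM_END };

struct toy_stream {
	const char *data;  /* payload not yet handed out */
	size_t remaining;  /* number of payload bytes left */
};

/* Toy decoder: copies up to *out_len bytes and stores the number of
 * bytes written back into *out_len. The end of the stream is only
 * recognized on a call that starts with no payload left. */
static enum toy_status toy_step(struct toy_stream *s, char *out, size_t *out_len)
{
	size_t n;

	if (s->remaining == 0) {
		*out_len = 0;
		return TOY_STREAM_END;
	}

	n = s->remaining < *out_len ? s->remaining : *out_len;
	memcpy(out, s->data, n);
	s->data += n;
	s->remaining -= n;
	*out_len = n;

	/* No output space was offered, so no progress was possible. */
	return n > 0 ? TOY_OK : TOY_BUF_ERROR;
}

/* Loops until end of stream rather than until `expected` bytes were
 * read. Returns 0 on success, -1 on a size or end-of-stream mismatch. */
static int toy_inflate_all(struct toy_stream *s, char *out, size_t expected)
{
	size_t total = 0;
	int eos = 0;

	while (!eos) {
		size_t bytes = expected - total;
		enum toy_status status = toy_step(s, out + total, &bytes);

		total += bytes;

		if (status == TOY_STREAM_END)
			eos = 1;
		else if (bytes == 0)
			break; /* no progress: stop trying */
	}

	return (eos && total == expected) ? 0 : -1;
}
```

When the payload exactly fills the output buffer, the first iteration
reads everything and the second, zero-byte iteration is what lets the
decoder report the end of the stream. Truncated or oversized payloads
fall out of the loop and are rejected by the final check.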
When appending config entries, we currently always first get the
currently existing map entry and then afterwards update the map to
contain the current config value. In the common scenario where keys
aren't being overridden, this is the best we can do. But in case a key
gets set multiple times, then we'll also perform these two map
operations. In extreme cases, hashing the map keys will thus start to
dominate performance.
Let's optimize the pattern by using a separately allocated map entry.
Currently, we always put the current list entry into the map and update
it to get any overridden multivar. As these list entries are also used
to iterate config entries, we cannot update them in-place in the map and
are thus forced to always set the map to contain the new entry. But with
a separately allocated map entry, we can now create one once per config
key and insert it into the map. Whenever appending a new config value
with the same key, we can now just update the map entry in-place instead
of having to replace the map entry completely.
This reduces calls to the hashing function by half and trades the
improved runtime for one more allocation per unique config key. Given
that the refactoring arguably improves code readability by splitting
concerns of the `config_entry_list` type and not having to track it in
two different structures, this alone would already be reason enough to
take the trade.
Given a pathological case of a gitconfig with 100,000 repeated keys and
a section of length 10,000 characters, this reduces runtime by half from
approximately 14 seconds to 7 seconds as expected.
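As a rough illustration of the pattern, assuming a toy map in place of
the real hash map: all `toy_*` names below are hypothetical, the fixed
array stands in for separately allocated map entries, and the `lookups`
counter stands in for calls to the hashing function. Appending a value
for an existing key costs one lookup and an in-place update of the map
entry, with no second map operation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_KEYS 16

/* One config value; values with the same key are chained via `next`. */
struct toy_node {
	const char *value;
	struct toy_node *next;
};

/* Per-key map entry, kept separate from the value nodes. */
struct toy_head {
	const char *key;
	struct toy_node *first;
	struct toy_node *last;
};

struct toy_map {
	struct toy_head heads[TOY_MAX_KEYS];
	size_t count;
	size_t lookups; /* stands in for the number of hashing operations */
};

static struct toy_head *toy_map_get(struct toy_map *map, const char *key)
{
	size_t i;

	map->lookups++;
	for (i = 0; i < map->count; i++)
		if (strcmp(map->heads[i].key, key) == 0)
			return &map->heads[i];
	return NULL;
}

/* Appends `node` under `key`. Returns 0 on success, -1 if the toy map
 * is full. */
static int toy_map_append(struct toy_map *map, const char *key, struct toy_node *node)
{
	struct toy_head *head = toy_map_get(map, key);

	node->next = NULL;

	if (head == NULL) {
		/* First value for this key: create the map entry once. */
		if (map->count == TOY_MAX_KEYS)
			return -1;
		head = &map->heads[map->count++];
		head->key = key;
		head->first = node;
	} else {
		/* Repeated key: update the existing entry in place. */
		head->last->next = node;
	}

	head->last = node;
	return 0;
}
```

The most recent value for a key is then always `head->last`, while the
value nodes themselves remain untouched for iteration.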
In the case where a branch is getting renamed, all HEADs of the main
repository and of its worktrees that point to the old branch need to get
updated to point to the new branch. We already do so and have a test for
this, but the test only verifies that we're able to look up the updated
HEAD, not what it contains.
Let's make the test more specific by verifying the updated HEAD also has
the correct updated symbolic target.
The function `git_repository_head_for_worktree` currently uses
`git_reference__read_head` to directly read a given worktree's HEAD from
the filesystem. This is broken in case the repository uses a different
refdb implementation than the filesystem-based one, so let's instead
open the worktree as a real repository and use `git_reference_lookup`.
This also fixes the case where the worktree's HEAD is not a symref, but
a detached HEAD, which would have resulted in an error previously.
The function `git_repository_foreach_head` is broken, as it directly
interacts with the on-disk representation of the reference database,
thus assuming that no other refdb is used for the given repository. As
this is an internal function only and all users have been replaced,
let's remove this function.
We currently determine whether a branch is checked out via
`git_repository_foreach_head`. As this function reads references
directly from the disk, it breaks our refdb abstraction in case the
repository uses a different reference backend implementation than the
filesystem-based one. So let's use `git_repository_foreach_worktree`
instead -- while it's less efficient, it is at least correct in all
corner cases.
When renaming a reference, we need to iterate over every HEAD and
potentially update it in case it is a symbolic reference pointing to the
previous name of the renamed reference. Most importantly, this doesn't
only include HEADs from the repo we're renaming the reference in, but we
also need to iterate over HEADs from linked worktrees.
In order to update the HEADs, we directly read them from the worktree's
gitdir and thus assume that both repository and worktrees use the
filesystem-based reference backend. But this breaks as soon as one has a
repository with a different refdb, and it violates our own abstractions. So
let's instead update HEAD references via the refdb by first opening each
worktree as a repository and then using the usual functions to read and
update HEADs. This is a lot less efficient than the current code, but
it's not like we can really help this: going via the refdb is mandatory.
Given a Git repository, it's non-trivial to iterate over all worktrees
that are associated with it, including the "main" repository. This
commit adds a new internal function `git_repository_foreach_worktree`
that does this for us.
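A callback-based iteration of this kind could look roughly as follows.
This is only a sketch: the `toy_*` types and signatures are hypothetical
and do not reflect the internal function's real interface, and the
worktrees are modelled as a plain array whose first element plays the
role of the "main" repository.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_worktree {
	const char *name;
	const char *head_target; /* what this worktree's HEAD points to */
};

typedef int (*toy_worktree_cb)(const struct toy_worktree *wt, void *payload);

/* Invokes `cb` for every entry, main repository first. Iteration
 * stops at the first non-zero return value, which is passed back to
 * the caller. Returns 0 if every callback returned 0. */
static int toy_foreach_worktree(const struct toy_worktree *wts, size_t n,
				toy_worktree_cb cb, void *payload)
{
	size_t i;
	int result;

	for (i = 0; i < n; i++)
		if ((result = cb(&wts[i], payload)) != 0)
			return result;

	return 0;
}

/* Example callback: returns 1 if HEAD points at the branch name passed
 * as payload, which ends the iteration early. */
static int toy_head_matches(const struct toy_worktree *wt, void *payload)
{
	const char *branch = payload;

	return strcmp(wt->head_target, branch) == 0;
}
```

A caller asking whether a branch is checked out anywhere would pass
`toy_head_matches` together with the branch name and treat a return
value of 1 as "yes".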
To determine whether another reflog entry needs to be written for HEAD
on a reference update, we need to see whether HEAD directly or
indirectly points to the reference we're updating. The resolve logic is
currently completely unbounded unless an error occurs, which effectively
means that we'd be spinning forever in case we have a symref loop in the
repository refdb.
Let's fix the issue by using `git_refdb_resolve` instead, which is
always bounded.
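To illustrate why a bound matters, here is a toy resolver over an
in-memory reference table. It is not how `git_refdb_resolve` is
implemented: the `toy_*` names, the nesting limit of 5 and the choice to
return NULL for dangling or looping symrefs are all assumptions made for
this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_NESTING 5 /* assumed limit, chosen for illustration */

struct toy_ref {
	const char *name;
	const char *symbolic_target; /* NULL for a direct reference */
};

static const struct toy_ref *toy_lookup(const struct toy_ref *refs, size_t n,
					const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(refs[i].name, name) == 0)
			return &refs[i];
	return NULL;
}

/* Follows symbolic references for at most TOY_MAX_NESTING steps.
 * Returns the direct reference reached, or NULL if the name or one of
 * its targets does not exist, or if the chain is still symbolic after
 * the limit (which includes symref loops). */
static const struct toy_ref *toy_resolve(const struct toy_ref *refs, size_t n,
					 const char *name)
{
	const struct toy_ref *ref = toy_lookup(refs, n, name);
	int depth;

	for (depth = 0; ref && ref->symbolic_target && depth < TOY_MAX_NESTING; depth++)
		ref = toy_lookup(refs, n, ref->symbolic_target);

	if (ref && ref->symbolic_target)
		return NULL; /* gave up: nested too deeply or looping */

	return ref;
}
```

Because the loop counter caps the number of hops, a pair of symrefs
pointing at each other terminates after a fixed number of lookups
instead of spinning forever.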