As OpenSSL loves using uninitialized bytes as another source of entropy,
we need to mark them as defined so that Valgrind won't complain about
use of these bytes. Traditionally, we've been using the macro
`VALGRIND_MAKE_MEM_DEFINED` provided by Valgrind, but starting with
OpenSSL 1.1 the code doesn't compile anymore due to `struct SSL` having
become opaque. As such, we also can't set it as defined anymore, as we
have no way of knowing its size.
Let's change gears instead by just swapping out the allocator functions
of OpenSSL with our own ones. The twist is that instead of calling
`malloc`, we just call `calloc` to have the bytes initialized
automatically. Next to soothing Valgrind, this approach has the benefit
of being completely agnostic of the memory sanitizer and is neatly
contained at a single place.
Note that we shouldn't do this for non-Valgrind builds. As we cannot
set up memory functions for a given SSL context, only, we need to swap
them at a global context. Furthermore, as it's possible to call
`OPENSSL_set_mem_functions` once only, we'd prevent users of libgit2 to
set up their own allocators.
OpenSSL doesn't initialize bytes on purpose in order to generate
additional entropy. Valgrind isn't too happy about that though, causing
it to generate warninings about various issues regarding use of
uninitialized bytes.
We traditionally had some infrastructure to silence these errors in our
OpenSSL stream implementation, where we invoke the Valgrind macro
`VALGRIND_MAKE_MEMDEFINED` in various callbacks that we provide to
OpenSSL. Naturally, we only include these instructions if a preprocessor
define "VALGRIND" is set, and that in turn is only set if passing
"-DVALGRIND" to CMake. We do that in our usual Azure pipelines, but we
in fact forgot to do this in our nightly build. As a result, we get a
slew of warnings for these nightly builds, but not for our normal
builds.
To fix this, we could just add "-DVALGRIND" to our nightly builds. But
starting with commit d827b11b6 (tests: execute leak checker via CTest
directly, 2019-06-28), we do have a secondary variable that directs
whether we want to use memory sanitizers for our builds. As such, every
user wishing to use Valgrind for our tests needs to pass both options
"VALGRIND" and "USE_LEAK_CHECKER", which is cumbersome and error prone,
as can be seen by our own builds.
Instead, let's consolidate this into a single option, removing the old
"-DVALGRIND" one. Instead, let's just add the preprocessor directive if
USE_LEAK_CHECKER equals "valgrind" and remove "-DVALGRIND" from our own
pipelines.
The testcase iterator::workdir::filesystem_gunk sets up quite a lot of
directories, which is why it only runs in case GITTEST_INVASIVE_SPEED is
set in the environment. Because we do not run our default CI with this
variable, we didn't notice commit 852c83ee4 (refs: refuse to delete
HEAD, 2020-01-15) breaking the test as it introduced a new reference to
the "testrepo" repository.
Fix the oversight by increasing the number of expected iterator items.
In order to properly tear down the test environment, we will kill
git-daemon(1) if we've exercised it. As git-daemon(1) is spawned as a
background process, it is still owned by the shell and thus killing it
later on will print a termination message to the shell's stderr, causing
Azure to report it as an error.
Fix this by disowning the background process.
The Docker entrypoint currently creates the libgit2 user with "useradd
--create-home". As we start the Docker container with two volumes
pointing into "/home/libgit2/", the home directory will already exist.
While useradd(1) copes with this just fine, it will print error messages
to stderr which end up as failures in our Azure pipelines.
Fix this by simply removing the "--create-home" parameter.
Without the "--silent" parameter, curl will print a progress meter to
stderr. Azure has the nice feature of interpreting any output to stderr
as errors with a big red warning towards the end of the build. Let's
thus silence curl to not generate any misleading messages.
When building dependencies for our Docker images, we first download the
sources to disk first, unpack them and finally remove the archive again.
This can be sped up by piping the downloading archive into tar(1)
directly to parallelize both tasks. Furthermore, let's silence curl(1)
to not print to status information to stderr, which tends to be
interpreted as errors by Azure Pipelines.
In commit b9c5b15a7 (http: use the new httpclient, 2019-12-22), the HTTP
code got refactored to extract a generic HTTP client that operates
independently of the Git protocol. Part of refactoring was the creation
of a new `git_http_request` struct that encapsulates the generation of
requests. Our Git-specific HTTP transport was converted to use that in
`generate_request`, but during the process we forgot to set up custom
headers for the `git_http_request` and as a result we do not send out
these headers anymore.
Fix the issue by correctly setting up the request's custom headers and
add a test to verify we correctly send them.
Back in commit 5a6740e7f (azure: build Docker images as part of the
pipeline, 2019-08-02), we have converted our pipelines to use self-built
Docker images to ease making changes to our Dockerfiles. The commit
didn't adjust our Coverity pipeline, though, so let's do this now.
In commit bbc0b20bd (azure: fix Coverity's build due to wrong container
name, 2019-08-02), Coverity builds were fixed to use the correct
container names. Unfortunately, the "fix" completely broke our Coverity
builds due to using wrong syntax for the Docker task. Let's fix this by
using "imageName" instead of the Docker dict.
While we already do have logic to re-run flaky tests, the FAILED
variable currently does not get reset to "0". As a result, successful
reruns will still cause the test to be registered as failed.
Fix this by resetting the variable accordingly.
The proxy tests regularly fail in our CI environment. Unfortunately,
this is expected due to the network layer. Thus, let's re-try the proxy
tests up to five times in case they fail.
If fetching from an anonymous remote via its URL, then the URL gets
written into the FETCH_HEAD reference. This is mainly done to give
valuable context to some commands, like for example git-merge(1), which
will put the URL into the generated MERGE_MSG. As a result, what gets
written into FETCH_HEAD may become public in some cases. This is
especially important considering that URLs may contain credentials, e.g.
when cloning 'https://foo:bar@example.com/repo' we persist the complete
URL into FETCH_HEAD and put it without any kind of sanitization into the
MERGE_MSG. This is obviously bad, as your login data has now just leaked
as soon as you do git-push(1).
When writing the URL into FETCH_HEAD, upstream git does strip
credentials first. Let's do the same by trying to parse the remote URL
as a "real" URL, removing any credentials and then re-formatting the
URL. In case this fails, e.g. when it's a file path or not a valid URL,
we just fall back to using the URL as-is without any sanitization. Add
tests to verify our behaviour.
To allow testing against a Kerberos instance, we have added variables
for the Kerberos password to allow authentication against LIBGIT2.ORG in
commit e5fb5fe5a (ci: perform SPNEGO tests, 2019-10-20). To set up the
password, we assign
"GITTEST_NEGOTIATE_PASSWORD=$(GITTEST_NEGOTIATE_PASSWORD)"
in the environmentVariables section which is then passed through to a
template. As the template does build-time expansion of the environment
variables, it will expand the above line verbosely, and due to the
envVar section not doing any further expansion the password variable
will end up with the value "$(GITTEST_NEGOTIATE_PASSWORD)" in the
container's environment.
Fix this fixed by doing expansion of GITTEST_NEGOTIATE_PASSWORD at
build-time, as well.
We avoid abbreviations where possible; rename git_cred to
git_credential.
In addition, we have standardized on a trailing `_t` for enum types,
instead of using "type" in the name. So `git_credtype_t` has become
`git_credential_t` and its members have become `GIT_CREDENTIAL` instead
of `GIT_CREDTYPE`.
Finally, the source and header files have been renamed to `credential`
instead of `cred`.
Keep previous name and values as deprecated, and include the new header
files from the previous ones.
Download poxygit, a debugging git server, and clone from it using NTLM,
both IIS-style (with connection affinity) and Apache-style ("broken",
requiring constant reauthentication).
When we're authenticating with a connection-based authentication scheme
(NTLM, Negotiate), we need to make sure that we're still connected
between the initial GET where we did the authentication and the POST
that we're about to send. Our keep-alive session may have not kept
alive, but more likely, some servers do not authenticate the entire
keep-alive connection and may have "forgotten" that we were
authenticated, namely Apache and nginx.
Send a "probe" packet, that is an HTTP POST request to the upload-pack
or receive-pack endpoint, that consists of an empty git pkt ("0000").
If we're authenticated, we'll get a 200 back. If we're not, we'll get a
401 back, and then we'll resend that probe packet with the first step of
our authentication (asking to start authentication with the given
scheme). We expect _yet another_ 401 back, with the authentication
challenge.
Finally, we will send our authentication response with the actual POST
data. This will allow us to authenticate without draining the POST data
in the initial request that gets us a 401.
Allow users to opt-in to expect/continue handling when sending a POST
and we're authenticated with a "connection-based" authentication
mechanism like NTLM or Negotiate.
If the response is a 100, return to the caller (to allow them to post
their body). If the response is *not* a 100, buffer the response for
the caller.
HTTP expect/continue is generally safe, but some legacy servers
have not implemented it correctly. Require it to be opt-in.
Store the last-seen credential challenges (eg, all the
'WWW-Authenticate' headers in a response message). Given some
credentials, find the best (first) challenge whose mechanism supports
these credentials. (eg, 'Basic' supports username/password credentials,
'Negotiate' supports default credentials).
Set up an authentication context for this mechanism and these
credentials. Continue exchanging challenge/responses until we're
authenticated.