`binstalk-downloader` contained only http(s) code before the git code
was moved into it; now it covers both http and git.
While git does use http, which is why I decided to put it into
`binstalk-downloader`, it is more than just downloading:
it is stateful (a repository can be cached locally and updated),
whereas http is stateless.
Also, `binstalk-downloader`'s codegen time has increased
dramatically, and it creates extra dependencies for
binstalk-fetchers, delaying its compilation.
The git code also doesn't use anything from `binstalk-downloader`
at all, so it makes sense for it to be an independent crate.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
feat: `git::Repository` support cancellation.
So that users can cancel a git operation via a signal, e.g. when the
git operation fails or they no longer want to install.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Fixed #1183
Since the crate tarball could be downloaded from a different set of
servers than where the cargo registry is hosted, verifying the checksum
is necessary to verify its integrity.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
I planned to replace `futures-util` with `futures-lite`, but it turns
out that `reqwest` actually depends on `futures-util`, so there is no
point in removing it and introducing yet another dependency.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Accept `&mut dyn DataVerifier` so that users can pass any callback that
uses `digest::Digest`/`digest::Mac`, `sigstore` or whatever they want.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
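The callback idea can be sketched roughly as follows. This is only an illustration: the `update`/`validate` method names, the `Fnv1aVerifier` type, the `download` function and the use of FNV-1a in place of a real digest are all assumptions made for the example, not the crate's actual API.

```rust
/// Hypothetical shape of the verifier callback: it is fed each chunk as it
/// arrives and asked for a verdict once the whole body has been seen.
trait DataVerifier {
    fn update(&mut self, data: &[u8]);
    fn validate(&mut self) -> bool;
}

const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// 64-bit FNV-1a, used here only as a stand-in for a real digest.
fn fnv1a(mut state: u64, data: &[u8]) -> u64 {
    for &byte in data {
        state ^= u64::from(byte);
        state = state.wrapping_mul(FNV_PRIME);
    }
    state
}

/// Toy verifier that compares a running hash against an expected value.
struct Fnv1aVerifier {
    state: u64,
    expected: u64,
}

impl Fnv1aVerifier {
    fn new(expected: u64) -> Self {
        Self { state: FNV_OFFSET, expected }
    }
}

impl DataVerifier for Fnv1aVerifier {
    fn update(&mut self, data: &[u8]) {
        self.state = fnv1a(self.state, data);
    }

    fn validate(&mut self) -> bool {
        self.state == self.expected
    }
}

/// Stand-in for a download loop: it only knows about `&mut dyn DataVerifier`,
/// so any hashing or signature scheme can be plugged in by the caller.
fn download(chunks: &[&[u8]], verifier: &mut dyn DataVerifier) -> bool {
    for chunk in chunks {
        verifier.update(chunk);
    }
    verifier.validate()
}
```

Because the loop takes a trait object, swapping the toy hash for a digest-based or signature-based verifier would not change the download code itself.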
Replace use of `PhantomData::default()` in `src/download.rs` with
`PhantomData` since it is a unit struct.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
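For illustration, a minimal sketch of the pattern; the `Download<T>` struct and its fields here are hypothetical and not the real definition in `src/download.rs`.

```rust
use std::marker::PhantomData;

/// Hypothetical wrapper that carries a type parameter without storing a `T`.
struct Download<T> {
    url: String,
    _marker: PhantomData<T>,
}

impl<T> Download<T> {
    fn new(url: &str) -> Self {
        Self {
            url: url.to_owned(),
            // `PhantomData` is a unit struct, so its name is already a value;
            // `PhantomData::default()` would construct exactly the same thing.
            _marker: PhantomData,
        }
    }
}
```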
When installing `cargo-expand` v1.0.59, I got an error message:
```
Failed to parse http response body as Json: invalid type: null, expected a string at line
1 column 90
```
This is because `GraphQLPageInfo::end_cursor` can actually be `null`, so
I changed its type to `Option<CompactString>`.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
- ci: Check feat powerset of leon & binstalk-downloader in `ci.yml`
- fix leon feature `cli`: Enable dep `miette` in feature `cli`
- fix binstalk-downloader when default feature is disabled and no other
tls related feature is enabled (breaking change due to the replacement
of `tls::Version` with newtype `TLSVersion`).
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
- Increase `DEFAULT_RETRY_DURATION` to 5 minutes, since GitHub enforces
its rate limit on an hourly basis.
- Refactor `check_for_status` & `fetch_release_artifacts_restful_api`
- Optimize `percent_decode_http_url_path`: Avoid `percent_decode_str`
if there is no `%` in the `input`.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
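The fast path for inputs without `%` can be sketched as below. The function name `percent_decode_path`, the `Cow` return type and the hand-written `decode_slow` helper (standing in for `percent_decode_str`) are assumptions for this example only.

```rust
use std::borrow::Cow;

/// Convert one ASCII hex digit to its value.
fn hex_val(b: u8) -> Option<u8> {
    (b as char).to_digit(16).map(|d| d as u8)
}

/// Hand-written stand-in for a percent decoder (hypothetical helper).
/// Malformed escapes such as a trailing `%` are copied through unchanged.
fn decode_slow(input: &str) -> String {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        let decoded = if bytes[i] == b'%' {
            let hi = bytes.get(i + 1).copied().and_then(hex_val);
            let lo = bytes.get(i + 2).copied().and_then(hex_val);
            match (hi, lo) {
                (Some(hi), Some(lo)) => Some(hi * 16 + lo),
                _ => None,
            }
        } else {
            None
        };
        match decoded {
            Some(byte) => {
                out.push(byte);
                i += 3;
            }
            None => {
                out.push(bytes[i]);
                i += 1;
            }
        }
    }
    String::from_utf8_lossy(&out).into_owned()
}

/// Only run the decoder when the input can actually contain an escape.
fn percent_decode_path(input: &str) -> Cow<'_, str> {
    if input.contains('%') {
        Cow::Owned(decode_slow(input))
    } else {
        // No '%', so there is nothing to decode and nothing to allocate.
        Cow::Borrowed(input)
    }
}
```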
Fixed #838
- Add new key `subcrate` for rendering `pkg-url`
- Add new release paths in GitHub, GitLab & SourceForge using key `subcrate` for auto-detection
- Add subcrate detection for GitHub and GitLab
- Add `debug!` when using gh api token in `GhApiClient::new`
- Add subcrate testing to `e2e-tests/subcrate.sh`
- Bump cargo-release to 0.24.9 in e2e-tests/live.sh
to fix test failure on macOS without libssl installed in `/usr/local/`.
- Optimize GhCrateMeta: Detect subcrate and repo-host in `Data::get_repo_info`
to cache the result and avoid duplicate work. This also makes the code
more ergonomic by removing the need for some `unwrap()` calls, and
more efficient since we don't need to clone the url just to modify it.
- Add instrument to `Data::get_repo_info`
- Fix `shellcheck` err in `e2e-tests/*.sh`
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
- Fix zip extraction code: Ensure dir is rwx and file is readable for curr user
- Add more integration test for `ExtractedFiles`
- Fix `bins::infer_bin_dir_template` introduced in #856
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
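The permission fix amounts to OR-ing a few owner bits into the mode stored in the zip entry. A minimal sketch, assuming a hypothetical helper named `usable_mode`; the real extraction code may structure this differently.

```rust
/// Hypothetical helper: adjust Unix mode bits read from a zip entry so that
/// the current user can always work with the extracted result.
fn usable_mode(mode: u32, is_dir: bool) -> u32 {
    if is_dir {
        // Directories need read, write and execute (traverse) for the owner.
        mode | 0o700
    } else {
        // Files only need to be readable by the owner.
        mode | 0o400
    }
}
```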
Fixed #835
Using `HEAD` for this would often produce false negatives that force the `Client` to fall back to `GET`, which creates a lot of requests even if the url doesn't exist and gets cargo-binstall rate limited by GitHub/GitLab/etc.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
It also uses `max_stable_version` in the json downloaded from https://crates.io/api/v1/crates/$name if possible, which is equivalent to the version shown on https://crates.io/crates/$name .
- Add new feat `json` to `binstalk-downloader`
- Impl new async fn `Response::json`
- use `Response::json` in `GhApiClient` impl
- Mark all err types in binstalk-downloader as `non_exhaustive`
- Ret `remote::Error` in `remote::Certificate::{from_pem, from_der}` instead of `ReqwestError`.
- Refactor `BinstallError`: Merge variant `Unzip`, `Reqwest` & `Http`
into one variant `Download`.
- Manually download and parse json from https://crates.io/api/v1
- Remove unused deps `crates_io_api`
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Fixed #776
- Add new feature gh-api-client to binstalk-downloader
- Impl new type `binstalk_downloader::remote::{RequestBuilder, Response}`
- Impl `binstalk_downloader::gh_api_client::GhApiClient`, exposed if `cfg(feature = "gh-api-client")` and add e2e and unit tests for it
- Use `binstalk_downloader::gh_api_client::GhApiClient` to speed up `cargo-binstall`
- Add new option `--github-token` to supply the token for GitHub restful API, or read from env variable `GITHUB_TOKEN` if not present.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
for specifying root certificates used for https connections.
And remove old environment variables `CARGO_HTTP_CAINFO`, `SSL_CERT_FILE`
and `SSL_CERT_PATH` to avoid accidentally setting them, especially in CI
env.
Also:
- Rm fn `binstalk_downloader::Certificate::from_env`
- Enable feature `env` of dep `clap` in `crates/bin`
- Add new dep `file-format` v0.14.0 to `crates/bin`
- Use `file-format` to determine pem/der file format when loading root certs
- Rm fn `binstalk_downloader::Certificate::open` and enum `binstalk_downloader::OpenCertificateError`
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
which would cause the `StreamReadable` to return eof even if the
underlying stream is still open and has not sent EOF yet.
Fixed #777
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
* Support for custom root cert in `binstalk_downloader::remote::Client`
* Support adding root cert via env `CARGO_HTTP_CAINFO`, `SSL_CERT_{FILE, PATH}`
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Fixed #779 #791
- Retry request on timeout
- Retry for `StatusCode::{REQUEST_TIMEOUT, GATEWAY_TIMEOUT}`
- Add `DEFAULT_RETRY_DURATION_FOR_RATE_LIMIT` for 503/429
If a 503/429 response does not give us a header saying when to retry,
or gives us an invalid one, we default to
`DEFAULT_RETRY_DURATION_FOR_RATE_LIMIT`.
- Fix `Client::get_redirected_final_url`: Retry using `GET` on status code 400..405 + 410
- Rename remote_exists => remote_gettable & support fallback to GET
if HEAD fails due to status code 400..405 + 410.
- Improve `Client::get_stream`: Include url & method in the err of the stream returned
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
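The `HEAD` to `GET` fallback described above can be sketched as follows. The `Method` enum, the function names and the closure-based `send` parameter are hypothetical, and reading "400..405" as inclusive of 405 is an assumption, since the message does not say.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Method {
    Head,
    Get,
}

/// Status codes after which a failed `HEAD` is retried with `GET`.
/// Assumption: the range 400..405 is taken to include 405.
fn should_retry_with_get(status: u16) -> bool {
    matches!(status, 400..=405 | 410)
}

/// Hypothetical check: `send` performs one request and returns its status.
fn remote_gettable(mut send: impl FnMut(Method) -> u16) -> bool {
    let mut status = send(Method::Head);
    if should_retry_with_get(status) {
        // Some servers reject or mishandle `HEAD`, so ask again with `GET`.
        status = send(Method::Get);
    }
    (200..300).contains(&status)
}
```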
`futures-util` has too many dependencies and contains a lot of code,
of which we only use a tiny bit.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Fixed #747
- Add dep compact_str v0.6.1 to binstalk-downloader
- Impl new type `DelayRequest`
- Handle 503/429 with wait duration > `MAX_RETRY_DURATION` by simply taking the min
- Fix `Client::send_request_inner`: Ensure 503/429 get propagated to other requests
even if the current requests reach its maximum retry and decides to
simply return an error.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
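Choosing the wait duration for a 503/429 can be sketched like this. The function name `retry_delay` and its parameters are hypothetical, the durations are passed in rather than taken from the crate's constants, and only the delay-in-seconds form of a retry header is handled here.

```rust
use std::time::Duration;

/// Hypothetical helper: pick how long to wait after a 503/429.
///
/// `retry_after` is the raw header value, if any. A missing or unparsable
/// value falls back to `default`, and the result is capped at `max`.
fn retry_delay(retry_after: Option<&str>, default: Duration, max: Duration) -> Duration {
    let requested = retry_after
        .and_then(|value| value.trim().parse::<u64>().ok())
        .map(Duration::from_secs)
        .unwrap_or(default);

    // A server asking for a very long wait is handled by taking the min.
    requested.min(max)
}
```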
- Refactor: Mv fn `utils::asyncify` into mod `utils`
- Improve err msg for task failure in `utils::asyncify`
- Make sure `asyncify` always returns the same anonymous type
that implements `Future` if the `T` is the same.
- Rewrite `extract_bin` to avoid `block_in_place`
and support cancellation by dropping
- Rm unused dep scopeguard from binstalk-downloader
- Rewrite `extract_tar_based_stream` so that it is cancellable by dropping
- Unbox `extract_future` in `async_extracter::extract_zip`
- Refactor `Download` API: Remove `CancellationFuture` as param
since none of the futures returned by `Download::and_*` call
`block_in_place`, so they can be cancelled by drop instead of using this
cumbersome hack.
- Fix exports from mod `async_tar_visitor`
- Make `signal::{ignore_signals, wait_on_cancellation_signal}` private
- Rm the global variable `CANCELLED` in `wait_on_cancellation_signal`
and rm fn `wait_on_cancellation_signal_inner`
- Optimize `wait_on_cancellation_signal`: Avoid `tokio::select!` on `not(unix)`
- Rm unnecessary `tokio::select!` in `wait_on_cancellation_signal` on unix
Since `unix::wait_on_cancellation_signal_unix` already waits for ctrl + c signal.
- Optimize `extract_bin`: Send `Bytes` to blocking thread for zero-copy
- Optimize `extract_with_blocking_decoder`: Avoid dup monomorphization
- Box fut of `fetch_crate_cratesio` in `PackageInfo::resolve`
- Optimize `extract_zip_entry`: Spawn only one blocking task per fn call
by using an mpsc queue for the data to be written to the `outfile`.
This improves efficiency, as using `tokio::fs::File` is expensive:
on every loop iteration it spawns a new blocking task, which needs one
heap allocation and is then pushed to an mpmc queue, and then waits for
it to be done.
This also fixes a race condition where the unix permission is set before
the whole file is written, which might be exploited by attackers.
- Optimize `extract_zip`: Use one `BytesMut` for entire extraction process
To avoid frequent allocation and deallocation.
- Optimize `extract_zip_entry`: Inc prob of reusing alloc in `BytesMut`
Performs the reserve before sending the buf over mpsc queue to
increase the possibility of reusing the previous allocation.
NOTE: `BytesMut` only reuses the previous allocation if it is the
only one holding a reference to it, which is the case either on the
first allocation or once all the `Bytes` in the mpsc queue have been
consumed, written to the file and dropped.
Since reading from the entry has to wait for external file I/O,
this gives the blocking thread some time to flush the `Bytes`
out.
- Disable unused feature fs of dep tokio
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
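The "one writer task fed by a queue" idea from the `extract_zip_entry` item can be sketched with the standard library alone. This uses `std::thread` and `std::sync::mpsc` as stand-ins for tokio's blocking task and channel, and the function name `write_via_queue` is hypothetical.

```rust
use std::io::Write;
use std::sync::mpsc;
use std::thread;

/// Hypothetical sketch: write all chunks through a single worker thread
/// instead of dispatching one blocking job per chunk.
fn write_via_queue<W>(chunks: Vec<Vec<u8>>, mut out: W) -> std::io::Result<W>
where
    W: Write + Send + 'static,
{
    // Bounded queue so the producer cannot run arbitrarily far ahead.
    let (tx, rx) = mpsc::sync_channel::<Vec<u8>>(4);

    // Exactly one worker per call; it owns the output for its whole life.
    let worker = thread::spawn(move || -> std::io::Result<W> {
        for chunk in rx {
            out.write_all(&chunk)?;
        }
        out.flush()?;
        // A real extractor would set Unix permissions here, only after
        // every byte has been written.
        Ok(out)
    });

    for chunk in chunks {
        if tx.send(chunk).is_err() {
            // The worker stopped early (for example on an I/O error).
            break;
        }
    }
    // Closing the sender ends the worker's loop.
    drop(tx);

    worker.join().expect("writer thread panicked")
}
```

Doing the permission change inside the worker, after the loop, is what keeps it ordered after the last write.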
* Add new dep async_zip v0.0.9 to binstalk-downloader
with features "gzip", "zstd", "xz", "bzip2", "tokio".
* Refactor: Simplify `async_extracter::extract_*` API
* Refactor: Create newtype wrapper of `ZipError`
so that the zip can be upgraded without affecting API of this crate.
* Enable feature fs of dep tokio in binstalk-downloader
* Rewrite `extract_zip` to use `async_zip::read::stream::ZipFileReader`
which avoids writing the zip file to a temporary file and then read it
back into memory.
* Refactor: Impl new fn `await_on_option` and use it
* Optimize `tokio::select!`: Make them biased and check for cancellation first
so that cancellation takes effect ASAP.
* Rm unused dep zip from binstalk-downloader
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
* Add new dep tokio-tar v0.3.0 to binstalk-downloader
* Add new dep tokio-util v0.7.4 with feat io to binstalk-downloader
* Add dep async-trait v0.1.59 to binstalk-downloader
* Add new dep async-compression v0.3.15 to binstalk-downloader
with features "gzip", "zstd", "xz", "bzip2", "tokio".
* Rewrite `Download::and_visit_tar` to use `tokio-tar`
to avoid the cumbersome `block_in_place`.
* Apply temporary workaround: Rm use of let-else in mod visitor
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
as there is no need to randomize the first one to be polled.
For `cancel_on_user_sig_term` and `StreamReadable::fill_buf`, the
cancellation future should always be polled first so that the program
feels responsive to the user.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
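Polling in a fixed order, cancellation first, can be sketched without tokio as a small hand-written combinator. The `CancelFirst` type, the `Outcome` enum and the `poll_once` helper are hypothetical names for this example; they are not how `tokio::select!` is implemented.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Which branch finished.
#[derive(Debug, PartialEq)]
enum Outcome<T> {
    Cancelled,
    Done(T),
}

/// Hypothetical combinator that always polls `cancel` before `work`.
struct CancelFirst<C, W> {
    cancel: C,
    work: W,
}

impl<C, W> Future for CancelFirst<C, W>
where
    C: Future<Output = ()> + Unpin,
    W: Future + Unpin,
{
    type Output = Outcome<W::Output>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // Fixed order: a pending cancellation wins even if `work` is ready.
        if Pin::new(&mut self.cancel).poll(cx).is_ready() {
            return Poll::Ready(Outcome::Cancelled);
        }
        match Pin::new(&mut self.work).poll(cx) {
            Poll::Ready(value) => Poll::Ready(Outcome::Done(value)),
            Poll::Pending => Poll::Pending,
        }
    }
}

/// Waker that does nothing, enough to poll a future once by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll a future a single time and return the result.
fn poll_once<F: Future + Unpin>(mut fut: F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(&mut fut).poll(&mut cx)
}
```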
* Avoid potential panicking in `args::parse` by using `Vec::get` instead of indexing
* Refactor: Simplify `opts::{resolve, install}` API
Many parameters can be shared and put into `opts::Options` instead, and
that would also avoid a few `Arc<Path>`.
* Optimize `get_install_path`: Avoid cloning `install_path`
* Optimize `LazyJobserverClient`: Un`Arc` & remove `Clone` impl
to avoid additional boxing
* Optimize `find_version`: Avoid cloning `semver::Version`
* Optimize `GhCrateMeta::launch_baseline_find_tasks`
return `impl Iterator<Item = impl Future<Output = ...>>`
instead of `impl Iterator<Item = AutoAbortJoinHandle<...>>`
to avoid unnecessary spawning.
Each task spawned has to be boxed and then polled by tokio runtime.
They might also be moved.
While they increase parallelism, spawning these futures does not justify
the costs because:
- Each `Future` only calls `remote_exists`
- Each `remote_exists` call sends requests to the same domain, which is
likely to share the same http2 connection.
Since the conn is shared anyway, spawning does not speed anything up
but merely adds communication overhead.
- Plus the tokio runtime spawning cost
* Optimize `install_crates`: Destruct `Args` before any `.await` point
to reduce size of the future
* Refactor `logging`: Replace param `arg` with `log_level` & `json_output`
to avoid dep on `Args`
* Add dep strum & strum_macros to crates/bin
* Derive `strum_macros::EnumCount` for `Strategy`
* Optimize strategies parsing in `install_crates`
* Fix panic in `install_crates` when `Compile` is not the last strategy specified
* Optimize: Take `Vec<Self>` instead of slice in `CrateName::dedup`
* Refactor: Extract new fn `compute_resolvers`
* Refactor: Extract new fn `compute_paths_and_load_manifests`
* Refactor: Extract new fn `filter_out_installed_crates`
* Reorder `install_crates`: Only run target detection if args are valid
and there are some crates to be installed.
* Optimize `filter_out_installed_crates`: Avoid allocation
by returning an `Iterator`
* Fix user_agent of `remote::Client`: Let user specify it
* Refactor: Replace `UIThread` with `ui::confirm`
which is much simpler.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
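The `Vec::get` change from the first item can be illustrated as below. The function `invoked_via_cargo` and the check it performs are hypothetical; the sketch only shows why `get` cannot panic where indexing could.

```rust
/// Hypothetical check: was the program started as `cargo binstall ...`?
fn invoked_via_cargo(args: &[String]) -> bool {
    // `args[1]` would panic when fewer than two arguments are present;
    // `get(1)` returns `None` in that case instead.
    args.get(1).map(String::as_str) == Some("binstall")
}
```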
* Optimize `Download::and_extract`: Avoid dup monomorphization
* Increase buffer size for binstall_crates_v1 to `4096 * 5`
* Optimize `opts::resolve`: Avoid unnecessary `clone`s
* Fix reserve in `opts::resolve`: Do not over-reserve
* Rename field `opts::Options::resolver` => `resolvers`
* Refactor: Extract new type `resolve::PackageInfo`
- which makes `opts::resolve_inner` easier to understand
- reduces the number of parameters required for `download_extract_and_verify` and
`collect_bin_files`
- reduces the size of the future returned by `opts::resolve_inner` by dropping
`cargo_toml::{Manifest, Package}` as early as possible, since
`Manifest` is 3000 bytes large while `Package` is 600 bytes large.
* Optimize `fetchers::Data`: Use `CompactString` for field name & version
since they are usually small enough to fit in inlined version of
`CompactString`.
* Optimize `gh_crate_meta`: Avoid unnecessary allocation
in `RepositoryHost::get_default_pkg_url_template`.
* Refactor: Use `Itertools::cartesian_product` in `apply_filenames_to_paths`
* Optimize `ops::resolve`: Avoid unnecessary `clone` & reduce future size
by calling `fetcher.target_meta()` to obtain the final metadata after
downloading and extracting the binaries.
* Optimize `ops::resolve`: Avoid unnecessary allocation
in `download_extract_and_verify`: Replace `Itertools::join` with
`Itertools::format` to avoid allocating the string.
* Fix disabling cargo-install fallback
* Simplify `BinFile::from_product`: Take `&str` instead of `&Product`
since we only need `product.name`
* Rename `BinFile::from_product` => `BinFile::new`
* Refactor: Create newtype `ops::resolve::Bin`
so that we don't need to `unwrap()` on `Product::name`
and reduce memory usage.
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
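The newtype idea from the last item can be sketched as follows. The local `Product` struct is a stand-in with an optional name, and the `Bin` fields and `collect_bins` function are hypothetical; the real `ops::resolve::Bin` may hold different data.

```rust
/// Local stand-in for a manifest product whose name is optional.
#[allow(dead_code)]
struct Product {
    name: Option<String>,
    path: Option<String>,
}

/// Hypothetical newtype that keeps only what later stages need, with the
/// name already known to be present.
#[derive(Debug, PartialEq)]
struct Bin {
    name: String,
}

/// Convert once, up front: products without a name are skipped here, so
/// nothing downstream has to call `unwrap()` on an `Option`.
fn collect_bins(products: Vec<Product>) -> Vec<Bin> {
    products
        .into_iter()
        .filter_map(|product| product.name.map(|name| Bin { name }))
        .collect()
}
```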