Support private github repository (#1690)

* Refactor: Create new crate binstalk-git-repo-api

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Fix CI lint warnings

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Fix `just check`: Rm deleted features from `cargo-hack` check

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Refactor: Extract new mod `error`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Ret artifact url in `has_release_artifact`

So that we can use it to download from private repositories.

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Move `test_graph_ql_error_type` to mod `error`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Fix running `cargo test` in `binstalk-git-repo-api`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Remove unnecessary import in mod `error::test`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Rename mod `request` to `release_artifacts`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Impl draft version of fetching repo info

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Refactor: Move `HasReleaseArtifacts` failure variants into `GhApiError`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Refactor: Use `GhRepo` in `GhRelease`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Fix testing

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Refactor: Return `'static` future

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Refactor: Make sure `'static` Future is returned

To make it easier to create generic functions

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Add logging to unit testing

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Fix unit testing

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Refactor: Extract new fn `GhApiClient::do_fetch`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Rm unused `percent_encode_http_url_path`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Fix `cargo test` run on CI

`cargo test` runs all tests in one process.

As such, `set_global_default` would fail on the second call.

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Optimize `GhApiClient::do_fetch`: Avoid unnecessary restful API call

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Refactor: Rm param `auth_token` for restful API fn

which is always set to `None`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Impl new API `GhApiClient::get_repo_info`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Fix unit test for `GhApiClient::get_repo_info`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Refactor testing: Parameter-ize testing

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Parallelise `test_get_repo_info`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Refactor: Create parameter-ised `test_has_release_artifact`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Parallelize `test_has_release_artifact`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Refactor: `gh_api_client::test::create_client` shall not be `async`

as there is no `.await` in it.

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Return `Url` in `GhApiClient::has_release_artifact`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Impl new API `GhApiClient::download_artifact`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Remove unused deps added to binstalk-git-repo-api

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Fix clippy lints

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Add new API `GhApiClient::remote_client`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Add `GhApiClient::has_gh_token`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Add `GhRepo::try_extract_from_url`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Rename `ReleaseArtifactUrl` to `GhReleaseArtifactUrl`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Add new fn `Download::with_data_verifier`

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* feature: Support private repository

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Fix clippy lints

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Add e2e-test/private-github-repo

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Fix clippy lints

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Fix `launch_baseline_find_tasks`: Retry on rate limit

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Fix test failure: Retry on rate limit

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Temporarily enable debug output for e2e-test-private-github-repo

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Fix `get_repo_info`: Retry on rate limit

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Improve `debug!` logging

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Add more debug logging

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Add more debugging

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Add more debug logging

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Apply suggestions from code review

* Fix compilation

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Fix cargo fmt

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>

* Add crate binstalk-git-repo-api to release-pr.yml

* Update crates/binstalk-git-repo-api/Cargo.toml

* Apply suggestions from code review

* Update crates/binstalk/Cargo.toml

---------

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
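
The `cargo test` fix in the message above comes down to one-time global setup being attempted once per test inside a shared process. A minimal sketch of guarding such setup with `std::sync::Once` follows; `init_logging` and `INIT_COUNT` are hypothetical names, and the counter only stands in for installing a global subscriber. It is not necessarily how this commit solved the problem.

```rust
use std::sync::{
    atomic::{AtomicUsize, Ordering},
    Once,
};

// Guards the one-time setup across every test in the process.
static INIT: Once = Once::new();
// Hypothetical counter standing in for the real side effect
// (e.g. installing a global logging subscriber).
static INIT_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical test helper: safe to call at the start of every test.
fn init_logging() {
    INIT.call_once(|| {
        // Runs at most once, no matter how many tests call `init_logging`.
        INIT_COUNT.fetch_add(1, Ordering::SeqCst);
    });
}
```

Each test can call `init_logging()` unconditionally; only the first call performs the setup, so a second call has nothing left to fail on.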
Jiahao XU 2024-06-10 16:02:12 +10:00 committed by GitHub
parent 48ee0b0e3e
commit 1dbd2460a3
GPG key ID: B5690EEEBB952194
30 changed files with 1838 additions and 1127 deletions


@@ -0,0 +1,143 @@
use std::{fmt::Debug, future::Future, sync::OnceLock, time::Duration};

use binstalk_downloader::remote::{self, Response, Url};
use compact_str::CompactString;
use percent_encoding::percent_decode_str;
use serde::{de::DeserializeOwned, Deserialize, Serialize};
use serde_json::to_string as to_json_string;
use tracing::debug;

use super::{GhApiError, GhGraphQLErrors};

pub(super) fn percent_decode_http_url_path(input: &str) -> CompactString {
    if input.contains('%') {
        percent_decode_str(input).decode_utf8_lossy().into()
    } else {
        // No '%', no need to decode.
        CompactString::new(input)
    }
}

pub(super) fn check_http_status_and_header(response: &Response) -> Result<(), GhApiError> {
    let headers = response.headers();

    match response.status() {
        remote::StatusCode::FORBIDDEN
            if headers
                .get("x-ratelimit-remaining")
                .map(|val| val == "0")
                .unwrap_or(false) =>
        {
            Err(GhApiError::RateLimit {
                retry_after: headers.get("x-ratelimit-reset").and_then(|value| {
                    let secs = value.to_str().ok()?.parse().ok()?;
                    Some(Duration::from_secs(secs))
                }),
            })
        }

        remote::StatusCode::UNAUTHORIZED => Err(GhApiError::Unauthorized),
        remote::StatusCode::NOT_FOUND => Err(GhApiError::NotFound),

        _ => Ok(()),
    }
}

fn get_api_endpoint() -> &'static Url {
    static API_ENDPOINT: OnceLock<Url> = OnceLock::new();

    API_ENDPOINT.get_or_init(|| {
        Url::parse("https://api.github.com/").expect("Literal provided must be a valid url")
    })
}

pub(super) fn issue_restful_api<T>(
    client: &remote::Client,
    path: &[&str],
) -> impl Future<Output = Result<T, GhApiError>> + Send + Sync + 'static
where
    T: DeserializeOwned,
{
    let mut url = get_api_endpoint().clone();

    url.path_segments_mut()
        .expect("get_api_endpoint() should return a https url")
        .extend(path);

    debug!("Getting restful API: {url}");

    let future = client
        .get(url)
        .header("Accept", "application/vnd.github+json")
        .header("X-GitHub-Api-Version", "2022-11-28")
        .send(false);

    async move {
        let response = future.await?;

        check_http_status_and_header(&response)?;

        Ok(response.json().await?)
    }
}

#[derive(Debug, Deserialize)]
struct GraphQLResponse<T> {
    data: T,
    errors: Option<GhGraphQLErrors>,
}

#[derive(Serialize)]
struct GraphQLQuery {
    query: String,
}

fn get_graphql_endpoint() -> Url {
    let mut graphql_endpoint = get_api_endpoint().clone();

    graphql_endpoint
        .path_segments_mut()
        .expect("get_api_endpoint() should return a https url")
        .push("graphql");

    graphql_endpoint
}

pub(super) fn issue_graphql_query<T>(
    client: &remote::Client,
    query: String,
    auth_token: &str,
) -> impl Future<Output = Result<T, GhApiError>> + Send + Sync + 'static
where
    T: DeserializeOwned + Debug,
{
    let res = to_json_string(&GraphQLQuery { query })
        .map_err(remote::Error::from)
        .map(|graphql_query| {
            let graphql_endpoint = get_graphql_endpoint();

            debug!("Sending graphql query to {graphql_endpoint}: '{graphql_query}'");

            let request_builder = client
                .post(graphql_endpoint, graphql_query)
                .header("Accept", "application/vnd.github+json")
                .bearer_auth(&auth_token);

            request_builder.send(false)
        });

    async move {
        let response = res?.await?;
        check_http_status_and_header(&response)?;

        let mut response: GraphQLResponse<T> = response.json().await?;

        debug!("response = {response:?}");

        if let Some(error) = response.errors.take() {
            Err(error.into())
        } else {
            Ok(response.data)
        }
    }
}
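
`check_http_status_and_header` above passes the parsed `x-ratelimit-reset` value straight to `Duration::from_secs`. GitHub documents that header as the time the limit resets, in UTC epoch seconds, so a caller that wants a relative delay may need to subtract the current time. A minimal sketch of that conversion, using only the standard library; both helper names are hypothetical and not part of this crate:

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Hypothetical helper: turn an `x-ratelimit-reset` value (UTC epoch seconds)
/// into a relative wait, given the current time as epoch seconds.
fn wait_until_reset(reset_epoch_secs: u64, now_epoch_secs: u64) -> Duration {
    // `saturating_sub` yields zero when the reset time has already passed.
    Duration::from_secs(reset_epoch_secs.saturating_sub(now_epoch_secs))
}

/// Hypothetical helper: parse the raw header text and compare it with the
/// system clock. Returns `None` if the value is not a non-negative integer.
fn retry_after_from_header(value: &str) -> Option<Duration> {
    let reset: u64 = value.trim().parse().ok()?;
    let now = SystemTime::now().duration_since(UNIX_EPOCH).ok()?.as_secs();
    Some(wait_until_reset(reset, now))
}
```

Splitting the pure arithmetic from the clock read keeps the conversion testable without depending on the current time.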


@@ -0,0 +1,203 @@
use std::{error, fmt, io, time::Duration};

use binstalk_downloader::remote;
use compact_str::{CompactString, ToCompactString};
use serde::{de::Deserializer, Deserialize};
use thiserror::Error as ThisError;

#[derive(ThisError, Debug)]
#[error("Context: '{context}', err: '{err}'")]
pub struct GhApiContextError {
    context: CompactString,
    #[source]
    err: GhApiError,
}

#[derive(ThisError, Debug)]
#[non_exhaustive]
pub enum GhApiError {
    #[error("IO Error: {0}")]
    Io(#[from] io::Error),

    #[error("Remote Error: {0}")]
    Remote(#[from] remote::Error),

    #[error("Failed to parse url: {0}")]
    InvalidUrl(#[from] url::ParseError),

    /// A wrapped error providing the context the error is about.
    #[error(transparent)]
    Context(Box<GhApiContextError>),

    #[error("Remote failed to process GraphQL query: {0}")]
    GraphQLErrors(GhGraphQLErrors),

    #[error("Hit rate-limit, retry after {retry_after:?}")]
    RateLimit { retry_after: Option<Duration> },

    #[error("Corresponding resource is not found")]
    NotFound,

    #[error("Does not have permission to access the API")]
    Unauthorized,
}

impl GhApiError {
    /// Attach context to [`GhApiError`]
    pub fn context(self, context: impl fmt::Display) -> Self {
        use GhApiError::*;

        if matches!(self, RateLimit { .. } | NotFound | Unauthorized) {
            self
        } else {
            Self::Context(Box::new(GhApiContextError {
                context: context.to_compact_string(),
                err: self,
            }))
        }
    }
}

impl From<GhGraphQLErrors> for GhApiError {
    fn from(e: GhGraphQLErrors) -> Self {
        if e.is_rate_limited() {
            Self::RateLimit { retry_after: None }
        } else if e.is_not_found_error() {
            Self::NotFound
        } else {
            Self::GraphQLErrors(e)
        }
    }
}

#[derive(Debug, Deserialize)]
pub struct GhGraphQLErrors(Box<[GraphQLError]>);

impl GhGraphQLErrors {
    fn is_rate_limited(&self) -> bool {
        self.0
            .iter()
            .any(|error| matches!(error.error_type, GraphQLErrorType::RateLimited))
    }

    fn is_not_found_error(&self) -> bool {
        self.0
            .iter()
            .any(|error| matches!(&error.error_type, GraphQLErrorType::Other(error_type) if *error_type == "NOT_FOUND"))
    }
}

impl error::Error for GhGraphQLErrors {}

impl fmt::Display for GhGraphQLErrors {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // `saturating_sub` avoids an underflow panic if the error list is empty.
        let last_error_index = self.0.len().saturating_sub(1);

        for (i, error) in self.0.iter().enumerate() {
            write!(
                f,
                "type: '{error_type}', msg: '{msg}'",
                error_type = error.error_type,
                msg = error.message,
            )?;

            for location in error.locations.as_deref().into_iter().flatten() {
                write!(
                    f,
                    ", occurred on query line {line} col {col}",
                    line = location.line,
                    col = location.column
                )?;
            }

            for (k, v) in &error.others {
                write!(f, ", {k}: {v}")?;
            }

            if i < last_error_index {
                f.write_str("\n")?;
            }
        }

        Ok(())
    }
}

#[derive(Debug, Deserialize)]
struct GraphQLError {
    message: CompactString,
    locations: Option<Box<[GraphQLLocation]>>,

    #[serde(rename = "type")]
    error_type: GraphQLErrorType,

    #[serde(flatten, with = "tuple_vec_map")]
    others: Vec<(CompactString, serde_json::Value)>,
}

#[derive(Debug)]
pub(super) enum GraphQLErrorType {
    RateLimited,
    Other(CompactString),
}

impl fmt::Display for GraphQLErrorType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            GraphQLErrorType::RateLimited => "RATE_LIMITED",
            GraphQLErrorType::Other(s) => s,
        })
    }
}

impl<'de> Deserialize<'de> for GraphQLErrorType {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        let s = CompactString::deserialize(deserializer)?;
        Ok(match &*s {
            "RATE_LIMITED" => GraphQLErrorType::RateLimited,
            _ => GraphQLErrorType::Other(s),
        })
    }
}

#[derive(Debug, Deserialize)]
struct GraphQLLocation {
    line: u64,
    column: u64,
}

#[cfg(test)]
mod test {
    use super::*;
    use serde::de::value::{BorrowedStrDeserializer, Error};

    macro_rules! assert_matches {
        ($expression:expr, $pattern:pat $(if $guard:expr)? $(,)?) => {
            match $expression {
                $pattern $(if $guard)? => true,
                expr => {
                    panic!(
                        "assertion failed: `{expr:?}` does not match `{}`",
                        stringify!($pattern $(if $guard)?)
                    )
                }
            }
        }
    }

    #[test]
    fn test_graph_ql_error_type() {
        let deserialize = |input: &str| {
            GraphQLErrorType::deserialize(BorrowedStrDeserializer::<'_, Error>::new(input)).unwrap()
        };

        assert_matches!(deserialize("RATE_LIMITED"), GraphQLErrorType::RateLimited);
        assert_matches!(
            deserialize("rATE_LIMITED"),
            GraphQLErrorType::Other(val) if val == CompactString::new("rATE_LIMITED")
        );
    }
}
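
The `From<GhGraphQLErrors>` conversion above applies a fixed precedence: any rate-limit error wins, then any `NOT_FOUND`, and everything else is kept as raw GraphQL errors. The sketch below reproduces that ordering with simplified, hypothetical stand-in types (`ErrorType`, `Outcome`, `classify`) and plain `String`s, so it runs without `serde` or `compact_str`. It illustrates the logic only and is not the crate's API.

```rust
/// Simplified stand-in for `GraphQLErrorType`.
#[derive(Debug, PartialEq)]
enum ErrorType {
    RateLimited,
    Other(String),
}

/// Simplified stand-in for the `GhApiError` variants involved.
#[derive(Debug, PartialEq)]
enum Outcome {
    RateLimit,
    NotFound,
    GraphQLErrors(Vec<ErrorType>),
}

/// Case-sensitive match, like the custom `Deserialize` impl above.
fn parse_error_type(s: &str) -> ErrorType {
    match s {
        "RATE_LIMITED" => ErrorType::RateLimited,
        other => ErrorType::Other(other.to_string()),
    }
}

/// Rate limiting takes precedence over not-found, which takes precedence
/// over reporting the raw error list.
fn classify(types: &[&str]) -> Outcome {
    let parsed: Vec<ErrorType> = types.iter().map(|s| parse_error_type(s)).collect();

    if parsed.iter().any(|t| *t == ErrorType::RateLimited) {
        Outcome::RateLimit
    } else if parsed
        .iter()
        .any(|t| matches!(t, ErrorType::Other(s) if s.as_str() == "NOT_FOUND"))
    {
        Outcome::NotFound
    } else {
        Outcome::GraphQLErrors(parsed)
    }
}
```

Checking the rate limit first means a response that mixes both error kinds is treated as retryable rather than as a missing resource.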


@@ -0,0 +1,187 @@
use std::{
    borrow::Borrow,
    collections::HashSet,
    fmt,
    future::Future,
    hash::{Hash, Hasher},
};

use binstalk_downloader::remote::{self};
use compact_str::{CompactString, ToCompactString};
use serde::Deserialize;
use url::Url;

use super::{
    common::{issue_graphql_query, issue_restful_api},
    GhApiError, GhRelease, GhRepo,
};

// Only include the fields we care about
#[derive(Eq, Deserialize, Debug)]
struct Artifact {
    name: CompactString,
    url: Url,
}

// Manually implement PartialEq and Hash so that an `Artifact` always hashes
// the same as a str holding its name, and compares the same way as
// comparing the names as strings.
impl PartialEq for Artifact {
    fn eq(&self, other: &Self) -> bool {
        self.name.eq(&other.name)
    }
}

impl Hash for Artifact {
    fn hash<H>(&self, state: &mut H)
    where
        H: Hasher,
    {
        let s: &str = self.name.as_str();
        s.hash(state)
    }
}

// Implement Borrow so that we can look up an artifact by name with a plain
// `&str`, e.g. via `HashSet::get::<str>` or `HashSet::contains::<str>`
impl Borrow<str> for Artifact {
    fn borrow(&self) -> &str {
        &self.name
    }
}

#[derive(Debug, Default, Deserialize)]
pub(super) struct Artifacts {
    assets: HashSet<Artifact>,
}

impl Artifacts {
    /// Get the URL for downloading the artifact via the GitHub API
    /// (needed for private repositories).
    pub(super) fn get_artifact_url(&self, artifact_name: &str) -> Option<Url> {
        self.assets
            .get(artifact_name)
            .map(|artifact| artifact.url.clone())
    }
}

pub(super) fn fetch_release_artifacts_restful_api(
    client: &remote::Client,
    GhRelease {
        repo: GhRepo { owner, repo },
        tag,
    }: &GhRelease,
) -> impl Future<Output = Result<Artifacts, GhApiError>> + Send + Sync + 'static {
    issue_restful_api(client, &["repos", owner, repo, "releases", "tags", tag])
}

#[derive(Debug, Deserialize)]
struct GraphQLData {
    repository: Option<GraphQLRepo>,
}

#[derive(Debug, Deserialize)]
struct GraphQLRepo {
    release: Option<GraphQLRelease>,
}

#[derive(Debug, Deserialize)]
struct GraphQLRelease {
    #[serde(rename = "releaseAssets")]
    assets: GraphQLReleaseAssets,
}

#[derive(Debug, Deserialize)]
struct GraphQLReleaseAssets {
    nodes: Vec<Artifact>,
    #[serde(rename = "pageInfo")]
    page_info: GraphQLPageInfo,
}

#[derive(Debug, Deserialize)]
struct GraphQLPageInfo {
    #[serde(rename = "endCursor")]
    end_cursor: Option<CompactString>,
    #[serde(rename = "hasNextPage")]
    has_next_page: bool,
}

enum FilterCondition {
    Init,
    After(CompactString),
}

impl fmt::Display for FilterCondition {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // GitHub imposes a limit of 100 for the value passed to param "first"
            FilterCondition::Init => f.write_str("first:100"),
            FilterCondition::After(end_cursor) => write!(f, r#"first:100,after:"{end_cursor}""#),
        }
    }
}

pub(super) fn fetch_release_artifacts_graphql_api(
    client: &remote::Client,
    GhRelease {
        repo: GhRepo { owner, repo },
        tag,
    }: &GhRelease,
    auth_token: &str,
) -> impl Future<Output = Result<Artifacts, GhApiError>> + Send + Sync + 'static {
    let client = client.clone();
    let auth_token = auth_token.to_compact_string();

    let base_query_prefix = format!(
        r#"
query {{
  repository(owner:"{owner}",name:"{repo}") {{
    release(tagName:"{tag}") {{"#
    );

    let base_query_suffix = r#"
  nodes { name url }
  pageInfo { endCursor hasNextPage }
}}}}"#
        .trim();

    async move {
        let mut artifacts = Artifacts::default();
        let mut cond = FilterCondition::Init;
        let base_query_prefix = base_query_prefix.trim();

        loop {
            let query = format!(
                r#"
{base_query_prefix}
releaseAssets({cond}) {{
{base_query_suffix}"#
            );

            let data: GraphQLData = issue_graphql_query(&client, query, &auth_token).await?;

            let assets = data
                .repository
                .and_then(|repository| repository.release)
                .map(|release| release.assets);

            if let Some(assets) = assets {
                artifacts.assets.extend(assets.nodes);

                match assets.page_info {
                    GraphQLPageInfo {
                        end_cursor: Some(end_cursor),
                        has_next_page: true,
                    } => {
                        cond = FilterCondition::After(end_cursor);
                    }
                    _ => break Ok(artifacts),
                }
            } else {
                break Err(GhApiError::NotFound);
            }
        }
    }
}
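
The loop in `fetch_release_artifacts_graphql_api` above is cursor-based pagination: request up to 100 assets, then repeat with `after:"<endCursor>"` while `hasNextPage` is true. The sketch below isolates that control flow with the network call replaced by a closure. `Cursor`, `Page`, `collect_all` and the `fetch` parameter are hypothetical names, and `String` stands in for `CompactString`.

```rust
use std::fmt;

/// Mirrors the shape of `FilterCondition` above.
enum Cursor {
    Init,
    After(String),
}

impl fmt::Display for Cursor {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Cursor::Init => f.write_str("first:100"),
            Cursor::After(end) => write!(f, r#"first:100,after:"{end}""#),
        }
    }
}

/// One page of results: item names plus the two `pageInfo` fields.
struct Page {
    names: Vec<String>,
    end_cursor: Option<String>,
    has_next_page: bool,
}

/// Collect every page. `fetch` is a hypothetical stand-in for sending the
/// query; it receives the rendered filter string.
fn collect_all(mut fetch: impl FnMut(&str) -> Page) -> Vec<String> {
    let mut all = Vec::new();
    let mut cond = Cursor::Init;

    loop {
        let page = fetch(&cond.to_string());
        all.extend(page.names);

        match (page.end_cursor, page.has_next_page) {
            // Another page exists and we know where it starts: keep going.
            (Some(end), true) => cond = Cursor::After(end),
            // Last page, or no cursor to continue from: stop.
            _ => break all,
        }
    }
}
```

Requiring both a cursor and `has_next_page: true` before continuing means a malformed response cannot cause the first page to be requested forever.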


@@ -0,0 +1,80 @@
use std::future::Future;

use compact_str::CompactString;
use serde::Deserialize;

use super::{
    common::{issue_graphql_query, issue_restful_api},
    remote, GhApiError, GhRepo,
};

#[derive(Clone, Eq, PartialEq, Hash, Debug, Deserialize)]
struct Owner {
    login: CompactString,
}

#[derive(Clone, Eq, PartialEq, Hash, Debug, Deserialize)]
pub struct RepoInfo {
    owner: Owner,
    name: CompactString,
    private: bool,
}

impl RepoInfo {
    #[cfg(test)]
    pub(crate) fn new(GhRepo { owner, repo }: GhRepo, private: bool) -> Self {
        Self {
            owner: Owner { login: owner },
            name: repo,
            private,
        }
    }

    pub fn repo(&self) -> GhRepo {
        GhRepo {
            owner: self.owner.login.clone(),
            repo: self.name.clone(),
        }
    }

    pub fn is_private(&self) -> bool {
        self.private
    }
}

pub(super) fn fetch_repo_info_restful_api(
    client: &remote::Client,
    GhRepo { owner, repo }: &GhRepo,
) -> impl Future<Output = Result<Option<RepoInfo>, GhApiError>> + Send + Sync + 'static {
    issue_restful_api(client, &["repos", owner, repo])
}

#[derive(Debug, Deserialize)]
struct GraphQLData {
    repository: Option<RepoInfo>,
}

pub(super) fn fetch_repo_info_graphql_api(
    client: &remote::Client,
    GhRepo { owner, repo }: &GhRepo,
    auth_token: &str,
) -> impl Future<Output = Result<Option<RepoInfo>, GhApiError>> + Send + Sync + 'static {
    let query = format!(
        r#"
query {{
  repository(owner:"{owner}",name:"{repo}") {{
    owner {{
      login
    }}
    name
    private: isPrivate
  }}
}}"#
    );

    let future = issue_graphql_query(client, query, auth_token);

    async move {
        let data: GraphQLData = future.await?;
        Ok(data.repository)
    }
}
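
The query in `fetch_repo_info_graphql_api` above relies on two details that are easy to miss: `{{` and `}}` are `format!` escapes for literal braces, and `private: isPrivate` is a GraphQL alias, so the response field is named `private` and one `RepoInfo` struct can deserialize both the REST and the GraphQL response. The sketch below rebuilds the same query text in a hypothetical standalone helper, `repo_info_query`, so the rendered string can be inspected. Like the original, it interpolates the values without escaping, which assumes that owner and repository names never contain a double quote.

```rust
/// Hypothetical helper mirroring the `format!` call above.
fn repo_info_query(owner: &str, repo: &str) -> String {
    format!(
        r#"
query {{
  repository(owner:"{owner}",name:"{repo}") {{
    owner {{
      login
    }}
    name
    private: isPrivate
  }}
}}"#
    )
}
```

Calling `repo_info_query("cargo-bins", "cargo-binstall")` yields a query whose braces are single and whose `repository(...)` arguments contain the two names in quotes.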