Run git gc periodically on the crates.io index (#778) #956

Closed
wants to merge 143 commits into from
Changes from 19 commits
Commits
143 commits
69e51b4
Run git gc periodically on the crates.io index #778
Aug 11, 2020
9cb547f
Update config.rs
ohaddahan Aug 11, 2020
3d0ec4e
Fixed "cargo fmt"
Aug 11, 2020
ce5bbcf
Update src/utils/daemon.rs
ohaddahan Aug 11, 2020
94f8ac3
Update src/utils/daemon.rs
ohaddahan Aug 11, 2020
6335dd5
Update src/utils/daemon.rs
ohaddahan Aug 11, 2020
d3e2c80
Update src/config.rs
ohaddahan Aug 11, 2020
8d06f51
Update .gitignore
ohaddahan Aug 11, 2020
e323a2f
Added --auto
Aug 12, 2020
6d66f9e
--auto and fmt
Aug 12, 2020
29b1113
Fix cargo fmt
Aug 12, 2020
014338c
Adding pre-commit hook, adding dead-code lint since without it clippy…
Aug 12, 2020
45892ff
Changed envar prefix and replace echo with printf for pre-commit hook
Aug 12, 2020
525c6f8
Merge pull request #1 from ohaddahan/pre-commit-hook
ohaddahan Aug 12, 2020
1e4d83e
Removed allow(dead_code) and not(windows) , now OPEN_FILE_DESCRIPTORS…
Aug 12, 2020
39e5b26
Updated pre-commit script and instructions
Aug 13, 2020
3be80f5
Added newline at end of pre-commit script
Aug 13, 2020
1ba6fd5
Move run_git_gc to Index
Aug 13, 2020
7f09aea
Merge pull request #2 from ohaddahan/pre-commit-hook
ohaddahan Aug 13, 2020
f91a2fe
Fix typo
Aug 13, 2020
68332f7
Merge pull request #3 from ohaddahan/pre-commit-hook
ohaddahan Aug 13, 2020
b5ec00c
Rebased
Aug 13, 2020
b1d4a38
Update .gitignore
ohaddahan Aug 11, 2020
e6f1734
Rebased
Aug 13, 2020
7e74a68
Changed envar prefix and replace echo with printf for pre-commit hook
Aug 12, 2020
1fa9a7f
Rebased
Aug 13, 2020
98a0a1b
Added newline at end of pre-commit script
Aug 13, 2020
f840c5f
Update `cargo run` commands
Nemo157 Aug 8, 2020
14f8730
Add doc coverage information
GuillaumeGomez Aug 5, 2020
87d6fe4
Clean code a bit
GuillaumeGomez Aug 5, 2020
5c45d4b
Improve rendering
GuillaumeGomez Aug 6, 2020
7fe6225
Put doc coverage data into a new DocCoverage struct
GuillaumeGomez Aug 6, 2020
f914f1b
Move common code into one macro to make the maintenance easier
GuillaumeGomez Aug 6, 2020
b5ab712
Improve templates for doc coverage
GuillaumeGomez Aug 6, 2020
12c8828
Add explanations about the "config_command" macro and append "rustdoc…
GuillaumeGomez Aug 6, 2020
200e340
Add test to ensure that doc coverage is present
GuillaumeGomez Aug 7, 2020
35584bb
bump rustwide to 0.10.0
pietroalbini Aug 8, 2020
2d23d4c
remove the macro and improve the coverage extraction code
pietroalbini Aug 7, 2020
d250251
context: create the Context trait
pietroalbini Aug 8, 2020
154cb1e
web: use Context to initialize the web server
pietroalbini Aug 8, 2020
793bd58
Rebased
Aug 13, 2020
9436391
bump prometheus to 0.9.0
pietroalbini Aug 9, 2020
c9b7fe1
Rebased
Aug 13, 2020
f23e41c
storage: ensure uploaded_files_total metric is recorded
pietroalbini Aug 9, 2020
13e68ca
web: merge metrics tests
pietroalbini Aug 9, 2020
083f5d3
ci: run tests in parallel
pietroalbini Aug 9, 2020
9da332f
metrics: address review comments
pietroalbini Aug 9, 2020
39f7db6
metrics: make visibility explicit
pietroalbini Aug 9, 2020
13827d6
metrics: add comment explaining the namespace
pietroalbini Aug 9, 2020
ba416f6
Rebased
Aug 13, 2020
e54a21f
metrics: switch to cfg(target_os = "linux")
pietroalbini Aug 12, 2020
7266be3
update the coverage styling in the dropdown
pietroalbini Aug 12, 2020
f25e599
coverage: improve tests
pietroalbini Aug 13, 2020
4751bf2
Rebased
Aug 13, 2020
11930e0
Rebased
Aug 13, 2020
80d2cbc
Rebased
Aug 13, 2020
de82def
Run git gc periodically on the crates.io index #778
Aug 11, 2020
3440e52
Update config.rs
ohaddahan Aug 11, 2020
a94cb90
Fixed "cargo fmt"
Aug 11, 2020
fcf3302
Added --auto
Aug 12, 2020
819d32b
Update src/utils/daemon.rs
ohaddahan Aug 11, 2020
eb47785
Update src/utils/daemon.rs
ohaddahan Aug 11, 2020
8a4eec3
Update src/utils/daemon.rs
ohaddahan Aug 11, 2020
ffc3d9e
Update src/config.rs
ohaddahan Aug 11, 2020
c8744fe
Update .gitignore
ohaddahan Aug 11, 2020
a9f04c4
Fix cargo fmt
Aug 12, 2020
e8daa76
Adding pre-commit hook, adding dead-code lint since without it clippy…
Aug 12, 2020
b622d56
Changed envar prefix and replace echo with printf for pre-commit hook
Aug 12, 2020
860c276
Removed allow(dead_code) and not(windows) , now OPEN_FILE_DESCRIPTORS…
Aug 12, 2020
ad575fa
Updated pre-commit script and instructions
Aug 13, 2020
9897721
Added newline at end of pre-commit script
Aug 13, 2020
95b6061
Move run_git_gc to Index
Aug 13, 2020
b53b87d
Fix typo
Aug 13, 2020
618d966
Rebased
Aug 13, 2020
4e7b8a6
Update .gitignore
ohaddahan Aug 11, 2020
0a0128f
Rebased
Aug 13, 2020
4867b38
Changed envar prefix and replace echo with printf for pre-commit hook
Aug 12, 2020
16fb2a7
Rebased
Aug 13, 2020
58ab2b2
Added newline at end of pre-commit script
Aug 13, 2020
d6383e7
Rebased
Aug 13, 2020
19ca702
Rebased
Aug 13, 2020
a1ef4bc
Rebased
Aug 13, 2020
6715bed
Rebased
Aug 13, 2020
b073e27
Rebased
Aug 13, 2020
f231072
Rebased
Aug 13, 2020
ad35125
Run git gc periodically on the crates.io index #778
Aug 11, 2020
4f9fd37
Update config.rs
ohaddahan Aug 11, 2020
755eda1
Fixed "cargo fmt"
Aug 11, 2020
4b48570
Added --auto
Aug 12, 2020
50473e7
Update src/utils/daemon.rs
ohaddahan Aug 11, 2020
c651b13
Update src/utils/daemon.rs
ohaddahan Aug 11, 2020
20bed43
Update src/utils/daemon.rs
ohaddahan Aug 11, 2020
0ab6ece
Update src/config.rs
ohaddahan Aug 11, 2020
655dd18
Update .gitignore
ohaddahan Aug 11, 2020
26bd4e3
Fix cargo fmt
Aug 12, 2020
d585733
Adding pre-commit hook, adding dead-code lint since without it clippy…
Aug 12, 2020
a17c0c5
Changed envar prefix and replace echo with printf for pre-commit hook
Aug 12, 2020
0478cdf
Removed allow(dead_code) and not(windows) , now OPEN_FILE_DESCRIPTORS…
Aug 12, 2020
7688b67
Updated pre-commit script and instructions
Aug 13, 2020
252edf9
Added newline at end of pre-commit script
Aug 13, 2020
69b363e
Move run_git_gc to Index
Aug 13, 2020
37bc3aa
Fix typo
Aug 13, 2020
5285793
Rebased
Aug 13, 2020
993aef8
Update .gitignore
ohaddahan Aug 11, 2020
b992cc2
Rebased
Aug 13, 2020
18e89b5
Changed envar prefix and replace echo with printf for pre-commit hook
Aug 12, 2020
7a5b5d7
Rebased
Aug 13, 2020
2868dea
Added newline at end of pre-commit script
Aug 13, 2020
6087b61
Rebased
Aug 13, 2020
c69d553
Rebased
Aug 13, 2020
29412b9
Rebased
Aug 13, 2020
c063e0c
Rebased
Aug 13, 2020
7374345
Rebased
Aug 13, 2020
def42a4
Rebased
Aug 13, 2020
fe68adb
Run git gc periodically on the crates.io index #778
Aug 11, 2020
29ecb45
Update config.rs
ohaddahan Aug 11, 2020
d2c20c4
Fixed "cargo fmt"
Aug 11, 2020
3214302
Added --auto
Aug 12, 2020
d072f02
Update src/utils/daemon.rs
ohaddahan Aug 11, 2020
cb406be
Update src/utils/daemon.rs
ohaddahan Aug 11, 2020
a6a5c65
Update src/utils/daemon.rs
ohaddahan Aug 11, 2020
28065e1
Update src/config.rs
ohaddahan Aug 11, 2020
be9e3e0
Update .gitignore
ohaddahan Aug 11, 2020
89578dd
Fix cargo fmt
Aug 12, 2020
87ed97d
Adding pre-commit hook, adding dead-code lint since without it clippy…
Aug 12, 2020
2bf3ae3
Changed envar prefix and replace echo with printf for pre-commit hook
Aug 12, 2020
a9efd05
Removed allow(dead_code) and not(windows) , now OPEN_FILE_DESCRIPTORS…
Aug 12, 2020
d7c0ec1
Updated pre-commit script and instructions
Aug 13, 2020
a7568a2
Added newline at end of pre-commit script
Aug 13, 2020
db8ce50
Move run_git_gc to Index
Aug 13, 2020
4121fb6
Fix typo
Aug 13, 2020
c7fcff6
Rebased
Aug 13, 2020
d8b8a75
Update .gitignore
ohaddahan Aug 11, 2020
60c58fb
Rebased
Aug 13, 2020
3ec3b83
Changed envar prefix and replace echo with printf for pre-commit hook
Aug 12, 2020
79a295d
Rebased
Aug 13, 2020
02bc0ff
Added newline at end of pre-commit script
Aug 13, 2020
ff02d23
Rebased
Aug 13, 2020
5722ae1
Rebased
Aug 13, 2020
6c7193b
Rebased
Aug 13, 2020
ef9cd22
Rebased
Aug 13, 2020
a593212
Rebased
Aug 13, 2020
5b0cc9f
Rebased
Aug 13, 2020
19 changes: 19 additions & 0 deletions .git_hooks/pre-commit
@@ -0,0 +1,19 @@
#!/usr/bin/env sh
if ! cargo fmt -- --check ; then
printf "\n"
printf "\033[0;31mpre-commit hook failed during:\033[0m\n"
printf "\033[0;31m\tcargo fmt -- --check\033[0m\n"
exit 1
fi

if ! cargo clippy --locked -- -D warnings ; then
printf "\n"
printf "\033[0;31mpre-commit hook failed during:\033[0m\n"
printf "\033[0;31m\tcargo clippy --locked -- -D warnings\033[0m\n"
exit 1
fi

printf "\n"
printf "\033[0;32mpre-commit hook succeeded\033[0m\n"
exit 0

6 changes: 6 additions & 0 deletions README.md
@@ -22,6 +22,12 @@ The recommended way to develop docs.rs is a combination of `cargo run` for
the main binary and [docker-compose](https://docs.docker.com/compose/) for the external services.
This gives you reasonable incremental build times without having to add new users and packages to your host machine.

### Git Hooks
For ease of use, `git_hooks` directory contains useful `git hooks` to make your development easier.
```bash
Comment on lines +25 to +27
@Kixiron Kixiron Aug 13, 2020

Suggested change
### Git Hooks
For ease of use, `git_hooks` directory contains useful `git hooks` to make your development easier.
```bash
### Git Hooks
For ease of use, `git_hooks` directory contains useful `git hooks` to make your development easier.
```bash

cd .git/hooks && ln -s ../../.git_hooks/* . && cd ../..
```
Comment on lines +27 to +29
Suggested change
```bash
cd .git/hooks && ln -s ../../.git_hooks/* . && cd ../..
```
```bash
# Unix
cd .git/hooks && ln -s ../../.git_hooks/* . && cd ../..
# Powershell
cd .git/hooks && New-Item -Path ../../.git_hooks/* -ItemType SymbolicLink -Value . && cd ../..
```


### Dependencies

Docs.rs requires at least the following native C dependencies.
3 changes: 3 additions & 0 deletions src/config.rs
@@ -30,6 +30,8 @@ pub struct Config {
pub(crate) max_file_size_html: usize,
// The most memory that can be used to parse an HTML file
pub(crate) max_parse_memory: usize,
// Time between 'git gc' calls in seconds
pub(crate) registry_gc_interval: u64,
}

impl Config {
@@ -61,6 +63,7 @@ impl Config {
// LOL HTML only uses as much memory as the size of the start tag!
// https://github.com/rust-lang/docs.rs/pull/930#issuecomment-667729380
max_parse_memory: env("DOCSRS_MAX_PARSE_MEMORY", 5 * 1024 * 1024)?,
registry_gc_interval: env("DOCSRS_REGISTRY_GC_INTERVAL", 60 * 60)?,
})
}
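
The new `registry_gc_interval` field is read from `DOCSRS_REGISTRY_GC_INTERVAL` and defaults to `60 * 60` seconds. A minimal sketch of that lookup-with-default pattern follows. `parse_or_default` and `interval_from_env` are hypothetical names used only for illustration; they are not the `env` helper from `src/config.rs`, whose signature is not shown in this diff.

```rust
use std::num::ParseIntError;

/// Default interval between `git gc` runs, in seconds (one hour, as in the diff).
const DEFAULT_GC_INTERVAL_SECS: u64 = 60 * 60;

/// Hypothetical helper: parse an optional raw value, falling back to `default`
/// when the value is absent. A present but malformed value is reported as an
/// error instead of silently falling back.
fn parse_or_default(raw: Option<&str>, default: u64) -> Result<u64, ParseIntError> {
    match raw {
        Some(value) => value.trim().parse::<u64>(),
        None => Ok(default),
    }
}

/// Hypothetical wrapper that reads the variable named in the diff.
/// An unset (or non-Unicode) variable is treated as absent.
fn interval_from_env() -> Result<u64, ParseIntError> {
    let raw = std::env::var("DOCSRS_REGISTRY_GC_INTERVAL").ok();
    parse_or_default(raw.as_deref(), DEFAULT_GC_INTERVAL_SECS)
}
```

Keeping the parsing separate from the environment lookup lets the fallback logic be checked without touching process-wide state.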

4 changes: 4 additions & 0 deletions src/docbuilder/queue.rs
@@ -7,6 +7,10 @@ use crates_index_diff::ChangeKind;
use log::{debug, error};

impl DocBuilder {
pub fn run_git_gc(&self) {
self.index.run_git_gc();
}

/// Updates registry index repository and adds new crates into build queue.
/// Returns the number of crates added
pub fn get_new_crates(&mut self) -> Result<usize> {
9 changes: 9 additions & 0 deletions src/index/mod.rs
@@ -1,4 +1,5 @@
use std::path::{Path, PathBuf};
use std::process::Command;
Comment on lines 1 to +2
Suggested change
use std::path::{Path, PathBuf};
use std::process::Command;
use std::{path::{Path, PathBuf}, process::Command};


use url::Url;

@@ -50,6 +51,14 @@ impl Index {
Ok(Self { path, api })
}

pub fn run_git_gc(&self) {
let cmd = format!("cd {} && gc --auto", self.path.to_str().unwrap());
let gc = Command::new("sh").args(&["-c", cmd.as_str()]).output();
Suggested change
let cmd = format!("cd {} && gc --auto", self.path.to_str().unwrap());
let gc = Command::new("sh").args(&["-c", cmd.as_str()]).output();
let gc = Command::new("git").arg("-C").arg(&self.path).args(&["gc", "--auto"]).output();

Avoids the use of sh and potential panic if the path is non-UTF-8.

if let Err(err) = gc {
Comment on lines +56 to +57
Suggested change
let gc = Command::new("sh").args(&["-c", cmd.as_str()]).output();
if let Err(err) = gc {
let gc = Command::new("sh").args(&["-c", cmd.as_str()]).output();
if let Err(err) = gc {

log::error!("Failed to run `{}`: {:?}", cmd, err);
}
}

pub(crate) fn diff(&self) -> Result<crates_index_diff::Index> {
let diff = crates_index_diff::Index::from_path_or_cloned(&self.path)
.context("re-opening registry index for diff")?;
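
The review comment on `run_git_gc` proposes calling `git -C <path> gc --auto` directly instead of building a shell string. The sketch below illustrates that suggestion under a few assumptions: `git_gc_command` and the free function `run_git_gc` are illustrative names rather than the code that was merged, and `.status()` is used where the diff uses `.output()`. Note that the shell string in the diff reads `gc --auto` with no `git` in front of it, a slip that a direct invocation also sidesteps.

```rust
use std::io;
use std::path::Path;
use std::process::{Command, ExitStatus};

/// Build `git -C <repo> gc --auto` without going through `sh`.
/// `Command::arg` accepts any `AsRef<OsStr>`, so the path is passed as-is
/// and never needs to be converted to UTF-8.
fn git_gc_command(repo: &Path) -> Command {
    let mut cmd = Command::new("git");
    cmd.arg("-C").arg(repo).args(&["gc", "--auto"]);
    cmd
}

/// Run the command and return its exit status. An `Err` means the process
/// could not be started; a non-zero status means git itself reported failure.
fn run_git_gc(repo: &Path) -> io::Result<ExitStatus> {
    git_gc_command(repo).status()
}
```

A caller would probably want to log both cases, since `if let Err(err) = gc` in the diff only catches a failure to spawn the process and not a failing exit code.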
16 changes: 14 additions & 2 deletions src/utils/daemon.rs
@@ -13,18 +13,21 @@ use failure::Error;
use log::{debug, error, info};
use std::sync::Arc;
use std::thread;
use std::time::Duration;
use std::time::{Duration, Instant};

fn start_registry_watcher(
opts: DocBuilderOptions,
pool: Pool,
build_queue: Arc<BuildQueue>,
config: Arc<Config>,
) -> Result<(), Error> {
thread::Builder::new()
.name("registry index reader".to_string())
.spawn(move || {
// space this out to prevent it from clashing against the queue-builder thread on launch
thread::sleep(Duration::from_secs(30));
let mut last_gc = Instant::now();

loop {
let mut doc_builder =
DocBuilder::new(opts.clone(), pool.clone(), build_queue.clone());
@@ -39,6 +42,10 @@ fn start_registry_watcher(
}
}

if last_gc.elapsed().as_secs() >= config.registry_gc_interval {
doc_builder.run_git_gc();
last_gc = Instant::now();
}
thread::sleep(Duration::from_secs(60));
}
})?;
@@ -60,7 +67,12 @@ pub fn start_daemon(

if enable_registry_watcher {
// check new crates every minute
start_registry_watcher(dbopts.clone(), db.clone(), build_queue.clone())?;
start_registry_watcher(
dbopts.clone(),
db.clone(),
build_queue.clone(),
config.clone(),
)?;
}

// build new crates every minute
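
The watcher loop above records `last_gc` with `Instant::now()`, compares the elapsed seconds against `config.registry_gc_interval` on each pass, and resets the timestamp after running gc. Because the loop sleeps 60 seconds per iteration, gc fires on the first pass after the interval has elapsed, not at the exact moment. The same bookkeeping can be sketched as a small helper; `PeriodicTask` is a hypothetical type that is not part of the PR, and it takes `now` as a parameter so the timing logic can be exercised without sleeping.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper that tracks when a periodic task last ran.
struct PeriodicTask {
    interval: Duration,
    last_run: Instant,
}

impl PeriodicTask {
    fn new(interval: Duration, now: Instant) -> Self {
        Self { interval, last_run: now }
    }

    /// Returns true and resets the timer if at least `interval` has passed
    /// since the last run; otherwise returns false and leaves it untouched.
    fn poll(&mut self, now: Instant) -> bool {
        if now.duration_since(self.last_run) >= self.interval {
            self.last_run = now;
            true
        } else {
            false
        }
    }
}
```

In the daemon loop this would amount to `if gc_timer.poll(Instant::now()) { doc_builder.run_git_gc(); }`, with `gc_timer` being an assumed local variable.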
4 changes: 2 additions & 2 deletions src/web/metrics.rs
@@ -129,7 +129,7 @@ pub static MAX_DB_CONNECTIONS: Lazy<IntGauge> = Lazy::new(|| {
.unwrap()
});

#[cfg(not(windows))]
#[cfg(target_os = "linux")]
pub static OPEN_FILE_DESCRIPTORS: Lazy<IntGauge> = Lazy::new(|| {
register_int_gauge!(
"docsrs_open_file_descriptors",
@@ -138,7 +138,7 @@ pub static OPEN_FILE_DESCRIPTORS: Lazy<IntGauge> = Lazy::new(|| {
.unwrap()
});

#[cfg(not(windows))]
#[cfg(target_os = "linux")]
pub static CURRENTLY_RUNNING_THREADS: Lazy<IntGauge> = Lazy::new(|| {
register_int_gauge!(
"docsrs_running_threads",
Expand Down
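
The `src/web/metrics.rs` hunks replace `#[cfg(not(windows))]` with `#[cfg(target_os = "linux")]` on two gauges. The narrower predicate matters because `not(windows)` also matches macOS and the BSDs. The diff does not show how the gauge values are collected, so the sketch below assumes a Linux-only data source (`/proc/self/fd`) purely for illustration; `open_file_descriptors` is a hypothetical function name.

```rust
/// Linux: count the entries in `/proc/self/fd`. Returns `None` if the
/// directory cannot be read (for example, when `/proc` is not mounted).
#[cfg(target_os = "linux")]
fn open_file_descriptors() -> Option<usize> {
    let entries = std::fs::read_dir("/proc/self/fd").ok()?;
    Some(entries.filter_map(Result::ok).count())
}

/// Every other target: the function still exists but reports nothing,
/// so callers compile unchanged on all platforms.
#[cfg(not(target_os = "linux"))]
fn open_file_descriptors() -> Option<usize> {
    None
}
```

Providing a fallback definition is one alternative to gating the item itself, which the diff does and which requires every use site to carry the same `cfg` attribute.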