# prawn
<!--
Logo Image
Sadly I can't do my cool styling for the div :C
-->
<div style="display: flex; justify-content: center;">
  <img src="assets/logo.png" alt="logo" style="width: 50%" />
</div>
prawn is an extremely fast Rust web scraper that downloads a webpage's HTML and all linked CSS and JS resources, saving them into a local folder for offline use.
## Features
- High-performance: uses `reqwest` (with connection pooling), `tokio`, and `rayon` for parallelism.
- CLI tool: accepts a URL as an argument.
- Downloads and parses HTML as fast as possible.
- Extracts and concurrently downloads all `<link rel="stylesheet">` and `<script src="...">` resources.
- Rewrites the HTML to point to local files and saves it as `saved_site/index.html`.
- All CSS and JS files are saved into `saved_site/css/` and `saved_site/js/` respectively.
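The README doesn't specify how a resource URL becomes a local file name. A minimal std-only sketch of one plausible mapping — the function name and rules here are illustrative assumptions, not prawn's actual code:

```rust
// Sketch: map a remote resource URL to a path under saved_site/.
// Takes the last path segment as the file name, stripping any query string.
fn local_path(url: &str, kind: &str) -> String {
    let no_query = url.split('?').next().unwrap_or(url);
    let name = no_query.rsplit('/').next().unwrap_or("resource");
    format!("saved_site/{}/{}", kind, name)
}

fn main() {
    assert_eq!(
        local_path("https://example.com/static/app.css?v=3", "css"),
        "saved_site/css/app.css"
    );
    assert_eq!(
        local_path("https://example.com/js/main.js", "js"),
        "saved_site/js/main.js"
    );
}
```

A real implementation would also need to handle name collisions and URLs with empty paths.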
## Usage

```sh
cargo run -- https://example.com
```

This will download the HTML, CSS, and JS concurrently and save them to `./saved_site/` within seconds.
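Based on the layout described above, a run should leave a folder shaped roughly like this (the individual CSS/JS file names are hypothetical examples):

```
saved_site/
├── index.html
├── css/
│   └── style.css
└── js/
    └── main.js
```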
## Constraints
- Uses async Rust (`tokio`) for HTTP I/O and `rayon` or `futures` for concurrent downloads.
- Uses `scraper` for fast DOM-like parsing.
- No GUI dependencies or headless browsers (pure HTTP and HTML/CSS/JS).
- Avoids unsafe code unless absolutely justified and documented.
- Minimizes unnecessary allocations or cloning.
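To make the rewrite step concrete: prawn parses with `scraper`, but the core idea — swap each remote resource URL for its local path in the saved HTML — can be pictured with a plain string-replacement sketch (illustrative only; the function name and pair list are assumptions):

```rust
// Illustrative sketch of the rewrite step: replace each remote URL
// in the HTML with its local path. Real code would operate on nodes
// found via `scraper` rather than raw string replacement.
fn rewrite_links(html: &str, pairs: &[(&str, &str)]) -> String {
    let mut out = html.to_string();
    for (remote, local) in pairs {
        out = out.replace(remote, local);
    }
    out
}

fn main() {
    let html = r#"<link rel="stylesheet" href="https://example.com/a.css">"#;
    let rewritten = rewrite_links(html, &[("https://example.com/a.css", "css/a.css")]);
    assert_eq!(rewritten, r#"<link rel="stylesheet" href="css/a.css">"#);
}
```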
## License
MIT