Thibaut
c69136056e
Log current URL on scraper error
10 years ago
Thibaut
7de19cf800
Make EntryIndex a unique index (don't add the same entry twice)
10 years ago
Thibaut
018628ea7d
Add two-pass redirection rewriter
... to avoid having to maintain huge lists of redirects. This works by doing a first pass to detect which internal URL is redirected where, before doing a second (normal) pass that rewrites all these URLs (links) with their final destination. There's a bit of monkey-patching I'm not proud of, but this works(tm).
10 years ago
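
The two-pass idea described in commit 018628ea7d can be illustrated with a minimal sketch. This is not the project's actual implementation: every name here (`detect_redirections`, `resolve`, `rewrite_links`, and the `fetch` callable) is hypothetical, and link rewriting is done with a simple regular expression rather than an HTML parser, purely to keep the example self-contained.

```ruby
# Hypothetical sketch of a two-pass redirection rewriter.
# None of these names come from the actual codebase.

# First pass: record where each internal URL ends up.
# `fetch` is an assumed callable returning [final_url, html] for a URL.
def detect_redirections(urls, fetch)
  urls.each_with_object({}) do |url, redirections|
    final_url, _html = fetch.call(url)
    redirections[url] = final_url unless final_url == url
  end
end

# Follow redirect chains (a -> b -> c) to the final destination,
# stopping if a URL is seen twice so that cycles cannot loop forever.
def resolve(url, redirections)
  seen = {}
  while redirections.key?(url) && !seen[url]
    seen[url] = true
    url = redirections[url]
  end
  url
end

# Second pass: rewrite every href with its final destination.
def rewrite_links(html, redirections)
  html.gsub(/href="([^"]*)"/) do
    target = Regexp.last_match(1)
    %(href="#{resolve(target, redirections)}")
  end
end

# Usage with an in-memory stand-in for a site: '/old' redirects to '/new'.
site = {
  '/old'   => ['/new', '<p>new</p>'],
  '/new'   => ['/new', '<p>new</p>'],
  '/index' => ['/index', '<a href="/old">Old</a> <a href="/new">New</a>']
}
fetch = ->(url) { site.fetch(url) }

redirections = detect_redirections(site.keys, fetch)
rewritten = rewrite_links(site['/index'][1], redirections)
```

Because the redirect map is built from the crawl itself, no hand-maintained list of redirects is needed; the cost is that every page is requested twice.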
Thibaut
9ecb1f9498
Refactor StubRootPage
10 years ago
Thibaut
b29d6ca002
Move doc links to manifest
10 years ago
Thibaut
5c4c1ce2b6
Log entries/types/files diff in docs:generate command
10 years ago
Thibaut
cf7f446738
Change home_url to a list of links
10 years ago
Thu Trang Pham
ab85da334d
home_url must be applied after inner_html
10 years ago
Thu Trang Pham
642c1cff7d
Make sure that home_url can be nil
10 years ago
Thu Trang Pham
53bba10212
Adding home_url attr to doc.rb
10 years ago
Thu Trang Pham
42910c62ce
Adding home_url to scraper
10 years ago
Thu Trang Pham
5d7ad301ea
Give scraper access to home_url
10 years ago
Thu Trang Pham
610e49119f
Renamed official_url to home_url
10 years ago
Thu Trang Pham
0dc8cfd5d3
Adding official URL for URL-scraped documentation
10 years ago
Thibaut
a59ef1cdb6
Add db_size attribute in doc manifest
10 years ago
Thibaut
bc5488faa2
Make docs mtime the greatest of the index and db files' mtime
10 years ago
Thibaut
ca7ff6086e
Exclude docs without a db file from the manifest
10 years ago
Thibaut
ca61a2b746
Add Doc#db_path
10 years ago
Thibaut
5c46eabc67
Output a JSON file containing all the pages' content
10 years ago
Thibaut
e9125c6ec2
Refactor Doc.store_pages
10 years ago
Thibaut
ecf774e22c
Add EntryIndex#blank?
10 years ago
Thibaut
1655a00fb6
Refactor Doc.store_page
10 years ago
Thibaut
e0556365a8
Remove deprecated Kernel#quietly calls
10 years ago
Thibaut
1fa81b8461
Silence Nokogiri warnings
10 years ago
Thibaut
6769c90c8a
Silence Nokogiri/libxml2 warnings
10 years ago
Thibaut
bcd4a5b522
Use String#remove
11 years ago
Thibaut
c0be178556
Auto-require gems in the "docs" bundle
11 years ago
Thibaut
864188e24c
Use String#sub instead of String#gsub when possible
11 years ago
Thibaut
b92db88506
Refactor Docs::Scraper
11 years ago
Thibaut
ca06cc7ad9
Implement Docs::Filter#initial_page?
11 years ago
Thibaut
f8298181e9
Implement initial_paths scraper option
11 years ago
Thibaut
5e69d6df5e
Make Docs::FileScraper#request_all accept an array of URLs
11 years ago
Thibaut
cd6057e392
Make Docs::UrlScraper#request_all accept an array of URLs
11 years ago
Thibaut
c6be9b6ae4
Make Docs::Requester#request accept an array of URLs
11 years ago
Thibaut
6e9c16a1fc
Increase default :max_concurrency
11 years ago
Thibaut
706270d89c
Don't store pages with no entries
11 years ago
Thibaut
24347f312b
Add CleanLocalUrls filter for removing localhost URLs
Enabled by default in FileScraper
11 years ago
Thibaut
9089d03211
Set the default FileScraper::base_url to localhost
11 years ago
Thibaut
5c4775201b
Move code in scraper.rb
11 years ago
Thibaut
d964fd4ca9
Make Docs::URL#relative_path_to work with unqualified URLs
11 years ago
Thibaut
4df387d475
Make Docs::URL#subpath_to work with unqualified URLs
11 years ago
Thibaut
1204390f81
Make Docs::Scraper.root_path inheritable
11 years ago
Thibaut
18986c1814
Going open source
11 years ago