Changes for version 0.04
- Change: cae590fbe008e2c539fa40c24c449d9cbb44b3ff Author: Paul Waring <paul@xk7.net> Date : 2014-05-11 15:07:28 +0000
- Bump version to 0.04
- Change: f158ecdc8ff11c3f84d69a42530a698fcb528226 Author: Paul Waring <paul@xk7.net> Date : 2014-05-11 15:07:09 +0000
- Output file is now required.
- Change: e19c3b9a2d83e12481baf0740e6b2ccbb418ec9f Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-05-09 13:26:42 +0000
- Bump version to 0.03
- Change: cefc2864f0e116b0b6de982dd420d19ca6b57a1c Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-05-09 13:24:11 +0000
- Document excluded_urls parameter
- Change: 4a2807ec5af276d4effb2fd778e93881d242ce36 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-05-08 10:46:28 +0000
- Check final URL as well as initial absolute URL
- Change: 2d8a87f8d3a195a4f07cbac26a7845d6ecce14e9 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-05-07 16:28:46 +0000
- Allow URLs to be excluded
- This feature is required because otherwise the link checker can get stuck on pages which have a huge number of self-referencing links (e.g. calendars).
- Change: 9cd5c75a2e5650051bfca20ea129c683b4b29c9e Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-05-06 12:46:10 +0000
- Automatically flush STDOUT
- This is needed because otherwise we cannot monitor progress (e.g. with ./script.pl | tee output.txt).
- Change: 71f2d63ff15a015f963c7e20a7b44bb4cf17bfb3 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-05-06 12:27:18 +0000
- Extra debugging
- Change: ffd37475471353bd385ecd5a434f59e6972bdf80 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-05-06 12:18:03 +0000
- Convert say to print
- Change: a33befee109fbf4a17aa640b11c723a3d4af79fa Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-05-06 11:37:36 +0000
- Ignore *.txt files
- Change: 7e3b1fec5d6d016d15edcbc5417f80b0975e35fc Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-05-02 15:52:47 +0000
- Ignore broken URLs which appear more than 10 times
- Chances are that if we encounter a broken URL more than 10 times, it will be part of the site's header or footer, and therefore there is no point in reporting it on every single page.
- Change: f6079f91b0bb106ad7a5811c80e059feb4271a0b Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-05-02 14:57:37 +0000
- Prune Perl and CSV files from distribution
- Change: cf1fae20c4777f100b2c311a65fbea7f3c1e68a6 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-05-02 10:43:39 +0000
- Bump version to 0.02
- Change: 4d5df83e0ae84b257a2918d9080f06d3c764bdb9 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-05-01 13:14:11 +0000
- Ignore CSV files
- Change: 864be37d5e4ac55629055d2ef5624fbfc615d155 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 13:21:00 +0000
- Adding documentation
- Change: 1d9c520550bfca3533894b9727566f8866e93107 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 12:51:46 +0000
- Fix syntax error
- Change: 01ca1992d75758827496839b1b9011b30dad735d Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 12:50:29 +0000
- Check images as well as links
- This should detect images where the src URL results in an error
- Change: 9f4aeb336223b98a82b8f923685bf12e6ee50a15 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 12:44:46 +0000
- Add headings to CSV file
- Change: 96acf43ce082adbe5148d9776fda3b38bfd55850 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 12:40:00 +0000
- Move CSV options into separate variable
- Change: e72289c899393cdaf12a98aef1e24cb2a5764edd Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 12:37:32 +0000
- Configure Text::CSV
- Always quote (makes importing into other applications easier).
- Explicitly set the end-of-line character, as the Text::CSV default does not appear to work.
- Change: 5439144f395e66d0e0b8df59f77617b390d13efa Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 12:10:03 +0000
- Use Text::CSV for output
- It is much easier to use this module than to try to remember the correct encodings, quoting rules, etc.
- Change: a06eadf6007692a7511052f84481efb46fb5ec15 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 12:01:43 +0000
- Print to specified output file or STDOUT
- Change: 25605a712e770390dfd169dcb65eca7f44b818b3 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 11:20:01 +0000
- Change newline to tab
- Change: 2a5b173d8f75fcc392da9cc879eb7cb7b19fdf3b Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 11:18:28 +0000
- Remove debugging info
- Change: aab491a63bff7974ebc6c99983de25c7e7c4e0a7 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 11:14:01 +0000
- Use index to search for substring instead of regex
- Change: 9a0bca7da020ff793a1257fa11e329bf88ddb072 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 11:11:20 +0000
- Correct regex
- Change: 6055a3f4e880045820d93198e31e5bfe2e062cb2 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 11:10:29 +0000
- Extra debugging info
- Change: a2e04c26ee78e45947c44629d2d271cf4d63bcba Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 11:04:52 +0000
- Don't check URLs which we have previously scanned
- This check needs to be further up to prevent unnecessary HEAD requests.
- Change: 813614ec9093ace30d61248bad3ecebbde5df853 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 10:59:53 +0000
- Remove URL fragment
- This prevents us from making a HEAD request for the same page multiple times.
- Change: c06587623535e20ffe5196da2fc06b176316861c Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 10:49:35 +0000
- Correct variable name
- Change: 5f12686437d4917984684c4f66539d4ba6750c32 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 10:48:16 +0000
- Only check http(s) links
- There is no need to check javascript, mailto, etc.
- Also move sleep gaps to immediately after get/head is called, so we definitely pause after each request.
- Change: 654b93024554aa4300209a842915bc99d46a06ce Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 10:40:19 +0000
- Issue HEAD request and check content-type
- This prevents fetching and parsing URLs which are not HTML and therefore will not contain links. Avoiding this is especially important for large binary files.
- Change: 26de2a6b84e5a6b2dd848a72451ca1ac500ea42a Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-29 10:39:23 +0000
- Ignore todo.txt
- Change: de68d4d9c839952e8390faec624221456f7868bc Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 16:36:43 +0000
- Stop Mechanize from dying on errors
- We want to check if a URL is not found and handle it gracefully, whereas the default behaviour in WWW::Mechanize is to die() immediately.
- Change: 25abdd48411d4e22c9d19f2c40052b2bb53b7bbc Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 16:09:16 +0000
- Correct variable name
- Change: 3fcb89b408bf7cfa0866df295c50f5960602df17 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 16:08:27 +0000
- Configurable request gap
- Sleep for this number of seconds between requests
- Change: b69ee875031d5f582f7a70d16a9177ce6b59b10e Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 16:06:02 +0000
- Debugging info
- Change: 7a3b92edf1a9f5be0b54770290e3d362d57f1bed Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 16:03:13 +0000
- Change false to 0
- Change: 5db58cc1ccdda95d02ddd289743c95ae6d77173c Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 16:02:08 +0000
- Correct variable name
- Change: 49377fd0f4c6661d8b88d85483e3c03ef5e18c74 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 15:57:25 +0000
- Initial module code
- There should be enough here to run and test.
- Change: 98915b9f5c369b1f908ab1bf4718132b49c78182 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 15:42:45 +0000
- Instance variable and method
- Change: 19acfb039b9c1efd11147fe1bf25ccf4b1402e69 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 15:40:23 +0000
- Ignore build and test files
- Change: 4d61cd6c5dd32ddadc9838c7c8b3b55f08f8b377 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 15:38:38 +0000
- Rename module to match package name
- Change: 84bff4e62880df73d0a383849e3e2466345d3fde Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 15:27:58 +0000
- Initial dist.ini
- Allows distribution to be built using Dist::Zilla
- Change: 8a083dec8bcc8cc131e3fcf9d10a0ce5b6a6d646 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 15:27:38 +0000
- Initial version number
- Change: 935fcc5cfd71d28ff24434a6ac5b7a650a18cefb Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 15:23:48 +0000
- End of file newline
- Change: b56655d52db01a003feda0ccf338a0216292f3eb Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 15:23:28 +0000
- Basic documentation
- Change: f0104e4d2febfd73874c813641a0eb9a60cacc42 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 15:22:35 +0000
- Moose code in line with best practice
- Change: 6191482426f94d0ee032a6f64284f30db9479b20 Author: Paul Waring <paul.waring@manchester.ac.uk> Date : 2014-04-28 15:19:34 +0000
- Skeleton module
- Change: de67745ed3adedb5a435bf05d51d8caa931b354a Author: Paul Waring <paul@xk7.net> Date : 2014-04-28 07:14:05 +0000
- Initial commit
- End of releases.
Modules
Finds broken links (including images) on a website.