| 266 | === Changelog 1.3 → 2.0 === |
| 267 | Major bugfixes |
| 268 | * fixed: redirections to path + "/" were ignored |
| 269 | * fixed: binary files were discarded (text extraction from PDF, DOC, ... now works) |
| 270 | Major updates |
| 271 | * multilingual website support (see util/config.py) |
| 272 | * near-good or even bad paragraphs can now be kept |
| 273 | Minor updates |
| 274 | * machine translation filter (based on some known MT identifiers in HTML) |
| 275 | * extract text from ODF format (.odt files) |
| 276 | * get file type from the Content-Type HTTP header |
| 277 | * add HTTP Last-Modified date to prevertical |
| 278 | * jusText classification added to paragraph attributes |
| 279 | |
| 280 | === Changelog 1.1 → 1.3 === |
| 281 | * decode IDNA hostnames in prevertical |
| 282 | * URLs to download can now be added on the fly |
| 283 | * bugfixes |
| 284 | |