Changes between Initial Version and Version 1 of Justext_changelog

Timestamp:: 10/09/18 15:41:45 (8 years ago)
Author:: admin
Comment:: Created from git log

Justext_changelog

               v1
+Apr 11 2018
+    Norwegian joined wordlist
+Apr 11 2018
+    More wordlists
+Sep 11 2017
+    Lowercased stoplist
+Aug 24 2017
+    New and updated wordlists
+Aug 24 2017
+    jusText 1.4
+Aug 24 2017
+    Web demo
+Aug 24 2017
+    max_good_distance, a new context classification parameter
+    Maximum distance (in paragraphs) of a short paragraph from a good
+    paragraph to re-classify the short paragraph as good.
+Jun 30 2017
+    Minor package updates
+Jun 30 2017
+    jusText 1.3
+Jun 29 2017
+    Split preprocess into get_html_root and preprocess_html_root
+    Allows using the DOM root before the head (and other possibly useful
+    elements) are removed. Needed to get the page title from the head.
+Apr 12 2017
+    New README
+Apr 12 2017
+    Filter out HTML(5) elements
+Feb 24 2017
+    Remove words containing Latin characters from Korean stoplist
+Jan 12 2015
+    Move * out of trunk/
+Nov 11 2012
+    Temporary workaround for issue #2: Remove any text nodes that cannot be decoded.
+Jan 26 2012
+    Added stoplists for Kazakh, Kyrgyz, Turkmen and Uzbek.
+Dec 6 2011
+    Fixed inserting spaces between text nodes. Before, content such as "abc<b>efg</b>" became "abc efg" after processing. Now it correctly becomes "abcefg".
+Aug 8 2011
+    jusText 1.2
+Aug 8 2011
+    Edited wiki page Algorithm through web user interface.
+Aug 4 2011
+    Use character counts instead of word counts where possible (length-low, length-high, max-heading-distance, and the link density computation). This makes the algorithm work well in language-independent mode (without a stoplist) for languages where counting words is not easy (Japanese, Chinese, Thai, etc.). The default thresholds have been adjusted correspondingly.
+Aug 4 2011
+    More robust parsing of meta tags that declare the charset.
+Jun 6 2011
+    Bug fix: Corrected decoding of HTML entities &#128; to &#159;
+Mar 28 2011
+    Edited wiki page Algorithm through web user interface.
+Mar 28 2011
+    Edited wiki page Algorithm through web user interface.
+Mar 23 2011
+    Edited wiki page Algorithm through web user interface.
+Mar 17 2011
+    Edited wiki page Algorithm through web user interface.
+Mar 9 2011
+    Edited wiki page Algorithm through web user interface.
+Mar 9 2011
+    Edited wiki page Algorithm through web user interface.
+Mar 9 2011
+    Edited wiki page Algorithm through web user interface.
+Mar 9 2011
+    Edited wiki page Algorithm through web user interface.
+Mar 9 2011
+    Created wiki page through web user interface.
+Mar 9 2011
+    jusText 1.1
+Mar 9 2011
+    Initial import.
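
The max_good_distance entry (Aug 24 2017) describes re-classifying a short paragraph as good when it lies within a given distance of a good paragraph. A minimal sketch of that one rule follows. The function name `reclassify_short` and the plain string labels are hypothetical, and the sketch does not reproduce the rest of jusText's context-sensitive classification.

```python
def reclassify_short(labels, max_good_distance):
    """Return a copy of labels in which every 'short' paragraph that lies
    within max_good_distance paragraphs of a 'good' one becomes 'good'.

    Hypothetical helper for illustration only, not the jusText API.
    """
    # Positions of paragraphs already classified as good.
    good_positions = [i for i, label in enumerate(labels) if label == "good"]
    result = list(labels)
    for i, label in enumerate(labels):
        if label != "short":
            continue
        # Distance is measured in paragraphs, as in the changelog entry.
        if any(abs(i - g) <= max_good_distance for g in good_positions):
            result[i] = "good"
    return result


labels = ["good", "short", "bad", "short", "short"]
print(reclassify_short(labels, max_good_distance=1))
```

With a distance of 1, only the short paragraph directly next to the good one is promoted; the later short paragraphs are too far away and keep their label.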
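
The Dec 6 2011 entry fixes the joining of adjacent text nodes: "abc<b>efg</b>" should yield "abcefg", not "abc efg". The sketch below shows the intended behaviour using only Python's standard `html.parser`. The names `TextCollector` and `extract_text` are hypothetical, and jusText itself may use a different parser.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects text nodes in document order (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Keep each text node exactly as it appears in the markup.
        self.chunks.append(data)


def extract_text(html):
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    # Join with the empty string: inline tags such as <b> must not
    # introduce whitespace that is absent from the source.
    return "".join(parser.chunks)


print(extract_text("abc<b>efg</b>"))
```

Joining with `" "` instead of `""` would reproduce the behaviour the entry describes as the bug.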
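
The Aug 4 2011 entry replaces word counts with character counts so that length thresholds and link density still work for languages written without spaces. A small sketch of why this matters is given below. The function names are hypothetical and the formula is a simplified reading of "link density" as the share of characters inside links, not the exact jusText computation.

```python
def word_count(text):
    # Whitespace splitting, which assumes words are separated by spaces.
    return len(text.split())


def char_count(text):
    return len(text)


def link_density(text, link_text):
    """Fraction of the paragraph's characters that sit inside links.

    Simplified, hypothetical formula for illustration.
    """
    if not text:
        return 0.0
    return len(link_text) / len(text)


japanese = "これは日本語の文章です"
english = "This is an English sentence"

# Whitespace splitting sees the Japanese sentence as a single "word",
# while the character count still reflects its real length.
print(word_count(japanese), char_count(japanese))
print(word_count(english), char_count(english))
print(link_density("click here for more", "here"))
```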
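
The Jun 6 2011 entry concerns the numeric entities &#128; to &#159;. Read literally these are C1 control characters, but HTML parsers conventionally map them to Windows-1252 characters. The sketch below shows that convention using Python's standard `html.unescape`; it illustrates the expected result and is not taken from the jusText source.

```python
import html

# Numeric character references in the 128-159 range are mapped to
# their Windows-1252 equivalents, following the HTML5 parsing rules.
for entity in ("&#128;", "&#153;", "&#159;"):
    decoded = html.unescape(entity)
    print(entity, "->", decoded, hex(ord(decoded)))
```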