Changes between Version 22 and Version 23 of WikiStart


Timestamp:
05/25/15 10:47:52
Author:
admin
Comment:

--

  • WikiStart

    v22 v23
    10  10  <p><a href="/wiki/Onion">
    11  11  Onion (ONe Instance ONly) is a de-duplicator for large collections of texts. It can measure the similarity of paragraphs or whole documents and drop duplicate ones based on the threshold you set.</a></p>
    12      <p><a href="http://is.muni.cz/th/45523/fi_d/phdthesis.pdf">Paper</a></p>
        12  <p>
        13  <a class="lnk" href="http://is.muni.cz/th/45523/fi_d/phdthesis.pdf">Paper</a>
        14  |
        15  <a class="lnk" href="">Cite</a>
        16  |
        17  <a class="lnk" href="http://opensource.org/licenses/BSD-3-Clause">Licence</a>
        18  </p>
    13  19  </td>
    14  20
    15  21  <td class="app" style="background-color:#800000 ; background-image:url('/chrome/site/unitok_nb.png')">
    16      <a href="/wiki/Unitok">
    17      Unitok is a universal text tokeniser with specific settings for many languages. It can turn plain text into a sequence of newline-separated tokens (“vertical” format), while preserving XML-like tags containing metadata.
        22  <p><a href="/wiki/Unitok">
        23  Unitok is a universal text tokeniser with specific settings for many languages. It can turn plain text into a sequence of newline-separated tokens (“vertical” format), while preserving XML-like tags containing metadata.</a></p>
        24  <p>
        25  <a class="lnk" href="http://nlp.fi.muni.cz/raslan/raslan14.pdf#page=79">Paper</a>
        26  |
        27  <a class="lnk" href="">Cite</a>
        28  |
        29  <a class="lnk" href="https://www.mozilla.org/MPL/2.0/">Licence</a>
        30  </p>
        31  </td>
    18  32
    19      </a></td></tr><tr>
        33  </tr><tr>
    20  34
    21  35  <td class="app" style="background-color:#0080ff ; background-image:url('/chrome/site/justext_nb.png')">
    22      <a href="/wiki/Justext">
    23      JusText is a HTML boilerplate removal tool. It can strip navigation links, headers, footers, etc. from HTML pages and leave just regular text containing full sentences.
        36  <p><a href="/wiki/Justext">
        37  JusText is a HTML boilerplate removal tool. It can strip navigation links, headers, footers, etc. from HTML pages and leave just regular text containing full sentences.</a><p>
        38  <p>
        39  <a class="lnk" href="http://is.muni.cz/th/45523/fi_d/phdthesis.pdf">Paper</a>
        40  |
        41  <a class="lnk" href="">Cite</a>
        42  |
        43  <a class="lnk" href="http://opensource.org/licenses/BSD-3-Clause">Licence</a>
        44  </p>
        45  </td>
    24  46
    25      </a></td><td></td></tr></table>
        47  <td></td></tr></table>
    26  48  }}}